Various span edge cases #127

@untitaker

These bugs were found with FUZZ_SPAN_INVARIANTS from #126. The fuzzing results were summarized by AI.

Bug 1: String token has incorrect span for character reference followed by bogus comment

Input: &<!

Symptom:

  • String token has value & but span 1..2 pointing to < in the input
  • Should have span 0..1 pointing to the & character

Token sequence:

  1. Error at 3..3 (IncorrectlyOpenedComment)
  2. String 1..2 - BUG: span points to wrong content (points to < but value is &)
  3. Comment 1..3

A similar bug can be reproduced with: echo 'asdasd&<!' | cargo run --example tokenize_with_spans
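The mismatch in Bug 1 can be expressed as a small check: slice the input with the token's span and compare the result to the token's value. The following is a minimal sketch, not the actual FUZZ_SPAN_INVARIANTS implementation from #126. `StringToken` and `check_string_span` are hypothetical names introduced here for illustration. It also assumes the token value is the literal source text, which holds for the bare `&` in this input but would not hold for a decoded character reference.

```rust
use std::ops::Range;

/// Hypothetical stand-in for a String token with a span; not the crate's real type.
#[derive(Debug, Clone, PartialEq)]
pub struct StringToken {
    pub value: String,
    pub span: Range<usize>,
}

/// Checks that `token.span` covers exactly `token.value` in `input`.
/// Assumes the value is literal source text (no character reference decoding).
pub fn check_string_span(input: &str, token: &StringToken) -> Result<(), String> {
    // `str::get` returns None for out-of-bounds or non-char-boundary ranges.
    match input.get(token.span.clone()) {
        Some(covered) if covered == token.value.as_str() => Ok(()),
        Some(covered) => Err(format!(
            "span {:?} covers {:?} but token value is {:?}",
            token.span, covered, token.value
        )),
        None => Err(format!(
            "span {:?} is out of bounds or splits a character",
            token.span
        )),
    }
}
```

Under these assumptions, a token with value `&` and span 1..2 over the input `&<!` is rejected because the span covers `<`, while the same value with span 0..1 passes.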

Bug 2: String token emitted out of order with backward span

Input: 0<h 0 0 0

Symptom:

  • String token for initial 0 character is emitted last, after errors at later positions
  • Violates span ordering invariant

Token sequence:

  1. Error 4..5 (DuplicateAttribute)
  2. Error 9..9 (EofInTag)
  3. String 0..1 - BUG: span jumps backward to position 0 after the errors ending at positions 5 and 9
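The ordering violation in Bug 2 can be detected by scanning the emitted spans in order and flagging the first one that starts before its predecessor. This is a minimal sketch under one assumption: that the invariant requires start offsets to be non-decreasing across all emitted tokens, errors included. The exact rule checked by FUZZ_SPAN_INVARIANTS may differ, and `first_backward_span` is a hypothetical helper, not part of the crate.

```rust
use std::ops::Range;

/// Returns the index of the first span whose start offset is smaller than
/// the start offset of the span emitted just before it, or None if the
/// start offsets never go backward.
pub fn first_backward_span(spans: &[Range<usize>]) -> Option<usize> {
    spans
        .windows(2)
        .position(|pair| pair[1].start < pair[0].start)
        // `position` yields the index of the window; the offending span is
        // the second element of that window.
        .map(|window_index| window_index + 1)
}
```

For the sequence reported above (4..5, 9..9, 0..1), this returns index 2, the String token for the initial `0`.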

Labels: bug
