Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Start CDATA section only after uppercase <![CDATA[ #781

Merged
merged 4 commits into from
Jul 6, 2024

Conversation

Mingun
Copy link
Collaborator

@Mingun Mingun commented Jul 5, 2024

Implement more strict check for the start of CDATA section. Incorrect case is a syntax error. In that sense this PR is differs from #780 where attributes in the end tag is validation error. So we should not be case tolerant in this case.

(Review in whitespace changes ignored mode)
@Mingun Mingun requested a review from dralley July 5, 2024 14:21
failures (6):
    syntax::cdata::lowercase::ns_reader::async_tokio
    syntax::cdata::lowercase::ns_reader::borrowed
    syntax::cdata::lowercase::ns_reader::buffered
    syntax::cdata::lowercase::reader::async_tokio
    syntax::cdata::lowercase::reader::borrowed
    syntax::cdata::lowercase::reader::buffered
@codecov-commenter
Copy link

⚠️ Please install the 'codecov app svg image' to ensure uploads and comments are reliably processed by Codecov.

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 60.16%. Comparing base (7558577) to head (b71cf7c).
Report is 74 commits behind head on master.

❗ Your organization needs to install the Codecov GitHub app to enable full functionality.

Additional details and impacted files
@@            Coverage Diff             @@
##           master     #781      +/-   ##
==========================================
- Coverage   61.81%   60.16%   -1.66%     
==========================================
  Files          41       41              
  Lines       16798    16138     -660     
==========================================
- Hits        10384     9709     -675     
- Misses       6414     6429      +15     
Flag Coverage Δ
unittests 60.16% <100.00%> (-1.66%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

debug_assert!(buf.ends_with(b"]]"));
Ok(Event::CData(BytesCData::wrap(
// Cut of `![CDATA[` and `]]` from start and end
&buf[8..len - 2],
self.decoder(),
)))
}
// XML requires uppercase only, but we will check that on validation stage:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a particular reason to keep this difference in behavior intact in this PR?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

HTML allows using mixed case for DOCTYPE and handling of this was intentionally added to quick-xml in 0.23.0 and related #230. In one of previous PRs you said that you prefer to keep ability to parse HTML documents, so I leaved this ability.

CDATA sections, however, are always started from uppercase variant, event in HTML. Corresponding links a couple of lines below in the comment.

I planned to work on validation after merging #766 (because it changes possible content of Text events and we will know something about content, and may do not validate that).

After validation PR will be merged, the user will be able to validate each event manually after reading it if it requires strict parser and skip validation if that is not required. The API would be:

match reader.read_event() {
  Text(e) => {
    // Get iterator over all validation problems and stop at first
    // You also may skip problems that you want to allow
    if e.validate_iter().any() {
      return Err(...);
    }
    // Shortcut for reporting the first problem as a `Result`
    e.validate()?;
  }
  // do the same for other events
}

@Mingun Mingun merged commit 0960333 into tafia:master Jul 6, 2024
7 checks passed
@Mingun Mingun deleted the only-upper-cdata branch July 6, 2024 15:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants