Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Validate HTML of test fixtures in CI #422

Closed
wants to merge 5 commits into from

Conversation

Weibye
Copy link
Collaborator

@Weibye Weibye commented Apr 10, 2023

Objective

Fix #421

Context

  • During setting this up I spotted multiple typos in the HTML. One who was present in the HTML template of the test fixtures (!), having been propagated into multiple tests already.
  • I don't know to what degree these typos have had on the test results or not.
  • This is using the following github action: https://github.com/marketplace/actions/html5-validator which is a github action around https://github.com/svenkreiss/html5validator
    • It appears that this action does not currently support properties like aspect-ratio and margin-inline-start causing a lot of false negatives. Thus this PR is not usable in its current state.

Feedback wanted

I think this clearly demonstrates the value of this lint / validation but we need a validation tool that is equally up to date with specs as our codebase. We should add support for what we need upstream, find other crates to build upon within the rust ecosystem or roll our own. We do want HTML & CSS import / export functionality somewhere down the line so maybe this would tie into that.

For the short term I suggest we cherry pick the fixes to the fixtures and get them merged in on a new branch.

@nicoburns
Copy link
Collaborator

This is a definitely a good idea. I agree that we'll probably want to implement our own parser/linter. Because we'll want control over which properties are supported. I think it probably does make sense to tie this is with a user-facing CSS parsing feature.

FYI, on that topic:

  • WIP: Add ability to parse a Style struct from a CSS string #393 partially implements this feature, but it uses the lightningcss crate which is pretty big, which might limit it's usefulness in a WASM context (but perhaps that doesn't matter). It also depends on multiple version of the phf crate which is causing our CI to fail, but they have merged a PR to fix that (just waiting for it to be published to crates.io).
  • I tried using swc_css_parser but that requires nightly, and the author is unwilling to change that (even though I submitted a PR implementing the change).
  • Another possibility I'm considering would be implementing a custom parser on top of simplecss (simple CSS parses all values to strings, but then makes you parse the actual values yourself)

@Weibye
Copy link
Collaborator Author

Weibye commented Apr 10, 2023

Also, I really don't like linters where you have to go into the console log of the action runner to identify what is wrong. A linter should place a comment in the PR on the line where the error is.

@Weibye
Copy link
Collaborator Author

Weibye commented Apr 10, 2023

This is a definitely a good idea. I agree that we'll probably want to implement our own parser/linter. Because we'll want control over which properties are supported. I think it probably does make sense to tie this is with a user-facing CSS parsing feature.

FYI, on that topic:

  • WIP: Add ability to parse a Style struct from a CSS string #393 partially implements this feature, but it uses the lightningcss crate which is pretty big, which might limit it's usefulness in a WASM context (but perhaps that doesn't matter). It also depends on multiple version of the phf crate which is causing our CI to fail, but they have merged a PR to fix that (just waiting for it to be published to crates.io).
  • I tried using swc_css_parser but that requires nightly, and the author is unwilling to change that (even though I submitted a PR implementing the change).
  • Another possibility I'm considering would be implementing a custom parser on top of simplecss (simple CSS parses all values to strings, but then makes you parse the actual values yourself)

This is all good info, thanks. I'm going to dive a bit into this but unsure how far I get. (Work is going to be busy over the next month again)

@Weibye
Copy link
Collaborator Author

Weibye commented Apr 10, 2023

There's also this crate: https://crates.io/crates/html-validation which is a sub-crate of percy https://github.com/chinedufn/percy/tree/master/crates/html-validation

Edit: This crate seems much better suited for our potential needs:

@Weibye
Copy link
Collaborator Author

Weibye commented Apr 10, 2023

Closing this as it is not viable in its current form, but the issues spotted has been merged in #423 instead

@Weibye Weibye closed this Apr 10, 2023
@nicoburns
Copy link
Collaborator

This crate seems much better suited for our potential needs: https://github.com/servo/html5ever

@Weibye I'm sure html5ever will work, but you may find html_parser simpler to work with.

@Weibye
Copy link
Collaborator Author

Weibye commented Apr 15, 2023

This crate seems much better suited for our potential needs: https://github.com/servo/html5ever

@Weibye I'm sure html5ever will work, but you may find html_parser simpler to work with.

html_parser is indeed much simpler to work with, but it does not provide validation of any kind.

I did find scraper which is a higher level API on top of html5ever which might be a much better approach, but I also think I need to look at validation and parsing of the tree as two separate problems that may or may not share the same solution.

I'll tinker with html_parser for a bit to see if I can get something working without trying to solve the validation aspect of the problem.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Lint or validate HTML fixtures in CI
2 participants