Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unclosed html tag lint #77119

Merged
merged 12 commits into from
Oct 7, 2020
Merged

Conversation

GuillaumeGomez
Copy link
Member

@GuillaumeGomez GuillaumeGomez commented Sep 23, 2020

Part of #67799.

I think @ollie27 will be interested (@Manishearth too since they opened the issue ;) ).

r? @jyn514

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Sep 23, 2020
@jyn514 jyn514 added A-lint Area: Lints (warnings about flaws in source code) such as unused_mut. T-rustdoc Relevant to the rustdoc team, which will review and decide on the PR/issue. labels Sep 23, 2020
Copy link
Member

@jyn514 jyn514 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I might have gone overboard on the comments 😆

compiler/rustc_session/src/lint/builtin.rs Outdated Show resolved Hide resolved
src/test/rustdoc-ui/unclosed-tags.rs Outdated Show resolved Hide resolved
src/test/rustdoc-ui/unclosed-tags.rs Outdated Show resolved Hide resolved
/// <script>
//~^ ERROR Unclosed HTML tag `unknown`
//~^^ ERROR Unclosed HTML tag `script`
/// <img><input>
Copy link
Member

@jyn514 jyn514 Sep 23, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add some more tests?

  • test these outside of the <script> to make sure they still don't warn
  • add some nested tags that should be closed and make sure they don't warn inside of <script>
  • add nested tags that should be closed and make sure they do warn outside of <script> (maybe <del> since that's what inspired me to look at the issue)

src/librustdoc/passes/html_tags.rs Outdated Show resolved Hide resolved
src/librustdoc/passes/html_tags.rs Outdated Show resolved Hide resolved
src/librustdoc/passes/html_tags.rs Outdated Show resolved Hide resolved
src/librustdoc/passes/html_tags.rs Outdated Show resolved Hide resolved
src/librustdoc/passes/html_tags.rs Outdated Show resolved Hide resolved
src/librustdoc/passes/html_tags.rs Outdated Show resolved Hide resolved
@GuillaumeGomez GuillaumeGomez force-pushed the unclosed-html-tag-lint branch 3 times, most recently from 6d8e95a to c043897 Compare September 24, 2020 20:17
@GuillaumeGomez
Copy link
Member Author

Updated!

@Manishearth
Copy link
Member

We should absolutely not be doing our own HTML parsing here, HTML parsing is extremely complicated even in cases like this one where HTML is rare.

If we can get this info from pulldown that's fine, but a hand rolled bespoke HTML parser is all but guaranteed to have problems.

@Manishearth
Copy link
Member

A different approach: perhaps we can look for very specific strings, e.g. <script>, <audio>, <video>, i.e. cases where the tag will slurp up and hide all proceeding content, and make sure that the number of such instances is equally matched with the number of closing tags

@Manishearth
Copy link
Member

Actually thinking about this more I guess it's fine if we parse a very limited subset of HTML. @jyn514 what is your take on this? I'd rather not do anything brittle here.

@jyn514
Copy link
Member

jyn514 commented Sep 24, 2020

Actually thinking about this more I guess it's fine if we parse a very limited subset of HTML. @jyn514 what is your take on this? I'd rather not do anything brittle here.

I think we absolutely shouldn't stabilize the lint with a bespoke HTML parser. I think it's ok to land this as allow by default on nightly, though.

Before merging we should definitely reach out to pulldown-cmark about how hard it would be to add this info in the API.

Copy link
Member

@camelid camelid left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a nit regarding the lint docs :)

compiler/rustc_session/src/lint/builtin.rs Show resolved Hide resolved
@GuillaumeGomez
Copy link
Member Author

Updated!

src/librustdoc/passes/html_tags.rs Show resolved Hide resolved
src/doc/rustdoc/src/lints.md Outdated Show resolved Hide resolved
src/librustdoc/passes/html_tags.rs Outdated Show resolved Hide resolved
src/librustdoc/passes/html_tags.rs Outdated Show resolved Hide resolved
src/librustdoc/passes/html_tags.rs Outdated Show resolved Hide resolved
src/test/rustdoc-ui/invalid-html-tags.stderr Outdated Show resolved Hide resolved
@GuillaumeGomez
Copy link
Member Author

No, this is the reason that failed links on intra-doc re-exports don't warn: #77200

@jyn514 The issue is that the reexport item itself should be here, which isn't the case currently for some reason. In my opinion, this check is valid.

@GuillaumeGomez
Copy link
Member Author

Updated!

Copy link
Member

@jyn514 jyn514 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The issue is that the reexport item itself should be here, which isn't the case currently for some reason. In my opinion, this check is valid.

Ok, I opened #77230 for this. No need to block this PR on it.

src/librustdoc/passes/html_tags.rs Outdated Show resolved Hide resolved
src/librustdoc/passes/html_tags.rs Show resolved Hide resolved
compiler/rustc_session/src/lint/builtin.rs Show resolved Hide resolved
src/test/rustdoc-ui/invalid-html-tags.stderr Outdated Show resolved Hide resolved
@GuillaumeGomez
Copy link
Member Author

Updated!

@jyn514
Copy link
Member

jyn514 commented Sep 26, 2020

The last thing is adding a test for tags inside <script>, which I think isn't handled currently: #77119 (comment)

Other than that you just need to run x.py fmt.

 -                        let mut r = Range { start: range.start + start_pos, end: range.start + pos };
+                        let mut r =
If you're running `tidy`, try again with `--bless`. Or, if you just want to format code, run `./x.py fmt` instead.
+                            Range { start: range.start + start_pos, end: range.start + pos };

@bors

This comment has been minimized.

@jyn514 jyn514 added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Sep 27, 2020
@GuillaumeGomez
Copy link
Member Author

So should I mark this PR as fix or not?

@jyn514
Copy link
Member

jyn514 commented Oct 5, 2020

Let's remove it for now so we can get the initial implementation in, we can always close the issue later. r=me with that done.

@jyn514
Copy link
Member

jyn514 commented Oct 5, 2020

@bors r+

We can make fixes in follow up PRs if necessary.

@bors
Copy link
Contributor

bors commented Oct 5, 2020

📌 Commit d3b7b7e has been approved by jyn514

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Oct 5, 2020
@ollie27
Copy link
Member

ollie27 commented Oct 5, 2020

I'm strongly against adding really precise heuristics for catching broken HTML, we'd need a markdown-compatible HTML parser with recovery and that would be a mess for very marginal gain.

Hmm, I agree that we don't want this code to get too complicated. So yeah, maybe just catching some common cases might be the best we should do. Ideally we could actually fix close the unclosed tags so the rendering of the rest of the page isn't completely broken but that might be getting way too complicated.

I do think #67799 should to be left open at least until this lint is stabilized and warn by default though. At the very least we need a tracking issue for doing that and reusing #67799 seems like the easiest choice.

@GuillaumeGomez
Copy link
Member Author

I do think #67799 should to be left open at least until this lint is stabilized and warn by default though. At the very least we need a tracking issue for doing that and reusing #67799 seems like the easiest choice.

Agreed. I'll let @Manishearth update the issue then.

@Manishearth
Copy link
Member

I do think #67799 should to be left open at least until this lint is stabilized and warn by default though. At the very least we need a tracking issue for doing that and reusing #67799 seems like the easiest choice.

Yeah that seems fine. When this lands someone should do that.

@bors
Copy link
Contributor

bors commented Oct 6, 2020

⌛ Testing commit d3b7b7e with merge f2ef98f6f9ac42571c84c5f0d68c9d295f2216d9...

@bors
Copy link
Contributor

bors commented Oct 6, 2020

💔 Test failed - checks-actions

@bors bors added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. labels Oct 6, 2020
@GuillaumeGomez
Copy link
Member Author

I can't see why it failed. Maybe it's a CI issue? cc @pietroalbini

@camelid
Copy link
Member

camelid commented Oct 6, 2020

Yeah, I always get confused when it fails at "mark the job as a failure". Where are the logs of the actual failure?

@pietroalbini
Copy link
Member

The weirdness was probably https://www.githubstatus.com/incidents/t34640ccrmfs.

@GuillaumeGomez
Copy link
Member Author

Oh okay. Well, let's try again then!

@bors: retry

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Oct 7, 2020
@bors
Copy link
Contributor

bors commented Oct 7, 2020

⌛ Testing commit d3b7b7e with merge 8ae3b50...

@rust-log-analyzer
Copy link
Collaborator

Your PR failed (pretty log, raw log). Through arcane magic we have determined that the following fragments from the build log may contain information about the problem.

Click to expand the log.

I'm a bot! I can only do what humans tell me to, so if this was not helpful or you have suggestions for improvements, please ping or otherwise contact @rust-lang/infra. (Feature Requests)

@bors
Copy link
Contributor

bors commented Oct 7, 2020

☀️ Test successful - checks-actions, checks-azure
Approved by: jyn514
Pushing 8ae3b50 to master...

@bors bors added the merged-by-bors This PR was explicitly merged by bors. label Oct 7, 2020
@bors bors merged commit 8ae3b50 into rust-lang:master Oct 7, 2020
@rustbot rustbot added this to the 1.49.0 milestone Oct 7, 2020
@GuillaumeGomez GuillaumeGomez deleted the unclosed-html-tag-lint branch October 7, 2020 11:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-lint Area: Lints (warnings about flaws in source code) such as unused_mut. merged-by-bors This PR was explicitly merged by bors. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-rustdoc Relevant to the rustdoc team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants