Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Using a markdown formatter for (doc) comments & wrapping #5782

Open
tgross35 opened this issue Jun 16, 2023 · 12 comments
Open

Using a markdown formatter for (doc) comments & wrapping #5782

tgross35 opened this issue Jun 16, 2023 · 12 comments

Comments

@tgross35
Copy link
Contributor

tgross35 commented Jun 16, 2023

Are there any thoughts on using an off-the-shelf markdown formatter to handle doc comments? Specifically comrak which is quite popular and (I believe) has a compatible license.

Reasoning is that it seems like needs are slowly approaching a full markdown-aware formatter. Things like preserving list items (#5746) or maintaining table format are important but tricky to handle without one.

Just a sample from #5746, using comrak configured to line length 80 - note that text is wrapped but indentation and code are both preserved

Input:
//! - References?
//!   - Unsound. See this example:
//!     ```compile_fail
//!     # use core::mem::MaybeUninit;
//!     let mut a = 0;
//!     let b = maybe_uninit_ext::new::<&mut MaybeUninit<u8>>(&mut a);
//!     *b = maybe_uninit_ext::uninit();
//!     assert_eq!(a, 0); // undefined behavior! a is uninitialized.
//!     ```
//!     Due to interior mutability, this is an issue for immutable references
//!     as well.
//!
//!     Lorem ipsum dolor sit amet, consectetur adipiscing elit. Aenean felis ipsum, mollis non
//!     malesuada quis, tincidunt a orci. Aliquam erat volutpat. Nullam dui diam, sollicitudin
//!     sed nulla ut, efficitur tempus mauris. Fusce consequat posuere nisl vitae consequat. Vivamus mattis porttitor sapien,
//!     sit amet interdum velit varius quis. Integer est dolor, interdum ultrices elementum
//!     vitae, porttitor sit amet ligula. Fusce ullamcorper sed velit nec dictum.

Output:
//! - References?
//!   - Unsound. See this example:
//!     ``` compile_fail
//!     # use core::mem::MaybeUninit;
//!     let mut a = 0;
//!     let b = maybe_uninit_ext::new::<&mut MaybeUninit<u8>>(&mut a);
//!     *b = maybe_uninit_ext::uninit();
//!     assert_eq!(a, 0); // undefined behavior! a is uninitialized.
//!     ```
//!     Due to interior mutability, this is an issue for immutable references as
//!     well. //\! Lorem ipsum dolor sit amet, consectetur adipiscing elit. Aenean
//!     felis ipsum, mollis non malesuada quis, tincidunt a orci. Aliquam erat
//!     volutpat. Nullam dui diam, sollicitudin sed nulla ut, efficitur tempus
//!     mauris. Fusce consequat posuere nisl vitae consequat. Vivamus mattis
//!     porttitor sapien, sit amet interdum velit varius quis. Integer est dolor,
//!     interdum ultrices elementum vitae, porttitor sit amet ligula. Fusce
//!     ullamcorper sed velit nec dictum.

Edit with more context

Originally suggested at #3347 (comment) because it seems like we do a lot of things that need to be markdown-aware (wrapping comments and doc comments, formatting doc comments), and doing this all manually seems more tedious than using a tool designed to do this exact thing.

@ytmimi
Copy link
Contributor

ytmimi commented Jun 16, 2023

@tgross35 thanks for reaching out.

I can't seem to find the original comment, but I believe you or someone else has already pitched using comrak before.

Was this really the output? I'm concerned about //\! Lorem ipsum dolor sit amet, being merged with the paragraph before it.

//!     Due to interior mutability, this is an issue for immutable references as
//!     well. //\! Lorem ipsum dolor sit amet, consectetur adipiscing elit. Aenean
//!     felis ipsum, mollis non malesuada quis, tincidunt a orci. Aliquam erat
//!     volutpat. Nullam dui diam, sollicitudin sed nulla ut, efficitur tempus
//!     mauris. Fusce consequat posuere nisl vitae consequat. Vivamus mattis
//!     porttitor sapien, sit amet interdum velit varius quis. Integer est dolor,
//!     interdum ultrices elementum vitae, porttitor sit amet ligula. Fusce
//!     ullamcorper sed velit nec dictum.

@tgross35
Copy link
Contributor Author

That was me, I just was considering maybe trying to implement it so wanted to get some feedback first.

Err… that part is probably my fault. I wrote about 10 lines of code just to test the ergonomics and probably accidentally trimmed the blank line

@ytmimi
Copy link
Contributor

ytmimi commented Jun 16, 2023

@tgross35 I appreciate you looking into this. I took a look at comrak when you initially suggested it. One major issue that I ran into was how it handled links. Maybe these issues can be fixed with configuration, but when I run comrak it makes modifications to the markdown that I didn't want, and I can almost certainly predict users would find unacceptable.

Here's the input I've been testing against (ignore the text it's mostly there for me as I've tried integrating markdown + code formatting):

# Some rust code
rustfmt will make minimal edits outside of code blocks like updating newlines after headers and before code blocks.
```rust
fn main()   {
                println!(            "Hello world!"
                )
}
```

Here's an indented code block that won't be formatted

    fn main()   {
                    println!(            "Hello world!"
                    )
    }

Hey check out the [commonmark spec]!

Look we can also link to rust types like [`Debug`] and [`Vec`].
Some additional text with [brackets]. what if I manually \[esacpe the bracket\]? looks like they stay escaped!

[commonmark spec]: https://spec.commonmark.org/0.30/

[a dead link]: https://this/link/isnt/used

[`Debug`]: core::fmt::Debug

Running comrak -t commonmark I get this output:

# Some rust code

rustfmt will make minimal edits outside of code blocks like updating newlines after headers and before code blocks.

``` rust
fn main()   {
                println!(            "Hello world!"
                )
}
```

Here's an indented code block that won't be formatted

    fn main()   {
                    println!(            "Hello world!"
                    )
    }

Hey check out the [commonmark spec](https://spec.commonmark.org/0.30/)\!

Look we can also link to rust types like [`Debug`](core::fmt::Debug) and \[`Vec`\].
Some additional text with \[brackets\]. what if I manually \[esacpe the bracket\]? looks like they stay escaped\!

Here are some issues that I can see with the output:

  • Rewrites the links inline
    • Not great, especially since that could mess with rustfmt's max line length leading to some issues.
  • Unnecessarily escapes the brackets around the type link for Vec
    • I think rustdoc will still link to Vec, but users would probably be irked by this (I know I was).
  • It escaped [brackets] (similar to Vec)
    • If this parsed why couldn't it have been spit out the same way?
  • It dropped the dead link
    • Arguably an improvement, but we definitely don't want to remove things the user wrote

@tgross35
Copy link
Contributor Author

Thanks for the update. I agree with what you said - if those things can be fixed, would you considere this worth pursuing? If so, I'll dig into what would be needed to get the desired output (configuration or upstream changes, the maintainer is responsive)

@ytmimi
Copy link
Contributor

ytmimi commented Jun 16, 2023

If these issues were resolved I'd be more open to leveraging comrak. As I mentioned, I've been looking into better markdown support myself, but I also want to stress that it's been lower priority for me since the focus for rustfmt will always be formatting rust code.

Right now the link issues are a big blocker. Full transparency, I haven't looked too deeply into comrak and how it formats other markdown constructs so there might be other issues that need to be addressed before I'd feel totally confident using it.

@calebcartwright
Copy link
Member

with the caveat that there be may be some nuance i'm unaware of (as i've not looked closely at comrak).. i'm generally highly disinclined to delegate any aspect of formatting to any 3rd party library.

a markdown parser might be something I'm more amenable to, but delegating the formatting responsibility opens up an unacceptably large risk for us wherein we'd need to move to a different version of the lib (license changes, bugs, etc.) but doing so would introduce breaking formatting changes, which would directly violate the formatting stability guarantee rustfmt is constrained by.

rustfmt delegating formatting to a 3rd party lib would, imo, be roughly analogous to rustc delegating the ownership system to a 3rd party lib. could be fine at the moment it's incorporated, but too great a risk of getting stuck between a rock and a hard place on a core feature/feature set

@ytmimi
Copy link
Contributor

ytmimi commented Jun 16, 2023

Very good point!

@tgross35
Copy link
Contributor Author

That makes sense. In that case, would you be open to using pulldown_cmark (also used by rustdoc I believe) to parse the text and handwriting a formatter for it?

I'm not sure how difficult this would be, but I would assume that it may be easier to produce properly formatted/wrapped output than the current approach. If it's something you are open to, I am willing to try it out.

@calebcartwright
Copy link
Member

First, I really want to emphasize my appreciation for bringing this forward into an issue for focused discussion, and offering to work on it on top of that!

I confess I am rather ambivalent and torn on how to respond, with the tl;dr version being that I think this is a worthwhile experiment that could prove valuable, but I also want to make sure we don't waste your time.

For the longer take...

We're really stretched on maintainer bandwidth right now, and we've got a huge stack of really core work and high priority items that don't leave us much capacity for anything else.

I'm planning on articulating this a bit more formally soon so that we can provide more clarity and explicitness to the community, but at a (non-exhaustive) high level, we've really only got the capacity to focus on (a) growing the team, (b) core formatting work for standard syntax rustfmt doesn't yet support, (c) the volume of work that's coming our way as part of the style_edition mechanism and associated changes for the 2024 edition, and maybe one or two other areas (d)(e), etc.

While we are/will be reviewing some lower-impact/priority PRs as part of (a) to help those interested in getting more holistically involved with project do so, we're most likely not going to be able to review most other PRs that don't tie into those core focus areas.

And this is the source of my equivocation; we're definitely open to it in concept (I think Yacin has made a similar suggestion previously), and it'd definitely be helpful for us over the long term horizon, but I'm worried we wouldn't be able to give it the review justice it deserves in the more near term.

If you or anyone is interested in working on this as part of getting more involved with rustfmt holistically, then that's a conversation I'd love to have (probably best in zulip dm). If you or anyone else wants to work on this as more of a one-off/casual contribution then you're welcome to do so, but beware of the likely extensive delay in review/attention.

@tgross35
Copy link
Contributor Author

Thank you for taking the time to comment! I know it seems like you two are completely holding up the show, and I hope that you are able to gain some maintainers/core members in the near future to free you up some more.

Anyway, I didn't intend to have a polished PR soon, mostly just curious whether you might consider this a worthwhile approach toward better handling of comments. It seems like that answer is likely yes to at least some extent, so I'll just put it on my "someday to-do" list.

If the topic comes up again as an alternative toward manually fixing md-comment parsing issues, feel free to ping me.

@ytmimi
Copy link
Contributor

ytmimi commented Sep 12, 2023

@tgross35 I got the ball rolling on this in #5909. If you've got the bandwidth and the interest to help out it would be great if you could take a look at the PR and leave a review 😁

@tgross35
Copy link
Contributor Author

Thanks for the ping, how exciting! I will leave a review soon

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants