Non-link footnotes in markdown are reported as broken links. #1409

Open
jan-ferdinand opened this issue Apr 22, 2024 · 9 comments

@jan-ferdinand

jan-ferdinand commented Apr 22, 2024

Consider the following file /tmp/file.md:

Some[^1] text.

[^1]: short

Running lychee . produces the following error:

[./file.md]:
✗ [ERR] file:///tmp/short | Failed: Cannot find file

To the best of my knowledge, short is never a link but always a footnote. It would be nice for these presumed false positives to not occur.

mre added the bug label Apr 23, 2024
@mre
Member

mre commented Apr 23, 2024

Oh wow, thanks for reporting. Definitely a bug.
If you have the time, you could add a (failing) unit test here: https://github.com/lycheeverse/lychee/blob/master/lychee-lib/src/extract/markdown.rs
You can use your example Markdown document for the test.
The bug is somewhere in

pub(crate) fn extract_markdown(input: &str, include_verbatim: bool) -> Vec<RawUri> {

@jan-ferdinand
Author

I've added a test in #1410. Unfortunately, I can't figure out where extraction of markdown links is happening.

@mre
Member

mre commented Apr 26, 2024

Thanks. :)

@jan-ferdinand
Author

You're more than welcome. 😊 I'd have liked to give resolution a shot, but couldn't identify where to start. It's probably somewhere in the parser? If so, that's probably more than I can chew right now.

@mre
Member

mre commented Apr 26, 2024

Feel free to dive in. But yeah, it's in the markdown parser in pulldown_cmark, which is the crate we're using for it.

@thomas-zahner
Member

thomas-zahner commented Jun 14, 2024

I had a closer look at this. Actually this is not a bug in lychee.
We use pulldown_cmark to parse Markdown, so we treat all Markdown as CommonMark. CommonMark does not specify footnotes, so there really are no footnotes. Instead, there are shortcut reference links, and the "footnote" in your example is treated as a shortcut reference link.

So the example you provide:

Some[^1] text.

[^1]: short

is understood as a shortcut reference link and therefore converted into the following HTML:

<p>Some<a href="short">^1</a> text.</p>

When the definition does not contain a valid (relative) link, it is not a shortcut reference link, and the text is simply understood as normal paragraphs:

Some[^1] text.

[^1]: multiple words

now becomes:

<p>Some[^1] text.</p>
<p>[^1]: multiple words</p>
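The difference between the two examples above comes down to whether the second line parses as a link reference definition. The following is a minimal sketch of that check using only the Rust standard library. It is a deliberate simplification of the CommonMark rules (no multi-line definitions, no `<...>` destinations, no escapes), and the function names are made up for illustration; this is not how pulldown_cmark implements it.

```rust
/// Simplified check: does `line` look like a CommonMark link reference
/// definition, i.e. `[label]: destination` with an optional title?
/// Many details of the spec are intentionally ignored here.
fn looks_like_link_reference_definition(line: &str) -> bool {
    // The line must start with `[label]:`.
    let rest = match line.trim().strip_prefix('[') {
        Some(rest) => rest,
        None => return false,
    };
    let (label, rest) = match rest.split_once("]:") {
        Some(parts) => parts,
        None => return false,
    };
    if label.trim().is_empty() || label.contains('[') || label.contains(']') {
        return false;
    }
    // The destination is the first whitespace-free token after the colon.
    let rest = rest.trim();
    let (destination, after) = match rest.split_once(char::is_whitespace) {
        Some((destination, after)) => (destination, after.trim()),
        None => (rest, ""),
    };
    if destination.is_empty() {
        return false;
    }
    // Anything following the destination has to be a delimited title.
    after.is_empty() || is_title(after)
}

/// A title is wrapped in double quotes, single quotes, or parentheses.
fn is_title(s: &str) -> bool {
    let b = s.as_bytes();
    b.len() >= 2
        && matches!(
            (b[0], b[b.len() - 1]),
            (b'"', b'"') | (b'\'', b'\'') | (b'(', b')')
        )
}
```

Under these simplified rules, `[^1]: short` is accepted because `short` is a single destination token, while `[^1]: multiple words` is rejected because `words` is not a delimited title.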

So footnotes are part of neither CommonMark nor GitHub Flavored Markdown (the only Markdown specifications I know of), but some people might still be using them because many unspecified flavours support them (the beauty of the Markdown flavour swamp).

So one thing we could do is treat the target of these shortcut reference links not as a URL but as plain text (extract_plaintext) and extract any URLs from that. This would reduce the false positive rate when people check Markdown that is not CommonMark compliant, which is probably the vast majority. @mre what do you think?
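A rough sketch of what this proposal could look like, assuming a very naive plain-text scan: only tokens with an explicit `http://` or `https://` scheme are kept. The function name `urls_in_plaintext` is hypothetical and is not lychee's `extract_plaintext`, whose actual behaviour may differ.

```rust
/// Hypothetical stand-in for plain-text URL extraction: keep only
/// whitespace-separated tokens that start with an explicit scheme.
/// This is an illustration, not lychee's `extract_plaintext`.
fn urls_in_plaintext(text: &str) -> Vec<&str> {
    text.split_whitespace()
        .filter(|token| token.starts_with("http://") || token.starts_with("https://"))
        .collect()
}
```

With this policy, a destination such as `short` yields no links, so the footnote from the original report would no longer be checked as a file. The trade-off in this sketch is that genuinely relative targets, such as `./docs.md`, would also be skipped.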

@jan-ferdinand
Author

jan-ferdinand commented Jun 14, 2024

Interesting! Thanks for digging and explaining what's going on.

footnotes are [not part of] GitHub Flavored Markdown […]

I disagree: 😌

You can add footnotes to your content by using this bracket syntax:

Here is a simple footnote[^1].

[^1]: My reference.

Edit: Quote linked documentation directly.
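The bracket syntax quoted from the GitHub documentation can be recognised by its `[^` prefix. The following is a small sketch of such a check, assuming that a footnote label is non-empty and contains no whitespace (an assumption of this example, not something stated in the thread). The function name is made up for illustration.

```rust
/// Returns the footnote label if `line` starts a footnote definition of
/// the form `[^label]: text`. Assumes labels are non-empty and contain
/// no whitespace (an assumption made for this sketch).
fn footnote_definition_label(line: &str) -> Option<&str> {
    let rest = line.trim_start().strip_prefix("[^")?;
    let (label, _text) = rest.split_once("]:")?;
    if label.is_empty() || label.contains(char::is_whitespace) {
        return None;
    }
    Some(label)
}
```

An ordinary reference definition such as `[docs]: ./docs.md` returns `None` here, so a check like this distinguishes footnote-style labels from regular link labels.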

@thomas-zahner
Member

No problem 👍

I disagree 😌

Wait...
I stated that because I could not find anything related to footnotes in their official spec, but the documentation you linked goes on to explain how to use footnotes. So I guess they are not even adhering to their own spec, or are referencing some other flavour there? 🤯

@jan-ferdinand
Author

The markdown swamp gets swampier the further you go. 🌧️ The strongest indication of them going against their own spec that I can see is: this¹ works.

Footnotes

  1. thing
