Autolink and footnote extensions incorrectly process footnotes with certain letters #121

Closed
digitalmoksha opened this issue Oct 9, 2018 · 8 comments · Fixed by #229

@digitalmoksha

I ran across a strange interaction. With the footnotes and autolink extensions both enabled, certain footnote names, such as those containing a w or an underscore, cause the footnote not to be processed. Here's an example:

The problem seems to be when there's a w or an _ in the footnote anchor.[^w] and [^a] and [^a_a]

[^w]: This won't be rendered.

[^a]: This will be rendered.

[^a_a]: This won't be rendered.

Running cmark-gfm --extension footnotes --extension autolink gives:

<p>The problem seems to be when there's a w or an _ in the footnote anchor.[^w] and <sup class="footnote-ref"><a href="#fn1" id="fnref1">1</a></sup> and [^a_a]</p>
<section class="footnotes">
<ol>
<li id="fn1">
<p>This will be rendered. <a href="#fnref1" class="footnote-backref">↩</a></p>
</li>
</ol>
</section>

Using version cmark-gfm 0.28.3.gfm.16

@tonyg

tonyg commented Oct 9, 2018

Nice. It seems to be an interaction between footnotes and autolink, because just --extension footnotes renders more (but not all!) of the footnotes in the example above.

Output from cmark-gfm --extension footnotes for the input above:

<p>The problem seems to be when there's a w or an _ in the footnote anchor.<sup class="footnote-ref"><a href="#fn1" id="fnref1">1</a></sup> and <sup class="footnote-ref"><a href="#fn2" id="fnref2">2</a></sup> and [^a_a]</p>
<section class="footnotes">
<ol>
<li id="fn1">
<p>This won't be rendered. <a href="#fnref1" class="footnote-backref">↩</a></p>
</li>
<li id="fn2">
<p>This will be rendered. <a href="#fnref2" class="footnote-backref">↩</a></p>
</li>
</ol>
</section>

Both the w and a footnotes are rendered there, but not the a_a footnote!

@tonyg

tonyg commented Oct 9, 2018

A quick look at the code shows that this test is failing for the footnote refs that are not being recognised; that is, ->next->next is non-NULL. A quick experiment of disabling the check didn't help much, but it did reveal a possibly related problem: the span considered here is shorter than one might hope. In the failing cases it does not extend to just before the close bracket, and its length varies depending on whether autolinking is enabled. Perhaps there's some hidden state interference here? Could checking for autolinks earlier interfere with tokenization at this point?
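
To make the ->next->next observation concrete, here is a minimal sketch of that kind of shape check. It is a toy model, not cmark-gfm's source: the `toy_node` struct and the function `is_single_text_footnote_ref` are hypothetical names invented for illustration, and the assumption is that a footnote ref is only recognised when exactly one text node sits between the opening bracket and the close.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy inline node (hypothetical, not cmark-gfm's cmark_node):
 * just a literal and a pointer to the next sibling. */
typedef struct toy_node {
  const char *literal;
  struct toy_node *next;
} toy_node;

/* Returns 1 only if the opener is followed by exactly one node whose
 * literal looks like "^label". Any extra sibling (a non-NULL
 * opener->next->next) makes the check fail. */
static int is_single_text_footnote_ref(const toy_node *opener) {
  assert(opener != NULL);
  const toy_node *label = opener->next;
  if (label == NULL || label->next != NULL) {
    return 0; /* nothing after the bracket, or the text was split */
  }
  return strlen(label->literal) > 1 && label->literal[0] == '^';
}
```

In this toy model, a chain `[` then `^a` passes the check, while `[` then `^` then `w` fails it, because the label arrives as two sibling nodes instead of one.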

@digitalmoksha
Author

digitalmoksha commented Nov 7, 2018

@tonyg Thanks for pointing me in the right direction.

For one of the cases, [^w], it seems to be caused by the special chars that the autolink extension sets:

cmark_mem *mem = cmark_get_default_mem_allocator();
special_chars = cmark_llist_append(mem, special_chars, (void *)':');
special_chars = cmark_llist_append(mem, special_chars, (void *)'w');
cmark_syntax_extension_set_special_inline_chars(ext, special_chars);

I don't know why these would be added. Removing them fixed the [^w] case (I'm not saying this is the fix, just narrowing it down).

But it doesn't fix the [^a_a] case.

@kivikakk

kivikakk commented Apr 3, 2019

We add the special chars so that the autolink extension gets called when the parser sees a w (so it can autolink URLs starting with www.) or a : (it then backtracks and looks for http:, https:, or ftp:). The autolinker wasn't designed with the current footnotes extension in mind, so there's no solution to this yet.
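
A rough way to picture the effect of registering w and : as special chars is the toy model below. It is only a sketch under a stated assumption: that every special character starts a new text run, so a label containing one no longer arrives as a single piece of text. The function `count_text_runs` is hypothetical and does not exist in cmark-gfm, and the real inline parser may cut the text differently.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy model only: counts how many text runs a footnote label is cut
 * into when every character listed in `specials` starts a new run.
 * `specials` stands in for the chars an extension has registered. */
static size_t count_text_runs(const char *label, const char *specials) {
  assert(label != NULL && specials != NULL);
  size_t runs = 0;
  for (size_t i = 0; label[i] != '\0'; i++) {
    /* A run starts at the first character and at each special one. */
    if (i == 0 || strchr(specials, label[i]) != NULL) {
      runs++;
    }
  }
  return runs;
}
```

With the specials set to ":w", the label "^a" stays in one run and so does "^W" (only the lowercase w is registered), while "^w" is cut into two. Adding "_" to the set, as an assumption about how the underscore is treated as a delimiter character, also cuts "^a_a" into more than one run, which would fit the observation that [^a_a] fails even without autolink.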

@digitalmoksha
Author

Thanks for the update @kivikakk.

Hmmm....

@wolftune

Incidentally, this doesn't happen with a capital W, only with a lowercase w.

@phillmv
Member

phillmv commented Aug 11, 2021

While standardizing on cmark-gfm for notes of mine, I realized this bug was impacting me and so I took a look at what's going on.

I chose to focus on just understanding how the footnote references work (thanks for the pointer @tonyg!), and after doing some thinking I think I came to a reasonable solution which can be found here: #227

TL;DR: I think it's safe to ignore the nodes that the autolinker is adding to the parser's AST (which is why the ->next->next check Tony pointed out, uh, 3 years ago is failing), and in my own local use this patch seems to do the trick.

Reviews / feedback welcome!
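
One way to read the idea of ignoring the extra nodes is sketched below: rather than requiring a single text node after the opening bracket, join the literals of every node up to the close bracket and test the joined string. This is an illustration under assumptions, not the patch itself: `toy_node` and `join_footnote_label` are hypothetical names, every node in the toy list is treated as plain text, and the actual change may be structured differently.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy inline node (hypothetical, not cmark-gfm's cmark_node). */
typedef struct toy_node {
  const char *literal;
  struct toy_node *next;
} toy_node;

/* Concatenates the literals of all nodes after `opener` into `out`.
 * Returns 1 if the joined text looks like "^label", and 0 if it does
 * not or if it would not fit into `out_size` bytes. */
static int join_footnote_label(const toy_node *opener, char *out,
                               size_t out_size) {
  assert(opener != NULL && out != NULL && out_size > 0);
  size_t used = 0;
  out[0] = '\0';
  for (const toy_node *n = opener->next; n != NULL; n = n->next) {
    size_t len = strlen(n->literal);
    if (used + len + 1 > out_size) {
      return 0; /* label too long for the caller's buffer */
    }
    memcpy(out + used, n->literal, len);
    used += len;
    out[used] = '\0';
  }
  return used > 1 && out[0] == '^';
}
```

In this toy model, the split chain `[` then `^` then `w` joins back into "^w" and is accepted, and the unsplit chain `[` then `^a` is accepted as before. A real implementation would also have to check node types so that only text nodes are merged.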

@digitalmoksha
Author

Awesome, thanks @phillmv !
