Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix footnotes plus fix footnote reference labels and backrefs #230

Merged
merged 30 commits into from
Sep 16, 2021

Conversation

phillmv
Copy link
Member

@phillmv phillmv commented Aug 23, 2021

This PR is a companion to/blocked by #229, and should only be merged after that PR. It also supports https://github.com/github/coding/issues/2251 .

This PR incorporates all of the changes in #229 plus it fixes an additional issue:

Before, footnote anchor ids were referenced by an incrementing integer id, and only included a single backreference link when rendered into html.

This prevents multiple blocks of independently rendered text (think: an Issue body and Issue comments) from all containing footnotes, because the first footnote in the issue body and the first footnote in any issue comment will both reference fnref1.

After this PR, footnote anchor ids use their footnote reference label text as the anchor id text. Also, we insert an additional backreference link per individual footnote reference, so that you can navigate back to each individual footnote reference if a given footnote is cited more than once.


Notes to reviewers,

Due to the changes set in #229, and given that this PR modifies every footnote test case, in order to prevent painful merge conflicts while keeping PR size down, I decided to package this as a separate PR.

All of the significant work is in this commit, but this PR is best reviewed after #229 is merged.

When two footnote references are adjacent, the handle_close_bracket
function will first try to match the closing bracket to a link
reference. Now we reset the subject's state, so that the parser
correctly picks up both footnote references.
…ark_nodes.

Sometimes, the autolinker will go ahead and greedily split input into
multiple text nodes in the hopes of matching a hyperlink. This broke
footnotes, which expected a singular node. Instead of relying on the
tokenizing to have worked perfectly, when handling footnote references
we now simply insert the reference based on the closing bracket and
ignore and delete any existing and superfluous nodes.
When a footnote is referenced multiple times, we now insert multiple
backrefs linking back to each reference. In order to do this, we had to
change how footnote ref link labels work away from an incrementing
index, and instead use footnote reference label text *plus* an index.
src/blocks.c Outdated Show resolved Hide resolved
@phillmv phillmv force-pushed the fix-footnotes-plus-fix-fnref-label-and-backrefs branch from 3dccaf9 to 32002ec Compare September 1, 2021 16:21
tldr, avoid freeing memory before passing it along to another function.
Sometimes, when cleaning up unusued footnotes, two footnote->nodes may
end up referencing each other. As they get free()'d up, this can lead to
problems. Instead, first we unlink every node and _then_ free them up.
Copy link

@brasic brasic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this PR @phillmv! I left a few comments about ensuring global uniqueness and a question about sanitization. Happy to pair on this if you have time!

src/html.c Outdated Show resolved Hide resolved
<p>This is some text!<sup class="footnote-ref"><a href="#fn1" id="fnref1">1</a></sup>. Other text.<sup class="footnote-ref"><a href="#fn2" id="fnref2">2</a></sup>.</p>
<p>Here's a thing<sup class="footnote-ref"><a href="#fn3" id="fnref3">3</a></sup>.</p>
<p>And another thing<sup class="footnote-ref"><a href="#fn4" id="fnref4">4</a></sup>.</p>
<p>This is some text!<sup class="footnote-ref"><a href="#fn:1" id="fnref:1">1</a></sup>. Other text.<sup class="footnote-ref"><a href="#fn:footnote" id="fnref:footnote">2</a></sup>.</p>
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this addition will still not get us quite to where we need to use this in github. What we need is a guarantee that multiple comments on the same page do not unintentionally link to eachother's anchors. It seems like with this approach that will still happen if comments reuse the same footnote labels (e.g. so far I've just been using labels like [^1])

What do you think about one of the following?

  • adding some some random text as part of the anchor name
  • adding some pseudorandom text obtained by e.g. digesting the footnote target body

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hrmmm, the random text is probably the most robust solution but on first thought I don't love inserting random strings into labels.

The pseudrandom is more pleasant, but then you're still stuck with folks say copying the same input generating the same output.

Maybe the best way to tackle that issue is to handle it in the pipeline phase, like how we handle header ids for tables of contents – that way we can scope it to the id of the rendered comment/issue/whatever and avoid randomness AND be guaranteed everything is properly scoped.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this solution although I bet that @talum will not! 😂

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this is handled in the pipeline phase can there be a public description of how it is done so tools that try to mimic GitHub's output can do so without guessing?

Header IDs are still not publicly described and apparently not open to GitHub employees as well.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm testing the pipeline out with these colon-separated hrefs. Something with this syntax is wonky...

Example:

<a href="#fn:other-note">testing a link</a>

testing a link

^^ does not show up as a link

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Experimenting with this, it looks like our sanitizer strips links whose hrefs contain : since

<a href="#fn-other-note">this works</a> this works

so, meh, okay, let's use a dash instead!

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@UziTech thanks for bringing this up. In general we intentionally don't document things like how we generate slugs and ids not because it's secret or anything but because these are implementation details; we don't want to unintentionally define an API contract that we need to adhere to.

However, I think we'd be happy to informally describe the algorithm we eventually choose for generating footnote anchor ids from the output of what is publicly specified here. Please create an issue and mention me after the feature is released and I'll do my best to provide an answer.

src/commonmark.c Outdated Show resolved Hide resolved
src/commonmark.c Outdated Show resolved Hide resolved
src/html.c Outdated Show resolved Hide resolved
@phillmv phillmv merged commit d7e50f0 into master Sep 16, 2021
@phillmv phillmv deleted the fix-footnotes-plus-fix-fnref-label-and-backrefs branch September 16, 2021 20:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants