-
Notifications
You must be signed in to change notification settings - Fork 173
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix footnotes: nested, confused for link refs & mangled by the autolinker #229
Conversation
When two footnote references are adjacent, the handle_close_bracket function will first try to match the closing bracket to a link reference. Now we reset the subject's state, so that the parser correctly picks up both footnote references.
…ark_nodes. Sometimes, the autolinker will go ahead and greedily split input into multiple text nodes in the hopes of matching a hyperlink. This broke footnotes, which expected a singular node. Instead of relying on the tokenizing to have worked perfectly, when handling footnote references we now simply insert the reference based on the closing bracket and ignore and delete any existing and superfluous nodes.
…ontain 'w' or '_'.
@@ -468,7 +468,6 @@ static void process_footnotes(cmark_parser *parser) { | |||
while ((ev_type = cmark_iter_next(iter)) != CMARK_EVENT_DONE) { | |||
cur = cmark_iter_get_node(iter); | |||
if (ev_type == CMARK_EVENT_EXIT && cur->type == CMARK_NODE_FOOTNOTE_DEFINITION) { | |||
cmark_node_unlink(cur); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The changes in this file support allowing nested foornotes.
We accomplish this by simply changing when we unlink an unused footnote node.
Whereas before, we iterate thru the parser and delete all footnote definitions as we see them
Now we only unlink the footnotes definitions after we've scanned them for more references.
// advanced the current state beyond our footnote's actual closing | ||
// bracket, ie if it went looking for a `link_label`. | ||
// Let's just rewind the subject's position: | ||
subj->pos = initial_pos; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
a single bug
For reference, that bug (and a similar fix) had been previously identified in #139.
I ran the fuzzing harness,
gives:
which corresponds to Lines 690 to 691 in 993e869
Similarly, an input like:
gives:
which corresponds to Lines 728 to 734 in 993e869
So it looks like we are incorrectly handling footnotes that themselves contain markdown formatting? Perhaps we shouldn't be calling into |
Very powerful find, thanks @philipturnbull! I poked around in this, and I was able to reproduce an infinite hang using I strongly suspect I am either failing to reset some aspect of the parser state, or there's a dangling pointer elsewhere 🤔 I shall look at it soon. |
@philipturnbull I may have figured it out in 6e186b3 ! go figure, trying to free things before calling the next function would lead to shenanigans. I should be able to clean it up, and add a regression test. I then set up the fuzzer on my end and it found a subtle use-after-free stemming from 71e27f2 which I have also figured out! but like with above, shall have to write into a patch another time. Fuzzers! What a great tool. |
tldr, avoid freeing memory before passing it along to another function.
Sometimes, when cleaning up unusued footnotes, two footnote->nodes may end up referencing each other. As they get free()'d up, this can lead to problems. Instead, first we unlink every node and _then_ free them up.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm approving from a functionality perspective with the understanding that fuzzing and review from others has addressed known memory safety issues.
Hello!
This PR closes #121 and helps push https://github.com/github/coding/issues/2251 forward. It supersedes #227.
This PR fixes three different bug fixes in one relatively easy to read PR, combining two tiny changes with a slightly larger one:
I feel fairly confident functionality wise that this does the trick, but I invite more experienced C hands to tell me how I could improve or better allocate memory 😄.
This PR is a companion to/blocks/see also #230
Cheers,