Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bad perf case: Lots of delimiters that can close but no openers #43

Closed
robinst opened this issue Jun 10, 2015 · 4 comments
Closed

Bad perf case: Lots of delimiters that can close but no openers #43

robinst opened this issue Jun 10, 2015 · 4 comments

Comments

@robinst
Copy link
Contributor

robinst commented Jun 10, 2015

  1. Paste the text from this gist into dingus.
  2. It takes about 1.3 seconds to parse
  3. Paste the text a second time (append, so that the input is doubled in size)
  4. It takes 6 to 13 seconds to parse or even longer

The test input is a_ (a, underscore, space) repeated 20000 times. The problem seems to be that in processEmphasis, for each potential closer, the opener is searched all the way back to the stack bottom.

Maybe it could be improved by removing a closer after not finding a corresponding opener iff the closer can not be an opener, so as to not have to check it again. Not sure if this is correct in all cases though or if there are other worst case inputs (need to think about it more).

jgm added a commit to commonmark/cmark that referenced this issue Jun 10, 2015
When they have no matching openers and cannot be openers themselves,
we can safely remove them.

This helps with a performance case: "a_ " * 20000.

See commonmark/commonmark.js#43.
@jgm
Copy link
Member

jgm commented Jun 10, 2015

Thanks for bringing this to my attention. Your proposed
solution works. I've implemented this in cmark and added
some tests to "pathological tests."

Still TODO: implement in commonmark.js. This should
be straightforward with the cmark change as a guide.

jgm added a commit that referenced this issue Jun 10, 2015
Tests for many emph closers without matching openers,
openers without matching closers, and similarly for links.

See #43.
@jgm jgm closed this as completed in 29214d4 Jun 10, 2015
@robinst
Copy link
Contributor Author

robinst commented Jun 11, 2015

Thanks 👍. Just found another case that's related: "[ a_" * 20000. The opening delimiter for the link only gets removed at the end, slowing down the search for a matching opener for the emphasis closers.

@robinst
Copy link
Contributor Author

robinst commented Jun 11, 2015

And another case is this: "*a_ " * 20000

@jgm jgm reopened this Jun 11, 2015
@jgm
Copy link
Member

jgm commented Jun 11, 2015

I have now fixed all of these problems in cmark. TODO: commonmark.js.

jgm added a commit that referenced this issue Jun 11, 2015
@jgm jgm closed this as completed in 7df3ca7 Jun 11, 2015
xxgreg pushed a commit to xxgreg/commonmark.js that referenced this issue Sep 14, 2015
talum pushed a commit to github/cmark-gfm that referenced this issue Sep 14, 2021
When they have no matching openers and cannot be openers themselves,
we can safely remove them.

This helps with a performance case: "a_ " * 20000.

See commonmark/commonmark.js#43.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants