Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

⚡✅ Update more regexps to run in linear time #147

Merged
merged 1 commit into from
Apr 29, 2023
Merged

Conversation

nevans
Copy link
Collaborator

@nevans nevans commented Apr 28, 2023

I didn't have time to look closely at these when I pushed #145. For that PR, I prioritized BEG_REGEXP because it is used all the time, with unconstrained input. These other regexps are only used when Net::IMAP.debug = truthy or by saslprep, which is only used by the SCRAM-SHA-256 SASL mechanism, which isn't even merged yet, oops! (See #54, #78)

So this fixes both of those.

For the debug output gsub, the same string transform is simple to accomplish without negative lookahead by using a block with gsub and checking $'.empty?. I don't believe Net::IMAP.debug has any tests, but 1) I did test it manually, and 2) I intend to push a PR that changes how strings are validated, generated, and sent. The updated output will also change (simplify) this debug output, allowing redaction of passwords and authentication tokens, and come with its own tests.

For StringPrep::SASLprep::BIDI_FAILURE and the other related regexps, they all used \g<name> to define char classes and then re-use them. Unfortunately, ruby 3.2 can't compile \g<...> to run in linear time. The regexps can also be shortened by using lookahead, but that also can't run in linear time. Although this expands the BIDI_FAILURE regexp to ~15Kb, it slightly simplifies the generation.

The BIDI_FAILURE regexps all used `\g<name>` to define char classes and
then re-use them.  Unfortunately, ruby 3.2 can't compile that to run in
linear time.  The regexps could also be written using lookahead, but
that also wouldn't run in linear time.

The debug output gsub is simple to accomplish without negative lookahead
by using a block with gsub and checking `$'.empty?`.

`bin/check-regexps` was added for quick double-check, just in case the
tests aren't catching everything.
@nevans nevans merged commit 7601f6b into master Apr 29, 2023
@nevans nevans deleted the linear-regexps branch April 29, 2023 14:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

1 participant