Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Core: Token.stringify will now call util.encode #1844

Closed

Conversation

RunDevelopment
Copy link
Member

@RunDevelopment RunDevelopment commented Mar 28, 2019

This PR changes the behavior of Token.stringify so that the function will now encode strings itself rather than relying on pre-encoded token streams.

This also simplified util.encode because it doesn't need to be able to handle token streams anymore.

It's also more efficient like this because encode doesn't have to create a deep copy of the token stream.


Btw. No language or plugin uses encode, so this should be a purely internal change.

Copy link
Contributor

@ExE-Boss ExE-Boss left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe get rid of util.encode(…) altogether by moving the string replacement code into Token.stringify(…).

components/prism-core.js Show resolved Hide resolved
Copy link
Contributor

@Golmote Golmote left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@Golmote
Copy link
Contributor

Golmote commented Apr 22, 2019

Hmm on a second thought, Token is accessible from the outside isn't it? This could be considered a BC breaking change, even though Token was probably undocumented.

@ExE-Boss
Copy link
Contributor

The method is marked as private in the documentation: #1782.

@Golmote
Copy link
Contributor

Golmote commented Apr 22, 2019

Which is a good thing, but this doc is not merged yet, so we have to account for what the users have been given so far, which is mainly this page of the website... where Prism.Token is mentioned (but it's API is not described).
I just don't wanna repeat the situation we had a bit ago, when Prism broke for people because they were relying on unspecified behavior. Better safe than sorry.

@ExE-Boss
Copy link
Contributor

Unfortunately, @types/prismjs had documented it, albeit very wrongly.

The current version has a warning about it being private.

Copy link
Contributor

@ExE-Boss ExE-Boss left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I’d say let’s merge this.

@RunDevelopment
Copy link
Member Author

I'm actually with @Golmote on this one. This is technically a breaking change. Let's merge this after JSDoc is through, so people can know how the API changed.

(PrismJS v2.0 when?)

@RunDevelopment
Copy link
Member Author

@mAAdhaTTah @Golmote
You guys think we can merge this now? Public doc is available, so I don't feel bad changing purely internal functions.

I also recently read an article where this was mentioned. So I made a little benchmark and by using a simpler escapeHTML function Prism.highlight was 1.13x faster on average. 13% is pretty impressive because Prism.highlight also does the whole tokenization plus the stringification.

The simplified function used:
(This version can be used to escape attribute values as well.)

var ATTR_REGEX = /[&<"]/;
/**
 * This is a fast HTML escaping function.
 *
 * All credit goes to Vladimir Kutepov (@frenzzy on GitHub).
 *
 * @param {string} html
 * @returns {string}
 */
function escapeHTML(html) {
	var match = ATTR_REGEX.exec(html);
	if (!match) return html;
	var index = 0;
	var lastIndex = 0;
	var out = '';
	var escape = '';
	for (index = match.index; index < html.length; index++) {
		switch (html.charCodeAt(index)) {
			case 34: // "
				escape = '&quot;';
				break;
			case 38: // &
				escape = '&amp;';
				break;
			case 60: // <
				escape = '&lt;';
				break;
			default:
				continue;
		}
		if (lastIndex !== index) out += html.substring(lastIndex, index);
		lastIndex = index + 1;
		out += escape;
	}
	return lastIndex !== index ? out + html.substring(lastIndex, index) : out;
}

Benchmark log:

Found 9 cases with 27 files in total.
Test 2 candidates with Prism.highlight
Estimated duration: 2m 42s

------------------------------------------------------------

c

  https://github.com/raw/git/git/master/mergesort.c (1 kB)
  | local              0.17ms ±  2%   44smp
  | PrismJS@master     0.21ms ±  0%   50smp 1.25x
  https://github.com/raw/git/git/master/mergesort.h (1 kB)
  | local              0.03ms ±  1%   48smp
  | PrismJS@master     0.04ms ±  0%   54smp 1.19x
  https://github.com/raw/git/git/master/remote.c (58 kB)
  | local              6.62ms ±  1%   49smp
  | PrismJS@master     9.53ms ±  1%   44smp 1.44x
  https://github.com/raw/git/git/master/remote.h (10 kB)
  | local              0.60ms ±  0%   51smp
  | PrismJS@master     0.77ms ±  0%   54smp 1.28x

------------------------------------------------------------

css

  ../../assets/style.css (7 kB)
  | local              1.34ms ±  1%   53smp 1.07x
  | PrismJS@master     1.26ms ±  0%   52smp

------------------------------------------------------------

css!+css-extras (css)

  ../../assets/style.css (7 kB)
  | local              1.54ms ±  1%   52smp
  | PrismJS@master     2.20ms ±  0%   50smp 1.43x

------------------------------------------------------------

javascript

  ../../assets/utopia.js (11 kB)
  | local              1.93ms ±  1%   52smp
  | PrismJS@master     2.24ms ±  0%   53smp 1.16x
  ../../components.json (27 kB)
  | local              5.48ms ±  0%   48smp
  | PrismJS@master     5.77ms ±  0%   50smp 1.05x
  ../../package-lock.json (206 kB)
  | local             37.60ms ±  2%   25smp 1.01x
  | PrismJS@master    37.41ms ±  5%   27smp
  https://cdnjs.cloudflare.com/ajax/libs/prism/1.20.0/prism.js (29 kB)
  | local              5.33ms ±  1%   49smp
  | PrismJS@master     6.28ms ±  1%   46smp 1.18x
  https://cdnjs.cloudflare.com/ajax/libs/prism/1.20.0/prism.min.js (14 kB)
  | local              4.64ms ±  1%   48smp
  | PrismJS@master     5.22ms ±  0%   50smp 1.13x
  https://code.jquery.com/jquery-3.4.1.js (274 kB)
  | local             96.38ms ±  4%   16smp
  | PrismJS@master   103.66ms ±  5%   15smp 1.08x
  https://code.jquery.com/jquery-3.4.1.min.js (86 kB)
  | local             69.70ms ±  8%   19smp
  | PrismJS@master    71.89ms ±  5%   21smp 1.03x

------------------------------------------------------------

json

  ../../components.json (27 kB)
  | local              2.65ms ±  1%   50smp
  | PrismJS@master     3.36ms ±  1%   49smp 1.27x
  ../../package-lock.json (206 kB)
  | local             28.48ms ±  3%   34smp 1.19x
  | PrismJS@master    23.99ms ±  2%   31smp

------------------------------------------------------------

markup

  ../../download.html (4 kB)
  | local              0.37ms ±  1%   51smp
  | PrismJS@master     0.41ms ±  0%   52smp 1.13x
  ../../index.html (19 kB)
  | local              2.03ms ±  0%   52smp
  | PrismJS@master     2.31ms ±  0%   50smp 1.14x
  https://github.com/PrismJS/prism (193 kB)
  | local             34.04ms ±  5%   28smp
  | PrismJS@master    41.18ms ±  4%   24smp 1.21x

------------------------------------------------------------

markup!+css+javascript (markup)

  ../../download.html (4 kB)
  | local              0.72ms ±  1%   49smp 1.03x
  | PrismJS@master     0.70ms ±  0%   53smp
  ../../index.html (19 kB)
  | local              2.93ms ±  2%   46smp
  | PrismJS@master     3.19ms ±  1%   48smp 1.09x
  https://github.com/PrismJS/prism (193 kB)
  | local             53.13ms ±  6%   19smp 1.17x
  | PrismJS@master    45.59ms ±  4%   21smp

------------------------------------------------------------

ruby

  https://github.com/raw/rails/rails/master/actionview/lib/action_view/base.rb (12 kB)
  | local              0.58ms ±  0%   53smp
  | PrismJS@master     0.68ms ±  0%   54smp 1.18x
  https://github.com/raw/rails/rails/master/actionview/lib/action_view/layouts.rb (16 kB)
  | local              0.66ms ±  0%   53smp
  | PrismJS@master     0.78ms ±  0%   54smp 1.17x
  https://github.com/raw/rails/rails/master/actionview/lib/action_view/template.rb (14 kB)
  | local              0.79ms ±  0%   53smp
  | PrismJS@master     0.92ms ±  0%   53smp 1.17x

------------------------------------------------------------

rust

  https://github.com/raw/rust-lang/regex/master/src/compile.rs (42 kB)
  | local              6.55ms ±  0%   48smp
  | PrismJS@master     7.87ms ±  1%   45smp 1.20x
  https://github.com/raw/rust-lang/regex/master/src/lib.rs (28 kB)
  | local              0.59ms ±  0%   52smp
  | PrismJS@master     0.60ms ±  0%   55smp 1.01x
  https://github.com/raw/rust-lang/regex/master/src/utf8.rs (9 kB)
  | local              1.47ms ±  0%   54smp
  | PrismJS@master     1.66ms ±  0%   55smp 1.12x

------------------------------------------------------------

summary
                  best  worst  relative
  local             22      5     1.00x
  PrismJS@master     5     22     1.13x

Copy link
Collaborator

@JaKXz JaKXz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this rebases and passes tests I think we can release it as v1.26? the benchmarks look promising :)

@RunDevelopment
Copy link
Member Author

I think we can release it as v1.26?

I tagged this as v2.0 because this is potentially a breaking change to some people. As mentioned above, @types/prismjs does document Token.stringify, so we have to assume that some people rely on its current behavior. (Yes, I changed my stance on this one.)

I also don't see any urgency to merge this. This PR doesn't block anything, nor is the improvement felt by our users.

the benchmarks look promising :)

That benchmark isn't for this change but the faster HTML escape function. Despite removing a deep copy of every token stream, this PR doesn't actually improve performance.

@RunDevelopment
Copy link
Member Author

Implemented in v2.

@RunDevelopment RunDevelopment deleted the stringify-calls-encode branch September 2, 2022 10:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants