Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

perf(genAdds): refactor recursive procedure to iterative #1277

Closed
wants to merge 13 commits into from

Conversation

JonasBa
Copy link
Contributor

@JonasBa JonasBa commented Aug 7, 2023

Not sure how representative the benchmark is, so I'm adding the screenshot of the profile (deep DOM tree benchmark). Looking at timings, it seems to take ~18ms before and 13ms after my change, or about ~30% gain. Feel free to do more extensive benchmarking and take those timings with a grain of salt as I just ran the benchmarks once

Before
CleanShot 2023-08-08 at 10 11 25@2x

After
CleanShot 2023-08-08 at 10 11 36@2x

We can see the stack is shallower as expected (and uses less memory, sadly no metrics on that).
A small benefit is that we also remove the overhead of calling into the polymorphic genAdds fn.

The queue has a magic default size of 1000 - feel free to tweak that to some better value, I'm not sure what a sensible default could be here

@changeset-bot
Copy link

changeset-bot bot commented Aug 7, 2023

⚠️ No Changeset found

Latest commit: 4a984b0

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types

Click here to learn what changesets are, and how to add one.

Click here if you're a maintainer who wants to add a changeset to this PR

@eoghanmurray eoghanmurray force-pushed the jb/perf/genadds-recursion branch 2 times, most recently from a0284c0 to f5f9789 Compare August 15, 2023 09:52
@JonasBa
Copy link
Contributor Author

JonasBa commented Aug 15, 2023

@eoghanmurray nice find on the blocked, I moved that too quickly. Feel free to just leave a review next time and I'm happy to work on the changes

@eoghanmurray
Copy link
Contributor

I'm tempted to leave the Array pre-allocation out of this PR for code readability reasons.
If it's important likely we should be doing it in other places; unless there's a particular reason for it here?

Also, this article suggests the V8 engine is much more performant with .push anyway:
https://dev.to/henryjw/populating-a-pre-allocated-array-slower-than-a-pushing-to-a-regular-array-4l43

@eoghanmurray
Copy link
Contributor

We should also include the test changes in this PR also for further review — see the yarn test:update command (I can run this for you if you prefer, but you mentioned you had checked the output?)

@JonasBa
Copy link
Contributor Author

JonasBa commented Aug 17, 2023

@eoghanmurray no problem, let me make the change.

Is there a reason that you decided to move genAdds inside the childlist case here btw?

@JonasBa
Copy link
Contributor Author

JonasBa commented Aug 17, 2023

@eoghanmurray I updated the PR based on your review

@eoghanmurray
Copy link
Contributor

Is there a reason that you decided to move genAdds inside the childlist case here btw?

Do you mean f5f9789 ?
I imagine genAdds was originally separate only because it was recursive.

@JonasBa
Copy link
Contributor Author

JonasBa commented Aug 25, 2023

Yeah, thats what I meant. Lmk if I need to update anything else

@eoghanmurray
Copy link
Contributor

I'm not sure why the test is failing

@JonasBa
Copy link
Contributor Author

JonasBa commented Aug 30, 2023

Seems unrelated, I will rebase the PR

@JonasBa
Copy link
Contributor Author

JonasBa commented Aug 31, 2023

I think my dev env must have picked up a different TS version.
What I see in serialize-args is this error
CleanShot 2023-08-30 at 20 51 15@2x

which gets corrected to another error...
CleanShot 2023-08-30 at 20 51 47@2x

I reverted the changes, it should now be ok.

const next = genAddsQueue.pop();
if (!next) {
throw new Error(
'Add queue is corrupt, there is no next item to process',
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should be unreachable. I can add a continue clause in case we don't want to throw, or just silence ts if that's the preference

@JonasBa
Copy link
Contributor Author

JonasBa commented Sep 25, 2023

@eoghanmurray friendly ping :)

@JonasBa
Copy link
Contributor Author

JonasBa commented Oct 5, 2023

@eoghanmurray small ping here :)

@eoghanmurray
Copy link
Contributor

Apologies for delay; do you have this running in production?

I've just cherry-picked this PR today and am putting it out in a 'canary' environment to a subset of customers.

Reasons to be cautious: I've fixed a completely separate 'node ordering' issue on the replay side of things, where non-anticipated ordering of nodes in a mutation produced worst case performance and ultimately broke the replay (as it triggered a timeout).
Fix is here #1320 so hopefully that acts as a vote in favour of this one as we should be more robust at replay time now.

@JonasBa
Copy link
Contributor Author

JonasBa commented Oct 9, 2023

Apologies for delay; do you have this running in production?

We do not.

I've just cherry-picked this PR today and am putting it out in a 'canary' environment to a subset of customers.

Perfect, let me know.

I looked at your PR about node ordering issue, but since it's been a while since I looked at this code I lost some context and requirements, so I have a hard time imagining the exact issue you are fixing.

@JonasBa
Copy link
Contributor Author

JonasBa commented Nov 20, 2023

@eoghanmurray any findings from the canary builds here?

@eoghanmurray
Copy link
Contributor

No issues that I know of

@bruno-garcia
Copy link

I've just cherry-picked this PR today and am putting it out in a 'canary' environment to a subset of customers.

Does it give the confidence we need to merge this one in? Wondering if we should be cherry-picking it too or it merges here and update our fork

@eoghanmurray
Copy link
Contributor

eoghanmurray commented Apr 14, 2024

See also #1398 ... I'm trying to work out how these two PRs might interact

@JonasBa
Copy link
Contributor Author

JonasBa commented Apr 14, 2024

@eoghanmurray I think my PR already implements the changes from that PR as it performs an in-order traversal to enqueue items and then proceeds to pop the items from the queue (lifo), which is the essentially the same as iterating backwards as you process items with the added benefit that we remove recursion

@eoghanmurray
Copy link
Contributor

Super, I missed that it was LIFO.
No need for unshift so

@JonasBa
Copy link
Contributor Author

JonasBa commented Apr 15, 2024

Yeah, you want to avoid unshift in this case because it's On. We would have had to use a list in this case. Btw, whats the status of this PR, can we merge this or do you want me to do anything else here?

@eoghanmurray
Copy link
Contributor

I think we should have merged a long time ago, however with something like this it's hard to tell if there isn't some case where it makes things worse (not that I believe this is the case from having examined it).

I meant to put together another 'throw everything at it' mutation benchmark test today with this in mind, but ran into some technical issues.

I think it will be merged soon.

@eoghanmurray
Copy link
Contributor

You could rebase/merge on master which will re-trigger fresh tests; not sure if the indicated test failure is significant (it is no longer available for viewing).

@JonasBa

This comment was marked as outdated.

@eoghanmurray
Copy link
Contributor

To summarize benchmarks: they don't show any difference (so probably aren't exacting enough?)

@JonasBa
Copy link
Contributor Author

JonasBa commented Apr 17, 2024

@eoghanmurray I need to look more into it. There is a chance that benchmarks are not impacted because the depth of the benchmarked DOM is in most cases fairly shallow, which could mean recursive calls are never deep enough to impact these benchmarks.

I may create a separate benchmark to test adding deep DOM trees, it should be something we should benchmark for.

@JonasBa
Copy link
Contributor Author

JonasBa commented Apr 24, 2024

@eoghanmurray I wont have the time to look into adding a proper benchmark here soon. Can this still be merged or should we wait for that?

@JonasBa
Copy link
Contributor Author

JonasBa commented Apr 29, 2024

@eoghanmurray Any chance we could merge this? I would like to move over to splitting down the other larger PR I had

@JonasBa
Copy link
Contributor Author

JonasBa commented May 16, 2024

@eoghanmurray ping

@JonasBa
Copy link
Contributor Author

JonasBa commented May 28, 2024

@eoghanmurray I realized that the benchmarks were so similar because I wasnt rebuilding the packages and running yarn benchmark from within the rrweb package.

I've reran the benchmarks with the following results. The largest change was observed on the append 1000 elements and reverse their order benchmark where perf seems to have improved by 15%. The other benchmarks seem to have all stayed within a few % of the current value. The metrics of each these runs look like standard error, so I dont think we can take them into account

1121ms -> 1075ms
'append 70 x 70 x 70 elements' │ 3 │ 1121.3333333333333 │ '1128, 1117, 1119'
'append 70 x 70 x 70 elements' │ 3 │ 1075.6666666666667 │ '1071, 1094, 1062'

165ms -> 140ms -15% (largest change)
'append 1000 elements and reverse their order' │ 3 │ 165.33333333333334 │ '165, 168, 163'
'append 1000 elements and reverse their order' │ 3 │ 140.33333333333334 │ '139, 141, 141'

1390ms -> 1401ms
'real events recorded on bugs.chromium.org' │ 3 │ 1390 │ '1430, 1348, 1392'
'real events recorded on bugs.chromium.org' │ 3 │ 1401.3333333333333 │ '1465, 1388, 1351'

81.7m -> 82.9ms
'create 1000x10x2 DOM nodes and remove a bunch of them' │ 'benchmark-dom-mutation-add-and-remove.html' │ 'window.workload()' │ 10 │ 81.7 │ '84, 77, 81, 92, 76, 77, 95, 80, 79, 76'
'create 1000x10x2 DOM nodes and remove a bunch of them' │ 'benchmark-dom-mutation-add-and-remove.html' │ 'window.workload()' │ 10 │ 82.9 │ '82, 80, 79, 91, 79, 85, 96, 79, 75, 83'

54ms -> 56ms
'create 1000x10 DOM nodes' │ 'benchmark-dom-mutation.html' │ 'window.workload()' │ 10 │ 54 │ '53, 53, 54, 52, 53, 57, 53, 52, 56, 57'
'create 1000x10 DOM nodes' │ 'benchmark-dom-mutation.html' │ 'window.workload()' │ 10 │ 56.1 │ '52, 55, 55, 53, 55, 56, 54, 51, 71, 59'

22.8 -> 21.8
'create 1000 DOM nodes and append into its previous looped node' │ 'benchmark-dom-mutation-multiple-descendant-add.html' │ 'window.workload()' │ 5 │ 22.8 │ '23, 23, 22, 23, 23'
'create 1000 DOM nodes and append into its previous looped node' │ 'benchmark-dom-mutation-multiple-descendant-add.html' │ 'window.workload()' │ 5 │ 21.8 │ '21, 22, 22, 22, 22'

89.6ms -> 89.8ms
'create 10000 DOM nodes and move it to new container' │ 'benchmark-dom-mutation-add-and-move.html' │ 'window.workload()' │ 5 │ 89.6 │ '90, 95, 86, 91, 86'
'create 10000 DOM nodes and move it to new container' │ 'benchmark-dom-mutation-add-and-move.html' │ 'window.workload()' │ 5 │ 89.8 │ '87, 100, 89, 87, 86'

19.3ms -> 17.9ms
'modify attributes on 10000 DOM nodes' │ 'benchmark-dom-mutation-attributes.html' │ 'window.workload()' │ 10 │ 19.3 │ '18, 24, 17, 18, 28, 17, 18, 16, 17, 20'
'modify attributes on 10000 DOM nodes' │ 'benchmark-dom-mutation-attributes.html' │ 'window.workload()' │ 10 │ 17.9 │ '17, 16, 18, 20, 15, 19, 19, 16, 23, 16'

@eoghanmurray
Copy link
Contributor

I've rebased on master and cleaned up a little in https://github.com/eoghanmurray/rrweb/tree/genadds-recursion
One of the tests has already been introduced in #1489 so I wasn't sure what the intention was there; I've added a new benchmark-dom-mutation command in package.json (although I haven't tested whether it does what it's supposed to, i.e. just call that one new benchmark test)

If that branch is acceptable you can git reset --hard 2b8de82f9bbe04819b230189ac70bb2a71270823 and git push --force to this branch; I don't have write access to this PR even though Maintainers are allowed to edit this pull request.
I get the following error in console:

$ git push -f JonasBa
...
To github.com:JonasBa/rrweb.git
 ! [remote rejected]     genadds-recursion -> genadds-recursion (permission denied)
error: failed to push some refs to 'github.com:JonasBa/rrweb.git'

@eoghanmurray
Copy link
Contributor

(I thought this branch was already merged)

@JonasBa
Copy link
Contributor Author

JonasBa commented Jul 18, 2024

@eoghanmurray the tests fail because the order of iteration is reversed, which is expected given the changes (iirc this doesnt have an effect), but please lmk if it does and I can see if there is something I can fix

@eoghanmurray
Copy link
Contributor

So do I understand it correctly again that this changes things from depth first to breadth first?
If that's the case, is there any sense in reversing the order of breadth so that at least the first 'level' will match? This would be just to minimize the test changes and make them easier to review; this is likely a crazy notion.

Otherwise we'll need to test:update the snapshots and really review those changes very carefully to make sure nothing unexpected slips in.

@JonasBa
Copy link
Contributor Author

JonasBa commented Sep 19, 2024

@eoghanmurray I'm going to close this PR as it has been broken down into individual prs

@JonasBa JonasBa closed this Sep 19, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants