
feat: Support parallel snapshots calls #168

Merged
merged 5 commits into from
May 30, 2019

Conversation

blake-newman
Contributor

@blake-newman blake-newman commented Apr 25, 2019

Fix parallel builds

When parallel builds occur, there is the possibility that the same
assets will be triggered on the same pooled page. This means the
resource is likely to be intercepted and handled twice, which
Puppeteer treats as an error.

Pooling up to 10 available pages is also memory intensive for sequential
builds, as 10 pages will be held in memory. This can cause issues in low
memory environments.

  • Remove pooling and open/close pages on demand to reduce the memory
    footprint.
  • Open a new page on demand to ensure that parallel builds requesting
    the same resource do not conflict, which causes Puppeteer to throw an
    error that the request is already handled.
  • Add catch clauses around request interception to ensure that e2e tests
    continue if there is an issue with Percy.
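The last bullet can be sketched as follows. This is an illustrative sketch only, not the actual @percy/agent source; `InterceptedRequest` is a hypothetical subset of Puppeteer's request object, and `shouldResolve` stands in for the service's own filtering logic.

```typescript
// Sketch: wrap request interception in try/catch so that a request
// already handled on a shared page does not crash the whole e2e run.
interface InterceptedRequest {
  url(): string
  continue(): Promise<void>
  abort(): Promise<void>
}

async function handleRequest(
  request: InterceptedRequest,
  shouldResolve: (request: InterceptedRequest) => boolean,
): Promise<void> {
  try {
    if (!shouldResolve(request)) {
      await request.abort()
      return
    }
    await request.continue()
  } catch (error) {
    // Puppeteer throws if another interceptor already handled this
    // request; log and move on instead of failing the test run.
    console.warn(`could not intercept ${request.url()}: ${error}`)
  }
}
```

With this guard, a duplicate interception on a pooled page degrades to a warning rather than an aborted build.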

@blake-newman blake-newman force-pushed the blake.newman/fix-parallel-builds branch from cc7db54 to 2149e5f Compare April 25, 2019 13:45
@blake-newman blake-newman changed the title Fix parrellel builds Fix parallel builds Apr 25, 2019
@Robdel12
Contributor

This looks really good to me 🔥. Excited about these changes and I like the approach.

@djones any thoughts on this?

@djones
Contributor

djones commented Apr 27, 2019

Thanks for another great PR @blake-newman! It's delightful to see someone who doesn't work at Percy go this deep.

Just to give you a heads up, the asset discovery flow was designed this way with performance in mind. There's a cost of roughly 75ms to opening a page in Chromium, so the current flow opens 10 pages when the agent starts and keeps them open. This means the time to take a single snapshot doesn't need to include opening a new page in the browser.

You're quite right to point out that in doing so, we have broken parallel snapshotting and we would definitely like to fix that. This issue percy/percy-puppeteer#52 is highly relevant and this PR would likely fix it.

I want to measure what performance impact this has at a per-snapshot level and see whether it's something we can stomach if it's slower. We have fairly good profiling if you run @percy/agent with LOG_LEVEL=debug set; it outputs a bunch of timings. The one of particular interest here is assetDiscoveryService.discoverResources.
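A timing helper in this spirit might look like the sketch below. It is hypothetical, not the agent's actual profiler; only the `assetDiscoveryService.discoverResources` label comes from the discussion above.

```typescript
// Hypothetical sketch of a debug-level timing helper similar in spirit
// to the agent's LOG_LEVEL=debug profiling output.
function profile(label: string): () => number {
  const start = Date.now()
  // Calling the returned function stops the timer, logs, and returns
  // the elapsed milliseconds.
  return () => {
    const elapsedMs = Date.now() - start
    console.log(`[percy] ${label}: ${elapsedMs}ms`)
    return elapsedMs
  }
}

// Usage:
//   const stop = profile('assetDiscoveryService.discoverResources')
//   ... discover resources ...
//   stop()
```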

One thing we have been considering is https://github.com/thomasdondorf/puppeteer-cluster, which would allow us to have a pool of Chromium instances running and warmed up. Something like this would be one way to keep per-snapshot performance high and have parallel snapshotting working (though it would consume more memory).

@blake-newman
Contributor Author

I had thought that the above may be the case; 75ms seems quite high and will certainly have a knock-on effect for sequential builds.

We have also found that the pool can consume a large amount of memory causing issues when running in parallel as there is a large demand for memory which causes chromium to crash.

Based on the above, I believe that a partial but adequate solution to fix parallel builds would be to run asset discovery in two modes depending on how the tests are running. A sequential test run has more memory available, since there is less demand, so pooling makes sense there.

However, when running in parallel there is less memory available, so the pool can consume memory that isn't there to spare, which becomes a bottleneck and can cause crashes. When running in parallel, the setup cost of opening a new page matters less, because parallelism already reduces test run time significantly.

I will submit an update to give the best of both worlds based on the test environment, as I feel enabling parallel builds is a huge win; optimizations to make snapshotting faster for those builds can be done at a later stage. However, as you correctly put it, we should ensure that sequential builds are not deoptimized when taking snapshots.

@blake-newman blake-newman force-pushed the blake.newman/fix-parallel-builds branch from 69481e3 to 41c1562 Compare April 28, 2019 11:35
@blake-newman
Contributor Author

@djones I have pushed up new changes and believe this solution covers both use cases, optimizing builds for the resource they require most.

In the parallel case there is less memory to spare, so not using a pool makes sense as pooling is memory heavy. There is also more headroom on execution time, since parallelism by itself is a huge speed boost. A further future optimization would be a revolving pool that spins up clean tabs after use so that screenshots can execute more quickly, but that comes at a memory cost. With builds already crashing at 16 GB of memory when using a pool (I believe due to high memory usage in Cypress pushing Percy to the limit), this would need more investigation to ensure test runs are stable in low memory environments.

For serial runs there is less processing power to spare but memory is more readily available, so having a pool helps serial test suites.

@blake-newman
Contributor Author

@Robdel12 @djones sorry to push on this; wondering if you have had a chance to look?

@djones
Contributor

djones commented May 15, 2019

Hi @blake-newman, sorry about the delay... this is super important to get right, as you can imagine.

This is definitely getting closer. I'm leaning towards killing off isParallel and just having that on by default. Opening up to 10 pages really isn't that expensive. Would like to pick a good default for everyone.

@blake-newman blake-newman force-pushed the blake.newman/fix-parallel-builds branch from 11b1bd8 to 42f9364 Compare May 15, 2019 12:38
When parallel builds occur, there is the possibility that the same
assets will be triggered on the same pooled page. This means the
resource is likely to be intercepted and handled twice, which
Puppeteer treats as an error.

Pooling up to 10 available pages is also memory intensive for sequential
builds, as 10 pages will be held in memory. This can cause issues in low
memory environments.

- Remove pooling and open/close pages on demand to reduce the memory
footprint
- Open a new page on demand to ensure that parallel builds requesting
the same resource do not conflict, which causes Puppeteer to throw an
error that the request is already handled
- Add catch clauses around request interception to ensure that e2e tests
continue if there is an issue with Percy
Add page pool for sequential builds, to reduce execution overhead when
creating new page

- Add checks to determine a sequential build, to create the page pool at setup
- Add checks to ensure page pool has been setup for sequential build
Use generic pool to create a pool for both parallel builds and serial
builds. The pool will create page instances on demand, releasing them
after requests are finished.
@blake-newman blake-newman force-pushed the blake.newman/fix-parallel-builds branch from 42f9364 to 2ec0391 Compare May 28, 2019 07:34
@blake-newman
Copy link
Contributor Author

blake-newman commented May 29, 2019

I am currently creating a test suite capturing all these changes along with #167 to validate that all the changes work as expected. (So far it looks positive.)

- Rename pool to pagePool
- Create constants for the page pool size with configuration via env
variable
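The env-configurable pool size could be read along these lines. This is a sketch, not the PR's implementation; the default of 10 is an assumption here (matching the old 10-page pool), and only the `PERCY_POOL_SIZE` variable name comes from the diff.

```typescript
// Sketch: pool size constant, overridable via the PERCY_POOL_SIZE
// environment variable, falling back to a default when unset or invalid.
const DEFAULT_PAGE_POOL_SIZE = 10 // assumed default, not confirmed from source

function pagePoolSize(env: Record<string, string | undefined>): number {
  const raw = env.PERCY_POOL_SIZE
  const parsed = raw === undefined ? NaN : parseInt(raw, 10)
  return Number.isFinite(parsed) && parsed > 0 ? parsed : DEFAULT_PAGE_POOL_SIZE
}

// In the agent this would be called as pagePoolSize(process.env).
```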
@@ -37,6 +39,21 @@ describe('SnapshotService', () => {
expect(snapshotResponse.body).to.deep.equal({data: {id: snapshotId}})
expect(snapshotResponse.statusCode).to.eq(200)
})

it('creates multiple snapshots', async () => {

I can confirm on master, this test passes when length: 1 and fails with length: 2 or more.

@@ -9,17 +10,19 @@ interface AssetDiscoveryOptions {
networkIdleTimeout?: number
}

const DEFAULT_PAGE_POOL_SIZE = process.env.PERCY_POOL_SIZE

we'll shift this to use the .percy.yml config file later. Happy to leave this as is for now, but don't rely on it

@djones djones changed the title Fix parallel builds feature: Support parallel snapshots calls May 30, 2019
if (!this.shouldRequestResolve(request)) {
await request.abort()
return
try {

😍

profile('--> assetDiscoveryService.pool.release')

return maybeResources.filter((maybeResource) => maybeResource != null)
} catch (error) {

Love that we're going to catch these async errors now. Does it get rid of all of the deprecations?


@Robdel12 Robdel12 left a comment


🏁

@djones djones merged commit 744a399 into percy:master May 30, 2019
@djones djones changed the title feature: Support parallel snapshots calls feat: Support parallel snapshots calls May 30, 2019
djones pushed a commit that referenced this pull request May 30, 2019
# [0.5.0](v0.4.9...v0.5.0) (2019-05-30)

### Features

* Support parallel snapshots calls ([#168](#168)) ([744a399](744a399))
@blake-newman blake-newman deleted the blake.newman/fix-parallel-builds branch May 31, 2019 09:40