
feat: Support parallel snapshots calls #168

Merged
merged 5 commits into from
May 30, 2019

Conversation

blake-newman
Contributor

@blake-newman blake-newman commented Apr 25, 2019

Fix parallel builds

When parallel builds occur, there is the possibility that the same
assets will be triggered on the same pooled page. This means the
resource is likely to be intercepted and handled twice, which
Puppeteer treats as an error.

Pooling up to 10 available pages is also memory intensive for sequential
builds, as 10 pages will be held in memory. This can cause issues in low
memory environments.

  • Remove pooling and open/close pages on demand to reduce the memory
    footprint.
  • Open a new page on demand to ensure that parallel builds requesting
    the same resource do not conflict, which causes Puppeteer to throw an
    error that the request is already handled.
  • Add catch clauses around request interception to ensure that e2e tests
    continue if there is an issue with Percy.
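The last bullet can be sketched as follows. This is an illustrative sketch only, not the actual @percy/agent source; `InterceptedRequest` is a hypothetical subset of Puppeteer's request object, and `shouldResolve` stands in for the service's own filtering logic.

```typescript
// Sketch: wrap request interception in try/catch so that a request
// already handled on a shared page does not crash the whole e2e run.
interface InterceptedRequest {
  url(): string
  continue(): Promise<void>
  abort(): Promise<void>
}

async function handleRequest(
  request: InterceptedRequest,
  shouldResolve: (request: InterceptedRequest) => boolean,
): Promise<void> {
  try {
    if (!shouldResolve(request)) {
      await request.abort()
      return
    }
    await request.continue()
  } catch (error) {
    // Puppeteer throws if another interceptor already handled this
    // request; log and move on instead of failing the test run.
    console.warn(`could not intercept ${request.url()}: ${error}`)
  }
}
```

With this guard, a duplicate interception on a pooled page degrades to a warning rather than an aborted build.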

@blake-newman blake-newman force-pushed the blake.newman/fix-parallel-builds branch from cc7db54 to 2149e5f Compare April 25, 2019 13:45
@blake-newman blake-newman changed the title Fix parrellel builds Fix parallel builds Apr 25, 2019
@Robdel12
Contributor

This looks really good to me 🔥. Excited about these changes and I like the approach.

@djones any thoughts on this?

@djones
Contributor

djones commented Apr 27, 2019

Thanks for another great PR @blake-newman! It's delightful to see someone who doesn't work at Percy go this deep.

Just to give you a heads up, the asset discovery flow was designed this way with performance in mind. There's a cost of roughly 75ms to opening a page in Chromium, so the current flow opens 10 pages when the agent starts and keeps them open. This means the time to take a single snapshot doesn't need to include opening a new page in the browser.

You're quite right to point out that in doing so, we have broken parallel snapshotting and we would definitely like to fix that. This issue percy/percy-puppeteer#52 is highly relevant and this PR would likely fix it.

I want to measure what performance impact this has at a per-snapshot level and see whether it's something we can stomach if it's slower. We have fairly good profiling if you run @percy/agent with LOG_LEVEL=debug set; it outputs a bunch of timings. The one of particular interest here is assetDiscoveryService.discoverResources.
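A timing helper in this spirit might look like the sketch below. It is hypothetical, not the agent's actual profiler; only the `assetDiscoveryService.discoverResources` label comes from the discussion above.

```typescript
// Hypothetical sketch of a debug-level timing helper similar in spirit
// to the agent's LOG_LEVEL=debug profiling output.
function profile(label: string): () => number {
  const start = Date.now()
  // Calling the returned function stops the timer, logs, and returns
  // the elapsed milliseconds.
  return () => {
    const elapsedMs = Date.now() - start
    console.log(`[percy] ${label}: ${elapsedMs}ms`)
    return elapsedMs
  }
}

// Usage:
//   const stop = profile('assetDiscoveryService.discoverResources')
//   ... discover resources ...
//   stop()
```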

One thing we have been considering is https://github.com/thomasdondorf/puppeteer-cluster, which would allow us to have a pool of Chromium instances running and warmed up. Something like this would be one way to keep per-snapshot performance high and have parallel snapshotting working (though it would consume more memory).

@blake-newman
Contributor Author

I had thought that the above may be the case; 75ms seems quite high and will certainly have a knock-on effect for sequential builds.

We have also found that the pool can consume a large amount of memory causing issues when running in parallel as there is a large demand for memory which causes chromium to crash.

Based on the above, I believe that a partial but adequate solution to fix parallel builds would be to run asset discovery in two modes depending on how the tests are running. A sequential test run has more memory available, since there is less demand, so pooling makes sense there.

However, when running in parallel there is less memory available, so the pool can consume memory that isn't there to spare, which becomes a bottleneck and can cause crashes. When running in parallel, the setup cost of opening a new page matters less, because parallelism already reduces test run time significantly.

I will submit an update to give the best of both worlds based on the test environment, as I feel enabling parallel builds is a huge win; optimizations to make snapshotting faster for those builds can be done at a later stage. However, as you correctly put it, we should ensure that sequential builds are not deoptimized when taking snapshots.

@blake-newman blake-newman force-pushed the blake.newman/fix-parallel-builds branch from 69481e3 to 41c1562 Compare April 28, 2019 11:35
@blake-newman
Contributor Author

@djones I have pushed up new changes and believe this solution covers both use cases, optimizing builds for the resource they require most.

In the parallel case there is less memory to spare, so not using a pool makes sense as pooling is memory heavy. There is also more headroom on execution time, since parallelism by itself is a huge speed boost. A further future optimization would be a revolving pool that spins up clean tabs after use so that screenshots can execute more quickly, but that comes at a memory cost. With builds already crashing at 16 GB of memory when using a pool (I believe due to high memory usage in Cypress pushing Percy to the limit), this would need more investigation to ensure test runs are stable in low memory environments.

For serial runs there is less processing power to spare but memory is more readily available, so having a pool helps serial test suites.

@blake-newman
Contributor Author

@Robdel12 @djones sorry to push on this; wondering if you have had a chance to look?

@djones
Contributor

djones commented May 15, 2019

Hi @blake-newman, sorry about the delay... this is super important to get right, as you can imagine.

This is definitely getting closer. I'm leaning towards killing off isParallel and just having that on by default. Opening up to 10 pages really isn't that expensive. Would like to pick a good default for everyone.

@blake-newman blake-newman force-pushed the blake.newman/fix-parallel-builds branch from 11b1bd8 to 42f9364 Compare May 15, 2019 12:38
When parallel builds occur, there is the possibility that the same
assets will be triggered on the same pooled page. This means the
resource is likely to be intercepted and handled twice, which
Puppeteer treats as an error.

Pooling up to 10 available pages is also memory intensive for sequential
builds, as 10 pages will be held in memory. This can cause issues in low
memory environments.

- Remove pooling and open/close pages on demand to reduce the memory
footprint
- Open a new page on demand to ensure that parallel builds requesting
the same resource do not conflict, which causes Puppeteer to throw an
error that the request is already handled
- Add catch clauses around request interception to ensure that e2e tests
continue if there is an issue with Percy
Add page pool for sequential builds, to reduce execution overhead when
creating new page

- Add checks to determine a sequential build, to create the page pool at setup
- Add checks to ensure page pool has been setup for sequential build
Use generic pool to create a pool for both parallel builds and serial
builds. The pool will create page instances on demand, releasing them
after requests are finished.
@blake-newman blake-newman force-pushed the blake.newman/fix-parallel-builds branch from 42f9364 to 2ec0391 Compare May 28, 2019 07:34
@blake-newman
Copy link
Contributor Author

blake-newman commented May 29, 2019

I am currently creating a test suite capturing all these changes along with #167 to validate that all the changes work as expected. (So far it looks positive.)

- Rename pool to pagePool
- Create constants for the page pool size with configuration via env
variable
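The env-configurable pool size could be read along these lines. This is a sketch, not the PR's implementation; the default of 10 is an assumption here (matching the old 10-page pool), and only the `PERCY_POOL_SIZE` variable name comes from the diff.

```typescript
// Sketch: pool size constant, overridable via the PERCY_POOL_SIZE
// environment variable, falling back to a default when unset or invalid.
const DEFAULT_PAGE_POOL_SIZE = 10 // assumed default, not confirmed from source

function pagePoolSize(env: Record<string, string | undefined>): number {
  const raw = env.PERCY_POOL_SIZE
  const parsed = raw === undefined ? NaN : parseInt(raw, 10)
  return Number.isFinite(parsed) && parsed > 0 ? parsed : DEFAULT_PAGE_POOL_SIZE
}

// In the agent this would be called as pagePoolSize(process.env).
```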
@@ -37,6 +39,21 @@ describe('SnapshotService', () => {
expect(snapshotResponse.body).to.deep.equal({data: {id: snapshotId}})
expect(snapshotResponse.statusCode).to.eq(200)
})

it('creates multiple snapshots', async () => {

I can confirm on master, this test passes when length: 1 and fails with length: 2 or more.

@@ -9,17 +10,19 @@ interface AssetDiscoveryOptions {
networkIdleTimeout?: number
}

const DEFAULT_PAGE_POOL_SIZE = process.env.PERCY_POOL_SIZE

we'll shift this to use the .percy.yml config file later. Happy to leave this as is for now, but don't rely on it

@djones djones changed the title Fix parallel builds feature: Support parallel snapshots calls May 30, 2019
if (!this.shouldRequestResolve(request)) {
await request.abort()
return
try {

😍

profile('--> assetDiscoveryService.pool.release')

return maybeResources.filter((maybeResource) => maybeResource != null)
} catch (error) {

Love that we're going to catch these async errors now. Does it get rid of all of the deprecations?


@Robdel12 Robdel12 left a comment


🏁

@djones djones merged commit 744a399 into percy:master May 30, 2019
@djones djones changed the title feature: Support parallel snapshots calls feat: Support parallel snapshots calls May 30, 2019
djones pushed a commit that referenced this pull request May 30, 2019
# [0.5.0](v0.4.9...v0.5.0) (2019-05-30)

### Features

* Support parallel snapshots calls ([#168](#168)) ([744a399](744a399))
@blake-newman blake-newman deleted the blake.newman/fix-parallel-builds branch May 31, 2019 09:40