vCPU allocation #126

redhog · 2016-12-08T14:15:02Z

We have a curious case of vCPU under-allocation at one stage of our scio pipeline. At this stage, the vCPU allocation drops to 7 even though the pipeline is run with --maxNumWorkers=200.

We have tried running it with

--zone=us-central1-f --experiments=use_mem_shuffle --workerHarnessContainerImage=dataflow.gcr.io/v1beta3/java-batch:1.8.0-mm

with no difference.

By this stage, the items in the SCollection have already been grouped, so it contains far fewer elements than at the start of the pipeline, but there are still plenty (>10k). However, processing each item in this stage is CPU-intensive (it runs a DBSCAN clustering of the items inside each group). In fact, this is the most CPU-intensive part of the pipeline.

Code reference: https://github.com/GlobalFishingWatch/vessel-classification-pipeline/blob/84-cluster-anchorages/pipeline/anchorages/src/main/scala/Anchorages.scala#L380


enriquetuya · 2016-12-13T14:51:11Z

@seacourtaw have you had any news from your Google contacts about this one?

enriquetuya · 2017-01-19T17:18:41Z

@redhog @seacourtaw based on what we discussed with Amy, we need to try turning off autoscaling and setting the number of workers to 200 (there are some known issues with autoscaling today).
If that still doesn't resolve the issue, grab the process ID and other details and pass them along to Amy so she can involve other Google engineers to dig deeper.
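
For reference, a rough sketch of what that first step might look like on the command line. It assumes the standard Dataflow pipeline options `--autoscalingAlgorithm` and `--numWorkers`, which scio passes through to the runner. The variable name `DATAFLOW_ARGS` is only illustrative, and this has not been tried against this pipeline yet:

```shell
# Sketch only: fix the worker pool at 200 instead of letting autoscaling choose.
# DATAFLOW_ARGS is a hypothetical variable name; append it to the usual job arguments.
DATAFLOW_ARGS="--zone=us-central1-f --autoscalingAlgorithm=NONE --numWorkers=200"
echo "$DATAFLOW_ARGS"
```

With autoscaling set to NONE, the job should keep the worker count given by `--numWorkers` for its whole run, so `--maxNumWorkers` is presumably no longer what determines the allocation.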

enriquetuya assigned seacourtaw Dec 13, 2016

enriquetuya assigned redhog and unassigned seacourtaw Jan 19, 2017
