@redhog @seacourtaw based on what we discussed with Amy, we need to try turning off autoscaling and fixing the worker count at 200 (there are some known issues with autoscaling today).
If that still doesn't resolve the issue, grab the process ID and other details and pass them along to Amy so she can involve other Google engineers to dig deeper.
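A minimal sketch of what "turn off autoscaling and set to 200 workers" could look like on the command-line side, assuming the job is launched with the standard Dataflow pipeline options `--autoscalingAlgorithm` and `--numWorkers`. The `FixedWorkers` object and `withFixedWorkers` helper are hypothetical names used only for illustration, and whether this actually fixes the under-allocation is untested:

```scala
// Sketch only: rewrites a job's command-line flags so that it runs with a
// fixed number of workers instead of relying on autoscaling.
// FixedWorkers / withFixedWorkers are hypothetical names, not part of scio.
object FixedWorkers {
  // Flags overridden by this helper; any existing values for them are dropped.
  private val overridden = Set("autoscalingAlgorithm", "numWorkers")

  // Extracts "name" from an argument of the form "--name=value" or "--name".
  private def flagName(arg: String): String =
    arg.stripPrefix("--").takeWhile(_ != '=')

  // Returns the arguments with autoscaling disabled and the worker count pinned.
  def withFixedWorkers(args: Seq[String], workers: Int): Seq[String] = {
    val kept = args.filterNot(a => a.startsWith("--") && overridden(flagName(a)))
    kept ++ Seq("--autoscalingAlgorithm=NONE", s"--numWorkers=$workers")
  }
}
```

For example, `FixedWorkers.withFixedWorkers(Seq("--project=my-project", "--numWorkers=5"), 200)` yields `--project=my-project --autoscalingAlgorithm=NONE --numWorkers=200` (the project name is a placeholder). The rewritten arguments can then be passed to the pipeline's usual entry point.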
We have a curious case of under-allocation of vCPUs at a particular stage in our scio pipeline. At this stage, the vCPU allocation drops to 7 even though the pipeline is run with --maxNumWorkers=200
We have tested to run it with
with no difference.
By this stage, the items in the SCollection have already been grouped, so it contains far fewer elements than at the start of the pipeline, but there are still plenty (>10k). Processing each item at this stage is CPU intensive, however (it runs a DBSCAN clustering of the items inside each group). In fact, this is the most CPU-intensive part of the pipeline.
Code reference: https://github.com/GlobalFishingWatch/vessel-classification-pipeline/blob/84-cluster-anchorages/pipeline/anchorages/src/main/scala/Anchorages.scala#L380