[5.x] Add autoScalingStrategy option #1254

cosmastech · 2023-03-04T12:05:49Z

This PR adds an option to configure how a supervisor auto-scales queues. It allows for runtime or size, set on the supervisor level under the autoScalingBehavior key.

runtime is the current behavior (portion the number of available processes based on time to complete a queue / total time to complete all queues), where as size portions processes based on total number of jobs in queue / total number of jobs in all supervised queues.

For some context:

We have a Horizon supervisor set up like this:

'supervisor-medium' => [
    'connection' => 'redis_medium',
    'queue' => [
        'ai',
        'notifications',
        'import',
        'other-queue',
    ],
    'balance' => 'auto',
    'minProcesses' => 1,
    'maxProcesses' => 10,
    'memory' => 64,
    'tries' => 3,
    'timeout' => 300,
],

How should the queue supervisor balance when there are 2700 jobs in the ai queue and 5000 jobs in the import queue?

We are seeing that, though the supervisor spins up more processes, they are all given to the ai queue and not allocated to the import queue. So, we would see that ai queue processes went from 1 to 6 processes, but import stayed at 1.

To solve this problem, we extended the AutoScaler class to add this behavior. I thought it might be a nice feature for Horizon to have.

Also, I think it's worth noting in the Horizon docs about the time-to-complete behavior. As they currently read:

The auto strategy adjusts the number of worker processes per queue based on the current workload of the queue. For example, if your notifications queue has 1,000 pending jobs while your render queue is empty, Horizon will allocate more workers to your notifications queue until the queue is empty.

I think this is easy to misunderstand as it reads like it's balanced on job count, not time to complete. It required digging into the AutoScaler code to understand the behavior we were seeing.

Additionally, I think it might be nice to:

allow for configuring a custom strategy per supervisor, and/or
make the process allocation based on a combination of size + runtime

For the second point, I couldn't think of a good way to do that. I was thinking maybe that jobs should be a factor in the calculation, weighted lower than the time to complete. But I didn't spend much time trying things out.

taylorotwell · 2023-03-05T20:00:51Z

Do we really need a new balancing strategy? Could you just define a second supervisor configuration for the other queues you want to ensure have more processes?

taylorotwell · 2023-03-05T20:03:03Z

You also state you think it would be nice to allow a custom strategy per supervisor... but isn't that what you have done in this PR? Am I missing something there?

Also, can you share an example of a configuration using you new option?

cosmastech · 2023-03-05T20:58:15Z

Do we really need a new balancing strategy? Could you just define a second supervisor configuration for the other queues you want to ensure have more processes?

This was an issue we noticed on our production app.

We our running our Laravel monolith on K8s. We scale our worker pods automatically. We have 12 different queues handled by Horizon, running 3 separate supervisors.

Using the current Horizon strategy, the ai queue with 2700 jobs (but a longer time to complete each job) starves the import queue that has 5000 jobs (but a much lower time to complete), putting out of mind how it affects the other queues shared by the supervisor.

I do think that time to clear should play a factor in the balancing strategy, but it shouldn’t impact it this significantly. With a supervisor responsible for 6 queues & having maxProcesses of 24, it isn't desirable to allocate 19 processes to one queue, and 1 for each of the rest.

Ideally I think it would be best if the process allocation was based on a combination of time to clear + size of queue, but I'm not sure what that would look like.

Because of the current Horizon auto-balance strategy, we see an increase in customer support tickets. It feels like we would be playing whack-a-mole to guess which combinations of queues should share a supervisor.

Is it envisioned that production applications have a supervisor for each queue?

You also state you think it would be nice to allow a custom strategy per supervisor... but isn't that what you have done in this PR? Am I missing something there?

I could've added more clarity to that one.

Yes, that is sort of what I have done with this PR, but that's still only 2 strategies. Was thinking it might be beneficial for Horizon to allow a consumer application to modify or add their own auto-balancing strategy. (I'm not entirely should what that would look like, but just figured I would include it as a thought that I had).

Also, can you share an example of a configuration using you new option?

Here is an example of the configuration so it relies on the new strategy included in the PR:

'supervisor-medium' => [
    'connection' => 'redis_medium',
    'queue' => [
        'ai',
        'notifications',
        'import',
        'other-queue',
    ],
    'balance' => 'auto',
    'minProcesses' => 1,
    'maxProcesses' => 10,
    'memory' => 64,
    'tries' => 3,
    'timeout' => 300,
    'autoScalingStrategy' => 'size', // now this supervisor will partition # of processes to the queue with the most jobs, not the estimated time to complete
],

On Friday, we launched our app to use the size auto-scaling strategy for all supervisors and have noticed a significant improvement in balance of processes.

If a consumer doesn't set an autoScalingStrategy in the config, it will still default to 'runtime' as it works now. I believe this PR only adds new functionality without breaking backwards compatibility.

taylorotwell · 2023-03-06T19:10:29Z

Thanks!

cosmastech · 2023-03-06T19:14:51Z

Thanks!

No no, thank you @taylorotwell

adds autoScalingStrategy option

ceaaa58

cosmastech force-pushed the feature/auto-scaling-method-option branch from de7529a to ceaaa58 Compare March 4, 2023 12:08

cosmastech added 2 commits March 4, 2023 07:22

changes autoScalingStrategy property to runtime

a52e621

style fix

9ff2b48

driesvints changed the title ~~adds autoScalingStrategy option~~ [5.x] Add autoScalingStrategy option Mar 6, 2023

taylorotwell added 4 commits March 6, 2023 13:03

formatting

59138bc

move property

b941c40

fix variable

319ec2c

Update SupervisorOptions.php

597ce23

taylorotwell merged commit a447e39 into laravel:5.x Mar 6, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[5.x] Add autoScalingStrategy option #1254

[5.x] Add autoScalingStrategy option #1254

cosmastech commented Mar 4, 2023 •

edited

Loading

taylorotwell commented Mar 5, 2023

taylorotwell commented Mar 5, 2023

cosmastech commented Mar 5, 2023 •

edited

Loading

taylorotwell commented Mar 6, 2023

cosmastech commented Mar 6, 2023

[5.x] Add autoScalingStrategy option #1254

[5.x] Add autoScalingStrategy option #1254

Conversation

cosmastech commented Mar 4, 2023 • edited Loading

taylorotwell commented Mar 5, 2023

taylorotwell commented Mar 5, 2023

cosmastech commented Mar 5, 2023 • edited Loading

taylorotwell commented Mar 6, 2023

cosmastech commented Mar 6, 2023

cosmastech commented Mar 4, 2023 •

edited

Loading

cosmastech commented Mar 5, 2023 •

edited

Loading