
Client-side throttling #1019

Open
mecampbellsoup opened this issue Mar 28, 2023 · 2 comments
Labels: enhancement (New feature or request)

Comments

mecampbellsoup commented Mar 28, 2023

Problem

I want to be able to use kopf to watch resources and update last-handled-configuration annotations for a large number of resources. (We are migrating from kubectl apply --patch to using kopf as our k8s reconciler, so every previously managed k8s resource needs to be patched with the new kopf-specific annotation.)

Since we are sending individual requests for something like 50k+ k8s resources, our apiserver is responding with 429s.

Most k8s clients in other ecosystems (Go comes to mind, kubectl being a good example) implement client-side throttling so that controllers, operators, etc., do not DoS the k8s apiserver.

tl;dr: I want to be able to start kopf and "migrate" a large number of k8s resources to be managed by kopf without having to worry about DoS'ing our apiserver.

Proposal

Implement a configurable, naive semaphore that wraps the aiohttp client so that only a bounded number of requests can be in flight from kopf to the k8s apiserver at once, i.e. implement a client-side request queue.

Code

import asyncio

sem = asyncio.Semaphore(10)  # cap on concurrent apiserver requests

# ... later, around every request kopf makes:
async with sem:
    # Send the PATCH request as soon as a slot is available;
    # otherwise, wait for a slot to open up.
    ...
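
For example, a wrapper around the aiohttp session could look roughly like this (ThrottledSession and the limit of 10 are just placeholders, not an existing kopf API):

import asyncio

import aiohttp


class ThrottledSession:
    """Hypothetical wrapper: bounds the number of in-flight requests with a semaphore."""

    def __init__(self, session: aiohttp.ClientSession, limit: int = 10) -> None:
        self._session = session
        self._sem = asyncio.Semaphore(limit)

    async def request(self, method: str, url: str, **kwargs) -> aiohttp.ClientResponse:
        # Wait for a free slot before the request goes out to the apiserver.
        async with self._sem:
            return await self._session.request(method, url, **kwargs)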

Additional information

Related to #963.

mecampbellsoup added the enhancement (New feature or request) label Mar 28, 2023
mecampbellsoup (Author) commented

@nolar any thoughts on this?

nolar (Owner) commented Aug 9, 2023

Hi.

You can probably implement this logic now by defining a semaphore in the on-startup handler and putting it into “memo”. That same semaphore object will then be available in every handler of every resource. As such, you can postpone leaving the handler until the semaphore grants a slot.

Similarly, you can avoid patching via kopf and do the patch via your own API client in the handlers, with the same semaphore logic.
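
Roughly, with placeholder names (example.com/v1/widgets and my_api_client stand in for your own resource and API client):

import asyncio

import kopf

import my_api_client  # hypothetical: your own k8s API client


@kopf.on.startup()
def configure(memo: kopf.Memo, **_):
    # One operator-wide semaphore; the startup memo is propagated to the
    # per-object memos, so every handler sees the same object.
    memo.api_sem = asyncio.Semaphore(10)


@kopf.on.update('example.com', 'v1', 'widgets')  # placeholder resource
async def on_update(memo: kopf.Memo, namespace: str, name: str, **_):
    # Only a bounded number of handlers proceed at once; the rest wait here.
    async with memo.api_sem:
        await my_api_client.patch_annotations(namespace, name, {'example.com/managed': 'yes'})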

There is also control over the number of synchronous workers via the threadpool size in settings; not the most straightforward way, but also a way.
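
Roughly like this in the startup handler (the value is arbitrary; check the configuration docs for the exact attribute):

import kopf


@kopf.on.startup()
def configure(settings: kopf.OperatorSettings, **_):
    # Limit how many synchronous handlers run in the thread pool at once.
    settings.execution.max_workers = 10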

Either way, Kopf already has a throttler/retrier “on error” with configurable delays, 429s included. You can find it by searching for the Fibonacci sequence in “configuration.py” and the docs. As I understand it, you want proactive rather than reactive throttling, i.e. limiting requests before they are even attempted?
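
Those delays are adjustable in the same startup handler; the attribute name below is from memory, so verify it against the defaults in “configuration.py”:

import kopf


@kopf.on.startup()
def configure(settings: kopf.OperatorSettings, **_):
    # Back off on API errors (429s included) with custom delays, in seconds.
    settings.networking.error_backoffs = [1, 2, 5, 10, 30]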

I do not think this feature is worth adding to the operator framework — it is rarely needed and quite specialized.

But if those tricks above do not work, what I think is possible is letting users override the client used by Kopf with whatever logic they want. I already had this drafted last week in 3 different ways, but abandoned the idea because the code did not look nice in the end (it exposed too many implementation details, which I would like to be able to rework in the future without breaking “backward compatibility”). My goal was to make a mock k8s server for tests, and I finally did it the same way as “aresponses”: on the server side, without exposing or overriding the internal client. But at least one draft was okay-ish, so I can revive it.
