-
Notifications
You must be signed in to change notification settings - Fork 8.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Investigate risk of performance regression from share-capable saved object types #113743
Comments
Pinging @elastic/kibana-core (Team:Core) |
Pinging @elastic/kibana-alerting-services (Team:Alerting Services) |
@elastic/kibana-alerting-services just want to make sure you're aware of the potential performance impact as I think your plugins are the most performance sensitive |
Thanks, @rudolf. Is Pinging the @elastic/security-detections-response team as well. They have some logic to import pre-packaged rules, where I believe the function calls |
Thanks for opening this, Rudolf! I have some thoughts on your comments for specific methods, but first, a general note:
All that to say: the additional ES client calls we make in the Repository for multi-namespace saved objects don't necessarily translate into a 2x or 3x latency. It would be good to try to capture some performance data, though.
The scripted update to record the resolve outcome uses
Looking at the code, I think we have an opportunity to combine (1) and (2) 👍 I don't see why these were ever separate in the first place.
It's also worth noting that in #114693, we're going to need to add a check before |
As "luck" would have it, I discovered a bug in the current |
I've updated the description to clarify what I mean with increase in latency. Latency is the minimum impact we would see and would really only affect code where the hot path consists of saved objects operations happening in series like task manager polling and attempting to claim a task, getting a claim conflict, retrying etc. @mikecote yes, |
With the introduction of #116007, the With the introduction of #117056, the I've updated the issue description accordingly. |
We'll go ahead and close this issue as alerting was the main case we wanted to test here, and an investigation was already conducted in #115197 |
Related #112492 |
When saved object types become share-capable #100489 the saved objects repository adds some additional logic to method calls which could introduce a performance regression. It's impossible to guess how much of an impact additional queries to Elasticsearch might add. So in the analysis below I assume that the operation itself has a zero cost (takes 0ms) and the only impact is the extra time it takes for the request / response roundtrip i.e. latency. So if the latency is 200ms an 3x increase in latency would mean that at a minimum, the operation will now take 600ms to complete.
SavedObjectsClient::resolve()
Whereas
get()
uses a single Elasticsearch API callresolve()
performs the following calls serially:bulk
request that performs a scripted update on any legacy url aliases that match the resolved idmget
request to fetch matched saved objects (similar to aSavedObjectsClient::get()
)update
to record resolve outcome telemetryimpact: higher write load and 3x increase in latency
SavedObjectsClient::update()
Adds an additional
get
request if updating a multi-namespace saved object type. Also, ifupsert==true
, adds an additionalmget
request (if the object exists in <=3 spaces) orsearch
request (if the object exists in >3 spaces) to find aliases.impact: up to 3x increase in latency
SavedObjectsClient::bulkUpdate()
Adds an additional
mget
request if multi-namespace saved objects are being updated.impact: 2x increase in latency
SavedObjectsClient::incrementCounterInternal()
Adds an additional
mget
request if incrementing a multi-namespace saved object type. Also, if the object exists in >3 spaces, adds asearch
request to find aliases.impact: up to 3x increase in latency
SavedObjectsClient::create()
Adds an additional
mget
request ifoverwrite==true
and creating a multi-namespace saved object type. Also, if the object exists in >3 spaces, adds asearch
request to find aliases.impact: up to 3x increase in latency
SavedObjectsClient::bulkCreate()
Adds an additional
mget
request ifoverwrite==true
and creating 1 or more multi-namespace saved object types. Also, if any object exists in >3 spaces, adds asearch
request to find aliases.impact: up to 3x increase in latency
SavedObjectsClient::delete()
Adds an additional
get
request if deleting a multi-namespace saved object (requiresforce=true
if the object exists in >1 spaces). Also adds an additionalupdateByQuery
request to delete any associated alias.impact: up to 3x increase in latency
SavedObjectsClient::bulkDelete()
Adds an additional
mget
request if deleting multi-namespace saved objects (requiresforce=true
if the object exists in >1 spaces). Also perfroms an additionalupdateByQuery
request for each saved object in the bulk to delete any associated alias per multi-namespace saved object type. So e.g. deleting 1000 multi-namespace saved objects will create anmget
as part of the preflight check, a bulk request to actually delete the objects and 1000updateByQuery
requests (10 requests in parallel)impact: up to 3x increase in latency
The text was updated successfully, but these errors were encountered: