-
Notifications
You must be signed in to change notification settings - Fork 24.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[ML] Wait for job updates in AutoDetectResultProcessor.awaitCompletion #36856
[ML] Wait for job updates in AutoDetectResultProcessor.awaitCompletion #36856
Conversation
Pinging @elastic/ml-core |
jobResultsProvider.getEstablishedMemoryUsage(jobId, latestBucketTimestamp, modelSizeStatsForCalc, establishedModelMemory -> { | ||
if (latestEstablishedModelMemory != establishedModelMemory) { | ||
|
||
|
||
client.threadPool().executor(MachineLearning.UTILITY_THREAD_POOL_NAME).submit(() -> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can submit()
throw? I'm not sure if it is necessary to wrap this in a try...catch and release the semaphore in the catch
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it can throw rejected execution exception if the thread pool’s queue is full or it is shutting down.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, it can use execute()
rather than submit()
as it's not interested in getting a Future
to track completion. Since you're off I'll push another commit that changes both things.
Also use execute() rather than submit() when scheduling update
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
#36856) There was a race where the job update in `AutoDetectResultProcessor.updateEstablishedModelMemoryOnJob` could execute after `AutoDetectResultProcessor.awaitCompletion` returned. This was because ` jobUpdateSemaphore` was acquired after the call to `jobResultsProvider.getEstablishedMemoryUsage` and during that call `awaitCompletion` is free to acquire and release the semaphore after which the method returns. This commit fixes the problem. Closes #36849
There is a race where the job update in
AutoDetectResultProcessor.updateEstablishedModelMemoryOnJob
may execute afterAutoDetectResultProcessor.awaitCompletion
returns. This was becausejobUpdateSemaphore
was acquired after the call tojobResultsProvider.getEstablishedMemoryUsage
and during that callawaitCompletion
is free to acquire and release the semaphore after which the method returns.Closes #36849