Commit

Revert regression from PR elastic#98139 affecting model deployment updates

This commit reverts the changes to the memory usage estimation logic introduced by PR elastic#98139, which caused failures when updating `number_of_allocations` for trained model deployments. The revert restores the system's stability in high-availability environments.

Relates elastic#107807
Rassyan committed Apr 24, 2024
1 parent 21c32b0 commit 37bbd9c
Showing 1 changed file with 1 addition and 20 deletions.
@@ -714,26 +714,7 @@ public static long estimateMemoryUsageBytes(
         int numberOfAllocations
     ) {
         // While loading the model in the process we need twice the model size.
-
-        // 1. If ELSER v1 or v2 then 2004MB
-        // 2. If static memory and dynamic memory are not set then 240MB + 2 * model size
-        // 3. Else static memory + dynamic memory * allocations + model size
-
-        // The model size is still added in option 3 to account for the temporary requirement to hold the zip file in memory
-        // in `pytorch_inference`.
-        if (isElserV1Or2Model(modelId)) {
-            return ELSER_1_OR_2_MEMORY_USAGE.getBytes();
-        } else {
-            long baseSize = MEMORY_OVERHEAD.getBytes() + 2 * totalDefinitionLength;
-            if (perDeploymentMemoryBytes == 0 && perAllocationMemoryBytes == 0) {
-                return baseSize;
-            } else {
-                return Math.max(
-                    baseSize,
-                    perDeploymentMemoryBytes + perAllocationMemoryBytes * numberOfAllocations + totalDefinitionLength
-                );
-            }
-        }
+        return isElserV1Or2Model(modelId) ? ELSER_1_OR_2_MEMORY_USAGE.getBytes() : MEMORY_OVERHEAD.getBytes() + 2 * totalDefinitionLength;
     }

     private static boolean isElserV1Or2Model(String modelId) {
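
To illustrate what the hunk above changes, the following is a minimal standalone sketch that puts the removed and the restored estimates side by side. It is not the Elasticsearch source: the class name `MemoryEstimateSketch`, the plain `long` constants (taken from the 240MB and 2004MB figures in the removed comment), the body of `isElserV1Or2Model`, and the sample sizes in `main` are all assumptions made for illustration. The restored method in the real code keeps its full parameter list; the sketch drops the parameters that the restored one-liner no longer reads.

```java
// Illustrative sketch only; names and constants are assumptions, not the Elasticsearch source.
final class MemoryEstimateSketch {
    static final long MB = 1024L * 1024L;
    // Values taken from the removed comment ("240MB", "2004MB"); the real code uses ByteSizeValue constants.
    static final long MEMORY_OVERHEAD_BYTES = 240 * MB;
    static final long ELSER_1_OR_2_MEMORY_USAGE_BYTES = 2004 * MB;

    // Hypothetical stand-in: the real method body is not shown in this diff.
    static boolean isElserV1Or2Model(String modelId) {
        return modelId.startsWith(".elser_model_1") || modelId.startsWith(".elser_model_2");
    }

    // Logic removed by this commit: the estimate can grow with numberOfAllocations.
    static long estimateBeforeRevert(
        String modelId,
        long totalDefinitionLength,
        long perDeploymentMemoryBytes,
        long perAllocationMemoryBytes,
        int numberOfAllocations
    ) {
        if (isElserV1Or2Model(modelId)) {
            return ELSER_1_OR_2_MEMORY_USAGE_BYTES;
        }
        long baseSize = MEMORY_OVERHEAD_BYTES + 2 * totalDefinitionLength;
        if (perDeploymentMemoryBytes == 0 && perAllocationMemoryBytes == 0) {
            return baseSize;
        }
        return Math.max(
            baseSize,
            perDeploymentMemoryBytes + perAllocationMemoryBytes * numberOfAllocations + totalDefinitionLength
        );
    }

    // Logic restored by this commit: the estimate ignores the allocation count.
    static long estimateAfterRevert(String modelId, long totalDefinitionLength) {
        return isElserV1Or2Model(modelId)
            ? ELSER_1_OR_2_MEMORY_USAGE_BYTES
            : MEMORY_OVERHEAD_BYTES + 2 * totalDefinitionLength;
    }

    public static void main(String[] args) {
        // Made-up sizes for a hypothetical model; not measurements.
        long modelSize = 100 * MB;
        long perDeployment = 300 * MB;
        long perAllocation = 200 * MB;
        for (int allocations : new int[] { 1, 2, 4 }) {
            long before = estimateBeforeRevert("my-model", modelSize, perDeployment, perAllocation, allocations);
            long after = estimateAfterRevert("my-model", modelSize);
            System.out.println(allocations + " allocation(s): before=" + before / MB + "MB, after=" + after / MB + "MB");
        }
    }
}
```

With these made-up inputs, the removed logic returns a larger estimate each time the allocation count goes up, while the restored logic returns the same value for every count. That dependence on `numberOfAllocations` is the behavior this commit removes.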