Add explicit GPU models for all makes #111

parikls · 2024-10-08T17:20:49Z

No description provided.

zubenkoivan

Looks good, lgtm once comments addressed

neuro_config_client/entities.py

asvetlov · 2024-10-10T11:49:03Z

neuro_config_client/entities.py

@@ -121,8 +121,12 @@ class NodePool:
    disk_size: int | None = None
    disk_type: str | None = None

-    gpu: int | None = None


You cannot just drop these fields; it breaks the SDK/CLI.
Some migration procedure is required.
I suggest keeping gpu / gpu_model here for a while (say, until the New Year).
gpu could be the sum of amd, intel and nvidia.
gpu_model could be the first available model or a concatenation of the available models. We don't parse these values, so nvidia-tesla-k80+amd-rizen-xxx would be fine.

Later, when all clients update CLI/FLOW on their machines, we can drop these fields.
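
The suggestion above could be sketched roughly as follows. This is only an illustration, not the code from this PR: NodePoolSketch is a hypothetical stand-in for NodePool, and the per-make field names (nvidia_gpu_model, amd_gpu, amd_gpu_model, intel_gpu, intel_gpu_model) are assumptions based on the naming used in this thread.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class NodePoolSketch:
    # Hypothetical stand-in for NodePool; the per-make field names are assumptions.
    nvidia_gpu: int | None = None
    nvidia_gpu_model: str | None = None
    amd_gpu: int | None = None
    amd_gpu_model: str | None = None
    intel_gpu: int | None = None
    intel_gpu_model: str | None = None

    @property
    def gpu(self) -> int | None:
        # Legacy field: sum of the per-make counts, or None if none are set.
        counts = [
            c for c in (self.nvidia_gpu, self.amd_gpu, self.intel_gpu)
            if c is not None
        ]
        return sum(counts) if counts else None

    @property
    def gpu_model(self) -> str | None:
        # Legacy field: "+"-joined models; clients do not parse this value.
        models = [
            m
            for m in (self.nvidia_gpu_model, self.amd_gpu_model, self.intel_gpu_model)
            if m
        ]
        return "+".join(models) or None


pool = NodePoolSketch(
    nvidia_gpu=2,
    nvidia_gpu_model="nvidia-tesla-k80",
    amd_gpu=1,
    amd_gpu_model="amd-rizen-xxx",
)
print(pool.gpu, pool.gpu_model)
```

Computing the legacy values as read-only properties keeps them in sync with the per-make fields, so they can be deleted later without touching stored data.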

@asvetlov I've restored both fields for now to avoid breaking compatibility with the SDK.

Just FYI:

gpu was already dropped from the platform config API in April: https://github.com/neuro-inc/platform-config/commit/fc83088224859d5e81b35beb426069888e05c336#diff-61fdfd6f40520f332212d51ba625421a707ce6ff111f58610dd10c84b27b1d9b

If the gpu property is sent to the API, the API converts it to nvidia_gpu under the hood: https://github.com/neuro-inc/platform-config/commit/fc83088224859d5e81b35beb426069888e05c336#diff-c2827afbcb2bed90d9593131e0055607f29b65656938e4b93d972c2d40c7e27eR118

I assume the gpu property has actually been blank in the DB since April? @zubenkoivan can you elaborate please?

In April we renamed gpu to nvidia-gpu and added two other gpu kinds, because before April we didn't have non-nvidia systems.

I like your approach of keeping gpu as an alias for nvidia, e.g. nvidia_gpu=payload.get("nvidia_gpu") or payload.get("gpu").
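
As a minimal sketch of that alias fallback (the helper names read_nvidia_gpu and read_nvidia_gpu_strict are hypothetical and the payload is assumed to be a plain dict; this is not the client's actual parsing code):

```python
from __future__ import annotations

from typing import Any


def read_nvidia_gpu(payload: dict[str, Any]) -> int | None:
    # Prefer the new key; fall back to the legacy "gpu" key when the new
    # one is missing or falsy (note: an explicit 0 also falls through).
    return payload.get("nvidia_gpu") or payload.get("gpu")


def read_nvidia_gpu_strict(payload: dict[str, Any]) -> int | None:
    # Variant that keeps an explicit 0 in "nvidia_gpu" instead of falling back.
    if "nvidia_gpu" in payload:
        return payload["nvidia_gpu"]
    return payload.get("gpu")


print(read_nvidia_gpu({"gpu": 4}))
print(read_nvidia_gpu({"nvidia_gpu": 2, "gpu": 4}))
```

The `or` form is the one quoted in the comment above. It treats a value of 0 the same as a missing key, which may or may not matter here; the strict variant is shown only to make that difference visible.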

Should NodePoolOptions and Resource have triple nvidia/amd/intel values like NodePool?
If not, please elaborate why.

@asvetlov
Resource is used for idle jobs only. I've asked @zubenkoivan about that and he mentioned that support for different manufacturers is not required there.

NodePoolOptions is used only by cloud providers. As far as I understand, we don't need to add such support for clouds.

@zubenkoivan please correct me if I'm wrong.

correct, we don't need to support this for NodePoolOptions and Resource classes

for more information, see https://pre-commit.ci

parikls requested a review from zubenkoivan October 8, 2024 17:20

zubenkoivan approved these changes Oct 9, 2024


asvetlov reviewed Oct 10, 2024


parikls added 7 commits October 11, 2024 22:19

Add explicit GPU models for all makes

e1c2d5b

small review fixes

5d117ee

apply renaming

1a59516

address review comments

cec1ef9

Restore gpu and gpu_model props for the node pool

00acbf0

fix typo

3606364

add idea to a gitignore

7ba9d6d

parikls force-pushed the explicit-gpu-make-and-model-f606fcc4960f branch from d7cf539 to 7ba9d6d Compare October 11, 2024 19:20

[pre-commit.ci] auto fixes from pre-commit.com hooks

581b721

