This repository has been archived by the owner on Jun 6, 2024. It is now read-only.
When PAI is deployed in the cloud, admins may want to stop idle nodes to save money. When a new job is submitted, the stopped nodes can be started again so that the job fits in.
This feature is usually called an "autoscaler", and was previously implemented in #4735. However, #4735 only works on AKS. We can design an extensible autoscaler framework that works in different cloud environments, e.g. Azure Virtual Machine Scale Sets or other cloud providers.
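To make the "extensible framework" idea concrete, here is a minimal sketch of a provider abstraction the autoscaler core could program against. All class and method names (`CloudProvider`, `VmssProvider`, `start_nodes`, `stop_nodes`) are hypothetical, not an existing PAI API; the VMSS backend is stubbed with an in-memory list instead of real Azure calls.

```python
# Hypothetical provider interface for an extensible autoscaler.
# Names are illustrative assumptions, not part of any existing PAI code.
from abc import ABC, abstractmethod


class CloudProvider(ABC):
    """Per-cloud backend; the autoscaler core stays provider-agnostic."""

    @abstractmethod
    def start_nodes(self, count: int) -> list:
        """Start up to `count` stopped nodes; return their names."""

    @abstractmethod
    def stop_nodes(self, names: list) -> None:
        """Stop the given idle nodes to save cost."""


class VmssProvider(CloudProvider):
    """Sketch of an Azure Virtual Machine Scale Set backend.

    A real implementation would call the Azure API; this stub just
    tracks stopped node names in memory to show the control flow.
    """

    def __init__(self, stopped_nodes):
        self._stopped = list(stopped_nodes)

    def start_nodes(self, count):
        started = self._stopped[:count]
        self._stopped = self._stopped[count:]
        return started

    def stop_nodes(self, names):
        self._stopped.extend(names)


# The core scaler only sees the CloudProvider interface, so adding a
# new cloud (or a hybrid of several) means adding a new subclass.
provider = VmssProvider(["node-1", "node-2"])
print(provider.start_nodes(1))  # ['node-1']
```

Supporting another cloud, or a hybrid of several, then only requires a new `CloudProvider` subclass; the scheduling logic above it is unchanged.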
There are a few points in which this proposal differs from other low-level auto-scaling services:
- More customizable. Admins can easily customize OpenPAI, an AI workload platform, to decide when and which worker nodes to scale. Admins could write custom code to enable trigger conditions such as the number of waiting jobs, virtual cluster utilization, and other high-level, end-to-end metrics.
- A small amount of code that can easily support multiple types of cloud infrastructure, including hybrid clouds.
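A custom trigger condition of the kind described above might look like the following sketch. The function name, the `cluster_status` fields, and the thresholds are all illustrative assumptions, not an existing PAI interface:

```python
# Hypothetical custom trigger condition for the autoscaler; all field
# names and thresholds below are illustrative, not an existing PAI API.

def desired_worker_count(cluster_status):
    """Decide how many worker nodes should be running, based on
    high-level metrics such as waiting jobs and VC utilization."""
    waiting_jobs = cluster_status["waiting_jobs"]
    vc_utilization = cluster_status["vc_utilization"]  # 0.0 - 1.0
    active_nodes = cluster_status["active_nodes"]

    if waiting_jobs > 0:
        # Scale out: one extra node per waiting job, capped at a limit.
        return min(active_nodes + waiting_jobs, cluster_status["max_nodes"])
    if vc_utilization < 0.3 and active_nodes > cluster_status["min_nodes"]:
        # Scale in: nodes are mostly idle, stop one to save cost.
        return active_nodes - 1
    return active_nodes


# Example: two waiting jobs trigger a scale-out from 3 to 5 nodes.
status = {
    "waiting_jobs": 2,
    "vc_utilization": 0.9,
    "active_nodes": 3,
    "min_nodes": 1,
    "max_nodes": 10,
}
print(desired_worker_count(status))  # 5
```

Because the condition is plain code over cluster-level metrics rather than per-node CPU thresholds, admins can express end-to-end policies (e.g. "never scale in while any job is queued") directly.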