Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: slots being filled returned out of order on k8s [RM-42] #9276

Merged
merged 7 commits into from
May 2, 2024

Conversation

NicholasBlaskey
Copy link
Contributor

@NicholasBlaskey NicholasBlaskey commented May 1, 2024

Ticket

Description

Fix issues where slots were being returned out of order. There are two fixes in this PR

  1. Make k8s use slot IDs that will return sorted. For more than 10 gpus on an agent the response would be out of order
  2. The issue in the ticket, rbac would jumble the order.

Test Plan

Unit tests cover this pretty well

Manual

Run a devcluster with

        security:
          authz:
            type: rbac

and agent

        artificial_slots: 8

run a experiment, create a user with no access det -u admin user create test

check agents det -u test dev curl /api/v1/agents and see that the contaner is on the same slot id every refresh

Checklist

  • Changes have been manually QA'd
  • User-facing API changes need the "User-facing API Change" label.
  • Release notes should be added as a separate file under docs/release-notes/.
    See Release Note for details.
  • Licenses should be included for new code which was copied and/or modified from any external code.

@cla-bot cla-bot bot added the cla-signed label May 1, 2024
Copy link

netlify bot commented May 1, 2024

Deploy Preview for determined-ui canceled.

Name Link
🔨 Latest commit 341ab9c
🔍 Latest deploy log https://app.netlify.com/sites/determined-ui/deploys/663262020d28e10008b45dcb

Copy link

netlify bot commented May 1, 2024

Deploy Preview for determined-ui ready!

Name Link
🔨 Latest commit aaadd0f
🔍 Latest deploy log https://app.netlify.com/sites/determined-ui/deploys/6633935b793dcf0009b63e89
😎 Deploy Preview https://deploy-preview-9276--determined-ui.netlify.app
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify site configuration.

Copy link

codecov bot commented May 1, 2024

Codecov Report

Attention: Patch coverage is 91.30435% with 2 lines in your changes are missing coverage. Please review.

Project coverage is 44.60%. Comparing base (a9c8700) to head (aaadd0f).

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #9276      +/-   ##
==========================================
+ Coverage   44.58%   44.60%   +0.01%     
==========================================
  Files        1275     1275              
  Lines      156214   156226      +12     
  Branches     2451     2450       -1     
==========================================
+ Hits        69655    69679      +24     
+ Misses      86319    86307      -12     
  Partials      240      240              
Flag Coverage Δ
backend 41.80% <91.30%> (+0.03%) ⬆️
harness 64.09% <ø> (ø)
web 35.00% <ø> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.

Files Coverage Δ
master/internal/authz/obfuscate.go 54.09% <100.00%> (+33.72%) ⬆️
master/pkg/model/agent.go 1.81% <100.00%> (+1.81%) ⬆️
master/internal/rm/kubernetesrm/pods.go 21.82% <83.33%> (+0.32%) ⬆️

... and 3 files with indirect coverage changes

@determined-ci determined-ci requested a review from a team May 1, 2024 16:24
@determined-ci determined-ci added the documentation Improvements or additions to documentation label May 1, 2024
@NicholasBlaskey NicholasBlaskey changed the title fix: slots being filled returned out of order on k8s fix: slots being filled returned out of order on k8s [RM-42] May 1, 2024
@NicholasBlaskey NicholasBlaskey marked this pull request as ready for review May 1, 2024 16:49
@NicholasBlaskey NicholasBlaskey requested review from a team as code owners May 1, 2024 16:49
@NicholasBlaskey NicholasBlaskey enabled auto-merge (squash) May 2, 2024 13:22
@NicholasBlaskey NicholasBlaskey merged commit 21f76e9 into main May 2, 2024
84 of 98 checks passed
@NicholasBlaskey NicholasBlaskey deleted the fix_jumping_k8s_slots branch May 2, 2024 13:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cla-signed documentation Improvements or additions to documentation
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants