Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: report errors from deletecheckpoints endpoint + improve feedback #9178

Merged
merged 4 commits into from
Apr 17, 2024

Conversation

ashtonG
Copy link
Contributor

@ashtonG ashtonG commented Apr 16, 2024

Ticket

ET-169

Description

This fixes an issue where errors from the delete checkpoints endpoint were not reported in the ui. We fix that and also improve the error messaging such that the user knows which model version the checkpoint is registered to.

Test Plan

  • visit the model versions page and select a model, then a version.
  • on the resulting page, click the trial breadcrumb in the model checkpoint card
  • find the checkpoint in the trial checkpoint table
  • click the flag icon to show the checkpoint info for the checkpoint
  • click the "request checkpoint deletion" button, then click "request delete" on the resulting modal
  • An error message should pop up saying which model and version the checkpoint is registered to.
image

Checklist

  • Changes have been manually QA'd
  • User-facing API changes need the "User-facing API Change" label.
  • Release notes should be added as a separate file under docs/release-notes/.
    See Release Note for details.
  • Licenses should be included for new code which was copied and/or modified from any external code.

Copy link

netlify bot commented Apr 16, 2024

Deploy Preview for determined-ui canceled.

Name Link
🔨 Latest commit 4c8434f
🔍 Latest deploy log https://app.netlify.com/sites/determined-ui/deploys/661fe53dd232f000080280e6

Copy link

codecov bot commented Apr 16, 2024

Codecov Report

Attention: Patch coverage is 39.39394% with 20 lines in your changes are missing coverage. Please review.

Project coverage is 37.10%. Comparing base (3f70a46) to head (4c8434f).
Report is 14 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #9178      +/-   ##
==========================================
- Coverage   45.71%   37.10%   -8.61%     
==========================================
  Files        1179      600     -579     
  Lines      146795    93617   -53178     
  Branches     2419     2420       +1     
==========================================
- Hits        67110    34741   -32369     
+ Misses      79471    58662   -20809     
  Partials      214      214              
Flag Coverage Δ
backend 34.24% <ø> (-9.49%) ⬇️
harness 40.45% <ø> (-23.58%) ⬇️
web 35.82% <39.39%> (+<0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files Coverage Δ
webui/react/src/services/apiConfig.ts 72.65% <100.00%> (+0.13%) ⬆️
webui/react/src/services/api.ts 0.00% <0.00%> (ø)
webui/react/src/components/CheckpointModal.tsx 19.90% <12.50%> (-0.92%) ⬇️
.../pages/ExperimentDetails/ExperimentCheckpoints.tsx 13.59% <12.50%> (-0.46%) ⬇️

... and 665 files with indirect coverage changes

Copy link
Contributor

@keita-determined keita-determined left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

web lgtm

@@ -128,12 +126,13 @@ const ExperimentCheckpoints: React.FC<Props> = ({ experiment, pageRef }: Props)
[registerModal],
);

const handleDelete = useCallback((checkpoints: string[]) => {
readStream(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

qq: do you know why it used to use readStream

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the copy implies that we purposefully dropped the error messaging -- i think that's why we used readstream, because its default behavior is to drop errors

@keita-determined keita-determined added the to-cherry-pick Pull requests that need to be cherry-picked into the current release label Apr 16, 2024
Comment on lines 356 to 366
}
checkpointMsgs := make([]string, len(registeredCheckpointUUIDs))
i = 0
for k, v := range registeredCheckpointUUIDs {
if _, ok := accessibleModels[v.ID]; ok {
checkpointMsgs[i] = fmt.Sprintf("%v, registered to %v (model #%d), version %d", k, v.Name, v.ID, v.Version)
} else {
checkpointMsgs[i] = fmt.Sprintf("%v, registered to an unknown model", k)
}
i++
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is very strange go style, a micro-optimization not worth the cost of readability. can we write this instead:

Suggested change
}
checkpointMsgs := make([]string, len(registeredCheckpointUUIDs))
i = 0
for k, v := range registeredCheckpointUUIDs {
if _, ok := accessibleModels[v.ID]; ok {
checkpointMsgs[i] = fmt.Sprintf("%v, registered to %v (model #%d), version %d", k, v.Name, v.ID, v.Version)
} else {
checkpointMsgs[i] = fmt.Sprintf("%v, registered to an unknown model", k)
}
i++
}
}
var checkpointMsgs []string
for k, v := range registeredCheckpointUUIDs {
if _, ok := accessibleModels[v.ID]; ok {
msg := fmt.Sprintf("%v, registered to %v (model #%d), version %d", k, v.Name, v.ID, v.Version)
checkpointMsgs = append(checkpointMsgs, msg)
} else {
msg := fmt.Sprintf("%v, registered to an unknown model", k)
checkpointMsgs = append(checkpointMsgs, msg)
}
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

var checkpointMsgs []string part can be checkpointMsgs := make([]string, 0, len(registeredCheckpointUUIDs))? i think thats what Ashton meant. then we can avoid using index access and also no unnecessary memory allocation i guess

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, that works too. but still is a micro-optimization. preallocating it with cap len(registeredCheckpointUUIDs) vs just var x []T doesn't matter unless you've benchmarked that it does. i'm sure it is nothing compared to the sql query above it.

Comment on lines 330 to 335
modelIDs := make([]int, len(registeredCheckpointUUIDs))
i := 0
for _, v := range registeredCheckpointUUIDs {
modelIDs[i] = v.ID
i++
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same comment here, too

Copy link
Contributor

@corban-beaird corban-beaird left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, there are couple CI issues that need to resolved that are due to changes made in this PR

@ashtonG
Copy link
Contributor Author

ashtonG commented Apr 17, 2024

checked and both current failures are due to ci issues being tracked in slack

@ashtonG ashtonG merged commit 26f5e0b into main Apr 17, 2024
67 of 84 checks passed
@ashtonG ashtonG deleted the bug/ET-169/report-delete-checkpoint-errors branch April 17, 2024 19:38
dai-release bot pushed a commit that referenced this pull request Apr 17, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cla-signed to-cherry-pick Pull requests that need to be cherry-picked into the current release
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants