Releases: determined-ai/determined
0.26.1
Release Notes
Changelog
- a6b26b0 chore: bump version: 0.26.1-rc3 -> 0.26.1
- 4bd3dcb docs: add release notes for 0.26.1 (#8131)
- de1526b chore: bump version: 0.26.1-rc2 -> 0.26.1-rc3
- 6e19285 fix: Return ResourcePools with a fixed order (#8103)
- 355cb62 chore: bump version: 0.26.1-rc1 -> 0.26.1-rc2
- 3f97397 fix: Trial data loading state (#8083)
- 740e730 fix: trigger `autoupdate_users_modified_at` by `username` change (#8093)
- 288db9f fix: single point different axis ranges (#8096)
- d972837 fix: Empty / NotEmpty operators for descriptions, tags [WEB-1751] (#8090)
- 3c5b281 fix: ignore `last_auth_at` to update `modified_at` on users table (#8091)
- b086b84 chore: bump version: 0.26.1-rc0 -> 0.26.1-rc1
- 2283723 fix: Metric type is blank in comparison chart (#8085)
- 90dfb54 fix: Experiment name settable in fork config (#8081)
- ce7549a fix: shell open gives unfriendly message on terminated (#8074)
- 663f4d8 docs: quick fix for version dropdown (#8070)
- 180d1b3 chore: bump version: 0.26.1-dev0 -> 0.26.1-rc0
- 1a96b8a chore: lock published urls to preserve redirects
- d1c3fa4 Revert bump version (#8063)
- 9b3b65a chore(tests): fix data race in grpclog init during integrations (#8058)
- ef0ef50 chore: lock published urls to preserve redirects
- b241b78 chore: bump version: 0.26.1-dev0 -> 0.27.0-dev0
- b14cc40 test: stable diffusion example tests [MLG-903] (#7855)
- 4cb3151 docs: Improve upgrade instructions (#8032)
- aa736a2 fix: detached mode tensorboard storage support [MLG-872] (#7992)
- d11f6c9 fix: fake cert gen.sh generated broken certs (#8055)
- a8a3399 fix: bring back logging for `core.train.report_*` calls. (#7975)
- f4524ac fix: pass final forked config to server for new experiment (#8051)
- 5ff6e9c feat: update InlineForm inputs (#8033)
- 9c05b00 feat: updating agent user group affects users.modified_at (#8052)
- 9414539 fix: fasterrcnn image not found [MLG-516] (#8047)
- 6aca426 chore(tests): fix data races in k8s tests (#8049)
- 764f407 chore(tests): fix data races in telemetry tests (#8048)
- 90fb051 chore: rename 'cancel' to 'stop' for experiments and trials [WEB-291] (#8038)
- a4c244e fix: command resolve pool defaulting to workspace 0 instead of 1 (#8050)
- ccefc6d chore: allow passing clusterID in master config (#8042)
- 1214a2b fix: Return ResourcePools rather than names (#7990)
- 7aa8541 fix: govcloud agent AMIs are out of sync with bumpenvs [MLG-986] (#7983)
- be1f734 chore: add DatePicker to UI Kit [WEB-1674] (#8040)
- 5bd7b8a feat: SDK can list workspaces. (#7765)
- c0467b0 ci(performance): create initial gha workflow [INFENG-224] (#7969)
- 9dcedad fix: k8s custom pod spec affinity would get ignored (#8043)
- f4d3c47 fix: Note UI respects project permissions (#8028)
- 338d2f6 fix: rename `last_login` to `last_auth_at` (#8022)
- 34373ac feat: Batch actions for multiple users into one request [WEB-1640] (#7971)
- 1d85304 test: fix shell open test flake (#8035)
- 827d9a6 feat: Filter user list by role id for EE (#7988)
- e57b8d6 fix: Hide checkpoint deletion btn when already deleted (#8039)
- 04609e0 fix: Remove the unexposed GetJobQStats from RM interface and all RMs (#8030)
- d00ddc9 chore: Add error case and tests to Loadable [WEB-1333][WEB-1711] (#8025)
- 78dfc83 fix: Support longer titles on HParam scatter plots (#8031)
- fd9b406 fix: nil ptr for Proto() on users who haven't logged in (#8029)
- 846fbc3 fix: add css formating for miltiple input errors (#8011)
- 42a4911 chore: adding an example for distributed batch inference for mnist (#7976)
- c83dfe6 docs: fair-share scheduling policy [skip ci] (#7981)
- ccfda89 fix: Add fetch to resource pool bindings page (#8023)
- 4b085ff chore: move Loadable to kit [WEB-1688] (#7973)
- 67fdacb fix: sort files for conflict resolution in sharded checkpoints (#8014)
- 10fcb10 feat: add tooltip linebreak to the project card (#7995)
- 4b4ad8d fix: add existing "non-setting" query parameter into the settingsToQuery function (#8013)
- f021ac2 test: fix master intg user flake (#8019)
- 8937f2b fix: align default markdown font-family to theme [WEB-617] (#8009)
- e03e469 feat: "det notebook|shell|tensorboard open" doesn't error when task not ready (#8008)
- 109e69f fix: tensorboard deletion when det e delete [DET-9844] (#7997)
- e3ffc4f docs: Fix minor issues (#7999)
- 2e77f52 fix: regular integer spacing of chart ticks [WEB-1714] (#8010)
- f1ae472 feat: add `last seen` column in user management table (#7991)
- d2a5505 refactor: remove external dependencies from UI kit [WEB-1689] (#7968)
- e86abd5 fix: fix sorting in GetUsers endpoint (#8001)
- 69921eb feat: commands download user files at startup so k8s can support larger context directories [DET-8830] (#7889)
- 86e6f47 fix: report searcher progress according to reporting period (#8006)
- 5b2f238 ci: more precisely select files for splitting in E2E tests (#7989)
- f918188 fix: show deep files in experiment code viewer (#7945)
- 1d2b60b fix: data fetch shouldn't interrupt editing model version description [WEB-1703] (#8000)
- 7dff1f2 chore: bun debug mode off (#7996)
- 0241795 fix: Version dropdown in docs is scrollable (#7994)
- 46da826 ci: disable docs review action [skip ci] (#7982)
- 390e0ac chore: add tests for postgres_users.go (#7875)
- a60c0d7 fix: Hide action menu on dashboard project cards (#7986)
- a2fb7ef chore: bump version: 0.26.0-dev0 -> 0.26.1-dev0
- 7eaf361 docs: add release notes for 0.26.0 (#7987)
- 25439fc chore: put grpc panics and other logs into 'master logs' (#7965)
- 8c373cf fix: Force GCP node name length to be less than maximum length (#7964)
- 7f21b4f chore(templates): refactor templates to their own package (#7876)
- b9fb3eb chore: Add toast to UI kit (#7950)
- d45e22b feat: Support filter by status and role for users (#7953)
- 5b5fb2f chore: fix det deploy aws requiring --db-size (#7984)
- e254bf3 docs: Update custom pod specs page (#7970)
- 3d1ec1f fix: Update Tasks Stats Causes Deadlock [DET-9853] (#7980)
- 5c5e0ec feat: single experiment continue [DET-9703] (#7764)
- 34c5bdb docs: document prometheus auth (#7957)
- 6e3cad3 chore(codeowners): map performance dir to web team (#7916)
- 9a5cfd7 fix: handle nil actor message and nil actor errors in agent RM (#7951)
- d25612b feat: Add instance flavor and size arguments for det deploy aws [INFENG-227] (#7931)
- e031a2f fix: update `modified_at` by insert in user table (#7949)
- 2d65e82 fix: Change measure of text lines in log containers [WEB-1664] (#7860)
- 14a779c chore: add last_login column to users table/model (#7948)
- f4e3638 chore: fix docs reference in cli [MLG-891] (#7926)
- a4fdd0a chore: Improve failure diagnostics in shell test [FE-216] (#7932)
- b576b6b chore: less verbose mockery output (#7822)
- 9df980c chore(performance): add initial Makefile and README (#7914)
- 47a4070 chore: handle case where steps completed is more than max length (#7816)
- 1193acd chore: log k8s nil event objects at trace level and ignore (#7962)
- 1c86762 chore: max_slots_per_pod can be per resource pool [DET-9771] (#7923)
- 3267b1f fix: return searcher_metric_value as-is (#7961)
- b5845b7 fix: singularity agent env variable (#7960)
- 19d703b fix: mitigate user settings race conditions (#7905)
- f4382a7 fix(db): handle erroneous nulls from the summary metric migration (#7958)
- 65cabae fix(experiments): don't transition experiment to "" state on crash (#7956)
- aa30e86 docs: quick fix for version dropdown (#7952)
- 18e2f1f fix(scheduler): tolerate missing groups in priority scheduling by skipping them (#7947)
- 4b19928 chore: bunify & tidy up internal/user (#7886)
- 3f067a3 fix(allocation): allocation lifetimes should contain resource lifetimes (#7944)
- b6b5a84 Remove references to --auto-bind-mount (#7910)
- 2ba2580 chore(deps): bump tibdex/github-app-token from 2.0.0 to 2.1.0 (#7938)
- 2baaf30 fix: clear selection after action (#7921)
- 4db1c08 refactor: css in docs (#7934)
- a392dc8 fix: button in 404 page (#7936)
- 0e3f4d0 fix: avoid hiding tabs in single trial experiment [WEB-1651] (#7941)
0.26.0
Release Notes
Changelog
- 29705a8 chore: bump version: 0.26.0-rc3 -> 0.26.0
- 084e485 docs: add release notes for 0.26.0 (#7987)
- 2882e78 chore: bump version: 0.26.0-rc2 -> 0.26.0-rc3
- 623774e fix: Update Tasks Stats Causes Deadlock [DET-9853] (#7980)
- c11c5e4 fix: handle nil actor message and nil actor errors in agent RM (#7951)
- df6c317 fix: update `modified_at` by insert in user table (#7949)
- 9a795c1 chore: bump version: 0.26.0-rc1 -> 0.26.0-rc2
- d330154 chore: log k8s nil event objects at trace level and ignore (#7962)
- a8465d8 fix(db): handle erroneous nulls from the summary metric migration (#7958)
- ab9d933 fix(experiments): don't transition experiment to "" state on crash (#7956)
- 7b506f9 fix(allocation): allocation lifetimes should contain resource lifetimes (#7944)
- 2b12176 chore: bump version: 0.26.0-rc0 -> 0.26.0-rc1
- 78482b2 docs: quick fix for version dropdown (#7952)
- 0893e30 fix: clear selection after action (#7921)
- a9a382f fix: button in 404 page (#7936)
- 7f85454 chore: bump version: 0.26.0-dev0 -> 0.26.0-rc0
- 9ac8f82 chore: lock published urls to preserve redirects
- 6ba27fe chore: lock api state for backward compatibility check
- b52b3a6 chore: bump version: 0.25.2-dev0 -> 0.26.0-dev0
- c2cea7d chore: include api op and param description in py bindings (#7798)
- c98cc07 feat: Allow passing in swagger json as an argument (#7843)
- 482285f docs: Add another top nav link (#7933)
- d9e1bb5 chore: track dead code [WEB-258] (#7924)
- 537cc3d docs: Update launcher version to 3.3.8 for consistency with docs (#7915)
- 5b859c6 fix(cli): det model describe should call GET /model not GET /models (#7912)
- 262c33a docs: Clarify weighted fair-share scheduling policy (#7913)
- 4a486bb feat: Add performance tests for endpoints used in the WebUI initial load [WEB-1459] (#7906)
- cc360ac feat: Add workspaces to the SDK client (#7883)
- c87ca94 fix: api_command.go does not merge map values when overrides TaskContainerDefaults [FE-114] (#7887)
- dc40688 chore: update docs ownership per discussion [INFENG-225] [skip ci] (#7907)
- ee79213 test: fix e2e_tests ray dependency. (#7925)
- 688ff63 fix: align items in task list (#7894)
- 694f44a feat: submit forms in modals by pressing enter [WEB-1130] (#7857)
- 376ea50 fix: Display data point in line chart when epoch is 0 (#7898)
- 8eff8ac chore: update user docs (#7902)
- 4441ac8 feat: Input should capture the Esc button and Clicks while focused [WEB-1251] (#7859)
- 0365ca7 revert: "chore(actors): remove pkg/actors usage from pods.go (#7658) [DET-9652]" (#7908)
- c1a0cf4 chore(actors): remove pkg/actors usage from pods.go (#7658) [DET-9652]
- ff2e16d fix: NTSC use workspace's agent group info (#7892)
- 8bef0d4 chore: no code owners for auto-generated files (#7896)
- 5278758 chore: increase Go's max line length to 120 (#7903)
- b09334d feat: Add display name to user list in cli [MLG-930] (#7901)
- 4301fc2cc feat: move UI related files to the UI kit. (#7852)
- 3f9a980 feat: Hide code related actions based on model definition size (#7854)
- dde10f1 Revert "ci: temporarily move e2e to only nightly [skip ci] (#7837)"
- 35aa028 feat: add an API to get an allocation's exit status (#7731)
- e56ed43 chore: prompt for docs in github question template (#7895)
- 4d827cd fix: remove unused go code (#7893)
- b93f0c9 feat: disable actions of unmanaged experiments/trials (#7874)
- 3b1bebf fix: Metadata deleting last row, cancelling delete [WEB-1655] (#7805)
- 9cddca9 Refactor: Use userSettings store in learning curve (#7783)
- 4a6afb1 feat: add config option to omit default resource pools (#7885)
- e5555ca fix: redefine user columns updated in postgres_users toUpdate (#7890)
- 29561a8 test: quarantine nightly cifar10-keras convergence test (#7780)
- 1cfc7f3 fix: Project move/delete updates UI state [WEB-1668] (#7870)
- 00dfcca test: enable command run tests for hpc (#7880)
- 90d66b8 chore: disable interactive matching for dev bindings (#7747)
- b6632d1 chore: postgres_users.go bun migration [DET-8238] (#7769)
- 826f2b4 feat: containerize performance tests [INFENG-222] (#7863)
- 66f6f4a ci: fix webui test results upload (#7877)
- 64ec5ed Revert "feat: add config option to omit default resource pools (#7696)" (#7878)
- b95d57f fix: use setPartial in experiment list setting (#7873)
- e18d5cf feat: add config option to omit default resource pools (#7696)
- 5e69b6c feat: k8s agent enable disable [DET-9750] (#7779)
- d2e5abb docs: Remove black borders on gif (#7872)
- 144dd0f test: remove deepspeed marks from dsat tests (#7871)
- 40b4341 docs: Add gif to the Readme (#7865)
- a5b29cf docs: Adjust diagrams replacing fluentbit icon (#7867)
- 096935f chore: update user docs (#7864)
- 6d5ad2d docs: Add page for using Determined Agent on Slurm/PBS (#7866)
- 79060ba chore: Remove imagenet (#7664)
- cdceeac feat: Display unmanaged experiments with label (#7861)
- 16a6262 fix: Make models list editing work via ModelActionDropdown [WEB-1603] (#7799)
- 84a6612 fix: doc url in jupyter config modal (#7862)
- 5e04699 fix: fix how we are calling the bert embedding example (#7851)
- d8c7bd2 docs: Clarify meaning of trial api (#7818)
- 813ed36 fix: error message for `det agent [enable|disable]`. (#7839)
- 9132dcb feat: expose `externalExperimentId` and `externalTrialId` (#7840)
- 9874951 chore: bump version: 0.25.1-dev0 -> 0.25.2-dev0
- a5bdfa7 docs: add release notes for 0.25.1 (#7850)
- 38dc440 chore: trial actor refactor (#7821)
- f3aaf4d fix: pass configString once to createexperimentmodal (#7849)
- 346d4aa chore(deps): bump tibdex/github-app-token from 1.8.2 to 2.0.0 (#7847)
- 93e5341 feat: helm ca.cert injection, cluster-wide non-namespaced res creation flag, password change and minor-fix (#7808)
- 442bac6 feat: backend support for inference metric tracking part 2 (#7592)
- 06080b9 feat: allow metrics with duplicate keys and the same value [MLG-890]. (#7820)
- e42c973 feat: enable display of metrics with floating point epoch [MLG-857] (#7829)
- 3c9e0e2 feat: add new API endpoint to get and post accelerator data (#7723)
- febbe18 fix: enable RP bindings management for workspace admins (#7834)
- e843173 ci: temporarily move e2e to only nightly [skip ci] (#7837)
- 4956673 fix: display `progress` value as it is (#7836)
- 4d74e95 refactor: flipped k8's enable reattach to always true [DET-9726] (#7692)
- 4010b74 chore: nil exception on GetResourcePoolsRequest error (#7835)
- 12d393f fix: dupe checkpoints (#7833)
- 6b12390 fix: log viewer not updating when page switched (#7823)
- 590ea21 ci: fix check-rebaseable syntax [ci skip] (#7826)
- a99385d ci: Add a newline to the output for pre-check (#7824)
- c2ce179 chore: support binary output via dev curl (#7778)
- d78bafe fix: SSO button text color (#7819)
- b4f6f0c fix: correct useResize hook to return proper element sizes [WEB-1656] (#7807)
- e31a077 chore: Split out partial updates into setPartial (#7815)
- 8bcee31 chore: make pre-commit dev setup opt-in. (#7774)
- 675de43 chore: minor copy change (#7810)
- dceb00c chore: agent device discovery too greedy (#7802)
- 1284914 chore(deps): bump actions/checkout from 3 to 4 (#7786)
- 9651c9b fix: progress filter in exp (#7811)
- c284b09 fix: lower severity of allocation log changed when debugging (#7803)
- ff830f9 fix: Learning curve will send falsey metricType (#7809)
- 3c2ab1e chore(deps): bump tibdex/github-app-token from 1.8.0 to 1.8.2 (#7772)
- da23134 docs: HPC launcher doc tweaks, add image scheme docker-archive:// (#7812)
- 1a89c56 docs: Add sections on HPC upgrade and package verificaiton (#7804)
- a51892e fix: Avoid dropdown repeating in ExpList fields dropdown [WEB-1598] (#7800)
- fab413b chore: tools/k8s doesn't use coscheduler (#7795)
- 2b95373 docs: Update the installation guide (#7762)
- 8291a18 ci: quarantine some flaky nightlies (#7725)
0.25.1
Release Notes
Changelog
- 39a421a chore: bump version: 0.25.1-rc2 -> 0.25.1
- 61c11df docs: add release notes for 0.25.1 (#7850)
- e0d0ed2 chore: bump version: 0.25.1-rc1 -> 0.25.1-rc2
- 74eeb77 fix: enable RP bindings management for workspace admins (#7834)
- 1d8e3d2 fix: display `progress` value as it is (#7836)
- 2c86593 chore: bump version: 0.25.1-rc0 -> 0.25.1-rc1
- 117b173 fix: log viewer not updating when page switched (#7823)
- b93bc72 fix: SSO button text color (#7819)
- cfdacb4 fix: correct useResize hook to return proper element sizes [WEB-1656] (#7807)
- 81b673d fix: progress filter in exp (#7811)
- 29ad1d1 fix: Learning curve will send falsey metricType (#7809)
- b0a7e4e fix: Avoid dropdown repeating in ExpList fields dropdown [WEB-1598] (#7800)
- ebd1906 docs: Update the installation guide (#7762)
- 1bf08e5 chore: bump version: 0.25.1-dev0 -> 0.25.1-rc0
- 7f7e89b chore: lock published urls to preserve redirects
- 59ebdf0 chore: lock api state for backward compatibility check
- 9b6c6c7 fix: Get distributed jobs working with devcluster [FE-181] (#7785)
- d786078 chore: revert trial actor refactor (#7797)
- f4ca02a docs: quick fix for version dropdown (#7796)
- 72d34d9 chore: reduce master log noise (#7794)
- 12a513c chore: Create/document a mechanism to run the nightly tests on a PR [FE-146] (#7750)
- af24954 Revert "chore: track dead code [WEB-258] (#7767)" (#7793)
- c815f76 fix: handles custom TLS certs in enrich_task_logs.py [DET-9803] (#7782)
- 5c83901 chore: remove empty `determined/common/api/checkpoint/`. (#7776)
- a2b873f chore: suppress the daemonize message on HPC jobs (#7775)
- 5f2f6b8 fix: glitchy width in code editor (#7771)
- bdeb0ea chore: track dead code [WEB-258] (#7767)
- 2167292 chore: trial actor refactor (#7559)
- 06e361e fix: correct date range for avg queued time charts [WEB-1621] (#7754)
- 7c765ae refactor: remove fluent bit & replace with slurm log shipper [DET-9704] (#7639)
- 214198d fix: include `unmanaged` field in `GetExperiment`. (#7768)
- f8caa0e feat: Create performance tests [WEB-1458] (#7741)
- b0badb2 fix: Handle chart x-axis with all points at x=0 [WEB-1622] (#7760)
- a6d0fba chore: rearrange log level constants (#7752)
- b22f652 chore: ignore flake8 import restrictions pre-commit check (#7759)
- ef8a295 chore: bump version: 0.25.0-dev0 -> 0.25.1-dev0
- 9333c9d docs: add release notes for 0.25.0 (#7756)
- 8af14ab chore(actors): refactor pod.go (#7617)
- 046e060 test: make error checking case insensitive fixing rbac test (#7749)
- 6c530d3 build: fix `go-version-check` command (#7751)
- 418931b refactor: make glide-table conform to standard event handler pattern and fix paginated row selection bug [WEB-1471, WEB-1561] (#7704)
- 93d861d chore: use Message for no data in ComparisonView (#7654)
- 4d25428 docs: tweak brew instructions (#7743)
- cf57ce4 chore: upgrade go 1.20 to 1.21 (#7657)
- 0553f19 fix: make rbac messages consistent (#7745)
- e1675b2 fix: not all resource pools should be labeled "default" [WEB-1600] (#7744)
- 2e907ce fix: resource pool card workspace tweaks (#7732)
- 5154c3b fix: proxy tunnel server should use `SO_REUSEADDR`. (#7735)
- e845836 fix: React build issue (#7742)
- 2fc5f57 fix: Faster polling for first experiment metrics [WEB-1576] (#7740)
- ad46c44 chore: add a new assertion method to check command exit status and report any errors (#7737)
- 418d5ae fix: allow deletion of workspaces when case-insensitive matches exist (#7738)
- 690d451 docs: Reorganize model dev guide sidenav (#7713)
- 9936984 fix: properly display group metrics in metrics tab charts [WEB-1604] (#7727)
- 15e150b fix: allow zeroes for user agent id and group agent id (#7730)
- 8c84750 fix: catch correct import error and set tensorboard logging to false for --test --local (#7715)
- b6f4f30 fix: allow NodeInformer to fail with permission error [DET-9772] (#7703)
- 00a0bc3 fix(cli): not found errors should retain useful context (#7733)
- 688ea88 fix: fix failing e2e_cpu tests (#7734)
- 613e0ce chore: New constructor for Determined objects using existing session. (#7663)
- 108ffea fix: backfilled tasks weren't seen as trial tasks (#7729)
- 1fcd2f9 docs: Add user guides to the Documentation section (#7721)
- d20f577 fix: changing x axis type should reset any current custom zoom (#7728)
- ef5ae83 chore: update determined cli to handle timestamp format for external jobs (#7668)
- 2eadef1 feat: show external jobs on the resource pool page (#7666)
- 2a570bb chore: crash cluster given RM crash (#7621)
- b66ff4a fix: correct GPU name for A100-80GB. (#7724)
- fdddcbf chore: Add nightly tests to release branches (#7720)
- ae6c927 fix: reset chart min/max when changing xaxisdomain (#7719)
- 5e6af2a fix: properly encode metric to keys for LineChart and ParallelCoordinates (#7714)
0.25.0
Release Notes
Changelog
- fea5014 chore: bump version: 0.25.0-rc7 -> 0.25.0
- 3201f27 docs: add release notes for 0.25.0 (#7756)
- 29fbea2 chore: bump version: 0.25.0-rc6 -> 0.25.0-rc7
- 16509f6 test: make error checking case insensitive fixing rbac test (#7749)
- 79a5faa chore: bump version: 0.25.0-rc5 -> 0.25.0-rc6
- 9fde7c7 fix: not all resource pools should be labeled "default" [WEB-1600] (#7744)
- 154168f fix: resource pool card workspace tweaks (#7732)
- d3a42d7 chore: bump version: 0.25.0-rc4 -> 0.25.0-rc5
- 41f9251 fix: React build issue (#7742)
- 1c81e4c chore: bump version: 0.25.0-rc3 -> 0.25.0-rc4
- c4443b3 fix: make rbac messages consistent (#7745)
- 1a83243 fix: allow deletion of workspaces when case-insensitive matches exist (#7738)
- f298ebf fix: properly display group metrics in metrics tab charts [WEB-1604] (#7727)
- 450d1b5 fix: allow zeroes for user agent id and group agent id (#7730)
- fd13b29 chore: bump version: 0.25.0-rc2 -> 0.25.0-rc3
- c121387 chore: bump version: 0.25.0-rc1 -> 0.25.0-rc2
- 1cea783 fix: allow NodeInformer to fail with permission error [DET-9772] (#7703)
- 5d5718f fix(cli): not found errors should retain useful context (#7733)
- 98e0621 fix: backfilled tasks weren't seen as trial tasks (#7729)
- 306baaa fix: changing x axis type should reset any current custom zoom (#7728)
- 406656f chore: bump version: 0.25.0-rc0 -> 0.25.0-rc1
- 00f3af9 fix: correct GPU name for A100-80GB. (#7724)
- 4dd33e4 fix: properly encode metric to keys for LineChart and ParallelCoordinates (#7714)
- a028cd2 chore: Add nightly tests to release branches (#7720)
- 17796e3 fix: reset chart min/max when changing xaxisdomain (#7719)
- 05f808b chore: bump version: 0.25.0-dev0 -> 0.25.0-rc0
- 1cb537e chore: lock published urls to preserve redirects
- 7196181 chore: lock api state for backward compatibility check
- 6e39429 chore: bump version: 0.24.0-dev0 -> 0.25.0-dev0
- 1c8ce3f fix: add missing workspace_id from get_templates (#7706)
- efdc70b fix: code cleanup for mapx unit test (#7710)
- 0a34529 fix: add unit test cases for mapx methods Values and Clear (#7699)
- 1a4bee4 feat: `det deploy gcp` support for a2-ultragpu and g2-standard. (#7702)
- 2fd2535 fix: users can see inaccessible RPs (#7707)
- e83660c fix: rp bindings intg test failure (#7701)
- f1e9b72 chore: Remove estimatortrial (#7700)
- e2f2173 feat: replace clone function with structuredClone and add polyfill (#7624)
- f528cc6 fix: botched rebase/rename in the detached mode. (#7695)
- 08de858 fix: Continue Trial modal does not reset mode [WEB-1566] (#7688)
- f9c5600 fix: error message in jupyter (#7693)
- 2607fd1 fix: patch workspace has duplicate update statements (#7697)
- bf1b87b fix: correct outstanding error in mapx (#7698)
- cdc41e9 fix: add type check to pod spec merge (#7691)
- e340d50 chore: add Values and Clear methods for mapx (#7669)
- 6bc3c68 docs: algolia scraper to scrape only xml (#7690)
- 0977986 docs: fix new release notes (#7694)
- 76134b2 chore: dev cli support for calling master apis (#7462)
- 43c715b docs: add release notes for 0.24.0 (#7680)
- c513e70 chore: add new RBAC permission view external jobs (#7671)
- 128b106 docs: work around bug causing version dropdown to fail (#7685)
- 26559ed feat: check if default resource pools are bound (#7687)
- 7d8fce5 docs: improve writing of the github readme (#7689)
- 857309f feat: add rp bindings permissions (#7673)
- 5c68568 chore: api intg tests [DET-9725] (#7589)
- 1ad812d docs: Improve the GitHub Readme (#7613)
- 8239b19 fix: default pools editable and submittable (#7647) (#7672)
- 4f60d64 chore: unpin click version (#7684)
- ec90842 chore(deps): bump arduino/setup-protoc from 1 to 2 (#7537)
- b4cbe9c chore: enable mask closable by default for drawers (#7676)
- b1f02ed chore: limit reported slots (#7683)
- 75c1f17 fix: tensorflow version for macos (#7679)
- 0a5b406 fix: allow special characters in user manangement filter (#7681)
- 1af32e3 feat: Update user.modified_at when user added or removed from groups (#7665)
- 23a8224 fix: Case-insensitive client-side username search [DET-9770] (#7677)
- a368a55 chore: limit reported slots (#7648)
- 60a07e4 fix: make -C master clean build [DET-9333] (#7660)
- fcf7807 chore: custom metrics group in new experiment list (#7518)
- dc006b1 docs: Fix formatting (#7670)
- 6ee0521 docs: Introduce users to pachyderm w det (#7661)
- 4ef8b20 feat: detached mode v1 / core api v2. (#7060)
- d580ecf fix: allow checkpoints to be GCed without validation metrics and add tests (#7653)
- 204caa5 fix: optional chaining in `extractMetricValue` (#7662)
- 0ead6ac chore: telemetry actor refactor [DET-9663] (#7585)
- e437bd8 docs: Point to pytorch distributed launcher (#7649)
- 08ff4be docs: fix epoch metrics article (#7643)
- 395aa40 docs: Update resource pool to workspace mapping (#7642)
- 28cbdc2 fix: properly show the pagination for experiment list paged view (#7638)
- 68624f6 chore: avoid creating new table columns for non-legacy metrics (#7656)
- e558063 chore: Add eslint rule for imports to take one line [WEB-1567] (#7650)
- 0bc5dce ci: bump everything to torch==1.11 (#7599)
- 4eb6ebe fix: Project delete/move triggers update of workspace projects list [WEB-1497] [WEB-1377] (#7646)
- 145dd63 ci: indicate GHA run URL when reporting a cherry-pick conflict (#7635)
- 8d2b531 chore: Show Tooltip instead of actions for Default Resource Pools [WEB-1554] (#7644)
- 0bab558 fix: select component width (#7640)
- 32fac33 chore: use eslint rule to avoid relative imports through parent [WEB-1496] (#7637)
- dfd5475 chore: remove unused parseFloat for decoding string metric values (#7641)
- e472ea7 fix: alphabetical binding workspaces and search copy change [WEB-1552, WEB-1553] (#7633)
- 7b96933 fix: properly clear out the settings from the database [WEB-1559] (#7636)
- e9e66b1 fix: fix incorrect return type for downsampled metrics (#7618)
- 930fc9d feat: custom metric groups (formally known as types) [WEB-1469] (#7570)
- 75e93d9 docs: bump rstfmt version (#7611)
- 34c5b5a fix: trigger jobs fetchAll on pagination changes [WEB-1546] (#7602)
- 29e63af Remove say workaround and update version (#7628)
- 0a24176 chore: fix pod-spec merge logic (#7574)
- ce3136a feat: Don't show charts where all series are Loaded(no data) [WEB-1524] (#7609)
- d3c027b feat: OptionsMenu moved to left group (#7623)
- 4d002c4 docs: Add article on how to view epoch metrics (#7504)
- 4327a25 fix: rp binding resolving resource pools (#7629)
- f79e95e ci: fix release branch selection when cherry-picking EE PRs (#7630)
- d6a5c79 fix: Handle metric names finish loading, but still empty (#7634)
- 2844566 chore: support mobile view in UIKit [WEB-1314] (#7626)
- 3ec9c49 fix: button filter text (#7632)
- 0d293b5 fix: ChartGroup vertical spacing (#7631)
- 6fd4b21 feat: replace custom `isEqual` to lodash `isEqual` (#7625)
- fa91629 feat: add searcher metric sorting (#7614)
- 5ff0b7d fix: avoid converting workspace name to sentence casing [WEB-1548] (#7622)
- 37259d4 fix: treat searcher metrics value as a number in the ui (#7612)
- bbe70e0 feat: Resource pool tab for workspace (#7582)
- 369ddf3 feat: Copy cell value from experiment list table (#7604)
- 3cb434d fix(actors): trial lifetime must contain allocation lifetime, still (#7615)
- 033a9f6 fix: Single-point tooltip closes when mouse exits chart [WEB-1541] (#7595)
- b8f95ad docs: Add css rule to turn off scrolling when clicking on section links (#7610)
- 9994aa3 fix: code editor height issues (#7573)
- acdd6c4 docs: improve a release note (#7601)
- 37cc9f0 chore: add agent --image-root (#7597)
- 69ab985 refactor: trial's can have one or many tasks [DET-9647] (#7355)
- 7c076b2 ci: fix remote name in PR tracking script (#7607)
- b2ee9b3 fix(actors): create valid fake group actor for checkpoint GC, don't leak it (#7606)
- 776be10 fix: fix checkpoint gc which was incorrectly deleting some checkpoints (#7523)
0.24.0
Release Notes
Changelog
- 620162e chore: bump version: 0.24.0-rc5 -> 0.24.0
- 809eda9 docs: add release notes for 0.24.0 (#7680)
- 399ad9c chore: bump version: 0.24.0-rc4 -> 0.24.0-rc5
- b1b2bf2 fix: allow checkpoints to be GCed without validation metrics and add tests (#7653)
- 378fe8d chore: bump version: 0.24.0-rc3 -> 0.24.0-rc4
- c1ffb75 docs: fix epoch metrics article (#7643)
- abf253b make error message more accurate (#7659)
- d57c5ec fix: default pools editable and submittable (#7647)
- b7aa1f4 chore: bump version: 0.24.0-rc2 -> 0.24.0-rc3
- 9095ee8 fix: properly clear out the settings from the database [WEB-1559] (#7636)
- 328ff9a docs: Add article on how to view epoch metrics (#7504)
- 39fa1db fix lint from #7629
- 24b89f7 fix lint from #7634
- 8806cd5 fix: Handle metric names finish loading, but still empty (#7634)
- a104da8 Revert "fix: Handle metric names finish loading, but still empty (#7634)"
- ed5e38c fix: rp binding resolving resource pools (#7629)
- aafcff8 fix: Handle metric names finish loading, but still empty (#7634)
- 6846471 fix: treat searcher metrics value as a number in the ui (#7612)
- 7aea1b4 fix(actors): trial lifetime must contain allocation lifetime, still (#7615)
- 524cc58 chore: bump version: 0.24.0-rc1 -> 0.24.0-rc2
- 1f0b33e fix: fix checkpoint gc which was incorrectly deleting some checkpoints (#7523)
- 57d7bc4 fix(actors): create valid fake group actor for checkpoint GC, don't leak it (#7606)
- 3a20df1 chore: bump version: 0.24.0-rc0 -> 0.24.0-rc1
- d17dbfd chore: bump version: 0.24.0-dev0 -> 0.24.0-rc0
- d7b2f5c chore: lock published urls to preserve redirects
- 4931751 chore: bump version: 0.23.5-dev0 -> 0.24.0-dev0
- bfd964c fix: proto user always has false remote (#7126)
- 5bc1666 chore: fix incorrect make invocation (#7605)
- e960b57 Update README.md (#7587)
- c957354 chore: bumpenvs 24.0 (#7594)
- 3c44a2f fix: ensure selected file path matches loaded file in codeeditor (#7563)
- 84a5800 fix: fetch the latest projects (#7598)
- 50fc3e7 chore: update comment to clarify endpoint behavior (#7583)
- 38a49cf docs: update doc string related to adding enable_tensorboard_logging flag (#7600)
- 549273c feat: Enable disabling Tensorboard logging [MLG-22] (#7508)
- 6d6d5ad chore: Remove ptl adapter (#7591)
- 306e0df fix: remove flaky component if new xp list is active. (#7590)
- db7ae17 ci: Add pytorch2 tests (#7581)
- 62cc08c feat: Move new charts into always-on Metrics tab [WEB-1522] [WEB-1523] (#7542)
- 5cf569d docs: fix lint for docs/architecture/introduction.rst (#7586)
- 7c6aead chore: migrate to singularity --nvccli (#7576)
- 923a742 chore: Make explist_v2 generally available (#7561)
- efc4458 Added Profiling to the Benefits table in the intro (#7580)
- 4ad4a76 fix: quick disambiguation on exp. checkpoint size (#7579)
- d40ae77 chore: fix rph docs url publishing step (#7487)
- 5f21741 docs: edit release notes readme (#7578)
- cd19761 fix: Button icon spacing (#7568)
- 22ca050 feat: get unbound pools endpoint [DET-9696] (#7527)
- d05e530 test: unpinning responses (#7551)
- d63d903 chore: Parse and format APIExceptions (#7531)
- ada554f docs: edit some release notes (#7540)
- 9d44ffc docs: add sphinx-tabs extension (#7577)
- 3623a23 docs: Edit the release notes readme (#7572)
- 04fb7ef chore: summarize state of mock_client_test.go [DET-9731] (#7571)
- 708db2d fix: Accomodate partial experiment list settings (#7564)
- 345da9c chore: add linter for google style guide-compliant python imports. (#7550)
- 2969f53 chore(rm): refactor cluster management APIs into RM (#7569)
- f10b474 chore: update type to handle generic summary metric types [WEB-1538] (#7567)
- 577cebe chore: Add web as codeowner for /webui (#7565)
- 56a59c0 fix: hide `cluster logs` in mobile view (#7557)
- 4bcb04b fix: log level filter broken on det t/e logs [MLG-798] (#7558)
- 88338c1 fix: overwrite bindings not working for zero length list (#7549)
- 3258b92 fix: correct name for external jobs (#7566)
- fd7edc8 fix: scrolling when dragging column headers in Glide Table (#7548)
- fbcc4b5 fix: Dont seek min and max on projects with 0 experiments (#7560)
- 6f448b4 feat: Replace sum and count training metrics with mean in new experiment list (#7493)
- de9e4f5 chore(actors): refactor checkpoint GC tasks (#7435)
- da4696b chore: master linting less verbose (#7553)
- 99279ad feat: Provide an interface to enable resource managers to show External jobs on the resource pool queue. (#7070)
- 42feb36 docs: apply minor tweaks (#7554)
- 3534b49 docs: correct some `det deploy gcp` docs facts. (#7556)
- f42cdcc chore: add copy to manage bindings modal (#7555)
- 88a93b0 chore: update default exp list columns [WEB-1488] (#7534)
- cc8cb63 fix: Groups modal does not include inactive users [WEB-1256] (#7528)
- b693ddd build: convert svgs to react by default (#7541)
- 01bdf93 feat: Heatmap support for glide table (#7267)
- 277e124 chore: update fmt-sql config and version (#7544)
- db8e965 ci: setup CODEOWNERS for ml-sys team. (#7546)
- 318ffda chore: Add new `update` function to userSettings store (#7469)
- bde72d6 chore: use Dropdown for Experiment List menus (#7522)
- fcbc80e refactor: add Spinner to UI Kit [WEB-1451] (#7498)
- 03f503c ci: add a sql fromatter (#7538)
- ec32f2f ci: handle duplicate cherry-picks of a PR to release branch (#7502)
- 3db61f5 fix: compare charts showing no data while loading [WEB-1485] (#7516)
- 43ff696 fix: Improve HPC error shutdown to improve logging [FE-44] (#7488)
- a69ec27 chore: temporarily rollback agent usage of --nvccli (#7533)
- 90e131a fix: fix default pools and refactor (#7535)
- e86d47a fix: use proper experiment project resolution (#7532)
- c7ac817 fix: disable Manage Bindings option for default pools [WEB-1521] (#7519)
- 5f0ff57 fix: custom proxies do not work for trials in slurm [DET-9718] (#7529)
- 4722ee9 fix: filter scrollbar adjustment (#7530)
- 9867d4d fix: Selected experiments in glide table persist [WEB-1366] (#7289)
- 39013e9 chore: add option to syntax highlight cli json output (#7471)
- e75c7f4 chore: rename db metric references to custom_type (#7473)
- fa626b0 chore(actors): refactor allocation actor without actors (#7391)
- 303f0d2 chore: bump version: 0.23.4-dev0 -> 0.23.5-dev0
- 630f721 docs: add release notes for 0.23.4 (#7524)
- 2dd26d5 fix: Don't return a workload for deleted checkpoints [WEB-1505] (#7491)
- e13190a fix: Reset cluster jobs pagination when offset is out of bounds (#7521)
- ab21b6e chore: rollback torch 1.7 support removal. (#7525)
- 56d444e chore: update vite (#7505)
- ec0e750 style: update copy and add dividers to exp list table action dropdown [WEB-1490] (#7506)
- cddaf4f fix: rename checkpoints (#7513)
- 7b6f8e1 chore: remove double newline in cli error messages (#7472)
- 44f03fd fix: det tunnel should work with proxy port exposed [FE-121] (#7492)
- 28d18d9 fix: unbumpenvs. (#7496)
- d868241 feat: Pytorch2 necessary changes (#7515)
- 54c5a7d style: update experiment selection label [WEB-1510] (#7510)
- 0c4a19f fix: rp-workspace mapping RP not found [WEB-1508] (#7514)
- 9deeb48 fix: support `searcherMetric` (#7511)
- 525ebfe chore: consolidate k8s informers code & fix Makefile mocks (#7455)
- 5ef7d30 ci: put all GHA jobs for release tracking in a concurrency group (#7507)
- 47ac573 fix: Metrics with dot in name appear correctly in trial view [DET-9691] (#7450)
- 856a963 fix: k8s determined-container gets wrong RunAsUser (#7503)
- 18821cd ci: avoid latest responses==0.23.2 (#7501)
- c02fce2 docs: rp workspace mapping release notes (#7499)
- 33ab05a fix: typo that allows binding default aux pool (#7500)
- 4fde629 fix: avoid crashing the new exp page (#7489)
- 28953dc docs: FE-120: Add `job_history_enable = True` to PBS installation requirements (#7480)
- fcfcbe6 docs: Add RP to Workspaces user guide (#7326)
- 707221c docs: Describe WebUI settings (#7478)
- 8366613 chore: add migration number validation to migration util (#7347)
- 57c3e8c test: skip `test_efficientdet_coco_pytorch_const`. (#7494)
- 8471017 test: fix rbac test failures (#7476)
- aac4272 fix: stop showing invalid loading (#7470)
- 9327830 fix: ChartGrid styling (#7485)
- 7f35b7d fix: rp workspace mapping not working (#7490)
- d29d49d feat: backend support for inference metric tracking part 1 (#7375)
- db01c03 fix: chart tooltip overflow (#7484)
- 99ac604 fix: experiment list compare panel resize (#7477)
- b11d1cb fix: only let cluster admins manage resource pool bindings [WEB-1476] (#7483)
- 6f147fd docs: Add torch batch process example (#7482)
- b71e79a fix: rename `checkoutCount` to `checkpoints` (#7481)
- ebac1d3 ci: Conda bump (#7479)
- 1e021bc ci(aws): fix RDS connections (#7475)
- 19945b6 ci: correctly change item status in release tracking (#7460)
0.23.4
Release Notes
Changelog
- f5484dd chore: bump version: 0.23.4-rc4 -> 0.23.4
- 9c8ea0c docs: add release notes for 0.23.4 (#7524)
- 91e6b82 chore: bump version: 0.23.4-rc3 -> 0.23.4-rc4
- 8afa912 fix: rename checkpoints (#7513)
- d8d671b chore: bump version: 0.23.4-rc2 -> 0.23.4-rc3
- 6b26541 fix: unbumpenvs. (#7496)
- fdc9acf fix: rp-workspace mapping RP not found [WEB-1508] (#7514)
- 99f9cc3 fix: support `searcherMetric` (#7511)
- 65f76b2 chore: bump version: 0.23.4-rc1 -> 0.23.4-rc2
- dc4472e fix: k8s determined-container gets wrong RunAsUser (#7503)
- 54b6602 ci: avoid latest responses==0.23.2 (#7501)
- ecbfe61 docs: rp workspace mapping release notes (#7499)
- 3131f43 fix: typo that allows binding default aux pool (#7500)
- 593ace0 fix: avoid crashing the new exp page (#7489)
- 557a272 docs: Add RP to Workspaces user guide (#7326)
- 7fcff5a test: skip `test_efficientdet_coco_pytorch_const`. (#7494)
- 8648e05 test: fix rbac test failures (#7476)
- 17ac0d6 chore: bump version: 0.23.4-rc0 -> 0.23.4-rc1
- 8db5076 fix: rp workspace mapping not working (#7490)
- ecaad56 fix: rename `checkoutCount` to `checkpoints` (#7481)
- ed19b69 ci: Conda bump (#7479)
- 66f67f9 ci(aws): fix RDS connections (#7475)
- 2d64e16 chore: bump version: 0.23.4-dev0 -> 0.23.4-rc0
- 4dda861 feat: RP<>workspace mapping (#7461)
- 9bffd21 chore: add a toy experiment example for pushing generic metrics (#7442)
- 7e7ac8b fix: use friendly names for user settings (#7465)
- fda7235 chore: Turn new experiment listing off by default (#7466)
- e6cabd6 docs: rephrase the home page title (#7463)
- 4e9748f feat: Allow user setting of feature flags (#7438)
- 307e2ee chore: avoid postgres uuid extension (#7459)
- a66db23 chore: explist_v2 feature switch should default to on (#7249)
- 3ba9904 style: mobile version of exp list v2 [WEB-1422] (#7433)
- 6fe9218 test: mark RBAC-related tests 'e2e_cpu_rbac' (#7401)
- aafac46 fix: chart axis long label (#7451)
- 9e37ed4 docs: Reset home page tiles and sidebar (#7446)
- 8dc0cfd fix: wider width for top trials select (#7431)
- b67d981 ci: fix latest release branch selection (#7432)
- c915f84 chore: podspec-capability-bugfix (#7447)
- 2cc2656 feat: Pin column when compare panel open (#7414)
- 04485b2 chore: update docs hpe compliance web section id (#7449)
- 78d8baa feat: minor edit to torch batch process embedding example (#7440)
- 2ff3d2e chore: allow no pinned columns [WEB-1423] (#7419)
- 210c446 feat: sort resource pools by name when creating new jupyterlab sessions [WEB-1138] (#7444)
- 2e776a4 fix: chart tooltip has exp. name (#7410)
- bd470b6 feat: Merge data and row height menus in explist v2 (#7405)
- 9caaae8 docs: update aria labelling for sidebar toggles and toctree groups (#7381)
- 9de58bd fix: update get agents call for agent disable all (#7441)
- 2d3120d feat: master audit logging should log failed requests at `Info` level. (#7434)
- 34347a1 feat: Compare icon reflects status (#7443)
- c409c31 style: apply hover color for selected table rows (#7429)
- 639e34a fix(cli): compound keys in --config opts for commands (#7439)
- 70e65df feat: allow-pachyderm-notebook-extension [DET-9355] (#7395)
- bfdccdf docs: Add documentation about passing in optional tensorboard arguments [MLG-333] (#7352)
- 07e12f7 fix: table column copy change (#7373)
- 8300458 chore: events & preemption listener actor refactor [DET-9617] (#7256)
- 497df17 feat: [MLG-647] Add batch inference examples using Core API and torch_batch_process (#7274)
- 119f577 chore: fail with non-zero exit code when password change fails (#7403)
- 06c7d0b feat: edit and reset raw user settings [WEB-1361] (#7377)
- f98d15a fix: perf improvement in `useTrialMetrics` (#7426)
- 2ce5aaf fix: clear table selection when changing pages (#7418)
- 533f3a7 chore: migrate to singularity --nvccli [DET-9081] (#7337)
- 1002645 ci: fix e2e tests for tf keras cifar10 example (#7422)
- 71243e3 chore: agent provisioner refactor (#7287)
- d82f35f fix: Correctly sequence updates in useSettings bridge code (#7423)
- 0fc6238 feat: Add Bert embedding torch_batch_process example (#7402)
- 22f82a1 feat: obfuscates slot id in agent summary(#7421)
- 0855b78 feat: Settings drawer (#7356)
- 6da387b fix: fix: label default compute and default aux pools differently (FE-108) (#7420)
- e2112c9 chore: k8s pod & node informer refactor (#7182)
- c20ad5a fix: show loading state when applicable for comparison view tabs (#7370)
- 30a20ba fix: All metrics viewable in the glide table compare tab [WEB-1372] [WEB-1394] (#7412)
- 8ceb955 docs: remove RBAC limitations section (#7415)
- 403754b ci: warn about PRs that are from forks but by users with write access (#7380)
- ee68c3d perf: migrate checkpoint v1 into v2 table (#7325)
- 3e9eaed chore: skip process auth admin check if rbac is enable (#7400)
- af9d562 ci: make CASPER_TOKEN optional in `track-pr` script (#7379)
- c510035 chore: bump version: 0.23.3-dev0 -> 0.23.4-dev0
- 37e6d9c docs: add release notes for 0.23.3 (#7413)
- b795a23 fix: custom column resize [WEB-1341] (#7365)
- 227a684 fix: replace div with table in Trial Comparison View (#7341)
- 59f34bd chore: update data loading in Keras CIFAR10 example (#7378)
- 9464a1f chore: change compare tab refresh behavior [WEB-1434] (#7397)
- f7a6ba4 chore: add release notes for notebook tls (#7408)
- 6ef9862 docs: hpe docs compliance for section 5 [WEB-1292] (#7383)
- 3718a4f chore: update left over api references to metric type (#7399)
- 4fdf5ae fix: userSettings store should function correctly when useSettings is active (#7409)
- 5ae6311 chore: Workspace as an object in the SDK. (#7387)
- 638f56e chore: add generic metrics support to harness (#7407)
- 60987a5 fix: refactor checkpoint modal so it closes correctly [WEB-1441] (#7398)
- fccd378 ci: Fix GKE cluster version used for CircleCI (#7385)
- 28f2c03 fix: add fp16 flags to hf ds examples (#7265)
- 2ac22f0 fix: install sigusr1 on main thread only. (#7350)
- 03c3faf fix: memoize settings in model detail and experiment detail pages (#7390)
- 056d414 chore: Clean up redundant aria-labels, correct wrong aria-labels [WEB-1379] (#7384)
- 1f5b9a0 fix: rbac filter on columns API (#7386)
- a0a87cd feat: add update master config API end point with RBACK and CLI command (#7318)
- 839dcbd chore: bump up flake8 to 3.9.2 (#7389)
- 7eaea60 chore: rename metric type concept to metric group (#7353)
- dd15be9 fix: add condition by default in filter (#7362)
- fe85366 fix: new exp detail crash [WEB-1444] (#7382)
- 4e80c72 fix: Metrics reporting no data (#7368)
- b9d820e fix: make `checkpoint_count` explicit (#7374)
- 2d66b53 ci: fix some bugs in tracking of cherry-picked PRs (#7367)
- 13a6ad8 chore: clean up comments in trial and metric code (#7376)
- 5aed4c5 chore: update make_url utility with fallback (#7259)
- 03bb110 chore: use det custom json encoder in print_json (#7371)
- 4f92cab fix: use DEFAULT_COLUMNS in GroupManagement table (#7366)
- bea92ef feat: add training metrics to columns api (#7320)
- b2fadf1 fix: Trials use their own trial state enum [DET-9639] (#7354)
- 90257d6 feat: allow merging metrics with same batch number (#7304)
- 5f18070 ci(circle/test-unit): regroup gpu unit tests (#7345)
- c2a8113 fix: handle long values (#7349)
- 4c41432 docs: add HPE marketing analytics code (#7269)
- 223f28d fix: TypeError 'NoneType' object processing exp config (#7290)
- 43abf42 fix: Fix issue in user settings store (#7363)
- 1640d7a fix: reverting not-found-errs change in python user-groups (#7361)
- 6e0c2bf chore: bumpenvs (#7348)
0.23.3
Release Notes
Changelog
- f8a9d45 chore: bump version: 0.23.3-rc2 -> 0.23.3
- e713e3b docs: add release notes for 0.23.3 (#7413)
- e3fbe53 chore: add release notes for notebook tls (#7408)
- f102fef fix: refactor checkpoint modal so it closes correctly [WEB-1441] (#7398)
- 1f502eb fix: install sigusr1 on main thread only. (#7350)
- c2ed933 chore: bump version: 0.23.3-rc1 -> 0.23.3-rc2
- 7b3f913 fix: memoize settings in model detail and experiment detail pages (#7390)
- 289c2a8 fix: rbac filter on columns API (#7386)
- af821f4 fix: make `checkpoint_count` explicit (#7374)
- bc8147c chore: bump version: 0.23.3-rc0 -> 0.23.3-rc1
- 7d73601 fix: use DEFAULT_COLUMNS in GroupManagement table (#7366)
- 6bbeb9c fix: Fix issue in user settings store (#7363)
- aa25670 fix: reverting not-found-errs change in python user-groups (#7361)
- 4749d06 chore: bump version: 0.23.3-dev0 -> 0.23.3-rc0
- 275da69 chore: lock published urls to preserve redirects
- 4965100 ci: try rebasing main before checking PR (INFENG-178) (#7357)
- 54f5bcf fix: Cleared date field removed from filters [WEB-1408] (#7359)
- 0266d5d fix: SDK methods don't update local on failure (#7270)
- 2f9d924 feat: add tls to notebook connections [DET-9002] (#6735)
- 6afe3e9 docs: Batch Processing API doc (#7273)
- 240e41d ci: tweak PR title matching in release tracking (#7251)
- e0bec3f ci: track unreleased cherry-picked PRs during release (#7324)
- 87fab84 chore: correct order of err check for master config (#7351)
- 881a6e8 chore: Remove/archive trial collection code [WEB-938] [WEB-1289] (#7297)
- 34bdc97 chore: adjusted cli to not nest master config under config (#7344)
- badb925 feat: Add the InlineForm component (#7199)
- 0a93bb2 refactor: remove unused `Trial.JobID` property. (#7314)
- ecf27f1 feat: create bindings tab in resource pool detail page (#7339)
- b19c295 feat: added rbac to grpc master config (#7303)
- ad5ddc7 chore: removed echo-backed /config and updated cli [DET-9580] (#7307)
- cede2e9 docs: fix version switcher (#7327)
- acec97e docs/Edit homepage title (#7342)
- 67c0977 fix: ensure design kit landing page is typechecked (#7336)
- 211d8b7 chore: fix ray proxy pydantic testing issue (#7340)
- 1e6377d chore: fix flaky TestReceiveContainerLog (#7338)
- bf1a9e7 chore: Add proper Json type to the code (#7335)
- af59d02 docs: correct release date (#7333)
- 9ff4ef7 chore: temporarily disable python-coverage on test-gpu (#7331)
- 49bf9a4 fix: Download to existing directory from shared_fs source [MLG-672] (#7328)
- d614f6b fix: harmonize python & golang GET NotFound errs (#7252)
- 070865a fix: Fix type mismatch on design page (#7334)
- 497b72e chore: Add new UserSettings store [WEB-1353] (#7322)
- 0991540 chore: fix golangci-lint pre-commit (#7330)
- 6f6f568 chore: golangci check the package instead (#7226)
- b518504 fix: Table header context menu enabled [WEB-1382] (#7321)
- 13d2a0b feat: manage resource pool bindings (#7300)
- 6bcf997 fix: navbarCollapsed shortcut key (#7323)
- ccb3745 feat: launch.deepspeed passes (almost) all envvars (#7295)
- ff3cc84 chore: validate ROCM support (#7313)
- a234895 chore: Remove TimeSeries log scale, SummarizeTrial [WEB-1406] (#7278)
- f9ec91f chore: prepare for strict rbac jq control (#7305)
- 3d74226 chore: bump version: 0.23.2-dev0 -> 0.23.3-dev0
- f195adb docs: add release notes for 0.23.2 (#7285)
- 65d5918 fix: set user in model CLI client [DET-9612] (#7220)
- ecf7d36 test: pin pydantic in e2e_tests (#7302)
- 65bcf19 docs: Rename image assets (#7283)
- 9f441a4 feat: list resource pool binding (#7286)
- f659c71 fix: Link omnibar shortcut to new Space convention [WEB-1418] (#7293)
- 2f2864a chore: update pillow version for test requirements (#7291)
- 2329801 chore: GCP operation tracker refactor (#7262)
- 4e512fc docs: Provide additional Slurm configuration guidance [FE-88] (#7288)
- 6bbe184 feat: keyboard shortcut for JupyterLabModal (#7184)
- 4cd31ab chore: use settings value for omnibar shortcut (#7198)
- 4d7c849 docs: Make correction on workspaces page (#7254)
- a782ee5 chore: remove admin flag check in delete_model & delete_model_version [DET-9596] (#7175)
- 5dcfe20 fix: ManageGroupsModal on UserManagement page (#7268)
- e1fdb26 feat: Add keyboard shortcut input to UI Kit [WEB-1362] (#7192)
- 7fbb363 feat: implement web resource pool binding api [WEB-1402] (#7272)
- 3323a4b test: Pin Pydantic (#7282)
- 5fa9d85 fix: adding credential check to gcp list clusters function (#7258)
- a22af67 fix: hp seach launch issue (#7271)
- 227ead4 fix: no data to plot is shown sometimes incorrectly [WEB-1413] (#7279)
- 6b7f7f2 fix: LogViewer request handling to avoid filter mismatch [WEB-1412] (#7277)
- 5bcc2b3 fix: summary metrics decode (#7253)
- c010993 chore: revert make_url changes (#7261)
- 29c0ec5 fix: experiment list comparison view (#7250)
- c72ec84 fix: Race condition loading parsed query settings into all settings [WEB-1376] (#7170)
- 761245b ci: handle cherry-picking EE PRs into release branch (#7266)
- a37ef89 feat: rbac for templates supporting changes (#7224)
- 00a7749 fix: refetch groups on settings change (#7263)
- e6502e2 fix: use user id for new explist user filter (#7248)
- 10c7def feat: Batch Inference (Processing) API (#6807)
- 49435a8 chore(actors): refactor allgather (#7195)
- 9fc8ae9 feat: add num of experiments in quick search modal (#7223)
- 74a8122 Docs: Spell out acronyms in a release note (#7247)
0.23.2
Release Notes
Changelog
- 70503d9 chore: bump version: 0.23.2-rc3 -> 0.23.2
- 5301a55 chore: bump version: 0.23.2-rc2 -> 0.23.2-rc3
- d90e364 docs: add release notes for 0.23.2 (#7285)
- 9913b5b chore: update pillow version for test requirements (#7291)
- fdcca52 test: pin pydantic in e2e_tests (#7302)
- 680c13b chore: bump version: 0.23.2-rc1 -> 0.23.2-rc2
- e447321 fix: ManageGroupsModal on UserManagement page (#7268)
- 8a3d4fd test: Pin Pydantic (#7282)
- 964e765 fix: adding credential check to gcp list clusters function (#7258)
- 4f2e489 fix: hp seach launch issue (#7271)
- e557208 fix: no data to plot is shown sometimes incorrectly [WEB-1413] (#7279)
- 2349c89 fix: LogViewer request handling to avoid filter mismatch [WEB-1412] (#7277)
- 48b8208 fix: summary metrics decode (#7253)
- 0739a28 chore: revert make_url changes (#7261)
- 8bfe93e fix: experiment list comparison view (#7250)
- fbadc8b chore: bump version: 0.23.2-rc0 -> 0.23.2-rc1
- da0464c fix: refetch groups on settings change (#7263)
- 1f9673d fix: use user id for new explist user filter (#7248)
- 7323e2b chore: bump version: 0.23.2-dev0 -> 0.23.2-rc0
- 67a1ea1 chore: lock published urls to preserve redirects
- 593b8ad chore: lock api state for backward compatibility check
- 6357915 fix: sort of `GetWorkspaceProjects` API (#7214)
- 5a0beab fix: text overflow in code view (#7205)
- b035cca fix: Windows file permissions (#7215)
- 46d18e7 test: Add a unit test for direct downloading from S3. (#7174)
- 8167314 tara: Add the approved dtrain diagram (#7218)
- e1c141c chore: indirect job package imports to avoid cycle in EE (#7219)
- 870b678 fix: add missing searcher types (#7212)
- a6ea8bc chore: account for paths passed in for DET_MASTER (#7097)
- 092634f chore: add NewProxyHandler test to proxy_intg_test [DET-9555] (#7196)
- 8552a26 fix: replace dsat cli underscores with dashes (#7187)
- d5552ab feat: add trust_remote_code to hf_trainer_api examples (#7209)
- a830e51 chore(actors): remove AllocationRef's from tasklist.TaskList (#7208)
- 4dccb5a feat: add glide table filter field add shortcut (#7127)
- 53ad788 fix: pinned column resizing in compare view [WEB-1395] (#7193)
- 104d24f fix: revert #7100 (master returns rbacEnabled = false...) #7216
- cd5d0e6 fix: delete templates on workspace delete (#7204)
- 1d564a3 test: sort metrics by type in multiTrialSample (#7210)
- 33d43ed ci: notify via Slack on cherry-pick conflict during release (#7211)
- 4e8ae39 chore(go): add a generic queue as a drop-in actor inbox replacement (#6962)
- 76dbfec ci(circle/test-unit): split off gpu unit tests (#7207)
- fd48c32 feat: add new SVG icons to UI kit (#7142)
- a1d0776 fix: TLS enabled leads to zombie tasksd (#7197)
- cc8351f fix: Modify the Determined master dialer to honor the proxy environment variables [FE-69] (#7203)
- f623404 chore: request queue refactor (#7006)
- a12d20b chore: refactor job as global (#7178)
- f548a4e fix: update `hermes-parallel-coordinates` (#7190)
- 3e76f63 fix: master returns rbacEnabled = false; change filter for rbac tests (#7100)
- 15fe41e docs: Improve the distributed training guide (#7038)
- 0dc8c0a ci: copy PR bodies into release party tracking issues (#7194)
- 1c7e008 feat: add selection menu to glide table (#6808)
- dabe00c chore: Helm file removed static imgs refs, added log config (#7032)
- 942a46f ci: torch.distributed parallel unit tests (#7156)
- 06cc665 fix: Make sure that docs can build properly again [DET-9607] (#7189)
- 0434fa2 chore: bump version: 0.23.1-dev0 -> 0.23.2-dev0
- 5d2621f docs: add release notes for 0.23.1 (#7186)
- c62b865 feat: added new list fuction, new delete subcommand and added the use of default gcs bucket if local tf state if not present in det deploy gcp (#7146)
- 50cafa7 chore: properly handle external logout (#7181)
- 35907a9 docs: Update readme with version switcher info (#7158)
- 61625dd chore(actors): refactor task idle timeout service (#7072)
- d47fadb chore(actors): replace BuildTaskSpec message (#7161)
- f3f7f82 chore: update license for web packages (#7185)
- 147014c ci(circleci): use custom GPU runners for harness GPU unit tests [INFENG-192] (#7120)
- 688d254 fix: Don't draw rows after end of data; compare panel width [WEB-1339] [WEB-1365] (#7180)
- 3d045b5 fix: set protocol scheme in kubernetes resource manager (#7165)
- 90fc15a chore: add go pre-commit checks (#7056)
- deb22ff chore: fix query parameter parsing in tls proxy (#7171)
- 64f4ea8 feat: more welcoming home page in doc (#7169)
- 5b2a256 fix: log viwer height (#7179)
- afe9b9c chore: support registry auth with Singularity (#7177)
- 0c5c1c3 feat: Create keyboard shortcut to toggle sidebar collapse (#7147)
- 0fe8609 fix: charts should show is loading instead of no data [WEB-1367] (#7167)
- c44a61e docs: Add version dropdown to sidebar (#7081)
- f816b45 feat: allow column pinning for glide table (#7093)
- 3870943 chore: default retry in sdk (#7063)
- cf6ee24 feat: Drawer component added to UI Kit [WEB-1349] (#7151)
- 58d0e17 fix: deepspeed autotune user guide clarification (#7140)
- 6b2e1bc chore: backport agent info permission definition. (#7143)
- a944f1a feat: trials comparison table in experiment comparison [WEB-993] (#7111)
- 77181f4 docs: Minor edits to master config reference (#7164)
- a4814ed fix: correct scale positional argument in compareTrials API call (#7168)
- e31f50e chore: sanitize metrictype [DET-9585] (#7155)
- 29dc6d8 ci: cherry-pick all appropriately labeled PRs, regardless of type (#7157)
- 041fb1a chore: deprecation warnings for `determined.common.experimental` imports (#7103)
- ba99a69 fix: correct the fetch page index for paged view (#7153)
- e09ea0c chore: fix a metric name comment. (#7154)
- b5467a1 fix: Display chart if filters.batch is a number (#7152)
- ff39e50 fix: menu overlapping (#7148)
- 467e176 fix: use `MetricBadgeTag` for chart grid title (#7139)
- 3c8a5a4 fix: restore experiments with no provider or capacity (#7113)
- 08fc285 fix: remember experiment list sorts (#7150)
- 2f32351 feat: feature switch can be turned off (#7144)
- a40cdc0 fix: mask checkpoint storage secrets in gc logs (#7135)
- 245ed0f docs: Make minor edits to username release note (#7149)
- d3aff5b docs: Fix typos in Intro to Determined (#7138)
- 260fd7a fix(api): complete trial to task log request mapping (#7022)
- 843b012 feat: add/update read apis for generic metrics (#7065)
- ee1e2b1 fix: show filter count while filter is open (#7145)
- afff53e chore: onboarding doc clarification (#7109)
- 6f98aac chore: sort generated binding parameters based on required status (#7137)
- 8592397 feat: Experiment-search API supports summary metrics from training [WEB-1302] (#7115)
- 1aff89c docs: Add TLS Certs Setup Guide (#7110)
0.23.1
Release Notes
Changelog
- 56e604f chore: bump version: 0.23.1-rc2 -> 0.23.1
- 02d04af docs: add release notes for 0.23.1 (#7186)
- 074daa7 chore: bump version: 0.23.1-rc1 -> 0.23.1-rc2
- f243677 fix: correct the fetch page index for paged view (#7153)
- cae371e fix: remember experiment list sorts (#7150)
- fbbab7b docs: Make minor edits to username release note (#7149)
- 56c5d55 chore: bump version: 0.23.1-rc0 -> 0.23.1-rc1
- fd87099 chore: bump version: 0.23.1-dev0 -> 0.23.1-rc0
- ca23f12 chore: lock published urls to preserve redirects
- d9598e0 chore: lock api state for backward compatibility check
- 609bbb6 feat: rp-workspace mapping proto (#7125)
- 588e640 chore: await_first_trial ensures first trial is returned (#7079)
- eb38615 ci: handle cherry-picking PRs for release (#7035)
- d7e66dc test: New test -- Model.get_versions reads paginated responses. (#7087)
- d99a4d4 chore: change import path (#7134)
- 6f7472c fix: spinner import error (#7132)
- e5ac7e2 fix: Glide Table comparison view no data error; loading saved filters issue (#7122)
- 6ed34e1 fix: allow creating a user with original username of a renamed user (#7121)
- b583b4e fix: persist Experiment List sorts (#7128)
- 95fdc45 chore: conditional debugging output (#7124)
- dfc7d10 chore: make printed py bindings more readable (#7018)
- 3c3112d refactor: remove shared web folder (#7112)
- 80504f1 chore: assume missing experiment project as deleted experiment (#7078)
- f888564 fix: `/experiments-search` endpoint best trial state (#7123)
- f11bcfb chore: deprecate `LightningAdapter`. (#6989)
- eb1868d chore: ensure golang gcp defaults for disk are valid [MLG-425] (#7116)
- 0f1c538 fix: Use correct theme colors for new table (#7114)
- f343651 docs: Fix a typo in a release note (#7107)
- 2f7098b chore: update gpt-neox example (#7106)
- b716404 fix: correct infinite scrolling behavior (#7096)
- 65efd7a fix: tr rendering bug (#7077)
- b607026 docs: provide user docs for manage-enroot-cache (#7108)
- 2121ed3 test: kill child experiments before workspace deletion (#7075)
- a43edc3 feat: Experiment List comparison view should show hyperparameters chart [WEB-991] (#7084)
- d477769 fix: unexpected icon movement in admin page (#7099)
- ed2223a feat: added rbac for agent endpoints [DET-9211] (#6991)
- d8b8f15 chore: add slot type to container client config (#7094)
- 160854d feat: show column sort [WEB-1219] (#7085)
- 97895ce feat: API for project metrics range [WEB-1000] (#7009)
- cc58653 docs: give prominence to Apptainer instead of Singularity (#7071)
- 29e6985 feat: New default glide table columns [WEB-1197] (#7073)
- 774a386 fix: scroll chaining in doc sidebar (#7062)
- ed329d6 fix: Chart regression in detecting single point cluster chart (#7091)
- 90bba17 fix: hide non-static columns when comparison view open (#7014)
- 56f95b1 feat: add row highlight on hover (#7055)
- c069e14 chore: fix allocations swagger generated urls (#7020)
- 0d33d57 fix: disable hightlight after clicking column menu (#7092)
- e249a58 chore: errata from onboarding doc (#7083)
- 0b4b3d0 fix: `det a` under kubernetes should reflect the output of `kubectl get nodes` again [DET-9450] (#6839)
- 4acc527 chore: fix the lint script on package to resolve (#7080)
- 7dddd50 fix: det shell start fails on grenoble with 0.21.2 (reading HTTP_PROXY, but ignoring NO_PROXY) [DET-9364] (#7024)
- c5a339b feat: allow row height setting in glide table (#6952)
- 93095ee fix: prevent re-renders on glide table (#7076)
- e13b20a chore: capabilities options for singularity (#7074)
- 2d943b4 fix: Breadcrumb should not have extra margin (#7069)
- 11013bc fix: show experiment count and change `start time` unit (#7004)
- 5a9fc61 fix: Bottom of code editor, find tool both visible (#7050)
- 4ce93aa feat: add summary metric support for generic metrics (#7012)
- d8ba9cf chore(actors): refactor task preemption (#7045)
- 7cd0694 fix: Button with icon styling (#7047)
- 728078b chore: bump version: 0.23.0-dev0 -> 0.23.1-dev0
- 82ab84e chore: bump version: 0.22.3-dev0 -> 0.23.0-dev0
- 6e23beb docs: add release notes for 0.23.0 (#7058)
- 0d21873 test: Add fixtures for mocking the REST API (#7064)
- 977b387 chore: move inline sql off to master/static/srv (#7061)
- e810990 test: new SDK test wait for experiments that are "PAUSED" (#6987)
- 7957064 chore: bind mount options for singularity (#7054)
- 3fdfcf8 fix: handleError is required in ChartGrid (#7057)
- 379b69e fix: missing prop (#7059)
- c7d6f0c chore: Remove pagination from get_versions. (#7030)
- 4da5441 chore: New chart component on LearningCurve and Profiler (#7052)
- c89a773 chore: support displaying limited jobs in cli and web (#7001)
- 7ba65b3 feat: Add Chart grid to experiment comparison view [WEB-990] (#7005)
- c4a73b1 chore: treat k8s like slurm in cluster info page (#7051)
- da66bca chore: actor refactor pod log (#6941)
- fa18f9b fix: Fixes to UserSettings loading state, experiment lists [WEB-1271] (#7026)
- be207c2 fix: add missing workspace_id column to FilterExperimentsQuery calls (#7049)
- 124794d chore: clean up message handling in main (#7040)
- 3ee0900 fix: change taskType from string to enum [DET-8847] (#6927)
- 18da973 chore: actor refactor proxy (#6944)
- c2c9eee fix: omit projects outside of users permissions [DET-9557] (#7046)
- d7bdaa4 chore: added release note for DET-9035 (#7042)
- 100bda7 fix: load config file into the code editor (#7033)
- 6db91ab chore: enrich agent-generated logs (#7029)
- 08c7210 chore(db): backfill tasks tables to always have entries for trials (#6711)
- ebcc63b fix: add 'check permissions' message to RBAC NotFound errors (#6937)
- f6793dd fix: fix k8s slots count growing for commands [DET-9550] (#7025)
- 2b3c3a8 chore: onboarding doc updates after test-drive (#7028)
- b7d2819 chore: Pass in the allocation ID to releasedResources() (#7034)
- 7e359cf feat: Add pagination to experiment list [WEB-995] (#6971)
- 5a1614d chore: Increase size of Algolia search modal (#7023)
- 9883a67 refactor: make UI kit self-contained (WEB-1243) (#6918)
- f4f9fda fix: ensure contiguous before gathering in stable diffusion example (#7021)
- f4bae0e chore: expose workspace for jobs (#6996)
- 52e095a feat: When modifying glide table filters, keep columns visible [WEB-1232] (#7010)
- e6413e1 chore(actors): replace tasklogger (#6979)
- fe22f81 fix: WorkspaceList tableOffset bug (#7017)
- b9e8b94 chore: upgrade to typescript v5 (#6977)
- ef85382 fix: In new breadcrumbs, uncategorized links to /projects/1 [WEB-1319] (#7013)
- 695df59 fix: fixing error that 'det w delete ' throws when the project(s) in it has no experiments (#6986)
- 67e40c6 ci: fix unit-test-react flake [WEB-1262] (#6995)
- 4a854a9 docs: Improve links to distributed training guide (#6983)
- d822536 chore: improve partial checkpoint CLI in progress message (#7015)
- d8ffd91 chore: ml-sys onboarding exercises (#6516)
- 1793f47 style: remove unnecessary wrapper around directory tree (#7008)
- 4b3c126 fix: seconds come from epoch duration, not timestamp piece (#7007)
0.23.0
Release Notes
Changelog
- ac107b7 chore: bump version: 0.23.0-rc4 -> 0.23.0
- fb0383b docs: add release notes for 0.23.0 (#7058)
- d10e02c chore: bump version: 0.23.0-rc3 -> 0.23.0-rc4
- 83670e0 fix: add missing workspace_id column to FilterExperimentsQuery calls (#7049)
- afb7772 chore: bump version: 0.23.0-rc2 -> 0.23.0-rc3
- d961a39 fix: remove CodeEditor onError prop
- 269685b chore: bump version: 0.23.0-rc1 -> 0.23.0-rc2
- 1dafc8b fix: omit projects outside of users permissions [DET-9557] (#7046)
- 33d6fc7 fix: load config file into the code editor (#7033)
- 7464932 fix: fix k8s slots count growing for commands [DET-9550] (#7025)
- ee20409 chore: bump version: 0.23.0-rc0 -> 0.23.0-rc1
- 125c0b4 fix: WorkspaceList tableOffset bug (#7017)
- 93bf322 fix: In new breadcrumbs, uncategorized links to /projects/1 [WEB-1319] (#7013)
- 6b0f905 chore: bump version: 0.23.0-dev0 -> 0.23.0-rc0
- b99b722 chore: bump version: 0.22.3-dev0 -> 0.23.0-dev0
- 55962b2 chore: lock published urls to preserve redirects
- 1659465 chore: lock api state for backward compatibility check
- 6e1384c fix: add canDoActionOnCheckpointThroughModel to core_checkpoint (#6889)
- f5060e3 feat: DeepSpeed Autotune [MLG-201] (#6924)
- 2e98832 feat: partial checkpoint delete [DET-9491] (#6901)
- 1e0bf34 ci: fix latest-main/preview deployments.
- 847928a chore: remove support for hdfs (#6967)
- 750f82c chore: Consolidate CodeMirror code files [WEB-1306] (#6999)
- 6984945 ci: `det deploy aws` flavor for a normal RDS postgres. (#7000)
- fefca32 refactor: full trial summary metrics recompute take metric type as param (#6997)
- 6545423 feat: add provisioning timeouts (#6447)
- 82a706b chore: adjust end_time threshold for allocation tensorboard test [DET-9542] (#6992)
- 8a8f4cb fix: k8s tasklist forever growing (#6957)
- 6ff5a99 fix: Stabilize return order for experiments (#6968)
- 318adf9 chore: Remove unused ExperimentConfiguration (#6959)
- 38f7273 feat: strategic merge pod spec with task_container_defaults [DET-7227] (#5728)
- ee99f56 fix: sticky nav bar in UIKit page (#6980)
- d65f072 chore: consolidate and simplify isort config files (#6990)
- c0971bb ci: Rng tests refactor (#6946)
- 46379a6 feat: switch from Monaco to CodeMirror (#6926)
- 9ff5e31 feat: add comparison toggle to glide experiment list [WEB-989] (#6976)
- 9ed4888 fix: breadcrumb warnings (#6965)
- cb52a19 chore: install sigusr1 on main thread only [MLG-585] (#6993)
- 5600fe8 chore: allow tokens to be provided via secure cookie instead (#6862)
- 8acc879 docs: update Core API UG w literal includes (#6501)
- daab280 chore: remove old Python packages (#6985)
- c8be6af test: fix `test_experiment_proxy_ray_tunnel`. (#6972)
- 89c535b chore: Sort and filter on description, duration, and searcherType [WEB-1199] (#6913)
- 0dbd6d7 chore: simplify metrics handler syntax (#6969)
- faba534 fix: parameterize agent tmp by agent id [DET-9111] (#6960)
- 3a4a247 test: Removes several multiprocessing executions of test_enqueuer. (#6949)
- 6ebf476 feat: added `gpu_hours` to historical allocation CSV [DET-9506] (#6948)
- ee2b8f0 chore: More use of workspace store [WEB-1240] (#6880)
- 114e425 chore: add generic metrics support part 1 (#6641)
- 9388f1a chore: bump version: 0.22.2-dev0 -> 0.22.3-dev0
- 53abbb7 docs: Add release notes for 0.22.2 (#6954)
- d4445a4 perf: render full experiment list [WEB-1234] (#6947)
- df4d55a fix: only select new pages during selectAll mode (#6955)
- 0ee01af style: update glide table styles to design specs [WEB-1227, WEB-1281] (#6811)
- 5a7a7f8 chore: fix bugs in redirects.py (#6958)
- 77e08fa fix: Test runs dirs (#6916)
- 1f9e6a8 chore: simplify det generated enum's str representation (#6942)
- 1aea75f docs: fix a typo in the proxy ports guide. (#6953)
- 6f57431 docs: remove an obsolete section about `Checkpoint.load_from_path`. (#6909)
- 9edc69e chore: check if a.restored to send task log [DET-9515] (#6956)
- 271013a ci: skip non-fix/feat PRs in release automation (#6939)
- 26372ef docs: Apply consistency to setup guides (#6930)
- 273e145 docs: Add HPC Launcher Security Considerations [FOUNDENG-617] (#6945)
- e232d2f chore: enable flake8 pre-commit check on e2e_tests (#6933)
- 29abca2 chore: Add Breadcrumb to every Page [WEB-335] (#6798)
- 7f5ce01 fix: out of bounds when page is bigger than actual (#6931)
- d4b76d3 fix: Workspace selection on JupyterLab modal [DET-9503] (#6938)
- 47110dd chore: vary generated training and validation metric counts (#6915)
- 88f5a43 feat: add theme (fonts, colors, etc) to design kit (#6847)
- e61264f feat: Hide experiments from archived projects in `GetExperiments` DET-9381 (#6832)
- 7d78dbc chore: remove old events infrastructure from allocation (#6771)
- c03ec45 ci: increase resource classes for some tests (#6932)
- 2cdb946 ci: handle cherry-pick marker label in release automation (#6911)
- 8666415 fix: Experiment search filter API parentheses (#6893)
- eeecef8 refactor: make staticcheck linter use correct go version. (#6914)
- 411d49f feat: add json output format to det :ntsc logs commands (#6887)
- 47ed9de docs: automatically put the current year in the copyright (#6928)
- 1274927 fix: avoid having user edit modal reset to original state every 5~10 seconds [WEB-1273, WEB-1261, WEB-1203] (#6912)
- 653ed42 fix: turn off AutoPause for AWS RDS Aurora. (#6925)
- eda1385 feat: parallel coordinates chart on trial hyperparameters tab [WEB-1196] (#6872)
- de5e845 fix: make AWS template postgres version-agnostic. (#6923)
- c60a424 chore: add isort pre-commit hook (#6897)
- cbc564b fix: bump AWS CFN `SecondsUntilAutoPause` to see if it helps with timeouts. (#6921)
- 628613b fix: replace `crypto.randomUUID` with `uuid` (#6908)
- be9f545 fix: remove oss admin flag from 'det user list' with RBAC (#6920)
- 0b59d79 fix: patch workspacelist context menu (#6919)
- 6b4615b refactor: remove /agents endpoint [DET-9479] (#6907)
- 77b0e94 chore: update polling intervals for stores (#6892)
- 9985f57 chore: rm ntsc events manager, endpoint, and client usage (#6840)
- 482481d fix: icon position (#6904)
- 6f12aca refactor: explicitly support partial settings and compare changes in full settings (as opposed to comparing settings to updates) (#6894)
- 15ad730 fix: Full height on code editor [WEB-1258] (#6906)
- 4538fe7 chore: enable exhaustiveness checks in switch statements (#6903)
- 14ee7d9 feat: new experiment table filter (#6502)
- fda0f1c chore: Remove unused rbac feature flags [WEB-920] (#6845)
- e7e293d chore: update Button component (#6841)
- 36dac4b fix: case-insensitive contains query within experiment (not metrics or hparams) (#6858)
- eb58396 test: add concurrent trial metric updates (#6884)
- bf84d0a ci(gha/link-docs-preview): strip md5sum output (#6881)
- 48648a2 ci: track merged PRs in a GitHub project (#6867)
- 0e39f19 feat: detached mode v0. (#6519)
- fcc4b70 fix: adjust pinned position calculation (#6890)
- bb76e55 docs: Indent content under 4th level headings (#6888)
- 771f664 refactor: bunify a chunk of experiment-related db queries. (#6854)
- 89a5e5e fix: handled when allocation end_time < start_time in updating aggregate-resources (#6878)
- bc76c1a fix: fix an issue with running det :ntsc logs (#6886)
- 2e0e490 fix: trials created by paused experiments should be paused [DET-9493] (#6882)
- 95f27d2 chore: Add ordered map (orderedmapx) to allow for queuing of job cancelations [DET-9465] (#6874)