
Implement HPO for PyTorch pipeline. #246

Merged: 16 commits into jpata:main, Oct 25, 2023
Conversation

@erwulff (Collaborator) commented Oct 20, 2023

Now able to perform hyperparameter searches using grid search or random search, with automatic trial launching and Ray-compatible checkpointing (a sketch of such a setup follows the list below).

Support for the following is left for a future PR:

  • Trial schedulers
  • Advanced Ray Tune search algorithms
  • Ray-compatible logging of all relevant metrics
  • Robust fault tolerance

Addresses issue #251.
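As a rough sketch of such a setup (hypothetical, not the code in this PR; the trainable and its parameter names are invented for illustration), a Ray Tune script combining a grid-searched and a randomly sampled hyperparameter with Ray-compatible checkpointing could look like:

```python
import os
import tempfile

import torch
from ray import train, tune
from ray.train import Checkpoint


def train_fn(config):
    # Hypothetical trainable: a tiny MLP standing in for the real training loop.
    model = torch.nn.Sequential(
        torch.nn.Linear(4, config["width"]),
        torch.nn.ReLU(),
        torch.nn.Linear(config["width"], 2),
    )
    optimizer = torch.optim.Adam(model.parameters(), lr=config["lr"])
    for epoch in range(5):
        x, y = torch.randn(32, 4), torch.randn(32, 2)
        loss = torch.nn.functional.mse_loss(model(x), y)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        # Ray-compatible checkpointing: save state, then report it along with
        # the metrics so Tune can restore the trial later.
        with tempfile.TemporaryDirectory() as tmpdir:
            torch.save(model.state_dict(), os.path.join(tmpdir, "model.pt"))
            train.report(
                {"loss": loss.item(), "epoch": epoch},
                checkpoint=Checkpoint.from_directory(tmpdir),
            )


tuner = tune.Tuner(
    train_fn,
    param_space={
        "lr": tune.loguniform(1e-4, 1e-1),           # sampled by random search
        "width": tune.grid_search([128, 256, 512]),  # swept exhaustively by grid search
    },
    # num_samples repeats the full grid for each fresh random draw of "lr".
    tune_config=tune.TuneConfig(num_samples=4, metric="loss", mode="min"),
)
results = tuner.fit()
print(results.get_best_result().config)
```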

@erwulff erwulff marked this pull request as ready for review October 24, 2023 10:54
@farakiko (Collaborator) commented:
@Eric Wulff looks great! One comment: can you set the default value for num-workers to 0? This is needed for the torch code because the default value None runs into an error, and I think we want the code to run with num-workers 0 by default.
Thanks!

@erwulff (Collaborator, Author) commented Oct 25, 2023

> @Eric Wulff looks great! One comment: can you set the default value for num-workers to 0? This is needed for the torch code because the default value None runs into an error, and I think we want the code to run with num-workers 0 by default.
> Thanks!

I have added num-workers to the config files, so that the default is specified there. My thinking is that any option given on the command line should override the value in the config file. I will change the default config files to have num_workers: 0.
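For context on why None fails: torch.utils.data.DataLoader expects num_workers to be a non-negative int, so a None leaking through from an unset command-line flag raises an error before training starts. A minimal sketch of the override logic described above (the flag and config key names are illustrative, not necessarily the repository's actual ones):

```python
import argparse

import torch
import yaml
from torch.utils.data import DataLoader, TensorDataset

parser = argparse.ArgumentParser()
# default=None means "not given on the command line"; the config file then decides.
parser.add_argument("--num-workers", type=int, default=None)
args = parser.parse_args()

with open("config.yaml") as f:
    config = yaml.safe_load(f)  # the config files now carry num_workers: 0 as the default

# A value given on the command line overrides the config file.
num_workers = args.num_workers if args.num_workers is not None else config["num_workers"]

dataset = TensorDataset(torch.randn(100, 4), torch.randn(100, 2))
# DataLoader requires an int here; passing None raises an error at construction time.
loader = DataLoader(dataset, batch_size=32, num_workers=num_workers)
```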

@jpata jpata merged commit 370f47f into jpata:main Oct 25, 2023
10 checks passed
farakiko added a commit to farakiko/particleflow that referenced this pull request Oct 25, 2023
* wip: implement HPO in pytorch pipeline

* fix: bugs after rebase

* chore: code formatting

* fix: minor bug

* fix: typo

* fix: lr cast to str when read from config

* try reducing --ntrain --ntest in tests

* update distbarrier and fix stale epochs (jpata#249)

* change pytorch CI/CD test to use gravnet model

* feat: implemented HPO using Ray Tune

Now able to perform hyperparameter search using random search with
automatic trial launching and Ray-compatible checkpointing.

Support is still missing for:
- Trial schedulers
- Advanced Ray Tune search algorithms

* fix: flake8 error

* chore: update default config values for pyg

---------

Co-authored-by: Farouk Mokhtar <farouk.mokhtar@gmail.com>
farakiko added a commit to farakiko/particleflow that referenced this pull request Jan 23, 2024, with the same squashed commit message as above.