
Implement HPO for PyTorch pipeline. #246

Merged: 16 commits into jpata:main, Oct 25, 2023
Conversation

@erwulff (Collaborator) commented Oct 20, 2023

Now able to perform hyperparameter searches using grid search or random search, with automatic trial launching and Ray-compatible checkpointing (a sketch of such a setup follows the list below).

Support for the following is left for a future PR:

  • Trial schedulers
  • Advanced Ray Tune search algorithms
  • Ray-compatible logging of all relevant metrics
  • Robust fault tolerance

Addresses issue #251.
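As a rough sketch of such a setup (hypothetical, not the code in this PR; the trainable and its parameter names are invented for illustration), a Ray Tune script combining a grid-searched and a randomly sampled hyperparameter with Ray-compatible checkpointing could look like:

```python
import os
import tempfile

import torch
from ray import train, tune
from ray.train import Checkpoint


def train_fn(config):
    # Hypothetical trainable: a tiny MLP standing in for the real training loop.
    model = torch.nn.Sequential(
        torch.nn.Linear(4, config["width"]),
        torch.nn.ReLU(),
        torch.nn.Linear(config["width"], 2),
    )
    optimizer = torch.optim.Adam(model.parameters(), lr=config["lr"])
    for epoch in range(5):
        x, y = torch.randn(32, 4), torch.randn(32, 2)
        loss = torch.nn.functional.mse_loss(model(x), y)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        # Ray-compatible checkpointing: save state, then report it along with
        # the metrics so Tune can restore the trial later.
        with tempfile.TemporaryDirectory() as tmpdir:
            torch.save(model.state_dict(), os.path.join(tmpdir, "model.pt"))
            train.report(
                {"loss": loss.item(), "epoch": epoch},
                checkpoint=Checkpoint.from_directory(tmpdir),
            )


tuner = tune.Tuner(
    train_fn,
    param_space={
        "lr": tune.loguniform(1e-4, 1e-1),           # sampled by random search
        "width": tune.grid_search([128, 256, 512]),  # swept exhaustively by grid search
    },
    # num_samples repeats the full grid for each fresh random draw of "lr".
    tune_config=tune.TuneConfig(num_samples=4, metric="loss", mode="min"),
)
results = tuner.fit()
print(results.get_best_result().config)
```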

@erwulff erwulff marked this pull request as ready for review October 24, 2023 10:54
@farakiko (Collaborator) commented:
@Eric Wulff looks great! One comment: can you set the default value for num-workers to 0? This is needed for the torch code because the default value None runs into an error, and I think we want the code to run with num-workers 0 by default.
Thanks!

@erwulff (Collaborator, Author) commented Oct 25, 2023

> @Eric Wulff looks great! One comment: can you set the default value for num-workers to 0? This is needed for the torch code because the default value None runs into an error, and I think we want the code to run with num-workers 0 by default.
> Thanks!

I have added num-workers to the config files, so that the default is specified there. My thinking is that any option given on the command line should override the value in the config file. I will change the default config files to have num_workers: 0.
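For context on why None fails: torch.utils.data.DataLoader expects num_workers to be a non-negative int, so a None leaking through from an unset command-line flag raises an error before training starts. A minimal sketch of the override logic described above (the flag and config key names are illustrative, not necessarily the repository's actual ones):

```python
import argparse

import torch
import yaml
from torch.utils.data import DataLoader, TensorDataset

parser = argparse.ArgumentParser()
# default=None means "not given on the command line"; the config file then decides.
parser.add_argument("--num-workers", type=int, default=None)
args = parser.parse_args()

with open("config.yaml") as f:
    config = yaml.safe_load(f)  # the config files now carry num_workers: 0 as the default

# A value given on the command line overrides the config file.
num_workers = args.num_workers if args.num_workers is not None else config["num_workers"]

dataset = TensorDataset(torch.randn(100, 4), torch.randn(100, 2))
# DataLoader requires an int here; passing None raises an error at construction time.
loader = DataLoader(dataset, batch_size=32, num_workers=num_workers)
```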

@jpata jpata merged commit 370f47f into jpata:main Oct 25, 2023
10 checks passed
farakiko added a commit to farakiko/particleflow that referenced this pull request Oct 25, 2023
* wip: implement HPO in pytorch pipeline

* fix: bugs after rebase

* chore: code formatting

* fix: minor bug

* fix: typo

* fix: lr cast to str when read from config

* try reducing --ntrain --ntest in tests

* update distbarrier and fix stale epochs (jpata#249)

* change pytorch CI/CD test to use gravnet model

* feat: implemented HPO using Ray Tune

Now able to perform hyperparameter search using random search with
automatic trial launching and Ray-compatible checkpointing.

Support is still missing for:
- Trial schedulers
- Advanced Ray Tune search algorithms

* fix: flake8 error

* chore: update default config values for pyg

---------

Co-authored-by: Farouk Mokhtar <farouk.mokhtar@gmail.com>
farakiko added a commit to farakiko/particleflow that referenced this pull request Jan 23, 2024, with the same squashed commit message as above.