forked from jpata/particleflow
Merge from jpata/particleflow master #2
Merged
Conversation
This commit also includes:
- A custom TensorBoard callback logging learning rate and momentum
- A `utils.py` file collecting utilities used in more than one file
- A clean-up of how output files are organized
- Configuration files using the OneCycle scheduler
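For reference, the 1cycle policy behind the OneCycle configs can be sketched in plain Python. This is a hypothetical standalone sketch, not the repo's implementation; the warmup fraction and the `div_factor`/`final_div` values are placeholders:

```python
import math

def one_cycle_lr(step, total_steps, lr_max, div_factor=25.0, final_div=1e4,
                 warmup_frac=0.3):
    """Sketch of a 1cycle LR schedule: cosine ramp from lr_max/div_factor
    up to lr_max over the warmup phase, then cosine anneal down to
    lr_max/final_div. All default values are illustrative placeholders."""
    lr_start = lr_max / div_factor
    lr_end = lr_max / final_div
    warmup_steps = int(total_steps * warmup_frac)
    if step < warmup_steps:
        frac = step / max(warmup_steps, 1)
        return lr_start + (lr_max - lr_start) * 0.5 * (1 - math.cos(math.pi * frac))
    frac = (step - warmup_steps) / max(total_steps - warmup_steps, 1)
    return lr_end + (lr_max - lr_end) * 0.5 * (1 + math.cos(math.pi * frac))
```

Plotting this over `total_steps` reproduces the characteristic ramp-up/anneal shape; in a Keras pipeline the function would be wrapped in a `LearningRateSchedule` or applied from a callback.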
`mlpf/pipeline.py` is the beginning of a `click`-based alternative to `mlpf/launcher.py`.
Also add an option to give a prefix to the name of the training directory.
Also add an `lr_schedule` parameter to the configuration files.
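A `click`-based pipeline of this shape might look like the sketch below. Only the `evaluate -t` command and the training-directory prefix option appear in these commits; the `train` command name and the other option names are assumptions:

```python
import click

@click.group()
def main():
    """Hypothetical sketch of a click-based pipeline CLI; command and
    option names beyond 'evaluate -t' and the prefix option are assumptions."""

@main.command()
@click.option("-c", "--config", type=click.Path(), help="path to a YAML config file")
@click.option("-p", "--prefix", default="", help="prefix for the training directory name")
def train(config, prefix):
    click.echo(f"training with config={config}, prefix={prefix!r}")

@main.command()
@click.option("-t", "--train-dir", required=True, help="training directory to evaluate")
def evaluate(train_dir):
    click.echo(f"evaluating {train_dir}")

if __name__ == "__main__":
    main()
```

With this structure, `python mlpf/pipeline.py evaluate -t <train_dir>` dispatches to the `evaluate` subcommand, and new stages can be added as further `@main.command()` functions.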
The previous commit still scaled the LR; this one fixes it.
- Create a get_train_val_datasets() function to get datasets for training
- Move targets_multi_output() from model_setup.py to utils.py for more flexible access (solving an import-loop issue)
The learning rate finder implements a technique to easily estimate a range of learning rates that should perform well for the current model setup. When the model architecture or other hyperparameters change, the learning rate finder can be rerun to find a new suitable LR range. It starts training the model at a very low LR and increases it every batch. The batch loss is plotted against the training step, producing a figure from which a suitable LR range can be read off. This technique was first introduced by Leslie Smith in https://arxiv.org/abs/1506.01186.
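The LR sweep itself is just an exponential ramp between a minimum and a maximum rate; a minimal sketch (the bounds here are placeholder values, not the repo's):

```python
def lr_finder_schedule(step, num_steps, lr_min=1e-7, lr_max=1e-1):
    """Exponentially increase the LR from lr_min to lr_max over num_steps
    batches, so that each decade of LR gets equal coverage in the sweep."""
    frac = step / max(num_steps - 1, 1)
    return lr_min * (lr_max / lr_min) ** frac
```

During the sweep one records `(lr, batch_loss)` pairs; a good LR range is typically read off where the loss decreases steeply, stopping before it diverges.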
When running `python mlpf/pipeline.py evaluate -t <train_dir>` without explicitly specifying which weights to use, the pipeline will load the weights with the smallest loss it can find in `<train_dir>/weights/`.
This can be useful when many large checkpoint files take up too much storage space.
The default parameters for `expdecay` added to the config files in this commit are the same as those used on the jpata/particleflow master branch at the time of writing.
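For context, `expdecay` presumably follows the standard exponential-decay formula (as in `tf.keras.optimizers.schedules.ExponentialDecay`); a standalone sketch, where the parameter values in the test are placeholders rather than the repo's defaults:

```python
def expdecay(step, initial_lr, decay_steps, decay_rate, staircase=False):
    """Standard exponential decay: lr = initial_lr * decay_rate**(step/decay_steps).
    With staircase=True the exponent is floored, so the LR drops in discrete
    steps every decay_steps batches instead of decaying continuously."""
    exponent = step // decay_steps if staircase else step / decay_steps
    return initial_lr * decay_rate ** exponent
```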
Also:
- Add missing parameters to config files.
- Move `make_weights_function` to `utils.py`.
OneCycle LR, LR finder, custom TensorBoard, etc.
erwulff added a commit that referenced this pull request on Sep 2, 2022:
- Initial commit
- add template dataset definitions
- Add initial CMS particle-flow dataset implementation; also changed to a new tensorflow dataset template
- add test scripts
- Run black formatting on python files
- Add instructions to cms_pf, use manual_dir for preprocessing
- fix: ability to choose data directory for the tfrecords files
- feat: Add Delphes dataset
- fix: support loading both .pkl.bz2 and .pkl
- fix: remove extra dimension in cms_pf data items
- fix cms
- fixes for delphes
- ensure dir exists
- separate cms datasets
- clarify manual dir
- cleanup print
- added singleele and singlemu
- update 1.1
- cleanup cms datasets
- update datamodel
- added new datasets
- gen/sim 12_3_0_pre6 generation (#1)
- 1.2 format, ztt dataset
- version 1.3.0 with new gensim truth
- new dataset
- add qcd
- add some asserts
- add new features
- keep PS
- add tau as pf target
- 1.3.1 remove ps and brem (#2)
- fix HF labeling (#3)
- add new high-PU QCD dataset, update energy
- up
- fix
- Add gen jet index (#4)
- first attempt at gen jet clustering
- add other reqs
- revert test
- fix mapping to before masking particles
- fix out-of-index bug
- benchmark training for CMS
- move path
- move path
- remove submodule
- remove
- move
- fix import
- format
- format
- remove some dummy files
- up
- try with masking
- use a different dataset for logging the jet/met distributions
- clean
- added clic ttbar

Co-authored-by: Eric Wulff <eric.g.t.wulff@gmail.com>
Co-authored-by: Eric Wulff <eric.wulff@cern.ch>
Co-authored-by: Javier Duarte <jduarte@ucsd.edu>
erwulff added a commit that referenced this pull request on Sep 22, 2023.