
Merge from jpata/particleflow master #2

Merged 26 commits into erwulff:master on Jul 23, 2021

Conversation

@erwulff (Owner) commented Jul 23, 2021:

No description provided.

erwulff and others added 26 commits June 24, 2021 22:23
This commit also includes:
 - A custom TensorBoard callback logging learning rate & momentum (see the sketch after this list)
 - A utils.py file collecting utilities used in more than one file
 - A clean-up of how output files are organized
 - Configuration files using the OneCycle scheduler
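As a rough illustration of the callback idea, here is a minimal sketch of a TensorBoard callback that also logs the learning rate and momentum. The class name and details are assumptions for illustration, not the actual mlpf implementation.

```python
import tensorflow as tf


class LRMomentumTensorBoard(tf.keras.callbacks.TensorBoard):
    """Hypothetical sketch: a TensorBoard callback that also logs the
    optimizer's current learning rate and momentum at the end of each
    epoch. Not the actual mlpf callback."""

    def on_epoch_end(self, epoch, logs=None):
        logs = dict(logs or {})
        opt = self.model.optimizer
        lr = opt.learning_rate
        # The learning rate may be a schedule object rather than a variable.
        if isinstance(lr, tf.keras.optimizers.schedules.LearningRateSchedule):
            lr = lr(opt.iterations)
        logs["learning_rate"] = float(tf.keras.backend.get_value(lr))
        if hasattr(opt, "momentum"):
            logs["momentum"] = float(tf.keras.backend.get_value(opt.momentum))
        # The parent class writes everything in `logs` to the event files.
        super().on_epoch_end(epoch, logs)
```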
`mlpf/pipeline.py` is the beginning of a `click`-based alternative to
`mlpf/launcher.py`.
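For context, a `click`-based entry point generally has the following shape. The command and option names below are illustrative assumptions, not the real `mlpf/pipeline.py` interface.

```python
import click


@click.group()
def main():
    """Illustrative pipeline CLI skeleton (not the actual mlpf interface)."""


@main.command()
@click.option("-c", "--config", type=click.Path(exists=True), required=True,
              help="Path to a YAML training configuration file.")
@click.option("-p", "--prefix", default="",
              help="Prefix for the training directory name.")
def train(config, prefix):
    """Train a model from a configuration file."""
    click.echo(f"training with config={config}, prefix={prefix!r}")


if __name__ == "__main__":
    main()
```

Subcommands defined this way are invoked as, e.g., `python mlpf/pipeline.py train -c config.yaml`.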
Also add an option to give a prefix to the name of the training
directory.
Also add an lr_schedule parameter to the configuration files.
The previous commit still scaled the LR; this one fixes it.
- create a get_train_val_datasets() function to get datasets for training
  and validation (see the sketch after this list)
- move targets_multi_output() from model_setup.py to utils.py for more
  flexible access (solving an import-loop issue)
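The function name comes from the commit message; the body below is only a guess at the general shape of such a helper, using `tensorflow_datasets` for illustration. The real function likely takes a config object and applies model-specific transforms.

```python
import tensorflow as tf
import tensorflow_datasets as tfds


def get_train_val_datasets(dataset_name, batch_size):
    """Hypothetical sketch of a dataset helper returning batched,
    prefetched training and validation datasets."""
    ds_train = tfds.load(dataset_name, split="train", shuffle_files=True)
    ds_val = tfds.load(dataset_name, split="test")
    ds_train = ds_train.batch(batch_size).prefetch(tf.data.AUTOTUNE)
    ds_val = ds_val.batch(batch_size).prefetch(tf.data.AUTOTUNE)
    return ds_train, ds_val
```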
The learning rate finder implements a technique to easily estimate a
range of learning rates that should perform well given the current
model setup. When the model architecture or other hyperparameters
change, the learning rate finder can be run again to find a new
suitable LR range.

The learning rate finder starts training the model at a very low LR
and increases it every batch. The batch loss is plotted vs. training
steps, producing a figure from which a suitable LR range can be
determined.

This technique was first introduced by Leslie Smith in
https://arxiv.org/abs/1506.01186.
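A minimal version of such an LR-range test can be written as a Keras callback. The sketch below is an illustration of the technique, not the mlpf code; the class name, default bounds, and divergence check are all assumptions, and it assumes the optimizer's learning rate is a plain variable rather than a schedule.

```python
import math

import matplotlib.pyplot as plt
import tensorflow as tf


class LRFinder(tf.keras.callbacks.Callback):
    """Hypothetical sketch of the LR-range test from Smith
    (https://arxiv.org/abs/1506.01186): start at a tiny learning rate and
    multiply it by a fixed factor after every batch, recording the batch
    loss so it can be plotted afterwards."""

    def __init__(self, start_lr=1e-7, end_lr=1e0, num_steps=1000):
        super().__init__()
        self.start_lr = start_lr
        self.factor = (end_lr / start_lr) ** (1.0 / num_steps)
        self.lrs, self.losses = [], []

    def on_train_begin(self, logs=None):
        tf.keras.backend.set_value(self.model.optimizer.learning_rate, self.start_lr)

    def on_train_batch_end(self, batch, logs=None):
        loss = (logs or {}).get("loss")
        if loss is None:
            return
        lr = float(tf.keras.backend.get_value(self.model.optimizer.learning_rate))
        self.lrs.append(lr)
        self.losses.append(loss)
        # Stop early once the loss diverges.
        if math.isnan(loss) or loss > 4 * min(self.losses):
            self.model.stop_training = True
        tf.keras.backend.set_value(self.model.optimizer.learning_rate, lr * self.factor)

    def plot(self, path="lr_finder.png"):
        plt.semilogx(self.lrs, self.losses)
        plt.xlabel("learning rate")
        plt.ylabel("batch loss")
        plt.savefig(path)
```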
When running `python mlpf/pipeline.py evaluate -t <train_dir>` without
explicitly specifying which weights to use, the pipeline loads the
weights with the smallest loss that it can find in <train_dir>/weights/.
This can be useful when many large checkpoint files take up too much storage space.
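A sketch of how such a selection could work, assuming (hypothetically) that the loss value is encoded in each checkpoint filename; the real naming convention in mlpf may differ.

```python
import re
from pathlib import Path


def best_checkpoint(train_dir):
    """Pick the checkpoint with the smallest loss in <train_dir>/weights/,
    assuming filenames like weights-12-3.456789.hdf5 where the last float
    is the loss (a hypothetical convention for this sketch)."""
    def loss_of(path):
        return float(re.findall(r"\d+\.\d+", path.name)[-1])

    checkpoints = Path(train_dir).glob("weights/*.hdf5")
    return min(checkpoints, key=loss_of)
```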
The default parameters for expdecay added to the config files
in this commit are the same as those used on the
jpata/particleflow master branch at the time of writing.
Also:
- Add missing parameters to config files.
- Move make_weights_function to utils.py.
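For reference, an exponential-decay schedule in Keras looks like the following. The numbers are placeholders for illustration, not the actual defaults copied from jpata/particleflow.

```python
import tensorflow as tf

# Placeholder values for illustration only; the actual expdecay defaults
# are those from the jpata/particleflow master branch config files.
lr_schedule = tf.keras.optimizers.schedules.ExponentialDecay(
    initial_learning_rate=1e-3,
    decay_steps=10_000,
    decay_rate=0.96,
    staircase=True,
)
optimizer = tf.keras.optimizers.Adam(learning_rate=lr_schedule)
```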
OneCycle LR, LR finder, custom TensorBoard, etc.
erwulff merged commit 12fa88d into erwulff:master on Jul 23, 2021
erwulff added a commit that referenced this pull request Sep 2, 2022
* Initial commit

* add template dataset definitions

* Add initial CMS particle-flow dataset implementation

Also changed to a new TensorFlow dataset template

* add test scripts

* Run black formatting on python files

* Add instructions to cms_pf, use manual_dir for preprocessing

* fix: ability to choose data directory for the tfrecords files

* feat: Add Delphes dataset

* fix: support loading both .pkl.bz2 and .pkl

* fix: remove extra dimension in cms_pf data items

* fix cms

* fixes for delphes

* ensure dir exists

* separate cms datasets

* clarify manual dir

* cleanup print

* added singleele and singlemu

* update 1.1

* cleanup cms datasets

* update datamodel

* added new datasets

* gen/sim 12_3_0_pre6 generation (#1)

* 1.2 format, ztt dataset

* version 1.3.0 with new gensim truth

* new dataset

* add qcd

* add some asserts

* add new features

* keep PS

* add tau as pf target

* 1.3.1 remove ps and brem (#2)

* fix HF labeling (#3)

* add new high-PU QCD dataset, update energy

* up

* fix

* Add gen jet index (#4)

* first attempt at gen jet clustering

* add other reqs

* revert test

* fix mapping to before masking particles

* fix out-of-index bug

* benchmark training for CMS

* move path

* move path

* remove submodule

* remove

* move

* fix import

* format

* format

* remove some dummy files

* up

* try with masking

* use a different dataset for logging the jet/met distributions

* clean

* added clic ttbar

Co-authored-by: Eric Wulff <eric.g.t.wulff@gmail.com>
Co-authored-by: Eric Wulff <eric.wulff@cern.ch>
Co-authored-by: Javier Duarte <jduarte@ucsd.edu>
erwulff added a commit that referenced this pull request Sep 22, 2023