TF perf tuning, CLIC benchmarks, flatiron scripts #185

Merged: 21 commits, Jul 28, 2023

@erwulff (Collaborator) commented Jun 1, 2023

  • Update and add batch scripts for Flatiron
  • Add best-practice settings for optimal TensorFlow model performance
  • Add prefetch autotuning
  • Add autotuning of parallel data loading calls
  • Update raytune search space
  • Enable benchmarking on CLIC dataset
  • Pin numpy to v1.23.5 in requirements.txt because later versions are incompatible with tf2onnx v1.14.0 (the latest at the time of commit)
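
For readers unfamiliar with the prefetch and parallel-call autotuning mentioned in the list above, here is a minimal sketch of the general tf.data pattern. It is not the code from this PR: the function name `build_input_pipeline`, the toy `preprocess` transform, and the dataset are assumptions made for illustration. Only `tf.data.AUTOTUNE`, `num_parallel_calls`, and `prefetch` are standard TensorFlow API.

```python
import tensorflow as tf


def build_input_pipeline(dataset: tf.data.Dataset, batch_size: int) -> tf.data.Dataset:
    """Hypothetical helper: apply autotuned parallel mapping and prefetching."""

    def preprocess(x):
        # Toy per-element transform standing in for real feature processing.
        return tf.cast(x, tf.float32) / 10.0

    return (
        dataset
        # Let the tf.data runtime choose how many elements to process in parallel.
        .map(preprocess, num_parallel_calls=tf.data.AUTOTUNE)
        .batch(batch_size)
        # Let the runtime choose the prefetch buffer size, so that input
        # preparation can overlap with the training step.
        .prefetch(tf.data.AUTOTUNE)
    )


pipeline = build_input_pipeline(tf.data.Dataset.range(8), batch_size=4)
first_batch = next(iter(pipeline))
```

With `tf.data.AUTOTUNE` the parallelism and buffer sizes are chosen at runtime rather than hard-coded, which tends to matter when the same training script runs on nodes with different CPU and GPU counts.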

@erwulff erwulff marked this pull request as ready for review July 20, 2023 09:00
@erwulff erwulff requested a review from jpata July 20, 2023 09:06
@jpata jpata changed the title from "Dev june23" to "TF perf tuning, CLIC benchmarks, flatiron scripts" Jul 28, 2023
@jpata jpata merged commit afcf0fe into jpata:main Jul 28, 2023
11 checks passed
jpata pushed a commit that referenced this pull request Sep 15, 2023
* fix: error in raytune search space

* enable best practice settings for optimal model performance

* only max out NVIDIA L2 cache if GPUs are found

* enable benchmarking callback for clic dataset schema

* feat: configure tensorboard profiling from config file

* Update eval script on flatiron

* Update training batch script on flatiron

* Update raytune batch script on flatiron

* Add batch scripts for 8 GPU training on flatiron

* Update raytune search space file

* Setting numpy==1.23.5 in requirements.txt, later versions are incompatible with tf2onnx 1.14.0 (latest at time of commit)
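
The commit message "only max out NVIDIA L2 cache if GPUs are found" describes a guard around a GPU-specific tweak. The sketch below shows one way such a guard could look. It assumes the tweak is the commonly used recipe of raising the CUDA L2 fetch granularity limit to 128 bytes through `cudaDeviceSetLimit`; the PR's actual code is not shown on this page and may differ. The function name, the `num_gpus` parameter, and the `lib_name` default are hypothetical.

```python
import ctypes

# Value of cudaLimitMaxL2FetchGranularity in the CUDA runtime's cudaLimit enum.
CUDA_LIMIT_MAX_L2_FETCH_GRANULARITY = 0x05


def maximize_l2_fetch_granularity(num_gpus: int, lib_name: str = "libcudart.so") -> bool:
    """Hypothetical helper: return True if the limit was raised, else False.

    num_gpus is supplied by the caller, for example
    len(tf.config.list_physical_devices("GPU")) in a TensorFlow script.
    """
    if num_gpus <= 0:
        # No GPUs found: skip the tweak entirely.
        return False
    try:
        cudart = ctypes.CDLL(lib_name)
    except OSError:
        # CUDA runtime library not available on this machine.
        return False
    # cudaError_t cudaDeviceSetLimit(enum cudaLimit limit, size_t value)
    cudart.cudaDeviceSetLimit.argtypes = [ctypes.c_int, ctypes.c_size_t]
    cudart.cudaDeviceSetLimit.restype = ctypes.c_int
    status = cudart.cudaDeviceSetLimit(CUDA_LIMIT_MAX_L2_FETCH_GRANULARITY, 128)
    return status == 0  # 0 is cudaSuccess
```

Returning a boolean instead of raising keeps CPU-only runs (for example CI jobs or local tests) on the same code path as GPU runs.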
jpata pushed a commit that referenced this pull request Sep 15, 2023
Former-commit-id: 35fc5d8
jpata pushed a commit that referenced this pull request Sep 25, 2023