Releases: mlcommons/algorithmic-efficiency

algoperf-benchmark-0.1.5

31 Mar 03:57
5b4914f

Summary

  • Finalized variant workload targets.
  • Fix in random_utils helper function.
  • Set inplace=True for the Dropout layers in the PyTorch Conformer workload.
  • Clear the CUDA cache at the beginning of each trial for PyTorch.
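
As a rough illustration of the last item, the sketch below clears PyTorch's cached CUDA memory before each trial starts. It is not taken from the repository: `clear_cuda_cache` and `run_trials` are hypothetical names, and the only PyTorch calls used are `torch.cuda.is_available()` and `torch.cuda.empty_cache()`. The helper does nothing when PyTorch or a GPU is unavailable.

```python
import importlib.util


def clear_cuda_cache() -> bool:
    """Release PyTorch's cached CUDA memory; return True if a cache was cleared."""
    if importlib.util.find_spec("torch") is None:
        return False  # PyTorch is not installed
    import torch

    if not torch.cuda.is_available():
        return False  # no GPU visible to this process
    torch.cuda.empty_cache()
    return True


def run_trials(num_trials, run_trial):
    """Run trials in sequence, clearing the CUDA cache before each one.

    `run_trial` is a hypothetical callable that takes the trial index.
    """
    results = []
    for trial_index in range(num_trials):
        clear_cuda_cache()
        results.append(run_trial(trial_index))
    return results
```

Clearing the cache between trials returns unused cached blocks to the driver, which can help when consecutive trials use different hyperparameters and memory footprints.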

What's Changed

Full Changelog: algoperf-benchmark-0.1.4...algoperf-benchmark-0.1.5

algoperf-benchmark-0.1.4

27 Mar 01:03
8bd3876

Upgrade CUDA version to CUDA 12.1:

  • Upgrade CUDA version in Dockerfiles that will be used for scoring.
  • Update JAX and PyTorch package version tags to use the local CUDA installation.

Add flag for completely disabling checkpointing.

  • Note that we will run with checkpointing off at scoring time.

Update Deepspeech and Conformer variant target setting configurations.

  • Note that variant targets are not final.

Fixed a bug in the scoring code so that it takes the best trial in a study for the external-tuning ruleset.
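
The idea of taking the best trial in a study can be sketched as follows. This is a minimal illustration, not the repository's scoring code: the function names are hypothetical, a trial that never reaches the target is represented as `None`, and combining studies with a median is an assumption made for the example.

```python
from statistics import median


def best_trial_time(trial_times):
    """Return the fastest time-to-target in one study.

    Trials that never reached the target are given as None. If no trial
    reached it, the study's time is treated as infinite.
    """
    reached = [t for t in trial_times if t is not None]
    return min(reached) if reached else float("inf")


def workload_score(studies):
    """Combine studies by the median of each study's best trial time.

    The median aggregation is an assumption made for this sketch.
    """
    return median(best_trial_time(trials) for trials in studies)


# Three hypothetical studies with three trials each (seconds to target).
studies = [
    [120.0, 95.0, None],
    [110.0, None, None],
    [None, None, None],
]
print(workload_score(studies))  # 110.0
```

Here the per-study best times are 95.0, 110.0, and infinity, so the median is 110.0.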

Added instructions for submission.

Changed the default number of workers for PyTorch data loaders to 0. Running ImageNet workloads with more than 0 workers may lead to incorrect eval results; see #732.
Update: for speech workloads, the pytorch_eval_num_workers flag of submission_runner.py must be set to a value greater than 0 to prevent a data loader crash in the JAX code.

algoperf-benchmark-0.1.3

06 Mar 21:31
0618974

Update technical documentation.

Bug fixes:

  • Fix workload variant names in Dockerfile.
  • Fix ViT GLU OOM by reducing the batch size.
  • Fix submission_runner stopping condition.
  • Fix dropout rng in ViT and WMT.

algoperf-benchmark-0.1.2

05 Mar 01:10
6b188ba

Add workload variants.

Add prize qualification logs for external tuning ruleset.
Note: FastMRI trials with dropout are not yet added due to #664.

Add functionality to Docker startup script for self_tuning ruleset.
Add self_tuning ruleset option to script that runs all workloads for scoring.

Data setup fixes.

Fix tests that check training differences in PyTorch and JAX on GPU.

algoperf-benchmark-0.1.0

28 Nov 18:27
ca87833

First release of the AlgoPerf: Training algorithms benchmarking code.