[datasets] Add a random_benchmark() method. #247

Merged


ChrisCummins

The v0.1.8 release removed random benchmark selection from
CompilerGym environments when no benchmark is specified. Users who
want random benchmark selection have had to roll their own
implementation since then. Randomly sampling from
env.dataset.benchmark_uris() is not always easy because the generator
may be infinite. For some datasets, e.g. Csmith, it is trivial to
select random benchmarks by generating random numbers within the range
of numeric seed values, but this is not obvious, and the user shouldn't
have to figure it out for the simple case of uniform random
selection.
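
To illustrate the difficulty, here is a rough sketch of the two workarounds a user might reach for today. It does not use CompilerGym itself: `benchmark_uris()` below is a hypothetical stand-in for an unbounded URI generator, and both the `generator://csmith-v0/` URI prefix and the 32-bit seed range are assumptions made for the example.

```python
import itertools
import random


def benchmark_uris():
    """Hypothetical stand-in for a dataset whose URI generator never ends."""
    for seed in itertools.count():
        yield f"generator://csmith-v0/{seed}"


rng = random.Random(0)

# list(benchmark_uris()) would never return, so the generator cannot be
# materialized and sampled directly.

# Workaround 1: sample from a bounded prefix of the generator. This only
# ever returns the first 1000 benchmarks, so it is not uniform over the
# whole dataset.
prefix = list(itertools.islice(benchmark_uris(), 1000))
uri_from_prefix = rng.choice(prefix)

# Workaround 2: if the benchmarks are known to be keyed by a numeric seed,
# draw the seed directly. The upper bound here is an assumption.
NUM_SEEDS = 2**32
uri_from_seed = f"generator://csmith-v0/{rng.randrange(NUM_SEEDS)}"
```

The second workaround is cheap, but it requires knowing how a particular dataset names its benchmarks, which is the knowledge this PR aims to keep inside the dataset.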

This adds a `random_benchmark()` method to the `Dataset` class which
allows uniform random benchmark selection, and a `random_benchmark()`
method to the `Datasets` class for sampling across datasets.
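
As a rough illustration of the shape of this API, the sketch below shows how each dataset could own its sampling strategy. It is not the code in this PR: `ListDataset` and `SeedDataset` are hypothetical classes, the `rng` parameter and the use of the standard-library `random` module are assumptions, and choosing a dataset uniformly before choosing a benchmark is just one possible way to sample across datasets.

```python
import random
from typing import List, Optional


class Dataset:
    """Simplified stand-in for a dataset base class (not the real one)."""

    def __init__(self, name: str):
        self.name = name

    def random_benchmark(self, rng: Optional[random.Random] = None) -> str:
        raise NotImplementedError


class ListDataset(Dataset):
    """Hypothetical finite dataset: pick uniformly from a known URI list."""

    def __init__(self, name: str, uris: List[str]):
        super().__init__(name)
        self._uris = list(uris)

    def random_benchmark(self, rng: Optional[random.Random] = None) -> str:
        if rng is None:
            rng = random.Random()
        return rng.choice(self._uris)


class SeedDataset(Dataset):
    """Hypothetical generator dataset: benchmarks are keyed by a seed."""

    def __init__(self, name: str, num_seeds: int):
        super().__init__(name)
        self._num_seeds = num_seeds

    def random_benchmark(self, rng: Optional[random.Random] = None) -> str:
        if rng is None:
            rng = random.Random()
        # Draw a seed directly instead of iterating over every URI.
        return f"{self.name}/{rng.randrange(self._num_seeds)}"


class Datasets:
    """Simplified collection of datasets (not the real class)."""

    def __init__(self, datasets: List[Dataset]):
        self._datasets = {d.name: d for d in datasets}

    def random_benchmark(self, rng: Optional[random.Random] = None) -> str:
        if rng is None:
            rng = random.Random()
        # Assumption: pick a dataset uniformly, then a benchmark within it.
        dataset = rng.choice(list(self._datasets.values()))
        return dataset.random_benchmark(rng)


# Usage: callers no longer need to know how each dataset is organized.
datasets = Datasets(
    [
        ListDataset("benchmark://a-v0", ["benchmark://a-v0/x", "benchmark://a-v0/y"]),
        SeedDataset("generator://b-v0", num_seeds=10),
    ]
)
uri = datasets.random_benchmark(random.Random(0))
```

Note that with this two-stage scheme a benchmark in a small dataset is more likely to be drawn than one in a large dataset, so it is uniform per dataset rather than uniform over all benchmarks.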

Fixes #240.

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label May 4, 2021
@ChrisCummins ChrisCummins added this to the v0.1.9 milestone May 4, 2021
@codecov-commenter

Codecov Report

Merging #247 (45580c7) into development (6a24c6d) will increase coverage by 0.03%.
The diff coverage is 75.00%.


@@               Coverage Diff               @@
##           development     #247      +/-   ##
===============================================
+ Coverage        63.72%   63.75%   +0.03%     
===============================================
  Files               77       77              
  Lines             5568     5587      +19     
===============================================
+ Hits              3548     3562      +14     
- Misses            2020     2025       +5     
| Impacted Files | Coverage | Δ |
| --- | --- | --- |
| compiler_gym/bin/manual_env.py | 80.79% <0.00%> | (ø) |
| compiler_gym/envs/llvm/datasets/csmith.py | 49.53% <50.00%> | (+0.01%) ⬆️ |
| compiler_gym/envs/llvm/datasets/llvm_stress.py | 65.51% <50.00%> | (-2.49%) ⬇️ |
| compiler_gym/datasets/dataset.py | 83.62% <83.33%> | (-0.02%) ⬇️ |
| compiler_gym/datasets/datasets.py | 93.15% <100.00%> | (+0.50%) ⬆️ |
| compiler_gym/datasets/files_dataset.py | 100.00% <100.00%> | (ø) |
| compiler_gym/datasets/tar_dataset.py | 97.59% <0.00%> | (-0.09%) ⬇️ |

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@ChrisCummins ChrisCummins merged commit e9d0937 into facebookresearch:development May 4, 2021
@ChrisCummins ChrisCummins deleted the random-benchmark branch May 4, 2021 20:44
This was referenced Jun 3, 2021