somehow organize performance benchmarking of git-annex #3

Open
yarikoptic opened this issue Oct 2, 2020 · 1 comment

Comments

@yarikoptic
Member

We added some checks to the git-annex build workflow to spot some cases that could lead to slow(er) operation of the standalone build, but overall we do not have a good way to detect when git-annex slows down. We only see it reflected when we try a new snapshot build by sweeping through our datalad tests, and then it becomes an archaeological expedition to find which change introduced the pessimization.

It would be nice to establish automated and consistent benchmarking of git-annex builds as pertinent to datalad.

Proposal:

  • take some release of datalad that is still compatible with the current annex build (so we could take the current release ATM, IIRC)
  • use the asv benchmarks of that datalad release, but for benchmarking git-annex (so whenever we improve our collection of datalad benchmarks, it automagically helps to benchmark git-annex)
  • establish datalad/git-annex-benchmarking on github
    • git subtree benchmarks from datalad
    • include git-annex's master branch (from git://git.kitenet.net/git-annex) as an annex-master branch known to that repo
      • I think asv can benchmark commits in another branch while the benchmarks themselves stay in master, so the asv configuration would take care of that
    • add a pythonish setup that does a standalone install of git-annex, so asv could deploy any given version of git-annex (I wonder if there is smth like ccache for haskell ;))
    • I think we had better establish right away a singularity container based on e.g. https://github.com/datalad-tester/dev-containers/blob/master/Singularity.10.20200209.1 which would add apt build-dep git-annex-standalone. That would be the container to run asv in: it would have all build dependencies etc., and we would use it for the "worker" env (see below) later
    • add a github action to run on cron, which would
      • git checkout annex-master && git pull --ff-only && git push origin annex-master && git checkout master
      • asv run on new commits in annex-master and then asv gh-pages && datalad save -m "ASV results update" .asv && git push origin
  • provide a github actions worker on a dedicated box (I have some with consistent timing), probably within singularity (at least some isolation and, again, consistency)
    • make that github action run only on pushes, not PRs, so we do not compromise security in any way
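
To make the cron step above a bit more concrete, here is a rough Python sketch of what the scheduled job could execute. It is only an illustration under assumptions: the command list mirrors the two bullets of the cron action, `asv run NEW` is my guess at how "asv run on new commits" would be spelled, and the names `CRON_STEPS` and `run_steps` are hypothetical helpers, not anything that exists in datalad or asv.

```python
import subprocess

# Steps copied from the proposal. "asv run NEW" is an assumption about how
# "asv run on new commits in annex-master" would be expressed.
CRON_STEPS = [
    ["git", "checkout", "annex-master"],
    ["git", "pull", "--ff-only"],
    ["git", "push", "origin", "annex-master"],
    ["git", "checkout", "master"],
    ["asv", "run", "NEW"],
    ["asv", "gh-pages"],
    ["datalad", "save", "-m", "ASV results update", ".asv"],
    ["git", "push", "origin"],
]


def run_steps(steps=CRON_STEPS, dry_run=True, runner=subprocess.run):
    """Run each step in order and return the list of steps attempted.

    With check=True the runner raises on the first failing command, which
    mimics the '&&' chaining used in the proposal. In dry-run mode nothing
    is executed; the steps are only collected.
    """
    attempted = []
    for cmd in steps:
        attempted.append(cmd)
        if not dry_run:
            runner(cmd, check=True)
    return attempted


if __name__ == "__main__":
    # Dry run by default: just show what the cron job would do.
    for cmd in run_steps(dry_run=True):
        print(" ".join(cmd))
```

Stopping at the first failure matters mostly for the `git pull --ff-only` step: if annex-master cannot be fast-forwarded, nothing gets pushed or benchmarked.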

WDYT @mih @kyleam @bpoldrack @jwodder

FYI @joeyh

@joeyh
Contributor

joeyh commented Oct 2, 2020 via email

@yarikoptic yarikoptic transferred this issue from datalad/datalad-extensions Nov 3, 2020