Use version control for software and data #3

Open
gwaybio opened this issue Jul 17, 2019 · 4 comments
Labels
Rule Discussing possible rule

Comments

@gwaybio

gwaybio commented Jul 17, 2019

This is one of the more obvious technical rules (even the low-hanging fruit has to be grabbed! 🍌)

Services like:

  • GitHub
  • GitLab
  • Bitbucket
  • OSF

provide a nice framework that can easily (after some technical training) enable effective version control for software.

Version control of data is equally important. The resources I am aware of that can do this (in addition to those above) are:

  • Git-LFS (>= 2GB)
  • Figshare
  • Zenodo
  • Dat (maybe @daniellecrobinson can provide insight here and elsewhere!)

Benefit

Version control makes results reproducible, improves sharing and modification, and lets you track how updated data impacts results. Opening discussion below.
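
To make the "track how updated data impacts results" point concrete, here is a minimal sketch of the idea behind data versioning: fingerprint every data file and compare fingerprints between two points in time. This is only an illustration and is not how any of the services listed above work internally; the function names (`file_sha256`, `build_manifest`, `changed_files`, `save_manifest`) are made up for this example.

```python
import hashlib
import json
from pathlib import Path


def file_sha256(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(data_dir):
    """Map each file under data_dir (relative POSIX path) to its digest."""
    root = Path(data_dir)
    return {
        p.relative_to(root).as_posix(): file_sha256(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def changed_files(old_manifest, new_manifest):
    """Return sorted paths that were added, removed, or modified."""
    paths = set(old_manifest) | set(new_manifest)
    return sorted(p for p in paths if old_manifest.get(p) != new_manifest.get(p))


def save_manifest(manifest, out_path):
    """Write the manifest as stable, diff-friendly JSON (hypothetical helper)."""
    Path(out_path).write_text(json.dumps(manifest, indent=2, sort_keys=True))
```

One possible use, assuming the manifest is small enough to live next to the analysis code: commit the JSON file to the same Git repository, so that each set of results can be traced back to the exact data it was computed from, and `changed_files` shows which inputs differ between two runs.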

@gwaybio gwaybio added the Rule Discussing possible rule label Jul 17, 2019
@allaway

allaway commented Jul 25, 2019

Re data versioning:
I'm far from unbiased, but I think that Synapse is a great system for versioning data and tracking its provenance too. :)

I have a list of other similar services somewhere, I'll dig it up if I can find it.

@allaway

allaway commented Jul 25, 2019

This is probably more of a software versioning idea: using CWL/WDL or other analysis workflows in tandem with Dockerized or otherwise containerized software is a time-intensive but robust way to enable reproducible science. It also helps others apply your methodology to other data.

@allaway

allaway commented Jul 25, 2019

Here's the list. It was compiled by the Harvard Dataverse folks. https://docs.google.com/spreadsheets/d/1KptHzDHIdB3s1v5m1mMwphcwXhOVWdkRYdjEWW1dqrE/edit#gid=2016420688

WRT that list: OSF might be better considered a data versioning tool.

Edit - linking the blog post for citation purposes.

@allaway

allaway commented Jul 26, 2019

Here is a really neat project for software (or perhaps more accurately - analysis) versioning and portability that I just learned about this afternoon: http://boutiques.github.io/

2 participants