Write integration tests #39
Sounds good. I will add another issue to create unit tests as well as continuous integration. Is it ok to pick some wells from the recent directories? Is it ok to put them on GitHub (make them public)? And does anyone have a preference between unittest and pytest?
Yes, it is okay to host images publicly. They are small enough to be hosted in the GitHub repository.
When you say "put them on GitHub", do you mean directly in this repo? Or is there another data-hosting mechanism? We've kept testing data outside of the repo (gdrive, for example) and downloaded it during testing.
We used pytest in reconstructOrder, but in the past I used unittest, so I will defer to others' preferences here.
Yes, putting the images in the repo. Testing should be done with continuous integration on the CI server, so either you keep the test data in the repo or on a server from which it can be downloaded to the CI server each time a test runs (every time there's a push to GitHub). Since the repo itself is ~100MB and images are <5MB, I think it's ok to put 3 images in the repo.
I'm hesitant to include images in the repo, if only because we've done this before and found it can explode the size of the repo quickly and accidentally. Maybe it's a minor technical point, but what happens when you want an integration test to cover multiple wells/images? Our approach was to host the test data on a server (Google Drive) and provide it to the tests via the googledrivedownloader PyPI package. I'm not sure how well this plays with CI like GitHub Actions. @mattersoflight @smguo what do you think?
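A minimal sketch of that fetch-at-test-time approach using the googledrivedownloader package. Everything here is an assumption for illustration: the file id is a placeholder, the paths are made up, and the caching logic is just one way to avoid re-downloading between runs.

```python
# Sketch: pull test images from Google Drive at test time so they stay
# out of the repo. PLACEHOLDER_FILE_ID is not a real share id.
import os

def fetch_test_data(file_id, dest_path):
    """Download a test-data archive from Google Drive, reusing a cached copy."""
    if os.path.exists(dest_path):
        # Already downloaded on a previous run; skip the network entirely.
        return dest_path
    # Requires `pip install googledrivedownloader`; imported lazily so the
    # cached path above works even without the package installed.
    from google_drive_downloader import GoogleDriveDownloader as gdd
    gdd.download_file_from_google_drive(
        file_id=file_id, dest_path=dest_path, unzip=True)
    return dest_path

# Demo with a pre-existing local file, so no network access happens here:
open("cached_wells.zip", "w").close()
print(fetch_test_data("PLACEHOLDER_FILE_ID", "cached_wells.zip"))
```

A pytest fixture could wrap `fetch_test_data` so every integration test shares one download per CI run.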
I agree with @bryantChhun. I expect that we will update the test images as the data rolls in and the antigen array format evolves. This means the repo can end up with multiple versions of test data and bloat. We did have to use BFG (https://www.phase2technology.com/blog/removing-large-files-git-bfg) on reconstruct-order.
I looked up GitHub Actions and prefer it over Travis. It looks like one can trigger the integration test when there is a pull request into a branch, and once the integration test is merged into master, trigger it on every PR into master (https://help.github.com/en/actions/reference/workflow-syntax-for-github-actions).
I'm fine with keeping images on Google Drive as long as we only run integration tests when there's a PR, and run unit tests every time there's a push to GitHub.
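A workflow along these lines could implement that split. This is only a sketch: the job names, test paths, and action versions are assumptions, not the project's actual configuration.

```yaml
# Sketch: unit tests on every push, integration tests only on pull requests.
name: tests
on:
  push:
  pull_request:
jobs:
  unit:
    if: github.event_name == 'push'
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - uses: actions/setup-python@v2
      - run: pip install pytest && pytest tests/unit
  integration:
    if: github.event_name == 'pull_request'
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - uses: actions/setup-python@v2
      - run: pip install pytest && pytest tests/integration
```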
I just realized I missed the question about which testing framework we should use. I suggest pytest, since it is the simplest and most succinct. We would like all contributors to write and maintain tests for the relevant bits of code.
Just a note: as of a month ago, PyTest went from 4-5 to 1 maintainer due to internal conflicts. |
What is the bottleneck? Please describe.
Our review process involves processing 2-3 datasets to determine if the pipeline is functional. Running these tests takes 15-20 minutes.
Describe the solution you'd like and alternatives
Having an automated integration test will speed up the review process. It would make sense to create a dataset of 4 random wells from 3 acquisitions and test that the ODs, intensities, and backgrounds output by the pipeline match a curated reference. As our results become more accurate, we will update the reference.
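The comparison against a curated reference could be as simple as an elementwise tolerance check. A sketch, with made-up keys, values, and tolerance standing in for the real pipeline output:

```python
# Sketch: compare pipeline output against a curated reference with a
# relative tolerance, so small numerical drift doesn't fail the test.
import numpy as np

def compare_to_reference(result, reference, rtol=1e-3):
    """True if ODs, intensities, and backgrounds all match the reference."""
    return all(
        np.allclose(result[key], reference[key], rtol=rtol)
        for key in ("od", "intensity", "background")
    )

# Placeholder values, not real data:
reference = {"od": np.array([0.51, 0.48]),
             "intensity": np.array([1200.0, 980.0]),
             "background": np.array([101.0, 99.5])}
result = {k: v * 1.0005 for k, v in reference.items()}  # tiny drift
print(compare_to_reference(result, reference))
```

When the pipeline improves, regenerating `reference` from a vetted run is the "update the reference" step described above.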
A good time to add this feature would be after refactoring (#34, #36, #29), but before multiprocessing (#33).
It will also be useful to invoke the integration test from the CLI with a --test flag.
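With argparse that flag is one extra argument on the existing parser. A sketch; the program name and the way the flag is consumed are placeholders, not the project's actual CLI:

```python
# Sketch: a --test flag that switches the CLI into integration-test mode.
import argparse

def build_parser():
    parser = argparse.ArgumentParser(prog="pipeline")  # name is a placeholder
    parser.add_argument(
        "--test", action="store_true",
        help="run the integration test instead of the full pipeline")
    return parser

args = build_parser().parse_args(["--test"])
print(args.test)
```

The main entry point would then branch on `args.test` to run the integration suite rather than a full acquisition.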