Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Write integration tests #39

Closed
mattersoflight opened this issue Apr 8, 2020 · 10 comments
Closed

Write integration tests #39

mattersoflight opened this issue Apr 8, 2020 · 10 comments

Comments

@mattersoflight
Copy link
Member

mattersoflight commented Apr 8, 2020

What is the bottleneck? Please describe.

Our review process involves processing 2-3 datasets to determine if the pipeline is functional. Running these tests takes 15-20 minutes.

Describe the solution you'd like and alternatives

Having automated integration test will speed the review process. It would make sense to create a dataset of 4 random wells from 3 acquisitions and test that ODs, intensities, and backgrounds output by the pipeline match with curated reference. If our results become more accurate, we will update the reference.

Good time to add this feature would be after refactoring (#34 , #36 , #29 ), but before multiprocessing (#33).

It will also be useful to invoke the integration test from cli with --test flag.

@jennyfolkesson
Copy link
Contributor

Sounds good. I will add another issue to create unit tests as well as well as continuous integration.

Is it ok to pick some wells from the recent directories? Is it ok to put them on github (make them public)?

And does anyone have any preferences regarding if we use Unittest or pytest?

@mattersoflight
Copy link
Member Author

Yes it is okay to host images publicly. They are small enough to hosted on github repository.
I suggest 3 each from two flu plates, the recent covid-19 plate, and the vanilla ELISA plate for the start.

@bryantChhun
Copy link
Contributor

Is it ok to pick some wells from the recent directories? Is it ok to put them on github (make them public)?

When you mean "put them on github" do you mean directly in this repo? Or is there another data hosting mechanism? We've kept testing data outside of the repo (gdrive, for example) and downloaded it during testing.

And does anyone have any preferences regarding if we use Unittest or pytest?

We used pytest in reconstructOrder, but in the past I used Unittest, so will defer to other's preferences here.

@jennyfolkesson
Copy link
Contributor

Yes, putting the images in the repo. Testing should be done with continuous integration on the CI server, so either you need to keep the test data in the repo or on a server from where they can be downloaded to the CI server every time a test is done (every time there's a push on GitHub). Since the repo itself is ~100MB and images are <5MB I think it's ok to put 3 images in the repo.
Here's a relevant discussion:
https://softwareengineering.stackexchange.com/questions/257881/should-test-data-be-checked-into-version-control

@bryantChhun
Copy link
Contributor

I'm hesitant to include images in the repo, if only because we've done this before and found it can explode the size of the repo quickly and accidentally. Maybe it's a minor technical point, but what happens when you want an integration test to test multiple wells/images?

Our approach was to host the test data on a server -- google drive -- and provide those to the tests via the googledrivedownloader pypi package. I'm not sure how well this plays with CI like github actions.

@mattersoflight @smguo what do you think?

@mattersoflight
Copy link
Member Author

I agree with @bryantChhun. I expect that we will update the test images as the data rolls in and antigen array format evolves. Thich means that repo can end up with multiple versions of test data and create a bloat. We did have to use BFG (https://www.phase2technology.com/blog/removing-large-files-git-bfg) on reconstruct-order.

@mattersoflight
Copy link
Member Author

I looked up github actions and prefer that over travis. It looks like one can trigger the integration test when there is a pull request into a branch and once integration test is merged into master, trigger the test when there is PR for master.

(https://help.github.com/en/actions/reference/workflow-syntax-for-github-actions).

@jennyfolkesson
Copy link
Contributor

I'm fine with keeping images on google drive as long as we only do integration tests when there's a PR, and do unit testing every time there's a push to github.

@mattersoflight
Copy link
Member Author

I just realize I missed the question on which testing framework should we use. I suggest pytest, since it is the easiest and succinct. We would like all contributors to write & maintain tests for relevant bits of code.

@jennyfolkesson
Copy link
Contributor

Just a note: as of a month ago, PyTest went from 4-5 to 1 maintainer due to internal conflicts.
I think we should still go ahead with PyTest. Hopefully they will resolve this and realize that kindness makes team work so much better.
https://adrin.info/open-source-coc-conflicts.html

@lenafb lenafb closed this as completed Mar 11, 2022
This issue was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants