Adding Unit Tests to Yolov5 #7870

Closed · 1 task done
JWLee89 opened this issue May 18, 2022 · 10 comments
Labels: question (Further information is requested)

JWLee89 (Contributor) commented May 18, 2022

Search before asking

From checking past issues, it seems this was already proposed but has since been closed:
#407

Question

I was wondering whether there are any plans to add unit tests using testing libraries such as pytest, and to add the associated GitHub workflows to the CI / CD phase.

From my understanding, the CI tests so far involve just running scripts such as export.py, detect.py, train.py, etc. Furthermore, I think unit tests would also serve as documentation for developers, allowing them to understand the behavior and input / output of each function.

Are there any plans to write tests and add GitHub workflows to improve visibility?

  • visibility in terms of ensuring that each feature is working as expected on the function level

Although this will likely take a long time, I think it would save as much time, if not more, in the future if implemented effectively.
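
To make the idea concrete, below is a minimal sketch of the kind of pytest unit test I have in mind (a sketch only, assuming the box-conversion helpers in utils/general.py are importable from the repo root):

```python
# Minimal pytest sketch for a low-level YOLOv5 utility, assuming
# utils.general.xywh2xyxy and xyxy2xywh are importable from the repo root.
import torch

from utils.general import xywh2xyxy, xyxy2xywh


def test_xywh2xyxy_roundtrip():
    # Boxes in (x_center, y_center, width, height) format.
    xywh = torch.tensor([[50.0, 50.0, 20.0, 10.0]])
    xyxy = xywh2xyxy(xywh)
    # Converting back should recover the original boxes.
    assert torch.allclose(xyxy2xywh(xyxy), xywh)
```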

Additional

No response

JWLee89 added the question label on May 18, 2022
glenn-jocher (Member) commented May 18, 2022

@JWLee89 yes, we have CI in place on every commit and also every 24 hours to catch any dependency changes affecting results. The high-level scripts naturally invoke most low-level functions.

We also need to add more detailed export and benchmarking CI (python utils/benchmarks.py) in a separate action, since this takes longer.
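
Roughly, that separate action would just invoke the benchmark entry point, something like the sketch below (assuming utils/benchmarks.py exposes a run() function accepting weights and imgsz):

```python
# Rough sketch of what the separate benchmarking action would call, assuming
# utils/benchmarks.py exposes run(weights=..., imgsz=...).
from utils import benchmarks

# Exports the model to each supported format and validates every exported model.
benchmarks.run(weights="yolov5s.pt", imgsz=640)
```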

Status

CI CPU testing

If this badge is green, all YOLOv5 GitHub Actions Continuous Integration (CI) tests are currently passing. CI tests verify correct operation of YOLOv5 training (train.py), validation (val.py), inference (detect.py) and export (export.py) on macOS, Windows, and Ubuntu every 24 hours and on every commit.

glenn-jocher (Member):

We also need to add CI on single-GPU and multi-GPU, which is lacking right now.

JWLee89 (Contributor, Author) commented May 18, 2022

@glenn-jocher Thank you very much for the response.

I took a look at the GitHub Actions workflow files and saw how the tests are being run. As you mentioned, the high-level scripts invoke most of the low-level functions, so I am guessing this is a quick and effective method (in terms of computational resources and implementation time) to receive timely feedback on the repo's functionality.

Apologies for the ambiguity. My question was geared towards writing unit tests for some of the low-level functions to receive more detailed feedback. But come to think of it, this might not be worth it, since the amount of work and code involved would be fairly significant.

Thank you for the answer!

glenn-jocher (Member):

@JWLee89 yeah, I'm not sure either. I know smaller unit tests are more common in industry, but at this point we actually need to expand the high-level tests as our first priority. The things that are currently missing are:

  • Checking mAP on trained models. Right now a PR (or a dependency change) could create a silent training error that would drop mAP, and we have no way of knowing (see the sketch at the end of this comment). This is not really feasible on CPU though, which brings me to point 2:
  • GPU CI. We need to connect external multi-GPU agents to GitHub to run the existing CI checks on single-GPU and multi-GPU.
  • Exports. We need to run exports on every format and identify any failing formats.
  • Export benchmarks. Closely related to the previous point, we need to verify near-identical mAP on the exported formats. See YOLOv5 Export Benchmarks for GPU #6963

Some of these may be time consuming, so we may want to start them as daily cron jobs rather than on PRs/commits to master. If you'd like to help with any of the above, that would be awesome!!
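
As a rough sketch of the mAP check in the first bullet (not actual CI code; indexing into the val.run() results tuple is an assumption here):

```python
# Rough sketch of a post-training mAP regression check, assuming val.run()
# returns a results tuple whose fourth element is mAP@0.5:0.95.
import val  # YOLOv5's val.py

MAP_FLOOR = 0.25  # hypothetical minimum for a short VOC training run

results, _, _ = val.run(data="VOC.yaml", weights="runs/train/exp/weights/best.pt", imgsz=640)
map50_95 = results[3]
assert map50_95 >= MAP_FLOOR, f"mAP {map50_95:.3f} fell below the {MAP_FLOOR} floor"
```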

JWLee89 (Contributor, Author) commented May 18, 2022

@glenn-jocher Thank you for the detailed answer on what needs to be done!

Checking mAP on trained models. Right now a PR (or a dependency change) could create a silent training error that would drop mAP and we have no way of knowing.

I don't quite understand what is meant by checking mAP on trained models. I took a brief look at train.py, and it seems that mAP is calculated periodically during the training phase and the best model (in terms of mAP) is periodically serialized. Does this mean raising a warning or event if the mAP does not follow the trend of the existing pre-trained models when a PR or dependency-change event is triggered?

Right now, to familiarize myself with the code base, I started working on removing shell=True from the subprocess calls in export.py that take in user input such as a file path (to prevent possible or unexpected injections). Would this be helpful by any chance?
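
For example, the change follows this pattern (illustrative only; some_converter is a placeholder command, not an actual export.py call):

```python
# Illustration of removing shell=True: pass an argument list so a
# user-supplied path is never interpreted by a shell.
import subprocess

f = "weights/yolov5s_saved_model"  # hypothetical user-supplied path
# Before: subprocess.run(f"some_converter --input {f} --output {f}_out", shell=True)
subprocess.run(["some_converter", "--input", f, "--output", f + "_out"], check=True)
```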

Exports. We need to run exports on every format and identify any failing formats.

From what I've seen in export.py, it seems that the vast majority of export formats are well supported (even Edge TPU).
Are there any specific export formats that are being targeted for support?

glenn-jocher (Member):

@JWLee89 checking mAP means, for example, that we train VOC from scratch for 30 epochs and assert that the final mAP is above a threshold. We could do this in a cron job, or even better, log it to a spreadsheet to show users that training has improved or stayed the same over time.

Most export formats are not part of CI, so the proposals above would expand CI to all exports and also assert that exported models validate above an mAP threshold, which is not done for any models now.
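
Conceptually, the export side of that CI could look something like this sketch (format names and error handling are illustrative; it assumes export.run() accepts an include argument):

```python
# Conceptual sketch of export CI: attempt every format and collect failures,
# assuming export.run() accepts an include tuple of format names.
import export  # YOLOv5's export.py

FORMATS = ("torchscript", "onnx", "openvino", "saved_model", "tflite")  # illustrative subset
failures = []
for fmt in FORMATS:
    try:
        export.run(weights="yolov5s.pt", include=(fmt,))
    except Exception as e:  # collect every failure so all broken formats are reported
        failures.append(f"{fmt}: {e}")
assert not failures, "Export failures:\n" + "\n".join(failures)
```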

Yes, absolutely, if you see any small improvements please feel free to submit those as well. Since YOLOv5 implements so many wide-ranging functions, there are some areas where I'm not an expert and which could use improvement.

JWLee89 (Contributor, Author) commented May 18, 2022

@glenn-jocher Thank you for taking time out of your busy schedule to respond.

checking mAP means, for example, that we train VOC from scratch for 30 epochs and assert that the final mAP is above a threshold. We could do this in a cron job, or even better, log it to a spreadsheet to show users that training has improved or stayed the same over time.

I am assuming that we allow YOLO to access a database of some sort that stores the threshold mAP for each model at particular checkpoints (e.g. 20.3 at epoch 5, 27.5 at epoch 10, etc.; the numbers are arbitrary) to show a trend. After training for 30 epochs, we assert that the final mAP is over a certain threshold, and if it is not, we raise an error notifying developers of the bug / issue.

Would this be in line with your idea / direction?

Most export formats are not part of CI, so the proposals above would expand CI to all exports and also assert that exported models validate above an mAP threshold, which is not done for any models now.

Ahh, so I am guessing this task would ensure that CI detects whether all the export functions are working as intended. After that, we would load a pre-trained model (e.g. yolov5s6), run it on the test set, and check that each exported model validates above the mAP threshold.

One possible issue is that in order to run this for certain export formats such as TensorRT, we would need to install TensorRT as a dependency in the YOLO Docker image or have a self-hosted CI / CD server that does this.
The same applies to Edge TPU: we would need a Coral dev board or access to an Edge TPU on the CI / CD server.

Please correct me if any of my assumptions is incorrect.

Yes, absolutely, if you see any small improvements please feel free to submit those as well. Since YOLOv5 implements so many wide-ranging functions, there are some areas where I'm not an expert and which could use improvement.

I'd love to work on YOLOv5 during my weekends or downtime! I just made a simple pull request (#7875) to ease my way into the yolov5 repository.

Thank you very much for the response!

glenn-jocher (Member):

@JWLee89 yes the training CI seems correct.

Not all export formats are benchmarkable, as you mentioned. Specifically, CoreML, Edge TPU, TF.js, etc. are not validatable on ubuntu-latest GitHub Actions runners, leaving NaN in the table, but this is fine.

                   Format  mAP@0.5:0.95  Inference time (ms)
0                 PyTorch        0.4623                11.51
1             TorchScript        0.4623                 8.70
2                    ONNX        0.4623                17.08
3                OpenVINO           NaN                  NaN
4                TensorRT        0.4628                 3.36
5                  CoreML           NaN                  NaN
6   TensorFlow SavedModel        0.4623                25.13
7     TensorFlow GraphDef        0.4623                25.20
8         TensorFlow Lite           NaN                  NaN
9     TensorFlow Edge TPU           NaN                  NaN
10          TensorFlow.js           NaN                  NaN

TensorRT is installed on the default Docker image ultralytics/yolov5:latest
https://hub.docker.com/repository/docker/ultralytics/yolov5/tags

JWLee89 (Contributor, Author) commented May 18, 2022

Ahhh I see.

Thank you very much for the clarification! I will make a note of what was discussed here as a reminder of the tasks that need to be done.

glenn-jocher (Member):

@JWLee89 you're welcome! I'm glad I could help clarify things for you. If you have any more questions or need further assistance as you work on YOLOv5, feel free to reach out. Good luck and have fun contributing to the repository!
