
Any plan for Knowledge Distillation? #1762

Closed
hzhuangdy opened this issue Dec 23, 2020 · 9 comments
Labels: enhancement (New feature or request), Stale

Comments

@hzhuangdy

🚀 Feature

Use a large teacher model to train a smaller, lighter student model.
This is an effective way to simplify a model without a significant drop in accuracy.
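
For reference, the classic formulation (Hinton et al., 2015) trains the student to match the teacher's temperature-softened output distribution rather than one-hot labels. A minimal sketch for a classification head; the function and parameter names here are illustrative, not part of YOLOv5:

import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, T=4.0):
    # T > 1 softens both distributions so the student also learns from
    # the teacher's relative probabilities on the non-target classes.
    soft_targets = F.softmax(teacher_logits / T, dim=-1)
    log_probs = F.log_softmax(student_logits / T, dim=-1)
    # The T**2 factor keeps gradient magnitudes comparable across temperatures.
    return F.kl_div(log_probs, soft_targets, reduction='batchmean') * T ** 2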

Motivation

To make a small model more efficient

Pitch

Alternatives

Additional context

hzhuangdy added the enhancement (New feature or request) label on Dec 23, 2020
@github-actions (bot) commented Dec 23, 2020

Hello @hzhuangdy, thank you for your interest in 🚀 YOLOv5! Please visit our ⭐️ Tutorials to get started, where you can find quickstart guides for simple tasks like Custom Data Training all the way to advanced concepts like Hyperparameter Evolution.

If this is a 🐛 Bug Report, please provide screenshots and minimum viable code to reproduce your issue, otherwise we cannot help you.

If this is a custom training ❓ Question, please provide as much information as possible, including dataset images, training logs, screenshots, and a public link to online W&B logging if available.

For business inquiries or professional support requests please visit https://www.ultralytics.com or email Glenn Jocher at glenn.jocher@ultralytics.com.

Requirements

Python 3.8 or later with all requirements.txt dependencies installed, including torch>=1.7. To install run:

$ pip install -r requirements.txt

Environments

YOLOv5 may be run in any of the following up-to-date verified environments (with all dependencies including CUDA/CUDNN, Python and PyTorch preinstalled):

Status

CI CPU testing

If this badge is green, all YOLOv5 GitHub Actions Continuous Integration (CI) tests are currently passing. CI tests verify correct operation of YOLOv5 training (train.py), testing (test.py), inference (detect.py) and export (export.py) on macOS, Windows, and Ubuntu every 24 hours and on every commit.

@glenn-jocher (Member)

@hzhuangdy yes, this is a very interesting concept. I've used this myself for autolabelling data with a trained YOLOv5x model in order to teach a smaller YOLOv5s model, and it works very well. These steps are possible manually at the moment, but hopefully in the future we can move towards a more automated pipeline for this sort of behavior.

@github-actions (bot)

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@RobinBram

@glenn-jocher Did you train the smaller model on soft or hard labels?

@glenn-jocher (Member)

@RobinBram not sure I understand. All models are trained identically, with commands to reproduce trainings displayed in README here:
https://github.com/ultralytics/yolov5#training

@RobinBram

@glenn-jocher Since the topic was knowledge distillation: from all the research I have read, the best approach is to use the teacher's soft labels rather than hard labels, so that the student learns to reason and generalize in the same way as the teacher. I'm not sure what the soft labels are in the object detection case. I suppose including the teacher's confidences for the different boxes in the loss function might work.

Knowledge distillation is also possible without any additional data, just using a weighted loss function over the ground truth and the teacher's soft labels to train the student. I think I will try both in the master's thesis I'm currently working on.
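
A sketch of that weighted objective in a classification setting; alpha, T, and the names below are illustrative assumptions, not YOLOv5 code:

import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, targets, alpha=0.5, T=4.0):
    # Hard-label term: standard cross-entropy against the ground truth.
    hard = F.cross_entropy(student_logits, targets)
    # Soft-label term: KL divergence to the teacher's softened distribution,
    # rescaled by T**2 to balance gradient magnitudes.
    soft = F.kl_div(F.log_softmax(student_logits / T, dim=-1),
                    F.softmax(teacher_logits / T, dim=-1),
                    reduction='batchmean') * T ** 2
    return alpha * hard + (1 - alpha) * soft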

@glenn-jocher (Member) commented Mar 6, 2021

@RobinBram ah I see, soft labels are provided by a teacher model rather than a human labeler.

Yes, we have some of the tools for this, but not the entire chain. You can autolabel any dataset by running it through test.py (or detect.py) with --save-txt, which will generate YOLO-format labels for all the detections, and you can also include the confidences in the label as a 6th column if you also pass --save-conf to test.py.

I've used this for example to label all of COCO test set (40k images) with YOLOv5x, and then add them to COCO train set (120k images), to train new models on the merged dataset (160k images). Training on two datasets at the same time is very easy with YOLOv5, you just pass them both in your data.yaml as a list: train: [data1/images, data2/images]
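
For example, a data.yaml along these lines (paths and class list are placeholders):

# data.yaml: train on two image sources at once
train: [data1/images, data2/images]  # e.g. human-labelled set + teacher-autolabelled set
val: data1/images/val
nc: 80  # number of classes
names: [person, bicycle, car]  # shortened here; list all nc class names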

The result of the 160k experiment was that smaller models like YOLOv5s achieved better results, but YOLOv5x itself did not improve, since it's the same size as the teacher model.

We also don't have code in place to exploit label confidences yet, though, so in my experiment above I only labelled high-confidence objects, i.e. --conf 0.9. If you'd like to help contribute, that would be great!

python test.py --data unlabelled.yaml --save-txt --save-conf
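
Until confidence-aware training exists, one way to use the confidences is a small post-processing script that filters the saved labels by their 6th column before training, instead of relying on --conf at inference time. A rough sketch, assuming the class x y w h conf format written by --save-txt --save-conf; the labels path is a placeholder for your own run directory:

from pathlib import Path

CONF_THRES = 0.9  # keep only high-confidence teacher predictions

for txt in Path('runs/test/exp/labels').glob('*.txt'):
    kept = []
    for line in txt.read_text().splitlines():
        *box, conf = line.split()  # class, x, y, w, h, conf
        if float(conf) >= CONF_THRES:
            kept.append(' '.join(box))  # drop conf -> standard 5-column label
    txt.write_text('\n'.join(kept) + ('\n' if kept else ''))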

@glenn-jocher (Member)

@RobinBram these are the commands you should look at. --save-hybrid is a very advanced feature that appends model predictions to existing (probably human) labels in your dataset (if any). NMS is run on the combined set for each image, with the a priori/human labels assigned confidences of 1.0 before NMS.

yolov5/test.py, lines 295 to 297 at cd8ed35:

parser.add_argument('--save-txt', action='store_true', help='save results to *.txt')
parser.add_argument('--save-hybrid', action='store_true', help='save label+prediction hybrid results to *.txt')
parser.add_argument('--save-conf', action='store_true', help='save confidences in --save-txt labels')
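
So a hybrid autolabelling run might look like this (the dataset yaml is a placeholder):

python test.py --data data.yaml --save-txt --save-hybrid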

@timothylimyl

If anyone is looking at this thread: everything has been migrated into detect.py.
