Distilled teacher model #30
Comments
From my understanding, while there may be no direct similarity between the ImageNet and MVTec datasets, the aim is for the features learned by the teacher model to capture relevant visual patterns from a vast amount of data. During distillation, the student network is trained to emulate the behavior of the pretrained teacher (pre-distilled on ImageNet), but only on the normal data of the MVTec dataset. At test time on anomalous data, a high discrepancy between the student and teacher outputs is therefore expected: the student has never been taught to replicate the teacher's output on anomalous instances, while the teacher's features, derived from the broader ImageNet representations, still respond to them. This discrepancy is leveraged as an anomaly indicator, highlighting instances where the student deviates from the learned normal patterns.
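To make the mechanism concrete, here is a minimal, hypothetical sketch of how such a discrepancy can be scored. The feature vectors and the cosine-distance scoring are illustrative assumptions, not anomalib's actual implementation (which compares multi-scale feature maps); the point is only that the student matches the teacher on normal inputs and diverges on anomalous ones.

```python
import math

def cosine_distance(a, b):
    """Return 1 - cosine similarity between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return 1.0 - dot / (norm_a * norm_b)

# Hypothetical feature vectors for illustration only.
teacher_feat = [0.9, 0.1, 0.4]
student_on_normal = [0.88, 0.12, 0.41]   # student learned to mimic the teacher
student_on_anomaly = [0.1, 0.8, -0.3]    # student was never trained on this

score_normal = cosine_distance(teacher_feat, student_on_normal)
score_anomaly = cosine_distance(teacher_feat, student_on_anomaly)

# The larger teacher-student discrepancy flags the anomalous input.
assert score_anomaly > score_normal
```

In practice this score is computed per spatial location of the feature maps, yielding an anomaly map rather than a single scalar.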
In addition to the good answer from @willyfh, I recommend reading the following paper to deepen your understanding of the mechanism.
Sorry for my ignorance, but I have a small question about the teacher model.
Does pre-distilling the teacher model on an image dataset make it smarter when training on other datasets?
Can you share the background of this design choice? I feel there is no similarity between the MVTec dataset and the ImageNet dataset.
Thank you.