Bag of Instances Aggregation Boosts Self-supervised Distillation

Official implementation of the ICLR 2022 paper Bag of Instances Aggregation Boosts Self-supervised Distillation,
by Haohang Xu*, Jiemin Fang*, Xiaopeng Zhang, Lingxi Xie, Xinggang Wang, Wenrui Dai, Hongkai Xiong, Qi Tian.

Recent advances in self-supervised learning have brought remarkable progress, especially for contrastive learning based methods, which regard each image, as well as its augmentations, as an individual class and try to distinguish it from all other images. However, due to the large number of exemplars, this kind of pretext task intrinsically suffers from slow convergence and is hard to optimize. This is especially true for small-scale models, whose performance we find drops dramatically compared with their supervised counterparts. In this paper, we propose a simple but effective distillation strategy for unsupervised learning. The highlight is that the relationship among similar samples counts and can be seamlessly transferred to the student to boost performance. Our method, termed BINGO (short for Bag of InstaNces aGgregatiOn), aims at transferring the relationship learned by the teacher to the student. Here a bag of instances denotes a set of similar samples constructed by the teacher and grouped within a bag, and the goal of distillation is to aggregate compact representations over the student with respect to the instances in a bag. Notably, BINGO achieves new state-of-the-art performance for small-scale models, i.e., 65.5% and 68.9% top-1 accuracy under linear evaluation on ImageNet with ResNet-18 and ResNet-34 backbones, respectively, surpassing the baselines (52.5% and 57.4% top-1 accuracy) by a significant margin.

(Framework overview figure)
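
One way to read "aggregating compact representations over the student" is that every member of a bag acts as a query whose positive is the bag's anchor, so the student is pushed to map all instances of a bag to nearby points. The snippet below is only a minimal PyTorch sketch under that reading; the tensor shapes, temperature, and loss composition are assumptions, not the exact objectives used in the released code.

import torch
import torch.nn.functional as F

def bag_aggregation_loss(anchor, bag, negatives, temperature=0.2):
    """Pull every instance in a bag toward the bag's anchor embedding.

    anchor:    (D,)    student embedding of the anchor image
    bag:       (B, D)  student embeddings of the B instances in the bag
    negatives: (N, D)  negative embeddings (e.g. from a memory queue)
    All inputs are assumed to be L2-normalized.
    """
    pos = bag @ anchor                               # (B,)   similarity of each bag member to the anchor
    neg = bag @ negatives.t()                        # (B, N) similarity to negatives
    logits = torch.cat([pos.unsqueeze(1), neg], dim=1) / temperature
    labels = torch.zeros(bag.size(0), dtype=torch.long, device=bag.device)
    return F.cross_entropy(logits, labels)           # the positive sits at index 0

# Toy usage with random, normalized features:
# anchor = F.normalize(torch.randn(128), dim=0)
# bag = F.normalize(torch.randn(8, 128), dim=1)
# negatives = F.normalize(torch.randn(4096, 128), dim=1)
# loss = bag_aggregation_loss(anchor, bag, negatives)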

Requirements

  • PyTorch >= 1.4.0
  • faiss-gpu
  • absl-py
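
A quick sanity check that the environment satisfies the list above (faiss-gpu and absl-py import as faiss and absl, respectively):

import torch, faiss, absl  # all three must import without error
assert tuple(int(v) for v in torch.__version__.split(".")[:2]) >= (1, 4), torch.__version__
print("CUDA available:", torch.cuda.is_available())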

Unsupervised Training

Data-Relation Extraction

bash run_knn.sh # set --data_dir, --ckpt_name, and --corr_name to the paths on your server
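
Conceptually, this step extracts teacher features for the training set and stores, for every image, the indices of its most similar images; these neighbour lists define the bags and are presumably what --corr_name points to. A rough, hypothetical sketch of such a step with faiss (file and function names are illustrative, not the ones used in run_knn.sh):

import faiss
import numpy as np

def build_bags(features, k=10):
    """features: (N, D) float32 array of L2-normalized teacher embeddings.
    Returns an (N, k) array whose row i holds the indices of the k images
    most similar to image i (its bag), excluding the image itself."""
    index = faiss.IndexFlatIP(features.shape[1])  # inner product == cosine on normalized features
    index.add(features)
    _, nbrs = index.search(features, k + 1)       # +1 because the nearest hit is the query itself
    return nbrs[:, 1:]

# Illustrative usage (paths are placeholders):
# features = np.load("teacher_feats.npy")
# np.save("corr.npy", build_bags(features))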

Distillation

# Distill EfficientNet
bash scripts/unsupervised/SEffB0-TR50W2.sh

# Distill ResNet
# Change --t_arch and --s_arch to run with different teacher and student networks
# Set --pretrain_path, --data_dir, and --corr_npy to the paths on your server
# Note that --corr_npy must be consistent with --corr_name in run_knn.sh; a sketch of how these bags might be consumed follows below
bash scripts/unsupervised/SR18-TR50W2.sh
bash scripts/unsupervised/SR18-TR50.sh
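
During distillation, each anchor image is loaded together with images sampled from its bag of neighbours. A hypothetical sketch of such a wrapper dataset (class and argument names are assumptions, not the repository's actual data pipeline):

import numpy as np
from torch.utils.data import Dataset

class BagDataset(Dataset):
    """Wraps a base image dataset and, for every anchor index, also returns
    images sampled from that anchor's bag of nearest neighbours."""
    def __init__(self, base_dataset, corr_npy, bag_size=4):
        self.base = base_dataset            # any dataset returning (image, label)
        self.bags = np.load(corr_npy)       # (N, k) neighbour indices from the kNN step
        self.bag_size = bag_size

    def __len__(self):
        return len(self.base)

    def __getitem__(self, i):
        anchor, _ = self.base[i]
        picks = np.random.choice(self.bags[i], self.bag_size, replace=False)
        bag = [self.base[int(j)][0] for j in picks]   # images of the sampled bag members
        return anchor, bag                            # a custom collate_fn is needed to stack the bag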

Linear Classification

# Evaluate EfficientNet
bash scripts/lincls/Eff.sh

# Evaluate ResNet
bash scripts/lincls/Res.sh
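
Linear classification freezes the distilled student backbone and trains only a linear classifier on its features. A minimal sketch of that protocol (the backbone constructor, checkpoint path, and hyper-parameters are placeholders; the scripts above define the actual settings):

import torch
import torch.nn as nn
import torchvision

backbone = torchvision.models.resnet18()          # placeholder for the distilled student
backbone.fc = nn.Identity()                       # expose the 512-d penultimate features
# backbone.load_state_dict(torch.load("student_checkpoint.pth"))  # path is a placeholder

for p in backbone.parameters():                   # freeze the backbone
    p.requires_grad = False
backbone.eval()

classifier = nn.Linear(512, 1000)                 # linear head for ImageNet's 1000 classes
optimizer = torch.optim.SGD(classifier.parameters(), lr=30.0, momentum=0.9)  # placeholder LR
criterion = nn.CrossEntropyLoss()

def train_step(images, labels):
    with torch.no_grad():                         # features come from the frozen backbone
        feats = backbone(images)
    loss = criterion(classifier(feats), labels)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()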

Performance

**Linear evaluation accuracy on ImageNet** (see the results figure in the repository)
**Semi-supervised learning on ImageNet with ResNet-18** (see the results figure in the repository)

Citation

If you find this repository or the paper helpful in your research, please consider citing:

@inproceedings{bingo,
    title={Bag of Instances Aggregation Boosts Self-supervised Distillation}, 
    author={Haohang Xu and Jiemin Fang and Xiaopeng Zhang and Lingxi Xie and Xinggang Wang and Wenrui Dai and Hongkai Xiong and Qi Tian},
    booktitle={International Conference on Learning Representations},
    year={2022}
}
