This project tests MimicNorm on the CIFAR and ImageNet datasets with various network structures, including VGGs, ResNets, and efficient networks. MimicNorm achieves similar accuracy with less memory consumption, and replacing BN layers brings further benefits as well.
For experiments on the CIFAR dataset, PyTorch and torchvision are enough to run this project. For ImageNet, we use LMDB (https://github.com/xunge/pytorch_lmdb_imagenet) to accelerate IO reads; appropriate versions of pyarrow, lmdb, and pandas should be installed to use it. If you don't use LMDB, a slight modification in ImgageNet/data_loader.py will work.
Here is an example of creating an environment from scratch with Anaconda:
```shell
# create conda env
conda create --name torch python=3.5
conda activate torch
# install pytorch
conda install pytorch==1.5.0 torchvision==0.6.0 cudatoolkit=10.2 -c pytorch
```
The CIFAR dataset will be automatically downloaded to Cifar/data, and the ImageNet (ILSVRC 2012) dataset path is defined in ImgageNet/data_loader.py.
Our method:
```shell
python main.py --arch {$net}_cbn --weight mean -b 256
```
BN implementation:
```shell
python main.py --arch {$net} -b 256
```
Some key arguments:
- `--warm n`: warm up the learning rate in the first n epochs
- `--resume {path}`: resume training from the file {path}
Note: some network architectures have been implemented but not tested, such as DenseNets for CIFAR. The valid architectures include:
- cifar-100: vgg11, vgg13, vgg16, resnet18, resnet50, squeezenet, shufflenetv2
- imagenet: vgg11, vgg16, resnet18, resnet50, resnet101, shufflenetv2 (with a modified training strategy)
This work builds on many excellent works, including:
- pytorch-cifar100 (cifar implementation)
- ShuffleNetv2 in PyTorch (shufflenetv2 implementation for the ImageNet dataset)