
Official PyTorch and CVXPY implementation of Identifying Critical Neurons in ANN Architectures using Mixed Integer Programming


Identify critical neurons using Mixed Integer Programming

A novel way of computing neuron importance scores at fully connected and convolutional layers, and of using these scores to prune non-critical neurons with a marginal loss in accuracy, without fine-tuning or retraining.

Getting Started

These instructions will get you a copy of the project up and running on your local machine for development and testing purposes.

Prerequisites

Install the requirements

pip3 install -r requirements.txt

MIP Solver

We use the commercial MOSEK solver. To run the experiments, a license file mosek.lic must be placed at /root/mosek on Ubuntu. To use another solver, open the sparsify_model.py script and change solver=cp.MOSEK to any solver from the table below that supports MIP, since the pruning formulation is a mixed integer program (CVXPY is a Python-embedded modeling language for convex optimization built on top of different solvers).

Available Solvers for CVXPY

Solver    LP  QP  SOCP  SDP  EXP  MIP
CBC       X                        X
GLPK      X
GLPK_MI   X                        X
OSQP      X   X
CPLEX     X   X   X                X
ECOS      X   X   X          X
ECOS_BB   X   X   X          X     X
GUROBI    X   X   X                X
MOSEK     X   X   X    X     X     X
CVXOPT    X   X   X    X
SCS       X   X   X    X     X
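
As a sketch of how a CVXPY solver is selected (the variable names below are illustrative, not taken from sparsify_model.py), any solver with an X in the MIP column can be passed to problem.solve():

import cvxpy as cp

# Illustrative toy MIP, not the pruning formulation from the paper:
# choose at most two of three binary decisions to maximize their sum.
x = cp.Variable(3, boolean=True)   # binary variables make this a MIP
problem = cp.Problem(cp.Maximize(cp.sum(x)), [cp.sum(x) <= 2])

# Swap cp.MOSEK for cp.GUROBI, cp.CPLEX, or cp.GLPK_MI -- any MIP-capable solver.
problem.solve(solver=cp.MOSEK)
print(problem.status, x.value)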

Running the experiments

All the experiments reported in the paper are in experiments_notebook.ipynb.

Training Models

$ python3 train_model.py 

Arguments

  • -sd : parent directory to store logs and models
  • -e : number of training epochs
  • -dl : dataset index used for training, in the following order: ['MNIST', 'FashionMNIST', 'KMNIST', 'Caltech256', 'CIFAR10']
  • -r : number of training resets, to train multiple models with different initializations
  • -m : model index, in the following order: ['FullyConnectedBaselineModel', 'FullyConnected2Model', 'Lecun Model 98', 'Dense Fully Connected', 'Lenet', 'vgg19']
  • -op : optimizer index, in the following order: ['Adam', 'SGD', 'RMSPROP']
  • -l : learning rate index, in the following order: ['1e-1', '1e-2', '1e-3', '1e-5']
  • -bs : batch size used during training
  • -dgl : a flag to enable decoupled greedy learning during training
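
For example, a hypothetical invocation (the argument values here are illustrative, not repository defaults) that trains the baseline fully connected model on MNIST with Adam and learning rate 1e-3:

$ python3 train_model.py -sd experiments -e 10 -dl 0 -m 0 -op 0 -l 2 -bs 64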

Sparsifying Models

$ python3 run_sparsify.py

Arguments

Uses the same arguments as training (to select the right experiment directory), plus the following extra arguments; an example invocation follows the list:

  • -tt : pruning threshold (neurons with an importance score below this threshold are pruned)
  • -sw : λ used to control the loss in accuracy (a higher weight prunes less, preserving predictive capacity)
  • -ft : a flag to enable fine-tuning after pruning
  • -te : number of fine-tuning epochs
  • -n : number of data points fed as input to the MIP
  • -mth : a flag that, when enabled, uses the mean of each layer's importance scores as the pruning threshold
  • -f : a flag that forces re-computing the neuron importance scores instead of using cached results from previous runs
  • -rl : a flag to relax the ReLU constraints
  • -dgl : a flag to use auxiliary networks trained per layer to compute neuron importance scores for large models
  • -seq : a flag to run the MIP independently on each class and then average the results
  • -bll : a flag to run the MIP on each layer independently, starting from the last layer
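
As an illustrative sketch (the threshold, λ, and batch-size values are assumptions; the directory-selection arguments must match those used in training):

$ python3 run_sparsify.py -sd experiments -dl 0 -m 0 -op 0 -l 2 -tt 0.05 -sw 5 -n 25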

Sparsifying every n iterations/epochs

$ python3 train_sparsify.py

Arguments

Uses the same arguments as training and sparsification (to select the right experiment directory), plus the following extra arguments; an example invocation follows the list:

  • -trst : a flag to run sparsification every n iterations; if disabled, it runs every n epochs
  • -ent : the interval n (in epochs or iterations) between sparsification steps
  • -incr : a flag to enable incremental training of the computed sub-network
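
For instance, a hypothetical run (values are illustrative) that sparsifies every 2 epochs with incremental training of the computed sub-network:

$ python3 train_sparsify.py -sd experiments -dl 0 -m 0 -op 0 -l 2 -ent 2 -incr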

Robustness to Different Batches Experiments

$ python3 verify_selected_data.py

Arguments

Uses the same arguments as model sparsification. Plots how the pruning percentage and the accuracy change when the batch of images fed to the MIP changes.

Different Lambdas Experiments

$ python3 plot_different_lambdas.py

Arguments

Uses the same arguments as model sparsification. Plots how the pruning percentage and the accuracy change as the value of λ (-sw) changes.

Robustness of Averaged Runs over Different Classes

$ python3 batch_data_experiments.py

Arguments

Uses the same arguments as run_sparsify.py, plus the following extra arguments; an example invocation follows the list:

  • -nex : an integer denoting the number of experiments conducted
  • -bbm : a flag that, when enabled, samples a balanced set of images per class; otherwise a random number of images per class is used
  • -ppexp : a flag that, when enabled, runs the MIP independently per class, with the neuron importance score taken as the average over the runs
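
A hypothetical invocation (values are illustrative) averaging 5 per-class MIP runs over balanced batches:

$ python3 batch_data_experiments.py -sd experiments -dl 0 -m 0 -op 0 -l 2 -nex 5 -bbm -ppexp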

References

@article{elaraby2020identifying,
  title={Identifying Critical Neurons in ANN Architectures using Mixed Integer Programming},
  author={ElAraby, Mostafa and Wolf, Guy and Carvalho, Margarida},
  journal={arXiv preprint arXiv:2002.07259},
  year={2020}
}
@article{mosek2010mosek,
  title={The MOSEK optimization software},
  author={MOSEK ApS},
  journal={Online at http://www.mosek.com},
  volume={54},
  number={2-1},
  pages={5},
  year={2010}
}
@article{cvxpy,
  author  = {Steven Diamond and Stephen Boyd},
  title   = {{CVXPY}: A {P}ython-Embedded Modeling Language for Convex Optimization},
  journal = {Journal of Machine Learning Research},
  year    = {2016},
  volume  = {17},
  number  = {83},
  pages   = {1--5},
}
@article{cvxpy_rewriting,
  author  = {Akshay Agrawal and Robin Verschueren and Steven Diamond and Stephen Boyd},
  title   = {A Rewriting System for Convex Optimization Problems},
  journal = {Journal of Control and Decision},
  year    = {2018},
  volume  = {5},
  number  = {1},
  pages   = {42--60},
}

License

MIT license
