This repository has been archived by the owner on Sep 18, 2024. It is now read-only.

Auto pruners #2490

Merged
merged 107 commits into from
Jun 30, 2020
Changes from 104 commits
Commits
107 commits
80165e9
init sapruner
suiguoxin Apr 21, 2020
3fcebef
seperate sapruners from other one-shot pruners
suiguoxin Apr 22, 2020
b4be2d0
update
suiguoxin Apr 22, 2020
805c32c
fix model params issue
suiguoxin Apr 23, 2020
4c33432
make the process runnable
suiguoxin Apr 23, 2020
489f7b6
show evaluation result in example
suiguoxin Apr 23, 2020
00dbddf
sort the sparsities and scale it
suiguoxin Apr 23, 2020
e1f9654
fix rescale issue
suiguoxin Apr 23, 2020
4b5ea0d
fix scale issue; add pruning history
suiguoxin Apr 24, 2020
6120b70
record the actual total sparsity
suiguoxin Apr 24, 2020
a6114f7
fix sparsity 0/1 problem
suiguoxin Apr 26, 2020
2e928ae
revert useless modif
suiguoxin Apr 26, 2020
546ca73
revert useless modif
suiguoxin Apr 26, 2020
1dc4713
fix 0 pruning weights problem
suiguoxin Apr 27, 2020
d1a5646
save pruning history in csv file
suiguoxin Apr 28, 2020
75a53da
fix typo
suiguoxin Apr 28, 2020
e8900f8
remove check perm in Makefile
suiguoxin Apr 28, 2020
9c5ba41
use os path
suiguoxin Apr 29, 2020
9a60501
save config list in json format
suiguoxin Apr 29, 2020
951c60f
update analyze py; update docker
suiguoxin Apr 30, 2020
c784790
update
suiguoxin Apr 30, 2020
836e74b
update analyze
suiguoxin May 4, 2020
70aca26
update log info in compressor
suiguoxin May 4, 2020
efa0637
init NetAdapt Pruner
suiguoxin May 4, 2020
8695564
refine examples
suiguoxin May 6, 2020
8fdad96
Merge remote-tracking branch 'msft/master' into sapruner
suiguoxin May 6, 2020
db3074e
update
suiguoxin May 7, 2020
78ee01a
fine tune
suiguoxin May 7, 2020
3e40c4a
update
suiguoxin May 7, 2020
2560050
fix quote issue
suiguoxin May 7, 2020
d6e4101
add code for imagenet integrity
suiguoxin May 8, 2020
65f8e2b
update
suiguoxin May 8, 2020
d27ac7d
use datasets.ImageNet
suiguoxin May 8, 2020
f47260f
update
suiguoxin May 8, 2020
358921c
update
suiguoxin May 9, 2020
f50e947
add channel pruning in SAPruner; refine example
suiguoxin May 11, 2020
ea07c00
update net_adapt pruner; add dependency constraint in sapruner(beta)
suiguoxin May 11, 2020
7d73050
update
suiguoxin May 12, 2020
220e4a3
update
suiguoxin May 12, 2020
e692eb1
update
suiguoxin May 12, 2020
fc389d0
fix zero division problem
suiguoxin May 12, 2020
a69da67
fix typo
suiguoxin May 12, 2020
e0ab4bc
update
suiguoxin May 12, 2020
7724104
fix naive issue of NetAdaptPruner
suiguoxin May 12, 2020
f9f4a61
fix data issue for no-dependency modules
suiguoxin May 13, 2020
93698ac
add cifar10 vgg16 examplel
suiguoxin May 14, 2020
7d7f36d
update
suiguoxin May 14, 2020
9fc1029
update
suiguoxin May 14, 2020
9d506b1
fix folder creation issue; change lr for vgg exp
suiguoxin May 15, 2020
6ca5b27
update
suiguoxin May 15, 2020
fe9c1bf
add save model arg
suiguoxin May 15, 2020
1ec68a4
fix model copy issue
suiguoxin May 15, 2020
c99e4a3
init related weights calc
suiguoxin May 15, 2020
b6ce773
update analyze file
suiguoxin May 15, 2020
559c631
NetAdaptPruner: use fine-tuned weights after each iteration; fix modu…
suiguoxin May 18, 2020
2bd5a80
Merge remote-tracking branch 'msft/master' into sapruner
suiguoxin May 18, 2020
5ebea45
consider channel/filter cross pruning
suiguoxin May 18, 2020
f74324c
NetAdapt: consider previous op when calc total sparsity
suiguoxin May 18, 2020
27ad5f7
update
suiguoxin May 18, 2020
7f607ce
use customized vgg
suiguoxin May 19, 2020
6137373
add performances comparison plt
suiguoxin May 19, 2020
b9222c7
fix netadaptPruner mask copy issue
suiguoxin May 19, 2020
71e3651
add resnet18 example
suiguoxin May 19, 2020
045f114
fix example issue
suiguoxin May 19, 2020
e7b0410
Merge remote-tracking branch 'msft/master' into sapruner
suiguoxin May 19, 2020
98c5cb4
update experiment data
suiguoxin May 20, 2020
c220f84
fix bool arg parsing issue
suiguoxin May 20, 2020
5a1728e
update
suiguoxin May 20, 2020
b1a4058
init ADMMPruner
suiguoxin May 21, 2020
b36a170
ADMMPruner: update
suiguoxin May 21, 2020
fd6f3a6
ADMMPruner: finish v1.0
suiguoxin May 22, 2020
0b8840f
ADMMPruner: refine
suiguoxin May 22, 2020
7f1c319
update
suiguoxin May 22, 2020
87b090c
AutoCompress init
suiguoxin May 25, 2020
6c82d6c
AutoCompress: update
suiguoxin May 25, 2020
efd8f10
AutoCompressPruner: fix issues:
suiguoxin May 26, 2020
85a4483
add test for auto pruners
suiguoxin May 26, 2020
180a709
add doc for auto pruners
suiguoxin May 26, 2020
e87122c
fix link in md
suiguoxin May 26, 2020
955a6ee
remove irrelevant files
suiguoxin May 26, 2020
51e004e
Clean code
suiguoxin May 26, 2020
4eeb65e
code clean
suiguoxin May 26, 2020
f8ebc19
fix pylint issue
suiguoxin May 26, 2020
e241708
fix pylint issue
suiguoxin May 26, 2020
0edddeb
rename admm & autoCompress param
suiguoxin May 26, 2020
c93e0eb
use abs link in doc
suiguoxin May 26, 2020
e88e4d7
merge from master % resolve conflict
suiguoxin May 28, 2020
67c41d5
reorder import to fix import issue: autocompress relies on speedup
suiguoxin May 28, 2020
c057307
refine doc
suiguoxin Jun 4, 2020
7f3de4e
NetAdaptPruner: decay pruning step
suiguoxin Jun 11, 2020
55e705e
take changes from testing branch
suiguoxin Jun 29, 2020
e1775b3
merge from master
suiguoxin Jun 29, 2020
840213d
refine
suiguoxin Jun 29, 2020
d4b80bc
fix typo
suiguoxin Jun 29, 2020
c9fffe0
ADMMPruenr: check base_algo together with config schema
suiguoxin Jun 29, 2020
87f3232
fix broken link
suiguoxin Jun 29, 2020
16b1c95
doc refine
suiguoxin Jun 29, 2020
6bff198
ADMM:refine
suiguoxin Jun 29, 2020
d86fad4
refine doc
suiguoxin Jun 30, 2020
c29f758
resolve conflict
suiguoxin Jun 30, 2020
32d14d9
refine doc
suiguoxin Jun 30, 2020
5950bec
refince doc
ultmaster Jun 29, 2020
8a11b45
resolve conflict
suiguoxin Jun 30, 2020
be782bc
refine doc
suiguoxin Jun 30, 2020
d449e6a
refine doc
suiguoxin Jun 30, 2020
cb6376b
refine doc
suiguoxin Jun 30, 2020
cee3fdd
refine doc
suiguoxin Jun 30, 2020
4 changes: 4 additions & 0 deletions README.md
@@ -144,6 +144,10 @@ Within the following table, we summarized the current NNI capabilities, we are g
<li><a href="docs/en_US/Compressor/Pruner.md#agp-pruner">AGP Pruner</a></li>
<li><a href="docs/en_US/Compressor/Pruner.md#slim-pruner">Slim Pruner</a></li>
<li><a href="docs/en_US/Compressor/Pruner.md#fpgm-pruner">FPGM Pruner</a></li>
<li><a href="docs/en_US/Compressor/Pruner.md#netadapt-pruner">NetAdapt Pruner</a></li>
<li><a href="docs/en_US/Compressor/Pruner.md#simulatedannealing-pruner">SimulatedAnnealing Pruner</a></li>
<li><a href="docs/en_US/Compressor/Pruner.md#admm-pruner">ADMM Pruner</a></li>
<li><a href="docs/en_US/Compressor/Pruner.md#autocompress-pruner">AutoCompress Pruner</a></li>
</ul>
<b>Quantization</b>
<ul>
183 changes: 179 additions & 4 deletions docs/en_US/Compressor/Pruner.md
@@ -17,8 +17,12 @@ We provide several pruning algorithms that support fine-grained weight pruning a

**Pruning Schedule**
* [AGP Pruner](#agp-pruner)
* [NetAdapt Pruner](#netadapt-pruner)
* [SimulatedAnnealing Pruner](#simulatedannealing-pruner)
* [AutoCompress Pruner](#autocompress-pruner)

**Others**
* [ADMM Pruner](#admm-pruner)
* [Lottery Ticket Hypothesis](#lottery-ticket-hypothesis)

## Level Pruner
@@ -349,6 +353,181 @@ You can view example for more information

***

## NetAdapt Pruner
Contributor
@chicm-ms chicm-ms Jun 30, 2020

The order of each section should be consistent with the content directory/list at the beginning.

Member Author


fixed

NetAdapt allows a user to automatically simplify a pretrained network to meet a resource budget.
Given the overall sparsity, NetAdapt automatically generates the sparsity distribution among the different layers by iterative pruning.

For more details, please refer to [NetAdapt: Platform-Aware Neural Network Adaptation for Mobile Applications](https://arxiv.org/abs/1804.03230).

![](../../img/algo_NetAdapt.png)

#### Usage

PyTorch code

```python
from nni.compression.torch import NetAdaptPruner
config_list = [{
'sparsity': 0.5,
'op_types': ['Conv2d']
}]
pruner = NetAdaptPruner(model, config_list, short_term_fine_tuner=short_term_fine_tuner, evaluator=evaluator, base_algo='l1', experiment_data_dir='./')
pruner.compress()
```

You can view [example](https://github.com/microsoft/nni/blob/master/examples/model_compress/auto_pruners_torch.py) for more information.

#### User configuration for NetAdapt Pruner

- **sparsity:** The target overall sparsity.
- **op_types:** The operation type to prune. If `base_algo` is `l1` or `l2`, then only `Conv2d` is supported as `op_types`.
- **short_term_fine_tuner:** Function to short-term fine-tune the masked model.
This function should take `model` as its only parameter and fine-tune the model for a short term after each pruning iteration (see the sketch below).
- **evaluator:** Function to evaluate the masked model. This function should take `model` as its only parameter and return a scalar value.
- **optimize_mode:** Optimize mode, `maximize` or `minimize`, by default `maximize`.
- **base_algo:** Base pruning algorithm. `level`, `l1` or `l2`, by default `l1`.
Given the sparsity distribution among the ops, the assigned `base_algo` is used to decide which filters/channels/weights to prune.
- **sparsity_per_iteration:** The sparsity to prune in each iteration. NetAdapt Pruner prunes the model by the same amount in each iteration to meet the resource budget progressively.
- **experiment_data_dir:** Path to save experiment data, including the config_list generated for the base pruning algorithm and the performance of the pruned model.
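
The following is a minimal sketch of what the `short_term_fine_tuner` and `evaluator` callables might look like; `train_loader`, `test_loader` and `device` are illustrative assumptions, not part of the NNI API.

```python
import torch
import torch.nn.functional as F

# Assumption: train_loader / test_loader are ordinary PyTorch DataLoaders and
# `device` is the device the model lives on.

def short_term_fine_tuner(model):
    # briefly fine-tune the masked model after a pruning iteration
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)
    model.train()
    for data, target in train_loader:
        data, target = data.to(device), target.to(device)
        optimizer.zero_grad()
        loss = F.cross_entropy(model(data), target)
        loss.backward()
        optimizer.step()

def evaluator(model):
    # return a single scalar, e.g. top-1 accuracy on the test set
    model.eval()
    correct = 0
    with torch.no_grad():
        for data, target in test_loader:
            data, target = data.to(device), target.to(device)
            correct += (model(data).argmax(dim=1) == target).sum().item()
    return correct / len(test_loader.dataset)
```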


## SimulatedAnnealing Pruner

We implement a guided heuristic search method, the Simulated Annealing (SA) algorithm, with an enhancement for guided search based on prior experience.
The enhanced SA technique is based on the observation that a DNN layer with a larger number of weights can often tolerate a higher degree of model compression with less impact on overall accuracy.

- Randomly initialize a pruning rate distribution (sparsities).
- While current_temperature > stop_temperature (see the sketch below):
1. Generate a perturbation to the current distribution
2. Perform a fast evaluation on the perturbed distribution
3. Accept the perturbation according to the performance and probability; if not accepted, return to step 1
4. Cool down: current_temperature <- current_temperature * cool_down_rate

For more details, please refer to [AutoCompress: An Automatic DNN Structured Pruning Framework for Ultra-High Compression Rates](https://arxiv.org/abs/1907.03141).
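
As an illustration only, the loop above can be sketched in stand-alone code as follows; this is not the Pruner's internal implementation, and `evaluate_sparsities` and `perturb` are placeholders for the fast-evaluation and perturbation steps.

```python
import math
import random

def perturb(sparsities, magnitude):
    # add bounded random noise to each layer's sparsity (placeholder)
    return [min(max(s + random.uniform(-magnitude, magnitude), 0.0), 1.0) for s in sparsities]

def evaluate_sparsities(sparsities):
    # placeholder: prune with the given sparsities and run the evaluator
    return random.random()

def simulated_annealing(init_sparsities, start_temperature=100, stop_temperature=1,
                        cool_down_rate=0.9, perturbation_magnitude=0.35):
    current, current_perf = init_sparsities, evaluate_sparsities(init_sparsities)
    t = start_temperature
    while t > stop_temperature:
        # perturbation magnitude decreases together with the temperature
        candidate = perturb(current, perturbation_magnitude * t / start_temperature)
        perf = evaluate_sparsities(candidate)
        delta = perf - current_perf
        # accept improvements; accept worse candidates with probability exp(delta / t)
        if delta > 0 or random.random() < math.exp(delta / t):
            current, current_perf = candidate, perf
        t *= cool_down_rate
    return current

best_sparsities = simulated_annealing([0.5, 0.5, 0.5])
```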

#### Usage

PyTorch code

```python
from nni.compression.torch import SimulatedAnnealingPruner
config_list = [{
'sparsity': 0.5,
'op_types': ['Conv2d']
}]
pruner = SimulatedAnnealingPruner(model, config_list, evaluator=evaluator, base_algo='l1', cool_down_rate=0.9, experiment_data_dir='./')
pruner.compress()
```

You can view [example](https://github.com/microsoft/nni/blob/master/examples/model_compress/auto_pruners_torch.py) for more information.

#### User configuration for SimulatedAnnealing Pruner

- **sparsity:** The target overall sparsity.
- **op_types:** The operation type to prune. If `base_algo` is `l1` or `l2`, then only `Conv2d` is supported as `op_types`.
- **evaluator:** Function to evaluate the masked model. This function should take `model` as its only parameter and return a scalar value.
- **optimize_mode:** Optimize mode, `maximize` or `minimize`, by default `maximize`.
- **base_algo:** Base pruning algorithm. `level`, `l1` or `l2`, by default `l1`.
Given the sparsity distribution among the ops, the assigned `base_algo` is used to decide which filters/channels/weights to prune.
- **start_temperature:** Start temperature of the simulated annealing process.
- **stop_temperature:** Stop temperature of the simulated annealing process.
- **cool_down_rate:** Cool-down rate of the temperature.
- **perturbation_magnitude:** Initial perturbation magnitude to the sparsities. The magnitude decreases with the current temperature.
- **experiment_data_dir:** Path to save experiment data, including the config_list generated for the base pruning algorithm, the performance of the pruned model and the pruning history.


## AutoCompress Pruner
For each round, AutoCompressPruner prunes the model by the same sparsity to achieve the overall sparsity (see the sketch below):
1. Generate a sparsity distribution using SimulatedAnnealingPruner
2. Perform ADMM-based structured pruning to generate the pruning result for the next round.
Here we use `speedup` to perform real pruning.

For more details, please refer to [AutoCompress: An Automatic DNN Structured Pruning Framework for Ultra-High Compression Rates](https://arxiv.org/abs/1907.03141).
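
For illustration, the per-round flow can be sketched as below; this is not the Pruner's real implementation, and the three helpers are placeholders standing in for SimulatedAnnealingPruner, ADMMPruner and the speedup step.

```python
# Placeholder helpers: search a sparsity distribution, prune with ADMM, really remove weights.
def simulated_annealing_search(model, sparsity):
    return [{'sparsity': sparsity, 'op_types': ['Conv2d']}]

def admm_prune(model, config_list):
    return {}  # masks per layer

def speedup(model, masks):
    return model  # model with the pruned weights actually removed

def auto_compress(model, sparsity_per_round, num_iterations=3):
    # each round prunes by the same sparsity, so the overall target is approached progressively
    for _ in range(num_iterations):
        config_list = simulated_annealing_search(model, sparsity_per_round)  # step 1
        masks = admm_prune(model, config_list)                               # step 2
        model = speedup(model, masks)                                        # real pruning
    return model
```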

#### Usage

PyTorch code

```python
from nni.compression.torch import AutoCompressPruner
config_list = [{
'sparsity': 0.5,
'op_types': ['Conv2d']
}]
pruner = AutoCompressPruner(
model, config_list, trainer=trainer, evaluator=evaluator,
dummy_input=dummy_input, num_iterations=3, optimize_mode='maximize', base_algo='l1',
cool_down_rate=0.9, admm_num_iterations=30, admm_training_epochs=5, experiment_data_dir='./')
pruner.compress()
```

You can view [example](https://github.com/microsoft/nni/blob/master/examples/model_compress/auto_pruners_torch.py) for more information.

#### User configuration for AutoCompress Pruner

- **sparsity:** The target overall sparsity.
- **op_types:** The operation type to prune. If `base_algo` is `l1` or `l2`, then only `Conv2d` is supported as `op_types`.
- **trainer:** Function used for the first subproblem.
Users should write this function as a normal function to train the PyTorch model and include `model, optimizer, criterion, epoch, callback` as function arguments.
Here `callback` acts as an L2 regularizer as presented in formula (7) of the original paper.
The logic of `callback` is implemented inside the Pruner; users just need to insert `callback()` between `loss.backward()` and `optimizer.step()` (a sketch of such a trainer is given in the ADMM Pruner section below).
- **evaluator:** Function to evaluate the masked model. This function should take `model` as its only parameter and return a scalar value.
- **dummy_input:** The dummy input for model speedup; users should put it on the right device before passing it in.
- **num_iterations:** The number of overall iterations.
- **optimize_mode:** Optimize mode, `maximize` or `minimize`, by default `maximize`.
- **base_algo:** Base pruning algorithm. `level`, `l1` or `l2`, by default `l1`.
Given the sparsity distribution among the ops, the assigned `base_algo` is used to decide which filters/channels/weights to prune.
- **start_temperature:** Start temperature of the simulated annealing process.
- **stop_temperature:** Stop temperature of the simulated annealing process.
- **cool_down_rate:** Cool-down rate of the temperature.
- **perturbation_magnitude:** Initial perturbation magnitude to the sparsities. The magnitude decreases with the current temperature.
- **admm_num_iterations:** Number of iterations of the ADMM Pruner.
- **admm_training_epochs:** Training epochs of the first optimization subproblem of ADMMPruner.
- **experiment_data_dir:** Path to store temporary experiment data.


## ADMM Pruner
Alternating Direction Method of Multipliers (ADMM) is a mathematical optimization technique
that decomposes the original nonconvex problem into two subproblems that can be solved iteratively. In the weight pruning problem, these two subproblems are solved via 1) a gradient descent algorithm and 2) Euclidean projection, respectively. This solution framework applies both to non-structured and to different variations of structured pruning schemes.
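
As a sketch in standard ADMM notation (ours, not copied verbatim from the paper): with f the training loss, S_i the sparsity constraint set of layer i, and ρ the penalty parameter, the problem and its iterative updates can be written as

```latex
% Constrained form: the indicator g_i keeps Z_i inside the sparsity constraint set S_i.
\min_{W,\,Z}\; f(W) + \sum_i g_i(Z_i) \quad \text{s.t. } W_i = Z_i,
\qquad g_i(Z_i) = \begin{cases} 0 & Z_i \in S_i \\ +\infty & \text{otherwise} \end{cases}

% Iterative updates: a gradient-descent (training) step on W with an extra quadratic penalty,
% a Euclidean projection onto S_i for Z, and a dual-variable update.
W^{k+1} = \arg\min_{W}\; f(W) + \tfrac{\rho}{2} \sum_i \lVert W_i - Z_i^{k} + U_i^{k} \rVert_F^2
Z_i^{k+1} = \Pi_{S_i}\bigl(W_i^{k+1} + U_i^{k}\bigr)
U_i^{k+1} = U_i^{k} + W_i^{k+1} - Z_i^{k+1}
```

The quadratic penalty in the W-update is the L2-like regularization that the `trainer`'s `callback()` accounts for during training.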

For more details, please refer to [A Systematic DNN Weight Pruning Framework using Alternating Direction Method of Multipliers](https://arxiv.org/abs/1804.03294).

#### Usage

PyTorch code

```python
from nni.compression.torch import ADMMPruner
config_list = [{
'sparsity': 0.8,
'op_types': ['Conv2d'],
'op_names': ['conv1']
}, {
'sparsity': 0.92,
'op_types': ['Conv2d'],
'op_names': ['conv2']
}]
pruner = ADMMPruner(model, config_list, trainer=trainer, num_iterations=30, epochs=5)
pruner.compress()
```

You can view [example](https://github.com/microsoft/nni/blob/master/examples/model_compress/auto_pruners_torch.py) for more information.

#### User configuration for ADMM Pruner

- **sparsity:** The sparsity to which the specified operations should be compressed.
- **op_types:** The operation type to prune. If `base_algo` is `l1` or `l2`, then only `Conv2d` is supported as `op_types`.
- **trainer:** Function used for the first subproblem.
Users should write this function as a normal function to train the PyTorch model and include `model, optimizer, criterion, epoch, callback` as function arguments.
Here `callback` acts as an L2 regularizer as presented in formula (7) of the original paper.
The logic of `callback` is implemented inside the Pruner; users just need to insert `callback()` between `loss.backward()` and `optimizer.step()` (see the sketch below).
- **num_iterations:** Total number of iterations.
- **training_epochs:** Training epochs of the first subproblem.
- **row:** Penalty parameter for ADMM training.
- **base_algo:** Base pruning algorithm. `level`, `l1` or `l2`, by default `l1`.
Given the sparsity distribution among the ops, the assigned `base_algo` is used to decide which filters/channels/weights to prune.
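
A minimal sketch of such a `trainer` is shown below; `train_loader` and `device` are illustrative assumptions, not part of the NNI API.

```python
# Assumption: train_loader is an ordinary PyTorch DataLoader and `device` is the model's device.
def trainer(model, optimizer, criterion, epoch, callback):
    model.train()
    for data, target in train_loader:
        data, target = data.to(device), target.to(device)
        optimizer.zero_grad()
        loss = criterion(model(data), target)
        loss.backward()
        callback()        # inject the ADMM regularization term, as required by the Pruner
        optimizer.step()
```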


## Lottery Ticket Hypothesis
[The Lottery Ticket Hypothesis: Finding Sparse, Trainable Neural Networks](https://arxiv.org/abs/1803.03635), by Jonathan Frankle and Michael Carbin, provides comprehensive measurement and analysis, and articulates the *lottery ticket hypothesis*: dense, randomly-initialized, feed-forward networks contain subnetworks (*winning tickets*) that -- when trained in isolation -- reach test accuracy comparable to the original network in a similar number of iterations.

@@ -396,7 +575,3 @@ We try to reproduce the experiment result of the fully connected network on MNIS
![](../../img/lottery_ticket_mnist_fc.png)

The above figure shows the result of the fully connected network. `round0-sparsity-0.0` is the performance without pruning. Consistent with the paper, pruning to around 80% sparsity also obtains performance similar to no pruning and converges a little faster. If we prune too much, e.g., more than 94%, the accuracy becomes lower and convergence becomes a little slower. Slightly different from the paper, the trend of the data in the paper is clearer than in our reproduction.




Binary file added docs/img/algo_NetAdapt.png