Yihui He 何宜晖 edited this page Jan 9, 2018 · 2 revisions

Our 3C approach applies 3 methods sequentially. Given conv weights W:

  • Spatial Decomposition produces W_v and W_h.
  • Channel Decomposition decomposes W_h and outputs W_h' and W_p.
  • Channel Pruning prunes W_p.

Spatial Decomposition

Initially, we adopted Filter Reconstruction for Spatial Decomposition, which is data independent.

We found that overall model performance improves when W_h is instead fitted by minimizing the error on the output feature map after ReLU, which makes the step data dependent. The method follows the nonlinear case (3.2) in Channel Decomposition. The corresponding function in our code is nonlinear_fc.

It alternates between two steps.

First, minimize the error on the feature map before ReLU with linear least squares:
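One way to write this step in matrix form is sketched here. The symbols are assumptions for illustration, not notation taken from the code: X stacks the sampled layer inputs row-wise, Z is the target feature map before ReLU (initialized to the original response Y), and W, b are the weights and bias being fitted.

```latex
\min_{W,\, b} \; \bigl\lVert Z - \left( X W + \mathbf{1} b^{\top} \right) \bigr\rVert_F^2
```

With Z held fixed this is an ordinary least squares problem, so it has a closed-form solution.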

Second, minimize the error on the feature map after ReLU:
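The alternation can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the nonlinear_fc implementation: the names solve_z and alternating_fit, the penalty weight lam, and the bias handling are hypothetical. It assumes the relaxed objective of the nonlinear case, where an auxiliary variable Z is tied to the linear prediction P = XW + b, and each entry of Z minimizes (relu(Y) - relu(Z))^2 + lam * (Z - P)^2.

```python
import numpy as np

def relu(a):
    return np.maximum(a, 0.0)

def solve_z(Y, P, lam):
    """Elementwise minimizer of (relu(Y) - relu(Z))**2 + lam * (Z - P)**2."""
    rY = relu(Y)
    # Candidate on the branch Z <= 0, where relu(Z) = 0.
    z_neg = np.minimum(P, 0.0)
    cost_neg = rY ** 2 + lam * (z_neg - P) ** 2
    # Candidate on the branch Z >= 0, where relu(Z) = Z.
    z_pos = np.maximum((lam * P + rY) / (lam + 1.0), 0.0)
    cost_pos = (rY - z_pos) ** 2 + lam * (z_pos - P) ** 2
    # Keep whichever candidate has the lower cost.
    return np.where(cost_pos <= cost_neg, z_pos, z_neg)

def alternating_fit(X, Y, lam=0.01, n_iter=10):
    """X: (n_samples, c_in) inputs; Y: (n_samples, c_out) original pre-ReLU outputs."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])  # extra column of ones for the bias
    Z = Y.copy()                                   # start from the original response
    for _ in range(n_iter):
        # Step 1: linear least squares against Z (error before ReLU).
        Wb, *_ = np.linalg.lstsq(Xb, Z, rcond=None)
        # Step 2: update Z to reduce the error after ReLU.
        Z = solve_z(Y, Xb @ Wb, lam)
    return Wb[:-1], Wb[-1]                         # weights, bias
```

In this sketch a small lam lets Z drift away from the linear prediction wherever ReLU would zero the response anyway, so the least squares step spends its capacity on the entries that survive the nonlinearity.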