The former reduces the network size globally by searching for an optimal cell structure, while the latter compresses the network locally by removing unimportant connections with weight-ranking-based pruning. These two methods separately and sequentially lighten the CNN at different scales, ...
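To make the local compression step concrete, the following is a minimal sketch of ranking-based pruning on a flat list of weights. It assumes that the ranking criterion is the absolute value of each weight, which the text above does not state; the function name `prune_by_magnitude` and the `sparsity` parameter are hypothetical and chosen only for illustration.

```python
def prune_by_magnitude(weights, sparsity):
    """Zero out the fraction `sparsity` of weights with the smallest |w|.

    Assumption: "importance" is measured by absolute weight value.
    """
    if not 0.0 <= sparsity <= 1.0:
        raise ValueError("sparsity must be in [0, 1]")
    n_prune = int(len(weights) * sparsity)
    # Rank indices by absolute weight, smallest (least important) first.
    ranked = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned_idx = set(ranked[:n_prune])
    # Removed connections are represented by a zero weight.
    return [0.0 if i in pruned_idx else w for i, w in enumerate(weights)]


weights = [0.8, -0.05, 0.3, 0.01, -0.6, 0.02]
pruned = prune_by_magnitude(weights, 0.5)
print(pruned)
```

In a real network the same ranking would typically be applied per layer or across all layers, and the pruned model would be fine-tuned afterwards; neither choice is specified in the text above.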
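(1) $\|\cdot\|_F$ is the Frobenius norm. $X_c$ is the $N \times k_h \times k_w$ matrix sliced from the $c$-th channel of the input volumes $X$, $c = 1, 2, \ldots, n_i$. $W_c$ is the $n_i \times k_h \times k_w$ filter weights sliced from the $c$-th channel of $W$. $\beta$ is the coefficient vector of length $n_i$ for channel selection, and $\beta_c$ (the $c$-th entry of $\beta$) is a scalar mask to the $c$-th chan...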
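The role of the mask can be illustrated with a small sketch that measures how well a masked sum of per-channel products reproduces the unmasked output. This is only one plausible reading of the definitions above, not the equation itself: it assumes the output is the sum over channels of the mask entry times $X_c W_c^\top$, that the $k_h \times k_w$ axes are flattened into a single axis, and that each $W_c$ has one row per output filter (called `n_out` here). All function names are hypothetical.

```python
import math


def matmul_t(a, b):
    """Return a @ b^T for nested lists a (N x k) and b (n_out x k)."""
    return [[sum(x * w for x, w in zip(row_a, row_b)) for row_b in b]
            for row_a in a]


def masked_output(xs, ws, beta):
    """Sum over channels c of beta[c] * X_c W_c^T (assumed form)."""
    n_rows, n_out = len(xs[0]), len(ws[0])
    out = [[0.0] * n_out for _ in range(n_rows)]
    for x_c, w_c, b_c in zip(xs, ws, beta):
        if b_c == 0:
            continue  # a zero mask entry drops the whole channel
        prod = matmul_t(x_c, w_c)
        for r in range(n_rows):
            for j in range(n_out):
                out[r][j] += b_c * prod[r][j]
    return out


def frobenius_error(y, y_hat):
    """Frobenius norm of the difference between two matrices."""
    return math.sqrt(sum((a - b) ** 2
                         for ra, rb in zip(y, y_hat)
                         for a, b in zip(ra, rb)))


# Two input channels, N = 2 samples, flattened kernel size 2, n_out = 1.
xs = [[[1.0, 0.0], [0.0, 1.0]], [[0.1, 0.0], [0.0, 0.1]]]
ws = [[[2.0, 3.0]], [[1.0, 1.0]]]
y = masked_output(xs, ws, [1, 1])  # reference output with all channels kept

err_drop_second = frobenius_error(y, masked_output(xs, ws, [1, 0]))
err_drop_first = frobenius_error(y, masked_output(xs, ws, [0, 1]))
print(err_drop_second, err_drop_first)
```

In this toy setup, dropping the second channel changes the output far less than dropping the first, which is the kind of comparison a channel-selection mask is meant to capture.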
The input data is converted to a high-dimensional bit-sliced format. In the post-training stage, we analyze the impact of different bit slices on the accuracy. By pruning the redundant input bit slices and shrinking the network size, we are able to build a more compact BNN. Our result ...
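The bit-sliced input format can be sketched as follows, assuming 8-bit unsigned inputs such as pixel intensities. The sketch splits each value into binary planes and then rebuilds it from a subset of them. Which slices are redundant is determined by the accuracy analysis described above; keeping only the most significant planes here is an illustrative assumption, and the function names are hypothetical.

```python
def to_bit_slices(pixels, n_bits=8):
    """Split unsigned integers into n_bits binary planes, MSB first."""
    return [[(p >> b) & 1 for p in pixels]
            for b in range(n_bits - 1, -1, -1)]


def from_bit_slices(slices, keep):
    """Rebuild integers from the first `keep` (most significant) planes."""
    n_bits = len(slices)
    values = [0] * len(slices[0])
    for idx in range(keep):
        shift = n_bits - 1 - idx  # bit position of this plane
        values = [v | (bit << shift) for v, bit in zip(values, slices[idx])]
    return values


pixels = [200, 37, 255, 0]
slices = to_bit_slices(pixels)            # 8 binary planes of 4 values each
approx = from_bit_slices(slices, keep=4)  # drop the 4 low-order planes
print(approx)
```

Each plane is already binary, so it can feed a binarized first layer directly; pruning a plane removes the corresponding input channel and the weights attached to it.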