
[R] BlockSwap: Fisher guided block substitution for network compression

Many networks are composed of repeated blocks. For compression, Moonshine [1] proposed substituting all of a network's blocks with a single cheaper block type. We propose BlockSwap, a method for choosing mixed block-type configurations instead.

Paper: https://arxiv.org/abs/1906.04113

PyTorch Code: https://github.com/BayesWatch/pytorch-blockswap

TL;DR: Compress overparameterised networks using Fisher information to rank randomly proposed alternatives.

Abstract:

The desire to run neural networks on low-capacity edge devices has led to the development of a wealth of compression techniques. Moonshine is a simple and powerful example of this: one takes a large pre-trained network and substitutes each of its convolutional blocks with a selected cheap alternative block, then distills the resultant network with the original. However, not all blocks are created equally; for a required parameter budget there may exist a potent combination of many different cheap blocks. In this work, we find these by developing BlockSwap: an algorithm for choosing networks with interleaved block types by passing a single minibatch of training data through randomly initialised networks and gauging their Fisher potential. We show that block-wise cheapening yields more accurate networks than single block-type networks across a spectrum of parameter budgets.
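The paper's own PyTorch code is linked above. As a toy illustration only (not the authors' implementation), the ranking step described in the abstract can be sketched as follows: for each randomly proposed configuration, pass one minibatch through the randomly initialised network, collect per-channel activations and their loss gradients, and score the network by an activation-times-gradient Fisher approximation. The helper names `fisher_potential` and `rank_candidates`, and the exact aggregation over blocks, are assumptions for this sketch.

```python
import numpy as np

def fisher_potential(acts, grads):
    """Approximate per-layer Fisher score from one minibatch.

    acts, grads: arrays of shape (N, C) holding each example's channel
    activations and the gradients of the loss w.r.t. those activations.
    Per channel c: (1 / 2N) * (sum_n acts[n, c] * grads[n, c])^2,
    summed over channels to give a single layer score.
    """
    n = acts.shape[0]
    per_channel = (acts * grads).sum(axis=0) ** 2 / (2.0 * n)
    return float(per_channel.sum())

def rank_candidates(candidates):
    """Rank candidate block configurations by total Fisher potential.

    candidates: dict mapping a configuration name to a list of
    (acts, grads) pairs, one pair per block in that configuration.
    Returns names sorted from highest to lowest total score.
    """
    scores = {name: sum(fisher_potential(a, g) for a, g in blocks)
              for name, blocks in candidates.items()}
    return sorted(scores, key=scores.get, reverse=True)
```

In the real method the activations and gradients come from hooks on a randomly initialised PyTorch network, and the top-ranked configuration is then trained with distillation from the original network.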

[1] Crowley, Elliot J., Gavin Gray, and Amos J. Storkey. “Moonshine: Distilling with cheap convolutions.” Advances in Neural Information Processing Systems. 2018.

submitted by /u/jw-turner