[R] Reusing Convolutional Activations from Frame to Frame to Speed up Learning and Inference
Abstract: When processing similar frames in succession, we can exploit the locality of the convolution operation to reevaluate only the portions of the image that changed from the previous frame. By saving the output of a convolutional layer and computing the change between frames, we can reuse previous activations and avoid recomputing convolutions whose outputs we have already observed. This technique applies to many domains, such as processing video from stationary cameras, studying the effects of occluding or distorting sections of images, applying convolution to multiple frames of audio or time series data, or playing Atari games. Furthermore, it can speed up both training and inference.
Summary of results: On CPUs, reusing convolutional activations saves substantial computation for both training and inference, and in some scenarios it is a viable alternative to running on GPUs: it is likely cheaper, sometimes faster, and typically has access to more memory. On GPUs there is currently less incentive to use this method, beyond possibly saving power. The technique has many possible application domains, and there are likely many ways to improve upon it.
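The core idea can be sketched in a few lines of NumPy. This is a minimal illustration, not the repository's implementation: the function names (`conv2d`, `conv2d_incremental`) and the single-channel, single-kernel setup are my own simplifications. Given the previous frame and its cached convolution output, only output positions whose receptive field overlaps a changed pixel are recomputed.

```python
import numpy as np

def conv2d(image, kernel):
    """Full 'valid' 2D cross-correlation via explicit loops (baseline)."""
    kh, kw = kernel.shape
    oh, ow = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    out = np.empty((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def conv2d_incremental(new_frame, prev_frame, prev_out, kernel):
    """Reuse cached activations: recompute only outputs whose k x k
    window covers a pixel that changed between frames."""
    kh, kw = kernel.shape
    out = prev_out.copy()
    dirty = set()
    for r, c in np.argwhere(new_frame != prev_frame):
        # Output (i, j) reads window [i, i+kh) x [j, j+kw), so it is
        # affected iff i <= r < i+kh and j <= c < j+kw.
        for i in range(max(0, r - kh + 1), min(prev_out.shape[0], r + 1)):
            for j in range(max(0, c - kw + 1), min(prev_out.shape[1], c + 1)):
                dirty.add((i, j))
    for i, j in dirty:
        out[i, j] = np.sum(new_frame[i:i + kh, j:j + kw] * kernel)
    return out

# Usage: two frames that differ only in a small patch.
rng = np.random.default_rng(0)
frame0 = rng.standard_normal((32, 32))
kernel = rng.standard_normal((3, 3))
out0 = conv2d(frame0, kernel)          # full pass on the first frame

frame1 = frame0.copy()
frame1[10:14, 10:14] += 1.0            # only a 4x4 region changes
out1 = conv2d_incremental(frame1, frame0, out0, kernel)
```

Here a 4x4 changed region forces only 36 of the 900 output positions to be recomputed; the other 96% of activations are reused from the cache. A real implementation would extend this to multiple channels and batch the dirty positions rather than looping in Python.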
Code and result figures: https://github.com/arnokha/reusing_convolutions
Paper link: [Submitted, coming soon]