[D] Paper review + interview: Learning spatiotemporal [video action] features with 3D convolutional networks
Join Karol Zak on a tour of video action detection with deep learning, using 3D CNNs. Karol will cover a breakthrough 2015 paper, “Learning spatiotemporal features with 3D convolutional networks”. 3D convnets are conceptually easy to use and understand: they work like normal 2D CNNs, but their kernels also span a third (depth) dimension, which is used to capture the time domain. Several large video action datasets, e.g. Sports-1M and Kinetics, have also appeared and have significantly democratised the practice.
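For anyone who wants a feel for the "third dimension = time" idea before the session, here is a minimal NumPy sketch. It is not code from the paper or from Karol's notebook: the function name `conv3d_single`, the toy clips, and the hand-picked frame-difference kernel are all assumptions made for illustration (a real 3D convnet learns its kernels and has many channels and layers).

```python
import numpy as np

def conv3d_single(clip, kernel):
    """Naive 'valid' 3D convolution (cross-correlation) on a single-channel clip.

    clip:   array of shape (T, H, W)    -> frames, height, width
    kernel: array of shape (kt, kh, kw) -> temporal depth, height, width
    """
    T, H, W = clip.shape
    kt, kh, kw = kernel.shape
    out = np.zeros((T - kt + 1, H - kh + 1, W - kw + 1))
    for t in range(out.shape[0]):
        for y in range(out.shape[1]):
            for x in range(out.shape[2]):
                # The window covers several consecutive frames, not just one image.
                out[t, y, x] = np.sum(clip[t:t + kt, y:y + kh, x:x + kw] * kernel)
    return out

# Hypothetical toy data: a dot that stays put vs. a dot that moves one pixel per frame.
static = np.zeros((4, 5, 5))
static[:, 2, 2] = 1.0

moving = np.zeros((4, 5, 5))
for t in range(4):
    moving[t, 2, t] = 1.0

# Hand-picked kernel spanning 2 frames and 1x1 pixels: next frame minus current frame.
kernel = np.array([-1.0, 1.0]).reshape(2, 1, 1)

static_resp = conv3d_single(static, kernel)
moving_resp = conv3d_single(moving, kernel)

print(static_resp.shape)            # (3, 5, 5): one fewer frame, same spatial size
print(np.abs(static_resp).sum())    # 0.0: nothing changes over time
print(np.abs(moving_resp).sum())    # 6.0: the kernel responds to motion
```

A 2D kernel applied frame by frame would give the same response at the dot in both clips, since each individual frame looks alike; only a kernel with temporal extent can tell the two apart.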
Karol’s style is very practical and hands-on. Like last time, he will demonstrate the models working live in a Jupyter notebook and talk about some of his experience with video action detection.
Paper link: https://arxiv.org/abs/1412.0767