The main reason people use BatchNorm despite it being compute-heavy (~25% of total model compute) is the fast official cuDNN implementation. It's the same reason RNN variants other than LSTM and GRU never became popular.
Also, BatchNorm requires computing a square root and a division, which need full precision to work properly, so going half-precision or applying quantization is not easy.
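For context, here is a minimal NumPy sketch of a BatchNorm forward pass (training-mode batch statistics only, function and variable names are illustrative rather than from any particular framework), showing where the square root and division sit:

```python
import numpy as np

def batchnorm_forward(x, gamma, beta, eps=1e-5):
    """Minimal BatchNorm forward pass over the batch axis (illustrative sketch).

    x: (N, C) activations; gamma, beta: (C,) learned scale and shift.
    """
    mean = x.mean(axis=0)                    # per-channel batch mean
    var = x.var(axis=0)                      # per-channel batch variance
    # The square root and division below are the numerically sensitive steps:
    # in float16 the variance can underflow and eps can be absorbed, which is
    # why mixed-precision setups typically keep these statistics in float32.
    x_hat = (x - mean) / np.sqrt(var + eps)  # normalize
    return gamma * x_hat + beta              # scale and shift

# Toy usage: normalize a batch of 4 samples with 3 channels.
x = np.random.randn(4, 3).astype(np.float32)
out = batchnorm_forward(x, gamma=np.ones(3), beta=np.zeros(3))
```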
Anyway, are there any new methods that could dethrone BatchNorm entirely? Some papers:
Equinormalization https://openreview.net/forum?id=r1gEqiC9FX
Generalized Hamming Network https://arxiv.org/abs/1710.10328
submitted by /u/tsauri