[D] Do GPUs only speed up ANN training when the number of nodes per layer is high?
TL;DR: TensorFlow's Fashion MNIST example only trains faster on the GPU if I increase the number of nodes in the hidden layer.
Just spent a good day replacing my AMD GPU with an NVIDIA one and installing CUDA and whatnot. Finally got it working and loaded up TensorFlow’s Fashion MNIST example, fully expecting training with my new setup to be miles quicker. To my horror, it was slower. Much slower: CPU=12s, GPU=20s.
The example only has 128 nodes in the hidden layer. If I increase that to 2048, the GPU is much faster (CPU=162s, GPU=31s). Increasing the number of hidden layers instead (without changing nodes per layer) still leaves the CPU faster, even with 10 hidden layers.
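For a sense of scale, here's a back-of-the-envelope sketch of how many trainable parameters each of those configurations has. It assumes the example is a plain stack of fully connected layers on flattened 28x28 images with 10 output classes, which is my reading of the tutorial and not something I've checked against its source. `dense_param_count` is just a helper name I made up for this.

```python
def dense_param_count(layer_sizes):
    """Weights plus biases for a stack of fully connected layers.

    layer_sizes lists the width of every layer, input first, output last.
    """
    return sum(
        n_in * n_out + n_out  # weight matrix + bias vector
        for n_in, n_out in zip(layer_sizes, layer_sizes[1:])
    )


# Assumed shapes: 28x28 greyscale inputs, 10 clothing classes.
INPUTS, CLASSES = 28 * 28, 10

small = dense_param_count([INPUTS, 128, CLASSES])            # original example
wide = dense_param_count([INPUTS, 2048, CLASSES])            # 2048-node hidden layer
deep = dense_param_count([INPUTS] + [128] * 10 + [CLASSES])  # 10 hidden layers of 128

print(small, wide, deep)  # 101770 1628170 250378
```

So going to 2048 nodes gives roughly 16x the weights of the original, while stacking 10 layers of 128 gives only about 2.5x, split across many small matrices. That at least lines up with the timings above, where only the wide version lets the GPU pull ahead, though it doesn't prove that's the cause.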
Is this surprising? What with all the hype around ML with GPUs, I expected it to be way quicker even with relatively few nodes per layer. Is there something wrong with my setup or do you only feel the benefit of the GPU’s parallel computing if you’re using layers with large numbers of nodes?