[P] Real-time MLP in 50 lines of code
The MLP is a bit old, but it is mature enough to be deployed in industry. This repo has two purposes: minimal C++ MLP code for education, and real-time performance for industry/IoT use. There are several good points:
0: It uses standard C++ with no special instructions or intrinsics, so it is portable to most machines.
1: It uses C++ templates, so the compiler can inline everything. The network works like a predefined static function: a pure stream of floating-point instructions.
2: It trains by SGD on one sample at a time, which enables real-time learning and prediction. This should be useful for future industrial applications. The training “FPS” can reach 100k for a 16-layer network with 32 hidden units, e.g. we can learn from and predict each WAV frame as it arrives.
3: It uses shared hidden-to-hidden weights. In fact it is similar to an RNN, making use of marginal chaos. This shrinks the network enough to fit in cache without loss of accuracy.
4: The activation function is y=x/(1+|x|), which is sigmoid-like. Both it and its gradient are fast to calculate, and it does not saturate easily.
5: Experiments show that only a single CPU thread is needed; more threads do not improve the speed because the workload is memory bound.
6: For >=32 hidden units, gcc autovectorization turns the loops into SSE/AVX code, which is 4X faster.
7: The floating-point type is a template parameter; float, double and long double all work.
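
To make points 1, 2, 3, 4 and 7 concrete, here is a minimal sketch of how they could fit together. It is not the repo's actual code: the names `act`, `act_grad` and `TinyMLP`, the single linear output, the squared-error loss and the fixed-seed weight init are all assumptions made for illustration. The gradient uses the identity dy/dx = 1/(1+|x|)^2 = (1-|y|)^2, so the backward pass only needs the stored outputs.

```cpp
#include <cassert>
#include <cmath>

// Activation from point 4: y = x / (1 + |x|).
template <typename T>
inline T act(T x) { return x / (T(1) + std::fabs(x)); }

// Its derivative expressed through the output y: dy/dx = (1 - |y|)^2.
template <typename T>
inline T act_grad(T y) { T s = T(1) - std::fabs(y); return s * s; }

// Hypothetical layout (an assumption, not the repo's): IN inputs, H hidden
// units, L hidden layers, one linear output. Every hidden-to-hidden layer
// reuses the single matrix w_hh (point 3).
template <typename T, int IN, int H, int L>
struct TinyMLP {
    T w_in[H][IN], w_hh[H][H], w_out[H];
    T h[L][H];               // activations kept for the backward pass
    unsigned seed = 12345u;  // fixed seed, for reproducibility only

    // Deterministic pseudo-random weight in [-0.5, 0.5).
    T next_weight() {
        seed = seed * 1664525u + 1013904223u;
        return T((seed >> 8) & 0xFFFFu) / T(65536) - T(0.5);
    }

    TinyMLP() {
        for (int j = 0; j < H; ++j) {
            for (int i = 0; i < IN; ++i) w_in[j][i] = next_weight();
            for (int k = 0; k < H; ++k) w_hh[j][k] = next_weight();
            w_out[j] = next_weight();
        }
    }

    // Forward pass; stores every layer's activations in h.
    T predict(const T (&x)[IN]) {
        for (int j = 0; j < H; ++j) {
            T s = 0;
            for (int i = 0; i < IN; ++i) s += w_in[j][i] * x[i];
            h[0][j] = act(s);
        }
        for (int l = 1; l < L; ++l)
            for (int j = 0; j < H; ++j) {
                T s = 0;
                for (int k = 0; k < H; ++k) s += w_hh[j][k] * h[l - 1][k];
                h[l][j] = act(s);
            }
        T y = 0;
        for (int j = 0; j < H; ++j) y += w_out[j] * h[L - 1][j];
        return y;
    }

    // One SGD step on a single sample (squared-error loss).
    // Returns the prediction error measured before the update.
    T train(const T (&x)[IN], T target, T lr) {
        T err = predict(x) - target;
        T delta[H], prev[H];
        T g_hh[H][H] = {};  // gradient of the shared matrix, summed over layers
        for (int j = 0; j < H; ++j) {
            delta[j] = err * w_out[j] * act_grad(h[L - 1][j]);
            w_out[j] -= lr * err * h[L - 1][j];
        }
        for (int l = L - 1; l >= 1; --l) {
            for (int k = 0; k < H; ++k) prev[k] = 0;
            for (int j = 0; j < H; ++j)
                for (int k = 0; k < H; ++k) {
                    g_hh[j][k] += delta[j] * h[l - 1][k];
                    prev[k] += w_hh[j][k] * delta[j];
                }
            for (int k = 0; k < H; ++k) delta[k] = prev[k] * act_grad(h[l - 1][k]);
        }
        for (int j = 0; j < H; ++j) {
            for (int i = 0; i < IN; ++i) w_in[j][i] -= lr * delta[j] * x[i];
            for (int k = 0; k < H; ++k) w_hh[j][k] -= lr * g_hh[j][k];
        }
        return err;
    }
};
```

In this sketch, something like `TinyMLP<float, 2, 32, 16> net;` followed by one `net.train(x, target, lr)` call per incoming sample gives the online learn-and-predict loop from point 2. Because the sizes are template parameters, all loop bounds are compile-time constants, which is what lets the compiler unroll, inline and vectorize the inner loops.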
Hope you like it!