[R] Reprogrammable Electro-Optic Nonlinear Activation Functions for Optical Neural Networks
I am very excited to share our recently published work towards developing nonlinear activation functions for optical neural networks (ONNs).
There has been a lot of interest in specialized hardware for achieving high efficiency and performance on machine learning tasks. Matrix-vector multiplications are one of the most important (and computationally expensive) operations in neural networks. It turns out that analog optical processors can perform these operations in O(1) time (rather than the O(n^2) time on GPUs and CPUs). These specialized ONN processors, which are driven by modulated lasers, could potentially be scaled to use far less energy per operation than conventional digital processors.
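To make the operation count concrete, here is a minimal NumPy sketch (not taken from the paper or from either simulator package). It assumes the linear part of an ONN layer can be modeled as a unitary matrix acting on a vector of complex mode amplitudes. The helper `random_unitary` and the size `n = 16` are purely illustrative.

```python
import numpy as np

def random_unitary(n, seed=0):
    """Draw a random n x n unitary matrix (assumed stand-in for a lossless optical mesh)."""
    rng = np.random.default_rng(seed)
    a = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    q, _ = np.linalg.qr(a)  # the Q factor of a QR decomposition is unitary
    return q

n = 16
U = random_unitary(n)
x = np.ones(n, dtype=complex) / np.sqrt(n)  # input amplitudes with unit total power

# A digital processor needs n * n multiply-accumulates for this product;
# in the optical picture the same map is applied as the light propagates.
y = U @ x
digital_macs = n * n

print(digital_macs)  # 256
print(np.isclose(np.sum(np.abs(y) ** 2), 1.0))  # True: a unitary preserves total power
```

The power check is only a sanity test of the unitary assumption; it says nothing about the energy cost of real hardware.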
Of course, the other piece of the puzzle for neural networks is the nonlinear activation function. Optics is excellent for performing linear operations, but nonlinearities are far more difficult, especially in on-chip circuits. Basically, in nature, if you want to see something or to send information, you use light. But if you want to make a decision on that information, you use electrical charge.
Our paper (linked below) proposes a scheme for building a full ONN with an activation function by coupling a small electrical circuit to the output of each ONN layer. This electrical circuit converts a small amount of the optical signal into an electrical voltage, which then nonlinearly modulates the optical signal. We performed a benchmark of this ONN on the MNIST image recognition task and found that our activation function significantly boosted the classification accuracy of the ONN, from ~85% without the activation to ~94% with the activation. This is still a bit below the performance achieved in state-of-the-art models, but our setup used only 16 complex Fourier coefficients of the images as inputs (rather than all 784 pixels).
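As a rough illustration of the tap-and-modulate idea, here is a simplified toy model in NumPy. It is not the exact expression derived in the paper and not the API of either simulator package. It assumes that a fraction `alpha` of the optical power is tapped off, that the resulting voltage is proportional to that power, and that the voltage sets the phase difference of an idealized Mach-Zehnder modulator acting on the remaining light. The function name `eo_activation` and the parameters `alpha`, `gain`, and `phi_b` are hypothetical.

```python
import numpy as np

def eo_activation(z, alpha=0.1, gain=np.pi, phi_b=np.pi):
    """Toy electro-optic activation acting on complex field amplitude(s) z.

    alpha : fraction of optical power tapped off and detected (assumed)
    gain  : phase shift per unit of detected power (assumed lumped constant)
    phi_b : static bias phase of the modulator (assumed)
    """
    z = np.asarray(z, dtype=complex)
    tapped_power = alpha * np.abs(z) ** 2    # power converted to an electrical signal
    delta_phi = gain * tapped_power + phi_b  # voltage-induced phase difference
    # Idealized Mach-Zehnder transmission for a phase difference delta_phi:
    # (1 + exp(-1j * delta_phi)) / 2 = exp(-0.5j * delta_phi) * cos(0.5 * delta_phi)
    transmission = np.exp(-0.5j * delta_phi) * np.cos(0.5 * delta_phi)
    return np.sqrt(1 - alpha) * transmission * z

# Output magnitude versus input magnitude for a few sample amplitudes.
amplitudes = np.linspace(0.0, 3.0, 7)
print(np.abs(eo_activation(amplitudes)))
```

With the bias `phi_b = pi` chosen here, weak inputs are strongly suppressed and stronger inputs are transmitted, which is what makes the response nonlinear in the input amplitude. Changing `phi_b` or `gain` changes the shape of the response, which is the sense in which such an activation could be reprogrammed.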
Check out the paper below and feel free to ask questions. Our two Python ONN simulator packages (developed by two of my co-authors) are available on GitHub: https://github.com/fancompute/neuroptica and https://github.com/solgaardlab/neurophox. These repos include several examples if you’re interested in playing around with training ONNs on a computer.
Journal Paper: https://doi.org/10.1109/JSTQE.2019.2930455
arXiv preprint: https://arxiv.org/abs/1903.04579 (same content as the journal version)