


[D] A Unifying Framework of Bilinear LSTMs

Disclaimer: this is my own paper that I've been working on; if this sort of thing is not allowed on /r/ml, please let me know.

arXiv page:

Abstract: This paper presents a novel unifying framework of bilinear LSTMs that can represent and exploit the nonlinear interactions among input features in sequence datasets, achieving superior performance over a linear LSTM without incurring more learnable parameters. To realize this, our framework balances the expressivity of the linear vs. bilinear terms by trading off the hidden state vector size against the approximation quality of the weight matrix in the bilinear term, so as to optimize the performance of our bilinear LSTM under a fixed parameter budget. We empirically evaluate our bilinear LSTM on several language-based sequence learning tasks to demonstrate its general applicability.
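The trade-off the abstract describes can be illustrated with a low-rank factorization of the bilinear weight matrix: replacing each full d × d matrix W_k with rank-r factors caps the parameter count of the bilinear term, leaving budget for a larger hidden state. Below is a minimal sketch, assuming a factorization W_k ≈ U_k V_kᵀ; the paper's exact parameterization may differ, and `bilinear_term` is a hypothetical helper name.

```python
import numpy as np

def bilinear_term(x, U, V):
    """Low-rank bilinear interaction (illustrative sketch, not the
    paper's exact construction).

    For each output unit k, computes y[k] = x^T (U[k] @ V[k].T) x,
    using rank-r factors U[k], V[k] of shape (d, r) instead of a
    full d x d weight matrix.
    """
    # U, V have shape (h, d, r): h output units, input dim d, rank r.
    # x^T U V^T x equals the dot product of (U^T x) and (V^T x).
    xu = np.einsum('d,hdr->hr', x, U)  # (h, r): x projected by U
    xv = np.einsum('d,hdr->hr', x, V)  # (h, r): x projected by V
    return np.sum(xu * xv, axis=1)     # (h,)
```

The full bilinear term would cost h·d² parameters per gate; the rank-r version costs 2·h·d·r, so choosing r < d/2 frees parameters that can instead go into a larger hidden state, which is the balance the framework tunes.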

Comments: This approach is novel because it uses bilinear neurons (essentially polynomial regression plus a nonlinearity) as a building block. This is rarely done in neural networks, since a linear neuron followed by a nonlinearity is widely accepted as sufficient for universal approximation. However, we find that a performance improvement can be achieved without additional learnable parameters when bilinear neurons are used. Notably, the original proof of the universal approximability of linear neurons (Cybenko, 1989) does not show that they are parameter-efficient.
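The distinction between the two building blocks can be sketched as follows: a standard neuron applies a nonlinearity to a linear form of the input, while a bilinear neuron adds a second-order interaction term x^T W x before the nonlinearity. This is only an illustration of the concept, not the paper's LSTM gate equations.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def linear_neuron(x, w, b):
    """Standard neuron: nonlinearity over a linear form w.x + b."""
    return sigmoid(w @ x + b)

def bilinear_neuron(x, W, w, b):
    """Bilinear neuron (illustrative sketch): adds the second-order
    interaction term x^T W x, i.e. polynomial regression + nonlinearity."""
    return sigmoid(x @ W @ x + w @ x + b)
```

With W set to zero the bilinear neuron reduces exactly to the linear one; the extra term lets a single unit capture pairwise feature interactions that a linear pre-activation cannot.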

submitted by /u/ml_mohit