[D] Attention layer yields inconsistent results
I am currently working on a problem that involves Recurrent Neural Networks. More precisely, I am dealing with sequences of inputs and trying to make a prediction at each time step. As such, I decided to include an attention layer that attends to the left context only.
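To make "left context only" concrete, here is a minimal sketch, assuming plain dot-product self-attention in NumPy. It is not the actual model code, and the names `causal_self_attention` and `leaks_future` are hypothetical. The second function is one way to check for leakage: change the inputs after step `t` and see whether the outputs up to step `t` move.

```python
import numpy as np


def causal_self_attention(x):
    """Dot-product self-attention where step t attends to steps 0..t only.

    x has shape (T, d): T time steps, d features per step.
    """
    T, d = x.shape
    scores = x @ x.T / np.sqrt(d)  # (T, T) query-key similarities
    # True wherever the key index is greater than the query index (the future).
    future = np.triu(np.ones((T, T), dtype=bool), k=1)
    scores = np.where(future, -np.inf, scores)  # block the right context
    scores = scores - scores.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(scores)  # masked entries become exactly 0
    weights = weights / weights.sum(axis=1, keepdims=True)
    return weights @ x


def leaks_future(attention_fn, T=8, d=4, t=3, seed=0):
    """Return True if changing inputs after step t changes outputs at steps <= t."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=(T, d))
    x_perturbed = x.copy()
    x_perturbed[t + 1:] = rng.normal(size=(T - t - 1, d))  # alter only the future
    before = attention_fn(x)[: t + 1]
    after = attention_fn(x_perturbed)[: t + 1]
    return not np.allclose(before, after)
```

Calling `leaks_future(causal_self_attention)` should return `False`. Running the same kind of check on the real layer, by perturbing future time steps and comparing earlier outputs, would show whether the right context is reaching the predictions.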
The problem is that the results vary depending on the validation data, and by this I mean a difference of 10-15% in accuracy! I suspect that something is going wrong. I even wonder whether the attention layer is actually looking at both the left and right contexts, which would partly explain why performance is sometimes so good (but I set
Has this ever happened to you, and what would you do in such a situation? Thank you 🙂