[D] Which part of the RNN architecture stores the sequential memory?

Written by torontoai on May 11, 2019. Posted in Reddit MachineLearning.

I was reading Andrej Karpathy’s blog post on RNNs to familiarise myself with how an RNN works, both mathematically and intuitively. From my understanding, there are three sets of parameters to optimise:

Wxh – multiplied with the new input to give its contribution to the hidden state
Whh – multiplied with the rolling hidden state; the result is added to the input term above
Why – multiplied with the rolling hidden state to obtain the output

And we have the rolling hidden state (H), which accumulates information from all the inputs seen so far. We then optimise on the loss calculated from the output to find the best set of the above parameters.
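To make the setup above concrete, here is a minimal NumPy sketch of the forward pass as I understand it. The layer sizes, the random initialisation, and the function names `rnn_step` and `run_sequence` are my own assumptions for illustration, and bias terms are left out.

```python
import numpy as np

def rnn_step(x, h, Wxh, Whh, Why):
    """One vanilla RNN step: update the rolling hidden state, then read out."""
    # Combine the new input with the previous hidden state.
    h_new = np.tanh(Wxh @ x + Whh @ h)
    # The output is computed from the updated hidden state.
    y = Why @ h_new
    return h_new, y

def run_sequence(xs, Wxh, Whh, Why):
    """Feed a list of input vectors through the RNN, starting from h = 0."""
    h = np.zeros(Whh.shape[0])
    ys = []
    for x in xs:
        h, y = rnn_step(x, h, Wxh, Whh, Why)
        ys.append(y)
    return h, ys

# Hypothetical sizes and random weights, chosen only for this sketch.
rng = np.random.default_rng(0)
input_size, hidden_size, output_size = 3, 4, 2
Wxh = rng.normal(scale=0.5, size=(hidden_size, input_size))
Whh = rng.normal(scale=0.5, size=(hidden_size, hidden_size))
Why = rng.normal(scale=0.5, size=(output_size, hidden_size))

xs = [rng.normal(size=input_size) for _ in range(5)]
h_final, ys = run_sequence(xs, Wxh, Whh, Why)
```

In this sketch the three weight matrices are never modified inside the loop; the only thing that changes from one step to the next is `h`.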

What I am not able to visualise and understand is where the so-called sequential memory is stored.

Is it stored in the vector H (the rolling hidden state) or in the weight matrix Whh?

In either case, could you also give some intuition on how a matrix / vector can hold memory?

submitted by /u/thehumanlobster