[D] Which part of the RNN architecture stores the sequential memory?

I was reading Andrej Karpathy’s blog post on RNNs to get familiar with how an RNN works, both mathematically and intuitively. From my understanding, there are three sets of parameters to optimise:

  1. Wxh – multiplied with the new input to give its contribution to the hidden state
  2. Whh – multiplied with the rolling hidden state from the previous step, then added to the input contribution above
  3. Why – multiplied with the rolling hidden state to obtain the output

Then there is the rolling hidden state (H), which accumulates information from all the inputs seen so far. We optimise on the loss calculated from the output to find the best values for the parameters above.
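To make the setup concrete, here is how I picture a single step and a full pass over a sequence. This is only a minimal NumPy sketch of my understanding: the layer sizes, the random initialisation, the omission of bias terms, and the helper names `rnn_step` and `run_sequence` are my own assumptions, not taken from the blog.

```python
import numpy as np

rng = np.random.default_rng(0)

# Arbitrary sizes chosen for illustration only.
input_size, hidden_size, output_size = 4, 3, 2

# The three parameter matrices (small random init, biases omitted).
Wxh = rng.standard_normal((hidden_size, input_size)) * 0.01
Whh = rng.standard_normal((hidden_size, hidden_size)) * 0.01
Why = rng.standard_normal((output_size, hidden_size)) * 0.01


def rnn_step(x, h):
    """One time step: combine the new input with the previous hidden state."""
    h_new = np.tanh(Whh @ h + Wxh @ x)  # updated rolling hidden state
    y = Why @ h_new                     # output read from the hidden state
    return h_new, y


def run_sequence(xs):
    """Feed a sequence through the RNN, carrying H from step to step."""
    h = np.zeros(hidden_size)  # H starts empty for each new sequence
    ys = []
    for x in xs:
        h, y = rnn_step(x, h)
        ys.append(y)
    return h, ys


# A toy sequence of 5 random input vectors.
xs = [rng.standard_normal(input_size) for _ in range(5)]
h_final, ys = run_sequence(xs)
```

In this sketch the same three matrices are reused at every step and only change during training, while H is overwritten at every step and reset for each new sequence.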

What I am not able to visualise and understand is where the so-called sequential memory is stored.

Is it stored in the vector H (the rolling hidden state) or in the weight matrix Whh?

In either case, could you also give some intuition for how a matrix or vector can hold memory?

submitted by /u/thehumanlobster