[P] RNNs and Reinforcement learning

Keep in mind that I am still in the early phase of learning ML. I cannot disclose the exact task/problem I am working on (it is not related to NLP), but the task below captures the essence of it.

(REINFORCEMENT LEARNING)

Input – A paragraph written in English.

Output – On a continuous (not discrete) scale of 1-10, predict the level of English of each sentence. For example, "My English is poor" should score better than "I have bad English", even though both are grammatically correct.

Example input: Hey! how are you? It has been so long since I last saw you.

Example output: [5.544554, 5.890909] (made up numbers)

My approach:

  1. Encode each word. (fixed length if that matters)
  2. Break the paragraph into sentences, because the prediction for a sentence does not depend on the other sentences.
  3. Pad every sentence so that they all have the same length.
  4. For every sentence:

     i. Pass each word of the sentence to an RNN encoder and extract the hidden state corresponding to the last word before padding. For example, for the sentence "i am fine padded_word padded_word" with RNN outputs [A, B, C, D, E], I extract the output/hidden state C. (Not sure if this is the right thing to do.)

     ii. Pass this hidden state C to an RNN decoder, which makes the prediction. This prediction leads to a reward.

  5. Use PPO (proximal policy optimization).
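
The padding step and the "extract the hidden state of the last real word" step can be sketched in plain Python as below. This is only a minimal illustration, not the actual model: the toy `embed` function, the `<pad>` token, the dimensions, and the untrained random weights are all assumptions, and a real setup would use a learned encoder from a deep learning library. What it shows is that, because the RNN reads left to right, the state at index `length - 1` is unaffected by whatever padding follows it.

```python
import math
import random

EMBED_DIM = 4    # assumed size of each fixed-length word encoding
HIDDEN_DIM = 3   # assumed RNN hidden size
PAD = "<pad>"    # hypothetical padding token

rng = random.Random(0)
_vocab = {}

def embed(word):
    """Toy stand-in for a real word encoder: one random vector per word."""
    if word == PAD:
        return [0.0] * EMBED_DIM
    if word not in _vocab:
        _vocab[word] = [rng.uniform(-1, 1) for _ in range(EMBED_DIM)]
    return _vocab[word]

# Untrained random weights; a real encoder would learn these.
W_x = [[rng.uniform(-0.5, 0.5) for _ in range(EMBED_DIM)] for _ in range(HIDDEN_DIM)]
W_h = [[rng.uniform(-0.5, 0.5) for _ in range(HIDDEN_DIM)] for _ in range(HIDDEN_DIM)]

def pad_batch(sentences):
    """Pad tokenized sentences to equal length and remember the true lengths."""
    lengths = [len(s) for s in sentences]
    max_len = max(lengths)
    padded = [s + [PAD] * (max_len - len(s)) for s in sentences]
    return padded, lengths

def rnn_states(tokens):
    """Run a plain tanh RNN over the tokens; return the state after each step."""
    h = [0.0] * HIDDEN_DIM
    states = []
    for tok in tokens:
        x = embed(tok)
        h = [
            math.tanh(
                sum(W_x[i][k] * x[k] for k in range(EMBED_DIM))
                + sum(W_h[i][j] * h[j] for j in range(HIDDEN_DIM))
            )
            for i in range(HIDDEN_DIM)
        ]
        states.append(h)
    return states

def last_real_state(tokens, length):
    """Pick the state at the last non-padding position (index length - 1)."""
    return rnn_states(tokens)[length - 1]

sentences = [["i", "am", "fine"], ["it", "has", "been", "so", "long"]]
padded, lengths = pad_batch(sentences)
summaries = [last_real_state(toks, n) for toks, n in zip(padded, lengths)]
```

Here `summaries[0]` plays the role of the state C in the "i am fine" example, and is what would be handed on to the decoder.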

I hope this is clear and am sorry for being so vague about my problem. If it matters, I have a few fully connected layers between encoder and decoder.

So, is this the best approach for this problem?

Does PPO work well with RNNs?
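
For context on this question, the clipped surrogate that PPO optimizes can be sketched in a few lines. The objective only needs per-sample log-probabilities and advantage estimates, so the loss itself does not depend on whether the policy is recurrent. The function name `ppo_clip_loss` and the `clip_eps` default of 0.2 are assumptions chosen for illustration, not part of my actual setup.

```python
import math

def ppo_clip_loss(new_logps, old_logps, advantages, clip_eps=0.2):
    """Negative mean of the PPO clipped surrogate (lower is better)."""
    total = 0.0
    for new_lp, old_lp, adv in zip(new_logps, old_logps, advantages):
        # Probability ratio between the updated policy and the data-collecting policy.
        ratio = math.exp(new_lp - old_lp)
        # Keep the ratio inside [1 - eps, 1 + eps].
        clipped = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
        # Take the more pessimistic of the unclipped and clipped terms.
        total += min(ratio * adv, clipped * adv)
    return -total / len(advantages)
```

With a recurrent policy, the extra care goes into how `new_logps` is computed (the hidden state has to be rebuilt or stored consistently with how the data was collected), not into this loss.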

Also, what might be the reason that the network is not learning even when I am using a normalized environment?

Any help would be highly appreciated.

submitted by /u/xicor7017

Toronto AI is a social and collaborative hub to unite AI innovators of Toronto and surrounding areas. We explore AI technologies in digital art and music, healthcare, marketing, fintech, vr, robotics and more. Toronto AI was founded by Dave MacDonald and Patrick O'Mara.