[D] Basic RNN predicting more than 1 timestep w/ Keras (Python).

I’ve been working with RNNs for a little while now, but prior to dipping my toe into this area I successfully implemented a few basic feed-forward models in a production environment. I like to think I understand the premise behind recurrent topologies (GRU and LSTM, for instance). What I’m struggling with is the basic shape of the data and/or the proper parameters for my training data.

Here’s a basic example I’ve been playing with for many-in, one-out (omitting the fancy Keras utils that do automatic encoding/mapping):

The Data / imports

import numpy as np
from keras.utils import to_categorical, plot_model
from keras.preprocessing.text import Tokenizer
from keras.preprocessing.sequence import pad_sequences
from keras.models import Sequential
from keras.layers import Dense, Embedding, LSTM, GRU, Dropout, Flatten
from keras import callbacks, regularizers, optimizers

X = [
    ["a", "b", "c"],
    ["b", "c", "d"],
    ["c", "d", "e"],
    ["d", "e", "f"],
    ["e", "f", "g"],
    ["f", "g", "h"],
    ["g", "h", "i"],
    ["h", "i", "j"],
]

# Maps to each observation in X, 2 seq out
y_b = [
    ["d", "e"], ["e", "f"], ["f", "g"], ["g", "h"],
    ["h", "i"], ["i", "j"], ["j", "k"], ["k", "l"],
]

# Maps to each observation in X, 1 seq out
y_a = [["d"], ["e"], ["f"], ["g"], ["h"], ["i"], ["j"], ["k"]]

# The lowercase alphabet, used to translate characters to and from their ordinal offsets
letters = [chr(i) for i in range(97, 97 + 26)]
decode = lambda idx: letters[idx]
encode = lambda seq: np.array([[letters.index(c) for c in obs] for obs in seq])

X_encoded = encode(X)
y_encoded = encode(y_a)  # y_a == predict single timestep, y_b == predict 2 timesteps

The resulting design matrix should look something like this:

array([[0, 1, 2],
       [1, 2, 3],
       [2, 3, 4],
       [3, 4, 5],
       [4, 5, 6],
       [5, 6, 7],
       [6, 7, 8],
       [7, 8, 9]])

Reshaping for network

sequence_length = 3
X_reshaped = np.reshape(X_encoded, (len(X_encoded), sequence_length, 1))
X_reshaped = to_categorical(X_reshaped)  # This is from keras.utils
y_cat = to_categorical(y_encoded)
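
For what it’s worth, these are the shapes I believe I end up with at each step (assuming to_categorical infers the number of classes from the largest index it sees):

print(X_encoded.shape)   # (8, 3)      -- 8 observations, 3 timesteps
print(X_reshaped.shape)  # (8, 3, 10)  -- one-hot over classes 0..9 ("a".."j")
print(y_cat.shape)       # (8, 11)     -- y_a targets, classes 0..10 ("a".."k")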

y_cat ends up one-hot encoded: a binary matrix with a 1 at the index representing each unique entity. I also experimented with leaving X as plain integers instead of one-hot encoding it:

sequence_length = 3
X_reshaped = np.reshape(X_encoded, (len(X_encoded), sequence_length, 1))
## Experimented with
# X_reshaped = to_categorical(X_reshaped)
y_cat = to_categorical(y_encoded)
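
Concretely, for y_a I see something like this (again assuming num_classes is inferred as 11 from the max index of 10):

print(y_encoded.ravel())  # [ 3  4  5  6  7  8  9 10]
print(y_cat.shape)        # (8, 11)
print(y_cat[0])           # [0. 0. 0. 1. 0. 0. 0. 0. 0. 0. 0.]  -- "d" at index 3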

The Model

## Network topology
model = Sequential()
model.add(GRU(64, input_shape=(X_reshaped.shape[1], X_reshaped.shape[2])))
## This doesn't seem to be necessary
# model.add(Flatten())
## This, I believe, sets the assumption about the output in terms of categorical encoding
model.add(Dense(y_cat.shape[1], activation="softmax"))

rmsprop = optimizers.RMSprop(lr=0.1)
model.compile(loss="categorical_crossentropy", optimizer=rmsprop, metrics=["accuracy"])

model_params = dict(
    x=X_reshaped,
    y=y_cat,
    epochs=50,
    batch_size=2,
    verbose=1,
    # callbacks=[keras_tensorboard],
    validation_split=0.3,
)

history = model.fit(**model_params)
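
As I understand it, a GRU layer without return_sequences only emits its final hidden state, so the model’s output here is 2-D: one softmax over the alphabet per input sequence (a quick check; the 11 assumes the class count inferred above):

print(model.output_shape)  # (None, 11) -- (batch, classes), no timestep axis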

My basic 3-in, 1-out network (predicting y_a) works just fine. I run into problems when I try to predict more than one timestep (y_b) after updating the parameters of the final Dense layer. The shape and the assumptions I’ve made about the network seem to be incorrect, because the library throws an error about the shape of my ground truth (y).
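
To make the failure concrete, here’s a sketch of what I tried for the two-timestep case (the variable names are mine, just for illustration):

# Encode the 2-timestep targets instead of the 1-timestep ones
y_encoded_b = encode(y_b)              # shape (8, 2)
y_cat_b = to_categorical(y_encoded_b)  # shape (8, 2, 12) -- now 3-D

# Rebuilding the model with Dense(y_cat_b.shape[1]) gives Dense(2), and
# fit() then complains that the (None, 2) output doesn't match the
# (8, 2, 12) ground truth.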

  1. Is this the proper topology for this type of problem?
  2. Is my y encoded improperly?

Of course I’m interested in solving for the multi-timestep output, but more importantly I’m hoping to understand the “why” more than the “how”. Thanks in advance for any advice or help!

submitted by /u/butter-jesus