Skip to main content

Blog

Learn About Our Meetup

5000+ Members

MEETUPS

LEARN, CONNECT, SHARE

Join our meetup, learn, connect, share, and get to know your Toronto AI community. 

JOB POSTINGS

INDEED POSTINGS

Browse through the latest deep learning, ai, machine learning postings from Indeed for the GTA.

CONTACT

CONNECT WITH US

Are you looking to sponsor space, be a speaker, or volunteer, feel free to give us a shout.

[P] MLP output of first layer is zero after one epoch

I’ve been running into an issue lately trying to train a simple MLP.

I’m basically trying to get a network to map the XYZ position and RPY orientation of the end-effector of a robot arm (6-dimensional input) to the angle of every joint of the robot arm to reach that position (6-dimensional output), so this is a regression problem.

I’ve generated a dataset using the angles to compute the current position, and generated datasets with 5k, 500k and 500M sets of values.

My issue is the MLP I’m using doesn’t learn anything at all. Using Tensorboard (I’m using Keras), I’ve realized that the output of my very first layer is always zero (see Image1 ), no matter what I try.

Basically, my input is a shape (6,) vector and the output is also a shape (6,) vector.

Here is what I’ve tried so far, without success:

  • I’ve tried MLPs with 2 layers of size 12, 24; 2 layers of size 48, 48; 4 layers of size 12, 24, 24, 48.
  • Adam, SGD, RMSprop optimizers
  • Learning rates ranging from 0.15 to 0.001, with and without decay
  • Both Mean Squared Error (MSE) and Mean Absolute Error (MAE) as the loss function
  • Normalizing the input data, and not normalizing it (the first 3 values are between -3 and +3, the last 3 are between -pi and pi)
  • Batch sizes of 1, 10, 32
  • Tested the MLP of all 3 datasets of 5k values, 500k values and 5M values.
  • Tested with number of epoches ranging from 10 to 1000
  • Tested multiple initializers for the bias and kernel.
  • Tested both the Sequential model and the Keras functional API (to make sure the issue wasn’t how I called the model)
  • All 3 of sigmoid, relu and tanh activation functions for the hidden layers (the last layer is a linear activation because its a regression)

Additionally, I’ve tried the very same MLP architecture on the basic Boston housing price regression dataset by Keras, and the net was definitely learning something, which leads me to believe that there may be some kind of issue with my data. However, I’m at a complete loss as to what it may be as the system in its current state does not learn anything at all, the loss function just stalls starting on the 1st epoch.

Any help or lead would be appreciated, and I will gladly provide code or data if needed!

Thank you

submitted by /u/deathlymonkey
[link] [comments]