[D] Python – how can I solve a gradient divergence problem?
Here is my setup (the full code is linked at the bottom); the dataset is a stock time series.
`Total params`: 330,241
`samples`: 2264
I just ran the same code in a for loop ten times; a simplified sketch of what I mean is below.
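To be concrete, this is roughly the shape of the experiment (a self-contained sketch: the synthetic data and tiny dense model are stand-ins for my real stock data and network, which are in the gist linked at the bottom):

```python
# Simplified stand-in for the experiment: same training code, run ten times.
import numpy as np
import tensorflow as tf

rng = np.random.default_rng(0)
X = rng.normal(size=(2264, 10)).astype("float32")  # stand-in for the 2264 samples
y = rng.normal(size=(2264, 1)).astype("float32")

def build_model():
    # A fresh model means fresh, randomly initialized weights on every call.
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(10,)),
        tf.keras.layers.Dense(64, activation="relu"),
        tf.keras.layers.Dense(1),
    ])
    model.compile(optimizer="adam", loss="mse")
    return model

mse_per_run = []
for run in range(10):  # same code, ten times, nothing changed in between
    model = build_model()
    model.fit(X, y, epochs=5, batch_size=32, verbose=0)
    mse = model.evaluate(X, y, verbose=0)
    mse_per_run.append(mse)
    print(f"run {run}: MSE = {mse:.6f}")

print(f"mean = {np.mean(mse_per_run):.6f}, std = {np.std(mse_per_run):.6f}")
```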
And below are the results.
I haven't changed anything between runs; I only ran the for loop. Yet the MSE difference across the ten runs is very large.
I think the reason for this is that the weights are initialized randomly on every run. So I increased the number of epochs and the batch_size, but that did not solve the gradient divergence problem.
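To illustrate the hypothesis: if random initialization really is the cause, pinning the random seeds should make every run start from the same weights. A minimal sketch of what I mean (this assumes Keras on a TensorFlow 2.x backend and is not code from the gist; `fix_seeds` is a placeholder name of mine):

```python
# Minimal sketch of pinning the seeds so every run starts from the same
# initial weights (assumes TensorFlow 2.x; on TF 1.x the last call would be
# tf.set_random_seed). Even with seeds fixed, some GPU kernels are
# non-deterministic, so results may still differ slightly.
import os
import random

import numpy as np
import tensorflow as tf

def fix_seeds(seed: int = 42) -> None:
    os.environ["PYTHONHASHSEED"] = str(seed)  # hash-based operations
    random.seed(seed)                         # Python's built-in RNG
    np.random.seed(seed)                      # NumPy (shuffling, some inits)
    tf.random.set_seed(seed)                  # TensorFlow weight initializers

fix_seeds(42)  # call once, before building and training the model
```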
How should I solve this problem? Your opinions and thoughts would be much appreciated.
If you want to see the full source, here is the link: https://gist.github.com/Lay4U/e1fc7d036356575f4d0799cdcebed90e