[D] Python – how can I solve a gradient divergence problem?


Here is my code:

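    # Imports are assumed (the post does not show them); adjust to your Keras install.
    import matplotlib.pyplot as plt
    from sklearn.metrics import mean_squared_error
    from tensorflow.keras import backend as K
    from tensorflow.keras.layers import LSTM, Dense, Dropout
    from tensorflow.keras.models import Sequential

    # x_train, y_train, x_val, y_val, x_test, y_test are prepared beforehand.
    for _ in range(10):
        K.clear_session()  # drop state left over from the previous run

        model = Sequential()
        model.add(LSTM(256, input_shape=(None, 1)))
        model.add(Dropout(0.2))
        model.add(Dense(256))
        model.add(Dropout(0.2))
        model.add(Dense(1))
        # 'accuracy' is not meaningful for regression, so track MAE instead.
        model.compile(loss='mean_squared_error', optimizer='adam', metrics=['mae'])

        hist = model.fit(x_train, y_train, epochs=20, batch_size=64, verbose=0,
                         validation_data=(x_val, y_val))

        p = model.predict(x_test)
        print(mean_squared_error(y_test, p))

        plt.plot(y_test)
        plt.plot(p)
        plt.legend(['testY', 'p'], loc='upper right')
        plt.show()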

The dataset is a stock time series.

`Total params` : 330,241

`samples` : 2264
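
The reported parameter count can be checked by hand. The sketch below assumes the usual Keras formulas: an LSTM layer has four gates, each with an input kernel, a recurrent kernel, and a bias, while a Dense layer has one weight per input-output pair plus one bias per output. Dropout layers add no parameters. The helper names `lstm_params` and `dense_params` are illustrative and not part of Keras.

```python
def lstm_params(units, input_dim):
    # Four gates, each with an input kernel, a recurrent kernel and a bias.
    return 4 * (units * (input_dim + units) + units)


def dense_params(units, input_dim):
    # One weight per input-output pair plus one bias per output unit.
    return units * input_dim + units


# LSTM(256) on 1 feature -> Dense(256) -> Dense(1), as in the posted model.
total = lstm_params(256, 1) + dense_params(256, 256) + dense_params(1, 256)
print(total)  # 330241
```

With 2264 samples, that works out to roughly 146 parameters per sample.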

The for loop simply runs the same code ten times.

The result is below:

https://i.redd.it/jngtf9xx47g31.png

I haven’t changed anything.

I only ran the for loop.

But the MSE varies widely from one run to the next.

I think the reason for this is that the weights are initialized randomly on every run.
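
The effect of random initialization on the final error can be reproduced on a toy problem. The sketch below is not the posted model: `train_and_score` is a hypothetical stand-in for one fit-and-evaluate run, fitting a single weight with a few gradient steps from a random start. It shows that a fixed seed gives identical results, while different seeds give a spread of MSE values. For Keras itself, recent TensorFlow versions provide `tf.keras.utils.set_random_seed` to seed Python, NumPy, and TensorFlow together.

```python
import random
import statistics


def train_and_score(seed, steps=20, lr=0.05):
    """Hypothetical stand-in for one training run; returns the final MSE."""
    rng = random.Random(seed)            # all randomness comes from this seed
    xs = [0.1 * i for i in range(1, 11)]
    ys = [2.0 * x for x in xs]           # toy target: the true weight is 2.0
    w = rng.uniform(-5.0, 5.0)           # random weight initialization
    for _ in range(steps):               # too few steps to fully converge
        grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)
        w -= lr * grad
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def summarize_runs(seeds):
    """Return the mean and standard deviation of the MSE across seeds."""
    scores = [train_and_score(s) for s in seeds]
    return statistics.mean(scores), statistics.stdev(scores)


same_seed = [train_and_score(0) for _ in range(3)]   # identical results
mean_mse, std_mse = summarize_runs(range(10))        # spread across seeds
print(same_seed)
print(mean_mse, std_mse)
```

Reporting the mean and standard deviation over several seeds gives a fairer picture of a model than any single run.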

So I increased the number of epochs and the batch_size, but that did not solve the gradient divergence problem.
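
If the gradients themselves are blowing up, one common remedy is gradient clipping, which Keras optimizers expose through the `clipnorm` and `clipvalue` arguments. The plain-Python sketch below only illustrates the idea of clipping by norm. `clip_by_norm` is a hypothetical helper, not the Keras implementation, and whether clipping helps here is an assumption that would need testing on the actual data.

```python
import math


def clip_by_norm(grads, max_norm):
    """Rescale grads so that their L2 norm is at most max_norm."""
    norm = math.sqrt(sum(g * g for g in grads))
    if norm <= max_norm or norm == 0.0:
        return list(grads)               # already small enough: unchanged
    scale = max_norm / norm
    return [g * scale for g in grads]    # same direction, shorter length


clipped = clip_by_norm([3.0, 4.0], max_norm=1.0)   # norm 5 is scaled down to 1
small = clip_by_norm([0.3, 0.4], max_norm=1.0)     # norm 0.5 is left alone
print(clipped)
print(small)
```

Clipping bounds the size of each update but leaves its direction unchanged, so it limits divergence without affecting small, well-behaved gradients.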

I wonder how we should solve this problem.

Your valuable opinions and thoughts will be very much appreciated.

If you want to see the full source, here is the link: https://gist.github.com/Lay4U/e1fc7d036356575f4d0799cdcebed90e

submitted by /u/GoBacksIn