[D] Python – how can I solve this gradient divergence problem?
My setup: the dataset is a stock time series, the model has `Total params`: 330,241, and there are `samples`: 2,264. I ran exactly the same code in a loop ten times; the results are here: https://i.redd.it/jngtf9xx47g31.png

I didn't change anything between runs (I only re-ran the loop), but the difference in MSE across the results is very large. I think the reason is that the weights are initialized randomly. I increased the number of epochs and the batch size, but that did not solve the gradient divergence problem. How should this be solved? Your opinions and thoughts would be much appreciated.

Full source: https://gist.github.com/Lay4U/e1fc7d036356575f4d0799cdcebed90e
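To be concrete about what I mean: is pinning the random seeds the right fix, so every run starts from the same initial weights? Something like the sketch below is what I have in mind. The model here is just a placeholder LSTM regressor, not the actual network from my gist, and the input shape is assumed for illustration:

```python
import os
import random

import numpy as np
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

# Pin every RNG involved so each run starts from the same initial weights.
# (PYTHONHASHSEED ideally needs to be set before the interpreter starts.)
os.environ["PYTHONHASHSEED"] = "0"
random.seed(0)
np.random.seed(0)
tf.random.set_seed(0)  # on TF 1.x: tf.set_random_seed(0)

# Placeholder model standing in for the network from my gist: a small
# LSTM regressor trained with MSE, just to show where the seeding goes
# relative to model construction.
model = keras.Sequential([
    layers.LSTM(64, input_shape=(30, 1)),  # 30 timesteps, 1 feature (assumed)
    layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse")
```

Even with seeds pinned, I understand some GPU ops can still be nondeterministic, so maybe the real question is how much run-to-run variance is normal here.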