[D] Python – how can I solve a gradient divergence problem?
Here is my setup (the full code is linked at the bottom); the dataset is a stock time series.
`Total params`: 330,241
`samples`: 2264
I just ran the same code in a for loop ten times; a simplified sketch of what I mean is below.
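To be concrete, this is roughly the shape of the experiment (a self-contained sketch: the synthetic data and tiny dense model are stand-ins for my real stock data and network, which are in the gist linked at the bottom):

```python
# Simplified stand-in for the experiment: same training code, run ten times.
import numpy as np
import tensorflow as tf

rng = np.random.default_rng(0)
X = rng.normal(size=(2264, 10)).astype("float32")  # stand-in for the 2264 samples
y = rng.normal(size=(2264, 1)).astype("float32")

def build_model():
    # A fresh model means fresh, randomly initialized weights on every call.
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(10,)),
        tf.keras.layers.Dense(64, activation="relu"),
        tf.keras.layers.Dense(1),
    ])
    model.compile(optimizer="adam", loss="mse")
    return model

mse_per_run = []
for run in range(10):  # same code, ten times, nothing changed in between
    model = build_model()
    model.fit(X, y, epochs=5, batch_size=32, verbose=0)
    mse = model.evaluate(X, y, verbose=0)
    mse_per_run.append(mse)
    print(f"run {run}: MSE = {mse:.6f}")

print(f"mean = {np.mean(mse_per_run):.6f}, std = {np.std(mse_per_run):.6f}")
```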
And below are the results.
I haven't changed anything between runs; I only ran the for loop. Yet the MSE difference across the ten runs is very large.
I think the reason for this is that the weights are initialized randomly on every run. So I increased the number of epochs and the batch_size, but that did not solve the gradient divergence problem.
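To illustrate the hypothesis: if random initialization really is the cause, pinning the random seeds should make every run start from the same weights. A minimal sketch of what I mean (this assumes Keras on a TensorFlow 2.x backend and is not code from the gist; `fix_seeds` is a placeholder name of mine):

```python
# Minimal sketch of pinning the seeds so every run starts from the same
# initial weights (assumes TensorFlow 2.x; on TF 1.x the last call would be
# tf.set_random_seed). Even with seeds fixed, some GPU kernels are
# non-deterministic, so results may still differ slightly.
import os
import random

import numpy as np
import tensorflow as tf

def fix_seeds(seed: int = 42) -> None:
    os.environ["PYTHONHASHSEED"] = str(seed)  # hash-based operations
    random.seed(seed)                         # Python's built-in RNG
    np.random.seed(seed)                      # NumPy (shuffling, some inits)
    tf.random.set_seed(seed)                  # TensorFlow weight initializers

fix_seeds(42)  # call once, before building and training the model
```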
How should I solve this problem? Your opinions and thoughts would be much appreciated.
If you want to see the full source, here is the link: https://gist.github.com/Lay4U/e1fc7d036356575f4d0799cdcebed90e