[D] Stochastic Weight Averaging and the Ornstein-Uhlenbeck Process
A short blog post on Stochastic Weight Averaging (SWA) and the Ornstein-Uhlenbeck process. We discuss why SGD does not settle at the center of a wide, flat minimum and instead tends to sit near its boundary.
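
To make the claim concrete, here is a minimal toy sketch (not the derivation from the blog post). It assumes an isotropic quadratic loss L(w) = 0.5 * ||w||^2 with isotropic Gaussian gradient noise, so that constant-step-size SGD becomes a discretized Ornstein-Uhlenbeck process. The function name `simulate_noisy_sgd` and all parameter values are made up for illustration. Under these assumptions the individual iterates hover on a shell away from the minimum at w = 0, while their running average (the SWA-style estimate) lands much closer to the center.

```python
import numpy as np

def simulate_noisy_sgd(dim=100, steps=5000, lr=0.05, noise_std=1.0, seed=0):
    """Noisy gradient descent on L(w) = 0.5 * ||w||^2, whose minimum is w = 0.

    Assumption: gradient noise is isotropic Gaussian. The update
    w <- w - lr * (w + noise) is then a discretized Ornstein-Uhlenbeck process.
    """
    rng = np.random.default_rng(seed)
    w = np.zeros(dim)
    running_sum = np.zeros(dim)
    norms = np.empty(steps)
    for t in range(steps):
        grad = w + noise_std * rng.standard_normal(dim)  # true gradient plus noise
        w = w - lr * grad
        running_sum += w
        norms[t] = np.linalg.norm(w)  # distance of this iterate from the minimum
    w_avg = running_sum / steps  # plain average of all iterates
    return norms, w_avg

norms, w_avg = simulate_noisy_sgd()
typical_iterate_norm = norms[1000:].mean()  # skip the initial transient
averaged_norm = np.linalg.norm(w_avg)

print(f"typical distance of an SGD iterate from the minimum: {typical_iterate_norm:.3f}")
print(f"distance of the averaged weights from the minimum:   {averaged_norm:.3f}")
```

In this toy setting the per-coordinate noise is small, but in high dimension the norm of the iterate concentrates around a nonzero radius, which is one way to read "near the boundary" rather than "at the center". Real loss surfaces and real SGD noise are neither quadratic nor isotropic, so treat this only as intuition.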
submitted by /u/ArmenAg