Regularization by gradient descent and getting rid of pesky learning rates
6 years ago • 4 min read