Regularization by gradient descent and getting rid of pesky learning rates
3 years ago • 4 min read