Regularization by gradient descent and getting rid of pesky learning rates
News • 4 years ago • 4 min read