Optimizers: From SGD To AdamW
· ☕ 10 min read · ✍️ k4i
A mechanism-first guide to optimizers: what SGD, momentum, RMSProp, Adam, and AdamW each solve, why AdamW became a strong default for modern deep learning, and when other optimizers still matter.