Optimizers: From SGD To AdamW
10 min read · k4i
A mechanism-first guide to optimizers: what SGD, momentum, RMSProp, Adam, and AdamW each solve, why AdamW became a strong default for modern deep learning, and when other optimizers still matter.