Regularization
A unified treatment of regularization in deep learning. It starts from the bias-variance tradeoff and covers explicit penalties (L1, L2, weight decay); dropout and its variants (DropConnect, MC Dropout, Stochastic Depth); normalization layers (BatchNorm, LayerNorm, RMSNorm); early stopping; data augmentation (Mixup, Cutout, CutMix, RandAugment); output regularizers (label smoothing, confidence penalty); and implicit regularization from initialization, SGD noise, and spectral normalization.