Gradient Descent is an iterative optimization algorithm for finding a local minimum of a differentiable function. In machine learning, it is used to minimize the loss function (a measure of the error between the model's predictions and the actual targets) by repeatedly adjusting the model's parameters (weights) in the direction opposite to the gradient of the loss.
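As a minimal sketch of the update rule described above, the following Python snippet minimizes the simple function f(x) = (x - 3)^2, whose gradient is 2(x - 3). The learning rate and step count are illustrative choices, not tuned values:

```python
def gradient_descent(grad, x0, learning_rate=0.1, steps=100):
    """Repeatedly step opposite the gradient to approach a local minimum."""
    x = x0
    for _ in range(steps):
        x = x - learning_rate * grad(x)  # move against the gradient
    return x

# Minimize f(x) = (x - 3)^2; its gradient is f'(x) = 2 * (x - 3).
minimum = gradient_descent(lambda x: 2 * (x - 3), x0=0.0)
print(round(minimum, 4))  # converges toward x = 3
```

The same loop generalizes to many parameters: in a neural network, `x` becomes a weight vector and `grad` is computed by backpropagation.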
The method is attributed to Augustin-Louis Cauchy (1847), who proposed it for solving systems of equations arising in astronomical calculations.
It is the fundamental optimization technique underlying almost all neural networks and deep learning models. Variants like Stochastic Gradient Descent (SGD) and Adam are standard in training modern AI.
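To illustrate the stochastic variant mentioned above, here is a hedged sketch of SGD fitting a one-parameter linear model y = w * x by sampling one data point per update. The synthetic data, learning rate, and epoch count are illustrative assumptions:

```python
import random

def sgd_fit(data, lr=0.01, epochs=500, seed=0):
    """Stochastic gradient descent on squared error for y = w * x."""
    rng = random.Random(seed)
    w = 0.0
    for _ in range(epochs):
        x, y = rng.choice(data)       # one random sample per update
        grad = 2 * (w * x - y) * x    # gradient of (w*x - y)^2 w.r.t. w
        w -= lr * grad                # step opposite the gradient
    return w

data = [(x, 2.0 * x) for x in range(1, 6)]  # true slope is 2
print(round(sgd_fit(data), 3))  # converges toward w = 2
```

Because each update uses only a single sample (or, in practice, a small minibatch) rather than the full dataset, SGD is far cheaper per step on large datasets, at the cost of noisier updates; optimizers like Adam build on this by adapting the step size per parameter.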