Gradient Descent Step Calculator
Gradient descent is the workhorse of machine learning optimisation. Each step nudges a parameter against the gradient of the cost function to lower the cost. Enter the current parameter value, the learning rate, and the gradient at that point, and this calculator returns the step size and the updated parameter for one iteration. If you also supply a slope, it returns the next gradient as well.
Gradient descent update
step = eta * gradient
new parameter = current parameter - step
move direction = decrease if gradient > 0, increase if gradient < 0, no change if gradient = 0
absolute change = | step |
One update moves the parameter against the gradient by the learning rate times the gradient. Repeat with the new gradient at the updated point until the gradient is near zero.
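The formulas above can be sketched in a few lines of Python. This is a minimal illustration, not the calculator's actual code: the function name gradient_descent_step and the dictionary keys are hypothetical, and the direction labels assume a positive learning rate.

```python
def gradient_descent_step(current, learning_rate, gradient):
    """Perform one gradient descent update (illustrative sketch).

    Assumes learning_rate > 0, so the sign of the gradient alone
    decides whether the parameter decreases or increases.
    """
    step = learning_rate * gradient          # step = eta * gradient
    new_parameter = current - step           # move against the gradient

    if gradient > 0:
        direction = "decrease"
    elif gradient < 0:
        direction = "increase"
    else:
        direction = "no change"              # zero gradient: parameter stays put

    return {
        "step": step,
        "new_parameter": new_parameter,
        "direction": direction,
        "absolute_change": abs(step),
    }


result = gradient_descent_step(current=10.0, learning_rate=0.1, gradient=20.0)
print(f"step = {result['step']:.2f}, new parameter = {result['new_parameter']:.2f}")
```

Returning all four quantities together mirrors the four lines of the formula block, so each output can be checked against its definition.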
Worked example
Minimising f(x) = x squared, the gradient is 2x. At x = 10 the gradient is 20. With learning rate 0.1, the step is 0.1 times 20 = 2.00 and the updated parameter is 10 minus 2 = 8.00, moving toward the minimum at zero.
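To see what happens when the update is repeated, the worked example can be extended over several iterations. The sketch below assumes the same cost f(x) = x squared with gradient 2x; the function name minimise_square is hypothetical.

```python
def minimise_square(x0, learning_rate, iterations):
    """Repeat the gradient descent update on f(x) = x**2 (illustrative sketch).

    Returns the list of parameter values, starting with x0.
    """
    x = x0
    history = [x]
    for _ in range(iterations):
        gradient = 2 * x                  # gradient of x**2 at the current point
        x = x - learning_rate * gradient  # one gradient descent update
        history.append(x)
    return history


history = minimise_square(x0=10.0, learning_rate=0.1, iterations=3)
print([round(value, 2) for value in history])
```

With a learning rate of 0.1, each update multiplies x by 0.8, so the first value after 10 is 8, matching the worked example, and later values keep shrinking toward zero.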
Gradient descent: frequently asked questions
What is a gradient descent step?
Gradient descent moves a parameter in the direction that lowers a cost function. One step subtracts the learning rate times the gradient from the current value. Repeating this drives the parameter toward a minimum, provided the learning rate is suitable.
What is the update rule?
The new parameter equals the old parameter minus the learning rate times the gradient evaluated at the old parameter: new = old - eta * g. The step is the learning rate times the gradient. This calculator performs one such update.
How does the learning rate affect the step?
A larger learning rate takes bigger steps, which can converge faster but may overshoot or diverge. A smaller rate is more stable but slower. The calculator shows the exact step size so you can judge whether it is reasonable for your problem.
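The trade-off can be illustrated on the same cost f(x) = x squared, where each update multiplies x by (1 - 2 * eta). This is a sketch under that assumption only; the function name final_distance and the chosen rates are illustrative, and other cost functions will have different thresholds.

```python
def final_distance(learning_rate, x0=10.0, iterations=20):
    """Distance from the minimum at zero after repeated updates on f(x) = x**2."""
    x = x0
    for _ in range(iterations):
        x -= learning_rate * 2 * x   # gradient of x**2 is 2x
    return abs(x)


for eta in (0.01, 0.1, 1.0, 1.1):
    print(f"eta = {eta}: distance from minimum = {final_distance(eta):.4g}")
```

For this cost, the small rate 0.01 is stable but still far from zero after 20 updates, 0.1 gets much closer, 1.0 bounces between 10 and -10 without improving, and 1.1 moves further away on every update.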
Why subtract the gradient rather than add it?
The gradient points in the direction of steepest increase of the cost. To reduce the cost you move in the opposite direction, so you subtract the gradient. Adding it would be gradient ascent, used to maximise instead.
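A quick numerical check makes the sign choice concrete. The sketch below again assumes f(x) = x squared, starting at x = 10 with learning rate 0.1; the variable names are illustrative.

```python
def cost(x):
    return x ** 2


def gradient(x):
    return 2 * x


x = 10.0
eta = 0.1

descent = x - eta * gradient(x)   # subtract the gradient: gradient descent
ascent = x + eta * gradient(x)    # add the gradient: gradient ascent

print(f"start:   x = {x:.2f}, cost = {cost(x):.2f}")
print(f"descent: x = {descent:.2f}, cost = {cost(descent):.2f}")
print(f"ascent:  x = {ascent:.2f}, cost = {cost(ascent):.2f}")
```

Subtracting the gradient lowers the cost below its starting value, while adding it raises the cost.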
Sources
- NIST Digital Library of Mathematical Functions: Iterative methods and optimisation.
Reviewed by the CalculatorHub team, edited by James Graham, 19 June 2026. See our methodology.