
Getting a Concrete Feel for Gradient Descent

Let’s get a concrete feel for gradient descent (GD) in a single variable. Using the method of gradient descent, find the minimum of a function $f(x)$.

Recipe for 1D GD

(1) Start with any value for $x$, or start with $x = 0$.

(2) Calculate $f(x_{\text{old}})$.

(3) Calculate $f'(x_{\text{old}})$.

(4) Define a learning rate $\alpha > 0$.

(5) Update $x_{\text{new}} = x_{\text{old}} - \alpha f'(x_{\text{old}})$.

(6) Repeat (2)–(5), with $x_{\text{old}} \leftarrow x_{\text{new}}$, until convergence.
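The recipe above can be sketched in a few lines of Python. The target function isn’t shown in the text, so as a hypothetical stand-in we use $f(x) = (x-5)^2$, whose minimum at $x = 5$ matches the later discussion; the starting point, learning rate, and tolerance are likewise assumed.

```python
def f(x):
    # assumed stand-in for the document's (unshown) function
    return (x - 5) ** 2

def f_prime(x):
    # derivative of the assumed f
    return 2 * (x - 5)

def gradient_descent_1d(x0=0.0, alpha=0.1, tol=1e-8, max_iter=1000):
    x_old = x0                                   # step (1): initial value
    for _ in range(max_iter):
        x_new = x_old - alpha * f_prime(x_old)   # step (5): update rule
        if abs(x_new - x_old) < tol:             # step (6): convergence check
            return x_new
        x_old = x_new
    return x_old

print(gradient_descent_1d())  # a value very close to 5
```

With $\alpha = 0.1$ each step shrinks the distance to the minimum by a constant factor, so the loop converges geometrically.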

Concrete Example

Iteration 1

Iteration 2

$x_{\text{new}} = x_{\text{old}} - \alpha f'(x_{\text{old}})$

Iteration 3
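The numeric values of Iterations 1–3 did not survive in the text. This sketch reproduces three iterations under assumed choices ($f(x) = (x-5)^2$, $x_0 = 0$, $\alpha = 0.1$), which are illustrative only:

```python
x = 0.0        # assumed starting point
alpha = 0.1    # assumed learning rate

for i in range(1, 4):
    x = x - alpha * 2 * (x - 5)   # x_new = x_old - alpha * f'(x_old)
    print(f"Iteration {i}: x = {x:.4f}")
# after three steps, x ≈ 2.44 -- already well on its way toward 5
```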

Convergence Visualization

As the number of iterations tends to infinity, $x_n \to 5$, and this agrees with our graph.

Why does GD specifically work here?

Let $n$ index the iteration number.

There are two cases: $x_n$ lies to the left of 5 ($x_n < 5$), or $x_n$ lies to the right of 5 ($x_n > 5$).

Case 1: $x_n < 5$.

To the left of the minimum the slope is negative: $f'(x_n) < 0$.

Multiply both sides by $-\alpha$; since $\alpha > 0$, this flips the inequality: $-\alpha f'(x_n) > 0$.

So $x_{n+1} = x_n - \alpha f'(x_n) > x_n$ by the boxed equality: each step moves $x_n$ to the right, toward 5.

Case 2: $x_n > 5$.

Symmetrically, to the right of the minimum the slope is positive: $f'(x_n) > 0$, so $-\alpha f'(x_n) < 0$ and $x_{n+1} = x_n - \alpha f'(x_n) < x_n$: each step moves $x_n$ to the left, toward 5. In both cases the iterates move toward the minimizer.
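The two cases can be checked numerically. Again assuming $f(x) = (x-5)^2$ as a hypothetical stand-in for the document’s function, iterates started left of 5 increase monotonically toward 5, and iterates started right of 5 decrease monotonically toward 5:

```python
def iterates(x0, n, alpha=0.1):
    # record n gradient-descent steps on the assumed f(x) = (x - 5)**2
    xs = [x0]
    for _ in range(n):
        xs.append(xs[-1] - alpha * 2 * (xs[-1] - 5))  # x - alpha * f'(x)
    return xs

left = iterates(0.0, 5)    # Case 1: starts left of 5
right = iterates(10.0, 5)  # Case 2: starts right of 5

# Case 1: each iterate is strictly larger, never overshooting 5
print(all(a < b <= 5 for a, b in zip(left, left[1:])))    # True
# Case 2: each iterate is strictly smaller, never undershooting 5
print(all(5 <= b < a for a, b in zip(right, right[1:])))  # True
```

With this $\alpha$ the step never overshoots, so convergence is monotone from either side, exactly as the two cases argue.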