Gradient Descent Can Take Exponential Time to Escape Saddle Points

less than 1 minute read

Published: September 16, 2019

time for theory papers! focus on works from Simon S. Du this time

for non-convex problem, GD can find a stationary point in poly. time

stationary point? saddle points, local minimizer
GD can escape saddle point
- given that 1. local min. are global min. 2. saddle points are strict
- hold for ML problems!2
- incorporate perturbations: escape in poly. time
- no perturbations: proofed to be exp. time <- this work

examples

initialization within a thin band
initialization far away from saddle point
what about natural cases?
- exists a function s.t. w.h.p. GD cannot visit second-order stationary point in T <= exp(omega(d))
- excape 2 or more saddle points sequentially

Share on

Twitter Facebook LinkedIn

You May Also Enjoy

Human Object Interaction

2 minute read

Published: June 18, 2022

Learning-Based Image Synthesis

3 minute read

Published: September 24, 2021

Advances in Neural Rendering

5 minute read

Published: September 22, 2021

Self Attention for Computer Vision

2 minute read

Published: September 21, 2021