NOTE: THIS DOCUMENT IS OBSOLETE, PLEASE CHECK THE NEW VERSION: "Introduction to Digital Filters with Audio Applications", by Julius O. Smith III, Copyright © 2017-11-26 by Julius O. Smith III - Center for Computer Research in Music and Acoustics (CCRMA), Stanford University
Gradient Descent
Concavity (convexity) is valuable in connection with the Gradient Method of minimizing $J(\hat{\theta})$ with respect to $\hat{\theta}$.
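Definition. The gradient of the error measure $J(\hat{\theta})$ is defined as the $\hat{N}\times 1$ column vector
$$J'(\hat{\theta}) \triangleq \left[\frac{\partial J(\hat{\theta})}{\partial \hat{\theta}[1]}, \ldots, \frac{\partial J(\hat{\theta})}{\partial \hat{\theta}[\hat{N}]}\right]^T$$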
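As an illustrative sketch (not part of the original derivation), the gradient can be checked numerically by perturbing one component of the parameter vector at a time. The two-parameter error measure `J` and the helper name `gradient` below are hypothetical choices made only for this example.

```python
def J(theta):
    """Hypothetical two-parameter error measure, used only for illustration."""
    x, y = theta
    return (x - 1.0) ** 2 + 2.0 * (y + 0.5) ** 2 + x * y


def gradient(f, theta, h=1e-6):
    """Approximate the gradient of f at theta with central differences.

    Returns one partial derivative per component of theta.
    """
    grad = []
    for i in range(len(theta)):
        up = list(theta)
        down = list(theta)
        up[i] += h
        down[i] -= h
        grad.append((f(up) - f(down)) / (2.0 * h))
    return grad


theta = [0.3, -0.2]
numeric = gradient(J, theta)

# Partial derivatives worked out by hand for this particular J:
#   dJ/dx = 2(x - 1) + y,   dJ/dy = 4(y + 0.5) + x
analytic = [2.0 * (theta[0] - 1.0) + theta[1],
            4.0 * (theta[1] + 0.5) + theta[0]]
```

The two lists `numeric` and `analytic` should agree to within the finite-difference error, which is a convenient sanity check before a hand-derived gradient is used in a descent iteration.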
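Definition. The Gradient Method (Cauchy) is defined as follows.
Given $\hat{\theta}_0\in\hat{\Theta}$, compute
$$\hat{\theta}_{n+1} = \hat{\theta}_n - t_n J'(\hat{\theta}_n), \qquad n=0,1,2,\ldots,$$
where $J'(\hat{\theta}_n)$ is the gradient of $J$ at $\hat{\theta}_n$, and $t_n\in\Re$ is chosen as the smallest nonnegative local minimizer of
$$\Phi_n(t) \triangleq J\left(\hat{\theta}_n - t\,J'(\hat{\theta}_n)\right).$$
Cauchy originally proposed to find the value of $t_n\geq 0$ which gave a global minimum of $\Phi_n(t)$. This, however, is not always feasible in practice.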
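The iteration can be sketched for a case where the line search has a closed form. The example assumes a hypothetical quadratic error measure J(theta) = (1/2) theta^T A theta - b^T theta with A symmetric positive definite; its gradient is A theta - b, and along the negative gradient g the one-dimensional function of the step size t is a parabola with the single minimizer t = (g^T g)/(g^T A g), which is therefore also the smallest nonnegative local minimizer. The matrix `A`, the vector `b`, and all function names are assumptions for illustration.

```python
A = [[3.0, 1.0], [1.0, 2.0]]  # assumed symmetric positive definite
b = [1.0, 1.0]


def matvec(M, v):
    """Matrix-vector product for nested lists."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def dot(u, v):
    """Inner product of two vectors."""
    return sum(x * y for x, y in zip(u, v))


def J(theta):
    """Quadratic error measure 0.5 * theta^T A theta - b^T theta."""
    return 0.5 * dot(theta, matvec(A, theta)) - dot(b, theta)


def grad_J(theta):
    """Gradient of the quadratic: A theta - b."""
    return [a - bi for a, bi in zip(matvec(A, theta), b)]


def gradient_method(theta0, max_iter=200, tol=1e-12):
    """Gradient method with exact line search (valid for this quadratic only)."""
    theta = list(theta0)
    history = [J(theta)]  # value of J at every iterate
    for _ in range(max_iter):
        g = grad_J(theta)
        gg = dot(g, g)
        if gg < tol:  # squared gradient norm is negligible: stop
            break
        t = gg / dot(g, matvec(A, g))  # minimizer of J(theta - t*g) over t
        theta = [x - t * gi for x, gi in zip(theta, g)]
        history.append(J(theta))
    return theta, history


theta_star, history = gradient_method([4.0, -3.0])
```

For this choice of `A` and `b` the minimizer solves A theta = b, that is theta = (0.2, 0.4), and the values stored in `history` never increase from one iterate to the next. For a general error measure no closed-form step size exists, and the step would have to come from a numerical one-dimensional search.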
Some general results regarding the Gradient Method are given below.
Theorem. If $\hat{\theta}_0$ is a local minimizer of $J$, and $J'(\hat{\theta}_0)$ exists, then $J'(\hat{\theta}_0)=0$.
Theorem. The gradient method is a descent method, i.e., $J(\hat{\theta}_{n+1})\leq J(\hat{\theta}_n)$.
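Definition. $J:\hat{\Theta}\rightarrow\Re^1$, $\hat{\Theta}\subset\Re^{\hat{N}}$, is said to be in the class $\mathcal{C}_k(\hat{\Theta})$ if all $k$th order partial derivatives of $J(\hat{\theta})$ with respect to the components of $\hat{\theta}$ are continuous on $\hat{\Theta}$.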
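Definition. The Hessian $J''(\hat{\theta})$ of $J$ at $\hat{\theta}$ is defined as the matrix of second-order partial derivatives,
$$J''(\hat{\theta})[i,j] \triangleq \frac{\partial^2 J(\hat{\theta})}{\partial\hat{\theta}[i]\,\partial\hat{\theta}[j]},$$
where $\hat{\theta}[i]$ denotes the $i$th component of $\hat{\theta}$, $i=1,\ldots,\hat{N}$, and $[i,j]$ denotes the matrix entry at the $i$th row and $j$th column.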
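The Hessian of every element of $\mathcal{C}_2(\hat{\Theta})$ is a symmetric matrix [Williamson et al. 1972]. This is because continuous second-order partials satisfy
$$\frac{\partial^2 J}{\partial\hat{\theta}[i]\,\partial\hat{\theta}[j]} = \frac{\partial^2 J}{\partial\hat{\theta}[j]\,\partial\hat{\theta}[i]}.$$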
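The symmetry can be observed numerically by differentiating each gradient component along each coordinate direction, so that the two mixed partials are obtained from separate computations. The smooth function behind `grad_J` below, J(x, y) = exp(xy) + x^2 y + sin(y), is a hypothetical example, and `hessian` is an assumed helper name.

```python
import math


def grad_J(theta):
    """Hand-derived gradient of the hypothetical J(x, y) = exp(x*y) + x**2 * y + sin(y)."""
    x, y = theta
    e = math.exp(x * y)
    return [y * e + 2.0 * x * y, x * e + x ** 2 + math.cos(y)]


def hessian(grad, theta, h=1e-5):
    """Approximate the Hessian by central differences of the gradient.

    Entry [i][j] differentiates the j-th gradient component along theta[i],
    so entries [0][1] and [1][0] come from different computations.
    """
    n = len(theta)
    H = [[0.0] * n for _ in range(n)]
    for i in range(n):
        up = list(theta)
        down = list(theta)
        up[i] += h
        down[i] -= h
        g_up, g_down = grad(up), grad(down)
        for j in range(n):
            H[i][j] = (g_up[j] - g_down[j]) / (2.0 * h)
    return H


H = hessian(grad_J, [0.4, -0.7])
asymmetry = abs(H[0][1] - H[1][0])  # expected to be at rounding-error level
```

Because this J has continuous second-order partials everywhere, the two off-diagonal entries agree up to finite-difference and rounding error.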
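Theorem. If $J\in\mathcal{C}_1(\hat{\Theta})$, then any cluster point $\hat{\theta}_\infty$ of the gradient sequence $\hat{\theta}_n$ is necessarily a stationary point, i.e., $J'(\hat{\theta}_\infty)=0$.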
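Theorem. Let $\overline{\hat{\Theta}}$ denote the concave hull of $\hat{\Theta}\subset\Re^{\hat{N}}$. If $J\in\mathcal{C}_2(\overline{\hat{\Theta}})$, and there exist positive constants $c$ and $C$ such that
$$c\,\|\eta\|^2 \leq \eta^T J''(\hat{\theta})\,\eta \leq C\,\|\eta\|^2 \tag{5}$$
for all $\hat{\theta}\in\overline{\hat{\Theta}}$ and for all $\eta\in\Re^{\hat{N}}$, then the gradient method beginning with any point in $\hat{\Theta}$ converges to a point $\hat{\theta}^*$. Moreover, $\hat{\theta}^*$ is the unique global minimizer of $J$ in $\Re^{\hat{N}}$.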
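By the norm equivalence theorem [Ortega 1972], Eq. (5) is satisfied whenever $\sqrt{\eta^T J''(\hat{\theta})\,\eta}$ is a norm on $\Re^{\hat{N}}$ for each $\hat{\theta}\in\hat{\Theta}$. Since $J$ belongs to $\mathcal{C}_2(\hat{\Theta})$, its Hessian $J''(\hat{\theta})$ is a symmetric matrix. It is also bounded since it is continuous over a compact set. Thus a sufficient requirement is that $J''$ be positive definite on $\hat{\Theta}$. Positive definiteness of $J''$ can be viewed as ``positive curvature'' of $J$ at each point of $\hat{\Theta}$, which corresponds to strict concavity of $J$ on $\hat{\Theta}$.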
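For a quadratic error measure the Hessian is a constant symmetric matrix, and the constants in Eq. (5) can be taken as its smallest and largest eigenvalues. The following sketch checks this for a hypothetical 2-by-2 Hessian `A`; the matrix and the helper names `eig_sym_2x2` and `quad_form` are assumptions for illustration.

```python
import math

A = [[3.0, 1.0], [1.0, 2.0]]  # hypothetical constant Hessian (symmetric)


def eig_sym_2x2(M):
    """Eigenvalues (smallest, largest) of a symmetric 2x2 matrix."""
    a, b, d = M[0][0], M[0][1], M[1][1]
    mean = 0.5 * (a + d)
    radius = math.hypot(0.5 * (a - d), b)
    return mean - radius, mean + radius


def quad_form(M, eta):
    """Quadratic form eta^T M eta."""
    return sum(eta[i] * M[i][j] * eta[j] for i in range(2) for j in range(2))


c, C = eig_sym_2x2(A)
positive_definite = c > 0.0  # both eigenvalues positive

# Sample unit vectors eta around the circle; since ||eta|| = 1, Eq. (5)
# reduces to c <= eta^T A eta <= C.
bounds_hold = all(
    c - 1e-12 <= quad_form(A, [math.cos(0.1 * k), math.sin(0.1 * k)]) <= C + 1e-12
    for k in range(63)
)
```

Here both eigenvalues are positive, so the matrix is positive definite and the two-sided bound holds for every sampled direction. When the Hessian varies with the parameters, the same check would have to hold uniformly over the whole hull, which is what the theorem requires.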