Proving Theorem 2.2 requires some background from linear
algebra and multivariable calculus, which I will now review.
I need to apply the following result, which can be
easily proved from the Fundamental Theorem of Calculus:
Theorem 2.3
Suppose $F:\mathbf{R}^n\rightarrow\mathbf{R}^m$ is continuously differentiable and $x,y\in\mathbf{R}^n$. Then

$F(y)-F(x)=\int_0^1 J(x+t(y-x))(y-x)\,dt,$        (4)

where $J$ is the Jacobian of $F$.
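For example, when $n=m=1$, (4) reduces to the Fundamental Theorem of Calculus in the form $F(y)-F(x)=\int_0^1 F'(x+t(y-x))(y-x)\,dt$, which follows from $F(y)-F(x)=\int_x^y F'(u)\,du$ by the substitution $u=x+t(y-x)$.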
The integral of a vector-valued function, as in (4), is interpreted
as the vector whose components are the integrals of the components of the
integrand. I also need the triangle inequality for integrals:
Theorem 2.4
If $f:[a,b]\rightarrow\mathbf{R}^n$ is integrable over the interval $[a,b]$, then

$\left\|\int_a^b f(t)\,dt\right\|\le\int_a^b\|f(t)\|\,dt.$        (5)
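For $n=1$, (5) is the familiar estimate $\left|\int_a^b f(t)\,dt\right|\le\int_a^b|f(t)|\,dt$; Theorem 2.4 extends it to vector-valued integrands, with the Euclidean norm in place of the absolute value.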
In order to estimate the errors in Newton's method, I will need to
use a matrix norm. The reader should recall that the Euclidean norm of a
vector $x\in\mathbf{R}^n$ is $\|x\|=\left(\sum_{i=1}^n x_i^2\right)^{1/2}$.
The space $\mathbf{R}^{m\times n}$ of $m\times n$ matrices is a vector space,
since such matrices can be added and multiplied by scalars in a fashion
analogous to Euclidean vectors. Many norms could be defined on
$\mathbf{R}^{m\times n}$, but, as I will show, the following operator norm
has significant advantages for analysis:
Definition 2.6
Given any $A\in\mathbf{R}^{m\times n}$, the norm of $A$ is defined by

$\|A\|=\max\left\{\|Ax\|\,:\,x\in\mathbf{R}^n,\ \|x\|=1\right\}.$        (6)

The vector norms used on the right-hand side of (6) are the Euclidean
norms on $\mathbf{R}^n$ and $\mathbf{R}^m$, and the matrix norm is called
the operator norm induced by the Euclidean norm.
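For example, if $D\in\mathbf{R}^{n\times n}$ is diagonal with diagonal entries $d_1,\ldots,d_n$, then $\|Dx\|^2=\sum_{i=1}^n d_i^2x_i^2\le\left(\max_i d_i^2\right)\|x\|^2$, with equality when $x$ is a standard basis vector corresponding to a largest $|d_i|$, so $\|D\|=\max_i|d_i|$.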
Two properties of the operator norm are key in analyzing errors,
particularly in producing upper bounds: $\|Ax\|\le\|A\|\,\|x\|$ for all
$x\in\mathbf{R}^n$, and $\|AB\|\le\|A\|\,\|B\|$ whenever the product $AB$ is defined.
The next fact I need involves both linear algebra and analysis.
Theorem 2.8
Suppose $J:\mathbf{R}^n\rightarrow\mathbf{R}^{n\times n}$ is a continuous matrix-valued function.
If $J(x^*)$ is nonsingular, then there exists $\delta>0$ such that, for all
$x\in\mathbf{R}^n$ with $\|x-x^*\|<\delta$, $J(x)$ is nonsingular and

$\|J(x)^{-1}\|\le 2\|J(x^*)^{-1}\|.$

This theorem implies that the set of nonsingular matrices is an open set.
The second part of the theorem follows from the fact that, if
$x\mapsto J(x)$ is continuous, then so is $x\mapsto J(x)^{-1}$ wherever this second map is
defined.
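The scalar case illustrates the idea: if $n=1$, $J(x)=(j(x))$ with $j$ continuous, and $j(x^*)\ne 0$, then $|j(x)|\ge|j(x^*)|/2$ for all $x$ sufficiently close to $x^*$, so $j(x)\ne 0$ and $|j(x)^{-1}|\le 2|j(x^*)^{-1}|$.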
Finally, I need to define Lipschitz continuity.
Definition 2.9
Suppose $S\subset\mathbf{R}^n$ and $F:S\rightarrow\mathbf{R}^m$. Then $F$ is said to be Lipschitz
continuous on $S$ if there exists a positive constant $L$ such that

$\|F(x)-F(y)\|\le L\|x-y\|$ for all $x,y\in S$.
The same definition can be applied to a matrix-valued function
(like the Jacobian), using a matrix norm to measure
the size of $J(x)-J(y)$. The meaning of Lipschitz continuity is clear:
the difference $F(x)-F(y)$ is, roughly speaking, bounded in size by a constant multiple of the size of $x-y$.
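For example, an affine function $F(x)=Ax+b$ satisfies $F(x)-F(y)=A(x-y)$, so it is Lipschitz continuous on $\mathbf{R}^n$ with constant $L=\|A\|$; its Jacobian $J(x)=A$ is constant and therefore trivially Lipschitz continuous.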
I can now prove Theorem 2.2. I begin with the definition of
the Newton iteration,

$x^{(k+1)}=x^{(k)}-J(x^{(k)})^{-1}F(x^{(k)}),$

assuming that $x^{(k)}$ is close enough to $x^*$ that
$J(x^{(k)})$ is nonsingular.
I then subtract $x^*$ from both sides to obtain

$x^{(k+1)}-x^*=x^{(k)}-x^*-J(x^{(k)})^{-1}F(x^{(k)}).$

Since, by assumption, $F(x^*)=0$, I can write this as

$x^{(k+1)}-x^*=x^{(k)}-x^*-J(x^{(k)})^{-1}\left(F(x^{(k)})-F(x^*)\right).$
I now use (4) to estimate $F(x^{(k)})-F(x^*)$:

$F(x^{(k)})-F(x^*)=\int_0^1 J(x^*+t(x^{(k)}-x^*))(x^{(k)}-x^*)\,dt
=J(x^*)(x^{(k)}-x^*)+\int_0^1\left[J(x^*+t(x^{(k)}-x^*))-J(x^*)\right](x^{(k)}-x^*)\,dt.$

Therefore, by the triangle inequality (5) and the Lipschitz continuity of $J$,

$\left\|F(x^{(k)})-F(x^*)-J(x^*)(x^{(k)}-x^*)\right\|
\le\int_0^1\left\|J(x^*+t(x^{(k)}-x^*))-J(x^*)\right\|\left\|x^{(k)}-x^*\right\|\,dt
\le\int_0^1 Lt\left\|x^{(k)}-x^*\right\|^2\,dt
=\frac{L}{2}\left\|x^{(k)}-x^*\right\|^2.$

(The reader should notice that, without the Lipschitz continuity of $J$,
I can conclude only that

$\left\|F(x^{(k)})-F(x^*)-J(x^*)(x^{(k)}-x^*)\right\|=o\!\left(\left\|x^{(k)}-x^*\right\|\right),$

but I need the Lipschitz continuity and the above argument to get the stronger
estimate

$\left\|F(x^{(k)})-F(x^*)-J(x^*)(x^{(k)}-x^*)\right\|\le\frac{L}{2}\left\|x^{(k)}-x^*\right\|^2.$)
I now have

$x^{(k+1)}-x^*=\left(I-J(x^{(k)})^{-1}J(x^*)\right)(x^{(k)}-x^*)-J(x^{(k)})^{-1}\left(F(x^{(k)})-F(x^*)-J(x^*)(x^{(k)}-x^*)\right),$

and so

$\left\|x^{(k+1)}-x^*\right\|\le\left\|I-J(x^{(k)})^{-1}J(x^*)\right\|\left\|x^{(k)}-x^*\right\|+\frac{L}{2}\left\|J(x^{(k)})^{-1}\right\|\left\|x^{(k)}-x^*\right\|^2.$

I use the Lipschitz continuity of $J$ again, this time
to estimate the size of $I-J(x^{(k)})^{-1}J(x^*)$:

$\left\|I-J(x^{(k)})^{-1}J(x^*)\right\|=\left\|J(x^{(k)})^{-1}\left(J(x^{(k)})-J(x^*)\right)\right\|\le\left\|J(x^{(k)})^{-1}\right\|\left\|J(x^{(k)})-J(x^*)\right\|\le L\left\|J(x^{(k)})^{-1}\right\|\left\|x^{(k)}-x^*\right\|.$

I have now obtained

$\left\|x^{(k+1)}-x^*\right\|\le\frac{3L}{2}\left\|J(x^{(k)})^{-1}\right\|\left\|x^{(k)}-x^*\right\|^2.$
The final step is to recognize that, for all $x^{(k)}$ sufficiently close
to $x^*$,

$\frac{3L}{2}\left\|J(x^{(k)})^{-1}\right\|\le M,$        (7)

where $M=3L\left\|J(x^*)^{-1}\right\|$; this follows from Theorem 2.8.
Then, for $x^{(k)}$ sufficiently close to $x^*$,

$\left\|x^{(k+1)}-x^*\right\|\le M\left\|x^{(k)}-x^*\right\|^2.$        (8)
If

$\left\|x^{(k)}-x^*\right\|\le\frac{1}{2M},$        (9)

then

$\left\|x^{(k+1)}-x^*\right\|\le\frac{1}{2}\left\|x^{(k)}-x^*\right\|.$        (10)
I have now proved Theorem 2.2: If $x^{(0)}$ is chosen close
enough to $x^*$ that (7) and (9) both hold, then
(10) shows that $x^{(k)}\rightarrow x^*$, and (8) shows
that the convergence is quadratic.
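Estimate (8) is the source of the rapid convergence observed in practice: once $M\|x^{(k)}-x^*\|<1$, the error is essentially squared at each step, so the number of correct digits roughly doubles from one iteration to the next. As a concrete illustration of the iteration analyzed above, here is a minimal sketch in Python; the test system, starting point, tolerance, and iteration cap are illustrative choices made only for demonstration.

import numpy as np

def newton(F, J, x0, tol=1e-12, max_iter=20):
    """Apply x_{k+1} = x_k - J(x_k)^{-1} F(x_k) until ||F(x_k)|| <= tol."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        if np.linalg.norm(F(x)) <= tol:
            break
        # Solve J(x) s = F(x) rather than forming J(x)^{-1} explicitly.
        s = np.linalg.solve(J(x), F(x))
        x = x - s
    return x

# Hypothetical test system: F(x) = (x1^2 + x2^2 - 1, x1 - x2),
# with a root at (1/sqrt(2), 1/sqrt(2)) and a nonsingular Jacobian there.
F = lambda x: np.array([x[0]**2 + x[1]**2 - 1.0, x[0] - x[1]])
J = lambda x: np.array([[2.0*x[0], 2.0*x[1]], [1.0, -1.0]])
print(newton(F, J, [1.0, 0.5]))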
Mark S. Gockenbach
2003-01-23