
## Proof of quadratic convergence of Newton's method

Proving Theorem 2.2 requires some background from linear algebra and multivariable calculus, which I will now review.

I need to apply the following result, which can be easily proved from the Fundamental Theorem of Calculus:

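Theorem 2.3   Suppose $F:\mathbf{R}^n\to\mathbf{R}^m$ is continuously differentiable and $a,b\in\mathbf{R}^n$. Then

$$F(b)=F(a)+\int_0^1 J(a+\theta(b-a))(b-a)\,d\theta, \tag{4}$$

where $J$ is the Jacobian of $F$.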
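Formula (4) can be checked numerically. The following is a minimal sketch, not part of the original notes: the test function `F`, its Jacobian `J`, the points `a` and `b`, and the helper names are all illustrative assumptions. It approximates the integral by the midpoint rule and compares the result with $F(b)$.

```python
# Illustrative check of F(b) = F(a) + int_0^1 J(a + t(b-a)) (b-a) dt.
# F, J, a, b are hypothetical choices made for this sketch only.

def F(x):
    return [x[0]**2 + x[1]**2 - 4.0, x[0] * x[1] - 1.0]

def J(x):
    # Jacobian of F, computed by hand.
    return [[2.0 * x[0], 2.0 * x[1]],
            [x[1], x[0]]]

def matvec(A, v):
    return [sum(a * vj for a, vj in zip(row, v)) for row in A]

def integral_form(a, b, n=200):
    """Approximate F(a) + int_0^1 J(a + t(b-a)) (b-a) dt by the midpoint rule."""
    d = [bi - ai for ai, bi in zip(a, b)]
    total = [0.0] * len(F(a))
    for i in range(n):
        t = (i + 0.5) / n                      # midpoint of the i-th subinterval
        point = [ai + t * di for ai, di in zip(a, d)]
        term = matvec(J(point), d)             # integrand, componentwise
        total = [s + ti / n for s, ti in zip(total, term)]
    return [fa + s for fa, s in zip(F(a), total)]

a = [1.0, -0.5]
b = [2.0, 1.5]
lhs = F(b)
rhs = integral_form(a, b)
gap = max(abs(l - r) for l, r in zip(lhs, rhs))
print("largest componentwise difference:", gap)
```

For this particular `F` the integrand is linear in the integration variable, so the midpoint rule is exact up to rounding; for a general $F$ the agreement would only be as good as the quadrature.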

The integral of a vector-valued function, as in (4), is interpreted as the vector whose components are the integrals of the components of the integrand. I also need the triangle inequality for integrals:

Theorem 2.4   If $f:[a,b]\to\mathbf{R}^m$ is integrable over the interval $[a,b]$, then

$$\left\|\int_a^b f(t)\,dt\right\|\le\int_a^b\|f(t)\|\,dt. \tag{5}$$

In order to estimate the errors in Newton's method, I will need to use a matrix norm. The reader should recall the following definition:

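Definition 2.5   A norm for a vector space $X$ is a real-valued function $\|\cdot\|$ defined on $X$ satisfying the following properties:
1. $\|x\|\ge 0$ for all $x\in X$, and $\|x\|=0$ if and only if $x=0$;
2. $\|\alpha x\|=|\alpha|\,\|x\|$ for all $x\in X$ and all scalars $\alpha$;
3. $\|x+y\|\le\|x\|+\|y\|$ for all $x,y\in X$ (the triangle inequality).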

The space $\mathbf{R}^{m\times n}$ of $m\times n$ matrices is a vector space, since such matrices can be added and multiplied by scalars in a fashion analogous to Euclidean vectors. Many norms could be defined on $\mathbf{R}^{m\times n}$, but, as I will show, the following operator norm has significant advantages for analysis:

Definition 2.6   Given any $A\in\mathbf{R}^{m\times n}$, the norm of $A$ is defined by

$$\|A\|=\max_{x\neq 0}\frac{\|Ax\|}{\|x\|}. \tag{6}$$

The vector norms used on the right-hand side of (6) are the Euclidean norms on $\mathbf{R}^m$ and $\mathbf{R}^n$, and the matrix norm is called the operator norm induced by the Euclidean norm.

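Theorem 2.7   The norm defined by (6) has the following properties:
1. It is a norm on the space $\mathbf{R}^{m\times n}$;
2. $\|Ax\|\le\|A\|\,\|x\|$ for all $A\in\mathbf{R}^{m\times n}$ and $x\in\mathbf{R}^n$;
3. $\|AB\|\le\|A\|\,\|B\|$ for all $A\in\mathbf{R}^{m\times n}$ and $B\in\mathbf{R}^{n\times p}$.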

The second and third properties of the operator norm are key in analyzing errors, particularly in producing upper bounds.
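To make the operator norm and these two properties concrete, here is a small sketch under stated assumptions: the matrices `A` and `B` are arbitrary examples, and the norm is estimated by power iteration on $A^TA$, which is a technique chosen for this illustration and not something the notes prescribe. It relies on the standard fact that the Euclidean operator norm equals the largest singular value of the matrix.

```python
import math
import random

def transpose(A):
    return [list(col) for col in zip(*A)]

def matvec(A, v):
    return [sum(a * vj for a, vj in zip(row, v)) for row in A]

def matmul(A, B):
    Bt = transpose(B)
    return [[sum(a * b for a, b in zip(row, col)) for col in Bt] for row in A]

def norm2(v):
    return math.sqrt(sum(vi * vi for vi in v))

def operator_norm(A, iters=200):
    """Estimate max ||Ax||/||x|| by power iteration on A^T A.

    Assumes the (arbitrary) starting vector is not orthogonal to the
    dominant right singular vector of A.
    """
    At = transpose(A)
    x = [1.0 + 0.1 * i for i in range(len(A[0]))]
    for _ in range(iters):
        y = matvec(At, matvec(A, x))
        ny = norm2(y)
        if ny == 0.0:          # x was mapped to zero; stop iterating
            break
        x = [yi / ny for yi in y]
    return norm2(matvec(A, x)) / norm2(x)

# Hypothetical example matrices.
A = [[3.0, 0.0], [4.0, 5.0]]   # largest singular value is 3*sqrt(5)
B = [[1.0, 2.0], [0.0, 1.0]]   # largest singular value is 1 + sqrt(2)

norm_A = operator_norm(A)
norm_B = operator_norm(B)
norm_AB = operator_norm(matmul(A, B))

# Property 2: ||Ax|| <= ||A|| ||x||, sampled on random vectors.
random.seed(0)
samples = [[random.uniform(-1, 1), random.uniform(-1, 1)] for _ in range(1000)]
worst_ratio = max(norm2(matvec(A, x)) / norm2(x) for x in samples)

print("||A||  ~", norm_A, " largest sampled ||Ax||/||x|| =", worst_ratio)
print("||AB|| ~", norm_AB, " ||A|| ||B|| ~", norm_A * norm_B)
```

Random sampling can only approach the maximum in (6) from below, which is why the sampled ratio never exceeds the computed norm.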

The next fact I need involves both linear algebra and analysis.

Theorem 2.8   Suppose $J:\mathbf{R}^n\to\mathbf{R}^{n\times n}$ is a continuous matrix-valued function. If $J(x^*)$ is nonsingular, then there exists $\delta>0$ such that, for all $x\in\mathbf{R}^n$ with $\|x-x^*\|<\delta$, $J(x)$ is nonsingular and

$$\|J(x)^{-1}\|\le 2\|J(x^*)^{-1}\|.$$

This theorem implies that the set of nonsingular matrices is an open set. The second part of the theorem follows from the fact that, if $x\mapsto J(x)$ is continuous, then so is $x\mapsto J(x)^{-1}$ wherever this second map is defined.

Finally, I need to define Lipschitz continuity.

Definition 2.9   Suppose $F:\mathbf{R}^n\to\mathbf{R}^m$. Then $F$ is said to be Lipschitz continuous on $S\subset\mathbf{R}^n$ if there exists a positive constant $L$ such that

$$\|F(x)-F(y)\|\le L\|x-y\|\quad\text{for all }x,y\in S.$$

The same definition can be applied to a matrix-valued function (like the Jacobian), using a matrix norm to measure the size of $J(x)-J(y)$. The meaning of Lipschitz continuity is clear: the size of the difference $F(x)-F(y)$ is at most proportional to the size of $x-y$.
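As a small worked example (the function below is an illustrative choice, not one taken from these notes), consider $F(x)=(x_1^2+x_2^2-4,\;x_1x_2-1)$ on $\mathbf{R}^2$. Writing $d=x-y$, its Jacobian satisfies

$$J(x)=\begin{pmatrix}2x_1&2x_2\\x_2&x_1\end{pmatrix},\qquad J(x)-J(y)=\begin{pmatrix}2d_1&2d_2\\d_2&d_1\end{pmatrix}.$$

The operator norm of a matrix is never larger than the square root of the sum of the squares of its entries (the Frobenius norm), so

$$\|J(x)-J(y)\|\le\sqrt{4d_1^2+4d_2^2+d_2^2+d_1^2}=\sqrt{5}\,\|x-y\|.$$

In this example, then, $J$ is Lipschitz continuous on all of $\mathbf{R}^2$, and $L=\sqrt{5}$ is one admissible constant (not necessarily the smallest).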

I can now prove Theorem 2.2. I begin with the definition of the Newton iteration,

$$x^{(k+1)}=x^{(k)}-J(x^{(k)})^{-1}F(x^{(k)}),$$

assuming that $x^{(k)}$ is close enough to $x^*$ that $J(x^{(k)})$ is nonsingular. I then subtract $x^*$ from both sides to obtain

$$x^{(k+1)}-x^*=x^{(k)}-x^*-J(x^{(k)})^{-1}F(x^{(k)}).$$

Since, by assumption, $F(x^*)=0$, I can write this as

$$x^{(k+1)}-x^*=x^{(k)}-x^*-J(x^{(k)})^{-1}\left(F(x^{(k)})-F(x^*)\right).$$

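I now use (4) to estimate $F(x^{(k)})-F(x^*)$:

$$\begin{aligned}F(x^{(k)})-F(x^*)&=\int_0^1 J(x^*+\theta(x^{(k)}-x^*))(x^{(k)}-x^*)\,d\theta\\&=J(x^*)(x^{(k)}-x^*)+\int_0^1\left(J(x^*+\theta(x^{(k)}-x^*))-J(x^*)\right)(x^{(k)}-x^*)\,d\theta.\end{aligned}$$

Therefore, by (5) and the Lipschitz continuity of $J$ (with constant $L$),

$$\begin{aligned}\left\|F(x^{(k)})-F(x^*)-J(x^*)(x^{(k)}-x^*)\right\|&\le\int_0^1\left\|J(x^*+\theta(x^{(k)}-x^*))-J(x^*)\right\|\,\|x^{(k)}-x^*\|\,d\theta\\&\le\int_0^1 L\theta\,\|x^{(k)}-x^*\|^2\,d\theta=\frac{L}{2}\|x^{(k)}-x^*\|^2.\end{aligned}$$

(The reader should notice that, without the Lipschitz continuity of $J$, I can conclude only that $F(x^{(k)})-F(x^*)-J(x^*)(x^{(k)}-x^*)=o(\|x^{(k)}-x^*\|)$, but I need the Lipschitz continuity and the above argument to get the stronger estimate $O(\|x^{(k)}-x^*\|^2)$.)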

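I now have

$$x^{(k+1)}-x^*=\left(I-J(x^{(k)})^{-1}J(x^*)\right)(x^{(k)}-x^*)-J(x^{(k)})^{-1}\left(F(x^{(k)})-F(x^*)-J(x^*)(x^{(k)}-x^*)\right),$$

where the last factor is bounded in norm by $\frac{L}{2}\|x^{(k)}-x^*\|^2$, and so I use the Lipschitz continuity of $J$ again, this time to estimate the size of $I-J(x^{(k)})^{-1}J(x^*)$:

$$\left\|I-J(x^{(k)})^{-1}J(x^*)\right\|=\left\|J(x^{(k)})^{-1}\left(J(x^{(k)})-J(x^*)\right)\right\|\le L\,\|J(x^{(k)})^{-1}\|\,\|x^{(k)}-x^*\|.$$

I have now obtained

$$\|x^{(k+1)}-x^*\|\le L\,\|J(x^{(k)})^{-1}\|\,\|x^{(k)}-x^*\|^2+\frac{L}{2}\|J(x^{(k)})^{-1}\|\,\|x^{(k)}-x^*\|^2=\frac{3L}{2}\|J(x^{(k)})^{-1}\|\,\|x^{(k)}-x^*\|^2.$$

The final step is to recognize that, by Theorem 2.8, for all $x^{(k)}$ sufficiently close to $x^*$,

$$\|J(x^{(k)})^{-1}\|\le 2M, \tag{7}$$

where $M=\|J(x^*)^{-1}\|$. Then, for $x^{(k)}$ sufficiently close to $x^*$,

$$\|x^{(k+1)}-x^*\|\le 3LM\,\|x^{(k)}-x^*\|^2. \tag{8}$$

If

$$\|x^{(k)}-x^*\|\le\frac{1}{6LM}, \tag{9}$$

then

$$\|x^{(k+1)}-x^*\|\le\frac{1}{2}\|x^{(k)}-x^*\|. \tag{10}$$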

I have now proved Theorem 2.2: If $x^{(0)}$ is chosen close enough to $x^*$ that (7) and (9) both hold, then (10) shows that every later iterate is at least as close to $x^*$ (so that (7) and (9) continue to hold) and hence that $x^{(k)}\to x^*$, and (8) shows that the convergence is quadratic.
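The quadratic estimate can be observed on a small example. The sketch below is an illustration only: the system `F`, its Jacobian `J`, the starting point, and the helper `solve2` are assumptions made for this demonstration. It runs the Newton iteration on a 2-by-2 system whose solution is known in closed form and records the errors $\|x^{(k)}-x^*\|$ together with the ratios $\|x^{(k+1)}-x^*\|/\|x^{(k)}-x^*\|^2$.

```python
import math

def F(x):
    return [x[0]**2 + x[1]**2 - 4.0, x[0] * x[1] - 1.0]

def J(x):
    # Jacobian of F, computed by hand.
    return [[2.0 * x[0], 2.0 * x[1]],
            [x[1], x[0]]]

def solve2(A, b):
    """Solve a 2-by-2 linear system A s = b by Cramer's rule."""
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    if det == 0.0:
        raise ZeroDivisionError("singular Jacobian")
    return [(A[1][1] * b[0] - A[0][1] * b[1]) / det,
            (A[0][0] * b[1] - A[1][0] * b[0]) / det]

def dist(x, y):
    return math.sqrt(sum((xi - yi)**2 for xi, yi in zip(x, y)))

# One exact solution of F(x) = 0: x1*x2 = 1 and x1^2 + x2^2 = 4.
x_star = [(math.sqrt(6) + math.sqrt(2)) / 2, (math.sqrt(6) - math.sqrt(2)) / 2]

x = [2.0, 0.5]                      # starting point, assumed close to x_star
errors = [dist(x, x_star)]
for k in range(6):
    s = solve2(J(x), F(x))          # Newton step: J(x) s = F(x)
    x = [xi - si for xi, si in zip(x, s)]
    errors.append(dist(x, x_star))

for k in range(len(errors) - 1):
    # The ratio is only meaningful while the error is above rounding level.
    ratio = errors[k + 1] / errors[k]**2 if errors[k] > 1e-7 else float("nan")
    print(k, errors[k], ratio)
```

While the error is well above machine precision, the printed ratios should stay bounded, which is the behavior (8) predicts; once the error reaches rounding level the ratio no longer carries information, so the sketch reports `nan` there.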
Mark S. Gockenbach
2003-01-23