﻿ SVD-MPE: An SVD-Based Vector Extrapolation Method of Polynomial Type

Applied Mathematics
Vol. 7, No. 11 (2016), Article ID: 68747, 19 pages
DOI: 10.4236/am.2016.711111


Avram Sidi

Computer Science Department, Technion-Israel Institute of Technology, Haifa, Israel    Received 10 May 2016; accepted 18 July 2016; published 21 July 2016

ABSTRACT

An important problem that arises in different areas of science and engineering is that of computing the limits of sequences of vectors $\{x_m\}$, where $x_m \in \mathbb{C}^N$, $N$ being very large. Such sequences arise, for example, in the solution of systems of linear or nonlinear equations by fixed-point iterative methods, and their limits are simply the required solutions. In most cases of interest, however, these sequences converge to their limits extremely slowly. One practical way to make the sequences converge more quickly is to apply to them vector extrapolation methods. Two types of methods exist in the literature: polynomial type methods and epsilon algorithms. In most applications, the polynomial type methods have proved to be superior convergence accelerators. Three polynomial type methods are known: the minimal polynomial extrapolation (MPE), the reduced rank extrapolation (RRE), and the modified minimal polynomial extrapolation (MMPE). In this work, we develop yet another polynomial type method, which is based on the singular value decomposition, as well as the ideas that lead to MPE. We denote this new method by SVD-MPE. We also design a numerically stable algorithm for its implementation, whose computational cost and storage requirements are minimal. Finally, we illustrate the use of SVD-MPE with numerical examples.

Keywords:

Vector Extrapolation, Minimal Polynomial Extrapolation, Singular Value Decomposition, Krylov Subspace Methods

1. Introduction and Background

An important problem that arises in different areas of science and engineering is that of computing limits of sequences of vectors $\{x_m\}$, where $x_m \in \mathbb{C}^N$, the dimension $N$ being very large in many applications. Such vector sequences arise, for example, in the numerical solution of very large systems of linear or nonlinear equations by fixed-point iterative methods, and their limits are simply the required solutions to these systems. One common source of such systems is the finite-difference or finite-element discretization of continuum problems.

In most cases of interest, however, the sequences converge to their limits extremely slowly. That is, to approximate $s = \lim_{m\to\infty} x_m$ with a reasonable prescribed level of accuracy by some $x_m$, we need to consider very large values of $m$. Since the vectors $x_m$ are normally computed in the order $x_0, x_1, x_2, \ldots$, it is clear that we have to compute many such vectors until we reach one that has acceptable accuracy. Thus, this way of approximating $s$ via the $x_m$ becomes very expensive computationally.

Nevertheless, we may ask whether we can do something with those $x_m$ that are already available, to somehow obtain new approximations to $s$ that are better than each individual $x_m$ available. The answer to this question is in the affirmative for at least a large class of sequences that arise from fixed-point iteration of linear and nonlinear systems of equations. One practical way of achieving this is by applying to the sequence $\{x_m\}$ a suitable convergence acceleration method (or extrapolation method).

Of course, in case $\lim_{m\to\infty} x_m$ does not exist, it seems that no use could be made of the $x_m$. Now, if the sequence $\{x_m\}$ is generated by an iterative solution of a linear or nonlinear system of equations, it can be thought of as “diverging from” the solution $s$ of this system. We call $s$ the antilimit of $\{x_m\}$ in such a case. It turns out that vector extrapolation methods can be applied to such divergent sequences to obtain good approximations to the relevant antilimits, at least in some cases.

Two different types of vector extrapolation methods exist in the literature:

1) Polynomial type methods: The minimal polynomial extrapolation (MPE) of Cabay and Jackson  , the reduced rank extrapolation (RRE) of Kaniel and Stein  , Eddy  , and Mešina  , and the modified minimal polynomial extrapolation (MMPE) of Brezinski  , Pugachev  and Sidi, Ford, and Smith  .

2) Epsilon algorithms: The scalar epsilon algorithm (SEA) of Wynn  (which is actually a recursive procedure for implementing the transformation of Shanks  ), the vector epsilon algorithm (VEA) of Wynn  , and the topological epsilon algorithm (TEA) of Brezinski  .

The paper by Smith, Ford, and Sidi  gives a review of all these methods (except MMPE) that covers the developments in vector extrapolation methods until the end of the 1970s. For up-to-date reviews of MPE and RRE, see Sidi  and  . Numerically stable algorithms for implementing MPE and RRE are given in Sidi  , these algorithms being also economical both computationally and storagewise. Jbilou and Sadok  have developed an analogous algorithm for MMPE along the lines suggested in Sidi, Ford, and Smith  and Sidi  . For the convergence properties and error analyses of MPE, RRE, MMPE, and TEA, as these are applied to vector sequences generated by fixed-point iterative methods from linear systems, see the works by Sidi  -  , Sidi, Ford, and Smith  , Sidi and Bridger  , and Sidi and Shapira   . VEA has been studied by Brezinski   , Gekeler  , Wynn   , and Graves-Morris   .

Vector extrapolation methods are used effectively in various branches of science and engineering in accelerating the convergence of iterative methods that result from large sparse systems of equations.

All of these methods have the useful feature that their only input is the vector sequence whose convergence is to be accelerated, nothing else being needed. In most applications, however, the polynomial type methods, especially MPE and RRE, have proved to be superior convergence accelerators; they require much less computation than, and half as much storage as, the epsilon algorithms for the same accuracy.

In this work, we develop yet another polynomial type method, which is based on the singular value decomposition (SVD), as well as some ideas that lead to MPE. We denote this new method by SVD-MPE. We also design a numerically stable algorithm for its implementation, whose computational cost and storage requirements are minimal. The new method is described in the next section. In Section 3, we show how the error in the approximation produced by SVD-MPE can be estimated at zero cost in terms of the quantities already used in the construction of the approximation. In Section 4, we give a very efficient algorithm for implementing SVD-MPE. In Section 5, we derive determinant representations for the approximations produced by SVD-MPE, while in Section 6, we show that this method is a Krylov subspace method when applied to vector sequences that result from the solution of linear systems via fixed-point iterative schemes. Finally, in Section 7, we illustrate its use with two numerical examples.

Before closing, we state the (reduced version of the) well-known singular value decomposition (SVD) theorem. For different proofs, we refer the reader to Golub and Van Loan [11], Horn and Johnson [10], Stoer and Bulirsch [9], and Trefethen and Bau [8], for example.

Theorem 1.1 Let $A \in \mathbb{C}^{m\times n}$, $m \ge n$. Then there exist matrices $U \in \mathbb{C}^{m\times n}$ with orthonormal columns ($U^*U = I_n$) and $V \in \mathbb{C}^{n\times n}$ unitary, and a diagonal matrix $\Sigma = \mathrm{diag}(\sigma_1, \ldots, \sigma_n) \in \mathbb{R}^{n\times n}$, with $\sigma_1 \ge \sigma_2 \ge \cdots \ge \sigma_n \ge 0$, such that

$A = U \Sigma V^*.$

Furthermore, if $U = [\, u_1 \,|\, \cdots \,|\, u_n \,]$ and $V = [\, v_1 \,|\, \cdots \,|\, v_n \,]$, then

$A v_i = \sigma_i u_i, \quad A^* u_i = \sigma_i v_i, \quad i = 1, \ldots, n.$

In case $\mathrm{rank}(A) = r$, there holds $\sigma_1 \ge \cdots \ge \sigma_r > 0$, and the rest of the $\sigma_i$ are zero.

Remark: The $\sigma_i$ are called the singular values of $A$, and the $v_i$ and $u_i$ are called the corresponding right and left singular vectors of $A$, respectively. We also have

$A = \sum_{i=1}^{n} \sigma_i u_i v_i^*.$

2. Development of SVD-MPE

In what follows, we use boldface lower case letters for vectors and boldface upper case letters for matrices. In addition, we will be working with general inner products and the norms induced by them: These are defined as follows:

・ In $\mathbb{C}^N$, with $M \in \mathbb{C}^{N\times N}$ hermitian positive definite,

$(x, y)_M = x^* M y, \qquad \|x\|_M = \sqrt{(x, x)_M}.$ (2.1)

・ In $\mathbb{C}^{k+1}$, with $Q \in \mathbb{C}^{(k+1)\times(k+1)}$ hermitian positive definite,

$(c, d)_Q = c^* Q d, \qquad \|c\|_Q = \sqrt{(c, c)_Q}.$ (2.2)

Of course, the standard Euclidean inner product and the norm induced by it are obtained by letting $M = I$ in (2.1) and $Q = I$ in (2.2); we will denote these norms by $\|\cdot\|$ (we will denote by $I$ the identity matrix in every dimension).

2.1. Summary of MPE

We begin with a brief summary of minimal polynomial extrapolation (MPE). We use the ideas that follow to develop our new method.

Given the vector sequence $\{x_m\}$ in $\mathbb{C}^N$, we define

$u_m = \Delta x_m = x_{m+1} - x_m, \quad m = 0, 1, \ldots,$ (2.3)

and, for some fixed $n$, define the matrices $U_j \equiv U_j^{(n)} \in \mathbb{C}^{N\times(j+1)}$ via

$U_j = [\, u_n \,|\, u_{n+1} \,|\, \cdots \,|\, u_{n+j} \,], \quad j = 0, 1, \ldots.$ (2.4)

Clearly, there is an integer $k_0 \le N$, such that the matrices $U_j$, $j = 0, 1, \ldots, k_0 - 1$, are of full rank, but $U_{k_0}$ is not; that is,

$\mathrm{rank}(U_j) = j + 1, \quad j = 0, 1, \ldots, k_0 - 1; \qquad \mathrm{rank}(U_{k_0}) \le k_0.$ (2.5)

(Of course, this is the same as saying that $\{u_n, \ldots, u_{n+k_0-1}\}$ is a linearly independent set, but $\{u_n, \ldots, u_{n+k_0}\}$ is not.) Following this, we pick a positive integer $k \le k_0$ and let the vector $c' = [c_0, c_1, \ldots, c_{k-1}]^T$ be the solution to

$\min_{c'} \left\| U_{k-1} c' + u_{n+k} \right\|.$ (2.6)

This minimization problem can also be expressed as in

$\min_{c_0, \ldots, c_{k-1}} \left\| \sum_{i=0}^{k} c_i u_{n+i} \right\|, \quad c_k = 1,$ (2.7)

and, as is easily seen, $c'$ is the standard least-squares solution to the linear system $U_{k-1} c' = -u_{n+k}$, which, when $k < N$, is overdetermined, and generally inconsistent. With $c_0, c_1, \ldots, c_{k-1}$ determined, set $c_k = 1$, and compute the scalars $\gamma_0, \gamma_1, \ldots, \gamma_k$ via

$\gamma_i = \frac{c_i}{\sum_{j=0}^{k} c_j}, \quad i = 0, 1, \ldots, k,$ (2.8)

provided $\sum_{j=0}^{k} c_j \ne 0$. Note that

$\sum_{i=0}^{k} \gamma_i = 1.$ (2.9)

Finally, set

$s_{n,k} = \sum_{i=0}^{k} \gamma_i x_{n+i}$ (2.10)

as the approximation to $s$, whether $s$ is the limit or antilimit of $\{x_m\}$.

What we have so far is only the definition (or the theoretical development) of MPE as a method. It should not be taken as an efficient computational procedure (algorithm), however. For this topic, see  , where numerically stable and computationally and storagewise economical algorithms for MPE and RRE are designed for the case in which the inner product and norm are the standard Euclidean ones. A well documented FORTRAN 77 code for implementing MPE and RRE in a unified manner is also provided in  , Appendix B.
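To make the definition concrete, here is a minimal NumPy sketch of MPE as defined by (2.3)-(2.10), using the Euclidean norm. It illustrates the definition only, not the stable algorithm referred to above; the function name `mpe` is ours. The usage exercises the classical finite-termination property for linearly generated sequences: taking $k$ equal to the dimension recovers the solution exactly.

```python
import numpy as np

def mpe(X):
    # Columns of X are x_n, x_{n+1}, ..., x_{n+k+1}.
    U = np.diff(X, axis=1)                   # u_m = x_{m+1} - x_m        (2.3)
    # Least-squares solution of U_{k-1} c' = -u_{n+k}                     (2.6)
    c_prime, *_ = np.linalg.lstsq(U[:, :-1], -U[:, -1], rcond=None)
    c = np.append(c_prime, 1.0)              # set c_k = 1
    gamma = c / c.sum()                      # (2.8); requires sum(c) != 0
    return X[:, :-1] @ gamma                 # s_{n,k} = sum_i gamma_i x_{n+i}  (2.10)

# Linearly generated sequence x_{m+1} = T x_m + d with solution s
rng = np.random.default_rng(0)
N = 4
T = 0.4 * rng.standard_normal((N, N))
s = rng.standard_normal(N)
d = (np.eye(N) - T) @ s
X = np.zeros((N, N + 2))                     # columns x_0, ..., x_{N+1}
for m in range(N + 1):
    X[:, m + 1] = T @ X[:, m] + d
print(np.allclose(mpe(X), s))
```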

2.2. Development of SVD-MPE

We start by observing that the unconstrained minimization problem for MPE given in (2.7) can also be expressed as a superficially “constrained” minimization problem as in

$\min_{c} \|U_k c\|_M, \quad \text{subject to } c_k = 1.$ (2.11)

For the SVD-MPE method, we replace this “constrained” minimization problem by the following actual constrained minimization problem:

$\min_{c} \|U_k c\|_M, \quad \text{subject to } \|c\|_Q = 1, \quad c = [c_0, c_1, \ldots, c_k]^T.$ (2.12)

With $c$ determined, we again compute $\gamma_0, \gamma_1, \ldots, \gamma_k$ via

$\gamma_i = \frac{c_i}{\sum_{j=0}^{k} c_j}, \quad i = 0, 1, \ldots, k,$ (2.13)

provided $\sum_{j=0}^{k} c_j \ne 0$, noting again that

$\sum_{i=0}^{k} \gamma_i = 1.$ (2.14)

Finally, we set

$s_{n,k} = \sum_{i=0}^{k} \gamma_i x_{n+i}$ (2.15)

as the SVD-MPE approximation to $s$, whether $s$ is the limit or antilimit of $\{x_m\}$.

Of course, the minimization problem in (2.12) has a solution for every $k$. Let $\lambda = \|U_k c\|_M$ for this (optimal) $c$. Lemma 2.1 that follows next gives a complete characterization of $\lambda$ and the (optimal) $c$.

Lemma 2.1 Let $\sigma_1, \sigma_2, \ldots, \sigma_{k+1}$ be the singular values of the matrix

$U_k = [\, u_n \,|\, u_{n+1} \,|\, \cdots \,|\, u_{n+k} \,],$ (2.16)

ordered as in

$\sigma_1 \ge \sigma_2 \ge \cdots \ge \sigma_{k+1} \ge 0,$ (2.17)

and let $v_1, v_2, \ldots, v_{k+1}$ be the corresponding right singular vectors of $U_k$, that is,

$U_k^* M U_k\, v_i = \sigma_i^2\, Q\, v_i, \quad i = 1, \ldots, k+1.$ (2.18)

Assuming that $\sigma_{k+1}$, the smallest singular value of $U_k$, is simple, the (optimal) solution to the minimization problem in (2.12) is unique (up to a multiplicative constant $\alpha$, $|\alpha| = 1$), and is given as in

$c = v_{k+1}, \qquad \lambda = \sigma_{k+1}.$ (2.19)

Proof. The proof is achieved by observing that, with $W = U_k^* M U_k$,

$\|U_k c\|_M^2 = c^* W c,$ (2.20)

so that the problem in (2.12) becomes

$\min_{c} c^* W c, \quad \text{subject to } c^* Q c = 1.$ (2.21)

The (optimal) solution to this problem is $c = v_{k+1}$ and $\lambda = \sigma_{k+1}$. We leave the details to the reader. □

In view of the nature of the solution for the (optimal) $c$ involving singular values and vectors, as described in Lemma 2.1, we call this new method SVD-MPE.

Remarks:

1) Recall that there exists a positive integer $k_0$, such that $\mathrm{rank}(U_j) = j + 1$ for $j \le k_0 - 1$, but $\mathrm{rank}(U_{k_0}) \le k_0$. Therefore, we have $\sigma_{k+1} > 0$ for all $k \le k_0 - 1$.

2) Of course, $s_{n,k}$ exists if and only if the (optimal) $c$ satisfies $\sum_{j=0}^{k} c_j \ne 0$. In addition, by (2.13), the $\gamma_i$ are unique when $\sigma_{k+1}$ is simple.

Before we go on to the development of our algorithm in the next section, we state the following result concerning the finite termination property of SVD-MPE, whose proof is very similar to that pertaining to MPE and RRE given in  :

Theorem 2.2 Let $s$ be the unique solution to the nonsingular linear system $x = Tx + d$, and let $\{x_m\}$ be the sequence obtained via the fixed-point iterative scheme $x_{m+1} = Tx_m + d$, $m = 0, 1, \ldots$, with $x_0$ chosen arbitrarily. If $k$ is the degree of the minimal polynomial of $T$ with respect to $u_n$ (equivalently, with respect to $x_n - s$), then $s_{n,k}$ produced by SVD-MPE satisfies $s_{n,k} = s$.

3. Error Estimation

We now turn to the problem of estimating at zero cost the error $s_{n,k} - s$, whether $s$ is the limit or antilimit of $\{x_m\}$. Here we assume that $s$ is the solution to the system of equations

$x = f(x),$

and that the vector sequence $\{x_m\}$ is obtained via the fixed-point iterative scheme

$x_{m+1} = f(x_m), \quad m = 0, 1, \ldots,$

$x_0$ being the initial approximation to the solution.

Now, if $\tilde{s}$ is some approximation to $s$, then a good measure of the error in $\tilde{s}$ is the residual vector of $\tilde{s}$, namely,

$r(\tilde{s}) = f(\tilde{s}) - \tilde{s}.$

This is justified since $r(s) = 0$. We consider two cases:

1) $f$ is linear; that is, $f(x) = Tx + d$, where $T \in \mathbb{C}^{N\times N}$, $d \in \mathbb{C}^N$, and $I - T$ is nonsingular.

In this case, we have

$r(x) = f(x) - x = Tx + d - x = -(I - T)(x - s),$

and, therefore, by $r(x_m) = x_{m+1} - x_m = u_m$, $s_{n,k} = \sum_{i=0}^{k} \gamma_i x_{n+i}$ satisfies

$r(s_{n,k}) = \sum_{i=0}^{k} \gamma_i r(x_{n+i}),$

and thus

$r(s_{n,k}) = \sum_{i=0}^{k} \gamma_i u_{n+i} = U_k \gamma, \quad \gamma = [\gamma_0, \ldots, \gamma_k]^T.$ (3.1)

2) $f$ is nonlinear.

In this case, assuming that $\lim_{m\to\infty} x_m = s$ and expanding $f(x_m)$ about $s$, we have

$x_{m+1} = f(x_m) = f(s) + F(s)(x_m - s) + O(\|x_m - s\|^2) \quad \text{as } m \to \infty,$

where $F(s)$ is the Jacobian matrix of the vector-valued function $f(x)$ evaluated at $x = s$. Recalling that $f(s) = s$, we rewrite this in the form

$x_{m+1} - s \approx F(s)(x_m - s) \quad \text{for all large } m,$

from which, we conclude that the vectors $x_m$ and $x_{m+1}$ satisfy the approximate equality

$x_{m+1} \approx Tx_m + d.$

That is, for all large $m$, the sequence $\{x_m\}$ behaves as if it were being generated by an $N$-dimensional approximate linear system $x = Tx + d$ through

$x_{m+1} = Tx_m + d,$

where $T = F(s)$ and $d = [I - F(s)]s$. In view of what we already know about $r(s_{n,k})$ for linear systems [from (3.1)], for nonlinear systems, close to convergence, we have

$r(s_{n,k}) \approx \sum_{i=0}^{k} \gamma_i u_{n+i} = U_k \gamma.$ (3.2)

Remark: That the vector $U_k \gamma$ is the exact residual vector for $s_{n,k}$ from linear systems and a true approximate residual vector for $s_{n,k}$ from nonlinear systems was proved originally in Sidi  . It was adopted in a subsequent paper by Jbilou and Sadok  and named the “generalized residual.” Despite sounding interesting, this name has no meaning and is also misleading. By expanding $f(s_{n,k})$ about the solution $s$ and retaining first order terms only, it can be shown that $U_k \gamma$ is actually a genuine approximation to the true residual vector $r(s_{n,k})$ when $f$ is nonlinear. There is nothing “generalized” about it.

Now, we can compute $\|U_k \gamma\|$ at no cost in terms of the quantities that result from our algorithm, without having to actually compute $U_k \gamma$ itself. Indeed, we have the following result on $\|U_k \gamma\|$, which can be incorporated in the algorithm for SVD-MPE that we discuss in the next section:

Theorem 3.1 Let $\sigma_{k+1}$ be the smallest singular value of $U_k$ and let $c = v_{k+1}$ be the corresponding right singular vector. Then the vector $s_{n,k}$ resulting from SVD-MPE satisfies

$\|U_k \gamma\| = \frac{\sigma_{k+1}}{\left| \sum_{j=0}^{k} c_j \right|}.$ (3.3)

Proof. First, the solution to (2.12) is $c = v_{k+1}$ by (2.19). Next, letting $\beta = \sum_{j=0}^{k} c_j$, we have $\gamma = c/\beta$ by (2.13). Consequently,

$U_k \gamma = \frac{1}{\beta}\, U_k c.$

Thus, by Lemma 2.1, we have

$\|U_k \gamma\| = \frac{\|U_k c\|}{|\beta|} = \frac{\sigma_{k+1}}{|\beta|},$

which is the required result. □
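The identity (3.3) is easy to verify numerically: with the Euclidean norm, $\|U_k \gamma\|$ and $\sigma_{k+1}/|\sum_j c_j|$ agree to machine precision for any matrix standing in for $U_k$. A quick NumPy check (the data below is random, not from an actual iteration):

```python
import numpy as np

# Numerical check of Theorem 3.1 (Euclidean norm): the zero-cost estimate
# sigma_{k+1} / |sum_j c_j| equals ||U_k @ gamma|| exactly.
rng = np.random.default_rng(1)
U = rng.standard_normal((30, 6))          # stand-in for U_k (N = 30, k = 5)
_, sigma, Vh = np.linalg.svd(U, full_matrices=False)
c = Vh[-1]                                # unit right singular vector of sigma_{k+1}
gamma = c / c.sum()
estimate = sigma[-1] / abs(c.sum())
print(np.isclose(np.linalg.norm(U @ gamma), estimate))
```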

4. Algorithm for SVD-MPE

We now turn to the design of a good algorithm for constructing numerically the approximation that results from SVD-MPE. We note that matrix computations in floating-point arithmetic must be done with care, and this is what we would like to achieve here.

In this section, we let $M = I$ and $Q = I$ for simplicity; thus, the norms in (2.1) and (2.2) are the standard Euclidean ones. Since there is no room for confusion, we will also use $U$, $\sigma_i$, and $v_i$ to denote $U_k$, its singular values, and its right singular vectors, respectively.

As we have seen in Section 2, to determine $c$, we need $v_{k+1}$, the right singular vector of $U$ corresponding to its smallest singular value. Now, $\sigma_{k+1}$ and $v_{k+1}$ can be obtained from the singular value decomposition (SVD) of $U$. Of course, the SVD of $U$ can be computed by applying directly to $U$ the algorithm of Golub and Kahan [6], for example. Here we choose to apply SVD to $U$ in an indirect way, which will result in a very efficient algorithm for SVD-MPE that is economical both computationally and storagewise in an optimal way. Here are the details of the computation of the SVD of $U$, assuming that $\mathrm{rank}(U) = k + 1$:

1) We first compute the QR factorization of $U$ in the form

$U = QR,$ (4.1)

where $Q \in \mathbb{C}^{N\times(k+1)}$ is unitary (that is, $Q^*Q = I_{k+1}$) and $R \in \mathbb{C}^{(k+1)\times(k+1)}$ is upper triangular with positive diagonal elements, that is,

$Q = [\, q_0 \,|\, q_1 \,|\, \cdots \,|\, q_k \,], \quad q_i^* q_j = \delta_{ij},$ (4.2)

$R = [r_{ij}], \quad r_{ij} = 0 \ \text{for } i > j; \quad r_{ii} > 0 \ \text{for all } i.$ (4.3)

Of course, we can carry out the QR factorization in different ways. Here we do this by the modified Gram-Schmidt process (MGS) in a numerically stable way as follows:

1. Compute $r_{00} = \|u_n\|$ and $q_0 = u_n / r_{00}$.

2. For $j = 1, \ldots, k$ do

Set $u_j^{(0)} = u_{n+j}$

For $i = 0, \ldots, j - 1$ do

$r_{ij} = q_i^* u_j^{(i)}$ and $u_j^{(i+1)} = u_j^{(i)} - r_{ij} q_i$

end do (i)

Compute $r_{jj} = \|u_j^{(j)}\|$ and $q_j = u_j^{(j)} / r_{jj}$.

end do (j)

Note that the matrices and are obtained from and, respectively, as follows:

(4.4)

For MGS, see  , for example.
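The MGS steps above transcribe directly to NumPy (Euclidean inner product; in the actual algorithm the columns of $U$ would be overwritten in place to save storage):

```python
import numpy as np

def mgs_qr(U):
    """QR factorization by modified Gram-Schmidt, as in the steps above.
    Returns Q with orthonormal columns and upper triangular R with U = Q R."""
    N, kp1 = U.shape
    Q = U.astype(float).copy()
    R = np.zeros((kp1, kp1))
    for j in range(kp1):
        for i in range(j):
            R[i, j] = Q[:, i] @ Q[:, j]      # r_ij = q_i^* u_j^(i)
            Q[:, j] -= R[i, j] * Q[:, i]     # u_j^(i+1) = u_j^(i) - r_ij q_i
        R[j, j] = np.linalg.norm(Q[:, j])    # r_jj = ||u_j^(j)||
        Q[:, j] /= R[j, j]                   # q_j = u_j^(j) / r_jj
    return Q, R

rng = np.random.default_rng(2)
U = rng.standard_normal((40, 5))
Q, R = mgs_qr(U)
print(np.allclose(Q @ R, U), np.allclose(Q.T @ Q, np.eye(5)))
```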

2) We next compute the SVD of $R$: By Theorem 1.1, there exist unitary matrices $\hat{U}, V \in \mathbb{C}^{(k+1)\times(k+1)}$,

$\hat{U} = [\, \hat{u}_1 \,|\, \cdots \,|\, \hat{u}_{k+1} \,], \quad V = [\, v_1 \,|\, \cdots \,|\, v_{k+1} \,],$ (4.5)

and a square diagonal matrix $\Sigma$,

$\Sigma = \mathrm{diag}(\sigma_1, \ldots, \sigma_{k+1}), \quad \sigma_1 \ge \cdots \ge \sigma_{k+1} \ge 0,$ (4.6)

such that

$R = \hat{U} \Sigma V^*.$ (4.7)

In addition, since $R$ is nonsingular by our assumption that $\mathrm{rank}(U) = k + 1$, we have that $\sigma_i > 0$ for all $i$. Consequently, $\sigma_{k+1} > 0$.

3) Substituting (4.7) in (4.1), we obtain the following true singular value decomposition of $U$:

$U = (Q\hat{U}) \Sigma V^*.$ (4.8)

Thus, $\sigma_1, \ldots, \sigma_{k+1}$, the singular values of $R$, are also the singular values of $U$, and $v_1, \ldots, v_{k+1}$, the corresponding right singular vectors of $R$, are also the corresponding right singular vectors of $U$. (Of course, the columns of $Q\hat{U}$ are corresponding left singular vectors of $U$. Note that, unlike $Q$, $R$, and $V$, which we must compute for our algorithm, we do not need to actually compute $Q\hat{U}$, because we do not need the left singular vectors of $U$ in our computations. The mere knowledge that the SVD of $U$ is as given in (4.8) suffices to conclude that $c = v_{k+1}$ is the required optimal solution to (2.12); we continue with the development of our algorithm from this point.)

Remark: Computing the SVD of a matrix $A \in \mathbb{C}^{m\times n}$, $m \ge n$, by first forming its QR factorization $A = QR$, next computing the SVD of $R$ as $R = \hat{U} \Sigma V^*$, and finally setting $A = U \Sigma V^*$, with $U = Q\hat{U}$, was first suggested by Chan [5].
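The point of the QR-first strategy is that the SVD is applied only to the small $(k+1)\times(k+1)$ factor $R$, yet it yields the same singular values (and, up to sign, the same right singular vectors) as the tall matrix. A quick NumPy check, with `numpy.linalg.qr`/`svd` standing in for the MGS and Golub-Kahan routines of the text:

```python
import numpy as np

# Indirect SVD: QR-factorize the tall matrix U, then take the SVD of the
# small square factor R only. Singular values coincide with those of U.
rng = np.random.default_rng(3)
U = rng.standard_normal((1000, 6))        # N >> k+1
Q, R = np.linalg.qr(U)                    # reduced QR; R is 6x6
_, sigma, Vh = np.linalg.svd(R)           # SVD of the small matrix only
_, sigma_direct, _ = np.linalg.svd(U, full_matrices=False)
print(np.allclose(sigma, sigma_direct))
```

(Right singular vectors may differ by a sign or phase between the two routes, so only the singular values are compared here.)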

With $c = v_{k+1}$ already determined, we next compute the $\gamma_i$ as in (2.13); that is,

$\gamma_i = \frac{c_i}{\sum_{j=0}^{k} c_j}, \quad i = 0, 1, \ldots, k,$ (4.9)

provided $\sum_{j=0}^{k} c_j \ne 0$.

Next, by the fact that

$x_{n+i} = x_n + \sum_{j=0}^{i-1} u_{n+j}, \quad i = 1, 2, \ldots,$

and by (2.14), we can re-express $s_{n,k}$ in (2.15) in the form

$s_{n,k} = x_n + \sum_{j=0}^{k-1} \xi_j u_{n+j},$ (4.10)

where the $\xi_j$ are computed from the $\gamma_i$ as in

$\xi_0 = 1 - \gamma_0; \qquad \xi_j = \xi_{j-1} - \gamma_j, \quad j = 1, \ldots, k-1.$ (4.11)

Making use of the fact that $U_{k-1} = Q_{k-1} R_{k-1}$, with

$Q_{k-1} = [\, q_0 \,|\, \cdots \,|\, q_{k-1} \,], \quad R_{k-1} = [r_{ij}]_{i,j=0}^{k-1},$ (4.12)

where the $q_i$ and the $r_{ij}$ are exactly those that feature in (4.2) and (4.3), we next rewrite (4.10) as in

$s_{n,k} = x_n + U_{k-1} \xi, \quad \xi = [\xi_0, \ldots, \xi_{k-1}]^T.$ (4.13)

Thus, the computation of $s_{n,k}$ can be carried out economically as in

$s_{n,k} = x_n + Q_{k-1} (R_{k-1} \xi).$ (4.14)

Of course, $Q_{k-1}(R_{k-1}\xi)$ is best computed as a linear combination of the columns of $Q_{k-1}$; hence (4.14) is computed as in

$s_{n,k} = x_n + \sum_{i=0}^{k-1} \eta_i q_i, \quad \eta = [\eta_0, \ldots, \eta_{k-1}]^T = R_{k-1} \xi.$ (4.15)

It is clear that, for the computation in (4.14) and (4.15), we need to save both $Q$ and $R$.

This completes the design of our algorithm for implementing SVD-MPE. For convenience, we provide a systematic description of this algorithm in Table 1, where we also include the computation of the norm of the exact or approximate residual vector, namely, $\|U_k \gamma\| = \sigma_{k+1}/\left|\sum_{j=0}^{k} c_j\right|$, which is given in Theorem 3.1.

Note that the input vectors $x_n, x_{n+1}, \ldots, x_{n+k+1}$ need not be saved; actually, they are overwritten by the $q_i$ as the latter are being computed. As is clear from the description of MGS given above, we can overwrite the matrix $U$ simultaneously with the computation of $Q$ and $R$, the vector $q_j$ overwriting $u_{n+j}$ as soon as it is computed; that is, at any stage of the QR factorization, we store only $k + 2$ $N$-dimensional vectors in the memory. Since $k \ll N$ in our applications, the storage requirement of the $(k+1)\times(k+1)$ matrix $R$ is negligible. So is the cost of computing the SVD of $R$, and so is the cost of computing the $(k+1)$-dimensional vector $c = v_{k+1}$. Thus, for all practical purposes, the computational and storage requirements of SVD-MPE are the same as those of MPE.

Remark: If we were to compute the SVD of $U$ directly, and not by 1) first carrying out the QR factorization of $U$ as $U = QR$, 2) next computing the SVD of $R$ as $R = \hat{U} \Sigma V^*$, and 3) noting that $U = (Q\hat{U}) \Sigma V^*$ without actually computing $Q\hat{U}$, then we would need to waste extra resources in carrying out the computation of the left singular vectors of $U$. This direct strategy will have either of the following consequences:

1) If we have storage limitations, then we would have to overwrite $U$ with its matrix of left singular vectors. (Recall that both of these matrices are $N \times (k+1)$ and hence they are large.) As a result, we would have to compute the vectors $u_n, \ldots, u_{n+k}$ a second time in order to compute $s_{n,k}$.

2) If we do not want to compute the vectors $u_n, \ldots, u_{n+k}$ a second time, then we would have to save $U$ while computing the matrix of left singular vectors in its singular value decomposition. Thus, we would need to save two $N \times (k+1)$ matrices in the core memory simultaneously. Clearly, this limits the size of $k$, the order of extrapolation, hence the rate of acceleration, severely.

Clearly, the indirect approach we have taken here for carrying out the singular value decomposition of $U$ enables us to save extra computing and storage very conveniently when $N$ is very large.

Table 1. Algorithm for SVD-MPE.

5. Determinant Representations for SVD-MPE

In  and  , determinant representations were derived for the approximations $s_{n,k}$ that are produced by the vector extrapolation methods MPE, RRE, MMPE, and TEA. These representations have turned out to be very useful in the analysis of the algebraic and analytic properties of these methods. In particular, they were used for obtaining interesting recursion relations among the $s_{n,k}$ and in proving sharp convergence and stability theorems for them. We now derive two analogous determinant representations for $s_{n,k}$ produced by SVD-MPE.

The following lemma, whose proof can be found in  , Section 3, will be used in the derivation of Theorem 5.2.

Lemma 5.1 Let $u_{ji}$ ($j = 1, \ldots, k$; $i = 0, 1, \ldots, k$) and $\gamma_0, \gamma_1, \ldots, \gamma_k$ be scalars and let the $\gamma_i$ satisfy the linear system

$\sum_{i=0}^{k} u_{ji} \gamma_i = 0, \quad j = 1, \ldots, k; \qquad \sum_{i=0}^{k} \gamma_i = 1.$ (5.1)

Then, whether $a_0, a_1, \ldots, a_k$ are scalars or vectors, there holds

$\sum_{i=0}^{k} \gamma_i a_i = \frac{D(a_0, a_1, \ldots, a_k)}{D(1, 1, \ldots, 1)},$ (5.2)

where

$D(w_0, w_1, \ldots, w_k) = \begin{vmatrix} w_0 & w_1 & \cdots & w_k \\ u_{10} & u_{11} & \cdots & u_{1k} \\ \vdots & \vdots & & \vdots \\ u_{k0} & u_{k1} & \cdots & u_{kk} \end{vmatrix},$ (5.3)

provided $D(1, 1, \ldots, 1) \ne 0$. In case the $w_i$ are vectors, the determinant $D(w_0, w_1, \ldots, w_k)$ is defined via its expansion with respect to its first row.

For convenience of notation, we will write

Then is the principal submatrix of obtained by deleting the last row and the last column of. In addition, is hermitian positive definite just like.

Theorem 5.2 that follows gives our first determinant representation for $s_{n,k}$ resulting from SVD-MPE and is based only on the smallest singular value of $U_k$ and the corresponding right singular vector.

Theorem 5.2 Define

(5.4)

and assume that

3 (5.5)

Then, provided, exists and has the determinant representation

(5.6)

where is the determinant defined as in (5.3) in Lemma 5.1 with the as in (5.4).

Proof. With as in (2.16), we start by rewriting (2.18) in the form

(5.7)

Invoking here, which follows from (2.19), and multiplying the resulting equality on the left by, we obtain

(5.8)

Dividing both sides of this equality by, and invoking (2.13), we have

(5.9)

which, by the fact that

is the same as

(5.10)

where we have invoked (5.4). We will be able to apply Lemma 5.1 to prove the validity of (5.6) if we show that, in (5.10), the equations with are linearly independent, or, equivalently, the first k rows of the matrix

are linearly independent. By the fact that

we have

where

and

Invoking, we obtain

Since is the smallest eigenvalue of and since, it turns out that is positive definite, which guarantees that the first k rows of are linearly independent. This completes the proof. □

Remark: We note that the condition that $D(1, 1, \ldots, 1) \ne 0$ in Theorem 5.2 is equivalent to the condition that $\sum_{j=0}^{k} c_j \ne 0$, which we have already met in Section 2.

The determinant representation given in Theorem 5.3 that follows is based on the complete singular value decomposition of $U_k$, hence is different from that given in Theorem 5.2. Since there is no room for confusion, we will denote the singular values and the right and left singular vectors of $U_k$ by $\sigma_i$, $v_i$, and $\hat{u}_i$, respectively.

Theorem 5.3 Let be as in (2.16), and let

be the singular value decomposition of; that is,

and

Define

(5.11)

Then, has the determinant representation

(5.12)

where is the determinant defined as in (5.3) in Lemma 5.1 with the as in (5.11).

Proof. By Theorem 1.1,

(5.13)

Therefore,

(5.14)

By (2.16) and by the fact that, which follows from (2.19), and by the fact that, , which follows from (2.13), and by (5.14), we then have

(5.15)

But, by (5.11), (5.15) is the same as

Therefore, Lemma 5.1 applies with as in (5.11), and the result follows. □

6. SVD-MPE as a Krylov Subspace Method

In Sidi  , we discussed the connection of the extrapolation methods MPE, RRE, and TEA with Krylov subspace methods for linear systems. We now want to extend the treatment of  to SVD-MPE. Here we recall that a Krylov subspace method is also a projection method and that a projection method is defined uniquely by its right and left subspaces. In the next theorem, we show that SVD-MPE is a bona fide Krylov subspace method and we identify its right and left subspaces.

Since there is no room for confusion, we will use the notation of Theorem 5.3.

Theorem 6.1 Let $s$ be the unique solution to the linear system $(I - T)x = d$, which we express in the form $x = Tx + d$,

and let the vector sequence $\{x_m\}$ be produced by the fixed-point iterative scheme $x_{m+1} = Tx_m + d$, $m = 0, 1, \ldots$.

Define the residual vector of $x$ via $r(x) = Tx + d - x$. Let also $s_{n,k}$ be the approximation to $s$ produced by SVD-MPE. Then the following are true:

1) $s_{n,k}$ is of the form

$s_{n,k} = x_n + \sum_{i=0}^{k-1} \xi_i\, T^i r(x_n).$ (6.1)

2) The residual vector of $s_{n,k}$, namely, $r(s_{n,k})$, is orthogonal to the subspace spanned by the first $k$ left singular vectors $\hat{u}_1, \ldots, \hat{u}_k$ of $U_k$. Thus,

$r(s_{n,k}) \perp \mathrm{span}\{\hat{u}_1, \ldots, \hat{u}_k\}.$ (6.2)

Consequently, SVD-MPE is a Krylov subspace method for the linear system $(I - T)x = d$, with the Krylov subspace $\mathcal{K}_k(T; r(x_n)) = \mathrm{span}\{r(x_n), Tr(x_n), \ldots, T^{k-1} r(x_n)\}$ as its right subspace and

$\mathrm{span}\{\hat{u}_1, \ldots, \hat{u}_k\}$

as its left subspace.

Proof. With the $x_m$ generated as above, we have

$u_m = x_{m+1} - x_m = r(x_m), \qquad u_{m+1} = T u_m.$

Therefore,

$u_{n+j} = T^j u_n = T^j r(x_n), \quad j = 0, 1, \ldots,$

and

$s_{n,k} = x_n + \sum_{j=0}^{k-1} \xi_j u_{n+j} \quad \text{[by (4.10)]}.$

Upon substituting the former in this equality, we obtain (6.1).

To prove (6.2), we first recall that $r(s_{n,k}) = U_k \gamma$ by (3.1). By this and by (5.15), the result in (6.2) follows. □

Remark: We recall that (see  ), when applied to linearly generated sequences as in Theorem 6.1, MPE, RRE, and TEA are mathematically equivalent to, respectively, the full orthogonalization method (FOM) of Arnoldi [4], the generalized minimum residual method (GMR), the best implementation of it being GMRES by Saad and Schultz [3], and the method of Lanczos [2]. In all these methods the right subspace is the $k$-dimensional Krylov subspace $\mathcal{K}_k(T; r(x_n))$. The left subspaces are also Krylov subspaces, with 1) $\mathcal{K}_k(T; r(x_n))$ itself for FOM, 2) $(I - T)\,\mathcal{K}_k(T; r(x_n))$ for GMR, and 3) $\mathcal{K}_k(T^*; q)$, for some fixed vector $q$, for the method of Lanczos. One important point about the left subspaces of these three methods is that they expand as $k$ increases; that is, the left subspace of dimension $k + 1$ contains the left subspace of dimension $k$. As for SVD-MPE, its right subspace is also the $k$-dimensional Krylov subspace $\mathcal{K}_k(T; r(x_n))$, which makes SVD-MPE a bona fide Krylov subspace method, and the left subspace is $\mathrm{span}\{\hat{u}_1, \ldots, \hat{u}_k\}$. Thus, the $k$-dimensional left subspace is not a Krylov subspace, since it is not contained in the left subspace of dimension $k + 1$, as the left singular vectors of $U_k$ are different from the left singular vectors of $U_{k+1}$.

7. Numerical Examples

We now provide two examples that show the performance of SVD-MPE and compare SVD-MPE with MPE. In both examples, SVD-MPE and MPE are implemented with the standard Euclidean inner product and the norm induced by it. Thus, $M = I$ and $Q = I$ throughout.

As we have already mentioned, a major application area of vector extrapolation methods is that of numerical solution of large systems of linear or nonlinear equations by fixed-point iterations $x_{m+1} = f(x_m)$. [Here $x = f(x)$ is a possibly preconditioned form of the original system.] For SVD-MPE, as well as all other polynomial methods discussed in the literature, the computation of the approximation $s_{n,k}$ to $s$, the solution of $x = f(x)$, requires $k + 2$ of the vectors $x_m$ to be stored in the computer memory. For systems of very large dimension $N$, this means that we should keep $k$ at a moderate size. In view of this limitation, a practical strategy for systems of equations is cycling, for which $n$ and $k$ are fixed. Here are the steps of cycling:

C0) Choose integers $n \ge 0$ and $k \ge 1$, and an initial vector $x_0$.

C1) Compute the vectors $x_1, x_2, \ldots, x_{n+k+1}$ [via $x_{m+1} = f(x_m)$].

C2) Apply SVD-MPE (or MPE) to the vectors $x_n, x_{n+1}, \ldots, x_{n+k+1}$, with end result $s_{n,k}$.

C3) If $s_{n,k}$ satisfies accuracy test, stop.

Otherwise, set $x_0 = s_{n,k}$, and go to Step C1.

We will call each application of steps C1 - C3 a cycle, and denote by $s_{n,k}^{(r)}$ the $s_{n,k}$ that is computed in the $r$th cycle. We will also denote the initial vector $x_0$ in step C0 by $s_{n,k}^{(0)}$. Numerical examples suggest that the sequence $\{s_{n,k}^{(r)}\}_{r=0}^{\infty}$ has very good convergence properties. (A detailed study of errors and convergence properties for MPE and RRE in the cycling mode is given in  and  .)

Note that the remark at the end of Section 4 is relevant to the implementation of SVD-MPE in the cycling mode when N, the dimension of the, is very large and storage is limited and, therefore, the size of k is limited as well.
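The cycling steps C0-C3 can be sketched as follows, assuming Euclidean norms and using the direct SVD formulation of Section 2 for the extrapolation step (function names are ours; the usage below is a synthetic slowly converging linear iteration, not one of the paper's examples):

```python
import numpy as np

def svd_mpe_cycle(f, x0, n, k, tol=1e-8, max_cycles=200):
    """Cycling mode (steps C0-C3) with SVD-MPE; f is the fixed-point map."""
    x = x0
    for _ in range(max_cycles):
        xs = [x]                              # C1: generate x_0, ..., x_{n+k+1}
        for _ in range(n + k + 1):
            xs.append(f(xs[-1]))
        X = np.column_stack(xs[n:])           # keep x_n, ..., x_{n+k+1}
        U = np.diff(X, axis=1)                # C2: one SVD-MPE extrapolation
        _, _, Vh = np.linalg.svd(U, full_matrices=False)
        c = Vh[-1]
        x = X[:, :-1] @ (c / c.sum())         # s_{n,k}, next cycle's start
        if np.linalg.norm(f(x) - x) < tol:    # C3: residual accuracy test
            break
    return x

# Slowly converging symmetric linear iteration, accelerated by cycling
rng = np.random.default_rng(4)
N = 100
A = rng.standard_normal((N, N))
A = (A + A.T) / 2
T = 0.9 * A / np.abs(np.linalg.eigvalsh(A)).max()   # spectral radius 0.9
s = rng.standard_normal(N)
d = (np.eye(N) - T) @ s
sol = svd_mpe_cycle(lambda x: T @ x + d, np.zeros(N), n=0, k=10)
print(np.linalg.norm(sol - s) < 1e-6)
```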

Example 7.1 Consider the vector sequence $\{x_m\}$ obtained from $x_{m+1} = Tx_m + d$, where $T$ is symmetric with respect to both main diagonals and is hermitian. Therefore, $T$ is diagonalizable with real eigenvalues. The vector $d$ is chosen such that the exact solution $s$ to $x = Tx + d$ is known. We have $\rho(T) < 1$, so that $\{x_m\}$ converges to $s$.

Figure 1 shows the norms of the errors in $s_{n,k}$, with $n$ fixed, as $k$ increases. Note that all of the approximations make use of the same (infinite) vector sequence $\{x_m\}$, and, practically speaking, we are looking at how the methods behave as $k \to \infty$. It is interesting to see that SVD-MPE and MPE behave almost the same. Although we have a rigorous asymptotic theory confirming the behavior of MPE in this mode as observed in Figure 1 (see   and  ), we do not have any such theory for SVD-MPE at the present.

Figure 2 shows the norm of the error in $s_{n,k}$ in the cycling mode with $n$ and $k$ fixed. Now, the dimension $N$ is relatively large.

Example 7.2 We now apply SVD-MPE and MPE to the nonlinear system that arises from finite-difference approximation of the two-dimensional convection-diffusion equation considered in Kelley ([1], pp. 108-109), namely,

$-\nabla^2 u + C u (u_x + u_y) = f(x, y) \quad \text{in } \Omega = (0,1) \times (0,1),$

where $u$ satisfies homogeneous boundary conditions. $f(x, y)$ is constructed by setting $C = 20$ in the differential equation and by taking

$u(x, y) = 10\, x y (1 - x)(1 - y) \exp(x^{4.5})$

as the exact solution.

The equation is discretized on a square grid by approximating $u_{xx}$, $u_{yy}$, $u_x$, and $u_y$ by centered differences with truncation errors $O(h^2)$, $h$ being the mesh size. Thus, letting

Figure 1. Norm of the error in $s_{n,k}$, with $n$ fixed, from MPE and SVD-MPE, for Example 7.1.

Figure 2. Norm of the error in the cycling mode, from SVD-MPE, for Example 7.1.

and

and

we replace the differential equation by the finite difference equations

with

Here $u_{ij}$ is the approximation to $u(x_i, y_j)$, as usual.

We first write the finite difference equations in a way that is analogous to the PDE written in the form

and split the matrix representing $-\nabla^2 u$ to enable the use of the Jacobi and Gauss-Seidel methods as the iterative procedures to generate the sequences $\{x_m\}$.

Figure 3 and Figure 4 show the norms of the errors in $s_{n,k}$ from SVD-MPE and MPE in the cycling mode with $n$ and $k$ fixed, the iterative procedures being, respectively, that of Jacobi and that of Gauss-Seidel for the linear part of the PDE. The number of unknowns (the dimension $N$) equals the number of interior grid points. Note that the convergence of cycling is much faster with Gauss-Seidel iteration than with

Figure 3. Norm of the error in the cycling mode, from MPE and SVD-MPE, for Example 7.2. The underlying iteration method is that of Jacobi.

Figure 4. Norm of the error in the cycling mode, from MPE and SVD-MPE, for Example 7.2. The underlying iteration method is that of Gauss-Seidel.

Jacobi iteration. For both cycling computations, SVD-MPE and MPE seem to perform very similarly in this example.

Note that the Jacobi and Gauss-Seidel iterations converge extremely slowly. In view of this slow convergence, the acceleration produced by SVD-MPE and MPE in the cycling mode is remarkable.

Acknowledgements

The author would like to thank Boaz Ophir for carrying out the computations reported in Section 7 of this work.

Cite this paper

Avram Sidi (2016) SVD-MPE: An SVD-Based Vector Extrapolation Method of Polynomial Type. Applied Mathematics, 7, 1260-1278. doi: 10.4236/am.2016.711111

References

1. Kelley, C.T. (1995) Iterative Methods for Linear and Nonlinear Equations. SIAM, Philadelphia.
http://dx.doi.org/10.1137/1.9781611970944

2. Lanczos, C. (1952) Solution of Systems of Linear Equations by Minimized Iterations. Journal of Research of the National Bureau of Standards, 49, 33-53.
http://dx.doi.org/10.6028/jres.049.006

3. Saad, Y. and Schultz, M.H. (1986) GMRES: A Generalized Minimal Residual Method for Solving Nonsymmetric Linear Systems. SIAM Journal on Scientific and Statistical Computing, 7, 856-869.
http://dx.doi.org/10.1137/0907058

4. Arnoldi, W.E. (1951) The Principle of Minimized Iterations in the Solution of the Matrix Eigenvalue Problem. Quarterly of Applied Mathematics, 9, 17-29.

5. Chan, T.F. (1982) An Improved Algorithm for Computing the Singular Value Decomposition. ACM Transactions on Mathematical Software, 8, 72-83.
http://dx.doi.org/10.1145/355984.355990

6. Golub, G.H. and Kahan, W. (1965) Calculating the Singular Values and Pseudo-Inverse of a Matrix. SIAM Journal on Numerical Analysis, 2, 205-224.
http://dx.doi.org/10.1137/0702016

7. Householder, A.S. (1964) The Theory of Matrices in Numerical Analysis. Blaisdell, New York.

8. Trefethen, L.N. and Bau, D. (1997) Numerical Linear Algebra. SIAM, Philadelphia.

9. Stoer, J. and Bulirsch, R. (2002) Introduction to Numerical Analysis. 3rd Edition, Springer-Verlag, New York.
http://dx.doi.org/10.1007/978-0-387-21738-3

10. Horn, R.A. and Johnson, C.R. (1985) Matrix Analysis. Cambridge University Press, Cambridge.
http://dx.doi.org/10.1017/CBO9780511810817

11. Golub, G.H. and Van Loan, C.F. (2013) Matrix Computations. 4th Edition, Johns Hopkins University Press, Baltimore.

12. Graves-Morris, P.R. (1992) Extrapolation Methods for Vector Sequences. Numerische Mathematik, 61, 475-487.
http://dx.doi.org/10.1007/BF01385521

13. Graves-Morris, P.R. (1983) Vector Valued Rational Interpolants I. Numerische Mathematik, 42, 331-348.
http://dx.doi.org/10.1007/BF01389578

14. Wynn, P. (1964) General Purpose Vector Epsilon Algorithm Procedures. Numerische Mathematik, 6, 22-36.
http://dx.doi.org/10.1007/BF01386050

15. Wynn, P. (1963) Continued Fractions Whose Coefficients Obey a Noncommutative Law of Multiplication. Archive for Rational Mechanics and Analysis, 12, 273-312.
http://dx.doi.org/10.1007/BF00281229

16. Gekeler, E. (1972) On the Solution of Systems of Equations by the Epsilon Algorithm of Wynn. Mathematics of Computation, 26, 427-436.
http://dx.doi.org/10.1090/S0025-5718-1972-0314226-X

17. Brezinski, C. (1971) Sur un algorithme de résolution des systèmes non linéaires. Comptes Rendus de l’Académie des Sciences, 272A, 145-148.

18. Brezinski, C. (1970) Application de l’ε-algorithme à la résolution des systèmes non linéaires. Comptes Rendus de l’Académie des Sciences, 271A, 1174-1177.

19. Sidi, A. and Shapira, Y. (1998) Upper Bounds for Convergence Rates of Acceleration Methods with Initial Iterations. Numerical Algorithms, 18, 113-132.
http://dx.doi.org/10.1023/A:1019113314010

20. Sidi, A. and Shapira, Y. (1992) Upper Bounds for Convergence Rates of Vector Extrapolation Methods on Linear Systems with Initial Iterations. Technical Report 701, Computer Science Department, Technion-Israel Institute of Technology.

21. Sidi, A. and Bridger, J. (1988) Convergence and Stability Analyses for Some Vector Extrapolation Methods in the Presence of Defective Iteration Matrices. Journal of Computational and Applied Mathematics, 22, 35-61.
http://dx.doi.org/10.1016/0377-0427(88)90287-7

22. Sidi, A. (1994) Convergence of Intermediate Rows of Minimal Polynomial and Reduced Rank Extrapolation Tables. Numerical Algorithms, 6, 229-244.
http://dx.doi.org/10.1007/BF02142673

23. Sidi, A. (1988) Extrapolation vs. Projection Methods for Linear Systems of Equations. Journal of Computational and Applied Mathematics, 22, 71-88.
http://dx.doi.org/10.1016/0377-0427(88)90289-0

24. Sidi, A. (1983) Convergence and Stability Properties of Minimal Polynomial and Reduced Rank Extrapolation Algorithms. SIAM Journal on Numerical Analysis, 23, 197-209.
http://dx.doi.org/10.1137/0723014

25. Jbilou, K. and Sadok, H. (1999) LU-Implementation of the Modified Minimal Polynomial Extrapolation Method. IMA Journal of Numerical Analysis, 19, 549-561.
http://dx.doi.org/10.1093/imanum/19.4.549

26. Sidi, A. (1991) Efficient Implementation of Minimal Polynomial and Reduced Rank Extrapolation Methods. Journal of Computational and Applied Mathematics, 36, 305-337.
http://dx.doi.org/10.1016/0377-0427(91)90013-A

27. Sidi, A. (2012) Review of Two Vector Extrapolation Methods of Polynomial Type with Applications to Large-Scale Problems. Journal of Computational Science, 3, 92-101.
http://dx.doi.org/10.1016/j.jocs.2011.01.005

28. Sidi, A. (2008) Vector Extrapolation Methods with Applications to Solution of Large Systems of Equations and to PageRank Computations. Computers & Mathematics with Applications, 56, 1-24.
http://dx.doi.org/10.1016/j.camwa.2007.11.027

29. Smith, D.A., Ford, W.F. and Sidi, A. (1987) Extrapolation Methods for Vector Sequences. SIAM Review, 29, 199-233.
http://dx.doi.org/10.1137/1029042

30. Wynn, P. (1962) Acceleration Techniques for Iterated Vector and Matrix Problems. Mathematics of Computation, 16, 301-322.
http://dx.doi.org/10.1090/S0025-5718-1962-0145647-X

31. Shanks, D. (1955) Nonlinear Transformations of Divergent and Slowly Convergent Sequences. Journal of Mathematics and Physics, 34, 1-42.
http://dx.doi.org/10.1002/sapm19553411

32. Wynn, P. (1956) On a Device for Computing the e_m(S_n) Transformation. Mathematical Tables and Other Aids to Computation, 10, 91-96.
http://dx.doi.org/10.2307/2002183

33. Sidi, A., Ford, W.F. and Smith, D.A. (1986) Acceleration of Convergence of Vector Sequences. SIAM Journal on Numerical Analysis, 23, 178-196.
http://dx.doi.org/10.1137/0723013

34. Pugachev, B.P. (1978) Acceleration of the Convergence of Iterative Processes and a Method of Solving Systems of Nonlinear Equations. USSR Computational Mathematics and Mathematical Physics, 17, 199-207.
http://dx.doi.org/10.1016/0041-5553(77)90023-4

35. Brezinski, C. (1975) Généralisations de la transformation de Shanks, de la table de Padé, et de l’ε-algorithme. Calcolo, 12, 317-360.
http://dx.doi.org/10.1007/BF02575753

36. Mesina, M. (1977) Convergence Acceleration for the Iterative Solution of the Equations X = AX + f. Computer Methods in Applied Mechanics and Engineering, 10, 165-173.
http://dx.doi.org/10.1016/0045-7825(77)90004-4

37. Eddy, R.P. (1979) Extrapolating to the Limit of a Vector Sequence. In: Wang, P.C.C., Ed., Information Linkage between Applied Mathematics and Industry, Academic Press, New York, 387-396.

38. Kaniel, S. and Stein, J. (1974) Least-Square Acceleration of Iterative Methods for Linear Equations. Journal of Optimization Theory and Applications, 14, 431-437.
http://dx.doi.org/10.1007/BF00933309

39. Cabay, S. and Jackson, L.W. (1976) A Polynomial Extrapolation Method for Finding Limits and Antilimits of Vector Sequences. SIAM Journal on Numerical Analysis, 13, 734-752.
http://dx.doi.org/10.1137/0713060

NOTES

1. Unless otherwise stated, lim x_m will mean lim_{m→∞} x_m throughout this work.

2. Given a matrix A ∈ C^{N×N} and a nonzero vector u ∈ C^N, the monic polynomial P(z) is said to be a minimal polynomial of A with respect to u if P(A)u = 0 and P(z) has smallest degree. It is easy to show that the minimal polynomial of A with respect to u exists, is unique, and divides the minimal polynomial of A, which in turn divides the characteristic polynomial of A. [Thus, the degree of P(z) is at most N, and its zeros are some or all of the eigenvalues of A.] Moreover, if Q(A)u = 0 for some polynomial Q(z), then P(z) divides Q(z). Concerning this subject, see Householder [7], for example.
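The definition in the footnote above can be turned into a small numerical experiment. The sketch below (an illustration assumed by this editor, not an algorithm from the paper) finds the minimal polynomial of A with respect to u by locating the smallest k for which A^k u is a linear combination of u, Au, …, A^{k-1}u:

```python
import numpy as np

def minimal_polynomial(A, u, tol=1e-10):
    """Coefficients (ascending powers, monic) of the minimal polynomial
    of A with respect to u, found by testing when the Krylov vectors
    u, Au, A^2 u, ... first become linearly dependent."""
    N = A.shape[0]
    K = np.empty((N, 0))          # Krylov vectors accumulated so far
    v = u.astype(float)
    for k in range(N + 1):
        if K.shape[1] > 0:
            # least-squares fit of v = A^k u by the previous Krylov vectors
            c, *_ = np.linalg.lstsq(K, v, rcond=None)
            if np.linalg.norm(K @ c - v) <= tol * np.linalg.norm(v):
                # monic polynomial: z^k - sum_i c_i z^i
                return np.concatenate([-c, [1.0]])
        K = np.column_stack([K, v])
        v = A @ v
    raise RuntimeError("dependence must occur by degree N")

# For this A and u, the minimal polynomial is (z-2)(z-3) = z^2 - 5z + 6,
# even though the characteristic polynomial is (z-2)^2 (z-3).
A = np.diag([2.0, 2.0, 3.0])
u = np.array([1.0, 0.0, 1.0])
p = minimal_polynomial(A, u)      # [6, -5, 1], ascending powers
```

The example also illustrates the footnote's divisibility claim: the degree drops from 3 to 2 because u has no component along one of the two eigenvectors for the eigenvalue 2.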

3. From the Cauchy interlace theorem, we already know that this holds; see Horn and Johnson [10], for example.

4. A projection method for the solution of the linear system Ax = b, where A ∈ C^{N×N}, is defined as follows: Let Y and Z be k-dimensional subspaces of C^N, and let x_0 be a given vector in C^N. Then the projection method produces an approximation x̃ to the solution of Ax = b as follows: 1) x̃ ∈ x_0 + Y, 2) (b − Ax̃, z) = 0 for every z ∈ Z. Y and Z are called, respectively, the right and left subspaces of the method. If Y is the Krylov subspace span{r_0, Ar_0, …, A^{k−1}r_0}, where r_0 = b − Ax_0, then the projection method is called a Krylov subspace method.

5. As proved in [24] and [21], if we order the distinct nonzero eigenvalues λ_i of the iteration matrix T as |λ_1| ≥ |λ_2| ≥ ⋯, then, with fixed k, the error of the extrapolated approximation is O(λ_{k+1}^n) as n → ∞ for MPE, RRE, MMPE, and TEA, provided T is diagonalizable and |λ_k| > |λ_{k+1}|. (This result explains the straight-line behavior of MPE in Figure 1.) Generalizations and extensions of this result are given in the papers [19]-[22], for the cases in which T is not necessarily diagonalizable and/or |λ_k| = |λ_{k+1}|.

6. Recall that the Gauss-Seidel method converges twice as fast as the Jacobi method when the two methods are applied to linear systems with consistently ordered matrices, and that the matrix obtained by discretizing the underlying differential equation by central differences is consistently ordered; see [9], for example. We believe this could explain the behavior of the solution with Gauss-Seidel iteration relative to that with Jacobi iteration, despite the fact that the system of equations we are solving in Example 2 is nonlinear.
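The "twice as fast" statement can be checked numerically on a standard consistently ordered matrix. The sketch below (a textbook illustration assumed by this editor, not a computation from the paper) uses the 1-D central-difference Laplacian tridiag(-1, 2, -1) and compares the spectral radii of the two iteration matrices; for consistently ordered matrices, ρ(Gauss-Seidel) = ρ(Jacobi)^2.

```python
import numpy as np

n = 20
# 1-D central-difference Laplacian: tridiag(-1, 2, -1), consistently ordered
A = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
D = np.diag(np.diag(A))           # diagonal part
L = np.tril(A, -1)                # strictly lower part
U = np.triu(A, 1)                 # strictly upper part

# Iteration matrices: x_{m+1} = T x_m + c
T_jacobi = np.linalg.solve(D, -(L + U))
T_gs = np.linalg.solve(D + L, -U)

rho = lambda T: max(abs(np.linalg.eigvals(T)))
rho_j, rho_gs = rho(T_jacobi), rho(T_gs)
# rho_gs equals rho_j**2, so one Gauss-Seidel sweep reduces the error
# by as much as two Jacobi sweeps
```

Since the asymptotic error per step behaves like ρ(T)^m, squaring the spectral radius halves the number of iterations needed for a given accuracy, which is exactly the "twice as fast" claim.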