Applied Mathematics, 2011, 2, 247-253 doi:10.4236/am.2011.22028 Published Online February 2011 (http://www.SciRP.org/journal/am) Copyright © 2011 SciRes. AM

Matrix Padé-Type Method for Computing the Matrix Exponential

Chunjing Li1, Xiaojing Zhu1, Chuanqing Gu2
1Department of Mathematics, Tongji University, Shanghai, China
2Department of Mathematics, Shanghai University, Shanghai, China
E-mail: cqgu@staff.shu.edu.cn
Received October 31, 2010; revised December 23, 2010; accepted December 28, 2010

Abstract

Matrix Padé approximation is a widely used method for computing matrix functions. In this paper, we apply matrix Padé-type approximation instead of typical Padé approximation to computing the matrix exponential. In our approach the scaling and squaring method is also used to make the approximant more accurate. We present two algorithms, one for computing $e^A$ and one for computing $e^{At}$ for many $t \ge 0$. Numerical experiments comparing the proposed method with existing methods, namely MATLAB's functions expm and funm, show that our approach is also very effective and reliable for computing the matrix exponential $e^A$. Moreover, our approach has two main advantages. One is that no matrix inverse is required in this method. The other is that this method is more convenient when computing $e^{At}$ for a fixed matrix A with many $t \ge 0$.

Keywords: Matrix Padé-Type Approximation, Matrix Exponential, Scaling and Squaring, Backward Error

1. Introduction

The matrix exponential is the most studied and the most widely used matrix function. It plays a fundamental role in the solution of differential equations in many application problems such as nuclear magnetic resonance spectroscopy, Markov models, and control theory.
These problems often require the computation of $e^{At}$ for a fixed matrix A and many $t \ge 0$. For example, with suitable smoothness assumptions on a function f, the solution to the inhomogeneous system

$$\frac{dy}{dt} = Ay + f(t, y), \quad y(0) = c, \quad y \in \mathbb{C}^n, \quad A \in \mathbb{C}^{n \times n},$$

is

$$y(t) = e^{At} c + \int_0^t e^{A(t-s)} f(s, y(s))\, ds.$$

A great number of methods for computing the matrix exponential have been proposed in the past decades. Moler and Van Loan's classic papers [1,2] studied nineteen dubious methods for computing the matrix exponential. Higham [3], who improved the traditional scaling and squaring method by a new backward error analysis, gave what is by far one of the most effective methods for computing the matrix exponential, and his method has been implemented in MATLAB's expm function. A revisited version can be seen in [4]. In Higham's method and many other methods, Padé approximation is used to evaluate the matrix exponential because of its high accuracy.

In this paper we present a new approach to compute the matrix exponential. On one hand, we use the matrix Padé-type approximation given by Gu [5] instead of the typical Padé approximation to evaluate the matrix exponential. Padé-type approximation was first proposed by Brezinski [6,7] in the scalar case. Draux [8] explored this method for the case of matrix-valued functions and proposed a matrix Padé-type approximation. On the other hand, the powerful scaling and squaring method is also applied in our approach. This method exploits the relation $e^A = (e^{A/2^s})^{2^s}$. Thus we first evaluate $e^{A/2^s}$ and then square the result s times to obtain an evaluation of $e^A$. In the spirit of [3], the scaling number s is determined by backward error analysis. Besides, the problem of overscaling is also considered in our approach.

We briefly introduce the definition and basic properties of the matrix Padé-type approximation in Section 2. Then we develop a specific analysis for our approach in Section 3.
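The scaling and squaring relation above can be illustrated with a short sketch. This is an illustration only, not the method developed below: the helper `taylor_exp` (our name) stands in for any accurate approximant of the scaled exponential.

```python
import numpy as np

def taylor_exp(B, terms=12):
    """Truncated Taylor series for e^B; accurate when ||B|| is small."""
    n = B.shape[0]
    X, term = np.eye(n), np.eye(n)
    for i in range(1, terms):
        term = term @ B / i
        X = X + term
    return X

def exp_by_squaring(A, s):
    """Use e^A = (e^{A/2^s})^{2^s}: approximate the exponential of the
    scaled matrix A/2^s, then square the result s times."""
    X = taylor_exp(A / 2**s)
    for _ in range(s):
        X = X @ X
    return X
```

For a symmetric test matrix, the result can be checked against the exact exponential obtained from an eigendecomposition.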
In Section 4, two algorithms and some numerical experiments comparing with other existing methods are presented. Conclusions are made in Section 5.

2. Matrix Padé-Type Approximation

In this section, we give a brief introduction to matrix Padé-type approximation, or MPTA for short. There are various definitions of MPTA. We are concerned with those based on Brezinski [6,7] and Gu [5].

Let $f(z)$ be a given power series with $n \times n$ matrix coefficients, i.e.,

$$f(z) = \sum_{i=0}^{\infty} C_i z^i, \quad C_i \in \mathbb{C}^{n \times n}.$$

Let $\Phi: P \to \mathbb{C}^{n \times n}$ be a generalized linear function on a polynomial space P, defined by

$$\Phi(x^l) = C_l, \quad l = 0, 1, \ldots$$

Let $v_m$ be a scalar polynomial of degree m,

$$v_m(z) = \sum_{j=0}^{m} b_j z^j, \quad (2.1)$$

and assume that the leading coefficient $b_m = 1$. Then we define

$$W_k(z) = \Phi\!\left( \frac{x^{k-m+1} v_m(x) - z^{k-m+1} v_m(z)}{x - z} \right), \quad (2.2)$$

$$\tilde{v}_m(z) = z^m v_m(z^{-1}), \quad (2.3)$$

$$\tilde{W}_k(z) = z^k W_k(z^{-1}). \quad (2.4)$$

Definition 1. Let $\tilde{W}_k(z)$ and $\tilde{v}_m(z)$ be defined by (2.4) and (2.3) respectively. Then $R_{km}(z) = \tilde{W}_k(z)/\tilde{v}_m(z)$ is called a matrix Padé-type approximation to $f(z)$ and is denoted by $(k/m)_f(z)$.

The approximant $(k/m)_f(z)$ satisfies the error formula

$$f(z) - (k/m)_f(z) = O(z^{k+1}). \quad (2.5)$$

In fact, from (2.1)-(2.4), we have

$$\tilde{v}_m(z) = \sum_{j=0}^{m} b_j z^{m-j} \quad (2.6)$$

and

$$W_k(z) = \Phi\!\left( \sum_{j=0}^{m} b_j \sum_{i=0}^{k-m+j} x^i z^{k-m+j-i} \right) = \sum_{j=0}^{m} b_j \sum_{i=0}^{k-m+j} C_i z^{k-m+j-i}. \quad (2.7)$$

Then

$$\tilde{W}_k(z) = \sum_{j=0}^{m} b_j \sum_{i=0}^{k-m+j} C_i z^{m-j+i} = \sum_{j=0}^{m} b_j z^{m-j} \sum_{i=0}^{k-m+j} C_i z^i. \quad (2.8)$$

It follows from (2.6)-(2.8) that

$$\tilde{v}_m(z) f(z) - \tilde{W}_k(z) = \sum_{i=0}^{\infty} \left( \sum_{j=0}^{m} b_j C_{k-m+i+j+1} \right) z^{k+i+1}. \quad (2.9)$$

Under the normalization condition $\tilde{v}_m(0) = b_m = 1$, we obtain (2.5). Note that the numerator $\tilde{W}_k(z)$ is a matrix-valued polynomial while the denominator $\tilde{v}_m(z)$ is a scalar polynomial. For more theory on matrix Padé approximation and matrix Padé-type approximation see [5-10].

3.
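In the scalar case ($n = 1$) the construction (2.6)-(2.8) and the order property (2.5) can be checked numerically. The sketch below is an illustration (not part of the paper's algorithms): it builds $R_{km}(z) = \tilde{W}_k(z)/\tilde{v}_m(z)$ for $f(z) = e^z$, i.e. $C_i = 1/i!$, with an arbitrary denominator.

```python
from math import exp, factorial

def pade_type_scalar(z, b, k):
    """R_km(z) = W~_k(z)/v~_m(z) from (2.6) and (2.8) for C_i = 1/i!.
    b = (b_0, ..., b_m) with b_m = 1."""
    m = len(b) - 1
    v = sum(b[j] * z**(m - j) for j in range(m + 1))               # (2.6)
    W = sum(b[j] * z**(m - j) * sum(z**i / factorial(i)
                                    for i in range(k - m + j + 1))
            for j in range(m + 1))                                  # (2.8)
    return W / v

# Halving z should shrink the error by roughly 2^(k+1), as (2.5) predicts.
b, k = [0.3, -0.2, 1.0], 6      # arbitrary b_0, b_1; b_m = 1
err = lambda z: abs(exp(z) - pade_type_scalar(z, b, k))
ratio = err(0.1) / err(0.05)    # close to 2^7 = 128
```

The observed ratio confirms the $O(z^{k+1})$ behaviour even though the denominator coefficients were chosen arbitrarily.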
MPTA for the Matrix Exponential

Let f be a function having a Taylor series expansion with radius of convergence r of the form

$$f(z) = \sum_{i=0}^{\infty} \frac{f^{(i)}(0)}{i!} z^i, \quad |z| < r. \quad (3.1)$$

It follows from Higham [11] that for any $A \in \mathbb{C}^{n \times n}$ and $z \in \mathbb{C}$ with $\rho(Az) < r$, where $\rho$ denotes the spectral radius of a matrix,

$$f(Az) = \sum_{i=0}^{\infty} \frac{f^{(i)}(0)}{i!} (Az)^i.$$

According to (2.8) and (2.9) in the previous section, the [k/m] matrix Padé-type approximant to $f(Az)$ can be expressed in the form $R_{km}(Az) = P_{km}(Az)/q_{km}(z)$, where

$$q_{km}(z) = \sum_{j=0}^{m} b_j z^{m-j}, \quad b_m = 1, \quad (3.2)$$

and

$$P_{km}(Az) = \sum_{j=0}^{m} b_j z^{m-j} \sum_{i=0}^{k-m+j} \frac{f^{(i)}(0)}{i!} (Az)^i. \quad (3.3)$$

In fact, the difference between matrix Padé-type approximants and typical matrix Padé approximants is that the denominator of the former, denoted by $q_{km}(z)$ in (3.2), is a scalar polynomial.

In the case of the matrix exponential, $e^{At}$ can be defined as follows:

$$e^{At} = I + At + \frac{(At)^2}{2!} + \cdots$$

By (3.2) and (3.3), we immediately obtain the [k/m] matrix Padé-type approximant to $e^{At}$ as $R_{km}(At) = P_{km}(At)/q_{km}(t)$, where

$$q_{km}(t) = \sum_{j=0}^{m} b_j t^{m-j}, \quad b_m = 1, \quad (3.4)$$

and

$$P_{km}(At) = \sum_{j=0}^{m} b_j t^{m-j} \sum_{i=0}^{k-m+j} \frac{t^i A^i}{i!}. \quad (3.5)$$

Now we give an error expression in the following theorem.

Theorem 1. Let $q_{km}$ and $P_{km}$ be expressed as (3.4) and (3.5) respectively. Then

$$\frac{P_{km}(At)}{q_{km}(t)} - e^{At} = -\frac{1}{q_{km}(t)} \sum_{i=0}^{\infty} \sum_{j=0}^{m} \frac{b_j\, t^{k+i+1} A^{k-m+i+j+1}}{(k-m+i+j+1)!}. \quad (3.6)$$

Proof. Take $C_i = A^i/i!$ in (2.9).

So far, the coefficients of the denominator $q_{km}$ are arbitrary. These coefficients may affect the accuracy of the approximant. Sidi [12] proposed three procedures for choosing these coefficients in the case of vector-valued rational approximation. We can generalize his procedures to the matrix case. We introduce only one procedure, called SMPE, which is based on minimal polynomial extrapolation. In the spirit of SMPE in [12], we aim to minimize the norm of the coefficient of the first term of the numerator, i.e., the $i = 0$ term in the error expression (3.6), since this term can be viewed as the major part of the error.
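A direct, unoptimized evaluation of (3.4)-(3.5) can be sketched as follows. This illustrates the formulas only; it is not the tuned implementation of Section 4, and the function name is ours.

```python
import numpy as np
from math import factorial

def pade_type_exp(A, t, b, k):
    """Evaluate R_km(At) = P_km(At)/q_km(t) from (3.4)-(3.5).
    b = (b_0, ..., b_m) are the denominator coefficients, b_m = 1."""
    m = len(b) - 1
    q = sum(b[j] * t**(m - j) for j in range(m + 1))               # (3.4)
    P = np.zeros_like(A, dtype=float)
    for j in range(m + 1):
        S = sum(np.linalg.matrix_power(A * t, i) / factorial(i)
                for i in range(k - m + j + 1))
        P = P + b[j] * t**(m - j) * S                              # (3.5)
    return P / q
```

Note that with $b_0 = \cdots = b_{m-1} = 0$ and $b_m = 1$ the approximant reduces to the degree-k truncated Taylor series, a fallback that reappears after Algorithm 1 below.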
Under the condition $b_m = 1$, the other coefficients of $q_{km}$ are the solution to the following minimization problem:

$$\min_{b_0, \ldots, b_{m-1}} \left\| \sum_{j=0}^{m} \frac{b_j A^{k-m+j+1}}{(k-m+j+1)!} \right\|, \quad (3.7)$$

where we choose the Frobenius norm. The corresponding inner product of two matrices A and B is defined by

$$\langle A, B \rangle = \mathrm{tr}(A^H B) = \sum_{i=1}^{n} \sum_{j=1}^{n} \bar{a}_{ij} b_{ij}. \quad (3.8)$$

The next lemma transforms the minimization problem (3.7) into the solution of a linear system.

Lemma 1. The minimization problem (3.7) is equivalent to the following linear system:

$$\sum_{j=0}^{m-1} \langle D_{k-m+i+1}, D_{k-m+j+1} \rangle\, b_j = -\langle D_{k-m+i+1}, D_{k+1} \rangle, \quad i = 0, 1, \ldots, m-1, \quad (3.9)$$

where $D_l = A^l/l!$.

Proof. See Sidi [12].

Different from the series-form error expression in (3.6), we present another form of error bound, based on Taylor series truncation error, in the following theorem.

Theorem 2. Suppose f has a Taylor series expansion of the form (3.1) with radius of convergence r. Let $R_{km}(Az) = P_{km}(Az)/q_{km}(z)$ be the matrix Padé-type approximant to $f(Az)$, where $A \in \mathbb{C}^{n \times n}$ and $z \in \mathbb{C}$ with $\rho(Az) < r$, and $q_{km}$ and $P_{km}$ have the form (3.2) and (3.3) respectively. If $q_{km}$ is a real-coefficient polynomial and $\sum_{j=0}^{m-1} |b_j z^{m-j}| < 1$, then for any matrix norm,

$$\| f(Az) - R_{km}(Az) \| \le \frac{\displaystyle\sum_{j=0}^{m} |b_j z^{m-j}|\, \frac{\|Az\|^{k-m+j+1}}{(k-m+j+1)!}\, \| f^{(k-m+j+1)}(\theta_j Az) \|}{1 - \displaystyle\sum_{j=0}^{m-1} |b_j z^{m-j}|}, \quad (3.10)$$

where $\theta_j \in (0,1)$, $j = 0, \ldots, m$, are some real constants.

Proof. The result follows from a similar analysis to that used in [13].

Corollary 1. Let $R_{km}(At) = P_{km}(At)/q_{km}(t)$ be the matrix Padé-type approximant to the matrix exponential $e^{At}$, where $A \in \mathbb{C}^{n \times n}$, $t \in \mathbb{R}$, $q_{km}$ and $P_{km}$ have the form (3.4) and (3.5) respectively, and $q_{km}$ is a real-coefficient polynomial. If $\sum_{j=0}^{m-1} |b_j t^{m-j}| < 1$, then

$$\| e^{At} - R_{km}(At) \| \le \frac{\displaystyle\sum_{j=0}^{m} |b_j t^{m-j}|\, \frac{\|At\|^{k-m+j+1}}{(k-m+j+1)!}\, e^{\theta_j \|At\|}}{1 - \displaystyle\sum_{j=0}^{m-1} |b_j t^{m-j}|}, \quad (3.11)$$

where $\theta_j \in (0,1)$, $j = 0, \ldots, m$, are some real constants.

Proof. The result follows directly from Theorem 2.

The result of Corollary 1 shows that the MPTA to $e^{At}$ is feasible. In fact, we do not have to worry about the denominator on the right side of (3.11), since our numerical experiments in the next section show that all $|b_j|$ are much smaller than 1.
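Lemma 1 translates directly into code. The sketch below (illustrative; the function name is ours) forms the Gram matrix of the $D_l$ under the Frobenius inner product (3.8) and solves the linear system (3.9):

```python
import numpy as np
from math import factorial

def smpe_coeffs(A, k, m):
    """Solve (3.9) for b_0, ..., b_{m-1} with b_m = 1, where D_l = A^l/l!
    and <X, Y> = tr(X^H Y) is the Frobenius inner product (3.8)."""
    D = {l: np.linalg.matrix_power(A, l) / factorial(l)
         for l in range(k - m + 1, k + 2)}
    # np.vdot flattens and conjugates its first argument: exactly tr(X^H Y).
    M = np.array([[np.vdot(D[k - m + i + 1], D[k - m + j + 1])
                   for j in range(m)] for i in range(m)])
    rhs = -np.array([np.vdot(D[k - m + i + 1], D[k + 1]) for i in range(m)])
    b = np.linalg.solve(M, rhs)
    return np.append(b, 1.0)          # (b_0, ..., b_{m-1}, b_m = 1)
```

Since (3.9) consists of the normal equations of (3.7), the computed coefficients can only decrease the objective relative to the truncated-Taylor choice $b = (0, \ldots, 0, 1)$.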
Now we turn to the implementation of the scaling and squaring method in our approach. If the norm of the matrix At is not small enough, we use the scaling and squaring technique, which computes the matrix exponential indirectly. We want to reduce the norm of At to a sufficiently small number because the function can be well approximated near the origin. Thus we compute $R_{km}(At/2^s)$ and then take

$$e^{At} \approx \left( R_{km}(At/2^s) \right)^{2^s}.$$

The scaling and squaring method goes back at least to Lawson [14].

The scaling integer s should be carefully determined. We choose this number based on backward error analysis. Higham [3] explored a new backward error analysis of the matrix exponential. According to the analysis in [3], we give the following lemma.

Lemma 2. If the approximant $R_{km}$ satisfies

$$e^{-At/2^s} R_{km}(At/2^s) = I + G,$$

where $\|G\| < 1$ and the norm is any consistent matrix norm, then

$$\left( R_{km}(At/2^s) \right)^{2^s} = e^{At + E},$$

where E commutes with A and

$$\frac{\|E\|}{\|At\|} \le \frac{-\log(1 - \|G\|)}{\|At/2^s\|}. \quad (3.12)$$

Proof. See the proof in [3].

The result (3.12) shows, as Higham [3] points out, that the truncation error E in the approximant to $e^{At}$ is equivalent to a perturbation of the original matrix At; so this is a backward error analysis. If we define

$$\delta = R_{km}(At/2^s) - e^{At/2^s},$$

then $\|G\| \le \|\delta\|\, e^{\|At/2^s\|}$. In order to obtain a clear error analysis for the perturbation E, we need a practical estimate of $\|\delta\|$. Though (3.11) gives an upper bound for $\|\delta\|$, it is a rough estimate rather than a practical one. Let

$$T = e^{At/2^s} - \sum_{i=0}^{k} \frac{(At/2^s)^i}{i!}$$

be the Taylor series truncation error of $e^{At/2^s}$.
The norm of T is bounded by the inequality

$$\|T\| \le \frac{\|At/2^s\|^{k+1}}{(k+1)!}\, e^{\|At/2^s\|}. \quad (3.13)$$

Since the $b_j$ are chosen by the minimization problem (3.7), we can assume that $\|\delta\| \le \|T\|$. It therefore follows from (3.13) that

$$\|G\| \le \|\delta\|\, e^{\|At/2^s\|} \le \frac{\|At/2^s\|^{k+1}}{(k+1)!}\, e^{2\|At/2^s\|}. \quad (3.14)$$

We want to choose s such that the backward relative error $\|E\|/\|At\|$ does not exceed $u = 1.1 \times 10^{-16}$, the unit roundoff in IEEE double precision arithmetic. To this end, in accordance with (3.12), we require the following inequality to hold:

$$\frac{-\log(1 - \|G\|)}{\|At/2^s\|} \le u,$$

that is,

$$\|G\| \le 1 - e^{-u \|At/2^s\|}. \quad (3.15)$$

Then (3.14) implies that

$$\frac{\|At/2^s\|^{k+1}}{(k+1)!}\, e^{2\|At/2^s\|} \le 1 - e^{-u \|At/2^s\|} \quad (3.16)$$

is a sufficient condition for (3.15). If we denote $\theta = \|At/2^s\|$, then (3.16) can be rewritten as

$$\frac{\theta^{k+1}}{(k+1)!}\, e^{2\theta} \le 1 - e^{-u\theta}.$$

Since $1 - e^{-u\theta} \ge u\theta e^{-u\theta}$ and $e^{u\theta}$ differs negligibly from 1, it finally suffices that

$$g(\theta) := \frac{\theta^{k} e^{2\theta}}{(k+1)!} \le u. \quad (3.17)$$

Then we define $\theta_{\max} := \max \{ \theta : g(\theta) \le u \}$. Thus $\|E\| \le u \|At\|$ if s satisfies $\|At\|/2^s \le \theta_{\max}$, which means we choose

$$s = \left\lceil \log_2 \frac{\|At\|}{\theta_{\max}} \right\rceil. \quad (3.18)$$

However, in practice, the choice of s in (3.18) may lead to overscaling, so we take measures to reduce the effect caused by overscaling. The method of Al-Mohy and Higham [15] can be applied.

Lemma 3. Define $h(x) = \sum_{l \ge j} a_l x^l$ with radius of convergence r, let $\tilde{h}(x) = \sum_{l \ge j} |a_l| x^l$, and suppose $\rho(A) < r$ and $p \ge 1$. Then if $j \ge p(p-1)$,

$$\| h(A) \| \le \tilde{h}\!\left( \max\!\left( \|A^p\|^{1/p},\, \|A^{p+1}\|^{1/(p+1)} \right) \right).$$

Proof. See [15].

Note that

$$g(x) = \sum_{j=0}^{\infty} \frac{2^j x^{k+j}}{j!\,(k+1)!}.$$

Then according to Lemma 3, for $k = 16$ the argument $\theta$ in the bound $g(\theta)$ can be replaced by

$$\alpha = \max\!\left( \|(At/2^s)^4\|^{1/4},\, \min\!\left( \|(At/2^s)^3\|^{1/3},\, \|(At/2^s)^5\|^{1/5} \right) \right). \quad (3.19)$$

So s is chosen so that $\alpha \le \theta_{\max} = 0.744$.

For the numerator degree k, we have no clear theoretical analysis or strategy for the best choice. Based on our numerical experiments, we find that $k = 16$ may be a good choice, and correspondingly $\theta_{\max} = 0.744$.
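The choice of s can be sketched as follows. This is a simplified illustration under our reconstruction of (3.19): `alpha` evaluates $\max(d_4, \min(d_3, d_5))$ with $d_p = \|B^p\|_1^{1/p}$, and the loop finds the smallest s meeting the threshold.

```python
import numpy as np

THETA_MAX = 0.744   # value corresponding to k = 16

def alpha(B):
    """alpha from (3.19): max(d_4, min(d_3, d_5)), d_p = ||B^p||_1^(1/p)."""
    d = lambda p: np.linalg.norm(np.linalg.matrix_power(B, p), 1) ** (1.0 / p)
    return max(d(4), min(d(3), d(5)))

def choose_s(A, t=1.0):
    """Smallest s with alpha(At / 2^s) <= THETA_MAX, cf. (3.18)."""
    B, s = A * t, 0
    while alpha(B / 2**s) > THETA_MAX:
        s += 1
    return s
```

For highly nonnormal matrices, $\alpha$ can be much smaller than $\|At\|$, which is exactly the overscaling protection that Lemma 3 provides.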
If k is too small, a large s is required, which is potentially dangerous; if k is too large, evaluating the matrix polynomials requires excessive computation and storage.

Now we consider the total computational cost. The computation of the $b_j$ consists of two parts. One part is computing the inner products in the linear system (3.9). It needs k matrix multiplications to obtain $A^{k-m+1}, \ldots, A^{k+1}$ and $m(m+1)$ evaluations of those inner products. Each inner product defined by (3.8) costs $2n^2$ flops, so the cost of computing the inner products is $2m(m+1)n^2$ flops. We choose $m \le n^{1/2}$ to keep this cost within $O(n^3)$. The other part is solving the $m \times m$ linear system (3.9) to obtain the coefficients $b_j$. It costs $\frac{2}{3}m^3$ flops, which is not expensive since m is much smaller than n.

When we evaluate $P_{km}(At)$, we rewrite (3.5) in the following form:

$$P_{km}(At) = \sum_{i=0}^{k} \left( \sum_{j=\max(m-k+i,\,0)}^{m} b_j t^{m-j} \right) \frac{t^i}{i!}\, A^i, \quad (3.20)$$

where the powers of A have already been obtained in the previous steps, so no extra matrix multiplications are needed. Finally, we take s matrix multiplications for the squaring phase, which costs $2sn^3$ flops.

4. Algorithms and Numerical Experiments

Based on the analysis above, we present the following algorithm for computing the matrix Padé-type approximation to the matrix exponential.

Algorithm 1. This algorithm evaluates the matrix exponential using Padé-type approximation together with the scaling and squaring method. The algorithm is intended for IEEE double precision arithmetic.

1) Compute and store $A^2, A^3, \ldots, A^{17}$.
2) Repeat the scaling $A \leftarrow A/2$ until s satisfies $\alpha \le 0.744$, where $\alpha$ is defined by (3.19) with the 1-norm.
3) Set $m = \min(m_0, \lceil n^{1/2} \rceil)$ with some positive integer $m_0$.
4) Compute the inner products in (3.9) via (3.8).
5) Solve (3.9) to obtain the coefficients of $q_{km}$.
6) Compute $q_{km}$ via (3.4).
7) Compute $P_{km}$ via (3.20).
8) $R_{km} = P_{km}/q_{km}$.
9) $R_{km} \leftarrow R_{km}^{2^s}$ by s repeated squarings.

Note that the $m \times m$ linear system (3.9) may be ill-conditioned.
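Algorithm 1 can be sketched end to end in a few dozen lines. The following is an illustrative reimplementation under simplifying assumptions, not the authors' code: the scaling test uses the plain 1-norm rather than (3.19), everything is evaluated at t = 1, and a least-squares solve guards against ill-conditioning in (3.9).

```python
import numpy as np
from math import factorial, ceil, log2

def expmpt(A, k=16, m0=8, theta_max=0.744):
    """Sketch of Algorithm 1: matrix Pade-type approximation of e^A
    with scaling and squaring (simplified, illustrative)."""
    n = A.shape[0]
    m = min(m0, int(ceil(np.sqrt(n))))                   # step 3
    nrm = np.linalg.norm(A, 1)                           # step 2 (simplified)
    s = max(0, int(ceil(log2(nrm / theta_max)))) if nrm > theta_max else 0
    B = A / 2**s
    pw = [np.eye(n)]                                     # step 1: powers of B
    for _ in range(k + 1):
        pw.append(pw[-1] @ B)
    D = [pw[l] / factorial(l) for l in range(k + 2)]
    idx = [k - m + j + 1 for j in range(m)]              # steps 4-5: system (3.9)
    M = np.array([[np.vdot(D[i], D[j]) for j in idx] for i in idx])
    rhs = -np.array([np.vdot(D[i], D[k + 1]) for i in idx])
    b = np.append(np.linalg.lstsq(M, rhs, rcond=None)[0], 1.0)
    q = b.sum()                                          # step 6: q_km at t = 1
    P = sum(b[j] * sum(pw[i] / factorial(i) for i in range(k - m + j + 1))
            for j in range(m + 1))                       # step 7: P_km at t = 1
    R = P / q                                            # step 8
    for _ in range(s):                                   # step 9: squaring
        R = R @ R
    return R
```

On symmetric test matrices, this sketch agrees with the exact exponential (from an eigendecomposition) to near machine precision.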
To avoid this difficulty, we can use a fast and stable iterative method to solve it. The iteration stops if it fails to converge after a maximum number of iterations. We can also modify the algorithm by resetting the $b_j$, choosing $b_m = 1$ and $b_j = 0$ for $j = 0, \ldots, m-1$, when the system is ill-conditioned. The latter choice actually yields a truncated Taylor series.

Now we present some numerical experiments that compare the accuracy of three methods for computing the matrix exponential.

First we took 53 test matrices obtained from MATLAB R2009b's gallery function in the Matrix Computation Toolbox; most of them are real and $8 \times 8$. Figure 1 shows the relative error in the Frobenius norm of expmpt (Algorithm 1) and of MATLAB R2009b's functions expm and funm. For details about expm and funm or other algorithms for computing the matrix exponential, see [11]. The exact matrix exponential $e^A$ is obtained by computing a 150th-order Taylor polynomial using MATLAB's Symbolic Math Toolbox. We can make the following observations from Figure 1.

- MATLAB's function expm is still the best code in general.
- The proposed algorithm expmpt is also very effective. Precisely, expmpt is more accurate than expm in 16 examples.
- Both expm and expmpt are more reliable than funm in general.

Second, we concentrate on the class of $2 \times 2$ matrices of the form

$$A = \begin{pmatrix} a & b \\ 0 & c \end{pmatrix},$$

where a and c are generated randomly from (-1, -0.5) and b is generated randomly from (-1000, -500). We tested 20 matrices of this form and plotted the results in Figure 2. We can make the following observations from Figure 2.

- Algorithm 1 and MATLAB's function funm perform similarly in most examples.

Figure 1. Normwise relative errors for MATLAB's expm, funm and Algorithm 1 on test matrices from MATLAB's gallery function.

Figure 2. Normwise relative errors for MATLAB's expm, funm and Algorithm 1 on the test $2 \times 2$ matrices described above.
 MATLAB’s function ex pm is less reliable than the other two methods. Therefore we can conclude from our experiments that the proposed method has certain value in practice. Compared with typical Padé approximation which is used in traditional methods for the matrix exponential, Padé-type approximation we used in this paper has both some advantages and disadvantages. Recall that the [k/m] Padé approximation, denoted by  kmkm kmrx pxqx, to the function f is defined by the properties that that kmp and kmq are polynomials of degrees k and m, respectively, and that  1.kmkmfx r xOx Replacing x withA, we have  1.kmkmfA rAOA (4.1) So Padé-type approximation requires a polynomial of higher degree to achieve the same accuracy as Padé ap- proximation does. Another disadvantage of our approach is that each ,2,,jAjk has to be evaluated in our approach because we need them to determine the coeffi- cients jb. In , Paterson-Stockmeyer (PS) method was used to evaluate the matrix polynomials. For exam- ple, a the matrix polynomial such as 15151510 ,pAbAbAbI (4.2) can be evaluated by only 7 matrix multiplications. There- fore, our approach costs more than expm if we only com- pute Ate for one or few t. But in another point of view, our method has some advantages. Real problems usually requireAtefor many 0t, where t represents time. For example, these t are 0,1 ,1,,,itir where r is not a small number. We should first divide these t into different intervals accord- ing to the criterion that points in the same interval have the same scaling integer. According to (3.20), when computing kmPAt for t in the same interval, we only evaluate ,2,,jAjk for one time. However, using PS method to evaluate a matrix polynomial such as 15 ()pAt needs to update these powers of matrix At 2468,,,,AtAtAtAtAt and take three extra ma- trix multiplications for each t. Therefore, when the num- ber of t is very large, PS method may be worthless. 
Moreover, in our method $q_{km}$ is a scalar polynomial, and therefore no matrix division is required except for solving the $m \times m$ linear system, with m much smaller than n. Padé approximation, by contrast, requires the matrix division $q_{km}(At)^{-1} p_{km}(At)$ for each t. Matrix division in large dimension is not what we want.

At the end of this section, we present a modified version of Algorithm 1 to compute $e^{At}$ for many $t \ge 0$. Note that in the following algorithm, we preprocess the given matrix A by a Schur decomposition to reduce the computational cost, since only triangular matrices are involved after the decomposition.

Algorithm 2. This algorithm, based on Algorithm 1, evaluates $e^{At}$ for $0 \le t_0 < t_1 < \cdots < t_r$. The algorithm is intended for IEEE double precision arithmetic.

1) Schur decomposition: $A = QTQ^H$, where Q is unitary and T is upper triangular.
2) Compute and store $T^2, T^3, \ldots, T^{17}$.
3) Divide all of the points $t_i$ into several intervals $U_0, U_1, \ldots, U_w$. The corresponding scaling integers are $s_i$.
4) Set $m = \min(m_0, \lceil n^{1/2} \rceil)$ with some positive integer $m_0$.
5) For $i = 0 : w$
     $T \leftarrow T/2^{s_i}$.
     Compute the inner products in (3.9) via (3.8).
     Solve (3.9) to obtain the coefficients of $q_{km}$.
     For $t_j \in U_i$
       Compute $q_{km}$ via (3.4).
       Compute $P_{km}$ via (3.20), where A is replaced by T.
       $R_{km} = P_{km}/q_{km}$.
       $R_{km} \leftarrow R_{km}^{2^{s_i}}$.
       $R_{km} \leftarrow Q R_{km} Q^H$.
     end
   end

Note that the purposes of Algorithm 1 and Algorithm 2 are different. Algorithm 2 aims at computing $e^{At}$ for many different t, while Algorithm 1 is designed to compute $e^A$ only. But we emphasize here that in each inner loop, Algorithm 2 uses the same approach as Algorithm 1 to compute each $e^{At}$. In addition, only the Schur decomposition is added at the beginning of the algorithm, to reduce the computational cost when the number of values of t is very large. Therefore, we do not present numerical results for Algorithm 2.

5.
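The reuse at the heart of Algorithm 2 can be illustrated by a simplified single-interval version (our names; no Schur decomposition, no scaling): the powers of A are formed once, and each additional t costs only scalar coefficient work plus the combination of stored powers.

```python
import numpy as np
from math import factorial

def exp_many_t(A, ts, b, k):
    """Evaluate R_km(At) = P_km(At)/q_km(t) for many t, reusing the powers
    A^0..A^k computed once. Since q_km is scalar, no matrix division is
    needed per t. Simplified: one interval, no scaling, no Schur form."""
    m = len(b) - 1
    pw = [np.linalg.matrix_power(A, i) for i in range(k + 1)]   # computed once
    out = []
    for t in ts:
        q = sum(b[j] * t**(m - j) for j in range(m + 1))        # (3.4)
        P = sum(b[j] * t**(m - j) * sum((t**i / factorial(i)) * pw[i]
                                        for i in range(k - m + j + 1))
                for j in range(m + 1))                          # (3.5)
        out.append(P / q)
    return out
```

No matrix-matrix products occur inside the loop over t, which is the advantage over per-t Paterson-Stockmeyer evaluation discussed above.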
Conclusions

In this paper, we develop a new approach based on matrix Padé-type approximation, combined with the scaling and squaring method, to compute the matrix exponential instead of typical Padé approximation. Two numerical algorithms, one for computing $e^A$ and one for computing $e^{At}$ with many $t \ge 0$, are proposed. Our approach is established in close relation to backward error analysis and computational considerations. Numerical results comparing the proposed algorithm with the existing MATLAB functions expm and funm have shown that the proposed algorithm is effective and reliable for computing the matrix exponential. Compared with typical Padé approximation, the most significant advantages of matrix Padé-type approximation lie in two aspects. One is the convenience of computing $e^{At}$ for a large number of values of t. The other is the avoidance of $n \times n$ matrix divisions, where n is the size of the given matrix A.

6. Acknowledgements

The authors gratefully acknowledge the fund support from Applied Mathematics.

7. References

[1] C. B. Moler and C. F. Van Loan, "Nineteen Dubious Ways to Compute the Exponential of a Matrix," SIAM Review, Vol. 20, No. 4, 1978, pp. 801-836. doi:10.1137/1020098

[2] C. B. Moler and C. F. Van Loan, "Nineteen Dubious Ways to Compute the Exponential of a Matrix, Twenty-Five Years Later," SIAM Review, Vol. 45, No. 1, 2003, pp. 3-49. doi:10.1137/S00361445024180

[3] N. J. Higham, "The Scaling and Squaring Method for the Matrix Exponential Revisited," SIAM Journal on Matrix Analysis and Applications, Vol. 26, No. 4, 2005, pp. 1179-1193. doi:10.1137/04061101X

[4] N. J. Higham, "The Scaling and Squaring Method for the Matrix Exponential Revisited," SIAM Review, Vol. 51, No. 4, 2009, pp. 747-764. doi:10.1137/090768539

[5] C. Gu, "Matrix Padé-Type Approximant and Directional Matrix Padé Approximant in the Inner Product Space," Journal of Computational and Applied Mathematics, Vol. 164-165, No. 1, 2004, pp. 365-385. doi:10.1016/S0377-0427(03)00487-4

[6] C.
Brezinski, "Rational Approximation to Formal Power Series," Journal of Approximation Theory, Vol. 25, No. 4, 1979, pp. 295-317. doi:10.1016/0021-9045(79)90019-4

[7] C. Brezinski, "Padé-Type Approximation and General Orthogonal Polynomials," Birkhäuser-Verlag, Basel, 1980.

[8] A. Draux, "Approximants de Type Padé et de Table," Publication A, No. 96, University of Lille, Lille, 1983.

[9] C. Gu, "Generalized Inverse Matrix Padé Approximation on the Basis of Scalar Products," Linear Algebra and Its Applications, Vol. 322, No. 1-3, 2001, pp. 141-167. doi:10.1016/S0024-3795(00)00230-5

[10] C. Gu, "A Practical Two-Dimensional Thiele-Type Matrix Padé Approximation," IEEE Transactions on Automatic Control, Vol. 48, No. 12, 2003, pp. 2259-2263. doi:10.1109/TAC.2003.820163

[11] N. J. Higham, "Functions of Matrices: Theory and Computation," SIAM, Philadelphia, 2008. doi:10.1137/1.9780898717778

[12] A. Sidi, "Rational Approximations from Power Series of Vector-Valued Meromorphic Functions," Journal of Approximation Theory, Vol. 77, No. 1, 1994, pp. 89-111. doi:10.1006/jath.1994.1036

[13] R. Mathias, "Approximation of Matrix-Valued Functions," SIAM Journal on Matrix Analysis and Applications, Vol. 14, No. 4, 1993, pp. 1061-1063. doi:10.1137/0614070

[14] J. D. Lawson, "Generalized Runge-Kutta Processes for Stable Systems with Large Lipschitz Constants," SIAM Journal on Numerical Analysis, Vol. 4, No. 3, 1967, pp. 372-380. doi:10.1137/0704033

[15] A. H. Al-Mohy and N. J. Higham, "A New Scaling and Squaring Algorithm for the Matrix Exponential," SIAM Journal on Matrix Analysis and Applications, Vol. 31, No. 3, 2009, pp. 970-989. doi:10.1137/09074721X