J. Software Engineering & Applications, 2010, 3: 503-509
doi:10.4236/jsea.2010.35057 Published Online May 2010 (http://www.SciRP.org/journal/jsea)

A Line Search Algorithm for Unconstrained Optimization*

Gonglin Yuan^1, Sha Lu^2, Zengxin Wei^1

^1College of Mathematics and Information Science, Guangxi University, Nanning, China; ^2School of Mathematical Science, Guangxi Teachers Education University, Nanning, China.
Email: glyuan@gxu.edu.cn

Received February 6th, 2010; revised March 30th, 2010; accepted March 31st, 2010.

*This work is supported by China NSF grant 10761001, the Scientific Research Foundation of Guangxi University (Grant No. X081082), and Guangxi SF grant 0991028.

ABSTRACT

It is well known that line search methods play a very important role in optimization. In this paper a new line search method is proposed for solving unconstrained optimization problems. Under weak conditions, the method possesses global convergence for nonconvex functions and R-linear convergence for convex functions. Moreover, the search direction satisfies a sufficient descent property and lies in a trust region without carrying out any line search rule. Numerical results show that the new method is effective.

Keywords: Line Search, Unconstrained Optimization, Global Convergence, R-Linear Convergence

1. Introduction

Consider the unconstrained optimization problem

$$\min_{x \in R^n} f(x), \qquad (1)$$

where $f: R^n \to R$ is continuously differentiable. A line search algorithm for (1) generates a sequence of iterates $\{x_k\}$ by letting

$$x_{k+1} = x_k + \alpha_k d_k, \quad k = 0, 1, 2, \ldots, \qquad (2)$$

where $x_k$ is the current iterate, $d_k$ is a search direction, and $\alpha_k > 0$ is a steplength. Different choices of $d_k$ and $\alpha_k$ determine different line search methods [1-3]. Each iteration consists of two stages: 1) choose a descent search direction $d_k$; 2) choose a step size $\alpha_k$ along $d_k$. Throughout this paper, we denote $f(x_k)$ by $f_k$, $\nabla f(x_k)$ by $g_k$, and $\nabla f(x_{k+1})$ by $g_{k+1}$; $\|\cdot\|$ denotes the Euclidean norm.

One simple line search method is the steepest descent method, obtained by taking $d_k = -g_k$ at every iteration; it is widely applied to large-scale minimization problems [4]. However, the steepest descent method often zigzags on practical problems, which makes it converge to an optimal solution very slowly, or even fail to converge [5,6], so it is not the fastest line search method. If $d_k = -H_k g_k$ is the search direction at each iteration, where $H_k$ is an $n \times n$ matrix approximating $[\nabla^2 f(x_k)]^{-1}$, then the corresponding line search method is called a Newton-like method [4-6], such as the Newton method, quasi-Newton methods, and variable metric methods; many papers [7-10] develop such methods for optimization problems. One drawback of Newton-like methods is that the matrix $H_k$ must be stored and computed at each iteration, which adds storage and computational cost. Accordingly, these methods are often unsuitable for large-scale optimization problems.

The conjugate gradient method is a powerful line search method for solving large-scale optimization problems because of its simplicity and its very low memory requirement. Its search direction often takes the form

$$d_k = \begin{cases} -g_k + \beta_k d_{k-1}, & \text{if } k \ge 1, \\ -g_k, & \text{if } k = 0, \end{cases} \qquad (3)$$

where $\beta_k \in R$ is a scalar that determines the particular conjugate gradient method [11-13].
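As a rough illustration of the framework (2)-(3), the following Python sketch (a minimal sketch assuming NumPy arrays and caller-supplied rules; the function names are illustrative and not from the paper) iterates $x_{k+1} = x_k + \alpha_k d_k$ with a conjugate-gradient-type direction, leaving the choices of $\beta_k$ and $\alpha_k$ to the caller.

```python
import numpy as np

def line_search_cg(f, grad, x0, beta_rule, step_rule, eps=1e-5, max_iter=1000):
    """Generic line-search iteration (2) with a direction of the form (3).
    `beta_rule(g_new, g_old, d_old)` returns the scalar beta_k and
    `step_rule(f, grad, x, d)` returns the steplength alpha_k."""
    x = np.asarray(x0, dtype=float)
    g = grad(x)
    d = -g                                    # k = 0: steepest-descent direction
    for _ in range(max_iter):
        if np.linalg.norm(g) <= eps:          # stopping test ||g_k|| <= eps
            break
        alpha = step_rule(f, grad, x, d)      # choose the steplength alpha_k
        x_new = x + alpha * d                 # iteration (2)
        g_new = grad(x_new)
        d = -g_new + beta_rule(g_new, g, d) * d   # direction update (3), k >= 1
        x, g = x_new, g_new
    return x
```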
The convergence behavior of different conjugate gradient methods under various line search conditions [14] has been studied by many authors for many years (see [4,15]). At present, one of the most efficient formulas for $\beta_k$ from the computational point of view is the PRP formula

$$\beta_k^{PRP} = \frac{g_{k+1}^T (g_{k+1} - g_k)}{\|g_k\|^2}. \qquad (4)$$

If $x_{k+1} \to x_k$, it is easy to see that $\beta_k^{PRP} \to 0$, which implies that the direction of the PRP method tends to the steepest descent direction, acting as an automatic restart when the next iterate is close to the current one. This property is very important for the efficiency of the PRP conjugate gradient method (see [4,15]). Concerning its convergence, Polak and Ribière [16] proved that the PRP method with the exact line search is globally convergent when the objective function is convex, while Powell [17] gave a counterexample showing that there exist nonconvex functions on which the PRP method does not converge globally even when the exact line search is used.

The sufficient descent condition

$$g_k^T d_k \le -c\|g_k\|^2, \quad \text{for all } k \ge 0 \text{ and some constant } c > 0, \qquad (5)$$

is very important for ensuring the global convergence of nonlinear conjugate gradient methods, and it may be crucial for conjugate gradient methods [14]. It has been shown that the PRP method with the strong Wolfe-Powell (SWP) line search rule, which finds a step size $\alpha_k$ satisfying

$$f(x_k + \alpha_k d_k) \le f_k + \delta_1 \alpha_k g_k^T d_k \qquad (6)$$

and

$$|g(x_k + \alpha_k d_k)^T d_k| \le -\delta_2\, g_k^T d_k, \qquad (7)$$

where $0 < \delta_1 < \frac{1}{2}$ and $\delta_1 < \delta_2 < 1$, does not ensure condition (5) at each iteration. Grippo and Lucidi [18] then presented a new line search rule which ensures the sufficient descent condition and established the convergence of the PRP method with their line search technique. Powell [17] suggested that $\beta_k$ should not be allowed to become negative. Based on this idea, Gilbert and Nocedal [14] proved that the modified PRP method with $\beta_k = \max\{0, \beta_k^{PRP}\}$ is globally convergent under the sufficient descent assumption and the weak Wolfe-Powell (WWP) line search technique: find a steplength $\alpha_k$ satisfying (6) and

$$g(x_k + \alpha_k d_k)^T d_k \ge \delta_2\, g_k^T d_k. \qquad (8)$$

Over the past few years, much effort has been devoted to finding new formulas for conjugate gradient methods which possess not only a global convergence property for general functions but also good numerical performance (see [14,15]). In recent years, further good results on nonlinear conjugate gradient methods have been given in [19-25].

These observations motivate us to propose a new method which possesses not only simplicity and low memory requirements but also desirable theoretical features. In this paper, we design a new line search method whose search direction possesses not only the sufficient descent property but also the property

$$\|d_k\| \le c_1\|g_k\|, \quad \text{for all } k \ge 0 \text{ and some constant } c_1 > 0, \qquad (9)$$

whatever line search rule is used; property (9) implies that the search direction $d_k$ automatically lies in a trust region.

This paper is organized as follows. In the next section, the algorithms and further line search rules are stated. The global convergence and the R-linear convergence of the new method are established in Section 3. Numerical results and a conclusion are presented in Sections 4 and 5, respectively.
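For concreteness, the scalar (4), its nonnegative PRP+ modification of [14], and the Wolfe-Powell tests (6)-(8) can be sketched in a few lines of Python. This is a minimal sketch with illustrative function names; the defaults $\delta_1 = 0.1$ and $\delta_2 = 0.9$ anticipate the parameter settings reported in Section 4.

```python
import numpy as np

def beta_prp(g_new, g_old):
    """PRP scalar (4): g_{k+1}^T (g_{k+1} - g_k) / ||g_k||^2."""
    return float(g_new @ (g_new - g_old)) / float(g_old @ g_old)

def beta_prp_plus(g_new, g_old):
    """PRP+ modification of Gilbert and Nocedal [14]: max{0, beta^PRP}."""
    return max(0.0, beta_prp(g_new, g_old))

def wolfe_powell_ok(f, grad, x, d, alpha, delta1=0.1, delta2=0.9, strong=False):
    """Check a trial step alpha against (6) and a curvature condition:
    (8)  g(x + alpha d)^T d >= delta2 * g^T d        (weak, WWP), or
    (7)  |g(x + alpha d)^T d| <= -delta2 * g^T d     (strong, SWP)."""
    gtd = float(grad(x) @ d)
    sufficient_decrease = f(x + alpha * d) <= f(x) + delta1 * alpha * gtd   # (6)
    new_gtd = float(grad(x + alpha * d) @ d)
    curvature = abs(new_gtd) <= -delta2 * gtd if strong else new_gtd >= delta2 * gtd
    return sufficient_decrease and curvature
```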
2. The Algorithms

Besides the inexact line search techniques WWP and SWP, there exist other line search rules which are often used to analyze the convergence of line search methods:

1) The exact minimization rule. The step size $\alpha_k$ is chosen such that

$$f(x_k + \alpha_k d_k) = \min_{\alpha \ge 0} f(x_k + \alpha d_k). \qquad (10)$$

2) Goldstein rule. The step size $\alpha_k$ is chosen to satisfy (6) and

$$f(x_k + \alpha_k d_k) \ge f_k + \delta_2 \alpha_k g_k^T d_k. \qquad (11)$$

Now we give our algorithm as follows.

1) Algorithm 1 (New Algorithm)

Step 0: Choose an initial point $x_0 \in R^n$ and constants $0 < \epsilon < 1$, $0 < \delta_1 < \frac{1}{2}$, $\delta_1 < \delta_2 < 1$. Set $d_0 = -g_0 = -\nabla f(x_0)$ and $k := 0$.

Step 1: If $\|g_k\| \le \epsilon$, then stop; otherwise go to Step 2.

Step 2: Compute the steplength $\alpha_k$ by one of the line search techniques, and let $x_{k+1} = x_k + \alpha_k d_k$.

Step 3: If $\|g_{k+1}\| \le \epsilon$, then stop; otherwise go to Step 4.

Step 4: Calculate the search direction $d_{k+1}$ by (3), where $\beta_k$ is defined by (4).

Step 5: Let

$$d_{k+1}^{new} = \bar{d}_{k+1} + \min\left\{0,\; -\frac{g_{k+1}^T \bar{d}_{k+1}}{\|g_{k+1}\|^2}\right\} g_{k+1} - g_{k+1},$$

where

$$\bar{d}_{k+1} = \frac{\|y_k^*\|\,\|g_{k+1}\|}{\|s_k\|\,\|d_{k+1}\|}\, d_{k+1}, \quad s_k = x_{k+1} - x_k, \quad \|y_k^*\| = \max\{\|s_k\|, \|y_k\|\}, \quad y_k = g_{k+1} - g_k.$$

Step 6: Let $d_{k+1} := d_{k+1}^{new}$, $k := k + 1$, and go to Step 2.

Remark. In Step 5 of Algorithm 1 we have

$$\frac{\|y_k^*\|}{\|s_k\|} = \frac{\max\{\|s_k\|, \|y_k\|\}}{\|s_k\|} \ge 1,$$

which can increase the convergence speed of the algorithm from the computational point of view.

Here we give the normal PRP conjugate gradient algorithm and one modified PRP conjugate gradient algorithm [14] as follows.

2) Algorithm 2 (PRP Algorithm)

Step 0: Choose an initial point $x_0 \in R^n$ and constants $0 < \epsilon < 1$, $0 < \delta_1 < \frac{1}{2}$, $\delta_1 < \delta_2 < 1$. Set $d_0 = -g_0 = -\nabla f(x_0)$ and $k := 0$.

Step 1: If $\|g_k\| \le \epsilon$, then stop; otherwise go to Step 2.

Step 2: Compute the steplength $\alpha_k$ by one of the line search techniques, and let $x_{k+1} = x_k + \alpha_k d_k$.

Step 3: If $\|g_{k+1}\| \le \epsilon$, then stop; otherwise go to Step 4.

Step 4: Calculate the search direction $d_{k+1}$ by (3), where $\beta_k$ is defined by (4).

Step 5: Let $k := k + 1$ and go to Step 2.

3) Algorithm 3 (PRP+ Algorithm, see [14])

Step 0: Choose an initial point $x_0 \in R^n$ and constants $0 < \epsilon < 1$, $0 < \delta_1 < \frac{1}{2}$, $\delta_1 < \delta_2 < 1$. Set $d_0 = -g_0 = -\nabla f(x_0)$ and $k := 0$.

Step 1: If $\|g_k\| \le \epsilon$, then stop; otherwise go to Step 2.

Step 2: Compute the steplength $\alpha_k$ by one of the line search techniques, and let $x_{k+1} = x_k + \alpha_k d_k$.

Step 3: If $\|g_{k+1}\| \le \epsilon$, then stop; otherwise go to Step 4.

Step 4: Calculate the search direction $d_{k+1}$ by (3), where $\beta_k = \max\{0, \beta_k^{PRP}\}$.

Step 5: Let $k := k + 1$ and go to Step 2.

We will concentrate on the convergence results of Algorithm 1 in the following section.
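Before turning to the analysis, the direction computation in Steps 4 and 5 of Algorithm 1, as written above, can be put into a short Python sketch. It is only a sketch of one reading of those steps: the helper name, the NumPy layout, and the eps safeguard against division by zero are assumptions, not part of the paper.

```python
import numpy as np

def algorithm1_direction(g_new, g_old, d_old, x_new, x_old, eps=1e-12):
    """Steps 4-5 of Algorithm 1: build the PRP direction (3)-(4), rescale it by
    ||y*_k|| ||g_{k+1}|| / (||s_k|| ||d_{k+1}||), then correct it with a multiple
    of g_{k+1} so that g_{k+1}^T d_{k+1}^{new} <= -||g_{k+1}||^2 (Lemma 3.1)."""
    beta = float(g_new @ (g_new - g_old)) / max(float(g_old @ g_old), eps)
    d = -g_new + beta * d_old                        # Step 4: PRP direction
    s = x_new - x_old                                # s_k = x_{k+1} - x_k
    y = g_new - g_old                                # y_k = g_{k+1} - g_k
    y_star = max(np.linalg.norm(s), np.linalg.norm(y))   # ||y*_k||
    scale = y_star * np.linalg.norm(g_new) / max(np.linalg.norm(s) * np.linalg.norm(d), eps)
    d_bar = scale * d                                # Step 5: rescaled direction
    ratio = float(g_new @ d_bar) / max(float(g_new @ g_new), eps)
    return d_bar + min(0.0, -ratio) * g_new - g_new  # corrected direction d_{k+1}^{new}
```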
3. Convergence Analysis

The following assumptions are often needed to analyze the convergence of line search methods (see [15,26]).

Assumption A (i): $f$ is bounded below on the level set $\Omega = \{x \in R^n : f(x) \le f(x_0)\}$.

Assumption A (ii): In some neighborhood $\Omega_0$ of $\Omega$, $f$ is differentiable and its gradient is Lipschitz continuous, namely, there exists a constant $L > 0$ such that

$$\|g(x) - g(y)\| \le L\|x - y\|, \quad \text{for all } x, y \in \Omega_0.$$

In the following, let $g_k \ne 0$ for all $k$, for otherwise a stationary point has been found.

Lemma 3.1 Consider Algorithm 1 and let Assumption A (ii) hold. Then (5) and (9) hold.

Proof. If $k = 0$, (5) and (9) hold obviously. For $k \ge 1$, by Assumption A (ii) and Step 5 of Algorithm 1, we have

$$\|d_{k+1}^{new}\| \le \|\bar{d}_{k+1}\| + \frac{\|y_k^*\|}{\|s_k\|}\|g_{k+1}\| + \|g_{k+1}\| \le (2\max\{1, L\} + 1)\|g_{k+1}\|,$$

since $\|\bar{d}_{k+1}\| = \frac{\|y_k^*\|}{\|s_k\|}\|g_{k+1}\|$ and $\frac{\|y_k^*\|}{\|s_k\|} = \frac{\max\{\|s_k\|, \|y_k\|\}}{\|s_k\|} \le \max\{1, L\}$.

Now we consider the product $g_{k+1}^T d_{k+1}^{new}$ in the following two cases.

Case 1. If $g_{k+1}^T \bar{d}_{k+1} \le 0$, the minimum term in Step 5 vanishes, and we get

$$g_{k+1}^T d_{k+1}^{new} = g_{k+1}^T \bar{d}_{k+1} - \|g_{k+1}\|^2 \le -\|g_{k+1}\|^2.$$

Case 2. If $g_{k+1}^T \bar{d}_{k+1} > 0$, we obtain

$$g_{k+1}^T d_{k+1}^{new} = g_{k+1}^T \bar{d}_{k+1} - \frac{g_{k+1}^T \bar{d}_{k+1}}{\|g_{k+1}\|^2}\|g_{k+1}\|^2 - \|g_{k+1}\|^2 = -\|g_{k+1}\|^2.$$

Let $c \in (0, 1)$ and $c_1 = 2\max\{1, L\} + 1$; by Step 6 of Algorithm 1, (5) and (9) hold, respectively. The proof is completed.

The above lemma shows that the search direction $d_k$ satisfies the sufficient descent condition (5) and the condition (9) without any line search rule. Based on Lemma 3.1 and Assumptions A (i) and (ii), we now give the global convergence theorem of Algorithm 1.

Theorem 3.1 Let $\{\alpha_k, d_k, x_{k+1}, g_{k+1}\}$ be generated by Algorithm 1 with the exact minimization rule, the Goldstein line search rule, the SWP line search rule, or the WWP line search rule, and let Assumptions A (i) and (ii) hold. Then

$$\lim_{k \to \infty} \|g_k\| = 0 \qquad (12)$$

holds.

Proof. We prove the result for the exact minimization rule, the Goldstein rule, the SWP rule, and the WWP rule in turn.

1) The exact minimization rule. Let the step size $\alpha_k$ be the solution of (10). By the mean value theorem, $g_k^T d_k < 0$, and Assumption A (ii), for any

$$\alpha \in \left[\frac{-g_k^T d_k}{5L\|d_k\|^2},\; \frac{-2\, g_k^T d_k}{5L\|d_k\|^2}\right]$$

we have

$$\begin{aligned} f(x_k + \alpha d_k) - f(x_k) &= \int_0^1 g(x_k + t\alpha d_k)^T \alpha d_k \, dt = \alpha g_k^T d_k + \int_0^1 [g(x_k + t\alpha d_k) - g_k]^T \alpha d_k \, dt \\ &\le \alpha g_k^T d_k + \frac{1}{2} L \alpha^2 \|d_k\|^2 \le -\frac{1}{5}\frac{(g_k^T d_k)^2}{L\|d_k\|^2} + \frac{1}{2}\cdot\frac{4}{25}\frac{(g_k^T d_k)^2}{L\|d_k\|^2} = -\frac{3}{25}\frac{(g_k^T d_k)^2}{L\|d_k\|^2}, \end{aligned} \qquad (13)$$

and since $\alpha_k$ minimizes $f$ along $d_k$, the difference $f_{k+1} - f_k$ is bounded above by the same quantity. Together with Assumption A (i), this yields

$$\sum_{k=0}^{\infty} \frac{(g_k^T d_k)^2}{\|d_k\|^2} < \infty, \qquad (14)$$

which implies that

$$\lim_{k \to \infty} \frac{(g_k^T d_k)^2}{\|d_k\|^2} = 0 \qquad (15)$$

holds. By Lemma 3.1, we get (12).

2) The Goldstein rule. Let the step size $\alpha_k$ be the solution of (6) and (11). By (11) and the mean value theorem, there exists $\theta_k \in (0, 1)$ such that

$$\alpha_k g(x_k + \theta_k \alpha_k d_k)^T d_k = f(x_k + \alpha_k d_k) - f_k \ge \delta_2 \alpha_k g_k^T d_k,$$

thus $g(x_k + \theta_k \alpha_k d_k)^T d_k \ge \delta_2 g_k^T d_k$. Using Assumption A (ii) again, we get

$$(1 - \delta_2)(-g_k^T d_k) \le [g(x_k + \theta_k \alpha_k d_k) - g_k]^T d_k \le L \alpha_k \|d_k\|^2,$$

which, combined with (6) and Assumption A (i), gives (14) and (15), respectively. By Lemma 3.1, (12) holds.

3) The strong Wolfe-Powell rule. Let the step size $\alpha_k$ be the solution of (6) and (7). By (7), we have

$$g(x_k + \alpha_k d_k)^T d_k \ge -|g(x_k + \alpha_k d_k)^T d_k| \ge \delta_2 g_k^T d_k.$$

Similar to the proof of the above case, we obtain (12) immediately.

4) The weak Wolfe-Powell rule. Let the step size $\alpha_k$ be the solution of (6) and (8). Similar to the proof of case 3), we also get (12). This completes the proof.

By Lemma 3.1, there exists a constant $\tau_0 > 0$ such that

$$-\frac{g_k^T d_k}{\|g_k\|\,\|d_k\|} \ge \tau_0, \quad \text{for all } k. \qquad (16)$$

From the proof process above, we can deduce that there exists a positive constant $\tau_1$ satisfying

$$f_k - f_{k+1} \ge \tau_1 \frac{(g_k^T d_k)^2}{\|d_k\|^2}, \quad \text{for all } k. \qquad (17)$$

Similar to the proof of Theorem 4.1 in [27], it is not difficult to prove the linear convergence rate of Algorithm 1. We state the theorem as follows but omit the proof.

Theorem 3.2 (see [27]) Suppose that (16) and (17) hold, and that the function $f$ is twice continuously differentiable and uniformly convex on $R^n$. Let $\{\alpha_k, d_k, x_{k+1}, g_{k+1}\}$ be generated by Algorithm 1 with the exact minimization rule, the Goldstein line search rule, the SWP line search rule, or the WWP line search rule. Then $\{x_k\}$ converges to $x^*$ at least linearly, where $x^*$ is the unique minimizer of $f(x)$.
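All four step-size rules appearing in Theorem 3.1 admit simple implementations. As one illustration, the Python sketch below searches for a step satisfying the Goldstein conditions (6) and (11) by bisection; it is a minimal sketch, and the initial step, the expansion factor, and the forty-trial cap (echoing the safeguard used in the experiments of Section 4) are assumptions rather than prescriptions from the paper.

```python
import numpy as np

def goldstein_step(f, grad, x, d, delta1=0.1, delta2=0.9,
                   alpha0=1.0, max_trials=40):
    """Bisection-style search for a step satisfying (6) and (11)."""
    gtd = float(grad(x) @ d)                       # g_k^T d_k, assumed negative
    fx = f(x)
    lo, hi = 0.0, np.inf
    alpha = alpha0
    for _ in range(max_trials):
        fa = f(x + alpha * d)
        if fa > fx + delta1 * alpha * gtd:         # (6) fails: step too long
            hi = alpha
            alpha = 0.5 * (lo + hi)
        elif fa < fx + delta2 * alpha * gtd:       # (11) fails: step too short
            lo = alpha
            alpha = 2.0 * alpha if np.isinf(hi) else 0.5 * (lo + hi)
        else:                                      # both Goldstein conditions hold
            return alpha
    return alpha                                   # fall back to the last trial
```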
4. Numerical Results

In this section, we report some numerical experiments with Algorithm 1, Algorithm 2, and Algorithm 3. We test these algorithms on problems from [28], implemented in MATLAB with the given initial points. The parameters common to all methods were set identically: $\delta_1 = 0.1$, $\delta_2 = 0.9$, $\epsilon = 10^{-6}$. In the experiments, the following Himmelblau stopping rule is used: if $|f(x_k)| > e_1$, let

$$stop1 = \frac{|f(x_k) - f(x_{k+1})|}{|f(x_k)|};$$

otherwise, let $stop1 = |f(x_k) - f(x_{k+1})|$, where $e_1 = 10^{-5}$. The program stops if $\|g_k\| < \epsilon$ or $stop1 < e_2$ is satisfied, where $e_2 = 10^{-5}$. We also stop the program if the number of iterations exceeds one thousand. Since the line search cannot always ensure the descent condition $d_k^T g_k < 0$, an uphill search direction may occur in the numerical experiments, in which case the line search rule may fail; to avoid this, the stepsize $\alpha_k$ is accepted once the number of line search trials exceeds forty. The detailed numerical results are listed on the web site http://210.36.18.9:8018/publication.asp?id=34402.

Dolan and Moré [29] gave a new tool to analyze the efficiency of algorithms. They introduced the notion of a performance profile as a means to evaluate and compare the performance of a set of solvers $S$ on a test set $P$. Assuming that there are $n_s$ solvers and $n_p$ problems, for each problem $p$ and solver $s$ they define $t_{p,s}$ as the computing time (the number of function evaluations or another measure) required to solve problem $p$ by solver $s$. Requiring a baseline for comparison, they compare the performance of solver $s$ on problem $p$ with the best performance by any solver on this problem, using the performance ratio

$$r_{p,s} = \frac{t_{p,s}}{\min\{t_{p,s} : s \in S\}}.$$

A parameter $r_M \ge r_{p,s}$ for all $p$, $s$ is chosen, and $r_{p,s} = r_M$ if and only if solver $s$ does not solve problem $p$. The performance of solver $s$ on any single problem may be of interest, but an overall assessment of the solver is obtained by defining

$$\rho_s(t) = \frac{1}{n_p}\, \mathrm{size}\{p \in P : r_{p,s} \le t\},$$

so that $\rho_s(t)$ is the probability for solver $s \in S$ that the performance ratio $r_{p,s}$ is within a factor $t \in R$ of the best possible ratio. The function $\rho_s$ is the (cumulative) distribution function for the performance ratio. The performance profile $\rho_s: R \to [0, 1]$ for a solver is a nondecreasing, piecewise constant function, continuous from the right at each breakpoint, and the value of $\rho_s(1)$ is the probability that the solver will win over the rest of the solvers. According to these rules, a solver whose performance profile plot lies on the top right wins over the rest of the solvers.

In Figures 1-3, NA denotes Algorithm 1, PRP denotes Algorithm 2, and PRP+ denotes Algorithm 3. Figures 1-3 show the performance of these methods measured by $NT = NF + m \cdot NG$, where NF and NG denote the number of function evaluations and gradient evaluations, respectively, and $m$ is an integer. According to the results on automatic differentiation [30], the value of $m$ can be set to $m = 5$; that is, one gradient evaluation is equivalent to $m$ function evaluations if automatic differentiation is used.

Figure 1. Performance profiles (NT) of the methods with the Goldstein rule.

Figure 2. Performance profiles (NT) of the methods with the strong Wolfe-Powell rule.

Figure 3. Performance profiles (NT) of the methods with the weak Wolfe-Powell rule.

From these three figures it is clear that the given method has the most wins, that is, the highest probability of being the optimal solver.
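Profile curves of the kind shown in Figures 1-3 can be computed from a table of NT counts with a few lines of Python. This is a minimal sketch: the array layout and the use of np.inf to mark failures (in place of the parameter $r_M$) are assumptions, and the NT values in the example are made up for illustration only.

```python
import numpy as np

def performance_profile(T, taus):
    """Dolan-More performance profiles rho_s(t).

    T    : (n_p, n_s) array of costs t_{p,s}, e.g. NT = NF + 5 * NG;
           np.inf marks a failure of solver s on problem p.
    taus : 1-D array of factors t at which rho_s(t) is evaluated.
    Returns an array of shape (len(taus), n_s); column s is rho_s."""
    n_p, n_s = T.shape
    best = np.min(T, axis=1, keepdims=True)        # best cost on each problem
    ratios = T / best                              # performance ratios r_{p,s}
    rho = np.empty((len(taus), n_s))
    for i, t in enumerate(taus):
        # fraction of problems solved within a factor t of the best solver
        rho[i] = np.sum(ratios <= t, axis=0) / n_p
    return rho

# illustrative call with made-up NT counts for three solvers (NA, PRP, PRP+)
if __name__ == "__main__":
    T = np.array([[120.0, 150.0, 140.0],
                  [ 80.0, np.inf,  95.0],
                  [200.0, 210.0, 205.0]])
    print(performance_profile(T, np.linspace(1.0, 3.0, 5)))
```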
In summary, the presented numerical results reveal that the new method has potential advantages compared with the normal PRP method and the modified PRP method [14].

5. Conclusions

This paper gives a new line search method for unconstrained optimization. Global and R-linear convergence are established under weak assumptions on the search direction $d_k$. In particular, the direction $d_k$ satisfies the sufficient descent condition (5) and the condition (9) without carrying out any line search technique, whereas these two conditions are often obtained by assumption in other papers [14,27,30]. The comparison of the numerical results shows that the search direction of the new algorithm is a good search direction at every iteration.

REFERENCES

[1] G. Yuan and X. Lu, "A New Line Search Method with Trust Region for Unconstrained Optimization," Communications on Applied Nonlinear Analysis, Vol. 15, No. 1, 2008, pp. 35-49.

[2] G. Yuan, X. Lu and Z. Wei, "New Two-Point Stepsize Gradient Methods for Solving Unconstrained Optimization Problems," Natural Science Journal of Xiangtan University, Vol. 29, No. 1, 2007, pp. 13-15.

[3] G. Yuan and Z. Wei, "New Line Search Methods for Unconstrained Optimization," Journal of the Korean Statistical Society, Vol. 38, No. 1, 2009, pp. 29-39.

[4] Y. Yuan and W. Sun, "Theory and Methods of Optimization," Science Press of China, Beijing, 1999.

[5] D. G. Luenberger, "Linear and Nonlinear Programming," 2nd Edition, Addison-Wesley, Reading, MA, 1989.

[6] J. Nocedal and S. J. Wright, "Numerical Optimization," Springer, Berlin, Heidelberg, New York, 1999.

[7] Z. Wei, G. Li and L. Qi, "New Quasi-Newton Methods for Unconstrained Optimization Problems," Applied Mathematics and Computation, Vol. 175, No. 1, 2006, pp. 1156-1188.

[8] Z. Wei, G. Yu, G. Yuan and Z. Lian, "The Superlinear Convergence of a Modified BFGS-Type Method for Unconstrained Optimization," Computational Optimization and Applications, Vol. 29, No. 3, 2004, pp. 315-332.

[9] G. Yuan and Z. Wei, "The Superlinear Convergence Analysis of a Nonmonotone BFGS Algorithm on Convex Objective Functions," Acta Mathematica Sinica, English Series, Vol. 24, No. 1, 2008, pp. 35-42.

[10] G. Yuan and Z. Wei, "Convergence Analysis of a Modified BFGS Method on Convex Minimizations," Computational Optimization and Applications, 2008.

[11] Y. Dai and Y. Yuan, "A Nonlinear Conjugate Gradient Method with a Strong Global Convergence Property," SIAM Journal on Optimization, Vol. 10, No. 1, 2000, pp. 177-182.

[12] Z. Wei, G. Li and L. Qi, "New Nonlinear Conjugate Gradient Formulas for Large-Scale Unconstrained Optimization Problems," Applied Mathematics and Computation, Vol. 179, No. 2, 2006, pp. 407-430.

[13] G. Yuan and X. Lu, "A Modified PRP Conjugate Gradient Method," Annals of Operations Research, Vol. 166, No. 1, 2009, pp. 73-90.

[14] J. C. Gilbert and J. Nocedal, "Global Convergence Properties of Conjugate Gradient Methods for Optimization," SIAM Journal on Optimization, Vol. 2, No. 1, 1992, pp. 21-42.

[15] Y. Dai and Y. Yuan, "Nonlinear Conjugate Gradient Methods," Shanghai Science and Technology Press, Shanghai, 2000.

[16] E. Polak and G. Ribière, "Note sur la convergence de méthodes de directions conjuguées," Revue Française d'Informatique et de Recherche Opérationnelle, 3e Année, Vol. 16, 1969, pp. 35-43.

[17] M. J. D. Powell, "Nonconvex Minimization Calculations and the Conjugate Gradient Method," Lecture Notes in Mathematics, Vol. 1066, Springer-Verlag, Berlin, 1984, pp. 122-141.

[18] L. Grippo and S. Lucidi, "A Globally Convergent Version of the Polak-Ribière Conjugate Gradient Method," Mathematical Programming, Vol. 78, No. 3, 1997, pp. 375-391.
[19] W. W. Hager and H. Zhang, "A New Conjugate Gradient Method with Guaranteed Descent and an Efficient Line Search," SIAM Journal on Optimization, Vol. 16, No. 1, 2005, pp. 170-192.

[20] Z. Wei, S. Yao and L. Liu, "The Convergence Properties of Some New Conjugate Gradient Methods," Applied Mathematics and Computation, Vol. 183, No. 2, 2006, pp. 1341-1350.

[21] G. H. Yu, "Nonlinear Self-Scaling Conjugate Gradient Methods for Large-Scale Optimization Problems," Doctoral Dissertation, Sun Yat-Sen University, 2007.

[22] G. Yuan, "Modified Nonlinear Conjugate Gradient Methods with Sufficient Descent Property for Large-Scale Optimization Problems," Optimization Letters, Vol. 3, No. 1, 2009, pp. 11-21.

[23] G. Yuan, "A Conjugate Gradient Method for Unconstrained Optimization Problems," International Journal of Mathematics and Mathematical Sciences, Vol. 2009, 2009, pp. 1-14.

[24] G. Yuan, X. Lu and Z. Wei, "A Conjugate Gradient Method with Descent Direction for Unconstrained Optimization," Journal of Computational and Applied Mathematics, Vol. 233, No. 2, 2009, pp. 519-530.

[25] L. Zhang, W. Zhou and D. Li, "A Descent Modified Polak-Ribière-Polyak Conjugate Gradient Method and its Global Convergence," IMA Journal of Numerical Analysis, Vol. 26, No. 4, 2006, pp. 629-649.

[26] Y. Liu and C. Storey, "Efficient Generalized Conjugate Gradient Algorithms, Part 1: Theory," Journal of Optimization Theory and Applications, Vol. 69, No. 1, 1991, pp. 129-137.

[27] Z. J. Shi, "Convergence of Line Search Methods for Unconstrained Optimization," Applied Mathematics and Computation, Vol. 157, No. 2, 2004, pp. 393-405.

[28] J. J. Moré, B. S. Garbow and K. E. Hillstrom, "Testing Unconstrained Optimization Software," ACM Transactions on Mathematical Software, Vol. 7, No. 1, 1981, pp. 17-41.

[29] E. D. Dolan and J. J. Moré, "Benchmarking Optimization Software with Performance Profiles," Mathematical Programming, Vol. 91, No. 2, 2002, pp. 201-213.

[30] Y. Dai and Q. Ni, "Testing Different Conjugate Gradient Methods for Large-Scale Unconstrained Optimization," Journal of Computational Mathematics, Vol. 21, No. 3, 2003, pp. 311-320.