Applied Mathematics, 2011, 2, 725-731
doi:10.4236/am.2011.26096 Published Online June 2011 (http://www.SciRP.org/journal/am)
PRP-Type Direct Search Methods for Unconstrained
Optimization*
Qunfeng Liu, Wanyou Cheng
College of Computer, Dongguan University of Technology, Dongguan, China
E-mail: liuqf@dgut.edu.cn
Received March 26, 2011; revised April 14, 2011; accepted April 17, 2011
Abstract
Three PRP-type direct search methods for unconstrained optimization are presented. The methods adopt
three kinds of recently developed descent conjugate gradient methods and the idea of frame-based direct
search methods. Global convergence is shown for continuously differentiable functions. Data profiles and
performance profiles are adopted to analyze the numerical experiments, and the results show that the proposed
methods are effective.
Keywords: Direct Search Methods, Descent Conjugate Gradient Methods, Frame-Based Methods, Global
Convergence, Data Profile, Performance Profile
1. Introduction
Direct search methods form an important class of nu-
merical methods for solving optimization problems.
They are particularly useful for solving problems where the derivatives are not available or are difficult
to compute. Early direct search methods can be traced back to the compass search method [1] and the pattern
search method [2]. In the 1960s, direct search methods were widely applied. Due to the lack of convergence
theory, however, direct search methods fell out of favor with the mathematical optimization community by the
early 1970s.
In the past twenty years, direct search methods have
seen a revival. Significant interest has been provoked
by new convergence results. For a direct search method
to achieve convergence, it often needs to employ one of
the following three techniques, namely line searches,
trust regions and discrete grids [3]. In this paper we will
adopt the discrete grid strategy to develop some globally
convergent direct search methods. Alternative appro-
aches based on line searches and trust regions can be
found in [3-6] and the references therein.
Early convergence theory based on discrete grids was
established by Torczon et al. in [7-9]. They developed a
framework for the GPS (generalized pattern search) me-
thod which contains many earlier algorithms, such as the
compass search, Hooke and Jeeves’ pattern search, as
special cases. The GPS methods have been extended to
bound and linearly constrained problems [10,11], non-
smooth problems [12] and general constrained problems
[13].
The grids play an important role in the GPS methods
and their extensions. In the GPS methods, each grid is
often a subset of some member of a sequence of nested
grids.
Coope and Price [14] have shown that the grids can be chosen more freely. The grid-based methods proposed in
[14] incorporate two arbitrary finite processes which may reorient and reshape the grids. Partly provoked
by Coope and Price's ideas, the review paper [15] presented a new unified framework, called the GSS
(generating set search) method, for a large number of direct search methods including the GPS methods. The
GSS methods have also been extended to linearly constrained optimization [16,17].
The great freedom in the choice of grids permits one to choose grids that reflect the information gathered
during previous iterations, especially the possible gradient information. In [18], a concept called a frame was proposed
to gather the gradient information. Here a frame (see
Definition 2.2) is a fragment of a grid. Loosely speaking,
a frame is a set of points which surround a central point
called frame center. Suppose that we already have a
frame around the iterate $x_k$, then the set of search directions must contain all the frame points and may
contain an arbitrary finite process. If there is no search direction such that the objective function value is
sufficiently less than $f(x_k)$, then $x_k$ is called a quasi-minimal point (see Definition 2.3). After a
quasi-minimal point is found, the frame size must be reduced for convergence.

*This work was supported by the NNSF of China Grant 11071087 and 10971058.
A direct search method which conforms to the frame-based template in [18] was proposed in [19], where the
PRP+ method was used to make use of the gradient information gathered during previous iterations. The
numerical experiments presented in [19] show that this frame-based conjugate gradient method is effective.
In this paper, we consider the following unconstrained optimization problem
$$\min\; f(x), \quad x \in R^n, \qquad (1)$$
where the objective function $f(x)$ is continuously differentiable but the gradient of $f$ is not available
or is computationally expensive.
Our main purpose is to construct efficient direct search
methods for solving (1). We also use the frame-based
template in [18], but substitute PRP+ method with some
descent conjugate gradient methods to make use of the
gradient information. These descent conjugate gradient
methods for solving unconstrained optimization prob-
lems enjoy some nice properties and good numerical
behavior. A common property of these methods is that
they can generate sufficient descent directions for the
objective function. In the case when exact gradients are available, descent conjugate gradient methods are
computationally more effective than the PRP+ method [20-23]. We
are going to include the descent conjugate gradient
methods in the frame-based direct search framework to
develop more efficient direct search methods for solving
(1).
We will consider three descent conjugate gradient methods: the TTPRP (three-term PRP) method proposed in
[23], the TMPRP (two-term modified PRP) method proposed in [20] and the PRP-DC method proposed in [24].
Let us give a brief review of these descent conjugate gradient methods. Each method generates a sequence of
iterates $\{x_k\}$ by
$$x_{k+1} = x_k + \alpha_k d_k, \quad k = 0, 1, \ldots,$$
where the step-length $\alpha_k$ is determined by some line search, and the initial direction is set to
$d_0 = -\nabla f(x_0)$.
In the TTPRP method, the direction $d_{k+1}$ is determined by
$$d_{k+1} = -\nabla f(x_{k+1}) + \frac{\nabla f(x_{k+1})^T y_k}{\|\nabla f(x_k)\|^2}\, d_k - \frac{\nabla f(x_{k+1})^T d_k}{\|\nabla f(x_k)\|^2}\, y_k, \qquad (2)$$
where $y_k = \nabla f(x_{k+1}) - \nabla f(x_k)$. In the TMPRP method, the direction $d_{k+1}$ is determined by
$$d_{k+1} = -\nabla f(x_{k+1}) + \frac{\nabla f(x_{k+1})^T y_k}{\|\nabla f(x_k)\|^2}\left(I - \frac{\nabla f(x_{k+1})\,\nabla f(x_{k+1})^T}{\|\nabla f(x_{k+1})\|^2}\right) d_k, \qquad (3)$$
where $I$ denotes the identity matrix. In the PRP-DC method, the direction $d_{k+1}$ is determined by
$$d_{k+1} = -\nabla f(x_{k+1}) + \left(\frac{y_k^T \nabla f(x_{k+1})}{\|\nabla f(x_k)\|^2} - \frac{\|y_k\|^2}{y_k^T s_k}\,\frac{s_k^T \nabla f(x_{k+1})}{\|\nabla f(x_k)\|^2}\right) s_k, \qquad (4)$$
where $s_k = x_{k+1} - x_k$.
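For illustration, the sketch below evaluates the three search directions (2)-(4) with gradient estimates supplied in place of exact gradients, as is done in the algorithm of Section 2. It is a minimal reading of the formulas above, not the authors' Fortran 90 code; all function and variable names are ours, and the PRP-DC routine follows the form of (4) as written here.

```python
import numpy as np

def ttprp_direction(g_new, g_old, d_old):
    """Three-term PRP direction (2); g_new and g_old approximate the gradients
    at the new and old iterates, d_old is the previous search direction."""
    y = g_new - g_old
    denom = np.dot(g_old, g_old)
    return -g_new + (np.dot(g_new, y) / denom) * d_old \
                  - (np.dot(g_new, d_old) / denom) * y

def tmprp_direction(g_new, g_old, d_old):
    """Two-term modified PRP direction (3): the previous direction is projected
    onto the orthogonal complement of g_new before the PRP update."""
    y = g_new - g_old
    beta = np.dot(g_new, y) / np.dot(g_old, g_old)
    proj_d = d_old - (np.dot(g_new, d_old) / np.dot(g_new, g_new)) * g_new
    return -g_new + beta * proj_d

def prpdc_direction(g_new, g_old, s):
    """PRP-DC direction as written in (4); s = x_new - x_old."""
    y = g_new - g_old
    denom = np.dot(g_old, g_old)
    beta = np.dot(y, g_new) / denom \
         - (np.dot(y, y) / np.dot(y, s)) * (np.dot(s, g_new) / denom)
    return -g_new + beta * s
```

In the derivative-free setting of this paper, the arguments g_new and g_old stand for the estimates $g_{k+1}$ and $g_k$ formed from frame function values, as described in Section 2.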
As the gradient $\nabla f$ is not available in the direct search setting, we need to find some estimate of it
and use this estimate instead of $\nabla f$ in (2), (3) and (4). Based on these conjugate gradient directions, we
and (4). Based on these conjugate gradient directions, we
then design direct search methods. Under mild condi-
tions, we will prove that these methods are globally con-
vergent. Our numerical results show that the proposed
methods are efficient.
In the next section, we describe our direct search algo-
rithm framework. In Section 3 we establish a global con-
vergence theorem for this algorithm framework. In Sec-
tion 4, we compare the performance of the proposed
methods with some existing methods.
2. The Algorithm
In this section, we first introduce some concepts and then
describe our algorithms.
Definition 2.1. A set of vectors $V$ is called a positive basis for $R^n$ if and only if
1) every vector in $R^n$ is a nonnegative linear combination of the members of $V$, and
2) no proper subset of $V$ satisfies 1).
The positive basis was first proposed in [25], and has become an important concept in direct search methods. In
general, if $\{v_1, v_2, \ldots, v_n\}$ is a basis for $R^n$, then $\{v_1, v_2, \ldots, v_n, -v_1, -v_2, \ldots, -v_n\}$
is a positive basis for $R^n$. There are many other important positive bases; see [25] for more details.
The following two definitions define the frame and
quasi-minimal frame proposed by Coope and Price in [18]
and [19] respectively.
Definition 2.2. A frame around $x$ with size $h > 0$ is defined by
$$(x, h, V) = \{x + hv : v \in V\},$$
where $V$ is a positive basis.
Definition 2.3. A frame $(x, h, V)$ is called quasi-minimal if and only if
$$f(x) \le f(x + hv) + \epsilon \quad \text{for all } v \in V, \qquad (5)$$
where $\epsilon = h^{1+\mu}$, and $\mu > 0$ is a given constant.
If a frame $(x, h, V)$ is quasi-minimal, then the frame center $x$ is called a quasi-minimal point. If there is
a frame point $y \in (x, h, V)$ such that
$$f(y) < f(x) - \epsilon, \qquad (6)$$
then the frame is not quasi-minimal, and we say that the function value at $y$ is sufficiently less than that of
the frame center, or that the objective function obtains sufficient decrease at the frame point $y$. In a frame
which is not quasi-minimal, there is at least one frame point at which the objective function obtains sufficient
decrease.
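As a quick illustration of Definition 2.3, the snippet below checks whether a frame built on the maximal positive basis used in this paper is quasi-minimal. It is a sketch under the assumption $\epsilon = h^{1+\mu}$ stated above; the function name and defaults are ours.

```python
import numpy as np

def is_quasi_minimal(f, x, h, mu=1.0):
    """Check condition (5) on the frame {x + h*v : v in V} with
    V = {e_1, ..., e_n, -e_1, ..., -e_n} and epsilon = h**(1 + mu)."""
    n = len(x)
    eps = h ** (1.0 + mu)
    basis = np.vstack([np.eye(n), -np.eye(n)])   # maximal positive basis
    fx = f(x)
    return all(fx <= f(x + h * v) + eps for v in basis)

# At the minimizer of a convex quadratic every small frame is quasi-minimal:
print(is_quasi_minimal(lambda z: float(np.dot(z, z)), np.zeros(3), h=0.1))
```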
In this paper, we use the well-known positive basis $V = \{e_1, \ldots, e_n, -e_1, \ldots, -e_n\}$ to define a
quasi-minimal frame, where $e_i$ is the $i$th unit vector.
We now describe our PRP-type direct search algo-
rithm.
Firstly, a frame is constructed centered at the current iterate $x_k$. Using the function values at the frame
points, we estimate $g_k$ as an approximation to the gradient $\nabla f(x_k)$. The $i$th element of $g_k$ is
given by
$$(g_k)_i = \frac{f(x_k + h_k e_i) - f(x_k - h_k e_i)}{2 h_k}.$$
The next search direction is then calculated according to (2) for TTPRP (or (3) for TMPRP, or (4) for PRP-DC),
in which $\nabla f(x_k)$ is replaced by $g_k$.
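The gradient estimate above is simply a central difference along each coordinate, reusing the $2n$ function values computed at the frame points. A minimal sketch (our own notation, not the authors' code) is:

```python
import numpy as np

def frame_gradient_estimate(f, x, h):
    """Estimate the gradient of f at x by central differences over the
    frame points x + h*e_i and x - h*e_i, i = 1, ..., n."""
    n = len(x)
    g = np.zeros(n)
    for i in range(n):
        e = np.zeros(n)
        e[i] = 1.0
        g[i] = (f(x + h * e) - f(x - h * e)) / (2.0 * h)
    return g
```

In the algorithm these same frame values also serve the quasi-minimality test, so the gradient estimate requires no additional function evaluations beyond the frame itself.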
We use the line search method proposed in [19] to get a step-length $\alpha_k$. Specifically, we solve the
following one-dimensional optimization problem to get $\alpha_k$:
$$\min_{\alpha \in R}\; f\!\left(x_k + \alpha h_k d_k / \|d_k\|\right), \qquad (7)$$
where the current iterate $x_k$, the frame size $h_k$ and the conjugate direction $d_k$ are given. The next
iterate is then set to $x_{k+1} = x_k + \alpha_k h_k d_k / \|d_k\|$.
If the current frame is quasi-minimal, then decrease the frame size by letting $h_{k+1} = h_k/4$. Otherwise,
if $\alpha_k$ is large enough such that $|\alpha_k| \ge 2\sqrt{2n}$, then increase the frame size by letting
$h_{k+1} = \frac{5}{2} h_k$.
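Since no derivatives are available, the one-dimensional problem (7) must itself be solved by a derivative-free routine. The sketch below is one plausible stand-in, not the specific line search of [19]: it minimizes $\phi(\alpha) = f(x_k + \alpha h_k d_k/\|d_k\|)$ with a bounded Brent routine from SciPy, and the bound on $\alpha$ is an assumption of ours.

```python
import numpy as np
from scipy.optimize import minimize_scalar

def frame_line_search(f, x, h, d):
    """Approximately solve (7): minimize phi(alpha) = f(x + alpha*h*d/||d||).
    Negative alpha is allowed, since d may not be a descent direction."""
    u = d / np.linalg.norm(d)                      # unit direction
    phi = lambda alpha: f(x + alpha * h * u)
    # Bounded Brent minimization; the interval [-10, 10] is an arbitrary choice.
    res = minimize_scalar(phi, bounds=(-10.0, 10.0), method="bounded")
    return res.x, x + res.x * h * u
```

Any one-dimensional derivative-free minimizer could be substituted here; the point is only that (7) is solved using function values alone.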
As the search direction may not be a descent direction, the step-length $\alpha_k$ determined by (7) may be
negative. In our method, we adopt the n-step restart strategy, because the n-step restart conjugate gradient
method possesses the n-step quadratic convergence property when applied to minimizing a twice continuously
differentiable function. Specifically, at a restart iterate $x_{k(n+1)}$ we use $-g_{k(n+1)}$ as the search
direction. For the purpose of convergence, we need to set $x_{k(n+1)}$ equal to the lowest known point before
the next reset. That is to say, we let $x_{k(n+1)}$ be determined by the following rule:
$$f\!\left(x_{k(n+1)}\right) = \min_{\substack{1 \le j \le n+1 \\ 1 \le i \le n}} \left\{\, f\!\left(x_{k(n+1)-j}\right),\; f\!\left(x_{k(n+1)-j} \pm h_{k(n+1)-j} e_i\right) \right\}. \qquad (8)$$
We summarize the above process as the algorithm be-
low.
Algorithm.
1) Initialize: Set $k = 1$, $j = n + 1$, $h_0 = 1$.
2) While (stopping conditions do not hold) do
a) Construct a frame centered at the current iterate $x_k$. Calculate the function values at the frame points,
and form the gradient estimation $g_k$.
b) Check stopping conditions.
c) Calculate the new search direction $d_k$.
d) Execute the line search process to find $\alpha_k$.
e) Set $x_{k+1} = x_k + \alpha_k h_k d_k / \|d_k\|$.
f) If the current frame is quasi-minimal, then set $h_{k+1} = h_k/4$; else if $|\alpha_k| \ge 2\sqrt{2n}$, then
set $h_{k+1} = \frac{5}{2} h_k$; else set $h_{k+1} = h_k$. Increase $k$ by one and decrease $j$ by one,
respectively.
g) If $j > 1$ then return to step a); else
set $x_{k+1}$ equal to the lowest known point according to (8);
set $j = n + 1$, increase $k$ by one.
end.
In the above algorithm, the variable $j$ counts the number of iterations until the next restart.
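Putting the pieces together, the following self-contained sketch follows the loop above with the TTPRP direction (2). It is an illustrative reading of the algorithm, not the authors' Fortran 90 implementation; the stopping test, the frame-growth rule, the bounded line search and the restart bookkeeping are simplified assumptions of ours.

```python
import numpy as np
from scipy.optimize import minimize_scalar

def prp_type_direct_search(f, x0, h0=1.0, mu=1.0, max_iter=200, h_min=1e-10):
    """Frame-based direct search with a TTPRP-type direction (sketch)."""
    n = len(x0)
    x, h = np.asarray(x0, dtype=float), h0
    g_old, d_old, j = None, None, n + 1
    for _ in range(max_iter):
        # a) frame function values and central-difference gradient estimate
        E = np.eye(n)
        f_plus = np.array([f(x + h * E[i]) for i in range(n)])
        f_minus = np.array([f(x - h * E[i]) for i in range(n)])
        g = (f_plus - f_minus) / (2.0 * h)
        # b) stopping test (simplified: stop once the frame size is tiny)
        if h <= h_min:
            break
        # c) search direction: restart with -g, otherwise TTPRP update (2)
        if g_old is None or j == n + 1:
            d = -g
        else:
            y, denom = g - g_old, np.dot(g_old, g_old)
            d = -g + (np.dot(g, y) / denom) * d_old - (np.dot(g, d_old) / denom) * y
        # d) derivative-free line search for (7) (bounded Brent as a stand-in)
        u = d / np.linalg.norm(d)
        res = minimize_scalar(lambda a: f(x + a * h * u),
                              bounds=(-10.0, 10.0), method="bounded")
        alpha = res.x
        # e) step to the new iterate
        x_new = x + alpha * h * u
        # f) frame-size update; the quasi-minimal test (5) reuses the frame values
        eps = h ** (1.0 + mu)
        quasi_minimal = (f(x) <= min(f_plus.min(), f_minus.min()) + eps)
        if quasi_minimal:
            h = max(h_min, h / 4.0)
        elif abs(alpha) >= 2.0 * np.sqrt(2.0 * n):
            h = 2.5 * h
        # g) restart counter
        j = n + 1 if j <= 1 else j - 1
        x, g_old, d_old = x_new, g, d
    return x

# Example usage on a smooth test function (two-dimensional Rosenbrock):
xs = prp_type_direct_search(lambda z: (1 - z[0])**2 + 100*(z[1] - z[0]**2)**2,
                            np.zeros(2))
```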
3. Convergence Analysis
If we regard all the conjugate gradient steps as part of the
finite process, then the general convergence result of the
frame-based method will still hold [18]. In this section,
we describe the convergence result specified for our al-
gorithm.
Suppose the sequence of quasi-minimal points generated by the algorithm is denoted by $\{z_m\}$. The following
theorem shows that $\{z_m\}$ must be infinite and each limit point of $\{z_m\}$ is a stationary point of the
objective function.
Theorem 3.1. Assume that
1) the sequence of iterates $\{x_k\}$ generated by the algorithm is bounded;
2) $f$ is continuously differentiable; and
3) $h_k \to 0$ as $k \to \infty$.
Then $\{z_m\}$ is an infinite sequence whose cluster points are stationary points of $f$.
Proof. Firstly, we prove that the algorithm generates an infinite sequence of quasi-minimal points. Suppose on
the contrary that the sequence of quasi-minimal points is finite. Let the final quasi-minimal point be $z_m$.
After the final quasi-minimal iterate $z_m$ is found, the parameter $h_k$ will never decrease, and (6) will hold
for all $k > m$. This implies that the sequence of function values is unbounded below. It contradicts condition
1). Consequently, the sequence of quasi-minimal iterates $\{z_m\}$ is infinite.
Let $z$ be an arbitrary cluster point of $\{z_m\}$ and let the subsequence $\{z_m\}_{m \in K}$ converge to $z$.
Then for any $z_m$ with $m \in K$, we have
$$f\!\left(z_m \pm \hat{h}_m e_i\right) \ge f\!\left(z_m\right) - \hat{h}_m^{1+\mu}, \quad i = 1, \ldots, n,$$
where $\hat{h}_m$ is the frame size corresponding to the quasi-minimal point $z_m$. So we get
$$\frac{f\!\left(z_m \pm \hat{h}_m e_i\right) - f\!\left(z_m\right)}{\hat{h}_m} \ge -\hat{h}_m^{\mu}, \quad i = 1, \ldots, n.$$
Letting $m \to \infty$ with $m \in K$, we obtain
$$\nabla f(z)^T e_i \ge 0 \quad \text{and} \quad -\nabla f(z)^T e_i \ge 0, \quad i = 1, \ldots, n.$$
These two inequalities yield $\nabla f(z) = 0$.

4. Numerical Experiments

In this section, we report some numerical results. We compare the proposed direct search methods (denoted as
TTPRP, TMPRP and PRP-DC, respectively) with the preconditioned PRP+ direct search method (denoted as
PRP+(pre)) proposed in [19].
We adopt both the performance profile [26] and the data profile [27] to compare the performance of the
different methods. The performance profile seeks to capture how well one solver performs relative to the other
selected solvers on the set of test problems, while the data profile displays the raw data. In particular, the
data profile shows the percentage of problems that can be solved (for a given tolerance $\tau$) within any given
number of function evaluations. So the data profile is especially suitable for situations where function
evaluations are expensive [27].
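For reference, the data profile of Moré and Wild [27] can be computed from the number of function evaluations each solver needs to satisfy the convergence test on each problem. The sketch below is our own summary of that computation; the bookkeeping of solvers and problems is assumed, not taken from the paper.

```python
import numpy as np

def data_profile(evals_to_solve, n_vars, budgets):
    """Fraction of problems solved within a budget of simplex gradients.

    evals_to_solve[p, s] : function evaluations solver s needs on problem p
                           to satisfy the convergence test (np.inf if never);
    n_vars[p]            : number of variables of problem p;
    budgets              : iterable of budgets kappa, in simplex gradients.
    Returns an array of shape (len(budgets), number of solvers)."""
    evals = np.asarray(evals_to_solve, dtype=float)
    cost = evals / (np.asarray(n_vars, dtype=float)[:, None] + 1.0)
    return np.array([(cost <= kappa).mean(axis=0) for kappa in budgets])
```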
Figure 1. Data profiles show the percentage of problems solved as a function of a computational budget of simplex gradients.
We use the 53 smooth test problems proposed in [27]. The maximum dimension of these problems is 12. All
methods halt when the following convergence test proposed in [27] is satisfied:
$$f(x) \le f_L + \tau\left(f(x_0) - f_L\right), \qquad (9)$$
where $\tau > 0$ is a tolerance, $x_0$ is the starting point for the test problem, and $f_L$ is computed for
each problem as the smallest value of $f$ obtained by any solver within 1300 function evaluations. Because we
are interested in the short-term behavior of these methods as the accuracy level changes, we present the data
profiles and the performance profiles for $\tau = 10^{-k}$ with $k \in \{1, 3, 5, 7\}$.
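Expressed in code, the convergence test (9) is a one-line check; the helper below and its name are our own, included only to make the test concrete.

```python
def converged(f_x, f_0, f_L, tau):
    """More-Wild convergence test (9): accept x when the achieved reduction
    is at least a fraction (1 - tau) of the best-known reduction f_0 - f_L."""
    return f_x <= f_L + tau * (f_0 - f_L)

# f_L for a problem: the smallest value any solver reached within 1300
# evaluations, e.g. f_L = min(min(hist[:1300]) for hist in solver_histories)
```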
In our numerical experiments, if the current frame is quasi-minimal, then the frame size is reduced by the
formula $h_{k+1} = \max\{h_{\min}, h_k/4\}$, where $h_{\min} = 10^{-10}$. The role of $h_{\min}$ is to ensure
that $h$ does not get too close to the machine precision. If $h$ gets too close to the machine precision, then
the gradient estimates may become completely inaccurate.
The codes were written in Fortran 90, and the program was run on a PC with a Genuine Intel(R) CPU
(T1350 @ 1.86 GHz) and 504 MB of memory.
Figure 1 shows the data profiles for tolerances $\tau = 10^{-1}$, $10^{-3}$, $10^{-5}$ and $10^{-7}$,
respectively. It can be seen that TTPRP solves the largest percentage of problems for almost all sizes of the
computational budget and levels of accuracy $\tau$. TMPRP and PRP-DC are also comparable with PRP+(pre). It is
noteworthy that the performance differences between TTPRP and the other solvers tend to increase as the
tolerance $\tau$ decreases and the computational budget increases. On the other hand, Figure 1 also shows that
within about 50 simplex gradients the performance differences between the four solvers are small.
Figure 2 shows the performance profiles based on the number of function evaluations. The left side of each plot
gives the percentage of the test problems that a solver can solve with the greatest efficiency; the right side
gives the percentage of the test problems that a solver can solve successfully. Mainly, the right side measures
the robustness of a solver.
Figure 2. Performance profiles (logarithmic scale).
It can be seen from Figure 2 that TTPRP solves the largest percentage of problems for almost all sizes of the
performance ratio and levels of accuracy $\tau$. TMPRP and PRP-DC are also comparable with PRP+(pre). It is
noteworthy that the performance differences between TTPRP and the other solvers still tend to increase as the
tolerance $\tau$ decreases.
5. Discussion
Based on the Coope-Price direct search framework, we
proposed three PRP-type direct search methods. All of
them employ a kind of descent conjugate gradient direc-
tion. When exact gradients are available, descent conjugate gradient methods can generate sufficient descent
directions for the objective function, and numerical results have shown that they are often computationally
more effective than the PRP+ method.
In this paper, the gradient information of the objective function is not available. We estimate the gradients
from the function values obtained at the frame points of the maximal positive basis. These estimated gradients
may not be very accurate, but global convergence can be ensured under the Coope-Price direct search framework.
In other words, convergence is guaranteed by the frame-based nature of the algorithms, not by the fact that
they mimic a conjugate gradient method.
The accuracy of the estimated gradients may also affect the descent property of the descent conjugate gradient
directions. However, the numerical results show that the proposed PRP-type direct search methods are promising
and competitive, especially the TTPRP method.
6. References
[1] W. C. Davidon, “Variable Metric Method for Minimiza-
tion,” SIAM Journal on Optimization, Vol. 1, No. 1, 1992,
pp. 1-17. doi:10.1137/0801001
[2] R. Hooke and T. A. Jeeves, “Direct Search Solution of
Numerical and Statistical Problems,” Journal of the ACM,
Vol. 8, No. 2, 1961, pp. 212-229.
doi:10.1145/321062.321069
[3] M. J. D. Powell, “Direct Search Algorithms for Optimi-
zation Calculations,” Acta Numerica, Vol. 7, No. 1, 1998,
pp. 287-336. doi:10.1017/S0962492900002841
[4] A. Conn, K. Scheinberg and P. L. Toint, “On the Con-
vergence of Derivative-Free Methods for Unconstrained
Optimization,” In: M. D. Buchmann and A. Iserles, Eds.,
Approximation Theory and Optimization, Cambridge
University Press, Cambridge, 1997.
[5] B. Karasozen, “Survey of Trust-Region Derivative Free
Optimization Methods,” Journal of Industrial and Man-
agement Optimization, Vol. 3, No. 2, 2007, pp. 321-334.
[6] M. J. D. Powell, “A View of Algorithms for Optimization
without Derivatives,” Technical Report DAMTP 2007/NA03,
Department of Applied Mathematics and Theoretical
Physics, University of Cambridge, Cambridge, 2007.
[7] R. M. Lewis and V. Torczon, “Rank Ordering and Posi-
tive Bases in Pattern Search Algorithms,” Technical Re-
port 96-71, Institute for Computer Applications in Sci-
ence and Engineering, NASA Langley Research Center,
Hampton, 1996.
[8] V. Torczon, “On the Convergence of the Multidirectional
Search Algorithm,” SIAM Journal on Optimization, Vol.
1, No. 1, 1991, pp. 123-145. doi:10.1137/0801010
[9] V. Torczon, “On the Convergence of Pattern Search Al-
gorithms,” SIAM Journal on Optimization, Vol. 7, No. 1,
1997, pp. 1-25. doi:10.1137/S1052623493250780
[10] R. M. Lewis and V. Torczon, “Pattern Search Algorithms
for Bound Constrained Minimization,” SIAM Journal on
Optimization, Vol. 9, No. 4, 1999, pp. 1082-1099.
doi:10.1137/S1052623496300507
[11] R. M. Lewis and V. Torczon, “Pattern Search Methods
for Linearly Constrained Minimization,” SIAM Journal
on Optimization, Vol. 10, No. 3, 2000, pp. 917-941.
doi:10.1137/S1052623497331373
[12] C. Audet and J. E. Dennis Jr., “Analysis of Generalized
Pattern Searches,” SIAM Journal on Optimization, Vol.
13, No. 3, 2003, pp. 889-903.
doi:10.1137/S1052623400378742
[13] C. Audet and J. E. Dennis Jr., “Mesh Adaptive Direct
Search Algorithms for Constrained Optimization,” SIAM
Journal on Optimization, Vol. 17, No. 1, 2006, pp. 188-
217. doi:10.1137/040603371
[14] I. D. Coope and C. J. Price, “On the Convergence of
Grid-Based Methods for Unconstrained Optimization,”
SIAM Journal on Optimization, Vol. 11, No. 4, 2001, pp.
859-869. doi:10.1137/S1052623499354989
[15] T. G. Kolda, R. M. Lewis and V. Torczon, “Optimization
by Direct Search: New Perspectives on Some Classical
and Modern Methods,” SIAM Review, Vol. 45, No. 3,
2003, pp. 385-482. doi:10.1137/S003614450242889
[16] T. G. Kolda, R. M. Lewis and V. Torczon, “Stationary
Results for Generating Set Search for Linearly Con-
strained Optimization,” SIAM Journal on Optimization,
Vol. 17, No. 4, 2006, pp. 943-968.
doi:10.1137/S1052623403433638
[17] R. M. Lewis, A. Shepherd and V. Torczon, “Implement-
ing Generating Set Search Methods for Linearly Con-
strained Minimization,” SIAM Journal on Scientific Com-
puting, Vol. 29, No. 6, 2007, pp. 2507-2530.
doi:10.1137/050635432
[18] I. D. Coope and C. J. Price, “Frame Based Methods for
Unconstrained Optimization,” Journal of Optimization
Theory and Applications, Vol. 107, No. 2, 2000, pp. 261-
274. doi:10.1023/A:1026429319405
[19] I. D. Coope and C. J. Price, “A Direct Search Frame-
Based Conjugate Gradients Method,” Journal of Compu-
tational Mathematics, Vol. 22, No. 4, 2004, pp. 489-500.
[20] W. Y. Cheng, “A Two-Term PRP-Based Descent Method,”
Numerical Functional Analysis and Optimization, Vol. 28,
No. 11-12, 2007, pp. 1217-1230.
doi:10.1080/01630560701749524
[21] W. W. Hager and H. Zhang, “A New Conjugate Gradient
Method with Guaranteed Descent and an Efficient Line
Search,” SIAM Journal on Optimization, Vol. 16, No. 1,
2005, pp. 170-192. doi:10.1137/030601880
[22] W. W. Hager and H. Zhang, “A Survey of Nonlinear
Conjugate Gradient Methods,” Pacific Journal of Opti-
mization, Vol. 2, 2006, pp. 35-58.
[23] L. Zhang, W. J. Zou and D. H. Li, “A Descent Modified
Polak-Ribiere-Polyak Conjugate Gradient Method and Its
Global Convergence,” IMA Journal of Numerical Analy-
sis, Vol. 26, 2006, pp. 629-640.
doi:10.1093/imanum/drl016
[24] N. Andrei, “A Modified Polak-Ribiere-Polyak Conjugate
Gradient Algorithm for Unconstrained Optimization,”
Optimization: A Journal of Mathematical Programming
and Operational Research, Accepted.
[25] C. Davis, “Theory of Positive Linear Dependence,”
American Journal of Mathematics, Vol. 76, No. 4, 1954,
pp. 733-746. doi:10.2307/2372648
[26] E. D. Dolan and J. J. More, “Benchmarking Optimization
Software with Performance Profiles,” Mathematical Pro-
gramming, Vol. 91, No. 2, 2002, pp. 201-213.
doi:10.1007/s101070100263
[27] J. J. More and S. M. Wild, “Benchmarking Deriva-
tive-Free Optimization Algorithms,” SIAM Journal on
Optimization, Vol. 20, No. 1, 2009, pp. 172-191.
doi:10.1137/080724083