We propose a new scalarization method which consists in constructing, for a given multiobjective optimization problem, a single scalarization function, whose global minimum points are exactly vector critical points of the original problem. This equivalence holds globally and enables one to use global optimization algorithms (for example, classical genetic algorithms with “roulette wheel” selection) to produce multiple solutions of the multiobjective problem. In this article we prove the mentioned equivalence and show that, if the ordering cone is polyhedral and the function being optimized is piecewise differentiable, then computing the values of a scalarization function reduces to solving a quadratic programming problem. We also present some preliminary numerical results pertaining to this new method.
Scalarization is one of the most commonly used methods of solving multiobjective optimization problems. It consists in replacing the original multiobjective problem by a scalar optimization problem, or a family of scalar optimization problems, which is, in a certain sense, equivalent to the original problem. The existing scalarization methods can be divided into two groups:
1) Methods that use some representation of a given multiobjective problem as a parametrized family of scalar optimization problems. Such scalarization methods should have the following two properties (see [
2) Methods that use local equivalence of a multiobjective optimization problem and some scalar optimization problem whose formulation depends on a given point. Such equivalence enables one to solve the multiobjective problem locally by using necessary and/or sufficient optimality conditions formulated for the scalar problem (for examples of such an approach, see [
There are also scalarization approaches which combine properties of both groups such as the Pascoletti-Serafini scalarization [
In this paper, we propose a new scalarization method different from the above-mentioned ones. It consists in constructing, for a given multiobjective optimization problem, a single scalarization function, whose global minimum points are exactly vector critical points in the sense of [
So far, the term “scalarization function” has been used for a scalar-valued function defined on the image space of an optimization problem, which transforms a vector-valued objective function into a scalar-valued one (see [
The purpose of this research is to describe the idea of our new scalarization method and to present some underlying theory for the case of an unconstrained multiobjective optimization problem. The extension to constrained optimization is also possible and will be the subject of further investigations.
Let Ω be an open set in ℝ n , and let f = ( f 1 , ⋯ , f p ) : Ω → ℝ p be a locally Lipschitzian vector function. Suppose that C is a closed convex pointed cone in ℝ p with nonempty interior. We denote by C+ the positive polar cone to C, i.e.,
C + : = { z ∈ ℝ p : 〈 z , y 〉 ≥ 0 , ∀ y ∈ C } , (1)
where 〈 ⋅ , ⋅ 〉 is the usual inner product in ℝ p . The partial order relation in ℝ p is defined by
y ≼ z if and only if z − y ∈ C , (2)
for all y , z ∈ ℝ p . We consider the following multiobjective optimization problem:
minimize f ( x ) subject to x ∈ Ω . (3)
Definition 1 [
∂ f ( x ¯ ) : = co { lim n → ∞ J f ( x n ) : x n → x ¯ , J f ( x n ) exists } , (4)
where J f ( x ) denotes the usual Jacobian matrix of f at x whenever f is Fréchet differentiable at x, and “co” denotes the convex hull of a set.
We will denote by ℝ p × n the vector space of all p × n real matrices. It follows from ( [
Definition 2 Let Ω be an open subset of ℝ n and let f i : Ω → ℝ p , i = 1 , ⋯ , k , be a collection of continuous functions.
(i) A function f : Ω → ℝ p is said to be a continuous selection of the functions f 1 , ⋯ , f k on the set U ⊂ Ω if f is continuous on U and f ( x ) ∈ { f 1 ( x ) , ⋯ , f k ( x ) } for every x ∈ U .
(ii) A function f : Ω → ℝ p is called a PC1-function if for every x ¯ ∈ Ω there exists an open neighborhood U ⊂ Ω and a finite number of C1-functions f i : U → ℝ p , i = 1 , ⋯ , k , such that f is a continuous selection of f 1 , ⋯ , f k on U. In this case, we call f 1 , ⋯ , f k the selection functions for f at x ¯ .
(iii) Let f : Ω → ℝ p be a PC1-function and let x ¯ ∈ U ⊂ Ω (U open). Suppose that f is a continuous selection of f 1 , ⋯ , f k on U. We define the set of essentially active indices for f at x ¯ as follows:
I f e ( x ¯ ) : = { i ∈ { 1, ⋯ , k } : x ¯ ∈ cl ( int { x ∈ U : f ( x ) = f i ( x ) } ) } . (5)
Proposition 3 ( [
∂ f ( x ¯ ) = co { J f i ( x ¯ ) : i ∈ I f e ( x ¯ ) } . (6)
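As a simple illustration of (5) and (6), which is not taken from the cited sources, consider the scalar case n = p = 1 with the absolute value function on Ω = ℝ. The choice of f and of its selection functions below is our own assumption, made only to show how the essentially active indices determine the generalized Jacobian.

```latex
\begin{align*}
  f(x) &= |x|, \qquad f_1(x) = x, \qquad f_2(x) = -x, \\
  \{x : f(x) = f_1(x)\} &= [0, +\infty), \qquad \{x : f(x) = f_2(x)\} = (-\infty, 0], \\
  I_f^e(0) &= \{1, 2\}, \qquad I_f^e(\bar{x}) = \{1\} \ (\bar{x} > 0), \qquad I_f^e(\bar{x}) = \{2\} \ (\bar{x} < 0), \\
  \partial f(0) &= \operatorname{co}\{J f_1(0), J f_2(0)\} = \operatorname{co}\{1, -1\} = [-1, 1].
\end{align*}
```

Away from the kink only one selection function is essentially active, so the generalized Jacobian there reduces to the single derivative 1 or −1.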
Definition 4 [
(i) x ¯ is a vector critical point for problem (3) if there exist z ∈ C + \ { 0 p } and A ∈ ∂ f ( x ¯ ) such that
z T A = 0 n , (7)
where 0 n is the zero vector in ℝ n ;
(ii) x ¯ is an efficient solution for (3) if
( f ( Ω ) − f ( x ¯ ) ) ∩ ( − C ) = { 0 p } ; (8)
(iii) x ¯ is a weakly efficient solution for (3) if
( f ( Ω ) − f ( x ¯ ) ) ∩ ( − int C ) = ∅ ; (9)
(iv) x ¯ is a local weakly efficient solution for (3) if there exists a neighborhood U of x ¯ such that
( f ( Ω ∩ U ) − f ( x ¯ ) ) ∩ ( − int C ) = ∅ . (10)
It is obvious that the implications ( ii ) ⇒ ( iii ) ⇒ ( iv ) hold in Definition 4. The implication ( iv ) ⇒ ( i ) (for locally Lipschitzian f) follows from [
Definition 5 [
Remark 6 If B is a base of the nontrivial convex cone C, then 0 p ∉ B .
Lemma 7 (a finite-dimensional version of [
B : = { z ∈ C + : 〈 z , y ¯ 〉 = 1 } (11)
is a compact base for C + .
In the sequel, we consider a fixed vector y ¯ ∈ int C and a base B for C + defined by (11). In order to define a global scalarization function for problem (3), we first consider the following mapping h : ℝ p × ℝ p × n → ℝ n :
h ( y , A ) : = y T A . (12)
Lemma 8 A point x ¯ ∈ Ω is a vector critical point for problem (3) if and only if
0 n ∈ h ( B × ∂ f ( x ¯ ) ) . (13)
Proof. If x ¯ ∈ Ω is a vector critical point for problem (3), then equality (7) holds for some z ∈ C + \ { 0 p } and A ∈ ∂ f ( x ¯ ) . Since B is a base for C + , there exist λ > 0 and b ∈ B such that z = λ b . Then, by (7),
h ( b , A ) = b T A = 0 n , (14)
so that (13) holds. Conversely, if (14) is true for some b ∈ B and A ∈ ∂ f ( x ¯ ) , then by Definition 5 and Remark 6, we have b ∈ C + \ { 0 p } . Taking z = b in Definition 4, we see that x ¯ is a vector critical point for (3). □
For a nonempty subset S of ℝ n , let d ( ⋅ , S ) : ℝ n → ℝ be the distance function of S, defined as follows:
d ( x , S ) : = inf { ‖ x − u ‖ : u ∈ S } , (15)
where ‖ ⋅ ‖ denotes the Euclidean norm. We now introduce the following scalarization function s : Ω → [ 0 , + ∞ ) :
s ( x ) : = d ( 0 n , h ( B × ∂ f ( x ) ) ) . (16)
Note that s depends on the choice of y ¯ . The name “scalarization function” is justified by the following.
Theorem 9 A point x ¯ ∈ Ω is a vector critical point for problem (3) if and only if s ( x ¯ ) = 0 .
Proof. If x ¯ is a vector critical point for (3), then by Lemma 8, condition (13) holds, which gives s ( x ¯ ) = 0 . Conversely, suppose that s ( x ¯ ) = 0 . Since h is continuous and the sets B and ∂ f ( x ¯ ) are compact in ℝ p and ℝ p × n , respectively, the set h ( B × ∂ f ( x ¯ ) ) is also compact; hence it is closed. Therefore, the equality s ( x ¯ ) = 0 implies condition (13). □
Having defined the scalarization function s, we can now replace problem (3) by the following scalar optimization problem:
minimize s ( x ) subject to x ∈ Ω . (17)
Obviously, problems (3) and (17) are not equivalent because there may exist vector critical points which are not (weakly) efficient solutions for (3). Nevertheless, by solving problem (17) we can obtain some approximation of the set of solutions to (3).
Computing the distance function in (16) is not easy in the general case, but under additional assumptions on both C and f, it is possible to apply some existing algorithms to perform this task. The details are described below.
Definition 10 ( [
D = { y ∈ ℝ p : 〈 y , b i 〉 ≤ β i , i = 1 , ⋯ , m } . (18)
A convex cone which is a polyhedral set is called a polyhedral cone.
Theorem 11 Suppose that the ordering cone C in ℝ p is polyhedral and the function f : Ω → ℝ p is PC1. Let y ¯ ∈ int C , let B be a base for C + defined by (11) and let h be the function defined by (12). Then, for each x ∈ Ω , the set h ( B × ∂ f ( x ) ) is polyhedral, or equivalently, it can be represented as the convex hull of a finite number of points in ℝ n .
Proof. It follows from ( [
x = λ 1 a 1 + ⋯ + λ k a k + λ k + 1 a k + 1 + ⋯ + λ l a l , (19)
where
λ 1 + ⋯ + λ k = 1 , λ i ≥ 0 for i = 1 , ⋯ , l . (20)
In particular, if D is bounded, then no λ i can be arbitrarily large, which implies that k = l , and conditions (19) - (20) reduce to
x ∈ co { a 1 , ⋯ , a k } .
By assumption, C is polyhedral, hence, by [
Theorem 11 reduces the problem of computing the values s ( x ) given by (16) to the problem of computing the Euclidean projection of 0 n onto the polyhedron h ( B × ∂ f ( x ) ) . This is a particular case of a quadratic programming problem (see [
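To make the projection step concrete, the following minimal sketch approximates the Euclidean distance from the origin to the convex hull of finitely many given points. It uses a Frank-Wolfe iteration with exact line search (often called Gilbert's algorithm), which is our own illustrative choice and not one of the quadratic programming methods cited above. The function names `min_norm_point` and `s_value`, the tolerance and the iteration limit are all assumptions, and the generating points are taken as given input.

```python
import math


def min_norm_point(points, tol=1e-12, max_iter=10000):
    """Approximate the point of co(points) closest to the origin.

    Frank-Wolfe iteration with exact line search; `tol` bounds the
    Frank-Wolfe gap <x, x - v> at termination (illustrative default).
    """
    x = list(points[0])  # any generating point is a valid starting point
    for _ in range(max_iter):
        # Generating point minimizing the linear function <x, v>.
        v = min(points, key=lambda p: sum(xi * pi for xi, pi in zip(x, p)))
        d = [xi - vi for xi, vi in zip(x, v)]        # d = x - v
        gap = sum(xi * di for xi, di in zip(x, d))   # <x, x - v> >= 0
        if gap <= tol:
            break
        dd = sum(di * di for di in d)                # positive whenever gap > 0
        t = min(1.0, gap / dd)                       # exact step, clipped to [0, 1]
        x = [xi - t * di for xi, di in zip(x, d)]    # move from x toward v
    return x


def s_value(points):
    """Distance from the origin to co(points)."""
    return math.sqrt(sum(c * c for c in min_norm_point(points)))


if __name__ == "__main__":
    print(s_value([(1.0, 0.0), (0.0, 1.0)]))               # segment not containing 0
    print(s_value([(1.0, 0.0), (-1.0, 1.0), (-1.0, -1.0)]))  # triangle containing 0
```

The iteration only needs inner products with the generating points, so it is easy to write down, but a dedicated quadratic programming solver would normally be preferable when many points or high accuracy are involved.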
For two objectives, under differentiability assumptions, it is possible to find some representation of the scalarization function s in terms of the gradients ∇ f 1 and ∇ f 2 . Let p = 2 and suppose that the mapping f = ( f 1 , f 2 ) is continuously differentiable on ℝ n . Denote by ∇ f i ( x ) the gradient of fi at x (i = 1, 2). Then (4) implies
$$\partial f(\bar{x}) = \{ J f(\bar{x}) \} = \left\{ \begin{bmatrix} \nabla f_1(\bar{x})^T \\ \nabla f_2(\bar{x})^T \end{bmatrix} \right\}. \tag{21}$$
The following theorem will help to compute the scalarization function (16) for bi-objective problems.
Theorem 12 Let p = 2, y ¯ ∈ int C , and let B be the compact base for C + defined by (11). Then there exist vectors $b^{(i)} = (b^{(i)}_1, b^{(i)}_2) \in B$, i = 1 , 2 , such that
$$h(B \times \partial f(\bar{x})) = \operatorname{co}\{\, b^{(1)}_1 \nabla f_1(\bar{x}) + b^{(1)}_2 \nabla f_2(\bar{x}),\; b^{(2)}_1 \nabla f_1(\bar{x}) + b^{(2)}_2 \nabla f_2(\bar{x}) \,\}. \tag{22}$$
Proof. It follows from (11) that B is a subset of some line in ℝ 2 . Moreover, by Lemma 7, B is compact and convex, so it must be a closed line segment. Denote by $b^{(1)}$ and $b^{(2)}$ the endpoints of B. Using (21) and the linearity of h with respect to the first argument, we obtain
$$\begin{aligned}
h(B \times \partial f(\bar{x})) &= h(\operatorname{co}\{b^{(1)}, b^{(2)}\} \times \{J f(\bar{x})\}) \\
&= h(\{(\lambda b^{(1)} + (1-\lambda) b^{(2)},\, J f(\bar{x})) : 0 \le \lambda \le 1\}) \\
&= \{\lambda\, h(b^{(1)}, J f(\bar{x})) + (1-\lambda)\, h(b^{(2)}, J f(\bar{x})) : 0 \le \lambda \le 1\} \\
&= \operatorname{co}\{h(b^{(1)}, J f(\bar{x})),\, h(b^{(2)}, J f(\bar{x}))\} \\
&= \operatorname{co}\{\, b^{(1)}_1 \nabla f_1(\bar{x}) + b^{(1)}_2 \nabla f_2(\bar{x}),\; b^{(2)}_1 \nabla f_1(\bar{x}) + b^{(2)}_2 \nabla f_2(\bar{x}) \,\}.
\end{aligned}$$ □
Pareto Optimization
We now consider the case of classical Pareto optimization, i.e., when C = ℝ + 2 . We have C + = C . Let y ¯ = ( 1 , 1 ) ∈ int C . Then, by Lemma 7, the set
B : = { z ∈ C + : z 1 + z 2 = 1 }
is a compact base for C + , and B is the closed line segment joining the two points b ( 1 ) = ( 1 , 0 ) and b ( 2 ) = ( 0 , 1 ) . According to Theorem 12, we have
h ( B × ∂ f ( x ) ) = co { ∇ f 1 ( x ) , ∇ f 2 ( x ) } ,
hence, the scalarization function has the form
s ( x ) = d ( 0 , co { ∇ f 1 ( x ) , ∇ f 2 ( x ) } ) .
For any point x ∈ ℝ n , there are two possible cases:
(i) ∇ f 1 ( x ) = ∇ f 2 ( x ) . Then s ( x ) = ‖ ∇ f 1 ( x ) ‖ = ‖ ∇ f 2 ( x ) ‖ .
(ii) ∇ f 1 ( x ) ≠ ∇ f 2 ( x ) . Then s ( x ) is the distance from 0 to the line segment S joining ∇ f 1 ( x ) and ∇ f 2 ( x ) .
We now consider case (ii). The line L passing through ∇ f 1 ( x ) and ∇ f 2 ( x ) is parametrized as L ( t ) = b + t a where b : = ∇ f 1 ( x ) is a point on the line, and a : = ∇ f 2 ( x ) − ∇ f 1 ( x ) is the line direction. The closest point on the line L to 0 is the projection of 0 onto L which is equal to
$q := b + t_0 a$, where $t_0 = -\dfrac{\langle a, b \rangle}{\langle a, a \rangle} = -\dfrac{\langle a, b \rangle}{\|a\|^2}$.
Using the same parametrization, we can represent the line segment S as follows:
S = { b + t a : 0 ≤ t ≤ 1 } .
Therefore, if t 0 ≤ 0 , then the point in S closest to 0 is b. Similarly, if t 0 ≥ 1 , then the point in S closest to 0 is b + a . Finally, if 0 < t 0 < 1 , then the point in S closest to 0 is q. Hence, the function s can be described as follows:
$$s(x) = \begin{cases} \|b\| & \text{if } t_0 \le 0, \\ \|b + t_0 a\| & \text{if } 0 < t_0 < 1, \\ \|b + a\| & \text{if } t_0 \ge 1. \end{cases} \tag{23}$$
Taking into account the definitions of a and b above, we see that this scalarization function depends on the values of gradients of f 1 and f 2 only, so it is easily computable.
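The case distinction in (23) translates directly into a few lines of code. The sketch below is a minimal illustration, assuming the two gradients are supplied as plain sequences of floats; the function name `dist_origin_to_segment` is our own. In the branch t 0 ≥ 1 it returns the norm of the second gradient directly, which equals ‖ b + a ‖ but avoids cancellation when the two gradients differ greatly in magnitude.

```python
import math


def dist_origin_to_segment(g1, g2):
    """Distance from 0 to the segment joining g1 and g2, following (23)."""
    def norm(v):
        return math.sqrt(sum(c * c for c in v))

    b = list(g1)                          # point on the line
    a = [v - u for u, v in zip(g1, g2)]   # line direction g2 - g1
    aa = sum(c * c for c in a)
    if aa == 0.0:                         # case (i): the gradients coincide
        return norm(b)
    t0 = -sum(ai * bi for ai, bi in zip(a, b)) / aa
    if t0 <= 0.0:
        return norm(b)                    # closest point is g1
    if t0 >= 1.0:
        return norm(g2)                   # closest point is b + a = g2
    return norm([bi + t0 * ai for ai, bi in zip(a, b)])


if __name__ == "__main__":
    print(dist_origin_to_segment((1.0, 0.0), (0.0, 1.0)))   # interior case
    print(dist_origin_to_segment((-1.0, 0.0), (1.0, 0.0)))  # segment through 0
```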
Example 13 (problem FON in [
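$$f_1(x) = 1 - \exp\Bigl( -\sum_{i=1}^{3} \bigl( x_i - \tfrac{1}{\sqrt{3}} \bigr)^2 \Bigr), \tag{24}$$
$$f_2(x) = 1 - \exp\Bigl( -\sum_{i=1}^{3} \bigl( x_i + \tfrac{1}{\sqrt{3}} \bigr)^2 \Bigr). \tag{25}$$
The authors of [
$$x_1 = x_2 = x_3 \in \bigl[ -1/\sqrt{3},\; 1/\sqrt{3} \bigr]. \tag{26}$$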
Here the set Ω is closed (contrary to the rest of our paper), but this constraint is in fact inessential and the problem can also be considered on the whole space ℝ 3 . Computing the partial derivatives of f 1 and f 2 , we obtain from (24) - (25)
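$$\frac{\partial f_1}{\partial x_j}(x) = 2 \bigl( x_j - \tfrac{1}{\sqrt{3}} \bigr) \exp\Bigl( -\sum_{i=1}^{3} \bigl( x_i - \tfrac{1}{\sqrt{3}} \bigr)^2 \Bigr), \quad j = 1, 2, 3, \tag{27}$$
$$\frac{\partial f_2}{\partial x_j}(x) = 2 \bigl( x_j + \tfrac{1}{\sqrt{3}} \bigr) \exp\Bigl( -\sum_{i=1}^{3} \bigl( x_i + \tfrac{1}{\sqrt{3}} \bigr)^2 \Bigr), \quad j = 1, 2, 3. \tag{28}$$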
We have designed a program in Maple to compute s ( x ) , using formulae (23) and (27) - (28). This program consists of three nested loops for the values of the variables x 1 , x 2 , x 3 , each variable taking values from −4 to 4 in steps of 0.01. We have obtained s ( x ) = 0 for each x satisfying (26), and s ( x ) > 0 for all other points x. However, there are some points x for which the values s ( x ) are very small; the smallest value obtained is
$$s(4, 4, 4) = s(-4, -4, -4) = \alpha := 0.79802094823 \times 10^{-26}. \tag{29}$$
There are no other points at which s ( x ) < α , except the Pareto optimal solutions (26).
This example shows that one must be careful when using global optimization algorithms to minimize s because points like the ones appearing in (29) can be easily misclassified as vector critical points.
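The effect described in this example can be reproduced with a short script. The sketch below is not the authors' Maple program. It assumes the standard FON shift constant 1/√3 (with this value the computed s ( 4 , 4 , 4 ) agrees with (29) to the digits we checked), evaluates the gradients from (27) - (28), and applies formula (23). The names `grad`, `s_fon` and the threshold `NAIVE_TOL` are hypothetical and serve only to show how a fixed absolute tolerance would accept the corner point.

```python
import math

SHIFT = 1.0 / math.sqrt(3.0)  # assumed FON constant (standard value)
NAIVE_TOL = 1e-12             # hypothetical absolute threshold for "s(x) = 0"


def grad(x, shift):
    """Gradient of 1 - exp(-sum((x_i - shift)^2)), cf. (27)-(28)."""
    e = math.exp(-sum((xi - shift) ** 2 for xi in x))
    return [2.0 * (xi - shift) * e for xi in x]


def norm(v):
    return math.sqrt(sum(c * c for c in v))


def s_fon(x):
    """Scalarization function (23) for the two FON objectives."""
    g1, g2 = grad(x, SHIFT), grad(x, -SHIFT)
    a = [v - u for u, v in zip(g1, g2)]   # direction g2 - g1
    aa = sum(c * c for c in a)
    if aa == 0.0:
        return norm(g1)
    t0 = -sum(ai * bi for ai, bi in zip(a, g1)) / aa
    if t0 <= 0.0:
        return norm(g1)
    if t0 >= 1.0:
        return norm(g2)                   # equals ||b + a|| without cancellation
    return norm([bi + t0 * ai for ai, bi in zip(a, g1)])


if __name__ == "__main__":
    far = s_fon((4.0, 4.0, 4.0))      # tiny but positive, cf. (29)
    on_set = s_fon((0.0, 0.0, 0.0))   # a point satisfying (26)
    print(far, on_set)
    print(far < NAIVE_TOL)            # the naive test accepts a non-critical point
```

At the corner points both gradients are nonzero multiples of ( 1 , 1 , 1 ) with the same sign, so the segment joining them does not contain the origin. The value of s is small only because of the exponential factors in (27) - (28).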
We have presented a new scalarization method for solving multiobjective optimization problems, which is based on computing the Euclidean distance from the origin to a set determined by the generalized Jacobian of the mapping being optimized. This article contains the main underlying theory and only some preliminary numerical computations pertaining to this method. More numerical results will be presented in a subsequent paper.
The authors are grateful to an anonymous referee for his/her comments which have improved the quality of the paper.
Rahmo, E.-D. and Studniarski, M. (2017) A New Global Scalarization Method for Multiobjective Optimization with an Arbitrary Ordering Cone. Applied Mathematics, 8, 154-163. https://doi.org/10.4236/am.2017.82013