The Multiple-Square-Root Minimization Problem (MSR) has an objective function that consists of a sum of a linear term and at least two square-root terms. The Lagrangian sub-problem for the LMRP is a typical MSR problem, and other MSR problems arise in real life. A simple example arises if we add other concave costs besides the safety stock cost to the LMRP, such as labor cost, or if we also minimize the negation of the revenue. We tested a sorting-based heuristic, similar to the method used to solve the LMRP's Lagrangian sub-problem, and we explore conditions under which the heuristic is provably optimal. The accuracy of this approach declines at a slow rate as the number of square roots increases, and when that number is not too large, the accuracy stays at a high level.
The Lagrangian sub-problem for the LMRP (Location Model with Risk Pooling) is a typical MSR problem, and other MSR problems arise in real life. A simple example arises if we add other concave costs besides the safety stock cost to the LMRP, such as labor cost, or if we also minimize the negation of the revenue. Our focus is on the two-square-root problem in which the LMRP's assumption of a constant variance-to-mean ratio does not hold. We address this problem along two directions: the first is to relax the assumption in the LMRP's Lagrangian method, and the second is to add a condition on the parameters that makes the sorting method always optimal.
They proposed an efficient algorithm for the continuous relaxation of this sub-problem. The Maximum Expected Covering Location Problem (MEXCLP) was introduced by Daskin [
In this thesis, we try to find under what conditions the sorting algorithm for the LMRP Lagrangian sub-problem can be applied to the general two-square-root pricing problem.
The LMRP model is an extension of the UFLP that considers uncertain demand. Besides the fixed cost of opening locations and the variable transportation cost, it also includes the cost of cycle stock and safety stock. As a result, the LMRP is structured much like the UFLP model, with two extra non-linear terms in the objective function. Despite its concave objective function, the LMRP can be solved by Lagrangian relaxation quite efficiently, just like the UFLP, assuming that the ratio of the variance to the mean of daily demand is constant across retailers. We use the following notation:
Parameters:
$I$ set of retailers, indexed by $i$,
$J$ set of candidate DC sites, indexed by $j$,
$\mu_i$ mean daily demand of retailer $i$, for each $i \in I$,
$\sigma_i^2$ variance of daily demand of retailer $i$, for each $i \in I$,
$f_j$ fixed (daily) cost of locating a DC at candidate site $j$, for each $j \in J$,
$K_j$ fixed cost for DC $j$ to place an order from the supplier, including fixed components of both ordering and transportation costs, for each $j \in J$,
$d_{ij}$ cost per unit to ship between retailer $i$ and candidate DC site $j$, for each $i \in I$ and $j \in J$,
$\theta$ a constant parameter that captures the safety stock costs at candidate sites.
Decision Variables:
\[ X_j = \begin{cases} 1, & \text{if we locate at candidate site } j \\ 0, & \text{if not} \end{cases} \]
\[ Y_{ij} = \begin{cases} 1, & \text{if demands at retailer } i \text{ are assigned to a DC at candidate site } j \\ 0, & \text{if not} \end{cases} \]
Then the model is formulated as follows.
\[ \text{Minimize } \sum_{j\in J}\Big\{ f_j X_j + \sum_{i\in I} d_{ij} Y_{ij} + K_j \sqrt{\sum_{i\in I}\mu_i Y_{ij}} + \theta\sqrt{\sum_{i\in I}\sigma_i^2 Y_{ij}} \Big\} \]
subject to
\[ \sum_{j\in J} Y_{ij} = 1, \quad \forall i\in I \]
\[ Y_{ij} \le X_j, \quad \forall i\in I,\ \forall j\in J \]
\[ X_j \in \{0,1\}, \quad \forall j\in J \]
\[ Y_{ij} \in \{0,1\}, \quad \forall i\in I,\ \forall j\in J. \]
To make the objective function linear, we introduce a new parameter $\gamma_{jk}$ representing the combined cycle and safety stock cost when exactly $k$ retailers are assigned to DC $j$, that is,
\[ \gamma_{jk} = K_j\sqrt{k\mu} + \theta\sqrt{k\sigma^2}, \]
where $\mu$ and $\sigma^2$ are the common per-retailer demand mean and variance.
Also we introduce a new decision variable
\[ Z_{jk} = \begin{cases} 1, & \text{if exactly } k \text{ retailers are assigned to DC } j \\ 0, & \text{if not} \end{cases} \]
To associate Z j k with its meaning using linear constraints, we add the constraints
\[ \sum_{k=0}^{|I|} k\, Z_{jk} = \sum_{i\in I} Y_{ij}, \quad \forall j\in J \]
\[ \sum_{k=0}^{|I|} Z_{jk} = 1, \quad \forall j\in J \]
The second constraint says that exactly one of the $Z_{jk}$ equals 1 for each $j$, and the first constraint ensures that the 1 appears at $k = \sum_{i\in I} Y_{ij}$, which is exactly how we defined the meaning of $Z_{jk}$.
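As a quick sanity check of these two constraints, a small script (a hypothetical illustration of the logic, not part of the model itself) can enumerate all binary vectors $Z_{j0},\dots,Z_{j|I|}$ and confirm that the constraints leave exactly one feasible choice, the one with the 1 at $k = \sum_i Y_{ij}$:

```python
from itertools import product

def feasible_Z(y_col):
    """Enumerate binary vectors (Z_j0, ..., Z_j|I|) that satisfy
    sum_k k*Z_jk == sum_i Y_ij  and  sum_k Z_jk == 1."""
    target = sum(y_col)  # number of retailers assigned to this DC
    sols = []
    for z in product((0, 1), repeat=len(y_col) + 1):
        if sum(z) == 1 and sum(k * zk for k, zk in enumerate(z)) == target:
            sols.append(z)
    return sols

# With retailers 1 and 3 assigned to DC j, only Z_j2 = 1 is feasible.
print(feasible_Z([1, 0, 1]))  # [(0, 0, 1, 0)]
```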
So the linear model is:
\[ \text{Minimize } \sum_{j\in J}\Big\{ f_j X_j + \sum_{i\in I} d_{ij} Y_{ij} + \sum_{k=0}^{|I|} \gamma_{jk} Z_{jk} \Big\} \]
subject to
\[ \sum_{j\in J} Y_{ij} = 1, \quad \forall i\in I \]
\[ Y_{ij} \le X_j, \quad \forall i\in I,\ \forall j\in J \]
\[ \sum_{k=0}^{|I|} k\, Z_{jk} = \sum_{i\in I} Y_{ij}, \quad \forall j\in J \]
\[ \sum_{k=0}^{|I|} Z_{jk} = 1, \quad \forall j\in J \]
\[ X_j \in \{0,1\}, \quad \forall j\in J \]
\[ Y_{ij} \in \{0,1\}, \quad \forall i\in I,\ \forall j\in J \]
\[ Z_{jk} \in \{0,1\}, \quad \forall j\in J,\ \forall k = 0,\dots,|I|. \]
From these two formulations, we can see that although the second formulation is linear, it has many more constraints than the original one. On the other hand, it can be solved by an off-the-shelf MIP solver and does not require Lagrangian relaxation as the original LMRP does. So it is hard to say which computation time would be shorter just by looking at the models. We will test randomly generated examples and compare the solution times of the two methods in Chapter 4.
Similar to the UFLP, we solve the LMRP by relaxing the assignment constraints Equation (3.2) to obtain the following Lagrangian sub-problem:
\[ \text{Minimize } \sum_{j\in J}\Big\{ f_j X_j + \sum_{i\in I} d_{ij} Y_{ij} + K_j\sqrt{\sum_{i\in I}\mu_i Y_{ij}} + \theta\sqrt{\sum_{i\in I}\sigma_i^2 Y_{ij}} \Big\} + \sum_{i\in I}\lambda_i\Big(1-\sum_{j\in J} Y_{ij}\Big) = \text{Minimize } \sum_{j\in J}\Big\{ f_j X_j + \sum_{i\in I}(d_{ij}-\lambda_i) Y_{ij} + K_j\sqrt{\sum_{i\in I}\mu_i Y_{ij}} + \theta\sqrt{\sum_{i\in I}\sigma_i^2 Y_{ij}} \Big\} + \sum_{i\in I}\lambda_i \]
subject to \( Y_{ij} \le X_j, \quad \forall i\in I,\ \forall j\in J \)
X j ∈ { 0 , 1 } , ∀ j ∈ J
Y i j ∈ { 0 , 1 } , ∀ i ∈ I , ∀ j ∈ J
Although the sub-problem is a concave integer minimization problem, it can be solved relatively efficiently, using a sorting method developed by Mark S. Daskin, Collette R. Coullard, and Zuo-Jun Max Shen in 2003. The algorithm relies on the assumption that the ratio of the demand variance to the demand mean is a constant for all retailers. That is, for all $i \in I$, $\sigma_i^2/\mu_i = \gamma \ge 0$. Then we can collapse the two square-root terms into one and apply the sorting algorithm to solve the resulting sub-problem.
The optimal objective function value of the Lagrangian sub-problem gives us a lower bound of the original problem; then we need an upper bound. There are many ways to find a feasible solution to get the upper bound; in this paper, we use a simple algorithm to generate the solution from the sub-problem result.
In the LMRP model, the Lagrangian sub-problem can be written as
\[ \text{Minimize } \sum_{j\in J}\Big\{ f_j X_j + \sum_{i\in I}(d_{ij}-\lambda_i)Y_{ij} + K_j\sqrt{\sum_{i\in I}\mu_i Y_{ij}} + \theta\sqrt{\sum_{i\in I}\sigma_i^2 Y_{ij}} \Big\} + \sum_{i\in I}\lambda_i \]
subject to
\[ Y_{ij} \le X_j, \quad \forall i\in I,\ \forall j\in J \]
\[ X_j \in \{0,1\}, \quad \forall j\in J \]
\[ Y_{ij} \in \{0,1\}, \quad \forall i\in I,\ \forall j\in J. \]
To solve this sub-problem, we can divide the objective function into three parts. The $X$ part contains $\sum_{j\in J} f_j X_j$; the $Y$ part contains
\[ \sum_{j\in J}\Big\{ \sum_{i\in I}(d_{ij}-\lambda_i)Y_{ij} + K_j\sqrt{\sum_{i\in I}\mu_i Y_{ij}} + \theta\sqrt{\sum_{i\in I}\sigma_i^2 Y_{ij}} \Big\}; \]
and the $\lambda$ part contains $\sum_{i\in I}\lambda_i$. Since the $\lambda$ part has no decision variables, it can be ignored for the optimization. To minimize the objective function, we decide the value of $X_j$ as follows: for each $j$, if $f_j$ plus the minimum value of the $Y$ part for that $j$ is negative, we set $X_j = 1$ and set the $Y_{ij}$ to achieve that minimum; otherwise, we set $X_j$ and all $Y_{ij}$ to 0. The problem therefore reduces to finding the minimum value of the $Y$ part. For given $j\in J$, let $b_i = d_{ij}-\lambda_i$, $c_{i1} = K_j^2\mu_i$, $c_{i2} = \theta^2\sigma_i^2$, and $y_i = Y_{ij}$. Then the $Y$-part minimization problem can be abstracted into
\[ \text{Minimize } \sum_{i\in I} b_i y_i + \sqrt{\sum_{i\in I} c_{i1} y_i} + \sqrt{\sum_{i\in I} c_{i2} y_i} \]
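For intuition, this $Y$-part minimization and the $X_j$ rule can be sketched by brute force over the binary $y$ vectors. This is a hypothetical helper for small instances only, not the sorting algorithm discussed below:

```python
import math
from itertools import product

def y_part_min(b, c1, c2):
    """Brute-force minimum of sum(b_i*y_i) + sqrt(sum(c1_i*y_i)) + sqrt(sum(c2_i*y_i))
    over binary y; returns (minimum value, minimizing y)."""
    best_val, best_y = 0.0, (0,) * len(b)  # y = 0 gives objective 0
    for y in product((0, 1), repeat=len(b)):
        val = (sum(bi * yi for bi, yi in zip(b, y))
               + math.sqrt(sum(ci * yi for ci, yi in zip(c1, y)))
               + math.sqrt(sum(ci * yi for ci, yi in zip(c2, y))))
        if val < best_val:
            best_val, best_y = val, y
    return best_val, best_y

def open_dc(f_j, b, c1, c2):
    """X_j rule: open DC j (X_j = 1) iff f_j plus the minimum Y part is negative."""
    val, y = y_part_min(b, c1, c2)
    return (1, y) if f_j + val < 0 else (0, (0,) * len(b))
```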
In “A joint location-inventory model”, an efficient sorting method was introduced to solve this problem under the condition that $c_{i1}/c_{i2}$ is constant for all $i$. This allows the two square-root terms to be combined into one, which is required for the algorithm in “A joint location-inventory model”. In the original notation, this condition means that the ratio of the demand variance to the demand mean is a constant for all retailers. This is approximately true for many realistic cases. However, in some cases demand variance can be influenced by factors other than the demand mean, and that violates the condition. One example is selling wine: the demand mean is mostly related to the population that drinks the wine, while the demand variance can be influenced by the climate. People tend to buy more when good weather produces good grapes, so places with a stable climate tend to have a lower demand variance.
As we mentioned in the last paragraph, the method in “A joint location-inventory model” works under the condition that $c_{i1}/c_{i2}$ is constant, say $\gamma$, for all $i$. Now the $Y$ part can be rewritten as follows:
\[ \sum_{i\in I} b_i y_i + \sqrt{\sum_{i\in I} c_{i1} y_i} + \sqrt{\sum_{i\in I} c_{i2} y_i} = \sum_{i\in I} b_i y_i + \sqrt{\gamma\sum_{i\in I} c_{i2} y_i} + \sqrt{\sum_{i\in I} c_{i2} y_i} = \sum_{i\in I} b_i y_i + (1+\sqrt{\gamma})\sqrt{\sum_{i\in I} c_{i2} y_i}. \]
In this way we get rid of one of the square-root terms, so the minimization of the $Y$ part can be written as:
(P) Minimize
\[ \sum_{i\in I} b_i y_i + \sqrt{\sum_{i\in I} c_i y_i} \quad (2.1) \]
subject to
\[ y_i \in \{0,1\}, \quad \forall i\in I \quad (2.2) \]
where $c_i = (1+\sqrt{\gamma})^2 c_{i2}$.
Mark S. Daskin, Collette R. Coullard, and Zuo-Jun Max Shen proved that (P) can be solved efficiently by a sorting method, in $O(|I|\log|I|)$ time. The sorting method works as follows:
Let $I^- = \{i\in I \mid b_i \le 0\}$ and $I_1^- = \{i\in I^- \mid c_i > 0\}$.
1. Set $y_i = 0$ for all $i \notin I^-$.
2. Set $y_i = 1$ for all $i \in I^-$ such that $c_i = 0$.
3. Sort the elements of $I_1^-$ in increasing order of $b_i/\sqrt{c_i}$.
4. For each $r \in \{0, 1, \dots, |I_1^-|\}$, compute
\[ S_r = \sum_{i=1}^r b_i + \sqrt{\sum_{i=1}^r c_i}. \]
(If $r = 0$, then $S_r = 0$.)
5. Choose the $r$ that minimizes $S_r$ and set $y_i = 1$ for $i = 1,\dots,r$.
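The steps above can be sketched in code. This is a minimal sketch of the sorting method for (P), with function and variable names of our own choosing:

```python
import math

def sort_method(b, c):
    """Sorting method for (P): minimize sum(b_i*y_i) + sqrt(sum(c_i*y_i)), y binary.
    b, c are lists of floats with c_i >= 0. Returns (value, set of i with y_i = 1)."""
    n = len(b)
    # Steps 1-2: drop i with b_i >= 0; take i with b_i < 0 and c_i == 0 for free.
    free = [i for i in range(n) if b[i] < 0 and c[i] == 0]
    base = sum(b[i] for i in free)
    # Step 3: sort the remaining candidates by b_i / sqrt(c_i), increasing.
    cand = sorted((i for i in range(n) if b[i] < 0 and c[i] > 0),
                  key=lambda i: b[i] / math.sqrt(c[i]))
    # Steps 4-5: evaluate every prefix S_r and keep the best (r = 0 gives S_r = 0).
    best_val, best_r, b_sum, c_sum = 0.0, 0, 0.0, 0.0
    for r, i in enumerate(cand, start=1):
        b_sum += b[i]
        c_sum += c[i]
        if b_sum + math.sqrt(c_sum) < best_val:
            best_val, best_r = b_sum + math.sqrt(c_sum), r
    return base + best_val, set(free) | set(cand[:best_r])
```

For example, with b = [-3, -1, 2] and c = [4, 1, 1], the candidates sort as [0, 1]; the best prefix takes both of them, giving the value $-4+\sqrt{5}$.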
The proof of this sorting method uses only the concavity of the square root function, so the procedure also works if we replace the square-root terms with any concave function terms.
Without the condition that $c_{i1}/c_{i2}$ is constant, we cannot combine the two square-root terms into one in the $Y$ part, so the sorting method of Daskin, Coullard, and Shen does not work in these cases. Our focus is on identifying conditions under which problems with two or more square-root terms can still be solved by their sorting method.
The general form of the multiple square root (MSR) minimization problem is as follows:
(PP) Minimize
\[ \sum_{i\in I} b_i y_i + \sqrt{\sum_{i\in I} c_{i1} y_i} + \sqrt{\sum_{i\in I} c_{i2} y_i} + \dots + \sqrt{\sum_{i\in I} c_{im} y_i} = \sum_{i\in I} b_i y_i + \sum_{t=1}^m \sqrt{\sum_{i\in I} c_{it} y_i} \]
subject to
\[ y_i \in \{0,1\} \quad (2.3) \]
Throughout the analysis below, we assume that $c_{it} > 0$ for all $i\in I$ and $t\in\{1,\dots,m\}$. In the case of the LMRP, this assumption holds as long as all of the demand variances are positive. If the assumption does not hold, one can approximate it by setting $c_{it} = \varepsilon$, for some small $\varepsilon > 0$, whenever it equals 0. Furthermore, we will assume that $c_{it} \ge 1$ for all $i\in I$ and $t\in\{1,\dots,m\}$. This assumption is not restrictive: if $0 < c_{it} < 1$, we can multiply the objective function by some constant $M$ without changing the optimal solution; $c_{it}$ then becomes $M^2 c_{it}$, and for large enough $M$ all coefficients are greater than or equal to 1.
Let $I^- = \{i\in I \mid b_i \le 0\}$. Obviously, if $i\notin I^-$ then $y_i^* = 0$ in any optimal solution $y^*$. Assume without loss of generality that the elements of $I^-$ are sorted such that
\[ \frac{b_1}{\sum_{t=1}^m \sqrt{c_{1t}}} \le \frac{b_2}{\sum_{t=1}^m \sqrt{c_{2t}}} \le \dots \le \frac{b_n}{\sum_{t=1}^m \sqrt{c_{nt}}} \]
where $n = |I^-|$.
Consider the two-square-root problem, i.e., $m = 2$. As shown in “A joint location-inventory model”, if $c_{i1}/c_{i2}$ is constant for all $i$, then the sorting method discussed above works. On the other hand, we know that without this condition, the sorting method does not work in general. The question we are interested in is whether the sorting method is guaranteed to work if the $c_{i1}/c_{i2}$ ratios are “close enough”.
Suppose that $c_{it}/c_{i1} = (1+\alpha_i)c$ for $t\in\{2,\dots,m\}$ and all $i\in I$, with $\alpha_i \in [-\underline{\alpha}, \bar{\alpha}]$ for some constants $\underline{\alpha}, \bar{\alpha} \ge 0$ and $c > 0$, so that $c(1-\underline{\alpha}) \le c_{it}/c_{i1} \le c(1+\bar{\alpha})$. Our aim is to identify values of $\underline{\alpha}$ and $\bar{\alpha}$ which guarantee that the sorting method works.
In this section, we modify the proof of the correctness of the sorting method introduced by Mark S. Daskin, Collette R. Coullard, and Zuo-Jun Max Shen to the MSR problem and discuss the bottleneck of the proof.
We assume without loss of generality that the elements of $I_1^- = \{i\in I^- \mid b_i < 0\}$ are indexed and sorted such that
\[ \frac{b_1}{\sum_{t=1}^m \sqrt{c_{1t}}} \le \frac{b_2}{\sum_{t=1}^m \sqrt{c_{2t}}} \le \dots \le \frac{b_n}{\sum_{t=1}^m \sqrt{c_{nt}}} \quad (2.4) \]
Conjecture 2.1 There exists an optimal solution y * to (PP) such that the following property holds: if y k * = 1 for some k ∈ I 1 − , then y l * = 1 for all l ∈ { 1 , ⋯ , k − 1 } .
Start of Proof: Let $y^*$ be an optimal solution to (PP). Assume (for a contradiction) that $y_k^* = 1$ but $y_l^* = 0$ for some $l\in\{1,\dots,k-1\}$. Among such optimal solutions, take $y^*$ to be one with the greatest number of components set to 1. We will find a new optimal solution with one more component set to 1, thus contradicting this choice.
Let
\[ y'_i = \begin{cases} 1, & \text{if } i = l \\ y_i^*, & \text{otherwise,} \end{cases} \qquad y''_i = \begin{cases} 0, & \text{if } i = k \\ y_i^*, & \text{otherwise.} \end{cases} \]
Let $z^*$, $z'$, $z''$ be the objective values corresponding to $y^*$, $y'$, $y''$, respectively. Let $R = \{i\in I^- \setminus \{k,l\} \mid y_i^* = 1\}$, and let $C_t = \sum_{i\in R} c_{it}$. Clearly,
\[ z' - z^* = b_l + \sum_{t=1}^m \Big[\sqrt{C_t + c_{kt} + c_{lt}} - \sqrt{C_t + c_{kt}}\Big] \quad (2.5) \]
\[ z^* - z'' = b_k + \sum_{t=1}^m \Big[\sqrt{C_t + c_{kt}} - \sqrt{C_t}\Big] \quad (2.6) \]
Then from Equation (2.5), we have:
\[ \frac{z'-z^*}{\sum_{t=1}^m \sqrt{c_{lt}}} = \frac{b_l}{\sum_{t=1}^m \sqrt{c_{lt}}} + \frac{\sum_{t=1}^m \big(\sqrt{C_t+c_{kt}+c_{lt}} - \sqrt{C_t+c_{kt}}\big)}{\sum_{t=1}^m \sqrt{c_{lt}}} \quad (2.7) \]
Also, from Equation (2.6), we have:
\[ \frac{z^*-z''}{\sum_{t=1}^m \sqrt{c_{kt}}} = \frac{b_k}{\sum_{t=1}^m \sqrt{c_{kt}}} + \frac{\sum_{t=1}^m \big(\sqrt{C_t+c_{kt}} - \sqrt{C_t}\big)}{\sum_{t=1}^m \sqrt{c_{kt}}} \]
Since $y^*$ is optimal and has the greatest number of components set to 1,
\[ z' > z^* \quad \text{and} \quad z'' \ge z^*, \quad (2.8) \]
so
\[ \frac{z'-z^*}{\sum_{t=1}^m \sqrt{c_{lt}}} > 0 \quad (2.9) \]
\[ \frac{z^*-z''}{\sum_{t=1}^m \sqrt{c_{kt}}} \le 0. \quad (2.10) \]
Hence, if we can identify a condition such that
\[ \frac{z'-z^*}{\sum_{t=1}^m \sqrt{c_{lt}}} \le \frac{z^*-z''}{\sum_{t=1}^m \sqrt{c_{kt}}}, \quad (2.11) \]
we obtain a contradiction, and Conjecture 2.1 would be proved.
From the concavity of the square root function, we have
\[ \frac{\sqrt{C_t+c_{kt}+c_{lt}} - \sqrt{C_t+c_{kt}}}{c_{lt}} \le \frac{\sqrt{C_t+c_{kt}} - \sqrt{C_t}}{c_{kt}} \quad (2.12) \]
Since $\frac{b_l}{\sum_{t=1}^m\sqrt{c_{lt}}} \le \frac{b_k}{\sum_{t=1}^m\sqrt{c_{kt}}}$ by the way $I^-$ is sorted, it is sufficient to find a condition that makes
\[ \frac{\sum_{t=1}^m \big(\sqrt{C_t+c_{kt}+c_{lt}} - \sqrt{C_t+c_{kt}}\big)}{\sum_{t=1}^m \sqrt{c_{lt}}} \le \frac{\sum_{t=1}^m \big(\sqrt{C_t+c_{kt}} - \sqrt{C_t}\big)}{\sum_{t=1}^m \sqrt{c_{kt}}}; \quad (2.13) \]
an instance that satisfies inequality (2.13) must satisfy (2.11).
Here comes the bottleneck: inequality (2.13) does not hold in all cases. In Section 2.4 we give a sufficient condition that ensures inequality (2.13) holds, and in Section 2.5 we try an alternate approach.
As we assumed in Section 2.2, $c_{it}/c_{i1} = (1+\alpha_i)c$ for $t\in\{2,\dots,m\}$ and all $i\in I$, with $\alpha_i\in[-\underline{\alpha},\bar{\alpha}]$. We now rewrite inequality (2.13) so as to use this assumption.
First note that
\[ \sqrt{C_t+c_{kt}+c_{lt}} - \sqrt{C_t+c_{kt}} = \frac{\big(\sqrt{C_t+c_{kt}+c_{lt}} - \sqrt{C_t+c_{kt}}\big)\big(\sqrt{C_t+c_{kt}+c_{lt}} + \sqrt{C_t+c_{kt}}\big)}{\sqrt{C_t+c_{kt}+c_{lt}} + \sqrt{C_t+c_{kt}}} = \frac{(C_t+c_{kt}+c_{lt}) - (C_t+c_{kt})}{\sqrt{C_t+c_{kt}+c_{lt}} + \sqrt{C_t+c_{kt}}} = \frac{c_{lt}}{\sqrt{C_t+c_{kt}+c_{lt}} + \sqrt{C_t+c_{kt}}} \quad (2.14) \]
Given the assumption, we have:
\[ c_{lt} \le c(1+\bar{\alpha})\,c_{l1} \]
\[ \sqrt{C_t+c_{kt}+c_{lt}} + \sqrt{C_t+c_{kt}} \ge \sqrt{c(1-\underline{\alpha})}\,\Big(\sqrt{C_1+c_{k1}+c_{l1}} + \sqrt{C_1+c_{k1}}\Big) \]
Since all the terms are greater than zero, we have:
\[ \frac{c_{lt}}{\sqrt{C_t+c_{kt}+c_{lt}} + \sqrt{C_t+c_{kt}}} \le \frac{c(1+\bar{\alpha})\,c_{l1}}{\sqrt{c(1-\underline{\alpha})}\big(\sqrt{C_1+c_{k1}+c_{l1}} + \sqrt{C_1+c_{k1}}\big)} \quad (2.15) \]
Simplifying the right-hand side, we have:
\[ \frac{c(1+\bar{\alpha})\,c_{l1}}{\sqrt{c(1-\underline{\alpha})}\big(\sqrt{C_1+c_{k1}+c_{l1}} + \sqrt{C_1+c_{k1}}\big)} = \sqrt{\frac{c(1+\bar{\alpha})^2}{1-\underline{\alpha}}}\cdot\frac{c_{l1}\big(\sqrt{C_1+c_{k1}+c_{l1}} - \sqrt{C_1+c_{k1}}\big)}{\big(\sqrt{C_1+c_{k1}+c_{l1}} + \sqrt{C_1+c_{k1}}\big)\big(\sqrt{C_1+c_{k1}+c_{l1}} - \sqrt{C_1+c_{k1}}\big)} = \sqrt{\frac{c(1+\bar{\alpha})^2}{1-\underline{\alpha}}}\cdot\frac{c_{l1}\big(\sqrt{C_1+c_{k1}+c_{l1}} - \sqrt{C_1+c_{k1}}\big)}{c_{l1}} = \sqrt{\frac{c(1+\bar{\alpha})^2}{1-\underline{\alpha}}}\,\Big(\sqrt{C_1+c_{k1}+c_{l1}} - \sqrt{C_1+c_{k1}}\Big) \quad (2.16) \]
Combining (2.14), (2.15) and (2.16), we have:
\[ \sqrt{C_t+c_{kt}+c_{lt}} - \sqrt{C_t+c_{kt}} \le \sqrt{\frac{c(1+\bar{\alpha})^2}{1-\underline{\alpha}}}\,\Big(\sqrt{C_1+c_{k1}+c_{l1}} - \sqrt{C_1+c_{k1}}\Big) \]
This holds for any $t\in\{1,\dots,m\}$. Summing over $t$, we get:
\[ \sum_{t=1}^m\Big(\sqrt{C_t+c_{kt}+c_{lt}} - \sqrt{C_t+c_{kt}}\Big) \le m\sqrt{\frac{c(1+\bar{\alpha})^2}{1-\underline{\alpha}}}\,\Big(\sqrt{C_1+c_{k1}+c_{l1}} - \sqrt{C_1+c_{k1}}\Big) \quad (2.17) \]
Given the assumption, we also have:
\[ \sum_{t=1}^m \sqrt{c_{lt}} \ge m\sqrt{c(1-\underline{\alpha})\,c_{l1}}. \quad (2.18) \]
Since all the terms are positive,
\[ \frac{\sum_{t=1}^m\big(\sqrt{C_t+c_{kt}+c_{lt}} - \sqrt{C_t+c_{kt}}\big)}{\sum_{t=1}^m \sqrt{c_{lt}}} \le \frac{m\sqrt{\frac{c(1+\bar{\alpha})^2}{1-\underline{\alpha}}}\big(\sqrt{C_1+c_{k1}+c_{l1}} - \sqrt{C_1+c_{k1}}\big)}{m\sqrt{c(1-\underline{\alpha})\,c_{l1}}} = \frac{(1+\bar{\alpha})\big(\sqrt{C_1+c_{k1}+c_{l1}} - \sqrt{C_1+c_{k1}}\big)}{\sqrt{c(1-\underline{\alpha})^3\,c_{l1}}} \quad (2.19) \]
Similarly, we have
\[ \sqrt{C_t+c_{kt}} - \sqrt{C_t} = \frac{\big(\sqrt{C_t+c_{kt}}-\sqrt{C_t}\big)\big(\sqrt{C_t+c_{kt}}+\sqrt{C_t}\big)}{\sqrt{C_t+c_{kt}}+\sqrt{C_t}} = \frac{(C_t+c_{kt})-C_t}{\sqrt{C_t+c_{kt}}+\sqrt{C_t}} = \frac{c_{kt}}{\sqrt{C_t+c_{kt}}+\sqrt{C_t}} \quad (2.20) \]
Given the assumption, we have:
\[ c_{kt} \ge c(1-\underline{\alpha})\,c_{k1} \]
\[ \sqrt{C_t+c_{kt}} + \sqrt{C_t} \le \sqrt{c(1+\bar{\alpha})}\,\big(\sqrt{C_1+c_{k1}} + \sqrt{C_1}\big) \]
Since all the terms are greater than zero, we have:
\[ \frac{c_{kt}}{\sqrt{C_t+c_{kt}}+\sqrt{C_t}} \ge \frac{c(1-\underline{\alpha})\,c_{k1}}{\sqrt{c(1+\bar{\alpha})}\big(\sqrt{C_1+c_{k1}}+\sqrt{C_1}\big)} \quad (2.21) \]
Simplifying the right-hand side, we have:
\[ \frac{c(1-\underline{\alpha})\,c_{k1}}{\sqrt{c(1+\bar{\alpha})}\big(\sqrt{C_1+c_{k1}}+\sqrt{C_1}\big)} = \sqrt{\frac{c(1-\underline{\alpha})^2}{1+\bar{\alpha}}}\cdot\frac{c_{k1}\big(\sqrt{C_1+c_{k1}}-\sqrt{C_1}\big)}{\big(\sqrt{C_1+c_{k1}}+\sqrt{C_1}\big)\big(\sqrt{C_1+c_{k1}}-\sqrt{C_1}\big)} = \sqrt{\frac{c(1-\underline{\alpha})^2}{1+\bar{\alpha}}}\cdot\frac{c_{k1}\big(\sqrt{C_1+c_{k1}}-\sqrt{C_1}\big)}{c_{k1}} = \sqrt{\frac{c(1-\underline{\alpha})^2}{1+\bar{\alpha}}}\,\big(\sqrt{C_1+c_{k1}}-\sqrt{C_1}\big) \quad (2.22) \]
Combining (2.20), (2.21) and (2.22), we have:
\[ \sqrt{C_t+c_{kt}} - \sqrt{C_t} \ge \sqrt{\frac{c(1-\underline{\alpha})^2}{1+\bar{\alpha}}}\,\big(\sqrt{C_1+c_{k1}}-\sqrt{C_1}\big) \]
This holds for any $t\in\{1,\dots,m\}$. Summing over $t$, we get:
\[ \sum_{t=1}^m\big(\sqrt{C_t+c_{kt}} - \sqrt{C_t}\big) \ge m\sqrt{\frac{c(1-\underline{\alpha})^2}{1+\bar{\alpha}}}\,\big(\sqrt{C_1+c_{k1}}-\sqrt{C_1}\big) \quad (2.23) \]
Given the assumption, we also have:
\[ \sum_{t=1}^m \sqrt{c_{kt}} \le m\sqrt{c(1+\bar{\alpha})\,c_{k1}}. \quad (2.24) \]
Since all the terms are positive, inequalities (2.23) and (2.24) imply
\[ \frac{\sum_{t=1}^m\big(\sqrt{C_t+c_{kt}}-\sqrt{C_t}\big)}{\sum_{t=1}^m\sqrt{c_{kt}}} \ge \frac{m\sqrt{\frac{c(1-\underline{\alpha})^2}{1+\bar{\alpha}}}\big(\sqrt{C_1+c_{k1}}-\sqrt{C_1}\big)}{m\sqrt{c(1+\bar{\alpha})\,c_{k1}}} = \frac{(1-\underline{\alpha})\big(\sqrt{C_1+c_{k1}}-\sqrt{C_1}\big)}{\sqrt{c(1+\bar{\alpha})^3\,c_{k1}}} \quad (2.25) \]
From inequalities (2.19) and (2.25), we can see that if we have
\[ \frac{(1+\bar{\alpha})\big(\sqrt{C_1+c_{k1}+c_{l1}}-\sqrt{C_1+c_{k1}}\big)}{\sqrt{c(1-\underline{\alpha})^3\,c_{l1}}} \le \frac{(1-\underline{\alpha})\big(\sqrt{C_1+c_{k1}}-\sqrt{C_1}\big)}{\sqrt{c(1+\bar{\alpha})^3\,c_{k1}}} \quad (2.26) \]
then inequality (2.13) will hold. In other words, Conjecture 2.1 holds if we impose (2.26) as an additional condition.
Inequality (2.26) can be reformulated as
\[ \sqrt{\frac{(1+\bar{\alpha})^5}{(1-\underline{\alpha})^5}} \le \frac{\big(\sqrt{C_1+c_{k1}}-\sqrt{C_1}\big)\big/\sqrt{c_{k1}}}{\big(\sqrt{C_1+c_{k1}+c_{l1}}-\sqrt{C_1+c_{k1}}\big)\big/\sqrt{c_{l1}}} = \frac{\dfrac{\sqrt{c_{k1}}}{\sqrt{C_1+c_{k1}}+\sqrt{C_1}}}{\dfrac{\sqrt{c_{l1}}}{\sqrt{C_1+c_{k1}+c_{l1}}+\sqrt{C_1+c_{k1}}}} = \frac{\sqrt{C_1+c_{k1}+c_{l1}}+\sqrt{C_1+c_{k1}}}{\sqrt{C_1+c_{k1}}+\sqrt{C_1}} \quad (2.27) \]
For the right-hand side of inequality (2.27), finding the minimum value of
\[ \frac{\sqrt{C_1+c_{k1}+c_{l1}}+\sqrt{C_1+c_{k1}}}{\sqrt{C_1+c_{k1}}+\sqrt{C_1}} \]
over all $k$, $l$, call it $R^*$, is easy to accomplish through a simple iterative procedure.
For every $t\in\{1,\dots,m\}$:
Step 1: Let $l = \arg\min_{1\le i\le k-1}\{c_{it}\}$.
Step 2: Let $k = \arg\min_{2\le i\le n}\Big\{\dfrac{\sqrt{C_1+c_{k1}+c_{l1}}+\sqrt{C_1+c_{k1}}}{\sqrt{C_1+c_{k1}}+\sqrt{C_1}}\Big\}$.
For the left-hand side of inequality (2.27), the value is determined by $\frac{1+\bar{\alpha}}{1-\underline{\alpha}}$, and the smallest value of this fraction can also be determined from the data set easily. Given our assumption in Section 2.2, the smallest admissible value of $c(1+\bar{\alpha})$ is the largest ratio of $c_{it}$ to $c_{i1}$, over all $i\in I$ and $t\in\{2,\dots,m\}$; call it $\bar{\gamma}$. Likewise, the largest admissible value of $c(1-\underline{\alpha})$ is the smallest such ratio; call it $\underline{\gamma}$. So
\[ \frac{1+\bar{\alpha}}{1-\underline{\alpha}} = \frac{c(1+\bar{\alpha})}{c(1-\underline{\alpha})} \ge \frac{\bar{\gamma}}{\underline{\gamma}}. \]
As a result, in a given data set, if we have $\big(\bar{\gamma}/\underline{\gamma}\big)^{2.5} \le R^*$, then inequality (2.27) holds for all $k$, $l$, and so inequality (2.13) holds. In this case Conjecture 2.1 holds, and the sorting method can be applied to the data set.
The following theorem summarizes our analysis.
Theorem 2.4.1. If a data set has the property that $\big(\bar{\gamma}/\underline{\gamma}\big)^{2.5} \le R^*$, then there exists an optimal solution $y^*$ to (PP) such that the following property holds: if $y_k^* = 1$ for some $k\in I_1^-$, then $y_l^* = 1$ for all $l\in\{1,\dots,k-1\}$.
If the condition in Theorem 2.4.1 holds, then the sorting method can be applied.
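The check can be sketched as follows. The ratio bounds $\bar{\gamma}$ and $\underline{\gamma}$ come straight from the data; for $R^*$ we use a brute-force scan over the pairs $k > l$, taking $C_1$ as the sum of $c_{i1}$ over the sorted prefix $i < k$, $i \ne l$ (the configuration appearing in the proof). This is our own hypothetical reading of the procedure, not code from the thesis:

```python
import math

def gamma_bounds(c):
    """c[i] = [c_i1, c_i2, ..., c_im]. Largest and smallest ratios c_it / c_i1, t >= 2."""
    ratios = [row[t] / row[0] for row in c for t in range(1, len(row))]
    return max(ratios), min(ratios)

def condition_holds(c):
    """Check the Theorem 2.4.1 condition (gamma_bar / gamma_under)**2.5 <= R*.
    Assumes the rows of c are already in the sorted order (2.4). R* is scanned over
    pairs k > l, with C1 = sum of c_i1 for i < k, i != l (the proof's configuration)."""
    g_bar, g_under = gamma_bounds(c)
    r_star = float("inf")
    for k in range(1, len(c)):
        for l in range(k):
            C1 = sum(c[i][0] for i in range(k) if i != l)
            num = math.sqrt(C1 + c[k][0] + c[l][0]) + math.sqrt(C1 + c[k][0])
            den = math.sqrt(C1 + c[k][0]) + math.sqrt(C1)
            r_star = min(r_star, num / den)
    return (g_bar / g_under) ** 2.5 <= r_star
```

When every ratio $c_{i2}/c_{i1}$ is the same, $\bar{\gamma} = \underline{\gamma}$, the left-hand side is 1, and the condition holds, matching the constant-ratio case of “A joint location-inventory model”.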
There is another direction for weakening the condition. Continuing the logic from Section 2.3, let us start from the simplest case, the two-square-root case.
First we subtract the right-hand side of inequality (2.13) from the left-hand side:
\[ \frac{\sqrt{C_1+c_{k1}+c_{l1}}+\sqrt{C_2+c_{k2}+c_{l2}}-\sqrt{C_1+c_{k1}}-\sqrt{C_2+c_{k2}}}{\sqrt{c_{l1}}+\sqrt{c_{l2}}} - \frac{\sqrt{C_1+c_{k1}}+\sqrt{C_2+c_{k2}}-\sqrt{C_1}-\sqrt{C_2}}{\sqrt{c_{k1}}+\sqrt{c_{k2}}} = \frac{\big(\sqrt{C_1+c_{k1}+c_{l1}}+\sqrt{C_2+c_{k2}+c_{l2}}-\sqrt{C_1+c_{k1}}-\sqrt{C_2+c_{k2}}\big)\big(\sqrt{c_{k1}}+\sqrt{c_{k2}}\big) - \big(\sqrt{C_1+c_{k1}}+\sqrt{C_2+c_{k2}}-\sqrt{C_1}-\sqrt{C_2}\big)\big(\sqrt{c_{l1}}+\sqrt{c_{l2}}\big)}{\big(\sqrt{c_{l1}}+\sqrt{c_{l2}}\big)\big(\sqrt{c_{k1}}+\sqrt{c_{k2}}\big)} \]
The denominator is positive, since the $c_{kt}$ and $c_{lt}$ are positive. For the numerator, we need to compare
\[ \big(\sqrt{C_1+c_{k1}+c_{l1}}+\sqrt{C_2+c_{k2}+c_{l2}}\big)\big(\sqrt{c_{k1}}+\sqrt{c_{k2}}\big) + \big(\sqrt{C_1}+\sqrt{C_2}\big)\big(\sqrt{c_{l1}}+\sqrt{c_{l2}}\big) \]
and
\[ \big(\sqrt{C_1+c_{k1}}+\sqrt{C_2+c_{k2}}\big)\big(\sqrt{c_{k1}}+\sqrt{c_{k2}}+\sqrt{c_{l1}}+\sqrt{c_{l2}}\big). \]
From here we can see that, without a condition on the relation between $\sqrt{c_{k1}}+\sqrt{c_{k2}}$ and $\sqrt{c_{l1}}+\sqrt{c_{l2}}$, the difference between the previous terms can be either positive or negative. So we first add the simple condition that $\sum_{t=1}^m \sqrt{c_{lt}} = \sum_{t=1}^m \sqrt{c_{kt}}$.
So the problem changes to comparing
\[ \sqrt{C_1+c_{k1}+c_{l1}}+\sqrt{C_2+c_{k2}+c_{l2}}+\sqrt{C_1}+\sqrt{C_2} \]
and
\[ 2\big(\sqrt{C_1+c_{k1}}+\sqrt{C_2+c_{k2}}\big). \]
As these terms are all positive, we can square them without changing their order.
The left-hand side becomes
\[ (C_1+c_{k1}+c_{l1}) + (C_2+c_{k2}+c_{l2}) + C_1 + C_2 + 2\Big(\sqrt{(C_1+c_{k1}+c_{l1})(C_2+c_{k2}+c_{l2})} + \sqrt{C_1 C_2} + \sqrt{(C_1+c_{k1}+c_{l1})C_1} + \sqrt{(C_2+c_{k2}+c_{l2})C_2} + \sqrt{(C_1+c_{k1}+c_{l1})C_2} + \sqrt{(C_2+c_{k2}+c_{l2})C_1}\Big) \]
The right-hand side becomes
\[ 4\Big((C_1+c_{k1}) + (C_2+c_{k2}) + 2\sqrt{(C_1+c_{k1})(C_2+c_{k2})}\Big), \]
which can be written in the same pattern as the left-hand side:
\[ (C_1+c_{k1}) + (C_2+c_{k2}) + (C_1+c_{k1}) + (C_2+c_{k2}) + 2\Big(\sqrt{(C_1+c_{k1})(C_2+c_{k2})} + \sqrt{(C_1+c_{k1})(C_2+c_{k2})} + \sqrt{(C_1+c_{k1})(C_1+c_{k1})} + \sqrt{(C_2+c_{k2})(C_2+c_{k2})} + \sqrt{(C_1+c_{k1})(C_2+c_{k2})} + \sqrt{(C_1+c_{k1})(C_2+c_{k2})}\Big) \]
The difference between these two sides involves
\[ \sqrt{(C_1+c_{k1}+c_{l1})(C_2+c_{k2}+c_{l2})} + \sqrt{C_1 C_2} + \sqrt{(C_1+c_{k1}+c_{l1})C_1} + \sqrt{(C_2+c_{k2}+c_{l2})C_2} + \sqrt{(C_1+c_{k1}+c_{l1})C_2} + \sqrt{(C_2+c_{k2}+c_{l2})C_1} - 4\sqrt{(C_1+c_{k1})(C_2+c_{k2})}. \]
This term’s value is near 0 but can be positive or negative for different data sets. We tried to add some conditions to make this term always negative, but no sufficient condition was found.
However, if we only impose the simple condition $\sum_{t=1}^m \sqrt{c_{lt}} = \sum_{t=1}^m \sqrt{c_{kt}}$, the sorting method often finds the optimal solution, and when it does not, the error is usually small. This will be shown in the next section.
As we discussed in Theorem 2.4.1, if a data set has the property that $\big(\bar{\gamma}/\underline{\gamma}\big)^{2.5} \le R^*$, then there exists an optimal solution $y^*$ to (PP) such that the following property holds: if $y_k^* = 1$ for some $k\in I_1^-$, then $y_l^* = 1$ for all $l\in\{1,\dots,k-1\}$.
From the theorem we can see that the value of $R^*\big(\bar{\gamma}/\underline{\gamma}\big)^{-2.5}$ decides whether a data set has this property. Let us call this number $R$. When $R \ge 1$, the property holds and the sorting method can be applied to the data set. In this section, we discuss a method to obtain a lower bound through the sorting method when $0 < R < 1$, and the gap between the lower bound and the optimal solution.
Conjecture 2.2. There exists a lower bound $z_{lb} = z' + \frac{1-R}{R}\min\{b_i : i\in\{1,\dots,k-1\}\}$ for (PP), where $z'$ is the objective value of a feasible solution $y'$ with the following property: if $y'_k = 1$ for some $k\in I_1^-$, then $y'_l = 1$ for all $l\in\{1,\dots,k-1\}$.
Start of Proof: Let $y^*$ be an optimal solution to (PP) and $z^*$ its objective value. Assume (for a contradiction) that $y_k^* = 1$ but $y_l^* = 0$ for some $l\in\{1,\dots,k-1\}$, and that $z^* < z_{lb}$. We will find a feasible solution $y'$ with the property that if $y'_k = 1$ for some $k\in I_1^-$ then $y'_l = 1$ for all $l\in\{1,\dots,k-1\}$, which breaks the inequality, thus contradicting our assumption.
\[ z' - z^* = b_l + \sum_{t=1}^m\Big[\sqrt{C_t+c_{kt}+c_{lt}} - \sqrt{C_t+c_{kt}}\Big] \quad (2.28) \]
Since $0 < R < 1$, the assumption $z^* < z_{lb}$ gives
\[ R z^* < R z' + (1-R)\min\{b_i : i\in\{1,\dots,k-1\}\} \le R z' + (1-R)\,b_l, \]
and hence
\[ R z' - R z^* + (1-R)\,b_l > 0. \quad (2.29) \]
From Equation (2.28) we have
\[ \frac{R z' - R z^* + (1-R) b_l}{\sum_{t=1}^m\sqrt{c_{lt}}} = \frac{b_l}{\sum_{t=1}^m\sqrt{c_{lt}}} + \frac{R\sum_{t=1}^m\big[\sqrt{C_t+c_{kt}+c_{lt}} - \sqrt{C_t+c_{kt}}\big]}{\sum_{t=1}^m\sqrt{c_{lt}}} \quad (2.30) \]
Also, from Equation (2.6), we have
\[ \frac{z^*-z''}{\sum_{t=1}^m\sqrt{c_{kt}}} = \frac{b_k}{\sum_{t=1}^m\sqrt{c_{kt}}} + \frac{\sum_{t=1}^m\big(\sqrt{C_t+c_{kt}} - \sqrt{C_t}\big)}{\sum_{t=1}^m\sqrt{c_{kt}}} \quad (2.31) \]
Based on the way we sorted the data,
\[ \frac{b_l}{\sum_{t=1}^m\sqrt{c_{lt}}} \le \frac{b_k}{\sum_{t=1}^m\sqrt{c_{kt}}} \quad (2.32) \]
From inequalities (2.19), (2.25), and (2.26), it follows that
\[ \frac{R\sum_{t=1}^m\big[\sqrt{C_t+c_{kt}+c_{lt}} - \sqrt{C_t+c_{kt}}\big]}{\sum_{t=1}^m\sqrt{c_{lt}}} \le \frac{\sum_{t=1}^m\big(\sqrt{C_t+c_{kt}} - \sqrt{C_t}\big)}{\sum_{t=1}^m\sqrt{c_{kt}}} \quad (2.33) \]
Combining inequalities (2.32) and (2.33), we have
\[ \frac{b_l}{\sum_{t=1}^m\sqrt{c_{lt}}} + \frac{R\sum_{t=1}^m\big[\sqrt{C_t+c_{kt}+c_{lt}} - \sqrt{C_t+c_{kt}}\big]}{\sum_{t=1}^m\sqrt{c_{lt}}} \le \frac{b_k}{\sum_{t=1}^m\sqrt{c_{kt}}} + \frac{\sum_{t=1}^m\big(\sqrt{C_t+c_{kt}} - \sqrt{C_t}\big)}{\sum_{t=1}^m\sqrt{c_{kt}}} \]
So
\[ \frac{R z' - R z^* + (1-R)b_l}{\sum_{t=1}^m\sqrt{c_{lt}}} \le \frac{z^*-z''}{\sum_{t=1}^m\sqrt{c_{kt}}} \quad (2.34) \]
From inequality (2.29),
\[ \frac{R z' - R z^* + (1-R)b_l}{\sum_{t=1}^m\sqrt{c_{lt}}} > 0 \quad (2.35) \]
Clearly,
\[ \frac{z^*-z''}{\sum_{t=1}^m\sqrt{c_{kt}}} \le 0 \quad (2.36) \]
Inequalities (2.34), (2.35), and (2.36) are contradictory. As a result, Conjecture 2.2 is proved.
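Under Conjecture 2.2, the lower bound itself is a one-line computation. A hypothetical sketch (the values $z'$ and the prefix of $b$ values would come from running the sorting method on the data set):

```python
def sorting_lower_bound(z_prime, R, b_prefix):
    """Lower bound of Conjecture 2.2 for 0 < R < 1:
    z_lb = z' + ((1 - R) / R) * min(b_prefix), where z' is the objective value of
    the sorting-method solution and b_prefix holds b_1, ..., b_{k-1}."""
    if not 0 < R < 1:
        raise ValueError("the bound is stated for 0 < R < 1")
    return z_prime + (1 - R) / R * min(b_prefix)

print(sorting_lower_bound(-5.0, 0.5, [-3.0, -1.0]))  # -5 + 1.0 * (-3) = -8.0
```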
We generate the parameters of our random samples using uniformly distributed random numbers: each $b_i$ was drawn from $U[-10, 0]$ and each $c_{it}$ from $U[0, 1]$, and we chose $|I| = 10$ and $m = 2, 3, 4$. There are 600 instances in total. We obtain the optimal value of each sample instance by enumerating all feasible solutions and choosing the one with the smallest objective value. Then we apply the sorting method to the same instance and compare the result with the enumeration method. Both methods were implemented in MATLAB.
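A small replication sketch of this experiment in Python (instead of MATLAB), with function names of our own choosing and without reproducing the exact counts reported below:

```python
import math
import random
from itertools import product

def enumerate_opt(b, c):
    """Exact optimum of sum(b_i*y_i) + sum_t sqrt(sum_i c[i][t]*y_i) by full enumeration."""
    n, m = len(b), len(c[0])
    best = float("inf")
    for y in product((0, 1), repeat=n):
        val = sum(bi * yi for bi, yi in zip(b, y))
        val += sum(math.sqrt(sum(c[i][t] * y[i] for i in range(n))) for t in range(m))
        best = min(best, val)
    return best

def sorting_value(b, c):
    """Sorting heuristic for the MSR problem: order i by b_i / sum_t sqrt(c_it),
    then take the best prefix (including the empty one, with value 0)."""
    m = len(c[0])
    order = sorted(range(len(b)), key=lambda i: b[i] / sum(math.sqrt(x) for x in c[i]))
    best, prefix = 0.0, []
    for i in order:
        prefix.append(i)
        val = sum(b[j] for j in prefix)
        val += sum(math.sqrt(sum(c[j][t] for j in prefix)) for t in range(m))
        best = min(best, val)
    return best

random.seed(0)
exact = 0
for _ in range(100):
    b = [random.uniform(-10, 0) for _ in range(10)]
    c = [[random.uniform(0.01, 1) for _ in range(2)] for _ in range(10)]
    if abs(sorting_value(b, c) - enumerate_opt(b, c)) < 1e-9:
        exact += 1
print(exact, "of 100 instances solved exactly by the sorting heuristic")
```

Since the heuristic only examines prefixes of one ordering, its value is always an upper bound on the enumeration optimum, which is the comparison the tables below summarize.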
We can see that in most of the two-square-root cases, 94 out of 100, the sorting method found the optimal solution. In these samples, 42 instances satisfy the sufficient condition of Theorem 2.4.1, and for each of these instances the sorting method finds the optimal solution, as the theorem predicts. In the three-square-root cases, the sorting method works in 73 out of 100 data sets, and 15 of them meet the condition. When there are four square-root terms, the sorting method works in 56 out of 100 data sets, and only 6 cases meet the condition. The results are shown in the first table below.
For the alternate analytical approach in Section 2.5, the accuracy rates are shown in the second table below.
| Number of square root terms | Satisfaction rate | Average error for all instances | Average error for failed instances |
|---|---|---|---|
| 2 | 0.42 | 0.9856 | 0.7314 |
| 3 | 0.15 | 0.9347 | 0.6026 |
| 4 | 0.06 | 0.8653 | 0.6892 |
| Number of square root terms | Accuracy rate | Average error for all instances | Average error for failed instances |
|---|---|---|---|
| 2 | 0.93 | 0.9875 | 0.8514 |
| 3 | 0.87 | 0.9493 | 0.7483 |
| 4 | 0.62 | 0.8853 | 0.7048 |
As the tables show, the accuracy rate goes down as the number of square roots grows. However, the rates are acceptable when the number of square roots is not large. For future research, we will search for additional sufficient conditions that guarantee the optimality of the sorting method.
As for the MSR problem, we discussed a sufficient condition that guarantees that the sorting method works, and an alternate analytical approach under which the sorting method often finds the optimal solution while the average error remains low.
For future research, we would like to determine the conditions under which the linearization method has a shorter solution time than the Lagrangian method. Searching for more relaxed sufficient conditions that guarantee that the sorting method works for the MSR problem is another direction.
Zhang, X., Tian, X., Wang, C. and Li, T. (2017) Multiple-Square-Root Minimization Problem. American Journal of Industrial and Business Management, 7, 927-943. https://doi.org/10.4236/ajibm.2017.77066