A Cross Entropy-Genetic Algorithm for m-Machines No-Wait Job-ShopScheduling Problem

doi:10.4236/jilsa.2011.33018

Paper Menu >>

Journal Menu >>

Journal of Intelligent Learning Systems and Applications, 2011, 3, 171-180

doi:10.4236/jilsa.2011.33018 Published Online August 2011 (http://www.scirp.org/journal/jilsa)

171

A Cross Entropy-Genetic Algorithm for

m-Machines No-Wait Job-Shop Scheduling

Problem

Budi Santosa, Muhammad Arif Budiman, Stefanus Eko Wiratno

Industrial Engineering, Institut Teknologi Sepuluh Nopember, Surabaya, Indonesia.

Email: budi_s@ie.its.ac.id

Received October 26th, 2010; revised January 26th, 2011; revised February 6th, 2011.

ABSTRACT

No-wait job-shop scheduling (NWJSS) problem is one of the classical scheduling problems that exist on many kinds of

industry with no-wait constraint, such as metal working, plastic, chemical, and food industries. Several methods have

been proposed to solve this problem, both exact (i.e. integer programming) and metaheuristic methods. Cross entropy

(CE), as a new metaheuristic, can be an alternative method to solve NWJSS problem. This method has been used in

combinatorial optimization, as well as multi-external optimization and rare-event simulation. On these problems, CE

implementation results an optimal value with less computational time in average. However, using original CE to solve

large scale NWJSS requires high computational time. Considering this shortcoming, this paper proposed a hybrid of

cross entropy with genetic algorithm (GA), called CEGA, on m-machines NWJSS. The results are compared with other

metaheuritics: Genetic Algorithm-Simulated Annealing (GASA) and hybrid tabu search. The results showed that CEGA

providing better or at least equal makespans in comparison with the other two methods.

Keywords: No-Wait Job Shop Scheduling, Cross Entropy, Genetic Algorithm, Combinatorial Optimization

1. Introduction

No-wait job-shop scheduling problem (NWJJS) is a

problem categorized to non-polynomial hard (NP-Hard)

problem, especially for m-machines [1]. In a typical job

shop problem, each job has its own unique operation

route. Because the continuity of operation in each job

must be kept to avoid operation reworking or job redoing,

the use of incorrect method for scheduling purpose may

make the makespan significantly longer. In addition, the

existance of no-wait constraint, e.g. on metal, plastic, and

food industries, made the problem even more complex.

Many researches have been using various methods to

solve NWJJS. Genetic Algorithm-Simulated Annealing

(GASA) [2] and Hybrid Tabu Search [3] are examples of

methods used to solve this problem. Several methods fail

to achieve the optimum solution, others succeed, but

with relatively long computational time.

Cross entropy method, as a relatively new metaheuris-

tic, has been widely used in broad applications, such as

combinatorial optimization, continuous optimization, noisy

optimization, and rare event simulation [4]. On these

problems, cross entropy can find optimal or near optimal

solution with less computational time. However, using

original CE to solve large scale NWJSS requires longer

computational time. This paper proposed a new algo-

rithm of hybridized cross entropy with genetic algorithm

(CEGA). The proposed method is also new in solving

NWJSS problem. Using the hybrid of CE and GA the

computational time can be reduced significantly while

maintaining better makespan.

2. Problem Overview

NWJSS is a specific job-shop scheduling problem in

which a constraint not to allow any waiting time between

two sequential processes for each job applies. This kind

of problem can be found in many industries with “no-

wait” constraint, such as steel processing, plastic Indus-

tries, and chemical-related industries (such as pharmacy

and food industries), also for semiconductor testing pur-

poses [1] and [5]. On such industries, if there’s any wait-

ing time exist between processes, it may cause a defect

on the product and would require it to be reworked with

a certain process. It may also cause a product failure,

means that we must redo all the processes for related job

A Cross Entropy-Genetic Algorithm for m-Machines No-Wait Job-Shop Scheduling Problem

172

from the beginning.

Many researches were conducted to obtain a better al-

gorithm to approach this problem. A simple heuristic ap-

proach for solving this problem is presented by Mascis

and Pacciarelli. The method consists of four alternative-

graph-based greedy algorithms: AMCC, SMCP, SMBP,

and SMSP. These algorithms are also being tested on

job-shop with blocking problem, assumed that complex-

ity of both problems are almost equal [6]. Later, hybridi-

zations of more than one heuristic method tend to be

used for better results, for instances: a hybridization of

Genetic Algorithm with Simulated Annealing (GASA) to

make the convergence of the results better [2], a combi-

nation of GA with a specific genetic operator contained

ATSP and local search principle [1], and a hybridization

of Tabu Search with part of HNEH algorithm, which

aims to ensure that the solution produced is much ac-

ceptable [3].

In [2], another type of heuristic method based on local

neighbourhood search so called fast deterministic vari-

able neighbourhood search is introduced. The search was

used for exploiting the special structure of the problem to

be solved with GASA algorithm. While the development

of hybridization methods is gaining its momentum, the

pure methods are not yet old fashion. In fact, modifica-

tions have been proposed to improve obtained solution.

A complete local search with memory (CLM) using local

neighbourhood search was introduced withthe use of a

memory system for special purposes i.e. to avoid the

same solution alternative visited [7]. Modifications of

completed local search with limited memory (CLLM) are

also options in the field of pure heuristic development,

i.e. by giving a constraint to limit the number of memory

to CLM algorithm [8], and the preference to use a shift

timetabling technique rather than enhanced timetabling

proposed on CLM. Graham et al. (tahun) on literature [2]

defines NWJSS problem as follow.

Given a set of machines



1, 2,,



m



1, 2,,

purposed

to process a set of jobs

n









, ,O



. For each i-th job

J, given a sequence of j operations

as the detail of process

in i-th job. Each operation has







,1,, 2,OOiOi





,, ,mij wijM







,wij

, specifying that operation

will be processed on with processing

time .



,mij

No wait constraint is given by setting the condition of

’s starting time equals to ’s finishing

time. Then, the assumptions used are: one job can not be

proc- essed at more than one machine at a time, or one

machine can not process more than one job at a time;

also there is no interruption or pre-emption allowed.



,Oij

Generally, this problem is divided into two sub-prob-

lems: 1) sequencing; is how to find the best sequence of

job-scheduling-priority with the best makespan obtained

from all of the combinations, and 2) timetabling; is how

to get the best starting time for all jobs scheduled for

finding better makespan than one obtained from se-

quencing sub-problem [8].

As illustration, an example is given below.

Given a set of jobs





1, 2, 3J to be processed on a

set of machines {I, II, III}. The route of machines and

processing time for each machine is indicated in Table 1.

The sequencing sub-problem here is how to find the

priority of each job to be scheduled. There are 3! possi-

bilities or 6 priority sequences: 1-2-3, 1-3-2, 2-1-3, 2-3-1,

3-1-2, and 3-2-1. Then, the timetabling sub-problem is

how to get the best makespan from all possible se-

quences. For example, for priority sequence 1-2-3, with a

type of timetabling method will produce result as shown

in Figure 1(a). When another method used, it may pro-

duce result as shown in Figure 1(b). From this explana-

tion, we can conclude that the use of different timeta-

bling methods may result in different makespans.

The use of optimum sequencing method within opti-

mum timetabling method may not produce an optimum

makespan, and otherwise. Therefore, to obtain the best

makespan, we must choose the best method to combine

these two sub-problems.

3. Problem Formulation

Referring to Brizuela’s model [1], NWJSS problem with

objective of minimizing makespan can be modelled with

integer programming formulation as follow:

 Symbols definition

Ji i-th job

(a)





,1Oij

(b)

Figure 1. Comparison of timetable results using different

methods for sequence 1-2-3.

A Cross Entropy-Genetic Algorithm for m-Machines No-Wait Job-Shop Scheduling Problem173

Mk k-th machine

k Operation of Ji to e processed on Mk

O(i,j) j-th operation of Ji

Ni Number of operations in Ji

 Problem parameters

M A very large positive number

n Number of jobs

m Number of machines

k Processing time Ji on Mk

r(i,j,k) 1 if O(i,j) requires Mk, 0 otherwise

 Decision variables

Cmax Maximum completion time of all jobs (makespan)

k The earliest starting time of Ji on Mk

Z(i,i’,k) 1 if Ji precedes Ji’ on Mk, 0 otherwise

 Problem formulation

Minimize Cmax

Subject to



,, ,,

ii i

ijk ijk

rsw rs







(1)







kk k

ii iii k

ss wMZ



  (2)



kk k

ii iii k

ss wMZ



 







(3)



max

mkk

iN k

krswC



 (4)

max 0C; ; (5) 0

s

with ; ;

and .



1, 2,,,in





0, 1



1, 2,,1

jN

1'ii n 



1, 2,,k

,,ijk

Constraint (1) restricts that Mk begins the processing

of right after finished (to ensure that

no-wait constraints are met). Constraints (2) and (3) en-

force that only one job may be processed on a machine at

any time. is a binary variable used to guarantee

that one of the constraints must hold when the other is

eliminated. Constraint (4) is useful to minimize Cmax in

the objective function. Finally, Constraint (5) guarantees

that Cmax and si

k are non-negative.



,1ij

O



,ij



,',ijk

4. Cross Entropy

4.1. Basic Idea of Cross Entropy

If GA is inspired by natural biological evolution theory

developed by Mendel, which includes genes transmission,

natural selection, crossover/recombination and mutation,

differently, cross entropy (CE) is inspired by a concept

of modern information theory namely the concept of

Kullback-Leibler distance, also well-known with the

same name: the concept of cross entropy distance [4].

This concept was developed to measure the distance be-

tween an ideal reference distribution and the actual dis-

tribution. This method generally has two basic steps,

generating samples with specific mechanism and updat-

ing parameters based on elite sample. The concept then

is redeveloped by Reuven Rubinstein with combining the

Kullback-Leibler concept and Monte Carlo simulation

technique [4].

CE has been applied in wide range of problems. Re-

cently, it had been applied in credit risk assessment

problems for commercial banks [8], in clustering and

vector quantization [9], as well as to solve combinatorial

and continuous optimization problem [4]. Additionally,

CE is also powerful as an approach to combining multi-

ple object classifiers [9] and network reliability estima-

tion [10] while other has successfully used CE on gener-

alized orienteering problem [11]. CE application has

been widely adopted in the case of difficult combinato-

rial such as the maximal cut problem, Traveling Sales-

man Problem (TSP), quadratic assignment problem,

various kinds of scheduling problems and buffer alloca-

tion problem (BAP) for production lines [4].

For solving optimization problem, cross entropy in-

volves the following two iterative phases:

1) Generation of a sample of random data (trajectories,

vectors, etc.) according to a specified random mechanism,

i.e. probability density function (pdf)

2) Updating parameters of the random mechanism,

typically parameters of pdfs, on the basis of data, to pro-

duce a “better” sample in the next iteration.

Suppose we wish to minimize some cost function S(z)

over all z in some set Z. Let us denote the minimum by

γ*, thus



*min

zZSz





 (6)

We randomize our deterministic problem by defining a

family of auxiliary pdfs







vvV and we asso-

ciate with Equation (6) the following estimation problem

for a given scalar γ:









()[ ]

PSZ EI





 the so-called associated

stochastic problem. Here, Z is a random vector with pdf

(.;u), for some u



V (for example Z could be a Ber-

noulli random vector). We consider the event “cost is

low” to be rare event









ISZ



 of interest. To esti-

mate the event, the CE method generates a sequence of

tuples









ˆˆ



, that converge (with high probability) to

small neighbourhood of the optimal tuple









*, *v



where γ* is the solution of the problem (6), and v* is a

pdf that emphasize values in Z with low cost. We note

that typically the optimal v* is degenerated as it concen-

trates on the optimal solution (or small neighborhood

thereof). Let ρ denote the fraction of the best samples

used to find the threshold γ. The process based on sam-

pled data is termed the stochastic counterpart since it is

based on stochastic samples of data. The number of sam-

A Cross Entropy-Genetic Algorithm for m-Machines No-Wait Job-Shop Scheduling Problem

174

ples in each stage of the stochastic counterpart is denoted

by N, which is a predefined parameter. The following is

a standard CE procedure for minimization borrowed

from [4].

We initialize 00 and choose a not very small

ρ (rarity coeficient), say

vvu

10 2



. We then proceed

iteratively as follows.

4.1.1. Adaptive updatin g o f γt

A simple estimator ˆt



of γt can be obtained by taking

random sample



 

ZN from the pdfs f(·;vt-1),

calculating the performance







lSZ for all l ordering

them from smallest to biggest as









1, ,SS

ˆt

and

finally evaluating the ρ 100% sample percentile as





([ ])N



4.2.2. Adaptive updatin g o f vt

For a fixed γt and vt-1, derive vt from the solution of the

program:







minminln ;

vSZ

DvxEIf Zv









(7)

The stochastic counterpart of (7) is ˆt



and 1

ˆt



, de-

rive from the following program:

ˆt







()

minminln ;

DvIf Zv











(8)

The update formula of the kth element in v (Equation

(8)) in this case simply becomes:

























(9)

To simplify Equation (9), we can use the following

smoothed version provided by [4]:



ˆˆ ˆ

tt t

vv v





 (10)

where is the parameter vector obtained from the so-

lution of Equation (8), and β is a smoothing parameter.

The CE optimization algorithm is summarized in Algo-

rithm 1.

ˆt

Algorithm 1. The CE Method for Stochastic Optimi-

zation

1) Choose . Set t = 1 (level counter)

2) Generate a sample Z(1),, Z(N) from the density f

(·;vt-1) and compute the sample ρ 100-percentile

ˆt



the sample scores.

3) Use the same sample Z(1),, Z(N) and solve the

stochastic program (8). Denote the solution by .



4) Apply (10) to smooth out the vector .



ˆˆ

5) If for some , say d = 3, 1d

dt ˆ

tt t









then stop; otherwise set t = t + 1 and reiterate from step

It is found empirically that the CE method is robust

with respect to the choice of its parameter N, ρ, and β.

Typically those parameters satisfy that 0.01 0.1





,

0.5 0.9





, and , where n is the number of

parameter.

3Nn



This procedure provides the general frame. When we

are facing a specific problem, we have to modify it to fit

it with our problem.

4.2. Cross Entropy for Combinatorial

Optimization

In case of job scheduling we require parameter P in place

of v. P is a transition matrix where each entry pi,j denotes

the probability of the job i to the place j, for i = 1, 2,,

n, j = 1, 2,, n, where n is the number of job. For the

initial P we can put equal values to all entries, it means

that the probability of the job i to the place j is equally

distributed.

Based on matrix P, we will generate N sequences of

jobs. Each sequence (Zi) will be evaluated based on S(zi)

where S = Cmax value for each sequence. Out of N se-

quences, we take ρN percent elite samples with the best S

(instead of using



as a threshold to select elite sample).

Let ES = ρN, the updating formula for is given



ˆ,

Pij





ˆ,

PijES





(11)

To generate sequence of job we can use trajectory

generation using node placement [4] as shown in Algo-

ritm 1.

Algorithm 2. Trajectory generation using node place-

ment

1) Define P(1) = P, Let k = 1

2) Generate Zk from the distribution formed by the k-th

row of P(k). Obtain the matrix P(k+1) from P(k) by first

setting the Zk-th column of P(k) to 0 and then normalizing

the rows to sum up to 1.

3) If k = n then stop, otherwise k = k + 1 and reiterate

from Step 2.

4) Determine the sequences and evaluate their makespan

The main CE algorithm for job scheduling is given in

Algorithm 3.

Algorithm 3. The CE method for Job scheduling

1) Choose initial reference transition matrix 0, say

with all entries equal to 1/n, where n is the number of job.

Set t = 1.

2) Generate a sample Z1, , ZN of job sequence via

Algorithm 2 with P = t-1 and choose ρN elite sample

with the best performance of S(z).



3) Use the elite sample to update .

ˆt

4) Apply (10) to smooth out matrix .

ˆt

5) If for some , say d = 5, 1d

tdˆˆ ˆ

tt t









then stop; otherwise set t = t+1 and reiterate from step 2.

4.3. Example

A Cross Entropy-Genetic Algorithm for m-Machines No-Wait Job-Shop Scheduling Problem175

Table 1. NWJSS case example

Job Operation Machine Processing Time

1 O11 I 1

12 II 3

2 O21 I 1

22 II 4

23 III 2

3 O31 I 1

32 II 3

To understand the use of CE in jobshop scheduling more

easily, let’s see the following example. There are 3 jobs

with known processing time (L), due date (d) and weight

for tardiness (w) for each job as in Table 1. It is desired

to find the optimal sequence based on total weighted

tardiness. Let’s use N = 6, ρ = 1/3, β = 0.8.

The objective function for jobshop scheduling with

single machine with minimum total weighted tardiness

(SMTWT) is





minminmax, 0

zZkk k

Szwf d







where 1





Suppose the initial transition matrix is

13 1313











and the population generated is as follows:

Z1: 1-2-3; with S = 1.5

Z2: 1-3-2; with S = 2

Z3: 2-1-3; with S = 0.5

Z4: 2-3-1; with S = 1

Z5: 3-1-2; with S = 2

Z6: 3-2-1; with S = 2

Two best samples as elite sample are

Z3: 2-1-3; with S = 0.5

Z4: 2-3-1; with S = 1

For the sequence 2−1−3, we have

0120

12 00

0012











Considering the second best sequence 2-3-1, we get

010

120 12











Using P1 = βw + (1 − β) P0, we obtain the transition

probability for the next iteration as

0.0667 0.8667 0.0667

0.4667 0.0667 0.4667























Using this transition probability, N new sequences will

be generated. From these sequence, evaluate the objec-

tive function S(z) and repeat the same steps until stop-

ping criteria is met.

5. Proposed Algorithm

The proposed method to solve the NWJSS problem is a

hybrid of cross entropy with genetic algorithm (CEGA).

The cross entropy is used as the basic; while from the

GA the procedure of sample generation is adopted.

For this NWJSS problem, the flowchart of CEGA is

given in Figure 2.

The explanations of Figure 1 is as follows:

Defining Inputs and Outputs

The inputs and outputs are determined as follows:

Inputs:

 Machine routing matrix (R(j, k); j state the job number

and k state the operation number)

 Processing time matrix (W(j, k))

 Number of population (N)

 Ratio of elite sample (ρ)

 Smoothing coefficient (β)

 Initial crossover rate (Pps)

 Terminating criterion (



)

Figure 2. Flowchart of CEGA.

A Cross Entropy-Genetic Algorithm for m-Machines No-Wait Job-Shop Scheduling Problem

176

Outputs:

 Best schedule’s timetable (starting and finishing time

of each job)

 Best schedule’s makespan (Cmax)

 Computational time (T)

 Number of iteration

Assessing Initial Parameters

The initial values of predefined inputs (N, ρ, α, initial Pps,

and



) are determined by user. The parameters values are

as follows:

 Population size N, there is no certain threshold, the

larger number of job, requires larger number of popu-

lation size as the permutation of possible schedules

getting bigger. In this paper we use N as cubic of the

number of jobs (n3).

 Elite sample ratio ρ, suggested range is 1% - 10% [4].

In this paper, we used ρ = 2%.

 Smoothing coefficient α, the range is 0 - 1, and 0.4 -

0.9 is empirically the optimum range [4]. We used β =

0.8.

 Crossover rate (Pps), we used Pps = 1 for the initial

value.

 Terminating criterion



= 0.001.

Generating Sample

Each sample represents the sequence of job, which

should be scheduled as early as possible. The generation

of initial sample (iteration = 1) is fully randomized, but

in the next iterations, samples are generated using ge-

netic algorithm operators (crossover and mutation), are

done based on these stepsare done based on these steps:

1) Weighting Elite Sample

This weighting is necessary for the next step (selecting

parents), where the first parent is selected from elite sam-

ples by considering the weight of each elite sample. The

weighting rule is, if the makespan generated by a se-

quence is better than the best makespan ever visited of

the previous iteration, the weight is equal to the number

of elite sample, otherwise is given 1.

2) Assessing Linear Fitness Rank

Linear fitness rank (LFR) for actual iteration calculated

from fitness value of all sample generated in the previous

iteration. The value of LFR is formulated by











111

maxmax min

LFR INIFFFiN

where the fitness value is same as 1/makespan value. i is

stated the i-th sample (which is valued between 1 and N),

and I state the job index on sample matrix.

3) Selecting Parents

Parent selection is conducted by using roulette wheel

selection,samples with higher fitness values have larger

chance to be selected as parent. The first parent is se-

lected from elite samples (with the weight calculated by

Step 3a), and the second parent is selected from all of the

last iteration samples with LFR weight from step 3b).

4) Crossover

Crossover is done with two-point order-crossover tech-

nique, which the choosing of points held randomly from

both of parents. The offspring resulted from this tech-

nique have the same segment between these two points

with their parents. Other side, the other segment will be

kept from the other different parent’s sequence of jobs.

5) Mutation

Mutation was conducted with swapping mutation tech-

nique, whereas mutation conducted by exchanging se-

lected job with another job in the same offspring.

Calculating Makespan

The calculation of makespan value will be conducted

with simple shift timetabling method, adapted from shift

timetabling method by Zhu et al. [8]. The steps are:

a) Schedule the first job from t = 0

b) Schedule the next job from t = 0, check whether or

not machines are overloads. If they exist, shift jobto

therightside until there is no machine overloaded.

c) Repeat b) until all jobs are scheduled

Choosing Elite Sample

Elite sample was chosen as [ρN] best sample out of

population N based on makespan values.

Updating Crossover Rate and Mutation Rate

Parameter updating is done by taking the ratio between

average makespan and best makespan in each iteration,

noted as u. Crossover rate then updated with Pi = βu + (1

– β)Pi-1 , and mutation rate defined as half of crossover

rate.

Checking for Terminating Condition

Terminating condition used in this research is when the

difference between actual crossover rate with crossover

rate from previous iteration is less than (If this condition

is met, then stop the iterations. Otherwise, repeat from

Step 4.

The outputs of this process are the best timetable and

makespan, computational time, and number of iteration.

For more explanation, we use data in Table 2 as an

example. From Table 2, we obtain machine routing and

processing times as follows:

 Machine routing matrix

123

:1 32

123





















Table 2. Job data.

No Lj d

j w

1 1.0 2.0 1.0

2 1.0 1.0 1.0

3 1.0 2.5 1.0

A Cross Entropy-Genetic Algorithm for m-Machines No-Wait Job-Shop Scheduling Problem177

 Processing time matrix

130

:1 42

130







The row and column denote the number of job and

operation respectivelly. Actually O13 and O33 in W do not

exist, 0 processing time (dummy operation) in these en-

tries is just to keep the matrices squared. The other re-

quired parameters are N, ρ, β, initial Pps, and ε. Let set N

= 3, ρ = 0.02, β = 0.8; initial Pps = 1; and ε = 0.001. The

terminating condition is reached when |Pps(it) – Pps(it–1)| ≤

ε.

Initially, the population is generated randomly; sup-

pose the initial population is

123

312

213











For each sample, we compute the makespan with left-

shift technique, results 11 for first, 9 for second, and 10

for third instances. Then, we choose the elite samples by

[ρN] or [(0.2)(3)] = 1. Therefore only one out of three is

chosen as the elite sample, and it must be 3-1-2 with

makespan value 9.

Then, update the crossover rate (Pps) by updating pa-

rameter u value. Let the value for this NWJSS problem is

Average makespan

2The best makespan

u

Average makespan denotes the average of makespan

obtained in current iteration. The best makespan is the

best value of makespan in current iteration.

For current iteration, u value is 10

29 or 5

8. The Pps

for next iteration then updated as 0.85

8 + 0.2



1 = 0.7,

and the mutation rate is (1/2)0.7 = 0.35.

Go to next iteration. For second iteration until termi-

nating condition reached, generating samples will be

done by GA mechanism. First we must compute the

weight value w and the LFR value of each samples gen-

erated before. For this problem, both w for elite sample

or non-elite sample is 1, cause of the size of elite sample

is also 1. The Fmax value is 1/9, while Fmin is 1/11. Then,

for each sample, the LFR value results is 1/9, 1/11, and

10/99.

For parent selection, we use the roulette wheel mecha-

nism, when the first parent is chosen by weight value w,

and the second is chosen from LFR selection. Then we

conduct the two-point order crossover. The “chromo-

some” to be changed with the crossover results are just

the second and third, while the first sample is changed

with the first rank of sample elite to keep the best

makespan results (elitism mechanism).

Let 1-2-3 and 3-1-2 as the chosen parents. Choose a

random U (0, 1) number, say 0.56. Since 0.56 < 0.7, then

do the crossover mechanism. Let the lower and upper

bound of crossovered “genes” are 2 and 2 (so just the

second “gen” to be crossovered).

Then the temporary population is

312

213

321























After that, conduct the swap mutation mechanism for

each new sample (except the top one) by firstly choosing

again a random U (0, 1) number and check with the mu-

tation rate. When the mutation condition met, choose 2

different genes to be exchanged randomly. Suppose only

the second sample will be mutated, and the exchanging

genes are gen 2 and gen 3. Then, the new “chromosome”

is 2-3-1, and the temporary new samples matrix after

updated replaces old population matrix:

312

231

321























Do the same process as the first iteration (calculating

makespan etc.) until the terminating condition reached.

6. Experiments

The algorithm was coded using Matlab. The experiment

is conducted in 30 replications. The average and standard

deviation of all replications were recorded. The data used

in this experiment are taken from OR Library, including

Ft06, Ft10, La01-La25, Orb01-Orb06 and Orb08-Orb10.

The best makespan average and standard deviation

resulted from the experiments are shown in Tables 3 and

4. Ref denotes the best known solution obtained using

branch and bound technique. The term ARPD was cal-

culated using this formula:





100

best ref

ARPD ref





Based on the results in Table 3, we can see that the

minimum value of ARPD is 0.0 and all of the ARPD

values are below 1.0 (except La02 and La18). This value

shows that CEGA method can give good result as well as

branch and bound calculation. In addition, for Ft06 and

Orb08 data, the minimum and standard deviation of

ARPD value in CEGA is 0.0, which means that all repli-

A Cross Entropy-Genetic Algorithm for m-Machines No-Wait Job-Shop Scheduling Problem

178

Table 3. Performance of CEGA for small instances.

Size Makespan CEGA Time (sec)

Instances Job/

Mach

Ref Best Avg StDev ARPD AvgStDev

ft06 6/6 73 73 73.0 0.0 0.0 7.10.1

la01 10/5 971 975 990.1 14.7 0.4 132.610.8

la02 10/5 937 961 970.9 9.0 2.5 141.16.1

la03 10/5 820 820 852.4 19.3 0.0 133.012.2

la04 10/5 887 887 891.7 8.8 0.0 130.112.7

la05 10/5 777 781 788.0 11.4 0.5 149.312.4

ft10 10/10 16071607 1611.9 12.4 0.0 269.412.1

orb01 10/10 16151615 1630.6 20.2 0.0 307.824.2

orb02 10/10 14851485 1509.2 14.8 0.0 240.410.8

orb03 10/10 15991599 1620.2 19.8 0.0 295.816.9

orb04 10/10 16531653 1692.7 39.3 0.0 278.514.0

orb05 10/10 13651370 1390.3 18.7 0.4 257.114.9

orb06 10/10 15551555 1559.1 15.5 0.0 284.215.0

orb08 10/10 13191319 1319.0 0.0 0.0 291.73.4

orb09 10/10 14451445 1482.6 39.2 0.0 270.723.2

orb10 10/10 15571557 1585.6 16.2 0.0 253.417.7

la16 10/10 15751575 1581.5 19.9 0.0 250.913.6

la17 10/10 13711384 1405.5 24.9 0.9 241.521.6

la18 10/10 14171507 1509.7 9.1 6.0 240.017.8

la19 10/10 14821491 1531.4 34.8 0.6 253.27.4

la20 10/10 15261526 1542.5 28.8 0.0 255.811.4

Table 4. Performance of CEGA for large instances.

Size Makespan CEGA Time (sec)

Instan

ces Job/

Mach

Ref Best Avgstd ARPD Avg std

la06 15/5 1248 1304 1342.024.3 4.3 1635.8236.9

la07 15/5 1172 1221 1265.723.9 4.0 1670.1205.8

la08 15/5 1244 1274 1323.823.2 2.4 1626.9187.2

la09 15/5 1358 1382 1443.121.2 1.7 1745.9202.6

la10 15/5 1287 1299 1353.930.7 0.9 1624.3250.6

la11 20/5 1671 1722 1793.531.9 3.0 10061.71097.2

la12 20/5 1452 1538 1597.926.2 5.6 9695.11102.8

la13 20/5 1624 1674 1759.130.7 3.0 10525.01028.4

la14 20/5 1691 1749 1821.431.6 3.3 9976.71199.6

la15 20/5 1694 1752 1851.941.3 3.3 10722.11237.0

la21 15/10 2048 2054 2209.862.1 0.3 3032.3426.2

la22 15/10 1887 1910 1972.142.1 1.2 2970.9418.5

la23 15/10 2032 2098 2184.045.2 3.1 2995.8429.7

la24 15/10 2015 2056 2133.636.9 2.0 2889.1332.4

la25 15/10 1917 1994 2059.431.7 3.9 3049.5458.7

cations gives the optimal value. Based on this result, we

can conclude that the performance of CEGA is relatively

well, especially for small instances.

For larger size problem, CEGA’s performance tends to

decline. Based on the result in Table 4, we can see that

most of the ARPD values are greater than 1.0. It is oc-

curred because the increase of jobs number processed

will increase the number of search space as factorial. For

example, when the job number increases from 10 to 15

jobs, the search space increases from 10! to 15! or 15 ×

14 × 13 × 12 × 11 = 360360 times larger than 10 jobs.

This increase, of course is hard to be followed by the

number of sample size. However, by using n3 number of

sample, we can say that this algorithm still has a good

tolerable performance. This is indicated by the ARPD

values which are less than 1.0 (for example in La10 and

La21) and most of ARPD value are still less than stan-

dard error 5.0 (except La12).

Based on the result shown by Tables 3 and 4, the best

and average makespan value obtained from all replica-

tions, tends to have a close value with the reference

makespan that shows that the t to the result 3 and Table

4, the algorithm’s performance is good enough. Mean-

while, the standard deviation tends to be large (greater

than 10.0, except in certain cases). This means that the

algorithm produces non-uniform makespan at each repe-

tition. But, the probability of getting best result is rela-

tively higher. To assure that the result not to converge to

a local optimum, this algorithm actually needs to be op-

timized again. By reducing the standard deviation value

with better or at least same average value (which means

the makespan values obtained at each repetition will be

more uniform but still tend to approach best value of

reference).

Although this algorithm produces better makespans,

the computation time is relatively long and will increase

significantly when the job size are getting larger. This

fact can be seen on Figure 2, where the average time

needed by this algorithm and standard deviation value

increase drastically as the job size increases. This is al-

leged to be caused by timetabling process to calculate the

objective function which is relatively time consuming.

As explained previously that simple shift timetabling

method used requires checking for the presence of over-

load on the machine for each new job scheduled. Every

will-be-scheduled-job has a specific machine overload

when it is being scheduled. It must be shifted as far as

the relevant value of time of overload on that machine.

After being moved, the other machines are overload.

Then the shifting must be done again until all of ma-

chines have no overload. This mechanism certainly takes

a long computing time, especially when the size of the

jobs is getting larger. The use of lower level program-

ming language such as C or C++ can improve computa-

tional time performance.

Compared with other algorithms, such as Genetic Al-

gorithm-Simulated Annealing [2] and Hybrid Tabu

Search [3], CEGA performance can be shown in Tables

5 and 6. Highlighted in bold is the best makespan for

each instance. Reference makespans (Ref), for small in-

stances, are the optimum value obtained by branch and

bound algorithm. For large instances are the best known

makespan ever obtained by researchers until present [8].

Based on the comparison in Table 5, for small in-

stances, we can see that the performance of CEGA is

absolutely better than GASA in terms of makespan. Out

of 21 instance, CEGA can reach 18 optimal makespan

values better than GASA which only reach only 4 in-

A Cross Entropy-Genetic Algorithm for m-Machines No-Wait Job-Shop Scheduling Problem179

Table 5. Makespan comparison of GASA, HTS and CEGA

for small instances.

GASA HTS CEGA

Instance Ref Best ARPD Best ARPD Best ARPD

ft06 73

73 0.0 73 0.0 73 0.0

la01 971 1037 6.4 975 0.4 9750.4

la02 937 990 5.4 975 4.1 9612.5

la03 820 832 1.4 820 0.0 8200.0

la04 887 889 0.2 889 0.2 8870.0

la05 777 817 4.9 777 0.0 7810.5

ft10 1607 1620 0.8

1607 0.0 16070.0

orb01 1615 1663 2.9 1615 0.0 16150.0

orb02 1485 1555 4.5 1518 2.2 14850.0

orb03 1599 1603 0.2 1599 0.0 15990.0

orb04 1653

1653 0.0 1653 0.0 16530.0

orb05 1365 1415 3.5 1367 0.1 13700.4

orb06 1555

1555 0.0 1557 0.1 15550.0

orb08 1319

1319 0.0 1319 0.0 13190.0

orb09 1445 1535 5.9 1449 0.3 14450.0

orb10 1557 1618 3.8 1571 0.9 15570.0

la16 1575 1637 3.8 1575 0.0 15750.0

la17 1371 1430 4.1 1384 0.9 13840.9

la18 1417 1555 8.9 1417 0.0 15076.0

la19 1482 1610 8.0 1491 0.6 14910.6

la20 1526 1693 9.9 1526 0.0 15260.0

Average 3.5 0.52 0.5

Table 6. Makespan comparison of GASA, HTS and CEGA

for large instances.

GASA HTS CEGA In-

stances Ref Best ARPD Best ARPD BestARPD

la06 1248 1339 6.8 1248 0.0 13044.3

la07 1172 1240 5.5 1172 0.0 12214.0

la08 1244 1296 4.0 1298 4.2 12742.4

la09 1358 1447 6.2 1415 4.0 13821.7

la10 1287 1338 3.8 1345 4.3 12990.9

la11 1671 1825 8.4 1704 1.9 17223.0

la12 1452 1631 11.0 1500 3.2 15385.6

la13 1624 1766 8.0 1696 4.2 16743.0

la14 1691 1805 6.3 1722 1.8 17493.3

la15 1694 1829 7.4 1747 3.0 17523.3

la21 2048 2182 6.1 2191 6.5 20540.3

la22 1887 1965 4.0 1922 1.8 19101.2

la23 2032 2193 7.3 2126 4.4 20983.1

la24 2015 2150 6.3 2132 5.5 20562.0

la25 1917 2034 5.8 2020 5.1 19943.9

Average 6.5 3.3 2.8

stance. The average ARPD resulted by GASA is also

larger than the average ARPD resulted by CEGA. Com-

paring CEGA with HTS shows that CEGA is better than

HTS in terms of makespan. HTS reach 14 optimal

makespan which is less than those of CEGA. Though,

the ARPD of HTS is better than CEGA.

For larger instances, as shown in Table 6, compared to

GASA and HTS, generally CEGA performed better.

CEGA dominates 9 out of 15 instances, while the rest is

outperformed by HTS. For all those instances, all the

three methods; GASA, HTS and CEGA; can not reach

the ever best makespan values. In terms of ARPD, the

performance of CEGA is slightly better than HTS.

7. Conclusions

We have applied hybrid cross entropy-genetic algorithm

(CEGA) to solve NWJSS. We can conclude that CEGA

can be used as an alternative tool to solve NWJSS prob-

lem and can be applied widely on many industries with

NWJSS characteristics. For small instances CEGA per-

formed well in terms of makespan and computation time.

Generally, CEGA performance is better than the Genetic

Algorithm-Simulated Annealing (GASA) and Hybrid

Tabu Search (HTS), especially for small size instances.

In the future research, CEGA for NWJSS must be modi-

fied to get better performance especially for the large

size instances. The implementation using lower level

programming language might improve the performance

of CEGA. On the other hand, this algorithm application

on the other problems is also suggested.

8. References

[1] P. J. Chao-Hsien and H. Han-Chiang, “A Hybrid Genetic

Algorithm for No-Wait Job Shop Scheduling Problems,”

Expert Systems with Application, Vol. 36, No. 2, Part 2,

2009, pp. 5800-5806.

doi:10.1016/j.eswa.2008.07.005

[2] C. J. Schuster and J. M. Framinan, “Approximative Pro-

cedures for No-Wait Job Shop Scheduling,” Operations

Research Letters, Vol. 31, No. 4, 2003, pp. 308-318.

doi:10.1016/S0167-6377(03)00005-1

[3] W. Bożejko and M. Makuchowski, “A Fast Hybrid Tabu

Search Algorithm for the No-Wait Job Shop Problem,”

Computers & Industrial Engineering, Vol. 56, No. 4,

2009, pp. 1502-1509.

doi:10.1016/j.cie.2008.09.023

[4] R.Y. Rubinstein and D. P. Kroese, “The Cross-Entropy

Method: A Unified Approach to Combinatorial Optimi-

zation. Monte-Carlo Simulation and Machine Learning,”

Springer Verlag, New York, 2004.

[5] C. F. Liaw, “An efficient Simple Metaheuristic for Mini-

mizing the Makespan in Two-Machine No-Wait Job

Shops,” Computer & Operations Research, Vol. 35, No.

10, 2008, pp. 3276-3283.

doi:10.1016/j.cor.2007.02.017

[6] A. Mascis and D. Pacciarelli, “Discrete Optimization:

Job-Shop Scheduling with Blocking and No-Wait Con-

straints,” European Journal of Operational Research,

Vol. 143, No. 3, 2002, pp. 498-517.

doi:10.1016/j.cor.2007.02.017

[7] J. M. Framinan and C. J. Schuster, “An Enhanced Time-

tabling Procedure for the No-Wait Job Shop Problem: a

Complete Local Search with Memory Approach,” Com-

puters & Operations Research, Vol. 33, No. 1, 2006, pp.

1200-1213. doi:10.1016/j.cor.2004.09.009

[8] J. Zhu, X. Li and Q. Wang, “Complete Local Search with

Limited Memory Algorithm for No-Wait Job Shops to

Minimize Makespan,” European Journal of Operational

Research, Vol. 198, No. 2, 2009, pp. 378-386.

doi:10.1016/j.ejor.2008.09.015

A Cross Entropy-Genetic Algorithm for m-Machines No-Wait Job-Shop Scheduling Problem

180

[9] M. Derek, “A Sequential Scheduling Approach to Com-

bining Multiple Object Classifiers Using Cross-Entropy,”

In: T. Windeatt and F. Roli, Eds., MCS 2003, LNCS 2709,

Springer-Verlag, Berlin, pp. 135-145.

[10] D. P. Kroese and K. P. Hui, “Applications of the Cross-

Entropy Method in Reliability Computational Intelli-

gence,” Reliability Engineering (SCI), Vol. 40, 2007, pp.

37-82.

[11] B. Santosa and N. Hardiansyah, “Cross Entropy Method

for Solving Generalized Orienteering Problem,” iBusi-

ness, Vol. 2, No. 4, 2010, pp. 342-347.