J. Software Engineering & Applications, 2010, 3, 541-547
doi:10.4236/jsea.2010.36062 Published Online June 2010 (http://www.SciRP.org/journal/jsea)
Copyright © 2010 SciRes. JSEA
User Session-Based Test Case Generation and
Optimization Using Genetic Algorithm*
Zhongsheng Qian
School of Information Technology, Jiangxi University of Finance & Economics, Nanchang, China.
Email: changesme@163.com
Received March 22nd, 2010; revised April 12th, 2010; accepted April 13th, 2010.
ABSTRACT
An approach to generating and optimizing test cases is proposed for Web application testing based on user sessions using
genetic algorithm. A large volume of meaningful user sessions are obtained after purging their irrelevant information by
analyzing user logs on the Web server. Most of the redundant user sessions are also removed by the reduction process.
For test reuse and test concurrency, it divides the user sessions obtained into different groups, each of which is called a
test suite, and then prioritizes the test suites and the test cases of each test suite. So, the initial test suites and test cases,
and their initial executing sequences are achieved. However, the test scheme generated by the elementary prioritization is
not much approximate to the best one. Therefore, genetic algorithm is employed to optimize the results of grouping and
prioritization. Meanwhile, an approach to generating new test cases is presented using crossover. The new test cases can
detect faults caused by the use of possible conflicting data shared by different users.
Keywords: User Session, Genetic Algorithm, Test Case, Test Suite, Reduction, Prioritization
1. Introduction
Incapable Web applications can have far-ranging conse-
quences on businesses, economies, scientific progress,
health, and so on. Therefore, all the entities of a Web
application, in essence, must be tested adequately to en-
sure that the application meets its original design speci-
fications.
In Web application testing, some test methods and
techniques were presented [1-4]. Kung, et al. [2] depicted
an object-oriented Web Test Model to support Web ap-
plication testing. Hieatt, et al. [1] introduced a method of
acceptance testing, and developed a testing tool to show
the system operations and the expected output results in
XML. The approaches proposed by Kung and Hieatt, et al.,
however, focus primarily on unit testing without con-
cerning the whole testing for Web applications. Liu, et al.
[4] presented a data flow-based approach to testing Web
applications. Most of these methods of testing Web ap-
plications are achieved through extending the testing
methods for traditional software. Additionally, none of
these methods [1-4] yields test data according to user
sessions.
Elbaum, et al. [5] demonstrated the fault detection ca-
pabilities and cost-effectiveness of user session-based
testing. Increment concept analysis [6] is used to analyze
user sessions dynamically and minimize continuously the
number of user sessions maintained. Khor, et al. [7] com-
bined the genetic algorithm with formal concept analysis
to trace the relationship between test data and corres-
ponding test run.
Sthamer [8] analyzed deeply the test case optimization
efficiency of different coding schemes and fitness func-
tions of genetic algorithm for different structure-based
software in his doctoral dissertation. Some researchers
also studied on test case generation techniques using
genetic algorithm [9-10]. However, they aimed for sim-
plex optimization focusing on one-off optimization com-
putation, while the testing is continuous and iterative. So,
dynamically continuous optimization computation is
more propitious to improve test performance. Jia, et al.
[11] discussed the key problems of producing test data of
covering designated paths using genetic algorithm, and
introduced deeply the factors of influencing the gene-
This research has been partly supported by the National High-
Tech Research and Development Plan of China under Grant No. 2007
AA01Z144; the Science and Technology Plan Project of the Education
Department of Jiangxi Province of China under Grant Nos. GJJ10120
and GJJ10117; Shanghai Leading Academic Discipline Project of Chi-
na with Project No. J50103; and the School Foundation of Jiangxi Uni-
versity of Finance & Economics of China under Grant No. 04722015.
tic algorithm’s efficiency through experimental results.
Berndt and Watkins [12] summarized the advancement in
generating data for generic algorithm-based testing re-
cently. These studies [5,7-12] used genetic algorithm to
User Session-Based Test Case Generation and Optimization Using Genetic Algorithm
542
analyze test problems with the exception of [5], but not
focusing on Web application testing. In [5], the authors
discussed user session-based Web application testing, but
not concerning deeply the optimization of test cases.
Different from them, this paper investigates a key pro-
blem in Web application testing: test case generation and
optimization. It delves into an approach to testing and
optimizing Web applications based on user sessions us-
ing genetic algorithm. When converting a user session to
a corresponding test case, we preserve the user input
data.
2. Collecting and Reducing User Sessions
In the logs on each Web server, each access record cor-
responds to a request by a user each time. The contents of
the record include request source (user IP address), re-
quest time, request mode (such as GET, POST), the URL
of requested information, data transport protocol (such as
HTTP), status code, the number of bytes transferred and
the type of client, etc. It needs to scan the logs only once
to resolve the original active historic records from current
access logs. However, it is difficult to organize these
original records directly, which must be preprocessed. We
first purge irrelevant data including the records whose
status codes are erroneous (the code 200 for success, 400
for error), embedded resources such as script files and
multimedia files whose extension names are .gif, .jpeg
or .css, etc., to obtain the set of user sessions for primitive
analysis. Then, we create user sessions through scanning
the logs on Web servers. Once a new IP address occurs, a
new user session is created. The sequential requests sent
from this IP address are appended to the new session
under the condition that the time interval of two con-
tinuous requests is not greater than max-session-idle-time
pre-determined, or else another new user session begins.
The set of all the user sessions is finally achieved.
There are often large volumes of user sessions collected
and a user session is converted to a test case transmitted to
Web server. Therefore, we can eliminate redundant user
sessions using reduction techniques, and then preserve
necessary user sessions. For the convenience of discus-
sion, some important concepts are given first.
Definition 1 (URL trace). URL trace is the URL se-
quence requested by a user session.
Let α be a URL trace. Its length is the number of URLs
requested in the trace, denoted by |α|.
Definition 2 (prefix). A trace α is the prefix of another
trace β, if and only if α is the subsequence of β and they
have the same initial symbol.
Definition 3 (common prefix). If a trace is the prefix of
several traces, then this trace becomes their common prefix.
Definition 4 (greatest common prefix). The longest
common prefix in all the common prefixes of two traces is
their greatest common prefix.
The longer the greatest common prefix of two traces is,
the more similar the two traces are, i.e., their similarity is
higher. Let we have four traces that are γ1 = abcdefg, γ2 =
abcdeh, γ3 = abcd and γ4 = cde. Then, γ3 is the common
prefix of γ1 and γ2, but not the greatest one. The greatest
common prefix of γ1 and γ2 is abcde. Although γ4 is the
subsequence of γ1 and γ2, it is not their prefix, for the
initial symbol of γ4 is not the same as that of γ1 and γ2. We
define a function isPrefix(α, β), which decides whether or
not α is the prefix of β, i.e., whether or not the URL trace
requested by a user session is redundant corresponding to
that requested by another user session. If there is a prefix,
then the function returns a BOOLEAN value TRUE, or
else returns FALSE. The URL trace-based user session
algorithm ReduceUSession is shown in Figure 1. Note
that, an HTTP request here is regarded as a symbol of a
trace.
The ReduceUSession algorithm decides whether or
not a URL trace α requested by a user session is the pre-
fix of β requested by another user session. If it is true,
then the user session corresponding to α is removed. The
number of user sessions obtained using this algorithm
will be reduced greatly.
The ReduceUSession algorithm is different from other
Algorithm: ReduceUSession
input:
The set of user sessions Λ = {s1, …, sk}, where k is the number
of user sessions;
The URL trace U1, …, Uk, which are requested by s1, …, sk
respectively;
output:
The reduced set of user sessions denoted by Γ;
begin
Γ = Ф;
while (another user session that is not marked in Λ exists)
tag1 = FALSE;
tag2 = FALSE;
Select a user session si that is not marked in Λ, and then
mark it with “USED”;
for (the URL trace Uj requested by each user session sj in
Γ)
if isPrefix(Uj, Ui) //the URLs requested by si is
//more, so sj is redundant
Γ = Γ-{sj};
tag1 = TRUE;
endif;
if isPrefix(Ui, Uj) //here, Γ keeps unchanged,
//and si is redundant
tag2 = TRUE;
break; //exit for cycle
endif;
endfor;
if tag1 || (!tag1 && !tag2) //si is necessary
Γ = Γ{si};
endif;
endwhile;
Output the reduced set of user sessions Γ;
end.
Figure 1. The URL trace-based user session reduction algo-
rithm ReduceUSession
Copyright © 2010 SciRes JSEA
User Session-Based Test Case Generation and Optimization Using Genetic Algorithm
Copyright © 2010 SciRes JSEA
543
user session reduction algorithms. Most other reduction
algorithms analyze each test case in the test suites one by
one and remove those test cases that cannot change or
affect test requirements, i.e., they distinguish redundant
and necessary test cases, as is too difficult to manage in
practice, for it is very intractable to discriminate that the
test requirements satisfied by some test cases (i.e., re-
dundant ones) are also satisfied by other test cases before
the test execution using the existential algorithms. While
our ReduceUSession algorithm decides whether or not a
URL trace requested by a user session is the prefix of
another URL trace requested by another user session.
Based on this, we can identify the redund an t test cases,
as can be easily done in practice. Besides, it covers all
the URLs requested by the original set of user sessions
and keeps the sequence of URL requests, i.e., it guaran-
tees that the original test requirements are satisfied.
3. Grouping and Prioritizing User Sessions
The user sessions reduced by the ReduceUSession algo-
rithm are divided into subgroups, each of which is re-
garded as a test suite. The goal of grouping is to reuse test
cases and the testing can be executed at different plat-
forms in parallel (or concurrently), to lessen test time and
improve test efficiency. Moreover, the interacting test can
be conducted between a pair of user sessions in each
group (see Subsection 4.4 for the details). We try our best
to keep the property for the user sessions in the same
group that the URL traces requested bear greatest com-
mon prefix of a certain length. Several discontinuous
integral threshold values, denoted by ζ1, ζ2, …ζk (ζi 1, 1
i k), are defined. The user sessions, the lengths of
whose greatest common prefix of URL traces requested
fall in between a certain threshold values, are grouped
together. Fox example, let we have three threshold values
ζ1 = 2, ζ2 = 4 and ζ3 = 7, then the user sessions are divided
into four groups that are S1, S2, S3 and S4, the lengths of
whose greatest common prefix (denoted by α) are |α| 2,
2 < |α| 4, 4 < |α| 7 and |α| > 7 respectively. The test
cases in these four groups can be executed at different
platforms in parallel (or concurrently), and the greatest
common prefix is also reused in each group. Notice that
this is not the unique grouping way. For example, in the
four traces γ1 = abcdefg, γ2 = abcdeh, γ3 = abcd and γ4 =
cde, we can divide them into two groups {γ1, γ2} and {γ3,
γ4}, the lengths of whose greatest common prefix are 4 <
|α| 7 and |α| 2 respectively, or another two groups {γ1,
γ2, γ3} and {γ4}, the lengths of whose greatest common
prefix are 2 < |α| 4 and |α| 2 respectively. Of course, it
is unnecessary to require that the greatest common prefix
of URL traces requested by any two user sessions in each
group fall in between a certain threshold values; it is
recommended that most of them satisfy this property (a
percentage can be pre-designed or generated randomly for
the measurement). A compromise way needs to be found
to classify these URL traces (or test suites), i.e., a tradeoff
should be found between grouping and concurrent testing
and test reuse. Suppose that N user sessions are divided
into K groups. In general, there is almost the same number
of user sessions in one group as that in another, i.e., the
number equals approximately to
N
K



. Additionally, this
way of grouping is preparatory, called elementary
grouping.
The common prefix indicates the users’ common
events, or the same or similar operations. It also shows
that the users bear the same or similar interests. The
longer the common prefix is, the more evident it is, as is
the case of most users. In addition, there is a special group
of user sessions, the length of whose greatest common
prefix of URL traces requested is shortest. This group of
user sessions often indicates different URL requests,
which represent distinct requirements for a Web applica-
tion. In these sessions, many aberrant events often occur
with unwonted input data. They belong to boundary cases,
which are very easy to go wrong for the Web application.
Herein, we prioritize test suites. The test suite with short-
est length of common prefix ranks first, then all the other
test suites are arranged according to their lengths of
common prefix in descending order. So, the test suite in
final position is that whose length of common prefix is
last but one. In the test suites S1, S2, S3 and S4, the lengths
of whose common prefixes are |α| 2, 2 < |α| 4, 4 < |α|
7 and |α| > 7 respectively, if we prioritize them, then the
test executing sequence for those test suites are S1, S4, S3,
S2. That is to say, the test cases in S1 are executed first,
then the test cases in S4 and S3 are executed respectively
and finally, the test cases in S2 are executed. In each test
suite, the test cases are prioritized according to the cov-
erage ratios of URLs requested, i.e., the test case with
longer URL trace requested is executed earlier. If the
lengths of URL traces of several test cases in the same test
suite are equal, then they are randomly executed.
Our approach of prioritization is different from others.
The existing methods of prioritization add the strongest
test case, which is of the maximal use for the coverage
ratio of test requirements, to the new test suite. Those
methods aim to execute earlier the test cases of high pri-
ority than those of lower priority, to satisfy some test re-
quirements as soon as possible. These methods, however,
are often difficult to find the (nearly) strongest test case
each time before test run, while our approach divides test
cases into several groups according to the idea of common
prefix before prioritizing them. It is more convenient for
the testers to selectively execute the test cases of some
group by grouping all the test cases first, to detect some
User Session-Based Test Case Generation and Optimization Using Genetic Algorithm
544
types of faults, for the same or similar types of errors are
often detected by those test cases in the same groups. In
addition, test cases can be reused by grouping and the
testing can be executed at different platforms in parallel
(or concurrently), to lessen test time and improve test
efficiency. Moreover, we have considered a special type
of user sessions, the length of whose greatest common
prefix of URL traces is shortest. This group of user ses-
sions often contains specific requests, which are the pri-
mary source for errors.
4. Testing Web Applications Using Genetic
Algorithm
After the process of grouping and prioritization above, we
obtain several initial test suites and test cases with the
initial executing sequences. However, the test scheme
generated by the elementary prioritization is not fast in
finding faults and can not satisfy the requirements earlier,
i.e., it is not much approximate to the best one. Therefore,
genetic algorithm is employed further to optimize the
grouping and prioritization.
In 1975, an American professor Holland first proposed
the idea of genetic algorithm systematically [13]. It has
attracted a large number of researchers and extended into
those aspects of optimization, search and machine learn-
ing with a solid theoretic foundation. The genetic algo-
rithm focuses on all the individuals in one population, and
uses random techniques to search efficiently for a coded
parameter space. Selection, crossover and mutation are
the basic operators in genetic algorithm; parameter coding,
the setting of initial population, the design of fitness
function, the selection of genetic operations and control
parameters consist of the critical part of genetic algorithm.
As a global optimization search algorithm of high effi-
ciency, the genetic algorithm has distinctive advantage in
solving the difficult problems in the domains of big space,
multiple peak, nonlinearity and parallel processing, etc.
In the following, we test Web applications using genetic
algorithm to further optimize the initial test suites and test
cases, and their initial executing sequences, in order to
achieve better test suites and test cases that satisfy test
requirements. The process of yielding a population of next
generation using three basic operators that are selection,
crossover and mutation once is called an iteration. To
obtain a good result, much iteration is repeated. The fol-
lowing introduces the process of selection, crossover and
mutation.
4.1 Selection
A pair of individuals is selected from a parent population
with the probability of ps. The probability that an indi-
vidual is selected is in direct proportion to its fitness
value, as is often implemented using the strategy of rou-
lette wheel [13]. In selecting, the individual of high fit-
ness value is duplicated into the population of next gen-
eration directly. The higher the fitness value of an indi-
vidual is, the higher the probability of yielding its off-
spring is, as shows that it is more appropriate to the ex-
pected result. Let we get K test suites (constituting the
initial population), which are S1, S2, …, SK of the de-
scending order according to the prioritization technique.
In practice, we combine the error coverage ratios of test
suites and the cost of test run to design fitness function.
The fitness value is listed as f1, f2, …, fK from high to
low, where fi is the fitness of Si (1iK). The probability
that an individual is selected equals to the resulting value
that its fitness value divides the sum of fitness values of
all the individuals, i.e., the selected probability of Si,
denoted by ps
Si, is
1j
K
ij
f
f
. Obviously, the sum of the
selected probabilities of all the K test suites equals to 1.
According to the discussion above, the fitness f1 and f2
corresponds to two special test suites S1 and S2, the
lengths of whose greatest common prefixes are shortest
and longest respectively. S1 and S2 are selected to be the
individuals of next generation directly (in practical use,
we can also select more than two individuals of high
fitness to become the next individuals); in case they do
not be selected. Now, we randomly yield two numbers
that are g1, g2[0, 1], and randomly select two test suites
that are Si and Sj whose probabilities are not less than g1
and g2 respectively. The two new test suites Si’ and Sj
(the individuals of next generation) are generated throu-
gh crossovering and mutating their parents Si and Sj (see
Subsections 4.2 and 4.3). Repeat the process of selection,
until adequate test suites of next generation are yielded.
One point should be emphasized that some test suites of
higher probabilities may not be selected as parents, while
others of lower probabilities are selected. This case is
reasonable and accords with the theory of biological
evolution in nature, because any thing has its necessity
and occasionality at the same time.
4.2 Crossover
The genes chain of two parent individuals selected are
crossovered with the probability of pc using the
TSCrossover algorithm, where pc is a system control
parameter. The new individuals after crossovering are
used to replace their parents. The TSCrossover algorithm
is shown in Figure 2.
Let we have two test suites Si = < c1, c2, c3, c4, c5, c6, c7,
c8, c9 >, which contains 9 test cases and Sj = < c10, c11, c12,
c13, c14, c15, c16 >, which contains 7 test cases. If the posi-
tion of crossovering is 3, then two new individuals of
next generation after crossovering Si and Sj are: Si’ = < c1,
c2, c12, c13, c14, c15, c16, c8, c9 > and Sj’ = < c10, c11, c3, c4,
c5, c6, c7 >, where the test cases indicated in bold face are
those interchanged one by one from corresponding test
cases in parent test suites.
Copyright © 2010 SciRes JSEA
User Session-Based Test Case Generation and Optimization Using Genetic Algorithm
Copyright © 2010 SciRes JSEA
545
4.3 Mutation
Each bits of the genes chain of new individuals are mu-
tated with the probability of pm using the TSMutation
algorithm, where pm is a system control parameter. The
probability of mutation is often small, or else it will raise
questions over the system stability. The populations can
be prevented from stagnating through mutation. If there
is no mutation, then the test data of new populations will
be confined to the initial values. The mutation process is
conducted respectively using the TSMutation algorithm
for the test suites Si’ and Sj’ crossovered using the
TSCrossover algorithm. The new individuals after muta-
ting are employed to replace those before mutating. The
TSMutation algorithm is shown in Figure 3.
Let pm = 0.015 and Si’ = <c1, c2, c12, c13, c14, c15, c16, c8,
c9>, which is one of the test suites after using the
TSCrossover algorithm. We randomly generate some
data: g(c1) = 0.246, g(c2) = 0.580, g(c12) = 0.025, g(c13) =
0.012, g(c14) = 0.632, g(c15) = 0.073, g(c16) = 0.431, g(c8)
= 0.193 and g(c9) = 0.059. For pm is not less than g(c13),
c13 is mutated, i.e., the positions of c13 and c12 are inter-
changed. The test suites Si’ after mutating becomes < c1,
c2, c13, c12, c14, c15, c16, c8, c9 >.
4.4 The Interacting Testing among User Sessions
Some new test cases, which contain the requested infor-
mation of different users, can also be generated using
crossover. The goal of the new test cases is to detect er-
rors caused by the use of possible conflicting data shared
by different users. The crossover way shows the idea of
information interchange among different user sessions.
Let S be a test suite, the steps of generating a test case
using crossover are:
1) for a test case c in S, randomly generate a number g
[0, 1], if a given crossover probability is not less than
g, then
a) copy the requests from r1 to ri in c to an interim test
case, where i is a random number, i[1, |c|]; |c| is the
length of URL trace in c;
b) select a test case d (different from c) in S randomly,
and then search rj in d reversely, which is the first one to
have the same URL as ri. If not found, then another test
case (also denoted as d) is selected, until it is found or up
to a given time;
c) if d is found, then all the requests after rj in d are
appended to the interim test case, which is the new test
case generated; otherwise go 2);
2) repeat 1) until each test case in the test suite is
processed.
We search the requests in test case d with an inverse
search method, for the length of greatest common prefix
of two test cases c and d in the same test suite is often
greater than 1; when the corresponding URL trace from r1
to ri is the prefix of this greatest common prefix, we can
not obtain different test cases if using the sequential search
method. For example, given two URL traces c =
u1u2u4u3u5u6u8u7 and d = u1u2u4u3u7u4u5u9, we randomly
copy u1u2u4 from c to an interim test case t firstly (i.e., t
equals to u1u2u4 temporarily). Here, I = 3 and ri = u4. Now,
we search rj in d reversely, which is the first one to have
the same URL as ri and then we have j = 6 and rj = u4. So,
all the requests after r6 in d (i.e., u5u9) are appended to t
such that t = u1u2u4u5u9. If searching the requests in d with
a sequential search method, we have j = 3 and rj = u4. This
time, if copying all the requests after r3 in d (i.e.,
u3u7u4u5u9) to t, we have t = u1u2u4u3u7u4u5u9, which
equals to d itself. And no new test case is generated. The
Algorithm: TSCrossover
input:
Parent test suites Si and Sj;
The crossover probability of pc;
output:
Si’ and Sj’ of next generation;
begin
Si’, SjSi, Sj;
Randomly generate a number g[0, 1];
if pc is not less than g
Randomly generate an integral number g’(0, min(|Si’|,
|Sj’|)]; //|Si’| and |Sj’| are the number
//of test cases in Si’ and Sj’ respectively
Interchange the test cases in Si’ and Sj’ from the position of
g’ one by one,
until no more test case needs to be interchanged in any of
them;
endif;
Output Si’ and Sj’;
end.
Figure 2. The crossover algorithm TSCrossover for test suites
Algorithm: TSMutation
input:
The test suites Si’ and Sj’ after using TSCrossover algorithm;
The mutation probability of pm;
output;
The test suites Si’ and Sj’ after mutating;
begin
for each S in {Si’, Sj’}
for each c in S
Randomly generate a number g[0, 1];
if pm is not less than g
Interchange the positions of c and the test case
before c;
endif;
endfor
Output S;
endfor
end.
Figure 3. The mutation algorithm TSMutation for test suites
User Session-Based Test Case Generation and Optimization Using Genetic Algorithm
546
reason is that the corresponding URL trace (i.e., u1u2u4)
from r1 to ri (i = 3) in c is the prefix of greatest common
prefix (i.e., u1u2u4u3) of c and d.
In addition, it is often not to mutate a test case, for the
new generated test case after exchanging the two adja-
cent requests ri and ri+1 makes no sense. The possible
reason is that the former request ri-1 may never reach the
request ri+1 (note that, this time ri becomes the next re-
quest of ri+1).
5. Experimental Analysis
Consider a typical miniature Web application developed
to demonstrate our approach: the SWLS (Simple Web
Login System) was shown in Figure 4.
Starting at the first page (indicated by a dashed arrow,
reasonably, a blank page can be used to request for the
first page of a Web application), i.e., a home page (p1),
the user can enter into the news page (p2) to list the news
by clicking on the view link, or enter into the login page
(p3) by pressing the login button. In page p3, the user
enters the userid and password, and presses the submit
button. Upon this pressing, the userid and password are
sent to the Web server for authentication. A logged page
(p4) will be loaded if both userid and password are co-
rrect. On the contrary, an error page (p7) containing an
error message is displayed if at least one of the submitted
values for userid and password is wrong. From the
logged page, it is possible to go to info page (p5) for se-
cure information viewing by just clicking on the browse
link. The user can click on the intra-page link continue to
view the different parts of the same page p5 if it is too
long. A logout page (p6) will be displayed when the user
presses the exit or logout button. Then, the user may
come back to the home page for login again. Note that
each time the login page is displayed; both the userid
and password fields should be initialized to be empty.
We inject only one fault in each page respectively, so
at least 7 faults exist totally (there may be other faults in
the Web application originally). A set of user sessions
are obtained after scanning user logs on the Web server
and purging their irrelevant information; then 89 mean-
ingful user sessions are finally created after scanning
them again. Only 17 user sessions are used to generate
test cases after the reduction of the meaningful user ses-
sions using the URL trace-based ReduceUSession algo-
rithm; and the reduction ratio reaches 80.9%. Imperson-
ally, more user sessions there are for a given Web appli-
cation, much higher is the reduction efficiency, for more
user sessions mean higher possibility that the URL trace
requested by a user session is the prefix of that requested
by another in the view of statistics. The set of 17 user
sessions (or test cases) is denoted by Γ1. After grouping
and prioritizing Γ1, an initial executing sequence <S1, S2,
S3> of test suites is obtained, which is denoted by Γ2,
where S1, S2 and S3 contain initial executing sequences
of 7, 6 and 4 test cases respectively. We achieve the final
executing sequence <S1’, S2’, S3’> of test suites using
genetic algorithm for further processing, which is de-
noted by Γ3, i.e., S1’, S2’ and S3’ is obtained through
crossovering and mutating S1, S2 and S3.
Now, we run the test suites (and test cases) in Γ1, Γ2
and Γ3 for the Web application respectively, and find all
the 7 faults injected as well as an additional fault (i.e.,
the user may browse p2 just by directly entering its ad-
dress in the URL address bar). The executing time of Γ2
and Γ3, however, is much shorter than that of Γ1, and it
takes less time of Γ3 than that of Γ2 too. This means that
the test suites (and test cases) grouped and prioritized run
faster than those, which are not grouped and prioritized;
and that the test suites (and test cases) processed further
by genetic algorithm are much faster. It is predictive that
the test approach proposed in this paper will yield more
evident positive effects for larger Web applications.
6. Concluding Remarks and Future Work
Generating test cases of high quality is the premise of
Web application testing. The approach in this paper cap-
tures a series of user events, i.e., the sequences of URLs
and name-value pairs in Web server (s). It then employs
the reduction, grouping, prioritization and genetic algo-
rithm to yield test cases and optimizes them. Compared
to the methods of capturing user events in clients, our
approach is very effective when a large volume of users
exist, and it is a Web application testing method of high
efficiency. The main contributions include:
1) an approach to generating and optimizing test cases
is proposed for Web application testing based on user
sessions using genetic algorithm.
2) several important definitions such as URL trace,
prefix, common prefix, greatest common prefix, are
given. These definitions are convenient for reducing and
grouping user sessions.
3) a URL trace-based reduction algorithm is designed.
The user sessions acquired are lessened greatly using the
view
browse
lo
g
in continue
logout
return submit
submit
exit
home
login
page
p
3
logged
page
p
4
logout
page
p
6
error
page
p
7
info
page
p
5
home
page
p
1
news
page
p
2
For the convenience of test run, we have preprocessed the user sessions
appropriately. Figure 4. A simple web login system
Copyright © 2010 SciRes JSEA
User Session-Based Test Case Generation and Optimization Using Genetic Algorithm
Copyright © 2010 SciRes JSEA
547
algorithm. However, it covers all the URLs requested by
the original set of user sessions and keeps the sequence
of URL requests.
4) an approach to grouping and prioritizing user sessions
is presented, as can improve the efficiency of test run.
5) an approach to generating new test cases is pro-
posed using crossover. The new test cases generated in-
clude information requested by different users and can
detect errors caused by the use of possible conflicting
data shared by different users. That’s to say, the cross-
over indicates the idea of information interchange among
different user sessions, which helps to test the interacting
of user sessions.
6) a strategy of test reuse and concurrence is provided,
as can decrease the time of test run greatly as well as le-
ssen test cost.
Web application testing is much complex systems engi-
neering. It is not easy to acquire an effective and practical
test scheme. Our approach only evaluates a test case ac-
cording to its test coverage ratio. However, many factors
need to be considered, such as the running cost of each test
case itself (for example, CPU time), including the actual
running time, loading time, time to save test state, and the
influence of different test criteria on a test case. All these
questions need to be answered in future research.
REFERENCES
[1] E. Hieatt and R. Mee, “Going Faster: Testing the Web Ap-
plication,” IEEE Software, Vol. 19, No. 2, 2002, pp. 60-65.
[2] D. C. Kung, C. H. Liu and P. Hsia, “An Object-Oriented
Web Test Model for Testing Web Applications,”
Proceedings of the 1st Asia-Pacific Conference on Web
Applications, New York, 2000, pp. 111-120.
[3] J. Offutt, Y. Wu. and X. Du, et al., “Bypass Testing of
Web Applications,” Proceedings of the 15th IEEE Inter-
national Symposium on Software Reliability Engineering,
Bretagne, November 2004.
[4] C. H. Liu, D. C. Kung and P. Hsia, “Object-Based Data
Flow Testing of Web Applications,” Proceedings of the
1st Asia-Pacific Conference on Quality Software, Hong
Kong, 2000, pp. 7-16.
[5] C. Elbaum, G. Rothermel and S. Karre, et al., “Leveraging
User Session Data to Support Web Application Testing,”
IEEE Transaction on Software Engineering, California,
May 2005.
[6] R. Godin, R. Missaoui and H. Alaoui, “Incremental
Concept Formation Algorithms Based on Galois (Concept)
Lattices,” Computational Intelligence, Vol. 11, No. 2,
1995, pp. 246-267.
[7] S. Khor and P. Grogono, “Using a Genetic Algorithm and
Formal Concept Analysis to Generate Branch Coverage
Test Data Automatically,” Proceedings of the Inter-
national Conference on Automated Software Engin-
eering, Austria, 2004, pp. 346-349.
[8] H. H. Sthamer, “The Automatic Generation of Software
Test Data Using Genetic Algorithms,” PhD. Dissertation,
University of Glamorgan, Wales, 1996.
[9] D. Berndt, J. Fisher and L. Johnson, et al., “Breeding
Software Test Cases with Genetic Algorithms,” Pro-
ceedings of the 36th Hawaii International Conference on
System Sciences, 2003, pp. 17-24.
[10] R. P. Pargas and M. J. Harrold, “Test-Data Generation
Using Genetic Algorithms,” Journal of Software Testing,
Verification and Reliability, Vol. 9, No. 4, 1999, pp.
263-282.
[11] X. X. Jia, J. Wu, M. Z. Jin, et al., “Some Experiment
Analysis of Using Generic Algorithm in Automatic Test
Data Generation,” in Chinese, Journal of Chinese Com-
puter Systems, Vol. 28, No. 3, 2007, pp. 520-525.
[12] D. J. Berndt and A. Watkins, “Investigating the Perfor-
mance of Genetic Algorithm-Based Software Test Case
generation,” Proceedings of the International Symposium
on High Assurance Systems Engineering, Florida, 2004,
pp. 261-262.
[13] J. H. Holland, “Adaptation in Natural and Artificial
System,” University of Michigan Press, Michigan, 1975.