Journal of Software Engineering and Applications, 2013, 6, 47-52 Published Online October 2013 ( 47
A Survey of Software Test Estimation Techniques
Kamala Ramasubramani Jayakumar1, Alain Abran2
1Amitysoft Technologies, Chennai, India; 2École de Technologie Supérieure—University of Quebec, Montreal, Canada.
Received August 30th, 2013; revised September 27th, 2013; accepted October 5th, 2013
Copyright © 2013 Kamala Ramasubramani Jayakumar, Alain Abran. This is an open access article distributed under the Creative Commons
Attribution License, which permits unrestricted use, d istribution, and reproduction in any medium, provided the original work is p rop erly cited.
Software testing has become a primary business for a number of IT services companies, and estimation, which remains
a challenge in software development, is even more challenging in software testing. This paper presents an overview of
software test estimation techniques surveyed, as well as some of the challenges that need to be overcome if the founda-
tions of these software testing estimation techniques are to be improved.
Keywords: Software Testing; Testing Estimation; Software Estimation; Outsourcing
1. Introduction
Software testing has become a complex endeavor, owing
to the multiple levels of testing th at are required, such as
component testing, integration testing, and system testing,
as well as the types of testing that need to be carried out,
such as functional testing, performance testing, and secu-
rity testing [1]. It is critical that relevant estimation tech-
niques be used in software testing, depending on the
scope what the testing called for.
Furthermore, software testing has become an industry
of its own over the years, with the emergence of inde-
pendent testing serv ices firms and IT services companies
establishing testing services as a business unit. Test esti-
mation consists of the estimation of effort and cost for a
particular level of testing, using various methods, tools,
and techniques. The incorrect estimation of testing effort
often leads to an inadequate amount of testing, which, in
turn, can lead to failures of software systems once they
are deployed in organizations. Estimation is the most
critical activity in software testing, and an unavoidable
one, but it is often performed in haste, with those respon-
sible for it merely hoping for the best.
Test estimation techniques have often been derived
from generic software development estimation tech-
niques, in which testing figures are as one of the phases
of the software development life cycle, as in the CO-
COMO 81 and COCOMO II models [2].
Test estimation in an outsourcing context differs sig-
nificantly from test estimation embedded in software de-
velopment, owing to several process factors related to
development organization and testing organization, in
addition to factors related to the product to be tested.
Distinct pricing approaches, such as time and material,
fixed-bid, output-based, and outcome-based pricing, are
followed by the industry based on customer needs. An
adequate estimation technique is essential for all pricing
models except for time and material pricing. The focus of
the survey reported here is the estimation of effort for
testing, and not the estimation of cost or schedule.
2. Evaluation Criteria and Groups of
Estimation Techniques
For this study, the following criteria have been selected
to analyze test estimation techniques:
1. Customer view of requirements: This criterion makes
it possible to determine whether the estimation tech-
nique looks at the software requirements from a cus-
tomer viewpoint or from the technical/implementa-
tion viewpoint: estimation based on the customer
viewpoint assures the customer that he is getting a
fair price (i.e. estimate) in a competitive market for
what he is asking for in terms of quantity and quality,
and that an increase or decrease in price (i.e. the esti-
mated effort) is directly related to increases or de-
creases in the number of functions and/or levels of
quality expected, and not on how efficient, or other-
wise, a supplier is at delivering software using dif-
ferent sets of tools and with different groups of peo-
2. Functional size as a prerequisite to estimation: Most
estimation methods use some form of size, which is
either implicit or explicit in effort estimation: when
Copyright © 2013 SciRes. JSEA
A Survey of Software Test Estimation Technique s
size is not explicit, benchmarking and performance
studies across projects and organizations are not pos-
sible. Functional size can be measured using either
international standards or locally defined sizing me-
3. Mathematical validity: Surprisingly, quite a few es-
timation techniques have evolved over the years,
mostly based only on “feel good” factor and ignor-
ing the validity of their mathematical foundations.
This criterion looks at the metrological foundation of
the proposed estimation techniques. A valid mathe-
matical foundation provides a sound basis for further
4. Verifiability: The estimate produced must be verifi-
able, in order to inspire confidence. Verifiability makes
the estimate more dependable.
5. Benchmarking: It is essential in an outsourcing con-
text that estimates be comparable across organizations,
as this can help in benchmarking and performance
improvement. The genesis of the estimation techniques
is looked at to determine whether or not benchmark-
ing is feasible.
Estimation techniques are based on a number of dif-
ferent philosophies. For the purposes of this survey,
these techniques have been classified in to the following
1) techniques based on judgment and rules of thumb,
2) techniques based on analogy and work breakdown,
3) techniques based on factors and weights,
4) techniques based on size,
5) fuzzy and other models.
There are also, of course, several variations of tech-
niques in each group. In this paper, we present a few
representative techniques from each group to illustrate
their basis for estimation, as well as their strengths and
3. Survey Findings
Using the five criteria described in the previous section,
Table 1 presents a high level analysis of each of the
above groups of techniques for th e estimation of softw are
testing. Comments specific to each group of testing tech-
niques are presented su bsequently.
3.1. Judgment and Rule of Thumb Techniques
Delphi [3]: a classic estimation technique in which ex-
perts are involved in determining individual estimates for
a particular set of requirements based on their own earlier
experience. Multiple iterations take place during which
the experts learn the reasoning from other experts, and
rework their estimates in subsequent iterations. The final
estimate is picked from the narrowed range of values
estimated by experts in the last iteration.
Wide-band Delphi [4]: a technique enabling interac-
tion between experts to arrive at a decision point. A
quick estimate is provided by experts knowledgeable in
the domain and in testing, but the resulting estimate will
be very approximate, and should be applied with caution.
Estimates are not verifiable in this case, and benchmark-
ing is possible. These techniques mostly take the imple-
mentation view of requirements, and functional size is
often ignored.
Rule of Thumb: estimates which are based on ratios
and rules pre-established by individuals or by experi-
enced estimators, but without a well documented and
independently verifiable basis.
Typically, functional size is not considered in this
group of testing estimation techniques. They are not bas-
ed on the analysis of well documented historical data,
and benchmarking is not feasible.
3.2. Analogy and Work Breakdown Techniques
Analogy-based [3]: techniques involving comparison of
the components of the software under test with standard
components, for which test effort is known based on his-
torical data. The total estimate of all the components of
the software to be tested is further adjusted based on
project-specific factors and the management effort re-
quired, such as planning and review.
Table 1. Summary analysis of strengths and weaknesses of estimation techniques for testing.
estimation techniques
Customer view of
requirements Functional size as a
prerequisite Mathematical validity Verifiable Bench marking
1) Judgment & rule of thumb NO NO Not applicable NO NO
2) Analogy & work
breakdown NO NO YES YES
Partially, and only when
standards are used
3) Factor & weight NO NO NO-units are most often
ignored YES NO
4) Size YES YES Varies with sizing
technique selected YES YES
5) Fuzzy & other models Partially Most often, No YES, in general, but at
times units are ignored Partially Partially, and only when
standards are used
Copyright © 2013 SciRes. JSEA
A Survey of Software Test Estimation Technique s 49
Proxy-based (PROBE) [5]: a technique for estimating
development effort, which can be adapted for estimating
testing efforts for smaller projects. When the components
of the product to be tested are significantly different from
the standard components, a new baseline has to be estab-
lished. Validation based on historical data is weak and
fuzzy in this case. Functional size is not considered in
such techniques, and benchmarking is not feasible.
Task-based [3]: a typical work breakdown-based es-
timation method where all testing tasks are listed and
three-point estimates for each task are calculated with a
combination of the Delphi Oracle and Three Point tech-
niques [6]. One of the options offered by this method for
arriving at an expected estimate for each task is a Beta
distribution formula. The individual estimates are then
cumulated to compute the total effort for all the tasks.
Variations of these techniques, such as Bottom-Up and
Top-Down, are based on how the tasks are identified.
These techniques can work in a local context within an
organization, where similar types of projects are exe-
cuted. Benchmarking is not possible, since there is no
agreed definition of what constitutes a task or work break-
Test Case Enumeration-based [3]: an estimation me-
thod which starts with the identification of all the test
cases to be executed. An estimate of the expected effort
for testing each test case is calculated, using a Beta dis-
tribution formula, for instance. A major drawback of this
technique is that significant effort has to be expended to
prepare test cases before estimating for testing. This
technique can work in a context where there is a clear
understanding of what constitutes a test case. Productiv-
ity measurements of testing activities and benchmarking
are not possible.
3.3. Factor and Weight-Based Estimation
Test Point Analysis [7]: a technique in which dynamic
and static test points are calculated to arrive at a test
point total. Dynamic test points are calculated based on
function points, fun ctionality-dep enden t factors, and qu a-
lity characteristics. Function-dependent factors, such as
user importance, usage intensity, interfacing require-
ments, complexity, and uniformity are given a rating ba-
sed on predefined ranges of values. Dynamic quality
characteristics, such as suitability, security, usab ility, and
efficiency, are rated between 0 and 6 to calculate dy-
namic test points. Static points are assigned based on the
applicability of each of the quality characteristic as per
ISO 9126. Each applicable quality characteristic is as-
signed a value of 16 and summed to obtain the total num-
ber of static test points. The test point total is converted
to effort based on ratings for a set of productivity and
environmental factors.
While this technique appears to take into account
several relevant factors, these factors do not have the
same measurement units, which makes it difficult to use
them in mathematical operations. Function ality- and qua-
lity-dependent factors are added together to obtain Dy-
namic Test Point, and subsequently to arrive at a Test
Point size, which again is the sum of three different
quantities with potentially d ifferent units of measu rement.
The basic mathematical principles of addition and multi-
plication are forgotten in the Test Point Analysis method.
As such, it can be referred to as a complex “feel good”
Use Case Test Points: a technique suggested [7] as an
alternative to Test Points and derived from Use Case-
based estimation for software development. Unadjusted
Use Case Test Points are calculated as the sum of the
actors multiplied by each actor’s weight from an actors’
weight table and the total number of use cases multip lied
by a weight factor, which depends on the number of
transactions or scenarios for each use case. Weights as-
signed to each of the technical and environmental factors
are used to convert un adju sted use case po ints to adj usted
use case points. A conversion factor accounting for tech-
nology/process language is used to convert adjusted use
case points into test effort. This is another example of a
“feel good” estimation method: ordinal scale values are
inappropriately transformed into interval scale values,
which are then multiplied with th e in tention of arriving at
a ratio scale value, which exposes th e weak mathematical
foundation. Although this technique takes a user’s view-
point of requirements, it is not amenable to benchmark-
Test Execution Points [8]: a technique which esti-
mates test execution effort based on system test size.
Each step of the test specifications is analyzed based on
characteristics exercised by the test step, such as screen
navigation, file manipulation, and network usage. Each
characteristic that impacts test size and test execution is
rated on an ordinal scale—low, average, and high—and
execution points are assigned. Authors have stopped
short of suggesting approaches that can be adopted for
converting test size to execution effort. The task of de-
veloping test specifications prior to estimation requires a
great deal of effort in itself. Since test effort has to be
estimated long before the product is built, it would not be
possible to use this technique early in the life cycle.
Test team efficiency is factored into another variation
of the estimation model for test execution effort. The
Cognitive Information Complexity Measurement Mo-
del [9] uses the count of operators and identifiers in the
source code coupled with McCabe’s Cyclomatic Com-
plexity measure. The measures used in this model lack
the basic metrological foundations for quantifying com-
plexity [10], and the validity of such measurements for
estimating test execution effort has not been demons-
Copyright © 2013 SciRes. JSEA
A Survey of Software Test Estimation Technique s
3.4. Software Size-Based Estimation
The Size-based: an estimation approach in which size is
used and a regression model is built based on historical
data collected adopting standard definitions. Here, size is
one of the key input parameters in estimation. Some of
the techniques use the conversion factors to convert size
into effort. Regression models built using size directly
enable the estimation of effort based on historical data.
Test Size -based [3]: an estimation technique proposed
for projects involving independent testing. The size of
the functional requirements in Function Points using
IFPUG’s Function Point Analysis (FPA) [11] (ISO
20926) is converted to unadjusted test points through a
conversion factor. Based on an assessment of the appli-
cation, the programming language, and the scope of the
testing, weights from a weight table are assigned to test
points. Unadjusted test points are modified using a com-
posite weighting factor to arrive at a test point size. Next,
test effort in person hours is computed by multiplying
Test Point Size by a productivity factor. FPA, the first
generation of functional size measurement methods, suf-
fers from severe mathematical flaws with respect to its
treatment of Base Functional Components, their weights
in relation to a complexity assessment, and the Value
Adjustment Factor used to convert Unadjusted Function
Points to Adjusted Function Points. The mathematical
limitations of FPA have been discussed in [10]. In spite
of these drawbacks, this technique has been used by the
industry and adopted for estimating test effort. Perform-
ance benchmarking across organizations is possible with
this technique for business application projects.
Other estimation models, such as COCOMO and
SLIM, and their implementation in the form of estima-
tion tools, along with other tools such as SEER and
Knowledge Plan [12], employ size based estimation of
software development effort from which testing effort is
derived, often in proportion to the total estimated effort.
Usually, the size of software, measured in SLOC, is an
input parameter for these models. There are provisions to
backfire function point size and convert it to SLOC to
use as input, although this action actually increases un-
certainty in the estimate. These models use a set of pro-
ject data and several predetermined factors and weights
to make up an estimation model. The intention behind
using these models and tools is interesting, in that they
capture a host of parameters that are expected to influ-
ence the estimate. Abran [10] has observed that many
parameters of COCOMO models are described by lin-
guistic values, and their influence is determined by ex-
pert opinion rather than on the basis of information from
descriptive engineering repositories. It should be noted
that some of these models are built based on a limited set
of project data and others with a large dataset, but they
primarily use a black box approach. Some of the tools
use predetermined equations, rather than the data directly.
These models and automated tools can provide a false
sense of security, when the raw data behind them cannot
be accessed for independent validation and to gain ad-
ditional insights.
ISBSG equations [13]: techniques derived from esti-
mation models based on a repository of large project
datasets. These equations are based on a few significant
parameters that influence effort estimates, as analyzed
through hundreds of projects data from the open ISBSG
database. Such estimation equations are built as a white
box approach, with the ab ility to understand the underly-
ing data and learn from them. Using a similar approach,
practical estimation models using functional size can be
developed by the organizations themselves. Another stu-
dy of projects in the ISBSG database has come up with
interesting estimation models for software testing using
functional size [14].
Capers Jones [15] mentions that FPA and COSMIC
Function Points [16] provide interesting insights into the
quantum of test cases (also referred to as “test volume”),
and proposes various rules of thumb based on function
points in order to calculate the number of test cases re-
quired and to estimate the potential number of defects.
Estimate of test cases and defects lead to estimate of ef-
forts for overall testing. However, the authors caution
against using rule of thumb methods, indicating that they
are not accurate and should not be used for serious busi-
ness purposes.
Non functional requirements, like functional require-
ments, are quite critical in software testing. An approach
for estimating the test volume and effort is proposed in
[17], where the initial estimate based on functional re-
quirements is adjusted subsequently by taking into con-
sideration non functional requirements. This model uses
COSMIC Function Points, the first of the 2nd genera-
tion functional size measurement methods adopted by the
ISO [18], as it overcomes the limitations of the 1st gener-
ation of Function Point sizing methods. Estimates for non
functional testing are arrived at based on graphical as-
sessment of non functional requirements of project data.
Estimation models developed by performing regression
analysis between COSMIC Function Points and develop-
ment effort have been successfully used in the industry
[19]. Estimation models built using historical project data
with a COSMIC Function Point size have a strong ma-
thematical foundation, are verifiable, and take into ac-
count the customer view of requirements. In addition,
they are amenable to benchmarking and performance stu-
3.5. Fuzzy Inference & Other Models
Fuzzy models have been proposed to account for in-
comeplete and/or uncertain input information available
Copyright © 2013 SciRes. JSEA
A Survey of Software Test Estimation Technique s 51
for estimation. One of the fuzzy logic approaches [20]
uses COCOMO [2] as a foundation for fuzzy inference
using mode and size as inputs. Even though this ap-
proach is proposed for estimating software development
effort, it could be adopted for estimating testing effort as
well. Another approach has been proposed by Ranjan et
al. [21]. This technique again uses COCOMO as the ba-
sis on which KLOC is used as an input, and development
effort is calculated using effort adjustment factors based
on cost drivers.
Francisco Valdès [22] designed a fuzzy logic estima-
tion process in his Ph. D thesis. His approach is the pur-
est form of the application of fuzzy logic to the software
estimation context, since it does not use any of the other
estimation techniques. The model allows experts to de-
cide on the most significant input variables for the kinds
of projects in which th e model will be applied . Th e mem-
bership function is defined for the input variables, and
the values are assigned based on expert opinion. This
creates fuzzy values that are used in inference rule exe-
cution. Unlike other expert judgement-based methods,
where the knowledge resides with experts, here the knowl-
edge is captured in the form of inference rules and stays
within the organization. The estimates produced by these
models can also be verified. However, setting up a fuzzy
rule base requires time and the availability of experts in
the domain/tech nology for wh ich esti mation is being per-
Software Development Estimation techniques using
Artificial Neural Networks (ANN) [23,24] and Case-
Based Reasoning (CBR) [25] are reported by researchers.
There is an implementation of ANN for software testing
[26], which inherits the drawbacks from Use Case based
estimation. ANN and CBR have to be further investi-
gated for arriving at practical and mathematically correct
estimation of software testing.
4. Summary and Key Research Needs
Judgment & Rule of Thumb-based techniques are quick
to produce very approximate estimates, but the estimates
are not verifiable and of unknown ranges of uncertainty.
Analogy & Work Break-Down techniques may be
effective when they are fine-tuned for technologies and
processes adopted for testing. They take an implementa-
tion view of requirements, and cannot be used for bench-
markin g purposes.
Factor & Weight-based techniques perform several
illegal mathematical operations , and lose scientific credi-
bility in the process. Th ey can serve as “feel good” tech-
Functional Size-based techniques are more amenable
to performance studies and benchmarking. They tend to
produce more accurate estimates [13]. COSMIC Func-
tional Size-based estimation overcomes the limitations of
the first generation of th e Function Point method.
Estimation Tools based on black-box datasets have
to be used very judiciously, with an understanding of
where the models come from and the kind of data used to
build them.
Innovative approaches, such as Fuzzy Inference, Ar-
tificial Neural Networks, and Case-based Reasoning,
are yet to be adopted in the industry for estimating test-
ing effort.
The estimation techniques surveyed here are currently
limited in their scope for application in outsourced soft-
ware testing projects, and do not attempt to tackle esti-
mates for all types of testing. The classification of test
effort estimation techniques presented here is also ap-
plicable for software development effort estimation tech-
niques with the same strengths and weakness.
Prior to pursuing further research to improve any one
specific estimation technique, it would be of interest to
develop a more refined view of the evaluation criteria
identified in this paper, and to explore, in either software
engineering or other disciplines, what candidate solutions
or approaches could bring about the most benefit in terms
of correcting weaknesses that have been identified or
adding new strengths.
[1] ISO, “ISO/IEC DIS 29119—1, 2, 3, & 4: Software and
Systems Engineering—Software Testing—Part 1, 2, 3, &
4,” International Organization for Standardization (ISO),
Geneva, 2012.
[2] B. W. Boehm, “Software Cost Estimation with CO-
COMO II,” Prentice Hall, Upper Saddle River, 2000.
[3] M. Chemuturi, “Test Effort Estimation,” 2012.
[4] M. Nasir, “A Survey of Software Estimation Techniques
and Project Planning Practices,” 7th ACIS International
Conference on Software Engineering (SNPD’06), Las Ve-
gas, 19-20 June 2006, pp. 305-310.
[5] W. Humphrey, “A Discipline for Software Engineering,”
Addison Wesley, Boston, 1995.
[6] R. Black, “Test Estimation—Tools & Techniques for Rea-
listic Predictions of Your Test Effort,” 2012.
[7] M. Kerstner, “Software Test Effort Estimation Methods,”
Graz University of Technology, Graz, 2012.
[8] E. Aranha and P. Borba, “Sizing System Tests for Esti-
mating Test Execution Effort,” Federal University of
Pernambuco, Brazil & Motorola Industrial Ltd, CO-
COMO Forum 2007, Brazil, 2012.
[9] D. G. E. Silva, B. T. De Abreu and M. Jino, “A Simple
Approach for Estimation of Execution Effort of Func-
tional Test Cases,” International Conference on Software
Copyright © 2013 SciRes. JSEA
A Survey of Software Test Estimation Technique s
Copyright © 2013 SciRes. JSEA
Testing Verification and Validation, Denver, April 2009,
pp. 289-298.
[10] A. Abran, “Software Metrics and Software Metrology,”
Wiley & IEEE Computer Society Press, Hoboken, 2010.
[11] IFPUG, “Function Point Counting Practices Manual, Ver-
sion 4.2.1,” International Fun ction Point s Users Group, New
Je r s e y , 2005.
[12] S. Basha and P. Dhavachelvan, “Analysis of Empirical
Software Effort Estimation Models,” International Jour-
nal of Computer Science and Information Security, Vol. 7,
No. 2, 2010, pp. 68-77.
[13] P. R. Hill, “Practical Project Estimation,” 2nd Edition, In-
ternational Software Benchmarking Standards Group (ISBSG),
V i c t o r ia , 2005.
[14] K. R. Jayakumar and A. Abran, “Analysis of ISBSG Data
for Understanding Software Testing Efforts,” IT Confi-
dence Conference, Rio, 3 October 2013.
[15] C. Jones, “Estimating Software Costs,” 2nd Edition, Tata
McGraw-Hill, New York, 2007.
[16] A. Abran, J.-M. Desharnais, S. Oligny, D. St-Pierre and C.
Symons, “The COSMIC Functional Size Measurement
Method Version 3.0.1, Measurement Manual, The COS-
MIC Implementation Guide for ISO/IEC 19761:2003,”
Common Software Measurements International Consor-
tium (COSMIC), Montreal, 2009.
[17] A. Abran, J. Garbajosa and L. Cheikhi, “Estimating the
Test Volume and Effort for Testing and Verification &
Validation,” IWSM-Mensura Conference, 5-9 November 2007,
UIB-Universitat de les Illes Baleares, Spain, 2007, pp. 216-
ation__Validatio n
[18] ISO, “ISO/IEC 19761: Software Engineering—COSMIC
—A Functional Size Measurement Method,” International
Organization for Standardization (ISO), Geneva, 2011.
[19] R. Dumke and A. Abran, “Case Studies of COSMIC Us-
age and Benefits in Industry, COSMIC Function Points—
Theory and Advanced Practices,” CRC Press, Boca Raton,
[20] P. V. G. D. Prasad Reddy, K. R. Sudha and P. Rama Sree,
“Application of the Fuzzy Logic Approach to Software
Estimation,” International Journal of Advanced Com-
puter Science and Applications, Vol. 2, No. 5, 2011, pp.
[21] P. R. Srivastava, “Estimation of Software Testing Effort:
An Intelligent Approach,” 20th International Symposium
on Software Reliability Engineering (ISSRE), Mysore,
ion_of_Softw are_Testing_ Effort_An_intellig ent_A pproach
[22] F. Valdès, “Design of the Fuzzy Logic Estimation Proc-
ess,” Ph.D. Thesis, University of Quebec, Quebec, 2011.
[23] A. S. Grewal, V. Gupta and R. Kumar, “Comparative Ana-
lysis of Neural Network Techniques for Estimation,” In-
ternational Journal of Computer Applications, Vol. 67,
No. 11, 2013, pp. 31-34.
[24] J. Kaur, S. Sing, K. S. Kahlon and P. Bassi, “Neural Net-
work—A Novel Technique for Software Effort Estima-
tion,” International Journal of Computer Theory and En-
gineering, Vol. 2, No. 1, 2010, pp. 17-19.
[25] Y. A. Hassan and T. Abu, “Predicting the Cost Estimation
of Software Projects Using Case-Based Reasoning,” The
6th International Conference on Information Technology
(ICIT 2013), Cape Town, 8-10 May 2013.
[26] C. Abhishek, V. P. Kumar, H. Vitta and P. R. Srivastava,
“Test Effort Estimation Using Neural Network,” Journal
of Software Engineering & Applications, Vol. 3, No. 4,
2010, pp. 331-340.