Security Engineering of SOA Applications Via
Reliability Patterns
Luigi Coppolino, Luigi Romano, Valerio Vianello
Centro Direzionale di Napoli, Dipartimento per le Tecnologie, Università degli Studi di Napoli “Parthenope”, Napoli, Italy
Email:{luigi.coppolino, lrom, valerio.vianello}
Received November 30th, 2010; revised December 20th, 2010; accepted December 30th, 2010.
Providing reliable compositions of Web Services is a challenging issue, since the workflow architect often has only
limited control over the reliability of the composed services. The architect can instead achieve reliability by properly
planning the workflow architecture. To this end, the architect must be able to evaluate and compare the reliability of multiple
architectural solutions. In this paper we present a tool that supports reliability analysis of planned
workflows, as well as the comparison of the reliability of alternative solutions in a what-if analysis. The tool is implemented as
a plug-in for the widely adopted Active BPEL Designer and exploits the concept of reliability pattern to evaluate the
reliability formula of the workflow. The effectiveness of the approach and the operation of the tool are demonstrated
with respect to a case study of a business security infrastructure realized by orchestrating simple security services.
Keywords: Reliability of Security Services, Reliability Patterns, Workflow Systems, SOA Applications,
Web Services Technology
1. Rationale and Contribution
Web Services can be combined in complex composite
services achieving new functionalities [1]. Composition
may aggregate services developed and exposed within a
certain organization. More interestingly, the composed
Web Services can be the result of an orchestration of
services exposed by different organizations. Benefits of
composition have been long discussed in the past few
years highlighting and demonstrating the advantages
coming from the achievement of new functionality by
composing autonomous services [2]. Nevertheless, it is
widely accepted that web applications are prone to failure, as
confirmed by a U.S. Government study [3], and for Web
Services the situation is even worse, since a number of
application layers are built on top of classical web servers.
As pointed out in [4], failures are inevitable in
modern Internet-connected environments. When dealing
with composite services, if the failure of any
individual Web Service causes the failure of the
composite service, then even if all the other Web Services are
reliable, one unreliable Web Service can decrease the
overall reliability to a very low level. This evidence
about the reliability of composite Web Services raises
doubts about the actual adoption of this distributed
model of developing complex services [5]. Since
the reliability engineer designing the workflow has no
chance to modify the simple services at all, especially
while dealing with services composed across organiza-
tion boundaries, the only way to ensure the reliability of
the composite service is increasing the reliability of the
workflow by appropriately planning its architecture, i.e.
properly adopting diversity and redundancy. This req-
uires the development of appropriate methodologies for a
quick and early evaluation of the composite service re-
liability and the development of tools which can be easily
adopted to compare multiple architectural choices for the
orchestration of a service. In this paper, we propose a
formal approach that allows a workflow architect to per-
form reliability analysis of a SOA-based service. The ap-
proach exploits the concept of reliability patterns to de-
rive an aggregate reliability function and it is suited for a
wide class of workflow processes. The approach is im-
plemented in a tool (more precisely, as a plug-in for Ac-
tive BPEL Designer). The tool allows the system archi-
tect to evaluate the impact—in terms of reliability—of
possible workflow alternatives, as early as in the first
steps of the design. The effectiveness of the approach
and the operation of the tool are demonstrated with re-
spect to a case study of a business security infrastructure
realized by orchestrating simple security services. The
rest of the paper is organized as follows. Section 2 pro-
vides an overview of the related work. Section 3 presents
the concept of workflow patterns, while in Section 4 re-
liability patterns are derived and their reliability formulas
evaluated. Section 5 discusses the assumptions and limi-
tations of the model. Section 6 presents a typical case
study of a business security infrastructure. Section 7
describes the implementation of the plug-in for ActiveBPEL
and demonstrates its operation with respect to
the case study at hand. Finally, Section 8 concludes the
paper with final remarks.
2. Related Work
Web Services based systems are typically composed by
orchestrating a number of simpler services (generally
Web Services themselves) in a common workflow. In
such a case it is widely accepted that the reliability func-
tion of the workflow must be derived based on the relia-
bility functions of individual tasks in the workflow. A
mature work in this field is [6], where the authors pro-
pose a set of workflow patterns with related reliability
expressions. A workflow engine named METEOR, which
allows combining such patterns to build a more complex
workflow, is also presented. Based on the reliability expressions
of the elementary patterns, METEOR derives
the aggregate reliability expression of the composed
workflow. The main limitation of this approach
lies in the possibility of getting the reliability expression
only for those workflows that can be obtained by com-
posing the patterns described in [6]. To overcome such
limitation, we start from results presented in [7] where
the authors present, by extending results reported in [8], a
set of 43 workflow patterns whose combinations can provide
virtually every workflow. Starting from this set of patterns,
we identify the combinations that are meaningful
from a reliability point of view and derive a reliability
expression for them. We refer to such combinations
as “reliability patterns”. Since virtually any workflow
can be obtained by combining the workflow patterns
in [7], “reliability patterns” can also be applied to
obtain the reliability formula for any workflow. We
demonstrate our approach on the patterns defined in [8];
similarly, it is possible to derive new reliability patterns
from the remaining patterns defined in [7]. By doing so
we verify that, not only all the patterns defined in [6] are
also obtained as reliability patterns, but new patterns, not
considered in [6], such as the “Multi-Merge Parallel”, are
also identified and considered for reliability evaluation.
Since our approach can be applied to retrieve the reliabil-
ity expression of virtually any workflow, it can be ap-
plied to already existing workflow designing tools, in-
stead of needing the design of new ones, as it was for
METEOR. This is verified by applying the concept of reliability
patterns to a popular commercial workflow design
tool, namely Active BPEL Designer [9], thereby
enabling an early evaluation of reliability formulas for the
designed workflows. Even more interestingly, the proposed
plug-in can be easily adapted to any WS-BPEL [10] compliant
designer. Finally, it is worth noting that the estimation
of such formulas is not intended to measure exactly
the reliability of a composed service, but to allow
a what-if analysis of alternative architectural solutions at
design time. This means that the simplicity of reliability
expressions should be preferred to their precision.
3. Workflow Patterns
E. Gamma et al. defined a pattern as “The abstraction
from a concrete form which keeps recurring in specific
non arbitrary contexts” [11]. When dealing with Web
Services composition it is worth considering workflow
patterns that are defined in [12] as “An abstract descrip-
tion of a recurrent class of interactions based on activa-
tion dependencies”. Workflow patterns can be considered
from multiple perspectives, namely a control-flow pers-
pective, a data perspective, a resource perspective, an
operational perspective. In particular, the control-flow perspective
refers to the execution order of a set of activities.
With respect to the control-flow perspective of a workflow,
W. M. P. van der Aalst et al. identify a set of twenty basic
control-flow patterns (referred to as patterns in the following),
which can be combined to generate virtually any
control flow. While analyzing the patterns provided in
[8], some observations are due:
1) Since we are only interested in the reliability of the
workflow from an architectural point of view, not all the
patterns are relevant for our purposes. As an example, the
Cancel Case pattern relates to the workflow management
system and is therefore not relevant for the composition
process.
2) From a reliability point of view some patterns are
equivalent. As an example, the Multiple Instances pattern
provides the same reliability as a Parallel Split or a
Multiple Choice, depending on whether or not all the
activated instances must complete.
3) Combinations of patterns, rather than individual
patterns, are often needed in order to address reliability.
This is the case of the Parallel Split, for which deriving
the reliability requires knowing whether the following task
is a Synchronization or a Multi-Merge.
4) Finally, not all pattern combinations yield valid
workflows. As one example, the sequence of an XOR-Split
and an AND-Join is not allowed, since it refers to a scenario
where only one in a set of tasks is activated but the
workflow waits for all of them to end before it can terminate.
In the next sections, we first describe an algorithm
that derives the aggregate reliability function through a
workflow graph reduction, then we discuss the derived
reliability patterns and their reliability formulas, and finally
we present an example showing how the algorithm works.
4. Workflow Reliability Patterns
Starting from the results presented in [8] and considering
the definitions provided in Section 3, we define the
concept of reliability pattern as: “An elementary combination
of patterns which itself behaves as a pattern from
a reliability perspective”. This definition implies
that 1) for a reliability pattern, a reliability formula can
be defined starting from the reliability formula of each
activity in the pattern, and 2) for any proper subset of patterns
in the reliability pattern, a related reliability formula cannot be
defined. As an example, the sequence of an AND-SPLIT
pattern and a MULTI-MERGE JOIN pattern matches an
m-out-of-n reliability structure, so it can be uniquely
characterized from a reliability point of view. On the other
hand, it is not possible to characterize the reliability of
either the AND-SPLIT pattern or the MULTI-MERGE
JOIN pattern if they are considered separately. In the
following sections, we first define a reduction algorithm
that exploits the concept of reliability patterns to characterize
the reliability of a workflow; then reliability patterns
are identified and their reliability formulas are obtained.
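To make the m-out-of-n structure mentioned above concrete, the following minimal Python sketch computes the probability that at least m of n independent branches succeed. The function name `m_out_of_n_reliability` and the sample branch reliabilities are illustrative assumptions, not part of the tool; the branches are assumed to be independent, and transition probabilities and coverage factors are ignored.

```python
from itertools import combinations
from math import prod


def m_out_of_n_reliability(branch_reliabilities, m):
    """Probability that at least m of the independent branches succeed.

    Hypothetical helper: it enumerates every subset of succeeding
    branches, which is adequate for the few branches of a typical
    workflow but grows exponentially with their number.
    """
    n = len(branch_reliabilities)
    if not 0 <= m <= n:
        raise ValueError("m must lie between 0 and the number of branches")
    total = 0.0
    for k in range(m, n + 1):
        for working in combinations(range(n), k):
            working = set(working)
            # Branches in `working` succeed, all the others fail.
            total += prod(
                r if i in working else 1.0 - r
                for i, r in enumerate(branch_reliabilities)
            )
    return total


branches = [0.9, 0.8, 0.7]  # illustrative values only
all_needed = m_out_of_n_reliability(branches, 3)  # every branch must succeed
one_needed = m_out_of_n_reliability(branches, 1)  # a single success suffices
```

With m equal to the number of branches the structure degenerates into a series system, while m = 1 gives a purely redundant one.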
4.1. Reduction Algorithm
When dealing with a workflow, we are assuming that
Web Services are composed in an orchestration. We assume
a workflow described as
$$W = (t, a, f_r, f_p, f_c) \qquad (1)$$
where: $t$ is a set of tasks (each represented by a circle);
$a$ is a set of transitions (each represented by an arrow); $f_r$
is a function that associates to every task $t_i$ in $t$ its
reliability function; $f_p$ is a function that associates to
every transition $a_{ij}$ (connecting task $t_i$ to task $t_j$) a
probability $p_{ij}$, representing the probability that task $t_j$
is activated once task $t_i$ terminates. In other words, $p_{ij}$
represents the probability of activation of the transition $a_{ij}$.
Whenever the task $t_i$ is unambiguously identified, the
index $i$ is omitted and $p_{ij}$ is written as $p_j$; and
$f_c$ is a function that associates to every task $t_i$ in $t$ a
value $c_i$ in [0, 1], representing the probability that a
failure of task $t_i$ does not lead to a failure of the workflow.
Hence $c_i$ represents a coverage factor, and can be
expressed as:
$$c_i = \sum_{g \in G} y_g \, P(g) \qquad (2)$$
where $g$ is a failure mode for the task $t_i$, $G$ is the fault
dictionary for the task $t_i$, $y_g = 1$ if the failure $g$ can be
tolerated and 0 otherwise, and $P(g)$ is the occurrence
probability of the failure $g$.
This implies that the reliability of the single task is increased
by a term representing the probability that the
component will fail without leading to a workflow failure,
that is:
$$R'_i = R_i + (1 - R_i)\, c_i \qquad (3)$$
where $R_i$ represents the reliability of the task $t_i$ and $R'_i$
represents the reliability of the task $t_i$ as perceived by the
workflow engine. The latter equals the former when c is
zero, i.e. the workflow cannot tolerate a fault in one of
the orchestrated services. If c equals 1 the formula re-
turns 1 meaning that the component is optional from a re-
liability point of view. In the next two sub-sections we
will always use the term reliability with reference to the
meaning it assumes in (3). A start task and an end task
must be identified within the set of tasks. The start task
does not have any incoming transition and represents the
invocation of the orchestrated service by an external
client. The end task does not have any outgoing transition
and represents the end of the orchestration. Once the
graph representing the Web Services orchestration is defined,
the reduction algorithm proceeds backward
through the graph (from the end task to the
start task): each time an individual reliability pattern
is found, its component tasks are collapsed into a single
task whose reliability is defined by the pattern reliability
formula. The process is then iterated until the whole
workflow is collapsed into a single task whose reliability
depends on the reliability of the individual tasks, the
probabilities $p_{ij}$ and the coverage factors $c_i$.
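The definitions above can be sketched in a few lines of Python. The sketch computes a coverage factor from a fault dictionary, applies equation (3), and then collapses a purely sequential workflow from the end task back to the start task. All names (`coverage_factor`, `perceived_reliability`, `reduce_sequence`) and all numeric values are hypothetical; the sketch assumes that a sequence of independent tasks multiplies their perceived reliabilities, and it does not handle the other reliability patterns.

```python
def coverage_factor(fault_dictionary):
    """Sum P(g) over the failure modes g that the workflow can tolerate.

    fault_dictionary maps a failure-mode name to (probability, tolerated).
    The probabilities are assumed to be conditional on the task having
    failed, so they sum to 1 over the whole dictionary.
    """
    return sum(p for p, tolerated in fault_dictionary.values() if tolerated)


def perceived_reliability(r, c):
    """Reliability seen by the workflow engine: R' = R + (1 - R) * c."""
    return r + (1.0 - r) * c


def reduce_sequence(tasks):
    """Collapse a chain of tasks, walking backward from the end task.

    tasks is a list of (reliability, coverage) pairs ordered from start
    to end. Each iteration folds one more task into the already
    collapsed tail by multiplying reliabilities (independence assumed).
    """
    collapsed = 1.0
    for r, c in reversed(tasks):
        collapsed *= perceived_reliability(r, c)
    return collapsed


# Illustrative fault dictionary for one task (values are made up).
faults = {
    "timeout": (0.6, True),       # tolerated, e.g. by a retry
    "wrong_reply": (0.4, False),  # propagates to the workflow
}
c = coverage_factor(faults)  # 0.6
workflow = [(0.99, 0.0), (0.90, c), (0.95, 0.0)]
total = reduce_sequence(workflow)
```

A full implementation would match each reliability pattern on the workflow graph rather than assume a linear chain; the backward fold is only meant to show how collapsing proceeds from the end task.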
4.2. Reliability Patterns
The identified reliability patterns were described by the authors in
a previous work [13]; Table 1 reports the formulas
of these patterns.
5. Assumptions and Limits of the Model
The main hypothesis underlying the analysis proposed in
the previous section, and hence the proposed approach,
is the independence of the events $A_i$ = 'time to first
failure of activity $i$ > $t$' and $A_j$ = 'time to first failure of
activity $j$ > $t$', for each $i \neq j$. This means, for example,
that if two services are offered by the same provider it is
assumed that they are deployed on physically independent
servers. A further simplification of this approach lies
in the absence of communication channel reliability from
the model. In practice the communication channel may
itself introduce faults, for example by dropping packets,
modifying them, or delaying their delivery beyond
timeout expiration. However, this kind of behavior can
be embedded into the model of the single service. Finally,
it is worth noting that the obtained model provides the
reliability of the services orchestration without considering
the reliability of the service that performs the orchestration.
As an example, let us consider a service which, by
means of an orchestration engine (e.g. BPEL), coordinates
the invocation of other services by following a predefined
workflow. In this case the reliability of the orchestration
service, of the server hosting such a service, and of
the orchestration engine should be modeled and, under
the hypothesis of independence, multiplied by
the reliability of the entire workflow.

Table 1. Reliability pattern expressions.
Pattern Name | Reliability Pattern Expression
Sequence | $R_A R_B$
Synchronizing Parallel | [expression not recoverable]; it uses a step function u(n) and a function equal to 1 if n = 0, 0 otherwise
Multi-Merge Parallel | [expression not recoverable]
XOR Parallel | [expression not recoverable]
Loop | [expression not recoverable], where p is the probability of running the loop
Discriminator Parallel | [expression not recoverable]
6. Case Study
This section considers a realistic case study to show how
the proposed approach can be applied in the field of reli-
able workflow development. The case study considers a
company with four branches each with its IT department
[14]. The four branches are federated in a trusted domain
[15], so that a user who logs in at one of the federated
entities in order to access a service obtains a SAML [16]
token, which can be used at a later time to log in at any of
the other federated entities without the need to be
authenticated again. In the presented scenario the user,
holding the SAML token, sends a request to access a ser-
vice provided by the company (Figure 1). The request is
intercepted by the Access Manager of one federated ent-
ity which picks the SAML token up and tries to validate
it. Two alternatives are possible:
1) The SAML token was actually released by that ent-
ity (branch 1 of the company): in this case, the Access
Manager requests the Identity Manager to retrieve the
appropriate authorization profile for the user holding the
token. The Identity Manager will do it by means of the
Directory Server which provides an abstraction of the
data repositories in the company.
2) The SAML token is not recognized as a valid token
by branch 1 of the company: the Access Manager
charges the Federation Manager in its domain with managing
the SAML token. The Federation Manager will ask the
other Federation Managers in the same trusted domain to
check the SAML token. Each Federation Manager
will provide its Access Manager with a copy of the
token. These in turn will repeat the same steps of the
validation procedure as operated by the Access Manager
in branch 1. Finally, the required service will be accessed
with the authorization profile provided by the
federated entity which actually issued the SAML token.
Assuming that each of these functionalities is provided
as a service, the whole procedure can be described by the
workflow graph of Figure 2. A workflow architect can
study the workflow in order to make a reliability prediction
for the orchestrated service. Further, the designer can
study the workflow in order to modify the architecture of the
service itself. If, for example, the four federated entities
are distributed across Europe but two of them are both in Italy,
the workflow architect could compare the reliability
of an architecture where the four entities are seen as four
branches of the company with that of an architecture where the
two entities in Italy are connected through a virtual dedicated
LAN, resulting in a company with only three branches
from a reliability point of view. The workflow architect
can make the best decision based on a trade-off analysis
of the total reliability of the service versus the
implementation cost of the chosen solution. While evaluating
the cost of an architectural solution may be a simple
task, computing the total reliability function of a
complex workflow is not straightforward. In the next section
we present a tool that can help the workflow
architect perform a reliability prediction analysis.
7. Implementation
In this section we first show the main capabilities of the
proposed reliability prediction tool and then demonstrate
the usage of the tool with respect to the case
study presented in the previous section.
7.1. The Plug-in
The proposed algorithm was developed as a plug-in for
ActiveBPEL Designer [9], which is a widely used
workflow design tool. Once installed, the plug-in allows
the workflow designer to perform a reliability analysis
of a BPEL Web Services orchestration. More precisely,
it allows the designer to:
• Retrieve the reliability function of the workflow under design;
• Evaluate the reliability of the workflow at a specific point in time, that is, the probability that the workflow will not fail until the specified time;
• Plot the resulting reliability function with respect to time;
• Obtain usual reliability metrics, such as the MTTF (Mean Time To Failure), for the analyzed workflow.

Figure 1. The case study scenario: A company with four branches federated in a trusted domain.

In order to perform the analysis described above, the
designer has to provide the workflow with the information
required by the model (such as the symbolic expression
of the reliability function of each activity, and the
transition probabilities). Such information is directly embedded
in the BPEL description of the workflow by exploiting
the standard WS-BPEL [10] extensibility. This
allows the tool to remain compliant with any WS-BPEL
orchestration regardless of the specific editor adopted for its
design. Moreover the plug-in extends the Active BPEL
Designer interface to simplify the provisioning of relia-
bility related data. Then, in order to compute the total
workflow reliability function, the plug-in uses an XSLT
style-sheet to convert the workflow BPEL/XML based
representation to an internal representation that only in-
cludes the workflow dependability attributes and the re-
cognized reliability patterns. After this transformation
has been done, the plug-in calls a class that calculates the
global reliability function as described in the reduction
algorithm. The desired analysis estimates are then ob-
tained by evaluating the retrieved reliability function.
Symbolic operations are made possible by the adoption
of the MathEclipse plug-in [17].
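As a rough illustration of the analyses listed above, the sketch below evaluates a reliability function at a given time and approximates the MTTF as the integral of R(t) from 0 to infinity, using a plain trapezoidal rule. It does not reproduce the plug-in, which works symbolically through MathEclipse; the function names, the three-task sequential workflow, the integration horizon and the step count are all assumptions chosen for the example.

```python
import math


def exponential_reliability(failure_rate):
    """Return R(t) = exp(-failure_rate * t) for a single task."""
    return lambda t: math.exp(-failure_rate * t)


def sequence(*reliabilities):
    """Reliability of independent tasks that must all succeed."""
    return lambda t: math.prod(r(t) for r in reliabilities)


def mttf(reliability, horizon, steps=100_000):
    """Approximate the integral of R(t) on [0, horizon] (trapezoidal rule).

    The horizon must be long enough for R(t) to have decayed to ~0.
    """
    h = horizon / steps
    area = 0.5 * (reliability(0.0) + reliability(horizon))
    area += sum(reliability(i * h) for i in range(1, steps))
    return area * h


rate = 0.005  # failures/second, the value adopted in the case study
workflow = sequence(*(exponential_reliability(rate) for _ in range(3)))

r_at_60s = workflow(60.0)  # probability of no failure within 60 s
mttf_estimate = mttf(workflow, horizon=5_000.0)
```

For this particular workflow the result can be checked by hand, since a sequence of three exponential tasks with rate 0.005 has R(t) = exp(-0.015 t) and therefore an MTTF of 1/0.015 seconds.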
7.2. Experiment
With respect to the case study workflow depicted in the
previous section in Figure 2, Figure 3 shows the steps
made by the reduction algorithm in order to obtain the
workflow reliability function. Each step is represented
by a numbered box in which the patterns recognized by
the algorithm are highlighted. In the first and third steps
sequence patterns are recognized and reduced (dotted
boxes). In the second and fifth steps XOR parallel pat-
terns are matched and reduced (dashed boxes). In the
fourth step an AND-SPLIT configuration is found (dot-dashed
boxes). Even though pencil-and-paper calculation
is possible, it is certainly an error-prone and
time-consuming process, since the more complex
the workflow graph is, the harder it is to evaluate the
reliability function by hand. As an alternative, the reliability
function for the workflow can be automatically evaluated
by using the proposed plug-in. Figure 4 shows the
workflow as it could be implemented with the Active
BPEL Designer tool. To make the evaluation of
the reliability estimates possible, it was necessary to specify the
reliability function of each activity as well as the transition
probabilities.

Figure 2. The workflow graph of the case study scenario.
Figure 3. The workflow of the case study scenario and the related reduction process.
Figure 4. The schematic of the case study workflow in Active BPEL Designer.
Figure 5. Total workflow reliability function.
Figure 6. A screenshot of the ActiveBPEL workspace showing the plot reliability function.

In our case study, for the sake of simplicity,
we assumed all the non-empty activities to have
exponential reliability functions with the
same failure rate. We explicitly note that evaluating
the failure rate of Web Services is beyond the scope
of this paper; we instead use a realistic value of 0.005
failures/second, as resulting from the experiments conducted
in [18]. The transition probabilities were assumed to be p
= 50% and q = 0.37%, and the value of k for the Synchronizing
Parallel pattern is always equal to the number
of branches of the pattern. Once those values are fixed,
the tool can infer the desired reliability evaluations. For
example, using the plug-in we can easily compare the
reliability of the two architectural solutions presented in
the previous section. Figure 5 depicts the final reliability
expressions in the two cases, and Figure 6 shows a
screenshot of the workspace with the charts of
R1(t) (4 branches) and R2(t) (3 branches), obtained by
running the plot reliability function command.
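The kind of what-if comparison described above can be mimicked with a deliberately simplified model. The sketch below looks only at a block of n parallel branches in which all n must complete (k equal to the number of branches), each with an exponential reliability function and a failure rate of 0.005 failures/second. It is not the full workflow of Figure 2, so its numbers do not correspond to the expressions in Figure 5; the function name and the sample time instants are assumptions.

```python
import math

FAILURE_RATE = 0.005  # failures/second, as assumed in the case study


def all_branches_reliability(n_branches, t, failure_rate=FAILURE_RATE):
    """R(t) of n independent exponential branches that must all succeed."""
    return math.exp(-failure_rate * t) ** n_branches


# Compare a four-branch and a three-branch block at a few time instants.
comparison = [
    (t, all_branches_reliability(4, t), all_branches_reliability(3, t))
    for t in (10.0, 50.0, 100.0)
]
for t, r4, r3 in comparison:
    print(f"t={t:5.0f} s  4 branches: {r4:.4f}  3 branches: {r3:.4f}")
```

In this simplified setting the three-branch block is more reliable at every t > 0 simply because fewer components must all succeed; in the real workflow the outcome depends on the full reduced expression.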
8. Conclusions
In this paper we have proposed a formal approach that
allows a workflow architect to perform reliability analy-
sis of a SOA-based service. The approach exploits the
concept of reliability patterns to evaluate the reliability
function of a wide class of workflows. We have integrated
our reduction process into an orchestration engine
so as to provide a plug-in which can be used to perform
a reliability analysis for a planned workflow, as
well as to compare the reliability of alternative solutions
in a what-if analysis.
9. Acknowledgements
This work has been partially supported by the Italian
Ministry for Education, University, and Research (MIUR)
in the framework of the Project of National Research Interest
(PRIN) DOTS-LCCI: Dependable Off-The-Shelf
based middleware systems for Large-scale Complex Critical
Infrastructures, and by the Italian Ministry of Industry in
the framework of funding INDUSTRIA 2015 (SISTEMA
project). This work also received funding from the
European Community’s Seventh Framework Programme
(FP7/2007-2013) under grant agreement No. 225553
(INSPIRE Project).
