Int. J. Communications, Network and System Sciences, 2009, 2, 857-873
doi:10.4236/ijcns.2009.29100 Published Online December 2009 (
Copyright © 2009 SciRes. IJCNS
Forensic Investigation in Communication Networks Using
Incomplete Digital Evidences
CN&S Research Lab, University of the 7th of November, Carthage, Tunisia
Received July 16, 2009; revised September 10, 2009; accepted October 21, 2009
Security incidents targeting information systems have become more complex and sophisticated, and intruders
might evade responsibility due to the lack of evidence to convict them. In this paper, we develop a system for
Digital Forensic in Networking, called DigForNet, which is useful to analyze security incidents and explain
the steps taken by the attackers. DigForNet combines intrusion response team knowledge with formal tools
to identify the attack scenarios that have occurred and show how the system behaves for every step in the
scenario. The attack scenarios construction is automated and the hypothetical concept is introduced within
DigForNet to alleviate missing data related to evidences or investigator knowledge. DigForNet system sup-
ports the investigation of attack scenarios that integrate anti-investigation attacks. To exemplify the proposal,
a case study is proposed.
Keywords: Formal Digital Investigation, Incident Response Probabilistic Cognitive Map, DigForNet, Anti-
Forensic Attacks Investigation, Attack Scenarios Reconstruction
1. Introduction
Considering the state of digital security incidents which
has dramatically increased in terms of complexity, num-
ber, and sophistication, it becomes evident that the tradi-
tional ways of protecting information systems (e.g., Fire-
walls, IDSs) are no longer sufficient. Faced to this situa-
tion, security experts have started giving a great interest
to a novel discipline called the digital investigation of
security incidents, which is defined by the literature as
preservation, identification, extraction, documentation and
interpretation of computer data [1]. Digital investigation
aims to perform a post-incident examination on the com-
promised system while achieving several objectives in-
cluding evidence collection, attack scenarios construction,
and results argumentation with non refutable proofs.
Performing a digital investigation is a very complex task
for many reasons. First, attacks may use multiple sources
and become difficult to trace using the available trace-back
techniques. Second, systems may not be initially prepared
for investigation, leading to the absence of effective logs
or alerts to be used during the analysis of the incident.
Third, the attackers may use a number of techniques to
obfuscate their location or to hide traces on the system
that could be used to prove their malice. Fourth, attack
scenarios may use several automated tools that create
intensive damaging activities on the compromised sys-
tems. A large amount of data should thus be analyzed
and several evidences need to be identified and extracted.
To face the above complexity, the digital investigation
should, first, be well structured by reconciling both the
expertise of the incident response team (IRT) and the use
of formal reasoning techniques about security incidents.
This reconciliation allows to: 1) better filter the data to
be analyzed and source of evidences to be explored,
based on the skills developed by the IRT; 2) validate the
results of the formal techniques, by the IRT, before pre-
senting them and also use them to improve the content of
the attacks library. Second, digital investigation should in-
tegrate the use of formal techniques that are useful to
develop non-refutable results and proofs, and avoid er-
rors that could be introduced by manual interpretations.
Moreover, it should consider the development of tools to
automate the proof provided by these formal methods.
Third, since the collected evidences may be incomplete
and describing all potential malicious events in advance
is impractical, hypotheses need to be put forward in or-
der to fill in this gap.
Despite the usefulness of formal methods and approaches,
digital investigation of security incidents remains scarcely
explored by these methods. Stephenson [2] took interest
to the root cause analysis of digital incidents and used
Colored Petri Nets as formalism for modeling occurred
events. The methodology may become insufficient to
deal with sophisticated attack scenarios, where there is a
lack of information on the compromised system that re-
quires some hypotheses formulation. Stallard and Levitt
[3] proposed a formal analysis methodology entitled se-
mantic integrity checking analysis. It is based on the use
of an expert system with a decision tree that exploits
invariants relationship between existing data redundan-
cies within the investigated system. To be usable with
highly complex systems, it is imperative to have a prior
list of good state information; otherwise, the investigator
has to complete its analysis in ad hoc manner. Gladychev
[4,5] provided a Finite State Machine (FSM) approach to
reconstruct potential attack scenarios discarding scenar-
ios that disagree with the available evidences. Since in-
vestigation may occur on systems that could not be com-
pletely described due to their complexity, if unknown
system transitions are ignored, the event construction
may freeze or its accuracy may be severely affected.
Carrier and Spafford [6] proposed a model that supports
existing investigation frameworks. It uses a computation
model based on a FSM and the history of a computer. A
digital investigation is considered as the process that for-
mulates and tests hypotheses about occurred events or
states of digital data. Additionally, the model allows de-
fining different categories and classes of analysis tech-
niques. A key idea in the proposed approach is that every
computer has a history, which is not fully recorded or
known in the practice. A digital investigation is consid-
ered as the process that formulates and tests hypotheses
about occurred events or states of digital data. Wil-
lanssen [7] took interest in enhancing the accuracy of
timestamp evidences. The aim is to alleviate problems
related to the use of evidences whose timestamps were
modified or referred to an erroneous clock (i.e., which
was subject to manipulation or maladjustment). The pro-
posed approach consists in formulating hypotheses about
clock adjustment and verifying them by testing consis-
tency with observed evidences. Later, the testing of hy-
potheses consistency is enhanced by constructing a model
of actions affecting timestamps in the investigation sys-
tem [8]. In [9], a model checking-based approach for the
analysis of log files is proposed. The aim is to search for
pattern of events expressed in formal language. Using
this approach logs are modeled as a tree whose edges
represent extracted events in the form of algebraic terms.
P. Sencar and Memon [10] proposed a methodology to
recover files from unallocated space of disk without the
assistance of meta-data or file system table. The pro-
posed technique assumes that files may be initially frag-
mented and several contiguous blocks may be scattered
around the storage area. To enhance the effectiveness of
file recovery, the technique looks for detecting the point
of fragmentation of a file, using a sequential hypothesis
testing (SHT) procedure. Peisert [11] proposed to deter-
mine what data are necessary to perform investigation
and basis its idea on the use of the requires/provides
model, which is commonly used for intrusion detection.
We develop in this paper, a system for Digital Foren-
sic in Networking called DigForNet. It integrates the analy-
sis performed by the Incident Response Team on a com-
promised system, through the use of a new Cognitive
Map [12,13] called the Incident Response Probabilistic
Cognitive Map (IRPCM), which extended the Cognitive
Map proposed in [14]. DigForNet uses formal approach
to identify potential attack scenarios using a formal speci-
fication language entitled, I-TLA. The formal approach
allows specifying different forms of evidences. It identi-
fies an attack scenario as a series of elementary actions,
retrieved from a used library, which, if executed on the
investigated system, would produce the set of available
evidences. We developed in DigForNet the concept of
executable specification of attack scenarios, which shows
with details how an attack is performed progressively on
the system and how the latter behaved during the attack.
DigForNet uses I-TLC, an automated verification tool
for I-TLA specifications. To alleviate any missing evi-
dences or details related to attack scenarios, DigForNet
integrates a technique for generating hypothetical actions
to be appended to the scenario under construction.
Our contribution is three-fold. First, DigForNet recon-
ciles in the same framework conclusions derived by the
incident response team and theoretical and empirical
knowledge of digital investigators. To the best of our
knowledge, it is the first investigation system which sup-
ports such feature. Second, we proposed a new IRPCM
which integrates the temporal aspect. In fact, during
IRPCM construction, the appending of anti-investigation
relations between concepts could make other concepts inac-
tive. Several IRPCM snapshots could thus be obtained
depending of time. Third, using the concept of hypo-
thetical actions, DigForNet stands out from the other
existing approaches and allows generating sophisticated
and unknown attack scenarios. The new generated hypo-
thetical actions could be used to extend the content of the
library of attacks. Fourth, the formal techniques used by
DigForNet allow supporting a collaborative working be-
tween the IRT members, and generating a formal specifi-
cation useful for conducting an investigation, where a model
checking-like technique could be used to automate the
generation of executable specification of attack scenarios.
This paper is organized as follows. Section 2 defines
the important concepts related to the digital investigation
of security incidents and describes the DigForNet’s meth-
odology for reasoning about security incidents. The use
of the IRPCM technique to represent the intrusion re-
sponse team members’ view about the security incidents
is shown in Section 3. Section 4 describes I-TLA as logic
for specifying evidences and identifying potential attack
scenarios that satisfy them. It also shows how to pass
from IRPCM to I-TLA specification. Section 5 intro-
Copyright © 2009 SciRes. IJCNS
duces I-TLC as an automated verification tool for I-TLA
specifications, which allows generating executable specifi-
cation of attack scenarios. Section 6 illustrates an exam-
ple of the use of DigForNet in investigating a real secu-
rity incident. Finally, Section 7 concludes this paper.
2. Methodology of Structured Investigation
We start this section by introducing the need for digital
investigation and then we describe the DigForNet’s
2.1. Need for Digital Investigation
Focusing merely on restoring the system, which is the
simplest and easiest method, is disadvantageous. In fact,
valuable information and traces that allow understanding
the attack could be removed if the compromised system
is straightforwardly formatted or reinstalled. The above
mentioned weaknesses in the response point up the need
for conducting a post-incident digital investigation [15].
The latter can be considered as the process that allows to:
1) determine how the computer attack was performed and
what are the security weaknesses and design mistakes
that let the incident succeeds; 2) trace the attackers to
their source to identify their identities; 3) build a proof
from the collected information to bring a prosecution
against attackers who committed the attack; 4) argument
and underline the results with well-tested and proved
methods and techniques; 5) Study the attackers’ trends and
motives, and take the accurate security measures to pre-
vent future similar attack scenarios.
Since digital investigation [16,17] focuses on the in-
vestigation of an incident after it has happened, a digital
evidence should be gathered from the system to support
or deny some reasoning an investigator may have about
the incident. Digital evidence is defined as any data
stored or transmitted that support or refute a theory of
how an offence occurred or that address critical elements
of the alibi [16].
2.2. DigForNet Methodology
DigForNet integrates the incident response team con-
tributions under the form of Incident Response Probabil-
istic Cognitive Maps (IRPCMs). An IRPCM is built
with a collaborative fashion by the IRT members based
on the information collected on the system. IRPCMs
provide a foundation to mainly investigate and explain
occurred security attacks.
DigForNet provides a formal way for reconstructing
potential attack scenarios. It defines a novel logic entitled
Investigation-based Temporal Logic of Actions (I-TLA),
and its logic-based language entitled I-TLA+. DigForNet
methodology is composed of six steps organized in a
waterfall model as shown in Figure 1. They are de-
scribed as follows.
The first step collects evidences available within three
different sources, namely the operating systems, networks,
and storage systems. The second builds the IRPCM, which
is nothing but a directed graph representing security events,
actions and their results along with the relations between
these concepts. The third step consists in extracting the
sets of evidences and actions from the cognitive map for
the formal specification of the potential attack scenarios.
The fourth step generates a formal specification. A for-
mal approach is necessary for this purpose. DigForNet
uses logic, referred to as I-TLA, to generate a specifica-
tion containing a formal description of the set of ex-
tracted evidences and actions, the set of elementary at-
tack scenario fragments retrieved from the library of at-
tacks, and the initial system state. During this step, Dig-
ForNet uses I-TLA to prove the existence of potential
attack scenarios that satisfy the available evidences. To
be able to generate a variety of attack scenarios, Dig-
ForNet considers the use of a library of elementary ac-
tions supporting two types of actions: legitimate and ma-
licious. Malicious actions are specified by security ex-
perts after having assessed the system or appended by
investigators upon the discovery of new types of attacks.
The fifth step generates executable specification [18,19]
of potential attack scenarios using a model checker tool
associated with the formal specification. DigForNet builds
Investigation-based Temporal Logic model Checker called
I-TLC composed of two sub-steps. The first works to re-
build the attack scenarios in forward and backward chain-
ing processing, showing details of all intermediate sys-
tem states through where the system progresses during
the attack. The second sub-step provides a tolerance to
the incompleteness of details regarding the investigated
incident and the investigator knowledge. It interacts with
a library of hypothetical atomic actions to generate hy-
pothetical actions, append them to the scenarios under
construction, and efficiently manage them during the whole
process of generation. The library of hypothetical atomic
actions is composed of a set of entries showing interac-
tion between a set of virtual system components and a set
of rules used to efficiently create hypothetical actions as
a series of hypothetical atomic actions.
The sixth step uses the generated executable potential
attack scenarios to identify the risk scenario(s) that may
have compromised the system, the entities that have origi-
nated these attacks, the different steps they took to con-
duct the attacks, and the investigation proof that confirms
the conclusion. These results are discussed with the IRT
members in order to check the hypotheses added by I-TLC
and update the initial IRPCM by: 1) omitting some con-
cepts because they do not present an interest for the at-
tack scenario construction, and/or 2) adding other con-
cepts, corresponding to the hypothetical actions, to the
IRPCM and linking them to the other concepts. Links in
Copyright © 2009 SciRes. IJCNS
Copyright © 2009 SciRes. IJCNS
Figure 1. DigForNet methodology.
the IRPCM are deleted in the case where the concepts
originating from or going to them are omitted. New links
are also added to the IRPCM to link the newly added
concepts. Hypothetical actions are also added to the at-
tack library. In addition, tools collecting the evidences
are enhanced to be able to detect the newly discovered
3. Intrusion Response Probabilistic Causal
We have defined in [20] a category of cognitive maps to
support the intrusion response activity. In this paper, we
provide an extension to these cognitive maps referred to
as Incident Response Probabilistic Cognitive Maps (IRPCMs)
by introducing the notions of probability and activation
degree of concepts, and integrating links with complex
semantics. IRPCMs provide the foundation to investigate
and explain security attacks which have occurred in the
past and to predict future security attacks against the
system. These aspects are important for negotiation or
mediation between IRT members solving thus disparities
which are generated by the difference in their view
points and which can lead to conflicting decisions.
3.1. IRPCM Definition
An Incident Response Probabilistic Cognitive Map (IRPCM)
is a directed graph that represents intrusion response
team members’ experience-based view about security
events related to an incident. The nodes of the graph
represent concepts belonging to the network security
field and a set of edges representing relationships be-
tween the concepts.
IRPCM concepts can be symptoms, actions, and un-
authorized results related to network security field. Symp-
toms are signs that may indicate the occurrence of an
action. System crashes or the existence of new user ac-
counts or files are examples of symptoms. An action is a
step taken by a user or a process in order to achieve a
result. Probes, scans and floods are samples of actions.
An unauthorized result is an unauthorized consequence
of an event (defined by an action directed to a target).
For instance, an authorized result can be an increased
access or a disclosure of information or a theft of resources.
IRPCM concepts are labeled by values in the interval
[0,1] informing about the activation of the correspondent
concepts. They are also labeled by a value indicating their
occurrence time.
IRPCM edges link concepts to each others. An edge eij
linking concept ci to concept cj is labeled as (πij , qij)
where πij is the predicate expressing the relation between
the two nodes (examples include, , ) and qij
(taking values in ]0,1]) the probability expressing the
certitude degree that the relationship πij really occurred
between the concepts ci and cj. Quantitative values are
given by security experts. Notice that the semantic of the
predicate πij depends on the nature of the concepts ci and
cj. For the sake of simplicity, we consider seven cases of
relationships in this paper. They are described here after.
1) Input/output relation: Let ci be a symptom and cj be
a symptom or an action. An input/output relation, which
is expressed using the predicate πij=I/Q, means that part
of output of the concept ci is the input of the concept cj.
2) Temporal relation: Let ci and cj be two actions. A
temporal relation, which is expressed using the predicate
πij=t, means that ci is an action that precedes cj.
3) Cause/Effect relation: Let ci be an action and cj be an
unauthorized result. The cause/effect relation, which is
expressed using the predicate πij=CE, means that the effect
produced by concept ci is visible through concept cj.
4) Concealment relation: Let ci and be an action and cj
be an action, a symptom or an unauthorized result. The
concealment relation, expressed using the predicate πij=
conceal, means that concept ci when it happens, leads to
the hiding of concept cj. This corresponds to the situations
where the attackers execute some actions on the compro-
mised system to hide information revealing their access to
this system, or to hide the results on this system.
5) Destruction relation: Let ci be an action and cj be an
action, a symptom or an unauthorized result. The de-
struction relation, which is expressed using the predicate
πij=destroy, means that the occurrence of concept ci wipes
out the existence of concept cj. This corresponds to the
situation where the attacker executes some actions to de-
stroy any trace that may inform about his access to the
compromised system.
6) Forgery relation: Let ci be an action and cj be an ac-
tion, a symptom or an unauthorized result. The forgery
relation, which is expressed using the predicate πij=forge,
means that the occurrence of concept ci creates a new
forged concept cj with random time of occurrence. This
corresponds to the situation where the attacker tries to
deceive the investigation activity.
7) Replacement relation: Let ci be a forged action,
symptom, or unauthorized result and cj be a concept be-
longing to the same category as ci. The replacement rela-
tion expressed using the predicate πij=replace, means that
the concept cj is replaced by concept ci when the latter
has been forged by an action.
Notice that relations conceal, destroy, forge and re-
place corresponds to an anti-investigation activity.
3.2. Appearance-Period
Let ci and cj be two concepts belonging to the IRPCM
having respectively occurrence times equal to ti and tj.
Appearance period of cj, say Acj is determined as follows:
If then ],[= ij
ij concealdestroy replace
If forgeECOI tij ,/,<,/=
then ],[=
3.2.1. Snapshot Function
An IRPCM may vary as anti-forensic actions and rela-
tions are appended. Therefore, some concepts in the IRPCM
may be invalid at a given time, and the analysis of the
IRPCM becomes complex. To make the analysis simple,
we need a snapshot of the IRPCM for different instants.
To this end, we introduce the snapshot function. The
main feature of this function is to show a sub-view of the
IRPCM which hides the concepts in the IRPCM that are
invisible at that time. To do so, the appearance period of
concepts is exploited.
Formally, let Vsisible(c,t) be the function that returns
the Boolean value True if the time t is within the ap-
pearance period of the concept c. The IRPCM snapshot
at time instant t is created by deleting any concepts c in
that IRPCM, such that Vsisible(c,t)=false, and all edges
which are connected to ci.
3.3. Building IRPCMs
The IRT members are responsible for building the IRPCM
(second step in the DigForNet methodology). The basic
elements needed in this activity are the events collected
on the information system. These events may be IDS
alerts; compromises of services offered by the network,
or any sign indicating the occurrence of malicious or
suspect actions against the network. IRT members ana-
lyze these signs and define the appropriate symptoms,
actions and unauthorized results and assign the appropri-
ate probabilities and relationships to the edges linking
the defined concepts. The process of building an IRPCM
has two properties: completeness (if an attack has oc-
curred and a sufficient number of events are collected to
identify this attack, then we can find an IRT able to build
an IRPCM allowing to identify the attack) and conver-
gence (if an IRPCM is built and is large enough to col-
lect all the events related to a given attack, then the IRT
must build in a finite time an IRPCM allowing to provide
the right solution to protect against this attack).
The building of an IRPCM follows a methodology based
on the iterative process described in the following steps:
1) Collect a first set of security events observed in the
compromised system or detected by security tools.
2) Build an IRPCM based on the collected events.
3) Continue to collect security events.
4) Update the IRPCM based on the new recollected
events. Events which do not belong to the previous
IRPCM are added. Probable links related to the newly
considered concepts are also added to the IRPCM.
5) Refine the IRPCM.
6) Update the probabilities of the links and the activa-
tion degree of the concepts.
7) If the stopping criterion is satisfied, stop the
IRPCM building process; else, return to step 4.
In the second step of the above methodology, the gen-
eration and building of the IRPCM is the duty of the In-
cident Response Team (IRT). Two main tasks should be
handled within this step. First, the IRT members should
collaborate to append concepts based on their knowledge
and skills, and negotiate between them to classify the
appended concepts into necessary and unnecessary con-
Copyright © 2009 SciRes. IJCNS
cepts. Concepts can be considered as unnecessary if they
are duplicated, do not cope with the properties of the
attack or the system under investigation, or are erroneous.
The unnecessary concepts will thus be deleted from the
IRPCM under construction. Second, the IRT members col-
laborate to locate concepts in the IRPCM that could be
linked together and append edges. Obviously, such activ-
ity is subject to discussion and negotiation in order to
correct or delete erroneous edges.
In the fifth step of the methodology, the refinement of
the IRPCM is done through the analysis of the semantic
of concepts. Two forms of modifications on the IRPCM
take place during the refinement. The first consists in
substituting a concept by a more accurate one, merging
some concepts together, or segmenting a concept into
many other ones to make relations more significant. While
attack scenarios look different, they, in most cases, reuse
techniques of attacks and actions. The IRT, which is al-
ways in charge of constructing IRPCMs for the investi-
gated scenarios, could exploit IRPCMs related to previ-
ously resolved incidents to complete and update the cur-
rent IRPCM under construction. To do so, it suffices to
define patterns that allow detect similarities between
similar fragments of IRPCMs. The IRT has just to detect
patterns in the IRPCM under construction and find IRPCM
fragments that could be integrated from the previously
constructed ones.
Two criteria can be considered to decide about the end
of the IRPCM building process. The first is when all the
candidate actions in the library (those which have a rela-
tionship with the collected events) are present in the
IRPCM. The second is based on the decision of the IRT
members. If the latter agree that the IRPCM is large
enough, then the building process is stopped. The IRT
decision can be shared by all the members or it can be
taken by a mediator (a member of the IRT in charge of
coordinating activity helping solving conflicts and ter-
minating the process).
3.4. Activation Degree of a Concept
IRPCM concepts values give indications about their ac-
tivation. These values, referred to as activation degrees,
belong to the interval [0,1]. We define the function dac
assigning activation degrees to the concepts as follows:
 (1)
where C represents the set of concepts in the IRPCM and
S stands for the set of snapshots. A concept is said to be
dac-activated if its activation degree is equal to 1. In the
following, we show how to build a dac function based on
a given set of selected concepts in the IRPCM for a
snapshot s. Let I be the set of concepts related to col-
lected events of involvement in attack with respect to
detected intrusions. .
1) Let .
niscdac i1=1,=),(
2) Compute iteratively the remaining activation de-
grees as follows: Let F be the set of the concepts for
which we have already computed the activation degree.
F is initially equal to the set I.
3) Let G be the set of concepts that have a relation
with one or more concepts belonging to F.
where qdc is the probability ex-
pressing the certitude degree that there is a relationship
between the concepts d and c.
=),( scdac
)}({ ddacqsup dcGd
4) and return to step 3 if .
GFF := F
In the case where the IRT members have detected ma-
licious actions against the secured system, they start con-
structing the IRPCM corresponding to this situation. The
concepts that represent the collected events are activated
and will form the set I in this case. The activation degree
of the remaining concepts is determined according to the
previous algorithm.
The dac function is used in the third step of the Dig-
ForNet methodology to extract set of evidences. Nodes
having a dac degree greater than a predefined threshold
are extracted as evidences for the formal specification.
Notice that the activation degree of concepts may vary
from one snapshot to another if some concepts are de-
leted or, in the contrary, added to the current snapshot. In
the first case, a concept cm is deleted from a given snap-
shot of the IRPCM. If cf is a concept to which cm is di-
rectly linked, then the activation degree of the concept cf
is reduced if the activation degree of cm is the most im-
portant over the set of concepts directly linked to cf . In
the second case, if cm is a concept which is added to the
current snapshot, we distinguish three sub-cases: 1) the
new concept is not evidence and has no concepts directly
linked to it. In this case the da c value of the new concept
is unknown and must be set by the investigation team; 2)
the new concept is evidence. In this case its dac value is
set to one; 3) the new concept is not evidence and there
are concepts directly linked to it. In this case, the dac
value of the new concept is calculated according to the
previous algorithm. Having determined the dac value of
the new concept cm and if we represent by P the set of
concepts to which cm is directly linked, then the for every
concept c in P, the activation degree increases if cm has the
highest activation degree over those directly linked to c.
During IRPCM construction, it may happen that in
some snapshots, some concepts constitute evidences,
while they did not in the preceding IRPCM snapshot. In
this case, we set to one the activation degree of these
concepts and, using the previous algorithm, we update
the dac values of the concepts to which these evidences
are directly or indirectly linked. Conversely, if some
anti-forensic relations appear in the new snapshot show-
Copyright © 2009 SciRes. IJCNS
ing that some data sources were affected by attacks to
alter the stored evidences, the dac values of concepts
related to evidences collected from these sources should
be reduced. Consequently, the algorithm is re-executed
to update the dac values in the new IRPCM snapshot.
4. Generation of a Formal Specification of
Attack Scenarios
The Investigation-based Temporal Logic of Actions,
I-TLA [21], is a logic for the investigation of security
incidents. It is a extension to the Temporal Logic of
Actions (TLA [22]). I-TLA defines a theoretical frame-
work for: 1) modeling and specifying evidences left by
intruders further to the occurrence of a security incident;
2) supporting advanced description and specification of
potential happened attack scenarios as a series of ele-
mentary attacks, extracted from the library of attacks,
that if assembled together, would satisfy the available
evidences. Similarly to TLA, I-TLA allows to reason
about systems and their properties in a uniform logic
formalism. I-TLA is provided with I-TLA+, a highly
expressive formal language that defines a precise syntax
and module system for writing I-TLA specifications. I-
TLA will be used in this paper to generate a specification
describing potential attack scenarios that satisfy the
available set of evidences.
In the sequel, we focus on describing the different
forms of evidences supported by I-TLA, showing how
they can be specified and how they should be satisfied by
the expected attack scenario. The reader is referred to [21]
for more details of I-TLA and I-TLA+ and a complete
semantic and syntactic description.
4.1. Modeling Scenarios and Evidences in I-TLA
I-TLA is typeless and state-based logic. It allows the
description of states and state transitions. A state, while it
does not explicitly appear in a I-TLA specification formula,
is a mapping from the set of all variables names to the
collection of all possible values. An I-TLA specification
generates a potential attack scenario in the form of:
,...,,= 10
, as a series of system states (to n)
that satisfies all available evidences. I-TLA supports four
different forms of evidences, namely history-based, non-
timed event-based, timed event-based, and predicate-
based evidences. A state-based representation of attack
scenarios allows a security expert to observe how its
system progresses during the attack and how it interacts
with the actions executed in the scenario.
4.1.1. History-Based Evidences
Typically, security solutions do not have direct access to
all system components. Some of them are able to provide
evidences as histories of the values of the monitored
system variables, during the spread of an attack scenario.
These security solutions cannot realize that the system
has progressed or not from a state to another if the value
of the monitored component is either blind, or does not
change. I-TLA encodes a history-based evidence, say E,
as an observation over a potential attack scenario ω, gener-
ated by Obs(ω) (Obs() is the observation function that
characterizes the ability of a security solution to monitor
the history of the system during an attack scenario). It
uses a labeling function that allows a third party to only
see limited information about states of an execution.
Since a state is a valuation of all system variables, a la-
beling function allows to either:
Totally observe the content of variable value. Vari-
able v is visible and its value is interpretable by the ob-
server. It represents a system component whose modifi-
cation is monitored by some security solution.
Observe a fictive value instead of the real variable
value. Variable v is visible but not interpretable by the
observer, meaning that its variation does not bring any
supplementary information to an observer. It can repre-
sent an encrypted data whose decryption key is unknown
by the observer.
Observe empty value. Variable v is completely in-
visible, such that none information regarding its value
could be determined. It represents a system component
which is not monitored by any security solution.
Obs(ω) is obtained by following two steps:
1) Transform each state si toi
ˆ, by hiding some of
the details it provid. i
ˆ is obtained from si by making
the value of every system variable v in si to b
a) Unmodified. In this case the variable is visible and
its value is interpretable by the observer. It represents a
system component whose modification is monitored by
some security solution;
b) Equal to a fictive value fictive value. In this case
the variable is visible but not interpretable by the ob-
server, meaning that its variation does not bring any sup-
plementary information to an observer. It can represent
an encrypted data whose decryption key is unknown by
the observer;
c) Equal to an empty value, denoted by ε. In this case,
the variable is completely invisible, such that none in-
formation regarding its value could be determined. It
represents a system component which is not monitored
any security solution.
2) Delete any which is equal to null value (i.e., all
values are invisible) and then collapse together each
maximal sub-sequence
ji ss ˆ
ˆ such that ,
into a single .
ss ˆ
Taking into consideration the availability of a his-
tory-based evidence E, consists in generating, an attack
scenario ω such that Obs(ω)=E.
Copyright © 2009 SciRes. IJCNS
4.1.2. Ordering of Observations
A step in the scenario may not change all the values of
the system variables. As the scope of the observations
differs, they may not allow noticing that the system has
progressed during the attack at the same time. I-TLA
allows to specify for two given observations, which one
will vary first (respectively last) when the attack scenario
starts (respectively finishes). Consider the following exam-
ple involving an attack scenario ω and two observations
and , generated by
observation functions and , respectively.
is said to be an observation that allows to notice
the occurrence of an incident before observation
]...,,[= 1n
)(Obs (sOb)
if and only if: x
such that: yx
11=)( eobs x
mj <1
12 jx ee
obs for some j
4.1.3. Non-Timed Events-Based Evidences
As the length of observations is different from the length
of an attack scenario, reconstructed attack scenarios may
differ by the manner in which observations are stretched
and aggregated together to generate intermediate states
of the execution. I-TLA defines non-timed events based
evidences in the form of predicates over I-TLA execu-
tions, which specify the modification pattern of variables
values through an execution. For instance, the execution
predicate AtSameTime, states that state predicate p1
switches to true at the same time the state predicate p2
switches to false.
12 1
(, ),:
ii ii
AtSameTime ppss
psps psp
 
 
Taking into consideration the availability of a non-
timed event-based evidence E, is amount to generate an
attack scenario ω such that E
4.1.4. Timed Events-Based Evidences
Starting from a set of available alerts, an investigator
can extract some indications related to occurred
events. I-TLA defines timed event-based evidence
as a set of ordered actions (A0 to Am)
that should be part of an expected execution. While the
order in which events appear should be respected, there
is no need that these events be contiguous. Given a timed
event-based evidence , an execution
]...,,[= 0m
 n
ss ...,=0
]...,,[= 0m
satisfies evidence E if and only if:
EAA xx  ),( 1
,( iss
sj), 1
such that
true and for some .
=),( 1iix ssA
1 i
4.1.5. Predicate-Based Evidences
With regards to the security response team’s members,
an unexpected system property is a preliminary argument
supporting the incident occurrence (e.g., the integrity of a
file was violated). I-TLA defines predicate-based evi-
dence as a predicate, say E, over system states, that
characterizes the system compromise. An execution
satisfies evidence E if E divides ω into two successive
execution fragments ω1 and ω2 (ω can thus be written as
ω=ω1ω2). ω1 is composed of secure states (1
), while ω2 is composed of insecure system states
4.1.6. Illustrative example
The following example clarifies the use of I-TLA in
digital investigation, and illustrates the mechanism of
handling evidences during the construction of potential
attack scenarios. We consider a system under investiga-
tion which is specified by three variables x, y, and z. The
initial system state, described in advance, is the state
defined by variables x, y, and z are all equal to 0. The
library of elementary actions contains two actions A1 and
A2 that can be executed by the system.
Action A1, for instance, keeps the value of variable x
in the new state unchanged with respect to the previous
state, and sets the values of y, and z in the new state 1
and 2 higher than its values in the old state, respectively.
Three different evidences are provided. The two first
ones represent history-based evidences, defined as
 
and E2=0,1,2, 3
 
i i
, where ε
stands for the invisible value. These evidences are gener-
ated by observation functions obs1() and obs2(), respec-
tively. The first observation function obs1(), allows a
security solution to only monitor variable x, meaning that,
when it is applied to a state s, makes the value of y and z
both equal to ε, and keeps the values of variable x un-
changed. The second observation function obs2() allows
a security solution to only monitor variable y. The order-
ing of observations indicates that observation provided by
obs2() allows to notice the occurrence of an incident before
the observation provided by obs1(). The third evidence E3, is
provided as a predicate-based evidence defined as
. The fourth evidence E4, defined as
 psps
p 
, is an
non-timed evidence, stating that predicate ,
false in a state si, could not switch to true in the next state
Copyright © 2009 SciRes. IJCNS
si+1, unless predicate 24pz
is true in that state.
Finally, evidence E
5, indicates that sequences of events
(A1, A2) is part of the attack scenario.
Figure 2 shows how I-TLA guarantees the satisfaction
of evidences during the construction of the potential at-
tacks. Two potential attack scenarios satisfying the avail-
able evidences are provided by I-LA, namely ω1 and ω2.
The first scenario ω1 is described as ω1=<s1, s3, s4, s7, s12,
s15>, and consists in consecutively executing the five
following actions A1A1A1A2A2. The second sce-
nario ω2 is described as ω1=<s1, s3, s5, s9, s11, s
18>, and
consists in a consecutive execution of the five actions
Starting from state s1, I-TLA cannot execute action A2
as it moves the system to a state that does not satisfy the
ordering of observations. In fact, the sub-scenario <s0,
s1> is observed by obs1() as <0εε,1εε> and by obs2() as
<ε0ε>. Thus, the event A2 is detected by E1 but not by E2.
Starting from state s4, I-TLA does not execute action A2
as it moves the system to a state that violates evidence E4.
State s8 could not be considered in the construction
process as it violates predicate E3. In fact, the predicate
p1 has become already true in state s3 and should not
change to false in state s8 again. I-TLA discards states s13,
s14, and s19 as each one of them would create an execu-
tion that violates evidence E2 if appended to the scenario
under construction. In the same context, state s16 is also
not added to the scenario under construction as it creates
an execution that violates E1.
4.2. Handling Anti-Investigation Attacks
To elude the process of digital investigation, a seasoned
attacker may try to conduct an anti-investigation attack
[23, 24] to remove, hide, obfuscate, or alter available
evidences after breaking into the system. Available tech-
niques include deletion of relevant log entries, installa-
tion of root-kits, steganography, and even wiping of
disks [25] to disable any further recovery.
Let obs(-) be an available observation function, and
OBS be a history-based observation which corresponds
to the output of obs(-) when executed on the attack sce-
nario under progression, say ω. Formally, an anti- inves-
tigation attack represents any action which moves the
system to some state, say sj in ω, and does not only ap-
pend obs(sj) to OBS, but also affects OBS to modify any
content related to obs(s0,…,sj-1).
We remind that I-TLA reconstructs attack scenarios by
executing an action unless it satisfies all the available
history-based evidences. Let si be the current state reach-
able from the initial state s0 through the execution
<s0,…,si-1>. If an I-TLA action A is executed from state
si to produce state si+1, the execution obtained after
reaching the new state should satisfy the available ob-
servation. Formally, . We dem-
onstrate in the following the impact of the anti-investigation
attack on the process of attack scenarios reconstruction in
I-TLA using the regular definition of observation functions.
OBSssobsi )...,,( 10
4.2.1. Example
We consider a system modeled using two variables p and
l which are related to the user granted privilege and the
content of the system log file, respectively. Variable p
can take three values: 0, 1, and 2 which stand for no ac-
cess, unprivileged access, and privileged access, respec-
tively. As the log file is typically accessed in append
mode, variable v takes a series of values representing the
commands executed on the system. These values are
included in chronological order of their execution.
50 ..,,=ss
represents an attack scenario composed of
five states, where actions A1, A2 and A3 stand for the
execution of arbitrary commands. Every one of these
actions appends an entry to the tail of the log file. A4 con-
sists in exploiting vulnerability on the system to get a
privileged access. Action A5 is an anti-investigation at-
tack. It consists in getting a privileged access on the sys-
tem and altering the content of the log file by deleting the
entry corresponding to the execution of action A2. The
attack scenario ω is described as a series of six states,
where every state is a valuation of the two variables, and
edges are labeled by the executed action.
1, 1
,1, Act 2
321 ,,,1,ActActAct 4A
321 ,,,2,ActActAct
 31 ,,,2ActAct
We consider a security solution which is modeled by
the observation function obs(). The latter allows observ-
ing the current executed commands on the system by
looking for new entries appended to the log file. For-
mally, obs (s)=tail(s(log)), where tail(x) returns the last
entry in x. Using the regular definition of observation
function obs(), the history-based evidence generated by
the security solution further to the execution of the attack
scenario is given by:
(), (), (), (), (), ()
=, ,,
obssobss obssobss obss obss
Act ActAct
 
Starting from this definition, it is expected that the
provided observation content will be in the form of <-,
Act1, Act2, Act3,>. However, since this history-based evi-
dence is provided by the content of the log file, which is
retrieved after the attack, only the content <-, Act1, Act3,>
will be visible. The difference between the expected and
the available output is due to the execution of the
anti-investigation attack.
If this history-based evidence is considered during the
reconstruction of the attack scenario, starting from state s2,
Copyright © 2009 SciRes. IJCNS
Copyright © 2009 SciRes. IJCNS
ϭ͕Ϭ͕Ϭ Ϭ͕ϭ͕Ϯ
Ϭ͕Ϯ͕ϰ ϭ͕ϭ͕ϭ
Figure 2. I-TLA attack scenario specification: an illustrative example.
action A2 cannot be executed since it does not provide a
state whose observation is included in the evidence. In other
words, there is no entry in the log file after the one corre-
sponding to the execution of A1, which shows an indication
regarding the executed action A2.
Starting from this statement, we describe in the sequel
a new observation function which allows coping with
anti-investigation attacks.
4.3. Observations
Let {v1,...,vn} represents the set of system variables. A
variable in this set could represent a system component
which is accessed in append mode (e.g., log or alert file,
raw traffic capture) and takes a value or a series of values.
Let v x(s) and Card(v(s)) represents the xth value, and the
number of values, in the series v(s), respectively. We
denote by the operation which consists in
superimposing the series v(si) on the series v(si-1) while
keeping elements in v(si) and discarding those situated
beyond the limit of the intersection. Formally,
()( )
vs vs
v sv
() (()]
vs vss
(( i
=[ (())
11 111
=[()( ),...,()( )]
i iiinini
ssvsvsv sv s
  (7)
By applying *()obs
function on the scenario ω pro-
vided in the example of subsection 4.2, we obtain obs*(ω)=
<-, A1, A3>. The output of this function is equal to the con-
tent retrieved from the system after the execution of the
attack scenario which included an anti-investigation attack.
Theorem 1: Given an executed attack scenario ω, and
an observation function obs(-). If *() ()obs obs
, the
attack scenario ω includes an anti-investigation attack.
Proof: We suppose that *() ()obs obs
and there is
no anti-investigation attack in the scenario ω. In the fol-
lowing we will disapprove this proposition.
Let v be an observable variable (with regard to obs(-)
function). Typically, since the variable v is in append
mode, and the modifications are introduced to the tail of
the series, the xth value in v(si) should be the one corre-
sponding to the xth value in v(si-1). In the absence of
anti-investigation attack, none action executed from state
si would modify the xth value in v(si-1). Formally, the fol-
lowing condition should be satisfied:
ii i
Cardsv sv s
 (8)
Therefore, . Assuming that the
investigated system is modeled using only variable v and
the attack scenario is composed of two states si and si-1,
we obtain and
()( )=( )
ii i
vs vsvs
s ssobs s
ob *()= ()obs obs
The proposition is therefore disapproved.
We denote by obs*(ω) a new observation function
over the executed attack scenario which allows capturing
the situation where an anti-investigation attack has been
conducted. Formally,
obs ssobsssobs s
  (6)
4.4. From IRPCM to I-TLA Specification
where Starting from the IRPCMs built by the IRT, useful in-
formation, in the form of symptoms, unauthorized results,
or actions, will be extracted and used to formally de-
scribe different type of evidences with I-TLA. We denote
by useful information, any concept in the IRPCM having
a degree of activation value that exceeds some prede-
fined threshold, denoted by extraction threshold.
Symptoms are typically extracted from log files, traf-
fic capture, or even keystrokes. They can be traduced to
history-based evidences by transforming the whole con-
tent of the log file (including the record indicating the
symptom itself) into an I-TLA history-based evidence.
Symptoms extracted from alerts files indicate the occur-
rence of an events whose position in the reconstructed
attack scenario cannot be determined. They will typically
be transformed to non-timed I-TLA based evidence.
Actions selected from an IRPCM represent steps taken
by a user or a process in order to achieve some result. A
well intentioned reader has noticed that actions in the
I-TLA library and actions in the IRPCM may not have
the same form, and are not of the same granularity. In
fact, an IRPCM action can be traduced to one or several
consecutive I-TLA actions. In this context, for every
selected IRPCM action an investigator has to extract
sequence of elementary actions from the I-TLA library.
The different obtained sequences will represent timed
events-based evidences.
Unauthorized results represent unauthorized conse-
quence of events. They are traduced to I-TLA predicate-
based evidences. An investigator has to identify the sys-
tem variable affected by the unauthorized consequence
and then use it to describe the evidence.
Since the attack scenario may integrate anti-investiga-
tion attacks, the investigator has to locate in the IRPCM
the set of concepts that are linked by anti-investigation
relations (i.e., conceal, destroy, forge, and replace). Since
anti-investigation attacks are executed to compromise
evidences, the investigator has to determine which of the
system variables, specified with I-TLA, describe the
content of these compromised evidences. After that, it
has to select observation functions that are defined to
observe the content of these affected variables. This fea-
ture is highly essential for the reconstruction of the attack
5. Executable Scenarios Generation Using
To automate the proof in the context of digital investigation
and generate executable attack scenarios showing with
details how the attack was conducted and how the system
progressed for each action part of the scenario, I-TLC
[21], a model checker for I-TLA+ specifications can be
used. I-TLC is somehow an extension to TLC, the model
checker of TLA+ specification. It works by generating an
optimized directed graph of states representing the space
of possible scenarios generated from the I-TLA+ specifi-
cation. Despite checking that some types of computation
are impossible as they violate safety properties, I-TLC
aims to reconstruct execution (i.e., potential attack sce-
narios) that satisfy each form of evidences supported by
I-TLA. I-TLC provides a novel concept entitled hypo-
thetical action, defines techniques for its generation and
management, and improves the representation of states.
The directed graph is built by ensuring that a given node
is reachable under optimal sets of hypothetical actions.
5.1. I-TLC’s States Representation
I-TLC represents a node in the graph as a tuple of two
information: node core and node label. The core of a
node represents a valuation of the entire system vari-
ables, and the node label represents the potential sets of
hypothetical actions under which the node core is
reached. A reading of the node label indicates a) the
state of the system in the current node, and b) the alter-
natives (hypothetical action sets) under which the sys-
tem state is reachable.
}}],{},,{{[(1,3), 3241HHHH represents an example
of a node which can be represented by the graph gener-
ated by I-TLC. (1,3) is the node core, {H1,H4} and
{H2,H3} represent the set of hypothetical actions under
which the node core is reachable, and {H1, H4}{H2, H3}
is the node label. This representation means that the sys-
tem state (1,3), representing a valuation of the two vari-
ables x and y, respectively, is reachable under one of the
two sets of hypothetical actions {H1,H4} or {H2,H3}.
5.2. Generation of Hypothetical Actions
Generation of potential attack scenarios may fail if the
library of actions is incomplete. In fact, for the particular
case of attack scenarios that involve the use of unknown
techniques, the system may come at some state while
being unable to reach another state that if appended to
the scenario under construction, will make it satisfy all
the available evidences. To alleviate this issue, I-TLC
tries to generate a hypothetical action and append it to
the graph under construction, whenever available evi-
dences are not completely satisfied.
The idea behind the generation of hypothetical actions
is based on the fact that unknown actions can be gener-
ated if additional details about internal system compo-
nents (i.e., those abstracted by the specification) are
available. This detail involves a description of how these
internal system components are expected to behave (if
atomic actions are executed on them) and how they de-
pend on each other. These internal system components
are modeled by a specific set of variable denoted by in-
ternal variables. The other variables specified by I-TLA
are denoted by external variables.
Semantically, a hypothetical action is true or false for
Copyright © 2009 SciRes. IJCNS
a pair of states <s, t>. Syntactically, a hypothetical action
is modeled as a series of hypothetical atomic actions,
executed one after the other from state s to move the
system to state t. It is defined in the following form
. defines a mapping from
the external variables values to the internal variables
values in state s and defines a mapping from the
internal variables to the external variables in state t. The
set of hi (i from 0 to n) represents executed hypothetical
atomic actions. A hypothetical atomic action hi only
modifies a single internal variable, and represents a rela-
tion between two consecutive internal system states.
During hypothetical actions generation, I-TLC needs
access to the library of hypothetical atomic actions. This
library describes all the potential hypothetical atomic
actions that can be executed on the investigated system.
einie mhhmH...= 0
During scenarios generation, several hypothetical ac-
tions may be appended whenever needed. I-TLC man-
ages hypotheses following the two key ideas. First as
hypotheses are not completely independent from each
others and some hypotheses are contradictory, I-TLC
avoids reaching a state under a contradictory situation. In
this context, the library of hypotheses indicates potential
contradictory sequence of hypothetical atomic actions.
Second, in order to ensure that generated hypothetical
actions are at the maximum close to real actions per-
formed on the system, I-TLC defines techniques to refine
the selection of hypothetical atomic actions.
5.3. Generation of Anti-Investigation Attacks
Typically, when an I-TLA action is to be executed,
I-TLC verifies whether it satisfies all available evidences,
especially history-based observations. Similarly to the
case of unknown actions (as discussed in the previous
subsection), the generation of potential attack scenarios
may fail if the history-based observations were compro-
mised using an anti-investigation attack. To cope with
such an issue, I-TLC handles separately actions which
modify the compromised evidences (with regard to his-
tory-based observations detected by I-TLA in the previ-
ous phase to be compromised using anti-investigation
Let be the set of observations compromised by
anti-investigation attacks, and V be the set of variables
affected by these attacks. I-TLC could execute, during
the reconstruction of the attack scenario, an action, say A,
which do not create states satisfied by the available ob-
servations in , provided that the executed action modi-
fies at least a variable in V. In the sequel, an action which
satisfies the above conditions will be entitled Prep-anti-
investigation action.
To support the generation of Prep-anti-investig ation ac-
tions, heuristics can be used so that only accurate actions
will be integrated to the attack scenario under construction.
These heuristics exploit the values that could be taken by
some other variables in the execution. For instance:
Execute an action A if one of the collected evi-
dences, namely the history-based evidences, shows that
later the user will get a privileged system access. Such
condition would mean that an anti-investigation attack
can potentially be executed.
Discard the attack scenario under construction if the
number of states between the first generation of action A
and the execution of the anti-investigation action has
exceeded a threshold.
While staring from the first generation of Prep-anti
-investigation action, the generated attack scenario will
no longer satisfy the available observations, I-TLC
should verify later that, further to the execution of some
I-TLA action, which will typically be an anti-investigation
attack (included in I-TLA as evidence), the attack sce-
nario under construction, say ω, becomes satisfied by all
evidences. For a completely generated attack scenario,
say ω, which included, at some step in the execution, an
anti-investigation action, I-TLC should verify that
() ()obs obs
5.4. Inferring Scenarios with I-TLC
To generate potential scenarios of attacks, DigForNet
uses I-TLC Model Checker, which follows three phases.
The reader is referred to [21] for a detailed description of
I-TLC algorithms.
5.4.1. Initialization Phase
During this step, the generated scenarios graph is initial-
ized to empty, and each state satisfying the initial system
predicate is computed and then checked whether it satis-
fies the system invariants and the set of evidences. In that
case, it will be appended to the graph with a pointer to
the null state, and a label equal to (as no hypotheti-
cal action is generated).
5.4.2. Forward Chaining Phase
The algorithm starts from the set of initial system states,
and computes in forward chaining manner the entire suc-
cessor states that form scenarios satisfying evidences de-
scribed in I-TLA. Successor states are computed by exe-
cuting an I-TLA action or by generating a hypothetical
action or Prep-anti-investigation action, and executing it.
When a new state is generated, I-TLC verifies if an-
other existing node in the graph has a node core equal to
that state. If the case is false, a new node, related to the
generated state, is appended to the graph under construc-
tion, and linked to its predecessor state. If the case is true,
the label of the existing node is updated so that it em-
bodies the set of hypothetical actions under which the
new system state is reachable. During label update,
I-TLC ensures that each node label is provided with the
Copyright © 2009 SciRes. IJCNS
following properties: soundness (a node holds the set of
hypothetical actions under which its core is reachable),
consistency (none set of hypothetical actions in the node
label is an inconsistent or contradictory one), complete-
ness (every set of hypothetical actions in the node is a
superset of some other hypothetical actions), and mini-
mal (none set of hypothetical actions is a proper subset
of any other). If the scenario yielding the new generated
state satisfies all the evidences, the system state is con-
sidered as a terminal state.
5.4.3. Backward Chaining Phase
All the optimal scenarios that could produce terminal states
generated in forward chaining phase and satisfy the
available evidences, are constructed. This helps obtain-
ing potential and additional scenarios that could be the
root causes for the set of available evidences. During this
phase, the algorithm starts with a queue holding the set
of terminal states generated in forward chaining phase.
Afterwards, and until the queue becomes empty, the tail
of the queue is retrieved and its predecessor states is
computed. The new generated states are managed and
appended to the graph under construction with the same
manner followed in forward chaining phase.
All potential scenarios are supposed to be generated
by I-TLC. The only exception may occur due to the lack
of actions in the library of elementary actions. Nonethe-
less, the use of hypothetical actions concepts allows alle-
viating this problem.
6. Case Study
To demonstrate how DigForNet works, we provide in the
following a case study related to the investigation of a
compromised Linux Red Hat 7.2 operating system,
which was deployed as a Virtual Honeypot in a VMWare
session. The compromised system was suspended with
VMWare immediately after the attack and a live image
was created and posted by the Honeynet Project1 for
investigation. This case study deals with an investigation
of a live system, the attack is highly complicated and
requires advanced digital investigation skill knowledge,
and the conducted scenario integrates several anti-
investigation actions. In this case study, we will start by
describing the attack. Then, we will show the use of
DigForNet to investigate such incident.
6.1. Attack Description
First, the attacker probed the HTTP server from the ma-
chine identified by the IP address Then,
the attacker tried to exploit the Apache SSL handshake
bug. Using this vulnerability, he gained a remote access
as the Apache user. After that, he escalated his privilege
and gained root access. At this level, the attacker con-
ducted many attempts to install a rootkit. Only one of
these attempts has succeeded. The following paragraphs
describe the rootkits installation attempts.
The attacker has downloaded the tarball rk.tar.gz from Obviously, he then
installed the rk.tar.gz. This install operation infected some
binary files on the system, including ifconfig, ls, netstat,
ps and top, and saved their original version in /usr/lib/
libshtift. When the install script of rk.tar.gz finished the
installation process, some system files (such as /bin/ps)
have been replaced; mails with information about the
system have been sent to and; new unknown processes have
been run as daemon; and the log file has been deleted to
hide the attacker actions. After this, the hacker downloaded
other tools including abc.tgz, an installation script for the
current SSH server; and mass2.tgz, which is an exploit
used to hack the server. However, the attacker has failed
to stop the SSH daemon and has installed an SSH server
under the file name “smbd -D”. The attacker does not even
know the backdoor password. So, he carried out a novel
attempt. He downloaded adore rootkit and tried to install
it. But the install operation failed.
The attacker did not give up. He again gained a root
access. This time, the attacker used the program gods (a
shell script from, to download adore LKM
and an SSH server. After this, the attacker has installed
the SucKIT rootkit using the installation script inst. This
time, the rootkit installation has succeeded. The attacker
also run xopen and lsn programs and moved /lib/.x/.boot
from /var/tmp/.boot. After this, the attacker has con-
nected to the FTP server identified by the IP address Then, the file /root/sslstop.tar.gz has been
moved from /lib/.x/s.tgz. It contains the sslstop program
which modifies httpd.conf to disable the SSL support.
The program sslport modifies httpd.conf to change the
default SSL port (443) to another port (3128 in this case).
The primitive HAVE_SSL has been replaced by HAVE_
SSS in /etc/httpd/conf/httpd.conf. This indicates that sslstop
has been run. In addition, the attacker downloaded psyBNC
from psybnc and installed it.
This program is used to hold open an IRC connection
and run a proxy IRC in order to hide the user’s IP ad-
dress. Using psyBNC, the user sic has connected from to, an
IRC server. He has created an account named redcode.
6.2. IRPCM Construction
The generated IRPCM is given by Figure 3. Actions, symp-
toms, and unauthorized results in this IRPCM are de-
picted by plain, dashed, and dotted ellipses, respectively.
To construct the IRPCM, we used the evidences which
*Honeynet Project-Scan of the Month #29
Copyright © 2009 SciRes. IJCNS
Copyright © 2009 SciRes. IJCNS
were collected on the compromised system. Three con-
cepts in the form of symptoms, namely S1, S2, and S3, are
initially appended to the IRPCM. Symptom S1 indicates
that the httpd log file contains suspicious entries showing
potential exploit of ssl. Symptom S2 indicates the exis-
tence of suspicious connections to the web server from in the /var/log/ messages. Symptom S3
shows that the web server banner is indicating a vulner-
able version of Apache/OpenSSL. These three symptoms
are linked to the concept A1 which represents the action
“Execution of mod_ssl/OpenSSL exploit”. The latter action
leads to the creation of the following unauthorized result,
denoted by U1 “Unauthorized access to the system with
privileged rights” meaning that the intruder can execute
some commands on the compromised system. Therefore,
an edge is appended to the IRPCM from the concept A1
to the concept U1, creating a Cause/Effect relation.
The attacker reconnected to the system from the IP
address In the IRPCM this action is
defined by the concept A2, succeeds the action A1, and
precedes the action A3 which represents a tentative to
install the Adore rootkit. Action A3 is vindicated by the
content of log files and some mails from Apache indi-
cating a failed installation of the Adore rootkit. In this
context, the symptom S5 is linked to the action A3.
Always using the privileged access, the attacker down-
loaded rk.tar.gz from geocites and installed the rootkit it
contains. This is shown by the action A4 in the IRPCM.
The latter is vindicated by the content of swap-colon.txt
(symptom S8 in the IRPCM). This rootkit leads to the
unauthorized result U2 which indicates that an unauthor-
ized installation of programs on the system was per-
formed. The attacker succeeded to install other programs
such as a port scanner called sl2. Such activity is repre-
sented by action A5 in the IRPCM. It is vindicated by
some entries in the swap-colon.txt file too. By the com-
pletion of the installation, the attacker erased the log in-
stallation history of the tools contained in rk.tar.gz. Since
this behavior represents an anti-forensic attack, the con-
cept A4 is linked to the concept S9 using a destruction
relation. The investigation process did not show any fur-
ther use of this rootkit.
After this, the attacker used his privileged access to
run the /lib/.x/hide script (action A6 in the IRPCM) in
order to destroy the symptoms “gods is running” and
“inst is running”. Two destruction relations are therefore
created from the action A6 to the symptoms S10 and S11,
respectively. The two latter symptoms give a proof re-
garding the execution of actions “Install SucKIT” and
“Download SucKIT”, respectively. In reality, the at-
tacker installed the SucKIT rootkit as proven by the /
partition analysis which shows the use of gods, a script
used to download SucKIT, and inst, a script used to in-
stall this rootkit. SucKIT installation led to the unauthori-
T rusted ifconfig outp ut:
network card in
promiscuous mode
Files analysis:
Change time set to August 10, 2003
For /dev/ttyop, /dev/ttyoa, /dev/tty of,
/dev/hdx2, /dev /h dx1
T ru ste d N et stat ou tp ut:
Ports 2003, 3128, 65336
and 65436 are open
Trusted lsof output:
smbd (pid 3137) has TCP
ports 80, 443 and 2003 open
and many HTTPd files
T rusted lso foutput:
swapd(pid3153) has
TCP ports 80 and 443
open and
many HTTPdfiles
T rusted lsof output:
xopen (pid 25241) has
files from /lib/.x/ open
and TCP ports 80, 443
and 3128 open
Install the swapd
network sniffer
Install sl2
port scanner
Tentative to install
Adore rootkit
Install SucKITrootkit
Review of xinetd config:
/etc/ftpaccess is configured
to allow anonymous access
Remote FTP access to
Attempt any of the exploits
T rusted date output:
Sunday 10 August 2003
httpd log file analysis:
Suspecious entries showing
Potential exploit of ssl
web server banner:
Vulnerable v ersion of
Suspecious connections to
the web server from
Execution of
mod_ssl/OpenSSL exploit
Parameter settings of
/dev/ttyop, /dev/ttyoa,
/dev/ttyof to hide ports,
addresses and files
Connection from
213.154.1 18.218 t o spawn
shell on port 2003
Content of rk.tar.gz:
logclear, sense, sl2, etc .
Install rk.tar.gz
Review of swap-colon.txt:
Successful download of rk.tar.gz
Disk sectors:
Log installation history
/ partition analysis:
gods script is run
Run the /lib/.x/hide
scriptto hide
installation output
Run xopen SSH server
httpd.conf analysis:
HAVE_SSL has been
replaced by HAVE_SSS
Run sslstopand sslport
t10 I/O
Install psyBNC,
an IRC proxy
Tr usted lsof output:
psyBNC is running
Connection to
an IRC server
/ partition analysis:
/root/sslstop.tar .gz was
moved from /lib/.x/s.tgz
Connection from
213.154.1 18.21 9 to spawn
Shell on port 2003
Download SucKITrootkit
/ partition analysis:
inst script is run
Unauthorized access to the
system with privileged rights
t1 ’’’
programs install
t4 ’’’
Unauthorized programs
with other crackers
Unauthorized programs
download via FTP
Log files analysis:
Mails from apache
Indicating a failed install
Figure 3. IRPCM related to the attack against the VMWare Linux honeypot.
Copyright © 2009 SciRes. IJCNS
zed result “unauthorized programs install/ modification”,
namely U3. This rootkit was installed on 10 August 2003
as indicated by a trusted version of the date command.
This information was provided by symptom S12 which is
linked to the action A8. The action representing the SucKIT
installation was followed by action A9 denoted by “run
xopen SSH server”. This action is vindicated by the
symptom S12 which is provided by the output of a trusted
version of the lsof command. The latter shows that xopen
is running on the compromised system. This output
shows also that SSL is using port 3128 instead of 443.
After this, the attacker executed action A10 to run sslstop
and sslport programs. The content of the concept A10 is
vindicated by symptom S10. The latter is indicated by the
analysis of /etc/httpd/conf/httpd.conf which shows that
the primitive HAVE_SSL was replaced by HAVE_SSS.
The script sslstop modifies httpd.conf to disable the SSL
support. sslport modifies httpd.conf to change the default
SSL port (443) to something else (3128 in this case) and
then to conceal any port scanner output that can provide
the symptom informing about the use of SSL. A conceal-
ment relation is appended from Action A10 to symptom
S14 in the IRPCM.
After installing the SucKIT rootkit, the attacker closed
the first connection and reconnected to the same web
server from This action, namely A12,
succeeds the action A11 in the IRPCM, which consists in
setting the parameters of /dev/ttyop, /dev/ttyoa and /dev/
ttyof to hide processes, addresses and files, respectively
and then to conceal symptoms such as NetStat output.
Action A11 conceals also the action of connecting to the
server A concealment relation is cre-
ated from the concept A11 to the concept A12.
Using the new shell, the attacker conducted three other
actions. He first installed the swapd network sniffer (Ac-
tion A14 in the IRPCM). This action is supported by the
two symptoms S19 and S20. S19 indicates that a trusted
version of ifconfig showed that the network card was in
promiscuous mode. S20 represents the output of a trusted
version of the lsof command indicating that swapd was
running using the pid 3153. Second, the attacker exe-
cuted a remote FTP access (Action A16 in the IRPCM).
This action is vindicated by the concept S18 representing
the analysis of xinetd configuration. It shows that /etc/
ftpaccess is configured to allow anonymous access. Ac-
tion A16 leads to the unauthorized U5 showing that an
unauthorized downloading of programs via FTP was per-
formed. Third, the attacker installed psyBNC (Action A13
in the IRPCM) which is an Internet Relay Chat (IRC)
proxy. This action conceals the unauthorized result U4
which shows a communication with other crackers. By
concealing U4, the IRC program allows communications
without revealing the intruder identity. Action A13 is vin-
dicated by the symptom S17. In fact the execution of a
trusted lsof command on the compromised system shows
that psyBNC is running. Using psyBNC, the attacker con-
nected to the IRC server (Action
A15 in the IRPCM). An edge labeled by a concealment
relation is created from the concept A13 to the concept S17.
6.3. Extracting Evidences from IRPCMs
We model the investigated system using six variables;
namely Pr, hhtplog, port2003, ConAddr2003, SorftLog,
and AppSoft. They represent the system privilege granted
to the remote user (i.e., the attacker), the tail of the con-
tent of the web service http log file, the service running
on port 2003, the IP address connected to port 2003, the
content of residual software installation logs, and the
additional software installed on the system.
The evidences extracted from the IRPCM in conjunc-
tion with the library of elementary actions are then used
by the I-TLA logic to specify the set of potential attack
scenarios. For the sake of space, we will only consider a
specific part of the IRPCM and we will describe the re-
lated I-TLA specification. Concepts in the IRPCM hav-
ing a degree of activation that exceeded a pre-defined
threshold are traduced into I-TLA evidences.
The concept “Disk sectors: Log installation history” is
traduced to history-based evidence in I-TLA. This evi-
dence, which is provided by the log installation file, al-
lows monitoring the content of variable SoftInsLog. The
provided evidence represents an observation over such
variable. Since it was targeted by an anti-investigation
attack, it is equal to <-> showing that none log file,
which could be left by the installed software, is on the
system. Some concepts from the IRPCM, in the form of
actions, are mapped to I-TLA actions. For instance, the
two actions “Execution of mod_ssl/OpenSSL exploit”
and “Install rk.tar.gz” are traduced to the two following
I-TLA actions, say A1 and A2, respectively:
2003=" //"
2003, ,
httplog modsslattack
portbin sh
ConAddrSoftInsLog AppSoft
 
2003 =<>
2003 =<>
,, 2003,,
Pr httplogportConAddrAppSoft
Action A1 can be executed to compromise the web
service using the SSL vulnerability. It consists in induc-
ing the system to grant a shell on port 2003. Further to
the execution of such action, an entry is appended to the
HTTP log file showing that a suspicious behavior has
occurred. Since, the exploitation of the OpenSSL vul-
nerability gives access using the privilege of the Apache
user, variable Pr gets value 1. This value means that the
access level is more privileged that the user access but
less privileged than the root one.
Action A2 cannot be executed unless variable Pr is
equal to 2 to mean that the user should have gained a
root privilege on the system. The action consists in hid-
ing the execution of services on port 2003, and the con-
nection of suspicious hosts on port 2003, if the Linux
commands ps and netstat are used. Actions A1 and A2
stand for timed event-based evidence showing that A1 is
executed before A2 and both of them are part of the con-
ducted attack scenario.
6.3.1. Executable Scenarios Generation by I-TLC
I-TLC was used to generate executable specification [18],
of the potential attack scenarios from I-TLA specifica-
tion. One potential attack scenario is generated and is
shown by Figure 4. The scenario is composed of nine
states where every state shows the value of the modeled
variables. Edges linking states, are labeled by the name
of the executed action.
The system starts with an empty HTTP and system log
file. The attacker first connects to the web service and
runs the modssl exploit to get system access with the
Apache user privilege. After that, it connects to the shell
granted on port 2003. Thus, variable ConAddr2003 gets
the value of the IP address of the remote user. After that,
the attacker makes a tentative to install the rk rootkit.
The operation is logged to the installation log of this tool.
However, since it fails, no files were integrated to the
system directory and variable AppSoft remains unchanged.
Later, the attacker conducts a storage-based anti-inves-
tigation attack to hide the content of the installation log
file. It reconnects from another host identified by the IP
address, escalates its privilege, installs
the suckit rootkit, and configures it to hide the execution
of the shell on port 2003 and connected the IP address.
In this execution, it can be noticed that the execution
of action A2 creates state s3 which does not satisfy the
content of the history-based evidence. In fact, since an anti-
investigation attack was executed and detected during the
IRPCM construction, I-TLC has allowed the execution
of A2 because it modifies the content of variable SoftIns-
Log which was affected by the anti-investigation attack.
6.3.2. Hypothetical Actions Generation
I-TLC has generated some hypothetical actions. For the
lack of space, we only kept one hypothesis among those
generated. Starting from state s6, I-TLC could not find an
action described in I-TLA specification, which, if exe-
cuted, lets obtaining a potential attack scenario that satis-
fies the history-based evidence. I-TLC looks within the
^ŽĨƚ/ŶƐ>ŽŐ сфͲх
ŽŶĚĚƌϮϬϬϯсфͲ х
^ŽĨƚ/ŶƐ>ŽŐ сфͲх
ŵŽĚƐƐů ĞdžƉůŽŝƚ
^ƵĐ</d ƌŽŽƚŬŝƚ
^ŽĨƚ/ŶƐ>ŽŐ сфͲх
^ŽĨƚ/ŶƐ>ŽŐ сфƌŬ ƌŽŽƚŬŝƚ ŝŶƐƚĂůůх
^ŽĨƚ/ŶƐ>ŽŐ сфͲх
^ŽĨƚ/ŶƐ>ŽŐ сфͲх
^ƵĐ</d ƌŽŽƚŬŝƚ
^ŽĨƚ/ŶƐ>ŽŐ сфͲх
Figure 4. Fragment of the generated executable specification by I-TLC.
Copyright © 2009 SciRes. IJCNS
Copyright © 2009 SciRes. IJCNS
library of hypotheses and generates, a, hypothetical ac-
tion H, and executes it to move the system to state s7.
The hypothetical action consists in uploading an exploit
to the compromised system and executing it to get a root
privilege. Further to the execution of the hypothetical
action, variable pr gets value 2, and the value exploit is
appended to the content of variable AppSoft. Later ac-
tions A7 and A8 are executed. I-TLC specifies that states
s7, s8 and s9 are reachable under the hypothetical action H
by setting their label equal to the singleton H.
7. Conclusions
In this paper, we have developed a system for digital
investigation of networks security incidents. This system
uses formal techniques as well as the IRT members’
knowledge to analyze the attacks performed against the
networks. We have introduced the intrusion response
probabilistic cognitive maps that are constructed by the
IRT upon the occurrence of the attack. A formal language
has been introduced to help specifying the attack scenar-
ios based on the cognitive map. A model checker was
built to automatically extract the attack scenarios and a
hypothetical concept is introduced here to help in the
construction process. To illustrate the proposed system,
we used it in a real case of security attack.
8. References
[1] P. D. Dixon, “An overview of computer forensics,” IEEE
Potentials, Vol. 24, No. 5, pp. 7–10, 2005.
[2] P. Stephenson, “Modeling of post-incident root cause
analysis,” International Journal of Digital Evidence, Vol. 2,
No. 2, pp. 1–16, 2003.
[3] T. Stallard and K. Levitt, “Automated analysis for digital
forensic science: Semantic integrity checking,” Proceed-
ings of the 19th Annual Computer Security Applications
Conference, Las Vegas, USA, 2003.
[4] P. Gladyshev, “Finite state machine analysis of a blackmail
investigation,” International Journal of Digital Evidence,
Vol. 4, No. 1, 2005.
[5] P. Gladyshev and A. Patel, “Finite state machine approach
to digital event reconstruction,” Digital Investigation journal,
Vol. 1, No. 2, pp. 130–149, 2004.
[6] B. D. Carrier and E. H. Spafford, “Categories of digital
investigation analysis techniques based on the computer
history model,” Digital Investigation Journal, 3(S), pp.
121–130, 2006.
[7] S. Willassen, “Hypothesis-Based investigation of digital
timestamps,” Proceedings of Fourth Annual IFIP WG
11.9 International Conference on Digital Forensics, Kyoto,
Japan, 2008.
[8] S. Y. Willassen, “Timestamp evidence correlation by
model based clock hypothesis testing,” Proceedings of
the 1st International Conference on Forensic Applications
and Techniques in Telecommunications, Information, and
Multimedia, 2008.
[9] A. R. Arasteha, M. Debbabi, A. Sakhaa, and M. Saleh,
“Analyzing multiple logs for forensic evidence,” Digital
Investigation, Vol. 4, No. 1, pp. 82–91, 2007.
[10] A. Pal, H. T. Sencar, and N. Memon, “Detecting file frag-
mentation point using sequential hypothesis testing,” Digital
Investigation, Vol. 5, No. 1, pp. S2–S13, 2008.
[11] S. P. Peisert, “A model of forensic analysis using goal-
oriented logging,” PhD thesis, University of California,
San Diego, 2007.
[12] A. S. Huff, “Mapping strategic thought,” John Wiley &
Sons, 1990.
[13] J. Krichene, M. Hamdi, and N. Boudriga, “Collective com-
puter incident response using cognitive maps,” IEEE In-
ternational Conference on Systems, Man and Cybernetics,
Hammamet, Tunisia, pp. 1080–1085, 2004.
[14] S. Rekhis, J. Krichene, and N. Boudriga, “Dig for net:
Digital Forensic in networking,” In Proceedings of the
3rd International Information Security Conference (SEC),
Milan, Italy, 2008.
[15] B. D. Carrier and E. H. Spafford, “An event-based digital
forensic investigation framework,” Proceedings of Digital
Forensic Research Workshop, 2004.
[16] B. Mangnes, “The use of Levenshtein distance in com-
puter forensics,” Master’s thesis, Gjovik University Col-
lege, 2005.
[17] E. Casey, “Digital evidence and computer crime,” Second
Edition, Academic Press, 2004.
[18] D. Drusinsky and J. L. Fobes, “Executable specifications:
Language and applications,” The journal of Defense Soft-
ware Engineering, Vol. 17, No. 9, pp. 15–18, 2004.
[19] Y. Guan and A. K. Ghose, “Executable specifications for
agent oriented conceptual modelling,” Proceedings of the
IEEE/WIC/ACM International Conference on Intelligent
Agent Technology (IAT), France, pp. 475–478, 2005.
[20] M. Hamdi, J. Krichene, and N. Boudriga, “Collective com-
puter incident response using cognitive maps,” Proceedings
of IEEE conference on Systems, Man, and Cybernetics
(IEEE SMC 2004), The Hargue, Netherland, 2004.
[21] S. Rekhis and N. Boudriga, “A formal approach for the
reconstruction of potential attack scenarios,” Proceedings
of the International Conference on Information & Com-
munication Technologies: From Theory to Applications
(ICTTA), Damascus, Syria, 2008.
[22] F. Kröger and S. Merz, “Temporal logic and state sys-
tems,” Springer, 2008.
[23] S. Rekhis and N. Boudriga, “Formal forensic investiga-
tion eluding disk-based anti-forensic attacks,” Proceed-
ings of Workshop on Information Security Applications,
Jeju Island, Korea, 2005.
[24] S. Garfinkel, “Anti-forensics: Techniques, detection and
countermeasures,” Proceedings of the 2nd International
Conference on I-Warfare and Security, Monterey, USA, 2007.
[25] G. C. Kessler, “Anti-forensics and the digital investiga-
tor,” Proceedings of 5th Australian Digital Forensics
Conference, Perth, Australia, 2007.