J. Biomedical Science and Engineering, 2013, 6, 223-231 JBiSE
http://dx.doi.org/10.4236/jbise.2013.62A027 Published Online February 2013 (http://www.scirp.org/journal/jbise/)
Modeling of gene regulatory networks: A review
Nedumparambathmarath Vijesh, Swarup Kumar Chakrabarti, Janardanan Sreekumar
Central Tuber Crops Research Institute, Thiruvananthapuram, India
Email: sreejyothi_in@yahoo.com
Received 16 December 2012; revised 15 January 2013; accepted 22 January 2013
ABSTRACT
Gene regulatory networks play an important role in the
molecular mechanism underlying biological processes.
Modeling of these networks is an important challenge
to be addressed in the post genomic era. Several me-
thods have been proposed for estimating gene net-
works from gene expression data. Computational me-
thods for development of network models and analy-
sis of their functionality have proved to be valuable
tools in bioinformatics applications. In this paper we
review the different methods for reconstructing gene
regulatory networks.
Keywords: Gene Network; Gene Expression Data; Gene
Regulation
1. INTRODUCTION
A gene regulatory network or genetic regulatory network
(GRN) is a collection of DNA segments in a cell which
interact with each other indirectly (through their RNA
and protein expression products) and with other sub-
stances in the cell, thereby governing the rates at which
genes in the network are transcribed into mRNA. GRNs
provide a systematic understanding of molecular mecha-
nisms underlying biological processes [1-7]. The groups
of genes, regulatory proteins and their interactions are
often referred to as regulatory networks, whereas the
complete set of metabolites and the enzyme-driven reac-
tions constitute the metabolic networks. The nodes of
this network are genes and the edges between nodes rep-
resent gene interactions through which the products of
one gene affect those of another. These interactions can
be inductive (the arrowheads), with an increase in the
expression of one leading to an increase in the other, or
inhibitory (the filled circles), with an increase in one
leading to a decrease in the other. A series of edges indi-
cates a chain of such dependences, with cycles corre-
sponding to feedback loops.
Gene regulatory networks play a vital role in organism development by controlling gene expression. Understanding the structure and behavior of gene regulatory networks is a fundamental problem in biology. With the availability of gene expression data and complete genome sequences, several novel experimental and computational approaches have recently been developed which help to comprehensively characterize these regulatory networks by enabling the identification of their genomic or regulatory state components. Accurate prediction of the behavior of regulatory networks will also accelerate biotechnological projects, since such predictions are quicker and cheaper than lab experiments.
Creating accurate dynamic models of GRNs is gaining importance in biomedical research and development. Gene expression microarrays monitor the transcription activities of thousands of genes simultaneously, which provides great opportunities to explore large-scale regulatory networks. Constructing a GRN from expression data, a process called reverse engineering, is computationally hard: even a trivial approach requires an enormous amount of time. The computational models developed for regulatory network analysis can be roughly divided into four classes (Figure 1). The first class, 1) logical models, describes regulatory networks qualitatively. These models allow users to obtain a basic understanding of the different functionalities of a given network under different conditions. Their qualitative nature makes them flexible and easy to fit to biological phenomena, although they can only answer qualitative questions. To understand and manipulate behaviors that depend on finer timing and exact molecular concentrations, a second class of models was developed: 2) continuous models. For example, to simulate the effects of dietary restriction on yeast cells under different nutrient concentrations, users must resort to the finer resolution of continuous models. A third class of models was introduced following the observation that the functionality of regulatory networks is often affected by noise. As the majority of these models account for interactions between individual molecules, they are referred to as 3) single molecule level models. The fourth class comprises 4) hybrid models, which combine different techniques such as neural networks and fuzzy rules.
A complete gene regulatory network model incorpo-
rates experimental knowledge about the components and
N. Vijesh et al. / J. Biomedical Science and Engineering 6 (2013) 223-231
their interactions as well as the initial state of these components, and leads to the known final state or dynamical behavior of the network. Validated models can then be used to investigate cases that cannot be explored experimentally, for example changes in the initial state, in the components or in the interactions, and they can lead to predictions and insights into the functioning of the system, such as how robust the system is under extreme conditions. In this article we review the various modeling techniques for reconstructing gene regulatory networks.
2. MODELLING TECHNIQUES
Figure 1 illustrates the various gene regulatory network construction models that are discussed in the following sections.
2.1. Logical Models
The most basic and simplest modeling methodology is
discrete and logic-based, and was introduced by Kauff-
man and Thomas [8,9]. The reconstruction of the regula-
tory network that controls the development of sea urchin
embryos is a seminal example of the profound insights
that qualitative examination of regulatory network mod-
els can provide. This work demonstrates how maternal
cues initiate the activity of the regulatory network and
how this network orchestrates the developmental process.
Logical models represent the local state of each entity in
the system (for example, genes, proteins and small
molecules) at any time as a discrete level, and the tem-
poral development of the system is often assumed to oc-
cur in synchronous, discrete time steps. Entity levels are
updated at each time step according to regulation func-
tions. Discrete modeling allows researchers to rely on
purely qualitative knowledge. Such models can be ana-
lyzed using a broad range of well established mathe-
matical and statistical methods.
Figure 1. Classification of models.
2.1.1. Boolean Network
Boolean networks are a dynamic model of synchronous
interactions between nodes in a network. They are the
simplest network models that exhibit some of the bio-
logical and systemic properties of real gene networks
[10,11]. Because of their simplicity they are relatively
easy to interpret biologically.
A Boolean network is a directed graph $G(X, E)$, where the nodes, $x_i \in X$, are Boolean variables. To each node, $x_i$, is associated a Boolean function, $b_i(x_{i1}, x_{i2}, \ldots, x_{il})$, $l \le n$, $x_{ij} \in X$, where the arguments are all and only the parent nodes of $x_i$ in $G$. Together, at any given time, the states (values) of all nodes represent the state of the network, given by the vector $S(t) = (x_1(t), x_2(t), \ldots, x_n(t))$. For gene networks the node variables correspond to levels of gene expression, discretized to either up or down [12-14]. The Boolean functions at the nodes model the aggregated regulation effect of all their parent nodes. The states of all nodes are updated at the same time (i.e., synchronously) according to their respective Boolean functions:
$$x_i(t+1) = b_i(x_{i1}(t), x_{i2}(t), \ldots, x_{il}(t)).$$
All states’ transitions together correspond to a state transition of the network from S(t) to the new network state, S(t + 1). A sample network is shown in Figure 2.
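The synchronous update rule can be illustrated with a minimal sketch. The three-gene network below (genes `a`, `b`, `c` and their regulation functions) is a hypothetical example chosen for illustration; it is not the network of Figure 2, and the names `NETWORK`, `step` and `truth_table` are assumptions of this sketch.

```python
from itertools import product

# Hypothetical 3-gene network (an assumption, not taken from the paper):
# each gene maps to (names of its parent nodes, Boolean regulation function).
NETWORK = {
    "a": (("c",), lambda c: not c),            # a is repressed by c
    "b": (("a",), lambda a: a),                # b is activated by a
    "c": (("a", "b"), lambda a, b: a and b),   # c requires both a and b
}


def step(state):
    """Synchronous update: every gene is recomputed from the state at time t."""
    return {
        gene: bool(func(*(state[p] for p in parents)))
        for gene, (parents, func) in NETWORK.items()
    }


def truth_table():
    """Map every global state S(t) to its successor S(t + 1)."""
    genes = sorted(NETWORK)
    table = {}
    for values in product([False, True], repeat=len(genes)):
        successor = step(dict(zip(genes, values)))
        table[values] = tuple(successor[g] for g in genes)
    return table
```

Calling `truth_table()` enumerates all 2^n global states, which is the truth-table representation mentioned for Figure 2; the exponential size of this table is one reason why only small networks are tractable.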
LIMITATION: These models are ultimately limited by
their definition: they are Boolean and synchronous. In
reality, of course, the levels of gene expression do not
have only two states but can assume virtually continuous
values. Thus discretization of the original data becomes a
critical step in the inference, and often reducing the val-
ues to two states may not suffice. In addition, the updates
of the network states in this model are synchronous,
whereas biological networks are typically asynchronous.
Finally, despite their simplicity, only small nets can be
reverse engineered with the current state-of-the-art algo-
rithms.
Figure 2. An example Boolean network and three possible ways to represent it. The one on the left is a gene network modeled as a Boolean network, in the middle is a wiring diagram obviating the transitions between network states, and on the right is a truth table of all possible state transitions.
Copyright © 2013 SciRes. OPEN ACCESS
2.1.2. Probabilistic Boolean Network
Often, due to insufficient experimental evidence or incomplete understanding of a system, several candidate
regulatory functions may be possible for an entity. This raises the need to express uncertainty in the regulatory logic. Shmulevich et al. [15,16] addressed this idea by modifying the Boolean network model such that an entity can have several regulation functions, each of which is given a probability based on its compatibility with prior data. At each time step, every entity is subjected to a regulation function that is randomly selected according to the defined probabilities. Hence the model is stochastic and an initial global state can lead to many trajectories of different probabilities. The new model, the probabilistic Boolean network (PBN), generates a sequence of global states that constitutes a Markov chain. For example, a PBN was used to model a 15-gene subnetwork that was inferred from human glioma expression data [15,16]. This analysis demonstrates that the stationary distributions of entities may indicate possible regulatory relationships among them: entities that have the same states in a significant proportion of the global states are likely to be related. As the number of global states in the gene subnetwork was prohibitively large, one study estimated the stationary distribution by sampling the global states.
LIMITATION: Even though it is stochastic the state
space is discrete.
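As a rough illustration of how a PBN generates a Markov chain of global states, and of how a stationary distribution can be estimated by sampling, consider the following sketch. The two-gene network, its candidate functions and their probabilities are hypothetical values chosen for this example; `CANDIDATES`, `pbn_step` and `estimate_stationary` are assumed names, not part of any published PBN software.

```python
import random

# Hypothetical 2-gene PBN (an assumption for illustration): each gene has a
# list of (selection probability, regulation function) pairs. Every function
# receives the whole global state (x0, x1) and returns the gene's next value.
CANDIDATES = {
    0: [(0.7, lambda s: s[1]), (0.3, lambda s: not s[1])],
    1: [(0.8, lambda s: s[0]), (0.2, lambda s: not s[0])],
}


def pbn_step(state, rng):
    """Pick one regulation function per gene at random, then update synchronously."""
    nxt = []
    for gene in sorted(CANDIDATES):
        probs = [p for p, _ in CANDIDATES[gene]]
        funcs = [f for _, f in CANDIDATES[gene]]
        chosen = rng.choices(funcs, weights=probs, k=1)[0]
        nxt.append(bool(chosen(state)))
    return tuple(nxt)


def estimate_stationary(steps, seed=0, burn_in=100):
    """Estimate the stationary distribution by counting visited global states."""
    rng = random.Random(seed)
    state = (True, True)
    counts = {}
    for t in range(steps + burn_in):
        state = pbn_step(state, rng)
        if t >= burn_in:  # discard the initial transient
            counts[state] = counts.get(state, 0) + 1
    return {s: c / steps for s, c in counts.items()}
```

For a network of realistic size the state space cannot be enumerated, so sampling of this kind is the practical route; the burn-in length used here is an arbitrary choice.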
2.1.3. Bayesian Network
The basis of Bayesian networks is Bayes’ theorem, which can be described as follows. Let X be a data sample whose class label is unknown. Let H be a hypothesis that X belongs to class C. For classification problems, we determine P(H|X), using the following quantities:
P(H|X): the probability that the hypothesis holds given the observed data sample X. It is called the posterior probability.
P(H): prior probability of hypothesis H (i.e., the initial probability before we observe any data, reflecting the background knowledge).
P(X): probability that the sample data is observed.
P(X|H): probability of observing the sample X, given that the hypothesis holds.
Given training data X, the posterior probability of a hypothesis H, P(H|X), follows Bayes’ theorem:
$$P(H|X) = \frac{P(X|H)\,P(H)}{P(X)}$$
A simple Bayesian classifier works as follows:
Let D be a training set of tuples and their associated class labels. As usual, each tuple is represented by an n-dimensional attribute vector, $X = (x_1, x_2, \ldots, x_n)$, depicting n measurements made on the tuple from n attributes, respectively, $A_1, A_2, \ldots, A_n$.
Suppose that there are m classes, $C_1, C_2, \ldots, C_m$. Given a tuple, X, the classifier will predict that X belongs to the class having the highest posterior probability, conditioned on X. That is, the naïve Bayesian classifier predicts that tuple X belongs to the class $C_i$ if and only if
$$P(C_i|X) > P(C_j|X) \quad \text{for } 1 \le j \le m,\ j \ne i.$$
Thus we maximize P(Ci|X). The class Ci for which P(Ci|X) is maximized is called the maximum posteriori hypothesis. By Bayes’ theorem
$$P(C_i|X) = \frac{P(X|C_i)\,P(C_i)}{P(X)}$$
Bayesian classifiers assume that the effect of an at-
tribute value on a given class is independent of the val-
ues of the other attributes. This assumption is called class
conditional independence. It is made to simplify the
computations involved and, in this sense, is considered
“naïve”. Bayesian belief networks are graphical models,
which unlike naïve Bayesian classifiers allow the repre-
sentation of dependencies among subsets of attributes.
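The naïve Bayesian classifier described above can be sketched in a few lines. The training tuples below (discretized expression of two genes with a "tumor"/"normal" label) are invented for illustration, probabilities are estimated by simple counting without smoothing, and the names `ROWS`, `LABELS`, `train`, `posterior` and `predict` are assumptions of this sketch.

```python
from collections import Counter, defaultdict

# Hypothetical training set D: each tuple holds the discretized expression of
# two genes; the labels are the associated classes (invented values).
ROWS = [("up", "up"), ("up", "down"), ("down", "up"),
        ("down", "down"), ("down", "up"), ("up", "down")]
LABELS = ["tumor", "tumor", "tumor", "normal", "normal", "normal"]


def train(rows, labels):
    """Count class frequencies and per-class attribute value frequencies."""
    class_counts = Counter(labels)
    feature_counts = defaultdict(Counter)  # (class, attribute index) -> value counts
    for row, label in zip(rows, labels):
        for k, value in enumerate(row):
            feature_counts[(label, k)][value] += 1
    return class_counts, feature_counts


def posterior(row, class_counts, feature_counts):
    """Return P(C_i | X) for every class, assuming class conditional independence."""
    total = sum(class_counts.values())
    scores = {}
    for c, n_c in class_counts.items():
        score = n_c / total                                # prior P(C_i)
        for k, value in enumerate(row):
            score *= feature_counts[(c, k)][value] / n_c   # P(x_k | C_i)
        scores[c] = score                                  # P(X | C_i) P(C_i)
    evidence = sum(scores.values())                        # P(X); zero if X is unseen in every class
    return {c: s / evidence for c, s in scores.items()}


def predict(row, class_counts, feature_counts):
    """Return the class with the highest posterior probability."""
    post = posterior(row, class_counts, feature_counts)
    return max(post, key=post.get)
```

With these counts, `posterior(("up", "up"), *train(ROWS, LABELS))` gives 0.8 for "tumor" and 0.2 for "normal". A practical implementation would add smoothing so that an unseen attribute value does not force a probability of zero.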
Bayesian networks are a class of graphical probabilistic models. Formally a Bayesian network [17,18] is a joint probability distribution over a set of random variables. They combine two very well developed mathematical areas: probability and graph theory. A Bayesian network consists of an annotated directed acyclic graph $G(X, E)$, where the nodes, $x_i \in X$, are random variables representing genes’ expressions and the edges indicate the dependencies between the nodes. The random variables are drawn from conditional probability distributions $P(x_i \mid Pa(x_i))$, where $Pa(x_i)$ is the set of parents for each node. A Bayesian network implicitly encodes the Markov Assumption that given its parents, each variable is independent of its non-descendants. With this assumption each Bayesian network uniquely specifies a decomposition of the joint distribution over all variables down to the conditional distributions of the nodes:
$$P(x_1, x_2, \ldots, x_n) = \prod_{i=1}^{n} P(x_i \mid Pa(x_i))$$
A belief network is defined by two components, a di-
rected acyclic graph and a set of conditional probability
tables [19]. Each node in the directed acyclic graph
represents a random variable. The variables may be dis-
crete or continuous-valued. They may correspond to ac-
tual attributes given in the data or to “hidden variables”
believed to form a relationship. If an arc is drawn from a
node Y to a node Z, then Y is a parent or immediate
predecessor of Z and Z is a descendant of Y. Each vari-
able is conditionally independent of its non descendants
in the graph, given its parents.
Figure 3. Conditional independence in a simple Bayesian network. This network structure implies several conditional independence cases: (A ⊥ E), (B ⊥ D | A, E), (C ⊥ A, D, E | B), (D ⊥ B, C, E | A), and (E ⊥ A, D).
For example, let us consider the five variables in Figure 3. Without using any independence assumptions, the joint probability distribution can be written as:
$$P(A, B, C, D, E) = P(E \mid A, B, C, D)\, P(D \mid A, B, C)\, P(C \mid A, B)\, P(B \mid A)\, P(A).$$
In contrast, using the independence assumptions implied by the network in Figure 3, the same distribution can be expressed as:
$$P(A, B, C, D, E) = P(E)\, P(A)\, P(B \mid A, E)\, P(D \mid A)\, P(C \mid B).$$
If the variables are all binary in this network, the former form requires 31 parameters, while the latter only needs 10 parameters. More generally, if G is defined over N binary variables and their maximal number of parents is bound by M, then instead of using $2^N - 1$ independent parameters to represent the full joint probability distribution, a Bayesian network model can represent the same joint distribution with at most $2^M N$ parameters.
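The parameter counts quoted above can be checked with a short calculation. The parent sets are read off the factorization implied by Figure 3; the dictionary `PARENTS` and the two function names are assumptions of this sketch.

```python
# Parent sets of the five binary variables, as implied by the factorization
# P(E) P(A) P(B | A, E) P(D | A) P(C | B).
PARENTS = {"A": [], "E": [], "B": ["A", "E"], "D": ["A"], "C": ["B"]}


def full_joint_parameters(n_variables):
    """Free parameters of an unrestricted joint distribution over binary variables."""
    return 2 ** n_variables - 1


def bayesian_network_parameters(parents):
    """One free probability per binary node and per configuration of its parents."""
    return sum(2 ** len(p) for p in parents.values())


n = len(PARENTS)
max_parents = max(len(p) for p in PARENTS.values())
upper_bound = 2 ** max_parents * n  # the general bound 2^M * N
```

Here `full_joint_parameters(5)` is 31 and `bayesian_network_parameters(PARENTS)` is 1 + 1 + 4 + 2 + 2 = 10, which stays below the general bound of 2^2 · 5 = 20.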
A node within the network can be selected as an “out-
put” node, representing a class label attribute. There may
be more than one output node. Various algorithms for
learning can be applied to the network. Rather than re-
turning a single class label, the classification process can
return a probability distribution that gives the probability
of each class. A major advantage of Bayesian network
models is the ability to learn them from observed data.
Bayesian networks can capture linear, non-linear, com-
binatorial, stochastic and other types of relationships
among variables. They are suitable for modeling gene
networks because of their ability to represent stochastic
events, to describe locally interacting processes, to han-
dle noisy or missing biological data in a principled statis-
tical way and to possibly make causal inferences from
the derived models [20,21]. Hence, Bayesian networks,
including their variants Dynamic Bayesian networks,
Gaussian networks, Module networks, mixture Bayesian
networks and state-space models (SSMs), etc., have be-
come widely used tools for regulatory-network model-
ing.
LIMITATION: Although effective in dealing with noise, incompleteness and stochastic aspects of gene regulation, Bayesian networks fail to consider the temporal dynamic aspects that are an important part of regulatory network modeling. Dynamic Bayesian networks (DBNs) extend the formalism with feedback loops to deal effectively with the temporal aspects of regulatory networks, but their benefits are hindered by the high computational cost of learning the conditional dependencies when large numbers of genes are involved.
2.2. Continuous Models
Biological experiments usually produce real, rather than
discrete valued, measurements. Examples include reac-
tion rates, cell mass [22-25], cell cycle length and gene
expression intensities. Logical models require discretiza-
tion of the real valued data, which reduces the accuracy
of the data. Continuous models, using real valued pa-
rameters over a continuous timescale, allow a straight-
forward comparison of the global state and experimental
data and can theoretically be more accurate. In practice,
however, quantitative measurements are almost always
partial (that is, they cover only a fraction of the system’s
entities). Therefore, some of the parameters of continu-
ous models are usually based on estimations or inference.
2.2.1. Linear Model
The defining property of linear models is that each regu-
lator contributes to the input of the regulation function
independently of the other regulators, in an additive
manner [10]. In other words, the change in the level of
each entity depends on a weighted linear sum of the lev-
els of its regulators. This assumption allows a high level
of abstraction and efficient inference of network struc-
ture and regulation functions.
A biological system can be considered to be a state
machine, where the change in internal state of the system
depends on the current internal state plus any external
inputs. The mRNA levels form an important part of the
internal state of a cell (ideally, we also want to measure
protein levels, metabolites, etc.). As a first approximation,
we fit the expression data with a purely linear model,
where the change in expression level of each mRNA
species is derived as a weighted sum of the expression
levels of all other genes. Of course, a linear model can
never be much more than a caricature of the real system,
but perhaps we can still draw some interesting conclu-
sions from it.
The basic linear model is of the form
$$X_i(t + \Delta t) = \sum_j W_{ij} X_j(t),$$
where $X_i(t + \Delta t)$ is the expression level of gene i at time t + Δt, and $W_{ij}$ indicates how much the level of gene j influences gene i. For each gene, we will also add an extra term indicating the influence of kainate, and a constant bias term to model the activation level of the gene in the absence of any other regulatory inputs. The differences in gene regulation due to tissue type will be modeled by a difference in bias. The final formula becomes:
$$X_i(t + \Delta t) = \sum_j W_{ij} X_j(t) + K_i \cdot \mathrm{kainate}(t) + C_i + T_i,$$
where kainate(t) is the kainate level at time t, $K_i$ is the influence of kainate on gene i, $C_i$ is a constant bias factor for each gene, and $T_i$ indicates the difference in bias between tissue types ($T_i = 0$ when simulating spinal cord, so the total bias for spinal cord is $C_i$, for hippocampus $C_i + T_i$).
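One time step of this linear model can be sketched as follows. The function name `linear_step`, the `hippocampus` flag and all numerical values in the usage note are assumptions for illustration; the weights would normally be fitted to expression data.

```python
def linear_step(x, W, K, C, T, kainate_level, hippocampus=False):
    """Return the expression levels at t + dt from the levels x at time t.

    x: expression levels X_j(t); W: weight matrix W[i][j];
    K: kainate influence per gene; C: bias per gene;
    T: extra bias per gene, applied only for hippocampus (T_i = 0 for spinal cord).
    """
    n = len(x)
    nxt = []
    for i in range(n):
        total = sum(W[i][j] * x[j] for j in range(n))  # weighted sum of regulators
        total += K[i] * kainate_level + C[i]           # kainate term and constant bias
        if hippocampus:
            total += T[i]                              # tissue-specific bias difference
        nxt.append(total)
    return nxt
```

For example, with `W = [[0.5, 0.0], [0.2, 0.3]]`, `x = [1.0, 2.0]`, `K = [0.1, 0.0]`, `C = [0.0, 1.0]` and a kainate level of 2.0, the spinal cord update gives approximately `[0.7, 1.8]`; setting `hippocampus=True` with `T = [1.0, 1.0]` shifts both values by one.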
LIMITATION: Linear additive regulation models have revealed certain linear relations in regulatory systems but fail to capture the nonlinear dynamic aspects of gene regulation. When higher sensitivity to detail is desired, more complex models are preferable.
2.2.2. Differential Equation Based Model
Differential equation models encode a gene network as a
system of differential equations. Difference and differen-
tial equations allow more detailed descriptions of net-
work dynamics, by explicitly modelling the concentra-
tion changes of molecules over time [26,27].
The basic difference equation model is of the form
$$\begin{aligned} g_1(t + \Delta t) - g_1(t) &= \left(w_{11} g_1(t) + \cdots + w_{1n} g_n(t)\right)\Delta t \\ &\;\;\vdots \\ g_n(t + \Delta t) - g_n(t) &= \left(w_{n1} g_1(t) + \cdots + w_{nn} g_n(t)\right)\Delta t \end{aligned}$$
where $g_i(t + \Delta t)$ is the expression level of gene i at time t + Δt, and $w_{ij}$ is the weight indicating how much the level of gene i is influenced by gene j ($i, j = 1, \ldots, n$). Note that this model assumes a linear logic control model: the expression levels of genes at a time t + Δt depend linearly on the expression levels of all genes at a time t. For each gene, one can add extra terms indicating the influence of additional substances. Differential equation models are similar to difference equation models, but follow concentration changes continuously, modelling the time difference between two time steps in infinitely small time increases, i.e. Δt is approaching 0.
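The relation between the difference equation model and its differential limit can be illustrated numerically. The sketch below iterates the difference equation for a hypothetical one-gene system with self-degradation weight w = -1, whose differential equation has the exact solution exp(-t); the function names and the parameter values are assumptions.

```python
import math


def difference_step(g, w, dt):
    """g_i(t + dt) = g_i(t) + dt * sum_j w[i][j] * g_j(t)."""
    n = len(g)
    return [g[i] + dt * sum(w[i][j] * g[j] for j in range(n)) for i in range(n)]


def simulate(g0, w, dt, t_end):
    """Iterate the difference equation from t = 0 to t = t_end."""
    g = list(g0)
    for _ in range(round(t_end / dt)):
        g = difference_step(g, w, dt)
    return g


coarse = simulate([1.0], [[-1.0]], dt=0.5, t_end=1.0)[0]    # two large steps
fine = simulate([1.0], [[-1.0]], dt=0.001, t_end=1.0)[0]    # many small steps
exact = math.exp(-1.0)                                      # solution of dg/dt = -g at t = 1
```

With Δt = 0.5 the result is 0.25, far from exp(-1) ≈ 0.368, while with Δt = 0.001 the difference model agrees with the differential solution to about three decimal places.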
Difference and differential models depend on numerical parameters, which are often difficult to measure experimentally. An important question for these models is stability: does the behaviour of the system depend on the exact values of these parameters and initial substance concentrations, or is it similar for different variations? It seems unlikely that an unstable system represents a biologically realistic model, while on the other hand, if the system is stable, the exact values of some parameters may not be essential.
The rate of change in concentration of a particular
transcript is given by an influence function of other RNA
concentrations. The non-linear differential equations de-
scribe the mutual activating and repressing influences of
genes in a GRN at a high-level of abstraction. In particu-
lar, it is assumed that the rate of gene expression depends
exclusively on the concentration of gene products arising
from the nodes (genes) of the GRN. This means that the
influence of other molecules (e.g., transcription factors)
and cellular processes (translation) is not taken into ac-
count directly. Even with these limitations, dynamic
GRN models of this kind can be useful in deciphering
basic aspects of gene-regulatory interactions.
One major advantage of all three methods described below lies in their simple homogeneous structures, as this allows the settings of parameter discovering software to be easily customized for these structures. The three methods describe dynamic GRN models by means of a system (or set) of ordinary differential equations. For a GRN comprising N genes, N differential equations are used to describe the dynamics of N gene product concentrations, $X_i$ with $i = 1, \ldots, N$. In all three methods, the expression rate $dX_i/dt$ of a gene product concentration may depend on the expression level of one or more gene products of the genes $X_j$, with $j = 1, \ldots, N$. Thus, the gene product concentration $X_i$ may be governed by a self-regulatory mechanism (when i = j), or it may be regulated by products of other genes in the GRN. The three modeling methods differ in the way they represent and calculate expression rates.
2.2.2.1. The Artificial Neural Network (ANN) Method
Vohradsky [28] introduced ANNs as a modeling method
capable of describing the dynamic behavior of GRNs.
The way this method represents and calculates expres-
sion rates depends on the weighted sum of multiple
regulatory inputs. This additive input processing is capa-
ble of representing logical disjunctions. The expression
rate is restricted to a certain interval where a sigmoidal
transformation maps the regulatory input to the expres-
sion interval. ANNs provide an additional external input
which has an influence on this transformation in that it
can regulate the sensitivity to the summed regulatory
input. Finally, the ANN method defines the degradation
of a gene product on the basis of standard mass-action
kinetics. Formally, the ANN method is defined as:
$$\frac{dX_i}{dt} = v_i \, f\!\left(\sum_{j=1}^{N} w_{ij} X_j + \vartheta_i\right) - k_i X_i, \qquad v_i, k_i > 0$$
The parameters of the ANN method have the following biological interpretations:
N: Number of genes in the GRN to be modeled. The genes of the GRN are indexed by i and j, where $i, j = 1, \ldots, N$.
vi: Maximal expression rate of gene i.
wij: The connection weight or strength of control of
gene j on gene i. Positive values of wij indicate activating
influences while negative values define repressing influ-
ences.
ϑi: Influence of external input on gene i, which modu-
lates the gene’s sensitivity of response to activating or
repressing influences.
f: Represents a non-linear sigmoid transfer function
modifying the influence of gene expression products Xj
and external input ϑi to keep the activation from growing
without bounds.
ki: Degradation of the i-th gene expression product.
The mathematical properties of the ANN method have
been well studied because it is a special case of a recur-
rent neural network. In particular, the symmetry of the
matrix of connection weights wij influences whether the
network dynamics are oscillatory or whether they con-
verge on a steady (or even chaotic) state. High positive
or negative values of the external input, ϑi, reduce the
effect of the connection weights. This is explored in Case
D where ϑi has been interpreted as a delay to the reaction
kinetics of the transcriptional machinery.
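A minimal numerical sketch of the ANN method follows. It assumes that the sigmoid transfer function f is the logistic function and that the external input ϑi enters additively inside f; both are assumptions of this sketch, as are the forward Euler integrator, the function names and all parameter values.

```python
import math


def sigmoid(z):
    """Logistic function, assumed here as the sigmoid transfer function f."""
    return 1.0 / (1.0 + math.exp(-z))


def ann_rates(X, v, w, theta, k):
    """dX_i/dt = v_i * f(sum_j w_ij X_j + theta_i) - k_i X_i."""
    n = len(X)
    return [
        v[i] * sigmoid(sum(w[i][j] * X[j] for j in range(n)) + theta[i]) - k[i] * X[i]
        for i in range(n)
    ]


def euler(X0, v, w, theta, k, dt=0.01, steps=5000):
    """Integrate the ANN model with fixed-step forward Euler."""
    X = list(X0)
    for _ in range(steps):
        dX = ann_rates(X, v, w, theta, k)
        X = [x + dt * d for x, d in zip(X, dX)]
    return X
```

For a single unregulated gene with `v = [2.0]`, `w = [[0.0]]`, `theta = [0.0]` and `k = [1.0]`, the production term is 2 · f(0) = 1, so the concentration settles at the steady state X = 1, where production balances mass-action degradation. Because f is bounded, the production rate can never exceed v_i, which is the saturation property mentioned above.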
2.2.2.2. The S-System (SS) Method
Savageau [29] proposed the synergistic system or
S-system (SS) as a method to model molecular networks.
When modeling GRNs with the SS method, the expres-
sion rates are described by the difference of two products
of power-law functions, where the first represents the
activation term and the second the degradation term of a
gene product Xi. This multiplicative input processing can
be used to define logical conjunctions for both the regu-
lation of gene expression processes and for the regulation
of degradation processes. The SS method has no restric-
tions in the gene expression rates and thus does not im-
plicitly describe saturation. Formally, the SS method is
defined as:
$$\frac{dX_i}{dt} = \alpha_i \prod_{j=1}^{N} X_j^{g_{ij}} - \beta_i \prod_{j=1}^{N} X_j^{h_{ij}}, \qquad \alpha_i, \beta_i > 0, \quad g_{ij}, h_{ij} \in \mathbb{R}$$
The parameters of the SS method have the following biological interpretations:
N: Number of genes in the GRN to be modeled. The genes of the GRN are indexed by i and j, where $i, j = 1, \ldots, N$.
αi: Rate constant of activation term; in SS GRN mod-
els, all activation (up-regulation) processes of a gene i
are aggregated into a single activation term.
βi: Rate constant of degradation term; in SS GRN
models, all degradation processes of a gene i are aggre-
gated into a single degradation term.
gij, hij: Exponential parameters called kinetic orders. These parameters describe the interactive influences of gene j on gene i. Positive values of gij indicate an activating influence on the expression of gene i, whereas inhibiting influences are represented by negative values. Similarly, positive values of hij indicate increasing degradation of the gene product Xi, whereas decreasing degradation is represented by negative values. The parameters used in SS models have a clear physical meaning and can be measured experimentally, yet they describe phenomenological influences, as opposed to stoichiometric rate constants in general mass action (GMA) systems. The SS method generalizes mass-action kinetics by aggregating all individual processes into a single activation and a single degradation term (per gene). In contrast, the GMA system defines all individual processes k, with $k = 1, \ldots, R$, as a sum of power-law functions according to:
$$\frac{dX_i}{dt} = \sum_{k=1}^{R} \alpha_{ik} \prod_{j=1}^{N} X_j^{g_{ijk}} - \sum_{k=1}^{R} \beta_{ik} \prod_{j=1}^{N} X_j^{h_{ijk}}, \qquad \alpha_{ik}, \beta_{ik} > 0, \quad g_{ijk}, h_{ijk} \in \mathbb{R}$$
The parameters of the GMA system have the following biological interpretations:
αik: Rate constant of activation process k.
βik: Rate constant of degradation process k.
gijk: Exponential parameter called kinetic order, describing the interactive influence of Xj on gene i in activation process k.
hijk: Exponential parameter called kinetic order, describing the interactive influence of Xj on gene i in degradation process k.
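The S-system rate law can be evaluated directly from its definition, as in the sketch below. The function name `s_system_rates` and the two-gene parameter values in the usage note are hypothetical; concentrations must be strictly positive whenever a kinetic order is negative.

```python
def s_system_rates(X, alpha, beta, g, h):
    """dX_i/dt = alpha_i * prod_j X_j**g_ij - beta_i * prod_j X_j**h_ij."""
    n = len(X)
    rates = []
    for i in range(n):
        production = alpha[i]
        degradation = beta[i]
        for j in range(n):
            production *= X[j] ** g[i][j]   # aggregated activation term
            degradation *= X[j] ** h[i][j]  # aggregated degradation term
        rates.append(production - degradation)
    return rates
```

As a usage example, take `alpha = [2.0, 1.0]`, `beta = [1.0, 1.0]`, `g = [[0, -1], [1, 0]]` and `h = [[1, 0], [0, 1]]`: gene 0 is inhibited by gene 1 (negative kinetic order), gene 1 is activated by gene 0, and each product degrades in proportion to its own concentration. At `X = [1.0, 1.0]` the rates are `[1.0, 0.0]`. Unlike the ANN method, nothing bounds the production term, so saturation is not described.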
2.2.2.3. The General Rate Law of Transcription (GRLOT) Method
The GRLOT method has been used to generate bench-
mark time-series data sets to facilitate the evaluation of
different reverse-engineering approaches. GRLOT mod-
els multiply individual regulatory inputs. Activation and
inhibition are represented by different functional expres-
sions that are similar to Hill kinetics, which allow the
inclusion of cooperative binding events. Identical to the
ANN, the degradation of gene products is defined via
mass-action kinetics. Formally, the GRLOT method is
defined as:
$$\frac{dX_i}{dt} = v_i \prod_{j} \frac{Ki_j^{\,n_j}}{I_j^{\,n_j} + Ki_j^{\,n_j}} \prod_{k} \left(1 + \frac{A_k^{\,n_k}}{A_k^{\,n_k} + Ka_k^{\,n_k}}\right) - k_i X_i, \qquad v_i, Ki_j, Ka_k, k_i > 0$$
The parameters of the GRLOT method have the following biological interpretations:
vi: Maximal expression rate of gene i.
Ij: Inhibitor (repressor) j.
Ak: Activator k; the number of inhibitors I and the number of activators A can be related to the total number of genes by I + A ≤ N.
Kij: Concentration at which the influence of inhibitor j is half of its saturation value.
Kak: Concentration at which the influence of activator k is half of its saturation value.
nj, nk: Regulate the sigmoidicity of the interaction behavior in the same way as Hill coefficients in enzyme kinetics.
ki: Degradation of the i-th gene expression product.
Table 1. Advantages and disadvantages of the different algorithms for gene network construction.
TECHNIQUE | ADVANTAGES | DISADVANTAGES
Boolean Networks | A simplistic Boolean formalism can represent realistic complex biological phenomena such as cellular state dynamics that exhibit switch-like behavior, stability, and hysteresis. | Boolean: two states are not sufficient for the levels of real gene expression. The updates of the network states in this model are synchronous, whereas biological networks are typically asynchronous. Can be applied only to small networks.
Probabilistic Boolean Networks | They are stochastic and overcome the deterministic rigidity of Boolean networks. They are able to cope with uncertainty both in the data and in the model selection. | Even though the model is stochastic, the state space is discrete.
Bayesian Networks | Effective in dealing with noise, incompleteness and stochastic aspects of gene regulation. Dynamic Bayesian networks (DBN) evolved feedback loops to effectively deal with the temporal aspects of regulatory networks. | Fail to consider temporal dynamic aspects that are an important part of regulatory network modeling. The benefits are hindered by the high computational cost required for learning the conditional dependencies in the cases where large numbers of genes are involved.
Linear Model | Linear models do not require extensive knowledge about regulatory mechanisms. They can be used to obtain qualitative insights about regulatory networks. | Fail to capture nonlinear dynamic aspects of gene regulation. Not sufficient if higher sensitivity to detail is desired.
Differential Equation Based Model | Simple homogeneous structures: this allows the settings of parameter discovering software to be easily customized for these structures. | Involve a large number of parameters: O(d^2) parameters, where d is the number of genes modeled.
Single Molecule Level Model | The most detailed; can capture stochasticity. | Computationally expensive.
Hybrid Model | In real-world systems both continuous aspects and discrete aspects are present. Hybrid models help in modeling both together. | Computationally expensive.
LIMITATIONS: Unless they are restricted to simple
functional forms, differential equation models involve a
large number of parameters, O(d^2), where d is the
number of genes modeled. Moreover, differential equation
models require time-series data to learn the parameters.
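The roles of the parameters defined above can be illustrated with a small sketch. The rate law used here is a generic Hill-type form assumed for illustration only (production scaled by one factor per inhibitor and per activator, minus first-order degradation); it may differ in detail from the model equation in the text. The function names, the maximal rate v_max, and all numeric values are hypothetical.

```python
def inhibition_term(conc, K, n):
    """Hill-type inhibitor factor: 1 at conc = 0, 0.5 at conc = K, tends to 0."""
    return K**n / (conc**n + K**n)


def activation_term(conc, K, n):
    """Hill-type activator factor: 0 at conc = 0, 0.5 at conc = K, tends to 1."""
    return conc**n / (conc**n + K**n)


def expression_rate(x_i, inhibitors, activators, v_max, k_deg):
    """Rate of change of the i-th gene product under the assumed rate law.

    inhibitors, activators: lists of (concentration, K, n) tuples.
    v_max: assumed maximal production rate (hypothetical name).
    k_deg: degradation rate of the i-th product (ki in the text).
    """
    production = v_max
    for conc, K, n in inhibitors:
        production *= inhibition_term(conc, K, n)
    for conc, K, n in activators:
        production *= activation_term(conc, K, n)
    return production - k_deg * x_i


if __name__ == "__main__":
    # One inhibitor and one activator, both exactly at their half-saturation value.
    rate = expression_rate(1.0, [(2.0, 2.0, 4)], [(2.0, 2.0, 4)], v_max=4.0, k_deg=0.5)
    print(rate)
```

Setting a regulator concentration equal to its K value gives a factor of one half, which is the half-saturation property described above; a larger exponent n makes the transition around K steeper.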
2.3. Single Molecule Level Model
Every biological network is composed of stochastic
components, and therefore it may manifest different
behaviours even when starting from the same initial
conditions [30,31]. When the number of molecules of each
species is large, the law of mass action can be used to
calculate the change in concentrations accurately, and
little or no stochastic effect is observable. However,
when the number of molecules is small, significant
stochastic effects may be seen. This is particularly true
for regulatory networks, in which the number of
regulatory molecules is often low [32-35]. Single-cell
experimental assays have demonstrated the stochastic
behaviour of transcription and translation [36].
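The effect of low molecule numbers can be seen in a minimal simulation. The sketch below uses Gillespie's stochastic simulation algorithm, a standard technique that the text does not itself name, applied to a hypothetical one-species model in which a gene product is made at a constant rate and degraded in proportion to its copy number. The function name, rate constants, and seed are illustrative assumptions.

```python
import random


def gillespie_birth_death(k_prod, k_deg, x0, t_end, seed=0):
    """Simulate production (rate k_prod) and degradation (rate k_deg * x)
    of a single molecular species, one reaction event at a time."""
    rng = random.Random(seed)
    t, x = 0.0, x0
    times, counts = [t], [x]
    while True:
        a_prod = k_prod          # propensity of the production reaction
        a_deg = k_deg * x        # propensity of the degradation reaction
        a_total = a_prod + a_deg
        if a_total == 0:
            break                # no reaction can fire any more
        # Waiting time to the next event is exponentially distributed.
        t += rng.expovariate(a_total)
        if t > t_end:
            break
        # Choose which reaction fires, in proportion to its propensity.
        if rng.random() * a_total < a_prod:
            x += 1
        else:
            x -= 1
        times.append(t)
        counts.append(x)
    return times, counts


if __name__ == "__main__":
    # Two runs from the same initial condition generally follow different paths.
    for seed in (1, 2):
        _, counts = gillespie_birth_death(10.0, 1.0, x0=0, t_end=5.0, seed=seed)
        print(seed, counts[-1])
```

With these rates the deterministic mass-action model would settle at k_prod / k_deg molecules, whereas individual stochastic runs fluctuate around that value and differ from one another.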
2.4. Hybrid Model
In real-world systems, both continuous and discrete
aspects are present. In general, concentrations are
expressed as continuous values, whereas the binding of a
transcription factor to DNA is expressed as a discrete
event (bound or unbound). However, the boundary between
the discrete and continuous aspects depends on the level
of detail the model is designed for. For instance, at the
single-cell level the concentrations may have to be
expressed as molecule counts and so become discrete,
whereas if thermodynamic equilibrium is used to model
protein-DNA binding, the variable describing the
binding state becomes continuous. Hybrid models have
been developed in an attempt to describe both discrete
and continuous aspects in one model.
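A minimal sketch of this combination keeps the promoter state as a discrete bound/unbound variable and the protein level as a continuous concentration. It assumes the transcription factor is an activator, that binding and unbinding occur with fixed per-unit-time rates, and that the concentration is advanced with a simple Euler step; the function and parameter names are hypothetical and do not correspond to any particular published model.

```python
import random


def simulate_hybrid(k_on, k_off, v_max, k_deg, dt, steps, seed=0):
    """Discrete promoter state plus continuous protein concentration.

    k_on, k_off: binding and unbinding rates of the transcription factor.
    v_max: production rate while the promoter is bound (assumed activator).
    k_deg: first-order degradation rate of the protein.
    """
    rng = random.Random(seed)
    bound = False   # discrete part: promoter bound or unbound
    x = 0.0         # continuous part: protein concentration
    trajectory = []
    for _ in range(steps):
        # Discrete update: switch state with probability rate * dt.
        if bound:
            if rng.random() < k_off * dt:
                bound = False
        elif rng.random() < k_on * dt:
            bound = True
        # Continuous update: Euler step of dx/dt = production - k_deg * x.
        production = v_max if bound else 0.0
        x += (production - k_deg * x) * dt
        trajectory.append((bound, x))
    return trajectory


if __name__ == "__main__":
    traj = simulate_hybrid(k_on=0.5, k_off=0.5, v_max=2.0, k_deg=1.0,
                           dt=0.01, steps=1000, seed=3)
    print(traj[-1])
```

The time step dt should be small relative to 1 / k_on, 1 / k_off and 1 / k_deg for the switching probabilities and the Euler step to be reasonable approximations.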
An example of a hybrid model [37,38] is a multi-layer
evolutionary trained neuro-fuzzy recurrent network
(ENFRN) applied to the problem of GRN reconstruction,
which aims to address major drawbacks of existing
computational methods. The choice of a neural network
basis was driven by the computational power that such
methods provide. The self-organized nature of the ENFRN
algorithm produces an adaptive number of temporal fuzzy
rules that describe the relationships between the input
(regulating) genes and the output (regulated) gene. A
related advantage is that the approach removes the need
for prior data discretization, a step required by many
computational methods that often leads to information
loss. The dynamic mapping capabilities that emerge from
the recurrent structure of ENFRN, together with the
incorporation of fuzzy logic, yield easily interpretable
fuzzy rules of the form: “IF gene x is highly expressed at
time t THEN its dependent/target gene y will be lowly
expressed at time t + 1”. The evolutionary training, based
on the particle swarm optimization (PSO) framework, tries
to avoid the drawbacks of classical neural network
training algorithms [39]. Additionally, the method
approaches the under-determinism problem by selecting the
most suitable set of regulatory genes via a time-effective
procedure embedded in the construction phase of ENFRN.
Besides determining the regulatory relations among genes,
the method can also determine the type of regulation
(activation or repression) and at the same time assign a
score, which may be used as a measure of confidence in
the retrieved regulation.
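To make the form of such a rule concrete, the sketch below scores how well the single rule "IF gene x is highly expressed at time t THEN gene y is lowly expressed at time t + 1" is supported by two expression time series. It is not the ENFRN algorithm or its scoring scheme: the piecewise-linear membership functions, their thresholds, the assumption that expression is normalized to [0, 1], and the function names are all illustrative choices.

```python
def high(value):
    """Degree to which a normalized expression value counts as 'high'
    (0 at or below 0.5, rising linearly to 1 at 0.8; thresholds assumed)."""
    return min(1.0, max(0.0, (value - 0.5) / 0.3))


def low(value):
    """Degree to which a normalized expression value counts as 'low'
    (0 at or above 0.5, rising linearly to 1 at 0.2; thresholds assumed)."""
    return min(1.0, max(0.0, (0.5 - value) / 0.3))


def rule_support(x_series, y_series):
    """Support for 'x high at t -> y low at t + 1'.

    Each time step contributes min(high(x_t), low(y_{t+1})), the usual
    fuzzy AND; the sum is normalized by how often x was high at all.
    """
    fired = 0.0
    satisfied = 0.0
    for x_t, y_next in zip(x_series[:-1], y_series[1:]):
        fired += high(x_t)
        satisfied += min(high(x_t), low(y_next))
    return satisfied / fired if fired > 0 else 0.0


if __name__ == "__main__":
    x = [0.9, 0.9, 0.9]
    y = [0.5, 0.1, 0.1]
    print(rule_support(x, y))
```

A score near 1 suggests that high expression of x is consistently followed by low expression of y, which is the pattern expected for a repressor; a score near 0 means the rule is rarely satisfied when it fires.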
A comparison of the models discussed in this paper is
given in Table 1.
3. CONCLUSION
In this paper we have reviewed the different modeling
methods for reconstructing gene networks from gene
expression data. All of the methods discussed are used
for reverse engineering GRNs from gene expression data.
Boolean network models are limited by their discrete
state space: in reality, levels of gene expression do not
have only two states but can assume virtually continuous
values. Probabilistic methods have the flexibility of
assigning different probabilities of expression to a gene
at a particular point in time, and so are closer to real
situations. We also discussed continuous models, such as
linear and differential equation models, which use
non-discrete values. Single molecule based models
consider the stochastic behavior of biological networks,
and hybrid models combine different concepts for GRN
reconstruction.
4. ACKNOWLEDGEMENTS
The authors wish to acknowledge the financial support provided by
the Department of Information Technology (DIT), Government of India,
for carrying out this work.
REFERENCES
[1] Guy, K. and Ron, S. (2008) Modelling and analysis of
gene regulatory networks.
www.nature.com/reviews/molcellbio
[2] Davidson, E. and Levin, M. (2005) Gene regulatory net-
works. Proceedings of the National Academy of Sciences
of the United States of America, 102, 4935.
doi:10.1073/pnas.0502024102
[3] Hasty, J., McMillen, D., Isaacs, F. and Collins, J.J. (2001)
Computational studies of gene regulatory networks: In
numero molecular biology. Nature Reviews Genetics, 2,
268-279. doi:10.1038/35066056
[4] Martin, T.S., Johannes, J.M. and Werner, D. (2010)
Comparative study of three commonly used continuous
deterministic methods for modeling gene regulation net-
works. BMC Bioinformatics, 11, 459.
doi:10.1186/1471-2105-11-459
[5] Wessels, L., van Someren, E. and Reinders, M.A. (3-7
January 2001) Comparison of genetic network models.
Proceedings of the Pacific Symposium on Biocomputing,
Hawaii, 508-519.
[6] Cho, K.H., Choo, S.M., Jung, S.H., Kim, J.R., Choi, H.S.
and Kim, J. (2007) Reverse engineering of gene regulatory
networks. IET Systems Biology, 1, 149-163.
doi:10.1049/iet-syb:20060075
[7] De Jong, H. (2002) Modeling and simulation of genetic
regulatory systems: A literature review. Journal of Com-
putational Biology, 9, 67-103.
doi:10.1089/10665270252833208
[8] Glass, L. and Kauffman, S.A. (1973) The logical analysis
of continuous, non-linear biochemical control networks.
Journal of Theoretical Biology, 39, 103-129.
doi:10.1016/0022-5193(73)90208-7
[9] Thomas, R. (1973) Boolean formalization of genetic con-
trol circuits. Journal of Theoretical Biology, 42, 563-585.
doi:10.1016/0022-5193(73)90247-6
[10] Vladimir, F. (2005) Handbook of computational molecu-
lar biology. University of California, Davis.
[11] Faure, A., Naldi, A., Chaouiya, C. and Thieffry, D. (2006)
Dynamical analysis of a generic boolean model for the
control of the mammalian cell cycle. Bioinformatics, 22,
e124-e131. doi:10.1093/bioinformatics/btl210
[12] Akutsu, T., Miyano, S. and Kuhara, S. (2000) Inferring
qualitative relations in genetic networks and metabolic
pathways. Bioinformatics, 16, 727-734.
doi:10.1093/bioinformatics/16.8.727
[13] Tanay, A. and Shamir, R. (2001) Computational expansion
of gene networks. Bioinformatics, 17, S270-S278.
doi:10.1093/bioinformatics/17.suppl_1.S270
[14] Lähdesmäki, H., Shmulevich, I. and Yli-Harja, O. (2003)
On learning gene regulatory networks under the Boolean
network model. Machine Learning, 52, 147-167.
doi:10.1023/A:1023905711304
[15] Shmulevich, I., Dougherty, E.R., Kim, S. and Zhang, W.
(2002) Probabilistic Boolean networks: A rule-based
uncertainty model for gene regulatory networks.
Bioinformatics, 18, 261-274.
doi:10.1093/bioinformatics/18.2.261
[16] Shmulevich, I., Gluhovsky, I., Hashimoto, R.F.,
Dougherty, E.R. and Zhang, W. (2003) Steady-state analysis
of genetic regulatory networks modelled by probabilistic
Boolean networks. Comparative and Functional Genomics, 4,
601-608. doi:10.1002/cfg.342
[17] Pearl, J. (1988) Probabilistic reasoning in intelligent
systems: Networks of plausible inference. Morgan Kaufmann,
San Mateo.
[18] Han, J.W. and Micheline, K. (2007) Data mining:
Concepts and techniques. Elsevier Science, New York.
[19] Friedman, N., Linial, M., Nachman, I. and Pe’er, D.
(2000) Using Bayesian networks to analyze expression
data. Journal of Computational Biology, 7, 601-620.
doi:10.1089/106652700750050961
[20] Armañanzas, R., Inza, I. and Larrañaga, P. (2008)
Detecting reliable gene interactions by a hierarchy of
Bayesian network classifiers. Computer Methods and
Programs in Biomedicine, 91, 110-121.
doi:10.1016/j.cmpb.2008.02.010
[21] Beal, M.J., Falciani, F., Ghahramani, Z., Rangel, C. and
Wild, D.L. (2005) A Bayesian approach to reconstructing
genetic regulatory networks with hidden factors.
Bioinformatics, 21, 349-356.
doi:10.1093/bioinformatics/bti014
[22] Mason, O. and Verwoerd, M. (2007) Graph theory and
networks in biology. IET Systems Biology, 1, 89-119.
doi:10.1049/iet-syb:20060038
[23] Sauer, U., et al. (1996) Physiology and metabolic fluxes
of wildtype and riboflavin-producing Bacillus subtilis.
Applied and Environmental Microbiology, 62, 3687-3696.
[24] Ness, S.A. (2006) Basic microarray analysis: Strategies
for successful experiments. Methods in Molecular Biology,
316, 13-33.
[25] Kingsmore, S.F. (2006) Multiplexed protein measurement:
Technologies and applications of protein and antibody
arrays. Nature Reviews Drug Discovery, 5, 310-320.
doi:10.1038/nrd2006
[26] Chen, T., He, H.L. and Church, G.M. (1999) Modeling
gene expression with differential equations. Pacific
Symposium on Biocomputing, 4, 29-40.
[27] D’Haeseleer, P., Wen, X., Fuhrman, S. and Somogyi, R.
(1999) Linear modeling of mRNA expression levels during
CNS development and injury. Pacific Symposium on
Biocomputing, 4, 41-52.
[28] Hellerstein, M.K. (2003) In vivo measurement of fluxes
through metabolic pathways: The missing link in
functional genomics and pharmaceutical research. Annual
Review of Nutrition, 23, 379-402.
doi:10.1146/annurev.nutr.23.011702.073045
[29] Vohradsky, J. (2001) Neural network model of gene
expression. The FASEB Journal, 15, 846-854.
doi:10.1096/fj.00-0361com
[30] Savageau, M.A. (1976) Biochemical systems analysis: A
study of function and design in molecular biology.
Addison-Wesley, Reading.
[31] McAdams, H.H. and Arkin, A. (1999) It’s a noisy
business! Genetic regulation at the nanomolar scale.
Trends in Genetics, 15, 65-69.
doi:10.1016/S0168-9525(98)01659-X
[32] Ross, I.L., Browne, C.M. and Hume, D.A. (1994)
Transcription of individual genes in eukaryotic cells
occurs randomly and infrequently. Immunology & Cell
Biology, 72, 177-185. doi:10.1038/icb.1994.26
[33] Bae, K., Lee, C., Hardin, P.E. and Edery, I. (2000)
dCLOCK is present in limiting amounts and likely mediates
daily interactions between the dCLOCK-CYC transcription
factor and the PER-TIM complex. Journal of Neuroscience,
20, 1746-1753.
[34] Guptasarma, P. (1995) Does replication-induced
transcription regulate synthesis of the myriad low copy
number proteins of Escherichia coli? Bioessays, 17,
987-997. doi:10.1002/bies.950171112
[35] Bailone, A., Levine, A. and Devoret, R. (1979)
Inactivation of prophage λ repressor in vivo. Journal of
Molecular Biology, 131, 553-572.
doi:10.1016/0022-2836(79)90007-X
[36] Shea, M.A. and Ackers, G.K. (1985) The OR control
system of bacteriophage λ. A physical-chemical model
for gene regulation. Journal of Molecular Biology, 181,
211-230. doi:10.1016/0022-2836(85)90086-5
[37] Paulsson, J. (2005) Models of stochastic gene
expression. Physics of Life Reviews, 2, 157-175.
doi:10.1016/j.plrev.2005.03.003
[38] Ioannis, A.M., Andrei, D. and Dimitris, T. (2010) Gene
regulatory networks modelling using a dynamic
evolutionary hybrid. BMC Bioinformatics, 11, 140.
doi:10.1186/1471-2105-11-140
[39] Du, P., Gong, J., Wurtele, E.S. and Dickerson, J.A. (2005)
Modeling gene expression networks using fuzzy logic.
IEEE Transactions on Systems, Man and Cybernetics, 35,
1351-1359. doi:10.1109/TSMCB.2005.855590