Journal of Software Engineering and Applications, 2013, 6, 53-58
http://dx.doi.org/10.4236/jsea.2013.62009 Published Online February 2013 (http://www.scirp.org/journal/jsea)
53
Stochastic Modeling of Database Backup Policy for a
Computer System
Syouji Nakamura1, Xufeng Zhao2, Toshio Nakagawa2
1Department of Human Life and Information, Kinjo Gakuin University, Nagoya, Japan; 2Department of Business Administration,
Aichi Institute of Technology, Toyota, Japan.
Email: snakam@kinjo-u.ac.jp, g09184gg@aitech.ac.jp, toshi-nakagawa@aitech.ac.jp
Received January 3rd, 2013; revised February 1st, 2013; accepted February 9th, 2013
ABSTRACT
As the computer system has developed much in this highly information-oriented society, database security has become
a very important problem and its backup strategies need to be made more efficiently and safety. The image copy
method has been used as the most simple and dependable recovery mechanism for media failure. However, this method
spends high overhead costs for massive data transmission and much processing time in the normal operation of the da-
tabase. To cover such weak points, incremental and full backup methods are adopted before updated trucks reach a pre-
determined level. Moreover, when the number of full backup files exceeded a predetermined level, we stop incremental
and full backups and switch it to the image copy. This paper applies cumulative damage model to backup of files in a
database system, by putting damage shock by update, failure shock by database failure and damage by dumped files,
and considers the tradeoff among overhead costs of image copy and incremental, full backup methods, and discusses
analytically an optimal policy for the image copy backup interval. Finally, numerical examples are given in the case of
Poisson process and exponential distributions.
Keywords: Database Management System; Image Copy; Full Backup; Incremental Backup; Cumulative Process
1. Introduction
In this highly information-oriented society, database se-
curity [1] in computer systems has become a very im-
portant problem. Some recovery techniques in a database
management system have to be prepared previously for
emergency of troubles, using backup, check pointing, re-
organization, fault-tolerant technologies [2,3]. The back-
up service and its protection techniques have been stud-
ied and programed by many researchers and engineers,
e.g., the recent work Peer-to-Peer (P2P) backup system
[4-6], backup and protection schemes for a distributed
system [7,8].
When we refer to the backup techniques for the data-
base systems, image copy [9] is the most simple and de-
pendable method to ensure the safety of data and is al-
ways to take the backup copies of all files in other places
and to take out them if files in the original secondary
media are broken. However, this method takes many
hours, storages and costs when files become large be-
cause it stores all files. Frequent image copy backup
seems not be so reasonable so that it is often restricted to
a weekly or monthly schedule, although the increasing
speed and capacity of backup media could make over-
night backup to be a more realistic proposition. To make
the backup copies efficiently, we might dump only files
that have changed since the last backup, which is called
an incremented backup [9]. This would lessen signifi-
cantly both time and size of backup. Further, the recovery
techniques for database failures and the backup schemes
for broken hard disks were studied [2,10,11].
However, there are only a few studies that focus to the
scheduling problems of backup and recovery methods,
although many related techniques are available. That is,
we propose that backup schedule is a stochastic decision
making process from the viewpoints of management and
can be analyzed by the theory of stochastic processes.
Especially, with regarding to backup modeling, there
have been very few research papers that studied analyti-
cally optimal policies for a database system. Most prob-
lems were concerned with several ways to introduce
backup methods in techniques. Optimal full backup poli-
cies and incremental backup schedules have been studied
[12-14]. Even so, as referred as in [9], image copy back-
up is necessary to any database system for its whole se-
curity strategy, although it is restricted to be performed
weekly or monthly until now. Such a strict periodic im-
age copy backup policy is unreasonable and could be
optimized from the following two points: 1) this backup
technique has its superiority that it could copy all files
Copyright © 2013 SciRes. JSEA
Stochastic Modeling of Database Backup Policy for a Computer System
54
without any compression and make backup simpler and
reliable; 2) its performance needs to be combined with
incremental and full backup schedules, i.e., when the
total cost for incremental and full backups exceeds some
level, it is reasonable to do such an image copy backup to
renewal the database.
Thus, we propose the following backup policy which
ensures the safety of data and saves hours: The image
copy is carried out at scheduled times, and between these
backups, the incremental backup or full backup at each
files, which takes all copies of newly updated files since
the image copy, is done. That is, the image copy with
large overhead is done at long interval and the incre-
mental and full backups with small overhead are done at
short interval.
In this paper, we apply the cumulative damage model
[15,16] to the backup of files in a database system, by
putting damage shock by update, failure shock by data-
base failure and damage by dumped files. These models,
which play an important role in reliability theory, are
considered as a sequence of shocks that occur randomly
in time and give some amount of damage to a unit. The
damage is accumulated to the current damage level,
weakens the unit gradually, and makes it failure when the
total damage exceeds a failure level [15]. As applications
of cumulative damage processes in computer science,
such models have been applied successfully to backup
policies for a database system [12-14] and garbage col-
lection policies in memory management [17,18].
The following sections are organized as: Section 2 in-
troduces working schemes of incremental, full and image
copy backups, by taking a database system with n files as
an example. Section 3 formulates the stochastic backup
model, which combines the incremental, full and image
copy backup, i.e., the incremental backup is done when
the number of updated trucks does not exceed a certain
threshold value to an individual file; the full backup is
done when the number of updated trucks exceeds a cer-
tain threshold value to an individual file; and the image
copy backup is done at a planned time and when the
number of full backup files exceeds a certain threshold
value in a database. Then, we introduce costs suffered for
the overheads of three backups and obtain the expected
cost rate between the image copies. Section 4 discusses
an optimal interval of the image copy that minimizes the
expected cost in Section 3, and in Section 4, we compute
the model and its policies as a numerical example in the
case of Poisson process and exponential distribution.
Finally, in Section 5, concluding remarks and further
studies are given.
2. Backup Schedule
We consider a database system with n files that are com-
posed of the same size of trucks. In this database system,
we consider an incremental backup that takes all copies
of newly updated trucks since the previous image copy
and its overhead is increasing with the number of newly
updated trucks. For example, if all updated trucks are
included in the previous updated ones, then the number
of transferred data is the same as the previous one. How-
ever, if the updated trucks have some different ones from
the previous ones, then the number of transferred data is
increasing by their differences. Taking out copies of pre-
vious backups can make the recovery of a database easily
and rapidly, when some errors have occurred in storage
media.
For example, we consider a database with 6 files: In
Figure 1, when the updated track exceeds a threshold
value Z at each file, e.g., (a), (e) and (f), we dofull
backup to these files. Moreover, files (b)-(d) which do
not exceed the threshold value Z execute the incremental
backup. In Figure 2, when the number of full backup
files in the database exceeds a threshold value, we do the
image copy backup for this database. In this figure, when
the number of 5/6 files in the database is targeted, e.g.,
files (a)-(c), (e) and (f) are needed to be done by full
backup, we stop the full and incremental backups and
switch it to the operation of image copy backup.
It is well known that when the number of updated
trucks exceeds a threshold level Q in a file, the overhead
of incremental backup is larger than that of full backup
[10]. The value of Q/m is about 60% [10] in a database
system, where m is the total trucks in a file. Thus, if the
number of updated trucks exceeds a level Z (0 < Z Q),
we should make the full backup instead of the incre-
mental backup. Moreover, when the number of updated
files exceeds a threshold value in a database, the over-
head of full backup is larger than that of image copy
backup [9].
Figure 3 shows in the incremental backup method that
a usual backup storage volume is small, however, it is
necessary to preserve all backed up files. Therefore, the
Figure 1. Execution of full and incremental backups in a
database.
Copyright © 2013 SciRes. JSEA
Stochastic Modeling of Database Backup Policy for a Computer System 55
Figure 2. Execution of image copy backup in a database.
Figure 3. Storage volume every time of incremental, full
and image copy backups.
accumulation of incremental backup files becomes large
storage volume. In the full backup method, the storage
volume is equal to the size of the object files. Moreover,
in the image copy backup method, the storage volume is
the same size of database.
From above discussions, it becomes an important
problem in actual backup schemes when to create an im-
age copy. We want to lessen the number of the image
copy with large overhead, however, the overheads of the
incremental and full backup increase adaptively with the
number of newly updated trucks and files. From this
point of view and we should decide the image copy in-
terval, by comparing the overheads among three backups.
3. Expected Cost
We formulate the following stochastic backup models:
The image copy backup is done at a planned time T and
database initialization is made at such a scheduled time.
The incremental backup is done when the volume of up-
dated trucks does not exceed a certain threshold value Z
to an individual file. The full backup is done when the
volume of updated trucks exceeds a certain threshold
value Z to an individual file.
It is assumed that a database system with n files is up-
dated according to a nonhomogeneous Poisson process
with an intensity function
t
and a mean-value func-
tion
,Rt i.e., [15]. Then, the prob-
 
0
d
t
Rtu u
ability that j-th update occurs exactly during (0, t] is
 

e0,1,2,
!
j
Rt
j
Rt
Ht j
j



,
(1)
where
0.Rt
Further, let
j
W denote an amount of trucks of each
file, which they are updated or are new created since the
last backup at the j-th update. It is assumed that each
j
W
has an identical probability distribution
Pr1, 2,.
j
Gx Wj
j
x Then, the total amount
of updated trucks 1
j
ii
Z
W up to j-th update where
00Z
has a distribution

 
Pr1, 2,,
j
j
ZxGxj (2)
and


01Gx
for 0 for where
is the j-fold convolution of
0,x0,x

j
G

x
Gx

with itself.
Then, the probability that the total amount of updated
trucks exceeds exactly a threshold level K at the j-th up-
date is [15]. Let


1j
GK


j
GK
Z
t be the
total amount of updated trucks at time t. Then, the distri-
bution of
Z
t is





0
Pr .
j
j
j
Z
tx HtGx

(3)
Suppose that when the total amount of updated trucks
exceeds a threshold level Z at time we
want to do the full backup for this file. When the total
amount of updated trucks does not exceeds Z, we do the
incremental backup, and this probability is

1, 2,,jTj


 
1
0
Φ.
jj
j
TGZHT
(4)
Oppositely, when the total amount of updated trucks
exceeds Z, we do the full backup, and this probability is




 

 
1
1
00
0
d
1.
T
jj j
j
jj
j
TGZGZHtt
GZHT

 


t
(5)
Suppose that there exist n files in the database. How-
ever, it would be useless to do separately backup policies
for each file. It is assumed that if the total amount of up-
dated trucks of
,1,,
j
jKK n files exceeds a
threshold level Z at time T, then the full backup is done
for such j files. In this case, the probability that the image
copy backup is done for this database is
Copyright © 2013 SciRes. JSEA
Stochastic Modeling of Database Backup Policy for a Computer System
56
 
 
11
1
11
0
ΦΦΦ
!ΦΦdΦ.
!1!
nnj
j
njK
TnK
K
n
n
TTT
j
nuu
nK K
 










u
(6)
If the total amount of updated trucks of
files exceeds a threshold level Z at
time T, then the full backup is done for such j files. In
this case, the probability that full backup is done for the
database is
0,1, ,1jj K
 
1
11
0
ΦΦΦ.
Knj
j
nj
n
TT
j




T
(7)
Next, we introduce the following costs:
1 Full and incremental backup costs per unit of
time when the number of files whose updated trucks ex-
ceeds Z is less than K files.
:c
2 Image copy backup cost per unit of time when
the number of files whose updated trucks exceeds Z is
more than K files with .
:c
:c21
3Cost of database backup and initialization at time
T. The expected cost rate is
cc
  
 
123
00
123
00
dd
dd
.
TT
nn n
TT
nn
cTTttcT ttc
CT T
cttcttc
T

 





(8)
The first item of right-hand side of (8) is the full and
incremental backup cost and the second item is the image
copy backup cost. Note that
0C.
If ,T
then
the policy corresponds to the image copy only at the total
amount of updated trucks exceeds the threshold level Z,
and the expected cost rate is
 
2
lim ,
Tn
c
CCT

 (9)
where

 
0
1
11
00
Φd
Φd,
nn
Knj
j
j
tt
ntt
j
 




t
(10)
which is the mean time until the total updated trucks ex-
ceeds K.
4. Optimal Policy
We obtain an optimal time
0TT



CT which
minimizes the expected cost in (8). Letting
0,CT
 
3
210
d,
nn
Tt
t
cc
 

Tc (11)
i.e.,

3
21
0
d
T
n
c
tt
cc

,
(12)
whose left-hand side is strictly increasing from 0 to .
n
Thus, if
321nccc

, then there exists a finite and
unique
0TT

 that satisfies (12) and the resul-
ting cost rate is

121 .
n
CTcc cT

  (13)
5. Numerical Example
Suppose that
 
e
!
j
t
j
t
Ht j
and
1e .
Z
GZ

Then, we have

  
e0,1,2,
!
i
jZ
ij
Z
GZ j
i

.
Thus, (4) is rewritten as


 
1
0
1
00
Φee
!!
1e e
!!
ij
,
Z
T
jij
ji
j
TZ
ji
ZT
tij
TZ
ji




 










(14)
i.e.,


1
1
00
Φe
!!
e,
j
i
j
T
ji
TZ
Tji



Z
(15)
where
1
0
1.
i
Because


 
 
11
11
00
0
e
1! !!
ee,
!!
jj i
j
TZ
ji
jj
TZ
j
TT
TT Z
jj i
TZ
jj
e

 









(16)
from (6),
 
  

 
 
 
1
111
1
1
00
1
00
0
!
!1!
!ee
!1!!!
1e e
!!
ee.
!!
nK
K
n
K
ji
j
tZ
ji
nK
ji
j
tZ
ji
jj
tZ
j
n
ttt
nKK
tZ
n
nK Kji
tZ
ji
tZ
jj






t






 












(17)
Therefore, (12) becomes
Copyright © 2013 SciRes. JSEA
Stochastic Modeling of Database Backup Policy for a Computer System 57
 
 
 
 
0
1
1
00
0
1
00
3
021
!
d!1!
ee
!!
1e e
!!
eed
!!
T
n
K
ji
Tj
tZ
ji
nK
ji
j
tZ
ji
jj
tZ
j
n
tt
nKK
tZ
tji
tZ
ji
tZ c
t
jj c



















.
c
(18)
We obtain the optimal value which satisfies (18),
and the resulting expected cost rate is
T


 
12111.
n
j
nj
jK
n
CTcc cTT
j



 



(19)
Tables 1 and 2 indicate the cost rates
321
ccc for
optimal times T when and
6.12K10.20n
.
From Tables 1 and 2, when the value of
321
increases, the interval of becomes long. This shows
that we should lengthen the image copy backup interval,
if the value of 3
c is large compared to the value of
From the tables, when
ccc
T
Z
21
.cc5.10,
the value of
321
ccc to the value of becomes about twice.
When
T
10, 0.05n5. 6,0,ZK
 and
3219.875,ccc becomes 25. When the unit of
T is a day, we should execute the image copy backup
T
Table 1. Optimal T* when µZ = 5.0.
321
ccc
6K, 10n12K, 20n
T
0.05
0.08
0.05
0.08
15 0.115 4.9726 0.001 1.150
20 1.654 32.613 0.128 23.581
25 9.875 93.454 2.653 104.720
30 33.862 166.324 19.410 202.318
Table 2. Optimal T* when µZ = 10.0..
321
ccc
6K, 10n12K, 20n
T
0.05
0.08
0.05
0.08
15 0.240 10.457 0.003 2.487
20 3.473 68.794 0.278 32.346
25 20.773 197.659 5.735 223.785
30 71.369 352.425 41.837 428.991
every 25 days. In addition, when 10.0, 12ZK

,
20, 0.05n
and
32141.837cc in Table 2,
c
T
becomes 30. When the unit of T is a day, we should
execute the image copy backup every 30 days. In the real
world, we can request the value of
321
ccc com-
paratively easily. Then, when

32141ccc.837,
10.0, 12ZK
, 20n
and 0.05,
it is neces-
sary to do the image copy back up every month.
6. Conclusions
We have considered the problem when to make the in-
cremental, full and image copy backups, under the as-
sumptions that the overhead of backups depends on the
total amount of newly updated files in a database. We
have obtained the expected cost rate until the image copy
backup, and have discussed the optimal interval of the
image copy backup that minimizes it. It has been shown
that the optimal interval is given by a finite and unique
solution of an equation.
As a further problem, it would be necessary to con-
sider the model where the backup of only newly updated
files is made, and to compare it with the model studied in
this paper.
7. Acknowledgements
This work is partially supported by the Grant-in-Aid for
Scientific Research (C) of Japan Society for the Promo-
tion of Science under Grant No. 22500897 and No.
24530371.
REFERENCES
[1] T. Ward, “Security of Backup Data,” Information Systems
Security, Vol. 15, No. 1, 2006, pp. 31-34.
doi:10.1201/1086.1065898X/45926.15.1.20060301/9268
3.6
[2] T. Haerder and J. Muckstadt, “Optimal Policy for Batch
Operations: Backup, Checkpointing, Reorganization, and
Updating,” ACM Transactionson Database Systems, Vol.
2, No. 3, 1977, pp. 209-222. doi:10.1145/320557.320558
[3] I. Lazaridis, et al., “Fault Tolerant Evaluation of Con-
tinuous Selection Queries over Sensor Data,” Interna-
tional Journal of Distribute d Sensor Networks, Vol. 5, No.
4, 2009, pp. 338-360. doi:10.1080/15501320701585600
[4] H. Jarraya and M. Laurent, “A Secure Peer-to-Peer Back-
up Service Keeping Great Autonomy While under the
Supervision of a Provider,” Computers & Security, Vol.
29, No. 2, 2010, pp. 180-195.
doi:10.1016/j.cose.2009.10.003
[5] M. Gramaglia, M. Urueña and I. Martinez-Yelmo, “Off-
Line Incentive Mechanism for Long-term P2P Backup
Storage,” Computer Communications, Vol. 35, No. 12,
2012, pp. 1516-1526. doi:10.1016/j.comcom.2012.04.017
[6] M. Fesci-Sayita, E. TurhanTunalib and A. Murat Tekalpc,
“Resilient Peer-to-Peer Streaming of Scalable Video over
Copyright © 2013 SciRes. JSEA
Stochastic Modeling of Database Backup Policy for a Computer System
Copyright © 2013 SciRes. JSEA
58
Hierarchical Multicast Trees with Backup Parent Pools,”
Signal Processing: Image CommunicationVol. 27, No. 2,
2012, pp. 113-125. doi:10.1016/j.image.2011.11.004
[7] G. Bella, C. Pistagna and S. Riccobene, “Distributed
Backup through Information Dispersal,” Electronic Notes
in Theoretical Computer Science, Vol. 142, No. 3, 2006,
pp. 63-77. doi:10.1016/j.entcs.2004.11.046
[8] J. Ma, et al., “A Novel Adaptive Current Protection Sch-
eme for Distribution Systems with Distributed Genera-
tion,” International Journal of Electrical Power & En-
ergy Systems, Vol. 43, No. 1, 2012, pp. 1460-1466.
[9] http://www.backup4all.com/
[10] K. Suzuki and K. Nakajima, “Storage Management Soft-
ware,” FUJITSU, Vol.48, No. 4, 1997, pp. 389-397.
[11] G. Lohman and A. Reuter, “Principles of Transaction
Oriented Database Recovery,” ACM Computing Surveys,
Vol. 15, No. 4, 1983, pp. 287-317. doi:10.1145/289.291
[12] C. Qian, S. Nakamura and T. Nakagawa, “Cumulative
Damage Model with Two Kinds of Shocks and Its Appli-
cation to the Backup Policy,” Journal of the Operations
Research Society of Japan, Vol. 42, No. 4, 1999, pp.
501-511. doi:10.1016/S0453-4514(00)87116-1
[13] S. Nakamura, C. Qian, S. Fukumoto and T. Nakagawa,
“Optimal Backup Policy for a Database System with In-
cremental and Full Backups,” Mathematical and Com-
puter Modelling, Vol. 38, No. 11-13, 2003, pp. 1373-
1379. doi:10.1016/S0895-7177(03)90140-3
[14] C. Qian, Y. Huang, X. Zhao and T. Nakagawa, “Optimal
Backup Interval for a Database System with Full and Pe-
riodic Increment Backup,” Journal of Computers, Vol. 5,
No. 4, 2010, pp. 557-564. doi:10.4304/jcp.5.4.557-564
[15] T. Nakagawa, “Shock and Damage Models in Reliability
Theory,” Springer, London, 2007.
[16] S. Nakamura and T. Nakagawa, “Stochastic Reliability
Modeling, Optimization and Applications,” World Scien-
tific, Singapore, 2010.
[17] X. Zhao, S. Nakamura and T. Nakagawa, “Two Genera-
tional Garbage Collection Models with Major Collection
Time,” IEICE Transactions on Fundamentals of Elec-
tronics Communications and Computer Sciences, Vol.
E94-A, No. 7, 2011, pp.1558-1566.
[18] X. Zhao, S. Nakamura and T. Nakagawa, “Optimal Te-
nuring and Major Collection Times for a Generation Gar-
bage Collector,” Asia-Pacific Journal of Operational Re-
search, Vol. 29, No. 3, 2012, p. 1240018 (17 pages).