I. J. Communications, Network and System Sciences. 2008; 1: 1-103
Published Online February 2008 in SciRes (http://www.SRPublishing.org/journal/ijcns/).
Copyright © 2008 SciRes. I. J. Communications, Network and System Sciences. 2008; 1:1-103
A Predicted Region based Cache Replacement
Policy for Location Dependent Data in Mobile
Environment
Ajey KUMAR1, Manoj MISRA2, Anil K. SARJE2
Department of Electronics and Computer Engineering,
Indian Institute of Technology Roorkee, Roorkee, India
E-mail: 1ajeykumar7@gmail.com, 2{manojfec,sarjefec}@iitr.ernet.in
Abstract
Caching frequently accessed data items on the mobile client is an effective technique to improve the system
performance in mobile environment. Proper choice of cache replacement technique to find a suitable subset of items for
eviction from cache is very important because of limited cache size. Available policies do not take into account the
movement patterns of the client. In this paper, we propose a new cache replacement policy for location dependent data
in mobile environment. The proposed policy uses a predicted region based cost function to select an item for eviction
from cache. The policy selects the predicted region based on client’s movement and uses it to calculate the data
distance of an item. This makes the policy adaptive to client’s movement pattern unlike earlier policies that consider the
directional / non-directional data distance only. We call our policy the Prioritized Predicted Region based Cache
Replacement Policy (PPRRP). Simulation results show that the proposed policy significantly improves the system
performance in comparison to previous schemes in terms of cache hit ratio.
Keywords: Mobile Computing, Cache Replacement, Location Dependent Data, Valid Scope, Location Dependent
Information Services.
1. Introduction
Recent advances in wireless technology have ushered
the new paradigm of mobile computing. With the advent
of new mobile infrastructures providing higher bandwidth
and constant connection to the network from virtually
everywhere and advances in the global positioning
technologies , a new class of services referred to as
Location Dependent Information Services (LDIS) has
evolved and is gaining popularity among mobile users.
LDIS provide users with the ability to access
information related to their current location. By including
location as a part of user’s context information, service
carriers can provide better services to many value-added
applications such as travel and tourist information system,
assistance and emergency system, nearest object
searching system and local information access system,
which specifically target the mobile users. Hence, the
need for LDIS arises frequently. For example, imagine
you are on a business trip in a foreign city and you do not
know the city very well, you have no idea where to go. In
this situation, with the help of your portable device you
can query your personal interest like nearest restaurant,
nearest gas station, nearest ATM, nearest theater etc. and
can get the response on your device.
There are many challenges in providing LDIS services
to users. These challenges include limited bandwidth,
limited client power and intermittent connectivity
[1][2][4][6]. Caching helps to address some of these
challenges. Caching of frequently accessed data item on
client side is an effective technique to improve data
accessibility and to reduce access cost. However, due to
limitations of cache size on mobile devices, it is
impossible to hold all accessed data items in the cache.
Thus, there is a need of efficient cache replacement
algorithms to find out suitable subset of data items for
eviction from the cache. Good cache performance heavily
depends on these replacement algorithms. Also, for wide
area mobile environments due to its distributed nature, the
design of an efficient cache replacement policy becomes
very crucial and challenging to ensure good cache
performance.
Most of the existing cache replacement policies use
cost functions to incorporate different factors including
access frequency, update rate, size of objects etc.
Temporal-based traditional cache replacement strategies,
80 A. KUMAR ET AL.
Copyright © 2008 SciRes. I. J. Communications, Network and System Sciences. 2008; 1:1-103
such as Least Recently Used (LRU), Least Frequently
Used (LFU) and LRU-k [13] have been studied widely in
the past. These polices are based on the assumption that
client’s access patterns exhibit temporal locality (i.e. the
objects that were queried frequently in the past will
continue to be queried frequently in the future). These
algorithms replace data items based on recency or
frequency of access. LRU is the most commonly used
recency based algorithm. LFU, which maintains a
reference count for each cached objects, is the most
commonly used frequency based algorithm. Frequency-
based algorithms are well suited for skewed access
patterns in which a large fraction of accesses go to a
disproportionately small set of hot objects. Frequency and
recency based algorithms form the two ends of a
spectrum of caching algorithms. LRU-k cache
replacement algorithm tries to balance both recency and
frequency. However, in mobile networks where clients
utilize location dependent information services, clients
access pattern do not only exhibit temporal locality, but
also exhibit dependence on location of data, location of
the client and direction of the client’s movement [4][5].
As a result, the aforementioned policies are unsuitable for
supporting location dependent services because they do
not take into account the location of data objects and the
movement of mobile clients. Hence, relying solely on
temporal locality when making cache replacement
decisions will result in poor cache hit ratio in LDIS. To
overcome this problem, several location-aware cache
replacement policies [5][7][9][10] have been proposed for
location dependent information services.
Manhattan Distance-based cache replacement policy
[10] supports location dependent queries in urban
environments. Cache replacement decisions are made on
the basis of distance between a client’s current location
and the location of each cached data object. Objects with
the highest Manhattan distance from the client’s current
location are evicted at cache replacement. While the
Manhattan based policy accounts for the distance between
clients and data objects, the major limitation of this
approach is that it ignores the temporal access locality of
mobile clients and the direction of client movement while
making cache replacement decisions.
Furthest Away Replacement (FAR) [9] policy uses the
current location and movement direction of mobile clients
to make cache replacement decisions. Cached objects are
grouped into two sets, viz., in-direction set and the out-
direction set. Data objects in the out-direction set are
always evicted first before those in the in-direction set.
Objects in each set are evicted in the order based on their
distance from the client. Similar to the Manhattan
approach, FAR also neglects the temporal properties of
clients’ access pattern. It also becomes ineffective when
mobile clients change direction frequently due to frequent
change in the membership of objects between the in-
direction and out-direction sets.
In Probability Area Inverse Distance (PAID) [5] policy,
the cost function of data item i takes into account the
access probabilities (Pi) of data objects, area of its valid
scopes A(vsi) and the distance D(vsi) between the client’s
current position and the valid scope of the object
concerned (known as data distance). The cost function of
PAID is given by() ()PAvsD vs
ii i
. It neither takes into
account the size of the data object nor does it give priority
to the data objects in cache that are near to the mobile
client. Mobility Aware Replacement Scheme (MARS) [7]
policy is also a cost based policy, which comprises of
temporal score, spatial score and cost of retrieving an
object. Unlike PAID, it takes into account the updates of
data objects. But as far as location dependent data (LDD)
is concerned, their update rate (if exist) is negligible as
compared to temporal data. Thus, for LDD, only spatial
score dominates which consists of area of valid scope,
data distance from current client location and data
distance from future client location. The impact of
client’s anticipated location or region in deciding cache
replacement still remains unexplored.
None of these cache replacement policies are suitable
if client changes its direction of movement quite often.
Existing cache replacement policies only consider the
data distance (directional/undirectional) but not the
distance based on the predicted region or area where the
client can be in near future. Very few of these policies
[5][7] account for the location and movement of mobile
clients.
In this paper, we predict an area in the vicinity of
client’s current position, and give priority to the cached
data items that belong to this area irrespective of the
client’s movement direction. PPRRP calculates the data
item cost on the basis of access probability, valid scope
area, data size in cache and data distance based on the
predicted region, which has not been considered in any of
the existing policies.
The rest of the paper is organized as follows. Section 2
briefly describes mobile system model used in our work.
Section 3 details the proposed new cost based
replacement policy PPRRP. Section 4 and section 5 deal
with simulation model, and performance evaluation and
comparison simultaneously. Section 6 concludes the
paper.
2. Mobile System Model
We assume a cellular mobile network that is similar to
the model discussed by D. Barbara [1] as mobile
computing infrastructure. A mobile system [1][4][5][6] is
usually made up of a server, moving clients, and a wireless
connection between them (see Figure 1). The
geographical area is divided into small regions, called
cells. Each cell has a Base Station (BS) or Mobile
Support Station (MSS) augmented with wireless interfaces
A PREDICTED REGION BASED CACHE REPLACEMENT POLICY FOR LOCATION 81
DEPENDENT DATAIN MOBILE ENVIRONMENT
Copyright © 2008 SciRes. I. J. Communications, Network and System Sciences. 2008; 1:1-103
and a number of Mobile Clients (MCs). Inter-cell and
intra-cell communications are managed by the MSSs. The
MCs communicate with the MSS by wireless links within
its radio coverage area. An MC can move freely from one
location to another within a cell or between cells while
retaining its network connection. An MC can either
connect to a MSS through a wireless communication
channel or disconnect from the MSS by operating in the
doze (power save) mode. The MC queries the database
servers that are connected to a wired network. The
wireless channel is logically separated into two sub
channels: uplink channel and downlink channel. The
uplink channel is used by MCs to submit queries to the
server via an MSS, while the downlink channel is used by
MSSs to disseminate information or to forward the
answers from the server to the target client.
The mobile computing platform can be effectively
described under the client/server paradigm [19]. A data
item is the basic unit for update and query. MCs only
issue simple requests to read the most recent copy of a
data item. There may be one or more processes running
on an MC. These processes are referred to as clients (we
use the terms MC and client/users interchangeably). In
order to serve a request from a client, the MSS needs to
communicate with the database server to retrieve the data
items. Since the communication between the MSS and the
database server is through wired links and is transparent
to the clients (i.e., from the client’s point of view, the
MSS is the same as the database server), we also use the
terms MSS and server interchangeably.
Fixed/wireline Network
(
Internet,LAN: Mbps to Gbps
)
Cellular Data
(
CDPD,DataTae: 19.2 kbps
)
Wireless LAN
(
Aironet,Wavelan: 2.12 Mbps
)
Mobile GSM
(
9.6 kbps
)
Base Node
Base Node
Base Node
Figure 1. Mobile computing system model
The information system provides location dependent
services to mobile clients. The geographical area covered
by the information system is referred as the service area.
In this paper, we assume a geometric location model, i.e.,
a location is specified as a two-dimensional coordinate.
However, it can be easily extended to 3-dimension space
by including the third dimension. Mobile clients can
identify their locations using systems such as the Global
Positioning System (GPS) [3]. The data item value is
different from data item, i.e., data item value for a data
item is an instance of the item valid for a certain
geographical region. Moreover, the data item value is
different from data item. Data item value for a data item
is an instance of the item valid for a certain geographical
region. So, a data item can show different values when
clients at different locations query it. For example,
“restaurant” is a data item, and the data values for this
data item vary depending on the location of query i.e.
point at which the query “ Tell me the nearest restaurant
was issued by a mobile client. The valid scope of a data
item value is defined as the region within which the data
item value is valid. In a two-dimensional space, a valid
scope (vs) can be represented by a geometric polygon p
(e1, …,en ), where e i
's are endpoints of the polygon. A
mobile client can cache data on its local disk or in any
storage system that survives power-off. In this paper, data
values are assumed to be of fixed size and read–only so
that we can omit the influence of data sizes and updates
on cache performance and concentrate on the impact
caused by the unique properties of location-dependent
data.
y-axis
O
x-axis
(a) Abstract model
1
θ
2
θ
4
θ
3
θ
MI
1
= t seconds
O
1
v
2
v
3
v
4
v
y-axis
x-axis
MI
2
= t seconds
MI
3
= t seconds
MI
4
= t seconds
(b) Discrete model
Figure 2. Client’s movement path
We also assume an unconstrained network, where
mobile clients move freely inside the geographical region
covered by the mobile network (without any
restrictions).In the abstract model, the path of a moving
client is represented by a curve in 2-dimension (x-y
plane), as shown in Figure 2(a). Though abstract model is
82 A. KUMAR ET AL.
Copyright © 2008 SciRes. I. J. Communications, Network and System Sciences. 2008; 1:1-103
simple, from computer implementation point of view,
discrete model is preferred [8]. In discrete model, the path
traveled is modeled as a sequence of line segments, each
associated with fixed velocity and direction, as shown in
Figure 2 (b). Length of the line segment depends on the
rate of change of direction and velocity. For random
movement this duration between change in direction and
velocity is small and for regular movement and highway
users this duration is large. This duration is known as
Moving Interval (MI) [5][7][17]. Figure 2(b) shows the
discrete movement of a mobile user with MI of t seconds.
The distance between any two locations or points is the
length of a straight line connecting the two points (i.e.
Euclidean distance).
3. Prioritized Predicted Region Based Cache
Replacement Policy (PPRRP)
3.1. Motivation
LDIS, being spatial in nature needs that the distance of
data item from client’s current position and its valid scope
area should also be taken into account for cache
replacement. Greater the distance of valid scope of data
from the user’s current position lower is the chance that
client will enter in the valid scope area of the data in near
future. Thus, it is better to eject the farthest data value
when replacement takes place. Moreover, because the
client is mobile, its position at the time of next query will
be different from its current position. Therefore the
client’s movement should also be taken into account.
Locations in the opposite direction of client’s movement
have very low chance of being revisited, though they may
be very close to it. Based on this reasoning, existing
cache replacement schemes like FAR and PAID
(directional) assign higher priorities to data items in the
client’s direction of movement giving very low priority to
the items in the opposite direction of user’s movement.
However, with random movement patterns of clients, it is
not always necessary that client will continue moving in
the same direction. Therefore, evicting data values which
are in the opposite direction of client’s movement but are
very close to client’s current position may degrade the
overall performance.
3.2. Basic Idea
When client movement pattern is random, retaining the
data items in the direction of user movement and
discarding the data items that are in the opposite direction
of user movement may not improve the performance.
Therefore, our cache replacement policy considers the
predicted region of user presence in near future (rather
than considering the direction of user movement only)
while selecting an item for replacement. The predicted
region is based on the client’s current movement pattern.
We show that it is useful to calculate the data distance
with respect to this region so that the data items in the
vicinity of client’s current position are not purged from
cache. Valid scope area of the data item and the amount
of space required to store the data item in cache are also
used to select an item for replacement. This is because the
client has higher chance of being in large region rather
than small regions and keeping smaller size data items in
cache helps to accommodate a large number of data items
in the cache. Hence, our cache replacement policy selects
a victim with low access probability, small valid scope
area falling outside the predicted region and large data
size.
3.3. Approach
We make use of discrete model for client’s movement
path as described in Section 3.2 and used in [5][7][14].
Assuming a predefined path of mobile user or a
predefined destination is generally not possible unless we
are dealing with a case where the user is moving in a train
or a ship and the entire path of the user is known well in
advance. For discrete model, the direction and velocity of
the user are known for current MI. At the end of each MI,
direction is selected randomly between 0° to 360°
degrees and the velocity between minimum (vmin) and
maximum speed (vmax) of the client. This motivates us in
predicting a region instead of predicting the path.
MIc
L
(,)
s
s
x
y(,)
ee
x
y
Figure 3. Current moving interval
MIc
L
(,)
ss
xy (,)
ee
xy
Figure 4. Predicted region
Let vc be the velocity in current moving interval MIc,
LMIc be the length of MIc along direction θc and (xs, ys)
and (xe, ye) be the starting and end point of MIc
respectively (see Figure 3). Assuming that the velocity vc
remains same (generally the mobile user does not changes
its velocity significantly over a long period) in the next
MI also, we can predict the region of user presence in
near future by the circle with radius LMIc and centre (xe, ye)
as shown in Figure 4. Our cache replacement policy uses
A PREDICTED REGION BASED CACHE REPLACEMENT POLICY FOR LOCATION 83
DEPENDENT DATAIN MOBILE ENVIRONMENT
Copyright © 2008 SciRes. I. J. Communications, Network and System Sciences. 2008; 1:1-103
this region to calculate the distance of data items in cache
as follows:
The distance of data items outside the predicted
region is calculated from the centre of the circle
The distance of data items inside the predicted region
is calculated as the minimum of {LMIc, distance of the
valid scope from the current position of the user}.
Calculating the distance of data items in this way
ensures that
Items outside the predicted region always have the
lower priority than the items inside the predicted
region.
Items inside the predicted region, close to the user
have higher priority.
One of the advantages of using predicted region is that
it dynamically changes with speed of client and MI and
also takes into account the random movement of client.
Now, we define cost function for our cache
replacement policy PPRRP that considers access
probability, predicted region based data distance, valid
scope area and size of the data in cache. Associated with
each cached data object is the replacement cost. When a
new data object needs to be cached and there is
insufficient cache space, the object(s) with lowest
replacement cost is (are) removed until there is enough
space to cache new object. The cost of replacing a data
value j of data item i in client’s cache is calculated as:
{}
'
i,j
'
1
minimumL , D(vs)
'
P.A(vs )
ii,j '
.ifvsPred_Reg
i,j
SMIc
i,j
Cos ti,j '
P.A(vs )1
ii,j '
.ifvsPred_Reg
i,j
'
SD(vs )
i,j i,j
=
(1)
where Pi is the access probability of data item i , A(vs
i,j)
is the area of the valid scope vs
i,j for data value j, Si,j is
the size of data value j and valid scope vs
i,j , D(vs
i,j) is
the distance of the valid scope vs
i,j from the current user
position , D’(vs
i,j) is the distance of the valid scope vs
i,j
from the centre of the predicted region and Pred_Reg is
the predicted region.
4. Simulation Model
This section describes the simulation model used to
evaluate the performance of the proposed location-
dependent cache invalidation methods. Our Simulator is
implemented in C++ and setup is similar and in
accordance with those used in earlier studies [5][14].
4.1. System
Since seamless hand-off from one cell to another is
assumed, the network can be considered a single, large
service area within which the clients can move freely and
obtain location-dependent information services. In our
simulation, the service area is represented by a rectangle
with a fixed size of Size. We assume a “wrapped-around”
[5][14][107] model for the service area. In other words,
when a client leaves one border of the service area, it
enters the service area from the opposite border at the
same velocity.
The database contains ItemNum items. Every item
may display ScopeNum different values for different
client locations within the service area. The size of data
value varies from Smin to Smax and has following three
types of distributions [6]:
IncreasingSize: The size S
i of data item i grows
linearly as i increases, and is given by:
max min
min
(1)*( )
, 1,...,;
1
i
iSS
SSi ItemNum
ItemNum
−−
=+ =
(2)
DecreasingSize: The size Si of data item i decreases
linearly as i increases, and is given by:
max min
max
(1)*( )
, 1,...,;
1
i
iSS
SSi ItemNum
ItemNum
−−
=− =
(3)
RandomSize: The size S
i of data item i falls
randomly between Smin and Smax, given by:
minmax min
()*() ,1,...,;
i
S SprobSSiItemNum=+− =
⎢⎥
⎣⎦
(4)
where, ()
p
rob is a random function with uniformly
distributed value between 0 and 1. The choice of the size
distributions are based on previously published trace
analysis [6]. Though, some researchers have shown that
small data items are accessed more frequently than large
data items, but recent web trace analysis shows that the
correlation between data item size and access frequency is
weak and can be ignored [16]. Combined with the skewed
access pattern, IncreasingSize and DecreasingSize
represent client’s preference for frequently querying
smaller items and larger items respectively. In other
words, with IncreasingSize setting, the clients access the
smallest item most frequently and with DecreasingSize
setting, the clients access the largest item most frequently.
RandomSize, models the case where no correlation
between the access pattern and data size exists.
In the simulation, the scope distributions of the data
items are generated based on voronoi diagrams (VDs)
[12][20] because valid scopes of nearest neighbor queries
is defined by VD. Formally, given sets of point
84 A. KUMAR ET AL.
Copyright © 2008 SciRes. I. J. Communications, Network and System Sciences. 2008; 1:1-103
O={o1,o2,…..,on), V(oi), the Voronoi Cell (VC) for oi, is
defined as the set of points q in the space such that
dist(q,oi) < dist(q,oj), j i. That is, V(oi) consists of set
of points for which oi is nearest neighbor. In our
simulation, first data set Scope Distribution 1 (Figure 5
(a)) contains 110 points randomly distributed in a square
Euclidean space. The second data set, Scope Distribution
2 (Figure 5 (b)), contains the locations of 215 hospitals in
California area, which is extracted from the point data set
available at [18].
This model assumes that two floating-point numbers
are used to represent a two-dimensional coordinate and
one floating-point number to represent the radius of circle.
The size of a floating-point number is FloatSize. The
wireless network is modeled by an uplink channel and a
downlink channel. The uplink channel is used by clients
to submit queries, and the downlink channel is used by
the server to return query responses to target clients. The
communication between the server and a client makes use
of a point-to-point connection. It is assumed that the
available bandwidth is UplinkBa nd for the uplink channel
and DownlinkBand for the downlink channel.
(a) Scope distribution 1 (ScopeNum=110)
(b) Scope distribution 2 (ScopeNum=215)
Figure 5. Scope distributions for performance evaluation
4.2. Client
The mobile client is modeled with two independent
processes: query process and move process. The query
process continuously generates location-dependent read-
only queries for different data items. After the current
query is completed, the client waits for an exponentially
distributed time period with a mean of QueryInterval
before the next query is issued. The client access pattern
over different items follows a Zipf distribution with
skewness parameter θ, which is shown to be a realistic
approximation of skewed data access and are frequently
used to model non-uniform distribution [5][6][16]. In the
Zipf distribution, the access probability of the ith (1i N)
data item is represented as follows
1
1
1
i
N
jj
pi
θ
θ
=
= (5)
where N is the number of items in the database and 0 θ
1.
When θ = 0, the access pattern is uniform. As θ value
is increased the skewness increases. When θ = 1, it is the
strict Zipf distribution. Large θ results in more “skewed”
access distribution. To answer a query, the client first
checks its local cache. If the data value for the requested
item with respect to the current location is available, the
query is satisfied locally. Otherwise, the client submits
the query and its current location to the server and
retrieves the data through the downlink channel. The
move process controls the movement pattern of the client
using the parameter MovingInterval. After the client
keeps moving at a constant velocity for a time period of
MovingInterval, it changes velocity for next MI. The next
speed is selected randomly between MinSpeed and
MaxSpeed. Similarly, the next moving direction
(represented by the angle relative to the x-axis, counter
clock wise taken as positive) is selected randomly
between 0° to 360°. If the difference between MinSpeed
and MaxSpeed is low the mobile users move with almost
same velocity. The client is assumed to have a cache of
fixed size, which is a CacheSizeRatio ratio of the
database size.
4.3. Server
The server is modeled by a single process that services
the requests from clients. The requests are buffered at the
server if necessary, and an infinite queue buffer is
assumed. The FCFS service principle is assumed in the
model. To answer a location-dependent query, the server
locates the correct data value with respect to the specified
location. Since the main concern of this paper is the cost
of the wireless link( i.e. transmission time, receiving time
and disconnections), which is more expensive than the
wired-link and disk I/O costs(i.e. disk access time), the
overheads of request processing and service scheduling at
the server are assumed to be negligible in the model.
5. Performance Evaluation
This section describes the performance parameters and
measures used for simulation and analyze the results of
the simulation.
A PREDICTED REGION BASED CACHE REPLACEMENT POLICY FOR LOCATION 85
DEPENDENT DATAIN MOBILE ENVIRONMENT
Copyright © 2008 SciRes. I. J. Communications, Network and System Sciences. 2008; 1:1-103
5.1. Performance Parameters
The default values of different parameters used in the
simulation experiments are given in Table 1. They are
chosen to be the same as used in earlier studies [5][7][14].
Experiments are performed using different workloads
and system settings. In order to get the true performance
for each algorithm, we collect the result data only after
the system becomes stable, which is defined as the time
when the client caches are full [5][6]. Each simulation
runs for 20,000 client issued queries and each result
obtained in the experiment is taken as the average of 10
simulation runs with Confidence Interval of 96 percent.
For simulation purpose, we assume that all data items
follow the same scope distribution in a single set of
experiments. Two scope distributions with 110 and 215
valid scopes are used (see Figure 5). Since the average
valid scope areas differ for these two scope distributions,
different moving speeds are assumed, i.e., the pair of
(MinSpeed,MaxSpeed) is set to (1,2) and (5,10) for Scope
Distribution 1 and Scope Distribution 2, respectively. The
Caching-Efficiency-Based (CEB) [5] cache invalidation
policy is employed for cache management. For
calculating data distance between valid scope (either a
polygon or a circle) and current location we select a
reference point for each valid scope and take the distance
(Euclidean distance) between the current location and the
reference point. For polygonal valid scope, the reference
point is defined as the endpoint that is closest to the
current location and for circular valid scope, it is defined
as the point where the circumference and the line
connecting the current location and the center of the circle
meet. Access probability for each data item is estimated
by using exponential ageing method [5][6]. Two
parameters are maintained for each data item i: a running
probability Pi and the time of the last access to item tl
i. Pi
is initialized to 0. When a new query is issued for data
item i, Pi is updated using the following formula:
()(1)
l
ici i
Ptt P
α
α
=−+− (6)
where, t
c is the current system time and α is a constant
factor to weigh the importance of most recent access in
the probability estimate. Note that the access probability
is maintained for each data item rather than for each data
value. If the database size is small, the client can maintain
these parameters (i.e., Pi and tl
i for each item i) for all
items in its local cache. However, if the database size is
large, these parameters will occupy a significant amount
of cache space. To alleviate this problem, we set an upper
bound to the amount of cache used for storing these
parameters (5 percent of the total cache size in our
simulation) and use the Least Frequently Used (LFU)
policy to manage the limited space reserved for it.
5.2. Performance Metric
Our primary performance metric is cache hit ratio.
Other performance metrics can be derived from the cache
hit ratio. Cache hit ratio can be defined as the ratio of the
number of queries answered by the client’s cache to the
total number of queries generated by the client.
Specifically, higher the cache hit ratio, higher is the local
data availability, less is the uplink and downlink costs and
the battery consumption [5][6][14].
5.3. Comparison of Location-Dependent Cache
Replacement Schemes
This subsection examines the performance of different
location-dependent cache replacement policies, namely,
PPRRP with PAID, FAR and Manhattan. Figures 6 to 15
show the cache hit ratio for both scope distributions (see
Figure 5) under various query intervals, moving intervals,
cache sizes, client’s speed and Zipf’s distribution.
5.3.1. Effect of Query Interval
Query interval is the time interval between two
consecutive client queries. In this set of experiments, we
vary the mean query interval from 20 seconds to 200
seconds. Figures 6 and 7, show cache performance for
both scope distributions and for the data distributions:
IncreasingSize, RandomSize and DecreasingSize.
Results show that, when the query interval is increased,
almost every scheme shows a worse performance. This is
because, for a longer query interval when a new query is
issued the client has a lower probability of residing in one
of the valid scopes of the previously queried data items.
When different cache replacement policies are compared,
the proposed policy substantially outperforms the existing
policies. Figure 6, compares the performance of cache
replacement policies versus query interval for Scope
Distribution 1. PPRRP, which prefers object within the
predicted region over the objects outside the predicted
region and gives priority to the data objects that are
nearer to the client’s current position within the predicted
region, obtains better performance than PAID. Average
improvement of PPRRP over PAID is 28%, 21% and
19% for IncreasingSize, RandomSize and DecreasingSize
respectively for Scope Distribution 1. Figure 7, shows the
effect of change in query interval on the performance of
cache replacement policies for Scope Distribution 2. It
can be observed that the PPRRP show similar gains in
performance for Scope Distribution 2 also as they were
for Scope Distribution1. The average improvement of
PPRRP over PAID for Scope Distribution 2 is 8%, 12.2%,
and 11% for IncreasingSize, RandomSize and
DecreasingSize respectively .
5.3.2. Effect of Moving Interval
This subsection examines the performance of the
replacement policies when the client’s moving interval is
varied. Longer the moving interval, less frequent is the
changes in velocity of the client and hence, there is lesser
86 A. KUMAR ET AL.
Copyright © 2008 SciRes. I. J. Communications, Network and System Sciences. 2008; 1:1-103
randomness in the client's movement. The performance
results for IncreasingSize, RandomSize and
DecreasingSize of data distribution are shown in Fig-
ures 8 and 9.
We can see that when the moving interval is varied
from 50 seconds to 400 seconds, the cache hit ratio
decreases drastically. The reason for this is as follows.
For a relatively longer moving interval, there is a high
probability of the client leaving one valid region and
entering another. Consequently, the cached data are less
likely to be re-used for subsequent queries. This leads to a
decreased performance with increase in MI.
Figure 8, compares the performance of cache
replacement policies over changing moving interval for
Scope Distribution 1. Though for small MI, the
randomness in client movement is more as compared to
larger MI but PPRRP perform better than all existing
policies for both small and large MI. The predicted region
in PPRRP helps to keep the data items within the
influence of client’s movement, thereby reducing the
effect of randomness in client’s movement. Average
improvement of PPRRP over PAID is 27%, 21% and
12% for IncreasingSize, RandomSize and DecreasingSize
respectively. Figure 9, compares the performance of
cache replacement policies over change in moving
interval for Scope Distribution 2. For Scope Distribution
2 also, we get similar improvement in performance of
PPRRP as they were for Scope Distribution 1. The
average improvement of PPRRP over PAID for Scope
Distribution 2 is 7%, 6.3%, and 11% for IncreasingSize,
RandomSize and DecreasingSize respectively.
5.3.3. Effect of Cache Size
In this set of experiments, we intend to investigate the
robustness of the proposed replacement schemes under
various cache sizes. Figures 10 and 11, show the results
when CacheSizeRatio is varied from 5% to 20%. As
expected, the performance of all replacement schemes
improves with increasing cache size. This is because the
cache can hold large number of data items which
increases the probability of getting a cache hit. Moreover,
replacement occurs less frequent in comparison to the
case when cache size is low. Figure 10, shows the
performance for Scope Distribution 1. PPRRP
consistently outperforms the existing policies from small
size cache to large size cache. Average improvement of
PPRRP over PAID is 25%, 24% and 18% for
IncreasingSize, RandomSize and DecreasingSize
respectively. Figure 11, compares the performance of
cache replacement policies under varied CacheSizeRatio
for Scope Distribution 2. Results show similar
performance gains for all proposed policies for Scope
Distribution 2 also. The average improvement of PPRRP
over PAID for Scope Distribution 2 is 6.7%, 9.4%, and
10% for IncreasingSize, RandomSize and DecreasingSize
respectively.
5.3.4. Effect of Client’s Speed
This subsection examines the effect of change in
client’s speed on the performance of the proposed cache
replacement policies. Client’s cache hit ratio is shown
against client speed from Figures 12 to 13. Four speed
ranges [15], 1~5m/s, 6~10m/s, 16~20m/s, 25~35m/s,
corresponding to the speed of a walking human, a running
human, a vehicle with moderate speed and a vehicle with
high speed, respectively are used. It can be seen that very
high cache hit ratio can be achieved for walking human.
For higher speed range, the cache hit ratio drops as client
spends less time at each geographic location and the valid
scope of each data item stored in cache becomes less
effective. In PPRRP, higher the speed of client, greater is
the predicted region and hence more data items stored in
the cache are held in that region. Average improvement
of PPRRP over PAID for different speed ranges for
Scope Distribution 1 and Scope Distribution 2 are given
in Table 2 and Table 3 respectively.
5.3.5. Effect of Client’s Access pattern
This subsection examines the performance of the
replacement policies under various clients’ access pattern.
Client’s access pattern is modeled by Zipf’s Distribution
[16]. The Zipf parameter θ determines the “skewness” of
the access pattern over data items. When θ=0, the access
pattern is uniformly distributed. When θ increases, more
access is focused on few items (skewed). Figures 14 and
15, show the impact of access pattern on performance of
replacement policies for both scope distributions. As
desired, performance of PPRRP along with other
replacement policies, increases with increase in θ for both
Scope distributions over all the three data size
distributions. Moreover, proposed policy shows an edge
over other policies.
6. Conclusion
In this paper, we presented a cache replacement policy,
PPRRP, for location-dependent data that uses predicted
region based cost function for selecting data items to be
replaced from the cache. In order to decide which data
items to replace from cache, an attempt must be made to
predict what items will be accessed in the future.
We emphasized on predicting a region around mobile
client’s current position apart from considering only
user’s direction or distance. Predicted region plays an
important role in improving the system performance.
Using the predicted region of user influence, the data
items in the vicinity of client’s current position are not
purged from cache, which increases the cache hit.
Proposed policy takes into account factors like access
probability, data distance from predicted region, valid
scope and data size in cache. In PPRRP, data distance is
calculated such that the data items within the predicted
region are given higher priority than the data items
A PREDICTED REGION BASED CACHE REPLACEMENT POLICY FOR LOCATION 87
DEPENDENT DATAIN MOBILE ENVIRONMENT
Copyright © 2008 SciRes. I. J. Communications, Network and System Sciences. 2008; 1:1-103
outside the predicted region. In addition to giving highest
priority to the data items within the predicted region, data
items nearer to the client’s current position are also
favored over other data items in the same predicted region.
A number of simulation experiments have been
conducted to evaluate the performance of the PPRRP.
The results show that PPRRP, with different system
settings, give better performance (improves cache hit ratio)
than PAID. PPRRP achieves an average improvement of
more than 25% for Increasing Size, more than 20 % for
Random Size and more than 15 % for DecreasingSize as
compared to existing replacement policy PAID.
We compared all policies under various system
parameters and for two scope distributions. But in real-
world, there can be lots of scope distributions varying
from regions to countries. Moreover, we used Euclidean
distance to calculate the distance between two points.
However, in the real-world, this distance cannot represent
the real distance that a user has to cover in order to reach
to an object. For example, in City model the distance
between two points is calculated using Manhattan
distance. Hence, there is a need to explore proposed
policies under different real-world conditions. Also, LDIS
have been referred to as some of the most promising
technological inventions that may impact consumer
behavior over the next decade. User’s expectations from
mobile networks are becoming more demanding and this
trend is expected to intensify in the future. Hence, the
user need/profile is essential to develop better cache
replacement policies. Also, existing location dependent
cache management schemes consider only location-
dependent data. Investigation of location dependent cache
management schemes, which includes temporal
dependent update frequencies, is also required. Future
schemes for cache management should consider the
above facts to over come the challenges posed by LDIS.
7. References
[1] D. Barbara, “Mobile Computing and Databases: A
Survey,” IEEE Transactions on Knowledge and Data
Engineering, Vol. 11, No. 1, pp. 108-117,
January/February 1999.
[2] D. Barbara and T. Imielinski, “Sleepers and
Workaholics: Caching Strategies in Mobile
Environments,” In the Proceedings of the ACM
SIGMOD Conference on Management of Data,
Minneapolis, USA, pp. 1-12, 1994.
[3] I. A. Getting, “The Global Positioning System,”
IEEE Spectrum, Vol. 30, No. 12, pp. 36-47,
December 1993.
[4] D. L. Lee, W.-C. Lee, J. Xu and B. Zheng, “Data
Management in Location-Dependent Information
Services,” IEEE Pervasive Computing, Vol. 1, No. 3,
pp. 65-72, July 2002.
[5] B. Zheng, J. Xu and D. L. Lee, “Cache Invalidation
and Replacement Strategies for Location-Dependent
Data in Mobile Environments,” IEEE Transactions
on Computers, Vol. 51, No. 10, pp. 1141-1153,
October 2002.
[6] L. Yin, G. Cao and Y. Cai, “A Generalized Target-
Driven Cache Replacement Policy for Mobile
Environments,” In the Proceedings of the IEEE
Symposium on Applications on the Internet, pp. 14-
21, January 2003.
[7] K. Lai, Z. Tari and P. Bertok, “Location–Aware
Cache Replacement for Mobile Environments,” IEEE
Global Telecommunication Conference
(GLOBECOM 04), Vol. 6, pp. 3441-3447, 29th
November- 3rd December 2004.
[8] M. Erwig, R. H. Guting, M. Schneider and M.
Vazirgiannis, “Spatio-Temporal Data Types: An
Approach to Modeling and Querying Moving
Objects in Databases,” GeoInformatica, Vol. 3, No. 3,
pp. 269-296, 1999.
[9] Q. Ren and M. H. Dhunham, “Using Semantic
Caching to Manage Location Dependent Data in
Mobile Computing,” In the Proceedings of 6th
ACM/IEEE Mobile Computing and Networking
(MobiCom), Boston, USA, pp. 210-221, 2000.
[10] S. Dar, M. J. Franklin, B. T. Jonsson, D. Srivastava
and M. Tan, “Semantic Data Caching and
Replacement,” In the Proceedings of the 22nd
International Conference on Very Large
Databases(VLDB), pp. 330-341,1996.
[11] A. Balamash and M. Krunz, “An Overview of Web
Caching Replacement Algorithms,” IEEE
Communications Surveys & Tutorials, Vol. 6, No. 2,
pp. 44-56, 2004.
[12] J. O’ Rourke, Computational Geometry in C, chapter
5, Univ. of Cambridge Press, 1998.
[13] E. O’Neil and P. O’Neil, “The LRU-k page
replacement algorithm for database disk buffering”,
In the Proceedings of the ACM SIGMOD, Vol. 22,
No. 2, pp. 296-306, 1993.
[14] Il-dong Jung, Young-ho You, Jong-hwan Lee and
Kyungsok Kim, “Broadcasting and caching policies
for location-dependent queries in urban areas,” In the
Proceedings of the 2nd International Workshop on
Mobile Commerce, Atlanta, USA, pp. 54-60,
September 2002.
[15] C. Lu, G. Xing, O. Chipara and C. L. Fok,
“MobiQuery: A Spatio Temporal Data Service for
Sensor Networks,” In the Proceedings of the 2nd
88 A. KUMAR ET AL.
Copyright © 2008 SciRes. I. J. Communications, Network and System Sciences. 2008; 1:1-103
International Conference on Embedded Networked
Sensor System (ACM SenSys’04), Baltimore, USA,
pp. 320-334, 2004.
[16] L. Breslau, P. Cao, L. Fan, G. Phillips and S. Shenker,
“Web Caching and Zipf-like Distributions: Evidence
and Implications,” In the Proceedings of the 18th
Annual Joint Conference of the IEEE Computer and
Communications Societies (INFOCOM’99), New
York, USA, Vol. 1, pp. 126-134, March 1999.
[17] T. Camp, J. Boleng and V. Davies, “A Survey of
Mobility Model for Ad Hoc Network Research,”
Wireless Communication & Mobile Computing
(WCMC): Special Issue on Mobile AdHoc
Networking: Research, Trends and Applications, Vol.
2, No. 5, pp. 483-502, 2002.
[18] Spatial Datasets. Website at
http://www.rtreeportal.org, 2005.
[19] J. Jing, A. Helal and A. Elmagarmid, “Client-Server
Computing in Mobile Environments,” In the
Proceedings of the ACM Computing Surveys, Vol.
31, No. 2, pp. 117-157, June 1999.
[20] M. Berg, M. Kreveld M. Overmars and O.
Schwarzkopf, “Computational Geometry: Algorithms
and Applications,” chapter 7, New York, NY, USA,
Springer-Verlag, 1996.
A PREDICTED REGION BASED CACHE REPLACEMENT POLICY FOR LOCATION 89
DEPENDENT DATAIN MOBILE ENVIRONMENT
Copyright © 2008 SciRes. I. J. Communications, Network and System Sciences. 2008; 1:1-103
Table 2. Improvement of PPRRP over PAID on different speed ranges
(Scope distribution 1)
Speed Ranges
(m/s) IncreasingSize
(%) RandomSize
(%) DecreasingSize
(%)
1~5 43 29 21
6~10 41 31 25
16~20 40 30 24
25~35 37 29 35
Table 1. Configuration parameters and de fa ult parameter settings for simulation model
Parameter Description Setting
Size size of the rectangle service area 4000m*4000m,
44000m*27000m
ItemNum number of data items in the
database
500
ScopeNum number of different values at
various locations for each item
110, 215
Smin minimum size of a data value 64 bytes
Smax maximum size of a data value 1024 bytes
UplinkBand bandwidth of the uplink channel 19.2 kbps
DownlinkBand bandwidth of the downlink
channel
144 kbps
F loatSize size of a floating-point number 4 bytes
QueryInterval average time interval between two
consecutive queries
50.0 s
MovingInterval time duration that the client keeps
moving at a constant velocity
100.0s
MinSpeed minimum moving speed of the
client
1m s-1,5 m s-1
MaxSpeed maximum moving speed of the
client
2m s-1, 10 m s-1
CacheSizeRatio ratio of the cache size to the
database size
10 %
θ skewness parameter for the Zipf
access distribution
0.5
Table3. Improvement of PPRRP over PAID on different speed ranges
(Scope distribution 2)
Speed Ranges
(m/s) IncreasingSize
(%) RandomSize
(%) DecreasingSize
(%)
1~5 2.2 15.3 16.2
6~10 5.8 8.5 13.5
16~20 4.7 10.9 6.3
25~35 1.2 7.4 11.3
90 A. KUMAR ET AL.
Copyright © 2008 SciRes. I. J. Communications, Network and System Sciences. 2008; 1:1-103
Distribution: IncreasingSize
Query Interval (seconds)
20406080100 120 140 160180 200
Cache Hit Ratio
0.10
0.15
0.20
0.25
0.30
0.35
PPRRP
PAID
FAR
Manhattan
(a)
Distribution: RandomSize
Query Interval (seconds)
20406080100 120 140 160180 200
Cache Hit Ratio
0.05
0.10
0.15
0.20
0.25
0.30
PPRRP
PAID
FAR
Manhattan
(b)
Distribution: DecreasingSize
Query Interval (seconds)
20406080100 120 140 160 180200
Cache Hit Ratio
0.05
0.10
0.15
0.20
0.25
PPRRP
PAID
FAR
Manhattan
(c)
Figure 6. Cache hit ratio vs query interval for scope
distribution 1
Distribution: IncreasingSize
Query Interval (seconds)
20406080100120 140160 180 200
Cache Hit Ratio
0.40
0.45
0.50
0.55
0.60
0.65
0.70
0.75
PPRRP
PAID
FAR
Manhatt a n
(a)
Distribution: RandomSize
Query Interval (seconds)
20406080100120 140160 180 200
Cache Hit Ratio
0.35
0.40
0.45
0.50
0.55
0.60
0.65
0.70
PPRRP
PAID
FAR
Manhattan
(b)
Distribution: DecreasingSize
Query Interval (seconds)
20406080100120 140160 180200
Cache Hit Ra tio
0.35
0.40
0.45
0.50
0.55
0.60
0.65
0.70
PPRRP
PAID
FAR
Manhattan
(c)
Figure 7. Cache hit ratio vs query interval for scope
distribution 2
A PREDICTED REGION BASED CACHE REPLACEMENT POLICY FOR LOCATION 91
DEPENDENT DATAIN MOBILE ENVIRONMENT
Copyright © 2008 SciRes. I. J. Communications, Network and System Sciences. 2008; 1:1-103
Distribution: IncreasingSize
Moving Interval (seconds)
50 100200400
Cache Hit Ratio
0.10
0.15
0.20
0.25
0.30
PPRRP
PAID
FAR
Manhattan
(a)
Distribution: RandomSize
Moving Interval (seconds)
50 100200400
Cache Hit Ratio
0.05
0.10
0.15
0.20
0.25
PPRRP
PAID
FAR
Manhattan
(b)
Distribution: DecreasingSize
Moving Interval (seconds)
50 100200400
Cache Hit Ratio
0.05
0.10
0.15
0.20
0.25
PPRRP
PAID
FAR
Manhattan
(c)
Figure 8. Cache hit ratio vs moving interval for scope
distribution 1
Distribution: IncreasingSize
Moving Interval (seconds)
50 100200400
Cache Hit Ratio
0.40
0.45
0.50
0.55
0.60
0.65
0.70
PPRRP
PAID
FAR
Manhattan
(a)
Distribution: Random Si ze
Moving Interval (seconds)
50 100200400
Cache Hit Ratio
0.35
0.40
0.45
0.50
0.55
0.60
0.65
0.70
PPRRP
PAID
FAR
Manhattan
(b)
Distribution: DecreasingSize
Moving Interval (seconds)
50 100200400
Cache Hit Ratio
0.30
0.35
0.40
0.45
0.50
0.55
0.60
0.65
PPRRP
PAID
FAR
Manhattan
(c)
Figure 9. Cache hit ratio vs moving interval for scope
distribution 2
92 A. KUMAR ET AL.
Copyright © 2008 SciRes. I. J. Communications, Network and System Sciences. 2008; 1:1-103
Distribution: IncreasingSize
Cache Size (% of Database Size)
5 101520
Cache Hit Ratio
0.05
0.10
0.15
0.20
0.25
0.30
0.35
PPRRP
PAID
FAR
Manhattan
(a)
Distribution: RandomSize
Cache Size (% of Database Size)
5 101520
Cache Hit Ratio
0.06
0.08
0.10
0.12
0.14
0.16
0.18
0.20
0.22
0.24
0.26
PPRRP
PAID
FAR
Manhattan
(b)
Distribution: DecreasingSize
Cache Size (% of Database Size)
5 101520
Cache Hit Ratio
0.06
0.08
0.10
0.12
0.14
0.16
0.18
0.20
0.22
0.24
PPRRP
PAID
FAR
Manhattan
(c)
Figure 10. Cache hit ratio vs cache size for scope
distribution 1
Distribution: IncreasingSize
Cache Size (% of Database Size)
5 101520
Cache Hit Ratio
0.40
0.45
0.50
0.55
0.60
0.65
0.70
PPRRP
PAID
FAR
Manhattan
(a)
Distribution: Random Si ze
Cache Size (% of Database Size)
5 101520
Cache Hit Ratio
0.35
0.40
0.45
0.50
0.55
0.60
0.65
0.70
PPRRP
PAID
FAR
Manhattan
(b)
Distribution: DecreasingSize
Cache Size (% of Database Size)
5 101520
Cache Hit Ratio
0.30
0.35
0.40
0.45
0.50
0.55
0.60
0.65
0.70
PPRRP
PAID
FAR
Manhattan
(c)
Figure 11. Cache hit ratio vs cache size for scope
distribution 2
A PREDICTED REGION BASED CACHE REPLACEMENT POLICY FOR LOCATION 93
DEPENDENT DATAIN MOBILE ENVIRONMENT
Copyright © 2008 SciRes. I. J. Communications, Network and System Sciences. 2008; 1:1-103
Distribution: IncreasingSize
Client Speed (m/s)
1-56-1016-20 25-35
Cache Hit Ratio
0.00
0.05
0.10
0.15
0.20
0.25
PPRRP
PAID
FAR
Manhattan
(a)
Distribution: RandomSize
Client Speed (m/s)
1-56-1016-20 25-35
Cache Hit Ratio
0.00
0.02
0.04
0.06
0.08
0.10
0.12
0.14
0.16
PPRRP
PAID
FAR
Manhattan
(b)
Distribution: DecreasingSize
Client Speed (m/s)
1-56-1016-20 25-35
Cache Hit Ratio
0.00
0.02
0.04
0.06
0.08
0.10
0.12
0.14
PPRRP
PAID
FAR
Manhattan
(c)
Figure 12. Cache hit ratio vs client speed for scope
distribution 1
Distribu tion: IncreasingSize
Client Speed (m/s)
1-56-1016-20 25-35
Cache Hit Ratio
0.0
0.2
0.4
0.6
0.8
PPRRP
PAID
FAR
Manhattan
(a)
Distribution: RandomSize
Client Speed (m/s)
1-56-1016-20 25-35
Cache Hit Ratio
0.0
0.2
0.4
0.6
0.8
PPRRP
PAID
FAR
Manhattan
(b)
Distribution: DecreasingSize
Client Speed (m/s)
1-56-1016-20 25-35
Cache Hit Ratio
0.0
0.2
0.4
0.6
0.8
1.0
PPRRP
PAID
FAR
Manhattan
(c)
Figure 13. Cache hit ratio vs client speed for scope
distribution 2
94 A. KUMAR ET AL.
Copyright © 2008 SciRes. I. J. Communications, Network and System Sciences. 2008; 1:1-103
Distribution: IncreasingSize
Zipf parameter (θ)
0.0 0.2 0.4 0.6 0.8 1.0
Cache Hit Ratio
0.0
0.1
0.2
0.3
0.4
0.5
0.6
PPRRP
PAID
FAR
Manhattan
(a)
Distribution: RandomSize
Zipf parameter (θ)
0.0 0.2 0.4 0.6 0.8 1.0
Cache Hit Ratio
0.0
0.1
0.2
0.3
0.4
0.5
0.6
PPRRP
PAID
FAR
Manhattan
(b)
Distribution: Decreas ingSize
Zipf parameter (θ)
0.0 0.2 0.4 0.6 0.8 1.0
Cache Hit Ratio
0.0
0.1
0.2
0.3
0.4
0.5
PPRRP
PAID
FAR
Manhattan
(c)
Figure 14. Cache hit ratio vs zipf parameter for scope
distribution 1
Distribution: IncreasingSize
Zipf parameter (θ)
0.0 0.2 0.4 0.6 0.8 1.0
Cache Hit Ratio
0.4
0.5
0.6
0.7
0.8
PPRRP
PAID
FAR
Manhattan
(a)
Distribution: RandomSize
Zipf parameter (θ)
0.0 0.2 0.4 0.6 0.8 1.0
Cache Hit Ratio
0.4
0.5
0.6
0.7
PPRRP
PAID
FAR
Manhattan
(b)
Distribution: Decreas ingSize
Zipf parameter (θ)
0.0 0.2 0.4 0.6 0.8 1.0
Cache Hit Ratio
0.3
0.4
0.5
0.6
0.7
PPRRP
PAID
FAR
Manhattan
(c)
Figure 15. Cache hit ratio vs zipf parameter for scope
distribution 2