A Predicted Region based Cache Replacement Policy for Location Dependent Data in Mobile Environment

doi:10.4236/ijcns.2008.11012

I. J. Communications, Network and System Sciences. 2008; 1: 1-103

Published Online February 2008 in SciRes (http://www.SRPublishing.org/journal/ijcns/).

A Predicted Region based Cache Replacement

Policy for Location Dependent Data in Mobile

Environment

Ajey KUMAR1, Manoj MISRA2, Anil K. SARJE2

Department of Electronics and Computer Engineering,

Indian Institute of Technology Roorkee, Roorkee, India

E-mail: 1ajeykumar7@gmail.com, 2{manojfec,sarjefec}@iitr.ernet.in

Abstract

Caching frequently accessed data items on the mobile client is an effective technique to improve the system

performance in mobile environment. Proper choice of cache replacement technique to find a suitable subset of items for

eviction from cache is very important because of limited cache size. Available policies do not take into account the

movement patterns of the client. In this paper, we propose a new cache replacement policy for location dependent data

in mobile environment. The proposed policy uses a predicted region based cost function to select an item for eviction

from cache. The policy selects the predicted region based on client’s movement and uses it to calculate the data

distance of an item. This makes the policy adaptive to client’s movement pattern unlike earlier policies that consider the

directional / non-directional data distance only. We call our policy the Prioritized Predicted Region based Cache

Replacement Policy (PPRRP). Simulation results show that the proposed policy significantly improves the system

performance in comparison to previous schemes in terms of cache hit ratio.

Keywords: Mobile Computing, Cache Replacement, Location Dependent Data, Valid Scope, Location Dependent

Information Services.

1. Introduction

Recent advances in wireless technology have ushered

the new paradigm of mobile computing. With the advent

of new mobile infrastructures providing higher bandwidth

and constant connection to the network from virtually

everywhere and advances in the global positioning

technologies , a new class of services referred to as

Location Dependent Information Services (LDIS) has

evolved and is gaining popularity among mobile users.

LDIS provide users with the ability to access

information related to their current location. By including

location as a part of user’s context information, service

carriers can provide better services to many value-added

applications such as travel and tourist information system,

assistance and emergency system, nearest object

searching system and local information access system,

which specifically target the mobile users. Hence, the

need for LDIS arises frequently. For example, imagine

you are on a business trip in a foreign city and you do not

know the city very well, you have no idea where to go. In

this situation, with the help of your portable device you

can query your personal interest like nearest restaurant,

nearest gas station, nearest ATM, nearest theater etc. and

can get the response on your device.

There are many challenges in providing LDIS services

to users. These challenges include limited bandwidth,

limited client power and intermittent connectivity

[1][2][4][6]. Caching helps to address some of these

challenges. Caching of frequently accessed data item on

client side is an effective technique to improve data

accessibility and to reduce access cost. However, due to

limitations of cache size on mobile devices, it is

impossible to hold all accessed data items in the cache.

Thus, there is a need of efficient cache replacement

algorithms to find out suitable subset of data items for

eviction from the cache. Good cache performance heavily

depends on these replacement algorithms. Also, for wide

area mobile environments due to its distributed nature, the

design of an efficient cache replacement policy becomes

very crucial and challenging to ensure good cache

performance.

Most of the existing cache replacement policies use

cost functions to incorporate different factors including

access frequency, update rate, size of objects etc.

Temporal-based traditional cache replacement strategies,

80 A. KUMAR ET AL.

such as Least Recently Used (LRU), Least Frequently

Used (LFU) and LRU-k [13] have been studied widely in

the past. These polices are based on the assumption that

client’s access patterns exhibit temporal locality (i.e. the

objects that were queried frequently in the past will

continue to be queried frequently in the future). These

algorithms replace data items based on recency or

frequency of access. LRU is the most commonly used

recency based algorithm. LFU, which maintains a

reference count for each cached objects, is the most

commonly used frequency based algorithm. Frequency-

based algorithms are well suited for skewed access

patterns in which a large fraction of accesses go to a

disproportionately small set of hot objects. Frequency and

recency based algorithms form the two ends of a

spectrum of caching algorithms. LRU-k cache

replacement algorithm tries to balance both recency and

frequency. However, in mobile networks where clients

utilize location dependent information services, clients

access pattern do not only exhibit temporal locality, but

also exhibit dependence on location of data, location of

the client and direction of the client’s movement [4][5].

As a result, the aforementioned policies are unsuitable for

supporting location dependent services because they do

not take into account the location of data objects and the

movement of mobile clients. Hence, relying solely on

temporal locality when making cache replacement

decisions will result in poor cache hit ratio in LDIS. To

overcome this problem, several location-aware cache

replacement policies [5][7][9][10] have been proposed for

location dependent information services.

Manhattan Distance-based cache replacement policy

[10] supports location dependent queries in urban

environments. Cache replacement decisions are made on

the basis of distance between a client’s current location

and the location of each cached data object. Objects with

the highest Manhattan distance from the client’s current

location are evicted at cache replacement. While the

Manhattan based policy accounts for the distance between

clients and data objects, the major limitation of this

approach is that it ignores the temporal access locality of

mobile clients and the direction of client movement while

making cache replacement decisions.

Furthest Away Replacement (FAR) [9] policy uses the

current location and movement direction of mobile clients

to make cache replacement decisions. Cached objects are

grouped into two sets, viz., in-direction set and the out-

direction set. Data objects in the out-direction set are

always evicted first before those in the in-direction set.

Objects in each set are evicted in the order based on their

distance from the client. Similar to the Manhattan

approach, FAR also neglects the temporal properties of

clients’ access pattern. It also becomes ineffective when

mobile clients change direction frequently due to frequent

change in the membership of objects between the in-

direction and out-direction sets.

In Probability Area Inverse Distance (PAID) [5] policy,

the cost function of data item i takes into account the

access probabilities (Pi) of data objects, area of its valid

scopes A(vsi) and the distance D(vsi) between the client’s

current position and the valid scope of the object

concerned (known as data distance). The cost function of

PAID is given by() ()PAvsD vs

ii i

. It neither takes into

account the size of the data object nor does it give priority

to the data objects in cache that are near to the mobile

client. Mobility Aware Replacement Scheme (MARS) [7]

policy is also a cost based policy, which comprises of

temporal score, spatial score and cost of retrieving an

object. Unlike PAID, it takes into account the updates of

data objects. But as far as location dependent data (LDD)

is concerned, their update rate (if exist) is negligible as

compared to temporal data. Thus, for LDD, only spatial

score dominates which consists of area of valid scope,

data distance from current client location and data

distance from future client location. The impact of

client’s anticipated location or region in deciding cache

replacement still remains unexplored.

None of these cache replacement policies are suitable

if client changes its direction of movement quite often.

Existing cache replacement policies only consider the

data distance (directional/undirectional) but not the

distance based on the predicted region or area where the

client can be in near future. Very few of these policies

[5][7] account for the location and movement of mobile

clients.

In this paper, we predict an area in the vicinity of

client’s current position, and give priority to the cached

data items that belong to this area irrespective of the

client’s movement direction. PPRRP calculates the data

item cost on the basis of access probability, valid scope

area, data size in cache and data distance based on the

predicted region, which has not been considered in any of

the existing policies.

The rest of the paper is organized as follows. Section 2

briefly describes mobile system model used in our work.

Section 3 details the proposed new cost based

replacement policy PPRRP. Section 4 and section 5 deal

with simulation model, and performance evaluation and

comparison simultaneously. Section 6 concludes the

paper.

2. Mobile System Model

We assume a cellular mobile network that is similar to

the model discussed by D. Barbara [1] as mobile

computing infrastructure. A mobile system [1][4][5][6] is

usually made up of a server, moving clients, and a wireless

connection between them (see Figure 1). The

geographical area is divided into small regions, called

cells. Each cell has a Base Station (BS) or Mobile

Support Station (MSS) augmented with wireless interfaces

A PREDICTED REGION BASED CACHE REPLACEMENT POLICY FOR LOCATION 81

DEPENDENT DATAIN MOBILE ENVIRONMENT

and a number of Mobile Clients (MCs). Inter-cell and

intra-cell communications are managed by the MSSs. The

MCs communicate with the MSS by wireless links within

its radio coverage area. An MC can move freely from one

location to another within a cell or between cells while

retaining its network connection. An MC can either

connect to a MSS through a wireless communication

channel or disconnect from the MSS by operating in the

doze (power save) mode. The MC queries the database

servers that are connected to a wired network. The

wireless channel is logically separated into two sub

channels: uplink channel and downlink channel. The

uplink channel is used by MCs to submit queries to the

server via an MSS, while the downlink channel is used by

MSSs to disseminate information or to forward the

answers from the server to the target client.

The mobile computing platform can be effectively

described under the client/server paradigm [19]. A data

item is the basic unit for update and query. MCs only

issue simple requests to read the most recent copy of a

data item. There may be one or more processes running

on an MC. These processes are referred to as clients (we

use the terms MC and client/users interchangeably). In

order to serve a request from a client, the MSS needs to

communicate with the database server to retrieve the data

items. Since the communication between the MSS and the

database server is through wired links and is transparent

to the clients (i.e., from the client’s point of view, the

MSS is the same as the database server), we also use the

terms MSS and server interchangeably.

Fixed/wireline Network

(

Internet,LAN: Mbps to Gbps

)

Cellular Data

(

CDPD,DataTae: 19.2 kbps

)

Wireless LAN

(

Aironet,Wavelan: 2.12 Mbps

)

Mobile GSM

(

9.6 kbps

)

Base Node

Figure 1. Mobile computing system model

The information system provides location dependent

services to mobile clients. The geographical area covered

by the information system is referred as the service area.

In this paper, we assume a geometric location model, i.e.,

a location is specified as a two-dimensional coordinate.

However, it can be easily extended to 3-dimension space

by including the third dimension. Mobile clients can

identify their locations using systems such as the Global

Positioning System (GPS) [3]. The data item value is

different from data item, i.e., data item value for a data

item is an instance of the item valid for a certain

geographical region. Moreover, the data item value is

different from data item. Data item value for a data item

is an instance of the item valid for a certain geographical

region. So, a data item can show different values when

clients at different locations query it. For example,

“restaurant” is a data item, and the data values for this

data item vary depending on the location of query i.e.

point at which the query “ Tell me the nearest restaurant”

was issued by a mobile client. The valid scope of a data

item value is defined as the region within which the data

item value is valid. In a two-dimensional space, a valid

scope (vs) can be represented by a geometric polygon p

(e1, …,en ), where e i

's are endpoints of the polygon. A

mobile client can cache data on its local disk or in any

storage system that survives power-off. In this paper, data

values are assumed to be of fixed size and read–only so

that we can omit the influence of data sizes and updates

on cache performance and concentrate on the impact

caused by the unique properties of location-dependent

data.

y-axis

O

x-axis

(a) Abstract model

1

θ

2

θ

4

θ

3

θ

MI

1

= t seconds

O

1

v

2

v

3

v

4

v

y-axis

x-axis

MI

2

= t seconds

MI

3

= t seconds

MI

4

= t seconds

(b) Discrete model

Figure 2. Client’s movement path

We also assume an unconstrained network, where

mobile clients move freely inside the geographical region

covered by the mobile network (without any

restrictions).In the abstract model, the path of a moving

client is represented by a curve in 2-dimension (x-y

plane), as shown in Figure 2(a). Though abstract model is

82 A. KUMAR ET AL.

simple, from computer implementation point of view,

discrete model is preferred [8]. In discrete model, the path

traveled is modeled as a sequence of line segments, each

associated with fixed velocity and direction, as shown in

Figure 2 (b). Length of the line segment depends on the

rate of change of direction and velocity. For random

movement this duration between change in direction and

velocity is small and for regular movement and highway

users this duration is large. This duration is known as

Moving Interval (MI) [5][7][17]. Figure 2(b) shows the

discrete movement of a mobile user with MI of t seconds.

The distance between any two locations or points is the

length of a straight line connecting the two points (i.e.

Euclidean distance).

3. Prioritized Predicted Region Based Cache

Replacement Policy (PPRRP)

3.1. Motivation

LDIS, being spatial in nature needs that the distance of

data item from client’s current position and its valid scope

area should also be taken into account for cache

replacement. Greater the distance of valid scope of data

from the user’s current position lower is the chance that

client will enter in the valid scope area of the data in near

future. Thus, it is better to eject the farthest data value

when replacement takes place. Moreover, because the

client is mobile, its position at the time of next query will

be different from its current position. Therefore the

client’s movement should also be taken into account.

Locations in the opposite direction of client’s movement

have very low chance of being revisited, though they may

be very close to it. Based on this reasoning, existing

cache replacement schemes like FAR and PAID

(directional) assign higher priorities to data items in the

client’s direction of movement giving very low priority to

the items in the opposite direction of user’s movement.

However, with random movement patterns of clients, it is

not always necessary that client will continue moving in

the same direction. Therefore, evicting data values which

are in the opposite direction of client’s movement but are

very close to client’s current position may degrade the

overall performance.

3.2. Basic Idea

When client movement pattern is random, retaining the

data items in the direction of user movement and

discarding the data items that are in the opposite direction

of user movement may not improve the performance.

Therefore, our cache replacement policy considers the

predicted region of user presence in near future (rather

than considering the direction of user movement only)

while selecting an item for replacement. The predicted

region is based on the client’s current movement pattern.

We show that it is useful to calculate the data distance

with respect to this region so that the data items in the

vicinity of client’s current position are not purged from

cache. Valid scope area of the data item and the amount

of space required to store the data item in cache are also

used to select an item for replacement. This is because the

client has higher chance of being in large region rather

than small regions and keeping smaller size data items in

cache helps to accommodate a large number of data items

in the cache. Hence, our cache replacement policy selects

a victim with low access probability, small valid scope

area falling outside the predicted region and large data

size.

3.3. Approach

We make use of discrete model for client’s movement

path as described in Section 3.2 and used in [5][7][14].

Assuming a predefined path of mobile user or a

predefined destination is generally not possible unless we

are dealing with a case where the user is moving in a train

or a ship and the entire path of the user is known well in

advance. For discrete model, the direction and velocity of

the user are known for current MI. At the end of each MI,

direction is selected randomly between 0° to 360°

degrees and the velocity between minimum (vmin) and

maximum speed (vmax) of the client. This motivates us in

predicting a region instead of predicting the path.

MIc

L

(,)

s

x

y(,)

ee

x

y

Figure 3. Current moving interval

MIc

L

(,)

ss

xy (,)

ee

xy

Figure 4. Predicted region

Let vc be the velocity in current moving interval MIc,

LMIc be the length of MIc along direction θc and (xs, ys)

and (xe, ye) be the starting and end point of MIc

respectively (see Figure 3). Assuming that the velocity vc

remains same (generally the mobile user does not changes

its velocity significantly over a long period) in the next

MI also, we can predict the region of user presence in

near future by the circle with radius LMIc and centre (xe, ye)

as shown in Figure 4. Our cache replacement policy uses

A PREDICTED REGION BASED CACHE REPLACEMENT POLICY FOR LOCATION 83

DEPENDENT DATAIN MOBILE ENVIRONMENT

this region to calculate the distance of data items in cache

as follows:

• The distance of data items outside the predicted

region is calculated from the centre of the circle

• The distance of data items inside the predicted region

is calculated as the minimum of {LMIc, distance of the

valid scope from the current position of the user}.

Calculating the distance of data items in this way

ensures that

• Items outside the predicted region always have the

lower priority than the items inside the predicted

region.

• Items inside the predicted region, close to the user

have higher priority.

One of the advantages of using predicted region is that

it dynamically changes with speed of client and MI and

also takes into account the random movement of client.

Now, we define cost function for our cache

replacement policy PPRRP that considers access

probability, predicted region based data distance, valid

scope area and size of the data in cache. Associated with

each cached data object is the replacement cost. When a

new data object needs to be cached and there is

insufficient cache space, the object(s) with lowest

replacement cost is (are) removed until there is enough

space to cache new object. The cost of replacing a data

value j of data item i in client’s cache is calculated as:

{}

'

i,j

'

1

minimumL , D(vs)

'

P.A(vs )

ii,j '

.ifvsPred_Reg

i,j

SMIc

i,j

Cos ti,j '

P.A(vs )1

ii,j '

.ifvsPred_Reg

i,j

'

SD(vs )

i,j i,j

∈

=

∉

⎧

⎪

⎨

⎪

⎩

(1)

where Pi is the access probability of data item i , A(vs’

i,j)

is the area of the valid scope vs’

i,j for data value j, Si,j is

the size of data value j and valid scope vs’

i,j , D(vs’

i,j) is

the distance of the valid scope vs’

i,j from the current user

position , D’(vs’

i,j) is the distance of the valid scope vs’

i,j

from the centre of the predicted region and Pred_Reg is

the predicted region.

4. Simulation Model

This section describes the simulation model used to

evaluate the performance of the proposed location-

dependent cache invalidation methods. Our Simulator is

implemented in C++ and setup is similar and in

accordance with those used in earlier studies [5][14].

4.1. System

Since seamless hand-off from one cell to another is

assumed, the network can be considered a single, large

service area within which the clients can move freely and

obtain location-dependent information services. In our

simulation, the service area is represented by a rectangle

with a fixed size of Size. We assume a “wrapped-around”

[5][14][107] model for the service area. In other words,

when a client leaves one border of the service area, it

enters the service area from the opposite border at the

same velocity.

The database contains ItemNum items. Every item

may display ScopeNum different values for different

client locations within the service area. The size of data

value varies from Smin to Smax and has following three

types of distributions [6]:

• IncreasingSize: The size S

i of data item i grows

linearly as i increases, and is given by:

max min

min

(1)*( )

, 1,...,;

1

i

iSS

SSi ItemNum

ItemNum

−−

=+ =

−

(2)

• DecreasingSize: The size Si of data item i decreases

linearly as i increases, and is given by:

max min

max

(1)*( )

, 1,...,;

1

i

iSS

SSi ItemNum

ItemNum

−−

=− =

−

(3)

• RandomSize: The size S

i of data item i falls

randomly between Smin and Smax, given by:

minmax min

()*() ,1,...,;

i

S SprobSSiItemNum=+− =

⎢⎥

⎣⎦

(4)

where, ()

p

rob is a random function with uniformly

distributed value between 0 and 1. The choice of the size

distributions are based on previously published trace

analysis [6]. Though, some researchers have shown that

small data items are accessed more frequently than large

data items, but recent web trace analysis shows that the

correlation between data item size and access frequency is

weak and can be ignored [16]. Combined with the skewed

access pattern, IncreasingSize and DecreasingSize

represent client’s preference for frequently querying

smaller items and larger items respectively. In other

words, with IncreasingSize setting, the clients access the

smallest item most frequently and with DecreasingSize

setting, the clients access the largest item most frequently.

RandomSize, models the case where no correlation

between the access pattern and data size exists.

In the simulation, the scope distributions of the data

items are generated based on voronoi diagrams (VDs)

[12][20] because valid scopes of nearest neighbor queries

is defined by VD. Formally, given sets of point

84 A. KUMAR ET AL.

O={o1,o2,…..,on), V(oi), the Voronoi Cell (VC) for oi, is

defined as the set of points q in the space such that

dist(q,oi) < dist(q,oj), ∀ j ≠ i. That is, V(oi) consists of set

of points for which oi is nearest neighbor. In our

simulation, first data set Scope Distribution 1 (Figure 5

(a)) contains 110 points randomly distributed in a square

Euclidean space. The second data set, Scope Distribution

2 (Figure 5 (b)), contains the locations of 215 hospitals in

California area, which is extracted from the point data set

available at [18].

This model assumes that two floating-point numbers

are used to represent a two-dimensional coordinate and

one floating-point number to represent the radius of circle.

The size of a floating-point number is FloatSize. The

wireless network is modeled by an uplink channel and a

downlink channel. The uplink channel is used by clients

to submit queries, and the downlink channel is used by

the server to return query responses to target clients. The

communication between the server and a client makes use

of a point-to-point connection. It is assumed that the

available bandwidth is UplinkBa nd for the uplink channel

and DownlinkBand for the downlink channel.

(a) Scope distribution 1 (ScopeNum=110)

(b) Scope distribution 2 (ScopeNum=215)

Figure 5. Scope distributions for performance evaluation

4.2. Client

The mobile client is modeled with two independent

processes: query process and move process. The query

process continuously generates location-dependent read-

only queries for different data items. After the current

query is completed, the client waits for an exponentially

distributed time period with a mean of QueryInterval

before the next query is issued. The client access pattern

over different items follows a Zipf distribution with

skewness parameter θ, which is shown to be a realistic

approximation of skewed data access and are frequently

used to model non-uniform distribution [5][6][16]. In the

Zipf distribution, the access probability of the ith (1≤i ≤N)

data item is represented as follows

1

i

N

jj

pi

θ

=

=∑ (5)

where N is the number of items in the database and 0≤ θ

≤1.

When θ = 0, the access pattern is uniform. As θ value

is increased the skewness increases. When θ = 1, it is the

strict Zipf distribution. Large θ results in more “skewed”

access distribution. To answer a query, the client first

checks its local cache. If the data value for the requested

item with respect to the current location is available, the

query is satisfied locally. Otherwise, the client submits

the query and its current location to the server and

retrieves the data through the downlink channel. The

move process controls the movement pattern of the client

using the parameter MovingInterval. After the client

keeps moving at a constant velocity for a time period of

MovingInterval, it changes velocity for next MI. The next

speed is selected randomly between MinSpeed and

MaxSpeed. Similarly, the next moving direction

(represented by the angle relative to the x-axis, counter

clock wise taken as positive) is selected randomly

between 0° to 360°. If the difference between MinSpeed

and MaxSpeed is low the mobile users move with almost

same velocity. The client is assumed to have a cache of

fixed size, which is a CacheSizeRatio ratio of the

database size.

4.3. Server

The server is modeled by a single process that services

the requests from clients. The requests are buffered at the

server if necessary, and an infinite queue buffer is

assumed. The FCFS service principle is assumed in the

model. To answer a location-dependent query, the server

locates the correct data value with respect to the specified

location. Since the main concern of this paper is the cost

of the wireless link( i.e. transmission time, receiving time

and disconnections), which is more expensive than the

wired-link and disk I/O costs(i.e. disk access time), the

overheads of request processing and service scheduling at

the server are assumed to be negligible in the model.

5. Performance Evaluation

This section describes the performance parameters and

measures used for simulation and analyze the results of

the simulation.

A PREDICTED REGION BASED CACHE REPLACEMENT POLICY FOR LOCATION 85

DEPENDENT DATAIN MOBILE ENVIRONMENT

5.1. Performance Parameters

The default values of different parameters used in the

simulation experiments are given in Table 1. They are

chosen to be the same as used in earlier studies [5][7][14].

Experiments are performed using different workloads

and system settings. In order to get the true performance

for each algorithm, we collect the result data only after

the system becomes stable, which is defined as the time

when the client caches are full [5][6]. Each simulation

runs for 20,000 client issued queries and each result

obtained in the experiment is taken as the average of 10

simulation runs with Confidence Interval of 96 percent.

For simulation purpose, we assume that all data items

follow the same scope distribution in a single set of

experiments. Two scope distributions with 110 and 215

valid scopes are used (see Figure 5). Since the average

valid scope areas differ for these two scope distributions,

different moving speeds are assumed, i.e., the pair of

(MinSpeed,MaxSpeed) is set to (1,2) and (5,10) for Scope

Distribution 1 and Scope Distribution 2, respectively. The

Caching-Efficiency-Based (CEB) [5] cache invalidation

policy is employed for cache management. For

calculating data distance between valid scope (either a

polygon or a circle) and current location we select a

reference point for each valid scope and take the distance

(Euclidean distance) between the current location and the

reference point. For polygonal valid scope, the reference

point is defined as the endpoint that is closest to the

current location and for circular valid scope, it is defined

as the point where the circumference and the line

connecting the current location and the center of the circle

meet. Access probability for each data item is estimated

by using exponential ageing method [5][6]. Two

parameters are maintained for each data item i: a running

probability Pi and the time of the last access to item tl

i. Pi

is initialized to 0. When a new query is issued for data

item i, Pi is updated using the following formula:

()(1)

l

ici i

Ptt P

α

=−+− (6)

where, t

c is the current system time and α is a constant

factor to weigh the importance of most recent access in

the probability estimate. Note that the access probability

is maintained for each data item rather than for each data

value. If the database size is small, the client can maintain

these parameters (i.e., Pi and tl

i for each item i) for all

items in its local cache. However, if the database size is

large, these parameters will occupy a significant amount

of cache space. To alleviate this problem, we set an upper

bound to the amount of cache used for storing these

parameters (5 percent of the total cache size in our

simulation) and use the Least Frequently Used (LFU)

policy to manage the limited space reserved for it.

5.2. Performance Metric

Our primary performance metric is cache hit ratio.

Other performance metrics can be derived from the cache

hit ratio. Cache hit ratio can be defined as the ratio of the

number of queries answered by the client’s cache to the

total number of queries generated by the client.

Specifically, higher the cache hit ratio, higher is the local

data availability, less is the uplink and downlink costs and

the battery consumption [5][6][14].

5.3. Comparison of Location-Dependent Cache

Replacement Schemes

This subsection examines the performance of different

location-dependent cache replacement policies, namely,

PPRRP with PAID, FAR and Manhattan. Figures 6 to 15

show the cache hit ratio for both scope distributions (see

Figure 5) under various query intervals, moving intervals,

cache sizes, client’s speed and Zipf’s distribution.

5.3.1. Effect of Query Interval

Query interval is the time interval between two

consecutive client queries. In this set of experiments, we

vary the mean query interval from 20 seconds to 200

seconds. Figures 6 and 7, show cache performance for

both scope distributions and for the data distributions:

IncreasingSize, RandomSize and DecreasingSize.

Results show that, when the query interval is increased,

almost every scheme shows a worse performance. This is

because, for a longer query interval when a new query is

issued the client has a lower probability of residing in one

of the valid scopes of the previously queried data items.

When different cache replacement policies are compared,

the proposed policy substantially outperforms the existing

policies. Figure 6, compares the performance of cache

replacement policies versus query interval for Scope

Distribution 1. PPRRP, which prefers object within the

predicted region over the objects outside the predicted

region and gives priority to the data objects that are

nearer to the client’s current position within the predicted

region, obtains better performance than PAID. Average

improvement of PPRRP over PAID is 28%, 21% and

19% for IncreasingSize, RandomSize and DecreasingSize

respectively for Scope Distribution 1. Figure 7, shows the

effect of change in query interval on the performance of

cache replacement policies for Scope Distribution 2. It

can be observed that the PPRRP show similar gains in

performance for Scope Distribution 2 also as they were

for Scope Distribution1. The average improvement of

PPRRP over PAID for Scope Distribution 2 is 8%, 12.2%,

and 11% for IncreasingSize, RandomSize and

DecreasingSize respectively .

5.3.2. Effect of Moving Interval

This subsection examines the performance of the

replacement policies when the client’s moving interval is

varied. Longer the moving interval, less frequent is the

changes in velocity of the client and hence, there is lesser

86 A. KUMAR ET AL.

randomness in the client's movement. The performance

results for IncreasingSize, RandomSize and

DecreasingSize of data distribution are shown in Fig-

ures 8 and 9.

We can see that when the moving interval is varied

from 50 seconds to 400 seconds, the cache hit ratio

decreases drastically. The reason for this is as follows.

For a relatively longer moving interval, there is a high

probability of the client leaving one valid region and

entering another. Consequently, the cached data are less

likely to be re-used for subsequent queries. This leads to a

decreased performance with increase in MI.

Figure 8, compares the performance of cache

replacement policies over changing moving interval for

Scope Distribution 1. Though for small MI, the

randomness in client movement is more as compared to

larger MI but PPRRP perform better than all existing

policies for both small and large MI. The predicted region

in PPRRP helps to keep the data items within the

influence of client’s movement, thereby reducing the

effect of randomness in client’s movement. Average

improvement of PPRRP over PAID is 27%, 21% and

12% for IncreasingSize, RandomSize and DecreasingSize

respectively. Figure 9, compares the performance of

cache replacement policies over change in moving

interval for Scope Distribution 2. For Scope Distribution

2 also, we get similar improvement in performance of

PPRRP as they were for Scope Distribution 1. The

average improvement of PPRRP over PAID for Scope

Distribution 2 is 7%, 6.3%, and 11% for IncreasingSize,

RandomSize and DecreasingSize respectively.

5.3.3. Effect of Cache Size

In this set of experiments, we intend to investigate the

robustness of the proposed replacement schemes under

various cache sizes. Figures 10 and 11, show the results

when CacheSizeRatio is varied from 5% to 20%. As

expected, the performance of all replacement schemes

improves with increasing cache size. This is because the

cache can hold large number of data items which

increases the probability of getting a cache hit. Moreover,

replacement occurs less frequent in comparison to the

case when cache size is low. Figure 10, shows the

performance for Scope Distribution 1. PPRRP

consistently outperforms the existing policies from small

size cache to large size cache. Average improvement of

PPRRP over PAID is 25%, 24% and 18% for

IncreasingSize, RandomSize and DecreasingSize

respectively. Figure 11, compares the performance of

cache replacement policies under varied CacheSizeRatio

for Scope Distribution 2. Results show similar

performance gains for all proposed policies for Scope

Distribution 2 also. The average improvement of PPRRP

over PAID for Scope Distribution 2 is 6.7%, 9.4%, and

10% for IncreasingSize, RandomSize and DecreasingSize

respectively.

5.3.4. Effect of Client’s Speed

This subsection examines the effect of change in

client’s speed on the performance of the proposed cache

replacement policies. Client’s cache hit ratio is shown

against client speed from Figures 12 to 13. Four speed

ranges [15], 1~5m/s, 6~10m/s, 16~20m/s, 25~35m/s,

corresponding to the speed of a walking human, a running

human, a vehicle with moderate speed and a vehicle with

high speed, respectively are used. It can be seen that very

high cache hit ratio can be achieved for walking human.

For higher speed range, the cache hit ratio drops as client

spends less time at each geographic location and the valid

scope of each data item stored in cache becomes less

effective. In PPRRP, higher the speed of client, greater is

the predicted region and hence more data items stored in

the cache are held in that region. Average improvement

of PPRRP over PAID for different speed ranges for

Scope Distribution 1 and Scope Distribution 2 are given

in Table 2 and Table 3 respectively.

5.3.5. Effect of Client’s Access pattern

This subsection examines the performance of the

replacement policies under various clients’ access pattern.

Client’s access pattern is modeled by Zipf’s Distribution

[16]. The Zipf parameter θ determines the “skewness” of

the access pattern over data items. When θ=0, the access

pattern is uniformly distributed. When θ increases, more

access is focused on few items (skewed). Figures 14 and

15, show the impact of access pattern on performance of

replacement policies for both scope distributions. As

desired, performance of PPRRP along with other

replacement policies, increases with increase in θ for both

Scope distributions over all the three data size

distributions. Moreover, proposed policy shows an edge

over other policies.

6. Conclusion

In this paper, we presented a cache replacement policy,

PPRRP, for location-dependent data that uses predicted

region based cost function for selecting data items to be

replaced from the cache. In order to decide which data

items to replace from cache, an attempt must be made to

predict what items will be accessed in the future.

We emphasized on predicting a region around mobile

client’s current position apart from considering only

user’s direction or distance. Predicted region plays an

important role in improving the system performance.

Using the predicted region of user influence, the data

items in the vicinity of client’s current position are not

purged from cache, which increases the cache hit.

Proposed policy takes into account factors like access

probability, data distance from predicted region, valid

scope and data size in cache. In PPRRP, data distance is

calculated such that the data items within the predicted

region are given higher priority than the data items

A PREDICTED REGION BASED CACHE REPLACEMENT POLICY FOR LOCATION 87

DEPENDENT DATAIN MOBILE ENVIRONMENT

outside the predicted region. In addition to giving highest

priority to the data items within the predicted region, data

items nearer to the client’s current position are also

favored over other data items in the same predicted region.

A number of simulation experiments have been

conducted to evaluate the performance of the PPRRP.

The results show that PPRRP, with different system

settings, give better performance (improves cache hit ratio)

than PAID. PPRRP achieves an average improvement of

more than 25% for Increasing Size, more than 20 % for

Random Size and more than 15 % for DecreasingSize as

compared to existing replacement policy PAID.

We compared all policies under various system

parameters and for two scope distributions. But in real-

world, there can be lots of scope distributions varying

from regions to countries. Moreover, we used Euclidean

distance to calculate the distance between two points.

However, in the real-world, this distance cannot represent

the real distance that a user has to cover in order to reach

to an object. For example, in City model the distance

between two points is calculated using Manhattan

distance. Hence, there is a need to explore proposed

policies under different real-world conditions. Also, LDIS

have been referred to as some of the most promising

technological inventions that may impact consumer

behavior over the next decade. User’s expectations from

mobile networks are becoming more demanding and this

trend is expected to intensify in the future. Hence, the

user need/profile is essential to develop better cache

replacement policies. Also, existing location dependent

cache management schemes consider only location-

dependent data. Investigation of location dependent cache

management schemes, which includes temporal

dependent update frequencies, is also required. Future

schemes for cache management should consider the

above facts to over come the challenges posed by LDIS.

7. References

[1] D. Barbara, “Mobile Computing and Databases: A

Survey,” IEEE Transactions on Knowledge and Data

Engineering, Vol. 11, No. 1, pp. 108-117,

January/February 1999.

[2] D. Barbara and T. Imielinski, “Sleepers and

Workaholics: Caching Strategies in Mobile

Environments,” In the Proceedings of the ACM

SIGMOD Conference on Management of Data,

Minneapolis, USA, pp. 1-12, 1994.

[3] I. A. Getting, “The Global Positioning System,”

IEEE Spectrum, Vol. 30, No. 12, pp. 36-47,

December 1993.

[4] D. L. Lee, W.-C. Lee, J. Xu and B. Zheng, “Data

Management in Location-Dependent Information

Services,” IEEE Pervasive Computing, Vol. 1, No. 3,

pp. 65-72, July 2002.

[5] B. Zheng, J. Xu and D. L. Lee, “Cache Invalidation

and Replacement Strategies for Location-Dependent

Data in Mobile Environments,” IEEE Transactions

on Computers, Vol. 51, No. 10, pp. 1141-1153,

October 2002.

[6] L. Yin, G. Cao and Y. Cai, “A Generalized Target-

Driven Cache Replacement Policy for Mobile

Environments,” In the Proceedings of the IEEE

Symposium on Applications on the Internet, pp. 14-

21, January 2003.

[7] K. Lai, Z. Tari and P. Bertok, “Location–Aware

Cache Replacement for Mobile Environments,” IEEE

Global Telecommunication Conference

(GLOBECOM 04), Vol. 6, pp. 3441-3447, 29th

November- 3rd December 2004.

[8] M. Erwig, R. H. Guting, M. Schneider and M.

Vazirgiannis, “Spatio-Temporal Data Types: An

Approach to Modeling and Querying Moving

Objects in Databases,” GeoInformatica, Vol. 3, No. 3,

pp. 269-296, 1999.

[9] Q. Ren and M. H. Dhunham, “Using Semantic

Caching to Manage Location Dependent Data in

Mobile Computing,” In the Proceedings of 6th

ACM/IEEE Mobile Computing and Networking

(MobiCom), Boston, USA, pp. 210-221, 2000.

[10] S. Dar, M. J. Franklin, B. T. Jonsson, D. Srivastava

and M. Tan, “Semantic Data Caching and

Replacement,” In the Proceedings of the 22nd

International Conference on Very Large

Databases(VLDB), pp. 330-341,1996.

[11] A. Balamash and M. Krunz, “An Overview of Web

Caching Replacement Algorithms,” IEEE

Communications Surveys & Tutorials, Vol. 6, No. 2,

pp. 44-56, 2004.

[12] J. O’ Rourke, Computational Geometry in C, chapter

5, Univ. of Cambridge Press, 1998.

[13] E. O’Neil and P. O’Neil, “The LRU-k page

replacement algorithm for database disk buffering”,

In the Proceedings of the ACM SIGMOD, Vol. 22,

No. 2, pp. 296-306, 1993.

[14] Il-dong Jung, Young-ho You, Jong-hwan Lee and

Kyungsok Kim, “Broadcasting and caching policies

for location-dependent queries in urban areas,” In the

Proceedings of the 2nd International Workshop on

Mobile Commerce, Atlanta, USA, pp. 54-60,

September 2002.

[15] C. Lu, G. Xing, O. Chipara and C. L. Fok,

“MobiQuery: A Spatio Temporal Data Service for

Sensor Networks,” In the Proceedings of the 2nd

88 A. KUMAR ET AL.

International Conference on Embedded Networked

Sensor System (ACM SenSys’04), Baltimore, USA,

pp. 320-334, 2004.

[16] L. Breslau, P. Cao, L. Fan, G. Phillips and S. Shenker,

“Web Caching and Zipf-like Distributions: Evidence

and Implications,” In the Proceedings of the 18th

Annual Joint Conference of the IEEE Computer and

Communications Societies (INFOCOM’99), New

York, USA, Vol. 1, pp. 126-134, March 1999.

[17] T. Camp, J. Boleng and V. Davies, “A Survey of

Mobility Model for Ad Hoc Network Research,”

Wireless Communication & Mobile Computing

(WCMC): Special Issue on Mobile AdHoc

Networking: Research, Trends and Applications, Vol.

2, No. 5, pp. 483-502, 2002.

[18] Spatial Datasets. Website at

http://www.rtreeportal.org, 2005.

[19] J. Jing, A. Helal and A. Elmagarmid, “Client-Server

Computing in Mobile Environments,” In the

Proceedings of the ACM Computing Surveys, Vol.

31, No. 2, pp. 117-157, June 1999.

[20] M. Berg, M. Kreveld M. Overmars and O.

Schwarzkopf, “Computational Geometry: Algorithms

and Applications,” chapter 7, New York, NY, USA,

Springer-Verlag, 1996.

A PREDICTED REGION BASED CACHE REPLACEMENT POLICY FOR LOCATION 89

DEPENDENT DATAIN MOBILE ENVIRONMENT

Table 2. Improvement of PPRRP over PAID on different speed ranges

(Scope distribution 1)

Speed Ranges

(m/s) IncreasingSize

(%) RandomSize

(%) DecreasingSize

(%)

1~5 43 29 21

6~10 41 31 25

16~20 40 30 24

25~35 37 29 35

Table 1. Configuration parameters and de fa ult parameter settings for simulation model

Parameter Description Setting

Size size of the rectangle service area 4000m*4000m,

44000m*27000m

ItemNum number of data items in the

database

500

ScopeNum number of different values at

various locations for each item

110, 215

Smin minimum size of a data value 64 bytes

Smax maximum size of a data value 1024 bytes

UplinkBand bandwidth of the uplink channel 19.2 kbps

DownlinkBand bandwidth of the downlink

channel

144 kbps

F loatSize size of a floating-point number 4 bytes

QueryInterval average time interval between two

consecutive queries

50.0 s

MovingInterval time duration that the client keeps

moving at a constant velocity

100.0s

MinSpeed minimum moving speed of the

client

1m s-1,5 m s-1

MaxSpeed maximum moving speed of the

client

2m s-1, 10 m s-1

CacheSizeRatio ratio of the cache size to the

database size

10 %

θ skewness parameter for the Zipf

access distribution

0.5

Table3. Improvement of PPRRP over PAID on different speed ranges

(Scope distribution 2)

Speed Ranges

(m/s) IncreasingSize

(%) RandomSize

(%) DecreasingSize

(%)

1~5 2.2 15.3 16.2

6~10 5.8 8.5 13.5

16~20 4.7 10.9 6.3

25~35 1.2 7.4 11.3

90 A. KUMAR ET AL.

Distribution: IncreasingSize

Query Interval (seconds)

20406080100 120 140 160180 200

Cache Hit Ratio

0.10

0.15

0.20

0.25

0.30

0.35

PPRRP

PAID

FAR

Manhattan

(a)

Distribution: RandomSize

Query Interval (seconds)

20406080100 120 140 160180 200

Cache Hit Ratio

0.05

0.10

0.15

0.20

0.25

0.30

PPRRP

PAID

FAR

Manhattan

(b)

Distribution: DecreasingSize

Query Interval (seconds)

20406080100 120 140 160 180200

Cache Hit Ratio

0.05

0.10

0.15

0.20

0.25

PPRRP

PAID

FAR

Manhattan

(c)

Figure 6. Cache hit ratio vs query interval for scope

distribution 1

Distribution: IncreasingSize

Query Interval (seconds)

20406080100120 140160 180 200

Cache Hit Ratio

0.40

0.45

0.50

0.55

0.60

0.65

0.70

0.75

PPRRP

PAID

FAR

Manhatt a n

(a)

Distribution: RandomSize

Query Interval (seconds)

20406080100120 140160 180 200

Cache Hit Ratio

0.35

0.40

0.45

0.50

0.55

0.60

0.65

0.70

PPRRP

PAID

FAR

Manhattan

(b)

Distribution: DecreasingSize

Query Interval (seconds)

20406080100120 140160 180200

Cache Hit Ra tio

0.35

0.40

0.45

0.50

0.55

0.60

0.65

0.70

PPRRP

PAID

FAR

Manhattan

(c)

Figure 7. Cache hit ratio vs query interval for scope

distribution 2

A PREDICTED REGION BASED CACHE REPLACEMENT POLICY FOR LOCATION 91

DEPENDENT DATAIN MOBILE ENVIRONMENT

Distribution: IncreasingSize

Moving Interval (seconds)

50 100200400

Cache Hit Ratio

0.10

0.15

0.20

0.25

0.30

PPRRP

PAID

FAR

Manhattan

(a)

Distribution: RandomSize

Moving Interval (seconds)

50 100200400

Cache Hit Ratio

0.05

0.10

0.15

0.20

0.25

PPRRP

PAID

FAR

Manhattan

(b)

Distribution: DecreasingSize

Moving Interval (seconds)

50 100200400

Cache Hit Ratio

0.05

0.10

0.15

0.20

0.25

PPRRP

PAID

FAR

Manhattan

(c)

Figure 8. Cache hit ratio vs moving interval for scope

distribution 1

Distribution: IncreasingSize

Moving Interval (seconds)

50 100200400

Cache Hit Ratio

0.40

0.45

0.50

0.55

0.60

0.65

0.70

PPRRP

PAID

FAR

Manhattan

(a)

Distribution: Random Si ze

Moving Interval (seconds)

50 100200400

Cache Hit Ratio

0.35

0.40

0.45

0.50

0.55

0.60

0.65

0.70

PPRRP

PAID

FAR

Manhattan

(b)

Distribution: DecreasingSize

Moving Interval (seconds)

50 100200400

Cache Hit Ratio

0.30

0.35

0.40

0.45

0.50

0.55

0.60

0.65

PPRRP

PAID

FAR

Manhattan

(c)

Figure 9. Cache hit ratio vs moving interval for scope

distribution 2

92 A. KUMAR ET AL.

Distribution: IncreasingSize

Cache Size (% of Database Size)

5 101520

Cache Hit Ratio

0.05

0.10

0.15

0.20

0.25

0.30

0.35

PPRRP

PAID

FAR

Manhattan

(a)

Distribution: RandomSize

Cache Size (% of Database Size)

5 101520

Cache Hit Ratio

0.06

0.08

0.10

0.12

0.14

0.16

0.18

0.20

0.22

0.24

0.26

PPRRP

PAID

FAR

Manhattan

(b)

Distribution: DecreasingSize

Cache Size (% of Database Size)

5 101520

Cache Hit Ratio

0.06

0.08

0.10

0.12

0.14

0.16

0.18

0.20

0.22

0.24

PPRRP

PAID

FAR

Manhattan

(c)

Figure 10. Cache hit ratio vs cache size for scope

distribution 1

Distribution: IncreasingSize

Cache Size (% of Database Size)

5 101520

Cache Hit Ratio

0.40

0.45

0.50

0.55

0.60

0.65

0.70

PPRRP

PAID

FAR

Manhattan

(a)

Distribution: Random Si ze

Cache Size (% of Database Size)

5 101520

Cache Hit Ratio

0.35

0.40

0.45

0.50

0.55

0.60

0.65

0.70

PPRRP

PAID

FAR

Manhattan

(b)

Distribution: DecreasingSize

Cache Size (% of Database Size)

5 101520

Cache Hit Ratio

0.30

0.35

0.40

0.45

0.50

0.55

0.60

0.65

0.70

PPRRP

PAID

FAR

Manhattan

(c)

Figure 11. Cache hit ratio vs cache size for scope

distribution 2

A PREDICTED REGION BASED CACHE REPLACEMENT POLICY FOR LOCATION 93

DEPENDENT DATAIN MOBILE ENVIRONMENT

Distribution: IncreasingSize

Client Speed (m/s)

1-56-1016-20 25-35

Cache Hit Ratio

0.00

0.05

0.10

0.15

0.20

0.25

PPRRP

PAID

FAR

Manhattan

(a)

Distribution: RandomSize

Client Speed (m/s)

1-56-1016-20 25-35

Cache Hit Ratio

0.00

0.02

0.04

0.06

0.08

0.10

0.12

0.14

0.16

PPRRP

PAID

FAR

Manhattan

(b)

Distribution: DecreasingSize

Client Speed (m/s)

1-56-1016-20 25-35

Cache Hit Ratio

0.00

0.02

0.04

0.06

0.08

0.10

0.12

0.14

PPRRP

PAID

FAR

Manhattan

(c)

Figure 12. Cache hit ratio vs client speed for scope

distribution 1

Distribu tion: IncreasingSize

Client Speed (m/s)

1-56-1016-20 25-35

Cache Hit Ratio

0.0

0.2

0.4

0.6

0.8

PPRRP

PAID

FAR

Manhattan

(a)

Distribution: RandomSize

Client Speed (m/s)

1-56-1016-20 25-35

Cache Hit Ratio

0.0

0.2

0.4

0.6

0.8

PPRRP

PAID

FAR

Manhattan

(b)

Distribution: DecreasingSize

Client Speed (m/s)

1-56-1016-20 25-35

Cache Hit Ratio

0.0

0.2

0.4

0.6

0.8

1.0

PPRRP

PAID

FAR

Manhattan

(c)

Figure 13. Cache hit ratio vs client speed for scope

distribution 2

94 A. KUMAR ET AL.

Distribution: IncreasingSize

Zipf parameter (θ)

0.0 0.2 0.4 0.6 0.8 1.0

Cache Hit Ratio

0.0

0.1

0.2

0.3

0.4

0.5

0.6

PPRRP

PAID

FAR

Manhattan

(a)

Distribution: RandomSize

Zipf parameter (θ)

0.0 0.2 0.4 0.6 0.8 1.0

Cache Hit Ratio

0.0

0.1

0.2

0.3

0.4

0.5

0.6

PPRRP

PAID

FAR

Manhattan

(b)

Distribution: Decreas ingSize

Zipf parameter (θ)

0.0 0.2 0.4 0.6 0.8 1.0

Cache Hit Ratio

0.0

0.1

0.2

0.3

0.4

0.5

PPRRP

PAID

FAR

Manhattan

(c)

Figure 14. Cache hit ratio vs zipf parameter for scope

distribution 1

Distribution: IncreasingSize

Zipf parameter (θ)

0.0 0.2 0.4 0.6 0.8 1.0

Cache Hit Ratio

0.4

0.5

0.6

0.7

0.8

PPRRP

PAID

FAR

Manhattan

(a)

Distribution: RandomSize

Zipf parameter (θ)

0.0 0.2 0.4 0.6 0.8 1.0

Cache Hit Ratio

0.4

0.5

0.6

0.7

PPRRP

PAID

FAR

Manhattan

(b)

Distribution: Decreas ingSize

Zipf parameter (θ)

0.0 0.2 0.4 0.6 0.8 1.0

Cache Hit Ratio

0.3

0.4

0.5

0.6

0.7

PPRRP

PAID

FAR

Manhattan

(c)

Figure 15. Cache hit ratio vs zipf parameter for scope

distribution 2

Paper Menu >>

Journal Menu >>