
ICNP'06 1

Benefit-based Data Caching in Ad Hoc Networks

Bin Tang, Himanshu Gupta and Samir Das

Computer Science Department, Stony Brook University


Outline

Problem Addressed, and Motivation
Problem Formulation
Related Work
Centralized Greedy Algorithm
Distributed Implementation
Performance Evaluation
Conclusions


Problem Addressed

In a general ad hoc network with limited memory at each node, where should data items be cached so that the total access (communication) cost is minimized?


Motivation

Ad hoc networks are resource constrained
  Limited bandwidth, battery energy, and memory

Caching can save access (communication) cost, and thus bandwidth and energy


Problem Formulation

Given:
  Network graph G(V,E)
  Multiple data items
  Access frequencies (for each node and data item)
  Memory constraint at each node

Select data items to cache at each node under the memory constraint

Minimize

$$\text{total access cost} = \sum_{\text{nodes}} \sum_{\text{data items}} (\text{distance from node to the nearest cache for that data item}) \times (\text{access frequency})$$
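As an illustration of the objective above, here is a minimal sketch that evaluates the total access cost for a given placement. It assumes that distance is measured in hops and that the server counts as a holder of its own data items; the names `hop_distances`, `total_access_cost`, `freq` and `holders` are hypothetical and do not come from the slides.

```python
from collections import deque

def hop_distances(graph, source):
    """Breadth-first search: hop count from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in graph[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def total_access_cost(graph, freq, holders):
    """Sum over (node, item) of (distance to nearest holder) x (access frequency).

    graph:   {node: [neighbors]}
    freq:    {(node, item): access frequency}
    holders: {item: set of nodes holding a copy (server or cache)}  # assumption
    """
    cost = 0
    for (node, item), f in freq.items():
        dist = hop_distances(graph, node)
        nearest = min(dist[h] for h in holders[item] if h in dist)
        cost += nearest * f
    return cost

# Path network 0 - 1 - 2 - 3; node 0 is assumed to be the server for item "D1".
graph = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
freq = {(1, "D1"): 1, (2, "D1"): 2, (3, "D1"): 4}
cost_server_only = total_access_cost(graph, freq, {"D1": {0}})
cost_with_cache = total_access_cost(graph, freq, {"D1": {0, 3}})
```

In this toy network, adding a cache at node 3 lowers the cost because the most frequent requester no longer has to reach the server.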


Related Work

Related to the facility-location problem and the K-median problem; those have no memory constraint

Baev and Rajaraman
  20.5-approximation algorithm for uniform-size data items
  For non-uniform sizes, no polynomial-time approximation unless P = NP

We circumvent the intractability by approximating “benefit” instead of access cost


Related Work - continued

Two major empirical works on distributed caching
  Hara [INFOCOM'99]
  Yin and Cao [INFOCOM'04] (we compare our work with theirs)

Our work is the first to present a distributed caching scheme based on an approximation algorithm


Algorithms

Centralized Greedy Algorithm (CGA)
  Delivers a solution whose “benefit” is at least 1/2 of the optimal benefit

Distributed Greedy Algorithm (DGA)
  Purely localized


Centralized Greedy Algorithm (CGA)

Benefit of caching a data item at a node = the reduction in total access cost,
i.e., (total access cost before caching) - (total access cost after caching)
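The definition can be made concrete with a small sketch that computes the cost before and after adding one cached copy. Hop-count distance, the server being treated as a holder, and the function names `access_cost` and `benefit` are assumptions made for illustration only.

```python
from collections import deque

def hop_distances(graph, source):
    """Breadth-first search: hop count from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in graph[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def access_cost(graph, freq, holders):
    """Total access cost: each request travels to the nearest holder of the item."""
    cost = 0
    for (node, item), f in freq.items():
        dist = hop_distances(graph, node)
        cost += f * min(dist[h] for h in holders[item] if h in dist)
    return cost

def benefit(graph, freq, holders, node, item):
    """Reduction in total access cost if `node` additionally caches `item`."""
    before = access_cost(graph, freq, holders)
    after_holders = {k: set(v) for k, v in holders.items()}  # copy; input untouched
    after_holders[item].add(node)
    after = access_cost(graph, freq, after_holders)
    return before - after

# Path network 0 - 1 - 2 - 3 with the (assumed) server for "D1" at node 0.
graph = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
freq = {(1, "D1"): 1, (2, "D1"): 2, (3, "D1"): 4}
holders = {"D1": {0}}
benefit_at_2 = benefit(graph, freq, holders, 2, "D1")
benefit_at_3 = benefit(graph, freq, holders, 3, "D1")
```

Here caching at node 3 has the larger benefit, since node 3 issues the most requests and is farthest from the server.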


Centralized Greedy Algorithm (CGA)

CGA iteratively selects the most beneficial (data item, node to cache at) pair, i.e., at each stage we pick the pair that has the maximum benefit.

Theorem: CGA is 1/2-approximate for uniform-size data items and 1/4-approximate for non-uniform-size data items.
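The greedy selection can be sketched as follows for the uniform-size case. This is only an illustrative sketch, not the authors' implementation: it assumes a connected network, hop-count distances, unit-size items, a server copy that does not count against the node's cache memory, and a stopping rule of "no pair with positive benefit remains". The names `all_pairs_hops` and `greedy_placement` are hypothetical.

```python
from collections import deque

def all_pairs_hops(graph):
    """Hop distance between every pair of nodes, via one BFS per node."""
    table = {}
    for s in graph:
        dist = {s: 0}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in graph[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        table[s] = dist
    return table

def greedy_placement(graph, freq, servers, capacity):
    """Repeatedly cache the (node, item) pair with the largest positive benefit.

    servers:  {item: node storing the original copy}            # assumption
    capacity: number of unit-size items each node may cache     # uniform sizes
    Returns the list of (node, item) picks in the order they were made.
    """
    hops = all_pairs_hops(graph)
    holders = {item: {s} for item, s in servers.items()}
    used = {n: 0 for n in graph}
    picks = []

    def cost(h):
        return sum(f * min(hops[n][x] for x in h[item])
                   for (n, item), f in freq.items())

    while True:
        base = cost(holders)
        best, best_gain = None, 0
        for n in graph:
            if used[n] >= capacity:
                continue  # memory constraint at node n
            for item in holders:
                if n in holders[item]:
                    continue
                trial = dict(holders)
                trial[item] = holders[item] | {n}
                gain = base - cost(trial)  # benefit of this pair
                if gain > best_gain:
                    best, best_gain = (n, item), gain
        if best is None:
            return picks
        n, item = best
        holders[item].add(n)
        used[n] += 1
        picks.append(best)

graph = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
freq = {(1, "D1"): 1, (2, "D1"): 2, (3, "D1"): 4}
picks = greedy_placement(graph, freq, servers={"D1": 0}, capacity=1)
```

Recomputing the full cost for every candidate pair keeps the sketch short; a practical version would update benefits incrementally.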


CGA Approximation Proof Sketch

G’: modified G in which each node has twice the memory it has in G and caches the data items selected by both CGA and the optimal solution

B(Optimal in G)
  ≤ B(Greedy + Optimal in G’)
  = B(Greedy) + B(Optimal w.r.t. Greedy)
  ≤ B(Greedy) + B(Greedy)   [due to the greedy choice]
  = 2 × B(Greedy)


Distributed Greedy Algorithm (DGA)

Each node caches the most beneficial data items, where the benefit is based on “local traffic”.

“Local traffic” includes:
  Its own data requests
  Data requests to its data items
  Data requests it forwards to others
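One way to picture the local decision is sketched below. The slides do not give the exact benefit formula, so the estimate used here (requests observed at the node multiplied by the hop distance to the current nearest cache) is an assumption, as are unit-size items and the names `choose_local_cache`, `observed` and `nearest_dist`.

```python
def choose_local_cache(observed, nearest_dist, capacity):
    """Pick up to `capacity` items with the largest estimated local benefit.

    observed:     {item: requests seen at this node (its own plus those it forwards)}
    nearest_dist: {item: hops from this node to the current nearest cache of the item}
    The product of the two is an assumed stand-in for the local benefit.
    """
    benefit = {item: count * nearest_dist[item] for item, count in observed.items()}
    # Highest benefit first; ties broken by item name for determinism.
    ranked = sorted(benefit, key=lambda item: (-benefit[item], item))
    return [item for item in ranked[:capacity] if benefit[item] > 0]

observed = {"D1": 5, "D2": 9, "D3": 2}
nearest_dist = {"D1": 3, "D2": 1, "D3": 4}
chosen = choose_local_cache(observed, nearest_dist, capacity=2)
```

Note that "D2" is requested most often but ranks below "D1", because a copy of "D2" is already only one hop away.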


DGA: Nearest Cache Table

Why do we need it?
  To forward requests to the nearest cache
  For local benefit calculation

What is it?
  Each node keeps the ID of the nearest cache for each data item
  Entries are of the form (data item, nearest cache)
  The table sits on top of the routing table

Maintenance: next slide


Maintenance of Nearest-cache Table

When node i caches data item Dj
  Broadcast (i, Dj) to neighbors
  Notify the server, which keeps a list of caches

On receiving (i, Dj)
  If i is nearer than the current nearest cache of Dj, update and forward
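The update rule above can be simulated on a static graph as in the following sketch. Distances are computed by breadth-first search here purely for illustration (in the scheme described, they would come from the underlying routing table), and the names `announce_cache`, `tables` and `cache_list` are hypothetical.

```python
from collections import deque

def hop_distances(graph, source):
    """Breadth-first search: hop count from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in graph[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def announce_cache(graph, tables, cache_list, cache_node, item):
    """Simulate the flood of (cache_node, item) after cache_node caches item.

    tables:     {node: {item: current nearest cache}}
    cache_list: set of caches kept by the item's server
    """
    cache_list.add(cache_node)                 # notify the server
    dist = {n: hop_distances(graph, n) for n in graph}
    tables[cache_node][item] = cache_node
    queue = deque(graph[cache_node])           # broadcast to neighbors
    while queue:
        n = queue.popleft()
        current = tables[n].get(item)
        if current is not None and dist[n][cache_node] >= dist[n][current]:
            continue                           # not nearer: do not update or forward
        tables[n][item] = cache_node           # update ...
        queue.extend(graph[n])                 # ... and forward
    return tables

# Path 0 - 1 - 2 - 3 - 4; every node initially points at the server, node 0.
graph = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
tables = {n: {"D1": 0} for n in graph}
cache_list = {0}
announce_cache(graph, tables, cache_list, cache_node=3, item="D1")
nearest = [tables[n]["D1"] for n in sorted(graph)]
```

The flood stops at node 1, which is closer to the server than to the new cache, so nodes farther away never see the message.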


Maintenance of Nearest-cache Table -II

When node i deletes Dj
  Get the list of caches Cj from the server of Dj
  Broadcast (i, Dj, Cj) to neighbors

On receiving (i, Dj, Cj)
  If i is the current nearest cache for Dj, update using Cj and forward
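The deletion case can be simulated in the same spirit. The sketch below assumes that the list Cj always contains the server itself (so it is never empty), that distances are hop counts on a static graph, and that ties are broken by node ID; `withdraw_cache` and the other names are hypothetical.

```python
from collections import deque

def hop_distances(graph, source):
    """Breadth-first search: hop count from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in graph[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def withdraw_cache(graph, tables, cache_list, node, item):
    """Simulate the flood of (node, item, Cj) after node deletes item."""
    cache_list.discard(node)
    remaining = set(cache_list)                # Cj, fetched from the server
    dist = {n: hop_distances(graph, n) for n in graph}

    def repoint(n):
        # Nearest remaining cache; ties broken by node ID (assumption).
        tables[n][item] = min(remaining, key=lambda c: (dist[n][c], c))

    repoint(node)
    queue = deque(graph[node])                 # broadcast to neighbors
    while queue:
        n = queue.popleft()
        if tables[n].get(item) != node:
            continue                           # entry did not point at the deleted cache
        repoint(n)                             # update using Cj ...
        queue.extend(graph[n])                 # ... and forward
    return tables

# Path 0 - 1 - 2 - 3 - 4 with copies of "D1" at nodes 0 (server), 3 and 4.
graph = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
tables = {0: {"D1": 0}, 1: {"D1": 0}, 2: {"D1": 3}, 3: {"D1": 3}, 4: {"D1": 4}}
cache_list = {0, 3, 4}
withdraw_cache(graph, tables, cache_list, node=3, item="D1")
nearest = [tables[n]["D1"] for n in sorted(graph)]
```

Only nodes 2 and 3 had entries pointing at the deleted cache, so only they change their tables.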


Maintenance of Nearest-cache Table -III

More details pertaining to:
  Mobility
  Second-nearest cache entries (needed for benefit calculation for cache deletions)
  Benefit thresholds


Performance Evaluation

CGA vs. DGA Comparison

DGA vs. HybridCache Comparison


CGA vs. DGA

Summary of simulation results:
  DGA performs quite close to CGA for a wide range of parameter values


Varying Number of Data Items and Memory Capacity (transmission radius = 5, number of nodes = 500)


DGA vs. Yin and Cao’s work.

Yin and Cao [INFOCOM'04]
  CacheData: caches passing-by data items
  CachePath: caches the path to the nearest cache
  HybridCache: caches the data if its size is small enough, otherwise caches the path to the data

Only existing work on a purely distributed cache placement algorithm with memory constraint


DGA vs. HybridCache [YC 2004]

Simulation setup:
  ns-2; routing protocol is DSDV
  Random waypoint model: 100 nodes move at a speed within (0, 20 m/s) in a 2000 m x 500 m area
  Tr = 250 m, bandwidth = 2 Mbps

Performance metrics:
  Average query delay
  Query success ratio
  Total number of messages

Server model:
  1000 data items, divided between two servers
  Data item size: [100, 1500] bytes

Data access models:
  Random: each node accesses 200 data items chosen randomly from the 1000 data items
  Spatial: (details skipped)

Naïve caching algorithm: caches any passing-by data, uses LRU for cache replacement
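For reference, the naïve baseline can be pictured as a byte-limited LRU cache, as in the sketch below. The class name `NaiveLRUCache`, its methods, and the choice to skip items larger than the whole cache are assumptions; the slides only state that passing-by data is cached and replaced with LRU.

```python
from collections import OrderedDict

class NaiveLRUCache:
    """Caches every passing-by item; evicts least recently used items when full."""

    def __init__(self, capacity_bytes):
        self.capacity = capacity_bytes
        self.used = 0
        self.items = OrderedDict()  # item id -> size in bytes, least recent first

    def lookup(self, item):
        """Return True on a cache hit and mark the item as most recently used."""
        if item in self.items:
            self.items.move_to_end(item)
            return True
        return False

    def on_pass_by(self, item, size):
        """Called whenever a data item passes through (or is fetched by) this node."""
        if item in self.items:
            self.items.move_to_end(item)
            return
        if size > self.capacity:
            return  # assumption: items larger than the whole cache are skipped
        while self.used + size > self.capacity:
            _, evicted_size = self.items.popitem(last=False)  # evict LRU entry
            self.used -= evicted_size
        self.items[item] = size
        self.used += size

cache = NaiveLRUCache(capacity_bytes=3000)
cache.on_pass_by("D1", 1500)
cache.on_pass_by("D2", 1000)
hit = cache.lookup("D1")      # refreshes D1, so D2 becomes least recently used
cache.on_pass_by("D3", 1200)  # needs room: D2 is evicted
```

The item sizes in the usage lines fall within the [100, 1500] byte range listed in the server model.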

Varying query generation time under the random access pattern


Summary of Simulation Results

Both HybridCache and DGA outperform the Naïve approach

DGA outperforms HybridCache in all metrics
  Especially for frequent queries and small cache size
  For high mobility, DGA has slightly worse average delay, but a much better query success ratio


Conclusions

Data caching problem for multiple items under memory constraint

Centralized approximation algorithm

Localized distributed implementation

First work to present a distributed caching scheme based on an approximation algorithm


Questions?


Varying Network Size and Transmission Radius - number of data items = 1000, each node’s memory capacity = 20 units


Correctness of the maintenance

Nearest-cache table is correct
  For a node k whose nearest-cache table needs to change in response to a new cache i, every intermediate node between k and i also needs to change its table

Second-nearest cache is correct
  For a cache node k whose second-nearest cache should be changed to i in response to a new cache i, there exist two distinct neighboring nodes i1, i2 such that the nearest-cache node of i1 is k and the nearest-cache node of i2 is i


An Example

[Figure: example network with nodes A, B, C, D, E and F]