ICNP'06 1
Benefit-based Data Caching in Ad Hoc Networks
Bin Tang, Himanshu Gupta and Samir Das
Computer Science Department, Stony Brook University
Outline
  Problem Addressed and Motivation
  Problem Formulation
  Related Work
  Centralized Greedy Algorithm
  Distributed Implementation
  Performance Evaluation
  Conclusions
Problem Addressed
In a general ad hoc network with limited memory at each node, where to cache data items, such that the total access (communication) cost is minimized?
Motivation
Ad hoc networks are resource constrained
  Limited bandwidth, battery energy, and memory
Caching can save access (communication) cost, and thus bandwidth and energy
Problem Formulation
Given:
  Network graph G(V, E)
  Multiple data items
  Access frequencies (for each node and data item)
  Memory constraint at each node
Select data items to cache at each node under the memory constraint
Minimize total access cost = ∑nodes ∑data items [(distance from node to the nearest cache for that data item) × (access frequency)]
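The objective above can be illustrated with a minimal Python sketch. It assumes hop count as the distance measure, a connected graph, and that the original server of an item counts as one of its holders; the names `hop_distances`, `total_access_cost`, `adj`, `freq`, and `holders` are hypothetical and do not come from the slides.

```python
from collections import deque

def hop_distances(adj, source):
    """BFS hop counts from source to every reachable node (assumed distance metric)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def total_access_cost(adj, freq, holders):
    """Sum over (node, item) of access frequency x distance to the nearest holder.

    adj:     node -> list of neighbor nodes
    freq:    (node, item) -> access frequency
    holders: item -> set of nodes holding a copy (server included, by assumption)
    """
    dist = {u: hop_distances(adj, u) for u in adj}
    cost = 0
    for (node, item), f in freq.items():
        nearest = min(dist[node][h] for h in holders[item])
        cost += nearest * f
    return cost
```

On a four-node path 0-1-2-3 with item "A" held only at node 0, a node three hops away that accesses "A" twice contributes 3 × 2 to the total.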
Related Work
Related to the facility-location and K-median problems, which have no memory constraint
Baev and Rajaraman
  20.5-approximation algorithm for uniform-size data items
  For non-uniform sizes, no polynomial-time approximation unless P = NP
We circumvent the intractability by approximating “benefit” instead of access cost
Related Work - continued
Two major empirical works on distributed caching
  Hara [INFOCOM’99]
  Yin and Cao [INFOCOM’04] (we compare our work with theirs)
Our work is the first to present a distributed caching scheme based on an approximation algorithm
Algorithms
Centralized Greedy Algorithm (CGA)
  Delivers a solution whose “benefit” is at least 1/2 of the optimal benefit
Distributed Greedy Algorithm (DGA)
  Purely localized
Centralized Greedy Algorithm (CGA)
Benefit of caching a data item at a node
  = the reduction in total access cost,
  i.e., (total access cost before caching) - (total access cost after caching)
Centralized Greedy Algorithm (CGA)
CGA iteratively selects the most beneficial (data item, node to cache at) pair,
i.e., at each stage we pick the pair that has the maximum benefit.
Theorem: With respect to benefit, CGA is 1/2-approximate for uniform-size data items
and 1/4-approximate for non-uniform-size data items.
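The greedy selection can be sketched in Python as follows. This is only an illustration under stated assumptions, not the authors' implementation: items have unit size, distance is hop count, each item's server always holds it outside the cache budget, and ties go to the first candidate found. All names (`cga`, `servers`, `capacity`, and so on) are hypothetical.

```python
from collections import deque

def hop_distances(adj, source):
    """BFS hop counts from source (assumed distance metric)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def cga(adj, freq, servers, capacity):
    """Greedy placement sketch for unit-size items.

    adj: node -> neighbors; freq: (node, item) -> access frequency;
    servers: item -> node that always holds it; capacity: cache slots per node.
    Returns a list of (item, node, benefit) in the order they were selected.
    """
    dist = {u: hop_distances(adj, u) for u in adj}
    holders = {item: {s} for item, s in servers.items()}
    free = {u: capacity for u in adj}

    def cost():
        return sum(f * min(dist[n][h] for h in holders[item])
                   for (n, item), f in freq.items())

    placements = []
    while True:
        base = cost()
        best, best_benefit = None, 0
        for node in sorted(adj):
            if free[node] == 0:
                continue
            for item in sorted(holders):
                if node in holders[item]:
                    continue
                holders[item].add(node)      # tentatively cache the item here
                benefit = base - cost()      # reduction in total access cost
                holders[item].remove(node)   # undo the tentative placement
                if benefit > best_benefit:
                    best, best_benefit = (item, node), benefit
        if best is None:                     # no pair with positive benefit remains
            break
        item, node = best
        holders[item].add(node)
        free[node] -= 1
        placements.append((item, node, best_benefit))
    return placements
```

Recomputing the full cost for every candidate pair keeps the sketch short; a practical version would update only the affected item's cost.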
CGA Approximation Proof Sketch
G’: a modified copy of G in which each node has twice the memory it has in G and caches the data items selected by both CGA and the optimal solution
B(Optimal in G)
  ≤ B(Greedy + Optimal in G’)
  = B(Greedy) + B(Optimal) w.r.t. Greedy
  ≤ B(Greedy) + B(Greedy) [due to the greedy choice]
  = 2 × B(Greedy)
Distributed Greedy Algorithm (DGA)
Each node caches the most beneficial data items, where the benefit is based on “local traffic”.
“Local traffic” includes:
  Its own data requests
  Data requests for the data items it holds
  Data requests it forwards to others
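A node's local decision might look like the sketch below. The slides do not give the exact benefit formula, so the sketch assumes that the local benefit of an item is (number of requests for it observed in local traffic) × (distance to its current nearest cache), with unit-size items; `choose_local_cache` and its parameters are hypothetical names.

```python
def choose_local_cache(observed, dist_to_nearest, capacity):
    """Pick up to `capacity` items with the highest assumed local benefit.

    observed:        item -> request count seen in local traffic
    dist_to_nearest: item -> distance from this node to the item's nearest cache
    """
    # Assumed local benefit: requests observed x distance saved by caching locally.
    benefit = {item: observed[item] * dist_to_nearest[item] for item in observed}
    # Highest benefit first; break ties by item name for determinism.
    ranked = sorted(benefit, key=lambda item: (-benefit[item], item))
    return [item for item in ranked[:capacity] if benefit[item] > 0]
```

For example, with counts A = 5, B = 2, C = 10 and distances 3, 4, 1, the assumed benefits are 15, 8, and 10, so a node with two cache slots keeps A and C.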
DGA: Nearest Cache Table
Why do we need it?
  To forward requests to the nearest cache
  For the local benefit calculation
What is it?
  Each node keeps the ID of the nearest cache for each data item
  Entries are of the form (data item, nearest cache)
  The table sits on top of the routing table
Maintenance: next slide
Maintenance of Nearest-cache Table
When node i caches data item Dj:
  Broadcast (i, Dj) to neighbors
  Notify the server, which keeps a list of caches
On receiving (i, Dj):
  If i is nearer than the current nearest cache of Dj, update the entry and forward the message
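The update rule for a newly added cache can be simulated on a static graph as sketched below. It assumes hop-count distances that every node can look up (standing in for the routing table) and models the broadcast as a queue of message deliveries; `announce_new_cache` and `nearest` are hypothetical names, and the server notification is omitted.

```python
from collections import deque

def hop_distances(adj, source):
    """BFS hop counts from source (stand-in for routing-table distances)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def announce_new_cache(adj, nearest, new_cache, item):
    """Propagate (new_cache, item); nearest[node][item] is each node's table entry."""
    dist = {u: hop_distances(adj, u) for u in adj}
    nearest[new_cache][item] = new_cache
    pending = deque(adj[new_cache])          # message sent to the neighbors
    while pending:
        node = pending.popleft()
        current = nearest[node][item]
        # Update and forward only if the new cache is strictly nearer.
        if dist[node][new_cache] < dist[node][current]:
            nearest[node][item] = new_cache
            pending.extend(adj[node])
    return nearest
```

A node that has already switched to the new cache fails the strict comparison on later deliveries, so the flood stops at nodes for which the new cache is not nearer.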
Maintenance of Nearest-cache Table -II
When node i deletes Dj:
  Get the list of caches Cj from the server of Dj
  Broadcast (i, Dj, Cj) to neighbors
On receiving (i, Dj, Cj):
  If i is the current nearest cache for Dj, update the entry using Cj and forward the message
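The deletion case can be simulated in the same style. This sketch assumes hop-count distances, a static graph, and that the list Cj returned by the server is non-empty and includes the server itself; `announce_deletion` and `remaining` are hypothetical names.

```python
from collections import deque

def hop_distances(adj, source):
    """BFS hop counts from source (stand-in for routing-table distances)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def announce_deletion(adj, nearest, deleted, item, remaining):
    """Propagate (deleted, item, remaining) after `deleted` drops `item`.

    remaining: the cache list Cj from the server, without the deleted node.
    """
    dist = {u: hop_distances(adj, u) for u in adj}

    def closest(node):
        # Nearest remaining cache; ties broken by node ID.
        return min(remaining, key=lambda c: (dist[node][c], c))

    nearest[deleted][item] = closest(deleted)
    pending = deque(adj[deleted])            # message sent to the neighbors
    while pending:
        node = pending.popleft()
        # Only nodes that pointed at the deleted cache update and forward.
        if nearest[node][item] == deleted:
            nearest[node][item] = closest(node)
            pending.extend(adj[node])
    return nearest
```

Because an updated entry no longer points at the deleted cache, each node forwards the message at most once.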
Maintenance of Nearest-cache Table -III
More details pertaining to:
  Mobility
  Second-nearest cache entries (needed for benefit calculation for cache deletions)
  Benefit thresholds
CGA vs. DGA
Summary of simulation results:
  DGA performs quite close to CGA for a wide range of parameter values
Varying Number of Data Items and Memory Capacity (transmission radius = 5, number of nodes = 500)
DGA vs. Yin and Cao’s work
Yin and Cao [INFOCOM’04]:
  CacheData: caches passing-by data items
  CachePath: caches the path to the nearest cache
  HybridCache: caches the data if its size is small enough, otherwise caches the path to the data
  The only work on a purely distributed cache placement algorithm with a memory constraint
DGA vs. HybridCache [YC 2004]
Simulation setup:
  ns-2; routing protocol is DSDV
  Random waypoint model: 100 nodes move at a speed within (0, 20 m/s) in a 2000 m x 500 m area
  Tr = 250 m, bandwidth = 2 Mbps
Performance metrics:
  Average query delay
  Query success ratio
  Total number of messages
Server model:
  1000 data items, divided between two servers
  Data item size: [100, 1500] bytes
Data access models:
  Random: each node accesses 200 data items chosen randomly from the 1000 data items
  Spatial: (details skipped)
Naïve caching algorithm: caches any passing-by data and uses LRU for cache replacement
Summary of Simulation Results
Both HybridCache and DGA outperform the Naïve approach
DGA outperforms HybridCache in all metrics
  Especially for frequent queries and small cache sizes
  For high mobility, DGA has slightly worse average delay, but a much better query success ratio
Conclusions
Data caching problem for multiple items under memory constraint
Centralized approximation algorithm
Localized distributed implementation
First work to present a distributed caching scheme based on an approximation algorithm
Varying Network Size and Transmission Radius (number of data items = 1000, each node’s memory capacity = 20 units)
Correctness of the maintenance
Nearest-cache table is correct
  For a node k whose nearest-cache table needs to change in response to a new cache i, every intermediate node between k and i also needs to change its table
Second-nearest cache is correct
  For a cache node k whose second-nearest cache should change to i in response to a new cache i, there exist two distinct neighboring nodes i1 and i2 such that the nearest-cache node of i1 is k and the nearest-cache node of i2 is i