Posted on 18-Dec-2015
CacheCast: Eliminating Redundant Link Traffic for Single Source Multiple Destination Transfers
Piotr Srebrny, Thomas Plagemann, Vera Goebel
Department of Informatics, University of Oslo
Andreas Mauthe
Computing Department, Lancaster University
Outline
◦Problem statement
◦Idea
◦Design
◦Related work
◦CacheCast evaluation
◦Conclusions
What is the Problem?
The Internet provides only a mechanism for single source to single destination datagram transmission (unicast)
This is expressed in the IP header:
[Figure: IP header followed by the payload, with a single Source Address field and a single Destination Address field]
What is the Problem? (cont.)
The Internet – a network of routers and links
[Figure: source S unicasts the same payload P to destinations A, B, and C; identical copies of P traverse the shared links – redundancy!]
What has been done?
“Datagram Routing for Internet Multicasting”, L. Aguilar – explicit list of destinations in the IP header
◦Follow-ups: XCast, Small Group Multicast
“Host Extensions for IP Multicasting”, S. Deering – a destination address denotes a group of hosts
“A Case for End System Multicast”, Y.-h. Chu et al. – application-level multicast
CacheCast
CacheCast is a network layer caching mechanism that eliminates redundant data transmissions
[Figure: with link caches, only a single copy of payload P crosses each link on the paths from source S to destinations A, B, and C]
CacheCast Idea
[Figure, steps 1–3: the first packet, carrying payload P towards destination A, traverses the path in full; each link cache along the way stores a copy of P and marks its entry as valid. When the packet carrying the same payload towards destination B follows, the cached links strip the payload and forward only the header.]
CacheCast Idea (cont.)
[Figure, steps 4–5: where the path towards B diverges from the cached path, the payload is no longer elided; the link cache restores P from its store, and destination B receives the complete packet.]
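The effect of steps 1–5 on a single shared link can be sketched by counting bytes (a minimal sketch: the 84B/1416B sizes are taken from the evaluation slides later in the deck, and the one-payload cache model is a simplification):

```python
# Bytes crossing one shared link when a source sends the same payload
# to several destinations, with and without a CacheCast link cache.

HEADER = 84     # bytes per packet header (the deck's s_h)
PAYLOAD = 1416  # bytes per packet payload (the deck's s_p)

def link_bytes(n_dests, cached):
    """Total bytes crossing the shared link for one payload sent to n_dests."""
    total = 0
    cache_has_payload = False
    for _ in range(n_dests):
        if cached and cache_has_payload:
            total += HEADER            # payload stripped: header only
        else:
            total += HEADER + PAYLOAD  # full packet crosses the link
            cache_has_payload = cached # a caching link retains the payload
    return total

print(link_bytes(3, cached=False))  # → 4500 (three full packets)
print(link_bytes(3, cached=True))   # → 1668 (one full packet + two headers)
```

With three destinations behind the link, caching cuts the link traffic to roughly a third; the saving grows with the number of destinations sharing the link.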
Link Cache
Point-to-point logical links
Caching is done per link:
◦Cache Management Unit (CMU)
◦Cache Store (CS)
[Figure: routers connected by links; on each link a CMU at the entry and a CS at the exit perform the caching, while the routers themselves perform normal packet processing]
Link Cache Requirements
Simple processing
◦~72ns to process a minimum-size packet on a 10Gbps link, and ~18ns on a 40Gbps link
 Modern DDR r/w cycle: ~20ns
 Modern SRAM r/w cycle: ~5ns
Small cache size
◦A link queue scaled to 250ms of the link traffic
Source Requirements
Simple cache processing
◦A source provides information on payload ID and payload size
Minimise cache size
◦A source batches requests for the same data and transmits them within the minimum amount of time
Cacheable Packets
A CacheCast packet carries metadata describing its payload:
◦Payload ID
◦Payload size
◦Index
Only packets with this metadata are cached
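In code, the per-packet metadata could be modelled like this (field names follow the slide; the types and the example values are assumptions, not the wire format):

```python
from dataclasses import dataclass

@dataclass
class CacheCastMetadata:
    """Metadata that marks a packet as cacheable by CacheCast link caches."""
    payload_id: int    # identifies the payload carried (or elided) by the packet
    payload_size: int  # size of the payload in bytes
    index: int         # link-cache slot where the payload is / will be stored

# Hypothetical example: a packet carrying a 1416-byte payload destined
# for cache slot 0.
md = CacheCastMetadata(payload_id=0x1A2B, payload_size=1416, index=0)
print(md)
```

A packet without this metadata is simply forwarded untouched, which is what makes the scheme incrementally deployable.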
CMU & CS
[Figure, cache miss: packet A carrying payload P1 arrives at an empty link cache. The CMU table (index → payload ID) holds no entry for P1's ID, so the CMU records the ID at the packet's index, the packet crosses the link in full, and the cache store (index → payload) keeps a copy of P1 at the same index.]
CMU & CS (cont.)
[Figure, cache hit: packet B carrying payload P2 arrives; the CMU table already maps the packet's index to P2's ID, so the CMU strips the payload and only the header crosses the link; at the link exit the CS re-attaches P2 from the corresponding cache store slot.]
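The miss and hit cases can be sketched as a pair of slot-indexed tables (a simplified model: dict-based packets, a fixed 3-slot cache, and source-chosen indices are assumptions; the real implementation operates on packet buffers in the forwarding path):

```python
# Sketch of a per-link cache: CMU at the link entry, CS at the link exit.

CACHE_SLOTS = 3

class CMU:
    """Cache Management Unit: strips payloads the far-end CS already holds."""
    def __init__(self):
        self.table = [None] * CACHE_SLOTS  # slot -> payload ID

    def process(self, pkt):
        pid, idx = pkt["payload_id"], pkt["index"]
        if self.table[idx] == pid:
            return dict(pkt, payload=None)  # hit: send header only
        self.table[idx] = pid               # miss: record ID, send in full
        return pkt

class CS:
    """Cache Store: re-attaches payloads stripped by the CMU."""
    def __init__(self):
        self.store = [None] * CACHE_SLOTS  # slot -> payload bytes

    def process(self, pkt):
        idx = pkt["index"]
        if pkt["payload"] is None:
            return dict(pkt, payload=self.store[idx])  # restore from cache
        self.store[idx] = pkt["payload"]               # store for later hits
        return pkt

cmu, cs = CS.__new__(CS), None  # placeholder; see usage below
cmu, cs = CMU(), CS()
p1 = {"payload_id": 1, "index": 0, "payload": b"P1-data"}
out1 = cs.process(cmu.process(p1))  # miss: full packet crosses the link
out2 = cs.process(cmu.process(p1))  # hit: header crosses, payload restored
print(out2["payload"])  # → b'P1-data'
```

Note that the CMU and CS never communicate: as long as both sides apply the same slot discipline, the header-only packet carries enough information (payload ID and index) for the CS to reconstruct the original packet.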
Estimating Cache Size
The concept of a packet train
It is sufficient to hold a payload in the CS for the packet train duration
How many packet headers can a source send within this time?
Estimating Cache Size (cont.)
Back-of-the-envelope calculations: ~10ms caches are sufficient

Number of packet headers a source can send within the packet train time:

Source uplink speed    2ms    10ms    50ms
512Kbps                  2       8      40
1Mbps                    4      16      79
10Mbps                  32     157     781
100Mbps                313    1561    7802
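The table can be approximated as follows (assuming ~80-byte packet headers, a value inferred from the numbers rather than stated on the slide; the exact rounding used there is unknown, so results land within one header of the table):

```python
# Approximate number of packet headers a source emits within the
# packet-train time. HEADER_BYTES is an assumption fitted to the table.

HEADER_BYTES = 80

def headers_in_train(uplink_bps, train_seconds):
    """Headers of HEADER_BYTES each that fit into train_seconds of uplink."""
    return int(uplink_bps * train_seconds / 8 / HEADER_BYTES)

for rate in (512e3, 1e6, 10e6, 100e6):
    row = [headers_in_train(rate, t) for t in (0.002, 0.010, 0.050)]
    print(f"{int(rate):>10} bps: {row}")
```

The point of the calculation: even a fast source exhausts only a few thousand cache entries within 10ms, so per-link caches can stay tiny.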
Implication of the Small Size
A 10ms cache on a 10Gbps link:
◦~12.8MB of CS storage space
◦~13000 entries in the CMU table
What about a 100Mbps LAN?
◦~130KB for the CS
◦~130 entries in the CMU table
We can afford that!
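These figures follow directly from the packet-train argument (a sketch assuming an average payload of ~1000 bytes and decimal units; the slide's slightly larger numbers suggest different rounding constants):

```python
# CS storage and CMU table size implied by a 10 ms link cache.

TRAIN = 0.010       # cache / packet-train duration in seconds
AVG_PAYLOAD = 1000  # assumed average payload size in bytes

def cache_dimensions(link_bps):
    cs_bytes = link_bps * TRAIN / 8       # payload bytes in flight per train
    cmu_entries = int(cs_bytes / AVG_PAYLOAD)  # one CMU entry per cached payload
    return cs_bytes, cmu_entries

print(cache_dimensions(10e9))   # 10 Gbps backbone link: ~12.5 MB, ~12500 entries
print(cache_dimensions(100e6))  # 100 Mbps LAN link:     ~125 KB,  ~125 entries
```

Either way the cache fits comfortably in commodity DRAM, which is the "we can afford that" claim.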
Related Work
“Packet Caches on Routers: The Implications of Universal Redundant Traffic Elimination”, Ashok Anand et al.
◦Per-link cache
◦Universal redundancy elimination
◦No server support
Evaluation
Bandwidth consumption
◦CacheCast vs. IP multicast
 Unique packet headers
 Finite cache sizes
Incremental deployment
◦Benefits of partial cache deployment
Congestion control
◦CacheCast impact on TFRC throughput
Bandwidth Consumption
Multicast efficiency metric:

δm = 1 − Lm/Lu

where Lm is the total number of multicast links and Lu the total number of unicast links.

Example: for a topology with Lm = 5 and Lu = 9:

δm = 1 − 5/9 = 4/9 ≈ 0.44

[Figure: source S delivering to destinations A, B, and C]
Bandwidth Consumption (cont.)
A CacheCast packet has a unicast header part (h) and a multicast payload part (p). Thus:

δCC = 1 − (sh·Lu + sp·Lm) / ((sh + sp)·Lu)

which simplifies to:

δCC = δm / (1 + r),  where r = sh/sp and δm = 1 − Lm/Lu

E.g. using packets with sp = 1416B and sh = 84B, efficiency drops by sh/(sh + sp) = 84/1500 ≈ 5.6% relative to ideal multicast.
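Plugging the example numbers into the formulas (same symbols as the slides; a quick numeric check):

```python
# Multicast efficiency vs. CacheCast efficiency for the deck's example.

def delta_m(L_m, L_u):
    """Multicast efficiency: fraction of link traffic saved vs. unicast."""
    return 1 - L_m / L_u

def delta_cc(L_m, L_u, s_h, s_p):
    """CacheCast efficiency: headers (s_h bytes) still cross every unicast link."""
    r = s_h / s_p
    return delta_m(L_m, L_u) / (1 + r)

dm = delta_m(5, 9)              # topology with L_m = 5, L_u = 9
dcc = delta_cc(5, 9, 84, 1416)  # 84-byte headers, 1416-byte payloads
print(round(dm, 3), round(dcc, 3))
```

CacheCast reaches about 94% of ideal multicast efficiency here, because only the header fraction sh/(sh + sp) of each packet is irreducible.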
Finite Cache Size
The more destinations, the higher the efficiency
E.g.
◦512Kbps – 8 headers in 10ms, but e.g. 12 destinations
Slow sources transmitting to many destinations cannot achieve the maximum efficiency
Finite Cache Size (cont.)
Sources with different uplink speeds transmitting to a growing number of destinations

Uplink     Packet headers per 10ms
512Kbps      8
1Mbps       16
10Mbps     157
100Mbps   1561
Incremental Deployment
The CS and CMU are deployed incrementally
[Figure: deployment stages 1–6]
Incremental Deployment
Bottleneck Link Test
ns2 implementation
100 TCP flows competing with 100 TFRC flows on a bottleneck link
Bottleneck Link Test (cont.)
Both TCP and TFRC benefit from CacheCast
CacheCast Implementation
Router part
◦Click Modular Router CMU and CS elements – in total ~400 lines of code
Server part
◦Linux kernel – a system call:
 msend(fd_set *write_to, fd_set *written, char *buf, int len)
◦Paraslash tools – a streaming server that uses the msend system call, and a receiver
Testbed
Testbed setup:
◦Paraslash server (S)
◦Click Modular Router (R)
◦Paraslash receivers (A, B)
Testbed Results
Bandwidth consumed by packet header transmission
msend overhead is negligible
Conclusions
CacheCast is:
◦A valid solution for single source multiple destination transfers
◦Simple and reliable
◦Fully distributed
◦Incrementally deployable
Thank You for Your Attention!