Network Redundancy Elimination
![Page 1: Network Redundancy Elimination](https://reader033.fdocuments.net/reader033/viewer/2022061303/54931464b47959650f8b45f4/html5/thumbnails/1.jpg)
Network Redundancy Elimination
JUNXIAO SHI 2013-11-05
Neil T. Spring and David Wetherall. 2000. A protocol-independent technique for eliminating redundant network traffic. SIGCOMM Comput. Commun. Rev. 30, 4 (August 2000), 87-95. DOI=10.1145/347057.347408 http://doi.acm.org/10.1145/347057.347408
slides ©yoursunny.com 2013, CreativeCommons BY-NC 3.0
![Page 2: Network Redundancy Elimination](https://reader033.fdocuments.net/reader033/viewer/2022061303/54931464b47959650f8b45f4/html5/thumbnails/2.jpg)
Problem
![Page 3: Network Redundancy Elimination](https://reader033.fdocuments.net/reader033/viewer/2022061303/54931464b47959650f8b45f4/html5/thumbnails/3.jpg)
Back in 2000, home Internet was slow
MODEM data rate: 33.6Kbps or 56Kbps
round trip latency: >100ms
2 minutes to load a webpage
![Page 4: Network Redundancy Elimination](https://reader033.fdocuments.net/reader033/viewer/2022061303/54931464b47959650f8b45f4/html5/thumbnails/4.jpg)
Today, Internet isn’t always fast
Satellite link (e.g. Iridium)
◦ high latency
◦ 2.4KB/s
◦ $1.35 per minute
2G cellular data (e.g. H2O Wireless)
◦ high latency
◦ low bandwidth
◦ $0.30 per MB
![Page 5: Network Redundancy Elimination](https://reader033.fdocuments.net/reader033/viewer/2022061303/54931464b47959650f8b45f4/html5/thumbnails/5.jpg)
Web contents are redundant
Screenshots of http://quotes.wsj.com/index/CN/SHCOMP during a trading day. The quote changes, but the rest of the page remains the same.
![Page 6: Network Redundancy Elimination](https://reader033.fdocuments.net/reader033/viewer/2022061303/54931464b47959650f8b45f4/html5/thumbnails/6.jpg)
Web contents are often uncached
Web authors don’t want you to cache their contents, because:
◦ Contents are dynamic. Stock price may change at any time. News articles are posted throughout the day.
◦ Contents are personalized. Your Facebook homepage is different from anyone else’s.
◦ Access count must be accurate. Advertising revenue is calculated per thousand impressions.
response headers of http://www.dailyfinance.com/
![Page 7: Network Redundancy Elimination](https://reader033.fdocuments.net/reader033/viewer/2022061303/54931464b47959650f8b45f4/html5/thumbnails/7.jpg)
To the naïve user -
![Page 8: Network Redundancy Elimination](https://reader033.fdocuments.net/reader033/viewer/2022061303/54931464b47959650f8b45f4/html5/thumbnails/8.jpg)
Design
![Page 9: Network Redundancy Elimination](https://reader033.fdocuments.net/reader033/viewer/2022061303/54931464b47959650f8b45f4/html5/thumbnails/9.jpg)
Architecture
Sender side: cache; converts repeated strings into tokens
Bandwidth-constrained channel between sender and receiver
Receiver side: cache; reconstructs original packet
Contents of both caches must be consistent
Operates at the network layer, protocol-independent
![Page 10: Network Redundancy Elimination](https://reader033.fdocuments.net/reader033/viewer/2022061303/54931464b47959650f8b45f4/html5/thumbnails/10.jpg)
The Cache
Cache: holds most recent packets
◦ admission policy: admit all
◦ replacement policy: FIFO
Indexed by representative fingerprints of the packets it holds
◦ map each fingerprint to the most recent packet in which it appears
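A minimal Python sketch of such a cache, assuming a capacity counted in packets and an index that also stores the byte offset of each fingerprinted window. The class name `PacketCache`, its methods, and the per-packet capacity are hypothetical choices for illustration, not details from the paper.

```python
from collections import deque


class PacketCache:
    """FIFO packet store with a fingerprint index (illustrative sketch)."""

    def __init__(self, capacity_packets):
        # Assumption: capacity is at least 1 and counted in packets, not bytes.
        self.capacity = capacity_packets
        self.packets = {}     # packet id -> payload bytes
        self.order = deque()  # packet ids, oldest first
        self.index = {}       # fingerprint -> (packet id, window offset)
        self.next_id = 0

    def add(self, payload, fingerprints):
        """Admit every packet; fingerprints is an iterable of (fp, offset)."""
        if len(self.order) >= self.capacity:
            self._evict_oldest()
        pid = self.next_id
        self.next_id += 1
        self.packets[pid] = payload
        self.order.append(pid)
        for fp, offset in fingerprints:
            # The newest packet containing a fingerprint replaces older entries.
            self.index[fp] = (pid, offset)
        return pid

    def _evict_oldest(self):
        old = self.order.popleft()
        del self.packets[old]
        # Drop index entries that still point at the evicted packet.
        stale = [fp for fp, (pid, _) in self.index.items() if pid == old]
        for fp in stale:
            del self.index[fp]

    def lookup(self, fp):
        """Return (payload, offset) for a fingerprint, or None if unknown."""
        hit = self.index.get(fp)
        if hit is None:
            return None
        pid, offset = hit
        return self.packets[pid], offset
```

The linear scan in `_evict_oldest` keeps the sketch short; a real implementation would likely track each packet's fingerprints to avoid it.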
![Page 11: Network Redundancy Elimination](https://reader033.fdocuments.net/reader033/viewer/2022061303/54931464b47959650f8b45f4/html5/thumbnails/11.jpg)
Representative fingerprints
1. Calculate rolling Rabin fingerprints for sequences of β bytes, mod M.
2. Select fingerprints ending with γ zero bits as representative fingerprints.
Rabin fingerprints are not cryptographically secure. The algorithm should not assume they are collision-free.
Rabin fingerprints are used for finding similar documents, not for chunking.
window size: β
select one in 2^γ fingerprints
fingerprint space: M
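The two steps can be sketched in Python as follows. This is a polynomial rolling hash modulo M = 2^m_bits standing in for a Rabin fingerprint; the function name and the multiplier `p` are arbitrary choices made for illustration, not values from the paper.

```python
def representative_fingerprints(data, beta=64, gamma=5, m_bits=60, p=1000003):
    """Yield (fingerprint, offset) for each beta-byte window whose
    fingerprint has its lowest gamma bits equal to zero.

    Assumption: p is an arbitrary odd multiplier chosen for this sketch.
    """
    if len(data) < beta:
        return
    mask = (1 << m_bits) - 1                # reduce modulo M = 2**m_bits
    select = (1 << gamma) - 1               # lowest gamma bits
    p_top = pow(p, beta - 1, 1 << m_bits)   # weight of the oldest byte
    # Fingerprint of the first window, computed directly.
    fp = 0
    for byte in data[:beta]:
        fp = (fp * p + byte) & mask
    offset = 0
    while True:
        if fp & select == 0:
            yield fp, offset
        if offset + beta >= len(data):
            break
        # Roll the window: remove the oldest byte, shift, append the next byte.
        fp = ((fp - data[offset] * p_top) * p + data[offset + beta]) & mask
        offset += 1
```

Each window costs a constant amount of work regardless of β, which is what makes fingerprinting every byte position of every packet affordable.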
![Page 12: Network Redundancy Elimination](https://reader033.fdocuments.net/reader033/viewer/2022061303/54931464b47959650f8b45f4/html5/thumbnails/12.jpg)
Sender process
cache
generate representative fingerprints
lookup fingerprints in cache index
verify no collision
expand to the left and to the right, byte-by-byte
convert matched regions into tokens
add packet to cache, evicting oldest packet if necessary
send encoded, smaller packet
token format
• the fingerprint
• # bytes expanded to the left
• # bytes expanded to the right
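A rough Python sketch of the verify, expand, and tokenize steps. The names `Token`, `expand_match`, and `encode` are hypothetical, and the encoded packet is modeled as a Python list of byte strings and tokens rather than a wire format. Fingerprint generation, cache lookup, and cache insertion are assumed to happen elsewhere.

```python
from typing import NamedTuple


class Token(NamedTuple):
    fingerprint: int
    left: int    # bytes expanded to the left of the window
    right: int   # bytes expanded to the right of the window


def expand_match(new, new_off, old, old_off, beta):
    """Verify the window, then grow the match byte by byte.
    Returns (left, right), or None on a fingerprint collision."""
    if new[new_off:new_off + beta] != old[old_off:old_off + beta]:
        return None  # same fingerprint, different bytes
    left = 0
    while (new_off - left > 0 and old_off - left > 0
           and new[new_off - left - 1] == old[old_off - left - 1]):
        left += 1
    right = 0
    while (new_off + beta + right < len(new)
           and old_off + beta + right < len(old)
           and new[new_off + beta + right] == old[old_off + beta + right]):
        right += 1
    return left, right


def encode(new, hits, beta):
    """hits: (fingerprint, new_off, old_payload, old_off), sorted by new_off.
    Returns a list mixing literal bytes and Tokens."""
    out, pos = [], 0
    for fp, new_off, old, old_off in hits:
        if new_off < pos:
            continue  # window starts inside a region already encoded
        span = expand_match(new, new_off, old, old_off, beta)
        if span is None:
            continue
        left, right = span
        left = min(left, new_off - pos)  # do not overlap the previous region
        start = new_off - left
        if start > pos:
            out.append(new[pos:start])   # literal bytes before the match
        out.append(Token(fp, left, right))
        pos = new_off + beta + right
    if pos < len(new):
        out.append(new[pos:])
    return out
```

With β=8 and a cached request that differs from the new one only in the path, a single window hit on `HTTP/1.1` expands to cover everything from `.html` to the end of the packet.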
![Page 13: Network Redundancy Elimination](https://reader033.fdocuments.net/reader033/viewer/2022061303/54931464b47959650f8b45f4/html5/thumbnails/13.jpg)
Receiver process
cache
lookup tokens in cache index
reconstruct original packet
generate representative fingerprints
add packet to cache, evicting oldest packet if necessary
deliver original packet
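The reconstruction step might look like the following Python sketch. It assumes the encoded packet arrives as a list of literal byte strings and tokens, and that the receiver's index maps each fingerprint to a cached payload and window offset. `Token`, `CacheMiss`, and `decode` are hypothetical names. Fingerprinting the rebuilt packet and adding it to the cache are left out.

```python
from typing import NamedTuple


class Token(NamedTuple):
    fingerprint: int
    left: int    # bytes expanded to the left of the window
    right: int   # bytes expanded to the right of the window


class CacheMiss(Exception):
    """A token names a fingerprint the receiver has not indexed."""


def decode(encoded, index, beta):
    """encoded: list of bytes and Tokens.
    index: fingerprint -> (cached payload, offset of the beta-byte window)."""
    parts = []
    for item in encoded:
        if isinstance(item, Token):
            hit = index.get(item.fingerprint)
            if hit is None:
                # Sender referenced a packet this cache never received.
                raise CacheMiss(item.fingerprint)
            payload, off = hit
            # Copy the window plus the expanded bytes on each side.
            parts.append(payload[off - item.left: off + beta + item.right])
        else:
            parts.append(item)  # literal bytes pass through unchanged
    return b"".join(parts)
```

Raising on an unknown fingerprint gives the caller a hook for handling a cache that has fallen out of step with the sender.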
![Page 14: Network Redundancy Elimination](https://reader033.fdocuments.net/reader033/viewer/2022061303/54931464b47959650f8b45f4/html5/thumbnails/14.jpg)
Cache consistency
Contents of sender cache and receiver cache must be consistent.
Why might caches be inconsistent?
◦ Network channel isn’t reliable. A packet that entered the sender cache but was lost on the channel will not be present in the receiver cache.
How to detect cache inconsistency?
◦ Fingerprints! If there’s no collision, receiving an unrecognized fingerprint indicates caches are inconsistent.
What happens if caches are inconsistent?
◦ Receiver cannot reconstruct original packet.
![Page 15: Network Redundancy Elimination](https://reader033.fdocuments.net/reader033/viewer/2022061303/54931464b47959650f8b45f4/html5/thumbnails/15.jpg)
Implementation
![Page 16: Network Redundancy Elimination](https://reader033.fdocuments.net/reader033/viewer/2022061303/54931464b47959650f8b45f4/html5/thumbnails/16.jpg)
Trace analyzer
The algorithm is implemented as a user-level process to analyze a trace.
![Page 17: Network Redundancy Elimination](https://reader033.fdocuments.net/reader033/viewer/2022061303/54931464b47959650f8b45f4/html5/thumbnails/17.jpg)
Parameters
Fingerprint space: M=2^60
◦ collision almost impossible
Penalty for each matching region: 12 octets
◦ to represent the space needed for the token
Window size β and fingerprint selection frequency: one in 2^γ
◦ large β: better “quality” of matches, fewer potential bytes saved
◦ small β: worse “quality” of matches (shorter matches in more recent packets)
◦ small γ: more likely to find a match, larger index (=less memory for cached packets)
◦ large γ: less likely to find a match, less memory usage
◦ chosen values: γ=5, β=64
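To make the trade-offs concrete, here is a small Python sketch of the accounting, using the 12-octet penalty, β=64, and γ=5 from this slide. The function names are hypothetical, and the expected-count estimate assumes fingerprints are uniformly distributed.

```python
TOKEN_PENALTY = 12  # octets charged per matched region, as on the slide


def encoded_size(packet_len, region_lengths, penalty=TOKEN_PENALTY):
    """Bytes sent when each matched region is replaced by one token."""
    return packet_len - sum(region_lengths) + penalty * len(region_lengths)


def expected_representatives(packet_len, beta=64, gamma=5):
    """Expected number of selected fingerprints in one packet,
    assuming fingerprints are uniformly distributed."""
    windows = max(packet_len - beta + 1, 0)
    return windows / 2 ** gamma
```

For a 1500-byte packet there are 1437 windows of 64 bytes, so about 45 index entries are expected at γ=5. Because every match covers at least β=64 bytes, each matched region saves at least 52 bytes after the 12-octet penalty.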
![Page 18: Network Redundancy Elimination](https://reader033.fdocuments.net/reader033/viewer/2022061303/54931464b47959650f8b45f4/html5/thumbnails/18.jpg)
Performance
45Mbps on a PC with Pentium Ⅲ-550 and 1GB memory
This work is designed for slow links.
![Page 19: Network Redundancy Elimination](https://reader033.fdocuments.net/reader033/viewer/2022061303/54931464b47959650f8b45f4/html5/thumbnails/19.jpg)
Follow-up work
Later work on redundancy elimination:
◦ universal redundancy elimination
◦ SmartRE: coordinated network-wide redundancy elimination
◦ EndRE: end-system redundancy elimination
![Page 20: Network Redundancy Elimination](https://reader033.fdocuments.net/reader033/viewer/2022061303/54931464b47959650f8b45f4/html5/thumbnails/20.jpg)
Traffic Analysis
How much redundancy is there?
![Page 21: Network Redundancy Elimination](https://reader033.fdocuments.net/reader033/viewer/2022061303/54931464b47959650f8b45f4/html5/thumbnails/21.jpg)
Amount of redundancy
Internet => corporate: 30% redundant
corporate => Internet: 50% redundant
with just 1MB of memory for cache+index: at least 10% redundant
![Page 22: Network Redundancy Elimination](https://reader033.fdocuments.net/reader033/viewer/2022061303/54931464b47959650f8b45f4/html5/thumbnails/22.jpg)
Redundancy by protocol
[Figure: bar chart of traffic amount (%) per protocol, with the redundant traffic marked. Protocols: HTTP, RTSP, Napster, Lotus, HTTPS, FTP-data, NNTP, DNS, ASF, AOL, SMTP, POP, Telnet, Other.]
HTTP, Telnet, POP, ASF have a high percentage of repeated strings.
HTTPS, FTP-data, Napster, RTSP, NNTP have a low percentage of repeated strings.
The redundancy elimination algorithm is protocol-independent, so we can save bytes on non-Web traffic.
![Page 23: Network Redundancy Elimination](https://reader033.fdocuments.net/reader033/viewer/2022061303/54931464b47959650f8b45f4/html5/thumbnails/23.jpg)
Comparison with HTTP caching
[Figure: bar chart of traffic (%) for Squid, gzip, Squid+gzip, RE, Squid+RE.]
Redundancy elimination works better than HTTP caching and compression.
![Page 24: Network Redundancy Elimination](https://reader033.fdocuments.net/reader033/viewer/2022061303/54931464b47959650f8b45f4/html5/thumbnails/24.jpg)