![Page 1: Identifying Performance Bottlenecks in CDNs through TCP-Level Monitoring Peng Sun Minlan Yu, Michael J. Freedman, Jennifer Rexford Princeton University.](https://reader036.fdocuments.net/reader036/viewer/2022062322/56649d2d5503460f94a04772/html5/thumbnails/1.jpg)
Identifying Performance Bottlenecks in CDNs through TCP-Level Monitoring
Peng Sun
Minlan Yu, Michael J. Freedman, Jennifer Rexford
Princeton University
August 19, 2011
Performance Bottlenecks
[Diagram: CDN Servers (Server APP, Server OS), Internet, Clients]
• APP: Write too slowly
• Server OS: Insufficient send buffer or small initial congestion window
• Internet: Network congestion
• Client: Insufficient receive buffer
Reaction to Each Bottleneck
[Diagram: CDN Servers (Server APP, Server OS), Internet, Clients]
• APP is bottleneck: Debug application
• Server OS is bottleneck: Tune buffer size, or upgrade server
• Internet is bottleneck: Circumvent the congested part of network
• Client is bottleneck: Notify client to change
Previous Techniques Not Enough
[Diagram: Server with APP, Packet Sniffer, and Server OS]
• Application logs: No details of network activities
• Packet sniffing: Expensive to capture
• Active probing: Extra load on network
• Transport-layer stats: Directly reveal perf. bottlenecks
How TCP Stats Reveal Bottlenecks
[Diagram: CDN Servers, Internet, Clients]
• CDN Server Applications: Insufficient data in send buffer
• Server Network Stack: Send buffer full, or initial congestion window too small
• Network Path: Packet loss
• Clients: Receive window too small
Measurement Framework
• Collect TCP statistics
  • Web100 kernel patch
  • Extract useful TCP stats for analyzing perf.
• Analysis tool
  • Bottleneck classifier for individual connections
  • Cross-connection correlation at AS level
    • Map conn. to AS based on RouteViews
    • Correlate bottlenecks to drive CDN decisions
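The connection-to-AS mapping step can be sketched as a longest-prefix match of the client IP against a prefix table. This is only an illustration: the prefixes and AS numbers below are made up (documentation and private-use ranges), and `build_table` and `lookup_as` are hypothetical helper names, not part of the tool described in the slides. A real table would be derived from a RouteViews BGP dump.

```python
import ipaddress

# Hypothetical prefix-to-AS table. Prefixes and AS numbers are invented
# for illustration; a real table would come from a RouteViews BGP dump.
PREFIX_TO_AS = {
    "192.0.2.0/24": 64500,
    "198.51.100.0/24": 64501,
    "198.51.0.0/16": 64502,
}


def build_table(prefix_to_as):
    """Parse the prefixes once and sort them longest-prefix first."""
    table = [(ipaddress.ip_network(p), asn) for p, asn in prefix_to_as.items()]
    table.sort(key=lambda entry: entry[0].prefixlen, reverse=True)
    return table


def lookup_as(table, ip):
    """Return the AS number of the longest matching prefix, or None."""
    addr = ipaddress.ip_address(ip)
    for network, asn in table:
        if addr in network:
            return asn  # first hit is the longest match because of the sort
    return None


TABLE = build_table(PREFIX_TO_AS)
```

A linear scan is enough for a sketch; a full routing table would normally call for a trie or a radix tree instead.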
How Bottleneck Classifier Works
• BytesInSndBuf = Rwin: Rwin limits sending, so the client is the bottleneck
• Cwin drops greatly and packet loss: Network path is the bottleneck
• Small initial Cwin: Slow start limits perf., so the network stack is the bottleneck
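The three rules on this slide can be sketched as a per-poll classifier. The field names, the `CWIN_DROP_RATIO` threshold, the `in_slow_start` flag, and the order in which the rules are tested are all assumptions made for illustration; the slides do not give the actual Web100 variables or thresholds.

```python
from dataclasses import dataclass


@dataclass
class Snapshot:
    """One poll of a connection's TCP stats (field names are hypothetical)."""
    bytes_in_snd_buf: int  # bytes waiting in the send buffer
    rwin: int              # receive window advertised by the client, in bytes
    cwin: int              # current congestion window, in bytes
    prev_cwin: int         # congestion window at the previous poll
    new_losses: int        # packets lost since the previous poll
    in_slow_start: bool    # whether the connection is still in slow start


# Assumed meaning of "Cwin drops greatly": it shrank to half or less.
CWIN_DROP_RATIO = 0.5


def classify(s):
    """Label one snapshot with the bottleneck it suggests, or "none"."""
    if s.bytes_in_snd_buf == s.rwin:
        return "client"          # Rwin limits sending
    if s.new_losses > 0 and s.cwin <= s.prev_cwin * CWIN_DROP_RATIO:
        return "network_path"    # loss together with a large Cwin drop
    if s.in_slow_start:
        return "network_stack"   # small initial Cwin, slow start limits perf.
    return "none"
```

Running `classify` on every 50 ms poll yields a sequence of labels per connection that later steps can aggregate.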
CoralCDN Experiment
• CoralCDN serves 1 million clients per day
• Experiment Environment
  • Deployment: A Clemson PlanetLab node
  • Polling interval: 50 ms
  • Traces to show: Feb 19th – 25th 2011
  • Total # of conn.: 209K
  • After removing cache-miss conn.: 137K (2008 ASes in total)
• Log space overhead
  • < 200MB per Coral server per day
What Are the Major Bottlenecks for Individual Clients?
• For each connection, we calculate the fraction of its lifetime spent under each bottleneck

| Bottleneck | % of Conn. with Bottleneck for >40% of Lifetime | Reason | Our suggestion |
| --- | --- | --- | --- |
| Server Application | 10.75% | Slow CPU or scarce disk resources of the PlanetLab node | Use more powerful PlanetLab machines |
| Server Network Stack | 18.72% | Congestion window rises too slowly for short conn. (>80% of the connections last <1 second) | Use larger initial congestion window |
| Network Path | 3.94% | Spotty network (discussed in next slide) | |
| Clients | 1.27% | Receive buffer too small (most of them are <30KB) | Filter them out of decision making |
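The per-connection metric on this slide can be sketched as follows, assuming each poll of a connection has already been labeled with a bottleneck (or "none") and that polls are evenly spaced, so the fraction of lifetime equals the fraction of polls. The label strings and function names are hypothetical.

```python
from collections import Counter

# Threshold from the slide: a bottleneck counts if it covers >40% of lifetime.
LIFETIME_THRESHOLD = 0.40


def bottleneck_fractions(labels):
    """Fraction of polls spent under each bottleneck, ignoring "none"."""
    if not labels:
        return {}
    counts = Counter(labels)
    total = len(labels)
    return {label: n / total for label, n in counts.items() if label != "none"}


def dominant_bottlenecks(labels, threshold=LIFETIME_THRESHOLD):
    """Bottlenecks that cover more than `threshold` of the connection lifetime."""
    fractions = bottleneck_fractions(labels)
    return sorted(label for label, f in fractions.items() if f > threshold)
```

For example, a connection polled ten times with five "network_stack" labels, one "client" label, and four "none" labels is counted under the network stack only.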
AS-Level Correlation
• CDNs make decisions at the AS level
  • e.g., change server selection for 1.1.1.0/24
• Explore at the AS level:
  • Filter out non-network bottlenecks
  • Whether network problems exist
  • Whether the problem is consistent
Filtering Out Non-Network Bottlenecks
• CDNs change server selection if clients have low throughput
• Non-network factors can limit throughput
• 236 out of 505 low-throughput ASes limited by non-network bottlenecks
• Filtering is helpful:
  • Don’t worry about things CDNs cannot control
  • Produce more accurate estimates of perf.
Network Problem at AS Level
• CDNs make decisions at the AS level

• Do conn. in the same AS share a common network problem?

• For 7.1% of the ASes, half of the conn. have a packet loss rate >10%
• Network problems are significant at the AS level
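The AS-level check above can be sketched as a simple aggregation. The input format, the definition of per-connection loss rate as lost packets over sent packets, and the reading of "half" as "at least half" are assumptions; the AS numbers in the usage note are made up.

```python
from collections import defaultdict


def ases_with_common_loss(conns, loss_threshold=0.10, conn_fraction=0.5):
    """Return ASes where at least `conn_fraction` of conns exceed the loss threshold.

    `conns` is an iterable of (asn, packets_sent, packets_lost) tuples,
    a hypothetical record format chosen for this sketch.
    """
    total = defaultdict(int)
    lossy = defaultdict(int)
    for asn, sent, lost in conns:
        if sent == 0:
            continue  # no data sent, so no loss rate can be computed
        total[asn] += 1
        if lost / sent > loss_threshold:
            lossy[asn] += 1
    return sorted(asn for asn in total if lossy[asn] / total[asn] >= conn_fraction)
```

With three connections in AS 64500, two of them above 10% loss, and four connections in AS 64501, one of them above 10% loss, only AS 64500 is reported.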
Consistent Packet Loss of AS
• CDNs care about predictive value of measurement
• Analyze the variance of average packet loss rates
  • Each epoch (1 min) has nonzero average loss rate
  • Loss rate is consistent across epochs (standard deviation < mean)

| Analysis Length | # of ASes with Consistent Packet Loss |
| --- | --- |
| One Week | 377 / 2008 |
| One Day (Feb 21st) | 122 / 739 |
| One Hour (Feb 21st 18:00~19:00) | 19 / 121 |
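The consistency test stated above (every epoch has a nonzero average loss rate, and the standard deviation across epochs is below the mean) can be sketched per AS as follows. Using the population standard deviation is an assumption, since the slides do not say which estimator was used, and `has_consistent_loss` is a hypothetical name.

```python
from statistics import mean, pstdev


def has_consistent_loss(epoch_loss_rates):
    """True if every epoch has nonzero loss and std deviation < mean.

    `epoch_loss_rates` holds one average packet loss rate per 1-minute epoch
    for a single AS.
    """
    if not epoch_loss_rates:
        return False
    if any(rate <= 0 for rate in epoch_loss_rates):
        return False  # some epoch saw no loss at all
    # Population standard deviation is an assumed choice of estimator.
    return pstdev(epoch_loss_rates) < mean(epoch_loss_rates)
```

An AS with steady loss around 5% per epoch passes, while an AS with one 50% spike among 1% epochs fails because the spread exceeds the mean.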
Conclusion & Future Work
• Use TCP-level stats to detect performance bottlenecks

• Identify major bottlenecks for a production CDN

• Discuss how to improve CDN’s operation with our tool

• Future work
  • Automatic and real-time analysis combined into CDN operation
  • Detect the problematic AS on the path
  • Combine TCP-level stats with application logs to debug online services
Thanks!
Questions?