Moving beyond end-to-end path information to optimize CDN performance

Krishnan, R., et al. In Proceedings of IMC '09. 2009. New York, NY, USA: ACM. Reported by: Eraser Huang. Email: [email protected]. 2011-03-23 @SYSU.

Page 1: Moving beyond end-to-end path information to optimize CDN  performance

Moving beyond end-to-end path information to optimize CDN performance

Krishnan, R., et al. In Proceedings of IMC '09. 2009. New York, NY, USA: ACM.

Reported by: Eraser Huang

Email: [email protected]

2011-03-23 @SYSU

Page 2: Moving beyond end-to-end path information to optimize CDN  performance

Agenda

◦Problem
◦Abstract
◦Overview
◦Path Latency Analysis
◦Diagnose Cases of Inefficient Routing
◦Some Examples
◦Limitation

Page 3: Moving beyond end-to-end path information to optimize CDN  performance

Problem

Client-Server Applications

Page 4: Moving beyond end-to-end path information to optimize CDN  performance

Problem

Content Distribution Network (CDN)

Page 5: Moving beyond end-to-end path information to optimize CDN  performance

Problem

Content Distribution Networks (CDN)

Page 6: Moving beyond end-to-end path information to optimize CDN  performance

Abstract

The main results of this paper are:

◦Redirecting every client to the server with the least latency does not suffice to optimize client latencies

◦Queuing delays often override the benefits of a client interacting with a nearby server

The dataset analyzed in this paper is available at: http://research.google.com/pubs/pub35590.html

Page 7: Moving beyond end-to-end path information to optimize CDN  performance

Overview

Google’s CDN Architecture

◦Aims to redirect each client to the node to which it has the least latency

◦The RTT measured to a client is taken to be representative of the client’s prefix

◦This redirection, however, is based on the prefix corresponding to the IP address of the DNS server that resolves the URL of the content on the client’s behalf
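
As a rough illustration of the redirection idea above, the following Python sketch picks, for each resolver prefix, the node with the lowest measured RTT. The prefixes, node names, RTT values, and function names are all hypothetical; the slides do not describe how Google actually implements this.

```python
from typing import Dict


def pick_node(rtt_ms_by_node: Dict[str, float]) -> str:
    """Return the node with the lowest measured RTT for one prefix."""
    return min(rtt_ms_by_node, key=rtt_ms_by_node.get)


def build_redirection_map(rtt_ms: Dict[str, Dict[str, float]]) -> Dict[str, str]:
    """Map each (DNS resolver) prefix to its lowest-latency node."""
    return {prefix: pick_node(nodes) for prefix, nodes in rtt_ms.items()}


# Hypothetical RTT measurements (ms), keyed by resolver prefix, then node.
rtt_ms = {
    "192.0.2.0/24": {"node-a": 35.0, "node-b": 120.0},
    "198.51.100.0/24": {"node-a": 210.0, "node-b": 48.0},
}

redirection = build_redirection_map(rtt_ms)
for prefix, node in redirection.items():
    print(prefix, "->", node)
```

Because the map is keyed by the resolver's prefix, a client that sits far from its DNS server would be redirected according to the resolver's latency rather than its own.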


Page 8: Moving beyond end-to-end path information to optimize CDN  performance

Overview

Goals

◦Understand the efficacy of latency-based redirection in enabling a CDN to deliver the best RTTs possible to its clients

◦Identify the broad categories of causes for poor RTTs experienced by clients

◦Implement a system to detect instances of poor RTTs and diagnose the root causes underlying them


Page 9: Moving beyond end-to-end path information to optimize CDN  performance

Overview

The authors have used WhyHigh to diagnose several instances of inflated latencies

◦BGP tables from routers
◦Mapping of routers to geographic locations
◦RTT logs for connections from clients
◦Traffic volume information
◦Active probes such as traceroutes and pings when necessary

◦Approximately 170K prefixes spread across the world

Page 10: Moving beyond end-to-end path information to optimize CDN  performance

Overview

Data Set - RTT Measurement

The RTT will be measured

Page 11: Moving beyond end-to-end path information to optimize CDN  performance

Overview

Data Set - Data Pre-Processing

1 - Map to prefix
• Map every client to its routable prefix
• Using a BGP snapshot

2 - Add geographic information
• Tag prefixes with geographic information
• From a commercial geolocation database

3 - Pruning
• Prune out prefixes with incorrect geographical information
• Geo-RTT
• “Confidence”
• Region spanning
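
Step 1 can be sketched in Python with the standard `ipaddress` module. This is only a minimal illustration: it assumes the mapping is a longest-prefix match against the prefixes in a BGP snapshot, which the slide does not state explicitly, and the prefix list and function name are hypothetical.

```python
import ipaddress
from typing import Iterable, Optional


def longest_prefix_match(client_ip: str, prefixes: Iterable[str]) -> Optional[str]:
    """Return the most specific prefix containing client_ip, or None."""
    addr = ipaddress.ip_address(client_ip)
    best = None
    for prefix in prefixes:
        net = ipaddress.ip_network(prefix)
        # Skip prefixes of the other IP version, then test containment.
        if net.version == addr.version and addr in net:
            if best is None or net.prefixlen > best.prefixlen:
                best = net
    return str(best) if best is not None else None


# Hypothetical routable prefixes taken from a BGP snapshot.
snapshot = ["203.0.113.0/24", "203.0.113.128/25", "198.51.100.0/24"]

print(longest_prefix_match("203.0.113.200", snapshot))
print(longest_prefix_match("203.0.113.5", snapshot))
print(longest_prefix_match("192.0.2.1", snapshot))
```

A linear scan is enough for a sketch; with a data set on the order of 170K prefixes, a trie or radix tree would be the usual choice.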


Page 12: Moving beyond end-to-end path information to optimize CDN  performance

Path Latency Analysis

Distribution of RTTs

Figure 2

Page 13: Moving beyond end-to-end path information to optimize CDN  performance

Path Latency Analysis

Three Main Components of TCP Layer RTT

◦Transmission delay (time to put a packet on to the wire)
• The size of control packets is typically 50 bytes, so the transmission delay will be less than 1ms on a dialup link

◦Propagation delay (time spent from one end of the wire to the other end)
• The client is far away from the node to which it has the lowest latency

◦Queuing delay (time spent by a packet waiting to be forwarded)

Page 14: Moving beyond end-to-end path information to optimize CDN  performance

Path Latency Analysis

Effectiveness of Client Redirection

Figure 3

Page 15: Moving beyond end-to-end path information to optimize CDN  performance

Path Latency Analysis

Characterizing Latency Inflation

More than 20%

Figure 4

Page 16: Moving beyond end-to-end path information to optimize CDN  performance

Path Latency Analysis

Data Set Partition

80% 20%

1) Prefixes closest to the node geographically

2) All other prefixes

Figure 3

Page 17: Moving beyond end-to-end path information to optimize CDN  performance

Path Latency Analysis

Characterizing Latency Inflation (after data set partition)

More than 20%

Figure 5

Page 18: Moving beyond end-to-end path information to optimize CDN  performance

Path Latency Analysis

Characterizing Delays

More than 40%

Figure 4

Page 19: Moving beyond end-to-end path information to optimize CDN  performance

Path Latency Analysis

Change of Route (Inefficient Routing)

4K 6K

Page 20: Moving beyond end-to-end path information to optimize CDN  performance

Path Latency Analysis

Characterizing Queuing Delays

Figure 7

Page 21: Moving beyond end-to-end path information to optimize CDN  performance

Path Latency Analysis

Summary

◦Redirection based on end-to-end RTTs results in most clients being served from a geographically nearby node;

◦A significant fraction of prefixes have inefficient routes to their nearby nodes;

◦Clients in most prefixes incur significant latency overheads due to queuing of packets.


Page 22: Moving beyond end-to-end path information to optimize CDN  performance

Diagnose Cases of Inefficient Routing

Identifying Inflated Prefixes

◦Compare the minimum RTT measured at the node across all connections to the prefix with the minimum RTT measured at the same node across all connections to clients within the prefix’s region

◦Declare a prefix to be inflated if that difference is greater than 50ms.
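
The rule above can be sketched in a few lines of Python. The 50 ms threshold comes from the slide; the prefix names, regions, RTT values, and the function name are hypothetical and only serve the example.

```python
from typing import Dict, List

INFLATION_THRESHOLD_MS = 50.0  # threshold stated on the slide


def find_inflated(min_rtt_ms: Dict[str, float], region_of: Dict[str, str]) -> List[str]:
    """Return prefixes whose minimum RTT exceeds their region's minimum by more than 50 ms.

    min_rtt_ms: minimum RTT measured at one node to each prefix.
    region_of:  region that each prefix belongs to.
    """
    # Minimum RTT seen at this node across all prefixes in each region.
    region_min: Dict[str, float] = {}
    for prefix, rtt in min_rtt_ms.items():
        region = region_of[prefix]
        region_min[region] = min(rtt, region_min.get(region, rtt))

    return sorted(
        prefix
        for prefix, rtt in min_rtt_ms.items()
        if rtt - region_min[region_of[prefix]] > INFLATION_THRESHOLD_MS
    )


# Hypothetical measurements at a single node.
min_rtt_ms = {"p1": 20.0, "p2": 65.0, "p3": 95.0, "p4": 30.0}
region_of = {"p1": "r1", "p2": "r1", "p3": "r1", "p4": "r2"}

inflated = find_inflated(min_rtt_ms, region_of)
print(inflated)
```

In this made-up data only `p3` is flagged: it is 75 ms above the best prefix in its region, while `p2` is 45 ms above it and stays under the threshold.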


Page 23: Moving beyond end-to-end path information to optimize CDN  performance

Diagnose Cases of Inefficient Routing

Identifying Causes of Latency Inflation

◦Snapshots of the BGP routing table provide information on the AS path being used to route packets to all prefixes

◦A log of all the BGP updates tells us the alternative paths available to each prefix

◦A traceroute from the node to a destination in the prefix, and pings to intermediate routers seen on the traceroute, give visibility into the reverse path back from prefixes

Page 24: Moving beyond end-to-end path information to optimize CDN  performance

Diagnose Cases of Inefficient Routing

Identifying Causes of Latency Inflation

◦Circuitousness along the forward path
• Sequence of locations traversed along the traceroute to the prefix

◦Circuitousness along the reverse path
• Significant RTT increase on a single hop of the traceroute
• Return TTL on the response from a probed interface
• Flow records gathered at border routers in Google’s network
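
One of the reverse-path signals above, a significant RTT increase on a single traceroute hop, can be illustrated with a short Python sketch. The 100 ms jump threshold, the per-hop RTT values, and the function name are assumptions made for this example; the slides do not give a threshold.

```python
from typing import List, Tuple


def large_rtt_jumps(hop_rtts_ms: List[float], jump_ms: float = 100.0) -> List[Tuple[int, float]]:
    """Return (hop number, RTT increase) for hops whose RTT jumps by more than jump_ms.

    hop_rtts_ms holds one RTT per traceroute hop, in hop order.
    Hop numbers are 1-based, as in traceroute output.
    """
    jumps = []
    for i in range(1, len(hop_rtts_ms)):
        delta = hop_rtts_ms[i] - hop_rtts_ms[i - 1]
        if delta > jump_ms:
            jumps.append((i + 1, delta))
    return jumps


# Hypothetical per-hop RTTs (ms) from a node toward a prefix.
hops = [1.0, 2.5, 3.0, 180.0, 185.0]

print(large_rtt_jumps(hops))
```

A jump like the one at hop 4 does not by itself prove a circuitous reverse path, which is why the slide lists it alongside return TTLs and flow records as corroborating evidence.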


Page 25: Moving beyond end-to-end path information to optimize CDN  performance

Diagnose Cases of Inefficient Routing

Helping Administrators Troubleshoot

◦Identifying Path Inflation Granularity
• (i) Prefixes sharing the same PoP-level path measured by traceroute
• (ii) Prefixes sharing the same AS path and the same exit and entry PoPs out of and into Google’s network
• (iii) Prefixes sharing the same AS path
• (iv) Prefixes belonging to the same AS

◦Ranking CDN Nodes
• The fraction of nearby prefixes that have inflated latencies
• The fraction of nearby prefixes that are served elsewhere
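
The two ranking metrics can be computed as in the Python sketch below. The node names and counts are made up, and the sort order (inflated fraction first, served-elsewhere fraction as a tie-breaker) is an assumption for the example; the slides only name the two fractions.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class NodeStats:
    name: str
    nearby: int            # number of prefixes geographically nearest to this node
    inflated: int          # nearby prefixes with inflated latencies
    served_elsewhere: int  # nearby prefixes redirected to another node

    @property
    def frac_inflated(self) -> float:
        return self.inflated / self.nearby

    @property
    def frac_elsewhere(self) -> float:
        return self.served_elsewhere / self.nearby


def rank_nodes(stats: List[NodeStats]) -> List[NodeStats]:
    """Order nodes from most to least in need of attention (assumed ordering)."""
    return sorted(stats, key=lambda s: (s.frac_inflated, s.frac_elsewhere), reverse=True)


# Hypothetical per-node counts.
nodes = [
    NodeStats("node-a", nearby=100, inflated=40, served_elsewhere=10),
    NodeStats("node-b", nearby=200, inflated=20, served_elsewhere=50),
    NodeStats("node-c", nearby=50, inflated=20, served_elsewhere=30),
]

ranking = [s.name for s in rank_nodes(nodes)]
print(ranking)
```

Using fractions rather than raw counts keeps a small node with many problem prefixes from being hidden behind a large node with a few.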


Page 26: Moving beyond end-to-end path information to optimize CDN  performance

Diagnose Cases of Inefficient Routing

Ranking of 13 CDN Nodes

Page 27: Moving beyond end-to-end path information to optimize CDN  performance

System Architecture of WhyHigh

Steps Involved in the WhyHigh Pipeline

Page 28: Moving beyond end-to-end path information to optimize CDN  performance

Diagnose Cases of Inefficient Routing

Identifying Root Causes of Inflation

◦Lack of peering
◦Limited bandwidth capacity
◦Routing misconfiguration
◦Traffic engineering

Page 29: Moving beyond end-to-end path information to optimize CDN  performance

Some Examples

Illustrative Cases

◦Case 2: No peering, and shorter path on less specific prefix

Data used in troubleshooting Case 2: (a) Extract of traceroute, and (b) AS paths received by Google.

- Traffic engineering

A node in India measured RTTs above 400ms to prefixes in IndSP1


Page 30: Moving beyond end-to-end path information to optimize CDN  performance

Some Examples

Illustrative Cases

◦Case 4: Peering, but inflated reverse path

Data used in troubleshooting Case 4: (a) Extract of traceroute, and (b) Pings to routers at peering link.

A node in Japan measured RTTs above 100ms to prefixes in IndSP1

Page 31: Moving beyond end-to-end path information to optimize CDN  performance

Some Examples

Summarizing Use of WhyHigh

WhyHigh’s classification of inflated paths


Page 32: Moving beyond end-to-end path information to optimize CDN  performance

Limitation

Traceroutes yield path information only at the IP routing layer

◦However, path inflation could occur below layer 3, e.g., in MPLS tunnels;
◦may not be explainable by the geographic locations of traceroute hops

Only has access to RTT data

◦TCP transfer times of medium to large objects could be inflated by other factors such as loss rate and bandwidth

Page 33: Moving beyond end-to-end path information to optimize CDN  performance

Thank You!
