Hotspot Detection in a Service Oriented Architecture

Pranay Anchuri, [email protected], http://cs.rpi.edu/~anchupa Rensselaer Polytechnic Institute, Troy, NY

Roshan Sumbaly, [email protected] Coursera, Mountain View, CA

Sam Shah, [email protected] LinkedIn, Mountain View, CA

Introduction

Largest professional network.

300M members from 200 countries.

2 new members per second.

Service Oriented Architecture

What is a Hotspot

Hotspot : Service responsible for suboptimal performance of a user facing functionality.

Performance measures: latency, cost to serve, error rate.

Who uses hotspot detection?

Engineering teams: minimize latency for the user; increase the throughput of the servers.

Operations teams: reduce the cost of serving user requests.

Goal

Data - Service Call Graphs

Service call metrics logged into a central system.

Call graph structure is reconstructed from the random trace ID shared by the calls of a request.
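
A minimal sketch of how call graphs might be rebuilt from centrally logged call records. The record fields `trace_id`, `call_id`, and `parent_id` are assumptions made for illustration; the slides only state that a random trace ID is used.

```python
from collections import defaultdict

def build_call_graphs(records):
    """Group logged calls by trace ID and link each call to its parent.

    Each record is a dict with hypothetical keys:
    trace_id, call_id, parent_id (None for the frontend call), service.
    """
    graphs = defaultdict(lambda: {"root": None, "children": defaultdict(list)})
    for r in records:
        g = graphs[r["trace_id"]]
        if r["parent_id"] is None:
            g["root"] = r["call_id"]          # frontend call of this request
        else:
            g["children"][r["parent_id"]].append(r["call_id"])
    # Convert to plain dicts so missing keys are not created on lookup.
    return {t: {"root": g["root"], "children": dict(g["children"])}
            for t, g in graphs.items()}

records = [
    {"trace_id": "t1", "call_id": 1, "parent_id": None, "service": "Read profile"},
    {"trace_id": "t1", "call_id": 2, "parent_id": 1, "service": "Content Service"},
    {"trace_id": "t1", "call_id": 3, "parent_id": 1, "service": "Context Service"},
    {"trace_id": "t2", "call_id": 7, "parent_id": None, "service": "Read profile"},
]
graphs = build_call_graphs(records)
```

Records from different requests can arrive interleaved; grouping by trace ID is what separates them into one graph per request.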

Example of Service Call Graph

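[Figure: call graph for a Read profile request, with nodes Content Service, Context Service, Content Service, Entitlements, and Visibility. The numeric labels survive only as "3 712" and "10 11" and cannot be reliably assigned to nodes or edges.]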

Challenges in mining hotspots

Structure of call graphs

The structure of call graphs changes rapidly across requests. It depends on:
● Member's attributes.
● A/B testing.
● Changes to the code base.

Over 90% of structures are unique for the most requested services.

Asynchronous service calls

Calls A→B and A→C are:
● Serial: C is called after B returns to A.
● Parallel: B and C are called at the same time or within a brief time span.

Parallel service calls are particularly difficult to handle. Degree of parallelism ~ 20 for some services.
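
A small sketch of how two sibling calls could be labeled serial or parallel from their logged intervals, and how a degree of parallelism could be counted. Treating "parallel" as overlapping `(start, end)` intervals is an assumption; the slides do not give the exact rule.

```python
def classify(b, c):
    """Label two calls made by the same parent, given (start, end) intervals."""
    first, second = sorted([b, c])            # order by start time
    # Serial: the later call starts only after the earlier one has returned.
    return "serial" if second[0] >= first[1] else "parallel"

def degree_of_parallelism(intervals):
    """Maximum number of sibling calls in flight at the same moment."""
    events = []
    for start, end in intervals:
        events.append((start, 1))             # call issued
        events.append((end, -1))              # call returned
    # At equal timestamps a return (-1) sorts before a call (+1),
    # so back-to-back calls are not counted as overlapping.
    events.sort()
    active = best = 0
    for _, delta in events:
        active += delta
        best = max(best, active)
    return best
```

For example, `classify((0, 3), (4, 11))` gives `"serial"`, while `classify((0, 3), (1, 2))` gives `"parallel"`.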

Related Work

Hu et al. [SIGCOMM 04, INFOCOM 05]: tools to detect bottlenecks along network paths.

Mann et al. [USENIX 11]: models to estimate latency as a function of RPC latencies.

Why don't existing methods work?

● The metric cannot be controlled, as it is in bottleneck detection algorithms.
● Millions of small networks have to be analyzed.
● Parallel service calls.

Our approach

Optimize and summarize approach

● Given call graphs
● Hotspots in each call graph
● Ranking hotspots

What are the top-k hotspots in a call graph?

Hotspots in a specific call graph irrespective of other call graphs for the same type of request.

Key Idea

Which k services, had they already been optimized, would have led to the maximum reduction in the latency of the request? (Specific to a particular call graph.)

Quantifying impact of a service

What if a service was optimized by θ? (Think after the fact.)
● Its internal computations are θ times faster.
● No effect on the overall latency if its parent is waiting on another service call to return.
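
The two points above can be illustrated with a deliberately simplified sketch: one parent that issues all of its subcalls in parallel, where each subcall's latency is entirely its own internal computation. The function names and the single-level model are assumptions for illustration, not the model used in the paper.

```python
def parent_latency(own_time, parallel_children):
    """Parent latency: its own computation plus the slowest parallel subcall."""
    return own_time + max(parallel_children.values(), default=0.0)

def impact(own_time, parallel_children, service, theta):
    """Reduction in parent latency if `service` had been theta times faster."""
    before = parent_latency(own_time, parallel_children)
    optimized = dict(parallel_children)
    optimized[service] = optimized[service] / theta
    return before - parent_latency(own_time, optimized)

children = {"B": 10.0, "C": 4.0}               # hypothetical latencies
gain_b = impact(1.0, children, "B", 2.0)       # B is the subcall the parent waits on
gain_c = impact(1.0, children, "C", 2.0)       # parent is still waiting on B
```

Here speeding up B by 2x shortens the parent by 5.0, while speeding up C by 2x changes nothing because the parent is still waiting on B.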

Example

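[Figure: call graph annotated with [start, end] intervals [0,11], [0,3], [1,2], [1.3, 1.6], [2.1, 2.5], [4,11], [6,9], [7,8]. One service is marked "2x faster" and the figure shows the effect of the 2x speedup.]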

Local effect of optimization

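Latency: sum of computation and waiting times.

Effect: shorter computation times and earlier subcalls.

[Equations 1) to 3) and the symbols in their legend were lost in extraction. The legend defines a service and its subcall issued after a number of computation intervals.]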

Negative example

[Figure: negative example on the same call graph, with intervals [0,11], [0,3], [1,2], [1.3, 1.6], [2.1, 2.5], [4,11], [6,9], [7,8].]

Example

Under the propagation assumption

● Computing the optimal services is NP-hard.
● Reduction from a variation of the subset sum problem.
● Construction and proof in the paper.

Relaxation

Variation of the propagation assumption that allows for a service to propagate fractional effects to its parent. Leads to a greedy algorithm.

Greedy algorithm to compute top-k hotspots

Given an optimization factor θ:
● Repeatedly select the service that has the maximum impact on the frontend service.
● Update the times after each selection.
● Stop after k iterations.
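
The loop above can be sketched as follows, under a simplified latency model that is an assumption of this example rather than the paper's formulation: each service has its own computation time, its subcalls are grouped into stages, stages run one after another, and calls within a stage run in parallel. The names `Service`, `latency`, and `greedy_top_k` are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class Service:
    name: str
    self_time: float                              # own computation time
    stages: list = field(default_factory=list)    # list of non-empty lists of Service

def latency(node, speedup):
    """Latency of `node` when each service named in `speedup` runs theta times faster."""
    own = node.self_time / speedup.get(node.name, 1.0)
    # Stages are serial; within a stage the parent waits for the slowest call.
    return own + sum(max(latency(c, speedup) for c in stage) for stage in node.stages)

def all_names(node):
    yield node.name
    for stage in node.stages:
        for child in stage:
            yield from all_names(child)

def greedy_top_k(root, k, theta):
    """Pick up to k services, each time the one with the largest gain at the root."""
    chosen, picks = {}, []
    names = sorted(set(all_names(root)))
    for _ in range(k):
        base = latency(root, chosen)              # times updated by earlier picks
        best_name, best_gain = None, 0.0
        for name in names:
            if name in chosen:
                continue
            gain = base - latency(root, {**chosen, name: theta})
            if gain > best_gain:
                best_name, best_gain = name, gain
        if best_name is None:                     # no remaining service helps
            break
        chosen[best_name] = theta
        picks.append((best_name, best_gain))
    return picks

root = Service("frontend", 2.0, [
    [Service("A", 10.0), Service("B", 4.0)],      # A and B are called in parallel
    [Service("C", 3.0)],                          # C is called after both return
])
top2 = greedy_top_k(root, k=2, theta=2.0)
```

In this toy graph the first pick is A (gain 5.0) and the second is C (gain 1.5); B is never picked because its parent is always waiting on A. Services are keyed by name, so a service that appears several times in one graph is sped up everywhere it is called.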

Ranking hotspots

Top services change significantly across different call graphs.

Rank hotspots on:
● Frequency (itemset mining).
● Impact on the frontend service.
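
A minimal sketch of the two ranking criteria, assuming each call graph has already produced a list of `(service, gain)` hotspots. It counts single services only, which is the simplest case of itemset mining and not necessarily the scheme used in the paper; `rank_hotspots` and the input layout are hypothetical.

```python
from collections import Counter, defaultdict

def rank_hotspots(per_graph_hotspots):
    """Rank services by how often they are hotspots and by their total gain."""
    freq = Counter()
    total_gain = defaultdict(float)
    for hotspots in per_graph_hotspots:
        for name, gain in hotspots:
            freq[name] += 1                   # appeared as a hotspot in this graph
            total_gain[name] += gain          # latency saved at the frontend
    # Ties are broken alphabetically so the order is deterministic.
    by_frequency = sorted(freq, key=lambda n: (-freq[n], n))
    by_impact = sorted(total_gain, key=lambda n: (-total_gain[n], n))
    return by_frequency, by_impact

per_graph = [
    [("A", 5.0), ("C", 1.5)],
    [("A", 2.0), ("B", 4.0)],
    [("B", 6.0)],
]
by_frequency, by_impact = rank_hotspots(per_graph)
```

The two orderings can disagree: here A and B are equally frequent, but B has the larger total impact.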

Rest of the paper

Similar approach applied to cost of request metric.

Generalized framework for optimizing arbitrary metrics.

Other ranking schemes.

Results

Dataset

Request type   Avg # of call graphs per day*   Avg # of service calls per request   Avg # of subcalls per service   Max # of parallel subcalls
Home           10.2 M                          16.90                                1.88                            9.02
Mailbox        3.33 M                          23.31                                1.9                             8.88
Profile        3.14 M                          17.31                                1.86                            11.04
Feed           1.75 M                          16.29                                1.87                            8.97

* Scaled down by a constant factor

vs Baseline algorithm

User of the system

Consistency over a time period

Conclusion

Conclusions

Defined hotspots in service oriented architectures.

Framework to mine hotspots w.r.t. various performance metrics.

Experiments on real world large scale datasets.

Thanks

Questions?