Automated Parameterization of Performance Models from ... · Automated Parameterization of...

154
Automated Parameterization of Performance Models from Measurements Giuliano Casale Imperial College London, UK Simon Spinner University of Würzburg, Germany Weikun Wang Imperial College London, UK Tutorial @ ICPE 2016, Delft, the Netherlands, March 13, 2016

Transcript of Automated Parameterization of Performance Models from ... · Automated Parameterization of...

Page 1: Automated Parameterization of Performance Models from ... · Automated Parameterization of Performance Models from ... # Transactions ... Joint work with S. Kraft and S. Pacheco-Sanchez

Automated Parameterization of Performance Models from Measurements Giuliano Casale Imperial College London, UK Simon Spinner University of Würzburg, Germany Weikun Wang Imperial College London, UK Tutorial @ ICPE 2016, Delft, the Netherlands, March 13, 2016

Page 2: Automated Parameterization of Performance Models from ... · Automated Parameterization of Performance Models from ... # Transactions ... Joint work with S. Kraft and S. Pacheco-Sanchez

2

Workload Characterization Common parameters in performance models: Service demand of a request CPU time, bandwidth consumed, …

Arrival rate of requests

Applications Automated performance modelling Resource cost splitting Performance anomaly detection …

Page 3: Automated Parameterization of Performance Models from ... · Automated Parameterization of Performance Models from ... # Transactions ... Joint work with S. Kraft and S. Pacheco-Sanchez

3

Example: Queueing Modelling

Demands

Java Modelling Tools: http://jmt.sf.net

Page 4: Automated Parameterization of Performance Models from ... · Automated Parameterization of Performance Models from ... # Transactions ... Joint work with S. Kraft and S. Pacheco-Sanchez

4

Example: Cost Splitting Carico Workload 1

Tran

sazi

oni/o

ra

Lun Mar Mer Gio Ven Sab Dom

010

020

030

040

0

Carico Workload 2

Tran

sazi

oni/o

ra

050

150

250

Lun Mar Mer Gio Ven Sab Dom

Ripartizione dei Costi

Util

izzo

CP

U -

%

020

4060

8010

0

Lun Mar Mer Gio Ven Sab DomCarico Workload 3

Tran

sazi

oni/o

ra

050

100

150

200

Lun Mar Mer Gio Ven Sab Dom

Workload 1

Workload 2

Workload 3

# Tr

ansa

ctio

ns

# Tr

ansa

ctio

ns

# Tr

ansa

ctio

ns

Mo Tu We Th Fr Sa Su

Mo Tu We Th Fr Sa Su

Mo Tu We Th Fr Sa Su

Mo Tu We Th Fr Sa Su

How to recover weight of individual contributions to utilization?

Util

izat

ion

[0-1

00%

]

We know from theory that the weight is exactly the service demand!

Page 5: Automated Parameterization of Performance Models from ... · Automated Parameterization of Performance Models from ... # Transactions ... Joint work with S. Kraft and S. Pacheco-Sanchez

5

A Typical Challenge

5

HTTP Requests in the WS (Web Server)

Observation period T

1 request in WS

3 requests in WS

0 requests in WS

Time

Page 6: Automated Parameterization of Performance Models from ... · Automated Parameterization of Performance Models from ... # Transactions ... Joint work with S. Kraft and S. Pacheco-Sanchez

6

A Typical Challenge OS schedules jobs in round robin If n requests run simultaneously, each will approximately receive 1/n of the CPU time Process Sharing is a round robin where the quantum of time assigned to each request is infinitesimal

6

X X

Service time S of the yellow request

Time

CPU

33% CPU time each

50% each

100% for blue

X

Quantum

Requests Arrive

Simultaneously

3 requests running

Page 7: Automated Parameterization of Performance Models from ... · Automated Parameterization of Performance Models from ... # Transactions ... Joint work with S. Kraft and S. Pacheco-Sanchez

7

Tutorial Agenda

Introduction Demand Estimation Utilization-based

– LibReDE tool Response-based Queue Length-based

– FG tool Comparison Study & Case Studies Arrival Rate Estimation

– M3A tool

Page 8: Automated Parameterization of Performance Models from ... · Automated Parameterization of Performance Models from ... # Transactions ... Joint work with S. Kraft and S. Pacheco-Sanchez

1

Demand Estimation

Page 9: Automated Parameterization of Performance Models from ... · Automated Parameterization of Performance Models from ... # Transactions ... Joint work with S. Kraft and S. Pacheco-Sanchez

2

Overview Estimation Approaches (first part) Simple Utilization Response Time

LibReDE demo Estimation Approaches (second part) Queue Length

Page 10: Automated Parameterization of Performance Models from ... · Automated Parameterization of Performance Models from ... # Transactions ... Joint work with S. Kraft and S. Pacheco-Sanchez

3

Response Time Approximation

Trivial approximation: 𝐷𝐷𝑖𝑖,𝑐𝑐 ≈ 𝑅𝑅𝑐𝑐 Assumptions resource dominates system response time waiting time in queue ≪ 𝐷𝐷𝑖𝑖,𝑐𝑐

Only applicable at low resource utilization

Page 11: Automated Parameterization of Performance Models from ... · Automated Parameterization of Performance Models from ... # Transactions ... Joint work with S. Kraft and S. Pacheco-Sanchez

4

Service Demand Law Basic operational law:

𝐷𝐷𝑖𝑖,𝑐𝑐 =𝑈𝑈𝑖𝑖,𝑐𝑐𝑋𝑋0,𝑐𝑐

Partial utilization 𝑈𝑈𝑖𝑖,𝑐𝑐 is hard to derive Operating system: per-process statistics Profilers: high-overheads

2 alternative solutions: Controlled experiment Partitioning

Page 12: Automated Parameterization of Performance Models from ... · Automated Parameterization of Performance Models from ... # Transactions ... Joint work with S. Kraft and S. Pacheco-Sanchez

5

Controlled Experiment Measurement Interval CPU Utilization Requests executed in separate experiments Resource Demand Problems: Not applicable at runtime Mutual interference

40%

Request1

30%

Request2

Page 13: Automated Parameterization of Performance Models from ... · Automated Parameterization of Performance Models from ... # Transactions ... Joint work with S. Kraft and S. Pacheco-Sanchez

6

Partitioning Measurement Interval CPU Utilization Mixed Workload How to partition processing time? Response times Additional performance counters

60%

Request2

Request1 ?

Page 14: Automated Parameterization of Performance Models from ... · Automated Parameterization of Performance Models from ... # Transactions ... Joint work with S. Kraft and S. Pacheco-Sanchez

7

Estimation Approaches

Data Collection Data Collection

Demand Estimation

Demand Estimation

Modeling Assumptions (scheduling, service distribution)

Modeling Assumptions (scheduling, service distribution)

Model Solution Model Solution

Utilization Approach Response Time Approach

Page 15: Automated Parameterization of Performance Models from ... · Automated Parameterization of Performance Models from ... # Transactions ... Joint work with S. Kraft and S. Pacheco-Sanchez

8

Linear Regression Linear model (based on Utilization Law) 𝑈𝑈𝑖𝑖 = ∑ 𝑋𝑋𝑖𝑖,𝑐𝑐𝑁𝑁

𝑐𝑐 ∙ 𝐷𝐷𝑖𝑖,𝑐𝑐 + 𝑈𝑈0 Example:

At least 𝑚𝑚 > 𝑛𝑛 observations required Alternative solutions: Least-Squares Regression Least Absolute Differences Regression …

0.54 = 3 * 𝐷𝐷𝑖𝑖,1 + … + 8 * 𝐷𝐷𝑖𝑖,𝑛𝑛 0.72 = 9 * 𝐷𝐷𝑖𝑖,1 + … + 4 * 𝐷𝐷𝑖𝑖,𝑛𝑛 0.33 = 2 * 𝐷𝐷𝑖𝑖,1 + … + 9 * 𝐷𝐷𝑖𝑖,𝑛𝑛 … = …

Page 16: Automated Parameterization of Performance Models from ... · Automated Parameterization of Performance Models from ... # Transactions ... Joint work with S. Kraft and S. Pacheco-Sanchez

9

Example: Outliers

Outliers can bias the regression

Fit Without Outliers

Page 17: Automated Parameterization of Performance Models from ... · Automated Parameterization of Performance Models from ... # Transactions ... Joint work with S. Kraft and S. Pacheco-Sanchez

10

Example: Collinearities

e.g.: logins proportional to logouts

“Right” answer not well-defined

Page 18: Automated Parameterization of Performance Models from ... · Automated Parameterization of Performance Models from ... # Transactions ... Joint work with S. Kraft and S. Pacheco-Sanchez

11

Other Approaches Robust regression Least Absolute Differences: Zhang et al. (2007) Least Trimmed Squares: Casale et al. (2008)

Machine-learning Clusterwise linear regression: Cremonesi et al. (2010) Pattern matching: Cremonesi et al. (2014)

Page 19: Automated Parameterization of Performance Models from ... · Automated Parameterization of Performance Models from ... # Transactions ... Joint work with S. Kraft and S. Pacheco-Sanchez

12

Utilization Approaches Utilization-based approaches Advantages Only utilization and throughput data required Minimal assumptions:

– Any scheduling strategy – Any interarrival distribution – (Any service time distribution)

Disadvantages Robustness Amount of data

Page 20: Automated Parameterization of Performance Models from ... · Automated Parameterization of Performance Models from ... # Transactions ... Joint work with S. Kraft and S. Pacheco-Sanchez

13

Response Time Approaches Assumptions Single queue: closed-form solution exists Queueing network: product-form

Response time equations depend on Scheduling strategy Service distribution Interarrival time distribution

If M/G/1 with PS or LCFS or M/M/1 with FCFS and class-independent service

times, then 𝑅𝑅 = 𝐷𝐷1−𝑈𝑈

Page 21: Automated Parameterization of Performance Models from ... · Automated Parameterization of Performance Models from ... # Transactions ... Joint work with S. Kraft and S. Pacheco-Sanchez

14

General Optimization Assumptions: Variables 𝑫𝑫 = (𝐷𝐷1,1, … ,𝐷𝐷1,𝑛𝑛, … ,𝐷𝐷𝑖𝑖,1, … ,𝐷𝐷𝑖𝑖,𝑛𝑛) Queueing Network QN 𝒛𝒛 = 𝑓𝑓(𝑫𝑫) Observation data 𝒛𝒛�

Optimization Problem: min

𝑫𝑫𝒛𝒛 − 𝒛𝒛�

𝑫𝑫 may be subject to certain constraints

arbitrary norm

Page 22: Automated Parameterization of Performance Models from ... · Automated Parameterization of Performance Models from ... # Transactions ... Joint work with S. Kraft and S. Pacheco-Sanchez

15

Examples Menascé (2008): Liu et al. (2006):

Squared response time error Product-form solution (non-linear!)

Constrained to valid solutions

Page 23: Automated Parameterization of Performance Models from ... · Automated Parameterization of Performance Models from ... # Transactions ... Joint work with S. Kraft and S. Pacheco-Sanchez

16

Kalman Filter Dynamical system State model:

𝑿𝑿𝑘𝑘=𝑭𝑭𝑘𝑘−1𝑿𝑿𝑘𝑘−1+𝑩𝑩𝑘𝑘−1𝒖𝒖𝑘𝑘−1 + 𝒘𝒘𝑘𝑘−1 Observation model:

𝒁𝒁𝑘𝑘 = 𝑯𝑯𝑘𝑘𝑿𝑿𝑘𝑘+𝒗𝒗𝑘𝑘 Filter 𝒛𝒛𝑘𝑘 , 𝒛𝒛𝑘𝑘−1, 𝒛𝒛𝑘𝑘−2, … , 𝒛𝒛1 𝑿𝑿�𝑘𝑘~𝑁𝑁(𝒙𝒙�𝑘𝑘,𝑷𝑷�𝑘𝑘)

observations

previous state controlled input uncorrelated noise next state

estimated mean value estimated covariance

time-series of observations

observation noise

Page 24: Automated Parameterization of Performance Models from ... · Automated Parameterization of Performance Models from ... # Transactions ... Joint work with S. Kraft and S. Pacheco-Sanchez

17

Applied to Demand Estimation

State vector 𝑿𝑿𝑘𝑘 = 𝑫𝑫 Constant state model 𝑿𝑿𝑘𝑘=𝑿𝑿𝑘𝑘−1 + 𝒘𝒘𝑘𝑘−1 Observation model (e.g., Kumar et al. 2009)

𝑅𝑅1⋮𝑈𝑈

=

𝐷𝐷11 − 𝑈𝑈⋮

𝑋𝑋1 ∙ 𝐷𝐷1 + ⋯+ 𝑋𝑋𝐶𝐶 ∙ 𝐷𝐷𝑐𝑐

Other observation models are possible (e.g., Zheng et al. 2008, Wang et al. 2012)

Page 25: Automated Parameterization of Performance Models from ... · Automated Parameterization of Performance Models from ... # Transactions ... Joint work with S. Kraft and S. Pacheco-Sanchez

18

Want to learn more?

Elsevier PEVA, October 2015.

Page 26: Automated Parameterization of Performance Models from ... · Automated Parameterization of Performance Models from ... # Transactions ... Joint work with S. Kraft and S. Pacheco-Sanchez

19

http://descartes.tools/librede Eclipse Public License (EPL)

Page 27: Automated Parameterization of Performance Models from ... · Automated Parameterization of Performance Models from ... # Transactions ... Joint work with S. Kraft and S. Pacheco-Sanchez

1

Response Time Based Estimation

Joint work with S. Kraft and S. Pacheco-Sanchez (SAP Belfast,

UK) and Juan F. Pérez (U. Melbourne, Australia)

Page 28: Automated Parameterization of Performance Models from ... · Automated Parameterization of Performance Models from ... # Transactions ... Joint work with S. Kraft and S. Pacheco-Sanchez

2

Paradigm Shift

Demand Estimation

Modeling Assumptions

(Scheduling, Service Distribution)

Model Generation and Solution

Data Collection

Demand Estimation

Modeling Assumptions

(Scheduling, Service Distribution)

Model Generation and Solution

Data Collection

Utilization Approach Response Time Approach

Page 29: Automated Parameterization of Performance Models from ... · Automated Parameterization of Performance Models from ... # Transactions ... Joint work with S. Kraft and S. Pacheco-Sanchez

3

Estimate demand D from response time R

We investigate the likelihood function in First-Come First-Served (FCFS) queues

– e.g., admission control, disk drive buffers, … Processor Sharing (PS) queues

– e.g., CPUs, bandwidth sharing, …

Observation

R D4 D3

D1

D2 D0

1. For each observed R sample 2. Draw D from parameter space 3. Compute likelihood P[R|D] 4. Move in parameter space to

maximize P[R|D]

Parameter Space

Maximum Likelihood Estimator

Page 30: Automated Parameterization of Performance Models from ... · Automated Parameterization of Performance Models from ... # Transactions ... Joint work with S. Kraft and S. Pacheco-Sanchez

4

Model response time using absorbing CTMCs Under FCFS, future arrivals do not affect response time distribution of the tagged job

RT Likelihood in FCFS queues

D1 D2 D2 D1 D1

nK, jobs of class k in queue

D1 D1 1 2

3

4 5

1

Abs

orbi

ng tr

ansi

tion

Model State Space of Markov Chain 2

1

D1 D2 D2 D1 D1

3

D1 D2 D2

4

D2 D2

5

D2

1

Page 31: Automated Parameterization of Performance Models from ... · Automated Parameterization of Performance Models from ... # Transactions ... Joint work with S. Kraft and S. Pacheco-Sanchez

5

RT Likelihood in FCFS queues

Probability of being absorbed by time t into a give CTMC state Well-understood: PH-type distributed FCFS Example:

Backlog seen upon arrival

Class 1 arrival

ML Problem (K classes)

Page 32: Automated Parameterization of Performance Models from ... · Automated Parameterization of Performance Models from ... # Transactions ... Joint work with S. Kraft and S. Pacheco-Sanchez

6

Monitoring dataset Active mix: (1 ,2 ,1 ,1 , 0 ) Admission state (mix) and response times

Assumptions

V CPUs

R Classes + Class switching

W Workers

… Admission

Response time

Multi-core server

Page 33: Automated Parameterization of Performance Models from ... · Automated Parameterization of Performance Models from ... # Transactions ... Joint work with S. Kraft and S. Pacheco-Sanchez

7

PI: Trajectory Inference

Class switching probabilities Users submit requests cyclically Requests issued change class over time

Closed class-switching queueing model V CPUs, R classes O(V2R) states Inference of trajectory too complex

?

Page 34: Automated Parameterization of Performance Models from ... · Automated Parameterization of Performance Models from ... # Transactions ... Joint work with S. Kraft and S. Pacheco-Sanchez

8

Dataset characteristics

CI: Complete information (baseline) V=1 CPU: full state trajectory V>1 CPUs: no individual CPU states We split demand proportionally, taking into account the active workers

PI: Partial information Sample admission state and response time Mean throughput is assumed known

Page 35: Automated Parameterization of Performance Models from ... · Automated Parameterization of Performance Models from ... # Transactions ... Joint work with S. Kraft and S. Pacheco-Sanchez

9

CI: Demand estimation V=1 CPU Full demand distribution

Request Runtime

Active workers

Demand Request j

Class r

Scale by Active CPUs V>1 CPU

Page 36: Automated Parameterization of Performance Models from ... · Automated Parameterization of Performance Models from ... # Transactions ... Joint work with S. Kraft and S. Pacheco-Sanchez

10

0

1

2

3

4

5

6

7

8

9

10

20 60 100 140 180 220

100 samples1000 samples

CI: Baseline

Number of users (N) – R=2 classes - 0.1 prob. class switch

Err

or (

%)

V=2 CPUs, W=8 workers

Page 37: Automated Parameterization of Performance Models from ... · Automated Parameterization of Performance Models from ... # Transactions ... Joint work with S. Kraft and S. Pacheco-Sanchez

11

CI requires very detailed measurements Closed queueing network model Assume a fixed mix as seen upon job arrival No class switching ( tractable) Model can estimate response time of arriving job

PI: Approximation

CPU-0

CPU-1

Admission

Inter Admission

Time

CPU 0+1 (PS queue)

Page 38: Automated Parameterization of Performance Models from ... · Automated Parameterization of Performance Models from ... # Transactions ... Joint work with S. Kraft and S. Pacheco-Sanchez

12

V=1 CPU Model assumed in equilibrium

RPS: Regression Approach

Queue seen at admission

(incl. arriving job)

Response Time

Class r

V>1 CPU Individual CPU state estimated

Demand Class r

Average queue per CPU

Page 39: Automated Parameterization of Performance Models from ... · Automated Parameterization of Performance Models from ... # Transactions ... Joint work with S. Kraft and S. Pacheco-Sanchez

13

0.00

20.00

40.00

60.00

80.00

100.00

120.00

140.00

160.00

20 60 100 140 180 220

CIRPS

RPS Results

100 samples, V=4 CPUs, W=16 workers

Number of users (N) – R=2 classes - 0.1 prob. class switch

Err

or (

%)

Best in heavy-load

Page 40: Automated Parameterization of Performance Models from ... · Automated Parameterization of Performance Models from ... # Transactions ... Joint work with S. Kraft and S. Pacheco-Sanchez

14

MLPS: Maximum Likelihood Maximum Likelihood Estimation (MLPS) Search over mean demand guesses Maximize likelihood of observed dataset

Response time likelihood Tagged customer method (absorbing CTMC) Initialized with state seen upon admission Mean demand guess CTMC rates

Page 41: Automated Parameterization of Performance Models from ... · Automated Parameterization of Performance Models from ... # Transactions ... Joint work with S. Kraft and S. Pacheco-Sanchez

15

Job of class 3 arrives at system with V=1 CPU

Mix seen upon arrival: 1 job of class 1, 3 jobs of class 2

We study the transient of this CTMC to obtain the response time distribution of the class-3 job

MLPS: Absorbing CTMC

0,0,1

1,0,1

0,1,1

0,2,1

0,2,1

1,1,1

0,3,1

1,2,1

1,3,1

1/E[D3]

1/(2E[D3])

1/(3E[D3])

1/(4E[D3])

1/(5E[D3])

1/(4E[D3])

1/(3E[D3])

1/(2E[D3])

1/(3E[D3]) Initial state

Transitions depend on E[D1] and E[D2]

Page 42: Automated Parameterization of Performance Models from ... · Automated Parameterization of Performance Models from ... # Transactions ... Joint work with S. Kraft and S. Pacheco-Sanchez

16

MLPS: Absorbing CTMC

V=1 CPU Dataset: Likelihood function for each sample:

V>1 CPUs Load-dependent rates

init with state at admission

trajectory in ri sec

completion rates

CTMC generator

1/demand

Page 43: Automated Parameterization of Performance Models from ... · Automated Parameterization of Performance Models from ... # Transactions ... Joint work with S. Kraft and S. Pacheco-Sanchez

17

0.00

5.00

10.00

15.00

20.00

25.00

30.00

35.00

20 60 100 140 180 220

CIMLPS

MLPS: Results

Number of users (N) – R=2 classes - 0.1 prob. class switch

Err

or (

%)

100 samples, V=4 CPUs, W=16 workers

Best in light load

Page 44: Automated Parameterization of Performance Models from ... · Automated Parameterization of Performance Models from ... # Transactions ... Joint work with S. Kraft and S. Pacheco-Sanchez

18

0.00

2.00

4.00

6.00

8.00

10.00

12.00

20 60 100 140 180 220

CIMINPS

MinPS: min(DRPS, DMLPS)

100 samples, V=4 CPUs, W=16 workers

Number of users (N) – R=2 classes - 0.1 prob. class switch

Err

or (

%)

best in light or heavy load

Page 45: Automated Parameterization of Performance Models from ... · Automated Parameterization of Performance Models from ... # Transactions ... Joint work with S. Kraft and S. Pacheco-Sanchez

19

1

10

100

1,000

2 4 8 16 32 64

MLPSFMLPSRPS/ERPS

MinPS: ODE Approximation

100 samples: W=128 workers

V (CPUs)

Run

ning

Tim

e [s

]

exponential growth

scalable ODE approx

Page 46: Automated Parameterization of Performance Models from ... · Automated Parameterization of Performance Models from ... # Transactions ... Joint work with S. Kraft and S. Pacheco-Sanchez

20

MinPS: Sensitivity Analysis

Magnitude of class demands 3 orders of difference: CI gap ~insensitive

Class switch probability High / Rare: CI gap ~insensitive

Non-exponential service low CV: CI gap weakly sensitive high CV: CI gap ~insensitive for CV<2

Page 47: Automated Parameterization of Performance Models from ... · Automated Parameterization of Performance Models from ... # Transactions ... Joint work with S. Kraft and S. Pacheco-Sanchez

21

Case Study: SAP ERP 3-tier commercial application Modified MLPS with setup times Transactions grouped in R=2 classes

Response Time

User 1 Worker Database

SAP ERP Application Server

Workload Generator

Dispatcher

SAP ERP Database

Page 48: Automated Parameterization of Performance Models from ... · Automated Parameterization of Performance Models from ... # Transactions ... Joint work with S. Kraft and S. Pacheco-Sanchez

22

0

0.05

0.1

0.15

0.2

0.25

N 5N 7N 10N 15N

MEASSIM

SAP ERP Queueing Model

Measured vs Simulation with Estimated Demands

Population - Baseline N = [6 jobs class 1, 4 jobs class 2]

Res

pons

e T

ime

Page 49: Automated Parameterization of Performance Models from ... · Automated Parameterization of Performance Models from ... # Transactions ... Joint work with S. Kraft and S. Pacheco-Sanchez

23

Fluid MLPS (FMLPS) Limit behaviour of the CTMC for growing rates and requests increasingly deterministic V=scale factor. Request mix is unchanged. Limit behaviour can be modelled via ODEs

State occupancy measure at time t=100

Page 50: Automated Parameterization of Performance Models from ... · Automated Parameterization of Performance Models from ... # Transactions ... Joint work with S. Kraft and S. Pacheco-Sanchez

24

1

10

100

1,000

2 4 8 16 32 64

MLPSERPSFMLPS

Computational Costs

100 samples: W=128 workers

V (CPUs)

Run

ning

Tim

e [s

]

CTMC: exponential complexity

Fluid: scalable

(RPS, <1s)

Page 51: Automated Parameterization of Performance Models from ... · Automated Parameterization of Performance Models from ... # Transactions ... Joint work with S. Kraft and S. Pacheco-Sanchez

25

0

2

4

6

8

10

12

20 60 100 140 180

CIFMLPS

FMLPS: SAP ERP Results

100 samples: V=8 CPUs, W=64 workers

N (jobs)

Err

or %

(Intractable for MLPS)

Page 52: Automated Parameterization of Performance Models from ... · Automated Parameterization of Performance Models from ... # Transactions ... Joint work with S. Kraft and S. Pacheco-Sanchez

26

Queue-Length Based Estimation

Page 53: Automated Parameterization of Performance Models from ... · Automated Parameterization of Performance Models from ... # Transactions ... Joint work with S. Kraft and S. Pacheco-Sanchez

27

Monitor occupancy at all resources Observations: Ill-posed, unless think times known

Probabilistic model of distributed system

Gibbs: iteratively sample posterior

Gibbs Sampling (GQL)

prior

Page 54: Automated Parameterization of Performance Models from ... · Automated Parameterization of Performance Models from ... # Transactions ... Joint work with S. Kraft and S. Pacheco-Sanchez

28

GQL: Results

Accurate estimates, error ~3%-7%

Page 55: Automated Parameterization of Performance Models from ... · Automated Parameterization of Performance Models from ... # Transactions ... Joint work with S. Kraft and S. Pacheco-Sanchez

29

GQL: Sensitivity analysis Increasing model sizes

05

101520253035404550

4 6 8 13 20 28 89 127 186

GQL

GQLErr

or (%

)

Log of state space size

Page 56: Automated Parameterization of Performance Models from ... · Automated Parameterization of Performance Models from ... # Transactions ... Joint work with S. Kraft and S. Pacheco-Sanchez

30

GQL: Prior distribution

Dirichlet prior all estimates converge unless the exact demand has very low probability prior

Page 57: Automated Parameterization of Performance Models from ... · Automated Parameterization of Performance Models from ... # Transactions ... Joint work with S. Kraft and S. Pacheco-Sanchez

31

GQL: Results

Accurate estimates, error ~3%-7%

Page 58: Automated Parameterization of Performance Models from ... · Automated Parameterization of Performance Models from ... # Transactions ... Joint work with S. Kraft and S. Pacheco-Sanchez

32

GQL vs Other MCMC Methods

Far better convergence properties than Metropolis-Hastings and Slice sampling About 13-15% error in estimating demands against cloud ERP data (Apache OFBiz)

Page 59: Automated Parameterization of Performance Models from ... · Automated Parameterization of Performance Models from ... # Transactions ... Joint work with S. Kraft and S. Pacheco-Sanchez

33

QMLE Approximation

Based on Maximum Likelihood Estimation

Works with mean queue length

A simple approximation of the MLE:

Consider the demand vector where

More details at tomorrow’s talk!

Page 60: Automated Parameterization of Performance Models from ... · Automated Parameterization of Performance Models from ... # Transactions ... Joint work with S. Kraft and S. Pacheco-Sanchez

34

FG Tool

Page 61: Automated Parameterization of Performance Models from ... · Automated Parameterization of Performance Models from ... # Transactions ... Joint work with S. Kraft and S. Pacheco-Sanchez

35

Tool support FG - “Filling-the-Gap” Batch offline analysis, support for Condor Open Source Software MCR executables (BSD-3)

Main repo:

https://github.com/Imperial-AESOP/Filling-the-Gap

Manual available in the repo

Page 62: Automated Parameterization of Performance Models from ... · Automated Parameterization of Performance Models from ... # Transactions ... Joint work with S. Kraft and S. Pacheco-Sanchez

36

FG: Initial design Outputs Model parameters Visualization Forecasting

–Requires analysis, but not decision-making User control knobs Analysis frequency Horizon of analysis Monitoring intensity Maximum collection window

Page 63: Automated Parameterization of Performance Models from ... · Automated Parameterization of Performance Models from ... # Transactions ... Joint work with S. Kraft and S. Pacheco-Sanchez

37

FG: Parameter Estimation

Parameters for QN/LQN models

Page 64: Automated Parameterization of Performance Models from ... · Automated Parameterization of Performance Models from ... # Transactions ... Joint work with S. Kraft and S. Pacheco-Sanchez

38

FG: Architecture

Page 65: Automated Parameterization of Performance Models from ... · Automated Parameterization of Performance Models from ... # Transactions ... Joint work with S. Kraft and S. Pacheco-Sanchez

39

FG: Methods Implemented methods Complete Information (CI) Utilization-Based Regression (UBR) Utilization-Based Optimization (UBO) (M/GI/1-PS; cf. Zhang et al., Menasce) ODE-based MLPS (“fluid MLPS”, FMLPS) MINPS Queue-Based Gibbs Sampling (GQL) Extended RPS (ERPS, includes a new correction for number of cores)

Page 66: Automated Parameterization of Performance Models from ... · Automated Parameterization of Performance Models from ... # Transactions ... Joint work with S. Kraft and S. Pacheco-Sanchez

40

FG: Comparison Results

Page 67: Automated Parameterization of Performance Models from ... · Automated Parameterization of Performance Models from ... # Transactions ... Joint work with S. Kraft and S. Pacheco-Sanchez

1

Queue-Length Based Estimation

Page 68: Automated Parameterization of Performance Models from ... · Automated Parameterization of Performance Models from ... # Transactions ... Joint work with S. Kraft and S. Pacheco-Sanchez

2

Monitor occupancy at all resources Observations: Ill-posed, unless think times known

Probabilistic model of distributed system

Markov-Chain Monte-Carlo (MCMC) draw samples from target distribution averaging samples provides estimate

Gibbs Sampling (GQL)

Page 69: Automated Parameterization of Performance Models from ... · Automated Parameterization of Performance Models from ... # Transactions ... Joint work with S. Kraft and S. Pacheco-Sanchez

3

Markov-Chain Monte-Carlo (MCMC)

Gibbs Sampling (GQL)

Page 70: Automated Parameterization of Performance Models from ... · Automated Parameterization of Performance Models from ... # Transactions ... Joint work with S. Kraft and S. Pacheco-Sanchez

4

Gibbs: sample one dimension at a time

iteratively sample posterior

where

Gibbs Sampling (GQL)

prior

Page 71: Automated Parameterization of Performance Models from ... · Automated Parameterization of Performance Models from ... # Transactions ... Joint work with S. Kraft and S. Pacheco-Sanchez

5

GQL: Results

Accurate estimates, error ~3%-7%

Page 72: Automated Parameterization of Performance Models from ... · Automated Parameterization of Performance Models from ... # Transactions ... Joint work with S. Kraft and S. Pacheco-Sanchez

6

GQL: Sensitivity analysis Increasing model sizes

05

101520253035404550

4 6 8 13 20 28 89 127 186

GQL

GQLErr

or (%

)

Log of state space size

Page 73: Automated Parameterization of Performance Models from ... · Automated Parameterization of Performance Models from ... # Transactions ... Joint work with S. Kraft and S. Pacheco-Sanchez

7

GQL: Prior distribution

Dirichlet prior all estimates converge unless the exact demand has very low probability prior

Page 74: Automated Parameterization of Performance Models from ... · Automated Parameterization of Performance Models from ... # Transactions ... Joint work with S. Kraft and S. Pacheco-Sanchez

8

GQL: Results

Accurate estimates, error ~3%-7%

Page 75: Automated Parameterization of Performance Models from ... · Automated Parameterization of Performance Models from ... # Transactions ... Joint work with S. Kraft and S. Pacheco-Sanchez

9

GQL vs Other MCMC Methods

Far better convergence properties than Metropolis-Hastings and Slice sampling About 13-15% error in estimating demands against cloud ERP data (Apache OFBiz)

Page 76: Automated Parameterization of Performance Models from ... · Automated Parameterization of Performance Models from ... # Transactions ... Joint work with S. Kraft and S. Pacheco-Sanchez

10

QMLE Approximation

Based on Maximum Likelihood Estimation

Works with mean queue length

A simple approximation of the MLE:

Consider the demand vector where

Approach generalizes to load-dependent QNs

More details at tomorrow’s talk!

Page 77: Automated Parameterization of Performance Models from ... · Automated Parameterization of Performance Models from ... # Transactions ... Joint work with S. Kraft and S. Pacheco-Sanchez

11

FG Tool

Page 78: Automated Parameterization of Performance Models from ... · Automated Parameterization of Performance Models from ... # Transactions ... Joint work with S. Kraft and S. Pacheco-Sanchez

12

Tool support FG - “Filling-the-Gap” Batch offline analysis, support for Condor Open Source Software MCR executables (BSD-3)

Main repo:

https://github.com/Imperial-AESOP/Filling-the-Gap

Manual available in the repo

Page 79: Automated Parameterization of Performance Models from ... · Automated Parameterization of Performance Models from ... # Transactions ... Joint work with S. Kraft and S. Pacheco-Sanchez

13

FG: Initial design Outputs Model parameters Visualization

User control knobs Analysis frequency Horizon of analysis Algorithm selection

Page 80: Automated Parameterization of Performance Models from ... · Automated Parameterization of Performance Models from ... # Transactions ... Joint work with S. Kraft and S. Pacheco-Sanchez

14

FG: Parameter Estimation

Parameters for QN/LQN models

Page 81: Automated Parameterization of Performance Models from ... · Automated Parameterization of Performance Models from ... # Transactions ... Joint work with S. Kraft and S. Pacheco-Sanchez

15

FG: Architecture

Page 82: Automated Parameterization of Performance Models from ... · Automated Parameterization of Performance Models from ... # Transactions ... Joint work with S. Kraft and S. Pacheco-Sanchez

16

FG: Methods Implemented methods Complete Information (CI) Utilization-Based Regression (UBR) Utilization-Based Optimization (UBO) (M/GI/1-PS; cf. Zhang et al., Menasce) ODE-based MLPS (“fluid MLPS”, FMLPS) MINPS Queue-Based Gibbs Sampling (GQL) Extended RPS (ERPS, includes a new correction for number of cores)

Page 83: Automated Parameterization of Performance Models from ... · Automated Parameterization of Performance Models from ... # Transactions ... Joint work with S. Kraft and S. Pacheco-Sanchez

17

FG: Comparison Results

Page 84: Automated Parameterization of Performance Models from ... · Automated Parameterization of Performance Models from ... # Transactions ... Joint work with S. Kraft and S. Pacheco-Sanchez

1

Comparison & Case Studies

Page 85: Automated Parameterization of Performance Models from ... · Automated Parameterization of Performance Models from ... # Transactions ... Joint work with S. Kraft and S. Pacheco-Sanchez

2

Comparison

Elsevier PEVA, October 2015.

Page 86: Automated Parameterization of Performance Models from ... · Automated Parameterization of Performance Models from ... # Transactions ... Joint work with S. Kraft and S. Pacheco-Sanchez

3

Experiments Dataset D1: Queueing Simulator Simulated M/M/1 queue with FCFS scheduling Workload classes: 1, 2 and 5 Utilization levels: 10%, 50%, 90% In total: 900 traces

Dataset D2: Micro-Benchmarks Workload classes: 1, 2, and 3 Utilization levels: 20%, 50%, 80% In total: 210 traces

Page 87: Automated Parameterization of Performance Models from ... · Automated Parameterization of Performance Models from ... # Transactions ... Joint work with S. Kraft and S. Pacheco-Sanchez

4

Compared Approaches Based on Service Demand Law (Brosig et al. 2009) Utilization Regression (Rolia and Vetland 1995) Kalman Filter (Kumar et al. 2009) Opitimization 1 (Menascé 2008) Optimization 2 (Liu et al. 2006) Response time regression (Kraft et al. 2009) Gibbs Sampling (Wang et al. 20133)

Page 88: Automated Parameterization of Performance Models from ... · Automated Parameterization of Performance Models from ... # Transactions ... Joint work with S. Kraft and S. Pacheco-Sanchez

5

Sampling Interval

0

20

40

60

80

100

120

Util. Regression Kalman Filter Optim. 1 Optim. 2

Rela

tive

Erro

r (i

n %

)

1 s 5 s 10 s 30 s 60 s 120 s

Number of requests within one interval should be larger than 60.

Dataset D1

Page 89: Automated Parameterization of Performance Models from ... · Automated Parameterization of Performance Models from ... # Transactions ... Joint work with S. Kraft and S. Pacheco-Sanchez

6

Number of Samples

0

1

2

3

4

5

6

7

8

SDL Util.Regression

Kalman Filter Optim. 1 Optim. 2 Rt. Regression Gibbs

Rela

tive

Erro

r (i

n %

)

600 samples 3600 samples

Dataset D1

Number of samples has only limited impact.

Page 90: Automated Parameterization of Performance Models from ... · Automated Parameterization of Performance Models from ... # Transactions ... Joint work with S. Kraft and S. Pacheco-Sanchez

7

Number of Workload Classes

0

20

40

60

80

100

120

140

160

180

SDL Util.Regression

Kalman Filter Optim. 1 Optim. 2 Rt.Regression

Gibbs

Rela

tive

Erro

r (i

n %

)

1 class 2 classes 5 classes

Dataset D1

Number of classes has strong impact on D1.

Page 91: Automated Parameterization of Performance Models from ... · Automated Parameterization of Performance Models from ... # Transactions ... Joint work with S. Kraft and S. Pacheco-Sanchez

8

Number of workload classes

0

5

10

15

20

25

30

SDL Util.Regression

Kalman Filter Optim. 1 Optim. 2 Rt. Regression Gibbs

Rela

tive

Erro

r (i

n %

)

1 class 2 classes 3 classes

Dataset D2

Number of classes has a much smaller impact on D2.

Page 92: Automated Parameterization of Performance Models from ... · Automated Parameterization of Performance Models from ... # Transactions ... Joint work with S. Kraft and S. Pacheco-Sanchez

9

Correlation Analysis C

lass

es

SD

L

Uti

l.

Reg

ress

ion

Kal

man

Filt

er

Op

tim

. 1

Op

tim

. 2

Rt.

Reg

ress

ion

Gib

bs

2 0.91 0.35 0.88 0.89 0.88 0.52 0.9 5 0.72 0.37 0.78 0.8 0.8 0.44 0.79

Correlation with std[D]:

Significant influence of scheduling strategy.

Page 93: Automated Parameterization of Performance Models from ... · Automated Parameterization of Performance Models from ... # Transactions ... Joint work with S. Kraft and S. Pacheco-Sanchez

10

Load Level

0

10

20

30

40

50

60

70

80

90

100

SDL Util.Regression

Kalman Filter Optim. 1 Optim. 2 Rt.Regression

Gibbs

Rela

tive

Erro

r (i

n %

)

10% 50% 90%

Dataset D1

High utilization observations can impact estimation negatively.

Page 94: Automated Parameterization of Performance Models from ... · Automated Parameterization of Performance Models from ... # Transactions ... Joint work with S. Kraft and S. Pacheco-Sanchez

11

Collinearity

0

20

40

60

80

100

120

SDL Util.Regression

Kalman Filter Optim. 1 Optim. 2 Rt.Regression

Gibbs

Rela

tive

Erro

r (i

n %

)

Low High

Collinearity in throughputs impacts Utilization Regression

Page 95: Automated Parameterization of Performance Models from ... · Automated Parameterization of Performance Models from ... # Transactions ... Joint work with S. Kraft and S. Pacheco-Sanchez

12

Missing Jobs

0

10

20

30

40

50

60

70

80

90

100

SDL Util.Regression

Kalman Filter Optim. 1 Optim. 2 Rt.Regression

Gibbs

Rela

tive

Erro

r (i

n %

)

5% 10% 20% 30%

Higher influence on utilization approaches than on response time ones.

Page 96: Automated Parameterization of Performance Models from ... · Automated Parameterization of Performance Models from ... # Transactions ... Joint work with S. Kraft and S. Pacheco-Sanchez

13

Additional Wait Times

0

5

10

15

20

25

30

35

40

45

SDL Util.Regression

Kalman Filter Optim. 1 Optim. 2 Rt. Regression Gibbs

Rela

tive

Erro

r (i

n %

)

25ms 75ms 125ms

Higher influence on response time approaches than on utilization ones.

Page 97: Automated Parameterization of Performance Models from ... · Automated Parameterization of Performance Models from ... # Transactions ... Joint work with S. Kraft and S. Pacheco-Sanchez

14

Execution Time

0.10

1.00

10.00

100.00

1000.00

10000.00

100000.00

1000000.00

SDL Util.Regression

KalmanFilter

Optim. 1 Optim. 2 Rt.Regression

Gibbs

Exec

utio

n Ti

me

(in

ms)

1 class 2 classes 5 classes

Page 98: Automated Parameterization of Performance Models from ... · Automated Parameterization of Performance Models from ... # Transactions ... Joint work with S. Kraft and S. Pacheco-Sanchez

15

Case Study: SPECjEnterprise SPECjEnterprise2010 application benchmark Distributed deployment over 7 VMs „Microservice Style“

Strategies for Demand Estimation Observed end-to-end response time Observed residence time per tier Per-resource statistics

Page 99: Automated Parameterization of Performance Models from ... · Automated Parameterization of Performance Models from ... # Transactions ... Joint work with S. Kraft and S. Pacheco-Sanchez

16

Experiment Setup

Page 100: Automated Parameterization of Performance Models from ... · Automated Parameterization of Performance Models from ... # Transactions ... Joint work with S. Kraft and S. Pacheco-Sanchez

17

Results

0

10

20

30

40

50

60

70

80

90

100

Purchase Manage Browse Mfg EJB Mfg WS

Rela

tive

Erro

r (%

)

Prediction Error Response Time

Optim. 2 (System) Optim. 2 (Tier) SDL

Page 101: Automated Parameterization of Performance Models from ... · Automated Parameterization of Performance Models from ... # Transactions ... Joint work with S. Kraft and S. Pacheco-Sanchez

18

Results

0

10

20

30

40

50

60

70

80

90

100

VM 2 VM 3 VM 4 VM 5 VM 6 VM 7 VM 9

Rela

tive

Erro

r (%

)

Prediction Error Utilization

Optim. 2 (System) Optim. 2 (Tier) SDL

Page 102: Automated Parameterization of Performance Models from ... · Automated Parameterization of Performance Models from ... # Transactions ... Joint work with S. Kraft and S. Pacheco-Sanchez

19

Case Study: SAP HANA Admission control Multi-tenant application (extended TPC-W) SAP HANA cloud platform

Supports Performance isolation between tenants IEEE/ACM CCGrid 2014.

Page 103: Automated Parameterization of Performance Models from ... · Automated Parameterization of Performance Models from ... # Transactions ... Joint work with S. Kraft and S. Pacheco-Sanchez

20

General Idea

Application Server

Admission Control

Requests rr,c

Tenants

Accepted Request

Response

Resource Demand Estimation

Demand dt,r,i

Guarantee

Indices:

t = tenant

r = request type

i = resource

Moni-toring

Throughput Xt,r Response Time Rt,r

Utilization Ui

Page 104: Automated Parameterization of Performance Models from ... · Automated Parameterization of Performance Models from ... # Transactions ... Joint work with S. Kraft and S. Pacheco-Sanchez

21

Resource Isolation

Page 105: Automated Parameterization of Performance Models from ... · Automated Parameterization of Performance Models from ... # Transactions ... Joint work with S. Kraft and S. Pacheco-Sanchez

22

Performance Isolation

Page 106: Automated Parameterization of Performance Models from ... · Automated Parameterization of Performance Models from ... # Transactions ... Joint work with S. Kraft and S. Pacheco-Sanchez

23

Case Study: Zimbra Goal: Automatic vertical CPU scaling of VMs Zimbra is a collaboration server Transactional workload SLA: Mails need to be delivered within 2 minutes Mails may be queued

IEEE SASO 2014.

Page 107: Automated Parameterization of Performance Models from ... · Automated Parameterization of Performance Models from ... # Transactions ... Joint work with S. Kraft and S. Pacheco-Sanchez

24

Approach Overview

Desired resource allocation (at+1)

VM1

VM2

VMn

... Applicatio

n Controller

Model Builder

Model: p = f(λ,a)

Current resource usage (ut)

App-level SLO (pref)

App Sensor

System Sensor

Observed app performance (pt)

pt

vApp Manager

New VM resource settings (number of vCPUs, configured memory size)

vApp

Page 108: Automated Parameterization of Performance Models from ... · Automated Parameterization of Performance Models from ... # Transactions ... Joint work with S. Kraft and S. Pacheco-Sanchez

25

Layered Performance Model

Application layer

Virtual resource layer

Physical resource layer

VM1 VM2 vApp

vCPU vCPU

Physical CPU Service rate depends on physical hardware

+ Hypervisor Scheduling Delays

+ OS scheduling delays + Wait times for other

resources

Hierarchical modeling approach (Method of Layers [1]): Service time at layer 𝑖𝑖 is equal to response time of an underlying closed queueing network at layer 𝑖𝑖 − 1

Load-dependent Service Demands

Page 109: Automated Parameterization of Performance Models from ... · Automated Parameterization of Performance Models from ... # Transactions ... Joint work with S. Kraft and S. Pacheco-Sanchez

26

Influence of Layers

Zimbra MTA with linearly increasing workload:

Dem

ands

(in

seco

nds)

Estimated demands reflect contention at hypervisor and application level

Page 110: Automated Parameterization of Performance Models from ... · Automated Parameterization of Performance Models from ... # Transactions ... Joint work with S. Kraft and S. Pacheco-Sanchez

27

Reconfigurations Controller Mean

latency [s] Reconfigurations Mean

vCPUs Max vCPUs

Model-based 20.48 13 1.4 2 Trigger-based (1 min)

10.82 273 1.83 3

Trigger-based (5 min)

25.97 72 1.46 3

Static allocation 1385 0 1 1

Zimbra MTA VM:

Model-based controller needs less reconfigurations and resources

Page 111: Automated Parameterization of Performance Models from ... · Automated Parameterization of Performance Models from ... # Transactions ... Joint work with S. Kraft and S. Pacheco-Sanchez

28

Bibliography Menascé, D. A. (2008). “Computing missing service demand parameters for performance models”. In: CMG Conference Proceedings, pp. 241–248

Liu, Z., L. Wynter, C. H. Xia, and F. Zhang (2006). “Parameter inference of queueing models for IT systems using end-to-end measurements”. In: Perform. Eval. 63.1, pp. 36–60

Kumar, D., A. N. Tantawi, and L. Zhang (2009a). “Real-time performance modeling for adaptive software systems with multi-class workload”. In: Proceedings of the 17th Annual Meeting of the IEEE/ACM International Symposium on Modelling, Analysis and Simulation of Computer and Telecommunication Systems, MASCOTS, pp. 1–4

Zheng, T., C. M. Woodside, and M. Litoiu (2008). “Performance Model Estimation and Tracking Using Optimal Filters”. In: IEEE Trans. Software Eng. 34.3, pp. 391–406

Page 112: Automated Parameterization of Performance Models from ... · Automated Parameterization of Performance Models from ... # Transactions ... Joint work with S. Kraft and S. Pacheco-Sanchez

29

Bibliography Wang, W., X. Huang, X. Qin, W. Zhang, J. Wei, and H. Zhong (2012). “Application-Level CPU Consumption Estimation: Towards Performance Isolation of Multi-tenancy Web Applications”. In: Proceedings of the 2012 IEEE Fifth International Conference on Cloud Computing, CLOUD, pp. 439–446

Brosig, F., S. Kounev, and K. Krogmann (2009). “Automated extraction of palladio component models from running enterprise Java applications”. In: Proceedings of the 4th International Conference on Performance Evaluation Methodologies and Tools, VALUETOOLS, p. 10

Rolia, J. and V. Vetland (1995). “Parameter estimation for performance models of distributed application systems”. In: Proceedings of the 1995 Conference of the Centre for Advanced Studies on Collaborative Research, CASCON, p. 54

Page 113: Automated Parameterization of Performance Models from ... · Automated Parameterization of Performance Models from ... # Transactions ... Joint work with S. Kraft and S. Pacheco-Sanchez

30

Bibliography Kraft, S., S. Pacheco-Sanchez, G. Casale, and S. Dawson (2009). “Estimating service resource consumption from response time measurements”. In: Proceedings of the 4th International Conference on Performance Evaluation Methodologies and Tools, VALUETOOLS, p. 48 Wang, W. and G. Casale (2013). “Bayesian Service Demand Estimation Using Gibbs Sampling”. In: Proceedings of the 2013 IEEE 21st International Symposium on Modelling, Analysis and Simulation of Computer and Telecommunication Systems, MASCOTS, pp. 567–576 G. Casale, P. Cremonesi, R. Turrin, Robust Workload Estimation in Queueing Network Performance Models, in: 16th Euromicro Conference on Parallel, Distributed and Network-Based Processing (PDP), 2008, pp. 183-187.

Page 114: Automated Parameterization of Performance Models from ... · Automated Parameterization of Performance Models from ... # Transactions ... Joint work with S. Kraft and S. Pacheco-Sanchez

31

Bibliography P. Cremonesi, K. Dhyani, A. Sansottera, Service Time Estimation with a Refinement Enhanced Hybrid Clustering Algorithm, in: Analytical and Stochastic Modeling Techniques and Applications, Vol. 6148 of Lecture Notes in Computer Science, Springer Berlin / Heidelberg, 2010, pp. 291--305 P. Cremonesi, A. Sansottera, Indirect estimation of service demands in the presence of structural changes, Performance Evaluation 73 (0) (2014) 18--40, special Issue on the 9th International Conference on Quantitative Evaluation of Systems

Page 115: Automated Parameterization of Performance Models from ... · Automated Parameterization of Performance Models from ... # Transactions ... Joint work with S. Kraft and S. Pacheco-Sanchez

1

Arrival Process Fitting

Joint work with A. Sansottera and P. Cremonesi (DEIB, Politecnico di Milano, Italy)

Page 116: Automated Parameterization of Performance Models from ... · Automated Parameterization of Performance Models from ... # Transactions ... Joint work with S. Kraft and S. Pacheco-Sanchez

2

Outline Introduction

Moments and probabilities in Marked MAPs

Fitting of second-order acyclic Marked MAPs

Results

Conclusions

Page 117: Automated Parameterization of Performance Models from ... · Automated Parameterization of Performance Models from ... # Transactions ... Joint work with S. Kraft and S. Pacheco-Sanchez

3

Requests Traffic

Time-Varying Peaks of User Activity

High Performance

Will the system sustain the load?

Sun Mon Tue Wed Thu

request number

Inter-arrival times

[µs]

FAST RATE

SLOW RATE

SLOW RATE

FAST RATE

SLOW RATE

Page 118: Automated Parameterization of Performance Models from ... · Automated Parameterization of Performance Models from ... # Transactions ... Joint work with S. Kraft and S. Pacheco-Sanchez

4

Stochastic models Generate statistically similar request arrival patterns Analytical models accelerate search for optimal decisions

Markovian Traffic Model

request number

Inter-arrival times

[µs]

Arrival Process Modelling

Stochastic analysis

SLOW RATE

FAST RATE

FAST RATE

SLOW RATE

30%

70%

50%

50%

Request number

Inte

rarr

ival

tim

e [m

s]

Page 119: Automated Parameterization of Performance Models from ... · Automated Parameterization of Performance Models from ... # Transactions ... Joint work with S. Kraft and S. Pacheco-Sanchez

5

Stochastic models Generate statistically similar request arrival patterns Analytical models accelerate search for optimal decisions

Arrival Process Modelling

SLOW RATE

FAST RATE

FAST RATE

SLOW RATE

Markovian Traffic Model Automated fitting methods 30%

70%

50%

50%

Models evaluated: ~350 Initial Guess

Page 120: Automated Parameterization of Performance Models from ... · Automated Parameterization of Performance Models from ... # Transactions ... Joint work with S. Kraft and S. Pacheco-Sanchez

6

Network of queues mathematical abstraction for prediction, what-if scenarios, … describes billions of possible states for the resources efficient output analysis techniques [Smirni, QEST’09]

Incoming Requests

Traffic Decomposition for QN

FAST RATE

SLOW RATE

Completed Requests

Web Server

Storage

Database

CPU

Disks

CPUs

Cloud Resources

Page 121: Automated Parameterization of Performance Models from ... · Automated Parameterization of Performance Models from ... # Transactions ... Joint work with S. Kraft and S. Pacheco-Sanchez

7

Network of queues mathematical abstraction for prediction, what-if scenarios, … describes billions of possible states for the resources developed efficient analysis techniques

Output Flow

Traffic Decomposition for QN

Completed Requests

Storage

Database

Disks

CPUs

FAST RATE

SLOW RATE

SLOW RATE

Cloud Resources

Page 122: Automated Parameterization of Performance Models from ... · Automated Parameterization of Performance Models from ... # Transactions ... Joint work with S. Kraft and S. Pacheco-Sanchez

8

Network of queues mathematical abstraction for prediction, what-if scenarios, … describes billions of possible states for the resources developed efficient analysis techniques

Output Flow

Traffic Decomposition for QN

Completed Requests

Storage

Database

Disks

CPUs

FAST RATE

SLOW RATE

SLOW RATE

Cloud Resources

Page 123: Automated Parameterization of Performance Models from ... · Automated Parameterization of Performance Models from ... # Transactions ... Joint work with S. Kraft and S. Pacheco-Sanchez

9

Network of queues mathematical abstraction for prediction, what-if scenarios, … describes billions of possible states for the resources developed efficient analysis techniques

Output Flow

Traffic Decomposition for QN

Completed Requests

SLOW RATE

FAST RATE

SLOW RATE

FAST RATE

Cloud Resources

Page 124: Automated Parameterization of Performance Models from ... · Automated Parameterization of Performance Models from ... # Transactions ... Joint work with S. Kraft and S. Pacheco-Sanchez

10

Non-Poisson Arrivals Microsoft Live Maps Back End Trace

Disk read/write inter-issue time Poisson Phase-type Renewal Processes

Page 125: Automated Parameterization of Performance Models from ... · Automated Parameterization of Performance Models from ... # Transactions ... Joint work with S. Kraft and S. Pacheco-Sanchez

11

PH-type Distribution N transient states

Exit vector

No-mass at 0 assumption

CTMC

Representation PH(D0, α)

1 absorbing state

Phase-type Distribution: distribution of the time to absorption

Page 126: Automated Parameterization of Performance Models from ... · Automated Parameterization of Performance Models from ... # Transactions ... Joint work with S. Kraft and S. Pacheco-Sanchez

12

PH-type Renewal Process Renewal Process with Phase-type distribution Inter-arrival times: i.i.d. with PH(D0, α) distribution

Counting process N(t) is a CTMC

Blocks of N states

Block k: N(t) = k

After absorption, go to block k+1 Initial state in block k+1: probability α Rate of exit from state i and restart from state j = si αj

⋅⋅

=

αα

α

sDsD

sD

Q0

0

0

000000

Transitions in block k

N(t) = 2

N(t) = 3

Event: exit from block k, restart from block k+1

N(t) = 1

Page 127: Automated Parameterization of Performance Models from ... · Automated Parameterization of Performance Models from ... # Transactions ... Joint work with S. Kraft and S. Pacheco-Sanchez

13

Some Tools for PH Fitting

EMpht (1996) http://home.imf.au.dk/asmus/pspapers.html EM algorithm for ML fitting, based on Runge-Kutta methods Local optimization technique

jPhase (2006) http://copa.uniandes.edu.co/software/jmarkov/index.html

Java library ML and canonical form fitting algorithms

Page 128: Automated Parameterization of Performance Models from ... · Automated Parameterization of Performance Models from ... # Transactions ... Joint work with S. Kraft and S. Pacheco-Sanchez

14

Some Tools for PH Fitting

PhFit (2002) http://webspn.hit.bme.hu/∼telek/tools.htm

Separate fit of distribution body and tail Both continuous and discrete ML distributions

G-FIT (2007) http://ls4-www.cs.uni-dortmund.de/home/thummler/gfit.tgz

Hyper-Erlang PHs used as building block Automatic aggregation of large traces, dramatic speed-up of computational times compared to EMpht

Page 129: Automated Parameterization of Performance Models from ... · Automated Parameterization of Performance Models from ... # Transactions ... Joint work with S. Kraft and S. Pacheco-Sanchez

15

Correlated Arrivals Microsoft Live Maps Back End Trace

Disk read/write inter-issue time Phase-type Renewal Processes Markovian Arrival Processes

Page 130: Automated Parameterization of Performance Models from ... · Automated Parameterization of Performance Models from ... # Transactions ... Joint work with S. Kraft and S. Pacheco-Sanchez

16

Markovian Arrival Process Phase-type Renewal Process Rate of exit from state i and restart from state j = si αj

Markovian Arrival Process (MAP) Rate of exit from state i and restart from state j = sij Generalization of PH-Renewal: allows to model correlation

=

10

10

10

000000

DDDD

DD

Q

−−

=

nnn rr

rrrr

D

λ

λλ

21

23221

13121

0

=

nnnn sss

ssssss

D

21

232221

131211

1Representation: MAP(D0,D1)

Interval-stationary initialization

Page 131: Automated Parameterization of Performance Models from ... · Automated Parameterization of Performance Models from ... # Transactions ... Joint work with S. Kraft and S. Pacheco-Sanchez

17

Tools for MAP fitting

KPC-Toolbox (2008) http://www.cs.wm.edu/MAPQN/kpctoolbox.html Moment-matching method Composition of large MAPs by two-state MAPs Property of KPC Process (similar relations for higher-order moments, ACF, …)

KPC Process

!/][][][ kXEXEXE kb

ka

k =

Page 132: Automated Parameterization of Performance Models from ... · Automated Parameterization of Performance Models from ... # Transactions ... Joint work with S. Kraft and S. Pacheco-Sanchez

18

Motivation and Goals Marked Markovian Arrival Processes (MMAPs) Generalization of MAPs to model multi-class arrivals Allow to model non-Poisson cross-correlated arrivals Allow efficient solution of the models with matrix-

analytic methods

Modeling the arrival process at a queuing system (MMAP[K]/PH[k]/1-FCFS queue) FCFS queues can be analyzed analytically using age process Q-MAM: https://bitbucket.org/qmam/qmam/src BU-Tools: http://webspn.hit.bme.hu/~telek/tools/butools/

Page 133: Automated Parameterization of Performance Models from ... · Automated Parameterization of Performance Models from ... # Transactions ... Joint work with S. Kraft and S. Pacheco-Sanchez

19

Multi-class Arrivals Microsoft Live Maps Back End Trace

Disk read/write inter-issue time Markovian Arrival Processes Marked Markovian Arrival Processes

Page 134: Automated Parameterization of Performance Models from ... · Automated Parameterization of Performance Models from ... # Transactions ... Joint work with S. Kraft and S. Pacheco-Sanchez

20

Marked MAPs

(D0,D1) is a representation of the MAP underlying the MMAP

(D0,D11,D12) is a representation of a MMAP[2] process (2 classes)

Page 135: Automated Parameterization of Performance Models from ... · Automated Parameterization of Performance Models from ... # Transactions ... Joint work with S. Kraft and S. Pacheco-Sanchez

21

Fitting Fitting problem Marked trace from a real system: (Xi, Ci) MMAP Queues with arrivals that follow MMAP can be solved

analytically

Two families of methods Maximum-likelihood Matching moments (or other characteristics)

We focus on moment matching More computationally efficient In real systems, easier to save moments than the whole

trace

Page 136: Automated Parameterization of Performance Models from ... · Automated Parameterization of Performance Models from ... # Transactions ... Joint work with S. Kraft and S. Pacheco-Sanchez

22

Issues of moment matching Representation of MMAPs is not minimal Number of parameters >> Degrees of freedom

Hard to obtain analytical fitting formulas for the parameters Easy: Parameters -> Moments Hard: Moments -> Parameters Requires solving a non-linear system of equations in the

general case Non-linear least squares for MMAP fitting [Buchholz, 2010]

Page 137: Automated Parameterization of Performance Models from ... · Automated Parameterization of Performance Models from ... # Transactions ... Joint work with S. Kraft and S. Pacheco-Sanchez

23

Issues of moment matching Feasibility: given a number of states n for the

MMAP, which values of the moments can be fitted exactly? Related issue: how to perform approximate fitting?

Which characteristics best capture the queueing behavior? Caveat 1: not all characteristics have known analytical

formulas Caveat 2: inverting the analytical formulas might be harder

for some characteristics

Page 138: Automated Parameterization of Performance Models from ... · Automated Parameterization of Performance Models from ... # Transactions ... Joint work with S. Kraft and S. Pacheco-Sanchez

24

Outline Introduction

Moments and probabilities in Marked MAPs

Fitting of second-order acyclic Marked MAPs

Results

Page 139: Automated Parameterization of Performance Models from ... · Automated Parameterization of Performance Models from ... # Transactions ... Joint work with S. Kraft and S. Pacheco-Sanchez

25

Definitions Ordinary moment of order j:

Backward moment of order j for class c (green):

Forward moment of order j for class c (green):

Cross moments of order j for class c followed by class k:

Probability of a class c arrival:

“Transition” probability of a class c arrival followed by class k

Page 140: Automated Parameterization of Performance Models from ... · Automated Parameterization of Performance Models from ... # Transactions ... Joint work with S. Kraft and S. Pacheco-Sanchez

26

Moment Dependencies Ordinary moments can be expressed as linear

combination of the forward moments, weighted by the class probabilities the backward moments, weighted by the class probabilities the cross moments, weighted by the class-transition probabilities

For 2 classes and j = 1 Linear system for M1ck 4 unknowns, rank 3

A cross-moment might be needed to uniquely determine a second-order MMAP[2]

Page 141: Automated Parameterization of Performance Models from ... · Automated Parameterization of Performance Models from ... # Transactions ... Joint work with S. Kraft and S. Pacheco-Sanchez

27

Outline Introduction

Moments and probabilities in Marked MAPs

Fitting of second-order acyclic Marked MAPs

Results

Page 142: Automated Parameterization of Performance Models from ... · Automated Parameterization of Performance Models from ... # Transactions ... Joint work with S. Kraft and S. Pacheco-Sanchez

28

AMMAP[2] Fitting

7 degrees of freedom

4 for the underlying

AMAP

3 for the marginal

Phase-type

1 for the auto-

correlation decay

3 for multi-class

characteristics

D0 D1

Page 143: Automated Parameterization of Performance Models from ... · Automated Parameterization of Performance Models from ... # Transactions ... Joint work with S. Kraft and S. Pacheco-Sanchez

29

AMMAP[2] Fitting Any MAP(2) has geometric autocorrelation decay with rate γ • Canonical form for the underlying MAP(2) [Bodrog et al., 2010] Acyclic: two forms for γ > 0 and γ < 0 For γ = 0, acyclic phase-type renewal

γ > 0 γ < 0

Page 144: Automated Parameterization of Performance Models from ... · Automated Parameterization of Performance Models from ... # Transactions ... Joint work with S. Kraft and S. Pacheco-Sanchez

30

AMMAP[2] Fitting Any MAP(2) has geometric autocorrelation decay with rate γ • Canonical form for the underlying MAP(2) [Bodrog et al., 2010] Acyclic: two forms for γ > 0 and γ < 0 For γ = 0, acyclic phase-type renewal

γ > 0 γ < 0

Page 145: Automated Parameterization of Performance Models from ... · Automated Parameterization of Performance Models from ... # Transactions ... Joint work with S. Kraft and S. Pacheco-Sanchez

31

AMMAP[2] Fitting Any MAP(2) has geometric autocorrelation decay with rate γ • Canonical form for the underlying MAP(2) [Bodrog et al., 2010] Acyclic: two forms for γ > 0 and γ < 0 For γ = 0, acyclic phase-type renewal

γ > 0 γ < 0

3 degrees of freedom 3 degrees of freedom

Page 146: Automated Parameterization of Performance Models from ... · Automated Parameterization of Performance Models from ... # Transactions ... Joint work with S. Kraft and S. Pacheco-Sanchez

32

AMMAP[2] Fitting How to spend the 3 available degrees of

freedom?

We have found closed, analytical formulas for the three parameters q11, q21, q22, for both canonical forms

Three different sets of characteristics considered Class probabilities and… 1) Forward moments and backward moments 2) Forward moments and class transition probabilities 3) Backward moments and class transition probabilities

Page 147: Automated Parameterization of Performance Models from ... · Automated Parameterization of Performance Models from ... # Transactions ... Joint work with S. Kraft and S. Pacheco-Sanchez

33

AMMAP[m] Fitting How to handle more than 2 classes?

p1 = 0.29 F11 = 0.08 B11 = 0.08

p2 = 0.43 F12 = 0.13 B12 = 0.12

p3 = 0.29 F13 = 0.08 B13 = 0.09

Page 148: Automated Parameterization of Performance Models from ... · Automated Parameterization of Performance Models from ... # Transactions ... Joint work with S. Kraft and S. Pacheco-Sanchez

34

M3A Toolbox

Latest version:

https://github.com/Imperial-AESOP/M3A A set of Matlab functions designed for computing the statistical descriptors of MMAPs and fitting marked traces with MMAPs Syntax compatibility with KPC-Toolbox

– M3A’s MMAPs are treated by KPC-Toolbox as MAPs

Page 149: Automated Parameterization of Performance Models from ... · Automated Parameterization of Performance Models from ... # Transactions ... Joint work with S. Kraft and S. Pacheco-Sanchez

35

Outline Introduction

Moments and probabilities in Marked MAPs

Fitting of second-order acyclic Marked MAPs

Results

Page 150: Automated Parameterization of Performance Models from ... · Automated Parameterization of Performance Models from ... # Transactions ... Joint work with S. Kraft and S. Pacheco-Sanchez

36

Real-World Traces

Microsoft Live Maps Back End Trace – Disk read/write inter-issue time

Simulation of */M/1 Queue

Page 151: Automated Parameterization of Performance Models from ... · Automated Parameterization of Performance Models from ... # Transactions ... Joint work with S. Kraft and S. Pacheco-Sanchez

37

Bibliography Menascé, D. A. (2008). “Computing missing service demand parameters for performance models”. In: CMG Conference Proceedings, pp. 241–248

Liu, Z., L. Wynter, C. H. Xia, and F. Zhang (2006). “Parameter inference of queueing models for IT systems using end-to-end measurements”. In: Perform. Eval. 63.1, pp. 36–60

Kumar, D., A. N. Tantawi, and L. Zhang (2009a). “Real-time performance modeling for adaptive software systems with multi-class workload”. In: Proceedings of the 17th Annual Meeting of the IEEE/ACM International Symposium on Modelling, Analysis and Simulation of Computer and Telecommunication Systems, MASCOTS, pp. 1–4

Zheng, T., C. M. Woodside, and M. Litoiu (2008). “Performance Model Estimation and Tracking Using Optimal Filters”. In: IEEE Trans. Software Eng. 34.3, pp. 391–406

Page 152: Automated Parameterization of Performance Models from ... · Automated Parameterization of Performance Models from ... # Transactions ... Joint work with S. Kraft and S. Pacheco-Sanchez

38

Bibliography Wang, W., X. Huang, X. Qin, W. Zhang, J. Wei, and H. Zhong (2012). “Application-Level CPU Consumption Estimation: Towards Performance Isolation of Multi-tenancy Web Applications”. In: Proceedings of the 2012 IEEE Fifth International Conference on Cloud Computing, CLOUD, pp. 439–446

Brosig, F., S. Kounev, and K. Krogmann (2009). “Automated extraction of palladio component models from running enterprise Java applications”. In: Proceedings of the 4th International Conference on Performance Evaluation Methodologies and Tools, VALUETOOLS, p. 10

Rolia, J. and V. Vetland (1995). “Parameter estimation for performance models of distributed application systems”. In: Proceedings of the 1995 Conference of the Centre for Advanced Studies on Collaborative Research, CASCON, p. 54

Page 153: Automated Parameterization of Performance Models from ... · Automated Parameterization of Performance Models from ... # Transactions ... Joint work with S. Kraft and S. Pacheco-Sanchez

39

Bibliography Kraft, S., S. Pacheco-Sanchez, G. Casale, and S. Dawson (2009). “Estimating service resource consumption from response time measurements”. In: Proceedings of the 4th International Conference on Performance Evaluation Methodologies and Tools, VALUETOOLS, p. 48 Wang, W. and G. Casale (2013). “Bayesian Service Demand Estimation Using Gibbs Sampling”. In: Proceedings of the 2013 IEEE 21st International Symposium on Modelling, Analysis and Simulation of Computer and Telecommunication Systems, MASCOTS, pp. 567–576 G. Casale, P. Cremonesi, R. Turrin, Robust Workload Estimation in Queueing Network Performance Models, in: 16th Euromicro Conference on Parallel, Distributed and Network-Based Processing (PDP), 2008, pp. 183-187.

Page 154: Automated Parameterization of Performance Models from ... · Automated Parameterization of Performance Models from ... # Transactions ... Joint work with S. Kraft and S. Pacheco-Sanchez

40

Bibliography P. Cremonesi, K. Dhyani, A. Sansottera, Service Time Estimation with a Refinement Enhanced Hybrid Clustering Algorithm, in: Analytical and Stochastic Modeling Techniques and Applications, Vol. 6148 of Lecture Notes in Computer Science, Springer Berlin / Heidelberg, 2010, pp. 291--305

P. Cremonesi, A. Sansottera, Indirect estimation of service demands in the presence of structural changes, Performance Evaluation 73 (0) (2014) 18--40, special Issue on the 9th International Conference on Quantitative Evaluation of Systems

Giuliano Casale, Evgenia Smirni: KPC-toolbox: fitting Markovian arrival processes and phase-type distributions with MATLAB. SIGMETRICS Performance Evaluation Review 39(4): 47 (2012)

Andrea Sansottera, Giuliano Casale, Paolo Cremonesi: Fitting second-order acyclic Marked Markovian Arrival Processes. DSN 2013: 1-12 Work supported by the EU projects DICE (644869) and MODAClouds (318484) and the EPSRC project OptiMAM (EP/M009211/1).