Performance Analysis of Computer Systems and Networks Varsha Apte Department of Computer Science and...

Performance Analysis of Computer Systems and Networks

Varsha Apte

Department of Computer Science and Engineering, IIT Bombay

July 10, 2003

Wipro Technologies, Bangalore

July 21, 2003 © 2004 by Varsha Apte, IIT Bombay

Agreement

This material can be read or used in any manner only by Wipro employees.

The slides in this presentation cannot be used, with or without modification, as part of other documents, without prior written permission from the author.


Outline: Part I: Theory and Applications (3 hours)

Warm-up: Introduction to Performance Analysis (20 mts)

Motivation (Why?)

Metrics (What?)

Methods (How?)

Modeling resources under contention: Queueing Systems (40 mts)

Commonly made assumptions

Commonly used models

Useful results


Outline: Part I: Theory and Applications (3 hours)

Application of Queuing models to Networks (20 mts)

Application of Queuing models to Software (20 mts)

Application of queuing models to Services: Pre-requisite: Queuing Networks (40 mts)

Services Models (40 mts)


Outline: Part II: Examples and Practice (2 hours)

N/W models (40 mts)

(TDMA, FDMA, etc)

Random access models

WAN? HTTP?

Services Performance Testing (30 mts)

Layered modeling example, practice (50 mts)


Performance

per·for·mance n.

1. The act of performing or the state of being performed.

2. The act or style of performing a work or role before an audience.

3. The way in which someone or something functions: The pilot rated the airplane's performance in high winds.

4. A presentation, especially a theatrical one, before an audience.

5. Something performed; an accomplishment.

6. Linguistics. One's actual use of language in actual situations.


…What is Performance?

How well a system performs its function.

E.g. performance of a car: kilometers/liter, 0-60 kph in X seconds, etc.

We assume that the system is “functioning”.

The term is often loosely used to describe failure characteristics

This is actually reliability (dependability)

E.g. how often does the car break down in a year?


Example: On-line Service

Client

Server

•What questions about performance can we ask?

•Why should we ask them?

•How can we answer them?


…What

Response time

Blocking

Queue length

Throughput

Utilization

Packet Delay, Message Delay

Loss Rate

Queue Length

“Goodput”

Utilization

Delay Jitter


...Why

Sizing (Hardware, network)

Setting configuration parameters

Choosing architectural alternatives

Determining bottlenecks

Verifying/Challenging intuition

Verify that QoS guarantees will be met

…


HOW?


Example: Estimating end-to-end delay

Client

Server

Measure it!

At the client

At the server

Simulate it – Write (or use) a computer program that simulates the behavior of the system and collects statistics

Analyze it “with pen and paper”

Let's try!

Assume Web service


Dissecting the delay

Client Processing (prepare request)

Connection Setup

Sending the request

Server processing the request

Sending the response

Client processing (display response)


...Dissecting delays

Connection Set-up (assume TCP):

SYN, SYN-ACK

1 round-trip time (RTT) before request can be sent

Sending the request:

½ RTT for request to reach server

At the server:

Queuing delay for server thread

Processing delay (once request gets server thread)

Thread will also be in CPU job “queue” or disk queue


...Dissecting delays

Sending the response back

Cannot use RTT (=round trip time for small packet)

Assume for now response is one packet (large). Delay components:

Queuing delay

Packet processing delay (at each node)

Packet transmission delay (at each link)

Link propagation delay


Delay: observations

Many delays are fixed:

Propagation delay

Packet processing delay

For a given packet size, transmission delay

Some are variable:

Notably, queuing delay
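As a rough illustration of the fixed components, the sketch below adds up transmission, propagation and processing delay for one packet crossing a path of links. The function names and all numbers (1500-byte packet, two 10 Mb/s links of 100 km each, signal speed of 2e8 m/s) are illustrative assumptions, not values from the slides. Queuing delay is simply passed in as a number, since estimating it is what the queuing models are for.

```python
def transmission_delay(packet_bits, bandwidth_bps):
    """Time to push all bits of the packet onto the link."""
    return packet_bits / bandwidth_bps


def propagation_delay(distance_m, signal_speed_mps=2e8):
    """Time for a bit to travel the length of the link (assumed signal speed)."""
    return distance_m / signal_speed_mps


def path_delay(packet_bits, links, processing_s=0.0, queuing_s=0.0):
    """Sum per-hop delays; links is a list of (bandwidth_bps, distance_m)."""
    total = 0.0
    for bandwidth_bps, distance_m in links:
        total += processing_s + queuing_s            # per-node delays
        total += transmission_delay(packet_bits, bandwidth_bps)
        total += propagation_delay(distance_m)
    return total


# Hypothetical path: two identical hops.
links = [(10e6, 100e3), (10e6, 100e3)]
fixed_part = path_delay(1500 * 8, links)
with_queuing = path_delay(1500 * 8, links, queuing_s=0.002)
print(f"fixed: {fixed_part * 1e3:.2f} ms, with queuing: {with_queuing * 1e3:.2f} ms")
```

The fixed part depends only on packet size and link parameters; only the `queuing_s` term changes with load.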


Key Concept

Fundamental concept: Contention for a resource leads to users of the resource spending time queuing, or in some way waiting for the resource to be given to them. Calculating this time is what requires sophisticated models, because it changes with random changes in the system (e.g. traffic volumes, failures) and because it depends on various system mechanisms.

Queuing Systems

An Introduction to Elementary Queuing Theory


What/Why is a Queue?

The systems whose performance we study are those that have some contention for resources

If there is no contention, performance analysis is in most cases irrelevant

When multiple “users/jobs/customers/ tasks” require the same resource, use of the resource has to be regulated by some discipline


…What/Why is a Queue?

Additionally, when a customer finds a resource busy, the customer may

Wait in a “queue” (if there is a waiting room)

Or go away (if there is no waiting room, or if the waiting room is full)

Hence the word “queue” or “queuing system”

Can represent any resource in front of which a queue can form

In some cases an actual queue may not form, but it is called a “queue” anyway.


Examples of Queuing Systems

CPU (customers: processes/threads)

Disk (customers: processes/threads)

Network link (customers: packets)

IP router (customers: packets)

ATM switch (customers: ATM cells)

Web server threads (customers: HTTP requests)

Telephone lines (customers: telephone calls)


Elements of a Queue

Server

Waiting Room/Buffer/Queue

Queueing Discipline

Customer Inter-arrival time

Service time


Elements of a Queue

Number of Servers

Size of waiting room/buffer

Service time distribution

Nature of arrival “process”:

Inter-arrival time distribution

Correlated arrivals, etc.

Number of “users” issuing jobs (population)

Queueing discipline: FCFS, priority, LCFS, processor sharing (round-robin)


Elements of a Queue

Number of Servers: 1,2,3….

Size of buffer: 0,1,2,3,…

Service time distribution & Inter-arrival time distribution

Deterministic (constant)

Exponential

General (any)

Population: 1,2,3,…


Queueing Systems Notation

X/Y/Z/A/B/C

X: Inter-arrival time distribution

Distributions denoted by D (Deterministic), M (Exponential) or G (General)

Y: Service time distribution

Z: Number of Servers

A: Buffer size

B: Population size

C: Discipline

E.g.: M/G/4/50/2000/LCFS


Queue Performance Measures

Queue Length: Number of jobs in the system (or in the queue)

Waiting time (average, distribution): Time spent in queue before service

Response time: Waiting time + service time

Utilization: Fraction of time server is busy, or probability that server is busy

Throughput: Request completion rate


Queue Performance Measures

Let observation time be T

A = number of arrivals during time T

C = number of completions during time T

B = Total time system was busy during time T

Then:

Arrival Rate = λ = A/T

Throughput = C/T

Utilization = ρ = B/T

Average service time = τ = B/C
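A minimal sketch of these definitions in Python, assuming the four raw measurements T, A, C and B are available from a log or monitor. The function name and the sample numbers are made up for illustration.

```python
def operational_measures(T, A, C, B):
    """Derive the basic measures from raw counts over an observation period.

    T: observation time (s), A: arrivals, C: completions, B: busy time (s).
    """
    return {
        "arrival_rate": A / T,        # A/T
        "throughput": C / T,          # C/T
        "utilization": B / T,         # B/T
        "mean_service_time": B / C,   # B/C
    }


# Hypothetical one-minute measurement of a single server.
m = operational_measures(T=60.0, A=180, C=180, B=45.0)
print(m)
```

Multiplying the mean service time by the throughput gives back the utilization, since (B/C) x (C/T) = B/T.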


Basic Relationships

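In “steady-state” for a system without loss (i.e. infinite buffer system)

Completion rate = Arrival rate (since “in-flow = out-flow”)

If arrival rate > service rate, then Utilization = 1 and no steady state exists (the queue grows without bound)

In general, Utilization = B/T = (B/C) x (C/T) = Average Service Time x Completion Rate = τΛ, where τ is the average service time and Λ = C/T is the throughput. This equals λτ (λ = arrival rate) for a loss-less system.

For loss-full systems, if B here denotes the fraction of requests lost (not the busy time),

ρ = λ(1 - B)τ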


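Little’s Law: N = ΛR

Average number of customers in a queuing system = Throughput x Average Response Time, i.e. N = ΛR, with Λ the throughput and R the average response time

Applicable to any “closed boundary” that contains queuing systems

For loss-less systems: N = λR (throughput equals the arrival rate λ)

Also, if L is the number in queue (not in service), and W is waiting time:

L = ΛW (L = λW for loss-less systems)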


Simple Example (Server)

Assume just one server (single thread)

Requests come in @ 3 requests/second

Request processing time = 250 ms.

Utilization of server? 3 x 0.25 = 75%

Throughput of the server? 3 reqs/second

What if requests come in @ 5 reqs/second?

Utilization = 100%, Throughput = 4 reqs/second (the server can complete at most 1/0.25 = 4 requests per second)

Waiting time (for 3 reqs/second)? L/3, where L is queue length. But what is L?
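The arithmetic of this example can be sketched as follows. The helper name is hypothetical. The only modeling assumption is that a saturated single server completes work at its capacity of 1/(service time), which for a 250 ms service time is 4 requests per second.

```python
def single_server(arrival_rate, service_time):
    """Return (utilization, throughput) for one server with an unbounded queue."""
    capacity = 1.0 / service_time                 # max completions per second
    throughput = min(arrival_rate, capacity)      # cannot exceed capacity
    utilization = throughput * service_time       # busy fraction = throughput x service time
    return utilization, throughput


print(single_server(3.0, 0.25))   # light load: throughput follows arrivals
print(single_server(5.0, 0.25))   # overload: server is saturated
```

Neither number says anything about waiting time; that needs the queue length, which is where a queuing model comes in.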


Classic Single Server Queue: M/M/1

Exponential service time

Exponential inter-arrival time

This is the “Poisson” arrival process.

Single Server

Infinite buffer (waiting room)

FCFS discipline

Can be solved very easily, using theory of Markov chains


Exponential Distribution

Memory-less distribution

Distribution of remaining time does not depend on elapsed time

Mathematically convenient

Realistic in many situations (e.g. inter-arrival times of calls)

X is EXP(λ): P[X < t] = 1 - e^(-λt)

Average value of X = 1/λ

[Figure: CDF and PDF of the exponential distribution, plotted for t from 0 to 10]
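The memory-less property can be checked numerically from the CDF alone. The sketch below is an illustration with arbitrarily chosen values (rate 0.5, elapsed time 2, further time 1.5); the function names are not from the slides.

```python
import math


def exp_cdf(t, rate):
    """P[X < t] for X exponentially distributed with the given rate."""
    return 1.0 - math.exp(-rate * t)


def exp_survival(t, rate):
    """P[X > t], i.e. the activity is still going on after time t."""
    return 1.0 - exp_cdf(t, rate)


def conditional_survival(elapsed, further, rate):
    """P[X > elapsed + further | X > elapsed]."""
    return exp_survival(elapsed + further, rate) / exp_survival(elapsed, rate)


rate = 0.5
print(conditional_survival(2.0, 1.5, rate), exp_survival(1.5, rate))
```

The two printed probabilities agree: having already waited 2 time units does not change the distribution of the remaining time.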


Exponential <-> Poisson

When distribution of inter-arrival time is Exponential, the “arrival process” is a “Poisson” process.

Properties of Poisson process with parameter λ:

If N_t = number of arrivals in (0,t], then P[N_t = k] = (λt)^k e^(-λt)/k!

Superposition of Poisson processes is a Poisson process

Splitting of Poisson process results in Poisson processes
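A small sketch of the count distribution and a numerical check of the superposition property. The rates 2 and 3 and the interval length 1.5 are arbitrary illustrative values. The check convolves the count distributions of two independent streams and compares the result with a single Poisson distribution at the summed rate.

```python
import math


def poisson_pmf(k, rate, t):
    """P[N_t = k]: probability of exactly k arrivals in an interval of length t."""
    m = rate * t
    return m ** k * math.exp(-m) / math.factorial(k)


def merged_pmf(k, rate1, rate2, t):
    """Distribution of the total count of two independent streams (convolution)."""
    return sum(poisson_pmf(i, rate1, t) * poisson_pmf(k - i, rate2, t)
               for i in range(k + 1))


for k in range(4):
    print(k, merged_pmf(k, 2.0, 3.0, 1.5), poisson_pmf(k, 5.0, 1.5))
```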


Important Result!

M/M/1 queue results

Let λ be arrival rate and τ be service time, and μ = 1/τ be service rate

Utilization ρ = λτ = λ/μ

Mean number of jobs in the system N = ρ/(1 - ρ)

Throughput = λ

Average response time: R = N/λ = τ/(1 - ρ)


Response Time Graph

Graph illustrates typical behavior of response time curve

[Figure: response time (0 to 20) plotted against utilization rho (0 to 1), for tau = 0.5]
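The shape of this curve can be tabulated from the M/M/1 results: utilization = arrival rate x service time, mean number in system = rho/(1 - rho), and mean response time = service time/(1 - rho). The sketch below uses the service time of 0.5 from the graph; the function name and the chosen utilization points are illustrative.

```python
def mm1(arrival_rate, service_time):
    """Return (utilization, mean number in system, mean response time) for M/M/1."""
    rho = arrival_rate * service_time
    if rho >= 1.0:
        raise ValueError("unstable: utilization must be below 1")
    mean_jobs = rho / (1.0 - rho)
    mean_response = service_time / (1.0 - rho)
    return rho, mean_jobs, mean_response


TAU = 0.5  # service time used on the graph
for rho in (0.1, 0.5, 0.8, 0.9, 0.95, 0.99):
    _, _, response = mm1(rho / TAU, TAU)
    print(f"rho={rho:.2f}  R={response:.2f}")
```

Response time stays close to the service time at low load and grows without bound as utilization approaches 1, which is the typical "hockey stick" behavior.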


M/M/1 queue results

For M/M/1 queue, a formula for distribution of response time is also derived.

M/M/1 response time is exponentially distributed, with parameter μ(1 - ρ), i.e. EXP(μ(1 - ρ)), with average 1/(μ(1 - ρ)) = τ/(1 - ρ) as shown earlier (μ = service rate, τ = 1/μ = service time, ρ = utilization)


M/G/1 single server queue

General service time distribution

Mean number in system:

N = ρ + ρ²(1 + C²)/(2(1 - ρ))

where C² is the squared coefficient of variation of the service time distribution (coefficient of variation = standard deviation/mean)

Called the Pollaczek-Khinchin (P-K) mean value formula.

Mean response time by Little’s law: R = N/λ


M/G/1 delay

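Mean response time by Little’s law: R = N/λ = τ + ρτ(1 + C²)/(2(1 - ρ)), where λ = arrival rate, τ = mean service time, ρ = λτ and C² = squared coefficient of variation of service time

For constant service time (C² = 0):

Mean response time = τ + ρτ/(2(1 - ρ))

Mean waiting time = ρτ/(2(1 - ρ))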
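A short sketch of the P-K mean value formula, N = rho + rho^2 (1 + C^2)/(2(1 - rho)), with response and waiting time obtained from Little's law. The function name and the sample load are illustrative. Setting the squared coefficient of variation to 1 should reproduce M/M/1, and setting it to 0 gives the constant-service-time (M/D/1) case.

```python
def mg1(arrival_rate, service_time, scv):
    """M/G/1 means via the P-K formula.

    scv: squared coefficient of variation of service time (variance / mean^2).
    Returns (mean number in system, mean response time, mean waiting time).
    """
    rho = arrival_rate * service_time
    if rho >= 1.0:
        raise ValueError("unstable: utilization must be below 1")
    mean_jobs = rho + rho ** 2 * (1.0 + scv) / (2.0 * (1.0 - rho))
    mean_response = mean_jobs / arrival_rate       # Little's law
    mean_wait = mean_response - service_time
    return mean_jobs, mean_response, mean_wait


for label, scv in (("deterministic", 0.0), ("exponential", 1.0), ("high variance", 9.0)):
    print(label, mg1(0.5, 1.0, scv))
```

At the same utilization, queue length and delay grow linearly with the variability term (1 + C^2).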


Queue Length by P-K formula

Squared coefficient of variation for:

•Det: 0

•Uniform(10-50): 0.148

•Erlang-2: 0.5

•Exp: 1

•Gen: 3

[Figure: queue length (0 to 40) vs. load (0 to 1) from the P-K formula, with one curve each for Det, Erlang-2, Unif(10,50), Exp and General service times]


Multiple server queue: M/M/c

One queue, c servers

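Utilization ρ = λτ/c (λ = arrival rate, τ = mean service time)

What is a = λτ? The average number of busy servers.

Queue length equation exists (not shown here)

For c = 2, queue length (mean number in system) is: N = 2ρ/(1 - ρ²)

Important quantity: a = λτ, termed traffic intensity or offered load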
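For reference, the M/M/c queue length equation that the slide leaves out can be sketched with the standard Erlang-C expression for the probability of waiting. This is the textbook result, not necessarily the form the author had in mind, and the function name is illustrative.

```python
import math


def mmc(arrival_rate, service_time, servers):
    """M/M/c: return (utilization, P[wait], mean number in system, mean response time)."""
    a = arrival_rate * service_time            # offered load = mean busy servers
    rho = a / servers                          # per-server utilization
    if rho >= 1.0:
        raise ValueError("unstable: utilization must be below 1")
    below = sum(a ** k / math.factorial(k) for k in range(servers))
    tail = a ** servers / (math.factorial(servers) * (1.0 - rho))
    p_wait = tail / (below + tail)             # Erlang-C probability of queuing
    mean_queue = p_wait * rho / (1.0 - rho)    # mean number waiting
    mean_jobs = mean_queue + a                 # waiting + in service
    return rho, p_wait, mean_jobs, mean_jobs / arrival_rate


print(mmc(1.0, 1.0, 2))
```

For c = 1 this reduces to the M/M/1 result rho/(1 - rho), and for c = 2 the mean number in system equals 2 rho/(1 - rho^2).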


Finite Buffer Models: M/M/c/K

c servers, buffer size K (total in the system can be c+K)

If a request arrives when system is full, request is dropped

For this system, there will be a notion of loss, or blocking, in addition to response time etc.

Blocking probability is probability that arriving request finds the system full

Response time is relevant only for requests that are “accepted”


...Finite Buffer Queues

Arrival rate: λ

Service rate: μ

Throughput?

Utilization?

Queue length? (Model)

Waiting time? (Little's law)


Finite Buffer Queue: Asymptotic Behavior

Utilization vs offered load

Throughput vs offered load

Blocking probability vs offered load

Queue length vs offered load

Waiting time vs. offered load


Finite Buffer (Loss Models)

M/M/c/0: Poisson arrivals, exponential service time, c servers, no waiting room. Represents which kind of systems?

Circuit-switched telephony! (Servers are lines or trunks, Service time is termed “holding time”)

Interesting measure for this queue: probability that arriving call finds all lines busy


Erlang-B formula

Blocking probability (probability that arriving call finds all s servers busy) =

(a^s/s!) / [sum(k from 0 to s) {a^k/k!}], where a is the offered load (arrival rate x mean holding time)
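The formula can be evaluated directly, but factorials and powers overflow for large trunk groups. The sketch below shows both the direct form and the commonly used recursion B(0) = 1, B(k) = a B(k-1) / (k + a B(k-1)). The function names, including the sizing helper `min_lines`, are illustrative.

```python
import math


def erlang_b_direct(a, s):
    """Erlang-B blocking probability, straight from the formula."""
    numerator = a ** s / math.factorial(s)
    denominator = sum(a ** k / math.factorial(k) for k in range(s + 1))
    return numerator / denominator


def erlang_b(a, s):
    """Same quantity via the numerically stable recursion."""
    b = 1.0                                  # blocking with zero servers
    for k in range(1, s + 1):
        b = a * b / (k + a * b)
    return b


def min_lines(a, target_blocking):
    """Smallest number of lines whose blocking does not exceed the target."""
    s = 0
    while erlang_b(a, s) > target_blocking:
        s += 1
    return s


print(erlang_b(2.0, 2), erlang_b_direct(2.0, 2))
```

`min_lines` is the typical sizing use: given an offered load in erlangs, find how many lines keep call blocking under a target.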


Application of Models


Example-1

You are developing an application server where the performance requirements are:

Average response time < 3 seconds

At least 90% of requests should complete within 6 seconds.

Customer has given forecasted arrival rate = 0.5 requests/second

What should be the budget for service time of requests in the server?

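Answer: <= 1.13 seconds approximately (assuming M/M/1; the 90% requirement is the binding one, since the 3-second average alone would allow up to 1.2 seconds).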
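One way to arrive at such a budget, assuming the server behaves as an M/M/1 queue (the slide does not state the model, so this is an assumption): the mean response time is tau/(1 - lambda tau), and the response time is exponentially distributed with rate 1/tau - lambda. Both requirements can then be solved for the service time tau. The function name is hypothetical.

```python
import math


def service_time_budget(arrival_rate, mean_target, fraction, deadline):
    """Largest M/M/1 service time meeting a mean target and a percentile target.

    Mean:       tau / (1 - lam * tau) <= mean_target
    Percentile: 1 - exp(-(1/tau - lam) * deadline) >= fraction
    """
    # Solving the mean constraint for tau.
    tau_from_mean = mean_target / (1.0 + arrival_rate * mean_target)
    # Solving the percentile constraint for tau.
    needed_rate = math.log(1.0 / (1.0 - fraction)) / deadline
    tau_from_percentile = 1.0 / (arrival_rate + needed_rate)
    return min(tau_from_mean, tau_from_percentile), tau_from_mean, tau_from_percentile


budget, by_mean, by_percentile = service_time_budget(0.5, 3.0, 0.90, 6.0)
print(f"mean allows {by_mean:.3f} s, percentile allows {by_percentile:.3f} s")
```

Under these assumptions the percentile requirement is the tighter of the two, so it determines the budget.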


Example-2

If you have two servers, is it better, for minimizing response times, to

1. split the incoming requests into two queues and send them to each server,

2. put them in one queue, with the first in queue sent to whichever server is idle, or

3. replace the two servers by one server that is twice as fast?

Verify intuition by model. Let λ be arrival rate, and τ be service time

Calculate response times, and find which case gives least response time.
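A sketch of the comparison, under assumptions the slide does not spell out: Poisson arrivals, exponential service times, and (for option 1) a random 50/50 split so that each queue is itself M/M/1 with half the arrival rate. Option 2 is then M/M/2 and option 3 is M/M/1 with half the service time. Function and key names are illustrative.

```python
def mm1_response(arrival_rate, service_time):
    """Mean response time of an M/M/1 queue."""
    rho = arrival_rate * service_time
    if rho >= 1.0:
        raise ValueError("unstable")
    return service_time / (1.0 - rho)


def mm2_response(arrival_rate, service_time):
    """Mean response time of an M/M/2 queue: tau / (1 - rho^2), rho = lam*tau/2."""
    rho = arrival_rate * service_time / 2.0
    if rho >= 1.0:
        raise ValueError("unstable")
    return service_time / (1.0 - rho ** 2)


def compare(arrival_rate, service_time):
    return {
        "1_two_queues": mm1_response(arrival_rate / 2.0, service_time),
        "2_shared_queue": mm2_response(arrival_rate, service_time),
        "3_one_fast_server": mm1_response(arrival_rate, service_time / 2.0),
    }


print(compare(1.0, 1.0))
```

With rho = lambda tau / 2, the three response times are tau/(1 - rho), tau/((1 - rho)(1 + rho)) and tau/(2(1 - rho)), so under these assumptions the single fast server is best and the split queues are worst at every stable load.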


Example-3: ATM Link Model

Case 1: Assume ATM link

Link b/w

Packet size:

Packet arrival rate

Delay through link: node processing delay (negligible) + queuing delay + transmission delay + propagation delay


Example-4: Multi-threaded Server

Assume multi-threaded server. Arriving requests are put into a buffer. When a thread is free, it picks up the next request from the buffer.

Execution time: mean =

Traffic =

How many threads should we configure?

Response time =


...Example-4

Related question: estimate average memory requirement of the server.

Likely to have: constant component + dynamic component

Dynamic component is related to number of active threads

Suppose memory requirement of one active thread = M

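Avg. memory requirement = constant + M x N x rho, where N is the number of threads and rho is the utilization of each thread (N x rho = average number of active threads)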


Service Performance Models


Example: Web-Based E-Mail Service

[Figure: service architecture with a User request, WAN, Web Server, IMAP Server, Ad Server, Authentication Server and SMTP Server]

Service is composed by putting together various applications


Use Cases

Login

Read

Delete

Send

Move to Folder

...

Each scenario has a different flow through the back-end systems


Read Message Flow Diagram

[Figure: message flow across Browser, Web, Authentication, IMAP and SMTP. Labels shown: Session_id, Send_to_auth, Verify_session, GeneratHtml, Read_mail, Message, Send_to_imap, Change_to_html, with branch probabilities 0.2 and 0.8]


Model of Service Performance

Request flows through various servers or “queues”

At each server, there will be queuing delays, processing delays, etc.

Model: Queuing Networks


Queuing Networks

Picture of queuing network

Jobs arrive at certain queues (open queuing networks)

After receiving service at one queue (i), they proceed to another queue (j) with some probability p_ij, or exit


Open Queuing Network - measures

Throughput (rate at which requests complete and leave)

Bottleneck Server


Open Queuing Network - measures

Total time spent in the system before completion (overall response time, from the point of view of the user)


Open Queuing Network: Example 1

Add e.g. from paper, or kst book.


...Open Queuing Network- Example 1

Observations:

Each server has different service time

But what is the request rate arriving to each server?

Need to calculate this using flow equations



Show calculation (derivation of avg. number of visits)



Bottleneck Server...

maximum throughput of the system



Bottleneck Server...changes when parameters change

maximum throughput of the system



Jackson's theorem: Overall time spent at Server 1

Server 2

Server 3...



Total time spent before leaving the system
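The steps on the preceding slides (flow equations, bottleneck, per-server time, total time) can be sketched for a small open network. Everything concrete here is hypothetical: three servers loosely named after the e-mail example, made-up routing probabilities and service times, and each server treated as an M/M/1 queue, which is what Jackson's theorem permits for Poisson arrivals and exponential service. The flow equations are solved by simple fixed-point iteration rather than matrix inversion.

```python
def solve_flows(external, routing, iterations=2000):
    """Solve rate_i = external_i + sum_j rate_j * routing[j][i] by iteration."""
    n = len(external)
    rates = list(external)
    for _ in range(iterations):
        rates = [external[i] + sum(rates[j] * routing[j][i] for j in range(n))
                 for i in range(n)]
    return rates


def analyze(external, routing, service_times):
    rates = solve_flows(external, routing)
    total_external = sum(external)
    utilizations = [r * s for r, s in zip(rates, service_times)]
    if max(utilizations) >= 1.0:
        raise ValueError("unstable: some server is saturated")
    # Service demand per request = average visits x service time per visit.
    demands = [r / total_external * s for r, s in zip(rates, service_times)]
    bottleneck = demands.index(max(demands))
    # Jackson: each queue behaves like an independent M/M/1 queue.
    mean_jobs = sum(u / (1.0 - u) for u in utilizations)
    return {
        "rates": rates,
        "utilizations": utilizations,
        "bottleneck": bottleneck,
        "max_throughput": 1.0 / demands[bottleneck],
        "response_time": mean_jobs / total_external,   # Little's law on the whole network
    }


# Hypothetical network: 0 = web, 1 = authentication, 2 = IMAP.
external = [2.0, 0.0, 0.0]                  # requests/second entering at the web server
routing = [[0.0, 0.2, 0.5],                 # web -> auth 0.2, web -> IMAP 0.5, exit 0.3
           [1.0, 0.0, 0.0],                 # auth -> web
           [1.0, 0.0, 0.0]]                 # IMAP -> web
service_times = [0.05, 0.1, 0.2]            # seconds per visit
result = analyze(external, routing, service_times)
print(result)
```

The bottleneck is the server with the largest total demand per request, and its demand caps the throughput of the whole system. Changing a routing probability or a service time can move the bottleneck to a different server.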


Part II

Network Models, Performance Testing, Advanced Software Models


Comparison of TDMA vs FDMA

Which gives better packet delay?

Apply basic queuing results to find packet delays, compare.

Parameters:

M: number of users

P: Packet size

R: channel bandwidth (bit-rate)

λ: packet arrival rate per user


1. TDMA

Time is divided into frames, users send packets “1 at a time”

When packet is sent, it uses entire channel bandwidth

Each frame has M slots

Length of each slot = transmission time of one packet = P/R

Frame length = M.P/R


...TDMA

Avg. Packet Delay = Queuing Delay + Synchronization Delay + Transmission Delay (=P/R)

Avg. Synchronization Delay = M.P/(2 R)

Queuing Delay?


...TDMA

Queuing Delay?

Assume packet waits in a queue whose service time is M.P/R

M/D/1 model with “effective service time” equal to frame length: τ = M.P/R, utilization ρ = λτ

Waiting time = ρτ/(2(1 - ρ))

Total packet delay = ρτ/(2(1 - ρ)) + M.P/(2R) + P/R


2. FDMA

Channel bandwidth is divided equally among users

Users can transmit all at one time

Each user gets R/M bit rate

Packet transmission delay = (P/(R/M) ) = M.P/R.

Average packet transfer delay = queuing delay+transmission delay

No “synchronization”


...FDMA

Can be modeled as M/D/1 queue

Service time τ = M.P/R

Arrival rate = λ

Total = M.P/R + ρτ/(2(1 - ρ)), with ρ = λτ


Compare TDMA vs. FDMA

For M > 2 packet transfer delay in FDMA is greater.
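The comparison can be checked numerically. Both schemes are modeled per user as an M/D/1 queue with service time M.P/R, so the queuing term rho tau/(2(1 - rho)) is identical and the difference comes only from the remaining terms. The parameter values below are arbitrary illustrative choices, and the function names are not from the slides.

```python
def md1_wait(arrival_rate, service_time):
    """Mean waiting time in an M/D/1 queue: rho * tau / (2 * (1 - rho))."""
    rho = arrival_rate * service_time
    if rho >= 1.0:
        raise ValueError("unstable")
    return rho * service_time / (2.0 * (1.0 - rho))


def tdma_delay(M, P, R, lam):
    frame = M * P / R
    # queuing + average wait for own slot (half a frame) + transmission at full rate
    return md1_wait(lam, frame) + frame / 2.0 + P / R


def fdma_delay(M, P, R, lam):
    transmission = M * P / R                 # each user only gets R/M
    return md1_wait(lam, transmission) + transmission


# Hypothetical values: 10 users, 1000-bit packets, 1 Mb/s channel, 50 packets/s per user.
M, P, R, lam = 10, 1000.0, 1e6, 50.0
print(tdma_delay(M, P, R, lam), fdma_delay(M, P, R, lam))
```

The difference FDMA - TDMA works out to (P/R)(M/2 - 1), which is zero for M = 2 and positive for M > 2.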


LAN performance

ALOHA

Slotted ALOHA

CSMA/CD

Wireless LAN


Software Performance Testing


Typical Scenario

Scenario is actually of “closed” queuing network


Observations

Throughput?

Arrival rate?

Resp. time vs # of users (asymptotic)

Resp. time vs arrival rate (asymptotic)


Metrics from model

Saturation number

Response time
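The slides only name these metrics, so the following is a sketch of the standard asymptotic bounds for a closed system, given as one plausible reading rather than the author's own derivation. It assumes N users who each think for Z seconds between requests, and per-request service demands D_i at each server. Then throughput is at most min(N/(D + Z), 1/D_max), response time is at least max(D, N D_max - Z), and the saturation number is (D + Z)/D_max. All numbers and names below are hypothetical.

```python
def closed_bounds(users, think_time, demands):
    """Asymptotic bounds for a closed queuing network.

    demands: total service demand per request at each server (seconds).
    Returns (throughput upper bound, response time lower bound).
    """
    total_demand = sum(demands)
    max_demand = max(demands)                       # the bottleneck's demand
    throughput_bound = min(users / (total_demand + think_time), 1.0 / max_demand)
    response_bound = max(total_demand, users * max_demand - think_time)
    return throughput_bound, response_bound


def saturation_number(think_time, demands):
    """User count beyond which the bottleneck bound takes over."""
    return (sum(demands) + think_time) / max(demands)


demands = [0.05, 0.2, 0.1]      # hypothetical demands at three servers
think = 5.0                     # hypothetical think time between requests
print("saturation number:", saturation_number(think, demands))
for n in (10, 50):
    print(n, closed_bounds(n, think, demands))
```

Below the saturation number, adding users raises throughput almost linearly; above it, throughput is capped by the bottleneck and response time grows roughly linearly with the number of users.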


Converting Stress Tests to “Load” tests


Performance measurement – a case study (Java paper)
