Performance Analysis of Computer Systems and Networks Prof. Varsha Apte.


© 2004 by Varsha Apte, IIT Bombay

Example: On-line Service

[Figure: a client connected to a server]

• What questions about performance can we ask?
• Why should we ask them?
• How can we answer them?


…What

Software
• Response time
• Blocking
• Queue length
• Throughput
• Utilization

Network
• Packet delay, message delay
• Loss rate
• Queue length
• "Goodput"
• Utilization
• Delay jitter


...Why

Sizing (Hardware, network)

Setting configuration parameters

Choosing architectural alternatives

Comparing algorithms

Determining bottlenecks

Guaranteeing QoS will be met


HOW?


Example: Estimating end-to-end delay

[Figure: a client connected to a server]

• Measure it!
  • At the client
  • At the server
• Simulate it: write (or use) a computer program that simulates the behaviour of the system and collects statistics
• Analyse it "with pen and paper"

Let's try (delay)! Assume a Web service.


Dissecting the response time

Delay components:

Client Processing (prepare request)

Connection Set-up

Sending the request

Server processing the request

Sending the response

Client processing (display response)


...Dissecting delays

Connection set-up (assume TCP): SYN, SYN-ACK
• 1 round-trip time (RTT) before the request can be sent

Sending the request
• ½ RTT for the request to reach the server

At the server:
• Queuing delay for a server thread
• Processing delay (once the request gets a server thread)
• The thread will also be in the CPU job "queue" or the disk queue

Sending the response back
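The components above can simply be added up. The sketch below is a minimal illustration, assuming the response needs another ½ RTT to come back and ignoring transmission time; the function name and all sample numbers are hypothetical.

```python
def web_response_time(rtt, client_prepare, server_wait, server_process, client_display):
    """Sum the delay components of one web request (all times in seconds)."""
    connection_setup = rtt       # TCP SYN / SYN-ACK: one round trip
    send_request = rtt / 2       # request travels one way
    send_response = rtt / 2      # assumption: response also takes half an RTT
    return (client_prepare + connection_setup + send_request
            + server_wait + server_process + send_response + client_display)


# Hypothetical numbers: 80 ms RTT, 50 ms queuing and 20 ms processing at the server.
total = web_response_time(rtt=0.080, client_prepare=0.005, server_wait=0.050,
                          server_process=0.020, client_display=0.010)
print(f"estimated response time: {total * 1000:.0f} ms")
```

Only `server_wait` (the queuing delay) is hard to know in advance; the other terms are close to fixed.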


...Dissecting delays

Delay components of "RTT":
• Queuing delay (at each link)
• Packet processing delay (at each node)
• Packet transmission delay (at each link)
• Link propagation delay


Delay: observations

Many delays are fixed:
• Propagation delay
• Packet processing delay
• Transmission delay (for a given packet size)

Some are variable:
• Notably, queuing delay


Key Concept

Fundamental concept: contention for a resource makes the users of that resource spend time queuing, or otherwise waiting, until the resource is given to them. Calculating this time is what requires sophisticated models, because it changes with random changes in the system (e.g. traffic volumes, failures) and because it depends on various system mechanisms.

• Focus of this workshop: queuing delay


Queuing Systems

An Introduction to Elementary Queuing Theory


What/Why is a Queue?

The systems whose performance we study are those that have some contention for resources

If there is no contention, performance is in most cases not an issue

When multiple "users/jobs/customers/tasks" require the same resource, use of the resource has to be regulated by some discipline


…What/Why is a Queue?

When a customer finds a resource busy, the customer may:
• Wait in a "queue" (if there is a waiting room)
• Or go away (if there is no waiting room, or if the waiting room is full)

Hence the term "queue" or "queuing system"

It can represent any resource in front of which a queue can form

In some cases an actual queue may not form, but the system is called a "queue" anyway.


Examples of Queuing Systems

• CPU. Customers: processes/threads
• Disk. Customers: processes/threads
• Network link. Customers: packets
• IP router. Customers: packets
• ATM switch. Customers: ATM cells
• Web server threads. Customers: HTTP requests
• Telephone lines. Customers: telephone calls


Elements of a Queue

Server

Waiting room / buffer / queue

Queuing discipline

Customer Inter-arrival time

Service time


Elements of a Queue

Number of Servers

Size of waiting room/buffer

Service time distribution

Nature of arrival "process":
• Inter-arrival time distribution
• Correlated arrivals, etc.

Number of “users” issuing jobs (population)

Queuing discipline: FCFS, priority, LCFS, processor sharing (round-robin)


Elements of a Queue

Number of servers: 1, 2, 3, …

Size of buffer: 0, 1, 2, 3, …

Service time distribution and inter-arrival time distribution:
• Deterministic (constant)
• Exponential
• General (any)

Population: 1, 2, 3, …


Queue Performance Measures

Queue Length: Number of jobs in the system (or in the queue)

Waiting time (average, distribution): Time spent in queue before service

Response time: Waiting time+service time

Utilization: Fraction of time server is busy or probability that server is busy

Throughput: Job completion rate


Queue Performance Measures

Let the observation time be $T$, and let:
• $A$ = number of arrivals during time $T$
• $C$ = number of completions during time $T$
• $B$ = total time the system was busy during time $T$

Then:
• Arrival rate: $\lambda = A/T$
• Throughput: $C/T$
• Utilization: $\rho = B/T$
• Average service time: $\tau = B/C$
• Service rate: $\mu = 1/\tau$
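These definitions translate directly into code. The sketch below is illustrative only: the function name and the sample counts are hypothetical, standing in for numbers that would come from a server log.

```python
def operational_measures(T, A, C, B):
    """Derive basic queue measures from counts observed over T seconds."""
    return {
        "arrival_rate": A / T,    # lambda
        "throughput": C / T,
        "utilization": B / T,     # rho
        "service_time": B / C,    # tau
        "service_rate": C / B,    # mu = 1 / tau
    }


# Hypothetical log summary: 60 s window, 180 arrivals, 180 completions, 45 s busy.
m = operational_measures(T=60.0, A=180, C=180, B=45.0)
for name, value in m.items():
    print(f"{name}: {value:.3f}")
```

With these counts the utilization equals throughput times average service time, which is the relationship derived on the next slide.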


Basic Relationships

In "steady state", for a stable system without loss (i.e. an infinite-buffer system):
• Completion rate = arrival rate, since "in-flow = out-flow": $C/T = \lambda$
• (If arrival rate > service rate, the system is not stable and this equality does not hold.)
• Utilization: $\rho = B/T = (B/C) \times (C/T)$ = average service time × completion rate $= \lambda\tau$ for a loss-less system
• For lossy systems, if $p$ = fraction of requests lost: $\rho = \lambda(1 - p)\tau$
• Throughput of a system = utilization × service rate: $C/T = (B/T) \times (C/B) = \rho\mu$


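Little's Law: $N = \lambda R$

Average number of customers in a queuing system = throughput × average response time (here $\lambda$ is the throughput, which equals the arrival rate in a loss-less system)

Applicable to any "closed boundary" that contains queuing systems

Some other assumptions also apply (e.g. the system is in steady state)

Also, if $L$ is the number in queue (not in service), and $W$ is waiting time:

$L = \lambda W$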
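Both forms of Little's law are one-line calculations. A small sketch, with hypothetical function names and sample values:

```python
def mean_population(throughput, mean_time):
    """Little's law: N = lambda * R (or L = lambda * W for the queue alone)."""
    return throughput * mean_time


def mean_time(population, throughput):
    """Little's law rearranged: R = N / lambda (or W = L / lambda)."""
    return population / throughput


# Hypothetical server: 3 requests/second, each spending 1 second in the system.
n_in_system = mean_population(throughput=3.0, mean_time=1.0)
print(f"average number in system: {n_in_system}")
print(f"response time recovered from N: {mean_time(n_in_system, 3.0)} s")
```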


Simple Example (Server)

Assume just one server (single thread)
• Requests come in at 3 requests/second
• Request processing time = 250 ms

Utilization of the server?
• $\rho = \lambda\tau = 3 \times 0.25 = 75\%$

Throughput of the server?
• 3 requests/second

What if requests come in at 5 requests/second?
• Utilization = 100%, throughput = 4 requests/second (the service rate, 1/0.25 per second)

Waiting time (at 3 requests/second)?
• $L/3$, where $L$ is the queue length (by Little's law). But what is $L$?
• Need a queuing model


Queuing Systems Notation

X/Y/Z/A/B/C

• X: Inter-arrival time distribution
• Y: Service time distribution
• Z: Number of servers
• A: Buffer size
• B: Population size
• C: Discipline

Distributions are denoted by D (Deterministic), M (Exponential) or G (General)

E.g.: M/G/4/50/2000/LCFS


Classic Single Server Queue: M/M/1

• Exponential service time
• Exponential inter-arrival time (this is the "Poisson" arrival process)
• Single server
• Infinite buffer (waiting room)
• FCFS discipline

Can be solved very easily, using the theory of Markov chains


Exponential Distribution

Memory-less distribution: the distribution of the remaining time does not depend on the elapsed time

Mathematically convenient

Realistic in many situations (e.g. inter-arrival times of calls)

$X$ is $\mathrm{EXP}(\lambda)$:

$P[X < t] = 1 - e^{-\lambda t}$

Average value of $X$ = $1/\lambda$

[Figure: CDF and PDF of the exponential distribution, plotted against t from 0 to 10]
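The memory-less property can be checked numerically. The sketch below uses a hypothetical rate of 0.5 and compares the chance of lasting 3 more time units, given that 2 have already elapsed, with the chance of lasting 3 units from scratch.

```python
import math


def exp_cdf(t, rate):
    """P[X < t] for an exponential distribution with the given rate."""
    return 1.0 - math.exp(-rate * t) if t > 0 else 0.0


def exp_survival(t, rate):
    """P[X > t], i.e. 1 - CDF."""
    return math.exp(-rate * t)


rate = 0.5                    # hypothetical parameter; mean is 1 / rate = 2
elapsed, extra = 2.0, 3.0
conditional = exp_survival(elapsed + extra, rate) / exp_survival(elapsed, rate)
unconditional = exp_survival(extra, rate)
print(f"P[X > 5 | X > 2] = {conditional:.6f}")
print(f"P[X > 3]         = {unconditional:.6f}")
```

The two printed probabilities agree, which is exactly what "remaining time does not depend on elapsed time" means.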


Exponential <-> Poisson

When the distribution of inter-arrival time is exponential, the "arrival process" is a "Poisson" process.

Properties of a Poisson process with parameter $\lambda$:
• If $N_t$ = number of arrivals in $(0, t]$, then $P[N_t = k] = (\lambda t)^k e^{-\lambda t} / k!$
• Superposition of Poisson processes is a Poisson process
• Splitting of a Poisson process results in Poisson processes
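The count formula and the superposition property can be illustrated together. In this sketch (function names and rates are hypothetical) the counts of two independent streams are convolved and compared with a single stream whose rate is the sum.

```python
import math


def poisson_pmf(k, rate, t):
    """P[N_t = k] for a Poisson process with the given rate."""
    mean = rate * t
    return mean ** k * math.exp(-mean) / math.factorial(k)


def merged_pmf(k, rate1, rate2, t):
    """P[total arrivals = k] when two independent Poisson streams are merged."""
    return sum(poisson_pmf(i, rate1, t) * poisson_pmf(k - i, rate2, t)
               for i in range(k + 1))


# Merging rate-2 and rate-3 streams should look like one rate-5 stream.
for k in range(4):
    print(k, round(merged_pmf(k, 2.0, 3.0, 1.0), 6), round(poisson_pmf(k, 5.0, 1.0), 6))
```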


Important Result!

M/M/1 queue results

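Let $\lambda$ be the arrival rate, $\tau$ the service time, and $\mu = 1/\tau$ the service rate
• Utilization: $\rho = \lambda\tau$
• Mean number of jobs in the system: $N = \rho/(1-\rho)$
• Throughput: $\lambda$
• Average response time (Little's law): $R = N/\lambda = \tau/(1-\rho)$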
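The M/M/1 results fit in a small helper. This is a sketch with a hypothetical function name; the sample numbers reuse the earlier single-server example (3 requests/second, 250 ms service time), assuming exponential service and Poisson arrivals.

```python
def mm1(arrival_rate, service_time):
    """Mean-value results for an M/M/1 queue."""
    rho = arrival_rate * service_time
    if rho >= 1:
        raise ValueError("unstable: utilization must be below 1")
    mean_jobs = rho / (1 - rho)
    response_time = mean_jobs / arrival_rate          # Little's law
    return {
        "utilization": rho,
        "mean_jobs": mean_jobs,
        "throughput": arrival_rate,
        "response_time": response_time,
        "waiting_time": response_time - service_time,
    }


result = mm1(arrival_rate=3.0, service_time=0.25)
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```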


Response Time Graph

Graph illustrates typical behaviour of response time curve

[Figure: response time versus utilization (rho) for tau = 0.5; rho ranges from 0 to 1 and response time from 0 to 20]


M/M/1 queue results

For the M/M/1 queue, a formula for the distribution of the response time has also been derived.

M/M/1 response time is exponentially distributed with parameter $\mu(1-\rho)$, i.e.

$P[\text{Response time} < t] = 1 - e^{-\mu(1-\rho)t}$
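Because the distribution is exponential, percentiles follow by inverting the CDF. The sketch below is illustrative; the function names are hypothetical and the numbers again assume 3 requests/second with a 250 ms service time.

```python
import math


def mm1_response_cdf(t, arrival_rate, service_time):
    """P[response time < t] for an M/M/1 queue."""
    mu = 1.0 / service_time
    rho = arrival_rate * service_time
    return 1.0 - math.exp(-mu * (1.0 - rho) * t)


def mm1_response_percentile(q, arrival_rate, service_time):
    """Time t such that a fraction q of requests finish within t."""
    mu = 1.0 / service_time
    rho = arrival_rate * service_time
    return -math.log(1.0 - q) / (mu * (1.0 - rho))


p95 = mm1_response_percentile(0.95, arrival_rate=3.0, service_time=0.25)
print(f"95th percentile response time: {p95:.3f} s")
```

The 95th percentile is roughly three times the mean here, which is why averages alone can hide poor tail behaviour.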


M/G/1 single server queue

General service time distribution

Mean number in system: $N = \rho + \dfrac{\rho^2 (1 + C_s^2)}{2(1 - \rho)}$

where $C_s$ = standard deviation / mean of the service time. This is called the Pollaczek-Khinchin (P-K) mean value formula.

Mean response time follows by Little's law


M/G/1 delay

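Mean response time by Little's law: $R = N/\lambda = \tau + \dfrac{\rho\tau(1 + C_s^2)}{2(1-\rho)}$, where $C_s$ is the coefficient of variation of the service time

For constant service time (M/D/1 queue), $C_s = 0$:
• M/D/1 mean response time $= \tau + \dfrac{\rho\tau}{2(1-\rho)}$
• M/D/1 mean waiting time $= \dfrac{\rho\tau}{2(1-\rho)}$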
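The P-K formula is easy to evaluate for different service time variabilities. A minimal sketch, with a hypothetical function name; `cv` is the coefficient of variation of the service time, so `cv=1` reproduces M/M/1 and `cv=0` gives M/D/1.

```python
def mg1(arrival_rate, service_time, cv):
    """Mean number in system, response time and waiting time for an M/G/1 queue."""
    rho = arrival_rate * service_time
    if rho >= 1:
        raise ValueError("unstable: utilization must be below 1")
    mean_jobs = rho + rho ** 2 * (1 + cv ** 2) / (2 * (1 - rho))   # P-K mean value formula
    response_time = mean_jobs / arrival_rate                        # Little's law
    return mean_jobs, response_time, response_time - service_time


# Same load (3 requests/second, 250 ms service), different variability.
for label, cv in [("M/D/1", 0.0), ("M/M/1", 1.0)]:
    n, r, w = mg1(3.0, 0.25, cv)
    print(f"{label}: N = {n:.3f}, R = {r:.3f} s, W = {w:.3f} s")
```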


Queue Length by P-K formula

Coefficient of variation for:

•Det: 0

•Uniform(10-50): 0.222

•Erlang-2: 0.5

•Exp: 1

•Gen: 3

[Figure: queue length (0 to 40) versus load (0 to 1) by the P-K formula, with curves for Det, Erlang-2, Unif(10,50), Exp and General service times]


Multiple server queue: M/M/c

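• One queue, $c$ servers
• Utilization: $\rho = \lambda\tau/c$; $c\rho = \lambda\tau = a$ = average number of busy servers
• A queue length equation exists (not shown here)
• For $c = 2$, queue length is: $N = 2\rho/(1-\rho^2)$
• Average response time? Little's law! For $c = 2$, $R = N/\lambda = \tau/(1-\rho^2)$

Important quantity: $a = \lambda\tau$, termed traffic intensity or offered load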
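For general $c$, the queue length equation that the slide leaves out is the standard Erlang-C result. The sketch below uses that textbook formula (it is not taken from the slides) and a hypothetical function name; for $c = 2$ it reproduces the closed form quoted above.

```python
import math


def mmc(arrival_rate, service_time, servers):
    """Mean-value results for an M/M/c queue via the Erlang-C formula."""
    a = arrival_rate * service_time               # offered load
    rho = a / servers                             # per-server utilization
    if rho >= 1:
        raise ValueError("unstable: offered load must be below the number of servers")
    tail = a ** servers / math.factorial(servers) / (1 - rho)
    head = sum(a ** k / math.factorial(k) for k in range(servers))
    p_wait = tail / (head + tail)                 # probability an arrival must queue
    mean_jobs = p_wait * rho / (1 - rho) + a      # queued jobs + jobs in service
    return {
        "utilization": rho,
        "busy_servers": a,
        "p_wait": p_wait,
        "mean_jobs": mean_jobs,
        "response_time": mean_jobs / arrival_rate,
    }


two = mmc(arrival_rate=6.0, service_time=0.25, servers=2)
print(f"N = {two['mean_jobs']:.4f}, closed form = {2 * 0.75 / (1 - 0.75 ** 2):.4f}")
```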


Finite Buffer Models: M/M/c/K

c servers, buffer size K (total in the system can be c+K)

If a request arrives when system is full, request is dropped

For this system, there is a notion of loss, or blocking, in addition to response time etc.

Blocking probability ($p$) is the probability that an arriving request finds the system full

Response time is relevant only for requests that are “accepted”


...Finite Buffer Queues

• Arrival rate: $\lambda$
• Service rate: $\mu$
• Throughput? $\lambda(1 - p)$
• Utilization? $\rho = \lambda(1 - p)/(c\mu)$
• Blocking probability? Queue length ($N$)? Formulas exist (from the queuing model)
• Waiting time? By Little's law: $N/(\lambda(1 - p))$


Finite Buffer Queue: Asymptotic Behavior

As offered load increases ($\lambda \to \infty$):
• Utilization: $\rho \to 1$
• Throughput: $\lambda(1-p) \to c\mu$
• Blocking probability: $p \to 1$
• Queue length: $N \to c + K$
• Waiting time: $\to K/(c\mu) + 1/\mu$
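The M/M/c/K measures can be computed from the steady-state probabilities of the underlying birth-death chain. The sketch below uses that standard approach (it is not shown in the slides); the function name and sample parameters are hypothetical. Pushing the arrival rate very high reproduces the limits listed above.

```python
import math


def mmck(arrival_rate, service_time, servers, buffer_size):
    """Blocking, throughput and delay for an M/M/c/K queue (capacity c + K)."""
    a = arrival_rate * service_time
    weights = []
    for n in range(servers + buffer_size + 1):
        if n <= servers:
            weights.append(a ** n / math.factorial(n))
        else:
            weights.append(a ** servers / math.factorial(servers)
                           * (a / servers) ** (n - servers))
    total = sum(weights)
    probs = [w / total for w in weights]
    p_block = probs[-1]                                # arrival finds system full
    throughput = arrival_rate * (1 - p_block)
    mean_jobs = sum(n * p for n, p in enumerate(probs))
    return {
        "p_block": p_block,
        "throughput": throughput,
        "utilization": throughput * service_time / servers,
        "mean_jobs": mean_jobs,
        "time_in_system": mean_jobs / throughput,      # Little's law on accepted requests
    }


# Overload a system with c = 2 servers, K = 3 buffer slots and a 250 ms service time.
heavy = mmck(arrival_rate=10000.0, service_time=0.25, servers=2, buffer_size=3)
for name, value in heavy.items():
    print(f"{name}: {value:.4f}")
```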


Finite Buffer (Loss Models)

M/M/c/0: Poisson arrivals, exponential service time, c servers, no waiting room.

Represents which kind of systems?

Circuit-switched telephony! (Servers are lines or trunks; service time is termed "holding time")

Interesting measure for this queue: probability that arriving call finds all lines busy


Erlang-B formula

Blocking probability (probability that an arriving call finds all $s$ servers busy) $= \dfrac{a^s/s!}{\sum_{k=0}^{s} a^k/k!}$

where $a$ is the offered load (arrival rate × mean holding time).
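The Erlang-B formula can be evaluated directly, or with a well-known recursion that avoids large factorials. The sketch below shows both, plus a hypothetical helper that searches for the smallest number of lines meeting a blocking target; all names are illustrative.

```python
import math


def erlang_b_direct(offered_load, servers):
    """Erlang-B blocking probability, straight from the formula."""
    numerator = offered_load ** servers / math.factorial(servers)
    denominator = sum(offered_load ** k / math.factorial(k) for k in range(servers + 1))
    return numerator / denominator


def erlang_b(offered_load, servers):
    """Same quantity via the recursion B(k) = a*B(k-1) / (k + a*B(k-1)), B(0) = 1."""
    b = 1.0
    for k in range(1, servers + 1):
        b = offered_load * b / (k + offered_load * b)
    return b


def lines_needed(offered_load, target_blocking):
    """Smallest number of lines whose blocking does not exceed the target."""
    servers = 1
    while erlang_b(offered_load, servers) > target_blocking:
        servers += 1
    return servers


print(f"blocking with 2 Erlangs on 2 lines: {erlang_b(2.0, 2):.3f}")
print(f"lines for 6 Erlangs at 1% blocking: {lines_needed(6.0, 0.01)}")
```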


Examples


Example-1

You are developing an application server where the performance requirements are:

Average response time < 3 seconds

Forecasted arrival rate = 0.5 requests/second

What should be the budget for service time of requests in the server?

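Answer: < 1.2 seconds (assuming an M/M/1 server: $R = \tau/(1 - \lambda\tau) < 3$ with $\lambda = 0.5$ gives $\tau < 1.2$).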
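The budget can be obtained by inverting the M/M/1 response time formula. A minimal sketch, assuming the server behaves like an M/M/1 queue; the function name is hypothetical.

```python
def max_service_time(target_response, arrival_rate):
    """Largest tau for which tau / (1 - lambda * tau) stays within the target."""
    return target_response / (1 + arrival_rate * target_response)


budget = max_service_time(target_response=3.0, arrival_rate=0.5)
check = budget / (1 - 0.5 * budget)     # response time at exactly that budget
print(f"service time budget: {budget:.2f} s (gives R = {check:.2f} s)")
```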


Example-2

If you have two servers, which is better for minimizing response times?
• Split the incoming requests into two queues and send them to each server
• Or, put them in one queue, and send the first in queue to whichever server is idle
• Or, replace the two servers by one server, twice as fast


Example-2 contd.

Verify intuition by model.
• Let $\lambda$ be the arrival rate, and $\tau$ be the service time
• Calculate response times, and order the cases by response time

Answer: R3 < R2 < R1
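The ordering can be verified numerically. The sketch below assumes Poisson arrivals split evenly in case 1 and exponential service times throughout, so the cases become two M/M/1 queues, one M/M/2 queue, and one faster M/M/1 queue; the function name and sample load are hypothetical.

```python
def two_server_options(arrival_rate, service_time):
    """Mean response time of the three configurations (R1, R2, R3)."""
    rho = arrival_rate * service_time / 2          # utilization is the same in all cases
    if not 0 < rho < 1:
        raise ValueError("utilization must be between 0 and 1")
    r1 = service_time / (1 - rho)                  # two separate queues, half the traffic each
    r2 = service_time / (1 - rho ** 2)             # one shared queue, two servers (M/M/2)
    r3 = (service_time / 2) / (1 - rho)            # one server, twice as fast
    return r1, r2, r3


r1, r2, r3 = two_server_options(arrival_rate=6.0, service_time=0.25)
print(f"R1 = {r1:.3f} s, R2 = {r2:.3f} s, R3 = {r3:.3f} s")
```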


Example-3: ATM Link Model

Assume an ATM link, Poisson arrivals, infinite buffer
• Link bandwidth: 10 Mbps
• Packet size: 53 bytes
• Packet transmission time = 42.4 µs
• Packet inter-arrival time = 50 µs


Example-3 contd.

Delay through the link: node processing delay (negligible) + queuing delay (waiting time) + transmission delay + propagation delay

Queuing delay? M/D/1 waiting time $= \dfrac{(42.4)(42.4/50)}{2\,(1 - 42.4/50)} \approx 118.3$ µs
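The same arithmetic, written out step by step so the units are explicit. This is a sketch using only the numbers given in the example; the variable names are illustrative.

```python
link_bps = 10e6                       # 10 Mbps link
cell_bits = 53 * 8                    # one 53-byte ATM cell
tx_time = cell_bits / link_bps        # transmission (service) time in seconds
interarrival = 50e-6                  # mean inter-arrival time in seconds

rho = tx_time / interarrival          # utilization = lambda * tau
wait = tx_time * rho / (2 * (1 - rho))    # M/D/1 mean waiting time

print(f"transmission time: {tx_time * 1e6:.1f} us")
print(f"utilization:       {rho:.3f}")
print(f"queuing delay:     {wait * 1e6:.1f} us")
```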


Example-4: Multi-threaded Server

Assume multi-threaded server. Arriving requests are put into a buffer. When a thread is free, it picks up the next request from the buffer.

• Execution time: mean = 200 ms
• Traffic = 30 requests/sec
• How many threads should we configure? (Assume enough hardware.)

Traffic intensity = 30 × 0.2 = 6 = average number of busy servers, so at least 6 threads are needed

Response time = (Which formula?)


...Example-4

Related question: estimate average memory requirement of the server.

• Likely to have: constant component + dynamic component
• Dynamic component is related to the number of active threads
• Suppose the memory requirement of one active thread = M
• Avg. memory requirement = constant + M × 6

Page 48: Performance Analysis of Computer Systems and Networks Prof. Varsha Apte.

© 2004 by Varsha Apte, IIT Bombay 48

Example-5: Hardware Sizing

Consider the following scenario:
• An application server runs on a 24-CPU machine
• Server seems to peak at 320 transactions per second
• We need to scale to 400
• Hardware vendor recommends going to a 32-CPU machine

Should you?


Example-5: Hardware Sizing

First do bottleneck analysis! Suppose logs show that at 320 transactions per second, CPU utilization is 67%. What is the problem? What is the solution?


Example-5: Hardware Sizing

Most likely explanation: the number of threads in the server is less than the number of CPUs

Possible diagnosis:
• Server has 16 threads configured
• Each thread has a capacity of 20 transactions per second
• Total capacity: 320 reqs/second
• At this load, the 16 threads will be 100% busy, so average CPU utilization will be 16/24 = 67%

Solution: increase the number of threads; there is no need for more CPUs.


Part II

Applications to Software Performance Testing and Measurement, Network Models and Service Performance


Software Performance Testing


Typical Scenario

M clients issue requests, wait for response, then “think”, then issue request again

Let $1/\lambda$ be the mean think time and $1/\mu$ the mean service time.

The scenario is actually a "closed" queuing network

[Figure: clients and server connected in a closed loop]


Observations

Throughput? Arrival rate?
• At steady state, the "flow" through both queues is equal
• Server throughput = server utilization × service rate $= U\mu$
• A request is generated by each user once every [think time + response time], i.e. at rate $1/(1/\lambda + R)$
• Overall request arrival rate $= M/(1/\lambda + R) = U\mu$


Observations

$M/(1/\lambda + R) = U\mu$ = throughput

Response time = number of clients / throughput − think time

As the number of clients increases, the response time can be shown to tend to $M/\mu - 1/\lambda$:
• A linear function of $M$
• Increase $M$ by 1 and the response time increases by $1/\mu$


Metrics from model

Saturation number (number of clients after which the system is "saturated"):

$M^* = 1 + \mu/\lambda$

[Figure: response time R versus number of clients M]
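The shape of the R-versus-M curve can be reproduced with an exact model of the closed system. The sketch below assumes exponential think and service times and a single server, which makes the number of requests at the server a simple birth-death chain; these assumptions, the function names and the sample values (5 s think time, 0.5 s service time) are illustrative and not from the slides.

```python
def closed_system(clients, think_time, service_time):
    """Utilization, throughput and response time for M clients and one server."""
    lam, mu = 1.0 / think_time, 1.0 / service_time
    weights = [1.0]                       # weights[n] ~ P[n requests at the server]
    for n in range(1, clients + 1):
        weights.append(weights[-1] * (clients - n + 1) * lam / mu)
    utilization = 1.0 - weights[0] / sum(weights)
    throughput = utilization * mu
    response = clients / throughput - think_time
    return utilization, throughput, response


def saturation_number(think_time, service_time):
    """M* = 1 + mu / lambda = 1 + think time / service time."""
    return 1.0 + think_time / service_time


print(f"M* = {saturation_number(5.0, 0.5):.0f}")
for m in (1, 5, 11, 20, 100):
    u, x, r = closed_system(m, think_time=5.0, service_time=0.5)
    print(f"M = {m:3d}: U = {u:.3f}, throughput = {x:.3f}/s, R = {r:.3f} s")
```

Well past M*, each extra client adds about one service time to the response time, as the asymptote predicts.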


Case Study

Software Performance Measurement


Queuing Networks

Jobs arrive at certain queues (open queuing networks)

After receiving service at one queue ($i$), they proceed to another server ($j$) with some probability $p_{ij}$, or exit


Open Queuing Network - measures

Maximum Throughput

Bottleneck Server

Throughput


Open Queuing Network - measures

Total time spent in the system before completion (overall response time, from the point of view of the user)

[Figure: a request's path through the network, from start to finish]


Open Queuing Network: Example 1

W: Web Server

A1, A2: Application Server 1, 2

D1, D2: Database Server 1,2

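[Figure: queuing network in which W routes requests to A1, to A2, or out of the network; A1 routes to D1 or back to W; A2 routes to D2 or back to W. Edge labels: $p_{w0}$, $p_{wA1}$, $p_{wA2}$, $p_{A1w}$, $p_{A2w}$, $p_{A1D1}$, $p_{A2D2}$]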


...Open Queuing Network- Example 1

Each server has different service time

But what is the request rate arriving to each server?

Need to calculate this using flow equations



...Open Queuing Network- Example 1

Equations for the average number of visits to each server before leaving the network:

$v_{A1} = p_{wA1}\, v_w + v_{D1}$, $v_{D1} = p_{A1D1}\, v_{A1}$

$v_{A2} = p_{wA2}\, v_w + v_{D2}$, $v_{D2} = p_{A2D2}\, v_{A2}$

$v_w = 1 + p_{A1w}\, v_{A1} + p_{A2w}\, v_{A2}$
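These flow equations are linear and can be solved by substitution. The sketch below does that for this particular network; the function name and the routing probabilities in the example are hypothetical.

```python
def visit_counts(p_wA1, p_wA2, p_A1D1, p_A2D2, p_A1w, p_A2w):
    """Solve the visit-count equations for W, A1, D1, A2 and D2."""
    # From v_A = p_wA * v_w + v_D and v_D = p_AD * v_A:  v_A = p_wA * v_w / (1 - p_AD)
    k1 = p_wA1 / (1 - p_A1D1)
    k2 = p_wA2 / (1 - p_A2D2)
    # Substitute into v_w = 1 + p_A1w * v_A1 + p_A2w * v_A2 and solve for v_w.
    v_w = 1 / (1 - p_A1w * k1 - p_A2w * k2)
    v_A1, v_A2 = k1 * v_w, k2 * v_w
    return {"W": v_w, "A1": v_A1, "D1": p_A1D1 * v_A1, "A2": v_A2, "D2": p_A2D2 * v_A2}


# Hypothetical routing: W exits with probability 0.2 and sends 0.4 to each app server.
v = visit_counts(p_wA1=0.4, p_wA2=0.4, p_A1D1=0.5, p_A2D2=0.25, p_A1w=0.5, p_A2w=0.75)
for server, visits in v.items():
    print(f"{server}: {visits:.3f} visits per request")
```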



...Open Queuing Network- Example 1

Spreadsheet calculation shows:
• How the bottleneck server changes
• How throughput changes
• How response time changes

Results are exact for only some types of networks; for others, they are approximate
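A spreadsheet-style calculation of this kind can be sketched in a few lines. The version below assumes each server behaves like an independent M/M/1 queue, which is exact only for certain network types and approximate otherwise; the function name, visit counts and service times are hypothetical.

```python
def analyse_open_network(arrival_rate, visits, service_times):
    """Bottleneck, maximum throughput, utilizations and overall response time."""
    # Service demand: total service time a request needs at each server over all visits.
    demands = {s: visits[s] * service_times[s] for s in visits}
    bottleneck = max(demands, key=demands.get)
    max_throughput = 1.0 / demands[bottleneck]
    utilizations = {s: arrival_rate * d for s, d in demands.items()}
    if max(utilizations.values()) >= 1:
        raise ValueError("arrival rate exceeds the maximum throughput")
    # Assumption: M/M/1 behaviour at every server.
    response = sum(d / (1 - utilizations[s]) for s, d in demands.items())
    return bottleneck, max_throughput, utilizations, response


visits = {"W": 5.0, "A1": 4.0, "D1": 2.0}             # hypothetical visit counts
service_times = {"W": 0.01, "A1": 0.02, "D1": 0.05}   # hypothetical, in seconds
bottleneck, max_x, util, resp = analyse_open_network(5.0, visits, service_times)
print(f"bottleneck: {bottleneck}, max throughput: {max_x:.1f} requests/s")
print(f"overall response time at 5 requests/s: {resp:.3f} s")
```

Changing a service time or a visit count and rerunning shows how the bottleneck, throughput limit and response time shift.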
