Performance Analysis of Computer Systems and Networks Prof. Varsha Apte.
© 2004 by Varsha Apte, IIT Bombay 2
Example: On-line Service
Client
Server
• What questions about performance can we ask?
• Why should we ask them?
• How can we answer them?
…What
Software: response time, blocking, queue length, throughput, utilization
Network: packet delay, message delay, loss rate, queue length, "goodput", utilization, delay jitter
...Why
Sizing (Hardware, network)
Setting configuration parameters
Choosing architectural alternatives
Comparing algorithms
Determining bottlenecks
Guaranteeing QoS will be met
HOW?
Example: Estimating end-to-end delay
Measure it! (at the client, at the server)
Simulate it – Write (or use) a computer program that simulates the behaviour of the system and collects statistics
Analyse it “with pen and paper”
Let's try (delay)! Assume a Web service.
Dissecting the response time
Delay components:
Client Processing (prepare request)
Connection Set-up
Sending the request
Server processing the request
Sending the response
Client processing (display response)
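As a rough illustration, the delay components above simply add up to the end-to-end response time. A minimal sketch in Python; every number here is an illustrative assumption, not a value from the lecture:

```python
# Hypothetical end-to-end delay budget (all values are illustrative).
rtt = 0.040  # assumed round-trip time between client and server, seconds

components = {
    "client_processing": 0.002,   # prepare the request
    "connection_setup": rtt,      # 1 RTT for the TCP SYN / SYN-ACK exchange
    "request_transfer": rtt / 2,  # request reaches the server after ~1/2 RTT
    "server_processing": 0.050,   # queuing + service at the server
    "response_transfer": rtt / 2, # response travels back to the client
    "client_display": 0.005,      # render the response
}

response_time = sum(components.values())
print(f"end-to-end response time: {response_time * 1000:.1f} ms")
```

With these assumed numbers the total comes to 137 ms; the point is only that the fixed components add linearly, while the server-processing term hides the queuing delay studied next.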
...Dissecting delays
Connection set-up (assume TCP): SYN / SYN-ACK exchange; 1 round-trip time before the request can be sent
Sending the request: ½ RTT for the request to reach the server
At the server: queuing delay for a server thread, then processing delay (once the request gets a server thread)
The thread will also wait in the CPU job "queue" or the disk queue
Sending the response back
...Dissecting delays
Delay components of "RTT":
Queuing delay (at each link)
Packet processing delay (at each node)
Packet transmission delay (at each link)
Link propagation delay
Delay: observations
Many delays are fixed: propagation delay, packet processing delay, and (for a given packet size) transmission delay
Some are variable: notably, queuing delay
Key Concept
Fundamental concept: contention for a resource leads to users of the resource spending time queuing, or in some way waiting for the resource to be given to them. Calculating this time is what requires sophisticated models, because the time changes with random changes in the system (e.g. traffic volumes, failures) and because it depends on various system mechanisms.
• Focus of this workshop: queuing delay
Queuing Systems
An Introduction to Elementary Queuing Theory
What/Why is a Queue?
The systems whose performance we study are those that have some contention for resources
If there is no contention, performance is in most cases not an issue. When multiple "users/jobs/customers/tasks" require the same resource, use of the resource has to be regulated by some discipline.
…What/Why is a Queue?
When a customer finds a resource busy, the customer may:
Wait in a "queue" (if there is a waiting room)
Or go away (if there is no waiting room, or if the waiting room is full)
Hence the word "queue" or "queuing system"
A "queue" can represent any resource in front of which a queue can form
In some cases an actual queue may not form, but it is called a "queue" anyway.
Examples of Queuing Systems
CPU: customers are processes/threads
Disk: customers are processes/threads
Network link: customers are packets
IP router: customers are packets
ATM switch: customers are ATM cells
Web server threads: customers are HTTP requests
Telephone lines: customers are telephone calls
Elements of a Queue
Server
Waiting room / buffer / queue
Queueing Discipline
Customer Inter-arrival time
Service time
Elements of a Queue
Number of Servers
Size of waiting room/buffer
Service time distribution
Nature of the arrival "process": inter-arrival time distribution, correlated arrivals, etc.
Number of “users” issuing jobs (population)
Queuing discipline: FCFS, priority, LCFS, processor sharing (round-robin)
Elements of a Queue
Number of Servers: 1,2,3….
Size of buffer: 0,1,2,3,…
Service time distribution & Inter-arrival time distribution
Deterministic (constant), Exponential, or General (any)
Population: 1,2,3,…
Queue Performance Measures
Queue Length: Number of jobs in the system (or in the queue)
Waiting time (average, distribution): Time spent in queue before service
Response time: Waiting time+service time
Utilization: Fraction of time server is busy or probability that server is busy
Throughput: Job completion rate
Queue Performance Measures
Let the observation time be T:
A = number of arrivals during time T
C = number of completions during time T
B = total time the system was busy during time T
Then:
Arrival rate λ = A/T
Throughput = C/T
Utilization ρ = B/T
Average service time τ = B/C
Service rate μ = 1/τ
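These operational quantities can be computed directly from observed counts. A small sketch, where the observation window and counts are made-up example values:

```python
# Operational quantities from a hypothetical observation window.
T = 60.0   # observation time, seconds
A = 1200   # arrivals during T
C = 1180   # completions during T
B = 47.2   # total time the server was busy during T, seconds

arrival_rate = A / T           # lambda = A/T  -> 20 req/s
throughput   = C / T           # X = C/T
utilization  = B / T           # rho = B/T
avg_service  = B / C           # tau = B/C -> 0.04 s
service_rate = 1 / avg_service # mu = 1/tau -> 25 req/s
```

Note that no queuing model is needed yet; these are pure bookkeeping identities on measured counts.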
Basic Relationships
In "steady state", for a stable system without loss (i.e. an infinite-buffer system):
Completion rate = arrival rate, since "in-flow = out-flow"
If arrival rate > service rate, then utilization = 1 (the server is never idle)
Utilization ρ = B/T = (B/C) x (C/T) = average service time x completion rate = τλ for a loss-less system
For lossy systems, if p = fraction of requests lost: ρ = τλ(1 − p)
Throughput of a system = utilization x service rate: (C/T) = (B/T) x (C/B)
Little's Law: N = λR
Average number of customers in a queuing system = throughput x average response time
Applicable to any "closed boundary" that contains queuing systems (under some other mild assumptions)
Also, if L is the number in queue (not in service) and W is the waiting time:
L = λW
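Little's law is just a product of two measured quantities; a one-line sketch with illustrative numbers:

```python
# Little's law: N = throughput x average response time (illustrative values).
throughput = 50.0    # jobs/second
avg_response = 0.2   # seconds
N = throughput * avg_response  # average number of jobs inside the boundary
# -> N = 10 jobs
```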
Simple Example (Server)
Assume just one server (single thread)
Requests come in at 3 requests/second
Request processing time = 250 ms
Utilization of the server? 75%
Throughput of the server? 3 requests/second
What if requests come in at 5 requests/second?
Utilization = 100%, throughput = 4 requests/second (the service capacity, 1/0.25)
Waiting time (at 3 requests/second)? W = L/3, where L is the queue length. But what is L? We need a queuing model.
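The utilization-law arithmetic of this example can be sketched directly; the min() expressions encode "utilization caps at 1" and "throughput caps at the service capacity":

```python
lam = 3.0    # arrival rate, requests/second
tau = 0.250  # service time, seconds (capacity = 1/tau = 4 req/s)

utilization = min(1.0, lam * tau)   # 3 x 0.25 = 0.75
throughput  = min(lam, 1.0 / tau)   # 3.0 req/s (under capacity)

# Overload case from the slide: 5 requests/second offered
lam_over = 5.0
util_over = min(1.0, lam_over * tau)   # 1.0: the server is saturated
tput_over = min(lam_over, 1.0 / tau)   # 4.0 req/s: capped at capacity
```

The waiting time, however, cannot be read off these identities; that is where the queuing model comes in.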
Queuing Systems Notation
X/Y/Z/A/B/C
X: inter-arrival time distribution
Y: service time distribution
Z: number of servers
A: buffer size
B: population size
C: queuing discipline
Distributions are denoted by D (Deterministic), M (Exponential/Markovian) or G (General)
E.g.: M/G/4/50/2000/LCFS
Classic Single Server Queue: M/M/1
Exponential service time
Exponential inter-arrival timeThis is the “Poisson” arrival process.
Single Server
Infinite buffer (waiting room)
FCFS discipline
Can be solved very easily using the theory of Markov chains
Exponential Distribution
Memoryless distribution: the distribution of the remaining time does not depend on the elapsed time
Mathematically convenient
Realistic in many situations (e.g. inter-arrival times of calls)
If X is EXP(λ):
P[X ≤ t] = 1 − e^(−λt)
Average value of X = 1/λ
[Figure: CDF and PDF of the exponential distribution, plotted for t from 0 to 10; the CDF rises from 0 toward 1]
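Both the mean 1/λ and the memoryless property are easy to check empirically by sampling. A sketch with an arbitrarily chosen λ = 2:

```python
import random

random.seed(1)
lam = 2.0  # rate parameter (chosen arbitrarily for illustration)

samples = [random.expovariate(lam) for _ in range(200_000)]
mean = sum(samples) / len(samples)  # should be close to 1/lam = 0.5

# Memorylessness: P[X > s+t | X > s] should equal P[X > t]
s, t = 0.3, 0.4
beyond_s = [x for x in samples if x > s]
cond = sum(1 for x in beyond_s if x > s + t) / len(beyond_s)
uncond = sum(1 for x in samples if x > t) / len(samples)
print(f"mean={mean:.3f}, conditional={cond:.3f}, unconditional={uncond:.3f}")
```

With a large sample the conditional and unconditional tail probabilities agree to within sampling noise, which is exactly the "remaining time does not depend on elapsed time" statement.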
Exponential <-> Poisson
When the distribution of the inter-arrival time is Exponential, the "arrival process" is a "Poisson" process.
Properties of a Poisson process with parameter λ:
If Nt = number of arrivals in (0, t], then P[Nt = k] = (λt)^k e^(−λt) / k!
Superposition of Poisson processes is a Poisson process
Splitting a Poisson process results in Poisson processes
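The superposition property can be checked by simulation: merging two independent Poisson streams of rates λ1 and λ2 gives a stream of rate λ1 + λ2. A sketch with invented rates:

```python
import random

random.seed(2)
lam1, lam2 = 3.0, 5.0   # illustrative rates
T = 1000.0              # simulation horizon

def poisson_arrival_count(rate, horizon, rng=random):
    """Count arrivals of a Poisson process by summing EXP(rate) gaps."""
    t, n = 0.0, 0
    while True:
        t += rng.expovariate(rate)
        if t > horizon:
            return n
        n += 1

n1 = poisson_arrival_count(lam1, T)
n2 = poisson_arrival_count(lam2, T)
total_rate = (n1 + n2) / T   # should be close to lam1 + lam2 = 8
print(f"merged arrival rate ~ {total_rate:.2f}")
```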
Important Result!
M/M/1 queue results
Let λ be the arrival rate, τ the mean service time, and μ = 1/τ the service rate
Utilization: ρ = λ/μ = λτ
Mean number of jobs in the system: N = ρ/(1 − ρ)
Throughput: λ
Average response time (Little's law): R = N/λ = τ/(1 − ρ)
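Applying these M/M/1 formulas to the earlier single-server example (3 requests/second, 250 ms service time):

```python
lam = 3.0        # arrival rate, jobs/s
tau = 0.25       # mean service time, s
mu = 1.0 / tau   # service rate = 4 jobs/s

rho = lam / mu            # utilization = 0.75
N = rho / (1.0 - rho)     # mean number of jobs in the system = 3.0
X = lam                   # throughput of a stable queue
R = N / lam               # response time by Little's law = 1.0 s
```

So at 75% utilization, a job whose bare service takes 0.25 s spends 1 s in the system on average; the other 0.75 s is queuing.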
Response Time Graph
Graph illustrates typical behaviour of response time curve
[Figure: response time vs. utilization ρ for τ = 0.5; the curve stays nearly flat at low load and rises steeply as ρ approaches 1]
M/M/1 queue results
For the M/M/1 queue, a formula for the distribution of the response time can also be derived.
The M/M/1 response time is exponentially distributed with parameter μ(1 − ρ), i.e.
P[Response Time ≤ t] = 1 − e^(−μ(1−ρ)t)
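One way to sanity-check the M/M/1 results is a small simulation using the Lindley recursion for a FCFS single-server queue. This is only a sketch; λ = 0.5 and μ = 1 are chosen arbitrarily, giving a theoretical mean response time of 1/(μ(1 − ρ)) = 2 s:

```python
import random

random.seed(3)
lam, mu = 0.5, 1.0
rho = lam / mu                        # 0.5
theory_R = 1.0 / (mu * (1.0 - rho))   # 2.0 s

n = 100_000
wait = 0.0
total_response = 0.0
for _ in range(n):
    service = random.expovariate(mu)
    total_response += wait + service
    gap = random.expovariate(lam)            # next inter-arrival time
    wait = max(0.0, wait + service - gap)    # Lindley recursion

emp_R = total_response / n
print(f"empirical mean response {emp_R:.2f} s vs theory {theory_R:.2f} s")
```

The empirical mean lands close to the theoretical 2 s; a histogram of the per-job response times would likewise match the exponential CDF above.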
M/G/1 single server queue
General service time distribution
Mean number in system: N = ρ + ρ^2(1 + C^2) / (2(1 − ρ)), where C = standard deviation/mean of the service time
Called the Pollaczek-Khinchin (P-K) mean value formula
Mean response time follows by Little's law
M/G/1 delay
Mean response time by Little's law:
R = N/λ = τ + ρτ(1 + C^2) / (2(1 − ρ))
For constant service time (M/D/1 queue, C = 0):
M/D/1 mean response time = τ + ρτ/(2(1 − ρ))
M/D/1 mean waiting time = ρτ/(2(1 − ρ))
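The P-K formula makes the effect of service-time variability explicit. A sketch comparing deterministic and exponential service at the same (illustrative) load ρ = 0.8; note that with C^2 = 1 the formula collapses to the M/M/1 result ρ/(1 − ρ):

```python
def pk_mean_jobs(rho, c2):
    """Pollaczek-Khinchin mean number in an M/G/1 system.

    rho: utilization (< 1); c2: squared coefficient of variation
    of the service time (0 = deterministic, 1 = exponential)."""
    return rho + rho * rho * (1.0 + c2) / (2.0 * (1.0 - rho))

rho = 0.8
n_md1 = pk_mean_jobs(rho, 0.0)   # deterministic service (M/D/1): 2.4 jobs
n_mm1 = pk_mean_jobs(rho, 1.0)   # exponential service (M/M/1): 4.0 jobs
```

At the same load, exponential service holds two-thirds more jobs in the system than deterministic service; higher variability means longer queues.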
Queue Length by P-K formula
Coefficient of variation for:
• Det: 0
• Uniform(10,50): 0.222
• Erlang-2: 0.5
• Exp: 1
• Gen: 3
[Figure: queue length by the P-K formula vs. load from 0 to 1, one curve each for Det, Erlang-2, Unif(10,50), Exp and General service times; the higher the variability, the longer the queue at a given load]
Multiple server queue: M/M/c
One queue, c servers
Utilization: ρ = λ/(cμ); cρ = λ/μ = average number of busy servers
A queue length equation exists (not shown here)
For c = 2, the queue length is: N = 2ρ/(1 − ρ^2)
Average response time? Little's Law! For c = 2, R = N/λ
Important quantity: a = λ/μ, termed traffic intensity or offered load
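A quick sketch of the c = 2 case with invented rates λ = 1.2 and μ = 1:

```python
lam = 1.2   # arrival rate (illustrative)
mu = 1.0    # per-server service rate (illustrative)
c = 2

a = lam / mu    # offered load (traffic intensity) = 1.2 Erlangs
rho = a / c     # per-server utilization = 0.6

N = 2 * rho / (1 - rho * rho)   # mean jobs in an M/M/2 system = 1.875
R = N / lam                      # response time by Little's law = 1.5625 s
```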
Finite Buffer Models: M/M/c/K
c servers, buffer size K (total in the system can be c+K)
If a request arrives when system is full, request is dropped
For this system, there will be a notion of loss, or blocking, in addition to response time etc. Blocking probability (p) is probability that arriving request finds the system full
Response time is relevant only for requests that are “accepted”
...Finite Buffer Queues
Arrival rate: λ; service rate: μ
Throughput? λ(1 − p)
Utilization? λ(1 − p)/(cμ)
Blocking probability? Queue length (N)? Formulas exist (from the queuing model)
Waiting time? (Little's law: N/(λ(1 − p)))
Finite Buffer Queue: Asymptotic Behavior
As offered load increases (λ → ∞):
Utilization (ρ) → 1
Throughput → cμ
Blocking probability p → 1
Queue length N → c + K
Waiting time → K/(cμ) + 1/μ
Finite Buffer (Loss Models)
M/M/c/0: Poisson arrivals, exponential service time, c servers, no waiting room.
Represents which kind of systems?Circuit-switched telephony! (Servers are lines or trunks, Service time is termed “holding time”)
Interesting measure for this queue: probability that arriving call finds all lines busy
Erlang-B formula
Blocking probability (probability that an arriving call finds all s servers busy), with offered load a = λ/μ:
B(s, a) = (a^s / s!) / [Σ(k = 0 to s) a^k / k!]
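The formula above can be evaluated with the standard numerically stable recursion B(k) = a·B(k−1) / (k + a·B(k−1)), which avoids computing large factorials directly. A sketch (the 10-line / 7-Erlang example is invented):

```python
def erlang_b(servers, offered_load):
    """Erlang-B blocking probability via the standard recursion."""
    b = 1.0  # B(0, a) = 1: with no servers, every call is blocked
    for k in range(1, servers + 1):
        b = offered_load * b / (k + offered_load * b)
    return b

# e.g. 10 telephone lines offered 7 Erlangs of traffic:
p_block = erlang_b(10, 7.0)
print(f"blocking probability: {p_block:.4f}")
```

Adding lines drives the blocking probability down, which is how trunk groups are sized against a target grade of service.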
Examples
Example-1
You are developing an application server where the performance requirements are:
Average response time < 3 seconds
Forecasted arrival rate = 0.5 requests/second
What should be the budget for service time of requests in the server?
Answer: <1.2 seconds.
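The 1.2-second answer follows from the M/M/1 response-time formula R = τ/(1 − λτ), solved for τ:

```python
lam = 0.5        # forecast arrival rate, requests/second
R_target = 3.0   # required mean response time, seconds

# M/M/1: R = tau / (1 - lam*tau)  =>  tau = R / (1 + lam*R)
tau_budget = R_target / (1.0 + lam * R_target)   # = 3 / 2.5 = 1.2 s
```

So as long as the mean service time stays under 1.2 s, the forecast load of 0.5 req/s keeps the average response time under 3 s (under the M/M/1 assumptions).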
Example-2
If you have two servers, is it better to:
split the incoming requests into two queues, one per server,
or put them in one queue, where the first in queue is sent to whichever server is idle,
or replace the two servers by one server twice as fast,
for minimizing response times?
Example-2 contd.
Verify intuition by model. Let λ be the arrival rate and τ the service time.
Calculate the response times and order the cases by response time.
Answer: R3 < R2 < R1
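The ordering can be verified numerically; λ = 1.5 and μ = 1 below are arbitrary illustrative values (any stable choice gives the same ordering):

```python
lam = 1.5   # total arrival rate (illustrative)
mu = 1.0    # per-server service rate, i.e. service time 1/mu

# Case 1: split traffic into two independent M/M/1 queues
rho1 = (lam / 2) / mu
R1 = (1 / mu) / (1 - rho1)             # = 4.0 s

# Case 2: one shared queue feeding two servers (M/M/2)
rho2 = lam / (2 * mu)
N2 = 2 * rho2 / (1 - rho2 * rho2)
R2 = N2 / lam                          # ~ 2.29 s

# Case 3: one server twice as fast (M/M/1 with rate 2*mu)
R3 = (1 / (2 * mu)) / (1 - lam / (2 * mu))   # = 2.0 s
```

Sharing the queue beats splitting it (no server sits idle while the other has a backlog), and one double-speed server beats both because even a lone job is served twice as fast.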
Example-3: ATM Link Model
Assume ATM link, Poisson arrivals, infinite buffer
Link b/w: 10 Mbps
Packet size: 53 bytes
Packet transmission time = 42.4 μs
Packet inter-arrival time = 50 μs
Example-3 contd.
Delay through link: node processing delay (negligible) + queuing delay (waiting time)+ transmission delay + propagation delay
Queuing delay? M/D/1 waiting time = (42.4 x 42.4/50) / (2 x (1 − 42.4/50)) ≈ 118.3 μs
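The arithmetic, spelled out (everything is in microseconds, taken from the slide's numbers):

```python
trans = 42.4   # packet transmission (service) time, microseconds
inter = 50.0   # mean inter-arrival time, microseconds

rho = trans / inter                    # utilization = 0.848
wait = rho * trans / (2 * (1 - rho))   # M/D/1 mean waiting time ~ 118.3 us
delay = wait + trans                   # queuing + transmission ~ 160.7 us
print(f"waiting {wait:.1f} us, link delay {delay:.1f} us")
```

At 84.8% utilization the queuing delay is nearly three times the transmission time itself.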
Example-4: Multi-threaded Server
Assume multi-threaded server. Arriving requests are put into a buffer. When a thread is free, it picks up the next request from the buffer.
Execution time: mean = 200 ms; traffic = 30 requests/sec
How many threads should we configure? (assume enough hardware)
Traffic intensity = 30 x 0.2 = 6 = average number of busy servers, so at least 6 threads
Response time = ? (Which formula?)
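The slide leaves the formula open. If we model the thread pool as an M/M/c queue, the Erlang-C formula gives the probability an arriving request must wait, from which the mean response time follows. A sketch; the M/M/c assumption and the candidate thread counts are ours, not the lecture's:

```python
import math

def erlang_c_wait_prob(c, a):
    """Probability an arrival must wait, M/M/c queue with offered load a
    Erlangs (requires a < c for stability)."""
    rho = a / c
    top = (a ** c) / (math.factorial(c) * (1 - rho))
    bottom = sum(a ** k / math.factorial(k) for k in range(c)) + top
    return top / bottom

lam, s = 30.0, 0.2   # 30 req/s, 200 ms mean execution time (from the slide)
a = lam * s          # offered load = 6 Erlangs
mu = 1 / s

for c in range(7, 13):               # need c > 6 threads for stability
    pw = erlang_c_wait_prob(c, a)
    W = pw / (c * mu - lam)          # mean wait in the buffer
    R = W + s                        # mean response time
    print(f"{c:2d} threads: P(wait)={pw:.3f}, R={R * 1000:.0f} ms")
```

Six threads is only the bare minimum; the table shows how a few extra threads shrink both the chance of waiting and the mean response time, which is how one would pick the actual pool size against a response-time target.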
...Example-4
Related question: estimate average memory requirement of the server.
Likely to have: a constant component + a dynamic component
The dynamic component is related to the number of active threads
Suppose the memory requirement of one active thread = M
Avg. memory requirement = constant + M x 6
Example-5: Hardware Sizing
Consider the following scenario:
An application server runs on a 24-CPU machine
The server seems to peak at 320 transactions per second
We need to scale to 400
The hardware vendor recommends going to a 32-CPU machine
Should you?
Example-5: Hardware Sizing
First do bottleneck analysis! Suppose the logs show that at 320 transactions per second, CPU utilization is 67%. What is the problem? What is the solution?
Example-5: Hardware Sizing
Most likely explanation: the number of threads in the server is less than the number of CPUs
Possible diagnosis: the server has 16 threads configured
Each thread has a capacity of 20 transactions per second
Total capacity: 16 x 20 = 320 requests/second. At this load, the 16 threads will be 100% busy, and average CPU utilization will be 16/24 ≈ 67%
Solution: increase the number of threads; no need for more CPUs.
Part II
Applications to Software Performance Testing and Measurement, Network Models and Service Performance
Software Performance Testing
Typical Scenario
M clients issue requests, wait for response, then “think”, then issue request again
Let 1/λ be the mean think time, and 1/μ the mean service time.
The scenario is actually a "closed" queuing network
Clients
Server
Observations
Throughput? Arrival rate?
At steady state, the "flow" through both queues is equal
Server throughput = server utilization x service rate = U x μ
A request is generated by each user once every [think time + response time] = 1/λ + R
Overall request arrival rate = M / (1/λ + R) = U x μ
Observations
M / (1/λ + R) = U x μ = Throughput
Response time = number of clients/throughput − think time
As the number of clients increases, the response time can be shown to tend to: M/μ − 1/λ
A linear function of M: increase M by 1, and the response time increases by 1/μ
Metrics from model
Saturation number (number of clients after which system is “saturated”)
M* = 1 + μ/λ
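These closed-system formulas can be sketched with illustrative parameters (a 10 s think time and a 200 ms service time, both invented):

```python
lam = 0.1   # 1/think-time: mean think time = 10 s (illustrative)
mu = 5.0    # service rate: mean service time = 0.2 s (illustrative)

# Saturation number: clients the system supports before the server saturates
M_star = 1 + mu / lam   # = 51 clients

def asymptotic_R(M):
    """Asymptotic response time for M clients beyond saturation."""
    return M / mu - 1 / lam   # M/mu minus the think time

R_100 = asymptotic_R(100)   # 100/5 - 10 = 10 s
```

Beyond M* = 51 clients, every additional client adds one service time (1/μ = 0.2 s) to everyone's response time; this is the straight-line region of the R-vs-M graph.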
[Figure: response time R vs. number of clients M; flat for small M, then linear with slope 1/μ beyond the saturation number M*]
Case Study
Software Performance Measurement
Queuing Networks
Jobs arrive at certain queues (open queuing networks)
After receiving service at one queue (i), they proceed to another server (j) with some probability pij, or exit
Open Queuing Network - measures
Maximum Throughput
Bottleneck Server
Throughput
Open Queuing Network - measures
Total time spent in the system before completion (overall response time, from the point of view of the user)
start finish
Open Queuing Network: Example 1
W: Web Server
A1, A2: Application Server 1, 2
D1, D2: Database Server 1,2
[Figure: queuing network diagram; requests enter at W, which routes to A1 (pwA1) or A2 (pwA2) or exits (pw0); A1 calls D1 (pA1D1) and A2 calls D2 (pA2D2); A1 and A2 return to W (pA1w, pA2w)]
...Open Queuing Network- Example 1
Each server has different service time
But what is the request rate arriving to each server?
Need to calculate this using flow equations
...Open Queuing Network- Example 1
Equations for the average number of visits to each server before a request leaves the network:
vA1 = pwA1 · vW + vD1,   vD1 = pA1D1 · vA1
vA2 = pwA2 · vW + vD2,   vD2 = pA2D2 · vA2
vW = 1 + pA1w · vA1 + pA2w · vA2
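The visit-count equations can be solved numerically, e.g. by fixed-point iteration. The routing probabilities below are invented for illustration, with the extra assumption that each application server either calls its database or returns to W (pA1w = 1 − pA1D1, pA2w = 1 − pA2D2):

```python
# Illustrative routing probabilities (assumed values, not from the lecture):
p_wA1, p_wA2 = 0.5, 0.3     # W routes to A1 / A2; exits with p_w0 = 0.2
p_A1D1, p_A2D2 = 0.6, 0.4   # A1 / A2 call their database...
p_A1w = 1 - p_A1D1          # ...otherwise they return to W (assumption)
p_A2w = 1 - p_A2D2

# Fixed-point iteration on the visit-count equations:
vW = vA1 = vA2 = vD1 = vD2 = 0.0
for _ in range(1000):
    vA1 = p_wA1 * vW + vD1
    vA2 = p_wA2 * vW + vD2
    vD1 = p_A1D1 * vA1
    vD2 = p_A2D2 * vA2
    vW = 1 + p_A1w * vA1 + p_A2w * vA2

print(f"vW={vW:.2f}, vA1={vA1:.2f}, vD1={vD1:.2f}, vA2={vA2:.2f}, vD2={vD2:.2f}")
```

With these numbers the iteration converges to vW = 5, vA1 = 6.25, vD1 = 3.75, vA2 = 2.5, vD2 = 1. Multiplying each visit count by the external arrival rate then gives the per-server request rate needed for the bottleneck and response-time calculations.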
...Open Queuing Network- Example 1
A spreadsheet calculation shows:
How the bottleneck server changes
How the throughput changes
How the response time changes
The results are exact for only some types of networks; for others, they are approximate