System Performance & Scalability i206 Fall 2010 John Chuang.


System Performance & Scalability

i206 Fall 2010

John Chuang


http://bits.blogs.nytimes.com/2007/11/26/yahoos-cybermonday-meltdown/index.html

Computing Trends

Multi-core CPUs

Data centers

Cloud computing

What are the drivers?
- scalability, availability, cost-effectiveness

[Figure: clients accessing a service that is backed by multiple servers]

Lecture Outline

Performance Metrics

Availability

Queuing theory
- M/M/1 queue

Scalability
- M/M/m queue


What is Performance?

Users want fast response time and high availability

Managers want happy users, and many of them, while minimizing cost

What are standard measures of system performance?

Performance Metrics

Response time (seconds)

Throughput (MIPS, Mbps, TPS, ...)

Resource utilization (%)

Availability (%)


Availability

Availability    Down-time per year    One hour down-time per:
90%             36 days               9 hours
99%             3.7 days              4.1 days
99.9%           9 hours               41.6 days
99.99%          53 minutes            1.14 years
99.999%         5 minutes             11.41 years


Availability = MTTF / (MTTF + MTTR)
- Mean-time-to-failure (MTTF)
- Mean-time-to-recover (MTTR)
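The availability formula and the down-time column can be checked with a few lines of Python. This is a minimal sketch: the function names are hypothetical, and a 365-day year is assumed.

```python
HOURS_PER_YEAR = 365 * 24  # assumption: non-leap year


def availability(mttf: float, mttr: float) -> float:
    """Availability = MTTF / (MTTF + MTTR); both arguments in the same time unit."""
    if mttf < 0 or mttr < 0 or mttf + mttr == 0:
        raise ValueError("MTTF and MTTR must be non-negative and not both zero")
    return mttf / (mttf + mttr)


def downtime_hours_per_year(avail: float) -> float:
    """Expected hours of down-time in one year at the given availability."""
    return (1.0 - avail) * HOURS_PER_YEAR


if __name__ == "__main__":
    # Reproduce the "Down-time per year" column, expressed in hours.
    for a in (0.9, 0.99, 0.999, 0.9999, 0.99999):
        print(f"{a:.3%} available: {downtime_hours_per_year(a):8.2f} hours down per year")
    # A system that fails every 999 hours on average and takes 1 hour to recover.
    print(availability(mttf=999.0, mttr=1.0))
```

Note that availability can be raised either by failing less often (larger MTTF) or by recovering faster (smaller MTTR).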

Response Time

[Figure: components of response time between client and server: formulate request, message latency (network), queuing time, processing time, message latency (network), interpret response. Adapted from: David Messerschmitt]

[Figure: M/M/1 Queue ($\mu = 100$): response time (s) versus utilization]

Queuing Theory

1. Arrival Process

2. Service Time Distribution

3. Number of Servers

4. System Capacity

5. Customer Population

6. Service Discipline

Source: Raj Jain

Kendall’s Notation (1953)

A/B/c/k/N/D
- A: arrival process
- B: service time distribution
- c: number of servers
- k: system capacity
- N: population size
- D: service discipline

Arrival process and service time distribution:
- M: Markov (exponential, memoryless, random, Poisson)
- D: deterministic
- E: Erlang
- H: hyper-exponential
- G: general

Service discipline:
- FCFS: first come first served
- LCFS: last come first served
- RR: round-robin
- etc.

Example Systems

M/M/1/∞/∞/FCFS (simplified as M/M/1)
- Markovian (Poisson, memoryless) arrival
- Markovian service time
- 1 server
- Infinite system capacity
- Infinite arrival stream
- First-come-first-served discipline

Other examples:
- M/M/1/k (finite capacity)
- M/M/m (m servers)
- G/D/1 (arbitrary arrival, deterministic service time)

M/M/1 Queue

Poisson arrival, with average arrival rate of $\lambda$ jobs/sec

Poisson service, with average service rate of $\mu$ jobs/sec

Single server with infinite queue

System utilization (hopefully < 1): $\rho = \lambda/\mu$

Average number of jobs in system: $N = \sum_n n \cdot p_n = \rho/(1 - \rho)$

System throughput (if $\rho < 1$): $X = \lambda$

Average response time (from Little’s Law): $R = N/X = 1/(\mu - \lambda)$
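The M/M/1 relations above translate directly into code. The sketch below is illustrative only: `MM1Metrics` and `mm1` are hypothetical names, and the rates in the usage lines are made-up values.

```python
from dataclasses import dataclass


@dataclass
class MM1Metrics:
    utilization: float     # rho = lambda / mu
    jobs_in_system: float  # N = rho / (1 - rho)
    throughput: float      # X = lambda (valid when rho < 1)
    response_time: float   # R = N / X = 1 / (mu - lambda), in seconds


def mm1(arrival_rate: float, service_rate: float) -> MM1Metrics:
    """Steady-state metrics of an M/M/1 queue; rates are in jobs per second."""
    if arrival_rate <= 0 or service_rate <= 0:
        raise ValueError("rates must be positive")
    rho = arrival_rate / service_rate
    if rho >= 1:
        raise ValueError("unstable queue: utilization must be below 1")
    n = rho / (1 - rho)
    x = arrival_rate
    return MM1Metrics(rho, n, x, n / x)  # Little's Law: R = N / X


if __name__ == "__main__":
    # Hypothetical server: 30 jobs/sec arriving, 50 jobs/sec service capacity.
    m = mm1(arrival_rate=30.0, service_rate=50.0)
    print(f"rho={m.utilization:.2f} N={m.jobs_in_system:.2f} R={m.response_time * 1000:.1f} ms")
```

Rejecting $\rho \geq 1$ is a deliberate choice: the closed-form results only hold for a stable queue, so silently returning a negative response time would be misleading.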

Example: Web Server

Web server receives 40 requests/second

Web server can process 100 requests/second

What is server utilization?

At any given time, how many requests are at server (waiting plus being processed)?

What is the mean total delay at server (waiting plus processing)?

What happens when traffic rate doubles?

Example: Web Server

$\lambda = 40$ requests/second

$\mu = 100$ requests/second

Utilization: $\rho = \lambda/\mu = 40/100 = 40\%$

Number of requests: $N = \rho/(1 - \rho) = 0.67$

Average time spent at server: $R = N/X = 0.67/40 \approx 17$ ms

Example: Traffic Doubled

$\lambda = 80$ requests/second

$\mu = 100$ requests/second

Utilization: $\rho = \lambda/\mu = 80/100 = 80\%$

Number of requests: $N = \rho/(1 - \rho) = 4$

Average time spent at server: $R = N/X = 4/80 = 50$ ms (more than doubled!)

Approaching Congestion

$\lambda = 99$ requests/second

$\mu = 100$ requests/second

Utilization: $\rho = \lambda/\mu = 99/100 = 99\%$

Number of requests: $N = \rho/(1 - \rho) = 99$

Average time spent at server: $R = N/X = 99/99 = 1$ second!
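The three load levels of the web server example can be replayed in a short sketch, along with the inverse question of how much load a server can take for a given response-time target. The inverse formula $\lambda = \mu - 1/R$ is simply $R = 1/(\mu - \lambda)$ rearranged; the function names and the 50 ms target are assumptions for illustration.

```python
def mm1_response_time(arrival_rate: float, service_rate: float) -> float:
    """M/M/1 mean response time R = 1 / (mu - lambda), in seconds."""
    if not 0 < arrival_rate < service_rate:
        raise ValueError("need 0 < arrival_rate < service_rate")
    return 1.0 / (service_rate - arrival_rate)


def max_arrival_rate(service_rate: float, target_response_time: float) -> float:
    """Largest lambda that keeps R at or below the target: lambda = mu - 1/R."""
    limit = service_rate - 1.0 / target_response_time
    if limit <= 0:
        raise ValueError("target is not above the bare service time 1/mu")
    return limit


if __name__ == "__main__":
    MU = 100.0  # requests/second, as in the example
    for lam in (40.0, 80.0, 99.0):
        r = mm1_response_time(lam, MU)
        print(f"lambda={lam:5.1f}/s  rho={lam / MU:.2f}  R={r * 1000:7.1f} ms")
    # Hypothetical target: keep mean response time within 50 ms.
    print(max_arrival_rate(MU, 0.050))
```

The sweep makes the nonlinearity visible: moving from 40% to 80% utilization triples the response time, and the last 19 percentage points multiply it by another factor of 20.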

Utilization Affects Performance

[Figure: M/M/1 Queue ($\mu = 100$): response time (s) versus utilization]

M/M/1/k Queue (Finite Capacity)

$\rho = \lambda/\mu$

$N = \rho/(1-\rho) - (k+1)\rho^{k+1}/(1-\rho^{k+1})$

$R = N/X = N/\lambda_{eff}$
- where $\lambda_{eff} = \lambda(1-P_k)$ = effective arrival rate
- and $P_k = \rho^k(1-\rho)/(1-\rho^{k+1})$ = probability of a full queue

Loss rate = $\lambda - \lambda_{eff}$
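A minimal sketch of the finite-capacity formulas follows. The function name `mm1k` and the returned tuple layout are hypothetical. The special case for $\rho = 1$ is an addition not on the slide: the closed forms divide by zero there, and in the limit all $k+1$ states are equally likely, giving $P_k = 1/(k+1)$ and $N = k/2$.

```python
def mm1k(arrival_rate: float, service_rate: float, k: int):
    """Return (N, P_k, lambda_eff, R, loss_rate) for an M/M/1/k queue.

    k is the system capacity (jobs waiting plus the one in service).
    Unlike M/M/1, rho >= 1 is allowed because excess arrivals are dropped.
    """
    if arrival_rate <= 0 or service_rate <= 0 or k < 1:
        raise ValueError("rates must be positive and k must be at least 1")
    rho = arrival_rate / service_rate
    if abs(rho - 1.0) < 1e-12:
        # Limiting case (assumption stated in the lead-in): uniform over 0..k jobs.
        p_full = 1.0 / (k + 1)
        n = k / 2.0
    else:
        p_full = rho**k * (1 - rho) / (1 - rho ** (k + 1))
        n = rho / (1 - rho) - (k + 1) * rho ** (k + 1) / (1 - rho ** (k + 1))
    lam_eff = arrival_rate * (1 - p_full)  # arrivals that are actually admitted
    return n, p_full, lam_eff, n / lam_eff, arrival_rate - lam_eff


if __name__ == "__main__":
    # Service rate 100 jobs/sec as in the lecture's plots; 80 jobs/sec offered.
    for k in (1, 2, 10, 100):
        n, p_full, lam_eff, r, loss = mm1k(80.0, 100.0, k)
        print(f"k={k:3d}  N={n:.3f}  P_k={p_full:.4f}  X={lam_eff:.2f}/s  "
              f"R={r * 1000:.2f} ms  loss={loss:.2f}/s")
```

The loop shows the trade-off behind the plots: a small buffer keeps response time low by turning jobs away, while a large buffer approaches M/M/1 behavior.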

M/M/1/k Response Time

[Figure: M/M/1 and M/M/1/k Queues ($\mu = 100$): response time (s) versus utilization, with curves for M/M/1, M/M/1/1, M/M/1/2, M/M/1/10, and M/M/1/100]

M/M/1/k Throughput

[Figure: Throughput given service rate $\mu = 100$ jobs/sec: throughput (jobs/sec) versus utilization, with curves for M/M/1, M/M/1/1, M/M/1/2, M/M/1/10, and M/M/1/100]


Scalability

The capability of a system to increase total throughput under an increased load when resources (typically hardware) are added
- Cost of additional resource
- Performance degradation under increased load

Scalability Example

Original web server: can process $\mu$ requests/sec; accepts requests at $\lambda$/sec

Now request rate increases to $10\lambda$/sec and web server is swamped ($\rho = 10\lambda/\mu$)!

Need to add new hardware!

Which is better?

Option 1: One big web server that can process $10\mu$ requests/sec

Option 2: Ten web servers, each can process $\mu$ requests/sec; each accepts 10% of requests ($\lambda$/sec per server)

Option 3: Ten web servers, each can process $\mu$ requests/sec; share single queue (load balancer) that accepts requests at $10\lambda$/sec

[Figure: the three configurations. Option 1: M/M/1 queue with big server (service rate $10\mu$). Option 2: ten M/M/1 queues (each server with service rate $\mu$). Option 3: M/M/10 queue (ten servers with service rate $\mu$ sharing one queue)]

M/M/m Queue (m Servers)

$\rho = \lambda/(m\mu)$

$N = m\rho + \rho\varrho/(1-\rho)$

where $\varrho$ is the probability that an arriving job has to queue (all $m$ servers busy):

$\varrho = \frac{(m\rho)^m}{m!\,(1-\rho)}\, p_0$

and $p_0$ is the probability of an empty system:

$p_0 = \left[ 1 + \frac{(m\rho)^m}{m!\,(1-\rho)} + \sum_{n=1}^{m-1} \frac{(m\rho)^n}{n!} \right]^{-1}$
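A minimal sketch of the M/M/m computation, assuming the standard Erlang C expressions for the empty-system probability $p_0$ and the probability of queueing. The function name `mmm` and its return layout are hypothetical.

```python
from math import factorial


def mmm(arrival_rate: float, service_rate: float, m: int):
    """Return (rho, p0, prob_queueing, N, R) for an M/M/m queue.

    arrival_rate is the total rate into the shared queue; service_rate is
    the rate of ONE server. Rates are in jobs per second.
    """
    if arrival_rate <= 0 or service_rate <= 0 or m < 1:
        raise ValueError("rates must be positive and m must be at least 1")
    rho = arrival_rate / (m * service_rate)
    if rho >= 1:
        raise ValueError("unstable queue: utilization must be below 1")
    a = m * rho  # offered load, lambda / mu
    tail = a**m / (factorial(m) * (1 - rho))
    # The n = 0 term of the sum is the leading "1" in the bracket for p0.
    p0 = 1.0 / (sum(a**n / factorial(n) for n in range(m)) + tail)
    prob_queueing = tail * p0              # all m servers busy on arrival
    n_jobs = a + rho * prob_queueing / (1 - rho)
    return rho, p0, prob_queueing, n_jobs, n_jobs / arrival_rate  # R = N / X


if __name__ == "__main__":
    # Ten servers at 100 requests/sec each, 500 requests/sec in total.
    rho, p0, pq, n, r = mmm(arrival_rate=500.0, service_rate=100.0, m=10)
    print(f"rho={rho:.2f}  P(queue)={pq:.4f}  N={n:.3f}  R={r * 1000:.2f} ms")
```

With `m=1` the same function reduces to the M/M/1 results, which is a convenient sanity check on the formulas.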

Which is Better?

                          Option 1       Option 2       Option 3
                          (M/M/1 big)    (ten M/M/1)    (M/M/10)
Utilization ($\rho$)      0.5            0.5            0.5
Number of requests (N)    1              1*10           5.036
Response Time (R)         2ms            20ms           10.07ms

$m = 10$; $\mu = 100$; $\lambda = 50$

Remember: Scalability is not just about performance!
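The comparison can be reproduced with a single helper, since an M/M/1 queue is the $m = 1$ case of M/M/m. This is a sketch under the same parameters ($m = 10$, $\mu = 100$, $\lambda = 50$ per original server); the helper name `response_time` is hypothetical and it assumes the standard Erlang C result for M/M/m.

```python
from math import factorial


def response_time(arrival_rate: float, service_rate: float, servers: int = 1) -> float:
    """Mean response time (seconds) of an M/M/m queue; servers=1 gives M/M/1."""
    a = arrival_rate / service_rate        # offered load
    rho = a / servers                      # per-server utilization
    if not 0 < rho < 1:
        raise ValueError("need 0 < utilization < 1")
    tail = a**servers / (factorial(servers) * (1 - rho))
    p0 = 1.0 / (sum(a**n / factorial(n) for n in range(servers)) + tail)
    wait_prob = tail * p0                  # probability an arrival must queue
    jobs = a + rho * wait_prob / (1 - rho)
    return jobs / arrival_rate             # Little's Law


M, MU, LAM = 10, 100.0, 50.0
options = {
    # One server ten times as fast, taking all the traffic.
    "Option 1 (M/M/1 big)": response_time(M * LAM, M * MU),
    # Ten independent servers, each with its own queue and a tenth of the traffic.
    "Option 2 (ten M/M/1)": response_time(LAM, MU),
    # Ten servers behind one shared queue.
    "Option 3 (M/M/10)": response_time(M * LAM, MU, servers=M),
}

for name, r in options.items():
    print(f"{name}: {r * 1000:.2f} ms")
```

The shared queue roughly halves the response time of the ten separate queues because no server sits idle while requests wait elsewhere. The single fast server is faster still, but the response time alone says nothing about cost or about what happens when that one machine fails.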