Performance Evaluation of Computer Systems: Introduction
Uploaded by elwin-robinson
1
Performance Evaluation of Computer Systems
Introduction
2
Outline
Introduction to performance evaluation
Objectives of performance evaluation
Techniques of performance evaluation
Metrics in performance evaluation
3
Introduction
Computer system users, administrators, and designers are all interested in performance evaluation.
The goal in system performance evaluation is to provide the highest performance at the lowest cost.
Computer performance evaluation has an important role in the selection of computer systems, the design of systems and applications, and the analysis of existing systems.
4
Objectives of Performance Study
Evaluating design alternatives (system design)
Comparing two or more systems (system selection)
Determining the optimal value of a parameter (system tuning)
Finding the performance bottleneck (bottleneck identification)
Characterizing the load on the system (workload characterization)
Determining the number and sizes of components (capacity planning)
Predicting the performance at future loads (forecasting)
5
Basic Terms
System: Any collection of hardware, software and network.
Metrics: Criteria used to analyze the performance of the system or its components.
Workloads: The requests made by the users of the system.
6
Performance Evaluation Activities
Performance evaluation of a system can be done at different stages of system development.
System in planning and design stage
  Use high-level models to obtain performance estimates for alternative system configurations and alternative designs.
System is operational
  Measure the system behavior with a view to improving performance.
  Develop a validated model that can be used for performance prediction and capacity planning.
7
Techniques for Performance Evaluation
Performance measurement
  Obtain measurement data by observing the events and activities on an existing system
Performance modeling
  Represent the system by a model and manipulate the model to obtain information about system performance
8
Performance Measurement
Measure the performance directly on a system
Need to characterize the workload placed on the system during measurement
Generally provides the most valid results
Nevertheless, not very flexible
  May be difficult (or even impossible) to vary some workload parameters
9
Performance Modeling
Model
  An abstraction of the system obtained by making a set of assumptions about how the system works
  Captures the essential characteristics of the system
Reasons for using models
  Experimenting with the real system may be too costly, too risky, or too disruptive to system operation
  System may only be in the design stage
10
Performance Modeling
Workload characterization
  Capture the resource demands and intensity of the load brought to the system
Performance metrics
  The measure of interest, such as mean response time, the number of transactions completed per second, the ratio of blocked connection requests, etc.
11
Performance Modeling
Solution methods
  Analytic modeling
  Simulation modeling
12
Analytic Modeling
Mathematical methods are used to obtain solutions to the performance measures of interest
Numerical results are easy to compute if a simple analytic solution is available
Useful approach when one only needs rough estimates of performance measures
Solutions to complex models may be difficult to obtain
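As an illustration of a simple analytic solution, the sketch below evaluates the standard closed-form results for an M/M/1 queue (Poisson arrivals, exponential service, one server). The M/M/1 model, the function name `mm1_metrics`, and the example rates are assumptions chosen for illustration; they do not come from the slides.

```python
def mm1_metrics(arrival_rate: float, service_rate: float) -> dict:
    """Closed-form steady-state measures for an M/M/1 queue (assumed model)."""
    if arrival_rate <= 0 or service_rate <= 0:
        raise ValueError("rates must be positive")
    rho = arrival_rate / service_rate  # server utilization
    if rho >= 1:
        raise ValueError("unstable queue: arrival rate must be below service rate")
    return {
        "utilization": rho,
        # Mean time a job spends in the system (waiting plus service).
        "mean_response_time": 1.0 / (service_rate - arrival_rate),
        # Mean number of jobs in the system.
        "mean_jobs_in_system": rho / (1.0 - rho),
    }


# Hypothetical workload: 8 requests/s offered to a server that completes 10 requests/s.
metrics = mm1_metrics(arrival_rate=8.0, service_rate=10.0)
print(metrics)
```

Once the formula is known the numbers cost almost nothing to compute, but they are only as good as the assumptions behind the model.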
13
Simulation Modeling
Develop a simulation program that implements the model
Run the simulation program and use the data collected to estimate the performance measures of interest
A system can be studied at an arbitrary level of detail
It may be costly to develop and run the simulation program
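A minimal sketch of such a simulation program follows, assuming a single first-come-first-served server with exponentially distributed interarrival and service times. The function name `simulate_single_server`, the rates, the run length, and the seed are all hypothetical choices, not taken from the slides.

```python
import random


def simulate_single_server(arrival_rate, service_rate, num_jobs, seed=1):
    """Simulate a FIFO single-server queue.

    Returns (estimated mean response time, estimated server utilization).
    """
    rng = random.Random(seed)
    arrival = 0.0          # arrival time of the current job
    server_free_at = 0.0   # time at which the server finishes its current work
    total_response = 0.0
    busy_time = 0.0
    for _ in range(num_jobs):
        arrival += rng.expovariate(arrival_rate)   # next arrival instant
        service = rng.expovariate(service_rate)    # service demand of this job
        start = max(arrival, server_free_at)       # wait if the server is busy
        server_free_at = start + service
        total_response += server_free_at - arrival
        busy_time += service
    # Utilization is busy time divided by the elapsed time up to the last departure.
    return total_response / num_jobs, busy_time / server_free_at


mean_rt, utilization = simulate_single_server(8.0, 10.0, num_jobs=200_000)
print(f"mean response time ~ {mean_rt:.3f}, utilization ~ {utilization:.3f}")
```

The estimates vary with the seed and the run length. Under these assumptions a long run should approach the analytic M/M/1 mean response time of 1/(service rate - arrival rate), which gives one way to cross-check the simulation.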
14
Stochastic Model
Model contains some random input components, which are characterized by probability distributions, e.g., time between arrivals to a system by an exponential distribution
Output is also random, and provides probability distributions of the performance measures of interest
15
Queuing Model
The most commonly used model to analyze the performance of computer systems and networks.
Single queue: models a component of the overall system, such as CPU, disk, communication channel
Network of queues: models system components and their interaction.
16
Steps in Performance Modeling
17
Commonly Used Performance Metrics
Response Time
  Turnaround time
  Reaction time
  Stretch factor
Throughput
  Operations/second
  Jobs per second
  Requests per second
  Millions of Instructions Per Second (MIPS)
  Millions of Floating Point Operations Per Second (MFLOPS)
  Packets Per Second (PPS)
  Bits per second (bps)
  Transactions Per Second (TPS)
Efficiency
Utilization
18
Commonly Used Performance Metrics (Cont…)
Reliability
  R(t)
  MTTF
Availability
  Mean Time to Failure (MTTF)
  Mean Time to Repair (MTTR)
  MTTF/(MTTF+MTTR)
19
Response Time
Interval between user’s request and system response
[Figure: timeline marking the user's request and the system's response]
20
Response Time (cont…)
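Can have two measures of response time
Both are acceptable, but measure 2 is preferred if execution is long
[Figure: timeline with events in order: user starts request, user finishes request, system starts execution, system starts response, system finishes response]
Reaction time: from user finishes request to system starts execution
Response time 1: from user finishes request to system starts response
Response time 2: from user finishes request to system finishes response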
21
Response Time (cont…)
Turnaround time: time between submission of a job and completion of output
  For batch job systems
Reaction time: time between submission of a request and beginning of execution
  Usually need to measure inside the system, since nothing is externally visible
Stretch factor: ratio of response time at a given load to response time at minimal load
  Most systems have higher response time as load increases
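The three measures above can be computed from timestamps as in the sketch below. The `RequestTrace` record, its field names, and the sample values are hypothetical; in practice the timestamps would come from instrumentation inside the system.

```python
from dataclasses import dataclass


@dataclass
class RequestTrace:
    submitted: float         # time the request or job was submitted
    exec_started: float      # time the system began executing it
    output_completed: float  # time the system finished producing output


def reaction_time(trace: RequestTrace) -> float:
    """Submission of the request to beginning of execution."""
    return trace.exec_started - trace.submitted


def turnaround_time(trace: RequestTrace) -> float:
    """Submission of the job to completion of output."""
    return trace.output_completed - trace.submitted


def stretch_factor(response_at_load: float, response_at_min_load: float) -> float:
    """Response time at a given load divided by response time at minimal load."""
    return response_at_load / response_at_min_load


trace = RequestTrace(submitted=0.0, exec_started=0.2, output_completed=1.5)
print(reaction_time(trace), turnaround_time(trace))
print(stretch_factor(response_at_load=3.0, response_at_min_load=1.5))
```

A stretch factor above 1 means that response time has degraded relative to the lightly loaded system.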
22
Throughput
Rate at which requests can be serviced by system (requests per unit time)
23
Efficiency
Ratio of maximum achievable throughput (ex: 9.8 Mbps) to nominal capacity (ex: 10 Mbps), giving 98%
For multiprocessor systems, ratio of the performance of an n-processor system to that of a one-processor system (in MIPS or MFLOPS)
[Figure: efficiency versus number of processors]
24
Utilization
Typically, fraction of time the resource is busy serving requests
  Time not being used is idle time
System managers often want to balance resources to have the same utilization
  Ex: equal load on CPUs
  But may not be possible. Ex: CPU when I/O is the bottleneck
May not be measured in time
  Processors: busy / total
  Memory: fraction used / total
25
Miscellaneous Metrics
Reliability
  Probability of errors or mean time between errors (error-free seconds)
Availability
  Fraction of time system is available to service requests (fraction not available is downtime)
  Mean Time To Failure (MTTF) is mean uptime
    Useful, since even when availability is high (downtime small), failures may still be frequent, which is no good for long requests
26
Definition of Reliability
Recommendation E.800 of the International Telecommunication Union (ITU-T) defines reliability as follows:
“The ability of an item to perform a required function under given conditions for a given time interval.”
In this definition, an item may be a circuit board, a component on a circuit board, a module consisting of several circuit boards, a base transceiver station with several modules, a fiber-optic transport-system, or a mobile switching center (MSC) and all its subtending network elements. The definition includes systems with software.
27
Basic Definitions of Reliability
X: time to failure of a system
F(t): distribution function of system lifetime
f(t): density function of system lifetime
Reliability R(t):
$R(t) = P(X > t) = 1 - F(t)$
Mean Time To system Failure:
$MTTF = E[X] = \int_0^\infty t\, f(t)\, dt = \int_0^\infty R(t)\, dt$
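As a worked instance of the definitions R(t) = P(X > t) = 1 - F(t) and MTTF = E[X], assume an exponentially distributed lifetime with constant failure rate λ. The exponential assumption is an illustration only and is not stated on the slide.

```latex
\begin{align*}
F(t) &= 1 - e^{-\lambda t}, \qquad f(t) = \lambda e^{-\lambda t}, \\
R(t) &= P(X > t) = 1 - F(t) = e^{-\lambda t}, \\
MTTF &= \int_0^\infty R(t)\, dt = \int_0^\infty e^{-\lambda t}\, dt = \frac{1}{\lambda}.
\end{align*}
```

Under this assumption, a failure rate of 0.001 per hour corresponds to an MTTF of 1000 hours.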
28
Definition of Availability
Availability is closely related to reliability, and is also defined in ITU-T Recommendation E.800 as follows:
"The ability of an item to be in a state to perform a required function at a given instant of time or at any instant of time within a given time interval, assuming that the external resources, if required, are provided."
An important difference between reliability and availability is that reliability refers to failure-free operation during an interval, while availability refers to failure-free operation at a given instant of time, usually the time when a device or system is first accessed to provide a required function or service.
29
Availability (Cont…)
Instantaneous (point) availability A(t):
$A(t) = P(\text{system working at } t)$
g(t): density function of system repair time
Let H(t) be the convolution of F and G:
$H(t) = \int_0^t F(t - x)\, g(x)\, dx$
Then:
$A(t) = R(t) + \int_0^t A(t - x)\, dH(x)$    (1)
Instantaneous availability is never less than reliability:
$A(t) \ge R(t)$
30
Availability (Cont…)
[Figure: timeline from 0 to t, with the first repair completed in (x, x + dx)]
System working at time t, in one of two ways:
  Never failed in (0, t), prob: R(t)
  First failed and got repaired at time x < t, and UP at end of interval (x, t), prob: $\int_0^t A(t - x)\, dH(x)$
31
Availability (Cont…)
MTTR: Mean Time to Repair
Y: repair period of the system
$MTTR = E[Y] = \int_0^\infty t\, g(t)\, dt$
Availability and Reliability are related but different!
32
Availability (Cont…)
We can show from equation (1) that the steady-state availability is:
$A_{ss} = \frac{MTTF}{MTTF + MTTR}$
Also:
$\text{downtime (in minutes per year)} = (1 - A_{ss}) \times 8760 \times 60$
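The two formulas on this slide, steady-state availability MTTF/(MTTF+MTTR) and yearly downtime (1 - A_ss) * 8760 * 60, can be evaluated as in the sketch below. The function names and the MTTF and MTTR values are hypothetical and chosen only to make the arithmetic easy to follow.

```python
MINUTES_PER_YEAR = 8760 * 60  # 8760 hours in a non-leap year


def steady_state_availability(mttf: float, mttr: float) -> float:
    """A_ss = MTTF / (MTTF + MTTR); both arguments in the same time unit."""
    return mttf / (mttf + mttr)


def downtime_minutes_per_year(availability: float) -> float:
    """Expected downtime in minutes per year for a given steady-state availability."""
    return (1.0 - availability) * MINUTES_PER_YEAR


# Hypothetical system: fails on average every 999 hours, takes 1 hour to repair.
a_ss = steady_state_availability(mttf=999.0, mttr=1.0)
print(f"A_ss = {a_ss:.4f}, downtime = {downtime_minutes_per_year(a_ss):.1f} min/year")
```

With these values the availability is 0.999 ("three nines"), which still allows roughly 525.6 minutes of downtime per year.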
33
High Reliability/Availability/Safety
Traditional applications (long-life/life-critical/safety-critical)
  Space missions, aircraft control, defense, nuclear systems
New applications (non-life-critical/non-safety-critical, business critical)
  Banking, airline reservation, e-commerce applications, web-hosting, telecommunication
Scientific applications (non-critical)
34
Motivation – High Availability
35
IFIP WG10.4
Failure occurs when the delivered service no longer complies with the specification
Error is that part of the system state which is liable to lead to subsequent failure
Fault is the adjudged or hypothesized cause of an error
Faults are the cause of errors that may lead to failures
Fault → Error → Failure
36
Three Rules of Validation
Do not trust the results of a simulation model until they have been validated by analytical modeling or measurements.
Do not trust the results of an analytical model until they have been validated by a simulation model or measurements.
Do not trust the results of a measurement until they have been validated by simulation or analytical modeling.