Analyzing Performance test data

Analyzing Performance test data (or how to convert your numbers to information). Carles Roch-Cunill, Test Lead for System Performance, McKesson Medical Imaging Group, [email protected].


Page 1: Analyzing Performance test data
Page 2: Analyzing Performance test data

Analyzing Performance test data

(or how to convert your numbers to information)

Carles Roch-Cunill

Test Lead for System Performance

McKesson Medical Imaging Group

[email protected]

Page 3: Analyzing Performance test data

6/1/2009

Agenda

- Performance testing as an experimental activity

- Very fast review of Scientific Method

- Errors, forget them at your own risk

- About the meaning of data

- Some statistical concepts

- Analyzing data

- Adjusting your data to a model

- Summary

Page 4: Analyzing Performance test data

Performance testing as an experimental activity

There are two approaches to testing:

a) Without added value

– This feature does not work

– This requirement is not met

b) With added value

– This feature does not work, and this module/component/software artifact is the culprit

– This requirement is not met, and it fails for this reason.

Usually, things are not so clear, and testers' statements fall somewhere in the middle.

Because performance testing gathers data that can be analyzed, the performance tester is well positioned to provide added-value information to the team.

Page 5: Analyzing Performance test data

Performance testing as an experimental activity

If you want to provide added value and explain why the requirement is not met, you will:

- Formulate a hypothesis: “My performance degrades due to component X”

- Test the hypothesis by developing an appropriate test environment

- Gather results

- Analyze the results to see if they confirm or reject your hypothesis

If you are lucky and your guess (the hypothesis) was good, you will have explained at least a part of the performance behaviour.

However, there are usually other factors that also influence your performance, so you may have caught only one low-hanging fruit.

Page 6: Analyzing Performance test data

Performance testing as an experimental activity

You can create different tests that put more emphasis on one of the components of the system.

For example, you may want to specifically measure the performance of the data repository tier, or the network, or only the UI.

Depending on where your focus is, your methodology and your tools will change.

In all cases, you need to fix all the parameters but one. For example, if you want to study the influence of the network on your system, you need to do the following:

- Determine the parameters that characterize the network (latency, bandwidth, utilization…)

- Identify whether they are independent or not (utilization and latency may not be independent)

- Modify one parameter at a time while keeping the others constant
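The "fix all the parameters but one" rule can be sketched as a small test-plan generator. This is only an illustration: the parameter names and all numeric values below are hypothetical, and the sketch assumes the parameters can actually be set independently in the test environment.

```python
def one_factor_at_a_time(baseline, variations):
    """Build test configurations that change exactly one parameter from the baseline."""
    plan = []
    for name, values in variations.items():
        for value in values:
            config = dict(baseline)   # copy, so every run starts from the same baseline
            config[name] = value      # vary only this one parameter
            plan.append((name, config))
    return plan


# Hypothetical network parameters and values, for illustration only.
baseline = {"latency_ms": 20, "bandwidth_mbps": 100, "utilization_pct": 30}
variations = {"latency_ms": [50, 100], "bandwidth_mbps": [10]}

plan = one_factor_at_a_time(baseline, variations)
for varied, config in plan:
    print(varied, config)
```

Each generated configuration differs from the baseline in at most one parameter, so any change in the measured performance can be attributed to that parameter.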

Page 7: Analyzing Performance test data

Very fast review of Scientific Method

- An effect has been observed. Example: performance degradation in your application

- You try to reproduce it and learn the conditions to reproduce it at will

- You may gather some data through testing

- To explain the data you formulate a model (hypothesis)

- You refine your testing and tailor it around your model

- You analyze the new data and check if your model fits the data

- If the model fits it, you are on a good footing

- If the model partially fits it, you either refine your model or discard it.

- If the model does not fit it, you formulate another model

- In any case, new data obtained from other tests may force you to modify/rethink or even dump your model.

- Once your data fits the model, you draw conclusions based on the framework provided by the model.

Page 8: Analyzing Performance test data

Very fast review of Scientific Method

Unstated principles:

• Simpler is better

• Same procedure and system, you get the same results.

• A model should not introduce more questions than it answers

• Usually, newer models include the older models as particular cases

• Models are dynamic.

Page 9: Analyzing Performance test data

Errors, forget them at your own risk

Errors happen… so take them into account

There are two main kinds of errors:

- Human error: stopping the watch at the wrong moment, confusing digits…

- Instrument error: your watch is not precise, has a mechanical defect…

Page 10: Analyzing Performance test data

Errors, forget them at your own risk

In the accompanying graph, if the error bar is ± 1, we can say the trend is toward larger values. However, if the error bar is ± 3, we cannot say anything about the trend in this data.
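The effect of the error bar on what can be claimed can be sketched with a deliberately crude criterion: a trend is only reported when the change between the first and last points is larger than the two error bars combined, so their uncertainty intervals do not overlap. The data points are hypothetical (the original graph's values are not available), and this criterion is an assumption made for illustration, not a formal statistical test.

```python
def trend_is_discernible(values, error_bar):
    """Crude check: does the first-to-last change exceed the combined error bars?"""
    change = abs(values[-1] - values[0])
    return change > 2 * error_bar


# Hypothetical measurements, for illustration only.
values = [10.0, 11.0, 12.5, 13.0, 14.0]

print(trend_is_discernible(values, error_bar=1))  # intervals do not overlap
print(trend_is_discernible(values, error_bar=3))  # intervals overlap
```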

Page 11: Analyzing Performance test data

About the meaning of data

Performance testing generates a lot of data. But what does all the data mean? To explain this data you need to take into account:

- Hardware

- Network characteristics

- Network topology

- Physical support for the data tier (storage, database…)

- The architecture of your application

- How your application is coded

- …

Page 12: Analyzing Performance test data

About the meaning of data

In addition, you need to analyze the results in the context of the requirement or the question you are trying to answer.

For example:

“Event A should not take more than x seconds”

In most circumstances involving computer systems, you will have a stochastic component in your distribution. Assuming a normal distribution, you will have something like:

Page 13: Analyzing Performance test data

About the meaning of data

But what exactly does the requirement mean?

Strictly, it means that no occurrence of Event A may take more than x seconds, so the whole distribution must lie below x:

Page 14: Analyzing Performance test data

About the meaning of data

However, the requirement is usually interpreted as:

From a formal point of view, the requirement “Event A should not take more than x seconds” would have failed with the above distribution. However, the statement “The average of Event A should not take more than x seconds” would pass.

Page 15: Analyzing Performance test data

About the meaning of data

The requirement can also be expressed as a percentile.

In this case the requirement will be stated as “Event A should not take more than X seconds 50% of the time”
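The three readings of the requirement (strict, average, percentile) can give different verdicts on the same data. The following is a minimal sketch; the sample timings and the 3.0 second limit are hypothetical values chosen only to show the difference.

```python
from statistics import mean


def check_requirement(samples, limit):
    """Evaluate 'Event A should not take more than `limit` seconds' in three ways."""
    within = sum(1 for s in samples if s <= limit) / len(samples)
    return {
        "strict": max(samples) <= limit,    # every single measurement within the limit
        "average": mean(samples) <= limit,  # only the average within the limit
        "fraction_within": within,          # share of measurements within the limit
    }


# Hypothetical timings in seconds, for illustration only.
samples = [1.8, 2.4, 2.9, 3.1, 3.6]
result = check_requirement(samples, limit=3.0)
print(result)
```

With these numbers the strict reading fails, the average reading passes, and a percentile-style requirement passes or fails depending on the percentage stated in it (here 60% of the measurements are within the limit).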

Page 16: Analyzing Performance test data

Some statistical concepts

Once we have defined the question, we can provide the answer. The answer will be obtained through measurements (either manual or automated).

The more measurements you take, the better your statistics and the better your answers will be.

However, the measurements need to be statistically significant. This means that each measurement is good enough to be included in your statistics.

All the measurements that are included in your statistics need to be statistically equivalent.

Page 17: Analyzing Performance test data

Some statistical concepts

How do you determine if your data is statistically equivalent?

You can apply some complex mathematical analysis or apply common sense.

Some rules of thumb:

- If, in a single set of measurements, 20% of your data is very different, you either have a problem in your test system or you are observing different phenomena.

- If you have done several runs, and the 90th percentile of a new test is bigger (smaller) than the maximum (minimum) of the previous tests, then the new data is not statistically similar and has no statistical significance for your results.

- If you are expecting a specific distribution and you are not getting it, the current set cannot be compared (is not statistically equivalent) to the data you were expecting.

- Outliers are not statistically equivalent to the rest of the set.

Page 18: Analyzing Performance test data

Some statistical concepts

Example of the 90th percentile for Test 3 being bigger than the maximum of the other sets of measurements. In this context Test 3 is not statistically equivalent and will be rejected.
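The 90th-percentile rule of thumb can be sketched as follows. The run data are hypothetical, and the percentile is computed with Python's `statistics.quantiles` (default "exclusive" method); other percentile definitions give slightly different values near the edges.

```python
from statistics import quantiles


def percentile_90(samples):
    """90th percentile: the last of the nine decile cut points."""
    return quantiles(samples, n=10)[-1]


def is_comparable(new_run, previous_runs):
    """Rule of thumb: reject a run whose 90th percentile falls outside
    the min..max range of all previous runs."""
    p90 = percentile_90(new_run)
    prev_max = max(max(run) for run in previous_runs)
    prev_min = min(min(run) for run in previous_runs)
    return prev_min <= p90 <= prev_max


# Hypothetical timings in seconds, for illustration only.
previous_runs = [[2.0, 2.1, 2.2, 2.3, 2.4], [2.1, 2.2, 2.0, 2.5, 2.3]]
similar_run = [2.1, 2.2, 2.3, 2.0, 2.4, 2.2, 2.1, 2.3, 2.2, 2.1]
shifted_run = [3.0, 3.2, 3.1, 3.3, 3.4, 3.1, 3.2, 3.0, 3.3, 3.5]

print(is_comparable(similar_run, previous_runs))
print(is_comparable(shifted_run, previous_runs))
```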

Page 19: Analyzing Performance test data

Some statistical concepts

Outliers are usually defined as:

- A measurement outside the overall pattern of a distribution (Moore and McCabe 1999).

- A more precise definition is a point that lies more than 1.5 times the interquartile range above the third quartile or below the first quartile.

Usually, the presence of an outlier indicates either an error in the measurement or an incomplete model.
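The 1.5 × interquartile-range definition translates directly into code. This is a minimal sketch with hypothetical sample values; the quartiles come from `statistics.quantiles`, and different quartile conventions can move the fences slightly for small data sets.

```python
from statistics import quantiles


def iqr_outliers(samples, k=1.5):
    """Return points more than k * IQR above the third or below the first quartile."""
    q1, _, q3 = quantiles(samples, n=4)
    iqr = q3 - q1
    low_fence, high_fence = q1 - k * iqr, q3 + k * iqr
    return [x for x in samples if x < low_fence or x > high_fence]


# Hypothetical timings in seconds, for illustration only.
samples = [2.1, 2.2, 2.0, 2.3, 2.1, 2.2, 9.0]
outliers = iqr_outliers(samples)
print(outliers)
```

A flagged point is a prompt to investigate (measurement error or incomplete model), not an automatic licence to delete it.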

Page 20: Analyzing Performance test data

Analyzing data

While testing a non-deterministic system you will always get a distribution of values, all of them valid in principle.

For example, if your average in a measure is 3 and you sample again and get 6, this ‘6’ is also correct and you cannot discard this number (unless you determine that this point is an outlier).

The good news is you can extract information from this succession of different numbers.

Page 21: Analyzing Performance test data

Analyzing data

For example, we may have the following collection of raw data for a measure that we will generically describe as “query database”, in seconds:

4.18; 2.1; 1.9; 2.23; 4.5; 4.2; 2.19; 2.21; 4.24; 2.23; 1.99; 2.01; 2.39; 4.19; 2.42; 2.08; 2.27; 3.98; 2.21; 2.45; 4.32

average: 2.9

These results seem to be a mix of two series:

2.1; 1.9; 2.23; 2.19; 2.21; 2.23; 1.99; 2.01; 2.39; 2.42; 2.08; 2.27; 2.21; 2.45

average: 2.2

And

4.18; 4.24; 4.19; 3.98; 4.32; 4.5; 4.2

average: 4.2

[Figure: frequency histogram with bars at 2.2 and 4.2]
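One simple way to separate the two series in the "query database" data is to sort the values and cut at the largest gap between neighbours. This is only a sketch of one possible approach (the slides do not say how the split was made), and it assumes exactly two well-separated groups.

```python
from statistics import mean


def split_at_largest_gap(samples):
    """Sort the samples and split them where consecutive values differ the most."""
    ordered = sorted(samples)
    gaps = [(ordered[i + 1] - ordered[i], i) for i in range(len(ordered) - 1)]
    _, index = max(gaps)
    return ordered[: index + 1], ordered[index + 1:]


# Raw "query database" timings from the slide, in seconds.
raw = [4.18, 2.1, 1.9, 2.23, 4.5, 4.2, 2.19, 2.21, 4.24, 2.23, 1.99,
       2.01, 2.39, 4.19, 2.42, 2.08, 2.27, 3.98, 2.21, 2.45, 4.32]

fast, slow = split_at_largest_gap(raw)
print(len(fast), round(mean(fast), 1))  # 14 values, average 2.2
print(len(slow), round(mean(slow), 1))  # 7 values, average 4.2
print(round(mean(raw), 1))              # overall average 2.9
```

The overall average of 2.9 seconds does not correspond to any measurement actually observed, which is why the two groups are worth reporting separately.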

Page 22: Analyzing Performance test data

Analyzing data

What is the previous slide telling us?

- Averaging all the results tells us nothing.

- The results point to a hidden effect: the system executes the query in different ways.

- One possible cause could be that one query joins more tables and thus takes more time to return the results.

So, if you want to answer the question “What is the time to execute this query?”, you would need to be more nuanced, or you would need to know the frequency of these queries so that you can compute a weighted average.
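A weighted average over the two query types could be computed as below. The group averages (2.2 and 4.2 seconds) and the counts (14 and 7) come from the data on the previous slide; the 90%/10% production mix is a hypothetical assumption used only to show how the answer changes with the query frequency.

```python
def weighted_average(means, weights):
    """Average of group means, weighted by how often each group occurs."""
    total = sum(weights)
    return sum(m * w for m, w in zip(means, weights)) / total


group_means = [2.2, 4.2]  # seconds, from the two series on the previous slide

# Weights as observed in the test run: 14 fast and 7 slow measurements.
observed = weighted_average(group_means, [14, 7])

# Hypothetical production mix: 90% fast queries, 10% slow queries.
assumed_production = weighted_average(group_means, [0.9, 0.1])

print(round(observed, 1))            # 2.9
print(round(assumed_production, 2))  # 2.4
```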

Page 23: Analyzing Performance test data

Adjusting your data to a model

The most common one is the usual Gaussian or normal distribution, where σ is the standard deviation and μ is the average.
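For reference, the standard probability density of the normal distribution, written with the same symbols (μ for the average, σ for the standard deviation), is:

```latex
f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\!\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)
```

About 68% of the values fall within μ ± σ and about 95% within μ ± 2σ.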

The importance of this distribution lies in the Central Limit Theorem, which states that, under mild conditions, the sum (or average) of a large number of independent random variables tends toward a normal distribution.

Example: if we assume that the latency experienced by users in a wireless network depends only on the distance to the hub, μ can be interpreted as the average distance of the users to the hub and σ will indicate how spread out the users are around the hub.

Page 24: Analyzing Performance test data

Adjusting your data to a model

Another example of analysis:

The Chi distribution

To a first approximation it resembles the Gaussian distribution; however, it applies when a phenomenon depends on K independent parameters, each of which individually would follow a Gaussian distribution.

Example: the observed latency in a city-wide ADSL network may depend on the network utilization and on the latency induced by the distance to the nearest hub. If we want to improve the performance of the system, then we need to tackle both problems.
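For reference, the textbook definition of the Chi distribution with k degrees of freedom is given below. Whether a given latency measurement actually follows it is an assumption that has to be checked against the data.

```latex
X = \sqrt{\sum_{i=1}^{k} Z_i^{2}}, \qquad Z_i \sim \mathcal{N}(0, 1) \text{ independent}

f(x; k) = \frac{x^{k-1} e^{-x^{2}/2}}{2^{k/2-1}\,\Gamma(k/2)}, \qquad x \ge 0
```

Here k plays the role of the number of independent Gaussian parameters, each standardized to zero mean and unit variance.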

Page 25: Analyzing Performance test data

Adjusting your data to a model

This would be an example of two uniform distributions

[Figure: frequency histogram with bars at 2.2 and 4.2]

Page 26: Analyzing Performance test data

Adjusting your data to a model

If your model cannot explain the results well, you need to change or improve the model.

A useful model should have predictive capabilities, so you can design new tests to prove/disprove the model.

Negative results (model disproved) can be as useful as positive results.

The analysis of the performance data can help to prevent future bottlenecks and problems.

The analyzed results will have a range of validity. Do not force too many consequences from them.

Page 27: Analyzing Performance test data

Summary

- Performance testers provide information beyond requirement compliance

- Performance testing should be treated like an experimental activity

- As an experimental activity, the scientific method is the most appropriate method of enquiry.

- In tune with the scientific method, you need to make assumptions, design your experiment accordingly and reduce the error bars

- Data should be subject to a statistical analysis

- After the analysis, you should try to explain your data with a model

- If the model does not do a good job explaining your data, you should change/refine the model

- Your analysis should help to make the software better.

Page 28: Analyzing Performance test data

Analyzing Performance test data

Questions?