Basic Statistics in Pharmaceutical Industry

8/7/2019 Basic Statistics in Pharmaceutical Industry



Contents

Sample
Bias and Randomization
Measure of Location/Central Tendency
Measure of Dispersion
Hypothesis testing
Type I and type II errors
Power
P-value



Sources of Variability

Which patients in the population get included in the study
Which of the patients in the study get allocated to treatment


Bias

Bias is a measure of how much the sample is not a good representation of the population.

Sources of bias:
  selection bias
  evaluator bias
  measurement bias
  bias due to outcome-related dropout
  bias due to inconsistent study conduct


Randomization

Assures (with high probability) that treatment and control groups are similar in all aspects except for treatment
Expect balance of both known and unknown factors between the groups
Confidence that selection of patients into groups will not be determined by a process that will give an advantage (bias) to one of the groups
Expect that group comparisons will be unbiased
Fundamental validity of statistical tests
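A minimal sketch of simple randomization, assuming two equal-sized groups and a seeded generator so the allocation can be reproduced. The patient identifiers and the function name `randomize` are hypothetical, chosen only for illustration; real trials often use blocked or stratified schemes instead.

```python
import random

def randomize(patient_ids, seed=None):
    """Randomly split patients into treatment and control groups of (near) equal size."""
    rng = random.Random(seed)      # seeded generator so the allocation is reproducible
    shuffled = list(patient_ids)   # copy, so the caller's list is left untouched
    rng.shuffle(shuffled)          # every ordering is equally likely
    half = len(shuffled) // 2
    return {"treatment": shuffled[:half], "control": shuffled[half:]}

# Hypothetical patient identifiers, for illustration only.
patients = [f"P{i:02d}" for i in range(1, 21)]
groups = randomize(patients, seed=2019)
print(groups["treatment"])
print(groups["control"])
```

Because the allocation depends only on the random shuffle, no patient characteristic, known or unknown, can systematically favour one group.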


Measures of Location/Central Tendency


Mean

The mean of a sample of values is the arithmetic average and is determined by dividing the sum of the values by the number of values.

Mean = Sum of all the values / Total number of observations


Example: 50 Values of the Ritchie Index (Measure of Joint Stiffness) in 50 Untreated Patients

14  9  8  9  1 20  3  3  2  4
 2  3  6  1  2 11 16 24 16 21
19 22 33 12 12 12 19 10 33  2
19 40  1 20  1  2  4  7  9  4
 9  6 14  8 27 10 27  7 24 21

Here, the total number of observations is 50.

Mean = (14 + 9 + ... + 21)/50 = 609/50 = 12.18
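The mean in this example can be checked with a few lines of Python. This is only a sketch: the run-together entries in the transcript are read here as "11 16 24 16 21" and "33 2" followed by "19 40", an assumption that agrees with the stated mean and with the ordered values shown later.

```python
# The 50 Ritchie Index values from the example, read row by row.
ritchie = [
    14, 9, 8, 9, 1, 20, 3, 3, 2, 4,
    2, 3, 6, 1, 2, 11, 16, 24, 16, 21,
    19, 22, 33, 12, 12, 12, 19, 10, 33, 2,
    19, 40, 1, 20, 1, 2, 4, 7, 9, 4,
    9, 6, 14, 8, 27, 10, 27, 7, 24, 21,
]

total = sum(ritchie)   # sum of all the values
n = len(ritchie)       # total number of observations
mean = total / n
print(total, n, mean)
```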


Median

The median is the midpoint of the values when they are arranged in ascending order. (If there is an even number of values, there is no midpoint value and the average of the two middle values is taken.)

If n is odd, then
Median = value of the ((n+1)/2)th term

If n is even, then
Median = mean of the (n/2)th term and the ((n/2)+1)th term
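The odd/even rule can be written out directly, as in the sketch below. The function name and the two small samples are made up for illustration; Python's standard `statistics.median` applies the same rule.

```python
def median(values):
    """Median: middle term if n is odd, mean of the two middle terms if n is even."""
    ordered = sorted(values)               # arrange in ascending order
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty sample is undefined")
    if n % 2 == 1:
        return ordered[(n + 1) // 2 - 1]   # ((n+1)/2)th term, shifted to 0-based indexing
    lower = ordered[n // 2 - 1]            # (n/2)th term
    upper = ordered[n // 2]                # ((n/2)+1)th term
    return (lower + upper) / 2

# Small made-up samples, for illustration only.
print(median([7, 1, 4]))       # odd n: prints 4
print(median([7, 1, 4, 10]))   # even n: prints 5.5
```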


Ordered Ritchie Index Values

 1  1  1  1  2  2  2  2  2  3
 3  3  4  4  4  6  6  7  7  8
 8  9  9  9  9 10 10 11 12 12
12 14 14 16 16 19 19 19 20 20
21 21 22 24 24 27 27 33 33 40

n = 50 is even, so the median is the mean of the 25th and 26th values:
Median = (9+10)/2 = 9.5


Mode

The mode is the most frequently occurring value.


Ordered Ritchie Index Values

 1  1  1  1  2  2  2  2  2  3
 3  3  4  4  4  6  6  7  7  8
 8  9  9  9  9 10 10 11 12 12
12 14 14 16 16 19 19 19 20 20
21 21 22 24 24 27 27 33 33 40

Here, the modal value is 2 (it occurs 5 times).
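The modal value can be confirmed by counting how often each value occurs. The sketch below collects every value that shares the highest count, since a sample can have more than one mode; the variable names are illustrative.

```python
from collections import Counter

# Ordered Ritchie Index values from the example.
ordered_ritchie = [
    1, 1, 1, 1, 2, 2, 2, 2, 2, 3,
    3, 3, 4, 4, 4, 6, 6, 7, 7, 8,
    8, 9, 9, 9, 9, 10, 10, 11, 12, 12,
    12, 14, 14, 16, 16, 19, 19, 19, 20, 20,
    21, 21, 22, 24, 24, 27, 27, 33, 33, 40,
]

counts = Counter(ordered_ritchie)   # value -> number of occurrences
highest = max(counts.values())
modes = sorted(v for v, c in counts.items() if c == highest)
print(modes, highest)               # prints [2] 5
```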


Location = Central Tendency

[Figure: histogram of the Ritchie Index values. Horizontal axis: values of the Ritchie Index in bins 0-5, 6-10, 11-15, 16-20, 21-25, 26-30, 31-35, 36-40. Vertical axis: frequency, 0 to 16.]

Arithmetic mean - outlier prone
Median - only uses relative magnitudes
Mode - not necessarily central


Measures of Dispersion or Variability


Range

The range of a sample of values is the largest value minus the smallest value.

If the maximum and minimum values of the data are 101 and 96 respectively, then the range is 101 - 96 = 5.

Range is simple, BUT
  Only uses min and max
  Tends to get larger as sample size increases
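A short sketch of both points: the range itself, and how it can only stay the same or grow as more observations are added. The simulated measurements (mean 99, SD 2) are hypothetical and serve only to illustrate the behaviour.

```python
import random

def sample_range(values):
    """Range = largest value minus smallest value."""
    return max(values) - min(values)

print(sample_range([101, 96]))   # the example above: 101 - 96 = 5

# Simulated measurements (hypothetical mean 99, SD 2) from one process.
rng = random.Random(0)
data = [rng.gauss(99, 2) for _ in range(1000)]

# Range of the first 5, 50, 500 and 1000 observations of the same series.
ranges = [sample_range(data[:n]) for n in (5, 50, 500, 1000)]
print([round(r, 2) for r in ranges])
```

Each larger sample contains the smaller one, so a new observation can widen the range but never narrow it.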


Inter-quartile Range

The inter-quartile range of a sample of values is the difference between the upper and lower quartiles.

Quartiles divide the ordered data into 4 parts. The lower quartile is the value which is greater than 1/4 of the sample and less than 3/4 of the sample. Conversely, the upper quartile is the value which is greater than 3/4 of the sample and less than 1/4 of the sample.


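1/4 of 50 = 12.5
3/4 of 50 = 37.5

 1  1  1  1  2  2  2  2  2  3
 3  3  4  4  4  6  6  7  7  8
 8  9  9  9  9 10 10 11 12 12
12 14 14 16 16 19 19 19 20 20
21 21 22 24 24 27 27 33 33 40

These are positions in the ordered data, 25 places apart (37.5 - 12.5 = 25). The lower quartile lies midway between the 12th and 13th values (3 and 4), and the upper quartile midway between the 37th and 38th values (19 and 19).

So, inter-quartile range = 19 - 3.5 = 15.5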


[Figure: histogram of the Ritchie Index values. Horizontal axis: values of the Ritchie Index in bins 0-5, 6-10, 11-15, 16-20, 21-25, 26-30, 31-35, 36-40. Vertical axis: frequency, 0 to 16.]

Lower quartile = 3.5
Upper quartile = 19
Inter-quartile range = 15.5
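The quartile values can be reproduced with the sketch below, assuming the convention that the quartiles sit at positions n/4 and 3n/4 in the ordered data, with a fractional position interpolated between its two neighbours. Other quartile conventions exist and can give slightly different numbers; `value_at_position` is an illustrative helper, not a library function.

```python
import math

# Ordered Ritchie Index values from the example.
ordered_ritchie = [
    1, 1, 1, 1, 2, 2, 2, 2, 2, 3,
    3, 3, 4, 4, 4, 6, 6, 7, 7, 8,
    8, 9, 9, 9, 9, 10, 10, 11, 12, 12,
    12, 14, 14, 16, 16, 19, 19, 19, 20, 20,
    21, 21, 22, 24, 24, 27, 27, 33, 33, 40,
]

def value_at_position(ordered, position):
    """Value at a 1-based, possibly fractional, position in the ordered data,
    interpolating linearly between the two neighbouring values."""
    lower_rank = math.floor(position)
    upper_rank = math.ceil(position)
    lower = ordered[lower_rank - 1]   # convert 1-based rank to 0-based index
    upper = ordered[upper_rank - 1]
    return lower + (position - lower_rank) * (upper - lower)

n = len(ordered_ritchie)
q1 = value_at_position(ordered_ritchie, n / 4)       # position 12.5
q3 = value_at_position(ordered_ritchie, 3 * n / 4)   # position 37.5
print(q1, q3, q3 - q1)                               # prints 3.5 19.0 15.5
```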


Neither measure uses the numerical values - only relative magnitudes

A measure which accounts for the values is the standard deviation

Consider the aspirin data from the new process
96 97 100 101 101 (mean 99 mg)

Determine deviations from mean
-3 -2 1 2 2

Square, add, average and square-root

standard deviation = square root of (9 + 4 + 1 + 4 + 4)/5 = square root of 4.4 = 2.10 mg (averaging over n - 1 = 4 instead of n = 5 gives 2.35 mg)
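The four steps (square, add, average, square-root) are sketched below for the aspirin data. The value printed in the transcript is not legible, so both common conventions are shown as an assumption: averaging over n, and averaging over n - 1.

```python
import math
import statistics

aspirin_mg = [96, 97, 100, 101, 101]

mean = sum(aspirin_mg) / len(aspirin_mg)            # 99.0
deviations = [x - mean for x in aspirin_mg]         # -3, -2, 1, 2, 2
sum_of_squares = sum(d ** 2 for d in deviations)    # 9 + 4 + 1 + 4 + 4 = 22

sd_n = math.sqrt(sum_of_squares / len(aspirin_mg))                 # average over n
sd_n_minus_1 = math.sqrt(sum_of_squares / (len(aspirin_mg) - 1))   # average over n - 1

print(round(sd_n, 2), round(sd_n_minus_1, 2))
# The standard library gives the same two answers.
print(round(statistics.pstdev(aspirin_mg), 2), round(statistics.stdev(aspirin_mg), 2))
```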


Confidence Interval

An interval summary that includes the concept of variation in point estimates.

A confidence interval provides boundaries between which we are relatively certain that the true value of a population summary lies.

A common choice for the level of certainty is 95%. The constructed 95% CIs will cover the true population measure of interest for 95% of all possible samples.

Since the realized sample is but one of many possible samples, the constructed confidence interval based on a sample is but one of many possible confidence intervals.
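The "95% of all possible samples" statement can be illustrated by simulation. The sketch below draws many samples from a hypothetical population (mean 99, SD 2, both invented for illustration), builds a 95% interval for the mean from each, and counts how often the interval covers the true mean. The SD is treated as known to keep the example short.

```python
import math
import random
from statistics import NormalDist, fmean

# Hypothetical population, chosen only for illustration.
TRUE_MEAN, TRUE_SD, N = 99.0, 2.0, 25
z = NormalDist().inv_cdf(0.975)           # about 1.96 for a 95% interval
half_width = z * TRUE_SD / math.sqrt(N)   # SD treated as known

rng = random.Random(1)
trials = 2000
covered = 0
for _ in range(trials):
    sample = [rng.gauss(TRUE_MEAN, TRUE_SD) for _ in range(N)]
    centre = fmean(sample)
    # Each sample yields its own interval; count how often it contains the true mean.
    if centre - half_width <= TRUE_MEAN <= centre + half_width:
        covered += 1

coverage = covered / trials
print(f"{coverage:.3f}")
```

The printed proportion should be close to 0.95; any single interval either covers the true mean or does not.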


Making Decisions

Q: Given two samples from two populations, can we tell, with certainty, if the two populations are the same or different?

A: We cannot.

But, we can say whether they are likely to be the same or different.

We are making decisions with calculated risks. The calculated risks are specified beforehand in terms of probabilities.


Philosophy of Statistical Testing

Assume that the opposite of what we wish to prove is true.
Collect data.
Under the assumption, compute how likely the data are as we have observed.
If it is not very likely, conclude that the original assumption is probably not true.
Otherwise, conclude that the data are consistent with the assumption.
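The steps above can be sketched with an exact binomial calculation. The data (16 responders out of 20 patients) and the assumed response rate of 50% are hypothetical, and `upper_tail_probability` is an illustrative helper.

```python
from math import comb

def upper_tail_probability(successes, n, p_null):
    """P(X >= successes) when X ~ Binomial(n, p_null), i.e. under the assumption being tested."""
    return sum(
        comb(n, k) * p_null**k * (1 - p_null) ** (n - k)
        for k in range(successes, n + 1)
    )

# Hypothetical data: 16 responders out of 20 patients, tested against an
# assumed (null) response rate of 50%.
p = upper_tail_probability(16, 20, 0.5)
print(f"{p:.4f}")   # prints 0.0059
if p < 0.05:
    print("Unlikely under the assumption: conclude it is probably not true.")
else:
    print("Consistent with the assumption.")
```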


Null and Alternative Hypothesis

Alternative hypothesis (Ha) is what we would like to conclude. For example, our treatment is better than our competitor's; or our treatment is clinically equivalent to our competitor's.

Null hypothesis (H0) is usually the opposite of Ha.

Like prosecutors, we are trying to gather evidence to prove our cases. The defendant is assumed innocent until proven guilty. Our job is to actively prove our claim of superiority or equivalence.


Examples of H0 and Ha

Research hypothesis: Zyvox increases % of individuals with microbiologic cures beyond that produced by the comparator.

Let pZ = % of individuals with microbiologic cures when receiving Zyvox; pC = the corresponding figure for the comparator.

So, we will test
H0: pZ = pC
Ha: pZ ≠ pC or Ha*: pZ > pC
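One common way to test H0: pZ = pC is a large-sample z-test for two proportions, sketched below. This is an assumption about the method, not something the slides specify, and the counts are invented for illustration; they are not trial results.

```python
import math
from statistics import NormalDist

def two_proportion_z_test(x_z, n_z, x_c, n_c):
    """Large-sample z-test of H0: pZ = pC.
    Returns (z, two-sided p-value, one-sided p-value for pZ > pC)."""
    p_z, p_c = x_z / n_z, x_c / n_c
    pooled = (x_z + x_c) / (n_z + n_c)               # common cure rate assumed under H0
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_z + 1 / n_c))
    z = (p_z - p_c) / se
    std_normal = NormalDist()
    p_two_sided = 2 * (1 - std_normal.cdf(abs(z)))   # Ha:  pZ differs from pC
    p_one_sided = 1 - std_normal.cdf(z)              # Ha*: pZ > pC
    return z, p_two_sided, p_one_sided

# Hypothetical counts, NOT trial results: cures / patients in each arm.
z, p_two, p_one = two_proportion_z_test(x_z=85, n_z=100, x_c=72, n_c=100)
print(f"z = {z:.2f}, two-sided p = {p_two:.4f}, one-sided p = {p_one:.4f}")
```

When the observed difference is in the hypothesized direction, the one-sided P-value is half the two-sided one, which is why the choice between Ha and Ha* has to be made before the data are seen.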


Errors

                      Truth
Decision              Ha true               H0 true
Reject H0             Correct decision      Type I error (α)
                                            (false positive)
Do Not Reject H0      Type II error (β)     Correct decision
                      (false negative)


Whose Risks?

Type I error
  False positive
  Consumer's risk
  Regulator's concern

Type II error
  False negative
  Sponsor's risk
  Societal loss


Conventions

In most situations in drug development, type I error is the more grievous error.

Probability of a type I error is usually held at a specific level (denoted by α), say 0.05. In other words, we are allowing a 5% chance to reject H0 even if H0 is true.

Probability of a type II error is then a function of the sample size.

So, we want to choose an appropriate sample size to control the type II error rate at an acceptable level.
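The meaning of "a 5% chance to reject H0 even if H0 is true" can be illustrated by simulating many trials in which both groups come from the same population. The design values (30 patients per group, SD 1, SD treated as known) are hypothetical.

```python
import math
import random
from statistics import NormalDist, fmean

ALPHA = 0.05
N_PER_GROUP, SD = 30, 1.0                        # hypothetical design values
critical = NormalDist().inv_cdf(1 - ALPHA / 2)   # two-sided critical value, about 1.96
se = SD * math.sqrt(2 / N_PER_GROUP)             # standard error of the difference in means

rng = random.Random(7)
trials = 4000
rejections = 0
for _ in range(trials):
    # Both groups come from the same population, so H0 is true in every simulated trial.
    a = [rng.gauss(0, SD) for _ in range(N_PER_GROUP)]
    b = [rng.gauss(0, SD) for _ in range(N_PER_GROUP)]
    z = (fmean(a) - fmean(b)) / se
    if abs(z) > critical:
        rejections += 1                          # a type I error: H0 rejected although true

type_one_rate = rejections / trials
print(f"{type_one_rate:.3f}")
```

The printed rate should be close to 0.05 whatever the sample size, which is why the type II error rate, not the type I rate, is what the sample size is chosen to control.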


More Conventions

Power is the complement of type II error rate. The lower the type II error rate is, the higher the power, and the more sensitive the test is.

Power is the probability of getting a successful trial if the drug does what we think it does.

Among the threesome of type I error rate, power, and sample size, we can calculate the third if we know the other two (for a given treatment difference and variability).

Many alternative hypotheses are composite, so we specify the desirable power to detect a clinically meaningful difference in the treatment effects.
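The relationship between type I error rate, power, and sample size can be sketched with the usual normal-approximation formulas for comparing two means. The formulas are a standard textbook choice rather than anything the slides specify, and the planning values (difference 5, SD 10) are hypothetical.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()

def sample_size_per_group(delta, sd, alpha=0.05, power=0.80):
    """Patients per group for a two-sided, two-sample z-test to detect a mean difference delta."""
    z_alpha = STD_NORMAL.inv_cdf(1 - alpha / 2)
    z_beta = STD_NORMAL.inv_cdf(power)
    return math.ceil(2 * ((z_alpha + z_beta) * sd / delta) ** 2)

def power_for_sample_size(n_per_group, delta, sd, alpha=0.05):
    """Approximate power of the same test (ignores the negligible opposite-tail rejection)."""
    z_alpha = STD_NORMAL.inv_cdf(1 - alpha / 2)
    return STD_NORMAL.cdf(abs(delta) / (sd * math.sqrt(2 / n_per_group)) - z_alpha)

# Hypothetical planning values: clinically meaningful difference 5, SD 10.
n = sample_size_per_group(delta=5, sd=10)
print(n, round(power_for_sample_size(n, delta=5, sd=10), 3))
```

Fixing the type I error rate and the target power gives the sample size; fixing the type I error rate and the sample size gives the power.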


What Is P-Value?

P-values measure how consistent the data are with the null hypothesis. They are basically the probabilities of observing what we observed, or something more extreme, if the null hypothesis is true.

Since samples can produce different summary results and P-values are calculated from samples, P-values can differ from sample to sample.

If a P-value is less than the allowable type I error rate of 5%, we will conclude that what we observed is not consistent with the null hypothesis H0. Therefore, the "proof by contradiction" approach leads to the rejection of H0 and the acceptance of Ha.


Thank You!

Questions?
