Basic Statistics for Virology - Perelman School of ... · Conservative approach: α i = α total /...
Transcript of Basic Statistics for Virology - Perelman School of ... · Conservative approach: α i = α total /...
![Page 1: Basic Statistics for Virology - Perelman School of ... · Conservative approach: α i = α total / number comparisons Can give different α’s to each comparison. Formal methods](https://reader035.fdocuments.in/reader035/viewer/2022070800/5f0226a17e708231d402d2cd/html5/thumbnails/1.jpg)
CCEB
Basic Statistics for Virology
Sarah J. Ratcliffe, Ph.D. Center for Clinical Epidemiology and
Biostatistics U. Penn School of Medicine Email: [email protected]
September 2011
![Page 2: Basic Statistics for Virology - Perelman School of ... · Conservative approach: α i = α total / number comparisons Can give different α’s to each comparison. Formal methods](https://reader035.fdocuments.in/reader035/viewer/2022070800/5f0226a17e708231d402d2cd/html5/thumbnails/2.jpg)
CCEB
Outline
Summarizing the Data Hypothesis testing Example Interpreting results Resources
![Page 3: Basic Statistics for Virology - Perelman School of ... · Conservative approach: α i = α total / number comparisons Can give different α’s to each comparison. Formal methods](https://reader035.fdocuments.in/reader035/viewer/2022070800/5f0226a17e708231d402d2cd/html5/thumbnails/3.jpg)
CCEB
Summarizing the Data
Two basic aspects of data Centrality Variability
Different measures for each Optimal measure depends on type of data
being described
![Page 4: Basic Statistics for Virology - Perelman School of ... · Conservative approach: α i = α total / number comparisons Can give different α’s to each comparison. Formal methods](https://reader035.fdocuments.in/reader035/viewer/2022070800/5f0226a17e708231d402d2cd/html5/thumbnails/4.jpg)
CCEB
Centrality Mean
Sum of observed values divided by number of observations
Most common measure of centrality Most informative when data follow normal
distribution (bell-shaped curve) Median
“middle” value: half of all observed values are smaller, half are larger
Best centrality measure when data are skewed Mode
Most frequently observed value
![Page 5: Basic Statistics for Virology - Perelman School of ... · Conservative approach: α i = α total / number comparisons Can give different α’s to each comparison. Formal methods](https://reader035.fdocuments.in/reader035/viewer/2022070800/5f0226a17e708231d402d2cd/html5/thumbnails/5.jpg)
CCEB
Mean can Mislead Group 1 data: 1,1,1,2,3,3,5,8,20
Mean: 4.9 Median: 3 Mode: 1 Group 2 data: 1,1,1,2,3,3,5,8,10
Mean: 3.8 Median: 3 Mode: 1 When data sets are small, a single extreme
observation will have great influence on mean, little or no influence on median
In such cases, median is usually a more informative measure of centrality
![Page 6: Basic Statistics for Virology - Perelman School of ... · Conservative approach: α i = α total / number comparisons Can give different α’s to each comparison. Formal methods](https://reader035.fdocuments.in/reader035/viewer/2022070800/5f0226a17e708231d402d2cd/html5/thumbnails/6.jpg)
CCEB
![Page 7: Basic Statistics for Virology - Perelman School of ... · Conservative approach: α i = α total / number comparisons Can give different α’s to each comparison. Formal methods](https://reader035.fdocuments.in/reader035/viewer/2022070800/5f0226a17e708231d402d2cd/html5/thumbnails/7.jpg)
CCEB
![Page 8: Basic Statistics for Virology - Perelman School of ... · Conservative approach: α i = α total / number comparisons Can give different α’s to each comparison. Formal methods](https://reader035.fdocuments.in/reader035/viewer/2022070800/5f0226a17e708231d402d2cd/html5/thumbnails/8.jpg)
CCEB
Variability
Most commonly used measure to describe variability is standard deviation (SD)
SD is a function of the squared differences of each observation from the mean
If the mean is influenced by a single extreme observation, the SD will overstate the actual variability
![Page 9: Basic Statistics for Virology - Perelman School of ... · Conservative approach: α i = α total / number comparisons Can give different α’s to each comparison. Formal methods](https://reader035.fdocuments.in/reader035/viewer/2022070800/5f0226a17e708231d402d2cd/html5/thumbnails/9.jpg)
CCEB
Variability
SD describes variability in original data SE = SD / √n describes variability in the
mean. Confidence intervals for the mean use SE
e.g. Mean ± t* SE Confidence interval for predicted
observation uses SD.
![Page 10: Basic Statistics for Virology - Perelman School of ... · Conservative approach: α i = α total / number comparisons Can give different α’s to each comparison. Formal methods](https://reader035.fdocuments.in/reader035/viewer/2022070800/5f0226a17e708231d402d2cd/html5/thumbnails/10.jpg)
CCEB
![Page 11: Basic Statistics for Virology - Perelman School of ... · Conservative approach: α i = α total / number comparisons Can give different α’s to each comparison. Formal methods](https://reader035.fdocuments.in/reader035/viewer/2022070800/5f0226a17e708231d402d2cd/html5/thumbnails/11.jpg)
CCEB
Outlier Detection
When is an observation unusual?
One large/small value can bias mean, especially in small samples.
Can not throw away a data point without an explanation or reason!
![Page 12: Basic Statistics for Virology - Perelman School of ... · Conservative approach: α i = α total / number comparisons Can give different α’s to each comparison. Formal methods](https://reader035.fdocuments.in/reader035/viewer/2022070800/5f0226a17e708231d402d2cd/html5/thumbnails/12.jpg)
CCEB
Outlier Detection
Possible tests include Grubb’s test Rosner’s test Dixon’s test http://www.cee.vt.edu/ewr/environmental
/teach/smprimer/outlier/outlier.html Huber’s method gives weighted mean with
outliers “down weighted”.
![Page 13: Basic Statistics for Virology - Perelman School of ... · Conservative approach: α i = α total / number comparisons Can give different α’s to each comparison. Formal methods](https://reader035.fdocuments.in/reader035/viewer/2022070800/5f0226a17e708231d402d2cd/html5/thumbnails/13.jpg)
CCEB
Data Distributions
Parametric tests (e.g. t-tests, ANOVA) rely on data being normally distributed.
![Page 14: Basic Statistics for Virology - Perelman School of ... · Conservative approach: α i = α total / number comparisons Can give different α’s to each comparison. Formal methods](https://reader035.fdocuments.in/reader035/viewer/2022070800/5f0226a17e708231d402d2cd/html5/thumbnails/14.jpg)
CCEB
![Page 15: Basic Statistics for Virology - Perelman School of ... · Conservative approach: α i = α total / number comparisons Can give different α’s to each comparison. Formal methods](https://reader035.fdocuments.in/reader035/viewer/2022070800/5f0226a17e708231d402d2cd/html5/thumbnails/15.jpg)
CCEB
![Page 16: Basic Statistics for Virology - Perelman School of ... · Conservative approach: α i = α total / number comparisons Can give different α’s to each comparison. Formal methods](https://reader035.fdocuments.in/reader035/viewer/2022070800/5f0226a17e708231d402d2cd/html5/thumbnails/16.jpg)
CCEB
![Page 17: Basic Statistics for Virology - Perelman School of ... · Conservative approach: α i = α total / number comparisons Can give different α’s to each comparison. Formal methods](https://reader035.fdocuments.in/reader035/viewer/2022070800/5f0226a17e708231d402d2cd/html5/thumbnails/17.jpg)
CCEB
![Page 18: Basic Statistics for Virology - Perelman School of ... · Conservative approach: α i = α total / number comparisons Can give different α’s to each comparison. Formal methods](https://reader035.fdocuments.in/reader035/viewer/2022070800/5f0226a17e708231d402d2cd/html5/thumbnails/18.jpg)
CCEB
Data Transformation
Can transform the data to normality to perform tests.
0 1 2 3 4 5 60
2
4
6
8
10
12
14
Natural log transform:
x* = ln(x)
-1 -0.5 0 0.5 1 1.50
1
2
3
4
5
6
7
8
Exponential transform:
x* = exp(x)
![Page 19: Basic Statistics for Virology - Perelman School of ... · Conservative approach: α i = α total / number comparisons Can give different α’s to each comparison. Formal methods](https://reader035.fdocuments.in/reader035/viewer/2022070800/5f0226a17e708231d402d2cd/html5/thumbnails/19.jpg)
CCEB
Hypothesis testing Steps: 1. Select a one-sided or two-sided test. 2. Establish the level of significance. (α = .05) 3. Select an appropriate test statistic. 4. Run the test. 5. Make decision to accept or reject null
hypothesis.
![Page 20: Basic Statistics for Virology - Perelman School of ... · Conservative approach: α i = α total / number comparisons Can give different α’s to each comparison. Formal methods](https://reader035.fdocuments.in/reader035/viewer/2022070800/5f0226a17e708231d402d2cd/html5/thumbnails/20.jpg)
CCEB
Hypothesis testing: 1. One-sided versus Two-sided
Determined by the alternative hypothesis. Unidirectional = one-sided
Example: Infected macaques given vaccine or placebo. Higher viral-replication in vaccine group has no benefit of interest.
H0: vaccine has no beneficial effect on viral-replication levels at 6 weeks after infection.
Ha: vaccine lowers viral-replication levels by 6 weeks after infection.
![Page 21: Basic Statistics for Virology - Perelman School of ... · Conservative approach: α i = α total / number comparisons Can give different α’s to each comparison. Formal methods](https://reader035.fdocuments.in/reader035/viewer/2022070800/5f0226a17e708231d402d2cd/html5/thumbnails/21.jpg)
CCEB
Hypothesis testing: 1. One-sided versus Two-sided
Bi-directional = two-sided Example:
Infected macaques given vaccine or placebo. Interested in whether vaccine has any effect on viral- replication levels.
H0: vaccine has no beneficial effect on viral-replication levels at 6 weeks after infection.
Ha: vaccine effects the viral-replication levels.
![Page 22: Basic Statistics for Virology - Perelman School of ... · Conservative approach: α i = α total / number comparisons Can give different α’s to each comparison. Formal methods](https://reader035.fdocuments.in/reader035/viewer/2022070800/5f0226a17e708231d402d2cd/html5/thumbnails/22.jpg)
CCEB
Hypothesis testing: 2. Level of Significance
How many different hypotheses are being examining?
How many comparisons are needed to answer this hypothesis?
Are any interim analyses planned? e.g. test data, depending on results
collect more data and re-test. => How many tests will be ran in total?
![Page 23: Basic Statistics for Virology - Perelman School of ... · Conservative approach: α i = α total / number comparisons Can give different α’s to each comparison. Formal methods](https://reader035.fdocuments.in/reader035/viewer/2022070800/5f0226a17e708231d402d2cd/html5/thumbnails/23.jpg)
CCEB
Hypothesis testing: 2. Level of Significance
αtotal = desired total Type-I error (false positives) for all comparisons.
One test α1 = αtotal
Multiple tests / comparisons If αi = αtotal, then ∑αi > αtotal
Need to use a smaller α for each test.
![Page 24: Basic Statistics for Virology - Perelman School of ... · Conservative approach: α i = α total / number comparisons Can give different α’s to each comparison. Formal methods](https://reader035.fdocuments.in/reader035/viewer/2022070800/5f0226a17e708231d402d2cd/html5/thumbnails/24.jpg)
CCEB
Hypothesis testing: 2. Level of Significance
Conservative approach: αi = αtotal / number comparisons
Can give different α’s to each comparison. Formal methods include: Bonferroni, Tukey-
Cramer, Scheffe’s method, Duncan-Walker. O’Brien-Fleming boundary or a Lan and Demets
analog can be used to determine αi for interim analyses.
Benjamini Y, and Hochberg Y (1995) Controlling the false discovery rate: a practical and powerful approach to multiple testing. JRSSB, 57:125-133.
![Page 25: Basic Statistics for Virology - Perelman School of ... · Conservative approach: α i = α total / number comparisons Can give different α’s to each comparison. Formal methods](https://reader035.fdocuments.in/reader035/viewer/2022070800/5f0226a17e708231d402d2cd/html5/thumbnails/25.jpg)
CCEB
Hypothesis testing: 3. Selecting an Appropriate test
How many samples are being compared? One sample Two samples Multi-samples
Are these samples independent? Unrelated subjects in each sample. ‘Subjects’ in each sample related / same
![Page 26: Basic Statistics for Virology - Perelman School of ... · Conservative approach: α i = α total / number comparisons Can give different α’s to each comparison. Formal methods](https://reader035.fdocuments.in/reader035/viewer/2022070800/5f0226a17e708231d402d2cd/html5/thumbnails/26.jpg)
CCEB
Hypothesis testing: 3. Selecting an Appropriate test
Are your variables continuous or categorical or both? If continuous, is the data normally
distributed? Blindly assuming normality can lead to
wrong conclusions!!!
![Page 27: Basic Statistics for Virology - Perelman School of ... · Conservative approach: α i = α total / number comparisons Can give different α’s to each comparison. Formal methods](https://reader035.fdocuments.in/reader035/viewer/2022070800/5f0226a17e708231d402d2cd/html5/thumbnails/27.jpg)
CCEB
Source: Richardson & Overbaugh (2005). Basic statistical considerations in virological experiments. Journal of Virology, 29(2): 669-676.
Type of Data
No. of samples being
compared
Relationship between samples
Underlying distribution of all samples Potential statistical tests
Binary 1 n/a Binary One sample binomial test Binary 2 Independent Binary Chi-square test, Fisher's exact test Binary >2 Independent Binary Chi-square test Binary 2 Paired Binary McNemar's test Binary >2 Related Binary Cochran's Q test
Continuous 1 n/a Normal One sample t-test for means, one-sample
chi-square test fro variances
Continuous 1 n/a Non-normal One sample Wilcoxon signed-rank test,
one-sample sign test
Continuous 2 Independent Normal Two-sample t-test for means, two-sample
F test for variances Continuous 2 Independent Non-normal Wilcoxon rank sum test Continuous 2 Paired Normal Paired t-test Continuous 2 Paired Non-normal Wilcoxon signed-rank test, sign test
Continuous >2 Independent Normal One-way ANOVA for means, Bartlett's
test of homogeneity for variances
Continuous >2 Independent Non-normal Kruskal-Wallis test
Continuous >2 Related Non-normal Friedman rank sum test
![Page 28: Basic Statistics for Virology - Perelman School of ... · Conservative approach: α i = α total / number comparisons Can give different α’s to each comparison. Formal methods](https://reader035.fdocuments.in/reader035/viewer/2022070800/5f0226a17e708231d402d2cd/html5/thumbnails/28.jpg)
CCEB
Hypothesis testing: Selecting an Appropriate test
Other tests are available for more complex situations. For example, Two-part t-test or Wilcoxon test: combines
test for proportion that doesn’t express with test of distribution of those that do express.
(Lachenbruch P.A., Comparisons of two-part models with competitors. Stat Med, 20(8):1215-34, 2001)
Repeated measures ANOVA: >2 measurements taken on each subject; usually interested in time/dose effect.
![Page 29: Basic Statistics for Virology - Perelman School of ... · Conservative approach: α i = α total / number comparisons Can give different α’s to each comparison. Formal methods](https://reader035.fdocuments.in/reader035/viewer/2022070800/5f0226a17e708231d402d2cd/html5/thumbnails/29.jpg)
CCEB
Hypothesis testing
Steps: 1. Select a one-tailed or two-tailed test. 2. Establish the level of significance (e.g., α = .05). 3. Select an appropriate test statistic. 4. Run the test. 5. Make decision to accept or reject null
hypothesis.
![Page 30: Basic Statistics for Virology - Perelman School of ... · Conservative approach: α i = α total / number comparisons Can give different α’s to each comparison. Formal methods](https://reader035.fdocuments.in/reader035/viewer/2022070800/5f0226a17e708231d402d2cd/html5/thumbnails/30.jpg)
CCEB
Example
Expression of chemokine receptors on CD14+/CD14- populations of blood monocytes.
Percent of cells positive by FACS.
![Page 31: Basic Statistics for Virology - Perelman School of ... · Conservative approach: α i = α total / number comparisons Can give different α’s to each comparison. Formal methods](https://reader035.fdocuments.in/reader035/viewer/2022070800/5f0226a17e708231d402d2cd/html5/thumbnails/31.jpg)
CCEB
CCR8 subject CD14+ CD14-
1 5 17 2 9 25 3 13 36 4 2 9 5 5 18 6 0 2 7 6 6 8 21 30 9 5 6
10 36 35 mean 10.2 18.4 st dev 10.9 12.6 st error 3.4 4.0
![Page 32: Basic Statistics for Virology - Perelman School of ... · Conservative approach: α i = α total / number comparisons Can give different α’s to each comparison. Formal methods](https://reader035.fdocuments.in/reader035/viewer/2022070800/5f0226a17e708231d402d2cd/html5/thumbnails/32.jpg)
CCEB
Example cont’d
Continuous data, 2 samples => t-test, if normal OR => Wilcoxon rank sum or signed-rank
sum test, if non-normal Are samples independent or paired?
If independent, can test for equality of variances using a Levene’s test
![Page 33: Basic Statistics for Virology - Perelman School of ... · Conservative approach: α i = α total / number comparisons Can give different α’s to each comparison. Formal methods](https://reader035.fdocuments.in/reader035/viewer/2022070800/5f0226a17e708231d402d2cd/html5/thumbnails/33.jpg)
CCEB
![Page 34: Basic Statistics for Virology - Perelman School of ... · Conservative approach: α i = α total / number comparisons Can give different α’s to each comparison. Formal methods](https://reader035.fdocuments.in/reader035/viewer/2022070800/5f0226a17e708231d402d2cd/html5/thumbnails/34.jpg)
CCEB
Example cont’d Possible results for different assumptions:
P-values Normal (t-tests)
Non-normal (non-parametric
tests)
Independent, equal variance
0.137
Independent, unequal variance
0.137 0.105
Paired 0.010 0.013
![Page 35: Basic Statistics for Virology - Perelman School of ... · Conservative approach: α i = α total / number comparisons Can give different α’s to each comparison. Formal methods](https://reader035.fdocuments.in/reader035/viewer/2022070800/5f0226a17e708231d402d2cd/html5/thumbnails/35.jpg)
CCEB
Example cont’d
Which result is correct? Data are paired The differences for each subject are
normally distributed. => Paired t-test p = .010 There is a difference in the percentage of
positive CD14+ and CD14- cells.
![Page 36: Basic Statistics for Virology - Perelman School of ... · Conservative approach: α i = α total / number comparisons Can give different α’s to each comparison. Formal methods](https://reader035.fdocuments.in/reader035/viewer/2022070800/5f0226a17e708231d402d2cd/html5/thumbnails/36.jpg)
CCEB
![Page 37: Basic Statistics for Virology - Perelman School of ... · Conservative approach: α i = α total / number comparisons Can give different α’s to each comparison. Formal methods](https://reader035.fdocuments.in/reader035/viewer/2022070800/5f0226a17e708231d402d2cd/html5/thumbnails/37.jpg)
CCEB
A graph of the 95% CIs for the means would give the impression there is no difference …
![Page 38: Basic Statistics for Virology - Perelman School of ... · Conservative approach: α i = α total / number comparisons Can give different α’s to each comparison. Formal methods](https://reader035.fdocuments.in/reader035/viewer/2022070800/5f0226a17e708231d402d2cd/html5/thumbnails/38.jpg)
CCEB
When it’s really the differences we are testing.
![Page 39: Basic Statistics for Virology - Perelman School of ... · Conservative approach: α i = α total / number comparisons Can give different α’s to each comparison. Formal methods](https://reader035.fdocuments.in/reader035/viewer/2022070800/5f0226a17e708231d402d2cd/html5/thumbnails/39.jpg)
CCEB
Example cont’d
Note: paired tests don’t always give lower p-values.
A 1-sided test on the CCR5 values would give p-values of:
p = 0.06 independent samples p = 0.11 paired samples
WHY?
![Page 40: Basic Statistics for Virology - Perelman School of ... · Conservative approach: α i = α total / number comparisons Can give different α’s to each comparison. Formal methods](https://reader035.fdocuments.in/reader035/viewer/2022070800/5f0226a17e708231d402d2cd/html5/thumbnails/40.jpg)
CCEB
Example 1 cont’d
The differences have a larger spread than the individual variables.
![Page 41: Basic Statistics for Virology - Perelman School of ... · Conservative approach: α i = α total / number comparisons Can give different α’s to each comparison. Formal methods](https://reader035.fdocuments.in/reader035/viewer/2022070800/5f0226a17e708231d402d2cd/html5/thumbnails/41.jpg)
CCEB
5. Interpreting Results
P-values Is there a statistically significant result? If not, was the sample size large
enough to detect a biologically meaningful difference?
![Page 42: Basic Statistics for Virology - Perelman School of ... · Conservative approach: α i = α total / number comparisons Can give different α’s to each comparison. Formal methods](https://reader035.fdocuments.in/reader035/viewer/2022070800/5f0226a17e708231d402d2cd/html5/thumbnails/42.jpg)
CCEB
Online Resources
Power / sample size calculators http://www.stat.uiowa.edu/~rlenth/Power/
Free statistical software http://statpages.org/javasta2.html