Basics of Biostatistics for Health Research
Session 4 – February 28, 2013
Dr. Scott Patten, Professor of Epidemiology
Department of Community Health Sciences & Department of Psychiatry
Generate Commands Using Logic
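    generate obese2 = .
    recode obese2 .=0 if bmi <= 30
    recode obese2 .=1 if bmi > 30
    tab obese obese2
    prtest obese2, by(sex)
This classifies people with missing BMI as obese, which is strange: Stata treats a missing value as larger than any number, so bmi > 30 is true when bmi is missing.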
Missing Values and Logical Operators
• http://www.stata.com/support/faqs/data-management/logical-expressions-and-missing-values/
Generate Commands Using Logic
    generate obese2 = .
    recode obese2 .=0 if bmi <= 30
    recode obese2 .=1 if bmi > 30 & bmi !=.
    tab obese obese2, missing
    prtest obese2, by(sex)
This code works.
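The extra `& bmi !=.` condition matters because Stata orders a missing value above every number, so a bare `bmi > 30` is true when bmi is missing. The following is a minimal Python sketch of that rule, for illustration only (it is not Stata code). The `MISSING` stand-in and both function names are hypothetical.

```python
import math

# Assumption: model Stata's "." as +infinity, because Stata orders missing
# values above every nonmissing number. Only that ordering rule is mimicked.
MISSING = math.inf


def obese_naive(bmi):
    """Mirror of: recode obese2 .=1 if bmi > 30 (no guard against missing)."""
    return 1 if bmi > 30 else 0


def obese_guarded(bmi):
    """Mirror of: ... if bmi > 30 & bmi != . (missing stays missing)."""
    if bmi == MISSING:
        return None  # left as missing
    return 1 if bmi > 30 else 0


for value in (24.0, 30.0, 31.5, MISSING):
    print(value, obese_naive(value), obese_guarded(value))
```

The naive version labels the missing value as obese, while the guarded version leaves it missing.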
Statistical Errors
Sample Size Simulation
• P (exposed) = 0.3, P (non-exposed) = 0.1
• Alt. hypothesis: 0.2 (difference between the two proportions)
• N (exposed) = 30, N (non-exposed) = 30 (set equal to exposed)
• Alpha = 0.05
• Power = 0.5095
[Figure: curves labelled Null Hypothesis, Alternative Hypothesis and Reject Indicator, x-axis from -0.5 to 0.5; a Power gauge from 0 to 1; buttons: Increase Sample Size, Increase Effect Size, Increase Alpha, Reset]
Sample Size Calculation in STATA
Sample Size Dialogue Boxes
Let’s do a calculation!
• You are planning a parallel group RCT – with treatment and control groups.
• Normally, 20% of people die with disease X, but you expect to cut this in half with a new treatment.
• How many do you need in each group to achieve 95% power at alpha = 5%?
Output (sampsi)

. sampsi .2 .1, alpha(0.05) power(.95)

Estimated sample size for two-sample comparison of proportions

Test Ho: p1 = p2, where p1 is the proportion in population 1
                    and p2 is the proportion in population 2
Assumptions:

         alpha =   0.0500  (two-sided)
         power =   0.9500
            p1 =   0.2000
            p2 =   0.1000
         n2/n1 =   1.00

Estimated required sample sizes:

            n1 =      349
            n2 =      349
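To see where a figure like 349 per group can come from, here is a minimal Python sketch of the normal-approximation sample size for two proportions with a continuity correction. It assumes this is the formula sampsi applies by default for equal group sizes. The function name is hypothetical, and this is a check on the arithmetic, not a replacement for the Stata command.

```python
import math
from statistics import NormalDist


def sample_size_two_proportions(p1, p2, alpha=0.05, power=0.95):
    """Per-group n for a two-sided comparison of two proportions (equal groups)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p1 + p2) / 2
    delta = abs(p1 - p2)
    # Uncorrected normal-approximation sample size
    n_raw = (z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
             + z_beta * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2 / delta ** 2
    # Continuity correction (form for equal group sizes)
    n_cc = n_raw / 4 * (1 + math.sqrt(1 + 4 / (n_raw * delta))) ** 2
    return math.ceil(n_cc)


# 20% mortality cut in half to 10%, 95% power, two-sided alpha of 5%
print(sample_size_two_proportions(0.2, 0.1))
```

Under these assumptions the sketch returns 349 per group, in agreement with the output shown.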
Another Calculation
• A QoL scale in a particular disease has a mean score of 20 and a standard deviation of 5.
• You are conducting a placebo controlled trial to evaluate a treatment that is expected to improve the QoL by 2 points on this scale.
• You recruit n=50 into each group – what power will you achieve?
Output (sampsi)

Estimated power for two-sample comparison of means

Test Ho: m1 = m2, where m1 is the mean in population 1
                    and m2 is the mean in population 2
Assumptions:

         alpha =   0.0500  (two-sided)
            m1 =       20
            m2 =       22
           sd1 =        5
           sd2 =        5
sample size n1 =       50
            n2 =       50
         n2/n1 =     1.00

Estimated power:

         power =   0.5160
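The power figure can be checked by hand with a normal approximation. The Python sketch below is a rough illustration that assumes a two-sided z-test and ignores the negligible chance of rejecting in the wrong direction. The function name is hypothetical.

```python
import math
from statistics import NormalDist


def power_two_means(m1, m2, sd1, sd2, n1, n2, alpha=0.05):
    """Approximate power of a two-sided comparison of two means."""
    # Standard error of the difference between the two group means
    se = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    # Probability that the test statistic exceeds the critical value
    return NormalDist().cdf(abs(m2 - m1) / se - z_alpha)


# Mean 20 vs 22, standard deviation 5 in both groups, n = 50 per group
power = power_two_means(20, 22, 5, 5, 50, 50)
print(round(power, 4))
```

With these inputs the approximation agrees with the 0.5160 reported above, so a trial of this size would detect a 2-point improvement only about half the time.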
• Go to www.ucalgary.ca/~patten
• Scroll to the bottom.
• Right click to download the files described as being “for PGME Students”
– One is a dataset
– One is a data dictionary
• Save them on your desktop
Review: Comparing Proportions
• We’ve looked at several procedures for comparing proportions (e.g. for obesity in men vs. women):
    generate obese = .
    recode obese .=0 if bmi <= 30
    recode obese .=1 if bmi > 30 & bmi !=.
    tab obese, missing
    prtest obese, by(sex)
Epitab Commands
Review: Comparing Proportions
• We’ve looked at several procedures for comparing proportions (e.g. for obesity in men vs. women):
    recode sex 2=1 1=0
    cs obese sex
The output…
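. cs obese sex

                 | sex                    |
                 |   Exposed   Unexposed  |      Total
-----------------+------------------------+------------
           Cases |       961         599  |       1560
        Noncases |      5610        4405  |      10015
-----------------+------------------------+------------
           Total |      6571        5004  |      11575
                 |                        |
            Risk |  .1462487    .1197042  |   .1347732
                 |                        |
                 |      Point estimate    |    [95% Conf. Interval]
                 |------------------------+------------------------
 Risk difference |         .0265444       |    .0141393    .0389496
      Risk ratio |          1.22175       |    1.110833    1.343743
 Attr. frac. ex. |          .181502       |    .0997744      .25581
 Attr. frac. pop |         .1118099       |
                 +-------------------------------------------------
                               chi2(1) =    17.16  Pr>chi2 = 0.0000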
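The point estimates in this table follow directly from the four cell counts. As a rough check, the Python sketch below recomputes the risk ratio, the risk difference and a large-sample confidence interval for the risk ratio on the log scale. The function name is hypothetical, and the log-scale interval is an assumption about the method, although it agrees with the interval reported here.

```python
import math
from statistics import NormalDist


def risk_ratio(cases_exp, total_exp, cases_unexp, total_unexp, level=0.95):
    """Risk ratio, risk difference and a log-scale CI for the risk ratio."""
    risk_exp = cases_exp / total_exp
    risk_unexp = cases_unexp / total_unexp
    rr = risk_exp / risk_unexp
    rd = risk_exp - risk_unexp
    # Large-sample standard error of log(RR)
    se_log_rr = math.sqrt(1 / cases_exp - 1 / total_exp
                          + 1 / cases_unexp - 1 / total_unexp)
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, rd, lower, upper


# Counts from the table: 961 of 6571 exposed, 599 of 5004 unexposed
rr, rd, lower, upper = risk_ratio(961, 6571, 599, 5004)
print(rr, rd, lower, upper)
```

Because the whole interval lies above 1, the association is statistically significant at the 5% level, consistent with the chi-squared p-value.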
A “non-significant” association
    generate highgluc = .
    recode highgluc .=0 if glucose <= 140
    recode highgluc .=1 if glucose > 140 & glucose !=.
    generate female=sex
    recode female (1=0) (2=1)
    tab highgluc female, exact
How does this look with cs?
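. cs highgluc female

                 | female                 |
                 |   Exposed   Unexposed  |      Total
-----------------+------------------------+------------
           Cases |       108         110  |        218
        Noncases |      5574        4395  |       9969
-----------------+------------------------+------------
           Total |      5682        4505  |      10187
                 |                        |
            Risk |  .0190074    .0244173  |   .0213998
                 |                        |
                 |      Point estimate    |    [95% Conf. Interval]
                 |------------------------+------------------------
 Risk difference |        -.0054099       |   -.0111474    .0003276
      Risk ratio |         .7784391       |    .5986537    1.012217
 Prev. frac. ex. |         .2215609       |   -.0122169    .4013463
 Prev. frac. pop |           .12358       |
                 +-------------------------------------------------
                               chi2(1) =     3.51  Pr>chi2 = 0.0609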
Review: Try the cci command to obtain the OR
From the cs highgluc female output: cases 108 (exposed) and 110 (unexposed); noncases 5574 (exposed) and 4395 (unexposed).
Check your work with the cc command.
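As a cross-check on whatever cci or cc reports, the odds ratio can be computed by hand from the four cell counts. This is a minimal Python sketch. The function and variable names are hypothetical.

```python
def odds_ratio(cases_exp, cases_unexp, noncases_exp, noncases_unexp):
    """Cross-product odds ratio from a 2x2 table."""
    return (cases_exp * noncases_unexp) / (cases_unexp * noncases_exp)


# Counts from the cs highgluc female table
or_highgluc = odds_ratio(108, 110, 5574, 4395)

# Risk ratio from the same table, for comparison
rr_highgluc = (108 / 5682) / (110 / 4505)

print(round(or_highgluc, 4), round(rr_highgluc, 4))
```

The two estimates are close because high glucose is uncommon in both groups; with a rare outcome the odds ratio approximates the risk ratio.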
Comparing proportions?
• Yes: Fisher’s exact test
• No: parametric assumptions?
– Yes: multiple groups? Yes: ANOVA; No: t-test
– No: multiple groups? Yes: Kruskal-Wallis; No: Wilcoxon rank-sum
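The flow chart is a small decision tree, and writing it out as a function makes the branches explicit. The Python sketch below is illustrative only. The function and parameter names are hypothetical, and "multiple groups" is taken to mean more than two groups.

```python
def choose_test(comparing_proportions, parametric_ok=False, multiple_groups=False):
    """Return the test suggested by the flow chart for a given situation."""
    if comparing_proportions:
        return "Fisher's exact test"
    if parametric_ok:
        # Parametric assumptions hold
        return "ANOVA" if multiple_groups else "t-test"
    # Parametric assumptions do not hold
    return "Kruskal-Wallis" if multiple_groups else "Wilcoxon rank-sum"


print(choose_test(comparing_proportions=True))
print(choose_test(False, parametric_ok=True, multiple_groups=False))
print(choose_test(False, parametric_ok=False, multiple_groups=True))
```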
Two situations we haven’t covered…
• Severely skewed distributions
• Two continuous variables
Severely Skewed Variables
Solution: Make Some Categories
• For example:
– Non-smokers
– Light smokers (< 20)
– Moderate (20-40)
– Heavy (> 40)
• Your task: Make a variable with these categories and do a statistical test to compare men to women.
E.g. for the recoding…
    generate smoke = .
    recode smoke .=1 if cigpday==0
    recode smoke .=2 if cigpday > 0 & cigpday < 20
    recode smoke .=3 if cigpday >=20 & cigpday <= 40
    recode smoke .=4 if cigpday > 40 & cigpday !=.
    tab smoke, missing
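The same cut-points can be written as a single function, which makes the boundary cases (exactly 20 and exactly 40) easy to check. This Python sketch mirrors the recode logic for illustration. The function name is hypothetical, and `None` stands in for a missing value.

```python
def smoke_category(cigpday):
    """Map cigarettes per day to the four categories used in the recode."""
    if cigpday is None or cigpday < 0:
        return None  # missing (or invalid) stays missing
    if cigpday == 0:
        return 1     # non-smoker
    if cigpday < 20:
        return 2     # light
    if cigpday <= 40:
        return 3     # moderate
    return 4         # heavy


for value in (None, 0, 5, 20, 40, 41):
    print(value, smoke_category(value))
```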
Some output…
. tab smoke sex, exact

Enumerating sample-space combinations:
stage 4:  enumerations = 1
stage 3:  enumerations = 146
stage 2:  enumerations = 142603
stage 1:  enumerations = 0

           |          sex
     smoke |         1          2 |     Total
-----------+----------------------+----------
         1 |     2,428      4,170 |     6,598
         2 |       686      1,292 |     1,978
         3 |     1,754      1,073 |     2,827
         4 |       122         23 |       145
-----------+----------------------+----------
     Total |     4,990      6,558 |    11,548

           Fisher's exact =                 0.000
Two continuous variables
• E.g. diastolic blood pressure and BMI
• The place to start is always a scatter plot
• STATA calls this a “two way” graph
Start with Create
Select the two variables
Submit
The command produced…
• Produced by our dialogue box…
    twoway (scatter diabp sysbp)
• The same dialogue box can fit a line… (this time select “line”)
    twoway (lfit diabp sysbp)
You can combine the two…
• Try it!
    twoway (scatter diabp sysbp) (lfit diabp sysbp)
• To assess significance, use the regress command (can you find the menu option?)
    regress diabp sysbp
Note: the linear output
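• Line: y = mx + b
• diabp = 33.42 + 0.364(sysbp)

. regress diabp sysbp

      Source |       SS       df       MS              Number of obs =   11627
-------------+------------------------------           F(  1, 11625) =11928.05
       Model |  800498.474     1  800498.474           Prob > F      =  0.0000
    Residual |  780160.451 11625  67.1105764           R-squared     =  0.5064
-------------+------------------------------           Adj R-squared =  0.5064
       Total |  1580658.92 11626  135.958965           Root MSE      =  8.1921

------------------------------------------------------------------------------
       diabp |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       sysbp |   .3639623   .0033325   109.22   0.000     .3574301    .3704946
       _cons |   33.42091   .4606105    72.56   0.000     32.51804    34.32379
------------------------------------------------------------------------------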
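Several numbers in the regression output can be rebuilt from others in the same table. The Python sketch below uses only values copied from the output; the function and variable names are hypothetical. It predicts diastolic pressure from the fitted line, then recomputes R-squared as model SS over total SS and the t statistic as coefficient over standard error.

```python
# Coefficients copied from the regress output
INTERCEPT = 33.42091   # _cons
SLOPE = 0.3639623      # sysbp


def predict_diabp(sysbp):
    """Fitted diastolic pressure for a given systolic pressure (y = mx + b)."""
    return INTERCEPT + SLOPE * sysbp


# R-squared is the share of total variation explained by the model
r_squared = 800498.474 / 1580658.92

# t statistic for the slope: coefficient divided by its standard error
t_sysbp = 0.3639623 / 0.0033325

print(predict_diabp(120), round(r_squared, 4), round(t_sysbp, 2))
```

For example, a systolic pressure of 120 gives a fitted diastolic pressure of about 77.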
(In Class) Assignment for Today
• Assess whether there is an association between systolic blood pressure and death (you need to decide how)
• We’ll define elevated systolic blood pressure as being > 140 mm Hg.
– What is the risk ratio for death for people with elevated systolic blood pressure?
– Is the risk ratio statistically significant?