Computing for Research I
Spring 2012

Exploratory Data Analysis and Hypothesis Testing
February 21

Primary Instructor: Elizabeth Garrett-Mayer

Exploratory Data Analysis

• We’ve already discussed some basic stuff
  – sum and sum, detail
  – tab
• What other sorts of exploration might we do?
• Confidence intervals
  – for continuous variables
  – for categorical variables

Immediate command for CIs

Continuous:
cii N xbar s

Binary:
cii N phat
or
cii N x
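To make concrete what the binary form of cii reports, here is a small Python sketch (not Stata) that computes an exact (Clopper-Pearson) interval from N and x by bisection. It assumes that the exact binomial interval is what cii uses by default for binary data. The function names and the counts in the usage line (N = 10, x = 3) are made up for illustration.

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def solve_decreasing(f, target, iters=100):
    """Bisection on [0, 1] for f(p) = target, where f decreases in p."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid) > target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def exact_binomial_ci(n, x, level=95):
    """Clopper-Pearson interval for x successes out of n trials."""
    alpha = 1 - level / 100
    # Lower limit solves P(X >= x | p) = alpha/2; it is 0 when x = 0.
    lower = 0.0 if x == 0 else solve_decreasing(
        lambda p: binom_cdf(x - 1, n, p), 1 - alpha / 2)
    # Upper limit solves P(X <= x | p) = alpha/2; it is 1 when x = n.
    upper = 1.0 if x == n else solve_decreasing(
        lambda p: binom_cdf(x, n, p), alpha / 2)
    return lower, upper

# Hypothetical counts: 3 successes in 10 trials.
print(exact_binomial_ci(10, 3))
```

The interval is not symmetric around x/N, which is one visible difference from a large-sample (Wald) interval.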

Confidence intervals

• For a continuous variable:
  mean varlist

• Example:
  * estimate means of ceramide variables
  mean c18ceramide
  mean totalc - s1pc1

Additional options

tab initialre initial
mean c18ceramide, over(initialre)

mean c18ceramide, vce(bootstr)

mean c18ceramide, vce(bootstr) over(initialre)
mean c18ceramide, over(initialre)

mean c18ceramide, level(90)

Confidence intervals for proportion

proportion varlist

Examples
proportion failure

proportion failure death initialre

proportion failure, vce(bootstr)
proportion failure, cluster(patient)
proportion failure, level(90)

Hypothesis Testing

• A number of different approaches
• Options
  – nonparametric vs. parametric
  – continuous vs. categorical (vs. other?)
  – one vs. two vs. more than two groups

One sample t-tests

• ttesti N mean sd null
• ttest varname == null
• ttest var1 == var2 *paired

• Examples:
  ttesti 20 48 2.75 50
  ttest c18c == 10
  ttest frombaselines1p==100
  ttest frombaselinec18==100
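As a check on what the immediate form computes, this Python sketch (not Stata) reproduces the test statistic for the first example, ttesti 20 48 2.75 50, from the usual one-sample formula t = (mean - null) / (sd / sqrt(N)) with N - 1 degrees of freedom. The function name is made up for illustration. The p-value is left out because it needs the t distribution, which the Python standard library does not provide.

```python
from math import sqrt

def one_sample_t(n, mean, sd, null):
    """Return the one-sample t statistic and its degrees of freedom."""
    se = sd / sqrt(n)          # standard error of the sample mean
    t = (mean - null) / se
    return t, n - 1

# Arguments in the same order as: ttesti 20 48 2.75 50
t, df = one_sample_t(20, 48, 2.75, 50)
print(f"t = {t:.4f}, df = {df}")
```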

Two sample t-tests

ttesti N1 mean1 sd1 N2 mean2 sd2
ttest varname1 == varname2, unpaired
ttest varname, by(groupvar)

Examples:

ttest c18, by(sex)
ttest c18, by(sex) unequal
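The unequal option changes both the standard error and the degrees of freedom. The Python sketch below (not Stata) shows both versions side by side from summary statistics. It assumes the pooled-variance formula for the default and Satterthwaite's approximation for the unequal-variance degrees of freedom. The function name and the summary statistics in the usage lines are hypothetical.

```python
from math import sqrt

def two_sample_t(n1, m1, s1, n2, m2, s2, unequal=False):
    """Two-sample t statistic and degrees of freedom from summary stats."""
    v1, v2 = s1**2 / n1, s2**2 / n2
    if unequal:
        # Separate variances, Satterthwaite degrees of freedom.
        se = sqrt(v1 + v2)
        df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    else:
        # Pooled variance, n1 + n2 - 2 degrees of freedom.
        df = n1 + n2 - 2
        sp2 = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / df
        se = sqrt(sp2 * (1 / n1 + 1 / n2))
    return (m1 - m2) / se, df

# Hypothetical groups with clearly different spreads.
print(two_sample_t(12, 5.0, 1.0, 8, 3.0, 4.0))
print(two_sample_t(12, 5.0, 1.0, 8, 3.0, 4.0, unequal=True))
```

When the two groups have the same size and the same standard deviation, the two versions give the same t statistic and the same degrees of freedom.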

Nonparametric?

• ranksum: two group comparison
• kwallis: >= two group comparison
• signrank: matched pairs signed ranks test
• signtest: sign test of matched pairs

Nonparametric?

*nonparametric tests
ranksum c18, by(sex)
kwallis c18, by(sex)

use ceramide.alldata, clear
keep if cycle==3
gen c18dif = frombaselinec18-100
signrank c18dif=0
signrank frombaselinec18=100
signtest c18dif=0
signtest frombaselinec18=100
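The sign test is simple enough to sketch by hand. The Python code below (not Stata) counts how many observations fall above the null value, drops ties, and refers that count to a Binomial(n, 0.5) distribution. The two-sided p-value is taken as twice the smaller tail, capped at 1, which is an assumption about the convention rather than a statement of what signtest prints. The function name and the percent-of-baseline values are hypothetical.

```python
from math import comb

def sign_test(values, null=0.0):
    """Return (# above null, # non-tied, two-sided exact p-value)."""
    diffs = [v - null for v in values if v != null]   # drop ties
    n = len(diffs)
    n_pos = sum(d > 0 for d in diffs)
    # Tail probabilities under Binomial(n, 0.5).
    upper = sum(comb(n, k) for k in range(n_pos, n + 1)) / 2**n   # P(X >= n_pos)
    lower = sum(comb(n, k) for k in range(n_pos + 1)) / 2**n      # P(X <= n_pos)
    return n_pos, n, min(1.0, 2 * min(upper, lower))

# Hypothetical percent-of-baseline values tested against 100.
print(sign_test([101, 105, 110, 120, 130, 99, 100], null=100))
```

Only the direction of each difference enters the calculation. signrank also uses the ranks of the magnitudes, which is why the two tests can disagree.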

Anova

• anova y x
  (note that x is assumed to be categorical)
  anova y x1 x2

Examples:
anova c18c initialre
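For a single categorical x, the F statistic in the anova table is the between-group mean square divided by the within-group mean square. The Python sketch below (not Stata) computes it directly. The function name and the three small groups are made up for illustration and have nothing to do with the ceramide data.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of numbers."""
    all_values = [v for g in groups for v in g]
    grand_mean = sum(all_values) / len(all_values)
    means = [sum(g) / len(g) for g in groups]
    # Between-group sum of squares: group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Within-group sum of squares: observations around their own group mean.
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within

# Hypothetical data: three groups of three observations each.
print(one_way_anova_f([[1, 2, 3], [2, 3, 4], [5, 6, 7]]))
```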

One sample binomial tests

• prtest and bitest
• Difference?
  – prtest uses large sample approximations
  – bitest uses exact test

bitest varname==p0
bitesti N x p0
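The difference between the two approaches shows up clearly in small samples. The Python sketch below (not Stata) computes a two-sided exact binomial p-value and the corresponding large-sample z-test p-value for the same data. It assumes the exact two-sided p-value sums the probabilities of all outcomes no more likely than the one observed. The function names and the counts (N = 20, x = 15, p0 = 0.5) are hypothetical.

```python
from math import comb, sqrt
from statistics import NormalDist

def exact_binomial_p(n, x, p0):
    """Two-sided exact p-value: total probability of outcomes
    that are no more likely than the observed count x."""
    pmf = [comb(n, k) * p0**k * (1 - p0) ** (n - k) for k in range(n + 1)]
    cutoff = pmf[x] * (1 + 1e-9)      # small tolerance for float ties
    return min(1.0, sum(q for q in pmf if q <= cutoff))

def normal_approx_p(n, x, p0):
    """Two-sided p-value from the large-sample z statistic."""
    z = (x / n - p0) / sqrt(p0 * (1 - p0) / n)
    return 2 * (1 - NormalDist().cdf(abs(z)))

# Hypothetical data: 15 successes in 20 trials, null proportion 0.5.
print(exact_binomial_p(20, 15, 0.5), normal_approx_p(20, 15, 0.5))
```

With only 20 observations the approximation gives a noticeably smaller p-value than the exact test. The gap shrinks as N grows.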

One sample binomial tests

use "SCBC2004.v9.dta", clear
replace ercat=. if ercat==9
gen ercatn=cond(ercat==2,0,1)
replace ercatn=. if ercat==.
tab ercat ercatn

bitest ercatn=0.50
bitest ercatn=0.65
prtest ercatn=0.65

Two (or more) sample binomial tests

tab y x, exact
tab y x, chi

tab ercatn grade
tab ercatn stage
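For a 2x2 table, the two options correspond to Fisher's exact test and Pearson's chi-square test. The Python sketch below (not Stata) computes both from the four cell counts. It assumes the two-sided Fisher p-value sums the probabilities of all tables no more likely than the observed one, and it uses the fact that a chi-square with 1 degree of freedom is a squared standard normal. The function names and the cell counts are hypothetical and are not taken from the breast cancer data.

```python
from math import comb, erfc, sqrt

def chi2_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) and p-value, 1 df.
    Table layout: row 1 = (a, b), row 2 = (c, d)."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p = erfc(sqrt(chi2 / 2))      # upper tail of chi-square with 1 df
    return chi2, p

def fisher_exact_2x2(a, b, c, d):
    """Two-sided Fisher exact p-value with all margins held fixed."""
    r1, r2, c1, n = a + b, c + d, a + c, a + b + c + d

    def prob(k):
        # Hypergeometric probability that the top-left cell equals k.
        return comb(r1, k) * comb(r2, c1 - k) / comb(n, c1)

    cutoff = prob(a) * (1 + 1e-9)  # small tolerance for float ties
    k_min, k_max = max(0, c1 - r2), min(r1, c1)
    total = sum(prob(k) for k in range(k_min, k_max + 1) if prob(k) <= cutoff)
    return min(1.0, total)

# Hypothetical 2x2 table of counts.
print(chi2_2x2(3, 1, 1, 3), fisher_exact_2x2(3, 1, 1, 3))
```

With cell counts this small the two p-values differ substantially, which is the usual argument for preferring the exact test when expected counts are low.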