Useful Statistical Tools
February 19, 2010
Today’s Class
• Aphorisms
• Useful Statistical Tools
• Probing Question
• Assignments
• Surveys
Aphorisms
“Get close enough to know the task, but stay far enough to see the patterns.”
“Humor happens, embrace it.”
“Much like improv, prom night, and getting into fights, the key to good contextual inquiry is to always say yes.”
“Learn as though you would never be able to master it; hold it as though you would be in fear of losing it.”
“Until you learn to interpret openly, you open yourself to mis-interpretation.”
“To know an answer, you must ask a question. To know a truth, you must contextually inquire the right question.”
“Your participant does all the hard stuff. All you have to do is talk about it and check your work.”
“You cannot learn if you already know, unless you first learn how to forget!”
“Listen to the people around you, including to those you know well -- but listen deeper.”
“Do, or do not. There is no try.”
Any guesses
• From those who did not email in?
• Juelaila has won the first cookie
• There is one cookie remaining
Cookies!
• “Do, or do not. There is no try.”
  – Juelaila answered first
• “Until you learn to interpret openly, you open yourself to mis-interpretation.”
  – No answers
Let’s discuss
• A few of these aphorisms
• Do you think that they help us understand the idea and practice of contextual inquiry better?
Your thoughts?
• “Get close enough to know the task, but stay far enough to see the patterns.”
Your thoughts?
• “Much like improv, prom night, and getting into fights, the key to good contextual inquiry is to always say yes.”
Your thoughts?
• “Until you learn to interpret openly, you open yourself to mis-interpretation.”
Your thoughts?
• “Your participant does all the hard stuff. All you have to do is talk about it and check your work.”
Your thoughts?
• “You cannot learn if you already know, unless you first learn how to forget!”
Comments? Questions?
Useful Statistical Tools
• Power Analysis
• Meta-Analysis
• Imputation
Power Analysis
• A set of methods for determining
• The probability that you will obtain a statistically significant result, assuming a true effect size and sample size of a certain magnitude
Or
• The reverse
• Given a certain true effect size, and a desired probability of obtaining a statistically significant result, what sample size is needed?
Why? When?
• Why might a researcher want to do each type of power analysis?
• When might a researcher want to do each type of power analysis?
When used
• Effect size + Power --> Sample Size
  – Usually used before running study to pick sample size
• Effect size + Sample Size --> Power
  – Usually used after running study to explain to thesis committee why more subjects are needed
Power analysis
• Can be computed from
  – “Effect Size” / Cohen’s d
    • (M1 – M2) / (pooled SD, e.g. σ)
  – r
  – Difference in two r values
  – And several other metrics
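As a minimal sketch of the Cohen's d formula above, the following computes (M1 – M2) divided by the pooled standard deviation for two groups of scores. The function name `cohens_d` and the example scores are illustrative assumptions, not part of the lecture.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(group1, group2):
    """Cohen's d: difference in means divided by the pooled standard deviation."""
    n1, n2 = len(group1), len(group2)
    # Pooled variance weights each group's sample variance by its degrees of freedom.
    pooled_var = ((n1 - 1) * variance(group1) + (n2 - 1) * variance(group2)) / (n1 + n2 - 2)
    return (mean(group1) - mean(group2)) / sqrt(pooled_var)


# Hypothetical scores, chosen only so the arithmetic is easy to follow:
# means are 4 and 3, both sample variances are 4, so pooled SD = 2 and d = 0.5.
print(cohens_d([2, 4, 6], [1, 3, 5]))
```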
Power analysis
• Can be computed for
  – Single-group t-test
  – Two-group t-test
  – Paired t-test
  – F test
  – Sign test
  – Etc., etc., etc.
Mathematical Details
• Differ for different statistical tests and metrics
• Possible to do this in online power calculators
Sign Test Example (Courtesy of John McDonald)
What is a good value for power?
• Conventionally, power = 0.80 is treated as “good”
• Kind of a magic number
Comments? Questions?
I need 3 volunteers
Play with calculator
• http://www.cs.uiowa.edu/~rlenth/Power/
• Two-sample t-test
Volunteer #1
• If the true effect size is 0.5 σ, how big a sample do you need to achieve Power = 0.8?
Volunteer #2
• If the true effect size is 0.2 σ, how big a sample do you need to achieve Power = 0.8?
Volunteer #3
• If your control condition gains 20 points pre-post
• And your experimental condition gains 40 points pre-post
• And the pooled standard deviation is 30 points
• And you have 20 students in each condition
• What’s your statistical power?
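The three volunteer questions can be checked roughly without the online calculator. The sketch below is not the calculator's method: it assumes a two-sided test at alpha = 0.05 and replaces the t distribution with a normal approximation, so an exact t-based calculator will typically report a slightly larger sample size and slightly lower power. The function names are illustrative.

```python
from math import ceil, sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()


def sample_size_per_group(d, power=0.80, alpha=0.05):
    """Approximate n per group for a two-sample test (normal approximation)."""
    z_alpha = STD_NORMAL.inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_power = STD_NORMAL.inv_cdf(power)
    return ceil(2 * ((z_alpha + z_power) / d) ** 2)


def power_two_sample(d, n_per_group, alpha=0.05):
    """Approximate power for a two-sample test with equal group sizes.

    Ignores the negligible chance of rejecting in the wrong direction.
    """
    z_alpha = STD_NORMAL.inv_cdf(1 - alpha / 2)
    return STD_NORMAL.cdf(d * sqrt(n_per_group / 2) - z_alpha)


print(sample_size_per_group(0.5))   # Volunteer #1: 63 per group under this approximation
print(sample_size_per_group(0.2))   # Volunteer #2: 393 per group under this approximation
d = (40 - 20) / 30                  # Volunteer #3: difference in gains / pooled SD
print(round(power_two_sample(d, 20), 2))  # about 0.56
```

Note how shrinking the effect size from 0.5 σ to 0.2 σ multiplies the required sample by roughly six, since the sample size scales with 1/d².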
Comments? Questions?
How can statistical power be increased?
• Both in theory, and in real life
How can statistical power be increased?
• Increase sample size
How can statistical power be increased?
• Increase difference in means
  – Make your intervention better
How can statistical power be increased?
• Increase difference in means
  – Make your control condition worse
    • Some researchers make the mistake of picking a control condition that’s impossibly good
      – ScienceAssistments versus ScienceAssistments, with one less potential IV
    • This doesn’t mean you should fish for a control condition that is absurdly awful
      – DrScheme versus Learning programming through interpretive dance
      – Miley’s World versus Learning math through reading textbooks written in Danish
How can statistical power be increased?
• Reduce standard deviation
  – What methods have we discussed in class that could help us do this?
    • Stratification
Comments? Questions?
Meta-Analysis
Meta-Analysis
• Very important point, right up front
• There is meta-analysis
• And then there are the statistical techniques used in meta-analysis
  – Much broader in application than just classical meta-analysis!
Meta-Analysis
• In the classic sense, integrating across a set of previous studies, to attempt to find an overall effect size or significance of finding across all those studies
Examples
• Kulik & Kulik (1991) found that computer-aided instruction does 0.3 σ better than traditional instruction
• Cohen, Kulik, & Kulik (1982) found that expert tutors do 2.3 σ better than traditional instruction; novice tutors only do 0.4 σ better than traditional instruction
Process of doing a meta-analysis
• Find all the studies on topic of interest
• Find measure of interest (effect size or statistical significance)
• Integrate across studies
Challenges
• What might make it difficult to find all the studies on topic of interest?
Challenges to Finding all Studies
• Knowing what terminology to use in literature review – many phenomena have many names
  – Off-task behavior, Time-on-task, Percent On-Task, Attention
  – Gaming the system, Systematic Guessing, Hint Abuse, Help Abuse, Executive Help-Seeking, Letaxmaning, Off-Task Gaming Behavior, Player Transformation, Goal Structure Misalignment
Challenges to Finding all Studies
• “File-Drawer Effect”
  – Papers with null results get rejected by conference program committees and journal reviewers
  – Papers with null results don’t get submitted in the first place
Find measure of interest
• Statistical significance
  – If you can find a p, you can turn it into a Z, and you’re good to go
    • Using Z formula in Excel, or a Z-p table
  – Set direction on Z to be consistent
    • E.g. all studies with finding X are positive
    • All studies with finding not-X are negative
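The p-to-Z step can also be done in code instead of a table. This is a small sketch using the standard normal inverse CDF; the function name `p_to_z` and the `two_tailed` switch are illustrative assumptions, since the slides do not say whether the reported p values are one- or two-tailed.

```python
from statistics import NormalDist


def p_to_z(p, positive, two_tailed=True):
    """Turn a reported p value into a signed Z score.

    `positive` is True when the study's finding is in the direction we have
    chosen to call positive (finding X), False for the opposite (not-X).
    """
    tail = p / 2 if two_tailed else p          # area in the tail beyond |Z|
    z = NormalDist().inv_cdf(1 - tail)         # magnitude of Z
    return z if positive else -z


print(round(p_to_z(0.05, positive=True), 2))                     # 1.96 (two-tailed)
print(round(p_to_z(0.05, positive=False, two_tailed=False), 2))  # -1.64 (one-tailed)
```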
Find measure of interest
• Effect size
  – Transform values into correlations or Cohen’s d values
Why might you…
• Why might you want to do meta-analysis on effect size versus statistical significance?
Integrating Across Studies
• Two cases
• Studies are independent
• Studies are non-independent
Studies are Independent
• By far the statistically easier case
Aggregating significance tests
• Stouffer’s Z
• For N studies, each with Z value

$$Z_{\text{Stouffer}} = \frac{\sum Z}{\sqrt{N}}$$
Volunteer?
Example
• Five studies on the effects of taking gym class on mathematics performance
  – Two studies found positive effect of taking gym class, p=0.02, p=0.06
  – Three studies found negative effect of taking gym class, p=0.05, p=0.11, p=0.75
  – One-tailed Z table on the next slide
Z table
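One way to work the gym class example is sketched below. It rests on an assumption the slides do not state: the five p values are treated as two-tailed, so each is halved before the Z lookup, and then signed by the direction of the finding. Under a different convention the individual Z values, and therefore the combined Z, would differ.

```python
from math import sqrt
from statistics import NormalDist


def stouffer_z(z_values):
    """Stouffer's Z: sum of the per-study Z scores divided by sqrt(N)."""
    return sum(z_values) / sqrt(len(z_values))


# (p value, direction) pairs from the example; +1 = positive effect of gym class.
studies = [(0.02, +1), (0.06, +1), (0.05, -1), (0.11, -1), (0.75, -1)]

# Assumption: p values are two-tailed, so the tail area beyond |Z| is p / 2.
z_scores = [sign * NormalDist().inv_cdf(1 - p / 2) for p, sign in studies]

combined = stouffer_z(z_scores)
print(round(combined, 2))  # about 0.15: the positive and negative findings nearly cancel
```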
Aggregating correlations
• Convert r to Fisher z’
• For N studies, each with z’ value

$$\bar{z}' = \frac{\sum z'}{N}$$
• Then convert the result back to r
Why Fisher z’?
• Equal differences between any two Fisher z’ values are equal in significance
• Whereas r is uneven
  – From r=0.8 to 0.9 is a bigger difference in significance than r=0.2 to r=0.3
  – So transformation is necessary to weight all differences in correlation equally
Volunteer?
Example
• Five studies on the effects of learning computer programming on popularity
  – Two studies found positive correlation, r = 0.1, r = 0.3
  – Three studies found negative correlation, r = -0.8, r = -0.6, r = -0.7
  – Fisher z’ table on the next slide
r    z'       r    z'       r    z'       r    z'       r    z'       r    z'
0.00 0.0000   0.18 0.1820   0.35 0.3654   0.53 0.5901   0.69 0.8480   0.86 1.2933
0.01 0.0100   0.19 0.1923   0.36 0.3769   0.54 0.6042   0.70 0.8673   0.87 1.3331
0.02 0.0200   0.20 0.2027   0.37 0.3884   0.55 0.6184   0.71 0.8872   0.88 1.3758
0.03 0.0300   0.21 0.2132   0.38 0.4001   0.56 0.6328   0.72 0.9076   0.89 1.4219
0.04 0.0400   0.22 0.2237   0.39 0.4118   0.57 0.6475   0.73 0.9287   0.90 1.4722
0.05 0.0500   0.23 0.2342   0.40 0.4236   0.58 0.6625   0.74 0.9505   0.91 1.5275
0.06 0.0601   0.24 0.2448   0.41 0.4356   0.59 0.6777   0.75 0.9730   0.92 1.5890
0.07 0.0701   0.25 0.2554   0.42 0.4477   0.60 0.6931   0.76 0.9962   0.93 1.6584
0.08 0.0802   0.26 0.2661   0.43 0.4599   0.61 0.7089   0.77 1.0203   0.94 1.7380
0.09 0.0902   0.27 0.2769   0.44 0.4722   0.62 0.7250   0.78 1.0454   0.95 1.8318
0.10 0.1003   0.28 0.2877   0.45 0.4847   0.63 0.7414   0.79 1.0714   0.96 1.9459
0.11 0.1104   0.29 0.2986   0.46 0.4973   0.64 0.7582   0.80 1.0986   0.97 2.0923
0.12 0.1206   0.30 0.3095   0.47 0.5101   0.65 0.7753   0.81 1.1270   0.98 2.2976
0.13 0.1307   0.31 0.3205   0.48 0.5230   0.66 0.7928   0.82 1.1568   0.99 2.6467
0.14 0.1409   0.32 0.3316   0.49 0.5361   0.67 0.8107   0.83 1.1881
0.15 0.1511   0.33 0.3428   0.50 0.5493   0.68 0.8291   0.84 1.2212
0.16 0.1614   0.34 0.3541   0.51 0.5627                 0.85 1.2562
0.17 0.1717                 0.52 0.5763
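The table lookup can be replaced by the closed form: Fisher z’ is the inverse hyperbolic tangent of r, and the conversion back is the hyperbolic tangent. The sketch below works the programming-and-popularity example with the unweighted mean of z’ values described above; the function name is an illustrative assumption.

```python
from math import atanh, tanh


def average_correlation(r_values):
    """Average correlations via Fisher z': transform, take the mean, transform back."""
    z_primes = [atanh(r) for r in r_values]   # r -> Fisher z' (matches the table)
    mean_z = sum(z_primes) / len(z_primes)    # unweighted mean across the N studies
    return tanh(mean_z)                       # mean z' -> r


# The five correlations from the example.
r_combined = average_correlation([0.1, 0.3, -0.8, -0.6, -0.7])
print(round(r_combined, 2))  # about -0.42
```

A plain arithmetic mean of the five r values would give -0.34, so the transformation gives the strong negative correlations more weight, as the "Why Fisher z’?" slide suggests.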
Comments? Questions?
Studies are non-independent
• Generally taken to mean that same sample (at least in part) is involved
• The case where there is non-independence due to similar (or the same) learning materials is generally not considered, as inter-correlation is low and difficult to compute
Math is “complex”
• Strube’s (1985) Adjusted Z is used instead of Stouffer’s Z in these cases
  – Accounts for correlation of different data points for the same subject
• Similar approach for effect size
Comments? Questions?
Other Uses of These Techniques
Non-independence in modeling
• Take the case where you are studying whether an EDM model is statistically significantly different than chance
  – N actions involving M students
• It is extremely invalid to do a statistical significance test involving N actions
  – Assumes each action is independent of each other action
• But it biases towards non-significance to collapse the N data points into one data point per student
Solution
• Do separate statistical significance test within each student (actions can be treated as independent of each other, once student is accounted for)
• Then use Stouffer’s Z to aggregate across students
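The two-step solution can be sketched as follows. Everything specific here is an assumption for illustration: the per-student counts are made up, chance is taken to be 0.5, and the within-student test is a normal approximation to a binomial test. The cited papers should be consulted for the tests actually used.

```python
from math import sqrt
from statistics import NormalDist

CHANCE = 0.5  # assumed chance rate of a correct prediction

# Hypothetical data: student -> (actions the model predicted correctly, total actions).
students = {"s1": (14, 20), "s2": (9, 15), "s3": (22, 30)}


def within_student_z(hits, n, p=CHANCE):
    """Z for one student's hit count versus chance (normal approximation to binomial)."""
    return (hits - n * p) / sqrt(n * p * (1 - p))


# Step 1: one test per student, treating that student's actions as independent.
z_per_student = [within_student_z(hits, n) for hits, n in students.values()]

# Step 2: Stouffer's Z across the M students.
combined_z = sum(z_per_student) / sqrt(len(z_per_student))
one_tailed_p = 1 - NormalDist().cdf(combined_z)

print(round(combined_z, 2), round(one_tailed_p, 4))
```

The unit of aggregation is the student, so N in Stouffer's formula is the number of students (M), not the number of actions.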
To see examples…
• There is not time to discuss the math in detail today, but see examples in
  – Baker, Corbett, & Aleven (2008)
  – Baker, Corbett, Roll, & Koedinger (2008)
Comments? Questions?
Imputation
Imputation
• In data sets with large amounts of data per data point
  – For instance, extremely long surveys or demographic data
• It is common to have small amounts of missing data in each data point
  – E.g. variable 17 missing for students 1, 14, 90, 112, 202, 477
In these cases…
• It may be undesirable to throw out every data point that has a missing response
  – You might end up losing 30-40% of your data, or more, and biasing your data
• For instance, people who occasionally fail to respond to survey items probably differ systematically from people who diligently and carefully answer every question
Imputation
• For each data point missing a value
• Find a set of “similar” data points that are not missing that value
  – A similar data point has low absolute difference across non-missing variables
• Randomly choose one of the non-missing values to fill in the missing data
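The procedure above can be sketched as follows. This is one possible reading, not a reference implementation: "similar" is taken to mean the k donors with the smallest mean absolute difference over variables both rows have, missing values are represented as `None`, and the function names and the value of k are assumptions.

```python
import random


def distance(a, b):
    """Mean absolute difference over variables observed in both data points."""
    shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not shared:
        return float("inf")  # nothing to compare on, so treat as maximally dissimilar
    return sum(abs(x - y) for x, y in shared) / len(shared)


def impute_from_similar(rows, k=3, rng=None):
    """Fill each missing value with the value of a randomly chosen similar data point."""
    rng = rng or random.Random()
    filled = [list(row) for row in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            # Donors: other data points that are not missing variable j.
            donors = [d for n, d in enumerate(rows) if n != i and d[j] is not None]
            if not donors:
                continue  # nobody has this variable; leave it missing
            donors.sort(key=lambda d: distance(row, d))
            filled[i][j] = rng.choice(donors[:k])[j]
    return filled


data = [[1.0, 2.0, 3.0], [1.1, 2.1, None], [9.0, 9.0, 9.0]]
print(impute_from_similar(data, k=1))  # the gap is filled from the nearest row: 3.0
```

Calling the function several times with k greater than 1 and different random seeds yields several differently imputed copies of the data set.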
Multiple Imputation
• Create 3-10 data sets in this fashion
• Then for all the missing data, find the mean (and SD) across all imputed data sets
• Use the no-longer-missing data in future analyses
An alternative: regression imputation
• Find set of linear regression functions predicting each variable from all other variables
• Use this function to fill in missing data
Advantages? Disadvantages?
• Multiple Imputation
• Regression Imputation
• Throwing out all data points with missing variables
Comments? Questions?
Probing Question
• Observation: Relatively few researchers use power analysis when designing their studies.
• Why?
• Are they making a mistake?
Assignment #5
• Any questions?