Inference Using Formulas

Statistics: Unlocking the Power of Data Inference Using Formulas STAT 101 Dr. Kari Lock Morgan Chapter 6 t-distribution Formulas for standard errors Normal and t based inference Matched pairs

Transcript of Inference Using Formulas

Page 1: Inference Using Formulas

Statistics: Unlocking the Power of Data Lock5

Inference Using Formulas

STAT 101

Dr. Kari Lock Morgan

Chapter 6
• t-distribution
• Formulas for standard errors
• Normal and t based inference
• Matched pairs

Page 2: Inference Using Formulas

Confidence Interval Formula

$\text{sample statistic} \pm z^* \times SE$

• sample statistic: from original data
• $z^*$: from N(0,1)
• SE: from bootstrap distribution

IF SAMPLE SIZES ARE LARGE…
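As a rough illustration of the interval formula on this slide, the sketch below computes statistic ± z* × SE in Python. The function name `normal_confidence_interval` and the input numbers are assumptions made up for the example, not values from the slides; z* is taken from the standard library's `statistics.NormalDist`.

```python
from statistics import NormalDist

def normal_confidence_interval(statistic, se, confidence=0.95):
    """Return (lower, upper) for statistic ± z* × SE, with z* from N(0,1)."""
    # z* leaves (1 - confidence) / 2 in each tail of the standard normal.
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_star * se
    return statistic - margin, statistic + margin

# Hypothetical inputs: a sample proportion of 0.60 with a standard error of 0.05.
lower, upper = normal_confidence_interval(0.60, 0.05)
print(round(lower, 3), round(upper, 3))  # 0.502 0.698
```

The SE passed in could come from a bootstrap distribution or from a formula; the function does not care which.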

Page 3: Inference Using Formulas

Formula for p-values

$z = \dfrac{\text{sample statistic} - \text{null value}}{SE}$

• sample statistic: from original data
• null value: from H0
• SE: from randomization distribution

Compare z to N(0,1) for p-value

IF SAMPLE SIZES ARE LARGE…
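A minimal sketch of the p-value recipe on this slide: standardize the sample statistic, then compare z to N(0,1). The function name `z_test_p_value`, the `alternative` argument, and the numbers are hypothetical choices for illustration.

```python
from statistics import NormalDist

def z_test_p_value(statistic, null_value, se, alternative="two-sided"):
    """Return (z, p-value) for z = (statistic - null value) / SE against N(0,1)."""
    z = (statistic - null_value) / se
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if alternative == "greater":
        p = 1 - std_normal.cdf(z)
    elif alternative == "less":
        p = std_normal.cdf(z)
    else:
        # Two-sided: double the area beyond |z| in one tail.
        p = 2 * (1 - std_normal.cdf(abs(z)))
    return z, p

# Hypothetical inputs: statistic 0.60, null value 0.50, SE 0.05.
z, p = z_test_p_value(0.60, 0.50, 0.05)
print(round(z, 2), round(p, 4))  # 2.0 0.0455
```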

Page 4: Inference Using Formulas

Standard Error

• Wouldn’t it be nice if we could compute the standard error without doing thousands of simulations?

• We can!!!

Page 5: Inference Using Formulas

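Standard Error Formulas

Parameter: distribution; standard error

• Proportion: Normal; $SE = \sqrt{\dfrac{p(1-p)}{n}}$

• Difference in Proportions: Normal; $SE = \sqrt{\dfrac{p_1(1-p_1)}{n_1} + \dfrac{p_2(1-p_2)}{n_2}}$

• Mean: t, df = n – 1; $SE = \sqrt{\dfrac{\sigma^2}{n}}$

• Difference in Means: t, df = min(n1, n2) – 1; $SE = \sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}$

• Correlation: t, df = n – 2; $SE = \sqrt{\dfrac{1-\rho^2}{n-2}}$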
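The standard error formulas on this slide translate directly into small Python functions. This is a sketch under stated assumptions: the function names are hypothetical, and the correlation formula used is the standard $\sqrt{(1-\rho^2)/(n-2)}$. Arguments are population parameters (or whatever is plugged in for them).

```python
import math

def se_proportion(p, n):
    """SE of a sample proportion: sqrt(p(1-p)/n)."""
    return math.sqrt(p * (1 - p) / n)

def se_diff_proportions(p1, n1, p2, n2):
    """SE of a difference in proportions from two independent groups."""
    return math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)

def se_mean(sigma, n):
    """SE of a sample mean: sqrt(sigma^2 / n)."""
    return math.sqrt(sigma ** 2 / n)

def se_diff_means(sigma1, n1, sigma2, n2):
    """SE of a difference in means from two independent groups."""
    return math.sqrt(sigma1 ** 2 / n1 + sigma2 ** 2 / n2)

def se_correlation(rho, n):
    """SE of a sample correlation: sqrt((1 - rho^2) / (n - 2))."""
    return math.sqrt((1 - rho ** 2) / (n - 2))

# Hypothetical inputs, chosen only to give round answers.
print(se_proportion(0.5, 100))    # 0.05
print(se_mean(10, 25))            # 2.0
print(se_correlation(0.0, 27))    # 0.2
```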

Page 6: Inference Using Formulas

SE Formula Observations

• n is always in the denominator (larger sample size gives smaller standard error)

• Standard error is related to the square root of 1/n

• Standard error formulas use population parameters… (uh oh!)

• For intervals, plug in the sample statistic(s) as your best guess at the parameter(s)

• For testing, plug in the null value for the parameter(s), because you want the distribution assuming H0 is true
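The plug-in rule and the square-root-of-1/n behavior can be checked numerically. In the sketch below, the values of `p_hat`, `p0`, and `n` are hypothetical, and a single proportion is used only because it is the simplest case.

```python
import math

def se_proportion(p, n):
    """SE of a sample proportion: sqrt(p(1-p)/n)."""
    return math.sqrt(p * (1 - p) / n)

# Hypothetical values: observed proportion, null proportion, sample size.
p_hat, p0, n = 0.60, 0.50, 100

# Interval: plug in the sample statistic as the best guess at p.
se_for_interval = se_proportion(p_hat, n)

# Test: plug in the null value, since the distribution assumes H0 is true.
se_for_test = se_proportion(p0, n)

# Quadrupling the sample size halves the standard error.
ratio = se_proportion(p_hat, 4 * n) / se_proportion(p_hat, n)

print(round(se_for_interval, 4), round(se_for_test, 4), round(ratio, 4))
```

The two standard errors differ slightly because they answer different questions: one describes variability around the estimate, the other variability if the null hypothesis were true.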

Page 7: Inference Using Formulas

Null Values

• Single proportion: H0: p = p0 => use p0 for p

• Difference in proportions: H0: p1 = p2 => use the overall sample proportion from both groups (called the pooled proportion) as an estimate for both p1 and p2

• Means: Standard deviations have nothing to do with the null, so just use the sample statistic s

• Correlation: H0: ρ = 0 => use ρ = 0
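For the difference-in-proportions case, the pooled proportion can be sketched as follows. The function name `pooled_two_proportion_z` and the counts are assumptions for illustration; the test is two-sided and uses the large-sample normal approximation.

```python
import math
from statistics import NormalDist

def pooled_two_proportion_z(x1, n1, x2, n2):
    """Test H0: p1 = p2 using the pooled proportion in the standard error."""
    p1_hat, p2_hat = x1 / n1, x2 / n2
    # Pooled proportion: overall success rate across both groups combined.
    pooled = (x1 + x2) / (n1 + n2)
    # Under H0 both groups share one p, so the pooled value replaces p1 and p2.
    se = math.sqrt(pooled * (1 - pooled) / n1 + pooled * (1 - pooled) / n2)
    z = (p1_hat - p2_hat) / se  # the null value for p1 - p2 is 0
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return pooled, se, z, p_value

# Hypothetical counts: 60 successes out of 100 vs. 45 successes out of 100.
pooled, se, z, p_value = pooled_two_proportion_z(60, 100, 45, 100)
print(pooled, round(se, 4), round(z, 2))  # 0.525 0.0706 2.12
```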

Page 8: Inference Using Formulas

t-distribution

• For quantitative data, we use a t-distribution instead of the normal distribution

• This arises because we have to estimate the standard deviations

• The t-distribution is very similar to the standard normal, but with slightly fatter tails (to reflect the uncertainty in the sample standard deviations)
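To see the "slightly fatter tails" numerically, the sketch below compares the area above 2 under t-distributions and under the standard normal. It sticks to the standard library, so the t tail area is approximated by integrating the t density with Simpson's rule; the helper names `t_pdf` and `t_upper_tail` are hypothetical, and a statistics package would normally supply this directly.

```python
import math
from statistics import NormalDist

def t_pdf(x, df):
    """Density of the t-distribution with df degrees of freedom."""
    # lgamma avoids overflow in the gamma functions for large df.
    log_coef = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    coef = math.exp(log_coef) / math.sqrt(df * math.pi)
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def t_upper_tail(x, df, steps=2000):
    """Approximate P(T > x) for x >= 0 (steps must be even)."""
    # The density is symmetric, so P(T > x) = 0.5 - integral of the pdf on [0, x].
    h = x / steps
    total = t_pdf(0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

normal_tail = 1 - NormalDist().cdf(2)
print(round(t_upper_tail(2, 5), 3))   # about 0.051
print(round(t_upper_tail(2, 30), 3))  # about 0.027
print(round(normal_tail, 3))          # about 0.023
```

The tail area shrinks toward the normal value as the degrees of freedom grow.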

Page 9: Inference Using Formulas

Degrees of Freedom

• The t-distribution is characterized by its degrees of freedom (df)

• Degrees of freedom are based on sample size
  • Single mean: df = n – 1
  • Difference in means: df = min(n1, n2) – 1
  • Correlation: df = n – 2

• The higher the degrees of freedom, the closer the t-distribution is to the standard normal
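The three degrees-of-freedom rules on this slide can be collected in one small helper. The function name `degrees_of_freedom` and its `kind` labels are hypothetical, chosen just for this sketch.

```python
def degrees_of_freedom(kind, n1, n2=None):
    """Return df for the t-based procedures listed on this slide."""
    if kind == "mean":
        return n1 - 1
    if kind == "difference_in_means":
        if n2 is None:
            raise ValueError("difference_in_means needs both sample sizes")
        return min(n1, n2) - 1
    if kind == "correlation":
        return n1 - 2
    raise ValueError(f"unknown kind: {kind!r}")

# Hypothetical sample sizes.
print(degrees_of_freedom("mean", 25))                     # 24
print(degrees_of_freedom("difference_in_means", 30, 12))  # 11
print(degrees_of_freedom("correlation", 25))              # 23
```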

Page 10: Inference Using Formulas

t-distribution

Page 11: Inference Using Formulas

Aside: William Sealy Gosset

Page 12: Inference Using Formulas

Matched Pairs

• A matched pairs experiment compares units to themselves or to another similar unit

• Data are paired (two measurements on one unit, twin studies, etc.)

• Look at the difference for each pair, and analyze the differences as a single quantitative variable

• Matched pairs experiments are particularly useful when responses vary a lot from unit to unit; pairing can decrease the standard deviation of the response (and so decrease the standard error)
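A small sketch of the matched pairs idea: take the difference within each pair, then treat the differences as one quantitative variable with df = n – 1. The `before` and `after` values are made-up data for illustration, not results from any study.

```python
import math
from statistics import mean, stdev

# Hypothetical paired measurements on the same five units.
before = [12.0, 15.0, 11.0, 18.0, 14.0]
after = [10.0, 14.0, 10.0, 15.0, 13.0]

# One difference per pair; these become the single variable to analyze.
differences = [b - a for b, a in zip(before, after)]

n = len(differences)
mean_diff = mean(differences)
se = stdev(differences) / math.sqrt(n)  # s / sqrt(n), using the sample SD
t_stat = (mean_diff - 0) / se           # null value: mean difference of 0
df = n - 1

print(differences)
print(round(t_stat, 2), df)
# Units differ a lot from each other, but the differences vary much less.
print(round(stdev(before), 2), round(stdev(differences), 2))
```

In this made-up data the standard deviation of the differences is well below that of the raw measurements, which is the situation where pairing helps most.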

Page 13: Inference Using Formulas

Golden Balls: Split or Steal?

• Both people split: split the money
• One split, one steal: stealer gets all the money
• Both steal: no one gets any money

Would you split or steal?

a) Split
b) Steal

http://www.youtube.com/watch?v=p3Uos2fzIJ0

Van den Assem, M., Van Dolder, D., and Thaler, R., “Split or Steal? Cooperative Behavior When the Stakes Are Large,” available at SSRN: http://ssrn.com/abstract=1592456, 2/19/11.

Page 14: Inference Using Formulas

To Do

• Do Project 1 (due Friday, 3pm)

• Read Chapter 6

• Do HW 5 (due Wednesday, 3/19)