Statistics Review slides

  • 7/23/2019 Statistics Review slides

    1/64

    PLCY 2305: Statistics II for Public Policy

    Course Meeting: Tuesdays, 5:30-8

    Lab Meeting: Mondays, 12-1

    Office Hours: Thursdays, 6-7, 67 George Street

    Professor: Jayanti Owens

    TA: Bruno Gasperini
  • 7/23/2019 Statistics Review slides

    2/64

    1-2

    Brief Overview of the Course

    Much of social science analysis is about how to understand important relationships, often with policy implications, by estimating quantitative magnitudes of causal effects

  • 7/23/2019 Statistics Review slides

    3/64

    1-3

    This course is about using data to measure causal effects ... test scores?

    But almost always we only have observational (nonexperimental) data:
    - returns to education
    - cigarette prices
    - monetary policy

    Most of the course deals with difficulties arising from using observational data to estimate causal effects

  • 7/23/2019 Statistics Review slides

    4/64

    1-4

    Learn methods for estimating causal effects

  • 7/23/2019 Statistics Review slides

    5/64

    1-5

    Empirical problem: Class size and educational output

    - Policy question: What is the effect ... by one student per class? By 8 students/class?
    - We must use data to find out (is there any way to answer this without data?)

    Review of Probability and Statistics (SW Chapters 2, 3)

  • 7/23/2019 Statistics Review slides

    6/64

    1-6

    The California Test Score Data Set

    All K-6 and K-8 California school districts (n = 420)

    Variables:
    - 5th grade test scores (Stanford-9 achievement test, combined math and reading), district average
    - Student-teacher ratio (STR) = number of students in the district divided by number of full-time equivalent teachers

  • 7/23/2019 Statistics Review slides

    7/64

    1-7

    Initial look at the data: (You should already know how to interpret this table)

    This table doesn't tell us anything about the relationship between test scores and the STR.

  • 7/23/2019 Statistics Review slides

    8/64

    1-8

    Do districts with smaller classes have higher test scores?

    Scatterplot of test score v. student-teacher ratio

    What does this figure show?

  • 7/23/2019 Statistics Review slides

    9/64

    1-9

    If we get some numerical evidence on whether districts with low STRs have higher test scores, how should we analyze it?

    1. Compare average test scores in districts with low STRs to those with high STRs ("estimation")

    2. Test the "null" hypothesis that the mean test scores in the two types of districts are the same, against the "alternative" hypothesis that they differ

  • 7/23/2019 Statistics Review slides

    10/64

    1-10

    Initial data analysis: Compare districts with "small" (STR < 20) and "large" (STR ≥ 20) class sizes:

    1. Estimation of difference

  • 7/23/2019 Statistics Review slides

    11/64

    1-11

    1. Estimation

    $\bar{Y}_{small} - \bar{Y}_{large}$ (this is just notation for the means) = 657.4 - 650.0 = 7.4

    Is this a large difference

  • 7/23/2019 Statistics Review slides

    12/64

    1-12

    2. Hypothesis testing

    Difference

  • 7/23/2019 Statistics Review slides

    13/64

    1-13

    Compute the difference

  • 7/23/2019 Statistics Review slides

    14/64

    1-14

    3. Confidence interval

    A 95% confidence interval for the difference

  • 7/23/2019 Statistics Review slides

    15/64

    1-15

    What comes next...

    The mechanics of estimation, hypothesis testing, and confidence intervals should be familiar.

    These concepts extend directly to regression and its variants.

    Before turning to regression, however, we will review some of the underlying theory of estimation, hypothesis testing, and confidence intervals:
    - Why do these procedures work, and why use these rather than others?
    - We will review the intellectual foundations of statistics and econometrics

  • 7/23/2019 Statistics Review slides

    16/64

    1-16

    Review of Statistical Theory

    1. The probability framework for statistical inference
    2. Estimation
    3. Testing
    4. Confidence Intervals

    The probability framework for statistical inference

    (a) Population, random variable, and distribution
    (b) Moments of a distribution (mean, variance, standard deviation, covariance, correlation)
    (c) Conditional distributions and conditional means
    (d) Distribution of a sample of data drawn randomly from a population: $Y_1, \ldots, Y_n$

  • 7/23/2019 Statistics Review slides

    17/64

    1-17

    (a) Population, random variable, and distribution

    Population
    - The group or collection of all possible entities of interest (school districts)
    - We will think of populations as infinitely large (∞ is an approximation to "very big")

    Random variable Y
    - Numerical summary of a random outcome (district average test score, district STR)

  • 7/23/2019 Statistics Review slides

    18/64

    1-18

    Population distribution of Y

    The probabilities of different values of Y

  • 7/23/2019 Statistics Review slides

    19/64

    1-19

    (b) Moments of a population distribution: mean, variance, standard deviation, covariance, correlation

    mean = expected value (expectation) of Y = $E(Y)$ = $\mu_Y$ = long-run average value of Y over repeated realizations of Y

    variance = $E(Y - \mu_Y)^2$ = $\sigma_Y^2$ = measure of the squared spread of the distribution

    standard deviation = $\sqrt{\text{variance}}$ = $\sigma_Y$

  • 7/23/2019 Statistics Review slides

    20/64

    1-20

    Moments, ctd.

    skewness = $\dfrac{E[(Y - \mu_Y)^3]}{\sigma_Y^3}$ = measure of asymmetry of a distribution
    - skewness = 0: distribution is symmetric
    - skewness > (<) 0: distribution has long right (left) tail

    kurtosis = $\dfrac{E[(Y - \mu_Y)^4]}{\sigma_Y^4}$ = measure of mass in tails = measure of probability of large values
    - kurtosis = 3: normal distribution
    - kurtosis > 3: heavy tails ("leptokurtotic")
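
A minimal sketch of the four moments defined above, computed for a list of numbers treated as the whole population (dividing by n). The function name `moments` and both data lists are made up for illustration; they are not from the course.

```python
import math

def moments(values):
    """Return (mean, variance, skewness, kurtosis) of a list of numbers,
    treating the list as the whole population (divide by n)."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / n
    sd = math.sqrt(variance)
    skewness = sum((v - mean) ** 3 for v in values) / n / sd ** 3
    kurtosis = sum((v - mean) ** 4 for v in values) / n / sd ** 4
    return mean, variance, skewness, kurtosis

# Made-up numbers for illustration only.
symmetric = [1, 2, 3, 4, 5]
right_tailed = [1, 1, 1, 2, 10]

print(moments(symmetric))     # skewness is zero for a symmetric list
print(moments(right_tailed))  # skewness is positive: long right tail
```

The second list has one large value, so its skewness comes out positive, matching the "long right tail" case.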

  • 7/23/2019 Statistics Review slides

    21/64

    1-21

  • 7/23/2019 Statistics Review slides

    22/64

    1-22

    2 random variables: joint distributions and covariance

    - Random variables X and Z have a joint distribution
    - The covariance between X and Z is $\mathrm{cov}(X,Z) = E[(X - \mu_X)(Z - \mu_Z)] = \sigma_{XZ}$
    - The covariance is a measure of the linear association between X and Z; its units are units of X × units of Z
    - cov(X,Z) > 0 means a positive relation between X and Z
    - If X and Z are independently distributed, then cov(X,Z) = 0 (but not vice versa!)
    - The covariance of a r.v. with itself is its variance: $\mathrm{cov}(X,X) = E[(X - \mu_X)(X - \mu_X)] = E[(X - \mu_X)^2] = \sigma_X^2$

  • 7/23/2019 Statistics Review slides

    23/64

    1-23

    If the covariance between Test Score and STR is negative, so is the correlation...

  • 7/23/2019 Statistics Review slides

    24/64

    1-24

    The correlation coefficient is defined in terms of the covariance:

    $$\mathrm{corr}(X,Z) = \frac{\mathrm{cov}(X,Z)}{\sqrt{\mathrm{var}(X)\,\mathrm{var}(Z)}} = \frac{\sigma_{XZ}}{\sigma_X \sigma_Z} = r_{XZ}$$

    - -1 ≤ corr(X,Z) ≤ 1
    - corr(X,Z) = 1 means perfect positive linear association
    - corr(X,Z) = -1 means perfect negative linear association
    - corr(X,Z) = 0 means no linear association
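
A small sketch of the covariance and correlation definitions, assuming population-style formulas (divide by n). The helper names `cov` and `corr` and all numbers are illustrative. The last example shows two variables that are clearly related (Z is the square of X) yet have zero covariance, which is the "not vice versa" point.

```python
import math

def cov(x, z):
    """Population covariance of two equal-length lists."""
    mx = sum(x) / len(x)
    mz = sum(z) / len(z)
    return sum((a - mx) * (b - mz) for a, b in zip(x, z)) / len(x)

def corr(x, z):
    """Correlation = cov(X,Z) / sqrt(var(X) var(Z)); var(X) is cov(X,X)."""
    return cov(x, z) / math.sqrt(cov(x, x) * cov(z, z))

x = [1, 2, 3, 4]
up = [2, 4, 6, 8]      # exact increasing line in x
down = [8, 6, 4, 2]    # exact decreasing line in x

sym = [-2, -1, 0, 1, 2]
squared = [v * v for v in sym]  # depends on sym, but not linearly

print(corr(x, up), corr(x, down), corr(sym, squared))
```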

  • 7/23/2019 Statistics Review slides

    25/64

    1-25

    The correlation coefficient measures linear association

  • 7/23/2019 Statistics Review slides

    26/64

    1-26

    (c) Conditional distributions and conditional means

    Conditional distributions
    - The distribution of Y, given value(s) of some other random variable, X
    - Ex: the distribution of test scores, given that STR < 20

    Conditional expectations and conditional moments
    - conditional mean = mean of conditional distribution = $E(Y|X = x)$ (important concept and notation)
    - conditional variance = variance of conditional distribution
    - Example: E(Test scores|STR < 20) = the mean of test scores among districts with small class sizes

    The difference in means is the difference between the means of two conditional distributions:

    E(Test scores|STR < 20) - E(Test scores|STR ≥ 20)
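
As a rough illustration, a conditional mean is just a group mean: average the outcome over the records that satisfy the condition. The four (STR, test score) pairs below are invented for the example and are not the California data; `conditional_mean` is a hypothetical helper.

```python
def conditional_mean(records, condition):
    """Mean test score among records whose STR satisfies `condition`."""
    selected = [score for str_value, score in records if condition(str_value)]
    return sum(selected) / len(selected)

# Invented (STR, test score) pairs, for illustration only.
districts = [(17.5, 660.0), (19.0, 655.0), (21.0, 650.0), (23.5, 645.0)]

mean_small = conditional_mean(districts, lambda s: s < 20)   # E(score | STR < 20)
mean_large = conditional_mean(districts, lambda s: s >= 20)  # E(score | STR >= 20)
difference = mean_small - mean_large

print(mean_small, mean_large, difference)
```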

  • 7/23/2019 Statistics Review slides

    27/64

    1-27

    Conditional mean, ctd.

    Other examples of conditional means:
    - Wages of all female workers (Y = wages, X = gender)
    - Mortality rate of those given an experimental treatment (Y = live/die; X = treated/not treated)
    - If E(X|Z) = E(X), then corr(X,Z) = 0 (not necessarily vice versa, however)

    The conditional mean is a (possibly new) term for the familiar idea of the group mean

  • 7/23/2019 Statistics Review slides

    28/64

    1-28

    (d) Distribution of a sample of data drawn randomly from a population: $Y_1, \ldots, Y_n$

    We will assume simple random sampling
    - Choose an individual (district, entity) at random from the population

    Randomness and data
    - Prior to sample selection, the value of Y is random because the individual selected is random
    - Once the individual is selected and the value of Y is observed, then Y is just a number, not random
    - The data set is $(Y_1, Y_2, \ldots, Y_n)$, where $Y_i$ = value of Y for the i-th individual (district, entity) sampled

  • 7/23/2019 Statistics Review slides

    29/64

    1-29

    Distribution of $Y_1, \ldots, Y_n$ under simple random sampling

    Because individuals #1 and #2 are selected at random, the value of $Y_1$ has no information content for $Y_2$.

    Thus:
    - $Y_1$ and $Y_2$ are independently distributed
    - $Y_1$ and $Y_2$ come from the same distribution, that is, $Y_1$, $Y_2$ are identically distributed
    - That is, under simple random sampling, $Y_1$ and $Y_2$ are independently and identically distributed (i.i.d.)
    - More generally, under simple random sampling, $\{Y_i\}$, i = 1, ..., n, are i.i.d.

  • 7/23/2019 Statistics Review slides

    30/64

    1-30

    This framework allows rigorous statistical inferences about moments of population distributions using a sample of data from that population...

    1. The probability framework for statistical inference
    2. Estimation
    3. Testing
    4. Confidence Intervals

    Estimation

    $\bar{Y}$ is the natural estimator of the mean. But:

    (a) What are the properties of $\bar{Y}$?

    (b) Why should we use $\bar{Y}$ rather than some other estimator? For example:
    - $Y_1$ (the first observation)
    - maybe unequal weights, not a simple average
    - median($Y_1, \ldots, Y_n$)

    The starting point is the sampling distribution of $\bar{Y}$...

  • 7/23/2019 Statistics Review slides

    31/64

    1-31

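    (a) The sampling distribution of $\bar{Y}$

    $\bar{Y}$ is a random variable, and its properties are determined by the sampling distribution of $\bar{Y}$
    - The individuals in the sample are drawn at random
    - Thus the values of $(Y_1, \ldots, Y_n)$ are random
    - Thus functions of $(Y_1, \ldots, Y_n)$, such as $\bar{Y}$, are random: had a different sample been drawn, they would have taken on a different value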

  • 7/23/2019 Statistics Review slides

    32/64

    1-32

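    The sampling distribution of $\bar{Y}$, ctd.

    Example: Suppose Y takes on 0 or 1 (a Bernoulli random variable) with the probability distribution Pr[Y = 0] = 0.22, Pr(Y = 1) = 0.78

    Then

    $E(Y) = p \times 1 + (1 - p) \times 0 = p = 0.78$

    $\sigma_Y^2 = E[Y - E(Y)]^2 = p(1 - p)$ [remember this?] $= 0.78 \times (1 - 0.78) \approx 0.17$

    The sampling distribution of $\bar{Y}$ depends on n.

    Consider n = 2. The sampling distribution of $\bar{Y}$ is:
    - $\Pr(\bar{Y} = 0) = 0.22^2 \approx 0.05$
    - $\Pr(\bar{Y} = 1/2) = 2 \times 0.22 \times 0.78 \approx 0.34$
    - $\Pr(\bar{Y} = 1) = 0.78^2 \approx 0.61$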
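
The n = 2 case can be checked by brute force: list every possible sample of 0s and 1s, compute its probability, and add up the probabilities by sample mean. This is a sketch with a hypothetical helper name, `sampling_distribution`; it is only practical for small n because it enumerates all 2^n samples.

```python
from itertools import product

def sampling_distribution(p, n):
    """Exact distribution of the sample mean of n i.i.d. Bernoulli(p) draws,
    returned as {value of Y-bar: probability}."""
    dist = {}
    for draws in product([0, 1], repeat=n):
        prob = 1.0
        for y in draws:
            prob *= p if y == 1 else 1 - p
        ybar = sum(draws) / n
        dist[ybar] = dist.get(ybar, 0.0) + prob
    return dist

dist = sampling_distribution(0.78, 2)
for value in sorted(dist):
    print(value, round(dist[value], 4))
```

Before rounding to two decimals the three probabilities are 0.0484, 0.3432 and 0.6084.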

  • 7/23/2019 Statistics Review slides

    33/64

    1-33

    The sampling distribution of $\bar{Y}$ when Y is Bernoulli (p = 0.78):

  • 7/23/2019 Statistics Review slides

    34/64

    1-34

    Things we want to know about the sampling distribution:

    What is the mean of $\bar{Y}$?
    - If $E(\bar{Y})$ = true $\mu$ = 0.78, then $\bar{Y}$ is an unbiased estimator of $\mu$

    What is the variance of $\bar{Y}$?
    - How does $\mathrm{var}(\bar{Y})$ depend on n (famous 1/n formula)

    Does $\bar{Y}$ become close to $\mu$ when n is large?
    - Law of large numbers: $\bar{Y}$ is a consistent estimator of $\mu$

    $\bar{Y} - \mu$ appears bell shaped for n large... is this generally true?
    - In fact, $\bar{Y} - \mu$ is approximately normally distributed for n large (Central Limit Theorem)

  • 7/23/2019 Statistics Review slides

    35/64

    1-35

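    The mean and variance of the sampling distribution of $\bar{Y}$

    General case, that is, for $Y_i$ i.i.d. from any distribution, not just Bernoulli:

    mean:

    $$E(\bar{Y}) = E\left(\frac{1}{n}\sum_{i=1}^{n} Y_i\right) = \frac{1}{n}\sum_{i=1}^{n} E(Y_i) = \frac{1}{n}\sum_{i=1}^{n} \mu_Y = \mu_Y$$

    Variance:

    $$\mathrm{var}(\bar{Y}) = E[\bar{Y} - E(\bar{Y})]^2 = E[\bar{Y} - \mu_Y]^2 = E\left[\left(\frac{1}{n}\sum_{i=1}^{n} Y_i\right) - \mu_Y\right]^2 = E\left[\frac{1}{n}\sum_{i=1}^{n} (Y_i - \mu_Y)\right]^2$$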

  • 7/23/2019 Statistics Review slides

    36/64

    1-36

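    So

    $$\mathrm{var}(\bar{Y}) = E\left[\frac{1}{n}\sum_{i=1}^{n} (Y_i - \mu_Y)\right]^2 = E\left\{\left[\frac{1}{n}\sum_{i=1}^{n} (Y_i - \mu_Y)\right] \times \left[\frac{1}{n}\sum_{j=1}^{n} (Y_j - \mu_Y)\right]\right\}$$

    $$= \frac{1}{n^2}\sum_{i=1}^{n}\sum_{j=1}^{n} E[(Y_i - \mu_Y)(Y_j - \mu_Y)] = \frac{1}{n^2}\sum_{i=1}^{n}\sum_{j=1}^{n} \mathrm{cov}(Y_i, Y_j)$$

    $$= \frac{1}{n^2}\sum_{i=1}^{n} \sigma_Y^2 = \frac{\sigma_Y^2}{n}$$

    The double sum collapses to a single sum because the $Y_i$ are independent: $\mathrm{cov}(Y_i, Y_j) = 0$ for $i \ne j$, and $\mathrm{cov}(Y_i, Y_i) = \sigma_Y^2$.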

  • 7/23/2019 Statistics Review slides

    37/64

    1-37

    Mean and variance of sampling distribution of $\bar{Y}$, ctd.

    $E(\bar{Y}) = \mu_Y$

    $\mathrm{var}(\bar{Y}) = \dfrac{\sigma_Y^2}{n}$

    Implications:

    1. $\bar{Y}$ is an unbiased estimator of $\mu_Y$ (that is, $E(\bar{Y}) = \mu_Y$)

    2. $\mathrm{var}(\bar{Y})$ is inversely proportional to n
    - the spread of the sampling distribution is proportional to $1/\sqrt{n}$
    - Thus the sampling uncertainty associated with $\bar{Y}$ is proportional to $1/\sqrt{n}$ (larger samples, less uncertainty, but square-root law)
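
A quick numerical check of the two results, sketched for the Bernoulli(0.78) example only. With n draws, the sample mean equals k/n with binomial probability, so its mean and variance can be computed exactly and compared with p and p(1 - p)/n. The helper name `ybar_mean_var` is made up for this illustration.

```python
from math import comb

def ybar_mean_var(p, n):
    """Exact mean and variance of the sample mean of n i.i.d. Bernoulli(p) draws."""
    # Y-bar equals k/n when exactly k of the n draws are 1 (binomial probability).
    probs = {k / n: comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)}
    mean = sum(value * pr for value, pr in probs.items())
    var = sum((value - mean) ** 2 * pr for value, pr in probs.items())
    return mean, var

p = 0.78
for n in (2, 25, 100):
    mean, var = ybar_mean_var(p, n)
    # The last two printed columns should agree: var(Y-bar) versus p(1-p)/n.
    print(n, round(mean, 4), round(var, 6), round(p * (1 - p) / n, 6))
```

Going from n = 25 to n = 100 cuts the variance by a factor of four but the standard deviation only by a factor of two, which is the square-root law.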

  • 7/23/2019 Statistics Review slides

    38/64

    1-38

    The sampling distribution of $\bar{Y}$ when n is large

    For small sample sizes, the distribution of $\bar{Y}$ is complicated, but if n is large, the sampling distribution is simple!

    1. As n increases, the distribution of $\bar{Y}$ becomes more tightly centered around $\mu_Y$ (the Law of Large Numbers)

    2. Moreover, the distribution of $\bar{Y} - \mu_Y$ becomes normal (the Central Limit Theorem)

  • 7/23/2019 Statistics Review slides

    39/64

    1-39

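    The Law of Large Numbers:

    An estimator is consistent if the probability that it falls within an interval of the true population value tends to one as the sample size increases.

    If $(Y_1, \ldots, Y_n)$ are i.i.d. and $\sigma_Y^2 < \infty$, then $\bar{Y}$ is a consistent estimator of $\mu_Y$, that is,

    $\Pr[|\bar{Y} - \mu_Y| < \varepsilon] \to 1$ as $n \to \infty$

    which can be written $\bar{Y} \xrightarrow{p} \mu_Y$

    ("$\bar{Y} \xrightarrow{p} \mu_Y$" means "$\bar{Y}$ converges in probability to $\mu_Y$").

    (the math: as $n \to \infty$, $\mathrm{var}(\bar{Y}) = \sigma_Y^2/n \to 0$, which implies that $\Pr[|\bar{Y} - \mu_Y| < \varepsilon] \to 1$.)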
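
A simulation sketch of the law of large numbers for the Bernoulli(0.78) example. It estimates the probability that the sample mean lands within 0.05 of the true mean, for three sample sizes. The tolerance, sample sizes, number of replications and seed are arbitrary choices for illustration.

```python
import random

def share_within(p, n, eps, reps, rng):
    """Fraction of `reps` simulated samples of size n whose mean is within eps of p."""
    hits = 0
    for _ in range(reps):
        ybar = sum(rng.random() < p for _ in range(n)) / n
        hits += abs(ybar - p) < eps
    return hits / reps

rng = random.Random(0)  # fixed seed so the run is repeatable
shares = {n: share_within(0.78, n, 0.05, 1000, rng) for n in (10, 100, 1000)}
print(shares)
```

The estimated probability should climb toward one as n grows, which is what consistency means.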

  • 7/23/2019 Statistics Review slides

    40/64

    1-40

    The Central Limit Theorem (CLT):

    If $(Y_1, \ldots, Y_n)$ are i.i.d. and $0 < \sigma_Y^2 < \infty$, then when n is large the distribution of $\bar{Y}$ is well approximated by a normal distribution.
    - $\bar{Y}$ is approximately distributed $N(\mu_Y, \sigma_Y^2/n)$ ("normal distribution with mean $\mu_Y$ and variance $\sigma_Y^2/n$")
    - $\sqrt{n}(\bar{Y} - \mu_Y)/\sigma_Y$ is approximately distributed N(0,1) (standard normal)
    - That is, "standardized" $\bar{Y}$ = $\dfrac{\bar{Y} - E(\bar{Y})}{\sqrt{\mathrm{var}(\bar{Y})}} = \dfrac{\bar{Y} - \mu_Y}{\sigma_Y/\sqrt{n}}$ is approximately distributed as N(0,1)
    - The larger is n, the better is the approximation.
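
A simulation sketch of the CLT for the same Bernoulli(0.78) example: standardize many simulated sample means and check how often they fall within ±1.96, which a standard normal does about 95% of the time. The sample size, number of replications and seed are arbitrary illustration choices.

```python
import math
import random

def standardized_means(p, n, reps, rng):
    """Simulate `reps` sample means of n Bernoulli(p) draws and standardize each one."""
    sigma = math.sqrt(p * (1 - p))
    results = []
    for _ in range(reps):
        ybar = sum(rng.random() < p for _ in range(n)) / n
        results.append((ybar - p) / (sigma / math.sqrt(n)))
    return results

rng = random.Random(1)  # fixed seed so the run is repeatable
z = standardized_means(0.78, 400, 2000, rng)
coverage = sum(abs(v) <= 1.96 for v in z) / len(z)
print(coverage)  # should be close to 0.95 if the normal approximation is good
```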

  • 7/23/2019 Statistics Review slides

    41/64

    1-41

    Same example: sampling distribution of $\dfrac{\bar{Y} - E(\bar{Y})}{\sqrt{\mathrm{var}(\bar{Y})}}$:

  • 7/23/2019 Statistics Review slides

    42/64

    1-42

    Summary: The Sampling Distribution of $\bar{Y}$

    For $Y_1, \ldots, Y_n$ i.i.d. with $0 < \sigma_Y^2 < \infty$,
    - The exact (finite sample) sampling distribution of $\bar{Y}$ has mean $\mu_Y$ ("$\bar{Y}$ is an unbiased estimator of $\mu_Y$") and variance $\sigma_Y^2/n$
    - Other than its mean and variance, the exact distribution of $\bar{Y}$ is complicated and depends on the distribution of Y (the population distribution)
    - When n is large, the sampling distribution simplifies:
        - $\bar{Y} \xrightarrow{p} \mu_Y$ (Law of large numbers)
        - $\dfrac{\bar{Y} - E(\bar{Y})}{\sqrt{\mathrm{var}(\bar{Y})}}$ is approximately N(0,1) (CLT)

  • 7/23/2019 Statistics Review slides

    43/64

    1-43

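    (b) Why Use $\bar{Y}$ To Estimate $\mu_Y$?

    - $\bar{Y}$ is unbiased: $E(\bar{Y}) = \mu_Y$
    - $\bar{Y}$ is consistent: $\bar{Y} \xrightarrow{p} \mu_Y$
    - $\bar{Y}$ is the "least squares" estimator of $\mu_Y$; $\bar{Y}$ solves

    $$\min_m \sum_{i=1}^{n} (Y_i - m)^2$$

    so $\bar{Y}$ minimizes the sum of squared "residuals"

    optional derivation (also see App. 3.2)

    $$\frac{d}{dm}\sum_{i=1}^{n} (Y_i - m)^2 = \sum_{i=1}^{n} \frac{d}{dm}(Y_i - m)^2 = -2\sum_{i=1}^{n} (Y_i - m)$$

    Set derivative to zero and denote optimal value of m by $\hat{m}$:

    $$\sum_{i=1}^{n} Y_i = \sum_{i=1}^{n} \hat{m} = n\hat{m} \quad\text{or}\quad \hat{m} = \frac{1}{n}\sum_{i=1}^{n} Y_i = \bar{Y}$$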
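
The least squares property can be seen numerically: evaluate the sum of squared residuals at the sample mean and at a few nearby values of m, and the sample mean gives the smallest sum. The data and the candidate offsets are made up for illustration.

```python
def sum_sq(ys, m):
    """Sum of squared residuals when every observation is predicted by m."""
    return sum((y - m) ** 2 for y in ys)

ys = [2.0, 3.0, 7.0, 8.0]  # made-up data
ybar = sum(ys) / len(ys)

# Try the sample mean and a few values on either side of it.
candidates = [ybar + offset for offset in (-1.0, -0.1, 0.0, 0.1, 1.0)]
best = min(candidates, key=lambda m: sum_sq(ys, m))

for m in candidates:
    print(m, sum_sq(ys, m))
print("minimizer among candidates:", best)
```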

  • 7/23/2019 Statistics Review slides

    44/64

    1-44

    Why Use $\bar{Y}$ To Estimate $\mu_Y$, ctd.

    - $\bar{Y}$ has a smaller variance than all other linear unbiased estimators: consider the estimator $\hat{\mu}_Y = \dfrac{1}{n}\sum_{i=1}^{n} a_i Y_i$, where $\{a_i\}$ are such that $\hat{\mu}_Y$ is unbiased; then $\mathrm{var}(\bar{Y}) \le \mathrm{var}(\hat{\mu}_Y)$ (proof: SW, Ch. 17)
    - $\bar{Y}$ isn't the only estimator of $\mu_Y$. Can you think of a time you might want to use the median instead?

    NEXT STEPS:

    1. The probability framework for statistical inference
    2. Estimation
    3. Hypothesis Testing
    4. Confidence intervals

  • 7/23/2019 Statistics Review slides

    45/64

    1-45

    Hypothesis Testing

    The hypothesis testing problem (for the mean):

    Make a provisional decision, based on the evidence at hand, whether a null hypothesis is true, or instead that some alternative hypothesis is true.

    That is, test
    - $H_0: E(Y) = \mu_{Y,0}$ vs. $H_1: E(Y) > \mu_{Y,0}$ (1-sided, >)
    - $H_0: E(Y) = \mu_{Y,0}$ vs. $H_1: E(Y) < \mu_{Y,0}$ (1-sided, <)

  • 7/23/2019 Statistics Review slides

    46/64

    1-46

    Some terminology for testing statistical hypotheses:

    p-value = probability of drawing a statistic (e.g. $\bar{Y}$) at least as adverse to the null as the value actually computed with your data, assuming that the null hypothesis is true.

    The significance level of a test is a pre-specified probability of incorrectly rejecting the null, when the null is true.

    Calculating the p-value based on $\bar{Y}$:

    $$\text{p-value} = \Pr_{H_0}\left[\,|\bar{Y} - \mu_{Y,0}| > |\bar{Y}^{act} - \mu_{Y,0}|\,\right]$$

    where $\bar{Y}^{act}$ is the value of $\bar{Y}$ actually observed (nonrandom)

  • 7/23/2019 Statistics Review slides

    47/64

    1-47

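    Calculating the p-value, ctd.

    - To compute the p-value, you need to know the sampling distribution of $\bar{Y}$, which is complicated if n is small.
    - If n is large, you can use the normal approximation (CLT):

    $$\text{p-value} = \Pr_{H_0}\left[\,|\bar{Y} - \mu_{Y,0}| > |\bar{Y}^{act} - \mu_{Y,0}|\,\right]$$

    $$= \Pr_{H_0}\left[\,\left|\frac{\bar{Y} - \mu_{Y,0}}{\sigma_Y/\sqrt{n}}\right| > \left|\frac{\bar{Y}^{act} - \mu_{Y,0}}{\sigma_Y/\sqrt{n}}\right|\,\right]$$

    $$= \Pr_{H_0}\left[\,\left|\frac{\bar{Y} - \mu_{Y,0}}{\sigma_{\bar{Y}}}\right| > \left|\frac{\bar{Y}^{act} - \mu_{Y,0}}{\sigma_{\bar{Y}}}\right|\,\right]$$

    ≅ probability under left+right N(0,1) tails

    where $\sigma_{\bar{Y}}$ = std. dev. of the distribution of $\bar{Y}$ = $\sigma_Y/\sqrt{n}$.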

  • 7/23/2019 Statistics Review slides

    48/64

    1-48

    Calculating the p-value with $\sigma_Y$ known:

    - For large n, p-value = the probability that a N(0,1) random variable falls outside $|(\bar{Y}^{act} - \mu_{Y,0})/\sigma_{\bar{Y}}|$
    - In practice, $\sigma_{\bar{Y}}$ is unknown; it must be estimated

  • 7/23/2019 Statistics Review slides

    49/64

    1-49

    Estimator of the variance of Y:

    $$s_Y^2 = \frac{1}{n-1}\sum_{i=1}^{n} (Y_i - \bar{Y})^2 = \text{"sample variance of Y"}$$

    Fact: If $(Y_1, \ldots, Y_n)$ are i.i.d. and $E(Y^4) < \infty$, then $s_Y^2 \xrightarrow{p} \sigma_Y^2$

    Why does the law of large numbers apply?
    - Because $s_Y^2$ is a sample average; see Appendix 3.3
    - Technical note: we assume $E(Y^4) < \infty$ because here the average is not of $Y_i$, but of its square; see App. 3.3

  • 7/23/2019 Statistics Review slides

    50/64

    1-50

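    Computing the p-value with $\sigma_Y^2$ estimated:

    $$\text{p-value} = \Pr_{H_0}\left[\,|\bar{Y} - \mu_{Y,0}| > |\bar{Y}^{act} - \mu_{Y,0}|\,\right]$$

    $$= \Pr_{H_0}\left[\,\left|\frac{\bar{Y} - \mu_{Y,0}}{\sigma_Y/\sqrt{n}}\right| > \left|\frac{\bar{Y}^{act} - \mu_{Y,0}}{\sigma_Y/\sqrt{n}}\right|\,\right]$$

    $$\cong \Pr_{H_0}\left[\,\left|\frac{\bar{Y} - \mu_{Y,0}}{s_Y/\sqrt{n}}\right| > \left|\frac{\bar{Y}^{act} - \mu_{Y,0}}{s_Y/\sqrt{n}}\right|\,\right] \quad \text{(large n)}$$

    so

    $$\text{p-value} = \Pr_{H_0}\left[\,|t| > |t^{act}|\,\right] \quad (\sigma_Y^2 \text{ estimated})$$

    ≅ probability under normal tails outside $|t^{act}|$

    where $t = \dfrac{\bar{Y} - \mu_{Y,0}}{s_Y/\sqrt{n}}$ (the usual t-statistic)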
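
A minimal sketch of the large-n calculation: form the t-statistic from the sample mean and the sample standard deviation, then take the two-tail area of a standard normal. The function name `large_n_p_value`, the data, and the null values are made up for illustration. `statistics.stdev` uses the n - 1 divisor, matching the sample variance formula.

```python
import math
from statistics import NormalDist, mean, stdev

def large_n_p_value(ys, mu0):
    """Two-sided p-value for H0: E(Y) = mu0, using the large-n normal approximation."""
    n = len(ys)
    t = (mean(ys) - mu0) / (stdev(ys) / math.sqrt(n))  # stdev divides by n - 1
    p = 2 * (1 - NormalDist().cdf(abs(t)))             # area in both N(0,1) tails
    return t, p

ys = [1, 2, 3, 4, 5] * 20  # made-up sample with n = 100 and sample mean 3

t_same, p_same = large_n_p_value(ys, 3.0)   # null equals the sample mean
t_diff, p_diff = large_n_p_value(ys, 2.7)   # null somewhat below the sample mean
print(t_same, p_same)
print(t_diff, p_diff)
```

With the second null value the t-statistic exceeds 1.96 in absolute value and the p-value falls below 0.05, so the two ways of stating the 5% test agree.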

  • 7/23/2019 Statistics Review slides

    51/64

    1-51

    What is the link between the p-value and the significance level?

    The significance level is prespecified. For example, if the prespecified significance level is 5%,
    - You reject the null hypothesis if |t| ≥ 1.96.
    - Equivalently, you reject if p ≤ 0.05.
    - The p-value is sometimes called the marginal significance level.
    - Often, it is better to communicate the p-value than simply whether a test rejects or not: the p-value contains more information than the "yes/no" statement about whether the test rejects.

  • 7/23/2019 Statistics Review slides

    52/64

    1-52

    At this point, you might be wondering... What happened to the t-table and the degrees of freedom?

    The Student t distribution

    If $Y_i$, i = 1, ..., n is i.i.d. $N(\mu_Y, \sigma_Y^2)$, then the t-statistic has the Student t-distribution with n - 1 degrees of freedom.

    The critical values of the Student t-distribution are tabulated in the back of all statistics books. Remember the recipe?

    1. Compute the t-statistic
    2. Compute the degrees of freedom, which is n - 1
    3. Look up the 5% critical value
    4. If the t-statistic exceeds (in absolute value) this critical value, reject the null hypothesis.

  • 7/23/2019 Statistics Review slides

    53/64

    1-53

    Comments on this recipe and the Student t-distribution

    1. The theory of the t-distribution was one of the early triumphs of mathematical statistics. It is astounding, really: if Y is i.i.d. normal, then you can know the exact, finite-sample distribution of the t-statistic: it is the Student t. So, you can construct confidence intervals (using the Student t critical value) that have exactly the right coverage rate, no matter what the sample size. This result was really useful before computers, when sample sizes were small.

    But...

  • 7/23/2019 Statistics Review slides

    54/64

    1-54

    Comments on Student t distribution, ctd.

    2. If the sample size is moderate (several dozen) or large (hundreds or more), the difference

  • 7/23/2019 Statistics Review slides

    55/64

    1-55


  • 7/23/2019 Statistics Review slides

    56/64

    1-56

    Comments on Student t distribution, ctd.

    3. So, the Student-t distribution is only relevant when the sample size is very small; but even so, you must be sure that the population distribution of Y is normal. In most data, the normality assumption is rarely credible.

    For example:
    - Are earnings normally distributed?

  • 7/23/2019 Statistics Review slides

    57/64

    1-57

    Comments on Student t distribution, ctd.

    4. Pooled Variance: Consider the t-statistic testing the hypothesis that two means (groups s, l) are equal:

    $$t = \frac{\bar{Y}_s - \bar{Y}_l}{\sqrt{\dfrac{s_s^2}{n_s} + \dfrac{s_l^2}{n_l}}} = \frac{\bar{Y}_s - \bar{Y}_l}{SE(\bar{Y}_s - \bar{Y}_l)}$$

    Even if the population distribution of Y in the two groups is normal, this statistic doesn't have an exact Student t distribution. The pooled variance t-statistic does, but it is only valid if the variances of the normal distributions are the same in the two groups.

    Would you expect this to be true, say, for men's v. women's wages?

  • 7/23/2019 Statistics Review slides

    58/64

    1-58

    The Student-t distribution: Summary

    - The assumption that Y is distributed $N(\mu_Y, \sigma_Y^2)$ is rarely plausible in practice (Income? Number of children?)
    - For n > 30, the t-distribution and N(0,1) are very close (as n grows large, the $t_{n-1}$ distribution converges to N(0,1))
    - The t-distribution is an artifact from days when sample sizes were small and there were no/few computers
    - For historical reasons, statistical software typically uses the t-distribution to compute p-values, but this makes no difference when the sample size is moderate or large.
    - For these reasons, in this class we will focus on the large-n approximation given by the CLT

    NEXT STEPS:

    1. The probability framework for statistical inference
    2. Estimation
    3. Testing

  • 7/23/2019 Statistics Review slides

    59/64

    1-59

    Confidence Intervals

    A 95% confidence interval for $\mu_Y$ is an interval that contains the true value of $\mu_Y$ in 95% of repeated samples.

    Note: What is random here? The values of $Y_1, \ldots, Y_n$ and thus any functions of them, including the confidence interval.

    The confidence interval will differ from one sample to the next.

  • 7/23/2019 Statistics Review slides

    60/64

    1-60

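    Confidence intervals, ctd.

    A 95% confidence interval can always be constructed as the set of values of $\mu_Y$ not rejected by a hypothesis test with a 5% significance level.

    $$\left\{\mu_Y: \left|\frac{\bar{Y} - \mu_Y}{s_Y/\sqrt{n}}\right| \le 1.96\right\} = \left\{\mu_Y: -1.96 \le \frac{\bar{Y} - \mu_Y}{s_Y/\sqrt{n}} \le 1.96\right\}$$

    $$= \left\{\mu_Y: -1.96\,\frac{s_Y}{\sqrt{n}} \le \bar{Y} - \mu_Y \le 1.96\,\frac{s_Y}{\sqrt{n}}\right\}$$

    $$= \left\{\mu_Y \in \left(\bar{Y} - 1.96\,\frac{s_Y}{\sqrt{n}},\ \bar{Y} + 1.96\,\frac{s_Y}{\sqrt{n}}\right)\right\}$$

    This confidence interval relies on the large-n results that $\bar{Y}$ is approximately normally distributed and $s_Y^2 \xrightarrow{p} \sigma_Y^2$.

    Link to Table: http://www.had2know.com/academics/normal-distribution-table-z-scores.html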
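
A short sketch of the large-n 95% interval, sample mean ± 1.96 × s_Y/√n, on made-up data. The helper name `ci95` and the data are illustrative only. The last lines check the inversion idea: a null value lies inside the interval exactly when its t-statistic is at most 1.96 in absolute value.

```python
import math
from statistics import mean, stdev

def ci95(ys):
    """Large-n 95% confidence interval for the mean: Y-bar +/- 1.96 * s_Y / sqrt(n)."""
    n = len(ys)
    half_width = 1.96 * stdev(ys) / math.sqrt(n)  # stdev divides by n - 1
    return mean(ys) - half_width, mean(ys) + half_width

ys = [1, 2, 3, 4, 5] * 20  # made-up sample with n = 100 and sample mean 3
lower, upper = ci95(ys)
print(lower, upper)

# Inverting the test: compare |t| <= 1.96 with membership in the interval.
for mu0 in (2.7, 3.1):
    t = (mean(ys) - mu0) / (stdev(ys) / math.sqrt(len(ys)))
    print(mu0, abs(t) <= 1.96, lower <= mu0 <= upper)
```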
  • 7/23/2019 Statistics Review slides

    61/64

    1-61

    Summary:

    From the two assumptions of:

    1. simple random sampling of a population, that is, $\{Y_i, i = 1, \ldots, n\}$ are i.i.d.
    2. $0 < E(Y^4) < \infty$

    Statisticians developed, for large samples (large n):
    - Theory of estimation (sampling distribution of $\bar{Y}$)
    - Theory of hypothesis testing (large-n distribution of t-statistic and computation of the p-value)
    - Theory of confidence intervals (constructed by inverting the test statistic)

    Are assumptions (1) & (2) plausible in practice? Yes

  • 7/23/2019 Statistics Review slides

    62/64

    1-62

    Let's go back to the original policy question:

    What is the effect

  • 7/23/2019 Statistics Review slides

    63/64

  • 7/23/2019 Statistics Review slides

    64/64