Stats Lec1

  • 7/28/2019 Stats Lec1

    1/29

    and Finance

  • 7/28/2019 Stats Lec1

    2/29

    2nd and 3rd year course

    Introduction to statistical principles and techniques

    Core text: Barrow

    Emphasis on practical applications

    Assessment: project

  • 7/28/2019 Stats Lec1

    3/29

    Key definitions

    Population: the entire set of observations (census)

    Sample: a subgroup of the population

    Parameter: a numerical characteristic of the population, denoted by Greek characters, e.g. $\mu$ and $\sigma^2$

    Statistic: an estimate of the parameter calculated using the sample, denoted by normal characters, e.g. $\bar{x}$ and $s^2$

  • 7/28/2019 Stats Lec1

    4/29

    Descriptive Statistics

    Descriptive statistics summarize a mass of information

    We may use graphical and/or numerical methods

    Examples of graphical methods are the bar chart, XY chart, pie chart and histogram (see seminar 1 and workshop 1 for practice)

    Examples of numerical methods are averages and standard deviations

  • 7/28/2019 Stats Lec1

    5/29

    Numerical Techniques

    We examine measures of

    Location

    Dispersion

  • 7/28/2019 Stats Lec1

    6/29

    Measures of Location

    Mean: strictly the arithmetic mean, the well-known 'average'

    Median: e.g. the income of the person in the middle of the distribution

    Mode: the observation with the highest frequency

    These different measures can give different answers

  • 7/28/2019 Stats Lec1

    7/29

    The Mean of the Income Distribution

    Person   1   2   3   4   5   6   7   8    9   10
    Income  15  15  20  25  45  55  70  85  125  250

    $$\bar{x} = \frac{\sum x}{n} = \frac{705}{10} = 70.5$$

    Mean income is therefore 70,500 per year (incomes are recorded in thousands)

    NB: use $\mu$ if the data is the whole population, or $\bar{x}$ if the data is a sample (same formula).

  • 7/28/2019 Stats Lec1

    8/29

    The Median

    The income of the 'middle person', i.e. the one located halfway through the distribution

    [Figure: people ranked from poorest to richest, with the middle person marked; this person's income is the median]

  • 7/28/2019 Stats Lec1

    9/29

    Calculating the Median

    We have 10 observations in the sample, so the person 5.5 in rank order has the median income. This person is somewhere between persons 5 and 6.

    Person   1   2   3   4   5   6   7   8    9   10
    Income  15  15  20  25  45  55  70  85  125  250

    Hence the median income is $(45 + 55)/2 = 50$, i.e. 50,000 per year

    Q: what happens to the median if the richest person's income ...

    Q: what happens to the mean?
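The two questions above can be explored numerically. The sketch below uses the ten incomes from the slides (in thousands) and Python's standard `statistics` module. The slide does not say how the richest person's income changes, so doubling it is purely an assumption made for illustration.

```python
from statistics import mean, median

# Incomes (in thousands) of the ten people on the slide.
incomes = [15, 15, 20, 25, 45, 55, 70, 85, 125, 250]

base_mean = mean(incomes)      # 705 / 10
base_median = median(incomes)  # average of the 5th and 6th ranked values

# Hypothetical change (an assumption, not from the lecture):
# double the richest person's income and recompute both measures.
changed = incomes[:-1] + [2 * incomes[-1]]
new_mean = mean(changed)
new_median = median(changed)

print("before:", base_mean, base_median)
print("after: ", new_mean, new_median)
```

Under this assumed change the median stays where it was, because the middle of the ranking is unaffected, while the mean is pulled upwards by the single large value.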

  • 7/28/2019 Stats Lec1

    10/29

    The Mode

    The mode is the observation with the highest frequency

    It is possible to have a sample or population with no mode, or more than 1 mode
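A minimal sketch of finding the mode, covering the "no mode" and "more than one mode" cases mentioned above. The helper name `modes` and the convention that a data set with no repeated value has no mode are assumptions made here, not something the slides specify.

```python
from collections import Counter

def modes(values):
    """Return every value that shares the highest frequency.

    Assumed convention: if no value occurs more than once,
    the data set is treated as having no mode (empty list).
    """
    counts = Counter(values)
    top = max(counts.values())
    if top == 1:
        return []
    return sorted(v for v, c in counts.items() if c == top)

incomes = [15, 15, 20, 25, 45, 55, 70, 85, 125, 250]

print(modes(incomes))          # one mode
print(modes([1, 2, 3]))        # no mode
print(modes([1, 1, 2, 2, 3]))  # two modes
```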

  • 7/28/2019 Stats Lec1

    11/29

    Measures of Dispersion

    Range: the difference between the smallest and largest observation. Not very informative for most purposes

    Variance: based on all observations in the population or sample

  • 7/28/2019 Stats Lec1

    12/29

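    The Variance

    The variance is the average of all squared deviations from the mean:

    $$\sigma^2 = \frac{\sum (x - \mu)^2}{n}$$

    NB: use $\sigma^2$ for the population variance; for the sample variance use $s^2$ and divide by $n - 1$, with $\bar{x}$ in place of $\mu$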

  • 7/28/2019 Stats Lec1

    13/29

    The Variance (cont.)

    [Figure: two frequency distributions compared, one with a small variance and one with a large variance]

  • 7/28/2019 Stats Lec1

    14/29

    Calculating the Sample Variance

    $$s^2 = \frac{\sum x^2 - n\bar{x}^2}{n - 1}$$

    x     15   15   20   25    45    55    70    85    125    250
    x^2  225  225  400  625  2025  3025  4900  7225  15625  62500

    $$\sum x^2 = 225 + 225 + \dots + 62500 = 96775; \qquad n\bar{x}^2 = 10 \times 70.5^2 = 49702.5$$

    $$s^2 = \frac{96775 - 49702.5}{9} = 5230.28$$

    NB: Variance is in squared units, so we use the square root, known as the standard deviation, s. s = 72.321, i.e. about 72,321.
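The shortcut formula can be checked against a library routine. This is a small sketch using the ten incomes from the slides (in thousands); `statistics.variance` is the standard-library sample variance, which divides by n - 1.

```python
from math import sqrt
from statistics import variance

# Incomes (in thousands) of the ten people on the slide.
incomes = [15, 15, 20, 25, 45, 55, 70, 85, 125, 250]

n = len(incomes)
x_bar = sum(incomes) / n
sum_sq = sum(x * x for x in incomes)  # sum of squared observations

# Shortcut formula: (sum of x^2 - n * mean^2) / (n - 1)
s2_shortcut = (sum_sq - n * x_bar ** 2) / (n - 1)

# Library value for comparison (also divides by n - 1).
s2_library = variance(incomes)

s = sqrt(s2_shortcut)  # sample standard deviation

print(sum_sq, round(s2_shortcut, 2), round(s2_library, 2), round(s, 3))
```

Both routes should agree up to floating-point rounding, which is a quick way to catch arithmetic slips in a hand calculation.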

  • 7/28/2019 Stats Lec1

    15/29

    Standard Deviation

    Useful to help us estimate

    a) The % of obs. that lie within a given number of standard deviations of the mean

    b) Where a particular observation lies relative to the mean

  • 7/28/2019 Stats Lec1

    16/29

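    Chebyshev's Rule

    At least $100(1 - 1/k^2)\%$ of observations lie within k standard deviations above and below the mean.

    e.g. $100 \times [1 - 1/2^2]\% = 75\%$ of obs. lie within 2 s.d.s either side of the mean

    [Figure: distribution with at least 75% of observations lying between -2s and +2s; axis marked at -2s, -1s, +1s, +2s]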
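As a rough check, the bound can be compared with what actually happens in the ten-person income data from the earlier slides. The function name `chebyshev_bound` is introduced here for illustration only, and the bound is treated as a minimum percentage.

```python
from statistics import mean, stdev

def chebyshev_bound(k):
    """Minimum % of observations within k standard deviations of the mean (k > 1)."""
    if k <= 1:
        raise ValueError("the bound is only informative for k > 1")
    return 100 * (1 - 1 / k ** 2)

# Incomes (in thousands) of the ten people on the earlier slide.
incomes = [15, 15, 20, 25, 45, 55, 70, 85, 125, 250]

x_bar = mean(incomes)
s = stdev(incomes)  # sample standard deviation (divides by n - 1)
k = 2

# Count the observations lying within k standard deviations of the mean.
within = sum(1 for x in incomes if abs(x - x_bar) <= k * s)
observed_pct = 100 * within / len(incomes)

print(chebyshev_bound(k), observed_pct)
```

Here only the largest income falls outside two standard deviations, so the observed share is comfortably above the 75% minimum.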

  • 7/28/2019 Stats Lec1

    17/29

    If the underlying distribution is Normal (more next week), then

    about 68% of observations lie within ±1 st. devs

    about 95% of observations lie within ±2 st. devs

    about 99.7% of observations lie within ±3 st. devs
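These percentages can be reproduced from the standard Normal distribution. The sketch below uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later); the helper name `coverage` is an illustrative choice.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def coverage(k):
    """Percentage of a Normal distribution lying within k standard deviations of its mean."""
    return 100 * (standard_normal.cdf(k) - standard_normal.cdf(-k))

for k in (1, 2, 3):
    print(k, round(coverage(k), 2))
```

The exact figures are a little above 68% and 95% and very close to 99.7%, which is why the rule is usually quoted with "about".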

  • 7/28/2019 Stats Lec1

    18/29

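    z scores tell us how many standard deviations an observation lies above or below the mean

    $$z = \frac{x - \mu}{\sigma}$$

    z > 0 means that the observation lies above the mean

    z < 0 means that the observation lies below the mean

    e.g. for x = 65, with mean 55 and standard deviation 10:

    $$z = \frac{65 - 55}{10} = 1$$

    Thus 65 is exactly 1 st. dev. above the mean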
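A small sketch of the z score calculation, using the slide's example (observation 65, mean 55, standard deviation 10). The function name `z_score` and the second, below-the-mean value of 45 are illustrative additions.

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    if sd <= 0:
        raise ValueError("standard deviation must be positive")
    return (x - mean) / sd

z_above = z_score(65, 55, 10)  # the slide's example
z_below = z_score(45, 55, 10)  # hypothetical observation below the mean

print(z_above, z_below)  # 1.0 -1.0
```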

  • 7/28/2019 Stats Lec1

    19/29

    Summary

    We can use graphical and numerical measures to summarise data

    The aim is to simplify without distorting the message

    Measures of location [mean, median, mode] and dispersion [variance, standard deviation, z scores] provide a good description of the data

  • 7/28/2019 Stats Lec1

    20/29

    statistics when the data is grouped

  • 7/28/2019 Stats Lec1

    21/29

    Data on Wealth in the UK

    [Table: the distribution of wealth]

  • 7/28/2019 Stats Lec1

    22/29

    The mean of the Wealth Distribution

    Range       Mid-point, x (000)         f            fx
    0                   5.0            3,417      17,085.0
    10,000             17.5            1,303      22,802.5
    25,000             32.5            1,240      40,300.0
    40,000             45.0              714      32,130.0
    50,000             55.0              642      35,310.0
    60,000             70.0            1,361      95,270.0
    80,000             90.0            1,270     114,300.0
    100,000           125.0            2,708     338,500.0
    150,000           175.0            1,633     285,775.0
    200,000           250.0            1,242     310,500.0
    300,000           400.0              870     348,000.0
    500,000           750.0              367     275,250.0
    1,000,000        1500.0              125     187,500.0
    2,000,000        3000.0               41     123,000.0
    Total                             16,933   2,225,722.5

    $$\mu = \frac{\sum fx}{\sum f} = \frac{2{,}225{,}722.5}{16{,}933} = 131.443$$
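The grouped mean is a frequency-weighted average of the class mid-points. A minimal sketch, with mid-points (in thousands) and frequencies taken from the wealth table on these slides:

```python
# (class mid-point in thousands, frequency) for each wealth class.
wealth = [
    (5.0, 3417), (17.5, 1303), (32.5, 1240), (45.0, 714), (55.0, 642),
    (70.0, 1361), (90.0, 1270), (125.0, 2708), (175.0, 1633), (250.0, 1242),
    (400.0, 870), (750.0, 367), (1500.0, 125), (3000.0, 41),
]

total_f = sum(f for _, f in wealth)       # number of observations
total_fx = sum(x * f for x, f in wealth)  # sum of mid-point times frequency

grouped_mean = total_fx / total_f

print(total_f, total_fx, round(grouped_mean, 3))
```

Because every observation in a class is represented by its mid-point, the result is an approximation to the mean of the underlying individual data.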

  • 7/28/2019 Stats Lec1

    23/29

    Calculating the Median

    16,933 observations, hence person 8,466.5 in rank order has the median wealth

    Range      Frequency   Cumulative frequency
    0            3,417          3,417
    10,000       1,303          4,720
    25,000       1,240          5,960
    40,000         714          6,674
    50,000         642          7,316   (number with wealth less than 60k)
    60,000       1,361          8,677   (number with wealth less than 80k)
    80,000       1,270          9,947
    :                :              :

    Person 8,466.5 therefore lies in the 60,000 to 80,000 class

  • 7/28/2019 Stats Lec1

    24/29

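    Calculating the Median (cont.)

    To find the precise median, use

    $$\text{median} = x_L + (x_U - x_L)\,\frac{N/2 - F}{f} = 60 + (80 - 60)\,\frac{16{,}933/2 - 7{,}316}{1{,}361} = 76.907$$

    where $x_L$ and $x_U$ are the lower and upper limits of the median class, $N$ is the number of observations, $F$ is the cumulative frequency below the class and $f$ is the frequency within it

    Median wealth is 76,907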
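The interpolation can be sketched as a short function. It assumes each class runs from its own lower bound to the next class's lower bound and that observations are spread evenly within a class. The function name `grouped_median` and the handling of the open-ended top class are choices made here for illustration.

```python
# Lower class bounds (in thousands) and frequencies from the wealth table.
lower_bounds = [0, 10, 25, 40, 50, 60, 80, 100, 150, 200, 300, 500, 1000, 2000]
freqs = [3417, 1303, 1240, 714, 642, 1361, 1270, 2708, 1633, 1242, 870, 367, 125, 41]

def grouped_median(lower_bounds, freqs):
    """Interpolated median for grouped data, locating observation N/2."""
    target = sum(freqs) / 2
    cumulative = 0  # cumulative frequency below the current class
    for i, f in enumerate(freqs):
        if f > 0 and cumulative + f >= target:
            if i + 1 == len(lower_bounds):
                raise ValueError("median falls in the open-ended top class")
            x_l, x_u = lower_bounds[i], lower_bounds[i + 1]
            return x_l + (x_u - x_l) * (target - cumulative) / f
        cumulative += f
    raise ValueError("no observations")

median_wealth = grouped_median(lower_bounds, freqs)
print(round(median_wealth, 3))
```

For the wealth data the loop stops in the class starting at 60, where the cumulative frequency first reaches 8,466.5.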

  • 7/28/2019 Stats Lec1

    25/29

    The Mode (cont.)

    For grouped data, the mode corresponds to the interval with greatest frequency density

    Range      Frequency   Class width   Frequency density
    0            3,417        10,000          0.3417   (modal class)
    10,000       1,303        15,000          0.0869
    25,000       1,240        15,000          0.0827
    40,000         714        10,000          0.0714
    50,000         642        10,000          0.0642
    :                :             :               :
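Frequency density is frequency divided by class width, which puts classes of different widths on a comparable footing. A minimal sketch using the wealth table's lower bounds and frequencies; the open-ended top class has no upper bound in the table, so it is left out here, which is an assumption of this example.

```python
# Lower class bounds and frequencies from the wealth table.
lower_bounds = [0, 10_000, 25_000, 40_000, 50_000, 60_000, 80_000, 100_000,
                150_000, 200_000, 300_000, 500_000, 1_000_000, 2_000_000]
freqs = [3417, 1303, 1240, 714, 642, 1361, 1270, 2708, 1633, 1242, 870, 367, 125, 41]

# zip stops at the shortest input, so the open-ended top class is skipped.
densities = [
    (lower, f / (upper - lower))
    for lower, upper, f in zip(lower_bounds, lower_bounds[1:], freqs)
]

# The modal class is the one with the greatest frequency density.
modal_lower, modal_density = max(densities, key=lambda pair: pair[1])

print(modal_lower, round(modal_density, 4))
```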

  • 7/28/2019 Stats Lec1

    26/29

    The Variance

    The variance is the average of all squared deviations from the mean:

    $$\sigma^2 = \frac{\sum f(x - \mu)^2}{\sum f}$$

    The larger this value, the greater the dispersion of the observations

  • 7/28/2019 Stats Lec1

    27/29

    Range       Mid-point, x (000)   Frequency, f   Deviation (x - μ)      (x - μ)^2        f(x - μ)^2
    0                   5.0              3,417           -126.4            15,987.81      54,630,329.97
    10,000             17.5              1,303           -113.9            12,982.98      16,916,826.55
    25,000             32.5              1,240            -98.9             9,789.70      12,139,223.03
    40,000             45.0                714            -86.4             7,472.37       5,335,274.81
    50,000             55.0                642            -76.4             5,843.52       3,751,537.16
    60,000             70.0              1,361            -61.4             3,775.23       5,138,086.73
    80,000             90.0              1,270            -41.4             1,717.51       2,181,241.95
    100,000           125.0              2,708             -6.4                41.51         112,411.42
    150,000           175.0              1,633             43.6             1,897.22       3,098,162.88
    200,000           250.0              1,242            118.6            14,055.79      17,457,288.35
    300,000           400.0                870            268.6            72,122.92      62,746,940.35
    500,000           750.0                367            618.6           382,612.90     140,418,932.52
    1,000,000        1500.0                125          1,368.6         1,872,948.56     234,118,569.53
    2,000,000        3000.0                 41          2,868.6         8,228,619.88     337,373,415.02
    Total                               16,933                                           895,418,240.28

    $$\sigma^2 = \frac{\sum f(x - \mu)^2}{\sum f} = \frac{895{,}418{,}240.28}{16{,}933} = 52{,}880.07$$

  • 7/28/2019 Stats Lec1

    28/29

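    The Standard Deviation

    The variance is measured in squared units (because we used squared deviations)

    Taking the square root gives the standard deviation:

    $$\sigma = \sqrt{52{,}880.07} = 229.957$$

    i.e. a standard deviation of 229,957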
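The grouped variance and standard deviation can be reproduced in a few lines. This sketch uses the mid-points (in thousands) and frequencies from the wealth table and treats the data as a population, dividing by the total frequency as on the slides.

```python
from math import sqrt

# Class mid-points (in thousands) and frequencies from the wealth table.
midpoints = [5.0, 17.5, 32.5, 45.0, 55.0, 70.0, 90.0, 125.0,
             175.0, 250.0, 400.0, 750.0, 1500.0, 3000.0]
freqs = [3417, 1303, 1240, 714, 642, 1361, 1270, 2708,
         1633, 1242, 870, 367, 125, 41]

n = sum(freqs)
mu = sum(f * x for x, f in zip(midpoints, freqs)) / n

# Frequency-weighted sum of squared deviations from the mean.
sum_sq_dev = sum(f * (x - mu) ** 2 for x, f in zip(midpoints, freqs))

pop_variance = sum_sq_dev / n  # population form: divide by the total frequency
pop_sd = sqrt(pop_variance)

print(round(pop_variance, 2), round(pop_sd, 3))
```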

  • 7/28/2019 Stats Lec1

    29/29

    Sample Measures

    For sample data, use

    $$s^2 = \frac{\sum f(x - \bar{x})^2}{n - 1}$$

    This gives an unbiased estimate of the population variance

    Take the square root of this for the sample standard deviation
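The only difference between the population and sample forms is the divisor. A minimal sketch follows; the function name `grouped_variance`, its `sample` flag and the tiny three-class data set are all hypothetical and chosen only to make the arithmetic easy to follow.

```python
from math import sqrt

def grouped_variance(midpoints, freqs, sample=True):
    """Variance of grouped data; divides by n - 1 if sample is True, else by n."""
    n = sum(freqs)
    if n < 2:
        raise ValueError("need at least two observations")
    mean = sum(f * x for x, f in zip(midpoints, freqs)) / n
    sum_sq_dev = sum(f * (x - mean) ** 2 for x, f in zip(midpoints, freqs))
    return sum_sq_dev / (n - 1 if sample else n)

# Hypothetical grouped data: mid-points 1, 2, 3 with frequencies 1, 2, 1.
midpoints = [1.0, 2.0, 3.0]
freqs = [1, 2, 1]

s2 = grouped_variance(midpoints, freqs)                    # sample form, n - 1
sigma2 = grouped_variance(midpoints, freqs, sample=False)  # population form, n
s = sqrt(s2)                                               # sample standard deviation

print(round(s2, 4), sigma2, round(s, 4))
```

With only four observations the two divisors give noticeably different answers; with thousands of observations, as in the wealth table, the difference is very small.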