Introduction to statistical inference

  • Terminology

    Population: a collection of units or individuals

    Parameter: a number associated with the population

    Sample: a subset of the population

    Estimate: a number computed from the sample, and used as a guess for the parameter

    How good the estimate is depends on how the sample was taken.

    Try to avoid or reduce bias in the sample

    Ani Adhikari and Philip Stark Statistics 2.3X Lecture 1.1 1 / 1
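
    The four terms on the slide above can be lined up against a small simulation. This is only an illustrative sketch: the population of heights, its size, the sample size of 100, and the seed are all made-up assumptions, not part of the lecture.

```python
import random

rng = random.Random(0)  # fixed seed, used only so the sketch is repeatable

# Population: a collection of units (here, hypothetical heights in cm).
population = [rng.gauss(170, 10) for _ in range(10_000)]

# Parameter: a number associated with the population (here, its mean).
parameter = sum(population) / len(population)

# Sample: a subset of the population (here, 100 units drawn uniformly
# at random without replacement).
sample = rng.sample(population, 100)

# Estimate: a number computed from the sample, used as a guess for the parameter.
estimate = sum(sample) / len(sample)

print(f"parameter = {parameter:.2f}, estimate = {estimate:.2f}")
```

    In practice the parameter is unknown; it can be computed here only because the whole simulated population is available.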

  • Roosevelt versus Landon

    1936 U.S. Presidential election, won by FDR

                                            % for FDR    Sample size
    Literary Digest prediction                  43        10,000,000
    Gallup prediction of Digest prediction      44             3,000
    Gallup prediction of election result        56            50,000
    Election result                             62

    [Table adapted from Statistics, 4th edition, by Freedman, Pisani, and Purves]

  • Avoid these!

    Selection bias: systematically leaving out a section of the population

    Non-response bias: people not answering survey questions

    Bigger isn't always better

    If the method of sampling is bad, taking a large sample doesn't help. You just get a big bad sample.
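
    The "big bad sample" point can be checked with a small simulation. This is a sketch under invented assumptions: the population of 100,000 voters, the 30,000-voter sampling frame, the support rates, and both sample sizes are hypothetical numbers chosen for illustration, not data from the 1936 polls.

```python
import random

rng = random.Random(1)  # fixed seed, used only so the sketch is repeatable

# Hypothetical population of 100,000 voters; 1 = supports the candidate.
# Only 30,000 voters are reachable through the (flawed) sampling frame,
# and support among them is lower than in the rest of the population.
in_frame = [1] * 13_500 + [0] * 16_500      # 45% support
out_of_frame = [1] * 46_500 + [0] * 23_500  # about 66% support
population = in_frame + out_of_frame

# The parameter: overall share of supporters (60,000 out of 100,000).
true_share = sum(population) / len(population)

# Big bad sample: 20,000 voters, but drawn only from the frame (selection bias).
big_bad = rng.sample(in_frame, 20_000)
big_bad_estimate = sum(big_bad) / len(big_bad)

# Small good sample: 1,000 voters drawn at random from the whole population.
small_good = rng.sample(population, 1_000)
small_good_estimate = sum(small_good) / len(small_good)

print(f"true share          = {true_share:.3f}")
print(f"big biased sample   = {big_bad_estimate:.3f}")
print(f"small random sample = {small_good_estimate:.3f}")
```

    The large sample settles very precisely on the support rate inside the frame, which is the wrong target; the much smaller random sample is noisier but centered on the population share.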

  • Random sample

    Random sample or probability sample: Before the sample is drawn, it has to be possible to calculate the probability with which each member of the population will be included in the sample.

    Note. This probability doesn't have to be the same for all members of the population.

    Example 1. I have a population of 4 people: A, B, C, and D. I decide to sample by selecting A, and then for each of B through D, deciding by a coin toss whether or not to include that person. Is this a random sample?

    Answer. Yes.

    P(A enters the sample) = 1
    P(B enters the sample) = 0.5
    P(C enters the sample) = 0.5
    P(D enters the sample) = 0.5
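
    The inclusion probabilities in Example 1 can also be approximated by repeating the sampling scheme many times. This is a minimal sketch: the function name draw_sample, the number of trials, and the seed are illustrative choices, and the coin is assumed to be fair.

```python
import random

def draw_sample(rng):
    """A is always included; B, C, and D each enter on a fair coin toss."""
    sample = ["A"]
    for person in ["B", "C", "D"]:
        if rng.random() < 0.5:  # one fair coin toss per person
            sample.append(person)
    return sample

rng = random.Random(2)  # fixed seed, used only so the sketch is repeatable
trials = 100_000
counts = {person: 0 for person in "ABCD"}

# Repeat the scheme and count how often each person enters the sample.
for _ in range(trials):
    for person in draw_sample(rng):
        counts[person] += 1

inclusion_freq = {person: counts[person] / trials for person in "ABCD"}
print(inclusion_freq)
```

    The observed frequency for A is exactly 1, and the frequencies for B, C, and D should land close to 0.5. The probabilities differ across members, but each one is known before the sample is drawn, which is what makes this a random sample.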

  • Types of samples

    Sample of convenience: for example, the first 10 people that walk by you on a street corner; not a random sample

    Simple random sample: draws uniformly at random without replacement from the population; random sample

    Cluster sample: for example, take a simple random sample of classes at a university, then take all the students in those classes; random sample

    Pay attention to the randomization

    Formulas for one kind of random sample might not work for another kind.
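
    The difference between a simple random sample and a cluster sample can be sketched in a few lines. The university below is hypothetical: 20 classes of 25 students, with the simplifying assumption that each student belongs to exactly one class. The sample sizes and seed are also illustrative.

```python
import random

rng = random.Random(3)  # fixed seed, used only so the sketch is repeatable

# Hypothetical university: 20 classes of 25 students, labeled (class, seat).
classes = {c: [(c, s) for s in range(25)] for c in range(20)}
students = [student for roster in classes.values() for student in roster]

# Simple random sample: 50 students drawn uniformly at random
# without replacement from the whole population.
srs = rng.sample(students, 50)

# Cluster sample: a simple random sample of 2 classes,
# then every student in the chosen classes.
chosen_classes = rng.sample(sorted(classes), 2)
cluster = [student for c in chosen_classes for student in classes[c]]

print("classes represented in the simple random sample:", len({c for c, _ in srs}))
print("classes represented in the cluster sample:", len({c for c, _ in cluster}))
```

    Under these assumptions each student has the same chance of inclusion in both designs (50/500 = 2/20 = 0.1), yet the samples behave differently: in the cluster sample classmates always enter together, which is one reason a formula derived for one design may not carry over to the other.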
