Agresti/Franklin Statistics, 1 of 139 Learn …. To analyze how likely it is that sample results...

139
Agresti/Franklin Statistics, 1 of 139 Learn …. To analyze how likely it is that sample results will be “close” to population values How probability provides the basis for making statistical inferences Chapter 6 Probability Distributions

Transcript of Agresti/Franklin Statistics, 1 of 139 Learn …. To analyze how likely it is that sample results...

Page 1: Agresti/Franklin Statistics, 1 of 139 Learn …. To analyze how likely it is that sample results will be “close” to population values How probability provides.

Agresti/Franklin Statistics, 1 of 139

Learn ….To analyze how likely it is that sample results will be “close” to population values

How probability provides the basis for making statistical inferences

Chapter 6Probability Distributions

Page 2: Agresti/Franklin Statistics, 1 of 139 Learn …. To analyze how likely it is that sample results will be “close” to population values How probability provides.

Agresti/Franklin Statistics, 1e, 2 of 139

Inferential Statistics

Use sample data to make decisions and predictions about a population

Page 3: Agresti/Franklin Statistics, 1 of 139 Learn …. To analyze how likely it is that sample results will be “close” to population values How probability provides.

Agresti/Franklin Statistics, 1e, 3 of 139

Section 6.1

How Can We Summarize Possible Outcomes and Their

Probabilities?

Page 4: Agresti/Franklin Statistics, 1 of 139 Learn …. To analyze how likely it is that sample results will be “close” to population values How probability provides.

Agresti/Franklin Statistics, 1e, 4 of 139

Randomness

The numerical values that a variable assumes are the result of some random phenomenon:• Selecting a random sample for a

population

or

• Performing a randomized experiment

Page 5: Agresti/Franklin Statistics, 1 of 139 Learn …. To analyze how likely it is that sample results will be “close” to population values How probability provides.

Agresti/Franklin Statistics, 1e, 5 of 139

Random Variable

A random variable is a numerical measurement of the outcome of a random phenomenon.

Page 6: Agresti/Franklin Statistics, 1 of 139 Learn …. To analyze how likely it is that sample results will be “close” to population values How probability provides.

Agresti/Franklin Statistics, 1e, 6 of 139

Random Variable

Use letters near the end of the alphabet, such as x, to symbolize variables.

Use a capital letter, such as X, to refer to the random variable itself.

Use a small letter, such as x, to refer to a particular value of the variable.

Page 7: Agresti/Franklin Statistics, 1 of 139 Learn …. To analyze how likely it is that sample results will be “close” to population values How probability provides.

Agresti/Franklin Statistics, 1e, 7 of 139

Probability Distribution

The probability distribution of a random variable specifies its possible values and their probabilities.

Page 8: Agresti/Franklin Statistics, 1 of 139 Learn …. To analyze how likely it is that sample results will be “close” to population values How probability provides.

Agresti/Franklin Statistics, 1e, 8 of 139

Discrete Random Variable

The possible outcomes are a set of separate numbers: (0, 1,2, …).

Page 9: Agresti/Franklin Statistics, 1 of 139 Learn …. To analyze how likely it is that sample results will be “close” to population values How probability provides.

Agresti/Franklin Statistics, 1e, 9 of 139

Probability Distribution of a Discrete Random Variable

A discrete random variable X takes a set of separate values (such as 0,1,2,…)

Its probability distribution assigns a probability P(x) to each possible value x:

• For each x, the probability P(x) falls between 0 and 1

• The sum of the probabilities for all the possible x values equals 1

Page 10: Agresti/Franklin Statistics, 1 of 139 Learn …. To analyze how likely it is that sample results will be “close” to population values How probability provides.

Agresti/Franklin Statistics, 1e, 10 of 139

Example: How many Home Runs Will the Red Sox Hit in a Game?

What is the estimated probability of at least three home runs?

Page 11: Agresti/Franklin Statistics, 1 of 139 Learn …. To analyze how likely it is that sample results will be “close” to population values How probability provides.

Agresti/Franklin Statistics, 1e, 11 of 139

Example: How many Home Runs Will the Red Sox Hit in a Game?

Page 12: Agresti/Franklin Statistics, 1 of 139 Learn …. To analyze how likely it is that sample results will be “close” to population values How probability provides.

Agresti/Franklin Statistics, 1e, 12 of 139

Parameters of a Probability Distribution

Parameters: numerical summaries of a probability distribution.

Page 13: Agresti/Franklin Statistics, 1 of 139 Learn …. To analyze how likely it is that sample results will be “close” to population values How probability provides.

Agresti/Franklin Statistics, 1e, 13 of 139

The Mean of a Probability Distribution

The mean of a probability distribution is denoted by the parameter, µ.

Page 14: Agresti/Franklin Statistics, 1 of 139 Learn …. To analyze how likely it is that sample results will be “close” to population values How probability provides.

Agresti/Franklin Statistics, 1e, 14 of 139

The Mean of a Discrete Probability Distribution

The mean of a probability distribution for a discrete random variable is

where the sum is taken over all possible values of x.

)(xpx

Page 15: Agresti/Franklin Statistics, 1 of 139 Learn …. To analyze how likely it is that sample results will be “close” to population values How probability provides.

Agresti/Franklin Statistics, 1e, 15 of 139

Expected Value of X

The mean of a probability distribution of a random variable X is also called the expected value of X.

The expected value reflects not what we’ll observe in a single observation, but rather that we expect for the average in a long run of observations.

Page 16: Agresti/Franklin Statistics, 1 of 139 Learn …. To analyze how likely it is that sample results will be “close” to population values How probability provides.

Agresti/Franklin Statistics, 1e, 16 of 139

Example: What’s the Expected Number of Home Runs in a Baseball Game?

Find the mean of this probability distribution.

Page 17: Agresti/Franklin Statistics, 1 of 139 Learn …. To analyze how likely it is that sample results will be “close” to population values How probability provides.

Agresti/Franklin Statistics, 1e, 17 of 139

Example: What’s the Expected Number of Home Runs in a Baseball Game?

The mean:

= 0(0.23) + 1(0.38) + 2(0.22) + 3(0.13) + 4(0.03) + 5(0.01) = 1.38

)(xpx

Page 18: Agresti/Franklin Statistics, 1 of 139 Learn …. To analyze how likely it is that sample results will be “close” to population values How probability provides.

Agresti/Franklin Statistics, 1e, 18 of 139

The Standard Deviation of a Probability Distribution

The standard deviation of a probability distribution, denoted by the parameter, σ, measures its spread.

Larger values of σ correspond to greater spread.

Page 19: Agresti/Franklin Statistics, 1 of 139 Learn …. To analyze how likely it is that sample results will be “close” to population values How probability provides.

Agresti/Franklin Statistics, 1e, 19 of 139

Continuous Random Variable

A continuous random variable has an infinite continuum of possible values in an interval.

Examples are: time, age and size measures such as height and weight.

Page 20: Agresti/Franklin Statistics, 1 of 139 Learn …. To analyze how likely it is that sample results will be “close” to population values How probability provides.

Agresti/Franklin Statistics, 1e, 20 of 139

Probability Distribution of a Continuous Random Variable

A continuous random variable has possible values that from an interval.

Its probability distribution is specified by a curve.

Each interval has probability between 0 and 1.

The interval containing all possible values has probability equal to 1.

Page 21: Agresti/Franklin Statistics, 1 of 139 Learn …. To analyze how likely it is that sample results will be “close” to population values How probability provides.

Agresti/Franklin Statistics, 1e, 21 of 139

Continuous Variables are Measured in a Discrete Manner because of Rounding.

Page 22: Agresti/Franklin Statistics, 1 of 139 Learn …. To analyze how likely it is that sample results will be “close” to population values How probability provides.

Agresti/Franklin Statistics, 1e, 22 of 139

Which Wager do You Prefer? You are given $100 and told that you must pick

one of two wagers, for an outcome based on flipping a coin:

A. You win $200 if it comes up heads and lose $50 if it comes up tails.

B. You win $350 if it comes up head and lose your original $100 if it comes up tails.

Without doing any calculation, which wager would you prefer?

Page 23: Agresti/Franklin Statistics, 1 of 139 Learn …. To analyze how likely it is that sample results will be “close” to population values How probability provides.

Agresti/Franklin Statistics, 1e, 23 of 139

You win $200 if it comes up heads and lose $50 if it comes up tails.

Find the expected outcome for this

wager.

a. $100

b. $25

c. $50

d. $75

Page 24: Agresti/Franklin Statistics, 1 of 139 Learn …. To analyze how likely it is that sample results will be “close” to population values How probability provides.

Agresti/Franklin Statistics, 1e, 24 of 139

You win $350 if it comes up head and lose your original $100 if it comes up tails.

Find the expected outcome for this

wager.

a. $100

b. $125

c. $350

d. $275

Page 25: Agresti/Franklin Statistics, 1 of 139 Learn …. To analyze how likely it is that sample results will be “close” to population values How probability provides.

Agresti/Franklin Statistics, 1e, 25 of 139

Section 6.2

How Can We Find Probabilities for Bell-Shaped Distributions?

Page 26: Agresti/Franklin Statistics, 1 of 139 Learn …. To analyze how likely it is that sample results will be “close” to population values How probability provides.

Agresti/Franklin Statistics, 1e, 26 of 139

Normal Distribution

The normal distribution is symmetric, bell-shaped and characterized by its mean µ and standard deviation σ.

The probability of falling within any particular number of standard deviations of µ is the same for all normal distributions.

Page 27: Agresti/Franklin Statistics, 1 of 139 Learn …. To analyze how likely it is that sample results will be “close” to population values How probability provides.

Agresti/Franklin Statistics, 1e, 27 of 139

Normal Distribution

Page 28: Agresti/Franklin Statistics, 1 of 139 Learn …. To analyze how likely it is that sample results will be “close” to population values How probability provides.

Agresti/Franklin Statistics, 1e, 28 of 139

Z-Score

Recall: The z-score for an observation is the number of standard deviations that it falls from the mean.

Page 29: Agresti/Franklin Statistics, 1 of 139 Learn …. To analyze how likely it is that sample results will be “close” to population values How probability provides.

Agresti/Franklin Statistics, 1e, 29 of 139

Z-Score

For each fixed number z, the probability within z standard deviations of the mean is the area under the normal curve between

z and z -

Page 30: Agresti/Franklin Statistics, 1 of 139 Learn …. To analyze how likely it is that sample results will be “close” to population values How probability provides.

Agresti/Franklin Statistics, 1e, 30 of 139

Z-Score

For z = 1:

68% of the area (probability) of a normal

distribution falls between:

1 and 1 -

Page 31: Agresti/Franklin Statistics, 1 of 139 Learn …. To analyze how likely it is that sample results will be “close” to population values How probability provides.

Agresti/Franklin Statistics, 1e, 31 of 139

Z-Score

For z = 2:

95% of the area (probability) of a normal

distribution falls between:

2 and 2 -

Page 32: Agresti/Franklin Statistics, 1 of 139 Learn …. To analyze how likely it is that sample results will be “close” to population values How probability provides.

Agresti/Franklin Statistics, 1e, 32 of 139

Z-Score

For z = 3:

Nearly 100% of the area (probability) of a normal

distribution falls between:

3 and 3 -

Page 33: Agresti/Franklin Statistics, 1 of 139 Learn …. To analyze how likely it is that sample results will be “close” to population values How probability provides.

Agresti/Franklin Statistics, 1e, 33 of 139

The Normal Distribution: The Most Important One in Statistics

It’s important because…• Many variables have approximate normal

distributions.

• It’s used to approximate many discrete distributions.

• Many statistical methods use the normal distribution even when the data are not bell-shaped.

Page 34: Agresti/Franklin Statistics, 1 of 139 Learn …. To analyze how likely it is that sample results will be “close” to population values How probability provides.

Agresti/Franklin Statistics, 1e, 34 of 139

Finding Normal Probabilities for Various Z-values

Suppose we wish to find the probability within, say, 1.43 standard deviations of µ.

Page 35: Agresti/Franklin Statistics, 1 of 139 Learn …. To analyze how likely it is that sample results will be “close” to population values How probability provides.

Agresti/Franklin Statistics, 1e, 35 of 139

Z-Scores and the Standard Normal Distribution

When a random variable has a normal distribution and its values are converted to z-scores by subtracting the mean and dividing by the standard deviation, the z-scores have the standard normal distribution.

Page 36: Agresti/Franklin Statistics, 1 of 139 Learn …. To analyze how likely it is that sample results will be “close” to population values How probability provides.

Agresti/Franklin Statistics, 1e, 36 of 139

Example: Find the probability within 1.43 standard deviations of µ

Page 37: Agresti/Franklin Statistics, 1 of 139 Learn …. To analyze how likely it is that sample results will be “close” to population values How probability provides.

Agresti/Franklin Statistics, 1e, 37 of 139

Example: Find the probability within 1.43 standard deviations of µ

Probability below 1.43σ = .9236

Probability above 1.43σ = .0764

By symmetry, probability below -1.43σ = .0764

Total probability under the curve = 1

Page 38: Agresti/Franklin Statistics, 1 of 139 Learn …. To analyze how likely it is that sample results will be “close” to population values How probability provides.

Agresti/Franklin Statistics, 1e, 38 of 139

Example: Find the probability within 1.43 standard deviations of µ

Page 39: Agresti/Franklin Statistics, 1 of 139 Learn …. To analyze how likely it is that sample results will be “close” to population values How probability provides.

Agresti/Franklin Statistics, 1e, 39 of 139

Example: Find the probability within 1.43 standard deviations of µ

The probability falling within 1.43 standard deviations of the mean equals:

1 – 0.1528 = 0.8472, about 85%

Page 40: Agresti/Franklin Statistics, 1 of 139 Learn …. To analyze how likely it is that sample results will be “close” to population values How probability provides.

Agresti/Franklin Statistics, 1e, 40 of 139

How Can We Find the Value of z for a Certain Cumulative Probability?

Example: Find the value of z for a cumulative probability of 0.025.

Page 41: Agresti/Franklin Statistics, 1 of 139 Learn …. To analyze how likely it is that sample results will be “close” to population values How probability provides.

Agresti/Franklin Statistics, 1e, 41 of 139

Example: Find the Value of z For a Cumulative Probability of 0.025

Look up the cumulative probability of 0.025 in the body of Table A.

A cumulative probability of 0.025 corresponds to z = -1.96.

So, a probability of 0.025 lies below µ - 1.96σ.

Example: Find the Value of z For a Cumulative Probability of 0.025

Page 42: Agresti/Franklin Statistics, 1 of 139 Learn …. To analyze how likely it is that sample results will be “close” to population values How probability provides.

Agresti/Franklin Statistics, 1e, 42 of 139

Example: Find the Value of z For a Cumulative Probability of 0.025

Page 43: Agresti/Franklin Statistics, 1 of 139 Learn …. To analyze how likely it is that sample results will be “close” to population values How probability provides.

Agresti/Franklin Statistics, 1e, 43 of 139

Example: What IQ Do You Need to Get Into Mensa?

Mensa is a society of high-IQ people whose members have a score on an IQ test at the 98th percentile or higher.

Page 44: Agresti/Franklin Statistics, 1 of 139 Learn …. To analyze how likely it is that sample results will be “close” to population values How probability provides.

Agresti/Franklin Statistics, 1e, 44 of 139

Example: What IQ Do You Need to Get Into Mensa?

How many standard deviations above the mean is the 98th percentile?

• The cumulative probability of 0.980 in the body of Table A corresponds to z = 2.05.

• The 98th percentile is 2.05 standard deviations above µ.

Page 45: Agresti/Franklin Statistics, 1 of 139 Learn …. To analyze how likely it is that sample results will be “close” to population values How probability provides.

Agresti/Franklin Statistics, 1e, 45 of 139

Example: What IQ Do You Need to Get Into Mensa?

What is the IQ for that percentile?

• Since µ = 100 and σ 16, the 98th percentile of IQ equals:

µ + 2.05σ = 100 + 2.05(16) = 133

Page 46: Agresti/Franklin Statistics, 1 of 139 Learn …. To analyze how likely it is that sample results will be “close” to population values How probability provides.

Agresti/Franklin Statistics, 1e, 46 of 139

Z-Score for a Value of a Random Variable

The z-score for a value of a random variable is the number of standard deviations that x falls from the mean µ.

It is calculated as:

-x

z

Page 47: Agresti/Franklin Statistics, 1 of 139 Learn …. To analyze how likely it is that sample results will be “close” to population values How probability provides.

Agresti/Franklin Statistics, 1e, 47 of 139

Example: Finding Your Relative Standing on The SAT

Scores on the verbal or math portion of the SAT are approximately normally distributed with mean µ = 500 and standard deviation σ = 100. The scores range from 200 to 800.

Page 48: Agresti/Franklin Statistics, 1 of 139 Learn …. To analyze how likely it is that sample results will be “close” to population values How probability provides.

Agresti/Franklin Statistics, 1e, 48 of 139

Example: Finding Your Relative Standing on The SAT

If one of your SAT scores was x = 650, how many standard deviations from the mean was it?

Page 49: Agresti/Franklin Statistics, 1 of 139 Learn …. To analyze how likely it is that sample results will be “close” to population values How probability provides.

Agresti/Franklin Statistics, 1e, 49 of 139

Example: Finding Your Relative Standing on The SAT

Find the z-score for x = 650.

1.50 100

500 - 650

-x z

Page 50: Agresti/Franklin Statistics, 1 of 139 Learn …. To analyze how likely it is that sample results will be “close” to population values How probability provides.

Agresti/Franklin Statistics, 1e, 50 of 139

Example: Finding Your Relative Standing on The SAT

What percentage of SAT scores was higher than yours?

• Find the cumulative probability for the z-score of 1.50 from Table A.

• The cumulative probability is 0.9332.

Page 51: Agresti/Franklin Statistics, 1 of 139 Learn …. To analyze how likely it is that sample results will be “close” to population values How probability provides.

Agresti/Franklin Statistics, 1e, 51 of 139

Example: Finding Your Relative Standing on The SAT

The cumulative probability below 650 is 0.9332.

The probability above 650 is 1 – 0.9332 = 0.0668

About 6.7% of SAT scores are higher than yours.

Page 52: Agresti/Franklin Statistics, 1 of 139 Learn …. To analyze how likely it is that sample results will be “close” to population values How probability provides.

Agresti/Franklin Statistics, 1e, 52 of 139

Example: What Proportion of Students Get A Grade of B?

On the midterm exam in introductory statistics, an instructor always give a grade of B to students who score between 80 and 90.

One year, the scores on the exam have approximately a normal distribution with mean 83 and standard deviation 5.

About what proportion of students get a B?

Page 53: Agresti/Franklin Statistics, 1 of 139 Learn …. To analyze how likely it is that sample results will be “close” to population values How probability provides.

Agresti/Franklin Statistics, 1e, 53 of 139

Example: What Proportion of Students Get A Grade of B?

Calculate the z-score for 80 and for 90:

1.40 5

83 - 90

-x z

0.60- 5

83 - 80

-x z

Page 54: Agresti/Franklin Statistics, 1 of 139 Learn …. To analyze how likely it is that sample results will be “close” to population values How probability provides.

Agresti/Franklin Statistics, 1e, 54 of 139

Example: What Proportion of Students Get A Grade of B?

Look up the cumulative probabilities in Table A.

• For z = 1.40, cum. Prob. = 0.9192

• For z = -0.60, cum. Prob. = 0.2743 It follows that about 0.9192 – 0.2743 =

0.6449, or about 64% of the exam scores were in the ‘B’ range.

Page 55: Agresti/Franklin Statistics, 1 of 139 Learn …. To analyze how likely it is that sample results will be “close” to population values How probability provides.

Agresti/Franklin Statistics, 1e, 55 of 139

Using z-scores to Find Normal Probabilities

If we’re given a value x and need to find a probability, convert x to a z-score using:

Use a table of normal probabilities to get a cumulative probability.

Convert it to the probability of interest.

-x

z

Page 56: Agresti/Franklin Statistics, 1 of 139 Learn …. To analyze how likely it is that sample results will be “close” to population values How probability provides.

Agresti/Franklin Statistics, 1e, 56 of 139

Using z-scores to Find Random Variable x Values

If we’re given a probability and need to find the value of x, convert the probability to the related cumulative probability.

Find the z-score using a normal table.

Evaluate x = zσ + µ.

Page 57: Agresti/Franklin Statistics, 1 of 139 Learn …. To analyze how likely it is that sample results will be “close” to population values How probability provides.

Agresti/Franklin Statistics, 1e, 57 of 139

Example: How Can We Compare Test Scores That Use Different Scales?

When you applied to college, you scored 650 on an SAT exam, which had mean µ = 500 and standard deviation σ = 100.

Your friend took the comparable ACT in 2001, scoring 30. That year, the ACT had µ = 21.0 and σ = 4.7.

How can we tell who did better?

Page 58: Agresti/Franklin Statistics, 1 of 139 Learn …. To analyze how likely it is that sample results will be “close” to population values How probability provides.

Agresti/Franklin Statistics, 1e, 58 of 139

What is the z-score for your SAT score of 650?

For the SAT scores: µ = 500 and σ = 100.

a. 2.15

b. 1.50

c. -1.75

d. -1.25

Page 59: Agresti/Franklin Statistics, 1 of 139 Learn …. To analyze how likely it is that sample results will be “close” to population values How probability provides.

Agresti/Franklin Statistics, 1e, 59 of 139

What percentage of students scored higher than you?

a. 10%

b. 5%

c. 2%

d. 7%

Page 60: Agresti/Franklin Statistics, 1 of 139 Learn …. To analyze how likely it is that sample results will be “close” to population values How probability provides.

Agresti/Franklin Statistics, 1e, 60 of 139

What is the z-score for your friend’s ACT score of 30?

The ACT scores had a mean of 21 and a standard deviation of 4.7.

a. 1.84

b. -1.56

c. 1.91

d. -2.24

Page 61: Agresti/Franklin Statistics, 1 of 139 Learn …. To analyze how likely it is that sample results will be “close” to population values How probability provides.

Agresti/Franklin Statistics, 1e, 61 of 139

What percentage of students scored higher than your friend?

a. 3%

b. 6%

c. 10%

d. 1%

Page 62: Agresti/Franklin Statistics, 1 of 139 Learn …. To analyze how likely it is that sample results will be “close” to population values How probability provides.

Agresti/Franklin Statistics, 1e, 62 of 139

Standard Normal Distribution

The standard normal distribution is the normal distribution with mean µ = 0 and standard deviation σ = 1.

It is the distribution of normal z-scores.

Page 63: Agresti/Franklin Statistics, 1 of 139 Learn …. To analyze how likely it is that sample results will be “close” to population values How probability provides.

Agresti/Franklin Statistics, 1e, 63 of 139

Section 6.3

How Can We Find Probabilities When Each Observation Has Two Possible

Outcomes?

Page 64: Agresti/Franklin Statistics, 1 of 139 Learn …. To analyze how likely it is that sample results will be “close” to population values How probability provides.

Agresti/Franklin Statistics, 1e, 64 of 139

The Binomial Distribution

Each observation is binary: it has one of two possible outcomes.

Examples:• Accept, or decline an offer from a bank for a credit

card.

• Have, or have not, health insurance.

• Vote yes or no in a referendum.

Page 65: Agresti/Franklin Statistics, 1 of 139 Learn …. To analyze how likely it is that sample results will be “close” to population values How probability provides.

Agresti/Franklin Statistics, 1e, 65 of 139

Conditions for the Binomial Distribution

Each of n trails has two possible outcomes: “success” and “failure”.

Each trail has the same probability of success, denoted by p.

The n trials are independent.

The binomial random variable X is the number of successes in the n trials.

Page 66: Agresti/Franklin Statistics, 1 of 139 Learn …. To analyze how likely it is that sample results will be “close” to population values How probability provides.

Agresti/Franklin Statistics, 1e, 66 of 139

Example: Finding Binomial Probabilities for An ESP Experiment

John Doe claims to possess ESP. An experiment is conducted:

• A person in one room picks one of the integers 1, 2, 3, 4, 5 at random.

• In another room, John Doe identifies the number he believes was picked.

• The experiment is done with three trials.

• Doe got the correct answer twice.

Page 67: Agresti/Franklin Statistics, 1 of 139 Learn …. To analyze how likely it is that sample results will be “close” to population values How probability provides.

Agresti/Franklin Statistics, 1e, 67 of 139

Example: Finding Binomial Probabilities for An ESP Experiment

If John Doe does not actually have ESP and is actually guessing the number, what is the probability that he’d make a correct guess on two of the three trials?

Page 68: Agresti/Franklin Statistics, 1 of 139 Learn …. To analyze how likely it is that sample results will be “close” to population values How probability provides.

Agresti/Franklin Statistics, 1e, 68 of 139

Example: Finding Binomial Probabilities for An ESP Experiment

Page 69: Agresti/Franklin Statistics, 1 of 139 Learn …. To analyze how likely it is that sample results will be “close” to population values How probability provides.

Agresti/Franklin Statistics, 1e, 69 of 139

Example: Finding Binomial Probabilities for An ESP Experiment

The three ways John Doe could make two correct guesses in three trials are: SSF, SFS, and FSS.

Each of these has probability: (0.2)2(0.8)=0.032.

The total probability of two correct guesses is 3(0.2)2(0.8)=0.096.

Page 70: Agresti/Franklin Statistics, 1 of 139 Learn …. To analyze how likely it is that sample results will be “close” to population values How probability provides.

Agresti/Franklin Statistics, 1e, 70 of 139

Probabilities for a Binomial Distribution

Denote the probability of success on a trial by p.

For n independent trials, the probability of x successes equals:

nxpp xnx ,...,2,1,0,)1(x)!-(nx!

n!p(x) )

Page 71: Agresti/Franklin Statistics, 1 of 139 Learn …. To analyze how likely it is that sample results will be “close” to population values How probability provides.

Agresti/Franklin Statistics, 1e, 71 of 139

Example: Using the Binomial Formula in ESP Experiment

The probability of exactly 2 correct guesses is the binomial probability with n = 3 trials, x = 2 correct guesses and p = 0.2 probability of a correct guess.

0.096 8)3(0.04)(0. )8.0()2.0(2!1!

3! p(2) 12

Page 72: Agresti/Franklin Statistics, 1 of 139 Learn …. To analyze how likely it is that sample results will be “close” to population values How probability provides.

Agresti/Franklin Statistics, 1e, 72 of 139

Example: Are Women Passed over for Managerial Training?

Example: Presence of bias in promotion.• Large supermarket in Florida.

• Group of women claimed that female employees were passed over for management training.

Page 73: Agresti/Franklin Statistics, 1 of 139 Learn …. To analyze how likely it is that sample results will be “close” to population values How probability provides.

Agresti/Franklin Statistics, 1e, 73 of 139

Example: Are Women Passed over for Managerial Training?

Large employee pool of more than 1000 people.

Half the employees are male; half are female.

None of the 10 employees chosen for management training were female.

Page 74: Agresti/Franklin Statistics, 1 of 139 Learn …. To analyze how likely it is that sample results will be “close” to population values How probability provides.

Agresti/Franklin Statistics, 1e, 74 of 139

Example: Are Women Passed over for Managerial Training?

How can we investigate statistically the women’s assertion of gender bias?

Page 75: Agresti/Franklin Statistics, 1 of 139 Learn …. To analyze how likely it is that sample results will be “close” to population values How probability provides.

Agresti/Franklin Statistics, 1e, 75 of 139

Example: Are Women Passed over for Managerial Training?

If the employees are selected randomly in terms of gender, about half of the employees picked should be females and about half should be males.

Page 76: Agresti/Franklin Statistics, 1 of 139 Learn …. To analyze how likely it is that sample results will be “close” to population values How probability provides.

Agresti/Franklin Statistics, 1e, 76 of 139

Example: Are Women Passed over for Managerial Training?

Due to ordinary sampling variation, it need not happen that exactly 50 % of those selected are females.

Page 77: Agresti/Franklin Statistics, 1 of 139 Learn …. To analyze how likely it is that sample results will be “close” to population values How probability provides.

Agresti/Franklin Statistics, 1e, 77 of 139

Example: Are Women Passed over for Managerial Training?

If employees were actually selected at random for the training, what are the chances that none of the 10 employees selected were females?

Page 78: Agresti/Franklin Statistics, 1 of 139 Learn …. To analyze how likely it is that sample results will be “close” to population values How probability provides.

Agresti/Franklin Statistics, 1e, 78 of 139

Example: Are Women Passed over for Managerial Training?

The probability that no females are chosen equals:

001.0)50.0()50.0(0!10!

10! p(0) 100

Page 79: Agresti/Franklin Statistics, 1 of 139 Learn …. To analyze how likely it is that sample results will be “close” to population values How probability provides.

Agresti/Franklin Statistics, 1e, 79 of 139

Example: Are Women Passed over for Managerial Training?

It is very unlikely (one chance in a thousand) that none of the 10 selected for management training would be female.

Page 80: Agresti/Franklin Statistics, 1 of 139 Learn …. To analyze how likely it is that sample results will be “close” to population values How probability provides.

Agresti/Franklin Statistics, 1e, 80 of 139

Example: Are Women Passed over for Managerial Training?

Page 81: Agresti/Franklin Statistics, 1 of 139 Learn …. To analyze how likely it is that sample results will be “close” to population values How probability provides.

Agresti/Franklin Statistics, 1e, 81 of 139

Do the Binomial Conditions Apply?

Before you use the binomial distribution, check that its three conditions apply:

• Binary data (success or failure).

• The same probability of success for each trial (denoted by p).

• Independent trials.

Page 82: Agresti/Franklin Statistics, 1 of 139 Learn …. To analyze how likely it is that sample results will be “close” to population values How probability provides.

Agresti/Franklin Statistics, 1e, 82 of 139

Do the Binomial Conditions Apply to the Managerial Training Example?

The data are binary. If employees are selected randomly, the

probability of selecting a female on a given trial is 0.50.

With random sampling from a large population, outcomes from trials are independent.

Page 83: Agresti/Franklin Statistics, 1 of 139 Learn …. To analyze how likely it is that sample results will be “close” to population values How probability provides.

Agresti/Franklin Statistics, 1e, 83 of 139

Binomial Mean and Standard Deviation

The binomial probability distribution for n trials with probability p of success on each trial has mean µ and standard deviation σ given by:

p)-np(1 np,

Page 84: Agresti/Franklin Statistics, 1 of 139 Learn …. To analyze how likely it is that sample results will be “close” to population values How probability provides.

Agresti/Franklin Statistics, 1e, 84 of 139

Example: How Can We Check for Racial Profiling?

Study conducted by the American Civil Liberties Union.

Study analyzed whether African-American drivers were more likely than other in the population to be targeted by police for traffic stops.

Page 85: Agresti/Franklin Statistics, 1 of 139 Learn …. To analyze how likely it is that sample results will be “close” to population values How probability provides.

Agresti/Franklin Statistics, 1e, 85 of 139

Example: How Can We Check for Racial Profiling?

Data:• 262 police car stops in Philadelphia in

1997.

• 207 of the drivers stopped were African-American.

• In 1997, Philadelphia’s population was 42.2% African-American.

Page 86: Agresti/Franklin Statistics, 1 of 139 Learn …. To analyze how likely it is that sample results will be “close” to population values How probability provides.

Agresti/Franklin Statistics, 1e, 86 of 139

Example: How Can We Check for Racial Profiling?

Does the number of African-Americans stopped suggest possible bias, being higher than we would expect (other things being equal, such as the rate of violating traffic laws)?

Page 87: Agresti/Franklin Statistics, 1 of 139 Learn …. To analyze how likely it is that sample results will be “close” to population values How probability provides.

Agresti/Franklin Statistics, 1e, 87 of 139

Example: How Can We Check for Racial Profiling?

Assume:• 262 car stops represent n = 262 trials.

• Successive police car stops are independent.

• P(driver is African-American) is p = 0.422.

Page 88: Agresti/Franklin Statistics, 1 of 139 Learn …. To analyze how likely it is that sample results will be “close” to population values How probability provides.

Agresti/Franklin Statistics, 1e, 88 of 139

Example: How Can We Check for Racial Profiling?

Calculate the mean and standard deviation of this binomial distribution:

p)-np(1 np,

Page 89: Agresti/Franklin Statistics, 1 of 139 Learn …. To analyze how likely it is that sample results will be “close” to population values How probability provides.

Agresti/Franklin Statistics, 1e, 89 of 139

Example: How Can We Check for Racial Profiling?

8(0.578)262(0.422)

111 262(0.422)

Page 90: Agresti/Franklin Statistics, 1 of 139 Learn …. To analyze how likely it is that sample results will be “close” to population values How probability provides.

Agresti/Franklin Statistics, 1e, 90 of 139

Example: How Can We Check for Racial Profiling?

Recall: Empirical Rule• When a distribution is bell-shaped, about

100% of it falls within 3 standard deviations of the mean.

Page 91: Agresti/Franklin Statistics, 1 of 139 Learn …. To analyze how likely it is that sample results will be “close” to population values How probability provides.

Agresti/Franklin Statistics, 1e, 91 of 139

Example: How Can We Check for Racial Profiling?

1353(8)111 3

87 3(8) - 111 3 -u

Page 92: Agresti/Franklin Statistics, 1 of 139 Learn …. To analyze how likely it is that sample results will be “close” to population values How probability provides.

Agresti/Franklin Statistics, 1e, 92 of 139

Example: How Can We Check for Racial Profiling?

If no racial profiling is happening, we would not be surprised if between about 87 and 135 of the 262 people stopped were African-American.

The actual number stopped (207) is well above these values.

The number of African-American stopped is too high, even taking into account random variation.

Page 93: Agresti/Franklin Statistics, 1 of 139 Learn …. To analyze how likely it is that sample results will be “close” to population values How probability provides.

Agresti/Franklin Statistics, 1e, 93 of 139

Example: How Can We Check for Racial Profiling?

Limitation of the analysis:• Different people do different

amounts of driving, so we don’t really know that 42.2% of the potential stops were African-American.

Page 94: Agresti/Franklin Statistics, 1 of 139 Learn …. To analyze how likely it is that sample results will be “close” to population values How probability provides.

Agresti/Franklin Statistics, 1e, 94 of 139

When Is the Binomial Distribution Dell Shaped?

The binomial distribution has close to a symmetric, bell shape when the expected number of successes, np, and the expected number of failures, n(1-p) are both at least 15.

Page 95: Agresti/Franklin Statistics, 1 of 139 Learn …. To analyze how likely it is that sample results will be “close” to population values How probability provides.

Agresti/Franklin Statistics, 1e, 95 of 139

Section 6.4

How Likely Are the Possible Values of a Statistic?

The Sampling Distribution

Page 96: Agresti/Franklin Statistics, 1 of 139 Learn …. To analyze how likely it is that sample results will be “close” to population values How probability provides.

Agresti/Franklin Statistics, 1e, 96 of 139

Statistic

Recall: A statistic is a numerical summary of sample data, such as a sample proportion or a sample mean.

Page 97: Agresti/Franklin Statistics, 1 of 139 Learn …. To analyze how likely it is that sample results will be “close” to population values How probability provides.

Agresti/Franklin Statistics, 1e, 97 of 139

Parameter

Recall: A parameter is a numerical summary of a population, such as a population proportion or a population mean.

Page 98: Agresti/Franklin Statistics, 1 of 139 Learn …. To analyze how likely it is that sample results will be “close” to population values How probability provides.

Agresti/Franklin Statistics, 1e, 98 of 139

Statistics and Parameters

In practice, we seldom know the values of parameters.

Parameters are estimated using sample data.

We use statistics to estimate parameters.

Page 99: Agresti/Franklin Statistics, 1 of 139 Learn …. To analyze how likely it is that sample results will be “close” to population values How probability provides.

Agresti/Franklin Statistics, 1e, 99 of 139

Example: 2003 California Recall Election

Prior to counting the votes, the proportion in favor of recalling Governor Gray Davis was an unknown parameter.

An exit poll of 3160 voters reported that the sample proportion in favor of a recall was 0.54.

Page 100: Agresti/Franklin Statistics, 1 of 139 Learn …. To analyze how likely it is that sample results will be “close” to population values How probability provides.

Agresti/Franklin Statistics, 1e, 100 of 139

Example: 2003 California Recall Election

If a different random sample of about 3000 voters were selected, a different sample proportion would occur.

Page 101: Agresti/Franklin Statistics, 1 of 139 Learn …. To analyze how likely it is that sample results will be “close” to population values How probability provides.

Agresti/Franklin Statistics, 1e, 101 of 139

Example: 2003 California Recall Election

Imagine all the distinct samples of 3000 voters you could possibly get.

Each such sample has a value for the sample proportion.

Page 102: Agresti/Franklin Statistics, 1 of 139 Learn …. To analyze how likely it is that sample results will be “close” to population values How probability provides.

Agresti/Franklin Statistics, 1e, 102 of 139

Statistics and Parameters

How do we know that a sample statistic is a good estimate of a population parameter?

To answer this, we need to look at a probability distribution called the sampling distribution.

Page 103: Agresti/Franklin Statistics, 1 of 139 Learn …. To analyze how likely it is that sample results will be “close” to population values How probability provides.

Agresti/Franklin Statistics, 1e, 103 of 139

Sampling Distribution

The sampling distribution of a statistic is the probability distribution that specifies probabilities for the possible values the statistic can take.

Page 104: Agresti/Franklin Statistics, 1 of 139 Learn …. To analyze how likely it is that sample results will be “close” to population values How probability provides.

Agresti/Franklin Statistics, 1e, 104 of 139

The Sampling Distribution of the Sample Proportion

Look at each possible sample. Find the sample proportion for each

sample. Construct the frequency distribution of

the sample proportion values. This frequency distribution is the

sampling distribution of the sample proportion.

Page 105: Agresti/Franklin Statistics, 1 of 139 Learn …. To analyze how likely it is that sample results will be “close” to population values How probability provides.

Agresti/Franklin Statistics, 1e, 105 of 139

Example: Sampling Distribution

Which Brand of Pizza Do You Prefer?

• Two Choices: A or D.

• Assume that half of the population prefers Brand A and half prefers Random D.

• Take a random sample of n = 3 tasters.

Page 106: Agresti/Franklin Statistics, 1 of 139 Learn …. To analyze how likely it is that sample results will be “close” to population values How probability provides.

Agresti/Franklin Statistics, 1e, 106 of 139

Example: Sampling Distribution

Sample No. Prefer Pizza A

Proportion

(A,A,A) 3 1

(A,A,D) 2 2/3

(A,D,A) 2 2/3

(D,A,A) 2 2/3

(A,D,D) 1 1/3

(D,A,D) 1 1/3

(D,D,A) 1 1/3

(D,D,D) 0 0

Page 107: Agresti/Franklin Statistics, 1 of 139 Learn …. To analyze how likely it is that sample results will be “close” to population values How probability provides.

Agresti/Franklin Statistics, 1e, 107 of 139

Example: Sampling Distribution

Sample Proportion

Probability

0 1/8

1/3 3/8

2/3 3/8

1 1/8

Page 108: Agresti/Franklin Statistics, 1 of 139 Learn …. To analyze how likely it is that sample results will be “close” to population values How probability provides.

Agresti/Franklin Statistics, 1e, 108 of 139

Example: Sampling Distribution

Page 109: Agresti/Franklin Statistics, 1 of 139 Learn …. To analyze how likely it is that sample results will be “close” to population values How probability provides.

Agresti/Franklin Statistics, 1e, 109 of 139

Mean and Standard Deviation of the Sampling Distribution of a Proportion

For a binomial random variable with n trials and probability p of success for each, the sampling distribution of the proportion of successes has:

To obtain these value, take the mean np and standard deviation for the binomial distribution of the number of successes and divide by n.

n

p)-p(1deviation standard and pMean

)1( pnp

Page 110: Agresti/Franklin Statistics, 1 of 139 Learn …. To analyze how likely it is that sample results will be “close” to population values How probability provides.

Agresti/Franklin Statistics, 1e, 110 of 139

Example: 2003 California Recall Election

Sample: Exit poll of 3160 voters.

Suppose that exactly 50% of the population of all voters voted in favor of the recall.

Page 111: Agresti/Franklin Statistics, 1 of 139 Learn …. To analyze how likely it is that sample results will be “close” to population values How probability provides.

Agresti/Franklin Statistics, 1e, 111 of 139

Example: 2003 California Recall Election

Describe the mean and standard deviation of the sampling distribution of the number in the sample who voted in favor of the recall.

• µ = np = 3160(0.50) = 1580

• 1.28)50.0)(50.0(3160p)-np(1

Page 112: Agresti/Franklin Statistics, 1 of 139 Learn …. To analyze how likely it is that sample results will be “close” to population values How probability provides.

Agresti/Franklin Statistics, 1e, 112 of 139

Example: 2003 California Recall Election

Describe the mean and standard deviation of the sampling distribution of the proportion in the sample who voted in favor of the recall.

0089.0000079.03160

)50.0)(50.0()1(Deviation Standard

0.50 p Mean

p

pp

Page 113: Agresti/Franklin Statistics, 1 of 139 Learn …. To analyze how likely it is that sample results will be “close” to population values How probability provides.

Agresti/Franklin Statistics, 1e, 113 of 139

The Standard Error

To distinguish the standard deviation of a sampling distribution from the standard deviation of an ordinary probability distribution, we refer to it as a standard error.

Page 114: Agresti/Franklin Statistics, 1 of 139 Learn …. To analyze how likely it is that sample results will be “close” to population values How probability provides.

Agresti/Franklin Statistics, 1e, 114 of 139

Example: 2003 California Recall Election

If the population proportion supporting recall was 0.50, would it have been unlikely to observe the exit-poll sample proportion of 0.54?

Based on your answer, would you be willing to predict that Davis would be recalled from office?

Page 115: Agresti/Franklin Statistics, 1 of 139 Learn …. To analyze how likely it is that sample results will be “close” to population values How probability provides.

Agresti/Franklin Statistics, 1e, 115 of 139

Example: 2003 California Recall Election

Fact: The sampling distribution of the sample proportion has a bell-shape with a mean µ = 0.50 and a standard deviation σ = 0.0089.

Page 116: Agresti/Franklin Statistics, 1 of 139 Learn …. To analyze how likely it is that sample results will be “close” to population values How probability provides.

Agresti/Franklin Statistics, 1e, 116 of 139

Example: 2003 California Recall Election

Convert the sample proportion value of 0.54 to a z-score:

4.5 0.0089

0.50) - (0.54 z

Page 117: Agresti/Franklin Statistics, 1 of 139 Learn …. To analyze how likely it is that sample results will be “close” to population values How probability provides.

Agresti/Franklin Statistics, 1e, 117 of 139

Example: 2003 California Recall Election

Page 118: Agresti/Franklin Statistics, 1 of 139 Learn …. To analyze how likely it is that sample results will be “close” to population values How probability provides.

Agresti/Franklin Statistics, 1e, 118 of 139

Example: 2003 California Recall Election

The sample proportion of 0.54 is more than four standard errors from the expected value of 0.50.

The sample proportion of 0.54 voting for recall would be very unlikely if the population support were p = 0.50.

Page 119: Agresti/Franklin Statistics, 1 of 139 Learn …. To analyze how likely it is that sample results will be “close” to population values How probability provides.

Agresti/Franklin Statistics, 1e, 119 of 139

Example: 2003 California Recall Election

A sample proportion of 0.54 would be even more unlikely if the population support were less than 0.50.

We there have strong evidence that the population support was larger than 0.50.

The exit poll gives strong evidence that Governor Davis would be recalled.

Page 120: Agresti/Franklin Statistics, 1 of 139 Learn …. To analyze how likely it is that sample results will be “close” to population values How probability provides.

Agresti/Franklin Statistics, 1e, 120 of 139

Summary of the Sampling Distribution of a Proportion

For a random sample of size n from a population with proportion p, the sampling distribution of the sample proportion has

If n is sufficiently large such that the expected numbers of outcomes of the two types, np and n(1-p), are both at least 15, then this sampling distribution has a bell-shape.

n

p)-p(1error standard and pMean

Page 121: Agresti/Franklin Statistics, 1 of 139 Learn …. To analyze how likely it is that sample results will be “close” to population values How probability provides.

Agresti/Franklin Statistics, 1e, 121 of 139

Section 6.5

How Close Are Sample Means to Population Means?

Page 122: Agresti/Franklin Statistics, 1 of 139 Learn …. To analyze how likely it is that sample results will be “close” to population values How probability provides.

Agresti/Franklin Statistics, 1e, 122 of 139

The Sampling Distribution of the Sample Mean

The sample mean, x, is a random variable.

The sample mean varies from sample to sample.

By contrast, the population mean, µ, is a single fixed number.

Page 123: Agresti/Franklin Statistics, 1 of 139 Learn …. To analyze how likely it is that sample results will be “close” to population values How probability provides.

Agresti/Franklin Statistics, 1e, 123 of 139

Mean and Standard Error of the Sampling Distribution of the Sample Mean

For a random sample of size n from a population having mean µ and standard deviation σ, the sampling distribution of the sample mean has:

• Center described by the mean µ (the same as the mean of the population).

• Spread described by the standard error, which equals the population standard deviation divided by the square root of the sample size: n

Page 124: Agresti/Franklin Statistics, 1 of 139 Learn …. To analyze how likely it is that sample results will be “close” to population values How probability provides.

Agresti/Franklin Statistics, 1e, 124 of 139

Example: How Much Do Mean Sales Vary From Week to Week?

Daily sales at a pizza restaurant vary from day to day.

The sales figures fluctuate around a mean µ = $900 with a standard deviation σ = $300.

Page 125: Agresti/Franklin Statistics, 1 of 139 Learn …. To analyze how likely it is that sample results will be “close” to population values How probability provides.

Agresti/Franklin Statistics, 1e, 125 of 139

Example: How Much Do Mean Sales Vary From Week to Week?

The mean sales for the seven days in a week are computed each week.

The weekly means are plotted over time. These weekly means form a sampling

distribution.

Page 126: Agresti/Franklin Statistics, 1 of 139 Learn …. To analyze how likely it is that sample results will be “close” to population values How probability provides.

Agresti/Franklin Statistics, 1e, 126 of 139

Example: How Much Do Mean Sales Vary From Week to Week?

What are the center and spread of the sampling distribution?

113 7

300

900$

Page 127: Agresti/Franklin Statistics, 1 of 139 Learn …. To analyze how likely it is that sample results will be “close” to population values How probability provides.

Agresti/Franklin Statistics, 1e, 127 of 139

Sampling Distribution vs. Population Distribution

Page 128: Agresti/Franklin Statistics, 1 of 139 Learn …. To analyze how likely it is that sample results will be “close” to population values How probability provides.

Agresti/Franklin Statistics, 1e, 128 of 139

Standard Error

Knowing how to find a standard error gives us a mechanism for understanding how much variability to expect in sample statistics “just by chance.”

Page 129: Agresti/Franklin Statistics, 1 of 139 Learn …. To analyze how likely it is that sample results will be “close” to population values How probability provides.

Agresti/Franklin Statistics, 1e, 129 of 139

Standard Error The standard error of the sample mean:

As the sample size n increases, the denominator increase, so the standard error decreases.

With larger samples, the sample mean is more likely to fall close to the population mean.

n

Page 130: Agresti/Franklin Statistics, 1 of 139 Learn …. To analyze how likely it is that sample results will be “close” to population values How probability provides.

Agresti/Franklin Statistics, 1e, 130 of 139

Central Limit Theorem

Question: How does the sampling distribution of the sample mean relate with respect to shape, center, and spread to the probability distribution from which the samples were taken?

Page 131: Agresti/Franklin Statistics, 1 of 139 Learn …. To analyze how likely it is that sample results will be “close” to population values How probability provides.

Agresti/Franklin Statistics, 1e, 131 of 139

Central Limit Theorem

For random sampling with a large sample size n, the sampling distribution of the sample mean is approximately a normal distribution.

This result applies no matter what the shape of the probability distribution from which the samples are taken.

Page 132: Agresti/Franklin Statistics, 1 of 139 Learn …. To analyze how likely it is that sample results will be “close” to population values How probability provides.

Agresti/Franklin Statistics, 1e, 132 of 139

Central Limit Theorem: How Large a Sample?

The sampling distribution of the sample mean takes more of a bell shape as the random sample size n increases. The more skewed the population distribution, the larger n must be before the shape of the sampling distribution is close to normal. In practice, the sampling distribution is usually close to normal when the sample size n is at least about 30.

Page 133: Agresti/Franklin Statistics, 1 of 139 Learn …. To analyze how likely it is that sample results will be “close” to population values How probability provides.

Agresti/Franklin Statistics, 1e, 133 of 139

A Normal Population Distribution and the Sampling Distribution

If the population distribution is approximately normal, then the sampling distribution is approximately normal for all sample sizes.

Page 134: Agresti/Franklin Statistics, 1 of 139 Learn …. To analyze how likely it is that sample results will be “close” to population values How probability provides.

Agresti/Franklin Statistics, 1e, 134 of 139

How Does the Central Limit Theorem Help Us Make Inferences

For large n, the sampling distribution is approximately normal even if the population distribution is not.

This enables us to make inferences about population means regardless of the shape of the population distribution.

Page 135: Agresti/Franklin Statistics, 1 of 139 Learn …. To analyze how likely it is that sample results will be “close” to population values How probability provides.

Agresti/Franklin Statistics, 1e, 135 of 139

Section 6.6

How Can We Make Inferences About a Population?

Page 136: Agresti/Franklin Statistics, 1 of 139 Learn …. To analyze how likely it is that sample results will be “close” to population values How probability provides.

Agresti/Franklin Statistics, 1e, 136 of 139

Three Distinct Types of Distributions

Population Distribution

Data Distribution

Sampling Distribution

Page 137: Agresti/Franklin Statistics, 1 of 139 Learn …. To analyze how likely it is that sample results will be “close” to population values How probability provides.

Agresti/Franklin Statistics, 1e, 137 of 139

Population Distribution

This is the probability distribution from which we take the sample.

Value of its parameters, such as the population proportion p and the population mean µ are usually unknown.

Page 138: Agresti/Franklin Statistics, 1 of 139 Learn …. To analyze how likely it is that sample results will be “close” to population values How probability provides.

Agresti/Franklin Statistics, 1e, 138 of 139

Data Distribution This is the distribution of the sample data.

It’s described by sample statistics, such as a sample proportion or a sample mean.

With random sampling, the large the sample size n, the more closely it resembles the population distribution.

With larger n, the higher the probability that a sample statistic falls close to the population parameter.

Page 139: Agresti/Franklin Statistics, 1 of 139 Learn …. To analyze how likely it is that sample results will be “close” to population values How probability provides.

Agresti/Franklin Statistics, 1e, 139 of 139

Sampling Distribution

This is the probability distribution of a sample statistic, such as a sample proportion or sample mean.

It provides the key for telling us how close a sample statistic falls to the unknown parameter.

For large n, the sampling distribution is approximately a normal distribution.