Copyright ©2011 Brooks/Cole, Cengage Learning

Sampling: Surveys and How to Ask Questions

Chapter 5

Principal Idea:

The data collection method used affects the extent to which sample data can be used to make inferences about a larger population.

5.1 Collecting and Using Sample Data Wisely

• Descriptive Statistics: using numerical and graphical summaries to characterize a data set or describe a relationship.

• Inferential Statistics: using sample information to make conclusions about a broader range of individuals than just those observed.

The Fundamental Rule for Using Data for Inference

Available data can be used to make inferences about a much larger group if the data can be considered to be representative with regard to the question(s) of interest.

Example 5.1 Do First Ladies Represent Other Women?

Past First Ladies are not likely to be representative of other American women, nor even future First Ladies, on the question of age at death, since medical, social, and political conditions keep changing in ways that may affect their health.

Example 5.2 Do Penn State Students Represent Other College Students?

• If question of interest = average handspan of females in college age range? => Yes

• If question of interest = how fast ever driven a car? => No, since Penn State in rural area with open spaces, county roads, little traffic.

Populations, Samples, and Simple Random Samples

Population: the entire group of units about which inferences are to be made.

Sample: the smaller group of units actually measured or surveyed.

Census: every unit in the population is measured or surveyed.

Simple Random Sample: every conceivable group of units of the required size from the population has the same chance to be the selected sample.

Helps ensure sample data will be representative of population, but can be difficult to obtain.

Sample Survey: a subgroup of a large population questioned on set of topics. Special type of observational study.

Less costly and less time-consuming than a census.

Advantages of a Sample Survey over a Census

• Sometimes a Census Isn’t Possible: when measurements destroy units

• Speed: especially if population is large

• Accuracy: devote resources to getting accurate sample results

Bias: How Surveys Can Go Wrong

Results based on a survey are biased if method used to obtain those results would consistently produce values that are either too high or too low.

Selection bias occurs if method for selecting participants produces sample that does not represent the population of interest.

Nonparticipation bias (nonresponse bias) occurs when a representative sample is chosen but a subset cannot be contacted or doesn’t respond.

Biased response or response bias occurs when participants respond differently from how they truly feel.

5.2 Margin of Error, Confidence Intervals, and Sample Size

With proper methods, a sample of 1600 people from an entire population of millions can fairly certainly gauge the percentage of the entire population who have a certain trait or opinion to within 2.5%.

Margin of Error: The Accuracy of Sample Surveys

The sample proportion and the population proportion with a certain trait or opinion differ by less than the margin of error in at least 95% of all random samples.

Conservative margin of error = $\frac{1}{\sqrt{n}} \times 100\%$

Add and subtract the margin of error to create an approximate 95% confidence interval.
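
As a quick check of the formula above, here is a minimal Python sketch. The function name `conservative_margin_of_error` is an illustrative choice, not something from the slides; it returns the margin as a proportion (multiply by 100 for a percentage).

```python
import math


def conservative_margin_of_error(n: int) -> float:
    """Conservative 95% margin of error, 1/sqrt(n), as a proportion."""
    if n <= 0:
        raise ValueError("sample size must be positive")
    return 1.0 / math.sqrt(n)


# Print the margin for a few sample sizes, including the n = 1600 case
# mentioned at the start of Section 5.2.
for n in (100, 400, 1600):
    me = conservative_margin_of_error(n)
    print(f"n = {n:5d}: margin of error = {me:.1%}")
```

Quadrupling the sample size only halves the margin of error, because n sits under a square root.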

Confidence Intervals

95% Confidence Interval for a Population Proportion:

For about 95% of properly conducted sample surveys, the interval

sample proportion $- \frac{1}{\sqrt{n}}$ to sample proportion $+ \frac{1}{\sqrt{n}}$

will contain the actual population proportion.

Another way to write it: sample proportion $\pm \frac{1}{\sqrt{n}}$

Example 5.3 The Importance of Religion for Adult Americans

Poll of n = 1025 adult Americans: “How important would you say religion is in your own life?”

• Very important: 56%
• Fairly important: 25%
• Not very important: 19%

Conservative margin of error is 3%: $\frac{1}{\sqrt{1025}} \approx 0.03$

Approx. 95% confidence interval for the percent of all adult Americans who say religion is very important:

56% ± 3%, or 53% to 59%
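
The interval in this example can be reproduced with a short Python sketch. The helper name `approx_95_ci` is hypothetical, and clipping the interval to the range 0 to 1 is an added assumption rather than something stated on the slide.

```python
import math


def approx_95_ci(sample_proportion: float, n: int) -> tuple:
    """Approximate 95% CI: sample proportion plus or minus 1/sqrt(n)."""
    margin = 1.0 / math.sqrt(n)
    # Assumption: clip to [0, 1], since a proportion cannot lie outside it.
    low = max(0.0, sample_proportion - margin)
    high = min(1.0, sample_proportion + margin)
    return low, high


# Example 5.3: 56% of n = 1025 respondents said "very important".
low, high = approx_95_ci(0.56, 1025)
print(f"{low:.1%} to {high:.1%}")
```

Without rounding, the margin is about 3.1%, so the computed interval is slightly wider than the 53% to 59% obtained with the rounded 3%.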

Interpreting Confidence Interval

The interval 53% to 59% may or may not capture the percent of adult Americans who considered religion to be very important in their lives.

But, in the long run this procedure will produce intervals that capture the unknown population values about 95% of the time => called the 95% confidence level.

Choosing a Sample Size for a Survey

If m.e. is the desired margin of error for a 95% confidence interval for a population proportion, the required sample size is:

$n = \left(\frac{1}{\text{m.e.}}\right)^2$
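
A minimal Python sketch of this calculation follows. It assumes m.e. is given as a proportion (0.03 for 3%), and it rounds up to the next whole person, which is a common convention but not something the slide states. The function name `required_sample_size` is illustrative.

```python
import math


def required_sample_size(margin_of_error: float) -> int:
    """Sample size n = (1 / m.e.)^2 for a 95% CI, with m.e. as a proportion."""
    if not 0 < margin_of_error < 1:
        raise ValueError("margin of error must be a proportion between 0 and 1")
    raw = (1.0 / margin_of_error) ** 2
    # Assumption: round up so the achieved margin is no larger than requested.
    # The tiny offset guards against floating-point noise just above an integer.
    return math.ceil(raw - 1e-9)


for me in (0.05, 0.03, 0.025):
    print(f"m.e. = {me:.1%}: n = {required_sample_size(me)}")
```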

The Effect of Population Size

The m.e. for a sample of 1000 is about 3% whether the population size is 30,000 or 200 million.

In practice, as long as the population size is ≥ 10 times as large as the sample size, the population size has almost no influence on the accuracy of sample estimates.

5.3 Choosing a Simple Random Sample

Probability Sampling Plan: everyone in population has specified chance of making it into the sample.

Simple Random Sample: every conceivable group of units of the required size has the same chance of being the selected sample.

Choosing a Simple Random Sample

You Need:

1. List of the units in the population.
2. Source of random numbers.

• Table of Random Digits
• Random Number Generator
• Computer Software

Simple Random Sample of Students

School has 5000 students. Want a simple random sample of 10 students.

1. Number the units: Students numbered 1 to 5000.
2. Ask a computer program (e.g. Minitab) to randomly select 10 of them.
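
The slide uses Minitab; as an alternative illustration, the same two steps can be sketched with Python's standard `random` module. The function name and the seed value are arbitrary choices made here so the run is repeatable.

```python
import random


def simple_random_sample(population_size: int, sample_size: int, seed=None) -> list:
    """Draw a simple random sample of ID numbers 1..population_size."""
    rng = random.Random(seed)
    # random.sample draws without replacement, so every group of the
    # requested size is equally likely to be the selected sample.
    ids = rng.sample(range(1, population_size + 1), sample_size)
    return sorted(ids)


# Step 1: students are numbered 1 to 5000. Step 2: select 10 at random.
ids = simple_random_sample(5000, 10, seed=2011)
print(ids)
```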

Example 5.6 Representing the Heights of British Women

Simple random sample of 10 from 199 British women.

1. Assign an ID number from 001 to 199 to each woman.
2. Use random digits to randomly select ten numbers between 001 and 199, then sample the heights of the women with those IDs.

Sample 1: Using Statistical Package Minitab
IDs: 176, 10, 1, 40, 85, 162, 46, 69, 77, 154
Heights: 60.6, 63.4, 62.6, 65.7, 69.3, 68.7, 61.8, 64.6, 60.8, 59.9; mean = 63.7 inches

Sample 2: Using Table of Random Digits
IDs: 41, 93, 167, 33, 157, 131, 110, 180, 185, 196
Heights: 59.4, 66.5, 63.8, 62.6, 65.0, 60.2, 67.3, 59.8, 67.7, 61.8; mean = 63.4 inches
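
The two sample means can be verified directly from the listed heights. This short Python check uses only the values given in the example; the variable names are illustrative.

```python
from statistics import mean

# Heights (inches) exactly as listed for the two samples in Example 5.6.
sample_1 = [60.6, 63.4, 62.6, 65.7, 69.3, 68.7, 61.8, 64.6, 60.8, 59.9]
sample_2 = [59.4, 66.5, 63.8, 62.6, 65.0, 60.2, 67.3, 59.8, 67.7, 61.8]

mean_1 = round(mean(sample_1), 1)
mean_2 = round(mean(sample_2), 1)
print(mean_1, mean_2)
```

The two means differ slightly because each simple random sample picks up a different set of ten women.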

5.4 Other Sampling Methods

It is not always practical to take a simple random sample, because it can be difficult to get a numbered list of all units.

Example: College administration would like to survey a sample of students living in dormitories.

Shaded squares show a simple random sample of 30 rooms.

Stratified Random Sampling

Divide population of units into groups (called strata) and take a simple random sample from each of the strata.

College survey: Two strata = undergrad and graduate dorms. Take a simple random sample of 15 rooms from each of the strata for a total of 30 rooms.

Ideal: stratify so little variability in responses within each of the strata.
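
A minimal Python sketch of the college survey's stratified plan follows. The room labels and the count of 180 rooms per stratum are hypothetical, since the slides do not give the dorm sizes; only the two strata and the 15 rooms per stratum come from the example.

```python
import random


def stratified_sample(strata: dict, per_stratum: int, seed=None) -> dict:
    """Take a simple random sample of per_stratum units from each stratum."""
    rng = random.Random(seed)
    return {name: rng.sample(units, per_stratum) for name, units in strata.items()}


# Hypothetical room lists: 180 undergrad rooms and 180 graduate rooms.
strata = {
    "undergrad": [f"U{i:03d}" for i in range(1, 181)],
    "graduate": [f"G{i:03d}" for i in range(1, 181)],
}

chosen = stratified_sample(strata, per_stratum=15, seed=1)
total = sum(len(rooms) for rooms in chosen.values())
print(total, "rooms sampled")
```

Unlike a single simple random sample of 30 rooms, this plan guarantees that both kinds of dorm are represented by exactly 15 rooms.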

Cluster Sampling

Divide population of units into groups (called clusters), take a random sample of clusters and measure only those items in these clusters.

College survey: Each floor of each dorm is a cluster.

Take a random sample of 5 floors and all rooms on those floors are surveyed.

Advantage: need only a list of the clusters instead of a list of all individuals.
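
The cluster plan can be sketched in Python as follows. The layout of 4 dorms with 3 floors of 10 rooms each is a hypothetical stand-in, because the slides do not state the building sizes; only the choice of 5 floors comes from the example.

```python
import random


def cluster_sample(clusters: dict, n_clusters: int, seed=None) -> list:
    """Randomly pick n_clusters clusters and return every unit inside them."""
    rng = random.Random(seed)
    # Only the list of cluster names is needed to make the random selection.
    picked = rng.sample(sorted(clusters), n_clusters)
    return [unit for name in picked for unit in clusters[name]]


# Hypothetical layout: each (dorm, floor) pair is one cluster of 10 rooms.
clusters = {
    (dorm, floor): [f"{dorm}-{floor}-{room:02d}" for room in range(1, 11)]
    for dorm in ("A", "B", "C", "D")
    for floor in (1, 2, 3)
}

rooms = cluster_sample(clusters, n_clusters=5, seed=5)
print(len(rooms), "rooms surveyed on 5 floors")
```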

Systematic Sampling

Order the population of units in some way, select one of first k units at random and then every kth unit thereafter.

College survey: Order list of rooms starting at top floor of 1st undergrad dorm. Pick one of the first 11 rooms at random (say, room 3), then pick every 11th room after that.

Note: often a good alternative to random sampling but can lead to a biased sample.
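
Here is a minimal Python sketch of systematic sampling with k = 11, as in the college survey. The total of 330 rooms is a hypothetical figure chosen so that the plan yields 30 rooms; the slides do not state the number of rooms.

```python
import random


def systematic_sample(units: list, k: int, seed=None) -> list:
    """Pick one of the first k units at random, then every kth unit after it."""
    rng = random.Random(seed)
    start = rng.randrange(k)  # index 0..k-1, i.e. one of the first k units
    return units[start::k]


# Hypothetical ordered list of 330 rooms, numbered 1 to 330.
rooms = list(range(1, 331))
picked = systematic_sample(rooms, k=11, seed=3)
print(picked[:5], "...", len(picked), "rooms")
```

If the ordering has a pattern that repeats every k units (for example, every 11th room is a corner room), the sample inherits that pattern, which is one way this method can become biased.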

Random-Digit Dialing

Method approximates a simple random sample of all households in the United States that have telephones.

1. List all possible exchanges (= area code + next 3 digits).

2. Take a sample of exchanges (chance of being sampled based on white pages proportion of households with a specific exchange).

3. Take a random sample of banks (= next 2 digits) within each sampled exchange.

4. Randomly generate the last two digits from 00 to 99.

Once a phone number is determined, make multiple attempts to reach someone at that household.
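
The four steps can be illustrated with a simplified Python sketch that generates a single phone number. The exchanges and their household proportions below are made up for illustration, and the bank is drawn uniformly from 00 to 99, which is a simplifying assumption; the slide only says that banks are sampled at random within each sampled exchange.

```python
import random


def random_digit_dial(exchanges: dict, seed=None) -> str:
    """Generate one phone number: exchange + 2-digit bank + 2 random digits."""
    rng = random.Random(seed)
    names = list(exchanges)
    weights = [exchanges[name] for name in names]
    # Step 2: sample an exchange with chance proportional to its household share.
    exchange = rng.choices(names, weights=weights, k=1)[0]
    # Step 3 (simplified assumption): pick a bank uniformly from 00 to 99.
    bank = rng.randrange(100)
    # Step 4: randomly generate the last two digits from 00 to 99.
    last_two = rng.randrange(100)
    return f"{exchange}-{bank:02d}{last_two:02d}"


# Step 1: hypothetical exchanges (area code + next 3 digits) with made-up
# proportions of households in each.
exchanges = {"555-201": 0.5, "555-202": 0.3, "555-203": 0.2}

number = random_digit_dial(exchanges, seed=27)
print(number)
```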

5.5 Difficulties and Disasters in Sampling

• Using wrong sampling frame

• Not reaching individuals selected

• Nonresponse or nonparticipation

• Self-selected sample

• Convenience/Haphazard sample

Some problems occur even when a sampling plan has been well designed.

Case Study 5.1 The Infamous Literary Digest Poll of 1936

Election of 1936: Democratic incumbent Franklin D. Roosevelt and Republican Alf Landon

Literary Digest Poll:

• Sent questionnaires to 10 million people drawn from magazine subscriber lists, phone directories, and lists of car owners; these people were more likely to be wealthy and unhappy with Roosevelt.

• Only 2.3 million responses for 23% response rate. Those with strong feelings, the Landon supporters wanting a change, were more likely to respond.

• (Incorrectly) Predicted a 3-to-2 victory for Landon.

Gallup Poll:

• George Gallup had just founded the American Institute of Public Opinion in 1935.

• Surveyed a random sample of 50,000 people from list of registered voters. Also took a random sample of 3000 people from the Digest lists.

• (Correctly) Predicted Roosevelt the winner. Also predicted the (wrong) results of the Literary Digest poll within 1%.

5.6 How to Ask Survey Questions

Possible Sources of Response Bias in Surveys

• Deliberate bias: The wording of a question can deliberately bias the responses toward a desired answer.

• Unintentional bias: Questions can be worded such that the meaning is misinterpreted by a large percentage of the respondents.

• Desire to Please: Respondents have a desire to please the person who is asking the question. They tend to understate their responses about an undesirable social habit or opinion.

Possible Sources of Response Bias in Surveys (cont)

• Asking the Uninformed: People do not like to admit that they don’t know what you are talking about when you ask them a question.

• Unnecessary Complexity: If questions are to be understood, they must be kept simple. Some questions ask more than one question at once.

• Ordering of Questions: If one question requires respondents to think about something that they may not have otherwise considered, then the order in which questions are presented can change the results.

Possible Sources of Response Bias in Surveys (cont)

• Confidentiality and Anonymity: People will often answer questions differently based on the degree to which they believe they are anonymous.

It is easier to ensure confidentiality (a promise not to release identifying information) than anonymity (the researcher does not know the identity of the respondents).

• Be Sure You Understand What Was Measured: Words can have different meanings. Important to get a precise definition of what was actually asked or measured. E.g. Who is really unemployed?

• Some Concepts Are Hard to Precisely Define: E.g. How to measure intelligence?

• Measuring Attitudes and Emotions: E.g. How to measure self-esteem and happiness?

• Open or Closed Questions: Should Choices Be Given?

Open question = respondents allowed to answer in own words.

Closed question = given list of alternatives, usually offer choice of “other” and can fill in blank.

If closed questions are preferred, they should first be presented as open questions (in a pilot survey) for establishing list of choices.

• Problems with Open Questions: Results can be difficult to summarize.