Chapter Twelve Quality Control and Initial Analysis of Data.
Chapter Twelve
Quality Control and Initial
Analysis of Data
Copyright © Houghton Mifflin Company. All rights reserved. 12 | 2
Chapter Objectives
• Define editing and distinguish between a field edit and an office edit
• Define coding and outline the steps it involves
• Compute measures of central tendency and dispersion of the data for each variable in a data set
• State the potential uses of frequency distribution or one-way tables
Data Analysis at Rockbridge Associates: Data Integrity
• Data integrity is the foundation for successful marketing research
• Rockbridge ensures integrity in the collection and processing of the data by a number of quality control checks for
– mail surveys
– telephone surveys
– web surveys
• Rockbridge ensures data integrity in how the results are interpreted and explained to management
Editing
• Editing is the process of examining completed data collection forms and taking whatever corrective action is needed to ensure the data are of high quality
– Preliminary or field edit
– Final or office edit
Field Edit
• A field edit, or preliminary edit, is a quick examination of completed data collection forms, usually on the same day they are filled out
• Objectives
– Ensure that proper procedures are being followed in selecting respondents, interviewing them, and recording their responses
– Fix fieldwork deficiencies before they turn into major problems
Office Edit
• A final, or office, edit verifies response consistency and accuracy
– Makes necessary corrections
– Determines whether some or all parts of a data collection form should be discarded
What Is Wrong With this Response…
• A respondent said he was 18 years old but indicated that he had a Ph.D. when asked for his highest level of education.
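A plausibility check of this kind can be automated during the office edit. The following is a minimal sketch only: the field names (`age`, `education`) and the minimum plausible ages are assumptions chosen for illustration, not values given in the chapter.

```python
# Hypothetical minimum plausible age for each education level (assumed values).
MIN_PLAUSIBLE_AGE = {
    "high school": 16,
    "bachelor's": 20,
    "master's": 22,
    "ph.d.": 24,
}


def flag_age_education(record):
    """Return True if the reported age looks too low for the reported education."""
    level = record["education"].strip().lower()
    min_age = MIN_PLAUSIBLE_AGE.get(level)
    if min_age is None:
        return False  # unknown level: nothing to compare against
    return record["age"] < min_age


respondent = {"age": 18, "education": "Ph.D."}
print(flag_age_education(respondent))  # True: flag for review by the editor
```

A flagged record is not necessarily wrong; the flag only tells the editor which forms deserve a second look.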
Editing Can Help Uncover
• Improper field procedures
• Incomplete interviews
• Improperly conducted interviews
• Technical problems with the questionnaire or interview
• Respondent rapport problems
• Consistency problems that can be isolated and reconciled
Improper Field Procedures
• Wrong questionnaire form used
• Interview inadvertently not taken
Incomplete Interviews
• Questions not asked
• Directions not followed (proper segments of the questionnaire were not administered)
Improperly Conducted Interviews
• The wrong respondent interviewed (e.g., son instead of father)
• Questions misinterpreted by interviewer or respondent
• Evidence of bias or influencing of answers
• Failure to probe for adequate answers or the use of poor probes
• Interviewer's illegible writing and/or style
• Interviewer recorded information which identified a respondent whose anonymity should have been protected
Improperly Conducted Interviews (Cont’d)
• Interviewer apparently does not understand what type of responses constitute an answer to the actual question asked
• Interviewer does not understand what the objective of the question is and thus accepts an improper frame of reference for the respondent's answer
• Other evidence of need for training or instructions to be given to interviewer
– Failure to write down probes, wrong abbreviations, failure to follow directions
Technical Problems With the Questionnaire or Interview
• Space was not provided for needed information
• The presence of unanticipated or unusually frequent extreme responses to questions, indicating a possible need for rewording of certain questions
• Inappropriate or unworkable interviewer instructions not detected in the pretest
• The order in which questions were asked introduces confusion, resentment, or bias into the respondent's answers
Respondent Rapport Problems
• Frequent refusal to answer certain questions
• Reports of abnormal termination of the interview (or presence of hostility) due to sensitive questions
• Evidence that respondent and interviewer are playing the "game" of "What answer do you want me to give?"
• Evidence that the presence of other people in the interview situation is causing problems
Consistency Problems That Can Be Isolated and Reconciled
• Contradictory answers
– Reports no savings in one section of the interview but reports interest from bank accounts in another section
• Misclassification
– Mortgage debt improperly reported as installment debt
• Impossible answers
– Reports paying $600 for a new Edsel in 1970 (the car should have been recorded as a "used" car)
– Weekly income reported on the income-per-month line
Consistency Problems That Can Be Isolated and Reconciled (Cont’d)
• Unreasonable (and probably erroneous) responses
– Respondent reports borrowing $2,000 for two years to buy a car but reported monthly payments multiplied by 24 months are less than $2,000
– Respondent reports that the house value is $90,000 while income is $2,000 per year and the respondent claims less than a high school education
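Consistency problems like these lend themselves to simple rule-based checks. The sketch below illustrates two of the rules described on these slides (savings versus bank interest, and loan amount versus total payments). All field names and the sample record are hypothetical, invented for the example.

```python
def consistency_flags(record):
    """Return a list of human-readable flags for one questionnaire record."""
    flags = []
    # Contradictory answers: no savings reported, yet interest from bank accounts.
    if not record["has_savings"] and record["bank_interest"] > 0:
        flags.append("reports no savings but receives bank interest")
    # Unreasonable response: total repayments cannot be less than the amount borrowed.
    total_repaid = record["monthly_payment"] * record["loan_months"]
    if total_repaid < record["loan_amount"]:
        flags.append("total loan payments are less than the amount borrowed")
    return flags


# Hypothetical record: 75 * 24 = 1,800, which is less than the 2,000 borrowed.
record = {
    "has_savings": False,
    "bank_interest": 120.0,
    "loan_amount": 2000.0,
    "loan_months": 24,
    "monthly_payment": 75.0,
}
for flag in consistency_flags(record):
    print(flag)
```

Each flag marks a record for reconciliation by the editor; the rules do not decide which of the two conflicting answers is the correct one.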
Preventing Errors
• Careful planning before fieldwork begins
• Automating data entry
Coding
• Coding broadly refers to the set of all tasks associated with transforming edited responses into a form that is ready for analysis
• Steps
– Transforming responses to each question into a set of meaningful categories
– Assigning numerical codes to the categories
– Creating a data set suitable for computer analysis
Transforming Responses into Meaningful Categories
• A structured question is pre-categorized
• Responses to a nonstructured or open-ended question need to be grouped into a meaningful and manageable set of categories
The Best Way to Treat "Don't Know" Responses
• Infer an actual response – dubious validity
• Classify the "don't know's" as a separate response category for each question
Missing-Value Category
• A missing value can stem from
– A respondent's refusal to answer a question
– An interviewer's failure to ask a question or record an answer
– A "don't know" that does not seem legitimate
• Best way to treat missing value responses
– Sound questionnaire design
– Tight control over fieldwork
Assigning Numerical Codes
• Assign appropriate numerical codes to responses that are not already in quantified form
• The numerical codes assigned should facilitate computer manipulation and analysis of the responses
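A codebook lookup is one simple way to assign numerical codes. In the sketch below, codes 1 to 4 follow the scheme this chapter uses for the length-of-service question; the code 8 for "don't know" and 9 for a missing value are assumptions made for the example, as is the function name `encode`.

```python
# Codes 1-4 follow the chapter's length-of-service scheme; 8 and 9 are assumed.
CODEBOOK = {
    "less than a year": 1,
    "1 to less than 2 years": 2,
    "2 to less than 5 years": 3,
    "5 years or more": 4,
    "don't know": 8,  # kept as its own response category
}
MISSING = 9  # refusal, skipped question, or unrecorded answer


def encode(response):
    """Map a raw answer to its numerical code; blanks and unknown text become MISSING."""
    if response is None:
        return MISSING
    return CODEBOOK.get(response.strip().lower(), MISSING)


raw = ["5 years or more", "Don't know", None, "Less than a year"]
print([encode(r) for r in raw])  # [4, 8, 9, 1]
```

Keeping "don't know" and missing values on separate codes lets the analyst include or exclude each group deliberately rather than by accident.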
Coding Multiple Response
• Which of the following countries have you visited during the past 12 months?
________ Canada
________ England
________ France
________ Germany
________ Japan
________ Mexico
• Need six variables, each relating to a specific country and having two possible values. For example, 1 = “No” and 2 = “Yes”
• Six columns must be set aside in the data spreadsheet to record responses to this question
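The six-variable layout can be sketched as follows, using the 1 = "No", 2 = "Yes" convention from the slide. The function name and the sample respondent are hypothetical.

```python
COUNTRIES = ["Canada", "England", "France", "Germany", "Japan", "Mexico"]
NO, YES = 1, 2  # coding convention from the slide


def code_multiple_response(checked):
    """Return one 1/2-coded value per country, in the fixed column order above."""
    checked = set(checked)
    return [YES if country in checked else NO for country in COUNTRIES]


# Hypothetical respondent who ticked Canada and Mexico.
row = code_multiple_response(["Canada", "Mexico"])
print(dict(zip(COUNTRIES, row)))
```

Every respondent yields exactly six values, so the question always occupies the same six columns regardless of how many countries were ticked.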
Multiple Response Question – Rank Order Question
• Please rank the following fast-food restaurants by placing a 1 beside the restaurant you think is best overall, a 2 beside the restaurant you think is second best, and so on.
__________ Burger King
__________ McDonald's
__________ Wendy's
__________ Whataburger
• This question requires as many variables (and columns) as there are objects to be ranked
• 4 separate variables are needed
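A rank-order question can be coded with one variable per restaurant, each holding the rank that restaurant received. The sketch below also rejects answers with ties or gaps; that validation step and the function name are assumptions added for illustration, not something the slide prescribes.

```python
RESTAURANTS = ["Burger King", "McDonald's", "Wendy's", "Whataburger"]


def code_rankings(ranks):
    """Turn {restaurant: rank} into one value per restaurant column.

    Raises ValueError unless the ranks are exactly 1..k with no ties or gaps.
    """
    values = [ranks.get(name) for name in RESTAURANTS]
    expected = list(range(1, len(RESTAURANTS) + 1))
    if sorted(v for v in values if v is not None) != expected:
        raise ValueError("ranks must be a permutation of 1..%d" % len(RESTAURANTS))
    return values


# Hypothetical respondent's ranking.
answer = {"Burger King": 2, "McDonald's": 1, "Wendy's": 3, "Whataburger": 4}
print(code_rankings(answer))  # [2, 1, 3, 4]
```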
Creating a Data Set
• Organized collection of data records
• Each sample unit within the data set is called a case or observation
• Structure of a Data Set
– The number of observations = n
– The total number of variables embedded in the questionnaire is m, then
• Data set = n x m matrix of numbers
Table 12.3 Structure of a Data Sheet
Preliminary Data Analysis: Basic Descriptive Statistics
• Preliminary data analysis examines the central tendency and the dispersion of the data on each variable in the data set
Table 12.4 Measures of Central Tendency and Dispersion for Different Types of Variables
Measurement Level of Data Pertaining to Variable – Nominal
• Measures of Central Tendency
– Mode: Most frequently occurring response
• Measures of Dispersion
– Strictly speaking, the concept of dispersion is not meaningful for nominal data
– An idea about the distribution of responses can be obtained by examining their relative frequencies of occurrence
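For nominal data, the mode and the relative frequencies can be obtained from a simple count. The responses below are hypothetical, invented only to illustrate the computation.

```python
from collections import Counter

# Hypothetical nominal responses (invented for illustration).
responses = ["Wendy's", "McDonald's", "Wendy's", "Burger King", "Wendy's", "McDonald's"]

counts = Counter(responses)
mode_value, mode_count = counts.most_common(1)[0]
relative_freq = {category: count / len(responses) for category, count in counts.items()}

print(mode_value)  # Wendy's
for category, share in relative_freq.items():
    print(f"{category}: {share:.0%}")
```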
Measurement Level of Data Pertaining to Variable – Ordinal
• Measures of Central Tendency
– Median: 50th percentile response
• Measures of Dispersion
– Range: Defined by the highest and lowest response values
– Interquartile range: Difference between the 75th and 25th percentile responses
Measurement Level of Data Pertaining to Variable – Interval
• Measures of Central Tendency
– Mean: Arithmetic average of response values
• Measures of Dispersion
– Standard deviation: As defined in Chapter 9
Measurement Level of Data Pertaining to Variable – Ratio
• Measures of Central Tendency
– Mean: Arithmetic average of response values
• Measures of Dispersion
– Standard deviation: As defined in Chapter 9
Mode
• The value that occurs most frequently
Table 12.5 How Long Have You Been Using the Services of National? – Computing Mode
Median
• The observation below which 50 percent of the observations fall
Table 12.6 Length of Time Service Used – Responses from 20 Customers
How long have you been using the services of National?
4 3 4 1 4 4 4 4 4 4 3 4 4 3 4 4 4 3 1 1
1 = Less than a year; 2 = 1 to less than 2 years; 3 = 2 to less than 5 years; 4 = 5 years or more
Arranging the 20 values in ascending order:
1 1 1 3 3 3 3 4 4 4 4 4 4 4 4 4 4 4 4 4
Because the sample size = 20, there are two middle values: 4 and 4. The median is, therefore, the average of the two middle values = 4.
Table 12.7 Computing Median for Length of Time Service Used
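The median computation for the 20 responses in Table 12.6 can be reproduced with a short sketch. The data are the coded responses from that table; the variable names are illustrative.

```python
import statistics

# The 20 coded responses from Table 12.6.
responses = [4, 3, 4, 1, 4, 4, 4, 4, 4, 4, 3,
             4, 4, 3, 4, 4, 4, 3, 1, 1]

ordered = sorted(responses)
n = len(ordered)
# With an even sample size, the median is the average of the two middle values.
middle_pair = (ordered[n // 2 - 1], ordered[n // 2])  # 10th and 11th values
median = statistics.median(responses)

print(n, middle_pair, median)  # 20 (4, 4) 4.0
```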
Mean
$n$ = number of units in the sample
$x_i$ = data obtained from sample unit $i$
$\bar{x}$ = sample mean value, given by
$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$$
Table 12.8 Overall Quality of Services Provided by National – Computing Mean
Measures of Dispersion
• Range
• Variance
• Standard Deviation
Range
• Range is the difference between the largest and smallest value
• The simplest measure of dispersion
Variance
• Variance of a set of data is a measure of deviation of the data around the arithmetic mean
$$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$$
Standard Deviation
• Standard deviation is the square root of the variance
$$s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}$$
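The range, variance, and standard deviation can be computed directly from their definitions. The ratings below are hypothetical (they are not the values from Table 12.9), and the sketch uses the sample form of the variance with n - 1 in the denominator.

```python
import statistics

# Hypothetical ratings on a 1-10 scale (invented for illustration).
ratings = [7, 9, 6, 8, 10, 5, 8, 7]

n = len(ratings)
mean = sum(ratings) / n
data_range = max(ratings) - min(ratings)
# Sample variance: squared deviations from the mean, divided by n - 1.
variance = sum((x - mean) ** 2 for x in ratings) / (n - 1)
std_dev = variance ** 0.5

print(mean, data_range)  # 7.5 5
print(round(variance, 3), round(std_dev, 3))
```

The standard library functions `statistics.variance` and `statistics.stdev` use the same n - 1 definition, so they can serve as a cross-check.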
Table 12.9 Overall Quality of Services Provided by National: Computing Range, Variance, and Standard Deviation
Frequency Distribution: One-Way Tabulation
• One-way tabulation is a table showing the distribution of data pertaining to categories of a single variable
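As a sketch of a one-way tabulation, the 20 length-of-service responses from Table 12.6 can be tallied by category. The labels are the code definitions given with that table; the output layout is an arbitrary choice made for the example.

```python
from collections import Counter

# Code definitions and responses from Table 12.6.
LABELS = {1: "Less than a year", 2: "1 to less than 2 years",
          3: "2 to less than 5 years", 4: "5 years or more"}
responses = [4, 3, 4, 1, 4, 4, 4, 4, 4, 4, 3,
             4, 4, 3, 4, 4, 4, 3, 1, 1]

counts = Counter(responses)
total = len(responses)
# One row per category: (label, frequency, percentage); empty categories are kept.
table = [(LABELS[code], counts.get(code, 0), 100 * counts.get(code, 0) / total)
         for code in sorted(LABELS)]

for label, count, percent in table:
    print(f"{label:<24}{count:>4}{percent:>8.1f}%")
```

Listing the category with zero responses makes it visible that nobody chose "1 to less than 2 years", which a bare count of observed values would hide.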
Table 12.10 Age and Length of Time Service Used
Table 12.10 Age and Length of Time Service Used (Cont’d)
Why Averages May be Misleading
• Researchers tested a new sauce product and found
– Mean rating of the taste test was close to the middle of the scale, which had "very mild" and "very hot" as its bipolar adjectives
• Researcher’s conclusion
– Consumers really need neither really hot nor really mild sauce
Why Averages May be Misleading (Cont’d)
• Deeper examination revealed
– The existence of a large proportion of consumers who wanted the sauce to be mild and an equally large proportion who wanted it to be hot
• Moral of the story
– A clear understanding of the distribution of responses can help a researcher avoid erroneous inferences
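The sauce example can be illustrated with made-up numbers. The ratings below are hypothetical (a 7-point scale from 1 = "very mild" to 7 = "very hot" is assumed); they are constructed so that the mean sits at the midpoint even though few respondents chose it.

```python
from collections import Counter

# Hypothetical ratings on a 7-point scale: 1 = "very mild", 7 = "very hot".
ratings = [1] * 20 + [2] * 25 + [4] * 10 + [6] * 25 + [7] * 20

mean = sum(ratings) / len(ratings)
counts = Counter(ratings)

print(f"mean = {mean:.1f}")  # mean = 4.0
# A text histogram shows the two clusters that the mean conceals.
for point in range(1, 8):
    print(point, "#" * counts.get(point, 0))
```

The mean alone suggests a preference for medium heat, while the frequency distribution shows two separate groups at the mild and hot ends of the scale.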