Post on 28-Mar-2015
The Evolution of Survey Process Quality
Concepts
• Survey
• Design
• Quality
• Quality dimensions
• Product quality
• Process quality
• Organizational quality
• Quality assurance
• Quality control
• Error sources
• Mean squared error
The Survey Process
[Figure: the survey process. Boxes: Research Objectives; Concepts; Population; Mode of Administration; Questions; Questionnaire; Sampling Design; Data Collection; Data Processing; Analysis/Interpretation; with two "revise" feedback loops.]
The Concept of a Survey
• concerns a set of objects comprising a population
• population under study has one or more measurable properties
• goal is to describe the population by one or more parameters defined in terms of the measurable properties
The Concept of a Survey (cont’d)
• access to the population requires a
frame
• sample is selected in accordance with a
sampling design specifying a probability
mechanism and a sample size
The Concept of a Survey (cont’d)
• observations are made in accordance
with a measurement process
• based on the measurements an
estimation process is applied to
compute estimates
• purpose is to make inference to the population (facts, decision-making)
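The chain frame, sampling design, measurement and estimation can be illustrated with a minimal sketch. It assumes the simplest possible design, simple random sampling without replacement, and an expansion estimator of the population total; the frame values, the function name `srs_estimate_total` and the seed are hypothetical and serve only as illustration.

```python
import random


def srs_estimate_total(frame_values, n, seed=0):
    """Draw a simple random sample without replacement from the frame
    and estimate the population total with design weights N/n."""
    rng = random.Random(seed)             # fixed seed so the sketch is reproducible
    N = len(frame_values)
    sample = rng.sample(frame_values, n)  # every unit has known inclusion probability n/N
    weight = N / n                        # design weight = 1 / inclusion probability
    return weight * sum(sample)


# Hypothetical frame of 100 units with measurable property 1, 2, ..., 100
frame = [float(v) for v in range(1, 101)]
estimate = srs_estimate_total(frame, n=20)        # estimate based on 20 sampled units
estimate_full = srs_estimate_total(frame, n=100)  # a "census": reproduces the true total
print(estimate, estimate_full)
```

Because the selection probabilities are known for all units, the estimator is unbiased over repeated samples; this is exactly what is lost in the shortcomings listed on the next slide.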
Typical Shortcomings
• target population is changed during the
study
• selection probabilities are not known for
all selected units
• correct estimation formulas are not used
Types of Surveys
• One-time
– Attitudes, opinions
• Repeated or continuing
– Official statistics (short term indicators, agriculture, living conditions, crime)
– Other (drug use, consumer research, behaviors)
• International and comparative
– Official statistics (European Statistical System, poverty, water supply)
– Student achievement, literacy, values, happiness, marketing, attitudes
Types of organizations
• Official Statistics
– Centralized (NSIs)
– Decentralized (Different agencies)
• General survey work
– Private, academic
– IMF, OECD, UN
Stakeholders
• Customers and users
• Researchers
• Survey organizations
• Owners
• Interest organizations
• The general public
A Brief History
• Biblical censuses
• Political arithmetic 1650-1800, Graunt and Eden
• The 1895 ISI proposal regarding representative investigations
• Bowley argues for random sampling 1913 in an attempt to connect statistical theory and sample design
• ISI agrees to promote extended investigation of representative methods in the mid-20s
• Tschuprow, stratified random sampling, early 20s
• The 1934 Neyman paper on the representative method and optimum allocation
• Neyman develops theories for sampling (cluster sampling, ratio estimation, two-phase sampling) and confidence intervals
• Fisher’s random experiments
• Nonsampling error theory in the 1940s
• Interpenetration 1946, Mahalanobis
• The US Census Bureau survey model 1959-1964
• Data quality, Kish, Zarkovich 1965-66
• Total survey design, Dalenius 1968
• Developments in other disciplines (errors and their causes)
– Questions and interviewers (1917-)
– The response process (1968-) Sudman, Bradburn, Cannell, Tourangeau
– Interviewer-respondent interaction
• Statistical process control (SPC)
– Shewhart’s control chart, 1924
– Administrative applications of SPC in survey work, Minton 1968
Quality Milestones
• Early quality management (building ships, maintaining roads, leading empires)
• Industrial revolution (Taylor, Benz, Ford 1910-)
The Quality Revolution Starts Here
• Shewhart’s control chart for process
control
• Dodge and Romig’s acceptance sampling
• A theory for statistical process control
These are methods and tools to handle process variation
• Deming’s 14 points
• Juran’s spiral of progress
• Ishikawa’s 7 quality control tools
• The Joiner Triangle (quality, scientific approach, teamwork)
• Taguchi’s experimental design
• Bottom line
– Recognition of the client/customer/user
– Increased competition
– A need for continuous improvement
Just a Few More Milestones
• Business excellence models (ISO, EFQM, Malcolm Baldrige), a clear user perspective
• TQM, Six Sigma, Kaizen, Lean, PDCA, BPR and more
• Quality assurance and quality control
• Standards and quality guidelines
Quality According to ISO 9001
The totality of characteristics of an entity that bear on its ability to satisfy stated and implied needs
Definitions of Quality
• General
– Fitness for use
– Design
– Conformance
• In the survey context
– Accurate, timely, accessible plus other dimensions
– Advanced visual display vs tables
– Tolerable error
Quality Assurance and Quality Control
• QA is defined as a set of activities whose purpose is to demonstrate that an entity meets all quality requirements
• QC is defined as a set of activities whose purpose is to ensure that all quality requirements are met
Quality Product (QP)
A QP is one that meets the needs and expectations of customers/clients/users
Eurostat’s Quality Dimensions
• Relevance of statistical concepts
• Accuracy of estimates
• Timeliness and punctuality in disseminating results
• Accessibility and clarity of the information
• Comparability
• Coherence
• (Completeness)
The Process View
• Product characteristics are established together with the user
• The quality of the product is decided by the processes generating the product
• The processes are controlled via key process variables
Assuring and Controlling Quality
Quality level | Main stakeholders | Control instrument | Measures and indicators
Product | User, client | Product specs, SLA, evaluation studies, frameworks, standards | Frameworks, compliance, MSE, user surveys
Process | Survey designer | SPC, charts, acceptance sampling, risk analysis, CBM, SOP, paradata, checklists, verification | Variation via control charts, other paradata analysis, outcomes of evaluation studies
Organization | Agency, owner, society | Excellence models, ISO, CoP, reviews, audits, self-assessments | Scores, strong and weak points, user surveys, staff surveys
Measuring and Documenting Quality
• Accuracy can be measured
• Other quality dimensions are qualitative and can be seen as constraints
• Quality profiles
• Quality reports
• Performance measures
• Codes of practice
Examples of Tools - 1
• Self-assessment via excellence model or other frameworks
• Checklists
• Quality management (TQM, Six Sigma)
• External and internal auditing
• Customer satisfaction surveys
Examples of Tools - 2
• Staff surveys
• Quality control (verification, paradata)
• Documentation
• MSE component measures
Improving Quality
• Benchmarking
• Changing processes
• Small steps or business process reengineering
• Project teams
• Standardization via current best methods
documents or standard operating procedures
and checklists
• Development of quality guidelines
• Training
Quality management philosophies
• This is how I run my company
• Theory a la Drucker
• Improvement methodologies (TQM, Six Sigma, Lean)
• Business Excellence Models (EFQM, Malcolm Baldrige)
EFQM Model 2010
Contents of ISO 20252 (sections)
1. Scope
2. Terms and definitions (Swedish translation, some terms inconsistent with terms used at Stats Sweden)
3. Quality management system requirements (documentation, staff competence and training)
4. Managing the executive elements of research (research proposals, project schedules, questionnaire design)
5. Data collection (field worker training, validation levels & methods, also qualitative data collection)
6. Data management and processing (coding, data editing, data storage & data security, e.g., original data shall be kept)
7. Report on research projects
What Is Six Sigma?
1. Results oriented management
2. Infrastructure and competence
3. Problem-solving methodology
Six Sigma focuses on…
• variations
• customers
• processes
• chronic problems
• results
Why the name Six Sigma?
With a sigma level of 6σ a process has no more than 3.4 defects per million opportunities (dpmo)
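The 3.4 figure can be reproduced with a short sketch. It assumes the customary Six Sigma convention, not stated on the slide, that the process mean may drift by 1.5 standard deviations, so the defect rate is the one-sided normal tail beyond (sigma level − 1.5). The function name `dpmo` and the `shift` parameter are illustrative choices.

```python
import math


def dpmo(sigma_level, shift=1.5):
    """Defects per million opportunities for a given sigma level.

    Assumes a normally distributed characteristic, a one-sided
    specification limit, and a long-term mean shift (1.5 sigma by convention).
    """
    z = sigma_level - shift
    upper_tail = 0.5 * math.erfc(z / math.sqrt(2.0))  # P(Z > z) for a standard normal Z
    return 1e6 * upper_tail


for level in (3, 4, 5, 6):
    print(level, round(dpmo(level), 1))
```

Under these assumptions a 6σ process yields about 3.4 dpmo, while a 3σ process yields roughly 66,800 dpmo.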
Control chart (example)
Understanding Variation (I)
Common cause variation
• Common causes are the process inputs and conditions that contribute to the regular, everyday variation in a process
• Every process has common cause variation
• Example: Percentage of correctly scanned data, affected by people’s handwriting, operation of the scanner…
Understanding Variation (II)
Special cause variation
• Special causes are factors that are not always present in a process but appear because of particular circumstances
• The effect can be large
• Special cause variation is not present all the time
• Example: Using paper with a color unsuitable for scanning
Action
• Eliminate special cause variation
• Decrease common cause variation if necessary
• Do not treat common cause as special cause
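A control chart is the usual tool for telling the two kinds of variation apart. The sketch below assumes a Shewhart p-chart with 3-sigma limits and equal batch sizes; the scanning error counts are invented for illustration, and `p_chart` is a hypothetical helper name.

```python
import math


def p_chart(defect_counts, batch_size):
    """Center line, 3-sigma control limits and out-of-control batches
    for the proportion of defective items per batch."""
    p_bar = sum(defect_counts) / (len(defect_counts) * batch_size)  # center line
    sd = math.sqrt(p_bar * (1 - p_bar) / batch_size)                # binomial standard deviation
    lcl = max(0.0, p_bar - 3 * sd)
    ucl = min(1.0, p_bar + 3 * sd)
    # Batches outside the limits signal possible special cause variation
    flagged = [i for i, d in enumerate(defect_counts)
               if not lcl <= d / batch_size <= ucl]
    return p_bar, lcl, ucl, flagged


# Hypothetical data: incorrectly scanned forms in ten batches of 500 forms
scan_errors = [10, 12, 9, 11, 8, 13, 10, 40, 11, 9]
p_bar, lcl, ucl, flagged = p_chart(scan_errors, batch_size=500)
print(round(p_bar, 4), round(lcl, 4), round(ucl, 4), flagged)
```

Points inside the limits are treated as common cause variation and left alone; only the flagged batch (here the eighth, index 7) would be investigated for a special cause such as unsuitable paper.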
Roots of paradata
• Traditional global ones such as error rates (since 1940)
• The Bristol monograph
• The 1998 ASA session in Dallas
• The Eurostat LEG on Quality
• Handbook on process quality
• Rapid development last 10 years
Meta and para
• Prefixes derived from Greek
• Meta (discussions about discussions, data about data)
• Para (beside, near, beyond, parallel)
Mick Couper’s trilogy
• Data
• Metadata (data about data)
• Paradata (data about processes)
Standards
• There are many standards for surveys; examples include
– ISO 20252
– OMB standards
– NCES statistical standards
– Quality guidelines developed by specific organizations (Stat Can; RTI; etc.)
– ESS Standards for survey reports
A Standard Is…
A document that
– describes methods and procedures for collecting, processing, storing, and presenting survey data
– defines the (minimal) level of quality and effort that is acceptable for all survey processes
What purposes do survey standards serve?
• Define a minimally acceptable level of quality that organizations should attain.
• Provide consistency across surveys in different organizations
• Facilitate communication of complex concepts, formulas, procedures and methodologies
• Provide transparency of the methodologies used to produce a survey data set.
• Transfer skills and knowledge of best survey practice
We Concentrate on Accuracy
• Data must be of sufficient quality for decision-making
• Other dimensions are constraints
• Accuracy is much more difficult to understand
• It is important to convey information on error sources and their contributions to total survey error
• Accuracy is measured by the mean squared error, MSE
Two Routes to Handling Survey Errors
1. Get an estimate of MSE so that we get confidence or other intervals that we can trust
2. Try to develop and use methods that are almost error-free so that the estimated variance becomes an approximation of the MSE
What is mean squared error?
$$\mathrm{MSE} = \mathrm{Bias}^2 + \mathrm{Variance} = (B_{spec} + B_{NR} + B_{Fr} + B_{meas} + B_{DP})^2 + \mathrm{Var}_{samp} + \mathrm{Var}_{meas} + \mathrm{Var}_{DP}$$
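The decomposition can be sketched numerically. Note that the bias components are summed before squaring, so biases of opposite sign can offset each other, whereas the variance components simply add. All numbers below are hypothetical and the function name is an illustrative choice.

```python
def mean_squared_error(biases, variances):
    """MSE = (sum of bias components)^2 + sum of variance components."""
    total_bias = sum(biases.values())
    return total_bias ** 2 + sum(variances.values())


# Hypothetical bias components: specification, nonresponse, frame,
# measurement and data processing
biases = {"spec": 0.5, "nonresponse": 1.0, "frame": -0.5,
          "measurement": 1.0, "processing": 0.0}
# Hypothetical variance components: sampling, measurement, data processing
variances = {"sampling": 2.0, "measurement": 1.0, "processing": 0.5}

mse = mean_squared_error(biases, variances)
print(mse)  # (2.0)^2 + 3.5
```

In this made-up case the squared bias (4.0) exceeds the total variance (3.5), so a confidence interval based on the variance alone would give a misleading picture of accuracy.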
The Survey Process Revisited
[Figure: the survey process. Boxes: Research Objectives; Concepts; Population; Mode of Administration; Questions; Questionnaire; Sampling Design; Data Collection; Data Processing; Analysis/Interpretation; with two "revise" feedback loops.]
3M Survey Life Cycle Paradigm
Copyright CCSG - ccsg.isr.umich.edu
[Figure: survey life cycle. Elements: Study, Organizational, and Operational Structure; Tenders, Bids, and Contracts; Sample Design; Questionnaire Design; Adaptation of Survey Instruments; Translation; Instrument Technical Design; Interviewer Recruitment, Selection, and Training; Pretesting; Data Collection; Data Harmonization; Data Processing and Statistical Adjustment; Data Dissemination; Survey Quality; Ethical Considerations in Surveys.]
Examples of issues
• Research questions and survey questions
• General survey design
• Target population
• Main mode or mix of modes
• Developing the instrument
• Sampling design
• Data collection
• Data processing
• Estimation
• Providing survey results
• Quality assurance
• Quality control
• Evaluation
Objective of Survey Design
• Maximize survey quality for given budget
or
• Minimize cost of achieving specified level of quality
Survey Error
• Sampling error: due to selecting a sample instead of the entire population
• Nonsampling error: errors due to mistakes or system deficiencies
Nonsampling Error
• Specification error
• Measurement error
• Processing error
• Nonresponse error
• Frame error
Risk of Bias and Variance by Error Source
MSE component | Variance | Bias
Sampling error | High | Low
Specification error | Low | High
Nonresponse error | Low | High
Frame error | Low | High
Measurement error | High | High
Data processing error | High | High
How do we estimate bias?
• Obtain measurements that are essentially error free (“gold standard measurements”)
– Implement preferred survey methods on a limited basis
– Record checks
• Comparisons to external gold standard estimates
– Census, CPS, other high quality national surveys
• Modelling attempts
Effects of Nonsampling Errors on Estimates - 1
• Variable errors increase the variances of means, totals, and proportions
– Confidence levels for interval estimates may be overstated
• Systematic errors bias the estimates of means, totals, and proportions
Effects of Nonsampling Errors on Estimates - 2
• Both variable and systematic errors bias estimates of correlation and regression coefficients
• The nominal level of Type I error can be either too high or too low in the presence of nonsampling errors.
Total Survey Error
• Sampling usually more efficient than census
• Sampling error predictable
• Nonsampling error nonpredictable
• Find the balance
• Use risk management
Conclusions
• Survey design involves allocations of resources using incomplete and imperfect information
• Objective should be to minimize total error subject to cost constraints
Specific Error Sources
Specification Error
• Concepts
• Objectives
• Subject matter problem translated into a statistical problem
• Mismatch between research question and survey question
• Are all research questions covered?
Frame Errors
• Coverage errors
– Missing units
– Duplications
– Extraneous units
• Classification errors
– Industry (e.g., Standard Industrial Classification (SIC))
– Geography
– Size
Frame Errors (Cont’d)
• Contact errors
– Address incomplete or incorrect
– Contact name
– Phone number
• Other errors
– Unit structure error
– Frame not current => Errors
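• On frame: proportion $t_C$ of the target population, with mean $\bar{Y}_C$
• Not on frame: proportion $t_{NC} = 1 - t_C$, with mean $\bar{Y}_{NC}$
• $\bar{Y}$ = mean for entire target population
$$\bar{Y} = t_C \bar{Y}_C + t_{NC} \bar{Y}_{NC}$$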
Relative Bias Due to Coverage Error
$$RB_{NC} = (1 - t_C)\,\frac{\bar{Y}_C - \bar{Y}_{NC}}{\bar{Y}}$$
[Figure: Coverage bias as a function of $t_C$ and the relative difference between $\bar{Y}_C$ and $\bar{Y}_{NC}$. X-axis: relative difference between covered and noncovered (%), from -50 to 50. Y-axis: relative coverage bias (%), from -25 to 25. Lines for $t_C$ = .90, .70 and .50.]
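The lines in the figure can be tabulated with a small sketch. It assumes that the horizontal axis measures the difference between the covered and noncovered means relative to the overall mean, in which case the relative coverage bias is simply the noncoverage rate (1 − t_C) times that relative difference. The function name and grid of values are illustrative.

```python
def coverage_bias_pct(t_c, rel_diff_pct):
    """Relative coverage bias (%) given the coverage rate t_c and the
    relative difference (%) between covered and noncovered means,
    both expressed relative to the overall population mean (an assumption)."""
    return (1 - t_c) * rel_diff_pct


# Relative differences from -50% to +50% for the three coverage rates in the figure
table = {
    t_c: [round(coverage_bias_pct(t_c, d), 1) for d in range(-50, 51, 25)]
    for t_c in (0.9, 0.7, 0.5)
}
for t_c, row in table.items():
    print(t_c, row)
```

With 50% coverage and a 50% relative difference the bias reaches 25%, which matches the range of the vertical axis; with 90% coverage the same difference gives a bias of only 5%.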
Nonresponse Error
• Unit nonresponse
– Noncontacts
– Refusals
• Item nonresponse
– Individual questions skipped
Nonresponse Bias
[Figure: the total population divided into respondents and nonrespondents.]
Relative Bias Due to Nonresponse
$$RB_{NR} = (1 - t_R)\,\frac{\bar{Y}_R - \bar{Y}_{NR}}{\bar{Y}}$$
Example
$t_R$ = response rate for a telephone survey = 75%
$\bar{Y}_R$ = average income for respondents = 107 Kr.
$\bar{Y}_{NR}$ = average income for nonrespondents = 89 Kr.
$\bar{Y}$ = .75 (107) + .25 (89) = 102.5 Kr.
$$RB_{NR} = (.25)\,\frac{107 - 89}{102.5} = .044 \text{ or } 4.4\%$$
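The arithmetic of the income example can be checked with a few lines of code. This is only a sketch of the relative bias formula; the function name is a hypothetical choice and the inputs are the figures from the example.

```python
def relative_nonresponse_bias(response_rate, mean_resp, mean_nonresp):
    """Return (relative bias of the respondent mean, overall population mean)."""
    overall = response_rate * mean_resp + (1 - response_rate) * mean_nonresp
    rel_bias = (1 - response_rate) * (mean_resp - mean_nonresp) / overall
    return rel_bias, overall


# Figures from the example: 75% response rate, incomes of 107 and 89 Kr.
rel_bias, overall_mean = relative_nonresponse_bias(0.75, 107.0, 89.0)
print(overall_mean, round(rel_bias, 3))
```

The respondent mean overstates the population mean by about 4.4% even though the response rate is 75%, because the bias depends on both the nonresponse rate and the difference between respondents and nonrespondents.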
[Figure: Response bias as a function of $t_R$ and the relative difference between $\bar{Y}_R$ and $\bar{Y}_{NR}$. X-axis: relative difference between respondents and nonrespondents (%), from -50 to 50. Y-axis: relative response bias (%), from -25 to 25. Lines for $t_R$ = .90, .70 and .50.]
Components of Response and Nonresponse
• Total units = resolved units (2) + unresolved units (3)
• Resolved units (2) = in-scope units (4) + out-of-scope units (5)
• Unresolved units (3) = estimated in-scope units (3A) + estimated out-of-scope units (3B)
Estimating the Unresolved Units That Are In-Scope
$$(3A) = \frac{(4)}{(2)} \times (3)$$
In-scope Units
• In-scope units (4) = respondent units (6) + nonrespondent units (7)
• Respondent units (6) = refusal conversions (11) + other respondents (12)
• Nonrespondent units (7) = refusals (13) + no contacts (14) + residual nonrespondents (15)
Out-of-scope Units
• Out-of-scope units (5) = non-existent units (8) + temporarily out-of-scope units (9) + permanently out-of-scope units (10)
Response Rate Components
Global Process Data
• Response rate = (6) / [(3A)+(4)]
• Cooperation rate = (6) / [(6)+(13)]
• Refusal rate (among respondents and refusals) = (13) / [(6)+(13)]
• Refusal rate (among in-scope units) = (13) / [(3A)+(4)]
• Nonresponse rate = (7) / [(3A)+(4)]
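The rate definitions can be turned into a small calculation. The sketch below uses the numbered components from the slides as inputs and estimates (3A) by applying the in-scope share among resolved units to the unresolved units; the counts and the function name are hypothetical.

```python
def response_rate_components(resolved, unresolved, in_scope,
                             respondents, nonrespondents, refusals):
    """Global process data computed from unit counts.

    Arguments correspond to components (2), (3), (4), (6), (7) and (13).
    """
    est_in_scope = in_scope * unresolved / resolved  # (3A) = (4)/(2) x (3)
    base = est_in_scope + in_scope                   # (3A) + (4)
    return {
        "estimated_in_scope_unresolved": est_in_scope,
        "response_rate": respondents / base,
        "cooperation_rate": respondents / (respondents + refusals),
        "refusal_rate_contacted": refusals / (respondents + refusals),
        "refusal_rate_in_scope": refusals / base,
        "nonresponse_rate": nonrespondents / base,
    }


# Hypothetical sample of 1,000 units
rates = response_rate_components(resolved=900, unresolved=100, in_scope=810,
                                 respondents=630, nonrespondents=180, refusals=90)
for name, value in rates.items():
    print(name, round(value, 3))
```

Note that the response and nonresponse rates need not sum to one, since the estimated in-scope unresolved units enter the denominator but neither numerator.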
Factors Influencing Refusals
Survey Design:
• Mode
• Respondent rule
• Interview length
• Interview period length
• Survey topic
• Questionnaire design
Respondent Characteristics:
• Age, gender, income, health
• Urban-rural
• Crime rate
• Literacy
Interviewer Characteristics:
• Age, gender, race, perceived income, etc.
• Prior experience (skill, confidence)
• Interviewer expectations
• Attitude, recent experience, motivation
Societal Factors:
• Social responsibility
• Legitimacy of survey objective
Psychological Factors (Groves, Cialdini, Couper, 1992)
Reciprocation: Compliance as repayment for a gift, payment, or concession; benefit to R
Consistency: Compliance is consistent with an announced position (belief, attitude, or value)
Social Validation: More willingness to comply if one believes that similar others would also comply
Authority: Compliance is more likely if request comes from a legitimate authority
Scarcity: More willingness to comply to secure opportunities that are scarce
Liking: More willingness to comply to requests from interviewers who are liked
Implications for Interviewing
Prolong Interaction: Maintain conversation to identify cues to use with psychological factors
Tailoring: Adapt interviewing approach to the sample unit
Other Methods to Handle NR
• Decrease respondent burden
• New theory for respondent-friendly questionnaires
• Incentives
• Call scheduling algorithms
• Adjusting for nonresponse
• Dillman’s TDM
• Mixed mode
• Ensuring confidentiality
The Response Process and Its Implications for Questionnaire Design
[Figure: elements shown: Information System; Respondent; Interviewer; Instrument; Mode of Data Collection; Setting; Self-Admin.]
Response Processes
• Cannell, Miller and Oksenberg 1981
• Tourangeau 1984
• Cantor and Edwards 1991
• Biemer and Fecso 1995
• Sudman, Bradburn and Schwarz 1996
• Tourangeau, Rips and Rasinski 2000
• Sudman, Willimack, Nichols and Mesenbourg 2000
• Willimack and Nichols 2010
Response Processes
Individuals
• Comprehension
• Retrieval
• Judgment and estimation
• Communicating an answer
Establishments
• Encoding in memory/record formation
• Identification and selection of respondents
• Assessment of priorities
• Comprehension
• Retrieval
• Judgment and estimation
• Communicating an answer
• Data release
Phenomena I
• Satisficing
• Telescoping
• Recency
• Primacy
• Surprise questions
• Context effects
• Response alternatives effect
• Middle alternatives and DK
• Vague terms
• Reference period
• Double-barreled questions
• Sensitive questions
Phenomena II
• Social desirability bias
• Respondent calculations
• Vague quantifiers
• Number of scale points
• Progress indicators
• Aided recall
• Labelling scale points
• Numerical labels
• Acquiescence
• General and specific questions
• CAPITALIZED TEXT
• Images
Implications for Questionnaire Design
• Wording
• Length
• Format
– Open
– Closed
– Scales
– Filter
• Positioning of questions
• Type of question
– Factual
– Attitude
– Hypothetical
• Layout
• Navigation
• Computer-aided
Encoding/Record Formation
• Description: Knowledge is obtained, processed, and is either stored in memory or a physical record is made. To be retrieved the information must exist.
• Types of Errors:
– Proxy R error; responses from R’s who really “don’t know”
– Memory is incomplete, distorted, or inaccurate
– Records are missing, incomplete, or incompatible with survey requirements
Comprehension
• Description: Meaning of the question, as researcher intended it, is understood by the respondent
• Types of Errors:
– Context errors
– Use of technical terms
– Translation problems
– Misleading response alternatives
Retrieval of Information
• Description: Respondent retrieves relevant information from memory or from records or other external sources
• Types of Errors:
– Forgetting
– Telescoping
– Estimating
– Use of out-dated records
Judgment and Formatting a Response
• Description: Information is evaluated and a response is formatted corresponding to the response alternatives presented
• Types of Errors:
– Response alternatives are too constrained
– Response alternatives suggest a response distribution
– Respondents are pressured into giving a “top of the head” response
Response Editing and Communication
• Description: Respondent edits response and communicates it
• Types of Errors:
– Social desirability effects
– Fear of disclosure
– Acquiescent behavior
Errors Due to Interviewers and Interviewing
The Role of the Interviewer
• School A: “Standardized” interview perspective
• Requires interviewers to:
– Read questions exactly as worded
– Refrain from unscripted interactions
– Obtain a codeable response from the respondent
– Avoid attempts to clarify concepts unless clarifications are prescripted
• School B: “Collaborative” or “conversational” interview perspective
• Requires interviewers to:
– Detect and repair respondent misunderstanding of the question
– Collaborate with respondent in the interview process
– Make common sense inferences in recording answers
– Redesign questions to adapt them to the respondent’s situation
In practice
• Conversational flexible interviewing approach
• Mixture of standardized and conversational
• Person-oriented style
– Discussing personal opinions with the respondent
– Inconsistent probing
– Inconsistent feedback
– Rewording or misinterpreting questions
– Falsification
Systematic Interviewer Errors
Systematic Errors
• Poor questionnaire design: systematic errors across all respondents
• Interviewer error: systematic errors within an interviewer’s assignment
[Figure: factors contributing to interviewer error]
• Interviewer: characteristics (age, race, sex, education), appearance, motives, beliefs/attitudes, perceptions, expectations, behaviors, skills, knowledge
• Respondent: characteristics (age, race, sex, education), knowledge, interest/motivation, confidence, strength of convictions, expectations
• Questionnaire: definition clarity, terminology/jargon, question form, instructions, question wording, question topic
• Survey conditions and setting: mode of interview, standardization, interviewer training, interviewer supervision, monitoring/observation
Design Factors that May Explain Interviewer Effects
[Figure: sample dispersion for Interviewers A, B, C, D and E, shown with no interviewer variance and with interviewer variance.]
Interviewer Error Model
Observed value = true value + systematic error + variable error
Using this model, we estimate:
$$\rho_{int} = \frac{\mathrm{Var}(\text{systematic error})}{\text{total variance of observed value}}$$
Consequences of Interviewer Error for Totals and Means
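Variance of $\bar{y}$ is increased (i.e., multiplied) by the factor
$$1 + (m - 1)\rho_{int}$$
where $m$ is the average number of interviews per interviewer.
Computation of the Increase in Variance
Example
Suppose $m = 100$ and $\rho_{int} = .01$; then
$$1 + (m - 1)\rho_{int} = 1 + (99)(.01) = 1.99 \approx 2$$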
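The variance inflation factor is easy to explore for different workloads. A minimal sketch, in which the function name is an illustrative choice and the workloads other than m = 100 are hypothetical:

```python
def interviewer_variance_factor(m, rho_int):
    """Factor by which interviewer error multiplies the variance of a mean:
    1 + (m - 1) * rho_int, with m the interviewer workload and rho_int the
    intra-interviewer correlation."""
    return 1 + (m - 1) * rho_int


factor = interviewer_variance_factor(100, 0.01)  # the example above
print(round(factor, 2))

# Hypothetical workloads at the same correlation
for workload in (10, 50, 100, 200):
    print(workload, round(interviewer_variance_factor(workload, 0.01), 2))
```

Even a correlation as small as .01 roughly doubles the variance at a workload of 100 interviews, which is why large interviewer assignments are costly in terms of precision.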
Values of $\rho_{int}$ from Interviewer Variance Studies in the Literature
Studies reporting $\rho_{int}$ | Interview mode | Average value of $\rho_{int}$
Study of Blue Collar Workers (Kish, 1962), Study 1 | Face to face | 0.020
Study of Blue Collar Workers (Kish, 1962), Study 2 | Face to face | 0.014
Canadian Census, 1961 (Fellegi, 1964) | Face to face | 0.008
Canadian Health Survey (Feather, 1973) | Face to face | 0.006
Study of Mental Retardation (Freeman and Butler, 1976) | Face to face | 0.036
World Fertility Survey (O’Muircheartaigh and Marckwardt, 1980), Peru - main survey | Face to face | 0.050
World Fertility Survey (O’Muircheartaigh and Marckwardt, 1980), Peru - reinterview | Face to face | 0.058
World Fertility Survey (O’Muircheartaigh and Marckwardt, 1980), Lesotho - main survey | Face to face | 0.102
Consumer Attitude Survey (Collins and Butcher, 1982) | | 0.013
Interviewer Training Project (Fowler and Mangione, 1985) | Face to face | 0.005
Average $\rho_{int}$ for face to face surveys | | .0312
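Values of $\rho_{int}$ from Interviewer Variance Studies in the Literature
Studies reporting $\rho_{int}$ | Interview mode | Average value of $\rho_{int}$
Study of Telephone Methodology | Telephone | 0.0089
Health and Television Viewing | Telephone | 0.0074
Health in America | Telephone | 0.0018
1980 Post Election Study | Telephone | 0.0086
Monthly Consumer Attitude Survey, November 1981 | Telephone | 0.0184
Monthly Consumer Attitude Survey, December 1981 | Telephone | 0.0057
Monthly Consumer Attitude Survey, January 1982 | Telephone | 0.0163
Monthly Consumer Attitude Survey, February 1982 | Telephone | 0.0090
Monthly Consumer Attitude Survey, March 1982 | Telephone | 0.0067
Average $\rho_{int}$ for telephone surveys | | .0092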
Evaluating Interviewer Performance
– Monitoring telephone interviews
– Tape and video recording and behavior coding
– Verification recontact
– Reinterviews
– On-site observations
– Questionnaire review
– Keystroke files
– Mock interviews
Data Collection Modes and Associated Errors
Data Collection Mode
– Modes of data collection
– Choosing a mode
– Data quality considerations
– New technologies and mode
Modes I
• CAPI = Computer Assisted Personal Interviewing
• ACASI = Audio CASI
• CATI = Computer Assisted Telephone Interviewing
• PAPI = Paper and Pencil Interviewing
• CADE = Computer Assisted Data Entry
Modes II
• TDE = Touchtone Data Entry
• CASI = Computer Assisted Self Interviewing
• EDI = Electronic Data Interchange
• DBM = Disk by mail
• EMS = Electronic Mail Survey
• VRE = Voice Recognition Entry
• T-ACASI = Telephone ACASI
Face-to-face
– Flexible
– Expensive
– Advantages and drawbacks of interviewer
– Visual aids
Telephone interviewing
– Similar to f-f but less flexible
– Fast
– Monitoring possible
– Questionnaires have to be simpler
– Good for sensitive topics
– No control over response process
– Can be made respondent-friendly
– All survey materials must be crystal-clear
– Respondent sets the pace
– Question order effects reduced
Web Surveys
• Internet access varies
• Differences in computer systems and browsers must be considered
• Good for visual stimuli
• Questionnaires should be short
• Fast and inexpensive
Diary
– Recall error increases over time
– Heavy response burden
– Behavior can change temporarily
– Survey topic is such that total survey period is quite long
Administrative records
– Errors similar to those of other modes
– Statisticians sometimes have no control over contents, updates, etc.
– Statistical purposes come in second after administrative ones
– Conceptual differences common
Direct observation
• No respondents
• Devices and calibration problems
• Various kinds of observations
– Counting behaviors, eye estimates, anthropology, mystery shopping, price collection, photos
• Observer errors (rho)
Mixed modes
• Can be an “optimal” solution
• Can be a necessity due to frame problems or nonresponse problems
• Give respondents a choice
• Adjustment of questions and questionnaire seldom done
The Choice of Mode
• Each mode has advantages and disadvantages regarding
– Costs
– Measurement errors
– Nonresponse and coverage
– Flexibility
– Timeliness
The Decision Regarding Mode
• Sometimes there is no real choice due to costs or practical constraints
• Often more than one mode must be used
• Pure mode effects difficult to assess
• The decision often concerns a main mode
Summary
• The choice of mode can be very simple or very complex
• Error structures of new modes are not fully understood
Data Processing Errors and Their Control
Data Processing Steps for PAPI
1. Check-in: questionnaires are collected and work units are formed
2. Scan edit: entries are inspected to avoid data entry problems
3. Data entry: questionnaire data are captured via keying, scanning or other optical sensing
4. Editing: captured data are “corrected” and “cleaned;” missing data are “imputed.”
Data Processing Error
• Relatively sparse literature
• Some steps are very error prone (e.g., coding and editing)
• Errors are both systematic and variable
– rho
• Increased automation and integration reduces variable error while increasing systematic error
Data Capture Errors
• Keying errors
– Discovered by verification keying or editing
– Error rates usually small based on records, fields or characters
– Studies often conducted in QC environments
– The vital few large errors can have large effects on MSE
Data Capture Errors (cont’d)
• Intelligent Character Recognition
– Error types are substitution and rejection
– Substitution errors can be systematic
– Condition of incoming documents and the equipment is crucial which calls for continuing calibration
– Might have to be complemented with manual keying
Editing definition
• Editing is the identification and, if necessary, correction of errors and outliers in individual data used for statistics production
• The definition does not state that all errors be corrected or even identified
• Editing can be very costly
Purpose of editing
• To provide information about data quality (patterns and root causes)
• To provide information about future survey improvements
• To “clean up” the data
Different Kinds of Editing
• Micro-editing: Editing at record level
• Macro-editing: Editing at aggregate level
• Selective editing
• Output editing
The result is overediting
• Historical reasons
• Large budgets
• Really QC of the data collection operation
• Feedback loop often missing
• Risk management
Key process variables for editing (examples)
• Edit failure rate (#objects with edit failures/#objects edited) estimates amount of verification
• Correction rate (#objects corrected/#objects edited) estimates the effect
• Edit success rate by variable (#objects with changes on variable X/#objects with edit failures on X) estimates how successfully the edits identify errors on X
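These three indicators are simple ratios of counts that an editing system can log. A minimal sketch with invented counts; the function names are hypothetical.

```python
def editing_indicators(n_edited, n_with_failures, n_corrected):
    """Edit failure rate and correction rate for one editing run."""
    return {
        "edit_failure_rate": n_with_failures / n_edited,  # amount of verification work
        "correction_rate": n_corrected / n_edited,        # effect of the editing
    }


def edit_success_rate(n_changed_on_x, n_failed_on_x):
    """Share of edit failures on variable X that led to a change on X."""
    return n_changed_on_x / n_failed_on_x


# Hypothetical counts: 1,000 records edited, 200 with edit failures, 50 corrected
indicators = editing_indicators(n_edited=1000, n_with_failures=200, n_corrected=50)
# Hypothetical variable X: 120 records failed an edit on X, 30 were changed
success_x = edit_success_rate(n_changed_on_x=30, n_failed_on_x=120)
print(indicators, success_x)
```

A high failure rate combined with a low correction rate, as in these made-up numbers, is the typical signature of overediting: many records are flagged and reviewed but few are actually changed.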
Coding
• Classification process where open-ended responses are classified into coding categories
• Coding can be expensive, error-prone and boring
• Coding can be manual centralized or decentralized, automated or computer-assisted
The Generic Coding Process
• Input: response, coding instructions, nomenclature
• Action: coder judgment
• Output: code number assignment
Coding Errors
• Coding is subjective in nature
• Error rates and variability rates can be large
• Coding error occurs when there is a deviation between the assigned code number and the true code number
Coding Errors (cont’d)
• Coding errors are identified by verification
• Coding rules and nomenclatures may be incomplete
• Errors are controlled by automation and by dependent or independent verification
Examples of Coding Error Rates
• 1970 Swedish Census
– Occupation 13.5 %
– Industry 9.9 %
• 1970 US Census
– Occupation 13.3 %
– Industry 9.1 %
• 1991 RTI
– Occupation 21 %
– Industry 17 %
Two-way Independent Verification with Adjudication
1. Production coding by Coder A resulting in code number xA
2. Verification coding by Coder B resulting in code number xB
3. Compare code numbers xA and xB: xA = xB?
– Yes: xA = xB is the final, outgoing code number
– No: continue
4. Verification coding by Coder C resulting in code number xC
5. Compare code numbers xA, xB and xC: xA = xC or xB = xC?
– Yes: xA = xC or xB = xC is the final, outgoing code number
– No: continue
6. Verification coding by Coder D resulting in code number xD; xD is the final, outgoing code number
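The decision logic of the verification scheme can be sketched in a few lines. This follows one reading of the flowchart: Coder C is consulted only when A and B disagree, and Coder D adjudicates only when C matches neither. The function name and the example code numbers are hypothetical.

```python
def final_code(x_a, x_b, code_by_c, code_by_d):
    """Two-way independent verification with adjudication.

    x_a, x_b: code numbers from production Coder A and verifier Coder B.
    code_by_c, code_by_d: callables returning the code from Coder C and
    Coder D; they are invoked only when their judgment is needed.
    """
    if x_a == x_b:
        return x_a               # A and B agree: outgoing code
    x_c = code_by_c()            # third, independent coding
    if x_c in (x_a, x_b):
        return x_c               # C agrees with A or with B: majority code
    return code_by_d()           # no agreement: Coder D adjudicates


# Hypothetical occupation codes
print(final_code("2330", "2330", lambda: "0000", lambda: "9999"))  # A and B agree
print(final_code("2330", "2221", lambda: "2221", lambda: "9999"))  # C sides with B
print(final_code("2330", "2221", lambda: "5120", lambda: "9999"))  # D decides
```

Passing the later coders as callables mirrors the cost structure of the scheme: the more expensive third and fourth codings are only carried out for the cases where disagreement remains.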
Automated Coding
• There should be a computer-stored dictionary
• Responses are entered online or via some other medium like scanning or keying
• Responses are matched with dictionary descriptions and based on that matching the responses are coded by the software or transferred to manual coding
• By collecting and analyzing process data the system is continually improved
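The matching step can be illustrated with a toy sketch. It assumes exact matching after simple normalization (lower case, collapsed whitespace); the dictionary entries, code numbers and function name are invented for illustration and do not come from any real nomenclature.

```python
def automated_coding(responses, dictionary):
    """Code responses by exact dictionary match after normalization.

    Returns the assigned codes by response index, the indices sent to
    manual coding, and the coding degree (share coded automatically).
    """
    coded, to_manual = {}, []
    for i, text in enumerate(responses):
        key = " ".join(text.lower().split())  # normalize case and whitespace
        if key in dictionary:
            coded[i] = dictionary[key]
        else:
            to_manual.append(i)               # no match: transfer to manual coding
    degree = len(coded) / len(responses) if responses else 0.0
    return coded, to_manual, degree


# Hypothetical dictionary and open-ended occupation responses
dictionary = {"school teacher": "2330", "nurse": "2221"}
responses = ["Nurse", "school  teacher", "helps out at the farm", "nurse "]
coded, to_manual, degree = automated_coding(responses, dictionary)
print(coded, to_manual, degree)
```

The unmatched responses are the process data referred to in the last bullet: reviewing them regularly shows which descriptions should be added to the dictionary to raise the coding degree.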
Levels of Automation
• Computer Assisted Coding
• Automated
• Matching can be exact or inexact
• Coding degrees obtained:
– Purchases 73% (Sweden)
– Industry and occupation 63% (US)
Key process variables in coding
• Coding degree in AC and MC
• Effects in coding degree by updates of dictionary
• Coding degree by category, AC and MC
• Coding error rate by coders, categories, coding mode and update version
• CAC consultation degree by category and coder
File preparation
• Attaching weights to each unit
• Final weight is a product of base weight and adjustment factors for nonresponse and noncoverage
• No theory for measurement error adjustment yet
• Computation can be difficult
• Application of disclosure avoidance techniques, macrodata and microdata
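The product form of the final weight can be sketched as follows. The sketch assumes the simplest adjustment factors, namely the inverse of a response rate and the inverse of a coverage rate within the unit's weighting class; in practice the factors often come from calibration or response propensity models. All names and numbers are hypothetical.

```python
def final_weight(inclusion_prob, response_rate, coverage_rate):
    """Final weight = base weight x nonresponse adjustment x noncoverage adjustment."""
    base_weight = 1.0 / inclusion_prob       # design weight
    nonresponse_adj = 1.0 / response_rate    # assumed: inverse response rate in the class
    noncoverage_adj = 1.0 / coverage_rate    # assumed: inverse coverage rate in the class
    return base_weight * nonresponse_adj * noncoverage_adj


# Hypothetical unit: sampled with probability 1/100, in a weighting class
# with 80% response and 95% frame coverage
weight = final_weight(inclusion_prob=0.01, response_rate=0.8, coverage_rate=0.95)
print(round(weight, 2))
```

Each responding unit here ends up representing about 132 population units instead of the 100 implied by the design alone.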
The Total Survey Error Framework
Deming (1944) “On Errors in Surveys”
• American Sociological Review!
• First listing of sources of problems, beyond sampling, facing surveys
• The 13 factors
Deming’s 13 factors
-The 13 factors that affect the usefulness of a survey
-To point out the need for directing effort toward all of them in the planning process with a view to usefulness and funds available
-To point out the futility of concentrating on only one or two of them
-To point out the need for theories of bias and variability that correlate accumulated experience
Comments on Deming (1944)
• Does include nonresponse, sampling, interviewer effects, mode effects, various other measurement errors, and processing errors
• Omits coverage errors
• Includes nonstatistical notions (auspices)
• Includes estimation step errors (wrong weighting)
• “Total survey error” not used as a term
Sampling Text Treatment of Total Survey Error
• Kish, Survey Sampling, 1965
– Graphic on biases
– 65 of 643 pages on various errors, with specified relationship among errors
Figure: classification of biases
• Sampling biases
– Frame biases
– “Consistent” sampling bias
– Constant statistical bias
• Nonsampling biases
– Nonobservation: noncoverage, nonresponse
– Observation: field (data collection), office (processing)
Sampling Text Treatment of Total Survey Error
• Särndal, Swensson, Wretman, Model Assisted Survey Sampling, 1992
– Part IV, 124 pp. of 694, coverage, nonresponse, measurement error; omits processing error
• Lohr, Sampling: Design and Analysis, 2009
– 34 of 600 pages on nonresponse, and 40 on nonsampling errors and survey quality
Other textbooks
• Cochran (1953). Sampling Techniques. 40 pages in concluding chapter on “sources of error in surveys”
• Deming (1950). Some Theory of Sampling. Starts with the 1944 factors but then continues with pure sampling
• Hansen, Hurwitz and Madow (1953). Sample Survey Methods and Theory, Vol 1. Nine pages on survey errors.
• Zarkovich (1966). Quality of Statistical Data.
Total Survey Error (1979), Andersen, Kasper, Frankel, and Associates
• Empirical studies on nonresponse, measurement, and processing errors for health survey data
• Initial total survey error framework in more elaborated nested structure
Figure: nested structure of Total Error
• Variable Error
– Sampling
– Nonsampling: field, processing
• Bias
– Sampling: frame, consistent
– Nonsampling, observation: field, processing
– Nonsampling, nonobservation: noncoverage, nonresponse
Survey Errors and Survey Costs (1989), Groves
• Attempts conceptual linkages between total survey error framework and
– psychometric true score theories
– econometric measurement error and selection bias notions
• Ignores processing error
• Highest conceptual break on variance vs. bias
• Second conceptual break on errors of nonobservation vs. errors of observation
Figure: Mean Square Error
• Variance
– Errors of nonobservation: coverage, nonresponse, sampling
– Observational errors: interviewer, respondent, instrument, mode
• Bias
– Errors of nonobservation: coverage, nonresponse, sampling
– Observational errors: interviewer, respondent, instrument, mode
• Psychometric terms shown alongside: construct validity, theoretical validity, empirical validity, reliability, criterion validity (predictive validity, concurrent validity)
Nonsampling Error in Surveys (1992), Lessler and Kalsbeek
• Evokes “total survey design” more than total survey error
• Omits processing error
Components of Error and Topics
• Frame errors
– Missing elements
– Nonpopulation elements
– Unrecognized multiplicities
– Improper use of clustered frames
• Sampling errors
• Nonresponse errors
– Deterministic vs. stochastic view of nonresponse
– Unit nonresponse
– Item nonresponse
• Measurement errors
– Error models of numeric and categorical data
– Studies with and without special data collections
Introduction to Survey Quality (2003), Biemer and Lyberg
• Major division of sampling and nonsampling error
• Adds “specification error” (a la “construct validity”) or relevance error
• Formally discusses process quality
• Discusses “fitness for use” as quality definition
Sources of Error and Types of Error
• Specification error
– Concepts
– Objectives
– Data element
• Frame error
– Omissions
– Erroneous inclusions
– Duplications
• Nonresponse error
– Whole unit
– Within unit
– Item
– Incomplete information
• Measurement error
– Information system
– Setting
– Mode of data collection
– Respondent
– Interview
– Instrument
• Processing error
– Editing
– Data entry
– Coding
– Weighting
– Tabulation
Survey Methodology (2009), Groves, Fowler, Couper, Lepkowski, Singer, Tourangeau
• Notes twin inferential processes in surveys
– from a datum reported to the given construct of a sampled unit
– from estimate based on respondents to the target population parameter
• Links inferential steps to error sources
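Figure: twin inferential processes leading to the Survey Statistic
• Measurement: Construct, Measurement, Response, Edited Data
– Error sources along the path: validity, measurement error, processing error
• Representation: Inferential Population, Target Population, Sampling Frame, Sample, Respondents
– Error sources along the path: coverage error, sampling error, nonresponse error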
Key Statistical Developments in Total Survey Error 1
• Errors of observers can be correlated (1902), Karl Pearson
• Interpenetrating samples (1946), Mahalanobis
• Criteria for true values (1951), Hansen, Hurwitz, Marks and Mauldin
• Essential survey conditions, correlated response variance (1959), H-H-Bershad
• BC survey model “mixed-error model” (1961), H-H-B
Key Statistical Developments in Total Survey Error 2
• Interviewer effects using ANOVA (Kish 1962)
• Simple response variance via reinterviews (1964), H-H-Pritzker
• Relaxed assumptions of zero covariance of true values and response deviations (1964, 1974), Fellegi
• Errors of Measurement (1968), Cochran
• Estimating model components via basic study schemes using replication, interpenetration and combinations of the two (1969), Bailar and Dalenius
• Estimating nonsampling variance using mixed linear models (1978), Hartley and Rao
• “Error Profile” of Current Population Survey (1978), Brooks and Bailar
• Multi-method multi-trait models on survey measures (1984), Wothke and Browne
Weaknesses of the Common Usage of “Total Survey Error”
– Notably a user perspective is missing
– Key quality dimensions are missing in the TSE paradigm
– User often cannot or prefers not to question accuracy
– The complexity does not invite outside scrutiny of accuracy
– Users not really informed about real levels of error or uncertainty
– We don’t really know how users perceive information on errors
Other Weaknesses of the Total Survey Error Paradigm 1
1. Lack of routine measurements
– No agency does this
– Error/quality profiles are useful but rare
2. Ineffective influence on professional standards
– Little expansion beyond sampling error in practice
– Press releases on Federal statistics rarely contain even sampling errors
– Survey error research compartmentalized rather than integrated
– Methodologists tend to specialize
– Root causes of error often still missing
– How about OMB’s requirement of NR bias studies if NR is expected to exceed 20%?
Other Weaknesses of the Total Survey Error Paradigm 2
3. Large burden on design of some estimators
– Interpenetration, reinterviews for variance estimation complicated and costly
– Intractable expressions for some components
4. Some assumptions unrealistic
Strengths of the Total Survey Error Framework
1. Taxonomic decomposition of errors
• nomenclature for different components
2. Separation of phenomena affecting statistics in different ways
• variance vs. bias; observation vs. nonobservation; respondent/interviewer/measurement task; processing
3. Conceptual foundation of the field of survey methodology
• subfields defined by errors
4. Tool for identifying gaps in the research literature
• e.g., where are the error evaluation papers on processing?
Needed Steps in a Research Agenda for Total Survey Error 1
1. Integrating causal models of survey errors
• cognitive psychological mechanisms (anchoring, recall decay)
2. Research on interplay of two or more error sources jointly
• e.g., nonresponse and measurement error
3. Research on the interplay of biases and variances
• e.g., does an increase in simple response variance accompany some response bias reductions (self-administration effects)?
Needed Steps in a Research Agenda for Total Survey Error 2
4. Guidance on tradeoffs between quality measurement and quality maximization and between measures and developing error-free processes
• how much should we spend on quality enhancement vs. measurement of quality (Spencer, 1985)?
5. Integrating other notions of quality into the total survey error paradigm
• if “fitness for use” predominates as a conceptual base, how can we launch research that incorporates error variation associated with different uses? (Australian Bureau of Statistics)
Needed Steps in a Research Agenda for Total Survey Error 3
6. Exploiting a multiple-mode, multiple frame, multiple phase survey world
7. Need for methodological studies to assist the user
8. Costs and risks
9. Develop theories for optimal design of specific operations, design principles
10. More standards?
Measures and Indicators of Quality
Assuring and Controlling Quality
• Quality level: Product
– Main stakeholders: user, client
– Control instrument: product specs, SLA, evaluation studies, frameworks, standards
– Measures and indicators: frameworks, compliance, MSE, user surveys
• Quality level: Process
– Main stakeholders: survey designer
– Control instrument: SPC, charts, acceptance sampling, risk analysis, CBM, SOP, paradata, checklists, verification
– Measures and indicators: variation via control charts, other paradata analysis, outcomes of evaluation studies
• Quality level: Organization
– Main stakeholders: agency, owner, society
– Control instrument: excellence models, ISO, CoP, reviews, audits, self-assessments
– Measures and indicators: scores, strong and weak points, user surveys, staff surveys
Process data and Paradata
Definitions
• Process is a series of actions or steps towards achieving a particular end
• Process quality is an assessment of how far each step meets defined criteria
• Process variables are factors that can vary with each repetition of the process
• Key process variables are factors that have a large effect on process end result
Some paradata terminology
• Data, Metadata, Paradata
• Macro paradata: global process data such as response rates, coverage rates, edit failure rates, sometimes broken down
• Micro paradata: process data that concern individual records such as flagged imputed records, keystroke data
• Formal selection, collection, and analysis of key process variables that have an effect on a desired outcome, e.g., increased productivity
Definitions of paradata
• Groves and Couper: Paradata are data about the data collection process
• Admit that definition is not well-evolved and subject to debate
• Groves et al. : Process and administrative data produced auxiliary to the survey data collection.
• The European term “process data” takes all survey processes into account
• Developing terminology standards is usually a waste of time
• Paradata is a subset of process data but nothing to argue about
• The important thing is: Never collect data on processes that are not related to quality, every collection should be goal-driven
• Collecting data on processes related to quality without using SPC and other proper analysis methods is extremely wasteful
• If you don’t know how to analyze don’t collect
Plan for continuous improvement (of a product), Marker and Morganstein 1997
• Identify critical product characteristics
• Develop a process flow map
• Determine key process variables
• Evaluate measurement capability
• Determine stability of critical processes
• Determine process capability
• Establish a system for continuous process monitoring
Product characteristics
• Ideally decided by the customer
• Communicating concepts and innovative ideas
Flow charts
• Flow, decision points, customers
• Define owners
• List process variables (those whose values can affect product characteristics)
• At this stage a process “variable” is much broader than what is usually meant (factors such as prices, dates, lists of customers, etc. can be “variables”)
Key process variables
• Difficult step
• Key are those that have the largest effect on process outputs
• Collective knowledge is used in the selection process
• Tools include the Pareto diagram and the cause-and-effect diagram (fishbone or Ishikawa)
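As a rough illustration of the Pareto idea, the sketch below ranks categories by their contribution and picks the "vital few" that account for most of the total. The function names, the example categories and the 80% cutoff are assumptions chosen for illustration, not values from the slides.

```python
def pareto_table(counts):
    """Return (category, count, cumulative share) rows, largest count first."""
    total = sum(counts.values())
    if total == 0:
        return []
    rows = []
    cumulative = 0
    for name, count in sorted(counts.items(), key=lambda kv: kv[1], reverse=True):
        cumulative += count
        rows.append((name, count, cumulative / total))
    return rows


def vital_few(counts, threshold=0.8):
    """Categories that together reach the cumulative share `threshold`.

    The default of 0.8 is a common rule of thumb, used here as an assumption.
    """
    selected = []
    for name, _, cumulative_share in pareto_table(counts):
        selected.append(name)
        if cumulative_share >= threshold:
            break
    return selected
```

With hypothetical error counts such as `{"occupation": 50, "industry": 30, "education": 15, "other": 5}`, the first two categories already account for 80% of the errors. A ranking like this only supports the team's collective judgment; it does not replace it.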
Measurement capability
• Do not reach conclusions about process stability without knowledge about measurement errors
• Available data may be useless
• Data should allow quantification of improvement
• Be careful when it comes to customer satisfaction surveys
Determine stability of critical processes
• Control charts
• Diagnose type of process variation
– (Assignable) special cause
– Common cause
• Take action
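A minimal sketch of one common control chart, the p-chart for a proportion such as a coding error rate per batch, is given below. It applies only the simplest signal (a point outside the three-sigma limits); the function names and the example data are assumptions for illustration.

```python
import math


def p_chart_limits(defects, sample_sizes):
    """Center line and per-sample 3-sigma limits for a p-chart.

    defects[i] is the number of erroneous items in sample i,
    sample_sizes[i] the number of items inspected in sample i.
    """
    p_bar = sum(defects) / sum(sample_sizes)
    limits = []
    for n in sample_sizes:
        sigma = math.sqrt(p_bar * (1 - p_bar) / n)
        lower = max(0.0, p_bar - 3 * sigma)
        upper = min(1.0, p_bar + 3 * sigma)
        limits.append((lower, upper))
    return p_bar, limits


def special_cause_points(defects, sample_sizes):
    """Indexes of samples whose proportion falls outside the control limits."""
    _, limits = p_chart_limits(defects, sample_sizes)
    flagged = []
    for i, (d, n) in enumerate(zip(defects, sample_sizes)):
        lower, upper = limits[i]
        if d / n < lower or d / n > upper:
            flagged.append(i)
    return flagged
```

Points flagged this way are candidates for special cause variation and call for investigation. If no points are flagged but the overall level is still unacceptable, the variation is common cause and the remedy is a change to the process itself.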
Determine system capability
• After system changes (improvement projects) triggered by unacceptable common cause variation process stability must be reevaluated so that the new process is capable of meeting specs such as minimum response rates, minimum error rates, deadlines, etc
• Reduced variation is maintained by adhering to SOPs or CBMs
System for process monitoring
• Processes cannot be expected to remain stable over time.
• Technology changes, new types of human errors, customer requirements change
• Thus, monitoring necessary
Paradata in coding, say, occupation
• Manual: error rate by coder, category, coder experience, within- and between coder variability
• Computer-assisted: degree of computer-consulting, error rates combined with computer use
• Automated: error rates by category, coding degree in general, by dictionary update, by dictionary type
New types of paradata
• Interviewer notes
• Attributes of call attempts
• Nature of interaction with sample member
• Behaviours during the interview
• Flagging imputed records
• Keystroke data
• Response latency
Importance of paradata (I)
• Continuous updates of progress and stability checks (monitoring)
– Control charts, standard reports
– Managers choose to act or not to act
– Early warning system
• Input to long-run process improvement of product quality
– Analysis of special and common cause variation
• Input to methodological changes
– Finding and eliminating root causes of problems
– Research
Importance of paradata (II)
• Responsive designs
– Simultaneous monitoring of paradata and regular survey data to improve efficiency and accuracy
• Input to organizational change
– E.g., centralization, decentralization, standardization
• Quality profiles, client communication, public use paradata files, inference, picturing quality over time
Exploratory analysis of paradata
• Example of multivariate situation
• Observing one interviewer: Large % vacant housing, unusual time of interview, short interview length, response pattern does not vary much
• Possible curbstoning
Risks associated with paradata
• Could be a lot due to automatic byproducts
• Could be a lot of indirect indicators of cost and quality
• Correct analysis approaches must be used
• Ethical concerns
• Overuse and underuse
Thoughts on development
• Process indicators should be key
• Paradata are multivariate in nature and might have to be combined to be relevant
• We need to learn how to use paradata to intervene in the process as needed
• Create paradata archives to allow reanalysis so that understanding of what is key can grow or change
• Examine potential of partnership across organizations
• Communicating paradata with users
An Overview of Survey Error Evaluation Methods
Purpose of Survey Error Evaluation
• Compare data collection modes or methods
• Optimize allocation of resources
• Error reduction for specific survey processes
• Provide users with information on data quality
• Adjustment estimates for nonsampling error
General Methods for Evaluation
• Pretesting
• Experiments
• Statistical Process Control
– Process control: key process variables, control charts
– Acceptance sampling
• Postsurvey validation
Some Techniques for Survey Evaluation I
• Evaluation method
– Expert review of questionnaires: unstructured, structured
• Stage
– Design
• Purpose
– Identify problems with questionnaire layout and format, question wording, order and instructions
Some Techniques for Survey Evaluation II
• Evaluation method
– Cognitive methods: behavior coding, cognitive interviewing, other cognitive lab methods
• Stage
– Design/pretest
• Purpose
– Evaluate one or more stages of the response process
Some Techniques for Survey Evaluation III
• Evaluation method
– Debriefings: interviewer group discussions, respondent focus groups
• Stage
– Pretest/survey/post survey
• Purpose
– Evaluate questionnaire and data collection procedures
Some Techniques for Survey Evaluation IV
• Evaluation method
– Observation: supervisor observation, telephone monitoring, tape recording/CARI
• Stage
– Pretest/survey
• Purpose
– Evaluate interviewer performance. Identify questionnaire problems.
Some Techniques for Survey Evaluation V
• Evaluation method
– Post-survey analysis: experimentation, nonrandom observation, internal consistency, external validation
• Stage
– Post-survey
• Purpose
– Compare alternative methods of data collection, estimate MSE components, validate estimates
Some Techniques for Survey Evaluation VI
• Evaluation method
– Post-survey data collection: reinterviews, nonresponse follow-up, record checks
• Stage
– Post-survey
• Purpose
– Estimate MSE components
Basic interview-reinterview table for a dichotomous variable
               Interview
Reinterview    1      0      Total
1              a      b      a+b
0              c      d      c+d
Total          a+c    b+d    n
Some measures
• g = (b+c)/n, gross difference rate or disagreement rate
• A = (a+d)/n, agreement rate (1-g)
• ndr = (b-c)/n, net difference rate
• I = g/[p1(1-p2)+p2(1-p1)], index of inconsistency
• p1 = (a+b)/n
• p2 = (a+c)/n
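The measures listed above can be computed directly from the four cell counts. The sketch below follows the formulas on the slide; the function name and the returned dictionary keys are assumptions for illustration.

```python
def reinterview_measures(a, b, c, d):
    """Consistency measures from a 2x2 interview-reinterview table.

    a, b, c, d are the cell counts as laid out in the table above.
    """
    n = a + b + c + d
    if n == 0:
        raise ValueError("the table is empty")
    p1 = (a + b) / n
    p2 = (a + c) / n
    g = (b + c) / n                       # gross difference rate
    agreement = (a + d) / n               # agreement rate, equals 1 - g
    ndr = (b - c) / n                     # net difference rate, equals p1 - p2
    denominator = p1 * (1 - p2) + p2 * (1 - p1)
    # The index is undefined when both margins are degenerate.
    index = g / denominator if denominator > 0 else float("nan")
    return {"g": g, "A": agreement, "ndr": ndr, "I": index, "p1": p1, "p2": p2}
```

With a = 40, b = 10, c = 5 and d = 45, the gross difference rate is 0.15, the net difference rate is 0.05 and the index of inconsistency is 0.3.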
Practical Survey Design for Minimizing MSE
What Should Be Designed?
• Requirements + specifications + operations
• Ideal goal + defined goal + actual results
• Good survey design means control of accuracy through the specs (QA) and control of operations (QC)
Some Early Thinking
• Hansen-Hurwitz-Pritzker 1967
– Take all error sources into account
– Minimize all biases and select a minimum-variance scheme so that Var becomes an approximation of (a decent) MSE
– The zero defects movement that later became Six Sigma
• Dalenius 1969
– Total survey design
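The remark that Var becomes an approximation of the MSE rests on the standard decomposition of mean squared error. The notation below is an assumption introduced for illustration, with \hat{\theta} an estimator of the population parameter \theta:

```latex
\mathrm{MSE}(\hat{\theta})
  = \mathrm{E}\bigl[(\hat{\theta} - \theta)^2\bigr]
  = \mathrm{Var}(\hat{\theta}) + \bigl[\mathrm{Bias}(\hat{\theta})\bigr]^2,
\qquad
\mathrm{Bias}(\hat{\theta}) = \mathrm{E}(\hat{\theta}) - \theta
```

When the biases have been driven close to zero, the squared bias term is negligible and the variance alone is a reasonable stand-in for the MSE.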
Alternative Criteria of Effectiveness
• Minimizing MSE for a given budget while meeting other requirements
• Maximizing fitness for use for a given budget
• Maximizing comparability for a given budget
• All these reversed
• Something else?
The Elements of Design
• Assessing the survey situation (requirements)
• Choosing methods, procedures, “intensities”, and controls (specifications)
• Allocating resources
• Assessing alternative designs
• Carry out one of them or a modification of it
• Have a Plan B
So, What’s the Problem?
• No established survey planning theory
• Multi-purpose, many users
• The information paradox
• Uninformed clients/users/designers
• Much design work is partial, not total
• Limited knowledge of effects of measures on MSE and cost
More Problems
• Decision theory and economics theory not used to their potential
• New surveys conducted without sufficient consideration of what is already known
• No one knows the proper allocation of resources put in before, during and after
• The literature is small
Various Skills Needed Which Calls for a Design Team
• Survey methodology
• Subject-matter
• Statistics (decision theory, risk analysis, loss functions, optimization, process control)
• Economics (cost functions, utility)
• IT
Rules of the Road
• Use reliable methods
• Develop a survey plan showing the resource allocation to each stage
• To be able to allocate resources optimally, collect information during planning and implementation
Rules of the Road (cont’d)
• Monitor the processes that lead to the product
• Disseminate information on data quality to users and producers
The Balance Between Cost, Errors and Other Quality Features
• Quality dimensions conflict
– Accuracy vs timeliness
– Accuracy vs relevance
– Comparability vs accuracy
– Cost vs error
Problems that Impede our Ability to Optimize Surveys
• Lack of expertise
• The relationship between resources spent on error reduction and actual error reduction is unknown
• Survey errors are highly interactive
Problems that Impede our Ability to Optimize Surveys (cont’d)
• Major surveys are multi-purpose
• All quality dimensions and constraints on them limit design flexibility
• It is not known how to allocate resources between pilot studies, error reduction and error measurement
Bad News and Good News
• Bad news
– Cost-survey optimization can be extremely complex and much of this complexity is unknown
• Good news
– Simple models describing the relationship between cost and error are still useful because often the optimum is flat
The Adaptive Element
• The entire survey process should be responsive to anticipated uncertainties that exist before the process begins and to real time information obtained throughout the execution of the process
or
• Use process data (paradata) to check, and if necessary, adjust the process
We Should Assemble What We Know
• Assessment methods
• Design principles
• Trade-offs and their effects
• The potential offered by other disciplines
• We shouldn’t accept partial designs
Apply Design Principles
• If pop is skewed then….
• If pop is nested then….
• If questions are sensitive then….
• If a high NR rate is expected then…
Examples of Trade-offs
• Accuracy vs timeliness
• Response burden vs wealth of detail
• Conduct survey vs other information collection
• Large n vs smaller n
• Mixed vs single mode
• NR bias vs measurement error
• NR vs interpretation by family members
Example of Outline for a Survey Plan
• Statement of work
• Technical approach
• Management plan
• Schedule of activities and deliverables
• Budget
Checking out the Resources
• Consult in-house experts
• Participate in professional activities
• Develop current best methods for major survey processes
Checking out the Resources (cont’d)
• Apply findings from the survey methods literature
• Consult general quality guidelines developed by prominent organizations
Examples of resources
• Conferences
– ASA
– AAPOR
– ISI
– Topic
• Journals
– JOS
– Survey Methodology
– POQ
• Books
Using Pilot Studies to Inform Survey Design
• Paradox:
• In principle, the survey designer needs information that will not be available until the survey has been completed
• The answer: Pilot studies on a smaller scale than the survey itself
Examples of Pilot Study Topics
• Choice of mode
• Length of recall period
• Topic sensitivity
• Response burden
• Clarity of concepts and definitions
• Effect of confidentiality pledges
• Question wording
• Alternative respondent rules
• Time estimates
• Expected rates of nonsampling error
• Cost components
Documentation
• Survey administrative processes
– Survey plan
– Revisions of plan
– Process details
– Process variables
Documentation (cont’d)
• Quality reports
– Use framework based on quality dimensions
– Report estimates of MSE components
– In absence of MSE component estimates provide indicators of quality
– Implement a rolling evaluation scheme