The Utility of Metadata for Questionnaire Design and Evaluation
24 April 2007 QUEST2007: Statistics Canada, Ottawa, Canada
The Utility of Metadata for Questionnaire Design and Evaluation
Jim Esposito, Bureau of Labor Statistics, Washington, DC
Disclaimer: The views and opinions expressed in this presentation are those of the presenter/author and not necessarily those of the Bureau of Labor Statistics or the Bureau of the Census.
Objectives of Presentation
To draw attention to the concept of metadata and to its scope and relevance
To describe a case study involving the measurement of work/employment that illustrates the utility of metadata in evaluating and designing questionnaire items
Metadata: An Informal Definition
Metadata can be defined as any information (verbal or numeric or code, qualitative or quantitative) that provides context for understanding survey-generated data:
  Domain-specific/ethnographic information
  Concepts and question objectives
  Questionnaire items and administration modes
  Instructional materials
  Pre- and post-survey evaluation research
  Classification algorithms
The Measurement of Labor-Force Status via the CPS
Current Population Survey [CPS]
  Official source of LF statistics in USA (e.g., monthly unemployment rate)
  CPS measures work, not jobs
  60,000 households a month
  Principal LF categories: Employed [EMP], unemployed [UE], not-in-the-labor-force [NILF]
  Employed: Work for pay, one hour or more; unpaid work in family business, 15 hours or more; job (but absent last week)
  Data collected monthly via two modes [face-to-face and telephone CAPI; centralized CATI]
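The "Employed" definition on this slide can be read as a simple three-way rule. The sketch below is only an illustration of that reading: the function name, the parameter names, and the use of hour counts as inputs are assumptions, and the actual CPS classification draws on about 16 questionnaire items rather than three values.

```python
def is_employed(paid_hours, unpaid_family_hours=0.0, has_job_but_absent=False):
    """Simplified reading of the slide's 'Employed' definition.

    Hypothetical helper for illustration; not the CPS classification algorithm.
    """
    # Work for pay, one hour or more.
    if paid_hours >= 1:
        return True
    # Unpaid work in a family business, 15 hours or more.
    if unpaid_family_hours >= 15:
        return True
    # Has a job but was absent from it last week.
    return has_job_but_absent


# A person with 14 unpaid family-business hours and no paid work
# falls short of the 15-hour threshold under this reading.
print(is_employed(paid_hours=0, unpaid_family_hours=14))
```

Under this reading, a single paid hour is enough to count as employed, which is why a question that misses marginal paid work can shift the employment estimate.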
CPS: Some Relevant Details
The CPS was redesigned in the early 1990s; the redesign used a multiple-method evaluation plan (e.g., behavior coding, interviewer and respondent debriefings, split-ballot design) and generated a substantial amount of metadata
The CPS relies on about 16 questionnaire items to generate estimates for its three major labor force categories: EMP, UE and NILF (and various subcategories)
Again, CPS measures work, not jobs
The Measurement of Employment Status via the ACS
American Community Survey [ACS]
  Largest survey conducted in the USA; will replace the Decennial Census “long form”
  250,000 households a month
  Collects data on a broad range of demographic topics (e.g., population, housing, disability status, employment status, educational attainment, health insurance)
  Adheres to BLS employment concept with the same three major categories: EMP, UE and NILF
  Data collected continuously via three modes [SAQ (66%), CATI and face-to-face CAPI]
ACS: Some Relevant Details
The ACS was developed over a series of stages (starting in the early 1990s) and achieved full implementation in 2005; there is a substantial amount of metadata documenting this process
At present, the ACS relies on the content of six CPS items (modified for use in the ACS) to generate its estimates for three employment status categories: EMP, UE and NILF
Because of methodological/procedural differences, the CPS and the ACS cannot be expected to produce identical estimates
CPS: Work Item and DQ Issues [1]
CPS Work Question [No-business-in-household wording.]
  LAST WEEK, did you do ANY work for pay?
Data Quality [DQ] Issues, CPS Redesign
  Final evaluation phase (1992-93): Interviewers rated this item as one of the more problematic questions on the redesigned CPS (e.g., respondents asking “Just my job?” or “Do you mean my regular job?”)
  On the basis of other evaluation data (e.g., behavior-coding and response-distribution analyses), these “reports” by respondents were determined not to represent a serious data-quality issue because of the likelihood of interviewer mediation and “repair work”
CPS: Work Item and DQ Issues [2]
Data Quality Issues (continued)
  Respondent debriefing data indicated that this question did miss some marginal/paid work activity (1.6%): “In addition to people who have regular jobs, we are also interested in people who may only work a few hours per week. Last week, did [name] do any work at all, even for as little as one hour?”
  The evaluation work conducted during the redesign was documented extensively by Census Bureau and BLS researchers in the 1990s (e.g., conferences; papers; book chapter); however, much of this work is not cited in ACS evaluation documents
ACS: Work Item and DQ Issues [1]
Current ACS
  LAST WEEK, did this person do ANY work for either pay or profit? Mark (X) the “Yes” box even if the person worked only 1 hour, or helped without pay in the family business or farm for 15 hours or more, or was on active duty in the Armed Forces.
Data Quality Issues
  ACS underestimates employment (which compromises estimates in the other two categories, UE and NILF); see next slide
CPS vs. C2000/ACS Estimates
CPS/Census-2000 Match Study
  “Combined-Month Sample”: February through May, 2000, specific rotations; ~86,000 addresses; wt. N: 207,875,749
CPS vs. ACS-like employment status items
  EMP: 64.1% vs. 62.3% (underestimate)
  UE: 2.7% vs. 3.4% (overestimate)
  NILF: 32.8% vs. 34.0% (overestimate)
Note: The employment status items from the Census-2000 long form are identical to those used in the current ACS.
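The direction labels in the comparison above follow from simple subtraction, treating the CPS figure as the benchmark. The sketch below reproduces that arithmetic using the percentages from the slide; the names `ESTIMATES` and `compare` are hypothetical, and the rounding to one decimal place is an assumption made to match the precision of the reported figures.

```python
# Weighted percentages as reported on the slide: (CPS, ACS-like items).
ESTIMATES = {
    "EMP": (64.1, 62.3),
    "UE": (2.7, 3.4),
    "NILF": (32.8, 34.0),
}


def compare(cps, acs_like):
    """Return (difference in percentage points, direction relative to the CPS)."""
    diff = round(acs_like - cps, 1)
    if diff < 0:
        label = "underestimate"
    elif diff > 0:
        label = "overestimate"
    else:
        label = "no difference"
    return diff, label


results = {category: compare(*pair) for category, pair in ESTIMATES.items()}
for category, (diff, label) in results.items():
    print(f"{category}: {diff:+.1f} points ({label})")
```

The differences are in percentage points, not relative percentages; a shortfall in EMP necessarily shows up as an excess in UE, NILF, or both.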
ACS: Work Item and DQ Issues [2]
Data Quality Issues
  Small-scale evaluation [2004]: Expert reviews; behavior coding; focus groups with ACS interviewers
  Behavior coding [CATI site; 51 HHs; 104 persons]:
    INT codes: exact (78%); major changes (10%); data due in part to prior context [disability questions]
    RSP codes: adequate answers (98%); other than simple yes or no (21%); examples (e.g., “For pay, yes.”; Just his “regular job.”; “No, currently unemployed.”)
    Read-if-Necessary Statement: Never read
  Focus groups: “pay or profit” confusing; multiple-job holders and self-employed (e.g., “Did you mean, other than my regular job?”); read-if-necessary statement rarely read; some interviewers ask about job directly
ACS: Revised Work Items
Revisions to ACS Work Question
  (1A): LAST WEEK, did this person work for pay at a job (or business)? [If “no” to 1A, ask (1B).]
  (1B): LAST WEEK, did this person do ANY work for pay, even for as little as one hour?
Rationale
  Current ACS work question confuses some respondents: Why?
  Exploiting a two-part question appears to clarify the response task for some respondents and in so doing better achieves the objective of gathering accurate data on work activity and employment status
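The skip instruction in the revised item (ask 1B only after a “no” to 1A) can be sketched as a small routing function. This is an illustrative sketch, not the ACS instrument: the function `administer_work_items` and the `ask` callback are hypothetical, and answers are reduced to yes/no booleans.

```python
Q1A = "LAST WEEK, did this person work for pay at a job (or business)?"
Q1B = "LAST WEEK, did this person do ANY work for pay, even for as little as one hour?"


def administer_work_items(ask):
    """Route through the two-part work item.

    `ask` is a hypothetical callback: it takes the question text and
    returns True for "yes" and False for "no".
    Returns (worked_last_week, list_of_questions_asked).
    """
    asked = [Q1A]
    if ask(Q1A):
        # A "yes" to 1A settles the matter; 1B is skipped.
        return True, asked
    # Only a "no" to 1A triggers the follow-up about marginal work.
    asked.append(Q1B)
    return ask(Q1B), asked


# A respondent who says "no" to 1A but "yes" to 1B (e.g., occasional side work).
worked, asked = administer_work_items(lambda question: question == Q1B)
print(worked, len(asked))
```

Splitting the item this way means a respondent with a regular job never sees the "ANY work" wording, while marginal work is still picked up by the follow-up.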
Estimates of Labor-Force/ Employment Status
2006 ACS Content Test
  January to March 2006; ~63,000 addresses, equally split between control/current vs. test/revised groups
Current vs. revised ACS items
  EMP: 62.8% vs. 65.7% (plus 2.9%)*
  UE: 4.1% vs. 3.6% (minus 0.5%)
  NILF: 33.1% vs. 30.7% (minus 2.4%)*
Revised items manifest less bias and variability, as well
The CPS Work Item: Why might it be problematic for some respondents?
Grice (1975): Maxims on Quantity
  1. Make your contribution as informative as is required (for the current purposes of the exchange).
  2. Do not make your contribution more informative than is required.
Fowler (1995): Principles 3 and 3d.
  Principle 3: A survey question should be worded so that every respondent is answering the same question.
  Principle 3d: If what is to be covered is too complex to be included in a single question, ask multiple questions.
Invoking Grice on Quantity: Hypothetical Example [ACS/SAQ]
LAST WEEK, did you do ANY work for pay?
Respondent [full-time job]: How should I answer this [#!&?@] question? It doesn’t mention a “job” and probably would if that’s what they wanted to know. And it specifically says “work for pay”, so it must mean doing work on the side. OK, just check the “no” box.
Reference to a “job” is missing. [Maxim 1]
“Work for pay” is specified, which would seem superfluous (especially for someone with a full-time job): Who works all those hours for free? [Maxim 2]
Resolution for ACS: Two-Part Work Item
Revisions to ACS Work Question
  (1A): LAST WEEK, did this person work for pay at a job (or business)? [If “no” to 1A, ask (1B).]
  (1B): LAST WEEK, did this person do ANY work for pay, even for as little as one hour?
Part (1A) specifically mentions “job”, “work for pay” and “business”.
Part (1B) captures work for “as little as one hour”.
Not perfect, but better than current ACS item.
Closing Remarks
Even survey questions that appear simple and straightforward may not be for some respondents. [Key issues: Why and how many respondents affected?]
It is risky to import questions from one survey to another, especially when the surveys differ in terms of mode of administration (and in various other ways, too).
In evaluating and “fixing” questionnaire items, quantitative research, alone, is not sufficient.
Summary: Our best hope for optimizing data quality (i.e., minimizing measurement error) is a thorough and critical review of relevant metadata, followed by prudent design and evaluation decisions that are informed by such reviews.
Thank You
Questions or comments?
Post-workshop: [email protected]