Authors: M. Murgia – murgia@istat.it

A. Nunnari – nunnari@istat.it

Presented by: M. Murgia

UNECE – Seminar on New Frontiers for Statistical Data Collection

Geneva, 2 November 2012

Improve the quality on data collection: minimum requirements for a generalised software independently from the mode

Content of the presentation

Aims of this presentation

Mixed-mode: state of the art in Istat

Adoption of a mixed-mode strategy

A software solution to support mixed-mode surveys

Requirements for a generalised software and evaluation criteria

Results and future directions

Aims of this presentation

To describe the mixed-mode strategy adopted by the Italian NSI (Istat), in terms of both questionnaire design and software solution.

To identify criteria that help create an integrated data collection system based on generalised functions covering all steps of the data collection phase.

In a few words: how technology can best support methodology.

Mixed-mode: state of the art in Istat

What we mean by mixed-mode

The combined use of any data collection techniques for questionnaire administration as well as for the reminder and/or follow-up phases.

This means that a mixed-mode strategy has an impact on the entire phase: from the design of the data collection methodology to the finalisation of collected data (sub-processes 2.3 to 4.4 of the GSBPM).

Why use a mixed-mode strategy? It is a potentially optimal solution for facing, all together, the problems of:

• budget cuts

• low response rates

• low land line coverage

Mixed-mode experiences in Istat

Oldest experiences: mix of traditional data collection modes

Business target: Mail-CATI surveys

Population target: CATI-CAPI: Labour Force Survey

More recently: WEB has been included in the mix

Business target: Mail-WEB and WEB-CATI surveys

Population target (even more recently):

- CATI-WEB: 2009 PhD graduates survey

- Mail-WEB: 2011 Population Census

The near future

The oldest mixed-mode approach is based on “traditional” techniques applied sequentially.

This approach is no longer suitable for facing budget, response rate and land line coverage problems. They can only be tackled by adopting a different approach that involves:

- any order of mode mixing: parallel or sequential

- any type of data collection technique: traditional (mail, CATI, CAPI) and less traditional (WEB)

- any type of data collection instrument: traditional (PC) and innovative (mobile phones, smartphones, tablets, etc.)

Istat is moving toward this approach.

The near future

Methodological (questionnaire design) and technical (web and new hardware) issues still have to be solved by Istat.

WEB in mixed mode

• Many experiences with business surveys, but with a “simple” questionnaire design strategy guided by a main technique; the other modes are used only in a second step to cover missing strata.

• Two experiences with population surveys, with minor or no concerns about questionnaire design and few issues in terms of response rate:

- The PhD graduates survey used two different questionnaires (different surveyed phenomena); the Population Census used a main mail-specific questionnaire.

- High response rates for web, more than 30% in both cases, owing to the high education level of respondents for the PhD graduates survey and to a massive advertising campaign for the Population Census.

Mobile phones in mixed mode

Mobile phones are used in CATI surveys for respondents with no land line phone, but there is no experience in combining them with other methods. Their combined use requires addressing:

- methodological issues (sampling frame, coverage, survey environment), which are out of the scope of this presentation;

- technical issues: adapting the questionnaire layout to a smaller screen resolution when used in mixed-mode with web. The same problems arise with smartphones, tablets, etc.

Besides, for population surveys, the use of web and mobile phones is both a help and a necessity for solving budget, response rate and land line coverage problems, as shown in the chart:

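[Chart: Internet use, Own a land line and Mobile only, 2001 to 2011 (no value for 2004); vertical axis from 0 to 90. Data labels as plotted:]

Year  Internet use  Own a land line  Mobile only
2001  27            84.7             10.2
2002  27.2          83.1             12.9
2003  29.9          81.3             14.7
2005  31.8          79.8             17.1
2006  34.1          76.8             19.4
2007  36.8          74               23.2
2008  40.2          71.9             25.6
2009  44.4          69.6             28.4
2010  48.9          69.6             28.4
2011  51.5          67.1             30.9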

Adoption of a mixed-mode strategy

What is needed

Methodology: the design of a data collection strategy aimed at containing the portion of non-sampling error due to mode effect.

Mode effect = differences in collected data due to the characteristics of the mode (measurement error) and not to real differences. It is therefore always present in collected data, even if a single mode is used.

In mixed-mode, care is needed not to increase non-sampling error by adding a mix of measurement errors.

Technology: the use of data collection instruments that are mode-insensitive.

Methodological side of the world

TASK: To create an optimally designed mixed-mode strategy aimed at reducing the risk of greater non-sampling error due to coverage, measurement, frame and non-response errors.

Technological side of the world

TASK: To create an optimal data collection system able to implement the designed strategy and aimed at reducing the risk of greater non-sampling error due to complexity.

Complexity = duplication of efforts to implement the questionnaire across modes.

Complexity increases survey costs, delays in data delivery and measurement errors.

Methodological side of the world

A data collection strategy can be designed in different ways, each affecting the questionnaire design differently.

Literature review (De Leeuw, Dillman):

• One main mode, the others are secondary or auxiliary. Questionnaire design: mode-enhancement construction approach.

• All modes are equally important. Questionnaire design:

- mode-specific construction or maximisation method

- uni-mode approach

- generalised mode design

In the generalised mode design, the questionnaire is purposely designed to be different for each mode in order to reach the cognitive equivalence of the perceived stimulus: “… the same offered stimulus is not necessarily the same perceived stimulus” (De Leeuw 2005).

The generalised approach is not easy. It requires:

- identifying the differences between modes that influence the cognitive process of answering

- using cognitive tests to demonstrate that different question formats elicit equivalent answers

But it seems able to answer Istat's needs (to combine any data collection mode and instrument).

How to implement it? Through the creation of an integrated data collection system, based on generalised functions covering all steps of data collection.

A software solution to support mixed-mode strategy

Technological side of the world

TASK: to reduce complexity

HOW:

1) Consolidation: either i) to create from scratch a single all-purpose, highly integrated system, or ii) to use available tools and make them speak common languages, share the same data representations and meet functional standards.

2) Generalisation: to abandon ad hoc procedures in favour of generalised ones. Three main dimensions of generalisation:

- Data collection technique

- Class of respondents

- Software and hardware platforms

3) Questionnaire abstraction: to design the questionnaire independently from its implementation in any collection mode (user side); the system should be flexible enough to support any mode-specific changes to the questionnaire.
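
As an illustration only, questionnaire abstraction could be sketched as follows: a question is defined once, independently of the mode, and rendered differently per mode, with optional mode-specific wording. The class, field and function names, the rendering rules and the example question are all assumptions made for this sketch; they do not describe any Istat tool.

```python
from dataclasses import dataclass, field


@dataclass
class Question:
    """Mode-independent definition of a single question (hypothetical model)."""
    name: str
    text: str
    options: list[str] = field(default_factory=list)
    # Optional mode-specific wording, keyed by mode name (e.g. "CATI").
    overrides: dict[str, str] = field(default_factory=dict)

    def wording(self, mode: str) -> str:
        """Return the wording for a mode, falling back to the common text."""
        return self.overrides.get(mode, self.text)


def render(question: Question, mode: str) -> str:
    """Render one abstract question for a given collection mode."""
    wording = question.wording(mode)
    if mode == "CATI":
        # Interviewer script: options are read aloud on a single line.
        return f"READ OUT: {wording} ({', '.join(question.options)})"
    # Self-administered modes (WEB, mail): one option per line.
    lines = [wording] + [f"  [ ] {opt}" for opt in question.options]
    return "\n".join(lines)


# Invented example question with a CATI-specific wording.
q = Question(
    name="internet_use",
    text="Do you use the Internet?",
    options=["Yes", "No"],
    overrides={"CATI": "Do you personally use the Internet?"},
)
print(render(q, "WEB"))
print(render(q, "CATI"))
```

The point of the sketch is that adding a mode means adding a renderer, while the questionnaire definition itself is written only once.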

How to achieve the three objectives? (Consolidation, Generalisation, and Questionnaire abstraction)

Istat case study

A workgroup to define requirements that a data collection system must have in order to be considered as generalised.

Requirements have to be applicable to the entire process of data collection.

Four areas for requirements:

1) Survey units management: managing the information needed to contact respondents and to logically define each of them as a user in the system: name, address, phone, e-mail, username and password, etc.

2) Electronic questionnaire: instrument to collect micro-data

3) Data collection management: real-time administrative tools for conducting and monitoring the data collection process: management of user grants, first validation tools, questionnaire tracking systems, reporting tools etc.

4) Communication facilities: tools for exchanging information with survey respondents: helpdesk system, content management system, automatic reminders management, etc.

Istat case study: requirements for the Electronic questionnaire

Present situation in Istat: many data collection tools are available, developed independently for specific contexts and with different technologies.

First we need to know:

1) what is already available: to take an accurate inventory in order to avoid redundancy and duplication of effort;

2) what we should require from it: to define standards and requirements;

3) what fully or partially already meets these standards, or can more easily and cost-effectively be brought into compliance.

To answer these three questions, we collected information on the tools and, at the same time, started to devise functional requirements on the basis of which to evaluate them.

Meetings with IT project managers were held to create a feedback process.

Requirements for a generalised software and evaluation criteria

Evaluation form: criteria

The tools differ widely and are not directly comparable. The evaluation form was created to cover the majority of topics, trying to obtain a common minimum but exhaustive set of information.

Two categories of evaluation criteria:

Cross-sectional criteria: refer to an evaluation of the actual facilities provided by the tools at time t.

Longitudinal criteria: take into account potential assets in order to assess possible lines of development.

Evaluation form: Cross-sectional criteria

1. Usability

2. Flexibility

3. Completeness of functions

4. Generalisation of functions

5. Integration with XML data representation model

6. Independence from proprietary systems

7. Cross-browser compatibility

8. Platform compatibility

1. Usability

A primary requisite is that even non-computer experts (statistical researchers) are able to use authoring tools without the mediation of dedicated IT personnel. Results: an increase in quality and a reduction of training and support costs. Criteria to evaluate usability:

• user documentation availability;

• presence of a user interface;

• ability to reuse and modify existing objects and data:

- metadata already defined;

- templates of questionnaire layout.

2. Flexibility

Adaptability of the software to multiple classes of respondents and data acquisition techniques. Capability to handle questionnaires with different degrees of complexity and different ways of administering questions. Abstraction of the questionnaire object would help flexibility and would make it possible to apply adaptive collection techniques (e.g. mode-switching). Criteria to evaluate flexibility:

• completeness of functions (explained below);

• presence of a metadata-driven architecture;

• modularity.

3. Completeness of functions

a) to implement all types of question formats;

b) to implement all types of check rules (both client- and server-side);

c) to manage question sequences dynamically (smart branching, skip-and-fill…);

d) to manage linkage to external archives for look-ups, form pre-fill, cross-referencing, etc.;

e) to be able to perform computer-assisted coding through optimised text matching algorithms;

f) to allow the controlled upload of the requested data as files (ASCII, spreadsheet, etc.);

g) to allow the respondent to export the questionnaire (empty or filled) for reference, printing and/or archiving;

h) to allow the respondent or the interviewer to complete the survey in multiple sessions (saving and later retrieving the questionnaire);

i) to enable questionnaire-sharing (concurrent access to a single questionnaire);

j) to implement loop functions: question loops, page or block loops, questionnaire loops, loop-and-merge facilities;

k) multilingualism.
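
Two of the functions listed for this criterion, check rules and dynamic question sequencing, could be sketched as below. This is a minimal illustration under stated assumptions: the question names, the routing table and the check rules are invented for the example and are not taken from any Istat questionnaire or tool.

```python
from typing import Callable, Optional

Answers = dict[str, object]

# Hypothetical routing table: each question maps the answers given so far
# to the name of the next question (None means the questionnaire ends).
ROUTING: dict[str, Callable[[Answers], Optional[str]]] = {
    "employed": lambda a: "hours_worked" if a["employed"] == "yes" else "seeking_work",
    "hours_worked": lambda a: None,
    "seeking_work": lambda a: None,
}

# Hypothetical check rules: a predicate plus the message shown on failure.
CHECKS: dict[str, tuple[Callable[[object], bool], str]] = {
    "employed": (lambda v: v in ("yes", "no"), "answer must be yes or no"),
    "hours_worked": (
        lambda v: isinstance(v, int) and 0 <= v <= 168,
        "hours must be between 0 and 168",
    ),
    "seeking_work": (lambda v: v in ("yes", "no"), "answer must be yes or no"),
}


def validate(name: str, value: object) -> Optional[str]:
    """Return an error message if the value breaks the check rule, else None."""
    predicate, message = CHECKS[name]
    return None if predicate(value) else message


def path(answers: Answers, start: str = "employed") -> list[str]:
    """Return the questions reached so far, following the routing rules."""
    asked: list[str] = []
    current: Optional[str] = start
    while current is not None:
        asked.append(current)
        if current not in answers:
            break  # this question is still waiting for an answer
        current = ROUTING[current](answers)
    return asked


print(path({"employed": "no", "seeking_work": "yes"}))
print(validate("hours_worked", 200))
```

Because the routing and the checks are plain data, the same definitions could in principle be evaluated on the server for CATI or CAPI and reproduced on the client for a web form, which is what item b) asks of a generalised tool.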

4. Generalisation of functions

Each software component has to be able to deal with a changing environment by allowing variable data to be introduced into the system through parameterisation.

5. Integration with XML data representation model

The system must support data interoperability at three levels:

a) data exchange between items in the same data collection toolbox;

b) data exchange with the tools used in other phases of the statistical process;

c) integration with tools for data collection and transmission used at European and international level.

Interoperability must also be ensured along two dimensions:

- Syntactic interoperability: sharing of formats and transmission protocols for the exchange of data.

- Semantic interoperability: definition and supply of the metadata necessary to interpret shared data.
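
The two interoperability dimensions named on this slide could be illustrated with a small XML round trip: the shared XML format stands for the syntactic side, and the metadata attached to each item stands for the semantic side. The element names, attribute names and sample values are assumptions made for this sketch and do not follow any particular standard or Istat schema.

```python
import xml.etree.ElementTree as ET


def to_xml(survey_id: str, answers: dict[str, str],
           metadata: dict[str, dict[str, str]]) -> str:
    """Serialise answers to XML, attaching descriptive metadata to each item."""
    root = ET.Element("response", {"survey": survey_id})
    for name, value in answers.items():
        attributes = {"name": name, **metadata.get(name, {})}
        item = ET.SubElement(root, "item", attributes)
        item.text = value
    return ET.tostring(root, encoding="unicode")


def from_xml(document: str) -> dict[str, str]:
    """Read the answers back from the XML produced by to_xml."""
    root = ET.fromstring(document)
    return {item.get("name"): item.text for item in root.iter("item")}


# Invented sample data for the illustration.
answers = {"hours_worked": "40"}
metadata = {"hours_worked": {"type": "integer", "unit": "hours per week"}}
document = to_xml("demo_survey", answers, metadata)
print(document)
print(from_xml(document))
```

A receiving tool that only parses the XML gets the values; one that also reads the attributes knows how to interpret them, which is the distinction the slide draws.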

6. Independence from proprietary systems

Several advantages:

• lower costs of acquisition and management;

• greater independence from suppliers;

• greater control over the code, thus enhancing its flexibility.

Open technologies allow solutions to be shared with the developer community of other NSIs.

7. Cross-browser compatibility

The software should ensure that the layout and functionality of the web questionnaire do not vary as different browsers are used to access it. A cross-browser application ensures the smooth completion of the questionnaire even when the specifications of the respondent’s client environment are unknown or not directly controllable, as in business or household/population surveys.

8. Platform compatibility

The progressive widening of the range of users adopting emerging technologies makes platform compatibility an essential requirement: users must be able to easily fill in the electronic questionnaire even on devices such as PDAs, netbooks, laptops, smartphones, tablets, MIDs and UMPCs.

Evaluation form: Longitudinal criteria

1. Generalisability of available ad hoc functions

2. Modularity of functions

3. Logical and semantic abstraction

4. Compliance with recognised standards for data and metadata description

Results and future directions

Data collected from the returned forms and from the meetings with IT project managers were merged, normalised and table-formatted in order to enhance comparability.

The result (largely expected) was that none of the surveyed software tools was found fully compliant with all the proposed requirements.
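
A table-formatted comparison of this kind could be sketched as a simple compliance matrix over the eight cross-sectional criteria. The criteria names come from the evaluation form; the tool names and the criteria each tool meets are invented placeholders, not the actual evaluation results, which are not reported in this presentation.

```python
# The eight cross-sectional criteria of the evaluation form.
CRITERIA = [
    "usability",
    "flexibility",
    "completeness of functions",
    "generalisation of functions",
    "integration with XML data representation model",
    "independence from proprietary systems",
    "cross-browser compatibility",
    "platform compatibility",
]

# Placeholder data: criteria met by two hypothetical tools.
EVALUATIONS: dict[str, set[str]] = {
    "tool_a": {"usability", "flexibility", "cross-browser compatibility"},
    "tool_b": set(CRITERIA) - {"platform compatibility"},
}


def missing(met: set[str]) -> list[str]:
    """List the criteria a tool does not meet, in evaluation-form order."""
    return [criterion for criterion in CRITERIA if criterion not in met]


def fully_compliant(evaluations: dict[str, set[str]]) -> list[str]:
    """Return the tools that meet every criterion."""
    return [tool for tool, met in evaluations.items() if not missing(met)]


for tool, met in EVALUATIONS.items():
    print(tool, "is missing:", missing(met))
print("fully compliant:", fully_compliant(EVALUATIONS))
```

Listing what each tool is missing, rather than only a pass or fail flag, is what makes such a table usable for deciding which tool can most easily be brought into compliance.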

The resulting data represent a solid foundation to go on with the analysis of the toolboxes and for the definition of the enterprise architecture standards for data collection in Istat.

Final results are expected by the end of 2012 as they are part of the “Stat2015” project aimed at the standardisation and industrialisation of the entire cycle of ISTAT statistical processes, according to a model based on a metadata-driven and service oriented architecture.
