International Collaboration on Industrialization of Editing: Business Case (Part 1, WP38)

Li-Chun Zhang, Statistics Norway


Page 1:

International Collaboration on Industrialization of Editing: Business Case (Part 1, WP38)

Li-Chun Zhang
Statistics Norway

Page 2:

Industrialization of Editing: Some issues to be dealt with

• Overall objective, principles and guidelines (e.g. the “new” paradigm of editing)

• Conceptual reference framework with regard to GSBPM

• Conceptual reference framework with regard to GSIM to-be

• Design of generic functionality

• Minimum set of standard methods

• IT tools and platforms

Page 3:

Objectives & principles

• Example: Objectives (the “new” paradigm)
– Error-source identification and error prevention
– Collect information about quality
– Identification and adjustment of critical errors in data

• Example: Objectives (SNZ proposal)
– Efficiency as quality against cost
– Continuous quality improvement
– Provide quality information

• Example: Principles
– Original data as much as possible (“old” Fellegi-Holt paradigm)
– Maximum automated processing
– Analysis of (editing) process efficiency
– Training, documentation
– …

Page 4:

Generic Statistical Data Editing Process (GSDEP)

• GSBPM ≠ Flow Chart

• An example from EDIMBUS

• Mapping GSDEP with GSBPM
– Micro vs. macro editing
– “Editing & Imputation” (E&I) vs. “Editing & Estimation” (E&E)

• Connections to GSIM to-be

Page 5:

Common Statistical Data Reference (CSDR): Interface between SDE and GSIM to-be

• Statistical production as transformations of data => steady / major states of data

• Common Micro Data Format (CMDF) for database management

• Common Functional Data Format for method library

Page 6:

Design of generic functionality

• Databases
– Micro database of CMDF data files (M-Base)
– Functional database of functional data files and alignment tables (F-Base)
– Function library (F-Lib) contains all available standardized generic (program) tools.

• Builders
– Functional data builder (D-Build) transforms relevant CMDF data files into the required functional data files, and updates the relevant alignment tables.
– Function builder (F-Build) takes functional data files as the input data and tools from the F-Lib, and configures the necessary parameters according to a given specification for machine-based or automated data processing.
– Screen builder (S-Build) takes functional and/or CMDF data files as the input data, and configures an environment for manual inspection/editing of individual records/questionnaires according to a given specification.

• Runners
– Batch processor is the environment for executing automated/machine-based SDE processes, chiefly relying on functions that are configured in the F-Build.
– Manual processor is the environment for manually executing SDE processes, chiefly relying on the interface provided through the S-Build.
– Selection and Drilling are the dedicated environments for carrying out selective editing and drilling up-and-down among hierarchically structured aggregations.
– Data processor supports the necessary administration of data and metadata.

• Managers
– ANOPE is the environment for quality assessment of the editing processes.
– Response manager provides the interface for re-contact with the data providers, and other generally related production processes (such as Process 4 Collect).
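
One way to picture how the databases, builders and batch processor on this slide could fit together is the minimal Python sketch below. It is an illustration only: the slides define no interface, so every name in it (`FLib`, `d_build`, `f_build`, `batch_run`, the `cap_values` tool and the `turnover` variable) is a hypothetical stand-in, and the alignment tables that would link functional rows back to units are left out for brevity.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical types: a micro record is one row of a CMDF-like file,
# functional data is the numeric table that a generic tool consumes.
Record = Dict[str, object]
FunctionalData = List[Dict[str, float]]


@dataclass
class FLib:
    """Stand-in for the function library (F-Lib): a registry of generic tools."""
    tools: Dict[str, Callable[..., FunctionalData]] = field(default_factory=dict)

    def register(self, name: str, tool: Callable[..., FunctionalData]) -> None:
        self.tools[name] = tool


def d_build(micro: List[Record], variables: List[str]) -> FunctionalData:
    """Stand-in for D-Build: extract the requested variables as numeric rows."""
    return [{v: float(r[v]) for v in variables} for r in micro]


def f_build(flib: FLib, tool_name: str, **params) -> Callable[[FunctionalData], FunctionalData]:
    """Stand-in for F-Build: pick a tool from F-Lib and fix its parameters."""
    tool = flib.tools[tool_name]
    return lambda data: tool(data, **params)


def batch_run(step: Callable[[FunctionalData], FunctionalData],
              data: FunctionalData) -> FunctionalData:
    """Stand-in for the batch processor: execute a configured function."""
    return step(data)


def cap_values(data: FunctionalData, variable: str, upper: float) -> FunctionalData:
    """A toy generic tool (an assumption, not from the slides): cap a variable."""
    return [{**row, variable: min(row[variable], upper)} for row in data]


# Usage: register a tool, build functional data, configure and run one step.
flib = FLib()
flib.register("cap_values", cap_values)
m_base = [{"id": "a", "turnover": 120}, {"id": "b", "turnover": 9000}]
functional = d_build(m_base, ["turnover"])
step = f_build(flib, "cap_values", variable="turnover", upper=1000.0)
result = batch_run(step, functional)
```

The point of the sketch is the separation of concerns: the tool in the library knows nothing about a particular survey, and the survey-specific choices (which variables, which parameters) enter only through the two builders.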

Page 7:

Claude Poirier
Statistics Canada

Next steps

• Objectives, guidelines and principles
– Finalize user requirements
– Identify existing methods
– React to functional gaps
– Set up the framework
– Develop the toolset
– Deliver training

Page 8:

Finalizing user requirements

• Prioritizing edit and imputation requirements
– Micro-editing methods: Automated E&I on numerical and categorical data
– Macro-editing methods: Selective editing; Macro editing; Editing of macro data
– On-line editing: Collection edits and self-administered edits
– Data confrontation and certification: Methods using multiple data sources
– Standardized platform: Common architecture
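
To make the selective editing requirement concrete, the sketch below uses a common textbook form of a local score: the weighted absolute difference between the reported and an anticipated value, scaled by an estimate of the total. This is an assumption for illustration, not the method of any tool named in this presentation, and the field names, example data and threshold are all hypothetical.

```python
from typing import Dict, List


def local_scores(units: List[Dict], total_estimate: float) -> Dict[str, float]:
    """Score each unit by its potential impact on the estimated total.

    score = weight * |reported - anticipated| / total_estimate
    """
    return {
        u["id"]: u["weight"] * abs(u["reported"] - u["anticipated"]) / total_estimate
        for u in units
    }


def select_for_manual_review(scores: Dict[str, float], threshold: float) -> List[str]:
    """Return ids whose score exceeds the threshold, highest score first."""
    flagged = [uid for uid, s in scores.items() if s > threshold]
    return sorted(flagged, key=lambda uid: -scores[uid])


# Hypothetical example data: unit "b" reports ten times its anticipated value.
units = [
    {"id": "a", "weight": 10.0, "reported": 100.0, "anticipated": 110.0},
    {"id": "b", "weight": 5.0, "reported": 5000.0, "anticipated": 500.0},
    {"id": "c", "weight": 20.0, "reported": 60.0, "anticipated": 50.0},
]
scores = local_scores(units, total_estimate=10000.0)
flagged = select_for_manual_review(scores, threshold=0.05)
```

Only units above the threshold would go to manual review; the rest would be left to automated processing, which is the efficiency argument behind selective editing.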


Page 9:

Existing tools and platforms

• Identifying and analysing existing products
– SigEE (Australia)

– BANFF, CANCEIS (Canada)

– BEST, POSS (New Zealand)

– ISEE, DYNAREV (Norway)

– TRITON, SELEKT (Sweden)


Page 10:

Reacting to functional gaps

• Not all requirements will be satisfied

• Brainstorming sessions are being organised

• Development priorities will be discussed

Developing the tool set

• Consolidate preferred tools
– Adapt existing tools to the environment
– Develop pre/post processors to fit the environment

• Develop missing functions

Page 11:

Delivering training material

• User guide

• Methodology documentation

• System documentation

Comments / Questions

• It’s your turn


Page 12:

Frequently asked questions (FAQ)

Q1: What governance model drives the project?

Q2: When do we expect the suite of editing functions to be delivered?

Q3: As a member of the collaboration network, will my agency have to pay any fees for accessing and using released functions?

Q4: My statistical agency is not part of the network. Are any fees planned for letting me use the products?

Q5: My agency would like to join the network. Is this possible? How?

Page 13:

Frequently asked questions (FAQ)

Q6: I understand from your presentation that a common environment is being planned. Would I be able to use the functions in another environment?

Q7: My agency is willing to share a system but its foundation software is not compliant with the proposed environment. What will happen?

Q8: My agency is willing to offer a system or a module for the network. Who will own the module?

Q9: Will the resulting products become open-source?