19.11.2004Toni Räikkönen Data Collection in Statistics Finland now and in the Future.

19.11.2004 Toni Räikkönen

Data Collection in Statistics Finland

now and in the Future

Topics

General background of the data collection in Statistics Finland

Internet-based data collection
Self-made web data collection applications
XCola (XML-based Collection Application)

Primary objectives in data collection

reduce data supply burden of respondents
speed up data production
lower data collection costs
improve the quality of data
remove overlapping collection and promote joint use of the collected data between different authorities

Background

About 96 % of the data is collected from administrative registers

About 4 % of the data is collected directly from respondents

paper forms, Excel sheets
web collection applications
interviews by CATI/CAPI systems, mainly using Blaise software

Result agreement with the Ministry of Finance
All respondents (enterprises, communes, schools) should have the possibility to transmit their data electronically by the end of 2006.

Web Data Collection Applications in Statistics Finland 15.11.2004

1. Self-made web collection applications

| Inquiry | Unit | Ready for production | Status | Software | Made by | Maintenance |
|---|---|---|---|---|---|---|
| Rakennuskustannusindeksi | YS | 2001 | in production | VB6 | TK/Räikkönen | Räikkönen |
| Aluebarometri | EL | 2001 | in production | VB6 | TK/Räikkönen | Räikkönen |
| Myyntitiedustelu | YS | 2002 | in production | ASP.NET | TK/Hedman | Esikot / Lanu |
| Teollisuuden volyymi-indeksi | YS | 2002 | in production | ASP.NET | TK/Hedman | Esikot / Lanu |
| Varastotiedustelu | YS | 2003 | in production | ASP.NET | TK/Hedman | Esikot / Lanu |
| Optiotiedustelu | YS | 2003 | in production | ASP.NET | TK/Hedman | Esikot / Lanu |
| Energiankäyttötiedustelu | YR | 2004 | in production | ASP.NET | TK/Piela | Maarit Asp |
| Tuottajahintaindeksi | HP | 1.10.2004 | in test | ASP.NET | TK/Asp | Maarit Asp |
| Kuntien toimintayksiköt | HE | 30.11.2004 | in test | ASP.NET | TK/Piela | ? |
| Majoitustilasto | YS | 1.1.2005 | under construction | XCola | TK/Snellman | Kesete ? |
| Teollisuuden uudet tilaukset | YS | 1.1.2005 | under construction | XCola | TK/Snellman | Kesete ? |
| Varallisuustutkimus | EL | 1.1.2005 | under construction | Blaise IS | TK/Lauri | Lauri ? |
| Palvelujen hintaindeksi | HP | 2005 | planned | ASP.NET | TK/Asp | ? |
| Maatilatalouden yritys- ja tulotilasto | TO | 2005 | planned | XCola | TK/Räikkönen | Topso / Helander |
| Tavarankuljetus, kotimaan liikenne | YS | 2005 | planned | XCola | TK/Kesete | ? |
| Tavarankuljetus, ulkomaan liikenne | YS | 2005 | planned | XCola | TK/Kesete | ? |
| Rakennusyritysten korjausrakentaminen | YS | 2005 | planned | XCola | ? | |
| YTR:n yksitoimipaikkaiset | YS | 2005 | planned | XCola | ? | YREK |
| YTR:n laatutiedustelu | YS | 2006 | planned | XCola | ? | YREK |
| YTR:n monitoimipaikkaiset | YS | 2006 | planned | XCola | ? | YREK |
| YTR:n uusien tiedustelu | YS | 2006 | planned | XCola | ? | YREK |
| Kulutustutkimus | EL | 2006 | planned | Blaise IS | TK | |

2. Web collection applications (have been / being / will be) built outside of Statistics Finland

| Inquiry | Unit | Ready for production | Status | Software made by |
|---|---|---|---|---|
| Peruskoulut | HE | 1997 | in production | ELMA |
| Oppilaitokset ja opiskelijat | HE | 2000 | in production | ELMA |
| Lukioiden oppilasvalinnat | HE | | in production | ELMA |
| Lukioiden henk.pohj. opiskelija-ain. | HE | | in production | ELMA |
| Kuntatiedonkeruu | | | in production | ELMA |
| Kuntien neljännesvuositilasto | TO | | in production | ELMA |
| Kuntien toimintatilasto | TO | 2003 | in production | ELMA |
| Kuntien taloustilasto I | TO | 2003 | in production | ELMA |
| Kuntien taloustilasto II | TO | 2003 | in production | ELMA |
| Kuntien palkkatiedustelu | HP | 2004 | in production | ELMA |
| Teollisuuden toimipaikkatiedustelu T5 | YR | 2004 | in production | ELMA |
| Ajankäyttötiedustelu | YR | 2004 | in production | ELMA |
| Tietotekn. ja sähk. kauppa yrityksissä | YR | 2005 | under construction | ELMA |
| Palvelujen ulkomaankauppa | YR | 2005 | under construction | ELMA |
| Yritysten tutkimus ja kehittäminen | YR | 2005 | planned, ordered | ELMA |
| Ympäristönsuojelumenot | YR | 2005 | planned, ordered | ELMA |
| Televiestintä | YR | 2005 | planned, ordered | ELMA |
| Tilinpäätöstietojen lisätiedot TILKES | YR | 2005 | planned | ELMA ? |
| Yksityisen sektorin palkat | HP | 2005 | planned | ELMA ? |
| Julkisen sektorin tutkimus ja kehittäminen | YR | 2006 | planned | ELMA ? |
| Linja-autoliikenne | YR | 2006 | planned | ELMA ? |
| Energiantuotanto | YR | 2006 | planned | ELMA ? |
| Hyödyketiedustelu | YR | 2006 | planned | ELMA ? |

3. Other paper form inquiries waiting for a web collection application

| Inquiry | Unit | Ready for production | Status | Software made by |
|---|---|---|---|---|
| Kuntien henkilöstötiedustelu | HE | 2005 | planned | Statistics Finland ? |
| Valtion tuottavuustilastointi | TO | 2005 | planned | ? |
| Luottokorttitilasto | TO | 2005 | planned | ? |
| Maa- ja metsätyöntekijöiden palkat | HP | 2005 | planned | ? |
| Yritysten innovaatio | YR | 2005 | planned | ? |
| Ammattitiedustelu | HE | 2006 | planned | ? |
| Asunto-osakeyhtiöiden taloustilasto | HP | 2006 | planned | ? |
| Kuorma-autoliikenteen kustannukset | HP | 2006 | planned | ? |
| Kaupan määräaikaisselvitys | YR | 2007 | planned | ? |
| Työvoimakustannustutkimus | HP | 2008 | planned | ? |
| Pääomakantatiedustelu | YR | ? | planned | ? |
| Palvelujen hyödykekysely | YR | ? | planned | ? |

Data collection in Statistics Finland by type and media used

Indirect data collection: 94 %
EDI: 100%

Direct data collection: 6 %
CAPI: 2%
Paper: 2%
EDI: 2%

Data flows

Different types of data flows
data are needed only by Statistics Finland
the same data are needed by several administrative organizations
interviews made by CATI/CAPI system

Different solutions
using external teleoperator for distributing data to different data collectors (TYVI model)
self-made web-based system
Blaise solution for carrying out interviews

The TYVI model
Data Flows from Enterprises to Authorities

interfaces and transmission
data capture
data refining
management of user accounts

Participants
The enterprises
The TYVI-operators
The authorities

The authority does not need to be in a many-to-many relationship with the respondents

[Figure: The TYVI model (Vallaskangas 1998). Diagram labels: enterprises, software house, accounting office; TYVI operators (FTP, HTTP); tax administration, Statistics Finland, others.]

Internet-based collection of data

Case: Building Cost Index

General background

Fall 2000
All existing electronic data collections were handled by 3rd party operators (TYVI model)
The production system of Building Cost Index was under reconstruction and lacked web-based data collection

About Building Cost Index (Business Trends)

~300 respondents (hardware stores, wholesale stores, plumbing stores etc.)

Price information of 1-15 products collected from each respondent every month

Paper forms are usually sent on the 15th day and expected back around the 25th day

The design goals of the web system

Provide a means of web-based collection of statistical data

No extra burden (no installations, no JavaScript-based solutions etc.)

“Live” feedback to the respondents (upon validations etc.)
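
One way to give "live" feedback without any client-side scripting is to validate on the server and send the messages back with the form. The sketch below is only an illustration of that idea: the function name `validate_prices`, the field names and the 30 % change threshold are assumptions, not details of the actual system.

```python
from typing import Dict, List


def validate_prices(
    submitted: Dict[str, str],
    previous: Dict[str, float],
    max_change: float = 0.30,  # assumed threshold, not from the slides
) -> List[str]:
    """Return feedback messages for one respondent's monthly price form."""
    messages = []
    for product, raw in submitted.items():
        try:
            price = float(raw.replace(",", "."))  # accept a decimal comma
        except ValueError:
            messages.append(f"{product}: '{raw}' is not a number")
            continue
        if price <= 0:
            messages.append(f"{product}: price must be positive")
            continue
        old = previous.get(product)
        if old and abs(price - old) / old > max_change:
            messages.append(
                f"{product}: price changed by more than {max_change:.0%} "
                "since last month, please check it"
            )
    return messages
```

An empty list would mean the form can be stored; otherwise the page is rendered again with the messages next to the fields.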

Hardware architecture

Running on Windows NT server
Web server: Microsoft Internet Information Server 4 (IIS4)
Component Server: Microsoft Transaction Server 2.0
Anonymous access (No NT-authentication)

Database server
Windows 2000 server
Running Microsoft SQL Server 2000
Deployed on DMZ, accessible only through firewall

Application architecture

Built using Microsoft Windows DNA (Distributed interNet Applications Architecture)

Standard 3-tier architecture that consists of
Presentation layer: HTML, ASP
Business layer: COM components
Database layer: Relational database

System consists of two separate modules (both self-made)
User authentication
Data collection
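
The 3-tier split can be illustrated with a small sketch. It uses Python and an in-memory SQLite database instead of ASP, COM and SQL Server, and every name in it (`PriceStore`, `submit_price`, `render_receipt`, the `price` table) is hypothetical. The only point is that each layer talks to the one directly below it.

```python
import sqlite3
from html import escape


class PriceStore:
    """Database layer: the only code that issues SQL."""

    def __init__(self, conn):
        self.conn = conn
        conn.execute(
            "CREATE TABLE IF NOT EXISTS price "
            "(respondent TEXT, product TEXT, value REAL)"
        )

    def save(self, respondent, product, value):
        self.conn.execute(
            "INSERT INTO price VALUES (?, ?, ?)", (respondent, product, value)
        )

    def count(self, respondent):
        row = self.conn.execute(
            "SELECT COUNT(*) FROM price WHERE respondent = ?", (respondent,)
        ).fetchone()
        return row[0]


def submit_price(store, respondent, product, value):
    """Business layer: applies the rules, knows neither SQL nor HTML."""
    if value <= 0:
        raise ValueError("price must be positive")
    store.save(respondent, product, value)
    return store.count(respondent)


def render_receipt(respondent, saved):
    """Presentation layer: turns the result into HTML."""
    return f"<p>{escape(respondent)}: {saved} price(s) received</p>"
```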

Experiences

Beta phase from 5/2001 to 9/2001, 30 respondents
9/2001 - 2/2002, 70 users
In 3/2002 the system was opened to all respondents

147 users at the moment (nearly 50%)

Internet-based collection of data

CASE: Business Trends’ collection systems

technical aspects

Design goals

Create framework for similar systems
Multi-language support
LDAP-based user authentication with centralized administration

Create generic method for transferring data between collection and production databases

Create “mass emailer” for all kinds of collection systems

Software & hardware architecture

Built using Microsoft .NET and ASP.NET

Generic 3-tier architecture with presentation, business and database logic

Collection database separated from the production database

128-bit encryption used for communication between respondents and Statistics Finland

Framework of the collection system

The modular structure of the framework makes it possible to
Change menus, headers, footers and other styles
Add custom functionality (using ASP.NET user controls) on the pages
Add and load different languages for the pages

The base use cases are more or less the same in different collection systems (login, questionnaire, feedback, instructions and contact information)

Multi-language support

Most of the textual information on the web pages is stored in the database

Texts are loaded into the server’s memory at system startup

Only long descriptions are kept as files

Page language can be changed “on the fly”

Every element has a tag on the page template and the relevant text is attached to the element upon page load
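
The mechanism described on this slide (texts in a database, cached in memory at startup, attached to tagged template elements at page load) could look roughly like the sketch below. It is written in Python rather than ASP.NET, and the table name `ui_text`, the `$tag` placeholder syntax and the fallback to Finnish are all assumptions made for the illustration.

```python
import sqlite3
from string import Template


def load_texts(conn):
    """Read all UI texts once, at startup, into {language: {tag: text}}."""
    texts = {}
    for lang, tag, text in conn.execute("SELECT lang, tag, text FROM ui_text"):
        texts.setdefault(lang, {})[tag] = text
    return texts


def render(template, texts, lang, fallback="fi"):
    """Fill the tagged elements of a page template for one language."""
    # Texts missing in the requested language fall back to the default one.
    merged = {**texts.get(fallback, {}), **texts.get(lang, {})}
    return Template(template).safe_substitute(merged)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ui_text (lang TEXT, tag TEXT, text TEXT)")
conn.executemany(
    "INSERT INTO ui_text VALUES (?, ?, ?)",
    [
        ("fi", "title", "Myyntitiedustelu"),
        ("fi", "send", "Lähetä"),
        ("en", "title", "Sales inquiry"),
        ("en", "send", "Send"),
    ],
)
TEXTS = load_texts(conn)  # kept in memory for the lifetime of the process
PAGE = "<h1>$title</h1><button>$send</button>"
```

Because the cache is keyed by language, switching language "on the fly" is just another call to `render` with a different `lang` value.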

User authentication

The objective was to use LDAP (lightweight directory access protocol) for the user authentication

The development for this didn’t proceed on schedule, so it was temporarily replaced with database-based user authentication and administration

Authentication through LDAP has been tested and it seems to be an ideal solution

At the moment we’re building a simple web administration application to finish the LDAP part
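
Swapping a temporary database backend for LDAP later is easier if the login code depends only on a small interface. The following is a sketch under that assumption, not the actual implementation: `Authenticator`, `DatabaseAuthenticator`, `login` and the `account` table are hypothetical names, and the LDAP backend is deliberately not shown.

```python
import hashlib
import hmac
import os
import sqlite3
from typing import Protocol


class Authenticator(Protocol):
    """Anything with this method can serve as the login backend."""

    def check(self, username: str, password: str) -> bool: ...


def _hash(password: str, salt: bytes) -> bytes:
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)


class DatabaseAuthenticator:
    """Temporary backend: salted password hashes stored in a table."""

    def __init__(self, conn):
        self.conn = conn
        conn.execute(
            "CREATE TABLE IF NOT EXISTS account "
            "(username TEXT PRIMARY KEY, salt BLOB, pw_hash BLOB)"
        )

    def add_user(self, username, password):
        salt = os.urandom(16)
        self.conn.execute(
            "INSERT INTO account VALUES (?, ?, ?)",
            (username, salt, _hash(password, salt)),
        )

    def check(self, username, password):
        row = self.conn.execute(
            "SELECT salt, pw_hash FROM account WHERE username = ?", (username,)
        ).fetchone()
        if row is None:
            return False
        salt, stored = row
        return hmac.compare_digest(stored, _hash(password, salt))


def login(auth: Authenticator, username: str, password: str) -> str:
    """The page logic sees only the interface, not the backend."""
    return "welcome" if auth.check(username, password) else "login failed"
```

An LDAP-backed class with the same `check` method could then replace `DatabaseAuthenticator` without touching `login`.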

Data transfers

Data transfers between collection and production databases are handled with an external Win32 application

Built with PowerBuilder using pipeline feature (data flow)

Data from collection database is transferred to the temporary tables in the production database and then synchronized with the actual tables

Solution is quite customizable, allowing new functionality by adding new pipelines
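
The two-step transfer (copy into temporary tables, then synchronize with the actual tables) can be sketched as follows. The original tool is a PowerBuilder pipeline; this Python/SQLite version only illustrates the pattern, and the table names `answer` and `answer_staging` as well as the use of `INSERT OR REPLACE` for synchronization are assumptions.

```python
import sqlite3


def transfer(collection, production):
    """Copy collected rows into a staging table, then merge into the real table."""
    rows = collection.execute(
        "SELECT respondent, product, value FROM answer"
    ).fetchall()
    with production:  # one transaction: commit on success, roll back on error
        production.execute("DELETE FROM answer_staging")
        production.executemany(
            "INSERT INTO answer_staging VALUES (?, ?, ?)", rows
        )
        # Synchronize: new keys are inserted, existing keys are overwritten.
        production.execute(
            "INSERT OR REPLACE INTO answer (respondent, product, value) "
            "SELECT respondent, product, value FROM answer_staging"
        )
    return len(rows)


collection = sqlite3.connect(":memory:")
collection.execute("CREATE TABLE answer (respondent TEXT, product TEXT, value REAL)")
collection.executemany(
    "INSERT INTO answer VALUES (?, ?, ?)",
    [("R1", "cement", 12.5), ("R2", "nails", 3.0)],
)

production = sqlite3.connect(":memory:")
production.execute(
    "CREATE TABLE answer_staging (respondent TEXT, product TEXT, value REAL)"
)
production.execute(
    "CREATE TABLE answer (respondent TEXT, product TEXT, value REAL, "
    "PRIMARY KEY (respondent, product))"
)
production.execute("INSERT INTO answer VALUES ('R1', 'cement', 11.0)")  # older value
production.commit()
```

Keeping the staging step separate means a failed or partial copy never touches the production table.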

Mass emailer

An external application was built with Visual Basic 6 to send emails to the respondents

Modular approach
New systems can be added using textual configuration files
Reply requests can be added by writing SQL statements to the configuration files

Supports attachments

Replaces traditional letters
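
A configuration-driven mailer of this kind might be organized as in the sketch below. The original was written in Visual Basic 6; this Python version is an assumption-laden illustration in which the INI layout, the key names (`sender`, `subject`, `recipients_sql`, `body`) and the `respondent` table are all invented. It only builds the messages and does not send them.

```python
import configparser
import sqlite3
from email.message import EmailMessage

# One section per collection system; adding a system means adding a section.
CONFIG = """
[sales_inquiry]
sender = collection@example.org
subject = Reminder for the sales inquiry
recipients_sql = SELECT email, name FROM respondent WHERE answered = 0
body = Dear {name}, we have not yet received your data for this month.
"""


def build_messages(config_text, system, conn):
    """Create one message per row returned by the configured SQL statement."""
    cfg = configparser.ConfigParser(interpolation=None)
    cfg.read_string(config_text)
    section = cfg[system]
    messages = []
    for email, name in conn.execute(section["recipients_sql"]):
        msg = EmailMessage()
        msg["From"] = section["sender"]
        msg["To"] = email
        msg["Subject"] = section["subject"]
        msg.set_content(section["body"].format(name=name))
        messages.append(msg)
    return messages
```

In a real deployment the messages would be handed to an SMTP client such as `smtplib.SMTP.send_message`, and files could be attached with `EmailMessage.add_attachment`.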

Development experiences

Microsoft.NET was just released when the development began

Development environment wasn’t always stable and the developers experienced quite a lot of unexpected behavior

Despite this, ASP.NET is quite an improvement compared with other web application methods (ASP, PHP, Perl etc.)

Cross-browser compatibility is still quite poor, however

Effects of the electronic data supply system on data collection process

| Manual exclusive treatment | Electronic mass treatment |
|---|---|
| Printing the questionnaires | Transferring data to collection database |
| Mailing | E-mail informing (mass emailer) |
| Receiving the questionnaires (mail, fax, e-mail, TYVI) | (Electronic data supply) |
| Validating and entering the data | Mass validation |
| Printing and mailing the reminders | E-mail reminder (mass emailer) |
| Phone inquiry | Phone inquiry |
| Non-individual delayed feedback | Individual direct feedback |
| Limited access to previous own data | Previous own data available |

Results (1): Sales inquiry

Share of all respondents using the electronic data supply system:

after 1st month: 48%
after 2nd month: 59%
after 3rd month: 61%
since 4th month: 70%

Today: 75 - 80%

Results (2): Sales inquiry

Reminders sent:

before electronic data supply system: ~1000
after 1st month: ~800
after 2nd month: ~700
after 3rd month: ~600
since 4th month: ~500

Experiences (1)

Feedback from respondents has been very positive:

Response burden has been reduced remarkably

Enthusiasm of persons involved in data collection

Manual data treatment has been reduced (by at least 50%)

Quality of data has improved: validation, additional information if data is not comparable etc.

Experiences (2)

Number of enquiries made by respondents concerning the electronic data supply system:

first two months: ~100 / month (mainly questions concerning base settings)

since third month: ~30 / month (mainly forgotten passwords)

Development ideas

Although the framework is quite good, some ideas have arisen

Use of XML to
Define the concepts of the questionnaires
Define the presentation (XSLT)
Define the validations

Replace the user authentication with LDAP
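
To make the XML idea concrete, the sketch below declares questionnaire items and their validation rules in XML and checks answers against them. The element and attribute names (`questionnaire`, `item`, `min`, `label`) and the example fields are hypothetical; they are not taken from XCola or from any existing schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical definition: concepts and validation rules in one document.
QUESTIONNAIRE = """
<questionnaire id="accommodation">
  <item name="rooms" label="Number of rooms" type="int" min="1"/>
  <item name="nights" label="Nights spent" type="int" min="0"/>
</questionnaire>
"""


def validate(xml_text, answers):
    """Check answers against the rules declared in the XML definition."""
    errors = []
    for item in ET.fromstring(xml_text).iter("item"):
        label = item.get("label")
        raw = answers.get(item.get("name"), "")
        try:
            value = int(raw)
        except ValueError:
            errors.append(f"{label}: a whole number is required")
            continue
        if value < int(item.get("min")):
            errors.append(f"{label}: must be at least {item.get('min')}")
    return errors
```

The same definition could also drive the presentation, for example through an XSLT stylesheet, so that a new questionnaire needs a new XML file rather than new code.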

Benefits

Enables
Complex validations of the data
Dynamic creation of presentation layer logic
Displaying of pre-fetched data to individual respondents

Live feedback to the respondents (validation errors etc.)

Drawbacks

Requires user/customer administration for
Maintaining user profiles
Helpdesk/Support services

Internet-based collection of data

CASE: Accommodation statistics

XML-based form
