StatMine, visual exploration of output data
StatMine – prototype
StatMine, an exploration of dissemination data
Edwin de Jonge
Statistics Netherlands
25 September 2012, Seoul
an exploration of dissemination data: StatMine 2
StatMine, from numbers to analysis
Why StatMine?
• Statistics Netherlands (SN) mission: produce relevant information for:
  • Policy makers
  • Journalists
  • Citizens
  • Enterprises
  • Economists
  • Social scientists
  • Etc.
Numbers ≠ Information
StatLine is SN’s online DB (over 1 billion figures)
We know from a user study that:
1. Many interesting patterns in StatLine are not spotted by users
2. Many important topics in StatLine are scattered across multiple tables
Example of problem 2
• Policymaker interested in patients with diabetes:
  • Visits to medical doctor
  • Hospital admissions
  • Mortality
  • Medication consumption (insulin)
  • Obesity
These are all different statistical products (from different sources)!
Data analysis = Data insight
The goal of the StatMine research project is to provide data insight by:
• (I) Using data visualisation
• (II) Combining data table fragments
• (III) Deriving variables
All hypotheses are (or will be) tested with a prototype, with internal and external users.
(I) has been tested and is successful
(II), (III), … are work in progress
Chart types
• Bar chart: comparison
• Line chart: development
• Mosaic chart: structure
• Bubble/scatter chart: correlation
Chart type – bar chart
Chart type – line chart
Chart type – mosaic chart
Chart type – bubble chart
Small multiples
• Split the chart into different subpopulations
• Goal: compare subpopulations
• Very few tools offer this functionality!
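To make the idea concrete, here is a minimal Python sketch of splitting one data set into small multiples. It is not taken from the StatMine prototype (which the slides say is built in PHP and JavaScript); the function name `small_multiples`, the field names and all numbers are made up for illustration. The sketch assumes each panel should share one value range so that the subpopulations can be compared directly.

```python
from collections import defaultdict

# Hypothetical records; the values are invented for illustration only.
records = [
    {"year": 2010, "gender": "men", "count": 10},
    {"year": 2011, "gender": "men", "count": 12},
    {"year": 2010, "gender": "women", "count": 8},
    {"year": 2011, "gender": "women", "count": 11},
]


def small_multiples(rows, split_by, x, y):
    """Return one list of (x, y) points per subpopulation, plus a shared y-range."""
    panels = defaultdict(list)
    for row in rows:
        panels[row[split_by]].append((row[x], row[y]))

    # One common range for all panels keeps them visually comparable.
    y_values = [row[y] for row in rows]
    y_range = (min(y_values), max(y_values))

    # Sort the points in each panel by their x value.
    return {key: sorted(points) for key, points in panels.items()}, y_range


panels, y_range = small_multiples(records, split_by="gender", x="year", y="count")
```

Each entry in `panels` would then be drawn as its own small chart, all using `y_range` for the vertical axis.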
Small multiples
Composing a chart
Example:
• Categorical variables / dimensions: Year x Region x Gender x Age
• Numeric variables / topics: Count, Mean income, Employment
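One way to read "composing a chart" is as assigning dimensions and a topic to chart roles. The Python sketch below illustrates that reading only; `ChartSpec`, its fields and the lower-case variable names are assumptions, not the prototype's actual design.

```python
from dataclasses import dataclass
from typing import Optional

# Variables from the example slide, written as identifiers (an assumption).
DIMENSIONS = ("year", "region", "gender", "age")
TOPICS = ("count", "mean_income", "employment")


@dataclass(frozen=True)
class ChartSpec:
    """Hypothetical chart description: which variable plays which role."""
    chart_type: str               # e.g. "bar", "line", "mosaic", "bubble"
    x: str                        # dimension on the horizontal axis
    y: str                        # topic on the vertical axis
    split: Optional[str] = None   # optional dimension for small multiples

    def __post_init__(self):
        if self.x not in DIMENSIONS:
            raise ValueError(f"{self.x!r} is not a dimension")
        if self.y not in TOPICS:
            raise ValueError(f"{self.y!r} is not a topic")
        if self.split is not None and self.split not in DIMENSIONS:
            raise ValueError(f"{self.split!r} is not a dimension")

    def free_dimensions(self):
        """Dimensions without a role; each must be fixed to a single category."""
        used = {self.x, self.split}
        return [d for d in DIMENSIONS if d not in used]


spec = ChartSpec(chart_type="line", x="year", y="mean_income", split="gender")
remaining = spec.free_dimensions()
```

Here `remaining` lists the dimensions the user would still have to pin down (for example one region and one age group) before the chart can be drawn.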
Prototype
• Built in PHP and JavaScript (D3)
• Imported 10 StatLine example tables
• Complex tables, e.g.
  • Labor participation x gender x cohorts
  • Labor market flow per quarter (employed/unemployed)
  • Enterprise birth, death and growth x economic activity x quarter
• Tested on:
  • Internal users
  • Owners of data
Demo
Evaluation
• Part I: very successful
• Owners of data want the prototype to check their own data
• Provides insights
• Easy detection of anomalies
Work in progress
• II, Combination of different fragments
  • Testing with policymakers (end this year)
  • Or “How to glue statistical tables?”
• III, Derive variables + analysis
  • Absolute vs relative (per population unit)
  • Turnover / # employees
  • Etc.
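The slides describe parts II and III as work in progress and do not say how they are implemented. As one possible reading, the Python sketch below glues two table fragments on the dimension categories they share and then derives a ratio such as turnover per employee. The function names `glue` and `derive_ratio`, the (year, region) key layout and all numbers are hypothetical.

```python
# Hypothetical table fragments keyed by (year, region); all numbers are made up.
turnover = {
    (2010, "North"): 500.0,
    (2010, "South"): 300.0,
}
employees = {
    (2010, "North"): 50,
    (2010, "South"): 20,
    (2011, "North"): 55,  # no matching turnover cell, so it drops out of the join
}


def glue(left, right):
    """Join two fragments on the cells (keys) that both of them contain."""
    shared_keys = left.keys() & right.keys()
    return {key: (left[key], right[key]) for key in shared_keys}


def derive_ratio(joined):
    """Derive a relative figure (numerator / denominator) for each joined cell."""
    return {
        key: numerator / denominator
        for key, (numerator, denominator) in joined.items()
        if denominator != 0  # skip cells where the ratio is undefined
    }


turnover_per_employee = derive_ratio(glue(turnover, employees))
```

The same two steps would cover the "absolute vs relative" case by using a population table as the denominator fragment.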
Questions?