Uploaded by alyson-allison
Data in context
Chapter 1 of Data Basics
Frameworks
Today, we will be presenting two frameworks for thinking about the content of data services.
A. Statistics and Data
• What are statistics? What are data?
• A chart of statistical information
B. Continuum of Access
• Understanding dissemination channels
We’ll begin by organizing concepts that underlie statistical information.
Statistics are ubiquitous
“Statistics are generated today about nearly every activity on the planet. Never before have we had so much statistical information about the world in which we live. Why is this type of information so abundant? For one thing, statistics have become a form of currency in today’s information society. Through computing technology, society has become very proficient in calculating statistics from the vast quantities of data that are collected. As a result, our lives involve daily transactions revolving around some use of statistical information.”
Data Basics, page 1.1
What are we talking about?
Statistics and Data
Statistics
• numeric facts/figures in the form of summaries
• created from data, i.e., already processed
• presentation-ready
Data
• numeric files created and organized for analysis
• require processing
• not ready for display
• methodology-driven
Stories are told through statistics
The National Population Health Survey used in this example had over 80,000 respondents in its 1996-97 sample, and the Canadian Community Health Survey in 2005 had over 130,000 cases. How do we tell the stories about each of these respondents?
We create summaries of these life experiences using statistics.
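As a minimal sketch of how individual records become a statistic, the example below counts smokers by sex from a handful of respondent records. The records, field names, and the `count_smokers_by_sex` helper are all invented for illustration; they are not drawn from either survey.

```python
from collections import Counter

# Hypothetical microdata: one record per respondent (values are invented).
respondents = [
    {"sex": "Female", "smoker": True},
    {"sex": "Male", "smoker": False},
    {"sex": "Female", "smoker": False},
    {"sex": "Male", "smoker": True},
    {"sex": "Male", "smoker": True},
]


def count_smokers_by_sex(records):
    """Summarize individual records into one count of smokers per sex."""
    return Counter(r["sex"] for r in records if r["smoker"])


summary = count_smokers_by_sex(respondents)
for sex, n in sorted(summary.items()):
    print(f"{sex}: {n} smokers")
```

The five individual records are the data; the two counts are the statistics derived from them.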
Dimensions of statistics
Six dimensions or variables appear in this table. The cells in the table are the estimated number of smokers.
Geography: Region
Time: Periods
Unit of Observation Attributes: Smokers, Education, Age, Sex
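One way to picture a table built on these six dimensions is as a set of cells, each addressed by one category per dimension. The sketch below is illustrative only: the `Cell` type, the `collapse` helper, the education and age categories, and every cell value are assumptions made up for the example.

```python
from collections import defaultdict
from typing import NamedTuple


class Cell(NamedTuple):
    """Coordinates of one table cell along the six dimensions."""
    region: str
    period: str
    smoker: str
    education: str
    age: str
    sex: str


# Hypothetical cell values (invented, not taken from any survey).
table = {
    Cell("Canada", "1994-1995", "Smoker", "Secondary", "20-24", "Male"): 120,
    Cell("Canada", "1994-1995", "Smoker", "Secondary", "20-24", "Female"): 100,
    Cell("Canada", "1996-1997", "Smoker", "Secondary", "20-24", "Male"): 110,
}


def collapse(cells, keep):
    """Sum cell values over every dimension not named in `keep`."""
    view = defaultdict(int)
    for cell, value in cells.items():
        key = tuple(getattr(cell, dim) for dim in keep)
        view[key] += value
    return dict(view)


# One view of the table: totals by region and period only.
by_period = collapse(table, keep=("region", "period"))
```

Summing like this only makes sense when the categories being collapsed are mutually exclusive, so a "Total" category would have to be left out of the cells first.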
Definitions use classifications
The definitions for concepts and variables use classification systems to assign categories or values to the properties of the concepts. For example, Region in this table consists of Canada and the ten provinces.
Some classifications are based on standards, while others are based on convention or practice. Standard Geography classifications are an example of the former.
Classifications involve categories
Categories
Sex: Total, Male, Female
Periods: 1994-1995, 1996-1997
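A classification can be thought of as a named list of permitted categories. The sketch below uses the Sex and Periods categories from this slide; the `CLASSIFICATIONS` mapping and the `is_valid` helper are hypothetical names introduced only for illustration.

```python
# Categories copied from the slide; storing them as fixed tuples is an assumption.
CLASSIFICATIONS = {
    "Sex": ("Total", "Male", "Female"),
    "Periods": ("1994-1995", "1996-1997"),
}


def is_valid(variable, value):
    """Return True if `value` is a category of the named classification."""
    return value in CLASSIFICATIONS.get(variable, ())


print(is_valid("Sex", "Male"))
print(is_valid("Periods", "2005"))
```

A value that falls outside the listed categories, or a variable with no classification recorded, is reported as invalid.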
Statistics are about definitions
Each characteristic or variable that is measured or recorded about the unit of observation must be clearly defined. Statistics Canada has definitions for some of the more frequently used concepts and variables on its website under “Definitions, data sources and methods.”
The Census Dictionary is an important source for definitions of the concepts and variables in each Census.
Definitions and metadata
All of the definitions and information that describe the unit of observation, the universe, the sampling method, the concepts and the variables are critical to understanding both the data and the statistics derived from the data.
We used to talk about codebooks and about the User’s Guide and Data Dictionary when speaking of data documentation. Now we refer to this documentation as metadata, which has been expanded to include documentation throughout the life cycle of a survey. The Data Documentation Initiative 3.0 standard is being used to organize this information.
Methods producing data
Observational Methods
• Focus is on developing observational instruments to collect data
• Correlation
• Replicate the analysis (same data or similar)
• Statistics summarize observations
Experimental Methods
• Focus is on manipulating causal agents to measure change in a response agent
• Causation
• Replicate the experiment
• Statistics summarize experiment results
Computational Methods
• Focus is on modeling phenomena through mathematical equations
• Prediction
• Replicate the simulation
• Statistics summarize simulation results
A particular discipline or field will tend to be dominated by one of these three methods, although outputs may also exist from the other two methods.
Consequently, the knowledge disseminated within a field is often fairly homogeneous in how statistical information is used and reported.
Knowing this and the life cycle in which statistics are produced can help in the search for statistics.
Summary
Statistics are derived from observational, experimental or simulated data.
A table is a format for displaying statistics and presents a summary or one view of the data.
Tables are structured around geography, time and attributes of the unit of observation.
Statistics are dependent on definitions.
Statistics summarize individual stories into common or general stories.