Quantitative Evidence for Marketing
Chuck Humphrey, Data Library
Data Library, Rutherford North, 1st Floor
October 26, 2009
Outline
• Distinction between statistics and data
• Statistics are about definitions and classifications
• Sources for population demographics
  • The Census
• Sources for family expenditures and income
  • CANSIM and E-STAT
• Sources for consumer behaviour
  • Tablebase, PMB and GMID
Distinguishing statistics from data
How statistics and data differ
Statistics
• Numeric facts and figures
• Derived from data, i.e., already processed
• Presentation-ready
• Published
Data
• Numeric files created and organized for analysis or processing
• Requires processing
• Not display-ready
• Disseminated, not published
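The contrast can be made concrete with a minimal sketch. The records below are invented purely for illustration and do not come from any real file. The list of records plays the role of data (it requires processing), while the counts and percentages derived from it play the role of statistics (already processed and ready to present).

```python
from collections import Counter

# Hypothetical microdata: one record per respondent (invented values).
records = [
    {"id": 1, "age_group": "15-24", "sex": "F"},
    {"id": 2, "age_group": "25-34", "sex": "M"},
    {"id": 3, "age_group": "15-24", "sex": "M"},
    {"id": 4, "age_group": "25-34", "sex": "F"},
    {"id": 5, "age_group": "15-24", "sex": "F"},
]


def frequency_table(rows, variable):
    """Count how many records fall into each category of one variable."""
    return Counter(row[variable] for row in rows)


def percentages(counts):
    """Convert counts to percentages of the total, rounded to one decimal."""
    total = sum(counts.values())
    return {category: round(100 * n / total, 1) for category, n in counts.items()}


# Processing step: the data become statistics.
age_counts = frequency_table(records, "age_group")
age_percent = percentages(age_counts)

# Presentation step: a small display-ready table.
for category in sorted(age_counts):
    print(f"{category}: {age_counts[category]} ({age_percent[category]}%)")
```

The same records could be tabulated by sex instead, which is one reason data are disseminated for analysis rather than published as finished tables.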
Statistics are about definitions
Statistics are dependent on definitions. You may think of statistics as numbers, but the numbers represent measurements or observations based on specific definitions.
Tables are structured around geography, time and social content based on attributes of the unit of observation. These properties all need definitions.
Consider the following example from the 2006 Canadian Census on the data behind some statistics about visible minorities.
Visible Minority Groups (15), Generation Status (4), Age Groups (9) and Sex (3) for the Population 15 Years and Over of Canada, Provinces, Territories, Census Metropolitan Areas and Census Agglomerations, 2006 Census - 20% Sample Data
• How is visible minority status identified in the Census?
• Are Aboriginal people counted among the visible minorities in Canada?
• What is the definition of visible minority?
Statistics involve classifications
The definitions that shape statistics specify the metric of the data they summarize (for example, Canadian dollars) or the categories used to classify things if a statistic represents counts or frequencies. In this latter case, classification systems are used to identify categories of membership in a concept’s definition.
Some classification systems are based on standards while others are based on convention or practice.
For an example of a standard, see the North American Industry Classification System (NAICS).
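As a rough sketch of how a classification system turns records into counts, the example below groups establishments by the two-digit sector prefix of an industry code. The sector titles are a small subset of the NAICS two-digit sectors. The establishment records and their six-digit codes are hypothetical, used only for illustration, and have not been checked against any particular NAICS edition.

```python
from collections import Counter

# A small subset of two-digit NAICS sectors (not the full standard).
NAICS_SECTORS = {
    "11": "Agriculture, forestry, fishing and hunting",
    "52": "Finance and insurance",
    "72": "Accommodation and food services",
}

# Hypothetical establishments; only the first two digits of each code matter here.
establishments = [
    {"name": "Establishment A", "code": "722511"},
    {"name": "Establishment B", "code": "111110"},
    {"name": "Establishment C", "code": "722511"},
    {"name": "Establishment D", "code": "522110"},
    {"name": "Establishment E", "code": "999999"},  # falls outside the subset above
]


def classify(code):
    """Return the sector title for a code, or 'Unclassified' if it is not in the subset."""
    return NAICS_SECTORS.get(code[:2], "Unclassified")


# Counts by category: the statistic depends entirely on the classification used.
sector_counts = Counter(classify(e["code"]) for e in establishments)

for sector, count in sorted(sector_counts.items()):
    print(f"{sector}: {count}")
```

Changing the classification (for example, grouping by three digits instead of two) would change the resulting counts even though the underlying records stay the same.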
Statistics are presentation-ready
Tables and charts (or graphs) are typically used to display many statistics at once. You will find statistics sprinkled in text as part of a narrative describing some phenomenon, but tables and charts are the primary methods of organizing and presenting statistics.
A quick review
To this point, we have established that:
• Statistics are ‘real’ only if they are derived from data;
• Statistics are dependent on the definitions of the concepts they summarize;
• Statistics that represent counts of things in the data employ classification systems, which are based either on standards or on convention; and
• Statistics are typically organized for display using tables or charts.
Statistics and data sources
Population Demographics
Family Expenditures and Income
Consumer Behaviour
Population and demographics
The Census is one of the most important sources of statistical information about Canada. It is the largest survey conducted in Canada and, consequently, is the primary source for small area statistics.
To use data from the Census, you must know:
• The characteristics collected in the Census that are available for the spatial units used to disseminate results;
• The variety of spatial units used to disseminate Census results.
Census of Population
• Two forms are used to collect the Census: 2A, which goes to 80% of households, and 2B, which goes to the other 20%.
• In 2006, the 2A form contained 8 questions, while the 2B form had these 8 plus 53 additional questions.
• Many specific questions have a long history (see the Census Dictionary).
• You need to understand the content of the Census to know what statistics are possible from it.
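The arithmetic behind the two forms can be sketched as follows. The question counts and the 20% share come from the figures above. The weighting function is a simplifying assumption: it uses only the nominal design weight (the inverse of the sampling fraction), and the weights actually applied to published Census estimates may differ. The sample count of 120 is hypothetical.

```python
# Figures taken from the slide for the 2006 Census forms.
QUESTIONS_2A = 8          # questions on the short form (2A)
ADDITIONAL_2B = 53        # extra questions asked only on the long form (2B)
SHARE_2B = 0.20           # share of households receiving form 2B

# Total questions on the long form: the 8 shared questions plus the 53 extra.
questions_2b = QUESTIONS_2A + ADDITIONAL_2B


def nominal_weight(sampling_fraction):
    """Assumed design weight: each sampled household stands in for 1/fraction households."""
    return 1 / sampling_fraction


def estimate_total(sample_count, sampling_fraction):
    """Scale a count observed in the sample up to a population estimate."""
    return sample_count * nominal_weight(sampling_fraction)


print(f"Questions on form 2B: {questions_2b}")
print(f"Nominal weight for a 20% sample: {nominal_weight(SHARE_2B)}")

# Hypothetical example: 120 sampled households in an area report some characteristic.
print(f"Estimated households in the area: {estimate_total(120, SHARE_2B)}")
```

This is also why tables built from 2B questions are labelled "20% Sample Data": the counts are estimates scaled up from the sample rather than a complete enumeration.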
Urban small area statistics
Census Metropolitan Areas
Source for the graphic: Illustrated Glossary, 2006 Census Geography, Statistics Canada
Metropolitan Areas
2006 Map of Edmonton CMA
Census results for 2006
• Standard Census data products
  • Highlight tables
  • Profiles
  • Census trends
  • Topic-based tabulations
• For smaller areas outside CMAs, or for dissemination areas, statistics must be retrieved from the Data Library.
• Public use microdata files for individuals, households and families
A database product & a portal
Before showing you products for income and family expenditures, you need to know about CANSIM and E-STAT.
CANSIM
• CANSIM is a very large database containing socio-economic statistics for Canada. There are currently over 38 million time series organized in approximately 2,800 tables.
• The statistics in CANSIM come from surveys (e.g., the Labour Force Survey), administrative data (e.g., crime and justice) and simulations or models (e.g., population projections).
• Geography, content and time are basic to retrieving time series from CANSIM.
E-STAT
E-STAT is a portal for retrieving Census results and free CANSIM holdings. The tables in the E-STAT version of CANSIM are extracted once a year in July, while the online version of CANSIM on the Statistics Canada website is updated daily.
If you access a table using CANSIM on the Statistics Canada website, you must pay $3.00 per time series.
The U of A also subscribes to the CHASS version of CANSIM, which is updated weekly. As with E-STAT, there is no charge for using this version.
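A quick sketch of the cost difference between the three access routes, using the $3.00 per time series figure stated above. The source labels (`statcan`, `estat`, `chass`) and the function name are hypothetical, chosen only for this illustration.

```python
from decimal import Decimal

# Price stated on the slide for CANSIM on the Statistics Canada website.
PRICE_PER_SERIES = Decimal("3.00")

# Hypothetical labels for the three access routes and whether each one charges.
CHARGES_PER_SERIES = {
    "statcan": True,   # Statistics Canada website, updated daily
    "estat": False,    # E-STAT, extracted once a year in July
    "chass": False,    # CHASS version, updated weekly
}


def retrieval_cost(series_count, source):
    """Return the cost in dollars of retrieving a number of time series from a source."""
    if source not in CHARGES_PER_SERIES:
        raise ValueError(f"Unknown source: {source}")
    if CHARGES_PER_SERIES[source]:
        return PRICE_PER_SERIES * series_count
    return Decimal("0.00")


# Example: a request for 12 time series through each route.
for source in ("statcan", "estat", "chass"):
    print(f"{source}: ${retrieval_cost(12, source)}")
```

The trade-off is currency against cost: the paid route is updated daily, while the free routes lag by up to a week (CHASS) or a year (E-STAT).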
Family expenditures and income
• The Census has individual and household income
• Income from administrative sources
  • T1 family file
  • Longitudinal administrative database
• Survey sources for expenditure data
  • Survey of Household Spending
• Public use microdata files for the Survey of Household Spending
Consumer behaviour
• Tablebase contains statistics from the trade literature.
• GMID - Global Market Information Database
• PMB contains statistics about Canadian consumer demographics for specific product information.
• Use keyword searches to find tables of interest and then conduct new searches employing the index terms assigned to them.