Using IPUMS.org Katie Genadek Minnesota Population Center University of Minnesota kgenadek@umn.edu...


Page 1: Using IPUMS.org Katie Genadek Minnesota Population Center University of Minnesota kgenadek@umn.edu The IPUMS projects are funded by the National Science.

Using IPUMS.org

Katie Genadek

Minnesota Population Center

University of Minnesota

kgenadek@umn.edu

The IPUMS projects are funded by the National Science Foundation and the National Institutes of Health

00:00

Page 2:

Overview

• What is IPUMS?
• Microdata and Summary Data
• IPUMS-USA
• IPUMS-CPS
• Online Analysis System
• Online Demonstration
• Questions

00:44

Page 3:

What is IPUMS?

Integrated - consistent codes, labels, and documentation

Public Use - anonymized, downloadable

Microdata - individual-level

Series - pooled data over time and place
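To make the "Integrated" idea concrete, here is a minimal Python sketch of recoding year-specific codes into one consistent scheme so that pooled samples can be compared. The years, raw codes, and labels are invented for illustration and are not actual IPUMS or census codes.

```python
# Hypothetical year-specific codings of the same concept (marital status).
# These raw codes are made up for illustration; they are not real census codes.
RAW_TO_HARMONIZED = {
    1900: {"M": 1, "S": 2, "W": 3},
    2000: {"1": 1, "5": 2, "2": 3},
}

HARMONIZED_LABELS = {1: "Married", 2: "Never married", 3: "Widowed"}


def harmonize(year, raw_code):
    """Map a year-specific raw code to the shared code, or None if unknown."""
    return RAW_TO_HARMONIZED.get(year, {}).get(raw_code)


# Records from two different years end up on the same coding scheme.
records = [(1900, "M"), (2000, "1"), (1900, "W"), (2000, "2")]
harmonized = [harmonize(year, code) for year, code in records]
labels = [HARMONIZED_LABELS[code] for code in harmonized]
print(labels)
```

With one shared coding, the same tabulation can be run across every year in the series without year-by-year special cases.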

1:26

Page 4:

But, What is IPUMS Data?

Individual level:
• Demographic Data
• Census Data
• Survey Data
• Health Data
• Historical Data
• Migration Data
• Time Use Data

Summary level:
• Demographic Data
• Census Data
• Historical Data
• Mapping Data

2:09

Page 5:

MPC Data Projects

http://www.ipums.org/

2:41

Page 6:

MICRODATA AND SUMMARY DATA

Microdata:

4:40

Page 7:

Microdata versus Summary Data

Microdata:
• Shows full range of responses for individuals
• Enables custom tables and sophisticated analyses
• Suppression: geography, truncation, and item-level suppression

Summary Data:
• Premade or published tables of aggregate characteristics
• Enables examination of small geographic areas
• Suppression: limited content, grouped intervals, and cell suppression
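As a small illustration of what "custom tables" means, the Python sketch below builds a cross tabulation directly from individual-level records, a table nobody had to publish in advance. The records and category names are invented for this example.

```python
from collections import Counter

# Invented person-level records: (sex, employment status).
people = [
    ("Female", "Employed"),
    ("Female", "Not employed"),
    ("Male", "Employed"),
    ("Male", "Employed"),
    ("Female", "Employed"),
]


def crosstab(rows):
    """Count how many records fall into each (row, column) cell."""
    return Counter(rows)


table = crosstab(people)
for (sex, status), n in sorted(table.items()):
    print(f"{sex:<8}{status:<14}{n}")
```

With summary data only the cells someone chose to publish are available; with microdata any pair of variables can be tabulated this way.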

4:40

Page 8:

Summary Data

5:44

Page 9:

IPUMS Data Structure

H910000240000000088001001000220100
P910000020101032120010010010011504
P910000010201036220010010010011999
P910201000301011220060010010011999
P910201000301009120060010010011999
P910201000301007120060010010011999
P910201000301006120060010010011999
P910201000301004220060010010011999
P910201000301003220060010010011999
P910201000301002220060010010011999
H910000240000000088001001000110100
P910000020101030110010290510511310
P910000010201021210010290290171999
P910201000301001110060010290291999
H910000240000000088001001000220100
P910000020101045120010010010011100
P910000010201025220010010010011820
P910201000301007220060010010011999
H910000240000000088001001000220100
P910000020101049120010010010011100
P910000010201049220010010010011820
P910201000301019220060010010011820
P910201000301015220060010010012820

Household record (shaded) followed by a person record for each member of the household

For each type of record, columns correspond to specific variables

Variables labeled on the slide: Relationship, Age, Sex, Race, Birthplace, Mother’s birthplace, Occupation
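The household-then-persons layout can be read with a few lines of Python. This is a minimal sketch that assumes only what the sample shows: the first character of each record is "H" or "P". It does not slice out individual variables, because the slide does not give column positions; those come from the codebook for a given extract.

```python
def group_households(lines):
    """Group person ("P") records under the household ("H") record before them."""
    households = []
    for line in lines:
        if not line:
            continue                       # skip blank lines
        record_type = line[0]              # first character marks the record type
        if record_type == "H":
            households.append({"household": line, "persons": []})
        elif record_type == "P":
            if not households:
                raise ValueError("person record appears before any household record")
            households[-1]["persons"].append(line)
        else:
            raise ValueError(f"unknown record type: {record_type!r}")
    return households


# Second household from the slide: one H record followed by three P records.
sample = [
    "H910000240000000088001001000110100",
    "P910000020101030110010290510511310",
    "P910000010201021210010290290171999",
    "P910201000301001110060010290291999",
]
households = group_households(sample)
print(len(households), len(households[0]["persons"]))
```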

5:54

Page 10:

IPUMS-USA

Microdata Data:

6:50

Page 11:

IPUMS-USA

• Database includes public use microdata samples:
  – U.S. decennial censuses (1850-2000)
  – Complete-count dataset for 1880
  – Linked Samples 1850 – 1930
  – Samples from Puerto Rico (1910-2008)
  – American Community Survey (2000-2009)
• The first MPC data project
• Most widely used database ~ 30,000 users

6:53

Page 12:

Census Samples

Census Year   Sample Density   Number of persons in dataset
1850          1%               198,000
1860          1%               354,000
1870          1%               428,000
1880          100%             50,300,000
1900          6%               5,189,000
1910          1.4%             1,265,000
1920          1%               1,037,000
1930          5%               6,060,000
1940          1%               1,351,000
1950          1%               1,922,000
1960          1%               1,780,000
1970          6%               12,180,000
1980          9%               20,403,000
1990          6%               15,000,000
2000          6%               16,885,000
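One way to read the table: if a sample were a flat fraction of the population, each record would stand for roughly 1/density people. The Python sketch below applies that arithmetic to three rows of the table. It is only a rough check under that flat-sample assumption; the listed densities are rounded, and real estimates should use the weights supplied with the data.

```python
# (census year, sample density as a fraction, persons in dataset), from the table.
samples = [
    (1850, 0.01, 198_000),
    (1880, 1.00, 50_300_000),
    (1920, 0.01, 1_037_000),
]


def implied_population(density, persons):
    """Rough population covered, if every record stood for 1/density people."""
    return persons / density


for year, density, persons in samples:
    print(year, round(implied_population(density, persons)))
```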

8:31

Page 13:

The American Community Survey

• Replaced the long form of the Decennial Census
  – Demonstration stage: 2000 to 2004
  – Full implementation 2005, group quarters added 2006
• Rolling sample design

Microdata samples:
• Full survey responses for 1% of US population
• Yearly samples, multi-year samples

9:18

Page 14:

ACS Samples

Year   Sample Density   Number of persons in dataset
2000   1 in 750         372,000
2001   1 in 230         1,200,000
2002   1 in 260         1,075,000
2003   1 in 230         1,200,000
2004   1 in 240         1,194,000
2005   1 in 100         2,878,000
2006   1 in 100         2,970,000
2007   1 in 100         3,100,000
2008   1 in 100         3,001,000
2009   1 in 100         3,030,700

10:03

Page 15:

Census and ACS Variable Topics

• Basic demographic
• Marriage
• Family structure
• Fertility
• Ethnicity
• Disability
• Education
• Work
• Income
• Migration
• Housing Characteristics

10:13

Page 16:

Geography Limitations

• No confidentiality restrictions for samples prior to 1940 – no geographic limitation
• Samples from 1940-1970
  – Limited and inconsistent geographic identifiers
• Recent samples:
  – State
  – Some Metropolitan Areas
  – County Groups
  – Public Use Microdata Areas (PUMAs)

10:43

Page 17:

What are PUMAs?

• Public Use Microdata Areas (PUMAs)

• Comprised of approximately 100,000 persons

• Boundaries do not always align with jurisdictional boundaries

• Detailed contents and maps available

• GIS shape files for PUMAs available

11:26

Page 18:

IPUMS-CPS

Microdata Data:

11:53

Page 19:

Current Population Survey (CPS)

• Administered starting 1940
• Monthly survey administered by the Bureau of Labor Statistics
• Household survey was designed to measure unemployment
• Source of the official Government statistics on employment and unemployment
• In 2009 - 57,000 households interviewed monthly

11:55

Page 20:

Current Population Survey March Supplement

• All March respondents
• Additional respondents from February, March and November monthly samples
• Data are collected for Armed Forces members residing with their families
• March Annual Social and Economic Supplement is the most widely used by social scientists and policymakers

12:20

Page 21:

Current Population Survey March Supplement

• Labor force participation and unemployment
• Work experience and educational attainment
• Sources of income including non-cash benefits
• Program participation
• Tax filing status
• Health Insurance
• Migration

12:51

Page 22:

IPUMS - CPS

• All March Data (Back to 1962)
• Basic Monthly Surveys
  – Samples from 2000-2008 (back to 1976 soon)
  – Data for every month
  – ~50,000 households surveyed each month
  – Fewer variables than March supplement
    • Demographic information
    • Family characteristics
    • Employment status
    • Education information

13:16

Page 23:

ONLINE ANALYSIS SYSTEM

Obtaining Data:

14:26

Page 24:

Online Analysis System

• High-speed tabulation software developed at UC-Berkeley

• Allows for analysis of microdata without statistical package

• All analysis performed online

• Can analyze multiple years of data

• Help guides on webpage

14:26

Page 25:

Features

• Data analysis capabilities
  – Frequencies and cross tabulations (including charts)
  – Comparisons of means (with complex std errors)
  – Correlation matrix
  – Comparisons of correlations
  – Regression (ordinary least squares)
  – Logit and probit regression
  – List values of individual cases

15:02

Page 26:

Where is this online tabulator?

• Follow the link ‘Analyze Data Online’ from the homepage of:
  – usa.ipums.org/usa/
  – cps.ipums.org/cps/
• Select all samples of year of interest in USA
• Open IPUMS-USA or CPS in additional tab for documentation

15:41

Page 27:

USE THIS DATA

Obtaining Data:

16:00

Page 28:

Microdata for Analysis

• Documentation is Important!!!
  – Use the IPUMS documentation
  – Be aware of top/bottom codes, NIU codes, and missing data codes
  – Know the universe – who got asked the question
• Weights – make estimates representative
  – See additional weights presentation
• Sample size is important
  – Check analysis without weights
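The points about special codes, weights, and unweighted checks can be sketched together in a few lines of Python. Everything here is illustrative: the records are invented, and NIU_CODE and MISSING_CODE are placeholder values. The actual codes and the appropriate weight variable for any given variable have to be looked up in the IPUMS documentation.

```python
# Invented records: (wage income, person weight).
# NIU_CODE and MISSING_CODE are placeholder values for this sketch; the real
# codes for any variable must be looked up in the IPUMS documentation.
NIU_CODE = 999999
MISSING_CODE = 999998

records = [
    (30000, 100),
    (50000, 150),
    (NIU_CODE, 120),       # not in universe: this person was not asked
    (MISSING_CODE, 90),
    (20000, 50),
]


def valid(rows):
    """Drop records whose value is a special code rather than a real amount."""
    return [(v, w) for v, w in rows if v not in (NIU_CODE, MISSING_CODE)]


def weighted_mean(rows):
    """Mean in which each record counts in proportion to its weight."""
    total_weight = sum(w for _, w in rows)
    return sum(v * w for v, w in rows) / total_weight


def unweighted_mean(rows):
    """Plain mean over the records, ignoring weights."""
    return sum(v for v, _ in rows) / len(rows)


usable = valid(records)
print("unweighted cases:", len(usable))
print("unweighted mean:", unweighted_mean(usable))
print("weighted mean:", weighted_mean(usable))
```

Leaving the special codes in would pull the mean far upward, and the unweighted case count shows how few real records sit behind the weighted estimate.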

16:01

Page 29:

Microdata for Analysis

• Allows more complex analysis than summary data

• Geographic Restrictions
  – State Level Analysis
  – Metro Area level Analysis

• Time series – change over time

• Not downloading tons of tables

18:43

Page 30:

IPUMS is Awesome

• Comprehensive online documentation
• Integration makes analyzing change over time possible
• Data analysis system allows you to access the data and analyze it online
• All of the data are available for free online
• User support is available by e-mail to help you as needed

19:31

Page 31:

Social Explorer - Shout Out

• Produces online maps and data reports

• Based on boundary files made available through NHGIS

• Map changes in census data over time

• http://www.socialexplorer.com/

20:23

Page 32:

DISCUSSION OF “WEIGHTING” AND ONLINE DEMO OF IPUMS

Obtaining Data:

20:54

Page 33:

Questions – email us

IPUMS User [email protected]

Contact:

Katie Genadek

kgenadek@umn.edu

32:00