Census 2000 symposium, session 4 paper 26: Archiving Census Documentation and Microdata: Preserving...
Census 2000 symposium, session 4 paper 26
1
Archiving Census Documentation and Microdata:
Preserving Memory, Increasing Stakeholders
* * *
Wendy L. Thomas and Robert McCaa
Minnesota Population Center
http://www.ipums.org/International
IPUMS International, funded by the National Science Foundation of the United States
2
» Microcomputer revolution --> new uses for census data, specifically microdata
» Effective use of microdata requires systematic preservation of metadata
» Availability of microdata --> enhances the value of censuses and increases stakeholders
» IPUMS International consortium promotes preservation and use of census microdata
Subtext: Preserving census metadata and microdata enhances value of census and increases stakeholders
3
16th century Aztec census (in Nahuatl, 1530s): “Here is the home of...”
(from Museum of Anthropology, Mexico City)
[Figure panels: original ms., digitized, transcribed, translated]
4
121001026007007200000112100001042220020260070072000001121000010432300100600700720000012123000000423002004007000000000000000000005230020020070000000000000000000062300200000700000000000000000000
Census microdata of the 21st (and late 20th) century: Who will preserve them?
Will they be made usable?
Census microdata:
» Public goods should be used
» Censuses are costly
» Where microdata are available, they are used
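A string of digits like the one on this slide cannot be read without its codebook, which is why metadata has to be preserved alongside microdata. The sketch below shows how a codebook could drive the parsing of one fixed-width record. The field names, column positions, and sample record are all hypothetical; they are not taken from any actual census file.

```python
# Hypothetical codebook: (field name, start column, width).
# These positions are invented for illustration, not from a real census.
CODEBOOK = [
    ("record_type", 0, 1),
    ("household_id", 1, 6),
    ("person_number", 7, 2),
    ("age", 9, 3),
    ("sex", 12, 1),
]


def parse_record(line, codebook=CODEBOOK):
    """Slice one fixed-width record into named integer fields."""
    return {name: int(line[start:start + width])
            for name, start, width in codebook}


record = "2000123010342"  # made-up record, 13 columns wide
fields = parse_record(record)
print(fields)
```

If the codebook is lost, the same 13 digits could be split in many other ways, and nothing in the data itself says which split is right.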
5
…official statistics that meet the test of practical utility are to be compiled and made available on an impartial basis by official statistical agencies to honor citizens’ entitlement to public information.
-- UN Statistical Commission, 1994
6
How anonymized census samples became a standard statistical product:
» USA: 1960, 1970, 1980, 1990: varying densities; gaining on CPS as most widely used demographic microdata
» Canada:
- 1971, 1976, 1981, 1986, 1991, 1996: varying designs
- 1996: Data Liberation Initiative led to an explosion of usage in research and teaching
» UK:
- 1991: 2% individuals, 0.5% households; hundreds of publications, thousands of users
- 2001: double the densities.
7
IPUMSi helps five ways:
» 1. Inventory the world’s census microdata
» 2. Preserve endangered microdata and documentation
* * *
» 3. Anonymize census microdata to preserve statistical confidentiality, using highest standards (Stat. Nether.)
» 4. Integrate datasets of selected countries using UN, Eurostat and other standards
» 5. Disseminate database free with complete copies to all partners
Integrated Public Use Microdata Series - International
8
IPUMSi PAYS
National experts in each country are contracted to:
» Assemble microdata and documentation
» Develop samples to minimize confidentiality risks and maximize robustness
» Design national integration plan: census-by-census, concept-by-concept, code-by-code
» Write integrated documentation
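One common way to draw a sample at a fixed density is systematic selection of every k-th household from a random start. The sketch below illustrates that general technique only; the slides do not describe the sample designs actually used, and the function name, the `one_in` parameter, and the stand-in household list are assumptions.

```python
import random


def systematic_sample(households, one_in, seed=0):
    """Take every `one_in`-th household, starting at a random offset."""
    start = random.Random(seed).randrange(one_in)
    return households[start::one_in]


households = list(range(1000))  # stand-in household identifiers
sample = systematic_sample(households, one_in=50)  # 1-in-50 is a 2% density
print(len(sample))
```

Sampling whole households rather than individuals keeps family relationships intact, and a lower density makes it less likely that any particular person appears in the file.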
9
IPUMSi INVENTORIES
» Microdata...for any population or administrative division: Nation, province, district, city, ethnic group, etc.
» Example: Latin America,
- 20 countries
- 67 censuses inventoried
- 1% - 100% sample densities
- 100,000 to 150 million cases
- 19th century: 2 censuses; 1960s: 14; 1970s: 17; 1980s: 16; 1990s: 17
» Found: complete census data for Colombia 1973 and 16 other countries
10
IPUMSi PRESERVES
UN Demographic Center for Latin America (CELADE, Santiago, Chile)
~3000 microdata tapes to be preserved, and metadata (documentation)
11
Preserve against accident, deterioration and technological obsolescence
» Microdata:
- transfer to stable media
- use standard data storage protocols
- entrust copies with at least two depositories
» Metadata: collect, catalogue, and reproduce
- Enumeration forms (preserve all versions used)
- Enumerator and data processing instructions
- Codebooks (photocopies and scanned images)
- Technical studies, evaluations, reports
UN Stat. Div.: entire archive to be preserved, catalogued
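Keeping copies in two depositories only protects against deterioration if the copies are compared from time to time. A minimal sketch of such a fixity check, using SHA-256 digests from Python's standard library, might look like this. The file names and contents are made up, and the slides do not say which verification method, if any, the project uses.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copies_match(paths):
    """True when every copy has the same digest."""
    return len({sha256_of(p) for p in paths}) == 1


with tempfile.TemporaryDirectory() as tmp:
    copy_a = Path(tmp) / "depository_a.dat"  # hypothetical file names
    copy_b = Path(tmp) / "depository_b.dat"
    copy_a.write_bytes(b"1210010260070072")
    copy_b.write_bytes(b"1210010260070072")
    intact = copies_match([copy_a, copy_b])
    copy_b.write_bytes(b"1210010260070073")  # simulate one damaged digit
    damaged = copies_match([copy_a, copy_b])

print(intact, damaged)
```

A mismatch shows that the copies differ but not which one is damaged; a digest recorded at deposit time would settle that.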
12
IPUMSi ANONYMIZES
Using the highest standards currently available:
- technical (Eurostat workshops)
- administrative (license agreement)
Imagine a new statistical product: a scientifically anonymized census microdata sample made up of unidentifiable individuals...
13
Anonymized census microdata samples available for European countries
(* = in IPUMSi consortium, ** = negotiating)
» 16 countries available via PAU, 1990 round (3 in IPUMSi, 4 negotiating):
- Belgium, Czech Republic, Estonia, **Finland, *Hungary, **Italy, Latvia, Lithuania, **Norway, Poland, *Spain, Sweden, Switzerland, **Russia, Turkey, *UK
» 11 countries not available via PAU (2 in IPUMSi):
- *Austria, Croatia, Denmark, *France, Germany, Iceland, Ireland, **Netherlands, Portugal, Slovak Republic, Slovenia
14
International Monetary Fund’s General Data Dissemination System
52 countries with uniform standards
» All embrace strict standards of statistical confidentiality
» Prohibit disclosure of information which may identify individuals or entities
» 37 of 52 countries distribute anonymized census microdata samples
» Microdata samples are becoming standard statistical products
15
IPUMSi INTEGRATES
Photos from Colombia integration project, February-March, 2000:
- 4 experts from DANE (census office)
- +7 academics (3 universities)
Standard: UN/Eurostat Principles & Recs...
Census documentation compiled for Colombian microdata
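Code-by-code integration can be pictured as a lookup table that maps each census's own codes onto one shared scheme. The sketch below is an assumption-laden illustration: the variable name `marst`, the census labels, and every code value are hypothetical and do not come from the Colombian project or from the UN/Eurostat recommendations.

```python
# Hypothetical integration plan: variable -> census -> {national code: harmonized code}.
INTEGRATION_PLAN = {
    "marst": {
        "census_a": {1: 1, 2: 2, 3: 3},
        "census_b": {0: 1, 1: 2, 2: 2, 3: 3},  # two national codes share one harmonized code
    },
}

# Hypothetical labels for the harmonized codes.
HARMONIZED_LABELS = {1: "single", 2: "married or in union", 3: "widowed"}


def harmonize(variable, census, national_code):
    """Translate one national code into the shared coding scheme."""
    return INTEGRATION_PLAN[variable][census][national_code]


code = harmonize("marst", "census_b", 0)
print(code, HARMONIZED_LABELS[code])
```

Keeping the plan as data rather than as branching logic lets it double as documentation: the table itself records every coding decision.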
16
IPUMSi DISSEMINATES
International web-based access system
» End-User license agreement
- protects privacy and confidentiality
- assures proper use
» User selects
- countries,
- cases,
- variables, and
- samples--makes chronological &/or cross-national research possible using census microdata
» Open architecture software and mirror sites available to all partners
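The selection step can be thought of as a filter over samples, cases, and variables. The sketch below is only an illustration of that idea; the record layout, the function `make_extract`, and the sample values are hypothetical and say nothing about how the actual access system is built.

```python
# Made-up pooled records from three hypothetical samples.
RECORDS = [
    {"country": "CO", "year": 1973, "age": 34, "sex": 2, "urban": 1},
    {"country": "CO", "year": 1985, "age": 12, "sex": 1, "urban": 0},
    {"country": "MX", "year": 1990, "age": 61, "sex": 2, "urban": 1},
]


def make_extract(records, samples, variables, case_filter=lambda r: True):
    """Keep the chosen samples and cases, then project the chosen variables."""
    return [
        {name: record[name] for name in variables}
        for record in records
        if (record["country"], record["year"]) in samples and case_filter(record)
    ]


extract = make_extract(
    RECORDS,
    samples={("CO", 1973), ("CO", 1985)},      # countries and census years
    variables=["year", "age"],                 # variables to keep
    case_filter=lambda r: r["age"] >= 18,      # cases: adults only
)
print(extract)
```

Selecting two census years for one country gives a chronological extract; selecting one year across several countries would give a cross-national one.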
17
153 countries with 1 million + pop. in 2000
2000 round figures are provisional
Population censuses became universal in the 20th century.
Will census microdata ... in the 21st?
18
additional information at: http://www.ipums.org/international
* * * * * *
Thank you
19
Preserving Memory, Increasing Stakeholders
» 1. Introduction: Well-preserved documentation and data --> effective data collection, dissemination, use
» 2. Long-term preservation of documentation and data
» 3. Determining What to Preserve
» 4. Assessing Future Value
» 5. Inventory of available technology/ personnel/ knowledge
» 6. Conclusion: Preserve and make accessible census microdata to enhance value of census (IPUMSi)