Using Data Science as Evidence in Public Policy With Big Data and Elections Dr. Brand Niemann...
1
Using Data Science as Evidence in Public Policy With Big Data and Elections
Dr. Brand Niemann
Director and Senior Enterprise Architect – Data Scientist
Semantic Community
http://semanticommunity.info/
AOL Government Blogger
http://gov.aol.com/bloggers/brand-niemann/
November 1-2, 2012
http://semanticommunity.info/CNSTAT
2
Start by Asking Questions
• Which data are by State, which by Congressional District, and which by time?
• Which is the easiest to reformat?
• Which is the most interesting?
• Where have the candidates been?
• Which data is free?
• Etc.
Note: Drew Conway (@drewconway) speaking about the joys, challenges, and power of data science: "Data science, as a discipline, is fundamentally about human behavior." http://semanticommunity.info/AOL_Government/2012_Recorded_Future_User_Conference
3
Then Look for the Evidence
• Brainstorm:
  – What Have I Done Before?
• 2012 Annual Statistical Abstract:
  – Chapter 7. Elections
• Google Searches:
  – Election and Voting Data
• Conferences:
  – National Academy Seminars
• Television:
  – Debates, etc.
4
Begin With the End In Mind (Stephen Covey)
• Story (publicity and money)
• Research Notes (document what I did and learned)
• Conditioned Data Sets (added value)
• Spotfire Dashboard (cool visualizations)
• Lecture to Students at George Mason University (help them learn what a data scientist/data journalist does)
5
My 5-Step Method
• What I like to do to illustrate (data science) and explain (data journalism) is the following (like a recipe):
  – Put the Best Content into a Knowledge Base (e.g. MindTouch)
    • The 2012 Annual Statistical Abstract, CNSTAT, etc.
  – Put the Knowledge Base into a Spreadsheet (Excel)
    • Linked Data to Subparts of the Knowledge Base
  – Put the Spreadsheet into a Dashboard (Spotfire)
    • Data Integration and Interoperability Interface
  – Put the Dashboard into a Semantic Model (Excel)
    • Data Dictionaries and Models
  – Put the Semantic Model into Dynamic Case Management (Be Informed)
    • Structured Process for Updating Data in the Dashboard
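The "Data Dictionaries and Models" step in the recipe above can be illustrated with a minimal Python sketch. It is only an assumption about what such a dictionary might record (column name, inferred type, missing count, one example value); the slides do this work in Excel, and the sample rows, column names, and helper functions below are hypothetical and not taken from the Statistical Abstract tables.

```python
import csv
import io

# Hypothetical sample rows standing in for one worksheet exported as CSV.
# Column names and values are illustrative only, not from the source tables.
SAMPLE_CSV = """state,year,electoral_votes,winner_party
Alpha,2008,10,Party A
Beta,2008,3,Party B
Gamma,2008,,Party A
"""


def infer_type(values):
    """Classify a list of non-empty strings as 'integer', 'number', or 'text'."""
    def all_parse(cast):
        try:
            for value in values:
                cast(value)
            return True
        except ValueError:
            return False

    if not values:
        return "empty"
    if all_parse(int):
        return "integer"
    if all_parse(float):
        return "number"
    return "text"


def build_data_dictionary(csv_text):
    """Return one entry per column: name, inferred type, missing count, example."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    entries = []
    for column in reader.fieldnames or []:
        present = [r[column] for r in rows if r[column] not in ("", None)]
        entries.append({
            "column": column,
            "type": infer_type(present),
            "missing": len(rows) - len(present),
            "example": present[0] if present else None,
        })
    return entries


data_dictionary = build_data_dictionary(SAMPLE_CSV)
for entry in data_dictionary:
    print(entry)
```

A dictionary like this could be kept as an extra worksheet next to the data, so that each dashboard column has a documented meaning and type.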
7
2012 Annual Statistical Abstract: Chapter 7. Elections (Visualizations)
http://semanticommunity.info/FedStats.net#Section_7_ELECTIONS
8
2012 Annual Statistical Abstract: Chapter 7. Elections (Metadata)
http://semanticommunity.info/FedStats.net#Section_7._Elections
9
FedStats.net: Commemorating over 135 years of making statistics available to citizens everywhere
http://semanticommunity.info/FedStats.net#Story
10
FedStats.gov Remains Rich Source Of Government Data For Citizens
http://gov.aol.com/2012/07/26/fedstats-gov-remains-rich-source-of-government-data-for-citizens/
11
2012 Annual Statistical Abstract
http://www.census.gov/compendia/statab/
12
Data From CD-ROM to My Server
http://semanticommunity.net/StatAbs2012/
13
Spreadsheet
http://semanticommunity.info/@api/deki/files/19606/Elections2012.xls
14
Welcome to the Campaign 2012 Interactive Dashboard
http://campaign2012.c-span.org/electoral-college-map
My Note: Not like the next slide!
15
CNN Electoral Map
http://www.cnn.com/ELECTION/2012/ecalculator
16
CNN Electoral Map in Excel
http://semanticommunity.info/@api/deki/files/19606/Elections2012.xls
17
CNN Electoral Map in Spotfire
https://silverspotfire.tibco.com/us/library#/users/bniemann/Public?Elections-Spotfire
18
Data Set Inventory and Results
http://semanticommunity.info/CNSTAT#Story
19
2012 Annual Statistical Abstract Election Tables Metadata
https://silverspotfire.tibco.com/us/library#/users/bniemann/Public?Elections-Spotfire
20
Table 397. Participation in Elections for President and U.S. Representatives and Table 402. Vote Cast for President, by Major Political Party
https://silverspotfire.tibco.com/us/library#/users/bniemann/Public?Elections-Spotfire
21
Table 405. Electoral Vote Cast for President by Major Political Party--States
https://silverspotfire.tibco.com/us/library#/users/bniemann/Public?Elections-Spotfire
22
Table 408. Apportionment of Membership in House of Representatives, by State
https://silverspotfire.tibco.com/us/library#/users/bniemann/Public?Elections-Spotfire
23
Table 410. Vote Cast by Congressional Districts: 2010
https://silverspotfire.tibco.com/us/library#/users/bniemann/Public?Elections-Spotfire
24
Cover Page
https://silverspotfire.tibco.com/us/library#/users/bniemann/Public?Elections-Spotfire
25
Conclusions and Suggestions
• I had the pleasure of attending three very interesting and related professional statistical meetings recently that showed that statisticians really care about current issues.
• This made me appreciate that elections are a big data problem that is approached in three basic ways: historical elections data, collection and modeling of polling survey data before the election, and use of social media.
• So I inventoried the historical and polling survey data (that I could get) to aid in selection and visualization in a dashboard, and found I needed both Congressional and State boundary files, as shown in a table.
• So imagine an election season in which we had fewer or no polls to influence voters, so they could focus on the candidates and the issues. Then we would get an amazing example of big data processing just after the polls closed (by gentleman's agreement with Congress), which we could all participate in by seeing the precinct voting results posted to Twitter and processed by the many apps that developers had built to bring us interesting and useful results. I am eager to see that happen in 2014 and 2016!
• I will be updating these results with the final 2012 elections data and providing another story.
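The point above about needing both Congressional and State boundary files can be sketched in Python: before mapping, each data row has to match a key in the boundary file for its geographic level. This is a minimal illustration under assumptions, not the Spotfire workflow used in the slides; the key formats, sample rows, and the function name `split_by_boundary_match` are all hypothetical.

```python
# Hypothetical boundary identifiers: state rows keyed by a two-letter code,
# district rows keyed by "ST-NN". None of these come from the source tables.
state_boundary_keys = {"AA", "BB", "CC"}
district_boundary_keys = {"AA-01", "AA-02", "BB-01"}

boundaries_by_level = {
    "state": state_boundary_keys,
    "district": district_boundary_keys,
}

# Hypothetical data rows to be drawn on a map.
records = [
    {"level": "state", "key": "AA", "value": 1},
    {"level": "state", "key": "DD", "value": 2},
    {"level": "district", "key": "AA-02", "value": 3},
    {"level": "district", "key": "BB-02", "value": 4},
]


def split_by_boundary_match(rows, boundaries):
    """Split rows into those with a matching boundary key and those without."""
    matched, unmatched = [], []
    for row in rows:
        known_keys = boundaries.get(row["level"], set())
        if row["key"] in known_keys:
            matched.append(row)
        else:
            unmatched.append(row)
    return matched, unmatched


matched, unmatched = split_by_boundary_match(records, boundaries_by_level)
print("mappable:", [r["key"] for r in matched])
print("no boundary found:", [r["key"] for r in unmatched])
```

Rows in the unmatched list would need a corrected key or an additional boundary file before they could appear on a map.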
26
Extra Slides
• Boundary Files:
  – US States Repositioned
  – US Counties Repositioned
  – US Congressional Districts 1
  – US Congressional Districts 2
• Sources:
  – Spotfire
    • https://silverspotfire.tibco.com/us/library
  – US Census
    • http://www.census.gov/cgi-bin/geo/shapefiles2010/main
27
US States Repositioned
https://silverspotfire.tibco.com/us/library#/users/bniemann/Public?Elections-Spotfire
28
US Counties Repositioned
https://silverspotfire.tibco.com/us/library#/users/bniemann/Public?Elections-Spotfire
29
US Congressional Districts 1
https://silverspotfire.tibco.com/us/library#/users/bniemann/Public?Elections-Spotfire
30
US Congressional Districts 2
https://silverspotfire.tibco.com/us/library#/users/bniemann/Public?Elections-Spotfire