1
Why Doesn't EPA Have a Self-Contained Statistical Unit?:
A Tribute to Doug Engelbart
Dr. Brand Niemann
Director and Senior Data Scientist
Semantic Community
http://semanticommunity.info/
AOL Government Blogger
http://breakinggov.com/author/brand-niemann/
July 8, 2013
3
Preface
• Doug Engelbart had a strong influence on my professional work for the US Government:
– It started with his participation in our Federal CIO Council Interagency Collaboration Expedition Workshops with Wikis.
– It continued with my building a Dynamic Knowledge Repository for OMB after his Bootstrapping Innovation - Putting Vision to Practice Paradigm.
– It finished with an invitation to visit his home and provide a ride to his doctor for a check-up.
4
Purpose
• Add another building block to my Dynamic Knowledge Repository Ecosystem as a tribute to Doug Engelbart.
• Use the recent 5th edition of Principles and Practices For A Federal Statistical Agency as the core of an expert knowledge base.
• Answer the question, after all this time and effort: Why Doesn't the US EPA Have a Self-Contained Statistical Unit?
• Show what can be done with US EPA and Scotland’s environment data in visualizations that the US EPA, OMB, and Scotland want.
5
Some of My Principles and Practices
• Start With the End in Mind (Stephen Covey)
– A good visualization depends more on the data and its creator than the tool (Edward Tufte)
• Tool Wars Can Impede the Use of Content Management and Visualizations for Decision Making (Brand Niemann)
– Encourage all tools to support interoperability (reuse) and “treat all content as data” (Dominic Sale)
• A Well-designed Spreadsheet That Can be “Dragged and Dropped” Onto a Tool That Creates Statistics and Visualizations in the Public and Private Clouds is the “Killer App” (Brand Niemann)
– This is why I used Silver Spotfire at the US EPA and now for European, Japanese, and US applications, but this can be done with other tools – they just take longer in my experience.
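As a rough illustration of the "well-designed spreadsheet" idea, the sketch below assumes a sheet exported as CSV with a single header row and one observation per row. The column names (Year, Region, Value) and the numbers are invented for the example and do not come from any EPA or Scottish data set. Any tool, here plain Python, can then compute grouped statistics with no cleanup step.

```python
import csv
import io
import statistics
from collections import defaultdict

# Hypothetical sample data: one header row, one observation per row.
SAMPLE = """Year,Region,Value
2010,North,4.0
2010,South,6.0
2011,North,5.0
2011,South,7.0
"""

def summarize(text, group_col, value_col):
    """Group rows by group_col and return {group: (count, mean)} for value_col."""
    groups = defaultdict(list)
    for row in csv.DictReader(io.StringIO(text)):
        groups[row[group_col]].append(float(row[value_col]))
    return {g: (len(v), statistics.mean(v)) for g, v in groups.items()}

summary = summarize(SAMPLE, "Region", "Value")
for region, (n, mean) in sorted(summary.items()):
    print(region, n, mean)
```

The point of the layout is that the same file works unchanged in a visualization tool, a statistics package, or a short script like this one.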
6
A Well-designed Spreadsheet
http://www.scotland.gov.uk/Resource/0040/00400791.xls
7
A New, Innovative Way to Display Water Quality Information
URL
8
Scotland’s Environment: Homepage
http://www.environment.scotland.gov.uk/default.aspx
My Note: It starts with finding the statistics and their metadata and then producing a data story supported by data products. This is what a data scientist – data journalist does!
9
Scotland’s Environment: Trends and Indicators
http://www.environment.scotland.gov.uk/trends_and_indicators.aspx
10
The Scottish Government Environmental Statistics
http://www.scotland.gov.uk/Topics/Statistics/Browse/Environment
11
“Drag and Drop” Onto a Tool
Open File
Open From Library
Add Data Tables
Add On-Demand Data Table
Add Data Connection
12
Creates Statistics and Visualizations in the Public and Private Cloud
13
Get a Data Story Idea
• In the 5th edition of Principles and Practices For A Federal Statistical Agency, under Principal Statistical Agencies it says:
– This section provides information—primarily from agency
websites (see Appendix E) and OMB publications—on 13 of the 14 members of the ICSP, excluding only the Office of Environmental Information in the Environmental Protection Agency, which is not a self-contained statistical unit. The information provided for the 13 agencies includes origins, authorizing legislation or other authority, status of head (presidential appointee, career senior executive service official), budget and full-time permanent staffing levels in 2012 (see U.S. Office of Management and Budget, 2012b: Table 1 and App. B), and principal programs. The agencies are discussed in alphabetical order.
14
Add Your Personal Experience
• I worked in EPA's Environmental Statistics Division and compiled a knowledgebase of their activities. Earlier I worked in the EPA Center for Environmental Statistics, which was trying to become a Bureau of Environmental Statistics, and produced an EPA Ontology State of the Environment Report.
• While working in the EPA Center for Environmental Statistics, I helped produce the EPA Guide to Selected National Environment Statistics in the US Government and the Guide to Global Environmental Statistics. I received the EPA Bronze Medal for the former in 1993.
15
Add Your Personal Opinion
• Since Congress never allowed EPA to have a Bureau of Environmental Statistics, and since the Office of Environmental Information in the Environmental Protection Agency would never allow the Environmental Statistics Division to become a self-contained statistical unit, I decided to spend the rest of my EPA career being a data scientist and applying my statistics and data architecture expertise to analyzing and visualizing as many EPA and government data sets as possible using the premier tool based on S-Plus and Spotfire, called Spotfire by TIBCO.
• This turned out to be very visionary because now the statistical agencies (e.g. Census) and OMB are actively looking to apply state-of-the-art tools to provide a lot of federal data to analysts and empower them to use a visualization tool to derive new understandings. See:
– http://semanticommunity.info/Data_Science/Free_Data_Visualization_and_Analysis_Tools
16
Bring In More Ideas and Data Sets
http://blog.epa.gov/science/2013/06/epa-scientists-presented-open-science-at-white-house/
My Note: This article contains links to data sets that I am using.
17
EPA Scientists Used These Data Sets
http://epa.gov/comptox/
My Note: These are the data sets and metadata in the article.
18
EPA Provides These Open Data Sets
http://www2.epa.gov/open
My Note: I am mining these data sets.
19
EPA Just Received Recognition For Their GeoPlatform
• Recent Tweet: EPA GeoPlatform got a @ComputerWorld award for collaboration: http://www.eiseverywhere.com/ehome/49069/83917/?& …
– https://twitter.com/DruidSmith/status/351786541049331712
• This is an opportunity to make it even more collaborative (reusable) and Digital Government Strategy Compliant!
20
US EPA Environmental Dataset Gateway Download
https://edg.epa.gov/data/
My Note: This is difficult for the public to use and not “content as data”.
21
EDG Well-Designed Spreadsheet
http://semanticommunity.info/@api/deki/files/24897/EPAOpenGovernmentData.xlsx
My Note: This is a Linked Open Data version of the EPA’s Geospatial Data that supports faceted search!
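For readers unfamiliar with faceted search, here is a minimal sketch of the idea over spreadsheet-style rows. The records and field names (title, topic, format) are hypothetical and are not taken from the actual EPAOpenGovernmentData.xlsx file. Each column acts as a facet: count the values to show the choices, then filter on a choice to narrow the list.

```python
from collections import Counter

# Hypothetical catalog rows; titles and field names are invented for illustration.
RECORDS = [
    {"title": "Dataset A", "topic": "Water", "format": "Shapefile"},
    {"title": "Dataset B", "topic": "Air", "format": "CSV"},
    {"title": "Dataset C", "topic": "Water", "format": "CSV"},
]

def facet_counts(records, field):
    """Count how many records carry each value of the given field."""
    return Counter(r[field] for r in records)

def filter_by(records, **facets):
    """Keep only the records that match every requested facet value."""
    return [r for r in records if all(r.get(k) == v for k, v in facets.items())]

water = filter_by(RECORDS, topic="Water")
print(facet_counts(RECORDS, "topic"))   # choices offered for the topic facet
print(facet_counts(water, "format"))    # remaining format choices after narrowing
```

Because every row has the same columns, each new column added to the spreadsheet becomes another facet without any change to the code.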
22
EDG Visualizations: Bar Charts
https://silverspotfire.tibco.com/us/library#/users/bniemann/Public?EPAOpenGovernmentData-Spotfire
My Note: One can use this to assess Agency performance and prioritize data analyses.
23
EDG Visualizations: Map Chart
My Note: Dynamically linked adjacent visualizations.
https://silverspotfire.tibco.com/us/library#/users/bniemann/Public?EPAOpenGovernmentData-Spotfire
24
Build a Knowledge Base in MindTouch
http://semanticommunity.info/CNSTAT/Principles_and_Practices_for_a_Federal_Statistical_Agency
My Note: This is Digital Government Strategy Compliant!
25
Build a Knowledge Base Index in a Spreadsheet
http://semanticommunity.info/@api/deki/files/24897/EPAOpenGovernmentData.xlsx
My Note: This is Linked Open Data and makes unstructured content structured so “all content is data” and federated search can be done across everything!
26
Some Conclusions and Recommendations
• Doug Engelbart knew how to work with people and technology.
• The recent 5th edition of Principles and Practices For A Federal Statistical Agency contains core subject matter expertise for working with government data to support decision making.
• The US EPA and many other government agencies do not have “self-contained statistical units”, but they can make better use of visualizations of their data to support decision making, as Scotland does.
• Start With the End in Mind, Avoid Tool Wars, and Develop Well-designed Spreadsheets That Can be “Dragged and Dropped” Onto a Tool That Creates Statistics and Visualizations in the Public and Private Clouds.