U.S. Department of the Interior
U.S. Geological Survey
Next Generation Data Integration Challenges
National Workshop on Large Landscape Conservation
Sean Finn, USFWS; Jim Strittholt, CBI; Tim Kern, USGS
October 2014
Data Integration Now and Tomorrow
Not just a technical concept
Interaction of people, policy, and technology
Critical at every scale – local, regional, and national
Data Integration Challenges
Human Factors Challenges
Data collected for different missions may have custom and mission-specific vocabularies
Expensive to deal with varied data formats and types (extraction, transform, and load)
Inconsistent spatial representations can confuse visualization and analysis efforts
Complex schemas can make it hard to pull out project-appropriate data
Inadequate metadata and unclear provenance
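The vocabulary and format challenges above can be made concrete with a small sketch. The field names, crosswalk table, and sample records below are invented for illustration and are not taken from any USGS, USFWS, or CBI dataset; a real effort would draw its mappings from an agreed community vocabulary.

```python
# Hypothetical crosswalk: mission-specific field names -> shared vocabulary.
CROSSWALK = {
    "spp_code": "species",
    "taxon": "species",
    "obs_dt": "observed_on",
    "date_observed": "observed_on",
}


def harmonize(record, crosswalk=CROSSWALK):
    """Return a copy of record with known field names mapped to shared terms.

    Fields without a mapping keep their original names so nothing is dropped.
    """
    return {crosswalk.get(key, key): value for key, value in record.items()}


# Two made-up records describing the same kind of observation
# with different mission-specific field names.
survey_a = {"spp_code": "greater sage-grouse", "obs_dt": "2014-05-01"}
survey_b = {
    "taxon": "greater sage-grouse",
    "date_observed": "2014-06-12",
    "observer": "field crew 2",
}

combined = [harmonize(survey_a), harmonize(survey_b)]
```

After harmonization both records share the `species` and `observed_on` keys, so they can be loaded into one table; the unmapped `observer` field is carried through unchanged.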
Human Factors Approaches
Provide support for efforts at all scales
Establish best practices and standards for cooperating data catalogs and repositories
Establish connections across organizational bounds
Develop “share spaces” for tools and analysis methods
Provide examples of different integration techniques and approaches (“successes”)
Success Stories
Policy and Economic Challenges
Metadata policy contradicts data.gov policy
Data release policies inconsistent with Project Open Data
Skills mismatches
Infrastructure needs, funding
Resource plans rarely consider long-term expenses due to new security policies
User expectations
Policy and Economic Approaches
Infrastructure: commercial and government clouds; open compute cluster access
Staff: MOOCs; communities of interest
Resources: open source projects; free or lower-cost collaboration sites
Policy and Economic Approaches
User expectations: agile project development (involve scientists and not tech staff); infrastructure availability; more ready-to-use toolkits available
Policy: current retirement rate; increased reliance on external sources
Technical Challenges
Big Data: capture, analysis, presentation, access
Discovery
Data lifecycle and long-term protection
New sources and types of data
Technical Approaches
Parallelization and federation
Bring tools to data, not data to tools
Design for cloud
New software frameworks
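One way to picture "bring tools to data" together with parallelization and federation is the minimal sketch below. The repository names and values are hypothetical in-memory stand-ins for remote repositories, and `local_summary` and `federated_mean` are illustrative names, not part of any existing toolkit. The idea is that each repository computes a small summary where the data lives and only that summary travels to the coordinator.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical in-memory stand-ins for remote repositories.
REPOSITORIES = {
    "repo_1": [3.0, 5.0, 7.0],
    "repo_n": [10.0, 20.0],
}


def local_summary(values):
    """Runs next to the data and returns only a small partial result."""
    return {"count": len(values), "total": sum(values)}


def federated_mean(repositories):
    """Send the summary to every repository in parallel, then merge."""
    with ThreadPoolExecutor() as pool:
        partials = list(pool.map(local_summary, repositories.values()))
    count = sum(p["count"] for p in partials)
    total = sum(p["total"] for p in partials)
    # No data anywhere: report None rather than dividing by zero.
    return total / count if count else None


mean_value = federated_mean(REPOSITORIES)
```

Only the per-repository counts and totals cross the network in this design, so the raw records never have to be copied to a central site.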
Technical Approaches
[Diagram: repositories Repo 1 through Repo n]
Integrated Data Management Network
Delivered a Roadmap to Interoperability, user support, and training
Provided project tracking and sharing options
Established open-source projects
Cataloged tools
Demonstrated and shared data visualization, processing, and access capabilities
Built a community that can help address these future challenges