Plenary Session 3: IT Best Practices for DOTs
Turning Big Data into Meaningful Information within the Transportation Enterprise
July 11th, 2017
Bruce Aquila, Executive Consultant
Hexagon Safety and Infrastructure
Agenda
1. What is Big Data?
2. Roadblocks to the Usage of Big Data
3. Exploiting Big Data
4. Use Case
5. Conclusion
What is Big Data?
Definition
• Massive Volume of Structured/Unstructured Data
• Difficult to Process with Traditional Database and Software Techniques
• Identified by:
  • Volume: Large volumes of data
  • Variety: Varied types of data
  • Velocity: Speed at which data is processed
  • Veracity: Reliability of the data
  • Value: Meaningful information
Definition courtesy of Margaret Rouse
DOT Big Data (Examples)
• Imagery
• LiDAR
• Sensor
• Live Traffic
• Crash
• 3D CAD
• Public Relations
DOT Sources
Big Data - Roadblocks
Institutional
• Inadequate Staffing
• Lack of Business Sponsorship
• No Governance or Stewardship
• Ambiguous Business Case
• User Engagement
• Investment Shortfall
Cultural
• “We’ve Never Done Business Like This Before!”
• Territoriality
• Political Alliances
• Organizational Realignment
• Business Process Re-design
• Lack of Awareness
• Privacy Concerns
Technology
• Archaic IT Architectures
• Insufficient Physical Infrastructure
• Existing Data Quality
• Lack of Big Data (BD) Technology Knowledge
• Adequate Security
Big Data Exploitation
Big Data Analytics
• The Process of Examining Large and Varied Data Sets to Uncover:
  • Hidden Patterns
  • Unknown Correlations
  • Trends
  • Customer Preferences
  • Other Useful Information
Big Data analytics allows DOTs to make more-informed business decisions.
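As a small illustration of what "uncovering unknown correlations" can mean in practice, the sketch below computes a Pearson correlation coefficient between two series. The function name `pearson` and the sample numbers are assumptions made up for this example; they are not DOT data and not part of the original presentation.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of co-deviations and of squared deviations from each mean.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Made-up illustrative values (hypothetical), not real DOT measurements.
daily_volume = [1200, 1500, 1700, 2100, 2600]
daily_incidents = [1, 2, 2, 3, 4]

r = pearson(daily_volume, daily_incidents)
print(f"correlation between volume and incidents: {r:.3f}")
```

A value of `r` close to 1 or -1 suggests a strong linear relationship worth investigating further; it does not by itself show that one series causes the other.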
Framework – Apache Hadoop
Hadoop – Open-source framework for distributed storage and processing of large datasets on computer clusters
• Scales Up/Down across Machines and Is Designed to Tolerate Hardware Failures
• Written Predominantly in Java
• Modules consist of:
  • Common – Libraries/utilities used by other Hadoop modules
  • Distributed File System (HDFS)
    • Stores data on commodity machines
    • High aggregate bandwidth across the cluster
  • YARN – Manages computing resources for clusters and uses them for scheduling users' applications
  • MapReduce – Implementation of Google's MapReduce programming model for large-scale data processing
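The MapReduce programming model named above can be sketched in a few lines of plain Python. This is a single-process illustration of the map, shuffle, and reduce steps only; it does not use the Hadoop API or run on a cluster. The input format (`station_id,vehicle_count`) and all function names are assumptions chosen for this example.

```python
from collections import defaultdict


def map_phase(records, mapper):
    """Apply the mapper to every record and yield its (key, value) pairs."""
    for record in records:
        yield from mapper(record)


def shuffle(pairs):
    """Group all values by key, as the framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups, reducer):
    """Apply the reducer once per key to that key's list of values."""
    return {key: reducer(key, values) for key, values in groups.items()}


def run_mapreduce(records, mapper, reducer):
    return reduce_phase(shuffle(map_phase(records, mapper)), reducer)


# Hypothetical job: total vehicle count per traffic station.
def count_mapper(line):
    station_id, count = line.split(",")
    yield station_id, int(count)


def sum_reducer(key, values):
    return sum(values)


lines = ["A,10", "B,4", "A,7", "B,1", "C,3"]  # made-up sample input
totals = run_mapreduce(lines, count_mapper, sum_reducer)
print(totals)
```

In a real Hadoop job the map and reduce functions run in parallel on many machines and the shuffle moves data across the network; the sketch keeps only the data flow.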
Use Case – Hawaii DOT
HDOT RIMS Traffic Station Analyzer
• Live Stream Traffic Data
• Raw Data into Formatted Records
• Edit Station Data
• Identify Deviations (False Readings)
• Several Reports:
  • 15 Minute
  • Peak Hours
  • Peak Volume
  • Upstream-Downstream
• Oracle APEX Used to Design Interface and Analytics
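The steps listed above (formatting raw data, flagging false readings, and producing 15-minute and peak-hour reports) can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the HDOT implementation, which the slide says was built with Oracle APEX. The raw line format (`station,timestamp,count`), the median-absolute-deviation rule, the threshold of 5, and all names are hypothetical.

```python
from collections import defaultdict
from datetime import datetime
from statistics import median


def parse_record(line):
    """Turn a raw 'station,timestamp,count' line into a formatted record."""
    station, stamp, count = line.strip().split(",")
    return {"station": station,
            "time": datetime.fromisoformat(stamp),
            "count": int(count)}


def bin_15_minutes(records):
    """Sum counts per station into 15-minute bins keyed by bin start time."""
    bins = defaultdict(int)
    for rec in records:
        t = rec["time"]
        start = t.replace(minute=t.minute - t.minute % 15, second=0, microsecond=0)
        bins[(rec["station"], start)] += rec["count"]
    return dict(bins)


def flag_deviations(values, threshold=5.0):
    """Return indexes of values far from the median (assumed MAD-based rule)."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # No spread at all: anything different from the median is suspicious.
        return [i for i, v in enumerate(values) if v != med]
    return [i for i, v in enumerate(values) if abs(v - med) / mad > threshold]


def peak_hour(bins, station):
    """Return (hour_start, volume) for the busiest clock hour at a station."""
    hourly = defaultdict(int)
    for (sid, start), volume in bins.items():
        if sid == station:
            hourly[start.replace(minute=0)] += volume
    return max(hourly.items(), key=lambda item: item[1])


# Made-up sample feed (hypothetical), not real station data.
raw = [
    "S1,2017-01-02T07:02:00,40",
    "S1,2017-01-02T07:10:00,45",
    "S1,2017-01-02T07:20:00,50",
    "S1,2017-01-02T08:05:00,120",
    "S1,2017-01-02T08:40:00,110",
]
records = [parse_record(line) for line in raw]
bins = bin_15_minutes(records)
busiest = peak_hour(bins, "S1")
suspect = flag_deviations([50, 52, 49, 51, 900, 50])
print(busiest, suspect)
```

The deviation rule here compares each reading against the median of its neighbors rather than the mean, so a single extreme false reading does not distort the baseline it is judged against.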
Conclusion
Chief Takeaways
• All DOTs Face the Big Data (BD) Issue
• IT Groups Understand Challenges
• Can Assist in Removing Silos
• BD Success is ACHIEVABLE!
• Need Business Visionary/Champion
• New Thought Paradigm Required
• Technology Platforms Can Sustain BD
• Pick the Right “Jump Off” Project
Thank You