Big Data with Data Virtualization (session 3 from Packed Lunch Webinar Series)
Transcript of Big Data with Data Virtualization (session 3 from Packed Lunch Webinar Series)
© 2013 Denodo Technologies
6 Sessions Covering Key Data Integration Challenges Solved with Data Virtualization
Session 3: Leveraging Big Data in Your Enterprise with Data Virtualization
Topics covered:
■ Making BI agile
■ Integrating Big Data
■ Combining SOA and Data Integration
■ Enhancing and Extending MDM and DW
■ Creating a single view of your customer
Today’s Speakers
■ Paul Moxon, Senior Director, Product Management
■ Tom Forman, Senior Solutions Consultant
Agenda
■ Introduction to Big Data
■ Big Data Uses and Challenges
■ Increasing the Impact of Big Data using Data Virtualization
■ Data Virtualization & Big Data - Product Demonstration
■ Summary
■ Q&A and Next Steps
Why is Big Data so Important?
■ Technology is democratizing storage and processing of data
■ Newfound ability to extract value from data at low cost
■ Must be done in the context of the business
  • Integrated into existing business processes, decision-making systems, etc.
■ Unified, integrated view of ALL enterprise data
  • Regardless of source or structure
What is Big Data?
Many definitions…
■ 3 V’s – Volume, Velocity, Variety
■ 4 V’s = 3 V’s AND (Value OR Variability OR Veracity)
■ Forrester:
“Big Data is the frontier of a firm’s ability to store, process, and access all of the data it needs to operate, make decisions, reduce risks, and serve customers”
Big Data – 4 Stage Process
Acquire → Store → Process → Disseminate
Focus of ‘Big Data’ products
Focus of Data Virtualization
Big Data Sources and Stores
Sources
■ Web data
  • Social media (Twitter, Facebook, blog posts, etc.)
  • Clickstreams
  • ‘Open Government’
  • Audio/Video data
  • …
■ Internet of Things
  • Sensor data
  • Web logs
  • Geo-location data
  • RFID data
  • Network logs
  • Streaming data
  • …
Stores
■ Hadoop & derivatives
  • Apache Hadoop
  • Cloudera
  • Hortonworks
  • Amazon EMR
  • IBM InfoSphere BigInsights
  • …
■ NoSQL databases
  • Apache Cassandra
  • MongoDB
  • CouchDB
  • Neo4J
  • …
■ Data Warehouses
  • MPP EDW
  • Teradata, etc.
  • …
Big Data Uses – Analytics and Low Cost Storage
[Chart: Business Value plotted against Difficulty To Implement. Stages shown: Reporting, Analysis, Monitoring, Prediction, Prescriptive. Labels: $ and $$$$.]
Prediction Use Cases:
  • Fraud prevention
  • Recommendation engines
  • Weather predictions
  • etc.
Challenges for Big Data
Which of the following challenges prevent your organization from making better use of Big Data?
54%: Managing and integrating data from a variety of sources
50%: Ensuring data quality from a variety of sources
42%: Getting staffing and management commitments for projects
38%: Communicating and interpreting results of analytics
37%: Finding the right kind of talent
Source: Forrsights BI/Big Data Survey Q3, 2012
The Bad News…
Most data is in silos of applications, databases, data warehouses, file systems, or not even captured…
and it’s only getting worse!
Denodo Platform Architecture
[Architecture diagram, three layers from top to bottom]
Data Consumers
■ Enterprise Applications, ESB
■ Reporting, BI, Portals
■ Mobile, Web, Users
Data Virtualization (Denodo Platform)
■ Publish: Real-time (Right-time) Data Services
■ Combine: Transform, Improve Quality, Integrate
■ Connect: Normalized Views of Disparate Data
■ Design Tools, Optimizer, Cache, Scheduler, Monitoring, Governance, Metadata, Security
■ Library of Wrappers, Web Automation, Any Data or Content, Read and Write
Data Sources (from more structured to less structured)
■ Databases & Warehouses
■ Enterprise Applications
■ Cloud / SaaS Applications
■ XML, Excel, Flat Files
■ Big Data, NoSQL
■ Web 2.0, Social Media
■ PDF, Docs, Index, Emails
Side labels
■ Business Solutions: Access Information-as-a-Service
■ Denodo Platform: Right Information at the Right Time
■ Disparate Data: Any Source, Any Format
■ Multiple Protocols, Formats
■ Linked Data Services
■ Query, Search, Browse
■ Request/Reply, Event Driven
■ Secure Delivery
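The Connect, Combine and Publish layers named on this slide can be illustrated with a small sketch. It is not the Denodo Platform or its API: the two sources (an in-memory SQLite table and a JSON document list), the function names `connect`, `combine` and `publish`, and the sample data are all assumptions made up for illustration.

```python
import json
import sqlite3

# Source 1 (more structured): a relational table, here an in-memory SQLite database.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
db.executemany("INSERT INTO customers VALUES (?, ?)", [(1, "Acme"), (2, "Globex")])

# Source 2 (less structured): JSON documents, as a document store might return them.
docs = json.loads('[{"customer_id": 1, "mentions": 12}, {"customer_id": 2, "mentions": 3}]')


def connect():
    """Connect: expose each source as rows with a normalized shape."""
    cursor = db.execute("SELECT id, name FROM customers ORDER BY id")
    customers = [{"id": i, "name": n} for i, n in cursor]
    mentions = [{"id": d["customer_id"], "mentions": d["mentions"]} for d in docs]
    return customers, mentions


def combine(customers, mentions):
    """Combine: join the normalized rows on the shared key."""
    by_id = {m["id"]: m["mentions"] for m in mentions}
    return [{**c, "mentions": by_id.get(c["id"], 0)} for c in customers]


def publish(rows):
    """Publish: serialize the combined view for a consumer."""
    return json.dumps(rows, sort_keys=True)


payload = publish(combine(*connect()))
```

The consumer only sees `payload`; it does not need to know that one half of each row came from a SQL table and the other from JSON documents.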
Navteq – Location Based Services with Big Data
Combine EDW, Cloud, data feeds and real-time social media
[Diagram labels: Denodo Platform (Data Virtualization | Data Services | Web ETL); Batch and Real Time data flows; Location Content Management System; Smart API – Core Widgets; User Data Store Service; POIs; Web, Devices | Users | Partners | App Builders]
Making Hadoop Map/Reduce Accessible
[Diagram components: Map-reduce job, HDFS File, Command connector, HDFS Files connector, Denodo Cache, ViewA]
An end user issues a SELECT * from ViewA
1. The command connector issues the Map-Reduce job to the Hadoop cluster
2. The job generates a file with the results
3. Denodo reads the generated file
4. The results are returned to the client
5. This data can be cached
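The five-step flow on this slide can be sketched in a few lines of Python. This is only an illustration of the pattern, not Denodo's connectors: `run_job`, `read_results` and `FileBackedView` are hypothetical names, and a local word count stands in for the Map-Reduce job so that the sketch runs without a Hadoop cluster.

```python
import csv
import os
import tempfile
from collections import Counter


def run_job(input_lines, output_path):
    """Stand-in for steps 1-2: a word-count 'job' that writes its results to a file.

    In a real deployment this would be a command that submits a Map-Reduce job
    to the cluster; here it runs locally so the sketch is self-contained.
    """
    counts = Counter(word for line in input_lines for word in line.split())
    with open(output_path, "w", newline="") as f:
        writer = csv.writer(f, delimiter="\t")
        for word, n in sorted(counts.items()):
            writer.writerow([word, n])


def read_results(output_path):
    """Stand-in for step 3: read the generated file into rows."""
    with open(output_path, newline="") as f:
        return [(word, int(n)) for word, n in csv.reader(f, delimiter="\t")]


class FileBackedView:
    """Hypothetical view: querying it runs the job, reads the file, caches the rows."""

    def __init__(self, input_lines):
        self.input_lines = input_lines
        self.cache = None
        self.job_runs = 0

    def select_all(self):
        if self.cache is None:  # step 5: reuse cached results when present
            fd, path = tempfile.mkstemp(suffix=".tsv")
            os.close(fd)
            try:
                run_job(self.input_lines, path)  # steps 1-2
                self.job_runs += 1
                self.cache = read_results(path)  # step 3
            finally:
                os.remove(path)
        return list(self.cache)  # step 4: return the rows to the client


view_a = FileBackedView(["big data", "data virtualization", "big views"])
rows = view_a.select_all()
rows_again = view_a.select_all()
```

The second call to `select_all` is answered from the cache, so the job runs only once. A real cache would also need an expiry or refresh policy, which this sketch leaves out.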
Other NoSQL – Denodo integrations
■ CouchDB
  • Document-based database (JSON) with eventual consistency
■ MongoDB
  • Document-based database (BSON) with dynamic schema
■ Neo4J
  • Open source graph database written in Java
■ Apache Cassandra
  • Hybrid column/key-value pair based database
The Promise (and Challenges) of Big Data
Source: Forrsights BI/Big Data Survey Q3, 2012
Big Data & Data Virtualization Benefits - Summary
■ Leverage ALL of Your Data
  • Expose all data across your organization
  • Break down data silos
  • Include Web/Cloud, Big Data, unstructured
■ Lower Cost & Increased Agility
  • Lower integration costs by 80%
  • Flexibility to Change
  • Real-time (Right-time) Data Services
■ Fast Time to Solution
  • Projects in 4-6 weeks
  • ROI in <6 months
  • Adds new IT and Business capabilities
Data Virtualization & Big Data – Next Steps
Move forward at your own pace
■ DBTA Best Practices: Data Virtualization is Vital for Maximizing Your NOSQL and Big Data Investment
  http://www.denodo.com/en/resources/solution_briefs/dbta_dv_nosql_bigdata_2012/index.php
■ Attend Packed Lunch Webinar Sessions – Covering Data Virtualization for…
  • Making BI Agile
  • Integrating Big Data
  • Combining SOA and Data Integration
  • Enhancing and Extending MDM and DW
  • Creating a Single View of Your Customer
Move forward with one of our Data Virtualization experts
Phone: (+1) 877-556-2531 (NA) | Email: [email protected]
Phone: (+44) (0)20 7869 8053 (EMEA)
www.denodo.com
Next Session:
Capitalizing on your SOA with Data Virtualization
Wednesday, July 23rd 2014
Thank You!