Haven 2 0
© Copyright 2013 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice.
HP HAVEn
Big Data Use Cases
Mikolaj Nietz, Solution Architect
Application Services Global Delivery,
Hewlett-Packard
The changing Big Data landscape
Human Information
Machine Data
Business Data
10% of Information
90% of Information
Annual Growth
~100%
~10%
Imagine if you could…
Interact with and process 100% of your data seamlessly
Transactional data, Social media, Images, Audio, Video, Mobile, Email, Texts, Documents
In-memory, Hadoop
Standard APIs and tools
Dashboards & alerts, Business intelligence, Your custom apps, Packaged apps
Ingest, Analyze, Understand
Machine Data, Business Data, Human Information
Open connectors
HAVEn Big Data Platform
Social media, IT/OT, Images, Audio, Video, Transactional data, Mobile, Search engine, Email, Texts
Hadoop/HDFS: Catalogue massive volumes of distributed data
Autonomy IDOL: Process and index all information
Vertica: Analyze at extreme scale in real time
Enterprise Security: Collect & unify machine data
n Apps: Powering HP Software + your apps
Documents
hp.com/Haven
Why HAVEn?
Hadoop
Autonomy IDOL
Vertica
Enterprise Security (HP ArcSight)
n – a number of other apps
"Safe Haven" (in Polish: "Bezpieczna Przystań")
HP HAVEn/Big Data Reference Architecture
Rich-media data
Unstructured text data
Mixed-structure data
Unknown-structure data
Semi-structured text data
Structured text data
ODS
EDW
Data marts
Hadoop
HDFS
MapReduce
Data integration
Not Only SQL
Analytics
Operational mgt.
Access-in-place
Meaning-based analytics (Autonomy IDOL)
Autonomy value-add applications
BI/Visualization tools
Analytic tools
Lightweight ETL
Hadoop Extended Tools
Access-in-place
Indexed metadata
Vertica Analytics RDBMS
Native analytics
UDx extensions
R functions
Access-in-place
Indexed metadata
WWW
Apache Hadoop
Open source Linux-based platform for data storage and processing that is scalable, fault tolerant, and distributed
Has flexibility to store and mine any type of data
• Query previously inaccessible structured and unstructured data
• Not bound by a single schema
Excels at processing complex data
• Scale-out architecture divides workloads across multiple nodes
• Flexible file system eliminates ETL bottlenecks
Scales economically
• Deployable on commodity hardware
• Open source platform guards against vendor lock-in
Core Hadoop system components (workloads)
• Hadoop Distributed File System (HDFS): self-healing, high-bandwidth clustered storage
• MapReduce: distributed computing framework
Like Linux, there are several distributions of Hadoop
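To make the MapReduce model named on this slide more concrete, here is a minimal single-process sketch in Python. It does not use Hadoop at all; the function names (map_phase, shuffle, reduce_phase, word_count) and the sample log lines are assumptions made for illustration only. On a real cluster the map and reduce steps would run in parallel on many nodes and the grouping step would be done by the framework.

```python
from collections import defaultdict


def map_phase(record):
    """Map step: emit a (word, 1) pair for every word in one input record."""
    for word in record.lower().split():
        yield word, 1


def shuffle(pairs):
    """Group all emitted values by key (done by the framework on a real cluster)."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce step: collapse all values for one key into a single count."""
    return key, sum(values)


def word_count(records):
    """Run map, shuffle, and reduce in sequence over an iterable of text records."""
    pairs = (pair for record in records for pair in map_phase(record))
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


if __name__ == "__main__":
    # Hypothetical log lines, used only to exercise the sketch.
    logs = ["error disk full", "warning disk slow", "error network down"]
    print(word_count(logs))
```

Because each record is mapped independently and each key is reduced independently, both steps can be spread across nodes, which is what the scale-out claim above relies on.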
HP Autonomy IDOL
Social Media, Video, Audio, Email, Texts, Mobile, Transactional Data, Documents, XML, Search Engine, Images
HP Autonomy
IDOL Applications
Autonomy Connectors
eDiscovery
Enterprise Search
Media Monitoring
Social Media Analytics
Decision Support
Augmented Reality
Partner/In-house apps
HC Analytics
Repositories
Information Types
Apps
500 Functions
IDOL Services Multimedia Informatics
Enrichment
Capture
Interaction
Analytics
Discovery
Concept Clouds
Active Matching
Visualization
ACA
MediaBin
Connected LiveVault
TRIM
AeD
Data Protector
WorkSite
DigitalSafe
Connectors
…
Cloud
Enterprise
IDOL OS for Human Information
ERP
CRM
Database
Jive…
Image
HIS
Data Warehouse
Hadoop
SharePoint
HP Autonomy IDOL
High-performance human information processing
HP Autonomy IDOL platform: All data types, all content repositories – unmatched understanding
400+ connectors: Seamlessly access virtually any enterprise content repository, including file systems, email, or knowledge bases
Over 500 functions: Leverage the power of functions like sentiment analysis, categorization, and clustering to deliver intelligence and insight
1,000+ file types: Process virtually any file type such as text (email, tweet, document), audio, video, and even people profiles & behavior
Distributable architecture: Achieve big data scalability and high performance with distributable ingest and query architecture
HP Vertica
Real-Time Analytics Platform
Standard SQL Interface: Leverages BI, ETL, Hadoop/MapReduce and OLTP investments
Native High Availability: Built-in redundancy that also speeds up queries
Auto Database Design: Automatic setup, optimization, and DB management
Advanced Compression: Up to 90% space reduction using 12+ algorithms
Column Orientation
MPP (Massively Parallel Processing): Native DB-aware clustering on low-cost x86 Linux nodes
• 10x–100x higher performance than classic RDBMS
• High scalability from TBs to PBs
• Simple integration with existing ETL and BI solutions
• Superior performance on off-the-shelf hardware
• Ultimate deployment flexibility
• 24/7 load and query
• Flex Zone
• Very close Hadoop integration
• Coming soon: Vertica on YARN
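Why column orientation and compression go together can be shown with a small sketch. The slide does not say which of the 12+ algorithms are used, so run-length encoding is picked here purely as one common columnar encoding; the function names and the sample rows are assumptions for illustration, not Vertica's API or storage format.

```python
from itertools import groupby


def to_columns(rows):
    """Pivot a list of row dicts (all with the same keys) into a dict of column lists."""
    columns = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns


def rle_encode(values):
    """Run-length encode a column: consecutive repeats become (value, count) pairs."""
    return [(value, len(list(run))) for value, run in groupby(values)]


def rle_decode(pairs):
    """Expand (value, count) pairs back into the original column."""
    return [value for value, count in pairs for _ in range(count)]


if __name__ == "__main__":
    # Hypothetical sales rows, already sorted by country.
    rows = [
        {"country": "PL", "amount": 10},
        {"country": "PL", "amount": 12},
        {"country": "PL", "amount": 11},
        {"country": "DE", "amount": 30},
        {"country": "DE", "amount": 31},
    ]
    columns = to_columns(rows)
    print(rle_encode(columns["country"]))
```

Storing values of one column next to each other puts similar values side by side, so a sorted, low-cardinality column such as country shrinks to a handful of pairs, and a query that needs only one column reads only that column.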
Why Hadoop and Vertica are complementary
Both purpose-built scalable platforms
Vertica
• Designed for Performance
• Interactive Analytics
• A Rich SQL Ecosystem
Hadoop
• Designed for Fault Tolerance
• Storage & Batch Processing
• A Rich Programming Model
HP Vertica
High-performance data analytics platform purpose-built for big data
HP Vertica Analytics platform: Speed, scalability, and openness at lower TCO
Blazing fast analytics: Gain insight into your data in near-real time by running queries 50x-1,000x faster than legacy products
Massive scalability: Infinitely scale your solution by adding an unlimited number of industry-standard servers
Open architecture: Protect and embrace your investment in hardware and software with built-in support for Hadoop, R, and a range of ETL and BI tools
Optimized data storage: Store 10x-30x more data per server than row databases with patented columnar compression
HP ArcSight
High-performance universal log management to consolidate machine data across IT
HP ArcSight Universal log management platform: Collect, store, and analyze any machine data across IT
315+ connectors: Collect, normalize, and categorize machine data such as logs, events, and flows from any device and any vendor, any time, anywhere
Search over 1,000,000 events per second: Machine data is unified through filtering and parsing and enriched with metadata, so you can search it with simple text keywords without domain expertise
Store years' worth of data: The unified data is stored at a high compression ratio in any of your existing storage formats, eliminating the need for expensive databases and DBAs
Analytics & intelligence: Built-in content packs, algorithms, rules, and the unified machine data help you deploy IT security, IT operations, IT GRC, and log analytics
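The collect, normalize, categorize, and keyword-search steps described on this slide can be sketched in a few lines of Python. This is not ArcSight code or its event schema: the two raw line formats, the regular expressions, the category rule, and every function name are hypothetical and chosen only to illustrate the idea.

```python
import re

# Two hypothetical raw formats: "host app: message" and "key=value key=value ...".
SYSLOG = re.compile(r"^(?P<host>\S+) (?P<app>\w+): (?P<message>.+)$")
KEYVAL = re.compile(r"(\w+)=(\S+)")


def categorize(text):
    """Toy category rule: lines mentioning a denial or failure count as security."""
    lowered = text.lower()
    if "deny" in lowered or "failed" in lowered:
        return "security"
    return "operations"


def normalize(line):
    """Parse one raw line into a common event dict, whatever its source format."""
    event = {"raw": line, "category": categorize(line)}
    match = SYSLOG.match(line)
    if match:
        event["format"] = "syslog"
        event["fields"] = match.groupdict()
        return event
    pairs = dict(KEYVAL.findall(line))
    if pairs:
        event["format"] = "keyvalue"
        event["fields"] = pairs
        return event
    event["format"] = "unparsed"
    event["fields"] = {}
    return event


def search(events, keyword):
    """Case-insensitive keyword search over the raw text of normalized events."""
    needle = keyword.lower()
    return [event for event in events if needle in event["raw"].lower()]


if __name__ == "__main__":
    lines = [
        "web01 sshd: Failed password for root",
        "src=10.0.0.1 dst=10.0.0.2 action=deny",
        "db01 cron: job finished",
    ]
    events = [normalize(line) for line in lines]
    for event in search(events, "deny"):
        print(event["format"], event["category"], event["fields"])
```

Once every source is mapped to the same event shape, a plain keyword search works across devices and vendors without knowing each original format.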
The "n"
Autonomy + Vertica + Tableau + HP Anywhere on Tablet
German Car Manufacturer
Early Warning System
Business problem
Detect unusual increases in the number of warranty repairs (OT warranty) as soon as they appear.
Data analysis problem
Detect anomalies (outliers) in time series.
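The slides do not say which detection method the project used. As one simple illustration of outlier detection in a time series, the sketch below flags a value when it exceeds the mean of the preceding window by more than a chosen number of standard deviations. The function name, the window and threshold defaults, and the weekly repair counts are all assumptions.

```python
from statistics import mean, stdev


def detect_increases(series, window=8, threshold=3.0):
    """Return indices whose value exceeds mean + threshold * stdev of the preceding window.

    Only upward deviations are flagged, matching the "unusual increases" goal.
    If the window is perfectly flat (stdev 0), any value above its mean is flagged.
    """
    flagged = []
    for i in range(window, len(series)):
        history = series[i - window:i]
        limit = mean(history) + threshold * stdev(history)
        if series[i] > limit:
            flagged.append(i)
    return flagged


if __name__ == "__main__":
    # Hypothetical weekly warranty repair counts for one part.
    weekly_repairs = [12, 14, 13, 15, 12, 14, 13, 15, 14, 13, 41, 14]
    print(detect_increases(weekly_repairs))
```

A flagged value stays in the history for later windows and inflates their standard deviation, so a production version would likely exclude or cap flagged points and account for seasonality.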
German Car Manufacturer
Big Data Labs
Diagram labels: External, Internal; Warranty, Repairs, Claims, Sales, Storage, Parts & Production, Diagnostics, Reference, Weather
HP HAVEn Platform: Landing Zone, Integrated Data, Analytical Record, Analytical Processing, Visualization
Global Telecommunication Group
Log Analysis
Vertica Cluster
NFS
Hadoop Cluster
Log System
POC environment
Vertica Hadoop Connector
JDBC
3 Vertica nodes:
• 2 x 2-core Intel Xeon @ 2.7 GHz
• 32 GB RAM
• 9.7 TB storage
Java applications
Analytics & Reporting clients
Global Cranes Manufacturer
Sensor Data Analysis
Remote
Facebook Big Data Architecture for Log Analysis
Mobile
PC/Laptop
Web Servers
Logs
Hadoop/HDFS
2 huge Hadoop clusters
• 1.7 exabytes
• 15,000 nodes
• 40,000 nodes
Job Scheduler
Vertica
Logs
15 mins
Hourly
Daily
Legacy
• 600K MR jobs/day
• 50K Informatica jobs/day
HAVEn
Develop, Operate, Secure, Monetize, Govern
hp.com/haven
Thank you!
Resources:
• www.hp.com/haven
• www.vertica.com
• www.autonomy.com
• www.hortonworks.com
• Vertica to try: https://my.vertica.com/?redirect_to=https%3A%2F%2Fmy.vertica.com%2Fdownload-community-edition%2F
• About HAVEn-on-demand: http://www.datacenterknowledge.com/archives/2014/12/03/hp-launches-big-data-cloud-called-haven-ondemand/