Haven 2 0

© Copyright 2013 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice. HP HAVEn Big Data Use Cases Mikolaj Nietz, Solution Architect Application Services Global Delivery, Hewlett-Packard

Transcript of Haven 2 0

Page 1: Haven 2 0

HP HAVEn Big Data Use Cases

Mikolaj Nietz, Solution Architect, Application Services Global Delivery, Hewlett-Packard

Page 2: Haven 2 0

The changing Big Data landscape

Human Information, Machine Data

Business Data

10% of Information

90% of Information

Annual Growth

~100%

~10%

Page 3: Haven 2 0

Imagine if you could… interact with and process 100% of your data seamlessly

Transactional data, Social media, Images, Audio, Video, Mobile, Email, Texts, Documents

In-memory, Hadoop

Standard APIs and tools

Dashboards & alerts, Business intelligence, Your custom apps, Packaged apps

Ingest Analyze Understand

Machine Data Business Data Human Information

Open connectors

Page 4: Haven 2 0

HAVEn Big Data Platform

Data sources: Social media, IT/OT, Images, Audio, Video, Transactional data, Mobile, Search engine, Email, Texts, Documents

• Hadoop/HDFS: Catalogue massive volumes of distributed data

• Autonomy IDOL: Process and index all information

• Vertica: Analyze at extreme scale in real-time

• Enterprise Security: Collect & unify machine data

• nApps: Powering HP Software + your apps

hp.com/Haven

Page 5: Haven 2 0

Why HAVEn?

• H: Hadoop

• A: Autonomy IDOL

• V: Vertica

• E: Enterprise Security (HP ArcSight)

• n: a number of other apps

"Safe Haven" (in Polish: "Bezpieczna Przystań")

Page 6: Haven 2 0

HP HAVEn/Big Data Reference Architecture

Rich-media data, Unstructured text data, Mixed-structure data, Unknown-structure data, Semi-structured text data, Structured text data

ODS, EDW, Data marts

Hadoop: HDFS, MapReduce, Data integration, Not Only SQL Analytics, Operational mgt.

Access-in-place

Meaning-based analytics (Autonomy IDOL)

Autonomy value-add applications

BI/Visualization tools, Analytic tools

Lightweight ETL

Hadoop Extended Tools

Access-in-place, Indexed metadata

Vertica Analytics RDBMS: Native analytics, UDx extensions, R functions

Access-in-place, Indexed metadata

WWW

Page 7: Haven 2 0

Apache Hadoop

Has flexibility to store and mine any type of data

• Query previously inaccessible structured and unstructured data

• Not bound by single schema

Excels at processing complex data

• Scale-out architecture divides workloads across multiple nodes

• Flexible file system eliminates ETL* bottlenecks

Scales economically

• Deployable on commodity hardware

• Open source platform guards against vendor lock-in

Open source Linux-based platform for data storage and processing that is scalable, fault tolerant, and distributed

Core Hadoop system components (workloads):

• Hadoop Distributed File System (HDFS): self-healing, high-bandwidth clustered storage

• MapReduce: distributed computing framework

Like Linux, there are several distributions of Hadoop
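
To make the MapReduce model on this slide concrete, here is a minimal single-process sketch in plain Python. It is not Hadoop code: the function names (`map_phase`, `shuffle`, `reduce_phase`, `run_job`) and the sample log lines are hypothetical, and a real cluster would run the map and reduce steps on many nodes in parallel.

```python
from collections import defaultdict
from itertools import chain


def map_phase(record):
    """Map step: emit a (word, 1) pair for every word in one input record."""
    return [(word.lower(), 1) for word in record.split()]


def shuffle(pairs):
    """Shuffle step: group all emitted values by key before reducing."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce step: collapse the values of one key into a single result."""
    return key, sum(values)


def run_job(records):
    """Run map, shuffle, and reduce in sequence over the input records."""
    mapped = chain.from_iterable(map_phase(r) for r in records)
    grouped = shuffle(mapped)
    return dict(reduce_phase(k, v) for k, v in grouped.items())


# Hypothetical input: three log lines standing in for files stored in HDFS.
counts = run_job(["error disk full", "warning disk slow", "error network down"])
print(counts)
```

Because each record is mapped independently and each key is reduced independently, the same job can be split across nodes, which is what the scale-out bullet above refers to.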

Page 8: Haven 2 0

HP Autonomy IDOL

HP Autonomy IDOL: IDOL OS for Human Information

Information types: Social Media, Video, Audio, Email, Texts, Mobile, Transactional Data, Documents, XML, Search Engine, Images

Apps (IDOL Applications): eDiscovery, Enterprise Search, Media Monitoring, Social Media Analytics, Decision Support, Augmented Reality, Partner/In-house apps, HC Analytics

500 Functions: IDOL Services Multimedia Informatics, Enrichment, Capture, Interaction, Analytics, Discovery, Concept Clouds, Active Matching, Visualization

Connectors (Autonomy Connectors): Cloud, Enterprise

Repositories: ACA, MediaBin, Connected LiveVault, TRIM, AeD, Data Protector, WorkSite, DigitalSafe, ERP, CRM, Database, Jive…, Image, HIS, Data Warehouse, Hadoop, SharePoint

Page 9: Haven 2 0

HP Autonomy IDOL platform

High-performance human information processing

All data types, all content repositories – unmatched understanding

• 400+ connectors: Seamlessly access virtually any enterprise content repository, including file systems, email, or knowledge bases

• Over 500 functions: Leverage the power of functions like sentiment, categorization, and clustering to deliver intelligence and insight

• 1,000+ file types: Process virtually any file type such as text (email, tweet, document), audio, video, and even people profiles & behavior

• Distributable architecture: Achieve big data scalability and high performance with distributable ingest and query architecture

Page 10: Haven 2 0

HP Vertica Real Time Analytics Platform

• Standard SQL Interface: Leverages BI, ETL, Hadoop/MapReduce and OLTP investments

• Auto Database Design: Automatic setup, optimization, and DB management

• Native High Availability: Built-in redundancy that also speeds up queries

• MPP (Massively Parallel Processing): Native DB-aware clustering on low-cost x86 Linux nodes

• Advanced Compression: Up to 90% space reduction using 12+ algorithms

• Column Orientation

• 10x–100x the performance of a classic RDBMS

• High scalability from TBs to PBs

• Simple integration with existing ETL and BI solutions

• Superior performance on off-the-shelf hardware

• Ultimate deployment flexibility

• 24/7 Load and Query

• Flexzone

• Very close Hadoop integration

• Coming soon: Vertica on YARN
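
The column orientation and compression points on this slide can be illustrated with a small Python sketch. It is a generic illustration of columnar layout with run-length encoding, not Vertica's implementation; the sample rows and the helper names (`to_columns`, `rle_encode`, `rle_decode`) are hypothetical.

```python
from itertools import groupby

# Hypothetical fact table stored row by row: (day, country, amount).
rows = [
    ("2013-10-01", "PL", 120),
    ("2013-10-01", "PL", 80),
    ("2013-10-01", "DE", 200),
    ("2013-10-02", "DE", 50),
    ("2013-10-02", "DE", 75),
]


def to_columns(rows, names):
    """Pivot row tuples into one list per column (columnar layout)."""
    return {name: [row[i] for row in rows] for i, name in enumerate(names)}


def rle_encode(values):
    """Run-length encode a column: consecutive repeats become (value, count)."""
    return [(value, len(list(run))) for value, run in groupby(values)]


def rle_decode(runs):
    """Expand (value, count) pairs back into the original column."""
    return [value for value, count in runs for _ in range(count)]


columns = to_columns(rows, ["day", "country", "amount"])
encoded_day = rle_encode(columns["day"])
encoded_country = rle_encode(columns["country"])

# An aggregate over one column reads only that column, not whole rows.
total_amount = sum(columns["amount"])

print(encoded_day)
print(encoded_country)
print(total_amount)
```

Sorted, low-cardinality columns collapse into a handful of runs, which is one reason a column store can hold more data per server than a row store and scan less data per query.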

Page 11: Haven 2 0

Why Hadoop and Vertica are complementary

Vertica:

• Designed for Performance

• Interactive Analytics

• A Rich SQL Ecosystem

Hadoop:

• Designed for Fault Tolerance

• Storage & Batch Processing

• A Rich Programming Model

Both are purpose-built scalable platforms

Page 12: Haven 2 0

HP Vertica Analytics platform

High-performance data analytics platform purpose-built for big data

Speed, scalability, and openness at lower TCO

• Blazing fast analytics: Gain insight into your data in near-real time by running queries 50x-1,000x faster than legacy products

• Massive scalability: Infinitely scale your solution by adding an unlimited number of industry-standard servers

• Open architecture: Protect and embrace your investment in hardware and software with built-in support for Hadoop, R, and a range of ETL and BI tools

• Optimized data storage: Store 10x-30x more data per server than row databases with patented columnar compression

Page 13: Haven 2 0

HP ArcSight Universal log management platform

High-performance universal log management to consolidate machine data across IT

Collect, store, and analyze any machine data across IT

• 315+ connectors: Collect, normalize, and categorize machine data such as logs, events, and flows from any device and any vendor, at any time and from anywhere

• Search over 1,000,000 events per second: Machine data, unified through filtering and parsing, is enriched with rich metadata, so you can search it with simple text-based keywords without domain expertise

• Store years' worth of data: The unified data is stored at a high compression ratio in any of your existing storage formats, eliminating the need for expensive databases and DBAs

• Analytics & intelligence: Built-in content packs, algorithms, rules, and the unified machine data help you deploy IT security, IT operations, IT GRC, and log analytics
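
As an illustration of what "collect, normalize, and categorize" can mean for machine data, here is a minimal Python sketch that parses two made-up log formats into one common event shape. It does not use any ArcSight API or schema: the patterns, field names, categories, and sample lines are all hypothetical.

```python
import re
from datetime import datetime

# Hypothetical per-source patterns; a real connector library covers far more.
PATTERNS = {
    "sshd": re.compile(
        r"(?P<ts>\S+) (?P<host>\S+) sshd: Failed password for "
        r"(?P<user>\S+) from (?P<src>\S+)"
    ),
    "firewall": re.compile(
        r"(?P<ts>\S+) (?P<host>\S+) fw: DROP src=(?P<src>\S+) dst=(?P<dst>\S+)"
    ),
}

# Hypothetical category labels attached to every normalized event.
CATEGORIES = {"sshd": "authentication/failure", "firewall": "network/deny"}


def normalize(line):
    """Parse one raw log line into a unified event dict, or None if unknown."""
    for source, pattern in PATTERNS.items():
        match = pattern.match(line)
        if match:
            event = match.groupdict()
            event["ts"] = datetime.fromisoformat(event["ts"])
            event["source"] = source
            event["category"] = CATEGORIES[source]
            return event
    return None


raw_lines = [
    "2013-10-01T08:15:00 web01 sshd: Failed password for root from 10.0.0.7",
    "2013-10-01T08:15:02 gw01 fw: DROP src=10.0.0.7 dst=192.168.1.5",
    "unparseable line",
]
events = [normalize(line) for line in raw_lines]

# Once normalized, events from different vendors can be searched by field.
from_suspect = [e for e in events if e and e["src"] == "10.0.0.7"]
print(len(from_suspect))
```

The point of the common shape is that a single query, here a filter on `src`, works across sources that originally logged in different formats.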

Page 14: Haven 2 0

The "n"

Page 15: Haven 2 0

Autonomy + Vertica + Tableau + HP Anywhere on Tablet

Page 16: Haven 2 0

German Car Manufacturer: Early Warning System

Business problem

Detect unusual increases in the number of warranty repairs (OT warranty) as soon as they appear.

Data analysis problem

Detect anomalies (outliers) in time series.
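
One simple way to frame this data analysis problem is a trailing-window z-score. The sketch below is an assumption for illustration, not the method used in the project: the function name `detect_outliers`, the window size, the threshold, and the weekly repair counts are all hypothetical.

```python
from statistics import mean, stdev


def detect_outliers(series, window=8, threshold=3.0):
    """Return indices whose value exceeds the trailing-window mean
    by more than `threshold` standard deviations (increases only)."""
    outliers = []
    for i in range(window, len(series)):
        history = series[i - window:i]
        mu, sigma = mean(history), stdev(history)
        if sigma > 0 and (series[i] - mu) / sigma > threshold:
            outliers.append(i)
    return outliers


# Hypothetical weekly warranty repair counts for one part; week 10 spikes.
weekly_repairs = [20, 22, 19, 21, 23, 20, 22, 21, 20, 22, 48, 21]
spikes = detect_outliers(weekly_repairs)
print(spikes)
```

The test is one-sided because the business problem is about increases. Note that a spike stays in the trailing window for the following weeks and inflates the standard deviation, so a production system would more likely use a robust baseline such as a median-based one.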

Page 17: Haven 2 0

German Car Manufacturer: Big Data Labs

External, Internal

Warranty Repairs

Landing Zone, Integrated Data, Analytical Record, Analytical Processing, Visualization

HP HAVEn Platform

Repairs, Claims, Sales, Storage, Parts & Production, Diagnostics, Reference, Weather

Page 18: Haven 2 0

Global Telecommunication Group: Log Analysis

POC environment

Log System, NFS, Hadoop Cluster, Vertica Cluster

Vertica Hadoop Connector, JDBC

Java applications, Analytics & Reporting clients

3 Vertica nodes:

• 2x2-core Intel Xeon @ 2.7 GHz

• 32 GB RAM

• 9.7 TB storage

Page 19: Haven 2 0

Global Cranes Manufacturer: Sensor Data Analysis

Remote

Page 20: Haven 2 0

Facebook Big Data Architecture for Log Analysis

Mobile, PC/Laptop

Web Servers

Logs

Hadoop/HDFS

2 huge Hadoop clusters:

• 1.7 exabytes

• 15,000 nodes

• 40,000 nodes

Job Scheduler

Vertica

Logs

15 mins, Hourly, Daily

Legacy

• 600K MR jobs/day

• 50K Informatica jobs/day

Page 21: Haven 2 0

Develop, Operate, Secure, Monetize, Govern

HAVEn

hp.com/haven

Thank you!

Page 22: Haven 2 0

Resources:

• www.hp.com/haven

• www.vertica.com

• www.autonomy.com

• www.hortonworks.com

• Try Vertica (Community Edition): https://my.vertica.com/?redirect_to=https%3A%2F%2Fmy.vertica.com%2Fdownload-community-edition%2F

• About HAVEn-on-demand: http://www.datacenterknowledge.com/archives/2014/12/03/hp-launches-big-data-cloud-called-haven-ondemand/