Big Data on OpenStack

Post on 15-Jan-2015

description

The massive computing and storage resources that are needed to support big data applications make cloud environments an ideal fit. In this session, you'll learn how to build your big data "database on-demand" using MongoDB, Cassandra, Solr, MySQL, or any other big data solution, as well as manage your big data application using a new open source framework called “Cloudify.” All this, on top of the OpenStack cloud.

Transcript of Big Data on OpenStack

Big Data on OpenStack

@natishalom

About GigaSpaces

Managing Big Data on the Cloud

100s of Enterprise Customers

My Data Out of My Hands...

No Way!

The Reality of Big Data…

2.7 ZB: Global Digital Data

0.5 Petabytes: Two years’ tweets

66%: Plan to use Big Data/Cloud

43% think that their organization’s data analytics could be improved if data analytics was part of cloud services

Large ISV Case Study

• Application: Call center surveillance
• Background: Previously, voice data
• Goal for a new system:
  - Monitor data & voice
  - Multiple data sources
  - Advanced correlations

The Challenges…

Ever Growing Data

Deeper Correlation

Tight Performance

A Classic Case for..

A Typical Big Data System

The Challenge

Cost Business Impact

Lower Margins

Competitiveness

Time to Market

Customer Satisfaction

Infrastructure

Operational

The Solution: Big Data in the Cloud

Big Data in the Cloud: 3 Reasons

• Skills: Do you really need/want this all in-house?
• Huge amounts of external data: Does it make sense to move and manage all this data behind your firewall?
• Focus on the value of your data, instead of big data management

Holger Kisker

Managing Big Data on the Cloud

• Auto start VMs
• Install and configure app components
• Monitor
• Repair
• (Auto) Scale
• Burst…
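The management steps listed on this slide can be pictured as a reconcile loop. The following is a minimal sketch only, not how Cloudify or OpenStack implement it: the `Node` and `Cluster` classes, the health flag, and the 0.8 load threshold are all hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A hypothetical VM record; not part of any real cloud API."""
    name: str
    healthy: bool = True
    load: float = 0.0  # fraction of capacity in use, 0.0 to 1.0


@dataclass
class Cluster:
    nodes: list = field(default_factory=list)
    counter: int = 0

    def start_vm(self) -> Node:
        # Stand-in for "auto start VM" plus "install and configure".
        self.counter += 1
        node = Node(name=f"node-{self.counter}")
        self.nodes.append(node)
        return node

    def reconcile(self, scale_out_at: float = 0.8) -> list:
        """Run one monitor/repair/scale pass and return the actions taken."""
        actions = []
        # Repair: replace every node that failed its health check.
        for node in [n for n in self.nodes if not n.healthy]:
            self.nodes.remove(node)
            replacement = self.start_vm()
            actions.append(f"repair {node.name} -> {replacement.name}")
        # Scale: add one node when average load crosses the threshold.
        if self.nodes:
            average = sum(n.load for n in self.nodes) / len(self.nodes)
            if average > scale_out_at:
                actions.append(f"scale out -> {self.start_vm().name}")
        return actions


cluster = Cluster()
first, second = cluster.start_vm(), cluster.start_vm()
first.healthy = False          # simulate a failed node
print(cluster.reconcile())     # the failed node is replaced
```

Calling `reconcile()` on a timer gives the "monitor, repair, scale" behaviour in one place; bursting to another cloud would be a further branch, which this sketch leaves out.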

Big Data in the Cloud

Reduce the Infrastructure Cost

Choose the Right Cloud for the Job

Run bare metal for high-I/O workloads and public cloud for sporadic workloads
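The placement rule on this slide could be expressed as a small decision function. This is an illustrative sketch: the `Workload` fields and the `"private-cloud"` fallback are assumptions, since the slide names only the two cases.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    name: str
    io_intensive: bool  # sustained heavy disk or network I/O
    sporadic: bool      # runs occasionally rather than continuously


def choose_target(workload: Workload) -> str:
    """Map a workload profile to a deployment target (illustrative rule only)."""
    if workload.io_intensive:
        return "bare-metal"
    if workload.sporadic:
        return "public-cloud"
    return "private-cloud"  # assumed default; the slide does not specify one


for job in (
    Workload("cassandra-cluster", io_intensive=True, sporadic=False),
    Workload("monthly-report", io_intensive=False, sporadic=True),
):
    print(job.name, "->", choose_target(job))
```

I/O intensity is checked first here, so a workload that is both heavy and sporadic lands on bare metal; a real policy would likely weigh cost as well.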

Big Data in the Cloud

Reducing the Operational Complexity

• Consistent Management

• Automation Through the Entire Stack

Big Data on OpenStack

General Approach …

Reducing the Complexity

My Recipes

Wrap all your system elements into easy-to-use recipes, providing you with consistent, automated management of your Big Data

Consistent Management

Typical Big Data System

Scale

Monitor

Update

Deploy

One manager easily & consistently handles all system functions.

Reducing the Infrastructure Cost

Consistent Management

Abstraction

Typical Big Data System

Creates an abstraction between your Big Data system recipe/blueprint and the target environment. This means you can take the same blueprint and simply point it at different environments without making any changes to your application.
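One way to picture this abstraction is a blueprint object that knows nothing about the target, plus one driver per environment. The sketch below is hypothetical: `Blueprint`, `Environment`, and the two driver classes are illustrative names, not Cloudify's actual recipe format or cloud driver API, and the drivers only return strings where a real one would call the infrastructure.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass(frozen=True)
class Blueprint:
    """Environment-independent description of a service tier (hypothetical)."""
    service: str
    instances: int


class Environment(ABC):
    @abstractmethod
    def provision(self, service: str, index: int) -> str:
        """Create one machine and return an identifier for it."""


class OpenStackEnv(Environment):
    def provision(self, service: str, index: int) -> str:
        # A real driver would call the cloud's compute API here.
        return f"openstack-vm:{service}-{index}"


class BareMetalEnv(Environment):
    def provision(self, service: str, index: int) -> str:
        # A real driver would pick a host from a pre-provisioned pool.
        return f"baremetal-host:{service}-{index}"


def deploy(blueprint: Blueprint, env: Environment) -> list:
    """Deploy the same blueprint to whichever environment is passed in."""
    return [
        env.provision(blueprint.service, i)
        for i in range(1, blueprint.instances + 1)
    ]


cassandra = Blueprint(service="cassandra", instances=2)
print(deploy(cassandra, OpenStackEnv()))
print(deploy(cassandra, BareMetalEnv()))
```

The blueprint is never modified between the two calls; only the environment argument changes, which is the property the slide describes.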

Testing

Production

Development

Client Environment

Scale

Monitor

Update

Deploy

Is that Good Enough?

What about:
- Performance?
- Deterministic latency?

Bare Metal vs. Virtualization Benchmark

Source: Petestrenna

[Chart: bare metal vs. virtualization. Values shown: 8.84%, 14.36%, 24.46%, 2.41X, 10.84X. Categories shown: Disk I/O, CPU and Memory, Network I/O, Disk Latency, Micro-operations.]

Bare Metal vs. Virtualization Benchmark

Source: NTT DOCOMO

The Impact on Big Data Apps

3X more compute resources for the same workload!
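To see what such a factor means for capacity planning, here is a rough sizing sketch. It assumes the overhead applies uniformly to every node, which is a simplification; the function name and the example node counts are illustrative, not figures from the talk.

```python
import math


def virtualized_nodes(bare_metal_nodes: int, overhead_factor: float = 3.0) -> int:
    """Estimate nodes needed under virtualization for the same workload.

    Assumes a uniform overhead factor, rounded up to whole nodes.
    """
    return math.ceil(bare_metal_nodes * overhead_factor)


# A workload sized for 10 bare-metal nodes, using the slide's 3X factor.
print(virtualized_nodes(10))
```

Real overhead varies by resource type, so a per-resource estimate would be more accurate than a single multiplier.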

Non-Deterministic Latency

Bare Metal OpenStack Support

Case Study

“We took this single image, picked it up from public cloud into a Rackspace-powered private cloud and saw a 4X increased efficiency running that workload.”

Jim O’Neill, CIO at HubSpot

Automation Frameworks

Configuration Centric

App Centric (PaaS)

Big Data Apps, on Any Cloud, Your Way

Open source (Apache 2.0 license)

Built-in Support for Big Data Stacks

• Real Time: Storm, GigaSpaces XAP
• Relational DB Clusters: MySQL, Postgres
• NoSQL Clusters: MongoDB, Cassandra, Couchbase, ElasticSearch
• Hadoop: Hadoop (Hive, Pig, ...), ZooKeeper

Moving from Existing Data Center to OpenStack?

Consistent Management

Scale

Deploy

Monitor

Update

Non-Virtualized Data Center

OpenStack Cloud

Cloud Driver

Demo Time…

Storm on OpenStack

BigData Services Catalogue on OpenStack (HP)

Large ISV Case Study

• Application: Call center surveillance system
• Background: Previously, voice data
• Goal for a new system:
  - Monitor data & voice
  - Multiple data sources
  - Advanced correlations

Mission Accomplished

Additional Benefits

• True Cloud Economics

• One product -> any Customer Environment

• Increased Agility

Thank You!

References: http://www.cloudifysource.org http://github.com/CloudifySource

Additional References

• Bare Metal Cloud/PaaS
• OpenStack Baremetal Project
• Big Data in the Cloud
• Big Data in the Cloud using Cloudify
• Putting Hadoop On Any Cloud (A video presentation)
• In Memory Computing (Data Grid) for Big Data
• Using the Cloudify Player as an Open Source Framework for Building Your Own Cloud Application Marketplace on OpenStack
• Going native: The move to bare-metal cloud services
• New bare metal cloud offerings emerging
• How much overhead does x86/x64 virtualization have?
• Amazon EC2 versus Bare Metal and KVM? The inside story on what you thought you knew about EC2