Big Data Presentation - Data Center Dynamics Sydney 2014 - Dez Blanchfield

Transcript of Big Data Presentation - Data Center Dynamics Sydney 2014 - Dez Blanchfield

Page 1: Big Data Presentation - Data Center Dynamics Sydney 2014 - Dez Blanchfield

www.datacenterdynamics.com

THE RISE OF BIG DATA

WHEN DO “YOU” HAVE TO FACE IT

Dez Blanchfield

June 24th, 2014

Big data “hammer” Created by Rachel Jones of Wink Design Studio using: © Tagxedo.com

Page 2: Big Data Presentation - Data Center Dynamics Sydney 2014 - Dez Blanchfield

BIG DATA & THE DATA CENTRE

BIG DATA HAS EVOLVED TO ESTABLISH ITS PLACE IN THE ENTERPRISE DATA CENTER

Data volumes are growing exponentially year on year, and the ‘stress’ this places on data center infrastructure across networks, storage and compute is overloading many data centers’ ability to service it.

Data center infrastructure needs to adapt and evolve in order to support new workloads. Let’s take a quick look at what a data center is today, what big data is, and the types of workloads and technologies we have to consider as upcoming growth markets.

First, let’s just state once and for all: we are sick and tired of hearing about the four V’s

» Volume
» Velocity
» Variety
» Veracity

Page 3: Big Data Presentation - Data Center Dynamics Sydney 2014 - Dez Blanchfield

WHAT IS BIG DATA

LET’S BE CLEAR ABOUT WHAT IS NOT BIG DATA

Everyone has an opinion about what Big Data is and is not. Let’s be clear about what Big Data is NOT.

Just putting a “Big Data” stamp on it does not make it Big Data.

Big Data is not:

» Lots of data
» Fast data
» Messy data
» Badly managed data
» Bigger databases / bigger SANs
» Individual silos of data
» The result of regulatory data retention

Analysis of bad data will result in bad information.

FACT: Growth of a traditional enterprise database from 20 GB to 200 GB is not Big Data; that’s just lots of data. They are not the same by any measure.

Page 4: Big Data Presentation - Data Center Dynamics Sydney 2014 - Dez Blanchfield

WHAT IS BIG DATA

A PLAIN ENGLISH DEFINITION AND A FEW EXAMPLES OF WHAT BIG DATA IS

BIG DATA: “The collection, processing and usage of large volumes of digitized data to improve how companies make important decisions and operate the business.”

What is Big Data:

» Unstructured data / Machine data
» Aggregated click streams
» Previously unconnected data feeds
» Horizontal analysis across vertical silos
» Insights from analysis of multiple data pools
» Sentiment analysis within communities (staff / consumers)
» Predictive Analytics replaces Routine Maintenance

Big Data analytics can give you key points of differentiation.

IBM: “Over the next decade we will see significant gaps open up between enterprises that proactively transform their operations for the digital age and those that continue with business as usual.”

Era of Smart - Reinventing Australian Enterprises for the Digital Economy

Page 5: Big Data Presentation - Data Center Dynamics Sydney 2014 - Dez Blanchfield

WHAT IS A DATA CENTRE

LET’S BE HONEST - DO WE ACTUALLY KNOW WHAT A DATA CENTRE IS ANY MORE

?

Page 6: Big Data Presentation - Data Center Dynamics Sydney 2014 - Dez Blanchfield

WHAT IS A DATA CENTRE

IS IT TRADITIONAL PURPOSE BUILT DEDICATED IT ACCOMMODATION

Page 7: Big Data Presentation - Data Center Dynamics Sydney 2014 - Dez Blanchfield

WHAT IS A DATA CENTRE

IS IT A HONKING BIG UBER SECRET CAMPUS OF WAREHOUSE SCALE DATA HALLS

Page 8: Big Data Presentation - Data Center Dynamics Sydney 2014 - Dez Blanchfield

WHAT IS A DATA CENTRE

IS IT PURPOSE BUILT CONTAINERISED MOBILE DATA ROOMS

Page 9: Big Data Presentation - Data Center Dynamics Sydney 2014 - Dez Blanchfield

WHAT IS A DATA CENTRE

IS IT A HYBRID OF CONTAINERISED COMPUTE FABRIC AND TRADITIONAL DATA HALLS

Page 10: Big Data Presentation - Data Center Dynamics Sydney 2014 - Dez Blanchfield

WHAT IS A DATA CENTRE

THE GIANTS OF THE INDUSTRY ARE DOING MORE THAN JUST DIPPING THEIR TOES

Page 11: Big Data Presentation - Data Center Dynamics Sydney 2014 - Dez Blanchfield

WHAT IS A DATA CENTRE

IS IT DEDICATED PURPOSE BUILT IT ACCOMMODATION

Page 12: Big Data Presentation - Data Center Dynamics Sydney 2014 - Dez Blanchfield

WHAT IS A DATA CENTRE

IF IT WALKS LIKE A DUCK, IF IT QUACKS LIKE A DUCK

Page 13: Big Data Presentation - Data Center Dynamics Sydney 2014 - Dez Blanchfield

WHAT IS A DATA CENTRE

DATA CENTRE INNOVATION – THE SKY’S THE LIMIT

Page 14: Big Data Presentation - Data Center Dynamics Sydney 2014 - Dez Blanchfield

THE DATA CENTRE LANDSCAPE

NOT ALL DATA CENTERS ARE CREATED EQUAL

When the media talk about Big Data data centers they all too often default to what I call The Usual Suspects

The Usual Suspects have specialist niche application workloads to service, i.e. not Enterprise workloads

» Facebook
» Google
» Yahoo
» eBay
» PayPal
» NASA
» CERN
» CIA / TSA / FBI / NSA

Some have created disruptive technologies

» Hadoop (Google / Yahoo)
» OpenStack (NASA / Rackspace)
» Open Compute (Facebook)

Page 15: Big Data Presentation - Data Center Dynamics Sydney 2014 - Dez Blanchfield

THE BIG DATA LANDSCAPE

NOT ALL DATA IS CREATED EQUAL

In this era of big data, we are fast learning, all too often the hard way, that not all data is created equal.

Raw data originating from machine log files, social media, or years of original transaction data is often considered to be of lower value until it has been prepared & refined for analysis.

Key points to keep in mind about your data:

» Treat data as if it is perishable goods
» Make timely relevant use of data in decision-making
» Know where your data is at all times
» Know who has access to your data, when and how
» Be able to provide access to your data in multiple forms
» Structure data consistently, ETL is your friend
» Silos make data less valuable over time
» Data documentation is critical
» Communicate within the business on exactly what data you have and why
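
To make the “structure data consistently” point a little more concrete, here is a minimal sketch of the extract and transform steps for raw machine logs. The log layout, the field names and the `parse_line` and `refine` helpers are all assumptions made for illustration; none of them comes from a particular platform.

```python
import re
from datetime import datetime, timezone

# Hypothetical raw log layout: "<UTC timestamp> <host> <LEVEL> <message>"
LINE_RE = re.compile(r"^(\S+)\s+(\S+)\s+([A-Z]+)\s+(.*)$")


def parse_line(line):
    """Turn one raw log line into a consistent record, or None if malformed."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    stamp, host, level, message = match.groups()
    try:
        when = datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%SZ")
    except ValueError:
        return None
    return {
        "time": when.replace(tzinfo=timezone.utc),
        "host": host.lower(),  # normalise case so silos agree on host names
        "level": level,
        "message": message,
    }


def refine(lines):
    """Keep well-formed records and count the rejects rather than analysing them."""
    records, rejected = [], 0
    for line in lines:
        record = parse_line(line)
        if record is None:
            rejected += 1
        else:
            records.append(record)
    return records, rejected


raw = [
    "2014-06-24T09:00:00Z WEB01 ERROR disk latency high",
    "not a log line",
    "2014-06-24T09:00:05Z web02 INFO request served",
]
records, rejected = refine(raw)
print(len(records), rejected)
```

Counting the rejected lines, rather than silently dropping them, is one way to keep an eye on how much bad data is arriving before it reaches any analysis.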

Page 16: Big Data Presentation - Data Center Dynamics Sydney 2014 - Dez Blanchfield

BIG DATA PLATFORMS ARE PLENTIFUL

THE GROWTH IN BIG DATA PLATFORMS IS NOTHING SHORT OF EXPLOSIVE

Page 17: Big Data Presentation - Data Center Dynamics Sydney 2014 - Dez Blanchfield

NOT ALWAYS DATA CENTRE FRIENDLY

BOTH ENTERPRISE & OPEN SOURCE PLATFORMS PRESENT THEIR OWN CHALLENGES

Deploying any large scale Big Data platform into a modern data center will present challenges not faced with traditional enterprise network, storage and compute workloads – especially in the area of “rack awareness”.

Platforms which span multiple racks should ensure that replicas of data exist on multiple racks. This way, the loss of a switch does not render portions of the data unavailable due to all replicas being underneath it.
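
As a rough illustration of the idea, the sketch below spreads the replicas of one block across racks and checks whether the loss of any single rack (or its switch) still leaves a copy reachable. It is a simplified round-robin under assumed names (`place_replicas`, `survives_rack_loss`, the sample cluster), not the placement policy of any particular platform.

```python
def place_replicas(nodes_by_rack, replicas=3):
    """Choose nodes for one block, taking one node per rack before reusing a rack.

    nodes_by_rack maps a rack name to a list of node names.
    Returns a list of (rack, node) tuples.
    """
    racks = sorted(rack for rack, nodes in nodes_by_rack.items() if nodes)
    total_nodes = sum(len(nodes_by_rack[rack]) for rack in racks)
    if replicas > total_nodes:
        raise ValueError("not enough nodes for the requested replica count")

    placement = []
    depth = 0  # index of the next unused node within each rack
    while len(placement) < replicas:
        for rack in racks:
            nodes = nodes_by_rack[rack]
            if depth < len(nodes) and len(placement) < replicas:
                placement.append((rack, nodes[depth]))
        depth += 1
    return placement


def survives_rack_loss(placement):
    """True if the replicas sit on at least two different racks."""
    return len({rack for rack, _ in placement}) >= 2


cluster = {
    "rack-a": ["a1", "a2"],
    "rack-b": ["b1", "b2"],
    "rack-c": ["c1"],
}
spread = place_replicas(cluster, replicas=3)
stacked = [("rack-a", "a1"), ("rack-a", "a2")]  # every copy under one switch
print(spread, survives_rack_loss(spread))
print(stacked, survives_rack_loss(stacked))
```

A platform that is not rack aware can end up with the `stacked` layout by chance, which is exactly the failure mode described above.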

Rack awareness is critical in data centers, but not all Big Data platforms are capable of being “rack aware”:

» Hadoop 1.2.1 and 2.0 can be made rack aware
» OpenStack not so much (Swift has zones)
» SAP HANA is not
» CSC Infochimps is in part
» Spark can schedule for locality
» Ceph + RADOS = rack aware object store
» MooseFS has “proximity” settings
» Aerospike has “paxos protocol”
» Couchbase 2.5 now has rack awareness
» Traditional enterprise workloads don’t apply
» HPC platforms eat networks for breakfast
» A full Hadoop rebalance is quite the joyride
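
For Hadoop, “can be made rack aware” usually means supplying a topology script: Hadoop calls an external program (configured through `net.topology.script.file.name` in Hadoop 2.x) with one or more node addresses as arguments and reads back one rack path per address. The sketch below shows the general shape of such a script; the subnet-to-rack table and the `resolve_rack` helper are hypothetical and would need to be replaced with your own inventory.

```python
#!/usr/bin/env python
import sys

# Hypothetical mapping from subnet prefix to rack path (an assumption for
# illustration; real deployments usually load this from an inventory file).
RACK_BY_PREFIX = {
    "10.1.1.": "/dc1/rack1",
    "10.1.2.": "/dc1/rack2",
}
DEFAULT_RACK = "/default-rack"


def resolve_rack(address):
    """Return the rack path for one node address, or the default rack."""
    for prefix, rack in RACK_BY_PREFIX.items():
        if address.startswith(prefix):
            return rack
    return DEFAULT_RACK


def main(addresses):
    # One rack path per address, in the same order as the arguments.
    for address in addresses:
        print(resolve_rack(address))


if __name__ == "__main__":
    main(sys.argv[1:])
```

Nodes that fall through to the default rack all look like they share one rack, so an incomplete table quietly removes the protection rack awareness is meant to give.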

FACT: If you make the wrong design decisions in either bare metal or virtualized deployments of any of these types of ecosystems, your network, storage, compute & data center infrastructure are in for a whole new world of pain.

Page 18: Big Data Presentation - Data Center Dynamics Sydney 2014 - Dez Blanchfield

TRY HOSTING THESE EXAMPLES

CONSIDER THE CHALLENGES HOSTING THESE BIG DATA CUSTOMERS

A provincial Chinese phone company payroll system

» One million full time staff

Queensland power utility

» Asset & Vegetation management system
» Drone planes acquiring 8TB of data per flight
» 2 flights per day per plane, plans for a fleet of 6 planes
» HPC & Big Data storage & compute resources on-prem and in-cloud
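
Working the utility's figures through gives a feel for the ingest rate once the full fleet is flying. The per-flight, per-day and fleet numbers are the ones quoted above; flying every day of the year is an assumption added here purely to get an annual figure.

```python
TB_PER_FLIGHT = 8
FLIGHTS_PER_PLANE_PER_DAY = 2
PLANES = 6
FLYING_DAYS_PER_YEAR = 365  # assumption: no weather or maintenance days

daily_tb = TB_PER_FLIGHT * FLIGHTS_PER_PLANE_PER_DAY * PLANES
yearly_pb = daily_tb * FLYING_DAYS_PER_YEAR / 1000  # decimal petabytes

print(daily_tb)   # 96
print(yearly_pb)  # 35.04
```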

Virgin Atlantic (airline)

» The new Boeing 787 aircraft create half a terabyte of data per flight
» There are an average of 87,000 domestic flights a day inside US airspace
» If every one of those flights produced that much: 87,000 flights per day x 0.5 TB per flight = 43,500 TB of data created per day
» In other words, approx. 42 petabytes a day at 1,024 TB per PB (the meaning of life, the universe and everything!?)

NOTE: That’s just the “storage” problem. Consider the network and compute scale required to perform even the most rudimentary analysis on a dataset of that size!
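
The same numbers can be pushed one step further to size the network side. The sketch below repeats the storage arithmetic and then estimates the sustained bandwidth needed just to move one day of that data within a day. The flight count and per-flight volume are the figures quoted above; decimal terabytes and a perfectly even 24-hour transfer are simplifying assumptions.

```python
FLIGHTS_PER_DAY = 87000
TB_PER_FLIGHT = 0.5
SECONDS_PER_DAY = 86400

daily_tb = FLIGHTS_PER_DAY * TB_PER_FLIGHT   # 43,500 TB
daily_pb_binary = daily_tb / 1024            # 1 PB taken as 1,024 TB
daily_pb_decimal = daily_tb / 1000           # 1 PB taken as 1,000 TB

# Sustained rate needed to move one day's data in one day (1 TB = 1e12 bytes).
bits_per_day = daily_tb * 1e12 * 8
sustained_tbit_per_s = bits_per_day / SECONDS_PER_DAY / 1e12

print(round(daily_pb_binary, 2))       # 42.48
print(daily_pb_decimal)                # 43.5
print(round(sustained_tbit_per_s, 2))  # 4.03
```

Roughly 4 Tbit/s of sustained throughput, before any replication or analysis traffic, is the scale the note above is pointing at.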

!?

Page 19: Big Data Presentation - Data Center Dynamics Sydney 2014 - Dez Blanchfield

OTHER KEY TOPICS IMPACTING DATA CENTRES

BIG DATA AND THE MYRIAD OF SUPPORTING TECH TOPICS YOU NEED TO BE WATCHING

On-prem / Off-prem / Hybrid / Cloud

» Azure / AWS / RAX / Softlayer / Smartcloud
» VMware / OpenStack / Citrix / Hyper-V / KVM / Xen / LXC

VM Instantiation / App Containers

» Vagrant / Packer / Docker
» OSv / Capstan
» Joyent / SmartOS

DevOps / Automation / Service Catalogues

» FAI / Pre-Seed / Kickstart / Cobbler
» Puppet / Chef / cfengine / Salt / Ansible / Razor / Juju

Bursting into 3rd party clouds

» Private
» Public
» Hybrid
» Storage, Network & Compute

Page 20: Big Data Presentation - Data Center Dynamics Sydney 2014 - Dez Blanchfield

TAKEAWAY POINTS – PART 1

30 MINUTES BARELY LETS US SCRATCH THE SURFACE

Food for thought when you are gazing into your crystal ball trying to map out your 1, 2, 3 and 5 year roadmap

» Traditional data center thinking is no longer valid
» Your old cost models need to be completely rebuilt
» Starting with a clean sheet of paper is a valid decision
» Distributed beats centralized
» Grids beat Networks beat Clusters beat Scale
» 2 x Tier 1 beats 1 x Tier 3 hands down
» Edge of network is as important as your network core
» Economists outnumber engineers
» Actuaries and Data Scientists are now cool
» Software & hardware developers must be Infrastructure savvy
» Web scale has taught us some valuable things
» Modularity is the key
» Build it by the rack at the factory and ship it to me
» Reference architectures now include application stacks
» Infrastructure as a Service is the new normal
» Platform and Software “as a service” by default
» Containers are a good start
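
The “2 x Tier 1 beats 1 x Tier 3” point can be sanity-checked with simple availability arithmetic. The sketch below uses the availability percentages commonly quoted for the Uptime Institute tiers, and it assumes the two sites fail independently and that either one can carry the workload alone. Both are assumptions for illustration, not claims from the deck.

```python
# Commonly quoted availability per tier (assumed figures, not from the slides).
TIER_AVAILABILITY = {1: 0.99671, 2: 0.99741, 3: 0.99982, 4: 0.99995}
HOURS_PER_YEAR = 8760


def combined_availability(availability, sites):
    """Availability of N independent sites where any single site is enough."""
    return 1 - (1 - availability) ** sites


def downtime_hours(availability):
    """Expected hours of downtime per year at a given availability."""
    return (1 - availability) * HOURS_PER_YEAR


two_tier1 = combined_availability(TIER_AVAILABILITY[1], sites=2)
one_tier3 = TIER_AVAILABILITY[3]

print(round(downtime_hours(TIER_AVAILABILITY[1]), 1))  # one Tier 1 site
print(round(downtime_hours(one_tier3), 1))             # one Tier 3 site
print(round(downtime_hours(two_tier1) * 60, 1))        # two Tier 1 sites, in minutes
```

Under these assumptions a single Tier 1 site is down for about 28.8 hours a year and a Tier 3 site for about 1.6 hours, while the pair of Tier 1 sites is down together for only a few minutes. The result depends entirely on the independence assumption and on the application being able to run from either site.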

Page 21: Big Data Presentation - Data Center Dynamics Sydney 2014 - Dez Blanchfield

TAKEAWAY POINTS – PART 2

30 MINUTES BARELY LETS US SCRATCH THE SURFACE

Food for thought when you are gazing into your crystal ball trying to map out your 1, 2, 3 and 5 year roadmap

» Whole of rack “forklift installs”
» Petabyte is the new Terabyte
» Everything including the kitchen sink is purpose built
» Property developers get kitchens and bathrooms made in China
» Containers of kitchens and bathrooms are “dropped in from the roof”
» Buildings are not necessary
» Tin sheds and razor wire are more and more acceptable
» 42 foot containers plugged into the side of buildings are quite normal
» Software defined everything
» Networks glue it all together
» Capex is dead / Opex is king / Customers want to rent everything
» Contracts are no longer relevant
» Focus on delivering good products & services
» Everything is a service and paid for “by the month”
» Vendor lock-in is to be avoided like the plague
» Everything gets smaller and we rack and pack more of them
» Power consumption of 1 x rack in 2014 equals 10 racks in 2004

Page 22: Big Data Presentation - Data Center Dynamics Sydney 2014 - Dez Blanchfield

THANK YOU

THANKS FOR YOUR TIME – FEEL FREE TO PING ME ON TWITTER OR LINKEDIN

QUESTIONS ?

Page 23: Big Data Presentation - Data Center Dynamics Sydney 2014 - Dez Blanchfield

ABOUT ME

DEZ BLANCHFIELD

Strategy & Architecture
Australian Federal Government

Cloud Computing, Big Data, Hadoop & OpenStack solutions all start with a conversation. You name the time & place, and I'll pay for coffee.

{ email } [email protected]

{ mobile } +61 414 464 356

{ phone } +61 2 8006 4700

{ twitter } @dez_blanchfield

{ linkedin } http://linkedin.com/in/dezblanchfield