Data Grids for Extreme Performance, Scalability and Availability JavaOne 2011 Steve Millidge


Page 1:

© C2B2 Consulting Limited 2011 www.c2b2.co.uk

All Rights Reserved

Data Grids for Extreme Performance, Scalability and Availability

Steve Millidge

Director

C2B2

Page 2:

“Reliability, Availability, Scalability and Performance are prerequisites for functionality!”

They are Priority 1 Requirements

Page 3:

Availability

• System is available for customers to use

• No availability results in no transactions

• Transactions = $$$

• Receive your Pink Slip if you can’t sort it!

Page 4:

Multipliers in Availability

[Diagram: System1, System2 and System3 chained in series, each with 99% Availability]

Overall Availability = 0.99 * 0.99 * 0.99 ≈ 97%
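The multiplication on this slide can be reproduced with a few lines of Java. This is only an illustrative sketch: the class and method names are hypothetical, and it assumes the systems fail independently and that a request needs all of them to be up.

```java
// Hypothetical helper, not from the talk: availability of systems chained in series.
final class SerialAvailability {
    /** Multiplies the availabilities of systems that must all be up at the same time. */
    static double inSeries(double... availabilities) {
        double overall = 1.0;
        for (double a : availabilities) {
            overall *= a;
        }
        return overall;
    }

    public static void main(String[] args) {
        // Three systems at 99% each, as on the slide.
        double overall = inSeries(0.99, 0.99, 0.99);
        System.out.println("Overall availability: " + overall);
    }
}
```

Every extra system in the chain multiplies in another factor below 1, which is why the overall figure is always lower than that of the weakest link.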

Page 5:

HA Techniques

Redundancy

[Diagram: each system duplicated as a redundant pair; each instance has 99% Availability]

Pair = 1 – (0.01 * 0.01) = 99.99%

Overall = 0.9999 * 0.9999 * 0.9999 ≈ 99.97%

Decoupling

[Diagram: System 1, System 2 and System 3 decoupled from one another, each with 99% Availability]

Overall = 99%
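The redundancy arithmetic can be checked the same way. The sketch below is illustrative only: the names are hypothetical, and it assumes independent failures, with a redundant group being down only when every copy is down.

```java
// Hypothetical helper, not from the talk: effect of redundancy on availability.
final class RedundantAvailability {
    /** A group of redundant copies is unavailable only if all copies are down at once. */
    static double redundant(double availability, int copies) {
        return 1.0 - Math.pow(1.0 - availability, copies);
    }

    /** A chain of identical stages is available only if every stage is up. */
    static double chain(double stageAvailability, int stages) {
        return Math.pow(stageAvailability, stages);
    }

    public static void main(String[] args) {
        double pair = redundant(0.99, 2);   // 1 - (0.01 * 0.01)
        double overall = chain(pair, 3);    // three redundant pairs in series
        System.out.println("Pair: " + pair + ", overall: " + overall);
    }
}
```

With each 99% system duplicated, the three-stage chain comes out at roughly 99.97%, compared with about 97% for the same chain without redundancy.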

Page 6:

Performance

How long does a single transaction take to execute?

• Faster Performance = Happier Customers

• Faster Performance = More Transactions

Page 7:

Barriers to Performance

• Raw Algorithmic Performance

• Resource Limitations
– Not enough CPU, disk, memory

• Resource Contention
– Locks

• IO Latency
– Network, Disk

Page 8:

Latency

Time delay between requesting an operation and it being initiated

• Key factor in large scale distributed applications

• Typically not taken into account during development

Page 9:

Latency Factors

• Network Distance

• Network Reliability

• Data Size

• Operation Granularity

• Resource Contention

• JVM GC

Page 10:

Move the Data and Processing Close Together

Page 11:

Scalability

Ability to add more hardware in response to more demand.

Without a reduction in performance!

Page 12:

Business Imperatives

• Success of the Business or Service

• Growth of Mobile

• Huge Variation of Load through a period

• Sudden Large Spikes due to events

Cloud Enables Elastic Scalability

Page 13:

Scaling Out

Horizontal Scaling

• Add Additional Servers

• Add Load Balancer

• Distribute traffic across the servers

• Much Cheaper than Scale Up

• Has HA benefits

Page 14:

Linear Scalability (Nirvana)

[Chart: Users (0 to 900) plotted against Cluster Nodes (1 to 4), comparing Linear Scalability with Typical Scalability]

Page 15:

Typical Scale Out Architecture

[Diagram: a Load Balancer in front of Node1, Node2, Node3 and Node4, which share a single Database]

Nodes Host Stateless Services

Database contains Persistent State

Page 16:

Stateless Services

True Stateless Services

• Static HTML Serving

• Basic Calculations

• State Received from Client

Pseudo Stateless

• Read, Update and Store state in the DB

• Use sticky session to route to non-critical state

• Typical of Most Online applications

• Push scalability issue to the database

Page 17:

Scaling a Stateless Middle Tier is easy

however

Scaling Databases is hard and very expensive

Page 18:

Radical Idea

Put state back into the Middleware

Page 19:

Caching

Page 20:

Read Through Cache

[Diagram: the Application issues GET A to the Cache; the Cache Loader fetches A from the Data Store into the Cache, which returns A to the Application]
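A read-through cache like the one on this slide can be sketched in a few lines of Java. This is a minimal illustration, not any product's API: the class name is hypothetical and a plain `Function` stands in for the Cache Loader that talks to the Data Store.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical read-through cache; the loader stands in for the slide's Cache Loader.
final class ReadThroughCache<K, V> {
    private final Map<K, V> entries = new ConcurrentHashMap<>();
    private final Function<K, V> cacheLoader;

    ReadThroughCache(Function<K, V> cacheLoader) {
        this.cacheLoader = cacheLoader;
    }

    /** Returns the cached value; on a miss, loads it from the data store and caches it. */
    V get(K key) {
        return entries.computeIfAbsent(key, cacheLoader);
    }
}
```

The application only ever calls `get`; whether the value came from memory or from the data store is hidden behind the cache.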

Page 21:

Write Through Cache

[Diagram: the Application issues PUT B to the Cache; the Cache Writer writes B to the Data Store; a later GET B is served from the Cache]
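The write-through flow can be sketched as follows. The names are hypothetical and a `BiConsumer` stands in for the Cache Writer; the point is only that `put` does not return until the data store has been written.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.BiConsumer;

// Hypothetical write-through cache; the writer stands in for the slide's Cache Writer.
final class WriteThroughCache<K, V> {
    private final Map<K, V> entries = new ConcurrentHashMap<>();
    private final BiConsumer<K, V> cacheWriter;

    WriteThroughCache(BiConsumer<K, V> cacheWriter) {
        this.cacheWriter = cacheWriter;
    }

    /** Writes to the data store synchronously, then updates the cache. */
    void put(K key, V value) {
        cacheWriter.accept(key, value); // if this throws, the cache is left unchanged
        entries.put(key, value);
    }

    /** Reads are served from memory. */
    V get(K key) {
        return entries.get(key);
    }
}
```

The cost of this design is that every write still pays the data store's latency.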

Page 22:

Write Behind Cache

[Diagram: the Application issues PUT B to the Cache; the Write Behind Processor writes B to the Data Store later; GET B is served from the Cache]
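Write-behind differs from write-through only in when the data store is updated. In the hedged sketch below (hypothetical names throughout), `put` just records the key as dirty, and a separate `flush` call plays the role of the Write Behind Processor; a real implementation would typically run it from a background thread or scheduler.

```java
import java.util.Map;
import java.util.Queue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.function.BiConsumer;

// Hypothetical write-behind cache; flush() stands in for the Write Behind Processor.
final class WriteBehindCache<K, V> {
    private final Map<K, V> entries = new ConcurrentHashMap<>();
    private final Queue<K> dirtyKeys = new ConcurrentLinkedQueue<>();
    private final BiConsumer<K, V> storeWriter;

    WriteBehindCache(BiConsumer<K, V> storeWriter) {
        this.storeWriter = storeWriter;
    }

    /** Updates memory only and returns immediately; the store is written later. */
    void put(K key, V value) {
        entries.put(key, value);
        dirtyKeys.add(key);
    }

    V get(K key) {
        return entries.get(key);
    }

    /** Writes the latest value of every queued key to the store; returns the number of writes. */
    int flush() {
        int written = 0;
        K key;
        while ((key = dirtyKeys.poll()) != null) {
            V value = entries.get(key);
            if (value != null) {
                storeWriter.accept(key, value);
                written++;
            }
        }
        return written;
    }
}
```

Until `flush` runs, the new value exists only in memory, which is why write-behind is usually combined with duplicate copies on other nodes.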

Page 23:

Caches

• Caches aren’t New
– Hibernate Session Cache
– Entity Bean Cache
– JPA Cache
– Custom Caches
– Open Source Caches

• Typically Cache Database Data or Page Fragments

Page 24:

JSR 107

JCACHE - Java Temporary Caching API

• Been around a Long Time
– 10 years

• Focussed on Java SE
– With some JEE Integration for JEE7

• Caching API
– V get(Object key) throws CacheException;
– void put(K key, V value) throws CacheException;

Page 25:

JSR 107

Get Involved

• Google Group for Discussion
– http://groups.google.com/group/jsr107

• Google Docs for Spec
– https://docs.google.com/document/d/1YZ-lrH6nW871Vd9Z34Og_EqbX_kxxJi55UrSn4yL2Ak

• GitHub for Code
– https://github.com/jsr107/jsr107spec

Page 26:

Local Caching (Roll your Own)

Benefits

• Pretty Simple to Write
– ConcurrentHashMap

• Used in many applications

• Use JCache API

Challenges

• Cache Eviction

• Cache Loading/Storing

• Cache Prefetching

• Cache Refresh

• Write Behind Processing

• Clustering !!

THINK LONG AND HARD!!
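To give a feel for the first challenge, here is a hedged sketch of a home-grown local cache with eviction. It deliberately swaps the ConcurrentHashMap mentioned above for a synchronized `LinkedHashMap`, because the JDK's `LinkedHashMap` has a built-in hook for least-recently-used eviction. The class name and the `maxEntries` parameter are illustrative only.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical local cache with least-recently-used (LRU) eviction.
final class LruCache<K, V> {
    private final Map<K, V> entries;

    LruCache(final int maxEntries) {
        // accessOrder=true keeps entries ordered from least to most recently used.
        this.entries = Collections.synchronizedMap(new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > maxEntries; // evict once the limit is exceeded
            }
        });
    }

    V get(Object key) {
        return entries.get(key);
    }

    void put(K key, V value) {
        entries.put(key, value);
    }

    int size() {
        return entries.size();
    }
}
```

Even this small example handles only eviction; loading, refresh, write-behind and clustering would all still have to be built on top of it.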

Page 27:

Clustering Challenges

[Diagram: two Application nodes, each with its own Cache; each issues GET B and holds its own copy of B from the shared Data Store]

Page 28:

Update Replication

[Diagram: two Application nodes whose Caches both hold B1; one Application UPDATEs B to B2, and B2 is copied to the other Cache and to the Data Store]

Page 29:

Update Invalidation

[Diagram: two Application nodes whose Caches both hold B1; one Application UPDATEs B to B2 and writes B2 to the Data Store, and an Invalidate message is sent to the other Cache]
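Update invalidation can be sketched with two in-process objects standing in for cluster nodes. Everything here is hypothetical: a direct method call replaces the network message a real cluster would send, and a shared `Map` plays the Data Store.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical cache node that invalidates its peers' copies on update.
final class InvalidatingCacheNode {
    private final Map<String, String> local = new ConcurrentHashMap<>();
    private final Map<String, String> dataStore; // shared by all nodes
    private final List<InvalidatingCacheNode> peers = new ArrayList<>();

    InvalidatingCacheNode(Map<String, String> dataStore) {
        this.dataStore = dataStore;
    }

    void addPeer(InvalidatingCacheNode peer) {
        peers.add(peer);
    }

    /** Serves from the local cache, reading the data store on a miss. */
    String get(String key) {
        return local.computeIfAbsent(key, dataStore::get);
    }

    /** Writes the new value, then tells every peer to drop its stale copy. */
    void update(String key, String value) {
        dataStore.put(key, value);
        local.put(key, value);
        for (InvalidatingCacheNode peer : peers) {
            peer.invalidate(key);
        }
    }

    void invalidate(String key) {
        local.remove(key);
    }
}
```

Compared with replication, only the key travels to the other nodes; the new value is fetched from the data store the next time a peer reads it.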

Page 30:

Replication Write Performance

[Diagram: four Application nodes, each with its own Cache; a single PUT B results in a copy of B in every Cache]

Page 31:

Cache Partitioning

[Diagram: four Application nodes, each with a Cache partition; PUT B and GET B go to the node holding B, and PUT C goes to the node holding C]
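The core idea of partitioning is that every node can compute which node owns a key. The sketch below uses a plain hash-modulo rule purely as an assumption for illustration; the talk does not say how owners are chosen, and grid products generally use more elaborate schemes so that less data moves when nodes join or leave. The in-process maps stand in for remote nodes.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical partitioned cache: each key lives on exactly one node.
final class PartitionedCache {
    private final List<Map<String, String>> nodes = new ArrayList<>();

    PartitionedCache(int nodeCount) {
        for (int i = 0; i < nodeCount; i++) {
            nodes.add(new HashMap<>());
        }
    }

    /** Every caller computes the same owner for a given key. */
    int ownerOf(String key) {
        return Math.floorMod(key.hashCode(), nodes.size());
    }

    void put(String key, String value) {
        nodes.get(ownerOf(key)).put(key, value);
    }

    String get(String key) {
        return nodes.get(ownerOf(key)).get(key);
    }

    int entriesOn(int node) {
        return nodes.get(node).size();
    }
}
```

A write touches one node rather than all of them, so write cost stays flat as nodes are added, and total capacity grows with the node count.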

Page 32:

Elasticity in Partitioned Caches

[Diagram: two Application nodes, each with its own Cache]

Page 33:

HA Cache Partitioning

[Diagram: four Application nodes, each with a Cache partition; PUT B stores B plus a duplicate on another node, so a copy of B survives a NODE CRASH!!!]
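Keeping a duplicate on a second node is what lets a partitioned cache survive a crash. The following sketch is an assumption-laden illustration: the backup is simply "the next node along", nodes are in-process maps, and there is no re-creation of lost duplicates after the crash, which a real grid would perform.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical partitioned cache with one backup copy per entry (needs at least 2 nodes).
final class HaPartitionedCache {
    private final List<Map<String, String>> nodes = new ArrayList<>();
    private final Set<Integer> crashed = new HashSet<>();

    HaPartitionedCache(int nodeCount) {
        for (int i = 0; i < nodeCount; i++) {
            nodes.add(new HashMap<>());
        }
    }

    int primaryOf(String key) {
        return Math.floorMod(key.hashCode(), nodes.size());
    }

    int backupOf(String key) {
        return (primaryOf(key) + 1) % nodes.size();
    }

    /** Stores the entry on its primary node and a duplicate on its backup node. */
    void put(String key, String value) {
        nodes.get(primaryOf(key)).put(key, value);
        nodes.get(backupOf(key)).put(key, value);
    }

    /** Reads from the primary, falling back to the backup if the primary has crashed. */
    String get(String key) {
        int primary = primaryOf(key);
        int node = crashed.contains(primary) ? backupOf(key) : primary;
        return crashed.contains(node) ? null : nodes.get(node).get(key);
    }

    /** Simulates a node crash: its in-memory data is lost. */
    void crash(int node) {
        crashed.add(node);
        nodes.get(node).clear();
    }
}
```

With one duplicate, any single node can fail without data loss; losing both the primary and the backup of a key before the grid rebalances would still lose that entry.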

Page 34:

Partitioned Cache

• Linear Scalability
– 2 hops for Read (Worst Case)
– 2 hops for Write (Worst Case)

• High Availability
– Configurable Duplicates

• Location Independent Access
– Grid knows where data is

• More Nodes = More Data in Memory

Page 35:

Consider a Large Cache

[Diagram: 21 Application nodes, each with its own Cache]

Page 36:

How Much Can We Store

• 21 Amazon xLargeMemory Instances
– 17 GB RAM

• 3 Nodes Per Instance
– 4 GB 64-bit JVM Heap each + 5 GB OS

• 63 Cluster Nodes

• 252 GB JVM Heap Available

• Approx 125 GB Data in the Grid!

• Cost per Month $9000
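These figures follow from simple multiplication, which the sketch below reproduces. The slide does not explain why only about half of the heap holds data; the `copies` parameter assumes, purely for illustration, that every entry is kept twice (one primary and one duplicate). All names are hypothetical.

```java
// Hypothetical sizing helper reproducing the slide's arithmetic.
final class GridCapacity {
    static int clusterNodes(int instances, int nodesPerInstance) {
        return instances * nodesPerInstance;
    }

    static int totalHeapGb(int clusterNodes, int heapGbPerNode) {
        return clusterNodes * heapGbPerNode;
    }

    static int ramPerInstanceGb(int nodesPerInstance, int heapGbPerNode, int osGb) {
        return nodesPerInstance * heapGbPerNode + osGb;
    }

    /** Assumption, not from the slide: each entry is stored 'copies' times. */
    static double usableDataGb(int totalHeapGb, int copies) {
        return (double) totalHeapGb / copies;
    }

    public static void main(String[] args) {
        int nodes = clusterNodes(21, 3);
        int heap = totalHeapGb(nodes, 4);
        System.out.println(nodes + " nodes, " + heap + " GB heap, "
                + ramPerInstanceGb(3, 4, 5) + " GB RAM per instance, "
                + usableDataGb(heap, 2) + " GB of data with one duplicate");
    }
}
```

Under that assumption the result is 126 GB, close to the slide's "Approx 125"; object overhead and free heap headroom would reduce the real figure further.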

Page 37:

Grids can Even Overflow

[Diagram: an Application node whose Cache overflows to a Local Drive]

• Passivates Data to a Local Backing Store (NIO memory mapped file)

• Use Java NIO for Off Heap Storage

• Berkeley DB local Storage

• Reduces GC overhead

Page 38:

HA In Memory Data

[Diagram: two Data Centres, each containing Server Rack 1 and Server Rack 2 with six Application nodes and their Caches]

Do We Need the Database?

Page 39:

Database as Business Audit

[Diagram: 21 Application nodes with Caches in front of a Data Store holding Business Audit Data]

Page 40:

Now you Have a DATA GRID

Page 41:

So Much More than an L2 Cache

Page 42:

Computation on the Grid

[Diagram: four Application nodes, each with a Grid Node; a single Process is submitted to the grid]

Page 43:

In Place Processing

[Diagram: four Application nodes, each with a Grid Node; a Process runs on each Grid Node]
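In-place processing means sending the work to the data rather than pulling the data to the caller. The sketch below is only an in-process analogy with hypothetical names: each `Map` stands in for one Grid Node's local entries, and a parallel stream stands in for running the same task on every node at once.

```java
import java.util.List;
import java.util.Map;
import java.util.function.ToLongFunction;

// Hypothetical illustration of running a task where the data lives.
final class InPlaceProcessing {
    /** Runs the task against each node's local data and combines the partial results. */
    static long sumOnGrid(List<Map<String, Integer>> gridNodes,
                          ToLongFunction<Map<String, Integer>> task) {
        return gridNodes.parallelStream()
                .mapToLong(task) // each node computes over its own entries only
                .sum();          // only the small partial results are combined
    }
}
```

Only one number per node travels back to the caller, however many entries each node holds.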

Page 44:

Querying the Grid

[Diagram: four Application nodes, each with a Grid Node; a Query runs on each Grid Node]
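A grid query works the same way: each node filters its own entries and only the matches are returned. This is a hypothetical in-process sketch in which each `Map` stands in for one Grid Node; it is not any product's query API.

```java
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;
import java.util.stream.Collectors;

// Hypothetical illustration of a query evaluated on every node in parallel.
final class GridQuery {
    /** Returns the sorted keys of all entries whose value matches the filter. */
    static List<String> query(List<Map<String, Integer>> gridNodes, Predicate<Integer> filter) {
        return gridNodes.parallelStream()
                .flatMap(node -> node.entrySet().stream()
                        .filter(e -> filter.test(e.getValue())) // evaluated per node
                        .map(Map.Entry::getKey))
                .sorted()
                .collect(Collectors.toList());
    }
}
```

Because each node scans only its own share of the data, adding nodes shortens the scan each one has to do.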

Page 45:

Data Grid Events Subsystem

[Diagram: four Application nodes, each with a Grid Node; a MapListener is registered with the grid]
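The events subsystem lets applications react to changes in the grid instead of polling for them. The sketch below shows the general listener pattern only: `EntryListener` is a hypothetical interface invented for this example, not the `MapListener` named on the slide, and delivery here is a local method call rather than a message across the cluster.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical cache that notifies registered listeners on every change.
final class ObservableCache<K, V> {
    /** Hypothetical callback; real grids define their own listener interfaces. */
    interface EntryListener<K, V> {
        void entryChanged(K key, V oldValue, V newValue);
    }

    private final Map<K, V> entries = new ConcurrentHashMap<>();
    private final List<EntryListener<K, V>> listeners = new CopyOnWriteArrayList<>();

    void addListener(EntryListener<K, V> listener) {
        listeners.add(listener);
    }

    /** Stores the value and tells every listener what changed. */
    void put(K key, V value) {
        V old = entries.put(key, value); // null when the key is new
        for (EntryListener<K, V> listener : listeners) {
            listener.entryChanged(key, old, value);
        }
    }
}
```

A listener might, for example, push the change to connected clients or trigger further processing on the node that received the event.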

Page 46:

Putting it All Together

[Diagram: a Load Balancer and Web Sockets in front of five JEE Cluster Nodes, backed by a Data Grid of 21 Application nodes with Caches on which Processes run]

Page 47:

BE RADICAL

Build New Architectures

With Data Grids!

Page 48:

Extreme Performance

• Reduced Latency
– Data close to processing
– Reduce roundtrips and expensive calculations

• Parallel Processing
– Distributed Processing (Map-Reduce-like)
– Distributed Query Processing

Page 49:

Extreme Scalability

• O(1) Writes and Reads
– Worst Case two hops
– No increase with number of nodes

• Data Volume Increases with Nodes
– Large data volumes stored in the Data Grid

• Elastic Topology
– Clusters Rebalance with node changes

Page 50:

Extreme Availability

• No Single Point of Failure
– Duplicates prevent data loss
– Duplicate Numbers Configurable

• Write Behind – decouples Database Availability

• Self Healing
– Removing Nodes causes rebalancing
