Caching Grid In A Nutshell


Transcript of Caching Grid In A Nutshell

Page 1: Caching Grid In A Nutshell

Caching and Grid in a Nutshell

An overview of caching and grid products with case studies

by John Davies

Page 2: Caching Grid In A Nutshell

Agenda

• Introduction
• The needs of the banking world (some examples)
• Grid vs. Caching
  – When is a cache not a cache?
• Gemstone, GigaSpaces and Tangosol
  – A data caching example in each
  – The Master-Worker pattern
• Getting data onto and off the grid
• Handling complex data
• Q&A

Page 3: Caching Grid In A Nutshell

The Speaker (John Davies)
• 25 years in IT; Hardware, Assembler (Z80), C, C++ and then Java
  – Mostly on UNIX and Linux
  – Also played with Occam, Objective C, Ruby & Perl
• Co-authored 5 Java Wrox titles in 2000/2001
  – Including Java Server Programming J2EE (1.3) & Java XML
• Several senior positions in major banks
  – Global Head of Technical Consulting (Architecture) at BNP Paribas, London, Paris and New York (2001-2003)
  – Global Head of Technical Architecture (Prime Brokerage) at JPMorgan, London and New York (2004-2006)
• CTO and co-founder of C24 (www.c24.biz)
  – London-based, founded in 2000, worldwide customer base
  – Leading provider of SWIFT, ISO-20022 & FpML integration technology
  – Used by the majority of large banks and clearing houses

Page 4: Caching Grid In A Nutshell

Requirements (an example)
• 12,000 prices a second (bid and ask) going out to 10 options exchanges
  – One exchange allows 20 quotes per batch and 100 batches per second
  – Which of the 12,000 updates do we quote?
• Parse and store the messages
  – Take the difference between the new price and the previous quote on the exchange
  – Sort the results (in real time) and then publish the top 20 every 10ms, updating the "quoted" store (a rough sketch of this selection step follows below)
• Repeat the above for the other 9 exchanges
  – Each will have different rules about what can and can't be published
  – Oh yes, and while you're at it, make it all scalable, resilient and highly available
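
As a rough illustration of the "top 20 every 10ms" step, the selection can be as simple as sorting instruments by how far their latest price has moved from the last quote sent to the exchange. This sketch is my own (the QuoteSelector class and its names are not from the talk):

// Minimal sketch of the quote-selection step (illustrative only)
import java.util.*;

class QuoteSelector {
    // last price actually sent to the exchange, keyed by instrument
    private final Map<String, Double> lastQuoted = new HashMap<>();

    // pick the 20 instruments whose latest price moved furthest from the last published quote
    List<String> top20(Map<String, Double> latest) {
        List<String> ids = new ArrayList<>(latest.keySet());
        ids.sort(Comparator.comparingDouble(
                (String id) -> Math.abs(latest.get(id) - lastQuoted.getOrDefault(id, 0.0)))
            .reversed());
        List<String> batch = new ArrayList<>(ids.subList(0, Math.min(20, ids.size())));
        for (String id : batch) {
            lastQuoted.put(id, latest.get(id));   // update the "quoted" store
        }
        return batch;
    }
}

In practice the 100-batches-per-second limit and the per-exchange rules sit on top of this, but the core is just a sort over the price deltas.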

Page 5: Caching Grid In A Nutshell

Another example
• The bank's positions (how much it owes/owns of each stock) are stored in a database
  – Not huge, around 1GB
• Complex pricing calculations (run on several machines) need read access to most of the data
  – The process takes several minutes to run the queries
• If the bank can't get the results quickly it doesn't know its risks
  – It can't decide whether to buy or sell stocks

Page 6: Caching Grid In A Nutshell

Data vs. Compute Grid
• Most tasks need numbers to crunch
  – If there are a lot of numbers to crunch then it's a data grid
  – If there's a lot of crunching then it's a compute grid
• An example of a data grid...
  – 1,000 users need to access and update trade data on demand
  – Data volumes are typically around 1GB
  – This can often be thought of as a data cache
• An example of a compute grid...
  – 1,000 CPUs to calculate new curves in "real time" as prices change
  – This is NOT a cache

Page 7: Caching Grid In A Nutshell

When is a cache not a cache?
• "Cache" comes from the French for "to hide"
  – But who cares what the French think?
  – This usually means it is hiding a database
• If there's no database then it's not really a cache
  – A distributed data grid without a database behind it is not a cache
• Data is often short-lived, so there is little point in writing it to a "classic" database
  – Since it's distributed across many machines it's pretty safe
  – This scenario can be termed a "distributed in-memory database"
  – A "classic" database (e.g. Oracle) can be used for archive

Page 8: Caching Grid In A Nutshell

So, data grid or compute grid?
• There's a thin line between a data grid (cache) and a compute grid
  – What if we need lots of crunching on lots of data? Not unusual
• This is the problem the vendors have in positioning their products
  – Some come from a compute background but can also function well as a data grid or cache (e.g. GigaSpaces)
  – Some come from the in-memory database or caching background and can also provide an excellent platform for number crunching (e.g. Gemstone and Tangosol)
• In every case there is one thing in common...

Page 9: Caching Grid In A Nutshell

Distributed computing
• The whole point is to distribute the load
• In a nutshell, grid computing is distributed computing
  – In effect it's been around since the days of CORBA (early 90s)
• I'm excluding WAN grids here (e.g. SETI etc.)
• What the vendors add are APIs, performance, management, integration, standards and GUIs
  – And, in most cases, nice friendly faces
• Companies in this space include...
  – DataSynapse, Gemstone, GigaSpaces, Platform, Tangosol and Terracotta
  – There are others but we'll look at the main Java players...

Page 10: Caching Grid In A Nutshell

The main Java players
• Gemstone
  – Started in OO databases in 1982, initially Smalltalk then C++
  – Now naturally extended to a distributed database
• GigaSpaces
  – Started as an implementation of Sun's JavaSpaces (part of Jini)
  – Now widely used in Financial Services compute grids
  – Probably the "purest" compute grid solution
• Tangosol
  – Started in "classic" caching in 2000
  – Now regarded as the strongest caching vendor (widest usage)
  – Also widely used in Financial Services for caching and data grids

Page 11: Caching Grid In A Nutshell

Gemstone
• Main product is GemFire
  – Biggest selling point over the others is native C/C++ implementations
  – They seem to term this an EDF ("Enterprise Data Fabric")
• Have an interesting concept of "Real-Time Events"
  – Basically continuous queries, e.g.
        select * from trades where traderID = 'Fred'
    can be made to run forever, returning any new inserts matching the query (a sketch follows at the end of this slide)
• Have a JDBC driver that replaces the existing one and works as a cache
  – Nice and easy to use
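
As a rough sketch of what such a continuous query looks like in code (this uses the CqQuery API from later GemFire/Apache Geode releases, so the package and class names are an assumption and may differ from the version shown in the talk):

// Hedged sketch: a continuous query pushed to a listener (GemFire/Geode CqQuery API)
// Classes used: org.apache.geode.cache.query.{QueryService, CqQuery, CqAttributesFactory, CqListener, CqEvent}
// Exception handling omitted for clarity; CQs normally run from a client cache against a server
QueryService queryService = cache.getQueryService();

CqAttributesFactory caf = new CqAttributesFactory();
caf.addCqListener(new CqListener() {
    public void onEvent(CqEvent event) {
        // called for every new or updated trade matching the query, for as long as the CQ runs
        System.out.println("New matching trade: " + event.getNewValue());
    }
    public void onError(CqEvent event) { /* query error handling */ }
    public void close() { }
});

CqQuery cq = queryService.newCq("fredsTrades",
        "select * from /trades where traderID = 'Fred'", caf.create());
cq.execute();   // runs until cq.close() is called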

Page 12: Caching Grid In A Nutshell

Tangosol
• Product is "Coherence"
  – Started as a cache and has remained in this space
  – Early success perhaps due to the inefficiencies of EJBs
• Extremely easy to use
  – Essentially distributed HashMaps
  – The API is already known to most Java programmers
• Includes an event mechanism for active queries (see the sketch at the end of this slide)
• Partnered with and well integrated into many JEE vendors
  – Hooks well into Spring, Hibernate and KODO etc.
• Now playing strongly into the grid market place
  – Through partnerships with DataSynapse and Platform et al
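
To illustrate the event mechanism mentioned above, here is a minimal sketch using Coherence's standard MapListener and Filter API; the "Prices" cache name is my own and the exact classes are an assumption based on the public Coherence API:

// Hedged sketch: push events for one currency pair out of a Coherence cache
// Classes used: com.tangosol.net.{CacheFactory, NamedCache},
//               com.tangosol.util.{MapListener, MapEvent}, com.tangosol.util.filter.EqualsFilter
NamedCache prices = CacheFactory.getCache("Prices");   // hypothetical cache name

prices.addMapListener(new MapListener() {
    public void entryInserted(MapEvent e) { System.out.println("New price: " + e.getNewValue()); }
    public void entryUpdated(MapEvent e)  { System.out.println("Updated price: " + e.getNewValue()); }
    public void entryDeleted(MapEvent e)  { System.out.println("Removed price: " + e.getOldValue()); }
}, new EqualsFilter("getKey", "GBP/NOK-SPOT"), false);  // only events for GBP/NOK spot, full events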

Page 13: Caching Grid In A Nutshell

GigaSpaces
• GigaSpaces were one of the first implementations of Jini's JavaSpaces
  – Jini was originally sold around the mobile phone networks, i.e. massively distributed
• The JavaSpaces API is incredibly simple
  – Difficult to understand why Sun didn't back this
• GigaSpaces have now got into caching, JMS, ESB and SBA ("Space-Based Architecture")
  – Also very well integrated into Spring
  – Recently Mule integration has extended solutions into ESBs
• JavaSpaces is by default event-driven and distributed

Page 14: Caching Grid In A Nutshell

How do they work?
• Let's take two scenarios...
  1. Prices being updated, we need to distribute the data
     – Typically a data grid (not a cache because there's no database)
     – We want the data to be "persisted" at the remote client
  2. Calculations need to be performed on data using all available CPUs
     – More of a compute grid
• Scenario 1 we'll look at with three options
  – Scenario 2 with just GigaSpaces
• Mainly for time; GemStone and Tangosol do an equally impressive job and the code is very similar

Page 15: Caching Grid In A Nutshell

The data we want to store...
• The data is foreign exchange data
  – CurrencyPair (GBP/NOK)
  – Period (Spot)
  – Rate (12.20950)
• We want to be able to retrieve the latest exchange rate for any given currency pair, for any period

class Price {
    private String currencyPair;
    private String period;
    private Double rate;

    // Constructors and getters / setters and other methods...

    public String getKey() { return currencyPair + "-" + period; }
}

Page 16: Caching Grid In A Nutshell

Writing a Price to GemFire
• Connect to the distributed system and get the Cache reference

// Create / Find a cache (using the Map interface)
DistributedSystem ds = DistributedSystem.connect();
// Get the singleton instance of the cache
Cache cache = CacheFactory.create(ds);

• Instantiate your object and simply put it into a Map

// Create / Find the Data Region "Prices" in the cache
Map prices = (Map) cache.getRegion("Prices");
// Write the Price to the cache...
prices.put(price.getKey(), price);

• As you can see this couldn't be easier

Page 17: Caching Grid In A Nutshell

Reading a Price from GemFire

// Get access to the Data Region "Prices" and cast it as a java.util.Map
Map map = (Map) cache.getRegion("Prices");
// Retrieve the latest spot price for GBP/NOK
Price myPrice = (Price) map.get("GBP/NOK-SPOT");

• All GemFire Data Regions are indexed on the key used in put()

// Get access to the Data Region "Prices"
Region prices = cache.getRegion("Prices");
// If the retrieval is not based on the primary key, you can use OQL
// Retrieve the latest spot price for GBP/NOK
SelectResults results = prices.query("getKey() = 'GBP/NOK-SPOT'");
for (Iterator iter = results.iterator(); iter.hasNext(); ) {
    Price myPrice = (Price) iter.next();
}

• All GemFire Data Regions can be indexed on fields and/or methods (see the sketch at the end of this slide)
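
Creating such an index might look roughly like this; the sketch uses the QueryService.createIndex signature from later GemFire/Apache Geode releases, so treat the exact method as an assumption:

// Hedged sketch: a functional index on getKey() for the Prices region
// (Geode-era API; exceptions omitted for clarity)
QueryService queryService = cache.getQueryService();
queryService.createIndex("priceKeyIndex", "p.getKey()", "/Prices p");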

Page 18: Caching Grid In A Nutshell

Writing a Price to GigaSpaces
• In "classic" JavaSpaces every object in the space needs to implement "Entry"
  – GigaSpaces have optimised this out in their most recent release (a sketch of the classic Entry form follows at the end of this slide)

// Find the space
JavaSpace space = (JavaSpace) SpaceFinder.find("jini://*/*/mySpace");

space.write(myPrice, null, Integer.MAX_VALUE);   // null transaction

• Because Sun haven't evolved Jini and JavaSpaces, GigaSpaces (and others) have had to do the innovation
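
For reference, the "classic" JavaSpaces form of Price mentioned above would look something like this sketch (not code from the talk): entries expose public fields and a public no-argument constructor so that partially filled instances can act as match templates.

// Hedged sketch: Price as a classic JavaSpaces Entry
import net.jini.core.entry.Entry;

public class PriceEntry implements Entry {
    // public fields: null values act as wildcards when the object is used as a template
    public String currencyPair;
    public String period;
    public Double rate;

    public PriceEntry() { }   // required public no-arg constructor

    public PriceEntry(String currencyPair, String period, Double rate) {
        this.currencyPair = currencyPair;
        this.period = period;
        this.rate = rate;
    }
}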

Page 19: Caching Grid In A Nutshell

Reading a Price from GigaSpaces

// Find the space
JavaSpace space = (JavaSpace) SpaceFinder.find("jini://*/*/mySpace");

// Define a template of what we're looking for...
Price template = new Price("GBP/NOK", "SPOT", null);

Object entry = space.read(template, null, 10000);   // null transaction
Price myPrice = (Price) entry;

• Optimisations and exception handling were left out for clarity, e.g.
  – We would normally use a "snapshot" of the template (see the sketch below)
• Notice how the query is done by an interface/template rather than a data-centric query
  – This is more in line with SOA, i.e. service-oriented as opposed to data-oriented
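
The "snapshot" optimisation mentioned above would look roughly like this (a sketch; JavaSpace.snapshot() is part of the standard interface and avoids re-serialising the same template on every call, assuming the template type is a JavaSpaces Entry):

// Hedged sketch: pre-serialise the template once and reuse it for repeated reads
Entry snapshot = space.snapshot(template);          // net.jini.core.entry.Entry
while (stillPublishingPrices) {                     // hypothetical loop condition
    Price latest = (Price) space.read(snapshot, null, 10000);
    // ... use the latest price ...
}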

• So how does GigaSpaces work?

Page 20: Caching Grid In A Nutshell

GigaSpaces clustering

[Diagram: GigaSpaces clustering – prices are written into one space instance, replication keeps the other space instances in sync, and clients read prices from any of them]

Page 21: Caching Grid In A Nutshell

Writing a Price to Coherence
• We don't need to implement anything
  – Since the data is distributed it must of course be Serializable

// Create / Find a cache (using the Map interface)
NamedCache map = CacheFactory.getCache("myCache");

// Write the Price to the cache...
map.put(price.getKey(), price);

• Creating an index...

// Create an index on the "key"
map.addIndex(new ReflectionExtractor("getKey"), false, null);

• As you can see this couldn't be easier (a small query sketch using this index follows at the end of this slide)
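
Once that index exists, querying against it might look like this sketch; it uses Coherence's standard Filter API (EqualsFilter and QueryMap.entrySet), which I'm assuming here rather than quoting from the talk:

// Hedged sketch: query the cache by the indexed getKey() value
// Classes used: com.tangosol.util.filter.EqualsFilter, java.util.Map, java.util.Set
Set entries = map.entrySet(new EqualsFilter("getKey", "GBP/NOK-SPOT"));
for (Object o : entries) {
    Map.Entry entry = (Map.Entry) o;
    Price myPrice = (Price) entry.getValue();
    // ... use the price ...
}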

Page 22: Caching Grid In A Nutshell

Reading a Price from Coherence

// Create / Find a cache (using the Map interface)
NamedCache map = CacheFactory.getCache("myCache");

// Retrieve the latest spot price for GBP/NOK
Price myPrice = (Price) map.get("GBP/NOK-SPOT");

• At this level Coherence works in a similar way to GigaSpaces

[Diagram: as with GigaSpaces – trades are written into one cache node, replication distributes them, and clients read trades from the other nodes]

Page 23: Caching Grid In A Nutshell

OK, so it's the same – or is it?
• Replication is the basic method used for data distribution
  – It's not quite this simple though; the devil's in the details...
• Looking into the options we see several
  – Synchronous and asynchronous replication
  – Partitioned replication
  – Load balancing
  – Failover
  – Local/near optimisation
• Management of the above is a vital feature of most applications
  – We'll ignore the pretty GUIs for today and stick with code

Page 24: Caching Grid In A Nutshell

Documentation
• Both Tangosol and GigaSpaces sent me excellent documentation with just hours of notice; I just wish I could do them justice...
• Each caching scenario is well explained through diagrams

Page 25: Caching Grid In A Nutshell

Partitioned Topology
• Thanks to Cameron Purdy (Tangosol) for these diagrams...
  – GigaSpaces have an excellent Wiki for this too
• Goal: extreme scalability
• Solution: transparently partition the cache data to distribute the load across all cluster members
• Linear scalability: by partitioning the data evenly, the per-port throughput (the amount of work being performed by each server) remains constant
• Benefits

Page 26: Caching Grid In A Nutshell

Distributed processing
• We send data to the grid and it returns results
  – An interesting example is a Mandelbrot calculation
  – We send co-ordinates and the engine sends back small graphics blocks (pixels)
  – This example is (-1.375+0.625i, -0.75+0i)
• This is an easy example to break up into smaller parts
  – Each one being passed as an individual unit of work
• This is similar to the complex calculations carried out in financial pricing engines
  – They calculate Monte Carlo value-at-risk (VaR)

Page 27: Caching Grid In A Nutshell

Master / Worker Pattern
• The "Master" writes tasks to a distributed container
  – E.g. space, map, cache etc.
• Remote "Workers" read the tasks and execute their "run" method(s)
  – A "Task" is just a POJO with data and methods
• The Master then takes out the completed results

Page 28: Caching Grid In A Nutshell

Master / Worker code
• The "Task" object
  – These are filled with coordinates by the master and written to the space

public class Task {
    public Boolean isProcessed = false;
    public Coordinate data = null;      // Input
    public Pixels[] graphics = null;    // Output

    public Task run() {
        calcMandelbrot();
        isProcessed = true;
        return this;
    }
    // All the other stuff...
}

• The Workers execute the randomly read Tasks
  – Returning the result to the space (this is done in a transaction; a sketch of the Master side follows at the end of this slide)

Task task = (Task) space.take(template, tr, 10000);
Task result = task.run();
space.write(result, tr, 10000);
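
The Master side isn't shown on the slide; a minimal sketch of it, using the standard JavaSpaces calls and a couple of hypothetical helpers (the coordinates list, expectedResults and render() are mine, not from the talk), would be:

// Hedged sketch: the "Master" half of the pattern
// 1. Write one Task per coordinate block into the space
for (Coordinate c : coordinates) {                 // 'coordinates' is hypothetical
    Task task = new Task();
    task.data = c;
    space.write(task, null, Lease.FOREVER);        // net.jini.core.lease.Lease
}

// 2. Take back the completed results as the workers finish them
Task doneTemplate = new Task();
doneTemplate.isProcessed = Boolean.TRUE;           // match only processed tasks (null fields are wildcards)
for (int i = 0; i < expectedResults; i++) {        // 'expectedResults' is hypothetical
    Task result = (Task) space.take(doneTemplate, null, 10000);
    render(result.graphics);                       // hypothetical rendering step
}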

Page 29: Caching Grid In A Nutshell

Advantages of Master / Worker
• Since all tasks are available to every worker we simply need to add workers to scale
• Scalability is dynamic
  – We can double/triple the CPU power without having to stop the system
  – Try this with JEE!
• This dynamic scalability isn't unique to GigaSpaces; Tangosol can also dynamically scale

[Diagram: tasks are submitted into the space and picked up by a pool of workers]

Page 30: Caching Grid In A Nutshell

"Grid Engines"

[Diagram of the grid engine, with numbered callouts:]
1. The master writes tasks into the space and gets back the results submitted by the workers
2. The grid engine performs some scheduling of the tasks based on user-defined business logic
3. Workers Manager – an interface to control and monitor the workers
4. Workers take a task and write back the result
5. Space – the collaboration area for all tasks and results

Page 31: Caching Grid In A Nutshell

Getting data into the grid
• Having a very powerful processing engine is one thing; how do we get the data into the engine fast enough?
• The solution is in the integration technology
• Ideally we want any payload on any transport
  – From WebServices (SOAP/HTTP or JMS) to Object/JRMP (RMI)
• To define the services and the relationship between the transport and the payload we can use WSDL
  – But we're not bound to XML at this point

Page 32: Caching Grid In A Nutshell

Artix
• IONA have evolved an interesting integration solution
• From their days in CORBA they have evolved IDL into WSDL and Java
  – And extended IIOP to include MQ, RV, JMS, HTTP etc.
  – While still maintaining support for C, C++, .NET (boo) and mainframe
• We can effectively expose our Java grid or cache to C, C++ etc. or even WebService-enable the grid
  – This is probably the future of ESBs and SOA, i.e. grid-based containers and persistence with Artix-style service enablement
• We used Artix with Tangosol at a large bank in London

Page 33: Caching Grid In A Nutshell

Complex payloads
• How do we get a complex derivative into a grid?
• Simple: bind the definition of the derivative into code and send it in as an object
  – This is faster and smaller than XML
  – Maintains all of the features of XML (XPath etc.)
  – Can also include business rules and constraints
  – Can always be turned back into XML on demand
• FpML (Financial products Markup Language)
  – Used by virtually every bank
  – The schema is immensely complex but well designed
  – Built for extensibility and always extended

Page 34: Caching Grid In A Nutshell

C24's Integration Objects
• Integration Objects (IO) is a binding tool that binds not only XML to Java but also CSVs and complex standards
• It will also bind constraints and rules into the code
  – For example "ValueDate must be after TradeDate"; this cannot be expressed in XML Schema but is part of the FpML standard (a hypothetical illustration follows at the end of this slide)
  – We use OCL, Schematron XPath and Java to define rules
• C24's IO generates an FpML jar that defines FpML-based objects
  – These can now be used in the grid
  – This greatly simplifies dealing with complex data like derivatives
• Essentially much of the code is being generated
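
Purely as a hypothetical illustration of what such a bound rule amounts to (this is not C24's generated code, just a sketch of the idea of carrying the constraint inside the bound object):

// Hypothetical sketch: a cross-field constraint enforced by the bound object itself
import java.util.Date;

public class TradeDates {
    private Date tradeDate;
    private Date valueDate;

    // the FpML-style rule "ValueDate must be after TradeDate", expressed in code
    public void validate() {
        if (tradeDate == null || valueDate == null || !valueDate.after(tradeDate)) {
            throw new IllegalStateException("ValueDate must be after TradeDate");
        }
    }

    // getters / setters omitted for brevity
}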

Page 35: Caching Grid In A Nutshell

Conclusion
• Caching (data grids) and compute grids are the app-servers of the future, perfect for ESBs
  – Applications run in distributed containers accessing distributed data
  – Mule is an excellent example, being able to use both Tangosol and GigaSpaces
• We can re-use hardware by sharing all applications and data across all hardware
  – This was actually achieved in a large bank
• Binding technologies can vastly simplify working with complex data in grids (compute and data)

Page 36: Caching Grid In A Nutshell

Thank you!
• Many thanks to Gemstone, GigaSpaces and Tangosol for guidance, slides and code. Special thanks to
  – Jags Ramnarayan (Gemstone)
  – Cameron Purdy (Tangosol)
  – Shay Hassidim (GigaSpaces)

Any Questions?

[email protected] (www.C24.biz)