Grid Asia2008 Low Latency Data Grid

10
Copyright © 2006, GemStone Systems Inc. All Rights Reserved. Low Latency Data Grids in Finance Jags Ramnarayan Chief Architect GemStone Systems [email protected] om

description

Investment banks rely extensively on grids to dramatically increase throughput for their calculations for analytics (especially risk). The traditional design pattern involves executing compute intensive workflows where jobs require movement of large data files to the compute nodes, calculation results creating files which then are again consumed by the next job in the flow. Increasingly, the pattern is shifting to running short lived tasks where the bottleneck is data i.e. the time spent to move data back and forth between compute nodes can be overwhelming - turning a compute bound job to be a IO bound one. For instance, real time pricing for financial derivative instruments could just take a few milliseconds, but, the time required for the data transfer could be hundreds of milliseconds.The talk focuses on one architectural pattern gaining popularity - move the compute to the data. The data is partitioned in grid memory across many nodes and the compute task is routed to the node with the right data set provisioned based on the data hints it provides during launch. We discuss the features of the main-memory based data grid solution that uses different data partitioning policies such as hashing or data relationship based to manage data across a large cluster of nodes. We also discuss techniques for rebalancing data and behavior across the Grid nodes to achieve the best throughput and lowest latency.

Transcript of Grid Asia2008 Low Latency Data Grid

Page 1: Grid Asia2008 Low Latency Data Grid

Copyright © 2006, GemStone Systems Inc. All Rights Reserved.

Low Latency Data Grids in Finance

Jags RamnarayanChief Architect

GemStone [email protected]

Page 2: Grid Asia2008 Low Latency Data Grid

Copyright © 2008, GemStone Systems Inc. All Rights Reserved.

Background on GemStone Systems

• Known for its Object Database technology since 1982

• Now specializes in memory-oriented distributed data management

• Over 200 installed customers in global 2000

• Grid focus driven by:– Very high performance with predictable

throughput, latency and availability• Capital markets• Large e-commerce portals – real time fraud• Federal intelligence

Page 3: Grid Asia2008 Low Latency Data Grid

Copyright © 2008, GemStone Systems Inc. All Rights Reserved.

Use of Grid computing in finance

• Two primary areas in tier 1 investment banks– Risk Analytics– Pricing

Page 4: Grid Asia2008 Low Latency Data Grid

Copyright © 2008, GemStone Systems Inc. All Rights Reserved.

State of affairs – Risk Analytics

• Deluge of data (market data, trade data, etc)

• Overnight batch job doesn’t cut it– Want intra-day risk metrics– In some cases, real-time risk

• Explosion in simulation scenarios– More accurate risk exposure– Compliance

• Increasing number of smaller calculations

Page 5: Grid Asia2008 Low Latency Data Grid

Copyright © 2008, GemStone Systems Inc. All Rights Reserved.

State of affairs – Pricing (derivatives)

• Too many products

• Increasing complexity in products– Too many underliers– Many relationships

• Hunger for latency reduction– Calculating the new price with lowest possible

latency– Pushing the prices to distributed applications

Page 6: Grid Asia2008 Low Latency Data Grid

Copyright © 2008, GemStone Systems Inc. All Rights Reserved.

Where is the problem?

Grid

Scheduler

Compute farm

Data warehouses Rational databases

• Database/file access contention– Too many concurrent

connections– Large database server

bottlenecks on network– Queries results are large

causing CPU bottlenecks– Even a parallel file system

throttled by disk speeds• Too much data transfer

– Between tasks, Jobs– Between Grid and file systems,

databases– Data consistency issues

File system

CPU bound job turns into a IO bound Job

Page 7: Grid Asia2008 Low Latency Data Grid

Copyright © 2008, GemStone Systems Inc. All Rights Reserved.

Data Fabric for Risk Analytics

OtherDatabases

File system

JavaClient

C++Client

C#Client When data is stored, it is

transparently replicated and/or partitioned;

Redundant storage can be in memory and/or on disk—ensures continuous availability

Keep reference data replicated on many; partition trade data

Machine nodes can be added dynamically to expand storage capacity or to handle increased client load

Pool memory (and disk) across cluster ; parallelize

data access and computation to achieve very high

aggregate throughput

Page 8: Grid Asia2008 Low Latency Data Grid

Copyright © 2008, GemStone Systems Inc. All Rights Reserved.

Data Fabric for Risk Analytics

OtherDatabases

File system

JavaClient

C++Client

C#Client

TaskFlow - As results are generated push events to compute nodes to initiate subsequent computation Avoid bulk data transfer

across tasks or Jobs

Thousands of compute nodes can maintain local cache of most frequently used data; Optionally use local disk for overflow

Move reference data to local cache

Synchronous read through, write through or

Asynchronous write-behind to other data sources and

sinks

Page 9: Grid Asia2008 Low Latency Data Grid

Copyright © 2008, GemStone Systems Inc. All Rights Reserved.

Move business logic to data

f1 , f2 , … fn

FIFO Queue

Data fabric Resources

Exec function

s

Sept TradesSubmit (f1) ->

AggregateHighValueTrades(<input data>,

“where trades.month=‘Sept’)

Function (f1)

Function (f2)

Principle: Move task to computational resource with most of the relevant data before considering other nodes where data transfer becomes necessary

Parallel function execution service (“Map Reduce”) Data dependency hints

• Routing key, collection of keys, “where clause(s)” Serial or parallel execution

Page 10: Grid Asia2008 Low Latency Data Grid

Copyright © 2008, GemStone Systems Inc. All Rights Reserved.

Key lessons

• Apps should think about capitalizing memory across Grid (it is abundant)

• Keep IO cycles to minimum through main memory caching of operational data sets– Scavange Grid memory and avoid data source access

• Achieve linear scaling for your Grid apps by horizontally partitioning your data and behavior– Read “Pat helland’s – Life beyond Distributed transactions” (

http://www-db.cs.wisc.edu/cidr/cidr2007/papers/cidr07p15.pdf)

• Get more info on the GemFire data fabric– http://www.gemstone.com/gemfire