Big Data & Rocket Fuel

Dr Raj Subramani, HSBC; Reza Rokni, Google Cloud, Solutions Architect; Adrian Poole, Google Cloud, Financial Services

Page 1: Big Data & Rocket Fuel - BigDataFinance, bigdatafinance.eu/.../uploads/2017/07/Big-Data-Rocket-Fuel.pdf

Big Data & Rocket Fuel

Dr Raj Subramani, HSBC
Reza Rokni, Google Cloud, Solutions Architect
Adrian Poole, Google Cloud, Financial Services

Page 2

Google’s Mission

Organize the world’s information and make it universally accessible and useful

Eight cloud products with ONE BILLION users

Page 3

Increasing Marginal Cost of Change

18 years of Google R&D / Investment

[Chart. Vertical axis: Marginal cost of change ($). Horizontal axis: Increasing complexity of systems and processes. Curves: Traditional Architectures; Google Cloud Native Architectures (GCP). Annotation: Prohibitively Expensive]

Page 4

Containers at Google

[Chart, 2004 to 2016. Series: Core Ops Team; Number of running jobs]

Enabled Google to grow our fleet over 10x faster than we grew our ops team

Page 5

Google’s innovation in data

[Timeline. Axis ticks: 2002, 2004, 2006, 2008, 2010, 2012, 2013, 2016. Technologies shown: GFS, MapReduce, Bigtable, Colossus, Dremel, Flume, Megastore, Spanner, Millwheel, Pub/Sub, F1, Dataflow, TensorFlow]

Proprietary + Confidential

Page 6

Google’s innovation in data

[Timeline. Axis ticks: 2002, 2004, 2006, 2008, 2010, 2012, 2013, 2016. Google Cloud products shown: GCS, Dataproc, Bigtable, BigQuery, Dataflow, Datastore, Pub/Sub, Spanner, Cloud ML. Additional label: NoSQL]

Page 7

Now available on Google Cloud Platform

Compute: Compute Engine, App Engine, Container Engine

Storage & Databases: Storage, Cloud SQL, Bigtable, Spanner, Datastore

Big Data: BigQuery, Pub/Sub, Dataflow, Dataproc, Datalab

Machine Learning: Machine Learning, Speech API, Translate API, Vision API

Page 8

Lesson of the last 10 years...

● Democratise ML

● Big datasets beat fancy algorithms

● Good Models

● Lots of compute

Page 9

Google BigQuery

BigQuery is Google's fully managed, petabyte-scale, low-cost enterprise data warehouse for analytics. BigQuery is serverless: there is no infrastructure to manage and you don't need a database administrator, so you can focus on analyzing data to find meaningful insights using familiar SQL. BigQuery is a powerful Big Data analytics platform used by all types of organizations, from startups to Fortune 500 companies.

Simple: Fully Managed and Serverless

Convenient: MB to PB Scale, Fast, with the Convenience of SQL

Secure: Encrypted, Durable and Highly Available

Page 10

What is Cloud Dataflow?

Intelligently scales to millions of QPS

Open source programming model

Unified batch and streaming processing

Fully managed, no-ops data processing
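To make "unified batch and streaming processing" concrete, here is a minimal sketch in plain Python. It is an analogy only and does not use the Dataflow SDK: the names `enrich`, `batch` and `live_feed` are hypothetical, and a generator stands in for an unbounded message stream. The point it illustrates is that one transform is written once and applied unchanged to a bounded and an unbounded input.

```python
from itertools import islice
from typing import Iterable, Iterator


def enrich(trades: Iterable[dict]) -> Iterator[dict]:
    """One transform, written once, applied lazily to any iterable of trades."""
    for trade in trades:
        # Add a derived field: notional expressed in millions.
        yield {**trade, "notional_m": trade["notional"] / 1_000_000}


# Bounded input: a finite batch, for example an end-of-day file.
batch = [{"id": 1, "notional": 5_000_000}, {"id": 2, "notional": 250_000}]


def live_feed() -> Iterator[dict]:
    """Unbounded input: a stand-in for a message stream that never ends."""
    trade_id = 0
    while True:
        trade_id += 1
        yield {"id": trade_id, "notional": 1_000_000 * trade_id}


# The same transform handles both inputs.
batch_result = list(enrich(batch))
# An unbounded source can only ever be consumed partially; take the first three.
stream_head = list(islice(enrich(live_feed()), 3))
```

A managed service adds what this sketch leaves out, such as distribution across workers, windowing and watermarks for the unbounded case.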

Page 11

Google Cloud Dataflow

Page 12

Big Data at HSBC Scale

Dr Raj Subramani, HSBC

Page 13

Fundamental Review of the Trading Book (FRTB)

● The Basel Committee on Banking Supervision (BCBS) conducted two assessments (the Regulatory Consistency Assessment Programme, February and December 2013) of capital charges for market risk in the trading books of institutions with approved internal models

● The significant differences in capital charges confirmed that the market risk framework was in need of reform

The regulations, in their final form, were published in January 2016

National supervisors are expected to finalize implementation by January 2019

Banks are expected to report under the new standards by end of 2019

Page 14

Fundamental Review of the Trading Book

[Diagram: FRTB at the centre, surrounded by the following areas]

● Trading Book and Banking Book Boundary

● Treatment of Credit (securitised vs. non-securitised)

● Approach to Risk Management (VaR to Expected Shortfall)

● Incorporation of liquidity horizons

● Treatment of Hedging and Diversification

● Relationship between Internal Model (IM) and Standardized Approach (SA)
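The move from VaR to Expected Shortfall named on this slide can be illustrated with a short historical-simulation sketch. This is an assumption-laden illustration rather than the regulatory calculation: the function name `var_and_es` and the sample P&L are hypothetical, the tail size is simply rounded, and liquidity horizons and stressed calibration are ignored. The default confidence of 0.975 mirrors the 97.5% level used for Expected Shortfall under FRTB; treat it as a parameter.

```python
from typing import Sequence, Tuple


def var_and_es(pnl: Sequence[float], confidence: float = 0.975) -> Tuple[float, float]:
    """Historical VaR and Expected Shortfall, both reported as positive losses."""
    # Convert P&L to losses and sort with the largest loss first.
    losses = sorted((-x for x in pnl), reverse=True)
    # Number of observations in the tail beyond the confidence level (at least one).
    tail_size = max(1, round(len(losses) * (1 - confidence)))
    tail = losses[:tail_size]
    var = tail[-1]              # VaR: the smallest loss inside the tail
    es = sum(tail) / len(tail)  # ES: the average loss inside the tail
    return var, es


# Twenty hypothetical daily P&L observations: -10, -9, ..., 9.
sample_pnl = list(range(-10, 10))
var_90, es_90 = var_and_es(sample_pnl, confidence=0.9)
```

VaR reports only the threshold of the tail, while Expected Shortfall averages everything beyond it, so ES is never smaller than VaR at the same confidence level.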

Page 15

Working in the Cloud – the tradeoffs

Technology outcomes
● Business-focused IT solution
● Access to latest technology
● Rapid prototyping
● Quicker time to market

Cost outcomes
● Reduced capacity lag
● Scalability and performance
● Reduced total cost of ownership

Governance risks
● Internal security clearance
● Regulatory approval
● Data sharing across borders
● Geo-political issues

Public cloud risks
● Data security risks
● Lock-in risks
● Third party dependency risks

Page 16

Cloud Dataflow

[Diagram: data flows from SOURCE (unbounded or bounded) to SINK. Managed components shown:]

● Compute and storage
● Resource management
● Resource auto-scaler
● Dynamic work rebalancer
● Work scheduler
● Monitoring
● Log collection
● Graph optimization
● Auto-healing
● Intelligent watermarking

Page 17

The Anatomy of a Risk Engine

[Pipeline diagram]

● Trade & Market Data: transferred to the cloud (batch or stream)
● Storage and Pub/Sub: bounded and unbounded sources of Market Data and Trade Data
● Dataflow (Analytics): data distribution and workflow across the analytics
● BigQuery: store results
● Post Processing: post-process

Page 18

Dataflow as Risk Engine - Scale and Performance

● 2 million (dummy) plain vanilla, mono-currency interest rate swaps in 12 currencies
● Dummy interest rate market data built from Bonds, Futures and Swaps
● Analytics: open source QuantLib (C++ compiled on Linux)
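The shape of such a run (partition a swap book, price each partition independently, merge totals per currency) can be sketched in a few lines of Python. This is not the presenters' implementation. Everything named here is hypothetical: `Swap`, `FLAT_RATES`, `price`, `price_chunk` and `run` are illustrative, a toy flat-curve formula stands in for QuantLib, and the book has three currencies and 7,500 swaps rather than 12 currencies and 2 million.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List, NamedTuple


class Swap(NamedTuple):
    currency: str
    notional: float
    fixed_rate: float
    years: int


# Hypothetical flat annual rates per currency; a real run would build curves
# from market instruments instead.
FLAT_RATES = {"USD": 0.02, "EUR": 0.01, "GBP": 0.015}


def price(swap: Swap) -> float:
    """PV of a payer swap (pay fixed, receive floating), annual payments, flat curve."""
    r = FLAT_RATES[swap.currency]
    dfs = [(1 + r) ** -t for t in range(1, swap.years + 1)]
    floating_leg = swap.notional * (1 - dfs[-1])
    fixed_leg = swap.notional * swap.fixed_rate * sum(dfs)
    return floating_leg - fixed_leg


def price_chunk(chunk: List[Swap]) -> Dict[str, float]:
    """Price one partition of the book and return its PV total per currency."""
    totals: Dict[str, float] = defaultdict(float)
    for swap in chunk:
        totals[swap.currency] += price(swap)
    return dict(totals)


def run(book: List[Swap], chunk_size: int = 1000, workers: int = 4) -> Dict[str, float]:
    """Partition the book, price partitions concurrently, merge the partial totals."""
    chunks = [book[i:i + chunk_size] for i in range(0, len(book), chunk_size)]
    totals: Dict[str, float] = defaultdict(float)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for partial in pool.map(price_chunk, chunks):
            for ccy, pv in partial.items():
                totals[ccy] += pv
    return dict(totals)


# A small dummy book: 2,500 five-year swaps per currency, all paying 2% fixed.
book = [Swap(ccy, 1_000_000, 0.02, 5) for ccy in FLAT_RATES for _ in range(2500)]
pv_by_currency = run(book)
```

Threads are used only to keep the sketch short; in CPython they will not speed up CPU-bound pricing. The partition-and-merge structure is the part that carries over to a distributed runner, where each partition would be priced on a separate worker.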

Page 19

Dataflow as Risk Engine - Stateful Analytics

[Diagram: JVM running C++]

● Performance gains are not always obtained straight out of the box

● Application of domain knowledge and expertise will always help tease out the desired performance

Page 20

The Cloud Journey

• Bring the business problem not a technical solution

• Beware the frog in the well

• At Google, Big Data is just data; separating the data from the processing allows for clever combinations to address both scenarios

Page 21

What next?

• Sign up for a Google Cloud account - first $300 free!

• Google Cloud courses @ https://www.coursera.org/ including Qwiklabs

• Contact Ian O’Shea ( [email protected] ) for further info.

Page 22

Thank you

Dr Raj Subramani, HSBC
Reza Rokni, Google Cloud, Solutions Architect
Adrian Poole, Google Cloud, Financial Services