hadoop reference architecturehortonworks.com/wp-content/uploads/2012/02/ref-architecture-ubs.pdf ·...

14
hadoop reference architecture financial services webinar series all informa6on in this document is confiden6al & proprietary

Transcript of hadoop reference architecturehortonworks.com/wp-content/uploads/2012/02/ref-architecture-ubs.pdf ·...

Page 1: hadoop reference architecturehortonworks.com/wp-content/uploads/2012/02/ref-architecture-ubs.pdf · Data Storage (HDFS, NFS) Data Processing (Map Reduce) Extract Data Access Data

hadoop reference architecture  

financial services webinar series 

all informa6on in this document is confiden6al & proprietary 

Page 2: hadoop reference architecturehortonworks.com/wp-content/uploads/2012/02/ref-architecture-ubs.pdf · Data Storage (HDFS, NFS) Data Processing (Map Reduce) Extract Data Access Data

financial services industry issues 

© tresata inc. 2011 

A.  Huge Financial Services industry 

ver6cal… 

B.   …u6lized very limited data 

capabili6es which… 

C.   …drove poor consumer 

underwri6ng decisions… 

D.   …that yielded severely nega6ve 

financial results 

Figure 1: Consumer Credit Loans vs. 

Delinquency Rates Trends 

0.0% 

0.5% 

1.0% 

1.5% 

2.0% 

2.5% 

3.0% 

3.5% 

4.0% 

4.5% 

5.0% 

 ‐    

 0.5  

 1.0  

 1.5  

 2.0  

 2.5  

 3.0  

1990 

1992 

1994 

1996 

1998 

2000 

2002 

2004 

2006 

2008 

2010 

Consumer Loans ($T) 

Delinquency Rate (%) 

Figure 2: Mortgage 1‐4 Loans vs. 

Delinquency Rates Trends 

0.0% 

2.0% 

4.0% 

6.0% 

8.0% 

10.0% 

12.0% 

 ‐    

 2.0  

 4.0  

 6.0  

 8.0  

 10.0  

 12.0  

1990 

1992 

1994 

1996 

1998 

2000 

2002 

2004 

2006 

2008 

2010 

Mortgage 1‐4 Loans ($T) 

Delinquency Rate (%) 

Source: Federal Reserve 2011 

Page 3: hadoop reference architecturehortonworks.com/wp-content/uploads/2012/02/ref-architecture-ubs.pdf · Data Storage (HDFS, NFS) Data Processing (Map Reduce) Extract Data Access Data

big impact on consumer behavior 

© tresata inc. 2011 

•  Debt Overhang 

•  ‘Nega6ve’ Equity Homes 

•  Damaged Credit Scores 

•  Zip Code Myth 

Source: Tresata AnalyMcs 2012 

Page 4: hadoop reference architecturehortonworks.com/wp-content/uploads/2012/02/ref-architecture-ubs.pdf · Data Storage (HDFS, NFS) Data Processing (Map Reduce) Extract Data Access Data

required big data capabili6es 

© tresata inc. 2011 

A.   Current Capabili6es 1.  Sub Samples 

2.  Limited variables 

3.  Limited Mme series 

4.  Very high cost 

 

B.   Future Capabili6es 1.  All customers 

2.  All variables 

3.  Full Mme series 

4.  BeSer AnalyMcs 

 

 

Major Data Analy6cs & 

Technology  Challenge! 

 

Time (Daily) 

Variables 

Individual Customers  Current Industry 

CapabiliMes 

Future Industry CapabiliMes 

Source: Tresata AnalyMcs 2011 

Page 5: hadoop reference architecturehortonworks.com/wp-content/uploads/2012/02/ref-architecture-ubs.pdf · Data Storage (HDFS, NFS) Data Processing (Map Reduce) Extract Data Access Data

where are we in the hype cycle? 

© tresata inc. 2011 

Page 6: hadoop reference architecturehortonworks.com/wp-content/uploads/2012/02/ref-architecture-ubs.pdf · Data Storage (HDFS, NFS) Data Processing (Map Reduce) Extract Data Access Data

delivery & comprehension paradigm 

© tresata inc. 2011 

Source: Cap Gemini 2011 

Page 7: hadoop reference architecturehortonworks.com/wp-content/uploads/2012/02/ref-architecture-ubs.pdf · Data Storage (HDFS, NFS) Data Processing (Map Reduce) Extract Data Access Data

a new big data architecture 

© tresata inc. 2011 

Enterprise Data Systems / Enterprise File Systems (HDFS, NFS, etc.) 

Hadoop Environment 

Data & Analy6cs Pla\orm 

•  Broader leverage of EDS/EFS 

•  Abstracted modularity 

•  As‐a–service Infrastructure 

•  Automated Provision, Spin‐Up & Scale IaaS 

•  Auto‐deploy, auto‐scale Hadoop clusters (don't 

need heavily staffed SME/support, instead 

use commodity model) 

•  Unified, converging story over 

Mme for indexing & algorithms 

•  Grow machine & senMment feedbase 

Enterprise Distributed  

Compute (hpc / grid) 

cpu  gpu  fpga 

TradiMonal  BI / DW / OLAP 

Pvt or Public Cloud (Kit & Opera6ng Model) 

Smart Data AnalyMcs louds 

•  Broader leverage of compute 

•  Abstracted modularity 

storage  & access 

paradigm  

delivery & comprehension 

paradigm  

Batch  Real‐Mme Hybrid 

•  Unified, near real‐Mme "big charts" 

•  Synergy with tradiMonal "quants/algs" 

Page 8: hadoop reference architecturehortonworks.com/wp-content/uploads/2012/02/ref-architecture-ubs.pdf · Data Storage (HDFS, NFS) Data Processing (Map Reduce) Extract Data Access Data

enterprise ready hadoop stack 

© tresata inc. 2011 

Tresata 

Analy6cs Engine 

Security & Monitoring 

Hadoop Opera6ng System 

 Infra

structu

re 

Data Pla\orm 

 

Server Hardware  & Opera6on System 

 

 Apache 

• High bandwidth backplane •  Industry approved & cerMfied security 

•  Fully managed plaiorm, e2e 

•  Private or public cloud 

• Hadoop distribuMon (like Hortonworks)  • DistribuMon support 

Tresata Cer6fied & Packaged BDaaS Stack 

•  Extensible AnalyMcs Plaiorm •  Total View of Customer Engine 

•  Proprietary Algorithms & AnalyMcs 

Page 9: hadoop reference architecturehortonworks.com/wp-content/uploads/2012/02/ref-architecture-ubs.pdf · Data Storage (HDFS, NFS) Data Processing (Map Reduce) Extract Data Access Data

do not add complexity 

© tresata inc. 2011 

Example Enterprise Architecture 

Page 10: hadoop reference architecturehortonworks.com/wp-content/uploads/2012/02/ref-architecture-ubs.pdf · Data Storage (HDFS, NFS) Data Processing (Map Reduce) Extract Data Access Data

10 

make the start easy 

© tresata inc. 2011 

Data In 

Info Out 

Feed IN data from 

mulMple warehouses 

or SOR’s 

Feed BACK info to Business Processes 

Page 11: hadoop reference architecturehortonworks.com/wp-content/uploads/2012/02/ref-architecture-ubs.pdf · Data Storage (HDFS, NFS) Data Processing (Map Reduce) Extract Data Access Data

11 

if its data…you can move it 

© tresata inc. 2011 Courtesy the genius of xkcd 

Page 12: hadoop reference architecturehortonworks.com/wp-content/uploads/2012/02/ref-architecture-ubs.pdf · Data Storage (HDFS, NFS) Data Processing (Map Reduce) Extract Data Access Data

12 

build vs. buy 

© tresata inc. 2011 

EDW  EDW  EDW 

Commodity Hardware (HP, Dell, Super Micro etc) 

Server Opera6ng System (Linux) 

Data Storage (HDFS, NFS) 

Data Processing (Map Reduce) 

Extract 

Data Access  Data Pipeline 

Sqoop  Oozie  Avro PIG  Hive  HBase 

Data Security (Kerberos,Custom) 

Business ApplicaMons 

Viz  EV Report  Analyze  Model 

Hadoop Big Data Pla\orm 

Tresata Pla\orm 

Source: Tresata IllustraMon 2012 

Page 13: hadoop reference architecturehortonworks.com/wp-content/uploads/2012/02/ref-architecture-ubs.pdf · Data Storage (HDFS, NFS) Data Processing (Map Reduce) Extract Data Access Data

13 

focus on use cases 

Structured & Unstructured 

Data Sources  

Store & Process  

•  Clean 

•  Parse 

•  De‐dupe 

•  Match 

•  Join 

Compute  

•  Clustering 

•  Graph 

•  Classifiers 

•  SenMment 

•  Behavior 

Visualize  

•  Heatmaps 

•  Benchmark 

•  Geotag 

•  Outliers 

•  Networks 

Tresata Data Engine (fully engineered on Hadoop) 

Client Data 

Mul6ple  

Use Cases 

Marke6ng 

© tresata inc. 2011 

Social Data 

Market Data  Risk  

Trading 

Source: Tresata IllustraMon 2012 

Page 14: hadoop reference architecturehortonworks.com/wp-content/uploads/2012/02/ref-architecture-ubs.pdf · Data Storage (HDFS, NFS) Data Processing (Map Reduce) Extract Data Access Data

14 

how hortonworks can help 

•  Training and Cer6fica6on –  www.hortonworks.com/training/ 

•  Hortonworks Data Pla\orm and Support 

–  www.hortonworks.com/hortonworksdataplaiorm/ 

–  www.hortonworks.com/technology/techpreview/ 

–  www.hortonworks.com/support/ 

•  Educa6onal Webinars –  www.hortonworks.com/webinars/