NATC 2013 - Big Data in Real World by Chandra Kallur, IBM


Page 1: NATC 2013 - Big Data in Real World by Chandra Kallur, IBM

© 2013 IBM Corporation

Big Data in the Real World

Chandra S Kallur

Service Area Leader, Business Analytics and Optimization

December 8, 2013

Page 2

Agenda

Big Data – Myths & Truths

The Big Data Strategy

Examples of Big Data Instantiation in real world

Future of Big Data

What can Big Data do for your Organization?

Page 3

Big Data Myths

Big Data is only about unstructured information

False! Most projects include structured information sources.

Big Data projects are expensive

False! You should start small, and projects should be ROI positive.

Big Data technologies make traditional databases and warehouses obsolete

False! Databases and warehouses remain a vital part of analytic solutions.

Big Data technologies require BIG datasets

False! Flexibility, not data size, is the most important aspect.

Page 4

Big Data: Is It Only For A Few Industries? False?

Page 5

The Big Data Strategy: Move the Analytics Closer to the Data

New analytic applications drive the requirements for a big data platform

• Integrate and manage the full variety, velocity and volume of data

• Apply advanced analytics to information in its native form

• Visualize all available data for ad-hoc analysis

• Development environment for building new analytic applications

• Workload optimization and scheduling

• Security and Governance

Page 6

T-Mobile uses big data to optimize network performance and reduce costs

• Needed a solution to store and analyze two years' worth of Call Detail Records (CDRs), switch, billing and network event data for over 30 million subscribers to identify and address network bottlenecks

• Analyze over 17 billion events per day to provide over 1,300 users with network Quality of Experience (QoE) analytics, traffic engineering, dropped session analytics as well as voice and data session analytics

• Business users can perform ad-hoc network and traffic analysis to identify performance issues in seconds and address them faster

Page 7

Ufone uses real-time analytics to reduce customer churn

Need

• Difficulty in managing marketing campaigns

• No direct ability to correlate campaigns with earned business

• Execute a successful marketing campaign based on real-time customer insights

Benefits

• Analyzed customer call detail records (CDRs) and created customer profile segmentation

• Data is streamed and analyzed in real time, so offers reach clients in a timely manner

• Campaign response time improved from 25% to 50%, CDR analysis time improved from 1 day to 30 seconds, and customer churn was reduced by 15 to 20%

Page 8

A European utility uses streams and predictive analytics to create accurate estimates of demand to fully capture and optimize the use of distributed generation resources

Need

• Real-time, scalable and accurate forecasts at a very low level of locality

• Very high number of forecast models, automatically updated with limited user interaction

• Incorporate local, diverse information, such as local weather conditions or events

• Simulation for test and what-if analysis on huge amounts of data

Benefits

• Accuracy: 20% improvement over industry and academic state of the art, validated onsite with real consumption data

• Performance: hundreds of thousands of time series processed on an IBM Blade server

• Abrupt changes in demand were resolved with network reconfigurations

Page 9

Optimizing capital investments based on double-digit petabyte analysis

• Model the weather to optimize placement of turbines, maximizing power generation and longevity

  • Modeling based on a global 1x1 kilometer grid with hundreds of variables

  • Time-to-analysis curve flattened from 3 weeks to 3 days!

• Build models to cover forecasting and real-time operation of power generation units

  • Wind turbine sensor data collection to store and understand petabytes of actual operating results once the turbine is in production

  • Scope includes service intervals, mean time to failure, and optimization of turbine interaction with wind conditions

Page 10

A large U.S. regulated energy provider deploys condition-based maintenance to assess natural gas pipeline risks

Need

• Correlate data from multiple sources into one actionable platform, utilizing information to better plan and deploy inspection, detection, maintenance, repair and replacement resources and personnel

Benefits

• Unified source of truth by integrating data from:

  • GIS, EAM, historians

  • Corrosion history, drawings, cathodic protection

  • External data sources like weather, soil, etc.

• Analytics-driven, condition-based assessment

• Estimates of mean residual life, true asset age

• The ability to associate asset condition with the probability of failure and of mitigation actions

• Identify prescriptive options on assets

Page 11

TerraEchos identifies and classifies potential security threats – miles away

Need

• More secure facilities

• A U.S. high-security facility needed a physical intrusion detection system able to detect, classify, locate and track potential threats, above and below ground

Benefits

• Because the solution captures and transmits in real time, security personnel are able to have unprecedented insight into any event, even when the disturbance is miles away, and take appropriate action

Page 12

University of Ontario Institute of Technology (UOIT) Detects Neonatal Patient Symptoms Sooner

“Helps detect life-threatening conditions up to 24 hours sooner”

• Performing real-time analytics using physiological data from neonatal babies

• Continuously correlates data from medical monitors to detect subtle changes and alert hospital staff sooner

• Early warning gives caregivers the ability to proactively deal with complications

Results

• Helps detect life-threatening conditions up to 24 hours sooner

• Lower morbidity and improved patient care

Capabilities Utilized

Stream Computing

Page 13

Future of Big Data – Cognitive computing at play

[Figure labels: 99%, 60%, 10%]

• Understands natural language and human speech

• Adapts and learns from user selections and responses

• Generates and evaluates hypotheses for better outcomes

Page 14

Big Data and Watson

Watson can consume insights from Big Data for advanced analysis

InfoSphere BigInsights

• POS Data

• CRM Data

• Social Media

Distilled Insight

• Spending habits

• Social relationships

• Buying trends

Advanced search and analysis

Big Data technology is used to build Watson’s knowledge base

Watson uses the Apache Hadoop open framework to distribute the workload for loading information into memory.

Watson’s Memory: approx. 200M pages of text (to compete on Jeopardy!)

Page 15

[Diagram: DeepQA pipeline. Stages: Question, Question & Topic Analysis, Question Decomposition, Hypothesis Generation, Hypothesis and Evidence Scoring, Synthesis, Final Confidence Merging & Ranking, Answer & Confidence. Annotations: Multiple Interpretations; 100s of sources; 100s of Possible Answers; 1000’s of Pieces of Evidence; 100,000’s of scores from many simultaneous Text Analysis Algorithms.]

Generates and scores many hypotheses using a combination of thousands of Natural Language Processing, Information Retrieval, Machine Learning and Reasoning algorithms.

These gather, evaluate, weigh and balance different types of evidence to deliver the answer with the best support it can find.
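The generate, score, merge and rank flow described above can be illustrated with a toy Python sketch. This is not DeepQA or any Watson API: the `Hypothesis` class, the two scorers, the equal weights and the three sample passages are all hypothetical stand-ins chosen only to show the shape of an evidence-based ranking pipeline.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Everything below is a hypothetical illustration, not the DeepQA implementation.

@dataclass
class Hypothesis:
    answer: str
    scores: Dict[str, float] = field(default_factory=dict)
    confidence: float = 0.0

# A scorer maps (question, candidate answer) to a score in [0, 1].
Scorer = Callable[[str, str], float]

# Toy evidence corpus (assumed for the example).
PASSAGES = [
    "paris is the capital of france",
    "lyon is a city in france",
    "the capital of france is paris",
]

def passage_support(question: str, answer: str) -> float:
    """Fraction of passages that mention the candidate answer."""
    hits = sum(1 for p in PASSAGES if answer.lower() in p)
    return hits / len(PASSAGES)

def question_overlap(question: str, answer: str) -> float:
    """Average share of question terms found in passages mentioning the answer."""
    q_terms = set(question.lower().split())
    supporting = [p for p in PASSAGES if answer.lower() in p]
    if not supporting or not q_terms:
        return 0.0
    overlaps = [len(q_terms & set(p.split())) / len(q_terms) for p in supporting]
    return sum(overlaps) / len(overlaps)

def answer_question(question: str, candidates: List[str],
                    scorers: Dict[str, Scorer],
                    weights: Dict[str, float]) -> List[Hypothesis]:
    # Hypothesis generation: one hypothesis per candidate answer.
    hypotheses = [Hypothesis(answer=c) for c in candidates]
    # Evidence scoring: every scorer evaluates every hypothesis.
    for h in hypotheses:
        for name, scorer in scorers.items():
            h.scores[name] = scorer(question, h.answer)
    # Confidence merging: weighted average of the individual scores.
    total = sum(weights[name] for name in scorers)
    for h in hypotheses:
        h.confidence = sum(weights[n] * s for n, s in h.scores.items()) / total
    # Ranking: highest confidence first.
    return sorted(hypotheses, key=lambda h: h.confidence, reverse=True)

SCORERS = {"passage_support": passage_support, "question_overlap": question_overlap}
WEIGHTS = {"passage_support": 1.0, "question_overlap": 1.0}

if __name__ == "__main__":
    question = "what is the capital of france"
    for h in answer_question(question, ["Lyon", "Paris"], SCORERS, WEIGHTS):
        print(f"{h.answer}: {h.confidence:.2f}")
```

In a real system the weights would presumably be learned rather than fixed by hand, and each scorer could run independently of the others, which is what makes this kind of architecture amenable to parallel execution.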

DeepQA: The Technology Behind Watson

Massively Parallel Probabilistic Evidence-Based Architecture

One Jeopardy! question can take 2 hours on a single 2.6 GHz core. Optimized and scaled out on a 2,880-core IBM HPC using UIMA-AS, Watson answers in 2-6 seconds.
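As a back-of-the-envelope check on the figures quoted above (2 hours on one core, 2-6 seconds on 2,880 cores), the short Python sketch below computes the ideal linear-scaling time, the implied speedup and the parallel efficiency. It assumes the cluster cores are comparable to the 2.6 GHz reference core, which the slide does not state; the function names are illustrative.

```python
SINGLE_CORE_SECONDS = 2 * 60 * 60  # 2 hours on one core = 7200 s
CORES = 2880

def ideal_parallel_seconds(serial_seconds: float, cores: int) -> float:
    """Run time under perfect linear scaling."""
    return serial_seconds / cores

def speedup(serial_seconds: float, parallel_seconds: float) -> float:
    """How many times faster the parallel run is than the serial run."""
    return serial_seconds / parallel_seconds

def efficiency(serial_seconds: float, parallel_seconds: float, cores: int) -> float:
    """Speedup per core; 1.0 corresponds to perfect linear scaling."""
    return speedup(serial_seconds, parallel_seconds) / cores

if __name__ == "__main__":
    # 7200 s / 2880 cores = 2.5 s under ideal scaling
    print("ideal time (s):", ideal_parallel_seconds(SINGLE_CORE_SECONDS, CORES))
    for observed in (2.0, 6.0):
        print(
            f"observed {observed} s:",
            "speedup", speedup(SINGLE_CORE_SECONDS, observed),
            "efficiency", round(efficiency(SINGLE_CORE_SECONDS, observed, CORES), 3),
        )
```

The ideal time of 2.5 seconds sits inside the quoted 2-6 second range. An answer in 2 seconds implies an efficiency above 1.0, which under the stated assumption would have to come from the optimization work the slide mentions and not from scale-out alone.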

Page 16

IBM Watson Oncology Advisor

IBM Confidential: References to potential future products are subject to the Important Disclaimer provided earlier in the presentation

Oncology Diagnosis and Treatment Demonstration


Page 22

What can Big Data do for your organization?

Create Innovative New Products

• Social media: product/brand sentiment analysis

• Brand strategy

• Market analysis

• RFID tracking & analysis

• Transaction analysis to create insight-based product/service offerings

Act on Deeper Customer Insight

• Social media customer sentiment analysis

• Promotion optimization

• Segmentation

• Customer profitability

• Click-stream analysis

• CDR processing

• Multi-channel interaction analysis

• Loyalty program analytics

• Churn prediction

Optimize your Operational Processes

• Smart Grid/meter management

• Supply Chain Optimization

• Sales reporting

• Inventory & merchandising optimization

• Options trading

• ICU patient monitoring

• Disease surveillance

• Transportation network optimization

• Store performance

• Environmental analysis

• Experimental research

Prevent Fraud and Reduce Risk

• Multimodal surveillance

• Cyber security

• Fraud modeling & detection

• Risk modeling & management

• Regulatory reporting

Proactively Maintain your Assets

• Network analytics

• Asset management and predictive issue resolution

• Website analytics

• IT log analysis

Page 23

Questions?