© 2012 IBM Corporation Pushing the Frontiers of Analytics Brenda Dietrich, IBM Fellow & VP CTO, Business Analytics

Page 1

© 2012 IBM Corporation

Pushing the Frontiers of Analytics

Brenda Dietrich, IBM Fellow & VP CTO, Business Analytics

Page 2

Global Technology Outlook Objectives

GTO identifies significant technology trends early. It looks for high-impact disruptive technologies leading to game-changing products and services over a 3-10 year horizon.

Technology thresholds identified in a GTO demonstrate their influence on clients, enterprises, and industries and have high potential to create new businesses.

Page 3

Global Technology Outlook 2012

Uncertain data and analytics are major themes

Managing Uncertain Data at Scale

Future of Analytics

The Future Watson

Systems of People

Outcome Based Business

Resilient Business and Services

Page 4

Managing Uncertain Data at Scale

Trend: Most of the world’s analyzed data will be uncertain

By 2015, 80% of the world’s data will be uncertain

Uncertain data management requires new techniques

These techniques are necessary for real-world Big Data Analytics

Opportunity: Business leadership using Big Data Analytics

Robust, business-aware uncertain data management

Use analytics over uncertain web, sensor, and human-generated data

Enable good business decisions by understanding analysis confidence

Challenge: Taking Big Data Analytics into an uncertain world

Analysis of text is highly nuanced; sensor-based data is imprecise

Timely business decisions require efficient large-scale analytics

It is more difficult to obtain insight about an individual than a group, especially if the source data is uncertain
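
The last point above can be illustrated with a short calculation. This is a minimal sketch, assuming each reading carries independent noise with the same standard deviation (an assumption made here for illustration, not stated on the slide). Under that assumption the uncertainty of a group average shrinks with the square root of the group size, while a single individual's reading keeps the full noise. The function name `standard_error` is hypothetical.

```python
import math


def standard_error(noise_std: float, group_size: int) -> float:
    """Standard error of the mean of `group_size` independent readings.

    Assumes every reading has independent noise with standard
    deviation `noise_std` (an illustrative assumption).
    """
    if group_size < 1:
        raise ValueError("group_size must be at least 1")
    return noise_std / math.sqrt(group_size)


# One individual keeps the full noise; a group of 100 averages it down.
individual = standard_error(10.0, 1)
group = standard_error(10.0, 100)
print(f"individual: +/-{individual:.1f}, group of 100: +/-{group:.1f}")
```

Real sensor or social data rarely has fully independent noise, so the improvement for groups is usually smaller than this idealized case suggests.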

Page 5

The fourth dimension of Big Data: Veracity – handling data in doubt

Volume: Data at Rest. Terabytes to exabytes of existing data to process

Velocity: Data in Motion. Streaming data, milliseconds to seconds to respond

Variety: Data in Many Forms. Structured, unstructured, text, multimedia

Veracity*: Data in Doubt. Uncertainty due to data inconsistency and incompleteness, ambiguities, latency, deception, model approximations

* Truthfulness, accuracy or precision, correctness

Page 6

Uncertainty arises from many sources

Process Uncertainty: Processes contain “randomness”
Examples: uncertain travel times; semiconductor yield

Data Uncertainty: Data input is uncertain
Examples: intended spelling versus actual spelling in text entry; GPS uncertainty; ambiguity ({Paris Airport}); conflicting data ({John Smith, Dallas} versus {John Smith, Kansas}); rumors; testimony; contaminated?

Model Uncertainty: All modeling is approximate
Examples: forecasting a hurricane (www.noaa.gov); fitting a curve to data
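
The gap between intended and actual spelling named on this slide can be handled with approximate string matching. The sketch below uses Python's standard `difflib` module. The function name `resolve_spelling`, the 0.6 threshold, and the city list are illustrative assumptions, and the similarity score is only a rough confidence, not a calibrated probability.

```python
from difflib import SequenceMatcher


def resolve_spelling(entry, known_values, min_score=0.6):
    """Return (best_match, score) for a possibly misspelled entry, or None.

    score is difflib's similarity ratio in [0, 1]; it serves here as a
    rough confidence. The 0.6 threshold is an arbitrary illustrative choice.
    """
    best = None
    for candidate in known_values:
        score = SequenceMatcher(None, entry.lower(), candidate.lower()).ratio()
        if best is None or score > best[1]:
            best = (candidate, score)
    if best is None or best[1] < min_score:
        return None  # nothing is similar enough; leave the entry unresolved
    return best


cities = ["Dallas", "Kansas", "Paris"]  # hypothetical reference list
print(resolve_spelling("Dalas", cities))   # close to "Dallas"
print(resolve_spelling("Qwxyz", cities))   # no plausible match
```

Returning the score alongside the match lets downstream analytics carry the uncertainty forward instead of silently treating the correction as fact.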

Page 7

By 2015, 80% of all available data will be uncertain

[Chart: global data volume in exabytes (0 to 9000) and aggregate uncertainty % (10 to 100), 2005 to 2015, by source: enterprise data, VoIP, social media (video, audio and text), and sensors (Internet of Things). Multiple sources: IDC, Cisco]

Enterprise Data: Data quality solutions exist for enterprise data like customer, product, and address data, but this is only a fraction of the total enterprise data.

Sensors (Internet of Things): By 2015 the number of networked devices will be double the entire global population. All sensor data has uncertainty.

Social Media (video, audio and text): The total number of social media accounts exceeds the entire global population. This data is highly uncertain in both its expression and content.

Page 8

Examples: Uncertainty management presents many opportunities

Creating profiles from many sources (360 customer view)
Many inconsistent data sources; intent hidden within social media; geospatial data is imprecise
35% more satisfied customers by analyzing agent notes
35% better churn prediction using customer SMS messages
Reduced time to determine lending risk from weeks to minutes

Process and forecast uncertainty (supply chain)
Modeling uncertainties: demand, sales, production, shipment
Shipping uncertainties: goods damaged, mistakes in shipped goods
80% lower price protection costs
30% less channel inventory
50% fewer returns
Reductions obtained using an inventory replenishment model that accounts for uncertain price protection

System analytics predict maintenance (Smarter Planet)
Downtime costs $M in income loss; equipment maintenance needs unpredictable; customer contracts impose penalties
5% more oil platform production
30% less maintenance cost
Improvements obtained using statistical modeling that combines equipment sensor data with performance history to predict corrective maintenance activities

More data from physician notes and tests (healthcare)
Structured medical records are incomplete; “golden” text notes must be interpreted: drug names, relationship types (mtr, sibs, m, paunt); uncertainty in images
Able to identify: 40% more smokers found, 15% more disease history
Mitral stenosis: 50% more diagnoses, 35% misdiagnoses

Industries: Telco, Auto, Healthcare, Energy

Research

Page 9

Condensing data reduces uncertainty by constructing context

Required: tight integration to maximize context discovery

Required: common practices followed by multiple standards for representing uncertain data and uncertainty of all types, provenance, lineage, and other metadata

Required: common APIs to enable sharing across the uncertainty management pipeline

No such common practices, standards or APIs exist today

[Diagram: CONDENSE. Maximum Context For Minimum Uncertainty. Techniques shown: Correlation, Data finds Data, Sense Making, Fact Discovery (Son, Mother, Birthday, Date), Spatial Reasoning, Temporal Reasoning, Corroboration (Evidence Combination), etc. Example shown: Customer at Mall is condensed to Customer in Store #42 (Michael, San Jose, CA; Credit, Loyalty, Influencers; Intent: “Buying a DSLR today!”) and matched with In-Store Pricing And Discounts ($999 or $560)]
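
Corroboration (evidence combination), one of the techniques named on this slide, can be sketched with Bayes' rule in odds form. This is a minimal illustration, assuming the pieces of evidence are independent given the hypothesis. The function name `combine_evidence`, the prior, and the likelihood ratios are hypothetical values chosen for the example, not figures from the presentation.

```python
def combine_evidence(prior, likelihood_ratios):
    """Posterior probability after combining independent pieces of evidence.

    Each likelihood ratio is
    P(evidence | hypothesis) / P(evidence | not hypothesis).
    Odds form of Bayes' rule: posterior odds = prior odds * product of ratios.
    Assumes the pieces of evidence are conditionally independent.
    """
    if not 0.0 < prior < 1.0:
        raise ValueError("prior must be strictly between 0 and 1")
    odds = prior / (1.0 - prior)
    for ratio in likelihood_ratios:
        if ratio <= 0.0:
            raise ValueError("likelihood ratios must be positive")
        odds *= ratio
    return odds / (1.0 + odds)


# Hypothesis: the customer at the mall is in Store #42.
# Hypothetical signals: an imprecise location fix (ratio 4) and a
# loyalty-card swipe at the store (ratio 9).
posterior = combine_evidence(prior=0.2, likelihood_ratios=[4.0, 9.0])
print(f"probability customer is in Store #42: {posterior:.2f}")
```

Each additional corroborating source multiplies the odds, which is one way to read the slide's point that more context means less uncertainty.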

Page 10

Systems of People

A shift in value from process optimization to people-centric processes

Organizations have extracted most of the efficiencies from traditional process automation

IT enablement opportunities are shifting to Line of Business

A new set of data is made possible by exploiting social business

Social business drives new efficiencies and value from people-centric processes

An opportunity to instrument people-processes

Provides the basis for addressing a diverse set of problems

A new IT market is emerging

Adaptive social platforms instrumented with knowledge capture, interconnected with enterprise data and processes, and made intelligent through differentiating analytics will transform business

Page 11

People-centric processes are at the core of a broad range of issues

Differentiate for Growth: Create winning products, fast, by having the best and most productive knowledge workers

Drive Sales Productivity: Create superior sales force, drive sales enablement and seller/client alignment

Grow in Emerging Markets: Re-create organizational footprint in global markets

Transform Service Delivery: Further grow productivity and enable new delivery models

Page 12

Optimizing people-centric processes is not the same as optimizing supply chains

CRM, Claims, Delivery Records, Patents & Publications

– Clients served – Products sold – Sales patterns – Productivity

– Engagements worked – Team info – Work specs – Tasks accomplished – Productivity

– Innovation – Products – Technical leadership

In the last couple of weeks, I’ve talked to ABC bank, XYZ and at a security conference.

Status: Working. Expert: Security

“Status updates alone on Facebook amount to more than ten times more words than on all blogs worldwide” - David Kirkpatrick, The Facebook Effect

Status: At conference

Influencer

Rich information (e.g. expertise, work patterns, response to incentives, digital reputation) is flowing through on-line collaboration and enterprise systems

Capturing this information enables analytics to be applied to people-centric processes

Page 13

Strength of Sales Force Index is an example of what is possible with a rich representation of people

TODAY: Years selling, job change, salary band, PBC

FUTURE: True skills and expertise, disciplines, clients served, products sold, team experiences, connections, incentives and responses, career path …

SSFI mines sales force data to understand which attributes of a seller (e.g. skills, experiences), sales team (e.g. team composition, territories) or sales process (e.g. incentives, coverage model) are driving sales performance (quota attainment, win rates, productivity)

SSFI identifies:
– Reasons for performance disparities (at individual or group level), and the best set of actions to drive performance

“Why is our sales force in Region X not performing at par with other regions or competition?”

“What actions can we take to improve sales performance?”

“What are the incentives that truly drive performance?”

Page 14

Executing on SoP vision depends on three key capabilities

PEOPLE CONTENT: Develop capabilities to create a representation of a person’s skills, experiences, preferences, digital reputation… in a structured and organized way, so it can be used for the purpose of running a business

PEOPLE ANALYTICS: Implement capabilities for people-centric process optimization within an analytics platform for rapid, on-demand deployment: matching, talent cloud crowdsourcing, predictive markets, simulation of workforce trends, performance analytics, behavior modeling…

PEOPLE ENABLEMENT: Incorporate capabilities that adapt content for situations and needs, and enhance communication over many devices, across diverse pools of talent: context-aware cognitive load management, translation, transcription, text-to-speech, voice…

Page 15

Future of Analytics

Explosion of unstructured data

Creates new analytics opportunities

Addresses new enterprise needs

Consistent, extensible, and consumable analytics platform

Reduces cost-to-value for enterprises

Increases analytics solution coverage with limited supply of skills

Optimizing across the stack to deploy analytics at scale

Analytics becomes a dominant IT workload and drives HW design

Opportunity to seamlessly scale from terascale to exascale


Page 16

Analytics is broadly defined as the use of data and computation to make smart decisions

[Diagram: Data (historical, simulated; text, video, images, audio) yields data instances, reports and queries on data aggregates, predictive models, and answers and confidence, with feedback and learning. These inform a decision point with possible outcomes: Option 1, Option 2, Option 3]

Page 17

The value of analytics grows by incorporating new sources of data, composing a variety of analytic techniques, spanning organizational silos, and enabling iterative, user-driven interaction

[Chart: sources and types of data (from structured or standardized to new format or usage of data) versus scope of decision (low to high). Items plotted: sales-based demand forecasting; price-based demand forecasting (own & competitors); segmentation-based market impact estimates; intent-to-buy trends; multi-modal demand forecasting]

Page 18

Analytics toolkits will be expanded to support ingestion and interpretation of unstructured data, and enable adaptation and learning

[Chart: analytics techniques ordered from traditional to new data and new methods]

Report
Standard Reporting: Real time, visualizations, user interaction
Ad hoc Reporting: Query by example, user defined reports
Query/Drill Down: In memory data, fuzzy search, geo spatial
Alerts: Rules/triggers, context sensitive, complex events

Understand and Predict
Forecasting: Larger data sets, nonlinear regression
Simulation: High fidelity, games, data farming
Predictive Modeling: Causality, probabilistic, confidence levels

Decide and Act
Optimization: Decision complexity, solution speed
Optimization under Uncertainty: Quantifying or mitigating risk

Collect and Ingest/Interpret (decide what to count; enable accurate counting)
Entity Resolution: People, roles, locations, things
Relationship, Feature Extraction: Rules, semantic inferencing, matching
Annotation and Tokenization: Automated, crowd sourced

Learn (in the context of the decision process)
Adaptive Analysis: Responding to context
Continual Analysis: Responding to local change/feedback

Extended from: Competing on Analytics, Davenport and Harris, 2007
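
Optimization under uncertainty, listed on this slide with "quantifying or mitigating risk", can be illustrated with a small scenario-based ordering decision. This is a minimal sketch of the classic single-period ordering problem, not the inventory model used in the presentation. The function names, prices, and demand scenarios are hypothetical, and unsold units are assumed to have no salvage value.

```python
def expected_profit(order_qty, scenarios, unit_price, unit_cost):
    """Expected profit of ordering `order_qty` units.

    scenarios is a list of (demand, probability) pairs; unsold units
    are assumed to have no salvage value (illustrative assumption).
    """
    expected_sales = sum(prob * min(order_qty, demand) for demand, prob in scenarios)
    return unit_price * expected_sales - unit_cost * order_qty


def best_order(scenarios, unit_price, unit_cost):
    """Order quantity with the highest expected profit.

    Expected profit is piecewise linear in the order quantity, so an
    optimum lies at zero or at one of the scenario demand levels.
    """
    candidates = sorted({0} | {demand for demand, _ in scenarios})
    return max(candidates, key=lambda q: expected_profit(q, scenarios, unit_price, unit_cost))


# Hypothetical demand scenarios: (units demanded, probability).
demand_scenarios = [(50, 0.3), (100, 0.5), (150, 0.2)]
qty = best_order(demand_scenarios, unit_price=10.0, unit_cost=6.0)
profit = expected_profit(qty, demand_scenarios, unit_price=10.0, unit_cost=6.0)
print(f"order {qty} units, expected profit {profit:.2f}")
```

Planning for the average demand alone would hide the spread of outcomes; evaluating every scenario makes the risk of over- and under-ordering explicit.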

Page 19

Analytic solutions will apply multiple methods to multiple forms of data

Example: Utility Vegetation Management

Effective Right of Way vegetation management is critical to streamlined utility operations

Traditional Right of Way programs are mainly static-scenario driven on a six year cycle
– Static and rigid models lead to predominantly reactive operations, which are expensive
– Focus on narrow corridor widths fails to address severe weather impact

A multimodal analytics approach can overcome these shortcomings
– Structured data (e.g. transmission line maps) and unstructured data (e.g. LIDAR sensor)
– Advanced modeling to perform a dynamic scenario-driven analysis

[Diagram: Solution Framework. Inputs (sensors, utility data, maps, weather) each pass through a preprocessor into a 3-Dimensional Model, a Right-of-Way Dynamic Forecasting Model, a Schedule Generator, Recovery, and Visualization. Sectors shown: electric, telecommunications, rail, road, oil]

Page 20

Analytics solution development requires several interacting design steps

[Diagram: Data Acquisition, Filtering and Extraction, Validation, Core Analytics, Composition and Packaging, grouped under Data Evaluation and Fusion, Algorithm Composition and Invention, Testing and Execution Optimization, and Deployment. Data types shown: streaming data, text data, multi-dimensional, time series, geo spatial, relational, social network, video & image. Analytics shown: data mining & statistics, optimization & simulation, fuzzy matching, network algorithms, semantic analysis, business rules engine, new algorithms]

Page 21

An Analytics solution platform will increase enterprise value by supporting both the CxO solution and the CIO infrastructure

The CIO can reduce cost and add value to the use of analytics by supporting collaboration and data/analysis sharing

Leverage Mandate: Streamline operations and increase organizational effectiveness

Expand Mandate: Refine business processes and enhance collaboration

Transform Mandate: Change the industry value chain through improved relationships

Pioneer Mandate: Radically innovate products, markets, business models

Easier consumption of Analytics solutions
– Have consistent look and feel
– Changes are easier to implement effectively
– Trustworthy solutions are produced

More efficient, less complex development
– Reduces growth of development costs
– Speeds delivery of new functionality
– Expands analytics solution developer population

Reduces client cost of operation
– Seamless integration eases deployment of solutions
– Establishes preferred development path for new solution
– Consistent and coherent infrastructure eases managing solutions

[Chart: revenue versus lines of code, without platform and with platform]

Page 22

Optimizing across the stack will enable the deployment of analytics at scale

[Diagram: Future System assembled from building blocks of cores, SCM, storage, and network, combining General Purpose, Integrated Network, Integrated Processing, and Integrated Storage elements. Workloads shown: Predictive Analytics; Modeling, Simulation; Text Analytics; Hadoop Workloads; Optimization; Sensitivity Analysis]

Systems supporting future analytics will be more data centric, composable and scalable

Balanced, reliable, power efficient systems, with integrated software that scales seamlessly

Integrated analytics, modeling and simulation capabilities to address generation, management and analysis of Big Data for Business Advantage

Systems will support increasingly complex data sets and workflows.

Different elements within these complex workflows will require different capabilities within systems.

Page 23

The Future Watson

Extend Watson technology

Moves beyond “question-in & answer-out” to always “learning” evidence-based decision support

Addresses the enterprise need to convert growing volumes of information into actionable knowledge

Demonstrates business value in critical problem spaces, starting with Healthcare

Lead in new domains

Efficiently adapting and scaling Watson to new domains requires a novel blend of engineering and research

Enable efficient adaptation

Page 24

Watson’s real value proposition: Efficient decision support over unstructured (and structured) content

Structured Data: Precise, explicit. Narrow, expensive.

SQL/XQuery, Existing BI, Inference/Rules: Deeper understanding but brittle. High precision at high cost. Narrow, limited coverage.

Unstructured Data: Broad, rich in context. Rapidly growing, current. Invaluable yet underutilized.

Key Word Search, Relevance Ranking: Shallow understanding. Low precision. Broad coverage.

Open-Domain Question-Answering (Jeopardy! Challenge): Deeper understanding, higher precision and broader, timely coverage at lower costs.

Page 25

Taking Watson beyond Jeopardy!

Understanding: From specific questions to rich, incomplete problem scenarios (e.g. EHR)
Specific Questions (“The type of murmur associated with this condition is harsh, systolic, and increases in intensity with Valsalva”) to Rich Problem Scenarios (Entire Medical Record)

Interacting: Evidence analysis and look-ahead, drive interactive dialog to refine answers and evidence
Question-In/Answer-Out to Interactive Dialog (Input, Responses; Refined Answers, Follow-up Questions)

Explaining: Move from quality answers to quality answers and evidence
Precise Answers & Accurate Confidences to Comparative Evidence Profiles

Learning: Scale domain learning and adaptation rate and efficiency
Batch Training Process to Continuous Training & Learning Process (Teach Watson: Answers, Corrections, Judgements; Responses, Learning Questions)
