Enterprise intelligence apr2012 load - romania - 30 min

© 2012 IBM Corporation 1 Enterprise Intelligence Jeff Jonas, IBM Distinguished Engineer Chief Scientist, IBM Entity Analytics Email: [email protected] Blog: www.jeffjonas.typepad.com Twitter: http://www.twitter.com/jeffjonas


Transcript of Enterprise intelligence apr2012 load - romania - 30 min

Page 1: Enterprise intelligence apr2012   load - romania - 30 min

© 2012 IBM Corporation

Enterprise Intelligence

Jeff Jonas, IBM Distinguished Engineer
Chief Scientist, IBM Entity Analytics

Email: [email protected]
Blog: www.jeffjonas.typepad.com

Twitter: http://www.twitter.com/jeffjonas

Page 2: Enterprise intelligence apr2012   load - romania - 30 min

My Background

Early 80s: Founded Systems Research & Development (SRD), a custom software consultancy

Personally designed and deployed roughly 100 systems, a number of which contained multi-billions of transactions describing hundreds of millions of entities

1989 – 2003: Built numerous systems for Las Vegas casinos including a technology known as Non-Obvious Relationship Awareness (NORA)

2001: Funded by In-Q-Tel, the venture capital arm of the CIA

2005: IBM acquires SRD

Today: Primarily focused on ‘sensemaking on streams’ with special attention towards privacy and civil liberties protections

Page 3: Enterprise intelligence apr2012   load - romania - 30 min

Trend: Organizations Are Getting Dumber

[Chart: axes “Time” and “Computing Power Growth”; labels “Available Observation Space”, “Sensemaking Algorithms”, “Context” and “Enterprise Amnesia”]

“Every two days now we create as much information as we did from the dawn of civilization up until 2003.”

~ Eric Schmidt, CEO Google

Page 4: Enterprise intelligence apr2012   load - romania - 30 min

Amnesia, definition

A defect in memory, especially resulting from brain damage.

Page 5: Enterprise intelligence apr2012   load - romania - 30 min

Enterprise Amnesia, definition

A defect in memory, resulting in wasted resources, lower revenues, unnecessary fraud losses, etc.

Page 6: Enterprise intelligence apr2012   load - romania - 30 min

Trend: Organizations Are Getting Dumber

[Chart: axes “Time” and “Computing Power Growth”; labels “Available Observation Space”, “Sensemaking Algorithms” and “Context”; callout “WHY?”]

Page 7: Enterprise intelligence apr2012   load - romania - 30 min

Algorithms at Dead End.

You Can’t Squeeze Knowledge Out of a Pixel.

Page 8: Enterprise intelligence apr2012   load - romania - 30 min

[email protected]

No Context

Page 9: Enterprise intelligence apr2012   load - romania - 30 min

Context, definition

Better understanding something by taking into account the things around it.

Page 10: Enterprise intelligence apr2012   load - romania - 30 min

Information in Context … and Accumulating

Top 200 Customer

Job Applicant

Identity Thief

Criminal Investigation

[email protected]

Page 11: Enterprise intelligence apr2012   load - romania - 30 min

The Puzzle Metaphor

Imagine an ever-growing pile of puzzle pieces of varying sizes, shapes and colors

What it represents is unknown – there is no picture on hand

Is it one puzzle, 15 puzzles, or 1,500 different puzzles?

Some pieces are duplicates, missing, incomplete, low quality, or have been misinterpreted

Some pieces may even be professionally fabricated lies

Until you take the pieces to the table and attempt assembly, you don’t know what you are dealing with

Page 12: Enterprise intelligence apr2012   load - romania - 30 min

Puzzling

Puzzle images:

Cottage Garden: © 2010 Royce B. McClure, Artist, All Rights Reserved; © 2010 Ravensburger USA, Inc.

Down Home Music: © Kay Lamb Shannon, Artist; Licensed by Cypress Fine Art Licensing; © 2011 Ravensburger USA Inc.

Neuschwanstein Beauty: © 2009 Photo Copyright Robert Cushman Hayes; © 2009 Ravensburger USA, Inc.

Vegas: Artwork provided by Hadley House Licensing, Minneapolis; © 2011 Giesla Hoelscher, All Rights Reserved; © 2011 Ravensburger USA, Inc.

Pieces placed on the table:

270 pieces (90%)

200 pieces (66%)

150 pieces (50%)

6 pieces (2%)

30 pieces (10%, duplicates)

Page 13: Enterprise intelligence apr2012   load - romania - 30 min

Page 14: Enterprise intelligence apr2012   load - romania - 30 min

Page 15: Enterprise intelligence apr2012   load - romania - 30 min

First Discovery

Page 16: Enterprise intelligence apr2012   load - romania - 30 min

More Data Finds Data

Page 17: Enterprise intelligence apr2012   load - romania - 30 min

Duplicates in Front Of Your Eyes

Page 18: Enterprise intelligence apr2012   load - romania - 30 min

First Duplicate Found Here

Page 19: Enterprise intelligence apr2012   load - romania - 30 min

Page 20: Enterprise intelligence apr2012   load - romania - 30 min

Incremental Context – Incremental Discovery

6:40pm START

22min “Hey, this one is a duplicate!”

35min “I think some pieces are missing.”

37min “Looks like a bunch of hillbillies on a porch.”

44min “Hillbillies, playing guitars, sitting on a porch, near a barber sign … and a banjo!”

Page 21: Enterprise intelligence apr2012   load - romania - 30 min

150 pieces

50%

Page 22: Enterprise intelligence apr2012   load - romania - 30 min

Incremental Context – Incremental Discovery

47min “We should take the sky and grass off the table.”

2hr “Let’s switch sides, and see if we can make sense of this from different perspectives.”

2hr10m “Wait, there are three … no, four puzzles.”

2hr17m “We need a bigger table.”

2hr18m “I think you threw in a few random pieces.”

Page 23: Enterprise intelligence apr2012   load - romania - 30 min

Page 24: Enterprise intelligence apr2012   load - romania - 30 min

How Context Accumulates

With each new observation … one of three assertions is made: 1) un-associated; 2) placed near like neighbors; or 3) connected

Must favor the false negative

New observations sometimes reverse earlier assertions

Some observations produce novel discovery

As the working space expands, computational effort increases

Given sufficient observations, there can come a tipping point

Thereafter, confidence improves while computational effort decreases!
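
The three assertions above can be sketched in a few lines of Python. This is only an illustration, not the method of any IBM product: the `Context` class, the `connect_at` threshold and the use of shared-feature counts as the matching test are all assumptions made for the example. A strict threshold is one simple way to favor the false negative.

```python
from dataclasses import dataclass, field


@dataclass
class Context:
    """Hypothetical working space: each entity is a set of (name, value) features."""
    entities: list = field(default_factory=list)
    connect_at: int = 2  # assumed threshold; a strict value favors the false negative

    def observe(self, features: set) -> str:
        # Find the existing entity that shares the most features with the observation.
        best, overlap = None, 0
        for entity in self.entities:
            shared = len(entity & features)
            if shared > overlap:
                best, overlap = entity, shared
        if best is not None and overlap >= self.connect_at:
            best |= features  # connected: merge into the existing entity
            return "connected"
        # Not enough evidence to connect, so keep the observation as its own entity.
        self.entities.append(set(features))
        return "near neighbor" if overlap > 0 else "un-associated"


ctx = Context()
print(ctx.observe({("color", "blue"), ("edge", "top")}))
print(ctx.observe({("color", "blue"), ("edge", "left")}))
print(ctx.observe({("color", "blue"), ("edge", "top"), ("shape", "knob")}))
```

The sketch never reverses an earlier assertion; that behavior needs the stored observations to be re-evaluated when new evidence arrives.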

Page 25: Enterprise intelligence apr2012   load - romania - 30 min

Big Data [in context]. New Physics.

More data: better the predictions
– Lower false positives
– Lower false negatives

More data: bad data good
– Suddenly glad your data is not perfect

More data: less compute

Page 26: Enterprise intelligence apr2012   load - romania - 30 min

Big Data

Pile of ____ In Context

Page 27: Enterprise intelligence apr2012   load - romania - 30 min

One Form of Context: “Expert Counting”

Is it 5 people each with 1 account … or is it 1 person with 5 accounts?

Is it 20 cases of H1N1 in 20 cities … or one case reported 20 times?

If one cannot count … one cannot estimate vector or velocity (direction and speed).

Without vector and velocity … prediction is nearly impossible.
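
To make the counting question concrete, here is a small sketch that counts entities rather than records. It assumes, purely for illustration, that two records belong to the same entity whenever they share an `ssn` or `phone` value; the function name `count_entities` and the field names are hypothetical, and real matching rules are stricter than a single shared identifier.

```python
def count_entities(records: list) -> int:
    """Count distinct entities, linking records that share an identifier."""
    parent = list(range(len(records)))  # union-find forest, one node per record

    def find(i: int) -> int:
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    seen = {}  # (field, value) -> index of the first record carrying it
    for i, rec in enumerate(records):
        for key in ("ssn", "phone"):  # assumed identifying fields
            value = rec.get(key)
            if value is None:
                continue
            if (key, value) in seen:
                parent[find(i)] = find(seen[(key, value)])
            else:
                seen[(key, value)] = i
    return len({find(i) for i in range(len(records))})


# Five accounts that chain together through shared identifiers: one person.
accounts = [
    {"ssn": "111", "phone": "555-0001"},
    {"ssn": "111", "phone": "555-0002"},
    {"phone": "555-0002"},
    {"phone": "555-0001"},
    {"ssn": "111"},
]
print(count_entities(accounts))
```

With five accounts that share nothing, the same function returns 5, which is the difference the slide is pointing at.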

Page 28: Enterprise intelligence apr2012   load - romania - 30 min

Entity Resolution Demonstration

Page 29: Enterprise intelligence apr2012   load - romania - 30 min

VOTER
George F Balston
YOB: 1951 D/L: 4801
13070 SW Karen Blvd Apt 7
Beaverton, OR 97005
Last voted: 2008

DECEASED PERSON
George Balston
YOB: 1951 SSN: 5598
DOD: 1995

Entity Resolution Demonstration

Under best practices in voter matching, a match on name and year of birth alone is insufficient proof that two records describe the same person. Many different people in the U.S. share a name and year of birth.

Human review is required.

Unfortunately, there are thousands and thousands of cases just like this, and state election offices don’t have the staff (or budget) to manually review such volumes.

Page 30: Enterprise intelligence apr2012   load - romania - 30 min

VOTER
George F Balston
YOB: 1951 D/L: 4801
13070 SW Karen Blvd Apt 7
Beaverton, OR 97005
Last voted: 2008

DECEASED PERSON
George Balston
YOB: 1951 SSN: 5598
DOD: 1995

Now Consider This Tertiary DMV Record

DMV
George F Balston
YOB: 1951 SSN: 5598 D/L: 4801
3043 SW Clementine Blvd Apt 210
Beaverton, OR 97005

The DMV record contains enough features to match either the voter record (name, year of birth and driver’s license) or the deceased person record (name, year of birth and SSN). For the sake of argument, let’s say it matches the voter best.

Page 31: Enterprise intelligence apr2012   load - romania - 30 min

VOTER
George F Balston
YOB: 1951 D/L: 4801
13070 SW Karen Blvd Apt 7
Beaverton, OR 97005
Last voted: 2008

DMV
George F Balston
YOB: 1951 SSN: 5598 D/L: 4801
3043 SW Clementine Blvd Apt 210
Beaverton, OR 97005

DECEASED PERSON
George Balston
YOB: 1951 SSN: 5598
DOD: 1995

Features Accumulate

The voter/DMV record now shares a name, year of birth and SSN with the deceased person record. In voter matching best practices, this evidence would be sufficient to make a determination that this voter is in fact deceased. This case no longer needs human review.
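
The George Balston walkthrough can be replayed as a short Python sketch. The matching rule used here (same first and last name, same year of birth, plus at least one shared SSN or driver's license number) is an assumption chosen to mirror the slides, and the helper names `new_entity`, `matches` and `absorb` are hypothetical; this is not the Identity Insight algorithm.

```python
def name_key(name: str) -> tuple:
    # Compare on first and last name only, so "George F Balston" ~ "George Balston".
    parts = name.lower().split()
    return (parts[0], parts[-1])


def new_entity(record: dict) -> dict:
    entity = {"names": set(), "yob": record["yob"], "ssn": set(), "dl": set()}
    absorb(entity, record)
    return entity


def absorb(entity: dict, record: dict) -> None:
    # Features accumulate: the entity keeps every identifier it has ever seen.
    entity["names"].add(name_key(record["name"]))
    for key in ("ssn", "dl"):
        if record.get(key) is not None:
            entity[key].add(record[key])


def matches(entity: dict, record: dict) -> bool:
    basics = name_key(record["name"]) in entity["names"] and record["yob"] == entity["yob"]
    # Assumed rule: name and year of birth alone are not enough.
    shared_id = any(
        record.get(key) is not None and record[key] in entity[key]
        for key in ("ssn", "dl")
    )
    return basics and shared_id


voter = {"name": "George F Balston", "yob": 1951, "dl": "4801"}
deceased = {"name": "George Balston", "yob": 1951, "ssn": "5598"}
dmv = {"name": "George F Balston", "yob": 1951, "ssn": "5598", "dl": "4801"}

entity = new_entity(voter)
print(matches(entity, deceased))  # name and YOB only: not enough evidence
if matches(entity, dmv):
    absorb(entity, dmv)           # the DMV record adds the SSN to the voter entity
print(matches(entity, deceased))  # now the SSN is shared as well
```

The deceased record itself never changes; only the accumulated voter entity does, which is why the same comparison gives a different answer the second time.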

Page 32: Enterprise intelligence apr2012   load - romania - 30 min

VOTER
George F Balston
YOB: 1951 D/L: 4801
13070 SW Karen Blvd Apt 7
Beaverton, OR 97005
Last voted: 2008

DMV
George F Balston
YOB: 1951 SSN: 5598 D/L: 4801
3043 SW Clementine Blvd Apt 210
Beaverton, OR 97005

DECEASED PERSON
George Balston
YOB: 1951 SSN: 5598
DOD: 1995

Useful Insight Revealed!

As features accumulate, it becomes possible to resolve previously unresolvable identity records.

As events and transactions accumulate, detection of relevance improves.

Here we can see that George, who died in 1995, voted in 2008.

Page 33: Enterprise intelligence apr2012   load - romania - 30 min

IBM InfoSphere Identity Insight V8

Page 34: Enterprise intelligence apr2012   load - romania - 30 min

MoneyGram International

Page 35: Enterprise intelligence apr2012   load - romania - 30 min

Enterprise Intelligence: One Plausible Journey

Page 36: Enterprise intelligence apr2012   load - romania - 30 min

Observation Space

Sense and Respond

What you know

New Observations

Page 37: Enterprise intelligence apr2012   load - romania - 30 min

Sense and Respond

[Diagram labels: Observation Space; Data Finds Data; Relevance Finds the Sensor (<200ms); Decide; ?]

Page 38: Enterprise intelligence apr2012   load - romania - 30 min

Sense and Respond

Explore and Reflect

[Diagram labels: Observation Space; Data Finds Data; Relevance Finds the Sensor (<200ms); Decide; ?; Directed Attention; Relevance Finds You; Deep Reflection; Curated Data; Pattern Discovery]

Page 39: Enterprise intelligence apr2012   load - romania - 30 min

Sense and Respond

Explore and Reflect

[Diagram labels: Observation Space; Data Finds Data; Relevance Finds the Sensor (<200ms); Decide; ?; Directed Attention; Deep Reflection; Curated Data; Pattern Discovery; NEW INTERESTS]

Page 40: Enterprise intelligence apr2012   load - romania - 30 min

Sense and Respond

Explore and Reflect

[Diagram labels: Observation Space; Data Finds Data; Relevance Finds the Sensor (<200ms); Decide; ?; Directed Attention; Deep Reflection; Curated Data; Pattern Discovery; NEW INTERESTS]

[Product labels overlaid on the diagram: InfoSphere Streams, ILOG, Netezza, SPSS, Watson, Cognos, Sensemaking]

Page 41: Enterprise intelligence apr2012   load - romania - 30 min

Sense and Respond

Explore and Reflect

Report and Manage

[Diagram labels: Observation Space; Data Finds Data; Relevance Finds the Sensor (<200ms); Decide; ?; Directed Attention; Deep Reflection; Curated Data; Pattern Discovery; NEW INTERESTS]

Page 42: Enterprise intelligence apr2012   load - romania - 30 min

Report and Manage

Info Management Systems: Content Management, Case Management, Data Warehousing

[Diagram labels: Data Finds Data; Relevance Finds the Sensor (<200ms); Decide; ?; Directed Attention; Pattern Discovery; NEW INTERESTS]

Page 43: Enterprise intelligence apr2012   load - romania - 30 min

Big Data Trends

Page 44: Enterprise intelligence apr2012   load - romania - 30 min

The Greater the Context, the Greater the Value

[Chart: vertical axis “Value of Data”; horizontal axis “Records Managed”, from (Big) to (Ludicrous Big); curves “Pile of Data” and “Data in Context”]

Page 45: Enterprise intelligence apr2012   load - romania - 30 min

Time Is Of The Essence

The better the predictions … the faster they will be wanted.

“Why did we have to wait until the end of the day for the smart answer?”

[Chart: vertical axis “Willingness to Wait” with marks Day, Hour and 200ms; horizontal axis “Relevance”, from (Iffy) to (Totally); labels “Batch” and “Real-Time”]

Page 46: Enterprise intelligence apr2012   load - romania - 30 min

Closing Thoughts

Page 47: Enterprise intelligence apr2012   load - romania - 30 min

The most competitive organizations are going to make sense of what they are observing fast enough to do something about it while they are observing it.

Page 48: Enterprise intelligence apr2012   load - romania - 30 min

Wish This On The Competitor

[Chart: axes “Time” and “Computing Power Growth”; labels “Available Observation Space”, “Sensemaking Algorithms”, “Context” and “Enterprise Amnesia”]

Page 49: Enterprise intelligence apr2012   load - romania - 30 min

The Way Forward: Enterprise Intelligence

[Chart: axes “Time” and “Computing Power Growth”; labels “Available Observation Space”, “Sensemaking Algorithms” and “Context”]

Page 50: Enterprise intelligence apr2012   load - romania - 30 min

Related Blog Posts

Algorithms At Dead-End: Cannot Squeeze Knowledge Out Of A Pixel

Puzzling: How Observations Are Accumulated Into Context

On A Smarter Planet … Some Organizations Will Be Smarter-er Than Others

G2 | Sensemaking – One Year Birthday Today. Cognitive Basics Emerging.

Page 51: Enterprise intelligence apr2012   load - romania - 30 min

Email: [email protected]

Blog: www.jeffjonas.typepad.com

Twitter: http://www.twitter.com/jeffjonas

Questions?

Page 52: Enterprise intelligence apr2012   load - romania - 30 min

Enterprise Intelligence

Jeff Jonas, IBM Distinguished Engineer
Chief Scientist, IBM Entity Analytics

Email: [email protected]
Blog: www.jeffjonas.typepad.com

Twitter: http://www.twitter.com/jeffjonas

Page 53: Enterprise intelligence apr2012   load - romania - 30 min

Sensemaking on Streams: My G2 Secret Little IBM Project

3+ years in the making

Page 54: Enterprise intelligence apr2012   load - romania - 30 min

G2 Mission Statement

1) Evaluate each new observation against previous observations.

2) Determine if what is being observed is relevant.

3) Deliver this actionable insight to its consumer … fast enough to do something about it while it is still happening.

4) Do this with sufficient accuracy and scale to really matter.
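
The first three steps of the mission statement can be pictured as a small loop over a persistent store. The sketch below is an illustration only, not the G2 design: the `PersistentContext` class, the shape of an observation (a dict with `id`, `kind` and `features`) and the relevance rule passed in by the caller are all assumptions, and nothing here addresses the accuracy and scale called for in step 4.

```python
from collections import defaultdict


class PersistentContext:
    """Hypothetical store in which each new observation looks up earlier ones."""

    def __init__(self, is_relevant, notify):
        self.index = defaultdict(list)  # feature -> earlier observations carrying it
        self.is_relevant = is_relevant  # caller-supplied relevance rule
        self.notify = notify            # caller-supplied delivery to the consumer

    def observe(self, obs: dict) -> None:
        # 1) Evaluate the new observation against previous observations.
        related = []
        for feature in obs["features"]:
            for prior in self.index[feature]:
                if prior not in related:
                    related.append(prior)
        # 2) Decide whether the pairing is relevant, and 3) deliver it right away.
        for prior in related:
            if self.is_relevant(obs, prior):
                self.notify(obs, prior)
        # Remember the observation so that later data can find it.
        for feature in obs["features"]:
            self.index[feature].append(obs)


alerts = []
ctx = PersistentContext(
    is_relevant=lambda new, old: new["kind"] != old["kind"],
    notify=lambda new, old: alerts.append((old["id"], new["id"])),
)
ctx.observe({"id": "a", "kind": "watchlist", "features": [("phone", "555-0100")]})
ctx.observe({"id": "b", "kind": "customer", "features": [("phone", "555-0100")]})
print(alerts)
```

Because the lookup happens at the moment each observation arrives, the consumer hears about the pairing as the second record is ingested rather than in a later batch run.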

Page 55: Enterprise intelligence apr2012   load - romania - 30 min

From Pixels to Pictures to Action

Observations

Data Finds Data

Persistent Context

Relevance Finds You

Consumer (an analyst, a system, the sensor itself, etc.)

This is G2

Page 56: Enterprise intelligence apr2012   load - romania - 30 min

Uniquely G2

More scalable, faster and extensible
– Designed for grid compute and sub-200ms sense and respond

Smarter
– Tolerance for disagreement (no such thing as a single version of truth)
– Support for more abstract entities (e.g., locations, products, asteroids)
– Support for more exotic features (e.g., biometrics, social circles)

Crazy stuff
– Detects on its own when it is confused and makes “note to self”
– Geospatial reasoning including a sense of here and now

Privacy by Design (PbD)
– More privacy and civil liberties enhancing features baked-in than any other commercial technology

Page 57: Enterprise intelligence apr2012   load - romania - 30 min

PbD: Self-Correcting False Positives

John T Smith Jr, 123 Main Street, 703 111-2000, DOB: 03/12/1984

John T Smith, 123 Main Street, 703 111-2000, DL: 009900991

A plausible claim these two people are the same

Until this record comes into view:

John T Smith Sr, 123 Main Street, 703 111-2000, DL: 009900991

Which reveals this is a FALSE POSITIVE

Page 58: Enterprise intelligence apr2012   load - romania - 30 min

PbD: Self-Correcting False Positives

John T Smith Jr, 123 Main Street, 703 111-2000, DOB: 03/12/1984

John T Smith Sr, 123 Main Street, 703 111-2000, DL: 009900991

John T Smith, 123 Main Street, 703 111-2000, DL: 009900991

New Best Practice: FIXED IN REAL-TIME (not end of month)
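
A toy version of this self-correction, using the three John T Smith records from the slides, might look like the following. Everything about the method is an assumption made for illustration: the record ids 1 to 3 simply follow arrival order, `score` counts shared fields, `conflict` treats differing suffix, DOB or DL values as disagreement, and `resolve` re-clusters all records from scratch on each arrival, which a real-time system would presumably do incrementally.

```python
from itertools import combinations

FIELDS = ("name", "address", "phone", "dob", "dl")


def conflict(a: dict, b: dict) -> bool:
    # Records disagree when both carry a value for a field and the values differ.
    return any(a.get(k) and b.get(k) and a[k] != b[k] for k in ("suffix", "dob", "dl"))


def score(a: dict, b: dict) -> int:
    return sum(1 for k in FIELDS if a.get(k) and a.get(k) == b.get(k))


def resolve(records: list, threshold: int = 3) -> list:
    cluster = list(range(len(records)))  # cluster label per record
    pairs = sorted(
        ((score(a, b), i, j)
         for (i, a), (j, b) in combinations(enumerate(records), 2)
         if score(a, b) >= threshold),
        reverse=True,  # strongest evidence is applied first
    )
    for _, i, j in pairs:
        if cluster[i] == cluster[j]:
            continue
        left = [k for k, c in enumerate(cluster) if c == cluster[i]]
        right = [k for k, c in enumerate(cluster) if c == cluster[j]]
        if any(conflict(records[a], records[b]) for a in left for b in right):
            continue  # favor the false negative
        for k in right:
            cluster[k] = cluster[i]
    groups = {}
    for k, c in enumerate(cluster):
        groups.setdefault(c, []).append(records[k]["id"])
    return sorted(groups.values())


base = {"name": "John T Smith", "address": "123 Main Street", "phone": "703 111-2000"}
r1 = {**base, "id": 1, "suffix": "Jr", "dob": "03/12/1984"}
r2 = {**base, "id": 2, "dl": "009900991"}
r3 = {**base, "id": 3, "suffix": "Sr", "dl": "009900991"}

print(resolve([r1, r2]))      # the plausible claim: records 1 and 2 together
print(resolve([r1, r2, r3]))  # the third record arrives and the claim is reversed
```

In this sketch the correction happens as soon as the third record is loaded, because the earlier link is re-derived from the evidence rather than stored as a permanent fact.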

Page 59: Enterprise intelligence apr2012   load - romania - 30 min

Customer Facing Systems

Data Mining

Back-of-House Accounting Systems

Fraud

This System That System

Sensemaking