THE CIA’S “GRAND CHALLENGES” WITH BIG DATA from Structure:Data 2013

Presentation from Ira "Gus" Hunt, CIA #dataconf More at http://event.gigaom.com/structuredata/

Transcript of THE CIA’S “GRAND CHALLENGES” WITH BIG DATA from Structure:Data 2013

Page 1: THE CIA’S “GRAND CHALLENGES” WITH BIG DATA from Structure:Data 2013

Page 2: THE CIA’S “GRAND CHALLENGES” WITH BIG DATA from Structure:Data 2013

Ira A. (Gus) Hunt

Chief Technology Officer

Beyond Big Data

Riding the Technology Wave

Page 3: THE CIA’S “GRAND CHALLENGES” WITH BIG DATA from Structure:Data 2013

Our Mission

We are the nation's first line of defense. We accomplish what others cannot accomplish and go where others cannot go. We carry out our mission by:

•  Collecting information that reveals the plans, intentions and capabilities of our adversaries and provides the basis for decision and action.

•  Producing timely analysis that provides insight, warning and opportunity to the President and decisionmakers charged with protecting and advancing America's interests.

•  Conducting covert action at the direction of the President to preempt threats or achieve US policy objectives.

Page 4: THE CIA’S “GRAND CHALLENGES” WITH BIG DATA from Structure:Data 2013

4 Big Bets

Revolutionize Big Data Exploitation

–  Acquire, federate, secure and exploit. Grow the haystack, magnify the needles.

Accelerate Operational Excellence

–  Innovate IT operations and run IT like a business.

Serve CIA by supporting the IC

–  Assume a leadership role in IC activities that matter to CIA; Build to share.

Drive Performance through Talent Management

–  Focus on continuous learning and diversity of thought, experience, background.

Page 5: THE CIA’S “GRAND CHALLENGES” WITH BIG DATA from Structure:Data 2013

6 Key Technology Enablers

Advanced Mission Analytics: Analytics as a Service

–  World-class abilities to discover patterns, correlate information, understand plans and intentions, and find and identify operational targets in a sea of data. Big Data analytics as a service.

Enterprise Widgets and Services

–  A customizable, integrated and adaptive webtop that lets analysts, ops officers, and targeters “have it their way”. Personalization in context.

Security as a Service

–  One environment, all data, protected and secure: ubiquitous encryption, enterprise authentication, audit, DRM, secure ID propagation, and Gold Version C&A.

Data Harbor: Data as a Service

–  An ultra-high performance data environment that enables CIA missions to acquire, federate, position and securely exploit huge volumes of data. Data in context.

Cloud Computing: Infrastructure as a Service

–  Capacity ahead of demand. Large scale, elastic, commodity hosting, storage, and compute.

Secure Mobility

–  Immediate, secure and appropriate access to people, data and tools from anywhere at any time.

Page 6: THE CIA’S “GRAND CHALLENGES” WITH BIG DATA from Structure:Data 2013

It’s a Big Data World

Page 7: THE CIA’S “GRAND CHALLENGES” WITH BIG DATA from Structure:Data 2013

Google

> 100 PB

> 1T indexed URLs

> 3 million servers

> 7.2B page-views/day

Page 8: THE CIA’S “GRAND CHALLENGES” WITH BIG DATA from Structure:Data 2013

Facebook

> 1 billion users

> 300 PB; +> 500 TB/day

> 35% of world’s photographs

Page 9: THE CIA’S “GRAND CHALLENGES” WITH BIG DATA from Structure:Data 2013

YouTube

> 1000 PB

+> 72 hours/minute

> 37 million hours/year

> 4 billion views/day

Page 10: THE CIA’S “GRAND CHALLENGES” WITH BIG DATA from Structure:Data 2013

World Population > 7,057,065,162

Page 11: THE CIA’S “GRAND CHALLENGES” WITH BIG DATA from Structure:Data 2013

Twitter

> 124B tweets/year

> 390M/day

~4500/sec

Page 12: THE CIA’S “GRAND CHALLENGES” WITH BIG DATA from Structure:Data 2013

Global Text Messages

> 6.1T per year

> 193,000 per second

> 876 per person per year

Page 13: THE CIA’S “GRAND CHALLENGES” WITH BIG DATA from Structure:Data 2013

US Cell Calls

> 2.2T minutes/year

> 19 minutes / person / day

(uncompressed < 1 YouTube/year)
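The per-second, per-day and per-year figures on the preceding slides are unit conversions of one another. A minimal Python sketch of that arithmetic follows, using only numbers quoted on the slides. The function names are illustrative and not from the talk, and a 365-day year is assumed.

```python
SECONDS_PER_DAY = 24 * 60 * 60
DAYS_PER_YEAR = 365  # assumption: leap years are ignored


def per_second_from_daily(per_day: float) -> float:
    """Convert a daily count into an average per-second rate."""
    return per_day / SECONDS_PER_DAY


def per_second_from_yearly(per_year: float) -> float:
    """Convert a yearly count into an average per-second rate."""
    return per_year / (DAYS_PER_YEAR * SECONDS_PER_DAY)


def yearly_from_per_minute(per_minute: float) -> float:
    """Convert a per-minute rate into a yearly total."""
    return per_minute * 60 * 24 * DAYS_PER_YEAR


# Inputs are the figures quoted on the slides.
tweets_per_second = per_second_from_daily(390e6)     # slide: ~4500/sec
tweets_per_year = 390e6 * DAYS_PER_YEAR              # slide: > 124B tweets/year
texts_per_second = per_second_from_yearly(6.1e12)    # slide: > 193,000 per second
youtube_hours_per_year = yearly_from_per_minute(72)  # slide: > 37 million hours/year

print(f"tweets/sec:         {tweets_per_second:,.0f}")
print(f"tweets/year:        {tweets_per_year:,.0f}")
print(f"text messages/sec:  {texts_per_second:,.0f}")
print(f"YouTube hours/year: {youtube_hours_per_year:,}")
```

By the same conversion, 390M tweets/day comes to roughly 142B per year, which does not match the 124B/year figure on the Twitter slide. The transcript alone does not show which of the two numbers was intended.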

Page 14: THE CIA’S “GRAND CHALLENGES” WITH BIG DATA from Structure:Data 2013

3 Driving Forces

Page 15: THE CIA’S “GRAND CHALLENGES” WITH BIG DATA from Structure:Data 2013

Social

Mobile

Cloud

Page 16: THE CIA’S “GRAND CHALLENGES” WITH BIG DATA from Structure:Data 2013

Social + Mobile + Cloud = Big Data

Page 17: THE CIA’S “GRAND CHALLENGES” WITH BIG DATA from Structure:Data 2013

Social + Mobile + Cloud

Increases the velocity of innovation

Page 18: THE CIA’S “GRAND CHALLENGES” WITH BIG DATA from Structure:Data 2013

Social + Mobile + Cloud

Accelerates social change

Page 19: THE CIA’S “GRAND CHALLENGES” WITH BIG DATA from Structure:Data 2013

Page 20: THE CIA’S “GRAND CHALLENGES” WITH BIG DATA from Structure:Data 2013

Social + Mobile + Cloud

Altered the Flow of Information

Page 21: THE CIA’S “GRAND CHALLENGES” WITH BIG DATA from Structure:Data 2013

3 Emerging Forces

Page 22: THE CIA’S “GRAND CHALLENGES” WITH BIG DATA from Structure:Data 2013

Nano

Bio

Sensors

Page 23: THE CIA’S “GRAND CHALLENGES” WITH BIG DATA from Structure:Data 2013

Mobile Sensor Platform

Communicator, Tricorder, Transporter

•  Microphone

•  Image

•  3-axis accelerometer

•  Touch

•  Light

•  Proximity

•  Geolocation

Page 24: THE CIA’S “GRAND CHALLENGES” WITH BIG DATA from Structure:Data 2013

Mobile Health Platform

•  Pacemaker

•  Blood sugar tester

•  Insulin controller

•  Health monitor

•  Exercise coach

•  Remote tune-ups

•  Early warning system

Page 25: THE CIA’S “GRAND CHALLENGES” WITH BIG DATA from Structure:Data 2013

Mobile Sensor Platform

Actitracker: Android App

Identity by 3-axis accelerometer

•  Gender (71%)

•  Height: tall or short (80%)

•  Weight: heavy or light (80%)

•  You by your gait (100%)

Page 26: THE CIA’S “GRAND CHALLENGES” WITH BIG DATA from Structure:Data 2013

Social + Mobile + Cloud + Nano + Bio + Sensors =

The inanimate becomes sentient

Page 27: THE CIA’S “GRAND CHALLENGES” WITH BIG DATA from Structure:Data 2013

Social + Mobile + Cloud + Nano + Bio + Sensors =

Smarter Planet

Cars drive themselves

Machines know your needs

Page 28: THE CIA’S “GRAND CHALLENGES” WITH BIG DATA from Structure:Data 2013

Social + Mobile + Cloud + Nano + Bio + Sensors =

•  Drive radical efficiencies

•  Enhance social engagement

•  Improve information sharing

•  Enable global reach

•  Green (automatic routing)

•  Improve our health

•  Stop/prevent crime

•  …

Page 29: THE CIA’S “GRAND CHALLENGES” WITH BIG DATA from Structure:Data 2013

Sensors are Really Big

1  Sensors are unbounded

2  Sensors are indiscriminate

3  Sensors are promiscuous

Page 30: THE CIA’S “GRAND CHALLENGES” WITH BIG DATA from Structure:Data 2013

The Internet of Things is Bigger

1  Everything is Connected

2  Everything is a Sensor

3  Everything Communicates

Page 31: THE CIA’S “GRAND CHALLENGES” WITH BIG DATA from Structure:Data 2013

That’s the Really Big Data Challenge of the future

Page 32: THE CIA’S “GRAND CHALLENGES” WITH BIG DATA from Structure:Data 2013

Why We Care

Page 33: THE CIA’S “GRAND CHALLENGES” WITH BIG DATA from Structure:Data 2013

Why We Care

Page 34: THE CIA’S “GRAND CHALLENGES” WITH BIG DATA from Structure:Data 2013

Why We Care

Page 35: THE CIA’S “GRAND CHALLENGES” WITH BIG DATA from Structure:Data 2013

Why We Care

Page 36: THE CIA’S “GRAND CHALLENGES” WITH BIG DATA from Structure:Data 2013

Impact of Big Data

•  Know what we know

•  Discover the gaps in our knowledge

•  More effective use of expensive or long lead collection assets

•  Focus targeting to fill the gaps

•  Better global coverage to limit surprise

•  Enhance understanding and improve analysis

Page 37: THE CIA’S “GRAND CHALLENGES” WITH BIG DATA from Structure:Data 2013

Implications

Page 38: THE CIA’S “GRAND CHALLENGES” WITH BIG DATA from Structure:Data 2013

4 Rules of Big Data

•  It’s the data… (apologies to James Carville)

•  Power to the people (apologies to the Black Panthers)

•  Context, context, context (apologies to Lord Harold Samuel)

•  Latency breeds contempt (apologies to Aesop)

Page 39: THE CIA’S “GRAND CHALLENGES” WITH BIG DATA from Structure:Data 2013

It’s the Data…

Page 40: THE CIA’S “GRAND CHALLENGES” WITH BIG DATA from Structure:Data 2013

Data vs Tools—A History Lesson

•  Sophisticated tools without the data are useless

•  Mediocre tools with the data are frustrating

•  Analysts will always opt for frustration over futility, if that is their only option

Page 41: THE CIA’S “GRAND CHALLENGES” WITH BIG DATA from Structure:Data 2013

Our Job

•  Leverage the Big Data world

•  Find the Information that Matters

•  Connect the Dots

•  Understand the Plans of our Adversaries

•  Safeguard our national security

Page 42: THE CIA’S “GRAND CHALLENGES” WITH BIG DATA from Structure:Data 2013

The Problem

Page 43: THE CIA’S “GRAND CHALLENGES” WITH BIG DATA from Structure:Data 2013

Our Problem: Which 5K

•  Don’t know the future value of data

•  We cannot connect dots we don’t have

•  Traditional, requirements-driven collection fails in the Big Data world

–  Can’t task for data you don’t know you need

–  The few cannot know the needs of the many

–  Global Coverage requires Global Data

Page 44: THE CIA’S “GRAND CHALLENGES” WITH BIG DATA from Structure:Data 2013

Characteristics of Big Data

•  More is always better

•  Signal to noise only gets worse

•  Requirements are usually hindsight

•  Enumeration not modeling

Page 45: THE CIA’S “GRAND CHALLENGES” WITH BIG DATA from Structure:Data 2013

Data as a Service

•  Analysts and operators are not data engineers

•  Need insight and understanding

•  Ask a question and get a coherent answer

•  Cannot know what data sets contain information of value to them

•  Imbue data services and tools with those smarts

•  Smart Data, smart tools, smarter intelligence

Page 46: THE CIA’S “GRAND CHALLENGES” WITH BIG DATA from Structure:Data 2013

Power to the People

Page 47: THE CIA’S “GRAND CHALLENGES” WITH BIG DATA from Structure:Data 2013

Today

•  Analytics and tools are hard to use

•  Specialists are required to derive value

•  Skilled people are in short supply

•  Algorithms are dense and arcane

•  Require a lot of hand curation

•  Built for business not for intelligence

Page 48: THE CIA’S “GRAND CHALLENGES” WITH BIG DATA from Structure:Data 2013

New Fields of Expertise

•  Data Scientist

•  Information Engineer

Page 49: THE CIA’S “GRAND CHALLENGES” WITH BIG DATA from Structure:Data 2013

Data Science

Data science combines elements from many fields:*

•  Math

•  Statistics

•  Data Engineering

•  Pattern Recognition and Learning

•  Advanced Computing

•  Visualization

•  Uncertainty Modeling

•  Data Warehousing

•  High performance computing

* Wikipedia

Page 50: THE CIA’S “GRAND CHALLENGES” WITH BIG DATA from Structure:Data 2013

Big Data Democracy Wins

The power of big data can only be fully realized when it is in the hands of the average user

Page 51: THE CIA’S “GRAND CHALLENGES” WITH BIG DATA from Structure:Data 2013

Tomorrow

•  Elegant, powerful and easy to use tools and visualizations

•  Machines to do more of the heavy lifting

•  Intelligent systems that learn from the user

•  Correlation not search

•  “Curiosity layer”– machines that are curious on your behalf

Page 52: THE CIA’S “GRAND CHALLENGES” WITH BIG DATA from Structure:Data 2013

7 Universal Constructs for Analytics

•  People

•  Places

•  Organizations

•  Time

•  Events

•  Concepts

•  Things

Page 53: THE CIA’S “GRAND CHALLENGES” WITH BIG DATA from Structure:Data 2013

User Built Recipes

Page 54: THE CIA’S “GRAND CHALLENGES” WITH BIG DATA from Structure:Data 2013

Keep it Simple

•  Data Scientists focus on hard problems

•  Build reusable components that anyone can apply—Recipes

•  Share them widely—Apps Store/Apps Mall—Recipe Book

•  Let users assemble components their way

•  Experiment and fail quickly to succeed faster
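One way to picture the "recipes" idea above is as small, reusable steps that users chain together themselves. The sketch below is a hypothetical Python illustration: the `where`, `select` and `recipe` helpers and the sample records are invented for this example and do not describe any actual CIA system.

```python
from typing import Callable, Dict, Iterable, List

Record = Dict[str, str]
Step = Callable[[Iterable[Record]], Iterable[Record]]


def where(field: str, value: str) -> Step:
    """Reusable component: keep only records whose field equals value."""
    def step(records: Iterable[Record]) -> Iterable[Record]:
        return (r for r in records if r.get(field) == value)
    return step


def select(*fields: str) -> Step:
    """Reusable component: keep only the named fields of each record."""
    def step(records: Iterable[Record]) -> Iterable[Record]:
        return ({f: r[f] for f in fields if f in r} for r in records)
    return step


def recipe(*steps: Step) -> Callable[[Iterable[Record]], List[Record]]:
    """Chain components, in order, into one callable that can be shared."""
    def run(records: Iterable[Record]) -> List[Record]:
        for step in steps:
            records = step(records)
        return list(records)
    return run


# Invented sample data, for illustration only.
events = [
    {"place": "Paris", "event": "meeting", "time": "2013-03-20"},
    {"place": "Oslo", "event": "shipment", "time": "2013-03-21"},
    {"place": "Paris", "event": "shipment", "time": "2013-03-22"},
]

# A user assembles existing components "their way" without writing new logic.
paris_timeline = recipe(where("place", "Paris"), select("event", "time"))
result = paris_timeline(events)
print(result)
```

Because each step takes and returns an iterable of records, any step can follow any other, which is what lets non-specialists recombine them freely.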

Page 55: THE CIA’S “GRAND CHALLENGES” WITH BIG DATA from Structure:Data 2013

Latency Breeds Contempt

Page 56: THE CIA’S “GRAND CHALLENGES” WITH BIG DATA from Structure:Data 2013

It’s All About Speed

•  Hadoop/Map Reduce: batch

–  Flexible, powerful, slow

•  Equivalent of Real-Time Map/Reduce

–  Flexible, powerful and fast

–  Dremel, Caffeine, Impala, Apache Drill, Spanner…

•  Recursive Streams processing w/ complex analytics

•  In-memory: peta-scale RAM architectures

–  Distributed, in-memory analytics
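The contrast above between batch map/reduce and lower-latency processing can be illustrated with a toy word count. This is a single-process sketch in plain Python, not Hadoop's API or that of any system named on the slide, and all names in it are illustrative.

```python
from collections import Counter
from functools import reduce
from typing import Iterable


def map_phase(record: str) -> Counter:
    """Map step: turn one record into partial word counts."""
    return Counter(record.lower().split())


def batch_count(records: Iterable[str]) -> Counter:
    """Batch style: the answer exists only after every record is reduced."""
    return reduce(lambda left, right: left + right,
                  map(map_phase, records),
                  Counter())


class StreamingCount:
    """Streaming style: running totals are held in memory and are
    queryable after each record arrives."""

    def __init__(self) -> None:
        self.totals: Counter = Counter()

    def update(self, record: str) -> Counter:
        self.totals.update(map_phase(record))
        return self.totals


records = ["big data", "data in context", "big bets"]

batch_result = batch_count(records)

stream = StreamingCount()
for record in records:
    stream.update(record)  # an up-to-date answer is available at each step

print(batch_result == stream.totals)
```

Both approaches reach the same totals. The difference is latency: the streaming version can answer after every record, while the batch version answers only once the whole input has been processed.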

Page 57: THE CIA’S “GRAND CHALLENGES” WITH BIG DATA from Structure:Data 2013

Tectonic Technology Shifts

Traditional Processing  |  Mass Analytics/Big Data
Data on SAN  |  Data at processor
Move Data to Question  |  Move Question to Data
Backup  |  Replication management
Vertical scaling  |  Horizontal scaling
Capacity after demand  |  Capacity ahead of demand
DR  |  COOP
Size to peak load  |  Dynamic/elastic provisioning
Tape  |  SAN
SAN  |  Disk
Disk  |  SSD
RAM limited  |  Peta-scale RAM

Page 58: THE CIA’S “GRAND CHALLENGES” WITH BIG DATA from Structure:Data 2013

New Computing Architectures

•  Data close to compute

•  Power at the edge

•  Optical Computing/Optical Bus

•  End of the motherboard: shared pools of everything

•  Software defined everything: compute, storage, networking, data center

•  Network is the bottleneck and constraint

Page 59: THE CIA’S “GRAND CHALLENGES” WITH BIG DATA from Structure:Data 2013

Context, Context, Context

Page 60: THE CIA’S “GRAND CHALLENGES” WITH BIG DATA from Structure:Data 2013

Everything in Your Frame of Reference

•  Widgets—Webtop in context to business

•  Schema on Read—Data in context to your question

•  User assembled analytics—answers in context to your questions

•  Elastic computing—computing in context to your demand
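"Schema on Read" in the list above means the raw data is stored as-is and a structure is imposed only when a question is asked. A minimal Python sketch of the idea follows; the `read_with_schema` helper, the field names and the sample records are all invented for illustration.

```python
import json
from typing import Any, Callable, Dict, Iterable, Iterator

# Raw records are stored untouched; no schema was enforced at write time.
# Invented sample data, for illustration only.
RAW_LINES = [
    '{"who": "alice", "where": "Oslo", "ts": "2013-03-20"}',
    '{"who": "bob", "ts": "2013-03-21", "amount": "12.5"}',
]


def read_with_schema(lines: Iterable[str],
                     schema: Dict[str, Callable[[Any], Any]]) -> Iterator[dict]:
    """Apply a question-specific schema while reading.

    `schema` maps each wanted field to a conversion function; fields that
    a record lacks come back as None instead of failing the read.
    """
    for line in lines:
        raw = json.loads(line)
        yield {name: (cast(raw[name]) if name in raw else None)
               for name, cast in schema.items()}


# Two different questions read the same raw data through different schemas.
people_view = list(read_with_schema(RAW_LINES, {"who": str, "where": str}))
money_view = list(read_with_schema(RAW_LINES, {"who": str, "amount": float}))

print(people_view)
print(money_view)
```

The stored records never change; only the view does, which is the sense in which the data is "in context to your question".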

Page 61: THE CIA’S “GRAND CHALLENGES” WITH BIG DATA from Structure:Data 2013

Closing Thoughts

Page 62: THE CIA’S “GRAND CHALLENGES” WITH BIG DATA from Structure:Data 2013

High Noon in the Information Age

Page 63: THE CIA’S “GRAND CHALLENGES” WITH BIG DATA from Structure:Data 2013

It is nearly within our grasp to compute on all human-generated information

Page 64: THE CIA’S “GRAND CHALLENGES” WITH BIG DATA from Structure:Data 2013

Facebook

> 1 billion users

> 35% of all photographs

Page 65: THE CIA’S “GRAND CHALLENGES” WITH BIG DATA from Structure:Data 2013

The inanimate is rapidly becoming sentient

Smarter Planet

Cars drive themselves

Machines know your needs

Page 66: THE CIA’S “GRAND CHALLENGES” WITH BIG DATA from Structure:Data 2013

3rd Wave of Computing

Cognitive Machines

Watson

Page 67: THE CIA’S “GRAND CHALLENGES” WITH BIG DATA from Structure:Data 2013

Social + Mobile + Cloud + Nano + Bio + Sensors =

•  Moving faster than government can keep up

•  The legal system is woefully behind

•  What are your rights?

•  Who owns your data?

•  Driving the pace of social change

•  Exponentially increasing cyber threats

Page 68: THE CIA’S “GRAND CHALLENGES” WITH BIG DATA from Structure:Data 2013
