Analytics and Big Data Analytics

Analytics and Big Data Analytics Robin Bloor Ph D

View Dr. Robin Bloor's presentation from the Dec. 2013 Big Data Conference in Rome.

Transcript of Analytics and Big Data Analytics

Page 1: Analytics and Big Data Analytics

Analytics and Big Data Analytics

Robin Bloor, Ph.D.

Page 2: Analytics and Big Data Analytics

The Sequence of Topics…

1 Data Science?

2 The Nature of Analytics

3 Machine Learning Et Al

4 The Business Perspective

5 The Future

Page 3: Analytics and Big Data Analytics

1

Page 4: Analytics and Big Data Analytics

What Is Data Science?

There is no “data science.” It’s a misnomer

All science is empirical and involves data analysis.

Science implements a method.

So do statisticians

Page 5: Analytics and Big Data Analytics

What Is A Data Scientist?

Project manager

Qualified statistician

Domain business expert

Experienced data architect

Software engineer

(It’s a team)

Page 6: Analytics and Big Data Analytics

Data Scientist v Business Analysts

Claims that business analysts can be data scientists are dubious

Good practitioners of statistics understand data (from years of training)

Software understands nothing; it simply implements algorithms

Page 7: Analytics and Big Data Analytics

Who Understands Data?

Page 8: Analytics and Big Data Analytics

Nevertheless!

You can know more about a business from its data than by any other means

Page 9: Analytics and Big Data Analytics

2

The Nature Of Analytics

Page 10: Analytics and Big Data Analytics

The Field of Business Intelligence

Page 11: Analytics and Big Data Analytics

The Driving Force is Insight

Page 12: Analytics and Big Data Analytics

A Process Not An Activity

Data Analytics is a multi-disciplinary end-to-end process

Until recently it was a walled garden. But recently the walls were torn down by…

Data availability

Scalable technology

Open source tools

Page 13: Analytics and Big Data Analytics

The Data Analytics Process - Detail

Page 14: Analytics and Big Data Analytics

The CRITICAL Workload Issue

Previously, we viewed database workloads as an i/o optimization problem

With analytics the workload is a very variable mix of i/o and calculation

No databases were built for this – not even Big Data databases
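
A rough sketch of the workload point above: each step of an analytic job can be modeled as some volume of data read plus some amount of calculation, and which of the two dominates varies from step to step. The `Step` class, the step names, and every number below are hypothetical, chosen only to illustrate the contrast.

```python
from dataclasses import dataclass


@dataclass
class Step:
    name: str
    bytes_read: float   # hypothetical I/O volume for this step
    operations: float   # hypothetical count of arithmetic operations


def classify(step, bandwidth_bytes_per_s, ops_per_s):
    """Estimate I/O time and compute time, and report which one dominates."""
    io_time = step.bytes_read / bandwidth_bytes_per_s
    cpu_time = step.operations / ops_per_s
    bound = "io" if io_time >= cpu_time else "compute"
    return io_time, cpu_time, bound


# Assumed hardware figures, for illustration only.
BANDWIDTH = 1e9   # bytes per second
OPS_RATE = 10e9   # operations per second

scan = Step("table scan", bytes_read=10e9, operations=1e9)
fit = Step("model fit", bytes_read=1e9, operations=500e9)

for step in (scan, fit):
    io_time, cpu_time, bound = classify(step, BANDWIDTH, OPS_RATE)
    print(f"{step.name}: io={io_time:.1f}s compute={cpu_time:.1f}s -> {bound}-bound")
```

Under these assumed figures the scan is dominated by I/O and the model fit by calculation, which is the kind of mixed workload the slide describes.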

Page 15: Analytics and Big Data Analytics

3

Machine Learning Et Al

Page 16: Analytics and Big Data Analytics

Analytical Latencies

1 Data access

2 Data preparation

3 Model development

4 Execution

5 Implementation

6 Model Audit & Update

Speed = value (probably)
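
One way to read the six latencies above is as stages whose delays add up to the end-to-end time from data to a deployed, maintained model. The sketch below sums them and picks out the slowest stage. The hour figures are invented for illustration and do not come from the presentation; `summarize` is a hypothetical helper.

```python
# The six stages, in the order listed on the slide.
STAGES = [
    "Data access",
    "Data preparation",
    "Model development",
    "Execution",
    "Implementation",
    "Model Audit & Update",
]


def summarize(latency_hours):
    """Return (total latency, slowest stage) for a dict of stage -> hours."""
    total = sum(latency_hours[stage] for stage in STAGES)
    bottleneck = max(STAGES, key=lambda stage: latency_hours[stage])
    return total, bottleneck


# Hypothetical figures, in hours.
example = {
    "Data access": 4,
    "Data preparation": 16,
    "Model development": 40,
    "Execution": 2,
    "Implementation": 24,
    "Model Audit & Update": 8,
}

total, bottleneck = summarize(example)
print(f"Total latency: {total} hours; slowest stage: {bottleneck}")
```

With numbers like these, speeding up execution alone would barely change the total, so it matters which stage is actually the slow one.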

Page 17: Analytics and Big Data Analytics

The Open Source Dynamic

The R Language: Over 1 million users

Hadoop and its Ecosystem: Reduced latency for analytics

Machine Learning Algorithms: Raw power

None of these are engineered for performance

Page 18: Analytics and Big Data Analytics

Machine Learning Algorithms - 1

There are many:

Neural network(s)

Bayesian networks

Decision trees/random forests

Support vector machines

K-means Clustering

Regression(s)

Etc.
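
As a concrete instance of one item on this list, here is a minimal sketch of k-means clustering on one-dimensional data, in plain Python. It assumes the caller supplies the starting centroids and uses absolute difference as the distance. The function name `kmeans_1d` and the sample data are illustrative only, not taken from the presentation.

```python
def kmeans_1d(points, initial_centroids, max_iterations=10):
    """Cluster 1-D points by alternating assignment and centroid update."""
    centroids = list(initial_centroids)
    clusters = [[] for _ in centroids]
    for _ in range(max_iterations):
        # Assignment step: put each point in the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        new_centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break  # assignments have stabilized
        centroids = new_centroids
    return centroids, clusters


points = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
centroids, clusters = kmeans_1d(points, initial_centroids=[1.0, 12.0])
print(centroids)  # [2.0, 11.0]
print(clusters)   # [[1.0, 2.0, 3.0], [10.0, 11.0, 12.0]]
```

Each pass touches every point once per centroid, so the cost grows with data size, cluster count, and the number of iterations.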

Page 19: Analytics and Big Data Analytics

Machine Learning Algorithms - 2

They are not newly invented

We did not previously use them much because we never had the computer power

Now that we have the power (at a price) we can employ them

Page 20: Analytics and Big Data Analytics

Machine Learning Algorithms - 3

Machine learning algorithms can check all possibilities

We never had the computer power

Now that we have the power (at a price) we can employ them

Page 21: Analytics and Big Data Analytics

The Impact?

Machine learning and processing power (parallelism) will change the data analysis process

The analytics team needs to understand IT

Page 22: Analytics and Big Data Analytics

4

The Business Perspective

Page 23: Analytics and Big Data Analytics

Business Metamorphosis

The role of data analysis has not changed

Only the speed has changed

The process will evolve

It will be disruptive for incumbent vendors

Page 24: Analytics and Big Data Analytics

The Data Analysis Budget

Data Analysis is Business R&D

The focus is on business process

The outcome of successful R&D is a changed process

Think of manufacturing for a useful analogy

Page 26: Analytics and Big Data Analytics

5

The Future

Page 27: Analytics and Big Data Analytics

It isn’t over until the fat lady sings

Hardware disruption

Software disruption

Business process disruption

All we know is:

Analytical processing will get faster

Analytic latencies will reduce

Data will continue to grow

Analytics will be a differentiator

Page 28: Analytics and Big Data Analytics

In Summary…

1 Data Science?

2 The Nature of Analytics

3 Machine Learning Et Al

4 The Business Perspective

5 The Future

Page 29: Analytics and Big Data Analytics
Page 30: Analytics and Big Data Analytics

Thank you very much for your attention