The Power of Big Data

Tim Wiles, Iain Batty and David Turnbull, 31 January 2014

description

Safely exploiting the power of data and information is vital to your business. In this seminar, Iain, Tim and David will look at the advantages and risks that 'Big Data' presents to your business, and at its potential for data analysis. They will investigate a number of uses of NoSQL databases to support this analysis, and will discuss other business considerations, including how to distribute data across multiple locations and how to significantly reduce the potential cost of storage. The seminar concludes with an overview of modern ways of securing your data assets, including different authentication methods and encryption.

Transcript of The Power of Big Data

Page 1: The Power of Big Data

The Power of Big Data

Tim Wiles, Iain Batty and David Turnbull

31 January 2014

Page 2: The Power of Big Data

Outline

• Big data (Iain Batty)

• NoSQL: The future of data storage? (Tim Wiles)

• Data security (David Turnbull)

Page 3: The Power of Big Data

Big Data

Page 4: The Power of Big Data

What is Data?

• Data is everywhere

• You have more than you think

• It’s your biggest asset

Page 5: The Power of Big Data

So What Is “Big” Data?

• Many definitions

• Study by Ward & Barker of St Andrews

• “Big data is a term describing the storage and analysis of large and or complex data sets using a series of techniques including, but not limited to: NoSQL, Map Reduce and machine learning.”
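As a rough illustration of the Map Reduce technique named in the definition above, the sketch below counts words across a few made-up documents in plain Python. The documents and function names are assumptions for illustration only; production frameworks run the same map, shuffle and reduce phases across many machines.

```python
from collections import defaultdict

# Hypothetical input: each string stands in for one document in a large data set.
documents = [
    "big data is everywhere",
    "data is your biggest asset",
    "big data needs analysis",
]


def map_phase(document):
    """Map step: emit a (word, 1) pair for every word in one document."""
    return [(word, 1) for word in document.split()]


def shuffle(pairs):
    """Shuffle step: group all emitted values by their key (the word)."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Reduce step: combine the values for each key into a single count."""
    return {key: sum(values) for key, values in grouped.items()}


mapped = [pair for doc in documents for pair in map_phase(doc)]
word_counts = reduce_phase(shuffle(mapped))
print(word_counts["data"])  # 3
```

Because each document is mapped independently, the map step can be spread over as many workers as there are documents.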

Page 6: The Power of Big Data

So What Is “Big” Data?

• We have a huge amount of data:

– 90% of data was created in the last two years

– 2.5 exabytes (2.5×10^18 bytes) of data created every day

• Data analysis on a huge scale

Page 7: The Power of Big Data

THANKS TO BIG DATA…

Page 8: The Power of Big Data
Page 9: The Power of Big Data
Page 10: The Power of Big Data

How and Why Big Data is Used

• Healthcare

• Scientific Research (Folding@Home, SETI)

• Market Research

• Business Operation Optimisation

Page 11: The Power of Big Data

Why to use Big Data

• Investigative and predictive

• Increasing access to public data

• Enables high-level understanding of previously unfathomable datasets

Page 12: The Power of Big Data

Why Not to Use Big Data

• Expensive

• Limited pool of talent

• Not always applicable

• Must be used correctly: correlation does not mean causation

...yaaaarrrr?!
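To make the correlation warning concrete, the sketch below computes a Pearson correlation coefficient for two series. The figures are invented purely for illustration (they are not from the seminar): both rise over the same months, so they correlate strongly even though neither causes the other.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)


# Made-up monthly figures: both simply rise as the summer gets warmer.
ice_cream_sales = [10, 20, 30, 40, 50]
sunburn_cases = [1, 2, 2, 4, 5]

r = pearson(ice_cream_sales, sunburn_cases)
print(round(r, 2))  # 0.96
```

A coefficient this close to 1 says the two series move together; it says nothing about why.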

Page 13: The Power of Big Data

Conclusion

• Big Data technologies may or may not be right for you

• But the principles are universal:

– Gather your data

– Use novel sources such as social media and public data initiatives

– Analyse it intelligently

Page 14: The Power of Big Data

NoSQL: The future of data storage?

Page 15: The Power of Big Data

20 years ago…

• Hard drives ~ 500 MB

• Floppy disks ~ 1.44 MB

• Modems ~ 28-56 Kbps

• Digital cameras emerging

BBC front page (1996): bit.ly/Kc6ojz

Page 16: The Power of Big Data

Today…

BBC front page (today): bbc.in/18lsxlx

Page 17: The Power of Big Data

Data storage

Relational (SQL)      NoSQL
Highly structured     Flexible structure
Single type           Many types
Jack of all trades    Right tool for the job?

£££££

Page 18: The Power of Big Data

Which horse do you back?

Page 19: The Power of Big Data

VHS vs Betamax

Page 20: The Power of Big Data

HD-DVD vs Blu-ray

Page 21: The Power of Big Data

Flavours of NoSQL

Key-value: Amazon Dynamo

Column: Apache Cassandra, Google BigTable, HBase

Document: CouchDB, MongoDB

Graph: Neo4j, AllegroGraph

Page 22: The Power of Big Data

Comparing the options

• Right tool for the job?

• Relational database → Can be adapted.

• NoSQL database → Specialised problem solving.

Page 23: The Power of Big Data

Relational Database

EmployeeID  Employee
1           Tim Wiles
2           Iain Batty
3           David Turnbull

PayID  Payment Method
1      Salaried
2      Ad Hoc
3      Digestive Biscuits

EmployeeID  PayID
1           3
2           1
3           1
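The three tables above only give an answer once they are joined. The sketch below rebuilds them in an in-memory SQLite database using Python's standard `sqlite3` module and looks up how each employee is paid. The table and column names (`Employee`, `Payment`, `EmployeePay`, `PaymentMethod`) are assumptions chosen to mirror the slide, not names used by the presenters.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Recreate the three tables from the slide (names are illustrative).
conn.executescript("""
CREATE TABLE Employee (EmployeeID INTEGER PRIMARY KEY, Employee TEXT);
CREATE TABLE Payment (PayID INTEGER PRIMARY KEY, PaymentMethod TEXT);
CREATE TABLE EmployeePay (
    EmployeeID INTEGER REFERENCES Employee(EmployeeID),
    PayID INTEGER REFERENCES Payment(PayID)
);
INSERT INTO Employee VALUES (1, 'Tim Wiles'), (2, 'Iain Batty'), (3, 'David Turnbull');
INSERT INTO Payment VALUES (1, 'Salaried'), (2, 'Ad Hoc'), (3, 'Digestive Biscuits');
INSERT INTO EmployeePay VALUES (1, 3), (2, 1), (3, 1);
""")

# Follow the link table from each employee to their payment method.
rows = conn.execute("""
SELECT e.Employee, p.PaymentMethod
FROM Employee AS e
JOIN EmployeePay AS ep ON ep.EmployeeID = e.EmployeeID
JOIN Payment AS p ON p.PayID = ep.PayID
ORDER BY e.EmployeeID
""").fetchall()

for employee, method in rows:
    print(employee, "is paid by:", method)
```

The link table keeps each fact in exactly one place, which is the "highly structured" property the earlier comparison refers to.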

Page 24: The Power of Big Data

Key-value stores

Key Value

teh the

hlelo hello

edn end

tol tool
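A key-value store behaves much like a dictionary: you hand over a key and get its value back in a single lookup. The sketch below uses a plain Python `dict` as a stand-in for such a store, loaded with the spelling corrections from the slide; the `autocorrect` helper is a hypothetical name added for illustration.

```python
# A Python dict standing in for a key-value store of spelling corrections.
corrections = {
    "teh": "the",
    "hlelo": "hello",
    "edn": "end",
    "tol": "tool",
}


def autocorrect(sentence):
    """Replace each word that has a stored correction; leave other words alone."""
    return " ".join(corrections.get(word, word) for word in sentence.split())


fixed = autocorrect("hlelo teh edn")
print(fixed)  # hello the end
```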

Page 25: The Power of Big Data

Column stores

Item Name        Number Of Sales  Total Cost (£)  Total Revenue (£)  Origin
Orange Juice     152,000          76,000          152,000            Spain
Apple Juice      137,000          54,800          123,300            UK
Pineapple Juice  63,000           37,800          78,750             Brazil
Grape Juice      84,000           46,200          92,400             Spain

Page 26: The Power of Big Data

Column stores

Total Number Of Sales = 152,000 + 137,000 + 63,000 + 84,000 = 436,000

Page 28: The Power of Big Data

Column stores

Profit = Total Revenue − Total Cost = £446,450 − £214,800 = £231,650
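The two totals on these slides can be reproduced by storing the juice table column by column, so that an aggregate only has to read the columns it needs. The sketch below is a plain-Python illustration of that layout, not the API of any particular column store; the key names are assumptions.

```python
# Column-oriented layout: one list per column, aligned by position.
columns = {
    "item_name": ["Orange Juice", "Apple Juice", "Pineapple Juice", "Grape Juice"],
    "number_of_sales": [152_000, 137_000, 63_000, 84_000],
    "total_cost": [76_000, 54_800, 37_800, 46_200],
    "total_revenue": [152_000, 123_300, 78_750, 92_400],
    "origin": ["Spain", "UK", "Brazil", "Spain"],
}

# Summing sales touches a single column.
total_sales = sum(columns["number_of_sales"])

# Profit touches two columns and never reads names or origins.
profit = sum(columns["total_revenue"]) - sum(columns["total_cost"])

print(total_sales)  # 436000
print(profit)       # 231650
```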

Page 29: The Power of Big Data

Document stores

Page 30: The Power of Big Data

Document stores

Page 31: The Power of Big Data

Document stores

Company
  Location 1
    City: Durham
    Employee List
      Employee 1
        Name: Tim Wiles
        Age: 26
        Start Date: 31/03/2013
      Employee 2
        Name: David Turnbull
        Age: 27
  Location 2
    City: London
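The company structure above maps naturally onto a single nested document. The sketch below writes it as a Python dictionary and serialises it to JSON, the kind of format document stores such as CouchDB and MongoDB work with. The field names are assumptions based on the slide; note that the second employee has no start date and the second location has no employee list, which a document store tolerates without a schema change.

```python
import json

# The whole company held as one nested document (field names are illustrative).
company = {
    "Locations": [
        {
            "City": "Durham",
            "Employee List": [
                {"Name": "Tim Wiles", "Age": 26, "Start Date": "31/03/2013"},
                {"Name": "David Turnbull", "Age": 27},  # no start date recorded
            ],
        },
        {"City": "London"},  # no employees recorded on the slide
    ]
}

# Find everyone based in Durham by walking the nested structure.
durham_names = [
    employee["Name"]
    for location in company["Locations"]
    if location["City"] == "Durham"
    for employee in location.get("Employee List", [])
]

print(durham_names)
print(json.dumps(company, indent=2))
```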

Page 32: The Power of Big Data

Graph stores

[Figure: a graph of nodes joined by edges labelled Friend, “Friend” and Enemy]

Page 33: The Power of Big Data

Case Study: Middle Earth University

• Introduction to Alchemy, Wed 11AM

• Advanced Alchemy, Wed 1PM

• World Domination, Wed 9AM

• Introduction to Magic, Wed 11AM

• Advanced Magical Techniques, Wed 9AM

Page 34: The Power of Big Data

Case Study: Middle Earth University

• Advanced Magical Techniques, Wed 9AM

Page 36: The Power of Big Data

Case Study: Middle Earth University

All courses running at 11AM on Wednesday:

• Introduction to Alchemy, Wed 11AM

• Introduction to Magic, Wed 11AM

Page 38: The Power of Big Data

Case Study: Middle Earth University

Courses:

• Introduction to Alchemy, Wed 11AM

• Advanced Alchemy, Wed 1PM

• World Domination, Wed 9AM

• Introduction to Magic, Wed 11AM

• Advanced Magical Techniques, Wed 9AM

Degrees: BMag, MMag, DMag

Page 39: The Power of Big Data

Case Study: Middle Earth University

Courses linked to DMag:

• Advanced Alchemy, Wed 1PM

• Advanced Magical Techniques, Wed 9AM

Page 41: The Power of Big Data

Case Study: Middle Earth University

Courses:

• Introduction to Alchemy, Wed 11AM

• Advanced Alchemy, Wed 1PM

• World Domination, Wed 9AM

• Introduction to Magic, Wed 11AM

• Advanced Magical Techniques, Wed 9AM

Degrees: BMag, MMag, DMag

Rooms: Shire Lecture Hall, Mordor Seminar Room

Page 42: The Power of Big Data

Case Study: Middle Earth University

Courses linked to Mordor Seminar Room:

• Introduction to Alchemy, Wed 11AM

• Advanced Alchemy, Wed 1PM
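The case study queries can be sketched as lookups over nodes and labelled edges. The example below is a plain-Python stand-in for a graph store, not the query language of Neo4j or AllegroGraph. It contains only the connections that can be read off the slides; the relationship names `PART_OF` and `TAUGHT_IN` and the helper functions are hypothetical.

```python
# Node properties: each course and its timetable slot, as listed on the slides.
course_times = {
    "Introduction to Alchemy": "Wed 11AM",
    "Advanced Alchemy": "Wed 1PM",
    "World Domination": "Wed 9AM",
    "Introduction to Magic": "Wed 11AM",
    "Advanced Magical Techniques": "Wed 9AM",
}

# Edges as (source, relationship, target). Only links shown on the slides appear
# here, and the relationship names are illustrative assumptions.
edges = [
    ("Advanced Alchemy", "PART_OF", "DMag"),
    ("Advanced Magical Techniques", "PART_OF", "DMag"),
    ("Introduction to Alchemy", "TAUGHT_IN", "Mordor Seminar Room"),
    ("Advanced Alchemy", "TAUGHT_IN", "Mordor Seminar Room"),
]


def linked_courses(target, relationship):
    """Courses with an edge of the given relationship pointing at the target node."""
    return sorted(src for src, rel, dst in edges if rel == relationship and dst == target)


def courses_at(slot):
    """Courses whose timetable property matches the given slot."""
    return sorted(course for course, time in course_times.items() if time == slot)


print(courses_at("Wed 11AM"))
print(linked_courses("DMag", "PART_OF"))
print(linked_courses("Mordor Seminar Room", "TAUGHT_IN"))
```

Adding a new kind of node, such as lecturers, only means adding more edges; none of the existing data has to be restructured.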

Page 43: The Power of Big Data

Is NoSQL for everyone?

• Most businesses function effectively using only relational databases.

• NoSQL is not the grand solution to all data storage problems.

• NoSQL knowledge has to be trained in-house or hired in.

Page 44: The Power of Big Data

However…

Page 45: The Power of Big Data

NoSQL is showing significant promise for certain aspects of almost any business.

Page 46: The Power of Big Data

Reasons to use NoSQL in your business

• Potentially significant financial savings.

• Easy to adapt stored data as your business grows and your priorities change.

• Can exceed the performance of popular commercial relational databases.

Page 47: The Power of Big Data

Reasons to use NoSQL in your business

Effective tool for a holistic approach to analysing the growth/status of your business.

Page 48: The Power of Big Data

Reasons to use NoSQL in your business

Relational databases are not the only solution to your data storage problems.

Page 49: The Power of Big Data

Data Security

Page 50: The Power of Big Data

Why Is Data Security Important?

• The cost of a data breach is continuing to rise

• Fewer customers remain loyal after a data breach

• Reputation losses and diminished goodwill – lost business cost has steadily increased over the last 6 years (£500 thousand in 2007)

• Malicious or criminal attacks are the most costly

Page 55: The Power of Big Data

Current Methods Of Authentication

1. Basic user name and passwords

2. Biometrics

• Fingerprint scanners

• Voice recognition

• Face scanning and recognition

• Retina and iris scans

3. Multi-factor authentication

• Something possessed, as in a physical token or telephone

• Something known, such as a password or mother’s maiden name

• Something inherent, like a biometric trait

Page 56: The Power of Big Data

Pros and Cons Of These Methods

1. Standard username and password authentication is extremely vulnerable to rainbow table attacks

2. It relies on the ability of the system's users to pick secure passwords

Adobe Crossword
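A rainbow table is a precomputed mapping from password hashes back to passwords, so it only works when the same password always produces the same stored hash. The sketch below contrasts a bare SHA-256 hash with a salted, deliberately slow hash using Python's standard `hashlib.pbkdf2_hmac`. It is a minimal illustration rather than a recommendation from the seminar; the function names and the iteration count are assumptions.

```python
import hashlib
import hmac
import os


def unsalted_hash(password):
    """Bare hash: identical passwords always give identical hashes."""
    return hashlib.sha256(password.encode()).hexdigest()


def hash_password(password, salt=None, iterations=100_000):
    """Salted PBKDF2 hash; a fresh random salt is generated if none is given."""
    if salt is None:
        salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, iterations)
    return salt, digest


def verify_password(password, salt, expected, iterations=100_000):
    """Recompute the hash with the stored salt and compare in constant time."""
    _, digest = hash_password(password, salt, iterations)
    return hmac.compare_digest(digest, expected)


# Two users pick the same weak password.
same_unsalted = unsalted_hash("123456") == unsalted_hash("123456")
salt_a, stored_a = hash_password("123456")
salt_b, stored_b = hash_password("123456")

print(same_unsalted)         # True: one precomputed table entry cracks both accounts
print(stored_a == stored_b)  # False: different salts give different stored hashes
print(verify_password("123456", salt_a, stored_a))  # True
```

Salting does not rescue a weak password from targeted guessing, which is why the second point on the slide still matters.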

Page 57: The Power of Big Data

Pros and Cons Of These Methods

• In theory, biometrics is a great way to authenticate a user. It's impossible to lose your fingerprints, unless you have both your hands chopped off.

Page 58: The Power of Big Data

The Best Solution

• Multi-factor authentication: a security measure that requires two or more kinds of evidence that you are who you say you are.

• Authentication requires a combination of these pieces of evidence rather than simply using one or the other.

• Something you know – Username, Password

• Something you have – An RSA Key, Credit Card

• Something inherent – A fingerprint, retina scan

• Multi-factor authentication is very secure, but it is hard to implement everywhere.

• It requires users to remember to carry their RSA keys with them.
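One common way to implement the "something you have" factor is a token that produces one-time passwords. The slides do not say which scheme they have in mind, so the sketch below assumes the counter-based HOTP algorithm from RFC 4226 and combines it with a password check. The `authenticate` helper and its arguments are hypothetical; the secret is the published RFC test key, not a real credential.

```python
import hashlib
import hmac
import struct


def hotp(secret, counter, digits=6):
    """Counter-based one-time password as specified in RFC 4226."""
    message = struct.pack(">Q", counter)               # 8-byte big-endian counter
    digest = hmac.new(secret, message, hashlib.sha1).digest()
    offset = digest[-1] & 0x0F                         # dynamic truncation offset
    chunk = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(chunk % 10 ** digits).zfill(digits)


def authenticate(password_ok, submitted_code, token_secret, counter):
    """Both factors must pass: the password check and the token's current code."""
    code_ok = hmac.compare_digest(submitted_code, hotp(token_secret, counter))
    return password_ok and code_ok


token_secret = b"12345678901234567890"  # test key from RFC 4226, Appendix D
first_code = hotp(token_secret, 0)
print(first_code)  # 755224

print(authenticate(True, first_code, token_secret, 0))   # True: both factors pass
print(authenticate(True, "000000", token_secret, 0))     # False: wrong token code
print(authenticate(False, first_code, token_secret, 0))  # False: wrong password
```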

Page 59: The Power of Big Data

Emerging Methods Of Authentication

• YubiKey – Authentication method based on a unique physical token which cannot be duplicated or recorded, providing a credential based on something only an authorised user possesses.

• Can also be used with password managers such as LastPass

Page 60: The Power of Big Data

How Does YubiKey Work?

Page 61: The Power of Big Data

Quantum Cryptography

What is Quantum Cryptography?

The use of quantum mechanical effects to perform cryptographic tasks or to break cryptographic systems

What does that mean exactly?

• Using physics rather than mathematics to perform cryptographic tasks, such as generating cryptographic keys

• Moreover, quantum cryptography addresses the problem of key distribution

Page 62: The Power of Big Data

Quantum Cryptography

How does it work?

• It works by using a technique called Quantum Key Distribution (QKD). QKD enables two parties to produce a shared random secret key which is known only to them. They can then use this key to encrypt and decrypt messages passed between them.

• Keys are generated using photons, which are produced using LEDs. These photons are polarised using polarising filters and then transmitted.

• The two parties decide which filters are going to be used, and assign a value, usually a binary value, to each photon polarisation.

• When the whole transmission has happened, a unique key has been produced.
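The steps above can be sketched as a small classical simulation. The slides do not name a specific protocol; the example assumes the well-known BB84 scheme, in which each photon is polarised in one of two filter bases and the two parties keep only the bits where their bases happened to match. The names `bb84_sift`, `"+"` and `"x"` are illustrative, and a random number generator stands in for real photon measurements.

```python
import random

BASES = "+x"  # "+" = rectilinear filter, "x" = diagonal filter


def bb84_sift(n_photons, rng):
    """Simulate one BB84 exchange with no eavesdropper; return both sifted keys."""
    # The sender picks a random bit and a random polarisation basis per photon.
    alice_bits = [rng.randint(0, 1) for _ in range(n_photons)]
    alice_bases = [rng.choice(BASES) for _ in range(n_photons)]

    # The receiver measures each photon with an independently chosen basis.
    bob_bases = [rng.choice(BASES) for _ in range(n_photons)]
    bob_bits = [
        bit if a_basis == b_basis else rng.randint(0, 1)  # wrong basis gives a random result
        for bit, a_basis, b_basis in zip(alice_bits, alice_bases, bob_bases)
    ]

    # The bases (never the bits) are compared publicly; matching positions are kept.
    keep = [i for i in range(n_photons) if alice_bases[i] == bob_bases[i]]
    alice_key = [alice_bits[i] for i in keep]
    bob_key = [bob_bits[i] for i in keep]
    return alice_key, bob_key


alice_key, bob_key = bb84_sift(64, random.Random(2014))
print(alice_key == bob_key)  # True: both sides hold the same key
print(len(alice_key))        # roughly half of the photons survive sifting
```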

Page 63: The Power of Big Data

Quantum Cryptography

What are the benefits of using quantum cryptography?

• An important property of quantum cryptography is the ability to detect the presence of a third party attempting to eavesdrop on the transmission of the secret key

• This is achieved because of a fundamental principle of quantum mechanics – the process of measuring a quantum system in general disturbs the system.
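The effect of measurement disturbing the system can be seen in a simple simulation. The sketch below again assumes a BB84-style exchange and a hypothetical eavesdropper who measures every photon in a randomly chosen basis and resends it. Random numbers stand in for quantum measurement, so this illustrates the statistics rather than the physics.

```python
import random

BASES = "+x"  # two polarisation bases


def measure(bit, prepared_basis, measuring_basis, rng):
    """Measuring in the preparation basis returns the bit; otherwise the result is random."""
    return bit if prepared_basis == measuring_basis else rng.randint(0, 1)


def sifted_error_rate(n_photons, eavesdropper, rng):
    """Fraction of sifted key bits where the receiver disagrees with the sender."""
    errors = kept = 0
    for _ in range(n_photons):
        alice_bit, alice_basis = rng.randint(0, 1), rng.choice(BASES)
        photon_bit, photon_basis = alice_bit, alice_basis

        if eavesdropper:
            # The eavesdropper measures in a guessed basis and resends what she saw.
            eve_basis = rng.choice(BASES)
            photon_bit = measure(photon_bit, photon_basis, eve_basis, rng)
            photon_basis = eve_basis

        bob_basis = rng.choice(BASES)
        bob_bit = measure(photon_bit, photon_basis, bob_basis, rng)

        if bob_basis == alice_basis:  # only matching-basis positions are kept
            kept += 1
            errors += bob_bit != alice_bit
    return errors / kept if kept else 0.0


clean = sifted_error_rate(4000, eavesdropper=False, rng=random.Random(1))
tapped = sifted_error_rate(4000, eavesdropper=True, rng=random.Random(1))
print(clean)   # 0.0
print(tapped)  # close to 0.25
```

By comparing a sample of their sifted bits in public, the two parties can spot an error rate like this and discard the key.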

Page 64: The Power of Big Data

Questions

Page 65: The Power of Big Data

References

1. http://www.technologyreview.com/view/519851/the-big-data-conundrum-how-to-define-it/

2. http://en.wikipedia.org/wiki/Big_data#cite_note-15

3. 2013 Cost Of Data Breach Study: United Kingdom [Ponemon Institute, May 2013]

4. http://www.yubico.com/products/yubikey-hardware/yubikey/technical-description/

5. https://wiki.archlinux.org/index.php/yubikey#How_does_it_work

Page 66: The Power of Big Data

Upcoming Seminars

• Capturing the Real Value of IT Service Management - Friday 14th February

• Preparing for BYOD & Mobile Device Management - Friday 28th February