The Power of Big Data
Tim Wiles, Iain Batty and David Turnbull
31 January 2014
Outline
• Big data (Iain Batty)
• NoSQL: The future of data storage? (Tim Wiles)
• Data security (David Turnbull)
Big Data
What is Data?
• Data is everywhere
• You have more than you think
• It’s your biggest asset
So What Is “Big” Data?
• Many definitions
• Study by Ward & Barker of St Andrews
• “Big data is a term describing the storage and analysis of large and or complex data sets using a series of techniques including, but not limited to: NoSQL, Map Reduce and machine learning.”
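As a rough illustration of the Map Reduce technique named in the definition above, the sketch below counts words in two short strings. The input sentences and the function names (map_phase, shuffle, reduce_phase) are made up for illustration; a real framework would spread these steps across many machines.

```python
from collections import defaultdict
from functools import reduce


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Shuffle: group the emitted values by key (here, by word)."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Reduce: combine each key's list of values into a single count."""
    return {key: reduce(lambda a, b: a + b, values)
            for key, values in grouped.items()}


# Hypothetical input; each string stands in for one document.
documents = ["big data is big", "data is everywhere"]

pairs = [pair for doc in documents for pair in map_phase(doc)]
word_counts = reduce_phase(shuffle(pairs))
print(word_counts)
```

Because each document is mapped independently and each word is reduced independently, both phases can in principle run in parallel.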
So What Is “Big” Data?
• We have a huge amount of data:
– 90% of data was created in the last two years
– 2.5 exabytes (2.5×10^18 bytes) of data created every day
• Data Analysis on a huge scale
THANKS TO BIG DATA…
How and Why Big Data is Used
• Healthcare
• Scientific Research (Folding@Home, SETI)
• Market Research
• Business Operation Optimisation
Why to use Big Data
• Investigative and Predictive
• Increasing amount of public data access
• Enables high-level understanding of previously unfathomable datasets
Why Not to Use Big Data
• Expensive
• Limited pool of talent
• Not always applicable
• Must be used correctly: correlation does not mean causation
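The last point can be made concrete with a small sketch. The figures below are invented purely for illustration: two series that never influence each other still correlate almost perfectly, because both depend on a third factor (temperature).

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical data: both series are driven by temperature, not by each other.
temperature = [5, 8, 12, 16, 20, 24]
ice_cream_sales = [20 + 3 * t for t in temperature]
sunburn_cases = [1 + 2 * t for t in temperature]

r = pearson(ice_cream_sales, sunburn_cases)
print(f"correlation = {r:.3f}")
```

A correlation this strong says nothing about ice cream causing sunburn; the analysis only becomes meaningful once the common driver is identified.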
...yaaaarrrr?!
Conclusion
• Big Data technologies may or may not be right for you
• But the principles are universal:
– Gather your data
– Use novel sources such as social media and public data initiatives
– Analyse it intelligently
NoSQL: The future of data storage?
20 years ago…
Hard drives ~ 500 MB
Modems ~ 28-56 Kbps
Digital cameras emerging
Floppy disks ~ 1.44 MB
BBC front page (1996): bit.ly/Kc6ojz
Data storage
Relational (SQL)      NoSQL
Highly structured     Flexible structure
Single type           Many types
Jack of all trades    Right tool for the job?
£££££
Which horse do you back?
VHS vs Betamax
HD-DVD vs Blu-ray
Flavours of NoSQL
• Key-value: Amazon Dynamo
• Column: Apache Cassandra, Google BigTable, HBase
• Document: CouchDB, MongoDB
• Graph: Neo4j, AllegroGraph
Comparing the options
• Right tool for the job?
• Relational database → Can be adapted.
• NoSQL database → Specialised problem solving.
Relational Database
EmployeeID Employee
1 Tim Wiles
2 Iain Batty
3 David Turnbull
PayID Payment Method
1 Salaried
2 Ad Hoc
3 Digestive Biscuits
EmployeeID PayID
1 3
2 1
3 1
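The three tables above can be reproduced with Python's built-in sqlite3 module. This is a minimal sketch: the table and column names (employee, payment_method, employee_payment) are assumptions chosen for illustration, while the rows are the ones shown on the slide.

```python
import sqlite3

# In-memory database, discarded when the connection closes.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employee (employee_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE payment_method (pay_id INTEGER PRIMARY KEY, method TEXT);
-- Link table: which employee is paid by which method.
CREATE TABLE employee_payment (
    employee_id INTEGER REFERENCES employee(employee_id),
    pay_id INTEGER REFERENCES payment_method(pay_id)
);
INSERT INTO employee VALUES
    (1, 'Tim Wiles'), (2, 'Iain Batty'), (3, 'David Turnbull');
INSERT INTO payment_method VALUES
    (1, 'Salaried'), (2, 'Ad Hoc'), (3, 'Digestive Biscuits');
INSERT INTO employee_payment VALUES (1, 3), (2, 1), (3, 1);
""")

# Join the three tables to read each employee's payment method.
rows = conn.execute("""
    SELECT e.name, p.method
    FROM employee AS e
    JOIN employee_payment AS ep ON ep.employee_id = e.employee_id
    JOIN payment_method AS p ON p.pay_id = ep.pay_id
    ORDER BY e.employee_id
""").fetchall()

for name, method in rows:
    print(f"{name}: {method}")
```

The join is what makes the relational model adaptable: the same three tables can answer questions that were not anticipated when they were designed.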
Key-value stores
Key Value
teh the
hlelo hello
edn end
tol tool
…
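A key-value store behaves much like a dictionary: one key in, one value out, with no further structure. The sketch below uses a plain Python dict as a stand-in for a real store and assumes, from the typo-to-word pairs above, that the example is a spelling-correction lookup; the autocorrect helper is hypothetical.

```python
# A dict standing in for a key-value store: misspelling -> correction.
corrections = {
    "teh": "the",
    "hlelo": "hello",
    "edn": "end",
    "tol": "tool",
}


def autocorrect(word):
    """Look the word up by key; unknown words are returned unchanged."""
    return corrections.get(word, word)


sentence = "hlelo teh edn"
fixed = " ".join(autocorrect(word) for word in sentence.split())
print(fixed)
```

Lookups by key are fast and simple, but the store cannot answer questions about the values themselves without scanning everything.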
Column stores
Item Name        Number Of Sales  Total Cost (£)  Total Revenue (£)  Origin
Orange Juice     152,000          76,000          152,000            Spain
Apple Juice      137,000          54,800          123,300            UK
Pineapple Juice  63,000           37,800          78,750             Brazil
Grape Juice      84,000           46,200          92,400             Spain
Total Number Of Sales = 436,000
Profit = Total Revenue - Total Cost = £231,650
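The two totals on these slides come from reading whole columns rather than whole rows. A minimal sketch, using plain Python lists to stand in for a column store (the dictionary layout and key names are assumptions for illustration; the figures are those in the table):

```python
# Each column is stored contiguously, so an aggregate touches only
# the columns it needs and never reads the rest of each row.
columns = {
    "item_name": ["Orange Juice", "Apple Juice", "Pineapple Juice", "Grape Juice"],
    "number_of_sales": [152_000, 137_000, 63_000, 84_000],
    "total_cost": [76_000, 54_800, 37_800, 46_200],
    "total_revenue": [152_000, 123_300, 78_750, 92_400],
    "origin": ["Spain", "UK", "Brazil", "Spain"],
}

# Reads one column only.
total_sales = sum(columns["number_of_sales"])

# Reads two columns only.
profit = sum(columns["total_revenue"]) - sum(columns["total_cost"])

print(f"Total sales: {total_sales:,}")
print(f"Profit: £{profit:,}")
```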
Document stores
Company
  Location 1
    City: Durham
    Employee List
      Employee 1
        Name: Tim Wiles
        Age: 26
        Start Date: 31/03/2013
      Employee 2
        Name: David Turnbull
        Age: 27
  Location 2
    City: London
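The nested company record above maps naturally onto a JSON-style document. The sketch below holds it as a Python dictionary; the field names are assumptions, and the values are those on the slide. Note that the two employees and the two locations do not carry the same fields, which a document store allows.

```python
import json

company = {
    "locations": [
        {
            "city": "Durham",
            "employees": [
                {"name": "Tim Wiles", "age": 26, "start_date": "31/03/2013"},
                {"name": "David Turnbull", "age": 27},  # no start date recorded
            ],
        },
        {"city": "London"},  # no employee list recorded
    ]
}

# Query inside the document: names of everyone at the Durham location.
durham_names = [
    employee["name"]
    for location in company["locations"]
    if location["city"] == "Durham"
    for employee in location.get("employees", [])
]
print(durham_names)

# The whole document serialises to JSON and back without a fixed schema.
round_trip = json.loads(json.dumps(company))
```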
Graph stores
[Diagram: a graph of nodes joined by edges labelled Friend, “Friend” and Enemy]
Case Study: Middle Earth University
Courses:
• Introduction to Alchemy, Wed 11AM
• Advanced Alchemy, Wed 1PM
• World Domination, Wed 9AM
• Introduction to Magic, Wed 11AM
• Advanced Magical Techniques, Wed 9AM
Degrees: BMag, MMag, DMag
Rooms: Shire Lecture Hall, Mordor Seminar Room
Example selections from the graph:
• A single course: Advanced Magical Techniques (Wed 9AM)
• All courses running at 11AM on Wednesday: Introduction to Alchemy, Introduction to Magic
• DMag: Advanced Alchemy, Advanced Magical Techniques
• Mordor Seminar Room: Introduction to Alchemy, Advanced Alchemy
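The selections in this case study can be sketched as queries over labelled edges. This is an illustration only: the relationship names (RUNS_AT, PART_OF, HELD_IN) and the helper function are hypothetical, and only the links visible on the slides are included, so the BMag, MMag and Shire Lecture Hall links are left out rather than guessed.

```python
# Each edge is (source node, relationship, target node).
edges = [
    ("Introduction to Alchemy", "RUNS_AT", "Wed 11AM"),
    ("Advanced Alchemy", "RUNS_AT", "Wed 1PM"),
    ("World Domination", "RUNS_AT", "Wed 9AM"),
    ("Introduction to Magic", "RUNS_AT", "Wed 11AM"),
    ("Advanced Magical Techniques", "RUNS_AT", "Wed 9AM"),
    ("Advanced Alchemy", "PART_OF", "DMag"),
    ("Advanced Magical Techniques", "PART_OF", "DMag"),
    ("Introduction to Alchemy", "HELD_IN", "Mordor Seminar Room"),
    ("Advanced Alchemy", "HELD_IN", "Mordor Seminar Room"),
]


def courses_linked_to(relationship, target):
    """Follow edges of one type backwards from a target node."""
    return sorted(
        source
        for source, rel, dest in edges
        if rel == relationship and dest == target
    )


print(courses_linked_to("RUNS_AT", "Wed 11AM"))
print(courses_linked_to("PART_OF", "DMag"))
print(courses_linked_to("HELD_IN", "Mordor Seminar Room"))
```

Adding a new kind of relationship, such as who teaches a course, only needs new edges; no existing structure has to change.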
Is NoSQL for everyone?
• Most businesses function effectively using only relational databases.
• NoSQL is not the grand solution to all data storage problems.
• Adopting it means training or employing staff with NoSQL knowledge.
However…
NoSQL is showing significant promise for certain aspects of almost any business.
Reasons to use NoSQL in your business
• Potential significant financial savings.
• Easy to adapt stored data as your business grows and your priorities change.
• Can exceed the performance of popular commercial relational databases.
• Effective tool for a holistic approach to analysing the growth/status of your business.
• Relational databases are not the only solution to your data storage problems.
Data Security
Why Is Data Security Important?
• The cost of a data breach is continuing to rise
• Fewer customers remain loyal after a data breach
• Reputation losses and diminished goodwill – lost business cost has steadily increased over the last 6 years (£500 thousand in 2007)
• Malicious or criminal attacks are the most costly
The Cost Of a Data Breach
2013 Cost Of Data Breach Study: United Kingdom [Ponemon Institute, May 2013]
The Causes Of a Data Breach
2013 Cost Of Data Breach Study: United Kingdom [Ponemon Institute, May 2013]
Current Methods Of Authentication
1. Basic User Name and Passwords
2. Biometrics
• Fingerprint scanners
• Voice recognition
• Face scanning and recognition
• Retina and iris scans
3. Multi-Factor Authentication
• Something possessed, as in a physical token or telephone
• Something known, such as a password or mother’s maiden name
• Something inherent, like a biometric trait
Pros and Cons Of These Methods
1. Standard username and password authentication is extremely vulnerable to rainbow table attacks (lookups against precomputed hashes), particularly when password hashes are stored unsalted
2. Relies on the ability of the system users to pick secure passwords
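One common mitigation for the first weakness is to store a salted, deliberately slow hash instead of a plain hash. The sketch below uses PBKDF2 from Python's standard library; the iteration count, the salt length and the function names are assumptions for illustration, not a recommendation from the slides.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # assumed work factor; tune for your hardware


def hash_password(password, salt=None):
    """Return (salt, digest). A fresh random salt is drawn if none is given."""
    if salt is None:
        salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return salt, digest


def verify_password(password, salt, expected_digest):
    """Re-derive the digest with the stored salt and compare in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected_digest)


# The same password hashed twice gives different digests, so a table
# precomputed for unsalted hashes is of no use to an attacker.
salt_a, hash_a = hash_password("correct horse battery staple")
salt_b, hash_b = hash_password("correct horse battery staple")
print(hash_a != hash_b)
print(verify_password("correct horse battery staple", salt_a, hash_a))
```

Salting does nothing for the second weakness: a password that is easy to guess remains easy to guess.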
Adobe Crossword
Pros and Cons Of These Methods
• In theory, biometrics is a great way to authenticate a user. It’s impossible to lose your fingerprints, unless you have both your hands chopped off.
The Best Solution
• Multi-factor Authentication. A security measure that requires two or more kinds of evidence that you are who you say you are.
• Authentication requires a combination of these bits of evidence rather than simply using one or the other.
• Something you know – Username, Password
• Something you have – An RSA Key, Credit Card
• Something inherent – A fingerprint, retina scan
• Multi-factor Authentication is very secure, but it is hard to implement everywhere.
• Requires users to remember to carry their RSA keys with them.
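As a sketch of how a "something you have" factor can be combined with a password, the code below implements time-based one-time passwords (TOTP, RFC 6238) with the standard library. TOTP is a substitute chosen for illustration; the slides mention RSA keys, not TOTP. The authenticate function and its password_ok argument are hypothetical, and the secret is the published RFC test key, not a real credential.

```python
import hashlib
import hmac
import struct
import time


def hotp(secret, counter, digits=6):
    """HMAC-based one-time password (RFC 4226)."""
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F  # dynamic truncation offset
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


def totp(secret, at_time=None, step=30, digits=6):
    """Time-based one-time password: HOTP over the current 30-second window."""
    if at_time is None:
        at_time = time.time()
    return hotp(secret, int(at_time) // step, digits)


def authenticate(password_ok, submitted_code, secret, at_time=None):
    """Both factors must pass: the password check AND the device code."""
    expected = totp(secret, at_time)
    return password_ok and hmac.compare_digest(submitted_code, expected)


secret = b"12345678901234567890"  # RFC test secret, for illustration only
code_from_device = totp(secret, at_time=59)
print(authenticate(True, code_from_device, secret, at_time=59))
print(authenticate(False, code_from_device, secret, at_time=59))
```

The code changes every 30 seconds, so a stolen password on its own is not enough to log in.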
Emerging Methods Of Authentication
• YubiKey – Authentication method based on a unique physical token which cannot be duplicated or recorded, providing a credential based on something only an authorised user possesses.
• Can also be used with password managers such as LastPass
How Does YubiKey Work?
Quantum Cryptography
What is Quantum Cryptography?
The use of quantum mechanical effects to perform cryptographic tasks or to break cryptographic systems
What does that mean exactly?
• Using physics rather than mathematics to perform cryptographic tasks, such as generating cryptographic keys
• Moreover, quantum cryptography addresses the problem of key distribution
Quantum Cryptography
How does it work?
• It works by using a technique called Quantum Key Distribution (QKD). QKD enables two parties to produce a shared random secret key which is known only to them. They can then use this key to encrypt and decrypt messages passed between them.
• Keys are generated using photons, which are produced by LEDs. These photons are polarised using polarising filters and then transmitted.
• The two parties decide which filters are going to be used, and assign a value, usually a binary value, to each photon with a given polarisation.
• Once the whole transmission is complete, a unique shared key has been produced.
Quantum Cryptography
What are the benefits of using Quantum Cryptography?
• An important property of quantum cryptography is the ability to detect the presence of a third party attempting to eavesdrop on the transmission of the secret key
• This is achieved because of a fundamental principle of quantum mechanics – the process of measuring a quantum system in general disturbs the system.
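The key exchange and the eavesdropping check can be sketched as a toy simulation. The slides do not name a protocol; the code assumes a BB84-style scheme with two filter orientations ("+" and "x"), which matches the polarised-photon description. It models only the statistics, not the physics, and all names are illustrative.

```python
import random


def exchange_key(n_photons, eavesdrop, seed=0):
    """Simulate one key exchange; return (sender_key, receiver_key)."""
    rng = random.Random(seed)
    sender_key, receiver_key = [], []
    for _ in range(n_photons):
        bit = rng.randint(0, 1)
        sender_basis = rng.choice("+x")  # filter used to polarise the photon
        photon_bit, photon_basis = bit, sender_basis

        if eavesdrop:
            # Intercept and resend: measuring with the wrong filter
            # disturbs the photon and gives a random result.
            spy_basis = rng.choice("+x")
            if spy_basis != photon_basis:
                photon_bit = rng.randint(0, 1)
            photon_basis = spy_basis

        receiver_basis = rng.choice("+x")
        if receiver_basis == photon_basis:
            measured = photon_bit
        else:
            measured = rng.randint(0, 1)

        # Sifting: keep only photons where both parties used the same filter.
        if receiver_basis == sender_basis:
            sender_key.append(bit)
            receiver_key.append(measured)
    return sender_key, receiver_key


def error_rate(key_a, key_b):
    """Fraction of positions where the two sifted keys disagree."""
    return sum(a != b for a, b in zip(key_a, key_b)) / len(key_a)


clean_a, clean_b = exchange_key(2000, eavesdrop=False)
tapped_a, tapped_b = exchange_key(2000, eavesdrop=True)
print(f"error rate, no eavesdropper:   {error_rate(clean_a, clean_b):.2f}")
print(f"error rate, with eavesdropper: {error_rate(tapped_a, tapped_b):.2f}")
```

Without an eavesdropper the sifted keys agree exactly. With one, roughly a quarter of the sifted bits are expected to disagree in this model, which the two parties can detect by comparing a sample of their keys.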
Questions
References
1. http://www.technologyreview.com/view/519851/the-big-data-conundrum-how-to-define-it/
2. http://en.wikipedia.org/wiki/Big_data#cite_note-15
3. 2013 Cost Of Data Breach Study: United Kingdom [Ponemon Institute, May 2013]
4. http://www.yubico.com/products/yubikey-hardware/yubikey/technical-description/
5. https://wiki.archlinux.org/index.php/yubikey#How_does_it_work
Upcoming Seminars
• Capturing the Real Value of IT Service Management - Friday 14th February
• Preparing for BYOD & Mobile Device Management - Friday 28th February