INTRODUCTION TO
BIG DATA MANAGEMENT
Björn Þór Jónsson
CRESS and School of Computer Science,
Reykjavík University
Requirements?
The Three “V”s (and Four More)
Volume
Velocity
Variety
Veracity
Validity
Viability
Value
The Five “W”s
Who?
Where?
When?
Why?
What?
Identification
Introspection
Integration
Immutability
| | SMALL DATA | BIG DATA |
|---|---|---|
| GOAL | Specific questions | Broad concerns |
| LOCATION | One location | Many locations |
| STRUCTURE | Structured | Varied, unstructured |
| SOURCE | Single user | Many providers |
| LONGEVITY | Transient | Durable |
| MEASUREMENTS | Focused | Broad |
| REPRODUCIBILITY | Can be recreated | Gone if not captured |
| STAKES | Small risk | Big risk |
| INTROSPECTION | Simple | Metadata is vital |
| ANALYSIS | Complete | Incremental |
© Philippe Bonnet, 2014
Big data is not a product, but a collection of processes
[Figure: process diagram. Labels: Big Data Resource; Data Maintenance; Data Preparation; Data Integration; Data Collection; Data Cleaning; ETL; Federation; DBs; Docs; Feeds; Analog; Data Analysis; Data Mining; Long-term Archival]
Components
Consistent Hashing
Replication (N)
Tunable Consistency (R + W > N for overlapping quorums; R + W ≤ N allows stale reads)
Queries (MapReduce or SQL variants)
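As a rough illustration of how the first three components fit together, the sketch below places nodes on a hash ring, picks the first N distinct nodes clockwise from a key as its replicas, and checks whether R + W exceeds N, the condition under which every read quorum is guaranteed to share a replica with every write quorum (with R + W ≤ N, a read may miss the latest write). The class name `HashRing`, the use of MD5, the virtual-node count, and the node names are all assumptions made for illustration, not taken from the slides or from any particular system.

```python
import bisect
import hashlib


def ring_hash(value: str) -> int:
    """Map a string to a point on the ring (MD5 is an arbitrary choice here)."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Hypothetical consistent-hash ring with virtual nodes."""

    def __init__(self, nodes, vnodes=64):
        self._points = []  # sorted list of (ring position, node name)
        for node in nodes:
            self.add_node(node, vnodes)

    def add_node(self, node, vnodes=64):
        # Each physical node owns several points to even out the load.
        for i in range(vnodes):
            bisect.insort(self._points, (ring_hash(f"{node}#{i}"), node))

    def replicas(self, key, n):
        """Return the first n distinct nodes clockwise from the key's position."""
        start = bisect.bisect(self._points, (ring_hash(key), ""))
        found = []
        for offset in range(len(self._points)):
            node = self._points[(start + offset) % len(self._points)][1]
            if node not in found:
                found.append(node)
                if len(found) == n:
                    break
        return found


def quorums_overlap(n, r, w):
    """True when every read quorum must share a replica with every write quorum."""
    return r + w > n


# Usage: adding a node only moves keys onto the new node.
ring = HashRing(["node-a", "node-b", "node-c", "node-d"])
keys = [f"user:{i}" for i in range(1000)]
before = {k: ring.replicas(k, 1)[0] for k in keys}
ring.add_node("node-e")
after = {k: ring.replicas(k, 1)[0] for k in keys}
moved = [k for k in keys if before[k] != after[k]]
print(len(moved), "of", len(keys), "keys changed owner")
print(ring.replicas("user:42", 3))
print(quorums_overlap(3, 2, 2), quorums_overlap(3, 1, 1))  # True False
```

The point of the ring is visible in the usage lines: every key that changes owner moves to the newly added node, while all other keys stay where they were, which a plain `hash(key) % number_of_nodes` scheme would not achieve.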
SQL variants
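To make the contrast between the two query styles concrete, the sketch below computes the same per-key count twice: once as explicit map, shuffle, and reduce steps in plain Python, and once as a SQL `GROUP BY` using the standard-library `sqlite3` module. The sample records and all function names are made up for illustration. Real MapReduce frameworks and SQL-style big data engines distribute these steps across many machines, which this single-process sketch does not attempt.

```python
import sqlite3
from collections import defaultdict

# Hypothetical sample records: (source type, document id).
records = [("feed", 1), ("doc", 2), ("feed", 3), ("db", 4), ("feed", 5), ("doc", 6)]


def map_phase(record):
    """Emit one (key, 1) pair per record."""
    source, _doc_id = record
    yield source, 1


def shuffle(pairs):
    """Group emitted values by key, as the framework would between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Collapse all values for one key into a single count."""
    return key, sum(values)


def count_with_mapreduce(rows):
    pairs = (pair for row in rows for pair in map_phase(row))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


def count_with_sql(rows):
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE records (source TEXT, doc_id INTEGER)")
    conn.executemany("INSERT INTO records VALUES (?, ?)", rows)
    result = dict(conn.execute("SELECT source, COUNT(*) FROM records GROUP BY source"))
    conn.close()
    return result


print(count_with_mapreduce(records))
print(count_with_sql(records))
```

Both calls yield the same counts per source type; the SQL form states what is wanted, while the MapReduce form spells out how the grouping is carried out.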
Sources
• http://jobs.aol.com/articles/2011/08/10/data-scientist-the-hottest-job-you-havent-heard-of/
• http://en.wikipedia.org/wiki/Data_science
• http://en.wikipedia.org/wiki/MapReduce
• http://en.wikipedia.org/wiki/Big_data
• http://en.wikipedia.org/wiki/List_of_countries_by_population
• http://www.delphianalytics.net/wp-content/uploads/2013/04/GrowthOfDataVsDataAnalysts.png
• http://media.economist.com/images/20100227/201009SRC696.gif
• http://www.datasciencecentral.com/profiles/blogs/structured-vs-unstructured-data-the-rise-of-data-anarchy
• http://www.zerohedge.com/sites/default/files/images/user5/imageroot/2012/10-2/Food%20For%20Thoughts.jpg
• http://www.theguardian.com/news/datablog/2012/mar/09/big-data-theory
• http://blogs-images.forbes.com/davefeinleib/files/2012/07/Big-Data-Trends.0031.png
• http://www.slideshare.net/4Neha/big-data-15681560
• http://www.mimul.com/pebble/default/images/blog/cloud/nosql_cap04.png
• http://www.ibmbigdatahub.com/sites/default/files/infographic_file/4-Vs-of-big-data.jpg
• http://reflectionsblog.emc.com/2012/06/scientific-big-data/
• http://go.nutanix.com/rs/nutanix/images/CAP_Diagram_dist-copy.jpg
• http://www.paperplanes.de/2011/12/9/the-magic-of-consistent-hashing.html
• Jules J. Berman, Principles of Big Data, Morgan Kaufmann, 2013
• Research papers, Wikipedia, …