Big data overview

Post on 23-Jan-2018

Big Data Overview

Crucial Questions

○ What is Big Data?

○ Big Data in real life.

○ Big Data problems & challenges?

○ Ways to solve these problems.

○ Big Data tools and technologies.

What is Big Data?

○ Too big to fit in memory

○ Too fast data acquisition requirements

○ Too fast data processing

○ Too complex for traditional processing

Big Data: numbers & facts

701,389 Facebook logins

38,194 posts to Instagram

2.4 Million search queries

2.78 Million video views

Example: car fleet management

○ 1M car profiles

○ Daily reports

○ Track position by request

○ Keep history in database

Real-time car fleet management

○ 1K cars connected in real time

○ Gather data via OBD2 scanners in real time

○ Gather data from cars' GPS sensors in real time

○ Store the data for future processing

○ Real-time calculation to predict traffic, engine problems, accidents

What can we do with (Big) Data?

○ Data ingestion & acquisition

○ Data storage (search, transfer, sharing)

○ Data processing & analysis

○ Data visualization

Data Ingestion & Acquisition

○ Extract: RDBMS, file systems, messaging systems, sensors, log files

○ Transform: filter, encode/decode, aggregate, validate

○ Load: data warehouse, messaging system
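The three steps above can be sketched as a tiny pipeline. This is only an illustration under stated assumptions: the log lines, the field names `car_id` and `speed_kmh`, and the in-memory dictionary standing in for a data warehouse are all hypothetical and not taken from the slides.

```python
import json
from collections import defaultdict

# Hypothetical raw input; a real source could be a log file, an RDBMS, or a queue.
RAW_LINES = [
    '{"car_id": "A1", "speed_kmh": 62}',
    '{"car_id": "A1", "speed_kmh": 58}',
    'not json',
    '{"car_id": "B7", "speed_kmh": -5}',
    '{"car_id": "B7", "speed_kmh": 90}',
]


def extract(lines):
    """Extract: decode each raw line, skipping lines that are not valid JSON."""
    for line in lines:
        try:
            yield json.loads(line)
        except json.JSONDecodeError:
            continue


def transform(records):
    """Transform: validate and filter readings, then aggregate the mean speed per car."""
    totals = defaultdict(lambda: [0, 0])  # car_id -> [sum of speeds, count]
    for rec in records:
        speed = rec.get("speed_kmh")
        if not isinstance(speed, (int, float)) or speed < 0:
            continue  # drop invalid readings
        totals[rec["car_id"]][0] += speed
        totals[rec["car_id"]][1] += 1
    return {car: total / count for car, (total, count) in totals.items()}


def load(aggregates, warehouse):
    """Load: write the aggregates into the target store (a dict stands in for a warehouse)."""
    warehouse.update(aggregates)
    return warehouse


warehouse = load(transform(extract(RAW_LINES)), {})
print(warehouse)
```

Because `extract` is a generator, records flow through the pipeline one at a time, so the raw input never has to fit in memory as a whole.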

Data Storage

Big Data storage challenges:

○ Size (keep and search huge amount of data)

○ Speed (data acquisition, data search)

○ Availability (fault tolerance, partition tolerance)

CAP Theorem

○ Consistency: all nodes see the same data at the same time

○ Availability: every request gets a response (success or failure)

○ Partition tolerance: the system keeps working despite network failures

Streaming vs Batch processing

[Diagram: data processed as batches vs. as a stream]
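One way to see the difference is to compute the same statistic both ways. The sketch below is a hypothetical illustration (the speed readings and the names `batch_average` and `StreamingAverage` are assumptions, not from the slides): the batch version needs the whole data set before it can answer, while the streaming version updates its answer as each reading arrives.

```python
def batch_average(readings):
    """Batch: wait until all data is collected, then process it in one pass."""
    return sum(readings) / len(readings)


class StreamingAverage:
    """Stream: keep a small running state and update it per incoming event."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0

    def update(self, value):
        # Incremental mean: no need to store earlier readings.
        self.count += 1
        self.mean += (value - self.mean) / self.count
        return self.mean


readings = [60, 62, 58, 64]  # hypothetical speed readings

batch_result = batch_average(readings)

stream = StreamingAverage()
running = [stream.update(value) for value in readings]

print(batch_result)  # single answer once the batch is complete
print(running)       # an up-to-date answer after every reading
```

Both approaches agree once all readings have been seen. The trade-off is latency versus state: the stream gives an answer at every step but can only use what fits in its running state.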

Data processing: Lambda Architecture

Data processing: Kappa Architecture

Data processing: MapReduce
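As a rough illustration of the MapReduce idea, here is a single-process word count split into map, shuffle, and reduce phases. It is a minimal sketch, not a distributed implementation: the sample documents and function names are hypothetical, and a real framework would run the map and reduce phases in parallel across many machines.

```python
from collections import defaultdict


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one document."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single result."""
    return key, sum(values)


documents = ["big data big tools", "data tools data"]  # hypothetical input

mapped = [pair for doc in documents for pair in map_phase(doc)]
counts = dict(reduce_phase(key, values) for key, values in shuffle(mapped).items())

print(counts)
```

Each call to `map_phase` depends only on its own document, and each call to `reduce_phase` only on its own key, which is what makes both phases easy to distribute.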

Data Visualization

Everything as a Service

Example: Amazon Web Services

Q & A ?

○ What is Big Data?

○ Big Data in real life.

○ Big Data problems & challenges?

○ Ways to solve these problems.

○ Big Data tools and technologies.