ABDW17-Lightning Talks track-Thinking in Streaming

Post on 12-Apr-2017



1

Thinking in Streaming

Yogi Devendra yogidevendra@apache.org

2

Outline

● What?

● Why?

● How?

3

What is streaming?

Image ref: [1]

Streaming

“Process on-the-go”

4

Why streaming?

• Reducing the end-to-end latency
• Effective use of resources

5

Significance

Image ref: [2]

6

Traffic Signal vs Flyover

Image ref: [3]

7

Stock exchange

Auction style trading (Image ref: [4])

Online trading (Image ref: [5])

8

Taxi booking

Cab drivers waiting for customers (Image ref: [6])

Riders waiting for a cab (Image ref: [7])

Ride on the go (Image ref: [8])

9

Key changes

• Infrastructure availability
  • CPU, disks, memory
  • Distributed systems
• Datasets
  • Bigger datasets
  • More data sources
  • Ongoing data
• SLAs
  • Results expected as soon as possible
• Newer types of data sources
  • Kafka
  • REST
  • WebSockets

10

Problem statement

Batch

● Input
  ○ 1MB / record
  ○ 1TB data / day
  ○ Monthly/Daily file
● Output
  ○ Process the entire dataset and give the answer at the end.
  ○ E.g.: Aggregate CDR for customer billing, generate detailed billing report

Streaming

● Input
  ○ 1MB / record
  ○ 1 Million records / sec
  ○ Continuous stream
● Output
  ○ Process data as it comes and output the current answer.
  ○ E.g.: User dashboard with current billing.
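The contrast between the two columns can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the CDR records, the customer names, and the function names `batch_billing` and `streaming_billing` are hypothetical and not from the talk.

```python
from collections import defaultdict

# Hypothetical CDR records: (customer_id, call_cost). Values are made up.
CDRS = [("alice", 2.0), ("bob", 1.5), ("alice", 0.5), ("bob", 3.0)]

def batch_billing(records):
    """Batch: read the whole dataset, then give the answer once at the end."""
    totals = defaultdict(float)
    for customer, cost in records:
        totals[customer] += cost
    return dict(totals)

def streaming_billing(records):
    """Streaming: update totals per record and emit the current answer."""
    totals = defaultdict(float)
    for customer, cost in records:
        totals[customer] += cost
        # Current bill for this customer, e.g. to refresh a user dashboard.
        yield customer, totals[customer]

final_report = batch_billing(CDRS)
dashboard_updates = list(streaming_billing(CDRS))
```

Both functions do the same arithmetic; the difference is when an answer becomes available. The batch version answers only after the last record, while the streaming version has a usable answer after every record.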

11

Characteristics of streaming

• On the go ⇒ Incremental logic
• while (true):
      current state = function(current state, current record)
      result = function(current state)
• Need to store only the current state (NOT DATA!!)
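A minimal Python sketch of the incremental loop above, using a running average as the example. The choice of a running average, the `(count, total)` state, and the names `update`, `result`, and `run` are assumptions made for illustration; a finite list stands in for the endless stream.

```python
def update(state, record):
    """current state = function(current state, current record)."""
    count, total = state
    return count + 1, total + record

def result(state):
    """result = function(current state)."""
    count, total = state
    return total / count if count else None

def run(stream):
    # Only this small tuple is kept; the records themselves are not stored.
    state = (0, 0.0)
    for record in stream:  # stands in for `while (true)` on an endless stream
        state = update(state, record)
        yield result(state)

averages = list(run([10, 20, 30]))
```

The state stays the same size no matter how many records arrive, which is the point of storing the current state rather than the data.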

12

Example: TV advertising

Current:
• Rates are based on TRP ratings
• TRP ratings are offline
• Bids for ad slots happen offline

13

Impact of TRP on-the-go

14

How to stream?

[Diagram labels: Set top box, Service provider 1, Service provider 2, Publish Viewership, Viewership of ESPN, Ad Server, Ad (×5), Bid for Ad slot, To Broadcaster]

15

Streaming pipeline

[Diagram stages: Read from Input Source, Cleanse the Data, Aggregate Data, Viewership Output, Ad Server, Ad Bidding Input, Accept Winner, Output to Broadcaster]
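The first three stages of the pipeline (Read from Input Source, Cleanse the Data, Aggregate Data) can be sketched as chained Python generators. This is a rough illustration, not the speaker's implementation: the event fields `box` and `channel`, the sample events, and the cleansing rule are all assumptions.

```python
from collections import Counter

def read_input(source):
    """Stage 1: read raw events from an input source (here, any iterable)."""
    for event in source:
        yield event

def cleanse(events):
    """Stage 2: drop events without a channel and normalise the channel name."""
    for event in events:
        channel = event.get("channel")
        if channel:
            yield {"box": event.get("box"), "channel": channel.strip().upper()}

def aggregate(events):
    """Stage 3: keep a running event count per channel, emit a snapshot each time."""
    counts = Counter()
    for event in events:
        counts[event["channel"]] += 1
        yield dict(counts)

# Hypothetical set top box events; one is malformed on purpose.
raw = [
    {"box": 1, "channel": "espn"},
    {"box": 2, "channel": None},
    {"box": 3, "channel": " ESPN "},
    {"box": 4, "channel": "news"},
]
snapshots = list(aggregate(cleanse(read_input(raw))))
```

Each stage pulls one event at a time from the previous one, so a viewership snapshot is available as soon as an event passes through, and stages can be replaced independently.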

16

Key considerations

• Incremental logic
• Scalability, fault tolerance
• Modular design
• No. of channels
• No. of users
• Operability

17

Further possibilities

• Real-time bidding for ad slots based on viewership
• Pay-per-use billing plans for users
• User profiling, targeted ads

18

Streaming : Game Changer

• Streaming adds value to existing applications by reducing latencies.

• Streaming opens up more possibilities for applications.

Image ref: [9]

19

Questions


20

References

1. One piece flow LuciArsene slideshare | https://www.slideshare.net/LuciArsene/leandlarsene
2. Are you series Boomsbeat | http://images.boomsbeat.com/
3. Flyovers World Amazing pictures | https://worldamazingpictures.wordpress.com
4. NYSE Alamy | http://www.alamy.com/
5. NASDAQ Goforex | http://www.goforex.eu/
6. Cabs driver colebegone | http://colebegone.blogspot.in/
7. Riders frpo | https://www.frpo.org/lobby-view/
8. App based booking Officechai | https://officechai.com/news/
9. Game changer gurussolutions | https://gurussolutions.com/