Fast Cars, Big Data - How Streaming Can Help Formula 1 - Tugdual Grall - Codemotion Rome 2017

46
Fast Cars, Big Data How Streaming Can Help Formula1 ? Tugdual Grall - MapR - @tgrall ROME 24-25 MARCH 2017

Transcript of Fast Cars, Big Data - How Streaming Can Help Formula 1 - Tugdual Grall - Codemotion Rome 2017

Page 1: Fast Cars, Big Data - How Streaming Can Help Formula 1 - Tugdual Grall - Codemotion Rome 2017

Fast Cars, Big Data How Streaming Can Help Formula1 ?Tugdual Grall - MapR - @tgrall

ROME 24-25 MARCH 2017

Page 2: Fast Cars, Big Data - How Streaming Can Help Formula 1 - Tugdual Grall - Codemotion Rome 2017

© 2017 MapR TechnologiesMapR Confidential 2

Agenda

• What’s the point of data in motorsports? • Demo • Architecture • What’s next? • Q&A

Page 3: Fast Cars, Big Data - How Streaming Can Help Formula 1 - Tugdual Grall - Codemotion Rome 2017

© 2017 MapR TechnologiesMapR Confidential 3

What’s the point of data in motorsports?

Page 4: Fast Cars, Big Data - How Streaming Can Help Formula 1 - Tugdual Grall - Codemotion Rome 2017

© 2017 MapR TechnologiesMapR Confidential 4

Page 5: Fast Cars, Big Data - How Streaming Can Help Formula 1 - Tugdual Grall - Codemotion Rome 2017

© 2017 MapR TechnologiesMapR Confidential 5

Data in Motorsports

F1 Framework - http://f1framework.blogspot.de/2013/08/short-guide-to-f1-telemetry-spa-circuit.html

Page 6: Fast Cars, Big Data - How Streaming Can Help Formula 1 - Tugdual Grall - Codemotion Rome 2017

© 2017 MapR TechnologiesMapR Confidential 6

Race Strategy

F1 Framework - http://f1framework.blogspot.be/2013/08/race-strategy-explained.html

Page 7: Fast Cars, Big Data - How Streaming Can Help Formula 1 - Tugdual Grall - Codemotion Rome 2017

© 2017 MapR TechnologiesMapR Confidential 7

Got examples?

Page 8: Fast Cars, Big Data - How Streaming Can Help Formula 1 - Tugdual Grall - Codemotion Rome 2017

© 2017 MapR TechnologiesMapR Confidential 8

Got examples?

• Up to 300 sensors per car

Page 9: Fast Cars, Big Data - How Streaming Can Help Formula 1 - Tugdual Grall - Codemotion Rome 2017

© 2017 MapR TechnologiesMapR Confidential 9

Got examples?

• Up to 300 sensors per car • Up to 2000 channels

Page 10: Fast Cars, Big Data - How Streaming Can Help Formula 1 - Tugdual Grall - Codemotion Rome 2017

© 2017 MapR TechnologiesMapR Confidential 10

Got examples?

• Up to 300 sensors per car • Up to 2000 channels • Sensor data are sent to the paddock in 2ms

Page 11: Fast Cars, Big Data - How Streaming Can Help Formula 1 - Tugdual Grall - Codemotion Rome 2017

© 2017 MapR TechnologiesMapR Confidential 11

Got examples?

• Up to 300 sensors per car • Up to 2000 channels • Sensor data are sent to the paddock in 2ms • 1.5 billions of data points for a race

Page 12: Fast Cars, Big Data - How Streaming Can Help Formula 1 - Tugdual Grall - Codemotion Rome 2017

© 2017 MapR TechnologiesMapR Confidential 12

Got examples?

• Up to 300 sensors per car • Up to 2000 channels • Sensor data are sent to the paddock in 2ms • 1.5 billions of data points for a race • 5 billions for a full race weekend

Page 13: Fast Cars, Big Data - How Streaming Can Help Formula 1 - Tugdual Grall - Codemotion Rome 2017

© 2017 MapR TechnologiesMapR Confidential 13

Got examples?

• Up to 300 sensors per car • Up to 2000 channels • Sensor data are sent to the paddock in 2ms • 1.5 billions of data points for a race • 5 billions for a full race weekend • 5/6Gb of compressed data per car for 90mn

Page 14: Fast Cars, Big Data - How Streaming Can Help Formula 1 - Tugdual Grall - Codemotion Rome 2017

© 2017 MapR TechnologiesMapR Confidential 14

Got examples?

• Up to 300 sensors per car • Up to 2000 channels • Sensor data are sent to the paddock in 2ms • 1.5 billions of data points for a race • 5 billions for a full race weekend • 5/6Gb of compressed data per car for 90mn

US Grand Prix 2014 : 243 Tb (race teams combined)

Page 15: Fast Cars, Big Data - How Streaming Can Help Formula 1 - Tugdual Grall - Codemotion Rome 2017

© 2017 MapR TechnologiesMapR Confidential 15

How does that work?

Page 16: Fast Cars, Big Data - How Streaming Can Help Formula 1 - Tugdual Grall - Codemotion Rome 2017

© 2017 MapR TechnologiesMapR Confidential 16

Production System Outline

Page 17: Fast Cars, Big Data - How Streaming Can Help Formula 1 - Tugdual Grall - Codemotion Rome 2017

© 2017 MapR TechnologiesMapR Confidential 17

Simplified Demo System Outline

Archive MapR DB

Jetty /Bootstrap /

d3

Apache Drill (SQL access)

TORCS race simulator

MapR Streams

Page 18: Fast Cars, Big Data - How Streaming Can Help Formula 1 - Tugdual Grall - Codemotion Rome 2017

© 2017 MapR TechnologiesMapR Confidential 18

TORCS for Cars, Physics & Drivers

• The Open Racing Car Simulator – http://torcs.sourceforge.net/

• Car racing game • AI & Research Platform

Page 19: Fast Cars, Big Data - How Streaming Can Help Formula 1 - Tugdual Grall - Codemotion Rome 2017

© 2017 MapR TechnologiesMapR Confidential 19

Demo

Page 20: Fast Cars, Big Data - How Streaming Can Help Formula 1 - Tugdual Grall - Codemotion Rome 2017

© 2017 MapR TechnologiesMapR Confidential 20

Architecture

Events Producers Events Consumers

sensors data

Real Time

Analytics

Analytics with SQL

https://github.com/mapr-demos/racing-time-series

Page 21: Fast Cars, Big Data - How Streaming Can Help Formula 1 - Tugdual Grall - Codemotion Rome 2017

© 2017 MapR TechnologiesMapR Confidential 21

Architecture

Events Producers Events Consumers

sensors data

Real Time

Analytics

https://github.com/mapr-demos/racing-time-series

JSONProducer

Consumer

Page 22: Fast Cars, Big Data - How Streaming Can Help Formula 1 - Tugdual Grall - Codemotion Rome 2017

© 2017 MapR TechnologiesMapR Confidential 22

Where/How to store data ?

Page 23: Fast Cars, Big Data - How Streaming Can Help Formula 1 - Tugdual Grall - Codemotion Rome 2017

© 2017 MapR TechnologiesMapR Confidential 23

Big Datastore

Distributed File SystemHDFS/MapR-FS

NoSQL DatabaseHBase/MapR-DB

….

Page 24: Fast Cars, Big Data - How Streaming Can Help Formula 1 - Tugdual Grall - Codemotion Rome 2017

© 2017 MapR TechnologiesMapR Confidential 24

Data Streaming

Page 25: Fast Cars, Big Data - How Streaming Can Help Formula 1 - Tugdual Grall - Codemotion Rome 2017

© 2017 MapR TechnologiesMapR Confidential 25

Data Streaming Moving millions of events per h|mn|s

Page 26: Fast Cars, Big Data - How Streaming Can Help Formula 1 - Tugdual Grall - Codemotion Rome 2017

© 2017 MapR TechnologiesMapR Confidential 26

What is Kafka?

• http://kafka.apache.org/ • Created at LinkedIn, open sourced in 2011 • Implemented in Scala / Java • Distributed messaging system built to scale

Page 27: Fast Cars, Big Data - How Streaming Can Help Formula 1 - Tugdual Grall - Codemotion Rome 2017

© 2017 MapR TechnologiesMapR Confidential 27

Big Picture

Producer

Producer

Producer

Consumer

Consumer

Consumer

Page 28: Fast Cars, Big Data - How Streaming Can Help Formula 1 - Tugdual Grall - Codemotion Rome 2017

© 2017 MapR TechnologiesMapR Confidential 28

Topic & Partitions

0 1 2 3 4 5 6 7 8

0 1 2 3 4 5

0 1 2 3 4 5 6 7

Partition 0

Partition 1

Partition 2

Writes

Page 29: Fast Cars, Big Data - How Streaming Can Help Formula 1 - Tugdual Grall - Codemotion Rome 2017

© 2017 MapR TechnologiesMapR Confidential 29

Consumer Groups

Page 30: Fast Cars, Big Data - How Streaming Can Help Formula 1 - Tugdual Grall - Codemotion Rome 2017

© 2017 MapR TechnologiesMapR Confidential 30

More real life Kafka …

Zookeeper

Broker 1

Topic A Topic B

Broker 2

Topic A Topic B

Broker 3

Topic A Topic B

Producer

Producer

Producer

Consumer

Consumer

Consumer

Page 31: Fast Cars, Big Data - How Streaming Can Help Formula 1 - Tugdual Grall - Codemotion Rome 2017

© 2017 MapR TechnologiesMapR Confidential 31

MapR Streams

• Distributed messaging system built to scale • Use Apache Kafka API 0.9.0 • No code change • Does not use the same “broker” architecture

– Log stored in MapR Storage (Scalable, Secured, Fast, Multi DC) – No Zookeeper

Page 32: Fast Cars, Big Data - How Streaming Can Help Formula 1 - Tugdual Grall - Codemotion Rome 2017

© 2017 MapR TechnologiesMapR Confidential 32

Produce Messages

ProducerRecord<String, byte[]> rec = new ProducerRecord<>( “/apps/racing/stream:sensors_data“, eventName, value.toString().getBytes());

producer.send(rec, (recordMetadata, e) -> { if (e != null) { … });

producer.flush();

Page 33: Fast Cars, Big Data - How Streaming Can Help Formula 1 - Tugdual Grall - Codemotion Rome 2017

© 2017 MapR TechnologiesMapR Confidential 33

Consume Messages

long pollTimeOut = 800; while(true) {

ConsumerRecords<String, String> records = consumer.poll(pollTimeOut); if (!records.isEmpty()) {

Iterable<ConsumerRecord<String, String>> iterable = records::iterator; StreamSupport

.stream(iterable.spliterator(), false)

.forEach((record) -> { // work with record object record.value();…

}); consumer.commitAsync(); } }

Page 34: Fast Cars, Big Data - How Streaming Can Help Formula 1 - Tugdual Grall - Codemotion Rome 2017

© 2017 MapR TechnologiesMapR Confidential 34

What’s next?

Page 35: Fast Cars, Big Data - How Streaming Can Help Formula 1 - Tugdual Grall - Codemotion Rome 2017

© 2017 MapR TechnologiesMapR Confidential 35

Sensor Data V1

• 3 main data points: – Speed (m/s) – RPM – Distance (m)

• Buffered

{ "_id":"1.458141858E9/0.324", "car" = "car1", "timestamp":1458141858, "racetime”:0.324, "records": [ { "sensors":{ "Speed":3.588583, "Distance":2003.023071, "RPM":1896.575806 }, "racetime":0.324, "timestamp":1458141858 }, { "sensors":{ "Speed":6.755624, "Distance":2004.084717, "RPM":1673.264526 }, "racetime":0.556, "timestamp":1458141858 },

Page 36: Fast Cars, Big Data - How Streaming Can Help Formula 1 - Tugdual Grall - Codemotion Rome 2017

© 2017 MapR TechnologiesMapR Confidential 36

Capture more data

Page 37: Fast Cars, Big Data - How Streaming Can Help Formula 1 - Tugdual Grall - Codemotion Rome 2017

© 2017 MapR TechnologiesMapR Confidential 37

Sensor Data V2

• 3 main data points: – Speed (m/s) – RPM – Distance (m) – Throttle – Gears – …

• Buffered

{ "_id":"1.458141858E9/0.324", "car" = "car1", "timestamp":1458141858, "racetime”:0.324, "records": [ { "sensors":{ "Speed":3.588583, "Distance":2003.023071, "RPM":1896.575806, "Throttle" : 33, "Gear" : 2 }, "racetime":0.324, "timestamp":1458141858 }, { "sensors":{ "Speed":6.755624, "Distance":2004.084717, “RPM”:1673.264526, "Throttle" : 37, "Gear" : 2 }, "racetime":0.556, "timestamp":1458141858 },

Page 38: Fast Cars, Big Data - How Streaming Can Help Formula 1 - Tugdual Grall - Codemotion Rome 2017

© 2017 MapR TechnologiesMapR Confidential 38

Add new services

Page 39: Fast Cars, Big Data - How Streaming Can Help Formula 1 - Tugdual Grall - Codemotion Rome 2017

© 2017 MapR TechnologiesMapR Confidential 39

New Data Service

sensors data

AlertsStream

Processing

Page 40: Fast Cars, Big Data - How Streaming Can Help Formula 1 - Tugdual Grall - Codemotion Rome 2017

© 2017 MapR TechnologiesMapR Confidential 40

• Cluster Computing Platform • Extends “MapReduce” with extensions

– Streaming – Interactive Analytics

• Run in Memory • http://spark.apache.org/

Page 41: Fast Cars, Big Data - How Streaming Can Help Formula 1 - Tugdual Grall - Codemotion Rome 2017

© 2017 MapR TechnologiesMapR Confidential 41

• Streaming Dataflow Engine – Datastream/Dataset APIs – CEP, Graph, ML

• Run in Memory • https://flink.apache.org/

Page 42: Fast Cars, Big Data - How Streaming Can Help Formula 1 - Tugdual Grall - Codemotion Rome 2017

© 2017 MapR TechnologiesMapR Confidential 42

DemoStream Processing with Flink

Page 43: Fast Cars, Big Data - How Streaming Can Help Formula 1 - Tugdual Grall - Codemotion Rome 2017

© 2017 MapR TechnologiesMapR Confidential 43

Streaming Architecture & Formula 1

• Stream & Process data in real time – Use distributed & scalable processing and storage – NoSQL Database, Distributed File System

• Decouple the source from the consumer(s) – Dashboard, Analytics, Machine Learning – Add new use case….

Page 44: Fast Cars, Big Data - How Streaming Can Help Formula 1 - Tugdual Grall - Codemotion Rome 2017

© 2017 MapR TechnologiesMapR Confidential 44

Streaming Architecture & Formula 1• Stream & Process data in real

time – Use distributed & scalable

processing and storage – NoSQL Database, Distributed

File System

• Decouple the source from the consumer(s) – Dashboard, Analytics, Machine

Learning – Add new use case….

This is not only about Formula 1!

(Telco, Finance, Retail, Content, IT)

Page 45: Fast Cars, Big Data - How Streaming Can Help Formula 1 - Tugdual Grall - Codemotion Rome 2017

© 2017 MapR TechnologiesMapR Confidential 45

One Last Thing…

Events Producers Events Consumers

Web Socket

https://github.com/mapr-demos/racing-time-series

BLE

MapR-StreamsApache Kafka

Page 46: Fast Cars, Big Data - How Streaming Can Help Formula 1 - Tugdual Grall - Codemotion Rome 2017

© 2017 MapR TechnologiesMapR Confidential 46