Real-Time, Geospatial, Maps by Neil Dahlke

Post on 16-Jan-2017

Real-Time, Geospatial, Maps

Neil Dahlke

29 June 2016

Agenda

▪PowerStream
▪Supercar
▪Q&A
▪Drinks

Renewable Energy in the News

BBC: http://www.bbc.com/news/science-environment-36420750

Investment in renewables reached $286 billion worldwide in 2015

Germany Just Got Almost All of Its Power From Renewable Energy

May 15, 2016

Bloomberg: http://www.bloomberg.com/news/articles/2016-05-16/germany-just-got-almost-all-of-its-power-from-renewable-energy

Denmark is aiming for 50% renewable energy sources within the next five years

Independent: http://www.independent.co.uk/environment/germany-just-got-almost-all-of-its-power-from-renewable-energy-a7037851.html

42% of electricity produced from wind turbines in 2015

The Guardian: http://www.theguardian.com/environment/2016/jan/18/denmark-broke-world-record-for-wind-power-in-2015

Portugal Runs for Four Days Straight on Renewable Energy Alone

http://www.theguardian.com/environment/2016/may/18/portugal-runs-for-four-days-straight-on-renewable-energy-alone

22% of electricity provided by wind in 2015

MemSQL PowerStream

Predicting the global health of wind turbines

Sensors

Wind Turbine Wind Farm

MemSQL PowerStream

197,000 wind turbines around the world

1 to 2 million data points per second with MemSQL Streamliner

Simulation Details

Data producers (Python programs) push to Kafka
▪1M data points per second from 200k turbines
▪Generated sensor data is based on predetermined turbine failure model

Transform models individual turbine (2 components per turbine) failures w/ machine learning, determining:
▪How fast is the turbine deteriorating?
▪How bad does the turbine get before being repaired?
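A minimal sketch of what one such Python producer might look like, assuming a simple linear wear model. The component names, field names, wear rates, and the `send` callback are all hypothetical; in the demo the messages go to Kafka, which is replaced here by a plain list so the sketch runs without a broker.

```python
import json
import random

# Hypothetical component names; the slides only say "2 components per turbine".
COMPONENTS = ("gearbox", "generator")


def make_turbines(count, seed=0):
    """Give every turbine component a predetermined wear rate per tick."""
    rng = random.Random(seed)
    return [
        {"turbine_id": i,
         "wear_rate": {c: rng.uniform(0.001, 0.01) for c in COMPONENTS}}
        for i in range(count)
    ]


def reading(turbine, component, tick, rng):
    """Build one sensor data point: health decays linearly, plus noise."""
    health = 1.0 - turbine["wear_rate"][component] * tick
    noisy = health + rng.gauss(0, 0.01)
    return {
        "turbine_id": turbine["turbine_id"],
        "component": component,
        "tick": tick,
        "value": max(0.0, min(1.0, noisy)),  # clamp to [0, 1]
    }


def produce(turbines, ticks, send, seed=1):
    """Emit one reading per component per turbine per tick through send()."""
    rng = random.Random(seed)
    for tick in range(ticks):
        for turbine in turbines:
            for component in COMPONENTS:
                send(json.dumps(reading(turbine, component, tick, rng)))


# A list stands in for the Kafka topic.
messages = []
produce(make_turbines(3), ticks=5, send=messages.append)
```

At the rates quoted on the slide, 1M data points per second from 200k turbines works out to 5 data points per turbine per second.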

How does it work?

Extract, Transform, Load

REAL-TIME INPUTS

Streamliner

REAL-TIME APPLICATION

Demo Architecture and Data Flow
▪Simulated sensor data is written to Kafka
▪Streamliner Extractor pulls data from Kafka into Spark
▪Streamliner Transformer then “scores” the failure model (ML algorithm)
• Failure model is scored by performing a regression on incoming sensor data values
▪Streamliner Loader inserts the data into MemSQL
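The three stages can be sketched in plain Python. This is not the Streamliner API: the function names, the message fields, and the use of an ordinary least-squares slope as the "score" are assumptions made for illustration, and in-memory lists stand in for both Kafka and MemSQL.

```python
import json
from collections import defaultdict


def extract(raw_messages):
    """Extract: decode JSON messages (stand-in for pulling a batch from Kafka)."""
    return [json.loads(m) for m in raw_messages]


def slope(points):
    """Least-squares slope of value against tick; 0.0 if it cannot be fitted."""
    n = len(points)
    if n < 2:
        return 0.0
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    var_x = sum((x - mean_x) ** 2 for x, _ in points)
    if var_x == 0:
        return 0.0
    cov = sum((x - mean_x) * (y - mean_y) for x, y in points)
    return cov / var_x


def transform(records):
    """Transform: regress sensor value on time for each turbine component."""
    series = defaultdict(list)
    for r in records:
        series[(r["turbine_id"], r["component"])].append((r["tick"], r["value"]))
    return [
        # A falling sensor value gives a positive deterioration rate.
        {"turbine_id": t, "component": c, "deterioration_rate": -slope(pts)}
        for (t, c), pts in sorted(series.items())
    ]


def load(rows, table):
    """Load: append scored rows (stand-in for inserting into MemSQL)."""
    table.extend(rows)


# Ten readings from one hypothetical component losing 0.02 per tick.
raw = [
    json.dumps({"turbine_id": 7, "component": "gearbox",
                "tick": t, "value": 1.0 - 0.02 * t})
    for t in range(10)
]
scored = []
load(transform(extract(raw)), scored)
```

The fitted slope speaks to the first question from the simulation ("How fast is the turbine deteriorating?"); a real model would likely use more features than time alone.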

Cluster Architecture

Aggregator Nodes

Leaf Nodes

[Diagram: eight nodes, each labeled Data Producer, Kafka, Spark, MemSQL Agg, MemSQL Leaf; plus ZooKeeper and Spark Master]

Internet-of-Things simulation depicting health of wind turbines globally.

8 machines (AWS c4.2xlarge instances) at $0.311 per hour per machine, annual cost ~$22,000.
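The annual figure follows directly from the hourly rate, assuming all eight machines run around the clock for a 365-day year:

```python
HOURLY_RATE_USD = 0.311      # per machine, as quoted on the slide
MACHINES = 8
HOURS_PER_YEAR = 24 * 365    # assumes continuous operation, non-leap year

annual_cost = HOURLY_RATE_USD * MACHINES * HOURS_PER_YEAR
print(f"${annual_cost:,.2f}")  # $21,794.88
```

That rounds to the ~$22,000 quoted.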

Visual Layer

▪MemSQL data is rendered in a web UI
• Turbine Health (green, yellow, red)
▪Draw positions of turbines on a MapBox map
• A geospatial query is sent to MemSQL each time the map view is moved
▪Alerts based on predicted turbine health
▪Data points shown on the UI map are all from real-time queries
• Real-time in this case = 1 second interval
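A rough sketch of how the map view could be turned into a geospatial query, and how a health score could be mapped to a marker color. The table `turbines`, its columns, and the color thresholds are hypothetical. `GEOGRAPHY_INTERSECTS` is used here on the understanding that it is one of MemSQL's geospatial functions; the exact query in the demo is not shown in the slides, so check the function against the MemSQL documentation before relying on it.

```python
def bounds_to_wkt(west, south, east, north):
    """Return a closed WKT polygon (longitude latitude order) for a map view."""
    corners = [(west, south), (east, south), (east, north),
               (west, north), (west, south)]  # first corner repeated to close the ring
    return "POLYGON((" + ", ".join(f"{lon} {lat}" for lon, lat in corners) + "))"


def turbines_in_view_query(west, south, east, north):
    """Build a parameterized query for turbines inside the current map view.

    Table and column names are hypothetical.
    """
    sql = (
        "SELECT turbine_id, health, location "
        "FROM turbines "
        "WHERE GEOGRAPHY_INTERSECTS(location, %s)"
    )
    return sql, (bounds_to_wkt(west, south, east, north),)


def health_color(health, yellow_below=0.7, red_below=0.4):
    """Map a 0..1 health score to a marker color (thresholds are assumptions)."""
    if health < red_below:
        return "red"
    if health < yellow_below:
        return "yellow"
    return "green"


# Example: a view covering roughly the New York City area.
sql, params = turbines_in_view_query(-74.1, 40.6, -73.7, 40.9)
```

The returned `sql` and `params` pair is meant to be passed to a MySQL-compatible client's `execute` call; re-running it whenever the map moves, or once per second, matches the refresh behavior described above.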

Demo

The On-Demand Economy

MemSQL Supercar

Real-time asset tracking and analysis

We live in an on-demand economy

Consumers are conditioned to instant services, like Uber, Stripe, and Airbnb

Where does that leave enterprises?

Racing to meet internal and external expectations for speed and personalization

Batch processing is the enterprise enemy

Enterprises must move from overnight to real-time, intra-day operations

Cluster Architecture

▪One single 16 core machine w/ 64 GB RAM is enough to handle all of the data in real time.
▪That’s really it

[Diagram: one node labeled Data Producer, Kafka, Spark, MemSQL Agg, MemSQL Leaf, ZooKeeper, Spark Master]

Simulation Details
▪NYC Taxi and Limousine Commission Trip Record Data
• Downloads available each year for free
▪Simulation utilizes dataset from New Year's Eve (one of the busiest days for cabs in NYC)
▪Drivers are assigned pickups and dropoffs from real data set
▪Routes are replayed over time
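One way to picture the replay is to interpolate each driver's position between pickup and dropoff as simulated time advances. This is a simplified sketch: the `Trip` fields are hypothetical, and straight-line interpolation is an assumption, since the slides do not say how the demo reconstructs the path between the two points.

```python
from dataclasses import dataclass


@dataclass
class Trip:
    pickup_time: float    # seconds since simulation start
    dropoff_time: float
    pickup: tuple         # (longitude, latitude)
    dropoff: tuple        # (longitude, latitude)


def position_at(trip, now):
    """Linearly interpolate the car's position; None if the trip is not active."""
    if not (trip.pickup_time <= now <= trip.dropoff_time):
        return None
    duration = trip.dropoff_time - trip.pickup_time
    frac = 0.0 if duration == 0 else (now - trip.pickup_time) / duration
    lon = trip.pickup[0] + frac * (trip.dropoff[0] - trip.pickup[0])
    lat = trip.pickup[1] + frac * (trip.dropoff[1] - trip.pickup[1])
    return (lon, lat)


def replay(trips, start, end, step=1.0):
    """Yield (time, driver_index, position) for every active trip at each step."""
    now = start
    while now <= end:
        for driver, trip in enumerate(trips):
            pos = position_at(trip, now)
            if pos is not None:
                yield now, driver, pos
        now += step


# One made-up trip, sampled every 5 simulated seconds.
trip = Trip(pickup_time=0.0, dropoff_time=10.0, pickup=(0.0, 0.0), dropoff=(1.0, 2.0))
events = list(replay([trip], start=0.0, end=10.0, step=5.0))
```

Each yielded event would become one message on the Kafka topic, in the same way the turbine sensor readings do in PowerStream.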

Extract, Transform, Load

REAL-TIME INPUTS

Streamliner

REAL-TIME APPLICATION

Demo Architecture and Data Flow
▪Simulated driver data is written to Kafka
▪Streamliner Extractor pulls data from Kafka into Spark
▪Streamliner Transformer parses the CSV and transforms it to a Spark DataFrame
▪Streamliner Loader inserts the data into MemSQL
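The transform step here is mostly parsing. A minimal sketch using only the standard library is below; the four-column schema is hypothetical (the slides do not list the message fields), and a list of dicts stands in for the Spark DataFrame the demo builds.

```python
import csv
import io

# Hypothetical message schema.
FIELDS = ("driver_id", "timestamp", "longitude", "latitude")


def parse_batch(lines):
    """Parse CSV lines into typed rows, skipping malformed lines."""
    rows = []
    for record in csv.reader(io.StringIO("\n".join(lines))):
        if len(record) != len(FIELDS):
            continue  # wrong number of columns
        try:
            rows.append({
                "driver_id": int(record[0]),
                "timestamp": record[1],
                "longitude": float(record[2]),
                "latitude": float(record[3]),
            })
        except ValueError:
            continue  # non-numeric id or coordinate
    return rows


def to_point_wkt(row):
    """Format a row's position as a WKT point (longitude first)."""
    return f"POINT({row['longitude']} {row['latitude']})"


batch = [
    "1,2015-12-31 23:59:00,-73.98,40.75",
    "bad line",
    "2,2016-01-01 00:00:01,-73.95,40.78",
]
rows = parse_batch(batch)
```

Skipping bad lines rather than raising keeps one malformed message from stalling a streaming batch; whether the demo does the same is not stated.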

Demo

Q&A

Resources
▪PowerStream blog post: http://blog.memsql.com/powerstream-demo/
▪PowerStream recording: https://youtu.be/DhP324uNZMI?t=589
▪Supercar blog post: http://blog.memsql.com/real-time-geospatial-intelligence-with-supercar/
▪Supercar recording: https://www.youtube.com/watch?v=2txICCLUV-Y

▪Today’s talks will be published soon.

Thank You