Flexible Network Analytics in the Cloud -...
Transcript of Flexible Network Analytics in the Cloud -...
![Page 1: Flexible Network Analytics in the Cloud - CHI-NOGchinog.org/wp-content/uploads/2017/05/2.-Flexible... · 2017-05-24 · Unified - One model handles batch and streaming use cases.](https://reader034.fdocuments.in/reader034/viewer/2022042219/5ec5daa0148dbc039436db73/html5/thumbnails/1.jpg)
Flexible Network Analytics in the Cloud
Jon M. DuganSoftware Engineering GroupESnet (Energy Sciences Network)Lawrence Berkeley National Lab
CHI-NOG 07May 18, 2017
![Page 2: Flexible Network Analytics in the Cloud - CHI-NOGchinog.org/wp-content/uploads/2017/05/2.-Flexible... · 2017-05-24 · Unified - One model handles batch and streaming use cases.](https://reader034.fdocuments.in/reader034/viewer/2022042219/5ec5daa0148dbc039436db73/html5/thumbnails/2.jpg)
2
Sisyphus, Network Engineer
![Page 3: Flexible Network Analytics in the Cloud - CHI-NOGchinog.org/wp-content/uploads/2017/05/2.-Flexible... · 2017-05-24 · Unified - One model handles batch and streaming use cases.](https://reader034.fdocuments.in/reader034/viewer/2022042219/5ec5daa0148dbc039436db73/html5/thumbnails/3.jpg)
3
Crazy Happy Kid, Network Engineer
![Page 4: Flexible Network Analytics in the Cloud - CHI-NOGchinog.org/wp-content/uploads/2017/05/2.-Flexible... · 2017-05-24 · Unified - One model handles batch and streaming use cases.](https://reader034.fdocuments.in/reader034/viewer/2022042219/5ec5daa0148dbc039436db73/html5/thumbnails/4.jpg)
4
Rugged Adventurer, Network Engineer
![Page 5: Flexible Network Analytics in the Cloud - CHI-NOGchinog.org/wp-content/uploads/2017/05/2.-Flexible... · 2017-05-24 · Unified - One model handles batch and streaming use cases.](https://reader034.fdocuments.in/reader034/viewer/2022042219/5ec5daa0148dbc039436db73/html5/thumbnails/5.jpg)
5
Outline
Hard realities
Coping strategies
A programming model brings solace
Making the ascent with code
A scenic vista
![Page 6: Flexible Network Analytics in the Cloud - CHI-NOGchinog.org/wp-content/uploads/2017/05/2.-Flexible... · 2017-05-24 · Unified - One model handles batch and streaming use cases.](https://reader034.fdocuments.in/reader034/viewer/2022042219/5ec5daa0148dbc039436db73/html5/thumbnails/6.jpg)
66
Network measurement is harder than it seems.
Hard realities
![Page 7: Flexible Network Analytics in the Cloud - CHI-NOGchinog.org/wp-content/uploads/2017/05/2.-Flexible... · 2017-05-24 · Unified - One model handles batch and streaming use cases.](https://reader034.fdocuments.in/reader034/viewer/2022042219/5ec5daa0148dbc039436db73/html5/thumbnails/7.jpg)
77
Everything is a struggle.
Reality #1: It’s a mess
![Page 8: Flexible Network Analytics in the Cloud - CHI-NOGchinog.org/wp-content/uploads/2017/05/2.-Flexible... · 2017-05-24 · Unified - One model handles batch and streaming use cases.](https://reader034.fdocuments.in/reader034/viewer/2022042219/5ec5daa0148dbc039436db73/html5/thumbnails/8.jpg)
8
Ideal
Photo: Fallingwater, Carol M. Highsmith via Wikimedia Commons
![Page 9: Flexible Network Analytics in the Cloud - CHI-NOGchinog.org/wp-content/uploads/2017/05/2.-Flexible... · 2017-05-24 · Unified - One model handles batch and streaming use cases.](https://reader034.fdocuments.in/reader034/viewer/2022042219/5ec5daa0148dbc039436db73/html5/thumbnails/9.jpg)
9Photo: Shack, Oven Fresh via Wikimedia Commons
Reality
![Page 10: Flexible Network Analytics in the Cloud - CHI-NOGchinog.org/wp-content/uploads/2017/05/2.-Flexible... · 2017-05-24 · Unified - One model handles batch and streaming use cases.](https://reader034.fdocuments.in/reader034/viewer/2022042219/5ec5daa0148dbc039436db73/html5/thumbnails/10.jpg)
10
The world is not neat and tidy.
Your data is not neat and tidy.
Accept it, design for it, live with it.Invite it over for dinner.
![Page 11: Flexible Network Analytics in the Cloud - CHI-NOGchinog.org/wp-content/uploads/2017/05/2.-Flexible... · 2017-05-24 · Unified - One model handles batch and streaming use cases.](https://reader034.fdocuments.in/reader034/viewer/2022042219/5ec5daa0148dbc039436db73/html5/thumbnails/11.jpg)
1111
I’m not kidding, everything is a struggle.
Reality #2: Things Change
![Page 12: Flexible Network Analytics in the Cloud - CHI-NOGchinog.org/wp-content/uploads/2017/05/2.-Flexible... · 2017-05-24 · Unified - One model handles batch and streaming use cases.](https://reader034.fdocuments.in/reader034/viewer/2022042219/5ec5daa0148dbc039436db73/html5/thumbnails/12.jpg)
12
FirstRequest
Photo: Fallingwater, Carol M. Highsmith via Wikimedia Commons
![Page 13: Flexible Network Analytics in the Cloud - CHI-NOGchinog.org/wp-content/uploads/2017/05/2.-Flexible... · 2017-05-24 · Unified - One model handles batch and streaming use cases.](https://reader034.fdocuments.in/reader034/viewer/2022042219/5ec5daa0148dbc039436db73/html5/thumbnails/13.jpg)
13
SecondRequest
Photo: Farnsworth House, Wikimedia Commons
![Page 14: Flexible Network Analytics in the Cloud - CHI-NOGchinog.org/wp-content/uploads/2017/05/2.-Flexible... · 2017-05-24 · Unified - One model handles batch and streaming use cases.](https://reader034.fdocuments.in/reader034/viewer/2022042219/5ec5daa0148dbc039436db73/html5/thumbnails/14.jpg)
14
What you needed yesterday probablyisn’t what you’ll need tomorrow.
You have to build something today that will work tomorrow.
Don’t make decisions today that you can make tomorrow.
Save your raw data. All of it if possible.
![Page 15: Flexible Network Analytics in the Cloud - CHI-NOGchinog.org/wp-content/uploads/2017/05/2.-Flexible... · 2017-05-24 · Unified - One model handles batch and streaming use cases.](https://reader034.fdocuments.in/reader034/viewer/2022042219/5ec5daa0148dbc039436db73/html5/thumbnails/15.jpg)
1515
I told you everything is a struggle, but I’m not sure you believe me.
Reality #3: There’s always more
![Page 16: Flexible Network Analytics in the Cloud - CHI-NOGchinog.org/wp-content/uploads/2017/05/2.-Flexible... · 2017-05-24 · Unified - One model handles batch and streaming use cases.](https://reader034.fdocuments.in/reader034/viewer/2022042219/5ec5daa0148dbc039436db73/html5/thumbnails/16.jpg)
Photo: Moscow. Standard buildings on Novokosinskaya street, аналоговый снимок via Wikimedia Commons.
![Page 17: Flexible Network Analytics in the Cloud - CHI-NOGchinog.org/wp-content/uploads/2017/05/2.-Flexible... · 2017-05-24 · Unified - One model handles batch and streaming use cases.](https://reader034.fdocuments.in/reader034/viewer/2022042219/5ec5daa0148dbc039436db73/html5/thumbnails/17.jpg)
17
There are always new devices coming onlineand more telemetry you can collect.
Your storage and compute will be the same size tomorrow as they are today.
Use the cloud to scale.The cloud is elastic and (effectively) infinite.
![Page 18: Flexible Network Analytics in the Cloud - CHI-NOGchinog.org/wp-content/uploads/2017/05/2.-Flexible... · 2017-05-24 · Unified - One model handles batch and streaming use cases.](https://reader034.fdocuments.in/reader034/viewer/2022042219/5ec5daa0148dbc039436db73/html5/thumbnails/18.jpg)
1818
Nevermind. You’ll see for yourself, everything is a struggle.
Reality #4: It’s never really done
![Page 19: Flexible Network Analytics in the Cloud - CHI-NOGchinog.org/wp-content/uploads/2017/05/2.-Flexible... · 2017-05-24 · Unified - One model handles batch and streaming use cases.](https://reader034.fdocuments.in/reader034/viewer/2022042219/5ec5daa0148dbc039436db73/html5/thumbnails/19.jpg)
Photo: © Saúl Briceño from Wikimedia Commons Exterior of Centro Financiero Confinanzas (Torre de David),
![Page 20: Flexible Network Analytics in the Cloud - CHI-NOGchinog.org/wp-content/uploads/2017/05/2.-Flexible... · 2017-05-24 · Unified - One model handles batch and streaming use cases.](https://reader034.fdocuments.in/reader034/viewer/2022042219/5ec5daa0148dbc039436db73/html5/thumbnails/20.jpg)
20
Nothing is ever really finished.
Time and money are limited.
Spend your time on the “what”: deriving insight from your data.
...not the “how”: building infrastructure.
![Page 21: Flexible Network Analytics in the Cloud - CHI-NOGchinog.org/wp-content/uploads/2017/05/2.-Flexible... · 2017-05-24 · Unified - One model handles batch and streaming use cases.](https://reader034.fdocuments.in/reader034/viewer/2022042219/5ec5daa0148dbc039436db73/html5/thumbnails/21.jpg)
2121
Coping strategies
![Page 22: Flexible Network Analytics in the Cloud - CHI-NOGchinog.org/wp-content/uploads/2017/05/2.-Flexible... · 2017-05-24 · Unified - One model handles batch and streaming use cases.](https://reader034.fdocuments.in/reader034/viewer/2022042219/5ec5daa0148dbc039436db73/html5/thumbnails/22.jpg)
22
The reality of the situation
1. It’s a mess 2. Things change
3. There’s always more 4. It’s never really done
● Design knowing things won’t be tidy
● “What” not “How”● Rely on the cloud for scaling
● Keep raw data to keep your options open
![Page 23: Flexible Network Analytics in the Cloud - CHI-NOGchinog.org/wp-content/uploads/2017/05/2.-Flexible... · 2017-05-24 · Unified - One model handles batch and streaming use cases.](https://reader034.fdocuments.in/reader034/viewer/2022042219/5ec5daa0148dbc039436db73/html5/thumbnails/23.jpg)
2323
A programming model brings solace
![Page 24: Flexible Network Analytics in the Cloud - CHI-NOGchinog.org/wp-content/uploads/2017/05/2.-Flexible... · 2017-05-24 · Unified - One model handles batch and streaming use cases.](https://reader034.fdocuments.in/reader034/viewer/2022042219/5ec5daa0148dbc039436db73/html5/thumbnails/24.jpg)
24
What is Apache Beam?
1. The Beam Programming Model
2. SDKs for writing Beam pipelines
3. Runners for existing distributed processing backends
○ Apache Apex
○ Apache Flink
○ Apache Spark
○ Google Cloud Dataflow
○ Local runner for testing
Slide courtesy of the Apache Beam Project
![Page 25: Flexible Network Analytics in the Cloud - CHI-NOGchinog.org/wp-content/uploads/2017/05/2.-Flexible... · 2017-05-24 · Unified - One model handles batch and streaming use cases.](https://reader034.fdocuments.in/reader034/viewer/2022042219/5ec5daa0148dbc039436db73/html5/thumbnails/25.jpg)
25
The Evolution of Apache Beam
MapReduce
BigTable DremelColossus
FlumeMegastoreSpanner
PubSub
MillwheelApache Beam
Google Cloud Dataflow
Slide courtesy of the Apache Beam Project
![Page 26: Flexible Network Analytics in the Cloud - CHI-NOGchinog.org/wp-content/uploads/2017/05/2.-Flexible... · 2017-05-24 · Unified - One model handles batch and streaming use cases.](https://reader034.fdocuments.in/reader034/viewer/2022042219/5ec5daa0148dbc039436db73/html5/thumbnails/26.jpg)
26
Why Apache Beam?
Unified - One model handles batch and streaming use cases.
Portable - Pipelines can be executed on multiple execution environments, avoiding lock-in.
Extensible - Supports user and community driven SDKs, Runners, transformation libraries, and IO connectors.
Slide courtesy of the Apache Beam Project
![Page 27: Flexible Network Analytics in the Cloud - CHI-NOGchinog.org/wp-content/uploads/2017/05/2.-Flexible... · 2017-05-24 · Unified - One model handles batch and streaming use cases.](https://reader034.fdocuments.in/reader034/viewer/2022042219/5ec5daa0148dbc039436db73/html5/thumbnails/27.jpg)
27
The Beam Model: Asking the Right Questions
What results are calculated?
Where in event time are results calculated?
When in processing time are results materialized?
How do refinements of results relate?
Slide courtesy of the Apache Beam Project
![Page 28: Flexible Network Analytics in the Cloud - CHI-NOGchinog.org/wp-content/uploads/2017/05/2.-Flexible... · 2017-05-24 · Unified - One model handles batch and streaming use cases.](https://reader034.fdocuments.in/reader034/viewer/2022042219/5ec5daa0148dbc039436db73/html5/thumbnails/28.jpg)
28
Customizing What Where When How
3Streaming
4Streaming
+ Accumulation
1Classic Batch
2Windowed
Batch
Slide courtesy of the Apache Beam Project
![Page 29: Flexible Network Analytics in the Cloud - CHI-NOGchinog.org/wp-content/uploads/2017/05/2.-Flexible... · 2017-05-24 · Unified - One model handles batch and streaming use cases.](https://reader034.fdocuments.in/reader034/viewer/2022042219/5ec5daa0148dbc039436db73/html5/thumbnails/29.jpg)
2929
Making the ascent with code
![Page 30: Flexible Network Analytics in the Cloud - CHI-NOGchinog.org/wp-content/uploads/2017/05/2.-Flexible... · 2017-05-24 · Unified - One model handles batch and streaming use cases.](https://reader034.fdocuments.in/reader034/viewer/2022042219/5ec5daa0148dbc039436db73/html5/thumbnails/30.jpg)
30
Your immutable data store: raw data
Make an immutable (read-only) repository of raw data
● Minimize processing of the data● Get it into the database as soon as possible● Keep your raw data for as long as you can
Benefits
● You get to change your mind later● If you make mistakes in later steps: recompute.
![Page 31: Flexible Network Analytics in the Cloud - CHI-NOGchinog.org/wp-content/uploads/2017/05/2.-Flexible... · 2017-05-24 · Unified - One model handles batch and streaming use cases.](https://reader034.fdocuments.in/reader034/viewer/2022042219/5ec5daa0148dbc039436db73/html5/thumbnails/31.jpg)
31
Views: how to turn raw data into usable data
Use Beam to transform your raw data into a “view” of the data
Examples:
● converting SNMP counters to rates● grouping SNMP counters by customer or peer● binning flows by attributes● building network topologies from basic data
Views can be stacked, you can use one view as the input for a more refined view.
![Page 32: Flexible Network Analytics in the Cloud - CHI-NOGchinog.org/wp-content/uploads/2017/05/2.-Flexible... · 2017-05-24 · Unified - One model handles batch and streaming use cases.](https://reader034.fdocuments.in/reader034/viewer/2022042219/5ec5daa0148dbc039436db73/html5/thumbnails/32.jpg)
32
Core Beam Abstractions
PCollections● Distributed, multi-element data set.
Transforms● Some code to run on all the data ● Take PCollection in and produce a PCollection
Pipeline I/O● Read data and produce a PCollection● Take a PCollection and write data
Pipeline● Input→PCollection→Transforms→PCollection→Output
![Page 33: Flexible Network Analytics in the Cloud - CHI-NOGchinog.org/wp-content/uploads/2017/05/2.-Flexible... · 2017-05-24 · Unified - One model handles batch and streaming use cases.](https://reader034.fdocuments.in/reader034/viewer/2022042219/5ec5daa0148dbc039436db73/html5/thumbnails/33.jpg)
33
f( ): Focus on “what”, not “how”
Define the “what”
● I/O: What data to use and where it lives.● PCollections: Grouping the data for each transformation step● Transforms: allow you to define the “what”
Let Beam handle the “how”
● I/O: Reading and writing data at scale● PCollections: distributing data across workers● Transforms: applying your analysis to PCollections
![Page 34: Flexible Network Analytics in the Cloud - CHI-NOGchinog.org/wp-content/uploads/2017/05/2.-Flexible... · 2017-05-24 · Unified - One model handles batch and streaming use cases.](https://reader034.fdocuments.in/reader034/viewer/2022042219/5ec5daa0148dbc039436db73/html5/thumbnails/34.jpg)
34
Example: Computing Total Traffic
# Python Beam SDKpipeline = beam.Pipeline('DirectRunner')
(pipeline | 'read' >> ReadFromText('./example.csv') | 'csv' >> beam.ParDo(FormatCSVDoFn()) | 'ifName key' >> beam.Map(group_by_device_interface) | 'group by iface' >> beam.GroupByKey() | 'compute rate' >> beam.FlatMap(compute_rate) | 'timestamp key' >> beam.Map(lambda row: (row['timestamp'], row['deltaIn'])) | 'group by timestamp' >> beam.GroupByKey() | 'sum by timestamp' >> beam.Map(lambda rates: (rates[0], sum(rates[1]))) | 'format' >> beam.Map(lambda row: '{},{}'.format(row[0], row[1])) | 'save' >> beam.io.WriteToText('./total_by_timestamp'))
pipeline.run()
![Page 35: Flexible Network Analytics in the Cloud - CHI-NOGchinog.org/wp-content/uploads/2017/05/2.-Flexible... · 2017-05-24 · Unified - One model handles batch and streaming use cases.](https://reader034.fdocuments.in/reader034/viewer/2022042219/5ec5daa0148dbc039436db73/html5/thumbnails/35.jpg)
35
1 + 1 = 2Completeness Latency Cost
$$$
Data Processing Tradeoffs
Slide courtesy of the Apache Beam Project
![Page 36: Flexible Network Analytics in the Cloud - CHI-NOGchinog.org/wp-content/uploads/2017/05/2.-Flexible... · 2017-05-24 · Unified - One model handles batch and streaming use cases.](https://reader034.fdocuments.in/reader034/viewer/2022042219/5ec5daa0148dbc039436db73/html5/thumbnails/36.jpg)
36
Stream or Batch?
Streaming for realtime insight● Current network load● Operational awareness● Threat detection / Anomaly Detection● Tend to be more expensive (VMs always running)● Unbounded data sets
Batch for precision or results that can wait● Billing● Precise traffic reports● Capacity planning● Tend to be cheaper (VMs used in bursts)● Bounded data sets
![Page 37: Flexible Network Analytics in the Cloud - CHI-NOGchinog.org/wp-content/uploads/2017/05/2.-Flexible... · 2017-05-24 · Unified - One model handles batch and streaming use cases.](https://reader034.fdocuments.in/reader034/viewer/2022042219/5ec5daa0148dbc039436db73/html5/thumbnails/37.jpg)
3737
A scenic vista
![Page 38: Flexible Network Analytics in the Cloud - CHI-NOGchinog.org/wp-content/uploads/2017/05/2.-Flexible... · 2017-05-24 · Unified - One model handles batch and streaming use cases.](https://reader034.fdocuments.in/reader034/viewer/2022042219/5ec5daa0148dbc039436db73/html5/thumbnails/38.jpg)
38
● Define new views to get different perspectives on raw data● Add new immutable data sets to gain more dimensions of data (SNMP,
Flow/sFlow, syslog, …)● Try out new ideas (running batch jobs is cheap and doesn’t impact system)● Teach others to write their own analysis
What can we see from here?
![Page 39: Flexible Network Analytics in the Cloud - CHI-NOGchinog.org/wp-content/uploads/2017/05/2.-Flexible... · 2017-05-24 · Unified - One model handles batch and streaming use cases.](https://reader034.fdocuments.in/reader034/viewer/2022042219/5ec5daa0148dbc039436db73/html5/thumbnails/39.jpg)
39
Our experiences so far
1. There is a learning curve. 2. Docs aren’t amazing, but getting there.3. You may have to adjust your thinking. Need to understand the model to
know what will work at scale. 4. The cloud providers have a several choices when it comes to databases.
It’s easy to spend a lot of time investigating.5. Cost is manageable but it’s good to keep an eye on it.6. In the interest of vendor neutrality details about our specific vendor haven’t
been covered, but I’m happy to talk to you after the talk.
![Page 40: Flexible Network Analytics in the Cloud - CHI-NOGchinog.org/wp-content/uploads/2017/05/2.-Flexible... · 2017-05-24 · Unified - One model handles batch and streaming use cases.](https://reader034.fdocuments.in/reader034/viewer/2022042219/5ec5daa0148dbc039436db73/html5/thumbnails/40.jpg)
40
Thank you!Questions?Jon Dugan <[email protected]>http://x1024.net/blog/2017-05-12-CHINOG/
More about Apache Beam: https://beam.apache.org
The World Beyond Batch 101 & 102 https://www.oreilly.com/ideas/the-world-beyond-batch-streaming-101 https://www.oreilly.com/ideas/the-world-beyond-batch-streaming-102
More about ESnet:
Open source: http://software.es.net/Visualization portal: https://my.es.net/