Apache Storm Internals
![Page 1: Apache Storm Internals](https://reader035.fdocuments.in/reader035/viewer/2022062313/55c4ab67bb61ebee5a8b45de/html5/thumbnails/1.jpg)
STORM ANATOMY
Cloud Computing Course Prof Hanku Lee
Social Media Cloud Computing lab MS Akhmedov Khumoyun
![Page 2: Apache Storm Internals](https://reader035.fdocuments.in/reader035/viewer/2022062313/55c4ab67bb61ebee5a8b45de/html5/thumbnails/2.jpg)
What is Stream processing
Stream processing is a technical paradigm for processing large volumes of unbounded sequences of tuples in real time
An unbounded sequence of tuples = a stream
(Diagram: Source, Stream, Processor)
• Continuous analytics
• Online machine learning
• Sensor data monitoring
• Financial trading …
![Page 3: Apache Storm Internals](https://reader035.fdocuments.in/reader035/viewer/2022062313/55c4ab67bb61ebee5a8b45de/html5/thumbnails/3.jpg)
Storm at Twitter
Twitter Web Analytics
![Page 4: Apache Storm Internals](https://reader035.fdocuments.in/reader035/viewer/2022062313/55c4ab67bb61ebee5a8b45de/html5/thumbnails/4.jpg)
What is Storm?
Storm is a distributed realtime computation system
• Fast & scalable
• Fault-tolerant
• Guarantees messages will be processed
• Easy to set up & operate
• Free & open source
- Originally developed by Nathan Marz at BackType (acquired by Twitter)
- Written in Java and Clojure
![Page 5: Apache Storm Internals](https://reader035.fdocuments.in/reader035/viewer/2022062313/55c4ab67bb61ebee5a8b45de/html5/thumbnails/5.jpg)
Conceptual View
![Page 6: Apache Storm Internals](https://reader035.fdocuments.in/reader035/viewer/2022062313/55c4ab67bb61ebee5a8b45de/html5/thumbnails/6.jpg)
Physical View
![Page 7: Apache Storm Internals](https://reader035.fdocuments.in/reader035/viewer/2022062313/55c4ab67bb61ebee5a8b45de/html5/thumbnails/7.jpg)
Concepts
Streams Spouts Bolts Topologies
![Page 8: Apache Storm Internals](https://reader035.fdocuments.in/reader035/viewer/2022062313/55c4ab67bb61ebee5a8b45de/html5/thumbnails/8.jpg)
Streams
Unbounded sequence of tuples
![Page 9: Apache Storm Internals](https://reader035.fdocuments.in/reader035/viewer/2022062313/55c4ab67bb61ebee5a8b45de/html5/thumbnails/9.jpg)
Spouts
Source of streams
• Read from a Kafka queue
• Read from the Twitter Streaming API
![Page 10: Apache Storm Internals](https://reader035.fdocuments.in/reader035/viewer/2022062313/55c4ab67bb61ebee5a8b45de/html5/thumbnails/10.jpg)
Bolts
Processes input streams and produces new streams
![Page 11: Apache Storm Internals](https://reader035.fdocuments.in/reader035/viewer/2022062313/55c4ab67bb61ebee5a8b45de/html5/thumbnails/11.jpg)
Bolts
• Functions
• Filters
• Aggregation
• Joins
• Talk to databases
![Page 12: Apache Storm Internals](https://reader035.fdocuments.in/reader035/viewer/2022062313/55c4ab67bb61ebee5a8b45de/html5/thumbnails/12.jpg)
Topology
Network of spouts and bolts
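To make the idea of a network of spouts and bolts concrete, here is a minimal in-process sketch in Python. It is not Storm's API: the names `sentence_spout`, `split_bolt`, `count_bolt`, and `run_topology` are hypothetical, the "stream" is a finite list standing in for an unbounded source, and everything runs in one process rather than across a cluster.

```python
from collections import Counter

# All names here are illustrative assumptions, not part of Storm.

def sentence_spout():
    """Spout: the source of a stream of tuples."""
    for sentence in ["the cow jumped", "the moon"]:
        yield (sentence,)

def split_bolt(stream):
    """Bolt: consumes sentence tuples and emits one tuple per word."""
    for (sentence,) in stream:
        for word in sentence.split():
            yield (word,)

def count_bolt(stream):
    """Bolt: keeps a running count per word and emits (word, count)."""
    counts = Counter()
    for (word,) in stream:
        counts[word] += 1
        yield (word, counts[word])

def run_topology(spout, bolts):
    """Wire the spout into a linear chain of bolts and drain the result."""
    stream = spout()
    for bolt in bolts:
        stream = bolt(stream)
    return list(stream)

result = run_topology(sentence_spout, [split_bolt, count_bolt])
```

A real topology is a graph rather than a linear chain, and each spout or bolt runs as many parallel tasks; the chain above only illustrates how tuples flow from a source through processing steps.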
![Page 13: Apache Storm Internals](https://reader035.fdocuments.in/reader035/viewer/2022062313/55c4ab67bb61ebee5a8b45de/html5/thumbnails/13.jpg)
Tasks
Spouts and bolts execute as many tasks across the cluster
![Page 14: Apache Storm Internals](https://reader035.fdocuments.in/reader035/viewer/2022062313/55c4ab67bb61ebee5a8b45de/html5/thumbnails/14.jpg)
Stream grouping
When a tuple is emitted, which task does it go to?
![Page 15: Apache Storm Internals](https://reader035.fdocuments.in/reader035/viewer/2022062313/55c4ab67bb61ebee5a8b45de/html5/thumbnails/15.jpg)
Stream grouping
• Shuffle grouping: pick a random task
• Fields grouping: hash a subset of tuple fields modulo the number of tasks, so tuples with equal values in those fields go to the same task
• All grouping: send to all tasks
• Global grouping: pick task with lowest id
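The four groupings above can be sketched as functions that map a tuple to the list of target task indexes. This is an illustrative Python sketch, not Storm's implementation: the function names and signatures are hypothetical, and the fields grouping uses a CRC32 hash modulo the task count as a stand-in for whatever hash Storm applies internally.

```python
import random
import zlib

# Hypothetical helpers; each returns the list of task indexes
# that should receive the tuple.

def shuffle_grouping(tup, num_tasks, rng=random):
    """Pick one task at random."""
    return [rng.randrange(num_tasks)]

def fields_grouping(tup, field_indexes, num_tasks):
    """Hash the selected fields so equal values always reach the same task."""
    key = "|".join(str(tup[i]) for i in field_indexes)
    return [zlib.crc32(key.encode("utf-8")) % num_tasks]

def all_grouping(tup, num_tasks):
    """Replicate the tuple to every task."""
    return list(range(num_tasks))

def global_grouping(tup, num_tasks):
    """Send everything to the task with the lowest id."""
    return [0]
```

For example, `fields_grouping(("apple", 1), [0], 4)` and `fields_grouping(("apple", 2), [0], 4)` return the same task, which is what makes per-key aggregation such as word counting possible.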
![Page 16: Apache Storm Internals](https://reader035.fdocuments.in/reader035/viewer/2022062313/55c4ab67bb61ebee5a8b45de/html5/thumbnails/16.jpg)
Starting topology
![Page 18: Apache Storm Internals](https://reader035.fdocuments.in/reader035/viewer/2022062313/55c4ab67bb61ebee5a8b45de/html5/thumbnails/18.jpg)
Storm: Fault-tolerance
![Page 23: Apache Storm Internals](https://reader035.fdocuments.in/reader035/viewer/2022062313/55c4ab67bb61ebee5a8b45de/html5/thumbnails/23.jpg)
Guarantees messages will be processed
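The slide does not say how this guarantee is implemented. Storm's documentation describes tracking each spout tuple's tree of derived tuples with a single value: the XOR of the 64-bit ids of every tuple emitted and acked, which returns to zero once every tuple has been both emitted and acked. The Python sketch below illustrates only that idea; the class `AckTracker` and its methods are hypothetical, and the small ids in the usage lines are chosen for readability.

```python
class AckTracker:
    """Tracks one tuple tree with a single XOR checksum (illustrative only)."""

    def __init__(self):
        self.ack_val = 0
        self.started = False

    def emit(self, tuple_id):
        """Record that a tuple was emitted into the tree."""
        self.started = True
        self.ack_val ^= tuple_id

    def ack(self, tuple_id):
        """Record that a tuple was fully handled by a bolt."""
        self.ack_val ^= tuple_id

    def fully_processed(self):
        """Each id is XORed in twice (emit and ack), so the value returns to 0."""
        return self.started and self.ack_val == 0

tracker = AckTracker()
root, child_a, child_b = 0x1A, 0x2B, 0x3C
tracker.emit(root)             # spout emits the root tuple
tracker.emit(child_a)          # a bolt emits two tuples anchored to the root
tracker.emit(child_b)
tracker.ack(root)              # the bolt acks its input
pending = tracker.fully_processed()   # children still outstanding
tracker.ack(child_a)
tracker.ack(child_b)
done = tracker.fully_processed()
```

If the value does not reach zero within a timeout, the spout tuple is treated as failed and can be replayed, which gives at-least-once processing.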
![Page 24: Apache Storm Internals](https://reader035.fdocuments.in/reader035/viewer/2022062313/55c4ab67bb61ebee5a8b45de/html5/thumbnails/24.jpg)
Message Passing (ZeroMQ)
![Page 25: Apache Storm Internals](https://reader035.fdocuments.in/reader035/viewer/2022062313/55c4ab67bb61ebee5a8b45de/html5/thumbnails/25.jpg)
Easy to set up & operate
• Set up a ZooKeeper cluster
• Install dependencies on Nimbus and worker machines
- ZeroMQ 2.1.7 and JZMQ
- Java 6 and Python 2.6.6
- unzip
• Download and extract a Storm release to Nimbus and worker machines
• Fill in mandatory configuration in storm.yaml
• Launch daemons under supervision using the “storm” script
![Page 26: Apache Storm Internals](https://reader035.fdocuments.in/reader035/viewer/2022062313/55c4ab67bb61ebee5a8b45de/html5/thumbnails/26.jpg)
Cluster Summary
![Page 27: Apache Storm Internals](https://reader035.fdocuments.in/reader035/viewer/2022062313/55c4ab67bb61ebee5a8b45de/html5/thumbnails/27.jpg)
Topology Summary
![Page 28: Apache Storm Internals](https://reader035.fdocuments.in/reader035/viewer/2022062313/55c4ab67bb61ebee5a8b45de/html5/thumbnails/28.jpg)
Component Summary
![Page 29: Apache Storm Internals](https://reader035.fdocuments.in/reader035/viewer/2022062313/55c4ab67bb61ebee5a8b45de/html5/thumbnails/29.jpg)
Advanced Topics
• Distributed RPC
• Transactional topologies
• Trident
• Using non-JVM languages with Storm
• Unit testing
• Patterns
![Page 30: Apache Storm Internals](https://reader035.fdocuments.in/reader035/viewer/2022062313/55c4ab67bb61ebee5a8b45de/html5/thumbnails/30.jpg)
Real-time Twitter Analytics
Trending Topics and Sentiment Analysis
(Architecture diagram components: Twitter Crawler, Kafka, Storm Cluster, Hadoop (HDFS and HBase), MySQL)
![Page 31: Apache Storm Internals](https://reader035.fdocuments.in/reader035/viewer/2022062313/55c4ab67bb61ebee5a8b45de/html5/thumbnails/31.jpg)
![Page 32: Apache Storm Internals](https://reader035.fdocuments.in/reader035/viewer/2022062313/55c4ab67bb61ebee5a8b45de/html5/thumbnails/32.jpg)
THANK YOU FOR YOUR ATTENTION
Any Questions Are Welcome…