Storm distributed cache workshop
Uploaded by roger-rafanell-mas
![Page 1: Storm distributed cache workshop](https://reader031.fdocuments.in/reader031/viewer/2022022415/5a65cf1b7f8b9ab21e8b466b/html5/thumbnails/1.jpg)
Storm Distributed Cache Workshop
How to efficiently distribute mutable BLOBs into Apache Storm
![Page 2: Storm distributed cache workshop](https://reader031.fdocuments.in/reader031/viewer/2022022415/5a65cf1b7f8b9ab21e8b466b/html5/thumbnails/2.jpg)
Problem (Apache Storm < v1.x)
Topology Resources:
● Dictionaries, ML Models, Geolocation Data, etc...
Typically packaged in topology JAR:
● Immutable: any change requires repackaging and redeploying the topology
● Fine for small files
● Large files negatively impact topology startup time
![Page 3: Storm distributed cache workshop](https://reader031.fdocuments.in/reader031/viewer/2022022415/5a65cf1b7f8b9ab21e8b466b/html5/thumbnails/3.jpg)
Solution (Apache Storm v1.x)
Storm Distributed Cache:
● Allows sharing of files (BLOBs) among topologies
● Files can change over the lifetime of the topology
● Files can be updated from command line or programmatically
● Allows for files from several KB to several GB in size
● Allows for compression (e.g. Zip, Tar, Gzip)
![Page 4: Storm distributed cache workshop](https://reader031.fdocuments.in/reader031/viewer/2022022415/5a65cf1b7f8b9ab21e8b466b/html5/thumbnails/4.jpg)
Storm Distributed Cache
Two Implementations:
● LocalFSBlobStore:
○ Stores data on Nimbus local file system
○ Supports Replication Factor (not needed for HDFS-backed implementation)
● HdfsBlobStore:
○ Stores data on the HDFS file system
![Page 5: Storm distributed cache workshop](https://reader031.fdocuments.in/reader031/viewer/2022022415/5a65cf1b7f8b9ab21e8b466b/html5/thumbnails/5.jpg)
Nimbus in High Availability
![Page 6: Storm distributed cache workshop](https://reader031.fdocuments.in/reader031/viewer/2022022415/5a65cf1b7f8b9ab21e8b466b/html5/thumbnails/6.jpg)
Nimbus in High Availability
HA Nimbus:
● Increases the overall availability of Nimbus
● Nimbus hosts can join/leave at any time
● Leverages Distributed Cache API
● JAR, config and serialized topology are uploaded to the Distributed Cache
● Replication guarantees availability of all files
![Page 7: Storm distributed cache workshop](https://reader031.fdocuments.in/reader031/viewer/2022022415/5a65cf1b7f8b9ab21e8b466b/html5/thumbnails/7.jpg)
Storm Distributed Cache (Create)
![Page 8: Storm distributed cache workshop](https://reader031.fdocuments.in/reader031/viewer/2022022415/5a65cf1b7f8b9ab21e8b466b/html5/thumbnails/8.jpg)
Storm Distributed Cache (Submit)
![Page 9: Storm distributed cache workshop](https://reader031.fdocuments.in/reader031/viewer/2022022415/5a65cf1b7f8b9ab21e8b466b/html5/thumbnails/9.jpg)
Storm Distributed Cache (Update)
Cached files can be updated while topologies are running. In current
versions, it is the user’s responsibility to check whether a new file is available.
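One way a bolt could perform that check is to compare the local file's modification time on every tick and reload only when it changes. The following is a minimal sketch using only the Java standard library. The class name `ReloadableWordList` and its methods are hypothetical and not part of the Storm API, and it assumes the updated blob shows up at the same local path with a newer modification time.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.FileTime;
import java.util.Collections;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

/** Hypothetical helper (not Storm API): reloads a word list when its local file changes. */
class ReloadableWordList {
    private final Path file;
    private FileTime lastSeen;                        // modification time at the last successful load
    private Set<String> words = Collections.emptySet();

    ReloadableWordList(Path file) {
        this.file = file;
    }

    /** Reloads the list if the modification time differs from the last load; returns true if reloaded. */
    boolean reloadIfChanged() {
        try {
            FileTime current = Files.getLastModifiedTime(file);
            if (current.equals(lastSeen)) {
                return false;                         // nothing new, keep the words already in memory
            }
            Set<String> fresh = new HashSet<>();
            for (String line : Files.readAllLines(file, StandardCharsets.UTF_8)) {
                String word = line.trim().toLowerCase(Locale.ROOT);
                if (!word.isEmpty()) {
                    fresh.add(word);
                }
            }
            words = fresh;                            // swap in the new set only after a full read
            lastSeen = current;
            return true;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Case-insensitive membership test against the currently loaded words. */
    boolean contains(String word) {
        return words.contains(word.toLowerCase(Locale.ROOT));
    }
}
```

A bolt would typically call `reloadIfChanged()` when it receives a tick tuple rather than on every data tuple, so the file system is polled at a fixed, low rate.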
![Page 10: Storm distributed cache workshop](https://reader031.fdocuments.in/reader031/viewer/2022022415/5a65cf1b7f8b9ab21e8b466b/html5/thumbnails/10.jpg)
Storm Distributed Cache (Reading BLOBs)
![Page 11: Storm distributed cache workshop](https://reader031.fdocuments.in/reader031/viewer/2022022415/5a65cf1b7f8b9ab21e8b466b/html5/thumbnails/11.jpg)
Hands-On
![Page 12: Storm distributed cache workshop](https://reader031.fdocuments.in/reader031/viewer/2022022415/5a65cf1b7f8b9ab21e8b466b/html5/thumbnails/12.jpg)
Infrastructure
[Diagram: Twitter producer, Apache Kafka, and the Aggregate (WordCount) + DistCache topology]
![Page 13: Storm distributed cache workshop](https://reader031.fdocuments.in/reader031/viewer/2022022415/5a65cf1b7f8b9ab21e8b466b/html5/thumbnails/13.jpg)
Storm DistCache Topology
[Diagram: Apache Kafka, Kafka Spout, Sentence Splitter, Counter, Aggregate (WordCount), Tick Stream (Signal), and the Storm Distributed Cache holding wordsToTrack.list]
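The counting stage of this topology can be pictured as a word count restricted to the entries of wordsToTrack.list. The sketch below is an assumption about how such a step might look, not the workshop's actual code. `TrackedWordCounter` and its methods are hypothetical names, and the tracked set is passed in on each call so that a reloaded list takes effect immediately.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

/** Hypothetical counter (not Storm API): counts only the words present in a tracked set. */
class TrackedWordCounter {
    private final Map<String, Long> counts = new HashMap<>();

    /** Splits the sentence on non-alphanumeric characters and counts tracked words only. */
    void count(String sentence, Set<String> tracked) {
        String[] tokens = sentence.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+");
        for (String token : tokens) {
            if (tracked.contains(token)) {
                counts.merge(token, 1L, Long::sum);   // increment, starting from 1 on first sight
            }
        }
    }

    /** Returns a copy of the current counts, e.g. to emit downstream on a tick. */
    Map<String, Long> snapshot() {
        return new HashMap<>(counts);
    }
}
```

Words that are not in the tracked set never enter the map, so memory use is bounded by the size of the list rather than by the vocabulary of the incoming stream.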
![Page 14: Storm distributed cache workshop](https://reader031.fdocuments.in/reader031/viewer/2022022415/5a65cf1b7f8b9ab21e8b466b/html5/thumbnails/14.jpg)
Example
Check out the project:
● https://github.com/rrafanell/storm-distcache-example
● Follow the steps described in the README
Requirements:
● Oracle JDK 1.8 or similar
● Maven
● Docker
![Page 15: Storm distributed cache workshop](https://reader031.fdocuments.in/reader031/viewer/2022022415/5a65cf1b7f8b9ab21e8b466b/html5/thumbnails/15.jpg)
Code Inspection
![Page 16: Storm distributed cache workshop](https://reader031.fdocuments.in/reader031/viewer/2022022415/5a65cf1b7f8b9ab21e8b466b/html5/thumbnails/16.jpg)
Example (Starting the Infrastructure)
Storm UI: http://localhost:8080
![Page 17: Storm distributed cache workshop](https://reader031.fdocuments.in/reader031/viewer/2022022415/5a65cf1b7f8b9ab21e8b466b/html5/thumbnails/17.jpg)
Example (Configuring The Twitter-producer)
![Page 18: Storm distributed cache workshop](https://reader031.fdocuments.in/reader031/viewer/2022022415/5a65cf1b7f8b9ab21e8b466b/html5/thumbnails/18.jpg)
Example (Running The Twitter-producer)
![Page 19: Storm distributed cache workshop](https://reader031.fdocuments.in/reader031/viewer/2022022415/5a65cf1b7f8b9ab21e8b466b/html5/thumbnails/19.jpg)
Example (Uploading BLOBs)
![Page 20: Storm distributed cache workshop](https://reader031.fdocuments.in/reader031/viewer/2022022415/5a65cf1b7f8b9ab21e8b466b/html5/thumbnails/20.jpg)
Example (Checking BLOBs)
![Page 21: Storm distributed cache workshop](https://reader031.fdocuments.in/reader031/viewer/2022022415/5a65cf1b7f8b9ab21e8b466b/html5/thumbnails/21.jpg)
Example (Running the Topology)
![Page 22: Storm distributed cache workshop](https://reader031.fdocuments.in/reader031/viewer/2022022415/5a65cf1b7f8b9ab21e8b466b/html5/thumbnails/22.jpg)
Example (Running the Topology)
![Page 23: Storm distributed cache workshop](https://reader031.fdocuments.in/reader031/viewer/2022022415/5a65cf1b7f8b9ab21e8b466b/html5/thumbnails/23.jpg)
Example (Updating the BLOBs & reloading on-the-fly)
![Page 24: Storm distributed cache workshop](https://reader031.fdocuments.in/reader031/viewer/2022022415/5a65cf1b7f8b9ab21e8b466b/html5/thumbnails/24.jpg)
Example (Shutting down the Infrastructure)
![Page 25: Storm distributed cache workshop](https://reader031.fdocuments.in/reader031/viewer/2022022415/5a65cf1b7f8b9ab21e8b466b/html5/thumbnails/25.jpg)
Storm Distributed Cache Workshop
THANK YOU!
![Page 26: Storm distributed cache workshop](https://reader031.fdocuments.in/reader031/viewer/2022022415/5a65cf1b7f8b9ab21e8b466b/html5/thumbnails/26.jpg)
Local FS Blob Store