Using Spark Part Time
![Page 1: Using Spark Part Time](https://reader035.fdocuments.in/reader035/viewer/2022062503/587d15e31a28abae148b69b5/html5/thumbnails/1.jpg)
PART TIME SPARK USER
Rajiv Shah
www.rajivshah.com
Chicago Spark Users Meetup
Nov 5, 2015
![Page 2: Using Spark Part Time](https://reader035.fdocuments.in/reader035/viewer/2022062503/587d15e31a28abae148b69b5/html5/thumbnails/2.jpg)
ROADMAP
• Status of Spark
• My take
• Examples
![Page 3: Using Spark Part Time](https://reader035.fdocuments.in/reader035/viewer/2022062503/587d15e31a28abae148b69b5/html5/thumbnails/3.jpg)
status of Spark
![Page 4: Using Spark Part Time](https://reader035.fdocuments.in/reader035/viewer/2022062503/587d15e31a28abae148b69b5/html5/thumbnails/4.jpg)
![Page 5: Using Spark Part Time](https://reader035.fdocuments.in/reader035/viewer/2022062503/587d15e31a28abae148b69b5/html5/thumbnails/5.jpg)
Strata+Hadoop mentions of Spark
![Page 6: Using Spark Part Time](https://reader035.fdocuments.in/reader035/viewer/2022062503/587d15e31a28abae148b69b5/html5/thumbnails/6.jpg)
Cloudera Blog Post on Sparkling Water
http://blog.cloudera.com/blog/2015/10/how-to-build-a-machine-learning-app-using-sparkling-water-and-apache-spark
![Page 7: Using Spark Part Time](https://reader035.fdocuments.in/reader035/viewer/2022062503/587d15e31a28abae148b69b5/html5/thumbnails/7.jpg)
my personal take
![Page 8: Using Spark Part Time](https://reader035.fdocuments.in/reader035/viewer/2022062503/587d15e31a28abae148b69b5/html5/thumbnails/8.jpg)
![Page 9: Using Spark Part Time](https://reader035.fdocuments.in/reader035/viewer/2022062503/587d15e31a28abae148b69b5/html5/thumbnails/9.jpg)
![Page 10: Using Spark Part Time](https://reader035.fdocuments.in/reader035/viewer/2022062503/587d15e31a28abae148b69b5/html5/thumbnails/10.jpg)
Insufficient Algorithms
![Page 11: Using Spark Part Time](https://reader035.fdocuments.in/reader035/viewer/2022062503/587d15e31a28abae148b69b5/html5/thumbnails/11.jpg)
http://projects.rajivshah.com/shiny/outlier/
surfing for algorithms
![Page 12: Using Spark Part Time](https://reader035.fdocuments.in/reader035/viewer/2022062503/587d15e31a28abae148b69b5/html5/thumbnails/12.jpg)
ML - MLLIB
![Page 13: Using Spark Part Time](https://reader035.fdocuments.in/reader035/viewer/2022062503/587d15e31a28abae148b69b5/html5/thumbnails/13.jpg)
http://spark.apache.org/docs/latest/mllib-guide.html
![Page 14: Using Spark Part Time](https://reader035.fdocuments.in/reader035/viewer/2022062503/587d15e31a28abae148b69b5/html5/thumbnails/14.jpg)
Language Schizophrenia
Scala, Python, R
![Page 15: Using Spark Part Time](https://reader035.fdocuments.in/reader035/viewer/2022062503/587d15e31a28abae148b69b5/html5/thumbnails/15.jpg)
Lack of Documentation
![Page 16: Using Spark Part Time](https://reader035.fdocuments.in/reader035/viewer/2022062503/587d15e31a28abae148b69b5/html5/thumbnails/16.jpg)
![Page 17: Using Spark Part Time](https://reader035.fdocuments.in/reader035/viewer/2022062503/587d15e31a28abae148b69b5/html5/thumbnails/17.jpg)
Difficult to tune
![Page 18: Using Spark Part Time](https://reader035.fdocuments.in/reader035/viewer/2022062503/587d15e31a28abae148b69b5/html5/thumbnails/18.jpg)
![Page 19: Using Spark Part Time](https://reader035.fdocuments.in/reader035/viewer/2022062503/587d15e31a28abae148b69b5/html5/thumbnails/19.jpg)
Not for small or big data
![Page 20: Using Spark Part Time](https://reader035.fdocuments.in/reader035/viewer/2022062503/587d15e31a28abae148b69b5/html5/thumbnails/20.jpg)
![Page 21: Using Spark Part Time](https://reader035.fdocuments.in/reader035/viewer/2022062503/587d15e31a28abae148b69b5/html5/thumbnails/21.jpg)
![Page 22: Using Spark Part Time](https://reader035.fdocuments.in/reader035/viewer/2022062503/587d15e31a28abae148b69b5/html5/thumbnails/22.jpg)
![Page 23: Using Spark Part Time](https://reader035.fdocuments.in/reader035/viewer/2022062503/587d15e31a28abae148b69b5/html5/thumbnails/23.jpg)
![Page 24: Using Spark Part Time](https://reader035.fdocuments.in/reader035/viewer/2022062503/587d15e31a28abae148b69b5/html5/thumbnails/24.jpg)
USING SPARK
![Page 25: Using Spark Part Time](https://reader035.fdocuments.in/reader035/viewer/2022062503/587d15e31a28abae148b69b5/html5/thumbnails/25.jpg)
Spark makes the impossible possible
![Page 26: Using Spark Part Time](https://reader035.fdocuments.in/reader035/viewer/2022062503/587d15e31a28abae148b69b5/html5/thumbnails/26.jpg)
Spark is hard
![Page 27: Using Spark Part Time](https://reader035.fdocuments.in/reader035/viewer/2022062503/587d15e31a28abae148b69b5/html5/thumbnails/27.jpg)
COOL THINGS ABOUT SPARK
• Scales up
• Streaming
• Enterprise worthy
• It looks like it will play nice
![Page 28: Using Spark Part Time](https://reader035.fdocuments.in/reader035/viewer/2022062503/587d15e31a28abae148b69b5/html5/thumbnails/28.jpg)
SUGGESTIONS
• Get data engineers that will work with your data scientists
• If you can’t take advantage of Spark’s strengths, don’t use it
![Page 29: Using Spark Part Time](https://reader035.fdocuments.in/reader035/viewer/2022062503/587d15e31a28abae148b69b5/html5/thumbnails/29.jpg)
EXAMPLES
• Spark Streaming - streaming k-means clustering
• Anomaly Detection using H2O
• Recommenders
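
To give a feel for the first example, here is a minimal plain-Python sketch of the kind of per-batch update that streaming k-means performs. It is not the Spark API and not the code from the talk. The names `nearest` and `update_centers` and the `decay` parameter are hypothetical, chosen for illustration. The weighting scheme (old center weighted by its discounted point count, blended with the mean of the new batch) is assumed to be similar in spirit to the update rule described in the MLlib guide.

```python
from typing import List, Sequence, Tuple


def nearest(centers: Sequence[Sequence[float]], x: Sequence[float]) -> int:
    """Return the index of the center closest to x (squared Euclidean distance)."""
    return min(
        range(len(centers)),
        key=lambda i: sum((c - v) ** 2 for c, v in zip(centers[i], x)),
    )


def update_centers(
    centers: List[List[float]],
    weights: List[float],
    batch: Sequence[Sequence[float]],
    decay: float = 1.0,
) -> Tuple[List[List[float]], List[float]]:
    """Fold one batch of points into the current centers.

    decay=1.0 keeps all history; smaller values forget old data faster
    (an assumed, simplified form of a "forgetful" update).
    """
    k, dim = len(centers), len(centers[0])
    sums = [[0.0] * dim for _ in range(k)]
    counts = [0] * k

    # Assign each new point to its nearest center and accumulate sums.
    for x in batch:
        i = nearest(centers, x)
        counts[i] += 1
        for d in range(dim):
            sums[i][d] += x[d]

    new_centers, new_weights = [], []
    for i in range(k):
        old = weights[i] * decay          # discounted weight of past points
        total = old + counts[i]
        if total == 0:
            # No history and no new points: leave the center where it is.
            new_centers.append(list(centers[i]))
            new_weights.append(0.0)
            continue
        new_centers.append(
            [(centers[i][d] * old + sums[i][d]) / total for d in range(dim)]
        )
        new_weights.append(total)
    return new_centers, new_weights


if __name__ == "__main__":
    centers, weights = [[0.0, 0.0], [10.0, 10.0]], [0.0, 0.0]
    for batch in ([(1, 1), (3, 3), (9, 9), (11, 11)], [(4, 4)]):
        centers, weights = update_centers(centers, weights, batch, decay=0.5)
        print(centers, weights)
```

In a real streaming job each `batch` would be a micro-batch arriving from a stream, and the centers would be updated once per batch rather than refit from scratch.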