State of Spark, and where it is going
Reynold Xin @rxin
Strata Singapore
Dec 3rd, 2015
Spark stack diagram
Libraries: SQL, Streaming, MLlib, GraphX
Foundation: Spark Core (RDD)
A Great Year for Spark
Most active open source project in big data
New language: R
Widespread industry support & adoption
Community Growth
Summit Attendees: 1100 (2014), 3900 (2015)
Meetup Members: 12K (2014), 50K (2015)
Developers Contributing: 500 (2014), 1000 (2015)
Meetup Groups: December 2014
source: meetup.com
Meetup Groups: December 2015
source: meetup.com
Users
1000+ companies
…
Distributors + Apps
50+ companies
…
Diverse Runtime Environments
How respondents are running Spark: 51% on a public cloud
Most common Spark deployment environments (cluster managers):
48% Standalone mode
40% YARN
11% Mesos
Industries Using Spark
Categories: Other; Software (SaaS, Web, Mobile); Consulting (IT); Retail, e-Commerce; Advertising, Marketing, PR; Banking, Finance; Health, Medical, Pharmacy, Biotech; Carriers, Telecommunications; Education; Computers, Hardware
Shares (chart values in descending order, not matched to categories): 29.4%, 17.7%, 14.0%, 9.6%, 6.7%, 6.5%, 4.4%, 4.4%, 3.9%, 3.5%
Top Applications
Business Intelligence: 68%
Data Warehousing: 52%
Recommendation: 44%
Log Processing: 40%
User-Facing Services: 36%
Fraud Detection / Security: 29%
Largest Cluster & Daily Intake
800 million+ active users
8000+ nodes
150 PB+
1 PB+/day
Alibaba Taobao
clustering (community detection)
belief propagation (influence & credibility)
collaborative filtering (recommendation)
* Spark Summit San Francisco 2014
Top Retail Bank & Huawei
Possible Assets, Targeted Marketing, Financial Networking, ……
Before (DB/DW) vs. after (Huawei FusionInsight Spark):
Credit proof: about 2 weeks (15 days) before, 2~5 seconds after
History query: offline, 1 year before; online, 7 years+ after
Data: structured before; structured, semi-structured, and unstructured after
Micro-loan conversion rate: 40X higher
Are We Done?
No! Development is faster than ever. Expect Spark 2.0 in 2016.
Biggest technical change in 2015 was DataFrames
• Moves many computations onto the relational Spark SQL optimizer
• Enables both new APIs and more optimization, which is now happening through Project Tungsten
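To make the optimizer point concrete, here is a minimal sketch of a DataFrame query whose plan can be inspected with `explain(true)`. It assumes a Spark SQL dependency on the classpath and uses the `SQLContext` entry point that was current when this talk was given; the object name, helper name, local master setting, and sample rows are all hypothetical and not from the talk.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.{DataFrame, SQLContext}

object DataFrameOptimizerSketch {
  // Declarative query: filter, group, aggregate. Nothing executes here;
  // Spark SQL only builds a logical plan that the optimizer can rewrite.
  def averageAgeOver21(people: DataFrame): DataFrame =
    people.filter(people("age") > 21).groupBy("name").avg("age")

  def main(args: Array[String]): Unit = {
    // Local mode and the app name are assumptions for a self-contained run.
    val conf = new SparkConf().setMaster("local[*]").setAppName("dataframe-optimizer-sketch")
    val sc = new SparkContext(conf)
    val sqlContext = new SQLContext(sc)
    import sqlContext.implicits._

    // Hypothetical sample rows standing in for real data.
    val people = sc.parallelize(Seq(("Mia", 30), ("Max", 20), ("Ann", 50))).toDF("name", "age")

    val result = averageAgeOver21(people)

    // Prints the logical plans (before and after optimization) and the physical plan.
    result.explain(true)
    result.show()

    sc.stop()
  }
}
```

Because the query is expressed as column expressions rather than opaque functions, the whole expression tree is visible to the optimizer, which is what allows the relational rewrites the slide refers to.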
Coming in Spark 1.6
Dataset API: typed interface over DataFrames / Tungsten
• Common ask from developers who saw DataFrames
// sqlContext is assumed to be an existing SQLContext
import sqlContext.implicits._
case class Person(name: String, age: Long) // Long: JSON integers are read as bigint
val dataframe = sqlContext.read.json("people.json")
val ds: Dataset[Person] = dataframe.as[Person]
ds.filter(p => p.name.startsWith("M")).groupBy("name").avg("age")
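For readers without a cluster at hand, the computation in the Dataset snippet can be sketched on a plain Scala collection. This is not Spark code and says nothing about how Spark executes the query; it only illustrates what a typed filter followed by a per-name average computes. The object name, function name, and sample rows are hypothetical.

```scala
// Plain Scala collections, no Spark: a local sketch of what the query computes.
final case class Person(name: String, age: Long)

object LocalDatasetSketch {
  // Keep people whose name starts with the prefix, then average age per name.
  def averageAgeByName(people: Seq[Person], prefix: String): Map[String, Double] =
    people
      .filter(p => p.name.startsWith(prefix)) // typed lambda: p is a Person, checked at compile time
      .groupBy(_.name)
      .map { case (name, group) => name -> group.map(_.age).sum.toDouble / group.size }

  def main(args: Array[String]): Unit = {
    // Hypothetical sample rows standing in for people.json.
    val people = Seq(
      Person("Mia", 30L),
      Person("Mia", 40L),
      Person("Max", 20L),
      Person("Ann", 50L)
    )
    println(averageAgeByName(people, "M"))
  }
}
```

The lambda passed to `filter` works on `Person` objects, so a misspelled field name fails at compile time. That is the kind of type safety the Dataset API is meant to add on top of DataFrames.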
Other Upcoming Features
DataFrame integration with GraphX and Streaming
More Tungsten features: faster in-memory cache, SSD storage, better code generation
Data sources for Streaming
Thank you.
@rxin