Deploy Apache Spark™ on Rackspace OnMetal™ for Cloud Big Data Platform
Because Sometimes You Just Need Metal
Meet Your Speakers
www.rackspace.com
Sean Anderson, Manager, Data Services
John Engates, CTO
David Grier, Systems Engineer
Big Data is Here to Stay
• Big Data is now much more than hype – real customers with real use cases are adopting it daily
• A recent survey found that business leaders expected the deployment of Hadoop to result in a 3-year benefit ranging from $5M to $50M+
• Close to 100% of business leaders have already deployed or plan to deploy Apache™ Hadoop®
“Enterprises are showing increasing interest in the value provided by the large-scale data processing that Hadoop and Spark can provide, but can be wary of the upfront cost and complexity of setting up a cluster to prove that value. Managed services such as [OnMetal™ Cloud Big Data Platform] enable enterprises to focus their energies on generating business insights rather than configuring and managing infrastructure.”
Matt Aslett, 451 Research Director, Data Platforms and Analytics
Hadoop is Hard
• Biggest impediments include:
– Insufficient skills in-house to design and deploy
– Designing and deploying takes too long
– High cost of physical infrastructure
Only 3 in 10 businesses that plan to implement Hadoop have done so
Hadoop is Changing
• Original focus on batch processing
• Streaming and interactive use cases emerging
• Shift from jobs that take hours to seconds
• Impala, Spark, and Presto are emerging tools
What are Companies Doing with Hadoop?
Vertical            Use Case                            Data Type
Financial Services  New Account Risk Screens            Text, Server Logs
                    Fraud Prevention                    Server Logs
                    Trading Risk                        Server Logs
                    Maximize Deposit Spread             Text, Server Logs
                    Insurance Underwriting              Geographic, Sensor, Text
                    Accelerate Loan Processing          Text
Telecom             Call Detail Records (CDRs)          Machine, Geographic
                    Infrastructure Investment           Machine, Server Logs
                    Next Product to Buy (NPTB)          Clickstream
                    Real-time Bandwidth Allocation      Server Logs, Text, Sentiment
                    New Product Development             Machine, Geographic
Retail              360 View of the Customer            Clickstream, Text
                    Analyze Brand Sentiment             Sentiment
                    Localized, Personalized Promotions  Geographic
                    Website Optimization                Clickstream
                    Optimal Store Layout                Sensor
Manufacturing       Supply Chain and Logistics          Sensor
                    Assembly Line Quality Assurance     Sensor
                    Proactive Maintenance               Machine
                    Crowdsourced Quality Assurance      Sentiment
What Is the Cost of Lacking a Big Data Strategy?
• Today every company can be a data company
• Successful companies will be data companies
• Under Armour isn’t just a fitness company; they’re a data company
Rackspace Cloud Big Data = Big Data as a Service
• A fully managed Hadoop and Spark hardware and software stack with the elasticity and availability of the Rackspace Managed Cloud
• Save time and money in deploying, maintaining and scaling Big Data workloads
• Start small, spin instances up or down on demand, and scale elastically
The Trade Off...
Custom Built, Consistent, Available, Performant
Purpose Built, Elastic, Flexible, On-Demand
OnMetal Lets You Scale Like the Internet Giants
BARE METAL SERVERS
API-driven, Instantly Available, Highly Specialized, No Hypervisor
“Rackspace Cloud, because of its single-tenant OnMetal line, is the only place on Earth where you can enjoy Facebook/Google-style infrastructure rented by the hour.”
- Ev Kontsevoy, Director, Product, Rackspace
An Industry First for Big Data as a Service
• For the first time, data scientists can get the best of both worlds: bare metal performance with cloud agility, all backed by Fanatical Support®
• What this means:
– Spin projects up or down on demand so that capacity is always perfectly aligned to demand
– When you’re running your projects, you can get screaming fast, predictable performance
Breakthrough Performance
• Rackspace OnMetal Cloud Big Data for Spark is engineered for breakthrough performance, enabling data scientists to iterate interactively with large data sets.
[Chart: Terasort and DFS IO run times in seconds (axis 0 to 60), Traditional Cloud Big Data vs. OnMetal Cloud Big Data]
Under the Hood with Rackspace OnMetal
• Differentiators
– CPU, Memory, SSD, Networking, and more
– Optimized for screaming performance
• Three Flavors
– Traditional: JBOD, commodity boxes, low CPU and low RAM
– In Memory: SSD Drives, High Memory, High CPU
– Bare Metal resources a must due to high demands on all the resources
OnMetal I/O
Workload type
• Online transaction processing (OLTP)
• NoSQL databases
• Traditional SQL databases
Features & Specs
• Intel Xeon E5-2680 v2 2.8 GHz
• 2 x 10 cores
• 128GB RAM
• Boot device (32GB SATADOM)
• 2x LSI Nytro WarpDrive BLP4-1600 (1.6TB) for 3.2TB of high I/O storage
• Redundant 10Gbps network connections
Rackspace Cloud Big Data is About More than Hadoop
• Introducing support for Apache Spark™
• Apache Spark on Rackspace OnMetal lets enterprises combine the breadth of structured and unstructured data with the speed of in-memory processing to build streaming, machine learning, and graph-optimized applications that allow businesses to take action at the speed of insight.
Apache Spark
Speed, Ease of Use, Generality, Integrated with Hadoop
New Use Cases
• Deeper Integration with SQL Workloads
• Streaming Applications
• Machine Learning
• Iterative Processing
• Real-Time Graphical Dashboards
Summary
• Big Data is here to stay
• Hadoop is Hard
• Rackspace makes Hadoop easy
• With OnMetal and Spark, Rackspace takes Big Data beyond batch processing
• Become a data company today
Average Build Time: 10 Minutes
Big Data Platform
BARE METAL SERVERS
Rackspace Offerings for the Data Tier
Infrastructure for Data
• Cloud IaaS: Get started fast
• Dedicated Hosting: Predictable costs & performance
• OnMetal: Cloud Elasticity & Dedicated Performance
Managed Offerings of Most Popular Big Data, SQL, & NoSQL Databases
Managed Database Services for Production Apps
• Automatic DBA: Sharding, Backup, & HA
• Entire Stack Optimized on Bare Metal
• Supported 24x7x365 by experts
• More than MongoDB…
DBA Services
• Architecture & Design
• Tuning & Monitoring
• 24 x 7 x 365 Support
• Cost Effective
What’s Next?
www.baremetalbigdata.com
1. Sign up for a free trial
2. Want to know more? – Read my blog and check out the articles
Questions?
THANK YOU
RACKSPACE® | 1 FANATICAL PLACE, CITY OF WINDCREST | SAN ANTONIO, TX 78218
US SALES: 1-800-961-2888 | US SUPPORT: 1-800-961-4454 | WWW.RACKSPACE.COM
© RACKSPACE LTD. | RACKSPACE® AND FANATICAL SUPPORT® ARE SERVICE MARKS OF RACKSPACE US, INC. REGISTERED IN THE UNITED STATES AND OTHER COUNTRIES. | WWW.RACKSPACE.COM