Apache Spark: The Analytics Operating System
Anjul Bhambhri
Vice President, IBM Big Data Engineering
Deep Blue SQL RISC
DNA Transistor Magnetic Tape Linux PC
Fortran DRAM Mainframe Watson
Floppy Disk UPC
Punch Card
IBM: 100 years of (supporting) innovation
The Analytics
Operating System
Apache Spark
Enhance it! Offer it!
Leverage it!
Spark Technology Center @ SF
On-prem and on the cloud
Inside our products
At IBM, We Love Spark!
IBM Cloud Data Services, now featuring Spark, is open for data
IBM is Building on Apache Spark
• IBM Analytics
• IBM Commerce
• IBM Watson
• IBM Research
• IBM Cloud
Quarks from IBM
Announced Feb 2016
• Open-source platform for building IoT applications
• Light-weight & embeddable
• Integrates with Spark
• Lambda Architecture and Spark enable efficient batch and streaming analytics
• Visualization at every step of data discovery enables better self-service
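The Lambda Architecture named above pairs a batch layer (periodic recomputation over the full dataset) with a speed layer (incremental updates for recent data) and merges the two at query time. The following is a minimal plain-Python sketch of that idea, not Spark code and not The Weather Company's implementation; the station IDs, counts, and all function and class names are hypothetical.

```python
from collections import Counter

# Hypothetical event records: (station_id, event_count). Not from the deck.
historical_events = [("KSFO", 3), ("KJFK", 5), ("KSFO", 2)]
recent_events = [("KSFO", 1), ("KORD", 4)]


def build_batch_view(events):
    """Batch layer: recompute totals from the full historical dataset."""
    view = Counter()
    for key, count in events:
        view[key] += count
    return view


class SpeedLayer:
    """Speed layer: incrementally track events not yet covered by a batch run."""

    def __init__(self):
        self.realtime_view = Counter()

    def ingest(self, key, count):
        self.realtime_view[key] += count


def query(batch_view, speed_layer, key):
    """Serving step: merge the batch view with the real-time view."""
    return batch_view[key] + speed_layer.realtime_view[key]


batch_view = build_batch_view(historical_events)
speed = SpeedLayer()
for key, count in recent_events:
    speed.ingest(key, count)

totals = {k: query(batch_view, speed, k) for k in ("KSFO", "KJFK", "KORD")}
print(totals)
```

In a Spark deployment the batch view would typically come from a scheduled job and the speed layer from a streaming job; the merge-at-query-time step is the part this sketch is meant to show.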
The Weather Company clusters running hot:
• ~30 billion API requests per day
• ~120 million active mobile users
• #3 most active mobile user base
• Billions of events per day (1.3M/sec)
• ~360 PB of traffic daily
• Need to keep data forever
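A quick back-of-the-envelope check helps put these figures on a common scale. The sketch below converts the slide's numbers between per-second and per-day rates. It assumes the 1.3M/sec event rate is sustained for a full day (the slide does not say whether it is a peak or an average) and uses decimal units (1 PB = 1000 TB); the variable names are illustrative.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

# Figures taken from the slide.
event_rate_per_sec = 1.3e6      # "1.3M/sec"
api_requests_per_day = 30e9     # "~30 billion API requests per day"
traffic_pb_per_day = 360        # "~360 PB of traffic daily"

# Assumption: the event rate holds for the whole day.
events_per_day = event_rate_per_sec * SECONDS_PER_DAY

# Average rates implied by the daily totals.
avg_api_requests_per_sec = api_requests_per_day / SECONDS_PER_DAY
avg_traffic_tb_per_sec = traffic_pb_per_day * 1000 / SECONDS_PER_DAY

print(f"Events per day at 1.3M/sec: {events_per_day:,.0f}")
print(f"Average API requests per second: {avg_api_requests_per_sec:,.0f}")
print(f"Average traffic in TB per second: {avg_traffic_tb_per_sec:.2f}")
```

Under that assumption, 1.3M events per second works out to roughly 112 billion events per day, which is consistent with the slide's "billions of events per day".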
The use case:
• Efficient batch + streaming analysis
• Self-serve data science
• BI / visualization tool support
An IBM Business
Spark for daily weather
Spark in Health Care
Health Care Data Lakes
• Improve how healthcare is delivered
• Collect and combine data from dozens of sources
• Clinical, Operational, Financial
• Inside and outside your enterprise
Benefits
• Better medical outcomes for patients
• Control cost and improve quality
SystemML on Spark
• Predictive Risk Modeling
• Right patient intervention relating to adverse health events
Spark in Telecom
The challenge:
• Improve customer satisfaction rates
• Multiple channels for customer interactions
• Very large data volumes
The need:
• Create a 360-degree view of a customer
• Stitch all interactions across channels into a “Customer Experience Journey”
• Classify interaction sentiment and take necessary actions
• Spark Streaming brings all the data together
• Spark Core is used to process and transform text and voice data
• Spark MLlib algorithms stitch interactions on a journey and score “sentiment”
• Spark SQL drives interactive queries via visual dashboards
PUB / SUB
MQTT / WebSockets / Flume / Kafka
Journey Dashboards
Interaction & Journey Data
Voice & Text Data
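The telecom flow above (collect interactions from every channel, stitch them into a per-customer journey, score sentiment, act on it) can be illustrated with a small plain-Python sketch. This is a stand-in for the logic only: it does not use Spark Streaming or MLlib, the keyword lexicon is a toy assumption rather than a trained model, and the records, names, and the "latest interaction is negative" rule are all hypothetical.

```python
from collections import defaultdict

# Hypothetical interaction records: (customer_id, timestamp, channel, text).
interactions = [
    ("c1", 3, "voice", "still broken and I am angry"),
    ("c2", 1, "web", "great service thanks"),
    ("c1", 1, "chat", "my bill is wrong"),
    ("c1", 2, "email", "thanks for the quick reply"),
]

# Toy sentiment lexicon, an assumption for illustration only.
POSITIVE = {"great", "thanks", "quick"}
NEGATIVE = {"wrong", "broken", "angry"}


def score_sentiment(text):
    """Return (positive word count) minus (negative word count)."""
    words = text.lower().split()
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)


def stitch_journeys(records):
    """Group interactions by customer and order them by timestamp."""
    journeys = defaultdict(list)
    for customer_id, ts, channel, text in sorted(records, key=lambda r: (r[0], r[1])):
        journeys[customer_id].append((ts, channel, score_sentiment(text)))
    return dict(journeys)


journeys = stitch_journeys(interactions)

# Hypothetical action rule: flag customers whose latest interaction is negative.
needs_action = [c for c, steps in journeys.items() if steps[-1][2] < 0]

print(journeys)
print(needs_action)
```

In the architecture the slide describes, the grouping step would run over streamed data and the scoring step would use an MLlib model; the sketch only shows how the pieces fit together.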
THANK YOU!