Post on 21-Jan-2018
© 2016 Mesosphere, Inc. All Rights Reserved.
From SMACK to SMAACK
Alluxio meets DC/OS
Jörg Schad, Mesosphere
Adit Madan, Alluxio
#smack @Alluxio @dcos @joerg_schad @madanadit
© 2017 Mesosphere, Inc. All Rights Reserved.
20% OFF: MCDCOS20
September 13th - 15th
● Dedicated Tracks
● MesosCon University
● Town Halls
● Hackathon
Accelerating Spark workloads in a Mesos environment with Alluxio, 09/15, 11AM
Fast Data
Processing styles: Batch, Micro-Batch, Event Processing
Latency scale: Days, Hours, Minutes, Seconds, Microseconds
Reports what has happened using descriptive analytics
Solves problems using predictive and prescriptive analytics
Example uses:
● Billing, Chargeback
● Product recommendations
● Real-time Pricing and Routing
● Real-time Advertising
● Predictive User Interface
The SMACK Stack
EVENTS: Ubiquitous data streams from connected devices (Sensors, Devices, Clients)
INGEST: Apache Kafka. Ingest millions of events per second
STORE: Apache Cassandra. Distributed & highly scalable database
ANALYZE: Apache Spark. Real-time and batch process data
ACT: Akka. Visualize data and build data driven applications
All running on Mesos/ DC/OS
NAIVE APPROACH
Typical Datacenter: siloed, over-provisioned servers, low utilization
Industry Average: 12-15% utilization
MySQL
microservice
Cassandra
Spark/Hadoop
Kafka
MULTIPLEXING OF DATA, SERVICES, USERS, ENVIRONMENTS
Typical Datacenter: siloed, over-provisioned servers, low utilization
Mesos/ DC/OS: automated schedulers, workload multiplexing onto the same machines
MySQL
microservice
Cassandra
Spark/Hadoop
Kafka
Datacenter Operating System (DC/OS)
Distributed Systems Kernel (Mesos)
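The multiplexing argument can be made concrete with a small calculation. The sketch below is illustrative only: the per-service demand figures are invented for the example (they are not measurements and do not come from the slides), and the helper names are hypothetical. It compares capacity sized per silo (each service provisioned for its own peak) with one shared pool sized for the peak of the combined load.

```python
from statistics import mean

# Hypothetical CPU demand (cores) per service over six time slots.
# Invented numbers, used only to illustrate the effect of multiplexing.
DEMAND = {
    "MySQL":        [2, 2, 8, 2, 2, 2],
    "microservice": [1, 6, 1, 1, 1, 1],
    "Cassandra":    [3, 3, 3, 9, 3, 3],
    "Spark/Hadoop": [0, 0, 0, 0, 16, 0],
    "Kafka":        [2, 2, 2, 2, 2, 7],
}


def siloed_capacity(demand):
    """Each service gets its own servers, sized for that service's peak."""
    return sum(max(series) for series in demand.values())


def shared_capacity(demand):
    """One shared pool, sized for the peak of the combined demand."""
    return max(sum(slot) for slot in zip(*demand.values()))


def utilization(demand, capacity):
    """Average combined demand divided by the provisioned capacity."""
    total_per_slot = [sum(slot) for slot in zip(*demand.values())]
    return mean(total_per_slot) / capacity


if __name__ == "__main__":
    for label, capacity in (("siloed", siloed_capacity(DEMAND)),
                            ("shared", shared_capacity(DEMAND))):
        print(f"{label}: {capacity} cores, "
              f"{utilization(DEMAND, capacity):.0%} utilized")
```

Because the peaks in this toy data do not coincide, the shared pool needs fewer cores for the same work, which is the effect the slide attributes to workload multiplexing.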
DC/OS ENABLES MODERN DISTRIBUTED APPS
Big Data + Analytics Engines
Microservices (in containers)
Streaming
Batch
Machine Learning
Analytics
Functions & Logic
Search
Time Series
SQL / NoSQL
Databases
Modern App Components
Any Infrastructure (Physical, Virtual, Cloud)
The SMAACK Stack
EVENTS: Ubiquitous data streams from connected devices (Sensors, Devices, Clients)
INGEST: Apache Kafka. Ingest millions of events per second
STORE: Apache Cassandra. Distributed & highly scalable database
ANALYZE: Apache Spark. Real-time and batch process data
ACT: Akka. Visualize data and build data driven applications
All running on Mesos/ DC/OS
Added to the stack: Alluxio
BIG DATA ECOSYSTEM WITH ALLUXIO
…
…
FUSE Compatible File System Interface
Hadoop Compatible File System Interface
Native Key-Value Interface
Native File System Interface
HDFS Interface
Amazon S3 Interface
Swift Interface
GlusterFS Interface
Enabling Applications to Access Data from Any Storage System at Memory Speed
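One generic way to picture memory-speed access in front of slower storage is a read-through cache. The sketch below is a toy model of that general technique and not Alluxio's implementation: the class name, the `fetch` callback, and the LRU eviction policy are all assumptions made for illustration.

```python
from collections import OrderedDict


class ReadThroughCache:
    """Toy in-memory cache in front of a slower storage system (hypothetical)."""

    def __init__(self, fetch, capacity):
        self._fetch = fetch            # callable that reads from slow storage
        self._capacity = capacity      # maximum number of cached entries
        self._blocks = OrderedDict()   # least recently used entry comes first
        self.hits = 0
        self.misses = 0

    def read(self, key):
        if key in self._blocks:
            # Served from memory; mark the entry as most recently used.
            self._blocks.move_to_end(key)
            self.hits += 1
            return self._blocks[key]
        # Miss: go to the underlying storage and keep a copy in memory.
        self.misses += 1
        value = self._fetch(key)
        self._blocks[key] = value
        if len(self._blocks) > self._capacity:
            # Evict the least recently used entry.
            self._blocks.popitem(last=False)
        return value


if __name__ == "__main__":
    cache = ReadThroughCache(fetch=lambda key: f"data for {key}", capacity=2)
    for key in ("a", "b", "a"):
        cache.read(key)
    print(cache.hits, cache.misses)
```

Repeated reads of the same key skip the slow storage call entirely, which is where the speedup for iterative analytics jobs would come from in this simplified picture.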
© 2017 Alluxio
WHY ALLUXIO
Co-located compute and data with memory-speed access to data
Virtualized across different storage systems under a unified namespace
Scale-out architecture
File system API, software only
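The unified namespace idea above can be sketched as a mount table that maps paths in one logical tree onto different storage systems. This is a minimal illustration, not Alluxio's API: `MountTable`, its methods, and the example URIs (`hdfs://namenode:8020/data`, `s3a://example-bucket/logs`) are all hypothetical.

```python
class MountTable:
    """Toy unified namespace: logical paths resolve to storage-specific URIs."""

    def __init__(self):
        self._mounts = {}

    def mount(self, namespace_path, storage_uri):
        # Normalize so that "/s3/" and "/s3" name the same mount point.
        mount_point = namespace_path.rstrip("/") or "/"
        self._mounts[mount_point] = storage_uri.rstrip("/")

    def resolve(self, path):
        """Translate a logical path using the longest matching mount point."""
        best = None
        for mount_point in self._mounts:
            prefix = "/" if mount_point == "/" else mount_point + "/"
            if path == mount_point or path.startswith(prefix):
                if best is None or len(mount_point) > len(best):
                    best = mount_point
        if best is None:
            raise KeyError(f"no mount point covers {path!r}")
        relative = path[len(best):].lstrip("/")
        base = self._mounts[best]
        return f"{base}/{relative}" if relative else base


if __name__ == "__main__":
    table = MountTable()
    table.mount("/", "hdfs://namenode:8020/data")
    table.mount("/s3", "s3a://example-bucket/logs")
    print(table.resolve("/tables/users"))
    print(table.resolve("/s3/2017/events.log"))
```

An application only ever sees paths such as `/s3/2017/events.log`; which storage system actually holds the bytes is a detail of the mount table.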
ALLUXIO BENEFITS
Unification: New workflows across any data in any storage system
Performance: Orders of magnitude improvement in run time
Flexibility: Choice in compute and storage; grow each independently, buy only what is needed
WHY DATA SERVICES ON DC/OS?
1. On-demand provisioning
● Single command install of services
2. Simplified operations
● Runtime software upgrade
● Runtime application settings update
● Monitoring & metrics
● Managed persistent storage volumes
3. Elastic data infrastructure
● Data services and containerized apps share resources
● Deploy instances with different versions on the same infrastructure
● Resize instances
● Add more instances
ALLUXIO ON MESOSPHERE DC/OS
Fast, On-demand Unified Data at Memory Speed for Analytics
Alluxio: Unify Data at Memory Speed
Mesosphere DC/OS: Runs distributed apps anywhere as simply as running apps on your laptop
Any Infrastructure: Build apps once in DC/OS, and run anywhere
WHY ALLUXIO ON MESOSPHERE DC/OS?
● Without Mesosphere DC/OS, provisioning of infrastructure is tedious
○ Mesosphere DC/OS automates app & cluster provisioning, management & elastic scaling
● Alluxio brings
○ A unified view of data across disparate storage systems
○ High performance & predictable SLA for analytics workloads
● Benefits include:
○ Process data in your existing cluster faster with Spark and other analytics frameworks
○ Process data from hybrid cloud storage systems (HDFS, S3, on-prem object stores, etc.)
BIG DATA STACK WITH ALLUXIO ON MESOSPHERE DC/OS
Fast, On-demand Unified Data at Memory Speed for Analytics
Mesos
● Container Orchestration
● Management & Monitoring Tools
● Apps Universe
● Security
● Advanced Operations
● Multitenancy
● Adv. Network & Storage
Unifying Data at Memory Speed
WHAT HAPPENED?
● Alluxio scheduler (developed using the DC/OS SDK) launched as a Marathon application
○ Marathon manages and restarts the scheduler in case of failures
○ Scheduler consists of YAML + scripting
● Alluxio scheduler launched master and worker processes
○ Scheduler maintains the configured number of instances even when instances fail
● Configuration changes take effect on the fly
○ Scaled up the worker instances
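The behavior described above (keep the configured number of instances running, and react to configuration changes) follows a reconciliation pattern. The sketch below is a toy model of that pattern only. It is not the DC/OS SDK, which the slide describes as YAML plus scripting; the `reconcile` function, the task dictionaries, and the action tuples are hypothetical.

```python
from collections import Counter


def reconcile(desired, running):
    """Compare desired instance counts with healthy tasks and return actions.

    desired: mapping of role to configured instance count.
    running: list of task dicts with "role" and "healthy" keys.
    Returns a list of ("launch", role) or ("kill", role) tuples.
    """
    healthy = Counter(task["role"] for task in running if task["healthy"])
    actions = []
    for role, want in sorted(desired.items()):
        have = healthy[role]
        if have < want:
            # Replace failed instances or scale up to the new target.
            actions += [("launch", role)] * (want - have)
        elif have > want:
            # Scale down when the configured count was reduced.
            actions += [("kill", role)] * (have - want)
    return actions


if __name__ == "__main__":
    tasks = [
        {"role": "master", "healthy": True},
        {"role": "worker", "healthy": True},
        {"role": "worker", "healthy": False},  # a failed worker
    ]
    # One worker failed, so one replacement is requested.
    print(reconcile({"master": 1, "worker": 2}, tasks))
```

Scaling up the workers is the same code path as recovering from a failure: raising the configured count makes the healthy count fall short of the target, so the loop emits launch actions.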
GET STARTED TODAY
Read:
● Mesosphere Blog: http://ow.ly/ou0530ax9aM
● Alluxio Blog: http://ow.ly/ILOZ30ax8YE
Try it out:
● Install Alluxio from DC/OS Universe
Questions?