Real-time data analytics with Cassandra at iland

Real-time data analytics with C* at iland Cassandra Tech Day @ Houston, TX October 14th, 2014 Julien Anguenot (@anguenot)

description

This presentation, from the 2014 DataStax Cassandra Tech Day in Houston, TX, is a quick overview of how iland Internet Solutions leverages Cassandra as its sole data storage platform for performance metrics and configuration across multiple data centers in the US, EU and Asia. iland exposes these metrics to its customers through its ECS portal application, which covers a wealth of functionality, including visibility into resource consumption, billing, performance, alerts, the impact of change and other key areas. The platform also provides predictive analytics that help companies monitor performance, achieve consistency and anticipate growth requirements.

Transcript of Real-time data analytics with Cassandra at iland

Page 1: Real-time data analytics with Cassandra at iland

Real-time data analytics with C* at iland

Cassandra Tech Day @ Houston, TX
October 14th, 2014

Julien Anguenot (@anguenot)

Page 2: Real-time data analytics with Cassandra at iland

Agenda

• iland, iland platform and why C*?
• real-time data & domain constraints
• quick overview of iland C* deployment

Page 3: Real-time data analytics with Cassandra at iland

iland platform

Page 4: Real-time data analytics with Cassandra at iland

Julien Anguenot @ Cassandra Tech Day Houston 2014

iland Internet Solutions

• 19 years delivering IT services
• 8 years of delivering cloud infrastructure & disaster recovery expertise
• 8 ISO 27001 & SSAE16 global data centers
• 3 cloud-based specializations: Production; Test & Dev; DR

Page 5: Real-time data analytics with Cassandra at iland

iland platform essentially

• data warehouse running across multiple data centers
• monitoring (resource consumption / performance)
• billing
• alerting
• predictive analytics
• cloud management
• cloud services (backups, DR, etc.)
• desktop and mobile applications (iland portal app)

Page 6: Real-time data analytics with Cassandra at iland

Page 7: Real-time data analytics with Cassandra at iland

Page 8: Real-time data analytics with Cassandra at iland

Page 9: Real-time data analytics with Cassandra at iland

Page 10: Real-time data analytics with Cassandra at iland

Page 11: Real-time data analytics with Cassandra at iland

Why did we choose Cassandra?

• MySQL and MongoDB attempts were big failures
• write latency (constant-time writes)
• distributed nature (multi-data center)
• scalability, reliability, performance, availability
• sharding makes things complicated
• no master/slave means no SPOF
• simplicity

Page 12: Real-time data analytics with Cassandra at iland

real-time data & domain constraints

Page 13: Real-time data analytics with Cassandra at iland

Constraints

• write latency
• precision (used for billing)
• availability
• multi-data center
• tens of thousands of VMs
• agent-less
• pull (vs push)

Page 14: Real-time data analytics with Cassandra at iland

Pipeline

• collection of real-time data
• store
• aggregation
• rollups
• processing
• alerting
• reporting
• querying

Page 15: Real-time data analytics with Cassandra at iland

Data sources

• VMware infrastructure stack (each location)
  • vCloud Director, vCenter, vShield Manager
• network statistics
• AMQP, Syslog-ng, Web Services
• Salesforce
• Veeam, Zerto, more …

Page 16: Real-time data analytics with Cassandra at iland

Real-time metrics

• 20-second samples
• dozens of performance counters per entity (VM, VNIC, etc.)
• time series
  • (timestamp, value)
  • metadata
    • entity (network, vm, etc.)
    • unit, etc.
• then 1-minute, 1-hour, 1-day, 1-week and 1-month historical rollups
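The rollup step above can be sketched as follows. This is a minimal illustration, not iland's implementation: the function name `rollup`, the choice of min / average / max as aggregates, and the sample values are all assumptions made for the example. It groups (timestamp, value) samples into fixed-width time buckets, which is one common way to turn 20-second samples into 1-minute rollups.

```python
from collections import defaultdict
from statistics import mean


def rollup(samples, bucket_seconds):
    """Group (epoch_seconds, value) samples into fixed-width buckets.

    Returns a list of (bucket_start, min, avg, max) tuples sorted by time.
    The aggregates chosen here are an assumption for illustration.
    """
    buckets = defaultdict(list)
    for ts, value in samples:
        # Align each timestamp to the start of its bucket.
        bucket_start = ts - (ts % bucket_seconds)
        buckets[bucket_start].append(value)
    return [
        (start, min(vals), mean(vals), max(vals))
        for start, vals in sorted(buckets.items())
    ]


# Six 20-second samples of a hypothetical CPU counter (percent).
raw = [(0, 10.0), (20, 20.0), (40, 30.0), (60, 40.0), (80, 50.0), (100, 60.0)]
one_minute = rollup(raw, 60)
```

The same function could in principle be reused for the coarser levels by passing a larger `bucket_seconds`, although averaging already-averaged rollups is only exact when every bucket holds the same number of samples.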


Page 17: Real-time data analytics with Cassandra at iland

Overview of iland C* deployment

Page 18: Real-time data analytics with Cassandra at iland

iland’s C* cluster (1/2)

• one (1) C* cluster
• 6 data centers - 25+ nodes - one (1) keyspace

Page 19: Real-time data analytics with Cassandra at iland

iland’s C* cluster (2/2)

• each DC (1 or 2 racks of 3 nodes)
  • 3 nodes per location for writes
  • replication factor (RF) = 3
  • 3 nodes per location for API reads (Dallas, TX; London, UK; Singapore)
• each node
  • VM (vCenter powered) - Ubuntu 14.04 LTS w/ Cassandra base dpkg
  • 16GB of RAM / 16 vCPUs
  • currently: ~1TB of data per node
  • not using SSD (yet)
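To make the "one keyspace, several data centers, RF = 3" layout concrete, here is a small sketch that builds a CQL `CREATE KEYSPACE` statement using `NetworkTopologyStrategy`. The keyspace name `metrics`, the data center names, and the assumption that every data center uses the same replication factor are hypothetical; the slides give only the locations and RF = 3. The helper `local_quorum` shows the standard quorum arithmetic for a given replication factor.

```python
def create_keyspace_cql(keyspace, datacenters, rf=3):
    """Build a CREATE KEYSPACE statement with one RF entry per data center."""
    parts = ["'class': 'NetworkTopologyStrategy'"]
    parts += [f"'{dc}': {rf}" for dc in datacenters]
    options = ", ".join(parts)
    return (
        f"CREATE KEYSPACE IF NOT EXISTS {keyspace} "
        f"WITH replication = {{{options}}};"
    )


def local_quorum(rf):
    """Replicas that must respond for a quorum within one data center."""
    return rf // 2 + 1


# Hypothetical data center names; real names depend on the snitch configuration.
dcs = ["dallas", "reston", "la", "london", "manchester", "singapore"]
statement = create_keyspace_cql("metrics", dcs)
needed = local_quorum(3)
```

With RF = 3, a local quorum needs 2 of the 3 replicas, so reads and writes at that level can tolerate one unavailable node per data center.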


Page 20: Real-time data analytics with Cassandra at iland

Data centers and replication (1/2)

• US: Reston, VA; LA, CA; Dallas, TX
• Asia: Singapore
• EU: London, UK; Manchester, UK

Page 21: Real-time data analytics with Cassandra at iland

Data centers and replication (2/2)

[Diagram: iland core platform, iland ReST API, C* W and C* R nodes]

C* R only deployed in: Dallas, TX - London, UK - Singapore

Page 22: Real-time data analytics with Cassandra at iland

Access to data (read / write)

[Diagram: https://my.ilandcloud.com/, Citrix Netscaler (US), US / EU / Asia portals, US / EU / Asia APIs]

Page 23: Real-time data analytics with Cassandra at iland

At the application level?

• History
  • POC: C* 1.2 + Astyanax (2013/02)
  • V1: C* 2.0 + Astyanax w/ CQL3 over Thrift (2013/06)
  • Current version: C* 2.0 + DataStax CQL Java driver
• Java / JEE cluster (WildFly AS)
• AMQP / RabbitMQ
• Redis to cache “hot” data (read latency)
• Python & DataStax CQL driver for cluster / data upgrades
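The idea of caching "hot" data in front of Cassandra to cut read latency can be sketched as a read-through cache. The slides do not describe how iland wires Redis in, so everything below is an assumption: `TTLCache` is an in-memory stand-in so the example runs without a Redis server (its `get` / `setex` methods are loosely modeled on a Redis client), and `load_from_cassandra` is a hypothetical loader where a real one would run a CQL query.

```python
import time


class TTLCache:
    """In-memory stand-in for Redis so the sketch runs without a server."""

    def __init__(self, clock=time.monotonic):
        self._data = {}
        self._clock = clock

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            # Entry has expired: drop it and report a miss.
            del self._data[key]
            return None
        return value

    def setex(self, key, ttl_seconds, value):
        self._data[key] = (value, self._clock() + ttl_seconds)


def read_through(cache, key, load, ttl_seconds=60):
    """Return the cached value, or load it, cache it, and return it."""
    cached = cache.get(key)
    if cached is not None:
        return cached
    value = load(key)
    cache.setex(key, ttl_seconds, value)
    return value


calls = []


def load_from_cassandra(key):
    # Hypothetical loader: records the call instead of querying a cluster.
    calls.append(key)
    return f"rollup-for-{key}"


cache = TTLCache()
first = read_through(cache, "vm-42:cpu:1h", load_from_cassandra)
second = read_through(cache, "vm-42:cpu:1h", load_from_cassandra)
```

Only the first call reaches the loader; the second is served from the cache until the TTL expires, which suits rollups that change rarely once written.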


Page 24: Real-time data analytics with Cassandra at iland

Slides @ http://www.slideshare.net/anguenot/cassandra-tech-dayhou2014

http://www.iland.com/enterprise-cloud-services-portal/
@ilandcloud
@anguenot