Scaling Open edX with Kubernetes

Post on 13-Apr-2017

Scaling Open edX with Kubernetes

DevOpsDays Boston

9.15.2015

Who we are

Nate Aune

Morgan Robertson

What we’ll cover

● Background -- Open edX

● Introducing Kubernetes

● Kubernetes concepts

● Scaling + resiliency

● Open edX on Kubernetes

Open edX background

● edX: non-profit founded by MIT and Harvard

● 500+ courses, 5M students learning on edX.org

● edX released Open edX in June 2013

● Stanford, MongoDB, Salesforce, Google, Microsoft, McKinsey, Johnson & Johnson, Smithsonian

Open edX - a catalyst for innovation

212 Contributors

One of the fastest-growing open source projects on GitHub

Technical components

LMS/CMS (Django/Python)

Forum (Sinatra/Ruby)

User DB (MySQL)

Course DB (MongoDB)

Tasks (Celery/RabbitMQ)

Caching (Memcached)

Proxy (Nginx)

Search (Elasticsearch)

MapReduce (Hadoop)

Hosting infrastructure

S3 for serving:

● static assets

● grade downloads

● certificate downloads

● videos (for mobile)

● Load balancer

● Application server(s)

● Database server(s)

● Search server

● Utility server (tasks)

● Caching server

● Hadoop cluster

Typical scalable deployment of Open edX on AWS

Introducing Kubernetes

● Scheduling + orchestration layer for containerized applications

● Abstracts your infrastructure

● Open source project by Google

● Production-ready as of July 2015

Kubernetes architecture

Kubernetes vs. the Docker triad

Kubernetes Swarm Compose Machine

Scheduling ✔ ✔

Service discovery ✔ ✅

Container scaling ✔ ✔

Machine provisioning ✅ ✔

Health checking ✔

Secret management ✔

Production-ready ✔

Kubernetes core concepts

● Pods

● Services

● Replication controllers

Pods

● Group of containers + volumes scheduled together

● Smallest deployable unit

● Containers share certain resources including network stack
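
As a rough illustration of the pod bullets above, the sketch below builds a two-container pod manifest as a Python dictionary and prints it as JSON, which Kubernetes accepts alongside YAML. The pod name, labels, images, ports, and mount paths are hypothetical and are not taken from the talk.

```python
import json

# Hypothetical pod: an LMS container and an Nginx proxy scheduled together.
# All names, images, ports, and paths below are illustrative assumptions.
pod = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "lms", "labels": {"app": "lms", "tier": "web"}},
    "spec": {
        # The volume is declared once at pod level and mounted by both containers.
        "volumes": [{"name": "static-assets", "emptyDir": {}}],
        "containers": [
            {
                "name": "lms",
                "image": "example/edx-lms:latest",
                "ports": [{"containerPort": 8000}],
                "volumeMounts": [
                    {"name": "static-assets", "mountPath": "/edx/static"}
                ],
            },
            {
                "name": "nginx",
                "image": "nginx",
                "ports": [{"containerPort": 80}],
                "volumeMounts": [
                    {"name": "static-assets", "mountPath": "/usr/share/nginx/html"}
                ],
            },
        ],
    },
}


def to_manifest(obj):
    """Serialize a manifest dictionary to JSON text."""
    return json.dumps(obj, indent=2, sort_keys=True)


if __name__ == "__main__":
    print(to_manifest(pod))
```

Because the containers in a pod share a network stack, the Nginx container in this sketch could reach the LMS container on localhost:8000.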

Services

● Endpoint for a set of pods

● IP address, port, and label selectors

● Use round-robin routing to direct traffic to backend pods
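
The two ideas above, label selectors and round-robin routing, can be sketched in a few lines of Python. This is a toy model for intuition only, not how the Kubernetes proxy is implemented; the Service class, the pod dictionaries, and the IP addresses are all hypothetical.

```python
import itertools


def matches(selector, labels):
    """True when every key/value pair in the selector appears in the labels."""
    return all(labels.get(key) == value for key, value in selector.items())


class Service:
    """Toy service: one stable name in front of the pods its selector matches."""

    def __init__(self, name, selector, pods):
        self.name = name
        self.selector = selector
        # Only pods whose labels satisfy the selector become backends.
        self.backends = [p for p in pods if matches(selector, p["labels"])]
        self._cycle = itertools.cycle(self.backends)

    def route(self):
        """Return the next backend pod in round-robin order."""
        if not self.backends:
            raise RuntimeError("no pods match selector %r" % (self.selector,))
        return next(self._cycle)


# Hypothetical pods; names and addresses are made up for the example.
DEMO_PODS = [
    {"name": "lms-1", "ip": "10.1.0.4", "labels": {"app": "lms"}},
    {"name": "lms-2", "ip": "10.1.0.5", "labels": {"app": "lms"}},
    {"name": "forum-1", "ip": "10.1.0.6", "labels": {"app": "forum"}},
]

if __name__ == "__main__":
    lms = Service("lms", {"app": "lms"}, DEMO_PODS)
    for _ in range(4):
        print(lms.route()["name"])
```

The forum pod never receives LMS traffic because its labels do not match the selector, which is what lets one cluster host several services side by side.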

Services + Pods

Replication Controllers

● Manage pod lifecycles for a number of replicas

● Provide scaling + fault tolerance

● Use label selectors
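
One way to picture a replication controller is as a loop that compares the desired replica count with the pods its selector currently matches, then creates or removes pods to close the gap. The sketch below models a single pass of such a loop; the function names and the pod dictionaries are hypothetical, and the real controller is considerably more involved.

```python
import itertools

_ids = itertools.count(1)


def matches(selector, labels):
    """True when every key/value pair in the selector appears in the labels."""
    return all(labels.get(key) == value for key, value in selector.items())


def make_pod(labels):
    """Create a new (hypothetical) pod record carrying the given labels."""
    return {"name": "pod-%d" % next(_ids), "labels": dict(labels), "alive": True}


def reconcile(replicas, selector, pods):
    """Return the pod list after one reconciliation pass.

    Pods that do not match the selector are left untouched. Matching pods
    that have died are dropped, new pods are created until the desired
    count is reached, and any surplus beyond that count is trimmed.
    """
    others = [p for p in pods if not matches(selector, p["labels"])]
    managed = [p for p in pods if matches(selector, p["labels"]) and p["alive"]]
    while len(managed) < replicas:
        # In this sketch the new pod's labels are simply the selector itself.
        managed.append(make_pod(selector))
    return others + managed[:replicas]


if __name__ == "__main__":
    pods = [
        {"name": "lms-a", "labels": {"app": "lms"}, "alive": True},
        {"name": "lms-b", "labels": {"app": "lms"}, "alive": False},
        {"name": "forum-a", "labels": {"app": "forum"}, "alive": True},
    ]
    for pod in reconcile(3, {"app": "lms"}, pods):
        print(pod["name"], pod["labels"])
```

Scaling up or down in this model is just a call to reconcile with a different replica count, and recovery from a failed pod is the same code path as scaling.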

Pods + Services + Replication Controllers

Scaling with Kubernetes

● Replication controllers scale pods

● Services provide a single endpoint for a group of pods

● The Kubernetes master schedules pods across nodes

Resiliency with Kubernetes

● Replication controllers ensure a number of pods are running

● Services provide load balancing

● Health checks allow bad pods to be ignored/removed
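
The health-check bullet can be illustrated with a small tracker that stops listing a pod as ready after several consecutive failed probes and lists it again once a probe succeeds. This is a simplified sketch; the HealthTracker class and its failure_threshold parameter are assumptions made for the example rather than Kubernetes settings quoted from the talk.

```python
from collections import defaultdict


class HealthTracker:
    """Track consecutive probe failures per pod (illustrative only)."""

    def __init__(self, failure_threshold=3):
        self.failure_threshold = failure_threshold
        self.failures = defaultdict(int)

    def record(self, pod, ok):
        """Record one probe result; a success resets the failure count."""
        if ok:
            self.failures[pod] = 0
        else:
            self.failures[pod] += 1

    def is_ready(self, pod):
        """A pod is ready while its failure streak is below the threshold."""
        return self.failures[pod] < self.failure_threshold

    def ready(self, pods):
        """Return the pods that should still receive traffic, in order."""
        return [p for p in pods if self.is_ready(p)]


if __name__ == "__main__":
    tracker = HealthTracker()
    pods = ["lms-1", "lms-2"]
    for _ in range(3):
        tracker.record("lms-2", ok=False)
    print(tracker.ready(pods))
```

A service that routes only to the pods returned by ready() would ignore the failing pod, while a replication controller could replace it.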

Open edX on Kubernetes

● Goals:

○ Multi-tenant

○ Scalable + resilient

The challenge

Architecture

Monitoring with Sysdig

Sysdig drill-down

Lessons learned

● Containers should be stateless

● Put initialization tasks into separate pods that run once

● Services can be used to abstract non-containerized components
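
The last two lessons might look like the manifests sketched below, again written as Python dictionaries and printed as JSON. The first pair is a Service with no selector plus a matching Endpoints object, which is one way to put a stable name in front of a server running outside the cluster. The third is a pod with restartPolicy set to Never, so its container is not restarted after it exits. The IP address, names, image, and command are hypothetical placeholders, not the configuration used in the talk.

```python
import json

# Hypothetical address of a MySQL server running outside the cluster.
EXTERNAL_MYSQL_IP = "10.0.0.50"

# A Service without a selector: its endpoints are supplied by hand.
mysql_service = {
    "apiVersion": "v1",
    "kind": "Service",
    "metadata": {"name": "mysql"},
    "spec": {"ports": [{"port": 3306, "targetPort": 3306}]},
}

# An Endpoints object with the same name, pointing at the external server.
mysql_endpoints = {
    "apiVersion": "v1",
    "kind": "Endpoints",
    "metadata": {"name": "mysql"},
    "subsets": [
        {"addresses": [{"ip": EXTERNAL_MYSQL_IP}], "ports": [{"port": 3306}]}
    ],
}

# A run-once pod for an initialization task. The command is a placeholder;
# a real task (for example, database migrations) would go here instead.
init_pod = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "lms-init"},
    "spec": {
        "restartPolicy": "Never",
        "containers": [
            {
                "name": "init",
                "image": "example/edx-lms:latest",
                "command": ["/bin/sh", "-c", "echo 'run initialization here'"],
            }
        ],
    },
}

if __name__ == "__main__":
    for manifest in (mysql_service, mysql_endpoints, init_pod):
        print(json.dumps(manifest, indent=2, sort_keys=True))
```

Application pods would then connect to the name mysql on port 3306 in the same way whether the database is containerized or not.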

Conclusion

● We’re still learning, but...

○ Kubernetes is a promising technology for providing both scalability and resiliency

More info

Open edX - http://open.edx.org

Kubernetes - http://kubernetes.io

Google Container Engine - http://cloud.google.com/container-engine

Thank you for your time!

Questions?

Slides: http://bit.ly/open-edx-kubernetes

nate@appsembler.com

morgan@appsembler.com

@appsembler