2014 Ceph NYLUG Talk

2014 New York, NY Ceph @ NYLUG

Talk from 05 June 2014 NYLUG meeting at Bloomberg NYC. Short history of where Ceph came from, an architectural overview, and the current state of the community.

Transcript of 2014 Ceph NYLUG Talk

Page 1: 2014 Ceph NYLUG Talk

2014 New York, NY Ceph @ NYLUG

Page 2: 2014 Ceph NYLUG Talk

Copyright © 2014 by Inktank | Private and Confidential

WHO?

Patrick McGarry

Director, Community – Red Hat

/. -> ALU -> P4 -> Inktank

scuttlemonkey

Lies and misinformation!

Page 3: 2014 Ceph NYLUG Talk

AGENDA

INDUSTRY MUSINGS

INTRO TO CEPH

ARCHITECTURE

COMMUNITY

INKTANK CEPH ENTERPRISE

Page 4: 2014 Ceph NYLUG Talk

THE FORECAST

By 2020, over 15 ZB of data will be stored. 1.5 ZB are stored today.

Page 5: 2014 Ceph NYLUG Talk

THE PROBLEM

Existing systems don’t scale

Increasing cost and complexity

Need to invest in new platforms ahead of time

[Chart, 2010 to 2020: Growth of data vs. IT Storage Budget]

Page 6: 2014 Ceph NYLUG Talk

THE SOLUTION

PAST: SCALE UP

FUTURE: SCALE OUT

Page 7: 2014 Ceph NYLUG Talk

INTRO TO CEPH

Page 8: 2014 Ceph NYLUG Talk

HISTORICAL TIMELINE

2004: Project Starts at UCSC

2006: Open Source

2010: Mainline Linux Kernel

2011: OpenStack Integration

MAY 2012: Launch of Inktank

SEPT 2012: Production Ready Ceph

2012: CloudStack Integration

2013: Xen Integration

OCT 2013: Inktank Ceph Enterprise Launch

FEB 2014: RHEL-OSP & RHEV Support

Page 9: 2014 Ceph NYLUG Talk

A STORAGE REVOLUTION

PROPRIETARY HARDWARE

PROPRIETARY SOFTWARE

SUPPORT & MAINTENANCE

STANDARD HARDWARE

OPEN SOURCE SOFTWARE

ENTERPRISE PRODUCTS & SERVICES

[Diagram: both stacks are drawn with rows of COMPUTER / DISK nodes]

Page 10: 2014 Ceph NYLUG Talk

ARCHITECTURE

Page 11: 2014 Ceph NYLUG Talk

ARCHITECTURAL COMPONENTS

RGW: A web services gateway for object storage, compatible with S3 and Swift

RBD: A reliable, fully-distributed block device with cloud platform integration

CEPHFS: A distributed file system with POSIX semantics and scale-out metadata management

LIBRADOS: A library allowing apps to directly access RADOS (C, C++, Java, Python, Ruby, PHP)

RADOS: A software-based, reliable, autonomous, distributed object store comprised of self-healing, self-managing, intelligent storage nodes and lightweight monitors

[Diagram labels: APP, HOST/VM, CLIENT]

Page 12: 2014 Ceph NYLUG Talk

ARCHITECTURAL COMPONENTS

Page 13: 2014 Ceph NYLUG Talk

OBJECT STORAGE DAEMONS

[Diagram: four OSDs, each paired with a file system (FS: btrfs, xfs, or ext4) and a DISK; three monitors (M)]

Page 14: 2014 Ceph NYLUG Talk

RADOS CLUSTER

[Diagram: an APPLICATION talking to a RADOS CLUSTER; five monitors (M)]

Page 15: 2014 Ceph NYLUG Talk

RADOS COMPONENTS

OSDs:

• 10s to 10000s in a cluster

• One per disk (or one per SSD, RAID group…)

• Serve stored objects to clients

• Intelligently peer for replication & recovery

Monitors (M):

• Maintain cluster membership and state

• Provide consensus for distributed decision-making

• Small, odd number

• These do not serve stored objects to clients
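The "small, odd number" advice for monitors follows from majority voting. The sketch below shows only that arithmetic; it is not Ceph's monitor code, and `quorum_size` and `tolerated_failures` are hypothetical helper names introduced for illustration.

```python
def quorum_size(monitors: int) -> int:
    """Smallest strict majority among `monitors` voters."""
    if monitors < 1:
        raise ValueError("need at least one monitor")
    return monitors // 2 + 1


def tolerated_failures(monitors: int) -> int:
    """How many monitors can be down while a majority can still form."""
    return monitors - quorum_size(monitors)
```

Under this arithmetic, 3 and 4 monitors both tolerate a single failure, so an even count adds a machine without adding resilience.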

Page 16: 2014 Ceph NYLUG Talk

WHERE DO OBJECTS LIVE?

[Diagram: an APPLICATION holding an OBJECT, with "??" over where in the cluster it should go; three monitors (M)]

Page 17: 2014 Ceph NYLUG Talk

A METADATA SERVER?

[Diagram: the APPLICATION needs two hops, marked 1 and 2, to reach its data in the cluster; three monitors (M)]

Page 18: 2014 Ceph NYLUG Talk

CALCULATED PLACEMENT

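[Diagram: the APPLICATION holds an object named "F"; storage nodes are labelled with the name ranges A-G, H-N, O-T, U-Z; three monitors (M)]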
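The range scheme on this slide can be sketched in a few lines. This is a toy illustration of calculated placement by first letter, assuming four nodes with the hypothetical names `node-1` to `node-4`; it is not how Ceph places data.

```python
import bisect

# Inclusive upper bound of each name range from the slide (A-G, H-N, O-T, U-Z)
# and the node that owns it. The node names are hypothetical.
RANGE_ENDS = ["G", "N", "T", "Z"]
NODES = ["node-1", "node-2", "node-3", "node-4"]


def place(object_name: str) -> str:
    """Return the node whose letter range contains the object's first letter."""
    first = object_name[:1].upper()
    if not ("A" <= first <= "Z"):
        raise ValueError(f"name must start with a letter A-Z: {object_name!r}")
    # bisect_left finds the first range whose upper bound is >= the letter.
    return NODES[bisect.bisect_left(RANGE_ENDS, first)]
```

Any client can compute the location without asking a central server, but the fixed ranges become unbalanced if names cluster around a few letters, and adding a node means redrawing the ranges.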

Page 19: 2014 Ceph NYLUG Talk

EVEN BETTER: CRUSH!

[Diagram: an OBJECT broken into pieces (drawn as bit patterns 10, 01, 11) that are spread across the nodes of the RADOS CLUSTER]

Page 20: 2014 Ceph NYLUG Talk

CRUSH IS A QUICK CALCULATION

[Diagram: an OBJECT and the pieces (bit patterns 10, 01, 11) it maps to in the RADOS CLUSTER]

Page 21: 2014 Ceph NYLUG Talk

CRUSH: DYNAMIC DATA PLACEMENT

CRUSH:

• Pseudo-random placement algorithm

  - Fast calculation, no lookup

  - Repeatable, deterministic

• Statistically uniform distribution

• Stable mapping

  - Limited data migration on change

• Rule-based configuration

  - Infrastructure topology aware

  - Adjustable replication

  - Weighting
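Several of these properties (calculation instead of lookup, determinism, weighting, limited migration on change) can be demonstrated with a much simpler technique. The sketch below uses weighted rendezvous (highest-random-weight) hashing as a stand-in. It is not the CRUSH algorithm and does not model topology rules; the node names and the `place` function are assumptions made for illustration.

```python
import hashlib
import math


def _unit_hash(key: str) -> float:
    """Deterministically map a string to a float strictly between 0 and 1."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    value = int.from_bytes(digest[:8], "big") >> 12  # keep 52 bits
    return (value + 0.5) / 2**52


def place(object_name: str, weights: dict, replicas: int = 2) -> list:
    """Pick `replicas` distinct nodes for an object, favouring heavier nodes."""
    if replicas > len(weights):
        raise ValueError("not enough nodes for the requested replica count")

    def score(node: str) -> float:
        # log of a value in (0, 1) is negative, so the score is positive
        # and grows with the node's weight.
        return -weights[node] / math.log(_unit_hash(f"{object_name}/{node}"))

    return sorted(weights, key=score, reverse=True)[:replicas]
```

Every client that knows the same weight map computes the same answer. Removing a node only moves objects that were placed on that node, which is the "limited data migration on change" idea in miniature.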

Page 22: 2014 Ceph NYLUG Talk

ARCHITECTURAL COMPONENTS

Page 23: 2014 Ceph NYLUG Talk

ACCESSING A RADOS CLUSTER

[Diagram: an APPLICATION linked with LIBRADOS exchanges an OBJECT with the RADOS CLUSTER over a socket; three monitors (M)]

Page 24: 2014 Ceph NYLUG Talk

LIBRADOS: RADOS ACCESS FOR APPS

LIBRADOS:

• Direct access to RADOS for applications

• C, C++, Python, PHP, Java, Erlang

• Direct access to storage nodes

• No HTTP overhead
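As a rough illustration of direct access from an application, here is a minimal sketch using Ceph's Python binding (`rados`). The configuration path, the pool name `data`, and the object name are assumptions, and `main` needs a reachable cluster with the binding installed.

```python
def store_and_fetch(ioctx, name: str, data: bytes) -> bytes:
    """Write one object through an open I/O context, then read it back."""
    ioctx.write_full(name, data)          # replace the whole object
    return ioctx.read(name, len(data))    # read back up to len(data) bytes


def main(conffile: str = "/etc/ceph/ceph.conf", pool: str = "data") -> None:
    # Imported here so store_and_fetch can be exercised without a cluster.
    import rados

    cluster = rados.Rados(conffile=conffile)  # path is an assumption
    cluster.connect()
    try:
        ioctx = cluster.open_ioctx(pool)      # pool name is an assumption
        try:
            print(store_and_fetch(ioctx, "hello-object", b"hello from librados"))
        finally:
            ioctx.close()
    finally:
        cluster.shutdown()
```

Calling `main()` on a host with a valid Ceph configuration and keyring should write and read one object with no HTTP layer in between.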

Page 25: 2014 Ceph NYLUG Talk

ARCHITECTURAL COMPONENTS

Page 26: 2014 Ceph NYLUG Talk

THE RADOS GATEWAY

[Diagram: two APPLICATIONs speak REST to two RADOSGW instances; each RADOSGW uses LIBRADOS to reach the RADOS CLUSTER over a socket; three monitors (M)]

Page 27: 2014 Ceph NYLUG Talk

RADOSGW MAKES RADOS WEBBY

RADOSGW:

• REST-based object storage proxy

• Uses RADOS to store objects

• API supports buckets, accounts

• Usage accounting for billing

• Compatible with S3 and Swift applications

Page 28: 2014 Ceph NYLUG Talk

ARCHITECTURAL COMPONENTS

Page 29: 2014 Ceph NYLUG Talk

STORING VIRTUAL DISKS

[Diagram: a VM runs on a HYPERVISOR that uses LIBRBD to reach the RADOS CLUSTER; two monitors (M)]

Page 30: 2014 Ceph NYLUG Talk

SEPARATE COMPUTE FROM STORAGE

[Diagram: two HYPERVISORs, each with LIBRBD, attached to the same RADOS CLUSTER; one VM; two monitors (M)]

Page 31: 2014 Ceph NYLUG Talk

KERNEL MODULE FOR MAX FLEXIBLE!

[Diagram: a LINUX HOST uses KRBD to reach the RADOS CLUSTER; two monitors (M)]

Page 32: 2014 Ceph NYLUG Talk

RBD STORES VIRTUAL DISKS

RADOS BLOCK DEVICE:

• Storage of disk images in RADOS

• Decouples VMs from host

• Images are striped across the cluster (pool)

• Snapshots

• Copy-on-write clones

• Support in:

  - Mainline Linux Kernel (2.6.39+)

  - Qemu/KVM, native Xen coming soon

  - OpenStack, CloudStack, Nebula, Proxmox
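To make "images are striped across the cluster" concrete, the sketch below maps a byte offset in a disk image to the fixed-size object that holds it. It assumes simple striping with a 4 MiB object size; the object naming scheme and the helper names are illustrative only, not RBD's actual on-disk format.

```python
OBJECT_SIZE = 4 * 1024 * 1024  # 4 MiB per object, an assumption for this sketch


def locate(offset: int, object_size: int = OBJECT_SIZE) -> tuple:
    """Map a byte offset in an image to (object index, offset inside that object)."""
    if offset < 0:
        raise ValueError("offset must be non-negative")
    return divmod(offset, object_size)


def object_name(image_prefix: str, index: int) -> str:
    """Build a name for the index-th object (hypothetical naming scheme)."""
    return f"{image_prefix}.{index:016x}"


def objects_for_range(offset: int, length: int, object_size: int = OBJECT_SIZE) -> range:
    """Indices of every object touched by `length` bytes starting at `offset`."""
    if length <= 0:
        return range(0)
    first = offset // object_size
    last = (offset + length - 1) // object_size
    return range(first, last + 1)
```

Because each object is placed independently, a large image ends up spread over many storage nodes, and a single I/O only touches the few objects that `objects_for_range` returns.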

Page 33: 2014 Ceph NYLUG Talk

ARCHITECTURAL COMPONENTS

Page 34: 2014 Ceph NYLUG Talk

SEPARATE METADATA SERVER

[Diagram: a LINUX HOST with the KERNEL MODULE exchanges data (0110) and metadata separately with the RADOS CLUSTER; three monitors (M)]

Page 35: 2014 Ceph NYLUG Talk

SCALABLE METADATA SERVERS

METADATA SERVER

• Manages metadata for a POSIX-compliant shared filesystem

  - Directory hierarchy

  - File metadata (owner, timestamps, mode, etc.)

• Stores metadata in RADOS

• Does not serve file data to clients

• Only required for shared filesystem

Page 36: 2014 Ceph NYLUG Talk

CEPH AND OPENSTACK

[Diagram: the OPENSTACK components KEYSTONE, SWIFT, CINDER, GLANCE and NOVA connect to the RADOS CLUSTER through RADOSGW (on LIBRADOS), LIBRBD, and a HYPERVISOR with LIBRBD; two monitors (M)]

Page 37: 2014 Ceph NYLUG Talk
Page 38: 2014 Ceph NYLUG Talk

Ceph Developer Summit

• Recent: “Giant”

• March 04-05

• wiki.ceph.com

• Virtual (irc, hangout, pad, blueprint, youtube)

• 2 days (soon to be 3?)

• Discuss all work

• Recruit for your projects!

Page 39: 2014 Ceph NYLUG Talk

New Contribute Page

• http://ceph.com/community/Contribute

• Source tree

• Issues

• Share experiences

• Standups

• One-stop shop

Page 40: 2014 Ceph NYLUG Talk

New Ceph Wiki

Page 41: 2014 Ceph NYLUG Talk

Google Summer of Code 2014

• Accepted as a mentoring organization

• 8 mentors from Inktank & Community

• http://ceph.com/gsoc2014/

• 2 student proposals accepted

• Hope to turn this into academic outreach

Page 42: 2014 Ceph NYLUG Talk

Ceph Days

• inktank.com/cephdays

• Recently: London, Frankfurt, NYC, Santa Clara

• Aggressive program

• Upcoming: Sunnyvale, Austin, Boston, Kuala Lumpur

Page 43: 2014 Ceph NYLUG Talk

Meetups

• Community organized

• World wide

• Wiki

• Ceph-community

• Goodies available

• Logistical support

• Drinkup to tradeshow

Page 44: 2014 Ceph NYLUG Talk

Ceph Foundation

• We haven’t forgotten!

• Looking for potential founding members

• Especially important to keep the IP clean

Page 45: 2014 Ceph NYLUG Talk

Coordinated Efforts

• Always need help

• CentOS SIG

• OCP

• Xen

• Hadoop

• OpenStack

• CloudStack

• Ganeti

• Many more!

Page 46: 2014 Ceph NYLUG Talk

http://metrics.ceph.com

Page 47: 2014 Ceph NYLUG Talk

THE PRODUCT

Page 48: 2014 Ceph NYLUG Talk

INKTANK CEPH ENTERPRISE: WHAT’S INSIDE?

Ceph Object and Ceph Block

Calamari

Enterprise Plugins (2014)

Support Services

Subscription-based

Priced on capacity

Single price for all protocols

Page 49: 2014 Ceph NYLUG Talk
Page 50: 2014 Ceph NYLUG Talk

ROADMAP: INKTANK CEPH ENTERPRISE

Releases: 1.2 (Ceph 0.77 Firefly, April 2014), 2.0 (Ceph 0.87 “H-Release”, September 2014), 2015

Tracks: CEPH, CALAMARI, PLUGINS

Roadmap items, in slide order:

Erasure Coding

RHEL7 Support

Cache Tiering

User Quotas

RADOS Management

Analytics Hosted/SaaS

SNMP, Hyper-V

CephFS

Ubuntu 14.04 Support

VMware

HDFS Support

iSCSI

Intelligent Objects

QoS

Page 51: 2014 Ceph NYLUG Talk

RELEASE SCHEDULE

[Timeline chart: 2013 Q3 through 2015 Q2]

Releases shown: Dumpling, Emperor, Firefly, Giant, H, I, J

Inktank Ceph Enterprise v1.1 (Dumpling LTS until May 2015)

Inktank Ceph Enterprise v1.2 (Firefly LTS until November 2015)

Page 52: 2014 Ceph NYLUG Talk

GETTING STARTED WITH CEPH

• Read about the latest version of Ceph. The latest stuff is always at http://ceph.com/get

• Deploy a test cluster using ceph-deploy. Read the quick-start guide at http://ceph.com/qsg

• Read the rest of the docs! Find docs for the latest release at http://ceph.com/docs

• Ask for help when you get stuck! Community volunteers are waiting for you at http://ceph.com/help

Page 53: 2014 Ceph NYLUG Talk

THANK YOU!

Patrick McGarry

Director, Community

Red Hat

[email protected]

@scuttlemonkey
