New Features for Ceph with Cinder and Beyond

Transcript of New Features for Ceph with Cinder and Beyond

Page 1: New Features for Ceph with Cinder and Beyond

New Features for Ceph with Cinder and Beyond

Page 3: New Features for Ceph with Cinder and Beyond

Why Ceph?

• Low cost

• Flexible

• Scalable

• Open source

Page 4: New Features for Ceph with Cinder and Beyond

RADOS: A reliable, autonomous, distributed object store comprised of self-healing, self-managing, intelligent storage nodes

LIBRADOS: A library allowing apps to directly access RADOS, with support for C, C++, Java, Python, Ruby, and PHP

RBD: A reliable and fully-distributed block device, with a Linux kernel client and a QEMU/KVM driver

CEPH FS: A POSIX-compliant distributed file system, with a Linux kernel client and support for FUSE

RADOSGW: A bucket-based REST gateway, compatible with S3 and Swift

[Diagram labels above the stack: APP, APP, HOST/VM, CLIENT]

Page 5: New Features for Ceph with Cinder and Beyond

[Diagram: each OSD sits on a local filesystem (FS: btrfs, xfs, or ext4) on top of a DISK; three monitors (M) stand alongside]

Page 6: New Features for Ceph with Cinder and Beyond

[Diagram: three monitors (M) and a HUMAN]

Page 7: New Features for Ceph with Cinder and Beyond

Monitors:

• Maintain the cluster map

• Provide consensus for distributed decision-making

• Use an odd number of monitors

• Do not serve stored objects to clients

OSDs:

• One per disk (recommended)

• At least three in a cluster

• Serve stored objects to clients

• Intelligently peer to perform replication tasks

• Support object classes
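The monitor bullets above rest on majority voting. A minimal sketch, assuming a plain strict-majority quorum (the helper names are illustrative and are not Ceph APIs), shows why an odd monitor count is the usual choice: a fourth monitor does not let the cluster survive any more failures than three do.

```python
def quorum_size(monitors: int) -> int:
    """Smallest strict majority of `monitors` voters."""
    return monitors // 2 + 1


def tolerated_failures(monitors: int) -> int:
    """How many monitors can fail while a majority still remains."""
    return monitors - quorum_size(monitors)


# Columns: monitor count, quorum size, failures tolerated.
for count in (1, 3, 4, 5):
    print(count, quorum_size(count), tolerated_failures(count))
```

Going from three to four monitors raises the quorum size from 2 to 3 while the number of tolerated failures stays at 1.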

Page 8: New Features for Ceph with Cinder and Beyond

[Diagram: an APP and twelve nodes labeled DC]

Page 10: New Features for Ceph with Cinder and Beyond

[Diagram: an APP and twelve nodes labeled DC, with name ranges A-G, H-N, O-T, and U-Z, an object F, and a * marker]

Page 11: New Features for Ceph with Cinder and Beyond

hash(object name) % num pg

CRUSH(pg, cluster state, rule set)
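The two expressions above describe a two-step lookup: an object name is hashed to a placement group, and the placement group is then mapped to storage nodes. The following is a minimal sketch of that shape only. SHA-1 is a stand-in for Ceph's own hash function, and `toy_crush` is a hypothetical placeholder for the real CRUSH computation.

```python
import hashlib


def placement_group(object_name: str, num_pg: int) -> int:
    """Step 1: hash(object name) % num pg, using SHA-1 as a stand-in hash."""
    digest = hashlib.sha1(object_name.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % num_pg


def toy_crush(pg, osds=("osd.0", "osd.1", "osd.2", "osd.3"), replicas=2):
    """Hypothetical stand-in for CRUSH(pg, cluster state, rule set)."""
    start = pg % len(osds)
    return [osds[(start + i) % len(osds)] for i in range(replicas)]


def locate(object_name, num_pg, crush):
    """Step 2: hand the placement group to a CRUSH-like function."""
    pg = placement_group(object_name, num_pg)
    return pg, crush(pg)


pg, targets = locate("my-object", num_pg=128, crush=toy_crush)
print(pg, targets)
```

Because both steps are pure functions of their inputs, any client that knows the cluster state can compute the same answer without asking a central lookup service.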

Page 13: New Features for Ceph with Cinder and Beyond

CRUSH

•  Pseudo-random placement algorithm

•  Ensures even distribution

•  Repeatable, deterministic

•  Rule-based configuration

   •  Replica count

   •  Infrastructure topology

   •  Weighting
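The properties listed above (deterministic, weighted, replica-count and topology aware) can be illustrated with a much simpler technique. The sketch below uses weighted rendezvous hashing, which is not the CRUSH algorithm; the cluster layout, host names, and weights are made up for the example.

```python
import hashlib
import math


def _unit_hash(pg: int, osd: str) -> float:
    """Deterministic pseudo-random number strictly between 0 and 1."""
    digest = hashlib.sha256(f"{pg}:{osd}".encode("utf-8")).digest()
    value = int.from_bytes(digest[:8], "big") >> 12  # keep 52 bits
    return (value + 0.5) / 2**52


def place(pg: int, osds: dict, replicas: int) -> list:
    """Pick `replicas` OSDs on distinct hosts, favouring higher weights.

    `osds` maps an OSD name to a (host, weight) pair.
    """
    # Weighted rendezvous score: larger weight makes a high score more likely.
    ranked = sorted(
        osds,
        key=lambda name: -osds[name][1] / math.log(_unit_hash(pg, name)),
        reverse=True,
    )
    chosen, used_hosts = [], set()
    for name in ranked:
        host = osds[name][0]
        if host not in used_hosts:  # topology rule: one replica per host
            chosen.append(name)
            used_hosts.add(host)
        if len(chosen) == replicas:
            break
    return chosen


cluster = {
    "osd.0": ("host-a", 1.0),
    "osd.1": ("host-a", 1.0),
    "osd.2": ("host-b", 2.0),
    "osd.3": ("host-c", 1.0),
}
print(place(7, cluster, replicas=2))
```

Calling `place` twice with the same arguments always returns the same list, which is the "repeatable, deterministic" property the slide refers to.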

Page 14: New Features for Ceph with Cinder and Beyond

[Diagram: a CLIENT and "??"]

Page 19: New Features for Ceph with Cinder and Beyond

[Diagram: an APP linked with LIBRADOS talks to the cluster (monitors, M) over the native protocol]

Page 20: New Features for Ceph with Cinder and Beyond

LIBRADOS

•  Provides direct access to RADOS for applications

•  C, C++, Python, PHP, Java

• No HTTP overhead
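As an illustration of direct access, here is a minimal sketch using the Python `rados` bindings. The pool name `data`, the object name `greeting`, and the configuration path are assumptions for the example, and `demo` needs a reachable cluster to run.

```python
def roundtrip(cluster, pool, name, payload):
    """Write one object and read it back through an open cluster handle.

    `cluster` is expected to behave like a connected rados.Rados object.
    """
    ioctx = cluster.open_ioctx(pool)
    try:
        ioctx.write_full(name, payload)  # replace the whole object
        return ioctx.read(name)          # reads up to 8192 bytes by default
    finally:
        ioctx.close()


def demo(conffile="/etc/ceph/ceph.conf", pool="data"):
    """Run the round trip against a real cluster (needs python-rados)."""
    import rados  # imported lazily so the sketch loads without the bindings

    cluster = rados.Rados(conffile=conffile)
    cluster.connect()
    try:
        return roundtrip(cluster, pool, "greeting", b"hello")
    finally:
        cluster.shutdown()
```

The application talks to the storage nodes through the library itself, with no gateway or HTTP layer in between.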

Page 22: New Features for Ceph with Cinder and Beyond

[Diagram: APPs talk REST to RADOSGW instances, each of which uses LIBRADOS to reach the cluster (monitors, M) over the native protocol]

Page 23: New Features for Ceph with Cinder and Beyond

RADOS Gateway:

•  REST-based interface to RADOS

•  Supports buckets, accounting

•  Compatible with S3 and Swift applications

Page 25: New Features for Ceph with Cinder and Beyond

[Diagram: a VM in a VIRTUALIZATION CONTAINER reaches the cluster (monitors, M) through LIBRBD on top of LIBRADOS]

Page 26: New Features for Ceph with Cinder and Beyond

[Diagram: a VM and two CONTAINERs, each with LIBRBD on top of LIBRADOS, connected to the cluster (monitors, M)]

Page 27: New Features for Ceph with Cinder and Beyond

[Diagram: a HOST reaches the cluster (monitors, M) through KRBD (KERNEL MODULE)]

Page 28: New Features for Ceph with Cinder and Beyond

RADOS Block Device:

• Storage of virtual disks in RADOS

• Allows decoupling of VMs and containers

• Live migration!

• Images are striped across the cluster

• Thin-provisioning

• Snapshots and cloning
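Striping and thin provisioning can be pictured with a toy model: the image is cut into fixed-size objects, and an object only comes into existence when something is written to it. This is a sketch under stated assumptions, not how librbd is implemented; the 4 MiB object size and the `ThinImage` class are choices made for the example.

```python
OBJECT_SIZE = 4 * 1024 * 1024  # assumed object size (4 MiB)


class ThinImage:
    """Toy block image striped over fixed-size objects, allocated on write."""

    def __init__(self, size: int):
        self.size = size
        self.objects = {}  # object index -> bytearray, created lazily

    def write(self, offset: int, data: bytes) -> None:
        if offset < 0 or offset + len(data) > self.size:
            raise ValueError("write outside image bounds")
        pos = 0
        while pos < len(data):
            index, within = divmod(offset + pos, OBJECT_SIZE)
            chunk = data[pos:pos + OBJECT_SIZE - within]
            obj = self.objects.setdefault(index, bytearray(OBJECT_SIZE))
            obj[within:within + len(chunk)] = chunk
            pos += len(chunk)

    def read(self, offset: int, length: int) -> bytes:
        if offset < 0 or offset + length > self.size:
            raise ValueError("read outside image bounds")
        out = bytearray()
        while length > 0:
            index, within = divmod(offset, OBJECT_SIZE)
            take = min(length, OBJECT_SIZE - within)
            obj = self.objects.get(index)
            # Unallocated objects read back as zeros.
            out += obj[within:within + take] if obj is not None else bytes(take)
            offset += take
            length -= take
        return bytes(out)


image = ThinImage(size=10 * 1024**3)       # 10 GiB virtual size
image.write(OBJECT_SIZE - 2, b"abcd")      # straddles objects 0 and 1
print(sorted(image.objects), image.read(OBJECT_SIZE - 2, 4))  # [0, 1] b'abcd'
```

Only two objects exist after the write, even though the image advertises 10 GiB; in a real cluster those objects would land on different storage nodes.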

Page 30: New Features for Ceph with Cinder and Beyond

How do you spin up thousands of VMs instantly and efficiently?

Page 31: New Features for Ceph with Cinder and Beyond

[Diagram: instant copy: a parent image (144) plus four clones (0 each) = 144]

Page 32: New Features for Ceph with Cinder and Beyond

[Diagram: a CLIENT writes to a clone: parent (144) plus clone (4) = 148]

Page 33: New Features for Ceph with Cinder and Beyond

[Diagram: a CLIENT reads from a clone: parent (144) plus clone (4) = 148]
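The arithmetic in these three slides (144, then 144 + 4 = 148) follows from copy-on-write cloning. A minimal toy model, assuming the numbers count stored blocks and using a made-up `Image` class, reproduces it:

```python
class Image:
    """Toy image: a mapping from block number to data."""

    def __init__(self, blocks=None, parent=None):
        self.blocks = dict(blocks or {})
        self.parent = parent  # read-only parent, if this is a clone

    def clone(self):
        """Instant copy: the child starts empty and points at its parent."""
        return Image(parent=self)

    def write(self, block, data):
        self.blocks[block] = data  # only the clone's own copy changes

    def read(self, block):
        if block in self.blocks:
            return self.blocks[block]
        if self.parent is not None:
            return self.parent.read(block)  # fall through to the parent
        return None  # never written anywhere in the chain


parent = Image({n: f"base-{n}" for n in range(144)})
child = parent.clone()
print(len(parent.blocks) + len(child.blocks))  # 144

for n in range(4):
    child.write(n, f"new-{n}")
print(len(parent.blocks) + len(child.blocks))  # 148
print(child.read(0), child.read(100), parent.read(0))  # new-0 base-100 base-0
```

Creating the clone stores nothing new, writes cost only the blocks they touch, and reads of untouched blocks are served from the parent.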

Page 34: New Features for Ceph with Cinder and Beyond

Old-style VM image creation

[Diagram: Nova compute reads template X from Glance (templates) and writes a copy X' to local disk (VM images)]

• Ephemeral

• Expensive to create

Page 35: New Features for Ceph with Cinder and Beyond

Why use block storage?

• Persistent

   • More familiar to users

• Not tied to a single host

   • Decouples compute and storage

   • Enables live migration

• Extra capabilities of storage system

   • Efficient snapshots

   • Different types of storage available

   • Cloning for fast restore or scaling

Page 36: New Features for Ceph with Cinder and Beyond

Cinder volume creation

[Diagram: Cinder API passes "create image from X" to Cinder volume; the volume driver asks Glance (templates) to locate X, gets back the location of X, reads X, writes the copy X', and returns a reference to X']

Flexibility in where VM images are stored

Page 37: New Features for Ceph with Cinder and Beyond

Efficient volume creation

[Diagram: Cinder API passes "create image from X" to Cinder volume; the volume driver asks Glance (templates) to locate X, gets back the location of X, clones X to X' (fast CoW clone), and returns a reference to X' once X' is complete]
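The decision at the heart of this flow is whether the template X already lives in the same cluster as the new volume. The sketch below is not Cinder's driver code; it assumes Glance reports locations of the form `rbd://<fsid>/<pool>/<image>/<snapshot>`, and the function names are made up for the example.

```python
from urllib.parse import urlparse


def parse_rbd_location(location: str):
    """Split rbd://<fsid>/<pool>/<image>/<snapshot> into its four parts."""
    parsed = urlparse(location)
    if parsed.scheme != "rbd":
        raise ValueError("not an rbd:// location")
    pieces = [parsed.netloc] + parsed.path.lstrip("/").split("/")
    if len(pieces) != 4 or not all(pieces):
        raise ValueError("expected rbd://fsid/pool/image/snapshot")
    return tuple(pieces)


def plan_volume_creation(location: str, cluster_fsid: str) -> str:
    """Return "clone" when X lives in the same cluster, else "copy"."""
    try:
        fsid, _pool, _image, _snapshot = parse_rbd_location(location)
    except ValueError:
        return "copy"  # not in RBD at all: fall back to reading X
    return "clone" if fsid == cluster_fsid else "copy"


print(plan_volume_creation("rbd://abc/images/x/snap", "abc"))  # clone
print(plan_volume_creation("http://glance.example/x", "abc"))  # copy
```

When the answer is "clone", no image data has to move, which is what makes volume creation fast in the second diagram.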

Page 38: New Features for Ceph with Cinder and Beyond

What's new in Bobtail: Improved OSD threading

• Filesystem- and journal-related locks are now more fine-grained

• Boosted single-disk IOPS from 6k to 22k

• Restructured how map updates are handled, letting each placement group process them independently

Page 39: New Features for Ceph with Cinder and Beyond

What's new in Bobtail: Recovery QoS

• Message priority system reworked to prevent starvation

• Recovery operations can be lower priority than client I/O without starving

• Requests to access an object can increase recovery priority for that object
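The idea of giving recovery a lower priority without starving it can be pictured with a weighted round-robin queue. This is an illustrative stand-in, not Ceph's actual message scheduler; the queue names and weights are assumptions for the example.

```python
from collections import deque


def weighted_drain(queues, weights, rounds):
    """Serve up to weights[name] items from each queue per round."""
    served = []
    for _ in range(rounds):
        for name, queue in queues.items():
            for _ in range(weights[name]):
                if queue:
                    served.append(queue.popleft())
    return served


queues = {
    "client": deque(f"c{n}" for n in range(6)),
    "recovery": deque(f"r{n}" for n in range(2)),
}
order = weighted_drain(queues, {"client": 3, "recovery": 1}, rounds=2)
print(order)  # ['c0', 'c1', 'c2', 'r0', 'c3', 'c4', 'c5', 'r1']
```

Under strict priority, both recovery items would wait until all six client items had been served; here recovery makes progress every round while client I/O still gets three times the share.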

Page 40: New Features for Ceph with Cinder and Beyond

What's new in Bobtail: Block Device Cloning

• Instantly create new volumes based on templates (snapshots)

• Integrated with Cinder in Folsom

• Grizzly adds the ability to copy (not clone) non-raw images to RBD

Page 41: New Features for Ceph with Cinder and Beyond

What's new in Bobtail: Keystone Integration

• RADOS gateway can talk to Keystone to authenticate Swift API requests

• Let Keystone manage your users

• Supported by the Ceph Juju charm

Page 42: New Features for Ceph with Cinder and Beyond

What's next: Cuttlefish

• Incremental backup for block devices

• On-disk encryption

• REST management API for RADOS gateway

• More performance improvements (especially for small I/O)

• More! (http://www.inktank.com/about-inktank/roadmap/)

Page 43: New Features for Ceph with Cinder and Beyond

What's next: Dumpling

• Geo-replication for RADOS gateway

• REST management API for Ceph cluster

• ...

(virtual) Ceph Developer Summit May 6

Page 44: New Features for Ceph with Cinder and Beyond

Questions?

Josh Durgin

[email protected]

jdurgin on freenode

inktank.com | ceph.com