Storing VMs with Cinder and Ceph RBD


Transcript of Storing VMs with Cinder and Ceph RBD.pdf

Page 1: Storing VMs with Cinder and Ceph RBD.pdf

Storing VMs with Cinder and Ceph RBD

Page 2: Storing VMs with Cinder and Ceph RBD.pdf

Growing With Hardware Appliances

First PB

•  Proprietary storage hardware

• Well-known storage vendor

$14 b’zillion

Second PB

•  Proprietary storage hardware

•  Same storage vendor

Another $14 b’zillion

[Figure: a grid of DC nodes]

Page 3: Storing VMs with Cinder and Ceph RBD.pdf

[Figure: a grid of DC nodes, one shown as separate D and C; label: C++]

Page 4: Storing VMs with Cinder and Ceph RBD.pdf

[Figure: a grid of DC nodes, one shown as separate D and C; label: C++ X]

Page 5: Storing VMs with Cinder and Ceph RBD.pdf

[Figure: a grid of DC nodes and a HUMAN [DEVELOPER]; label: !!]

Page 6: Storing VMs with Cinder and Ceph RBD.pdf

Hard Drives Are Tiny Record Players and They Fail Often

(Photo: jon_a_ross, Flickr / CC BY 2.0)

Page 7: Storing VMs with Cinder and Ceph RBD.pdf

[Figure: one disk (D) x 1 MILLION = disk failures 55 times / day]
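The arithmetic behind this slide can be checked in a few lines. The sketch below assumes the figure means that a fleet of one million disks sees about 55 disk failures per day, and derives what that implies per disk and per hour. The function name and the 365-day year are illustrative choices, not taken from the slides.

```python
def implied_annual_failure_rate(failures_per_day: float, fleet_size: int) -> float:
    """Fraction of the fleet expected to fail in one year at a steady daily rate."""
    return failures_per_day * 365 / fleet_size


# Numbers from the slide: 55 failures per day across 1 million disks.
afr = implied_annual_failure_rate(55, 1_000_000)

# Average gap between two failures somewhere in the fleet, in minutes.
minutes_between_failures = 24 * 60 / 55

print(f"implied annual failure rate: {afr:.2%}")
print(f"one failure roughly every {minutes_between_failures:.0f} minutes")
```

At that scale a failed disk is a routine event rather than an emergency, which is the motivation for the self-managing design described on the following slides.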

Page 8: Storing VMs with Cinder and Ceph RBD.pdf


Page 9: Storing VMs with Cinder and Ceph RBD.pdf

Philosophy
• OPEN SOURCE
• COMMUNITY-FOCUSED

Design
• SCALABLE
• NO SINGLE POINT OF FAILURE
• SOFTWARE BASED
• SELF-MANAGING

Page 10: Storing VMs with Cinder and Ceph RBD.pdf

RADOS: A reliable, autonomous, distributed object store comprised of self-healing, self-managing, intelligent storage nodes

LIBRADOS: A library allowing apps to directly access RADOS, with support for C, C++, Java, Python, Ruby, and PHP

RBD: A reliable and fully-distributed block device, with a Linux kernel client and a QEMU/KVM driver

CEPH FS: A POSIX-compliant distributed file system, with a Linux kernel client and support for FUSE

RADOSGW: A bucket-based REST gateway, compatible with S3 and Swift

[Figure: APP uses LIBRADOS, APP uses RADOSGW, HOST/VM uses RBD, CLIENT uses CEPH FS; all are built on RADOS]

Page 11: Storing VMs with Cinder and Ceph RBD.pdf

[Figure: five OSDs, each on top of a file system (FS: btrfs, xfs, or ext4) on its own DISK, alongside three monitors (M)]

Page 12: Storing VMs with Cinder and Ceph RBD.pdf

[Figure: three monitors (M) and a HUMAN]

Page 13: Storing VMs with Cinder and Ceph RBD.pdf

Monitors (M):
• Maintain cluster map
• Provide consensus for distributed decision-making
• Must have an odd number
• Do not serve stored objects to clients

OSDs:
• One per disk (recommended)
• At least three in a cluster
• Serve stored objects to clients
• Intelligently peer to perform replication tasks
• Support object classes
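The odd-number guidance for monitors follows from majority voting. As a small sketch, assuming a simple majority quorum (the usual rule for Paxos-style consensus; the function names are illustrative), adding a monitor to an odd-sized set does not increase the number of monitor failures the cluster can survive:

```python
def quorum_size(monitors: int) -> int:
    """Smallest group of monitors that forms a strict majority."""
    return monitors // 2 + 1


def tolerated_failures(monitors: int) -> int:
    """How many monitors can be lost while a majority is still reachable."""
    return monitors - quorum_size(monitors)


for count in range(1, 8):
    print(f"{count} monitors: quorum {quorum_size(count)}, "
          f"survives {tolerated_failures(count)} failure(s)")
```

Three and four monitors both survive one failure, and five and six both survive two, so the even-sized configurations add a machine without adding resilience.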

Page 14: Storing VMs with Cinder and Ceph RBD.pdf

[Figure: an APP ("??") facing a grid of DC nodes]

Page 15: Storing VMs with Cinder and Ceph RBD.pdf

[Figure: an APP and a grid of DC nodes]

Page 16: Storing VMs with Cinder and Ceph RBD.pdf

[Figure: an APP and a grid of DC nodes partitioned by name range (A-G, H-N, O-T, U-Z); the APP looks up F*]

Page 17: Storing VMs with Cinder and Ceph RBD.pdf

[Figure: objects, shown as bit strings, grouped into placement groups (PGs) and placed on the cluster]

hash(object name) % num pg

CRUSH(pg, cluster state, rule set)
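The first of the two steps on this slide is a plain hash-and-modulo. A minimal sketch, assuming `zlib.crc32` as a stand-in hash (Ceph uses its own hash function) and an illustrative `num_pg` of 64. The second step, CRUSH, is not implemented here.

```python
import zlib
from collections import Counter


def object_to_pg(object_name: str, num_pg: int) -> int:
    """Step 1 from the slide: hash(object name) % num pg."""
    return zlib.crc32(object_name.encode("utf-8")) % num_pg


NUM_PG = 64  # illustrative value, not a recommendation

# Any client can compute the same answer without asking a central directory.
names = [f"object-{i}" for i in range(10_000)]
pg_of = {name: object_to_pg(name, NUM_PG) for name in names}

# Rough look at how evenly the objects spread over the placement groups.
per_pg = Counter(pg_of.values())
print("fewest objects in a PG:", min(per_pg.values()))
print("most objects in a PG:  ", max(per_pg.values()))
```

Because the mapping depends only on the object name and the number of placement groups, no lookup table has to be stored or kept consistent.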

Page 18: Storing VMs with Cinder and Ceph RBD.pdf


Page 19: Storing VMs with Cinder and Ceph RBD.pdf

CRUSH
• Pseudo-random placement algorithm
  • Ensures even distribution
  • Repeatable, deterministic
• Rule-based configuration
  • Replica count
  • Infrastructure topology
  • Weighting

Page 20: Storing VMs with Cinder and Ceph RBD.pdf


CLIENT

??

Page 21: Storing VMs with Cinder and Ceph RBD.pdf


Page 22: Storing VMs with Cinder and Ceph RBD.pdf


CLIENT

??

Page 23: Storing VMs with Cinder and Ceph RBD.pdf


Page 24: Storing VMs with Cinder and Ceph RBD.pdf


Page 25: Storing VMs with Cinder and Ceph RBD.pdf

[Figure: an APP linked against LIBRADOS talks natively to the cluster and its monitors (M)]

Page 26: Storing VMs with Cinder and Ceph RBD.pdf


LIBRADOS

•  Provides direct access to RADOS for applications

•  C, C++, Python, PHP, Java

• No HTTP overhead
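As an illustration of direct access from Python, the sketch below writes one object and reads it back. It assumes the `rados` Python bindings are installed and a cluster is reachable; the configuration path, pool name, and object name passed in are placeholders. The helper takes an already open I/O context so it can be tried without a cluster.

```python
def store_and_fetch(ioctx, name: str, payload: bytes) -> bytes:
    """Write one object and read it back through an open I/O context."""
    ioctx.write_full(name, payload)
    return ioctx.read(name, len(payload))


def roundtrip_on_cluster(conffile: str, pool: str) -> bytes:
    """Connect to a real cluster and run the round trip in the given pool."""
    import rados  # imported lazily so the sketch loads without Ceph installed

    cluster = rados.Rados(conffile=conffile)
    cluster.connect()
    try:
        ioctx = cluster.open_ioctx(pool)
        try:
            return store_and_fetch(ioctx, "hello-object", b"hello RADOS")
        finally:
            ioctx.close()
    finally:
        cluster.shutdown()
```

On a machine with cluster access, something like `roundtrip_on_cluster("/etc/ceph/ceph.conf", "data")` would exercise it; both arguments are assumptions about the local setup.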

Page 27: Storing VMs with Cinder and Ceph RBD.pdf


Page 28: Storing VMs with Cinder and Ceph RBD.pdf

[Figure: one APP uses LIBRADOS natively; another APP speaks REST to RADOSGW, which uses LIBRADOS to reach the cluster and its monitors (M)]

Page 29: Storing VMs with Cinder and Ceph RBD.pdf


RADOS Gateway:

•  REST-based interface to RADOS

•  Supports buckets, accounting

•  Compatible with S3 and Swift applications

Page 30: Storing VMs with Cinder and Ceph RBD.pdf


Page 31: Storing VMs with Cinder and Ceph RBD.pdf

[Figure: a VM runs in a VIRTUALIZATION CONTAINER that uses LIBRBD on top of LIBRADOS to reach the cluster and its monitors (M)]

Page 32: Storing VMs with Cinder and Ceph RBD.pdf

[Figure: two CONTAINERs, each with LIBRBD on top of LIBRADOS, share access to the cluster and its monitors (M); the VM runs in one of them]

Page 33: Storing VMs with Cinder and Ceph RBD.pdf

[Figure: a HOST uses KRBD (KERNEL MODULE) to reach the cluster and its monitors (M)]

Page 34: Storing VMs with Cinder and Ceph RBD.pdf

RADOS Block Device:

• Storage of virtual disks in RADOS

• Allows decoupling of VMs and containers

• Live migration!

• Images are striped across the cluster

• Thin-provisioning

• Snapshots and cloning
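Striping and thin provisioning can be pictured with a toy model: the image is cut into fixed-size objects, and an object only comes into existence when something is written to it. The class below is an in-memory sketch, not librbd; the 4 MiB object size and all names are assumptions made for illustration.

```python
from typing import Dict

OBJECT_SIZE = 4 * 1024 * 1024  # assumed object size (4 MiB)


class ThinImage:
    """Toy block image split into objects that are allocated on first write."""

    def __init__(self, size: int, object_size: int = OBJECT_SIZE) -> None:
        self.size = size
        self.object_size = object_size
        self.objects: Dict[int, bytearray] = {}  # object index -> contents

    def write(self, offset: int, data: bytes) -> None:
        if offset < 0 or offset + len(data) > self.size:
            raise ValueError("write outside the image")
        pos = 0
        while pos < len(data):
            index, start = divmod(offset + pos, self.object_size)
            chunk = data[pos:pos + self.object_size - start]
            obj = self.objects.setdefault(index, bytearray(self.object_size))
            obj[start:start + len(chunk)] = chunk
            pos += len(chunk)

    def read(self, offset: int, length: int) -> bytes:
        out = bytearray()
        pos = 0
        while pos < length:
            index, start = divmod(offset + pos, self.object_size)
            count = min(length - pos, self.object_size - start)
            obj = self.objects.get(index)
            # Never-written regions read back as zeros.
            out += obj[start:start + count] if obj is not None else bytes(count)
            pos += count
        return bytes(out)


image = ThinImage(size=10 * 1024**3)  # a "10 GiB" disk
image.write(5 * OBJECT_SIZE + 100, b"boot sector")
print("objects allocated:", len(image.objects))
```

In a real cluster each of those objects would be placed independently, which is what spreads a single virtual disk across many OSDs.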

Page 35: Storing VMs with Cinder and Ceph RBD.pdf


Page 36: Storing VMs with Cinder and Ceph RBD.pdf

HOW DO YOU SPIN UP THOUSANDS OF VMs INSTANTLY AND EFFICIENTLY?

Page 37: Storing VMs with Cinder and Ceph RBD.pdf

[Figure: a parent image of size 144 and four clones of size 0; instant copy; total = 144]

Page 38: Storing VMs with Cinder and Ceph RBD.pdf

[Figure: a CLIENT sends writes to a clone, which grows to 4 while the parent stays at 144; total = 148]

Page 39: Storing VMs with Cinder and Ceph RBD.pdf

[Figure: a CLIENT sends reads to a clone (4) backed by the parent (144); total = 148]
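The numbers on these three slides (144, then 148 after four writes) can be reproduced with a toy copy-on-write model. It assumes the figures count stored objects and that each write replaces a whole object; real clones handle partial writes as well. The class and function names are illustrative.

```python
from typing import Dict, Optional


class Image:
    """Toy image: its own objects plus an optional read-only parent."""

    def __init__(self, objects: Optional[Dict[int, bytes]] = None,
                 parent: Optional["Image"] = None) -> None:
        self.objects: Dict[int, bytes] = dict(objects or {})
        self.parent = parent

    def clone(self) -> "Image":
        # Instant copy: the clone starts empty and only points at its parent.
        return Image(parent=self)

    def write(self, index: int, data: bytes) -> None:
        # Writes land in the clone; the parent is never modified.
        self.objects[index] = data

    def read(self, index: int) -> bytes:
        # Reads use the clone's own object if present, else fall through.
        if index in self.objects:
            return self.objects[index]
        if self.parent is not None:
            return self.parent.read(index)
        return b""


def stored(*images: Image) -> int:
    """Total number of objects actually stored across the given images."""
    return sum(len(image.objects) for image in images)


parent = Image({i: b"base" for i in range(144)})
clones = [parent.clone() for _ in range(4)]
total_after_clone = stored(parent, *clones)   # 144, nothing was copied

for index in range(4):
    clones[0].write(index, b"new")
total_after_writes = stored(parent, *clones)  # 144 + 4 = 148

print(total_after_clone, total_after_writes)
```

Each additional clone costs nothing until it diverges from the parent, which is what makes starting many VMs from one template cheap.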

Page 40: Storing VMs with Cinder and Ceph RBD.pdf

old-style VM image creation

[Figure: Nova compute does "read X" against Glance (templates) and stores the copy X' on local disk (VM images)]

● ephemeral

● expensive to create

Page 41: Storing VMs with Cinder and Ceph RBD.pdf

Why use block storage?

• Persistent
  • More familiar to users

• Not tied to a single host
  • Decouples compute and storage
  • Enables live migration

• Extra capabilities of storage system
  • Efficient snapshots
  • Different types of storage available
  • Cloning for fast restore or scaling

Page 42: Storing VMs with Cinder and Ceph RBD.pdf

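Cinder volume creation

[Figure: Cinder API passes "create image from X" to Cinder volume; the volume driver sends "locate X" to Glance (templates) and receives "location of X", then does "read X" and writes the copy X'; a "reference to X'" is returned]

flexibility in where VM images are stored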

Page 43: Storing VMs with Cinder and Ceph RBD.pdf

Efficient volume creation

[Figure: Cinder API passes "create image from X" to Cinder volume; the volume driver sends "locate X" to Glance (templates) and receives "location of X", then does "clone X to X'" (fast CoW clone); on "X' complete", a "reference to X'" is returned]
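The "clone X to X'" step can be sketched with the Python `rbd` bindings. This is not Cinder's driver code: it is a minimal illustration that assumes the template and the new volume live in the same pool, that the snapshot does not exist yet, and that the image, snapshot, and volume names are placeholders. The `rbd` module is passed in as an argument so the function can be exercised without a cluster.

```python
def clone_template(rbd, ioctx, template: str, snapshot: str, volume: str) -> None:
    """Create `volume` as a copy-on-write clone of `template`@`snapshot`."""
    # A clone needs a protected snapshot of the parent image to point at.
    with rbd.Image(ioctx, template) as image:
        image.create_snap(snapshot)
        image.protect_snap(snapshot)

    # The clone itself copies no data; it only references the snapshot.
    rbd.RBD().clone(
        ioctx, template, snapshot,  # parent pool context, image, snapshot
        ioctx, volume,              # child pool context, new image name
        features=rbd.RBD_FEATURE_LAYERING,
    )
```

With a live cluster this would be called with the real module and an open I/O context, for example `clone_template(rbd, ioctx, "X", "base", "X-prime")`, where all three names are hypothetical.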

Page 44: Storing VMs with Cinder and Ceph RBD.pdf

Questions?

Josh Durgin


jdurgin on freenode

inktank.com | ceph.com