Ceph Day NYC: Ceph Fundamentals

Ceph Fundamentals
Ross Turk, VP Community, Inktank

Description

Ross Turk provides an overview of the Ceph architecture at Ceph Day NYC.

Transcript of Ceph Day NYC: Ceph Fundamentals

Page 1: Ceph Day NYC: Ceph Fundamentals

Ceph Fundamentals
Ross Turk, VP Community, Inktank

Page 2: Ceph Day NYC: Ceph Fundamentals

ME ME ME ME ME ME.

Ross Turk, VP Community, Inktank

[email protected] | @rossturk

inktank.com | ceph.com

Page 3: Ceph Day NYC: Ceph Fundamentals

Ceph Architectural Overview

Ah! Finally, 32 slides in and he gets to the nerdy stuff.

Page 4: Ceph Day NYC: Ceph Fundamentals

RADOS
A reliable, autonomous, distributed object store comprised of self-healing, self-managing, intelligent storage nodes

LIBRADOS
A library allowing apps to directly access RADOS, with support for C, C++, Java, Python, Ruby, and PHP

RBD
A reliable and fully-distributed block device, with a Linux kernel client and a QEMU/KVM driver

CEPH FS
A POSIX-compliant distributed file system, with a Linux kernel client and support for FUSE

RADOSGW
A bucket-based REST gateway, compatible with S3 and Swift

[Diagram: APPs use LIBRADOS or RADOSGW, a HOST/VM uses RBD, and a CLIENT uses CEPH FS.]

Page 5: Ceph Day NYC: Ceph Fundamentals

[Architecture overview slide repeated; see Page 4.]

Page 6: Ceph Day NYC: Ceph Fundamentals

[Diagram: each OSD sits on top of a local filesystem (btrfs, xfs, or ext4) on a disk; a small set of monitors (M) oversees the cluster.]

Page 7: Ceph Day NYC: Ceph Fundamentals

[Diagram: three monitors (M) and a human administrator.]

Page 8: Ceph Day NYC: Ceph Fundamentals

8

Monitors:• Maintain cluster

membership and state• Provide consensus for

distributed decision-making• Small, odd number• These do not serve stored

objects to clients

M

OSDs:• 10s to 10000s in a cluster• One per disk• (or one per SSD, RAID group…)• Serve stored objects to

clients• Intelligently peer to perform

replication and recovery tasks
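The "small, odd number" of monitors follows from majority consensus: with 2f+1 monitors, the cluster keeps working through f failures, while an even count adds no extra fault tolerance. A minimal sketch of that arithmetic (an illustration, not Ceph code):

```python
# Sketch: why Ceph monitors are deployed in small, odd numbers.
# With 2f+1 monitors, a strict majority can still agree on cluster
# state even after f of them fail.

def quorum_size(num_monitors: int) -> int:
    """Smallest strict majority of the monitor set."""
    return num_monitors // 2 + 1

def tolerated_failures(num_monitors: int) -> int:
    """Monitors that can fail while a quorum survives."""
    return num_monitors - quorum_size(num_monitors)

for n in (1, 3, 4, 5):
    print(f"{n} monitors: quorum {quorum_size(n)}, "
          f"tolerates {tolerated_failures(n)} failures")
```

Note that 4 monitors tolerate no more failures than 3, which is why deployments use odd counts.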

Page 9: Ceph Day NYC: Ceph Fundamentals

[Architecture overview slide repeated; see Page 4.]

Page 10: Ceph Day NYC: Ceph Fundamentals

[Diagram: an APP links LIBRADOS and talks to the cluster (monitors and OSDs) directly over a socket.]

Page 11: Ceph Day NYC: Ceph Fundamentals

LIBRADOS:
• Provides direct access to RADOS for applications
• C, C++, Python, PHP, Java, Erlang
• Direct access to storage nodes
• No HTTP overhead

Page 12: Ceph Day NYC: Ceph Fundamentals

[Architecture overview slide repeated; see Page 4.]

Page 13: Ceph Day NYC: Ceph Fundamentals

[Diagram: an APP speaks REST to RADOSGW, which uses LIBRADOS over a socket to reach the cluster.]

Page 14: Ceph Day NYC: Ceph Fundamentals

RADOS Gateway:
• REST-based object storage proxy
• Uses RADOS to store objects
• API supports buckets, accounts
• Usage accounting for billing
• Compatible with S3 and Swift applications

Page 15: Ceph Day NYC: Ceph Fundamentals


Page 16: Ceph Day NYC: Ceph Fundamentals

[Architecture overview slide repeated; see Page 4.]

Page 17: Ceph Day NYC: Ceph Fundamentals

[Diagram: a VM's disk is served by the HYPERVISOR through LIBRBD and LIBRADOS, which talk to the cluster.]

Page 18: Ceph Day NYC: Ceph Fundamentals

[Diagram: two hypervisors, each with LIBRBD and LIBRADOS, can access the same image, so a VM can move between hosts.]

Page 19: Ceph Day NYC: Ceph Fundamentals

[Diagram: a HOST maps an image through the KRBD kernel module, talking directly to the cluster.]

Page 20: Ceph Day NYC: Ceph Fundamentals

RADOS Block Device:
• Storage of disk images in RADOS
• Decouples VMs from host
• Images are striped across the cluster (pool)
• Snapshots
• Copy-on-write clones
• Support in:
  • Mainline Linux kernel (2.6.39+)
  • QEMU/KVM; native Xen coming soon
  • OpenStack, CloudStack, Nebula, Proxmox

Page 21: Ceph Day NYC: Ceph Fundamentals

[Architecture overview slide repeated; see Page 4.]

Page 22: Ceph Day NYC: Ceph Fundamentals

[Diagram: a CLIENT exchanges metadata with the metadata servers and transfers file data directly to and from the OSDs.]

Page 23: Ceph Day NYC: Ceph Fundamentals

Metadata Server:
• Manages metadata for a POSIX-compliant shared filesystem
  • Directory hierarchy
  • File metadata (owner, timestamps, mode, etc.)
• Stores metadata in RADOS
• Does not serve file data to clients
• Only required for shared filesystem

Page 24: Ceph Day NYC: Ceph Fundamentals

What Makes Ceph Unique?

Part one: it never, ever remembers where it puts stuff.

Page 25: Ceph Day NYC: Ceph Fundamentals

[Diagram: an APP faces a dozen data centers (DC) with no idea where its object lives.]

Page 26: Ceph Day NYC: Ceph Fundamentals

How Long Did It Take You To Find Your Keys This Morning?

(Photo: azmeen, Flickr / CC BY 2.0)

Page 27: Ceph Day NYC: Ceph Fundamentals

[Diagram: the APP connected to every data center (DC).]

Page 28: Ceph Day NYC: Ceph Fundamentals

Dear Diary: Today I Put My Keys on the Kitchen Counter

(Photo: Barnaby, Flickr / CC BY 2.0)

Page 29: Ceph Day NYC: Ceph Fundamentals

[Diagram: the APP routes an object named "F*" to the DC responsible for the A-G range; other DCs cover H-N, O-T, and U-Z.]

Page 30: Ceph Day NYC: Ceph Fundamentals

I Always Put My Keys on the Hook By the Door

(Photo: vitamindave, Flickr / CC BY 2.0)

Page 31: Ceph Day NYC: Ceph Fundamentals

HOW DO YOU FIND YOUR KEYS
WHEN YOUR HOUSE IS
INFINITELY BIG AND
ALWAYS CHANGING?

Page 32: Ceph Day NYC: Ceph Fundamentals

The Answer: CRUSH!!!!!

(Photo: pasukaru76, Flickr / CC SA 2.0)

Page 33: Ceph Day NYC: Ceph Fundamentals

[Diagram: placement happens in two steps, with no lookup table.]

OBJECT → hash(object name) % num_pg → placement group (PG)

PG → CRUSH(pg, cluster state, rule set) → set of OSDs
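The two-step mapping above can be sketched in a few lines. This is a toy model, not the real CRUSH algorithm: step two is mimicked here with deterministic hashing over a flat OSD list, with no topology or weights.

```python
# Toy illustration of Ceph's two-step placement (NOT the real CRUSH code):
#   1. hash(object name) % num_pg     -> placement group (PG)
#   2. CRUSH(pg, cluster state, rules) -> ordered list of OSDs
# Step 2 is mimicked by ranking OSDs with a repeatable pseudo-random score.

import hashlib

NUM_PG = 64
OSDS = [f"osd.{i}" for i in range(10)]

def h(s: str) -> int:
    """Deterministic 64-bit hash of a string."""
    return int.from_bytes(hashlib.md5(s.encode()).digest()[:8], "big")

def object_to_pg(name: str) -> int:
    return h(name) % NUM_PG

def pg_to_osds(pg: int, replicas: int = 3) -> list[str]:
    # Every client computes the same ranking, so no lookup service is needed.
    ranked = sorted(OSDS, key=lambda osd: h(f"{pg}:{osd}"), reverse=True)
    return ranked[:replicas]

pg = object_to_pg("my-object")
print(f"object -> pg {pg} -> {pg_to_osds(pg)}")
```

Because the calculation is pure arithmetic, any client holding the cluster map can locate any object without asking anyone.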


Page 35: Ceph Day NYC: Ceph Fundamentals

CRUSH:
• Pseudo-random placement algorithm
  • Fast calculation, no lookup
  • Repeatable, deterministic
• Statistically uniform distribution
• Stable mapping
  • Limited data migration on change
• Rule-based configuration
  • Infrastructure topology aware
  • Adjustable replication
  • Weighting
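The "stable mapping" bullet is worth seeing concretely. The sketch below uses rendezvous-style hashing as an analogy for CRUSH (it is not CRUSH itself): when a tenth OSD joins, only the PGs that now belong on the new node move, roughly a tenth of them, instead of a full reshuffle.

```python
# Sketch of the "stable mapping" property using rendezvous hashing
# (an analogy for CRUSH, not its actual algorithm): adding one node
# moves only the data that must move.

import hashlib

def h(s: str) -> int:
    return int.from_bytes(hashlib.md5(s.encode()).digest()[:8], "big")

def primary(pg: int, osds: list[str]) -> str:
    """Deterministically pick the highest-scoring OSD for this PG."""
    return max(osds, key=lambda osd: h(f"{pg}:{osd}"))

before = [f"osd.{i}" for i in range(9)]
after = before + ["osd.9"]          # one OSD added to the cluster

moved = sum(primary(pg, before) != primary(pg, after) for pg in range(1024))
print(f"{moved} of 1024 PGs moved")  # roughly 1/10 of the PGs
```

Every PG that moves lands on the new OSD; nothing shuffles between the old nodes.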

Page 36: Ceph Day NYC: Ceph Fundamentals

[Diagram sequence, pages 36-40: a CLIENT uses CRUSH to locate the right OSDs and reads and writes them directly; no central lookup service sits in the data path.]

Page 41: Ceph Day NYC: Ceph Fundamentals

What Makes Ceph Unique?

Part two: it has smart block devices for all those impatient, selfish VMs.

Page 42: Ceph Day NYC: Ceph Fundamentals

[Diagram: a VM's disk served by the HYPERVISOR through LIBRBD and LIBRADOS.]

Page 43: Ceph Day NYC: Ceph Fundamentals

HOW DO YOU SPIN UP
THOUSANDS OF VMs
INSTANTLY AND
EFFICIENTLY?

Page 44: Ceph Day NYC: Ceph Fundamentals

[Diagram: an "instant copy" of a 144-block image allocates no new data; the clone starts at 0 blocks, total still 144.]

Page 45: Ceph Day NYC: Ceph Fundamentals

[Diagram: a CLIENT writes 4 blocks to the clone; only those blocks are stored, for a total of 144 + 4 = 148.]

Page 46: Ceph Day NYC: Ceph Fundamentals

[Diagram: a CLIENT reads from the clone; unwritten blocks are served from the parent image, total still 148.]
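The clone behavior in the diagrams above can be captured in a short simulation (a sketch of copy-on-write semantics, not librbd): a clone starts empty and shares every block with its parent, writes land only in the clone, and reads fall through to the parent for blocks the clone never touched.

```python
# Sketch of RBD copy-on-write clone semantics (a simulation, not librbd):
# cloning copies nothing; writes go to the clone's own overlay; reads of
# untouched blocks fall through to the parent image.

class Image:
    def __init__(self, blocks=None, parent=None):
        self.blocks = dict(blocks or {})   # block index -> data
        self.parent = parent

    def clone(self):
        return Image(parent=self)          # instant: no data copied

    def write(self, idx, data):
        self.blocks[idx] = data            # copy-on-write: parent untouched

    def read(self, idx):
        if idx in self.blocks:
            return self.blocks[idx]
        if self.parent is not None:
            return self.parent.read(idx)   # fall through to the parent
        return b"\0"                       # never-written block

golden = Image({0: b"boot", 1: b"root"})   # "golden" base image
vm = golden.clone()                        # thousands of these are cheap
vm.write(1, b"logs")

print(vm.read(0), vm.read(1), golden.read(1))
```

This is why thousands of VMs can boot from one golden image: each clone stores only what it changes.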

Page 47: Ceph Day NYC: Ceph Fundamentals

What Makes Ceph Unique?

Part three: it has powerful friends, ones you probably already know.

Page 48: Ceph Day NYC: Ceph Fundamentals

[Diagram: Apache CloudStack with a hypervisor using the cluster as the primary storage pool; a secondary storage pool holds snapshots, templates, and images.]

Page 49: Ceph Day NYC: Ceph Fundamentals

[Diagram: OpenStack integration: the Keystone, Swift, Cinder, Glance, and Nova APIs sit in front of the cluster; RADOSGW backs the Swift API, and the hypervisor accesses the cluster directly.]

Page 50: Ceph Day NYC: Ceph Fundamentals

What Makes Ceph Unique?

Part four: clustered metadata

Page 51: Ceph Day NYC: Ceph Fundamentals

[Diagram: a CLIENT working against the metadata servers and OSDs, as on Page 22.]


Page 53: Ceph Day NYC: Ceph Fundamentals


one tree

three metadata servers

??


Page 58: Ceph Day NYC: Ceph Fundamentals


DYNAMIC SUBTREE PARTITIONING
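Dynamic subtree partitioning answers the "one tree, three metadata servers" question: the directory tree is split into subtrees, and hot subtrees are delegated across the MDS cluster. The greedy load balancer below is a deliberate simplification, not the Ceph MDS algorithm, and the paths and loads are made-up examples.

```python
# Sketch of dynamic subtree partitioning (a simplification of the Ceph MDS):
# split the tree into subtrees, then hand each hot subtree to the metadata
# server currently carrying the least load.

import heapq

def partition(subtree_loads: dict[str, int], num_mds: int) -> dict[str, list[str]]:
    """Greedily assign the hottest subtrees to the least-loaded MDS."""
    heap = [(0, f"mds.{i}") for i in range(num_mds)]   # (load, server)
    heapq.heapify(heap)
    assignment = {name: [] for _, name in heap}
    # Place the hottest subtrees first so load spreads evenly.
    for path, load in sorted(subtree_loads.items(), key=lambda kv: -kv[1]):
        total, mds = heapq.heappop(heap)
        assignment[mds].append(path)
        heapq.heappush(heap, (total + load, mds))
    return assignment

# Hypothetical per-subtree metadata load (requests/sec).
loads = {"/home": 80, "/var": 40, "/usr": 30, "/tmp": 10}
print(partition(loads, 3))
```

As load shifts, the real MDS re-delegates subtrees on the fly, which is what makes the metadata tier scale with the tree rather than with a static hash of paths.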

Page 59: Ceph Day NYC: Ceph Fundamentals


Now What?

Page 60: Ceph Day NYC: Ceph Fundamentals


8 years & 26,000 commits

Page 61: Ceph Day NYC: Ceph Fundamentals

Getting Started With Ceph

Read about the latest version of Ceph.
• The latest stuff is always at http://ceph.com/get

Deploy a test cluster using ceph-deploy.
• Read the quick-start guide at http://ceph.com/qsg

Deploy a test cluster on the AWS free tier using Juju.
• Read the guide at http://ceph.com/juju

Read the rest of the docs!
• Find docs for the latest release at http://ceph.com/docs

Have a working cluster up quickly.

Page 62: Ceph Day NYC: Ceph Fundamentals

Getting Involved With Ceph

Most project discussion happens on the mailing list.
• Join or view archives at http://ceph.com/list

IRC is a great place to get help (or help others!).
• Find details and historical logs at http://ceph.com/irc

The tracker manages our bugs and feature requests.
• Register and start looking around at http://ceph.com/tracker

Doc updates and suggestions are always welcome.
• Learn how to contribute docs at http://ceph.com/docwriting

Help build the best storage system around!

Page 63: Ceph Day NYC: Ceph Fundamentals

Questions?

Ross Turk, VP Community, Inktank

[email protected] | @rossturk

inktank.com | ceph.com