Ceph Intro & Architectural Overview - Red Hat

Ceph Intro & Architectural Overview. Federico Lucifredi, Product Management Director, Ceph Storage. Vancouver & Guadalajara, May 18th, 2015

Transcript of Ceph Intro & Architectural Overview - Red Hat

Page 1: Ceph Intro & Architectural Overview - Red Hat

Ceph Intro & Architectural Overview
Federico Lucifredi
Product Management Director, Ceph Storage
Vancouver & Guadalajara, May 18th, 2015

Page 2: Ceph Intro & Architectural Overview - Red Hat

CLOUD SERVICES

COMPUTE NETWORK STORAGE

the future of storage™

Page 3: Ceph Intro & Architectural Overview - Red Hat

HUMAN COMPUTER TAPE

HUMAN ROCK

HUMAN INK PAPER

Page 4: Ceph Intro & Architectural Overview - Red Hat

HUMAN COMPUTER TAPE

Page 5: Ceph Intro & Architectural Overview - Red Hat

YOU TECHNOLOGY YOUR DATA

Page 6: Ceph Intro & Architectural Overview - Red Hat

How Much Store Things All Human History?!

[Chart labels: carving, writing, paper, computers, distributed storage, cloud computing, gaaaaaaaaahhhh!!!!!!]

Page 7: Ceph Intro & Architectural Overview - Red Hat

[Diagram: 3 × HUMAN, 1 × COMPUTER, 7 × DISK]

Page 8: Ceph Intro & Architectural Overview - Red Hat

[Diagram: many HUMANs, one COMPUTER, 12 × DISK]

Page 9: Ceph Intro & Architectural Overview - Red Hat

[Diagram: many HUMANs, one GIANT SPENDY COMPUTER, 12 × DISK]

Page 10: Ceph Intro & Architectural Overview - Red Hat

[Diagram: 3 × HUMAN, 12 × (COMPUTER + DISK)]

Page 11: Ceph Intro & Architectural Overview - Red Hat

[Diagram: 3 × HUMAN, 12 × (COMPUTER + DISK)]

Page 12: Ceph Intro & Architectural Overview - Red Hat

[Diagram: 8 × (COMPUTER + DISK)]

“STORAGE APPLIANCE”

Page 13: Ceph Intro & Architectural Overview - Red Hat

Storage Appliance

Image credit: Michael Moll, Wikipedia / CC BY-SA 2.0

Page 14: Ceph Intro & Architectural Overview - Red Hat

SUPPORT AND MAINTENANCE
PROPRIETARY SOFTWARE
PROPRIETARY HARDWARE
COMPUTER + DISK (× 4)

• 34% of revenue (5.7 billion dollars)
• 1.3 billion in R&D spent in a year
• 1.6+ million square feet of manufacturing space

$NYSE:EMC, FY2014 10K

Page 15: Ceph Intro & Architectural Overview - Red Hat

[Diagram: streams of binary digits]

THE CLOUD

Page 16: Ceph Intro & Architectural Overview - Red Hat

SUPPORT AND MAINTENANCE    ENTERPRISE SUBSCRIPTION (optional)
PROPRIETARY SOFTWARE       OPEN SOURCE SOFTWARE
PROPRIETARY HARDWARE       STANDARD HARDWARE
COMPUTER + DISK (× 4)      COMPUTER + DISK (× 4)

Page 17: Ceph Intro & Architectural Overview - Red Hat

Page 18: Ceph Intro & Architectural Overview - Red Hat

philosophy: OPEN SOURCE, COMMUNITY-FOCUSED

design: SCALABLE, NO SINGLE POINT OF FAILURE, SOFTWARE BASED, SELF-MANAGING

Page 19: Ceph Intro & Architectural Overview - Red Hat

8 years & 20,000 commits later…

Page 20: Ceph Intro & Architectural Overview - Red Hat

Page 21: Ceph Intro & Architectural Overview - Red Hat

RADOS
A reliable, autonomous, distributed object store comprised of self-healing, self-managing, intelligent storage nodes

LIBRADOS
A library allowing apps to directly access RADOS, with support for C, C++, Java, Python, Ruby, and PHP

RBD
A reliable and fully-distributed block device, with a Linux kernel client and a QEMU/KVM driver

CEPH FS
A POSIX-compliant distributed file system, with a Linux kernel client and support for FUSE

RADOSGW
A bucket-based REST gateway, compatible with S3 and Swift

APP    APP    HOST/VM    CLIENT

Page 22: Ceph Intro & Architectural Overview - Red Hat

[Architecture overview diagram, repeated from Page 21]

Page 23: Ceph Intro & Architectural Overview - Red Hat

OSD   OSD   OSD   OSD   OSD
FS    FS    FS    FS    FS      (btrfs, xfs, ext4)
DISK  DISK  DISK  DISK  DISK

M  M  M

Page 24: Ceph Intro & Architectural Overview - Red Hat

M  M  M

HUMAN

Page 25: Ceph Intro & Architectural Overview - Red Hat

Monitors:
• Maintain cluster membership and state
• Provide consensus for distributed decision-making
• Small, odd number
• These do not serve stored objects to clients

OSDs:
• 10s to 10000s in a cluster
• One per disk (or one per SSD, RAID group…)
• Serve stored objects to clients
• Intelligently peer to perform replication and recovery tasks
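
The "small, odd number" guidance for monitors follows from majority voting: consensus needs more than half of the monitors, so going from an odd count to the next even count adds a member without adding fault tolerance. A minimal sketch of that arithmetic, assuming a simple strict-majority quorum (the function names are illustrative, not Ceph APIs):

```python
def quorum_size(monitors: int) -> int:
    """Smallest strict majority of `monitors` members."""
    if monitors < 1:
        raise ValueError("need at least one monitor")
    return monitors // 2 + 1


def tolerated_failures(monitors: int) -> int:
    """How many monitors can fail while a majority still remains."""
    return monitors - quorum_size(monitors)


# Compare odd and even monitor counts side by side.
for n in range(1, 8):
    print(f"{n} monitors: quorum {quorum_size(n)}, "
          f"tolerates {tolerated_failures(n)} failure(s)")
```

Under this assumption, 3 and 4 monitors both tolerate a single failure, which is why the fourth monitor buys nothing.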

Page 26: Ceph Intro & Architectural Overview - Red Hat

[Architecture overview diagram, repeated from Page 21]

Page 27: Ceph Intro & Architectural Overview - Red Hat

APP
LIBRADOS

socket

M  M  M

Page 28: Ceph Intro & Architectural Overview - Red Hat

LIBRADOS
• Provides direct access to RADOS for applications
• C, C++, Python, PHP, Java, Erlang
• Direct access to storage nodes
• No HTTP overhead
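
A minimal sketch of what direct access to RADOS can look like from Python, assuming the `rados` bindings that ship with Ceph are installed and a cluster is reachable. The pool name, object name, and configuration path are placeholders, and `store_and_fetch` and `run_on_cluster` are hypothetical helpers, not part of librados:

```python
def store_and_fetch(ioctx, name: str, payload: bytes) -> bytes:
    """Write one object and read it back through an I/O context.

    `ioctx` is expected to behave like rados.Ioctx: write_full() replaces
    the whole object, read() returns up to `length` bytes of it.
    """
    ioctx.write_full(name, payload)
    return ioctx.read(name, length=len(payload))


def run_on_cluster(conffile: str = "/etc/ceph/ceph.conf",
                   pool: str = "example-pool") -> bytes:
    """Run the round trip against a real cluster (placeholder names)."""
    import rados  # Ceph Python bindings; imported lazily on purpose

    cluster = rados.Rados(conffile=conffile)
    cluster.connect()  # contacts the monitors named in the config file
    try:
        ioctx = cluster.open_ioctx(pool)  # I/O context bound to one pool
        try:
            return store_and_fetch(ioctx, "greeting", b"hello rados")
        finally:
            ioctx.close()
    finally:
        cluster.shutdown()
```

Nothing here goes through HTTP: the client library talks to the cluster over its own socket protocol, which is the point of the last bullet.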

Page 29: Ceph Intro & Architectural Overview - Red Hat

[Architecture overview diagram, repeated from Page 21]

Page 30: Ceph Intro & Architectural Overview - Red Hat

APP

REST

RADOSGW
LIBRADOS

socket

M  M  M

Page 31: Ceph Intro & Architectural Overview - Red Hat

RADOS Gateway:
• REST-based object storage proxy
• Uses RADOS to store objects
• API supports buckets, accounts
• Usage accounting for billing
• Compatible with S3 and Swift applications

Page 32: Ceph Intro & Architectural Overview - Red Hat

[Architecture overview diagram, repeated from Page 21]

Page 33: Ceph Intro & Architectural Overview - Red Hat

VM

VIRTUALIZATION CONTAINER
LIBRBD
LIBRADOS

M  M  M

Page 34: Ceph Intro & Architectural Overview - Red Hat

[Diagram: one VM and two CONTAINERs, each CONTAINER with LIBRBD on top of LIBRADOS; monitors M M M]

Page 35: Ceph Intro & Architectural Overview - Red Hat

HOST
KRBD (KERNEL MODULE)
LIBRADOS

M  M  M

Page 36: Ceph Intro & Architectural Overview - Red Hat

RADOS Block Device:
• Storage of disk images in RADOS
• Decouples VMs from host
• Images are striped across the cluster (pool)
• Snapshots
• Copy-on-write clones
• Support in:
  • Mainline Linux Kernel (2.6.39+)
  • QEMU/KVM
  • OpenStack, CloudStack
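
To make "images are striped across the cluster" concrete, the sketch below maps a byte offset in a disk image to the object that would hold it. It assumes the simplest possible layout, fixed-size objects with no further stripe parameters, and a 4 MiB object size; both are assumptions for illustration, and the object name prefix and naming scheme are hypothetical:

```python
OBJECT_SIZE = 4 * 1024 * 1024  # assumed object size: 4 MiB


def locate(image_offset: int, object_size: int = OBJECT_SIZE):
    """Return (object index, offset inside that object) for an image offset."""
    if image_offset < 0:
        raise ValueError("offset must be non-negative")
    return divmod(image_offset, object_size)


def object_name(prefix: str, index: int) -> str:
    """Illustrative naming only: prefix plus a zero-padded hex index."""
    return f"{prefix}.{index:016x}"


# Where does byte 10 MiB of the image live?
index, inner = locate(10 * 1024 * 1024)
print(object_name("example_image", index), inner)
```

Each object is then placed independently by the cluster, which is how a single image ends up spread over many OSDs.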

Page 37: Ceph Intro & Architectural Overview - Red Hat

[Architecture overview diagram, repeated from Page 21]

Page 38: Ceph Intro & Architectural Overview - Red Hat

CLIENT

metadata    data (01100110)

M  M  M

Page 39: Ceph Intro & Architectural Overview - Red Hat

Metadata Server
• Manages metadata for a POSIX-compliant shared filesystem
  • Directory hierarchy
  • File metadata (owner, timestamps, mode, etc.)
• Stores metadata in RADOS
• Does not serve file data to clients
• Only required for shared filesystem

Page 40: Ceph Intro & Architectural Overview - Red Hat

Questions?

Federico Lucifredi
PM Director, Ceph

[email protected]
@0xF2

redhat.com | ceph.com