Post on 04-Dec-2014
Case Study: Flying Circus
Berlin CEPH meetup
2014-01-27, Christian Theune <ct@gocept.com>
/me
• Christian Theune
• Co-Founder of gocept
• Software Developer (formerly Zope, Plone, grok), Python (lots of packages)
• ct@gocept.com
• @theuni
What worked for us?
raw image on local server
lvm volume via iSCSI (ietd + open-iscsi)
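For reference, the ietd + open-iscsi setup above boils down to exporting an LVM volume as an iSCSI target and logging in from the KVM host. A minimal sketch — the target name, IP address, and volume path are made up for illustration:

```shell
# --- on the storage server: /etc/ietd.conf (iSCSI Enterprise Target) ---
# Export one LVM logical volume as LUN 0 of a target:
#
#   Target iqn.2001-04.com.example:vg0.vm-disk
#       Lun 0 Path=/dev/vg0/vm-disk,Type=blockio

# --- on the KVM host: open-iscsi initiator ---
# Discover the targets offered by the storage server...
iscsiadm -m discovery -t sendtargets -p 192.0.2.10
# ...and log in; the LUN then shows up as a local /dev/sdX block device.
iscsiadm -m node -T iqn.2001-04.com.example:vg0.vm-disk -p 192.0.2.10 --login
```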
What didn’t work (for us)
ATA over Ethernet
Gluster (sheepdog)
Linux HA solution for iSCSI
CEPH
• been watching for ages
• started work in December 2012
• production roll-out since December 2013
• about 50% migrated in production
Our production structure
• KVM hosts with 2x1Gbps (STO and STB)
• Old storage servers with 5×600 GB RAID 5 + 1 journal (SAS 15k drives)
• 5 monitors, 6 OSDs currently
• RBD from KVM hosts and backup server, 1 cluster per customer project (multiple VMs)
• Acceptable performance on existing hardware
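The "RBD from KVM hosts" item above translates into plain rbd/qemu usage; a sketch with a hypothetical per-customer pool and image name:

```shell
# Create a 10 GiB RBD image in a per-customer pool (--size is in MB).
rbd create --size 10240 customer1/vm-root

# Inspect the image.
rbd info customer1/vm-root

# Attach it to a KVM guest via qemu's built-in RBD driver,
# so no kernel-level mapping is needed on the host.
qemu-system-x86_64 ... -drive file=rbd:customer1/vm-root,if=virtio,format=raw
```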
Good stuff
• No single point of failure any more!
• Create/destroy VM images on KVM hosts!
• Fail-over and self-healing works nicely
• Virtualisation for storage “as it should be”™
• High quality of concepts, implementation, and documentation
• Relatively simple to configure
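"Relatively simple to configure" mostly refers to ceph.conf; a minimal sketch for a cluster of that era — the fsid, monitor names, and addresses are placeholders:

```ini
[global]
fsid = <cluster-uuid>
mon initial members = mon1, mon2, mon3
mon host = 192.0.2.11, 192.0.2.12, 192.0.2.13
auth cluster required = cephx
auth service required = cephx
auth client required = cephx

[osd]
osd journal size = 1024          ; journal size in MB
```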
ceph -s (and -w)
ceph osd tree
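The monitoring commands on this slide, annotated (their output depends on the cluster state, so no sample output is shown):

```shell
# One-shot cluster summary: health, monitor quorum, OSD count, PG states.
ceph -s

# The same information as a continuous stream of cluster events.
ceph -w

# CRUSH hierarchy: hosts and their OSDs, with weights and up/down status.
ceph osd tree
```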
Current issues
• Bandwidth vs. latency: replicas from the RBD client?!?
• Deciding for PG allocation in various situations.
• Deciding for new hardware.
• Backup has become a bottleneck.
• I can haz “ceph osd pool stats” per RBD volume?
• Still measuring performance. RBD is definitely sucking up some performance.
Summary
• finally … FINALLY … F I N A L L Y !
• feels sooo good
• well, at least we did not want to throw up using it
• works as promised
• can’t stop praising it …