Avishay Traeger & Shimshon Zimmerman, Stratoscale - Deploying OpenStack Cinder in Production, OpenStack Israel 2015

  • Date posted: 12-Aug-2015

  1. Deploying Cinder in Production
     • Avishay Traeger, PhD
     • Shimshon Zimmerman
  2. Goals (Copyright 2015)
     • Deploy Cinder in a way that is scalable and resilient to failures
     • Limited downtime
     • Reduce admin intervention
  3. Available resources
     • Good documentation exists for deploying OpenStack
     • Many OpenStack HA guides, for example: http://docs.openstack.org/high-availability-guide
  4. What we'll cover today
     • What is Cinder? How is it designed?
     • What issues in Cinder impact deploying with HA?
     • How can I best deploy Cinder today?
     • HA Cinder - live demo
     • Cinder and Neutron were spun off of Nova
       • Similar architectures
       • Some lessons from Cinder may be generalized
  5. Who are we?
     • Stratoscale R&D
     • Former Cinder core member
     • A couple of words on what we do at Stratoscale
       • Hyper-converged infrastructure:
         • Compute: customized KVM & Docker
         • Storage: high-performance scale-out block storage
         • Network: full-featured SDN
       • Management plane based on OpenStack
       • Easy install, easy upgrade, no maintenance
     • Sound interesting? We're hiring!
  6. What is Cinder?
     • Abstraction that enables uniform management of block storage
     • Exposes northbound API (e.g., create, list, delete volumes and snapshots)
     • Storage drivers implement southbound API
     • Enables connections between Nova and storage
  7. High-level architecture
     • [Diagram labels: cinder client, REST, cinder-api, cinder-scheduler, cinder-volume + driver, cinder-backup + driver, storage, SQL DB]
     • Components to make HA:
       • Storage
       • Database
       • RPC messaging
       • Cinder services
  8. HA storage
     • Almost all storage supported by Cinder is HA
     • The following backends may not be, because by default they use local disks (a single point of failure, SPOF):
       • LVM: sets up an iSCSI target over LVM on a device
       • NFS: sets up an NFS server on a file system
     • Make sure you have redundant network paths from compute to the storage
  9. HA database
     • An SQL DB is used to store OpenStack metadata
       • For example, in Cinder: information about volumes, snapshots, quotas, volume types, etc.
     • This DB must be replicated:
       • Galera + MySQL/MariaDB/Percona
       • PostgreSQL replication
       • Store DB on DRBD backend
     • Scale-out architectures for SQL exist, but have not been tried with OpenStack as far as we know: https://github.com/youtube/vitess
  10. Inter-service messaging
      • OpenStack projects use oslo.messaging for RPCs, which wraps one of:
        • RabbitMQ (AMQP)
          • By default queues are located on a single node (SPOF)
          • Configure mirroring: https://www.rabbitmq.com/ha.html
        • Qpid (AMQP)
          • Similar to RabbitMQ (configure queue replicas)
        • ZeroMQ
          • Allows broker-based reliability like the others
          • Also allows a brokerless peer-to-peer model
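The "configure mirroring" step for RabbitMQ can be sketched as a single policy command. This is a minimal sketch, assuming the classic mirrored-queue policies described on the linked ha.html page for RabbitMQ releases of that era. The policy name `ha-all`, the queue-name pattern, and the helper `mirror_policy_command` are illustrative choices, not taken from the talk. The code only builds and prints the command; it does not contact a broker.

```python
import json
import shlex


def mirror_policy_command(name="ha-all", pattern=r"^(?!amq\.).*", vhost="/"):
    """Build the argv for a rabbitmqctl call that mirrors matching queues.

    The policy name and pattern are assumptions for illustration: the
    pattern matches every queue whose name does not start with "amq.".
    """
    # "ha-mode": "all" asks the broker to mirror a queue to all cluster nodes.
    definition = json.dumps({"ha-mode": "all"})
    return ["rabbitmqctl", "set_policy", "-p", vhost, name, pattern, definition]


if __name__ == "__main__":
    # shlex.join quotes the pattern and JSON so the line is safe to paste
    # into a shell on a broker node.
    print(shlex.join(mirror_policy_command()))
```

Mirroring every queue to every node is the simplest setting; a smaller replica count trades some resilience for less replication traffic.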
  11. HA Cinder services
      • Great open source project: www.consul.io
        • Service discovery: DNS or HTTP interface
        • Health checking for services and hosts
        • Key-value store
        • Scalable, multi-datacenter
      • Other solutions exist (etcd, ZooKeeper, Pacemaker)
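To make the service discovery and health checking ideas concrete, here is a minimal sketch of what registering a Cinder service with a Consul agent might look like. The field names follow Consul's agent service-registration payload; the address, the check interval, the use of a TCP check, and both helper functions are assumptions for illustration, and the talk does not say how Stratoscale configured its checks. Nothing is sent over the network.

```python
import json


def service_registration(name, port, address="127.0.0.1", interval="10s"):
    """Build a payload for Consul's /v1/agent/service/register endpoint."""
    return {
        "Name": name,
        "ID": f"{name}-{address}",
        "Address": address,
        "Port": port,
        # A TCP check only verifies that the port accepts connections.
        "Check": {"TCP": f"{address}:{port}", "Interval": interval},
    }


def healthy_endpoints(instances):
    """Keep only instances whose check status is 'passing'.

    `instances` is a hypothetical, simplified list of dicts, not the exact
    shape returned by Consul's health API.
    """
    return [
        f"{inst['address']}:{inst['port']}"
        for inst in instances
        if inst["status"] == "passing"
    ]


if __name__ == "__main__":
    # 8776 is the default cinder-api port.
    print(json.dumps(service_registration("cinder-api", 8776), indent=2))
```

A load balancer or client could then use something like `healthy_endpoints` to route only to instances that currently pass their checks.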
  12. HA management: our solution
      • [Diagram labels: OpenStack services, RabbitMQ, Galera & MariaDB, consul service discovery]
  13. Command flow example 1: create
      • cinder-api: create DB record, call cinder-scheduler (RPC)
      • cinder-scheduler: choose backend, call cinder-volume (RPC)
      • cinder-volume: does the work; driver creates volume on storage
      • DB update: volume becomes available
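The create flow can be sketched as a toy, single-process model. This is only an illustration under stated assumptions: plain function calls stand in for the RPCs, a dict stands in for the SQL DB, the scheduler rule (most free space) is invented for the example, and all names here are hypothetical rather than Cinder's code. The status values `creating` and `available` are the ones Cinder uses for volumes.

```python
from dataclasses import dataclass


@dataclass
class Backend:
    name: str
    free_gb: int


def api_create(db, volume_id, size_gb):
    """cinder-api step: record the request in the DB before any work starts."""
    db[volume_id] = {"size_gb": size_gb, "status": "creating", "host": None}


def scheduler_choose(backends, size_gb):
    """cinder-scheduler step: pick a backend (here: the one with most free space)."""
    candidates = [b for b in backends if b.free_gb >= size_gb]
    if not candidates:
        raise RuntimeError("no backend has enough free space")
    return max(candidates, key=lambda b: b.free_gb)


def volume_create(db, backend, volume_id):
    """cinder-volume step: the driver call is simulated by reserving space."""
    row = db[volume_id]
    backend.free_gb -= row["size_gb"]
    row["host"] = backend.name
    row["status"] = "available"  # final DB update


def create_volume(db, backends, volume_id, size_gb):
    api_create(db, volume_id, size_gb)
    backend = scheduler_choose(backends, size_gb)  # stands in for an RPC
    volume_create(db, backend, volume_id)          # stands in for an RPC
    return db[volume_id]
```

If a service dies between `api_create` and the final update, the row stays in `creating`, which is the kind of transient state an operator has to clean up.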
  16. Command flow example 2: extend
      • cinder-api: check volume state and update it, call cinder-volume (RPC)
      • cinder-volume: does the work; driver extends volume on storage
      • DB update: volume becomes available
  18. Best practices with Juno
      • Run cinder-api in active/active mode with a load balancer in front
        • If you are worried about two processes modifying the same volume simultaneously, you can work around it with UUID-based routing and local file locks
      • Run cinder-scheduler in active/active mode
      • Run one cinder-volume per backend in active/passive mode
        • Cannot run active/active because of local file locks
      • Make sure you clean up the DB: objects stuck in transient states and deleted objects that are not purged
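One way to read "UUID-based routing" is that the load balancer always sends requests for a given volume to the same cinder-api node, so that node's local file lock is enough to serialize them. The sketch below shows that idea under assumptions: the hashing scheme, the function name, and the node names are hypothetical, and the talk does not describe a specific implementation.

```python
import hashlib
import uuid


def route_for_volume(volume_id, api_nodes):
    """Map a volume UUID to one API node deterministically."""
    if not api_nodes:
        raise ValueError("api_nodes must not be empty")
    # Parsing validates the ID and normalizes case and formatting.
    key = uuid.UUID(volume_id).bytes
    digest = hashlib.sha256(key).digest()
    index = int.from_bytes(digest[:8], "big") % len(api_nodes)
    return api_nodes[index]


if __name__ == "__main__":
    nodes = ["api-1", "api-2", "api-3"]
    vol = str(uuid.uuid4())
    # Repeated requests for the same volume land on the same node.
    print(vol, "->", route_for_volume(vol, nodes))
```

Plain modulo hashing remaps most volumes when the node list changes; a consistent-hashing scheme would reduce that churn, at the cost of more code.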
  19. Fresh in Kilo
      • Kilo was just released
      • Lots of work on driver stability, including CI
      • Multi-attaching volumes has been merged into Cinder, but unfortunately the Nova bits haven't gone in; they will wait for Liberty
      • Support for incremental backups, additional consistency group APIs
  20. Liberty and beyond
      • The community is aware of the issues raised here
      • Some more localized issues, like atomic state transitions, should be addressed in Liberty
      • Recovery and maintaining consistency is a problem with no clear roadmap at this point
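An atomic state transition can be illustrated with a compare-and-swap style `UPDATE`: the status changes only if it still has the expected value, so two concurrent requests cannot both win. This is a minimal sketch using an in-memory SQLite table; the schema and the function name are assumptions for illustration, and it is not a description of what the Cinder community actually implemented.

```python
import sqlite3


def try_transition(conn, volume_id, expected, new):
    """Move a volume from `expected` to `new` status in a single statement.

    Returns True only if this caller performed the transition.
    """
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "UPDATE volumes SET status = ? WHERE id = ? AND status = ?",
            (new, volume_id, expected),
        )
    return cur.rowcount == 1


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE volumes (id TEXT PRIMARY KEY, status TEXT)")
    conn.execute("INSERT INTO volumes VALUES ('vol-1', 'available')")
    # The first extend request wins; a second one sees 'extending' and loses.
    print(try_transition(conn, "vol-1", "available", "extending"))
    print(try_transition(conn, "vol-1", "available", "extending"))
```

Compared with a separate "check state, then update" sequence, the single conditional statement leaves no window in which another API process can slip in.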
  21. Thank You
      • Avishay Traeger, PhD
      • Shimshon Zimmerman