Dockerizing the Hard Services: Neutron and Nova
Dockerizing the Hard Services: Neutron & Nova
Clayton O’Neill <[email protected]>
Overview
● What magic incantations are needed to run these services at all?
● How to prevent HA router failover on service restarts
● How to prevent network namespaces from breaking everything
● Bonus: How are network namespaces related to Cinder!?
Previously Seen On...
● “Deploying OpenStack Using Docker in Production”
  • Why use Docker?
  • How do we deploy OpenStack with Docker?
  • Pain points
● Video: https://youtu.be/3pc85InNR20
● Puppet module: https://github.com/twc-openstack/puppet-os_docker
Docker & OpenStack @ Charter
● Docker in production since July 2015
● Running in Docker in production:
  • Cinder, Designate, Glance, Heat, Keystone, Nova & Neutron
● Using Ceph & SolidFire for volume storage
● Using VXLAN tenant networks with HA routers
● Using Docker 1.12
/var/run vs /run: on modern systemd-based distributions /var/run is a symlink to /run, so the mounts below use /run directly.
Overview
● What magic incantations are needed to run these services at all?
● How to prevent HA router failover on service restarts
● How to prevent network namespaces from breaking everything
● Bonus: How are network namespaces related to Cinder!?
How to Run Neutron OVS Agent in Docker
docker run --net host --privileged \
  -v /run/openvswitch:/run/openvswitch \
  -v /lib/modules:/lib/modules:ro \
  -v /etc/neutron:/etc/neutron:ro \
  -v /var/log/neutron:/var/log/neutron \
  -v /var/lib/neutron:/var/lib/neutron \
  -v /run/lock/neutron:/run/lock/neutron \
  -v /run/neutron:/run/neutron \
  my-docker-registry:5000/cirrus/neutron:7.0.4-120-g1a1224a.19.7a17221 \
  /usr/bin/neutron-openvswitch-agent \
    --config-file=/etc/neutron/neutron.conf \
    --config-file=/etc/neutron/plugins/ml2/ml2_conf.ini \
    --config-file=/etc/neutron/plugins/ml2/openvswitch_agent.ini
How to Run Nova Compute in Docker
docker run --net host --privileged \
  -e OS_DOCKER_GROUP_DIR=/etc/nova/groups \
  -e OS_DOCKER_HOME_DIR=/var/lib/nova \
  -v /etc/nova:/etc/nova:ro \
  -v /etc/ceph:/etc/ceph:ro \
  -v /etc/iscsi:/etc/iscsi \
  -v /dev:/dev \
  -v /etc/ssh/ssh_known_hosts:/etc/ssh/ssh_known_hosts:ro \
  -v /lib/modules:/lib/modules:ro \
  -v /run/libvirt:/run/libvirt \
  -v /run/openvswitch:/run/openvswitch \
  -v /run/lock:/run/lock \
  -v /var/log/nova:/var/log/nova \
  -v /var/lib/nova:/var/lib/nova \
  -v /var/lib/libvirt:/var/lib/libvirt \
  my-docker-registry:5000/cirrus/nova:12.0.4-2-gc55aacf.19.0522b22 \
  /usr/bin/nova-compute
Docker “--net host”
● The “--net host” flag turns off Docker networking
● Slightly faster...
● Nova and Neutron both interact directly with host networking
Docker “--privileged”
● Similar to giving the container root privileges
● Needed for iptables, mount, etc.
● Neutron & Nova still run as an unprivileged user
● Rootwrap is used to execute privileged commands
● More fine-grained options exist now (--cap-add & --cap-drop)
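To illustrate the fine-grained route, here is a minimal sketch of swapping --privileged for explicit capabilities. The capability list below is an assumption (NET_ADMIN for iptables/ip, SYS_ADMIN for mount, SYS_MODULE for module loading) and would have to be validated against the agent's rootwrap filters before trusting it in production.

```shell
# ASSUMPTION: this capability set is illustrative, not a tested
# replacement for --privileged in the neutron agents.
caps="--cap-add NET_ADMIN --cap-add SYS_ADMIN --cap-add SYS_MODULE"

# Dry run: print the command instead of starting the agent.
echo docker run --net host $caps \
  my-docker-registry:5000/cirrus/neutron:7.0.4-120-g1a1224a.19.7a17221 \
  /usr/bin/neutron-openvswitch-agent
```

Starting from --cap-drop ALL and adding capabilities back one at a time is the safer way to discover the real minimal set.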
Neutron OVS “Data Volume” Mounts
● Docker volumes are used to expose the host filesystem inside the container
● -v /etc/neutron:/etc/neutron:ro
  • Allows read-only access to config files
● -v /var/log/neutron:/var/log/neutron
  • Allows writing log files
● -v /run/openvswitch:/run/openvswitch
  • Allows communicating with OVS & OVSDB via the control socket
  • Allows OVS commands to work inside the container
● /var/lib/neutron, /run/lock/neutron, /run/neutron
  • Mounted to store state outside of the container
Nova Compute “Data Volume” Mounts
● /etc/ceph - Ceph keys (read-only)
● /etc/iscsi - iSCSI (SolidFire configuration)
● /dev - iSCSI mount validation
● /etc/ssh/ssh_known_hosts - Live migration (read-only)
● /run/openvswitch - OVS/OVSDB control socket
● /run/libvirt - Libvirt control socket
Overview
● What magic incantations are needed to run these services at all?
● How to prevent HA router failover on service restarts
● How to prevent network namespaces from breaking everything
● Bonus: How are network namespaces related to Cinder!?
How Do Neutron HA Routers Work Anyway?
● Keepalived is used to provide the HA feature for virtual routers
● Keepalived sends heartbeats between the two network nodes
● Keepalived fails over if no heartbeats are heard
● Keepalived fails over on shutdown
● IP failover interrupts data-path traffic
  • Not instantaneous
  • NAT/firewall mappings are lost
The Problem with HA Routers and Docker
● The L3 agent spawns Keepalived as a child process
● L3 agent in the container means Keepalived is also inside
● Keepalived's lifetime is tied to the container's lifetime
● Container restarts lead to router failovers!
● An L3 agent rolling restart causes all routers to fail over!
● The DHCP agent has the same issue with dnsmasq
Stabilizing Keepalived & dnsmasq
● Separate Keepalived and dnsmasq from the agents
● Run Keepalived and dnsmasq in their own containers
● Start the Keepalived/dnsmasq containers from inside a container!
● L3/DHCP agent restarts no longer affect Keepalived/dnsmasq
Enable Docker Inside Agent Containers
● The L3 and DHCP agents need to start Docker containers
● Need access to the Docker Engine socket
  • The socket provides API access to the Docker Engine
  • -v /var/run/docker.sock:/var/run/docker.sock
● Need the Docker client inside the container
● The Docker client API version has to match the Docker Engine API version
  • https://docs.docker.com/engine/reference/api/docker_remote_api/
● The Neutron user *cannot* access the Docker Engine
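A container entrypoint can fail fast on a client/engine mismatch. The sketch below factors the comparison into a function so it can be exercised without a running engine; on recent engines the real values would come from something like `docker version --format '{{.Client.APIVersion}}'` (the exact template field name varies across Docker releases, so treat that as an assumption to verify).

```shell
# Sketch: refuse to start if client and engine API versions differ.
check_api_match() {
  client="$1"; server="$2"
  if [ "$client" != "$server" ]; then
    echo "API mismatch: client=$client engine=$server" >&2
    return 1
  fi
  echo "API versions match: $client"
}

# With live values this would be:
#   check_api_match "$(docker version --format '{{.Client.APIVersion}}')" \
#                   "$(docker version --format '{{.Server.APIVersion}}')"
check_api_match 1.24 1.24
```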
How to Get Neutron to Cooperate?
● Intercept the keepalived and dnsmasq invocations
  • Place keepalived/dnsmasq wrapper scripts in /usr/local/bin
● Update the rootwrap filters
● Add ‘--pid host’ to ‘docker run’
  • The agents need to see keepalived/dnsmasq process IDs
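A hedged sketch of what such a wrapper might look like, dropped at /usr/local/bin/keepalived inside the agent container. The image tag, container naming scheme, and keepalived arguments are illustrative, not the talk's actual wrapper; the point is that rootwrap invokes this script instead of the real keepalived, and the script starts keepalived in its own detached container so agent restarts leave it running.

```shell
# Hypothetical wrapper: launch keepalived in a separate container,
# entered into the router's network namespace via 'ip netns exec'.
keepalived_wrapper() {
  router_id="$1"; shift
  # Dry run: print the docker command a real wrapper would exec.
  echo docker run --detach --net host --pid host --privileged \
    -v /run/netns:/run/netns \
    --name "keepalived-$router_id" \
    openstack-dev:20160731.0 \
    ip netns exec "qrouter-$router_id" keepalived "$@"
}

keepalived_wrapper 035063f3-e480-4ce6-9e16-087d862ca0c1 --dont-fork
```

The --name convention keeps one uniquely named container per router, which is what the `docker ps` listing later in the deck shows.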
Upgrade to Docker 1.12
● Docker 1.12 allows engine restarts without container restarts
● Initially mounted /var/run/docker.sock into containers
● Worked until we restarted the Docker Engine
● The socket inside the container pointed to the old socket
● Reconfigured the Docker Engine to listen on a second socket
  • /var/run/docker-sock/docker.sock
● Updated containers to mount the /var/run/docker-sock directory
● Updated scripts to use the socket in its new location
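Assuming a systemd-managed engine, the second socket can be added with a drop-in like the sketch below (written to /tmp here purely for illustration; a real drop-in lives under /etc/systemd/system/docker.service.d/). Mounting the *directory* /var/run/docker-sock rather than the socket file itself is what lets containers pick up the fresh socket after an engine restart.

```shell
# Hypothetical systemd drop-in adding a second API socket to dockerd.
mkdir -p /tmp/dropin-demo
cat > /tmp/dropin-demo/docker-socket.conf <<'EOF'
[Service]
ExecStart=
ExecStart=/usr/bin/dockerd -H fd:// -H unix:///var/run/docker-sock/docker.sock
EOF
cat /tmp/dropin-demo/docker-socket.conf
```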
Neutron HA Routers and Docker – It Works!
# docker ps --format 'table {{.Names}}\t{{.Image}}\t{{.Command}}'
NAMES                                            IMAGE                                       COMMAND
neutron-metadata-agent                           neutron:7.0.4-120-g1a1224a.19.7a17221       "/venv/wrapper.sh /us"
neutron-l3-agent                                 neutron:7.0.4-120-g1a1224a.19.7a17221       "/venv/wrapper.sh /us"
neutron-openvswitch-agent                        neutron:7.0.4-120-g1a1224a.19.7a17221       "/venv/wrapper.sh /us"
keepalived-035063f3-e480-4ce6-9e16-087d862ca0c1  openstack-dev:20160731.0-11002e.25.70d62e6  "ip netns exec qroute"
keepalived-9718ffd2-5125-4894-88f0-da93a6cf451d  openstack-dev:20160731.0-11002e.25.70d62e6  "ip netns exec qroute"
keepalived-488b09fc-f17b-4db3-b3ab-46d54533d291  openstack-dev:20160731.0-11002e.25.70d62e6  "ip netns exec qroute"
keepalived-f0b84d2d-dc61-4179-b3fd-7a8966801d8d  openstack-dev:20160731.0-11002e.25.70d62e6  "ip netns exec qroute"
keepalived-e86e466f-ddb4-4fa7-92d9-87ff71a6ed6c  openstack-dev:20160731.0-11002e.25.70d62e6  "ip netns exec qroute"
keepalived-66a30a15-df7c-4cde-be52-20a1d8cc772a  openstack-dev:20160731.0-11002e.25.70d62e6  "ip netns exec qroute"
keepalived-4e120804-959d-45c9-8c9a-8376351508d2  openstack-dev:20160731.0-11002e.25.70d62e6  "ip netns exec qroute"
keepalived-c589785a-5594-46dc-b726-b499025f1c80  openstack-dev:20160731.0-11002e.25.70d62e6  "ip netns exec qroute"
keepalived-34fb33ea-fee0-4fb8-addc-32ed27b4b02c  openstack-dev:20160731.0-11002e.25.70d62e6  "ip netns exec qroute"
keepalived-48529201-aeab-4f17-9d4f-1ea2a631311c  openstack-dev:20160731.0-11002e.25.70d62e6  "ip netns exec qroute"
keepalived-d9dcb6b1-9dc6-49b8-86e7-2ef7a97fedbe  openstack-dev:20160731.0-11002e.25.70d62e6  "ip netns exec qroute"
Overview
● What magic incantations are needed to run these services at all?
● How to prevent HA router failover on service restarts
● How to prevent network namespaces from breaking everything
● Bonus: How are network namespaces related to Cinder!?
Network Namespace Magic
● What is a network namespace?
● How do network namespaces work?
● How can Dockerized services break namespaces?
What Is a Network Namespace?
From ip-netns(8):
  A network namespace is logically another copy of the network stack, with its own routes, firewall rules, and network devices.
  By default a process inherits its network namespace from its parent. Initially all the processes share the same default network namespace from the init process.
  By convention a named network namespace is an object at /var/run/netns/NAME that can be opened. The file descriptor resulting from opening /var/run/netns/NAME refers to the specified network namespace. Holding that file descriptor open keeps the network namespace alive. The file descriptor can be used with the setns(2) system call to change the network namespace associated with a task.
How Is a Network Namespace Created?
● The “ip netns add” command calls unshare(CLONE_NEWNET)
● This places the “ip netns” process in a new, empty namespace
● Changes what /proc/self/ns/net points to
● If nothing is using the namespace, it disappears
● “Using” the namespace means either:
  • A process running in the namespace
  • A process holding the namespace file open

# ls -l /proc/self/ns/net
lrwxrwxrwx 1 root root 0 Sep 6 20:18 /proc/self/ns/net -> net:[4026531957]
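This is easy to poke at from any shell, no root required: every process's current network namespace is visible under /proc, and two processes in the same namespace resolve to the same bracketed inode number.

```shell
# Read our own namespace identity; the number in net:[...] is the
# namespace's inode, shared by every process in that namespace.
ns=$(readlink /proc/self/ns/net)
echo "$ns"
```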
Network Namespace Persistence
● Running “strace ip netns add test” shows:
  • open("/var/run/netns/test", O_RDONLY|O_CREAT|O_EXCL, 0) = 4
  • close(4)
  • unshare(CLONE_NEWNET)
  • mount("/proc/self/ns/net", "/var/run/netns/test", 0x43981d, MS_BIND, NULL)
● The bind mount creates an alias for the namespace file
● The alias outlives the “ip netns add test” process
● The namespace becomes permanent and reusable
What Does This Have to Do with Docker?
● Docker volumes interact with mounts in unintuitive ways
● Network namespaces are persisted as filesystem mounts!
● The L3 and DHCP agents need access to /var/run/netns!
Docker Volumes and Filesystem Mounts
● By default:
  • Mounts that exist when the container is started are visible inside it
  • New mounts made inside the container aren’t visible outside
  • New mounts made outside the container aren’t visible inside
  • Unmounts don’t propagate from host to container or vice versa
● Mounts are set up on container start, but not synchronized afterward
Docker Namespace Test
# ip netns add test1
# docker run --name test --detach -v /run/netns:/run/netns ubuntu sleep 1h
8fda0cf570d33f4984cdcaa4e540dd07531e7a4ecb51df3594a85b3346aa294c
# ip netns add test2
Docker Namespace Test - Host
# ls -l /run/netns
total 0
-r--r--r-- 1 root root 0 Sep 29 15:15 test1
-r--r--r-- 1 root root 0 Sep 29 15:15 test2
# grep /run/netns /proc/mounts
tmpfs /run/netns tmpfs rw,nosuid,noexec,relatime,size=791316k,mode=755 0 0
nsfs /run/netns/test1 nsfs rw 0 0
nsfs /run/netns/test2 nsfs rw 0 0
Docker Namespace Test - Container
# docker exec test ls -l /var/run/netns/
total 0
-r--r--r-- 1 root root 0 Sep 29 15:15 test1
---------- 1 root root 0 Sep 29 15:15 test2
# docker exec test grep /run/netns /proc/mounts
tmpfs /run/netns tmpfs rw,nosuid,noexec,relatime,size=791316k,mode=755 0 0
nsfs /run/netns/test1 nsfs rw 0 0
Solution: Docker Volume Flags!
● Docker supports flags to change the default mount behavior
● Flags on the mount determine propagation of new mounts/unmounts
● Uses the Linux shared-subtree flags
● Kernel docs
  • https://www.kernel.org/doc/Documentation/filesystems/sharedsubtree.txt
● Docker docs
  • https://github.com/docker/docker/blob/master/docs/reference/commandline/service_create.md#bind-propagation
● The fix is simple: use the “shared” flag for /run/netns
  • -v /run/netns:/run/netns:shared
  • Enables bidirectional propagation of mounts
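Putting the fix together, a corrected L3 agent invocation might look like the sketch below (only the mounts relevant to namespace propagation are shown; the full mount list would match the neutron example earlier). It is printed rather than executed here.

```shell
# Sketch of the corrected agent invocation: note the :shared flag on
# the /run/netns volume so nsfs mounts propagate both ways.
cmd="docker run --net host --pid host --privileged \
  -v /run/netns:/run/netns:shared \
  -v /etc/neutron:/etc/neutron:ro \
  my-docker-registry:5000/cirrus/neutron:7.0.4-120-g1a1224a.19.7a17221 \
  /usr/bin/neutron-l3-agent --config-file=/etc/neutron/neutron.conf"
echo "$cmd"
```

Re-running the namespace test above with this flag, test2's nsfs mount shows up inside the container too.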
Overview
● What magic incantations are needed to run these services at all?
● How to prevent HA router failover on service restarts
● How to prevent network namespaces from breaking everything
● Bonus: How are network namespaces related to Cinder!?
Cinder NFS Backend and Nova
● The NFS backend wants to mount NFS shares itself
● Libvirt/QEMU need to be able to see those NFS shares
  • Libvirt & QEMU run outside the nova-compute container
● If nova-compute mounts the share inside the container, Libvirt can’t see it!
● We need mounts under /var/lib/cinder/mnt to propagate
● The shared flag must be applied to a filesystem mount on the host
● Solution
  • mount --bind /var/lib/cinder/mnt /var/lib/cinder/mnt
  • mount --make-shared /var/lib/cinder/mnt
  • docker run -v /var/lib/cinder/mnt:/var/lib/cinder/mnt:shared
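A way to sanity-check this setup, sketched below with the comparison factored into a function so it runs without root: shared mounts carry a `shared:N` tag in the optional-fields column of /proc/self/mountinfo, so a startup script could refuse to launch nova-compute until the tag is present. The sample mountinfo line is illustrative.

```shell
# Sketch: verify the host-side bind mount is actually marked shared.
check_shared() {
  mountinfo_line="$1"  # normally: grep ' /var/lib/cinder/mnt ' /proc/self/mountinfo
  case "$mountinfo_line" in
    *shared:*) echo "mount is shared" ;;
    *)         echo "mount is NOT shared" >&2; return 1 ;;
  esac
}

check_shared "1052 29 8:1 / /var/lib/cinder/mnt rw,relatime shared:1 - ext4 /dev/sda1 rw"
```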
Questions?
● Email: [email protected]
● IRC: clayton
● Twitter: clayton_oneill