Live Migration · Migrations in OpenStack ... • Hypervisor doesn’t know what type of disks...

Live Migration: Mitaka and Beyond
Paul Murray, Hewlett Packard Enterprise
Andrea Rosa, Hewlett Packard Enterprise
Pawel Koniszewski, Intel Corporation

Why live migration?

Host Maintenance

Rolling Updates

Power Optimization

Migrations in OpenStack

✗ Non-live migration (cold migration)
• nova migrate <server>

✓ Live migration
• nova live-migration <server> [<host>]

✓ Block live migration (optional)
• nova live-migration --block-migrate <server> [<host>]

Assumptions

• Live

• Consistent

• Transparent

• Minimal service disruption

Production experience

Works 85% of the time*!

• Hypervisor doesn’t know what type of disks there are
• Migration may fail
• Migration may never end
• Migration traffic may impact network bandwidth
• Possible guest network disruption on migration
• New resource types and physical mappings
• …

...and bugs

*Vancouver OpenStack Summit: Live Migration at HP Public Cloud, Dive into VM Live Migration
https://www.openstack.org/summit/vancouver-2015/summit-videos/presentation/live-migration-at-hp-public-cloud
https://www.openstack.org/summit/vancouver-2015/summit-videos/presentation/dive-into-vm-live-migration

Nova Live Migration Priority

Make it simple
• Fewer config options
• Fewer API options

Make it work
• Many fixes in libvirt/qemu
• Fix bugs in OpenStack
• Improve CI

Make it manageable
• Progress information
• Abort or force to completion
• Isolate networking

Make it Simple

Friendly Live Migration API

Keep It Simple, Stupid Sir

“Let the machine do the dirty work”*

*Kernighan and Plauger, “The Elements of Programming Style”, 1978

- block_migration
- disk_over_commit

- live_migration_flag
- block_migration_flag
+ live_migration_tunnelled

block_migration: Nova virt driver
• Libvirt: is_on_shared_storage
• Xen: using aggregate
• Hyper-V: not supported

disk_over_commit
• Libvirt specific, do not expose it via API

Warning - Rolling Upgrades

• Mitaka API is not backward compatible
• nova --os-compute-api-version 2.24 live-migration <server> [<host>]
• nova --os-compute-api-version 2.24 live-migration --block-migrate <server> [<host>]

live_migration_flag
block_migration_flag

live_migration_tunnelled

Make it Work

Scheduling in Mitaka

• All original scheduling properties are preserved
  • extra specs
  • scheduler hints
  • image properties
• Scheduler can correctly choose target for live migration

Block migration with attached volumes

• Volumes are not copied

• Selective disk migration requires Libvirt >= 1.2.17
• In Ubuntu it requires >= 1.2.16, but a manual change in code on compute nodes is needed

Live Migration with config drive attached

• Default config drive type is iso9660
• Due to Libvirt bug iso9660 is not migratable

                                 iso9660   vfat
Block live migration                ✗       ✓
Volume-backed live migration        ✗       ✓
Shared storage live migration       ✓       ✓

Memory Oversubscription prior to Mitaka

• LM to specific host does not use memory oversubscription
• ram_allocation_ratio

[Diagram: Compute Node A (2 GB RAM), ram_allocation_ratio = 2.0. Reported RAM = available - reserved. Labels: nova-conductor 2 GB, nova-scheduler 4 GB.]

Memory Oversubscription in Mitaka

• LM to specific host mimics RAM Filter logic

[Diagram: Compute Node A (2 GB RAM) reports total RAM, free RAM, and RAM allocation ratio. Labels: nova-conductor 4 GB, nova-scheduler 4 GB.]

Memory = Total * Ratio - (Total - Free)
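
The formula above can be sketched in Python as a rough illustration. The function names and the choice of megabytes as the unit are assumptions made for this example, not Nova's actual code. The sample numbers mirror the slide's Compute Node A (2 GB RAM, allocation ratio 2.0).

```python
def usable_memory_mb(total_mb: float, free_mb: float, ram_allocation_ratio: float) -> float:
    """Memory = Total * Ratio - (Total - Free), with every quantity in MB."""
    used_mb = total_mb - free_mb
    return total_mb * ram_allocation_ratio - used_mb


def fits_on_host(instance_mb: float, total_mb: float, free_mb: float,
                 ram_allocation_ratio: float) -> bool:
    """Hypothetical helper: does an instance fit under the oversubscribed limit?"""
    return instance_mb <= usable_memory_mb(total_mb, free_mb, ram_allocation_ratio)


if __name__ == "__main__":
    # Empty 2 GB host with ratio 2.0, as on the slide.
    print(usable_memory_mb(2048, 2048, 2.0))
    # A 3 GB instance fits only when the ratio is taken into account.
    print(fits_on_host(3072, 2048, 2048, 2.0))
    print(fits_on_host(3072, 2048, 2048, 1.0))
```

With a ratio of 1.0 the same check falls back to plain free memory, which matches the pre-Mitaka behaviour described on the previous slide for migrations to a specific host.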

Page Modification Logging

• Hardware-assisted dirty logging mechanism
• Performance of a VM increased up to 8%
• Requires:
  • 4th generation Intel Xeon processor
  • Kernel version >= 4.0

Make it Manageable

Pets and cattle metaphor

molly.mycompany.com

charlie.mycompany.com

Pets and Cattle metaphor

server1.mycompany.com

server2.mycompany.com

Pets and cattle metaphor: the theory

Pets and cattle metaphor: the reality

Molly2.company.com

Charlie1.company.com

Management of on-going live migrations

• Progress details
• Force to complete
• Abort

Progress details

• List migrations for a server
  nova server-migration-list <server>

• Show a migration for a server
  nova server-migration-show <server> <migration id>

• Details: disk progress, memory progress

Progress details

$ nova server-migration-show e5fe4c30-c993-43a3-a4b4-a1e48ee93606 4
+------------------------+--------------------------------------+
| Property               | Value                                |
+------------------------+--------------------------------------+
| created_at             | 2016-04-22T12:31:55.000000           |
| dest_compute           | devstack3                            |
| dest_host              | -                                    |
| dest_node              | -                                    |
| disk_processed_bytes   | 8109686784                           |
| disk_remaining_bytes   | 13365149696                          |
| disk_total_bytes       | 21474836480                          |
| id                     | 4                                    |
| memory_processed_bytes | 0                                    |
| memory_remaining_bytes | 2156605440                           |
| memory_total_bytes     | 2156605440                           |
| server_uuid            | e5fe4c30-c993-43a3-a4b4-a1e48ee93606 |
| source_compute         | devstack3a                           |
| source_node            | -                                    |
| status                 | running                              |
| updated_at             | 2016-04-22T12:37:07.000000           |
+------------------------+--------------------------------------+
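
As a small illustration, the byte counters in the output above can be turned into percentages. This is a sketch only: the helper name percent_done and the dictionary layout are assumptions made for the example. The values are copied from the sample output.

```python
def percent_done(processed_bytes: int, total_bytes: int) -> float:
    """Return progress as a percentage; 0.0 when the total is unknown or zero."""
    if total_bytes <= 0:
        return 0.0
    return 100.0 * processed_bytes / total_bytes


# Counters copied from the sample `nova server-migration-show` output.
migration = {
    "disk_processed_bytes": 8109686784,
    "disk_remaining_bytes": 13365149696,
    "disk_total_bytes": 21474836480,
    "memory_processed_bytes": 0,
    "memory_remaining_bytes": 2156605440,
    "memory_total_bytes": 2156605440,
}

disk_pct = percent_done(migration["disk_processed_bytes"], migration["disk_total_bytes"])
memory_pct = percent_done(migration["memory_processed_bytes"], migration["memory_total_bytes"])

if __name__ == "__main__":
    print(f"disk:   {disk_pct:.1f}% copied")
    print(f"memory: {memory_pct:.1f}% copied")
```

In this sample the disk copy is a bit over a third done and the memory copy has not started, which is consistent with a block migration that is still in its disk phase.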

Operators love to kill a live migration

How to abort an in-progress live migration

nova live-migration-abort <server> <migration id>

• Aborts the running job and triggers a rollback

• Works only when libvirt is used as a driver (QEMU/KVM hypervisor)

• Won’t work with post-copy live migration

Force to Complete

nova live-migration-force-complete <server> <migration id>

• Pauses VM during LM

• Automatically unpauses VM

• Works only when libvirt is used as a driver

Live Migration on dedicated network

New configuration parameter:

live_migration_inbound_addr

Live migration traffic

Complex KVM installation with VSA model

live_migration_inbound_addr

Future of Live Migration

Post-copy Live Migration

Pre-copy Post-copy

● Move workload to destination in the middle of the process

Post-copy Live Migration

• Live migration ends in a finite time
• VM needs to be rebooted in case of failure
• Performance impact on memory reads
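
To see why pre-copy migration "may never end" while post-copy finishes in finite time, a toy model can help. The sketch below is not how QEMU or libvirt decide convergence. It assumes a constant link bandwidth and a constant rate at which the guest dirties memory, and every name in it is hypothetical.

```python
from typing import Optional


def precopy_rounds(total_bytes: float, bandwidth_bytes_per_s: float,
                   dirty_bytes_per_s: float, max_downtime_s: float,
                   max_rounds: int = 30) -> Optional[int]:
    """Count pre-copy rounds needed before the final stop-and-copy fits the downtime.

    Returns None if the remaining data never becomes small enough
    within max_rounds (the migration does not converge).
    """
    if bandwidth_bytes_per_s <= 0:
        raise ValueError("bandwidth must be positive")
    # Data that can be sent while the guest is paused for max_downtime_s.
    downtime_budget = bandwidth_bytes_per_s * max_downtime_s
    remaining = float(total_bytes)
    for rounds in range(max_rounds + 1):
        if remaining <= downtime_budget:
            return rounds
        copy_time_s = remaining / bandwidth_bytes_per_s
        # Memory dirtied while this round was copying must be sent again.
        remaining = dirty_bytes_per_s * copy_time_s
    return None


if __name__ == "__main__":
    # Guest dirties memory more slowly than the link can copy it: converges.
    print(precopy_rounds(2e9, 100e6, 10e6, 0.5))
    # Guest dirties memory faster than the link can copy it: never converges.
    print(precopy_rounds(2e9, 100e6, 120e6, 0.5))
```

In this model the remaining data shrinks each round only when the dirty rate is below the bandwidth. Post-copy avoids the loop by moving the workload to the destination partway through, at the cost listed in the bullets above.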

Check Destination on Migration

• Live migration can be forced to particular host
• Adds a new parameter to check provided host in scheduler

nova live-migration <server> [<host>] --check

Summary

Mitaka blueprints (merged):

• Fewer config options
• Fewer API options
• Scheduling with original request properties
• Block migration with volumes and vfat config drive
• Fixed memory oversubscription
• Progress reporting
• Abort migration
• Force migration to complete
• Split network plane

Future
• Post-copy live migration
• Check destination…

… and more in planning

Legal Notices and Disclaimers

• Intel technologies’ features and benefits depend on system configuration and may require enabled hardware, software or service activation. Learn more at intel.com, or from the OEM or retailer.

• No computer system can be absolutely secure.

• Tests document performance of components on a particular test, in specific systems. Differences in hardware, software, or configuration will affect actual performance. Consult other sources of information to evaluate performance as you consider your purchase. For more complete information about performance and benchmark results, visit http://www.intel.com/performance.

• Intel, the Intel logo and others are trademarks of Intel Corporation in the U.S. and/or other countries.

*Other names and brands may be claimed as the property of others.

• © 2016 Intel Corporation.