Sharing Best Practices in Setting up and Operating OpenStack CI Loops

Operating OpenStack Xen CI Loops Stefano Stabellini, Bob Ball, Anthony Perard BoF Sharing Best Practices 20/05/2015

Transcript of Sharing Best Practices in Setting up and Operating OpenStack CI Loops

Operating OpenStack Xen CI Loops

Stefano Stabellini, Bob Ball, Anthony Perard

BoF – Sharing Best Practices

20/05/2015

© 2014 Citrix. Confidential.

About this BoF

Brief introduction to XenProject CI

Comparing two CI environments we have set up (XenServer vs libvirt+Xen CI)

Talk through some of the issues we have encountered, and what we see as good practices coming out of those issues

We aren’t claiming to have “the answers” – we hope our experience will prompt discussion.


Why Xen?

Xen is a type-1 hypervisor

Small footprint (less than 100K LOC)

GPLv2

Powers the largest public cloud in production (>50% of vendors listed in the Gartner 2014 Cloud Magic Quadrant, more in terms of hosts)


Make Xen the best hypervisor for OpenStack


OpenStack


Progress and Goals

Goals:

• Make Xen a great platform for OpenStack production deployments

• Make Xen a great platform for OpenStack development and hacking

01/2015: Xen via libvirt still in Group C

A tale of two CIs


XenServer CI

• 15 months old

• Built before many current tools were available

• Single use devstack Virtual Machines

• Custom process to watch Gerrit

• Custom process to trigger jobs

• Heavily-modified upstream components

• Uploading logs to swift

Libvirt+Xen CI

• 3 months old

• Fork of Ramy Asselin’s puppet scripts

• Single use devstack Virtual Machines

• Zuul watches Gerrit stream

• Jenkins triggers jobs

• Uploading logs to swift
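
The "watch Gerrit, then trigger jobs" step that both CIs perform can be illustrated with a minimal sketch. It assumes events arrive one JSON object per line, as Gerrit's stream-events command emits them, and that only patchset-created events for a set of watched projects should start a job. The names Trigger and triggers_from_stream, and the project names in the usage note, are hypothetical and chosen for illustration; this is not how Zuul or the custom XenServer CI process is implemented.

```python
import json
from typing import Iterable, Iterator, NamedTuple, Set


class Trigger(NamedTuple):
    """Hypothetical record describing one change that should be tested."""
    project: str
    change: str
    patchset: str
    ref: str


def triggers_from_stream(lines: Iterable[str], watched: Set[str]) -> Iterator[Trigger]:
    """Yield a Trigger for each patchset-created event on a watched project.

    `lines` is any iterable of text lines, for example the stdout of an
    SSH session running Gerrit's stream-events command.
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip partial or corrupt lines instead of crashing the watcher
        if not isinstance(event, dict):
            continue
        if event.get("type") != "patchset-created":
            continue  # comments, merges, etc. are ignored in this sketch
        change = event.get("change", {})
        patchset = event.get("patchSet", {})
        project = change.get("project")
        if project not in watched:
            continue
        yield Trigger(
            project=project,
            change=str(change.get("number")),
            patchset=str(patchset.get("number")),
            ref=patchset.get("ref", ""),
        )
```

A caller would loop over triggers_from_stream(stream, {"openstack/nova"}) and hand each Trigger to whatever launches the single-use devstack VM. Keeping the parser separate from the SSH connection makes it easy to replay captured events when reproducing a failure.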


XenServer CI: Major components

Components (from diagram): Gerrit, Xenapi-os-testing, Devstack-Gate, Citrix-openstack-ci, Nodepool, project-config


XenServer CI: The Good, The Bad and The Ugly

Single-use VMs

Easy access statistics

Trivial to disable

Email monitoring

Swift upload

Failure reproduction

Swift upload

Custom orchestration

Tempest exclusion list

Single cloud

Constant rebasing

Forked upstream repos

Comment format

Single point of failure

Inconsistent reliability


Libvirt+Xen CI: Major components

Components (from diagram): Nodepool, Gerrit, Zuul, Jenkins, Devstack-Gate, Gearman, JJB, os-ext-testing


Libvirt+Xen CI: The Good, The Bad and The Ugly

Single-use VMs

Based on upstream

Swift upload

Highly reliable

Steep learning curve

Swift upload

Backport upstream Xen fixes

Changes to puppet scripts

No monitoring

Single Cloud

Swift upload

No pre-prod env


Libvirt+Xen CI: Upgrades and backport

No “hacks” required

Libvirt 1.2.14 with:

• f86ae40 libxl: Move job acquisition in libxlDomainStart to callers

• 894d2ff libxl: acquire a job when destroying a domain

• 6dfec1e libxl: drop virDomainObj lock when destroying a domain

Xen 4.4 (ubuntu package) with:

• 9369988 libxl: event handling: Break out ao_work_outstanding

• f1335f0 libxl: event handling: ao_inprogress does waits while reports outstanding

• 4783c99 libxl: In domain death search, start search at first domid we want

• 188e9c5 libxl: Domain destroy: fork


Our mistakes

Well, some of them…

Not regularly attending the Third Party meetings - XenServer CI predated them

Too many forks - although some have been merged back already

Assumed creating own environment was easier

Incorrect assumptions about devstack-gate flags – use the source, Luke.

Insufficient isolation between the CI environments – cloud credentials

Too many CPUs / Not enough RAM – Devstack is hungry

Using Microsoft Outlook – Insufficient Filtering

Best Practices


Our Suggestions

(Not necessarily best practices)

Participate in the Third Party meetings

Participate in the Third Party WG meetings

Nodes: Single use, orchestrated by Nodepool. Preferably in an OpenStack cloud.

Orchestration: Third-party CI puppet scripts.

Logs: Served from Swift. We now have > 1TB logs.

Projects: Almost everything(!) - Minimal suggestion: Add Tempest and Devstack.

Coverage: Disable tests to improve pass rate(!).
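
The coverage suggestion, like the Tempest exclusion list mentioned for the XenServer CI, comes down to filtering test IDs before a run. The following is a minimal sketch of that idea, assuming the exclusion list is a text file with one regular expression per line and "#" comments. The function names and the file format are assumptions made for illustration, not the format either CI actually uses.

```python
import re
from typing import Iterable, List


def load_exclusions(text: str) -> List["re.Pattern[str]"]:
    """Compile one regex per non-empty line; text after '#' is a comment."""
    patterns = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if line:
            patterns.append(re.compile(line))
    return patterns


def select_tests(test_ids: Iterable[str], exclusions: List["re.Pattern[str]"]) -> List[str]:
    """Return the test IDs that match none of the exclusion patterns."""
    return [
        test_id
        for test_id in test_ids
        if not any(pattern.search(test_id) for pattern in exclusions)
    ]
```

Keeping a comment next to each pattern (a bug link or a reason) makes it easier to revisit the list later and re-enable tests, which bears on the "where is the line?" question about coverage.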


Our Questions

Voting: Simplifying the disable/enable loop

Coverage: Disabling individual tests – Where is the line?

Monitoring: Need a solution both for us and for sharing results more widely

Enforcement: Several cases of cores ignoring valid fails – breaking the CI

Platform vs Driver testing: Testing of all the things

Shared orchestration: Wouldn’t it be nice… No one can cover all weekends and timezones


Open Discussion
