Using a Canary Microservice to Validate the Software Delivery Pipeline
Tony Wilmer, Lead DevOps Pipeline Engineer - DigitalGlobe Inc.
About DigitalGlobe
DigitalGlobe is the world’s leading provider of high-resolution Earth imagery, data and analysis.
Satellite Constellation
The world’s most sophisticated commercial satellite constellation in orbit
Global Coverage
Capable of collecting well over one billion square kilometers of quality imagery per year
Mapping
Mexico City, Mexico. Imagery © DigitalGlobe. Map © OpenStreetMap contributors
Impactful Situational Analysis
Top image shows two slave labor fishing boats tied to Silver Sea 2, a roughly 2,300-ton refrigerated cargo ship, with its cargo hold open to receive the slave-caught seafood. Bottom image shows the analysis of the same photo.
http://eplore.digitalglobe.com/see-freedom
Combating Human Trafficking & Slavery
• Many mature processes and tools already exist
• Talented Engineers & Developers
• Engineers are allowed to pick the best solution or tool for the job
• Executive Management support
• WV-4 Launch
• Multi-geographical development locations
• Over 70 Agile Teams
• Separate release streams
• Complex Mission Control Systems
• Over 300 Applications
• Disparate environments make it hard to test
• Monolithic systems, manually maintained
• Multi-module builds with cross dependencies
• Long release cycles
• Long deploy outages
• Siloed teams – knowledge gaps
Why DevOps?
• Customer demand for quicker enhancements and fixes
• Reduce cost by changing architecture to Microservices
• Easier to add new functionality (low impact)
• Standardize the platform
• Better release automation (XL Release)
Pipeline as a Service
• The pipeline infrastructure is built and maintained with IaC
• Support hybrid cloud infrastructure
  • AWS + Cloud Foundry + OpenStack
• Have a Pipeline for the Pipeline
• Provide self-service onboarding – enable developers
The Pipeline should be Fast, Secure, Reliable & Available!
Why XL Release?
XL Release
• Orchestration layer
• Hides the complexity
• Release templates are flexible
• Release overview
• Good reporting
Jenkins
• Works well for DIY build automation
• Difficult to manage jobs & config
• Difficult to navigate folders and jobs
• Lots of plugins to manage
Pipeline Tech Stack
[Diagram: pipeline stages Dev, Build, Integrate, Test, Release, Deploy, Operate, with layers labeled Dashboard, Release Orchestration, Operations, Dev / Test dashboards, and Infrastructure]
XL Release - Orchestration Layer
• Delivers customer-facing applications to production
  • “Fed-Ex – We deliver!”
• Multiple customers with unique needs
• Workflow for our IT processes
  • Refreshing pipeline infrastructure “Get Well - Stay Well”
• Get the workflow right, then automate it
How do we know it’s working?
• ELK Stack Dashboards
  • Requires constant monitoring & alerting
• User support via phone, email, chat, tickets
  • Also requires monitoring & alerting
• Canary Microservice
  • Automatically runs and alerts on failures
Let your Canary Sing!
• Microservice that touches the entire tech stack
• Canary Release Validates
  • Pipeline Release Template (workflow)
  • Tool-to-tool communications
  • Operational Platform
• Production instance triggers a new release, restarting the workflow
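The canary loop described above can be sketched roughly as follows. This is a minimal illustration, not the presenter's implementation: the stage names, the `alert` callback, and the `trigger_release` callback are all hypothetical stand-ins for whatever the real pipeline tools expose.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class StageResult:
    stage: str
    ok: bool
    detail: str = ""


def run_canary(stages: List[Tuple[str, Callable[[], bool]]],
               alert: Callable[[str], None]) -> List[StageResult]:
    """Run each stage check in order; alert and stop at the first failure."""
    results = []
    for name, check in stages:
        try:
            ok = bool(check())
            detail = "" if ok else "check returned False"
        except Exception as exc:  # a crashing check counts as a failure
            ok, detail = False, f"{type(exc).__name__}: {exc}"
        results.append(StageResult(name, ok, detail))
        if not ok:
            alert(f"Canary failed at stage '{name}': {detail}")
            break
    return results


def canary_cycle(stages, alert, trigger_release) -> bool:
    """One canary pass; a canary that reaches production starts the next release."""
    results = run_canary(stages, alert)
    healthy = len(results) == len(stages) and all(r.ok for r in results)
    if healthy:
        trigger_release()  # hypothetical hook: restart the workflow
    return healthy


# Example with hypothetical checks that always pass.
stages = [(name, lambda: True) for name in ("Build", "Test", "Release", "Deploy")]
canary_cycle(stages, alert=print, trigger_release=lambda: print("new canary release"))
```

Because the canary only re-triggers itself after a clean pass, a real setup would also need some way to resubmit canaries after a failure; that part is left out here.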
Canary Enhancements
• Additional programming language support
• Better integration with issue tracking & notification systems
• More trend analysis
• Support new tools and platforms
• Negative testing
Pipeline Availability Report
Chart: % Successful by month
Dec '16: 98.990%
Jan '17: 96.629%
Feb '17: 99.983%
Mar '17: 99.933%
Apr '17: 99.167%
• CI/CD Pipeline Availability – April 2017
  • Degradation
    • Unplanned: None
  • Outage
    • Unplanned: ~6 hrs – Artifactory crash: Artifactory stopped at midnight due to disk space issues. Customer impact was ~20 min (first job was at 6am) (would make numbers 99.954%)
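The April figures are consistent with a simple downtime-over-period calculation. The sketch below assumes availability is measured around the clock over the 30-day calendar month; the source does not state the method, so treat the function and its name as illustrative.

```python
def availability_pct(downtime_minutes: float, period_minutes: float) -> float:
    """Availability as a percentage of the period with no outage."""
    if period_minutes <= 0:
        raise ValueError("period_minutes must be positive")
    return 100.0 * (1.0 - downtime_minutes / period_minutes)


# Assumption: April counted as 30 days x 24 h x 60 min = 43,200 minutes.
APRIL_MINUTES = 30 * 24 * 60

full_outage = availability_pct(6 * 60, APRIL_MINUTES)   # ~6 hrs of Artifactory downtime
customer_impact = availability_pct(20, APRIL_MINUTES)   # ~20 min felt by customers

print(f"{full_outage:.3f}%")      # 99.167%
print(f"{customer_impact:.3f}%")  # 99.954%
```

Under that assumption, the 6-hour outage gives the 99.167% reported for April, and counting only the 20 minutes of customer impact gives the 99.954% quoted in the slide.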
Canary Availability Reports
• Canary testing
  • We lost a number of canaries during the Artifactory disk issue, which caused a race condition in resubmission of new canaries
• Manual Processes
  • % of time waiting for somebody to push a button (Prod Gate) to the total time for a release to reach production
Chart: % of successful Canaries per month
Nov '16: 76%
Dec '16: 79%
Jan '17: 85%
Feb '17: 89%
Mar '17: 97%
Apr '17: 96%
Chart: % of time releases wait at manual gates
Nov '16: 94.8%
Dec '16: 94.9%
Jan '17: 96.6%
Feb '17: 95.3%
Mar '17: 93.2%
Apr '17: 91.8%
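The manual-gate metric is a ratio of waiting time to total release time. A minimal sketch, assuming each release can be broken into tasks with a duration and a flag marking manual gates; the task tuples and durations below are made up for illustration and are not taken from the slides.

```python
from typing import Iterable, Tuple


def manual_wait_pct(tasks: Iterable[Tuple[str, float, bool]]) -> float:
    """Percentage of total release time spent waiting at manual gates.

    Each task is (name, duration_hours, is_manual_gate).
    """
    total = 0.0
    waiting = 0.0
    for _name, hours, is_manual_gate in tasks:
        total += hours
        if is_manual_gate:
            waiting += hours
    if total == 0:
        return 0.0
    return 100.0 * waiting / total


# Hypothetical release: 1.5 h of automated work, 28.5 h waiting at the Prod Gate.
release = [
    ("build", 0.5, False),
    ("test", 1.0, False),
    ("prod gate", 28.5, True),
]
print(f"{manual_wait_pct(release):.1f}%")  # 95.0%
```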
Pipeline Volume in XL Release
• Average Release
  • Duration: How long does a single release take to get through the Pipeline?
  • Automation Percentage: Percentage of automated tasks in completed releases during the selected time period.
• Releases per month
  • Number of releases completed per month.
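These three metrics can be computed from a list of completed releases. The sketch below is not the XL Release API; the `Release` record, its fields, and the sample data are hypothetical, and pooling tasks across releases is only one plausible reading of "automation percentage".

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime
from statistics import mean
from typing import Dict, List


@dataclass
class Release:
    started: datetime
    completed: datetime
    automated_tasks: int
    total_tasks: int


def average_duration_hours(releases: List[Release]) -> float:
    """Mean wall-clock time from start to completion, in hours."""
    return mean([(r.completed - r.started).total_seconds() / 3600 for r in releases])


def automation_pct(releases: List[Release]) -> float:
    """Automated tasks as a percentage of all tasks, pooled across releases."""
    total = sum(r.total_tasks for r in releases)
    return 100.0 * sum(r.automated_tasks for r in releases) / total if total else 0.0


def releases_per_month(releases: List[Release]) -> Dict[str, int]:
    """Count of completed releases keyed by 'YYYY-MM' of completion."""
    return dict(Counter(r.completed.strftime("%Y-%m") for r in releases))


# Made-up sample data.
sample = [
    Release(datetime(2017, 3, 30, 9), datetime(2017, 3, 31, 9), 18, 20),
    Release(datetime(2017, 4, 3, 8), datetime(2017, 4, 3, 20), 19, 20),
    Release(datetime(2017, 4, 10, 8), datetime(2017, 4, 11, 20), 20, 20),
]
print(average_duration_hours(sample))  # 24.0
print(automation_pct(sample))          # 95.0
print(releases_per_month(sample))      # {'2017-03': 1, '2017-04': 2}
```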
Future: Service Maturity Dashboard
Threat Level (Probability vs. Impact)
Probability | Impact: Low |        | Impact: High
High        | Medium      | High   | Critical
            | Low         | Medium | High
            | Low         | Low    | Medium
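The probability/impact grid above can be read as a lookup table. The sketch below assumes the rows run from high probability (top) to low probability (bottom) and the columns from low impact (left) to high impact (right); the name "medium" for the unlabeled middle row and column is an assumption, as is the function itself.

```python
# Assumed axis levels; "medium" is a guessed name for the unlabeled middle.
LEVELS = ("low", "medium", "high")

# Rows keyed by probability; each tuple is ordered by impact (low, medium, high).
THREAT_MATRIX = {
    "high": ("Medium", "High", "Critical"),
    "medium": ("Low", "Medium", "High"),
    "low": ("Low", "Low", "Medium"),
}


def threat_level(probability: str, impact: str) -> str:
    """Look up the threat level for a probability/impact pair."""
    if probability not in THREAT_MATRIX or impact not in LEVELS:
        raise ValueError(f"levels must be one of {LEVELS}")
    return THREAT_MATRIX[probability][LEVELS.index(impact)]


print(threat_level("high", "high"))  # Critical
print(threat_level("low", "high"))   # Medium
```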
Column headers: Mission Control Operators; Control of Satellites; Product Ordering; Product Production; Bare Metal; Service Interdependencies; P800; Dependent Infrastructure; Feature Toggles; HA; Risk Score
Service 1 0 0 0 0 0 0 0 0 0 0.0
Service 2 2 3 1 2 3 2 3 0 5 3.5
Service 3 4 5 3 4 5 4 5 5 0 5.8
Service 4 6 7 5 6 7 6 7 0 5 8.2
Service 5 8 9 7 8 9 8 9 0 0 9.7
Column groups: Probability, Impact, Mitigation
The Pipeline will gather statistics to drive a Release Score