Chef at Etsy
@jonlives
Jon Cowie
Sr Operations Engineer
30 Million Members
1 Million Active Shops
20 Million Items Listed
60 Million Monthly Unique Visitors
We Love Chef!
Absorb what is useful.
Discard what is useless.
“I am not smart enough to build an ontology … that
can encompass all the variations in infrastructure.
Nobody is, the world moves too fast.”
There is no magic pill.
You are the expert.
Chef at Etsy
• Chef Server 11.1.4
• ~2000 Nodes
• CentOS, some Mac OS X
Beginning of 2010 vs. Today
Chef at Etsy
Evolution of Chef
2010: The Beginning
• ~250 Nodes (Ubuntu & CentOS)
• The first cookbooks
• Out of the box workflow
2011: Growth
• ~400 Nodes (CentOS)
• Chef still pretty specialised knowledge
• Handlers added
2012: A big year
• ~800 Nodes (CentOS & Mac OS X)
• More in-house Chef expertise
• Workflow tooling
• Debugging tooling
• Monitoring
2013: Chef at Etsy
• ~1500 Nodes
• Workflow tooling enhancements
• Feature flags in Chef
• Chef performance - Chef 11 upgrade
2014: Chef at Etsy
• ~2000 Nodes
• Consolidation
• CI with Chef
• Omnibus
• Work-in-Progress tooling
Patterns & Workflows
Cookbook Workflow
$> review -r jcowie --cc ops
knife-spork
• https://github.com/jonlives/knife-spork
• Workflow tool
• Helps multiple chefs avoid clashing
• Visibility into changes
• Plugins
knife-spork
• knife spork bump
• knife spork upload
• Test change
Test Change
• https://github.com/jonlives/knife-flip
• knife node flip foo.etsy.com testing
• knife role flip MyRole testing
Test Change
• https://github.com/mrtazz/knife-wip
• Uses node tags
<irccat> CHEF: bburry started work cent7 package bugfixing on deploy01.ny5.etsy.com
knife-spork
• knife spork bump
• knife spork upload
• Test change
• knife spork promote --remote
• git commit and push
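The bump and promote steps above can be sketched in plain Ruby. This is a minimal illustration, not knife-spork's code: it assumes a patch-level bump of an x.y.z version string and an exact-version pin written into an environment's cookbook_versions. The helper names, the cookbook name "apache", and the version numbers are hypothetical.

```ruby
# Illustrative only: mimics the effect of `knife spork bump` and
# `knife spork promote` on plain Ruby data. It does not call knife or Chef.

# Bump the patch component of an x.y.z version string.
def bump_patch(version)
  major, minor, patch = version.split(".").map { |part| Integer(part, 10) }
  [major, minor, patch + 1].join(".")
end

# Pin a cookbook to an exact version in an environment hash (hypothetical helper).
def promote(environment, cookbook, version)
  pinned = environment.fetch("cookbook_versions", {}).merge(cookbook => "= #{version}")
  environment.merge("cookbook_versions" => pinned)
end

# Hypothetical starting state: production pins apache at 1.2.3.
production = { "name" => "production", "cookbook_versions" => { "apache" => "= 1.2.3" } }

new_version = bump_patch("1.2.3")
production = promote(production, "apache", new_version)
puts production["cookbook_versions"]["apache"]
```

Keeping the pin as an exact constraint means a node only picks up the new version once the environment itself has been promoted.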
Monitoring & Debugging
knife-spork & CI Job
<irccat> CHEF: Jon Cowie uploaded [email protected]
<irccat> CHEF: Jon Cowie promoted [email protected] to production
<snip>
<irccat> Git PUSH -> Sysops/chef
<snip>
<Jenkins> Starting build #5649 for job chef-server-git-sync
<Jenkins> Project chef-server-git-sync build #5649: SUCCESS in 2 min 36 sec: http://ci.etsycorp.com/job/chef-server-git-sync/5649/
IRC Handler
<irccat> Chef run failed on officebackup01.office.etsy.com gist failed, see /var/log/chef/client.log on the host
<irccat> Still Failing on dbnest01.ny4.etsy.com since 2 days ago https://github.etsycorp.com/gist/656d8914fbef5a6bd9aa
Lastrun Data
• https://github.com/jgoulah/knife-lastrun
• knife node lastrun foo.bar.com
Lastrun Data
% knife node lastrun dbnest01.ny4.etsy.com
Status        failed
Elapsed Time  29.055892
Start Time    2014-10-06 12:54:51 +0000
End Time      2014-10-06 12:55:20 +0000
<snip>
Exception
<snip>
Installed package backupd-1.4-1.365657d.el5.centos is newer than candidate package backupd-1.2-1.99ddb8e.el5
Dashboards
Monitoring & Debugging
• https://github.com/etsy/chef-handlers
• https://github.com/etsy/dashboard
• https://github.com/jgoulah/knife-lastrun
• https://github.com/bmarini/knife-inspect
Feature Flags
Downsides of Existing Approach
• Holding cookbook in testing is blocking
• Accidental promotions
• Testing env affects all cookbooks
• “Upgrade” envs often used
• How to make it more “Etsy”?
chef-whitelist
• https://github.com/etsy/chef-whitelist
• Databag driven
• Cookbook library
• Feature flags!
chef-whitelist
{
  "id": "php-5-5-17",
  "patterns": [
    "statsd*.ny5.etsy.com",
    "deploy*.ny5.etsy.com",
    <snip>
  ]
}
chef-whitelist
if node.is_in_whitelist? "php-5-5-17"
  package "php-pecl-opcache" do
    action :remove
  end
end
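One way a check like is_in_whitelist? could work is glob matching of the node's hostname against the patterns in the data bag item. The sketch below is an assumption about the mechanism, not the chef-whitelist library's implementation: the in_whitelist? helper is hypothetical, and the host web0001.ny5.etsy.com is made up for the negative case.

```ruby
# Hypothetical sketch: glob-style hostname matching against a whitelist entry.
# The real chef-whitelist library may match differently.
def in_whitelist?(fqdn, entry)
  entry.fetch("patterns", []).any? { |pattern| File.fnmatch?(pattern, fqdn) }
end

# Mirrors the data bag item shown above, minus the snipped patterns.
entry = {
  "id" => "php-5-5-17",
  "patterns" => ["statsd*.ny5.etsy.com", "deploy*.ny5.etsy.com"]
}

puts in_whitelist?("deploy01.ny5.etsy.com", entry) # true
puts in_whitelist?("web0001.ny5.etsy.com", entry)  # false (hypothetical host)
```

Because the flag lives in a data bag, widening a rollout is a data change (adding a pattern) rather than a cookbook version change.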
Configuration Data
Keep cookbooks:
• Simple
• Modular
• Scalable
• Maintainable
Environments
• Cookbook version constraints
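To illustrate how per-environment version constraints decide which cookbook version a node receives, here is a rough sketch. It uses RubyGems' Gem::Requirement as a stand-in for Chef's own constraint handling, and the environment contents, cookbook name, and version numbers are all hypothetical.

```ruby
# Sketch using RubyGems' requirement classes as a stand-in for Chef's own
# constraint handling. Environment and cookbook data are hypothetical.
require "rubygems"

ENVIRONMENTS = {
  "production" => { "apache" => "= 1.2.3" },
  "testing"    => { "apache" => "= 1.2.4" }
}.freeze

# Return the highest available version that satisfies the environment's
# constraint for the cookbook, or nil if none does.
def resolve(environment, cookbook, available_versions)
  constraint = Gem::Requirement.new(ENVIRONMENTS.fetch(environment).fetch(cookbook, ">= 0"))
  matching = available_versions.map { |v| Gem::Version.new(v) }
                               .select { |v| constraint.satisfied_by?(v) }
  best = matching.max
  best && best.to_s
end

available = ["1.2.2", "1.2.3", "1.2.4"]
puts resolve("production", "apache", available) # 1.2.3
puts resolve("testing", "apache", available)    # 1.2.4
```

Moving a node between environments therefore changes which version it converges on without touching the cookbook itself.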
Roles
• Group-level config
  • Syslog-ng
  • Iptables
  • Sudoers
Roles - iptables
"firewall": {
  "ports": {
    "11211": {
      "subnet_group": "prod_subnets"
    },
    <snip>
  }
}
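A recipe consuming these role attributes might expand each port and subnet group into concrete rules. The following is a sketch under stated assumptions: the subnet values, the TCP protocol, the rule format, and the firewall_rules helper are all hypothetical and not taken from Etsy's cookbooks.

```ruby
# Hypothetical sketch: expand the "firewall" role attributes into iptables
# rule strings. Subnet values, protocol, and rule format are assumptions.
SUBNET_GROUPS = {
  "prod_subnets" => ["10.0.1.0/24", "10.0.2.0/24"]
}.freeze

# Build one ACCEPT rule per (port, subnet) pair.
def firewall_rules(firewall, subnet_groups)
  firewall.fetch("ports", {}).flat_map do |port, options|
    subnet_groups.fetch(options.fetch("subnet_group")).map do |subnet|
      "-A INPUT -p tcp -s #{subnet} --dport #{port} -j ACCEPT"
    end
  end
end

# Mirrors the role attributes shown above, minus the snipped ports.
attributes = {
  "firewall" => { "ports" => { "11211" => { "subnet_group" => "prod_subnets" } } }
}

puts firewall_rules(attributes["firewall"], SUBNET_GROUPS)
```

Keeping only ports and group names in the role leaves the cookbook generic, with the site-specific data supplied from outside it.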
Roles - Syslog-ng
"syslog": {
  "web": {
    "web_apache_access_log": {
      "source": "/var/log/httpd/access_log",
      "source_program_override": "APACHEACCESS: ",
      "destination": "/data/syslog/current/web/access.log",
      "destination_filters": [
        "host('^(web0|dlweb)')",
        "match('APACHEACCESS')"
      ]
    }
  }
}
Data Bags
• Global / Datacenter specific Config
  • Ganglia
  • Cobbler
  • VOIP
• Data Storage
Data Bags - Ganglia
{
  "id": "config_se5",
  "grid_name": "EtsySE5",
  "authority": "http://gangliase5.etsycorp.com",
  "trusted_hosts": <snip>,
  "groups": {
    "Utilities": "239.2.11.71",
    <snip>
  }
  <snip>
}
Data Bags - Cobbler
{
  "id": "config_corp",
  "cobbler_server": "corpking02.corp.etsy.com",
  "dns_servers": [ "10.x.x.x", "10.x.x.x" ],
  "dhcp_ranges": {
    "10.100.x.0": {
      "routers": "10.x.x.1",
      "mask": "255.255.255.0",
      "range": "10.x.x.11 10.x.x.250"
    }
  }
}
Write cookbooks you’ll thank yourself for.
http://jonliv.es/book
Discount Code: AUTHD
40% off Print, 50% off Digital