ScaleOut your team - Building a technology team for scale in a DevOps culture



Page 1: ScaleOut your team - Building a technology team for scale in a DevOps culture
Page 2: ScaleOut your team - Building a technology team for scale in a DevOps culture

Scale Out your team: Building your technology team for scale

Shai Peretz, SVP Platform Engineering & Operations

Page 3: ScaleOut your team - Building a technology team for scale in a DevOps culture

Provide people with the most interesting, relevant and trusted content

Audience First.

Our Lighthouse

Page 4: ScaleOut your team - Building a technology team for scale in a DevOps culture

Widget Examples

Page 5: ScaleOut your team - Building a technology team for scale in a DevOps culture
Page 6: ScaleOut your team - Building a technology team for scale in a DevOps culture
Page 7: ScaleOut your team - Building a technology team for scale in a DevOps culture
Page 8: ScaleOut your team - Building a technology team for scale in a DevOps culture

- Traffic: >25 billion page views per month, >8 billion recommendations per day
- Reach: >550M users globally
- Data: multiple petabytes (distributed)
- Servers: >4,000 physical nodes
- Monitoring: >4M metrics per minute
- Team: ~130 engineers (Dev + Ops)
- High growth rate

Scaling Vectors

Page 9: ScaleOut your team - Building a technology team for scale in a DevOps culture

- We design and build our own data centers (optimization, cost)
- Colocation (fewer clouds on the horizon :)
- Active/Active approach
- Rely on external services when needed (DNS, CDN)

Operational Decisions

Page 10: ScaleOut your team - Building a technology team for scale in a DevOps culture

- No SPOF (n+x)
- Vendor diversity
- Flexible architecture
- Commodity hardware
- Scale out – no central devices
- Open source

Design guidelines
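
The "No SPOF (n+x)" guideline can be illustrated with simple arithmetic: size a pool for peak load (n) and add x spare nodes, so that up to x failures leave enough capacity. The sketch below is only an illustration; the function names and the load figures are assumptions, not numbers from the talk.

```python
import math


def nodes_required(peak_load: float, per_node_capacity: float, spare: int) -> int:
    """Return n + x: nodes needed for peak load plus `spare` extra nodes."""
    if per_node_capacity <= 0:
        raise ValueError("per_node_capacity must be positive")
    if spare < 0:
        raise ValueError("spare must not be negative")
    n = math.ceil(peak_load / per_node_capacity)
    return n + spare


def survives(total_nodes: int, failed: int, peak_load: float,
             per_node_capacity: float) -> bool:
    """True if the remaining nodes can still carry the peak load."""
    return (total_nodes - failed) * per_node_capacity >= peak_load


# Hypothetical pool: 1000 requests/s at peak, 150 requests/s per node, x = 2.
pool = nodes_required(1000, 150, spare=2)
print(pool, survives(pool, 2, 1000, 150), survives(pool, 3, 1000, 150))
```

With x = 2 the pool tolerates two simultaneous node failures at peak load but not three.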

Page 11: ScaleOut your team - Building a technology team for scale in a DevOps culture

- Automate using Chef
- Configuration as code (Source control)
- Log changes automatically

Configuration Management
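
As a rough illustration of "log changes automatically", the sketch below compares two versions of a configuration and produces one log line per change. It is plain Python rather than Chef, and the configuration keys and the `diff_config` helper are hypothetical.

```python
def diff_config(old: dict, new: dict) -> list[str]:
    """Describe every added, removed, or changed key between two configs."""
    changes = []
    for key in sorted(set(old) | set(new)):
        if key not in old:
            changes.append(f"added {key}={new[key]!r}")
        elif key not in new:
            changes.append(f"removed {key}")
        elif old[key] != new[key]:
            changes.append(f"changed {key}: {old[key]!r} -> {new[key]!r}")
    return changes


# Hypothetical before/after state of one service's configuration.
before = {"workers": 4, "port": 80}
after = {"workers": 8, "tls": True}
for line in diff_config(before, after):
    print(line)
```

When the configuration lives in source control, a hook that runs a comparison like this on every commit gives an automatic change log without relying on anyone to write one by hand.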

Page 12: ScaleOut your team - Building a technology team for scale in a DevOps culture

- Architecture – tolerance
- Service owner responsibility
- War rooms Ops + Dev
- Production Party
- Open communication with business

Disaster Recovery

Page 13: ScaleOut your team - Building a technology team for scale in a DevOps culture

Ops:
- Facilities
- Network and Infra
- Visibility
- Data systems
- Production Engineering

Platform Group Structure

Engineering:
- Data delivery & processing
- App Infrastructure
- Build/Dev tools
- Ops tools

Page 14: ScaleOut your team - Building a technology team for scale in a DevOps culture

- Skilled Ops engineers
- Sit with product development teams
- Product/Business KPIs
- What? – from Product; How? – from Ops
- PE team lead – sync, training, implementations
- Two-way communication

Production Engineering

Page 15: ScaleOut your team - Building a technology team for scale in a DevOps culture

- Very short release cycles (>100 per day)
- Micro services
- Easy to find issues (fix or rollback)
- Automated deployment process
- Testing & monitoring
- Work procedures and culture

Continuous Deployment
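
The "fix or rollback" step of an automated deployment can be sketched as follows. This is a minimal illustration, not the process described in the talk: `activate` and `healthy` are hypothetical callables standing in for whatever deployment tool and health check a team actually uses.

```python
from typing import Callable


def deploy_with_rollback(
    current_version: str,
    new_version: str,
    activate: Callable[[str], None],
    healthy: Callable[[], bool],
) -> str:
    """Activate new_version; if the health check fails, reactivate the old one.

    Returns the version that is live when the function finishes.
    """
    activate(new_version)
    if healthy():
        return new_version
    # The new release failed its check: roll back to the known-good version.
    activate(current_version)
    return current_version


# Usage with stand-in callables: the health check fails, so v1 stays live.
activations: list[str] = []
live = deploy_with_rollback("v1", "v2", activations.append, lambda: False)
print(live, activations)
```

Small releases make this cheap: each deployment changes little, so a failed check points at a narrow set of changes and rolling back loses little work.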

Page 16: ScaleOut your team - Building a technology team for scale in a DevOps culture

Continuous Deployment

Ownership & Trust

Product Developers own their Services

Platform owns Infrastructure, Hardware & Network services

Page 17: ScaleOut your team - Building a technology team for scale in a DevOps culture

- Ownership
- Trust
- Good communication
- Learning

Values

Page 18: ScaleOut your team - Building a technology team for scale in a DevOps culture

- Face to Face
- Sync and Share
- HipChat – always on
- Open channels

Communication

Page 19: ScaleOut your team - Building a technology team for scale in a DevOps culture

- Prevention (Anomaly detection, trends)
- MTTD, MTTR, MTTS

Technology will eventually fail; we promise to fix it ASAP!

Stability Goals
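
Two of these goals, MTTD and MTTR, can be computed from incident timestamps. The sketch below assumes MTTD is measured from the start of an incident to its detection and MTTR from detection to resolution; definitions vary between teams, and the `Incident` record and the sample times are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Iterable


@dataclass
class Incident:
    started: datetime   # when the problem began
    detected: datetime  # when monitoring or a person noticed it
    resolved: datetime  # when service was restored


def _mean(deltas: Iterable[timedelta]) -> timedelta:
    items = list(deltas)
    if not items:
        raise ValueError("at least one incident is required")
    return sum(items, timedelta()) / len(items)


def mttd(incidents: list[Incident]) -> timedelta:
    """Mean time to detect: average of (detected - started)."""
    return _mean(i.detected - i.started for i in incidents)


def mttr(incidents: list[Incident]) -> timedelta:
    """Mean time to recover: average of (resolved - detected)."""
    return _mean(i.resolved - i.detected for i in incidents)


# Hypothetical incident history.
history = [
    Incident(datetime(2015, 1, 5, 10, 0), datetime(2015, 1, 5, 10, 4),
             datetime(2015, 1, 5, 10, 34)),
    Incident(datetime(2015, 1, 9, 12, 0), datetime(2015, 1, 9, 12, 2),
             datetime(2015, 1, 9, 12, 12)),
]
print(mttd(history), mttr(history))
```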

Page 20: ScaleOut your team - Building a technology team for scale in a DevOps culture

- Graph everything
- Self serve
- Combination of internal and external tools:
  Collectd/Graphite/Nagios
  Logstash/Elasticsearch/Kibana
  New Relic/Boundary
  Keynote/Pingdom/Catchpoint
- Dashboards – Graphitus/Grafana
- Escalation of critical alerts via PagerDuty

Visibility
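
"Graph everything" works best when emitting a metric is trivial. As one possible sketch, Graphite's plaintext protocol accepts lines of the form `<path> <value> <timestamp>` over TCP, by default on port 2003. The metric name, host, and helper functions below are assumptions for illustration and are not taken from the talk.

```python
import socket
import time
from typing import Optional


def format_metric(path: str, value: float, timestamp: Optional[int] = None) -> str:
    """Build one line of Graphite's plaintext protocol."""
    if timestamp is None:
        timestamp = int(time.time())
    return f"{path} {value} {int(timestamp)}\n"


def send_metric(line: str, host: str = "localhost", port: int = 2003,
                timeout: float = 2.0) -> None:
    """Send a preformatted metric line to a Graphite (carbon) listener."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(line.encode("ascii"))


# Formatting needs no server; send_metric() requires a reachable carbon listener.
print(format_metric("web.requests.count", 42, 1700000000), end="")
```

Keeping formatting separate from sending makes the helper easy to unit test and lets services batch several lines into one connection.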

Page 21: ScaleOut your team - Building a technology team for scale in a DevOps culture

Prevention

Immune system

Unit tests (10k every 10 min)

Integration and Regression

Self tests

Monitoring system

Alerts

Page 22: ScaleOut your team - Building a technology team for scale in a DevOps culture

Keys to success:
- Self serve
- Eliminate false alarms (Signal to noise ratio)
- Automatic full coverage

Immune System
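
One common way to cut false alarms is to alert only when a threshold is breached several checks in a row, so that a single spike does not page anyone. This is a generic technique offered as a sketch, not necessarily the approach used by the team in the talk; the threshold and sample values are made up.

```python
from typing import Iterable


def should_alert(samples: Iterable[float], threshold: float,
                 consecutive: int = 3) -> bool:
    """True if `consecutive` samples in a row exceed `threshold`."""
    run = 0
    for value in samples:
        run = run + 1 if value > threshold else 0
        if run >= consecutive:
            return True
    return False


# Two isolated spikes stay quiet; a sustained breach raises an alert.
print(should_alert([10, 95, 20, 96, 30], threshold=90))
print(should_alert([10, 95, 96, 97, 30], threshold=90))
```

Raising `consecutive` trades detection speed for fewer false alarms, which is the signal-to-noise balance the slide refers to.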

Page 23: ScaleOut your team - Building a technology team for scale in a DevOps culture

To NOC or not to NOC?

Mean Time To Detect

Page 24: ScaleOut your team - Building a technology team for scale in a DevOps culture

- Ops on shift
- Engineer on call
- Escalation policy on PagerDuty (PD)
- Manage on HipChat/Phone

Mean Time To Recover
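
An escalation policy can be modeled as a list of tiers, each notified once an alert has gone unacknowledged for a given time. The sketch below is plain Python and does not use the PagerDuty API; the tier names and delays are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    after_minutes: int  # notify once the alert is unacknowledged this long
    contact: str


# Hypothetical policy: ops on shift first, then the on-call engineer, then a lead.
POLICY = [
    Tier(0, "ops-on-shift"),
    Tier(15, "engineer-on-call"),
    Tier(30, "team-lead"),
]


def contacts_to_notify(policy: list[Tier], minutes_unacknowledged: int) -> list[str]:
    """Return every contact whose tier has been reached, in escalation order."""
    ordered = sorted(policy, key=lambda tier: tier.after_minutes)
    return [t.contact for t in ordered if t.after_minutes <= minutes_unacknowledged]


print(contacts_to_notify(POLICY, 20))
```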

Page 25: ScaleOut your team - Building a technology team for scale in a DevOps culture

- Escalate only critical issues
- Measure time to resolve
- Blameless learning from events (Take-Ins)
- Respect your team’s sleep!

Mean Time To Sleep

Page 26: ScaleOut your team - Building a technology team for scale in a DevOps culture

- After each event
- Blameless
- Action Items
- Publish
- Follow up

Take-Ins

Page 27: ScaleOut your team - Building a technology team for scale in a DevOps culture

- Order 2-3 times a year
- Load testing + Prediction
- Elasticity for engineering
- Automatic provisioning

Capacity planning
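
The "prediction" part of capacity planning can be as simple as fitting a straight line to recent usage and estimating when it crosses installed capacity. The sketch below uses an ordinary least-squares fit over equally spaced samples; the usage figures and the linear-growth assumption are illustrative only, and real growth may not be linear.

```python
import math
from typing import Optional


def fit_line(ys: list[float]) -> tuple[float, float]:
    """Least-squares slope and intercept for samples taken at x = 0, 1, 2, ..."""
    n = len(ys)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_x = (n - 1) / 2
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def periods_until_full(ys: list[float], capacity: float) -> Optional[int]:
    """Whole periods after the last sample until the trend reaches capacity.

    Returns None when usage is flat or shrinking.
    """
    slope, intercept = fit_line(ys)
    if slope <= 0:
        return None
    x_full = (capacity - intercept) / slope
    return max(0, math.ceil(x_full - (len(ys) - 1)))


# Hypothetical monthly peak usage, in the same unit as capacity.
print(periods_until_full([100, 110, 120, 130], capacity=160))
```

An estimate like this gives the lead time available before the next hardware order, which matters when orders are placed only a few times a year.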

Page 28: ScaleOut your team - Building a technology team for scale in a DevOps culture

- Weekly tech talks
- IL TechTalks (+ techtalk week)
- Reversim podcast and summit
- Internal/External Lectures
- Sunday School

Learning

Do you guys ever work??

Page 29: ScaleOut your team - Building a technology team for scale in a DevOps culture

Two weeks dedicated to the needs of the technical teams

Quality Time

Page 30: ScaleOut your team - Building a technology team for scale in a DevOps culture

Thank You


Shai Peretz, SVP Platform Engineering & Operations

And Yes, we are hiring…