Spotify: From 1 to 100 Hadoop developers

From 1 to 100 developers Scaling for developer productivity at Spotify @dawhiting HUG UK @ Strata 11/11/2013

description

How Spotify scaled its Hadoop cluster and the people working on it: from 1 to over 100 developers, and from 1 node to over 690 nodes, making it the largest Hadoop cluster in Europe.

Transcript of Spotify: From 1 to 100 Hadoop developers

Page 1: Spotify: From 1 to 100 Hadoop developers

From 1 to 100 developers Scaling for developer productivity at Spotify

@dawhiting

HUG UK @ Strata 11/11/2013

Page 2: Spotify: From 1 to 100 Hadoop developers

How do I scale?

How many nodes?
How much data?
How many records?


Page 3: Spotify: From 1 to 100 Hadoop developers

How do I scale my development?

How many developers?
How many teams?
How many Hadoop jobs?
How much code?


Data Infrastructure - July 2013

Page 4: Spotify: From 1 to 100 Hadoop developers


A brief history of Hadoop development at Spotify

2008 - Spotify launches in Sweden

2009 - First Hadoop cluster for royalties, 2 developers

2010 - Up to 37 nodes, BI team formed, 3 devs/3 analysts

2011 - Move to Elastic MapReduce

2012 - Back to own cluster, 60 -> 190 nodes, Infrastructure/Insights/Tools team split

2013 - 6 teams just for data infrastructure, ~100 developers using Hadoop cluster.

Page 5: Spotify: From 1 to 100 Hadoop developers

Issues

What could possibly go wrong?
• Contention for resources
• Repetition of code, repetition of data
• Poor code quality / technical debt
• Disorganised HDFS
• Data cataloguing


Page 6: Spotify: From 1 to 100 Hadoop developers


Contention for resources

Priority and isolation
• What is important?

Hadoop scheduler
• Capacity scheduler
• Queue isolation

YARN
• Resource allocation

Page 7: Spotify: From 1 to 100 Hadoop developers

Don’t Repeat Yourself

Refactor data, not just code
• Make popular data available pre-joined
• Analyse code to find jobs with the same dependencies

Work at a higher level
• MapReduce out, (S)Crunch in
• Allow substitution of operations for cached data
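
One way to read "analyse code to find jobs with the same dependencies" is to group jobs by the set of datasets they read. The following is a minimal sketch in Python, assuming each job declares its inputs somewhere machine-readable; the function, job names, and dataset names are hypothetical and not taken from Spotify's code.

```python
from collections import defaultdict


def jobs_sharing_inputs(job_inputs):
    """Group job names by their exact set of input datasets.

    Only groups with more than one job are returned: those jobs read the
    same data and are candidates for a shared, pre-joined dataset.
    """
    groups = defaultdict(list)
    for job, inputs in job_inputs.items():
        groups[frozenset(inputs)].append(job)
    return {deps: sorted(jobs) for deps, jobs in groups.items() if len(jobs) > 1}


# Hypothetical job -> input dataset declarations (illustration only).
JOB_INPUTS = {
    "top_tracks": ["streams", "track_metadata"],
    "royalties": ["track_metadata", "streams"],
    "signups": ["users"],
}

shared = jobs_sharing_inputs(JOB_INPUTS)
```

Here `shared` holds a single group, keyed by the two shared inputs, which suggests joining those inputs once and publishing the result rather than repeating the join in each job.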


Page 8: Spotify: From 1 to 100 Hadoop developers

Code Quality & Technical Debt

Stable platform
• Python -> JVM

Abolish custom infrastructure
• Off-the-shelf is often good enough
• E.g. Sqoop, Kafka, ...

Testing
• Make testing easier than running
• Enforced testing


Page 9: Spotify: From 1 to 100 Hadoop developers

HDFS

Retention policy
• Automatic deletion of old intermediate data
• Opt-out, not opt-in

Establish convention
• Can you correctly guess the path to the data you need?

Enforce structure
• Path literals are a code smell
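
The points about convention, path literals, and opt-out retention could be sketched as a single path-building helper plus an expiry check. This is an illustration under stated assumptions: the `/data` root, the `dataset/YYYY/MM/DD` layout, and the 30-day default are hypothetical and are not Spotify's actual HDFS conventions.

```python
from datetime import date, timedelta

HDFS_ROOT = "/data"  # hypothetical root, not Spotify's actual layout


def dataset_path(dataset, day, root=HDFS_ROOT):
    """Build the conventional path for one daily partition of a dataset.

    Jobs call this instead of embedding path literals, so the layout is
    defined in exactly one place and paths stay guessable.
    """
    return f"{root}/{dataset}/{day:%Y/%m/%d}"


def is_expired(day, today, retention_days=30, keep=False):
    """Opt-out retention: a partition expires unless explicitly kept."""
    if keep:
        return False
    return today - day > timedelta(days=retention_days)
```

For example, `dataset_path("streams", date(2013, 11, 11))` yields `/data/streams/2013/11/11`, and a partition older than the retention window is reported as expired unless its owner has set `keep=True`.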


Page 10: Spotify: From 1 to 100 Hadoop developers

Data Library

Core datasets
• Identify
• Catalogue
• Document
• Monitor

Data library as code library
• Easy to use
• Synced with release cycles
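
A "data library as code library" might look like a small module that jobs import to find core datasets by name. The sketch below is an assumption about what such a library could contain; the `Dataset` fields, the catalogue entries, and the `lookup` helper are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dataset:
    """One catalogued core dataset (all fields are illustrative)."""
    name: str
    owner: str
    description: str
    path: str


# Hypothetical catalogue; shipping it as code means it is versioned and
# released together with the jobs that depend on it.
CATALOGUE = {
    d.name: d
    for d in [
        Dataset("streams", "data-infrastructure", "One record per stream", "/data/streams"),
        Dataset("users", "data-infrastructure", "One record per account", "/data/users"),
    ]
}


def lookup(name):
    """Return the catalogued dataset, failing loudly for unknown names."""
    try:
        return CATALOGUE[name]
    except KeyError:
        raise KeyError(f"unknown dataset {name!r}; add it to the catalogue first") from None
```

A job would then call `lookup("streams").path` rather than hard-coding a location, and an unknown name fails at lookup time instead of silently reading the wrong directory.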


Page 11: Spotify: From 1 to 100 Hadoop developers

You can have it easier than us

Act now
• Big Data technical debt is worse than normal technical debt
• Rewriting 10 jobs is easier than rewriting 300

Plan to decentralise
• At some point it won’t be enough to trust your developers
• You won’t be able to review every job forever

Make it simpler to do things the right way
• Example: build tools


Page 12: Spotify: From 1 to 100 Hadoop developers

Want to join the band?

We’re hiring for Stockholm and NYC

Check out http://www.spotify.com/jobs for more information.