SCALING INSTAGRAM INFRA
![Page 2: SCALING INSTAGRAM INFRA › system › files › presentation-slides › qcon20… · RabbitMQ PostgreSQL Cassandra Celery Other Services Django. STORAGE VS. COMPUTING • Storage:](https://reader035.fdocuments.in/reader035/viewer/2022070819/5f1aedff73924575bc4f3307/html5/thumbnails/2.jpg)
![Page 3: SCALING INSTAGRAM INFRA › system › files › presentation-slides › qcon20… · RabbitMQ PostgreSQL Cassandra Celery Other Services Django. STORAGE VS. COMPUTING • Storage:](https://reader035.fdocuments.in/reader035/viewer/2022070819/5f1aedff73924575bc4f3307/html5/thumbnails/3.jpg)
2017
INSTAGRAM HISTORY
2010
2012/4/9 joined
2014/1
600M users/month
![Page 4: SCALING INSTAGRAM INFRA › system › files › presentation-slides › qcon20… · RabbitMQ PostgreSQL Cassandra Celery Other Services Django. STORAGE VS. COMPUTING • Storage:](https://reader035.fdocuments.in/reader035/viewer/2022070819/5f1aedff73924575bc4f3307/html5/thumbnails/4.jpg)
INSTAGRAM EVERY DAY
400 Million Users
4+ Billion likes
100 Million photo/video uploads
Top account: 110 Million followers
![Page 5: SCALING INSTAGRAM INFRA › system › files › presentation-slides › qcon20… · RabbitMQ PostgreSQL Cassandra Celery Other Services Django. STORAGE VS. COMPUTING • Storage:](https://reader035.fdocuments.in/reader035/viewer/2022070819/5f1aedff73924575bc4f3307/html5/thumbnails/5.jpg)
![Page 6: SCALING INSTAGRAM INFRA › system › files › presentation-slides › qcon20… · RabbitMQ PostgreSQL Cassandra Celery Other Services Django. STORAGE VS. COMPUTING • Storage:](https://reader035.fdocuments.in/reader035/viewer/2022070819/5f1aedff73924575bc4f3307/html5/thumbnails/6.jpg)
SCALING MEANS
Scale out
Scale up
Scale dev team
![Page 7: SCALING INSTAGRAM INFRA › system › files › presentation-slides › qcon20… · RabbitMQ PostgreSQL Cassandra Celery Other Services Django. STORAGE VS. COMPUTING • Storage:](https://reader035.fdocuments.in/reader035/viewer/2022070819/5f1aedff73924575bc4f3307/html5/thumbnails/7.jpg)
SCALE OUT
![Page 8: SCALING INSTAGRAM INFRA › system › files › presentation-slides › qcon20… · RabbitMQ PostgreSQL Cassandra Celery Other Services Django. STORAGE VS. COMPUTING • Storage:](https://reader035.fdocuments.in/reader035/viewer/2022070819/5f1aedff73924575bc4f3307/html5/thumbnails/8.jpg)
SCALE OUT
![Page 9: SCALING INSTAGRAM INFRA › system › files › presentation-slides › qcon20… · RabbitMQ PostgreSQL Cassandra Celery Other Services Django. STORAGE VS. COMPUTING • Storage:](https://reader035.fdocuments.in/reader035/viewer/2022070819/5f1aedff73924575bc4f3307/html5/thumbnails/9.jpg)
![Page 10: SCALING INSTAGRAM INFRA › system › files › presentation-slides › qcon20… · RabbitMQ PostgreSQL Cassandra Celery Other Services Django. STORAGE VS. COMPUTING • Storage:](https://reader035.fdocuments.in/reader035/viewer/2022070819/5f1aedff73924575bc4f3307/html5/thumbnails/10.jpg)
SCALE OUT
![Page 11: SCALING INSTAGRAM INFRA › system › files › presentation-slides › qcon20… · RabbitMQ PostgreSQL Cassandra Celery Other Services Django. STORAGE VS. COMPUTING • Storage:](https://reader035.fdocuments.in/reader035/viewer/2022070819/5f1aedff73924575bc4f3307/html5/thumbnails/11.jpg)
![Page 12: SCALING INSTAGRAM INFRA › system › files › presentation-slides › qcon20… · RabbitMQ PostgreSQL Cassandra Celery Other Services Django. STORAGE VS. COMPUTING • Storage:](https://reader035.fdocuments.in/reader035/viewer/2022070819/5f1aedff73924575bc4f3307/html5/thumbnails/12.jpg)
![Page 13: SCALING INSTAGRAM INFRA › system › files › presentation-slides › qcon20… · RabbitMQ PostgreSQL Cassandra Celery Other Services Django. STORAGE VS. COMPUTING • Storage:](https://reader035.fdocuments.in/reader035/viewer/2022070819/5f1aedff73924575bc4f3307/html5/thumbnails/13.jpg)
“Let’s all pray that Amazon gets everything sorted out in short order.”
![Page 14: SCALING INSTAGRAM INFRA › system › files › presentation-slides › qcon20… · RabbitMQ PostgreSQL Cassandra Celery Other Services Django. STORAGE VS. COMPUTING • Storage:](https://reader035.fdocuments.in/reader035/viewer/2022070819/5f1aedff73924575bc4f3307/html5/thumbnails/14.jpg)
INSTAGRAM STACK
Tuesday, June 25th, 2013
memcache
RabbitMQ
PostgreSQL
Cassandra
Celery
Other Services
Django
![Page 15: SCALING INSTAGRAM INFRA › system › files › presentation-slides › qcon20… · RabbitMQ PostgreSQL Cassandra Celery Other Services Django. STORAGE VS. COMPUTING • Storage:](https://reader035.fdocuments.in/reader035/viewer/2022070819/5f1aedff73924575bc4f3307/html5/thumbnails/15.jpg)
STORAGE VS. COMPUTING
• Storage: needs to be consistent across data centers
• Computing: driven by user traffic, on an as-needed basis
![Page 16: SCALING INSTAGRAM INFRA › system › files › presentation-slides › qcon20… · RabbitMQ PostgreSQL Cassandra Celery Other Services Django. STORAGE VS. COMPUTING • Storage:](https://reader035.fdocuments.in/reader035/viewer/2022070819/5f1aedff73924575bc4f3307/html5/thumbnails/16.jpg)
SCALE OUT: STORAGE
user, media, friendship etc
![Page 17: SCALING INSTAGRAM INFRA › system › files › presentation-slides › qcon20… · RabbitMQ PostgreSQL Cassandra Celery Other Services Django. STORAGE VS. COMPUTING • Storage:](https://reader035.fdocuments.in/reader035/viewer/2022070819/5f1aedff73924575bc4f3307/html5/thumbnails/17.jpg)
SCALE OUT: STORAGE
user, media, friendship etc
[Diagram labels: Django, Master, Replica, Replica; arrows: Write, Read]
![Page 18: SCALING INSTAGRAM INFRA › system › files › presentation-slides › qcon20… · RabbitMQ PostgreSQL Cassandra Celery Other Services Django. STORAGE VS. COMPUTING • Storage:](https://reader035.fdocuments.in/reader035/viewer/2022070819/5f1aedff73924575bc4f3307/html5/thumbnails/18.jpg)
SCALE OUT: STORAGE
user, media, friendship etc
[Diagram labels: Django, Master, Replica, Replica; arrows: Write, Read; data centers: DC1, DC2, DC3]
![Page 19: SCALING INSTAGRAM INFRA › system › files › presentation-slides › qcon20… · RabbitMQ PostgreSQL Cassandra Celery Other Services Django. STORAGE VS. COMPUTING • Storage:](https://reader035.fdocuments.in/reader035/viewer/2022070819/5f1aedff73924575bc4f3307/html5/thumbnails/19.jpg)
SCALE OUT: STORAGE
user feeds, activities etc
[Diagram labels: Replica, Replica, Replica]
Write - 2
Read - 1
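The "Write - 2" and "Read - 1" labels on a three-replica diagram read like per-operation replica counts, as in a quorum-style store. The toy model below is a sketch under that assumption, not the store's actual implementation; `QuorumStore`, `Replica`, and the `start` parameter are hypothetical names used only for illustration. It shows that with 2 + 1 = 3 replicas touched, a single-replica read is not guaranteed to see the latest write.

```python
class Replica:
    def __init__(self):
        self.data = {}  # key -> (version, value)


class QuorumStore:
    """Toy model: N replicas, a write is applied to W of them, a read consults R."""

    def __init__(self, n=3, w=2, r=1):
        self.replicas = [Replica() for _ in range(n)]
        self.w = w
        self.r = r
        self.version = 0

    def write(self, key, value):
        self.version += 1
        # Apply synchronously to W replicas; a real store would update
        # the remaining replicas asynchronously.
        for replica in self.replicas[: self.w]:
            replica.data[key] = (self.version, value)
        return self.version

    def read(self, key, start=0):
        # Consult R replicas beginning at index `start` and return the
        # value with the highest version among them.
        n = len(self.replicas)
        seen = [self.replicas[(start + i) % n].data.get(key) for i in range(self.r)]
        seen = [entry for entry in seen if entry is not None]
        if not seen:
            return None
        return max(seen, key=lambda entry: entry[0])[1]


store = QuorumStore(n=3, w=2, r=1)
store.write("feed:42", ["post1"])
fresh = store.read("feed:42", start=0)   # replica 0 received the write
stale = store.read("feed:42", start=2)   # replica 2 has not caught up yet
```

Because W + R does not exceed N here, the read and write sets may not overlap, which is the usual trade of consistency for lower read latency.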
![Page 20: SCALING INSTAGRAM INFRA › system › files › presentation-slides › qcon20… · RabbitMQ PostgreSQL Cassandra Celery Other Services Django. STORAGE VS. COMPUTING • Storage:](https://reader035.fdocuments.in/reader035/viewer/2022070819/5f1aedff73924575bc4f3307/html5/thumbnails/20.jpg)
![Page 21: SCALING INSTAGRAM INFRA › system › files › presentation-slides › qcon20… · RabbitMQ PostgreSQL Cassandra Celery Other Services Django. STORAGE VS. COMPUTING • Storage:](https://reader035.fdocuments.in/reader035/viewer/2022070819/5f1aedff73924575bc4f3307/html5/thumbnails/21.jpg)
COMPUTING
![Page 22: SCALING INSTAGRAM INFRA › system › files › presentation-slides › qcon20… · RabbitMQ PostgreSQL Cassandra Celery Other Services Django. STORAGE VS. COMPUTING • Storage:](https://reader035.fdocuments.in/reader035/viewer/2022070819/5f1aedff73924575bc4f3307/html5/thumbnails/22.jpg)
[Diagram: two data centers, DC1 and DC2, each with Django, RabbitMQ, PostgreSQL, Cassandra, Celery, and memcache]
![Page 23: SCALING INSTAGRAM INFRA › system › files › presentation-slides › qcon20… · RabbitMQ PostgreSQL Cassandra Celery Other Services Django. STORAGE VS. COMPUTING • Storage:](https://reader035.fdocuments.in/reader035/viewer/2022070819/5f1aedff73924575bc4f3307/html5/thumbnails/23.jpg)
MEMCACHE
• High-performance in-memory key-value store
• Millions of reads/writes per second
• Sensitive to network conditions
• Cross-region operation is prohibitive
No global consistency
![Page 24: SCALING INSTAGRAM INFRA › system › files › presentation-slides › qcon20… · RabbitMQ PostgreSQL Cassandra Celery Other Services Django. STORAGE VS. COMPUTING • Storage:](https://reader035.fdocuments.in/reader035/viewer/2022070819/5f1aedff73924575bc4f3307/html5/thumbnails/24.jpg)
[Diagram, DC1: User C posts a comment, and Django inserts it into PostgreSQL and sets it in memcache. User R loads the feed, and Django gets it from memcache.]
![Page 25: SCALING INSTAGRAM INFRA › system › files › presentation-slides › qcon20… · RabbitMQ PostgreSQL Cassandra Celery Other Services Django. STORAGE VS. COMPUTING • Storage:](https://reader035.fdocuments.in/reader035/viewer/2022070819/5f1aedff73924575bc4f3307/html5/thumbnails/25.jpg)
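[Diagram: in DC1, User C posts a comment, and Django inserts it into PostgreSQL and sets it in the DC1 memcache. PostgreSQL replication carries the change to DC2, where User R loads the feed and Django gets it from the DC2 memcache.]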
![Page 26: SCALING INSTAGRAM INFRA › system › files › presentation-slides › qcon20… · RabbitMQ PostgreSQL Cassandra Celery Other Services Django. STORAGE VS. COMPUTING • Storage:](https://reader035.fdocuments.in/reader035/viewer/2022070819/5f1aedff73924575bc4f3307/html5/thumbnails/26.jpg)
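[Diagram: in DC1, User C posts a comment, and Django inserts it into PostgreSQL and sets it in the DC1 memcache. PostgreSQL replication carries the change to DC2, where User R loads the feed and Django gets it from the DC2 memcache. Two "Cache invalidate" steps are added to the flow.]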
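The labels on these slides (insert, set, replication, cache invalidate, get) suggest that a remote region's cache is invalidated when a replicated write arrives. The sketch below models that reading with plain dictionaries; `Region`, `add_comment`, `apply_replication`, and `get_comments` are hypothetical names, and tying invalidation to replication is an assumption drawn from the diagram labels rather than a description of the production system.

```python
class Region:
    """One data center with a local database replica and a local cache."""

    def __init__(self, name):
        self.name = name
        self.db = {}      # stands in for the local PostgreSQL instance
        self.cache = {}   # stands in for the local memcache
        self.peers = []   # other regions that receive replicated writes

    def add_comment(self, media_id, text):
        # Write path in the originating region: insert, then set the cache.
        self.db.setdefault(media_id, []).append(text)
        self.cache[media_id] = list(self.db[media_id])
        for peer in self.peers:
            peer.apply_replication(media_id, text)

    def apply_replication(self, media_id, text):
        # A replicated row arrives: apply it, then drop the stale cache entry.
        self.db.setdefault(media_id, []).append(text)
        self.cache.pop(media_id, None)

    def get_comments(self, media_id):
        # Read path: serve from cache, filling from the local database on a miss.
        if media_id not in self.cache:
            self.cache[media_id] = list(self.db.get(media_id, []))
        return self.cache[media_id]


dc1, dc2 = Region("DC1"), Region("DC2")
dc1.peers.append(dc2)

before = list(dc2.get_comments(12345))   # DC2 caches an empty comment list
dc1.add_comment(12345, "nice photo")     # User C comments in DC1
after = dc2.get_comments(12345)          # User R reads in DC2
```

Without the `pop` in `apply_replication`, DC2 would keep serving the empty list it cached before the comment was written.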
![Page 27: SCALING INSTAGRAM INFRA › system › files › presentation-slides › qcon20… · RabbitMQ PostgreSQL Cassandra Celery Other Services Django. STORAGE VS. COMPUTING • Storage:](https://reader035.fdocuments.in/reader035/viewer/2022070819/5f1aedff73924575bc4f3307/html5/thumbnails/27.jpg)
COUNTERS
select count(*) from user_likes_media where media_id=12345;
100s of ms
![Page 28: SCALING INSTAGRAM INFRA › system › files › presentation-slides › qcon20… · RabbitMQ PostgreSQL Cassandra Celery Other Services Django. STORAGE VS. COMPUTING • Storage:](https://reader035.fdocuments.in/reader035/viewer/2022070819/5f1aedff73924575bc4f3307/html5/thumbnails/28.jpg)
COUNTER
select count from media_likes where media_id=12345;
10s of µs
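The two queries contrast counting rows on every read with reading a precomputed counter. A minimal sketch of that denormalization follows, using SQLite in place of PostgreSQL so it runs anywhere; the table names come from the slides, while the column layout and the `like` and `like_count` helpers are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE user_likes_media (
        user_id  INTEGER NOT NULL,
        media_id INTEGER NOT NULL,
        PRIMARY KEY (user_id, media_id)
    );
    CREATE TABLE media_likes (
        media_id INTEGER PRIMARY KEY,
        count    INTEGER NOT NULL
    );
    """
)


def like(user_id, media_id):
    """Record a like and bump the counter in the same transaction."""
    with conn:
        cur = conn.execute(
            "INSERT OR IGNORE INTO user_likes_media VALUES (?, ?)",
            (user_id, media_id),
        )
        if cur.rowcount == 1:  # only count likes that were actually new
            conn.execute(
                "INSERT OR IGNORE INTO media_likes VALUES (?, 0)", (media_id,)
            )
            conn.execute(
                "UPDATE media_likes SET count = count + 1 WHERE media_id = ?",
                (media_id,),
            )


def like_count(media_id):
    """Single-row lookup instead of a count(*) over all likes."""
    row = conn.execute(
        "SELECT count FROM media_likes WHERE media_id = ?", (media_id,)
    ).fetchone()
    return row[0] if row else 0


like(1, 12345)
like(2, 12345)
like(2, 12345)  # duplicate like, ignored
```

The write path does slightly more work so that the read path touches one row regardless of how many likes the media has.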
![Page 29: SCALING INSTAGRAM INFRA › system › files › presentation-slides › qcon20… · RabbitMQ PostgreSQL Cassandra Celery Other Services Django. STORAGE VS. COMPUTING • Storage:](https://reader035.fdocuments.in/reader035/viewer/2022070819/5f1aedff73924575bc4f3307/html5/thumbnails/29.jpg)
Cache invalidated
All Djangos try to access DB
![Page 30: SCALING INSTAGRAM INFRA › system › files › presentation-slides › qcon20… · RabbitMQ PostgreSQL Cassandra Celery Other Services Django. STORAGE VS. COMPUTING • Storage:](https://reader035.fdocuments.in/reader035/viewer/2022070819/5f1aedff73924575bc4f3307/html5/thumbnails/30.jpg)
MEMCACHE LEASE
[Sequence diagram: participants d1, d2, memcache, db; time runs downward]
d1: lease-get → fill
d2: lease-get → wait or use stale
d1: read from DB
d1: lease-set
d2: lease-get → hit
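The lease sequence can be sketched as a small in-memory class. This is an illustrative model, not memcache's API: `LeaseCache`, `lease_get`, `lease_set`, and the status strings are hypothetical names, and the "use stale" option from the slide is omitted for brevity.

```python
import itertools


class LeaseCache:
    """Toy cache where only one client at a time may refill a missing key."""

    def __init__(self):
        self.values = {}
        self.leases = {}                   # key -> outstanding lease token
        self._tokens = itertools.count(1)

    def lease_get(self, key):
        """Return ('hit', value), ('fill', token), or ('wait', None)."""
        if key in self.values:
            return "hit", self.values[key]
        if key not in self.leases:
            token = next(self._tokens)
            self.leases[key] = token
            return "fill", token           # caller should read the DB and lease_set
        return "wait", None                # someone else is already filling

    def lease_set(self, key, token, value):
        """Store the value only if the caller still holds the lease."""
        if self.leases.get(key) != token:
            return False
        del self.leases[key]
        self.values[key] = value
        return True

    def invalidate(self, key):
        self.values.pop(key, None)
        self.leases.pop(key, None)


cache = LeaseCache()
d1_status, d1_token = cache.lease_get("media:12345")   # d1 is told to fill
d2_status, _ = cache.lease_get("media:12345")          # d2 must wait
value_from_db = {"likes": 2}                           # d1 reads from the DB
stored = cache.lease_set("media:12345", d1_token, value_from_db)
d2_retry = cache.lease_get("media:12345")              # d2 now gets a hit
```

Only the lease holder reaches the database, so an invalidation of a hot key produces one database read instead of one per web server.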
![Page 31: SCALING INSTAGRAM INFRA › system › files › presentation-slides › qcon20… · RabbitMQ PostgreSQL Cassandra Celery Other Services Django. STORAGE VS. COMPUTING • Storage:](https://reader035.fdocuments.in/reader035/viewer/2022070819/5f1aedff73924575bc4f3307/html5/thumbnails/31.jpg)
INSTAGRAM STACK - MULTI REGION
[Diagram: DC1 and DC2, each running Django, RabbitMQ, PostgreSQL, Cassandra, Celery, and memcache]
![Page 32: SCALING INSTAGRAM INFRA › system › files › presentation-slides › qcon20… · RabbitMQ PostgreSQL Cassandra Celery Other Services Django. STORAGE VS. COMPUTING • Storage:](https://reader035.fdocuments.in/reader035/viewer/2022070819/5f1aedff73924575bc4f3307/html5/thumbnails/32.jpg)
SCALING OUT
• Capacity
• Reliability
• Regional failure ready
![Page 33: SCALING INSTAGRAM INFRA › system › files › presentation-slides › qcon20… · RabbitMQ PostgreSQL Cassandra Celery Other Services Django. STORAGE VS. COMPUTING • Storage:](https://reader035.fdocuments.in/reader035/viewer/2022070819/5f1aedff73924575bc4f3307/html5/thumbnails/33.jpg)
SCALING OUT - CHALLENGES, OPPORTUNITIES
• Beyond North America
• More localized social network
• Direct messaging
• Live streaming
![Page 34: SCALING INSTAGRAM INFRA › system › files › presentation-slides › qcon20… · RabbitMQ PostgreSQL Cassandra Celery Other Services Django. STORAGE VS. COMPUTING • Storage:](https://reader035.fdocuments.in/reader035/viewer/2022070819/5f1aedff73924575bc4f3307/html5/thumbnails/34.jpg)
[Chart: User growth vs. Server growth; x-axis 0-24, y-axis 20-100]
![Page 35: SCALING INSTAGRAM INFRA › system › files › presentation-slides › qcon20… · RabbitMQ PostgreSQL Cassandra Celery Other Services Django. STORAGE VS. COMPUTING • Storage:](https://reader035.fdocuments.in/reader035/viewer/2022070819/5f1aedff73924575bc4f3307/html5/thumbnails/35.jpg)
“Don’t count the servers, make the servers count”
![Page 36: SCALING INSTAGRAM INFRA › system › files › presentation-slides › qcon20… · RabbitMQ PostgreSQL Cassandra Celery Other Services Django. STORAGE VS. COMPUTING • Storage:](https://reader035.fdocuments.in/reader035/viewer/2022070819/5f1aedff73924575bc4f3307/html5/thumbnails/36.jpg)
SCALE UP
![Page 37: SCALING INSTAGRAM INFRA › system › files › presentation-slides › qcon20… · RabbitMQ PostgreSQL Cassandra Celery Other Services Django. STORAGE VS. COMPUTING • Storage:](https://reader035.fdocuments.in/reader035/viewer/2022070819/5f1aedff73924575bc4f3307/html5/thumbnails/37.jpg)
SCALE UP
Use as few CPU instructions as possible
Use as few servers as possible
![Page 38: SCALING INSTAGRAM INFRA › system › files › presentation-slides › qcon20… · RabbitMQ PostgreSQL Cassandra Celery Other Services Django. STORAGE VS. COMPUTING • Storage:](https://reader035.fdocuments.in/reader035/viewer/2022070819/5f1aedff73924575bc4f3307/html5/thumbnails/38.jpg)
SCALE UP
Use as few CPU instructions as possible
Use as few servers as possible
![Page 39: SCALING INSTAGRAM INFRA › system › files › presentation-slides › qcon20… · RabbitMQ PostgreSQL Cassandra Celery Other Services Django. STORAGE VS. COMPUTING • Storage:](https://reader035.fdocuments.in/reader035/viewer/2022070819/5f1aedff73924575bc4f3307/html5/thumbnails/39.jpg)
CPU
Monitor
Optimize
Analyze
![Page 40: SCALING INSTAGRAM INFRA › system › files › presentation-slides › qcon20… · RabbitMQ PostgreSQL Cassandra Celery Other Services Django. STORAGE VS. COMPUTING • Storage:](https://reader035.fdocuments.in/reader035/viewer/2022070819/5f1aedff73924575bc4f3307/html5/thumbnails/40.jpg)
COLLECT
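struct perf_event_attr pe;
memset(&pe, 0, sizeof(pe));              /* clear all fields before configuring */
pe.size = sizeof(pe);
pe.type = PERF_TYPE_HARDWARE;
pe.config = PERF_COUNT_HW_INSTRUCTIONS;  /* count retired instructions */
/* perf_event_open has no glibc wrapper; define one around syscall(2) */
fd = perf_event_open(&pe, 0, -1, -1, 0); /* this process, any CPU */
ioctl(fd, PERF_EVENT_IOC_ENABLE, 0);
<code you want to measure>
ioctl(fd, PERF_EVENT_IOC_DISABLE, 0);
read(fd, &count, sizeof(long long));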
![Page 41: SCALING INSTAGRAM INFRA › system › files › presentation-slides › qcon20… · RabbitMQ PostgreSQL Cassandra Celery Other Services Django. STORAGE VS. COMPUTING • Storage:](https://reader035.fdocuments.in/reader035/viewer/2022070819/5f1aedff73924575bc4f3307/html5/thumbnails/41.jpg)
DYNOSTATS
[Chart: series for Follow, Feed, and Explore; x-axis 0-24, y-axis 20-100]
![Page 42: SCALING INSTAGRAM INFRA › system › files › presentation-slides › qcon20… · RabbitMQ PostgreSQL Cassandra Celery Other Services Django. STORAGE VS. COMPUTING • Storage:](https://reader035.fdocuments.in/reader035/viewer/2022070819/5f1aedff73924575bc4f3307/html5/thumbnails/42.jpg)
REGRESSION
![Page 43: SCALING INSTAGRAM INFRA › system › files › presentation-slides › qcon20… · RabbitMQ PostgreSQL Cassandra Celery Other Services Django. STORAGE VS. COMPUTING • Storage:](https://reader035.fdocuments.in/reader035/viewer/2022070819/5f1aedff73924575bc4f3307/html5/thumbnails/43.jpg)
With new feature
Without new feature
![Page 44: SCALING INSTAGRAM INFRA › system › files › presentation-slides › qcon20… · RabbitMQ PostgreSQL Cassandra Celery Other Services Django. STORAGE VS. COMPUTING • Storage:](https://reader035.fdocuments.in/reader035/viewer/2022070819/5f1aedff73924575bc4f3307/html5/thumbnails/44.jpg)
![Page 45: SCALING INSTAGRAM INFRA › system › files › presentation-slides › qcon20… · RabbitMQ PostgreSQL Cassandra Celery Other Services Django. STORAGE VS. COMPUTING • Storage:](https://reader035.fdocuments.in/reader035/viewer/2022070819/5f1aedff73924575bc4f3307/html5/thumbnails/45.jpg)
CPU
Monitor
Optimize
Analyze
![Page 46: SCALING INSTAGRAM INFRA › system › files › presentation-slides › qcon20… · RabbitMQ PostgreSQL Cassandra Celery Other Services Django. STORAGE VS. COMPUTING • Storage:](https://reader035.fdocuments.in/reader035/viewer/2022070819/5f1aedff73924575bc4f3307/html5/thumbnails/46.jpg)
PYTHON CPROFILE
import cProfile, pstats, io
pr = cProfile.Profile()
pr.enable()
# ... do something ...
pr.disable()
s = io.StringIO()
sortby = 'cumulative'  # order by time spent in a function and everything it calls
ps = pstats.Stats(pr, stream=s).sort_stats(sortby)
ps.print_stats()
print(s.getvalue())
![Page 47: SCALING INSTAGRAM INFRA › system › files › presentation-slides › qcon20… · RabbitMQ PostgreSQL Cassandra Celery Other Services Django. STORAGE VS. COMPUTING • Storage:](https://reader035.fdocuments.in/reader035/viewer/2022070819/5f1aedff73924575bc4f3307/html5/thumbnails/47.jpg)
![Page 48: SCALING INSTAGRAM INFRA › system › files › presentation-slides › qcon20… · RabbitMQ PostgreSQL Cassandra Celery Other Services Django. STORAGE VS. COMPUTING • Storage:](https://reader035.fdocuments.in/reader035/viewer/2022070819/5f1aedff73924575bc4f3307/html5/thumbnails/48.jpg)
CPU - ANALYZE
continuous profiling
generate_profile explore --start <start-time> --duration <minutes>
![Page 49: SCALING INSTAGRAM INFRA › system › files › presentation-slides › qcon20… · RabbitMQ PostgreSQL Cassandra Celery Other Services Django. STORAGE VS. COMPUTING • Storage:](https://reader035.fdocuments.in/reader035/viewer/2022070819/5f1aedff73924575bc4f3307/html5/thumbnails/49.jpg)
CPU - ANALYZE
continuous profiling
[Chart: one Caller series and two Callee series; x-axis 0-24, y-axis 20-100]
![Page 50: SCALING INSTAGRAM INFRA › system › files › presentation-slides › qcon20… · RabbitMQ PostgreSQL Cassandra Celery Other Services Django. STORAGE VS. COMPUTING • Storage:](https://reader035.fdocuments.in/reader035/viewer/2022070819/5f1aedff73924575bc4f3307/html5/thumbnails/50.jpg)
![Page 51: SCALING INSTAGRAM INFRA › system › files › presentation-slides › qcon20… · RabbitMQ PostgreSQL Cassandra Celery Other Services Django. STORAGE VS. COMPUTING • Storage:](https://reader035.fdocuments.in/reader035/viewer/2022070819/5f1aedff73924575bc4f3307/html5/thumbnails/51.jpg)
CPU
Monitor
Optimize
Analyze
![Page 52: SCALING INSTAGRAM INFRA › system › files › presentation-slides › qcon20… · RabbitMQ PostgreSQL Cassandra Celery Other Services Django. STORAGE VS. COMPUTING • Storage:](https://reader035.fdocuments.in/reader035/viewer/2022070819/5f1aedff73924575bc4f3307/html5/thumbnails/52.jpg)
igcdn-photos-d-a.akamaihd.net/hphotos-ak-xpl1/t51.2885-19/s300x300/12345678_1234567890_987654321_a.jpg
![Page 53: SCALING INSTAGRAM INFRA › system › files › presentation-slides › qcon20… · RabbitMQ PostgreSQL Cassandra Celery Other Services Django. STORAGE VS. COMPUTING • Storage:](https://reader035.fdocuments.in/reader035/viewer/2022070819/5f1aedff73924575bc4f3307/html5/thumbnails/53.jpg)
igcdn-photos-d-a.akamaihd.net/hphotos-ak-xpl1/t51.2885-19/s150x150/12345678_1234567890_987654321_a.jpg
igcdn-photos-d-a.akamaihd.net/hphotos-ak-xpl1/t51.2885-19/s400x600/12345678_1234567890_987654321_a.jpg
igcdn-photos-d-a.akamaihd.net/hphotos-ak-xpl1/t51.2885-19/s200x200/12345678_1234567890_987654321_a.jpg
igcdn-photos-d-a.akamaihd.net/hphotos-ak-xpl1/t51.2885-19/s300x300/12345678_1234567890_987654321_a.jpg
![Page 54: SCALING INSTAGRAM INFRA › system › files › presentation-slides › qcon20… · RabbitMQ PostgreSQL Cassandra Celery Other Services Django. STORAGE VS. COMPUTING • Storage:](https://reader035.fdocuments.in/reader035/viewer/2022070819/5f1aedff73924575bc4f3307/html5/thumbnails/54.jpg)
CPU - OPTIMIZE
![Page 55: SCALING INSTAGRAM INFRA › system › files › presentation-slides › qcon20… · RabbitMQ PostgreSQL Cassandra Celery Other Services Django. STORAGE VS. COMPUTING • Storage:](https://reader035.fdocuments.in/reader035/viewer/2022070819/5f1aedff73924575bc4f3307/html5/thumbnails/55.jpg)
igcdn-photos-d-a.akamaihd.net/hphotos-ak-xpl1/t51.2885-19/s300x300/12345678_1234567890_987654321_a.jpg
150x150
400x600
200x200
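One way to read these slides is that the same URL was being generated separately for every image size, and that the fix was to do the size-independent work once. The sketch below illustrates that general idea only; `expensive_base`, the `cdn.example.com` host, and the URL layout are hypothetical and are not taken from Instagram's code.

```python
import hashlib

SIZES = ["150x150", "200x200", "300x300", "400x600"]
calls = {"expensive": 0}


def expensive_base(media_id):
    """Hypothetical stand-in for the costly, size-independent part of URL generation."""
    calls["expensive"] += 1
    digest = hashlib.sha1(str(media_id).encode()).hexdigest()[:8]
    # "{size}" is left as a placeholder to be filled in per size.
    return f"https://cdn.example.com/{{size}}/{media_id}_{digest}.jpg"


def urls_naive(media_id):
    # Repeats the expensive step once per size.
    return {s: expensive_base(media_id).format(size="s" + s) for s in SIZES}


def urls_shared(media_id):
    # Does the expensive step once, then only swaps the size segment.
    template = expensive_base(media_id)
    return {s: template.format(size="s" + s) for s in SIZES}


naive = urls_naive(12345678)
naive_calls = calls["expensive"]
calls["expensive"] = 0
shared = urls_shared(12345678)
shared_calls = calls["expensive"]
```

Both functions return the same URLs; the second simply calls the expensive helper once instead of once per size.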
![Page 56: SCALING INSTAGRAM INFRA › system › files › presentation-slides › qcon20… · RabbitMQ PostgreSQL Cassandra Celery Other Services Django. STORAGE VS. COMPUTING • Storage:](https://reader035.fdocuments.in/reader035/viewer/2022070819/5f1aedff73924575bc4f3307/html5/thumbnails/56.jpg)
CPU - OPTIMIZE
C is really faster
• Candidate functions:
  • Used extensively
  • Stable
• Cython or C/C++
![Page 57: SCALING INSTAGRAM INFRA › system › files › presentation-slides › qcon20… · RabbitMQ PostgreSQL Cassandra Celery Other Services Django. STORAGE VS. COMPUTING • Storage:](https://reader035.fdocuments.in/reader035/viewer/2022070819/5f1aedff73924575bc4f3307/html5/thumbnails/57.jpg)
Use as few CPU instructions as possible
Use as few servers as possible
SCALE UP
![Page 58: SCALING INSTAGRAM INFRA › system › files › presentation-slides › qcon20… · RabbitMQ PostgreSQL Cassandra Celery Other Services Django. STORAGE VS. COMPUTING • Storage:](https://reader035.fdocuments.in/reader035/viewer/2022070819/5f1aedff73924575bc4f3307/html5/thumbnails/58.jpg)
ONE WEB SERVER
Process 1
Shared Memory
Private Memory
Process N
![Page 59: SCALING INSTAGRAM INFRA › system › files › presentation-slides › qcon20… · RabbitMQ PostgreSQL Cassandra Celery Other Services Django. STORAGE VS. COMPUTING • Storage:](https://reader035.fdocuments.in/reader035/viewer/2022070819/5f1aedff73924575bc4f3307/html5/thumbnails/59.jpg)
SCALE UP: MEMORY
Reduce code
• Run in optimized mode (-O)
• Remove dead code
![Page 60: SCALING INSTAGRAM INFRA › system › files › presentation-slides › qcon20… · RabbitMQ PostgreSQL Cassandra Celery Other Services Django. STORAGE VS. COMPUTING • Storage:](https://reader035.fdocuments.in/reader035/viewer/2022070819/5f1aedff73924575bc4f3307/html5/thumbnails/60.jpg)
SCALE UP: MEMORY
Share more
• Move configuration into shared memory
• Disable garbage collection
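A rough sketch of the "disable garbage collection" idea for a pre-fork server follows. It assumes worker processes are forked from a parent that has already loaded code and configuration, so that memory pages stay shared until written. `prepare_parent` and `CONFIG` are hypothetical names, and `gc.freeze()` is a Python 3.7+ call included here as an assumption; the slides only say that collection was disabled.

```python
import gc

# Hypothetical configuration loaded once in the parent, before workers are forked.
CONFIG = {"feature_flags": {"stories": True}, "timeout_ms": 250}


def prepare_parent():
    """Call in the parent process just before forking workers."""
    # The cyclic collector writes bookkeeping data into tracked objects,
    # which forces copy-on-write pages to be duplicated in each child.
    gc.disable()
    # Move everything allocated so far into a permanent generation that
    # later collections ignore (Python 3.7+).
    gc.freeze()
    return gc.get_freeze_count()


frozen = prepare_parent()
gc_enabled_after = gc.isenabled()

# Undo the changes so the rest of this example process behaves normally.
gc.unfreeze()
gc.enable()
```

Reference counting still updates object headers, so this reduces rather than eliminates page copying, and with the collector off, reference cycles are no longer reclaimed automatically.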
![Page 61: SCALING INSTAGRAM INFRA › system › files › presentation-slides › qcon20… · RabbitMQ PostgreSQL Cassandra Celery Other Services Django. STORAGE VS. COMPUTING • Storage:](https://reader035.fdocuments.in/reader035/viewer/2022070819/5f1aedff73924575bc4f3307/html5/thumbnails/61.jpg)
SCALE UP: MEMORY
20+% capacity increase
![Page 62: SCALING INSTAGRAM INFRA › system › files › presentation-slides › qcon20… · RabbitMQ PostgreSQL Cassandra Celery Other Services Django. STORAGE VS. COMPUTING • Storage:](https://reader035.fdocuments.in/reader035/viewer/2022070819/5f1aedff73924575bc4f3307/html5/thumbnails/62.jpg)
SCALE UP: NETWORK LATENCY
Synchronous processing model with long latency
===> Worker starvation and fewer CPU instructions executed
![Page 63: SCALING INSTAGRAM INFRA › system › files › presentation-slides › qcon20… · RabbitMQ PostgreSQL Cassandra Celery Other Services Django. STORAGE VS. COMPUTING • Storage:](https://reader035.fdocuments.in/reader035/viewer/2022070819/5f1aedff73924575bc4f3307/html5/thumbnails/63.jpg)
ASYNC IO
[Diagram labels: Django, Feed, Stories, Suggested Users]
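As an illustration of the async IO idea, the sketch below issues the three backend calls concurrently with `asyncio.gather` instead of one after another. The `fetch` coroutine and its delays are placeholders standing in for real network calls to the Feed, Stories, and Suggested Users services; none of this is taken from Instagram's code.

```python
import asyncio


async def fetch(service, delay):
    """Placeholder for a network call; the sleep stands in for round-trip latency."""
    await asyncio.sleep(delay)
    return f"{service}-data"


async def build_home():
    # All three requests are in flight at once, so total latency is close to
    # the slowest call rather than the sum of all three.
    feed, stories, suggested = await asyncio.gather(
        fetch("feed", 0.03),
        fetch("stories", 0.02),
        fetch("suggested_users", 0.01),
    )
    return {"feed": feed, "stories": stories, "suggested_users": suggested}


result = asyncio.run(build_home())
```

While one request is waiting on the network, the worker is free to make progress on the others, which addresses the worker starvation described for the synchronous model.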
![Page 64: SCALING INSTAGRAM INFRA › system › files › presentation-slides › qcon20… · RabbitMQ PostgreSQL Cassandra Celery Other Services Django. STORAGE VS. COMPUTING • Storage:](https://reader035.fdocuments.in/reader035/viewer/2022070819/5f1aedff73924575bc4f3307/html5/thumbnails/64.jpg)
Use as few CPU instructions as possible
Use as few servers as possible
Scale up
![Page 65: SCALING INSTAGRAM INFRA › system › files › presentation-slides › qcon20… · RabbitMQ PostgreSQL Cassandra Celery Other Services Django. STORAGE VS. COMPUTING • Storage:](https://reader035.fdocuments.in/reader035/viewer/2022070819/5f1aedff73924575bc4f3307/html5/thumbnails/65.jpg)
SCALE UP: CHALLENGES, OPPORTUNITIES
• Faster Python run-time
• Async web framework
• Better memory analysis
• etc etc
![Page 66: SCALING INSTAGRAM INFRA › system › files › presentation-slides › qcon20… · RabbitMQ PostgreSQL Cassandra Celery Other Services Django. STORAGE VS. COMPUTING • Storage:](https://reader035.fdocuments.in/reader035/viewer/2022070819/5f1aedff73924575bc4f3307/html5/thumbnails/66.jpg)
SCALE DEV TEAM
![Page 67: SCALING INSTAGRAM INFRA › system › files › presentation-slides › qcon20… · RabbitMQ PostgreSQL Cassandra Celery Other Services Django. STORAGE VS. COMPUTING • Storage:](https://reader035.fdocuments.in/reader035/viewer/2022070819/5f1aedff73924575bc4f3307/html5/thumbnails/67.jpg)
SCALING TEAM
30% of engineers joined in the last 6 months
Bootcampers - 1 week
Hack-A-Month - 4 weeks
Intern - 12 weeks
![Page 68: SCALING INSTAGRAM INFRA › system › files › presentation-slides › qcon20… · RabbitMQ PostgreSQL Cassandra Celery Other Services Django. STORAGE VS. COMPUTING • Storage:](https://reader035.fdocuments.in/reader035/viewer/2022070819/5f1aedff73924575bc4f3307/html5/thumbnails/68.jpg)
Comment Filtering
Self-harm Prevention
Windows App
Multiple media in one post
Video View Notification
Saved Posts
First Story Notification
Instagram Live
Instagram Stories
![Page 69: SCALING INSTAGRAM INFRA › system › files › presentation-slides › qcon20… · RabbitMQ PostgreSQL Cassandra Celery Other Services Django. STORAGE VS. COMPUTING • Storage:](https://reader035.fdocuments.in/reader035/viewer/2022070819/5f1aedff73924575bc4f3307/html5/thumbnails/69.jpg)
Which server?
New Table or New Column?
What Index?
Should I cache it?
Will I lock up DB?
Will I bring down Instagram?
![Page 70: SCALING INSTAGRAM INFRA › system › files › presentation-slides › qcon20… · RabbitMQ PostgreSQL Cassandra Celery Other Services Django. STORAGE VS. COMPUTING • Storage:](https://reader035.fdocuments.in/reader035/viewer/2022070819/5f1aedff73924575bc4f3307/html5/thumbnails/70.jpg)
WHAT WE WANT
• Automatically handle cache
• Define relations, not worry about implementations
• Self service by product engineers
• Infra focuses on scale
![Page 71: SCALING INSTAGRAM INFRA › system › files › presentation-slides › qcon20… · RabbitMQ PostgreSQL Cassandra Celery Other Services Django. STORAGE VS. COMPUTING • Storage:](https://reader035.fdocuments.in/reader035/viewer/2022070819/5f1aedff73924575bc4f3307/html5/thumbnails/71.jpg)
TAO
[Graph: nodes USER1, USER2, USER3, and media; edge pairs: posted / posted by, likes / liked by (twice)]
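The graph on this slide can be modelled as objects connected by typed associations, each stored with its inverse. The class below is a toy in-memory sketch of that data model, not TAO's real interface: `GraphStore` and its methods are hypothetical, and which user posted the media is assumed for the sake of the example.

```python
from collections import defaultdict


class GraphStore:
    """Objects are plain ids; associations are typed, directed edges."""

    def __init__(self):
        self._assocs = defaultdict(list)   # (source id, type) -> [target ids]

    def assoc_add(self, id1, assoc_type, id2, inverse=None):
        self._assocs[(id1, assoc_type)].append(id2)
        if inverse is not None:
            # Maintain the reverse edge so both directions are cheap to query.
            self._assocs[(id2, inverse)].append(id1)

    def assoc_get(self, id1, assoc_type):
        return list(self._assocs.get((id1, assoc_type), []))

    def assoc_count(self, id1, assoc_type):
        return len(self._assocs.get((id1, assoc_type), []))


g = GraphStore()
g.assoc_add("USER1", "posted", "media", inverse="posted_by")   # assumed poster
g.assoc_add("USER2", "likes", "media", inverse="liked_by")
g.assoc_add("USER3", "likes", "media", inverse="liked_by")
```

Product code then declares relations such as "likes" and "liked_by" and leaves storage, caching, and sharding to the layer underneath, which matches the "define relations, not worry about implementations" goal.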
![Page 72: SCALING INSTAGRAM INFRA › system › files › presentation-slides › qcon20… · RabbitMQ PostgreSQL Cassandra Celery Other Services Django. STORAGE VS. COMPUTING • Storage:](https://reader035.fdocuments.in/reader035/viewer/2022070819/5f1aedff73924575bc4f3307/html5/thumbnails/72.jpg)
Comment Filtering
Self-harm Prevention
Windows App
Multiple media in one post
Video View Notification
Saved Posts
First Story Notification
Instagram Live
Instagram Stories
![Page 73: SCALING INSTAGRAM INFRA › system › files › presentation-slides › qcon20… · RabbitMQ PostgreSQL Cassandra Celery Other Services Django. STORAGE VS. COMPUTING • Storage:](https://reader035.fdocuments.in/reader035/viewer/2022070819/5f1aedff73924575bc4f3307/html5/thumbnails/73.jpg)
SOURCE CONTROL
Master
Live
Direct
![Page 74: SCALING INSTAGRAM INFRA › system › files › presentation-slides › qcon20… · RabbitMQ PostgreSQL Cassandra Celery Other Services Django. STORAGE VS. COMPUTING • Storage:](https://reader035.fdocuments.in/reader035/viewer/2022070819/5f1aedff73924575bc4f3307/html5/thumbnails/74.jpg)
SOURCE CONTROL
With branches
• Context switching
• Code sync/merge overhead
• Surprises
• Refactor/major upgrade
• Performance tracking harder
![Page 75: SCALING INSTAGRAM INFRA › system › files › presentation-slides › qcon20… · RabbitMQ PostgreSQL Cassandra Celery Other Services Django. STORAGE VS. COMPUTING • Storage:](https://reader035.fdocuments.in/reader035/viewer/2022070819/5f1aedff73924575bc4f3307/html5/thumbnails/75.jpg)
SOURCE CONTROL
Master
Live
Direct
![Page 76: SCALING INSTAGRAM INFRA › system › files › presentation-slides › qcon20… · RabbitMQ PostgreSQL Cassandra Celery Other Services Django. STORAGE VS. COMPUTING • Storage:](https://reader035.fdocuments.in/reader035/viewer/2022070819/5f1aedff73924575bc4f3307/html5/thumbnails/76.jpg)
SOURCE CONTROL
Master Live Direct
![Page 77: SCALING INSTAGRAM INFRA › system › files › presentation-slides › qcon20… · RabbitMQ PostgreSQL Cassandra Celery Other Services Django. STORAGE VS. COMPUTING • Storage:](https://reader035.fdocuments.in/reader035/viewer/2022070819/5f1aedff73924575bc4f3307/html5/thumbnails/77.jpg)
SOURCE CONTROL
No branches
• Continuous integration
• Collaborate easily
• Fast bisect and revert
• Continuous performance monitoring
![Page 78: SCALING INSTAGRAM INFRA › system › files › presentation-slides › qcon20… · RabbitMQ PostgreSQL Cassandra Celery Other Services Django. STORAGE VS. COMPUTING • Storage:](https://reader035.fdocuments.in/reader035/viewer/2022070819/5f1aedff73924575bc4f3307/html5/thumbnails/78.jpg)
FEATURE LAUNCH
Engineers
Employees
Dogfooder
Some demographics
World
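The launch stages read as progressively wider audiences. A minimal gating sketch under that assumption follows; the stage identifiers, the `is_enabled` helper, and the idea that each stage includes all earlier ones are illustrative choices, not details from the talk.

```python
# Audiences in rollout order, from narrowest to widest.
STAGES = ["engineers", "employees", "dogfooders", "some_demographics", "world"]


def audiences_enabled(current_stage):
    """Every audience up to and including the current stage sees the feature."""
    return set(STAGES[: STAGES.index(current_stage) + 1])


def is_enabled(user_groups, current_stage):
    """True if the user belongs to at least one audience that has the feature."""
    return bool(audiences_enabled(current_stage) & set(user_groups))


employee = {"employees", "world"}
regular_user = {"world"}

employee_at_engineers = is_enabled(employee, "engineers")
employee_at_employees = is_enabled(employee, "employees")
regular_at_demographics = is_enabled(regular_user, "some_demographics")
regular_at_world = is_enabled(regular_user, "world")
```

Gating by audience in code lets a feature ship on the main line while it is still hidden from most users.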
![Page 79: SCALING INSTAGRAM INFRA › system › files › presentation-slides › qcon20… · RabbitMQ PostgreSQL Cassandra Celery Other Services Django. STORAGE VS. COMPUTING • Storage:](https://reader035.fdocuments.in/reader035/viewer/2022070819/5f1aedff73924575bc4f3307/html5/thumbnails/79.jpg)
FEATURE LOAD TEST
![Page 80: SCALING INSTAGRAM INFRA › system › files › presentation-slides › qcon20… · RabbitMQ PostgreSQL Cassandra Celery Other Services Django. STORAGE VS. COMPUTING • Storage:](https://reader035.fdocuments.in/reader035/viewer/2022070819/5f1aedff73924575bc4f3307/html5/thumbnails/80.jpg)
Once a day / diff / week?!!
40-60 rollouts per day
![Page 81: SCALING INSTAGRAM INFRA › system › files › presentation-slides › qcon20… · RabbitMQ PostgreSQL Cassandra Celery Other Services Django. STORAGE VS. COMPUTING • Storage:](https://reader035.fdocuments.in/reader035/viewer/2022070819/5f1aedff73924575bc4f3307/html5/thumbnails/81.jpg)
CHECKS AND BALANCES
Code review, unittest → Code accepted, committed → Canary → To the Wild
![Page 82: SCALING INSTAGRAM INFRA › system › files › presentation-slides › qcon20… · RabbitMQ PostgreSQL Cassandra Celery Other Services Django. STORAGE VS. COMPUTING • Storage:](https://reader035.fdocuments.in/reader035/viewer/2022070819/5f1aedff73924575bc4f3307/html5/thumbnails/82.jpg)
![Page 83: SCALING INSTAGRAM INFRA › system › files › presentation-slides › qcon20… · RabbitMQ PostgreSQL Cassandra Celery Other Services Django. STORAGE VS. COMPUTING • Storage:](https://reader035.fdocuments.in/reader035/viewer/2022070819/5f1aedff73924575bc4f3307/html5/thumbnails/83.jpg)
SCALING MEANS
Scale out
Scale up
Scale dev team
![Page 84: SCALING INSTAGRAM INFRA › system › files › presentation-slides › qcon20… · RabbitMQ PostgreSQL Cassandra Celery Other Services Django. STORAGE VS. COMPUTING • Storage:](https://reader035.fdocuments.in/reader035/viewer/2022070819/5f1aedff73924575bc4f3307/html5/thumbnails/84.jpg)
![Page 85: SCALING INSTAGRAM INFRA › system › files › presentation-slides › qcon20… · RabbitMQ PostgreSQL Cassandra Celery Other Services Django. STORAGE VS. COMPUTING • Storage:](https://reader035.fdocuments.in/reader035/viewer/2022070819/5f1aedff73924575bc4f3307/html5/thumbnails/85.jpg)
TAKEAWAYS
Scaling is everybody’s responsibility
Scaling is a continuous effort
Scaling is multi-dimensional
![Page 86: SCALING INSTAGRAM INFRA › system › files › presentation-slides › qcon20… · RabbitMQ PostgreSQL Cassandra Celery Other Services Django. STORAGE VS. COMPUTING • Storage:](https://reader035.fdocuments.in/reader035/viewer/2022070819/5f1aedff73924575bc4f3307/html5/thumbnails/86.jpg)
QUESTIONS?
![Page 87: SCALING INSTAGRAM INFRA › system › files › presentation-slides › qcon20… · RabbitMQ PostgreSQL Cassandra Celery Other Services Django. STORAGE VS. COMPUTING • Storage:](https://reader035.fdocuments.in/reader035/viewer/2022070819/5f1aedff73924575bc4f3307/html5/thumbnails/87.jpg)