Highly scalable architectures

How Twitter deals with its scalability problems

Benjamin Hiltpolt Master Seminar 3

Overview

● Scalability
● Twitter
  ○ Introduction
  ○ Problems
  ○ Solutions
● Summary

Scalability

... is the ability of a system, network, or process to handle a growing amount of work in a capable manner, or its ability to be enlarged to accommodate that growth. [1]

1. André B. Bondi, "Characteristics of scalability and their impact on performance", Proceedings of the 2nd International Workshop on Software and Performance, Ottawa, Ontario, Canada, 2000, ISBN 1-58113-195-X, pp. 195–203

Scalability

● Horizontally (scale out): add more machines
● Vertically (scale up): add resources to a single machine

Scalability challenges

● load scalability
● functional scalability
● administrative scalability
● geographic scalability

Twitter

● microblogging
● tweets: 140 chars
● 500M registered users
● 1.6 billion searches per day
● 90M tweets per day (TPD)

Scalability problems

● 2007: 5 d 23 h of downtime (~2%)
● FIFA World Cup 2010: service rejections ~10-20%
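The ~2% figure can be checked with a quick calculation. This is only a sketch, assuming the 5 d 23 h is measured against a full 365-day year (the slide does not state the reference period):

```python
# Downtime reported on the slide for 2007: 5 days 23 hours.
downtime_hours = 5 * 24 + 23      # 143 hours
hours_per_year = 365 * 24         # 8760 hours (assumption: full non-leap year)

downtime_fraction = downtime_hours / hours_per_year
availability = 1 - downtime_fraction

print(f"downtime:     {downtime_fraction:.2%}")   # 1.63%
print(f"availability: {availability:.2%}")        # 98.37%
```

Under that assumption the downtime comes to about 1.6%, which the slide rounds to ~2%.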

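Scalability problems

● June 2009: Twitpocalypse (tweet IDs exceeded the signed 32-bit integer range)
● September 2009: second Twitpocalypse (tweet IDs exceeded the unsigned 32-bit integer range)
● solution: Snowflake (64-bit IDs)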
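The Twitpocalypse incidents are commonly described as tweet IDs outgrowing 32-bit integers, and Snowflake as a service that hands out 64-bit IDs without a central counter. A minimal sketch of that idea follows. It assumes the publicly described layout of 41 timestamp bits, 10 worker bits, and 12 sequence bits; the class name, the epoch constant, and the error handling are illustrative and not Twitter's actual code.

```python
import threading
import time

# Assumed bit layout (not confirmed by the slides).
TIMESTAMP_BITS, WORKER_BITS, SEQUENCE_BITS = 41, 10, 12
EPOCH_MS = 1288834974657  # illustrative custom epoch in milliseconds


class SnowflakeLikeGenerator:
    """Hypothetical generator of time-ordered 64-bit IDs."""

    def __init__(self, worker_id, clock=lambda: int(time.time() * 1000)):
        if not 0 <= worker_id < (1 << WORKER_BITS):
            raise ValueError("worker_id out of range")
        self.worker_id = worker_id
        self.clock = clock
        self.last_ms = -1
        self.sequence = 0
        self.lock = threading.Lock()

    def next_id(self):
        with self.lock:
            now = self.clock()
            if now < self.last_ms:
                raise RuntimeError("clock moved backwards")
            if now == self.last_ms:
                # Same millisecond: bump the per-worker sequence.
                self.sequence = (self.sequence + 1) & ((1 << SEQUENCE_BITS) - 1)
                if self.sequence == 0:
                    # Sequence exhausted: spin until the next millisecond.
                    while now <= self.last_ms:
                        now = self.clock()
            else:
                self.sequence = 0
            self.last_ms = now
            return (
                ((now - EPOCH_MS) << (WORKER_BITS + SEQUENCE_BITS))
                | (self.worker_id << SEQUENCE_BITS)
                | self.sequence
            )
```

Because the timestamp occupies the high bits, IDs from one worker sort by creation time, and workers never need to coordinate as long as each has a distinct `worker_id`.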

Twitter now

US presidential election 2012
● 31M tweets
● 327,452 tweets per minute (TPM)
● 455,000 retweets of this photo

how do they manage this?
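To put these numbers in perspective, the per-minute figure can be converted to tweets per second and compared with the 90M TPD figure from the earlier Twitter slide. The two figures may come from different dates, so the ratio is only a rough indication:

```python
SECONDS_PER_DAY = 24 * 60 * 60

average_tpd = 90_000_000   # "90M TPD" from the Twitter slide
election_tpm = 327_452     # tweets per minute from this slide

average_tps = average_tpd / SECONDS_PER_DAY
election_tps = election_tpm / 60

print(f"average:  {average_tps:,.0f} tweets/s")    # 1,042
print(f"election: {election_tps:,.0f} tweets/s")   # 5,458
print(f"ratio:    {election_tps / average_tps:.1f}x")  # 5.2x
```

A burst of roughly five times the daily average is the kind of load the rest of the talk is about.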

Redesign

starting October 2010

● do a lot of monitoring
● use open source
● decoupling -> soft launches
● review system

Management

● Puppet
● Review Board
● Murder (BitTorrent)
● Darkmode
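"Soft launches" and "Darkmode" on the Redesign and Management slides both amount to shipping code switched off and enabling it gradually. One way to sketch this is a percentage-based feature switch. The class, method, and feature names below are hypothetical and do not describe Twitter's actual tooling:

```python
import zlib


class FeatureSwitches:
    """Hypothetical in-memory feature switches with percentage rollout."""

    def __init__(self):
        self._rollout = {}  # feature name -> percentage of users (0..100)

    def set_rollout(self, feature, percent):
        if not 0 <= percent <= 100:
            raise ValueError("percent must be between 0 and 100")
        self._rollout[feature] = percent

    def is_enabled(self, feature, user_id):
        # Unknown features are off by default ("dark").
        percent = self._rollout.get(feature, 0)
        # A stable hash keeps each user in the same bucket across requests.
        bucket = zlib.crc32(f"{feature}:{user_id}".encode()) % 100
        return bucket < percent
```

Since a user's bucket never changes, raising the percentage only adds users; nobody who already sees the feature loses it during the rollout.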

API

open API
● single point to optimize
● use the crowd to extend Twitter
● 90% of all calls go through the API

Database

● cache (DB only as backup)
● memcache
● denormalize
● avoid joins
● sharding
  ○ (FlockDB --> MySQL)
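The points on this slide can be combined into one read path: store each user's timeline denormalized (no joins), pick a shard from the user ID, and consult the cache before touching the database. A minimal sketch, in which plain dicts stand in for memcache and the MySQL shards, and all names are hypothetical:

```python
class ShardedTimelineStore:
    """Hypothetical cache-first store with modulo sharding."""

    def __init__(self, shard_count):
        self.shards = [{} for _ in range(shard_count)]  # stand-ins for DB shards
        self.cache = {}                                 # stand-in for memcache
        self.db_reads = 0                               # counts cache misses

    def _shard_for(self, user_id):
        # Simple modulo sharding; real systems often use a lookup table
        # or consistent hashing so shards can be added later.
        return self.shards[user_id % len(self.shards)]

    def write(self, user_id, timeline):
        # The whole timeline is stored under one key, so reads need no join.
        self._shard_for(user_id)[user_id] = list(timeline)
        self.cache.pop(user_id, None)  # invalidate the stale cache entry

    def read(self, user_id):
        if user_id in self.cache:
            return self.cache[user_id]
        self.db_reads += 1
        timeline = self._shard_for(user_id).get(user_id, [])
        self.cache[user_id] = timeline
        return timeline
```

With this layout the database is only hit on a cache miss, which matches the slide's "DB only as backup" idea.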

Ruby on Rails

biggest Rails application

Unicorn (no downtime during deploy)

Rails scales

Ruby doesn't?

Scala on Rails

faster (20%)

JVM: garbage collection

scalable

static types

Queue

first Starling (Ruby)
● mid-2008 queue crashed

now Kestrel (Scala, memcache protocol)
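The point of a queue such as Starling or Kestrel is that the web tier can accept a tweet and return immediately while workers deliver it later. A minimal sketch of that decoupling, using Python's standard `queue.Queue` as a stand-in for the message queue (the function names are hypothetical):

```python
import queue
import threading


def start_worker(jobs, delivered):
    """Start a background thread that drains `jobs` into `delivered`."""

    def run():
        while True:
            tweet = jobs.get()
            try:
                if tweet is None:   # sentinel: stop the worker
                    break
                delivered.append(tweet)  # stand-in for fan-out to followers
            finally:
                jobs.task_done()

    worker = threading.Thread(target=run, daemon=True)
    worker.start()
    return worker


def post_tweet(jobs, text):
    """Enqueue the tweet and return without waiting for delivery."""
    jobs.put(text)
    return "accepted"
```

A caller would create `jobs = queue.Queue()`, start one or more workers, and call `post_tweet(jobs, ...)` from the request path; slow delivery then delays the workers rather than the user's request.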

Abuse

big problem
● e.g. bot harvesting friends

solutions:
● realistic limits
● delete such accounts
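"Realistic limits" can be enforced with a per-account rate limit. The following is a sketch of a fixed-window counter; the class name and the limits used are hypothetical and are not Twitter's actual values:

```python
import time
from collections import defaultdict


class FixedWindowLimiter:
    """Hypothetical per-account limiter: max_actions per window_seconds."""

    def __init__(self, max_actions, window_seconds, clock=time.monotonic):
        self.max_actions = max_actions
        self.window_seconds = window_seconds
        self.clock = clock
        # (account, window index) -> actions seen in that window.
        # Old windows are never purged here; a real limiter would expire them.
        self._counts = defaultdict(int)

    def allow(self, account_id):
        window = int(self.clock() // self.window_seconds)
        key = (account_id, window)
        if self._counts[key] >= self.max_actions:
            return False
        self._counts[key] += 1
        return True
```

For example, `FixedWindowLimiter(max_actions=100, window_seconds=3600)` would let an account follow at most 100 users per hour, which stops a harvesting bot while staying out of the way of ordinary users.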

Lessons

Unexpected problems:
● Twitpocalypse
● UNIX: cron, syslog fail
● most (first) solutions will fail

Lessons

● DRY, KISS
● testing (deploy only if tests pass)
● monitoring (collect everything)
● caching
● avoid synchronous calls

Thank you!

Questions?
