Scalability meetup-05-2013-presentation

Post on 15-Jan-2015


www.nimbl3.com


How to scale your platform when you are not Facebook

2013 Carlos Herrera

caherrerapa@gmail.com

- CTO - Binumi, former Head of Product and IT - Lazada TH (Rocket Internet)
- MBA and B. Computer Science
- 7 countries
- 10 years in tech. Contributed to some OSS
- Product Management, Software dev, ethical hacking, corporate governance
- Coca Cola FEMSA, Infosys, among others

Who am I?

Be humble
“Normal” scale platforms are different.
- Paying customers vs. “liking” a cat
- Sensitivity*
- Structured information vs. unstructured
- Concurrency
- Reach across regions
- GBs vs. several TB or PB

Why shouldn’t I do exactly as Facebook does?

You scale a platform, not a language*
Language selection drivers
- Problem
- Maturity
- Community
- Talent pool

- It is nice to experiment, but focus on the problem (yes, I’m talking about you, MongoDB guys)
- Different solutions / Creativity

No silver bullet

Scaling in 7 steps

How does it start?

Everything on the same server

Reading the DB all the time

JS, CSS, and images served from your server

1. Define, measure, benchmark

a. Baremetal vs Cloud (Amazon)
- Physical vs Virtualized
- Powerful vs Flexible
- Normal skills vs Hard core skills
- Support vs On your own
- Outages?
- Avoid anything with cPanel, for God’s sake
- International gateway

b. Measure
- Munin
- Icinga
- StatsD
- New Relic $
- Pingdom $

c. Benchmark
- Siege
- JMeter
- Sysbench

Cache?
- Set content in memory/drive
- Faster than DB
- Key = Value
- TTL
- Memcached (> 1 server), APC (1 server)
- Other common use: sessions
- Extremely important in cloud

Life without cache
- Constant hits to DB
- DB easily becomes the bottleneck
- Wasting $$ you paid for memory

2. Cache system

2. Cache system
a. Customer goes to your website’s homepage
b. The page requested needs to load a list of products
c. Is the list of products in the cache by the key “XXXXX”?
   a. Yes. Retrieve from cache using key “XXXXX”, use it and return the page.
   b. No. Go to the database, perform the query and save it in cache for later, use the info and return the page.

Introducing Cache
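
The lookup flow on this slide (check the cache by key, fall back to the database on a miss, store the result with a TTL) can be sketched roughly as follows. This is only an illustration: `TTLCache` is a hypothetical in-process stand-in for Memcached or APC, and the key name `homepage:products` and the `query_db` callback are assumptions, not part of the talk.

```python
import time
from typing import Any, Callable, Dict, Tuple


class TTLCache:
    """Tiny in-process stand-in for Memcached/APC (hypothetical, illustration only)."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._store: Dict[str, Tuple[float, Any]] = {}

    def get(self, key: str) -> Any:
        entry = self._store.get(key)
        if entry is None:
            return None  # never cached
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._store[key]  # expired entries behave like a miss
            return None
        return value

    def set(self, key: str, value: Any, ttl: float) -> None:
        self._store[key] = (self._clock() + ttl, value)


def get_products(cache: TTLCache, query_db: Callable[[], list], ttl: float = 60.0) -> list:
    """Return the product list, hitting the database only on a cache miss."""
    products = cache.get("homepage:products")  # assumed key name
    if products is None:
        products = query_db()  # the expensive query runs only here
        cache.set("homepage:products", products, ttl)
    return products
```

With a real Memcached client the shape stays the same: a `get`, and on a miss a query followed by a `set` with an expiry. The TTL is the trade-off knob: a longer TTL means fewer database hits but staler product lists.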

Content Delivery Network
- Static files (CSS, JS, HTML, JPG)
- Amazon, Rackspace, Cloudflare, Akamai

CDN
- Lower page load time
- Lower server load (important in cloud)
- Inexpensive
- More automation

3. Content Delivery Network
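
One common way to move static files to a CDN is to generate asset URLs through a small helper, so templates never hard-code the host. The sketch below assumes a hypothetical CDN hostname (`cdn.example.com`) and uses the static file types listed on the slide; the function name `asset_url` is made up for illustration.

```python
import posixpath

# Static file types named on the slide; anything else stays on the web node.
STATIC_EXTENSIONS = {".css", ".js", ".html", ".jpg"}


def asset_url(path: str, cdn_host: str = "cdn.example.com") -> str:
    """Point static assets at the CDN host and leave dynamic paths untouched."""
    extension = posixpath.splitext(path)[1].lower()
    if extension in STATIC_EXTENSIONS:
        return f"https://{cdn_host}/{path.lstrip('/')}"
    return path
```

Keeping the host in one place also makes it easy to switch providers or to disable the CDN in a testing environment.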

4. Decouple and revisit your web node

Separation of concerns
- Web nodes apart from DB
- More visibility
- Easier to scale

- Horizontal
- Vertical

Evaluate Nginx, Puma (RoR), PHP-FPM

5. Adding web nodes / load balancing

Load Balancer
- Sends traffic to internal web nodes
- Easier to pay: ELB*, Rackspace
- No budget? Nginx, HAProxy
- Location depends on whether you pay for it or build it*

www.yourapp.com

web01.yourapp.com

web02.yourapp.com

Beware of session management

If you don’t use sticky sessions, be careful: you can end up not remembering your customers, because the load balancer sends each request to the least busy node.

5. Adding web nodes / load balancing
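
The session problem can be reproduced in a few lines. In this sketch each web node keeps sessions in its own memory; a balancer that spreads requests (plain round-robin here, used as a simple stand-in for "least busy") loses the login, while a sticky balancer that hashes the session ID keeps sending the customer to the same node. All class names are hypothetical.

```python
import hashlib
import itertools
from typing import Dict, List, Optional


class WebNode:
    """A web node that stores sessions in local memory only."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.sessions: Dict[str, str] = {}

    def login(self, session_id: str, user: str) -> None:
        self.sessions[session_id] = user

    def whoami(self, session_id: str) -> Optional[str]:
        return self.sessions.get(session_id)


class RoundRobinBalancer:
    """Ignores the session and hands out nodes in turn."""

    def __init__(self, nodes: List[WebNode]) -> None:
        self._cycle = itertools.cycle(nodes)

    def pick(self, session_id: str) -> WebNode:
        return next(self._cycle)


class StickyBalancer:
    """Always maps the same session ID to the same node."""

    def __init__(self, nodes: List[WebNode]) -> None:
        self._nodes = list(nodes)

    def pick(self, session_id: str) -> WebNode:
        digest = hashlib.sha256(session_id.encode("utf-8")).digest()
        index = int.from_bytes(digest[:8], "big") % len(self._nodes)
        return self._nodes[index]
```

Hash-based stickiness breaks down when nodes are added or removed, which is one reason the cache slide lists sessions as a common use of Memcached: a shared session store lets any node serve any customer.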

6. Scaling the database: vertical

Vertical
- Bigger server
- RDS (downtime)
- Easier
- But there is a limit

7. Scaling the database: horizontal

Master / Slave
- More effort for the sysadmin

Master
- Important reads
- Always writes

Slave
- Only reads
- Consider delay

Enough for 1000 RPM per node*
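
At the application level, the master/slave split usually shows up as a routing decision per query. A minimal sketch, assuming hypothetical server names and a deliberately naive rule (only statements starting with SELECT count as reads): writes and "important" reads go to the master, everything else rotates across the slaves.

```python
import itertools
from typing import Sequence


class ReplicatedRouter:
    """Chooses a database server for each statement (illustrative only)."""

    def __init__(self, master: str, slaves: Sequence[str]) -> None:
        self._master = master
        self._slaves = itertools.cycle(slaves) if slaves else None

    def route(self, sql: str, important: bool = False) -> str:
        words = sql.split()
        is_read = bool(words) and words[0].upper() == "SELECT"
        # Slaves can lag behind the master, so reads that must see the
        # latest write are flagged as important and stay on the master.
        if is_read and not important and self._slaves is not None:
            return next(self._slaves)
        return self._master
```

The `important` flag is where the "consider delay" bullet matters: reading an order right after writing it from a lagging slave can show stale data.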

Need more?
- Separate memcache servers
- Sharding
- RAIDs
- If you are doing heavy search, add Apache Solr
- Revisit your problem
- Do you need Hadoop?
- Can you use NoSQL (Cassandra, Mongo)?
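
Sharding, mentioned in the list above, means splitting rows across several databases by some key. The simplest scheme is a stable hash of the key modulo the number of shards; the sketch below assumes a made-up key format (`customer:<id>`) and is not taken from the talk.

```python
import hashlib


def shard_for(key: str, shard_count: int) -> int:
    """Map a key to a shard index in [0, shard_count) using a stable hash."""
    if shard_count <= 0:
        raise ValueError("shard_count must be positive")
    # Python's built-in hash() is randomized per process for strings,
    # so a cryptographic digest is used to get the same answer everywhere.
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count
```

A caveat worth knowing before adopting this: with plain modulo hashing, changing `shard_count` moves most keys to a different shard, so resharding needs a migration plan.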

Something is not working?

Revisit your database design
- Indexes, anyone?

Revisit your app design
- Pessimistic locks?
- Lack of good algorithms?
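
To check whether a missing index is the problem, ask the database for its query plan before and after adding one. The sketch below uses SQLite from the Python standard library purely because it runs anywhere; the table and index names are invented, and MySQL or PostgreSQL offer the same idea through their own `EXPLAIN` output.

```python
import sqlite3


def query_plan(conn: sqlite3.Connection, sql: str, params: tuple = ()) -> str:
    """Return SQLite's human-readable plan for a query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[3] for row in rows)  # column 3 holds the detail text


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, category TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO products (category, name) VALUES (?, ?)",
    [(f"cat{i % 50}", f"product {i}") for i in range(5000)],
)

lookup = "SELECT name FROM products WHERE category = ?"

# Without an index the plan is a full table scan.
plan_before = query_plan(conn, lookup, ("cat7",))

conn.execute("CREATE INDEX idx_products_category ON products (category)")

# With the index the plan becomes an index search.
plan_after = query_plan(conn, lookup, ("cat7",))
```

The exact plan wording varies between SQLite versions, but the first plan reports a scan of `products` and the second names `idx_products_category`.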

- Bottleneck => web nodes
- We added one web and db
- RPM
- Used historical performance data
- Rackspace (no choice)
- Reduce loaded elements on frontpage
- Standby - Monitor APDEX
- We did well. No downtimes

TV and BTS at Lazada
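
The Apdex score monitored during this campaign is a standard formula: requests at or under a target time T count as satisfied, requests between T and 4T count as tolerating (half weight), and anything slower counts as frustrated. A small sketch of the calculation, with the threshold and sample values chosen only for illustration:

```python
from typing import Iterable


def apdex(response_times: Iterable[float], threshold: float) -> float:
    """Apdex = (satisfied + tolerating / 2) / total samples."""
    satisfied = tolerating = total = 0
    for seconds in response_times:
        total += 1
        if seconds <= threshold:
            satisfied += 1
        elif seconds <= 4 * threshold:
            tolerating += 1
        # anything slower than 4 * threshold is frustrated and adds nothing
    if total == 0:
        raise ValueError("at least one sample is required")
    return (satisfied + tolerating / 2) / total


score = apdex([0.2, 0.4, 1.0, 3.0], threshold=0.5)  # 2 satisfied, 1 tolerating, 1 frustrated
```

For the four samples above the score is (2 + 0.5) / 4 = 0.625. Tools such as New Relic report this number directly, so the function is mainly useful for understanding what the dashboard is showing.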

- 20-25% of the traffic just to one page
- That page was on a small server we controlled in TH
- 2 hours handling our own end of the world

12/12/12 Campaign

- Tech communication with business
- Monitoring
- VPS
- Static files
- Cache

What could be done better?

Solution
- Delay traffic / Business guys
- EC2
- CloudFront
- Cache and fix code
- New Relic

- 1M video clips
- 20K+ video clips per month
- Heavy search
- Streaming
- Fast preview. Faster than Animoto or WeVideo
- Rendering

Some lessons learnt

7 DO’s

1. No fear

2. Iterate fast

3. Decouple and create interfaces

4. Track and Monitor

5. Version and manage branches

6. Use the right tools

…”mmm I will need a bigger drill”

7. Code conventions / Process

7 DON’Ts

1. Don’t do “temporary fixes”. Bad code is like karma

2. “DRY”*

3. Don’t modify “Live” files or databases directly

4. Don’t forget the testing environment

5. Don’t use self-managed servers

6. Don’t be the last to know

7. Don’t scale too soon; just be prepared

Interesting projects

- Apache Mesos (Distributed apps)
- Cassandra (Database)
- Kestrel (Queue)
- BERT (RPC)
- Apache Hadoop
- Apache ZooKeeper
- HipHopVM

Do you code in PHP?
I’m hiring
caherrerapa@gmail

“Scaling is like replacing all components on a car while driving it at 100 mph.”

“Initial scaling won’t be glamorous”