Scalable Architectures 101

MNPHP, Feb 3, 2011

Mike Willbanks
Blog: http://blog.digitalstruct.com
Twitter: mwillbanks
IRC: lubs on freenode

Scalability?

Your application is growing, your systems are slowing and growth is inevitable...

● Where do we go from here?
  ● Load Balancing
  ● Web Servers
  ● Database Servers
  ● Cache Servers
  ● Job Servers
  ● DNS Servers
  ● CDN Servers
  ● Front-End Performance

The Beginning...

Single Server Syndrome

● One Server Many Functions

● Web Server, Database Server, Cache Server, Job Server, DNS Server, Mail Server....

● How we know it's time

● iostat, cpu load, overall degradation

The Next Step...

Single Separation Syndrome

● Separation of Web and Database
  ● Fixes the main disk I/O bottleneck.
  ● However, the web server still can't handle our current I/O, CPU load or number of requests.

Load Balancing

Load Balancing Our Environment

Several Options

● DNS Rotation (Little to No Cost)
  ● Not very reliable, but works on a small scale.
● Software Based (Commodity Server Cost)
  ● HAProxy, Pound, Varnish, Squid, Wackamole, Perlbal, Web Server Proxy...
● Hardware Based (High Cost Appliance)
  ● Several vendors, depending on need.
    – A10, F5, etc.

Routing Types of Load Balancers

● Round Robin
● Static
● Least Connections
● Source
● IP
● Basic Authentication
● URI
● URI Parameter
● Header
● Cookie
● Regular Expression
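
Three of the routing types above can be sketched in a few lines. This is only an illustration of the selection logic, not how any particular load balancer implements it; the class names, the `source_hash` helper and the backend names are made up for the example.

```python
import hashlib
from itertools import cycle


class RoundRobin:
    """Hand out backends in a fixed rotation."""

    def __init__(self, backends):
        self._rotation = cycle(backends)

    def pick(self):
        return next(self._rotation)


class LeastConnections:
    """Pick the backend with the fewest open connections."""

    def __init__(self, backends):
        self.active = {backend: 0 for backend in backends}

    def pick(self):
        backend = min(self.active, key=self.active.get)
        self.active[backend] += 1
        return backend

    def release(self, backend):
        # Call when the connection to the backend closes.
        self.active[backend] -= 1


def source_hash(backends, client_ip):
    """Map a client IP to the same backend on every request."""
    digest = hashlib.md5(client_ip.encode()).hexdigest()
    return backends[int(digest, 16) % len(backends)]
```

Round robin needs no state beyond the rotation, least connections needs to hear about closed connections, and source hashing gives sticky routing without cookies.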

Open Source Software Options

● Out of the many options we will focus on three:
  ● HAProxy – By and large one of the most popular.
  ● Pound – Said to be great for medium traffic sites.
  ● Varnish – A caching solution that also does load balancing.

HAProxy

● Pros
  ● Extremely full featured
  ● Very well known
  ● Handles just about every type of routing
  ● Several examples online
  ● Has a web-based GUI
● Cons
  ● No native SSL support (use Stunnel)
  ● Setup can be complex and take a lot of time

Sample HAProxy Configuration

global
    log 127.0.0.1 local0
    log 127.0.0.1 local1 notice
    maxconn 4096
    user haproxy
    group haproxy
    daemon

defaults
    log global
    mode http
    option httplog
    option dontlognull
    retries 3
    option redispatch
    maxconn 2000
    contimeout 5000
    clitimeout 50000
    srvtimeout 50000

listen localhost 0.0.0.0:80
    option httpchk GET /
    balance roundrobin
    cookie SERVERID
    server serv1 0.0.0.0:8080 check inter 2000 rise 2 fall 5
    server serv2 0.0.0.0:8080 check inter 2000 rise 2 fall 5
    option httpclose
    stats enable
    stats uri /lb?stats
    stats realm haproxy
    stats auth test:test

Pound

● Pros
  ● chroot support
  ● Native SSL support
  ● Insanely simple setup
  ● Supports virtually all types of routing
  ● Many online tutorials
● Cons
  ● No web-based statistics (use poundctl)
  ● HAProxy can scale more...

Sample Pound Configuration

User "www-data"
Group "www-data"
LogLevel 1
Alive 30
Control "/var/run/pound/poundctl.socket"

ListenHTTP
    Address 127.0.0.1
    Port 80
    xHTTP 0

    Service
        BackEnd
            Address 127.0.0.1
            Port 8080
        End
        BackEnd
            Address 127.0.0.1
            Port 8080
        End
    End
End

Varnish

● Pros
  ● Supports front-end caching
  ● Fairly simple setup
  ● Extremely well known
  ● Many online tutorials
  ● Large suite of tools (varnishstat, varnishtop, varnishlog, varnishreplay, varnishncsa)
● Cons
  ● No native SSL support (use Pound or Stunnel)
  ● If you want a WebGUI you must PAY

Sample Varnish Configuration

backend default1 {
    .host = "127.0.0.1";
    .port = "8080";
    .probe = {
        .url = "/";
        .interval = 5s;
        .timeout = 1s;
        .window = 5;
        .threshold = 3;
    }
}

backend default2 {
    .host = "127.0.0.1";
    .port = "8080";
    .probe = {
        .url = "/";
        .interval = 5s;
        .timeout = 1s;
        .window = 5;
        .threshold = 3;
    }
}

director default round-robin {
    { .backend = default1; }
    { .backend = default2; }
}

sub vcl_recv {
    if (req.http.host ~ "^127.0.0.1$") {
        set req.backend = default;
    }
}

What We Need to Remember

● Web Servers
  ● One always needs to be available
  ● Don't use SSL on the web server level!
● Headers
  ● Pass a header saying whether SSL is on or not
  ● Client IP is likely in X-Forwarded-For
  ● If using Virtual Hosts, pass the Host header
● Sessions
  ● Need a solution if not using sticky routing

Web Servers

Several Options

● Apache
● IIS
● Nginx
● Lighttpd
● etc.

Configuration

● Server name should be the same on all servers
  ● Make a server alias so you can reach individual servers w/o load balancing
● Each configuration SHOULD or MUST be the same.
● Client IP will likely be in X-Forwarded-For.
● SSL status will not be in $_SERVER['HTTPS']; look for it in an HTTP_ header instead.
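
A minimal sketch of reading those forwarded headers behind a balancer, written in Python rather than PHP. The `X-Forwarded-Proto` header name is an assumption (the slides only say the SSL flag arrives in some HTTP header), and both helper functions are hypothetical. Only trust these headers when the request really came from your own load balancer.

```python
def client_ip(headers, remote_addr):
    """Return the original client IP when a proxy forwarded the request."""
    forwarded = headers.get("X-Forwarded-For", "")
    # The header can hold a chain "client, proxy1, proxy2";
    # the first entry is the original client.
    first = forwarded.split(",")[0].strip()
    return first or remote_addr


def is_https(headers):
    """Report whether the balancer says it terminated SSL for this request."""
    # Assumed header name; use whatever your balancer is configured to send.
    return headers.get("X-Forwarded-Proto", "").lower() == "https"
```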

What We Need to Remember

● Files
  ● All web servers need our files.
  ● Static content could be tagged in version control.
  ● Static content may need a file server / CDN / etc.
  ● User Generated content on NFS mount or served from the cloud or a CDN.
● Sessions
  ● All web servers need access to our sessions.
  ● Remember disk is slow and the database will be a bottleneck. How about distributed caching?

Other Thoughts

● Running PHP on your web server may be a resource hog; you may want to offload static content requests to nginx, lighttpd or some other lightweight web server.
● Running a proxy to your main web servers works great for hardworking processes, while static content is served from the lightweight server.

Database Servers

Where We All Start

Single Database Server

● Lots of options and steps as we move forward.

Replication

Single Master, Single Slave

● Write code that can write to the master and read from the slave.

● Exception: Be smart, don't write to the master and read from the slave on the table you just wrote to.
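
One way to follow that rule in code is a small router that sends writes to the master and pins reads of a recently written table to the master for a short window. This is a sketch under assumptions: the `ReplicatedRouter` class, the two-second window and the string connection handles are all hypothetical stand-ins.

```python
import time


class ReplicatedRouter:
    """Choose a connection: writes go to the master, reads to the slave."""

    def __init__(self, master, slave, pin_seconds=2.0, clock=time.monotonic):
        self.master = master
        self.slave = slave
        self.pin_seconds = pin_seconds  # assumed upper bound on replication lag
        self.clock = clock
        self._last_write = {}  # table name -> time of the last write

    def for_write(self, table):
        self._last_write[table] = self.clock()
        return self.master

    def for_read(self, table):
        written = self._last_write.get(table)
        # Read from the master if this table was written to moments ago.
        if written is not None and self.clock() - written < self.pin_seconds:
            return self.master
        return self.slave
```

The write timestamps here live in one process; with several web servers they would have to be tracked per request or per session instead.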

Multiple Slaves

Single Master, Multiple Slaves

● It is a great time to start to implement connection pooling.

Multiple Masters

Multiple Master, Multiple Slaves

● Do NOT write to both masters at once with MySQL!

● Be warned: auto-increment settings should now change so that IDs generated on each master do not conflict.
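
The usual way to avoid those conflicts in MySQL is to give every master the same `auto_increment_increment` and a different `auto_increment_offset`. The simulation below only illustrates the resulting ID sequences; the `id_stream` helper and the two-master setup are assumptions for the example.

```python
from itertools import count, islice


def id_stream(offset, increment):
    """Yield the IDs a master would assign with the given offset and step."""
    return count(start=offset, step=increment)


# Two masters, so the step is 2 and each master gets its own offset.
master_a = list(islice(id_stream(offset=1, increment=2), 5))  # odd IDs
master_b = list(islice(id_stream(offset=2, increment=2), 5))  # even IDs
```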

Partitioning

Segmenting your Data

● Vertical Partitioning
  ● Move less accessed columns, large data columns and columns unlikely to appear in the WHERE clause to other tables.
● Horizontal Partitioning
  ● Done by moving rows into different tables.
    – Based on Range, Date, User or Interlaced
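
A rough sketch of two ways to pick the table for a row when partitioning horizontally. The table naming, the range boundaries and the reading of "interlaced" as a modulo on the ID are all assumptions made for illustration.

```python
import bisect

# Hypothetical range boundaries for user IDs.
RANGE_BOUNDS = [1_000_000, 2_000_000]


def table_by_range(user_id, bounds=RANGE_BOUNDS):
    """Range partitioning: IDs below bounds[0] go to users_0, and so on."""
    return f"users_{bisect.bisect_right(bounds, user_id)}"


def table_by_interlace(user_id, tables=4):
    """Interlaced partitioning: spread consecutive IDs across the tables."""
    return f"users_{user_id % tables}"
```

Range partitioning keeps neighbouring IDs together, which suits date or archive style queries; interlacing spreads load evenly but makes adding a table more disruptive.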

What We Need to Remember

● Replication
  ● There may be a lag!
  ● All reports / read queries should go here
  ● Don't read here directly after a write
    – Transactions / Lag / etc.
● Sessions
  ● Never store sessions in the DB
    – Large binlogs, garbage collection causes slow queries, queue may fill up and cause a crash or max connections.

Cache Servers

(not full page)

Caching

“Caching is imperative in scaling and performance”

● Single Server
  – Shared Memory: APC / Xcache / etc
  – File Based: Files / Sqlite / etc
  – Not highly scalable, great for configuration files.
● Distributed
  – Memcached, Redis, etc.
  – Set up consistent hashing.
● Do not cache what cannot be re-created.

Caching

In The Beginning

● Single Caching Server

● Start to cache fetches, invalidate cache on write and write new cache, always reading from the cache.
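
The read and write paths described above might look like the sketch below. Plain dictionaries stand in for the cache server and the database, and the `CachedUserStore` class and key format are hypothetical.

```python
class CachedUserStore:
    """Read through the cache; refresh the cache entry on every write."""

    def __init__(self, cache, database):
        self.cache = cache        # stands in for a cache server client
        self.database = database  # stands in for the database

    def fetch(self, user_id):
        key = f"user:{user_id}"
        if key in self.cache:
            return self.cache[key]
        row = self.database[user_id]  # cache miss: go to the database
        self.cache[key] = row
        return row

    def save(self, user_id, row):
        key = f"user:{user_id}"
        self.database[user_id] = row
        self.cache.pop(key, None)  # invalidate the stale entry
        self.cache[key] = row      # write the new cache entry
```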

Distributed Caching

Distributed Mania

● Write based on consistent hashing (hash of a key that you are writing) 

● Server depends on the hash.

● Hint – use the memcached pecl extension.
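
To show what the hashing does, here is a minimal hash ring. It is a sketch of the general technique, not the algorithm used by the memcached extension; the `HashRing` class, the MD5 choice and the 100 points per server are assumptions.

```python
import bisect
import hashlib


def _hash(value):
    return int(hashlib.md5(value.encode()).hexdigest(), 16)


class HashRing:
    """Place servers on a ring; a key belongs to the next server clockwise."""

    def __init__(self, servers, replicas=100):
        self.replicas = replicas  # points per server, smooths the distribution
        self._points = []         # sorted hash positions on the ring
        self._owner = {}          # hash position -> server
        for server in servers:
            self.add(server)

    def add(self, server):
        for i in range(self.replicas):
            point = _hash(f"{server}#{i}")
            bisect.insort(self._points, point)
            self._owner[point] = server

    def remove(self, server):
        for i in range(self.replicas):
            point = _hash(f"{server}#{i}")
            self._points.remove(point)
            del self._owner[point]

    def server_for(self, key):
        # Walk clockwise to the first point at or after the key's hash,
        # wrapping around to the start of the ring if needed.
        index = bisect.bisect_left(self._points, _hash(key)) % len(self._points)
        return self._owner[self._points[index]]
```

When a server is removed, only the keys it owned move to other servers; everything else stays where it was, which is the reason to prefer this over a plain modulo.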

The Read / Write Process

In the most simple form...

What We Need to Remember

● Replicated or not...
● Elasticity
  ● Consistent hashing – cannot add or remove w/o losing data
● Sessions
  ● Store me here... please please please!
● Memory Caches
  ● Durability - If it fails, it's gone!
  ● Ensure dedicated memory!
  ● If you run out of memory, does it remove an old and add the new or not allow anything to come in?
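
One possible answer to that last question is least-recently-used eviction: when the cache is full, the oldest untouched entry is dropped to make room. The sketch below illustrates the policy only; the `LRUCache` class and its item-count limit are assumptions, and a real cache server limits by memory rather than by item count.

```python
from collections import OrderedDict


class LRUCache:
    """Keep at most max_items entries, evicting the least recently used."""

    def __init__(self, max_items):
        self.max_items = max_items
        self._items = OrderedDict()

    def get(self, key, default=None):
        if key not in self._items:
            return default
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def set(self, key, value):
        self._items[key] = value
        self._items.move_to_end(key)
        if len(self._items) > self.max_items:
            self._items.popitem(last=False)  # evict the least recently used
```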

Job Servers

“Message queues and mailboxes are software-engineering components used for interprocess communication, or for inter-thread communication within the same process. They use a queue for messaging – the passing of control or of content.”

http://en.wikipedia.org/wiki/Message_queue

Messages are Everywhere

What are Message Queues

● A FIFO buffer
● Asynchronous push / pull
● An application framework for sending and receiving messages.
● A way to communicate between applications / systems.
● A way to decouple components.
● A way to offload work.
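
The producer / consumer idea can be tried out in a single process with Python's standard library. This is only a sketch: a real job server sits on the network between separate machines, and the `run_jobs` helper and the `None` sentinel convention are assumptions for the example.

```python
import queue
import threading


def run_jobs(jobs, handler, workers=2):
    """Push jobs onto a FIFO queue and let worker threads consume them."""
    pending = queue.Queue()
    results = []
    lock = threading.Lock()

    def consume():
        while True:
            job = pending.get()
            if job is None:  # sentinel: no more work for this worker
                break
            outcome = handler(job)
            with lock:
                results.append(outcome)

    threads = [threading.Thread(target=consume) for _ in range(workers)]
    for thread in threads:
        thread.start()
    for job in jobs:  # producer side
        pending.put(job)
    for _ in threads:  # one sentinel per worker
        pending.put(None)
    for thread in threads:
        thread.join()
    return results
```

The producer never waits for a job to finish; it only waits for the queue to accept it, which is what lets the web request return quickly.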

Where We All Start

Single Job Server

● Lots of options and steps as we move forward.

[Diagram: Producer, Message Queue Server, Consumer; arrows labeled Queue and Receive]

Distributed Job Servers

Distributed Mania

● Load balance a message queue for scale

● Can continue to create more workers

[Diagram: three Producers, three Message Queue Servers, five Consumers]

Why are Message Queues Useful?

● Asynchronous Processing
● Communication between Applications / Systems
● Image Resizing
● Video Processing
● Sending out Emails
● Auto-Scaling Virtual Instances
● Log Analysis
● The list goes on...

What We Need to Remember

● Replication or not?
● You need to keep your workers running
  ● Supervisord or monit or some other monitoring...
● Don't offload things just to offload
  ● If it needs to be real-time rather than near real-time, this is not a good place for it – however, your boss does not need to know :)

DNS Servers

What to do

● Just about every domain registrar runs DNS
  ● DO NOT RUN YOUR OWN!
● Anycast DNS
  ● Anycast is a network addressing and routing scheme whereby data is routed to the "nearest" or "best" destination as viewed by the routing topology.
  ● It's sexy, it's sweet and it is FAST!
  ● A “cheaper” provider is DNS Made Easy.
    – Yes the interface is ugly.

What to look for...

● Wildcard support
● Failover / Distributed
● CNAME support
● TXT support
● Name Server support

CDN Servers

Why Use a CDN

● Free your bandwidth
● Free your server from serving basic files
● Distributed servers around the globe

What you need to know

● Origin Pull
  ● Utilizes your own web server and pulls the content and stores it in their nodes.
● PoP Push
  ● You upload the content to something like S3 and it has a CDN on top of it like CloudFront.

What's the best?

● Depends on your need...
  ● Origin Pull is great if you want to maintain all of the content in your web server.
  ● PoP Push is great for storing things like user generated content.

Front-End Performance

Discussion Points

● Tactics
  ● Minification (JavaScript / CSS)
  ● CSS Sprites
  ● GZIP
  ● Cookies are evil
  ● Parallel downloads (using subdomains for serving)
  ● HTTP Expires
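
Three of these tactics are easy to try out in a few lines. This is a sketch only: the helper names and the `static1` / `static2.example.com` hosts are made up, and in practice the web server or CDN usually sets these headers and compresses responses for you.

```python
import gzip
import zlib
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime


def expires_header(now, days=365):
    """Build a far-future Expires value in HTTP date format."""
    return format_datetime(now + timedelta(days=days), usegmt=True)


def asset_host(path, hosts=("static1.example.com", "static2.example.com")):
    """Pin each asset to one subdomain so its URL stays cacheable."""
    return hosts[zlib.crc32(path.encode()) % len(hosts)]


def compress(body):
    """GZIP a response body (bytes in, bytes out)."""
    return gzip.compress(body)


print(expires_header(datetime.now(timezone.utc)))
```

Hashing the path, rather than picking a subdomain at random, keeps each asset on one URL so browsers do not download it twice.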

Discussion Points

● Tools
  ● YSlow
  ● Firebug
  ● Google Page Speed
  ● Google Webmaster Tools

Questions?

Mike Willbanks
Blog: http://blog.digitalstruct.com
Twitter: mwillbanks
IRC: lubs on freenode