Distribute the workload, PHPTek, Amsterdam, 2011

Post on 12-May-2015

Many services and applications are ill-equipped to handle a sudden rush of popularity, so they either become unavailable or unbearably slow. By taking a page from ant colonies in the wild, whose strength lies in their numbers and their ability to work together towards a common goal, you can achieve greater performance, more redundancy and higher availability, and scale services up and down easily as required. By leveraging systems such as Gearman, Memcache, daemons, message queues, load balancers and more, you too can enter the world of distributed systems and scalability.

Transcript of Distribute the workload, PHPTek, Amsterdam, 2011

Distribute the workload

Helgi Þormar Þorbjörnsson

PHP Tek, Chicago, 26th May 2011

Sunday, 29 May 2011

Who am I?

Helgi

VP of Engineering at Orchestra.io

Developer at PEAR

From Iceland

@h on Twitter

Why Distribute?

Efficiency

Budget

Perception

Efficiency

10 small servers > 1 big

Budget

Spend wisely

Commodity servers

Cloud Computing (EC2)

Perception

Defer intensive processes

Give instant feedback

Users keep on browsing

Ant Colonies

Teamwork

When faced with a problem they will solve the problem as one.

Architect for Distribution

Characteristics

Decoupling

Elasticity

High Availability

Concurrency

Decoupling

[Diagram: an Application made up of DB, API, Cache and FE; successive builds add separate Cache and API boxes]

Elasticity

Cloud Computing

Load Balancing

HAProxy

Nginx

My Favourite

Monitoring

When do I need more servers?

Monitoring needs to be around from the start!

Keep records

Spot trends

Different types

Hardware Performance

Software Performance

Availability

Resourcing

Applications

New Relic

CloudKick

ScoutApp

Nagios

Cacti

Circonus

Automation

Plug into your monitoring

Bringing together Monitoring and Elastic behaviour into one beautiful whole!

Add some intelligence to add / remove servers as needed based on current information.
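A minimal sketch of what "some intelligence" could look like, assuming the monitoring system can hand over recent CPU readings for the pool. The `Sample` type, the `desired_servers` function and every threshold below are hypothetical and chosen only for illustration; the slides do not describe a specific rule.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Sample:
    """One monitoring reading: average CPU load across the pool, 0.0 to 1.0."""
    cpu: float


def desired_servers(current: int, samples: Sequence[Sample],
                    high: float = 0.75, low: float = 0.25,
                    minimum: int = 2, maximum: int = 20) -> int:
    """Return the pool size to aim for, given recent monitoring samples.

    All thresholds and limits are illustrative assumptions.
    """
    if not samples:
        return current  # no data, so make no change
    average = sum(s.cpu for s in samples) / len(samples)
    if average > high:
        target = current + 1
    elif average < low:
        target = current - 1
    else:
        target = current
    # Hard limits keep a misleading metric from growing or shrinking
    # the pool without bound.
    return max(minimum, min(maximum, target))


print(desired_servers(4, [Sample(0.9), Sample(0.8)]))  # busy pool: 5
print(desired_servers(4, [Sample(0.1)]))               # idle pool: 3
```

Moving one server at a time and clamping to fixed limits is a deliberately cautious choice: the automation can react to trends but cannot run away on its own.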

Just make sure it doesn’t turn into...

Skynet!!

High Availability

Get a highly available and resilient setup by following a few of these recommendations

Remember, even Google has outages

What to avoid

Local Sessions

Solution: Store sessions in DB / Memcache
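To illustrate why moving sessions off the local machine matters, here is a small sketch in which two web nodes share one session store. `SharedSessionStore` is a hypothetical stand-in backed by a plain dict so the example runs on its own; in a real setup it would wrap a database or a networked Memcache client.

```python
import uuid


class SharedSessionStore:
    """Stand-in for a networked store (a database or Memcache).

    A dict is used here only so the sketch is self-contained.
    """

    def __init__(self):
        self._data = {}

    def load(self, session_id):
        return dict(self._data.get(session_id, {}))

    def save(self, session_id, values):
        self._data[session_id] = dict(values)


class WebServer:
    """A hypothetical web node that keeps no session state of its own."""

    def __init__(self, name, store):
        self.name = name
        self.store = store

    def login(self, user):
        session_id = uuid.uuid4().hex
        self.store.save(session_id, {"user": user})
        return session_id

    def whoami(self, session_id):
        return self.store.load(session_id).get("user")


store = SharedSessionStore()
web1 = WebServer("web1", store)
web2 = WebServer("web2", store)

sid = web1.login("helgi")
# The next request lands on a different server and still finds the session.
print(web2.whoami(sid))  # helgi
```

Because neither node holds the session itself, the load balancer is free to send each request anywhere, and losing a node does not log anyone out.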

Local Memory

Solution: Networked Memcache

Local Files

Local Uploads

Writing to /tmp

Solution: Store on S3 or a networked FS

Solution: Serve up static files from CDNs

Servers can vanish at any given time

Internal APIs

[Diagram: Application, Internal Storage API, backends S3 / GFS / FS]

[Diagram: Application, Internal DB API, backends MySQL / Mongo / Cache]
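The internal API idea can be sketched as an interface that the application codes against, with the actual backend swapped in behind it. The names `Storage`, `FileStorage`, `MemoryStorage` and `save_avatar` are hypothetical; `MemoryStorage` merely stands in for a remote service such as S3, and no real S3 client is used here.

```python
import os
import tempfile
from abc import ABC, abstractmethod


class Storage(ABC):
    """The internal storage API the application codes against."""

    @abstractmethod
    def put(self, key: str, data: bytes) -> None: ...

    @abstractmethod
    def get(self, key: str) -> bytes: ...


class FileStorage(Storage):
    """Backend that writes to a directory, for example a networked FS mount."""

    def __init__(self, root: str):
        self.root = root

    def _path(self, key: str) -> str:
        return os.path.join(self.root, key)

    def put(self, key, data):
        with open(self._path(key), "wb") as handle:
            handle.write(data)

    def get(self, key):
        with open(self._path(key), "rb") as handle:
            return handle.read()


class MemoryStorage(Storage):
    """In-memory backend, standing in for a remote object store."""

    def __init__(self):
        self._objects = {}

    def put(self, key, data):
        self._objects[key] = data

    def get(self, key):
        return self._objects[key]


def save_avatar(storage: Storage, user: str, image: bytes) -> str:
    """Application code: it never learns which backend it talks to."""
    key = f"avatar-{user}.png"
    storage.put(key, image)
    return key


# The same application code works unchanged against either backend.
for backend in (MemoryStorage(), FileStorage(tempfile.mkdtemp())):
    key = save_avatar(backend, "helgi", b"\x89PNG")
    print(type(backend).__name__, backend.get(key) == b"\x89PNG")
```

The same shape applies to the internal DB API: the application talks to one interface, and MySQL, Mongo or a cache can be moved or replaced behind it.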

Eventually Consistent

CAP Theorem

Consistency

All nodes see the same data at the same time

Availability

Node failures do not prevent survivors from continuing to operate

Partition Tolerance

The system continues to operate despite arbitrary message loss
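As a rough illustration of "eventually consistent", the sketch below lets two replicas accept writes independently and converge once they exchange state. The slides do not say which reconciliation strategy was discussed; a simple "highest version wins" rule is assumed here, and it further assumes that version numbers are unique and globally ordered. The `Replica` class is hypothetical.

```python
class Replica:
    """A node that accepts writes locally and reconciles with peers later."""

    def __init__(self, name):
        self.name = name
        self.data = {}  # key -> (version, value)

    def write(self, key, value, version):
        current = self.data.get(key)
        # Keep the entry with the highest version (assumed unique per write).
        if current is None or version > current[0]:
            self.data[key] = (version, value)

    def read(self, key):
        entry = self.data.get(key)
        return entry[1] if entry else None

    def sync_from(self, other):
        """Pull the peer's entries; the higher version wins."""
        for key, (version, value) in other.data.items():
            self.write(key, value, version)


a, b = Replica("a"), Replica("b")

# While the nodes cannot reach each other, both stay available for writes.
a.write("plan", "free", version=1)
b.write("plan", "pro", version=2)
print(a.read("plan"), b.read("plan"))  # free pro (the replicas disagree)

# Once they can talk again, exchanging state makes them converge.
a.sync_from(b)
b.sync_from(a)
print(a.read("plan"), b.read("plan"))  # pro pro
```

The trade-off is visible in the first print: availability during the partition is bought with a window in which readers can see different values.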

Queue Systems

Good for

Image Processing

Distributed Logs

Data Mining

Mass Emails

Intensive transformation

Search

Common Tools

Gearman

Hadoop

ZeroMQ

RabbitMQ

And many others!

Gearman

Your Client Code

Gearman Client API (C, PHP, Perl, MySQL UDF, ...)

Gearman Job Server (gearmand)

Gearman Worker API (C, PHP, Perl, Python, ...)

Your Worker Code
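The layering above can be imitated in a few lines to show how the pieces relate. This is not the Gearman API: `JobServer` is a toy, in-process stand-in for gearmand built on the standard library, and `Worker`, `submit` and `reverse` are hypothetical names used only for this sketch.

```python
import queue
import threading


class JobServer:
    """Toy stand-in for gearmand: holds jobs until a worker takes them."""

    def __init__(self):
        self._jobs = queue.Queue()

    def submit(self, function_name, payload):
        """Client side: enqueue a job and return a handle for the result."""
        result = queue.Queue(maxsize=1)
        self._jobs.put((function_name, payload, result))
        return result

    def next_job(self):
        """Worker side: block until a job is available."""
        return self._jobs.get()


class Worker(threading.Thread):
    """Pulls jobs from the server and runs the matching registered function."""

    def __init__(self, server, functions):
        super().__init__(daemon=True)
        self.server = server
        self.functions = functions

    def run(self):
        while True:
            function_name, payload, result = self.server.next_job()
            result.put(self.functions[function_name](payload))


# Your worker code: the functions this worker offers.
def reverse(text):
    return text[::-1]


server = JobServer()
Worker(server, {"reverse": reverse}).start()

# Your client code: name the function, hand over the data, collect the result.
handle = server.submit("reverse", "Hello PHP Tek")
print(handle.get(timeout=5))
```

The point of the shape is that client and worker only share a function name and a payload, so workers can be added, moved to other machines or written in another language without touching the client.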

pear.php.net/net_gearman

A Story!

Financial Software

3000+ Clients

Each one has 5 external data sources

Each data source is a web service

Ran every 6 hours every day

[Diagram: Cron, Gearman, Job 1, Web Services (1 to 5), Processing (1 to 5)]
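The numbers in the story give a sense of the load: 3000 clients with 5 sources each is at least 15,000 web service calls per run, and four runs a day makes at least 60,000 calls daily. The sketch below works that out and shows the fan-out for a single client. A thread pool is swapped in for the Gearman workers so the example runs on its own, and `fetch`, `process` and `refresh_client` are hypothetical placeholders rather than the code from the talk.

```python
from concurrent.futures import ThreadPoolExecutor

CLIENTS = 3000          # "3000+ Clients" on the slide, so a lower bound
SOURCES_PER_CLIENT = 5
RUNS_PER_DAY = 24 // 6  # one run every 6 hours

calls_per_run = CLIENTS * SOURCES_PER_CLIENT
print(calls_per_run, calls_per_run * RUNS_PER_DAY)  # 15000 60000


def fetch(client_id, source_id):
    """Hypothetical placeholder for one web service call."""
    return {"client": client_id, "source": source_id, "rows": []}


def process(response):
    """Hypothetical placeholder for the processing step."""
    return (response["client"], response["source"], len(response["rows"]))


def refresh_client(client_id, pool):
    """One job per client: fan out to every source, then process each reply."""
    futures = [pool.submit(fetch, client_id, source)
               for source in range(1, SOURCES_PER_CLIENT + 1)]
    return [process(future.result()) for future in futures]


with ThreadPoolExecutor(max_workers=SOURCES_PER_CLIENT) as pool:
    results = refresh_client(1, pool)

print(results)
```

Running the five calls for a client side by side means one slow web service delays only its own result instead of holding up the other four.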

But! That wasn’t enough

Job kicked off on login

Supervisord

Questions?

@h

helgi@orchestra.io

Joind.in: http://joind.in/3433