Eric2012

Building a Website To Scale www.ManwinJobs.com Target: 200 Million page views per day and beyond! By Eric Pickup [email protected] Twitter: EricPickupYP

https://joind.in/6123

Transcript of Eric2012

Page 1: Eric2012

Building a Website To Scale

www.ManwinJobs.com

Target: 200 Million page views per day and beyond! By Eric Pickup

[email protected] Twitter: EricPickupYP

Page 2: Eric2012

Contents

1. The Context

2. The Requirements

3. The Architecture

4. The Good and the Bad

Page 3: Eric2012

What are we talking about?

Timeline figure. Dates on the axis: Aug 2006, Apr 2007, Dec 2007, Feb 2008, Apr 2011. Milestones, in slide order:

YP first launched

1 million daily visitors

100,000 uploads

100 million daily page views

Acquired by Manwin

Page 4: Eric2012

Traffic In Perspective

Source: Alexa.com

Alexa global rank 95

100 Gb/s – 3 full DVDs streamed every single second
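
As a rough sanity check of the figures quoted so far, the arithmetic can be sketched in a few lines. The 4.7 GB single-layer DVD capacity is an assumption (the slide does not say which DVD size is meant), and the requests-per-second figure is a flat daily average of the 200 million target, not a peak.

```python
# Back-of-the-envelope check of the traffic figures quoted on the slides.
LINK_GBIT_PER_S = 100   # bandwidth figure from the slide
DVD_GBYTES = 4.7        # assumption: single-layer DVD capacity


def dvds_per_second(link_gbit_per_s: float, dvd_gbytes: float = DVD_GBYTES) -> float:
    """Convert a link speed in gigabits per second to DVDs per second."""
    gbytes_per_s = link_gbit_per_s / 8  # 8 bits per byte
    return gbytes_per_s / dvd_gbytes


def average_requests_per_second(page_views_per_day: int) -> float:
    """Flat daily average; real traffic peaks well above this."""
    return page_views_per_day / 86_400  # seconds in a day


if __name__ == "__main__":
    print(round(dvds_per_second(LINK_GBIT_PER_S), 2))
    print(round(average_requests_per_second(200_000_000)))
```

Under these assumptions the link carries a little under three DVDs per second, consistent with the rounded claim on the slide.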

Page 5: Eric2012

The Context

Written in Perl, with a very complex architecture.

First few months dedicated to learning the site, maintaining it, and planning the rewrite.

Re-write started in August 2011 and was originally planned for a delivery in mid-November.

Actually launched at the end of January.

Page 6: Eric2012

The requirements

1. Support 200 million+ daily requests

2. 100% transparent to users

3. Six years of legacy data

4. Even faster site

Page 7: Eric2012

The Architecture

Page 8: Eric2012

The Architecture

Page 9: Eric2012

The Architecture

Fast and reliable load-balancing.

Intelligent load distribution.

Performs health-checks
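
The three points above can be illustrated with a minimal sketch: a round-robin balancer that skips backends failing a health probe. This is not the product used on the site (the slide does not name it in text); `Backend`, `RoundRobinBalancer`, and the `probe` callback are hypothetical names chosen for the illustration.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass
class Backend:
    name: str
    healthy: bool = True


class RoundRobinBalancer:
    """Hypothetical balancer: rotates over backends, skipping unhealthy ones."""

    def __init__(self, backends: Iterable[Backend], probe: Callable[[Backend], bool]):
        self.backends = list(backends)
        self.probe = probe      # returns True when a backend answers correctly
        self._next = 0          # index where the next search starts

    def run_health_checks(self) -> None:
        """Refresh the health flag of every backend using the probe."""
        for backend in self.backends:
            backend.healthy = self.probe(backend)

    def pick(self) -> Backend:
        """Return the next healthy backend in rotation."""
        total = len(self.backends)
        for offset in range(total):
            candidate = self.backends[(self._next + offset) % total]
            if candidate.healthy:
                self._next = (self._next + offset + 1) % total
                return candidate
        raise RuntimeError("no healthy backend available")
```

A real load balancer runs the probes on a timer and usually weighs backends by load; the sketch only shows why a failed server stops receiving traffic without any change on the client side.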

Page 10: Eric2012

The Architecture

Page 11: Eric2012

The Architecture

Reverse proxy optimized for better speed

Reduces web and database server load

Very rich and flexible configuration

Page 12: Eric2012

The Architecture

Cache management (what, for how long)

Edge Side Includes (ESIs)

Health check on Web servers
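
To make the first two points concrete, here is a small Python illustration of per-URL cache lifetimes combined with Edge Side Includes. It is a sketch of the idea only, not the reverse proxy's actual configuration: `TtlCache`, `render`, and the `fetch` callback (which returns a body and a TTL in seconds) are assumptions made for the example.

```python
import re
import time
from typing import Callable, Dict, Tuple

# Matches tags such as <esi:include src="/menu"/>
ESI_TAG = re.compile(r'<esi:include src="([^"]+)"\s*/>')


class TtlCache:
    """Hypothetical cache: each URL is kept for the TTL its backend returns."""

    def __init__(self, fetch: Callable[[str], Tuple[str, float]],
                 clock: Callable[[], float] = time.monotonic):
        self.fetch = fetch      # fetch(url) -> (body, ttl_seconds)
        self.clock = clock
        self._store: Dict[str, Tuple[str, float]] = {}

    def get(self, url: str) -> str:
        now = self.clock()
        entry = self._store.get(url)
        if entry is not None and entry[1] > now:
            return entry[0]                     # still fresh: no backend hit
        body, ttl = self.fetch(url)
        if ttl > 0:                             # ttl of 0 means "never cache"
            self._store[url] = (body, now + ttl)
        return body


def render(cache: TtlCache, url: str) -> str:
    """Fetch a page and replace every ESI tag with its (cached) fragment."""
    body = cache.get(url)
    return ESI_TAG.sub(lambda match: render(cache, match.group(1)), body)
```

The design point is that a page shell can be cached for minutes while a user-specific fragment inside it is fetched on every request, so most of the page never reaches the web servers.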

Page 13: Eric2012

The Architecture

Page 14: Eric2012

The Architecture

Custom logging of page views

Used for tasks like view counters or related videos

Between 8GB and 15GB of logs per hour!
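
A view counter built from such logs could be sketched as follows. The log format is an assumption (one tab-separated `timestamp`, `video id` pair per line); the slides do not describe the real format, and `count_views` is a hypothetical helper.

```python
from collections import Counter
from typing import Iterable


def count_views(lines: Iterable[str]) -> Counter:
    """Tally page views per video from an assumed 'timestamp<TAB>video_id' log."""
    views: Counter = Counter()
    for line in lines:
        parts = line.rstrip("\n").split("\t")
        if len(parts) != 2 or not parts[1]:
            continue                    # skip malformed lines instead of failing
        views[parts[1]] += 1
    return views


if __name__ == "__main__":
    sample = ["1328000000\tvideo:42\n", "1328000001\tvideo:7\n", "1328000002\tvideo:42\n"]
    print(count_views(sample).most_common(1))
```

Because the function accepts any iterable of lines, it can stream a file object directly, which matters at several gigabytes of logs per hour.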

Page 15: Eric2012

The Architecture

Page 16: Eric2012

The Architecture

High-performance HTTP server.

PHP-FPM

External CDNs for static files like CSS, images and JS

Page 17: Eric2012

The Architecture

Page 18: Eric2012

The Architecture

FPM hosts our framework of choice: Symfony2.

Fast and feature rich.

A wealth of bundles already available.

Page 19: Eric2012

The Architecture

Page 20: Eric2012

The Architecture

A messaging component

Designed for large scale deployments

ActiveMQ handles writes (to MySQL and Redis)
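
The pattern of pushing writes through a message queue can be sketched with Python's standard library standing in for the broker. Nothing here is ActiveMQ's API: `queue.Queue` plays the role of the queue, and two plain dictionaries play the roles of MySQL and Redis, purely as assumptions for the illustration.

```python
import queue
import threading
from typing import Dict, Optional, Tuple

Job = Optional[Tuple[str, str]]     # (key, value), or None to stop the worker


def start_writer(jobs: "queue.Queue[Job]",
                 mysql: Dict[str, str],
                 redis: Dict[str, str]) -> threading.Thread:
    """Start a worker that applies each queued write to both stores."""

    def worker() -> None:
        while True:
            job = jobs.get()
            try:
                if job is None:         # sentinel: shut down cleanly
                    return
                key, value = job
                mysql[key] = value      # stand-in for the MySQL write
                redis[key] = value      # stand-in for the Redis write
            finally:
                jobs.task_done()        # always acknowledge the message

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    return thread
```

The web request only enqueues the job and returns; the slower double write happens in the background, which is the main reason to put a queue in front of the data stores.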

Page 21: Eric2012

The Architecture

Partially implemented, with mixed results.

Too rigid for a site requiring constant changes.

Gains did not justify running Java and a separate infrastructure.

Page 22: Eric2012

The Architecture

Page 23: Eric2012

The Architecture

Ability to manage pools of servers with health checks. We maintain 2 pools:

Write pool, with fail-over to the backup master.

Read pool, with all servers except the master.
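
The two pools can be expressed as a pair of selection rules. This is a sketch under assumptions: the `Server` record, the role names, and the random choice among readers are hypothetical, and in production the health flags would be maintained by the pool manager rather than set by hand.

```python
import random
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Server:
    name: str
    role: str               # "master", "backup-master" or "replica"
    healthy: bool = True


def write_target(servers: Sequence[Server]) -> Server:
    """Write pool: the master, failing over to the backup master."""
    for role in ("master", "backup-master"):
        for server in servers:
            if server.role == role and server.healthy:
                return server
    raise RuntimeError("no healthy write server")


def read_target(servers: Sequence[Server], rng=random) -> Server:
    """Read pool: any healthy server except the master."""
    pool = [s for s in servers if s.role != "master" and s.healthy]
    if not pool:
        raise RuntimeError("no healthy read server")
    return rng.choice(pool)
```

Keeping the master out of the read pool reserves its capacity for writes, while the backup master keeps serving reads until it is promoted.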

Page 24: Eric2012

The Architecture

Page 25: Eric2012

The Architecture

Open source, advanced key-value store

Read operations on Redis are FAST

Primary data source

Page 26: Eric2012

The Architecture

Updated in real time, alongside MySQL.

Redis Sorted Sets for all lists.

Pipelining is VERY important for performance.
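
The sketch below shows why pipelining matters when lists live in sorted sets: several commands are queued and sent in one round trip. The functions use the `pipeline()`, `zadd(name, mapping)`, `zrevrange(name, start, end)` and `execute()` calls as they exist in the redis-py client; the key names are hypothetical, and `FakeRedis` is only an in-memory stand-in so the example runs without a server.

```python
from typing import Dict, Iterable, List, Sequence, Tuple


def record_videos(client, list_key: str, scored_ids: Iterable[Tuple[str, float]]) -> list:
    """Queue one ZADD per video and send them all in a single round trip."""
    pipe = client.pipeline()
    for video_id, score in scored_ids:
        pipe.zadd(list_key, {video_id: score})
    return pipe.execute()


def top_of_lists(client, list_keys: Sequence[str], n: int) -> Dict[str, List[str]]:
    """Fetch the top n members of several sorted sets with one pipeline."""
    pipe = client.pipeline()
    for key in list_keys:
        pipe.zrevrange(key, 0, n - 1)       # highest scores first
    return dict(zip(list_keys, pipe.execute()))


class FakePipeline:
    """In-memory stand-in (assumption): buffers commands until execute()."""

    def __init__(self, store: dict):
        self._store, self._ops = store, []

    def zadd(self, key, mapping):
        def op():
            zset = self._store.setdefault(key, {})
            added = sum(1 for member in mapping if member not in zset)
            zset.update(mapping)
            return added
        self._ops.append(op)
        return self

    def zrevrange(self, key, start, end):   # stand-in handles non-negative end only
        def op():
            ranked = sorted(self._store.get(key, {}).items(),
                            key=lambda kv: (kv[1], kv[0]), reverse=True)
            return [member for member, _ in ranked][start:end + 1]
        self._ops.append(op)
        return self

    def execute(self):
        results, self._ops = [op() for op in self._ops], []
        return results


class FakeRedis:
    def __init__(self):
        self.store: dict = {}

    def pipeline(self):
        return FakePipeline(self.store)
```

With a real server, `client` would be a `redis.Redis(decode_responses=True)` instance; without `decode_responses`, members come back as bytes rather than strings.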

Page 27: Eric2012

The Architecture

Persistence needs tuning.

RDB takes a full snapshot but is very I/O intensive.

AOF (append-only file) persists changes incrementally and is much lighter on I/O.

Page 28: Eric2012

The Architecture

Page 29: Eric2012

The Architecture

Heavily normalized database, since the site does not query it directly.

Some tables have over 100 million rows.

Used to populate Redis lists for new features

Page 30: Eric2012

The good and the bad

Main reasons for the delays:

Decisions concerning some of the technologies to use.

Learning curve for new technologies longer than expected.

Data transfer and restructuring in MySQL and Redis

Staffing issues.

Page 31: Eric2012

The good and the bad

Was it a success?

Launched without any downtime

New site about 10% faster

Valuable expertise gained

A GOOD SUCCESS STORY WITH LESSONS LEARNED

Page 32: Eric2012

Eric Pickup [email protected] Twitter: EricPickupYP