Scaling PHP/MySQL Presentation from Flickr

  • 8/15/2019 Scaling PHP/MySQL Presentation from Flickr

    Hyatt Regency San Francisco Airport Burlingame, CA

    San Francisco, CA

    October 18-21, 2005

    Hardware Layouts for LAMP Installations

    John Allspaw, Flickr Plumbr

    Flickr (Yahoo)

    October 18, 2005

    Hardware requirements for LAMP installs have to do with:

    o A decent amount about the actual hardware (in-box stuff)

    o A bit more about the hardware architecture

    o Which should complement the application architecture

    What we'll talk about here:

    o Database (MySQL) layouts and considerations

    o Some miscellaneous/esoteric stuff (lessons learned)

    o Caching content and considerations

    Growing Up, One Box solution

    Basic web application (discussion board, etc.)

    Low traffic

    Apache/PHP/MySQL on one machine

    Bottlenecks will start showing up:

    Most likely database before apache/php

    Disk I/O (InnoDB) or locking wait states (MyISAM)

    Context switching between memory work (Apache) and CPU work (MySQL)

    ONE BOX

    Growing Up, Two Box solution

    Higher traffic application (more demand)

    Apache/PHP on box A, MySQL on box B

    Same network = bad (*or is it?), separate network = good

    Bottlenecks will start to be:

    Disk I/O on MySQL machine (InnoDB)

    Locking on MyISAM tables

    Network I/O

    TWO BOX

    Growing Up, Many Boxes with Replication solution

    Yet even higher traffic

    Writes are separated from reads (master gets INSERT/UPDATE/DELETE, slaves get SELECTs)

    Diminishes network bottlenecks, disk I/O, and other in-box issues

    SELECTs and INSERT/UPDATE/DELETE can be specified within the application,

    OR...

    Load-balancing can be used
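
The "specified within the application" option can be sketched as a small router that looks at the first keyword of each statement. This is a minimal sketch, not how Flickr does it: the host names and the `QueryRouter` class are hypothetical, and it assumes anything that is not a plain SELECT is safest sent to the master.

```python
import itertools


def is_read(sql):
    """Return True when the statement's first keyword is SELECT."""
    words = sql.split(None, 1)
    return bool(words) and words[0].upper() == "SELECT"


class QueryRouter:
    """Send SELECTs to the slaves in turn; send everything else to the master."""

    def __init__(self, master, slaves):
        self.master = master
        # Cycle through the slave list forever, one host per read.
        self._slaves = itertools.cycle(slaves)

    def host_for(self, sql):
        return next(self._slaves) if is_read(sql) else self.master


# Hypothetical host names, for illustration only.
router = QueryRouter("db-master", ["db-slave1", "db-slave2"])
print(router.host_for("INSERT INTO comments VALUES (1, 'hi')"))
print(router.host_for("SELECT * FROM comments"))
```

With a load balancer in front of the slaves, the same split still applies; the slave list simply shrinks to a single VIP.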

    MANY BOX

    Slave Lag

    When slaves can't keep up with replication

    They're too busy:

    Reading (production traffic)

    Writing (replication)

    Manifests as:

    Comments/photos/any user-entered data doesn't show up on the site right away

    So users will repeat the action, thinking that it didn't take the first time, which makes the situation worse

    Insert funny photo here about slave lag*

    *slave lag isn't funny

    Hardware Load Balancing MySQL

    How It's Usually Done

    Standard MySQL master/slave replication

    All writes (inserts/updates/deletes) from application go to Master

    All reads (selects) from application go to a load-balanced VIP (virtual IP) spreading out load across all slaves

    What Is Good About Load Balancing

    you can add/remove slaves without affecting application, since queries are atomic (sorta/kinda)

    additional monitoring point and some automatic failure handling

    you can treat all of your slave pool as one resource, which makes capacity planning a lot easier if you know the ceiling of each slave

    How do you know the ceiling (maximum QPS capacity) of each slave?

    First make a guess based on benchmarking (or look up some bench results from Tom's Hardware or anandtech.com, etc.)

    Then get more machines than that :)

    Scary: in production during a lull in traffic, remove machines from the pool until you detect lag

    The QPS you saw right before slave lag set in: THAT is your ceiling
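
Once the ceiling is known, sizing the pool is simple arithmetic. A minimal sketch: the peak QPS, the per-slave ceiling, and the 25% headroom below are made-up illustration values, not numbers from this talk.

```python
import math


def slaves_needed(peak_qps, ceiling_qps, headroom=0.25):
    """Slaves required to serve peak_qps while keeping each below its ceiling.

    headroom is the fraction of each slave's ceiling held in reserve, so a
    slave can drop out of the pool without pushing the rest into lag.
    """
    if ceiling_qps <= 0 or not 0 <= headroom < 1:
        raise ValueError("ceiling_qps must be > 0 and headroom in [0, 1)")
    usable_qps = ceiling_qps * (1 - headroom)
    return math.ceil(peak_qps / usable_qps)


# Assumed numbers: 6000 QPS at peak, each slave lags above 800 QPS.
print(slaves_needed(peak_qps=6000, ceiling_qps=800))  # 10
```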

    What Can Be Bad/Tough About Load Balancing:

    not all load-balancers are created equal, and not all load-balancing companies expect this product use, so support may still be thin

    not that many people are doing it in high-volume situations yet, so support from the community isn't large either

    Gotchas:

    port exhaustion, health checks, and balance algorithms

    Port Exhaustion

    PROBLEM:

    LB is basically a traffic cop, nothing more

    Side effect of having a lot of connections: only ~64,511 ports per IP (VIP) to use

    64,511 ports / 120 sec per port =

    ~535 max concurrent connections per IP*

    * Not really, but close to it: tcp_tw_recycle and tcp_tw_reuse
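
The arithmetic behind that limit can be written out as below. This is a sketch under two assumptions that the slide does not spell out: that 64,511 is 65,535 minus the 1,024 reserved low ports, and that 120 seconds is how long a closed connection keeps its port tied up (TIME_WAIT). Read that way, the result is a rate of new connections per second, and it comes out slightly above the rounded ~535 quoted above.

```python
# Assumption: usable source ports = all ports minus the reserved low range.
USABLE_PORTS = 65535 - 1024      # 64,511
# Assumption: seconds a port stays unavailable after its connection closes.
PORT_HOLD_SECONDS = 120


def new_connections_per_second(ip_count=1):
    """Sustainable rate of new connections before ports run out."""
    return ip_count * USABLE_PORTS / PORT_HOLD_SECONDS


print(USABLE_PORTS)                             # 64511
print(round(new_connections_per_second(), 1))   # 537.6
```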

    Port Exhaustion (cont'd)

    SOLUTION:

    Use a pool of IPs on the database slave/farm side (Netscaler calls these subnet IPs, Alteon calls them PiPs)

    Monitor port/connection usage, know when it's time to add more
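
"Know when it's time to add more" can be reduced to a utilization check against the pool's total port budget. A rough sketch: the 70% threshold and the function names are assumptions for illustration, and the count of ports in use would come from whatever counters your load balancer exposes.

```python
PORTS_PER_IP = 64511  # per-IP port budget quoted earlier in the talk


def port_utilization(ports_in_use, ip_count):
    """Fraction of the pool's total port budget currently in use."""
    return ports_in_use / (ip_count * PORTS_PER_IP)


def time_to_add_ips(ports_in_use, ip_count, threshold=0.7):
    """True once usage crosses the (assumed) alert threshold."""
    return port_utilization(ports_in_use, ip_count) >= threshold


# Two IPs in the pool, 100,000 ports busy: about 78% used, so add more.
print(time_to_add_ips(100000, 2))  # True
print(time_to_add_ips(50000, 2))   # False
```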

    Health checks

    LB won't know anything about how well each MySQL slave is doing, and will pass traffic as long as port 3306 is answering

    Load balancers don't talk SQL, only things like plain old TCP, HTTP/S, maybe FTP

    Health checks (cont'd)

    Two options:

    1. Dirty, but workable:

    Have each server monitor itself, and shut off/firewall its own port 3306, even if MySQL is still running

    Health checks (cont'd)

    2. Cleaner, but a bit more work:

    Have each server monitor itself, and run a check via xinetd (for example, a nagios monitor)

    So the LB can tickle that port, and expect back an OK string. If not, it'll automatically take that server out of the pool

    Good for detecting isolated incidents of slave lag and handling them automatically
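
The script that xinetd runs for each health-check connection could look roughly like this. It is a sketch, not the check Flickr uses: `read_slave_status` is a hypothetical stub standing in for a real `SHOW SLAVE STATUS` query, and the 30-second lag threshold is an assumed value. xinetd wires the script's stdout to the socket, so printing one line is enough.

```python
import sys

MAX_LAG_SECONDS = 30  # assumed threshold; tune for your own site


def read_slave_status():
    """Hypothetical stub: replace with a real SHOW SLAVE STATUS lookup."""
    return {
        "Slave_IO_Running": "Yes",
        "Slave_SQL_Running": "Yes",
        "Seconds_Behind_Master": 2,
    }


def health_line(status, max_lag=MAX_LAG_SECONDS):
    """Return the string the load balancer expects: OK, or FAIL otherwise."""
    threads_ok = (
        status.get("Slave_IO_Running") == "Yes"
        and status.get("Slave_SQL_Running") == "Yes"
    )
    lag = status.get("Seconds_Behind_Master")  # None when replication is stopped
    if threads_ok and lag is not None and lag <= max_lag:
        return "OK"
    return "FAIL"


if __name__ == "__main__":
    sys.stdout.write(health_line(read_slave_status()) + "\n")
```

Answering on a separate port keeps 3306 open for replication and admin access while the slave sits out of the read pool.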

    Health Checks

    Balancing Algorithms

    Load balancers know HTTP, FTP, basic TCP, but not SQL

    Two things to care about:

    Should the server still be in the pool? (health checks)

    How should load get balanced?

    "least connections" or "least bandwidth" or "least anything" = BAD

    Because not all SQL queries are created equal

    Use round-robin or random

    What happens if you don't: Evil Favoritism
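
Both recommended algorithms ignore how busy a slave looks, which is the point: no server can become the favorite just because its current queries happen to be cheap. A minimal sketch of the two choices, with hypothetical slave names:

```python
import itertools
import random


def round_robin(slaves):
    """Yield slaves in a fixed rotation, regardless of their current load."""
    return itertools.cycle(slaves)


def pick_random(slaves, rng=random):
    """Pick any slave with equal probability, regardless of its current load."""
    return rng.choice(slaves)


slaves = ["db-slave1", "db-slave2", "db-slave3"]  # hypothetical names
rotation = round_robin(slaves)
print([next(rotation) for _ in range(5)])
print(pick_random(slaves) in slaves)  # True
```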

    Evil Favoritism

    Meanwhile... for in-the-box considerations

    Interleaving memory *does* make a difference

    Always RAID10 (or RAID0 if you're crazy*) but NEVER RAID5 (for InnoDB, anyway)

    RAID10 has much more read capacity, and a write penalty, but not as much as RAID5

    Always have battery backup for HW RAID write caching

    Or, don't use write caching at all
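
The write-penalty comparison can be put into rough numbers with the usual rule of thumb: each host write costs 2 disk operations on RAID10 (one per mirror) and 4 on RAID5 (read data, read parity, write data, write parity). This is a back-of-the-envelope sketch only; the 8 disks, 150 IOPS per disk, and 50% write mix are assumed values, and it ignores controller cache and RAID10's extra read capacity.

```python
# Rule-of-thumb disk operations per host write, by RAID level.
WRITE_PENALTY = {"RAID0": 1, "RAID10": 2, "RAID5": 4}


def effective_iops(raw_iops, write_fraction, level):
    """Host-visible IOPS once the write penalty is paid out of raw disk IOPS."""
    penalty = WRITE_PENALTY[level]
    read_fraction = 1.0 - write_fraction
    return raw_iops / (read_fraction + write_fraction * penalty)


raw = 8 * 150  # assumed: 8 spindles at 150 IOPS each
for level in ("RAID10", "RAID5"):
    print(level, round(effective_iops(raw, 0.5, level)))
# RAID10 800
# RAID5 480
```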

    IN-THE-BOX considerations (cont'd)

    Always have proper monitoring (nagios, etc.) for failed/rebuilding drives

    SATA or SCSI? SCSI! It's worth it!

    10k or 15k RPM SCSI? 15k! It's worth it! (~20% performance increase when you're disk bound)

    For 64-bit Linux (AMD64 or EM64T):

    Crank up the RAM for InnoDB's buffer pool

    Swapping = very very bad. Either:

    Turn it off (slightly scary)

    Leave it on and set /proc/sys/vm/swappiness = 0

    10k versus 15k drives ?

    Does it really matter that much ?

    Some in-the-wild proof.

    [Graph: slave lag in production, 10K drives vs. 15K drives]

    Using MySQL with a SAN (Storage Area Network)

    Do lay out storage the same as if it were local

    Do make sure that the HBA (fiber card) driver is well supported by Linux

    Don't share volumes across databases

    Don't forget to correctly tune Queue Depth Size, which should be increasing, from server HBA -> switch -> storage

    Caching your static content

    Caching Static Content

    SQUID = good

    Relieve your front-end PHP machines from looking up data that will never (or rarely) change

    Generate static pages, and cache them in squid, along with your images

    Caching Static Content (cont'd)

    Use SQUID to accelerate plain-old origin webservers, also known as reverse-proxy HTTP acceleration

    Described here and elsewhere:

    http://www.squid-cache.org/Doc/FAQ/FAQ-20.html

    Basic SQUID layout

    squid accepts requests on 80

    passes on cache misses to apache on 81

    apache uses as its docroot an NFS mounted dir

    should be on local subnet, or dedicated net

    Good HW layout for high-volume SQUIDing

    Do use SCSI, and many spindles for disk cache dirs

    Don't use RAID

    Do use network attached storage, or place the origin servers on separate machines

    Do use ext3 with noatime for disk cache dirs

    Do monitor squid stats

    Flickr: How We Roll

    Yummy SQUID stats:

    >2800 images/sec, ~75-80% are cache hits

    ~10 million photos cached at any time

    1.5 million cached in memory
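
What those hit rates mean for the origin servers is quick to work out. A small sketch using only the figures above; since the request rate is quoted as ">2800", treat the results as lower bounds.

```python
def origin_requests_per_sec(total_per_sec, hit_rate):
    """Requests per second that miss the cache and fall through to the origin."""
    return total_per_sec * (1 - hit_rate)


for hit_rate in (0.75, 0.80):
    print(hit_rate, round(origin_requests_per_sec(2800, hit_rate)))
# 0.75 700
# 0.8 560
```

So at these rates roughly 560-700 of the 2800 image requests each second still reach Apache.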

    The End