Memcache @ Facebook


Transcript of Memcache @ Facebook

  • Slide 1/47

    memcache@facebook

    Marc Kwiatkowski

    memcache tech lead

    QCon 2010

  • Slide 2/47

    How big is facebook?

  • Slide 3/47

  • Slide 4/47

    Objects

    More than 60 million status updates posted each day (694/s)
    More than 3 billion photos uploaded to the site each month (23/s)
    More than 5 billion pieces of content (web links, news stories, blog posts, notes, photo albums, etc.) shared each week (8K/s)
    Average user has 130 friends on the site
    50 billion friend-graph edges
    Average user clicks the Like button on 9 pieces of content each month

  • Slide 5/47

    Infrastructure

    Thousands of servers in several data centers in two regions:
    Web servers
    DB servers
    Memcache servers
    Other services

  • Slide 6/47

    The scale of memcache @ facebook

    Memcache ops:
    over 400M gets/sec
    over 28M sets/sec
    over 2T cached items
    over 200 TB cached

    Network I/O:
    peak rx 530M pkts/s, 60 GB/s
    peak tx 500M pkts/s, 120 GB/s

  • Slide 7/47

    A typical memcache server's P.O.V.

    Network I/O:
    rx 90K pkts/s, 9.7 MB/s
    tx 94K pkts/s, 19 MB/s

    Memcache ops:
    80K gets/s
    2K sets/s
    200M items

  • Slide 8/47

    Evolution of facebook's architecture

  • Slide 9/47

  • Slide 10/47

    Scaling Facebook: Interconnected data

    [graph diagram: Bob]

  • Slide 11/47

    Scaling Facebook: Interconnected data

    [graph diagram: Bob, Brian]

  • Slide 12/47

    Scaling Facebook: Interconnected data

    [graph diagram: Bob, Brian, Felicia]

  • Slide 13/47

    Memcache Rules of the Game

    GET object from memcache; on miss, query database and SET object to memcache
    Update database row and DELETE object in memcache
    No derived objects in memcache
    Every memcache object maps to persisted data in the database
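
    A minimal sketch of these rules, assuming the stock PHP Memcached extension and a PDO connection rather than Facebook's internal client; table and key names are illustrative.

    ```php
    <?php
    // Read path: GET from memcache; on miss, query the database and SET.
    function get_user(Memcached $mc, PDO $db, $id) {
        $key = "user:$id";
        $user = $mc->get($key);
        if ($user === false) {                        // cache miss
            $stmt = $db->prepare("SELECT * FROM users WHERE id = ?");
            $stmt->execute(array($id));
            $user = $stmt->fetch(PDO::FETCH_ASSOC);
            $mc->set($key, $user);                    // populate the cache
        }
        return $user;
    }

    // Write path: update the database row, then DELETE the cached object.
    function set_user_name(Memcached $mc, PDO $db, $id, $name) {
        $db->prepare("UPDATE users SET name = ? WHERE id = ?")
           ->execute(array($name, $id));
        $mc->delete("user:$id");                      // next read repopulates from the DB
    }
    ```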

  • Slide 14/47

    Scaling memcache

  • Slide 15/47

    Phatty Phatty Multiget


  • Slide 16/47

    Phatty Phatty Multiget (notes)

    The PHP runtime is single threaded and synchronous.
    To get good performance for data-parallel operations, like retrieving info for all friends, it's necessary to dispatch memcache get requests in parallel.
    Initially we just used polling I/O in PHP.
    Later we switched to true asynchronous I/O in a PHP C extension.
    In both cases the result was reduced latency through parallelism.
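
    A sketch of the data-parallel fetch, assuming the stock PHP Memcached client's getMulti; the polling-I/O and asynchronous C-extension clients described above are not shown.

    ```php
    <?php
    // Fetch many friends' profiles in one batched request round instead of
    // one synchronous get per friend.
    function get_friend_profiles(Memcached $mc, array $friend_ids) {
        $keys = array();
        foreach ($friend_ids as $id) {
            $keys[] = "profile:$id";
        }
        // getMulti groups keys by server and issues the requests together,
        // so latency is roughly one round trip instead of N.
        $found = $mc->getMulti($keys);
        return $found === false ? array() : $found;   // missing keys are simply absent
    }
    ```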

  • Slide 17/47

    Pools and Threads

    PHP Client

  • Slide 18/47

    [diagram: PHP Client; memcache instances sp:12345 sp:12346 sp:12347 cs:12345 cs:12346 cs:12347]

  • Slide 19/47

    [diagram: PHP Client; memcache instances sp:12345 sp:12346 sp:12347 cs:12345 cs:12346 cs:12347]

  • Slide 20/47

    PHP Client

  • Slide 21/47

    Pools and Threads (notes)

    Privacy objects are small but have poor hit rates.
    User profiles are large but have good hit rates.
    We achieve better overall caching by segregating different classes of objects into different pools of memcache servers.

    Memcache was originally a classic single-threaded unix daemon.
    This meant we needed to run 4 instances, each with 1/4 the RAM, on each memcache server:
    4X the number of connections to each box
    4X the meta-data overhead
    We needed a multi-threaded service.
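
    A sketch of pool segregation, assuming one stock Memcached client instance per pool; hostnames and the pool split are illustrative.

    ```php
    <?php
    // One client per pool so that small, low-hit-rate objects (privacy) and
    // large, high-hit-rate objects (user profiles) don't compete for the
    // same memory and eviction behavior.
    $pools = array(
        'privacy' => new Memcached(),
        'profile' => new Memcached(),
    );
    $pools['privacy']->addServers(array(
        array('mc-privacy-01', 11211),
        array('mc-privacy-02', 11211),
    ));
    $pools['profile']->addServers(array(
        array('mc-profile-01', 11211),
        array('mc-profile-02', 11211),
    ));

    // Route each object class to its own pool of servers.
    function pool_for(array $pools, $class) {
        return $pools[$class];
    }

    $privacy = pool_for($pools, 'privacy')->get('privacy:1234');
    ```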

  • Slide 22/47

  • Slide 23/47

    Connections and Congestion (notes)

    As we added web servers, the connections to each memcache box grew.
    Each web server ran 50-100 PHP processes.
    Each memcache box has 100K+ TCP connections.
    UDP could reduce the number of connections.

    As we added users and features, the number of keys per multiget increased:
    Popular people and groups
    Platform and FBML

    We began to see incast congestion on our ToR switches.
    UDP allowed us to do congestion detection and admission control.
  • Slide 24/47

    Serialization and Compression

    We noticed our short profiles weren't so short: 1K as a PHP serialized object.
    fb-serialization:
    based on the Thrift wire format
    3X faster
    30% smaller
    gzcompress serialized strings
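
    fb-serialization itself isn't public, so as a stand-in this sketch shows the stock PHP path of serializing and gzcompress-ing a profile before caching it, and reversing both on read.

    ```php
    <?php
    // Store a profile as a compressed serialized string to cut payload size.
    // Internally serialize() was replaced by the Thrift-based fb-serialization;
    // only the stock functions are used here.
    function cache_profile(Memcached $mc, $id, array $profile) {
        $blob = gzcompress(serialize($profile));
        $mc->set("profile:$id", $blob);
    }

    function fetch_profile(Memcached $mc, $id) {
        $blob = $mc->get("profile:$id");
        if ($blob === false) {
            return null;                              // miss: caller falls back to the DB
        }
        return unserialize(gzuncompress($blob));
    }
    ```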

  • Slide 25/47

    Multiple Datacenters

    [diagram: SF Web, SF Memcache, Memcache Proxy; SC Web, SC Memcache, SC MySQL, Memcache Proxy]

  • Slide 26/47

    Multiple Datacenters (notes)

    In the early days we had two data centers:
    the one we were about to turn off
    the one we were about to turn on
    Eventually we outgrew a single data center.
    Still only one master database tier.
    Rules of the game require that after an update we need to broadcast deletes to all tiers.
    The mcproxy era begins.

  • Slide 27/47

    Multiple Regions

    [diagram: West Coast: SF Web, SF Memcache, Memcache Proxy; SC Web, SC Memcache, SC MySQL, Memcache Proxy. East Coast: VA Web, VA Memcache, VA MySQL, Memcache Proxy. MySQL replication from SC MySQL to VA MySQL]

  • Slide 28/47

    Multiple Regions (notes)

    Latency to east coast and European users was/is terrible.
    So we deployed a slave DB tier in Ashburn, VA.
    The slave DB syncs with the master via the MySQL binlog.
    This introduces a race condition; mcproxy to the rescue again.
    Add a memcache delete pragma to MySQL update and insert ops.
    Added a thread to the slave mysqld to dispatch deletes on the east coast via mcproxy.
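
    The pragma syntax isn't shown in the slides; as a hypothetical sketch, the write below tags the SQL with the keys to invalidate so the thread watching the slave's binlog can dispatch the deletes through mcproxy once the statement has been applied.

    ```php
    <?php
    // Hypothetical marker-comment format: the slave-side thread parses the
    // replicated statement, extracts the keys, and issues memcache deletes
    // via mcproxy in the remote region.
    function update_user_name(PDO $db, array $keys_to_delete, $id, $name) {
        $pragma = '/* MEMCACHE_DELETE ' . implode(' ', $keys_to_delete) . ' */';
        $sql = "UPDATE users SET name = ? WHERE id = ? $pragma";
        $db->prepare($sql)->execute(array($name, $id));
    }

    update_user_name($db, array('user:1234', 'profile:1234'), 1234, 'Bob');
    ```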

  • Slide 29/47

    Replicated Keys

    [diagram: PHP Client x3, each requesting the same hot key; Memcache x4]

  • Slide 30/47

    Replicated Keys

    [diagram: PHP Client x3, requesting aliases key#0, key#1, key#3; Memcache x3]

  • Slide 31/47

  • Slide 32/47

    Memcache Rules of the Game

    New Rule

    If a key is hot, pick an alias and fetch that for reads

    Delete all aliases on updates
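
    A sketch of the alias rule, assuming a fixed, known replica count; the key#N naming follows the slide, everything else is illustrative.

    ```php
    <?php
    // Reads pick one alias of a hot key at random, so the aliases hash to
    // different servers and spread the read load; writes delete every alias
    // so no stale replica survives an update.
    define('HOT_KEY_REPLICAS', 4);                    // illustrative count

    function get_hot(Memcached $mc, $key) {
        $alias = $key . '#' . mt_rand(0, HOT_KEY_REPLICAS - 1);
        return $mc->get($alias);
    }

    function delete_hot_aliases(Memcached $mc, $key) {
        for ($i = 0; $i < HOT_KEY_REPLICAS; $i++) {
            $mc->delete($key . '#' . $i);             // invalidate all aliases after a DB update
        }
    }
    ```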

  • Slide 33/47

    Mirrored Pools

    General pool with wide fanout: Shard 1, Shard 2, Shard 3, ..., Shard n
    Specialized Replica 1: Shard 1, Shard 2
    Specialized Replica 2: Shard 1, Shard 2

  • Slide 34/47

    Mirrored Pools (notes)

    As our memcache tier grows, the ratio of keys per packet decreases:
    100 keys / 1 server = 1 packet
    100 keys / 100 servers = 100 packets
    More network traffic
    More memcache server kernel interrupts per request

    Confirmed Info - critical account meta-data:
    Have you confirmed your account?
    Are you a minor?
    Pulled from large user-profile objects

  • Slide 35/47

    Hot Misses

    [animation]

  • Slide 36/47

    Hot Misses (notes)

    Remember the rules of the game:
    update and delete
    miss, query, and set
    When the object is very, very popular, that query rate can kill a database server.
    We need flow control!

  • Slide 37/47

    Memcache Rules of the Game

    For hot keys, on a miss grab a mutex before issuing the db query:
    memcache-add a per-object mutex
    key:xxx => key:xxx#mutex
    If the add succeeds, do the query
    If the add fails (because the mutex already exists), back off and try again
    After the set, delete the mutex
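
    A sketch of the mutex rule using memcache's atomic add; the TTL and back-off interval are illustrative, and query_database() is an assumed helper for the miss path.

    ```php
    <?php
    // On a miss for a hot key, only the process that wins the add() runs the
    // DB query; everyone else backs off and retries the get.
    function get_hot_object(Memcached $mc, PDO $db, $key) {
        while (true) {
            $value = $mc->get($key);
            if ($value !== false) {
                return $value;                        // hit
            }
            // add() is atomic: it succeeds only if the mutex key does not exist.
            if ($mc->add($key . '#mutex', 1, 10)) {   // 10s TTL in case we die mid-query
                $value = query_database($db, $key);   // assumed helper: load from the DB
                $mc->set($key, $value);
                $mc->delete($key . '#mutex');         // release the mutex after the set
                return $value;
            }
            usleep(50000);                            // lost the race: back off 50ms, retry
        }
    }
    ```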


  • Slide 38/47

    Hot Deletes

    [hot groups graphics]


  • Slide 39/47

    Hot Deletes (notes)

    We're not out of the woods yet.
    The cache mutex doesn't work for frequently updated objects like membership lists and walls for viral groups and applications.
    Each process that acquires the mutex finds that the object has been deleted again
    ...and again
    ...and again

  • Slide 40/47

    Rules of the Game: Caching Intent

    Each memcache server is in the perfect position to detect and mitigate contention:
    Record misses
    Record deletes
    Serve stale data
    Serve lease-ids
    Don't allow updates without a valid lease-id
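
    Stock memcached has no lease commands; leases were a Facebook-side extension. The sketch below is conceptual only, and get_with_lease() and set_with_lease() are hypothetical wrappers used purely to illustrate the flow.

    ```php
    <?php
    // Conceptual client-side flow for leases (hypothetical API): the server
    // hands out a lease-id on a miss and rejects sets whose lease-id is no
    // longer valid, so racing processes can't stampede the DB or write back
    // stale data.
    function get_with_lease_protection(PDO $db, $key) {
        list($value, $lease_id) = get_with_lease($key);   // hypothetical call
        if ($value !== null) {
            return $value;                // hit, possibly stale data the server chose to serve
        }
        if ($lease_id === null) {
            // Someone else holds the lease: wait briefly instead of querying the DB.
            usleep(50000);
            return get_with_lease_protection($db, $key);
        }
        $value = query_database($db, $key);               // we hold the lease: run the query
        set_with_lease($key, $value, $lease_id);          // rejected if the lease was invalidated
        return $value;
    }
    ```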

  • Slide 41/47

    Next Steps


  • Slide 42/47

    Shaping Memcache Traffic

    mcproxy as router

    admission control

    tunneling inter-datacenter traffic


  • Slide 43/47

    Cache Hierarchies

    Warming up Cold Clusters

    Proxies for Cacheless Clusters

    Big Low Latency Clusters

  • Slide 44/47

    Big Low Latency Clusters

    Bigger Clusters are Better

    Low Latency is Better

    L2.5

    UDP

    Proxy Facebook Architecture


  • Slide 45/47

    Worse IS better

    Richard Gabriel's famous essay contrasted

    ITS and Unix

    LISP and C

    MIT and New Jersey


  • Slide 46/47

    Why Memcache Works

    Uniform, low latency with partial results is a better user experience.

    memcache provides a few robust primitives:
    key-to-server mapping
    parallel I/O
    flow control
    traffic shaping
    that allow ad hoc solutions to a wide range of scaling issues.

  • Slide 47/47

    (c) 2010 Facebook, Inc. or its licensors. "Facebook" is a registered trademark of Facebook, Inc. All rights reserved. 1.0