Sharding @ Instagram - PostgreSQL .Sharding @ Instagram SFPUG April 2012 Mike Krieger Instagram...

download Sharding @ Instagram - PostgreSQL .Sharding @ Instagram SFPUG April 2012 Mike Krieger Instagram Monday,

of 183

  • date post

    15-Sep-2018
  • Category

    Documents

  • view

    213
  • download

    0

Embed Size (px)

Transcript of Sharding @ Instagram - PostgreSQL .Sharding @ Instagram SFPUG April 2012 Mike Krieger Instagram...

  • Sharding @ InstagramSFPUG April 2012

    Mike KriegerInstagram

    Monday, April 23, 12

  • me

    - Co-founder, Instagram- Previously: UX & Front-end

    @ Meebo

    - Stanford HCI BS/MS- @mikeyk on everything

    Monday, April 23, 12

  • pug!

    Monday, April 23, 12

  • communicating and sharing in the real world

    Monday, April 23, 12

  • Monday, April 23, 12

  • Monday, April 23, 12

  • Monday, April 23, 12

  • 30+ million users in less than 2 years

    Monday, April 23, 12

  • at its heart, Postgres-driven

    Monday, April 23, 12

  • a glimpse at how a startup with a small eng team scaled with PG

    Monday, April 23, 12

  • a brief tangent

    Monday, April 23, 12

  • the beginning

    Monday, April 23, 12

  • Text

    Monday, April 23, 12

  • 2 product guys

    Monday, April 23, 12

  • no real back-end experience

    Monday, April 23, 12

  • (you should have seen my first time finding my

    way around psql)

    Monday, April 23, 12

  • analytics & python @ meebo

    Monday, April 23, 12

  • CouchDB

    Monday, April 23, 12

  • CrimeDesk SF

    Monday, April 23, 12

  • Monday, April 23, 12

  • early mix: PG, Redis, Memcached

    Monday, April 23, 12

  • ...but were hosted on a single machine

    somewhere in LA

    Monday, April 23, 12

  • Monday, April 23, 12

  • less powerful than my MacBook Pro

    Monday, April 23, 12

  • okay, we launched.now what?

    Monday, April 23, 12

  • 25k signups in the first day

    Monday, April 23, 12

  • everything is on fire!

    Monday, April 23, 12

  • best & worst day of our lives so far

    Monday, April 23, 12

  • load was through the roof

    Monday, April 23, 12

  • friday rolls around

    Monday, April 23, 12

  • not slowing down

    Monday, April 23, 12

  • lets move to EC2.

    Monday, April 23, 12

  • Monday, April 23, 12

  • PG upgrade to 9.0

    Monday, April 23, 12

  • Monday, April 23, 12

  • scaling = replacing all components of a car

    while driving it at 100mph

    Monday, April 23, 12

  • this is the story of how our usage of PG has

    evolved

    Monday, April 23, 12

  • Phase 1: All ORM, all the time

    Monday, April 23, 12

  • why pg? at first, postgis.

    Monday, April 23, 12

  • ./manage.py syncdb

    Monday, April 23, 12

  • ORM made it too easy to not really think through

    primary keys

    Monday, April 23, 12

  • pretty good for getting off the ground

    Monday, April 23, 12

  • Media.objects.get(pk = 4)

    Monday, April 23, 12

  • first version of our feed (pre-launch)

    Monday, April 23, 12

  • friends = Relationships.objects.filter(source_user = user)

    recent_photos = Media.objects.filter(user_id__in = friends).order_by(-pk)[0:20]

    Monday, April 23, 12

  • main feed at launch

    Monday, April 23, 12

  • Redis:// user 33 postsfriends = SMEMBERS followers:33for user in friends: LPUSH feed:

    // for readingLRANGE feed:4 0 20

    Monday, April 23, 12

    feed:4feed:4feed:4feed:4

  • canonical data: PGfeeds/lists/sets: Redis

    object cache: memcache

    Monday, April 23, 12

  • post-launch

    Monday, April 23, 12

  • moved db to its own machine

    Monday, April 23, 12

  • at time, largest table: photo metadata

    Monday, April 23, 12

  • ran master-slave from the beginning, with

    streaming replication

    Monday, April 23, 12

  • backups: stop the replica, xfs_freeze drives, and take EBS snapshot

    Monday, April 23, 12

  • AWS tradeoff

    Monday, April 23, 12

  • 3 early problems we hit with PG

    Monday, April 23, 12

  • 1 oh, that setting was the problem?

    Monday, April 23, 12

  • work_mem

    Monday, April 23, 12

  • shared_buffers

    Monday, April 23, 12

  • cost_delay

    Monday, April 23, 12

  • 2 Django-specific:

    Monday, April 23, 12

  • 3 connection pooling

    Monday, April 23, 12

  • (we use PGBouncer)

    Monday, April 23, 12

  • somewhere in this crazy couple of months,

    Christophe to the rescue!

    Monday, April 23, 12

  • photos kept growing and growing...

    Monday, April 23, 12

  • ...and only 68GB of RAM on biggest machine in EC2

    Monday, April 23, 12

  • so what now?

    Monday, April 23, 12

  • Phase 2: Vertical Partitioning

    Monday, April 23, 12

  • django db routers make it pretty easy

    Monday, April 23, 12

  • def db_for_read(self, model): if app_label == 'photos': return 'photodb'

    Monday, April 23, 12

  • ...once you untangle all your foreign key

    relationships

    Monday, April 23, 12

  • (all of those user/user_id interchangeable calls bite

    you now)

    Monday, April 23, 12

  • plenty of time spent in PGFouine

    Monday, April 23, 12

  • read slaves (using streaming replicas) where

    we need to reduce contention

    Monday, April 23, 12

  • a few months later...

    Monday, April 23, 12

  • photosdb > 60GB

    Monday, April 23, 12

  • precipitated by being on cloud hardware, but likely to have hit limit eventually

    either way

    Monday, April 23, 12

  • what now?

    Monday, April 23, 12

  • horizontal partitioning!

    Monday, April 23, 12

  • Phase 3: sharding

    Monday, April 23, 12

  • surely well have hired someone experienced before we actually need

    to shard

    Monday, April 23, 12

  • ...never true about scaling

    Monday, April 23, 12

  • 1 choosing a method2 adapting the application

    3 expanding capacity

    Monday, April 23, 12

  • evaluated solutions

    Monday, April 23, 12

  • at the time, none were up to task of being our

    primary DB

    Monday, April 23, 12

  • NoSQL alternatives

    Monday, April 23, 12

  • Skypes sharding proxy

    Monday, April 23, 12

  • Range/date-based partitioning

    Monday, April 23, 12

  • did in Postgres itself

    Monday, April 23, 12

  • requirements

    Monday, April 23, 12

  • 1 low operational & code complexity

    Monday, April 23, 12

  • 2 easy expanding of capacity

    Monday, April 23, 12

  • 3 low performance impact on application

    Monday, April 23, 12

  • schema-based logical sharding

    Monday, April 23, 12

  • many many many (thousands) of logical

    shards

    Monday, April 23, 12

  • that map to fewer physical ones

    Monday, April 23, 12

  • // 8 logical shards on 2 machines

    user_id % 8 = logical shard

    logical shards -> physical shard map

    { 0: A, 1: A, 2: A, 3: A, 4: B, 5: B, 6: B, 7: B}

    Monday, April 23, 12

  • // 8 logical shards on 2 4 machines

    user_id % 8 = logical shard

    logical shards -> physical shard map

    { 0: A, 1: A, 2: C, 3: C, 4: B, 5: B, 6: D, 7: D}

    Monday, April 23, 12

  • schemas!

    Monday, April 23, 12

  • all that public stuff Id been glossing over for 2

    years

    Monday, April 23, 12

  • - database: - schema: - table: - columns

    Monday, April 23, 12

  • spun up set of machines

    Monday, April 23, 12

  • using fabric, created thousands of schemas

    Monday, April 23, 12

  • machineA: shard0 photos_by_user shard1 photos_by_user shard2 photos_by_user shard3 photos_by_user

    machineB: shard4 photos_by_user shard5 photos_by_user shard6 photos_by_user shard7 photos_by_user

    Monday, April 23, 12

  • (fabric or similar parallel task executor is

    essential)

    Monday, April 23, 12

  • application-side logic

    Monday, April 23, 12

  • SHARD_TO_DB = {}

    SHARD_TO_DB[0] = 0SHARD_TO_DB[1] = 0SHARD_TO_DB[2] = 0SHARD_TO_DB[3] = 0SHARD_TO_DB[4] = 1SHARD_TO_DB[5] = 1SHARD_TO_DB[6] = 1SHARD_TO_DB[7] = 1

    Monday, April 23, 12

  • instead of Django ORM, wrote really simple db

    abstraction layer

    Monday, April 23, 12

  • select/update/insert/delete

    Monday, April 23, 12

  • select(fields, table_name, shard_key, where_statement, where_parameters)

    Monday, April 23, 12

  • select(fields, table_name, shard_key, where_statement, where_parameters)

    ...shard_key % num_logical_shards = shard_id

    Monday, April 23, 12

  • in most cases, user_id for us

    Monday, April 23, 12

  • custom Django test runner to create/tear-down sharded DBs

    Monday, April 23, 12

  • most queries involve visiting handful of shards

    over one or two machines

    Monday, April 23, 12

  • if mapping across shards on single DB, UNION

    ALL to aggregate

    Monday, April 23, 12

  • clients to library pass in:((shard_key, id),

    (shard_key,id)) etc

    Monday, April 23, 12

  • library maps sub-selects to each shard, and each

    machine

    Monday, April 23, 12

  • p