Memcached Talk

Memcache Rob Sharp [email protected] Lead Developer The Sound Alliance

description

A talk on using Memcache in Ruby, given to the Ruby-on-Rails Sydney Group.

Transcript of Memcached Talk

Page 1: Memcached Talk

Memcache

Rob Sharp

[email protected]

Lead Developer, The Sound Alliance

Page 2: Memcached Talk

About Memcached

• conceived by Brad Fitzpatrick as a solution to the scaling issues faced by LiveJournal

• “memcached is a high-performance, distributed memory object caching system, generic in nature, but intended for use in speeding up dynamic web applications by alleviating database load”

Page 3: Memcached Talk

Who we are

• Sydney-based online media publishing

• Community and Content Sites

• Fasterlouder.com.au

• Inthemix.com.au

• Samesame.com.au

• Thoughtbythem - Marketing Agency

• White Label Gig Ticketing

Page 4: Memcached Talk

Who we are

• Inthemix.com.au

• Australia’s busiest music website

• ~ 250,000 pages per day

• Plus two other busy sites!

• Maximum performance for the hardware we have

Page 5: Memcached Talk

Current Architecture

• 3 Linux servers

• Apache

• Lighttpd

• Memcache

• 1 MySQL Master

Page 6: Memcached Talk

Why do we use memcached?

• CMS written in OO style, using Activerecord

• PHP4 and objects don't mix too well

• Activerecord gave us fast development but reduced performance

• Call us greedy, but we want both

• Use Rails Memcache!

Page 7: Memcached Talk

Our Application

• CMS written from the ground up

• Effectively three sites running on one codebase

• Uses three separate databases, but aiming to consolidate more

• Has data namespacing implemented in most places

• But separation is not quite there yet!

Page 8: Memcached Talk

Our Memcache Setup

• We have 3 webservers running memcache

• Each server runs three daemons on separate ports - one for each site (more on this later!)

Page 9: Memcached Talk

Memcache Pool

• Each web server's client is configured with all 3 daemons in the pool and connects to them over TCP (the daemons themselves do not talk to each other)

• This allows us to store data once, and access it from any server, whether in the pool or not

• Hashing algorithm means that a given key maps to a single server

• Efficient use of memory

• Efficient for cache clearing
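
The key-to-server mapping can be sketched as follows. This is a minimal illustration of modulo hashing, assuming a CRC32 checksum and hypothetical host names; it is not the exact algorithm used by memcache-client.

```ruby
require 'zlib'

# Hypothetical server list: one entry per daemon in the pool.
SERVERS = ['web1:11211', 'web2:11211', 'web3:11211'].freeze

# Map a key to exactly one server: checksum the key, then take it
# modulo the number of servers.
def server_for(key, servers = SERVERS)
  servers[Zlib.crc32(key) % servers.size]
end

%w[article:1 article:2 artist:42].each do |key|
  puts "#{key} -> #{server_for(key)}"
end
```

Because every client computes the same mapping, a value is stored once and any web server can find it, and clearing a key means talking to one daemon only.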

Page 10: Memcached Talk

Memcache Pool

• But what if we lose a server? We can

• Ignore it - we simply get misses for any keys we attempt to retrieve

• Remove it - our hashing algorithm breaks... :(

• We can also add new servers to the pool after data has been stored, but the same hashing problem occurs
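
To see why changing the pool "breaks" the hashing, the sketch below counts how many keys map to a different server once one of three daemons is removed. It assumes the same simple modulo scheme and hypothetical host names; with an evenly spread checksum, roughly two thirds of the keys would be expected to move.

```ruby
require 'zlib'

# Same modulo scheme as a simple client: checksum the key, then take it
# modulo the number of servers.
def server_for(key, servers)
  servers[Zlib.crc32(key) % servers.size]
end

# Count the keys whose server changes between two pool configurations.
def moved_keys(keys, before, after)
  keys.count { |key| server_for(key, before) != server_for(key, after) }
end

three = ['web1:11211', 'web2:11211', 'web3:11211'] # hypothetical hosts
two   = three.first(2)                              # web3 has been removed
keys  = (1..1000).map { |i| "article:#{i}" }

puts "#{moved_keys(keys, three, two)} of #{keys.size} keys now map to a different server"
```

Every moved key is a cache miss until it is stored again, which is why simply ignoring a dead server can be the cheaper option.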

Page 11: Memcached Talk

Memcache Pool

• Consistent hashing will solve the problem of removing or adding servers once data has been hashed

• Currently in its infancy - not really production ready

• We simply monitor our daemons and restart if required
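
For comparison, here is a minimal sketch of the consistent hashing idea: a hash ring. The class name, the number of points per server and the use of MD5 are assumptions made for illustration, not the behaviour of any particular client library.

```ruby
require 'digest/md5'

# A minimal hash ring: each server is placed at many points on a circle,
# and a key belongs to the first server point at or after the key's hash.
class HashRing
  POINTS_PER_SERVER = 100 # arbitrary choice for this sketch

  def initialize(servers)
    @ring = {}
    servers.each do |server|
      POINTS_PER_SERVER.times { |i| @ring[hash_of("#{server}-#{i}")] = server }
    end
    @sorted = @ring.keys.sort
  end

  def server_for(key)
    h = hash_of(key)
    point = @sorted.bsearch { |p| p >= h } || @sorted.first # wrap around
    @ring[point]
  end

  private

  # First 32 bits of the MD5 digest, as an integer position on the ring.
  def hash_of(value)
    Digest::MD5.hexdigest(value)[0, 8].to_i(16)
  end
end

three = HashRing.new(['web1:11211', 'web2:11211', 'web3:11211'])
two   = HashRing.new(['web1:11211', 'web2:11211'])

keys  = (1..1000).map { |i| "article:#{i}" }
moved = keys.count { |k| three.server_for(k) != two.server_for(k) }
puts "#{moved} of #{keys.size} keys moved after removing web3"
```

Only keys that lived on the removed server change owner; keys on the surviving servers stay where they were, so most of the cache remains warm.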

Page 12: Memcached Talk

Installing Memcached

• Available in most Linux distros

• packaged for Fedora, RHEL4/5, Ubuntu, Debian, Gentoo and BSD

• OSX? Use Ports!

• sudo port install memcached

• sudo gem install memcache-client

• sudo gem install cached_model

Page 13: Memcached Talk

Memcache and Ruby

• We’ll use the memcache-client gem

• Pure Ruby implementation

• Pretty fast!

Page 14: Memcached Talk

Storing Stuff

rsharp$ sudo gem install memcache-client

require 'memcache'

memcache_options = { :compression => true, :debug => false, :namespace => 'my_favourite_artists', :readonly => false, :urlencode => false }

Cache = MemCache.new memcache_options

Cache.servers = 'localhost:11211'

Page 15: Memcached Talk

Storing Stuff

Cache.set 'favourite_artist', 'Salvador Dali'

skateboarder = Cache.get 'favourite_artist'

Cache.delete 'favourite_artist'
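
In practice, get and set are usually combined into a "fetch or compute" helper. The sketch below shows that pattern. FakeCache and fetch are hypothetical names; FakeCache is an in-memory stand-in for the Cache object above so that the example runs without a memcached daemon.

```ruby
# In-memory stand-in for the Cache object above, so the sketch runs
# without a memcached daemon. It only mimics get, set and delete.
class FakeCache
  def initialize
    @data = {}
  end

  def get(key)
    @data[key]
  end

  def set(key, value)
    @data[key] = value
  end

  def delete(key)
    @data.delete(key)
  end
end

# Return the cached value, or run the block, cache its result and return it.
def fetch(cache, key)
  value = cache.get(key)
  return value unless value.nil?

  value = yield
  cache.set(key, value)
  value
end

cache = FakeCache.new
lookups = 0
2.times { fetch(cache, 'favourite_artist') { lookups += 1; 'Salvador Dali' } }
puts lookups # 1: the second call is served from the cache
```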

Page 16: Memcached Talk

Memcache Namespaces

• Memcache doesn’t have namespaces, so we have to improvise

• Prefix your keys with a namespace by setting the namespace when you connect

• Our solution:

• Run multiple memcache instances on different ports
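
Key prefixing can be sketched like this. NamespacedKeys is a hypothetical helper written for illustration; a client that accepts a namespace option does something along these lines internally, though the exact key format is an assumption here.

```ruby
# Hypothetical helper: prefix every key with a per-site namespace so that
# sites sharing one daemon cannot read or overwrite each other's entries.
class NamespacedKeys
  def initialize(namespace)
    @namespace = namespace
  end

  def key_for(key)
    "#{@namespace}:#{key}"
  end
end

inthemix     = NamespacedKeys.new('inthemix')
fasterlouder = NamespacedKeys.new('fasterlouder')

puts inthemix.key_for('article:1')     # inthemix:article:1
puts fasterlouder.key_for('article:1') # fasterlouder:article:1
```

Prefixes keep keys apart, but they do not let you flush one site on its own. Running one daemon per site, each on its own port, does.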

Page 17: Memcached Talk

Roll your own?

• Memcache-client provides basic cache methods

• What if we extended ActiveRecord?

• We can, with cached_model

Page 18: Memcached Talk

Storing Stuff Part Deux

rsharp$ sudo gem install cached_model

require 'cached_model'

memcache_options = { :compression => true, :debug => false, :namespace => 'hifibuys', :readonly => false, :urlencode => false }

CACHE = MemCache.new memcache_options

CACHE.servers = 'localhost:11211'

Page 19: Memcached Talk

Storing Stuff Part Deux

class Artist < CachedModel

end

Page 20: Memcached Talk

cached_model Performance

• CachedModel is not magic.

• CachedModel only accelerates simple finds for single rows.

• CachedModel won’t cache every query you run.

• CachedModel isn’t smart enough to determine the dependencies between your queries so that it can accelerate more complicated queries. If you want to cache more complicated queries you need to do it by hand.
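
Caching a complicated query "by hand" mostly comes down to choosing a good key. The sketch below is one possible approach, with hypothetical helper names and a plain Hash standing in for the memcache client. The parameters are digested because memcached keys are limited to 250 bytes and may not contain whitespace.

```ruby
require 'digest/md5'

# Hypothetical helper: build a stable cache key from a query name and its
# parameters, so the same query with the same arguments hits the same entry.
def query_cache_key(name, params)
  canonical = params.sort_by { |k, _| k.to_s }.map { |k, v| "#{k}=#{v}" }.join('&')
  "query:#{name}:#{Digest::MD5.hexdigest(canonical)}"
end

STORE = {} # plain Hash standing in for the memcache client

# Run the block only when there is no cached result for these parameters.
def cached_query(name, params)
  key = query_cache_key(name, params)
  STORE[key] ||= yield
end

calls  = 0
params = { genre: 'techno', limit: 10 }
2.times do
  cached_query('artists_by_genre', params) { calls += 1; ['Artist A', 'Artist B'] }
end
puts calls # 1: the expensive query ran once
```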

Page 21: Memcached Talk

Other options

• acts_as_cached provides a similar solution

Page 22: Memcached Talk

Memcache Storage

• Memcache stores blobs

• The memcache client handles marshalling, so you can easily cache objects

• This does however mean that the objects aren’t necessarily cross-language
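
The marshalling step can be illustrated with Ruby's built-in Marshal module. The Artist struct is a hypothetical example object, and the JSON variant is shown only as one possible option when another language has to read the same entry.

```ruby
require 'json'

Artist = Struct.new(:id, :name)
artist = Artist.new(1, 'Salvador Dali')

# Roughly what a Ruby client does before storing: turn the object into bytes.
blob     = Marshal.dump(artist)
restored = Marshal.load(blob)
puts restored == artist # true: same Struct class, same field values

# Marshal output is a Ruby-specific format. If another language has to read
# the entry, one option is to store a neutral format such as JSON instead.
puts JSON.generate(artist.to_h)
```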

Page 23: Memcached Talk

Memcache Storage

• The most obvious things to store are objects

• We cache articles

• We cache collections of articles

• We cache template data

• We cache fragments

• We don’t cache SQL queries

Page 24: Memcached Talk

What we cache

• Our sites are fairly big and data-rich communities

• Almost every page has editorially controlled attributes along with user generated content

• Like...

Page 25: Memcached Talk

Our Example Dataset

• Article

• Joins Artists

• Joins Locations

• Joins Genres

• Joins Related Content

• Joins Related Forum Activity

• Joins Related Gallery Data

Page 26: Memcached Talk

Our Example Dataset

• Article (continues)

• ...

• Joins Media Content

• Joins Comments

• Joins ‘Rollcalls’

• Joins other secret developments

Page 27: Memcached Talk

Our Example

• An article requires many data facets

• Most don’t change that often

• We also know when they change

• Yay for the Observer pattern

• User content changes much more regularly

• Can be changed from outside our controlled area (e.g. Forums)
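
Knowing when content changes means the cache can be expired at that moment instead of waiting for a timeout. The sketch below hand-rolls a tiny observer hook to show the idea. The class and method names are hypothetical, a Hash stands in for memcache, and Rails' own observer and sweeper mechanisms are not used here.

```ruby
# Hypothetical model with a hand-rolled observer hook.
class Article
  attr_reader :id, :title

  @observers = []
  class << self
    attr_reader :observers
  end

  def initialize(id, title)
    @id = id
    @title = title
  end

  # Change the title, then notify every registered observer.
  def update_title(title)
    @title = title
    self.class.observers.each { |observer| observer.after_save(self) }
  end
end

# Deletes the cached copy whenever an article is saved.
class ArticleCacheSweeper
  def initialize(cache)
    @cache = cache
  end

  def after_save(article)
    @cache.delete("article:#{article.id}")
  end
end

cache = { 'article:1' => 'stale rendered article' } # Hash standing in for memcache
Article.observers << ArticleCacheSweeper.new(cache)

Article.new(1, 'Old title').update_title('New title')
puts cache.key?('article:1') # false: the entry was expired on save
```

Content that changes outside the application, such as forum posts, never triggers this hook, which is one reason to cache it separately with a short expiry.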

Page 28: Memcached Talk

Our Example Summary

• Data can be loosely divided into editorially controlled and user-generated

• Cache editorially controlled content separately from user-generated content

• Simplest way to implement is in fragment caching

Page 29: Memcached Talk

Fragment Caching

• Memcache allows timed expiry of fragments

• Identify areas that change infrequently and cache

• Remember to measure performance before and after

• Evidence suggests very large gains!

• Use memcache_fragments
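
Timed expiry can be sketched without a daemon. FragmentStore below is a hypothetical stand-in: memcached applies the expiry on the server, whereas this sketch emulates it with an injectable clock so the behaviour is easy to follow.

```ruby
# Stand-in fragment store with timed expiry, emulated with an injectable
# clock so the sketch runs without a daemon.
class FragmentStore
  def initialize(clock = -> { Time.now })
    @clock = clock
    @entries = {}
  end

  # Return the cached fragment if it is still fresh, otherwise render it
  # with the block and keep it for expires_in seconds.
  def fetch(key, expires_in)
    entry = @entries[key]
    return entry[:html] if entry && entry[:expires_at] > @clock.call

    html = yield
    @entries[key] = { html: html, expires_at: @clock.call + expires_in }
    html
  end
end

now     = Time.at(0)
store   = FragmentStore.new(-> { now })
renders = 0
render  = -> { renders += 1; '<ul>latest articles</ul>' }

store.fetch('sidebar', 600, &render) # miss: renders the fragment
store.fetch('sidebar', 600, &render) # hit: served from the store
now += 601                           # just over ten minutes pass
store.fetch('sidebar', 600, &render) # expired: renders again
puts renders # 2
```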

Page 30: Memcached Talk

Caching Fragments

rsharp$ sudo gem install memcache_fragments

require 'memcache_fragments'

memcache_options = { :compression => true, :debug => false, :namespace => 'hifibuys', :readonly => false, :urlencode => false }

CACHE = MemCache.new memcache_options

CACHE.servers = 'localhost:11211'

Page 31: Memcached Talk

Caching Fragments

ActionController::Base.fragment_cache_store = :mem_cache_store, {}

ActionController::Base.fragment_cache_store.data = CACHE, {}

ActionController::CgiRequest::DEFAULT_SESSION_OPTIONS.merge!({ 'cache' => CACHE })

Page 32: Memcached Talk

Caching Fragments

<% cache 'my/cache/key', :expire => 10.minutes do %> ...

<% end %>

Page 33: Memcached Talk

Memcache Sessions

• We could store our session in Memcache

• Great for load balancing - share across a server farm without using a DB store

• Ideal for transient data

• Solution exists in db_memcache_store

• DB backend with memcache layer - the best of both worlds
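
The "DB backend with memcache layer" idea can be sketched as a two-layer store. This is an illustration only and is not taken from db_memcache_store; the class name is hypothetical and both layers are plain Hashes.

```ruby
# Sketch of a two-layer session store: a fast cache in front of a durable
# backend. Both layers are plain Hashes here; in practice they would be a
# memcache client and a database table.
class LayeredSessionStore
  def initialize(cache, database)
    @cache = cache
    @database = database
  end

  # Write to both layers so the session survives a cache restart.
  def write(session_id, data)
    @database[session_id] = data
    @cache[session_id] = data
  end

  # Read from the cache first; on a miss, fall back to the database and
  # repopulate the cache.
  def read(session_id)
    return @cache[session_id] if @cache.key?(session_id)

    data = @database[session_id]
    @cache[session_id] = data unless data.nil?
    data
  end
end

cache    = {}
database = {}
sessions = LayeredSessionStore.new(cache, database)

sessions.write('abc123', { user_id: 42 })
cache.clear                        # simulate a memcached restart
puts sessions.read('abc123')[:user_id] # 42, served from the database
puts cache.key?('abc123')              # true: the cache was repopulated
```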

Page 34: Memcached Talk

In Summary

• Memcache gives you a distributed cache store

• Very fast and very easy to use

• Lots of Ruby and Rails libraries

• memcache-client

• cached_model

• db_memcache_store

• memcache_fragments

Page 35: Memcached Talk

Any Questions?