Page 1: Introduction to memcached

INTRODUCTION TO MEMCACHED

Page 2: Introduction to memcached

Tags: memcached, performance, scalability, php, mySQL, caching techniques, #ikdoeict

Page 3: Introduction to memcached

jurriaanpersyn.com
lead web dev at Netlog for 4 years
php + mysql + frontend
working on Gatcha

Page 4: Introduction to memcached

For whom?
A talk for students of the professional bachelor ICT (www.ikdoeict.be)

Page 5: Introduction to memcached

Why this talk?
One of the first things I’ve learnt at Netlog. Using it every single day.

Page 6: Introduction to memcached

Program
- About caching
- About memcached
- Examples
- Tips & tricks
- Toolsets and other solutions

Page 7: Introduction to memcached

What is caching?
A copy of real data with faster (and/or cheaper) access

Page 8: Introduction to memcached

• From Wikipedia: "A cache is a collection of data duplicating original values stored elsewhere or computed earlier, where the original data is expensive to fetch (owing to longer access time) or to compute, compared to the cost of reading the cache."

• Term introduced by IBM in the 1960s

What is caching?

Page 9: Introduction to memcached

• simple key/value storage

• simple operations

• save

• get

• delete

The anatomy

Page 10: Introduction to memcached

• storage cost

• retrieval cost (network load / algorithm load)

• invalidation (keeping data up to date / removing irrelevant data)

• replacement policy (FIFO/LFU/LRU/MRU/RANDOM vs. Belady’s algorithm)

• cold cache / warm cache

Terminology
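
Of the replacement policies listed above, LRU (least recently used) is the easiest to picture: when the cache is full, the entry that has gone unused the longest is thrown out. The following is a minimal in-memory sketch of that idea. The LruCache class is hypothetical and only illustrates the policy; it is not part of memcached or of any PHP extension.

```php
<?php
// Minimal LRU cache: PHP arrays keep insertion order, so the first key in
// $items is always the least recently used one.
final class LruCache
{
    private array $items = [];
    private int $capacity;

    public function __construct(int $capacity)
    {
        $this->capacity = $capacity;
    }

    public function get(string $key)
    {
        if (!array_key_exists($key, $this->items)) {
            return null; // cache miss
        }
        $value = $this->items[$key];
        unset($this->items[$key]);    // re-insert to mark as most recently used
        $this->items[$key] = $value;
        return $value;
    }

    public function set(string $key, $value): void
    {
        unset($this->items[$key]);
        $this->items[$key] = $value;
        if (count($this->items) > $this->capacity) {
            // over capacity: evict the least recently used entry
            unset($this->items[array_key_first($this->items)]);
        }
    }
}

$lru = new LruCache(2);
$lru->set('a', 1);
$lru->set('b', 2);
$lru->get('a');       // 'a' is now more recently used than 'b'
$lru->set('c', 3);    // cache is full, so 'b' is evicted
```

After the last call, 'a' and 'c' are still cached and 'b' is gone, even though 'b' was stored after 'a'.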

Page 11: Introduction to memcached

• cache hit and cache miss

• typical stats:

• hit ratio (hits / (hits + misses))

• miss ratio (1 - hit ratio)

• 45 cache hits and 10 cache misses

• 45/(45+10) = 82% hit ratio

• 18% miss ratio

Terminology
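
The arithmetic on this slide can be written down as a small function. This is only a sketch; the name hitRatio is made up for the example.

```php
<?php
// Hit ratio = hits / (hits + misses); 0.0 when nothing has been requested yet.
function hitRatio(int $hits, int $misses): float
{
    $total = $hits + $misses;
    return $total === 0 ? 0.0 : $hits / $total;
}

$ratio = hitRatio(45, 10);
printf("hit ratio: %.0f%%, miss ratio: %.0f%%\n", $ratio * 100, (1 - $ratio) * 100);
```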

Page 12: Introduction to memcached

• caches are only efficient when the benefits of faster access outweigh the overhead of checking and keeping your cache up to date

• more cache hits than cache misses

When to cache?

Page 13: Introduction to memcached

• at hardware level (cpu, hdd)

• operating systems (ram)

• web stack

• applications

• your own short term vs long term memory

Where are caches used?

Page 14: Introduction to memcached

• Browser cache

• DNS cache

• Content Delivery Networks (CDN)

• Proxy servers

• Application level

• full output caching (eg. Wordpress WP-Cache plugin)

• ...

Caches in the web stack

Page 15: Introduction to memcached

• Application level

• opcode cache (APC)

• query cache (MySQL)

• storing denormalized results in the database

• object cache

• storing values in php objects/classes

Caches in the web stack (cont’d)

Page 16: Introduction to memcached

• the earlier in the process, the closer to the original request(er), the faster

• browser cache will be faster than cache on a proxy

• but probably also the harder to get it right

• the closer to the requester, the more parameters the cache depends on

Efficiency of caching?

Page 17: Introduction to memcached

• As PHP backend developer, what to cache?

• expensive operations: operations that work with slower resources

• database access

• reading files (in fact, any filesystem access)

• API calls

• Heavy computations

• XML

What to cache on the server-side?

Page 18: Introduction to memcached

• As PHP backend developer, where to store cache results?

• in database (computed values, generated html)

• you’ll still need to access your database

• in static files (generated html or serialized php values)

• you’ll still need to access your file system

Where to cache on the server-side?

Page 19: Introduction to memcached

in memory!

Page 20: Introduction to memcached

memcached

Page 21: Introduction to memcached

• Free & open source, high-performance, distributed memory object caching system

• Generic in nature, intended for use in speeding up dynamic web applications by alleviating database load.

• key/value dictionary

About memcached

Page 22: Introduction to memcached

• Developed by Brad Fitzpatrick for LiveJournal in 2003

• Now used by Netlog, Facebook, Flickr, Wikipedia, Twitter, YouTube ...

About memcached (cont’d)

Page 23: Introduction to memcached

• It’s a server

• Client access over TCP or UDP

• Servers can run in pools

• eg. 3 servers with 64GB mem each give you a single pool of 192GB storage for caching

• Servers are independent, clients manage the pool

Technically

Page 24: Introduction to memcached

• high demand (used often)

• expensive (hard to compute)

• common (shared across users)

• Best? All three

What to store in memcache?

Page 25: Introduction to memcached

• Typical:

• user sessions (often)

• user data (often, shared)

• homepage data (eg. often, shared, expensive)

What to store in memcache? (cont’d)

Page 26: Introduction to memcached

• Workflow:

• monitor application (query logs / profiling)

• add a caching level

• compare speed gain

What to store in memcache? (cont’d)

Page 27: Introduction to memcached

• Fast network access (memcached servers close to other application servers)

• No persistency (if your server goes down, data in memcached is gone)

• No redundancy / fail-over

• No replication (single item in cache lives on one server only)

• No authentication (so not suited for shared environments)

Memcached principles

Page 28: Introduction to memcached

• 1 item (the value stored under a key) is at most 1 MB by default

• keys are strings of at most 250 characters (in applications typically the MD5 of a user-readable string)

• No enumeration of keys (thus no list of valid keys in cache at a certain moment, no list of keys beginning with “user_”, ...)

• No active clean-up (only clean up when more space needed, LRU)

Memcached principles (cont’d)
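
The remark about MD5 keys can be sketched as follows. The helper name cacheKey and the 'app_' prefix are assumptions made for this example; any naming scheme works as long as the resulting key stays within the length limit.

```php
<?php
// Hypothetical helper: turn any readable name into a safe memcached key.
function cacheKey(string $readableName, string $prefix = 'app_'): string
{
    // md5() always yields 32 hex characters: no spaces or control characters,
    // and well below the 250-character key limit.
    return $prefix . md5($readableName);
}

$key = cacheKey('last blog posts of user 42, page 1');
echo $key, ' (', strlen($key), " chars)\n";
```

The trade-off is that the key is no longer readable, which is one more reason to keep the readable name around in your own logging.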

Page 29: Introduction to memcached

$ telnet localhost 11211
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
get foo
VALUE foo 0 2
hi
END
stats
STAT pid 8861
(etc)

Page 30: Introduction to memcached

• both an ASCII and a binary protocol

• in real life:

• clients available for all major languages

• C, C++, PHP, Python, Ruby, Java, Perl, Windows, ...

Client Access

Page 31: Introduction to memcached

• Support the basics such as multiple servers, setting values, getting values, incrementing, decrementing and getting stats.

• pecl/memcache

• pecl/memcached

• newer, in beta, a couple more features

PHP Clients

Page 32: Introduction to memcached

                               pecl/memcache         pecl/memcached
First Release Date             2004-06-08            2009-01-29 (beta)
Actively Developed?            Yes                   Yes
External Dependency            None                  libmemcached

Features
Automatic Key Fixup            Yes                   No
Append/Prepend                 No                    Yes
Automatic Serialization        Yes                   Yes
Binary Protocol                No                    Optional
CAS                            No                    Yes
Compression                    Yes                   Yes
Communication Timeout          Connect Only          Various Options
Consistent Hashing             Yes                   Yes
Delayed Get                    No                    Yes
Multi-Get                      Yes                   Yes
Session Support                Yes                   Yes
Set/Get to a specific server   No                    Yes
Stores Numerics                Converted to Strings  Yes

PHP Client Comparison

Page 33: Introduction to memcached

• Memcached::add - Add an item under a new key

• Memcached::addServer - Add a server to the server pool

• Memcached::decrement - Decrement numeric item's value

• Memcached::delete - Delete an item

• Memcached::flush - Invalidate all items in the cache

• Memcached::get - Retrieve an item

• Memcached::getMulti - Retrieve multiple items

• Memcached::getStats - Get server pool statistics

• Memcached::increment - Increment numeric item's value

• Memcached::set - Store an item

• ...

PHP Client functions

Page 34: Introduction to memcached

• Pages with high load / expensive to generate

• Very easy

• Very fast

• But: all the dependencies ...

• language, css, template, logged in user’s details, ...

Output caching

Page 35: Introduction to memcached

<?php

$html = $cache->get('mypage');
if (!$html)
{
    ob_start();
    echo "<html>";
    // all the fancy stuff goes here
    echo "</html>";
    $html = ob_get_contents();
    ob_end_clean();
    $cache->set('mypage', $html);
}
echo $html;

?>
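
The example above stores the page under the fixed key 'mypage', which ignores the dependencies mentioned on the previous slide (language, template, logged-in user, ...). One possible way to handle them, sketched here with a hypothetical pageCacheKey helper and made-up dependency names, is to fold every dependency into the cache key so each variant of the page gets its own entry.

```php
<?php
// Hypothetical helper: fold everything the rendered page depends on into its key.
function pageCacheKey(string $page, array $dependencies): string
{
    ksort($dependencies); // same dependencies in a different order => same key
    return 'page_' . md5($page . '|' . json_encode($dependencies));
}

// Two variants of the same page that must not share a cache entry.
$dutch   = pageCacheKey('mypage', ['language' => 'nl', 'template' => 'v2', 'uid' => 42]);
$english = pageCacheKey('mypage', ['language' => 'en', 'template' => 'v2', 'uid' => 42]);
```

The more dependencies end up in the key, the fewer requests share an entry, so the hit ratio drops. That is the price of output caching close to the user.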

Page 36: Introduction to memcached

• on a lower level

• easier to find all dependencies

• ideal solution for offloading database queries

• the database is almost always the biggest bottleneck in backend performance problems

Data caching

Page 37: Introduction to memcached

<?php

function getUserData($UID)
{
    global $cache; // shared cache client, created elsewhere

    $key = 'user_' . $UID;
    $userData = $cache->get($key);
    if (!$userData)
    {
        // cache miss: fall back to the database and store the result
        $queryResult = Database::query("SELECT * FROM USERS WHERE uid = " . (int) $UID);
        $userData = $queryResult->getRow();
        $cache->set($key, $userData);
    }
    return $userData;
}

?>

Page 38: Introduction to memcached

“There are only two hard things in Computer Science: cache invalidation and naming things.”

Phil Karlton

Page 39: Introduction to memcached

• Caching for a certain amount of time

• eg. 10 minutes

• don’t delete caches

• thus: You can’t trust that data coming from cache is correct

Invalidation

Page 40: Introduction to memcached

• Use: Great for summaries

• Overview

• Pages where it’s not that big a problem if data is a little bit out of date (eg. search results)

• Good for quick and dirty optimizations

Invalidation (cont’d)
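
Time-based invalidation can be sketched without a memcached server. The TtlCache class below is a hypothetical in-memory stand-in, and the current time is passed in explicitly only to make the example deterministic; a real client takes the expiry time when storing and handles the clock itself.

```php
<?php
// In-memory stand-in for a cache with expiry; NOT the memcached client.
final class TtlCache
{
    private array $items = []; // key => [value, expiresAt]

    public function set(string $key, $value, int $ttl, int $now): void
    {
        $this->items[$key] = [$value, $now + $ttl];
    }

    public function get(string $key, int $now)
    {
        if (!isset($this->items[$key]) || $this->items[$key][1] <= $now) {
            unset($this->items[$key]); // expired or never stored
            return false;
        }
        return $this->items[$key][0];
    }
}

$cache = new TtlCache();
$cache->set('search_php', ['result 1', 'result 2'], 600, 1000); // keep for 10 minutes
$fresh = $cache->get('search_php', 1300); // 5 minutes later: still served from cache
$stale = $cache->get('search_php', 1601); // after 10 minutes: miss, recompute
```

Between second 1000 and second 1600 the cached search results are served even if the underlying data changed, which is exactly the "a little bit out of date" trade-off.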

Page 41: Introduction to memcached

• Store forever, and expire on certain events

• the userdata example

• store userdata forever

• when user changes any of his preferences, throw cache away

Invalidation (cont’d)

Page 42: Introduction to memcached

• Use:

• data that is fetched more often than it’s updated

• where it’s critical the data is correct

• Improvement: instead of delete on event, update cache on event. (Mind: race conditions. Cache invalidation always as close to original change as possible!)

Invalidation
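
The suggested improvement (update the cache on the event instead of deleting it) could look like the sketch below. Plain arrays stand in for the database table and for memcached, and the function names and sample data are made up for the example.

```php
<?php
// Plain arrays stand in for the USERS table and for memcached.
$database = [7 => ['uid' => 7, 'nickname' => 'jurriaan']];
$cache = [];

function getUser(int $uid, array $database, array &$cache): array
{
    $key = 'user_' . $uid;
    if (!isset($cache[$key])) {
        $cache[$key] = $database[$uid]; // miss: load and store "forever"
    }
    return $cache[$key];
}

function updateNickname(int $uid, string $nickname, array &$database, array &$cache): void
{
    $database[$uid]['nickname'] = $nickname;   // 1. change the original data
    $cache['user_' . $uid] = $database[$uid];  // 2. refresh the cache right away
}

getUser(7, $database, $cache);                 // warms the cache
updateNickname(7, 'jp', $database, $cache);
$user = getUser(7, $database, $cache);         // served from the refreshed cache
```

Keeping step 2 directly after step 1 is what the slide means by invalidating as close to the original change as possible: the longer the gap, the bigger the window in which another request can read or re-store the old value.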

Page 43: Introduction to memcached

• sessions (cross server)

• database results (via database class, or object caching)

• flooding checks

• output caching (eg. for RSS feeds)

• locks

Uses at Netlog

Page 44: Introduction to memcached

<?php
function getUserData($UID)
{
    $db = DB::getInstance();
    $db->prepare("SELECT * FROM USERS WHERE uid = {UID}");
    $db->assignInt('UID', $UID);
    $db->execute();
    return $db->getRow();
}
?>

Page 45: Introduction to memcached

<?php
function getUserData($UID)
{
    $db = DB::getInstance();
    $db->prepare("SELECT * FROM USERS WHERE uid = {UID}");
    $db->assignInt('UID', $UID);
    $db->setCacheTTL(0); // cache forever
    $db->execute();
    return $db->getRow();
}
?>

Page 46: Introduction to memcached

<?php
function getUserData($UID, $invalidateCache = false)
{
    $db = DB::getInstance();
    $db->prepare("SELECT * FROM USERS WHERE uid = {UID}");
    $db->assignInt('UID', $UID);
    $db->setCacheTTL(0); // cache forever
    if ($invalidateCache)
    {
        return $db->invalidateCache();
    }
    $db->execute();
    return $db->getRow();
}
?>

Page 47: Introduction to memcached

<?php
function updateUserData($UID, $data)
{
    $db = DB::getInstance();
    $db->prepare("UPDATE USERS SET ... WHERE uid = {UID}");
    ...
    getUserData($UID, true); // invalidate cache
    return $result;
}
?>

Page 48: Introduction to memcached

<?php
function getLastBlogPosts($UID, $start = 0, $limit = 10, $invalidateCache = false)
{
    $db = DB::getInstance();
    $db->prepare("SELECT blogid FROM BLOGS WHERE uid = {UID} ORDER BY dateadd DESC LIMIT {start}, {limit}");
    $db->assignInt('start', $start);
    $db->assignInt('limit', $limit);
    $db->assignInt('UID', $UID);
    $db->setCacheTTL(0); // cache forever
    if ($invalidateCache)
    {
        return $db->invalidateCache();
    }
    $db->execute();
    return $db->getResults();
}
?>

Page 49: Introduction to memcached

<?php
function addNewBlogPost($UID, $data)
{
    $db = DB::getInstance();
    $db->prepare("INSERT INTO BLOGS ...");
    ...
    // invalidate caches
    getLastBlogPosts($UID, 0, 10, true);
    getLastBlogPosts($UID, 11, 20, true);
    ... // ???
    return $result;
}
?>

Page 50: Introduction to memcached

<?php
function getLastBlogPosts($UID, $start = 0, $limit = 10)
{
    $cacheVersionNumber = CacheVersionNumbers::get('lastblogsposts_' . $UID);
    $db = DB::getInstance();
    $db->prepare("SELECT blogid FROM ...");
    ...
    $db->setCacheVersionNumber($cacheVersionNumber);
    $db->setCacheTTL(0); // cache forever
    $db->execute();
    return $db->getResults();
}
?>

Page 51: Introduction to memcached

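<?php
class CacheVersionNumbers
{
    public static function get($name)
    {
        global $cache; // shared cache client, created elsewhere
        $result = $cache->get('cvn_' . $name);
        if (!$result)
        {
            // no version yet (or it was bumped): create a new one
            $result = microtime() . rand(0, 1000);
            $cache->set('cvn_' . $name, $result);
        }
        return $result;
    }

    public static function bump($name)
    {
        global $cache;
        // deleting the version forces get() to create a new one
        return $cache->delete('cvn_' . $name);
    }
}
?>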

Page 52: Introduction to memcached

<?php
function addNewBlogPost($UID, $data)
{
    $db = DB::getInstance();
    $db->prepare("INSERT INTO BLOGS ...");
    ...
    CacheVersionNumbers::bump('lastblogsposts_' . $UID);
    return $result;
}
?>
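
The slides do not show what the database class does with the version number. A plausible reading, sketched below under that assumption, is that the version becomes part of every cache key, so bumping it makes all pages of a user's blog list unreachable in one go. A plain array stands in for memcached, and versionFor, bump and blogPostsKey are names made up for this sketch.

```php
<?php
// A plain array stands in for memcached in this sketch.
$cache = [];

function versionFor(string $name, array &$cache): string
{
    $key = 'cvn_' . $name;
    if (!isset($cache[$key])) {
        // any value that differs after each bump works as a version number
        $cache[$key] = bin2hex(random_bytes(8));
    }
    return $cache[$key];
}

function bump(string $name, array &$cache): void
{
    unset($cache['cvn_' . $name]); // the next versionFor() call creates a new version
}

function blogPostsKey(int $uid, int $start, int $limit, array &$cache): string
{
    // The version is part of every page's key, so one bump orphans all pages.
    $version = versionFor('lastblogsposts_' . $uid, $cache);
    return md5("lastblogsposts_{$uid}_{$start}_{$limit}_{$version}");
}

$before = blogPostsKey(42, 0, 10, $cache);
$same   = blogPostsKey(42, 0, 10, $cache); // same version, same key
bump('lastblogsposts_42', $cache);
$after  = blogPostsKey(42, 0, 10, $cache); // new version, new key
```

The old entries are never deleted explicitly. They are simply never asked for again and disappear when memcached needs the space.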

Page 54: Introduction to memcached

• queries with JOIN and WHERE statements are harder to cache

• often not easy to find the cache key on update/change events

• solution: JOIN in PHP

• In following example: what if nickname of user changes?

Query Caching (cont’d)

Page 55: Introduction to memcached

<?php
$db = DB::getInstance();
$db->prepare("SELECT c.comment_message, c.comment_date, u.nickname
    FROM COMMENTS c
    JOIN USERS u ON u.uid = c.commenter_uid
    WHERE c.postid = {postID}");
...
?>

Page 56: Introduction to memcached

<?php
$db = DB::getInstance();
$db->prepare("SELECT c.comment_message, c.comment_date, c.commenter_uid AS uid
    FROM COMMENTS c
    WHERE c.postid = {postID}");
...
$comments = Users::addUserDetails($comments);
...
?>

Page 57: Introduction to memcached

<?php
...
public static function addUserDetails($array)
{
    foreach ($array as &$item)
    {
        // assume high hit ratio
        $item = array_merge($item, self::getUserData($item['uid']));
    }
    unset($item); // break the reference left over from the loop
    return $array;
}
...
?>

Page 58: Introduction to memcached

• Pros:

• speed, duh.

• queries get simpler (better for your db)

• easier porting to key/value storage solutions

• Cons:

• You’re relying on memcached to be up and have good hit ratios

So?

Page 59: Introduction to memcached

• We reduced database access

• Memcached is faster, but access to memcached still has its price

• Solution: multiget

• fetch multiple keys from memcached in one single call

• result is array of items

Multi-Get Optimisations

Page 60: Introduction to memcached

• back to addUserDetails example

• find UID’s from array

• multiget to memcached for details of UID’s

• for UID’s without result, do a query

• SELECT ... FROM USERS WHERE uid IN (...)

• for each fetched user, store in cache

• worst case (no hits): 1 query

• return merged cache/db results

Multi-Get Optimisations (cont’d)
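
The steps above can be sketched as follows. Arrays stand in for memcached and for the USERS table, and the helper functions getMulti and queryUsersIn are stand-ins written for this example (a real client's multi-get and a real IN query would take their place). The sample users are made up.

```php
<?php
// Stand-ins: $cache plays memcached, $usersTable plays the USERS table.
$cache = ['user_1' => ['uid' => 1, 'nickname' => 'an']];
$usersTable = [
    1 => ['uid' => 1, 'nickname' => 'an'],
    2 => ['uid' => 2, 'nickname' => 'bo'],
    3 => ['uid' => 3, 'nickname' => 'cy'],
];
$queryCount = 0;

// Like a client's multi-get: one call, returns only the keys that were found.
function getMulti(array $keys, array $cache): array
{
    return array_intersect_key($cache, array_flip($keys));
}

// Like SELECT ... FROM USERS WHERE uid IN (...): one query for all misses.
function queryUsersIn(array $uids, array $table, int &$queryCount): array
{
    $queryCount++;
    return array_intersect_key($table, array_flip($uids));
}

function addUserDetails(array $items, array &$cache, array $table, int &$queryCount): array
{
    // find the UIDs and fetch them from the cache in a single call
    $uids = array_unique(array_column($items, 'uid'));
    $keys = array_map(fn ($uid) => 'user_' . $uid, $uids);
    $users = getMulti($keys, $cache);

    // one query for everything the cache did not have
    $missing = array_filter($uids, fn ($uid) => !isset($users['user_' . $uid]));
    if ($missing) {
        foreach (queryUsersIn($missing, $table, $queryCount) as $uid => $row) {
            $cache['user_' . $uid] = $row; // store each fetched user in cache
            $users['user_' . $uid] = $row;
        }
    }

    // merge cache and database results into the original items
    foreach ($items as &$item) {
        $item = array_merge($item, $users['user_' . $item['uid']] ?? []);
    }
    unset($item);
    return $items;
}

$comments = [
    ['uid' => 1, 'text' => 'hi'],
    ['uid' => 3, 'text' => 'yo'],
    ['uid' => 1, 'text' => 'ok'],
];
$comments = addUserDetails($comments, $cache, $usersTable, $queryCount);
```

With three comments from two users, this costs one cache call and one query instead of three separate cache calls and possibly several queries.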

Page 61: Introduction to memcached

• client is responsible for managing pool

• hashes a certain key to a certain server

• clients can be naïve: distribute keys on size of pool

• if one server goes down, all keys will now be queried on other servers > cold cache

• use a client with consistent hashing algorithms, so if server goes down, only data on that server gets lost

Consistent Hashing
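
The difference between a naive client and one with consistent hashing can be made concrete with a small experiment. This is a simplified sketch assuming a 64-bit PHP build, not the algorithm of any particular client; the server names, the use of crc32() and the 100 points per server are arbitrary choices for the example.

```php
<?php
// Naive distribution: the server index depends on the size of the pool.
function naiveServer(string $key, array $servers): string
{
    return $servers[crc32($key) % count($servers)];
}

// Consistent hashing: servers and keys share one hash ring; a key belongs to
// the first server point at or after its own hash (wrapping around).
function buildRing(array $servers, int $pointsPerServer = 100): array
{
    $ring = [];
    foreach ($servers as $server) {
        for ($i = 0; $i < $pointsPerServer; $i++) {
            $ring[crc32($server . '#' . $i)] = $server;
        }
    }
    ksort($ring);
    return $ring;
}

function ringServer(string $key, array $ring): string
{
    $hash = crc32($key);
    foreach ($ring as $point => $server) {
        if ($point >= $hash) {
            return $server;
        }
    }
    return reset($ring); // past the last point: wrap to the first one
}

// Count how many of 1000 keys change server when 'cache3' drops out.
$all  = ['cache1', 'cache2', 'cache3'];
$left = ['cache1', 'cache2'];
$ringAll  = buildRing($all);
$ringLeft = buildRing($left);
$movedNaive = $movedRing = $onCache3 = 0;
for ($i = 0; $i < 1000; $i++) {
    $key = 'user_' . $i;
    $movedNaive += (int) (naiveServer($key, $all) !== naiveServer($key, $left));
    $movedRing  += (int) (ringServer($key, $ringAll) !== ringServer($key, $ringLeft));
    $onCache3   += (int) (ringServer($key, $ringAll) === 'cache3');
}
printf("naive: %d keys moved, ring: %d keys moved (%d lived on cache3)\n",
    $movedNaive, $movedRing, $onCache3);
```

On the ring, the only keys that move are the ones that lived on the lost server. With the naive modulo approach, roughly two thirds of all keys end up on a different server when going from three servers to two (assuming evenly spread hashes), which is the cold cache the slide warns about.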

Page 62: Introduction to memcached

• available stats from servers include:

• uptime, #calls (get/set/...), #hits (since uptime), #misses (since uptime)

• no enumeration, no distinguishing on types of caches

• add own logging / statistics to monitor effectiveness of your caching strategy

Memcached Statistics
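
Since the server statistics cannot tell different kinds of cache entries apart, the last bullet suggests adding your own bookkeeping. One possible shape is a thin wrapper that counts hits and misses per type of key. The CountingCache class below is hypothetical; it uses an array as storage, and treats the part of the key before the first underscore as the "type".

```php
<?php
// Hypothetical wrapper that counts hits and misses per cache "type"
// (the part of the key before the first underscore).
final class CountingCache
{
    private array $store = []; // in-memory stand-in for memcached
    private array $stats = []; // type => ['hits' => n, 'misses' => n]

    public function set(string $key, $value): void
    {
        $this->store[$key] = $value;
    }

    public function get(string $key)
    {
        $type = explode('_', $key, 2)[0];
        $this->stats[$type] = $this->stats[$type] ?? ['hits' => 0, 'misses' => 0];
        $hit = array_key_exists($key, $this->store);
        $this->stats[$type][$hit ? 'hits' : 'misses']++;
        return $hit ? $this->store[$key] : false;
    }

    public function stats(): array
    {
        return $this->stats;
    }
}

$cache = new CountingCache();
$cache->set('user_1', ['nickname' => 'an']);
$cache->get('user_1'); // hit
$cache->get('user_2'); // miss
$cache->get('blog_9'); // miss
print_r($cache->stats());
```

Per-type numbers like these show which caches earn their keep and which have a hit ratio too low to be worth the overhead.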

Page 63: Introduction to memcached

• Be careful when security matters. (Remember ‘no authentication’?)

• Authentication for memcached via the SASL Auth Protocol is being worked on

• Caching is not an excuse not to do database tuning. (Remember cold cache?)

• Make sure to write unit tests for your caching classes and places where you use it. (Debugging problems related to out-of-date cache data is hard and boring. Very boring.)

More tips ...

Page 64: Introduction to memcached

• Zend framework has Zend_Cache with support for a memcached backend

• Wordpress has 3 plugins for working with memcached

• all of the other major frameworks have some sort of support (built in or via plugins): Symfony, Django, CakePHP, Drupal, ...

• Gear6: memcached servers in the cloud

Libraries for memcached

Page 65: Introduction to memcached

• memcachedb (persistent memcached)

• opcode caching

• APC (php compiled code cache, usable for other purposes too)

• xCache

• eAccelerator

• Zend optimizer

memcached isn’t the only caching solution

Page 66: Introduction to memcached

• main bottleneck in php backends is database

• adding php servers is easier than scaling databases

• a complete caching layer before your database layer solves a lot of performance and scalability issues

• but being able to scale takes more than memcached

• performance tuning, beginning with identifying the slowest and most used parts stays important, be it tuning of your php code, memcached calls or database queries

Last thought

Page 67: Introduction to memcached

FOR DEVELOPERS

Page 68: Introduction to memcached

YOUR GAME

A TOP SOCIAL GAME

High-score Handling

Tournaments

Challenge builder

Achievements

Got an idea for a game? Great!

Page 69: Introduction to memcached

Gatcha For Game Developers

Game tracking
Start game and end game calls result in accurate gameplay tracking and allow us to show who is playing the game at any given moment, compute popularity, and target games.

High-scores
You push your high-score to our API, we do the hard work of creating different types of leader boards and rankings.

Achievements
Pushing achievements reached in your game just takes one API call, no configuration needed.

Page 70: Introduction to memcached

Gatcha For Game Developers

Multiplayer Games
We run SmartFox servers that enable you to build real-time multiplayer games, with e.g. in-game chat

coming:

Challenges & Tournaments
Allow your game players to challenge each other, or build challenges & contests yourself.

Page 71: Introduction to memcached

Gatcha For Game Developers

How to integrate?

Flash Games
We offer a wrapper for AS3 and AS2 games with full implementation of our API

Unity3D Games

OpenSocial Games
Talk to the supported containers via the Gatcha OpenSocial Extension

Other Games
Simple iframe implementation. PHP Client API available for the Gatcha API

Start developing in our sandbox.

Page 75: Introduction to memcached

Job openings

Weʼre searching for great developers!

PHP Talents
Working on integrations and the gaming platform

Flash Developers
Working on Flash Games and the gaming platform

Design Artists
Designing games and integrations

Page 77: Introduction to memcached

Resources, among others:

• memcached & apc: http://www.slideshare.net/benramsey/caching-with-memcached-and-apc

• speed comparison: http://dealnews.com/developers/memcachedv2.html

• php client comparison: http://code.google.com/p/memcached/wiki/PHPClientComparison

• cakephp-memcached: http://teknoid.wordpress.com/2009/06/17/send-your-database-on-vacation-by-using-cakephp-memcached/

• caching basics: http://www.slideshare.net/soplakanets/caching-basics

• caching w php: http://www.slideshare.net/JustinCarmony/effectice-caching-w-php-caching