Post on 06-Jul-2020

Caching and tuning fun for high scalability

Wim Godden
Cu.be Solutions

Who am I ?
Wim Godden (@wimgtr)
Owner of Cu.be Solutions (http://cu.be)
Open Source developer since 1997
Developer of OpenX
Zend Certified Engineer
Zend Framework Certified Engineer
MySQL Certified Developer

Who are you ?
Developers ?
System/network engineers ?
Managers ?

Caching experience ?

Goals of this tutorial
Everything about caching and tuning
A few techniques

How-to

How-NOT-to

→ Increase reliability, performance and scalability

5 visitors/day → 5 million visitors/day

(Don't expect miracle cure !)

LAMP

Architecture

Test page
3 DB-queries

select firstname, lastname, email from user where user_id = 5;

select title, createddate, body from article order by createddate desc limit 5;

select title, createddate, body from article order by score desc limit 5;

Page just outputs result

Our base benchmark
Apachebench = useful enough
Result ?

                        Single webserver      Proxy
                        Static     PHP        Static     PHP
Apache + PHP            3900       17.5       6700       17.5

Limit : CPU, network or disk

Limit : database

Caching

What is caching ?

x = 5, y = 2
n = 50
Same result

CACHE

select *
from article
join user on article.user_id = user.id
order by created desc
limit 10

Doesn't change all the time

Theory of caching

Cache miss :
GET /page
$data = get('key') → false
if ($data == false)
    select data from table (DB)
    $data = returned result
    set('key', $data)
Page

Cache hit :
GET /page
$data = get('key') → HIT
Page

Caching techniques
#1 : Store entire pages
#2 : Store part of a page (block)
#3 : Store data retrieval (SQL ?)
#4 : Store complex processing result
#? : Your call !

When you have data, think :
Creating time ?
Modification frequency ?
Retrieval frequency ?

How to find cacheable data
New projects : start from 'cache everything'
Existing projects :

Look at MySQL slow query log

Make a complete query log (don't forget to turn it off !)

Check page loading times

Caching storage - Disk
Data with few updates : good
Caching SQL queries : preferably not

DON'T use NFS or other network file systems, especially for sessions

locking issues !

high latency

Caching storage - Disk / ramdisk
Local

5 Webservers → 5 local caches

How will you keep them synchronized ?
→ Don't say NFS or rsync !

Caching storage - Memcache(d)
Facebook, Twitter, YouTube, … → need we say more ?
Distributed memory caching system
Multiple machines ↔ 1 big memory-based hash-table
Key-value storage system

Keys - max. 250 bytes

Values - max. 1 MByte

Extremely fast... non-blocking, UDP (!)

Memcache - where to install

Memcache - installation & running it

Installation
Distribution package
PECL
Windows : binaries

Running
No config-files
memcached -d -m <mem> -l <ip> -p <port>
ex. : memcached -d -m 2048 -l 172.16.1.91 -p 11211

Caching storage - Memcache - some notes
Not fault-tolerant

It's a cache !

Lose session data

Lose shopping cart data

Firewall your Memcache port !

Memcache in code

<?php
$memcache = new Memcache();
$memcache->addServer('172.16.0.1', 11211);
$memcache->addServer('172.16.0.2', 11211);

$myData = $memcache->get('myKey');
if ($myData === false) {
    $myData = GetMyDataFromDB();
    // Put it in Memcache as 'myKey', without compression, with no expiration
    $memcache->set('myKey', $myData, false, 0);
}
echo $myData;

Benchmark with Memcache

                        Single webserver      Proxy
                        Static     PHP        Static     PHP
Apache + PHP            3900       17.5       6700       17.5
Apache + PHP + MC       3900       55         6700       108

Memcache slabs
(or why Memcache says it's full when it's not)

Multiple slabs of different sizes :
Slab 1 : 400 bytes
Slab 2 : 480 bytes (400 * 1.2)
Slab 3 : 576 bytes (480 * 1.2)
(and so on...)

Multiplier (1.2 here) can be configured
Store a lot of very large objects
→ Large slabs : full
→ Rest : free
→ Eviction of data !
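The slab arithmetic above can be sketched in a few lines. This is a simplified model only: the function names are made up for illustration, and a real memcached also accounts for item headers and alignment, so actual chunk sizes will differ from these numbers.

```php
<?php
// Simplified model of memcached slab classes: each class holds chunks
// that are $factor times larger than those of the previous class.
function slabChunkSizes(int $firstSize, float $factor, int $count): array
{
    $sizes = [];
    $size = (float) $firstSize;
    for ($i = 0; $i < $count; $i++) {
        $sizes[] = (int) round($size);
        $size *= $factor;
    }
    return $sizes;
}

// Return the smallest chunk size that can hold the item, or null if none fits.
function chunkSizeFor(int $itemBytes, array $sizes): ?int
{
    foreach ($sizes as $size) {
        if ($itemBytes <= $size) {
            return $size;
        }
    }
    return null;
}

$sizes = slabChunkSizes(400, 1.2, 3);
print_r($sizes);

// A 450-byte item does not fit in a 400-byte chunk, so it lands in the 480-byte slab.
echo chunkSizeFor(450, $sizes), "\n";
```

Because every item is stored in the smallest chunk that fits, a workload of mostly large objects fills the large slabs and starts evicting there, even while the small slabs still have free memory.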

Memcache - Is it working ?
Connect to it using telnet
"stats" command →
Use Cacti or other monitoring tools

STAT pid 2941
STAT uptime 10878
STAT time 1296074240
STAT version 1.4.5
STAT pointer_size 64
STAT rusage_user 20.089945
STAT rusage_system 58.499106
STAT curr_connections 16
STAT total_connections 276950
STAT connection_structures 96
STAT cmd_get 276931
STAT cmd_set 584148
STAT cmd_flush 0
STAT get_hits 211106
STAT get_misses 65825
STAT delete_misses 101
STAT delete_hits 276829
STAT incr_misses 0
STAT incr_hits 0
STAT decr_misses 0
STAT decr_hits 0
STAT cas_misses 0
STAT cas_hits 0
STAT cas_badval 0
STAT auth_cmds 0
STAT auth_errors 0
STAT bytes_read 613193860
STAT bytes_written 553991373
STAT limit_maxbytes 268435456
STAT accepting_conns 1
STAT listen_disabled_num 0
STAT threads 4
STAT conn_yields 0
STAT bytes 20418140
STAT curr_items 65826
STAT total_items 553856
STAT evictions 0
STAT reclaimed 0
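The two numbers most worth watching in that output are the hit ratio and the eviction count. A minimal sketch of how they could be derived from the raw "stats" text follows; the function names are hypothetical, and the sample input reuses the get_hits, get_misses and evictions values shown above.

```php
<?php
// Parse the text returned by memcached's "stats" command into name => value.
function parseStats(string $raw): array
{
    $stats = [];
    foreach (preg_split('/\R/', trim($raw)) as $line) {
        // Each line looks like: STAT <name> <value>
        if (preg_match('/^STAT (\S+) (.+)$/', trim($line), $m)) {
            $stats[$m[1]] = $m[2];
        }
    }
    return $stats;
}

// Fraction of get requests that were answered from the cache (0.0 to 1.0).
function hitRatio(array $stats): float
{
    $hits = (int) ($stats['get_hits'] ?? 0);
    $misses = (int) ($stats['get_misses'] ?? 0);
    $total = $hits + $misses;
    return $total > 0 ? $hits / $total : 0.0;
}

$raw = "STAT get_hits 211106\nSTAT get_misses 65825\nSTAT evictions 0";
$stats = parseStats($raw);
printf("hit ratio: %.1f%%, evictions: %d\n", hitRatio($stats) * 100, $stats['evictions']);
```

A falling hit ratio or a rising eviction count is usually the first sign that the cache is too small or that keys are being invalidated too aggressively.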

Memcache - backing up

Memcache - tip
Page with multiple blocks ?
→ use Memcached::getMulti()

But : what if you get some hits and some misses ?

getMulti($array) → Hashing algorithm
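One possible answer to the "some hits and some misses" question is to rebuild only the missing blocks and write them back. The sketch below assumes that approach. ArrayCache is a stand-in written for this example so the code runs without a server, and the block keys and builder functions are hypothetical. With the real extension you would pass a Memcached instance instead, whose getMulti() likewise returns only the keys it found.

```php
<?php
// Stand-in for a Memcached client so the sketch runs without a server.
// It mimics only getMulti() and set().
class ArrayCache
{
    private $items = [];

    public function getMulti(array $keys)
    {
        // Like Memcached::getMulti(), return only the keys that were found.
        return array_intersect_key($this->items, array_flip($keys));
    }

    public function set($key, $value, $expiration = 0)
    {
        $this->items[$key] = $value;
        return true;
    }
}

// Fetch all blocks in one round trip, then rebuild only the missing ones.
// $builders maps each cache key to a function that regenerates that block.
function getBlocks($cache, array $builders, int $ttl = 300): array
{
    $found = $cache->getMulti(array_keys($builders));
    if ($found === false) {
        $found = []; // treat a failed lookup as "everything missed"
    }
    foreach ($builders as $key => $build) {
        if (!array_key_exists($key, $found)) {
            $found[$key] = $build();
            $cache->set($key, $found[$key], $ttl);
        }
    }
    return $found;
}

$cache = new ArrayCache();
$cache->set('block_nav', '<nav>...</nav>');

$blocks = getBlocks($cache, [
    'block_nav'  => function () { return 'rebuilt nav'; },
    'block_news' => function () { return '<ul>latest news</ul>'; },
]);
```

Here block_nav comes straight from the cache, while block_news is built once and stored for the next request.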

Updating data

LCD_Popular_Product_List

Adding/updating data

$memcache->delete('LCD_Popular_Product_List');

Adding/updating data - Why it crashed

Cache stampeding

Memcache code ?

DB

Visitor interface Admin interface

Memcache code
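The alternative to deleting the key is to let the admin side recompute the value and overwrite it, so visitors keep reading a populated key. The following is a sketch of that idea under stated assumptions: ArrayCache is an in-memory stand-in for a Memcache client, and the product data and function names are invented for illustration.

```php
<?php
// Stand-in for a Memcache client so the sketch runs without a server.
// Only get() and set() are mimicked; get() returns false on a miss.
class ArrayCache
{
    private $items = [];

    public function get($key)
    {
        return array_key_exists($key, $this->items) ? $this->items[$key] : false;
    }

    public function set($key, $value)
    {
        $this->items[$key] = $value;
        return true;
    }
}

// Hypothetical "expensive query": names of the two best-selling products.
function buildPopularProductList(array $products): array
{
    usort($products, function ($a, $b) {
        return $b['sales'] <=> $a['sales'];
    });
    return array_column(array_slice($products, 0, 2), 'name');
}

// Admin side: after a change, rebuild the list and overwrite the cached copy.
// The key is overwritten rather than deleted, so readers do not hit a miss here.
function pushPopularProductList($cache, array $products): void
{
    $cache->set('LCD_Popular_Product_List', buildPopularProductList($products));
}

$cache = new ArrayCache();
$products = [
    ['name' => 'LCD 22"', 'sales' => 120],
    ['name' => 'LCD 24"', 'sales' => 340],
    ['name' => 'LCD 27"', 'sales' => 90],
];
pushPopularProductList($cache, $products);

// Visitor side: a plain read of the already-populated key.
print_r($cache->get('LCD_Popular_Product_List'));
```

The expensive query then runs once per update in the admin interface, instead of once per concurrent visitor after a delete. The key can still disappear through eviction or a restart, which is where a warmup script comes in.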

Cache warmup scripts
Used to fill your cache when it's empty
Run it before starting Webserver !

2 ways :
Visit all URLs
    Error-prone
    Hard to maintain
Call all cache-updating methods

Make sure you have a warmup script !
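A warmup script of the second kind (calling all cache-updating methods) could look roughly like this. The registry of builders and the key names are assumptions made for the example. The cache is a plain array here so the sketch runs without a server, and in a real setup $store would wrap a call such as $memcache->set().

```php
<?php
// Run every cache-updating function once and store its result.
// $store is any callable that writes one key; $builders maps cache key => builder.
function warmUp(callable $store, array $builders): array
{
    $failed = [];
    foreach ($builders as $key => $build) {
        try {
            $store($key, $build());
        } catch (Throwable $e) {
            // Keep going: one broken builder should not leave the rest cold.
            $failed[$key] = $e->getMessage();
        }
    }
    return $failed;
}

// In-memory stand-in for the real cache.
$cache = [];
$store = function ($key, $value) use (&$cache) {
    $cache[$key] = $value;
};

// Hypothetical builders; a real project would list its own cache-updating methods.
$failed = warmUp($store, [
    'latest_articles' => function () { return ['article 1', 'article 2']; },
    'top_articles'    => function () { return ['article 9']; },
    'broken_block'    => function () { throw new RuntimeException('DB down'); },
]);

echo count($cache), " keys warmed, ", count($failed), " failed\n";
```

Returning the failed keys lets a deploy script decide whether it is safe to start the webserver.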

Cache stampeding - what about locking ?
Seems like a nice idea, but...
While lock in place
What if the process that created the lock fails ?

LAMP...

→ LAMMP

→ LNMMP

Nginx
Web server
Reverse proxy
Lightweight, fast
12.2% of all Websites

Nginx
No threads, event-driven
Uses epoll / kqueue
Low memory footprint
10000 active connections = normal

Nginx - Configuration

server {
    listen 80;
    server_name www.domain.ext *.domain.ext;
    index index.html;
    root /home/domain.ext/www;
}

server {
    listen 80;
    server_name photo.domain.ext;
    index index.html;
    root /home/domain.ext/photo;
}

Nginx with PHP-FPM
Since PHP 5.3.3
Runs on port 9000
Nginx connects using fastcgi method

location / {
    fastcgi_pass 127.0.0.1:9000;
    fastcgi_index index.php;
    include fastcgi_params;
    fastcgi_param SCRIPT_NAME $fastcgi_script_name;
    fastcgi_param SCRIPT_FILENAME /home/www.4developers.pl/$fastcgi_script_name;
    fastcgi_param SERVER_NAME $host;
    fastcgi_intercept_errors on;
}

Nginx + PHP-FPM features
Graceful upgrade
Spawn new processes under high load
Chroot
Slow request log !
fastcgi_finish_request() → offline processing
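As an illustration of the offline processing point: fastcgi_finish_request() sends the response to the visitor and lets the script carry on with work the visitor does not need to wait for. The wrapper function and the "slow task" below are hypothetical. The function_exists() guard is there because fastcgi_finish_request() is only available under PHP-FPM, so the sketch also runs from the command line.

```php
<?php
// Send the response first, then continue with slow work in the same request.
function respondThenProcess(string $body, callable $slowTask): void
{
    echo $body;

    // Only defined under PHP-FPM; flushes the output and finishes the request
    // for the client while the script keeps running.
    if (function_exists('fastcgi_finish_request')) {
        fastcgi_finish_request();
    }

    $slowTask();
}

$log = [];
respondThenProcess("Thanks, your order was received.\n", function () use (&$log) {
    // Hypothetical slow task, e.g. sending mail or updating statistics.
    $log[] = 'confirmation mail queued';
});
```

The PHP-FPM worker stays busy until the slow task finishes, so this suits short follow-up work rather than long-running jobs.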

Nginx + PHP-FPM - performance ?

                        Single webserver      Proxy
                        Static     PHP        Static     PHP
Apache + PHP            3900       17.5       6700       17.5
Apache + PHP + MC       3900       55         6700       108
Nginx + PHP-FPM + MC    11700      57         11200      112

Limit : single-threaded Apachebench

Reverse proxy time...

Varnish
Not just a load balancer
Reverse proxy cache / http accelerator / …
Caches (parts of) pages in memory
Careful :

uses threads (like Apache)

Nginx usually scales better (but doesn't have VCL)

Varnish - backends + load balancing

backend server1 {
    .host = "192.168.0.10";
}

backend server2 {
    .host = "192.168.0.11";
}

director example_director round-robin {
    {
        .backend = server1;
    }
    {
        .backend = server2;
    }
}

Varnish - VCL
Varnish Configuration Language
DSL (Domain Specific Language)

→ compiled to C

Hooks into each request
Defines :

Backends (web servers)

ACLs

Load balancing strategy

Can be reloaded while running

Varnish - whatever you want
Real-time statistics (varnishtop, varnishhist, ...)
ESI

Varnish - ESI
Perfect for caching pages

Article content page :
Header (TTL : 60 min) /top
Navigation (TTL : 60 min) /nav
Latest news (TTL : 2 min) /news
Article content (TTL : 15 min) /article/732

In your article page output :
<esi:include src="/top"/>
<esi:include src="/nav"/>
<esi:include src="/news"/>
<esi:include src="/article/732"/>

In your Varnish config :
sub vcl_fetch {
    if (req.url == "/news") {
        esi; /* Do ESI processing */
        set obj.ttl = 2m;
    } elseif (req.url == "/nav") {
        esi;
        set obj.ttl = 60m;
    } elseif ….
    ….
}

Varnish with ESI - hold on tight !

                        Single webserver      Proxy
                        Static     PHP        Static     PHP
Apache + PHP            3900       17.5       6700       17.5
Apache + PHP + MC       3900       55         6700       108
Nginx + PHP-FPM + MC    11700      57         11200      112
Varnish                 -          -          11200      4200

Varnish - what can/can't be cached ?

Can :
Static pages
Images, js, css
Pages or parts of pages that don't change often (ESI)

Can't :
POST requests

Very large files (it's not a file server !)

Requests with Set-Cookie

User-specific content

ESI → no caching on user-specific content ?

Logged in as : Wim Godden

5 messages

TTL = 5min
TTL = 1h

TTL = 0s ?

Coming soon...
Based on Nginx
Reduces load by 50 – 95%

Requires code changes !

Well-built project → few changes

Effect on webservers and database servers

What's the result ?

Figures
First customer :

No. of web servers : 18 → 4

No. of db servers : 6 → 2

Total : 24 → 6 (75% reduction !)

Second customer (already using Nginx + Memcache) :

No. of web servers : 72 → 8

No. of db servers : 15 → 4

Total : 87 → 12 (86% reduction !)

Availability
Stable at 2 customers
Still under heavy development
Beta : July 2012
Final : Sep 2012

PHP speed - some tips
Upgrade PHP - every minor release has 5-15% speed gain !
Use an opcode cache (APC, eAccelerator, XCache)

DB speed - some tips
Use same types for joins
i.e. don't join decimal with int
RAND() is evil !
count(*) is evil in InnoDB without a where clause !
Persistent connect is sort-of evil

Caching & Tuning @ frontend

http://www.websiteoptimization.com/speed/tweak/average-web-page/

Frontend tuning
1. You optimize backend
2. Frontend engineers mess up → havoc on backend
3. Don't forget : frontend sends requests to backend !

SO...

Care about frontend
Test frontend
Check what requests frontend sends to backend

Tuning frontend
Minimize requests

Combine CSS/JavaScript files

Use CSS Sprites

CSS Sprites

Tuning content - CSS sprites

11 images, 11 HTTP requests, 24 KByte

1 image, 1 HTTP request, 14 KByte

Tuning frontend
Minimize requests

Combine CSS/JavaScript files

Use CSS Sprites (horizontally if possible)

Put CSS at top
Put JavaScript at bottom
Max. no. of connections
Especially if JavaScript does Ajax (advertising-scripts, …) !

Avoid iFrames
Again : max. no. of connections

Don't scale images in HTML
Have a favicon.ico (don't 404 it !)

→ see my blog

What else can kill your site ?
Redirect loops
Multiple requests
More load on Webserver
More PHP to process

Additional latency for visitor

Try to avoid redirects anyway

→ In ZF : use $this->_forward instead of $this->_redirect

Watch your logs, but equally important...
Watch the logging process →
Logging = disk I/O → can kill your server !

Above all else... be prepared !
Have a monitoring system
Use a cache abstraction layer (disk → Memcache)
Don't install for the worst → prepare for the worst
Have a test-setup
Have fallbacks

→ Turn off non-critical functionality
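A cache abstraction layer can be as small as one interface that the application codes against, with the storage swapped behind it. The sketch below shows one way to do that. The interface name, the disk backend and the cachedGreeting() example are all assumptions made for illustration, and a Memcache-backed class would implement the same two methods.

```php
<?php
// The application only ever talks to this interface.
interface CacheBackend
{
    /** Returns the cached value, or false on a miss. */
    public function get(string $key);

    public function set(string $key, $value, int $ttl = 0): bool;
}

// Disk backend: one file per key, storing the expiry time next to the value.
final class FileCache implements CacheBackend
{
    private $dir;

    public function __construct(string $dir)
    {
        $this->dir = $dir;
    }

    private function path(string $key): string
    {
        return $this->dir . '/' . sha1($key) . '.cache';
    }

    public function get(string $key)
    {
        $path = $this->path($key);
        if (!is_file($path)) {
            return false;
        }
        $entry = unserialize(file_get_contents($path));
        if (!is_array($entry) || ($entry[0] !== 0 && $entry[0] < time())) {
            return false; // unreadable or expired
        }
        return $entry[1];
    }

    public function set(string $key, $value, int $ttl = 0): bool
    {
        $expires = $ttl > 0 ? time() + $ttl : 0; // 0 means "never expires"
        return file_put_contents($this->path($key), serialize([$expires, $value]), LOCK_EX) !== false;
    }
}

// Application code depends on the interface, not on the storage behind it.
function cachedGreeting(CacheBackend $cache, string $name): string
{
    $key = 'greeting_' . $name;
    $value = $cache->get($key);
    if ($value === false) {
        $value = 'Hello ' . $name; // stands in for an expensive computation
        $cache->set($key, $value, 60);
    }
    return $value;
}

$cache = new FileCache(sys_get_temp_dir());
echo cachedGreeting($cache, 'Wim'), "\n";
```

Moving from disk to Memcache then means writing one new class and changing the line that constructs $cache, with no changes to the calling code.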

So...
Cache

But : never delete, always push !

Have a warmup script

Monitor your cache

Have an abstraction layer

Apache = fine, Nginx = better
Static pages ? Use Varnish
Tune your frontend → impact on backend !

Questions ?

We're hiring !
Lots of challenges
Work with cutting-edge technology
Varied :
PHP development
System / network architecture
Scalability services
Build our own services
Work on Open Source

→ mail us : info@cu.be

Contact
Twitter @wimgtr
Web http://techblog.wimgodden.be
Slides http://www.slideshare.net/wimg
E-mail wim.godden@cu.be

Please...
Rate my talk : http://joind.in/6327

Thanks !
