June8 presentation

Varnish – A brief introduction Nicolas A. Bérard-Nault June 15, 2011

Transcript of June8 presentation

Page 1: June8 presentation

Varnish – A brief introduction

Nicolas A. Bérard-Nault, June 15, 2011

Page 2: June8 presentation

Regular page view

Page 3: June8 presentation

Reverse proxy cached page view

Page 4: June8 presentation

So what is Varnish?
- Reverse proxy cache
- Designed from the ground up to be an HTTP accelerator

We will cover
- Default configuration and options
- ESI
- HTTP headers
- Keezmovies.com
  - Benchmarks
  - Use case
  - Problems & solutions

Page 5: June8 presentation

Configuring Varnish

Varnish uses a configuration file that is compiled to C on the fly and loaded as a shared library. The configuration format is called VCL (Varnish Configuration Language), a domain-specific language reminiscent of Perl.

If VCL is not enough, you can extend the configuration with inline C and the VRT (Varnish Run Time) library.

For a full reference: http://www.varnish-cache.org/docs/2.1/tutorial/vcl.html

Page 6: June8 presentation

Step by step through the configuration
Back end definitions

backend www {
    .host = "www.example.com";
    .port = "http";
    .connect_timeout = 1s;
    .first_byte_timeout = 5s;
    .between_bytes_timeout = 2s;
    .probe = {
        .url = "/test.jpg";
        .timeout = 0.3 s;
        .window = 8;
        .threshold = 3;
    }
}

You can have as many backends as you want
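The .window and .threshold probe settings can be read as a sliding window: the backend counts as healthy when at least .threshold of the last .window probes succeeded. Below is a minimal Python sketch of that rule only. The class and method names are hypothetical, and it does not reproduce Varnish's actual probe implementation.

```python
from collections import deque


class ProbeWindow:
    """Tracks the most recent probe results for one backend (illustrative only)."""

    def __init__(self, window=8, threshold=3):
        self.threshold = threshold
        # Only the last `window` results are kept; older ones fall off.
        self.results = deque(maxlen=window)

    def record(self, ok):
        """Store the outcome of one probe (True = success)."""
        self.results.append(bool(ok))

    def healthy(self):
        """Healthy when at least `threshold` retained probes succeeded."""
        return sum(self.results) >= self.threshold


probe = ProbeWindow(window=8, threshold=3)
for ok in [True, True, False, True]:
    probe.record(ok)
print(probe.healthy())
```

With the values from the slide, three good probes out of the last eight are enough to keep the backend in rotation.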

Page 7: June8 presentation

Step by step through the configuration
Director definitions

director www_director random {
    { .backend = www1; .weight = 2; }
    { .backend = www2; .weight = 1; }
}

director www_director round-robin {
    { .backend = www1; }
    { .backend = www2; }
}

You can have as many directors as you want
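To make the difference between the two director types concrete, here is a small Python sketch of the selection policies. It assumes that "random" picks a backend with probability proportional to its weight and that "round-robin" simply cycles in order. The function names are hypothetical and health checks are ignored.

```python
import itertools
import random


def random_director(backends, rng=random):
    """Pick a backend name with probability proportional to its weight."""
    names = [name for name, _ in backends]
    weights = [weight for _, weight in backends]
    return rng.choices(names, weights=weights, k=1)[0]


def round_robin_director(backends):
    """Return an endless iterator that cycles through the backends in order."""
    return itertools.cycle(backends)


weighted = [("www1", 2), ("www2", 1)]   # mirrors the .weight values above
rr = round_robin_director(["www1", "www2"])
print([next(rr) for _ in range(4)])
print(random_director(weighted))
```

With weights 2 and 1, www1 should receive roughly two thirds of the requests over a long run.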

Page 8: June8 presentation

Highly simplified flow chart of Varnish operations

Page 9: June8 presentation

Step by step through the configuration
vcl_recv: a request is received

sub vcl_recv {
    if (req.restarts == 0) {
        if (req.http.x-forwarded-for) {
            set req.http.X-Forwarded-For =
                req.http.X-Forwarded-For + ", " + client.ip;
        } else {
            set req.http.X-Forwarded-For = client.ip;
        }
    }
    if (req.http.Authorization || req.http.Cookie) {
        /* Not cacheable by default */
        return (pass);
    }
    if (req.request != "GET" && req.request != "HEAD") {
        /* We only deal with GET and HEAD by default */
        return (pass);
    }
    return (lookup);
}
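The decision logic of the vcl_recv shown above can be restated in Python, which makes it easy to try a few requests by hand. This is only a sketch: the function signature is hypothetical, and header names are matched with exact casing here, whereas HTTP header names are case-insensitive.

```python
def vcl_recv(method, headers, client_ip, restarts=0):
    """Return (action, headers) following the vcl_recv logic shown above."""
    headers = dict(headers)  # work on a copy, leave the caller's dict alone
    if restarts == 0:
        # Append the client address to any existing X-Forwarded-For chain.
        prior = headers.get("X-Forwarded-For")
        headers["X-Forwarded-For"] = f"{prior}, {client_ip}" if prior else client_ip
    if "Authorization" in headers or "Cookie" in headers:
        return "pass", headers      # not cacheable by default
    if method not in ("GET", "HEAD"):
        return "pass", headers      # only GET and HEAD are looked up
    return "lookup", headers


print(vcl_recv("GET", {}, "10.0.0.1")[0])
print(vcl_recv("POST", {}, "10.0.0.1")[0])
```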

Page 10: June8 presentation

Be careful with your HTTP verbs…

Verb     Potency
GET      Nullipotent
POST     Non-idempotent
PUT      Idempotent
DELETE   Idempotent

But we always cheat…

Page 11: June8 presentation

vcl_hash

Page 12: June8 presentation

vcl_hash: create object hash for request

sub vcl_hash {
    hash_data(req.url);
    if (req.http.host) {
        hash_data(req.http.host);
    } else {
        hash_data(server.ip);
    }
    return (hash);
}
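As an illustration of what the vcl_hash above builds, the following Python sketch derives a cache key from the URL plus the Host header, falling back to the server IP when no Host is present. It uses SHA-256 from the standard library. The function name and the default IP are hypothetical, and the exact way Varnish feeds data into its digest is not reproduced.

```python
import hashlib


def cache_key(url, host=None, server_ip="127.0.0.1"):
    """Hash the URL, then the Host header (or the server IP as a fallback)."""
    digest = hashlib.sha256()
    digest.update(url.encode("utf-8"))
    digest.update((host if host else server_ip).encode("utf-8"))
    return digest.hexdigest()


# Same URL on two virtual hosts yields two distinct cache objects.
print(cache_key("/index.html", "a.example.com") != cache_key("/index.html", "b.example.com"))
```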

Page 13: June8 presentation

vcl_hit, vcl_miss

Page 14: June8 presentation

vcl_pass: request is not cacheable
vcl_miss: post-lookup, object does not exist in cache
vcl_hit: post-lookup, object exists in cache

sub vcl_pass {
    return (pass);
}

sub vcl_miss {
    return (fetch);
}

sub vcl_hit {
    return (deliver);
}

Page 15: June8 presentation

vcl_fetch

Page 16: June8 presentation

vcl_fetch: called after the object has been fetched from the back end

sub vcl_fetch {
    if (beresp.ttl <= 0s ||
        beresp.http.Set-Cookie ||
        beresp.http.Vary == "*") {
        set beresp.ttl = 120 s;
        return (hit_for_pass);
    }
    return (deliver);
}
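The vcl_fetch rule above boils down to three checks on the backend response. A minimal Python restatement, with a hypothetical function name and exact-case header lookups, could look like this:

```python
def vcl_fetch(ttl_seconds, headers):
    """Return (action, ttl) for a backend response, as in the vcl_fetch above."""
    uncacheable = (
        ttl_seconds <= 0
        or "Set-Cookie" in headers
        or headers.get("Vary") == "*"
    )
    if uncacheable:
        # Remember for 120 s that this object must be passed, not cached.
        return "hit_for_pass", 120.0
    return "deliver", ttl_seconds


print(vcl_fetch(900, {}))
print(vcl_fetch(900, {"Set-Cookie": "session=abc"}))
```

A response that sets a cookie is therefore never stored as a regular cache object, whatever its TTL.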

Page 17: June8 presentation

vcl_fetch

Page 18: June8 presentation

Step by step through the configuration
vcl_deliver: object is to be delivered to client

sub vcl_deliver {
    return (deliver);
}

Page 19: June8 presentation

ESI (edge-side include)

Invented by Akamai, only a subset is supported by Varnish

Varnish supports include:
<div>Hello:<esi:include src="/getname.php" /></div>

Will be processed into:
<div>Hello:Roger Cyr</div>
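The substitution can be mimicked in a few lines of Python. This is a toy sketch, not Varnish's ESI parser: the regular expression only handles the simple self-closing form shown above, and the fragment lookup table stands in for a real backend request.

```python
import re

# Matches <esi:include src="..." /> and captures the src attribute.
ESI_INCLUDE = re.compile(r'<esi:include\s+src="([^"]+)"\s*/>')


def process_esi(body, fetch):
    """Replace every ESI include tag with whatever fetch(src) returns."""
    return ESI_INCLUDE.sub(lambda match: fetch(match.group(1)), body)


# Hypothetical fragment store standing in for the backend.
fragments = {"/getname.php": "Roger Cyr"}
page = '<div>Hello:<esi:include src="/getname.php" /></div>'
print(process_esi(page, fragments.__getitem__))  # <div>Hello:Roger Cyr</div>
```

The page shell and the fragment can then be cached separately, each with its own TTL.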

Page 20: June8 presentation

ESI (edge-side include)

To enable ESI processing, use the esi keyword in vcl_fetch.

ESI and gzip

Varnish WILL NOT be able to do ESI processing on gzipped backend responses. It will also not be able to ungzip an ESI response.

In all cases, ESI and gzip are not a good mix. Better support is planned for Varnish 3.0.

Page 21: June8 presentation

HTTP headers

Varnish relies on HTTP headers to know what to cache and for how long.

This is done through the Cache-Control HTTP header.

Cache-Control: 30
Cache-Control: max-age=900
Cache-Control: no-cache
Cache-Control: must-revalidate

Read the HTTP RFC! http://tools.ietf.org/html/rfc2616#section-14.9
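To show how a lifetime can be read out of a Cache-Control header, here is a generic Python sketch. It is not Varnish's own logic: the function names are hypothetical, the 120-second fallback is an assumed default, and preferring s-maxage over max-age follows the RFC's rule for shared caches.

```python
def parse_cache_control(value):
    """Split a Cache-Control value into a {directive: argument-or-None} dict."""
    directives = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, _, arg = part.partition("=")
        directives[name.strip().lower()] = arg.strip().strip('"') or None
    return directives


def ttl_from_cache_control(value, default_ttl=120):
    """Return a TTL in seconds, preferring s-maxage over max-age."""
    directives = parse_cache_control(value)
    for name in ("s-maxage", "max-age"):
        arg = directives.get(name)
        if arg is not None and arg.isdigit():
            return int(arg)
    return default_ttl  # assumed fallback when no lifetime is given


print(ttl_from_cache_control("max-age=900"))
```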

Page 22: June8 presentation

keezmovies.com

Page 23: June8 presentation

keezmovies.com
- Average of 13 million hits per day (~150 queries per second)
- Homepage gets a large part of the hits (~35%, ~53 queries per second)
- Logged-in traffic is a very, very, very small minority

Perfect candidate for full page caching
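The per-second figures follow directly from the daily total. As a quick check of the arithmetic, assuming traffic is spread evenly over the day:

```python
SECONDS_PER_DAY = 24 * 60 * 60

hits_per_day = 13_000_000
homepage_share = 0.35

# Average request rate over a whole day, and the homepage's slice of it.
total_qps = hits_per_day / SECONDS_PER_DAY
homepage_qps = total_qps * homepage_share

print(f"{total_qps:.1f} qps overall, {homepage_qps:.1f} qps on the homepage")
```

Real traffic peaks well above the daily average, so these numbers are a lower bound on what the cache has to absorb at busy hours.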

Page 24: June8 presentation

Some results for KM

Tested four configurations:
1) Apache + PHP
2) Apache + PHP + APC
3) Lighttpd + PHP + APC
4) Varnish

- Homepage (size = 90k, gzipped = 10k)
- Tested using Apache Benchmark with increasing concurrency

Page 25: June8 presentation
Page 26: June8 presentation
Page 27: June8 presentation
Page 28: June8 presentation

But…
1) Content differs slightly for certain countries (notably, Germany)
2) Google Analytics cookies
3) And of course, not all GET requests are nullipotent

The good news is that two of these three problems are easy to tackle!

Page 29: June8 presentation

Problem #1: Geolocation

Essentially, each page has 2 versions:
1) German visitor & disclaimer not accepted
2) Rest of the world & German visitor who accepted the disclaimer

C{
__attribute__((constructor)) void
load_module()
{
    /* … */
    handle = dlopen("/usr/lib/varnish/geoip.so", RTLD_NOW);
    if (handle != NULL) {
        get_country_code = dlsym(handle, "get_country_code");
    }
}
}C

Page 30: June8 presentation

The following code is added to vcl_recv

sub vcl_recv {
    C{
        char *cc = (*get_country_code)(VRT_IP_string(sp, VRT_r_client_ip(sp)));
        VRT_SetHdr(sp, HDR_REQ, "\017X-Country-Code:", cc, vrt_magic_string_end);
    }C

    if (req.http.Cookie ~ "age_verified.*") {
        set req.http.X-Age-Verified = "1";
    } else {
        set req.http.X-Age-Verified = "0";
    }
}

The PHP page is responsible for setting the age_verified cookie once the disclaimer is accepted.

Page 31: June8 presentation

The following code is added to vcl_hash

sub vcl_hash {
    if (req.http.x-country-code == "DE" && req.http.x-age-verified == "0") {
        set req.hash += req.http.x-age-verified;
        set req.hash += req.http.x-country-code;
    }
}

You can download the Varnish GeoIP library here: http://www.varnish-cache.org/trac/wiki/GeoipUsingInlineC

It uses the Maxmind GeoIP library.
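Taken together, the vcl_recv and vcl_hash additions for Problem #1 split the cache into exactly two variants per page. A Python sketch of that classification, using a hypothetical function name and variant labels:

```python
def page_variant(country_code, cookie_header):
    """Return which of the two cached page versions a request should get."""
    # Same test as the VCL regex match: the cookie name appears anywhere.
    age_verified = "1" if "age_verified" in (cookie_header or "") else "0"
    if country_code == "DE" and age_verified == "0":
        return "DE-unverified"   # German visitor, disclaimer not accepted
    return "default"             # everyone else shares one cached copy


print(page_variant("DE", None))
print(page_variant("DE", "age_verified=1"))
print(page_variant("CA", None))
```

Only the German, not-yet-verified case gets extra data in its hash, so the rest of the world keeps sharing a single cache object.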

Page 32: June8 presentation

Problem #2: Google Analytics cookie

sub vcl_recv {
    if (req.http.Cookie) {
        if (req.http.Cookie ~ "user_cookie.*") {
            return (pass);
        }
        remove req.http.Cookie;
    }
}

This removes all cookies except the ones we know to be useful.
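The same cookie policy can be written in Python to see its effect on a request. This is a sketch with hypothetical names: the "continue" action simply means processing carries on to the rest of vcl_recv, and the analytics cookie in the usage lines is only an example value.

```python
import re


def filter_cookies(headers, keep_pattern=r"user_cookie"):
    """Pass requests carrying a known-useful cookie; strip cookies otherwise."""
    headers = dict(headers)  # work on a copy
    cookie = headers.get("Cookie")
    if cookie:
        if re.search(keep_pattern, cookie):
            return "pass", headers   # logged-in style traffic bypasses the cache
        del headers["Cookie"]        # analytics-only cookies are dropped
    return "continue", headers


print(filter_cookies({"Cookie": "__utma=123.456"}))
print(filter_cookies({"Cookie": "user_cookie=abc; __utma=123.456"})[0])
```

Once the header is gone, the default vcl_recv no longer sees a Cookie and the request can be looked up in the cache.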

Page 33: June8 presentation

Problem #3: GET requests with side effects

JSON UDP packets

Page 34: June8 presentation

Stats server
- Node.js server, communicating with the database directly (could be communicating with the website through an API)
- Does batch queries
- Can handle and aggregate requests from many Varnish servers at the same time
- Bonus: can be used for many, many, many other things…

Core: http://github.com/nicobn/AlysObserver
Varnish module: http://github.com/nicobn/AlysVarnish
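The batching idea can be sketched without any networking: each side effect becomes one small JSON payload, and the stats server folds a batch of payloads into counters before touching the database. The snippet below is a Python illustration only. The real server is written for Node.js, and the field names "id" and "event" are hypothetical.

```python
import json
from collections import Counter


def encode_hit(video_id, event="view"):
    """Serialise one event as the payload of a single UDP datagram."""
    return json.dumps({"id": video_id, "event": event}).encode("utf-8")


def aggregate(datagrams):
    """Fold a batch of payloads into per-(id, event) counts for one DB write."""
    counts = Counter()
    for payload in datagrams:
        record = json.loads(payload.decode("utf-8"))
        counts[(record["id"], record["event"])] += 1
    return counts


batch = [encode_hit(42), encode_hit(42), encode_hit(7)]
print(aggregate(batch))
```

Because UDP is fire-and-forget, the cache never waits on the stats server, at the cost of occasionally losing a datagram.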

Page 35: June8 presentation

Side note: Your TTL is too high
- KeezMovies: 53 qps on home page
- Rapidly decreasing marginal utility

[Figure: % of saved requests in an hour as a function of TTL. X-axis: TTL (s), from 0.1 to 3600. Y-axis: fraction of saved requests, from 0 to 1.]

Dr. Strangelove or how I learned to stop worrying and love low TTLs
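A back-of-the-envelope model shows why low TTLs already capture most of the benefit. Assume requests arrive at a steady rate and that each TTL window costs exactly one backend fetch. The fraction of requests saved is then 1 - 1/(qps × TTL). This simplification is an assumption made here, not necessarily the model behind the original chart.

```python
def saved_fraction(ttl_seconds, qps=53.0):
    """Fraction of requests served from cache under a steady-rate model."""
    requests_per_ttl = qps * ttl_seconds
    if requests_per_ttl <= 1:
        return 0.0  # at most one request per window: nothing to save
    # One request per TTL window goes to the backend, the rest are hits.
    return 1.0 - 1.0 / requests_per_ttl


for ttl in (0.1, 1, 10, 60, 3600):
    print(f"TTL {ttl:>6} s -> {saved_fraction(ttl):.4f}")
```

Under this model, at 53 qps a one-second TTL already removes about 98% of backend requests, and raising the TTL further can only recover the remaining 2%.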

Page 36: June8 presentation

Questions ?