Varnish Cache Plus: Random notes for wise web developers

Carlos Abalde, Roberto Moreda {cabalde, moreda}@allenta.com October 2014

Agenda

1. Introduction

2. Varnish 101

3. Invalidations

4. HTTP headers

5. Content composition

6. VAC

7. VCS

8. Device detection

9. Varnish Plus 4.x

10. Q&A

1. Introduction

Disclaimer

๏ General understanding of ‘The Varnish Book’ is assumed

‣ This is not the official Varnish Cache training

‣ This is not a Varnish Cache internals course

‣ This is not a Varnish module development course

‣ This is a collection of random notes for web developers willing to make the most of Varnish Cache Plus

๏ OSS Varnish Cache vs. Varnish Cache Plus

‣ 3.x vs. 4.x

Varnish Cache 3.x

๏ The Varnish Book

‣ https://www.varnish-software.com/static/book/

๏ The Varnish Reference Manual

‣ https://www.varnish-cache.org/docs/.../index.html

๏ Default VCL

‣ https://www.varnish-cache.org/trac/.../default.vcl

What everybody should know

Varnish Cache Plus 3.x

๏ Support, advice & training

๏ Varnish Enhanced Cache Invalidation

‣ Hash Two, Hash Ninja…

๏ Varnish Administration Console (VAC)

๏ Varnish Custom Statistics (VCS)

๏ Device detection

Components I

Varnish Cache Plus 3.x

๏ Varnish Tuner

๏ Enhanced HTTP streaming

๏ Packaged binary VMODs

๏ Varnish Paywall

๏ … and more to come shortly!

Components II

Varnish Cache Plus 3.x

๏ 64 bits

๏ Distributions

‣ RedHat Enterprise Linux 5 & 6

‣ Ubuntu Linux 12.04 LTS (precise)

‣ Ubuntu Linux 14.04 LTS (trusty)

‣ Debian Linux 7 (wheezy)

Supported platforms

2. Varnish 101

Caching policy

๏ Varnish Cache Plus would require zero configuration in a perfect world with perfect HTTP citizens

‣ Correct HTTP caching headers

‣ Vary HTTP header used wisely

‣ HTTP cookies used conservatively

๏ By default Varnish Cache Plus will not cache anything marked as private, carrying a cookie, or whose Vary HTTP header is '*'
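
As a rough illustration of the default behavior above, here is a minimal sketch in Varnish 3.x syntax. The second check is modeled on the stock default.vcl; the first rule, for responses marked as private, is an assumption about how that part of the policy could be written explicitly, and the 120s hit-for-pass lifetime is just an illustrative value.

```vcl
sub vcl_fetch {
    # Assumption: make the "private" rule explicit instead of relying on defaults.
    if (beresp.http.Cache-Control ~ "private") {
        set beresp.ttl = 120s;
        return (hit_for_pass);
    }

    # Modeled on the stock default.vcl: uncacheable TTL, cookies or Vary: *.
    if (beresp.ttl <= 0s ||
        beresp.http.Set-Cookie ||
        beresp.http.Vary == "*") {
        # Remember for 120s that this object must be passed.
        set beresp.ttl = 120s;
        return (hit_for_pass);
    }
}
```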

VCL

๏ Varnish Configuration Language

‣ Domain specific state engine

‣ No loops, variables, functions…

‣ Command line configuration & Tunable parameters

๏ Translated to C code

๏ Loaded as a dynamically generated shared library

‣ Zero downtime & Blazingly fast

Overview

VCL

๏ Normalize client-input

๏ Pick a backend / director

๏ Re-write / extend client-input

๏ Decide caching policy based on client-input

๏ Access control

๏ Security barriers

vcl_recv I

VCL

vcl_recv II

sub vcl_recv {
    # Backend selection & URL normalization.
    if (req.http.host ~ "^blogs\.") {
        set req.backend = blogs;
        set req.http.host = regsub(req.http.host, "^blogs\.", "");
        set req.url = regsub(req.url, "^", "/blogs");
    } else {
        set req.backend = default;
    }

    # Poor man's device detection.
    if (req.http.User-Agent ~ "(iPad|iPhone|Android)") {
        set req.http.X-Device = "mobile";
    } else {
        set req.http.X-Device = "desktop";
    }
}

VCL

๏ Sanitize / extend backend response

๏ Override cache duration

‣ beresp.ttl  

- s-maxage & max-age in Cache-Control HTTP header

- Expires HTTP header

- Default TTL

‣ Beware of the TTL of hitpass objects

vcl_fetch I

VCL

vcl_fetch II

sub vcl_fetch {
    # Override caching TTL (durations need a unit, hence 0s).
    if (beresp.http.Cache-Control !~ "s-maxage") {
        set beresp.ttl = 0s;
        if (bereq.url ~ "\.jpg(\?|$)") {
            set beresp.ttl = 30s;
        }
    }

    # Never cache a Set-Cookie header.
    if (beresp.ttl > 0s) {
        unset beresp.http.Set-Cookie;
    }

    # Create ban-lurker friendly objects.
    set beresp.http.X-Url = bereq.url;
}

VCL

Request flow I

VCL

Request flow II

Process architecture

VMODs

๏ Shared libraries extending the VCL core

‣ std VMOD

- std.toupper(), std.log(), std.fileread()…

‣ ABI (Application Binary Interface) mismatches

๏ cookie, header, var, curl, digest, geoip, boltsort, memcached, redis, dns…

๏ https://www.varnish-cache.org/vmods

Backends

๏ Multiple backends

‣ Selected at request time based on any request property

๏ Probes

‣ Per-backend periodic health checks

- Interval, timeout, expected response…

๏ Directors

‣ Load balanced backend groups
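
To make the backend, probe and director concepts above concrete, here is a minimal sketch in Varnish 3.x syntax. The backend names, addresses (taken from the documentation range 192.0.2.0/24), probe URL and timings are all hypothetical.

```vcl
# Periodic health check shared by both backends (values are illustrative).
probe healthcheck {
    .url = "/";
    .interval = 5s;
    .timeout = 1s;
    .window = 5;
    .threshold = 3;
    .expected_response = 200;
}

backend web1 {
    .host = "192.0.2.10";
    .port = "80";
    .probe = healthcheck;
}

backend web2 {
    .host = "192.0.2.11";
    .port = "80";
    .probe = healthcheck;
}

# Load balanced backend group; sick backends are skipped.
director web round-robin {
    { .backend = web1; }
    { .backend = web2; }
}

sub vcl_recv {
    set req.backend = web;
}
```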

Error handling

๏ A backend may be sick for a particular object

‣ Other objects from the same backend can still be accessed

- Unless more than a set number of objects are added to the saint mode blacklist for a specific backend

๏ Do not request the object from that backend again for a period of time

‣ Grace mode is used when all possible backends for the requested object have been blacklisted

๏ Complement backend probes

Saint mode
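
A minimal sketch of saint mode in Varnish 3.x syntax, assuming that any 5xx response should blacklist the object on the responding backend. The 20s period is an arbitrary illustrative value.

```vcl
sub vcl_fetch {
    if (beresp.status >= 500) {
        # Do not ask this backend for this object again for 20s...
        set beresp.saintmode = 20s;
        # ...and retry the request, possibly on another backend.
        return (restart);
    }
}
```

The number of retries is bounded by the max_restarts parameter, so a request cannot loop forever when every backend fails.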

Error handling

๏ A graced object is an object that has expired, but is still kept in cache

‣ beresp.ttl vs. beresp.grace

๏ Graced objects are used to

‣ Serve outdated content if the backend is down

- Probes or saint mode are required for this

‣ Serve slightly stale content while fresh versions are fetched

Grace mode
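
The two uses of graced objects can be sketched as follows in Varnish 3.x syntax. The durations are illustrative assumptions, not recommended values.

```vcl
sub vcl_recv {
    if (req.backend.healthy) {
        # Healthy backend: tolerate only slightly stale content
        # while a fresh version is being fetched.
        set req.grace = 30s;
    } else {
        # Sick backend: accept much older content.
        set req.grace = 1h;
    }
}

sub vcl_fetch {
    # Keep expired objects around for up to 1h beyond their TTL.
    set beresp.grace = 1h;
}
```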

Beyond caching policy

๏ Why restrict VCL / VMODs to implementing the caching policy?

๏ Any logic modeled in VCL / VMODs is compiled, embedded & executed in the caching edge layer

‣ 1000x faster than typical Java / PHP apps

- Strong restrictions

‣ Accounting, paywalling, A/B testing…

varnishtest

๏ Powerful Varnish-specific testing tool

‣ Mocked clients & backends executing / processing HTTP requests against real Varnish Cache Plus instances

‣ http://www.clock.co.uk/...varnishtest

๏ Essential when implementing complex VCL logic

๏ Easily integrable in any CI infrastructure

FAQ

๏ When will SSL support be implemented?

‣ "[...] huge waste of time and effort to even think about it."

๏ When will SPDY support be implemented?

‣ "[...] Varnish is not speedy, Varnish is fast! [...]"

๏ What is the recommended value for this bizarre kernel / varnishd parameter I found in some random blog?

‣ Use Varnish Tuner + Fine tune based on necessity

‣ Pay attention to workspaces & syslog messages

3. Invalidations

Overview

๏ Updated objects may be available before TTL expiration

‣ Purges

‣ Forced misses

‣ Bans

‣ Hash Two / Hash Ninja / …

Purges

๏ VCL

๏ Eagerly discards an object along with all its variants

Overview

acl internal {
    "localhost";
    "192.168.55.0"/24;
}

sub vcl_recv {
    if (req.request == "PURGE") {
        if (client.ip !~ internal) {
            error 405 "Not allowed.";
        }
        return (lookup);
    }
}

sub vcl_hit {
    if (req.request == "PURGE") {
        purge;
        error 200 "Purged.";
    }
}

sub vcl_miss {
    if (req.request == "PURGE") {
        purge;
        error 200 "Purged.";
    }
}

Purges

๏ What if the new object cannot be fetched after the invalidation?

‣ Soft-purges VMOD

‣ Forced misses

๏ What if multiple objects need to be invalidated? What if objects need to be invalidated too frequently?

‣ Bans

‣ Hash Two

Downsides I

Purges

๏ How to invalidate hitpass objects?

‣ Not possible in Varnish Cache Plus 3.x

- Redesigned in Varnish Cache Plus 4.x

- https://www.varnish-cache.org/trac/.../1033

‣ return(pass); during vcl_recv is preferred when possible

Downsides II
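
As a small illustration of preferring return(pass) in vcl_recv, the sketch below (Varnish 3.x syntax) bypasses the cache for requests known in advance to be uncacheable, so no hitpass object is ever created for them. The /account/ URL prefix is a hypothetical example.

```vcl
sub vcl_recv {
    # Hypothetical rule: personalized pages are never cacheable,
    # so decide here instead of creating hitpass objects in vcl_fetch.
    if (req.url ~ "^/account/") {
        return (pass);
    }
}
```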

Forced misses

๏ VCL

๏ Forces a cache miss for the request

‣ Useful for cache priming scripts

Overview

sub vcl_recv {
    if (req.http.X-Priming-Script) {
        ...
        set req.hash_always_miss = true;
    }
    ...
}

Forced misses

๏ Object will always be (re)fetched from the backend

๏ New object is put into cache and used from that point onward

‣ Old object is not evicted until it’s safe to do so

‣ Controls who takes the penalty of waiting for an updated object

๏ Old objects are not freed up until expiration

‣ This is considered a flaw and a fix is expected

Behavior

Bans

๏ VCL or CLI

๏ Lazily discards multiple objects matching an expression

‣ Logical operators + Object attributes + Regular expressions

‣ Only works on objects already in the cache

๏ Ban lurker

‣ Frees up memory + Keeps the ban list at a manageable size

‣ obj.* based expressions

Overview

Bans

Example

sub vcl_recv {
    if (req.request == "BAN") {
        ...
        if (!req.http.X-Ban-Url-Regexp) {
            error 400 "Empty URL regexp.";
        }
        ban("obj.http.X-Url ~ " + req.http.X-Ban-Url-Regexp);
        # Answer the BAN request here instead of forwarding it to a backend.
        error 200 "Ban added.";
    }
}

sub vcl_fetch {
    set beresp.http.X-Url = req.url;
}

sub vcl_deliver {
    unset resp.http.X-Url;
}

Hash Two

๏ VCL + VMOD

๏ Works around the scalability limits of bans

Overview

HTTP/1.x 200 OK
Transfer-Encoding: chunked
...
X-Tags: C10 P42 P236 P857
...

ban obj.http.X-Tags ~ "(\s|^)P42(\s|$)"

Hash Two

Example

import hashtwo;

sub vcl_recv {
    if (req.request == "PURGE") {
        ...
        if (hashtwo.purge(req.http.X-Tag) != 0) {
            error 200 "Purged.";
        } else {
            error 404 "Not found.";
        }
    }
}

sub vcl_fetch {
    set beresp.http.X-HashTwo = beresp.http.X-Tags;
}

4. HTTP headers

Cache related headers

๏ Expires

๏ Cache-Control

๏ Last-Modified

๏ If-Modified-Since

๏ If-None-Match

๏ Etag

๏ Pragma

๏ Vary

๏ Age

Cache-Control

๏ Specifies directives that must be applied by all caching mechanisms (from Varnish Cache Plus to browser cache)

Overview

‣ public | private

‣ no-store

‣ no-cache

‣ max-age

‣ s-maxage

‣ must-revalidate

‣ no-transform

‣ …

Cache-Control

๏ Ignored in incoming client HTTP requests

๏ Only s-maxage & max-age used in backend HTTP responses to calculate default TTL

‣ Always overrides Expires header

‣ Beware of Age header in client responses

- Objects not cached client side

- https://www.varnish-cache.org/...Caching

beresp.ttl
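
One possible way to handle the Age caveat above, sketched in Varnish 3.x syntax. It assumes the TTL is overridden in VCL for HTML responses (the 1h value and the Content-Type test are illustrative), which can push Age past the max-age seen by clients and stop them from caching the object.

```vcl
sub vcl_fetch {
    # Assumed policy: keep HTML in Varnish for 1h, whatever the backend says.
    if (beresp.http.Content-Type ~ "^text/html") {
        set beresp.ttl = 1h;
    }
}

sub vcl_deliver {
    # Reset Age so clients count freshness from delivery time
    # instead of from the moment Varnish stored the object.
    set resp.http.Age = "0";
}
```

The trade-off is that clients may keep an object for up to max-age after Varnish has already expired it, so this only fits content where that extra staleness is acceptable.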

Vary

๏ Indicates the response returned by the backend server may vary depending on headers received in the request

๏ Object variants & Hit ratio

‣ Vary: Accept-Encoding

- Normalization of Accept-Encoding header is not required

‣ Vary: User-Agent

5. Content composition

Overview

๏ Break objects into smaller fragments

‣ Separate cache policy for each fragment

‣ Increase hit ratio

๏ Tools

‣ Edge Side Includes (ESI)

‣ AJAX

- Beware of RTT & Cross domain policy

Edge Side Includes

๏ Subset of ESI Language Specification 1.0

‣ <esi:include src="<URL>" />

‣ <esi:remove>...</esi:remove>

‣ <!--esi ... -->

๏ set beresp.do_esi = true;

‣ Separate Varnish requests

๏ Testing ESI in dev environment
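
A minimal sketch of enabling ESI processing in Varnish 3.x syntax. It assumes only HTML responses contain ESI tags, and that fragments live under a hypothetical /fragments/ URL prefix with their own, shorter TTL.

```vcl
sub vcl_fetch {
    # Only parse responses that can actually contain ESI tags.
    if (beresp.http.Content-Type ~ "^text/html") {
        set beresp.do_esi = true;
    }

    # Each fragment is a separate Varnish request,
    # so it can get its own cache policy.
    if (req.url ~ "^/fragments/") {
        set beresp.ttl = 30s;
    }
}
```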

6. VAC

Overview

๏ Central control of Varnish Cache Plus servers

‣ Web UI + RESTful API

- Super Fast Purger

๏ Cache group management

‣ Real time statistics, VCL editor, ban submission…

๏ Varnish Agent 2

Super Fast Purger

๏ High performance intermediary distributing invalidation requests to groups of Varnish Cache Plus servers

‣ Leverages speed & flexibility of VCL

‣ Keep-alive workaround

๏ Part of the VAC RESTful API

‣ Trivially integrable in existing applications

Change management

๏ Easily integrable using the VAC RESTful API

‣ git, Mercurial… hooks

‣ Jenkins, Travis, GitLab… CI scripts

๏ Manual VCL bundle generation

๏ Orchestrated / programmed deployments, rollbacks, etc.

7. VCS

Overview

๏ Real-time aggregated statistics

‣ Multiple vstatdprobe daemons

‣ One vstatd daemon

‣ JSON + Time series API

๏ VSM log based

‣ Efficient circular in-memory data structure

‣ std.log("vcs-key:" + <key suffix>);

Some ideas

๏ Trending articles or sale products

๏ Cache hits and cache misses

๏ URLs with long load times

๏ URLs with the most 5xx response codes

๏ Where traffic is coming from

๏ …

Example

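import std;

sub vcl_deliver {
    # One std.log() call per VCS key this request should be counted under.
    std.log("vcs-key:" + req.http.host);
    std.log("vcs-key:" + req.http.host + req.url);
    std.log("vcs-key:TOTAL");
    # obj.hits == 0 means the response was not served from cache.
    if (obj.hits == 0) {
        std.log("vcs-key:MISS");
    }
}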

API I

๏ Stats (#requests, #misses, avg ttfb, acc body bytes, #2xx, #3xx…) for key named "example.com" during the last time window

‣ GET  /key/example.com  

๏ Keys that produced the most 5xx responses during the last time window

‣ GET  /all/top_5xx  

๏ Top 5 requested keys during the last time window

‣ GET  /all/top/5?verbose=1

API II

๏ Top 10 most requested keys ending with ‘.gif' during the last time window

‣ GET  /match/(.*)%5C.gif$/top  

๏ Top 50 slowest backend requests aggregating the last 20 time windows

‣ GET  /all/top_ttfb/50?b=20

8. Device detection

Overview

๏ VMOD

๏ DeviceAtlas

‣ https://deviceatlas.com

‣ Locally deployed database, updated daily

๏ OSS alternatives

‣ https://github.com/serbanghita/Mobile-Detect

‣ …

Example

import deviceatlas;

sub vcl_recv {
    if (deviceatlas.lookup(req.http.User-Agent, "isMobilePhone") == "1") {
        set req.http.X-Device = "mobile";
    } elsif (deviceatlas.lookup(req.http.User-Agent, "isTablet") == "1") {
        set req.http.X-Device = "tablet";
    } else {
        set req.http.X-Device = "desktop";
    }
}

Some ideas

๏ Redirections based on device properties

๏ Backend selection based on device properties

๏ Normalization of the UA header

‣ Caching different versions (i.e. Vary header) of the same object based on normalized UAs

๏ …
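
The idea of caching variants per normalized UA can be sketched as follows in Varnish 3.x syntax. It assumes vcl_recv has already set a req.http.X-Device header with a small set of values (mobile, tablet, desktop), and that the backend uses that header to pick the response it renders.

```vcl
sub vcl_fetch {
    # Store one variant per device class instead of one per raw User-Agent.
    if (beresp.http.Vary) {
        set beresp.http.Vary = beresp.http.Vary + ", X-Device";
    } else {
        set beresp.http.Vary = "X-Device";
    }
}

sub vcl_deliver {
    # Clients never see X-Device, so tell downstream caches
    # that the response really depends on User-Agent.
    set resp.http.Vary = regsub(resp.http.Vary, "X-Device", "User-Agent");
}
```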

9. Varnish Plus 4.x

Highlights

๏ Client / backend thread split

‣ Background content refreshing

๏ Redesigned purges

‣ return(purge); during vcl_recv

๏ Directors implemented as VMODs

‣ Consistent hashing director

๏ Distinction between error & synthetic responses
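
Several of these highlights can be seen together in a short sketch written in VCL 4.0 syntax. The backend names and addresses are hypothetical; the example assumes the directors VMOD shipped with Varnish 4.0 and hashes on the URL to pick a backend.

```vcl
vcl 4.0;

import directors;

backend web1 { .host = "192.0.2.10"; .port = "80"; }
backend web2 { .host = "192.0.2.11"; .port = "80"; }

acl internal {
    "localhost";
}

sub vcl_init {
    # Directors are now objects provided by a VMOD.
    new cluster = directors.hash();
    cluster.add_backend(web1, 1.0);
    cluster.add_backend(web2, 1.0);
}

sub vcl_recv {
    if (req.method == "PURGE") {
        if (client.ip !~ internal) {
            # Synthetic responses replace the 3.x "error" statement.
            return (synth(405, "Not allowed."));
        }
        # Purges are requested directly from vcl_recv.
        return (purge);
    }

    # Same URL always maps to the same backend.
    set req.backend_hint = cluster.backend(req.url);
}
```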

10. Q&A