
Varnish Cache Plus. Random notes for wise web developers

Carlos Abalde, Roberto Moreda {cabalde, moreda}@allenta.com October 2014

Agenda

1. Introduction

2. Varnish 101

3. Invalidations

4. HTTP headers

5. Content composition

6. VAC

7. VCS

8. Device detection

9. Varnish Plus 4.x

10. Q&A

1. Introduction

Disclaimer

๏ General understanding of ‘The Varnish Book’ is assumed

‣ This is not the official Varnish Cache training

‣ This is not a Varnish Cache internals course

‣ This is not a Varnish module development course

‣ This is a collection of random notes for web developers willing to make the most of Varnish Cache Plus

๏ OSS Varnish Cache vs. Varnish Cache Plus

‣ 3.x vs. 4.x

Varnish Cache 3.x

๏ The Varnish Book

‣ https://www.varnish-software.com/static/book/

๏ The Varnish Reference Manual

‣ https://www.varnish-cache.org/docs/.../index.html

๏ Default VCL

‣ https://www.varnish-cache.org/trac/.../default.vcl

What everybody should know

https://www.varnish-software.com/static/book/

https://www.varnish-cache.org/docs/3.0/reference/index.html

https://www.varnish-cache.org/trac/browser/bin/varnishd/default.vcl?rev=3.0

Varnish Cache Plus 3.x

๏ Support, advice & training

๏ Varnish Enhanced Cache Invalidation

‣ Hash Two, Hash Ninja…

๏ Varnish Administration Console (VAC)

๏ Varnish Custom Statistics (VCS)

๏ Device detection

Components I


๏ Varnish Tuner

๏ Enhanced HTTP streaming

๏ Packaged binary VMODs

๏ Varnish Paywall

๏ … and more to come shortly!

Components II


๏ 64 bits

๏ Distributions

‣ RedHat Enterprise Linux 5 & 6

‣ Ubuntu Linux 12.04 LTS (precise)

‣ Ubuntu Linux 14.04 LTS (trusty)

‣ Debian Linux 7 (wheezy)

Supported platforms

2. Varnish 101

Caching policy

๏ Varnish Cache Plus would require zero configuration in a perfect world with perfect HTTP citizens

‣ Correct HTTP caching headers

‣ Vary HTTP header used wisely

‣ HTTP cookies used conservatively

๏ By default Varnish Cache Plus will not cache anything marked as private, carrying a cookie or including a '*' Vary HTTP header
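The cookie and Vary parts of this default behavior can be sketched roughly as follows. This is a simplified paraphrase of the Varnish Cache 3.0 default VCL (request method checks and other details are omitted); the shipped default.vcl remains the authoritative version.

```vcl
sub vcl_recv {
    # Requests carrying credentials or cookies bypass the cache.
    if (req.http.Authorization || req.http.Cookie) {
        return (pass);
    }
    return (lookup);
}

sub vcl_fetch {
    # Responses without a positive TTL, setting a cookie or varying on
    # everything are not cached; a hitpass object remembers that decision.
    if (beresp.ttl <= 0s ||
        beresp.http.Set-Cookie ||
        beresp.http.Vary == "*") {
        set beresp.ttl = 120s;
        return (hit_for_pass);
    }
    return (deliver);
}
```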

VCL

๏ Varnish Configuration Language

‣ Domain specific state engine

‣ No loops, variables, functions…

‣ Command line configuration & Tunable parameters

๏ Translated to C code

๏ Loaded as a dynamically generated shared library

‣ Zero downtime & Blazingly fast

Overview

VCL

๏ Normalize client-input

๏ Pick a backend / director

๏ Re-write / extend client-input

๏ Decide caching policy based on client-input

๏ Access control

๏ Security barriers

vcl_recv I

VCL

vcl_recv II

sub vcl_recv {
    # Backend selection & URL normalization.
    if (req.http.host ~ "^blogs\.") {
        set req.backend = blogs;
        set req.http.host = regsub(req.http.host, "^blogs\.", "");
        set req.url = regsub(req.url, "^", "/blogs");
    } else {
        set req.backend = default;
    }

    # Poor man's device detection.
    if (req.http.User-Agent ~ "(iPad|iPhone|Android)") {
        set req.http.X-Device = "mobile";
    } else {
        set req.http.X-Device = "desktop";
    }
}

VCL

๏ Sanitize / extend backend response

๏ Override cache duration

‣ beresp.ttl

- s-maxage & max-age in Cache-Control HTTP header

- Expires HTTP header

- Default TTL

‣ Beware of the TTL of hitpass objects

vcl_fetch I

VCL

vcl_fetch II

sub vcl_fetch {
    # Override caching TTL.
    if (beresp.http.Cache-Control !~ "s-maxage") {
        set beresp.ttl = 0s;
        if (bereq.url ~ "\.jpg(\?|$)") {
            set beresp.ttl = 30s;
        }
    }

    # Never cache a Set-Cookie header.
    if (beresp.ttl > 0s) {
        unset beresp.http.Set-Cookie;
    }

    # Create ban-lurker friendly objects.
    set beresp.http.X-Url = bereq.url;
}

VCL

Request flow I

VCL

Request flow II

Process architecture

VMODs

๏ Shared libraries extending the VCL core

‣ std VMOD

- std.toupper(), std.log(), std.fileread()…

‣ ABI (Application Binary Interface) mismatches

๏ cookie, header, var, curl, digest, geoip, boltsort, memcached, redis, dns…

๏ https://www.varnish-cache.org/vmods
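As a minimal sketch of using a VMOD, the std VMOD bundled with Varnish can be imported and called from any subroutine. The normalization shown here is only an illustration, not a recommendation from the slides.

```vcl
import std;

sub vcl_recv {
    # Lowercase the Host header so that "Example.com" and "example.com"
    # map to the same cached objects.
    set req.http.host = std.tolower(req.http.host);

    # Write a line to the shared memory log (visible with varnishlog).
    std.log("normalized host: " + req.http.host);
}
```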


Backends

๏ Multiple backends

‣ Selected at request time based on any request property

๏ Probes

‣ Per-backend periodic health checks

- Interval, timeout, expected response…

๏ Directors

‣ Load balanced backend groups
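A minimal Varnish 3.x sketch combining the three concepts. Backend names, addresses and the /health URL are hypothetical placeholders.

```vcl
# Periodic health check shared by both backends.
probe healthcheck {
    .url = "/health";
    .interval = 5s;
    .timeout = 1s;
    .window = 5;              # Look at the last 5 probes...
    .threshold = 3;           # ...and require 3 good ones to be healthy.
    .expected_response = 200;
}

backend web1 {
    .host = "192.0.2.10";
    .port = "8080";
    .probe = healthcheck;
}

backend web2 {
    .host = "192.0.2.11";
    .port = "8080";
    .probe = healthcheck;
}

# Round-robin director; sick backends are skipped.
director web round-robin {
    { .backend = web1; }
    { .backend = web2; }
}

sub vcl_recv {
    set req.backend = web;
}
```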

Error handling

๏ A backend may be sick for a particular object

‣ Other objects from the same backend can still be accessed

- Unless more than a set number of objects are added to the saint mode blacklist for a specific backend

๏ Do not request the object from that backend again for a period of time

‣ Grace mode is used when all possible backends for the requested object have been blacklisted

๏ Complement backend probes

Saint mode
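In Varnish 3.x saint mode is enabled per object from vcl_fetch. A minimal sketch, assuming that any 5xx response should blacklist the object on that backend (the 10s period is an arbitrary example value):

```vcl
sub vcl_fetch {
    if (beresp.status >= 500) {
        # Do not ask this backend for this object during the next 10s...
        set beresp.saintmode = 10s;
        # ...and retry the request, possibly on another backend.
        return (restart);
    }
}
```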

Error handling

๏ A graced object is an object that has expired, but is still kept in cache

‣ beresp.ttl vs. beresp.grace

๏ Graced objects are used to

‣ Serve outdated content if the backend is down

- Probes or saint mode is required for this

‣ Serve slightly stale content while fresh versions are fetched

Grace mode
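A minimal Varnish 3.x sketch of both uses of grace. The durations are example values, and the health check assumes the selected backend has a probe configured.

```vcl
sub vcl_fetch {
    # Keep objects up to 1h past their TTL.
    set beresp.grace = 1h;
}

sub vcl_recv {
    if (req.backend.healthy) {
        # Healthy backend: accept objects only slightly past their TTL
        # while a fresh version is being fetched.
        set req.grace = 30s;
    } else {
        # Sick backend: accept anything still kept in cache.
        set req.grace = 1h;
    }
}
```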

Beyond caching policy

๏ Why restrict VCL / VMODs to implementing the caching policy?

๏ Any logic modeled in VCL / VMODs is compiled, embedded & executed in the caching edge layer

‣ 1000x faster than typical Java / PHP apps

- Strong restrictions

‣ Accounting, paywalling, A/B testing…

varnishtest

๏ Powerful Varnish-specific testing tool

‣ Mocked clients & backends executing / processing HTTP requests against real Varnish Cache Plus instances

‣ http://www.clock.co.uk/...varnishtest

๏ Essential when implementing complex VCL logic

๏ Easily integrable in any CI infrastructure

http://www.clock.co.uk/blog/getting-started-with-varnishtest

FAQ

๏ When will SSL support be implemented?

‣ "[...] huge waste of time and effort to even think about it."

๏ When will SPDY support be implemented?

‣ "[...] Varnish is not speedy, Varnish is fast! [...]"

๏ What is the recommended value for this bizarre kernel / varnishd parameter I found in some random blog?

‣ Use Varnish Tuner + Fine tune based on necessity

‣ Pay attention to workspaces & syslog messages

https://www.varnish-cache.org/docs/trunk/phk/ssl.html

https://www.varnish-cache.org/docs/trunk/phk/spdy.html

3. Invalidations

Overview

๏ Updated objects may be available before TTL expiration

‣ Purges

‣ Forced misses

‣ Bans

‣ Hash Two / Hash Ninja / …

Purges

๏ VCL

๏ Eagerly discards an object along with all its variants

Overview

acl internal {
    "localhost";
    "192.168.55.0"/24;
}

sub vcl_recv {
    if (req.request == "PURGE") {
        if (client.ip !~ internal) {
            error 405 "Not allowed.";
        }
        return (lookup);
    }
}

sub vcl_hit {
    if (req.request == "PURGE") {
        purge;
        error 200 "Purged.";
    }
}

sub vcl_miss {
    if (req.request == "PURGE") {
        purge;
        error 200 "Purged.";
    }
}

Purges

๏ What if the new object cannot be fetched after the invalidation?

‣ Soft-purges VMOD

‣ Forced misses

๏ What if multiple objects need to be invalidated? What if objects need to be invalidated too frequently?

‣ Bans

‣ Hash Two

Downsides I

Purges

๏ How to invalidate hitpass objects?

‣ Not possible in Varnish Cache Plus 3.x

- Redesigned in Varnish Cache Plus 4.x

- https://www.varnish-cache.org/trac/.../1033

‣ return(pass); during vcl_recv is preferred when possible

Downsides II

https://www.varnish-cache.org/trac/ticket/1033
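When a request is known in advance to be uncacheable, passing it during vcl_recv avoids creating a hitpass object in the first place. A minimal sketch, where the URL prefixes are hypothetical:

```vcl
sub vcl_recv {
    # Decide up front instead of relying on hit_for_pass in vcl_fetch,
    # so no hitpass object needs to be invalidated later.
    if (req.url ~ "^/(account|checkout)/") {
        return (pass);
    }
}
```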

Forced misses

๏ VCL

๏ Forces a cache miss for the request

‣ Useful for cache priming scripts

Overview

sub vcl_recv {
    if (req.http.X-Priming-Script) {
        ...
        set req.hash_always_miss = true;
    }
    ...
}

Forced misses

๏ Object will always be (re)fetched from the backend

๏ New object is put into cache and used from that point onward

‣ Old object is not evicted until it’s safe to do so

‣ Controls who takes the penalty of waiting for an updated object

๏ Old objects are not freed up until expiration

‣ This is considered a flaw and a fix is expected

Behavior

Bans

๏ VCL or CLI

๏ Lazily discards multiple objects matching an expression

‣ Logical operators + Object attributes + Regular expressions

‣ Only works on objects already in the cache

๏ Ban lurker

‣ Frees up memory + Keeps the ban list at a manageable size

‣ obj.* based expressions

Overview
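Ban expressions can combine several object attributes with logical operators. The following sketch assumes objects are tagged with X-Host and X-Url at fetch time (X-Host and X-Ban-Url-Regexp are illustrative header names) and omits access control for brevity.

```vcl
sub vcl_fetch {
    # Copy request properties into the object so that bans only
    # reference obj.* and stay ban-lurker friendly.
    set beresp.http.X-Host = req.http.host;
    set beresp.http.X-Url = req.url;
}

sub vcl_recv {
    if (req.request == "BAN") {
        # Ban every object of this host whose URL matches the regexp.
        ban("obj.http.X-Host == " + req.http.host +
            " && obj.http.X-Url ~ " + req.http.X-Ban-Url-Regexp);
        error 200 "Ban added.";
    }
}

sub vcl_deliver {
    # Internal bookkeeping headers are not exposed to clients.
    unset resp.http.X-Host;
    unset resp.http.X-Url;
}
```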

Bans

Example

sub vcl_recv {
    if (req.request == "BAN") {
        ...
        if (!req.http.X-Ban-Url-Regexp) {
            error 400 "Empty URL regexp.";
        }
        ban("obj.http.X-Url ~ " + req.http.X-Ban-Url-Regexp);
        # Answer the BAN request here instead of forwarding it.
        error 200 "Ban added.";
    }
}

sub vcl_fetch {
    set beresp.http.X-Url = req.url;
}

sub vcl_deliver {
    unset resp.http.X-Url;
}

Hash Two

๏ VCL + VMOD

๏ Works around the scalability limits of bans

Overview

HTTP/1.x 200 OK
Transfer-Encoding: chunked
...
X-Tags: C10 P42 P236 P857
...

ban obj.http.X-Tags ~ "(\s|^)P42(\s|$)"

Hash Two

Example

import hashtwo;

sub vcl_recv {
    if (req.request == "PURGE") {
        ...
        if (hashtwo.purge(req.http.X-Tag) != 0) {
            error 200 "Purged.";
        } else {
            error 404 "Not found.";
        }
    }
}

sub vcl_fetch {
    set beresp.http.X-HashTwo = beresp.http.X-Tags;
}

4. HTTP headers

Cache related headers

๏ Expires

๏ Cache-Control

๏ Last-Modified

๏ If-Modified-Since

๏ If-None-Match

๏ ETag

๏ Pragma

๏ Vary

๏ Age

Cache-Control

๏ Specifies directives that must be applied by all caching mechanisms (from Varnish Cache Plus to browser cache)

Overview

‣ public | private

‣ no-store

‣ no-cache

‣ max-age

‣ s-maxage

‣ must-revalidate

‣ no-transform

‣ …

Cache-Control

๏ Ignored in incoming client HTTP requests

๏ Only s-maxage & max-age used in backend HTTP responses to calculate default TTL

‣ Always overrides Expires header

‣ Beware of Age header in client responses

- Objects not cached client side

- https://www.varnish-cache.org/...Caching

beresp.ttl

https://www.varnish-cache.org/trac/wiki/VCLExampleLongerCaching
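The Age problem appears when Varnish keeps an object longer than the max-age announced to clients: the delivered Age then exceeds max-age and clients treat the object as already stale. A sketch of the usual workaround, loosely based on the wiki page linked above; the TTL values and the X-Reset-Age marker header are assumptions chosen for illustration.

```vcl
sub vcl_fetch {
    if (beresp.ttl > 0s) {
        # Short client-side lifetime, long Varnish-side lifetime.
        unset beresp.http.Expires;
        set beresp.http.Cache-Control = "max-age=900";
        set beresp.ttl = 1w;
        # Marker telling vcl_deliver to reset the Age header.
        set beresp.http.X-Reset-Age = "1";
    }
}

sub vcl_deliver {
    if (resp.http.X-Reset-Age) {
        unset resp.http.X-Reset-Age;
        # Without this, Age would quickly exceed max-age and clients
        # would not cache the object.
        set resp.http.Age = "0";
    }
}
```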

Vary

๏ Indicates the response returned by the backend server may vary depending on headers received in the request

๏ Object variants & Hit ratio

‣ Vary: Accept-Encoding

- Normalization of Accept-Encoding header is not required

‣ Vary: User-Agent
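Varying on the raw User-Agent header creates almost one variant per browser build and ruins the hit ratio. One possible mitigation, sketched here under the assumption that vcl_recv has already set a normalized X-Device request header (as in the earlier vcl_recv example), is to vary on that header inside Varnish instead:

```vcl
sub vcl_fetch {
    # Store variants per device class rather than per User-Agent string.
    if (beresp.http.Vary ~ "User-Agent") {
        set beresp.http.Vary =
            regsub(beresp.http.Vary, "User-Agent", "X-Device");
    }
}

sub vcl_deliver {
    # Downstream caches never see X-Device, so expose the original header.
    if (resp.http.Vary ~ "X-Device") {
        set resp.http.Vary =
            regsub(resp.http.Vary, "X-Device", "User-Agent");
    }
}
```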

5. Content composition

Overview

๏ Break objects into smaller fragments

‣ Separate cache policy for each fragment

‣ Increase hit ratio

๏ Tools

‣ Edge Side Includes (ESI)

‣ AJAX

- Beware of RTT & Cross domain policy

Edge Side Includes

๏ Subset of ESI Language Specification 1.0

‣ <esi:include src="<URL>" />

‣ <esi:remove>...</esi:remove>

‣ <!--esi ... -->

๏ set beresp.do_esi = true;

‣ Separate Varnish requests

๏ Testing ESI in dev environment

http://www.w3.org/TR/esi-lang
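A minimal Varnish 3.x sketch of ESI processing with a separate cache policy per fragment. The /esi/ URL prefix and the TTL values are hypothetical.

```vcl
sub vcl_fetch {
    # Only parse ESI tags in HTML responses.
    if (beresp.http.Content-Type ~ "^text/html") {
        set beresp.do_esi = true;
    }

    # Each included fragment is a separate Varnish request, so it can
    # get its own TTL independently of the page embedding it.
    if (req.url ~ "^/esi/") {
        set beresp.ttl = 30s;
    } else {
        set beresp.ttl = 1h;
    }
}
```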

6. VAC

Overview

๏ Central control of Varnish Cache Plus servers

‣ Web UI + RESTful API

- Super Fast Purger

๏ Cache group management

‣ Real time statistics, VCL editor, ban submission…

๏ Varnish Agent 2

Super Fast Purger

๏ High performance intermediary distributing invalidation requests to groups of Varnish Cache Plus servers

‣ Leverages speed & flexibility of VCL

‣ Keep-alive workaround

๏ Part of the VAC RESTful API

‣ Trivially integrable in existing applications

Change management

๏ Easily integrable using the VAC RESTful API

‣ git, Mercurial… hooks

‣ Jenkins, Travis, GitLab… CI scripts

๏ Manual VCL bundle generation

๏ Orchestrated / programmed deployments, rollbacks, etc.

7. VCS

Overview

๏ Real-time aggregated statistics

‣ Multiple vstatdprobe daemons

‣ One vstatd daemon

‣ JSON + Time series API

๏ VSM log based

‣ Efficient circular in-memory data structure

‣ std.log("vcs-key:" + <key suffix>);

Some ideas

๏ Trending articles or sale products

๏ Cache hits and cache misses

๏ URLs with long load times

๏ URLs with the most 5xx response codes

๏ Where traffic is coming from

๏ …

Example

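import std;

sub vcl_deliver {
    # One key per host, one per host + URL, and a global counter.
    std.log("vcs-key:" + req.http.host);
    std.log("vcs-key:" + req.http.host + req.url);
    std.log("vcs-key:TOTAL");
    # Objects with no hits so far were just fetched from the backend.
    if (obj.hits == 0) {
        std.log("vcs-key:MISS");
    }
}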

API I

๏ Stats (#requests, #misses, avg ttfb, acc body bytes, #2xx, #3xx…) for key named "example.com" during the last time windows

‣ GET /key/example.com

๏ Keys that produced the most 5xx responses during the last time window

‣ GET /all/top_5xx

๏ Top 5 requested keys during the last time window

‣ GET /all/top/5?verbose=1

API II

๏ Top 10 most requested keys ending with '.gif' during the last time window

‣ GET /match/(.*)%5C.gif$/top

๏ Top 50 slowest backend requests aggregating the last 20 time windows

‣ GET /all/top_ttfb/50?b=20

8. Device detection

Overview

๏ VMOD

๏ DeviceAtlas

‣ https://deviceatlas.com

‣ Database locally deployed & Daily updated

๏ OSS alternatives

‣ https://github.com/serbanghita/Mobile-Detect

‣ …

Example

import deviceatlas;

sub vcl_recv {
    if (deviceatlas.lookup(req.http.User-Agent, "isMobilePhone") == "1") {
        set req.http.X-Device = "mobile";
    } elsif (deviceatlas.lookup(req.http.User-Agent, "isTablet") == "1") {
        set req.http.X-Device = "tablet";
    } else {
        set req.http.X-Device = "desktop";
    }
}

Some ideas

๏ Redirections based on device properties

๏ Backend selection based on device properties

๏ Normalization of the UA header

‣ Caching different versions (i.e. Vary header) of the same object based on normalized UAs

๏ …
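The redirection idea can be sketched in Varnish 3.x with a custom error status. This assumes X-Device has already been set by device detection logic placed earlier in vcl_recv; the host names, the 750 status code and the X-Redirect-Url header are arbitrary choices for illustration.

```vcl
sub vcl_recv {
    # Send phones visiting the desktop site to the mobile site.
    if (req.http.X-Device == "mobile" &&
        req.http.host == "www.example.com") {
        set req.http.X-Redirect-Url = "http://m.example.com" + req.url;
        error 750 "Redirect";
    }
}

sub vcl_error {
    # Turn the internal 750 status into a real HTTP redirect.
    if (obj.status == 750) {
        set obj.http.Location = req.http.X-Redirect-Url;
        set obj.status = 302;
        return (deliver);
    }
}
```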

9. Varnish Plus 4.x

Highlights

๏ Client / backend thread split

‣ Background content refreshing

๏ Redesigned purges

‣ return(purge); during vcl_recv

๏ Directors implemented as VMODs

‣ Consistent hashing director

๏ Distinction between error & synthetic responses
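For comparison with the 3.x purge example, a sketch of the same logic in Varnish 4.0 syntax: req.request becomes req.method, purging is a return action in vcl_recv, and synthetic responses replace the error keyword. The ACL contents are reused from the 3.x example, and backend definitions are omitted.

```vcl
vcl 4.0;

acl internal {
    "localhost";
    "192.168.55.0"/24;
}

sub vcl_recv {
    if (req.method == "PURGE") {
        if (client.ip !~ internal) {
            # Synthetic response generated by Varnish itself.
            return (synth(405, "Not allowed."));
        }
        # No vcl_hit / vcl_miss handling needed any more.
        return (purge);
    }
}
```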

10. Q&A