Varnish Cache Plus. Random notes for wise web developers
Carlos Abalde, Roberto Moreda {cabalde, moreda}@allenta.com October 2014
Agenda
1. Introduction
2. Varnish 101
3. Invalidations
4. HTTP headers
5. Content composition
6. VAC
7. VCS
8. Device detection
9. Varnish Plus 4.x
10. Q&A
Disclaimer
๏ General understanding of ‘The Varnish Book’ is assumed
‣ This is not the official Varnish Cache training
‣ This is not a Varnish Cache internals course
‣ This is not a Varnish module development course
‣ This is a collection of random notes for web developers willing to make the most of Varnish Cache Plus
๏ OSS Varnish Cache vs. Varnish Cache Plus
‣ 3.x vs. 4.x
Varnish Cache 3.x
๏ The Varnish Book
‣ https://www.varnish-software.com/static/book/
๏ The Varnish Reference Manual
‣ https://www.varnish-cache.org/docs/.../index.html
๏ Default VCL
‣ https://www.varnish-cache.org/trac/.../default.vcl
What everybody should know
Varnish Cache Plus 3.x
๏ Support, advice & training
๏ Varnish Enhanced Cache Invalidation
‣ Hash Two, Hash Ninja…
๏ Varnish Administration Console (VAC)
๏ Varnish Custom Statistics (VCS)
๏ Device detection
Components I
Varnish Cache Plus 3.x
๏ Varnish Tuner
๏ Enhanced HTTP streaming
๏ Packaged binary VMODs
๏ Varnish Paywall
๏ … and more to come shortly!
Components II
Varnish Cache Plus 3.x
๏ 64 bits
๏ Distributions
‣ RedHat Enterprise Linux 5 & 6
‣ Ubuntu Linux 12.04 LTS (precise)
‣ Ubuntu Linux 14.04 LTS (trusty)
‣ Debian Linux 7 (wheezy)
Supported platforms
Caching policy
๏ Varnish Cache Plus would require zero configuration in a perfect world with perfect HTTP citizens
‣ Correct HTTP caching headers
‣ Vary HTTP header used wisely
‣ HTTP cookies used conservatively
๏ By default Varnish Cache Plus will not cache anything marked as private, carrying a cookie or including a '*' Vary HTTP header
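As a rough illustration of the default policy above, here is a condensed sketch modelled on the default VCL shipped with open-source Varnish Cache 3.0. The Plus build may differ in details, and the handling of responses marked as private is not shown.

```vcl
sub vcl_recv {
    # Only GET and HEAD requests are candidates for caching.
    if (req.request != "GET" && req.request != "HEAD") {
        return (pass);
    }
    # Requests carrying credentials or cookies bypass the cache.
    if (req.http.Authorization || req.http.Cookie) {
        return (pass);
    }
    return (lookup);
}

sub vcl_fetch {
    # Responses that set a cookie, vary on everything or have no
    # positive TTL are remembered as hit-for-pass for a short time.
    if (beresp.ttl <= 0s ||
        beresp.http.Set-Cookie ||
        beresp.http.Vary == "*") {
        set beresp.ttl = 120s;
        return (hit_for_pass);
    }
    return (deliver);
}
```

Most real-world VCL exists to relax or tighten these checks for backends that are not perfect HTTP citizens.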
VCL
๏ Varnish Configuration Language
‣ Domain specific state engine
‣ No loops, variables, functions…
‣ Command line configuration & Tunable parameters
๏ Translated to C code
๏ Loaded as a dynamically generated shared library
‣ Zero downtime & Blazingly fast
Overview
VCL
๏ Normalize client-input
๏ Pick a backend / director
๏ Re-write / extend client-input
๏ Decide caching policy based on client-input
๏ Access control
๏ Security barriers
vcl_recv I
VCL
vcl_recv II
sub vcl_recv {
    # Backend selection & URL normalization.
    if (req.http.host ~ "^blogs\.") {
        set req.backend = blogs;
        set req.http.host = regsub(req.http.host, "^blogs\.", "");
        set req.url = regsub(req.url, "^", "/blogs");
    } else {
        set req.backend = default;
    }

    # Poor man's device detection.
    if (req.http.User-Agent ~ "(iPad|iPhone|Android)") {
        set req.http.X-Device = "mobile";
    } else {
        set req.http.X-Device = "desktop";
    }
}
VCL
๏ Sanitize / extend backend response
๏ Override cache duration
‣ beresp.ttl
- s-maxage & max-age in Cache-Control HTTP header
- Expires HTTP header
- Default TTL
‣ Beware of the TTL of hitpass objects
vcl_fetch I
VCL
vcl_fetch II
sub vcl_fetch {
    # Override caching TTL.
    if (beresp.http.Cache-Control !~ "s-maxage") {
        # Durations need a unit in VCL.
        set beresp.ttl = 0s;
        if (bereq.url ~ "\.jpg(\?|$)") {
            set beresp.ttl = 30s;
        }
    }

    # Never cache a Set-Cookie header.
    if (beresp.ttl > 0s) {
        unset beresp.http.Set-Cookie;
    }

    # Create ban-lurker friendly objects.
    set beresp.http.X-Url = bereq.url;
}
VMODs
๏ Shared libraries extending the VCL core
‣ std VMOD
- std.toupper(), std.log(), std.fileread()…
‣ ABI (Application Binary Interface) mismatches
๏ cookie, header, var, curl, digest, geoip, boltsort, memcached, redis, dns…
๏ https://www.varnish-cache.org/vmods
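A minimal sketch of how a VMOD is used from VCL, based on the std VMOD functions listed above. The X-Country header is a hypothetical name chosen only for illustration.

```vcl
import std;

sub vcl_recv {
    if (req.http.X-Country) {
        # Normalize the case of a (hypothetical) header before it is used.
        set req.http.X-Country = std.toupper(req.http.X-Country);

        # Write a line to the shared memory log (visible with varnishlog).
        std.log("country:" + req.http.X-Country);
    }
}
```

The import line is what ties the VCL to the shared library, which is also where ABI mismatches show up: a VMOD must be built against the Varnish version that loads it.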
Backends
๏ Multiple backends
‣ Selected at request time based on any request property
๏ Probes
‣ Per-backend periodic health checks
- Interval, timeout, expected response…
๏ Directors
‣ Load balanced backend groups
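The three concepts above fit together roughly as follows in Varnish 3.x syntax. Backend names, addresses and probe values are placeholders, not recommendations.

```vcl
# Periodic health check shared by both backends.
probe healthcheck {
    .url = "/";
    .interval = 5s;
    .timeout = 1s;
    .window = 5;
    .threshold = 3;
    .expected_response = 200;
}

backend www1 {
    .host = "192.0.2.10";
    .port = "80";
    .probe = healthcheck;
}

backend www2 {
    .host = "192.0.2.11";
    .port = "80";
    .probe = healthcheck;
}

# Load balanced group; sick backends are skipped.
director www round-robin {
    { .backend = www1; }
    { .backend = www2; }
}

sub vcl_recv {
    set req.backend = www;
}
```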
Error handling
๏ Some backend may be sick for a particular object
‣ Other objects from the same backend can still be accessed
- Unless more than a set number of objects are added to the saint mode blacklist for a specific backend
๏ Do not request the object from that backend again for a period of time
‣ Grace mode is used when all possible backends for the requested object have been blacklisted
๏ Complement backend probes
Saint mode
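A minimal saint mode sketch for Varnish 3.x, assuming that any 5xx response should blacklist the object on that backend. The 20s period is an arbitrary example value.

```vcl
sub vcl_fetch {
    if (beresp.status >= 500) {
        # Do not ask this backend for this object during the next 20s.
        set beresp.saintmode = 20s;
        # Retry the request; another backend or a graced object may be used.
        return (restart);
    }
}
```

The number of restarts is bounded by the max_restarts parameter, so a request cannot loop forever when every backend keeps failing.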
Error handling
๏ A graced object is an object that has expired, but is still kept in cache
‣ beresp.ttl vs. beresp.grace
๏ Graced objects are used to
‣ Serve outdated content if the backend is down
- Probes or saint mode are required for this
‣ Serve slightly stale content while fresh versions are fetched
Grace mode
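A sketch of a typical Varnish 3.x grace setup, assuming the backend in use has a probe attached. The durations are illustrative only.

```vcl
sub vcl_fetch {
    # Keep objects for up to one hour past their TTL.
    set beresp.grace = 1h;
}

sub vcl_recv {
    if (req.backend.healthy) {
        # Healthy backend: accept only slightly stale objects.
        set req.grace = 30s;
    } else {
        # Sick backend: accept anything still kept in cache.
        set req.grace = 1h;
    }
}
```

beresp.grace controls how long an expired object is kept, while req.grace controls how stale an object a given request is willing to accept.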
Beyond caching policy
๏ Why restrict VCL / VMODs to implementing the caching policy?
๏ Any logic modeled in VCL / VMODs is compiled, embedded & executed in the caching edge layer
‣ 1000x faster than typical Java / PHP apps
- Strong restrictions
‣ Accounting, paywalling, A/B testing…
varnishtest
๏ Powerful Varnish-specific testing tool
‣ Mocked clients & backends executing / processing HTTP requests against real Varnish Cache Plus instances
‣ http://www.clock.co.uk/...varnishtest
๏ Essential when implementing complex VCL logic
๏ Easily integrable in any CI infrastructure
FAQ
๏ When will SSL support be implemented?
‣ "[...] huge waste of time and effort to even think about it."
๏ When will SPDY support be implemented?
‣ "[...] Varnish is not speedy, Varnish is fast! [...]"
๏ What is the recommended value for this bizarre kernel / varnishd parameter I found in some random blog?
‣ Use Varnish Tuner + Fine tune based on necessity
‣ Pay attention to workspaces & syslog messages
Overview
๏ Updated objects may be available before TTL expiration
‣ Purges
‣ Forced misses
‣ Bans
‣ Hash Two / Hash Ninja / …
Purges
๏ VCL
๏ Eagerly discards an object along with all its variants
Overview
acl internal {
    "localhost";
    "192.168.55.0"/24;
}

sub vcl_recv {
    if (req.request == "PURGE") {
        if (client.ip !~ internal) {
            error 405 "Not allowed.";
        }
        return (lookup);
    }
}

sub vcl_hit {
    if (req.request == "PURGE") {
        purge;
        error 200 "Purged.";
    }
}

sub vcl_miss {
    if (req.request == "PURGE") {
        purge;
        error 200 "Purged.";
    }
}
Purges
๏ What if the new object cannot be fetched after the invalidation?
‣ Soft-purge VMOD
‣ Forced misses
๏ What if multiple objects need to be invalidated? What if objects need to be invalidated too frequently?
‣ Bans
‣ Hash Two
Downsides I
Purges
๏ How to invalidate hitpass objects?
‣ Not possible in Varnish Cache Plus 3.x
- Redesigned in Varnish Cache Plus 4.x
- https://www.varnish-cache.org/trac/.../1033
‣ return(pass); during vcl_recv is preferred when possible
Downsides II
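Since hitpass objects cannot be invalidated in 3.x, the decision not to cache is best taken up front whenever the request alone is enough to know it. A minimal sketch, with hypothetical URL prefixes:

```vcl
sub vcl_recv {
    # Hypothetical never-cacheable areas of the site: bypass the cache
    # here instead of creating hitpass objects later in vcl_fetch.
    if (req.url ~ "^/(cart|account)(/|$)") {
        return (pass);
    }
}
```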
Forced misses
๏ VCL
๏ Forces a cache miss for the request
‣ Useful for cache priming scripts
Overview
sub vcl_recv {
    if (req.http.X-Priming-Script) {
        ...
        set req.hash_always_miss = true;
    }
    ...
}
Forced misses
๏ Object will always be (re)fetched from the backend
๏ New object is put into cache and used from that point onward
‣ Old object is not evicted until it’s safe to do so
‣ Controls who takes the penalty of waiting for an updated object
๏ Old objects are not freed up until expiration
‣ This is considered a flaw and a fix is expected
Behavior
Bans
๏ VCL or CLI
๏ Lazily discards multiple objects matching an expression
‣ Logical operators + Object attributes + Regular expressions
‣ Only works on objects already in the cache
๏ Ban lurker
‣ Frees up memory + Keeps the ban list at a manageable size
‣ obj.* based expressions
Overview
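Bans
Example
sub vcl_recv {
    if (req.request == "BAN") {
        ...
        if (!req.http.X-Ban-Url-Regexp) {
            error 400 "Empty URL regexp.";
        }
        ban("obj.http.X-Url ~ " + req.http.X-Ban-Url-Regexp);
        # Answer the BAN request instead of forwarding it to the backend.
        error 200 "Ban added.";
    }
}

sub vcl_fetch {
    # Store the URL in the object so bans can use obj.* only.
    set beresp.http.X-Url = req.url;
}

sub vcl_deliver {
    # Do not leak the helper header to clients.
    unset resp.http.X-Url;
}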
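Ban expressions can combine several object attributes with logical operators. The sketch below extends the idea to match on host and URL at the same time. The X-Host helper header and the X-Ban-Url-Regexp request header are illustrative names, and access control (e.g. an ACL check) is omitted for brevity.

```vcl
sub vcl_fetch {
    # Copy request attributes into the object so bans can be
    # expressed using obj.* only (ban-lurker friendly).
    set beresp.http.X-Host = req.http.host;
    set beresp.http.X-Url = req.url;
}

sub vcl_recv {
    if (req.request == "BAN") {
        # Exact match on the host, regular expression match on the URL.
        ban("obj.http.X-Host == " + req.http.host +
            " && obj.http.X-Url ~ " + req.http.X-Ban-Url-Regexp);
        error 200 "Ban added.";
    }
}

sub vcl_deliver {
    # Do not leak the helper headers to clients.
    unset resp.http.X-Host;
    unset resp.http.X-Url;
}
```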
Hash Two
๏ VCL + VMOD
๏ Works around the scalability limits of bans
Overview
HTTP/1.x 200 OK
Transfer-Encoding: chunked
...
X-Tags: C10 P42 P236 P857
...
ban obj.http.X-Tags ~ "(\s|^)P42(\s|$)"
Hash Two
Example
import hashtwo;

sub vcl_recv {
    if (req.request == "PURGE") {
        ...
        if (hashtwo.purge(req.http.X-Tag) != 0) {
            error 200 "Purged.";
        } else {
            error 404 "Not found.";
        }
    }
}

sub vcl_fetch {
    set beresp.http.X-HashTwo = beresp.http.X-Tags;
}
Cache related headers
๏ Expires
๏ Cache-Control
๏ Last-Modified
๏ If-Modified-Since
๏ If-None-Match
๏ ETag
๏ Pragma
๏ Vary
๏ Age
Cache-Control
๏ Specifies directives that must be applied by all caching mechanisms (from Varnish Cache Plus to browser cache)
Overview
‣ public | private
‣ no-store
‣ no-cache
‣ max-age
‣ s-maxage
‣ must-revalidate
‣ no-transform
‣ …
Cache-Control
๏ Ignored in incoming client HTTP requests
๏ Only s-maxage & max-age used in backend HTTP responses to calculate default TTL
‣ Always overrides Expires header
‣ Beware of Age header in client responses
- Objects not cached client side
- https://www.varnish-cache.org/...Caching
beresp.ttl
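The Age caveat arises because a client counts the Age value against max-age. An object that has been in Varnish for longer than its max-age therefore reaches the browser already stale. One commonly used workaround, sketched here under the assumption that downstream caching matters more than exact freshness accounting, is to reset the header on delivery:

```vcl
sub vcl_deliver {
    # Assumption: clients should treat every delivered object as fresh.
    set resp.http.Age = "0";
}
```

The trade-off is that clients may keep content for longer than the origin intended, so sending a separate, shorter max-age to clients is an alternative worth considering.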
Vary
๏ Indicates the response returned by the backend server may vary depending on headers received in the request
๏ Object variants & Hit ratio
‣ Vary: Accept-Encoding
- Normalization of Accept-Encoding header is not required
‣ Vary: User-Agent
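Varying on the raw User-Agent header creates nearly one variant per browser build and ruins the hit ratio. A possible sketch, assuming vcl_recv has already reduced the User-Agent to a small X-Device request header (as in the poor man's device detection shown earlier), is to vary on that normalized header inside Varnish and expose User-Agent only to downstream caches:

```vcl
sub vcl_fetch {
    # Store one variant per device class instead of one per User-Agent.
    if (beresp.http.Vary) {
        set beresp.http.Vary = beresp.http.Vary + ", X-Device";
    } else {
        set beresp.http.Vary = "X-Device";
    }
}

sub vcl_deliver {
    # Clients never see X-Device, so advertise the header it derives from.
    if (resp.http.Vary) {
        set resp.http.Vary = regsub(resp.http.Vary, "X-Device", "User-Agent");
    }
}
```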
Overview
๏ Break objects into smaller fragments
‣ Separate cache policy for each fragment
‣ Increase hit ratio
๏ Tools
‣ Edge Side Includes (ESI)
‣ AJAX
- Beware of RTT & Cross domain policy
Edge Side Includes
๏ Subset of ESI Language Specification 1.0
‣ <esi:include src="<URL>" />
‣ <esi:remove>...</esi:remove>
‣ <!--esi ... -->
๏ set beresp.do_esi = true;
‣ Separate Varnish requests
๏ Testing ESI in dev environment
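A minimal sketch of enabling ESI processing in Varnish 3.x. The /fragments/ prefix and the TTL are hypothetical and only illustrate that each fragment can carry its own cache policy.

```vcl
sub vcl_fetch {
    # Only parse HTML responses for ESI tags.
    if (beresp.http.Content-Type ~ "^text/html") {
        set beresp.do_esi = true;
    }

    # Hypothetical fragment path: give fragments their own, shorter TTL.
    if (req.url ~ "^/fragments/") {
        set beresp.ttl = 30s;
    }
}
```

Each esi:include found in a parsed page is fetched as a separate Varnish request, so it goes through vcl_recv, lookup and vcl_fetch like any other URL.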
Overview
๏ Central control of Varnish Cache Plus servers
‣ Web UI + RESTful API
- Super Fast Purger
๏ Cache group management
‣ Real time statistics, VCL editor, ban submission…
๏ Varnish Agent 2
Super Fast Purger
๏ High performance intermediary distributing invalidation requests to groups of Varnish Cache Plus servers
‣ Leverages speed & flexibility of VCL
‣ Keep-alive workaround
๏ Part of the VAC RESTful API
‣ Trivially integrable in existing applications
Change management
๏ Easily integrable using the VAC RESTful API
‣ git, Mercurial… hooks
‣ Jenkins, Travis, GitLab… CI scripts
๏ Manual VCL bundle generation
๏ Orchestrated / programmed deployments, rollbacks, etc.
Overview
๏ Real-time aggregated statistics
‣ Multiple vstatdprobe daemons
‣ One vstatd daemon
‣ JSON + Time series API
๏ VSM log based
‣ Efficient circular in-memory data structure
‣ std.log("vcs-key:" + <key suffix>);
Some ideas
๏ Trending articles or sale products
๏ Cache hits and cache misses
๏ URLs with long load times
๏ URLs with the most 5xx response codes
๏ Where traffic is coming from
๏ …
Example
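import std;

sub vcl_deliver {
    # One key per host, one per host + URL, plus a global counter.
    std.log("vcs-key:" + req.http.host);
    std.log("vcs-key:" + req.http.host + req.url);
    std.log("vcs-key:TOTAL");

    # Extra key counting cache misses only.
    if (obj.hits == 0) {
        std.log("vcs-key:MISS");
    }
}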
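In the same spirit, the "trending articles" idea could be sketched by emitting one key per article URL. The /articles/ prefix and the ARTICLE_ key prefix are hypothetical.

```vcl
import std;

sub vcl_deliver {
    # Hypothetical URL layout: one VCS key per article page.
    if (req.url ~ "^/articles/") {
        std.log("vcs-key:ARTICLE_" + req.url);
    }
}
```

A common key prefix makes it possible to select just these keys later using the regular expression matching of the VCS API.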
API I
๏ Stats (#requests, #misses, avg ttfb, acc body bytes, #2xx, #3xx…) for key named "example.com" during the last time window
‣ GET /key/example.com
๏ Keys that produced the most 5xx responses during the last time window
‣ GET /all/top_5xx
๏ Top 5 requested keys during the last time window
‣ GET /all/top/5?verbose=1
API II
๏ Top 10 most requested keys ending with '.gif' during the last time window
‣ GET /match/(.*)%5C.gif$/top
๏ Top 50 slowest backend requests aggregating the last 20 time windows
‣ GET /all/top_ttfb/50?b=20
Overview
๏ VMOD
๏ DeviceAtlas
‣ https://deviceatlas.com
‣ Database locally deployed & Daily updated
๏ OSS alternatives
‣ https://github.com/serbanghita/Mobile-Detect
‣ …
Example
import deviceatlas;
sub vcl_recv {
    if (deviceatlas.lookup(req.http.User-Agent, "isMobilePhone") == "1") {
        set req.http.X-Device = "mobile";
    } elsif (deviceatlas.lookup(req.http.User-Agent, "isTablet") == "1") {
        set req.http.X-Device = "tablet";
    } else {
        set req.http.X-Device = "desktop";
    }
}
Some ideas
๏ Redirections based on device properties
๏ Backend selection based on device properties
๏ Normalization of the UA header
‣ Caching different versions (i.e. Vary header) of the same object based on normalized UAs
๏ …
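As an example of the first idea, a device-based redirection could be sketched as follows in Varnish 3.x, assuming X-Device has already been set in vcl_recv. The host names and the internal 750 status code are arbitrary choices for illustration.

```vcl
sub vcl_recv {
    # Send mobile users of the main site to a (hypothetical) mobile host.
    if (req.http.X-Device == "mobile" &&
        req.http.host == "www.example.com") {
        # 750 is an arbitrary internal marker handled in vcl_error.
        error 750 "http://m.example.com" + req.url;
    }
}

sub vcl_error {
    if (obj.status == 750) {
        # Turn the internal marker into a real HTTP redirect.
        set obj.http.Location = obj.response;
        set obj.status = 302;
        return (deliver);
    }
}
```

Because the redirect is generated at the edge, the backend never sees the request.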
Highlights
๏ Client / backend thread split
‣ Background content refreshing
๏ Redesigned purges
‣ return(purge); during vcl_recv
๏ Directors implemented as VMODs
‣ Consistent hashing director
๏ Distinction between error & synthetic responses
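To give a feel for two of the changes above, this is roughly how a purge handler looks in VCL 4.0, with return(purge) in vcl_recv and a synthetic response instead of error. The backend address is a placeholder.

```vcl
vcl 4.0;

# Placeholder backend so that the fragment is a complete VCL file.
backend default {
    .host = "127.0.0.1";
    .port = "8080";
}

acl internal {
    "localhost";
}

sub vcl_recv {
    if (req.method == "PURGE") {
        if (client.ip !~ internal) {
            # Synthetic responses replace the 3.x error keyword.
            return (synth(405, "Not allowed."));
        }
        # No more purge handling in vcl_hit / vcl_miss.
        return (purge);
    }
}
```

Note that req.request from 3.x is called req.method in 4.0.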