Varnish Web Accelerator

Getting going - simplest possible setup

part1_basic.cfm

<CFHEADER NAME="Cache-Control" VALUE="s-maxage=600">

<cfoutput>
<html>
	<body>
		<strong>Generated by server:</strong> #dateformat(now(),'ddd, mmm d yyyy')# #timeformat(now(),'HH:mm:ss')#<br>
		<strong>Loaded by browser:</strong> <script type="text/javascript">document.write(new Date());</script>
	</body>
</html>
</cfoutput>
  • The default varnish installation serves cached content from port 6081 proxied from 127.0.0.1:80.
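  • Important: the default varnish config is appended to yours, so any of your subroutines that doesn't return falls through to the default logic.
  • Cookies (i.e. sessions) should be disabled, as by default varnish will not cache a request that carries a Cookie header or a response that sets one.
  • s-maxage sets the varnish cache lifetime separately from the browser cache - you can flush the varnish cache, but not a client's.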
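
To make the varnish cache / browser cache split concrete, here is a minimal Python sketch of how a shared cache and a browser would each read a Cache-Control header. It is not part of varnish: the function name cache_lifetimes is made up for illustration, and it assumes the usual rule that shared caches prefer s-maxage over max-age while browsers only look at max-age.

```python
def cache_lifetimes(cache_control):
    """Return (shared_cache_ttl, browser_ttl) in seconds, None when unset.

    Hypothetical helper: a simplified reading of Cache-Control that ignores
    directives such as private, no-store and no-cache.
    """
    directives = {}
    for part in cache_control.split(","):
        name, _, value = part.strip().partition("=")
        directives[name.lower()] = value

    max_age = int(directives["max-age"]) if "max-age" in directives else None
    s_maxage = int(directives["s-maxage"]) if "s-maxage" in directives else None

    # Shared caches (such as varnish) prefer s-maxage; browsers only use max-age.
    shared_ttl = s_maxage if s_maxage is not None else max_age
    return shared_ttl, max_age


print(cache_lifetimes("s-maxage=600"))
print(cache_lifetimes("max-age=60, s-maxage=600"))
```

With the header from part1_basic.cfm, only the shared cache gets a lifetime, which is why the "Generated by server" time can lag behind the "Loaded by browser" time.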

Static assets

/etc/varnish/default.vcl

sub vcl_recv {
    if (req.url ~ "\.(jpg|jpeg|gif|png|ico|css|zip|pdf|txt|tar|wav|bmp|rtf|js|flv|swf|html|htm)$") {
        unset req.http.Accept-Encoding;
        // Remove user agent
        if (req.http.User-Agent) {
            set req.http.User-Agent = "";
        }
        unset req.http.Cookie;
        return(lookup);
    }
}

virtual.conf

<VirtualHost *:80>
    ...
    
    # Cache media files for about a year (A29030400 = 29,030,400 seconds, i.e. 336 days, after the request)
    <FilesMatch "\.(jpg|jpeg|gif|png|ico|css|zip|pdf|txt|tar|wav|bmp|rtf|js|flv|swf|html|htm)$">
        ExpiresDefault A29030400
        Header append Cache-Control "public"
    </FilesMatch>
</VirtualHost>

Tips

  • Leave the default backend (127.0.0.1) in place.
  • ~ is a regex operator for strings
  • set and unset update and remove a property
  • req.http properties are request headers
  • this configuration leaves control of cache timeouts with Apache, but you can force a cache timeout in varnish by adding a line like "set beresp.ttl = 48h;" to vcl_fetch
  • return(lookup) returns control to varnish, requesting a cache lookup if possible
  • if there is no return in your subroutine, the default varnish config for that subroutine is executed
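
Because ~ is a plain regex match against req.url, it helps to try the pattern against a few URLs. The Python sketch below reuses the same pattern; is_static is a made-up name, and the sample URLs are assumptions chosen to show the edge cases.

```python
import re

# Same pattern as the req.url test in vcl_recv above.
STATIC = re.compile(
    r"\.(jpg|jpeg|gif|png|ico|css|zip|pdf|txt|tar|wav|bmp|rtf|js|flv|swf|html|htm)$"
)


def is_static(url):
    """True when the VCL condition on req.url would match (hypothetical helper)."""
    return STATIC.search(url) is not None


for url in ["/css/site.css", "/css/site.css?v=2", "/images/LOGO.JPG", "/index.cfm"]:
    print(url, is_static(url))
```

req.url includes the query string and the match is case-sensitive, so "/css/site.css?v=2" and "/images/LOGO.JPG" are not treated as static by this pattern.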

Flush Varnish cache from backend

/etc/varnish/default.vcl

// IPs/domains that can access the purge url
acl purge {
    "localhost";
    "117.53.174.178";
    "117.53.174.179";
    "203.26.11.39";
}

sub vcl_recv {
    // Purge everything url - this isn't the squid way, but works
    if (req.url ~ "^/varnishpurge") {
        if (!client.ip ~ purge) {
            error 405 "Not allowed.";
        }
        if (req.url == "/varnishpurge") {
            ban("req.http.host == " + req.http.host + " && req.url ~ ^/");
            error 841 "Purged site.";
        }
        else {
            ban("req.http.host == " + req.http.host + " && req.url ~ ^" + regsub(req.url, "^/varnishpurge(.*)$", "\1") + "$");
            error 842 "Purged page.";
        }
    }
}
  • it's a good idea to use an ACL to restrict access to the sensitive functionality like flushing
  • notice that for ACLs, the ~ operator is an 'in' check
  • this VCL sets up a URL, '/varnishpurge', that triggers a flush
  • /varnishpurge by itself purges every page on the domain
  • /varnishpurge/url/you/want/to/purge purges /url/you/want/to/purge
  • 'ban' is for varnish 3 what 'purge' was for 2
  • in 3 you can do "string" + myvar, in 2 it was just "string" myvar (implicit concatenation)
  • notice the custom error numbers
  • bans are stored in memory, and every page request is checked against every ban - there are performance implications
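
The ban string is just text glued together from the request, so it can be checked outside varnish. This Python sketch mirrors the two branches above; ban_expression is a made-up name and example.com is a placeholder host.

```python
import re


def ban_expression(host, url):
    """Build the string handed to ban() for a /varnishpurge request (sketch)."""
    if url == "/varnishpurge":
        # Purge every page on this host.
        return "req.http.host == " + host + " && req.url ~ ^/"
    # Mirrors the regsub() call in the VCL: keep whatever follows /varnishpurge.
    target = re.sub(r"^/varnishpurge(.*)$", r"\1", url, count=1)
    return "req.http.host == " + host + " && req.url ~ ^" + target + "$"


print(ban_expression("example.com", "/varnishpurge"))
print(ban_expression("example.com", "/varnishpurge/news/latest"))
```

The purged path ends up inside a regex, so characters such as "." or "?" in it are treated as regex syntax rather than literal text.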

Invalidate cache from client side

By IP

/etc/varnish/default.vcl

// IPs/domains that bypass cache
acl bypass {
    "1.2.3.4";
}

sub vcl_recv {
    if (client.ip ~ bypass) {
        return(pass);
    }
}

return(pass) sends the request straight to the backend, and the response is not cached
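
The 'in' check that ~ performs against an ACL can be modelled with Python's ipaddress module. This is only a sketch: BYPASS stands in for the acl block above, and in_acl / recv_action are made-up names.

```python
import ipaddress

# Hypothetical stand-in for the `acl bypass { ... }` block above.
BYPASS = [ipaddress.ip_network("1.2.3.4/32")]


def in_acl(client_ip, acl):
    """True when client_ip falls inside any network in the ACL."""
    addr = ipaddress.ip_address(client_ip)
    return any(addr in network for network in acl)


def recv_action(client_ip):
    # "default" means vcl_recv did not return, so the default VCL decides.
    return "pass" if in_acl(client_ip, BYPASS) else "default"


print(recv_action("1.2.3.4"))
print(recv_action("1.2.3.5"))
```

VCL ACL entries can also carry a mask (written like "10.0.0.0"/8), which corresponds to a wider network in the list here.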

By Cookie

/etc/varnish/default.vcl

sub vcl_recv {
    if (req.http.Cookie ~ "LOGGED-IN=1") {
        return(pass);
    }
}

bypasses cache if there is a LOGGED-IN cookie with value 1
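
Note that the cookie test is a substring regex, not an exact cookie lookup. The Python sketch below compares it with a tighter pattern; STRICT is a hypothetical alternative, not something from the config above, and the sample cookie strings are invented.

```python
import re

LOOSE = re.compile(r"LOGGED-IN=1")                # pattern used in the VCL above
STRICT = re.compile(r"(^|;\s*)LOGGED-IN=1(;|$)")  # hypothetical tighter variant


def bypasses_cache(cookie, pattern):
    """True when the cookie header would trigger return(pass)."""
    return pattern.search(cookie) is not None


samples = ["LOGGED-IN=1", "CFID=7; LOGGED-IN=1; x=y", "LOGGED-IN=10", "NOT-LOGGED-IN=1"]
for cookie in samples:
    print(cookie, bypasses_cache(cookie, LOOSE), bypasses_cache(cookie, STRICT))
```

The loose pattern matches all four samples. That may be intended (it also catches prefixed names such as FC-LOGGED-IN=1), but it is worth knowing when choosing cookie names.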

By Header

/etc/varnish/default.vcl

sub vcl_recv {
    if (req.http.X-Requested-With == "XMLHttpRequest") {
        return(pass);
    }
}

ajax requests (those sent with an X-Requested-With: XMLHttpRequest header) bypass cache

Caching strategies: you can set and flush the cache, now for real world scenarios

Bomb proof static delivery: fullasagoog.com

/etc/varnish/default.vcl

sub vcl_recv {
    set req.grace = 300s;
}

sub vcl_fetch {
    if (beresp.status == 500) {
       set beresp.saintmode = 20s;
       if (req.request != "POST") {
           return(restart);
       } else {
           error 500 "Failed";
       }
    }
    set beresp.grace = 300s;
    set beresp.ttl = 5s;
}

 

  • req.grace defines how stale a cached object can be and still be served if there is a backend problem
  • beresp.saintmode flags the current page on the current back end as "bad" for the given time
  • return(restart) loops varnish around to retry the request - if there is no "good" backend for the page, the grace will kick in
  • POST requests can't really be restarted, so they get an error instead
  • beresp.grace is how long an object is kept past its ttl, so it is effectively the maximum grace a request can have
  • beresp.ttl is the cache timeout - setting it overrides the value in the Cache-Control header
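
How ttl and the two grace values interact can be sketched as a small decision function. This is a simplified, hypothetical model in Python, not varnish's actual code: it assumes stale content is only served when the backend is unhealthy, and ignores the case where stale content is served while another request is already fetching a fresh copy.

```python
def serve_decision(age, ttl, req_grace, obj_grace, backend_healthy):
    """Simplified model of the ttl and grace checks; all values in seconds."""
    if age <= ttl:
        return "fresh hit"
    # An expired object is kept for obj_grace (beresp.grace) seconds, and a
    # request only accepts it while it is also within req_grace (req.grace).
    grace = min(req_grace, obj_grace)
    if not backend_healthy and age <= ttl + grace:
        return "stale hit (grace)"
    return "fetch from backend"


# Values from the config above: ttl 5s, both graces 300s.
print(serve_decision(3, 5, 300, 300, True))
print(serve_decision(100, 5, 300, 300, False))
print(serve_decision(400, 5, 300, 300, False))
```

With a 5 second ttl and 300 seconds of grace, pages are refreshed almost continuously while the backend is up, and can still be served for about five minutes after it goes down.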

 

Members vs The great unwashed: adnews

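  • flat proxy for anonymous users
  • live site for logged in users - the VCL looks for the logged-in cookies
  • this VCL is written for varnish 2.1; in varnish 3, "set req.hash +=" becomes hash_data(), strings are joined with +, beresp.cacheable is gone, and return(pass) in vcl_fetch becomes return(hit_for_pass)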

/etc/varnish/default.vcl

sub vcl_hash { 
    ### these 2 entries are the default ones used for vcl. Below we add our own.
    set req.hash += req.http.host;
    set req.hash += req.url;

    set req.http.X-Varnish-Hashed-By = req.http.host req.url;
    
    if (req.http.Cookie ~ "LOGGED-IN=1") {
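       // regsub needs three arguments; assumption: only the LOGGED-IN=1 flag is meant to go into the hash
       set req.hash += regsub( req.http.Cookie, "^.*?(LOGGED-IN=1).*$", "\1" );
       set req.http.X-Varnish-Hashed-By = req.http.X-Varnish-Hashed-By regsub( req.http.Cookie, "^.*?(LOGGED-IN=1).*$", "\1" );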
    }
    
    return(hash);
}
sub vcl_recv {
    // Let any non "GET" / "HEAD" right through
    if (req.request != "GET" && req.request != "HEAD"){
        return(pass);
    }
    
    // strip cookies for static files
    if (req.url ~ "\.(jpg|jpeg|gif|png|ico|css|zip|pdf|txt|tar|wav|bmp|rtf|js|flv|swf|html|htm)$") {
        unset req.http.Accept-Encoding;
        // Remove user agent
        if (req.http.User-Agent) {
            set req.http.User-Agent = "";
        }
        unset req.http.Cookie;
        return(lookup);
    }
    
    // strip cookies for everything except specific pages
    if ( !(req.url ~ "Login" || req.url ~ "login" || req.url ~ "logout") ) { 
        // Clean up accept encoding values
        # Handle compression correctly. Varnish treats headers literally, not
        # semantically. So it is very well possible that there are cache misses
        # because the headers sent by different browsers aren't the same.
        # @see: http://varnish.projects.linpro.no/wiki/FAQ/Compression
        if (req.http.Accept-Encoding) {
            if (req.http.Accept-Encoding ~ "gzip") {
                # if the browser supports it, we'll use gzip
                set req.http.Accept-Encoding = "gzip";
            } elsif (req.http.Accept-Encoding ~ "deflate") {
                # next, try deflate if it is supported
                set req.http.Accept-Encoding = "deflate";
            } else {
                # unknown algorithm. Probably junk, remove it
                unset req.http.Accept-Encoding;
            }
        }
    
        // Remove user agent
        if (req.http.User-Agent) {
            set req.http.User-Agent = "";
        }
	
        // If the user is logged in, let session cookies through (Note: these values should not be getting included in the cache hash)
        if (req.http.Cookie ~ "FC-LOGGED-IN=1") {
            set req.http.Cookie = ";" req.http.Cookie;
            set req.http.Cookie = regsuball(req.http.Cookie, "; +", ";");
            set req.http.Cookie = regsuball(req.http.Cookie, ";(CFID|CFTOKEN|JSESSIONID|FC-LOGGED-IN|FC-ROLES)=", "; \1=");
            set req.http.Cookie = regsuball(req.http.Cookie, ";[^ ][^;]*", "");
            set req.http.Cookie = regsuball(req.http.Cookie, "^[; ]+|[; ]+$", "");

            if (req.http.Cookie == "") {
                unset req.http.Cookie;
            }
        }

        return(lookup);
    }
    else {
        return(pass);
    }
}
sub vcl_fetch {
    // strip cookies for static files
    if (req.url ~ "\.(jpg|jpeg|gif|png|ico|css|zip|pdf|txt|tar|wav|bmp|rtf|js|flv|swf|html|htm)$") {
       unset beresp.http.set-cookie;
       return(deliver);
    }
    
    // strip cookies for everything except specific pages
    if ( req.url ~ "Login" || req.url ~ "login" || req.url ~ "logout" ) {
        set beresp.http.X-Cacheable = "NO:Login or logout page";
        return(pass);
    }
    else {
        unset beresp.http.set-cookie;
    }
    
    // varnish determined the object was not cacheable
    if (!beresp.cacheable){
       set beresp.http.X-Cacheable = "NO:Not Cacheable";
    }
    elsif (beresp.http.Cache-Control ~ "private"){
       set beresp.http.X-Cacheable = "NO:Cache-Control=private";
       return(pass);
    }
    else {
        // Keep stale content for a little while, to serve while varnish gets a fresh copy
        set beresp.grace = 15s;

        set beresp.http.X-Cacheable = "YES";
    }
    return(deliver);
}
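
The regsuball chain in vcl_recv above is a cookie whitelist, and it is easier to follow when run step by step. The Python sketch below applies the same five substitutions in the same order; filter_cookies is a made-up name and the sample cookie strings are invented.

```python
import re

# Cookie names let through to the backend (same list as the VCL above).
KEEP = "CFID|CFTOKEN|JSESSIONID|FC-LOGGED-IN|FC-ROLES"


def filter_cookies(cookie):
    """Drop every cookie except the whitelisted ones (sketch of the VCL steps)."""
    cookie = ";" + cookie                                    # every cookie now starts with ";"
    cookie = re.sub(r"; +", ";", cookie)                     # remove the space after each ";"
    cookie = re.sub(r";(" + KEEP + r")=", r"; \1=", cookie)  # mark keepers with "; "
    cookie = re.sub(r";[^ ][^;]*", "", cookie)               # delete the unmarked cookies
    cookie = re.sub(r"^[; ]+|[; ]+$", "", cookie)            # trim leftover separators
    return cookie


print(filter_cookies("foo=bar; CFID=123; _ga=GA1.2; FC-LOGGED-IN=1"))
print(repr(filter_cookies("_ga=1; foo=2")))
```

The second call leaves an empty string, which is the case where the VCL unsets the Cookie header altogether.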

User interactivity: adnews

  • form posts (e.g. comment forms) are not cached, only GET and HEAD requests are - the VCL passes every other request method straight to the backend

/etc/varnish/default.vcl

sub vcl_hash { 
    ### these 2 entries are the default ones used for vcl. Below we add our own.
    set req.hash += req.http.host;
    set req.hash += req.url;

    set req.http.X-Varnish-Hashed-By = req.http.host req.url;
    
    if (req.http.Cookie ~ "LOGGED-IN=1") {
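       // regsub needs three arguments; assumption: only the LOGGED-IN=1 flag is meant to go into the hash
       set req.hash += regsub( req.http.Cookie, "^.*?(LOGGED-IN=1).*$", "\1" );
       set req.http.X-Varnish-Hashed-By = req.http.X-Varnish-Hashed-By regsub( req.http.Cookie, "^.*?(LOGGED-IN=1).*$", "\1" );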
    }
    
    return(hash);
}
sub vcl_recv {
    // Let any non "GET" / "HEAD" right through
    if (req.request != "GET" && req.request != "HEAD"){
        return(pass);
    }
    
    // strip cookies for static files
    if (req.url ~ "\.(jpg|jpeg|gif|png|ico|css|zip|pdf|txt|tar|wav|bmp|rtf|js|flv|swf|html|htm)$") {
        unset req.http.Accept-Encoding;
        // Remove user agent
        if (req.http.User-Agent) {
            set req.http.User-Agent = "";
        }
        unset req.http.Cookie;
        return(lookup);
    }
    
    // strip cookies for everything except specific pages
    if ( !(req.url ~ "Login" || req.url ~ "login" || req.url ~ "logout") ) { 
        // Clean up accept encoding values
        # Handle compression correctly. Varnish treats headers literally, not
        # semantically. So it is very well possible that there are cache misses
        # because the headers sent by different browsers aren't the same.
        # @see: http://varnish.projects.linpro.no/wiki/FAQ/Compression
        if (req.http.Accept-Encoding) {
            if (req.http.Accept-Encoding ~ "gzip") {
                # if the browser supports it, we'll use gzip
                set req.http.Accept-Encoding = "gzip";
            } elsif (req.http.Accept-Encoding ~ "deflate") {
                # next, try deflate if it is supported
                set req.http.Accept-Encoding = "deflate";
            } else {
                # unknown algorithm. Probably junk, remove it
                unset req.http.Accept-Encoding;
            }
        }
    
        // Remove user agent
        if (req.http.User-Agent) {
            set req.http.User-Agent = "";
        }
	
        // If the user is logged in, let session cookies through (Note: these values should not be getting included in the cache hash)
        if (req.http.Cookie ~ "FC-LOGGED-IN=1") {
            set req.http.Cookie = ";" req.http.Cookie;
            set req.http.Cookie = regsuball(req.http.Cookie, "; +", ";");
            set req.http.Cookie = regsuball(req.http.Cookie, ";(CFID|CFTOKEN|JSESSIONID|FC-LOGGED-IN)=", "; \1=");
            set req.http.Cookie = regsuball(req.http.Cookie, ";[^ ][^;]*", "");
            set req.http.Cookie = regsuball(req.http.Cookie, "^[; ]+|[; ]+$", "");

            if (req.http.Cookie == "") {
                unset req.http.Cookie;
            }
        }

        return(lookup);
    }
    else {
        return(pass);
    }
}
sub vcl_fetch {
    // strip cookies for static files
    if (req.url ~ "\.(jpg|jpeg|gif|png|ico|css|zip|pdf|txt|tar|wav|bmp|rtf|js|flv|swf|html|htm)$") {
       unset beresp.http.set-cookie;
       return(deliver);
    }
    
    // Let any non "GET" / "HEAD" right through
    if (req.request != "GET" && req.request != "HEAD"){
        set beresp.http.X-Cacheable = "NO:Not GET or HEAD";
        return(pass);
    }
    // strip cookies for everything except specific pages
    elsif ( req.url ~ "Login" || req.url ~ "login" || req.url ~ "logout" ) {
        set beresp.http.X-Cacheable = "NO:Login or logout page";
        return(pass);
    }
    else {
        unset beresp.http.set-cookie;
    }
    
    // varnish determined the object was not cacheable
    if (!beresp.cacheable){
       set beresp.http.X-Cacheable = "NO:Not Cacheable";
    }
    elsif (beresp.http.Cache-Control ~ "private"){
       set beresp.http.X-Cacheable = "NO:Cache-Control=private";
       return(pass);
    }
    else {
        // Keep stale content for a little while, to serve while varnish gets a fresh copy
        set beresp.grace = 15s;

        set beresp.http.X-Cacheable = "YES";
    }
    return(deliver);
}

 

We use the following strategy to have user-specific content on a site:
  • unset all cookies for every page, with the exceptions below
  • all logging in and out is done on specific URLs that don't have their cookies reset, and aren't cached
  • users and content should be segmented into groups - the relevant groups can be set into a cookie on login, which can then be used in the cache hash so each group gets its own cached copy
  • personal content (e.g. "Welcome Bob!") can be AJAXed in, allowing the main content to be cached, and the personal content not
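
The routing that the vcl_recv above applies can be summarised as a small decision function. This Python sketch only models which requests are looked up in the cache and which are passed; recv_action is a made-up name and the URLs are invented examples.

```python
import re

# Patterns copied from the vcl_recv above.
STATIC = re.compile(
    r"\.(jpg|jpeg|gif|png|ico|css|zip|pdf|txt|tar|wav|bmp|rtf|js|flv|swf|html|htm)$"
)
AUTH_URL = re.compile(r"Login|login|logout")


def recv_action(method, url):
    """Return "pass" or "lookup" in the same order as the checks in vcl_recv."""
    if method not in ("GET", "HEAD"):
        return "pass"    # form posts go straight to the backend
    if STATIC.search(url):
        return "lookup"  # static files: cookies stripped, always cacheable
    if AUTH_URL.search(url):
        return "pass"    # login / logout URLs keep their cookies, never cached
    return "lookup"      # everything else is served from cache when possible


for method, url in [("POST", "/article/1/comment"), ("GET", "/css/site.css"),
                    ("GET", "/members/login"), ("GET", "/article/1")]:
    print(method, url, recv_action(method, url))
```

The login test is a substring match, so a content URL such as "/blog/login-tips" would also be passed; that is worth keeping in mind when naming pages.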

 

Personalized pods: adnews, newspaper works

  • personalized content is loaded with Ajax after the proxied page - the VCL does not cache those Ajax requests
 
sub vcl_recv {
    
    // strip cookies for static files
    if (req.url ~ "^/favicon" || req.url ~ "^/cache" || req.url ~ "^/css" || req.url ~ "^/js" || req.url ~ "^/wsimages" || req.url ~ "^/base" || req.url ~ "^/webtop/cffp" || req.url ~ "^/webtop/icons"){
    //if (req.url ~ "\.(jpg|jpeg|gif|png|ico|css|zip|pdf|txt|tar|wav|bmp|rtf|js|flv|swf|html|htm)$") {
        unset req.http.Accept-Encoding;
        // Remove user agent
        if (req.http.User-Agent) {
            set req.http.User-Agent = "";
        }
        unset req.http.Cookie;
        return(lookup);
    }
    
    // strip cookies for everything except specific pages
    if ( !(req.url ~ "profile" || req.url ~ "Login" || req.url ~ "login" || req.url ~ "logout") ) { 
        // Clean up accept encoding values
        # Handle compression correctly. Varnish treats headers literally, not
        # semantically. So it is very well possible that there are cache misses
        # because the headers sent by different browsers aren't the same.
        # @see: http://varnish.projects.linpro.no/wiki/FAQ/Compression
        if (req.http.Accept-Encoding) {
            if (req.http.Accept-Encoding ~ "gzip") {
                # if the browser supports it, we'll use gzip
                set req.http.Accept-Encoding = "gzip";
            } elsif (req.http.Accept-Encoding ~ "deflate") {
                # next, try deflate if it is supported
                set req.http.Accept-Encoding = "deflate";
            } else {
                # unknown algorithm. Probably junk, remove it
                unset req.http.Accept-Encoding;
            }
        }
    
        // Remove user agent
        if (req.http.User-Agent) {
            set req.http.User-Agent = "";
        }
	
        return(lookup);
    }
    else {
        return(pass);
    }
}
sub vcl_fetch {
    // strip cookies for static files
    if (req.url ~ "^/favicon" || req.url ~ "^/cache" || req.url ~ "^/css" || req.url ~ "^/js" || req.url ~ "^/wsimages" || req.url ~ "^/base" || req.url ~ "^/webtop/cffp" || req.url ~ "^/webtop/icons"){
    //if (req.url ~ "\.(jpg|jpeg|gif|png|ico|css|zip|pdf|txt|tar|wav|bmp|rtf|js|flv|swf|html|htm)$") {
       unset beresp.http.set-cookie;
       return(deliver);
    }
    
    // Let any non "GET" / "HEAD" right through
    if (req.request != "GET" && req.request != "HEAD"){
        set beresp.http.X-Cacheable = "NO:Not GET or HEAD";
        return(pass);
    }
    // strip cookies for everything except specific pages
    elsif ( req.url ~ "profile" || req.url ~ "Login" || req.url ~ "login" || req.url ~ "logout" ) {
        set beresp.http.X-Cacheable = "NO:Login or logout page";
        return(pass);
    }
    else {
        unset beresp.http.set-cookie;
    }
    
    // varnish determined the object was not cacheable
    if (!beresp.cacheable){
       set beresp.http.X-Cacheable = "NO:Not Cacheable";
    }
    elsif (beresp.http.Cache-Control ~ "private"){
       set beresp.http.X-Cacheable = "NO:Cache-Control=private";
       return(pass);
    }
    else {
        // Keep stale content for a little while, to serve while varnish gets a fresh copy
        set beresp.grace = 15s;

        set beresp.http.X-Cacheable = "YES";
    }
    return(deliver);
}

Tips

Normalising user agent and compression headers

Varnish keeps a separate cached copy for each distinct user-agent or accept-encoding value when the backend varies on those headers (e.g. Vary: Accept-Encoding), so normalising them is a good idea:

/etc/varnish/default.vcl

sub vcl_recv {
    if (req.http.Accept-Encoding) {
        if (req.http.Accept-Encoding ~ "gzip") {
            # if the browser supports it, we'll use gzip
            set req.http.Accept-Encoding = "gzip";
        } elsif (req.http.Accept-Encoding ~ "deflate") {
            # next, try deflate if it is supported
            set req.http.Accept-Encoding = "deflate";
        } else {
            # unknown algorithm. Probably junk, remove it
            unset req.http.Accept-Encoding;
        }
    }

    // Remove user agent
    if (req.http.User-Agent) {
        set req.http.User-Agent = "";
    }
}
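
To see what the normalisation buys, the Python sketch below applies the same gzip / deflate / neither rule to a handful of header values and counts the variants left over. normalise_accept_encoding is a made-up name and the header strings are only examples.

```python
import re


def normalise_accept_encoding(value):
    """Collapse an Accept-Encoding header to "gzip", "deflate" or None (unset)."""
    if value is None:
        return None
    if re.search(r"gzip", value):
        return "gzip"
    if re.search(r"deflate", value):
        return "deflate"
    return None  # unknown algorithm: the VCL unsets the header


headers = ["gzip, deflate", "gzip,deflate,sdch", "gzip, deflate, br", "deflate", "identity"]
variants = {normalise_accept_encoding(h) for h in headers}
print(len(headers), "header values ->", len(variants), "cache variants")
```

Five different header strings collapse to three variants, so at most three copies of a page are stored instead of one per browser spelling.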

If your website serves different HTML to mobile users, you might use code like this:

if (req.http.User-Agent ~ "iP(hone|od)" || req.http.User-Agent ~ "Android" || req.http.User-Agent ~ "Symbian" || req.http.User-Agent ~ "^BlackBerry" || req.http.User-Agent ~ "^SonyEricsson" || req.http.User-Agent ~ "^Nokia" || req.http.User-Agent ~ "^SAMSUNG" || req.http.User-Agent ~ "^LG" || req.http.User-Agent ~ " webOS") {
    set req.http.User-Agent = "mobile";
}
else {
    set req.http.User-Agent = "desktop";
}
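
The mobile test can be tried against real user agent strings before it goes live. This Python sketch joins the same patterns into one regex; device_class is a made-up name and the user agent strings are shortened examples.

```python
import re

# The individual req.http.User-Agent patterns from the VCL, joined with "|".
MOBILE = re.compile(
    r"iP(hone|od)|Android|Symbian|^BlackBerry|^SonyEricsson|^Nokia|^SAMSUNG|^LG| webOS"
)


def device_class(user_agent):
    """Return the value the VCL would write back into User-Agent."""
    return "mobile" if MOBILE.search(user_agent) else "desktop"


print(device_class("Mozilla/5.0 (iPhone; CPU iPhone OS 5_0 like Mac OS X)"))
print(device_class("Mozilla/5.0 (iPad; CPU OS 5_0 like Mac OS X)"))
print(device_class("Mozilla/5.0 (Windows NT 6.1; rv:10.0) Gecko/20100101 Firefox/10.0"))
```

With these patterns an iPad counts as desktop, because "iP(hone|od)" does not match "iPad", while anything mentioning Android counts as mobile, tablets included.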

Custom cache hash

This VCL adds a cookie value (ROLES) to the cache key. req.http.X-Varnish-Hashed-By is helpful for debugging, as you can inspect the request header to find out what the cache key is.

/etc/varnish/default.vcl

sub vcl_hash {
    ### these 2 entries are the default ones used for vcl. Below we add our own.
    set req.hash += req.http.host;
    set req.hash += req.url;
    
    set req.http.X-Varnish-Hashed-By = req.http.host req.url;
    
    if (req.http.Cookie ~ "ROLES=1") {
       set req.hash += regsub( req.http.Cookie, "^.*?ROLES=([^;]*);*.*$", "\1" );
       set req.http.X-Varnish-Hashed-By = req.http.X-Varnish-Hashed-By regsub( req.http.Cookie, "^.*?ROLES=([^;]*);*.*$", "\1" );
    }

    return(hash);
}
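
What ends up in X-Varnish-Hashed-By can be worked out by hand with the same regex. This Python sketch rebuilds the key; hashed_by is a made-up name, and the host, URL and cookie values are invented.

```python
import re


def hashed_by(host, url, cookie=None):
    """Rebuild the cache key string: host + url (+ the ROLES cookie value)."""
    key = host + url
    # Same test as the VCL: it only fires when the ROLES value starts with 1.
    if cookie and re.search(r"ROLES=1", cookie):
        # Mirrors the regsub() call: keep only the value of the ROLES cookie.
        key += re.sub(r"^.*?ROLES=([^;]*);*.*$", r"\1", cookie, count=1)
    return key


print(hashed_by("example.com", "/news"))
print(hashed_by("example.com", "/news", "CFID=9; ROLES=1,5; x=y"))
print(hashed_by("example.com", "/news", "CFID=9; ROLES=2"))
```

A cookie of ROLES=2 is not added to the key, since the condition only matches values beginning with 1; if every role set should get its own cached copy, the condition would need to be "ROLES=" instead.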

Pass along original IP address as a custom header

/etc/varnish/default.vcl

sub vcl_recv {
    // Pass along the IP for uncached pages
    if (req.http.x-forwarded-for) {
        set req.http.X-Forwarded-For = req.http.X-Forwarded-For ", " client.ip;
    } else {
        set req.http.X-Forwarded-For = client.ip;
    }
}
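
The header logic is simple enough to sketch from both ends. In this Python sketch, forwarded_for mirrors the VCL above and original_client is a hypothetical helper a backend might use to read the header; both names and the IP addresses are made up.

```python
def forwarded_for(existing, client_ip):
    """Value varnish sends to the backend as X-Forwarded-For."""
    if existing:
        return existing + ", " + client_ip
    return client_ip


def original_client(header):
    """First address in the chain (hypothetical backend-side helper)."""
    return header.split(",")[0].strip()


direct = forwarded_for(None, "203.0.113.7")
chained = forwarded_for("198.51.100.4", "203.0.113.7")
print(direct, "|", chained, "|", original_client(chained))
```

Anything already in the header when the request reaches varnish was supplied by the client or an upstream proxy, so the backend should only trust the entry varnish appended.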

Refining your caching strategy

Varnish tooling

  • showcase varnishhist
  • plus other command line utils (varnishstat, varnishlog, varnishtop, varnishncsa)