Varnish: ignore some cache cookies for hash computation

Varnish: ignore some cache cookies for hash computation - varnish

Situation::
Varnish needs to cache even in the presence of cookies on the request.
The request may contain N arbitrary cookies of which certain known cookies must not form part of the cache key. The arbitrary cookies don't contain any user sensitive data, eg. they are cosmetic helpers like is_authenticated=1.
The actual backend must receive the original set of cookies unmolested in the case of a cache miss.
I don't want to check for URL patterns in the VCL because that assumes too much knowledge of the backend.
This is surprisingly hard to solve. All solutions I've found so far assumes a whitelist for (2) whereas I need a blacklist. And most solutions delete cookies which are supposed to go through to the backend.

So how about (untested):
# We use builtin.vcl logic and 'return' from our vcl_recv in order
# to prevent default Varnish behaviour of not caching with cookies present
sub vcl_recv {
# your vcl_recv starts here
# ...
# your vcl_recv ends here
if (req.method == "PRI") {
/* We do not support SPDY or HTTP/2.0 */
return (synth(405));
}
if (req.method != "GET" &&
req.method != "HEAD" &&
req.method != "PUT" &&
req.method != "POST" &&
req.method != "TRACE" &&
req.method != "OPTIONS" &&
req.method != "DELETE") {
/* Non-RFC2616 or CONNECT which is weird. */
return (pipe);
}
if (req.method != "GET" && req.method != "HEAD") {
/* We only deal with GET and HEAD by default */
return (pass);
}
if (req.http.Authorization) {
/* Not cacheable by default */
return (pass);
}
return (hash);
}
sub vcl_hash {
set req.http.X-Cookie-Hash = regsub(req.http.cookie, "KNOWN1=[^;]+;", "");
set req.http.X-Cookie-Hash = regsub(req.http.X-Cookie-Hash, "KNOWN2=[^;]+;", "");
hash_data(req.http.X-Cookie-Hash);
}
Which removes each known cookie and hashes on the remainders :) Not ideal, because it can't guarantee the order of cookies in the header, but other from that should work.

Related

How to redirect all sub urls just to the main page

Super new to varnish. As the title states. I want to redirect anything under example.com/* to just https://example.com/
so far I've tried
if (client.ip != "127.0.0.1" && req.http.host == "example.com") {
set req.http.x-redir = "https://example.com";
error 850 "Moved Permanently";
}
Any thoughts on how I can do this?

Have a look at this specific section of a tutorial I wrote: https://www.varnish-software.com/developers/tutorials/redirect/#http-to-https-redirections.
Here's the code I would use for that:
vcl 4.1;
import proxy;
backend default {
.host = "127.0.0.1";
.port = 8080;
}
sub vcl_recv {
if ((req.http.X-Forwarded-Proto && req.http.X-Forwarded-Proto != "https") ||
(req.http.Scheme && req.http.Scheme != "https")) {
return (synth(750));
} elseif (!req.http.X-Forwarded-Proto && !req.http.Scheme && !proxy.is_ssl()) {
return (synth(750));
}
}
sub vcl_synth {
if (resp.status == 750) {
set resp.status = 301;
set resp.http.location = "https://" + req.http.Host + req.url;
set resp.reason = "Moved";
return (deliver);
}
}
This tutorial checks 3 things:
Whether or not the URL scheme of the request is sent via the X-Forwarded-Proto header
Whether or not the URL scheme of the request is sent via the X-Scheme header which is part of HTTP/2
Whether or not the PROXY protocol was used and the PROXY TLV attributes contains the scheme
When either of these checks concludes that plain HTTP is used, the return(synth(750)) return statement is used to return a synthetic response.
In the vcl_synth subroutine, status code 750 is caught and results in a 301 redirect to the HTTPS version of that request.
The X-Forwarded-Proto header should be set by your TLS PROXY if you're connecting to Varnish using regular HTTP.
If instead you're using the PROXY protocol to connect to Varnish, you should have a look at the following tutorial: https://www.varnish-software.com/developers/tutorials/proxy-protocol-varnish/

Varnish 6 missing requests for same URL coming from different browsers

This is how my varnish.vcl looks like.
vcl 4.0;
import directors;
import std;
backend client {
.host = "service1";
.port = "80";
}
sub vcl_recv {
std.log("varnish log info:" + req.http.host);
# caching pages in client
set req.backend_hint = client;
# If request is from conent or for pages remove headers and cache
if ((req.url ~ "/content/") || (req.url ~ "/cms/api/") || req.url ~ "\.(png|gif|jpg|jpeg|json|ico)$" || (req.url ~ "/_nuxt/") ) {
unset req.http.Cookie;
std.log("Cachable request");
}
# If request is not from above do not cache and pass to Backend.
else
{
std.log("Non cachable request");
return (pass);
}
}
sub vcl_backend_response {
if ((bereq.url ~ "/content/") || (bereq.url ~ "/cms/api/") || bereq.url ~ "\.(png|gif|jpg|jpeg|json|ico)$" || (bereq.url ~ "/_nuxt/") )
{
unset beresp.http.set-cookie;
set beresp.http.cache-control = "public, max-age=259200";
set beresp.ttl = 12h;
return (deliver);
}
}
# Add some debug info headers when delivering the content:
# X-Cache: if content was served from Varnish or not
# X-Cache-Hits: Number of times the cached page was served
sub vcl_deliver {
# Was a HIT or a MISS?
if ( obj.hits > 0 )
{
set resp.http.X-Cache-Varnish = "HIT";
}
else
{
set resp.http.X-Cache-Varnish = "MISS";
}
# And add the number of hits in the header:
set resp.http.X-Cache-Hits = obj.hits;
}
If I am hitting a page from same browser netwrok tab showing
X-Cache-Varnish = "HIT";
X-Cache-Hits = ;
Lets say if I hot from chrome 10 times this is what I get
X-Cache-Varnish = "HIT";
X-Cache-Hits = 9;
9 because first was a miss and rest 9 were served from cache.
If I try incognito window or a different browser it gets its own count starting from 0. I think somehow I am still caching cookies. I could not identify what I am missing.
Ideally, I want to delete all cookies for specific paths. but somehow unset does not seem to be working for me.

If you really want to make sure these requests are cached, make sure you do a return(hash); in your if-statement.
If you don't return, the built-in VCL will take over, and continue executing its standard behavior.
Apart from that, it's unclear whether or not your backend sets a Vary header which might affect your hit rate.
Instead of guessing, I suggest we use the logs to figure out it.
Run the following command to track your requests:
varnishlog -g request -q "ReqUrl ~ '^/content/'"
This statement's VSL Query expression assumes the URL starts with /content. Please adjust accordingly.
Please send me an extract of varnishlog for 1 specific URL, but also for both situations:
The one that hits the cache on a regular browser tab
The one that results in a cache miss in incognito mode or from a different browser
The logs will give more context and explain what happened.

Temporarily block access to all POST and login functionality

I am looking for a rule to temporarily block from users accessing my site without disabling the site completely - so only read access, on my varninsh server I have the following so far:
/**
* Temporally restrict access.
*
*/
if(req.http.host ~ "domain.tld") {
if(req.http.user-agent != "SYS-ADMIN") {
if(req.method == "POST" || req.url ~ "^/wp-login.php$" || req.url ~ "^/wp-admin") {
return(synth(503));
}
}
}
would this suffice or am missing something?

Adding the req.method == "POST" did the trick and disabled access to the site from users not being able to write to the DB without blocking the whole site.

varnish 4 - abandon, vcl_synth, and restarting requests with new headers

I've already asked and received an awesome answer about how to get an error from the backend to force serving from stale cache (grace) found here: varnish 4: serve graced object if beresp.status is an error?
but now that logic needs an extra step: i include the following code currently
sub vcl_backend_fetch {
if (bereq.retries == 0) {
unset bereq.http.X-Varnish-Backend-5xx;
unset bereq.http.X-Varnish-Backend-206;
} else {
if (bereq.http.X-Varnish-Backend-5xx) {
return (abandon);
}
if (bereq.http.X-Varnish-Backend-206) {
return (abandon);
}
}
}
sub vcl_synth {
if (resp.status == 503 &&
req.method != "POST" &&
!req.http.X-Varnish-Restarted-5xx) {
set req.http.X-Varnish-Restarted-5xx = "1";
return (restart);
}
if (resp.status == 503 &&
req.method != "POST" &&
!req.http.X-Varnish-Restarted-206) {
set req.http.X-Varnish-Restarted-206 = "1";
return (restart);
}
}
obviously the second if statement in the vcl_synth is virtually identical to the first one, with the exception of the header it's looking for. I need to differentiate the 206 to restart with a different request header, but I am not sure how. the issue is that, if the backend returns a 206, the rest of the logic abandons the backend fetch (which hands off to vcl_synth with a 503), and vcl_synth restarts the request to force serving graced cached objects. however, if there's no graced cache object to return to the user, then the user sees a 503 instead of a 206.
Before realizing that this line of thinking was not possible, i tried to have vcl_backend_fetch return a synth(206), so that vcl_synth could use resp.status to differentiate, and add a different request header before restarting the request. then i would be able to look for that header in vcl_miss, and if it's there, i could restart the whole request a second time, and force it to serve the 206 from the usual backend.
TL;DR: how do I differentiate two different cases in a vcl_backend_fetch abandon, to get vcl_synth to restart the request with two different headers?

I'm pretty sure it's not possible to do anything in vcl_backend_fetch in order to abandon the request and transfer some kind of flag to vcl_synth.
I'm not sure if I've completely understood your use case, but it seems that would need something like:
sub vcl_synth {
if (resp.status == 503 &&
req.method != "POST" &&
!req.http.X-Varnish-Restarted-5xx) {
# Let's restart the request for the first time and try to
# serve grace content. This could select an always-sick
# backend during vcl_recv.
set req.http.X-Varnish-Restarted-5xx = "1";
return (restart);
}
if (resp.status == 503 &&
req.method != "POST" &&
req.http.X-Varnish-Restarted-5xx &&
!req.http.X-Varnish-Restarted-206) {
# Grace content was not available after the first restart. Let's
# restart again the request and try to serve content using some
# other backend.
set req.http.X-Varnish-Restarted-206 = "1";
return (restart);
}
}

Varnish handling different from different methods of request

(Varnish 2.1.5)
I've got some strange situation in my Varnish. I'm trying to invalidate cache objects through PURGE requests initiated from NodeJS.
My testing consists of requesting the object, letting it cache, then do a purge request, then request it again (resulting in a fetch), and then request it again resulting in a hit of the refreshed cache object.
When I test this through the Firefox debug console, it works fine. All steps seem to work as expected. When I test the entire process in NodeJS, it works as expected, just fine. However, when I let the object cache through Firefox, and then try to invalidate it through NodeJS, it reports a 404 Not in cache.
I'm a 100% sure I'm using the same URI, and I have no idea why it acts this way. Has anyone else experienced this problem? And if yes, what is the solution to this problem?
This is my VCL:
backend default {
.host = "127.0.0.1";
.port = "80";
}
acl purge {
"localhost";
"*loadbalancer-ip*";
}
sub vcl_recv {
if (req.request == "PURGE") {
if(!client.ip ~ purge) {
error 405 "Not allowed.";
}
return (lookup);
} else if (req.url ~ "(?i)\.(jpeg|jpg|png|gif|ico|js|css|xml)$") {
unset req.http.Cookie;
return (lookup);
} else {
return (pass);
}
}
sub vcl_hit {
if (req.request == "PURGE") {
set obj.ttl = 0s;
error 200 "Purged";
}
}
sub vcl_miss {
if (req.request == "PURGE") {
error 404 "Not in cache.";
}
}
sub vcl_fetch {
if (req.url ~ "(?i)\.(jpeg|jpg|png|gif|ico|js|css|xml)$") {
unset beresp.http.set-cookie;
return (deliver);
}
}
sub vcl_deliver {
if (obj.hits > 0) {
set resp.http.X-Cache = "HIT";
} else {
set resp.http.X-Cache = "MISS";
}
}
As you can see, my configuration is pretty straight forward. This configuration is for testing purposes, I know using the loadbalancer IP is not safe and I will change it to use the Forwarded-For IP once everything works.

With a little help of this thread:
What is the function of the "Vary: Accept" HTTP header?
I came to know that it considers the Vary header when determining whether to cache or not, and which version to get from cache.
In my case, the Vary header contained the User-Agent, which is why I was getting different results from different methods.

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string

Varnish: ignore some cache cookies for hash computation - varnish

Related

How to redirect all sub urls just to the main page

Varnish 6 missing requests for same URL coming from different browsers

Temporarily block access to all POST and login functionality

varnish 4 - abandon, vcl_synth, and restarting requests with new headers

Varnish handling different from different methods of request

Categories

Resources