Show content from /page/a on url /page/b Varnish - varnish

I would like to set up Varnish 2.1.5 rules that show content from another page in some cases, yet keep the original URL intact.
For example, when a user requests /page/a, they are shown the content of /page/b instead but still see the /page/a URL in the browser.
The specific use case is gracefully handling 404 errors on translated pages. I'm not sure how to send the request back through vcl_recv.
As I understand the lifecycle, the flow and my current logic look like this:
sub vcl_recv {
if(req.http.cookie ~ "lang_pref") {
# Redirect to Preferred Language
error 999 "i18n cookie";
}...
sub vcl_deliver {
if (resp.status == 999 ) {
set resp.status = 302;
set resp.response = "Found";
}... # more i18n logic
sub vcl_fetch {
# Set Varnish error if backend cant find requested i18n page
if (beresp.status == 404 && req.url ~ "^\/(en|fr|de)(\/.*)?$") {
error 494;
}...
sub vcl_error {
# Double check i18n pages for English before 404
if (obj.status == 494) {
set obj.http.Location = "https://site/page/a";
}
set obj.status = 302;
return(deliver);
}
My assumption is that, instead of set obj.http.Location = "https://site/page/a";, I need to somehow send the request back to vcl_recv and then use regsub().
How would I go about that?

Should be as easy as:
sub vcl_error {
# Double check i18n pages for English before 404
if (obj.status == 494 && req.url == "/page/a") {
set req.url = "/page/b";
return(restart);
}
}

Related

How to invalidate a request in varnish which has a request param

I want to invalidate a request in Varnish from a Java backend using HTTP headers.
So far I have only been able to do this for a request that does not have a query parameter in it.
Let's say I have a request: localhost:8090/api/data/abc?fields=test,test1
What headers do I need to set in this case for Varnish to cache it?
I am able to ban a request like localhost:8090/api/data/abc
by using these headers for the request:
request: localhost:8090/api/data/abc
headers:
responseHeaders.set("x-host", "localhost:8080");
responseHeaders.set("x-url", "/api/data/abc");
VCL code
You can use the VCL example from the banning tutorial on the Varnish Developer Portal, which looks like this:
vcl 4.1;
acl purge {
"localhost";
"192.168.55.0"/24;
}
sub vcl_recv {
if (req.method == "BAN") {
if (!client.ip ~ purge) {
return (synth(405));
}
if (!req.http.x-invalidate-pattern) {
return (purge);
}
ban("obj.http.x-url ~ " + req.http.x-invalidate-pattern
+ " && obj.http.x-host == " + req.http.host);
return (synth(200,"Ban added"));
}
}
sub vcl_backend_response {
set beresp.http.x-url = bereq.url;
set beresp.http.x-host = bereq.http.host;
}
sub vcl_deliver {
unset resp.http.x-url;
unset resp.http.x-host;
}
Keep in mind that you need to add your backend definition and customize the values of the ACL.
Removing objects that match a specific query string pattern
Let's say the cache contains 4 objects identified by the following URLs:
/?a=1
/?a=2
/?a=3
/?a=4
Imagine we want to remove the first 3. Assuming the value of the a query string parameter is a number, we can create the following HTTP request:
curl -XBAN -H"x-invalidate-pattern: ^/\?a=[1-3]+" http://localhost
As a result /?a=1, /?a=2 and /?a=3 will be removed from the cache, whereas /?a=4 is still stored in the cache.
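To see what that pattern does and does not catch, here is a minimal sketch in Python. It assumes Python's re module behaves like the PCRE matching Varnish applies in a ban expression for this simple pattern; would_be_banned and STRICT_PATTERN are hypothetical names used only for illustration.

```python
import re

# Pattern copied from the curl example above.
BAN_PATTERN = r"^/\?a=[1-3]+"
# Hypothetical stricter variant: exactly one digit, anchored at the end.
STRICT_PATTERN = r"^/\?a=[1-3]$"

def would_be_banned(url, pattern=BAN_PATTERN):
    """Return True if an unanchored regex search on the URL finds a match."""
    return re.search(pattern, url) is not None

cached = ["/?a=1", "/?a=2", "/?a=3", "/?a=4"]
print([u for u in cached if would_be_banned(u)])   # ['/?a=1', '/?a=2', '/?a=3']

# The pattern has no end anchor, so a URL that merely starts with a
# matching digit is caught as well.
print(would_be_banned("/?a=14"))                   # True
print(would_be_banned("/?a=14", STRICT_PATTERN))   # False
```

If other numeric values can appear in the cache, anchoring the end of the pattern as in the stricter variant avoids banning more than intended.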
Conclusion
The query string parameter is part of the URL. As long as you can match it in a regular expression, you can remove specific objects from the cache.

Varnish vcl_backend_response detect vcl_recv return (hash)

On a multi-site setup using Varnish 5.1 on port 80, I don't want to cache all domains.
That is easily done in vcl_recv.
if ( req.http.Host == "cache.this.domain.com" ) {
return(hash);
}
return(pass);
Now in vcl_backend_response I want to do some processing for cached domains.
Of course I can do if( bereq.http.Host == "cache.this.domain.com" ), but is there a way to know if it was a return(hash) or a return(pass) call in vcl_recv from within vcl_backend_response?
I thought that this could make sense but couldn't find the information.
Thanks for your help.
In addition to the ad-hoc approach suggested by @Daniel V., an alternative that might fit your needs is:
sub vcl_backend_response {
if (!bereq.uncacheable) {
...
}
}
This lets you execute the extra processing only for cacheable objects.
It really makes me wonder why you need such processing in the first place.
I don't think there's a way to tell directly how you landed into vcl_backend_response. So I suppose you can set a flag and check on that later, i.e.:
sub vcl_recv {
if ( req.http.Host == "cache.this.domain.com" ) {
set req.http.return_type = "hash";
return(hash);
}
set req.http.return_type = "pass";
return(pass);
}
sub vcl_backend_response {
if( bereq.http.return_type == "pass" ) ...
}

How to exclude special pages from being cached (Varnish)?

I am new to VCL rules.
I want to exclude special pages from being cached by varnish cache.
What I exactly want to do is exclude all urls from being cached that include a specific query string "query=(number between 1 and 100)"
This code works only for one specific query.
sub vcl_recv {
# don't cache these special pages
if (req.url ~ "query=100") {
return(pass);
}
}
I just want to be sure: should this rule work for the whole range from 1 to 100?
sub vcl_recv {
# don't cache these special pages
if (req.url ~ "query=[0-9]") {
return(pass);
}
}
Or do I have to do it like this?
sub vcl_recv {
# don't cache these special pages
if (req.url ~ "query=1||query=2||...||query=99||query=100") {
return(pass);
}
}
I don't know whether Varnish supports curly-brace quantifiers; if it does, you could do something like:
sub vcl_recv {
# don't cache these special pages
if (req.url ~ "query=([0-9]{1,2}|100)") {
return(pass);
}
}
By the way, this regular expression also matches "query=990", because nothing anchors the end of the number. I don't know how your URL is composed, but you should add something to avoid that (if you really need to).
For example, if there are other params:
sub vcl_recv {
# don't cache these special pages
if (req.url ~ "query=([0-9]{1,2}|100)&") {
return(pass);
}
}
Or in case it's the last param in the url:
sub vcl_recv {
# don't cache these special pages
if (req.url ~ "query=([0-9]{1,2}|100)$") {
return(pass);
}
}
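For anyone who wants to check the candidate patterns from this thread before putting them in VCL, here is a rough sketch in Python. It assumes Python's re module agrees with Varnish's regex engine on these simple patterns; the helper matches and the STRICT pattern are hypothetical and only for illustration.

```python
import re

def matches(pattern, url):
    """Unanchored search, like the ~ operator on req.url."""
    return re.search(pattern, url) is not None

# From the question: a single digit is enough, so any number passes.
print(matches("query=[0-9]", "/p?query=5000"))               # True

# From the question: "||" contains an empty alternative, which matches
# every URL, including ones without any query parameter.
print(matches("query=1||query=2", "/anything"))              # True

# From the answer: unanchored, so "99" inside "990" satisfies it.
print(matches("query=([0-9]{1,2}|100)", "/p?query=990"))     # True
print(matches("query=([0-9]{1,2}|100)$", "/p?query=990"))    # False
# It still accepts 0, which is outside the 1-100 range.
print(matches("query=([0-9]{1,2}|100)$", "/p?query=0"))      # True

# Hypothetical stricter variant: parameter boundary on both sides,
# no leading zero, values 1..100 only.
STRICT = r"[?&]query=([1-9][0-9]?|100)(&|$)"
print(matches(STRICT, "/p?query=42&x=1"))                    # True
print(matches(STRICT, "/p?query=0"))                         # False
print(matches(STRICT, "/p?query=101"))                       # False
print(matches(STRICT, "/p?subquery=5"))                      # False
```

The stricter variant handles both the "other params follow" and the "last param" cases in one pattern, at the cost of being a little harder to read.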

Varnish-Cache Time based access

I have in Apache something like this
RewriteCond %{REMOTE_HOST} !^11\.22\.33\.[12]\d\d$ #As example allows from 100 till 200
RewriteCond %{REMOTE_HOST} !^127\.0\.0\.1$
RewriteCond %{TIME} <20140113090000
RewriteRule ^/access-on-monday/ http://www.mysite.com/ [NC,L,R=302]
which works perfectly.
I need the same for Varnish.
Because the server is behind a load balancer, it gets an X-Forwarded-For header with the real client IP. I can still check %{REMOTE_HOST} in Apache because rpaf_module is installed.
I added the following to Varnish with the help of the ipcast vmod:
import ipcast;
acl office {
"localhost";
"11.22.33.100"/24; //Let's think that it matches with 11.22.33.100 - 11.22.33.200
}
sub vcl_recv {
if (req.http.X-Forwarded-For !~ ",") {
set req.http.xff = req.http.X-Forwarded-For;
} else {
set req.http.xff = regsub(req.http.X-Forwarded-For, "^[^,]+.?.?(.*)$", "\1");
}
if (ipcast.clientip(req.http.xff) != 0) {
error 400 "Bad request";
}
if (!client.ip ~ office) {
set req.http.X-Redir-Url = "http://" + req.http.Host + "/";
error 751 "Found";
}
}
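As a side note on the X-Forwarded-For handling above, the following is a small Python sketch of what that regsub() keeps for different header values. It assumes Python's re module treats this pattern the same way Varnish does; second_hop_onwards is a hypothetical helper name and the addresses are made up.

```python
import re

def second_hop_onwards(xff):
    """Mirror the VCL branches: keep a single address, else drop the first entry."""
    if "," not in xff:
        return xff
    # Same pattern and replacement as the regsub() call in vcl_recv.
    return re.sub(r"^[^,]+.?.?(.*)$", r"\1", xff, count=1)

print(second_hop_onwards("203.0.113.7"))
# 203.0.113.7
print(second_hop_onwards("203.0.113.7, 11.22.33.150"))
# 11.22.33.150
print(second_hop_onwards("203.0.113.7, 11.22.33.150, 10.0.0.1"))
# 11.22.33.150, 10.0.0.1

# Without a space after the comma, the second ".?" consumes the first
# character of the next address.
print(second_hop_onwards("203.0.113.7,11.22.33.150"))
# 1.22.33.150
```

So the pattern relies on the proxy always writing ", " between entries, and with more than two entries the result is still a list rather than a single address.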
Then I do the redirect in vcl_error, but that doesn't matter here.
My question: is it possible to implement time-based access like in Apache?
Plain VCL is limited, but Varnish allows much more than simple statements.
You can extend it with inline C or VMODs and do this job in C.
For example, suppose you want to add time-based access to this configuration:
backend server_available_in_2014 {
.host="127.0.0.1";
.port="8080";
}
sub vcl_recv {
set req.backend = server_available_in_2014; # IT MUST BE AVAILABLE ONLY in 2014
}
You can convert your date (for example 20140113090000) into a UNIX timestamp and write some simple inline C. The value 1389617122 used below corresponds to 2014-01-13 12:45:22 UTC:
backend server_available_in_2014 {
.host="127.0.0.1";
.port="8080";
}
C{
double TIM_real(void);
}C
sub vcl_recv {
C{
if (TIM_real() > 1389617122.0) {
VRT_l_req_backend(sp, VGCDIR(_server_available_in_2014));
}
}C
}
TIM_real() returns the current timestamp (see varnish/lib/libvarnish/time.c), and the VRT_l_req_backend statement is exactly the same as set req.backend = server_available_in_2014;, but written in C instead of VCL.
If you want more tweaks, you can compile your VCL into C by executing the following command: varnishd -f default.vcl -C
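To get the number to compare against TIM_real(), the date has to be converted to seconds since the epoch. A minimal sketch in Python, assuming the Apache-style YYYYMMDDhhmmss string is meant as UTC (Apache's %{TIME} uses the server's local time, so adjust for your timezone); to_unix is a hypothetical helper name.

```python
from datetime import datetime, timezone

def to_unix(stamp):
    """Parse a YYYYMMDDhhmmss string as UTC and return seconds since the epoch."""
    parsed = datetime.strptime(stamp, "%Y%m%d%H%M%S").replace(tzinfo=timezone.utc)
    return int(parsed.timestamp())

# The cutoff from the Apache rule in the question, read as UTC.
print(to_unix("20140113090000"))
# 1389603600

# Going the other way: the moment the timestamp in the inline C refers to.
print(datetime.fromtimestamp(1389617122, tz=timezone.utc).isoformat())
# 2014-01-13T12:45:22+00:00
```

The resulting integer can be pasted into the inline C comparison as a floating-point literal, for example 1389603600.0.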

Varnish vcl_hash to remove a parameter

I'm using Varnish 2.0.6 and I'm having trouble with finding good documentation to write the vcl_hash function.
I need to remove a few parameters from the URL of my API before caching. In particular a userid that is passed to track analytics but not to change the results.
URL: /api/browse?node=123&userid=3432432564363
I wrote this, but it's not clear to me whether the vcl_hash function needs to end with 'hash', 'return(hash)' or nothing, and whether I need to handle all the cases or just my special case. It's also not clear to me whether I'm overwriting the built-in method or extending it.
I have:
sub vcl_hash {
if (req.url ~ "^/api/browse") {
set req.hash += regsuball(req.url,"&userid=([A-z0-9]+)","");
}
hash;
}
Is it missing something?
I tested a few things, and this one seems to work:
sub vcl_hash {
if (req.url ~ "^/api/browse") {
set req.hash += regsuball(req.url,"&userid=([A-z0-9]+)","");
} else {
set req.hash += req.url;
}
set req.hash += req.http.host;
hash;
}
So it looks like you also have to handle the default case when you rewrite vcl_hash.
The following is a general solution that works for me (starting from Varnish 4) to remove several unwanted parameters.
The list of parameters can be extended easily, as long as the value-regex matches: The value regex matches all URL-safe characters, so it should match for all URL-encoded parameters.
sub vcl_hash {
# conditional replacement is faster than evaluating regexes all the time
if (req.method == "GET" || req.method == "HEAD") {
hash_data(regsuball(req.url, "(userid|sid|uid)=[%.-_~A-z0-9]+&?", ""));
}
else {
hash_data(req.url);
}
hash_data(req.http.host);
return (lookup);
}
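Before relying on that regsuball() for cache keys, it may be worth checking what it produces for a few URL shapes. A rough sketch in Python, assuming Python's re module behaves like Varnish's regex engine for this pattern; strip_params and strip_params_tight are hypothetical helpers, and the tighter variant is only one possible approach.

```python
import re

# Pattern copied from the Varnish 4 answer above.
ORIGINAL = r"(userid|sid|uid)=[%.-_~A-z0-9]+&?"

def strip_params(url):
    """Apply the pattern the way regsuball() would (replace every match)."""
    return re.sub(ORIGINAL, "", url)

# Same resource, but the position of userid changes the resulting key:
print(strip_params("/api/browse?userid=1&node=123"))   # /api/browse?node=123
print(strip_params("/api/browse?node=123&userid=1"))   # /api/browse?node=123&

# Inside the class, ".-_" is a range, so a literal hyphen is not matched
# and the value is only partly removed.
print(strip_params("/api/browse?userid=abc-def"))      # /api/browse?-def

def strip_params_tight(url):
    """Hypothetical variant: match whole parameter names, any value up to '&'."""
    stripped = re.sub(r"(?<=[?&])(?:userid|sid|uid)=[^&]*&?", "", url)
    return stripped.rstrip("?&")

print(strip_params_tight("/api/browse?node=123&userid=1"))   # /api/browse?node=123
print(strip_params_tight("/api/browse?userid=abc-def"))      # /api/browse
print(strip_params_tight("/api/browse?guid=5"))              # /api/browse?guid=5
```

In VCL the trailing cleanup would need a second regsub() call, since there is no direct equivalent of rstrip.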

Resources