Can Varnish backend be specified by stand-alone app?

Can Varnish backend be specified by stand-alone app? - varnish

Currently we have 3 applications (varnish backends):
Eshop
CMS
Routing - app which returns status code of which backend should be chosen to hit.
The main idea between this is that we have the same domain for Eshop and CMS. And all links are stored in MySQL database. So to decide where should varnish hit we are using some routing app.
If we are hitting routing backend in the vcl_recv we are checking status in vcl_backend_response and switching the backend like that:
sub vcl_backend_response {
if (591 == beresp.status) {
set bereq.backend = eshopDirector.backend();
return (retry);
} elsif (592 == beresp.status || 593 == beresp.status || 594 == beresp.status) {
set bereq.backend = cmsDirector.backend();
return (retry);
}
}
After retry we are not caching results because vcl_recv stage is not hit again to identify if results has to be retrieved from the cache or from the backend itself.
So the question here, is there any way to identify which backend has to be used from vcl_recv stage, to get cached results afterwards?
Maybe it's possible to make CURL request from there to get status from the routing app and handle it accordingly?

Related

Varnish 6 missing requests for same URL coming from different browsers

This is how my varnish.vcl looks like.
vcl 4.0;
import directors;
import std;
backend client {
.host = "service1";
.port = "80";
}
sub vcl_recv {
std.log("varnish log info:" + req.http.host);
# caching pages in client
set req.backend_hint = client;
# If request is from conent or for pages remove headers and cache
if ((req.url ~ "/content/") || (req.url ~ "/cms/api/") || req.url ~ "\.(png|gif|jpg|jpeg|json|ico)$" || (req.url ~ "/_nuxt/") ) {
unset req.http.Cookie;
std.log("Cachable request");
}
# If request is not from above do not cache and pass to Backend.
else
{
std.log("Non cachable request");
return (pass);
}
}
sub vcl_backend_response {
if ((bereq.url ~ "/content/") || (bereq.url ~ "/cms/api/") || bereq.url ~ "\.(png|gif|jpg|jpeg|json|ico)$" || (bereq.url ~ "/_nuxt/") )
{
unset beresp.http.set-cookie;
set beresp.http.cache-control = "public, max-age=259200";
set beresp.ttl = 12h;
return (deliver);
}
}
# Add some debug info headers when delivering the content:
# X-Cache: if content was served from Varnish or not
# X-Cache-Hits: Number of times the cached page was served
sub vcl_deliver {
# Was a HIT or a MISS?
if ( obj.hits > 0 )
{
set resp.http.X-Cache-Varnish = "HIT";
}
else
{
set resp.http.X-Cache-Varnish = "MISS";
}
# And add the number of hits in the header:
set resp.http.X-Cache-Hits = obj.hits;
}
If I am hitting a page from same browser netwrok tab showing
X-Cache-Varnish = "HIT";
X-Cache-Hits = ;
Lets say if I hot from chrome 10 times this is what I get
X-Cache-Varnish = "HIT";
X-Cache-Hits = 9;
9 because first was a miss and rest 9 were served from cache.
If I try incognito window or a different browser it gets its own count starting from 0. I think somehow I am still caching cookies. I could not identify what I am missing.
Ideally, I want to delete all cookies for specific paths. but somehow unset does not seem to be working for me.

If you really want to make sure these requests are cached, make sure you do a return(hash); in your if-statement.
If you don't return, the built-in VCL will take over, and continue executing its standard behavior.
Apart from that, it's unclear whether or not your backend sets a Vary header which might affect your hit rate.
Instead of guessing, I suggest we use the logs to figure out it.
Run the following command to track your requests:
varnishlog -g request -q "ReqUrl ~ '^/content/'"
This statement's VSL Query expression assumes the URL starts with /content. Please adjust accordingly.
Please send me an extract of varnishlog for 1 specific URL, but also for both situations:
The one that hits the cache on a regular browser tab
The one that results in a cache miss in incognito mode or from a different browser
The logs will give more context and explain what happened.

My varnish config doesn't seem to be working properly

I'm very new on varnish and I've a business on my hands recently. It's a local magazine website http caching (Tech Stack is Javascript + PHP). I'm trying to use varnish 4 for caching the website. What they want me to do is; any new articles should be appeared on FE immediately, any deleted articles should be erased from the FE immediately, any changes on website's current appereance should be applied directly (changing articles' current locations, they can be dragged anywhere on the website based on articles' popularity change.) and finally any changes on existing articles should be applied to website immediately. As you see on the config below, in sub vcl_recv block I tried to use return(purge) for POST requests, because new articles and article changes is applied via POST request. But it doesn't work at all. When I try create a new dummy content or make changes on existing articles, it's not purging the cache and showing the fresh content even if POST request is successful. Also, on the BE side, I tried to use if (beresp.status == 404) for deleted articles, but it doesn't work too. When I delete the dummy article I created, it's not being deleted too, I'm still seein the stale content. How should I change my config to get all these things done? Thank you.
my varnish config is ;
import directors;
import std;
backend server1 {
.host = "<some ip>";
.port = "<some port>";
}
sub vcl_init {
new bar = directors.round_robin();
bar.add_backend(server1);
}
sub vcl_recv {
set req.backend_hint = bar.backend();
if (req.http.Cookie == "") {
unset req.http.Cookie;
}
set req.http.Cookie = regsuball(req.http.Cookie, "(^|;\s*)(__[a-z]+|has_js)=[^;]*", "");
if (req.url ~ "\.(css|js|png|gif|jp(e)?g|swf|ico)") {
unset req.http.cookie;
}
if (req.url ~ "\.*") {
unset req.http.cookie;
}
if (req.method == "POST") {
return(purge);
}
}
sub vcl_deliver {
# A bit of debugging info.
if (obj.hits > 0) {
set resp.http.X-Cache = "HIT";
}
else {
set resp.http.X-Cache = "MISS";
}
}
sub vcl_backend_response {
set beresp.grace = 1h;
set beresp.ttl = 120s;
if (bereq.url ~ "\.*") {
unset beresp.http.Set-Cookie;
unset beresp.http.Cache-Control;
}
if (bereq.method == "POST") {
return(abandon);
}
if (beresp.status == 404) {
return(abandon);
}
return (deliver);
}

No need to use the director if you only have one backend. Varnish will automatically select the backend you declared if there's only 1 backend.
Purging content
The POST purge call you're doing is not ideal. Please have a look at the following page to learn more about content invalidation in Varnish: https://varnish-cache.org/docs/6.0/users-guide/purging.html#http-purging
The snippet on that page contains an ACL to protect your platform from unauthorized purges.
It's important to know that you'll need to create a hook into your CMS or your MVC controller, that does the purge call.Here's a simple example using curl in PHP:
$curl = curl_init("http://your.varnish.cache/url-to-purge");
curl_setopt($curl, CURLOPT_CUSTOMREQUEST, "PURGE");
curl_exec($curl);
As you can see, this is an HTTP request done in cURL that uses the custom PURGE HTTP request method. This call needs to be executed in your good right after the changes are stored in the database. This post-publishing hook will ensure that Varnish clears this specific object from cache.
VCL cleanup
The statement below doesn't look like a reliable way to remove cookies, because the expression will remove cookies from all pages dat contain a dot:
if (req.url ~ "\.*") {
unset req.http.cookie;
}
The same applies to the following statement coming from the vcl_backend_response hook:
if (bereq.url ~ "\.*") {
unset beresp.http.Set-Cookie;
unset beresp.http.Cache-Control;
}
I assume some pages do actually need cookies to properly function. An admin panel for example, or the CMS, or maybe even a header that indicates whether or not you're logged in.
The best way forward is to define a blacklist or whitelist of URL patterns that can or cannot be cached.
Here's an example:
if(req.url !~ "^/(admin|user)" {
unset req.http.Cookie;
}
The example above will only keep cookies for pages that start with /admin or /user. There are other ways as well.
Conclusion
I hope the purging part is clear. If not, please take a closer look at https://varnish-cache.org/docs/6.0/users-guide/purging.html#http-purging.
In regards to the VCL cleanup: purging can only work if the right things are stored in cache. Dealing with cookies can be tricky in Varnish.
Just try to define under what circumstances cookies should be kept for specific pages. Otherwise, you can just remove the cookies.
Hope that helps. Good luck.
Thijs

Varnish: Making each API keys objects cache separately

I have Varnish 4 installation where I select a backend based on an API-key in the header of each request, such as (we are in vcl_recv):
if (req.url ~ "/content") {
# Check for presence of X-Api-Key header
if ((! req.http.X-Api-Key) || ((! req.http.X-Api-Key ~ "prod-") && (! req.http.X-Api-Key ~ "test-"))) {
return(synth(403,"Access Denied - API key missing or invalid."));
}
if (req.http.X-Api-Key ~ "prod-") {
set req.backend_hint = PROD.backend();
}
if (req.http.X-Api-Key ~ "test-") {
set req.backend_hint = TEST.backend();
}
}
However, objects fetched from the PROD backend can get delivered to requests to the TEST backend, if their TTL is not expired, and vice versa.
How do I make sure that content from each backend is isolated from the other?

This one is easy. Since you want the cache to vary by specific header, you should tell Varnish about it. So either make your backend send Vary: X-Api-Key (best route), or have Varnish hash on that header's value:
sub vcl_hash {
if (req.http.X-Api-Key) {
hash_data(req.http.X-Api-Key);
}
}

varnish 4 - abandon, vcl_synth, and restarting requests with new headers

I've already asked and received an awesome answer about how to get an error from the backend to force serving from stale cache (grace) found here: varnish 4: serve graced object if beresp.status is an error?
but now that logic needs an extra step: i include the following code currently
sub vcl_backend_fetch {
if (bereq.retries == 0) {
unset bereq.http.X-Varnish-Backend-5xx;
unset bereq.http.X-Varnish-Backend-206;
} else {
if (bereq.http.X-Varnish-Backend-5xx) {
return (abandon);
}
if (bereq.http.X-Varnish-Backend-206) {
return (abandon);
}
}
}
sub vcl_synth {
if (resp.status == 503 &&
req.method != "POST" &&
!req.http.X-Varnish-Restarted-5xx) {
set req.http.X-Varnish-Restarted-5xx = "1";
return (restart);
}
if (resp.status == 503 &&
req.method != "POST" &&
!req.http.X-Varnish-Restarted-206) {
set req.http.X-Varnish-Restarted-206 = "1";
return (restart);
}
}
obviously the second if statement in the vcl_synth is virtually identical to the first one, with the exception of the header it's looking for. I need to differentiate the 206 to restart with a different request header, but I am not sure how. the issue is that, if the backend returns a 206, the rest of the logic abandons the backend fetch (which hands off to vcl_synth with a 503), and vcl_synth restarts the request to force serving graced cached objects. however, if there's no graced cache object to return to the user, then the user sees a 503 instead of a 206.
Before realizing that this line of thinking was not possible, i tried to have vcl_backend_fetch return a synth(206), so that vcl_synth could use resp.status to differentiate, and add a different request header before restarting the request. then i would be able to look for that header in vcl_miss, and if it's there, i could restart the whole request a second time, and force it to serve the 206 from the usual backend.
TL;DR: how do I differentiate two different cases in a vcl_backend_fetch abandon, to get vcl_synth to restart the request with two different headers?

I'm pretty sure it's not possible to do anything in vcl_backend_fetch in order to abandon the request and transfer some kind of flag to vcl_synth.
I'm not sure if I've completely understood your use case, but it seems that would need something like:
sub vcl_synth {
if (resp.status == 503 &&
req.method != "POST" &&
!req.http.X-Varnish-Restarted-5xx) {
# Let's restart the request for the first time and try to
# serve grace content. This could select an always-sick
# backend during vcl_recv.
set req.http.X-Varnish-Restarted-5xx = "1";
return (restart);
}
if (resp.status == 503 &&
req.method != "POST" &&
req.http.X-Varnish-Restarted-5xx &&
!req.http.X-Varnish-Restarted-206) {
# Grace content was not available after the first restart. Let's
# restart again the request and try to serve content using some
# other backend.
set req.http.X-Varnish-Restarted-206 = "1";
return (restart);
}
}

varnish cache banning with pattern matching

Right, I'm going to be honest, I don't know varnish vcl, I can work out some basic stuff but I don't know it very well which is obviously why I'm having issue's.
I'm trying to set up cache banning via a http request, however the request can't come in via the DNS but rather through the IP address of the varnish box otherwise I can't be sure that every varnish box cache will have the target flushed; this is because we have several varnish boxes all behind an ELB so you can't guarantee that a ban request will not go to the same box twice, hence doing this via IPs.
I'm using this to insure that only the allowed IP's are allowed to ban but this isn't working:
sub vcl_hit {
if (req.request == "BAN") {
ban("req.url ==" + req.url);
error 200 "Purged";
}
}
I don't really know what to do to get this working and I've looked but most of the tutorials I've found seem to be for full URLS rather than just ip + pattern_to_purge

from your config example i expect you use varnish 3
you can add a list of ips that is allowed to do the purge as followed
acl ban_allowed_ip {
"127.0.0.1";
"127.0.0.2";
}
inside your if(req.request =="BAN") add the following
if (!client.ip ~ ban_allowed_ip) {
error 405 "Not allowed.";
}

The answer is to use:
if (req.request == "BAN") {
if (req.http.X-Debug != "True") {
error 405 "Not allowed.";
}
ban("obj.http.x-url ~ " + req.url);
error 200 "ban added";
}
Whilst this will return 200 regardless if the item in the cache exists or not, it does add the ban.

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string