Varnish keeps caching my tracking software

I have a Varnish setup for one of my sites, and I'm using the open source software Piwik for my stats tracking.
Piwik has an option of tracking through a proxy, which means that the URL of my Piwik install won't be revealed in my source code. Basically it's a PHP file that sits on my WordPress install and forwards the tracking requests to my Piwik install via cURL POSTs...
Now, I set up my Varnish using:
https://github.com/mattiasgeniar/varnish-3.0-configuration-templates
In vcl_fetch I added:
if (req.url ~ "piwik") {
    set beresp.ttl = 120s;
    return (hit_for_pass);
}
In vcl_recv I added:
if (req.url ~ "piwik") {
    return (pass);
}
What happens is, I see only 50% of the traffic I actually have on the website...
I'm afraid it's because of my vcl_fetch settings...
I read up on the differences between pass and hit_for_pass, and from what I understand, beresp.ttl tells Varnish how long to keep the hit-for-pass object, i.e. to keep passing for 120s.
One more thing, W3TotalCache on WP adds some caching headers like Max-Age & expires to my piwik.php file. Without Varnish it's still working well and tracking correctly. Is it possible that there is some sort of collision between Varnish and those headers?
Do I get it right?
Why do you think 50% of my tracking is missed?
Thanks.

The Varnish configuration for pass-ing in vcl_recv is correct.
The code you have in vcl_fetch can be removed; it makes no difference at that point because of the code in vcl_recv.
Remember that any VCL code that filters response headers in vcl_fetch is also run for pass-ed responses. I'd guess that you are filtering the Set-Cookie that piwik sends.
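If the template's vcl_fetch does strip Set-Cookie across the board, one fix is to exempt the tracking endpoint. This is only a sketch against Varnish 3 syntax; the actual header filtering lives somewhere in the mattiasgeniar templates, so adjust it where the header is really unset:

```vcl
sub vcl_fetch {
    # Keep piwik's Set-Cookie intact; only strip it for other content
    if (req.url !~ "piwik" && beresp.http.Set-Cookie) {
        unset beresp.http.Set-Cookie;
    }
}
```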

Related

Angular ssr varnish usage

It's working, but it doesn't seem to affect performance. Am I using it right?
/etc/varnish/default.vcl
backend default {
    .host = "127.0.0.1";
    .port = "4000";
}
I put the Varnish port instead of 4000 in the Nginx config:
location / {
    proxy_pass http://localhost:6081;
}
My Angular application (google pagespeed) desktop performance is 99% but the mobile performance is 40-60%.
Varnish's out-of-the-box behavior respects HTTP caching best practices.
This means:
Only cache HTTP GET & HTTP HEAD calls
Don't serve responses from cache when the request contains cookie headers
Don't serve responses from cache when the request contains authorization headers
Don't store responses in cache when set-cookie headers are present
Don't store responses in cache when the cache-control header is a zero TTL or when it contains the following: no-cache, or no-store, or private
Under all other circumstances, Varnish will try to serve from cache or store in cache.
This is that behavior written in VCL: https://github.com/varnishcache/varnish-cache/blob/6.0/bin/varnishd/builtin.vcl
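Condensed, the request-side half of that built-in logic looks roughly like this (a simplified sketch, not the full builtin.vcl):

```vcl
sub vcl_recv {
    # Only GET and HEAD requests are candidates for caching
    if (req.method != "GET" && req.method != "HEAD") {
        return (pass);
    }
    # Requests carrying cookies or authorization bypass the cache
    if (req.http.Authorization || req.http.Cookie) {
        return (pass);
    }
    return (hash);
}
```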
Adapting to the real world
Although these caching best practices make sense, they are not realistic when you look at the real world. In the real world we use cookies all the time.
That's why you'll probably have to write some VCL code to change the behavior of the cache. In order to do so, you have to be quite familiar with the HTTP endpoints of your app, but also the parts where cookies are used.
Parts of your app where cookie values are used on the server-side will have to be excluded from caching
Parts of your app where cookie values aren't used will be stored in cache
Tracking cookies that are only used at the client side will have to be stripped
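Stripping client-side tracking cookies is typically done in vcl_recv. A minimal sketch, assuming the tracking cookies are Google Analytics' _ga/_gid/_gat (substitute whatever cookie names your site actually sets):

```vcl
sub vcl_recv {
    if (req.http.Cookie) {
        # Remove client-side-only analytics cookies from the Cookie header
        set req.http.Cookie = regsuball(req.http.Cookie,
            "(^|; ) *(_ga|_gid|_gat)=[^;]+;? *", "\1");
        # If nothing is left, drop the header so the request can be cached
        if (req.http.Cookie == "") {
            unset req.http.Cookie;
        }
    }
}
```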
How to examine what's going on
The varnishlog binary will help you understand the kind of traffic that is going through Varnish and how Varnish behaves with that traffic.
I've written an in-depth blog post about this, please have a look: https://feryn.eu/blog/varnishlog-measure-varnish-cache-performance/
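For a quick start, these invocations usually reveal why requests miss the cache (varnishlog's VSL query syntax as of Varnish 4 and later):

```
# Group log records per request and show only cache misses
varnishlog -g request -q "VCL_call eq 'MISS'"

# Quick hit/miss counters
varnishstat -f MAIN.cache_hit -f MAIN.cache_miss
```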
Writing VCL
Once you've figured out what is causing the drop in performance, you can write VCL to mitigate. Please have a look at the docs site to learn about VCL: https://varnish-cache.org/docs/6.0/index.html
There is reference material in there, a user guide and even a tutorial.
Good luck

cache pictures from remote server with varnish

I'm creating a simple page with a lot of pictures. All pictures are hosted on a remote provider (object storage; I only have links to the pictures). To speed up the site I would like to use Varnish to cache these pictures, but I have a problem:
All pictures are served over HTTPS, so I've used HAProxy to terminate SSL and then send traffic on to Varnish. But how do I map, in Varnish, the website address that should be visible to the end user, like https://www.website.com/picture.jpg, to the remote address where the picture is hosted (https://www.remotehostedpictures.com/picture.jpg)? In the final result the user must only see the first link; the remote address remotehostedpictures.com/picture.jpg can't be visible.
In your Varnish vcl_recv you should change the request's Host header, and you must declare remotehostedpictures.com as your backend.
In the end, you should have something like this (code not tested)
# backends must be declared before they are referenced
backend remote_host {
    .host = "www.remotehostedpictures.com";
    .port = "80";
}

sub vcl_recv {
    if (req.url ~ "\.jpg") {
        set req.http.Host = "www.remotehostedpictures.com";
        set req.backend_hint = remote_host;
    }
}
By the way, beware of DNS in backend .host. If the name resolves to multiple IPs, Varnish will use the first one. DNS resolution is done at VCL compile time, so if the record changes you should reload your VCL.
I think that storing images in Varnish is not a good idea, because Varnish will fill the whole RAM quickly (if you have a lot of images). When Varnish is full it starts evicting objects to make room; imagine what happens on the server when large parts of the cache are purged while you have traffic on your page.
Some time ago I built such a cache in Varnish, and after a few hours live I had to disable caching images because of the evictions (for me the most important thing was caching page content).
In such situations the best solution is a CDN. You can use an external service such as Cloudflare, or make a simple CDN with Nginx (which will only serve static files with an Expires header).
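A minimal Nginx "static CDN" along those lines could look like this (a sketch; the hostname and root path are placeholders for your own setup):

```nginx
server {
    listen 80;
    server_name static.example.com;   # hypothetical CDN hostname
    root /var/www/static;

    # Serve images and other static assets with far-future expiry headers
    location ~* \.(jpg|jpeg|png|gif|css|js)$ {
        expires 30d;
        add_header Cache-Control "public";
    }
}
```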
Hope it helps :)

How can I return a 500 response for all requests to a specific file at the Varnish level?

Background:
Our network structure brings all traffic into a Varnish installation, which then routes traffic to one of 5 different web servers, based on rules that a previous administrator set up. I don't have much experience with Varnish.
Last night we were being bombarded by requests to a specific file. This file is one that we limit to a specific set of servers, and it has a direct link to our master database, due to reasons. Obviously, this wasn't optimal, and our site was hit pretty hard because of it. What I attempted to do, and failed, was to write a block of code in the Varnish VCL that would return a 500 response for every request to that file, which I could then comment out after the attack period ended.
Question:
What would that syntax be? I've done my googling, but at this point I think it's the fact that I don't know enough about Varnish to be able to word my search properly, so I'm not finding the information that I need.
You can define your own vcl_recv, prior to any other vcl_recv in your configuration, reload Varnish, and you should get the behaviour you're looking for.
sub vcl_recv {
    if (req.url ~ "^/path/to/file(\?.*)?$") {
        return (synth(500, "Internal Server Error"));
    }
}
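Note that return (synth(...)) is VCL 4 syntax (Varnish 4 and later). If the installation still runs Varnish 3, the equivalent uses the error keyword:

```vcl
sub vcl_recv {
    if (req.url ~ "^/path/to/file(\?.*)?$") {
        # Varnish 3 syntax for generating a synthetic error response
        error 500 "Internal Server Error";
    }
}
```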

varnish vcl_recv default behaviour

Could you help me confirm the default behavior of vcl_recv in Varnish?
The vcl_recv definition that comes in the default.vcl file is commented out in our setup.
We have provided our custom version of vcl_recv in a VCL file without specifying a return(lookup) or lookup statement. However, caching seems to work properly when accessing images or static content. Does Varnish internally implement some sort of caching logic on top of what is defined in default.vcl's vcl_recv and the user-defined vcl_recv?
Thanks
Explanation
When you define a custom VCL subroutine (vcl_recv in this case), Varnish automatically appends the built-in VCL to yours.
Keep in mind that if you do something like return(lookup)/pass/etc. in your VCL, the built-in VCL won't be executed after that line runs.
From Varnish docs:
It is executed right after any user-specified VCL, and is always present. You can not remove it.
And:
Consider either replicating all the logic in your own VCL, or letting Varnish fall through to the default VCL.
Example
sub vcl_recv {
    if (req.http.host ~ "dev") {
        return(pass);
    }
}
This won't cache any request whose Host header contains "dev", but everything else will still fall through to the built-in VCL and be cached as usual.

How to write VCL in varnish to do no caching

I need to write VCL in Varnish so to prevent caching under certain conditions like cookie value.
Any idea how to do that?
Write and load your own .vcl file to instruct varnish when to cache. By default, requests with cookies will not be cached.
You could start with the Varnish tutorial, and don't hesitate to ask a more specific question on this site if you can't make it work...
Place the following inside your vcl_recv:
# as soon as we have a NO_CACHE cookie, pass the request
if (req.http.Cookie ~ "NO_CACHE=") {
    return (pass);
}
