It's working, but it doesn't seem to affect performance. Am I using it correctly?
/etc/varnish/default.vcl
backend default {
    .host = "127.0.0.1";
    .port = "4000";
}
I added the Varnish port instead of 4000 in my nginx config:
location / {
    proxy_pass http://localhost:6081;
}
My Angular application's Google PageSpeed desktop score is 99, but its mobile score is only 40-60.
Varnish's out-of-the-box behavior respects HTTP caching best practices.
This means:
Only cache HTTP GET and HTTP HEAD calls
Don't serve responses from cache when the request contains Cookie headers
Don't serve responses from cache when the request contains Authorization headers
Don't store responses in cache when Set-Cookie headers are present
Don't store responses in cache when the Cache-Control header specifies a zero TTL, or when it contains no-cache, no-store, or private
Under all other circumstances, Varnish will try to serve from cache or store in cache.
This is that behavior written in VCL: https://github.com/varnishcache/varnish-cache/blob/6.0/bin/varnishd/builtin.vcl
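For reference, this is roughly how that logic looks in the built-in vcl_recv (excerpted and trimmed from builtin.vcl):

sub vcl_recv {
    if (req.method != "GET" && req.method != "HEAD") {
        # We only deal with GET and HEAD by default
        return (pass);
    }
    if (req.http.Authorization || req.http.Cookie) {
        # Not cacheable by default
        return (pass);
    }
    return (hash);
}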
Adapting to the real world
Although these caching best practices make sense, they are not realistic when you look at the real world. In the real world we use cookies all the time.
That's why you'll probably have to write some VCL code to change the behavior of the cache. In order to do so, you have to be quite familiar with the HTTP endpoints of your app, but also the parts where cookies are used.
Parts of your app where cookie values are used on the server side will have to be excluded from caching
Parts of your app where cookie values aren't used can be stored in cache
Tracking cookies that are only used on the client side will have to be stripped (see the sketch below)
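As an illustration of that last point, here's a minimal VCL sketch for stripping client-side tracking cookies in vcl_recv; the cookie names are examples and should be adapted to your app:

sub vcl_recv {
    if (req.http.Cookie) {
        # Remove tracking cookies that are only used client-side (names are examples).
        set req.http.Cookie = regsuball(req.http.Cookie, "(^|; ) *(_ga|_gid|_gat|__utm[a-z]+)=[^;]+;? *", "\1");
        # If nothing is left, drop the header entirely so the request can be cached.
        if (req.http.Cookie ~ "^\s*$") {
            unset req.http.Cookie;
        }
    }
}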
How to examine what's going on
The varnishlog binary will help you understand the kind of traffic that is going through Varnish and how Varnish behaves with that traffic.
I've written an in-depth blog post about this, please have a look: https://feryn.eu/blog/varnishlog-measure-varnish-cache-performance/
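For example, to see the full transaction log for requests to a given URL (the path here is a placeholder), you can run something like:

varnishlog -g request -q 'ReqUrl eq "/some-page"'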
Writing VCL
Once you've figured out what is causing the drop in performance, you can write VCL to mitigate. Please have a look at the docs site to learn about VCL: https://varnish-cache.org/docs/6.0/index.html
There is reference material in there, a user guide, and even a tutorial.
Good luck
Related
I'm just testing Workers Sites with a Hugo-created static site. It already existed, so I used the docs’ instructions for adapting an existing site. The cache-control headers for the woff2 and CSS files all show up with no-cache, contrary to what I'd have expected based on https://support.cloudflare.com/hc/en-us/articles/200172516#h_a01982d4-d5b6-4744-bb9b-a71da62c160a. The Workers site in question is https://hosts-test-hugo.brycewray.workers.dev/.
I found the following at https://levelup.gitconnected.com/use-cloudflare-javascript-workers-to-deploy-you-static-generated-site-ssg-1c518e078646 but don't know if it's related:
A Cloudflare Worker is a piece of JavaScript code that runs every time you access a specific route on a website proxied by Cloudflare. The code is executed on every request before they reach Cloudflare’s cache. This means Worker responses are not cached (although requests made by the worker to other web services might be cached with the appropriate caching headers).
Does the site need to have a custom domain — i.e., rather than being a “.workers.dev” URL — before it will have normal caching behavior? Is that even related?
[Note: I am posting this here because I’ve been unsuccessful in getting a response on either the Cloudflare community forum or the Cloudflare subreddit — hoping for better results here.]
Thanks again to Cloudflare’s Kenton Varda for his answer here, without which I’d have been totally stuck. I also thank Brian Li for additional and equally valuable help he provided separately.
Adding the following for others who may find it useful . . .
The remaining problem I encountered was that I didn’t know how to assign differing cache-control values (such as one month, or 2592000) to most of the static assets while leaving the HTML at a much smaller value (like 3600 or 0). Then, a few days later, I found the answer within a comment in Issue #81 for the Cloudflare KV-Asset-Handler repository. So, now, I have the following code within my Worker’s index.js file:
options.cacheControl = {
    browserTTL: 0,
    edgeTTL: 0,
    bypassCache: false // default
}
const filesRegex = /(.*\.(ac3|avi|bmp|br|bz2|css|cue|dat|doc|docx|dts|eot|exe|flv|gif|gz|ico|img|iso|jpeg|jpg|js|json|map|mkv|mp3|mp4|mpeg|mpg|ogg|pdf|png|ppt|pptx|qt|rar|rm|svg|swf|tar|tgz|ttf|txt|wav|webp|webm|webmanifest|woff|woff2|xls|xlsx|xml|zip))$/

if (url.pathname.match(filesRegex)) {
    options.cacheControl.edgeTTL = 2592000
    options.cacheControl.browserTTL = 2592000
}
If you look at that linked issue comment and wonder what’s different: the only thing is that I removed html (and, to be safe, htm) from the list of extensions. As a result, my Worker site’s HTML has zero caching while each CSS, font, or image file has a one-month cache-control setting — exactly the desired result. Note: The vast majority of the site’s images are hosted elsewhere, but I do still host a small number for favicons and fallback in general.
By default, Workers Sites does not serve a Cache-Control header, but you can customize it to do so. (EDIT to clarify: Workers Sites are cached on Cloudflare's edge by default, and support "etags" for revalidation. The Cache-Control header controls whether they are also cached in the browser without requiring revalidation.)
Note that Workers Sites works very differently from using Cloudflare with a classic origin server. If you're reading something about caching on Cloudflare, but it doesn't specifically mention Workers Sites, then it probably does not apply to Workers Sites.
With Workers Sites, your site is served by a Cloudflare Worker -- code that runs directly on Cloudflare's servers. So, you have no "origin" server behind Cloudflare, and Cloudflare's cache doesn't work in the normal way. The Worker code is completely responsible for serving the content, including setting any headers like Cache-Control.
In fact, when you create a new Workers Sites project using wrangler, the code for this Worker is generated for you -- but you are allowed to edit it! You can customize the code all you want to do whatever you want. The code for the Worker is found in your project directory under workers-site/index.js. The code looks like this -- in fact, it is initialized as a copy of that file from GitHub.
This worker code depends on a library (npm module) called @cloudflare/kv-asset-handler to do most of the work. This library can be customized to handle caching in various ways through the cacheControl option.
But where do you set this option? Well, in your worker code!
Open up workers-site/index.js and look for the part that looks like this:
async function handleEvent(event) {
    const url = new URL(event.request.url)
    let options = {}

    /**
     * You can add custom logic to how we fetch your assets
     * by configuring the function `mapRequestToAsset`
     */
    // options.mapRequestToAsset = handlePrefix(/^\/docs/)
The comment mentions one way that you can use options to customize how your site is served, but you can also set cacheControl here. Try adding this:
options.cacheControl = {
    browserTTL: 3600 // 1 hour
}
Now re-deploy your site, and you should see assets are served with Cache-Control: max-age=3600. Of course, this means that your content may be cached in people's browsers for up to an hour (3600 seconds); you may prefer a longer or shorter period.
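With the Wrangler v1 tooling that Workers Sites projects used at the time, redeploying is just:

wrangler publish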
Note that if you aren't a programmer, this may all seem a bit daunting. Workers Sites is really designed for people who want to be able to customize how their sites are served by editing JavaScript code. For those not interested in writing code, you will be limited to the default behavior, which may or may not suit your needs.
What is the best practice when it comes to setting security policies such as CSP and http security headers such as HSTS? Should they be configured within my express.js application? Or is it best practice to configure them in nginx? I found documents on how to implement them but I am not sure where they should be implemented.
Either could be used. You should put them where it's most appropriate for you; it really depends on your setup.
I'm assuming you have an Nginx web server in front of one or more Node.js application servers?
If so, then are some pages returned from Nginx (e.g. static pages) and some from Node (e.g. dynamic)? Do you have more than one Node server?
It also depends on what you are doing with Node. It's quite common to have Nginx return the HTML, CSS and JavaScript and then use that JavaScript to make AJAX calls to a Node server that returns JSON data. As CSP is needed on the HTML document and not the JSON, it makes no sense to return CSP headers from Node in this scenario.
Some headers, like HSTS, are set for the whole domain, so to me it makes sense to set them at the Nginx layer so they affect all requests: static pages served by Nginx and dynamic pages served by one or more Node servers. This also means you don't have to remember to set them if you ever set up another Node server.
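As a sketch, in Nginx that can be as simple as the following (the max-age value is an example; be careful with includeSubDomains until you're sure it's safe for your domain):

# In the server block that terminates TLS; 'always' adds the header even on error responses.
add_header Strict-Transport-Security "max-age=31536000; includeSubDomains" always;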
However, if different data is returned for each service and/or request, it may make sense to do it in Node. For example, if your Node application needs to set different CORS headers based on the incoming request, it makes no sense to do this in Nginx and try to replicate the logic based on request URL and parameters.
Ultimately you should do it where it makes the most sense based on your application setup, where it's most likely to be set correctly (so it's not set when it shouldn't be, set to the wrong value, or too easy to forget in future), and where it's easiest to manage (e.g. sometimes it's easier to change application code than server config, or vice versa).
Is there any way for varnish to read a list of backend urls from a text file, and then proxy cache misses to a random url taken from the text file?
What I imagine is something like this pseudocode...
/var/services/backend-urls.conf
http://backend-host-1/path/to/application
http://backend-host-2/path/to/application
http://backend-host-3/path/to/application
# etc
varnish config
sub vcl_miss {
// read a list of urls from a text file
backendHosts = readFile("/var/services/backend-urls.conf");
//choose a random url from the file
randomHost = chooseLineAtRandom(backendHosts);
//proxy the request to the random host
set req.backend = randomHost;
}
To provide some background, I work on a server system that comprises a number of backend applications that currently sit behind a front-end running apache. We are evaluating replacing the apache layer with varnish so we can benefit from the caching capabilities of varnish. We also have a service discovery framework that knows the endpoint locations for each backend application (the endpoint urls change periodically as new hosts emerge or are taken out of service).
Currently we use the RewriteMap functionality in mod_rewrite to route requests to the backend services. Then we have a process to maintain the lists of backend services based upon the contents of the service discovery framework.
All this works well for us in Apache, except that Apache is like using a sledgehammer to crack a nut. All we really want is the reverse proxy logic; the caching in Varnish would be helpful too.
Is there any way to have varnish read the list of backend urls from an external resource?
Without resorting to custom VMODs/C modules, the quick answer is no.
The VCL instructions are compiled within Varnish, and that rules out run-time inclusion.
But why not include a separate backend VCL file within your main VCL, containing the current backends?
That VCL file could be written out on demand. Then, using the varnishadm CLI, you could request a recompile of the VCL, bringing the new config live.
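A minimal sketch of that approach, with hypothetical file paths and backend names:

# /etc/varnish/backends.vcl -- regenerated from your service discovery framework
backend app1 { .host = "backend-host-1"; .port = "80"; }
backend app2 { .host = "backend-host-2"; .port = "80"; }

# /etc/varnish/default.vcl
include "/etc/varnish/backends.vcl";

After rewriting backends.vcl, load and activate the new configuration:

varnishadm vcl.load reload-001 /etc/varnish/default.vcl
varnishadm vcl.use reload-001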
I can see two potential solutions.
The first is to have something generate your VCL and backends such as Chef or some custom scripting. You can then process the text file into backend definitions and the necessary VCL to invoke them. To handle the requirement for the random backend you could use a director. I've not dealt with directors myself but it looks like they are meant to solve that requirement. When changes to the backends occur you could rerun the generation script/Chef and tell Varnish to reload its configuration either using varnishadm or service varnish reload to avoid a full restart.
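For example, the generated VCL for that first approach might look like this (Varnish 4+ syntax; backend names and hosts are placeholders):

import directors;

backend app1 { .host = "backend-host-1"; .port = "80"; }
backend app2 { .host = "backend-host-2"; .port = "80"; }

sub vcl_init {
    # A random director distributes cache misses across the backends.
    new cluster = directors.random();
    cluster.add_backend(app1, 1.0);
    cluster.add_backend(app2, 1.0);
}

sub vcl_recv {
    set req.backend_hint = cluster.backend();
}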
The second would be to implement it in C, either via a VMOD as Marcel Dumont suggests or possibly using inline C in your VCL.
With vmod_dynamic you can just use any DNS name as a backend or even service records.
For your use case, one option would be to set up an SRV record in DNS pointing to all your servers and then just use that, as in the basic-stub.vtc test case, for example.
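A minimal sketch of the DNS-name variant with vmod_dynamic (the hostname is a placeholder; check the vmod's docs for the exact API of your version):

import dynamic;

sub vcl_init {
    # Resolve the backend at run time via DNS instead of compiling it in.
    new d = dynamic.director(port = "80");
}

sub vcl_recv {
    set req.backend_hint = d.backend("backends.example.internal");
}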
I would like to add the Varnish-Cache version/signature to my incoming HTTP requests so I can log the Varnish version with requests on my webserver. I understand this information is available in obj.http.Server, but this doesn't work inside vcl_recv or vcl_miss:
set req.http.X-VARNISH-VERSION = obj.http.Server;
Apparently those vcl subs only have access to req and not obj. Is there any other way to get the version number into an HTTP request header?
I am using Varnish 3.0.2.
[Edit]
I am using a Varnish module as an integral component in my system, and as part of my automated testing I am running functional tests through the load balancer. I want my web servers (HHVM in this case) to know what version of Varnish is proxying the requests. Currently I am using a hardcoded string for this purpose, but I would like to automate this so I can distribute a non-hardcoded configuration to my Varnish servers.
Varnish only sets the Server header when generating a synthetic response (as in vcl_error), and in that case the header doesn't contain Varnish's version.
Please extend your question; I can't envision what you want to achieve with that (and why a fixed string header substitution won't fit your needs).
I've noticed an issue on one of my sites whereby my content pages (which shouldn't set any cookies, should all return "Cache-Control: public" with a max-age set, and don't require authorization) are not being cached.
My issue is that somehow HitPass objects are making it into my cache, removing the caching from that page. I need to debug this, but am confused at exactly how best to do this particularly as I'm unable to replicate the issue.
I notice that varnish gives me an ID beside the HitPass in the varnish log. I assume this is the varnish ID for the request that generated the HitPass, and that searching back in a varnish log would tell me exactly what was wrong with the response?
Would it be better to just remove the SetCookie header from pages that I want to cache? The problem is that vcl_fetch is called even if a URL is passed... Is there any way to tell in vcl_fetch whether or not the current request has been passed by vcl_recv?
Set-Cookie is indeed a reason why you get hit-for-pass objects in your cache. This is an important optimization for sites that aren't prepared for caching: a hit-for-pass lets Varnish go straight to the backend for each of these requests instead of stalling them and waiting for the response of the previous one.
I'm not sure exactly what you are wanting to debug. If it's the Set-Cookie, you should probably either remove it from the backend response or make your own rules about which responses to cache and which to ignore. If you still need the Set-Cookie and it has unique values, hit-for-pass is the best way to handle it.
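If you do decide to strip Set-Cookie on pages you know are safe to cache, here's a minimal Varnish 3 sketch; the URL pattern and the X-Pass marker header are examples, not part of Varnish itself:

sub vcl_pass {
    # Mark passed requests so vcl_fetch can tell them apart (header name is an example).
    set req.http.X-Pass = "true";
}

sub vcl_fetch {
    # Strip Set-Cookie on paths known to be cacheable, but only for non-passed requests.
    if (req.url ~ "^/content/" && !req.http.X-Pass) {
        unset beresp.http.Set-Cookie;
    }
}

The vcl_pass marker also gives vcl_fetch a way to tell whether the current request was passed by vcl_recv.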