We are using Fastly and its Varnish-based CDN to deliver content from our services. To distribute requests among several services, we use the following snippet:
sub vcl_recv {
  #FASTLY recv
  if (req.url.path ~ "^/services/") {
    set req.url = regsub(req.url, "/services/(.*?)/", "/");
  }
}
This works and lets us route /services/user/get to the /get endpoint of the user service.
However, using this snippet makes Fastly skip gzip compression entirely. This can be fixed by adding return(lookup):
sub vcl_recv {
  #FASTLY recv
  if (req.url.path ~ "^/services/") {
    set req.url = regsub(req.url, "/services/(.*?)/", "/");
  }
  return (lookup);
}
At this point gzip compression works. Unfortunately, all POST, PATCH, and DELETE requests now arrive at the backend as GET requests.
I tried studying the Varnish docs, and I am not sure whether (lookup) is truly what I need. Can you point me toward how this should be implemented?
Built-in VCL
Varnish uses the built-in VCL to provide a set of common rules. They serve as a safety net for the end user.
See https://github.com/varnishcache/varnish-cache/blob/master/bin/varnishd/builtin.vcl for the built-in VCL.
You should not load this code or this file yourself; it is executed automatically whenever you don't perform an explicit return statement in one of your subroutines.
Any explicit return statement will bypass the default behavior. Sometimes this is required to customize the behavior of Varnish. But sometimes it is counterproductive and causes undesired behavior.
The consequences of your return(lookup) statement
The sub vcl_recv {} subroutine, which is responsible for handling incoming requests, does an unconditional return(lookup).
This means every single request results in a cache lookup, even if the built-in VCL would rather pass these requests directly to the backend.
There are two strategies for deciding what to cache:
- Define rules for what can be cached and perform a return(pass) on all other requests
- Define rules for what cannot be cached and perform a return(lookup) on all other requests
Basically it's a blacklist/whitelist kind of thing.
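The first strategy could be sketched like this; the URL patterns are just hypothetical placeholders for illustration:

```vcl
sub vcl_recv {
    # Allow-list sketch: only cache paths we know are safe to cache.
    # "/static/" and "/assets/" are made-up examples.
    if (req.url ~ "^/(static|assets)/") {
        return (hash);  # called "lookup" in Varnish 3 and in Fastly VCL
    }
    # Everything else goes straight to the backend, uncached.
    return (pass);
}
```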
GET & HEAD vs the other request methods
The built-in VCL only allows GET and HEAD requests to be cached. Other request methods, such as POST, imply that state changes will occur. That's why they are not cached.
If you try performing a return(lookup) for a POST call, Varnish will internally change this request to a GET.
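You can see this in the built-in VCL itself; the relevant part of vcl_recv looks roughly like this (the exact wording varies a bit between Varnish versions):

```vcl
sub vcl_recv {
    if (req.method != "GET" && req.method != "HEAD") {
        /* We only deal with GET and HEAD by default */
        return (pass);
    }
}
```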
There are ways to cache POST calls, but in general you should not do that.
How you should structure your sub vcl_recv {}
I would advise you to remove the return(lookup) statement from your sub vcl_recv {} subroutine.
As explained, the built-in VCL will take over as soon as you exit your custom sub vcl_recv {}.
However, the built-in VCL will not be super helpful, because your website probably has some cookies in place.
It's important to strip off cookies in a sensible way, and keep them for requests that require cookies. For those pages, you can insert a return(pass) to ensure that these personalized requests aren't looked up in the cache but are passed directly to the backend.
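A minimal sketch of that idea, assuming a hypothetical /account/ section that genuinely needs cookies:

```vcl
sub vcl_recv {
    # Hypothetical example: personalized pages keep their cookies
    # and bypass the cache entirely.
    if (req.url ~ "^/account/") {
        return (pass);
    }
    # Everything else is assumed to be stateless: strip cookies so
    # the built-in VCL doesn't force a pass for these requests.
    unset req.http.Cookie;
    # No explicit return here, so the built-in vcl_recv still runs
    # and keeps its safety net (e.g. passing non-GET/HEAD requests).
}
```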
What about gzip?
It is possible to figure out why gzip stopped working. The varnishlog tool allows you to introspect a running system and filter out logs.
See https://feryn.eu/blog/varnishlog-measure-varnish-cache-performance/ for an extensive blog post I wrote about this topic.
Maybe varnishlog can help you find the reason why gzip compression stopped working at some point.
Related
How can I make Varnish work like a switch?
I need to consult an authentication service with the original client request. That authentication service checks, based on the original request, whether access is permitted, and replies simply with a status code and probably some more information in the headers. Based on that status code and header information from the auth service, I would like Varnish to serve content from different backends. Depending on the status code, the backend can vary, and I would like to add some additional headers before Varnish fetches the content.
Finally, Varnish should cache the result and reply to the client.
Yes, that's doable using some VCL and VMODs. For example, you could use the cURL VMOD during vcl_recv in order to trigger the HTTP request against the authentication service, check the response, and then use that info for backend selection and other caching decisions (that would be just simple VCL). A much better alternative would be the http VMOD, but that one is only available in Varnish Enterprise. In fact, a similar example to what you want to achieve is available in the linked documentation; see the 'HTTP Request' section.
In any case, it would be a good idea to minimise interactions with the authentication service using some high performance caching mechanism. For example, you could use the redis VMOD for that (or even Varnish itself!).
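A rough sketch of the cURL VMOD approach follows. The auth URL, backend definitions, and header names are all made up for illustration, and the exact VMOD function names and signatures should be checked against the libvmod-curl version you install:

```vcl
import curl;

backend app_main { .host = "127.0.0.1"; .port = "8080"; }
backend app_alt  { .host = "127.0.0.1"; .port = "8081"; }

sub vcl_recv {
    # Ask the (hypothetical) auth service about the original request.
    curl.get("http://auth.internal/check?uri=" + req.url);

    if (curl.status() == 200) {
        set req.backend_hint = app_main;
        # Forward extra info from the auth response to the backend fetch.
        set req.http.X-Auth-Info = curl.header("X-Auth-Info");
    } else {
        set req.backend_hint = app_alt;
    }
    curl.free();
}
```

Note that req.backend_hint is Varnish 4+ syntax; Varnish 3 uses req.backend instead.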
I want to display different HTML styles to users based on $_SERVER['HTTP_USER_AGENT']. How can I achieve this with Varnish settings, so that it keeps a specific cache for a specific user agent?
I know I can achieve something similar with JS, but that's not reliable enough for me; I want to do it server side.
The PHP I will use in my HTML to detect user agents will look like this:
<?php if ($_SERVER['HTTP_USER_AGENT'] == $target): ?>
<style>
  /* CSS */
</style>
<?php endif; ?>
How can I set up Varnish so it works neatly with this?
All you need to do is modify the vcl_hash subroutine to add more info to the cache key.
https://varnish-cache.org/docs/trunk/users-guide/vcl-hashing.html
sub vcl_hash {
  hash_data(req.http.User-Agent);
}
Be aware that there are no real standards followed for User-Agent strings, so the variations are huge even for what are identical browsers. I would expect roughly a 99% cache miss rate with this technique unless you control the User-Agents yourself (an internal system, etc.)
If you want a different cache for mobile devices, the following might be more successful as it tries to detect a mobile browser, then uses a normalised cache key value to improve hit rate:
sub vcl_hash {
  if (req.http.User-Agent ~ "mobile") {
    hash_data("mobile");
  }
}
Varnish supports that by default. You don't need to change Varnish's configuration. You only need to send the Vary header:
The Vary HTTP response header determines how to match future request headers to decide whether a cached response can be used rather than requesting a fresh one from the origin server.
In your specific case where you want it to vary based on the User-Agent, Varnish will understand that it needs to create different versions of the same object in cache for each different User-Agent.
Beware that varying your cache might reduce your hit rate considerably due to the number of variations of the User-Agent header. To avoid that, normalization is required. You can read more about normalizing User-Agent headers in Varnish's documentation.
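If your origin doesn't already emit the header, you could also force it at the Varnish level. A sketch in vcl_backend_response (Varnish 4+ syntax):

```vcl
sub vcl_backend_response {
    # Sketch: make cached objects vary per User-Agent when the
    # origin doesn't say so itself. Expect a low hit rate unless
    # you also normalize req.http.User-Agent in vcl_recv.
    if (!beresp.http.Vary) {
        set beresp.http.Vary = "User-Agent";
    }
}
```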
Do you know how to change the response header in CouchDB? Now it has Cache-control: must-revalidate; and I want to change it to no-cache.
I do not see any way to configure CouchDB's cache header behavior in its configuration documentation for general (built-in) API calls. Since this is not a typical need, the lack of configuration for this does not surprise me.
Likewise, last I tried even show and list functions (which do give custom developer-provided functions some control over headers) do not really leave the cache headers under developer control either.
However, if you are hosting your CouchDB instance behind a reverse proxy like nginx, you could probably override the headers at that level. Another option would be to add the usual "cache busting" hack of adding a random query parameter in the code accessing your server. This is sometimes necessary in the case of broken client cache implementations but is not typical.
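Since most of this thread is about Varnish: if your reverse proxy happens to be Varnish rather than nginx, overriding the header is a small rewrite in vcl_backend_response (a sketch for Varnish 4+):

```vcl
sub vcl_backend_response {
    # Sketch: replace CouchDB's Cache-Control at the proxy layer.
    if (beresp.http.Cache-Control ~ "must-revalidate") {
        set beresp.http.Cache-Control = "no-cache";
    }
}
```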
But taking a step back: why do you want to make responses no-cache instead of must-revalidate? I could see perhaps occasionally wanting to override in the other direction, letting clients cache documents for a little while without having to revalidate. Not letting clients cache at all seems a little curious to me, since the built-in CouchDB behavior using revalidated Etags should not yield any incorrect data unless the client is broken.
I'm wondering if my (possibly strange) use case is possible to implement in Varnish with VCL. My application depends on receiving responses from a cacheable API server with very low latencies (i.e. sub-millisecond if possible). The application is written in such a way that an "empty" response is handled appropriately (and is a valid response in some cases), and the API is designed in such a way that non-empty responses are valid for a long time (i.e. days).
So, what I would like to do is configure varnish so that it:
Attempts to look up (and return) a cached response for the given API call
On a cache miss, immediately return an "empty" response, and queue the request for the backend
On a future call to a URL which was a cache miss in #2, return the now-cached response
Is it possible to make Varnish act in this way using VCL alone? If not, is it possible to write a VMOD to do this (and if so, pointers, tips, etc, would be greatly appreciated!)
I don't think you can do it with VCL alone, but with VCL and some client logic you could manage it quite easily, I think.
In vcl_miss, return an empty document using error 200 and set a response header called X-Try-Again in the default case.
In the client app, when receiving an empty response with X-Try-Again set, request the same resource asynchronously but add a header called X-Always-Fetch to the request. Your app does not wait for the response or do anything with it once it arrives.
Also in vcl_miss, check for the presence of the same X-Always-Fetch header. If present, return (fetch) instead of the empty document. This will request the content from the back end and cache it for future requests.
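Put together, a sketch of the scheme in Varnish 3 syntax (where error 200 produces a synthetic response; Varnish 4+ would use return (synth) and vcl_synth instead) might look like this. The X-Try-Again and X-Always-Fetch header names are the ones invented above:

```vcl
sub vcl_miss {
    # A client retry carrying X-Always-Fetch really goes to the
    # backend and populates the cache for future requests.
    if (req.http.X-Always-Fetch) {
        return (fetch);
    }
    # Otherwise answer immediately with a synthetic empty 200.
    error 200 "";
}

sub vcl_error {
    if (obj.status == 200) {
        # Tell the client it may retry with X-Always-Fetch set.
        set obj.http.X-Try-Again = "1";
        synthetic "";
        return (deliver);
    }
}
```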
I also found this article which may provide some help though the implementation is a bit clunky to me compared to just using your client code: http://lassekarstensen.wordpress.com/2012/10/11/varnish-trick-serve-stale-content-while-refetching/
I'm afraid I am fairly new to Varnish, but I have a problem which I cannot find a solution to anywhere (yet): Varnish is set up to cache GET requests. We have some requests which have so many parameters that we decided to pass them in the body of the request. This works fine when we bypass Varnish, but when we go through Varnish (for caching), the request is passed on without the body, so the service behind Varnish fails.
I know we could use POST, but we want to GET data. I also know that Varnish can pass the request body on if we use pass mode, but as far as I can see, requests made in pass mode aren't cached. I've already put a hash into the URL so that, when things work, we will actually get the correct data from the cache (as far as the URL goes, the calls would otherwise all look the same).
The problem now is "just" how to rewrite vcl_fetch to pass on the request body to the webserver? Any hints and tips welcome!
Thanks in advance
Jon
I don't think you can, but, even if you can, it's very dangerous: Varnish won't store the request body in its cache or hash table, so it won't be able to see any difference between two requests with the same URI but different bodies.
I haven't heard of a VCL key to read the request body but, if it exists, you can pass it to req.hash to differentiate requests.
Anyway, a request body should only be used with POST or PUT... and POST/PUT requests should not be cached.
The request body is meant to send data to the server; a cache is used to get data...
I don't know the details, but I think there's a design issue in your process...
I am not sure I got your question right, but if you are trying to interact with the request body in some way, this is not possible with VCL. You do not have any VCL variable/subroutine to do this.
You can find the list of variables available in VCL here (or in man vcl) :
https://github.com/varnish/Varnish-Cache/blob/master/lib/libvcl/generate.py#L105
I agree with Gauthier, you seem to have a design issue in your system.
'Hope that helps.