Varnish never sending If-Not-Modified - varnish

I'm trying to use varnish to cache rpms and other giant binaries. What I would've expected is that when an object is expired in the cache varnish would send a request with If-Not-Modified to the backend and then assuming the object didn't change, varnish would refresh the ttl on the local cached object without downloading a new one. I wrote a test backend to generate specific request (set small max-age and whatnot, as well as see the header varnish sends) but I never get anything else then full fetch. If-Not-Modified is never sent. My VCL is basically the default VCL. I tried playing around with setting small ttl/grace but never got any interesting behavior.
Is varnish even able to do what I want it to ? If so has anyone done anything similar and can give tips ?

The request sent to the backend when an object is expired is the one that Varnish receives from the client.
So when testing your setup, are you sending an If-Not-Modified header in your requests to Varnish?
Have a look at https://www.varnish-software.com/wiki/content/tutorials/varnish/builtin_vcl.html to see what the built in VCL is.
Under vcl_backend_fetch, which will be called if there is no object in the cache, you can see there is no complex logic around stale objects, it is just passing on the request as is.

First of all, quite a bit has happened in varnish-cache since this question was posted. I am answering the questions for varnish-cache 6.0 and later:
The behavior the OP expects is how varnish should behave now if the backend returns the Last-Modified and/or Etag headers.
Obviously, an object can only be refreshed if it still exist in cache. This is what beresp.keep is for. It extends the time an object is kept in cache after ttl and grace have expired. Note that objects are also LRU evicted if the cache is too small to keep all objects for their maximum lifetime.
On the comment by #maxschlepzig, it might be based on a misunderstanding:
When an object is not in cache but is to be cached, varnish can not forward the client request's conditional headers (If-Modified-Since, If-None-Match) because a 304 response would not be good for caching (it has not body and is relevant only for a particular request). Instead, varnish strips to conditional headers for this case to (potentially) get a 200 response with an object to put into cache.
As explained above, for a subsequent backend request after the ttl has expired, the conditional headers are constructed based on the cached response. The conditional headers from the client are not used for this case either.
All of this above applies for the case that an object is to be cached at all (Fetch, Hit-for-Miss (as created by setting beresp.uncacheable)).
For Pass and Hit-for-Pass (as created by return(pass(duration)) in vcl_backend_response), the client conditional headers are passed to the backend.

Related

How to instruct varnish to generate the cache key based on response header data

I need to do cache key generation/store based on the response header received from the backend not based on the URL requested from the client-side.
The main reason for doing so is: I have a backend logic to reply to the client with some different data if requested things are not available.
Ex:
Request: example.com/foo/102030?names=test1
now, my backend checks for is test1 is present for 102030, if not it checks whether 102030 has a special tag = Y : basically tells that have to look for some other matching object.
so, in that case, backed reply's to clients with data again, which can be accessible with example.com/foo/000000?names=test1.
So, now the problem is if some other request comes with example.com/foo/000000?names=test1, varnish considers this as a different request based on URL but in actuality, I need to serve the same data which is already present in Cache with example.com/foo/000000?names=test1.
Currently, I do Ban using some regex syntax from the backend, so in that case, I can easily invalidate the object which stored with /foo/000000?names=test1 not the other one.
So, is there a way through which I can store the cache key based on the response header info?
There's no way you can do this unfortunately. Only request information can be used to create the cache key.
That is by design, because incoming requests only have their own request properties they can present to Varnish to identify the resource they wish to retrieve.

Http Cache-Control Clashes with Multiple Browser Tabs/Sessions

I am using Angular 5, with a Hapi NodeJs backend. When I send the "cache-control: private, max-age=3600" header in the http response the response is cached correctly. The problem is that when I make the exact same request in a different tab and with connections to different database the data cached in the browser tab 1 is shared with browser tab 2 when it make the same request. Is there a way for the cache to only be used per application instance using the cache-control header?
Same Webapp in Browser Tab 1. Same Domain.
Database 1
Same Webapp in Browser Tab 2. Same Domain.
Database 2
User agent needs to somehow differentiate these cache entries. Probably your best option is to adjust a cache entry key (add subdomain, path or query parameter that identifies a database to a URI).
You can also use custom HTTP header (such as X-Database), in pair with Vary HTTP header but in this case user agent may store only single response at a time because it is still uses URI as cache key and Vary HTTP header for response validation only. Relevant excerpt from The State of Browser Caching, Revisited article by Mark Nottingham:
The one hiccup I saw was that all tested browser caches would only store one variant at a time; i.e., if your responses contain Vary: Foo and you get two requests, the first with Foo: 1 and the second with Foo: 2, the second response will evict the first from cache.
Whether that’s a problem depends on how you use Vary; if you want to reuse cached responses with old values, it might reduce efficiency. However, that doesn’t seem like a common use case;
For more information check RFC 7234 Hypertext Transfer Protocol (HTTP/1.1): Caching, and Understanding The Vary Header article by Andrew Betts

How to change response header (cache) in CouchDB?

Do you know how to change the response header in CouchDB? Now it has Cache-control: must-revalidate; and I want to change it to no-cache.
I do not see any way to configure CouchDB's cache header behavior in its configuration documentation for general (built-in) API calls. Since this is not a typical need, lack of configuration for this does not surprise me.
Likewise, last I tried even show and list functions (which do give custom developer-provided functions some control over headers) do not really leave the cache headers under developer control either.
However, if you are hosting your CouchDB instance behind a reverse proxy like nginx, you could probably override the headers at that level. Another option would be to add the usual "cache busting" hack of adding a random query parameter in the code accessing your server. This is sometimes necessary in the case of broken client cache implementations but is not typical.
But taking a step back: why do you want to make responses no-cache instead of must-revalidate? I could see perhaps occasionally wanting to override in the other direction, letting clients cache documents for a little while without having to revalidate. Not letting clients cache at all seems a little curious to me, since the built-in CouchDB behavior using revalidated Etags should not yield any incorrect data unless the client is broken.

Return a synthetic response then fetch and cache object in Varnish?

I'm wondering if my (possibly strange) use case is possible to implement in Varnish with VCL. My application depends on receiving responses from a cacheable API server with very low latencies (i.e. sub-millisecond if possible). The application is written in such a way that an "empty" response is handled appropriately (and is a valid response in some cases), and the API is designed in such a way that non-empty responses are valid for a long time (i.e. days).
So, what I would like to do is configure varnish so that it:
Attempts to look up (and return) a cached response for the given API call
On a cache miss, immediately return an "empty" response, and queue the request for the backend
On a future call to a URL which was a cache miss in #2, return the now-cached response
Is it possible to make Varnish act in this way using VCL alone? If not, is it possible to write a VMOD to do this (and if so, pointers, tips, etc, would be greatly appreciated!)
I don't think you can do it with VCL alone, but with VCL and some client logic you could manage it quite easily, I think.
In vcl_miss, return an empty document using error 200 and set a response header called X-Try-Again in the default case.
In the client app, when receiving an empty response with X-Try-Again set, request the same resource asynchronously but add a header called X-Always-Fetch to the request. Your app does not wait for the response or do anything with it once it arrives.
Also in vcl_miss, check for the presence of the same X-Always-Fetch header. If present, return (fetch) instead of the empty document. This will request the content from the back end and cache it for future requests.
I also found this article which may provide some help though the implementation is a bit clunky to me compared to just using your client code: http://lassekarstensen.wordpress.com/2012/10/11/varnish-trick-serve-stale-content-while-refetching/

Varnish / VCL gurus: How to pass request body using Varnish fetch?

I'm afraid I am fairly new to varnish but I have a problem whch I cannot find a solution to anywhere (yet): Varnish is set up to cache GET requests. We have some requests which have so many parameters that we decided to pass them in the body of the request. This works fine when we bypass Varnish but when we go through Varnish (for caching), the request is passed on without the body, so the service behind Varnish fails.
I know we could use POST, but we want to GET data. I also know that Varnish CAN pass the request body on if we use pass mode but as far as I can see, requests made in pass mode aren't cached. I've already put a hash into the url so that when things work, we will actually get the correct data from cache (as far as the url goes the calls would otherwise all look to be the same).
The problem now is "just" how to rewrite vcl_fetch to pass on the request body to the webserver? Any hints and tips welcome!
Thanks in advance
Jon
I don't think you can, but, even if you can, it's very dangerous : Varnish won't store request body into cache or hash table, so it won't be able to see any difference between 2 requests with same URI and different body.
I haven't heard about a VCL key to read request body but, if it exists, you can pass it to req.hash to differenciate requests.
Anyway, request body should only be used with POST or PUT...and POST/PUT requests should not be cached.
Request body is supposed to send data to the server. A cache is used to get data...
I don't know the details, but I think there's a design issue in your process...
I am not sure I got your question right but if you try to interact with the request body in some way this is not possible with VCL. You do not have any VCL variable/subroutine to do this.
You can find the list of variables available in VCL here (or in man vcl) :
https://github.com/varnish/Varnish-Cache/blob/master/lib/libvcl/generate.py#L105
I agree with Gauthier, you seem to have a design issue in your system.
'Hope that helps.

Resources