I'm afraid I am fairly new to Varnish, but I have a problem which I cannot find a solution to anywhere (yet): Varnish is set up to cache GET requests. We have some requests with so many parameters that we decided to pass them in the body of the request. This works fine when we bypass Varnish, but when we go through Varnish (for caching), the request is passed on without the body, so the service behind Varnish fails.
I know we could use POST, but we want to GET data. I also know that Varnish CAN pass the request body on if we use pass mode, but as far as I can see, requests made in pass mode aren't cached. I've already put a hash into the URL so that when things work, we will actually get the correct data from the cache (judging by the URL alone, the calls would otherwise all look the same).
The problem now is "just" how to rewrite vcl_fetch to pass the request body on to the webserver. Any hints and tips welcome!
Thanks in advance
Jon
I don't think you can, but even if you could, it would be very dangerous: Varnish won't store the request body in its cache or hash table, so it won't be able to see any difference between two requests with the same URI and different bodies.
I haven't heard of a VCL variable for reading the request body but, if one exists, you can pass it to req.hash to differentiate requests.
Anyway, a request body should only be used with POST or PUT... and POST/PUT requests should not be cached.
A request body is meant for sending data to the server; a cache is for getting data...
I don't know the details, but I think there's a design issue in your process...
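For what it's worth, newer Varnish releases (4.1 and later) can buffer a request body and forward it to the backend via std.cache_req_body(); combined with the OP's URL-embedded hash of the parameters, cached GETs stay distinguishable. A minimal sketch, assuming VCL 4 syntax and an illustrative size limit:

```
vcl 4.0;
import std;

sub vcl_recv {
    if (req.method == "GET" && req.http.Content-Length) {
        # Buffer up to 100KB of request body so Varnish forwards it to
        # the backend on a miss (std.cache_req_body is available since
        # Varnish 4.1; requests with larger bodies fail).
        std.cache_req_body(100KB);
    }
}
```

Note that the body itself still does not take part in hashing; the hash embedded in the URL is what keeps otherwise-identical GETs apart in the cache.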
I'm not sure I understood your question correctly, but if you are trying to interact with the request body in some way, that is not possible with VCL: there is no VCL variable or subroutine for it.
You can find the list of variables available in VCL here (or in man vcl):
https://github.com/varnish/Varnish-Cache/blob/master/lib/libvcl/generate.py#L105
I agree with Gauthier, you seem to have a design issue in your system.
'Hope that helps.
I want to push a specific URL to Varnish so that it is cached after processing. The flow:
process the image
when finished, "push" the URL referencing the image to Varnish so that it gets cached
I do not want to wait for client requests to populate the cache; the image should be ready to serve with high performance as soon as processing finishes. Is this possible?
I could send an internal GET request like a standard client and have the response cached, but I would prefer to define e.g. a PUT request in the Varnish config and have it cached without returning the response in that process.
Your only option is an internal HEAD (better than GET; it will be internally converted to a GET by Varnish when submitting the request to the backend side). The PUT approach is not possible, at least not without implementing a VMOD for it, and it probably won't be a simple one.
It seems Node.js only allows URLs of at most 80KB.
I need to pass longer URLs to an internal application. Is it possible to bypass that limitation without recompiling Node.js (which is impossible for me on this setup)?
You cannot change the limit without modifying the http_parser.h file.
You will need a better way of sending the data to your application, whether in the request body or in a file within the request body. Without more information it is difficult to propose a solution to your problem.
I'm trying to use Varnish to cache RPMs and other giant binaries. What I would have expected is that when an object expires in the cache, Varnish would send a request with If-Modified-Since to the backend and then, assuming the object didn't change, refresh the TTL on the locally cached object without downloading a new one. I wrote a test backend to generate specific responses (setting a small max-age and whatnot, as well as inspecting the headers Varnish sends), but I only ever get a full fetch; If-Modified-Since is never sent. My VCL is basically the default VCL. I tried playing around with small ttl/grace settings but never got any interesting behavior.
Is Varnish even able to do what I want? If so, has anyone done anything similar and can give tips?
The request sent to the backend when an object is expired is the one that Varnish receives from the client.
So when testing your setup, are you sending an If-Modified-Since header in your requests to Varnish?
Have a look at https://www.varnish-software.com/wiki/content/tutorials/varnish/builtin_vcl.html to see what the built in VCL is.
Under vcl_backend_fetch, which will be called if there is no object in the cache, you can see there is no complex logic around stale objects, it is just passing on the request as is.
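For reference, the relevant part of the built-in VCL (paraphrased from the page linked above) is essentially:

```
sub vcl_backend_fetch {
    # No conditional-request logic here: on a miss or expiry the
    # client's request is forwarded to the backend as-is.
    return (fetch);
}
```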
First of all, quite a bit has happened in varnish-cache since this question was posted. I am answering the questions for varnish-cache 6.0 and later:
The behavior the OP expects is how Varnish should behave now, provided the backend returns Last-Modified and/or ETag headers.
Obviously, an object can only be refreshed if it still exists in the cache. This is what beresp.keep is for: it extends the time an object is kept in the cache after ttl and grace have expired. Note that objects are also LRU-evicted if the cache is too small to keep all objects for their maximum lifetime.
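A sketch of how this can be configured in vcl_backend_response (the durations are illustrative, not recommendations):

```
sub vcl_backend_response {
    # Serve fresh for an hour, allow a short grace window, and keep the
    # stale object around for a week so Varnish can revalidate it with
    # If-Modified-Since / If-None-Match instead of doing a full refetch
    # (this requires the backend to send Last-Modified and/or ETag).
    set beresp.ttl   = 1h;
    set beresp.grace = 10m;
    set beresp.keep  = 7d;
}
```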
On the comment by @maxschlepzig, it might be based on a misunderstanding:
When an object is not in the cache but is to be cached, Varnish cannot forward the client request's conditional headers (If-Modified-Since, If-None-Match), because a 304 response would not be good for caching (it has no body and is relevant only for a particular request). Instead, Varnish strips the conditional headers in this case in order to (potentially) get a 200 response with an object to put into the cache.
As explained above, for a subsequent backend request after the ttl has expired, the conditional headers are constructed based on the cached response. The conditional headers from the client are not used for this case either.
All of the above applies when an object is to be cached at all (Fetch, or Hit-for-Miss as created by setting beresp.uncacheable).
For Pass and Hit-for-Pass (as created by return(pass(duration)) in vcl_backend_response), the client conditional headers are passed to the backend.
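The two cases can be sketched in vcl_backend_response; the Cache-Control and Set-Cookie triggers below are just illustrative conditions, not part of the answer above:

```
sub vcl_backend_response {
    if (beresp.http.Cache-Control ~ "private") {
        # Hit-for-Pass: remember for two minutes that this object
        # bypasses the cache; client conditional headers are forwarded
        # to the backend.
        return (pass(120s));
    }
    if (beresp.http.Set-Cookie) {
        # Hit-for-Miss: do not cache now, but keep trying on later
        # requests; client conditional headers are stripped on the fetch.
        set beresp.uncacheable = true;
    }
    return (deliver);
}
```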
Do you know how to change the response header in CouchDB? Right now it has Cache-Control: must-revalidate, and I want to change it to no-cache.
I do not see any way to configure CouchDB's cache header behavior in its configuration documentation for general (built-in) API calls. Since this is not a typical need, lack of configuration for this does not surprise me.
Likewise, last I tried even show and list functions (which do give custom developer-provided functions some control over headers) do not really leave the cache headers under developer control either.
However, if you are hosting your CouchDB instance behind a reverse proxy like nginx, you could probably override the headers at that level. Another option would be to add the usual "cache busting" hack of adding a random query parameter in the code accessing your server. This is sometimes necessary in the case of broken client cache implementations but is not typical.
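If the reverse proxy happens to be Varnish (the subject of this thread), overriding the header is a one-liner; this is a hypothetical sketch of the proxy-level override, not CouchDB configuration:

```
sub vcl_backend_response {
    # Replace CouchDB's Cache-Control on the way through the proxy.
    set beresp.http.Cache-Control = "no-cache";
}
```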
But taking a step back: why do you want to make responses no-cache instead of must-revalidate? I could see perhaps occasionally wanting to override in the other direction, letting clients cache documents for a little while without having to revalidate. Not letting clients cache at all seems a little curious to me, since the built-in CouchDB behavior using revalidated Etags should not yield any incorrect data unless the client is broken.
I'm wondering if my (possibly strange) use case is possible to implement in Varnish with VCL. My application depends on receiving responses from a cacheable API server with very low latencies (i.e. sub-millisecond if possible). The application is written in such a way that an "empty" response is handled appropriately (and is a valid response in some cases), and the API is designed in such a way that non-empty responses are valid for a long time (i.e. days).
So, what I would like to do is configure varnish so that it:
Attempt to look up (and return) a cached response for the given API call
On a cache miss, immediately return an "empty" response and queue the request for the backend
On a future call to a URL that was a cache miss in #2, return the now-cached response
Is it possible to make Varnish act this way using VCL alone? If not, is it possible to write a VMOD to do this (and if so, pointers, tips, etc. would be greatly appreciated)?
I don't think you can do it with VCL alone, but with VCL and some client logic you could manage it quite easily, I think.
In vcl_miss, return an empty document using error 200 and set a response header called X-Try-Again in the default case.
In the client app, when receiving an empty response with X-Try-Again set, request the same resource asynchronously but add a header called X-Always-Fetch to the request. Your app does not wait for the response or do anything with it once it arrives.
Also in vcl_miss, check for the presence of the same X-Always-Fetch header. If present, return (fetch) instead of the empty document. This will request the content from the back end and cache it for future requests.
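The three steps above could be sketched like this in pre-4.0 VCL (the X-Try-Again / X-Always-Fetch header names are the hypothetical ones from this answer):

```
sub vcl_miss {
    if (req.http.X-Always-Fetch) {
        # Background refetch requested by the client: go to the
        # backend and cache the result for future requests.
        return (fetch);
    }
    # Default case: synthesize an empty 200 right away.
    error 200 "OK";
}

sub vcl_error {
    if (obj.status == 200) {
        # Tell the client it should retry asynchronously.
        set obj.http.X-Try-Again = "1";
        synthetic "";
        return (deliver);
    }
}
```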
I also found this article which may provide some help though the implementation is a bit clunky to me compared to just using your client code: http://lassekarstensen.wordpress.com/2012/10/11/varnish-trick-serve-stale-content-while-refetching/