How to ignore big cookies in Varnish

I want to ignore requests with a large Cookie header. We have some requests dropped in Varnish due to "BogoHeader Header too long: Cookie: xyz". How can this be done in VCL? I didn't find any len, length or strlen function in VCL. I know that it can be done in the vcl_recv phase.

The strlen() feature won't fix your problem: Varnish discards the request due to the large Cookie header before vcl_recv is executed. If you don't want those requests to be discarded, you need to check and adjust some runtime parameters: http_req_hdr_len, http_req_size, http_resp_hdr_len, etc.
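For example, a minimal sketch of raising those limits when starting varnishd (the listen address, VCL path and sizes here are assumptions; tune them to your workload, keeping http_req_size larger than any single header):

varnishd -a :80 -f /etc/varnish/default.vcl \
    -p http_req_hdr_len=65536 \
    -p http_req_size=98304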
In any case, if you are still interested in the strlen() feature, it would be trivial to add it to the std VMOD, but that support doesn't exist at the moment. You could consider using an existing VMOD that includes utilities like strlen() (or implement one on your own), but that's probably too much work. Finally, you could consider a hacky approach using just VCL and a regexp:
sub vcl_recv {
    if (req.http.Cookie ~ "^.{1024,}$") {
        # e.g. drop the oversized Cookie header (one possible action)
        unset req.http.Cookie;
    }
}

Related

Varnish return(fetch/deliver) vs chunked encoding

I'm trying to get Varnish's cached response to be chunked... (that's possible, right?)
I have the following scenario:
1 - cache is clean, good to go (service varnish restart)
2 - access the www.mywebsite.com/page for the first time
(no content-length is returned, and chunking is there, great!)
3 - the next time I access the page (a simple reload) it is served from cache... and now I get this:
(now we have content-length... which means no chunking :( not great!)
After reading some Varnish docs/blogs (and this: http://book.varnish-software.com/4.0/chapters/VCL_Basics.html), looks like there are two "last" returns: return(fetch) or return(deliver).
When forcing a return(fetch), the chunked encoding works... but it also means that the request won't be cached, right? While return(deliver) caches correctly but adds the content-length header.
I've tried adding these to my default.vcl file:
set beresp.do_esi = true; (at vcl_backend_response stage)
and
unset beresp.http.content-length; (at different stages, without success)
So.. how to have Varnish caching working with Transfer-Encoding: chunked?
Thanks for your attention!
Is there a reason why you want to send it chunked? Chunked transfer encoding is kind of a clumsy workaround for when the content length isn't known ahead of time. What's actually happening here is that Varnish is able to compute the length of the gzipped content after caching it for the first time, and so it doesn't have to use the workaround. Rest assured that you are not missing out on any performance gains in this scenario.
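You can verify this yourself with curl (using the URL from the question) by dumping only the response headers on the first and second request:

curl -s -D - -o /dev/null http://www.mywebsite.com/page | grep -iE 'content-length|transfer-encoding'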

AWS CloudFront - Cache Behaviour - Path Pattern format

The documentation for the Path Pattern property is not exactly exhaustive.
The pattern to which this cache behavior applies. For example, you can specify images/*.jpg
Now, I understand the path pattern could be things like images/* and other simple variations but, can it be something like /path/*/latest/?
I can save that pattern but it doesn't seem to work as expected. It looks like CloudFront ignores everything after the * and caches everything below path/*, regardless of the fact that /path/*/latest is the top behaviour (order 0) with a TTL of zero.
To further clarify, I have a /path/* that I want to be served (and cached) by CloudFront, with the exception of one particular subpath, say path/*/latest, that can be served by CloudFront but shouldn't be cached (hence I gave it a TTL of zero).
The problem may be with the ordering of the behaviours: does path/*/latest occur before /path/*? Cache behaviors are processed in the order in which they're listed in the CloudFront console.
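For the setup described above, the behaviours would need to be listed in this order in the console (precedence 0 is matched first; a sketch of the ordering, not a full configuration):

0: path/*/latest   (TTL 0, matched first, never cached)
1: path/*          (normal caching)
   Default (*)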

Modify Header server: ArangoDB

Something that seems easy, but I haven't found a way to do it. Is it possible to change the header sent in a response
server: ArangoDB
to something else (in order to be less verbose and more secure)?
Also, I need to store a large string (a very long URL + lots of information) in a document, but what is the max length of a joi.string?
Thx,
The internal string limit in V8 (the JavaScript engine used by ArangoDB) is around 256 MB. Thus 256 MB will be the absolute maximum string length that can be used from JavaScript code that's executed in ArangoDB.
Regarding maximum URL lengths as mentioned above: URLs should not get too long, because very long URLs may not be portable across browsers. In practice several browsers enforce URL length limits of around 64 K, so URLs should definitely not get longer than that. I would recommend using much shorter URLs, though, and passing huge payloads in the HTTP request body instead. This also means you may need to change from HTTP GET to HTTP POST or HTTP PUT, but it's at least portable.
Finally, regarding the HTTP response header "Server: ArangoDB" that is sent by ArangoDB in every HTTP response: starting with ArangoDB 2.8, there is an option to turn this off: --server.hide-product-header true. This option is not available in the stable 2.7 branch yet.
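For example, using the option named above on the command line:

arangod --server.hide-product-header true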
No, there currently is no configuration to disable the server: header in ArangoDB.
I would recommend putting NGINX (or a similar HTTP proxy) in front to achieve that (and other possible hardening for your service).
The implementation of the Server header can be found in lib/Rest/HttpResponse.cpp.
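A minimal NGINX sketch of that approach (the upstream address is an assumption; 8529 is ArangoDB's default port):

server {
    listen 80;
    server_tokens off;                     # don't advertise the nginx version either
    location / {
        proxy_pass http://127.0.0.1:8529;  # assumed local ArangoDB endpoint
        proxy_hide_header Server;          # strip the upstream "server: ArangoDB" header
    }
}

Note that nginx will then send its own "Server: nginx" header; replacing it with an arbitrary value needs something like the headers-more module.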
Regarding Joi -
I only found how to specify a string length in joi - not what its maximum could be.
I guess the general JavaScript limit for strings should be taken into account.
However, it rather seems that you shouldn't exceed the practical limit of 2000 characters for URLs, so that should be the effective limit.
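A small sketch of enforcing that in a Joi schema (the 2000-character cap is the practical URL limit mentioned above, not a Joi maximum):

var joi = require('joi');

// hypothetical schema: reject URLs longer than 2000 characters
var schema = joi.object({
    url: joi.string().max(2000).required()
});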

Performance difference between res.json() and res.end()

I want to send a JSON response using Node and Express. I'm trying to compare the performance of res.end and res.json for this purpose.
Version 1: res.json
res.json(anObject);
Version 2: res.end
res.setHeader('Content-Type', 'application/json');
res.end(JSON.stringify(anObject));
Running some benchmarks I can see that the second version is almost 15% faster than the first one. Is there a particular reason I have to use res.json if I want to send a JSON response?
Yes, it is still very desirable to use res.json() despite the overhead.
setHeader and end come from the native http module. By using them, you're effectively bypassing a lot of Express's added features, hence the moderate speed bump in a benchmark.
However, benchmarks in isolation don't tell the whole story. json is really just a convenience method that sets the Content-Type and then calls send. send is an extremely useful function because it:
Supports HEAD requests
Sets the appropriate Content-Length header to ensure that the response does not use Transfer-Encoding: chunked, which wastes bandwidth.
Most importantly, provides ETag support automatically, allowing conditional GETs.
The last point is the biggest benefit of json and probably the biggest part of the 15% difference. Express calculates a CRC32 checksum of the JSON string and adds it as the ETag header. This allows a browser making subsequent requests for the same resource to issue a conditional GET (the If-None-Match header), and your server will respond with 304 Not Modified if the JSON string is the same, meaning the actual JSON need not be sent over the network again.
This can add up to substantial bandwidth (and thus time) savings. Because the network is a much larger bottleneck than CPU, these savings are almost sure to eclipse the relatively small CPU savings you'd get from skipping json().
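You can watch the conditional-GET flow with curl (hypothetical endpoint; substitute the ETag value returned by the first request):

curl -s -D - -o /dev/null http://localhost:3000/data
curl -s -D - -o /dev/null -H 'If-None-Match: "<etag-from-above>"' http://localhost:3000/data

The second request should come back as 304 Not Modified with an empty body.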
Finally, there's also the issue of bugs. Your "version 2" example has a bug.
JSON is stringified as UTF-8, and Chrome (contrary to spec) does not default to handling application/json responses as UTF-8; you need to supply a charset. This means non-ASCII characters will be mangled in Chrome. This issue has already been discovered by Express users, and Express sets the proper header for you.
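If you do want to stick with the res.end() version, the fix is to supply the charset yourself:

res.setHeader('Content-Type', 'application/json; charset=utf-8');
res.end(JSON.stringify(anObject));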
This is one of the many reasons to be careful of premature/micro-optimization. You run the very real risk of introducing bugs.

Xively string data

I would like to know if it is possible to send a block of data, like 128 bytes (a Motorola SREC, for example), to a Xively server. I need this to do firmware upgrades / download images to my Arduino-connected device. As far as I can see, one can only get datapoints / values?
A value of a datapoint can be a string. Firmware updates can be implemented using Xively API V2 by just storing string-encoded binaries as datapoints, provided that the size is small.
You can probably make some use of timestamps for rolling back to versions that did work, or something similar. Also, you probably want to use the datapoints endpoint so you can just grab the entire response body without needing to parse anything.
/v2/feeds/<feed_id>/datastreams/<datastream_id>/datapoints/<timestamp>.csv
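For example, fetching a stored firmware blob with cURL (the feed ID, datastream ID, timestamp and API key are placeholders):

curl -sS -H "X-ApiKey: YOUR_API_KEY" \
  "https://api.xively.com/v2/feeds/FEED_ID/datastreams/firmware/datapoints/2015-01-01T00:00:00Z.csv"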
I suppose you will need to implement this in the bootloader, which needs to be very small, and maybe you can actually skip parsing the HTTP headers and only verify whether the body looks right (i.e. has some magic byte that you put in there; you can also try to checksum it). This would be a little opportunistic, but might be okay for an experiment. You should probably add Xively device provisioning to this as well, but I wouldn't try implementing everything right away.
It is, however, quite challenging to implement reliable firmware updates, and there are several papers out there which you should read. Some suggest making the device's behaviour as primitive as you can, avoiding any logic and making it rely on what the server tells it to do.
To actually store the firmware string you can use cURL, in two steps (sketched below):
Add the first version into a new datastream
Update it with a new version
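A rough sketch of both steps (feed ID, API key and the SREC payload are placeholders; the JSON layout follows Xively API V2):

# 1. add the first version in a new datastream
curl -sS -X PUT "https://api.xively.com/v2/feeds/FEED_ID.json" \
  -H "X-ApiKey: YOUR_API_KEY" \
  -d '{"version":"1.0.0","datastreams":[{"id":"firmware","current_value":"S00600004844521B..."}]}'

# 2. update with a new version (adds a new datapoint)
curl -sS -X POST "https://api.xively.com/v2/feeds/FEED_ID/datastreams/firmware/datapoints.json" \
  -H "X-ApiKey: YOUR_API_KEY" \
  -d '{"datapoints":[{"at":"2015-01-02T00:00:00Z","value":"S00600004844521B..."}]}'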
