What does Varnish hit-for-pass mean?

Varnish version 3 has several objects for different operations.
For example, pass is used when it has to retrieve data from the backend, and hit is used when it finds the requested content in the cache.
But I can't understand the usage of hit-for-pass. When does Varnish use it? I haven't found any useful material on the web that makes it clear.

A hit_for_pass object is made to optimize the fetch procedure against a backend server.
For ordinary cache misses, Varnish will queue all clients requesting the same cache object and send a single request to the backend. This is usually quickest, letting the backend work on a single request instead of swamping it with n requests at the same time.
Remember that some backends use a lot of time preparing an object; 10 seconds is not uncommon. If this is the front page HTML and you have 3000 req/s against it, sending just one backend request makes a lot of sense.
The issue arises when, after Varnish has fetched the object, it sees that it can't be cached. Reasons for this can be that the backend sends "Cache-Control: max-age=0", or (more often) a Set-Cookie header. In this case you have up to 30,000 clients (3k req/s * 10 sec) sitting idle in the queue, and each of them must wait for its own slow, one-at-a-time backend request to complete. This will ruin your site response time.
So Varnish saves the decision that this request cannot be cached by creating a hit_for_pass object.
On the next request for the same URL, the cache lookup will return a hit_for_pass object. This signals that multiple fetches may be done at the same time. Your backend might not be too happy about it, but at least Varnish isn't queuing the clients for no reason.
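As a rough illustration of the queuing behaviour described in this answer, here is a toy model in JavaScript. It is not Varnish code or VCL: createCache, lookup and the cacheable flag are made-up names, and real hit_for_pass objects also carry a TTL that this sketch ignores.

```javascript
// Toy model only: all names here are hypothetical, not Varnish internals.
// fetchFromBackend(url) is assumed to resolve to { body, cacheable }.
function createCache(fetchFromBackend) {
  const objects = new Map();  // url -> { body } or { hitForPass: true }
  const inFlight = new Map(); // url -> promise of the single queued fetch

  async function lookup(url) {
    const entry = objects.get(url);
    if (entry && entry.hitForPass) {
      // Known to be uncacheable: go straight to the backend, no queuing.
      return (await fetchFromBackend(url)).body;
    }
    if (entry) return entry.body; // ordinary cache hit

    if (inFlight.has(url)) {
      // Ordinary miss with a fetch already running: wait in the queue.
      const first = await inFlight.get(url);
      if (first.cacheable) return first.body;
      // The response turned out to be uncacheable, so fetch our own copy.
      return (await fetchFromBackend(url)).body;
    }

    const pending = fetchFromBackend(url);
    inFlight.set(url, pending);
    try {
      const result = await pending;
      // Remember the decision: cache the body, or record "do not queue".
      objects.set(url, result.cacheable ? { body: result.body } : { hitForPass: true });
      return result.body;
    } finally {
      inFlight.delete(url);
    }
  }

  return { lookup, objects };
}
```

Once the marker is stored, later lookups for the same URL each call the backend in parallel instead of waiting behind one another.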

Related

Varnish: serve multiple simultaneous users a single large file with one backend request/connection

I'm not sure if this is possible with Varnish, but I have a backend server that generates large (1-2GB) files, but is located on a relatively slow connection.
Is it possible to configure Varnish (running on a remote machine with a fast connection) to serve multiple users at the same time, but only opening one connection to the backend?
For example, user 1 starts downloading a file through Varnish from the backend, and partway through, user 2 also begins a download. Rather than opening a new connection, could Varnish serve everything it has already cached (say, 100MB), and then once user 2 "catches up", it keeps downloading at the speed of the backend connection (because Varnish is serving the same content to both users). After that initial period, that file would be fully cached on Varnish and is served fast.
Is this something Varnish could be configured to do (or is there a better proxy/cache software for this use case).

What you're describing is called request coalescing and this is a standard Varnish feature.
As long as the object is stored in cache, the streamed content that is partially stored in cache will be consumed by all clients consuming that resource.
I don't think that opening up a 2nd connection is really a problem, but I guess your thought process is avoiding a cache miss for partially cached data. No worries though, Varnish has you covered.

Faster HTTP scraping per POST request?

I'm writing an API that returns an array of redirects for any given page:
router.post('/trace', function (req, res) {
    if (!req.body.link)
        return res.status(400).send(""); // error: no link provided!
    console.log("\tapi/trace()", req.body.link);
    var redirects = [];

    // send whatever redirects were collected and end the response
    function exit(goodbye) {
        if (goodbye)
            console.log(goodbye);
        res.status(200).send(JSON.stringify(redirects)); // end
    }

    // follow the redirect chain one hop at a time
    function getRedirect(link) {
        request({ url: link, followRedirect: false }, function (err, response, body) {
            if (err)
                exit(err);
            else if (response.headers.location) {
                redirects.push(response.headers.location);
                getRedirect(response.headers.location);
            }
            else
                exit(); // all done!
        });
    }

    getRedirect(req.body.link);
});
and here is the corresponding browser request:
$.post('/api/trace', { link: l }, cb);
A page will make about 1000 POST requests very quickly and then wait a very long time to get each response back.
The problem is that the response to the nth request is very slow. An individual request takes about half a second, but as best I can tell the express server is processing each link sequentially. I want the server to make all the requests and respond as it receives each response.
Am I correct in assuming the express POST router is running requests sequentially? How do I get it to blast all requests and pass the responses back as it gets them?

My question is why is it so slow / is POST an async process on a "out of the box" express server?
You may be surprised to find out that this is probably first a browser issue, not a node.js issue.
A browser has a maximum number of simultaneous requests it will allow your Javascript ajax to make to the same host. It varies slightly from one browser to the next, but is around 6. So, if you're making 1000 requests, then only around 6 are being sent at a time. The rest go in a queue in the browser waiting for prior requests to finish. So, your node server likely isn't getting 1000 simultaneous requests. You should be able to confirm this by logging incoming requests in your node.js app. You will probably see a long delay before it receives the 1000th request (because it's queued by the browser).
Here's a run-down of how many simultaneous requests to a given host each of the browsers supported (as of a couple of years ago): Max parallel http connections in a browser?.
My first recommendation would be to package up an array of requests to make from the client to the server (perhaps 50 at a time) and then send that in one request. That will give your node.js server plenty to chew on and won't run afoul of the browser's connection limit to the same host.
As for the node.js server, it depends a lot on what you're doing. If most of what you're doing in the node.js server is just networking and not a lot of processing that requires CPU cycles, then node.js is very efficient at handling lots and lots of simultaneous requests. If you start engaging a bunch of CPU (processing or preparing results), then you may benefit from either adding worker processes or using node.js clustering. In your case, you may want to use worker processes. You can examine your CPU load when your node.js server is processing a bunch of work and see if the one CPU that node.js is using is anywhere near 100% or not. If it isn't, then you don't need more node.js processes. If it is, then you do need to spread the work over more node.js processes to go faster.
In your specific case, it looks like you're really only doing networking to collect 302 redirect responses. Your single node.js process should be able to handle a lot of those requests very efficiently so probably the issue is just that your client is being throttled by the browser.
If you want to send a lot of requests to the server (so it can get to work on as many as feasible), but want to get results back immediately as they become available, that's a little more work.
One scheme that could work is to open a webSocket or socket.io connection. You can then send a giant array of URLs that you want the server to check for you in one message over the socket.io connection. Then, as the server gets a result, it can send back each individual result (tagged with the URL that it corresponds to). That way, you can somewhat get the best of both worlds with the server crunching on a long list of URLs, but able to send back individual responses as soon as it gets them.
Note, you will probably find that there is an upper limit to how many outbound http requests you may want to run at the same time from your node.js server too. While modern versions of node.js don't throttle you like the browser does, you probably also don't want your node.js server attempting to run 10,000 simultaneous requests because you may exhaust some sort of network resource pool. So, once you get past the client bottleneck, you will want to test your server at different levels of simultaneous requests open to see where it performs best. This is both to optimize its performance, but also to protect your server against attempting to overextend its use of networking or memory resources and get into error conditions.
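To make the batching and the server-side limit concrete, here is a minimal sketch of a helper that works through a list while keeping only a fixed number of operations in flight. The name mapWithLimit and its parameters are made up for illustration. worker stands in for whatever traces one URL, and onResult is where a result could be pushed back to the client (for example over a socket) as soon as it is ready.

```javascript
// Hypothetical helper: run `worker` over `items` with at most `limit`
// calls in flight. `onResult(index, result)` fires as each item finishes.
async function mapWithLimit(items, limit, worker, onResult) {
  const results = new Array(items.length);
  let next = 0; // index of the next item nobody has claimed yet

  async function runner() {
    while (next < items.length) {
      const index = next++;
      try {
        results[index] = { ok: true, value: await worker(items[index]) };
      } catch (err) {
        // One failing item should not abort the rest of the batch.
        results[index] = { ok: false, error: String(err) };
      }
      if (onResult) onResult(index, results[index]);
    }
  }

  // Start up to `limit` runners that pull from the shared index.
  const runners = [];
  for (let i = 0; i < Math.min(limit, items.length); i++) {
    runners.push(runner());
  }
  await Promise.all(runners);
  return results;
}
```

The right value for limit is something to measure rather than guess, as the answer says; the sketch only makes it easy to change.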

How to persist HTTP response in redis

I am creating a long-polling chat application on nodeJS without using Socket.io and scaling it using clusters.
I have to find a way to store all the long-polled HTTP requests and response objects in such a way that it is available across all node clusters(so that when a message is received for a long-polled request, I can get that request and respond to it)
I have tried using redis, however, when I stringify http request and response objects, I get "Cannot Stringify Cyclic Structure" Error.
Maybe I am approaching it the wrong way. In that case, how do we generally implement long-polling across different clusters?

What you're asking seems to be a bit confused.
In a long-polling situation, a client makes an http request that is routed to a specific HTTP server. If no data to satisfy that request is immediately available, the request is then kept alive for some extended period of time and either it will eventually timeout and the client will then issue another long polling request or some data will become available and a response will be returned to the request.
As such, you do not make this work in clusters by trying to centrally save request and response objects. Those belong to a specific TCP connection between a specific server and a specific client. You can't save them and use them elsewhere and it also isn't something that helps any of this work with clustering either.
I think the clustering problem you actually have is this: when some data becomes available for a specific client, you need to know which server currently holds that client's live long-polling request, so you can instruct that specific server to return the data on that request.
The usual way that you do this is you have some sort of userID that represents each client. When any client connects in with a long polling request, that connection is cluster distributed to one of your servers. That server that gets the request, then writes to a central database (often redis) that this userID userA is now connected to server12. Then, when some data becomes available for userA, any agent can lookup that user in the redis store and see that the user is currently connected to server12. So, they can instruct server12 to send the data to userA using the current long polling connection for userA.
This is just one strategy for dealing with clustering - there are many others such as sticky load balancing, algorithmic distribution, broadcast distribution, etc... You can see an answer that describes some of the various schemes here.
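The userID-to-server bookkeeping from this answer could be sketched roughly like this. ConnectionRegistry and PollServer are made-up names, and the Map inside the registry is only a stand-in for the shared store (redis in the answer). Only plain strings go into the shared store; the response objects stay on the server that owns the connection.

```javascript
// Stand-in for the central store. In a real cluster this would be a
// shared database such as redis; a Map keeps the sketch self-contained.
class ConnectionRegistry {
  constructor() {
    this.serverByUser = new Map(); // userId -> serverId
  }
  register(userId, serverId) {
    this.serverByUser.set(userId, serverId);
  }
  unregister(userId, serverId) {
    // Only remove the entry if this server still owns the user.
    if (this.serverByUser.get(userId) === serverId) {
      this.serverByUser.delete(userId);
    }
  }
  lookup(userId) {
    return this.serverByUser.get(userId) || null;
  }
}

// One instance per server process. Live responses never leave it.
class PollServer {
  constructor(serverId, registry) {
    this.serverId = serverId;
    this.registry = registry;
    this.pending = new Map(); // userId -> held response object
  }
  // Called when a long-polling request arrives and no data is ready.
  hold(userId, res) {
    this.pending.set(userId, res);
    this.registry.register(userId, this.serverId);
  }
  // Called when another agent tells this server to deliver a message.
  deliver(userId, message) {
    const res = this.pending.get(userId);
    if (!res) return false; // user is not waiting on this server
    this.pending.delete(userId);
    this.registry.unregister(userId, this.serverId);
    res.end(JSON.stringify(message));
    return true;
  }
}
```

How one server instructs another (pub/sub, an internal HTTP call, and so on) is left out, since the answer does not specify it.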

If you are sure you want to store all the request and responses, have a look at this question.
Serializing Cyclic objects
you can also try cycle.js
However, I think you would only be interested in serializing few elements from request/response. An easier (probably better too) approach would be to just copy the required key/value pairs from request/response object in to a separate object and store them.
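Copying just the fields you need might look like the sketch below. snapshotRequest is a made-up name, and the choice of method, url and headers is only an assumption about what is worth keeping.

```javascript
// Hypothetical helper: copy a few plain fields from a request so the
// result can be passed to JSON.stringify without the cyclic-structure error.
function snapshotRequest(req) {
  return {
    method: req.method,
    url: req.url,
    headers: Object.assign({}, req.headers), // shallow copy of plain strings
  };
}
```

The snapshot is only data about the request. It cannot be used to send a response from another process.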

Nodejs: How to reject https posts based on headers

In a node.js server that accepts HTTPS post requests that are typically pretty large (a few MBs) we want to be able to start processing the requests before the entire thing is accepted by the server.
For example, if a request with a big fat body arrives, we want to look at its path and based on it decide whether to terminate/reject it, without having to wait for the entire request to arrive (and pay IO cost of receiving that fat body).

You could try the Connect Limit middleware:
https://github.com/senchalabs/connect/blob/master/lib/middleware/limit.js
or implement your own solution in a similar way by checking req.headers['content-length'], etc.

Based on experimentation, it seems that Node.js only fires the request event after parsing the HTTP headers. Meaning there's a chance to examine the headers before we even start listening for the data event.
Thus the solution seems to be to check the headers before reading any data, and potentially rejecting the request at that point. If we don't reject at that point, we start accumulating the data buffers as they arrive and if they exceed the limit (and thus conflict with the reported content length) we have another chance to reject the request right there by calling response.end().
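A minimal sketch of that two-stage check is below, using the plain http module for brevity (https.createServer takes the same kind of handler). The size limit, the allowed path and the names rejectionStatus and handler are assumptions made up for this example.

```javascript
const http = require('http');

const MAX_BODY_BYTES = 1024 * 1024;         // assumed limit, adjust as needed
const ALLOWED_PATHS = new Set(['/upload']); // hypothetical accepted path

// Decide from the request line and headers alone, before any body is read.
// Returns an HTTP status to reject with, or 0 to accept.
function rejectionStatus(req) {
  if (!ALLOWED_PATHS.has(req.url)) return 404;
  const declared = Number(req.headers['content-length']);
  if (declared > MAX_BODY_BYTES) return 413;
  return 0; // a missing or invalid content-length is checked while reading
}

function handler(req, res) {
  const status = rejectionStatus(req);
  if (status) {
    res.writeHead(status, { Connection: 'close' });
    res.end();
    return;
  }

  let received = 0;
  let rejected = false;
  req.on('data', (chunk) => {
    if (rejected) return;
    received += chunk.length;
    if (received > MAX_BODY_BYTES) {
      // The body is larger than allowed: second chance to reject.
      rejected = true;
      res.writeHead(413, { Connection: 'close' });
      res.end(() => req.destroy()); // stop reading once the reply is handed off
    }
  });
  req.on('end', () => {
    if (!rejected) res.end('received ' + received + ' bytes');
  });
}

// Call server.listen(port) to actually start accepting requests.
const server = http.createServer(handler);
```

Whether the client ever sees the early error response depends on the client; some keep uploading until the socket closes.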

Why is node.js only processing six requests at a time?

We have a node.js server which implements a REST API as a proxy to a central server which has a slightly different, and unfortunately asymmetric REST API.
Our client, which runs in various browsers, asks the node server to get the tasks from the central server. The node server gets a list of all the task ids from the central one and returns them to the client. The client then makes two REST API calls per id through the proxy.
As far as I can tell, this stuff is all done asynchronously. In the console log, it looks like this when I start the client:
Requested GET URL under /api/v1/tasks/*: /api/v1/tasks/
This takes a couple seconds to get the list from the central server. As soon as it gets the response, the server barfs this out very quickly:
Requested GET URL under /api/v1/tasks/id/:id :/api/v1/tasks/id/438
Requested GET URL under /api/v1/workflow/id/:id :/api/v1/workflow/id/438
Requested GET URL under /api/v1/tasks/id/:id :/api/v1/tasks/id/439
Requested GET URL under /api/v1/workflow/id/:id :/api/v1/workflow/id/439
Requested GET URL under /api/v1/tasks/id/:id :/api/v1/tasks/id/441
Requested GET URL under /api/v1/workflow/id/:id :/api/v1/workflow/id/441
Then, each time a pair of these requests gets a result from the central server, another two lines is barfed out very quickly.
So it seems our node.js server is only willing to have six requests out at a time.

There are no TCP connection limits imposed by Node itself. (The whole point is that it's highly concurrent and can handle thousands of simultaneous connections.) Your OS may limit TCP connections.
It's more likely that you're either hitting some kind of limitation of your backend server, or you're hitting the builtin HTTP library's connection limit, but it's hard to say without more details about that server or your Node implementation.
Node's built-in HTTP library (and obviously any libraries built on top of it, which are most) maintains a connection pool (via the Agent class) so that it can utilize HTTP keep-alives. This helps increase performance when you're running many requests to the same server: rather than opening a TCP connection, making a HTTP request, getting a response, closing the TCP connection, and repeating; new requests can be issued on reused TCP connections.
In node 0.10 and earlier, the HTTP Agent will only open 5 simultaneous connections to a single host by default. You can change this easily: (assuming you've required the HTTP module as http)
http.globalAgent.maxSockets = 20; // or whatever
node 0.12 sets the default maxSockets to Infinity.
You may want to keep some kind of connection limit in place. You don't want to completely overwhelm your backend server with hundreds of HTTP requests under a second – performance will most likely be worse than if you just let the Agent's connection pool do its thing, throttling requests so as to not overload your server. Your best bet will be to run some experiments to see what the optimal number of concurrent requests is in your situation.
However, if you really don't want connection pooling, you can simply bypass the pool entirely – set agent to false in the request options:
http.get({host:'localhost', port:80, path:'/', agent:false}, callback);
In this case, there will be absolutely no limit on concurrent HTTP requests.
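A middle ground between the global setting and no pooling at all is a dedicated Agent for the backend, so the limit applies only to those requests. This is a sketch: the value 20, the host and port, and the names backendAgent and getFromBackend are placeholders, not recommendations.

```javascript
const http = require('http');

// A separate pool just for the backend; other requests keep using
// http.globalAgent with its own settings.
const backendAgent = new http.Agent({
  keepAlive: true, // reuse TCP connections between requests
  maxSockets: 20,  // per-host cap on concurrent sockets (placeholder value)
});

// Hypothetical wrapper that always goes through the dedicated agent.
function getFromBackend(path, callback) {
  return http.get(
    { host: 'localhost', port: 80, path: path, agent: backendAgent },
    callback
  );
}
```

Requests beyond the cap wait inside the agent until a socket frees up, which is the throttling the answer describes.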

It's the limit on number of concurrent connections in the browser:
How many concurrent AJAX (XmlHttpRequest) requests are allowed in popular browsers?
I have upvoted the other answers, as they helped me diagnose the problem. The clue was that node's socket limit was 5, and I was getting 6 at a time. 6 is the limit in Chrome, which is what I was using to test the server.

How are you getting data from the central server? "Node does not limit connections" is not entirely accurate when making HTTP requests with the http module. Client requests made in this way use the http.globalAgent instance of http.Agent, and each http.Agent has a setting called maxSockets which determines how many sockets the agent can have open to any given host; this defaults to 5 in node 0.10 and earlier.
So, if you're using http.request or http.get (or a library that relies on those methods) to get data from your central server, you might try changing the value of http.globalAgent.maxSockets (or modify that setting on whatever instance of http.Agent you're using).
See:
http.Agent documentation
agent.maxSockets documentation
http.globalAgent documentation
Options you can pass to http.request, including an agent parameter to specify your own agent

Node js can handle thousands of incoming requests - yes!
But when it comes to outgoing requests, every request has to deal with a DNS lookup, and DNS lookups, disk reads, etc. are handled by the libuv thread pool (libuv is written in C). The default size of that thread pool for each node process is 4 threads.
If all 4 threads are busy with https requests (DNS lookups), other requests will be queued. That is why, no matter how brilliant your code might be, you sometimes get 6 or fewer concurrent outgoing requests per second completed.
Learn about DNS caching to reduce the number of DNS lookups, and increase the libuv thread pool size. If you use PM2 to manage your node processes, they have good documentation on their site about environment variables and how to inject them. What you are looking for is the environment variable UV_THREADPOOL_SIZE, which defaults to 4.
You can set the value anywhere between 1 and the maximum of 1024 (128 before libuv 1.30.0). But keep in mind that the libuv thread pool is shared across all event loops.
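For the DNS caching part, a very small in-process cache could look like the sketch below. createCachedLookup is a made-up name, the TTL is an arbitrary assumption (it ignores the TTL of the DNS record itself), and the lookup function can be swapped out, which is what makes the sketch easy to try without network access.

```javascript
const dns = require('dns');

// Hypothetical helper: remember lookups for `ttlMs` milliseconds so that
// repeated requests to the same host do not each occupy a pool thread.
function createCachedLookup(ttlMs, lookupFn = (host) => dns.promises.lookup(host)) {
  const cache = new Map(); // hostname -> { expires, promise }

  return function cachedLookup(hostname) {
    const now = Date.now();
    const entry = cache.get(hostname);
    if (entry && entry.expires > now) return entry.promise;

    const promise = lookupFn(hostname);
    const fresh = { expires: now + ttlMs, promise: promise };
    cache.set(hostname, fresh);
    // Do not keep failed lookups around; the caller still sees the error.
    promise.catch(() => {
      if (cache.get(hostname) === fresh) cache.delete(hostname);
    });
    return promise;
  };
}
```

Concurrent callers asking for the same host share one pending lookup, so a burst of requests to one backend costs a single thread pool job.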

I have seen the same problem in my server. It was only processing 4 requests.
As explained already, from 0.12 maxSockets defaults to Infinity. That easily overwhelms the server. Limiting the requests to, say, 20 with
http.globalAgent.maxSockets = 20;
solved my problem.

Are you sure it just returns the results to the client? Node processes everything in one thread. So if you do some fancy response parsing or anything else which doesn't yield, then it would block all your requests.
