I have multiple Node.js servers, and I want each client to connect to the one closest to them (or with the least hops).
At the moment, I'm am having the clients connect to the server which responds first (ie the lowest ping).
However, will this actually be an accurate way to match people up to the best server? If not, what would be a better method?
A single ping will give you a single round-trip-time metric to go on. This is better than nothing and often (but not always) can be a good-enough proxy for general network latency to that server. It would be OK to start with this. Keep in mind that:
ping is an ICMP message which is typically handled by the OS directly not your application, so a pingable server might have a crashed or slow application
ICMP is commonly blocked by firewalls so an unpingable server may in fact be serving the application just fine.
However the "best" server is a lot more complicated, and if you want a high-quality determination of "best" you'll probably need to take into account the server's recent performance more holistically including things like recent application-level response times, total number of active connections, CPU load, I/O load, etc. You will probably want an application level way for the client to ask for basic health/performance metrics and the server to send them back and then an algorithm to convert that into a rankable numeric score. I would think a good next step after delay to get an empty response would be just to look at each server's average (or percentile) response time over the last minute or so and select the one with the lowest time.
Related
I'm trying to learn Node.js and adequate design approaches.
I've implemented a little API server (using express) that fetches a set of data from several remote sites, according to client requests that use the API.
This process can take some time (several fecth / await), so I want the user to know how is his request doing. I've read about socket.io / websockets but maybe that's somewhat an overkill solution for this case.
So what I did is:
For each client request, a requestID is generated and returned to the client.
With that ID, the client can query the API (via another endpoint) to know his request status at any time.
Using setTimeout() on the client page and some DOM manipulation, I can update and display the current request status every X, like a polling approach.
Although the solution works fine, even with several clients connecting concurrently, maybe there's a better solution?. Are there any caveats I'm not considering?
TL;DR The approach you're using is just fine, although it may not scale very well. Websockets are a different approach to solve the same problem, but again, may not scale very well.
You've identified what are basically the only two options for real-time (or close to it) updates on a web site:
polling the server - the client requests information periodically
using Websockets - the server can push updates to the client when something happens
There are a couple of things to consider.
How important are "real time" updates? If the user can wait several seconds (or longer), then go with polling.
What sort of load can the server handle? If load is a concern, then Websockets might be the way to go.
That last question is really the crux of the issue. If you're expecting a few or a few dozen clients to use this functionality, then either solution will work just fine.
If you're expecting thousands or more to be connecting, then polling starts to become a concern, because now we're talking about many repeated requests to the server. Of course, if the interval is longer, the load will be lower.
It is my understanding that the overhead for Websockets is lower, but still can be a concern when you're talking about large numbers of clients. Again, a lot of clients means the server is managing a lot of open connections.
The way large services handle this is to design their applications in such a way that they can be distributed over many identical servers and which server you connect to is managed by a load balancer. This is true for either polling or Websockets.
I have to collect real timer ticker data on trading pairs (usd/eur etc) from APIs of different sites. The data is usually a small JSON object with mostly ~10 numbers. The naive strategy is to make a request every 5 or so seconds to get the up-to-date ticker data of each of those sites. Some of them, though, provide a websocket option, which allows them to notify me directly when a change occurs, and, I believe, is more efficient. The issue is some of those sites don't offer that option, so the overall code will be simpler to organize and read if I use the same method for all sites (i.e., http requests). I'm also not sure the data is heavyweight enough to justify that choice.
From the experts who dealt with similar situations, is this case one where a relevant performance improvement can be expected from using sockets instead of timed http requests when it is available?
It depends:
WebSockets make only sense if you keep them open most of the time. If you instead open a new WebSocket connection each time you want to have new data the overhead is larger compared to a simple HTTP request. It is not so much the bandwidth (but this too) but you need more round trips to get your data which makes everything slower.
WebSockets take more resources at your end because you have to keep a TCP connection open for each open WebSocket connection. If there are only a small number of sites you need to ask it does not matter, if there are a lot it will matter. While it can be an advantage (less latency) to keep normal HTTP connections alive too you can close them if you have less resources.
If most of the time the data you get is the same then WebSockets might be more efficient because you only get send the new data when it actually changes.
If you want to be informed of new data as soon as possible then WebSockets perform better. If you only need a 5 second precision anyway it does not matter much.
For interactive web apps, things like Websockets are getting more popular. However, as the client, and proxy world is not always fully compliant, one usually use a complex framework like 'Socket.IO', hiding several different mechanisms for any case that may disable the other ones.
I just wonder what the downsides of a properly implemented long polling are, because with today's servers like node.js it is quite easy to implement and relies on old http technology which is well supported (despite the long polling behaveiour itself may break it).
From an high level view, long polling (despite some additional overhead, feasable for medium traffic apps) resembles a true push behaviour as WebSockets do, as the server actually send it's answer whenever he likes (despite some timeout / heartbeat mechanism).
So we have some more overhead due to the more TCP/IP acknowledgements I guess, but no constant traffic like frequent polling would do.
And using an event driven server, we would have no thread overhead to keep the connections blocked.
So is there any else hard downside that forces medium-traffic apps like chats to use WebSockets rather than long polling?
Overhead
It will create a new connection each time, so it will send the HTTP headers... including the cookie header that may be large.
Also, just "check if there is something new" is another connection for nothing. Connections implies the work of many items like firewalls, load balancers, web servers ... etc.. Probably, establish the connection is most time consuming thing as soon your IT infrastructure have several inspectors.
If you are using HTTPS, you are doing again and again the most expensive operation, the TLS handshake. TLS performance is good once the connection is established and the symmetric encryption is working, but the process of establishing the connection, key exchange and all that jazz is not fast.
Also, when connections are done, log entries are written somewhere, counters are incremented somewhere, memory is consumed, objects are created... etc... etc.. For example, the reason why we have different logging configurations when in production and in development, is because writing log entries also affect performance.
Presence
When is a long polling user connected or disconnected? If you check for this at a given moment of time... what would be the reliable amount of time you should wait to double check, to ensure it is disconnected or connected?
This may be totally irrelevant if your application just broadcast stuff, but it may be very relevant if your application is a game.
Not persistent
This is the big deal.
Since a new connection is created each time, if you have load balanced servers, in a round robbin scenario you cannot know in which server the next connection is going to fall.
When a user's server is known, like when using a WebSocket, you can push the events to that server straight away, and the server will relay them to the connection. If the user disconnects, the server can notify straight away that the user is not connected anymore, and when connect again can subscribe again.
If the server where the user is connected at the moment that an event for him is generated is unknown, you have to wait for the user to connect so then you can say "hey, user 123 is here, give me all the news since this timestamp", what make it a little bit more cumbersome. Long polling is not really push technology, but request-response, so if you plan for a EDA architecture, at some point you are going to have some level of impedance you have to address, like for example, you need a event aggregator that can give you all the events from a given timestamp (the last time that user connected to ask for news).
SignalR (I guess it is the equivalent in .NET to socket.io) for example, has a message bus named backplane, that relay all the messages to all the servers, as key for scaling out. Therefore, when a user connect to other server, "his" pending events are there "as well"(!) It is a "not too bad approach", but as you can guess, affects the throughput:
Limitations
Using a backplane, the maximum message throughput is lower than it is
when clients talk directly to a single server node. That's because the
backplane forwards every message to every node, so the backplane can
become a bottleneck. Whether this limitation is a problem depends on
the application. For example, here are some typical SignalR scenarios:
Server broadcast (e.g., stock ticker): Backplanes work well for this
scenario, because the server controls the rate at which messages are
sent.
Client-to-client (e.g., chat): In this scenario, the backplane might
be a bottleneck if the number of messages scales with the number of
clients; that is, if the rate of messages grows proportionally as more
clients join.
High-frequency realtime (e.g., real-time games): A backplane is not
recommended for this scenario.
For some projects, this may be a showstopper.
Some applications just broadcast general data, but others have a connection semantics, like for example a multiplayer game, and it is important to send the right events to the right connections.
IMHO
Long polling is a good solution for small projects, but became a big burden for high scalable apps that need high frecuency and/or very segmented event sending.
I implemented a Node.js Express server that supported long polling. The biggest mistake I made was not cleaning up the requests which caused slowing down the server. If your server doesn't support concurrency or threads, one of the essential tasks is to set the appropriate timeouts for the requests/responses to release them from the loop, which you have to do by yourself.
Edit: Also you need to keep in mind that browsers have their specific limit for the number of connections (i.e. 6 per hostname for Google Chrome). So if you have too many long polling connections at the same time, you will probably block yourself.
My node.js server is experiencing times when it becomes slow or unresponsive, even occasionally resulting in 503 gateway timeouts when attempting to connect to the server.
I am 99% sure (based upon tests that I have run) that this lag is coming specifically from the large number of outbound requests I am making with the node-oauth module to contact external APIs (Facebook, Twitter, and many others). Admittedly, the number of outbound requests being made is relatively large (in the order of 30 or so per minute). Even worse, this frequently means that the corresponding inbound requests to my server can take ~5-10 seconds to complete. However, I had a previous version of my API which I had written in PHP which was able to handle this amount of outbound requests without any problem at all. Actually, the CPU usage for the same number (or even fewer) requests with my Node.js API is about 5x that of my PHP API.
So, I'm trying to isolate where I can improve upon this, and most importantly to make sure that 503 timeouts do not occur. Here's some stuff I've read about or experimented with:
This article (by LinkedIn) recommends turning off socket pooling. However, when I contacted the author of the popular nodejs-request module, his response was that this was a very poor idea.
I have heard it said that setting "http.globalAgent.maxSockets" to a large number can help, and indeed it did seem to reduce bottlenecking for me
I could go on, but in short, I have been able to find very little definitive information about how to optimize performance so these outbound connections do not lag my inbound requests from clients.
Thanks in advance for any thoughts or contributions.
FWIW, I'm using express and mongoose as well, and my servers are hosted on the Amazon Cloud (2x M1.Large for the node servers, 2x load balancers, and 3x M1.Small MongoDB instances).
It sounds to me that the Agent is capping your requests to the default level of 5 per-host. Your tests show that cranking up the agent's maxSockets helped... you should do that.
You can prove this is the issue by firing up a packet sniffer, or adding more debugging code to your application, to show that this is the limiting factor.
http://engineering.linkedin.com/nodejs/blazing-fast-nodejs-10-performance-tips-linkedin-mobile
Disable the agent altogether.
As a hypothetical example, let's say that I wanted to make an application that displays peoples twitter networks. I would provide an API that would allow a client to query on a single username. That user's top x tweets would be sent to the client. Then, each person that had been mentioned by the initial person would be scanned. Their top x tweets would be sent to the client. This process would recursively continue, breadth-first, until a pre-defined depth was reached. The client would be receiving the data in real time, displaying statistics such as number of users scanned, number of known users remaining to scan, and a growing list of the tweet data. None of the processing is complicated (regex of small amounts of text), but many, many network requests would be spawned from a single initial request.
I really want the fantastic realtime capabilities of node.js with socket.io, but I feel like this is an abuse of those technologies - they're not meant for heavy server-side lifting. Is there a more appropriate toolset for what I am trying to accomplish, or a particular way to use these tools to that end? Milewise is doing something similar-ish, but I think that my application would consume significantly more network resources than theirs.
Thanks.
The best network transport which you can get on the web now are WebSockets which offers persistent bi-directional real-time connection between server and client. Although not every browser supports them, socket.io gives you a couple of fallback solutions which may however decrease the network performance when compared to WebSockets as stated in this article:
During making connection with WebSocket, client and server exchange
data per frame which is 2 bytes each, compared to 8 kilo bytes of http
header when you do continuous polling.
...
Reducing kilobytes of data
to 2 bytes…and reducing latency from 150ms to 50ms is far more than
marginal. In fact, these two factors alone are enough to make
WebSocket seriously interesting to Google.
Apart from network transport, other things may also be important, for example how are you fetching, formating and processing the data on the server side. In node.js heavy CPU bound computations may block processing of other asynchronous operations, therefore these kind of operations should be dispatched to separate threads or processes in order to prevent blocking.