Sending Raw Binary Data Between Two Servers With Websockets/Node.js?

I've been reading lately about Node.js, WebSockets and Socket.io out of curiosity. However, the other day I was thinking about a problem one of my clients faces and was wondering if they might be the solution. Essentially, there are two servers. Server 1 is serving raw binary data. Server 2 is set up to receive and handle that binary data.
What needs to happen is that data from server 1 is passed through a web browser and then delivered to server 2.
I'm curious to know if this is possible, and what angles you might take to solve it.

It's certainly possible. The connections to both will need to be initiated from the browser/JavaScript, but once in place it should be easy to proxy the data from one to the other.
However, there are many non-browser WebSocket clients, so you might consider just making a direct WebSocket connection from one server to the other. See this Wikipedia page for WebSocket client (and server) implementations.
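For the direct server-to-server route, a minimal sketch in Node using the `ws` package might look like the following; the host names, ports, and the idea that server 2 also speaks WebSocket are assumptions for illustration.

```javascript
const WebSocket = require("ws");

const source = new WebSocket("ws://server1.example.com:8080"); // serves the raw binary data
const sink = new WebSocket("ws://server2.example.com:9090");   // receives and handles it

source.on("message", (data) => {
  // `ws` delivers binary frames as Buffers; forward them untouched
  if (sink.readyState === WebSocket.OPEN) {
    sink.send(data, { binary: true });
  }
});
```

The browser-proxy variant is the same idea, just with two browser WebSocket objects and the forwarding done in page JavaScript.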

Related

Node.js design approach. Server polling periodically from clients

I'm trying to learn Node.js and appropriate design approaches.
I've implemented a little API server (using express) that fetches a set of data from several remote sites, according to client requests that use the API.
This process can take some time (several fetch / await calls), so I want the user to know how his request is doing. I've read about socket.io / websockets, but maybe that's somewhat of an overkill solution for this case.
So what I did is:
For each client request, a requestID is generated and returned to the client.
With that ID, the client can query the API (via another endpoint) to know his request status at any time.
Using setTimeout() on the client page and some DOM manipulation, I can update and display the current request status every X seconds, in a polling fashion (a rough sketch follows below).
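A sketch of that flow, assuming Express; the routes, fields and the stand-in helper are all invented for illustration.

```javascript
const express = require("express");
const crypto = require("crypto");

const app = express();
const jobs = new Map(); // requestId -> { status, result? }

// Stand-in for the real "several fetch / await" work against the remote sites.
const fetchFromRemoteSites = async () => ({ ok: true });

app.post("/api/fetch", (req, res) => {
  const id = crypto.randomUUID();
  jobs.set(id, { status: "pending" });

  // Kick off the slow work without blocking the response.
  (async () => {
    try {
      const result = await fetchFromRemoteSites();
      jobs.set(id, { status: "done", result });
    } catch (err) {
      jobs.set(id, { status: "failed", error: err.message });
    }
  })();

  res.json({ requestId: id }); // the client polls with this ID
});

app.get("/api/status/:id", (req, res) => {
  const job = jobs.get(req.params.id);
  if (!job) return res.status(404).json({ error: "unknown request" });
  res.json(job);
});

app.listen(3000);
```

On the page, a setTimeout() loop that hits /api/status/:id until the status is "done" completes the picture.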
Although the solution works fine, even with several clients connecting concurrently, maybe there's a better solution? Are there any caveats I'm not considering?
TL;DR The approach you're using is just fine, although it may not scale very well. Websockets are a different approach to solve the same problem, but again, may not scale very well.
You've identified what are basically the only two options for real-time (or close to it) updates on a web site:
polling the server - the client requests information periodically
using Websockets - the server can push updates to the client when something happens
There are a couple of things to consider.
How important are "real time" updates? If the user can wait several seconds (or longer), then go with polling.
What sort of load can the server handle? If load is a concern, then Websockets might be the way to go.
That last question is really the crux of the issue. If you're expecting a few or a few dozen clients to use this functionality, then either solution will work just fine.
If you're expecting thousands or more to be connecting, then polling starts to become a concern, because now we're talking about many repeated requests to the server. Of course, if the interval is longer, the load will be lower.
It is my understanding that the overhead for Websockets is lower, but still can be a concern when you're talking about large numbers of clients. Again, a lot of clients means the server is managing a lot of open connections.
The way large services handle this is to design their applications so that they can be distributed over many identical servers, with a load balancer deciding which server each client connects to. This is true for either polling or Websockets.
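For contrast, the push-based variant the answer describes could look roughly like this with socket.io; the event and room names are invented, and the server is run standalone here although in practice you would attach it to the same HTTP server Express uses.

```javascript
const { Server } = require("socket.io");

const io = new Server(3001); // standalone for the sketch

io.on("connection", (socket) => {
  // The client says which request it cares about; use one room per request ID.
  socket.on("watch-request", (requestId) => socket.join(requestId));
});

// Called from the job code when the slow work finishes, instead of waiting
// for the next poll to come in.
function notifyDone(requestId, result) {
  io.to(requestId).emit("request-done", { requestId, result });
}
```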

Is using web requests more efficient than socket.io?

I have a Node.js project in which the front end needs to request some data from the back end. Since it only has to do so once per session, I was thinking of using web requests instead of socket.io because I don't need a continuous connection between the front end and the back end all the time.
So my question is: How much, if at all, will the efficiency of my project increase if I use web requests instead of socket.io?
If you're making a one-time request from client to server, there is no reason to use a continuous socket.io connection or to create a socket.io connection, use it for one message and then disconnect it.
It will certainly be simpler to just use a single http request to get your data.
How much, if at all, will the efficiency of my project increase?
The main difference would be that at high scale your server wouldn't need to handle lots of simultaneous socket.io connections. At small scale, you probably won't notice much of a difference either way. The main reason for choosing an http request is that it's the simpler and proper architecture for making a single request from client to server. A socket.io connection has its uses in different circumstances.
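For the one-shot case, the whole client side can be as small as this; the endpoint name is made up.

```javascript
// One request per session: a plain fetch, no connection to keep alive afterwards.
async function loadSessionData() {
  const res = await fetch("/api/session-data"); // hypothetical endpoint
  if (!res.ok) throw new Error(`request failed: ${res.status}`);
  return res.json();
}
```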

Use Apache Thrift for two-way communication?

Is it possible to implement two-way communication between client and server with Apache Thrift? That is, not only to make RPC calls from client to server, but also the other way round? In my project I have the requirement that the server must also push some data to the client without the client asking for it first.
There are two ways to achieve this with Thrift.
If both ends are more or less peers and you connect them through sockets or pipes, you simply set up a server and a client on both ends and you're pretty much done. This does not work in all cases, however, especially with HTTP.
If you connect server and client through HTTP or a similar channel, there is a technique called "long polling". It basically requires the client to call the server as usual, but the call will only return when the server wants to send some data back to the client. After receiving the data, the client starts another call if it's still interested in more data.
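The long-polling loop itself is not Thrift-specific; on the client side it is roughly the following, with the endpoint and timings invented.

```javascript
// The request only resolves when the server has something to send (or times out),
// then the client immediately asks again.
async function longPoll(onData) {
  for (;;) {
    try {
      const res = await fetch("/events?wait=30"); // server holds this request open
      if (res.status === 200) onData(await res.json());
      // a 204 or timeout just means "nothing yet"; loop and ask again
    } catch (err) {
      await new Promise((resolve) => setTimeout(resolve, 1000)); // brief back-off on errors
    }
  }
}
```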
As Denis pointed out, depending on your exact use case, you might want to consider using a MQ system. Note that it is still possible to use Thrift to de/serialize the messages into and from the queues. The contrib folder has some examples that show how to use Thrift with ZMQ, Rebus and some others.
You are better off using queues then, e.g. ZeroMQ.

node.js server with socket.io handling 50000 simultaneous clients

We are developing a JavaScript control which should be constantly connected to a server for receiving animation updates.
We are planning to host this stuff on an Amazon cloud.
The scenario is like this: the server connects to an ActiveMQ queue waiting for updates, and for each update it broadcasts it to all connected clients.
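A sketch of that fan-out, assuming ActiveMQ's STOMP interface via the stompit package; the queue name, port and event name are illustrative.

```javascript
const stompit = require("stompit");
const { Server } = require("socket.io");

const io = new Server(8080);

stompit.connect({ host: "localhost", port: 61613 }, (err, client) => {
  if (err) return console.error("broker connect failed:", err.message);

  client.subscribe({ destination: "/queue/animation-updates", ack: "auto" }, (err, message) => {
    if (err) return console.error("subscribe failed:", err.message);

    message.readString("utf-8", (err, body) => {
      if (err) return;
      io.emit("animation-update", body); // one queue message fans out to every client
    });
  });
});
```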
Is it even possible to handle such load with node.js + socket.io?
Will a single node.js server be able to handle such load?
How to organize fast transport between different nodes if we will have to use more than one node?
Will a single node.js server be able to handle such load? ... How to organize fast transport between different nodes if we will have to use more than one node?
You say that you are planning to host on Amazon. So first off, nothing should be scoped for a single server. Amazon machines will simply "disappear"; you have to assume that you are going to use multiple computers.
...handling 50k simultaneous clients
So to start with, 50k connections for a single box is a very big number. Here's a very detailed blog post discussing "getting to 10k" with node.js+socket.io.
Here's a very telling quote:
it seemed as though 10,000 clients simply required more serialization than my server was able to handle.
So a key component to "getting to 50k" is going to be the amount of work required just pushing data over the wire.
How to organize fast transport between different nodes if we will have to use more than one node.
That blog post is the first of 3. When you're done with the first, read the other two. That should point you in the right direction.
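On the "transport between nodes" part specifically, one common approach is the socket.io Redis adapter, so that an io.emit() on one node reaches clients connected to the other nodes. A sketch, assuming the @socket.io/redis-adapter and redis packages and a local Redis instance:

```javascript
const { Server } = require("socket.io");
const { createClient } = require("redis");
const { createAdapter } = require("@socket.io/redis-adapter");

const io = new Server();
const pubClient = createClient({ url: "redis://localhost:6379" });
const subClient = pubClient.duplicate();

Promise.all([pubClient.connect(), subClient.connect()]).then(() => {
  io.adapter(createAdapter(pubClient, subClient));
  io.listen(8080);
  // From here on, io.emit() fans out through Redis to sockets on every node.
});
```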

Socket.io vs AJAX Use cases

Background: I am building a web app using NodeJS + Express. Most of the communication between client and server consists of REST (GET and POST) calls. I would typically use AJAX XMLHttpRequest as mentioned in https://developers.google.com/appengine/articles/rpc. And I don't quite see how to make my RESTful service work with Socket.io as well.
My questions are
What scenarios should I use Socket.io over AJAX RPC?
Is there a straightforward way to make them work together? At least for Express-style REST.
Do I get real benefits from using socket.io (if websockets are used -- TCP layer) in non-real-time web applications? Like a tinyurl site (where users post queries and the server responds and forgets).
Also, I was thinking of a tricky but perhaps nonsensical idea: what if I use REST for requests from clients, close the connection from the server side, and then do socket.emit()?
Thanks in advance.
Your primary problem is that WebSockets are not request/response oriented like HTTP is. You mention REST and HTTP interchangeably; keep in mind that REST is a methodology for designing and modeling your HTTP routes.
Your questions,
1. Socket.io would be a good scenario when you don't require a request/response format. For instance if you were building a multiplayer game in which whoever could click on more buttons won, you would send the server each click from each user, not needing a response back from the server that it registered each click. As long as the WebSocket connection is open, you can assume the message is making it to the server. Another use case is when you need a server to contact a client sporadically. An analytics page would be a good use case for WebSockets as there is no uniform pattern as to when data needs to be at the client, it could happen at anytime.
The WebSocket connection is an HTTP GET request with a special header requesting that the server upgrade it to a WebSocket connection. Distinguishing different events and messages on the WebSocket connection is up to your application logic and likely won't match REST-style URIs and methods (otherwise you are replicating HTTP request/reply in a sense).
No.
Not sure what you mean on the last bit.
I'll just explain more about when you want to use Socket.IO and leave the in-depth explanation to Tj there.
Generally you will choose Socket.IO when performance and/or latency is a major concern and you have a site that involves users polling for data often. AJAX or long-polling is far easier to implement; however, it can have serious performance problems in high-load situations. By high load, I mean something like Facebook. Imagine millions of people loading their feed, and every minute each user is asking the server for new data. That could require some serious hardware and software to make work well. With Socket.IO, each user could instead connect and just wait indefinitely for new data from the server as it arrives, resulting in far less overall server traffic.
Also, if you have a real-time application, Socket.IO would allow for a much better user experience while maintaining a reasonable server load. A common example is a chat room. You really don't want to have to constantly poll the server for new messages. It would be much better for the server to broadcast new messages as they are received. Although you can do it with long-polling, it can be pretty expensive in terms of server resources.
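The chat-room case is the canonical socket.io example; the server-side broadcast is only a few lines, with the event names made up here.

```javascript
const { Server } = require("socket.io");
const io = new Server(3000);

io.on("connection", (socket) => {
  socket.on("chat-message", (msg) => {
    io.emit("chat-message", msg); // push to every connected client as it arrives
  });
});
```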
