Clustered HTTP server that syncs clients for file transfer - node.js

I'm writing a Node HTTP server that essentially only exists for NAT punchthrough. Its job is to facilitate a client sending a file, and another client receiving that file.
Edit: The clients are other Node processes, not browsers. We're using Websockets because some client locations won't allow non-HTTP/S port connections.
The overall process works like this:
All clients keep an open websocket connection.
The receiving client (Alice) tells the server via Websocket that it wants a file from another client (Bob).
The server generates a unique token for this transaction.
The server notifies Alice that it should download the file from /downloads?token=xxxx. Alice connects, and the connection is left open.
The server notifies Bob that it should upload the file to /uploads?token=xxxx. Bob connects and begins uploading the file, since Alice is already listening on the other side.
Once the transfer is complete, both connections are closed.
This is all accomplished by storing references to the HTTP req and res objects inside of a transfers object, indexed by the token. It all works great... as long as I'm not clustering the server.
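For reference, this is roughly what the single-process version looks like (a trimmed sketch; names like transfers are illustrative):

const http = require('http');
const { parse } = require('url');

const transfers = {}; // token -> { downloadRes }

http.createServer((req, res) => {
    const { pathname, query } = parse(req.url, true);
    if (pathname === '/downloads') {
        // Alice connects first; hold her response open until Bob arrives.
        transfers[query.token] = { downloadRes: res };
    } else if (pathname === '/uploads') {
        const transfer = transfers[query.token];
        if (!transfer) { res.statusCode = 404; return res.end(); }
        // Pipe Bob's upload straight through to Alice's pending download.
        req.pipe(transfer.downloadRes);
        req.on('end', () => { delete transfers[query.token]; res.end(); });
    }
}).listen(8080);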
In the past, when I've used clustering, I've converted the server to be stateless. However, that's not going to work here: I need the req and res objects kept in state so that I can pipe the data.
I know that I could just buffer the transferring data to disk, but I would rather avoid that if at all possible: part of the requirements is that if I buffer anything to disk I must encrypt it, and I'd rather avoid putting the additional load of encrypting/decrypting transient data on the server.
Does anyone have any suggestion on how I could implement this in a way that supports clustering, or a pointer to further research sources I could look at?
Thanks so much!

Related

Is it possible to have server to server communication with websockets?

I'm trying to have 2 servers communicate with each other. I'm pretty new to websockets, so it's kind of confusing. Also, just to put it out there, I'm not trying to do this: websocket communication between servers;
My goal here is to basically use a socket to read data from another server (if this is possible?). I'll try to explain more below:
We'll assume there is a website called https://www.test.com (going to this website returns an object)
With a normal HTTP request, you would just do:
$.get('https://www.test.com').done(function (r) {
    console.log(r);
});
And this would return r, which is an object that's something like this: {test:'1'}.
Now, from what I understand about websockets, you cannot return data from them, because you don't actually 'request' data; you just send data through the socket.
Since I know what test.com returns, and I know all of the headers I'm going to need, is it possible to just open a socket with test.com and wait for that data to change, without requesting it?
I understand how client-server communication works with Socket.io/websockets; I'm just not sure if it's possible to do server-server communication.
If anyone has any links to documentation or anything that helps explain, it would be much appreciated. I just want to learn how this works (or if it's even possible).
Yes, you can do that (assuming I understood your needs correctly). You can establish a websocket connection between two servers, and then either side can just send data to the other. That will trigger an event at the other server, which will receive the sent data as part of that event. You can do this in either direction, from serverA to serverB or vice versa, or both.
In node.js, everything is event driven. So, you would establish the webSocket connection and then just set up an event handler to be triggered when data arrives. The other server can then send new data whenever it has updated data to send. This is referred to as the "push" model: rather than serverA asking serverB if it has any new data, you establish the webSocket connection and serverB just sends new data to serverA whenever it's available. Done correctly, this is both more efficient and more timely (there is no polling interval, and no cycles are wasted asking for data when there is nothing new).
The identical model can be used between servers or client to server. The only difference with the client/server model is that the webSocket must be initially established client to server. With the server to server model, either server can initiate the connection.
You can think of a webSocket connection like a phone call. Once the call is established, either side can just say something and the other end hears it. A webSocket connection is similar: once it's established, either side can just send data and the other end will receive it. It's an open pipeline, ready to have data sent either way. In node.js, when data arrives on that pipeline, it triggers an event, so the listener will get that event and see the data that was sent.
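Here's a minimal sketch of that push model using the ws package (my assumption; any webSocket library will do):

// serverB (the data source):
const { WebSocketServer } = require('ws');
const wss = new WebSocketServer({ port: 8081 });
wss.on('connection', (socket) => {
    // Push whenever new data is available; no one has to ask for it.
    socket.send(JSON.stringify({ test: '1' }));
});

// serverA (the consumer):
const WebSocket = require('ws');
const ws = new WebSocket('ws://serverB.example.com:8081');
ws.on('message', (data) => {
    console.log('got update:', JSON.parse(data)); // fires whenever serverB pushes
});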

How to detect network failure in sockets Node.js

I am trying to write an internal transport system.
Data should be transferred from client to server using net sockets.
It is working fine, except for the handling of network issues.
If I place a firewall between client and server, I will not see any error on either side, so data will continue to fill the kernel buffer on the client side.
And if I restart the app at that moment, I will lose all the data in that buffer.
Question:
Do we have any way to detect network issues?
Do we have any way to get data back from kernel buffers?
Node.js exposes the low-level socket API to you very directly. I'm assuming that you are using a TCP socket to send and receive data.
One way to ensure that there is an active connection between the client and server is to send heartbeat signals back and forth. If you fail to receive a heartbeat from the server while sending data, you can assume that the connection failed.
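A rough sketch of that idea over a plain net socket (the PING framing and the 5s/15s intervals are arbitrary choices):

const net = require('net');

const socket = net.connect({ host: 'server.example.com', port: 9000 });
let lastSeen = Date.now();

socket.on('data', () => {
    // Any traffic from the server (e.g. a PONG reply) counts as proof of life.
    lastSeen = Date.now();
});

// Send a heartbeat every 5 seconds; give up if nothing is heard for 15.
const timer = setInterval(() => {
    if (Date.now() - lastSeen > 15000) {
        clearInterval(timer);
        console.error('no heartbeat reply; treating connection as failed');
        socket.destroy();
        return;
    }
    socket.write('PING\n');
}, 5000);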
As for the second part of your question: There is no easy way to get data back from kernel buffers. If losing the data will be a problem, I would make sure to write it to disk.

How to persist HTTP response in redis

I am creating a long-polling chat application in Node.js, without using Socket.io, and scaling it using clusters.
I have to find a way to store all the long-polled HTTP request and response objects in such a way that they are available across all node clusters (so that when a message is received for a long-polled request, I can get that request and respond to it).
I have tried using redis; however, when I stringify the http request and response objects, I get a "Cannot Stringify Cyclic Structure" error.
Maybe I am approaching it in the wrong way. In that case, how do we generally implement long-polling across different clusters?
What you're asking seems to be a bit confused.
In a long-polling situation, a client makes an http request that is routed to a specific HTTP server. If no data to satisfy that request is immediately available, the request is kept alive for some extended period of time; either it will eventually time out and the client will issue another long-polling request, or some data will become available and a response will be returned.
As such, you do not make this work in clusters by trying to centrally save request and response objects. Those belong to a specific TCP connection between a specific server and a specific client. You can't save them and use them elsewhere and it also isn't something that helps any of this work with clustering either.
What I would think the clustering problem you have here is that when some data does become available for a specific client, you need to know which server that client has a long polling request that is currently live so you can instruct that specific server to return the data from that request.
The usual way that you do this is to have some sort of userID that represents each client. When any client connects with a long-polling request, that connection is cluster-distributed to one of your servers. The server that gets the request then writes to a central store (often redis) recording that user userA is now connected to server12. Then, when some data becomes available for userA, any agent can look that user up in the redis store, see that the user is currently connected to server12, and instruct server12 to send the data to userA over userA's current long-polling connection.
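In code, that bookkeeping is only a few lines; a sketch with the node-redis v4 client (the key names and the server12 id are illustrative):

const { createClient } = require('redis');

async function example() {
    const redis = createClient();
    await redis.connect();

    // When userA's long-polling request lands on this server:
    await redis.set('longpoll:userA', 'server12');

    // Later, when any process has data for userA, look up which server
    // holds the live connection and notify it (e.g. via pub/sub):
    const owner = await redis.get('longpoll:userA'); // 'server12'
    await redis.publish('deliver:' + owner, JSON.stringify({ user: 'userA', text: 'hi' }));
}

example();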
This is just one strategy for dealing with clustering - there are many others such as sticky load balancing, algorithmic distribution, broadcast distribution, etc... You can see an answer that describes some of the various schemes here.
If you are sure you want to store all the request and responses, have a look at this question.
Serializing Cyclic objects
You can also try cycle.js.
However, I think you would only be interested in serializing a few elements from the request/response. An easier (and probably better) approach would be to just copy the required key/value pairs from the request/response objects into a separate object and store that.
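For example (which fields you actually need is an assumption on my part):

const http = require('http');

// Keep only the plain-data parts of the request; the live socket,
// parser internals, etc. are what make the full object cyclic.
function snapshotRequest(req) {
    return {
        method: req.method,
        url: req.url,
        headers: req.headers,
    };
}

http.createServer((req, res) => {
    const stored = JSON.stringify(snapshotRequest(req)); // now safe for redis
    res.end(stored);
}).listen(8080);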

Is it interesting to create a new node app to handle socket.io?

I want to add some sockets with Node.js and Socket.io to an existing project.
I already have 2 servers:
A RESTful API web service, to store and manage my data.
A public web service to return HTML and assets (js, css, images, ...).
On the first try, I created my socket server on the public one. But I think it would be better if I created another one to handle only socket queries.
What do you think? Is it a good idea, or just useless complexity that will add more problems than it solves (maybe duplicated internal libs, ...)?
Also, I'm using a token to communicate between Public and API; do I have to create another one for communication between the Socket server and the API? Or can I use the same one?
------[EDIT]------
As nobody understood me well, I have created a schema of the infrastructure I was thinking about.
Is this a good way to proceed?
Do the Public server and the Socket server have to be the same? Or can they be separate?
Do I have to create a socket connection between the API and the Socket server for each connected client?
Thank you!
Thanks for explaining better.
First of all, while this seems reasonable, this way of using Socket.io is not the most common one. The biggest advantage of using Socket.io is that it keeps a channel open for 2-way communication. The main advantage of this is that the server itself can send messages to the client without the latter having to poll periodically.
Think, for example, of a mail client. Without sockets, the browser would have to poll periodically to check for new mail. With an open socket connection, instead, as soon as a new mail comes the server notifies the client immediately.
In your case, the benefits could be limited, and I'm not sure the additional complexity (and cost!) of a Socket.io server would really be worth the modest speed improvement on REST requests. However, in the end it's up to you.
In answer to your points:
See above
If the "public server" is not written in Node.js they can't be the same application. Wether they reside on the same server, it's up to you and your budget. Ideally they should be separate, for bigger workloads.
If you just want the socket server to act as a real-time proxy, then yes, you'll have to create a socket connection for each request. How that will work is:
The client requests a resource from the Socket.io server.
The Socket.io server makes a normal HTTP request to the API server (e.g. using request).
The response is returned to the client over the socket connection
The workflow represented in #3 is the reason why you should expect only a moderate performance improvement. Indeed, you'll get somewhat better latency, but most of the overhead of starting an HTTP request is still there!
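For illustration, a minimal sketch of that proxy workflow, assuming Socket.io acknowledgements and Node 18+'s built-in fetch (the event name and API URL are made up):

const { Server } = require('socket.io');
const io = new Server(3000);

io.on('connection', (socket) => {
    // 1. The client requests a resource over the socket.
    socket.on('getResource', async (path, ack) => {
        // 2. The Socket.io server makes a normal HTTP request to the API server.
        const response = await fetch('https://api.example.com' + path);
        // 3. The response is returned to the client over the socket connection.
        ack(await response.json());
    });
});

The client side would call socket.emit('getResource', '/users/42', callback) and receive the API response in the callback.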

connecting two client sockets

I have an application that has a client and a server. The server is basically only used to store the file names that the clients have, so that when other clients want to search for files, they can go to the server, find the client that has the file they want, and receive the file by directly connecting to it. By now, I can get the socket information of the client that has the file requested by the other client. However, I am now confused about how to connect these two clients. Do I have to create a separate client and server socket between the two clients, or are there other ways?
Now you have two choices:
Let the server continue its role and act as an intermediary between the two parties: it downloads the file from the client which has it and sends it (via any suitable protocol) to the client who requested it. This is the client-server architecture. It is a simple approach, and you get benefits such as file caching, i.e., if the same file is requested again later, the server can send it directly without asking the client.
You can continue using the P2P architecture and create a separate socket between the two parties. This is not straightforward and needs special care when multiple processes are working simultaneously.
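A bare-bones sketch of option 2 with plain TCP sockets (addresses, ports and file names are illustrative): the client that has the file listens like a small server, and the requesting client dials it directly using the address your central server handed back.

const net = require('net');
const fs = require('fs');

// On the client that HAS the file: listen like a tiny server.
net.createServer((peer) => {
    fs.createReadStream('shared/file.txt').pipe(peer);
}).listen(9000);

// On the client that WANTS the file: connect directly using the
// host/port the central server looked up for you.
const peer = net.connect({ host: '203.0.113.5', port: 9000 });
peer.pipe(fs.createWriteStream('downloads/file.txt'));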
