With a Node.js cluster, how do I share connections? - node.js

I have an Azure hosted application (iisnode) that accepts direct connections from multiple client services. This application streams data between the various connections. If running on a system with multiple instances of node.js, the actual TCP connections will be connecting to different instances.
Is there a way to somehow "move" or "share" the in-memory connection from one instance to another?
Sure, I could build some inter-instance communication to route data, but I don't think the application will scale since it's entire purpose is to move data around quickly. For example, I would have 4 instances, 100 connections to each, and I would spend as many resources moving the data between instances as I would spend moving the data between the client connections.

When you configure iisnode to create more than one node.exe process (using the nodeProcessCountPerApplication setting), it will dispatch incoming HTTP requests between them using a round robin logic; the application has no control over that behavior. Given your scenario there is no way to deterministically ensure that the requests ("connections") from two distinct clients will be colocated in the same node.exe process.
There is no mechanism to "move" an existing TCP connection or HTTP request between node.exe processes.
In general a better way to address such a notification scenario may be to use a subscription-based messaging infrastructure as your backend. ServiceBus in Azure provides such mechanisms. In this design, each instance of node.exe would subscribe to a particular topic when it receives a connection from the client, and be notified by ServiceBus when a matching notification arrives, possibly via a different instance of node.exe.

Related

WebSocket server clustering with state sharing in Kubernetes deployment

I am looking for a way to cluster WebSocket servers - written in node - so that a proper load balancing and a client request will be served by appropriate node instance. In case of WebSocket, the connection is stateful and I believe a node cluster could help. I want the connection/state information to be shared so that any node instance could serve the request than the client does not need to keep a track of the specific node instance. The reason for this thought process is to ensure that the node instances can be killed and replaced by new instances without bothering about the overheads of state management.
I have a setup where we use multiple instances with load balancers in AWS ECS, deployed by CI/CD pipelines. The number of frontend and backend servers varies between 2 and 8 each depending on bursts and current deployments. If one server crashes, a new one will take its place.
We use socket.io with the Redis adapter to share the websocket state between all connected instances via the in-memory db Redis. This ensures that even if the clients are connected to different instances, they all receive the events.

How to configure MassTransit in an unreliable network environment?

I'm trying to get my head around MassTransit in combination with RabbitMQ.
The basic concepts are working in a test project, but what I need is the following:
My system will have one or more servers that react to real life events (telephony). These events wil, by means of MassTransit and RabbitMQ, translate into messages that will be picked up by one or more receivers via a separate server, set up as RabbitMQ host. So far so good.
However, I cannot assume that I always have a connection between the publisher and the host machines. Just assume that the publishing server will continue to consume the real life events, but now cannot publish it's messages.
So, the question is: Does MassTransit have some kind of mechanism to store messages locally some way until the connection is re-established?
Or should I install RabbitMQ on every publishing server as well, in order to create a local exchange? Then I have to make the exchanges synchronize themselves after a reconnect.
Probably you have to implement a store and forward policy. Instead of publishing directly your message through MassTransit and RabbitMQ, you can store the message in a persistence repository (a local database) and delegate to some other process the notification through Masstransit of the messages stored before. This approach is often referred as "Client High Availability". This does not substitute the standard HA (High Availability) on server like the one implemented by RabbitMQ. But it's a good approach to use in a distributed system (like the one you described) because it could help you a lot in scenarios of server failure (e.g. an issue on RabbitMQ server that causes some loss of messages that you still have inside the store of some client and therefore you can make it process again).

What is the best way to communicate between two servers?

I am building a web app which has two parts. In one part it uses a real time connection between the server and the client and in the other part it does some cpu intensive task to provide relevant data.
Implementing the real time communication in nodejs and the cpu intensive part in python/java. What is the best way the nodejs server can participate in a duplex communication with the other server ?
For a basic solution you can use Socket.IO if you are already using it and know how it works, it will get the job done since it allows for communication between a client and server where the client can be a different server in a different language.
If you want a more robust solution with additional options and controls or which can handle higher traffic throughput (though this shouldn't be an issue if you are ultimately just sending it through the relatively slow internet) you can look at something like ØMQ (ZeroMQ). It is a messaging queue which gives you more control and lots of different communications methods beyond just request-response.
When you set either up I would recommend using your CPU intensive server as the stable end(server) and your web server(s) as your client. Assuming that you are using a single server for your CPU intensive tasks and you are running several NodeJS server instances to take advantage of multi-cores for your web server. This simplifies your communication since you want to have a single point to connect to.
If you foresee needing multiple CPU servers you will want to setup a routing server that can route between multiple web servers and multiple CPU servers and in this case I would recommend the extra work of learning ØMQ.
You can use http.request method provided to make curl request within node's code.
http.request method is also used for implementing Authentication api.
You can put your callback in the success of request and when you get the response data in node, you can send it back to user.
While in backgrount java/python server can utilize node's request for CPU intensive task.
I maintain a node.js application that intercommunicates among 34 tasks spread across 2 servers.
In your case, for communication between the web server and the app server you might consider mqtt.
I use mqtt for this kind of communication. There are mqtt clients for most languages, including node/javascript, python and java. In my case I publish json messages using mqtt 'topics' and any task that has registered to subscribe to a 'topic' receives it's data when published. If you google "pub sub", "mqtt" and "mosquitto" you'll find lots of references and examples. Mosquitto (now an Eclipse project) is only one of a number of mqtt brokers that are available. Another very good broker that is written in Java is called hivemq.
This is a very simple, reliable solution that scales well. In my case literally millions of messages reliably pass through mqtt every day.
You must be looking for socketio
Socket.IO enables real-time bidirectional event-based communication.
It works on every platform, browser or device, focusing equally on reliability and speed.
Sockets have traditionally been the solution around which most
realtime systems are architected, providing a bi-directional
communication channel between a client and a server.

redis in Node.js app environment

I am building an app with several Node.js instances as a Backend (http server, socket server and several a pool of domain servers). Now I am trying to cover several communication and configuration aspects and am wondering if redis makes an appropriate solution.
So, I would use it for the following purposes:
Implementation of a shared run-time lookup table. It's a table of several hundreds of relativelly simple records, accessed and manipulated by 2 node-instances.
Implementation of message queues. Each domain server receives commands from the http server and should execute them sequentially. Domain server should be able to listen on a redis-event, and execute each new command upon its arival
socket sever also has a regis message queue and listen to its event, in order to push notification to connected clients
Is redis "too heavy" for such a purpose?
Does it offer all needed functionality?
I can definitelly implement a look-up in a file and/or memory and a queue using sockets. However, it might make a code cleaner and a solution more robust with redis.
Redis is definitely not a heavy solution, on the contrary.
It's small, insanely fast (when using pipelining), easy to deploy. I consider it as a light solution, a kind of swiss knife that may solves many problems.
Redis based message queues are OK if you don't expect any guarantee on the message delivery. That is to say Redis based queues can't assure you the client has received the message. If it's a problem for your application you should consider using an heavier solution, like 0mq or Rabbitmq.

Does each queue on ZeroMQ require it's own port?

We are looking to build a facade in nodejs that will accept requests from a client and then farm out the requests to a number of services using request/reply pattern to a number of different backend services. We want these requests held on individual queues in the event that one of the backend services is down. From initially reading of the ZeroMQ docs, it appears each queue is bound to its own port. When sending a message to a socket, there doesn't appear to be a way of naming a queue/topic to send to.
Is there a one-one mapping between ports and queues?
Thanks, Tom
ZeroMQ doesn't have the concept of "queues" or "topics". Your application consists of tasks, connected across some protocol, e.g. tcp://, and sending each other messages in various patterns. In your example one task will bind to an address:port and the workers will connect to it. The sender then sends requests to its socket, which deals them out to workers.
The best way to learn ZeroMQ is to work through at least the first couple of chapters of the Guide, before you design your own application. Many of the existing messaging concepts you're familiar with disappear into simpler patterns with ZeroMQ.

Resources