Communicating between two different processes in Node.js

The issue is:
Let's assume we have two Node.js processes running: example1.js and example2.js.
In example1.js there is a function func1(input) which returns result1 as its result.
Is there a way, from within example2.js, to call func1(input) and obtain result1 as the outcome?
From what I've learned about Node.js, I have found only one solution, which uses sockets for communication. This is less than ideal, however, because it would require one process to listen on a port. If possible, I wish to avoid that.
EDIT: After some questions I'd like to add that, in the hierarchy, example1.js cannot be a child process of example2.js, but rather the opposite. Also, if it helps: there can be only one example1.js processing its own data, and many example2.js's processing their own data plus data from the first process.

The use case you describe makes me think of dnode, which lets you easily expose functions to be called by other processes; dnode coordinates the calls over network sockets (and socket.io, so you can use the same mechanism in the browser).
Another approach would be to use a message queue; there are many good bindings for different message queues.
The simplest way, to my knowledge, is to use child_process.fork():
This is a special case of the spawn() functionality for spawning Node processes. In addition to having all the methods in a normal ChildProcess instance, the returned object has a communication channel built-in. The channel is written to with child.send(message, [sendHandle]) and messages are received by a 'message' event on the child.
So, for your example, you could have example2.js:
var fork = require('child_process').fork;
var example1 = fork(__dirname + '/example1.js');

example1.on('message', function (response) {
  console.log(response); // 'Hello world'
});

example1.send({ input: 'world' });
And example1.js:
function func(input) {
  // Send the result back to the parent process.
  process.send('Hello ' + input);
}

process.on('message', function (m) {
  func(m.input);
});

Maybe you should try Messenger.js. It can do IPC in a handy way, so you don't have to implement the communication between the two processes yourself.

Use Redis as a message bus/broker.
https://redis.io/topics/pubsub
You can also use socket messaging like ZeroMQ, which is point-to-point / peer-to-peer, instead of using a message broker like Redis.
How does this work?
With Redis, both of your node applications have Redis clients doing pub/sub. Each node.js app would have a publisher client and a subscriber client (yes, you need two clients per node process for Redis pub/sub).
With ZeroMQ, you can send messages over IPC channels directly between node.js processes, with no broker involved (except perhaps the OS itself).
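To make that concrete, here is a minimal pub/sub sketch using the classic node_redis API (the same subscribe/on('message') interface that appears later in this thread; the channel name and payload are made up, and newer node-redis versions expose a different subscribe signature):

var redis = require('redis');

// Publisher side: one client that only publishes.
var pub = redis.createClient();
setInterval(function () {
  pub.publish('updates', JSON.stringify({ at: Date.now() }));
}, 1000);

// Subscriber side: a dedicated client, because once a client calls
// subscribe() it cannot issue regular commands anymore.
var sub = redis.createClient();
sub.on('message', function (channel, message) {
  console.log('received on %s: %s', channel, message);
});
sub.subscribe('updates');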

Related

Redis publish memory leak?

I know that there are already many questions like this, but I haven't found one that fits my implementation.
I'm using Redis in a Node.js environment, and it feels like redis.publish is leaking memory. I expect it to be some kind of "backpressure" thing, like the one seen here:
Node redis publisher consuming too much memory
But to my understanding, Node needs to relieve that kind of pressure in a synchronous context; otherwise the node event loop won't run, and the GC won't run either.
My program looks like that:
const websocketApi = new WebsocketApi()
const currentState = {}

websocketApi.connect()

websocketApi.on('open', () => {
  channels.map((channel) => websocketApi.subscribeChannel(channel))
})

websocketApi.on('message', (message) => {
  const ob = JSON.parse(message)
  if (currentState[ob.id]) {
    currentState[ob.id] = update(currentState[ob.id], ob.data)
  } else {
    currentState[ob.id] = ob.data
  }
  const payload = {
    channel: ob.id,
    info: currentState[ob.id],
    timestamp: Date.now(),
    type: 'newData'
  }
  // when I remove this part, the memory is stable
  redisClient.publish(payload.channel, JSON.stringify(payload))
})

// to reconnect in case of error
websocketApi.on('close', () => websocketApi.connect())
It seems that the messages come too close to each other, so there is no time to release the strings held by redis.publish.
Do you have any idea what is wrong in this code?
EDIT: More specifically, what I can observe when I take memory dumps of my application:
The memory is saturated with strings that are my stringified JSON payloads, and with "chunks" of messages that are sent via Redis itself. Their refs are held inside the redis client, mainly in variables called chunk.
Some string payloads are still released, but I create them far faster.
When I don't publish the messages via Redis, the currentState variable grows up to a point and then stops growing. It obviously has a big RAM impact, but that's expected. The rest is fine, and the application is stable at around 400 MB; it explodes with the Redis publisher (PM2 restarts it because it reaches max RAM capacity).
My feeling here is that I ask Redis to publish far more than it can handle, and Redis doesn't have time to finish publishing the messages. It still holds all the context, so it doesn't release anything. I may need some kind of "queue" to let Redis release some context and finish publishing the messages. Is that really a possibility, or am I going crazy?
Basically, every loop in my program is "independent". Is it possible to have as many Redis clients as I have loops? Is that a better idea? (IMHO, Node is single-threaded, so it won't help, but it may help V8 better track down memory references and release memory.)
The redis client buffers commands while it is not connected, either because it has not yet connected or because its connection failed.
Make sure that you can reach the redis server, and that your program actually connects to it. I would suggest adding a listener for redisClient.on('connect'); if that is never emitted, the client never connected.
If you are connected, the client shouldn't be buffering, but to make the problem appear sooner you can disable the offline queue: pass the option enable_offline_queue: false to createClient. This will cause attempts to send commands while disconnected to fail.
You should also attach an error listener to the redisClient: redisClient.on('error', console.error.bind(console)). This might yield a message as to why the client is buffering.
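Put together, a minimal sketch of those diagnostics (option name per the classic node_redis API; newer versions of the client expose different option names):

var redis = require('redis');

// Fail fast instead of silently buffering commands while disconnected.
var redisClient = redis.createClient({ enable_offline_queue: false });

// If 'connect' never fires, the client never reached the server.
redisClient.on('connect', function () {
  console.log('connected to redis');
});

// Errors are otherwise swallowed; this may explain why the client buffers.
redisClient.on('error', console.error.bind(console));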

Node JS Socket.IO Emitter (and redis)

I'll give a small premise of what I'm trying to do. I have a game concept in mind which requires multiple players sitting around a table somewhat like poker.
The normal interaction between different players is easy to handle via socket.io in conjunction with node js.
What I'm having a hard time figuring out is this: I have a cron job running in another process which fetches new information every minute, and that information then needs to be sent to each of those players. Since this is a different process, I'm not sure how to send this information to specific clients.
socket.io does have information for this and I'm quoting it below:
In some cases, you might want to emit events to sockets in Socket.IO namespaces / rooms from outside the context of your Socket.IO processes.
There’s several ways to tackle this problem, like implementing your own channel to send messages into the process.
To facilitate this use case, we created two modules:
socket.io-redis
socket.io-emitter
From what I understand I need these two modules to do what I mentioned earlier. What I do not understand however is why is redis in the equation when I just need to send some messages.
Is it used to just store the messages temporarily?
Any help will be appreciated.
There are several ways to achieve this if you just need to emit after an external event. It depends on what you're using to get the new data to send:
/* If the other process reaches you via an HTTP POST, you can for example
   use express and expose your io object through a custom middleware: */

// pass the io object in the req object
app.use('/incoming', (req, res, next) => {
  req.io = io;
  next(); // don't forget to pass control on, or the request will hang
});

// then you can do:
app.post('/incoming', (req, res, next) => {
  req.io.emit('incoming', req.body);
  res.send('data received from http post request then sent over the socket');
});

// if you fetch data every minute, why not just emit after your job
// (assuming node-schedule's scheduleJob and axios):
const schedule = require('node-schedule');
const axios = require('axios');

const job = schedule.scheduleJob('0 * * * * *', () => {
  axios.get('/myApi/someRessource')
    .then((res) => io.emit('newData', res.data));
});
As for socket.io providing those two modules: as I read it, you do indeed need both for that approach, but it isn't necessarily what you want. Yes, redis is probably just used to store the data temporarily, and it does a really good job there, being close to what a message queue does.
Your cron then wouldn't need a message queue or similar behaviour.
My suggestion, though, would be to run the cron with some node package from within your process as a child_process, hook onto its readable stream, and then push directly to your sockets, as sketched below.
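A rough sketch of that child_process approach (the script name and message format are made up; it assumes the job writes one JSON document per line to stdout):

const { spawn } = require('child_process');

// Run the job script as a child process and relay its stdout to the sockets.
const job = spawn('node', ['fetch-job.js']);

job.stdout.on('data', (chunk) => {
  // Push the raw line(s) straight to all connected clients.
  io.emit('newData', chunk.toString());
});

job.on('error', (err) => console.error('job failed to start', err));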
If the cron job process is also a Node.js process, you can exchange data through the Redis pub/sub mechanism.
Let me know what your cron job process is written in, and whether you need further help with the pub/sub mechanism.
redis is one of the memory stores used by socket.io (in case you configure it).
You need redis only if you have a multi-server configuration (cluster), to establish a connection and room/namespace sync between those node.js instances. It has nothing to do with storing data in this case; it works as a pub/sub machine.
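For completeness, a minimal sketch of the two modules quoted from the socket.io docs above (legacy socket.io-redis / socket.io-emitter APIs; recent socket.io releases have renamed these packages):

// In the socket.io server process: attach the redis adapter so that
// multiple instances stay in sync through redis pub/sub.
const io = require('socket.io')(3000);
const redisAdapter = require('socket.io-redis');
io.adapter(redisAdapter({ host: '127.0.0.1', port: 6379 }));

// In the cron job process: no socket.io server needed, just the emitter,
// which publishes through the same redis instance.
const emitter = require('socket.io-emitter')({ host: '127.0.0.1', port: 6379 });
emitter.emit('newData', { fetchedAt: Date.now() });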

NodeJS and AWS SQS

Folks,
I would like to set up a message queue between our Java API and NodeJS API.
After reading several examples of using aws-sdk, I am still not sure how to make the service watch the queue.
For instance, this article, Using SQS with Node: Receiving Messages Example Code, tells me to use sqs.receiveMessage() to receive messages and sqs.deleteMessage() to delete one.
What I am not clear about is how to wrap this into a service that runs continuously, constantly taking messages off the SQS queue, passing them to the model, storing them in mongo, etc.
Hope my question is not entirely vague. My experience with Node lies primarily with Express.js.
Is the answer as simple as using something like sqs-poller? How would I implement it in an already running NodeJS Express app? Quite possibly I should look into SNS to avoid any delay in message transfers.
Thanks!
For a start, Amazon SQS is a pseudo-queue that guarantees availability of messages but not their FIFO ordering. You have to implement the sequencing logic in your app if you want it to work that way.
Coming back to your question: SQS has to be polled from within your app to check whether new messages are available. I implemented this in an app using setInterval(). I would poll the queue for items; if no items were found, I would delay the next call, and if some items were found, the next call would be immediate, bypassing the setInterval(). This is obviously a very raw implementation, and you can look into alternatives. How about a child process on your server that pings your NodeJS app when a new item is found in SQS? I think you can implement the child process as a watcher in BASH without using NodeJS. You can also look into npm modules to see if one already exists for this.
In short, there are many ways you can poll, but polling has to be done one way or another if you are working with Amazon SQS; a rough sketch of the poll-with-backoff idea is below.
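A sketch of that receive/backoff loop with aws-sdk v2 (the region and queue URL are placeholders):

const AWS = require('aws-sdk');

const sqs = new AWS.SQS({ region: 'us-east-1' });
const queueUrl = 'https://sqs.us-east-1.amazonaws.com/account-id/queue-name';

function poll() {
  sqs.receiveMessage({
    QueueUrl: queueUrl,
    MaxNumberOfMessages: 10,
    WaitTimeSeconds: 20 // long polling cuts down on empty responses
  }, (err, data) => {
    if (err) {
      console.error(err);
      return setTimeout(poll, 5000);
    }
    if (!data.Messages || data.Messages.length === 0) {
      return setTimeout(poll, 5000); // nothing found: delay the next call
    }
    data.Messages.forEach((message) => {
      // hand the message to your model / store it in mongo here, then delete it
      sqs.deleteMessage({
        QueueUrl: queueUrl,
        ReceiptHandle: message.ReceiptHandle
      }, (delErr) => { if (delErr) console.error(delErr); });
    });
    poll(); // items found: poll again immediately
  });
}

poll();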
I am not sure about this but if you want to be notified of items, you might want to look into Amazon SNS.
When writing applications to consume messages from SQS I use sqs-consumer:
const Consumer = require('sqs-consumer');

const app = Consumer.create({
  queueUrl: 'https://sqs.eu-west-1.amazonaws.com/account-id/queue-name',
  handleMessage: (message, done) => {
    console.log('Processing message: ', message);
    done();
  }
});

app.on('error', (err) => {
  console.log(err.message);
});

app.start();
See the docs for more information (well documented):
https://github.com/bbc/sqs-consumer

why is performance of redis+socket.io better than just socket.io?

I earlier had all my code on a socket.io + node.js server. I recently converted everything to redis + socket.io + node.js after noticing slow performance when too many users sent messages across the server.
So why was socket.io alone slow? Because it is not multi-threaded, it handles one request or emit at a time.
What redis does is distribute these requests or emits across channels. Clients subscribe to different channels, and when a message is published on a channel, all the client subscribed to it receive the message. It does it via this piece of code:
sub.on("message", function (channel, message) {
client.emit("message",message);
});
The client.on('message', function(){}) handler takes it from there, publishing messages to different channels.
Here is a brief piece of code explaining what I am doing with redis:
io.sockets.on('connection', function (client) {
  var pub = redis.createClient();
  var sub = redis.createClient();

  sub.on("message", function (channel, message) {
    client.emit('message', message);
  });

  client.on("message", function (msg) {
    if (msg.type == "chat") {
      pub.publish("channel." + msg.tousername, msg.message);
      pub.publish("channel." + msg.user, msg.message);
    }
    else if (msg.type == "setUsername") {
      sub.subscribe("channel." + msg.user);
    }
  });
});
As redis stores the channel information, we can have different servers publish to the same channel.
So, what I don't understand is this: if sub.on("message") is getting called every time a request or emit is sent, why is redis supposed to give better performance? I suppose even the sub.on("message") method is not multi-threaded.
As you might know, Redis allows you to scale across multiple node instances. So the performance actually comes after the fact. Utilizing the pub/sub method is not faster; it's technically slower, because you have to go through Redis for every pub/sub signal. The "better performance" only really materializes when you start to scale out horizontally.
For example, say one node instance (a simple chat room) can handle a maximum of 200 active users. You are not using Redis yet because there is no need. Now, what if you want 400 active users? Using your example above, you can now reach that 400-user mark, which is a "performance increase" in the sense that you can handle more users, but not really a speed increase, if that makes sense. Hope this helps!

Realtime messaging with NodeJS across multiple processes

I'm trying to implement an API that interacts with a NodeJS server for realtime messaging. Now when that NodeJS app is deployed to a scalable environment like Heroku, multiple instances of this app may be running.
Is it possible to design the node app so that all clients subscribed to a "message channel" will receive this message, although multiple node instances are running - and therefore multiple copies of this channel?
Check out zeromq; it should provide the simple, high-performance IPC abstractions you need. In particular, the pub/sub example will be useful.
The main challenge as I imagine it, without knowing anything about how Heroku spawns multiple server instances, will be the logic to determine who the publisher is (the rest of the instances will be subscribers). So let's say, for argument's sake, that your hosting provider gives you an environment variable called INSTANCE_NUM, an integer in [0, 1024] indicating the instance number of the process; we'll say instance zero is the message publisher.
var zmq = require('zeromq')

if (process.env['INSTANCE_NUM'] === '0') { // I'm the publisher.
  var emitter = getEventEmitter(); // e.g. an HttpServer.
  var pub = zmq.createSocket('pub');
  pub.bindSync('tcp://*:5555');
  emitter.on('someEvent', function (data) {
    pub.send(data);
  });
} else { // I'm a subscriber.
  var sub = zmq.createSocket('sub');
  sub.subscribe('');
  sub.on('message', function (data) {
    // Handle the event data...
  });
  sub.connect('tcp://localhost:5555');
}
Note that I'm new to zeromq and the above code is totally untested, just for demonstration.
