Node.js - Fork blocks parent loop execution

I'm trying to 'multi-thread' using fork and starting up a new process. The problem is that once the program runs and the first fork executes, the whole parent code is blocked by the executing fork.
I thought fork just returned a process object?
Here is a stripped-back version of the code I am using:
// I am looping a multi-dimensional array and sending an object to the fork IPC.
let array = ["a0", "a1", "a2"];
// Blocking Code
array.forEach((arrayItem) => {
    const forked = fork('./child.js');
    // IPC code to send object...
    // Capture IPC child messages
    forked.on('message', (msg) => {
        // Handle message...
    });
});
It's worth adding what I want to happen, for clarity!
Using forEach, I'd be able to create multiple child processes that all communicate over IPC without blocking the parent's execution.

Related

How to queue up websocket messages through forked worker and simultaneously process them one by one in main loop in Node js?

I am writing a crypto trading bot that is required to listen to websocket streams (orderbook changes, trade executions etc.). On every event, the websocket should save the incoming message in an array and call the main logic loop to process the message.
While the logic code is executing, if more messages are received, they should be queued up in an array (saved somewhere) and should not immediately call the main loop. The idea is that once the main loop is done with its processing, it can look at the queue array and pick the next message to process. This way no messages will be lost, and the main logic loop won't be called multiple times if multiple messages arrive while it is already working.
I am using the following code but am not able to achieve the desired architecture.
webSocket.onopen = function(event) {
    var msg = {
        "op": "authKeyExpires",
        "args": ["somekey", nonce, signature + ""]
    };
    webSocket.send(JSON.stringify(msg));
    webSocket.send(JSON.stringify({"op":"subscribe", "args":["orderBookApi:BTCPFC_0"]}));
};
webSocket.onmessage = async function(e) {
    queue.push(JSON.parse(e.data));
    main_logic(queue);
}
async function main_logic(queue){
    // process the next message in the queue and then delete it. Keep doing it till queue is empty.
}
I have read that maybe forking or a worker process for the websocket can help. Kindly advise, as I am new to Node.js and programming in general.
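One way to get that behavior - with no forking needed - is a single-flight queue: the message handler only pushes, and a drain loop guarded by a flag processes one message at a time. A minimal sketch, with handleMessage standing in for the real main logic:

```javascript
// Messages are only ever pushed by the websocket handler; processQueue()
// drains them one at a time, and the `processing` flag prevents re-entry.
const queue = [];
const processed = [];       // stands in for "results of the main logic"
let processing = false;

async function handleMessage(msg) {
    // stand-in for the real main logic; it may await I/O here
    processed.push(msg);
}

async function processQueue() {
    if (processing) return; // a drain is already running; it will pick this up
    processing = true;
    while (queue.length > 0) {
        const msg = queue.shift();
        await handleMessage(msg);
    }
    processing = false;
}

// In the real handler you would write:
// webSocket.onmessage = (e) => { queue.push(JSON.parse(e.data)); processQueue(); };
queue.push({ seq: 1 }, { seq: 2 }, { seq: 3 });
const drained = processQueue();
drained.then(() => console.log("processed in order:", processed.map((m) => m.seq))); // [ 1, 2, 3 ]
```

Because Node is single-threaded, the flag check and the push can never interleave mid-statement, so no locking is needed.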

How do I avoid a race condition with Node.js's process.send?

What exactly happens when a child process (created by child_process.fork()) in Node sends a message to its parent (process.send()) before the parent has an event handler for the message (child.on("message",...))? (It seems, at least, like there must be some kind of buffer.)
In particular, I'm faced with what seems like an unavoidable race condition - I cannot install a message handler on a child process until after I've finished the call to fork, but the child could potentially send me (the parent) a message right away. What guarantee do I have that, assuming a particularly horrible interleaving of OS processes, I will receive all messages sent by my child?
Consider the following example code:
parent.js:
const child_process = require("child_process");
const child_module = require.resolve("./child");
const run = async () => {
    console.log("parent start");
    const child = child_process.fork(child_module);
    await new Promise(resolve => setTimeout(resolve, 40));
    console.log("add handler");
    child.on("message", (m) => console.log("parent receive:", m));
    console.log("parent end");
};
run();
child.js:
console.log("child start");
process.send("123abc");
console.log("child end");
In the above, I'm hoping to simulate a "bad interleaving" by preventing the message handler from being installed for a few milliseconds (suppose that a context switch takes place immediately after the fork, and that some other processes run for a while before the parent's node.js process can be scheduled again). In my own testing, the parent seems to "reliably" receive the message with numbers << 40ms (e.g. 20ms), but for values >35ms, it's flaky at best, and for values >> 40ms (e.g. 50 or 60), the message is never received. What's special about these numbers - just how fast the processes are being scheduled on my machine?
It seems to be independent of whether the handler is installed before or after the message is sent. For example, I've observed both of the following executions with the timeout set to 40 milliseconds. Notice that in each one, the child's "end" message (indicating that the process.send() has already happened) comes before "add handler". In one case, the message is received, but in the next, it's lost. It's possible, I suppose, that buffering of the standard output of these processes could be misrepresenting the true execution - is that what's going on here?
Execution A:
parent start
child start
child end
add handler
parent end
parent receive: 123abc
Execution B:
parent start
child start
child end
add handler
parent end
In short - is there a solution to this apparent race condition? I seem to be able to "reliably" receive messages as long as I install a handler "soon" enough - but am I just getting lucky, or is there some guarantee that I'm getting? How do I ensure, without relying on luck, that this code will always work (barring cosmic rays, spilled coffee, etc...)? I can't seem to find any detail about how this is supposed to work in the Node documentation.
What exactly happens when a child process (created by child_process.fork()) in Node sends a message to its parent (process.send()) before the parent has an event handler for the message (child.on("message",...))? (It seems, at least, like there must be some kind of buffer.)
First off, the fact that a message arrived from another process goes into the nodejs event queue. It won't be processed until the current nodejs code finishes whatever it was doing and returns control back to the event loop so that it can process the next event in the event queue. If that moment arrives before there is any listener for that incoming event, then it is just received and then thrown away. The message arrives, the code looks to call any registered event handlers and if there are none, then it's done. It's the same as if you call eventEmitter.emit("someMsg", data) and there are no listeners for "someMsg". But, read on, there is hope for your specific situation.
In particular, I'm faced with what seems like an unavoidable race condition - I cannot install a message handler on a child process until after I've finished the call to fork, but the child could potentially send me (the parent) a message right away. What guarantee do I have that, assuming a particularly horrible interleaving of OS processes, I will receive all messages sent by my child?
Fortunately, due to the single-threaded, event-driven nature of nodejs, this is not a problem. You can install the message handler before there's any chance of the message arriving and being processed. This is because even though the child may be started up and may be running independently using other CPUs or interleaved with your process, the single-threaded nature and the event driven architecture help you solve this problem.
If you do something like this:
const child = child_process.fork(child_module);
child.on("message", (m) => console.log("parent receive:", m));
Then you are guaranteed that your message handler will be installed before there's any chance of an incoming message being processed and you will not miss it. This is because the interpreter is busy running these two lines of code and does not return control back to the event loop until after these two lines of code are run. Therefore, no incoming message from the child_module can get processed before your child.on(...) handler is installed.
Now, if you purposely return to the event loop before installing your event handler, as your code does with the await here:
const run = async () => {
    console.log("parent start");
    const child = child_process.fork(child_module);
    // this await allows events in the event queue to be processed
    // while this function is suspended waiting for the await
    await new Promise(resolve => setTimeout(resolve, 40));
    console.log("add handler");
    child.on("message", (m) => console.log("parent receive:", m));
    console.log("parent end");
};
run();
Then, you have purposely introduced a race condition with your own coding that can be avoided by just installing the event handler BEFORE the await like this:
const run = async () => {
    console.log("parent start");
    // no events will be processed before these next three statements run
    const child = child_process.fork(child_module);
    console.log("add handler");
    child.on("message", (m) => console.log("parent receive:", m));
    await new Promise(resolve => setTimeout(resolve, 40));
    console.log("parent end");
};
run();

Multiple bash scripts can't run asynchronously within spawned child processes

This was working well for a single asynchronous call:
"use strict";
function bashRun(commandList, stdoutCallback, completedCallback)
{
    const proc = require("child_process");
    const p = proc.spawn("bash");
    p.stdout.on("data", function(data){
        stdoutCallback(data.toString("utf8"));
    });
    p.on("exit", function(){
        completedCallback();
    });
    p.stderr.on("data", function(err){
        process.stderr.write("Error: " + err.toString("utf8"));
    });
    commandList.forEach(i => {
        p.stdin.write(i + "\n");
    });
    p.stdin.end();
}
module.exports.bashRun = bashRun;
But inside a for loop it doesn't. It outputs only the latest element's (process's) stdout info:
for(var i = 0; i < 20; i++)
{
    var iLocal = i;
    bashRun(myList, function(myStdout){ /* only result for iLocal=19 !*/ }, function(){});
}
I need this to run asynchronously (and concurrently, with multiple child processes) and give output from each stdoutCallback function so I can do some processing on it. While stdout doesn't work, completedCallback is called 20 times, so there must still be 20 processes over some time slice, though I'm not sure they all existed in the same slice of time.
What am I doing wrong, so that the spawned child processes cannot give their output to Node.js? (Why can only the last of them, i=19?)
I tried to exchange spawn with fork but now it gives an error:
p.stdout.on("data",function(data){
^
TypeError: Cannot read property 'on' of null
How can I use something else to retain same functionality of above module?
Looks like an issue with the scope of the value of i; try changing the loop to use let.
Eg: for(let i=0;i<20;i++)
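To see why let fixes it, here is a minimal sketch of the capture difference, using setImmediate as a stand-in for the asynchronous bashRun callbacks:

```javascript
// `var` is function-scoped: every callback closes over the same binding,
// which holds the loop's final value by the time the callbacks run.
// `let` creates a fresh binding per iteration.
const withVar = [];
const withLet = [];

for (var i = 0; i < 3; i++) {
    setImmediate(() => withVar.push(i)); // all three callbacks see i === 3
}
for (let j = 0; j < 3; j++) {
    setImmediate(() => withLet.push(j)); // each callback sees its own j
}

setImmediate(() => {
    console.log(withVar); // [ 3, 3, 3 ]
    console.log(withLet); // [ 0, 1, 2 ]
});
```

The same applies to `var iLocal = i` inside the loop body: var hoists it to function scope, so all 20 callbacks share one binding.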

Electron: Perform sqlite (better-sqlite) db operations in another thread

I'm developing a desktop application using the Electron framework and I have to use an SQLite database for app data.
I decided to use better-sqlite3 because of:
Custom SQL function support (it's very important for me)
It's much faster than node-sqlite3 in most cases
It is simple to use.
Its synchronous API (in most cases I need to get data serially)
But in some cases, when I perform a query that takes a while to respond, the application UI won't respond to the user until the query ends.
How can I run some db queries in another thread, or run them asynchronously (like node-sqlite3)?
Sorry for my bad English.
Node allows you a separate process out-of-the-box. (Threads are a different matter - alas, no WebWorkers - though you can probably find a thread add-on lib somewhere.)
EDIT: Node has added worker_threads since I originally posted this answer. Haven't tried it yet / don't know if it works with better-sqlite. END EDIT
I've had the same issue as you - needing synchronous code to run without blocking the main thread - and I used a child process. It was for better-sqlite too!
The problem is that how to handle I/O streams, SIGINTs etc. for control is not immediately obvious, and it differs depending on whether you're running on Windows or POSIX.
I use a forked child process with the silent option set to true to do the synchronous db work.
If you need control of that process, or progress reports back to your main process for your GUI during sync ops, you can control/communicate with the child process by reading/writing on the child's stdin/stdout using fileSystem writeFileSync / readFileSync at various points in the child process code. (You can't use the normal inter-process comms API during sync ops, as that's event-driven and can't operate while synchronous code is running - though you can mix and match the two types of I/O.)
Example of a forked child process:
//parent.js and child.js in same folder

//parent.js
process.on('exit', (code) => {
    console.log(`Parent to exit with code: ${code}`);
});
const readLine = require("readline");
const cp = require('child_process');
var forkOptions = {
    //execArgv:['--inspect-brk'], // uncomment if debugging the child process
    silent:true // child gets own std pipes (important) which are piped to parent
};
var childOptions = [];
const child = cp.fork(`./child.js`, childOptions, forkOptions);
//for messages sent from child via writeSync
const childChannel = readLine.createInterface({
    input: child.stdout
}).on("line", function(input){
    console.log("writeSync message received from child: " + input);
});
//for messages sent from child via process.send
child.on('message', (m) => {
    console.log("process.send message received from child: " + m);
});

// child.js
process.on('exit', (code) => {
    console.log(`Child to exit with code: ${code}`);
});
const fs = require('fs');
function doSyncStuff(){
    for(let i = 0; i < 20; i++){
        //eg. sync db calls happening here
        process.send(`Hello via process.send from child. i = ${i} \n`); // async comms. picked up by parent's "child.on" event
        fs.writeFileSync(process.stdout.fd, `Hello via writeFileSync from child. i = ${i} \n`); // sync comms. picked up by parent's readLine listener ("process" here is the child)
    }
}
doSyncStuff();

How can I execute a node.js module as a child process of a node.js program?

Here's my problem. I implemented a small script that does some heavy calculation, as a node.js module. So, if I type "node myModule.js", it calculates for a second, then returns a value.
Now, I want to use that module from my main Node.js program. I could just put all the calculation in a "doSomeCalculation" function and then do:
var myModule = require("./myModule");
myModule.doSomeCalculation();
But that would be blocking, thus it'd be bad. I'd like to use it in a non-blocking way, like DB calls natively are, for instance. So I tried to use child_process.spawn and exec, like this:
var spawn = require("child_process").spawn;
var ext = spawn("node ./myModule.js", function(err, stdout, stderr) { /* whatevs */ });
ext.on("exit", function() { console.log("calculation over!"); });
But, of course, it doesn't work. I tried to use an EventEmitter in myModule, emitting "calculationDone" events and trying to add the associated listener on the "ext" variable in the example above. Still doesn't work.
As for forks, they're not really what I'm trying to do. Forks would require putting the calculation-related code in the main program, forking, calculating in the child while the parent does whatever it does, and then how would I return the result?
So here's my question: can I use a child process to do some non-blocking calculation, when the calculation is put in a Node file, or is it just impossible? Should I do the heavy calculation in a Python script instead? In both cases, how can I pass arguments to the child process - for instance, an image?
I think what you're after is the child_process.fork() API.
For example, if you have the following two files:
In main.js:
var cp = require('child_process');
var child = cp.fork('./worker');
child.on('message', function(m) {
    // Receive results from child process
    console.log('received: ' + m);
});
// Send child process some work
child.send('Please up-case this string');
In worker.js:
process.on('message', function(m) {
    // Do work (in this case just up-case the string)
    m = m.toUpperCase();
    // Pass results back to parent process
    process.send(m);
});
Then to run main (and spawn a child worker process for the worker.js code ...)
$ node --version
v0.8.3
$ node main.js
received: PLEASE UP-CASE THIS STRING
It doesn't matter what you use as a child (Node, Python, whatever); Node doesn't care. Just make sure that your calculation script exits after everything is done and the result is written to stdout.
The reason it's not working is that you're using spawn as if it were exec: spawn takes the command plus an array of arguments, and it doesn't accept a callback.
