node: await with child process message handler - node.js

I'm having trouble wrapping my head around some async/await code I'm working on. Is there a way to make a child process's message handler async? Here is what my child process file looks like:
// child.ts
import { writeImage } from './generate-images'
const slowFunction = async (imageAttributes) => {
console.log("inside child slowFunction....")
await writeImage(imageAttributes, 0, true)
}
process.on('message', (msg) => {
console.log('starting child....')
slowFunction(msg)
console.log('exiting child')
process.exit()
})
I call it via fork inside a big loop in the parent process, because I need to run this slow function a few thousand times. This is a simplified version of what I call inside that loop:
// parent.ts
const child = child_process.fork(path.join(__dirname, 'child.ts'))
child.send(Array.from(imageAttributes)[i])
child.on('exit', function () {
console.log(`child exiting`)
// do some cleanup
})
The problem is that all my forks exit before slowFunction finishes, because it is an async function. But I can't write await slowFunction(msg), because the process.on('message', ...) handler is not async.
Any ideas?

You need to make the callback you pass to process.on('message', ...) an async function, so that you can await slowFunction inside it. A callback is not async unless you declare it that way.
process.on('message', async (msg) => {
  console.log('starting child....')
  await slowFunction(msg)
  console.log('exiting child')
  process.exit()
})
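To see why the await matters, here is a minimal sketch that replays both versions of the handler against a plain EventEmitter instead of a real forked process. The names fakeProcess and runHandler, and the timer-based slowFunction, are stand-ins invented for this illustration and are not part of the question's code.

```javascript
const { EventEmitter } = require("node:events");

// Stand-in for the slow async work (writeImage in the question).
function slowFunction(msg, log) {
  return new Promise((resolve) => {
    setTimeout(() => {
      log.push(`finished ${msg}`);
      resolve();
    }, 20);
  });
}

// Emits one message to a handler and reports the order in which things happened.
function runHandler(useAwait) {
  const fakeProcess = new EventEmitter(); // hypothetical stand-in for `process`
  const log = [];
  return new Promise((resolve) => {
    fakeProcess.on("message", async (msg) => {
      if (useAwait) {
        await slowFunction(msg, log);
      } else {
        slowFunction(msg, log); // not awaited: the handler moves on immediately
      }
      log.push("exit"); // the point where process.exit() would run
      setTimeout(() => resolve(log), 50); // give the unawaited work time to finish
    });
    fakeProcess.emit("message", "image-1");
  });
}

runHandler(false).then((log) => console.log("without await:", log));
runHandler(true).then((log) => console.log("with await:", log));
```

Without the await, "exit" is recorded before the slow work finishes, which is the point where the real child would already have exited. Note also that the emitter does not wait for or handle the promise an async listener returns, so a try/catch inside the handler is worth considering.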

Related

My worker threads are not being picked up

I'm implementing worker threads in my Node/TypeScript application. I've come quite far, except that my worker threads don't seem to be picked up or executed. I've added some log statements inside the function that should be executed by the worker thread, but for some reason they don't show up.
I'm calling a function to create a worker. Like this:
create_worker("./src/core/utils/worker.js", {
term: "amsterdam",
search_grid: chunked_maps_service![i],
path: "../../adapters/worker.ts",
})
And this is the function to create a worker:
import { Worker } from "worker_threads";
const create_worker = (file: string, workerData: {}) =>
new Promise<void>((resolve, reject) => {
const worker = new Worker(file, {
workerData: workerData,
});
worker.on("message", resolve);
worker.on("error", reject);
worker.on("exit", (code) => {
if (code !== 0)
reject(new Error(`Worker stopped with exit code ${code}`));
});
});
export default create_worker;
Because I'm using TypeScript, I need to compile the TypeScript code to JavaScript so that the worker can understand it, like this:
const path = require("path");
const { workerData } = require("worker_threads");
require("ts-node").register();
require(path.resolve(__dirname, workerData.path));
And then this is the function which should be executed in the worker thread:
import { parentPort } from "worker_threads";
async function worker() {
console.log("started");
parentPort.postMessage("fixed!");
}
Is there anything I'm forgetting?
You aren't ever calling the function in your worker.
This code in the worker:
import { parentPort } from "worker_threads";
async function worker() {
console.log("started");
parentPort.postMessage("fixed!");
}
Just defines a function named worker. It never calls that function.
If you want it called immediately, then you must call it like this:
import { parentPort } from "worker_threads";
async function worker() {
console.log("started");
parentPort.postMessage("fixed!");
}
worker();
FYI, in the worker implementations I've built, I usually either execute something based on data passed into the worker or the worker executes something based on a message it receives from the parent.
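As a rough illustration of those two styles, here is a single-file sketch. The worker source is passed inline with eval: true purely to keep the example self-contained, and the command name "search" is made up for the illustration.

```javascript
const { Worker } = require("node:worker_threads");

// Worker source kept inline (eval: true) only so the sketch fits in one file.
const workerSource = `
  const { parentPort, workerData } = require("node:worker_threads");

  // Style 1: act on the data passed in when the worker was created.
  parentPort.postMessage("started with term " + workerData.term);

  // Style 2: act on a message sent later by the parent.
  // Using once() removes the listener afterwards, so the worker can exit.
  parentPort.once("message", (msg) => {
    parentPort.postMessage("got command " + msg.cmd);
  });
`;

// Starts the worker, sends it one command, and collects everything it posts back.
function runWorker() {
  return new Promise((resolve, reject) => {
    const replies = [];
    const worker = new Worker(workerSource, {
      eval: true,
      workerData: { term: "amsterdam" },
    });
    worker.on("message", (reply) => replies.push(reply));
    worker.on("error", reject);
    worker.on("exit", () => resolve(replies));
    worker.postMessage({ cmd: "search" }); // hypothetical command name
  });
}

runWorker().then((replies) => console.log(replies));
```

In both styles the work runs at the top level of the worker file or inside a listener registered there, so nothing depends on a function that is defined but never called.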

Trigger the execution of a function if any condition is met

I'm writing an HTTP API with expressjs in Node.js and here is what I'm trying to achieve:
I have a regular task that I would like to run regularly, approx every minute. This task is implemented with an async function named task.
In reaction to a call in my API I would like to have that task called immediately as well
Two executions of the task function must not be concurrent. Each execution should run to completion before another execution is started.
The code looks like this:
// only a single execution of this function is allowed at a time
// which is not the case with the current code
async function task(reason: string) {
console.log("do thing because %s...", reason);
await sleep(1000);
console.log("done");
}
// call task regularly
setIntervalAsync(async () => {
await task("ticker");
}, 5000) // normally 1min
// call task immediately
app.get("/task", async (req, res) => {
await task("trigger");
res.send("ok");
});
I've put a full working sample project at https://github.com/piec/question.js
If I were in go I would do it like this and it would be easy, but I don't know how to do that with Node.js.
Ideas I have considered or tried:
I could apparently put task in a critical section using a mutex from the async-mutex library. But I'm not too fond of adding mutexes in js code.
Many people seem to be using message queue libraries with worker processes (bee-queue, bullmq, ...) but this adds a dependency to an external service like redis usually. Also if I'm correct the code would be a bit more complex because I need a main entrypoint and an entrypoint for worker processes. Also you can't share objects with the workers as easily as in a "normal" single process situation.
I have tried RxJs subject in order to make a producer consumer channel. But I was not able to limit the execution of task to one at a time (task is async).
Thank you!
You can make your own serialized asynchronous queue and run the tasks through that.
This queue uses a flag to keep track of whether it's in the middle of running an asynchronous operation already. If so, it just adds the task to the queue and will run it when the current operation is done. If not, it runs it now. Adding it to the queue returns a promise so the caller can know when the task finally got to run.
If the tasks are asynchronous, they are required to return a promise that is linked to the asynchronous activity. You can mix in non-asynchronous tasks too and they will also be serialized.
class SerializedAsyncQueue {
  constructor() {
    this.tasks = [];
    this.inProcess = false;
  }
  // adds a promise-returning function and its args to the queue
  // returns a promise that settles with the function's result once it has run
  add(fn, ...args) {
    let d = new Deferred();
    this.tasks.push({ fn, args, deferred: d });
    this.check();
    return d.promise;
  }
  check() {
    if (!this.inProcess && this.tasks.length) {
      // run next task
      this.inProcess = true;
      const nextTask = this.tasks.shift();
      Promise.resolve(nextTask.fn(...nextTask.args)).then(val => {
        this.inProcess = false;
        nextTask.deferred.resolve(val);
        this.check();
      }).catch(err => {
        console.log(err);
        this.inProcess = false;
        nextTask.deferred.reject(err);
        this.check();
      });
    }
  }
}
// a promise whose resolve/reject functions are exposed on the object
const Deferred = function() {
  if (!(this instanceof Deferred)) {
    return new Deferred();
  }
  const p = this.promise = new Promise((resolve, reject) => {
    this.resolve = resolve;
    this.reject = reject;
  });
  this.then = p.then.bind(p);
  this.catch = p.catch.bind(p);
  if (p.finally) {
    this.finally = p.finally.bind(p);
  }
}
let queue = new SerializedAsyncQueue();
// utility function
const sleep = function(t) {
return new Promise(resolve => {
setTimeout(resolve, t);
});
}
// only a single execution of this function is allowed at a time
// so it is run only via the queue that makes sure it is serialized
async function task(reason: string) {
  // runIt must be async because it uses await
  async function runIt() {
    console.log("do thing because %s...", reason);
    await sleep(1000);
    console.log("done");
  }
  return queue.add(runIt);
}
// call task regularly
setIntervalAsync(async () => {
await task("ticker");
}, 5000) // normally 1min
// call task immediately
app.get("/task", async (req, res) => {
await task("trigger");
res.send("ok");
});
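A more compact variant of the same idea, offered only as a sketch, chains each new task onto the promise of the previous one instead of keeping an explicit array and flag. The names createSerializer and enqueue are made up for this illustration, and the 20 ms sleep stands in for the real work.

```javascript
// Returns a function that runs async tasks one at a time, in call order.
function createSerializer() {
  let tail = Promise.resolve(); // settles when the last queued task has finished
  return function enqueue(fn, ...args) {
    const result = tail.then(() => fn(...args));
    // Keep the chain usable even if this task rejects.
    tail = result.catch(() => {});
    return result;
  };
}

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));
const events = [];
const enqueue = createSerializer();

async function task(reason) {
  events.push(`start ${reason}`);
  await sleep(20);
  events.push(`done ${reason}`);
  return reason;
}

// Both calls are made back to back, but the second waits for the first.
const allDone = Promise.all([enqueue(task, "ticker"), enqueue(task, "trigger")]);
allDone.then(() => console.log(events));
```

The caller still gets a promise for its own task's result or error, while a failing task does not block the tasks queued behind it.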
Here's a version using RxJS#Subject that is almost working. How to finish it depends on your use-case.
import { Subject, interval } from "rxjs";
import { concatMap, share, take } from "rxjs/operators";
async function task(reason: string) {
console.log("do thing because %s...", reason);
await sleep(1000);
console.log("done");
}
const run = new Subject<string>();
const effect$ = run.pipe(
// Limit one task at a time
concatMap(task),
share()
);
const effectSub = effect$.subscribe();
interval(5000).subscribe(_ =>
run.next("ticker")
);
// call task immediately
app.get("/task", async (req, res) => {
effect$.pipe(
take(1)
).subscribe(_ =>
res.send("ok")
);
run.next("trigger");
});
The issue here is that res.send("ok") is linked to the effect$ stream's next emission. This may not be the one generated by the run.next you're about to call.
There are many ways to fix this. For example, you can tag each emission with an ID and then wait for the corresponding emission before using res.send("ok").
There are better ways too if calls distinguish themselves naturally.
A Clunky ID Version
Generating an ID randomly is a bad idea, but it gets the general thrust across. You can generate unique IDs however you like. They can be integrated directly into the task somehow or can be kept 100% separate the way they are here (task itself has no knowledge that it's been assigned an ID before being run).
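import { Observable, Subject, from, interval } from "rxjs";
import { concatMap, filter, map, share, take } from "rxjs/operators";
interface IdTask {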
taskId: number,
reason: string
}
interface IdResponse {
taskId: number,
response: any
}
async function task(reason: string) {
console.log("do thing because %s...", reason);
await sleep(1000);
console.log("done");
}
const run = new Subject<IdTask>();
const effect$: Observable<IdResponse> = run.pipe(
// concatMap only allows one observable at a time to run
concatMap((eTask: IdTask) => from(task(eTask.reason)).pipe(
map((response:any) => ({
taskId: eTask.taskId,
response
}) as IdResponse)
)),
share()
);
const effectSub = effect$.subscribe({
next: v => console.log("This is a shared task emission: ", v)
});
interval(5000).subscribe(num =>
run.next({
taskId: num,
reason: "ticker"
})
);
// call task immediately
app.get("/task", async (req, res) => {
const randomId = Math.random();
effect$.pipe(
filter(({taskId}) => taskId === randomId),
take(1)
).subscribe(_ =>
res.send("ok")
);
run.next({
taskId: randomId,
reason: "trigger"
});
});
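Since random IDs are only a placeholder here, one simple alternative is a shared counter. This is a sketch with made-up names (createIdGenerator, nextTaskId). It assumes every producer takes its ID from the same generator, which means the ticker would call nextTaskId() rather than reuse the number emitted by interval.

```javascript
// Returns a function that hands out 1, 2, 3, ... on successive calls.
function createIdGenerator() {
  let lastId = 0;
  return () => ++lastId;
}

// One generator shared by every producer keeps ticker and trigger IDs distinct.
const nextTaskId = createIdGenerator();
const tickerTask = { taskId: nextTaskId(), reason: "ticker" };
const triggerTask = { taskId: nextTaskId(), reason: "trigger" };

console.log(tickerTask, triggerTask);
```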

setTimeout blocks Promise inside a child process

I encountered a weird issue with setTimeout inside a promise in a child process.
These are my files:
index.js:
const {spawnSync} = require('child_process');
const {resolve} = require('path');
const timeoutPromiseModule = resolve(__dirname, '.', 'timeout-promise');
const {stdout} = spawnSync('node', [timeoutPromiseModule]);
console.log(stdout.toString());
timeout-promise.js:
Promise.race([
Promise.resolve(),
new Promise((resolve, reject) => {
setTimeout(() => {reject('too long')}, 10000);
})
])
.then(x=> console.log('resolved'))
.catch(e => console.log('rejected'));
When I run node index.js, I expect the output to be printed immediately, but what actually happens is that the output hangs until setTimeout's callback is called in the child process.
What's causing this and how can this be resolved?
I'm guessing it's something to do with the child process's event loop that prevents the child process from closing until the messages empty?
I uploaded the code to GitHub for your convenience:
https://github.com/alexkubica/promise-race-settimeout-blocking-inside-child-process
The reason for this is that spawnSync will not return until the child process has fully closed as stated in the documentation:
The child_process.spawnSync() method is generally identical to
child_process.spawn() with the exception that the function will not
return until the child process has fully closed. [...]
Note that a Node script only exits when there are no more pending tasks in the event loop's queue, which in this case happens only after the 10-second timer has fired.
You can switch to spawn to see the output of the immediately resolved promise:
const { spawn } = require('child_process');
const res = spawn('node', [timeoutPromiseModule]);
res.stdout.on('data', (data) => {
  console.log(`stdout: ${data}`);
});
res.on('close', (code) => {
  console.log(`child process exited with code ${code}`);
});
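If the child has to stay under spawnSync, another option is to make sure the timer does not outlive the race. The following is a sketch of that idea, not something taken from the answer above: withTimeout is a hypothetical helper that clears the timer as soon as the race settles, so nothing is left pending in the event loop.

```javascript
// Races a promise against a timeout and always clears the timer afterwards,
// so a pending timer never keeps the process alive.
function withTimeout(promise, ms) {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error("too long")), ms);
  });
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}

withTimeout(Promise.resolve(), 10000)
  .then(() => console.log("resolved"))
  .catch(() => console.log("rejected"));
```

Calling unref() on the timer returned by setTimeout is a similar approach: the timer stays scheduled but no longer prevents the process from exiting.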

How to listen terminate signal from worker in nodejs?

I have a main process that spawns one new thread using worker_threads. In some cases the main process has to close this thread regardless of whether it has finished its task, so the main thread uses terminate() to stop it. However, the thread spawns several dependencies that need to be closed before it exits. These dependencies have to be closed from within the thread, so I cannot use worker.on('exit'), since that runs on the main process.
Is there some way of listening for the terminate call from within the worker itself?
Here is a minimal example of what I would like to achieve:
const {Worker, isMainThread} = require('worker_threads');
if (isMainThread) {
const worker = new Worker(__filename);
worker.on('message', console.log);
worker.on('error', console.log);
worker.on('exit', console.log);
setTimeout(() => {
console.log('Worker is gonna be terminated');
worker.terminate();
}, 5000);
} else {
(async () => {
console.log('I am the worker');
// This thread will spawn its own dependencies, so I want to listen here the terminate signal from
// mainThread to close the dependencies of this worker
// Sth like the following will be awesome
// thread.on('exit', () => { /* close dependencies */ })
// Simulate a task which takes a larger time than MainThread wants to wait
await new Promise(resolve => {
setTimeout(resolve, 10000);
});
})();
}
You can signal exit to the worker thread via worker.postMessage(value) / parentPort.on("message", (value) => {...}), and then call process.exit() in the worker thread. Do the cleanup first, of course.
I would recommend using an object as the value; that way you can pass multiple commands or data from the main thread to the worker thread.
const { Worker, isMainThread, parentPort } = require("worker_threads");
if (isMainThread) {
const worker = new Worker(__filename);
worker.on("message", console.log);
worker.on("error", console.log);
worker.on("exit", console.log);
setTimeout(() => {
console.log("Worker is gonna be terminated");
// replace worker.terminate(); with something like
worker.postMessage({ exit: true });
// maybe add another setTimeout with worker.terminate() just in case?
}, 5000);
} else {
(async () => {
// listen for message and do things according to passed value
parentPort.on("message", (value) => {
// worker threads do not have multiple listeners available for passing different event,
// therefore add one onMessage listener, and pass an object with commands/data from main thread
if (value.exit) {
// clean up
console.log("doing cleanup");
process.exit(0);
}
});
// add other logic for receiving messages from main thread
console.log("I am the worker");
// This thread will spawn its own dependencies, so I want to listen here the terminate signal from
// mainThread to close the dependencies of this worker
// Sth like the following will be awesome
// thread.on('exit', () => { /* close dependencies */ })
// Simulate a task which takes a larger time than MainThread wants to wait
await new Promise((resolve) => {
setTimeout(resolve, 10000);
});
})();
}
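The "terminate just in case" fallback mentioned in the comments could look roughly like this. It is a sketch under assumptions: stopWorker and the one-second grace period are invented for the illustration, and the worker source is inlined with eval: true only to keep the example in one file.

```javascript
const { Worker } = require("node:worker_threads");

// Asks the worker to shut down, and terminates it if it has not exited in time.
function stopWorker(worker, graceMs) {
  return new Promise((resolve) => {
    const fallback = setTimeout(() => worker.terminate(), graceMs);
    worker.once("exit", (code) => {
      clearTimeout(fallback); // the worker exited, no forced stop needed
      resolve(code);
    });
    worker.postMessage({ exit: true });
  });
}

// Inline worker: cleans up and exits when told to.
const source = `
  const { parentPort } = require("node:worker_threads");
  parentPort.on("message", (value) => {
    if (value.exit) {
      // close dependencies here
      process.exit(0);
    }
  });
`;

const stopped = stopWorker(new Worker(source, { eval: true }), 1000);
stopped.then((code) => console.log("worker exited with code", code));
```

A worker that cleans up and exits on its own reports exit code 0, whereas one stopped through terminate() reports exit code 1, so the caller can tell which path was taken.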

In a forked child process, process.exit() is executed before the await functions are complete

I call process.exit() at the end of an event handler in a forked child process, after some async functions that I call with await.
process.on('message', async (message) => {
try {
await setup(message);
await run(message);
} catch (err) {
}
process.exit();
});
I expect the async functions to complete before process.exit() is executed, but it looks like process.exit() is called before they are complete. If I remove process.exit(), the child process completes the async functions but then doesn't exit, which is not desirable.
