kafkajs - gracefully stop kafkajs instance after disconnect - node.js

I'm using kafkajs in both my production code and my integration tests.
Before all my tests I create a kafkajs instance with a producer and a consumer (connect/subscribe/run(eachMessage)...).
After all my tests I want to gracefully stop my whole node process, including the kafkajs components.
This is what I'm currently doing:
export function stopHelper(): Promise<void> {
  return new Promise<void>((resolve, reject) => {
    if (kafkaHelperState === kafkaHelperStateStatus.running) {
      kafkaHelperState = kafkaHelperStateStatus.stopping
      log.debug("stopHelper", kafkaHelperState);
      Promise.all([producer.disconnect, consumer.disconnect])
        .then(() => {
          kafkaHelperState = kafkaHelperStateStatus.stopped
          log.info("stopHelper", kafkaHelperState);
          resolve()
        })
        .catch(error => reject(error))
    } else {
      log.warn("stopHelper", "kafkaHelper is not " + kafkaHelperStateStatus.running)
    }
  })
}
The promises seem to resolve: I can see that my integration test suite finishes with both producer and consumer reported as disconnected.
But my node process keeps running without doing anything.
Before this I was using kafka-node; when I stopped the consumer, the node process ended without my having to call process.exit(0).
Is there a way to gracefully destroy the kafkajs instance so the node process can exit?

You never actually call the disconnect functions; you pass the function references to Promise.all, so it resolves immediately and the connections stay open, keeping the node process alive. Call them instead:
Promise.all([producer.disconnect(), consumer.disconnect()])
instead of
Promise.all([producer.disconnect, consumer.disconnect])
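For reference, a minimal sketch of the corrected helper, assuming the same kafkaHelperState enum, logger and producer/consumer instances as in the question:
export function stopHelper(): Promise<void> {
  return new Promise<void>((resolve, reject) => {
    if (kafkaHelperState !== kafkaHelperStateStatus.running) {
      log.warn("stopHelper", "kafkaHelper is not " + kafkaHelperStateStatus.running)
      resolve() // settle the promise so callers are not left hanging
      return
    }
    kafkaHelperState = kafkaHelperStateStatus.stopping
    log.debug("stopHelper", kafkaHelperState)
    // disconnect() must be invoked; the returned promises resolve once the
    // network connections are closed, letting the event loop drain and the process exit
    Promise.all([producer.disconnect(), consumer.disconnect()])
      .then(() => {
        kafkaHelperState = kafkaHelperStateStatus.stopped
        log.info("stopHelper", kafkaHelperState)
        resolve()
      })
      .catch(reject)
  })
}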

Related

worker thread won't respond after first message?

I'm writing a server script, and to make it easier for both hosts and clients to do what they want, I made a customizable server script that runs on nw.js (with a visual interface). The script was originally built with web workers because nw.js had problems supporting worker threads.
Now that NW.js has fixed its worker-thread support, I've been trying to move everything from the web workers into worker threads, but there's a problem: once the main thread receives the answer from the second thread, the latter stops responding to any subsequent message.
For example, running the following code with either NW.js or Node.js itself logs the pong response only once:
const { Worker } = require('worker_threads');
const worker = new Worker('const { parentPort } = require("worker_threads");parentPort.once("message",message => parentPort.postMessage({ pong: message })); ', { eval: true });
worker.on('message', message => console.log(message));
worker.postMessage('ping');
worker.postMessage('ping');
How do I configure the worker so it will keep responding to whatever message it receives after the first one?
That's because you use the EventEmitter.once() method. According to the documentation, it does the following:
Adds a one-time listener function for the event named eventName. The next time eventName is triggered, this listener is removed and then invoked.
If you need your worker to process more than one message, use EventEmitter.on() instead:
const worker = new Worker('const { parentPort } = require("worker_threads");' +
'parentPort.on("message",message => parentPort.postMessage({ pong: message }));',
{ eval: true });
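With the listener registered via on(), the main thread gets a reply for every message it posts. A minimal sketch exercising the same inline worker twice (the worker.terminate() call is only needed if you want the process to exit once you are done):
const { Worker } = require('worker_threads');
// persistent listener: the worker replies to every message it receives
const worker = new Worker(
  'const { parentPort } = require("worker_threads");' +
  'parentPort.on("message", message => parentPort.postMessage({ pong: message }));',
  { eval: true }
);
worker.on('message', message => console.log(message)); // logs { pong: 'ping' } twice
worker.postMessage('ping');
worker.postMessage('ping');
// with parentPort.on() the worker stays alive, so terminate it when it is no longer needed
setTimeout(() => worker.terminate(), 1000);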

Node.js library for executing long running background tasks

I have an architecture with an express.js web server that accepts new tasks over a REST API.
Furthermore, I must have another process that creates and supervises many other tasks on other servers (a distributed system). This process should run in the background for a very long time (months, years).
Now the question is:
1) Should I create one single Node.js app with a task queue such as bull.js/Redis or Celery/Redis that basically launches this long-running task once at the beginning?
Or
2) Should I have two processes: one for the REST API and another daemon process that schedules and manages the tasks in the distributed system?
I heavily lean towards solution 2).
I am facing the same problem now. As we know, Node.js runs in a single thread, but we can create workers to run things in parallel or to handle functions that take some time and that we don't want to block our main server. Fortunately Node.js supports multi-threading.
Take a look at this example:
const {
  Worker, isMainThread, parentPort, workerData
} = require('worker_threads');

if (isMainThread) {
  module.exports = function parseJSAsync(script) {
    return new Promise((resolve, reject) => {
      const worker = new Worker(__filename, {
        workerData: script
      });
      worker.on('message', resolve);
      worker.on('error', reject);
      worker.on('exit', (code) => {
        if (code !== 0)
          reject(new Error(`Worker stopped with exit code ${code}`));
      });
    });
  };
} else {
  const { parse } = require('some-js-parsing-library');
  const script = workerData;
  parentPort.postMessage(parse(script));
}
https://nodejs.org/api/worker_threads.html
Search for some articles about multi-threading in Node.js, but remember one thing: state cannot be shared between threads. You can use a message broker like Kafka, RabbitMQ (my recommendation) or Redis for that kind of need.
Kafka is quite difficult to configure in production.
RabbitMQ is good because you can also persist messages, queues and so on to local storage. But personally I could not find any proper solution for load balancing these threads. Maybe this is not your answer, but I hope you get some clue here.
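To make the two-process split from option 2) concrete with one of the queues mentioned in option 1), here is a minimal sketch, assuming bull and a local Redis instance: the express API process only enqueues jobs, while a separate long-running daemon process consumes them.
// api.js - the REST process: accepts tasks and hands them off to the queue
const express = require('express');
const Queue = require('bull');
const taskQueue = new Queue('long-running-tasks', 'redis://127.0.0.1:6379');
const app = express();
app.use(express.json());
app.post('/tasks', async (req, res) => {
  const job = await taskQueue.add(req.body); // enqueue, do not run it here
  res.status(202).json({ jobId: job.id });   // respond immediately
});
app.listen(3000);

// worker.js - the daemon process, started separately and running for months
const Queue = require('bull');
const taskQueue = new Queue('long-running-tasks', 'redis://127.0.0.1:6379');
taskQueue.process(async (job) => {
  // schedule / supervise the task on the remote servers here
  console.log('working on', job.id, job.data);
});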

Multiple AWS Instances and Node events

I have an implementation in Node where an API, when called, does some processing and waits for an event from another function before returning the response. This works fine when run locally and when running on a single instance in AWS, but when multiple instances are involved there are issues, which I'm assuming is because the API is being called on one instance while the event is emitted on another. Is there any way to keep the listeners and emitters the same across all instances?
Update:
After some research I found that using an application load balancer with some routing logic can help with this issue. I am marking the answer below as correct because, while it did not help me with AWS autoscaling, it did help me find an alternate solution to my problem.
AFAIU you expect an event emitted in one process to be handled in a different process, but that will never be the case as far as I know, because each process has its own memory and events are associated with that process only.
I have added sample code that demonstrates what I mean. Maybe if you post the code you are referring to, we could check what went wrong.
const cluster = require("cluster");
const EventEmitter = require("events");

if (cluster.isMaster) {
  cluster.fork();
  const myEE = new EventEmitter();
  myEE.on("foo", arg =>
    console.log("emitted from ", arg, "received in master")
  );
  setTimeout(() => {
    myEE.emit("foo", "master");
  }, 1000);
} else {
  const myEE = new EventEmitter();
  myEE.on("foo", arg => console.log("emitted from", arg, "received in worker"));
  setTimeout(() => {
    myEE.emit("foo", "client");
  }, 2000);
}
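If the two sides really do need to react to each other's events, they have to exchange explicit messages. Within a single machine, cluster's built-in IPC channel can carry them, as in this minimal sketch; across separate EC2 instances you would need an external broker or pub/sub service instead.
const cluster = require("cluster");

if (cluster.isMaster) {
  const worker = cluster.fork();
  // the master receives messages sent by the worker over the IPC channel
  worker.on("message", msg => console.log("master received:", msg));
  // and can push messages to the worker the same way
  setTimeout(() => worker.send({ event: "foo", from: "master" }), 1000);
} else {
  // the worker listens on process and replies through process.send()
  process.on("message", msg => {
    console.log("worker received:", msg);
    process.send({ event: "ack", from: "worker" });
  });
}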

NodeJS and RabbitMQ, how to be sure my message is processed

I am building a kind of microservice application and using RabbitMQ to communicate between my services.
I have a Node.js app that is supposed to receive messages from RabbitMQ and execute commands when a particular message comes in. Here is what the following code does:
Connects to RabbitMQ
Listens to symfony_messages queue
If a message identified by product.created comes in, the script executes a particular command using spawn from child_process.
My question is: sometimes I am going to restart my script. How can I be sure that at the moment of restarting, the script is not in the middle of processing an event? How can I be sure that the process is not going to consume a message and then stop before spawning the child process?
The possible solution that came to my mind is:
Send a signal to the Node.js process telling it to "process one last message and stop". But how can I send such a signal?
And here is the code (you do not need to read it if you already get the question):
const amqp = require('amqplib/callback_api')
const path = require('path')
const { spawn } = require('child_process')

amqp.connect('amqp://guest:guest@127.0.0.1:5672', (err, conn) => {
  if (err) {
    console.log(err)
    return
  }
  conn.createChannel((err, channel) => {
    let q = 'symfony_messages'
    channel.assertQueue(q, {
      durable: false
    })
    console.log(" [*] Waiting for messages in %s. To exit press CTRL+C", q);
    channel.consume(q, (msg) => {
      let event = JSON.parse(msg.content.toString())
      if (event.name === 'product.created') {
        console.log('Indexing order...')
        let cp = spawn('php', [path.join(__dirname, '..', '..', 'bin', 'console'), 'elastic:index:orders', event.payload.product_id])
        cp.stdout.on('data', (data) => {
          console.log(`stdout: ${data}`);
        })
        cp.stderr.on('data', (data) => {
          console.log(`stderr: ${data}`);
        })
        cp.on('close', (code) => {
          console.log(`child process exited with code ${code}`);
        })
      }
    }, { noAck: true });
  })
})
Wouldn't it be a good pattern to use the channel.ack(message) function once a message has been processed successfully? You've set the noAck option to true, but you can use the ACK mechanism to ensure messages are only taken off the queue once they have been successfully processed.
Likewise, you can use the nack function to deliberately tell RabbitMQ that a message was not processed; I normally do this in the process function's error handler (or promise catch).
I use a similar mechanism in a service that writes messages to a database: I only ACK a message once it has been written to the db. It's also useful to set up a dead-letter exchange/queue within RabbitMQ so that any message that is nacked ends up there. You can then inspect these messages and see why they couldn't be processed (or automatically re-process them once the error condition that caused the problem is resolved).
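Adapted to the consumer from the question (reusing its conn, spawn and path), that could roughly look like the following sketch: noAck is set to false, the message is only acked once the spawned child process exits cleanly, and prefetch(1) keeps at most one unacknowledged message in flight while you restart the script.
conn.createChannel((err, channel) => {
  const q = 'symfony_messages'
  channel.assertQueue(q, { durable: false })
  channel.prefetch(1) // only one unacknowledged message at a time
  channel.consume(q, (msg) => {
    const event = JSON.parse(msg.content.toString())
    if (event.name !== 'product.created') {
      channel.ack(msg) // nothing to do, but take it off the queue
      return
    }
    const cp = spawn('php', [path.join(__dirname, '..', '..', 'bin', 'console'),
      'elastic:index:orders', event.payload.product_id])
    cp.on('close', (code) => {
      if (code === 0) {
        channel.ack(msg)                // processed successfully
      } else {
        channel.nack(msg, false, false) // reject; lands in a dead-letter queue if one is configured
      }
    })
  }, { noAck: false })
})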

Shutting down nodejs microservice gracefully when using Docker Swarm

I have multiple microservices written in Node.js Koa running in Docker Swarm.
Since container orchestration tools like Kubernetes or Swarm can scale services up and down instantly, I have a question about gracefully shutting down a Node.js service so that no running work is left unfinished.
Below is the flow I can think of:
Send a SIGINT signal to each worker process. Does Docker Swarm send SIGINT to the workers when scaling down a service?
The workers are responsible for catching the signal, cleaning up or freeing any used resources, and finishing their work. How can I stop new API requests and wait for any running work to finish before shutting down?
Some code below for reference:
process.on('SIGINT', () => {
  const cleanUp = () => {
    // How can I clean resources like DB connections using Sequelize?
  }
  server.close(() => {
    cleanUp()
    process.exit()
  })
  // Force close server after 5 secs. Should I do this?
  setTimeout((e) => {
    cleanUp()
    process.exit(1)
  }, 5000)
})
I created a library (https://github.com/sebhildebrandt/http-graceful-shutdown) that can handle graceful shutdowns as you described. It works well with Express and Koa.
This package also allows you to pass a function (which should return a promise) to additionally clean up things like DB connections. Here is some example code showing how to use it:
const gracefulShutdown = require('http-graceful-shutdown');
...
server = app.listen(...);
...

// your personal cleanup function - this one takes one second to complete
function cleanup() {
  return new Promise((resolve) => {
    console.log('... in cleanup')
    setTimeout(function() {
      console.log('... cleanup finished');
      resolve();
    }, 1000)
  });
}

// this enables the graceful shutdown with advanced options
gracefulShutdown(server,
  {
    signals: 'SIGINT SIGTERM',
    timeout: 30000,
    development: false,
    onShutdown: cleanup,
    finally: function() {
      console.log('Server gracefully shut down.....')
    }
  }
);
I personally would increase the final timeout from 5 secs to a higher value (10-30 secs). Hope that helps.
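For the Sequelize part of the question, the onShutdown/cleanup function is a natural place to drain the connection pool. A minimal sketch, assuming the Sequelize instance is exported from a hypothetical ./db module:
const gracefulShutdown = require('http-graceful-shutdown');
const { sequelize } = require('./db'); // hypothetical module exporting the Sequelize instance

async function cleanup() {
  // sequelize.close() returns a promise that resolves once the pool is drained
  await sequelize.close();
  console.log('DB connections closed');
}

gracefulShutdown(server, {
  signals: 'SIGINT SIGTERM',
  onShutdown: cleanup, // awaited before the process exits
});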
