How to work around amqplib's Channel#consume odd signature? - node.js

I am writing a worker that uses amqplib's Channel#consume method. I want this worker to wait for jobs and process them as soon as they appear in the queue.
I wrote my own module to abstract away ampqlib, here are the relevant functions for getting a connection, setting up the queue and consuming a message:
const getConnection = function(host) {
return amqp.connect(host);
};
const createChannel = function(conn) {
connection = conn;
return conn.createConfirmChannel();
};
const assertQueue = function(channel, queue) {
return channel.assertQueue(queue);
};
const consume = Promise.method(function(channel, queue, processor) {
processor = processor || function(msg) { if (msg) Promise.resolve(msg); };
return channel.consume(queue, processor)
});
const setupQueue = Promise.method(function setupQueue(queue) {
const amqp_host = 'amqp://' + ((host || process.env.AMQP_HOST) || 'localhost');
return getConnection(amqp_host)
.then(conn => createChannel(conn)) // -> returns a `Channel` object
.tap(channel => assertQueue(channel, queue));
});
consumeJob: Promise.method(function consumeJob(queue) {
return setupQueue(queue)
.then(channel => consume(channel, queue))
});
My problem is with Channel#consume's odd signature. From http://www.squaremobius.net/amqp.node/channel_api.html#channel_consume:
#consume(queue, function(msg) {...}, [options, [function(err, ok) {...}]])
The callback is not where the magic happens, the message's processing should actually go in the second argument and that breaks the flow of promises.
This is how I planned on using it:
return queueManager.consumeJob(queue)
.then(msg => {
// do some processing
});
But it doesn't work. If there are no messages in the queue, the promise is rejected and then if a message is dropped in the queue nothing happens. If there is a message, only one message is processed and then the worker stalls because it exited the "processor" function from the Channel#consume call.
How should I go about it? I want to keep the queueManager abstraction so my code is easier to reason about but I don't know how to do it... Any pointers?

As #idbehold said, Promises can only be resolved once. If you want to process messages as they come in, there is no other way than to use this function. Channel#get will only check the queue once and then return; it wouldn't work for a scenario where you need a worker.

just as an option. You can present your application as a stream of some messages(or events). There is a library for this http://highlandjs.org/#examples
Your code should look like this(it isn`t a finished sample, but I hope it illustrates the idea):
let messageStream = _((push, next) => {
consume(queue, (msg) => {
push(null, msg)
})
)
// now you can operate with your stream in functional style
message.map((msg) => msg + 'some value').each((msg) => // do something with msg)
This approach provides you a lot of primitives for synchronization and transformation
http://highlandjs.org/#examples

Related

Socket.io async/await for .on()

I'm building a socket.io Node JS application and my socket.io server will be listening for data from many socket.io clients, I need to save data to an API via my socket.io server as quickly as possible and figure that async/await is the best way forward.
Right now, I've got a function inside my .on('connection'), but is there a way I can make this an async function rather than have a nested function inside?
io.use((socket, next) => {
if (!socket.handshake.query || !socket.handshake.query.token) {
console.log('Authentication Error 1')
return
}
jwt.verify(socket.handshake.query.token, process.env.AGENT_SIGNING_SECRET, (err, decoded) => {
if (err) {
console.log('Authentication Error 2')
return
}
socket.decoded = decoded
next()
})
}).on('connection', socket => {
socket.on('agent-performance', data => {
async function savePerformance () {
const saved = await db.saveToDb('http://127.0.0.1:8000/api/profiler/save', data)
console.log(saved)
}
savePerformance()
})
})
Sort of, but you'll probably want to keep your current code if there can be multiple agent-performance events. You can modify the following, but it'd be messy and less readable. Event emitters still exist for a reason, they're not made obsolete by the introduction of promises. If it's performance you're after, your current code is probably faster and more resistant to backpressure and easier to error-handle.
events.on is a utility function that takes an event emitter (like socket) and returns an iterator that yields promises. You can await those with for await of.
events.once is a utility function that takes an event emitter (like socket) and returns a promise that resolves when the specified event is executed.
const { on, once } = require('events');
(async function() {
// This is an iterator that can emit infinite number of times.
const iterator = on(io, 'connection');
// Yield a promise, await it, run what is between `{ }` and repeat.
for await (const socket of iterator) {
const data = await once(socket, 'agent-performance');
const saved = await db.saveToDb(/* etc */);
}
})();
As the names imply, on is similar to socket.on and once is similar to socket.once. In the above example:
connected user 1, first agent-performance event: OK
connected user 1, second agent-performance event: not handled, there's no more event handler, since once is "used up".
connected user 2, first agent-performance event: OK
The documentation for on has a note about concurrency when using for await (x of on(...)), but I don't know if that would be a problem in your usecase.
// The execution of this inner block is synchronous and it
// processes one event at a time (even with await). Do not use
// if concurrent execution is required.

nodejs worker_threads How to understand that all workers have completed work

Began to study worker_threads in nodejs
After completion of work, the last worker must finish the script
process.exit();
my index.js
const { Worker } = require("worker_threads");
const logUpdate = require("log-update");
const threads = 100;
const someData = 'some data';
let names = [...Array(threads)].fill(0);
for (let i = 0; i < threads; i++) {
const port = new Worker(require.resolve("./worker.js"), {
workerData: { someData, i }
});
port.on("message", (data) => handleMessage(data, i));
port.on("error", (e) => console.log(e));
port.on("exit", (code) => console.log(`Exit code: ${code}`));
}
function handleMessage(_, index) {
names[index]++;
logUpdate(names.map((status, i) => `Thread ${i}: ${status} ${_}`).join("\n"));
}
worker.js
const { parentPort, workerData } = require("worker_threads" )
const { someData, i } = workerData;
(async () => {
parentPort.postMessage( `worker start ${i} some data ${someData}`);
process.exit();
})();
Now workers are created and work out, but after they are completed, the script does not complete its work
From whatever knowledge I have, I understand that if there is no error caused in the worker until its exit, then the return code is 0, otherwise 1 (automatically). Look at process exit codes.
So when you do process.exit() at that time you can pass the code as an argument. (or not have process.exit() at all).
Generally what I have seen, and do is:
Wrap the worker creation in a Promise returning function.
NOTE:
worker.terminate() : terminates the worker thread.
(Personal choice) I choose to do parentPort.postMessage(JSON.stringify(true)) if I complete my task
in my worker thread. Hence worker.on('message', message => {if (message === true) {worker.terminate();resolve(true);} else {console.log(message)}}); in the main thread.
Whenever I have to create threads for repetitive task-delegations. I use .map function on the array. the function inside returns the same return value as that from my worker creator (Promised returning) function. Hence eventually I map the array into array of promises.
Then I put this array of promises into Promise.all().
Then I check for returned values of Promise.all()
Checkout:
My Gist (as an example for Promise.all()).
Medium (Just for worker_thread basics. An Awesome article)

How to run asynchronous tasks synchronous?

I'm developing an app with the following node.js stack: Express/Socket.IO + React. In React I have DataTables, wherein you can search and with every keystroke the data gets dynamically updated! :)
I use Socket.IO for data-fetching, so on every keystroke the client socket emits some parameters and the server calls then the callback to return data. This works like a charm, but it is not garanteed that the returned data comes back in the same order as the client sent it.
To simulate: So when I type in 'a', the server responds with this same 'a' and so for every character.
I found the async module for node.js and tried to use the queue to return tasks in the same order it received it. For simplicity I delayed the second incoming task with setTimeout to simulate a slow performing database-query:
Declaration:
const async = require('async');
var queue = async.queue(function(task, callback) {
if(task.count == 1) {
setTimeout(function() {
callback();
}, 3000);
} else {
callback();
}
}, 10);
Usage:
socket.on('result', function(data, fn) {
var filter = data.filter;
if(filter.length === 1) { // TEST SYNCHRONOUSLY
queue.push({name: filter, count: 1}, function(err) {
fn(filter);
// console.log('finished processing slow');
});
} else {
// add some items to the queue
queue.push({name: filter, count: filter.length}, function(err) {
fn(data.filter);
// console.log('finished processing fast');
});
}
});
But the way I receive it in the client console, when I search for abc is as follows:
ab -> abc -> a(after 3 sec)
I want it to return it like this: a(after 3sec) -> ab -> abc
My thought is that the queue runs the setTimeout and then goes further and eventually the setTimeout gets fired somewhere on the event loop later on. This resulting in returning later search filters earlier then the slow performing one.
How can i solve this problem?
First a few comments, which might help clear up your understanding of async calls:
Using "timeout" to try and align async calls is a bad idea, that is not the idea about async calls. You will never know how long an async call will take, so you can never set the appropriate timeout.
I believe you are misunderstanding the usage of queue from async library you described. The documentation for the queue can be found here.
Copy pasting the documentation in here, in-case things are changed or down:
Creates a queue object with the specified concurrency. Tasks added to the queue are processed in parallel (up to the concurrency limit). If all workers are in progress, the task is queued until one becomes available. Once a worker completes a task, that task's callback is called.
The above means that the queue can simply be used to priorities the async task a given worker can perform. The different async tasks can still be finished at different times.
Potential solutions
There are a few solutions to your problem, depending on your requirements.
You can only send one async call at a time and wait for the first one to finish before sending the next one
You store the results and only display the results to the user when all calls have finished
You disregard all calls except for the latest async call
In your case I would pick solution 3 as your are searching for something. Why would you use care about the results for "a" if they are already searching for "abc" before they get the response for "a"?
This can be done by giving each request a timestamp and then sort based on the timestamp taking the latest.
SOLUTION:
Server:
exports = module.exports = function(io){
io.sockets.on('connection', function (socket) {
socket.on('result', function(data, fn) {
var filter = data.filter;
var counter = data.counter;
if(filter.length === 1 || filter.length === 5) { // TEST SYNCHRONOUSLY
setTimeout(function() {
fn({ filter: filter, counter: counter}); // return to client
}, 3000);
} else {
fn({ filter: filter, counter: counter}); // return to client
}
});
});
}
Client:
export class FilterableDataTable extends Component {
constructor(props) {
super();
this.state = {
endpoint: "http://localhost:3001",
filters: {},
counter: 0
};
this.onLazyLoad = this.onLazyLoad.bind(this);
}
onLazyLoad(event) {
var offset = event.first;
if(offset === null) {
offset = 0;
}
var filter = ''; // filter is the search character
if(event.filters.result2 != undefined) {
filter = event.filters.result2.value;
}
var returnedData = null;
this.state.counter++;
this.socket.emit('result', {
offset: offset,
limit: 20,
filter: filter,
counter: this.state.counter
}, function(data) {
returnedData = data;
console.log(returnedData);
if(returnedData.counter === this.state.counter) {
console.log('DATA: ' + JSON.stringify(returnedData));
}
}
This however does send unneeded data to the client, which in return ignores it. Somebody any idea's for further optimizing this kind of communication? For example a method to keep old data at the server and only send the latest?

nodejs concurrency synchronous execution

I've two WebSockets getting data asynchronously, every time I get some message from the sockets I execute some code in CompareData.
The problem is that CompareData should be executed synchronously, or (better) only if it is not already running
This is my code:
function CompareData(data) {
console.log('data ', data);
AsyncFunction();
};
ws1 = new WebSocket(WS1_URL);
ws2 = new WebSocket(WS2_URL);
ws1.on('message', (data) => {
CompareData(data);
});
ws2.on('message', (data) => {
CompareData(data);
});
Can you help me, please? I'm very new to NodeJs
Node.js is single threaded. So you don't really get true concurrency issues occurring in Node programs as you might in other languages. In your example, there can only be at most one WebSocket callback for CompareData occurring at any given time.
You should not make synchronous call in node.js but you can make those call sequential. See below example might be helpful.
var messages = [];
var inProgress = false;
function CompareData(data) {
return new Promise((resolve, reject) => {
// do some stuff and resolve
setTimeout(() => {
resolve(data);
}, 1000);
});
};
const start = async () => {
if (!inProgress) {
if (messages.length !== 0) {
inProgress = true;
try {
const data = await CompareData(messages.shift());
console.log(data);
} catch (error) {
console.log(error);
}
inProgress = false;
await start();
}else{
console.log('Process Done');
}
}
}
const handler = (data) => {
messages.push(data);
start();
}
handler(1);
handler(2);
handler(3);
handler(4);
// ws1 = new WebSocket(WS1_URL);
// ws2 = new WebSocket(WS2_URL);
// ws1.on('message', handler);
// ws2.on('message', handler);
You should use some mutex in order to avoid that two async operations of compareData are executed at the same time, like node-mutex or mutexify.
My suggestions are:
First of all, you need to know when CompareData is finished. Reorganize your code to use promises or callbacks. If you're using third-party async functions, I'm almost sure they provide some feedback on completion - This is a must have in async world
Add inProgress = false flag somewhere to serve for you as simple lock. As someone posted, JS is single-threaded and you're guaranteed that your code won't get interrupted in the middle of operation. Thanks to that you can use really simple locks instead of complicated os-based mutexes known from multithreaded
langs.
In ws.on(...) check if inProgress is set. If not, lock it and run CompareData
In CompareData completion callback or on promise resolution set inProgress back to false, so you're no longer ignoring incoming data.
If you can simply discard the data, there is no need to complicate this scenario with extra queues, mutexes, etc.
If you need to serve it all, then queue incoming data and serve next piece after completion callback is fired.
This is basically what Rahul's suggests, but he uses features that are not established in current version of standard, so don't use it if you're not transpiling your code.

How to consume just one message from rabbit mq on nodejs

Im using amqp.node library to integrate rabbitmq into my system.
But in consumer i want to process just one message at the time, then acknowledge the message then consume the next message from the queue.
The current code is:
// Consumer
open.then(function(conn) {
var ok = conn.createChannel();
ok = ok.then(function(ch) {
ch.assertQueue(q);
ch.consume(q, function(msg) {
if (msg !== null) {
othermodule.processMessage(msg, function(error, response){
console.log(msg.content.toString());
ch.ack(msg);
});
}
});
});
return ok;
}).then(null, console.warn);
The ch.consume will process all messages in the channel at one time and the function of the module call it here othermodule will not be executed in the same time line.
I want to wait for the othermodule function to finish before consume the next message in the queue.
At this moment (2018), I think RabbitMQ team has an option to do that:
https://www.rabbitmq.com/tutorials/tutorial-two-javascript.html
ch.prefetch(1);
In order to defeat that we can use the prefetch method with the value
of 1. This tells RabbitMQ not to give more than one message to a
worker at a time. Or, in other words, don't dispatch a new message to
a worker until it has processed and acknowledged the previous one.
Instead, it will dispatch it to the next worker that is not still
busy.
Follow up the example here :
https://www.npmjs.com/package/amqplib
// Consumer
function consumer(conn) {
var ok = conn.createChannel(on_open);
function on_open(err, ch) {
if (err != null) bail(err);
ch.assertQueue(q);
// IMPORTANT
ch.prefetch(1);
ch.consume(q, function(msg) {
if (msg !== null) {
console.log(msg.content.toString());
ch.ack(msg);
}
});
}
}
Refs: http://www.squaremobius.net/amqp.node/channel_api.html#channel_prefetch
You need to set a prefetch value as shown in this example:
https://github.com/squaremo/amqp.node/blob/master/examples/tutorials/rpc_server.js#L22
When you create the model you need to set the QOS on it. Here is how we would do it in C#:
var _model = rabbitConnection.CreateModel();
// Configure the Quality of service for the model. Below is how what each setting means.
// BasicQos(0="Dont send me a new message untill I’ve finshed", _fetchSize = "Send me N messages at a time", false ="Apply to this Model only")
_model.BasicQos(0, _fetchSize, false);
var consumerTag = _model.BasicConsume(rabbitQueue.QueueName, false, _consumerName, queueingConsumer);
You have to set QoS = 1.
ch = ...
ch.qos(1);
ch.consume(q, msg => { ... });
(javascript)

Resources