NodeJS How To Confirm MongoDB Driver Is Not Blocking - node.js

I've asked some general questions around this topic before (node and blocking). This time the question is a little more specific.
Let's say I've got a node/express app which has a handler that accepts HTTP requests (doesn't matter, say they're simple GETs).
And it has a separate handler which reads messages off of a RabbitMQ queue as they arrive, and then does a read from Mongo (Mongo is on a different machine), followed by a write.
If Mongo were "very" busy, would/could that cause the HTTP handler to appear unavailable?
I'm using the Mongo native driver. I would think that while the Mongo driver waits for a response from the server, Node would happily keep accepting and handling HTTP requests, but I don't know for sure.
In a related scenario, swap out the busy Mongo for a handler that reads a Rabbit message and PUTs a record into a "very" busy ElasticSearch. Will that cause issues with the HTTP handler?
I'd go straight to testing it, but that's a little tricky, and it gets expensive to test every time I'm unsure of the theory. So I thought I'd ask.
Here's a (simplified) example of the code:
// HTTP handler...
app.post('/eventcapture/event', (req: express.Request, res: express.Response) => {
    var evt: eventDS.IEvent = ('TypeID' in req.body) ? req.body : JSON.parse(req.body);
    // create an id
    evt._id = uuid.v4();
    bus.Publish(evt)
        .then((success) => {
            res.jsonp(200, { success: true });
        })
        .catch((failReason: Error) => {
            console.error('[ERROR] - Failure writing event: %s,%s', failReason.name, failReason.message);
            logError(failReason, evt);
            res.jsonp(500, { success: false, reason: failReason });
        });
});
// We generically define additional handlers in an array, and then kick them off with a loop.
// Here we have one handler which reads an event, goes to mongo to get additional data which
// it adds into the event before publishing it back out. And a second handler which will catch
// these "augmented" events and push them into Mongo.
var processes = [
    {
        enabled: true,
        name: 'augmenter',
        inType: 'EventCapture:RawEvent',
        handler: (event: eventDS.IEvent) => {
            console.log('[LOG] - augment event: %s', event._id);
            Profile.FindOne({ _id: event.User.ProfileID })
                .then((profile) => {
                    if (profile) {
                        console.log('[LOG] - found Profile: %s', profile._id);
                        event.User.Email = profile.PersonalDetail.Email;
                        // other values also...
                        // change the TypeID for publishing
                        event.TypeID = 'EventCapture:AugmentedEvent';
                        return event;
                    }
                    else throw new Error(util.format('unable to find profile: %s', event.User.ProfileID));
                })
                .then((augmentedEvent) => bus.Publish(augmentedEvent)) // publish the event back out
                .catch((failReason: Error) => {
                    console.error('[ERROR] - failure publishing augmented event: %s, %s, %s', event._id, failReason.name, failReason.message);
                    logError(failReason, event);
                });
        }
    },
    {
        enabled: true,
        name: 'mongo',
        inType: 'EventCapture:AugmentedEvent',
        handler: (event: eventDS.IEvent) => {
            console.log('[LOG] - push to mongo: %s', event.User.ProfileID);
            Event.Save(event, { safe: true })
                .then((success) => console.log('[LOG] - pushed to mongo: %s', event._id))
                .catch((failReason: Error) => {
                    console.error('[ERROR] - failure pushing to mongo: %s, %s', event._id, failReason);
                    logError(failReason, event);
                });
        }
    }
];

processes.forEach((process, idx, allProcesses) => {
    if (process.enabled) {
        bus.Subscribe(process.name, process.inType, process.handler);
    }
});

No. This is the awesomeness of async programming. Node can do other things while it waits for MongoDB to get back to it. You can assume that popular Node modules like mongodb write things in an async fashion.
Here's a video that goes into a lot of detail about the event loop: http://vimeo.com/96425312
At the end of the day, things like the mongo driver are written using Node's low-level I/O and networking libraries. These libraries enforce async flow. The author of a package would have to go out of her way to make it sync.
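If you want to convince yourself empirically without standing up a full load test, a cheap trick is to keep a timer ticking while a query is in flight. This is a minimal sketch assuming a local mongod and the official driver; the database and collection names are made up:
const { MongoClient } = require('mongodb');

// If this keeps printing while the query is in flight, the event loop is
// free, and your HTTP handlers would still be getting serviced.
const ticker = setInterval(() => console.log('tick: event loop is free'), 50);

MongoClient.connect('mongodb://localhost:27017')
    .then((client) =>
        client.db('test').collection('events').find({}).toArray()
            .then((docs) => {
                console.log('query returned %d docs', docs.length);
                return client.close();
            }))
    .then(() => clearInterval(ticker))
    .catch((err) => { clearInterval(ticker); console.error(err); });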

Related

Node JS App crashes with ERR_SOCKET_CANNOT_SEND error

I have a Node.js service that consumes messages from Kafka and processes them through various steps of transformation logic. During processing, the service uses Redis and Mongo for storage and caching. In the end, it sends the transformed message to another destination via UDP packets.
On startup, it starts consuming messages from Kafka; after a while, it crashes with the unhandled error ERR_SOCKET_CANNOT_SEND (unable to send data).
Restarting the application resolves the issue temporarily.
I initially thought it might have to do with the forwarding through UDP sockets, but the forwarding destinations are reachable from the consumer!
I'd appreciate any help here. I'm kinda stuck.
Consumer code:
const readFromKafka = ({host, topic, source}, transformationService) => {
    const logger = createChildLogger(`kafka-consumer-${topic}`);
    const options = {
        // connect directly to kafka broker (instantiates a KafkaClient)
        kafkaHost: host,
        groupId: `${topic}-group`,
        protocol: ['roundrobin'], // and so on the other kafka config.
    };
    logger.info(`starting kafka consumer on ${host} for ${topic}`);
    const consumer = new ConsumerGroup(options, [topic]);
    consumer.on('error', (err) => logger.error(err));
    consumer.on('message', async ({value, offset}) => {
        logger.info(`received ${topic}`, value);
        if (value) {
            const final = await transformationService([
                JSON.parse(Buffer.from(value, 'binary').toString()),
            ]);
            logger.info('Message received', {instanceID: final[0].instanceId, trace: final[1]});
        } else {
            logger.error(`invalid message: ${topic} ${value}`);
        }
        return;
    });
    consumer.on('rebalanced', () => {
        logger.info('consumer is rebalancing');
    });
    return consumer;
};
Consumer Service startup and error handling code:
// init is the async function used to initialise the cache and other config and components.
const init = async () => {
    // initialize cache, configs.
};

// startConsumer is the async function that connects to Kafka,
// and adds a callback for the onMessage listener which processes the message through the transformation service.
const startConsumer = async ({ ...config }) => {
    // calls to fetch info like topic, transformationService etc.
    // readFromKafka function defn pasted above
    readFromKafka({ topicConfig }, transformationService);
};

init()
    .then(startConsumer)
    .catch((err) => {
        logger.error(err);
    });
Forwarding code through UDP sockets:
The following code throws the unhandled error intermittently; it works for the first few thousand messages and then suddenly crashes.
const udpSender = (msg, destinations) => {
    return Object.values(destinations)
        .map(({id, host, port}) => {
            return new Promise((resolve) => {
                dgram.createSocket('udp4').send(msg, 0, msg.length, port, host, (err) => {
                    resolve({
                        id,
                        timestamp: Date.now(),
                        logs: err || 'Sent successfully',
                    });
                });
            });
        });
};
Based on our comment exchange, I believe the issue is just that you're running out of resources.
Throughout the lifetime of your app, every time you send a message you open up a brand new socket. However, you're not doing any cleanup after sending that message, and so that socket stays open indefinitely. Your open sockets then continue to pile up, consuming resources, until you eventually run out of... something. Perhaps memory, perhaps ports, perhaps something else, but ultimately your app crashes.
Luckily, the solution isn't too convoluted: just reuse existing sockets. In fact, you can just reuse one socket for the entirety of the application if you wanted, as internally socket.send handles queueing for you, so no need to do any smart hand-offs. However, if you wanted a little more concurrency, here's a quick implementation of a round-robin queue where we've created a pool of 10 sockets in advance which we just grab from whenever we want to send a message:
const MAX_CONCURRENT_SOCKETS = 10;
var rrIndex = 0;
const rrSocketPool = (() => {
    var arr = [];
    for (let i = 0; i < MAX_CONCURRENT_SOCKETS; i++) {
        let sock = dgram.createSocket('udp4');
        arr.push(sock);
    }
    return arr;
})();
const udpSender = (msg, destinations) => {
    return Object.values(destinations)
        .map(({ id, host, port }) => {
            return new Promise((resolve) => {
                var sock = rrSocketPool[rrIndex];
                rrIndex = (rrIndex + 1) % MAX_CONCURRENT_SOCKETS;
                sock.send(msg, 0, msg.length, port, host, (err) => {
                    resolve({
                        id,
                        timestamp: Date.now(),
                        logs: err || 'Sent successfully',
                    });
                });
            });
        });
};
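For completeness, here is a sketch of the even simpler single-socket variant mentioned above; it assumes the same msg and destinations shapes as the original sender:
// One shared socket for the whole app; dgram queues sends internally,
// so concurrent calls are safe without any pooling.
const sharedSock = dgram.createSocket('udp4');

const udpSenderSingle = (msg, destinations) => {
    return Object.values(destinations)
        .map(({ id, host, port }) => {
            return new Promise((resolve) => {
                sharedSock.send(msg, 0, msg.length, port, host, (err) => {
                    resolve({ id, timestamp: Date.now(), logs: err || 'Sent successfully' });
                });
            });
        });
};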
Be aware that the pool implementation is still naïve for a few reasons, mostly because there's still no error handling on the sockets themselves, only on their .send method. You should look at the docs for more info about catching events such as error events, especially if this is a production server that's supposed to run indefinitely.
Basically, the error handling you've put inside your .send callback will only fire if an error occurs in a call to .send. If, between sending messages, while your sockets are idle, some system-level error outside of your control breaks a socket, that socket may emit an error event, which will go unhandled (like what's happening in your current implementation, with the intermittent errors you see prior to the fatal one). At that point the socket may be permanently unusable, meaning it should be replaced/reinstated or otherwise dealt with, as sketched below (or alternatively, just force the app to restart and call it a day, like I do :-) ).
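A hedged sketch of that replace-on-error idea, reusing the pool names from above (the replacement policy here is just one reasonable option):
// Build each pooled socket with an 'error' listener that swaps in a
// fresh socket when the old one breaks.
function createPooledSocket(pool, i) {
    const sock = dgram.createSocket('udp4');
    sock.on('error', (err) => {
        console.error('socket %d errored, replacing it: %s', i, err.message);
        try { sock.close(); } catch (e) { /* may already be closed */ }
        pool[i] = createPooledSocket(pool, i); // reinstate a fresh socket
    });
    return sock;
}

const rrSocketPool = [];
for (let i = 0; i < MAX_CONCURRENT_SOCKETS; i++) {
    rrSocketPool.push(createPooledSocket(rrSocketPool, i));
}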

Subscription to MQTT broker and get the data being passed

I am generating data with a Node.js simulator and passing it to an HTTP route /simulator/data.
In the application I listen to the broker with MQTT via the mqtthandler.js file, which I share below.
// This is the mqtthandler.js file
const mqtt = require("mqtt");

class MqttHandler {
    constructor() {
        this.mqttClient = null;
        this.host = "mqtt://localhost:1883";
        this.username = "YOUR_USER"; // mqtt credentials if these are needed to connect
        this.password = "YOUR_PASSWORD";
    }

    connect() {
        // Connect mqtt with credentials (if needed; otherwise we can omit the 2nd param)
        this.mqttClient = mqtt.connect(this.host, {
            username: this.username,
            password: this.password,
        });

        // Mqtt error callback
        this.mqttClient.on("error", (err) => {
            console.log(err);
            this.mqttClient.end();
        });

        // Connection callback
        this.mqttClient.on("connect", () => {
            console.log(`mqtt client connected`);
        });

        // mqtt subscriptions
        this.mqttClient.subscribe("value", { qos: 0 });

        // When a message arrives, console.log it
        this.mqttClient.on("message", function (topic, message) {
            console.log(message.toString());
        });

        this.mqttClient.on("close", () => {
            console.log(`mqtt client disconnected`);
        });
    }

    // Sends a mqtt message to topic: mytopic
    sendMessage(message) {
        this.mqttClient.publish("value", message);
    }
}

module.exports = MqttHandler;
When the simulator sends data to the /simulator/data route, I get the value and send it to the broker on the value topic. I share the POST request code and the simulator output below.
var mqttHandler = require("../mqtthandler");

module.exports = function (app) {
    app.get("/simulator", function (req, res) {
        res.render("iot/simulator");
    });

    // route to display all the data that is generated
    app.get("/simulator/data", require("./controllers/data").all);

    var mqttClient = new mqttHandler();
    mqttClient.connect();

    // route to write data to the database
    app.post(
        "/simulator/data",
        require("./controllers/data").write,
        (req, res) => {
            mqttClient.sendMessage(req.body.value);
            res.status(200).send("Message sent to mqtt");
        }
    );

    // delete the data when the stream is stopped or when the app is closed
    app.get("/simulator/data/delete", require("./controllers/data").delete);
};
When I send a GET request to /simulator/data I can see the generated data; however, the data is not being sent to the broker.
// This is the output of the simulator
[
    {
        "_id": "5ecfadc13cb66f10e4d9d39b",
        "value": "1.886768240197795",
        "__v": 0,
        "categories": []
    },
    {
        "_id": "5ecfadc23cb66f10e4d9d39c",
        "value": "7.351404601932272",
        "__v": 0,
        "categories": []
    }
]
PS: The broker is created via Node-RED.
I would like to pass this data to the broker and see the result with an MQTT subscription, but I cannot find where I am making a mistake.
Your solution is to fix your development process. Rather than debugging two subsystems at once from failure (your publisher/simulator and your subscriber), work from success:
1) use publishers that you KNOW work, e.g. mosquitto_pub, any simulator that works, etc.
2) use subscribers that you KNOW work, e.g. mosquitto_sub
This will solve your problem in minutes, rather than hours or days, and let you focus on the code that you REALLY want to develop.
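For example (assuming the broker is on localhost, and using the question's value topic):
# watch everything published to the topic
mosquitto_sub -h localhost -t 'value' -v
# publish a known-good test message
mosquitto_pub -h localhost -t 'value' -m '42.0'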
So, a couple of things to look at here.
Your this.mqttClient.subscribe() call is inline with your connect(), on("error", ...), on("message", ...), etc. So the subscribe() could fire before the connect() has finished, and thus you would never subscribe. Put the subscribe() inside the connect() callback.
You are subscribing to "value", which is not a proper MQTT topic. If you must, use value/# for the subscribe(), and "value/.." for the publish(). Your class only allows for a single, hard-coded topic, so it won't be very useful when you want to reuse it for other projects. Take the time now to pass the topic string to the class as well; a sketch of both changes follows.
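Something along these lines (a sketch against the class from the question; the exact topic handling is up to you):
// Pass the topic in rather than hard-coding it.
connect(topic) {
    this.mqttClient = mqtt.connect(this.host, {
        username: this.username,
        password: this.password,
    });

    this.mqttClient.on("connect", () => {
        console.log("mqtt client connected");
        // Safe to subscribe now -- the connection is established.
        this.mqttClient.subscribe(topic, { qos: 0 });
    });

    this.mqttClient.on("message", (msgTopic, message) => {
        console.log(message.toString());
    });
}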

How can I deal with message processing failures in Node.js using MQTT?

I'm using a service written in Node.js to receive messages via MQTT (https://www.npmjs.com/package/mqtt) which then writes to a database (SQL Server using mssql).
This works very nicely when everything is functioning normally: I create the MQTT listener and subscribe to new-message events.
However, if the connection to the DB fails (this may happen periodically due to a network outage etc.), writing the message to the database will fail and the message will be dropped on the floor.
I would like to tell the MQTT broker - "I couldn't process the message, keep it in the buffer until I can."
var mqtt = require('mqtt')
var client = mqtt.connect('mymqttbroker')

client.on('connect', function () {
    client.subscribe('messagequeue')
})

client.on('message', function (topic, message) {
    writeMessageToDB(message)
        .then((result) => { console.log('success'); })
        .catch((err) => { /* What can I do here? */ });
})
Maybe set a timeout on a resend function? It should probably be improved to only try n times before dropping the message, but it's definitely a way to do it. This isn't tested, obviously, but it should hopefully give you some ideas...
var resend = function (message) {
    writeMessageToDB(message).then((result) => {
        console.log('Resend success!')
    })
    .catch((err) => {
        // Note: don't redeclare `message` as a parameter of the timeout
        // callback, or the closed-over message would be shadowed by undefined.
        setTimeout(function () {
            resend(message);
        }, 60000);
    });
}

client.on('message', function (topic, message) {
    writeMessageToDB(message).then((result) => {
        console.log('success')
    })
    .catch((err) => {
        resend(message);
    });
});
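A sketch of the "only try n times" improvement mentioned above (MAX_RETRIES and the delay are made-up values):
var MAX_RETRIES = 5;

var resend = function (message, attempt) {
    attempt = attempt || 1;
    writeMessageToDB(message).then((result) => {
        console.log('Resend success!');
    })
    .catch((err) => {
        if (attempt >= MAX_RETRIES) {
            console.error('Dropping message after ' + attempt + ' attempts');
            return;
        }
        setTimeout(function () {
            resend(message, attempt + 1);
        }, 60000);
    });
};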

Wait for an event to happen before sending HTTP response in NodeJS?

I'm looking for a solution to waiting for an event to happen before sending a HTTP response.
Use Case
The idea is I call a function in one of my routes: zwave.connect("/dev/ttyACM5"); This function returns immediately.
But there are 2 events that indicate whether connecting to the device succeeded or failed:
zwave.on('driver ready', function(){...});
zwave.on('driver failed', function(){...});
In my route, I would like to know whether the device connected successfully or not before sending the HTTP response.
My "solution"
When an event happens, I save it in a database:
zwave.on('driver ready', function(){
    // In the database, save the fact that the event happened; here it's the event "CONNECTED"
});
In my route, I execute the connect function and wait for the event to appear in the database:
router.get('/', function(request, response, next) {
    zwave.connect("/dev/ttyACM5");
    waitForEvent("CONNECTED", 5, null, function(){
        response.redirect('/connected');
    });
});
// The function used to wait for the event
waitForEvent: function(eventType, nbCallMax, nbCall, callback){
    if (nbCall == null) nbCall = 1;
    if (nbCallMax == null) nbCallMax = 1;
    // Look for the event (returns true if the event happened, false otherwise)
    var event = findEventInDataBase(eventType);
    if (event) {
        callback();
    } else if (nbCall < nbCallMax) {
        setTimeout(function(){
            waitForEvent(eventType, nbCallMax, nbCall + 1, callback);
        }, 1500);
    }
}
I don't think this is good practice because it repeatedly polls the database.
So what are your opinions/suggestions about it?
I've gone ahead and added the asynchronous and control-flow tags to your question because at the core of it, that is what you're asking about. (As an aside, if you're not using ES6 you should be able to translate the code below back to ES5.)
TL;DR
There are a lot of ways to handle async control flow in JavaScript (see also: What is the best control flow module for node.js?). You are looking for a structured way to handle it, likely Promises or the Reactive Extensions for JavaScript (a.k.a. RxJS).
Example using a Promise
From MDN:
The Promise object is used for asynchronous computations. A Promise represents a value which may be available now, or in the future, or never.
The async computation in your case is the computation of a boolean value describing the success or failure to connect to the device. To do so, you can wrap the call to connect in a Promise object like so:
const p = new Promise((resolve) => {
    // This assumes that the events are mutually exclusive
    zwave.connect('/dev/ttyACM5');
    zwave.on('driver ready', () => resolve(true));
    zwave.on('driver failed', () => resolve(false));
});
Once you have a Promise representing the state of the connection, you can attach functions to its "future" value:
// Inside your route file
const p = /* ... */;

router.get('/', function(request, response, next) {
    p.then(successful => {
        if (successful) {
            response.redirect('/connected');
        }
        else {
            response.redirect('/failure');
        }
    });
});
You can learn more about Promises on MDN, or by reading one of many other resources on the topic (e.g. You're Missing the Point of Promises).
Have you tried this? From the look of it, your zwave object probably already implements an EventEmitter; you just need to attach a listener to it:
router.get('/', function(request, response, next) {
    zwave.connect("/dev/ttyACM5");
    zwave.once('driver ready', function(){
        response.redirect('/connected');
    });
});
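Presumably you would also want the failure path; the same pattern works (a hedged sketch, and note you would want to make sure only one of the two handlers ends up responding):
zwave.once('driver failed', function(){
    response.redirect('/failure');
});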
There is also an npm sync module, which is used to synchronize the execution of queries.
When you want to run parallel queries in a synchronous way, Node doesn't make that easy because it never waits for a response; the sync module is well suited to that kind of solution.
Sample code
/* require sync module */
var Sync = require('sync');

app.get('/', function(req, res, next){
    story.find().exec(function(err, data){
        var sync_function_data = find_user.sync(null, {name: "sanjeev"});
        res.send({story: data, user: sync_function_data});
    });
});

/***** sync function defined here *******/
function find_user(req_json, callback) {
    process.nextTick(function () {
        users.find(req_json, function (err, data) {
            if (!err) {
                callback(null, data);
            } else {
                callback(null, err);
            }
        });
    });
}
reference link: https://www.npmjs.com/package/sync

Mongoose 4 sockets closing

I have a script that I use to save all my models (e.g., to reindex). I am getting socket-closing errors after a couple hundred saves (I have 900 total). I recently upgraded from Mongoose 3.x.x to 4.2.3 and started seeing these errors. I am not sure what else to go on.
Errors:
{ [MongoError: server ds0133252-a0.mongolab.com:133252 sockets closed]
  name: 'MongoError',
  message: 'server ds051252-a0.mongolab.com:51252 sockets closed' }
{ [MongoError: server ds0133252-a0.mongolab.com:133252 sockets closed]
  name: 'MongoError',
  ....
The script is pretty basic:
var mongoose = require('mongoose'),
    Product = require('../models/product'),
    config = require('config');

mongoose.connect(config.db.mongo.connection, config.db.mongo.options);

Product.find(function(err, products) {
    products.forEach(function(p) {
        p.markModified('description');
        p.save(function(e, product) {
            if (e) console.log(e);
            console.log(product.id);
        });
    });
});
The model is pretty complex but hasn't changed in a while. I have disabled the "save" middleware and get the same errors, so it should be pretty standard.
Suggestions?
You can use an async flow-control library like async, whose iterators let you limit the number of concurrent save operations.
In this case, async.eachLimit would be a good fit (the eachLimit variant of each). For example, to limit the iteration to no more than 5 concurrent saves:
var async = require('async');

Product.find(function(err, products) {
    async.eachLimit(products, 5, function(p, callback) {
        p.markModified('description');
        p.save(function(e, product) {
            if (e) console.log(e);
            console.log(product.id);
            callback(e);
        });
    });
});
Note that the callback parameter of eachLimit must be called when the save completes so that the library knows that particular iteration is complete.
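eachLimit also accepts a final callback that fires once all iterations have finished, which is a convenient place to disconnect (a sketch; the iterator body is the same as above):
async.eachLimit(products, 5, function(p, callback) {
    // ...same save logic as above...
    callback();
}, function(err) {
    if (err) console.error('stopped early:', err);
    mongoose.disconnect();
});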
