NodeJS/SailsJS app database block - node.js

I understand that Node.js runs on a single thread, but if I have to run a long database process, do I need to start a web worker for it?
For example, in a Sails.js app I can call the database to create a record, but if the database call takes a long time to finish, it blocks other users from accessing the database.
Below is the sample code I tried:
var test = function(cb) {
  // declare the loop counter so it does not leak as an implicit global
  for (var i = 0; i < 10000; i++) {
    Company.create({companyName: 'Walter Jr' + i}).exec(cb);
  }
}
test(function(err, result) {
});
console.log("return to client");
return res.view('cargo/view', {
  model: result
});
On the first request, the view returns almost instantly. But if I request it again, I have to wait for all the records to be inserted before the view is returned.
What is the common practice for this kind of blocking issue?

Node.js has non-blocking, asynchronous I/O.
The article below should help you restructure your code:
http://hueniverse.com/2011/06/29/the-style-of-non-blocking/
Also consider using Promises, which make asynchronous code easier to structure.
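As a rough sketch of what such a restructuring might look like: instead of firing all 10,000 create calls from one synchronous loop, the inserts can be issued in small batches that are awaited one after another, which may leave room for other requests' queries in between. Everything here is an assumption for illustration: `createInBatches` and `fakeCreate` are hypothetical names, `fakeCreate` stands in for a model call such as `Company.create(...)`, and the batch size is arbitrary.

```javascript
// Hypothetical helper: insert records in limited-size batches.
// `createRecord` is assumed to return a Promise (a stand-in for a model's create call).
async function createInBatches(records, createRecord, batchSize) {
  const results = [];
  for (let start = 0; start < records.length; start += batchSize) {
    const batch = records.slice(start, start + batchSize);
    // At most `batchSize` inserts are in flight at any moment.
    const created = await Promise.all(batch.map((record) => createRecord(record)));
    results.push(...created);
  }
  return results;
}

// Fake, in-memory "create" that resolves asynchronously (not a real database call).
const fakeCreate = (record) =>
  new Promise((resolve) => setImmediate(() => resolve({ id: record.companyName })));

const names = Array.from({ length: 25 }, (_, i) => ({ companyName: 'Walter Jr' + i }));
const pending = createInBatches(names, fakeCreate, 10);
```

In a controller, the view can be returned without waiting for `pending`, or after awaiting it if the view needs the created records.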

Related

How to properly use database when scaling a NodeJS app?

I am wondering how I would properly use MySQL when I am scaling my Node.JS app using the cluster module. Currently, I've only come up with two solutions:
Solution 1:
Create a database connection on every "worker".
Solution 2:
Have the database connection on the master process, and whenever one of the workers requests some data, the master process returns it. However, with this solution I do not know how the worker would retrieve the data from the master process.
I think I made a "hacky" workaround: the worker sends a message with a unique number, then waits for the master process to send a message back to the worker, with the event name being that unique number.
If you don't understand what I mean by this, here's some code:
// Worker process
return new Promise(function (resolve, reject) {
  process.send({
    // Other data here
    identifier: <unique number>
  })
  // having a custom event emitter on the worker
  worker.once(<unique number>, function (data) {
    // data being the data for the request with the unique number
    // resolving the promise with returned data
    resolve(data)
  })
})
//////////////////////////
// Master process
// Custom event emitter on the master process
master.on(<eventName>, function (data) {
  // logic
  // Sending data back to worker
  master.send(<other args>, data.identifier)
})
What would be the best approach to this problem?
Thank you for reading.
When you cluster in Node.js, you should assume each process is completely independent. You really shouldn't be relaying messages like this to and from the master process. If you need multiple threads to access the same data, I don't think Node.js is what you should be using. However, if you're just doing basic CRUD operations with your database, clustering (solution 1) is certainly the way to go.
For example, if you're trying to scale write ops to your database (assuming your database is properly scaled), each write op is independent from another. When you cluster, a single write request will be load balanced to one of your workers. Then in the worker, you delegate the write op to your database asynchronously. In this scenario, there is no need for a master process.
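A minimal sketch of solution 1, under stated assumptions: `createPool` below is a hypothetical stand-in for a real driver's pool factory (it only fakes queries), and `getPool`, `isPrimaryProcess` and `start` are illustrative names, not part of any library. The idea it tries to show is that each worker lazily creates its own pool and the primary process never touches the database.

```javascript
const cluster = require('cluster');
const os = require('os');

// Hypothetical pool factory. A real driver's pool would go here; this fake
// only records its options and resolves queries asynchronously.
function createPool(options) {
  return {
    options,
    query(sql, params) {
      return Promise.resolve({ sql, params, pid: process.pid });
    },
  };
}

// Each process that calls getPool() lazily gets exactly one pool of its own.
let pool = null;
function getPool() {
  if (pool === null) {
    pool = createPool({ connectionLimit: 10 });
  }
  return pool;
}

// Newer Node versions expose isPrimary; older ones only have isMaster.
function isPrimaryProcess() {
  return cluster.isPrimary !== undefined ? cluster.isPrimary : cluster.isMaster;
}

// The primary only forks workers; each worker runs with its own pool.
function start(runWorker) {
  if (isPrimaryProcess()) {
    os.cpus().forEach(() => cluster.fork());
  } else {
    runWorker(getPool());
  }
}
```

At the application entry point this would be used as `start(function (pool) { /* create the HTTP server and call pool.query(...) in handlers */ })`.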
If you haven't planned on using a proper microservice architecture, where each process would actually have its own database (or perhaps just in-memory storage), your best bet IMO is to use a connection pool created by the main process and have each child request a connection out of that pool. That's probably the safest approach to avoid issues in the neighborhood of thread-safety errors.

How to use newrelic.createBackgroundTransaction without a handle function

I am looking into using newrelic APM to monitor certain parts of our codebase.
I want to watch transactions that are not simple HTTP calls, but background processes. These transactions are completed by worker processes and we want to monitor them in the main part of the app.
Pseudo code:
var fork = childProcess.spawn('node', ['--harmony', 'path-to-worker.js', args]);
fork.stdout.on('data', function(data) {
  // a finished transaction
  // this fires most likely more than once
});
We basically need something like newrelic.createBackgroundTransaction() that can log a transaction immediately, without having to pass it a function to execute and time (I can do that myself).
Can I do something like this on the free tier of newrelic?

Spawning a Node.js task to run on its own

Sorry if this is a basic question. I'm just starting my 3rd week of doing Node.js programming! I looked around and didn't see an answer to this, specifically. Maybe it's just assumed when answering questions about child_process.spawn/fork by those who know this stuff better than I do.
I have a Node/Express app where I want to take in an HTTP request, save a bit of data to Mongo, return success/error, but...at the same time kick off a process to take some of the data and do a lookup against a web API. I want to save that data back to Mongo, but there's no need to have that communicated back to the HTTP client. (I'll probably log the success/error of that call somewhere.)
How do I kick off that 2nd task to run independent of the main request and not cause the response to wait for it to complete?
The 2nd task will also be written in Node.js. I'd like it to just be another function in the same file, if possible.
Thanks in advance!
I don't see why you would need to spawn another process just for that. Unlike in some other frameworks, in Node you are not limited to the HTTP request lifecycle for running code. This should do it:
function yourHandler(req, res, next) {
  // the callback's second argument is named "result" so it does not shadow the response object "res"
  dataAccess.writeToMongo(someData, function(err, result) {
    var status = err ? 500 : 200;
    // write back to response already!
    res.status(status);
    res.end();
    // do not completely terminate yet
    // kick off web api call
    apiClient.doSomething();
  });
}
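A promise-based variation of the same idea, as a sketch only: the response is finished first, and the follow-up lookup runs afterwards with its own error handling so a failure is logged rather than lost. `dataAccess`, `apiClient.lookup` and `backgroundErrors` are hypothetical stand-ins, and the lookup is made to fail on purpose to show the error path.

```javascript
// Hypothetical stand-ins for the data layer and the external web API client.
const dataAccess = {
  writeToMongo: (data) => Promise.resolve({ saved: data }),
};
const apiClient = {
  lookup: (data) => Promise.reject(new Error('lookup failed for ' + data.id)),
};

// Stand-in for a log of background failures.
const backgroundErrors = [];

function yourHandler(req, res) {
  return dataAccess.writeToMongo(req.body).then(
    () => {
      res.status(200).end();
      // Deliberately not returned: the response is already finished.
      // The catch keeps a failed lookup from becoming an unhandled rejection.
      apiClient
        .lookup(req.body)
        .then((extra) => dataAccess.writeToMongo(extra))
        .catch((err) => backgroundErrors.push(err.message));
    },
    () => {
      res.status(500).end();
    }
  );
}
```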

How to lock (Mutex) in NodeJS?

There are external resources (available inventories, accessed through an API) that can only be accessed by one thread at a time.
My problems are:
NodeJS server handles requests concurrently, we might have multiple requests at the same time trying to reserve inventories.
If I hit the inventory API concurrently, then it will return duplicate available inventories
Therefore, I need to make sure that I am hitting the inventory API one thread at a time
There is no way for me to change the inventory API (legacy), therefore I must find a way to synchronize my nodejs server.
Note:
There is only one nodejs server, running one process, so I only need to synchronize the requests within that server
Low traffic server running on express.js
I'd use something like the async module's queue and set its concurrency parameter to 1. That way, you can put as many tasks in the queue as you need to run, but they'll only run one at a time.
The queue would look something like:
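var async = require('async');

// the second argument (1) is the concurrency: only one task runs at a time
var inventoryQueue = async.queue(function(task, callback) {
  // use the values in "task" to call your inventory API here
  // pass your results to "callback" when you're done
}, 1);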
Then, to make an inventory API request, you'd do something like:
var inventoryRequestData = { /* data you need to make your request; product id, etc. */ };
inventoryQueue.push(inventoryRequestData, function(err, results) {
  // this will be called with your results
});
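If pulling in a library is not desired, the same one-at-a-time behaviour can be approximated with a promise chain. This is a minimal sketch, not the async module's implementation: `createLock`, `withInventoryLock` and `reserveInventory` are hypothetical names, and the inventory API call is simulated with a timer.

```javascript
// Minimal promise-chain lock: each task starts only after the previous one settles.
function createLock() {
  let tail = Promise.resolve();
  return function runExclusive(task) {
    const result = tail.then(() => task());
    // Keep the chain alive even if a task fails.
    tail = result.catch(() => {});
    return result;
  };
}

const withInventoryLock = createLock();
const events = [];

// Hypothetical inventory call, simulated with a timer.
function reserveInventory(name, delayMs) {
  return withInventoryLock(
    () =>
      new Promise((resolve) => {
        events.push('start ' + name);
        setTimeout(() => {
          events.push('end ' + name);
          resolve(name);
        }, delayMs);
      })
  );
}

// 'b' is much faster than 'a', but it still waits for 'a' to finish.
const allDone = Promise.all([reserveInventory('a', 20), reserveInventory('b', 1)]);
```

Like the queue above, this only serializes calls inside a single process, which matches the single-server setup described in the question.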

How is the event-loop implemented in node.js?

If node simply has two threads, one to execute the main code and the other for all the callbacks, then blocking can still occur if the callbacks are resource/time intensive.
Say you have 100,000 concurrent users and each client request to the node app runs a complicated and time consuming database query, (assuming no caching is done) will the later users experience blocking when waiting for the query to return?
var http = require('http');

function onRequest(request, response) {
  // hypothetical database call
  database.query("SELECT * FROM hugetable", function(data) {
    response.writeHead(200, {"Content-Type": "text/plain"});
    response.write("database result: " + data);
    response.end();
  });
}
http.createServer(onRequest).listen(8888);
If each callback can run on its own thread, then this is a non-issue. But if all the callbacks run on a single separate dedicated thread then node doesn't really help us much in such a scenario.
Please read this good article: http://blog.mixu.net/2011/02/01/understanding-the-node-js-event-loop/
There are not multiple threads; only one thread executes your Node.js logic. (Behind-the-scenes I/O should not be considered part of the application logic.)
Requests to the database are asynchronous as well. All you do is put the query in a queue to be sent over the database socket; everything else happens behind the scenes, and your callback is invoked only when the response from the database arrives. So the application logic is not blocked while the query runs.
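A small sketch of that behaviour, with the database simulated by a timer: `fakeQuery` and `handleRequest` are hypothetical names, and the 10 ms delay stands in for time spent waiting on the database server. All three queries are sent before any of them is answered, because the single thread is free while it waits.

```javascript
const log = [];

// Hypothetical database call: the timer stands in for waiting on the database server.
function fakeQuery(id, callback) {
  log.push('sent ' + id);
  setTimeout(() => callback('rows for ' + id), 10);
}

// Simulated request handler: sends a query and resolves when the answer arrives.
function handleRequest(id) {
  return new Promise((resolve) => {
    fakeQuery(id, (data) => {
      log.push('answered ' + id);
      resolve(data);
    });
  });
}

// Three "requests" arrive back to back on the single JavaScript thread.
const finished = Promise.all([1, 2, 3].map(handleRequest));
```

The picture changes if the callback itself does heavy CPU work: that work does run on the single thread and delays every other callback.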

Resources