I am in the process of building an API service using Node.js with ExpressJS. I understand Node is single threaded, but allows for asynchronous operations.
For example, this simple express route:
app.post('/a', (req, res) => {
console.log('processing');
setTimeout(() => {
res.send('Hello World!');
console.log('sent');
}, 5000);
console.log('waiting');
});
If I make 2 immediately consecutive requests to this route, the console logs of the two requests are interleaved rather than run one request after the other. What if these were writes to a database instead of setTimeout calls? How do developers guarantee that the writes are serialized, so that if this endpoint were hit twice in quick succession it would not result in duplicate inserts, for example? Is it controlled on the database side (with indexes, perhaps)? Maybe with transactions, but what if the database doesn't support transactions?
Any help or understanding would be more than appreciated.
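To make the concern concrete, here is a minimal in-memory sketch, not a real database. It assumes a hypothetical `tick()` helper that stands in for a database round trip, and a `Set` that plays the role of the unique index the question mentions. It only illustrates why a check followed by a separate insert can race, while a uniqueness check enforced in one step cannot.

```javascript
// Hypothetical stand-in for one database round trip.
const tick = () => new Promise((resolve) => setTimeout(resolve, 10));

// Racy version: the existence check and the insert are separate round trips.
const rows = [];
async function checkThenInsert(email) {
  await tick();                          // simulated SELECT
  const exists = rows.includes(email);
  await tick();                          // simulated INSERT
  if (!exists) rows.push(email);         // two concurrent requests can both get here
}

// Guarded version: the Set plays the role of a unique index.
const uniqueIndex = new Set();
async function insertWithUniqueKey(email) {
  await tick();                          // simulated INSERT
  // The check and the add run in one synchronous step, so they cannot interleave.
  if (uniqueIndex.has(email)) throw new Error('duplicate key');
  uniqueIndex.add(email);
}
```

Calling `checkThenInsert` twice at the same time leaves two rows, while the second concurrent `insertWithUniqueKey` call is rejected and has to be handled by the caller.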
Related
When making a REST service using Express in Node, how do I prevent a blocking task from blocking the entire REST service? Take the following Express REST service as an example:
const express = require('express');
const app = express();
app.get('/', (req, res) => res.send('Hello, World'));
const blockService = async function () {
return new Promise((resolve, reject) => {
const end = Date.now() + 20000;
while (Date.now() < end) {
const doSomethingHeavyInJavaScript = 1 + 2 + 3;
}
resolve('I am done');
});
}
const blockController = function (req, res) {
blockService().then((val) => {
res.send(val);
});
};
app.get('/block', blockController);
app.listen(3000, () => console.log('app listening on port 3000'));
In this case, a call to /block will render the entire service unreachable for 20 seconds. This is a big problem if there are many clients using the service, since no other client will be able to access the service for that time. The cause is obviously the while loop, which is blocking code and thus hangs the main thread. The code might be confusing because, despite the promise in blockService, the main thread still hangs. How do I ensure that blockService runs in a worker thread and not on the event loop?
By default node.js runs your JavaScript code in a single thread. So, if you really have CPU-intensive code in a request handler (like you show above), then that is indeed a problem. Your options are as follows:
Start up a Worker Thread and run the CPU-intensive code in it. node.js has had worker threads for this purpose since v10.5.0 (experimental at first, stable in the v12 release line). You then communicate the result back to the main thread with messaging.
Start up any other process that runs node.js code or any type of code and compute the result in that other process. You then communicate the result back to the main thread with messaging.
Use node clustering to start N processes so that if one process is stuck with a CPU-intensive operation, at least one of the others is hopefully free to run other requests.
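The first option could look roughly like the sketch below. It uses the built-in `worker_threads` module with the worker source inlined as a string (`eval: true`) so that everything fits in one file; the `blockServiceInWorker` name and the `ms` parameter are made up for illustration, and a real project would more likely keep the worker in its own file.

```javascript
const { Worker } = require('node:worker_threads');

// Worker source, inlined so the sketch is a single file.
// It runs the busy loop from the question, but off the main thread.
const workerSource = `
  const { parentPort, workerData } = require('node:worker_threads');
  const end = Date.now() + workerData.ms;
  let iterations = 0;
  while (Date.now() < end) { iterations++; }
  parentPort.postMessage('I am done');
`;

// Hypothetical helper: starts a worker and resolves with the message it posts.
function blockServiceInWorker(ms) {
  return new Promise((resolve, reject) => {
    const worker = new Worker(workerSource, { eval: true, workerData: { ms } });
    worker.once('message', resolve);
    worker.once('error', reject);
    worker.once('exit', (code) => {
      // Only matters if the worker died before posting a message.
      if (code !== 0) reject(new Error(`worker exited with code ${code}`));
    });
  });
}
```

In the controller from the question, `blockService()` would then be replaced by something like `blockServiceInWorker(20000)`, and the `/` route stays responsive while the loop runs.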
Please note that a lot of things that servers do like read files, do networking, make requests to databases are all asynchronous and non-blocking so it's not incredibly common to actually have lots of CPU intensive code. So, if this is just a made up example for your own curiosity, you should make sure you actually have a CPU-intensive problem in your server before you go designing threads or clusters.
Node.js is an event-based model that uses a single runtime thread. For the reasons you've discovered, Node.js is not a good choice for CPU bound tasks (or synchronously blocking tasks). Node.js works best for coordinating I/O asynchronously.
Worker threads became stable in the Node.js v12 release line (they first shipped as an experimental feature in v10.5.0). They let you use another thread for blocking tasks. They are relatively simple to use and could work if you absolutely need to offload blocking tasks.
I have seen some questions about sending response immediately and run CPU intensive tasks.
My case is my node application depends on third party service responses so the process flow is
Node receives request and authenticates with third-party service
Send response to user after authentication
Do some tasks that needs responses from third party service
Save the results to database
In my case there are no CPU-intensive tasks and no need to give the results of the additional tasks to the user, but node needs to wait for responses from the third-party service. I have to do multiple requests to the third-party service after the authentication to complete the task.
How can I achieve this?
I have seen some workarounds with child_process, nextTick and setTimeout.
Ultimately I want to send response immediately to user and do tasks related to that user.
Thanks in advance.
// elsewhere in your code
function do_some_tasks() { /* ... */ }
// route function
(req, res) => {
// call some async task
do_some_tasks()
// if do_some_tasks() only starts asynchronous work, res.send() below is called immediately without waiting for it; is that the case for your tasks?
res.send()
}
// if your do_some_tasks() is a synchronous function, then you can do
// this function call will be put to queue and executed asynchronously
setImmediate(() => {
do_some_tasks()
})
// this will be called in the current iteration
res.send(something)
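To check the ordering described in the comments above, here is a small sketch that records what runs when. The `res` object is a hypothetical stand-in for the Express response, and `doSomeTasks` is a synchronous placeholder.

```javascript
const order = [];

// Hypothetical stand-in for the Express response object.
const res = { send: (body) => order.push(`sent: ${body}`) };

// Synchronous placeholder for the follow-up work.
function doSomeTasks() {
  order.push('tasks ran');
}

function route(req, res) {
  // Queued: runs in a later phase of the event loop.
  setImmediate(() => doSomeTasks());
  // Runs first, in the current iteration.
  res.send('ok');
}

route({}, res);
```

Once the event loop has turned, `order` holds the send before the tasks. Note that a long synchronous `doSomeTasks` would still block other requests while it runs; `setImmediate` only defers it.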
Just writing a very general code block here:
var do_some_tasks = (req, tp_response) => {
third_party_tasks(args, (err, result) => {
//save to DB
});
}
var your_request_handler = (req,res) => {
third_party_auth(args, (tp_response)=>{
res.send();
//just do your tasks here
do_some_tasks(req, tp_response);
});
}
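The same flow can be sketched with async/await. Everything named here (`thirdPartyAuth`, `thirdPartyTasks`, `saveToDb`, the `saved` array) is a hypothetical stand-in, not part of any real service. The point is that the follow-up work is started without being awaited, and gets its own `catch` because the response has already been sent by the time it can fail.

```javascript
// Hypothetical stand-ins for the third-party calls and the database write.
const thirdPartyAuth = async (user) => ({ token: `token-for-${user}` });
const thirdPartyTasks = async (token) => [`result for ${token}`];
const saved = [];
const saveToDb = async (rows) => { saved.push(...rows); };

// Follow-up work whose result the user does not need.
async function doSomeTasks(tpResponse) {
  const results = await thirdPartyTasks(tpResponse.token);
  await saveToDb(results);
}

async function requestHandler(req, res) {
  const tpResponse = await thirdPartyAuth(req.user);
  res.send('authenticated'); // respond as soon as authentication is done

  // Deliberately not awaited. Errors are logged here because nobody
  // else is waiting on this promise.
  doSomeTasks(tpResponse).catch((err) => {
    console.error('background task failed', err);
  });
}
```

No child_process, nextTick or setTimeout is needed for this case, since the waiting is I/O and does not occupy the thread.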
From https://node-postgres.com/features/connecting, it seems we can choose between Pool and Client to perform a query:
pool.query('SELECT NOW()', (err, res) => {
console.log(err, res)
pool.end()
})
client.query('SELECT NOW()', (err, res) => {
console.log(err, res)
client.end()
})
Their functionality looks very much the same, but the documentation doesn't say much about the difference between Pool and Client.
What should I consider before choosing between Pool and Client?
Use a pool if you have or expect to have multiple concurrent requests. That is literally what it is there for: to provide a pool of re-usable open client instances (reduces latency whenever a client can be reused).
In that case you definitely do not want to call pool.end() when your query completes, you want to reserve that for when your application terminates because pool.end() disposes of all the open client instances. (Remember, the point is to keep up to a fixed number of client instances available.)
One of the most significant differences to know is that a transaction must run on a single client: either a standalone Client, or one client checked out from the pool. It must not go through pool.query.
From the documentation:
You must use the same client instance for all statements within a
transaction. PostgreSQL isolates a transaction to individual clients.
This means if you initialize or use transactions with the pool.query
method you will have problems. Do not use transactions with the
pool.query method.
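A minimal sketch of what the quoted rule implies when a pool is in use: check out one client, run every statement of the transaction on it, then release it. The `withTransaction` helper is a made-up name, and `pool` is assumed to be a node-postgres Pool (anything with the same `connect()` / `query()` / `release()` shape works, which keeps the sketch runnable without a database).

```javascript
// `pool` is assumed to be a pg Pool; `work` receives the checked-out client.
async function withTransaction(pool, work) {
  // One client for the whole transaction, never pool.query.
  const client = await pool.connect();
  try {
    await client.query('BEGIN');
    const result = await work(client);
    await client.query('COMMIT');
    return result;
  } catch (err) {
    await client.query('ROLLBACK');
    throw err;
  } finally {
    // Hand the client back to the pool; do not end it.
    client.release();
  }
}
```

A caller would pass a function such as `async (client) => client.query(...)`, so that every statement inside it shares the same connection.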
An iOS application makes a request to send messages to users. I want to return the result to the application and, after that, send push notifications to the users, without waiting to find out whether the notifications were pushed successfully.
app.post("/message", function(req, res, next) {
User.sendMessages(query, options, function(err, results) {
res.json(results);
sendPushNotifications();
});
});
How can I do this?
That's how it works.
Keep in mind everything that happens in node is in a single thread, unlike other back-end languages you might be used to.
Requests, jobs, everything happens in that single thread. Unless, of course, you use cluster or something like that.
If node simply has two threads, one to execute the main code and the other for all the callbacks, then blocking can still occur if the callbacks are resource/time intensive.
Say you have 100,000 concurrent users and each client request to the node app runs a complicated and time consuming database query, (assuming no caching is done) will the later users experience blocking when waiting for the query to return?
function onRequest(request, response) {
//hypothetical database call
database.query("SELECT * FROM hugetable", function(data) {
response.writeHead(200, {"Content-Type": "text/plain"});
response.write("database result: " + data);
response.end();
});
}
http.createServer(onRequest).listen(8888);
If each callback can run on its own thread, then this is a non-issue. But if all the callbacks run on a single separate dedicated thread then node doesn't really help us much in such a scenario.
Please read this good article: http://blog.mixu.net/2011/02/01/understanding-the-node-js-event-loop/
There are no multiple threads, only one that executes your node.js logic. (Behind-the-scenes I/O should not be considered part of application logic.)
Requests to the database are async as well: all you do is put the query in a queue for transfer to the db socket. Everything else happens behind the scenes, and your callback is invoked only when the response from the database arrives, so application logic is never blocked.
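The behaviour described above can be observed with a small simulation. `fakeQuery` is a hypothetical stand-in for `database.query`: the timer plays the role of the database doing its work elsewhere, and `events` just records the order in which things happen on the single JavaScript thread.

```javascript
const events = [];

// Hypothetical stand-in for database.query: hands the work off and returns at once.
function fakeQuery(id, callback) {
  events.push(`query ${id} sent`);
  setTimeout(() => callback(`rows for ${id}`), 20);
}

// Simplified request handler: only records when its callback fires.
function onRequest(id) {
  fakeQuery(id, (data) => events.push(`response ${id}: ${data}`));
}

// Three "requests" arrive back to back.
[1, 2, 3].forEach(onRequest);
events.push('main thread is free');
```

All three queries are sent before any response comes back, and the thread is free in between; later users wait on the database, not on earlier users' callbacks.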