Why can't I get a value from async/await by waiting synchronously? - node.js

I am trying to get a MongoDB instance synchronously. I know this is not recommended, but I am just experimenting and wonder why this doesn't work. this.db is still undefined after 10 seconds of waiting, whereas normal asynchronous code gets it in less than 500 milliseconds.
Repository.js:
var mongodb = require('mongodb');
var config = require('../config/config');
var mongoConfig = config.mongodb;
var mongoClient = mongodb.MongoClient;
class Repository {
constructor() {
(async () => {
this.db = await mongoClient.connect(mongoConfig.host);
})();
}
_getDb(t) {
t = t || 500;
if(!this.db && t < 10000) {
sleep(t);
t += 500;
this._getDb(t);
} else {
return this.db;
}
}
collection(collectionName) {
return this._getDb().collection(collectionName);
}
}
function sleep(ms) {
console.log('sleeping for ' + ms + ' ms');
var t = new Date().getTime();
while (t + ms >= new Date().getTime()) {}
}
module.exports = Repository;
app.js:
require('../babelize');
var Repository = require('../lib/Repository');
var collection = new Repository().collection('products');

Javascript is an event-based architecture. All code is initiated via an event from the event queue and the next event is pulled from the event queue ONLY when the code from the previous event has finished executing. This means that your main Javascript code is single threaded.
As such, when you fire an async operation, it starts up an operation and when that operation finishes, it puts an event in the event queue. That event (which will trigger the callback for the async operation) will not run until the code from the previous event finishes running and returns back to the system.
So, now to your code. You start running some code which launches an async operation. Then, you spin in a busy-wait loop and never return back to the system while it runs. Because of that, the next event in the event queue, the one from the completion of your async operation, can NEVER run while you are waiting for it.
So, in a nutshell, you cannot spin in a loop waiting for an async operation to complete. That's incompatible with the event-driven scheme that Javascript uses. You never return back to the event system to let the async completion callback run, so this.db cannot be set while you wait. In effect, you have a deadlock that lasts until the loop gives up.
Instead, you need to code for an async response by returning a promise or by passing in a callback that is called sometime later. And your code needs to finish executing and then let the callback get called sometime in the future. No spinning in loops in Javascript waiting for something else to run.
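As a minimal sketch of the promise-based approach described above (not the original poster's final code): the constructor keeps the pending connection promise, and collection() awaits it instead of spinning. fakeClient is a hypothetical stand-in for mongodb.MongoClient so the snippet runs without a database; in the question's setup, mongoClient.connect(mongoConfig.host) would take its place.

```javascript
// Hypothetical stand-in for mongodb.MongoClient so the sketch runs without a server.
const fakeClient = {
  connect(host) {
    return new Promise((resolve) => {
      setTimeout(() => {
        resolve({ collection: (name) => ({ name, host }) });
      }, 50);
    });
  },
};

class Repository {
  constructor(client, host) {
    // Start connecting, but keep the promise instead of waiting for the value.
    this.dbPromise = client.connect(host);
  }

  // Returns a promise; callers await it instead of spinning in a loop.
  async collection(collectionName) {
    const db = await this.dbPromise;
    return db.collection(collectionName);
  }
}

async function main() {
  const repo = new Repository(fakeClient, 'mongodb://localhost:27017/test');
  const products = await repo.collection('products');
  console.log('got collection:', products.name);
  return products;
}

const mainResult = main();
```

The cost of this design is that collection() becomes asynchronous, so every caller has to await it (or use .then()); that is the price of letting the event loop deliver the connection result.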
You can see the async coding options here: How do I return the response from an asynchronous call?

Related

Websocket - Waiting for a http request callback to execute before next pusher event

So I'm working with websockets to process data from a website's API. For every new event I also send some http requests back to the website in order to obtain more data. Up until now everything has worked fine, but now that I started using async requests to speed it up a bit, things got a bit different. My code used to process one event and then move on to the next one (these events come in extremely quickly, around 10 per second), but now it just seems to ignore the async (non-blocking) part and move on to the next event, and that way it skips over half of the code. Note that the code works fine outside the Pusher. I'm using the 'pusher-client' module. My code looks like this:
var request = require("request");
var requestSync = require('sync-request');
var Pusher = require('pusher-client');
// pusher is a connected Pusher client instance (its construction is omitted in the question)
var events_channel1 = pusher.subscribe('inventory_changes');
events_channel1.bind('listed', function(data)
{
var var2;
//Async request (to speed up the code)
function myFunction(callback){
request("url", function(error, response, body) {
if (!error && response.statusCode == 200)
{
result = JSON.stringify(JSON.parse(body));
return callback(null, result);
}
else
{
return callback(error, null);
}
});
}
myFunction(function(err, data){
if(!err)
{
var2 = data
return(data);
}
else
{
return(err);
}
});
//The part of the code below waits for the callback and then executes some code
var var1 = var2;
check();
function check()
{
if(var2 === var1)
{
setTimeout(check, 10);
return;
}
var1 = var2;
//A CHUNK OF CODE EXECUTES HERE (connected to the data from the callback)
}
});
In conclusion the code works, but not inside the pusher due to the pusher skipping the asynchronous request. How would I make the pusher wait for my async request to finish, before processing the next event (I have no idea)? If you happen to know, please let me know :)
You need to implement a queue to handle events one after another. I'm curious how it worked before, even without Pusher you'd have to implement some queue mechanism for it.
const eventsQueue = []
let processingEvent = false
events_channel1.bind('listed', function(data) {
  eventsQueue.push(data)
  handleNewEvent()
})
function handleNewEvent() {
  if (processingEvent) return // do nothing if already processing an event
  const eventData = eventsQueue.shift() // pick the first element from array
  if (!eventData) return // all events are handled at the moment
  processingEvent = true // set only once there is really something to process
  // ... handle event data here; if the handling is asynchronous, run the
  // two lines below from its completion callback instead
  processingEvent = false
  handleNewEvent() // handle next event
}
Also, you should call clearTimeout to cancel your timeout when you don't need it anymore.
And it's better to use promises or async/await instead of callbacks. Your code will be much easier to read and maintain.
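To illustrate the promise-based variant of such a queue, here is a minimal sketch that chains each incoming event onto the previous one, so events are handled strictly one after another even when the handler is asynchronous. handleEvent is a hypothetical placeholder for the per-event HTTP request; enqueue and queueTail are names made up for this example.

```javascript
// Hypothetical async handler: stands in for the HTTP request made per event.
function handleEvent(data) {
  return new Promise((resolve) => {
    setTimeout(() => {
      console.log('handled', data);
      resolve(data);
    }, 20);
  });
}

// The tail of the chain; each new event is appended after the previous one.
let queueTail = Promise.resolve();
const handled = [];

function enqueue(data) {
  queueTail = queueTail
    .then(() => handleEvent(data))
    .then((result) => { handled.push(result); })
    .catch((err) => { console.error('event failed:', err); }); // keep the chain alive
  return queueTail;
}

// Events arriving in a burst are still processed strictly in order.
enqueue('a');
enqueue('b');
const lastEvent = enqueue('c');
```

In the Pusher setup from the question, the handler would presumably be registered as events_channel1.bind('listed', enqueue), with handleEvent replaced by the real request logic.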

How to stop async code from running Node.JS

I'm creating a program where I constantly run and stop async code, but I need a good way to stop the code.
Currently, I have tried two methods:
Method 1:
When a method is running, and another method is called to stop the first method, I start an infinite loop to stop that code from running and then remove the method from the queue (array).
I'm 100% sure that this is the worst way to accomplish it, and it is very buggy.
Code:
class test{
  async Start(){
    const response = await request(options);
    if(this.stopped){ // flag set from outside by Stop()
      while(true){
        await timeout(10)
      }
    }
  }
}
Code 2:
var tests = [];
Start(){
  const t = new test(); // a different name avoids shadowing the class
  tests.push(t)
  t.Start();
}
Stop(){
  tests.forEach((t) => { t.stopped = true; });
  tests = [];
}
Method 2:
I load the different methods into Workers, and when I need to stop the code, I just terminate the Worker.
It always takes a lot of time(1 sec) to create the Worker, and therefore not the best way, since I need the code to run without 1-2 sec pauses.
Code:
const Worker = require("tiny-worker");
const code = new Worker(path.resolve(__dirname, "./Code/Code.js"))
Stopping:
code.terminate()
Is there any other way that I can stop async code?
The program contains Request using nodejs Request-promise module, so program is waiting for requests, it's hard to stop the code without one of the 2 methods.
Is there any other way that I can stop async code?
Keep in mind the basics of how Node.js works. I think there is some misunderstanding here.
It executes the current function in the current context; when it encounters an async operation, the event loop schedules its continuation to run at some point in the future. There is no way to remove that scheduled execution.
More info on event loop here.
In general, to manage this kind of situation you should use flags or semaphores.
The program contains Request using nodejs Request-promise module, so program is waiting for requests, it's hard to stop the code
If you need to hard "stop the code" you can do something like
function stop() {
  process.exit()
}
But if I'm getting it right, you're launching requests at a fixed interval, and at some point you need to stop sending requests and stop handling their responses.
You can't de-schedule the response handling portion, but you can add some logic to it so that, when it eventually runs, it checks whether the "request loop" has been stopped.
let loop_is_stopped = false
let sending_loop = null
async function sendRequest() {
  const response = await request(options) // "wait here"
  // following lines are scheduled after the request promise is resolved
  if (loop_is_stopped) {
    return
  }
  // do something with the response
}
function start() {
  loop_is_stopped = false // allow restarting after a previous stop()
  sending_loop = setInterval(sendRequest, 1000)
}
function stop() {
  loop_is_stopped = true
  clearInterval(sending_loop)
}
module.exports = { start, stop }
We can use Promise.all without killing the whole app (process.exit()). Here is my example (you can use another trigger for calling controller.abort()):
const controller = new AbortController();
class Workflow {
static async startTask() {
await new Promise((res) => setTimeout(() => {
res(console.log('RESOLVE'))
}, 3000))
}
}
class ScheduleTask {
static async start() {
return await Promise.all([
new Promise((_res, rej) => { if (controller.signal.aborted) return rej('YAY') }),
Workflow.startTask()
])
}
}
setTimeout(() => {
controller.abort()
console.log("ABORTED!!!");
}, 1500)
const run = async () => {
try {
await ScheduleTask.start()
console.log("DONE")
} catch (err) {
console.log("ERROR", err.name)
}
}
run()
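// ABORTED!!!
// RESOLVE
"DONE" is never shown, because the first promise passed to Promise.all checks controller.signal.aborted only once, when it is created, and never settles.
res is still called: the timer inside startTask is not cancelled, so "RESOLVE" is printed.
Maybe it would be better to run your code as a script with its own process.pid; when you need to interrupt it, you can kill that process by pid from another place in your code with process.kill.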
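A variation on this idea, sketched under the assumption that the caller only needs to stop waiting (the underlying task is not cancelled): listen for the signal's 'abort' event and use Promise.race rather than Promise.all, so the caller is released as soon as abort() is called. rejectOnAbort, startTask and runWithAbort are hypothetical names for this example.

```javascript
// Rejects as soon as the signal aborts; never resolves on its own.
function rejectOnAbort(signal) {
  return new Promise((_resolve, reject) => {
    if (signal.aborted) {
      reject(new Error('aborted'));
      return;
    }
    signal.addEventListener('abort', () => reject(new Error('aborted')), { once: true });
  });
}

// Hypothetical task standing in for Workflow.startTask().
function startTask(ms) {
  return new Promise((resolve) => setTimeout(() => resolve('finished'), ms));
}

async function runWithAbort(signal, ms) {
  try {
    // Whichever settles first wins; the task itself keeps running in the background.
    return await Promise.race([rejectOnAbort(signal), startTask(ms)]);
  } catch (err) {
    return 'stopped: ' + err.message;
  }
}

const abortController = new AbortController();
setTimeout(() => abortController.abort(), 20);
const outcome = runWithAbort(abortController.signal, 200);
outcome.then((value) => console.log(value));
```

Note that the 200 ms timer still fires after the abort; only the code waiting on the race stops early. Cancelling the work itself requires the task to observe the signal too.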

nodejs setTimeout and recursive calls context (this/self)

I'm currently working on a project and got stuck at the setTimeout() function in a recursive function. I'm rather new to promises, so I may not have implemented this part correctly either.
The program should:
add a listener to a stream event 'readable'
write a request to the stream if specific periodic data is read
resolve promise and remove listener after some other data (answer) is received
send message from Stream to another process
repeat by calling the same method recursively with a delay of 10 secs
Basically I'm trying to poll from a stream every 10 seconds.
The code, simplified, looks like this:
class XYZ {
myFunction(commands, intervall, i) {
var self = this;
var promise = new Promise((resolve, reject) => {
// I have to write to a Stream and listen for an answer
self.dataStream.write(someData, () => {
self.dataStream.addListener('readable', handleStuff);
});
// Function that handles incoming data from the Stream
var handleStuff = function () {
if (self.dataStream == someFormat) {
self.dataStream.write(commands[i]);
} else {
self.dataStream.removeListener('readable', handleStuff);
resolve(self.dataStream.read());
}
}
});
// Resolving by sending msg and calling recursively
promise.then((message) => {
self.send(message);
if (i + 1 > resetValue) {
setTimeout(() => {
self.myFunction(commands, intervall, 0);
}, intervall);
} else {
self.myFunction(commands, intervall, i + 1);
}
});
}
};
And I call it like this:
var myXYZ = new XYZ();
myXYZ.myFunction(myCommands, 10000, 0);
Now when I run this, the initial call works just fine and sends the message from the dataStream to another process. But when the setTimeout() function is called, the function gets "stuck" after writing data to the stream for the first time, and the promise is neither resolved nor rejected.
My guess is that I'm mixing up the context (this/self) in my code. Sadly there's no error message, so I think my logic is faulty. It also works fine if I just remove the setTimeout() function. My question now is: how does setTimeout() change the context from which the code operates?

setTimeout or child_process.spawn?

I have a REST service in Node.js with one specific request running a bunch of DB commands and other file processing that could take 10-15 seconds to run. Since I didn't want to hold up my browser request thread, I wrote a separate .js script to do the needful, called the script using child_process.spawn() in my Node.js code and immediately returned OK back to the client. This works fine, but then so does calling the same script (as a local function) by just using a simple setTimeout.
router.post("/longRequest", function(req, res) {
console.log("Started long request with id: " + req.body.id);
var longRunningFunction = function() {
// Usually runs a bunch of things that take time.
// Simulating a 10 sec delay for sample code.
setTimeout(function() {
console.log("Done processing for 10 seconds")
}, 10000);
}
// Below line used to be
// child_process.spawn('longRunningFunction.js'
setTimeout(longRunningFunction, 0);
res.json({status: "OK"})
})
So, this works for my purpose. But what's the downside? I probably can't monitor the offline process as easily as with child_process.spawn, which would give me a process id. But does this cause problems in the long run? Will it hold up Node.js processing if the 10 second processing increases to a lot more in the future?
The actual longRunningFunction is something that reads an Excel file, parses it and does a bulk load using tedious to a MS SQL Server.
var XLSX = require('xlsx');
var FileAPI = require('file-api'), File = FileAPI.File, FileList = FileAPI.FileList, FileReader = FileAPI.FileReader;
var Connection = require('tedious').Connection;
var Request = require('tedious').Request;
var TYPES = require('tedious').TYPES;
var importFile = function() {
var file = new File(fileName);
if (file) {
var reader = new FileReader();
reader.onload = function (evt) {
var data = evt.target.result;
var workbook = XLSX.read(data, {type: 'binary'});
var ws = workbook.Sheets[workbook.SheetNames[0]];
var headerNames = XLSX.utils.sheet_to_json( ws, { header: 1 })[0];
var data = XLSX.utils.sheet_to_json(ws);
var bulkLoad = connection.newBulkLoad(tableName, function (error, rowCount) {
if (error) {
console.log("bulk upload error: " + error);
} else {
console.log('inserted %d rows', rowCount);
}
connection.close();
});
// setup your columns - always indicate whether the column is nullable
Object.keys(columnsAndDataTypes).forEach(function(columnName) {
bulkLoad.addColumn(columnName, columnsAndDataTypes[columnName].dataType, { length: columnsAndDataTypes[columnName].len, nullable: true });
})
data.forEach(function(row) {
var addRow = {}
Object.keys(columnsAndDataTypes).forEach(function(columnName) {
addRow[columnName] = row[columnName];
})
bulkLoad.addRow(addRow);
})
// execute
connection.execBulkLoad(bulkLoad);
};
reader.readAsBinaryString(file);
} else {
console.log("No file!!");
}
};
So, this works for my purpose. But what's the downside ?
If you actually have a long running task capable of blocking the event loop, then putting it on a setTimeout() is not stopping it from blocking the event loop at all. That's the downside. It's just moving the event loop blocking from right now until the next tick of the event loop. The event loop will be blocked the same amount of time either way.
If you just did res.json({status: "OK"}) before running your code, you'd get the exact same result.
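The point can be checked with a small experiment. In this sketch, blockFor is a hypothetical busy-wait standing in for CPU-bound work; it is deferred with setTimeout(…, 0), yet a second timer that is due after 10 ms still cannot fire until the blocking work has finished.

```javascript
// Busy-wait that stands in for CPU-bound work (hypothetical; 200 ms here).
function blockFor(ms) {
  const end = Date.now() + ms;
  while (Date.now() < end) {}
}

const started = Date.now();
let tickDelay = null;

// Deferring the work does not make it non-blocking.
setTimeout(() => blockFor(200), 0);

// This timer is due after 10 ms, but it cannot fire while blockFor() is running.
const tickDone = new Promise((resolve) => {
  setTimeout(() => {
    tickDelay = Date.now() - started;
    console.log('10 ms timer fired after ' + tickDelay + ' ms');
    resolve(tickDelay);
  }, 10);
});
```

The reported delay should be roughly 200 ms rather than 10 ms, which is the same stall an incoming HTTP request would experience.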
If your long running code (which you describe as file and database operations) is actually blocking the event loop and it is properly written using async I/O operations, then the only way to stop blocking the event loop is to move that CPU-consuming work out of the node.js thread.
That is typically done by clustering, moving the work to worker processes or moving the work to some other server. You have to have this work done by another process or another server in order to get it out of the way of the event loop. A setTimeout() by itself won't accomplish that.
child_process.spawn() will accomplish that. So, if you have an actual event loop blocking problem to solve and the I/O is already as async optimized as possible, then moving it to a worker process is a typical node.js solution. You can communicate with that child process in a number of ways, but one possibility would be via stdin and stdout.
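As a rough sketch of the child-process route, the helper below starts a separate Node.js process and collects its stdout. runInChild is a hypothetical name, and the inline script passed with -e is only a placeholder for the real import job (for example a longRunningFunction.js file).

```javascript
const { spawn } = require('child_process');

// Runs a script in a separate Node.js process and resolves with its stdout.
// The inline script is a hypothetical placeholder for the real import job.
function runInChild(source) {
  return new Promise((resolve, reject) => {
    const child = spawn(process.execPath, ['-e', source]);
    let output = '';
    child.stdout.on('data', (chunk) => { output += chunk; });
    child.on('error', reject);
    child.on('close', (code) => {
      if (code === 0) resolve(output.trim());
      else reject(new Error('child exited with code ' + code));
    });
  });
}

const childResult = runInChild('console.log("import finished")');
childResult.then((text) => console.log('child says:', text));
```

Because the work runs in another process, the parent's event loop stays free to serve requests, and child.pid is available if the job needs to be monitored or killed.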

How to sleep the thread in node.js without affecting other threads?

As per Understanding the node.js event loop, node.js supports a single thread model. That means if I make multiple requests to a node.js server, it won't spawn a new thread for each request but will execute each request one by one. It means if I do the following for the first request in my node.js code, and meanwhile a new request comes in on node, the second request has to wait until the first request completes, including 5 second sleep time. Right?
var sleep = require('sleep');
sleep.sleep(5)//sleep for 5 seconds
Is there a way that node.js can spawn a new thread for each request so that the second request does not have to wait for the first request to complete, or can I call sleep on specific thread only?
If you are referring to the npm module sleep, it notes in the readme that sleep will block execution. So you are right - it isn't what you want. Instead you want to use setTimeout which is non-blocking. Here is an example:
setTimeout(function() {
console.log('hello world!');
}, 5000);
For anyone looking to do this using ES2017 async/await, this example should help:
const snooze = ms => new Promise(resolve => setTimeout(resolve, ms));
const example = async () => {
console.log('About to snooze without halting the event loop...');
await snooze(1000);
console.log('done!');
};
example();
In case you have a loop with an async request in each one and you want a certain time between each request you can use this code:
var startTimeout = function(timeout, i){
  setTimeout(function() {
    myAsyncFunc(i).then(function(data){
      console.log(data);
    })
  }, timeout);
}
var myFunc = function(){
  var timeout = 0;
  var i = 0;
  while(i < 10){
    // By passing i to a function, each callback gets its own value (0..9) and not always 10
    startTimeout(timeout, i);
    // Increase timeout by 1 sec after each call
    timeout += 1000;
    i++;
  }
}
This example starts one request per second. Note that it does not wait for the previous request to finish before sending the next one.
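If each request should instead finish before the pause for the next one begins, an await-based loop is one option. This is a sketch only: delay and runSequentially are hypothetical helpers, and the myAsyncFunc defined here is a dummy standing in for the real request function used in the answer above.

```javascript
const delay = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Hypothetical async request; stands in for the real myAsyncFunc.
async function myAsyncFunc(i) {
  await delay(5);
  return 'result ' + i;
}

// Runs the requests one at a time, pausing between them.
async function runSequentially(count, pauseMs) {
  const results = [];
  for (let i = 0; i < count; i++) {
    results.push(await myAsyncFunc(i)); // wait for this request to finish
    if (i < count - 1) await delay(pauseMs); // then pause before the next one
  }
  return results;
}

const sequentialResults = runSequentially(3, 10);
sequentialResults.then((r) => console.log(r));
```

The event loop stays free during every await, so other requests to the server are still served while this loop is pausing.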
Please consider the deasync module. Personally, I don't like the Promise way of making all functions async, with the async/await keywords everywhere. And I think official node.js should consider exposing the event loop API; this would solve callback hell simply. Node.js is a framework, not a language.
var node = require("deasync");
node.loop = node.runLoopOnce;
var done = 0;
// async call here
db.query("select * from ticket", (error, results, fields)=>{
done = 1;
});
while (!done)
node.loop();
// Now, here you go
When working with async functions or observables provided by 3rd party libraries, for example Cloud Firestore, I've found the waitFor function shown below (TypeScript, but you get the idea...) to be helpful when you need to wait on some process to complete, but you don't want to embed callbacks within callbacks within callbacks, nor risk an infinite loop.
This method is similar to a while (!condition) sleep loop, but it yields asynchronously and tests the completion condition at regular intervals until it is true or the timeout is reached.
export const sleep = (ms: number) => {
return new Promise(resolve => setTimeout(resolve, ms))
}
/**
* Wait until the condition tested in a function returns true, or until
* a timeout is exceeded.
* @param interval The interval, in milliseconds, at which booleanFunction is called.
* @param timeout The maximum time, in milliseconds, to allow for booleanFunction to return true.
* @param booleanFunction A completion function to evaluate after each interval. waitFor will return true as soon as the completion function returns true.
*/
export const waitFor = async function (interval: number, timeout: number,
booleanFunction: Function): Promise<boolean> {
let elapsed = 1;
if (booleanFunction()) return true;
while (elapsed < timeout) {
elapsed += interval;
await sleep(interval);
if (booleanFunction()) {
return true;
}
}
return false;
}
Say you have a long-running process on your backend that you want to complete before some other task is undertaken. For example, if you have a function that totals a list of accounts, but you want to refresh the accounts from the backend before you calculate, you can do something like this:
async recalcAccountTotals(): Promise<number> {
  this.accountService.refresh(); // start the async process
  // declared outside the if block so the check below can see it;
  // stays true when there is nothing to wait for
  let updateResult = true;
  if (this.accounts.dirty) {
    updateResult = await waitFor(100, 2000, () => !this.accounts.dirty);
  }
  if (!updateResult) {
    console.error("Account refresh timed out, recalc aborted");
    return NaN;
  }
  return ... //calculate the account total.
}
It is quite an old question, and though the accepted answer is still entirely correct, the timers/promises API added in v15 provides a simpler way.
import { setTimeout } from 'timers/promises';
// non blocking wait for 5 secs
await setTimeout(5 * 1000);

Resources