Looping over JSON objects synchronously in Node.js

I'm requesting a JSON API, then doing a forEach loop over a JSON object, and I want a way to know once it's complete.
This is my current forEach loop:
Object.entries(jsonbody.data.users).forEach(([key, value]) => {
  // code to run for each user
});
At the moment it'll just loop through each user until it's completed them all, but I have no definitive maximum number of objects there could be, so I can't just use the usual loop-counting technique.
I'm not sure if I've got this completely wrong, but I've always had trouble working out how to perform synchronous events with Node.js, so hopefully this will teach me.

It depends on whether the "code to run for each user" is synchronous or asynchronous.
There isn't enough detail in your question to give a definitive answer.
If the code in the loop is synchronous, then the loop will complete before any code after it runs.
That is the definition of synchronous code ;)
let count = 0;
Object.entries(jsonbody.data.users).forEach(([key, value]) => {
  // some sync code to run for each user
  count++;
});

// Because the code in your loop is synchronous, we'll only get to this
// part of the code once the code above is complete.
console.log(`The tasks have finished`);
console.log(`count of users=`, count);
If the per-user code is asynchronous then you will need to wait until all tasks have completed before continuing.
It is common to use Promises for this:
// Note the use of `.map()` here instead of the `.forEach()` used in the previous example
const promises = Object.entries(jsonbody.data.users).map(([key, value]) => {
  return new Promise((resolve, reject) => {
    // some async code to run for each user
    // Resolving to `key` here for example only - you might return data from
    // some `fetch()` request or other async task
    resolve(key);
  });
});

// Because the code above is **asynchronous**, we'll reach this point in the code
// before the code above has "finished" (i.e. there are still outstanding tasks).
// We have to wait until those tasks have completed, using either `await` or `promise.then()`
const results = await Promise.all(promises);
console.log(`The tasks have finished. results=`, results);

// OR use `promise.then(...)`
Promise.all(promises).then((results) => {
  console.log(`The tasks have finished. results=`, results);
});
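If some of the per-user tasks can fail and you still want to wait for all of them, rather than rejecting on the first failure, Promise.allSettled() is worth knowing about. A minimal sketch, not part of the original answer, reusing the promises array from above:
const results = await Promise.allSettled(promises);
for (const r of results) {
  // each entry is { status: "fulfilled", value } or { status: "rejected", reason }
  if (r.status === "fulfilled") console.log("ok:", r.value);
  else console.log("failed:", r.reason);
}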

Related

Node Express. Continue function execution after res.render

I need to build the page and then continue executing a function that takes a lot of time. But res.render waits for the function to execute, no matter where it is called.
I want the page to start building without waiting for the data to be processed.
Here is my code:
let promise = new Promise((resolve, reject) => {
  let wb = new Workbook();
  let ws = sheet_from_array_of_arrays(public_data); // May take up to 5 seconds
  wb.SheetNames.push(ws_name);
  wb.Sheets[ws_name] = ws;
  XLSX.writeFile(wb, '/tmp/' + name); // May take up to 10 seconds
  resolve('End of promise!!!');
});

res.render('ihelp/lists/person_selection_view_data', {
  data: public_data,
  name
});

promise.then(answer => { console.log(answer) });
How can I do this?
The code that you want to run looks to be synchronous (I don't see any promises or callbacks related to it). If it takes multiple seconds to run, that will mean that it will block your entire app for that amount of time.
This means that asynchronous functions, like res.render(), will not complete until the processing is done. Even if you change the order:
res.render(...);
long_running_code();
Not only will this not make res.render() send back a response before long_running_code is started, it will also stop your app from responding to new incoming requests (and/or block any current requests) until it's done.
If you have CPU-intensive code that will block the event loop, take a look at worker_threads, which can be used to offload CPU-intensive code to separate threads and therefore keep your main JS thread free to handle the HTTP part of your app.
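As a rough sketch of that idea (the worker file name and message text are assumptions, and the worker file is presumed to have access to the same Workbook / sheet_from_array_of_arrays / ws_name helpers as the question's code):
// main thread, inside the route handler
const { Worker } = require('worker_threads');

res.render('ihelp/lists/person_selection_view_data', { data: public_data, name });

const worker = new Worker('./build-xlsx-worker.js', {
  workerData: { public_data, name }
});
worker.on('message', answer => console.log(answer)); // 'End of worker!!!'
worker.on('error', console.error);

// build-xlsx-worker.js: the CPU-heavy work now runs off the main thread
const { workerData, parentPort } = require('worker_threads');
const { public_data, name } = workerData;

let wb = new Workbook();
let ws = sheet_from_array_of_arrays(public_data); // no longer blocks the event loop
wb.SheetNames.push(ws_name);
wb.Sheets[ws_name] = ws;
XLSX.writeFile(wb, '/tmp/' + name);
parentPort.postMessage('End of worker!!!');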
res.render takes a third argument, a callback. If supplied, Express will not send the rendered HTML string automatically; any extended processing can be executed after that.
res.render('ihelp/lists/person_selection_view_data', { data: public_data, name }, function (err, html) {
  res.send(html);
  let wb = new Workbook();
  let ws = sheet_from_array_of_arrays(public_data); // May take up to 5 seconds
  wb.SheetNames.push(ws_name);
  wb.Sheets[ws_name] = ws;
  XLSX.writeFile(wb, '/tmp/' + name); // May take up to 10 seconds
});
The executor function (inside the Promise constructor) is executed synchronously; if it takes too long, you can move it into another Promise:
res.render('ihelp/lists/person_selection_view_data', {
  data: public_data,
  name
});

Promise.resolve().then(function () {
  new Promise((resolve, reject) => {
    // your executor code
  }).then(answer => { console.log(answer) });
});

Nodejs loop through array of urls in a synchronous way

I've worked with Node for two years now but cannot solve the following requirements:
I have an array of ~50,000 parameters.
I need to loop through the array and make a GET request to the same URL each time, with the parameter added.
I need to write the result of the URL call back to the array.
It needs to be done one by one, as I cannot call the API with several threads.
I'm sure there is a simple solution for this, but everything I tried didn't make the code wait for the GET request to return. I know that doing things synchronously in Node is not the way we should do things, but in this special situation it is by design that the process shall not go on until the result comes back.
Any hint appreciated.
Regards
Use a for loop, use a means of doing the GET request that returns a promise (such as the got() library) and then use await to pause the for loop until your response comes back.
const got = require('got');

const yourArray = [...];

async function run() {
  for (let [index, item] of yourArray.entries()) {
    try {
      let result = await got(item.url);
      // do something with the result, e.g. write it back to the array (requirement 3):
      // yourArray[index].result = result.body;
    } catch (e) {
      // either handle the error here or throw to stop further processing
    }
  }
}

run().then(() => {
  console.log("all done");
}).catch(err => {
  console.log(err);
});
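On Node 18+ the built-in fetch() works with the same await-in-the-loop pattern, with no dependency on got. A sketch under that assumption; the item.url / item.result shape mirrors the example above:
async function run() {
  for (let item of yourArray) {
    try {
      const response = await fetch(item.url); // global fetch, Node 18+
      item.result = await response.text();    // write the result back to the array
    } catch (e) {
      // handle or rethrow
    }
  }
}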

Possible to make an event handler wait until async / Promise-based code is done?

I am using the excellent Papa Parse library in nodejs mode, to stream a large (500 MB) CSV file of over 1 million rows, into a slow persistence API, that can only take one request at a time. The persistence API is based on Promises, but from Papa Parse, I receive each parsed CSV row in a synchronous event like so: parseStream.on("data", row => { ... }
The challenge I am facing is that Papa Parse dumps its CSV rows from the stream so fast that my slow persistence API can't keep up. Because Papa is synchronous and my API is Promise-based, I can't just call await doDirtyWork(row) in the on event handler, because sync and async code doesn't mix.
Or can they mix and I just don't know how?
My question is, can I make Papa's event handler wait for my API call to finish? Kind of doing the persistence API request directly in the on("data") event, making the on() function linger around somehow until the dirty API work is done?
The solution I have so far is not much better than using Papa's non-streaming mode, in terms of memory footprint. I actually need to queue up the torrent of on("data") events, in the form of generator function iterations. I could also have queued up promise factories in an array and worked them off in a loop. Either way, I end up saving almost the entire CSV file as a huge collection of future Promises (promise factories) in memory, until my slow API calls have worked all the way through.
async importCSV(filePath) {
  let parsedNum = 0, processedNum = 0;
  async function* gen() {
    let pf = yield;
    do {
      pf = yield await pf();
    } while (typeof pf === "function");
  }
  var g = gen();
  g.next();
  await new Promise((resolve, reject) => {
    try {
      const dataStream = fs.createReadStream(filePath);
      const parseStream = Papa.parse(Papa.NODE_STREAM_INPUT, { delimiter: ",", header: false });
      dataStream.pipe(parseStream);
      parseStream.on("data", row => {
        // Received a CSV row from Papa.parse()
        try {
          console.log("PA#", parsedNum, ": parsed", row.filter((e, i) => i <= 2 ? e : undefined));
          parsedNum++;
          // Simulate some really slow async/await dirty work here, for example
          // send requests to a one-at-a-time persistence API
          g.next(() => { // don't execute now, call in sequence via the generator above
            return new Promise((res, rej) => {
              console.log("DW#", processedNum, ": dirty work START", row.filter((e, i) => i <= 2 ? e : undefined));
              setTimeout(() => {
                console.log("DW#", processedNum, ": dirty work STOP ", row.filter((e, i) => i <= 2 ? e : undefined));
                processedNum++;
                res();
              }, 1000);
            });
          });
        } catch (err) {
          console.log(err.stack);
          reject(err);
        }
      });
      parseStream.on("finish", () => {
        console.log(`Parsed ${parsedNum} rows`);
        resolve();
      });
    } catch (err) {
      console.log(err.stack);
      reject(err);
    }
  });
  while (!(await g.next()).done);
}
So why the rush Papa? Why not allow me to work down the file a bit slower -- the data in the original CSV file isn't gonna run away, we have hours to finish the streaming, why hammer me with on("data") events that I can't seem to slow down?
So what I really need is for Papa to become more of a grandpa, and minimize or eliminate any queuing or buffering of CSV rows. Ideally I would be able to completely sync Papa's parsing events with the speed (or lack thereof) of my API. So if it weren't for the dogma that async code can't make sync code "sleep", I would ideally send each CSV row to the API inside the Papa event, and only then return control to Papa.
Suggestions? Some kind of "loose coupling" of the event handler with the slowness of my async API is fine too. I don't mind if a few hundred rows get queued up. But when tens of thousands pile up, I will run out of heap fast.
Why hammer me with on("data") events that I can't seem to slow down?
You can; you just were not asking Papa to stop. You can do this by calling stream.pause() and later stream.resume(), to make use of Node streams' built-in back-pressure.
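A minimal sketch of that approach (it assumes a dirtyWork() function like the one in the example below):
parseStream.on("data", row => {
  parseStream.pause();                      // tell Papa to stop emitting rows
  dirtyWork(row)
    .catch(console.error)
    .finally(() => parseStream.resume());   // ask for the next row when done
});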
However, there's a much nicer API to use than dealing with this on your own in callback-based code: use the stream as an async iterator! When you await in the body of a for await loop, the generator has to pause as well. So you can write
async function importCSV(filePath) {
  let parsedNum = 0;
  const dataStream = fs.createReadStream(filePath);
  const parseStream = Papa.parse(Papa.NODE_STREAM_INPUT, { delimiter: ",", header: false });
  dataStream.pipe(parseStream);

  for await (const row of parseStream) {
    // Received a CSV row from Papa.parse()
    const data = row.filter((e, i) => i <= 2 ? e : undefined);
    console.log("PA#", parsedNum, ": parsed", data);
    parsedNum++;
    await dirtyWork(data);
  }
  console.log(`Parsed ${parsedNum} rows`);
}

importCSV('sample.csv').catch(console.error);

let processedNum = 0;

function dirtyWork(data) {
  // Simulate some really slow async/await dirty work here,
  // for example send requests to a one-at-a-time persistence API
  return new Promise((res, rej) => {
    console.log("DW#", processedNum, ": dirty work START", data);
    setTimeout(() => {
      console.log("DW#", processedNum, ": dirty work STOP ", data);
      processedNum++;
      res();
    }, 1000);
  });
}
Async code in JavaScript can sometimes be a little hard to grok. It's important to remember how Node handles concurrency.
The Node process is single-threaded, but it uses a concept called the event loop. The consequence of this is that async code and callbacks are essentially equivalent representations of the same thing.
Of course, you need an async function to use await, but your callback from Papa Parse can be an async function:
parse.on("data", async row => {
await sync(row)
})
Once the await operation completes, the arrow function ends, and all references to row will be eliminated, so the garbage collector can successfully collect row, releasing that memory.
The effect this has is concurrently executing sync every time a row is parsed, so if you can only sync one record at a time, then I would recommend wrapping the sync function in a debouncer.
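If you do need to process one record at a time, one common pattern is a promise-chain queue that serializes the calls (a sketch, not from the original answer; strictly a serializer rather than a classic debouncer):
let queue = Promise.resolve();
parse.on("data", row => {
  // each row's sync() starts only after the previous one settles
  queue = queue.then(() => sync(row)).catch(console.error);
});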

sync/async functions and frozen background loop

I'm trying to develop a lottery game on top of a blockchain.
In server.js, this loop is intended to run "in the background" and gather data from the blockchain every second:
let test = 0;
setTimeout(async function run() {
  test++;
  console.log(test);
  setTimeout(run, 1000);
}, 1000);
In another module, I want to allow the user to log in with a keystore/password:
socket.on("login", data => {
helpers.checkCredentials(data).then(r => {
//console.log(jwt.verify(r, process.env.JWT_PASS));
//console.log(r);
io.emit("login", r);
});
});
which calls checkCredentials:
const int4 = require("int4.js");

module.exports = async function (cred) {
  let account = null;
  let RPC = int4.rpc(process.env.URL_RPC);
  if (cred.keystore !== undefined && cred.password !== undefined) {
    try {
      account = int4.keystore.fromV3Keystore(cred.keystore, cred.password);
      console.log(account);
    } catch (e) {
      return "bad password or keystore";
    }
    (...)
But int4.keystore.fromV3Keystore takes time to decrypt the keystore (approx. 5-7 seconds), and during this time my counter doesn't run; it's frozen.
I think I'm not familiar enough with the practice of async functions, and despite a lot of reading I can't figure out why it freezes.
I tried to use util.promisify on int4.keystore.fromV3Keystore, hoping for some "async acting", but... nope.
Can you help me make this "background" loop independent from the other functions? Or write clean code that would run all of these functions truly asynchronously?
Thanks a lot.
I use the beta lib https://github.com/intfoundation/int4.js for the INTChain testnet keystore decrypt
All of these async functions are running on the same thread. Promises and async/await are a way to get this one thread to run things asynchronously.
A result of this is that if any of these functions takes too long before yielding execution (returning, or awaiting something else), the event loop gets stuck.
What you need is actual multithreading. You need to create a worker thread in which you run the long-running function and get the result back.
Here are some pointers to get started:
https://blog.logrocket.com/node-js-multithreading-what-are-worker-threads-and-why-do-they-matter-48ab102f8b10/
https://blog.logrocket.com/a-complete-guide-to-threads-in-node-js-4fa3898fe74f/
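As a rough sketch, the slow decryption could be moved into a worker like this (the file name and the promise wrapper are assumptions, not part of the int4.js API):
// decrypt-worker.js
const { workerData, parentPort } = require("worker_threads");
const int4 = require("int4.js");

// runs on its own thread, so the main thread's setTimeout counter keeps ticking
const account = int4.keystore.fromV3Keystore(workerData.keystore, workerData.password);
parentPort.postMessage(account);

// in checkCredentials: wrap the worker in a promise
const { Worker } = require("worker_threads");

function decryptInWorker(keystore, password) {
  return new Promise((resolve, reject) => {
    const worker = new Worker("./decrypt-worker.js", { workerData: { keystore, password } });
    worker.once("message", resolve);
    worker.once("error", reject);
  });
}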

How to stop class/functions from continuing to execute code in Node.js

I have asked a few questions about this already, but maybe this question will result in better answers (I'm bad at questions).
I have one class, called FOO, with an async Start function that starts the process the FOO class was made for. This FOO class does a lot of different calculations, as well as posting/getting the calculations using the Node.js "request" module.
(I'm using Electron's UI, by pressing buttons that execute a function etc., to create and start the FOO class.)
class FOO {
  async Start() {
    console.log("Start");
    await this.GetCalculations();
    await this.PostResults();
  }

  async PostResults() {
    // REQUEST STUFF
    const response = { statusCode: 200 }; // Request including this.Cal
    console.log(response);
    // Send with IPC
    // ipc.send("status", response.statusCode)
  }

  async GetCalculations() {
    for (var i = 0; i < 10; i++) {
      await this.GetCalculation();
    }
    console.log(this.Cal);
  }

  async GetCalculation() {
    // REQUEST STUFF
    const response = { body: "This is a calculation" }; // Since the request module can't be used in here.
    if (!this.Cal) this.Cal = [];
    this.Cal.push(response);
  }
}

var F1 = new FOO();
F1.Start();
var F1 = new FOO();
F1.Start();
Now imagine this code but with A LOT more steps and more requests etc., where it might take seconds or minutes to finish all the tasks in the FOO class.
(Electron has a stop button that the user can hit when he wants the calculations to stop.)
How would I go about stopping the entire class from continuing?
In some cases, the user might stop and start again right after, so I have been trying to figure out a way to STOP the code from running entirely, but where the user would still be able to create a new class and start that, without the other class running in the background.
I have been thinking about the "tiny-worker" module, but creating the worker takes 1-2 seconds, which defeats the purpose of a fast calculation program.
Hopefully, this question is better than the other ones.
Update:
Applying the logic behind the different answers I came up with this:
await Promise.race([this.cancelDeferred, new Promise(async (res, rej) => {
  var options = {
    uri: "http://httpstat.us/200?sleep=5000"
  };
  const response = await request(options);
  console.log(response.statusCode);
})]);
But even when
this.cancelDeferred.reject(new Error("User Stop"));
is called, the statusCode from the request still gets printed out when the request finishes.
The answers I got show some good logic that I didn't know about, but the problem is that they all only stop the request; the code handling the request's response will still execute, and in some cases trigger a new request. This means that I have to spam the Stop function until it fully stops everything.
Framing the problem: you have a whole bunch of function calls that make serialized asynchronous operations, and you want the user to be able to hit a Cancel/Stop button and cause the chain of asynchronous operations to abort (i.e. stop doing any more and bail on getting whatever eventual result it was trying to get).
There are several schemes I can think of.
1. Each operation checks some state property. You make these operations all part of some object that has an aborted state property (see the sketch after this list). The code for every single asynchronous operation must check that state property after it completes. The Cancel/Stop button can be hooked up to set this state variable. When the current asynchronous operation finishes, it will abort the rest of the operation. If you are using promises for sequencing your operations (which it appears you are), then you can reject the current promise, causing the whole chain to abort.
2. Create some async wrapper function that incorporates the cancel state for you automatically. If all your actual asynchronous operations are of some small group of operations (such as all using the request module), then you can create a wrapper function around whichever request operations you use that when any operation completes, it checks the state variable for you or merges it into the returned promise and if it has been stopped, it rejects the returned promise which causes the whole promise chain to abort. This has the advantage that you only have to do the if checks in one place and the rest of your code just switches to using your wrapped version of the request function instead of the regular one.
3. Put all the async steps/logic into another process that you can kill. This seems (to me) like using a sledgehammer for a small problem, but you could launch a child_process (which can also be a Node.js program) to do your multi-step async operations, and when the user presses stop/cancel, you just kill the child process. Your code that is monitoring the child_process and waiting for a result will either get a final result or an indication that it was stopped. You probably want to use an actual process here rather than worker threads so you get a full and complete abort, and so all memory and other resources used by that process get properly reclaimed.
Please note that none of these solutions use any sort of infinite loop or polling loop.
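Here is a quick sketch of scheme 1, with hypothetical names (the real class would check the flag between every async step):
class FOO {
  constructor() {
    this.aborted = false; // set to true by the Stop button
  }
  Stop() {
    this.aborted = true;
  }
  async Start() {
    for (var i = 0; i < 10; i++) {
      if (this.aborted) throw new Error("User Stop"); // bail between async steps
      await this.GetCalculation();
    }
  }
  async GetCalculation() { /* request stuff */ }
}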
For example, suppose your actual asynchronous operation was using the request() module.
You could define a high scoped promise that gets rejected if the user clicks the cancel/stop button:
function Deferred() {
  let p = this.promise = new Promise((resolve, reject) => {
    this.resolve = resolve;
    this.reject = reject;
  });
  this.then = this.promise.then.bind(p);
  this.catch = this.promise.catch.bind(p);
  this.finally = this.promise.finally.bind(p);
}

// higher-scoped variable that persists
let cancelDeferred = new Deferred();

// function that gets called when the stop button is hit
function stop() {
  // reject the current deferred, which will cause
  // existing operations to cancel
  cancelDeferred.reject(new Error("User Stop"));
  // put a new deferred in place for future operations
  cancelDeferred = new Deferred();
}

const rp = require('request-promise');

// wrapper around request-promise
function rpWrap(options) {
  return Promise.race([cancelDeferred, rp(options)]);
}
Then you just call rpWrap() everywhere instead of calling rp(), and it will automatically reject if the stop button is hit. You then need to code your asynchronous logic so that if any promise rejects, the whole chain aborts (which is generally the default and automatic behavior for promises anyway).
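Usage then looks something like this (a sketch; the function name is hypothetical, and the URL is reused from the question's update):
async function getCalculation() {
  // rejects either when the request fails or when stop() is called
  const body = await rpWrap({ uri: "http://httpstat.us/200?sleep=5000" });
  console.log(body);
}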
Asynchronous functions do not run code in a separate thread, they just encapsulate an asynchronous control flow in syntactic sugar and return an object that represents its completion state (i.e. pending / resolved / rejected).
The reason for making this distinction is that once you start the control flow by calling the async function, it must continue until completion, or until the first uncaught error.
If you want to be able to cancel it, you must declare a status flag and check it at all (or some) of the sequence points, i.e. before an await expression, and return early (or throw) if the flag is set. There are three ways to do this.
You can provide a cancel() function to the caller which will be able set the status.
You can accept an isCancelled() function from the caller which will return the status, or conditionally throw based on the status.
You can accept a function that returns a Promise which will throw when cancellation is requested, then at each of your sequence points, change await yourAsyncFunction(); to await Promise.race([cancellationPromise, yourAsyncFunction()]);
Below is an example of the last approach.
async function delay(ms, cancellationPromise) {
  return Promise.race([
    cancellationPromise,
    new Promise(resolve => {
      setTimeout(resolve, ms);
    })
  ]);
}

function cancellation() {
  const token = {};
  token.promise = new Promise((_, reject) => {
    token.cancel = () => reject(new Error('cancelled'));
  });
  return token;
}

const myCancellation = cancellation();

delay(500, myCancellation.promise).then(() => {
  console.log('finished');
}).catch(error => {
  console.log(error.message);
});

setTimeout(myCancellation.cancel, Math.random() * 1000);
