Building a node.js CLI application. Users should choose some tasks to run and based on that, tasks should work and then spinners (using ora package) should show success and stop spin.
The issue here is spinner succeed while tasks are still going on. Which means it doesn't wait.
Tried using typical Async/Await as to have an async function and await each function under condition. Didn't work.
Tried using promise.all. Didn't work.
Tried using waterfall. Same.
Here's the code of the task runner, I create an array of functions and pass it to waterfall (Async-waterfall package) or promise.all() method.
const runner = async () => {
let tasks = [];
spinner.start('Running tasks');
if (syncOptions.includes('taskOne')) {
tasks.push(taskOne);
}
if (syncOptions.includes('taskTwo')) {
tasks.push(taskTwo);
}
if (syncOptions.includes('taskThree')) {
tasks.push(taskThree);
}
if (syncOptions.includes('taskFour')) {
tasks.push(taskFour);
}
// Option One
waterfall(tasks, () => {
spinner.succeed('Done');
});
// Option Two
Promise.all(tasks).then(() => {
spinner.succeed('Done');
});
};
Here's an example of one of the functions:
const os = require('os');
const fs = require('fs');
const homedir = os.homedir();
const outputDir = `${homedir}/output`;
const file = `${homedir}/.file`;
const targetFile = `${outputDir}/.file`;
module.exports = async () => {
await fs.writeFileSync(targetFile, fs.readFileSync(file));
};
I tried searching concepts. Talked to the best 5 people I know who can write JS properly. No clue.. What am I doing wrong ?
You don't show us all your code, but the first warning sign is that it doesn't appear you are actually running taskOne(), taskTwo(), etc...
You are pushing what look like functions into an array with code like:
tasks.push(taskFour);
And, then attempting to do:
Promise.all(tasks).then(...)
That won't do anything useful because the tasks themselves are never executed. To use Promise.all(), you need to pass it an array of promises, not an array of functions.
So, you would use:
tasks.push(taskFour());
and then:
Promise.all(tasks).then(...);
And, all this assumes that taskOne(), taskTwo(), etc... are function that return a promise that resolves/rejects when their asynchronous operation is complete.
In addition, you also need to either await Promise.all(...) or return Promise.all() so that the caller will be able to know when they are all done. Since this is the last line of your function, I'd generally just use return Promise.all(...) and this will let the caller get the resolved results from all the tasks (if those are relevant).
Also, this doesn't make much sense:
module.exports = async () => {
await fs.writeFileSync(targetFile, fs.readFileSync(file));
};
You're using two synchronous file operations. They are not asynchronous and do not use promises so there's no reason to put them in an async function or to use await with them. You're mixing two models incorrectly. If you want them to be synchronous, then you can just do this:
module.exports = () => {
fs.writeFileSync(targetFile, fs.readFileSync(file));
};
If you want them to be asynchronous and return a promise, then you can do this:
module.exports = async () => {
return fs.promises.writeFile(targetFile, await fs.promises.readFile(file));
};
Your implementation was attempting to be half and half. Pick one architecture or the other (synchronous or asynchronous) and be consistent in the implementation.
FYI, the fs module now has multiple versions of fs.copyFile() so you could also use that and let it do the copying for you. If this file was large, copyFile() would likely use less memory in doing so.
As for your use of waterfall(), it is probably not necessary here and waterfall uses a very different calling model than Promise.all() so you certainly can't use the same model with Promise.all() as you do with waterfall(). Also, waterfall() runs your functions in sequence (one after the other) and you pass it an array of functions that have their own calling convention.
So, assuming that taskOne, taskTwo, etc... are functions that return a promise that resolve/reject when their asynchronous operations are done, then you would do this:
const runner = () => {
let tasks = [];
spinner.start('Running tasks');
if (syncOptions.includes('taskOne')) {
tasks.push(taskOne());
}
if (syncOptions.includes('taskTwo')) {
tasks.push(taskTwo());
}
if (syncOptions.includes('taskThree')) {
tasks.push(taskThree());
}
if (syncOptions.includes('taskFour')) {
tasks.push(taskFour());
}
return Promise.all(tasks).then(() => {
spinner.succeed('Done');
});
};
This would run the tasks in parallel.
If you want to run the tasks in sequence (one after the other), then you would do this:
const runner = async () => {
spinner.start('Running tasks');
if (syncOptions.includes('taskOne')) {
await taskOne();
}
if (syncOptions.includes('taskTwo')) {
await taskTwo();
}
if (syncOptions.includes('taskThree')) {
await taskThree();
}
if (syncOptions.includes('taskFour')) {
await taskFour();
}
spinner.succeed('Done');
};
Related
I have a function with multiple forEach loops:
async insertKpbDocument(jsonFile) {
jsonFile.doc.annotations.forEach((annotation) => {
annotation.entities.forEach(async (entity) => {
await this.addVertex(entity);
});
annotation.relations.forEach(async (relation) => {
await this.addRelation(relation);
});
});
return jsonFile;
}
I need to make sure that the async code in the forEach loop calling the this.addVertex function is really done before executing the next one.
But when I log variables, It seems that the this.addRelation function is called before the first loop is really over.
So I tried adding await terms before every loops like so :
await jsonFile.doc.annotations.forEach(async (annotation) => {
await annotation.entities.forEach(async (entity) => {
await this.addVertex(entity);
});
await annotation.relations.forEach(async (relation) => {
await this.addRelation(relation);
});
});
But same behavior.
Maybe it is the log function that have a latency? Any ideas?
As we've discussed, await does not pause a .forEach() loop and does not make the 2nd item of the iteration wait for the first item to be processed. So, if you're really trying to do asynchronous sequencing of items, you can't really accomplish it with a .forEach() loop.
For this type of problem, async/await works really well with a plain for loop because they do pause the execution of the actual for statement to give you sequencing of asynchronous operations which it appears is what you want. Plus, it even works with nested for loops because they are all in the same function scope:
To show you how much simpler this can be using for/of and await, it could be done like this:
async insertKpbDocument(jsonFile) {
for (let annotation of jsonFile.doc.annotations) {
for (let entity of annotation.entities) {
await this.addVertex(entity);
}
for (let relation of annotation.relations) {
await this.addRelation(relation);
}
}
return jsonFile;
}
You get to write synchronous-like code that is actually sequencing asynchronous operations.
If you are really avoiding any for loop, and your real requirement is only that all calls to addVertex() come before any calls to addRelation(), then you can do this where you use .map() instead of .forEach() and you collect an array of promises that you then use Promise.all() to wait on the whole array of promises:
insertKpbDocument(jsonFile) {
return Promise.all(jsonFile.doc.annotations.map(async annotation => {
await Promise.all(annotation.entities.map(entity => this.addVertex(entity)));
await Promise.all(annotation.relations.map(relation => this.addRelation(relation)));
})).then(() => jsonFile);
}
To fully understand how this works, this runs all addVertex() calls in parallel for one annotation, waits for them all to finish, then runs all the addRelation() calls in parallel for one annotation, then waits for them all to finish. It runs all the annotations themselves in parallel. So, this isn't very much actual sequencing except within an annotation, but you accepted an answer that has this same sequencing and said it works so I show a little simpler version of this for completeness.
If you really need to sequence each individual addVertex() call so you don't call the next one until the previous one is done and you're still not going to use a for loop, then you can use the .reduce() promise pattern put into a helper function to manually sequence asynchronous access to an array:
// helper function to sequence asynchronous iteration of an array
// fn returns a promise and is passed an array item as an argument
function sequence(array, fn) {
return array.reduce((p, item) => {
return p.then(() => {
return fn(item);
});
}, Promise.resolve());
}
insertKpbDocument(jsonFile) {
return sequence(jsonFile.doc.annotations, async (annotation) => {
await sequence(annotation.entities, entity => this.addVertex(entity));
await sequence(annotation.relations, relation => this.addRelation(relation));
}).then(() => jsonFile);
}
This will completely sequence everything. It will do this type of order:
addVertex(annotation1)
addRelation(relation1);
addVertex(annotation2)
addRelation(relation2);
....
addVertex(annotationN);
addRelation(relationN);
where it waits for each operation to finish before going onto the next one.
foreach will return void so awaiting it will not do much. You can use map to return all the promises you create now in the forEach, and use Promise.all to await all:
async insertKpbDocument(jsonFile: { doc: { annotations: Array<{ entities: Array<{}>, relations: Array<{}> }> } }) {
await Promise.all(jsonFile.doc.annotations.map(async(annotation) => {
await Promise.all(annotation.entities.map(async (entity) => {
await this.addVertex(entity);
}));
await Promise.all(annotation.relations.map(async (relation) => {
await this.addRelation(relation);
}));
}));
return jsonFile;
}
I understand you can run all the addVertex concurrently. Combining reduce with map splitted into two different set of promises you can do it. My idea:
const first = jsonFile.doc.annotations.reduce((acc, annotation) => {
acc = acc.concat(annotation.entities.map(this.addVertex));
return acc;
}, []);
await Promise.all(first);
const second = jsonFile.doc.annotations.reduce((acc, annotation) => {
acc = acc.concat(annotation.relations.map(this.addRelation));
return acc;
}, []);
await Promise.all(second);
You have more loops, but it does what you need I think
forEach executes the callback against each element in the array and does not wait for anything. Using await is basically sugar for writing promise.then() and nesting everything that follows in the then() callback. But forEach doesn't return a promise, so await arr.forEach() is meaningless. The only reason it isn't a compile error is because the async/await spec says you can await anything, and if it isn't a promise you just get its value... forEach just gives you void.
If you want something to happen in sequence you can await in a for loop:
for (let i = 0; i < jsonFile.doc.annotations.length; i++) {
const annotation = jsonFile.doc.annotations[i];
for (let j = 0; j < annotation.entities.length; j++) {
const entity = annotation.entities[j];
await this.addVertex(entity);
});
// code here executes after all vertix have been added in order
Edit: While typing this a couple other answers and comments happened... you don't want to use a for loop, you can use Promise.all but there's still maybe some confusion, so I'll leave the above explanation in case it helps.
async/await does not within forEach.
A simple solution: Replace .forEach() with for(.. of ..) instead.
Details in this similar question.
If no-iterator linting rule is enabled, you will get a linting warning/error for using for(.. of ..). There are lots of discussion/opinions on this topic.
IMHO, this is a scenario where we can suppress the warning with eslint-disable-next-line or for the method/class.
Example:
const insertKpbDocument = async (jsonFile) => {
// eslint-disable-next-line no-iterator
for (let entity of annotation.entities) {
await this.addVertex(entity)
}
// eslint-disable-next-line no-iterator
for (let relation of annotation.relations) {
await this.addRelation(relation)
}
return jsonFile
}
The code is very readable and works as expected. To get similar functionality with .forEach(), we need some promises/observables acrobatics that i think is a waste of effort.
I was able to implement this, but I'm not able to explain why the process will exit with code 0 if I don't have another async function (processAfterDeclaration) always trying to pull from the Observable, recordObservable.
setup to run the source file
npm init -y
npm i rxjs#7.4.0 byline
source file that does what I want, but in a confusing way
// node.js 14
const fs = require('fs');
const pipeline = require('util').promisify(require('stream').pipeline);
const byline = require('byline');
const { Observable } = require('rxjs');
const { take } = require('rxjs/operators');
const sleep = ms => new Promise(r => setTimeout(r, ms));
let recordObservable;
(async () => {
const inputFilePath = 'temp.csv';
try {
const data = 'a,b,c\n' +
'1,2,3\n' +
'10,20,30\n' +
'100,200,300';
fs.writeFileSync(inputFilePath, data);
console.log('starting pipeline');
// remove this line, and the `await pipeline` resolves, but process exits early?
processAfterDeclaration().catch(console.error);
await pipeline(
fs.createReadStream(inputFilePath),
byline.createStream(),
async function* (sourceStream) {
console.log('making observable', inputFilePath);
recordObservable = new Observable(async subscriber => {
for await (const lineBuffer of sourceStream) {
subscriber.next(lineBuffer.toString());
}
subscriber.complete();
});
console.log('made observable', recordObservable);
}
);
console.log('pipeline done', recordObservable);
} catch (error) {
console.error(error);
} finally {
fs.unlinkSync(inputFilePath);
}
})();
async function processAfterDeclaration() {
while (!recordObservable) {
await sleep(100);
}
console.log('can process');
await recordObservable
.pipe(take(2))
.subscribe(console.log)
}
edit: It may be better to just forgo node.js stream.pipeline. I'd think using pipeline is best bc it should be the most efficient and offers backpressuring, but I want to test some things offered by RxJS.
edit2: More reasons to be able to forgo stream.pipeline is that I can still use pipe methods and provide any readable stream to the from function as an arg. I can then use subscribe method to write/append each thing from the observable to my output stream then call add on my subscription to add teardown logic, specifically for closing my write stream. I would hope that RxJS from would help determine when to close the read stream it's given as input. Finally, I would recommend await lastValueFrom(myObservable) or firstValueFrom possibly.
RxJS from operator
The RxJS from operator will turn an async iterator (like node stream) into an observable for you!
I can't run/test on your code, but something in this ballpark should work.
const fs = require('fs');
const byline = require('byline');
const { from } = require('rxjs');
const { map, take, finalize} = require('rxjs/operators');
const inputFilePath = 'temp.csv';
(async () => {
const data = 'a,b,c\n' +
'1,2,3\n' +
'10,20,30\n' +
'100,200,300';
fs.writeFileSync(inputFilePath, data);
console.log('starting pipeline');
from(byline(fs.createReadStream(inputFilePath)))
.pipe(
map(lineBuffer => lineBuffer.toString()),
take(2),
finalize(() => fs.unlinkSync(inputFilePath))
)
.subscribe(console.log);
})();
Your second async function
I'm not able to explain why the process will exit with code 0 if I don't have another async function (processAfterDeclaration) always trying to pull from the Observable
If you define a function and never call it, that function will never calculate anything.
If you define an observable and never subscribe to it, that observable will never do anything either. That's different from promises which start the moment they're defined. You just need to subscribe to that observable, it doesn't need a separate function though.
This should work the same:
recordObservable = new Observable(async subscriber => {
for await (const lineBuffer of sourceStream) {
subscriber.next(lineBuffer.toString());
}
subscriber.complete();
});
recordObservable.pipe(
take(2)
).subscribe(console.log)
The second async function
I'm not able to explain why the process will exit with code 0 if I
don't have another async function (processAfterDeclaration) always
trying to pull from the Observable
The logic error was that the await pipeline would never resolve or reject because the 3rd step in the pipeline would never yield anything because nothing would ever subscribe and pull from the recordObservable. It was a deadlock that was written accidentally.
I want to force Node to wait for the promise to complete (either by failure or success). So far, I'm not seeing what I wanted. I'm trying to wait for two web services to complete before merging, and especially before the function ends.
I've tried these two approaches below. In both cases, Node refuses to wait.
Approach 1:
function getStrategy() {
// this is a web service that takes a few ms to run,
// but so far I haven't seen any evidence that it bothers.
}
function getConfig() {
// Both strategy and jwt are set to Promises
const strategy = getStrategy();
const jwt = getJwt();
const lodash = require('lodash');
var config = {};
try {
Promise.all([strategy,jwt])
.then(data => { config = lodash.merge(config,data)})
} catch(error) {
console.log(' Print error message');
}
return config;
}
Approach 2:
function getStrategy() {
// this is a web service that takes a few ms to run,
// but so far I haven't seen any evidence that it bothers.
}
async function getConfig() {
// Both strategy and jwt are set to Promises
const strategy = getStrategy();
const jwt = getJwt();
const lodash = require('lodash');
var config = {};
try {
var promiseResult = await Promise.all([strategy, jwt]);
const lodash = require('lodash');
config = lodash.merge(config, strategy[0]);
config = lodash.merge(config, jwt);
} catch (reason) {
console.error('------------------------------------------------------------------------------------');
console.error(reason);
console.error("in getConfig(): Could not fetch strategy or jwt");
console.error('------------------------------------------------------------------------------------');
}
return config;
}
I wish approach 2 worked, but it does not. It will not print any console.log statements after the call to Promise.all. So within that function it does wait. Except that it because I told it to "await", I have to make the function async. That allows Node to say, "oh, it's an async function, I can just go off and do something else and completely ignore the await keyword." It does this by returning to the function calling getConfig().
In the first approach, neither the "then" handler, nor any exceptions are thrown. It just impatiently leaves the function and goes back to the caller.
How do I get the thread that calls the getConfig() function to wait for the result. I mean really wait, not like, partially "await". Or, throw an exception and let me handle that. I'm finding that in Node, as soon as something because asynchronous, I have no idea how to get the await, or the then handler to work.
Updated attempt:
I separated the two service calls to individually control each service. I now have
async function getSynchronousStrategy(isVaultAvailable) {
const secretsVaultReader = require('./src/configuration/secretsVaultReader');
const secretsConfigReader = require('./src/configuration/secretsConfigReader');
const strategy = isVaultAvailable
? secretsVaultReader.getStrategy()
: secretsConfigReader.getStrategy();
await strategy;
console.log('## strategy after await=' + strategy);
return strategy; ''
}
...
const strategy = await getSynchronousStrategy(isVaultAvailable);
console.log('### strategy =' + JSON.stringify(strategy));
...
In the above case, Node sees the await strategy, but then prints
"## strategy after await=[object Promise]"
However, the second await seems to work, and it prints the desired strategy. I'm guessing the promise was eventually settled and it was able to print the result. I don't mind the time, I just want it to wait.
Obviously, it did not wait. It
I would suggest simply using await with both the function calls directly, in this manner it will wait for both the functions to return responses before proceeding further.
I am very new to nodejs and stuck at a place where one function populates an array and the other reads from it.
Is there any simple construct to synchronize this.
Code looks something like Below
let arr = [];
let prod = function() {
arr.push('test');
};
let consume = function() {
process(arr.pop());
};
I did find some complicated ways to do it :(
Thanks alot for any help... ☺️
By synchronizing you probably mean that push on one side of your application should trigger pop on the other. That can be achieved with not-so-trivial event-driven approach, using the NodeJS Events module.
However, in simple case you could try another approach with intermediary object that does the encapsulation of array operations and utilizes the provided callbacks to achieve observable behavior.
// Using the Modular pattern to make some processor
// which has 2 public methods and private array storage
const processor = () => {
const storage = [];
// Consume takes value and another function
// that is the passed to the produce method
const consume = (value, cb) => {
if (value) {
storage.push(value);
produce(cb);
}
};
// Pops the value from storage and
// passes it to a callback function
const produce = (cb) => {
cb(storage.pop());
};
return { consume, produce };
};
// Usage
processor().consume(13, (value) => {
console.log(value);
});
This is really a noop example, but I think that this should create a basic understanding how to build "synchronization" mechanism you've mentioned, using observer behavior and essential JavaScript callbacks.
You can use callback to share data between two functions
function prod(array) {
array.push('test1')
}
function consume() {
prod(function (array) {
console.log(array)
})
}
For example, I am writing a random generator with crypto.randomBytes(...) along with another async functions. To avoiding fall in callback hell, I though I could use the sync function of crypto.randomBytes. My doubt is if I do that my node program will stop each time I execute the code?. Then I thought if there are a list of async functions which their time to run is very short, these could work as synchronous function, then developing with this list of functions would be easy.
Using the mz module you can make crypto.randomBytes() return a promise. Using await (available in Node 7.x using the --harmony flag) you can use it like this:
let crypto = require('mz/crypto');
async function x() {
let bytes = await crypto.randomBytes(4);
console.log(bytes);
}
x();
The above is nonblocking even though it looks like it's blocking.
For a better demonstration consider this example:
function timeout(time) {
return new Promise(res => setTimeout(res, time));
}
async function x() {
for (let i = 0; i < 10; i++) {
console.log('x', i);
await timeout(2000);
}
}
async function y() {
for (let i = 0; i < 10; i++) {
console.log('y', i);
await timeout(3000);
}
}
x();
y();
And note that those two functions take a lot of time to execute but they don't block each other.
Run it with Node 7.x using:
node --harmony script-name.js
Or with Node 8.x with:
node script-name.js
I show you those examples to demonstrate that it's not a choice of async with callback hell and sync with nice code. You can actually run async code in a very elegant manner using the new async function and await operator available in ES2017 - it's good to read about it because not a lot of people know about those features.
They're asynchronous, learn to deal with it.
Promises now, and in the future ES2017's await and async will make your life a lot easier.
Bluebirds promisifyAll is extremely useful when dealing with any standard Node.js callback API. It adds functions tagged with Async that return a promise instead of requiring a callback.
const Promise = require('bluebird')
const crypto = Promise.promisifyAll(require('crypto'))
function randomString() {
return crypto.randomBytesAsync(4).then(bytes => {
console.log('got bytes', bytes)
return bytes.toString('hex')
})
}
randomString()
.then(string => console.log('string is', string))
.catch(error => console.error(error))