wait for writestream to finish before executing next function

wait for writestream to finish before executing next function - node.js

I have two functions.
The first function reads all the files in a folder and writes their data to a new file.
The second function takes that new file (output of function 1) as input and creates another file. Therefore it has to wait until the write stream of function 1 has finished.
const fs = require('fs');
const path = require('path');
function f1(inputDir, outputFile) {
let stream = fs.createWriteStream(outputFile, {flags:'a'}); // new data should be appended to outputFile piece by piece (hence flag a)
let files = await fs.promises.readdir(inputDir);
for(let file of files) {
let pathOfCurrentFile = path.join(inputDir, file);
let stat = fs.statSync(pathOfCurrentFile);
if(stat.isFile()) {
data = await fs.readFileSync(pathOfCurrentFile, 'utf8');
// now the data is being modified for output
let result = data + 'other stuff';
stream.write(result);
}
}
stream.end();
}
function f2(inputFile, outputFile) {
let newData = doStuffWithMy(inputFile);
let stream = fs.createWriteStream(outputFile);
stream.write(newData);
stream.end();
}
f1('myFiles', 'myNewFile.txt');
f2('myNewFile.txt', 'myNewestFile.txt');
Here's what happens:
'myNewFile.txt' (output of f1) is created correctly
'myNewestFile.txt' is created but is either empty or only contains one or two words (it should contain a long text)
When I use a timeout before executing f2, it works fine, but I can't use a timeout because there can be thousands of input files in the inputDir, therefore I need a way to do it dynamically.
I've experimented with async/await, callbacks, promises etc. but that stuff seems to be a little to advanced for me, I couldn't get it to work.
Is there anything else I can try?

Since you asked about a synchronous version, here's what that could look like. This should only be used in a single user script or in startup code, not in a running server. A server should only use asynchronous file I/O.
// synchronous version
function f1(inputDir, outputFile) {
let outputHandle = fs.openSync(outputFile, "a");
try {
let files = fs.readdirSync(inputDir, {withFileTypes: true});
for (let f of files) {
if (f.isFile()) {
let pathOfCurrentFile = path.join(inputDir, f.name);
let data = fs.readFileSync(pathOfCurrentFile, 'utf8');
fs.writeSync(outputHandle, data);
}
}
} finally {
fs.closeSync(outputHandle);
}
}
function f2(inputFile, outputFile) {
let newData = doStuffWithMy(inputFile);
fs.writeFileSync(outputFile, newData);
}
f1('myFiles', 'myNewFile.txt');
f2('myNewFile.txt', 'myNewestFile.txt');

Related

Dynamic Slash Command Options List via Database Query?

Background:
I am building a discord bot that operates as a Dungeons & Dragons DM of sorts. We want to store game data in a database and during the execution of certain commands, query data from said database for use in the game.
All of the connections between our Discord server, our VPS, and the VPS' backend are functional and we are now implementing slash commands since traditional ! commands are being removed from support in April.
We are running into problems making the slash commands though. We want to set them up to be as efficient as possible which means no hard-coded choices for options. We want to build those choice lists via data from the database.
The problem we are running into is that we can't figure out the proper way to implement the fetch to the database within the SlashCommandBuilder.
Here is what we currently have:
const {SlashCommandBuilder} = require('#discordjs/builders');
const fetch = require('node-fetch');
const {REST} = require('#discordjs/rest');
const test = require('../commonFunctions/test.js');
var options = async function getOptions(){
let x = await test.getClasses();
console.log(x);
return ['test','test2'];
}
module.exports = {
data: new SlashCommandBuilder()
.setName('get-test-data')
.setDescription('Return Class and Race data from database')
.addStringOption(option =>{
option.setName('class')
.setDescription('Select a class for your character')
.setRequired(true)
for(let op of options()){
//option.addChoice(op,op);
}
return option
}
),
async execute(interaction){
},
};
This code produces the following error when start the npm for our bot on our server:
options is not a function or its return value is not iterable
I thought that maybe the function wasn't properly defined, so I replaced the contents of it with just a simple array return and the npm started without errors and the values I had passed showed up in the server.
This leads me to think that the function call in the modules.exports block is immediatly attempting to get the return value of the function and as the function is async, it isn't yet ready and is either returning undefined or a promise or something else not iteratable.
Is there a proper way to implement the code as shown? Or is this way too complex for discord.js to handle?
Is there a proper way to implement the idea at all? Like creating a json object that contains the option data which is built and saved to a file at some point prior to this command being registered and then having the code above just pull in that file for the option choices?

Alright, I found a way. Ian Malcom would be proud (LMAO).
Here is what I had to do for those with a similar issues:
I had to basically re-write our entire application. It sucks, I know, but it works so who cares?
When you run your index file for your npm, make sure that you do the following things.
Note: you can structure this however you want, this is just how I set up my js files.
Setup a function that will setup the data you need, it needs to be an async function as does everything downstream from this point on relating to the creation and registration of the slash commands.
Create a js file to act as your application setup "module". "Module" because we're faking a real module by just using the module.exports method. No package.jsons needed.
In the setup file, you will need two requires. The first is a, as of yet, non-existent data manager file; we'll do that next. The second is a require for node:fs.
Create an async function in your setup file called setup and add it to your module.exports like so:
module.exports = { setup }
In your async setup function or in a function that it calls, make a call to the function in your still as of yet non-existent data manager file. Use await so that the application doesn't proceed until something is returned. Here is what mine looks like, note that I am writing my data to a file to read in later because of my use case, you may or may not have to do the same for yours:
async function setup(){
console.log('test');
//build option choice lists
let listsBuilt = await buildChoiceLists();
if (listsBuilt){
return true;
} else {
return false;
}
}
async function buildChoiceLists(){
let classListBuilt = await buildClassList();
return true;
}
async function buildClassList(){
let classData = await classDataManager.getClassData();
console.log(classData);
classList = classData;
await writeFiles();
return true;
}
async function writeFiles(){
fs.writeFileSync('./CommandData/classList.json', JSON.stringify(classList));
}
Before we finish off this file, if you want to store anything as a property in this file and then get it later on, you can do so. In order for the data to return properly though, you will need to define a getter function in your exports. Here is an example:
var classList;
module.exports={
getClassList: () => classList,
setup
};
So, with everything above you should have something that looks like this:
const classDataManager = require('./DataManagers/ClassData.js')
const fs = require('node:fs');
var classList;
async function setup(){
console.log('test');
//build option choice lists
let listsBuilt = await buildChoiceLists();
if (listsBuilt){
return true;
} else {
return false;
}
}
async function buildChoiceLists(){
let classListBuilt = await buildClassList();
return true;
}
async function buildClassList(){
let classData = await classDataManager.getClassData();
console.log(classData);
classList = classData;
await writeFiles();
return true;
}
async function writeFiles(){
fs.writeFileSync('./CommandData/classList.json', JSON.stringify(classList));
}
module.exports={
getClassList: () => classList,
setup
};
Next that pesky non-existent DataManager file. For mine, each data type will have its own, but you might want to just combine them all into a single .js file for yours.
Same with the folder name, I called mine DataManagers, if you're combining them all into one, you could just call the file DataManager and leave it in the same folder as your appSetup.js file.
For the data manager file all we really need is a function to get our data and then return it in the format we want it to be in. I am using node-fetch. If you are using some other module for data requests, write your code as needed.
Instead of explaining everything, here is the contents of my file, not much has to be explained here:
const fetch = require('node-fetch');
async function getClassData(){
return new Promise((resolve) => {
let data = "action=GetTestData";
fetch('http://xxx.xxx.xxx.xx/backend/characterHandler.php', {
method: 'post',
headers: { 'Content-Type':'application/x-www-form-urlencoded'},
body: data
}).then(response => {
response.json().then(res => {
let status = res.status;
let clsData = res.classes;
let rcData = res.races;
if (status == "Success"){
let text = '';
let classes = [];
let races = [];
if (Object.keys(clsData).length > 0){
for (let key of Object.keys(clsData)){
let cls = clsData[key];
classes.push({
"name": key,
"code": key.toLowerCase()
});
}
}
if (Object.keys(rcData).length > 0){
for (let key of Object.keys(rcData)){
let rc = rcData[key];
races.push({
"name": key,
"desc": rc.Desc
});
}
}
resolve(classes);
}
});
});
});
}
module.exports = {
getClassData
};
This file contacts our backend php and requests data from it. It queries the data then returns it. Then we format it into an JSON structure for use later on with option choices for the slash command.
Once all of your appSetup and data manager files are complete, we still need to create the commands and register them with the server. So, in your index file add something similar to the following:
async function getCommands(){
let cmds = await comCreator.appSetup();
console.log(cmds);
client.commands = cmds;
}
getCommands();
This should go at or near the top of your index.js file. Note that comCreator refers to a file we haven't created yet; you can name this require const whatever you wish. That's it for this file.
Now, the "comCreator" file. I named mine deploy-commands.js, but you can name it whatever. Once again, here is the full file contents. I will explain anything that needs to be explained after:
const {Collection} = require('discord.js');
const {REST} = require('#discordjs/rest');
const {Routes} = require('discord-api-types/v9');
const app = require('./appSetup.js');
const fs = require('node:fs');
const config = require('./config.json');
async function appSetup(){
console.log('test2');
let setupDone = await app.setup();
console.log(setupDone);
console.log(app.getClassList());
return new Promise((resolve) => {
const cmds = [];
const cmdFiles = fs.readdirSync('./commands').filter(f => f.endsWith('.js'));
for (let file of cmdFiles){
let cmd = require('./commands/' + file);
console.log(file + ' added to commands!');
cmds.push(cmd.data.toJSON());
}
const rest = new REST({version: '9'}).setToken(config.token);
rest.put(Routes.applicationGuildCommands(config.clientId, config.guildId), {body: cmds})
.then(() => console.log('Successfully registered application commands.'))
.catch(console.error);
let commands = new Collection();
for (let file of cmdFiles){
let cmd = require('./commands/' + file);
commands.set(cmd.data.name, cmd);
}
resolve(commands);
});
}
module.exports = {
appSetup
};
Most of this is boiler plate for slash command creation though I did combine the creation and registering of the commands into the same process. As you can see, we are grabbing our command files, processing them into a collection, registering that collection, and then resolving the promise with that variable.
You might have noticed that property, was used to then set the client commands in the index.js file.
Config just contains your connection details for your discord server app.
Finally, how I accessed the data we wrote for the SlashCommandBuilder:
data: new SlashCommandBuilder()
.setName('get-test-data')
.setDescription('Return Class and Race data from database')
.addStringOption(option =>{
option.setName('class')
.setDescription('Select a class for your character')
.setRequired(true)
let ops = [];
let data = fs.readFileSync('./CommandData/classList.json','utf-8');
ops = JSON.parse(data);
console.log('test data class options: ' + ops);
for(let op of ops){
option.addChoice(op.name,op.code);
}
return option
}
),
Hopefully this helps someone in the future!

Can nodejs stdout be written to such that the receiving nodejs stdin `on-data` returns individual segments?

I have a nodejs script generating a set of objects in a more secure environment and then piping the result to the long-running application, I was hoping this could also provide a way to pass iterable objects by having the stringification/parsing be done by the js and leave the structure transparent on my first script.
Right now I've been trying to create a readable stream from my iterable, or directly write each section with process.stdout.write() but in both instances, receiving it with process.stdin.on('data', (chunk) => do something with chunk always results in the all the strings combined.
let env = {
variableString: "somekey",
someIterable: {
someOtherValue: ['more values']
}
}
let streamArr = [];
for (let x in env) {
let y = env[x].replace(/ {4}/g ,'')
streamArr.push(JSON.stringify({[x]:y}))
}
// const readable = new Stream.Readable()
// readable.pipe(process.stdout)
streamArr.forEach(item => process.stdout.write(item))
// Receiving script
process.stdin.setEncoding('utf8');
process.stdin.on('data', (chunk) => {
console.log(chunk)
let i = JSON.parse(chunk);
for (let x in i) {
if(typeof i[x] == 'string') process.env[x] = i[x];
console.log(x)
}
});
process.stdin.on('end', () => {
console.log('end')
});

Ended sort of sidestepping this issue, since my data was small enough I just wrote the whole object to STDOUT using
process.stdout.write(JSON.stringify(env))
And then using the fs module to receive STDIN since it's the only way I found to synchronously consume the STDIN to initialize environment variables before anything else runs.
const fs = require('fs')
const env = (function () {
let envTmp = fs.readFileSync(0).toString();
envTmp = JSON.parse(envTmp);
return envTmp;
})();

Asynchronous file read reading different number of lines each time, not halting

I built a simple asynchronous implementation of the readlines module built into nodejs, which is simply a wrapper around the event-based module itself. The code is below;
const readline = require('readline');
module.exports = {
createInterface: args => {
let self = {
interface: readline.createInterface(args),
readLine: () => new Promise((succ, fail) => {
if (self.interface === null) {
succ(null);
} else {
self.interface.once('line', succ);
}
}),
hasLine: () => self.interface !== null
};
self.interface.on('close', () => {
self.interface = null;
});
return self;
}
}
Ideally, I would use it like so, in code like this;
const readline = require("./async-readline");
let filename = "bar.txt";
let linereader = readline.createInterface({
input: fs.createReadStream(filename)
});
let lines = 0;
while (linereader.hasLine()) {
let line = await linereader.readLine();
lines++;
console.log(lines);
}
console.log("Finished");
However, i've observed some erratic and unexpected behavior with this async wrapper. For one, it fails to recognize when the file ends, and simply hangs once it reaches the last line, never printing "Finished". And on top of that, when the input file is large, say a couple thousand lines, it's always off by a few lines and doesn't successfully read the full file before halting. in a 2000+ line file it could be off by as many as 20-40 lines. If I throw a print statement into the .on('close' listener, I see that it does trigger; however, the program still doesn't recognize that it should no longer have lines to read.

It seems that in nodejs v11.7, the readline interface was given async iterator functionality and can simply be looped through with a for await ... of loop;
const rl = readline.createInterface({
input: fs.createReadStream(filename);
});
for await (const line of rl) {
console.log(line)
}
How to get synchronous readline, or "simulate" it using async, in nodejs?

Node - Readable stream pipe() overwrite previous streams in a for loop

I am trying to stream a data collection to multiple files with the code below:
for (var key in data) {
// skip if collection length is 0
if (data[key].length > 0) {
// Use the key and jobId to open file for appending
let filePath = folderPath + '/' + key + '_' + jobId + '.txt';
// Using stream to append the data output to file, which should perform better when file gets big
let rs = new Readable();
let n = data[key].length;
let i = 0;
rs._read = function () {
rs.push(data[key][i++]);
if (i === n) {
rs.push(null);
}
};
rs.pipe(fs.createWriteStream(filePath, {flags: 'a', encoding: 'utf-8'}));
}
}
However, I end up getting all files being populated with the same data, which is the array for the last key in data object. It seems the reader stream is overridden for each loop, and the pipe() to writable stream doesn't start until the for loop is finished. How is that possible?

So the reason why you code is probably not working is because rs._read method is called asynchronically, and your key variable is function scoped(because of var keyword).
Every rs stream that you create points to the same variable which is key, at the end of main loop, each of those callbacks will have the same value.
When you change "var" to "let", then in each iteration new key variable will be created and it will solve your problem(_read function will have its own copy of key variable instead of shared one).
If you change it to let it should work.

This is happening because the key you're defining in the loop statement is not block-scoped. This is not a problem at first, but when you create a closure on it inside the rs._read function, all subsequent stream reads are using the last known value, which is the last value of data array.
And while we at it, I can propose a bit of a refactoring to make the code cleaner and more reusable:
const writeStream = (folderPath, index, jobId) => {
const filePath = `${folderPath}/${index}_${jobId}.txt`;
return fs.createWriteStream(filePath, {
flags: 'a', encoding: 'utf-8'
});
}
data.forEach((value, index) => {
const length = value.length;
if (length > 0) {
const rs = new Readable();
const n = length;
let i = 0;
rs._read = () => {
rs.push(value[i++]);
if (i === n) rs.push(null);
}
rs.pipe(writeStream(folderPath, index, jobId));
}
});

Is it possible to register multiple listeners to a child process's stdout data event? [duplicate]

I need to run two commands in series that need to read data from the same stream.
After piping a stream into another the buffer is emptied so i can't read data from that stream again so this doesn't work:
var spawn = require('child_process').spawn;
var fs = require('fs');
var request = require('request');
var inputStream = request('http://placehold.it/640x360');
var identify = spawn('identify',['-']);
inputStream.pipe(identify.stdin);
var chunks = [];
identify.stdout.on('data',function(chunk) {
chunks.push(chunk);
});
identify.stdout.on('end',function() {
var size = getSize(Buffer.concat(chunks)); //width
var convert = spawn('convert',['-','-scale',size * 0.5,'png:-']);
inputStream.pipe(convert.stdin);
convert.stdout.pipe(fs.createWriteStream('half.png'));
});
function getSize(buffer){
return parseInt(buffer.toString().split(' ')[2].split('x')[0]);
}
Request complains about this
Error: You cannot pipe after data has been emitted from the response.
and changing the inputStream to fs.createWriteStream yields the same issue of course.
I don't want to write into a file but reuse in some way the stream that request produces (or any other for that matter).
Is there a way to reuse a readable stream once it finishes piping?
What would be the best way to accomplish something like the above example?

You have to create duplicate of the stream by piping it to two streams. You can create a simple stream with a PassThrough stream, it simply passes the input to the output.
const spawn = require('child_process').spawn;
const PassThrough = require('stream').PassThrough;
const a = spawn('echo', ['hi user']);
const b = new PassThrough();
const c = new PassThrough();
a.stdout.pipe(b);
a.stdout.pipe(c);
let count = 0;
b.on('data', function (chunk) {
count += chunk.length;
});
b.on('end', function () {
console.log(count);
c.pipe(process.stdout);
});
Output:
8
hi user

The first answer only works if streams take roughly the same amount of time to process data. If one takes significantly longer, the faster one will request new data, consequently overwriting the data still being used by the slower one (I had this problem after trying to solve it using a duplicate stream).
The following pattern worked very well for me. It uses a library based on Stream2 streams, Streamz, and Promises to synchronize async streams via a callback. Using the familiar example from the first answer:
spawn = require('child_process').spawn;
pass = require('stream').PassThrough;
streamz = require('streamz').PassThrough;
var Promise = require('bluebird');
a = spawn('echo', ['hi user']);
b = new pass;
c = new pass;
a.stdout.pipe(streamz(combineStreamOperations));
function combineStreamOperations(data, next){
Promise.join(b, c, function(b, c){ //perform n operations on the same data
next(); //request more
}
count = 0;
b.on('data', function(chunk) { count += chunk.length; });
b.on('end', function() { console.log(count); c.pipe(process.stdout); });

You can use this small npm package I created:
readable-stream-clone
With this you can reuse readable streams as many times as you need

For general problem, the following code works fine
var PassThrough = require('stream').PassThrough
a=PassThrough()
b1=PassThrough()
b2=PassThrough()
a.pipe(b1)
a.pipe(b2)
b1.on('data', function(data) {
console.log('b1:', data.toString())
})
b2.on('data', function(data) {
console.log('b2:', data.toString())
})
a.write('text')

I have a different solution to write to two streams simultaneously, naturally, the time to write will be the addition of the two times, but I use it to respond to a download request, where I want to keep a copy of the downloaded file on my server (actually I use a S3 backup, so I cache the most used files locally to avoid multiple file transfers)
/**
* A utility class made to write to a file while answering a file download request
*/
class TwoOutputStreams {
constructor(streamOne, streamTwo) {
this.streamOne = streamOne
this.streamTwo = streamTwo
}
setHeader(header, value) {
if (this.streamOne.setHeader)
this.streamOne.setHeader(header, value)
if (this.streamTwo.setHeader)
this.streamTwo.setHeader(header, value)
}
write(chunk) {
this.streamOne.write(chunk)
this.streamTwo.write(chunk)
}
end() {
this.streamOne.end()
this.streamTwo.end()
}
}
You can then use this as a regular OutputStream
const twoStreamsOut = new TwoOutputStreams(fileOut, responseStream)
and pass it to to your method as if it was a response or a fileOutputStream

If you have async operations on the PassThrough streams, the answers posted here won't work.
A solution that works for async operations includes buffering the stream content and then creating streams from the buffered result.
To buffer the result you can use concat-stream
const Promise = require('bluebird');
const concat = require('concat-stream');
const getBuffer = function(stream){
return new Promise(function(resolve, reject){
var gotBuffer = function(buffer){
resolve(buffer);
}
var concatStream = concat(gotBuffer);
stream.on('error', reject);
stream.pipe(concatStream);
});
}
To create streams from the buffer you can use:
const { Readable } = require('stream');
const getBufferStream = function(buffer){
const stream = new Readable();
stream.push(buffer);
stream.push(null);
return Promise.resolve(stream);
}

What about piping into two or more streams not at the same time ?
For example :
var PassThrough = require('stream').PassThrough;
var mybiraryStream = stream.start(); //never ending audio stream
var file1 = fs.createWriteStream('file1.wav',{encoding:'binary'})
var file2 = fs.createWriteStream('file2.wav',{encoding:'binary'})
var mypass = PassThrough
mybinaryStream.pipe(mypass)
mypass.pipe(file1)
setTimeout(function(){
mypass.pipe(file2);
},2000)
The above code does not produce any errors but the file2 is empty

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string

wait for writestream to finish before executing next function - node.js

Related

Dynamic Slash Command Options List via Database Query?

Can nodejs stdout be written to such that the receiving nodejs stdin `on-data` returns individual segments?

Asynchronous file read reading different number of lines each time, not halting

Node - Readable stream pipe() overwrite previous streams in a for loop

Is it possible to register multiple listeners to a child process's stdout data event? [duplicate]

Categories

Resources