Reading a csv file async - NodeJS

I am trying to create a function where I can pass a file path and read the file asynchronously. What I found out was that it supports streams.
const fs = require('fs');
var parse = require('csv-parse');
var async = require('async');

readCSVData = async (filePath): Promise<any> => {
  let csvString = '';
  var parser = parse({delimiter: ','}, function (err, data) {
    async.eachSeries(data, function (line, callback) {
      csvString = csvString + line.join(',') + '\n';
      console.log(csvString) // I can see this value getting populated
    })
  });
  fs.createReadStream(filePath).pipe(parser);
}
I got this code from here, but I am new to Node.js, so I am not getting how to use await to get the data once all the lines are parsed.
const csvData = await this.util.readCSVData(path)

My best workaround for this task is:
const csv = require('csvtojson')
const csvFilePath = 'data.csv'
const array = await csv().fromFile(csvFilePath);
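Note that await is only valid inside an async function (or in an ES module with top-level await), so in a plain CommonJS script this workaround would typically be wrapped, for example:
const csv = require('csvtojson');

(async () => {
  // fromFile resolves with an array of row objects
  const array = await csv().fromFile('data.csv');
  console.log(array);
})();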

The answer you got this code from provides legacy code that uses the async library. Promise-based control flow with async/await doesn't need that library. Asynchronous processing with async.eachSeries serves no purpose inside the csv-parse callback, because that callback is only invoked once all of the parsed data has already been collected.
If reading all the data into memory is not an issue, the CSV stream can be converted to a promise:
const fs = require('fs');
const getStream = require('get-stream');
const parse = require('csv-parse');

readCSVData = async (filePath): Promise<any> => {
  const parseStream = parse({delimiter: ','});
  const data = await getStream.array(fs.createReadStream(filePath).pipe(parseStream));
  return data.map(line => line.join(',')).join('\n');
}
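If you'd rather not pull in get-stream, the parser stream itself is async-iterable on reasonably recent Node versions, so a sketch along these lines (same csv-parse usage as above, everything still buffered in memory) should produce the same string:
const fs = require('fs');
const parse = require('csv-parse');

const readCSVData = async (filePath) => {
  const parser = fs.createReadStream(filePath).pipe(parse({delimiter: ','}));
  const lines = [];
  // the parser emits one array of fields per CSV record
  for await (const record of parser) {
    lines.push(record.join(','));
  }
  return lines.join('\n');
};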

Related

How can I save a file I download using fetch with fs

I am trying to download files from GitHub with the fetch() function.
Then I try to save the fetched file stream as a file with the fs module.
When doing so, I get this error:
TypeError [ERR_INVALID_ARG_TYPE]: The "transform.writable" property must be an instance of WritableStream. Received an instance of WriteStream
My problem is that I don't know the difference between WriteStream and WritableStream, or how to convert between them.
This is the code I run:
async function downloadFile(link, filename = "download") {
  var response = await fetch(link);
  var body = await response.body;
  var filepath = "./" + filename;
  var download_write_stream = fs.createWriteStream(filepath);
  console.log(download_write_stream.writable);
  await body.pipeTo(download_write_stream);
}
Node.js: v18.7.0
You can use Readable.fromWeb to convert body, which is a ReadableStream from the web streams API, into a NodeJS Readable stream that can be used with the fs methods.
Note that readable.pipe returns another stream instantly. To wait for it to finish, you can use the promise version of stream.finished to convert it into a Promise, or else you could add listeners for the 'finish' and 'error' events to detect success or failure.
const fs = require('fs');
const { Readable } = require('stream');
const { finished } = require('stream/promises');

async function downloadFile(link, filepath = './download') {
  const response = await fetch(link);
  const body = Readable.fromWeb(response.body);
  const download_write_stream = fs.createWriteStream(filepath);
  await finished(body.pipe(download_write_stream));
}
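A quick usage sketch (the URL and file name are just placeholders), assuming Node 18+ where fetch is available globally:
(async () => {
  await downloadFile('https://example.com/some-file.txt', './some-file.txt');
  console.log('download finished');
})();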
Good question. Web streams are something new, and they are a different way of handling streams. The WritableStream documentation shows that we can create WritableStreams as follows:
import {
  WritableStream
} from 'node:stream/web';

const stream = new WritableStream({
  write(chunk) {
    console.log(chunk);
  }
});
Then, you could create a custom stream that writes each chunk to disk. An easy way could be:
const download_write_stream = fs.createWriteStream('./the_path');

const stream = new WritableStream({
  write(chunk) {
    download_write_stream.write(chunk);
  },
});

async function downloadFile(link, filename = 'download') {
  const response = await fetch(link);
  const body = await response.body;
  await body.pipeTo(stream);
}
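One thing this sketch does not do is close the underlying fs stream once the download is finished. A hedged variant of the same idea (assuming Node 18+, with WritableStream imported from node:stream/web as above) adds a close() handler so the file gets flushed and closed:
const fs = require('fs');
const { WritableStream } = require('node:stream/web');

async function downloadFile(link, filename = 'download') {
  const download_write_stream = fs.createWriteStream('./' + filename);
  const stream = new WritableStream({
    write(chunk) {
      download_write_stream.write(chunk);
    },
    close() {
      // invoked when the web stream ends, so the file handle is released
      download_write_stream.end();
    },
  });
  const response = await fetch(link);
  await response.body.pipeTo(stream);
}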

nodeJS async function parse csv return data to other file

I'm creating a small tool for internal users with puppeteer.
Basically I got a csv file with some data I "read" and fill a form with.
As I try to clean up my project to be reusable, I'm struggling a little bit:
I created a file named parsecsv.js
const config = require('../config.json');
const parse = require('csv-parse');
const fs = require('fs');

const processFile = async () => {
  records = []
  const parser = fs
    .createReadStream(config.sourceFile)
    .pipe(parse({
      // CSV options
      from_line: 1,
      delimiter: ";",
    }));
  let i = 1;
  for await (const record of parser) {
    records.push(record)
    i++;
  }
  return records
}

const processFileData = async () => {
  const records = await processFile()
  console.info(records);
  return records
}

module.exports = {
  processFile, processFileData
}
In another JS file I made:
const parseCSV = require('./src/ParseCsv');
const records = parseCSV.processFileData();
const data = parseCSV.processFile();
console.log(typeof records);
console.table(records);
console.log(typeof data);
console.table(data);
But I never get my data, only an empty object.
How can I get my data so that I can "share" it with other functions?
Thanks
As your functions are async ones and they return promises, you can do something like:
const parseCSV = require('./src/ParseCsv');

(async () => {
  const records = await parseCSV.processFileData();
  const data = await parseCSV.processFile();
  console.log(typeof records);
  console.table(records);
  console.log(typeof data);
  console.table(data);
})()

node.js Combining multiple strings into a multiline string

module.exports = {
  name: "help",
  execute(msg, args){
    const fs = require("fs");
    const commandFiles = fs.readdirSync("./commands/").filter(file => file.endsWith(".js"));
    for (const file of commandFiles){
      const name = file.slice(0, -3);
      const descriptionFileName = name.concat(".desc");
      const descriptionFile = `./commands/${descriptionFileName}`;
      var output = "Help:";
      fs.readFile(descriptionFile, function(err, data){
        const helpLine = name.concat(" - ", data.toString());
        output = output + "\n" + helpLine
      });
      msg.channel.send(output);
    }
  }
}
Expected output:
help - description
ping - description
Output:
Help:
Help:
Any idea why that happens?
I'm new at coding and very new at JS.
You didn't get the expected result because readFile(file, cb) reads a file asynchronously. This means it just schedules the callback cb to be executed once the I/O operation has completed. However, the following code:
msg.channel.send(output)
will be executed synchronously, so output will still hold its initial value.
One way to handle this could be with promises; here is a partial example based on your code:
module.exports = {
  name: 'help',
  async execute(msg, args) {
    const { readFile, readdir } = require('fs').promises;
    // await the directory listing before filtering it
    const files = await readdir('./commands/');
    const commandFiles = files.filter((file) => file.endsWith('.js'));
    const promises = [];
    for (const file of commandFiles) {
      // read the matching .desc file for each command, as in the question
      const name = file.slice(0, -3);
      promises.push(readFile(`./commands/${name}.desc`, 'utf8'));
    }
    const results = await Promise.all(promises);
    // manipulate results as you want
    msg.channel.send(results.join('\n'));
  },
};
Note that because of the async prefix on the exported execute function, you need to handle a promise in the consumer of this module.
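For example, a hedged sketch of what the consumer side could look like (handleMessage is a hypothetical dispatcher, not part of the original code):
const help = require('./commands/help.js');

// whatever dispatches the command has to await execute()
async function handleMessage(msg, args) {
  await help.execute(msg, args);
  // anything here runs only after the help text has been sent
}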
Another approach could be to use a fully parallel control flow pattern
Some references:
promises
async/await
control flow

node.js synchronous file reading operation problem?

Problem Statement:
Complete function readFile to read the contents of the file sample.txt
and return the content as plain text response.
Note:
make sure when you read file mention its full path.
for e.g - suppose you have to read file xyz.txt
then instead of writing './xyz.txt' or 'xyz.txt'
write like ${__dirname}/xyz.txt
My Code:
const fs = require('fs');
const path = require('path');

let readFile = () => {
  let file = path.join(__dirname, '/xyz.txt');
  let variableFile = fs.readFileSync(file);
  return variableFile.toString();
};

module.exports = {
  readFile: readFile
};
You have to pass an encoding parameter to readFileSync or it will return a buffer:
const variableFile = fs.readFileSync(file, "utf8");
return variableFile;
PS: You should not use synchronous calls in production. There is a very neat API called "promisify" that allows you to use async/await or promises with fs:
const {promisify} = require('util');
const fs = require('fs');
const readFile = promisify(fs.readFile);

const example = async () => {
  const file = await readFile(/*...*/);
}
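As a rough sketch of how the original exercise could look with the promisified readFile (the sample.txt name comes from the problem statement; the encoding makes it return a string directly):
const { promisify } = require('util');
const fs = require('fs');
const path = require('path');
const readFileAsync = promisify(fs.readFile);

// async variant of the exercise: full path as required, utf8 so a string comes back
const readFile = async () => {
  const file = path.join(__dirname, 'sample.txt');
  return readFileAsync(file, 'utf8');
};

module.exports = { readFile };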

Node.js copy a stream into a file without consuming

Given a function that parses incoming streams:
async onData(stream, callback) {
  const parsed = await simpleParser(stream)
  // Code handling parsed stream here
  // ...
  return callback()
}
I'm looking for a simple and safe way to 'clone' that stream, so I can save it to a file for debugging purposes, without affecting the code. Is this possible?
Same question in fake code: I'm trying to do something like this. Obviously, this is a made up example and doesn't work.
const fs = require('fs')
const wstream = fs.createWriteStream('debug.log')

async onData(stream, callback) {
  const debugStream = stream.clone(stream) // Fake code
  wstream.write(debugStream)
  const parsed = await simpleParser(stream)
  // Code handling parsed stream here
  // ...
  wstream.end()
  return callback()
}
No, you can't clone a readable stream without consuming it. However, you can pipe it twice: one pipe for creating the file and the other as the 'clone'.
Code is below:
let Readable = require('stream').Readable;
var stream = require('stream')
var s = new Readable()
s.push('beep')
s.push(null)
var stream1 = s.pipe(new stream.PassThrough())
var stream2 = s.pipe(new stream.PassThrough())
// here, use stream1 for creating the file and stream2 as the 'clone' of s
// I just print them both out for a quick demo
stream1.pipe(process.stdout)
stream2.pipe(process.stdout)
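Applied to the onData handler from the question, a hedged sketch could look like this (simpleParser and the debug.log file come from the question; error handling is omitted):
const fs = require('fs');
const { PassThrough } = require('stream');

async function onData(stream, callback) {
  // fork the incoming stream: one copy for debugging, one for parsing
  const debugStream = stream.pipe(new PassThrough());
  const parseStream = stream.pipe(new PassThrough());

  debugStream.pipe(fs.createWriteStream('debug.log'));

  const parsed = await simpleParser(parseStream);
  // Code handling parsed stream here
  // ...
  return callback();
}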
I've tried to implement the solution provided by @jiajianrong but was struggling to get it to work with a createReadStream, because the Readable throws an error when I try to push the createReadStream directly, like:
s.push(createReadStream())
To solve this issue I have used a helper function to transform the stream into a buffer.
function streamToBuffer (stream: any) {
  const chunks: Buffer[] = []
  return new Promise((resolve, reject) => {
    stream.on('data', (chunk: any) => chunks.push(Buffer.from(chunk)))
    stream.on('error', (err: any) => reject(err))
    stream.on('end', () => resolve(Buffer.concat(chunks)))
  })
}
Below the solution I have found using one pipe to generate a hash of the stream and the other pipe to upload the stream to a cloud storage.
import stream from 'stream'
const Readable = require('stream').Readable
const s = new Readable()
s.push(await streamToBuffer(createReadStream()))
s.push(null)
const fileStreamForHash = s.pipe(new stream.PassThrough())
const fileStreamForUpload = s.pipe(new stream.PassThrough())
// Generating file hash
const fileHash = await getHashFromStream(fileStreamForHash)
// Uploading stream to cloud storage
await BlobStorage.upload(fileName, fileStreamForUpload)
My answer is mostly based on the answer of jiajianrong.
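getHashFromStream and BlobStorage.upload above are the author's own helpers. For completeness, a hedged sketch of what a hash helper along those lines might look like using Node's crypto module (the sha256 choice is just an example):
const crypto = require('crypto');

// hypothetical helper: consume a stream and resolve with its hex digest
function getHashFromStream(stream) {
  return new Promise((resolve, reject) => {
    const hash = crypto.createHash('sha256');
    stream.on('data', (chunk) => hash.update(chunk));
    stream.on('error', reject);
    stream.on('end', () => resolve(hash.digest('hex')));
  });
}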
