How to write tar stream entry with unknown size? - node.js

Here's the gist of what I'm trying to do:
import * as Path from 'path'
import {exportTableDataToFile} from '../struct'
import * as Tar from 'tar-stream'
import * as Zlib from 'zlib'
import * as FileSys from 'fs'
async function execute(opts, args) {
const pack = Tar.pack()
pack.pipe(Zlib.createGzip({level: Zlib.constants.Z_BEST_COMPRESSION})).pipe(FileSys.createWriteStream(opts.file))
const tblDataFile = Path.join(db.name, `${tblName}.csv`)
const dataStream = pack.entry({name: tblDataFile}, err => {
if(err) throw err;
})
await exportTableDataToFile(conn, db.name, tblName, dataStream)
}
Where exportTableDataToFile is writing a CSV into dataStream line-by-line.
Since I'm generating that CSV on the fly from some database records, I don't know how big it's going to be.
I also don't really want to buffer the entire CSV into memory if I can help it.
The above is throwing "size mismatch" because I didn't specify the size in pack.entry(...).
Is there any way I can stream to a .tar.gz in Node.js without knowing the size?

Instead of using a module, if you just want to create a CSV of unknown size from the DB, you can do something like the following:
const fs = require("fs");
const csvFile = fs.createWriteStream("db.csv");
// column headers of your CSV, remove if not needed
csvFile.write("column1, column2, column3, column4\n");
while(true){
const result=db.find(table);// db call -> replace it with your db fetch call (it must return the next page each time)
// Here I am expecting column1Value ... to be the fields in my DB
for(const elem of result){
csvFile.write(`${elem.column1Value}, ${elem.column2Value}, ${elem.column3Value}, ${elem.column4Value}\n`);
}
if(!result.length){
break
}
// Need to handle pagination
}
// close the file once all rows are written
csvFile.end();
You can replace the DB call as per your syntax.
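For the original tar question: the tar format stores each entry's size in the header that precedes its data, so tar-stream needs the size before the first byte is written. One workaround is to spool the CSV to a temporary file, measure it, and then pipe it into an entry with a known size. The following is only a sketch under assumptions: `pack` is taken to be a tar-stream pack instance, and `addEntryOfUnknownSize` and `produce` are hypothetical names (in the question, `produce` would wrap the `exportTableDataToFile` call). It keeps the CSV on disk rather than in memory.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');
const { finished } = require('stream/promises');

// Spool generated data to a temp file, then add it to the archive with a known size.
// `pack` is assumed to be a tar-stream pack instance; `produce` is a hypothetical
// async callback that writes into the stream it receives and must not end it.
async function addEntryOfUnknownSize(pack, name, produce) {
  const dir = await fs.promises.mkdtemp(path.join(os.tmpdir(), 'tar-spool-'));
  const tmpFile = path.join(dir, 'entry.tmp');
  try {
    const spool = fs.createWriteStream(tmpFile);
    await produce(spool);
    spool.end();
    await finished(spool); // wait until everything is flushed to disk

    const { size } = await fs.promises.stat(tmpFile);
    await new Promise((resolve, reject) => {
      // The entry callback fires once the entry has been fully written.
      const entry = pack.entry({ name, size }, err => (err ? reject(err) : resolve()));
      fs.createReadStream(tmpFile).on('error', reject).pipe(entry);
    });
  } finally {
    // Remove the spool directory whether or not the entry was written.
    await fs.promises.rm(dir, { recursive: true, force: true });
  }
}
```

Once the returned promise resolves, you can add further entries the same way and then call pack.finalize().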

Related

Using ipfs in javascript: how to read an object the same way as dumping it to file from command line and reading the file?

I have some arrow file that I am trying to read in javascript. Dumping it to file via the command line: ipfs get HASH and then
fs = require('fs')
a = fs.readFileSync(HASH)
da = arrow.Table.from(a)
works fine.
Loading the cid (HASH)
ipfs = require('ipfs')
ipfs.create({repo: String(Math.random() + Date.now()) }).then(x=>node=x).then(
node=>node.object.get(HASH)
).then(x=>data=x)
Gives me something that has a data.Data buffer in some other format, and it does not load into an arrow Table in the same way. How can I get the bytes in the same way as with readFileSync?
It turns out you need to use the ipfs cat method, which returns an async iterator, so there is a small extra step to be aware of to get that into the arrow table.
I am not sure if there is a direct method for the get.
async function docat() {
var out = []
for await (const result of node.cat(HASH)) {
out.push(result)
}
return out
}
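The array of chunks returned above still has to be joined into one contiguous buffer before it looks like the output of readFileSync. Here is a minimal sketch of that extra step, assuming the iterator yields Buffer or Uint8Array chunks; `collectBytes` is a hypothetical helper name, not part of the ipfs API.

```javascript
// Collect every chunk from an async iterable of byte chunks into one Buffer.
// Assumes each chunk is a Buffer or Uint8Array.
async function collectBytes(chunks) {
  const parts = [];
  for await (const chunk of chunks) {
    parts.push(chunk);
  }
  return Buffer.concat(parts);
}
```

With that in place, something like arrow.Table.from(await collectBytes(node.cat(HASH))) should receive the same bytes as the file-based version.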

Adding files in directory to an array

I am really new to node.js. I need to read .json files from a directory and then add them to an array and return it. I am able to read each file separately by passing the address:
const fs = require("fs");
fs.readFile("./fashion/customer.json", "utf8", (err, jsonString) => {
if (err) {
console.log("Error reading file from disk:", err);
return;
}
try {
const customer = JSON.parse(jsonString);
console.log("Customer address is:", customer.address); // => "Customer address is: Infinity Loop Drive"
} catch (err) {
console.log("Error parsing JSON string:", err);
}
});
But the same fashion folder has multiple json files. I want to add these files to an array and then return it. I tried using readdirSync but that just returned the file names. Is it possible to add json files to an array and return it?
Basically I require an array of this format:
Array[{contents of json file1}, {contents of json file2}, .....]
Any help is appreciated!
Here is a simple solution to your question:
const fs = require("fs");
const jsonFolder = './fashion'
var customerDataArray = []
fs.readdirSync(jsonFolder).forEach(file => {
let fileData = JSON.parse(fs.readFileSync(jsonFolder+'/'+file))
customerDataArray.push(fileData)
});
console.log(customerDataArray)
readdirSync returns an array with all the file names or objects in the directory. You can use forEach to iterate through every item in the array, which will be the file names in this scenario. To read the contents of each file, use readFileSync and specify the path to the file as the name of the directory plus the name of the file. The data is returned as a buffer and needs to be parsed using JSON.parse(), and then it is pushed to the customerDataArray.
I hope this answers your question!
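If the folder may also contain files that are not JSON, or you would rather not block the event loop, a non-blocking variant of the same idea could look like the sketch below. The `readJsonFolder` name is made up for illustration. It assumes every .json file in the folder holds valid JSON, and it sorts the file names so the result order is predictable.

```javascript
const fs = require('fs');
const path = require('path');

// Read every .json file in `folder` and resolve with an array of parsed contents.
async function readJsonFolder(folder) {
  const names = (await fs.promises.readdir(folder))
    .filter(name => path.extname(name).toLowerCase() === '.json')
    .sort();
  // Read the files in parallel; Promise.all keeps the results in `names` order.
  return Promise.all(
    names.map(async name =>
      JSON.parse(await fs.promises.readFile(path.join(folder, name), 'utf8'))
    )
  );
}
```

Usage would be along the lines of readJsonFolder('./fashion').then(customerDataArray => console.log(customerDataArray)).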

Getting error while reading json file using node.js

I am getting the following error while reading the json file using Node.js. I am explaining my code below.
SyntaxError: Unexpected token # in JSON at position 0
at JSON.parse (<anonymous>)
My json file is given below.
test.json:
#PATH:/test/
#DEVICES:div1
#TYPE:p1
{
name:'Raj',
address: {
city:'bbsr'
}
}
This json file has some lines that start with #. Here I need to remove those # lines from the file. I am explaining my code below.
fs.readdirSync(`${process.env['root_dir']}/uploads/${fileNameSplit[0]}`).forEach(f => {
console.log('files', f);
let rawdata = fs.readFileSync(`${process.env['root_dir']}/uploads/${fileNameSplit[0]}/${f}`);
let parseData = JSON.parse(rawdata);
console.log(parseData);
});
Here I am trying to read the file first but getting the above error. My need is to remove those # lines from the json file, then read all the data and convert the removed lines to an object like const obj ={PATH:'/test/',DEVICES:'div1',TYPE:'p1'}. Here I am using the node.js fs module to achieve this.
As you said, you need to remove those # lines from the JSON file. You need to code this yourself. To help with that, read the file into a string and not a Buffer by providing an encoding to readFileSync.
const text = fs.readFileSync(path, 'utf8');
console.log(text);
const arr = text.split("\n");
const noComments = arr.filter(x => x[0] !== "#");
const filtered = noComments.join("\n");
const data = JSON.parse(filtered);
console.log(data);
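The question also asks for the removed # lines to end up in an object. A sketch of one way to do both in a single pass follows; `parseAnnotatedJson` is a hypothetical helper name. It assumes each # line has the form #KEY:value, and that the remaining body is strict JSON. The body shown in the question uses unquoted keys and single quotes, which JSON.parse rejects, so that file would need double-quoted keys and strings first.

```javascript
// Split "#KEY:value" header lines from the JSON body and parse both.
function parseAnnotatedJson(text) {
  const meta = {};
  const body = [];
  for (const line of text.split(/\r?\n/)) {
    if (line.startsWith('#')) {
      const sep = line.indexOf(':');
      if (sep !== -1) {
        // Everything between '#' and the first ':' is the key, the rest is the value.
        meta[line.slice(1, sep).trim()] = line.slice(sep + 1).trim();
      }
    } else {
      body.push(line);
    }
  }
  return { meta, data: JSON.parse(body.join('\n')) };
}
```

You would call it with the string returned by fs.readFileSync(path, 'utf8').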

Why am I getting an ENOENT using Node core module 'fs'

This is a repeat question (not yet answered) but I have revised and tightened up the code. And, I have included the specific example. I am sorry to keep beating this drum, but I need help.
This is a Node API. I need to read and write JSON data. I am using the Node core module 'fs', not the npm package by the same name (or fs-extra). I have extracted the particular area of concern into a standalone module that is shown here:
'use strict';
/*==================================================
This service GETs the list of ids to the json data files
to be processed, from a json file with the id 'ids.json'.
It returns and exports idsList (an array holding the ids of the json data files)
It also calls putIdsCleared to clear the 'ids.json' file for the next batch of processing
==================================================*/
// node modules
const fs = require('fs');
const config = require('config');
const scheme = config.get('json.scheme')
const jsonPath = config.get('json.path');
const url = `${scheme}${jsonPath}/`;
const idsID = 'ids.json';
const uri = `${url}${idsID}`;
let idsList = [];
const getList = async (uri) => {
await fs.readFile(uri, 'utf8', (err, data) => {
if (err) {
return(console.log( new Error(err.message) ));
}
return jsonData = JSON.parse(data);
})
}
// The idea is to get the empty array written back to 'ids.json' before returning to 'process.js'
const clearList = async (uri) => {
let data = JSON.stringify({'ids': []});
await fs.writeFile(uri, data, (err) => {
if (err) {
return (console.log( new Error(err.message) ));
}
return;
})
}
getList(uri);
clearList(uri)
console.log('end of idsList',idsList);
module.exports = idsList;
Here is the console output from the execution of the module:
Error: ENOENT: no such file or directory, open 'File:///Users/doug5solas/sandbox/libertyMutual/server/api/ids.json'
at ReadFileContext.fs.readFile [as callback]
(/Users/doug5solas/sandbox/libertyMutual/server/.playground/ids.js:24:33)
at FSReqWrap.readFileAfterOpen [as oncomplete] (fs.js:235:13)
Error: ENOENT: no such file or directory, open 'File:///Users/doug5solas/sandbox/libertyMutual/server/api/ids.json'
at fs.writeFile
(/Users/doug5solas/sandbox/libertyMutual/server/.playground/ids.js:36:34)
at fs.js:1167:7
at FSReqWrap.oncomplete (fs.js:141:20)
I am being told there is no such file or directory. However I can copy the uri (as shown in the error message)
File:///Users/doug5solas/sandbox/libertyMutual/server/api/ids.json
into the search bar of my browser and this is what is returned to me:
{
"ids": [
"5sM5YLnnNMN_1540338527220.json",
"5sM5YLnnNMN_1540389571029.json",
"6tN6ZMooONO_1540389269289.json"
]
}
This result is the expected result. I do not "get" why I can get the data manually but I cannot get it programmatically, using the same uri. What am I missing? Help appreciated.
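Your file URI is in the wrong format.
It shouldn't contain the File:// protocol. When you pass a string, fs expects a plain filesystem path (the file:// prefix is a URL scheme that browsers understand).
Judging by the paths in your stack trace, I'd imagine you want /Users/doug5solas/sandbox/libertyMutual/server/api/ids.json.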
I solved the problem by going to readFileSync. I don't like it but it works and it is only one read.
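There is also a way to stay asynchronous. The callback form of fs.readFile does not return a promise, so the await in the question has nothing to wait for, whereas fs.promises.readFile does return one. Below is a rough sketch of the module's two functions rewritten that way. `toFsPath` is a hypothetical helper that converts a file:// URL string into a plain path with url.fileURLToPath, and the { ids: [...] } file shape is taken from the question.

```javascript
const fs = require('fs');
const { fileURLToPath } = require('url');

// Accept either a plain path or a file:// URL string and return a plain path.
function toFsPath(location) {
  return /^file:/i.test(location) ? fileURLToPath(location) : location;
}

// Read the id list; fs.promises.readFile returns a promise, so await works here.
async function getList(location) {
  const text = await fs.promises.readFile(toFsPath(location), 'utf8');
  return JSON.parse(text).ids;
}

// Write the empty list back; resolves only once the write has completed.
async function clearList(location) {
  await fs.promises.writeFile(toFsPath(location), JSON.stringify({ ids: [] }));
}
```

A caller would then do const idsList = await getList(uri); await clearList(uri); so that the list is read before the file is cleared.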

Is there a more elegant way to read then write *the same file* with node js stream

I want to read a file, change it with through2, then write it back into the same file, with code like:
const rm = require('rimraf')
const through2 = require('through2')
const fs = require('graceful-fs')
// source file path
const replacementPath = `./static/projects/${destPath}/index.html`
// temp file path
const tempfilePath = `./static/projects/${destPath}/tempfile.html`
// read source file then write into temp file
await promiseReplace(replacementPath, tempfilePath)
// del the source file
rm.sync(replacementPath)
// rename the temp file name to source file name
fs.renameSync(tempfilePath, replacementPath)
// del the temp file
rm.sync(tempfilePath)
// promiseify readStream and writeStream
function promiseReplace (readfile, writefile) {
return new Promise((res, rej) => {
fs.createReadStream(readfile)
.pipe(through2.obj(function (chunk, encoding, done) {
const replaced = chunk.toString().replace(/id="wrap"/g, 'dududud')
done(null, replaced)
}))
.pipe(fs.createWriteStream(writefile))
.on('finish', () => {
console.log('replace done')
res()
})
.on('error', (err) => {
console.log(err)
rej(err)
})
})
}
The above code works, but I want to know if I can make it more elegant.
I also tried a temp library, node-temp.
Unfortunately, it cannot readStream and writeStream into the same file either, and I opened an issue about this.
So if anyone knows a better way to do this, please tell me. Thank you very much.
You can make the code more elegant by getting rid of unnecessary dependencies and using the newer simplified constructor for streams.
const fs = require('fs');
const util = require('util');
const stream = require('stream');
const tempWrite = require('temp-write');
const rename = util.promisify(fs.rename);
const goat2llama = async (filePath) => {
const str = fs.createReadStream(filePath, 'utf8')
.pipe(new stream.Transform({
decodeStrings : false,
transform(chunk, encoding, done) {
done(null, chunk.replace(/goat/g, 'llama'));
}
}));
const tempPath = await tempWrite(str);
await rename(tempPath, filePath);
};
Tests
AVA tests to prove that it works:
import fs from 'fs';
import path from 'path';
import util from 'util';
import test from 'ava';
import mkdirtemp from 'mkdirtemp';
import goat2llama from '.';
const writeFile = util.promisify(fs.writeFile);
const readFile = util.promisify(fs.readFile);
const fixture = async (content) => {
const dir = await mkdirtemp();
const fixturePath = path.join(dir, 'fixture.txt');
await writeFile(fixturePath, content);
return fixturePath;
};
test('goat2llama()', async (t) => {
const filePath = await fixture('I like goats and frogs, but goats the best');
await goat2llama(filePath);
t.is(await readFile(filePath, 'utf8'), 'I like llamas and frogs, but llamas the best');
});
A few things about the changes:
Through2 is not really needed anymore. It used to be a pain to set up passthrough or transform streams properly, but that is not the case anymore thanks to the simplified construction API.
You probably don't need graceful-fs, either. Unless you are doing a lot of concurrent disk I/O, EMFILE is not usually a problem, especially these days as Node has gotten smarter about file descriptors. But that library does help with temporary errors caused by antivirus software on Windows, if that is a problem for you.
You definitely do not need rimraf for this. You only need fs.rename(). It is similar to mv on the command line, with a few nuances that make it distinct, but the differences are not super important here. The point is there will be nothing at the temporary path after you rename the file that was there.
I used temp-write because it generates a secure random filepath for you and puts it in the OS temp directory (which automatically gets cleaned up now and then), plus it handles converting the stream to a Promise for you and takes care of some edge cases around errors. Disclosure: I wrote the streams implementation in temp-write. :)
Overall, this is a decent improvement. However, there remains the boundary problem discussed in the comments: a match that is split across two chunks (for example, "go" at the end of one chunk and "at" at the start of the next) will not be replaced. Luckily, you are not the first person to encounter this problem! I wouldn't call the actual solution particularly elegant, certainly not if you implement it yourself. But replacestream is here to help you.
const fs = require('fs');
const util = require('util');
const tempWrite = require('temp-write');
const replaceStream = require('replacestream');
const rename = util.promisify(fs.rename);
const goat2llama = async (filePath) => {
const str = fs.createReadStream(filePath, 'utf8')
.pipe(replaceStream('goat', 'llama'));
const tempPath = await tempWrite(str);
await rename(tempPath, filePath);
};
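To illustrate what handling the chunk boundary involves, here is a minimal sketch that uses only Node's stream module. It is not how replacestream is implemented, just one simple approach for a fixed search string: hold back the last needle.length - 1 characters of each chunk so that a match straddling two chunks is seen whole. `createReplaceStream` is a made-up name, and the sketch assumes the input arrives as strings (for example from a read stream opened with 'utf8').

```javascript
const { Transform } = require('stream');

// Replace every occurrence of a fixed string, including ones split across chunks.
function createReplaceStream(needle, replacement) {
  if (!needle) throw new Error('needle must be a non-empty string');
  let tail = ''; // unprocessed text carried over from the previous chunk
  return new Transform({
    decodeStrings: false,
    transform(chunk, encoding, done) {
      const data = tail + chunk.toString();
      let out = '';
      let pos = 0;
      let idx;
      while ((idx = data.indexOf(needle, pos)) !== -1) {
        out += data.slice(pos, idx) + replacement;
        pos = idx + needle.length;
      }
      // Text after the last match: keep its end in case a match starts there.
      const rest = data.slice(pos);
      const keep = Math.min(needle.length - 1, rest.length);
      tail = rest.slice(rest.length - keep);
      done(null, out + rest.slice(0, rest.length - keep));
    },
    flush(done) {
      // The tail is shorter than the needle, so it cannot contain a match.
      done(null, tail);
    },
  });
}
```

It could stand in for replaceStream('goat', 'llama') in the snippet above, though a maintained library is the safer choice for anything beyond fixed strings.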
Also...
I do not like temp files
Indeed, temp files are often bad. However, in this case, the temp file is managed by a well-designed library and stored in a secure, out-of-the-way location. There is virtually no chance of conflicting with other processes. And even if the rename() fails somehow, the file will be cleaned up by the OS.
That said, you can avoid temp files altogether by using fs.readFile() and fs.writeFile() instead of streaming. The former also makes text replacement much easier since you do not have to worry about chunk boundaries. You have to choose one approach or the other; for very big files, however, streaming may be the only option, aside from manually chunking the file.
Streams are useless in this situation, because they give you chunks of the file that can split the string you're searching for. You could use streams, then merge all these chunks to get the content, then replace the string that you need, but that will be longer code that provokes just one question: why read the file in chunks if you don't use them?
The shortest way to achieve what you want is:
let fileContent = fs.readFileSync('file_name.html', 'utf8')
let replaced = fileContent.replace(/id="wrap"/g, 'dududud')
fs.writeFileSync('file_name.html', replaced)
All these functions are synchronous, so you don't have to promisify them.
