Node: Split CSV by value - node.js

I am trying to split a CSV into new CSV files based on the value of an attribute field. After testing a few different modules, it looks like fast-csv is a good choice. However, I need some help on how to split the files by that attribute.
I initially thought about doing a transform:
.transform(function (data) {
    if (data.AttributeValue == '10') {
        return data;
    } else if (data.AttributeValue == '5') {
        return data;
    } else {
        return data;
    }
})
Or I could use validate:
.validate(function (data) {
    return data.AttributeValue == '10';
})
What I need help with is which one to use and how to send the row of data to different writeStreams.
.pipe(fs.createWriteStream("10.csv", {encoding: "utf8"}));
.pipe(fs.createWriteStream("5.csv", {encoding: "utf8"}));
.pipe(fs.createWriteStream("Other.csv", {encoding: "utf8"}));
I have done this in Python but trying to migrate this to Node is proving trickier than I thought.
Thanks

You can use something like this to send the data from your validation test to another function:
.validate(function (data) {
    if (data.AttributeValue === '10') {
        ten(data);
    } else {
        notTen(data);
    }
});
Then you can simply use a CSV writer to write out to CSV. For each output file you'll need to open a separate CSV write stream:
var csvStream = csv.createWriteStream({headers: true}),
    writableStream = fs.createWriteStream("not10.csv");

writableStream.on("finish", function () {
    console.log("DONE!");
});

csvStream.pipe(writableStream);

function writeToCsv(data) {
    csvStream.write(data);
}
Repeat for each CSV you need to write to.
Might not be the best way to do it, but I am fairly new to this, and it seems to work.

Related

Loop through nested folder structure and change a line in all files with node.js

I want to write a script that loops through a nested folder structure and changes, for instance, console.log('hello world') in each JavaScript file. I simplified the task, and basically I can't use the built-in VS Code search and replace tool. The folder structure looks like this:
My try:
function changeLine(folderPaths) {
    folderPaths.forEach(folderPath => {
        const results = fs.readdirSync(folderPath);
        const folders = results.filter(res => fs.lstatSync(path.resolve(folderPath, res)).isDirectory());
        const innerFolderPaths = folders.map(folder => path.resolve(folderPath, folder));
        if (innerFolderPaths.length === 0) {
            return;
        }
        innerFolderPaths.forEach(inner => {
            if (fs.lstatSync(inner).isDirectory()) {
                // go deeper and change individual file
                let individualFile = fs.readFileSync(<go to individual file>).toString();
                individualFile = individualFile.replace(/console\.log\('hello world'\)/, 'line changed');
            }
        });
    });
}
changeLine([path.resolve(__dirname, 'src')]);
I don't have much experience with node.js, and that's what I came up with so far. The if condition inside the forEach loop is not correct; I hope someone can make sense of it. Thanks for reading!

After switching from a filesystem approach to MongoDB, how can I recreate nested directory iteration?

I was storing files in a nested file structure like
data/category/year/month/day/type/subtype/0.json
and to get a list of files / directories within any folder I'd simply use a function like:
getDirectoryContentsSync(pathName) {
    let exists = this.exists(pathName);
    if (exists) {
        let files = fs.readdirSync(pathName);
        return files;
    } else {
        return [];
    }
}
Now I'm switching to storing files in mongodb, identifying the files by their original file-path string like:
store(data, path) {
    this.db.collection('fs').insertOne({
        path,
        data
    });
}

load(path) {
    let data = this.db.collection('fs').findOne({
        path
    }).data;
    return data;
}
But I'm struggling to figure out a way to keep iterating through the structure the way I used to. The approach I'm considering feels pretty gross: assigning a separate document for each sub-path, pointing to its children. I think it's going to be a really bad approach; something like:
store(data, path) {
    this.db.collection('fs').insertOne({
        path,
        data
    });
    let pathParts = path.split("/");
    let subPath = "";
    pathParts.forEach(dir => {
        subPath += dir;
        this.db.collection('dir').insertOne({
            path: subPath,
            data: true,
            children: []
        });
    });
}
That's not the full code for the concept, because I realized it seemed like an overly complicated way to do it and decided to stop and ask. I'm just new to MongoDB, and I bet there's a much better way to handle this, but I have no idea where to start. What's a good way to do what I want?
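One common alternative to materializing a separate 'dir' collection is to store only the full path on each document and derive directory listings on demand: narrow candidates with an anchored prefix query, then reduce each match to its first remaining path segment. The Mongo query in the comment below is a sketch; the segment-extraction logic itself is plain JavaScript and shown runnable:

```javascript
// Given the full paths of stored files, return the immediate children of dirPath,
// the way fs.readdirSync used to. Against MongoDB you would first narrow the
// candidates with an anchored prefix query, e.g. (sketch, untested):
//   db.collection('fs').find({ path: { $regex: '^' + escapedDirPath + '/' } })
function getDirectoryContents(paths, dirPath) {
  const prefix = dirPath.endsWith('/') ? dirPath : dirPath + '/';
  const children = new Set(); // de-duplicate shared intermediate folders
  for (const p of paths) {
    if (p.startsWith(prefix)) {
      children.add(p.slice(prefix.length).split('/')[0]); // keep the first segment only
    }
  }
  return [...children];
}
```

With an index on `path`, the anchored `^prefix` regex stays efficient, so no extra bookkeeping documents are needed.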

Conditional settings for Gulp plugins dependent on source file

The gulp-pug plugin lets you pass global variables to pug files via the data property.
What if we don't need the full data set in each .pug file? To implement conditional data injection, we need access to the current vinyl file instance inside pipe(this.gulpPlugins.pug({})), or at least to know the source file's absolute path. Is this possible?
const dataSetForTopPage = {
foo: "alpha",
bar: "bravo"
};
const dataSetForAboutPage = {
baz: "charlie",
hoge: "delta"
};
gulp.src(sourceFileGlobsOrAbsolutePath)
    .pipe(gulpPlugins.pug({
        data: /*
            if path is 'top.pug' -> 'dataSetForTopPage',
            else if path is 'about.pug' -> 'dataSetForAboutPage',
            else -> empty object */
    }))
    .pipe(gulp.dest("output"));
I am using the gulp-intercept plugin, but how do I synchronize it with gulpPlugins.pug?
gulp.src(sourceFileGlobsOrAbsolutePath)
    .pipe(this.gulpPlugins.intercept(vinylFile => {
        // I can compute the conditional data set here,
        // but how to execute gulpPlugins.pug() here?
    }))
    // ...
That was just one example; we face the same problem whenever we need conditional plugin options for other gulp plugins, too. E.g.:
.pipe(gulpPlugins.htmlPrettify({
    indent_char: " ",
    indent_size: // if source file in 'admin/**' -> 2, else if in 'auth/**' -> 3, else 4
}))
You'll need to modify the stream manually; through2 is probably the most widely used package for this purpose. Once inside the through2 callback, you can pass the stream on to your gulp plugins (as long as their transform functions are exposed) and conditionally pass them options. For example, here is a task:
const pugtest = () => {
    const dataSet = {
        'top.pug': {
            foo: "alpha",
            bar: "bravo"
        },
        'about.pug': {
            foo: "charlie",
            bar: "delta"
        }
    };

    return gulp.src('src/**/*.pug')
        .pipe(through2.obj((file, enc, next) =>
            gulpPlugins.pug({
                // Grab the filename, and set the pug data to the value found in dataSet by that name
                data: dataSet[file.basename] || {}
            })._transform(file, enc, next)
        ))
        .pipe(through2.obj((file, enc, next) => {
            const options = {
                indent_char: ' ',
                indent_size: 4
            };
            if (file.relative.match(/admin\//)) {
                options.indent_size = 2;
            } else if (file.relative.match(/auth\//)) {
                options.indent_size = 3;
            }
            file.contents = Buffer.from(html.prettyPrint(String(file.contents), options), enc);
            next(null, file);
        }))
        .pipe(gulp.dest('output'));
};
For the pug step, we call through2.obj and create the pug plugin, passing it data grabbed from our object literal, indexed by filename in this example. So the data passed into the compiler now comes from that object literal.
For the html step you mention, gulp-html-prettify doesn't expose its transform function, so we can't reach into it and pass the transform back to the stream. But in this case that's OK: if you look at the source, it's just a wrapper around prettyPrint in the html package; that's quite literally all it does. So we can rig up our own step using through2 to do the same thing, changing the options based on the vinyl file's relative path.
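The path-to-options mapping in that second step is easy to factor out and test on its own. The optionsFor helper below is my naming, not part of any gulp plugin; it mirrors the admin/auth branching from the task:

```javascript
// Pick html.prettyPrint options from a vinyl file's relative path,
// mirroring the branching inside the through2 step of the task.
function optionsFor(relativePath) {
  const options = { indent_char: ' ', indent_size: 4 }; // defaults
  if (/(^|\/)admin\//.test(relativePath)) {
    options.indent_size = 2;
  } else if (/(^|\/)auth\//.test(relativePath)) {
    options.indent_size = 3;
  }
  return options;
}
```

Anchoring the regexes with `(^|\/)` avoids accidental matches on paths like `not-admin/`, which the bare `/admin\//` match in the task would also hit.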
That's it! For a working example see this repo: https://github.com/joshdavenport/stack-overflow-61314141-gulp-pug-conditional

Writing into a file only if the file content is different from the new data

I want a code like this:
fs.writeFile(fullFileAddress, data, function (err) {
    if (err) {
        if (err.code == 'ENOENT') {
            console.error(new Error(`#Space. Can not save file. message: ${err}`.red));
        }
    } else {
        console.log("#eJS. %s created.", fullFileAddress);
    }
});
Do I have to read the file first, compare it with the data variable, and write to the file only if they differ?
Or is there a better way to do this?
An elegant alternative would be to compare stats.mtime, the last-modified time of the file, against the time of data dispatch, rather than comparing the file content against the data.

Empty PHPExcel file using liuggio/ExcelBundle in Symfony

I have some code that iterates over the rows and columns of an Excel sheet and replaces text with other text. This is done with a service that takes the Excel file and a dictionary as parameters, like this:
$mappedTemplate = $this->get('app.entity.translate')->translate($phpExcelObject, $dictionary);
The service itself looks like this.
public function translate($template, $dictionary)
{
    foreach ($template->getWorksheetIterator() as $worksheet) {
        foreach ($worksheet->getRowIterator() as $row) {
            $cellIterator = $row->getCellIterator();
            $cellIterator->setIterateOnlyExistingCells(false); // Loop over all cells, even if not set
            foreach ($cellIterator as $cell) {
                if (!is_null($cell)) {
                    if (!is_null($cell->getCalculatedValue())) {
                        if (array_key_exists((string)$cell->getCalculatedValue(), $dictionary)) {
                            $worksheet->setCellValue(
                                $cell->getCoordinate(),
                                $dictionary[$cell->getCalculatedValue()]
                            );
                        }
                    }
                }
            }
        }
    }
    return $template;
}
After some debugging I found out that the text actually is replaced and that the service works as it should. The problem is that when I return the new PHPExcel file as a response to download, the Excel file is empty.
This is the code I use to return the file.
// create the writer
$writer = $this->get('phpexcel')->createWriter($mappedTemplate, 'Excel5');
// create the response
$response = $this->get('phpexcel')->createStreamedResponse($writer);
// adding headers
$dispositionHeader = $response->headers->makeDisposition(
ResponseHeaderBag::DISPOSITION_ATTACHMENT,
$file_name
);
$response->headers->set('Content-Type', 'application/vnd.openxmlformats-officedocument.spreadsheetml.sheet');
$response->headers->set('Pragma', 'public');
$response->headers->set('Cache-Control', 'maxage=1');
$response->headers->set('Content-Disposition', $dispositionHeader);
return $response;
What am I missing?
Your code is missing the calls to the writer.
You only create the writer but never use it, at least not in the code examples you shared:
$objWriter = new PHPExcel_Writer_Excel2007($objPHPExcel);
$response = $this->get('phpexcel')->createStreamedResponse($objWriter);
Another thing is the content type: do you have the Apache content types set up correctly?
$response->headers->set('Content-Type', 'application/vnd.ms-excel; charset=utf-8');
