Node.js: import CSV with blank fields

I'm trying to import and parse a CSV file using the csv-parse package, but I'm having difficulty requiring the CSV file in the first place.
When I do input = require('../../path-to-my-csv-file')
I get an error due to consecutive commas because some fields are blank:
e","17110","CTSZ16","Slitzerâ„¢ 16pc Cutlery Set in Wood Block",,"Spice up
^
SyntaxError: Unexpected token ,
How do I import the CSV file into the node environment to begin with?
Package examples are here.

To solve your first problem, reading CSV with empty entries:
Use the 'fast-csv' node package. It will parse CSV with empty entries.
To answer your second question, how to import a CSV into node:
You don't really "import" csv files into node. You should fs.open the file
or use fs.createReadStream to read the csv file at the appropriate location.
Below is a script that uses fs.createReadStream to parse a CSV called 'test.csv' that is one directory up from the script that is running it.
The first section sets up our program and makes basic declarations of the objects we're going to use to store our parsed list.
var csv = require('fast-csv') // require fast-csv module
var fs = require('fs') // require the fs, filesystem module
var uniqueindex = 0 // just an index for our array
var dataJSON = {} // our JSON object, (make it an array if you wish)
This next section declares a stream that will intercept data as it's read from our CSV file and do stuff to it. In this case we're intercepting the data and storing it in a JSON object and then saving that JSON object once the stream is done. It's basically a filter that intercepts data and can do what it wants with it.
var csvStream = csv()                          // - uses the fast-csv module to create a csv parser
    .on('data', function(data){                // - when we get data, perform function(data)
        dataJSON[uniqueindex] = data;          // - store our data in our JSON object dataJSON
        uniqueindex++                          // - the index of the data item in our array
    })
    .on('end', function(){                     // - when the data stream ends, perform function()
        console.log(dataJSON)                  // - log our whole object to the console
        fs.writeFile('../test.json',           // - use the fs module to write a file
            JSON.stringify(dataJSON, null, 4), // - turn our JSON object into a string that can be written
            function(err){                     // - runs once we're done saving the file; err will be null if there is no error
                if(err) throw err              // - if there's an error while saving the file, throw it
                console.log('data saved as JSON yay!')
            })
    })
This section creates what is called a "readStream" from our csv file. The path to the file is relative. A stream is just a way of reading a file. It's pretty powerful though because the data from a stream can be piped into another stream.
So we'll create a stream that reads the data from our CSV file, and then we'll pipe it into the parsing stream / filter we defined in section 2.
var stream = fs.createReadStream('../test.csv')
stream.pipe(csvStream)
This will create a file called 'test.json' one directory up from the place where our csv parsing script is. test.json will contain the parsed CSV list inside a JSON object. The order in which the code appears here is how it should appear in a script you make.
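As an aside: newer fast-csv releases no longer let you call the module directly; the parser is created with csv.parse() instead. So on a current version the filter from section 2 would start roughly like this (a sketch; everything else stays the same):
var csvStream = csv.parse()                    // - newer fast-csv API: create the parser with parse()
    .on('data', function(data){
        dataJSON[uniqueindex] = data;
        uniqueindex++
    })
    .on('end', function(){
        console.log(dataJSON)
    })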

Related

How do I export the data generated by this code to a CSV file in puppeteer?

I need to export the data generated by this code into a CSV file. I am new to node.js/puppeteer, so I am struggling to generate the CSV file.
I understand I can use the fs write function, and I tried adding this to the end of my code, to no avail:
const fs = require('fs');

const csv = await page.$$eval('.product_desc_txt', function(products){
    // Iterate over product descriptions
    let csvLines = products.map(function(product){
        // Inside of each product find product SKU and its price
        let productId = document.querySelector(".custom-body-copy").innerText.trim();
        let productPrice = document.querySelector("span[data-wishlist-linkfee]").innerText.trim();
        // Format them as a csv line
        return `${productId};${productPrice}`;
    });
    // Join all lines into one file
    return csvLines.join("\n");
});
fs.writeFileSync("test.csv", csv)
});
You've already got the data from puppeteer in csv, but you never use it. Just write the data to the file:
fs.writeFileSync("test.csv", csv);
Also, writing this to the file:
'${productId};${productPrice}'
won't work: those variables don't exist at that point in the code, and even if they did, the correct way to interpolate variables into a string is with backticks (a template literal), not single quotes:
`${productId};${productPrice}`
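Putting both fixes together, the tail end of the script would look roughly like this (a sketch reusing the selectors from the question, with the inner queries scoped to each product element; it has to run inside the async function that owns page):
const fs = require('fs');

const csv = await page.$$eval('.product_desc_txt', function(products){
    return products.map(function(product){
        // query inside each product element
        let productId = product.querySelector(".custom-body-copy").innerText.trim();
        let productPrice = product.querySelector("span[data-wishlist-linkfee]").innerText.trim();
        return `${productId};${productPrice}`;   // backticks: template literal
    }).join("\n");
});

// actually use the csv string returned by $$eval
fs.writeFileSync("test.csv", csv);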

How do I read csv file line by line, modify each line, write result to another file

I recently used event-stream library for nodejs to parse a huge csv file, saving results to database.
How do I solve the task of not just reading a file, but modifying each line and writing the result to a new file?
Is it some combination of the through and map methods, or a duplex stream? Any help is highly appreciated.
If you use event-stream for reading, you can use its split() method to process the csv line by line, then modify each line and write it to a new writable stream.
var fs = require('fs');
var es = require('event-stream');

const newCsv = fs.createWriteStream('new.csv');

fs.createReadStream('old.csv')
    .pipe(es.split())                       // emit the file one line at a time (newline stripped)
    .pipe(es.mapSync(function(line) {
        // modify the line any way you want, then write it out, adding the newline back
        newCsv.write(line + '\n');
    }))
    .on('end', function() {
        newCsv.end();                       // close the output only once the input has been fully read
    });
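An alternative sketch is to return the modified line from mapSync and let the pipeline handle the output stream as well (the toUpperCase() call is just a placeholder for whatever modification you need):
fs.createReadStream('old.csv')
    .pipe(es.split())                       // one line at a time, newline stripped
    .pipe(es.mapSync(function(line) {
        return line.toUpperCase() + '\n';   // placeholder modification, newline added back
    }))
    .pipe(fs.createWriteStream('new.csv'));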

How can I read json values from a file?

So basically I have these json values in my config.json file, but how can I read them from a .txt file, for example:
{"prefix": $}
This would set a variable configPrefix to $. Any help?
You can use require() to read and parse your JSON file in one step:
let configPrefix = require("./config.json").prefix;
Or, if you wanted to get multiple values from that config:
const configData = require("./config.json");
let configPrefix = configData.prefix;
If your data is not actually JSON formatted, then you have to read the file yourself with something like fs.readFile() or fs.readFileSync() and then parse it yourself according to whatever formatting rules you have for the file.
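For the JSON case, reading the file yourself instead of using require() looks roughly like this (a sketch, assuming config.json sits next to the script):
const fs = require('fs');

// read the file as text, then parse it as JSON
const configData = JSON.parse(fs.readFileSync('./config.json', 'utf8'));
let configPrefix = configData.prefix;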
If you are going to be reading this file just at the start of the program, then go ahead and use require (or import if you have babel). Just a tip: surround the require with a try/catch block to handle possible errors.
let config
try {
    config = require('path.to.file.json')
} catch (error) {
    // handle error
    config = {}
}
If you will be changing this file externally and you feel the need to re-read it, then apart from reading it at the start you will need a function that uses fs.readFile. Prefer that over fs.readFileSync unless you need to block the program until you are done reading the config file.
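A minimal sketch of such a re-read helper (reloadConfig is a hypothetical name, and the path is assumed to be ./config.json):
const fs = require('fs');

function reloadConfig(callback) {
    fs.readFile('./config.json', 'utf8', function(err, data) {
        if (err) return callback(err);
        try {
            callback(null, JSON.parse(data));   // parse the freshly read file
        } catch (parseErr) {
            callback(parseErr);                 // invalid JSON in the file
        }
    });
}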
After all of that you can do const configPrefix = config.prefix which will have the value '$'.

Get md5 checksums of entries in zip using adm-zip

I am trying to get MD5 checksums for all files in a ZIP file. I am currently using adm-zip for this because I read that it can read zip contents into memory without having to extract the files to disk. But I am failing to read the data of the entries in the ZIP file. My code goes as follows:
var zip = new AdmZip(path);
zip.getEntries()
    .map(entry => { console.log(entry.entryName, entry.data); });
The entryName can be read, so opening and reading the zip works. But data is always undefined. I read that data is not really the method to read the data of an entry, but I am not sure how to actually read it.
To read the data of an entry, you must call the entry object's getData() method, which returns a Buffer. Here is the updated code snippet, which works on my end:
var zip = new AdmZip(path);
zip.getEntries().map(entry => {
    const md5Hash = crypto.createHash('md5').update(entry.getData()).digest('hex');
    console.log(md5Hash);
});
I used the basic crypto module to produce the md5 hash (in hex format). Don't forget to add it to the list of your requires at the top of your file: const crypto = require('crypto');
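For completeness, the whole thing with both requires in place (assuming path points at your zip file) would look roughly like:
const AdmZip = require('adm-zip');
const crypto = require('crypto');

const zip = new AdmZip(path);
zip.getEntries().forEach(entry => {
    // hash the decompressed contents of each entry
    const md5Hash = crypto.createHash('md5').update(entry.getData()).digest('hex');
    console.log(entry.entryName, md5Hash);
});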

How to deserialize avro in nodejs?

I have an avro file.
I want to use nodejs to open and read its schema and iterate through its records.
How do I do this? The avro libraries I see for nodejs appear to require you to pass in a schema instead of reading the schema out of the .avro file. Also, I want to be able to handle arrays, which no node library seems to support (node-avro-io does not).
My avro file / avro schema contains:
A nested field {a:{suba: vala, subb: vala}}.
An array field {a:["A","B"]}. node-avro-io does not work.
Error I get with node-avro-io:
Avro Invalid Schema Error: Primitive type must be one of: ["null","boolean","int","long","float","double","bytes","string"]; got DependencyNode
In case you're still looking, you can do this with avsc. The code would look something like:
var avro = require('avsc');
// To stream the file's records (they will already be decoded):
avro.createFileDecoder('your/data.avro')
.on('data', function (record) { /* Do something with the record. */ });
// Or, if you just want the file's header (which includes the schema):
var header = avro.extractFileHeader('your/data.avro');
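If you want all the records in memory at once, here is a small sketch on top of the same decoder (collecting into an array and waiting for the stream to end):
var records = [];
avro.createFileDecoder('your/data.avro')
    .on('data', function (record) { records.push(record); })
    .on('end', function () {
        console.log('read ' + records.length + ' records');
    });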
If you want to open a file, the following code, found here: https://www.npmjs.com/package/node-avro-io, will do the trick:
var DataFile = require("node-avro-io").DataFile;
var avro = DataFile.AvroFile();
var reader = avro.open('test.avro', { flags: 'r' });
reader.on('data', function(data) {
    console.log(data);
});
