Get md5 checksums of entries in zip using adm-zip - node.js

I am trying to get MD5 checksums for all files in a ZIP file. I am currently using adm-zip because I read that it can read zip contents into memory without having to extract any files to disk. But I am failing to read the data of the entries in the ZIP file. My code is as follows:
var zip = new AdmZip(path);
zip.getEntries()
  .map(entry => { console.log(entry.entryName, entry.data); });
The entryName can be read, so opening and reading the zip works. But data is always undefined. I read that data is not actually the property that holds an entry's contents, but I am not sure how to read them instead.

To read the data of an entry, you must call the getData() method of the entry object, which returns a Buffer. Here is the updated code snippet, which works on my end:
var zip = new AdmZip(path);
zip.getEntries().map(entry => {
  const md5Hash = crypto.createHash('md5').update(entry.getData()).digest('hex');
  console.log(md5Hash);
});
I used the basic crypto module to produce the md5 hash (in hex format). Don't forget to add it to the list of your requires at the top of your file: const crypto = require('crypto');
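Putting the pieces together, a complete version might look like the sketch below. The zip path is a placeholder, and the isDirectory check is an addition of mine so that directory entries (which have no data) are skipped; everything else follows the snippet above.

const AdmZip = require('adm-zip');
const crypto = require('crypto');

const path = './archive.zip'; // placeholder path to your zip file
const zip = new AdmZip(path);

zip.getEntries().forEach(entry => {
  if (entry.isDirectory) return; // directory entries have no data to hash
  const md5Hash = crypto.createHash('md5').update(entry.getData()).digest('hex');
  console.log(entry.entryName, md5Hash);
});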

Related

adm-zip not adding all files

I'm noticing a strange behavior while using this library. I'm trying to compress multiple EML files; to do so, I first convert them to buffers and add them to the adm-zip instance using the addFile() method. Here's my code:
const zip = new AdmZip();

assetBodies.forEach((body) => {
  // emlData to buffer
  let emlBuffer = Buffer.from(body);
  zip.addFile(`${new Date().getTime()}.eml`, emlBuffer);
});

zip.getEntries().forEach((entry) => {
  console.log("entry name", entry.entryName);
});

const willSendthis = zip.toBuffer();
The problem is that sometimes it compresses all the files and sometimes it doesn't.
For example, I received 5 items in the assetBodies array, but when I log the entries of the zip file I only see 1 or 2, sometimes 5.
Am I missing something, or is there an issue with the library?
EDIT:
It's worth mentioning that some of the files are quite large in terms of text, so I wonder if that could be the issue.
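One thing worth checking, though it is not confirmed in the post: the loop above is synchronous, so `new Date().getTime()` can return the same millisecond for several iterations, which means several entries get the same name and collide inside the archive. A minimal sketch of that hypothesis, using the array index to keep entry names unique:

const zip = new AdmZip();

assetBodies.forEach((body, index) => {
  const emlBuffer = Buffer.from(body);
  // include the index so entry names never collide, even within the same millisecond
  zip.addFile(`${Date.now()}-${index}.eml`, emlBuffer);
});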

How to generate zip and put files into zip using stream in Node js

I am trying to create a zip file (or any other compressed format) containing a few files that I want to put into it.
I thought it would work with the adm-zip module, but I found out that adm-zip puts files into the zip by buffering them in memory.
That takes a lot of memory when the files are very large, and as a result my server stopped working.
Below is what I did:
var zip = new AdmZip();
zip.addLocalFile('../largeFile', 'dir1'); //put largeFile into /dir1 of zip
zip.addLocalFile('../largeFile2', 'dir1');
zip.addLocalFile('../largeFile3', 'dir1/dir2');
zip.writeZip(/*target file name*/ `./${threadId}.zip`);
Is there any way to solve this problem?
To solve the memory issue, the best practice is to use streams rather than loading whole files into memory. For example:
import {
  createReadStream,
  createWriteStream
} from 'fs'
import { createGzip } from 'zlib'

const [, , src, dest] = process.argv

const srcStream = createReadStream(src)
const gzipStream = createGzip()
const destStream = createWriteStream(dest)

srcStream
  .pipe(gzipStream)
  .pipe(destStream)
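Note that the example above gzips a single file into a .gz stream; it does not produce a multi-entry .zip archive. If an actual zip with several files is needed without buffering them in memory, one option (an assumption on my part, not something the original answer uses) is a streaming archive library such as archiver, which pipes file streams straight into the output:

import { createWriteStream } from 'fs'
import archiver from 'archiver' // assumed extra dependency

const output = createWriteStream('./output.zip') // placeholder output path
const archive = archiver('zip', { zlib: { level: 9 } })

archive.on('error', err => { throw err })
archive.pipe(output)

// the files are streamed from disk, so they are never held in memory in full
archive.file('../largeFile', { name: 'dir1/largeFile' })
archive.file('../largeFile2', { name: 'dir1/largeFile2' })
archive.file('../largeFile3', { name: 'dir1/dir2/largeFile3' })

archive.finalize()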

Unable to use variables in fs functions when using brfs

I use browserify in order to be able to use require. To use fs functions with browserify I need to transform the bundle with brfs, but as far as I understand that means I can only pass static strings as arguments to my fs functions. I want to be able to use variables instead.
I want to search for XML files in a specific directory and read them, either by searching via a text field or by showing all of their data at once. To do this I need fs, and browserify so that I can require it.
const FS = require('fs')

function lookForRoom() {
  let files = getFileNames()
  findSearchedRoom(files)
}

function getFileNames() {
  return FS.readdirSync('../data/')
}

function findSearchedRoom(files) {
  const SEARCH_FIELD_ID = 'room'
  let searchText = document.getElementById(SEARCH_FIELD_ID).value
  files.forEach((file) => {
    const SEARCHTEXT_FOUND = file.includes(searchText.toLowerCase())
    if (SEARCHTEXT_FOUND) loadXML(file)
  })
}

function loadXML(file) {
  const XML2JS = require('xml2js')
  let parser = new XML2JS.Parser()
  let data = FS.readFile('../data/' + file)
  console.dir(data);
}

module.exports = { lookForRoom: lookForRoom }
I want to be able to read the contents of a directory containing XML files.
Currently I can only do so when I provide a constant string to the fs function.
The brfs README contains this gotcha:
Since brfs evaluates your source code statically, you can't use dynamic expressions that need to be evaluated at run time.
So, basically, you can't use brfs in the way you were hoping.
I want to be able to read contents out of a directory containing xml files
If by "a directory" you mean "any random directory, the name of which is determined by some form input", then that's not going to work. Browsers don't have direct access to directory contents, either locally or on a server.
You're not saying where that directory exists. If it's local (on the machine the browser is running on), I don't think there are standardized APIs to do that at all.
If it's on the server, then you need to implement an HTTP server that accepts a directory/file name from the clientside code and returns the file contents that way.
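To illustrate that last option, here is a minimal sketch of such a server-side endpoint, assuming Express and a ./data directory next to the server script; the route names and directory are hypothetical, not something from the original question:

const express = require('express')
const fs = require('fs')
const path = require('path')

const app = express()
const DATA_DIR = path.join(__dirname, 'data') // hypothetical location of the XML files

// list the available XML files
app.get('/rooms', (req, res) => {
  fs.readdir(DATA_DIR, (err, files) => {
    if (err) return res.status(500).end()
    res.json(files.filter(f => f.endsWith('.xml')))
  })
})

// return the contents of one file; path.basename guards against '../' tricks
app.get('/rooms/:file', (req, res) => {
  const file = path.basename(req.params.file)
  fs.readFile(path.join(DATA_DIR, file), 'utf8', (err, xml) => {
    if (err) return res.status(404).end()
    res.type('application/xml').send(xml)
  })
})

app.listen(3000)

The browser code would then fetch('/rooms') to get the file list and fetch('/rooms/' + name) to load one file, instead of calling fs directly.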

gunzip partials read from read-stream

I use Node.JS to fetch files from my S3 bucket.
The files over there are gzipped (gz).
I know that the contents of each file are composed of lines, where each line is a JSON representation of some record that failed to be put on Kinesis.
Each file consists of ~12K such records, and I would like to be able to process the records while the file is being downloaded.
If the file was not gzipped, that could be easily done using streams and readline module.
So the only thing stopping me from doing this is the gunzip process which, to my knowledge, needs to be executed on the whole file.
Is there any way of gunzipping a partial of a file?
Thanks.
EDIT 1: (bad example)
Trying what @Mark Adler suggested:
const fileStream = s3.getObject(params).createReadStream();
const lineReader = readline.createInterface({ input: fileStream });

lineReader.on('line', line => {
  const gunzipped = zlib.gunzipSync(line);
  console.log(gunzipped);
});
I get the following error:
Error: incorrect header check
at Zlib._handle.onerror (zlib.js:363:17)
Yes. node.js has a complete interface to zlib, which allows you to decompress as much of a gzip file at a time as you like.
A working example that solves the problem in the code above:
const fileStream = s3.getObject(params).createReadStream().pipe(zlib.createGunzip());
const lineReader = readline.createInterface({ input: fileStream });

lineReader.on('line', gunzippedLine => {
  console.log(gunzippedLine);
});
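For completeness, a self-contained sketch of the same idea follows. The aws-sdk v2 client and the bucket/key names are assumptions (the question only shows s3.getObject(params)); the JSON.parse per line matches the question's description of the file contents.

const AWS = require('aws-sdk')       // assumed: aws-sdk v2, as implied by getObject().createReadStream()
const zlib = require('zlib')
const readline = require('readline')

const s3 = new AWS.S3()
const params = { Bucket: 'my-bucket', Key: 'failed-records.gz' } // hypothetical bucket and key

const fileStream = s3.getObject(params)
  .createReadStream()
  .pipe(zlib.createGunzip())         // decompress incrementally as chunks arrive

const lineReader = readline.createInterface({ input: fileStream })

lineReader.on('line', line => {
  const record = JSON.parse(line)    // each line is one JSON record, per the question
  // process the record here, while the download is still in progress
  console.log(record)
})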

Node.js import csv with blank fields

I'm trying to import & parse a CSV file using the csv-parse package, but having difficulty with requiring the CSV file in the first place.
When I do input = require('../../path-to-my-csv-file')
I get an error due to consecutive commas because some fields are blank:
e","17110","CTSZ16","Slitzerâ„¢ 16pc Cutlery Set in Wood Block",,"Spice up
^
SyntaxError: Unexpected token ,
How do I import the CSV file into the node environment to begin with?
Package examples are here.
To solve your first problem, reading a CSV with empty entries:
Use the 'fast-csv' node package. It will parse CSV with empty entries.
To answer your second question, how to import a CSV into node:
You don't really "import" CSV files into node. You should fs.open the file or use fs.createReadStream to read the CSV file at the appropriate location.
Below is a script that uses fs.createReadStream to parse a CSV called 'test.csv' that is one directory up from the script that is running it.
The first section sets up our program and makes basic declarations of the objects we're going to use to store our parsed list.
var csv = require('fast-csv') // require fast-csv module
var fs = require('fs') // require the fs, filesystem module
var uniqueindex = 0 // just an index for our array
var dataJSON = {} // our JSON object, (make it an array if you wish)
This next section declares a stream that will intercept data as it's read from our CSV file and do stuff to it. In this case we're intercepting the data and storing it in a JSON object and then saving that JSON object once the stream is done. It's basically a filter that intercepts data and can do what it wants with it.
var csvStream = csv()                       // uses the fast-csv module to create a csv parser
  .on('data', function (data) {             // when we get data, perform function(data)
    dataJSON[uniqueindex] = data            // store our data in the JSON object dataJSON
    uniqueindex++                           // the index of the data item in our array
  })
  .on('end', function () {                  // when the data stream ends, perform function()
    console.log(dataJSON)                   // log our whole object to the console
    fs.writeFile('../test.json',            // use the fs module to write a file
      JSON.stringify(dataJSON, null, 4),    // turn our JSON object into a string that can be written
      function (err) {                      // this callback runs once we're done saving the file; err will be null if there is no error
        if (err) throw err                  // if there's an error while saving the file, throw it
        console.log('data saved as JSON yay!')
      })
  })
This section creates what is called a "readStream" from our CSV file. The path to the file is relative. A stream is just a way of reading a file. It's pretty powerful, though, because the data from a stream can be piped into another stream.
So we'll create a stream that reads the data from our CSV file, and then we'll pipe it into our pre-defined stream/filter from section 2.
var stream = fs.createReadStream('../test.csv')
stream.pipe(csvStream)
This will create a file called 'test.json' one directory up from the place where our csv parsing script is. test.json will contain the parsed CSV list inside a JSON object. The order in which the code appears here is how it should appear in a script you make.
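If you would rather stick with the csv-parse package mentioned in the question, the same streaming pattern applies. The sketch below assumes csv-parse v5, which exports { parse }; older versions export the parser function directly, so adjust the require to match your installed version:

const fs = require('fs')
const { parse } = require('csv-parse')          // csv-parse v5; older versions: require('csv-parse') directly

fs.createReadStream('../test.csv')
  .pipe(parse({ relax_column_count: true }))    // tolerate rows with missing or extra fields
  .on('data', row => {
    console.log(row)                            // each row arrives as an array of strings; empty fields are just ''
  })
  .on('end', () => console.log('done parsing'))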
