Get contents of in-memory zip archive without saving the zip - node.js

I'm getting a ZIP archive from S3 using the AWS S3 Node SDK.
In this zip file there is a single .json file whose contents I want to read. I don't want to save the file to storage; I only want to read the contents of the zip in memory.
Example:
File.zip contains a single file:
file.json, with contents {"value":"abcd"}
I currently have:
const { S3Client, GetObjectCommand } = require("@aws-sdk/client-s3");
const s3Client = new S3Client({ region: 'eu-central-1' });
const file = await s3Client.send(new GetObjectCommand({ Bucket: 'MyBucket', Key: 'file.zip' }));
file.Body now contains a Readable stream with the contents of the zip file. I now want to turn this Readable stream into {"value":"abcd"}.
Is there a library or piece of code that can help me do this and produce the result without having to save the file to disk?

You could use the zlib module, which is built into Node.js (the archiver package, by contrast, is for creating archives rather than extracting them).
A snippet from part of my project looks like this:
import { unzipSync } from 'zlib';
// Fetch the data and get it as a buffer
const res = await fetch('url');
const zipBuffer = await res.arrayBuffer();
// Unzip the data and convert it to utf8
const unzippedBuffer = unzipSync(zipBuffer);
const fileData = unzippedBuffer.toString('utf8');
Now fileData is the content of your compressed file as a string; you can use JSON.parse(fileData) to get the content as JSON and work with it.
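Note that zlib's unzipSync decompresses gzip/deflate/zlib streams; it can't read a .zip archive (a container holding named entries), as one of the related answers below also points out. For the original question, one option is a zip-aware package such as adm-zip, which can read an entry straight from an in-memory buffer. A minimal sketch, assuming adm-zip is installed and the archive holds file.json:
const { S3Client, GetObjectCommand } = require('@aws-sdk/client-s3');
const AdmZip = require('adm-zip');

const s3Client = new S3Client({ region: 'eu-central-1' });

async function readJsonFromZip() {
  const file = await s3Client.send(new GetObjectCommand({ Bucket: 'MyBucket', Key: 'file.zip' }));

  // Collect the Body stream into a single Buffer (nothing is written to disk)
  const chunks = [];
  for await (const chunk of file.Body) {
    chunks.push(chunk);
  }
  const zipBuffer = Buffer.concat(chunks);

  // Read the single entry directly from the in-memory buffer
  const zip = new AdmZip(zipBuffer);
  return JSON.parse(zip.readAsText('file.json')); // { value: 'abcd' }
}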

Related

Node Zlib. Unzip response with file structure in-memory

I'm receiving a Buffer (data) from a response; the data is a zipped file.
// Save it just to check if content is correct
const fd = fs.openSync('data.zip', 'w')
fs.writeSync(fd, data)
fs.closeSync(fd)
Produces file data.zip, which contains file foo.csv.
I can unzip it with UZIP in-memory with:
const unzipArray = UZIP.parse(data)['foo.csv']
However, I cannot do it with Zlib.
const unzipArray = zlib.unzipSync(data)
// Raises: incorrect header check
It looks like Zlib cannot parse the file structure.
How to unzip the above buffer in-memory with Zlib, without saving files to the filesystem?
You have a zip file, not a zlib or gzip stream. As you found, zlib doesn't process zip files. There are many solutions out there for Node.js, which you can find using your friend Google. Here is one.
For a single file:
const fs = require('fs');
const zlib = require('zlib');
const fileContents = fs.createReadStream('./data/file1.txt.gz');
const writeStream = fs.createWriteStream('./data/file1.txt');
const unzip = zlib.createGunzip();
fileContents.pipe(unzip).pipe(writeStream);
For a group of files:
const fs = require('fs');
const zlib = require('zlib');
const directoryFiles = fs.readdirSync('./data');
directoryFiles.forEach(filename => {
  const fileContents = fs.createReadStream(`./data/${filename}`);
  const writeStream = fs.createWriteStream(`./data/${filename.slice(0, -3)}`);
  const unzip = zlib.createGunzip();
  fileContents.pipe(unzip).pipe(writeStream);
});
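For the in-memory case asked about here (a real .zip containing foo.csv), zlib alone won't do it; a zip-aware package is needed. A rough sketch using adm-zip, assuming it is installed, that mirrors the UZIP call above:
const AdmZip = require('adm-zip');

// data is the zip file as a Buffer, exactly as received in the response
const zip = new AdmZip(data);
const unzipArray = zip.getEntry('foo.csv').getData(); // Buffer with the contents of foo.csv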

Create a read stream for a pdf file to upload to s3 bucket

I have an Express service that takes a PDF file from my front-end and saves it to an S3 bucket. I'm running into issues trying to take the file and create a stream from it so that I can then pass it to the S3 upload function. I'm trying to avoid writing the file to disk, so I don't think I can use fs.createReadStream(), but I can't seem to find an alternative way to do it.
router.post('/upload', upload.single('my-pdf'), async (req, res, next) => {
  const file = req.file;
  // Needs a file path, not an actual file
  const stream = fs.createReadStream(file);
  return s3.upload(file).promise();
});
Any help or advice on how to get around this would be greatly appreciated.
Assuming that req.file.<name_of_upload_field> is a buffer holding the file contents, you can convert that to a readable stream via
var str = new stream.PassThrough();
str.end(req.file.<name_of_upload_field>);
return s3.upload(str).promise();
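For completeness: with multer's memory storage the uploaded bytes are available at req.file.buffer, and s3.upload() in SDK v2 takes a params object whose Body can be a Buffer or a stream, so no temporary file is needed. A minimal sketch, assuming a bucket named 'my-bucket':
const express = require('express');
const multer = require('multer');
const AWS = require('aws-sdk');

const router = express.Router();
const upload = multer({ storage: multer.memoryStorage() });
const s3 = new AWS.S3();

router.post('/upload', upload.single('my-pdf'), async (req, res, next) => {
  try {
    const result = await s3.upload({
      Bucket: 'my-bucket',              // assumed bucket name
      Key: req.file.originalname,
      Body: req.file.buffer,            // a Buffer works directly as Body
      ContentType: 'application/pdf'
    }).promise();
    res.json({ location: result.Location });
  } catch (err) {
    next(err);
  }
});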

Read all JSON files contained in a dynamically updated folder

I've got multiple JSON files contained within a directory that will be dynamically updated by users. The users can add categories, which will create new JSON files in that directory, and they can also remove categories, which will delete JSON files in that directory. I'm looking for a method to read all the JSON files contained in that directory and push their contents into a single array of objects. I imagine doing this asynchronously would be desirable too.
I'm very new to using fs. I've managed to read a single JSON file by its path using
const fs = require('fs');
let data = fs.readFileSync('./sw_lbi/categories/category1.json');
let categories = JSON.parse(data);
console.log(categories);
But of course this will only solve the synchronous issue when using require()
As I'll have no idea what json files will be contained in the directory because the users will also name them, I'll need a way to read all the json files by simply calling the folder directory which contains them.
I'm imagining something like this (which obviously is foolish)
const fs = require('fs');
let data = fs.readFileSync('./sw_lbi/categories');
let categories = JSON.parse(data);
console.log(categories);
What would be the best approach to achieve this?
Thanks in advance.
First of all, you need to scan the directory for files; next, filter them to keep only the JSON files; and finally, read every file and do what you need to do with it:
const fs = require('fs');
const path = require('path');

const jsonsInDir = fs.readdirSync('./sw_lbi/categories').filter(file => path.extname(file) === '.json');

jsonsInDir.forEach(file => {
  const fileData = fs.readFileSync(path.join('./sw_lbi/categories', file));
  const json = JSON.parse(fileData.toString());
  // do whatever you need with the parsed json here, e.g. push it into an array
});
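Since the question also asks for an asynchronous approach and for collecting everything into a single array, here is a sketch of the same idea with the promise-based fs API, assuming the same ./sw_lbi/categories directory:
const fs = require('fs/promises');
const path = require('path');

async function readAllCategories(dir = './sw_lbi/categories') {
  const files = await fs.readdir(dir);
  const jsonFiles = files.filter(file => path.extname(file) === '.json');

  // Read and parse every file concurrently, collecting the results into one array
  const categories = await Promise.all(
    jsonFiles.map(async file => {
      const data = await fs.readFile(path.join(dir, file), 'utf8');
      return JSON.parse(data);
    })
  );
  return categories;
}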

How do you add a header to a wav file?

I am sending audio data stored as a blob to my backend (Node/Express). When I save the file as .wav and attempt to use it in the SpeechRecognition package in Python, it throws an error saying the "file does not start with RIFF id". So how can I add the headers to my blob file before I save it, so that it is a correctly formatted .wav file? I can provide the code if necessary.
node.js file
var multer = require('multer');
var fs = require('fs'); // use the file system so we can save files
var uniqid = require('uniqid');
var spawn = require('child_process').spawn;

const storage = multer.memoryStorage();
var upload = multer({ storage: storage });

router.post('/api/test', upload.single('upl'), function (req, res) {
  console.log(req.file);
  console.log(req.file.buffer);
  var id = uniqid();
  fs.writeFileSync(id + ".wav", Buffer.from(new Uint8Array(req.file.buffer))); // write file to server as .wav file
  const scriptPath = 'handleAudio.py';
  const process = spawn('python3', [__dirname + "/../" + scriptPath, "/home/bitnami/projects/sample/" + id + ".wav", req.file.originalname, 'True']); // throws error about header in .wav
});
Also, I had this same example working with a PHP endpoint that just saved the blob to a file with a .wav extension, and the Python script accepted it. What could be different between move_uploaded_file in PHP and what I am doing above with Node?
Every .wav file needs a header specified by the WAVE file format, available here. While it's fine for you to build the header yourself, it's much easier to just use a proper lib to do the work for you.
One example is node-wav, which has a nice API to write WAVE files from raw PCM data (what you have at the moment). Example code is provided by the node-wav documentation.
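As an illustration, node-wav's encode() builds the RIFF/WAVE header for you and returns a Buffer ready to write. A rough sketch, assuming the uploaded data is raw mono PCM float samples at 44.1 kHz (rawSamples is a placeholder for those samples; id comes from the question's code):
const wav = require('node-wav');
const fs = require('fs');

// channelData is an array with one Float32Array per channel (mono here)
const channelData = [Float32Array.from(rawSamples)];

// encode() prepends the RIFF/WAVE header and returns a Buffer
const wavBuffer = wav.encode(channelData, { sampleRate: 44100, float: true, bitDepth: 32 });
fs.writeFileSync(id + '.wav', wavBuffer);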

File read from angularjs and convert base64 and push into gitlab

I need to read multiple zip files and display them in AngularJS, then convert those files to base64 in Node.js and push them into GitLab. Please suggest whether this is possible in Node.js. Is there any blog available for reference?
Use the fs module of Node.js to read the files from the directory:
const testFolder = './tests/';
const fs = require('fs');
fs.readdirSync(testFolder).forEach(file => {
  console.log(file);
});
Once you get the files, you can convert them to base64:
function base64_encode(file) {
  // read binary data
  var bitmap = fs.readFileSync(file);
  // convert binary data to a base64-encoded string (Buffer.from replaces the deprecated new Buffer)
  return Buffer.from(bitmap).toString('base64');
}
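To then push the base64 content into GitLab, one option is the Repository Files API. A rough sketch, assuming a project ID, a target branch, a personal access token in GITLAB_TOKEN, and Node 18+ for the built-in fetch:
async function pushFileToGitlab(filePath, base64Content) {
  const projectId = '12345'; // assumed project ID
  const url = `https://gitlab.com/api/v4/projects/${projectId}/repository/files/${encodeURIComponent(filePath)}`;

  const response = await fetch(url, {
    method: 'POST', // POST creates a new file; PUT updates an existing one
    headers: {
      'PRIVATE-TOKEN': process.env.GITLAB_TOKEN,
      'Content-Type': 'application/json'
    },
    body: JSON.stringify({
      branch: 'main',
      content: base64Content,
      encoding: 'base64',
      commit_message: 'Add uploaded file'
    })
  });
  return response.json();
}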
