Node.js: Convert Base64 String into Binary and Write to MongoDB GridFS

I have a Base64 string that I am converting to binary like this:
var b64string = req.body.image.substr(23); // strip the data-URL prefix from the Base64 string
var buf = new Buffer(b64string, 'base64');
I need to insert this into MongoDB GridFS. The problem I am having is that createReadStream requires a file path, whereas I already have the file in memory.
This is what I am trying, and it does not work:
var grid = new gfs(db, mongo, 'files');
grid.createWriteStream(options, function (err, ws) {
  fs.createReadStream(buf, {autoClose: true})
    .pipe(ws)
    .on('close', function (f) {
      console.log(f._id)
      res.send(f._id)
    })
    .on('error', function (err) {
      console.log(err)
    })
})
But as described, createReadStream wants a file path, while all I have is buf in memory.
UPDATE ---
I was overthinking it...
This works:
var b64string = req.body.image.substr(23);
var buf = new Buffer(b64string, 'base64');
var grid = new Grid(db, 'files');
grid.put(buf, {}, function (err, file) {})
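For reference, with the current official MongoDB driver the same write is usually done through GridFSBucket rather than gridfs-stream. A minimal sketch, assuming an open db handle and the buf from above (the filename here is made up):
const { GridFSBucket } = require('mongodb');

const bucket = new GridFSBucket(db, { bucketName: 'files' });
const uploadStream = bucket.openUploadStream('image.png'); // hypothetical filename
uploadStream.end(buf); // write the whole in-memory Buffer and close the stream
uploadStream.on('finish', () => res.send(uploadStream.id));
uploadStream.on('error', (err) => res.status(500).send(err.message));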

Related

NodeJS: Unable to convert stream/buffer to base64 string

I need to create a base64 string that I need to send to a third-party API. I have the stream and buffer. From the stream I am able to create an image, so there is no way the stream is corrupted. Here are the two variables:
var newJpeg = new Buffer(newData, "binary");
var fs = require('fs');
let Duplex = require('stream').Duplex;
let _updatedFileStream = new Duplex();
_updatedFileStream.push(newJpeg);
_updatedFileStream.push(null);
No matter what I try, I cannot convert either of them to a base64 string.
_updatedFileStream.toString('base64');
Buffer(newJpeg, 'base64');
Buffer(newData, 'base64');
None of the above works. Sometimes I get Uint8Array[arraySize] or a gibberish string. What am I doing wrong?
Example using promises (but could easily be adapted to other approaches):
return new Promise((resolve, reject) => {
  let buffers = [];
  let myStream = <...>;
  myStream.on('data', (chunk) => { buffers.push(chunk); });
  myStream.once('end', () => {
    let buffer = Buffer.concat(buffers);
    resolve(buffer.toString('base64'));
  });
  myStream.once('error', (err) => {
    reject(err);
  });
});
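As a side note on the original question: newJpeg is already a Buffer, so for that variable no stream plumbing is needed at all; the only assumption here is that newData really holds the raw binary image bytes.
// Buffer#toString('base64') encodes the bytes already in the Buffer;
// passing 'base64' to the Buffer constructor would instead try to *decode* base64 input,
// which is the mix-up in the snippets above.
var base64String = newJpeg.toString('base64');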

Get PDFKit as base64 string

I'm searching for a way to get the base64 string representation of a PDFKit document. I can't find the right way to do it...
Something like this would be extremely convenient.
var doc = new PDFDocument();
doc.addPage();
doc.outputBase64(function (err, pdfAsText) {
console.log('Base64 PDF representation', pdfAsText);
});
I already tried the blob-stream lib, but it doesn't work on a Node server (it says that Blob doesn't exist).
Thanks for your help!
I was in a similar predicament, wanting to generate PDF on the fly without having temporary files lying around. My context is a NodeJS API layer (using Express) which is interacted with via a React frontend.
Ironically, a similar discussion for Meteor helped me get to where I needed. Based on that, my solution resembles:
const PDFDocument = require('pdfkit');
const { Base64Encode } = require('base64-stream');
// ...
var doc = new PDFDocument();
// write to PDF
var finalString = ''; // contains the base64 string
var stream = doc.pipe(new Base64Encode());
doc.end(); // will trigger the stream to end
stream.on('data', function(chunk) {
  finalString += chunk;
});
stream.on('end', function() {
  // the stream is at its end, so push the resulting base64 string to the response
  res.json(finalString);
});
A synchronous option not (yet) present in the documentation:
const doc = new PDFDocument();
doc.text("Sample text", 100, 100);
doc.end();
const data = doc.read();
console.log(data.toString("base64"));
I just made a module for this that you could probably use: js-base64-file.
const Base64File=require('js-base64-file');
const b64PDF=new Base64File;
const file='yourPDF.pdf';
const path=`${__dirname}/path/to/pdf/`;
const doc = new PDFDocument();
doc.addPage();
//save your PDF using the filename and path
//this will load and convert
const data = b64PDF.loadSync(path, file);
console.log('Base64 PDF representation', data);
//you could also save a copy as base64 if you wanted, like so:
b64PDF.save(data, path, `copy-b64-${file}`);
It's a new module so my documentation isn't complete yet, but there is also an async method.
//this will load and convert if needed, asynchronously
b64PDF.load(
  path,
  file,
  function (err, base64) {
    if (err) {
      //handle error here
      process.exit(1);
    }
    console.log('ASYNC: you could send this PDF via ws or http to the browser now\n');
    //or as above, save it here
    b64PDF.save(base64, path, `copy-async-${file}`);
  }
);
I suppose I could add a convert-from-memory method too. If this doesn't suit your needs, you could submit a request on the js-base64-file repo.
Following Grant's answer, here is an alternative that returns a promise instead of writing to the response (to ease calling it outside of a router):
const PDFDocument = require('pdfkit');
const {Base64Encode} = require('base64-stream');
const toBase64 = doc => {
  return new Promise((resolve, reject) => {
    try {
      const stream = doc.pipe(new Base64Encode());
      let base64Value = '';
      stream.on('data', chunk => {
        base64Value += chunk;
      });
      stream.on('end', () => {
        resolve(base64Value);
      });
    } catch (e) {
      reject(e);
    }
  });
};
The caller should invoke doc.end() before or after calling this async method.
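For example, a caller might look like this (a sketch; the document content and the response handling are made up):
const doc = new PDFDocument();
doc.text('Hello world');

const base64Promise = toBase64(doc); // the pipe to Base64Encode is set up here
doc.end();                           // ending the document ends the piped stream

base64Promise
  .then(base64 => res.json({ pdf: base64 }))
  .catch(next);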

NodeJS: Merge two PDF files into one using the buffer obtained by reading them

I am using the fill-pdf npm module for filling template PDFs; it creates a new file which is read from disk and returned as a buffer to the callback. I have two files for which I do the same operation. I want to combine the two buffers, thereby forming a single PDF file which I can send back to the client. I tried different methods of buffer concatenation. The buffers can be concatenated using Buffer.concat, like:
var newBuffer = Buffer.concat([result_pdf.output, result_pdf_new.output]);
The size of the new buffer is the sum of the sizes of the input buffers. But when newBuffer is sent to the client as the response, it shows only the file mentioned last in the array.
res.type("application/pdf");
return res.send(buffer);
Any ideas?
As mentioned by @MechaCode, the creator has ended support for HummusJS.
So I would like to give you 2 solutions.
Using the node-pdftk npm module
The following sample code uses the node-pdftk npm module to combine two PDF buffers seamlessly.
const pdftk = require('node-pdftk');
const fs = require('fs');

var pdfBuffer1 = fs.readFileSync("./pdf1.pdf");
var pdfBuffer2 = fs.readFileSync("./pdf2.pdf");

pdftk
  .input([pdfBuffer1, pdfBuffer2])
  .output()
  .then(buf => {
    let path = 'merged.pdf';
    fs.open(path, 'w', function (err, fd) {
      fs.write(fd, buf, 0, buf.length, null, function (err) {
        fs.close(fd, function () {
          console.log('wrote the file successfully');
        });
      });
    });
  });
The requirement for the node-pdftk npm module is that you need to install the PDFtk library. Some of you may find this an overhead or tedious, so I have another solution using the pdf-lib library.
Using the pdf-lib npm module
const PDFDocument = require('pdf-lib').PDFDocument;
const fs = require('fs');

// await is only valid inside an async function, so wrap the merge in one
(async () => {
  var pdfBuffer1 = fs.readFileSync("./pdf1.pdf");
  var pdfBuffer2 = fs.readFileSync("./pdf2.pdf");
  var pdfsToMerge = [pdfBuffer1, pdfBuffer2];

  const mergedPdf = await PDFDocument.create();
  for (const pdfBytes of pdfsToMerge) {
    const pdf = await PDFDocument.load(pdfBytes);
    const copiedPages = await mergedPdf.copyPages(pdf, pdf.getPageIndices());
    copiedPages.forEach((page) => {
      mergedPdf.addPage(page);
    });
  }

  const buf = await mergedPdf.save(); // Uint8Array
  let path = 'merged.pdf';
  fs.open(path, 'w', function (err, fd) {
    fs.write(fd, buf, 0, buf.length, null, function (err) {
      fs.close(fd, function () {
        console.log('wrote the file successfully');
      });
    });
  });
})();
Personally, I prefer to use the pdf-lib npm module.
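If the goal is, as in the question, to send the merged document straight back to the client rather than writing it to disk, the tail of the pdf-lib example can be replaced by something like this (a sketch assuming an Express res is in scope):
// pdf-lib's save() resolves with a Uint8Array, so wrap it in a Buffer before sending
res.type('application/pdf');
res.send(Buffer.from(buf));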
HummusJS supports combining PDFs using its appendPDFPagesFromPDF method
Example using streams to work with buffers:
const hummus = require('hummus');
const memoryStreams = require('memory-streams');
/**
 * Concatenate two PDFs in Buffers
 * @param {Buffer} firstBuffer
 * @param {Buffer} secondBuffer
 * @returns {Buffer} - a Buffer containing the concatenated PDFs
 */
const combinePDFBuffers = (firstBuffer, secondBuffer) => {
  var outStream = new memoryStreams.WritableStream();

  try {
    var firstPDFStream = new hummus.PDFRStreamForBuffer(firstBuffer);
    var secondPDFStream = new hummus.PDFRStreamForBuffer(secondBuffer);

    var pdfWriter = hummus.createWriterToModify(firstPDFStream, new hummus.PDFStreamForResponse(outStream));
    pdfWriter.appendPDFPagesFromPDF(secondPDFStream);
    pdfWriter.end();
    var newBuffer = outStream.toBuffer();
    outStream.end();

    return newBuffer;
  }
  catch (e) {
    outStream.end();
    throw new Error('Error during PDF combination: ' + e.message);
  }
};
combinePDFBuffers(PDFBuffer1, PDFBuffer2);
Here's what we use in our Express server to merge a list of PDF blobs.
const { PDFRStreamForBuffer, createWriterToModify, PDFStreamForResponse } = require('hummus');
const { WritableStream } = require('memory-streams');
// Merge the pages of the pdfBlobs (Javascript buffers) into a single PDF blob
const mergePdfs = pdfBlobs => {
  if (pdfBlobs.length === 0) throw new Error('mergePdfs called with empty list of PDF blobs');
  // This optimization is not necessary, but it avoids the churn down below
  if (pdfBlobs.length === 1) return pdfBlobs[0];

  // Adapted from: https://stackoverflow.com/questions/36766234/nodejs-merge-two-pdf-files-into-one-using-the-buffer-obtained-by-reading-them?answertab=active#tab-top
  // Hummus is useful, but with poor interfaces -- e.g. createWriterToModify shouldn't require any PDF stream
  // And Hummus has many issues: https://github.com/galkahana/HummusJS/issues
  const [firstPdfRStream, ...restPdfRStreams] = pdfBlobs.map(pdfBlob => new PDFRStreamForBuffer(pdfBlob));
  const outStream = new WritableStream();
  const pdfWriter = createWriterToModify(firstPdfRStream, new PDFStreamForResponse(outStream));
  restPdfRStreams.forEach(pdfRStream => pdfWriter.appendPDFPagesFromPDF(pdfRStream));
  pdfWriter.end();
  outStream.end();
  return outStream.toBuffer();
};

module.exports = exports = {
  mergePdfs,
};
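A hypothetical Express route using the helper above (the module path, route, and file names are assumptions):
const fs = require('fs');
const { mergePdfs } = require('./mergePdfs');

app.post('/merged-pdf', (req, res) => {
  // mergePdfs is synchronous and returns a single Buffer
  const merged = mergePdfs([
    fs.readFileSync('./pdf1.pdf'),
    fs.readFileSync('./pdf2.pdf'),
  ]);
  res.type('application/pdf');
  res.send(merged);
});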

In which format should a file be sent to save it in GridFS?

Hi everyone,
I'm working on a project using Node.js, and I would like to know in which format my Node client must send the file to the server (is it base64 or something else?).
My client is:
//client.js
$('#file').on('change', function (e) {
  encode64(this);
});

function encode64(input) {
  if (input.files) {
    chap.emit('test', { "test": input.files[0] });
    var FR = new FileReader();
    FR.readAsDataURL(input.files[0]);
    FR.onload = function (e) {
      chap.emit('test', { "test": e.target.result });
    }
  }
}
My server side is:
socket.on('test', function (e) {
  var gs = new gridStore(db, e.test, "w");
  gs.writeFile(new Buffer(e.test, "base64"), function (err, calb) {
    if (!err)
      console.log('went well');
    else
      console.log('error');
  });
});
But this doesn't work; I get this error:
TypeError: Bad argument
at Object.fs.fstat (fs.js:667:11)
Could anyone help me?
Normally this is how you store into GridFS. I have used it to store files; hope it works.
var fs = require('fs');
var multiparty = require('multiparty');
var Grid = require('gridfs-stream');
var gfs = Grid(db, mongo); // assumes an open `db` handle and the mongodb driver module

var form = new multiparty.Form();
form.parse(req, function (err, fields, files) {
  var file = files.file[0];
  var filename = file.originalFilename; // filename
  var contentType = file.headers['content-type'];
  console.log(files);
  var tmpPath = file.path; // temporary path
  var writestream = gfs.createWriteStream({ filename: filename, content_type: contentType });
  // open a stream to the temporary file created by Express...
  fs.createReadStream(tmpPath)
    // and pipe it to gfs
    .pipe(writestream);
  writestream.on('close', function (storedFile) {
    // do something with `storedFile`
    res.send(storedFile._id);
  });
});
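If you want to keep the socket.io + base64 approach from the question instead, a rough sketch (assuming a gridfs-stream gfs instance as above; the filename is made up) would strip the data-URL prefix before building the Buffer:
socket.on('test', function (e) {
  // e.test is a data URL like "data:image/png;base64,....", so strip the prefix first
  var b64 = e.test.split(',')[1];
  var buf = new Buffer(b64, 'base64'); // Buffer.from(b64, 'base64') on current Node
  var writestream = gfs.createWriteStream({ filename: 'upload.png' }); // hypothetical name
  writestream.end(buf);
  writestream.on('close', function (file) {
    console.log('stored', file._id);
  });
});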

Read csv files in stream and store them in database

I have a few huge CSV files that I need to store in a Mongo database. Because these files are too big, I need to use streams. I pause the stream while the data is being written into the database.
var fs = require('fs');
var csv = require('csv');
var mongo = require('mongodb');
var db = mongo.MongoClient.connect...
var readStream = fs.createReadStream('hugefile.csv');
readStream.on('data', function(data) {
  readStream.pause();
  csv.parse(data.toString(), { delimiter: ','}, function(err, output) {
    db.collection(coll).insert(data, function(err) {
      readStream.resume();
    });
  });
});
readStream.on('end', function() {
  logger.info('file stored');
});
But csv.parse throws an error, because I would need to read the file line by line to handle it as CSV and convert it to JSON for MongoDB. Maybe I should not pause the stream, but use some other interface. I haven't found any solution for this yet.
Any help would be appreciated!
I think you might want to create a stream of lines from your raw data stream.
Here is an example from the split package. https://www.npmjs.com/package/split
fs.createReadStream(file)
  .pipe(split())
  .on('data', function (line) {
    //each chunk now is a separate line!
  })
Adapted to your example it might look like this (see the fuller sketch below):
var split = require('split');
var readStream = fs.createReadStream('hugefile.csv');
var lineStream = readStream.pipe(split());
lineStream.on('data', function(data) {
  //remaining code unmodified
});
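Putting the two pieces together, an untested sketch reusing the question's db, coll, and logger might look like this:
var split = require('split');

var readStream = fs.createReadStream('hugefile.csv');
var lineStream = readStream.pipe(split());

lineStream.on('data', function (line) {
  if (!line) return; // skip blank lines such as the trailing newline
  lineStream.pause();
  csv.parse(line, { delimiter: ',' }, function (err, rows) {
    if (err || rows.length === 0) return lineStream.resume();
    // rows[0] is the array of fields for this line; map it to a proper document as needed
    db.collection(coll).insert({ fields: rows[0] }, function (insertErr) {
      if (insertErr) logger.error(insertErr);
      lineStream.resume();
    });
  });
});

lineStream.on('end', function () {
  logger.info('file stored');
});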
I'm unsure if bulk operations were a thing back in '15, but whoever is trying to import items from large sources should consider using them.
var fs = require('fs');
var csv = require('fast-csv');
var mongoose = require('mongoose');
var db = mongoose.connect...

var counter = 0; // to keep count of values in the bulk()
const BULK_SIZE = 1000;
var bulkItem = Item.collection.initializeUnorderedBulkOp();

var readStream = fs.createReadStream('hugefile.csv');
const csvStream = csv.fromStream(readStream, { headers: true });

csvStream.on('data', data => {
  counter++;
  bulkItem.insert(data);
  if (counter === BULK_SIZE) {
    csvStream.pause();
    bulkItem.execute((err, result) => {
      if (err) console.log(err);
      counter = 0;
      bulkItem = Item.collection.initializeUnorderedBulkOp();
      csvStream.resume();
    });
  }
});
csvStream.on('end', () => {
  // flush any rows left over after the last full batch
  if (counter > 0) bulkItem.execute(err => { if (err) console.log(err); });
});
