Decode a base64 document in ExpressJS response - node.js

I would like to store some documents in a database as base64 strings. Then when those docs are requested using HTTP, I would like ExpressJS to decode the base64 docs and return them. So something like this:
app.get('/base64', function (req, res) {
  // pdf is my base64-encoded string that represents a document
  var buffer = Buffer.from(pdf, 'base64');
  res.send(buffer);
});
The code is simply to give an idea of what I'm trying to accomplish. Do I need to use a stream for this? If so, how would I do that? Or should I be writing these docs to a temp directory and then serving up the file? Would be nice to skip that step if possible. Thanks!
UPDATE: Just to be clear, I would like this to work with a typical HTTP request. So the user will click a link in his browser that will take him to a URL that returns a file from the database. It seems like it must be possible: Microsoft SharePoint stores serialized files in a SQL database and returns those files over HTTP requests, and I don't believe it writes all those files to a temp location first. I'm feeling like a Node.js stream may be the answer, but I'm not very familiar with streaming.

Before saving a file representation to the DB, you can just use the toString method with base64 encoding:
var base64pdf = pdf.toString('base64');
After you get the base64 file representation from the DB, use a Buffer as follows in order to convert it back to a file:
var decodedFile = Buffer.from(base64pdf, 'base64');
More information on Buffer usage can be found here - NodeJS Buffer
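For example, a quick round trip (the sample bytes below are just a stand-in for a real document):

```javascript
// Encode the document's bytes to base64 for storage in the DB...
const original = Buffer.from('%PDF-1.4 example document');
const base64pdf = original.toString('base64');

// ...and decode the stored string back into a Buffer when it's requested.
const decodedFile = Buffer.from(base64pdf, 'base64');
console.log(decodedFile.equals(original)); // true
```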
As for sending a buffer from the Express server to the client, Socket.IO can solve this issue.
Using socket.emit -
Emits an event to the socket identified by the string name. Any other parameters can be included.
All data structures are supported, including Buffer. JavaScript functions can't be serialized/deserialized.
var io = require('socket.io')();
io.on('connection', function(socket){
  socket.emit('an event', { some: 'data' });
});
The relevant documentation is on the socket.io website.

Related

Any performance concerns with sending a buffer array in express JSON response?

My nodejs server consumes data from a nodejs JSON API. Some endpoints on the API return image data like so:
let buffer = await getImageBuffer();
res.set('content-type', 'image/png');
res.end(buffer);
Which works great. However, for a number of complexity reasons, I'd love to include a buffer array in a JSON response instead, like so:
let buffer = await getBuffer();
res.json({
  contentType: 'image/png',
  buffer
});
Are there any performance issues w/ including a buffer array in a JSON response like that? Is there any inherent performance benefit to using res.end(buffer) instead? The consuming server is also running nodejs, and will naturally JSON.parse() the response from the API.
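One concrete thing worth knowing when weighing this: JSON.stringify serializes a Buffer as an object with one array element per byte, which is considerably larger on the wire than the raw bytes or a base64 string. A small sketch:

```javascript
const buffer = Buffer.from('hello');

// res.json(...) ultimately runs the payload through JSON.stringify:
const asJson = JSON.stringify({ contentType: 'image/png', buffer });
console.log(asJson);
// {"contentType":"image/png","buffer":{"type":"Buffer","data":[104,101,108,108,111]}}

// The consuming server can rebuild the Buffer from the parsed array:
const parsed = JSON.parse(asJson);
const rebuilt = Buffer.from(parsed.buffer.data);
console.log(rebuilt.toString()); // hello

// A base64 field is usually a more compact JSON-friendly encoding:
const asBase64Json = JSON.stringify({
  contentType: 'image/png',
  buffer: buffer.toString('base64'),
});
console.log(asBase64Json.length < asJson.length); // true
```

So the JSON approach works, but it trades payload size (and an extra parse/rebuild step) for the convenience of a single structured response.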

How can I stream multiple remote images to a zip file and stream that to browser with ExpressJS?

I've got a small web app built in ExpressJs that allows people in our company to browse product information. A recent feature request requires that users be able to download batches of images (potentially hundreds at a time). These are stored on another server.
Ideally, I think I need to stream the batch of files into a zip file and stream that to the end user's browser as a download, all preferably without having to store the files on the server. The idea is that I want to reduce load on the server as much as possible.
Is it possible to do this or do I need to look at another approach? I've been experimenting with the 'request' module for the initial download.
If anyone can point me in the right direction or recommend any NPM modules that might help it would be very much appreciated.
Thanks.
One useful module for this is archiver, but I'm sure there are others as well.
Here's an example program that shows:
- how to retrieve a list of URLs (I'm using async to handle the requests, and also to limit the number of concurrent HTTP requests to 3);
- how to add the responses for those URLs to a ZIP file;
- how to stream the final ZIP file somewhere (in this case to stdout, but with Express you can pipe to the response object).
Example:
var async = require('async');
var request = require('request');
var archiver = require('archiver');

function zipURLs(urls, outStream) {
  var zipArchive = archiver.create('zip');

  // Pipe the archive's output to the destination stream up front.
  zipArchive.pipe(outStream);

  async.eachLimit(urls, 3, function(url, done) {
    var stream = request.get(url);

    stream.on('error', function(err) {
      return done(err);
    }).on('end', function() {
      return done();
    });

    // Use the last part of the URL as a filename within the ZIP archive.
    zipArchive.append(stream, { name: url.replace(/^.*\//, '') });
  }, function(err) {
    if (err) throw err;
    zipArchive.finalize();
  });
}

zipURLs([
  'http://example.com/image1.jpg',
  'http://example.com/image2.jpg',
  ...
], process.stdout);
Do note that although this doesn't require the image files to be stored locally, it does build the ZIP file entirely in memory. There may be other ZIP modules that let you work around that, although (AFAIK) the ZIP file format isn't really great for streaming, as it depends on metadata being appended to the end of the file.

Writing a streaming response from a streaming query in Koa with Mongoose

I'm trying to send a large result-set from a Mongo database to the user of a Koa application (using Mongoose).
I originally had something like:
var res = yield Model.find().limit(500).exec();
this.body = {data: res};
However, the size of the result set being sent was causing the application to time out, and as such I'd like to stream the response as it comes from the database.
With Mongoose you can turn the result of a query into a stream by doing something like:
var stream = Model.find().limit(300).stream();
However, I'm not sure how to write this stream into the response while preserving the format needed. I want something like this to happen:
this.body.write("{data: ");
this.body.write(stream);
this.body.write("}");
but I know there is no body.write in Koa and I'm sure I'm not using streams properly either. Can someone point me in the right direction?
koa-write may help, but you might not need it. Koa allows you to do:
this.body = stream;
In your case you can create a transform stream since the mongoose stream isn't exactly what you want to output.

Which nodejs library should I use to write into HDFS?

I have a Node.js application and I want to write data into the Hadoop HDFS file system. I have seen two main Node.js libraries that can do it: node-hdfs and node-webhdfs. Has anyone tried them? Any hints? Which one should I use in production?
I am inclined to use node-webhdfs since it uses the WebHDFS REST API. node-hdfs seems to be a C++ binding.
Any help will be greatly appreciated.
You may want to check out the webhdfs library. It provides a nice and straightforward interface (similar to the fs module API) for WebHDFS REST API calls.
Writing to the remote file:
var fs = require('fs');
var WebHDFS = require('webhdfs');
var hdfs = WebHDFS.createClient();

var localFileStream = fs.createReadStream('/path/to/local/file');
var remoteFileStream = hdfs.createWriteStream('/path/to/remote/file');

localFileStream.pipe(remoteFileStream);

remoteFileStream.on('error', function onError (err) {
  // Do something with the error
});

remoteFileStream.on('finish', function onFinish () {
  // Upload is done
});
Reading from the remote file:
var WebHDFS = require('webhdfs');
var hdfs = WebHDFS.createClient();

var remoteFileStream = hdfs.createReadStream('/path/to/remote/file');

remoteFileStream.on('error', function onError (err) {
  // Do something with the error
});

remoteFileStream.on('data', function onChunk (chunk) {
  // Do something with the data chunk
});

remoteFileStream.on('end', function onEnd () {
  // Download is done
});
Not good news, I'm afraid!
Do not use node-hdfs. Although it seems promising, it is now two years out of date. I've tried to compile it, but it no longer matches the symbols of the current libhdfs. If you want to use something like that, you'll have to make your own Node.js binding.
You can use node-webhdfs, but IMHO there's not much advantage in that. It is better to use an HTTP Node.js library to make your own requests. The hardest part is handling the very asynchronous nature of Node.js: you might first want to create a folder, then after successfully creating it, create a file, and then, at last, write or append data. Everything happens through HTTP requests that you must send, and whose answers you must wait for before going on.
At least node-webhdfs might be a good reference for you to look at while starting your own code.
Br,
Fabio Moreira

Node.js POST File to Server

I am trying to write an app that will allow my users to upload files to my Google Cloud Storage account. In order to prevent overwrites and to do some custom handling and logging on my side, I'm using a Node.js server as a middleman for the upload. So the process is:
User uploads file to Node.js Server
Node.js server parses file, checks file type, stores some data in DB
Node.js server uploads file to GCS
Node.js server responds to user's request with a pass/fail remark
I'm getting a little lost on step 3, of exactly how to send that file to GCS. This question gives some helpful insight, as well as a nice example, but I'm still confused.
I understand that I can open a ReadStream for the temporary upload file and pipe that to the http.request() object. What I'm confused about is how do I signify in my POST request that the piped data is the file variable. According to the GCS API Docs, there needs to be a file variable, and it needs to be the last one.
So, how do I specify a POST variable name for the piped data?
Bonus points if you can tell me how to pipe it directly from my user's upload, rather than storing it in a temporary file
I believe that if you want to do a POST, you have to use a Content-Type: multipart/form-data; boundary=myboundary header. Then, in the body, write() something like this for each string field (line breaks should be \r\n):
--myboundary
Content-Disposition: form-data; name="field_name"
field_value
And then for the file itself, write() something like this to the body:
--myboundary
Content-Disposition: form-data; name="file"; filename="urlencoded_filename.jpg"
Content-Type: image/jpeg
Content-Transfer-Encoding: binary
binary_file_data
The binary_file_data is where you use pipe():
var fileStream = fs.createReadStream("path/to/my/file.jpg");
fileStream.pipe(requestToGoogle, {end: false});
fileStream.on('end', function() {
  requestToGoogle.end("--myboundary--\r\n\r\n");
});
The {end: false} prevents pipe() from automatically closing the request because you need to write one more boundary after you're finished sending the file. Note the extra -- on the end of the boundary.
The big gotcha is that Google may require a content-length header (very likely). If that is the case, then you cannot stream a POST from your user into a POST to Google, because you won't reliably know what the content-length is until you've received the entire file.
The content-length header's value should be a single number for the entire body. The simple way to do this is to call Buffer.byteLength(body) on the entire body, but that gets ugly quickly if you have large files, and it also kills the streaming. An alternative would be to calculate it like so:
var fs = require('fs');

var body_before_file = "..."; // string fields + boundary and metadata for the file
var body_after_file = "--myboundary--\r\n\r\n";

fs.stat(local_path_to_file, function(err, file_info) {
  var content_length = Buffer.byteLength(body_before_file) +
                       file_info.size +
                       Buffer.byteLength(body_after_file);

  // create request to google, write content-length and other headers,
  // write() the body_before_file part,
  // and then pipe the file and end the request like we did above
});
But that still kills your ability to stream from the user to Google; the file has to be written to local disk first to determine its length.
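As a self-contained check of that arithmetic (the boundary and the in-memory "file" below are stand-ins), the three-part sum matches the length of the fully assembled body:

```javascript
const boundary = 'myboundary';
const fileData = Buffer.from('fake image bytes'); // stands in for the file on disk

const bodyBeforeFile =
  `--${boundary}\r\n` +
  'Content-Disposition: form-data; name="file"; filename="file.jpg"\r\n' +
  'Content-Type: image/jpeg\r\n' +
  'Content-Transfer-Encoding: binary\r\n\r\n';
const bodyAfterFile = `\r\n--${boundary}--\r\n\r\n`;

// Same sum as above: text before + file size + text after.
const contentLength =
  Buffer.byteLength(bodyBeforeFile) +
  fileData.length +
  Buffer.byteLength(bodyAfterFile);

// It matches the length of the fully assembled body.
const fullBody = Buffer.concat([
  Buffer.from(bodyBeforeFile),
  fileData,
  Buffer.from(bodyAfterFile),
]);
console.log(contentLength === fullBody.length); // true
```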
Alternate option
...now, after going through all of that, PUT might be your friend here. According to https://developers.google.com/storage/docs/reference-methods#putobject you can use a transfer-encoding: chunked header so you don't need to find the file's length. And I believe that the entire body of the request is just the file, so you can use pipe() and just let it end the request when it's done. If you're using https://github.com/felixge/node-formidable to handle uploads, then you can do something like this:
incomingForm.onPart = function(part) {
  if (part.filename) {
    var req = ... // create a PUT request to google and set the headers
    part.pipe(req);
  } else {
    // let formidable handle all non-file parts
    incomingForm.handlePart(part);
  }
};
