Piping readable stream using superagent - node.js

I'm trying to create a multer middleware to pipe a file streamed from the client to a 3rd party via superagent.
const superagent = require('superagent');
const multer = require('multer');

// my middleware
function streamstorage() {
  function StreamStorage() {}

  StreamStorage.prototype._handleFile = function (req, file, cb) {
    console.log(file.stream); // <-- is a readable stream
    const post = superagent.post('www.some-other-host.com');
    file.stream.pipe(post);
    // need to call cb(null, {some: data}); but how
    // do i get/handle the response from this post request?
  };

  return new StreamStorage();
}

const streamMiddleware = multer({
  storage: streamstorage()
});

app.post('/someupload', streamMiddleware.single('rawimage'), function (req, res) {
  res.send('some token based on the superagent response');
});
I think this seems to work, but I'm not sure how to handle the response from the superagent POST request, since I need to return a token received from that request.
I've tried post.end(fn...), but apparently end and pipe can't be used together. I feel like I'm misunderstanding how piping works, or whether what I'm trying to do is even practical.

Superagent's .pipe() method is for downloading (piping data from a remote host to the local application).
It seems you need piping in the other direction: uploading from your application to a remote server. In superagent (as of v2.1) there's no method for that, and it requires a different approach.
You have two options:
The easier but less efficient one is:
Tell multer to buffer/save the file, and then upload the whole file using .attach().
The harder one is to "pipe" the file "manually" (see the sketch after this list):
Create a superagent instance with URL, method and HTTP headers you want for uploading,
Listen to data events on the incoming file stream, and call superagent's .write() method with each chunk of data.
Listen to the end event on the incoming file stream, and call superagent's .end() method to read the server's response.
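A minimal sketch of the second option, plugged into the _handleFile hook from the question (the URL is carried over from the question; the token field on the response body is an assumption about what the remote service returns):

StreamStorage.prototype._handleFile = function (req, file, cb) {
  const post = superagent.post('www.some-other-host.com');

  // Forward each incoming chunk to the outgoing request...
  file.stream.on('data', chunk => post.write(chunk));

  // ...and when the file ends, finish the request and read the response.
  file.stream.on('end', () => {
    post.end((err, response) => {
      if (err) return cb(err);
      // Hand multer whatever you need downstream, e.g. a token.
      cb(null, { token: response.body.token });
    });
  });

  file.stream.on('error', cb);
};

For the first option, you would instead configure multer with disk or memory storage and, once the file is complete, send it with something like superagent.post(url).attach('rawimage', file.path).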

Related

When an Express.js middleware modifies the chunk that goes into res.end(), response times go up by 10x

Node.js version: 14.16, Express version: 4.17
We use express-winston for logging our express responses. Since we want additional information to go into our logs that we don't want our end-users to see, we decided to include a middleware that wraps res.end(), so as to intercept the chunk sent by express-winston and remove the additional data.
For reference here's the line of code in express-winston that calls res.end(), whose chunk we want to replace before the response is sent to the end-user, without altering the chunk that is logged:
https://github.com/bithavoc/express-winston/blob/bdba1d39965f83b003178646d213cd974b090326/index.js#L317
Here is a sample middleware that we wrote:
module.exports.responseMiddleware = (req, res, next) => {
  const { end } = res;
  res.end = (chunk, encoding) => {
    res.end = end;
    // If alterBody is enabled, the chunk sent to res.end is modified
    const resultChunk = req.body.alterBody === 'yes'
      ? Buffer.from(JSON.stringify({}))
      : chunk;
    res.end(resultChunk, encoding);
  };
  next();
};
The original response is sent with a call to res.status(...).json(...)
What we found was that when we enable alterBody, the response time goes up by 10x (from 500ms to 5s).
What could be the reason for this? Is there a way that we can maintain the original response time, while also logging and sending two different chunks?
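One hypothesis worth checking (an assumption, not something the question confirms): res.json() computes a Content-Length for the original body, and when alterBody swaps in a much shorter chunk, that stale header can leave the client waiting for bytes that never arrive. A minimal sketch of the same wrapper with the header recomputed for the replacement chunk:

module.exports.responseMiddleware = (req, res, next) => {
  const { end } = res;
  res.end = (chunk, encoding) => {
    res.end = end;
    const resultChunk = req.body.alterBody === 'yes'
      ? Buffer.from(JSON.stringify({}))
      : chunk;
    // Keep Content-Length consistent with the chunk actually sent
    if (resultChunk && !res.headersSent) {
      res.setHeader('Content-Length', Buffer.byteLength(resultChunk));
    }
    res.end(resultChunk, encoding);
  };
  next();
};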

Node, Express, and parsing streamed JSON in endpoint without blocking thread

I'd like to provide an endpoint in my API to allow third-parties to send large batches of JSON data. I'm free to define the format of the JSON objects, but my initial thought is a simple array of objects:
{[{"id":1, "name":"Larry"}, {"id":2, "name":"Curly"}, {"id":3, "name":"Moe"}]}
As there could be any number of these objects in the array, I'd need to stream this data in, read each of these objects as they're streamed in, and persist them somewhere.
TL;DR: Stream a large array of JSON objects from the body of an Express POST request.
It's easy to get the most basic examples out there working, as all of them seem to demonstrate this idea using "fs" and working w/ the filesystem.
What I've been struggling with is the Express implementation of this. At this point, I think I've got this working using the "stream-json" package:
const express = require("express");
const router = express.Router();
const StreamArray = require("stream-json/streamers/StreamArray");

router.post("/filestream", (req, res, next) => {
  const stream = StreamArray.withParser();

  req.pipe(stream)
    .on("data", ({key, value}) => {
      console.log(key, value);
    })
    .on("finish", () => {
      console.log("FINISH!");
    })
    .on("error", e => {
      console.log("Stream error :(");
    });

  res.status(200).send("Finished successfully!");
});
I end up with a proper readout of each object as it's parsed by stream-json. The problem seems to be with the thread getting blocked while the processing is happening. I can hit this once and immediately get the 200 response, but a second hit is blocked until the first batch finishes, and only then does the second one begin.
Is there any way to do something like this w/o spawning a child process, or something like that? I'm unsure how to approach this so that the endpoint can continue to receive requests while streaming/parsing the individual JSON objects.
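For what it's worth, here is a minimal sketch of one way to yield between objects: pause the parser while each value is persisted asynchronously, so the event loop can serve other requests in between. persist() is a hypothetical async function standing in for whatever storage call you use:

const express = require("express");
const router = express.Router();
const StreamArray = require("stream-json/streamers/StreamArray");

router.post("/filestream", (req, res, next) => {
  const stream = StreamArray.withParser();

  stream.on("data", async ({key, value}) => {
    stream.pause();            // stop emitting objects during async work
    try {
      await persist(value);    // hypothetical async write (DB, queue, ...)
    } catch (e) {
      console.log("Persist error :(", e);
    }
    stream.resume();           // continue with the next object
  });

  stream.on("end", () => console.log("FINISH!"));
  stream.on("error", e => console.log("Stream error :(", e));

  req.pipe(stream);
  res.status(200).send("Finished successfully!");
});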

change content-disposition on piped response

I have the following controller that gets a file from a service and pipes the response to the browser.
function (req, res) {
  request.get(serviceUrl).pipe(res);
}
I'd like to change the content-disposition (from attachment to inline) so the browser opens the file instead of downloading it directly.
I already tried this, but it is not working:
function (req, res) {
  res.set('content-disposition', 'inline');
  request.get(serviceUrl).pipe(res);
}
The versions I'm using are:
NodeJS: 0.12.x
Express: 4.x
To do this you can use an intermediate passthrough stream between request and response; that way, the headers from request won't be passed on to the response:
var through2 = require('through2'); // or whatever you like better

function (req, res) {
  var passThrough = through2(); // keeps request from copying its headers onto res
  res.set('content-disposition', 'inline');
  request.get(serviceUrl).pipe(passThrough).pipe(res);
}
But be careful: this will drop all upstream headers, so you will probably need to specify 'Content-Type', etc. yourself.
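A sketch of re-applying selected upstream headers by hand, using request's 'response' event (the download name is illustrative; request and serviceUrl are from the question):

var through2 = require('through2');

function download(req, res) {
  var passThrough = through2();
  request.get(serviceUrl)
    .on('response', function (upstream) {
      // Copy over only the headers you want, then force inline display
      res.set('content-type', upstream.headers['content-type']);
      res.set('content-disposition', 'inline');
    })
    .pipe(passThrough)
    .pipe(res);
}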

NodeJS middleware how to read from a writable stream (http.ServerResponse)

I'm working on an app (using Connect, not Express) composed of a set of middlewares plus the node-http-proxy module, that is, I have a chain of middlewares like:
midA -> midB -> http-proxy -> midC
In this scenario, the response is written by http-proxy, which proxies the request to some target and returns the content.
I would like to create a middleware (say midB) to act as a cache. The idea is:
If the URL is cached, the cache middleware writes the response and avoids continuing the middleware chain.
If the URL is not cached, the cache middleware passes the request along the middleware chain but needs to read the final response content so it can be cached.
How can I achieve this? Or is there another approach?
Cheers
Answering myself.
Suppose you have a middleware like function(req, res, next){...} and need to read the content of the response object.
In this case res is an http.ServerResponse object, a writable stream to which every middleware in the chain is allowed to write content that will make up the response we want to return.
Do not confuse it with the response you get when you make a request with http.request(); that is an http.IncomingMessage, which is in fact a readable stream.
The way I found to read the content all middlewares write to the response is to redefine the write method:
var middleware = function (req, res, next) {
  var data = "";
  res._oldWrite = res.write;
  res.write = function (chunk, encoding, cb) {
    data += chunk;
    return res._oldWrite.call(res, chunk, encoding, cb);
  };
  ...
};
Any other solutions will be appreciated.
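For completeness, a fuller sketch along the same lines that also wraps res.end, so the full body can be stored and served on later hits. The in-memory cache object and the cacheMiddleware name are illustrative, and Buffer.from assumes Node 4+:

var cache = {}; // in-memory cache keyed by URL (illustrative only)

function cacheMiddleware(req, res, next) {
  var hit = cache[req.url];
  if (hit) {
    res.end(hit); // cached: write the stored body, skip the rest of the chain
    return;
  }

  var chunks = [];
  var oldWrite = res.write;
  var oldEnd = res.end;

  res.write = function (chunk, encoding, cb) {
    if (chunk) chunks.push(Buffer.from(chunk));
    return oldWrite.call(res, chunk, encoding, cb);
  };

  res.end = function (chunk, encoding, cb) {
    if (chunk && typeof chunk !== 'function') chunks.push(Buffer.from(chunk));
    cache[req.url] = Buffer.concat(chunks); // store the complete body
    return oldEnd.call(res, chunk, encoding, cb);
  };

  next();
}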

Node.js - Stream Binary Data Straight from Request to Remote server

I've been trying to stream binary data (PDF, images, other resources) directly from a request to a remote server but have had no luck so far. To be clear, I don't want to write the document to any filesystem. The client (browser) will make a request to my node process which will subsequently make a GET request to a remote server and directly stream that data back to the client.
var request = require('request');

app.get('/message/:id', function (req, res) {
  // db call for specific id, etc.
  var options = {
    url: 'https://example.com/document.pdf',
    encoding: null
  };

  // First try - unsuccessful
  request(options).pipe(res);

  // Second try - unsuccessful
  request(options, function (err, response, body) {
    var binaryData = body.toString('binary');
    res.header('content-type', 'application/pdf');
    res.send(binaryData);
  });
});
Putting both data and binaryData in a console.log shows that the proper data is there, but the PDF that is subsequently downloaded is corrupt. I can't figure out why.
Wow, never mind. It turns out Postman (the Chrome app) was hijacking the request and response somehow. The // First try example in my code excerpt works properly in the browser.
