NodeJS middleware how to read from a writable stream (http.ServerResponse) - node.js

I'm working on an app (using Connect, not Express) composed of a set of middlewares plus the node-http-proxy module, that is, I have a chain of middlewares like:
midA -> midB -> http-proxy -> midC
In this scenario, the response is written by http-proxy, which proxies the request to some target and returns the content.
I would like to create a middleware (say midB) to act as a cache. The idea is:
If the URL is cached, the cache middleware writes the response and avoids continuing the middleware chain.
If the URL is not cached, the cache middleware passes the request along the middleware chain but needs to read the final response content so it can be cached.
How can I achieve this? Or is there another approach?
Cheers

Answering myself.
Suppose you have a middleware like function(req, res, next){..} and need to read the content of the response object.
In this case res is an http.ServerResponse object, a writable stream to which every middleware in the chain is allowed to write content that will make up the response we want to return.
Do not confuse it with the response you get when you make a request with http.request(); that is an http.IncomingMessage, which is in fact a readable stream.
The way I found to read the content all the middlewares write to the response is to redefine the write method:
var middleware = function(req, res, next) {
    var data = "";

    // Keep a reference to the original write and wrap it so every chunk
    // written further down the chain is also captured in `data`.
    res._oldWrite = res.write;
    res.write = function(chunk, encoding, cb) {
        data += chunk;
        return res._oldWrite.call(res, chunk, encoding, cb);
    };
    ...
};
Any other solutions will be appreciated.
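For reference, here is a rough, untested sketch of how the same trick could back the cache middleware described above (it assumes a simple in-memory Map keyed by req.url, and also wraps res.end so the complete body is captured before it is stored):

var cache = new Map(); // assumed in-memory store, keyed by URL

var cacheMiddleware = function(req, res, next) {
    if (cache.has(req.url)) {
        // Cached: write the stored body and stop the chain here.
        return res.end(cache.get(req.url));
    }

    var data = "";
    var oldWrite = res.write;
    var oldEnd = res.end;

    res.write = function(chunk, encoding, cb) {
        data += chunk;
        return oldWrite.call(res, chunk, encoding, cb);
    };

    res.end = function(chunk, encoding, cb) {
        if (chunk) data += chunk;
        cache.set(req.url, data); // store the complete body once the response ends
        return oldEnd.call(res, chunk, encoding, cb);
    };

    next();
};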

Related

Call Express router manually

Hello! I am looking to call a function which has been passed to an expressRouter.post(...) call.
This expressRouter.post(...) call is occurring in a file which I am unable to modify. The code has already been distributed to many clients and there is no procedure for me to modify their versions of the file. While I have no ability to update this file for remote clients, other developers are able to. I therefore face the issue of this POST endpoint's behaviour changing in the future.
I am also dealing with performance concerns. This POST endpoint expects req.body to be a parsed JSON object, and that JSON object can be excessively large.
My goal is to write a GET endpoint which internally activates this POST endpoint. The GET endpoint will need to call the POST endpoint with a very large JSON value, which has had URL query params inserted into it. The GET's functionality should always mirror the POST's functionality, including if the POST's functionality is updated in the future. For this reason I cannot copy/paste the POST's logic. Note also that the JSON format will never change.
I understand that the issue of calling an expressjs endpoint internally has conventionally been solved by either 1) extracting the router function into an accessible scope, or 2) generating an HTTP request to localhost.
Unfortunately in my case neither of these options are viable:
I can't move the function into an accessible scope as I can't modify the source, nor can I copy-paste the function as the original version may change
Avoiding the HTTP request is a high priority due to performance considerations. The HTTP request will require serializing+deserializing an excessively large JSON body, re-visiting a number of authentication middlewares (which require waiting for further HTTP requests + database queries to complete), etc
Here is my (contrived) POST endpoint:
expressRouter.post('/my/post/endpoint', (req, res) => {
    if (!req.body.hasOwnProperty('val'))
        return res.status(400).send('Missing "val"');
    return res.status(200).send(`Your val: ${req.body.val}`);
});
If I make a POST request to localhost:<port>/my/post/endpoint I get the expected error or response based on whether I included "val" in the JSON body.
Now, I want to have exactly the same functionality available, but via GET, and with "val" supplied in the URL instead of in any JSON body. I have attempted the following:
expressRouter.get('/my/get/endpoint/:val', (req, res) => {
    // Make it seem as if "val" occurred inside the JSON body
    let fakeReq = {
        body: {
            val: req.params.val
        }
    };

    // Now call the POST endpoint.
    // Pass the fake request, and the real response.
    // This should enable the POST endpoint to write data to the
    // response, and it will seem like THIS endpoint wrote to the
    // response.
    manuallyCallExpressEndpoint(expressRouter, 'POST', '/my/post/endpoint', fakeReq, res);
});
Unfortunately I don't know how to implement manuallyCallExpressEndpoint.
Is there a solution to this problem which excludes both extracting the function into an accessible scope, and generating an HTTP request?
This seems possible, but it may make more sense to modify req and pass it along, rather than create a whole new fakeReq object. The thing that enables this looks to be the router.handle(req, res, next) function. I'm not sure this is the smartest way to go about it, but it will certainly avoid the large overhead of a separate HTTP request!
app.get('/my/get/endpoint/:val', (req, res) => {
    // Modify `req`, don't create a whole new `fakeReq`
    req.body = {
        val: req.params.val
    };
    manuallyCallExpressEndpoint(app, 'POST', '/my/post/endpoint', req, res);
});

let manuallyCallExpressEndpoint = (router, method, url, req, res) => {
    req.method = method;
    req.url = url;
    router.handle(req, res, () => {});
};
How about a simple middleware?
function checkVal(req, res, next) {
    const val = req.params.val || req.body.val;
    if (!val) {
        return res.status(400).send('Missing "val"');
    }
    return res.status(200).send(`Your val: ${val}`);
}

app.get('/my/get/endpoint/:val', checkVal);
app.post('/my/post/endpoint', checkVal);
This code isn't tested but should give you a rough idea of how you can have the same code run in both places. The checkVal function serves as an Express handler, with request, response, and next. It checks the params first, then the body.

How do you read a stream in a middleware and keep it streamable for the next middleware

I'm using a proxy middleware to forward multipart data to a different endpoint. I would like to get some information from the stream using a previous middleware, and still have the stream readable for the proxy middleware that follows. Is there a stream pattern that allows me to do this?
function preMiddleware(req, res, next) {
    req.rawBody = '';
    req.on('data', function(chunk) {
        req.rawBody += chunk;
    });
    req.on('end', () => {
        next();
    });
}

function proxyMiddleware(req, res, next) {
    console.log(req.rawBody);
    console.log(req.readable); // false
}
app.use('/cfs', preMiddleware, proxyMiddleware)
I want to access the name value of <input name="fee" type='file' /> before sending the streamed data to the external endpoint. I think I need to do this because the endpoint parses fee into the final url, and I would like to have a handle for doing some post processing. I'm open to alternative patterns to resolve this.
I don't think there is any mechanism for peeking into a stream without actually permanently removing data from the stream or any mechanism for "unreading" data from a stream to put it back into the stream.
As such, I can think of a few possible ideas:
Read the data you want from the stream and then send the data to the final endpoint manually (not using your proxy code that expects the readable stream).
Read the stream, get the data you want out of it, then create a new readable stream, put the data you read into that readable stream, and pass that readable stream on to the proxy. Exactly how to pass it on to the proxy will need some looking into the proxy code. You might have to make a new req object that is the new stream.
Create a stream transform that lets you read the stream (potentially even modifying it) while creating a new stream that can be fed to the proxy.
Register your own data event handler, then pause the stream (registering a data event handler automatically triggers the stream to flow and you don't want it to flow yet), then call next() right away. I think this will allow you to "see" a copy of all the data as it goes by when the proxy middleware reads the stream, as there will just be multiple data event handlers, one for your middleware and one for the proxy middleware. This is a theoretical idea - I haven't yet tried it (a rough sketch follows below).
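For what it's worth, an untested sketch of that last idea (the parseFeeName helper is hypothetical, and it assumes the proxy middleware itself consumes req):

function peekMiddleware(req, res, next) {
    let captured = '';

    // Our own copy of the data; the proxy middleware still receives every chunk,
    // since there are simply two 'data' listeners on the same stream.
    req.on('data', (chunk) => {
        captured += chunk;
    });

    // Attaching a 'data' handler switches the stream into flowing mode,
    // so pause it again until the proxy middleware actually starts reading.
    req.pause();

    req.on('end', () => {
        req.feeName = parseFeeName(captured); // hypothetical post-processing helper
    });

    next(); // hand off immediately; the proxy middleware drives the stream
}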
You would need to be able to send a single stream in two different directions, which is not going to be easy if you try it on your own. Luckily I wrote a helpful module back in the day, rereadable-stream, that you could use, and I'll use scramjet for finding the data you're interested in.
I assume your data will be a multipart boundary:
const {StringStream} = require('scramjet');
const {ReReadable} = require("rereadable-stream");

// I will use a single middleware, since express does not allow to pass an altered request object to next()
app.use('/cfs', (req, res, next) => {
    const buffered = req.pipe(new ReReadable()); // buffer the request so it can be rewound later
    let file = '';
    buffered.pipe(new StringStream)                                    // pipe to a StringStream
        .lines('\n')                                                   // split request by line
        .filter(x => x.startsWith('Content-Disposition: form-data;')) // find form-data lines
        .parse(x => x.split(/;\s*/).reduce((a, y) => {                 // split values
            const z = y.split(/:\s*/);                                 // split value name from value
            a[z[0]] = JSON.parse(z[1]);                                // assign to accumulator (values are quoted)
            return a;
        }, {}))
        .until(x => x.name === 'fee' && (file = x.filename, 1))       // run the stream until the filename is found
        .run()
        .then(() => uploadFileToProxy(file, buffered.rewind(), res, next)); // upload the file using your method
});
You'll probably need to adapt this a little to make it work in a real-world scenario. Let me know if you get stuck or there's something to fix in the above answer.

Piping readable stream using superagent

I'm trying to create a multer middleware to pipe a streamed file from the client to a 3rd party via superagent.
const superagent = require('superagent');
const multer = require('multer');

// my middleware
function streamstorage(){
    function StreamStorage(){}

    StreamStorage.prototype._handleFile = function(req, file, cb){
        console.log(file.stream); // <-- is readable stream
        const post = superagent.post('www.some-other-host.com');
        file.stream.pipe(post);
        // need to call cb(null, {some: data}); but how
        // do i get/handle the response from this post request?
    };

    return new StreamStorage();
}

const streamMiddleware = multer({
    storage: streamstorage()
});

app.post('/someupload', streamMiddleware.single('rawimage'), function(req, res){
    res.send('some token based on the superagent response');
});
I think this seems to work, but I'm not sure how to handle the response from superagent POST request, since I need to return a token received from the superagent request.
I've tried post.end(fn...) but apparently end and pipe can't both be used together. I feel like I'm misunderstanding how piping works, or if what i'm trying to do is practical.
Superagent's .pipe() method is for downloading (piping data from a remote host to the local application).
It seems you need piping in the other direction: upload from your application to a remote server. In superagent (as of v2.1) there's no method for that, and it requires a different approach.
You have two options:
The easiest, less efficient one is:
Tell multer to buffer/save the file, and then upload the whole file using .attach().
The harder one is to "pipe" the file "manually" (see the sketch after this list):
Create a superagent instance with URL, method and HTTP headers you want for uploading,
Listen to data events on the incoming file stream, and call superagent's .write() method with each chunk of data.
Listen to the end event on the incoming file stream, and call superagent's .end() method to read server's response.
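A rough, untested sketch of that manual approach, applied to the _handleFile method from the question (the token field on the response body is an assumption):

StreamStorage.prototype._handleFile = function(req, file, cb){
    const post = superagent.post('www.some-other-host.com');

    // Forward each incoming chunk of the file stream to the superagent request
    file.stream.on('data', (chunk) => post.write(chunk));

    // Once the file stream ends, finish the request and read the server's response
    file.stream.on('end', () => {
        post.end((err, response) => {
            if (err) return cb(err);
            cb(null, { token: response.body.token }); // hypothetical token field
        });
    });

    file.stream.on('error', (err) => cb(err));
};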

change content-disposition on piped response

I have the following controller that gets a file from a service and pipes the answer to the browser.
function (req, res) {
    request.get(serviceUrl).pipe(res);
}
I'd like to change the content-disposition (from attachment to inline) so the browser opens the file instead of directly download it.
I already tried this, but it is not working:
function (req, res) {
    res.set('content-disposition', 'inline');
    request.get(serviceUrl).pipe(res);
}
The versions I'm using are:
NodeJS: 0.12.x
Express: 4.x
To do this you can use an intermediate pass-through stream between request and response; that way the headers from the proxied request won't be passed on to the response:
var through2 = require('through2'); // or whatever you like better

function (req, res) {
    var passThrough = through2(); // this stream is necessary to put correct response headers
    res.set('content-disposition', 'inline');
    request.get(serviceUrl).pipe(passThrough).pipe(res);
}
But be careful, as this will ignore all of the upstream headers, and you will probably need to specify 'Content-Type' etc. yourself.
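A minimal illustration of that caveat (application/pdf is only an assumed example; use whatever the upstream service actually serves):

function (req, res) {
    var passThrough = through2();
    res.set('content-disposition', 'inline');
    res.set('content-type', 'application/pdf'); // assumed type; the upstream header is no longer forwarded
    request.get(serviceUrl).pipe(passThrough).pipe(res);
}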

Node/Express, How do I modify static files but still have access to req.params?

I'm new to node/express, so there's (hopefully) an obvious answer that I'm missing.
There's a middleware for transforming static content: https://www.npmjs.com/package/connect-static-transform/. The transformation function looks like:
transform: function (path, text, send) {
send(text.toUpperCase(), {'Content-Type': 'text/plain'});
}
So, that's great for transforming the content before serving, but it doesn't let me look at query parameters.
This answer, Connect or Express middleware to modify the response.body, shows how to do it:
function modify(req, res, next){
    res.body = res.body + "modified";
    next();
}
But I can't figure out how to get it to run with static file content. When I run it res.body is undefined.
Is there some way to get a middleware to run after express.static?
My use case is that I want to serve files from disk making a small substitution of some text based on the value of a query parameter. This would be easy with server-side templating, like Flask. But I want the user to be able to do a simple npm-install and start up a tiny server to do this. Since I'm new to node and express, I wanted to save myself the bother of reading the url, locating the file on disk and reading it. But it's becoming clear that I wasted much more time trying this approach.
The answer appears to be "There is no answer." (As suggested by Pomax in the comment.) This is really annoying. It didn't take me too long to figure out how to serve and transform files myself, but now I'm having to figure out error handling. A million people have already written this code.
You can create middleware that only does transformation of body chunks as they are written with res.write or res.end or whatever.
For example:
const CSRF_RE = /<meta name="csrf-token" content="(.*)"([^>]*)?>/

function transformMiddleware (req, res, next) {
    const _write = res.write
    res.write = function(chunk, encoding) {
        if (chunk.toString().indexOf('<meta name="csrf-token"') === -1) {
            return _write.call(res, chunk, encoding)
        } else {
            const newChunk = chunk.toString().replace(CSRF_RE, `<meta name="csrf-token" content="${req.csrfToken()}">`)
            return _write.call(res, newChunk, encoding)
        }
    }
    next()
}
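A hedged usage sketch (untested; the public directory name is just an example): the transform has to be mounted before express.static so the patched res.write sees the chunks the static handler writes.

const express = require('express')
const app = express()

app.use(transformMiddleware)        // patch res.write before the static handler runs
app.use(express.static('public'))   // example directory, served through the patched res.write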
