Node.js - Create a proxy, why is request.pipe needed?

Can someone explain this code for creating a proxy server? Everything makes sense except the last block, request.pipe(proxy, ...). I don't get it, because when proxy is declared it makes a request and pipes its response to the client's response. What am I missing here? Why would we need to pipe the original request to the proxy when the http.request method already makes the request described by the options var?
var http = require('http');
function onRequest(request, response) {
console.log('serve: ' + request.url);
var options = {
hostname: 'www.google.com',
port: 80,
path: request.url,
method: 'GET'
};
var proxy = http.request(options, function (res) {
res.pipe(response, {
end: true
});
});
request.pipe(proxy, {
end: true
});
}
http.createServer(onRequest).listen(8888);

What am I missing here? [...] the http.request method already makes the request contained in the options var.
http.request() doesn't actually send the request in its entirety immediately:
[...] With http.request() one must always call req.end() to signify that you're done with the request - even if there is no data being written to the request body.
The http.ClientRequest it creates is left open so that body content, such as JSON data, can be written and sent to the responding server:
var req = http.request(options);
req.write(JSON.stringify({
// ...
}));
req.end();
.pipe() is just one option for this, when you have a readable stream, as it will .end() the client request by default.
Although, since GET requests rarely have a body that would need to be piped or written, you can typically use http.get() instead, which calls .end() itself:
Since most requests are GET requests without bodies, Node provides this convenience method. The only difference between this method and http.request() is that it sets the method to GET and calls req.end() automatically.
http.get(options, function (res) {
res.pipe(response, {
end: true
});
});
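To see why the pipe matters, here is a minimal sketch of the same proxy that also forwards the method and the body. The helper name createProxy and its targetHost/targetPort arguments are assumptions for illustration, not part of the original code, and header forwarding is left out to keep it short. request.pipe(proxy) copies any incoming body into the outgoing request and calls proxy.end() once the client has finished sending, which is the step that completes the upstream request.

```javascript
const http = require('http');

// Hypothetical helper: returns a server that forwards every request
// to targetHost:targetPort (names are illustrative assumptions).
function createProxy(targetHost, targetPort) {
  return http.createServer(function (request, response) {
    const options = {
      hostname: targetHost,
      port: targetPort,
      path: request.url,
      method: request.method // forward the original method instead of hard-coding GET
    };

    // The ClientRequest stays open here: it is only completed once end() is called.
    const proxy = http.request(options, function (res) {
      response.writeHead(res.statusCode, res.headers);
      res.pipe(response); // upstream response -> client
    });

    proxy.on('error', function (err) {
      // Upstream unreachable: answer with 502 if nothing was sent yet.
      if (!response.headersSent) response.writeHead(502);
      response.end();
    });

    // Client body -> upstream request; pipe() calls proxy.end() when the body is done.
    request.pipe(proxy);
  });
}
```

With a GET there is simply no body, so the pipe ends the upstream request almost immediately; with a POST or PUT the body is streamed through first.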

Short answer: the event loop. I don't want to talk too far out of my ass, and this is where node.js gets both beautiful and complicated, but the request isn't strictly MADE on the line declaring proxy: it's added to the event loop. So when you connect the pipe, everything works as it should, piping from the incoming request > proxy > outgoing response. It's the magic / confusion of asynchronous code!

Related

Node.js GET API is getting called twice intermittently

I have a node.js GET API endpoint that calls some backend services to get data.
app.get('/request_backend_data', function(req, res) {
// ... calls to backend services (elided) ...
});
When there is a delay getting a response back from the backend services, this endpoint (request_backend_data) gets triggered again exactly 2 minutes later. I have checked my application code, and there is no retry logic anywhere for delayed responses.
Does a Node.js API endpoint get called twice in any situation (such as a delay or timeout)?
There might be a few reasons:
Some Chrome extensions can cause this. Run your app in a different browser; if the issue disappears, it is a Chrome-specific problem.
The browser might be requesting favicon.ico. To handle this, use this module: https://www.npmjs.com/package/serve-favicon
Add a CORS policy. The browser might be sending preflight requests. Use this npm package: https://www.npmjs.com/package/cors
No, Node.js never re-invokes a handler on its own; there is no built-in retry. (Releases before Node 13 did close idle sockets after a default 2-minute server.timeout, and some clients re-send a request whose connection was dropped.)
Look for the issue in your frontend:
it can be a fetch wrapper with a 'retry' option set
it can be a messed-up RxJS operator chain that emits events implicitly and triggers another REST request
it can be an entire page reload on timeout, which re-fetches all necessary data from the backend
it can be request interceptors (in axios, Angular, etc.) that modify something and re-send
... there are many potential reasons, but most likely not in the backend (Node.js)
Just make a simple example and invoke your Node.js 'request_backend_data' endpoint with axios or XMLHttpRequest; you will see whether the problem is in the backend.
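One way to tell a client-side retry from a genuine duplicate call is to log every incoming request with its timestamp and source port. This is only a sketch: describeRequest and logEveryRequest are hypothetical names, and the middleware assumes an Express-style (req, res, next) signature. If the second call arrives from a different source port about two minutes after the first, the client opened a new connection and re-sent the request.

```javascript
// Hypothetical helper: builds a one-line description of an incoming request.
// `now` is a timestamp in milliseconds, passed in so the output is predictable.
function describeRequest(req, now) {
  const socket = req.socket || {};
  const userAgent = (req.headers && req.headers['user-agent']) || 'unknown';
  return [
    new Date(now).toISOString(),
    req.method,
    req.url,
    'from ' + socket.remoteAddress + ':' + socket.remotePort,
    'user-agent=' + userAgent
  ].join(' ');
}

// Express-style middleware (assumed signature): logs, then hands over once.
function logEveryRequest(req, res, next) {
  console.log(describeRequest(req, Date.now()));
  next();
}
```

It would be registered before the route, for example with app.use(logEveryRequest).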
Try checking the API call with the code below, which follows redirects. Add headers as needed (e.g. 'Authorization': 'bearer dhqsdkhqd...etc').
var https = require('follow-redirects').https;
var fs = require('fs');
var options = {
'method': 'GET',
'hostname': 'foo.com',
'path': '/request_backend_data',
'headers': {
},
'maxRedirects': 20
};
var req = https.request(options, function (res) {
var chunks = [];
res.on("data", function (chunk) {
chunks.push(chunk);
});
res.on("end", function (chunk) {
var body = Buffer.concat(chunks);
console.log(body.toString());
});
res.on("error", function (error) {
console.error(error);
});
});
req.end();
Paste into a file called test.js then run with node test.js.

Nodejs Request module -- how to set global keepalive

I am using the request npm module in my app to create an HTTP client, like this:
var request = require('request');
And each time, I make a request to some server, I pass the options as below:
var options = {
url: "whateverurl...",
body: { /* some json data for POST ... */ }
};
request(options, function cb(e, r, body) {
// handle response here...
});
This was working fine, until I started testing with high load, and I started getting errors indicating no address available (EADDRNOTAVAIL). It looks like I am running out of ephemeral ports, as there is no pooling or keep-alive enabled.
After that, I changed it to this:
var options = {
url: "whateverurl...",
body: { /* some json data for POST ... */ },
forever: true
};
request(options, function cb(e, r, body) {
// handle response here...
});
(Note the option forever: true.)
I tried looking up request module's documentation about how to set keep-alive. According to the documentation and this stackoverflow thread, I am supposed to add {forever:true} to my options.
It didn't seem to work for me, because when I checked with tcpdump, the server was still closing the connection. So, my question is:
Am I doing something wrong here?
Should I be setting a global option on the request module when I require it, instead of passing {forever: true} each time I make an HTTP request? This is confusing to me.
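For comparison, Node's core http module handles connection reuse through an Agent created with keepAlive: true. The sketch below is an assumption about one way to approach this with core http only; it does not use the request module, and keepAliveAgent and getWithKeepAlive are hypothetical names. Sharing one agent across all calls is what gives the "global" behaviour; the request module also documents a request.defaults() wrapper for setting options once.

```javascript
const http = require('http');

// One shared agent: sockets are kept open and reused across requests.
const keepAliveAgent = new http.Agent({ keepAlive: true, maxSockets: 10 });

// Hypothetical helper: GETs a URL through the shared agent and resolves with
// the body plus the local port used, so connection reuse can be observed.
function getWithKeepAlive(url) {
  return new Promise(function (resolve, reject) {
    const req = http.get(url, { agent: keepAliveAgent }, function (res) {
      const localPort = res.socket.localPort;
      const chunks = [];
      res.on('data', function (chunk) { chunks.push(chunk); });
      res.on('end', function () {
        resolve({ body: Buffer.concat(chunks).toString(), localPort: localPort });
      });
    });
    req.on('error', reject);
  });
}
```

If two consecutive calls report the same localPort, the second request reused the first connection instead of consuming a new ephemeral port. The server has to allow keep-alive as well; if it answers with Connection: close, the client cannot keep the socket open.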

Trying to understand how to work properly with : app = Express().method , several requests() in same method and middleware

Trying to understand how to work properly with :
1. Express
2. request
3. middleware
It's a follow-up question from here, where the discussion was fruitful and helpful (thanks #BlazeSahlzen, you are great!), but I realized that at one point I tried to put too many issues (although they are all related) into the same question.
So, this one is a focused question... I hope :-)
Case: I want to build a POST() that receives a parameter via the path (/:param1),
uses it to request() #1 an external API,
gets the result from the external API,
uses the result to do something and send ANOTHER request() #2 to a 2nd external API,
gets the outcome of the 2nd API request(),
decides if the POST is statusCode = 200 with message="ok" or statusCode = something_else and message = "problem",
and res.send()s it properly.
For that, here is my pseudo code:
var middle_1 = function(req, res, next) {
param1 = req.params.param1; //trying to access the param1 from the path, not sure it will work in middleware
req.middle_1_output = {
statusCode: 404,
message: "param1"
}
var options = {
method: 'PUT',
url: `EXTERNAL_API_1`,
headers: {
'cache-control': 'no-cache',
'content-type': 'application/x-www-form-urlencoded',
apikey: `KEY`
}
};
request(options, function(error, response, body) {
if (error) throw new Error(error);
// CODE THAT DO SOMETHING AND GET INFORMATION
req.request_1_output.statusCode = 200;
req.request_1_output.message = "hello world";
next(); // not sure what happens here - will it "jump" outside of the middle_1() or go to the next request() down the code??
});
var options = {
method: 'PUT',
url: `EXTERNAL_API_2`,
headers: {
'cache-control': 'no-cache',
'content-type': 'application/x-www-form-urlencoded',
apikey: `KEY`
}
};
request(options, function(error, response, body) {
if (error) throw new Error(error);
//Can I use here the req.request_1_output.message ???
//How can I use here ALSO some of the EXTERNAL_API_1 outcome????
// Some more CODE THAT DO SOMETHING AND GET INFORMATION
req.request_2_output.statusCode = 201;
req.request_2_output.message = "hello world";
next(); // not sure what happens here
});
}
//This middleware is only used to send success response
var response_success = function(req, res) {
sum_statusCode = req.request_1_output.statusCode + req.request_2_output.statusCode;
if (req.request_2_output.message == req.request_1_output.message) {
message = "hello world";
} else {
message = "goodbye world!";
}
res.json({
"statusCode": sum_statusCode,
"message": message
});
}
app.post('/test', middle_1, response_success);
1. I am not sure how to connect the different requests (request #1 and request #2) in this case - should they all become middleware? How should I write it? (connect => make them run one only after the other is done.)
2. How can I also get information from the request #1 outcome and use it in request #2?
3. Look at my code at response_success() -> will this work? Can I access data from req like this when it originated within request #1 and request #2?
4. How am I supposed to access, inside response_success(), the data which is the OUTCOME of request #1 and request #2?
// EDITED - questions #5 and #6 are a late addition of mine but should be standalone questions. I leave them here but I will be opening a new thread just for them.
5. Let's say my middle_1 needs to get information as an outcome from request_1, calculate something, and move it forward to a middle_2... how do I turn the request_1 information into something that can be transferred to middle_2? I think I am supposed to create a property inside "req", something like req.middle_1_outcome = DATA, but I am not sure how to "get the DATA" from the request_1 outcome...
6. How do I "monitor and wait" for request_1 to be done before my middle_1 moves forward to calculate things? Is there a requestSync() function for synced requests?
Thanks in advance to all the helpers :-)
A given middleware function should call next() only once when it is done with all its processing.
A given request handler should only submit one response per request. Anything more will be considered an error by Express.
I am not sure how to connect the different requests (request #1 and
request #2) in this case - should they all become middleware? how
should I write it? (connect => make them run one only after the other
is done.)
If your two request() calls can run in parallel (the second one does not depend upon the first results), then you can run them both and then monitor when they are both done, collect the results, do what you need to do with the request and then once and only once call next().
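A minimal sketch of that parallel case, assuming the two external calls are wrapped in functions that return promises. makeParallelMiddleware, callApi1, callApi2 and req.apiResults are hypothetical names, not part of the original code. Promise.all waits for both calls, and next() is reached exactly once on either path.

```javascript
// Hypothetical factory: builds an Express-style middleware that runs two
// promise-returning calls in parallel and calls next() exactly once.
function makeParallelMiddleware(callApi1, callApi2) {
  return function (req, res, next) {
    const param1 = req.params.param1;
    Promise.all([callApi1(param1), callApi2(param1)]).then(
      function (results) {
        // Both calls finished: store the results for later handlers.
        req.apiResults = { first: results[0], second: results[1] };
        next();
      },
      // First failure: pass the error to Express error handling.
      next
    );
  };
}
```

The two-argument form of then() is used so that an exception thrown after next() cannot trigger a second next(err) call.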
If they must be run in sequence (use the results from the first in the second), then you can nest them.
How can I get also information from the request #1 outcome and use it
in the request #2 ?
There are a variety of ways to solve that issue. If the requests are being run in sequence, then the usual way is to just put request #2 inside the completion callback for request #1 where the results from #1 are known.
Look at my code at response_success() -> will this work? can I access like this data from req that originated within the request #1 and request #2?
You can't quite do it like that because you can't call next() multiple times from the same middleware.
How am I suppose to access inside the response_success() data which is the OUTCOME of the request #1 and request #2?
If you nest the two operations and run request #2 from inside the completion of request #1, then inside the completion for request #2, you can access both results. There is no need for a completely separate request handler to process the results. That just adds more complication than is necessary.
If you need to serialize your two requests because you want to use the result from the first request in the second request, then you can use this structure where you nest the second request inside the completion of the first one:
function middle_1(req, res, next) {
var param1 = req.params.param1; //trying to access the param1 from the path, not sure it will work in middleware
var options = {
method: 'PUT',
url: `EXTERNAL_API_1`,
headers: {
'cache-control': 'no-cache',
'content-type': 'application/x-www-form-urlencoded',
apikey: `KEY`
}
};
request(options, function (error, response, body) {
if (error) return next(error);
// CODE THAT DO SOMETHING AND GET INFORMATION
var options = {
method: 'PUT',
url: `EXTERNAL_API_2`,
headers: {
'cache-control': 'no-cache',
'content-type': 'application/x-www-form-urlencoded',
apikey: `KEY`
}
};
// this second request is nested inside the completion callback
// of the first request. This allows it to use the results
// from the first request when sending the second request.
request(options, function (error2, response2, body2) {
if (error2) return next(error2);
// right here, you can access all the results from both requests as
// they are all in scope
// response, body from the first request and
// response2, body2 from the second request
// When you are done with all your processing, then you
// can send the response here
res.json(....);
});
});
}
app.post('/test', middle_1);
Note several things about the structure of this code:
To use the results of the first request in the second one, just nest the two.
When nesting like this, the results from both requests will be available in the completion callback for request #2 as long as you give the arguments unique names so they don't accidentally hide parent scoped variables of the same name.
It does you no good to throw from an async callback, since there's no way for your code to ever catch an exception thrown from a plain async callback. If you throw, the request will likely just sit there until it eventually times out. You need to actually handle the error. Since I didn't know what error handling you wanted, I called next(error) to at least give you default error handling.
I would not suggest using multiple middleware functions that are really just one operation. You may as well just put the one operation in one request handler function as I've shown.
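The same sequencing can also be written without nesting if the two calls return promises. This is a sketch under that assumption: makeSequentialHandler, callApi1 and callApi2 are hypothetical names, and the shape of the JSON response is made up for illustration.

```javascript
// Hypothetical factory: builds one request handler that runs two
// promise-returning calls in sequence, feeding the first result to the second.
function makeSequentialHandler(callApi1, callApi2) {
  return async function (req, res, next) {
    try {
      const first = await callApi1(req.params.param1);
      // The second call starts only after the first has resolved.
      const second = await callApi2(first);
      // Both results are in scope here; send exactly one response.
      res.json({ first: first, second: second });
    } catch (err) {
      // Any failure from either call goes to Express error handling.
      next(err);
    }
  };
}
```

It would be mounted the same way as middle_1, for example app.post('/test/:param1', makeSequentialHandler(callApi1, callApi2)).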

response.writeHead and response.end in NodeJs

var https = require('https');
var fs = require('fs');
var options = {
key: fs.readFileSync('test/fixtures/keys/agent2-key.pem'),
cert: fs.readFileSync('test/fixtures/keys/agent2-cert.pem')
};
https.createServer(options, function (req, res) {
res.writeHead(200);
res.end("hello world\n");
}).listen(8000);
Can anyone explain why we call the writeHead and end methods inside the createServer call?
What is the main purpose of the options object passed to the createServer method?
Those calls to writeHead and end are not being done in the createServer method, but rather in a callback.
It's a bit easier to see if you split out the callback into a separate function:
function handleRequest(req, res) {
res.writeHead(200);
res.end("hello world\n");
}
https.createServer(options, handleRequest).listen(8000);
So here we define a handleRequest function and then pass that into the createServer call. Now whenever the node.js server we created receives an incoming request, it will invoke our handleRequest method.
This pattern is very common in JavaScript and is core to node.js' asynchronous event handling.
In your code, the writeHead() is called to write the header of the response, that the application will serve to the client. The end() method both sends the content of the response to the client and signals to the server that the response (header and content) has been sent completely. If you are still going to send anything else, you should call write() method of res response object instead.
The options object (a plain JavaScript object, not JSON) holds settings that override the default behaviour of the createServer() method. In your code's case:
+ key: Private key to use for SSL (default is null)
+ cert: Public x509 certificate to use (default is null)
You can find more in this section of the Node.js API doc about the response.writeHead() method.
You can find more in this section of the Node.js API doc about the https.createServer() method.
response.writeHead(200) sends a response header to the request. The status code is a 3-digit HTTP status code, like 404.
This method must only be called once on a message and it must be called before response.end() is called.
If you call response.write() or response.end() before calling this, the implicit/mutable headers will be calculated and call this function for you.
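A small sketch of that implicit-header behaviour: the handler below never calls writeHead(), yet the status and header it sets are still sent when end() is called. Plain http is used here only to avoid needing key and certificate files; that substitution and the variable name server are assumptions for illustration.

```javascript
const http = require('http');

// Handler that never calls writeHead(): the status code and headers set here
// are sent implicitly on the first write() or end().
function handleRequest(req, res) {
  res.statusCode = 200;
  res.setHeader('Content-Type', 'text/plain');
  res.end('hello world\n');
}

// Created but not started; call server.listen(8000) to serve requests.
const server = http.createServer(handleRequest);
```

The same handler can be passed to https.createServer(options, handleRequest) unchanged.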
As far as I know, if you don't call response.end() at the end, your web page will keep loading; response.end() is what tells the server that the response is complete.
The res.writeHead method returns a status code to the browser, and the browser will treat the response as an error if it is a client-error (4xx) or server-error (5xx) status code. The res.end method finishes the response; calling it only once everything is ready ensures the response isn't completed too early, for example in nested callbacks.
The purpose of the options object is to supply the private key and certificate that the server uses to serve the page over HTTPS.

Pipe an MJPEG stream through a Node.js proxy

Using Motion on linux, every webcam is served up as a stream on its own port.
I now want to serve up those streams, all on the same port, using Node.js.
Edit: This solution now works. I needed to get the boundary string from the original mjpeg stream (which was "BoundaryString" in my Motion config)
app.get('/motion', function(req, res) {
var boundary = "BoundaryString";
var options = {
// host to forward to
host: '192.168.1.2',
// port to forward to
port: 8302,
// path to forward to
path: '/',
// request method
method: 'GET',
// headers to send
headers: req.headers
};
var creq = http.request(options, function(cres) {
res.setHeader('Content-Type', 'multipart/x-mixed-replace;boundary="' + boundary + '"');
res.setHeader('Connection', 'close');
res.setHeader('Pragma', 'no-cache');
res.setHeader('Cache-Control', 'no-cache, private');
res.setHeader('Expires', 0);
res.setHeader('Max-Age', 0);
// wait for data
cres.on('data', function(chunk){
res.write(chunk);
});
cres.on('close', function(){
// upstream closed; the headers were already sent, so just end the client response
res.end();
});
}).on('error', function(e) {
// we got an error, log it and return a 500 error to the client if nothing was sent yet
console.log(e.message);
if (!res.headersSent) {
res.writeHead(500);
}
res.end();
});
creq.end();
});
I would think this serves up the mjpeg stream at 192.168.1.2:8302 as /motion, but it does not.
Maybe because it never ends, and this proxy example wasn't really a streaming example?
Streaming over HTTP isn't the issue. I do that with Node regularly. I think the problem you're having is that you aren't sending a content type header to the client. You go right to writing data without sending any response headers, actually.
Be sure to send the right content type header back to the client making the request, before sending any actual content data.
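A sketch of that ordering: copy the upstream status and Content-Type (which carries the multipart boundary) to the client first, then pipe the data. relayStream and its upstreamOptions argument are hypothetical names, and this assumes the upstream Content-Type can be passed through unchanged.

```javascript
const http = require('http');

// Hypothetical helper: relays an upstream HTTP stream to the client response `res`.
function relayStream(upstreamOptions, res) {
  const creq = http.get(upstreamOptions, function (cres) {
    // Send the status and Content-Type before any body data.
    res.writeHead(cres.statusCode, {
      'Content-Type': cres.headers['content-type'] || 'application/octet-stream',
      'Cache-Control': 'no-cache, private',
      'Connection': 'close'
    });
    // Forward chunks as they arrive; ends res when the upstream ends.
    cres.pipe(res);
  });

  creq.on('error', function (err) {
    // Upstream unreachable: answer with 502 if nothing was sent yet.
    if (!res.headersSent) res.writeHead(502);
    res.end();
  });

  // Stop pulling from the upstream when the viewer disconnects.
  res.on('close', function () {
    creq.destroy();
  });

  return creq;
}
```

Inside the route it could be called as relayStream({ host: '192.168.1.2', port: 8302, path: '/' }, res), using the host and port from the question.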
You may need to handle multipart responses, if Node's HTTP client doesn't already do it for you.
Also, I recommend debugging this with Wireshark so you can see exactly what is being sent and received. That will help you narrow down problems like this quickly.
I should also note that some clients have a problem with chunked encoding, which is what Node will send if you don't specify a content length (which you can't because it's indefinite). If you need to disable chunked encoding, see my answer here: https://stackoverflow.com/a/11589937/362536 Basically, you just need to disable it: response.useChunkedEncodingByDefault = false;. Don't do this unless you need to though! And make sure to send a Connection: close in your headers with it!
What you need to do is request the MJPEG stream only when it's necessary, in a single place, and respond to each client with MJPEG, or with JPEG if you need IE support.
