I am using https.request() to make an HTTPS request using the following familiar pattern:
var request = https.request(options, function (response) {
    var chunks = [];
    response.on('data', function (chunk) {
        chunks.push(chunk);
    });
    response.on('end', function () {
        var buffer = Buffer.concat(chunks);
        ...
    });
});
...
request.end();
...
Once I have the finished response Buffer, it needs to be packaged into a JSON object. This is because I am creating a kind of tunnel, whereby the HTTP response (its headers, status, and body) is sent as JSON through another protocol.
So that both textual and binary responses are supported, what works for me so far is to encode the Buffer to Base64 (using buffer.toString('base64')) and decode it at the other end using new Buffer(theJsonObject.body, 'base64'). While this works, it would be more efficient if I could perform Base64 encoding only when the HTTP response is known to be of a binary type (e.g. images). Otherwise, in the https.request() callback shown above, I could simply call chunk.toString() and convey the response body in the JSON object as a UTF-8 string. My JSON object would probably contain an additional property that tells the opposite end of the tunnel whether 'body' is a UTF-8 string (e.g. for .htm, .css, etc.) or Base64-encoded (e.g. for images).
What I could do is use the MIME type in the response's content-type header to work out whether the response is going to be binary. I would probably maintain a whitelist of types that I know are safe to treat as UTF-8 (such as 'text/html' and so on). All others (including e.g. 'image/png') would be Base64-encoded.
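Roughly, I imagine the packaging step would look something like this (just a sketch; the envelope shape and the textTypes list are mine, not any standard):
var textTypes = ['text/html', 'text/css', 'text/plain', 'application/json'];

function packageResponse(response, buffer) {
    // Strip any charset suffix, e.g. 'text/html; charset=utf-8'
    var mime = (response.headers['content-type'] || '').split(';')[0].trim();
    var isText = textTypes.indexOf(mime) !== -1;
    return {
        statusCode: response.statusCode,
        headers: response.headers,
        // tells the other end of the tunnel how 'body' is encoded
        bodyEncoding: isText ? 'utf8' : 'base64',
        body: buffer.toString(isText ? 'utf8' : 'base64')
    };
}
The other end would then rebuild the body with new Buffer(theJsonObject.body, theJsonObject.bodyEncoding).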
Can anyone propose a better solution?
Could you use the file-type package to detect the file type by checking the magic number of the buffer?
Install
npm install --save file-type
Usage
var fileType = require('file-type');
var safeTypes = ['image/gif'];

var request = https.request(options, function (response) {
    var chunks = [];
    response.on('data', function (chunk) {
        chunks.push(chunk);
    });
    response.on('end', function () {
        var buffer = Buffer.concat(chunks);
        var file = fileType(buffer);
        console.log(file);
        //=> { ext: 'gif', mime: 'image/gif' }

        // file-type returns null when it can't detect a type (e.g. plain
        // text), so guard against that before checking the mime
        if (!file || safeTypes.indexOf(file.mime) === -1) {
            // mime isn't safe; do your Base64 thing
        }
    });
});
...
request.end();
...
If you want to keep your code package-free, have a look at the package source on GitHub; it's pretty minimal.
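If you only need to recognize a handful of formats, the magic numbers are also easy to check by hand. A minimal sketch (the signatures below are the standard ones for PNG, JPEG, GIF and PDF; extend the list as needed):
function looksBinary(buffer) {
    if (buffer.length < 4) return false;
    // PNG: 89 50 4E 47
    if (buffer[0] === 0x89 && buffer[1] === 0x50 && buffer[2] === 0x4E && buffer[3] === 0x47) return true;
    // JPEG: FF D8 FF
    if (buffer[0] === 0xFF && buffer[1] === 0xD8 && buffer[2] === 0xFF) return true;
    // GIF87a / GIF89a
    if (buffer.slice(0, 4).toString('ascii') === 'GIF8') return true;
    // PDF
    if (buffer.slice(0, 4).toString('ascii') === '%PDF') return true;
    return false;
}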
Related
I'm using the request library to send a binary (PDF) file in the body of the request using HTTP POST (NOTE: this API does not accept multi-part forms). However, I have only been able to get it to work using fs.readFileSync(). For some reason, when I try to use fs.createReadStream(), the PDF file is still sent, but it is empty, and the request never finishes (I never get a response back from the server).
Here is my working version using fs.readFileSync():
const request = require('request');
const fs = require('fs');

const filename = 'test.pdf';

request({
    url: 'http://localhost:8083/api/v1/endpoint',
    method: 'POST',
    headers: {
        'Content-Type': 'application/octet-stream',
        'Accept': 'application/vnd.api+json',
        'Content-Disposition': `file; filename="${filename}"`
    },
    encoding: null,
    body: fs.readFileSync(filename)
}, (error, response, body) => {
    if (error) {
        console.log('error:', error);
    } else {
        console.log(JSON.parse(response.body.toString()));
    }
});
If I try to replace the body with the below, it doesn't work:
body: fs.createReadStream(filename)
I have also tried piping the file stream into the http request, like it says in the request library docs, but I get the same result:
fs.createReadStream(filename).pipe(request({...}))
I've tried to monitor the stream by doing the following:
var upload = fs.createReadStream('test.pdf');
upload.pipe(req);

var upload_progress = 0;
upload.on('data', function (chunk) {
    upload_progress += chunk.length;
    console.log(new Date(), upload_progress);
});

upload.on('end', function (res) {
    console.log('Finished');
    req.end();
});
I see progress for the stream and Finished, but still no response is returned from the API.
I'd prefer to create a read stream because of how much better it works with larger files, but I'm clueless as to what is going wrong. I am also making sure I'm not altering the file with any special encoding.
Is there some way to get some kind of output to see what process is taking forever?
UPDATE:
I decided to test with a simple 1 KB .txt file. I found that the file is still empty when using fs.createReadStream(); however, this time I got a response back from the server. The test PDF I'm working with is 363 KB, which isn't outrageous in size, but still... weren't streams made for large files anyway? Using fs.readFileSync() also worked fine for the text file.
I'm beginning to wonder if this is a synchronous vs. asynchronous issue. I know that fs.readFileSync() is synchronous. Do I need to wait until fs.createReadStream() finishes before trying to append it to the body?
I was able to get this working by doing the following:
const request = require('request');
const fs = require('fs');

const filename = 'test.pdf';
const readStream = fs.createReadStream(filename);
let chunks = [];

readStream.on('data', (chunk) => chunks.push(chunk));

readStream.on('end', () => {
    const data = Buffer.concat(chunks);
    request({
        url: 'http://localhost:8083/api/v1/endpoint',
        method: 'POST',
        headers: {
            'Content-Type': 'application/octet-stream',
            'Accept': 'application/vnd.api+json',
            'Content-Disposition': `file; filename="${filename}"`
        },
        encoding: null,
        body: data
    }, (error, response, body) => {
        if (error) {
            console.log('error:', error);
        } else {
            console.log(JSON.parse(response.body.toString()));
        }
    });
});
I collected the chunks and concatenated them into a single Buffer before making the request.
I noticed in the documentation it said this:
The Buffer class was introduced as part of the Node.js API to enable interaction with octet streams in TCP streams, file system operations, and other contexts.
The API I'm calling requires the application/octet-stream header, so I need to use the buffer rather than streaming it directly.
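That said, if streaming really matters (e.g. for files too big to buffer), one thing that may be worth trying is to keep the pipe but send an explicit Content-Length, so the request module doesn't fall back to chunked transfer encoding (which some octet-stream endpoints reject). An untested sketch:
const fs = require('fs');
const request = require('request');

const filename = 'test.pdf';

fs.stat(filename, (err, stats) => {
    if (err) throw err;
    fs.createReadStream(filename).pipe(request({
        url: 'http://localhost:8083/api/v1/endpoint',
        method: 'POST',
        headers: {
            'Content-Type': 'application/octet-stream',
            'Accept': 'application/vnd.api+json',
            'Content-Disposition': `file; filename="${filename}"`,
            'Content-Length': stats.size // explicit length avoids chunked encoding
        },
        encoding: null
    }, (error, response, body) => {
        if (error) {
            console.log('error:', error);
        } else {
            console.log(JSON.parse(response.body.toString()));
        }
    }));
});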
I am receiving a PDF file from a Node server (it is running jsreport on this server) and I need to download this PDF on the client (I am using React on the client). The problem is that when I download the file, it comes out all blank and the title shows some strange symbols. After a lot of tests and research, I found that the problem may be that the file arrives encoded as chunked (I can see that in the headers of the response) and I need to decode it to get a file again.
So, how do I decode this chunked string back into a file?
On the client I am just downloading the file that comes in the response:
handleGerarRelatorioButtonClick() {
    axios.post(`${REQUEST_URL}/relatorios`, this.state.selectedExam).then((response) => {
        fileDownload(response.data, this.state.selectedExam.cliente.nome.replace(' ', '_') + '.pdf');
    });
}
On my server, I am making a request to jsreport, which is another Node server, and it returns the report as a PDF:
app.post('/relatorios', (request, response) => {
    var exame = new Exame(request.body);
    var body = {
        "template": {
            "shortid": "S1C9birB-",
            "data": exame
        }
    };
    var options = {
        hostname: 'localhost',
        port: 5488,
        path: '/api/report',
        method: 'POST',
        headers: {
            'Content-Type': 'application/json'
        }
    };
    var bodyparts = [];
    var bodylength = 0;
    var post = http.request(options, (res) => {
        res.on('data', (chunk) => {
            bodyparts.push(chunk);
            bodylength += chunk.length;
        });
        res.on('end', () => {
            var pdf = new Buffer(bodylength);
            var pdfPos = 0;
            for (var i = 0; i < bodyparts.length; i++) {
                bodyparts[i].copy(pdf, pdfPos, 0, bodyparts[i].length);
                pdfPos += bodyparts[i].length;
            }
            response.setHeader('Content-Type', 'application/pdf');
            response.setHeader('Content-disposition', exame._id + '.pdf');
            response.setHeader('Content-Length', bodylength);
            response.end(Buffer.from(pdf));
        });
    });
    post.write(JSON.stringify(body));
    post.end();
});
I am sure that my report is being rendered as expected, because if I make the request from Postman it returns the PDF just fine.
Your solution simply relays data chunks, but you are not telling your front end what to expect of these chunks or how to assemble them. At a minimum you should be setting the Content-Type response header to application/pdf and, to be complete, you should also send Content-Disposition and Content-Length. If you are not able to set headers and pipe to the response successfully, you may need to collect the PDF from your third-party source into a buffer and then send that buffer to your client.
[edit] - I'm not familiar with jsreport but it is possible (and likely) that the response they send is a buffer. If that is the case you could use something like this in place of your response to the client:
myGetPDFFunction(params, (err, res) => {
    if (err) {
        // handle it
    } else {
        response.writeHead(200, {
            'Content-Type': 'application/pdf',
            'Content-Length': [your buffer's content length]
        });
        response.end(Buffer.from([the res PDF buffer]));
    }
});
What you haven't shown is the request made to obtain that PDF, so I can't be more specific at this time. You should look at the jsreport documentation to see what it sends in its response, and you can also read up on buffers in the Node.js docs.
This is rough pseudocode, but the point is to respond with the PDF buffer after setting the headers to their proper values.
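If jsreport's response turns out to be a plain readable stream, an even simpler variant (a sketch against the code in your question, untested) is to set the headers before any data is written and pipe straight through:
var post = http.request(options, (res) => {
    // write the headers first, then stream the PDF bytes through untouched
    response.writeHead(200, {
        'Content-Type': 'application/pdf',
        'Content-Disposition': 'attachment; filename="' + exame._id + '.pdf"'
    });
    res.pipe(response);
});
post.write(JSON.stringify(body));
post.end();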
I am trying to make a proxy call to a website (somesite.com) and get the HTML from it. somesite.com is chunked and gzipped, so I was unable to parse the buffer (responseFromServer in my code) into HTML (currently I get a jumbled string in the browser when I do res.write).
I tried res.end and res.send, but neither of them works.
function renderProxyRequest(req, res) {
    // somesite.com is gzipped and also chunked.
    var options = {
        protocol: 'http:',
        hostname: 'somesite.com',
        // passing in my current headers
        headers: req.headers,
        maxRedirects: 0,
        path: req.url,
        socketTimeout: 200000,
        connectTimeout: 1800,
        method: 'GET'
    };
    var proxyrequest = someProxyApi.request(options);
    proxyrequest.on('response', function (postresponse) {
        // postresponse is a buffer
        // postresponse.pipe(res);
        var responseFromServer = '';
        postresponse.on('data', function (data) {
            responseFromServer += data;
        });
        postresponse.on('end', function () {
            // getting some jumbled string onto the browser.
            res.write(responseFromServer);
            res.end();
        });
    });
    req.pipe(proxyrequest);
}
If postresponse is a stream, you can probably do something like this:
const zlib = require('zlib');
...
postresponse.pipe(zlib.createGunzip()).pipe(res);
You have to check if the response is gzipped to begin with, by checking the Content-Encoding header from the remote server.
Alternatively, if you pass the original headers from the remote server to the client you're proxying the request for, you should be able to just pass the response data along as-is (because the original headers will tell the client that the data was gzipped). This obviously depends on the client being able to handle compressed responses (browsers will).
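Putting both together, a sketch against the code in your question:
var zlib = require('zlib');

proxyrequest.on('response', function (postresponse) {
    // gunzip only when the upstream says the body is gzipped;
    // otherwise pass the bytes through untouched
    if (postresponse.headers['content-encoding'] === 'gzip') {
        postresponse.pipe(zlib.createGunzip()).pipe(res);
    } else {
        postresponse.pipe(res);
    }
});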
Is it possible with an http request header to ask the server not to gzip content?
I'm using node.js request.get library to make an API call and it appears the content is coming back as gzipped.
It's only a problem with one API (I call several), and I'm thinking maybe their server is misconfigured. But I wanted to try asking the server for a non-gzipped version.
Here is the response I'm getting:
GET https://www.itbit.com/api/feeds/ticker/XBTUSD
4R�&HTpȇ��{3y�L�3��SJ)$�Qj��)�w\d�P�����('t]{�d#�������?� �ŔŅ�2�1Y��_�-X%�uS��}��Y���`���gN?
�-sP��rr6�.셢$�h��]������h�>�����<]#�mx-�����d ��鑈�`��+fos�r��%�����~G�c���E)���̓5pqXK�h�S����<��,M�F�P�n�'��#��+#��]琛����Ʒ{q���܀�6u*�lygnؓ�������z��ë>X�� �rS).����s!Z�U�"Fg��:zL �����mx�W�_ѯ���^�
<l��ۊp?�t��H�1ǎ�e-��zCw�#�e�4�r�ke�z����zN��o�8����5�������\B<3��HL~g!�I��ȥ��.贡h_�aE�]X~��E����_���/7���h Ia�����3���H:\�Âi����l��2�;]w;ގ:��\���s���(�4�hV咸�q�/g�v�
Assuming I'm understanding your problem correctly, you can explicitly provide a value for the Accept-Encoding header in your HTTP GET request.
request({
    url: '...',
    headers: {
        'Accept-Encoding': 'identity'
    }
}, function (err, res, body) {
});
This assumes that the server you are requesting from respects the Accept-Encoding header. If it doesn't, then your only option would be to just unzip the content.
var zlib = require('zlib');

var req = request.get(...);

req.on('response', function (res) {
    var stream;
    if (res.headers['content-encoding'] === 'gzip') {
        stream = res.pipe(zlib.createGunzip());
    } else {
        stream = res;
    }

    var chunks = [];

    stream.on('readable', function () {
        var chunk;
        while ((chunk = stream.read()) !== null) chunks.push(chunk);
    });

    stream.on('end', function () {
        var body = Buffer.concat(chunks);
        // Do what you'd normally do.
    });
});
This is how you would conditionally unzip a request based on the content encoding. That said, this API looks pretty inconsistent, since running this with the URL you gave returns a stack trace. As #robertklep pointed out, they seem to do some user-agent checking too, so it seems like this API isn't really designed for public consumption.
It's a very strange server indeed.
This seems to prevent it from sending back gzipped content:
request({
    url: 'https://www.itbit.com/api/feeds/ticker/XBTUSD',
    headers: { 'User-Agent': '' }
}, ...);
(or some other random User-Agent header; it might be caching requests based on certain HTTP headers, and randomizing those headers may prevent it from serving already-cached gzipped responses)
I would like to stream the contents of an HTTP response to a variable. My goal is to get an image via request() and store it in MongoDB - but the image is always corrupted.
This is my code:
request('http://google.com/doodle.png', function (error, response, body) {
    image = new Buffer(body, 'binary');
    db.images.insert({ filename: 'google.png', imgData: image }, function (err) {
        // handle errors etc.
    });
});
What is the best way to use Buffer/streams in this case?
The request module buffers the response for you. In the callback, body is a string (or Buffer).
You only get a stream back from request if you don't provide a callback; request() returns a Stream.
See the docs for more detail and examples.
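For example, with no callback you can pipe the response body directly, so the image bytes never need to be buffered in memory:
var fs = require('fs');
var request = require('request');

// request() returns a stream when no callback is given
request('http://google.com/doodle.png')
    .pipe(fs.createWriteStream('doodle.png'));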
request assumes that the response is text, so it tries to convert the response body into a string (regardless of the MIME type). This will corrupt binary data. If you want to get the raw bytes, specify a null encoding.
request({ url: 'http://google.com/doodle.png', encoding: null }, function (error, response, body) {
    db.images.insert({ filename: 'google.png', imgData: body }, function (err) {
        // handle errors etc.
    });
});
var options = {
    headers: {
        'Content-Length': contentLength,
        'Content-Type': 'application/octet-stream'
    },
    url: 'http://localhost:3000/lottery/lt',
    body: formData,
    encoding: null, // make the response body a Buffer
    method: 'POST'
};
Set encoding to null and the response body will be returned as a Buffer.
Have you tried piping this?:
request.get('http://google.com/doodle.png').pipe(request.put('{your mongo path}'))
(Though not familiar enough with Mongo to know if it supports direct inserts of binary data like this, I know CouchDB and Riak do.)
Nowadays, you can easily retrieve a file in binary with Node 8, RequestJS, and async/await. I used the following:
const buffer = await request.get(pdf.url, { encoding: null });
The response was a Buffer containing the bytes of the PDF. Much cleaner than big option objects and old-school callbacks.
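Note that the plain request module is callback-based and can't be awaited directly, so the snippet above presumably relies on a promisified wrapper such as request-promise-native. A fuller sketch under that assumption:
const request = require('request-promise-native');

async function fetchPdf(url) {
    // encoding: null keeps the body as a raw Buffer instead of a string
    const buffer = await request.get(url, { encoding: null });
    return buffer;
}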