How do I send raw binary data via HTTP in node.js? - node.js

From a node.js back end, I need to send an HTTP message to a REST endpoint. The endpoint requires some parameters that it will expect to find in the HTTP message. Some of the parameters are simple enough, just requiring a number or a string as an argument. But one of the parameters is to be "the raw binary file content being uploaded" and this has puzzled me. As far as I understand, the parameters need to be gathered together into a string to put in the body of the HTTP request. How do I add raw binary data to a string? Obviously, for it to be in the string, it cannot be raw binary data; it needs to be encoded into characters.
The endpoint in question is the Twitter media upload API. The "raw binary data" parameter is called media. Below is an incomplete code snippet showing the gist of what I've tried, in particular the line where I build the requestBody string. I don't believe it is anywhere near correct, because the endpoint is returning a "bad request" message.
var https = require("https");
var base64ImageData = /* (some base 64 string here) */;
var options = {
host: "api.twitter.com",
path: "/1.1/media/upload.json",
method: "POST",
headers: {
"Content-Type": "multipart/form-data"
}
};
var request = https.request(options, function(response) {});
var requestBody = "media_id=18283918294&media=" + Buffer.from(base64ImageData, "base64").toString("binary");
request.write(requestBody);
request.end();
Also worth noting, Twitter themselves note the following extremely confusing statement:
"When posting base64 encoded images, be sure to set the “Content-Transfer-Encoding: base64” on the image part of the message."
Source: https://developer.twitter.com/en/docs/media/upload-media/uploading-media/media-best-practices
That might be part of the answer to my question, but what I don't understand is: How do I apply different headers to different parts of the HTTP message? Because apparently, the image data needs to have a Content-Transfer-Encoding header of "base64" while the rest of the HTTP message does not...

How do I apply different headers to different parts of the HTTP message?
This is the point of the multipart/form-data content type. A multi-part message looks like this:
Content-Type: multipart/form-data; boundary=foo

--foo
Content-Disposition: form-data; name="datafile1"; filename="r.gif"
Content-Transfer-Encoding: base64
Content-Type: image/gif

// data goes here
--foo
Content-Disposition: form-data; name="datafile2"; filename="g.png"
Content-Transfer-Encoding: base64
Content-Type: image/png

// another file's data goes here
--foo--
Each part starts with "--" followed by the boundary, a blank line separates a part's headers from its data, and the final delimiter ends with an extra "--".
You probably don't want to put all this together yourself. There are a bunch of good libraries for putting together complex POSTs. For example: https://www.npmjs.com/package/form-data
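For illustration only, here is roughly what assembling such a body by hand involves. This is a minimal sketch, not a verified client for the Twitter endpoint: the helper buildMultipartBody, the boundary value, the text field named note, and the placeholder bytes are all assumptions made up for the example. It sends the file bytes raw rather than base64-encoded, so it sets no Content-Transfer-Encoding header on the part.

```javascript
// Sketch: build a multipart/form-data body by hand with Buffers.
// buildMultipartBody, the field names and the sample bytes are illustrative assumptions.
const crypto = require('crypto');

function buildMultipartBody(fields, file, boundary) {
  const chunks = [];
  // Plain text fields: one part each, with only a Content-Disposition header.
  for (const [name, value] of Object.entries(fields)) {
    chunks.push(Buffer.from(
      `--${boundary}\r\n` +
      `Content-Disposition: form-data; name="${name}"\r\n\r\n` +
      `${value}\r\n`
    ));
  }
  // File part: its own headers, a blank line, then the raw bytes untouched.
  chunks.push(Buffer.from(
    `--${boundary}\r\n` +
    `Content-Disposition: form-data; name="${file.name}"; filename="${file.filename}"\r\n` +
    `Content-Type: ${file.contentType}\r\n\r\n`
  ));
  chunks.push(file.data);
  // Closing delimiter: "--" + boundary + "--".
  chunks.push(Buffer.from(`\r\n--${boundary}--\r\n`));
  return Buffer.concat(chunks);
}

const boundary = 'sketch-' + crypto.randomBytes(8).toString('hex');
// "GIF89a" followed by two arbitrary bytes, standing in for real image data.
const imageBytes = Buffer.from([0x47, 0x49, 0x46, 0x38, 0x39, 0x61, 0x00, 0xff]);
const body = buildMultipartBody(
  { note: 'example' },
  { name: 'media', filename: 'r.gif', contentType: 'image/gif', data: imageBytes },
  boundary
);
// Headers for the outer request; the boundary must match the one used in the body.
const headers = {
  'Content-Type': 'multipart/form-data; boundary=' + boundary,
  'Content-Length': body.length
};
```

The body is kept as a Buffer from start to finish, so the binary data never has to survive a round trip through a string. It could then be passed to request.write(body) on a request created with these headers.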

Related

Uploading an image using Multipart by karate API tool [duplicate]

I have a simple POST request that requires a json Content-Type header and a body like
{
oneNbr: "2016004444",
twoCode: "###",
threeNbr: "STD PACK",
sheetTitle: "010000",
codeType: "AF14"
}
When I run this in Postman, it runs as expected, returning 200 status and the expected response.
Here's the same script in Karate:
Scenario: Sample test
* def payload =
"""
{
oneNbr: "2016004444",
twoCode: "###",
threeNbr: "STD PACK",
sheetTitle: "010000",
codeType: "AF14"
}
"""
Given path '/my_end_point/endpoint'
And request payload
When method post
Then status 200
When I run this, it returns {"code":"415","status":"Unsupported Media Type"}. The console output shows that the right content-type is being set during the POST.
Even if I specifically set the content-type in the script, 415 is still returned e.g.
And header Content-Type = 'application/json'
OR
* configure headers = { 'Content-Type': 'application/json' }
Any help is appreciated.
We did some debugging and found that Karate automatically appends 'charset=UTF-8' to the Content-Type header. The API does not expect charset.
Found the following post and that solved the problem:
How can I send just 'application/json' as the content-type header with Karate?
Posting this to help others in future.
It's simple. Try adding this to your Background section:
* configure charset = null
Try adding a * header Accept = 'application/json' header. The one difference between Karate and Postman is that Postman tries to be smart and auto-adds an Accept header - whereas Karate does not.

"Header content contains invalid characters" error when piping multipart upload part into a new request

My express server receives file uploads from browsers. The uploads are transferred as multipart/form-data requests; I use multiparty to parse the incoming entity body.
Multiparty allows you to get a part (roughly, a single form field like an <input type="file">) as a readable stream. I do not want to process or store the uploaded file(s) on my web server, so I just pipe the uploaded file part into a request made to another service (using the request module).
app.post('/upload', function(req, res) {
var form = new multiparty.Form();
form.on('part', function(part) {
var serviceRequest = request({
method: 'POST',
url: 'http://other-service/process-file',
headers: {
'Content-Type': 'application/octet-stream'
}
}, function(err, svcres, body) {
// handle response
});
part.pipe(serviceRequest);
});
form.parse(req);
});
This works correctly most of the time. node automatically applies chunked transfer encoding, and as the browser uploads file bytes, they are correctly sent to the backend service as a raw entity body (without the multipart formatting), which ultimately gets the complete file and returns successfully.
However, sometimes the request fails and my callback gets called with this err:
TypeError: The header content contains invalid characters
at ClientRequest.OutgoingMessage.setHeader (_http_outgoing.js:360:11)
at new ClientRequest (_http_client.js:85:14)
at Object.exports.request (http.js:31:10)
at Object.exports.request (https.js:199:15)
at Request.start (/app/node_modules/request/request.js:744:32)
at Request.write (/app/node_modules/request/request.js:1421:10)
at PassThrough.ondata (_stream_readable.js:555:20)
at emitOne (events.js:96:13)
at PassThrough.emit (events.js:188:7)
at PassThrough.Readable.read (_stream_readable.js:381:10)
at flow (_stream_readable.js:761:34)
at resume_ (_stream_readable.js:743:3)
at _combinedTickCallback (internal/process/next_tick.js:80:11)
at process._tickDomainCallback (internal/process/next_tick.js:128:9)
I'm unable to explain where that error is coming from since I only set the Content-Type header and the stack does not contain any of my code.
Why do my uploads occasionally fail?
This example shows how to send a file as an attachment when the filename contains non-ASCII characters.
const http = require('http');
const fs = require('fs');
const contentDisposition = require('content-disposition');
...
// req, res - http request and response
let filename='totally legit 😈.pdf';
let filepath = 'D:/temp/' + filename;
res.writeHead(200, {
'Content-Disposition': contentDisposition(filename), // Mask non-ANSI chars
'Content-Transfer-Encoding': 'binary',
'Content-Type': 'application/octet-stream'
});
var readStream = fs.createReadStream(filepath);
readStream.pipe(res);
readStream.on('error', (err) => ...);
Node throws that TypeError when making an outgoing HTTP request if any string in the request's headers option object contains a character outside the basic ASCII range.
In this case, it appears that the Content-Disposition header is getting set on the request even though it is never specified in the request options. Since that header contains the uploaded filename, the request can fail if the filename contains non-ASCII characters. For example:
POST /upload HTTP/1.1
Host: public-server
Content-Type: multipart/form-data; boundary=--ex
Content-Length: [bytes]
----ex
Content-Disposition: form-data; name="file"; filename="totally legit 😈.pdf"
Content-Type: application/pdf
[body bytes...]
----ex--
The request to other-service/process-file then fails because multiparty stores the part headers on the part object, which is also a readable stream representing the part body. When you pipe() the part into serviceRequest, the request module looks to see if the piped stream has a headers property, and if it does, copies them to the outgoing request headers.
This results in the outgoing request that would look like:
POST /process-file HTTP/1.1
Host: other-service
Content-Type: application/octet-stream
Content-Disposition: form-data; name="file"; filename="totally legit 😈.pdf"
Content-Length: [bytes]
[body bytes...]
...except that node sees the non-ASCII character in the Content-Disposition header and throws. The thrown error is caught by request and passed to the request callback function as err.
This behavior can be avoided by removing the part headers before piping it into the request.
delete part.headers;
part.pipe(serviceRequest);
As @arrow commented earlier, you can also apply encodeURI(filename) to the filename in your Content-Disposition header. On the client, use decodeURI to decode it.
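Both ideas from the answers above can be sketched in a few lines. This is a rough illustration under stated assumptions: dropUnsafeHeaders and asciiFilename are hypothetical helpers, and the printable-ASCII test is a conservative approximation rather than node's exact validation rule.

```javascript
// Hypothetical helper: keep only header values made of tab or printable ASCII.
// This is a conservative approximation, not node's exact validation rule.
function dropUnsafeHeaders(headers) {
  const safe = {};
  for (const [name, value] of Object.entries(headers || {})) {
    if (/^[\t\x20-\x7e]*$/.test(String(value))) {
      safe[name] = value;
    }
  }
  return safe;
}

// Hypothetical alternative: percent-encode the filename so it becomes plain ASCII.
function asciiFilename(filename) {
  return encodeURI(filename);
}

// Sample part headers, modeled on the multipart example above.
const partHeaders = {
  'content-disposition': 'form-data; name="file"; filename="totally legit 😈.pdf"',
  'content-type': 'application/pdf'
};
const forwarded = dropUnsafeHeaders(partHeaders); // only content-type survives
const encoded = asciiFilename('totally legit 😈.pdf');
const decoded = decodeURI(encoded); // what the receiving side would do
```

Filtering is the gentler variant of delete part.headers: it forwards the headers that are safe and drops the rest. Encoding keeps the filename but requires the receiver to know it must decode it.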

why didn't I get exact same file size through node.js?

I have some simple upload code in node.js.
var http = require('http')
var fs = require('fs')
var server = http.createServer(function(req, res){
if(req.url == '/upload') {
var a = fs.createWriteStream('a.jpg', { defaultEncoding: 'binary'})
req.on('data', function(chunk){
a.write(chunk)
})
.on('end', function(){
a.end()
res.end('okay')
})
}
else {
fs.createReadStream('./index.html').pipe(res);
// just show <form>
}
})
server.listen(5000)
When I upload an image, the saved file is not identical to the original; it is always corrupted.
When I do the same thing using formidable, the file is saved correctly.
So I studied formidable, but I cannot understand how it captures the data and saves it.
I could see that formidable uses a parser to process the chunks from the request, but I did not fully understand it.
(It is definitely my brain issue :( ).
Anyway, what is the difference between my code and formidable?
Is it wrong to simply collect all the chunks from the HTTP request and save them with fs.createWriteStream or fs.writeFile?
What concepts am I missing?
First, req is a Readable stream. You can simply do:
req.pipe(fs.createWriteStream('a.jpg'))
for the upload part. This copies every byte from the request stream to the file.
This works when you send the raw file data as the request body:
curl --data-binary @"/home/user/Desktop/a.jpg" http://localhost:8080/upload
Because this sends the image's binary data as the entire request body, exactly those bytes get streamed to the file on the server.
But there is another request format called multipart/form-data. This is what web browsers use with <form> to upload files:
curl --form "image=@/home/user1/Desktop/a.jpg" http://localhost:8080/upload
Here the request body contains multiple "parts", one for each file attachment or form field, separated by special "boundary" characters:
--------------------------e3f25f5319cd6624
Content-Disposition: form-data; name="image"; filename="a.jpg"
Content-Type: application/octet-stream
JPG IHDRH-ÑtEXtSoftwareAdobe.....raw file data
--------------------------e3f25f5319cd6624--
Hence you need considerably more complicated code to extract the file data from such a body. npm modules such as busboy and formidable do exactly that.
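To make the difference concrete, here is a deliberately naive sketch that pulls the first part's bytes out of a body that has already been fully buffered. The function extractFirstPart, the boundary, and the sample bytes are assumptions for illustration; busboy and formidable work on streams and handle far more cases (multiple parts, boundaries split across chunks, malformed input).

```javascript
// Sketch: extract the first part's data from a fully buffered multipart body.
// extractFirstPart is a hypothetical helper, not how busboy or formidable are implemented.
function extractFirstPart(body, boundary) {
  const delimiter = Buffer.from('--' + boundary);
  const start = body.indexOf(delimiter);
  if (start === -1) return null;
  // Part headers end at the first blank line (CRLF CRLF).
  const headerEnd = body.indexOf('\r\n\r\n', start);
  if (headerEnd === -1) return null;
  const dataStart = headerEnd + 4;
  // The part data ends just before CRLF + the next delimiter.
  const dataEnd = body.indexOf(Buffer.concat([Buffer.from('\r\n'), delimiter]), dataStart);
  if (dataEnd === -1) return null;
  return {
    // Skip the delimiter and the CRLF that follows it to reach the header lines.
    headers: body.slice(start + delimiter.length + 2, headerEnd).toString('latin1'),
    data: body.slice(dataStart, dataEnd)
  };
}

// Made-up boundary and bytes, for illustration only.
const sampleBoundary = 'example-boundary';
const fileBytes = Buffer.from([0xff, 0xd8, 0xff, 0xe0, 0x00, 0x10]);
const sampleBody = Buffer.concat([
  Buffer.from('--' + sampleBoundary + '\r\n' +
    'Content-Disposition: form-data; name="image"; filename="a.jpg"\r\n' +
    'Content-Type: application/octet-stream\r\n\r\n'),
  fileBytes,
  Buffer.from('\r\n--' + sampleBoundary + '--\r\n')
]);
const part = extractFirstPart(sampleBody, sampleBoundary);
```

Here sampleBody is longer than fileBytes because of the delimiters and part headers, which is why writing the whole request body to a.jpg produces a larger, broken file.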

non-chunked multipart/mixed POST?

I'm trying to send multiple binary files to a web service in a single multipart/mixed POST but can't seem to get it to work... target server rejects it. One thing I noticed is Node is trying to do the encoding as chunked, which I don't want:
POST /SomeFiles HTTP/1.1
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary=123456789012345
Host: host.server.com
Connection: keep-alive
Transfer-Encoding: chunked
How do I get it to stop being chunked? Docs say that can be disabled by setting the Content-Length in the main request header but I can't set the Content-Length since, among other reasons, I'm loading one file at a time -- but it shouldn't be required either since it's multipart.
I think the rest is OK (excepting it's not chunked, unless the req_post.write() is doing that part), e.g. for the initial part I:
var req_post = https.request({
hostname: 'host.server.com',
path: '/ManyFiles',
method: 'POST',
headers: {
'MIME-Version': '1.0',
'Content-Type': 'multipart/mixed; boundary=' + boundary
}
},...);
and then pseudo-code:
while ( true ) {
// { get next file here }
req_post.write('\r\n--' + boundary + '\r\nContent-Type: ' + filetype + '\r\nContent-Length: ' + filesize + '\r\n\r\n');
req_post.write(chunk);// filesize
if ( end ) {
req_post.write('\r\n--' + boundary + '--\r\n');
req_post.end();
break;
}
}
Any help/pointers is appreciated!
The quick answer is that you cannot disable chunked encoding without setting Content-Length. Chunked encoding was introduced for cases where you do not know the size of the payload when you start transmitting. Originally, Content-Length was required, and the recipient knew it had the full message once it had received Content-Length bytes. Chunked encoding removed that requirement by transmitting a series of chunks, each prefixed with its own size, followed by a zero-size chunk to denote completion of the payload. So if you neither set Content-Length nor use chunked encoding, the recipient will never know when it has the full payload.
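The framing described above can be sketched as follows. Node applies it automatically, so this is only meant to show what goes over the wire; encodeChunked is a hypothetical helper, and the sketch ignores optional features such as trailers.

```javascript
// Sketch of chunked transfer framing: each chunk is its size in hex, CRLF,
// the data, CRLF; a zero-size chunk terminates the body.
function encodeChunked(chunks) {
  const out = [];
  for (const chunk of chunks) {
    // An empty chunk would look like the terminator, so skip it.
    if (chunk.length === 0) continue;
    out.push(Buffer.from(chunk.length.toString(16) + '\r\n'));
    out.push(chunk);
    out.push(Buffer.from('\r\n'));
  }
  out.push(Buffer.from('0\r\n\r\n'));
  return Buffer.concat(out);
}

const framed = encodeChunked([Buffer.from('hello'), Buffer.from(', world!')]);
```

The sender never has to know the total size up front, only the size of the chunk it is about to write.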
To help solve your problem, if you cannot send chunked and do not want to read all the files before sending, take a look at fs.stat. You can use it to get the file size without reading the file.
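As a rough sketch of that suggestion, the total Content-Length of a multipart body can be computed before any file is read by adding the file sizes reported by fs.statSync to the byte length of the delimiters and part headers. The helpers partHeader and totalContentLength, the temporary files, and the simplified part headers (no per-part Content-Length) are assumptions for illustration.

```javascript
// Sketch: compute Content-Length for a multipart/mixed body up front, using file sizes
// from fs.statSync instead of reading the files. Helper names are hypothetical.
const fs = require('fs');
const os = require('os');
const path = require('path');

function partHeader(boundary, filetype) {
  return '\r\n--' + boundary + '\r\nContent-Type: ' + filetype + '\r\n\r\n';
}

function totalContentLength(files, boundary) {
  let total = 0;
  for (const file of files) {
    total += Buffer.byteLength(partHeader(boundary, file.type)); // delimiter + headers
    total += fs.statSync(file.path).size;                        // file bytes, not read
  }
  total += Buffer.byteLength('\r\n--' + boundary + '--\r\n');   // closing delimiter
  return total;
}

// Demo with two small temporary files standing in for the real uploads.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'multipart-'));
const files = [
  { path: path.join(dir, 'a.bin'), type: 'application/octet-stream' },
  { path: path.join(dir, 'b.bin'), type: 'application/octet-stream' }
];
fs.writeFileSync(files[0].path, Buffer.alloc(10));
fs.writeFileSync(files[1].path, Buffer.alloc(25));
const mixedBoundary = '123456789012345';
const contentLength = totalContentLength(files, mixedBoundary);
```

The resulting number would go into the request's headers as 'Content-Length', after which the files can still be streamed one at a time. The bytes actually written must match this count exactly, so any change to the part headers has to be mirrored in partHeader.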

Umlauts broken when doing get request

I'm trying to query a webservice which answers with plain text. The text often contains German umlauts. In the received stream the umlauts are broken. Any idea what I am doing wrong?
Regards,
Torsten
Here is the sample code:
var request = require('request');
var uri = <anUriWithUserId>;
request(uri, {encoding: 'utf8','content-type': 'text/plain; charset=UTF-8'},
function (error, response, body)
{
console.log("encoding: " + response.headers['content-encoding']);
console.log("type: " + response.headers['content-type']);
console.log(body);
});
And the response:
encoding: undefined
type: text/plain
error=0
---
asin=
name=Eistee
detailname=Pfanner Der Gr�ne Tee, Zitrone - Kaktusfeige, 2,0 l
vendor=Hermann Pfanner Getr�nke GmbH, Lauterach, �sterreich
maincat=Getr�nke, Alkohol
When you set the encoding option in your request call, you tell the request module to decode the response body with that encoding. This ignores the encoding actually used by the webservice, which may or may not be UTF-8. You need to find out which encoding the webservice used and use that.
Depending on how compliant the webservice is, you could also try setting the Accept-Charset: utf-8 header.
As your output shows, the webservice doesn't declare its encoding in the Content-Type header, which is a bad habit imho.
Sidenote: Content-Encoding isn't for the charset but for compression; gzip might be a valid value for it.
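The effect is easy to reproduce with a Buffer. The sketch below assumes the webservice sends ISO-8859-1 (latin1), which is only a guess based on the symptoms; the sample text is made up from the output above.

```javascript
// Sketch: the same bytes decoded with the wrong and the right charset.
// ISO-8859-1 (latin1) is an assumption about what the webservice actually sends.
const raw = Buffer.from('Grüne Getränke, Österreich', 'latin1');

const wrong = raw.toString('utf8');   // invalid UTF-8 sequences become U+FFFD
const right = raw.toString('latin1'); // each byte maps to exactly one character
```

With the request module, passing encoding: null makes it hand back the body as a Buffer, so a decoding step like the one above could be applied once the real charset is known.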
