non-chunked multipart/mixed POST? - node.js

I'm trying to send multiple binary files to a web service in a single multipart/mixed POST but can't seem to get it to work... target server rejects it. One thing I noticed is Node is trying to do the encoding as chunked, which I don't want:
POST /SomeFiles HTTP/1.1
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary=123456789012345
Host: host.server.com
Connection: keep-alive
Transfer-Encoding: chunked
How do I get it to stop being chunked? The docs say chunked can be disabled by setting the Content-Length in the main request header, but I can't set the Content-Length since, among other reasons, I'm loading one file at a time -- and it shouldn't be required anyway, since it's multipart.
I think the rest is OK (excepting it's not chunked, unless the req_post.write() is doing that part), e.g. for the initial part I:
var req_post = https.request({
  hostname: 'host.server.com',
  path: '/ManyFiles',
  method: 'POST',
  headers: {
    'MIME-Version': '1.0',
    'Content-Type': 'multipart/mixed; boundary=' + boundary
  }
}, ...);
and then pseudo-code:
while ( true ) {
  // { get next file here }
  req_post.write('\r\n--' + boundary + '\r\nContent-Type: ' + filetype +
                 '\r\nContent-Length: ' + filesize + '\r\n\r\n');
  req_post.write(chunk); // filesize bytes
  if ( end ) {
    req_post.write('\r\n--' + boundary + '--\r\n');
    req_post.end();
    break;
  }
}
Any help/pointers are appreciated!

The quick answer is that you cannot disable chunked encoding without setting a Content-Length. Chunked encoding was introduced for cases where you do not know the size of the payload when you start transmitting. Originally, Content-Length was required, and the recipient knew it had the full message once it had received that many bytes. Chunked encoding removed that requirement by sending the body as a series of chunks, each prefixed with its own size, followed by a zero-size chunk to mark the end of the payload. So if you neither set Content-Length nor use chunked encoding, the recipient can never know when it has the full payload.
To help solve your problem, if you cannot send chunked and do not want to read all the files before sending, take a look at fs.stat. You can use it to get the file size without reading the file.

Related

What is happening in the response body of my http request?

I have a POST request that, when sent in Python through the requests or aiohttp library, responds as expected, but when the equivalent request is sent in Rust through the reqwest library, the response body is pure gibberish.
The request:
pub async fn get_token(client: &reqwest::Client, uri: String, headers: HeaderMap, body: serde_json::Value) {
    let user_name = env::var("USERNAME").unwrap();
    let password = env::var("PWD").unwrap();
    let resp = client.post(uri).headers(headers).json(&body).send().await;
    if let Ok(resp) = resp {
        println!("{}", resp.text().await.unwrap());
    }
}
Expected body of response:
{"access_token":"eyJhbGciOiJSUzI1NiIsInR5cCI6IkpXVCJ9.eyJ1c2VySWQiOiJ3aGhjaFFTRWF6Y2s1RmI5VyIsInJvbGVzIjp7Imdsb2JhbCI6WyJkcml2ZXIiXSwiZ3JvdXBzIjpbXX0sInNjb3BlIjpbImxvZ2luIl0sImlhdCI6MTY0ODYyNDkwOCwiZXhwIjoxNjQ4NjI4NTA4fQ.EQrLoG9cDQTXcbqBPZdfhN0cjOXRCeGz_cA8uTNF9kN4_rIVV4xcb67OwT8I03ch49V-BeA71qvbVDYdVqubNg5jxA6iSeTng-6aepGswyIaWYuDHx8KFUdaRWoZVh-WhIlDNSNIXkFbxnO4ggKy_Bf3nVJbIraWuitWWVcwjg8jbOy4cpjSkIjgiXUzMNL8_RWOIATvthplnw4MBsOpEsBsZkoYqfOjMmepojyGPE-FjrLYTWFZpB0PHV3OSv3mwZT-aAtI2yexZOSi6rz2TuBhPJVk93SfcXq-UeUPIlSrN7C6QI-6jVIzl9xFX1DKO0Uc8Fq3M-lvPnYkmY29G09h6Ltr9XPBRq9AZq-_r7yAH1lsWvWf1XhTwEOsFcACkH5Q5HxA4Ai50PegrHEhcBB9Cub9CPySMJ9oIewfj3cQURbRHAALbGXpiHBE7BU39QLUskuyzL4OGShaHliHQk1igyPRRHMdYeCGb1P39wB_Mq3nUzoH047QQ7KMGHb5azMPTLav7FsVmJhw7NOGIZtIyILz_07IcA_4XriokJuKUjBBOHuz82Ka7Vi6kthPsnDPplZ3i7TgQi8IptOWpm7IbhPAhTaSH5DuXFQfmtkWVNMcoVR6_Q8O1DNw9DVQmQwEycOA4SbbSYKmdVz0K8w81Kk7HGR6MGln4hEbrSk",
"refresh_token":"eyJhbGciOiJSUzI1NiIsInR5cCI6IkpXVCJ9.eyJ1c2VySWQiOiJ3aGhjaFFTRWF6Y2s1RmI5VyIsInNjb3BlIjpbInJlZnJlc2giXSwiaWF0IjoxNjQ4NjI0OTA4fQ.LDfwWeDPG11ZHNRo1Tks4LFY399VRKLI_cQnrxbGVAwlfCftOnIWw5-bsjXxMgnT6psL6D9EdKzBN8BNrw0eCis4U7EjVNUkGFegCmjPlO8gmdc_MRaHO-gN2TN43C-2jwbA9IojOw7UqVUtEzwp4px-OqnYNNIxqMnFOa5oe72v2ILJ5G3bD0J-0AQ-ly-Ce_lstDhObG-yVptIRcbt54OJ1Ou9PjPi9Y0OuVo6rwpQ1KKXdNc4GbQ_cOQKpQ6CJkDK-SjYByOFXC1pqD3aku4lxHpfx99K3RWFgmSeIN4VQLcQ7uBEWvBePEWUCEXSGiod1hK-gLAY0c-Io6NEygZWKIlACKoEphZLqyoKQ7vn-iMlN-8DGWX8Oh6gfT7ULhiG4U_JUheXxIzFvQBLfKtmXbKyagYEZ-y-Zl2nkAEZG2QzSm-cRAsTgquOvJvfujAtK3c5dHKoBL-0jIfnfWr4BHsptsWCc5J2SGtjjtTIG-Lh9d5mqgfN8TBVEK2R1JIeF0HxYPXcOO9CZZmqGqWp5YNZCJwppP2VJoVLSyzi3X5Cu0WWv7OpImmiR8H_M1JAn4XMyPMtyKFM5seewn8s6bOsZzELkKaPAFWbLlgeoJlDGA0CZtwFs2iJdd-UbS5C2dXUQw7yxVhRUIUq-pC1F3NVo51bRHgGzpcSYlE",
"token_type":"bearer",
"expires_in":3600}
The actual response:
�lMP�JX#z�ܦ}��>�}�CL>2�y#3%��`P�c:�L �#D�t�
���X#*vk\p�\�*�Y�~u�f������J<����|}b �M?^*&�d-u����2�!�hKU�1�`�dit�
�5W#�͛�Z1;F5��w�+��1.� DY#
���x?u]�äh(F��c�#���ů��Y{I�3XU�QN�+�pu�=��-X���+�5('�b9�бGz���l4��}����=Ȫ��FQm�����͂-��wiD� p�%S�>yq����d��/N�c2g���˛����kɋ%_�h5���9�]8�]���o�u� � `u�~R�o7_9��S��C��%LPj^��#����}{B�� "�_}�IGb��p�:9Bۤ7�ٌT�_|cJDْ��Q�l��2#S��Rܣ\۳�}T�C+꽨ʹ�O����ƝW�����=�`th�忿��&dU��zh�I��X��_�1���oο��Vdp�������P�#���E�
ǣ��3�L��x�¡�?�~������Z��Bk���
(��$h`0�r���$wr��W꜄է��c
���7�0�E-�b��I6Q�ac���V�
��F��7���o�ݭ9�4��j<�a�/L��&dUZ��8����Åba!�X��.�������`ˣ�'A'���/sP,�m�?~v/�綔FR��|
]��l1�}\(]��̃�'㠊җ�)�
��EG2#�4�H�gd�x��p�������JՏ���+ϼ�m#��o=���)j=��\�}���pB<Tb×r��N���K�q��){+��u�HK�,Pu�+��絍m>D�=��$��|y�y��T/����F���
The issue seems to be some kind of format or encoding/decoding problem, but I can't find anything relevant to something like this. Does anyone have an idea what the issue is?
EDIT
So I ended up being able to figure out what the issue was, and it was in the Accept-Encoding request header, whose value was gzip, deflate, br
I was trying to see if any of my configured request headers were the issue by just removing them one by one, and that header was certainly the issue.
The response headers now have the Vary header set to Accept-Encoding, so I'm not even sure which compression mechanism is being used to produce the correct response.
Can anyone explain why the gzip, deflate, br header works as expected in the Python request but not in the Rust one?

How do I send raw binary data via HTTP in node.js?

From a node.js back end, I need to send an HTTP message to a REST endpoint. The endpoint requires some parameters that it will expect to find in the HTTP message. Some of the parameters are simple enough, just requiring a number or a string as an argument. But one of the parameters is to be "the raw binary file content being uploaded" and this has puzzled me. As far as I understand, the parameters need to be gathered together into a string to put in the body of the HTTP request; How do I add raw binary data to a string? Obviously, for it to be in the string, it cannot be raw binary data; it needs to be encoded into characters.
The endpoint in question is the Twitter media upload API. The "raw binary data" parameter is called media. Below is an incomplete code snippet showing the basic gist of what I've tried. Specifically, the line where I build the requestBody string. I don't believe it is anywhere near correct, because the endpoint is returning a "bad request" message.
var https = require("https");

var base64ImageData = /* (some base 64 string here) */;

var options = {
  host: "api.twitter.com",
  path: "/1.1/media/upload.json",
  method: "POST",
  headers: {
    "Content-Type": "multipart/form-data"
  }
};

var request = https.request(options, function(response) {});

var requestBody = "media_id=18283918294&media=" +
    Buffer.from(base64ImageData, "base64").toString("binary");
request.write(requestBody);
request.end();
Also worth noting, Twitter themselves note the following extremely confusing statement:
"When posting base64 encoded images, be sure to set the “Content-Transfer-Encoding: base64” on the image part of the message."
Source: https://developer.twitter.com/en/docs/media/upload-media/uploading-media/media-best-practices
That might be part of the answer to my question, but what I don't understand is: How do I apply different headers to different parts of the HTTP message? Because apparently, the image data needs to have a Content-Transfer-Encoding header of "base64" while the rest of the HTTP message does not...
How do I apply different headers to different parts of the HTTP message?
This is the point of the multipart/form-data content type. A multi-part message looks like this:
Content-Type: multipart/form-data; boundary=foo

--foo
Content-Disposition: form-data; name="datafile1"; filename="r.gif"
Content-Transfer-Encoding: base64
Content-Type: image/gif

// data goes here
--foo
Content-Disposition: form-data; name="datafile2"; filename="g.png"
Content-Transfer-Encoding: base64
Content-Type: image/png

// another file's data goes here
--foo--
(Note that in the body each boundary line is the boundary string prefixed with "--", a blank line separates each part's headers from its data, and the closing delimiter has an extra "--" appended.)
You probably don't want to put all this together yourself. There are a bunch of good libraries for putting together complex POSTs. For example: https://www.npmjs.com/package/form-data
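If you do want to assemble the body yourself (for instance to compute a Content-Length up front), here is a minimal sketch that concatenates Buffers so binary data survives untouched; the boundary and field names are hypothetical:

```javascript
// Sketch: building a multipart/form-data body by hand with Buffers.
const boundary = 'foo';

function buildMultipartBody(fields) {
  const parts = fields.map(function (f) {
    // Headers for one part; filename and Content-Type are optional.
    const head =
      '--' + boundary + '\r\n' +
      'Content-Disposition: form-data; name="' + f.name + '"' +
      (f.filename ? '; filename="' + f.filename + '"' : '') + '\r\n' +
      (f.contentType ? 'Content-Type: ' + f.contentType + '\r\n' : '') +
      '\r\n';
    const data = Buffer.isBuffer(f.data) ? f.data : Buffer.from(String(f.data));
    return Buffer.concat([Buffer.from(head), data, Buffer.from('\r\n')]);
  });
  // Closing delimiter: the boundary with "--" on both sides.
  parts.push(Buffer.from('--' + boundary + '--\r\n'));
  return Buffer.concat(parts);
}
```

The resulting Buffer's .length is exactly the Content-Length to send, and the request's Content-Type header must include the same boundary parameter.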

How do I disable the extra number res.writeHead() causes before each output?

I created an HTTP server with Express.
Below is the server code:
router.get('/', function(req, res, next) {
  // req.socket.setTimeout(Infinity);
  res.writeHead(200, {
    'Content-Type': 'text/plain; charset=utf-8', // <- Important headers
    'Cache-Control': 'no-cache',
    'Connection': 'keep-alive',
  });
  res.write('hell');
  res.write('world');
  res.end();
  // res.write('\n\n');
  // response = res;
});
When I use netcat to GET the URL, the output is like this:
GET /sse HTTP/1.1
HTTP/1.1 200 OK
X-Powered-By: Express
Content-Type: text/plain; charset=utf-8
Cache-Control: no-cache
Expires: 0
Connection: keep-alive
Keep-Alive: timeout=5, max=97
Date: Fri, 30 Jun 2017 11:50:00 GMT
Transfer-Encoding: chunked
4
hell
5
world
0
My question is: why is there always a number before the output of every res.write()? The number seems to be the length of that output.
How can I remove the numbers?
This is how chunked encoding works. You don't have to declare up front how many bytes you are going to send; instead, every chunk is prepended with the number of bytes it contains. For example: 4 for "hell", 5 for "world", and finally 0 for "no more bytes to send".
You can see that you have the chunked encoding header present:
Transfer-Encoding: chunked
Now, to answer your question directly, to remove the numbers you'd have to switch off the chunked encoding. To do that you'd have to set the Content-length header to the number of bytes that you are going to send in the body of the response.
For example:
app.get('/', function(req, res, next) {
  res.writeHead(200, {
    'Content-Type': 'text/plain; charset=utf-8',
    'Cache-Control': 'no-cache',
    'Connection': 'keep-alive',
    'Content-length': 9, // <<<--- NOTE HERE
  });
  res.write('hell');
  res.write('world');
  res.end();
});
(Of course in the real code you'd have to either calculate the length of the response or build up a string or buffer and get its length just before you set the Content-length header.)
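For instance, a small sketch of the "build the string first" approach; note that Buffer.byteLength (not the string's .length) is what Content-Length must match once non-ASCII characters are involved:

```javascript
// Sketch: compute Content-Length from the byte length of the body,
// not its character count (they differ for multi-byte UTF-8 text).
function withContentLength(chunks) {
  const body = chunks.join('');
  return {
    headers: { 'Content-Length': Buffer.byteLength(body, 'utf8') },
    body: body,
  };
}

// 'hell' + 'world' is 9 bytes, matching the hard-coded header above;
// a string like 'süß' has 3 characters but 5 bytes in UTF-8.
```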
But note that if you use curl or any other HTTP client then it will "remove" the numbers for you. It's just that with netcat you accidentally saw the underlying implementation detail of the chunked encoding but all of the real HTTP clients can handle that just fine.
Normally in HTTP you declare the length of the entire response in the headers and then send the body - in one piece, in multiple chunks, whatever - but it has to have the same length as what you declared. It means that you cannot start sending data before you know the length of everything that you want to send. With chunked encoding it's enough to declare the length of each chunk that you send but you don't have to know the length of the entire response - which could even be infinite, it's up to you. It lets you start sending as soon as you have anything to send which is very useful.

Chunked encoding, streams and content-length

I need to upload a gzipped file. For performance, in case my string gets too big, I decided to use streams, but I ran into an issue: the server requires a Content-Length header, which cannot be calculated because the gzipping happens inline. I then switched to chunked transfer, but I'm not sure whether I'm doing it incorrectly or whether the server simply won't accept streams/chunks, as it still returns an error about needing a Content-Length header.
Here's the bit of the code:
const gzip = zlib.createGzip()
let stream = createStream(string) // I also use files, hence the streaming
  .pipe(gzip)
  .pipe(request.put(url, {
    headers: {
      'Transfer-Encoding': 'chunked',
      'x-ms-blob-type': 'blockblob'
    }
  }))
Response:
Content-Length HTTP header is missing
I've also played around with adding other headers such as:
'Content-Type': 'application/javascript'
'Content-Encoding': 'gzip'
Is my only option to forgo streaming, or to take gzip out of the flow and calculate the length that way? I can't tell if I am missing something or if the server is being persnickety.

Umlauts broken when doing get request

I'm trying to query a webservice which answers with plain text. The text often contains German umlauts. In the received stream the umlauts are broken. Any ideas what I am doing wrong?
Regards,
Torsten
Here is the sample code:
var request = require('request');
var uri = <anUriWithUserId>;
request(uri, {encoding: 'utf8', 'content-type': 'text/plain; charset=UTF-8'},
  function (error, response, body)
  {
    console.log("encoding: " + response.headers['content-encoding']);
    console.log("type: " + response.headers['content-type']);
    console.log(body);
  });
And the response:
encoding: undefined
type: text/plain
error=0
---
asin=
name=Eistee
detailname=Pfanner Der Gr�ne Tee, Zitrone - Kaktusfeige, 2,0 l
vendor=Hermann Pfanner Getr�nke GmbH, Lauterach, �sterreich
maincat=Getr�nke, Alkohol
When you set the encoding option in your request call, you advise the request module to decode the response body with this encoding. In this way you ignore the encoding actually used by the webservice, which may or may not be UTF-8. You need to find out which encoding the webservice used and use that.
Depending on how compliant the webservice is, you could also try setting the Accept-Charset: utf-8 header.
As your output shows, the webservice doesn't declare the encoding in its Content-Type header, which is a bad habit imho.
Sidenote: Content-Encoding isn't for the charset but for compression; gzip might be a valid value for it.
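A small sketch of the decoding mismatch, assuming the service actually sends ISO-8859-1 (Latin-1), a common cause of broken German umlauts; with the request library you could pass encoding: null to get a raw Buffer and decode it yourself:

```javascript
// "Grüne" encoded as ISO-8859-1: 0xFC is the single Latin-1 byte for "ü".
const raw = Buffer.from([0x47, 0x72, 0xfc, 0x6e, 0x65]);

// Decoding those bytes as UTF-8 fails, because a lone 0xFC is not valid
// UTF-8; Node substitutes the replacement character U+FFFD.
const broken = raw.toString('utf8');

// Decoding as latin1 recovers the umlaut: "Grüne".
const fixed = raw.toString('latin1');
```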