Node.js partial data just from Firefox

I have a server running on Node.js, with the following piece of code to handle a POST request:
form.on('file', function (field, file) {
    var RecordingInfo = JSON.parse(file.name);
    ...
When I tried to upload a file I got the following exception:
undefined:1
"}
SyntaxError: Unexpected end of input
at Object.parse (native)
at IncomingForm.<anonymous> (.../root.js:31:34)
...
Searching around the web, I found that this exception occurs because the data arrives in chunks, and the event fires after the first chunk arrives, before all the data is available. OK. The thing is, after a little testing I found that from Chrome I can upload large files (I tried a 1.75 GB file) without any problem, while Firefox crashes the server with a 6 KB file.
My question is - why are they different?
A sample capture can be downloaded from here. The first post is from Chrome, the second from Firefox.
The complete file.name string before uploading is:
// chrome
"{"subject":"flksajfd","lecturer":"אבישי וינר","path":"/גמרא","fileType":".png"}"
// firefox
"{"subject":"fdsa","lecturer":"אלקס ציקין","path":"/גמרא","fileType":".jpg"}"
(The data submitted is not the same, but I don't think it matters)

Chrome is encoding double-quotes in the JSON-encoded "filename" as %22 while Firefox is encoding them as \".
Your file upload parsing library, Formidable, explicitly truncates the filename from the last \ character. It expects double-quotes to be encoded as %22 although RFC 2616 allows backslash-escaped quotes like Firefox has implemented. You can consider this a bug in Formidable. The result is that the following JSON string:
'{"subject":"fdsa",...,"fileType":".jpg"}'
...is encoded as follows:
'{%22subject%22:%22fdsa%22,...,%22fileType%22:%22.jpg%22}' // Chrome
'{\"subject\":\"fdsa\",...\"fileType\":\".jpg\"}' // Firefox
...and then decoded by Formidable:
'{"subject":"fdsa",..."fileType":".jpg"}' // Chrome
'"}' // Firefox
To fix the issue you have a few choices:
Raise the issue with Formidable so that it correctly handles backslash-escaped quoted-value strings (or fix it yourself and submit a pull request).
Send the JSON payload in a separate part of the FormData object, e.g. using a Blob or a plain field (see the sketch after the code below).
Transliterate all double-quote characters in your JSON-format filename to a 'safe' character that will not appear elsewhere in the string (I chose ^ as an example); replace the quote client-side and reinstate it server-side as follows.
Client:
var formData = new FormData();
formData.append('file', $scope.recording, JSON.stringify(RecordingInfo).replace(/"/g, '^'));
Server
form.on('file', function (field, file) {
    var RecordingInfo = JSON.parse(file.name.replace(/\^/g, '"'));
    ...
});

Related

Malformed XML response with manual newline characters

I am using a GET request to fetch what I expect to be an XML document from an endpoint. The response has the following structure:
' <itunes:explicit>clean</itunes:explicit>\n' +
' <itunes:episode>11</itunes:episode>\n' +
' <itunes:episodeType>full</itunes:episodeType>\n' +
(This is from a console log in a Node.js function).
I haven't encountered a response like this before and am having trouble doing anything useful with it. I've tried:
Changing the response type and encoding of my GET function
Parsing the response with an XML parser - this throws an error
Removing the newline and + characters manually with regex (I'd like to avoid this, but it doesn't seem to work anyway)
It's worth saying that the response looks as you'd expect in a browser window.
Am I missing something fundamental about how this data is encoded / structured and what is the best way to turn it into something I can work with?
Rookie error. In case anyone else stumbles across this: I expected the response from my Axios GET request to be the XML itself. The XML was actually in the data property of the response object:
const response = await axios.get(url);
const myXML = response.data;

Handling UTF8 characters in express route parameters

I'm having an issue with a Node.js REST API created using Express.
I have two calls, a GET and a POST, set up like this:
router.get('/:id', (request, response) => {
    console.log(request.params.id);
});
router.post('/:id', (request, response) => {
    console.log(request.params.id);
});
Now, I want the ID to be able to contain special characters (UTF-8).
The problem is, when I use Postman to test the requests, it looks like they are encoded very differently:
GET http://localhost:3000/api/â outputs â
POST http://localhost:3000/api/â outputs Ã¢
Does anyone have any idea what I am missing here?
I must mention that the post call also contains a file upload so the content type will be multipart/form-data
You should encode your URL on the client and decode it on the server. See the following articles:
What is the proper way to URL encode Unicode characters?
Can urls have UTF-8 characters?
Which characters make a URL invalid?
For JavaScript, encodeURI may come in handy.
It looks like Postman does UTF-8 encoding but NOT proper URL encoding. Consequently, what you type in the request URL box translates to something different than what would happen if you typed that URL in a browser.
I'm requesting GET localhost/ä, but it goes out on the wire as the raw localhost/ä.
(This is now an invalid URL, because it contains non-ASCII characters.)
But when I type localhost/ä into Google Chrome, it correctly encodes the request as localhost/%C3%A4.
So you could try manually URL-encoding your request to http://localhost:3000/api/%C3%A2.
In my opinion this is a bug (perhaps a regression). I am using the latest version of Postman, v7.11.0, on macOS.
Does anyone have any idea what I am missing here?
Yeah, it doesn't output Ã¢, it outputs â; but whatever you're checking the result with thinks it's reading something else (ISO-8859-1 maybe?), not UTF-8, and renders it as Ã¢.
Most likely, you're viewing the result in a web browser, and the web server is sending the wrong Content-Type header. Try sending Content-Type: text/plain; charset=utf-8 or Content-Type: text/html; charset=utf-8; then your browser should render your â correctly.

Router with base64 generated param

I am trying to send a parameter after converting it to base64. The definition of the Geddy.js route:
router.get(routing_prefix+'/gogetthecart/:data').to('Main.gogetthecart');
On the client side, in JavaScript, I generate base64 JSON data with var jsonb64 = btoa(JSON.stringify(params)); after that I call a URL that will look something like this:
http://www.mydomain.com//gogetthecart/GVudGl...aWNo=
I get Error: 404 Not Found. But if I manually delete the = from the end of the data, it works.
Solved by the community in the GitHub repo's issues (https://github.com/geddy/geddy/issues/556), as Kieran said:
I looked into adding support for base64 encoded vars to Barista
directly, but some characters in the b64 spec are reserved in the URI
spec. I don't feel comfortable making that the default behaviour.
However! You can simply override the behaviour to support this use
case:
router
    .get(routing_prefix + '/gogetthecart/:data')
    .to('Main.gogetthecart')
    .where({
        data: /[\w\-\/+]+={0,2}/ // base64-safe regex condition
    });
and that should do the trick!
I added a test here:
https://github.com/kieran/barista/blob/master/tests/barista.test.js#L812

Why are the last 2 chars of my NodeJS shred POST request getting stripped?

I am having a problem submitting a UTF-8 encoded text file in a POST request using the Node.js "shred" library.
The text content I am trying to post looks fine on the client side; I know because I console.log it to the screen just before calling client.post. What the server gets is the content of the text file, but the last 2 chars are always missing/chopped. This is not a problem with ANSI text files: if I convert the text file from UTF-8 to ANSI, it arrives complete at the server.
var Shred = require('shred');
var fs = require('fs');

var client = new Shred();
var textToPost = fs.readFileSync("myfile.txt", 'utf8');
console.log(textToPost);
client.post({
    url: "http://www.example.com/readTextFile.php",
    headers: { 'Content-Type': 'application/x-subrip' },
    content: textToPost,
    on: {
        200: function (response) {
            console.log("posted ok");
            console.log(response.content.body);
        },
        500: function (response) {
            asyncCb(new Error('bad response\n' + response.content.body));
        }
    }
});
What is received on the server (by readTextFile.php) is the contents of myfile.txt with the last 2 chars stripped out. I cannot understand why. This has big downstream implications, so any patchy workarounds are not likely to help.
I also noticed that when the contents of textToPost are logged to the console, there is a "?" preceding the contents. This doesn't appear when the file is an ANSI encoded file.
Please help.. thank you
OK, after the comments above (thanks rdrey and JohnnyHK), I decided to try stripping the BOM out of the file(s). So I used a hex editor, deleted the EF BB BF chars and saved; this time the file arrived at the server fully intact, with no chars missing at the end. Now I'll modify my Node.js code to strip the chars out as well. This doesn't completely answer my question (why is there an issue with the BOM?). Perhaps shred has a problem posting text files with a BOM; maybe it incorrectly reads it, decides the file is smaller than it actually is and, as a result, chops the end off. I am not sure.

Node.js POST File to Server

I am trying to write an app that will allow my users to upload files to my Google Cloud Storage account. In order to prevent overwrites and to do some custom handling and logging on my side, I'm using a Node.js server as a middleman for the upload. So the process is:
User uploads file to Node.js Server
Node.js server parses file, checks file type, stores some data in DB
Node.js server uploads file to GCS
Node.js server responds to the user's request with a pass/fail remark
I'm getting a little lost on step 3, of exactly how to send that file to GCS. This question gives some helpful insight, as well as a nice example, but I'm still confused.
I understand that I can open a ReadStream for the temporary upload file and pipe that to the http.request() object. What I'm confused about is how do I signify in my POST request that the piped data is the file variable. According to the GCS API Docs, there needs to be a file variable, and it needs to be the last one.
So, how do I specify a POST variable name for the piped data?
Bonus points if you can tell me how to pipe it directly from my user's upload, rather than storing it in a temporary file
I believe that if you want to do POST, you have to use a Content-Type: multipart/form-data; boundary=myboundary header. And then, in the body, write() something like this for each string field (line breaks should be \r\n):
--myboundary
Content-Disposition: form-data; name="field_name"
field_value
And then for the file itself, write() something like this to the body:
--myboundary
Content-Disposition: form-data; name="file"; filename="urlencoded_filename.jpg"
Content-Type: image/jpeg
Content-Transfer-Encoding: binary
binary_file_data
The binary_file_data is where you use pipe():
var fileStream = fs.createReadStream("path/to/my/file.jpg");
fileStream.pipe(requestToGoogle, {end: false});
fileStream.on('end', function() {
    requestToGoogle.end("--myboundary--\r\n\r\n");
});
The {end: false} prevents pipe() from automatically closing the request because you need to write one more boundary after you're finished sending the file. Note the extra -- on the end of the boundary.
The big gotcha is that Google may require a content-length header (very likely). If that is the case, then you cannot stream a POST from your user to a POST to Google, because you won't reliably know what the content-length is until you've received the entire file.
The content-length header's value should be a single number for the entire body. The simple way to do this is to call Buffer.byteLength(body) on the entire body, but that gets ugly quickly if you have large files, and it also kills the streaming. An alternative would be to calculate it like so:
var body_before_file = "..."; // string fields + boundary and metadata for the file
var body_after_file = "--myboundary--\r\n\r\n";
var fs = require('fs');
fs.stat(local_path_to_file, function(err, file_info) {
    var content_length = Buffer.byteLength(body_before_file) +
                         file_info.size +
                         Buffer.byteLength(body_after_file);
    // create request to google, write content-length and other headers,
    // write() the body_before_file part,
    // and then pipe the file and end the request like we did above
});
But that still kills your ability to stream from the user to Google; the file has to be downloaded to the local disk to determine its length.
Alternate option
...now, after going through all of that, PUT might be your friend here. According to https://developers.google.com/storage/docs/reference-methods#putobject you can use a Transfer-Encoding: chunked header, so you don't need to find the file's length. And I believe that the entire body of the request is just the file, so you can use pipe() and just let it end the request when it's done. If you're using https://github.com/felixge/node-formidable to handle uploads, then you can do something like this:
incomingForm.onPart = function(part) {
    if (part.filename) {
        var req = ... // create a PUT request to google and set the headers
        part.pipe(req);
    } else {
        // let formidable handle all non-file parts
        incomingForm.handlePart(part);
    }
};
