Why does the request pipe include the headers in this code? - node.js

I have a strange situation involving an http server and piping the request.
From my past experience, when piping the request object of an http server to a writable stream of some sort, it does not include the headers, just the payload.
Today, however, I wrote some very simple code, and for some reason I've spent the past 2 hours trying to figure out why it writes the headers to the file (super confusing!).
Here's my code:
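const http = require('http')
const fs = require('fs')
const port = 8080 // matches the curl command below
const server = http.createServer((req, res) => {
  const f = '/tmp/dest'
  console.log(`writing to ${f}`)
  const s = fs.createWriteStream(f)
  req.pipe(s)
  req.on('end', () => {
    res.end("done")
  })
})
server.listen(port)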
I test this with the following curl command:
curl -XPOST -F 'data=@test.txt' localhost:8080
And this is what I'm getting when I'm reading /tmp/dest:
--------------------------993d19e02b7578ff
Content-Disposition: form-data; name="data"; filename="test.txt"
Content-Type: text/plain
hello - this is some text
--------------------------993d19e02b7578ff--
Why am I seeing the headers here? I expected it to write only the payload.
I have code I wrote about a year ago that streams directly to a file without the headers. I don't understand what's different, but this one did the trick:
imageRouter.post('/upload', async(req, res) => {
  if(!req.is("image/*")) {
    let errorMessage = `the /upload destination was hit, but content-type is ${req.get("Content-Type")}`;
    console.log(errorMessage);
    res.status(415).send(errorMessage);
    return;
  }
  let imageType = req.get("Content-Type").split('/')[1];
  let [ err, writeStream ] = await getWritableStream({ suffix: imageType });
  if (err) {
    console.log("error while trying to write", err);
    return res.status(500).end();
  }
  let imageName = writeStream.getID();
  req.on('end', () => {
    req.unpipe();
    writeStream.close();
    res.json({
      imageRelativeLink: `/images/${imageName}`,
      imageFullLink: `${self_hostname}/images/${imageName}`
    });
  });
  req.pipe(writeStream);
});
What's different? Why does my code from a year ago (the last block) write the file without the form-data/headers? There the resulting file is only an image, without text, but this time (the first block) the resulting file contains what look like HTTP headers.

Instead of using pipe, try using on('data') and referring to req.data to pull off the contents. This will allow the http library to process the HTTP body format and handle the "headers" (really: form part descriptors) for you.
Node Streaming Consumer API
const server = http.createServer((req, res) => {
  const f = '/tmp/dest'
  console.log(`writing to ${f}`)
  const s = fs.createWriteStream(f)
  req.on('data', (chunk) => {
    s.write(chunk);
  })
  req.on('end', () => {
    s.close();
    res.end("done")
  })
})
server.listen(port)

As it turns out, I had a mistake in my understanding, and therefore made a mistake in my question.
What I thought were the headers were actually part of the HTTP multipart format. This is how curl uploads a file when used with the -F syntax.
What I actually needed was to change the way I test my code with curl to one of the following:
cat /path/to/test/file | curl -T - localhost:8080
# or
curl -T - localhost:8080 < /path/to/test/file
# or
curl -T /path/to/test/file localhost:8080
Using the -T (or --upload-file) flag, curl uploads the file (or stdin) without wrapping it in an http form.
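To make the difference visible on the server side, here is a minimal sketch (not part of the original posts) that looks at the Content-Type header before piping. The names isMultipart and createUploadServer are hypothetical, and answering a form upload with 415 is only one possible reaction.

```javascript
const http = require('http');
const fs = require('fs');

// True when the Content-Type header announces a multipart body, which is
// what `curl -F` sends (boundary lines and part headers included).
function isMultipart(contentType) {
  return typeof contentType === 'string' &&
    contentType.trim().toLowerCase().startsWith('multipart/');
}

// Hypothetical helper: builds a server that stores raw bodies at destPath
// and refuses multipart uploads instead of writing them verbatim.
function createUploadServer(destPath) {
  return http.createServer((req, res) => {
    if (isMultipart(req.headers['content-type'])) {
      req.resume(); // drain the body we are not going to store
      res.statusCode = 415;
      res.end('send the raw file, e.g. curl -T <file>\n');
      return;
    }
    const out = fs.createWriteStream(destPath);
    req.pipe(out);
    out.on('finish', () => res.end('done\n'));
    out.on('error', () => {
      res.statusCode = 500;
      res.end('write failed\n');
    });
  });
}

// Example: createUploadServer('/tmp/dest').listen(8080);
```

With this in place, a -F upload is answered with 415, while a -T upload ends up in the destination file byte for byte.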

Related

busboy wait for field data before accepting file upload

Is something like this even possible, or are there better ways to do this? Is what I'm doing even a good idea, or is this a bad approach?
What I want to do is upload a file to my nodejs server. Along with the file I want to send some metadata. The metadata will determine whether the file can be saved and the upload accepted, or whether it should be rejected with a 403 response.
I am using busboy and I am sending FormData from my client side.
The example below is very much simplified:
Here is a snippet of the client side code.
I am appending the file as well as the meta data to the form
const formData = new FormData();
formData.append('name', JSON.stringify({name: "John Doe"}));
formData.append('file', this.selectedFile, this.selectedFile.name);
Here is the nodejs side:
exports.Upload = async (req, res) => {
  try {
    var acceptUpload = false;
    const bb = busboy({ headers: req.headers });
    bb.on('field', (fieldname, val) => {
      //Verify data here before accepting file upload
      var data = JSON.parse(val);
      if (data.name === 'John Doe') {
        acceptUpload = true;
      } else {
        acceptUpload = false;
      }
    });
    bb.on('file', (fieldname, file, filename, encoding, mimetype) => {
      if (acceptUpload) {
        const saveTo = '/upload/file.txt'
        file.pipe(fs.createWriteStream(saveTo));
      } else {
        const response = {
          message: 'Not Authorized'
        }
        res.status(403).json(response);
      }
    });
    bb.on('finish', () => {
      const response = {
        message: 'Upload Successful'
      }
      res.status(200).json(response);
    });
    req.pipe(bb);
  } catch (error) {
    console.log(error)
    const response = {
      message: error.message
    }
    res.status(500).json(response);
  }
}
So basically, is it even possible for the 'field' event-handler to wait for the 'file' event handler? How could one verify some meta data before accepting a file upload?
How can I do validation of all data in the form data object, before accepting the file upload? Is this even possible, or are there other ways of uploading files with this kind of behaviour? I am considering even adding data to the request header, but this does not seem like the ideal solution.
Update
As I suspected, nothing is waiting. Whichever way I try, the upload has to complete first; only then is it rejected with a 403.
Another Update
I've tried the same thing with multer and got similar results. Even when I can do the validation, the file is completely uploaded from the client side. Only once the upload is complete is the request rejected. The file, however, never gets stored, even though it is uploaded in its entirety.
With busboy, nothing is written to the server if you do not execute the statement file.pipe(fs.createWriteStream(saveTo)).
You can prevent more data from even being uploaded to the server by executing the statement req.destroy() in the .on("field", ...) or the .on("file", ...) event handler, even after you have already evaluated some of the fields. Note however, that req.destroy() destroys not only the current HTTP request but the entire TCP connection, which might otherwise have been reused for subsequent HTTP requests. (This applies to HTTP/1.1, in HTTP/2 the relationship between connections and requests is different.)
At any rate, it has no effect on the current HTTP request if everything has already been uploaded. Therefore, whether this saves any network traffic depends on the size of the file. And if the decision whether to req.destroy() involves an asynchronous operation, such as a database lookup, then it may also come too late.
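The idea can be sketched as follows, under stated assumptions: bb is a busboy-like emitter whose 'field' and 'file' listeners receive the part name first and the value or stream second, and the function name gateUpload, the isAllowed callback and the saveTo path are hypothetical. It is an illustration of the approach, not busboy's documented recipe.

```javascript
const fs = require('fs');

// Wires the checks onto `bb`, assumed to emit 'field' (name, value) and
// 'file' (name, stream) in the order the parts arrive, and 'finish' once
// the whole body has been parsed.
function gateUpload(bb, req, res, isAllowed, saveTo) {
  let accepted = false;
  let rejected = false;

  function reject() {
    if (rejected) return;
    rejected = true;
    // Closes the connection: the upload stops, but no 403 reaches the client.
    req.destroy();
  }

  bb.on('field', (name, value) => {
    if (name !== 'name') return;
    try {
      accepted = Boolean(isAllowed(JSON.parse(value)));
    } catch {
      accepted = false; // malformed JSON counts as a rejection
    }
    if (!accepted) reject();
  });

  bb.on('file', (name, stream) => {
    if (!accepted) {
      stream.resume(); // discard the part instead of storing it
      reject();
      return;
    }
    stream.pipe(fs.createWriteStream(saveTo));
  });

  bb.on('finish', () => {
    if (rejected) return;
    res.statusCode = 200;
    res.end(JSON.stringify({ message: 'Upload Successful' }));
  });
}
```

This relies on the metadata field being appended to the FormData before the file, so that its 'field' event fires first. A call might look like gateUpload(busboy({ headers: req.headers }), req, res, (d) => d.name === 'John Doe', '/upload/file.txt'), followed by req.pipe(bb) on the same instance.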
Compare
> curl -v -F name=XXX -F file=@<small file> http://your.server
* We are completely uploaded and fine
* Empty reply from server
* Closing connection 0
curl: (52) Empty reply from server
with
> curl -v -F name=XXX -F file=@<large file> http://your.server
> Expect: 100-continue
< HTTP/1.1 100 Continue
* Send failure: Connection was reset
* Closing connection 0
curl: (55) Send failure: Connection was reset
Note that the client sets the Expect header before uploading a large file. You can use that fact in connection with a special request header name in order to block the upload completely:
http.createServer(app)
  .on("checkContinue", function(req, res) {
    if (req.headers["name"] === "John Doe") {
      res.writeContinue(); // sends HTTP/1.1 100 Continue
      app(req, res);
    } else {
      res.statusCode = 403;
      res.end("Not authorized");
    }
  })
  .listen(...);
But for small files, which are uploaded without the Expect request header, you still need to check the name header in the app itself.
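One way to keep both paths consistent is to route them through the same check. The following is a sketch only: isAuthorized and createGatedServer are hypothetical names, the name header and its expected value are taken from the example above, and app stands for any (req, res) handler such as an Express application.

```javascript
const http = require('http');

// Single place for the header check, shared by both code paths.
function isAuthorized(req) {
  return req.headers['name'] === 'John Doe';
}

// Hypothetical helper: wraps `app` so that requests with and without
// "Expect: 100-continue" go through the same authorization check.
function createGatedServer(app) {
  const guarded = (req, res) => {
    if (!isAuthorized(req)) {
      res.statusCode = 403;
      res.end('Not authorized');
      return;
    }
    app(req, res);
  };

  // Requests without an Expect header arrive as ordinary 'request' events.
  const server = http.createServer(guarded);

  // Requests with "Expect: 100-continue" arrive here instead; the body is
  // only invited with writeContinue() after the check has passed.
  server.on('checkContinue', (req, res) => {
    if (isAuthorized(req)) res.writeContinue();
    guarded(req, res);
  });

  return server;
}

// Example: createGatedServer(app).listen(8080);
```

For small uploads the body has usually been sent by the time the 403 is written, so the check saves no traffic there; it only guarantees that the application never handles an unauthorized request.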

Node.js request - print entire http request (raw) of a post

I am using the request library in Node.js to call Google's text-to-speech API. I would like to print out the request that is being sent, like in this python example.
Here is my code:
const request = require('request');
const headers = {headers: {'input': {'text':'I want to say this'}, 'voice':{ 'languageCode' : 'en-US'},'audioConfig':{'audioEncoding': 'MP3'}}}
request.post('https://texttospeech.googleapis.com/v1beta1/text:synthesize?key=API_KEY',headers, (error, res, body) => {
  if (error) {
    console.error(error)
    return
  }
  console.log(`statusCode: ${res.statusCode}`)
  console.log(body)
})
Simplest way to do this is to start a netcat server on any port:
$ nc -l -p 8080
and change the URL to point at localhost (over plain http, since netcat does not speak TLS):
http://localhost:8080/v1beta1/text:synthesize?key=API_KEY
Obviously, you won't be able to see the response, but the entire raw request will be available for you to inspect in the terminal where netcat is running.
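If netcat is not at hand, roughly the same trick can be sketched with Node's built-in net module. This is an illustration, not part of the original answer; createRawRequestDump is a hypothetical name and port 8080 is simply the one used above.

```javascript
const net = require('net');

// Builds a TCP server that hands every received chunk to `onChunk`
// (by default it prints the raw bytes to stdout) and never replies,
// much like `nc -l`.
function createRawRequestDump(onChunk = (chunk) => process.stdout.write(chunk)) {
  return net.createServer((socket) => {
    socket.on('data', onChunk);
    socket.on('error', () => {}); // clients may hang up abruptly
  });
}

// Example: createRawRequestDump().listen(8080);
```

As with netcat, the client never receives a response, so the request on the sending side will eventually fail or time out.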
This is documented here:
There are at least three ways to debug the operation of request:
1. Launch the node process like NODE_DEBUG=request node script.js (lib,request,otherlib works too).
2. Set require('request').debug = true at any time (this does the same thing as #1).
3. Use the request-debug module to view request and response headers and bodies.

Send an image as the body of a request, image received with a request from outside

Yeah, I kinda didn't know how to word the title well...
I have a node server which receives an image via a POST form. I then want to send this image to Microsoft vision and the equivalent Google service in order to gather information from both, do some stuff, and return a result to the user that has accessed my server.
My problem is: how do I send the actual data?
This is the actual code that takes care of that:
const microsofComputerVision = require("microsoft-computer-vision");
module.exports = function(req, res)
{
  var file;
  if(req.files)
  {
    file = req.files.file;
    // Everything went fine
    microsofComputerVision.analyzeImage(
    {
      "Ocp-Apim-Subscription-Key": vision_key,
      "content-type": "multipart/form-data",
      "body": file.data.toString(),
      "visual-features":"Tags, Faces",
      "request-origin":"westcentralus"
    }).then((result) =>
    {
      console.log("A");
      res.write(result);
      res.end();
    }).catch((err)=>
    {
      console.log(err);
      res.writeHead(400, {'Content-Type': 'application/json'});
      res.write(JSON.stringify({error: "The request must contain an image"}));
      res.end();
    });
  }
  else
  {
    // the body is JSON, so the content type should say so
    res.writeHead(400, {'Content-Type': 'application/json'});
    res.write(JSON.stringify({error: "The request must contain an image"}));
    res.end();
  }
}
If instead of calling "analyzeImage" I do the following:
res.set('Content-Type', 'image/jpg')
res.send(file.data);
res.end();
The browser renders the image correctly, which made me think "file.data" contains the actual file (given that it's of type Buffer).
But apparently Microsoft does not agree with that, because when I send the request to computer vision I get the following response:
"InvalidImageFormat"
The only examples I found are here, and the "data" that is used in that example comes from a file system read, not straight from a request. But saving the file just to load it and then delete it looks like a horrible workaround to me, so I'd rather know in what form I should send the "file" that I have, and how, so that the API call works.
Edit: if I use file.data (which I thought was the most correct, since it would send the raw image as the body) I get an error which says that I must use a string or a buffer as content. So apparently file.data is not a buffer in the way "body" requires O.o Honestly, I don't understand.
Solved, the error was quite stupid. In the "then" part, res.write(result) did not accept result as an argument. This happened when I actually used the correct request (file.data, which is a buffer). The other errors occurred every time I tried using toString() on file.data; in that case the request wasn't accepted.
Solved: the request asked for a buffer, and file.data is indeed a buffer. After checking the type of file.data in every possible way, I started looking for other problems. The error was much simpler and, forgive my being stupid, too stupid to be evident. The result was a JSON object, and res.write doesn't accept an object as its argument.
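A minimal sketch of the fix, assuming res is a plain http.ServerResponse or Express's wrapper around it: serialize the result before writing it, since res.write and res.end expect a string or a Buffer rather than an object. The helper name sendJson is hypothetical.

```javascript
// Serializes `payload` and writes it with matching JSON headers.
function sendJson(res, statusCode, payload) {
  const body = JSON.stringify(payload);
  res.writeHead(statusCode, {
    'Content-Type': 'application/json',
    'Content-Length': Buffer.byteLength(body),
  });
  res.end(body);
}
```

In the handler above, the "then" branch would then become .then((result) => sendJson(res, 200, result)).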
This is how I did it with the Amazon Rekognition image classifier. I know it's not the same service you're using, but I hope this helps a little:
const imagePath = `./bat.jpg`;
const bitmap = fs.readFileSync(imagePath);
const params = {
  Image: { Bytes: bitmap },
  MaxLabels: 10,
  MinConfidence: 50.0
};
route.post('/', upload.single('image'), (req, res) => {
  let params = getImage();
  rekognition.detectLabels(params, function(err, data) {
    if (err) {
      console.log('error');
    } else {
      console.log(data);
      res.json(data);
    }
  });
});

hitting a multipart url in nodejs

I have client code using the form-data module to hit a URL that returns a content-type of image/jpeg. Below is my code:
var FormData = require('form-data');
var fs = require('fs');
var form = new FormData();
//form.append('POLICE', "hello");
//form.append('PAYSLIP', fs.createReadStream("./Desert.jpg"));
console.log(form);
//https://fbcdn-profile-a.akamaihd.net/hprofile-ak-xfp1/v/t1.0-1/c8.0.50.50/p50x50/10934065_1389946604648669_2362155902065290483_n.jpg?oh=13640f19512fc3686063a4703494c6c1&oe=55ADC7C8&__gda__=1436921313_bf58cbf91270adcd7b29241838f7d01a
form.submit({
  protocol: 'https:',
  host: 'fbcdn-profile-a.akamaihd.net',
  path: '/hprofile-ak-xfp1/v/t1.0-1/c8.0.50.50/p50x50/10934065_1389946604648669_2362155902065290483_n.jpg?oh=13640f19512fc3686063a3494c6c1&oe=55ADCC8&__gda__=1436921313_bf58cbf91270adcd7b2924183',
  method: 'get'
}, function (err, res) {
  var data = "";
  res.on("data", function (chunks) {
    data += chunks;
  });
  res.on("end", function () {
    console.log(data);
    console.log("Response Headers - " + JSON.stringify(res.headers));
  });
});
I'm getting some chunk data, and the response headers I received were:
{"last-modified":"Thu, 12 Feb 2015 09:49:26 GMT","content-type":"image/jpeg","timing-allow-origin":"*","access-control-allow-origin":"*","content-length":"1443","cache-control":"no-transform, max-age=1209600","expires":"Thu, 30 Apr 2015 07:05:31 GMT","date":"Thu, 16 Apr 2015 07:05:31 GMT","connection":"keep-alive"}
I am now stuck on how to turn the response I received into a proper image. I tried base64 decoding, but that seemed to be the wrong approach. Any help will be much appreciated.
I expect that data, once the file has been completely downloaded, contains a Buffer.
If that is the case, you should write the buffer as is, without any decoding, to a file:
fs.writeFile('path/to/file.jpg', data, function onFinished (err) {
  // Handle possible error
})
See fs.writeFile() documentation - you will see that it accepts either a string or a buffer as data input.
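Note that in the question's code data starts out as a string, and data += chunks converts every chunk to text, which is lossy for binary content such as a JPEG. A minimal sketch of collecting the chunks as a Buffer instead, assuming no encoding has been set on the stream (collectBody is a hypothetical name):

```javascript
// Collects every chunk of `readable` and passes one Buffer to `callback`.
// Assumes the chunks are Buffers, i.e. setEncoding() was never called.
function collectBody(readable, callback) {
  const chunks = [];
  readable.on('data', (chunk) => chunks.push(chunk));
  readable.on('error', callback);
  readable.on('end', () => callback(null, Buffer.concat(chunks)));
}
```

Inside the form.submit callback this could be used as collectBody(res, (err, data) => { ... }), after which data can be handed to fs.writeFile unchanged.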
Extra awesomeness by using streams
Since the res object is a readable stream, you can simply pipe the data directly to a file. This has the added benefit that if you download a really large file, Node.js will not have to keep the whole file in memory (as it does now), but will write it to the filesystem continuously as it arrives.
form.submit({
  // ...
}, function (err, res) {
  // res is a readable stream, so let's pipe it to the filesystem
  var file = fs.createWriteStream('path/to/file.jpg')
  file.on('finish', function writeDone () {
    // All data has been flushed to the file
  })
  res.pipe(file) // Send the incoming file to the filesystem
})
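For error handling on both ends of the pipe, Node's stream.pipeline can be used in place of a bare .pipe() call. The following is a sketch rather than part of the original answer; saveStream is a hypothetical name.

```javascript
const fs = require('fs');
const { pipeline } = require('stream');

// Streams `readable` into the file at `destPath`. `callback` receives an
// error if either side fails, and no error once the file is fully written.
function saveStream(readable, destPath, callback) {
  pipeline(readable, fs.createWriteStream(destPath), callback);
}
```

Inside the form.submit callback this would read saveStream(res, 'path/to/file.jpg', function (err) { ... }).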
The chunk you got is the raw image. Do whatever it is you want with the image, save it to disk, let the user download it, whatever.
So if I understand your question clearly, you want to download a file from an HTTP endpoint and save it to your computer, right? If so, you should look into using the request module instead of using form-data.
Here's a contrived example for downloading things using request:
var fs = require('fs');
var request = require('request')
request('http://www.example.com/picture.jpg')
  .pipe(fs.createWriteStream('picture.jpg'))
Where 'picture.jpg' is the location to save to disk. You can open it up using a normal file browser.

nodejs gm content-length implementation hangs browser

I've written a simple image manipulation service that uses node gm on an image from an http response stream. If I use nodejs' default transfer-encoding: chunked, things work just fine. But, as soon as I try and add the content-length implementation, nodejs hangs the response or I get content-length mismatch errors.
Here's the gist of the code in question (variable declarations have been omitted for brevity):
var image = gm(response);
// gm getter used to get origin properties of image
image.identify({bufferStream: true}, function(error, value){
  this.setFormat(imageFormat)
    .compress(compression)
    .resize(width,height);
  // instead of default transfer-encoding: chunked, calculate content-length
  this.toBuffer(function(err, buffer){
    console.log(buffer.length);
    res.setHeader('Content-Length', buffer.length);
    gm(buffer).stream(function (stError, stdout, stderr){
      stdout.pipe(res);
    });
  });
});
This will spit out the desired image and a content length that looks right, but the browser will hang suggesting that there's a bit of a mismatch or something else wrong. I'm using node gm 1.9.0.
I've seen similar posts on nodejs gm content-length implementation, but I haven't seen anyone post this exact problem yet.
Thanks in advance.
I ended up changing my approach. Instead of using this.toBuffer(), I save the new file to disk using this.write(fileName, callback), then read it with fs.createReadStream(fileName) and pipe it to the response. Something like:
var filePath = './output/' + req.param('id') +'.' + imageFormat;
this.write(filePath, function (writeErr) {
  var stat = fs.statSync(filePath);
  res.writeHead(200, {
    'Content-Type': 'image/' + imageFormat,
    'Content-Length': stat.size
  });
  var readStream = fs.createReadStream(filePath);
  readStream.pipe(res);
  // async delete the file from filesystem
  ...
});
You end up getting all of the headers you need including your new content-length to return to the client.
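An alternative that avoids the temporary file can be sketched under one assumption: that the Buffer handed to the toBuffer callback is already the final encoded image. In that case the Buffer itself can be sent, rather than piped through gm a second time, so the declared Content-Length and the bytes written come from the same object. The helper name sendImageBuffer is hypothetical, and this has not been verified against node gm 1.9.0.

```javascript
// Writes `buffer` as the complete response body with a matching
// Content-Length header.
function sendImageBuffer(res, buffer, imageFormat) {
  res.writeHead(200, {
    'Content-Type': 'image/' + imageFormat,
    'Content-Length': buffer.length,
  });
  res.end(buffer);
}
```

Inside the toBuffer callback this would replace the setHeader call and the second gm(buffer).stream(...) round trip with sendImageBuffer(res, buffer, imageFormat).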

Resources