I would like to download a file, write it to a temporary file, read it, and pass the Buffer from readFileSync to a function. I tried this:
var file = fs.createWriteStream("temp.pdf");
var request = http.get(linkArray[1], function (response) {
    response.on('data', function (data) {
        file.write(data);
    }).on('end', function () {
        postData(fs.readFileSync('temp.pdf'));
    });
});
Sometimes it works, but sometimes it doesn't; my guess is that the file isn't completely written when it is read. (But then the 'end' event shouldn't have fired yet?!)
As you can see, I would like to download a bunch of files and do this. Do you have any advice on how to solve this? Maybe this isn't the best way to do it...
You shouldn't link streams with on('data'); you should use pipe(). Pipe links the source stream's data events to writes on the destination, and its end event to the destination's end.
var file = fs.createWriteStream("temp.pdf");
var request = http.get(linkArray[1], function (response) {
    response.pipe(file).on('close', function () {
        postData(fs.readFileSync('temp.pdf'));
    });
});
You should also consider using https://github.com/mikeal/request:
var request = require('request');
request.get(linkArray[i], function (err, response, body) {
    postData(body);
});
or
var request = require('request');
var file = fs.createWriteStream("temp.pdf");
request.get(linkArray[i]).pipe(file).on('close', function () {
    postData(fs.readFileSync('temp.pdf'));
});
You need to call file.end() in your .on('end', ...) handler. The end() method is itself asynchronous, though, so you'll want to read the file once it completes. E.g.:
var file = fs.createWriteStream("temp.pdf");
var request = http.get(linkArray[1], function (response) {
    response.on('data', function (data) {
        file.write(data);
    }).on('end', function () {
        file.end(function () {
            postData(fs.readFileSync('temp.pdf'));
        });
    });
});
Related
I have a GET request in Node that successfully receives data from an API.
When I pipe that response directly to a file like this, it works: the created file is a valid, readable PDF (as I expect to receive from the API).
var http = require('request');
var fs = require('fs');

http.get({
    url: '',
    headers: {}
})
.pipe(fs.createWriteStream('./report.pdf'));
Simple. However, the file gets corrupted if I use the request's event emitters like this:
http.get({
    url: '',
    headers: {}
})
.on('error', function (err) {
    console.log(err);
})
.on('data', function (data) {
    file += data;
})
.on('end', function () {
    var stream = fs.createWriteStream('./report.pdf');
    stream.write(file, function () {
        stream.end();
    });
});
I have tried all manner of writing this file, and it always ends in a totally blank PDF; the only time the PDF is valid is via the pipe method.
When I console.log the events, the sequence seems correct, i.e. all chunks are received and then end fires last.
This makes it impossible to do anything after the pipe. What is pipe doing differently from the write stream?
I assume that you initialize file as a string:
var file = '';
Then, in your data handler, you add the new chunk of data to it:
file += data;
However, this performs an implicit conversion to (UTF-8-encoded) strings. If the data is actually binary, like with a PDF, this will invalidate the output data.
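A quick standalone snippet (not from the question) makes the corruption visible; 0xFF is not valid UTF-8, so the implicit toString() substitutes a replacement character and the original byte is gone for good:
var buf = Buffer.from([0xff, 0xd8]); // bytes that are not valid UTF-8
var asString = '' + buf;             // implicit buf.toString('utf8')
console.log(Buffer.from(asString));  // <Buffer ef bf bd ef bf bd>: replacement characters, data lost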
Instead, you want to collect the data chunks, which are Buffer instances, and use Buffer.concat() to concatenate all those buffers into one large (binary) buffer:
var file = [];
...
.on('data', function (data) {
    file.push(data);
})
.on('end', function () {
    file = Buffer.concat(file);
    ...
});
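For completeness, a minimal end-to-end sketch of that approach (the URL and output path here are placeholders, not the question's actual values):
var request = require('request');
var fs = require('fs');

var chunks = [];
request.get({ url: 'https://example.com/report', headers: {} })
    .on('error', function (err) {
        console.log(err);
    })
    .on('data', function (data) {
        chunks.push(data); // each chunk is a Buffer; never append it to a string
    })
    .on('end', function () {
        var file = Buffer.concat(chunks); // one binary Buffer, no encoding applied
        fs.writeFile('./report.pdf', file, function (err) {
            if (err) console.log(err);
        });
    });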
If you wanted to do something after the file is done being written by pipe, you can add an event listener for finish on the object returned by pipe.
.pipe(fs.createWriteStream('./report.pdf'))
.on('finish', function done() { /* the file has been written */ });
Source: https://nodejs.org/api/stream.html#stream_event_finish
I'm trying to POST a raw body with restify. I have the receive side correct: when using Postman I can send a raw zip file, and the file is correctly created on the server's file system. However, I'm struggling to write my test in Mocha. Here is the code I have; any help would be greatly appreciated.
I've tried this approach:
const should = require('should');
const restify = require('restify');
const fs = require('fs');

const port = 8080;
const url = 'http://localhost:' + port;
const client = restify.createJsonClient({
    url: url,
    version: '~1.0'
});

const testPath = 'test/assets/test.zip';
fs.existsSync(testPath).should.equal(true);
const readStream = fs.createReadStream(testPath);

client.post('/v1/deploy', readStream, function (err, req, res, data) {
    if (err) {
        throw new Error(err);
    }
    should(res).not.null();
    should(res.statusCode).not.null();
    should(res.statusCode).not.undefined();
    res.statusCode.should.equal(200);
    should(data).not.null();
    should(data.endpoint).not.undefined();
    data.endpoint.should.equal('http://endpointyouhit:8080');
    done();
});
Yet the file size on the file system is always 0. I'm not using my readStream correctly, but I'm not sure how to correct it. Any help would be greatly appreciated.
Note that I want to stream the file, not load it into memory on transmit and receive; the file can potentially be too large for an in-memory operation.
Thanks,
Todd
One thing is that you would need to specify a content-type of multipart/form-data. However, it looks like restify doesn't support that content type, so you're probably out of luck using the restify client to post a file.
To answer my own question: it doesn't appear to be possible to do this with the restify client. I also tried the request module, which claims to have this capability; however, when using their streaming examples, I always had a file size of 0 on the server. Below is a functional Mocha integration test.
const testPath = 'test/assets/test.zip';
fs.existsSync(testPath).should.equal(true);
const readStream = fs.createReadStream(testPath);

var options = {
    host: 'localhost',
    port: port,
    path: '/v1/deploy/testvalue',
    method: 'PUT'
};

var req = http.request(options, function (res) {
    // this feels a bit backwards, but these are evaluated AFTER the read stream has closed
    var buffer = '';
    // collect the response body into a buffer
    res.on('data', function (data) {
        buffer += data;
    });
    res.on('end', function () {
        should(res).not.null();
        should(res.statusCode).not.null();
        should(res.statusCode).not.undefined();
        res.statusCode.should.equal(200);
        const json = JSON.parse(buffer);
        should(json).not.null();
        should(json.endpoint).not.undefined();
        json.endpoint.should.equal('http://endpointyouhit:8080');
        done();
    });
});

req.on('error', function (err) {
    if (err) {
        throw new Error(err);
    }
});

// pipe the read stream into the request
readStream.pipe(req);

/**
 * End the request on the close of the read stream.
 * (pipe() already calls req.end() by default, so this is mostly belt and braces.)
 */
readStream.on('close', function () {
    req.end();
    console.log('I finished.');
});

// note that if we end up with larger files, we may want to support 100-continue, much as S3 does
// https://nodejs.org/api/http.html#http_event_continue
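If that ever becomes necessary, the flow might look roughly like this (a sketch based on the linked docs, not part of the test above): send the Expect header, wait for the server's 100 Continue, and only then stream the body.
options.headers = { 'Expect': '100-continue' };
var req = http.request(options, function (res) { /* handle the response as above */ });
// 'continue' fires when the server answers the Expect header with 100 Continue
req.on('continue', function () {
    readStream.pipe(req); // only now start streaming the body
});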
I am trying to implement the _read function of a readable stream. A problem happens when _read is called and there isn't data: the documentation says that I can push('') until more data comes, and that I should only return false when the stream will never have more data.
https://nodejs.org/api/stream.html#stream_readable_read_size_1
But it also says that if I need to do that then something is terribly wrong with my design.
https://nodejs.org/api/stream.html#stream_stream_push
But I can't find an alternative to that.
code:
var http = require('http');
var https = require('https');
var Readable = require('stream').Readable;
var router = require('express').Router();
var buffer = [];

router.post('/', function (clientRequest, clientResponse) {
    var delayedMSStream = new Readable;

    delayedMSStream._read = function () {
        var a = buffer.shift();
        if (typeof a === 'undefined') {
            this.push('');
            return true;
        } else {
            this.push(a);
            if (a === null) {
                return false;
            }
            return true;
        }
    };

    // I need to get a url from example.com
    https.request({ hostname: 'example.com' }, function (exampleResponse) {
        var data = '';
        exampleResponse.on('data', function (chunk) { data += chunk; });
        exampleResponse.on('end', function () {
            var MSRequestOptions = { hostname: data, method: 'POST' };
            var MSRequest = https.request(MSRequestOptions, function (MSResponse) {
                MSResponse.on('end', function () {
                    console.log("MSResponse.on(end)");
                }); // end MSResponse.on(end)
            }); // end MSRequest
            delayedMSStream.pipe(MSRequest);
        });
    }).end();

    clientRequest.on('data', function (chunk) {
        buffer.push(chunk);
    });
    clientRequest.on('end', function () { // when done streaming audio
        buffer.push(null);
    });
}); // end router.post('/')
Explanation: the client sends a POST request streaming audio to my server; my server requests a URL from example.com; when example.com responds with the URL, my server streams the audio to it.
What's a smarter way to do it?
So if I understand the code correctly, you:
receive a request,
make your own request to a remote endpoint and fetch a URL
make a new request to that URL and pipe that to the original response.
There are ways to do this other than yours, and even your way would look cleaner to me if you just improved the naming a bit. Also, splitting the huge handler into a few functions with smaller responsibility scopes might help.
I would make the endpoint this way:
let http = require('http');
let https = require('https');
let Readable = require('stream').Readable;
let router = require('express').Router();

/**
 * Gets some data from a remote host. Calls back when done.
 * We cannot pipe this directly into the stream chain, as we need the complete data to get the end result.
 */
function getHostname(cb) {
    https.request({
        hostname: 'example.com'
    }, function (response) {
        let data = '';
        response.on('error', err => cb(err)); // shortened for brevity
        response.on('data', function (chunk) {
            data = data + chunk;
        });
        response.on('end', function () {
            // we're done here.
            cb(null, data.toString());
        });
    }).end(); // the request is only sent once end() is called
}
router.post('/', function (request, response) {
    // first let's get that url.
    getHostname(function (err, hostname) {
        if (err) { return response.status(500).end(); }
        // now make that other request, which we can stream.
        let msRequest = https.request({
            hostname: hostname,
            method: 'POST'
        }, function (msResponse) {
            msResponse.pipe(response);
        });
        // stream the incoming audio straight through;
        // pipe() handles backpressure and calls msRequest.end() when the audio ends
        request.pipe(msRequest);
    });
});
Now, as said in the comments, with streams2 you don't have to manage your streams manually. With Node versions before 0.10 you had to listen to 'readable', 'data', etc. events; with newer Node versions it's handled for you. Furthermore, you don't even need that here: streams are smart enough to handle backpressure on their own.
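Concretely, the manual buffer-plus-_read machinery from the question can collapse into a single pipe (a sketch reusing the question's own names):
// instead of buffering clientRequest chunks and re-emitting them from _read,
// pipe the incoming audio straight into the outgoing request;
// pipe() pauses the source whenever the destination's internal buffer fills up
var MSRequest = https.request(MSRequestOptions, function (MSResponse) {
    MSResponse.pipe(clientResponse);
});
clientRequest.pipe(MSRequest); // also calls MSRequest.end() when the audio ends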
My client sends an image file to the server. It works 5 times and then it suddenly stops. I am pretty new to using streams and pipe, so I am not sure what I am doing wrong.
Server Code
http.createServer(function (req, res) {
    console.log("File received");
    // This opens up the writable stream to the output file
    var name = "./test" + i + ".jpg";
    var writeStream = fs.createWriteStream(name);
    // This pipes the POST data to the file
    req.pipe(writeStream);
    req.on('end', function () {
        console.log("File saved");
        i++;
    });
    // This is here in case any errors occur
    writeStream.on('error', function (err) {
        console.log(err);
    });
}).listen(3000);
Client code
var request = require('request');
var fs = require('fs');

setInterval(function () {
    var readStream = fs.createReadStream('./test.jpg');
    readStream.on('open', function () {
        // This pipes the read stream into the POST request (which goes to the server)
        readStream.pipe(request.post('http://192.168.1.100:3000/test'));
        console.log("Send file to server");
    });
}, 1000);
This behaves like a resource exhaustion issue. Not sure which calls throw errors and which just return. Does the server connect on the 6th call? Does the write stream open? Does the pipe open?
Try ending the connection and closing the pipe after the image is saved. Maybe close the write stream too; I don't remember whether Node garbage collects file descriptors.
I had to do the following on the server side to make this work:
res.statusCode = 200;
res.end();
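Putting the suggestion and that fix together, the server handler might look roughly like this (a sketch; responding once the write stream finishes lets each client request complete and frees its socket):
http.createServer(function (req, res) {
    var writeStream = fs.createWriteStream("./test" + i + ".jpg");
    req.pipe(writeStream);
    writeStream.on('finish', function () {
        console.log("File saved");
        i++;
        // answering the request releases the connection,
        // so the client's socket pool no longer fills up after a few uploads
        res.statusCode = 200;
        res.end();
    });
    writeStream.on('error', function (err) {
        console.log(err);
    });
}).listen(3000);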
app.get('/', function (req, res) {
    var options = {
        host: 'www.google.com'
    };
    http.get(options, function (http_res) {
        http_res.on('data', function (chunk) {
            res.send('BODY: ' + chunk);
        });
        res.end("");
    });
});
I am trying to download the google.com homepage and reprint it, but I get a "Can't use mutable header APIs after sent." error.
Anyone know why? Or how to make the HTTP call correctly?
Check out the example in the Node.js docs.
The method http.get is a convenience method; it handles a lot of the basic stuff for a GET request, which usually has no body. Below is a sample of how to make a simple HTTP GET request:
var http = require("http");
var options = {
host: 'www.google.com'
};
http.get(options, function (http_res) {
// initialize the container for our data
var data = "";
// this event fires many times, each time collecting another piece of the response
http_res.on("data", function (chunk) {
// append this chunk to our growing `data` var
data += chunk;
});
// this event fires *one* time, after all the `data` events/chunks have been gathered
http_res.on("end", function () {
// you can use res.send instead of console.log to output via express
console.log(data);
});
});
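Applied to the Express handler from the question, that might look like this (a sketch; the key point is that res.send is called exactly once, after end, so the headers are only written once):
app.get('/', function (req, res) {
    http.get({ host: 'www.google.com' }, function (http_res) {
        var data = '';
        http_res.on('data', function (chunk) {
            data += chunk; // keep collecting; don't respond yet
        });
        http_res.on('end', function () {
            res.send('BODY: ' + data); // one response, so no mutable-header error
        });
    });
});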