How do I stream large files over ssh in Node?

I'm trying to stream the output of a cat command using the ssh2 module, but it just hangs at some point during execution. I'm running cat there.txt, where there.txt is around 10 MB or so.
For example:
var local = fs.createWriteStream('here.txt');
conn.exec('cat there.txt', function(err, stream) {
  if (err) throw err;
  stream.pipe(local).on('finish', function() { console.log('Done'); });
});
This just completely stops at one point. I've even piped the stream to the local stdout, and it just hangs after a while. In my actual code I pipe it through a bunch of other transform streams, which is why I'd rather stream than transfer the files to the local system first (they may grow larger than 200 MB).

I had only recently started working with streams, so when I piped the ssh stream through various transform streams, I wasn't ending on a writable stream the way I was in my example (I should've included my actual code, sorry!). That is what caused it to hang. The goal was to execute multiple commands remotely and write their output, sorted, into a single file.
So my original code was stream.pipe(transformStream), then pushing the transformStream into an array once it finished, and then sorting them with the mergesort-stream npm module. Instead of that, I now write the (transformed) results of the multiple ssh commands to temporary files and then sort them all at once, as sketched below.
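For what it's worth, here is a rough sketch of what that fix looks like: each remote command is piped through a transform and into a temporary file, and we wait for the writable's finish event before merging. The transform (upperCase) and the file name part-1.tmp are made up for illustration, and conn is the ssh2 connection from the question; the point is simply that every pipeline ends on a writable stream.
var fs = require('fs');
var stream = require('stream');

// Hypothetical transform, only for illustration.
var upperCase = new stream.Transform({
  transform: function(chunk, enc, cb) {
    cb(null, chunk.toString().toUpperCase());
  }
});

conn.exec('cat there.txt', function(err, remote) {
  if (err) throw err;
  var tmp = fs.createWriteStream('part-1.tmp'); // temporary file for this command's output
  remote
    .pipe(upperCase)   // any transform streams go here
    .pipe(tmp)         // always end on a writable
    .on('finish', function() {
      console.log('part-1.tmp written; ready to be merged/sorted later');
    });
});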

Try using fs.createReadStream for serving huge files:
fs.exists(correctfilepath, function(exists) {
  if (exists) {
    var readstream = fs.createReadStream(correctfilepath);
    console.log("About to serve " + correctfilepath);
    res.writeHead(200);
    readstream.setEncoding("binary");
    readstream.on("data", function (chunk) {
      res.write(chunk, "binary");
    });
    readstream.on("end", function () {
      console.log("Served file " + correctfilepath);
      res.end();
    });
    readstream.on('error', function(err) {
      res.write(err + "\n");
      res.end();
      return;
    });
  } else {
    res.writeHead(404);
    res.write("No data\n");
    res.end();
  }
});
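For comparison, the manual data/end handlers above can usually be replaced with pipe(), which takes care of backpressure; a minimal sketch of that variant, using the same correctfilepath and res as in the answer:
fs.exists(correctfilepath, function(exists) {
  if (!exists) {
    res.writeHead(404);
    return res.end("No data\n");
  }
  res.writeHead(200);
  fs.createReadStream(correctfilepath)
    .on('error', function(err) {
      res.end(err + "\n"); // report the error and close the response
    })
    .pipe(res);            // pipe handles data, end and backpressure
});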

Related

fluent-ffmpeg get codec data without specifying output

I am using the fluent-ffmpeg node module to get codec data from a file.
It works if I give an output, but I was wondering whether there is any option to run fluent-ffmpeg without giving it an output.
This is what I am doing:
readStream.end(new Buffer(file.buffer));

var process = new ffmpeg(readStream);
process.on('start', function() {
  console.log('Spawned ffmpeg');
}).on('codecData', function(data) {
  // get recording duration
  const duration = data.duration;
  console.log(duration);
}).save('temp.flac');
As you can see, I am saving the file to temp.flac so that I can get the duration in seconds of that file.
If you don't want to save the result of the ffmpeg process to a file, one thing that comes to mind is to redirect the command output to /dev/null.
In fact, as the owner of the fluent-ffmpeg repository said in a comment, there is no need to specify a real file name for the destination when using the null format.
So, for example, something like this will work:
let process = new ffmpeg(readStream);
process
  .addOption('-f', 'null') // set format to null
  .on('start', function() {
    console.log('Spawned ffmpeg');
  })
  .on('codecData', function(data) {
    // get recording duration
    let duration = data.duration;
    console.log(duration);
  })
  .output('nowhere') // or '/dev/null' or something else
  .run();
It remains a bit hacky, but we must set an output to avoid the "No output specified" error.
When no stream argument is present, the pipe() method returns a PassThrough stream, which you can pipe to somewhere else (or just listen to events on).
var command = ffmpeg('/path/to/file.avi')
  .videoCodec('libx264')
  .audioCodec('libmp3lame')
  .size('320x240')
  .on('error', function(err) {
    console.log('An error occurred: ' + err.message);
  })
  .on('end', function() {
    console.log('Processing finished!');
  });

var ffstream = command.pipe();
ffstream.on('data', function(chunk) {
  console.log('ffmpeg just wrote ' + chunk.length + ' bytes');
});
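pipe() also accepts a writable stream argument, so the same command can be sent straight to a destination instead of reading chunks by hand; a small sketch, where out.avi is just a made-up target path:
var fs = require('fs');

// Equivalent: hand pipe() a writable and let it manage the flow.
command.pipe(fs.createWriteStream('out.avi'));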

Node js file system: end event not called for readable stream

I'm trying to extract a .tar file (packed from a directory) and then check the names of the files in the extracted directory. I'm using tar-fs to extract the tar file and then fs.createReadStream to manipulate the data. Here's what I've got so far:
fs.createReadStream(req.files.file.path)
  .pipe(tar.extract(req.files.file.path + '0'))
  .on('error', function() {
    errorMessage = 'Failed to extract file. Please make sure to upload a tar file.';
  })
  .on('entry', function(header, stream, callback) {
    console.error(header);
    stream.on('end', function() {
      console.error("this is working");
    });
  })
  .on('end', function() {
    // this one did not get called
    console.error('end');
  });
I was hoping to extract the whole folder and then check the file names. Well, I haven't gotten that far yet...
To my understanding, I get a readable stream after the pipe, and a readable stream has an end event. So my question is: why is the end event in this code not called?
Thanks!
Listen for the finish event on writable streams. It is fired when end() has been called and processing of the entry is finished. See the Node.js stream documentation for more on it.
.on('finish', function() {
  console.error('end');
})
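Applied to the question's code, the listener goes on the stream returned by the pipe call (the writable extract stream from tar-fs), so the end handler that never fires can simply become a finish handler; a minimal sketch reusing the question's fs, tar and path variables:
fs.createReadStream(req.files.file.path)
  .pipe(tar.extract(req.files.file.path + '0'))
  .on('finish', function() {
    // tar.extract() returns a writable, so finish fires once extraction is done
    console.error('extraction finished');
  });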

Cannot stream large files

I have two Node.js servers, A and B, where A connects to B to get (large) files. Here's the code I use to handle the streams:
var downloadedAmount = 0;

stream.on('data', function (data) {
  if (Buffer.isBuffer(data)) {
    downloadedAmount += data.toString('utf8').length;
  } else {
    downloadedAmount += data.length;
  }
  if (!res.write(data)) {
    stream.pause();
  }
});

res.on('finish', function () {
  console.log("Finish called");
  sendUsage(reqId, downloadedAmount); // this is an async db/network call
});

stream.on("end", function () {
  console.log("End called");
  res.end();
});

res.on("drain", function () {
  stream.resume();
});
It downloads files fine when only a single file is being downloaded. My problem is that when I try to download multiple files (larger than 300 MB), which increases the load on the servers, the connections are closed at B and all the files stop at 124 MB. Can anyone tell me what I am doing wrong here?
Update: The problem is in the B server, as direct multiple downloads (requested from a browser) also tend to halt at around 124 MB.
OK, it seems the problem was somewhere else altogether. I changed the setting on the Apache server (which the B server uses to get the files) to
EnableSendfile On
Now it works.
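As a side note, the manual pause/resume bookkeeping in the question can usually be replaced by pipe(), which applies backpressure automatically; a rough sketch under that assumption, keeping the byte counting in a separate data listener:
var downloadedAmount = 0;

stream.on('data', function (data) {
  downloadedAmount += data.length; // count bytes as they pass through
});

stream.pipe(res); // pipe pauses/resumes the source and ends res for us

res.on('finish', function () {
  sendUsage(reqId, downloadedAmount); // async usage reporting, as in the question
});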

Writing a function to remove all files from a directory upon exiting the directory

I would like to delete the files that have appeared in the directory as a side effect of running this piece of code. I have tried the following method, but it does not work: when I check for files with the .csv extension, they are still present. I have used the following link as a template:
http://nodejs.org/api/child_process.html#child_process_child_process_exec_command_options_callback
My code is:
process.on('exit', function()
{
  var child = exec('~/.ssh/project' + 'rm *.csv',
    function (error, stdout, stderr)
    {
      if (error !== null || stderr.length > 0)
      {
        console.log(stderr);
        console.log('exec error: ' + error);
        process.exit();
      }
    });
});
But this is not working, since there are still .csv files in the directory.
As thefourtheye mentioned, "Asynchronous callbacks will not work in exit event". So anything you do there has to be synchronous. In this case you have two choices:
Use execSync. The good thing about this is that it saves you a lot of complexity, and for a lot of files it might be faster (a single rm call removes them all):
const { execSync } = require('child_process');

process.on('exit', () => {
  // remove the generated .csv files from the project directory
  execSync('rm ~/.ssh/project/*.csv');
});
Use fs.readdirSync and fs.unlinkSync. Originally thefourtheye wrote this in his deleted answer. This might be a little faster if there is only one file or a few files, as it doesn't involve spawning a shell process:
const fs = require('fs');
const path = require('path');

function getUserHome() {
  return process.env[(process.platform === 'win32') ? 'USERPROFILE' : 'HOME'] + path.sep;
}

process.on('exit', () => {
  const dir = getUserHome() + '.ssh/project';
  fs.readdirSync(dir).forEach((fileName) => {
    if (path.extname(fileName) === '.csv') {
      fs.unlinkSync(path.join(dir, fileName)); // unlink needs the full path
    }
  });
});

node.js file system problems

I keep banging my head against the wall because of tons of different errors. This is the code I am trying to use:
fs.readFile("balance.txt", function (err, data) //At the beginning of the script (checked, it works)
{
if (err) throw err;
balance=JSON.parse(data);;
});
fs.readFile("pick.txt", function (err, data)
{
if (err) throw err;
pick=JSON.parse(data);;
});
/*....
.... balance and pick are modified
....*/
if (shutdown)
{
  fs.writeFile("balance2.txt", JSON.stringify(balance));
  fs.writeFile("pick2.txt", JSON.stringify(pick));
  process.exit(0);
}
At the end of the script, the files have not been modified in the slightest. I then found out on this site that the files were being opened twice simultaneously, or something like that, so I tried this:
var balance, pick;

var stream = fs.createReadStream("balance.txt");
stream.on("readable", function()
{
  balance = JSON.parse(stream.read());
});

var stream2 = fs.createReadStream("pick.txt");
stream2.on("readable", function()
{
  pick = JSON.parse(stream2.read());
});

/****
****/

fs.unlink("pick.txt");
fs.unlink("balance.txt");

var stream = fs.createWriteStream("balance.txt", {flags: 'w'});
var stream2 = fs.createWriteStream("pick.txt", {flags: 'w'});
stream.write(JSON.stringify(balance));
stream2.write(JSON.stringify(pick));
process.exit(0);
But this time, both files are empty... I know I should catch errors, but I just don't see where the problem is. I don't mind storing the two objects in the same file if that helps. Besides that, I never did any JavaScript in my life before yesterday, so please give me a simple explanation if you know what failed here.
What I think you want to do is use readFileSync rather than readFile to read your files, since you need them to be read before doing anything else in your program (http://nodejs.org/api/fs.html#fs_fs_readfilesync_filename_options).
This will make sure you have read both files before you execute any of the rest of your code.
Make your code look like this:
try
{
  balance = JSON.parse(fs.readFileSync("balance.txt"));
  pick = JSON.parse(fs.readFileSync("pick.txt"));
}
catch (err)
{
  throw err;
}
I think you will get the functionality you are looking for by doing this.
Note, you will not be able to check for an error in the same way you can with readFile. Instead, you will need to wrap each call in a try/catch, or use existsSync before each operation to make sure you aren't trying to read a file that doesn't exist.
How to capture no file for fs.readFileSync()?
Furthermore, you have the same problem with the writes. You are kicking off async writes and then immediately calling process.exit(0). A better way to do this would be either to write them sequentially and asynchronously and then exit, or to write them sequentially and synchronously and then exit.
Async option:
if (shutdown)
{
  fs.writeFile("balance2.txt", JSON.stringify(balance), function(err) {
    fs.writeFile("pick2.txt", JSON.stringify(pick), function(err) {
      process.exit(0);
    });
  });
}
Sync option:
if (shutdown)
{
  fs.writeFileSync("balance2.txt", JSON.stringify(balance));
  fs.writeFileSync("pick2.txt", JSON.stringify(pick));
  process.exit(0);
}
