Can't get a node stream's finish event to fire - node.js

I'm trying to determine when a node WriteStream is done writing:
var gfs = GFS(db);
var writeStream = gfs.createWriteStream({
    filename: "thismyfile.txt",
    root: "myfiles"
});
writeStream.on("finish", function()
{
console.log("finished");
response.send({ Success: true });
});
writeStream.write("this might work");
writeStream.end();
console.log("end");
In my console, I see "end", but never "finished", and there is never a response. The stream is writing properly, and it seems to be finishing (I can see the completed file in the database). That event just isn't firing. I've tried moving "this might work" into the call to end() and removing write(); I've also tried passing a string into end(). That string is written to the stream, but still no callback.
Why might this event not be getting fired?
Thank you.

The gridfs-stream module is designed and written primarily for node 0.8.x and below and does not use the streams2-style base classes provided by require('stream').Writable in node >= 0.10.x. Because of this, it does not get the standardized finish event for free. It is up to the module implementation itself to emit finish, which it apparently does not.
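If you just need to know when the write has completed, gridfs-stream documents its own close event, which fires with the stored file object once it has been flushed to GridFS. A minimal sketch, reusing the writeStream and response from the question:
writeStream.on("close", function(file) {
    // gridfs-stream emits "close" with the stored file object
    // once the file has been fully written to GridFS
    console.log("closed", file.filename);
    response.send({ Success: true });
});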

Related

Node Unzipper - how to know it's finished

I'm trying to use the unzipper node module to extract and process a number of files (exact number is unknown). However, I can't seem to figure out how to know when all the files are processed. So far, my code looks like this:
s3.getObject(params).createReadStream()
    .pipe(unzipper.Parse())
    .on('entry', async (entry) => {
        var fileName = entry.path;
        if (fileName.match(someRegex)) {
            await processEntry(entry);
            console.log("Uploaded");
        } else {
            entry.autodrain();
            console.log("Drained");
        }
    });
I'm trying to figure out how to know that unzipper has gone through all the files (i.e., no more entry events are forthcoming) and all the entry handlers have finished so that I know I've finished processing all the files I care about.
I've tried experimenting with the close and finish events, but when I have, they both trigger before console.log("Uploaded"); has printed, so that doesn't seem right.
Help?
Directly from the docs:
The parser emits finish and error events like any other stream. The parser additionally provides a promise wrapper around those two events to allow easy folding into existing Promise-based structures.
Example:
fs.createReadStream('path/to/archive.zip')
    .pipe(unzipper.Parse())
    .on('entry', entry => entry.autodrain())
    .promise()
    .then(() => console.log('done'), e => console.log('error', e));
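Note that the promise resolves when the parser itself finishes, not when your async entry handlers do, which is exactly the symptom in the question. One way to wait for both, sketched here inside an async function and reusing processEntry and someRegex from the question, is to collect the handler promises and await them after the parser's promise:
const pending = [];
await s3.getObject(params).createReadStream()
    .pipe(unzipper.Parse())
    .on('entry', (entry) => {
        if (entry.path.match(someRegex)) {
            pending.push(processEntry(entry)); // collect, don't await here
        } else {
            entry.autodrain();
        }
    })
    .promise();             // resolves once the parser has seen every entry
await Promise.all(pending); // now every processEntry call has finished too
console.log('all entries processed');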

Redis pubsub message queue but with callback, as in ZeroMQ

I have found the following code that implements an asynchronous message queue (actually there is no queue, only files) with ZeroMQ and Node.js
setInterval(function() {
    var value = { id: i++, date: new Date() };
    WriteFile(value.id + ".dat", value);
    client.send(value, function(result) {
        console.log(value, result);
        DeleteFile(value.id + ".dat");
    });
}, 10000);
The code is from here.
The functions "WriteFile" and "DeleteFile" are defined later in the code, but there is nothing extraordinary there.
The function "client.send" is also defined in another file, where the callback is defined. Clearly there is a provision from ZeroMQ to have a callback when the message transmission is successful.
Now I want to do something like this but with Redis pubsub instead of ZeroMQ for simplicity. As I understand it, there is no callback in the "publish" function from the node_redis module.
My question is: is there a way to implement something like this? I really like the idea of writing files and then deleting them when the transmission is complete, but I would like it done in Redis. I know I am grasping at straws, but if anyone has any ideas, I will gladly listen.
All of the redis module's commands have an optional callback as the last argument.
So doing something like
client.publish('channel', 'message', function(err) {
    if (err) throw err;
});
should work as expected.
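To mirror the write-then-delete pattern from the question, here is a minimal sketch, reusing the question's WriteFile/DeleteFile helpers and a made-up 'updates' channel. One caveat: the reply to PUBLISH is the number of subscribers that received the message, so unlike the ZeroMQ version this only confirms delivery to currently connected subscribers, not that any of them processed it:
var redis = require('redis');
var client = redis.createClient();
var i = 0;

setInterval(function() {
    var value = { id: i++, date: new Date() };
    WriteFile(value.id + ".dat", value); // persist first, as in the ZeroMQ version
    client.publish('updates', JSON.stringify(value), function(err, receivers) {
        // `receivers` is the number of subscribers the message reached
        if (!err && receivers > 0) {
            DeleteFile(value.id + ".dat");
        }
    });
}, 10000);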

Block function whilst waiting for response

I've got a NodeJS app I'm building (using Sails, but I guess that's irrelevant).
In my action, I have a number of requests to other services, data sources, etc. that I need to load up. However, because of the heavy reliance on callbacks, my code is still executing long after the action has returned the HTML.
I must be missing something silly (or not quite getting the whole async thing), but how on earth do I stop my action from finishing until I have all my data ready to render the view?!
Cheers
I'd recommend getting very intimate with the async library (https://github.com/caolan/async).
The docs there are pretty good, but it basically boils down to a bunch of very handy calls like:
// each task receives a callback and must invoke it when done
async.parallel([
    function(callback) { /* ... */ callback(); },
    function(callback) { /* ... */ callback(); }
], finalCallback);

async.series([
    function(callback) { /* ... */ callback(); },
    function(callback) { /* ... */ callback(); }
]);
Node is inherently async, you need to learn to love it.
It's hard to tell exactly what the problem is, but here is a guess. Assuming you have only one external call, your code should look like this:
exports.myController = function(req, res) {
    longExternalCallOne(someparams, function(result) {
        // you must render your view inside the callback
        res.render('someview', { data: result });
    });
    // do not render here, as you don't have the result yet.
};
If you have two or more external calls, your code will look like this:
exports.myController = function(req, res) {
    longExternalCallOne(someparams, function(result1) {
        longExternalCallTwo(someparams, function(result2) {
            // you must render your view inside the innermost callback
            var data = { /* some combination of result1 and result2 */ };
            res.render('someview', { data: data });
        });
        // do not render here, since you don't have result2 yet
    });
    // do not render here either, as you have neither result1 nor result2 yet.
};
As you can see, once you have more than one long-running async call, things start to get tricky. The code above is just for illustration purposes. If your second call depends on the result of the first, you need nesting like this; but if longExternalCallOne and longExternalCallTwo are independent of each other, you should use a library like async to parallelize the requests: https://github.com/caolan/async
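A sketch of the parallel version with async.parallel, keeping the hypothetical call names from above; the final callback fires once both tasks have completed:
var async = require('async');

exports.myController = function(req, res) {
    async.parallel({
        one: function(cb) { longExternalCallOne(someparams, function(r) { cb(null, r); }); },
        two: function(cb) { longExternalCallTwo(someparams, function(r) { cb(null, r); }); }
    }, function(err, results) {
        // both calls have completed; render exactly once, here
        res.render('someview', { data: { one: results.one, two: results.two } });
    });
};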
You cannot stop your code. All you can do is check in each callback whether everything else has completed. If yes, go on with your code; if not, wait for the next callback and check again.
You should not stop your code, but rather render your view inside your other resource's callback, so that you wait for the resource before rendering. That's the common pattern in node.js.
If you have to wait for several callbacks to be called, you can manually check each time one is called whether the others have been called too (with simple booleans, for example), and call your render function if so. Or you can use async or other such libraries to make the task easier. Promises (with the bluebird library) could be an option too.
I am guessing here, since there is no code example, but you might be running into something like this:
// let's say you have a function, you pass it an argument and callback
function myFunction(arg, callback) {
    // now you do something asynchronous with the argument
    doSomethingAsyncWithArg(arg, function() {
        // now you've got your arg formatted or whatever, render result
        res.render('someView', { arg: arg });
        // now do the callback
        callback();
        // but you also have stuff here!
        doSomethingElse();
    });
}
So, after you render, your code keeps running. How to prevent it? Return from there:
return callback();
Now your inner function will stop processing after it calls callback.

How to close a readable stream (before end)?

How to close a readable stream in Node.js?
var input = fs.createReadStream('lines.txt');
input.on('data', function(data) {
    // after closing the stream, this will not
    // be called again
    if (gotFirstLine) {
        // close this stream and continue the
        // instructions from this if
        console.log("Closed.");
    }
});
This would be better than:
input.on('data', function(data) {
    if (isEnded) { return; }
    if (gotFirstLine) {
        isEnded = true;
        console.log("Closed.");
    }
});
But this would not stop the reading process...
Edit: Good news! Starting with Node.js 8.0.0 readable.destroy is officially available: https://nodejs.org/api/stream.html#stream_readable_destroy_error
ReadStream.destroy
You can call the ReadStream.destroy function at any time.
var fs = require("fs");
var readStream = fs.createReadStream("lines.txt");
readStream
.on("data", function (chunk) {
console.log(chunk);
readStream.destroy();
})
.on("end", function () {
// This may not been called since we are destroying the stream
// the first time "data" event is received
console.log("All the data in the file has been read");
})
.on("close", function (err) {
console.log("Stream has been destroyed and file has been closed");
});
The public function ReadStream.destroy is not documented (as of Node.js v0.12.2), but you can have a look at the source code on GitHub (Oct 5, 2012 commit).
The destroy function internally marks the ReadStream instance as destroyed and calls the close function to release the file.
You can listen to the close event to know exactly when the file is closed. The end event will not fire unless the data is completely consumed.
Note that the destroy (and close) functions are specific to fs.ReadStream. They are not part of the generic stream.Readable "interface".
Invoke input.close(). It's not in the docs, but
https://github.com/joyent/node/blob/cfcb1de130867197cbc9c6012b7e84e08e53d032/lib/fs.js#L1597-L1620
clearly does the job :) It actually does something similar to your isEnded.
EDIT 2015-Apr-19: Based on comments below, and to clarify and update:
This suggestion is a hack, and is not documented.
Though looking at the current lib/fs.js, it still works more than 1.5 years later.
I agree with the comment below that calling destroy() is preferable.
As correctly stated below, this works for fs ReadStreams, not for a generic Readable.
As for a generic solution: it doesn't appear as if there is one, at least from my understanding of the documentation and from a quick look at _stream_readable.js.
My proposal would be to put your readable stream in paused mode, at least preventing further processing in your upstream data source. Don't forget to unpipe() and remove all data event listeners so that pause() actually pauses, as mentioned in the docs.
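A minimal sketch of that soft stop, assuming source is a generic Readable currently piped into some dest:
source.unpipe(dest);               // stop feeding the destination
source.removeAllListeners('data'); // with no 'data' listeners left...
source.pause();                    // ...pause() actually pauses the flow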
Today, in Node 10, readableStream.destroy() is the official way to close a readable stream; see https://nodejs.org/api/stream.html#stream_readable_destroy_error
You can't. There is no documented way to close/shutdown/abort/destroy a generic Readable stream as of Node 5.3.0. This is a limitation of the Node stream architecture.
As other answers here have explained, there are undocumented hacks for specific implementations of Readable provided by Node, such as fs.ReadStream. These are not generic solutions for any Readable though.
If someone can prove me wrong here, please do. I would like to be able to do what I'm saying is impossible, and would be delighted to be corrected.
EDIT: Here was my workaround: implement .destroy() for my pipeline through a complex series of unpipe() calls. And after all that complexity, it doesn't work properly in all cases.
EDIT: Node v8.0.0 added a destroy() api for Readable streams.
At version 4.*.*, pushing a null value into the stream will trigger an EOF signal.
From the nodejs docs:
If a value other than null is passed, the push() method adds a chunk of data into the queue for subsequent stream processors to consume. If null is passed, it signals the end of the stream (EOF), after which no more data can be written.
This worked for me after trying numerous other options on this page.
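For illustration, a minimal sketch of a stream that ends itself this way (the chunk content is made up):
var stream = require('stream');

var readable = new stream.Readable({
    read: function() {
        this.push('some data'); // queue one chunk for consumers
        this.push(null);        // signal EOF: nothing more will follow
    }
});

readable.on('end', function() {
    console.log('stream ended');
});
readable.resume(); // consume in flowing mode so 'end' can fire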
This destroy module is meant to ensure a stream gets destroyed, handling different APIs and Node.js bugs. Right now it is one of the best choices.
NB: from Node 10 you can use the .destroy method without further dependencies.
You can clear and close the stream with yourstream.resume(), which will dump everything on the stream and eventually close it.
From the official docs:
readable.resume()
Returns: this
This method will cause the readable stream to resume emitting 'data' events.
This method will switch the stream into flowing mode. If you do not want to consume the data from a stream, but you do want to get to its 'end' event, you can call stream.resume() to open the flow of data.
var readable = getReadableStreamSomehow();
readable.resume();
readable.on('end', () => {
    console.log('got to the end, but did not read anything');
});
It's an old question, but I too was looking for the answer and found this to be the best one for my implementation. Both the end and close events get emitted, so I think this is the cleanest solution.
This will do the trick in node 4.4.* (stable version at the time of writing):
var input = fs.createReadStream('lines.txt');
input.on('data', function(data) {
    if (gotFirstLine) {
        this.end(); // Simple, isn't it?
        console.log("Closed.");
    }
});
For a very detailed explanation see:
http://www.bennadel.com/blog/2692-you-have-to-explicitly-end-streams-after-pipes-break-in-node-js.htm
This code here will do the trick nicely:
function closeReadStream(stream) {
    if (!stream) return;
    if (stream.close) stream.close();
    else if (stream.destroy) stream.destroy();
}
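A hypothetical usage, stopping a file read after the first chunk:
var fs = require('fs');

var input = fs.createReadStream('lines.txt');
input.once('data', function(chunk) {
    closeReadStream(input); // prefers close() when available, falls back to destroy()
});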
writeStream.end() is the go-to way to close a writeStream...
To stop callback execution after some point, you can use process.kill with the current process ID:
const csv = require('csv-parser');
const fs = require('fs');

const filepath = "./demo.csv";

let readStream = fs.createReadStream(filepath, {
    autoClose: true,
});

let MAX_LINE = 0;

readStream
    .on('error', (e) => {
        console.log(e);
        console.log("error");
    })
    .pipe(csv())
    .on('data', (row) => {
        if (MAX_LINE == 2) {
            process.kill(process.pid, 'SIGTERM');
        }
        MAX_LINE++;
        console.log(row);
    })
    .on('end', () => {
        // handle end of CSV
        console.log("read done");
    })
    .on("close", function() {
        console.log("closed");
    });

What's the right way to exit node.js script with a log message?

I have a node.js script that does some logging to a file using WriteStream. On certain events I want to stop execution of the script, i.e. write a warning to the log and exit immediately after that. Being asynchronous, node.js does not let us do this straightforwardly, like:
#!/usr/local/bin/node
var fs = require('fs');
var stream = fs.createWriteStream('delme.log', { flags: 'a' });
stream.write('Something bad happened\n');
process.exit(1);
Instead of appending a message to delme.log, this script does nothing to the file. Handling the 'exit' event and flushing doesn't work either. The only way I have found so far to write the last log message before exiting is to wrap process.exit(1) in setTimeout():
#!/usr/local/bin/node
var fs = require('fs');
var stream = fs.createWriteStream('delme.log', { flags: 'a' });
stream.write('Something bad happened\n');
setTimeout(function() {
    process.exit(1);
}, 30);
However, in this form it doesn't stop the script execution immediately, and the script keeps running for some time after the critical event. So I'm wondering if there are other ways to exit a script with a log message?
Since you want to block, and already are using a stream, you will probably want to handle the writing yourself.
var data = new Buffer('Something bad happened\n');
fs.writeSync(stream.fd, data, 0, data.length, stream.pos);
process.exit();
An improved version:
var fs = require('fs');
var stream = fs.createWriteStream('delme.log', {flags: 'a'});
// Gracefully close log
process.on('uncaughtException', function() {
    stream.write('\n'); // Make sure the drain event will fire (the queue may be empty!)
    stream.on('drain', function() {
        process.exit(1);
    });
});
// Any code goes here...
stream.write('Something bad happened\n');
throw new Error(SOMETHING_BAD);
The try-catch approach works, but it is ugly. Still, credits go to @nab; I just prettified it.
I think this is the right way:
process.on('exit', function() {
    // You need to use a synchronous, blocking function here.
    // Not streams or even console.log, which are non-blocking.
    console.error('Something bad happened\n');
});
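For example, a minimal sketch that appends the last message synchronously (reusing the question's delme.log; the message text is made up), so it reaches the disk before the process terminates:
var fs = require('fs');

process.on('exit', function(code) {
    // appendFileSync blocks until the data is written,
    // which is exactly what an 'exit' handler needs
    fs.appendFileSync('delme.log', 'Exiting with code ' + code + '\n');
});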
To flush all log messages to a file before exiting, one might want to wrap the script execution in a try-catch block. Once something bad has happened, it is logged, and an exception is thrown that will be caught by the outer try, from which it is safe to exit asynchronously:
#!/usr/local/bin/node
var fs = require('fs');
var stream = fs.createWriteStream('delme.log', { flags: 'a' });
var SOMETHING_BAD = 'Die now';

try {
    // Any code goes here...
    if (somethingIsBad) {
        stream.write('Something bad happened\n');
        throw new Error(SOMETHING_BAD);
    }
} catch (e) {
    if (e.message === SOMETHING_BAD) {
        stream.on('drain', function() {
            process.exit(1);
        });
    } else {
        throw e;
    }
}
I would advocate just writing to stderr in this event, e.g. a trivial example:
console.error(util.inspect(exception));
and then letting the supervising* process handle the log persistence. From my understanding, nowadays you don't have to worry about stdout and stderr not flushing before node exits (although I did see the problematic opposite behavior in some of the 0.2.x versions).
(*) For a supervising process, take your pick from supervisord, god, monit, forever, pswatch, etc.
This also provides a clean path to using PaaS providers such as Heroku, dotcloud, etc.: let the infrastructure manage the logging.
