What is the difference between console.log and process._rawDebug?
This answer tells me that console.log actually calls process.stdout.write with formatting and a newline at the end. According to this article, process._rawDebug also writes to the terminal but uses process.stderr. I'm not sure how reliable this article is, though.
I logged 10,000 messages (for testing purposes) to the console using console.log and process._rawDebug. The latter was at least twice as fast, which should mean something, I guess.
Are there any (dis)advantages to using console.log or process._rawDebug? Which one is better/safer for logging small messages?
I found the answer in the Node 0.x archive repository on GitHub. The commit message description:
This is useful when we need to push some debugging messages out to
stderr, without going through the Writable class, or triggering any kind
of nextTick or callback behavior.
It is faster because it skips the Writable stream machinery and any nextTick or callback behavior, writing the output directly to stderr.
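For reference, here is a minimal sketch of the kind of timing comparison described in the question. process._rawDebug is an undocumented internal, so treat this as illustrative only; the numbers also depend heavily on where stdout and stderr are attached.

// Rough benchmark sketch: logs 10,000 lines with each function and times them.
// process._rawDebug is undocumented and may change between Node versions.
console.time('console.log');
for (var i = 0; i < 10000; i++) console.log('message ' + i);
console.timeEnd('console.log');

console.time('process._rawDebug');
for (var i = 0; i < 10000; i++) process._rawDebug('message ' + i);
console.timeEnd('process._rawDebug');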
Related
I want my application to have a "debug" mode that will print to the console everything that is happening, but putting in many console.log calls kind of pollutes my source code, so is it good practice to have a debug mode like this?
For example:
function doSomething() {
// ...
console.log("Did something");
}
I really need this debug mode because my functions are called on events, and it's difficult to trace what's happening since there are many possible scenarios.
Two issues:
Does your logging somehow "pollute" your source code? Not if the logging helps your future self or someone else understand your code. (Of course, don't allow your logging to have side effects: no console.log(variable++).)
Is console logging good for programs put into production? No, not really. You should consider adopting a logging package.
I like Winston. There are plenty of other good packages; hopefully fans of those will write their own answers.
It allows you to send your log entries to files, to your *nix machine's syslog subsystem or to Windows Events or whatever, and to other places. It timestamps them if you want, and identifies which program they came from. Your console.log('current value', q) operation becomes logger.info('current value', q) and your console.error() becomes logger.error().
For a long-lived program (one that will still be used a few months from now) it is definitely worth your trouble to climb up the logger learning curve and rig up a solid logging system. If your program will run as part of a larger system, ask somebody how other parts of the system handle logging, and use the same scheme.
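As a rough illustration (assuming Winston 3.x; the file name, level, and format are placeholders rather than a recommended setup):

// Minimal Winston 3.x sketch: log to the console and to a file, with timestamps.
const winston = require('winston');

const logger = winston.createLogger({
  level: 'info',
  format: winston.format.combine(
    winston.format.timestamp(),
    winston.format.simple()
  ),
  transports: [
    new winston.transports.Console(),
    new winston.transports.File({ filename: 'app.log' }) // placeholder path
  ]
});

logger.info('current value', { q: 42 });
logger.error('something went wrong');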
From the Express.js documentation:
To keep your app purely asynchronous, you’d still want to pipe
console.err() to another program
Questions:
Is it enough to run my Node app with stdout and stderr redirected to avoid blocking the event loop? Like this: node app 2>&1 | tee logFile?
If the answer to (1) is yes, how do I achieve non-blocking logging while using Winston or Bunyan? Do they have some built-in mechanism for this, or do they just save data to a specific file, wasting CPU time of the current Node.js process? Or, to achieve truly async logging, should they pipe data to a child process that performs the "save to file" (and is that still a performance win)? Can anyone explain, or correct me if my way of thinking is just wrong?
Edited part: I assume that piping data from processes A, B, ...etc. to process L is cheaper for those specific processes (A, B, ...) than writing it to a file (or sending it over the network).
To the point:
I am designing a logger for an application that uses the Node.js cluster module.
Briefly - one of the processes (L) will handle data streams from the others (A, B, ...).
Process L will queue messages (for example, line by line or by some other special separator) and log them one by one into a file, a DB, or anywhere else.
The advantage of this approach is reducing the load on the other processes, which can then spend more time doing their actual job.
One more thing - the assumption is to simplify usage of this library, so the user only has to include this logger, without any additional interaction (stream redirection) via the shell.
Do you think this solution makes sense? Maybe you know of a library that already does this?
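Roughly, the setup I am thinking of looks like this (just a sketch - the worker count, log file name, and the use of silent workers are placeholders):

// Master = process "L": collects the workers' output and writes it to one file.
// Workers = processes "A", "B", ...: just write to stdout/stderr as usual.
const cluster = require('cluster');
const fs = require('fs');

if (cluster.isMaster) {
  const logStream = fs.createWriteStream('combined.log', { flags: 'a' });
  cluster.setupMaster({ silent: true }); // pipe the workers' stdio to the master

  for (let i = 0; i < 2; i++) {
    const worker = cluster.fork();
    worker.process.stdout.pipe(logStream, { end: false });
    worker.process.stderr.pipe(logStream, { end: false });
  }
} else {
  setInterval(() => console.log('worker ' + process.pid + ' did some work'), 1000);
}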
Let's lay some groundwork first...
Writing to a terminal screen (console.log() etc.), writing to a file (fs.writeFile(), fs.writeFileSync() etc.) or sending data to a stream (process.stdout.write(data) etc.) will always "block the event loop". Why? Because some part of those functions is always written in JavaScript. The minimum amount of work needed by these functions would be to take the input and hand it over to some native code, but some JS will always be executed.
And since JS is involved, it will inevitably "block" the event loop because JavaScript code is always executed on a single thread no matter what.
Is this a bad thing...?
No. The amount of time required to process some log data and send it over to a file or a stream is quite low and does not have significant impact on performance.
When would this be a bad thing, then...?
You can hurt your application by doing something generally called a "synchronous" I/O operation - that is, writing to a file and actually not executing any other JavaScript code until that write has finished. When you do this, you hand all the data to the underlying native code and while theoretically being able to continue doing other work in JS space, you intentionally decide to wait until the native code responds back to you with the results. And that will "block" your event loop, because these I/O operations can take much much longer than executing regular code (disks/networks tend to be the slowest part of a computer).
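As a rough illustration of the difference (file paths are placeholders):

// The synchronous call stops all other JavaScript until the write finishes;
// the asynchronous call hands the work off and lets the event loop keep going.
const fs = require('fs');

fs.writeFileSync('sync.log', 'blocking write\n');      // event loop waits here

fs.writeFile('async.log', 'non-blocking write\n', (err) => {
  if (err) throw err;
  console.log('async write finished');                 // runs later, via the event loop
});

console.log('this line runs before the async write completes');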
Now, let's get back to writing to stdout/stderr.
From Node.js' docs:
process.stdout and process.stderr differ from other Node.js streams in important ways:
They are used internally by console.log() and console.error(), respectively.
They cannot be closed (end() will throw).
They will never emit the 'finish' event.
Writes may be synchronous depending on what the stream is connected to and whether the system is Windows or POSIX:
Files: synchronous on Windows and POSIX
TTYs (Terminals): asynchronous on Windows, synchronous on POSIX
Pipes (and sockets): synchronous on Windows, asynchronous on POSIX
I am assuming we are working with POSIX systems below.
In practice, this means that when your Node.js process's output streams are not piped and are sent directly to the TTY, writing something to the console will block the event loop until the whole chunk of data is sent to the screen. However, if we redirect the output streams to something else (a process, a file etc.), Node.js will not wait for the write to complete and will continue executing other JavaScript code while the data is sent to that output stream.
In practice, we get to execute more JavaScript in the same time period.
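If you want to see which case you are in, a quick check like this works (the messages just restate the table quoted above):

// Try running this as `node check.js`, `node check.js | cat`, and `node check.js > out.txt`.
if (process.stdout.isTTY) {
  console.log('stdout is a TTY: writes are synchronous on POSIX');
} else {
  console.log('stdout is a pipe or a file: asynchronous for pipes, synchronous for files (POSIX)');
}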
With this information you should be able to answer all your questions yourself now:
You do not need to redirect the stdout/stderr of your Node.js process if you do not write anything to the console, or you can redirect only one of the streams if you do not write anything to the other one. You may redirect them anyway, but if you do not use them you will not gain any performance benefit.
If you configure your logger to write the log data to a stream then it will not block your event loop too much (unless some heavy processing is involved).
If you care this much about your app's performance, do not use Winston or Bunyan for logging - they are extremely slow. Use pino instead - see the benchmarks in their readme.
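For example, a minimal pino setup roughly following its readme (the destination path is a placeholder):

// pino writes newline-delimited JSON to the given destination stream.
const pino = require('pino');

const logger = pino(pino.destination('./app.log'));

logger.info({ q: 42 }, 'current value');
logger.error('something went wrong');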
To answer (1), we can dive into the Express documentation: there you will see a link to the Node.js documentation for Console, which links to the Node documentation on process I/O. That page describes how process.stdout and process.stderr behave:
process.stdout and process.stderr differ from other Node.js streams in important ways:
They are used internally by console.log() and console.error(), respectively.
They cannot be closed (end() will throw).
They will never emit the 'finish' event.
Writes may be synchronous depending on what the stream is connected to and whether the system is Windows or POSIX:
Files: synchronous on Windows and POSIX
TTYs (Terminals): asynchronous on Windows, synchronous on POSIX
Pipes (and sockets): synchronous on Windows, asynchronous on POSIX
With that we can try to understand what will happen with node app 2>&1 | tee logFile:
Stdout and stderr are piped to the tee process.
tee writes to both the terminal and the file logFile.
The important part here is that stdout and stderr are piped to a process, which means the writes should be asynchronous.
Regarding (2), it depends on how you configure Bunyan or Winston:
Winston has the concept of Transports, which essentially lets you configure where the log will go. If you want asynchronous logs, you should use any transport other than the Console transport. Using the File transport should be fine, as it creates a file stream object for this, which is asynchronous and won't block the Node process.
Bunyan has a similar configuration option: Streams. According to its docs, it can accept any stream interface. As long as you avoid the process.stdout and process.stderr streams here, you should be fine.
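As a rough sketch of both configurations (file paths and logger names are placeholders; check each library's docs for the exact options):

// Winston: use a File transport instead of the Console transport.
const winston = require('winston');
const winstonLogger = winston.createLogger({
  transports: [new winston.transports.File({ filename: 'app.log' })]
});

// Bunyan: use a file stream instead of process.stdout.
const bunyan = require('bunyan');
const bunyanLogger = bunyan.createLogger({
  name: 'app',
  streams: [{ path: 'app.log' }]
});

winstonLogger.info('hello from winston');
bunyanLogger.info('hello from bunyan');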
I've often heard of Streams2 and old-streams, but what is Streams3? It gets mentioned in this talk by Thorsten Lorenz.
Where can I read about it, and what is the difference between Streams2 and Streams3?
Doing a search on Google, I also see it mentioned in the changelog of Node 0.11.5:
stream: Simplify flowing, passive data listening (streams3) (isaacs)
I'm going to give this a shot, but I've probably got it wrong. Having never written Streams1 (old-streams) or Streams2 code, I'm probably not the right guy to self-answer this one, but here it goes. It seems as if the Streams1 API still persists to some degree. In Streams2, there are two modes of streams: flowing (legacy) and non-flowing. In short, the shim that supported flowing mode is going away. This was the message that led to the patch now called Streams3:
Same API as streams2, but remove the confusing modality of flowing/old
mode switch.
Every time read() is called, and returns some data, a data event fires.
resume() will make it call read() repeatedly. Otherwise, no change.
pause() will make it stop calling read() repeatedly.
pipe(dest) and on('data', fn) will automatically call resume().
No switches into old-mode. There's only flowing, and paused. Streams start out paused.
Unfortunately, to understand that description, which defines Streams3 pretty well, you first need to understand Streams1 and the legacy streams.
Backstory
First, let's take a look at what the Node v0.10.25 docs say about the two modes,
Readable streams have two "modes": a flowing mode and a non-flowing mode. When in flowing mode, data is read from the underlying system and provided to your program as fast as possible. In non-flowing mode, you must explicitly call stream.read() to get chunks of data out. — Node v0.10.25 Docs
Isaac Z. Schlueter said in November slides I dug up:
streams2
"suck streams"
Instead of 'data' events spewing, call read() to pull data from source
Solves all problems (that we know of)
So it seems as if in Streams1, you'd create an object and call .on('data', cb) on that object. This would set the event up to be triggered, and then you were at the mercy of the stream. In Streams2, streams internally have buffers and you request data from those streams explicitly (using .read()). Isaac goes on to specify how backwards compatibility works in Streams2 to keep Streams1 (old-stream) modules functioning:
old-mode streams1 shim
New streams can switch into old-mode, where they spew 'data'
If you add a 'data' event handler, or call pause() or resume(), then switch
Making minimal changes to existing tests to keep us honest
So in Streams2, a call to .pause() or .resume() triggers the shim. And it should, right? In Streams2 you have control over when to .read(), and you're not catching stuff being thrown at you. This triggered a legacy mode that acted independently of Streams2.
Let's take an example from Isaac's slide:
createServer(function(q, s) {
  // ADVISORY only!
  q.pause()
  session(q, function(ses) {
    q.on('data', handler)
    q.resume()
  })
})
In Streams1, q starts reading and emitting right away (likely losing data) until the call to q.pause() advises q to stop pulling in data, though it does not stop it from emitting events for what it has already read.
In Streams2, q starts off paused, but the call to .pause() signals it to switch into the old-mode emulation.
In Streams3, q starts off paused, having never read from the file handle, which makes the q.pause() a no-op; the call to q.on('data', cb) calls q.resume() until there is no more data in the buffer, and then calls q.resume() again to do the same thing.
Seems like Streams3 was introduced in io.js, then in Node 0.11+
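For what it's worth, here is a small sketch of the two consumption styles with the Streams3-era API (the file path is a placeholder):

const fs = require('fs');

// Paused mode: pull chunks explicitly with read().
const paused = fs.createReadStream('some-file.txt');
paused.on('readable', () => {
  let chunk;
  while ((chunk = paused.read()) !== null) {
    console.log('pulled %d bytes', chunk.length);
  }
});

// Flowing mode: attaching a 'data' listener resumes the stream and data is pushed at you.
const flowing = fs.createReadStream('some-file.txt');
flowing.on('data', (chunk) => {
  console.log('pushed %d bytes', chunk.length);
});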
Streams 1 supported data being pushed to a stream. There was no consumer control; data was thrown at the consumer whether it was ready or not.
Streams 2 allows data to be pushed to a stream as per Streams 1, or a consumer to pull data from a stream as needed. The consumer can control the flow of data in pull mode (using stream.read() when notified of available data). The stream cannot support both push and pull at the same time.
Streams 3 allows pull and push data on the same stream.
Great overview here:
https://strongloop.com/strongblog/whats-new-io-js-beta-streams3/
A cached version (accessed 8/2020) is here: https://hackerfall.com/story/whats-new-in-iojs-10-beta-streams-3
I suggest you read the documentation, more specifically the section "API for Stream Consumers"; it's actually very understandable. Besides, I think the other answer is wrong: http://nodejs.org/api/stream.html#stream_readable_read_size
I'm writing a command line tool for installing Windows services using Node.js. After running a bunch of async operations, my tool should print a success message and then quit. Sometimes, however, it prints its success message and doesn't quit.
Is there a way to view what is queued on Node's internal event loop, so I can see what is preventing my tool from quitting?
The most typical culprit for me in CLI apps is event listeners that are keeping the process alive. I obviously can't say if that's relevant to you without seeing your code, though.
To answer your more general question, I don't believe there are any direct ways to view all outstanding tasks in the event loop (at least not from JS-land). You can, however, get pretty close with process._getActiveHandles() and process._getActiveRequests().
I really recommend you look up the documentation for them, though. Because you won't find any. They're undocumented. And they start with underscores. Use at your own peril. :)
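That said, a quick sketch of how you might poke at them (the output format is not guaranteed and varies between Node versions):

// A timer is one example of a handle that keeps the process alive.
setInterval(() => {}, 1000);

console.log('handles:', process._getActiveHandles());
console.log('requests:', process._getActiveRequests());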
Try using some tools to clarify the workflow - for example, https://github.com/caolan/async#waterfall or https://github.com/caolan/async#eachseriesarr-iterator-callback - so you don't lose track of which callbacks have been called and can catch any errors thrown while executing commands.
I think you also need to provide some code samples that lead to these errors.
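For example, a rough async.waterfall sketch (the step contents are placeholders):

// Each step passes its result to the next; any error short-circuits to the final callback.
const async = require('async');

async.waterfall([
  (cb) => cb(null, 'first result'),
  (first, cb) => cb(null, first + ' -> second result')
], (err, result) => {
  if (err) return console.error('failed:', err);
  console.log('done:', result);
});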
In my Node site I call a RESTful API service I have built, using a standard HTTP GET. After a few hours of this communication working successfully, I find that the request stops being sent; it just waits and eventually times out.
The API that is being called is still receiving requests from elsewhere perfectly well but when a request is sent from the site it does not reach the API.
I have tried with stream.pipe, util.pump and just writing the file to the file system.
I am using Node 0.6.15. My site and the service that is being called are on the same server so calls to localhost are being made. Memory usage is about 25% over all with cpu averaging about 10% usage.
After the problem started I switched to the request module, but I get the same behaviour. The number of calls it makes before failing seems to vary between 5 and 100. In the end I have to restart the site, but not the API, to make it work again.
Here is roughly what the code in the site looks like:
var Request = require('request');

downloadPDF: function(req, res) {
  Project.findById(req.params.Project_id, function(err, project) {
    project.findDoc(req.params.doc_id, function(err, doc) {
      var pdfileName = doc.name + ".pdf";
      res.contentType(pdfileName);
      res.header('Content-Disposition', "filename=" + pdfileName);
      Request("http://localhost:3001/" + project._id).pipe(res);
    });
  });
}
I am lost as to what could be happening.
Did you try increasing agent.maxSockets, or disabling the http.Agent functionality? By default, recent Node versions use socket pooling for HTTP client connections; this may be the source of the problem.
http://nodejs.org/api/http.html#http_class_http_agent
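For example (the values are illustrative; localhost:3001 is taken from the code in the question):

var http = require('http');

// Raise the per-host socket pool of the default agent...
http.globalAgent.maxSockets = 100;

// ...or bypass pooling entirely for a single request.
http.get({ host: 'localhost', port: 3001, path: '/', agent: false }, function (res) {
  res.resume(); // drain the response so the socket is released
});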
I'm not sure how busy your Node server is, but it could be that all of your sockets are in TIME_WAIT status.
If you run this command, you should see how many sockets are in this state:
netstat -an | awk '/tcp/ {print $6}' | sort | uniq -c
It's normal to have some, of course. You just don't want to max out your system's available sockets and have them all be in TIME_WAIT.
If this is the case, you would actually want to reduce the agent.maxSockets setting (contrary to #user1372624's suggestion), as otherwise each request would simply receive a new socket even though it could simply reuse a recent one. It will simply take longer to reach a non-responsive state.
I found this Gist (a patch to http.Agent) that might help you.
This Server Fault answer might also help: https://serverfault.com/a/212127
Finally, it's also possible that updating Node may help, as they may have addressed the keep-alive behavior since your version (you might check the change log).
You're using a callback to return a value which doesn't make a lot of sense because your Project.findById() returns immediately, without waiting for the provided callback to complete.
Don't feel bad though, the programming model that nodejs uses is somewhat difficult at first to wrap your head around.
In event-driven programming (EDP), we provide callbacks to accomplish results, ignoring their return values since we never know when the callback might actually be called.
Here's a quick example.
Suppose we want to write the result of an HTTP request into a file.
In a procedural (non-EDP) programming environment, we rely on functions that only return values when they have them to return.
So we might write something like (pseudo-code):
url = 'http://www.example.com'
filepath = './example.txt'
content = getContentFromURL(url)
writeToFile(filepath,content)
print "Done!"
which assumes that our program will wait until getContentFromURL() has contacted the remote server, made its request, waited for a result and returned that result to the program.
The writeToFile() function then asks the operating system to open a local file at some filepath for write, waiting until it is told the open file operation has completed (typically waiting for the disk driver to report that it can carry out such an operation.)
writeToFile() then requests that the operating system write the content to the newly opened file, waiting until the driver the operating system uses to write files reports that it has accomplished this goal, and then returning the result to the program so it can tell us that it has completed.
The problem that nodejs was created to solve is to better use all the time wasted by all the waiting that occurs above.
It does this by using functions (callbacks) which are called when operations like retrieving a result from a remote web request or writing a file to the filesystem complete.
To accomplish the same task above in an event-driven programming environment, we need to write the same program as:
getContentFromURL(url, onGetContentFromURLComplete)

function onGetContentFromURLComplete(content, err) {
  writeToFile(filepath, content, onWriteToFileComplete);
}

function onWriteToFileComplete(err) {
  print "Done!";
}
where
calling getContentFromURL() only calls the onGetContentFromURLComplete callback once it has the result of the web request and
calling writeToFile() only calls its callback to display a success message when it completes writing the content successfully.
The real magic of nodejs is that it can do all sorts of other things during the surprisingly large amount of time procedural functions have to wait for most time-intensive operations (like those concerned with input and output) to complete.
(The examples above ignore all errors which is usually considered a bad thing.)
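If it helps, here is a rough Node translation of that pseudo-code, with only minimal error handling (the URL and file path are placeholders):

var http = require('http');
var fs = require('fs');

var url = 'http://www.example.com';
var filepath = './example.txt';

// getContentFromURL: collect the response body, then hand it to the next step.
http.get(url, function (res) {
  var content = '';
  res.setEncoding('utf8');
  res.on('data', function (chunk) { content += chunk; });
  res.on('end', function () {
    // writeToFile: the callback runs once the OS reports the write has completed.
    fs.writeFile(filepath, content, function (err) {
      if (err) return console.error(err);
      console.log('Done!');
    });
  });
}).on('error', function (err) { console.error(err); });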
I have also experienced intermittent errors using the built-in functions. As a workaround I use the native wget. I do something like the following:
var exec = require('child_process').exec;

function fetchURL(url, callback) {
  var command = 'wget -q -O - ' + url;
  var child = exec(command, function (error, stdout, stderr) {
    callback(error, stdout, stderr);
  });
}
With a few small adaptations you could make it work for your needs. So far it is rock solid for me.
Did you try logging your parameters while calling this function? The error can depend on req.params.Project_id. You should also provide error handling in your callback functions.
If you can nail down the failing requests to a certain parameter set (make them reproducible), you could debug your application easily with node-inspector.
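As a sketch of what that might look like in your handler (the parameter logging and the bare res.end() on error are illustrative; Project, findDoc and Request come from the code in the question):

downloadPDF: function(req, res) {
  console.log('downloadPDF', req.params.Project_id, req.params.doc_id);

  Project.findById(req.params.Project_id, function(err, project) {
    if (err || !project) { console.error('project lookup failed', err); return res.end(); }

    project.findDoc(req.params.doc_id, function(err, doc) {
      if (err || !doc) { console.error('doc lookup failed', err); return res.end(); }

      var pdfileName = doc.name + ".pdf";
      res.contentType(pdfileName);
      res.header('Content-Disposition', "filename=" + pdfileName);

      Request("http://localhost:3001/" + project._id)
        .on('error', function(err) { console.error('request failed', err); })
        .pipe(res);
    });
  });
}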