Where does the buffer come into the picture when using Node.js's exec function instead of the spawn function? - node.js

From reading the child_process module documentation of Node.js, I understand the difference between exec and spawn. The most crucial difference is also highlighted in a similar StackOverflow question about spawn vs. exec:
The main difference is that spawn is more suitable for long-running processes with huge output. That's because spawn streams input/output with a child process. On the other hand, exec buffers output in a small (by default 200K) buffer.
However, I noticed, thanks to TS IntelliSense, that both exec and spawn return a similar object of type ChildProcess. So I could technically write this for the exec function, using stdout as a stream, and it works:
const { exec } = require('child_process');

function cmdAsync(cmd, options) {
  return new Promise((resolve) => {
    const proc = exec(cmd, options);
    proc.stdout?.pipe(process.stdout);
    proc.on('exit', resolve);
  });
}
cmdAsync('node server/main.mjs');
And without any buffering delay, I could see the logs generated by the server/main.mjs file being piped to the parent process's stdout stream.
So my question is: where exactly does the buffering happen, and how does the streaming behavior of exec differ from that of spawn? Also, can I rely on this feature even if it is undocumented?
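For what it's worth, here is where that buffering actually shows up, as a minimal sketch (the tiny maxBuffer is only there to force the behavior; server/main.mjs stands in for any chatty child):

const { exec } = require('child_process');

// stdout is a live stream, exactly as with spawn...
const proc = exec('node server/main.mjs', { maxBuffer: 1024 }, (err, stdout, stderr) => {
  // ...but exec also accumulates stdout/stderr here, up to maxBuffer bytes.
  // If the child exceeds maxBuffer, exec kills it and reports a
  // "maxBuffer length exceeded" error, even if you consumed the stream yourself.
  if (err) console.error(err.message);
});
proc.stdout?.pipe(process.stdout);

So the buffering lives in exec's internal data listeners that feed the callback; the ChildProcess streams themselves behave the same in both APIs.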

Related

How do I spawn two processes from Node.js and pipe them together?

I want to be able to spawn two child processes from Node.js but have the stdout from one be piped to the stdin of the other. The equivalent of:
curl "https://someurl.com" | jq .
My Node's stdout will go to either a terminal or to a file, depending on whether the user pipes the output or not.
You can spawn a child process with Node.js's child_process built-in module. We need two processes, so we'll call it twice:
const cp = require('child_process')
const curl = cp.spawn('curl', ['https://someurl.com'], { stdio: ['inherit', 'pipe', 'inherit'] })
const jq = cp.spawn('jq', ['.'], { stdio: ['pipe', 'inherit', 'inherit'] })
The first parameter is the executable to run, the second is the array of arguments to pass it, and the third is an options object. We need to tell it where the process's stdin, stdout and stderr are to be routed: 'inherit' means "use the host Node.js application's stdio", and 'pipe' means "we'll handle it programmatically".
So in this case curl's output and jq's input are left to be dealt with programmatically which we do with an additional line of code:
curl.stdout.pipe(jq.stdin)
which means "plumb curl's stdout into jq's stdin".
It's as simple as that.
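One small addition to the example above: the parent process won't automatically reflect jq's exit status. If you want that, a one-liner (using the jq variable from the example) is:

jq.on('close', (code) => { process.exitCode = code ?? 1; });

Setting process.exitCode rather than calling process.exit() lets any remaining output flush before the parent exits.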

Nodejs: write to stdin of bash process crashes with EPIPE

My node process gets some PDF file via an HTTP request, then uses the request's onData event to pass the incoming data on to a properly configured lpr, spawned via child_process.exec. I write to the child's stdin using stdin.write(...), followed by stdin.end() when done. This allows me to print those files immediately.
Now I have a situation where I don't want the data to be piped to lpr, but to some bash script. The script uses cat to process its stdin.
myscript.sh < somefile.pdf works as expected, as does cat somefile.pdf | myscript.sh.
However, when I spawn /path/to/script.sh from node (by simply replacing lpr with the script path in the source), the process exits with
events.js:183
throw er; // Unhandled 'error' event
^
Error: write EPIPE
at WriteWrap.afterWrite [as oncomplete] (net.js:868:14)
Subsequently, the whole node process crashes, the error sneaking around all try...catch blocks. Logging at the beginning of the bash script shows that it does not even get started.
When I target anything that's not a shell script but some compiled executable, like cat, echo,... everything works just fine.
Adding the epipebomb module did not change anything.
I also tried piping to process.exec("bash", ["-c cat | myscript.sh"]), with the same errors.
An example bash script, just to test for execution:
#!/usr/bin/env bash
date > logfile.txt
cat > /dev/null
EDIT:
I think I maybe need to signal to keep the stdin stream open somehow.
The process-spawning part of the script, leaving out promisification and output processing:
const process = require("child_process") // note: shadows the global process object

// inputObservable being an rxjs Observable
execstuff(inputObservable) {
  const task = process.spawn("/path/to/script.sh");
  inputObservable.subscribe(
    chunk => task.stdin.write(chunk),
    error => console.error(error),
    finished => task.stdin.end()
  );
}
There is an example in the child_process.spawn documentation showing how to write ps ax | grep ssh as a Node.js script; maybe it will be helpful for you:
const { spawn } = require('child_process');
const ps = spawn('ps', ['ax']);
const grep = spawn('grep', ['ssh']);

ps.stdout.on('data', (data) => {
  grep.stdin.write(data);
});

ps.stderr.on('data', (data) => {
  console.log(`ps stderr: ${data}`);
});

// close grep's stdin when ps is done, so grep can flush and exit
ps.on('close', () => grep.stdin.end());

grep.stdout.on('data', (data) => {
  console.log(data.toString());
});
At first glance you are doing the same thing, so the problem may be in the chunk data: maybe one of the chunks is null, which closes the stream before you close it yourself with task.stdin.end().
The other thing you can try is to run the Node.js script with NODE_DEBUG=stream node script.js. This will log how the stream behaves internally, which may also be helpful for you.
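One more thing worth trying, since EPIPE is delivered as an 'error' event on the writable stream (which is why it sneaks past try...catch): attach a handler so an early-exiting child can't crash the parent. A minimal sketch, reusing the task variable from the question:

// an 'error' handler on the child's stdin turns the crash into a log message
task.stdin.on('error', (err) => {
  if (err.code === 'EPIPE') {
    console.error('child closed its stdin before all data was written');
  } else {
    throw err;
  }
});

This won't fix whatever stops the script from starting, but it keeps the parent alive so you can inspect the real failure.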

What are the ways to flush a Linux command's stdout as a Node.js child process?

Ending the stdin stream (child.stdin.end()) will work, as will unbuffering the command with stdbuf.
But I imagine there's a proper way to stream results from external commands.
How do we tell a command we're ready to consume while we are still providing data?
Example:
const { spawn } = require('child_process');
const child = spawn('uniq');
child.stdout.pipe(process.stdout);
child.stdin.write('a\nb\nb\nc\n', 'utf8');
// No output, child is still running.
(uniq is just an example here. It's the same with most Linux commands.)
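For completeness, here is a minimal sketch of the stdbuf approach mentioned above (it assumes GNU coreutils' stdbuf is available; -oL forces the child's stdout to be line-buffered instead of block-buffered):

const { spawn } = require('child_process');

// wrap the command in stdbuf so completed lines are flushed immediately
const child = spawn('stdbuf', ['-oL', 'uniq']);
child.stdout.pipe(process.stdout);
child.stdin.write('a\nb\nb\nc\n', 'utf8');
// 'a' and 'b' now appear without ending stdin; the final 'c' still
// arrives only after child.stdin.end(), since uniq must see EOF first.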

shelljs performance is slow

I have been using shelljs.
On my super fast system I execute this:
var shell = require('shelljs')
const exec = require('child_process').exec

console.time('shell mktemp -d')
shell.exec('mktemp -d', { silent: true })
console.timeEnd('shell mktemp -d')

console.time('child exec mktemp -d')
exec('mktemp', ['-d'], function (error, stdout, stderr) {
  if (error) {
    console.error('stderr', stderr)
    throw error
  }
  console.log('exec stdout', stdout)
  console.timeEnd('child exec mktemp -d')
})
It's giving the following execution times:
shell mktemp -d: 208.126ms
exec stdout /tmp/tmp.w22tyS5Uyu
child exec mktemp -d: 48.812ms
Why is shelljs 4 times slower? Any thoughts?
Your code example compares async child_process.exec() with sync shell.exec(), which isn't entirely a fair comparison. I think you'll find shell.exec(..., { async: true }) performs a bit better: that's because sync shell.exec() does extra work to provide real-time stdio while still capturing stdout, stderr and the return code as part of its return value; async shell.exec() gets the same features mostly for free.
Even with { silent: true }, the extra work is still necessary: shell.exec() is built on top of child_process.execSync(), which only returns stdout, so shelljs has to do that extra work in order to also return the return code and stderr.
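For a fairer comparison, a minimal sketch of the async form (options.async and the (code, stdout, stderr) callback are documented shelljs API; timings will of course vary by machine):

const shell = require('shelljs')

console.time('shell mktemp -d (async)')
shell.exec('mktemp -d', { silent: true, async: true }, function (code, stdout, stderr) {
  console.timeEnd('shell mktemp -d (async)')
})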
Have a look at how shelljs is implemented:
It relies fully on the Node.js fs library. That library is cross-platform and written in C++, but it is not as performant as plain C. More generally, you can't get C-level performance in JS...
Another thing: abstraction layers.
With exec(command), the command is handed to the OS (Linux here, I think) more or less directly; the machine creates a process and executes the command in it.
When using shelljs, there are many mechanisms to ensure cross-platform behavior, to keep your command abstracted as a function, and to keep the result as a variable. See the code of exec in shelljs:
https://github.com/shelljs/shelljs/blob/master/src/exec.js
It is not really doing the same thing as your line of code.
Hope that helps!

NodeJs spawn giving ENOENT error (Raspbian)

I'm having an error when spawning a Node.js script:
exec('node ./modules/buttons', function (error, stdout, stderr) {
  if (error) console.log(error);
  console.log(stdout);
  if (stderr) console.log(stderr);
});
exec works perfectly fine. However, spawn
var buttons = spawn('node ./modules/buttons.js', []);

buttons.stdout.on('data', function (data) {
  console.log(data);
});
Gives me the following error:
spawn node ./modules/buttons.js ENOENT
Defining the absolute path to the script results in the same error. I would appreciate it if someone could help me resolve this; I have absolutely no clue what could be the cause, and Google isn't helping me either.
exec accepts the command to be executed along with all the command line parameters, but spawn, OTOH, accepts the program to invoke and the command line arguments as an array.
In your case, Node.js is trying to execute a program called node ./modules/buttons.js, not node with ./modules/buttons.js as command line argument. That is why it is failing.
Quoting the example from the spawn docs,
const spawn = require('child_process').spawn;
const ls = spawn('ls', ['-lh', '/usr']);
The difference between exec and spawn is that exec will by default launch the command in a shell, while spawn simply invokes the program.
Note: BTW, as you are simply invoking a JavaScript file, you are better off using execFile.
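Putting both fixes together, a minimal sketch (keeping the paths from the question):

const { spawn, execFile } = require('child_process');

// spawn: the program and its arguments are passed separately
const buttons = spawn('node', ['./modules/buttons.js']);
buttons.stdout.on('data', function (data) {
  console.log(data.toString());
});

// or execFile, since this is just a file invoked with the node binary
execFile('node', ['./modules/buttons.js'], function (error, stdout, stderr) {
  if (error) console.log(error);
  console.log(stdout);
});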
