pipe of stdout in Node.js

The statement below gets one sentence from stdin and prints it to stdout:
process.stdin.pipe(process.stdout);
But what does the statement below mean? It acts like the first one (it receives data from the user and prints it on the screen). Why does this happen?
process.stdout.pipe(process.stdout);
What does it mean to pipe data from stdout or stderr?

Conceptually you can pipe stdout to stdin, but it's not useful: there's no way to put data into stdout, and no way to get it back out of stdin. And on some systems, trying to read from stdout and/or write to stdin may throw an error.
stdin and stdout have been part of the UNIX / FreeBSD / Linux operating systems for half a century now. Node.js's process object simply exposes them. It's worth a bit of your time to learn how these fundamental OS building blocks work.

Related

How to write to stdout without blocking under Linux?

I've written a log-to-stdout program that produces logs, and another executable, read-from-stdin (for example filebeat), collects those logs from its stdin. My problem is that log-to-stdout's output rate may burst for a short period, exceeding what read-from-stdin can accept, and that blocks the log-to-stdout process. I'd like to know: is there a Linux API to tell whether the stdout file descriptor can be written to (up to N bytes) without blocking?
I've found some comments in the Node.js process.stdout documentation.
In the case they refer to pipes:
They are blocking in Linux/Unix.
They are non-blocking like other streams in Windows.
Does that mean that under Linux it's impossible to do a non-blocking write on stdout? Some documents describe a non-blocking file mode (https://www.linuxtoday.com/blog/blocking-and-non-blocking-i-0/); does it apply to stdout too? Because I'm using a third-party logging library (which expects stdout to work in blocking mode), can I check whether stdout is writable while it is in non-blocking mode (before calling the logging library), and then switch stdout back to blocking mode, so that from the library's perspective the stdout fd still behaves as before? (If I can tell that stdout would block, I'll throw the output away, since not blocking matters more than complete logs in my usage.)
(Or, if there is an auto-drop-pipe command that automatically drops lines when the pipeline would otherwise block, I could call
log-to-stdout | auto-drop-pipe --max-lines=100 --drop-head-if-full | read-from-stdin)
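Sketching the check-then-restore idea in Python (a sketch under stated assumptions: Linux, stdout on fd 1 connected to a pipe). select(2) reporting the fd as writable guarantees only that a write of up to PIPE_BUF bytes won't block, not N arbitrary bytes, and O_NONBLOCK lives on the open file description, so the toggle is briefly visible to anything sharing that description:

import fcntl
import os
import select
import sys

STDOUT_FD = sys.stdout.fileno()  # normally fd 1

def stdout_writable(timeout=0.0):
    """True if stdout can take at least some bytes (up to PIPE_BUF) right now."""
    _, writable, _ = select.select([], [STDOUT_FD], [], timeout)
    return bool(writable)

def write_or_drop(data: bytes) -> int:
    """Write with O_NONBLOCK set, restoring the old flags; returns bytes written."""
    flags = fcntl.fcntl(STDOUT_FD, fcntl.F_GETFL)
    fcntl.fcntl(STDOUT_FD, fcntl.F_SETFL, flags | os.O_NONBLOCK)
    try:
        return os.write(STDOUT_FD, data)
    except BlockingIOError:
        return 0  # pipe buffer full: drop the line rather than block
    finally:
        fcntl.fcntl(STDOUT_FD, fcntl.F_SETFL, flags)  # back to blocking mode

if stdout_writable():
    write_or_drop(b"a log line\n")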

std::process, with stdin and stdout from buffers

I have a command cmd and three Vec<u8>: buff1, buff2, and buff3.
I want to execute cmd, using buff1 as stdin, and capturing stdout into buff2 and stderr into buff3.
And I'd like to do all this without explicitly writing any temporary files.
std::process seems to allow all of those things, just not all at the same time.
If I use Command::new(cmd).output() it will return the buffers for stdout and stderr, but there's no way to give it stdin.
If I use Command::new(cmd).stdin(Stdio::piped()).spawn()
then I can child.stdin.as_mut().unwrap().write_all(buff1)
but I can't capture stdout and stderr.
As far as I can tell, there's no way to call Command::new(cmd).stdout(XXX) to explicitly tell it to capture stdout in a buffer, the way it does by default with .output().
It seems like something like this should be possible:
Command::new(cmd)
.stdin(buff1)
.stdout(buff2)
.stderr(buff3)
.output()
since Rust makes a Vec<u8> look like a File in many APIs; but Vec doesn't implement Into<Stdio>.
Am I missing something? Is there a way to do this, or do I need to read and write with actual files?
If you're ok with using an external library, the subprocess crate supports this use case:
let (buff2, buff3) = subprocess::Exec::cmd(cmd)
.stdin(buff1)
.communicate()?
.read()?;
Doing this with std::process::Command is trickier than it seems because the OS doesn't make it easy to connect a region of memory to a subprocess's stdin. It's easy to connect a file or anything file-like, but to feed a chunk of memory to a subprocess, you basically have to write() in a loop. And while a byte slice (&[u8]) does implement std::io::Read, you can't use it to construct an actual File (or anything else that contains a file descriptor/handle).
Feeding data into a subprocess while at the same time reading its output is sometimes referred to as communicating in reference to the Python method introduced in 2004 with the then-new subprocess module of Python 2.4. You can implement it yourself using std::process, but you need to be careful to avoid deadlock in case the command generates output while you are trying to feed it input. (E.g. a naive loop that feeds a chunk of data to the subprocess and then reads its stdout and stderr will be prone to such deadlocks.) The documentation describes a possible approach to implement it safely using just the standard library.
If you want to read and write with buffers, you need to use the piped forms. The reason is that, at least on Unix, input and output to a process are done through file descriptors. Since a buffer cannot intrinsically be turned into a file descriptor, you have to use a pipe and both read and write incrementally. Rust's buffer abstractions can't paper over the fact that the operating system has no such abstraction, and Rust doesn't try to hide that from you.
However, since you'll be using pipes for both reading and writing, you'll need something like select so you don't deadlock. Otherwise, you could end up trying to write while your subprocess isn't accepting new input because it's waiting for its own standard output to be read. Using select or poll (or similar) lets you determine when each of those file descriptors is ready to be read from or written to. In Rust, these functions are in the libc crate; I don't believe Rust provides them natively. Windows has similar functionality, but I have no clue what it is.
It should be noted that unless you are certain that the subprocess's output can fit into memory, it may be better to process it in a more incremental way. Since you're going to be using select, that shouldn't be too difficult.
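For comparison, the "communicate" pattern these answers reference is built into Python's subprocess module, which handles the read/write interleaving (and thus the deadlock risk) internally. A sketch with a stand-in command, just to show the buff1 -> buff2/buff3 shape the Rust code has to reproduce:

import subprocess

# "sort" and the input bytes are stand-ins for cmd and buff1.
result = subprocess.run(
    ["sort"],
    input=b"b\na\nc\n",    # buff1: fed to the child's stdin
    capture_output=True,   # buff2/buff3: stdout and stderr collected in memory
    check=True,            # raise if the child exits non-zero
)
buff2 = result.stdout  # b"a\nb\nc\n"
buff3 = result.stderr  # b""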

Determine when (after which input) a Python subprocess crashes

I have a Python script that runs an arbitrary C++ program (student assignments, if it matters) via Popen. The structure is such that I write a series of inputs to stdin, and at the end I read all of stdout and parse it for the responses to each input.
Of course, given that these are student assignments, they may crash after certain inputs. What I need to know is after which specific input their program crashed.
So far I know that when a runtime exception is thrown in the C++ program, it's printed to stderr. So right now I can read stderr after the fact and see that it did in fact crash. But I haven't found a way to read stderr while the program is still running, so that I can infer the error is in response to the latest input. Every SO question or article I have run into seems to use subprocess.communicate(), but communicate blocks until the subprocess returns; that doesn't work for me because I need to keep sending inputs to the program if it hasn't crashed.
What I require is to know after which specific input their program crashed.
Call process.stdin.flush() after process.stdin.write(b'your input'). If the process is already dead, then either .write() or .flush() will raise an exception (the specific exception may depend on the system, e.g. BrokenPipeError on POSIX).
Unrelated: if you are redirecting all three standard streams (stdin=PIPE, stdout=PIPE, stderr=PIPE), make sure to consume the stdout and stderr pipes concurrently while you are writing the input; otherwise the child process may hang if it generates enough output to fill the OS pipe buffer. You could use threads or async I/O to do it -- code examples.
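A minimal sketch of that approach (the binary name and inputs are placeholders; note the crash sometimes only surfaces on the write after the one that killed the child, so checking proc.poll() between writes can help narrow it down):

import subprocess
import threading
from queue import Queue

# "./student_prog" and the inputs below are placeholders.
proc = subprocess.Popen(
    ["./student_prog"],
    stdin=subprocess.PIPE,
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE,
)

stderr_lines = Queue()

def drain(pipe, sink):
    # Reading in a background thread keeps the pipe from filling up
    # (and deadlocking the child) while we are still writing inputs.
    for line in pipe:
        sink.put(line)

threading.Thread(target=drain, args=(proc.stderr, stderr_lines), daemon=True).start()
threading.Thread(target=drain, args=(proc.stdout, Queue()), daemon=True).start()

for i, data in enumerate([b"1 2\n", b"3 4\n", b"5 6\n"]):
    try:
        proc.stdin.write(data)
        proc.stdin.flush()  # push the bytes into the pipe right now
    except (BrokenPipeError, OSError):
        # stderr_lines now holds whatever the child printed before dying
        print(f"child died after input #{i}")
        break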

Process connected to separate pty for stdout and stderr

I'm writing a terminal logging program - think the script command but a bit more featureful. One of the differences is that, whereas script captures stdout, stdin and stderr as one big character stream, I would like to keep them separate and record them as such.
In order to do this, I use the standard approach of running a child shell connected to a pty, but instead of using a single pty with stdin, stdout and stderr all connected to it, I use two ptys - with stdin and stderr connected to one pty, and stdout on the other. This way, the master process can tell what is coming from stdout and what from stderr.
This has, so far, worked fine. However, I'm starting to run into a few issues. For example, when trying to set the number of columns, I get the following:
$ stty cols 169
stty: stdout appears redirected, but stdin is the control descriptor
This seems to be a result of this piece of code, which checks whether stdout and stderr are both ttys, but complains if they are not the same one.
My question, therefore, is this: am I violating any fundamental assumptions about how POSIX processes behave by acting in this way? If not, any idea why I'm seeing errors such as this? If so, is there any way I can get around this and still manage to separate stdout and stderr nicely?
One idea I had was to run a process directly on the pty which then runs the target program, e.g.
(wrapper) -> pty -> (controller) -> script
The controller would be responsible for running the script and capturing its stdout and stderr separately, feeding them back to the wrapper, perhaps over some non-standard fd. Alternatively, it could serialise the data before shipping it back, e.g. prefixing output from stderr with stderr: and output from stdout with stdout:; the wrapper would then deserialise the stream and feed it upstream, or do whatever else you want with it.
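For concreteness, here is roughly the two-pty wiring the question describes, sketched with Python's os-level wrappers (sh is a stand-in for the logged shell, and the controlling-terminal handling is the minimal Linux version; this reproduces the setup, including the stty complaint, rather than fixing it):

import fcntl
import os
import pty
import termios

# Two ptys: stdin + stderr on the first, stdout on the second.
m_io, s_io = pty.openpty()    # master/slave pair for stdin + stderr
m_out, s_out = pty.openpty()  # master/slave pair for stdout

pid = os.fork()
if pid == 0:
    # Child: new session, then attach the first pty as controlling tty.
    os.setsid()
    fcntl.ioctl(s_io, termios.TIOCSCTTY, 0)
    os.dup2(s_io, 0)   # stdin  <- first pty
    os.dup2(s_out, 1)  # stdout -> second pty
    os.dup2(s_io, 2)   # stderr -> first pty
    for fd in (m_io, m_out, s_io, s_out):
        os.close(fd)
    os.execvp("sh", ["sh", "-i"])

# Parent: read m_out to log stdout, read m_io to log stderr (plus echo),
# and write keystrokes to m_io.
os.close(s_io)
os.close(s_out)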

How does pipelining work?

Can somebody explain what actually happens internally (i.e. which system calls are made) for the command ls | grep 'xxx'?
First, pipe(2,3p) is called in order to create the pipe with read and write ends. fork(2,3p) is then called twice, once for each command. Then dup2(2,3p) is used to replace the appropriate file descriptor in each forked child with each end of the pipe. Finally exec(3) is called in each child to actually run the commands.
The standard output of the first command is fed as standard input to the second command in the pipeline. There are a couple of system calls that you may be interested to understand what is happening in more detail, in particular, fork(2), execve(2), pipe(2), dup2(2), read(2) and write(2).
In effect, the shell arranges for STDOUT_FILENO in the first process to be the write end of the pipe and STDIN_FILENO in the second process to be the read end. When the first process in the pipeline performs a write(2) to its standard output, the data goes into the write end of the pipe; similarly, when the second process does a read(2) on its standard input, it ends up reading from the read end of the pipe.
There are of course more details to be considered; please check out a book such as Advanced Programming in the UNIX Environment by W. Richard Stevens.
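A sketch of that sequence using Python's thin os-level wrappers around these syscalls (POSIX only; error handling omitted):

import os

# What the shell does, in miniature, for: ls | grep xxx
r, w = os.pipe()              # pipe(2): r = read end, w = write end

if os.fork() == 0:            # fork(2) a child for "ls"
    os.dup2(w, 1)             # dup2(2): its stdout becomes the write end
    os.close(r); os.close(w)  # close the originals; fd 1 keeps the pipe open
    os.execvp("ls", ["ls"])   # execve(2), with PATH lookup

if os.fork() == 0:            # fork(2) a child for "grep"
    os.dup2(r, 0)             # dup2(2): its stdin becomes the read end
    os.close(r); os.close(w)  # grep must not hold the write end, or no EOF
    os.execvp("grep", ["grep", "xxx"])

os.close(r); os.close(w)      # parent closes both ends...
os.wait(); os.wait()          # ...and wait(2)s for both children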
