When I do this:
find . -name "pattern" | grep "another-pattern"
Are the two processes, find and grep, spawned together? My guess is yes. If so, then how does this work:
yes | command_that_prompts_for_confirmations
If yes is continuously sending 'y' to stdout and command_that_prompts_for_confirmations reads a 'y' whenever it reads from its stdin, how does yes know when to terminate? Because if I run yes alone, without piping its output to some other command, it never ends.
But if piping doesn't spawn all the processes simultaneously, then how does yes know how many 'y's to output? It's a catch-22 for me here. Can someone explain to me how piping works in *NIX?
From the wikipedia page: "By itself, the yes command outputs 'y' or whatever is specified as an argument, followed by a newline, until stopped by the user or otherwise killed; when piped into a command, it will continue until the pipe breaks (i.e., the program completes its execution)."
yes does not "know" when to terminate. However, at some point writing "y" to stdout will fail, because the other process has finished and the pipe is broken; that broken-pipe error is what terminates yes.
The sequence is:
the other program terminates
the operating system closes the read end of the pipe
yes tries to output another character
the write fails with a broken pipe (yes receives SIGPIPE)
yes terminates
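You can watch the whole sequence from the shell. A minimal sketch, using head as a reader that exits after three lines (the PIPESTATUS check is bash-specific, and the output shown is illustrative):
$ yes | head -n 3
y
y
y
$ echo "${PIPESTATUS[@]}"
141 0
The 141 is 128 + 13: yes was killed by signal 13, which is SIGPIPE, while head exited normally with status 0.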
Yes, (generally speaking) all of the processes in a pipeline are spawned together. With regard to yes and similar situations, a signal is delivered to the writing process to indicate that the reader is no longer accepting input. Specifically: SIGPIPE, details here and here. Lots more fun information on *nix pipelining is available on wikipedia.
You can see SIGPIPE in action if you interrupt a command that doesn't expect it: you get a Broken pipe error. I can't come up with an example that triggers it off the top of my head on my Ubuntu setup, though.
Other answers have covered termination. The other facet is that yes only outputs a limited number of y's ahead of its reader: there is a buffer in the pipe, and once that buffer is full, yes blocks in its write call. Thus yes doesn't consume unbounded CPU time.
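You can convince yourself of the blocking behaviour with a reader that never reads. A hedged sketch: on Linux the default pipe capacity is 64 KiB, so yes writes roughly that much and then sits blocked in write until sleep exits:
$ yes | sleep 5
When sleep exits, the next write by yes hits a broken pipe and yes terminates, exactly as described above.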
The stdout of the first process is connected to the stdin of the second process, and so on down the line. yes exits when the second process is done, because at that point its stdout no longer has a reader and writing to it fails.
I am trying to use the suckless ii IRC client. I can listen to a channel by running tail -f on the out file. However, is it also possible for me to provide input from the same console by starting an echo or cat command?
If I background the process, it actually displays the output in this console, but that doesn't seem to be the right way. Logically, I think I need to get the fd of the console (but how do I do that?), then force the tail output to that fd, and probably background it. And then use the present bash to start a cat > in.
Is it actually fine to do this, or am I creating a lot of process overhead for a simple task? In other words, piping a lot of stuff is nice, but doesn't it create a lot of overhead that ideally should live in a single process if you are going to repeat the task a lot?
However, is it also possible for me to provide input from the same console by starting an echo or cat command?
Simply: no! cat writes the current content of a file; cat has no idea that the content will grow later. echo writes out its arguments, such as variables and command results from the given command line; echo itself is not made for writing the content of files.
If I background the process, it actually displays the output in this console, but that doesn't seem to be the right way?
If you do not redirect the output, the output goes to the console. That is the way it is designed :-)
Logically, I think I need to get the fd of the console (but how do I do that?), then force the tail output to that fd, and probably background it.
As I understand it, that is the opposite direction. If you want to write to the stdin of a process, you can simply use a pipe for that. The (useless) examples below show that cat writes to the pipe and the next command reads from the pipe. You can extend this to any other pipe read/write scenario. See the link given below.
Example:
cat main.cpp | cat /dev/stdin
cat main.cpp | tail -f
The last one will not exit, because tail -f waits for the pipe to deliver more content, which never happens.
Is it actually fine to do this, or am I creating a lot of process overhead for a simple task? In other words, piping a lot of stuff is nice, but doesn't it create a lot of overhead that ideally should live in a single process if you are going to repeat the task a lot?
I have no idea how time-critical your job is, but I believe the overhead is quite low. Doing the same thing in a self-written program will not necessarily be faster. If everything is done in a single process and no access to the file system is required, it will be much faster. But if you also use system calls, e.g. for file system access, it will not be much faster, I believe. You always have to pay for the work you get.
For IO redirection please read:
http://www.tldp.org/LDP/abs/html/io-redirection.html
If your scenario is more complex, you can think of named pipes instead of IO redirection. For that you can have a look at:
http://www.linuxjournal.com/content/using-named-pipes-fifos-bash
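A minimal sketch of the named-pipe approach for this particular case, assuming ii has already created the in FIFO and the out file in the channel directory (it normally does):
$ tail -f out &                # watch the channel log in this console
$ echo "hello, channel" > in   # send input through the FIFO from the same shell
tail -f keeps printing new lines from out, while any number of echo (or cat > in) invocations can feed input through the FIFO from the very same console.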
I'm currently running a process with the & sign.
$ example &
However (please note I'm a newbie to Linux), I noticed that pretty much a second after such a command I get a note that my process received a stop signal. If I do
$ jobs
I'll get the list with my example process and a little note, "Stopped". Is it really stopped and not working at all in the background? How exactly does this work? I'm getting mixed information from the Internet.
In Linux and other Unix systems, a job that is running in the background but still has its stdin (or std::cin) associated with its controlling terminal (a.k.a. the window it was run in) will be sent a SIGTTIN signal when it attempts to read from that terminal. By default this causes the program to be completely stopped, pending the user bringing it to the foreground (fg %job or similar) to allow input to actually be given to the program. To avoid the program being paused in this way, you can either:
Make sure the program's stdin channel is no longer associated with the terminal, by redirecting it either to a file with appropriate contents for the program to read, or to /dev/null if it really doesn't need input - e.g. myprogram < /dev/null &.
Exit the terminal after starting the program, which will cause the association with the program's stdin to go away. But this will cause a SIGHUP to be delivered to the program (meaning the input/output channel experienced a "hangup"), which normally terminates a program; that can be avoided by using nohup - e.g. nohup myprogram &.
If you are at all interested in capturing the output of the program, this is probably the best option, as it prevents both of the above signals (as well as a couple of others) and saves the output for you to look at, so you can determine whether there were any issues with the program's execution:
nohup myprogram < /dev/null > ${HOME}/myprogram.log 2>&1 &
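To reproduce the stop described above, any program that reads its stdin will do; cat is the simplest (the job number and PID shown are illustrative):
$ cat &
[1] 12345
$ jobs
[1]+  Stopped                 cat
The backgrounded cat tried to read the terminal, received SIGTTIN, and was stopped; fg %1 brings it back to the foreground and lets it read.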
Yes, it really is stopped and no longer working in the background. To bring it back to life, type fg job_number.
From what I can gather.
Background jobs are blocked from reading the user's terminal. When one tries to do so, it is suspended until the user brings it to the foreground and provides some input. "Reading from the user's terminal" can mean either directly trying to read from the terminal or changing terminal settings.
Normally that is what you want, but sometimes programs read from the terminal and/or change terminal settings not because they need user input to continue but because they want to check if the user is trying to provide input.
http://curiousthing.org/sigttin-sigttou-deep-dive-linux has the gory technical details.
Just enter fg to bring the job back to the foreground; that also takes care of the "stopped jobs" warning you would otherwise get when you try to exit the shell.
Can somebody explain what actually happens internally (i.e., which system calls are made) for the command ls | grep 'xxx'?
First, pipe(2) is called in order to create the pipe, with its read and write ends. fork(2) is then called twice, once for each command. Then dup2(2) is used in each forked child to replace the appropriate file descriptor with one end of the pipe. Finally, exec(3) is called in each child to actually run the commands.
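One way to watch those calls happen is strace. A hedged sketch for Linux (the exact syscall names vary by libc and architecture; modern shells often reach fork and pipe through the clone and pipe2 system calls):
$ strace -f -e trace=pipe,pipe2,fork,clone,dup2,execve sh -c "ls | grep xxx"
The -f flag follows the forked children, so you should see the pipe being created, two child processes, a dup2 in each child wiring a pipe end onto fd 1 or fd 0, and finally two execve calls.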
The standard output of the first command is fed as standard input to the second command in the pipeline. There are a couple of system calls you may want to look at to understand in more detail what is happening, in particular fork(2), execve(2), pipe(2), dup2(2), read(2) and write(2).
In effect the shell arranges for STDOUT_FILENO in the first process to be the write end of the pipe and STDIN_FILENO in the second process to be the read end. So when the first process in the pipeline performs a write(2) on its standard output, the data goes into the pipe, and when the second process does a read(2) on its standard input, it ends up reading from the read end of the pipe.
There are of course more details to be considered; please check out a book such as Advanced Programming in the UNIX Environment by W. Richard Stevens.
I can't find a good way to find out when a process exits in Linux. Does anyone have a solution for that?
One that I can think of is to check the process list periodically, but that is not instant and is pretty expensive (you have to loop over all processes each time).
Is there an interface for doing that on Linux? Something like waitpid, except something that can be used from unrelated processes?
Thanks,
Boda Cydo
You cannot wait for an unrelated process, just children.
As a simpler polling method than checking the process list: if you have permission, you can use the kill(2) system call to "send" signal 0.
From the kill(2) man page:
If sig is 0, then no signal is sent, but error checking is still performed; this can be used to check for the existence of a process ID or process group ID.
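From a shell the same check might look like this (a sketch; $pid is assumed to hold the PID in question, and note that failure can also mean EPERM, i.e. the process exists but you may not signal it):
if kill -0 "$pid" 2>/dev/null; then
    echo "process $pid is still alive"
else
    echo "process $pid has exited (or is not ours to signal)"
fi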
Perhaps you can start the program together with another program, the second one doing whatever it is you want to do when the first program stops, like sending a notification etc.
Consider this very simple example:
sleep 10; echo "finished"
sleep 10 is the first process, echo "finished" the second one (though echo is usually a shell builtin, I hope you get the point).
Another option is to have the process open an IPC object such as a Unix domain socket; your watchdog process can detect when the process quits, because the socket will immediately be closed.
If you know the PID of the process in question, you can check if /proc/$PID exists. That's a relatively cheap stat() call.
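A minimal polling loop built on that check (still polling, but cheap; $pid is assumed to hold the PID in question):
while [ -d "/proc/$pid" ]; do
    sleep 1
done
echo "$pid has exited"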
Is it possible to do I/O with a running process?
I have multiple game servers running like this:
cd /path/to/game/server/binary
./binary arg1 arg2 ... argn &
Is it possible to write a message to a server if I know the process id?
Something like this would be handy:
echo "quit" > process1234
Where process1234 is the process (with pid 1234).
The game server is not a binary written by me; it is a Call of Duty binary, so I can't change anything in the code.
Yes, you can start the process with a pipe as its stdin and then write to the pipe. You can use a named or an anonymous pipe.
Normally a parent process is needed to do this: it creates an anonymous pipe and supplies that to the child process as its stdin - popen() does this, and many libraries also implement it (see Perl's IPC::Open2 for example).
Another way would be to run it under a pseudo tty, which is what "screen" does. Screen itself may also have a mechanism for doing this.
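A hedged sketch of the named-pipe variant (the FIFO path is arbitrary; the exec 3> keeps a writer attached so the server doesn't see end-of-file between messages):
$ mkfifo /tmp/server-in
$ ./binary arg1 arg2 < /tmp/server-in &
$ exec 3> /tmp/server-in
$ echo "quit" >&3
Each subsequent message is just another echo to fd 3.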
Only if the process is listening for some message somewhere. For instance, your game server can be waiting for input on a file, over a network connection, or from standard input.
If your process is not actively listening for something, the only thing you can really do is halt or kill it.
Now if your process is waiting on standard input, and you ran it like so:
$ myprocess &
Then (in linux) you should be able to try the following:
$ jobs
[1]+ Running myprocess &
$ fg 1
And at this point you are typing standard input into your process.
You can only do that if the process is explicitly designed for that.
But since your example is requesting that the process quit, I'd recommend trying signals. First try to send the TERM (i.e. terminate) signal, which is the default:
kill <pid>
If that doesn't work, you can try other signals such as QUIT:
kill -QUIT <pid>
If all else fails, you can use the KILL signal. This is guaranteed (*) to stop the process, but the process will have no chance to clean up:
kill -KILL <pid>
* - in the past, kill -KILL would not work if the process was hung in uninterruptible I/O, e.g. stuck on a flaky network file server. I don't know if that was ever fixed.
I'm pretty sure this would work, since the server has a console on stdin:
echo "quit" > /proc/<server pid>/fd/0
You mention in a comment below that your process does not appear to read from the console on fd 0. But it must on some fd. Run ls -l /proc/<server pid>/fd/ and look for one that's pointing at /dev/pts/ if the process is running in a gnome-terminal or xterm or something.
If you want to do a few simple operations on your server, use signals as mentioned elsewhere. Set up signal handlers in the server and have each signal perform a different action e.g.:
SIGINT: Reread config file
SIGHUP: quit
...
Highly hackish (don't do this if you have a saner alternative), but you can redirect a process's file descriptors on the fly if you have ptrace permissions.
$ echo quit > /tmp/quitfile
$ gdb binary 1234
(gdb) call dup2(open("/tmp/quitfile", 0), 0)
(gdb) continue
open("/tmp/quitfile", O_RDONLY) returns a file descriptor to /tmp/quitfile. dup2(..., STDIN_FILENO) replaces the existing standard input by the new file descriptor.
We inject this code into the application using gdb (but with numeric constants, as #define constants may not be available), and taadaah.
Simply run it under screen and don't background it. Then you can either connect to it with screen interactively and tell it to quit, or (with a bit of expect hackery) write a script that will connect to screen, send the quit message, and disconnect.
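A hedged sketch of that approach using GNU screen's stuff command (the session name cod is arbitrary):
$ screen -dmS cod ./binary arg1 arg2     # start the server in a detached session
$ screen -S cod -p 0 -X stuff $'quit\n'  # inject "quit" plus a newline, as if typed
The first line starts the server inside screen without backgrounding it in the shell sense; the second can be run at any later time, from any terminal, to feed input to the server.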