what happens to background processes stdout and stderr when I log out? - linux

I'm trying to understand what happens with stdout and stderr of background processes when exiting an SSH session. I understand about SIGHUP, child processes and all that, but I'm puzzled about the following:
If I run:
(while true; do date; sleep 0.5; done) | tee foo | cat >bar
and then kill the cat process then the tee process terminates because it can no longer write into the pipe. You can observe this using ps.
But if I run:
(while true; do date; sleep 0.5; done) | tee foo & disown
and the log out of my SSH session, I can observe that everything continues running just fine "forever". So somehow the stdout of the tee process must "keep going" even though my pty must be gone.
Can anyone explain what happens in the second example?
(Yes, I know I could explicitly redirect stdout/stderr/stdin of the background process.)

This is the crucial loop where tee sends output to stdout and opened files:
while (1)
{
bytes_read = read (0, buffer, sizeof buffer);
if (bytes_read < 0 && errno == EINTR)
continue;
if (bytes_read <= 0)
break;
/* Write to all NFILES + 1 descriptors.
Standard output is the first one. */
for (i = 0; i <= nfiles; i++)
if (descriptors[i]
&& fwrite (buffer, bytes_read, 1, descriptors[i]) != 1)
{
error (0, errno, "%s", files[i]);
descriptors[i] = NULL;
ok = false;
}
}
Pay closer attention on this part:
if (descriptors[i]
&& fwrite (buffer, bytes_read, 1, descriptors[i]) != 1)
{
error (0, errno, "%s", files[i]);
descriptors[i] = NULL;
ok = false;
}
It shows that when an error occurs, tee would not close itself but just unset the file descriptor descriptors[i] = NULL and continue to keep reading data until EOF or error on input occurs besides EINTR.
The date command or anything that sends output to the pipe connected to tee would not terminated since tee still reads their data. Only that the data doesn't go anywhere besides the file foo. And even if a file argument was not provided, tee would still read their data.
This is what /proc/**/fd looks like on tee when disconnected from a terminal:
0 -> pipe:[431978]
1 -> /dev/pts/2 (deleted)
2 -> /dev/pts/2 (deleted)
And this one's from the process that connects to its pipe:
0 -> /dev/pts/2 (deleted)
1 -> pipe:[431978]
2 -> /dev/pts/2 (deleted)
You can see that tee's stdout and stderr is already EOL but it's still running.

Related

When there is no output on the left side of the pipe, what happened on the right side of the pipe?

when I put a no output program on the left of pipe, such as:
// program a
int main() {
char s[255];
scanf("%s", s);
return 0;
}
and a program with input & output on the right:
// program b
int main() {
char s[255];
scanf("%s", s);
for(char *c = s; *(c) != '\0'; c++) printf("%d ", *c);
return 0;
}
when I try to connect them with pipe:
$ ./a | ./b
No matter what the input is, I always get two ascii codes: 64 & 3 in my computer, Even though I know that 'a' has no output.
I want to know if this is because of my program, or because of the shell, or because of the pipeline, or something else...
The pipe means "run program a, and use its output as input for program b". Both programs get executed.
Interesting is e.g. where cat comes in that just takes input and outputs it:
$ ls
file README-cloudshell.txt terraform.tfstate test.php
$ ls | cat
file
README-cloudshell.txt
terraform.tfstate
test.php
And this also explains why "hello" is not printed here:
$ echo hallo | ls
file README-cloudshell.txt terraform.tfstate test.php

Bash: close if pipe IO is idle

How can I close a program if a pipe stream is idle for a period of time?
Say for instance:
someprogram | closeidlepipe -t 500 | otherprogram
Is there some program closeidlepipe that can close if idle for a period (-t 500)?
timeout can close after a period, but not with the "idle" distinction.
UPDATE
It is important to note that someprogram outputs an endless stream of binary data. The data may contain the null character \0 and should be piped verbatim.
Here's the general form of the heart of a program that does this.
while(1) {
struct timeval tv;
tv.m_sec = 0;
tv.m_usec = 500000;
int marker = 1;
select(1, &marker, NULL, NULL, &tv);
if (marker == 0) exit(1);
char buf[8192];
int n = read(0, buf, 8192);
if (n < 0) exit(2);
char *b = buf;
while (n)
{
int l = write(1, b, n);
if (l <= 0) exit(3);
b += l;
n -= l;
}
}
The builtin read has a timeout option -t.
someprogram |
while :; do
IFS= read -d'' -r -t 500 line
res=$?
if [[ $res -eq 0 || )); then
# Normal read up to delimiter
printf '%s\0' "$line"
else
# Either read timed out without reading another null
# byte, or there was some other failure meaning we
# should break. In neither case did we read a trailing null byte
# that read discarded.
[[ -n $line ]] && printf '%s' "$line"
break
fi
done |
otherprogram
If read times out after 500 seconds, the while loop will exit and the middle part of the pipeline closes. someprogram will receive a SIGCHLD signal the next time it tries to write to its end of that pipe, allowing it to exit.

Breaking out of a while loop with system commands in Perl using Ctrl-C (SIGINT)?

Consider the following example, test.pl:
#!/usr/bin/env perl
use 5.10.1;
use warnings;
use strict;
$SIG{'INT'} = sub {print "Caught Ctrl-C - Exit!\n"; exit 1;};
$| = 1; # turn off output line buffering
use Getopt::Long;
my $doSystemLoop = 0;
GetOptions( "dosysloop"=>\$doSystemLoop );
print("$0: doSystemLoop is:$doSystemLoop (use " . (($doSystemLoop)?"system":"Perl") . " loop); starting...\n");
my $i=0;
if (not($doSystemLoop)) { # do Perl loop
while ($i < 1e6) {
print("\tTest value is $i");
$i++;
sleep 1;
print(" ... ");
sleep 1;
print(" ... \n");
}
} else { # do system call loop
while ($i < 1e6) {
system("echo","-ne","\tTest value is $i");
$i++;
system("sleep 1");
system("echo","-ne"," ... ");
system("sleep 1");
system("echo","-e"," ... ");
}
}
So, if I call this program, so it uses a usual Perl loop, everything is as expected:
$ perl test.pl
test.pl: doSystemLoop is:0 (use Perl loop); starting...
Test value is 0 ... ...
Test value is 1 ... ...
Test value is 2 ... ^CCaught Ctrl-C - Exit!
$
... that is, I hit Ctrl-C, program exits instantly.
However, if the while loop's commands consist mostly of system calls, then it becomes nearly impossible to exit with Ctrl-C:
$ perl test.pl --dosysloop
test.pl: doSystemLoop is:1 (use system loop); starting...
Test value is 0 ... ...
Test value is 1 ... ...
Test value is 2 ... ^C ...
Test value is 3 ... ^C ...
Test value is 4 ... ^C ...
Test value is 5^C ... ^C ...
Test value is 6^C ... ^C ...
Test value is 7^C ... ^C ...
Test value is 8^C ... ^C ...
Test value is 9^C ... ^C ...
Test value is 10 ... ^C ...
Test value is 11^C ... ^C ...
Test value is 12^C ... ...
Test value is 13^Z
[1]+ Stopped perl test.pl --dosysloop
$ killall perl
$ fg
perl test.pl --dosysloop
Terminated
$
So in the snippet above, I'm hitting Ctrl-C (the ^C) like mad, and the program ignores me completely :/ Then I cheat by hitting Ctrl-Z (the ^Z), which stops the process and sets in the background; then in the resulting shell I do killall perl, and after that I execute the fg command, which places the Perl job back in the foreground - where it finally terminates due to the killall.
What I would like to have, is run a system loop like this, with the possibility to break out of it/exit it with the usual Ctrl-C. Is this possible to do, and how do I do that?
Perl's signal handling mechanism defers the handling of signals until a safe point. Deferred signals are checked between Opcodes of the perl VM. As system and friends count as a single opcode, signals are only checked once the exec'd command has terminated.
This can be circumvented by forking, and then waiting in a loop for the child process to terminate. The child can also be terminated early via a signal handler.
sub launch_and_wait {
my $wait = 1;
my $child;
local $SIG{CHLD} = sub {
$wait = 0;
};
local $SIG{INT} = sub {
$wait = 0;
kill KILL => $child if defined $child;
};
if ($child = fork()) {
# parent
while ($wait) {
print "zzz\n";
sleep 1;
}
wait; # try to join the child
} else {
# child
exec {$_[0]} #_;
}
}
launch_and_wait sleep => 60;
print "Done\n";
There are probably lots of ways this can go wrong (getting a SIGINT before the child was spawned…). I also omitted any error handling.
Check the exit status of the system() command for any signals. An external command interrupted with SIGINT will get a "2" here:
while () {
system("sleep", 1);
if ($? & 127) {
my $sig = $? & 127;
die "Caught signal INT" if $sig == 2; # you may also abort on other signals if you like
}
}

Reading with cat: Stop when not receiving data

Is there any way to tell the cat command to stop reading when not receiving any data? maybe with some "timeout" that specifies for how long no data is incoming.
Any ideas?
There is a timeout(1) command. Example:
timeout 5s cat /dev/random
Dependening on your circumstances. E.g. you run bash with -e and care normally for the exit code.
timeout 5s cat /dev/random || true
cat itself, no. It reads the input stream until told it's the end of the file, blocking for input if necessary.
There's nothing to stop you writing your own cat equivalent which will use select on standard input to timeout if nothing is forthcoming fast enough, and exit under those conditions.
In fact, I once wrote a snail program (because a snail is slower than a cat) which took an extra argument of characters per second to slowly output a file (a).
So snail 10 myprog.c would output myprog.c at ten characters per second. For the life of me, I can't remember why I did this - I suspect I was just mucking about, waiting for some real work to show up.
Since you're having troubles with it, here's a version of dog.c (based on my afore-mentioned snail program) that will do what you want:
#include <stdio.h>
#include <unistd.h>
#include <errno.h>
#include <sys/select.h>
static int dofile (FILE *fin) {
int ch = ~EOF, rc;
fd_set fds;
struct timeval tv;
while (ch != EOF) {
// Set up for fin file, 5 second timeout.
FD_ZERO (&fds); FD_SET (fileno (fin), &fds);
tv.tv_sec = 5; tv.tv_usec = 0;
rc = select (fileno(fin)+1, &fds, NULL, NULL, &tv);
if (rc < 0) {
fprintf (stderr, "*** Error on select (%d)\n", errno);
return 1;
}
if (rc == 0) {
fprintf (stderr, "*** Timeout on select\n");
break;
}
// Data available, so it will not block.
if ((ch = fgetc (fin)) != EOF) putchar (ch);
}
return 0;
}
int main (int argc, char *argv[]) {
int argp, rc;
FILE *fin;
if (argc == 1)
rc = dofile (stdin);
else {
argp = 1;
while (argp < argc) {
if ((fin = fopen (argv[argp], "rb")) == NULL) {
fprintf (stderr, "*** Cannot open input file [%s] (%d)\n",
argv[argp], errno);
return 1;
}
rc = dofile (fin);
fclose (fin);
if (rc != 0)
break;
argp++;
}
}
return rc;
}
Then, you can simply run dog without arguments (so it will use standard input) and, after five seconds with no activity, it will output:
*** Timeout on select
(a) Actually, it was called slowcat but snail is much nicer and I'm not above a bit of minor revisionism if it makes the story sound better :-)
mbuffer, with its -W option, works for me.
I needed to sink stdin to a file, but with an idle timeout:
I did not need to actually concatenate multiple sources (but perhaps there are ways to use mbuffer for this.)
I did not need any of cat's possible output-formatting options.
I did not mind the progress bar that mbuffer brings to the table.
I did need to add -A /bin/false to suppress a warning, based on a suggestion in the linked man page. My invocation for copying stdin to a file with 10 second idle timeout ended up looking like
mbuffer -A /bin/false -W 10 -o ./the-output-file
Here is the code for timeout-cat:
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <signal.h>
void timeout(int sig) {
exit(EXIT_FAILURE);
}
int main(int argc, char* argv[]) {
int sec = 0; /* seconds to timeout (0 = no timeout) */
int c;
if (argc > 1) {
sec = atoi(argv[1]);
signal(SIGALRM, timeout);
alarm(sec);
}
while((c = getchar()) != EOF) {
alarm(0);
putchar(c);
alarm(sec);
}
return EXIT_SUCCESS;
}
It does basically the same as paxdiablo's dog.
It works as a cat without an argument - catting the stdin. As a first argument provide timeout seconds.
One limitation (applies to dog as well) - lines are line-buffered, so you have n-seconds to provide a line (not any character) to reset the timeout alarm. This is because of readline.
usage:
instead of potentially endless:
cat < some_input > some_output
you can do compile code above to timeout_cat and:
./timeout_cat 5 < some_input > some_output
Try to consider tail -f --pid
I am assuming that you are reading some file and when the producer is finished (gone?) you stop.
Example that will process /var/log/messages until watcher.sh finishes.
./watcher.sh&
tail -f /var/log/messages --pid $! | ... do something with the output
I faced same issue of cat command blocking while reading on tty port via adb shell but did not find any solution (timeout command was also not working). Below is the final command I used in my python script (running on ubuntu) to make it non-blocking. Hope this will help someone.
bash_command = "adb shell \"echo -en 'ATI0\\r\\n' > /dev/ttyUSB0 && cat /dev/ttyUSB0\" & sleep 1; kill $!"
response = subprocess.check_output(['bash', '-c', bash_command])
Simply cat then kill the cat after 5 sec.
cat xyz & sleep 5; kill $!
Get the cat output as a reply after 5 seconds
reply="`cat xyz & sleep 5; kill $!`"
echo "reply=$reply"

Is there a good way to detect a stale NFS mount

I have a procedure I want to initiate only if several tests complete successfully.
One test I need is that all of my NFS mounts are alive and well.
Can I do better than the brute force approach:
mount | sed -n "s/^.* on \(.*\) type nfs .*$/\1/p" |
while read mount_point ; do
timeout 10 ls $mount_point >& /dev/null || echo "stale $mount_point" ;
done
Here timeout is a utility that will run the command in the background, and will kill it after a given time, if no SIGCHLD was caught prior to the time limit, returning success/fail in the obvious way.
In English: Parse the output of mount, check (bounded by a timeout) every NFS mount point. Optionally (not in the code above) breaking on the first stale mount.
A colleague of mine ran into your script. This doesn't avoid a "brute force" approach, but if I may in Bash:
while read _ _ mount _; do
read -t1 < <(stat -t "$mount") || echo "$mount timeout";
done < <(mount -t nfs)
mount can list NFS mounts directly. read -t (a shell builtin) can time out a command. stat -t (terse output) still hangs like an ls*. ls yields unnecessary output, risks false positives on huge/slow directory listings, and requires permissions to access - which would also trigger a false positive if it doesn't have them.
while read _ _ mount _; do
read -t1 < <(stat -t "$mount") || lsof -b 2>/dev/null|grep "$mount";
done < <(mount -t nfs)
We're using it with lsof -b (non-blocking, so it won't hang too) in order to determine the source of the hangs.
Thanks for the pointer!
test -d (a shell builtin) would work instead of stat (a standard external) as well, but read -t returns success only if it doesn't time out and reads a line of input. Since test -d doesn't use stdout, a (( $? > 128 )) errorlevel check on it would be necessary - not worth the legibility hit, IMO.
Took me some time, but here is what I found which works in Python:
import signal, os, subprocess
class Alarm(Exception):
pass
def alarm_handler(signum, frame):
raise Alarm
pathToNFSMount = '/mnt/server1/' # or you can implement some function
# to find all the mounts...
signal.signal(signal.SIGALRM, alarm_handler)
signal.alarm(3) # 3 seconds
try:
proc = subprocess.call('stat '+pathToNFSMount, shell=True, stderr=subprocess.PIPE, stdout=subprocess.PIPE)
stdoutdata, stderrdata = proc.communicate()
signal.alarm(0) # reset the alarm
except Alarm:
print "Oops, taking too long!"
Remarks:
credit to the answer here.
You could also use alternative scheme:
os.fork() and os.stat()
check if the fork finished, if it has timed out you can kill it. You will need to work with time.time() and so on.
In addition to previous answers, which hangs under some circumstances, this snippet checks all suitable mounts, kills with signal KILL, and is tested with CIFS too:
grep -v tracefs /proc/mounts | cut -d' ' -f2 | \
while read m; do \
timeout --signal=KILL 1 ls -d $m > /dev/null || echo "$m"; \
done
You could write a C program and check for ESTALE.
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>
#include <iso646.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
int main(){
struct stat st;
int ret;
ret = stat("/mnt/some_stale", &st);
if(ret == -1 and errno == ESTALE){
printf("/mnt/some_stale is stale\n");
return EXIT_SUCCESS;
} else {
return EXIT_FAILURE;
}
}
Writing a C program that checks for ESTALE is a good option if you don't mind waiting for the command to finish because of the stale file system. If you want to implement a "timeout" option the best way I've found to implement it (in a C program) is to fork a child process that tries to open the file. You then check if the child process has finished reading a file successfully in the filesystem within an allocated amount of time.
Here is a small proof of concept C program to do this:
#include <stdlib.h>
#include <stdio.h>
#include <stdint.h>
#include <unistd.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/wait.h>
void readFile();
void waitForChild(int pid);
int main(int argc, char *argv[])
{
int pid;
pid = fork();
if(pid == 0) {
// Child process.
readFile();
}
else if(pid > 0) {
// Parent process.
waitForChild(pid);
}
else {
// Error
perror("Fork");
exit(1);
}
return 0;
}
void waitForChild(int child_pid)
{
int timeout = 2; // 2 seconds timeout.
int status;
int pid;
while(timeout != 0) {
pid = waitpid(child_pid, &status, WNOHANG);
if(pid == 0) {
// Still waiting for a child.
sleep(1);
timeout--;
}
else if(pid == -1) {
// Error
perror("waitpid()");
exit(1);
}
else {
// The child exited.
if(WIFEXITED(status)) {
// Child was able to call exit().
if(WEXITSTATUS(status) == 0) {
printf("File read successfully!\n");
return;
}
}
printf("File NOT read successfully.\n");
return;
}
}
// The child did not finish and the timeout was hit.
kill(child_pid, 9);
printf("Timeout reading the file!\n");
}
void readFile()
{
int fd;
fd = open("/path/to/a/file", O_RDWR);
if(fd == -1) {
// Error
perror("open()");
exit(1);
}
else {
close(fd);
exit(0);
}
}
I wrote https://github.com/acdha/mountstatus which uses an approach similar to what UndeadKernel mentioned, which I've found to be the most robust approach: it's a daemon which periodically scans all mounted filesystems by forking a child process which attempts to list the top-level directory and SIGKILL it if it fails to respond in a certain timeout, with both successes and failures recorded to syslog. That avoids issues with certain client implementations (e.g older Linux) which never trigger timeouts for certain classes of error, NFS servers which are partially responsive but e.g. won't respond to actual calls like listdir, etc.
I don't publish them but the included Makefile uses fpm to build rpm and deb packages with an Upstart script.
Another way, using shell script. Works good for me:
#!/bin/bash
# Purpose:
# Detect Stale File handle and remove it
# Script created: July 29, 2015 by Birgit Ducarroz
# Last modification: --
#
# Detect Stale file handle and write output into a variable and then into a file
mounts=`df 2>&1 | grep 'Stale file handle' |awk '{print ""$2"" }' > NFS_stales.txt`
# Remove : ‘ and ’ characters from the output
sed -r -i 's/://' NFS_stales.txt && sed -r -i 's/‘//' NFS_stales.txt && sed -r -i 's/’//' NFS_stales.txt
# Not used: replace space by a new line
# stales=`cat NFS_stales.txt && sed -r -i ':a;N;$!ba;s/ /\n /g' NFS_stales.txt`
# read NFS_stales.txt output file line by line then unmount stale by stale.
# IFS='' (or IFS=) prevents leading/trailing whitespace from being trimmed.
# -r prevents backslash escapes from being interpreted.
# || [[ -n $line ]] prevents the last line from being ignored if it doesn't end with a \n (since read returns a non-zero exit code when it encounters EOF).
while IFS='' read -r line || [[ -n "$line" ]]; do
echo "Unmounting due to NFS Stale file handle: $line"
umount -fl $line
done < "NFS_stales.txt"
#EOF
I'll just paste a snippet from our Icinga2 NFS stale mount monitoring Bash script here:
MOUNTS="$(mount -t nfs;mount -t nfs3;mount -t nfs4)"
MOUNT_POINTS=$(echo -e "$MOUNTS \n"|grep -v ^$|awk '{print $3}')
if [ -z "$MOUNT_POINTS" ]; then
OUTPUT="[OK] No nfs mounts"
set_result 0
else
for i in $MOUNT_POINTS;do
timeout 1 stat -t "$i" > /dev/null
TMP_RESULT=$?
set_result $TMP_RESULT
set_output $TMP_RESULT "$i"
done
fi

Resources