Capture both exit status and output from a system call in R - linux

I've been playing a bit with system() and system2() for fun, and it struck me that I can save either the output or the exit status in an object. A toy example:
X <- system("ping google.com",intern=TRUE)
gives me the output, whereas
X <- system2("ping", "google.com")
gives me the exit status (1 in this case, since Google doesn't answer pings). If I want both the output and the exit status, I have to make two system calls, which seems like overkill. How can I get both using only one system call?
EDIT : I'd like to have both in the console, if possible without going over a temporary file by using stdout="somefile.ext" in the system2 call and subsequently reading it in.

As of R 2.15, system2 will give the return value as an attribute when stdout and/or stderr are TRUE. This makes it easy to get the text output and return value.
In this example, ret ends up being a character vector with a "status" attribute:
> ret <- system2("ls","xx", stdout=TRUE, stderr=TRUE)
Warning message:
running command ''ls' xx 2>&1' had status 1
> ret
[1] "ls: xx: No such file or directory"
attr(,"status")
[1] 1
> attr(ret, "status")
[1] 1
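For comparison outside R, the same one-call pattern can be sketched in Python. This is only an illustration and not part of the answer above: run_capture is a hypothetical helper name, and stderr is merged into stdout to mirror the 2>&1 that system2 applies when both stdout and stderr are TRUE.

```python
import subprocess
import sys


def run_capture(argv):
    """Run argv once; return (exit_status, output_lines).

    stderr is folded into stdout, like `2>&1` in a shell.
    """
    proc = subprocess.run(
        argv,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
    )
    return proc.returncode, proc.stdout.splitlines()


if __name__ == "__main__":
    # A child that prints one line and then exits with status 3.
    status, lines = run_capture(
        [sys.executable, "-c", "import sys; print('out'); sys.exit(3)"]
    )
    print(status, lines)
```

As in the R version, a single call yields both pieces of information, and no temporary file is involved.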

I am a bit confused by your description of system2, because it has stdout and stderr arguments, so it is able to return the exit status together with stdout and stderr.
> out <- tryCatch(ex <- system2("ls","xx", stdout=TRUE, stderr=TRUE), warning=function(w){w})
> out
<simpleWarning: running command ''ls' xx 2>&1' had status 2>
> ex
[1] "ls: cannot access xx: No such file or directory"
> out <- tryCatch(ex <- system2("ls","-l", stdout=TRUE, stderr=TRUE), warning=function(w){w})
> out
[listing snipped]
> ex
[listing snipped]

I suggest using this function here:
robust.system <- function (cmd) {
stderrFile = tempfile(pattern="R_robust.system_stderr", fileext=as.character(Sys.getpid()))
stdoutFile = tempfile(pattern="R_robust.system_stdout", fileext=as.character(Sys.getpid()))
retval = list()
retval$exitStatus = system(paste0(cmd, " 2> ", shQuote(stderrFile), " > ", shQuote(stdoutFile)))
retval$stdout = readLines(stdoutFile)
retval$stderr = readLines(stderrFile)
unlink(c(stdoutFile, stderrFile))
return(retval)
}
This will only work on a Unix-like shell that accepts > and 2> notations, and the cmd argument should not redirect output itself. But it does the trick:
> robust.system("ls -la")
$exitStatus
[1] 0
$stdout
[1] "total 160"
[2] "drwxr-xr-x 14 asieira staff 476 10 Jun 18:18 ."
[3] "drwxr-xr-x 12 asieira staff 408 9 Jun 20:13 .."
[4] "-rw-r--r-- 1 asieira staff 6736 5 Jun 19:32 .Rapp.history"
[5] "-rw-r--r-- 1 asieira staff 19109 11 Jun 20:44 .Rhistory"
$stderr
character(0)


How to exit child fork process correctly if error encountered when connected to oracle database using perl

I have recently been using perl (v5.10.1) on a Linux system to connect to a database and perform some tasks.
To do this more efficiently I have been using fork() to be able to perform the tasks in parallel. Whilst doing this I have noticed some problems if the child exits with some sort of error (killed by kill command, dies etc.)
I have searched the forum for possible explanations but have not found anything related to using fork() while connected to a database.
Below is my initial program structure. My actual code is more complex but this simplified code illustrates the idea.
use strict;
use warnings;
use utf8;
use APR::UUID ;
use DBI ;
use DBD::Oracle ;
use Data::Dumper;
$ENV{'ORACLE_HOME'} = "/home/data/ora11g2" ;
$ENV{'NLS_LANG'} = "french_france.AL32UTF8" ;
$ENV{'LANG'} = "fr_FR.utf-8" ;
my $IDJOB = APR::UUID->new->format ;
my $DB="DB_val";
my $SRV="SRV_val";
my %attr = (
PrintError => 1,
RaiseError => 0
);
my %attr_CHILD = (
PrintError => 1,
RaiseError => 0
);
my $db = DBI->connect("dbi:Oracle:$SRV/$DB", "user", "pword", \%attr ) or die "impossible de se connecter à $SRV / $DB";
$db->{AutoCommit} = 0 ;
$db->{InactiveDestroy} = 1; # This needs to be set to 1 if any parallel processing will be used.
# Otherwise database is disconnected in parent after children have finished.
my $Crash_Error_String='';
my @Res1;
eval{@Res1=Mainsub($db)};
#
$Crash_Error_String=$@ unless @Res1 ;
$Res1[0] = 501 unless @Res1 ;
print "ERROR code:" . $Res1[0] . " (Error string:$Crash_Error_String)\n";
$db->commit if defined($db) ;
$db->disconnect if defined($db) ;
#
#
#
sub Mainsub{
my $db=shift;
#
my $Program_Termination_Code=0;
my @Results=(0,0,0,0);
my $Processes_To_Use_After_Calc=4;
my $fh1PR_E_filename_STEM="/tmp/error_file_Test_parallel_rows_" . $IDJOB . "_";
my $forked = 0;
my $err = 0;
my @child_pids_rows;
my @child_ispawns_rows;
my $start = time;
for my $ispawn (1 .. $Processes_To_Use_After_Calc){
my $ispawn_XML=$ispawn-1;
my $child_pid = fork();
if(!defined $child_pid){
$err++
} else {
push @child_pids_rows,$child_pid;
push @child_ispawns_rows,$ispawn;
}
if(defined $child_pid && $child_pid > 0) {
## Parent
$forked++;
} elsif(defined $child_pid){
my $db_child;
my $fh1PR_E_filename=$fh1PR_E_filename_STEM . $ispawn . ".err";
#$SIG{__DIE__} = $SIG{TERM} = $SIG{INT} = sub {
# my $ERROR_Val=$!;
# open(my $fh1PR_E, '>:encoding(UTF-8)', $fh1PR_E_filename);
# print $fh1PR_E "Caught an errorsignal: $ERROR_Val (child $ispawn)";
# close $fh1PR_E;
# $db_child->commit unless $db_child->{AutoCommit};
# $db_child->disconnect if defined($db_child);
# exit;
#};
my $ERROR_Code_child=0;
$db_child = DBI->connect("dbi:Oracle:$SRV/$DB", "user", "pword", \%attr_CHILD ) or die "impossible de se connecter à $SRV / $DB";
$db_child->{AutoCommit} = 0 ;
$db_child->commit unless $db_child->{AutoCommit} ;
#
#
#
#my $ased=4/0 if $ispawn==2 || $ispawn==1;
$db_child->commit unless $db_child->{AutoCommit} ;
$db_child->disconnect if defined($db_child) ;
exit;
} else {
## unable to fork
$err++;
}
}
my $Total_Children_Errors=0;
my $Total_Children_Exited=0;
my $Error_Messages="";
while (scalar @child_pids_rows) {
my $pid = $child_pids_rows[0];
my $ispawn=$child_ispawns_rows[0];
my $kid = waitpid $pid, 0;
my $ERROR_Count=0;
if($kid > 0){
my ($rc, $sig, $core) = ($? >> 8, $? & 127, $? & 128);
if ($core){
$ERROR_Count++;
$Total_Children_Errors++;
$Error_Messages eq "" ? $Error_Messages="$pid dumped core" : $Error_Messages=$Error_Messages . "\n" . "$pid dumped core";
} elsif($sig == 9){
$ERROR_Count++;
$Total_Children_Errors++;
$Error_Messages eq "" ? $Error_Messages="$pid was murdered!" : $Error_Messages=$Error_Messages . "\n" . "$pid was murdered!";
} else {
print "$pid returned $rc";
print ($sig?" after receiving signal $sig":"\n");
my $fname=$fh1PR_E_filename_STEM . $ispawn . ".err";
if(-f "$fname"){
$Total_Children_Errors++;
$ERROR_Count++;
if($Error_Messages eq ""){
$Error_Messages="Error found in parallel row process $ispawn (see file " . $fh1PR_E_filename_STEM . $ispawn . ".err for details)";
} else {
$Error_Messages=$Error_Messages . "\n" . "Error found in parallel row process $ispawn (see file " .
$fh1PR_E_filename_STEM . $ispawn . ".err for details)";
}
}
}
} else {
$ERROR_Count++;
$Total_Children_Errors++;
$Error_Messages eq "" ? $Error_Messages="$pid... um... disappeared..." : $Error_Messages=$Error_Messages . "\n" . "$pid... um... disappeared...";
}
$Total_Children_Exited++;
if($ERROR_Count==0){
print "Child $pid exited successfully (" . eval($forked-$Total_Children_Exited) . " of " . $forked . " Children left)\n";
} else {
print "Child $pid exited with ERROR! (" . eval($forked-$Total_Children_Exited) . " of " . $forked . " Children left)\n";
}
shift @child_pids_rows;
shift @child_ispawns_rows;
}
#print "Total child errors:$Total_Children_Errors\n";
if($Total_Children_Errors>0){
$Program_Termination_Code=915;
print $Error_Messages . "\n";
@Results=($Program_Termination_Code,0,0);
goto END101;
} else {
if($err > 0){
$Program_Termination_Code=919;
@Results=($Program_Termination_Code,0,0);
goto END101;
} else {
print "ALL Child processes terminated correctly (Parallel Rows)!\n";
}
}
END101:
return @Results;
}
Running this code produces the output:
27713 returned 0
Child 27713 exited successfully (3 of 4 Children left)
27714 returned 0
Child 27714 exited successfully (2 of 4 Children left)
27715 returned 0
Child 27715 exited successfully (1 of 4 Children left)
27716 returned 0
Child 27716 exited successfully (0 of 4 Children left)
ALL Child processes terminated correctly (Parallel Rows)!
ERROR code:0 (Error string:)
So far no problems. However, now I introduce a deliberate division by zero error in the child process by uncommenting the line (see original code above)
my $ased=4/0 if $ispawn==2 || $ispawn==1;
Now I get the output
ERROR code:501 (Error string:Illegal division by zero at /home/public/AGO/testcode/BArt_F/perl/DB_forking_with_errors_test_code_1.pl line 83.)
ERROR code:501 (Error string:Illegal division by zero at /home/public/AGO/testcode/BArt_F/perl/DB_forking_with_errors_test_code_1.pl line 83.)
30744 returned 0
Child 30744 exited successfully (3 of 4 Children left)
30745 returned 0
Child 30745 exited successfully (2 of 4 Children left)
30746 returned 0
Child 30746 exited successfully (1 of 4 Children left)
30747 returned 0
Child 30747 exited successfully (0 of 4 Children left)
ALL Child processes terminated correctly (Parallel Rows)!
ERROR code:0 (Error string:)
DBD::Oracle::db commit failed: ORA-03113: fin de fichier sur canal de communication
ID de processus : 22739
ID de session : 1, Numéro de série : 54727 (DBD ERROR: OCITransCommit) at /home/public/AGO/testcode/BArt_F/perl/DB_forking_with_errors_test_code_1.pl line 35.
Here I have lost the connection to the database in the parent and the code does not terminate correctly!
Finally, to sort this out, I uncomment the code in the child process (see original code above):
$SIG{__DIE__} = $SIG{TERM} = $SIG{INT} = sub {
my $ERROR_Val=$!;
open(my $fh1PR_E, '>:encoding(UTF-8)', $fh1PR_E_filename);
print $fh1PR_E "Caught an errorsignal: $ERROR_Val (child $ispawn)";
close $fh1PR_E;
$db_child->commit unless $db_child->{AutoCommit};
$db_child->disconnect if defined($db_child);
exit;
};
Now running the code I get:
946 returned 0
Child 946 exited with ERROR! (3 of 4 Children left)
947 returned 0
Child 947 exited with ERROR! (2 of 4 Children left)
948 returned 0
Child 948 exited successfully (1 of 4 Children left)
949 returned 0
Child 949 exited successfully (0 of 4 Children left)
Error found in parallel row process 1 (see file /tmp/error_file_Test_parallel_rows_53a6e838-def0-11eb-b482-8f8e0f0aecb2_1.err for details)
Error found in parallel row process 2 (see file /tmp/error_file_Test_parallel_rows_53a6e838-def0-11eb-b482-8f8e0f0aecb2_2.err for details)
ERROR code:915 (Error string:)
Now the error is trapped and the parent exits correctly.
All of this seemed fine until I read (https://www.perlmonks.org/?node_id=1173708) that I should not use
$SIG{__DIE__}
However, I cannot find any alternative that allows my parent program to exit correctly if any of the child processes die.
Could anyone tell me whether there is an alternative to using
$SIG{__DIE__}?
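One detail in the output above is worth noting: the two "ERROR code:501" lines suggest that the die in each failing child unwound to the eval around Mainsub, so the child went on to run the parent's commit and disconnect code before exiting with status 0. A possible alternative to $SIG{__DIE__}, sketched here in Python for illustration only (the original is Perl, and run_child and decode_wait_status are hypothetical names), is to catch the failure inside the child itself and leave with a non-zero exit code, which the parent then reads from waitpid using the same bit arithmetic as the wait loop above:

```python
import os
import signal


def decode_wait_status(status):
    """Split a raw wait status the same way as ($? >> 8, $? & 127, $? & 128)."""
    return status >> 8, status & 127, bool(status & 128)


def run_child(work):
    """Fork, run work() in the child, and return the decoded status in the parent."""
    pid = os.fork()
    if pid == 0:
        # Child: never let an error unwind into code meant for the parent.
        try:
            work()
            code = 0
        except BaseException:
            code = 1
        os._exit(code)  # leave immediately, skipping the parent's cleanup code
    _, status = os.waitpid(pid, 0)
    return decode_wait_status(status)


if __name__ == "__main__":
    print(run_child(lambda: None))    # child finishes normally
    print(run_child(lambda: 1 / 0))   # child fails, reported via the exit code
    print(run_child(lambda: os.kill(os.getpid(), signal.SIGKILL)))  # killed by signal 9
```

In the Perl program this would roughly correspond to wrapping the child body in its own eval and calling exit with a non-zero value when it fails. That mapping is an assumption and has not been tested against an Oracle connection.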

Laziness: does words (within interact) read all the input, or only what is needed?

I am starting to learn Haskell, and I am trying to understand
how much work the functions do (especially with respect to
the concept of laziness). Please see the following program:
main::IO()
main = interact ( head . words)
Will this program read all the input or only the first word in input?
Just the first word:
% yes | ghc -e 'interact (head . words)'
y
%
But beware: this relies on a feature called "lazy IO" that is only loosely related to the technique of laziness in pure code. Pure functions are lazy by default and you must work hard to make them strict; IO is "strict IO" by default and you must work hard to make it lazy IO. A handful of library functions (notably interact, (h)getContents, and readFile) have gone to this effort.
It also has some problems with composability.
Conceptually, it reads only what it needs. But it probably uses a buffer to do so:
$ yes | strace -feread,write ghc -e 'interact (head . words)'
...
[pid 61274] read(0, "y\ny\ny\ny\ny\ny\ny\ny\ny\ny\ny\ny\ny\ny\ny\ny\n"..., 8096) = 8096
[pid 61272] write(1, "y", 1y) = 1
[pid 61272] --- SIGVTALRM {si_signo=SIGVTALRM, si_code=SI_TIMER, si_timerid=0, si_overrun=0, si_value={int=0, ptr=0}} ---
[pid 61272] write(5, "\376", 1) = 1
[pid 61273] read(4, "\376", 1) = 1
[pid 61273] +++ exited with 0 +++
[pid 61274] +++ exited with 0 +++
[pid 61276] +++ exited with 0 +++
+++ exited with 0 +++
This shows (on a Linux system) that the program split itself into multiple threads; one of them read 8096 bytes (roughly 8 KiB) of data from stdin, then another wrote out the first word. The main reason for the large read is that repeatedly reading small amounts is quite inefficient. Asynchronous sources like terminals and sockets may produce smaller amounts of data, though:
$ strace -f -e trace=read,write -e signal= ghc -e 'interact (head . words)'
...
hello program
Process 61594 attached
[pid 61592] read(0, "hello program\n", 8096) = 14
[pid 61590] write(1, "hello", 5hello) = 5
In this case, the terminal layer completed the read at the first newline, even though the buffer had room for roughly 8 KiB. As this was enough data for the first word to be identified, no further reads were needed.
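The chunked behaviour seen in the traces can be imitated with a small Python sketch. It is not how GHC implements lazy IO; first_word is a hypothetical helper that simply issues buffer-sized reads until the first word is complete, and the 8096 default is taken from the strace output above.

```python
import os


def first_word(fd, bufsize=8096):
    """Return (first_word, number_of_reads), reading only as much as needed."""
    data = b""
    reads = 0
    while True:
        chunk = os.read(fd, bufsize)  # returns whatever is available, up to bufsize
        reads += 1
        data += chunk
        stripped = data.lstrip()
        parts = stripped.split(None, 1)
        if not chunk:
            # End of input: whatever we have is the answer.
            return (parts[0].decode() if parts else ""), reads
        # The word is complete once something (even whitespace) follows it.
        if len(parts) == 2 or (parts and stripped[-1:].isspace()):
            return parts[0].decode(), reads


if __name__ == "__main__":
    r, w = os.pipe()
    os.write(w, b"hello program\n")
    print(first_word(r))
```

With a pipe that already holds one short line, a single read is enough, which matches the terminal case: the read returns early with 14 bytes instead of waiting for the buffer to fill.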

See stdin/stdout/stderr of a running process - Linux kernel

Is there a simple way to redirect or see the stdin/stdout/stderr of a given running process (by PID)?
I tried the following (Assume that 'pid' contains a running user process):
int foo(const void* data, struct file* file, unsigned fd)
{
printf("Fd = %x\n", fd);
return 0;
}
struct task_struct* task = pid_task(find_vpid(pid), PIDTYPE_PID);
struct files_struct* fs = task->files;
iterate_fd(fs, 0, foo, NULL);
I get 3 calls to foo (This process probably has 3 opened files, makes sense) but I can't really read from them (from the file pointers).
It prints:
0
1
2
Is it possible to achieve what I asked for in a fairly simple way ?
thanks
First, if you can change your architecture, run the program under something like screen, tmux, nohup, or dtach, which will make your life easier.
But if you have a running program, you can use strace to monitor its system calls, including all reads and writes. You will need to limit what it traces (try -e), and maybe filter the output for just the first 3 FDs. Also add -s, because by default strace truncates the recorded data. Something like: strace -p <PID> -e read,write -s 1000000
You can achieve this with gdb.
First check which file handles the process has open:
$ ls -l /proc/6760/fd
total 3
lrwx------ 1 rjc rjc 64 Feb 27 15:32 0 -> /dev/pts/5
l-wx------ 1 rjc rjc 64 Feb 27 15:32 1 -> /tmp/foo1
lrwx------ 1 rjc rjc 64 Feb 27 15:32 2 -> /dev/pts/5
Now run GDB:
$ gdb -p 6760 /bin/cat
GNU gdb (Ubuntu 7.7.1-0ubuntu5~14.04.2) 7.7.1
Copyright (C) 2014 Free Software Foundation, Inc.
[lots more license stuff snipped]
Attaching to program: /bin/cat, process 6760
[snip other stuff that’s not interesting now]
(gdb) p close(1)
$1 = 0
Provide a new file name to get output - process_log
(gdb) p creat("/tmp/process_log", 0600)
$2 = 1
(gdb) q
The program is running. Quit anyway (and detach it)? (y or n) y
Detaching from program: /bin/cat, process 6760
After that verify the result as:
ls -l /proc/6760/fd/
total 3
lrwx------ 1 rjc rjc 64 2008-02-27 15:32 0 -> /dev/pts/5
l-wx------ 1 rjc rjc 64 2008-02-27 15:32 1 -> /tmp/process_log <====
lrwx------ 1 rjc rjc 64 2008-02-27 15:32 2 -> /dev/pts/5
In a similar way, you can redirect stdin and stderr too.
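The ls -l /proc/<pid>/fd check used before and after the gdb session can also be done programmatically. Below is a minimal Python sketch, assuming a Linux /proc filesystem and permission to inspect the target process; fd_targets is a hypothetical helper name.

```python
import os


def fd_targets(pid):
    """Map each open descriptor of process `pid` to the path it points at."""
    fd_dir = f"/proc/{pid}/fd"
    targets = {}
    for name in os.listdir(fd_dir):
        try:
            targets[int(name)] = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            # The descriptor was closed between listing and reading; skip it.
            continue
    return targets


if __name__ == "__main__":
    # Inspect this very process: 0, 1 and 2 are stdin, stdout and stderr.
    for fd, target in sorted(fd_targets(os.getpid()).items()):
        print(fd, "->", target)
```

This only shows where the descriptors point. It does not read the data flowing through them, which is what strace or the gdb redirection are for.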

How to get a nice output when calling jconsole?

I've recently started to learn J.
I find it useful when learning a new language to be able to quickly
map a bit of source code to an output and store it for later reference in Emacs org-mode.
But I'm having trouble with the cryptic jconsole when I want to do the evaluation.
For instance jconsole --help doesn't work.
And man jconsole brings up something about a Java tool. Same applies to googling.
I have for instance this bit of code from the tutorial saved in temp.ijs:
m =. i. 3 4
1 { m
23 23 23 23 (1}) m
Now when I run jconsole < temp.ijs, the output is:
4 5 6 7
0 1 2 3
23 23 23 23
8 9 10 11
Ideally, I'd like the output to be:
4 5 6 7
0 1 2 3
23 23 23 23
8 9 10 11
Again, ideally I'd like to have this without changing the source code at all,
i.e. just by passing some flag to jconsole.
Is there a way to do this?
I'm currently going with solving the problem on Emacs side, instead of on jconsole side.
I intersperse the source code with echo'':
(defun org-babel-expand-body:J (body params)
"Expand BODY according to PARAMS, return the expanded body."
(mapconcat #'identity (split-string body "\n") "\necho''\n"))
Execute it like this:
(j-strip-whitespace
(org-babel-eval
(format "jconsole < %s" tmp-script-file) ""))
And post-process assuming that only first row of each array is misaligned
(that has been my experience so far). Here's the result:
#+begin_src J
m =. i. 3 4
1 { m
23 23 23 23 (1}) m
#+end_src
#+RESULTS:
: 4 5 6 7
:
: 0 1 2 3
: 23 23 23 23
: 8 9 10 11
And here's the post-processing code:
(defun whitespacep (str)
(string-match "^ *$" str))
(defun match-second-space (s)
(and (string-match "^ *[^ ]+\\( \\)" s)
(match-beginning 1)))
(defun strip-leading-ws (s)
(and (string-match "^ *\\([^ ].*\\)" s)
(match-string 1 s)))
(defun j-print-block (x)
(if (= 1 (length x))
(strip-leading-ws (car x))
;; assume only first row misaligned
(let ((n1 (match-second-space (car x)))
(n2 (match-second-space (cadr x))))
(setcar
x
(if (and n1 n2)
(substring (car x) (- n1 n2))
(strip-leading-ws (car x))))
(mapconcat #'identity x "\n"))))
(defun j-strip-whitespace (str)
(let ((strs (split-string str "\n" t))
out cur s)
(while (setq s (pop strs))
(if (whitespacep s)
(progn (push (nreverse cur) out)
(setq cur))
(push s cur)))
(mapconcat #'j-print-block
(delq nil (nreverse out))
"\n\n")))
You need to use echo for explicit output, rather than rely on the implicit output that jconsole's REPL normally produces.
Create the script, which I'm calling "tst2.js" below, and place the following code in it:
#!/Applications/j64/bin/jconsole
9!:7'+++++++++|-'
m =. i. 3 4
echo 1 { m
echo ''
echo 23 23 23 23 (1}) m
exit''
Of course, if your path to jconsole is different, then update the "shebang" line to be the actual path for your system.
Next, make sure the script is executable:
$ chmod +x tst2.js
or whatever you called your script.
Next, invoke it:
$ ./tst2.js
4 5 6 7
0 1 2 3
23 23 23 23
8 9 10 11
Note that the above output is identical to the output generated when you are in the interactive jconsole.
The problem is with loose declarations. Every time you give the console a command, it replies with the answer. You should format your code in a verb and have it echo what you need.
foo =: 3 : 0
m =. i. 3 4
echo ''
echo 1 { m
echo ''
echo 23 23 23 23 (1}) m
''
)
foo''
It can also be nameless and self executing if you're in a hurry:
3 : 0 ''
m =. i. 3 4
echo ''
echo 1 { m
echo ''
echo 23 23 23 23 (1}) m
''
)

Linux proc/pid/fd for stdout is 11?

I am executing a script with stdout redirected to a file, so /proc/$$/fd/1 should point to that file (since the stdout fileno is 1). However, the actual fd of the file is 11. Please explain why.
Here is session:
$ cat hello.sh
#!/bin/sh -e
ls -l /proc/$$/fd >&2
$ ./hello.sh > /tmp/1
total 0
lrwx------ 1 nga users 64 May 28 22:05 0 -> /dev/pts/0
lrwx------ 1 nga users 64 May 28 22:05 1 -> /dev/pts/0
lr-x------ 1 nga users 64 May 28 22:05 10 -> /home/me/hello.sh
l-wx------ 1 nga users 64 May 28 22:05 11 -> /tmp/1
lrwx------ 1 nga users 64 May 28 22:05 2 -> /dev/pts/0
I have a suspicion, but this is highly dependent on how your shell behaves. The file descriptors you see are:
0: standard input
1: standard output
2: standard error
10: the running script
11: a backup copy of the script's normal standard out
Descriptors 10 and 11 are close on exec, so won't be present in the ls process. 0-2 are, however, prepared for ls before forking. I see this behaviour in dash (Debian Almquist shell), but not in bash (Bourne again shell). Bash instead does the file descriptor manipulations after forking, and incidentally uses 255 rather than 10 for the script. Doing the change after forking means it won't have to restore the descriptors in the parent, so it doesn't have the spare copy to dup2 from.
The output of strace can be helpful here.
The relevant section is
fcntl64(1, F_DUPFD, 10) = 11
close(1) = 0
fcntl64(11, F_SETFD, FD_CLOEXEC) = 0
dup2(2, 1) = 1
stat64("/home/random/bin/ls", 0xbf94d5e0) = -1 ENOENT (No such file or directory)
stat64("/usr/local/bin/ls", 0xbf94d5e0) = -1 ENOENT (No such file or directory)
stat64("/usr/bin/ls", 0xbf94d5e0) = -1 ENOENT (No such file or directory)
stat64("/bin/ls", {st_mode=S_IFREG|0755, st_size=96400, ...}) = 0
clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0xb75a8938) = 22748
wait4(-1, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 22748
--- SIGCHLD (Child exited) @ 0 (0) ---
dup2(11, 1) = 1
So, the shell moves the existing stdout to an available file descriptor above 10 (namely, 11), then moves the existing stderr onto its own stdout (due to the >&2 redirect), then restores 11 to its own stdout when the ls command is finished.
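The save, redirect, and restore sequence from the trace can be reproduced with the same system calls. The Python sketch below is only an illustration of that sequence, not dash's actual code: run_redirected is a hypothetical helper, and it works on arbitrary descriptors rather than on the real stdout so that it is safe to try.

```python
import fcntl
import os


def run_redirected(target_fd, new_fd, action):
    """Temporarily make target_fd refer to new_fd's file while action() runs."""
    # fcntl(target, F_DUPFD, 10): park a copy on the lowest free descriptor >= 10.
    saved = fcntl.fcntl(target_fd, fcntl.F_DUPFD, 10)
    # Mark the spare copy close-on-exec so executed programs never see it.
    fcntl.fcntl(saved, fcntl.F_SETFD, fcntl.FD_CLOEXEC)
    try:
        os.dup2(new_fd, target_fd)  # like dup2(2, 1) for the >&2 redirect
        action()
    finally:
        os.dup2(saved, target_fd)   # like dup2(11, 1): put the original back
        os.close(saved)
    return saved


if __name__ == "__main__":
    import tempfile

    with tempfile.TemporaryDirectory() as d:
        a = os.open(os.path.join(d, "a"), os.O_WRONLY | os.O_CREAT, 0o600)
        b = os.open(os.path.join(d, "b"), os.O_WRONLY | os.O_CREAT, 0o600)
        spare = run_redirected(a, b, lambda: os.write(a, b"inside"))
        os.write(a, b"after")
        print("spare descriptor was", spare)
        os.close(a)
        os.close(b)
```

The write made during the redirect lands in the second file and the later write lands in the first, which is the same effect the shell achieves around the ls command.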
