Troubleshooting SIGTERMs with tee on a cluster within SGE jobs

Troubleshooting SIGTERMs with tee on a cluster within SGE jobs - linux

I have some legacy scientific code running on a Rocks cluster, with SGE. I have an application-specific job submission script that generates qsub scripts (i.e. the script which Sun Grid Engine takes and runs).
Within the qsub script, my legacy app is called. This app sends it's output to STDOUT. SGE intercepts STDOUT and spools it into a file in the users home directory, so the user can see results build up in real-time. I want this behavior to be maintained, but at the same time, I want to transparently log all output in the background. I figured tee would be perfect to achieve this.
So I modified the job submission script to run the app and pipe STDOUT to tee, which saves STDOUT to a file that is copied to a central store once the job completes. The app is run and piped to tee as follows:
\$GMSCOMMAND | tee \$SCRATCHDIR/gamess_output.log
The problem is, ever since I've started piping the code to tee, the app has been dying with SIGTERMs, especially when I request several nodes. I tried using the -i (ignore interrupts) parameter with tee: it makes no difference.
Things work fine if I redirect the app output to a file then cat the file once the app is done, but then I can't allow users to view results buildup in real-time (which is an important requirement).
Any ideas about why this use of tee might be failing? Or alternatively, any ideas about how else I might achieve the desired functionality?

I don't know anything about why your particular case is failing, but one option might be to make $GMSCOMMAND do it's own logging. (Effectively put the tee inside the app). I guess this option depends on cost of changing the legacy app.
Failing that you could wrap the 'legacy app' with your own script/application to do the redirection/duplication.

If pipes are your problem perhaps you can get around this by using a 'while/read' loop with process substitution. Does this work for you?
while read line; do
echo "$line"
echo "$line" >> ${SCRATCHDIR}/gamess_output.log
done <(${GMSCOMMAND})

Related

How to use the attach the same console as output for a process and input for another process?

I am trying to use suckless ii irc client. I can listen to a channel by tail -f out file. However is it also possible for me to input into the same console by starting an echo or cat command?
If I background the process, it actually displays the output in this console but that doesn't seem to be right way? Logically, I think I need to get the fd of the console (but how to do that) and then force the tail output to that fd and probably background it. And then use the present bash to start a cat > in.
Is it actually fine to do this or is that I am creating a lot of processes overhead for a simple task? In other words piping a lot of stuff is nice but it creates a lot of overhead which ideally has to be in a single process if you are going to repeat that task it a lot?

However is it also possible for me to input into the same console by starting an echo or cat command?
Simply NO! cat writes the current content. cat has no idea that the content will grow later. echo writes variables and results from the given command line. echo itself is not made for writing the content of files.
If I background the process, it actually displays the output in this console but that doesn't seem to be right way?
If you do not redirect the output, the output goes to the console. That is the way it is designed :-)
Logically, I think I need to get the fd of the console (but how to do that) and then force the tail output to that fd and probably background it.
As I understand that is the opposite direction. If you want to write to the stdin from the process, you simply can use a pipe for that. The ( useless ) example show that cat writes to the pipe and the next command will read from the pipe. You can extend to any other pipe read/write scenario. See link given below.
Example:
cat main.cpp | cat /dev/stdin
cat main.cpp | tail -f
The last one will not exit, because it waits that the pipe gets more content which never happens.
Is it actually fine to do this or is that I am creating a lot of processes overhead for a simple task? In other words piping a lot of stuff is nice but it creates a lot of overhead which ideally has to be in a single process if you are going to repeat that task it a lot?
I have no idea how time critical your job is, but I believe that the overhead is quite low. Doing the same things in a self written prog must not be faster. If all is done in a single process and no access to the file system is required, it will be much faster. But if you also use system calls, e.g. file system access, it will not be much faster I believe. You always have to pay for the work you get.
For IO redirection please read:
http://www.tldp.org/LDP/abs/html/io-redirection.html
If your scenario is more complex, you can think of named pipes instead of IO redirection. For that you can have a look at:
http://www.linuxjournal.com/content/using-named-pipes-fifos-bash

How to create a virtual command-backed file in Linux?

What is the most straightforward way to create a "virtual" file in Linux, that would allow the read operation on it, always returning the output of some particular command (run everytime the file is being read from)? So, every read operation would cause an execution of a command, catching its output and passing it as a "content" of the file.

There is no way to create such so called "virtual file". On the other hand, you would be
able to achieve this behaviour by implementing simple synthetic filesystem in userspace via FUSE. Moreover you don't have to use c, there
are bindings even for scripting languages such as python.
Edit: And chances are that something like this already exists: see for example scriptfs.

This is a great answer I copied below.
Basically, named pipes let you do this in scripting, and Fuse let's you do it easily in Python.
You may be looking for a named pipe.
mkfifo f
{
echo 'V cebqhpr bhgchg.'
sleep 2
echo 'Urer vf zber bhgchg.'
} >f
rot13 < f
Writing to the pipe doesn't start the listening program. If you want to process input in a loop, you need to keep a listening program running.
while true; do rot13 <f >decoded-output-$(date +%s.%N); done
Note that all data written to the pipe is merged, even if there are multiple processes writing. If multiple processes are reading, only one gets the data. So a pipe may not be suitable for concurrent situations.
A named socket can handle concurrent connections, but this is beyond the capabilities for basic shell scripts.
At the most complex end of the scale are custom filesystems, which lets you design and mount a filesystem where each open, write, etc., triggers a function in a program. The minimum investment is tens of lines of nontrivial coding, for example in Python. If you only want to execute commands when reading files, you can use scriptfs or fuseflt.

No one mentioned this but if you can choose the path to the file you can use the standard input /dev/stdin.
Everytime the cat program runs, it ends up reading the output of the program writing to the pipe which is simply echo my input here:
for i in 1 2 3; do
echo my input | cat /dev/stdin
done
outputs:
my input
my input
my input

I'm afraid this is not easily possible. When a process reads from a file, it uses system calls like open, fstat, read. You would need to intercept these calls and output something different from what they would return. This would require writing some sort of kernel module, and even then it may turn out to be impossible.
However, if you simply need to trigger something whenever a certain file is accessed, you could play with inotifywait:
#!/bin/bash
while inotifywait -qq -e access /path/to/file; do
echo "$(date +%s)" >> /tmp/access.txt
done
Run this as a background process, and you will get an entry in /tmp/access.txt each time your file is being read.

Automatic Background Perl Execution on Ubuntu

I've been troubleshooting this issue for about a week and I am nowhere, so I wanted to reach out for some help.
I have a perl script that I execute via command like, usually in a manner of
nohup ./script.pl --param arg --param2 arg2 &
I usually have about ten of these running at once to process the same type of data from different sources (that is specified through parameters). The script works fine and I can see logs for everything in nohup.out and monitor status via ps output. This script also uses a sql database to track status of various tasks, so I can track finishes of certain sources.
However, that was too much work, so I wrote a wrapper script to execute the script automatically and that is where I am running into problems. I want something exactly the same as I have, but automatic.
The getwork.pl script runs ps and parses output to find out how many other processes are running, if it is below the configured thresh it will query the database for the most out of date source and kick off the script.
The problem is that the kicked off jobs aren't running properly, sometimes they terminate without any error messages and sometimes they just hang and sit idle until I kill them.
The getwork script queries sql and gets the entire execution command via sql concatanation, so in the sql query I am doing something like CONCAT('nohup ./script.pl --arg ',param1,' --arg2 ',param2,' &') to get the command string.
I've tried everything to get these kicked off, I've tried using system (), but again, some jobs kick off, some don't, sometimes it gets stuck, sometimes jobs start and then die within a minute. If I take the exact command I used to start the job and run it in bash, it works fine.
I've tried to also open a pipe to the command like
open my $ca, "| $command" or die ($!);
print $ca $command;
close $ca;
That works just about as well as everything else I've tried. The getwork script used to be executed through cron every 30 minutes, but I scrapped that because I needed another shell wrapper script, so now there is an infinite look in the get work script that executes a function every 30 minutes.
I've also tried many variations of the execution command, including redirecting output to different files, etc... nothing seems to be consistent. Any help would be much appreciated, because I am truly stuck here....
EDIT:
Also, I've tried to add separate logging within each script, it would start a new log file with it's PID ($$). There was a bunch of weirdness there too, all log files would get created, but then some of the processes would be running and writing to the file, others would just have an empty text file and some would just have one or two log entries. Sometimes the process would still be running and just not doing anything, other times it would die with nothing in the log. Me, running the command in shell directly always works though.
Thanks in advance

You need a kind of job managing framework.
One of the bigest one is Gearman: http://www.slideshare.net/andy.sh/gearman-and-perl

Logs are written asynchronous to log file

I have come across strange scenario where when I am trying to redirect stdout logs of perl script into a log file, all the logs are getting written at the end of execution when script completes instead of during execution of the script.
While running script when I do tail -f "filename", I could able to see log only when script has completed its execution not during execution of script.
My script details are given below:
/root/Application/download_mornings.pl >> "/var/log/file_manage/file_manage-$(date +\%Y-\%m-\%d).txt"
But when I run without redirecting log file, I can see logs on command prompt as when script progresses.
Let me know if you need any other details.
Thanks in advance for any light that you all might be able shed whats going on.
Santosh

Perl would buffer the output by default. You can say:
$| = 1;
(at the beginning of the script) to disable buffering. Quoting perldoc perlvar:
$|
If set to nonzero, forces a flush right away and after every write or
print on the currently selected output channel. Default is 0
(regardless of whether the channel is really buffered by the system or
not; $| tells you only whether you've asked Perl explicitly to flush
after each write). STDOUT will typically be line buffered if output is
to the terminal and block buffered otherwise. Setting this variable is
useful primarily when you are outputting to a pipe or socket, such as
when you are running a Perl program under rsh and want to see the
output as it's happening. This has no effect on input buffering. See
getc for that. See select on how to select the output channel. See
also IO::Handle.
You might also want to refer to Suffering from Buffering?.

What happens to stdout when a script runs a program?

I have an embedded application that I want a simple-minded logger for.
The system starts from a script file, which in turn runs the application. There could be various reasons that the script fails to run the application, or the application itself could fail to start. To diagnose this remotely, I need to view the stdout from the script and the application.
I tried writing a tee-like logger that would repeat its stdin to stdout, and save the text in a FIFO for later retrieval via the network. Then I naively tried
./script | ./logger
I ended up with only the script stdout going to the logger, and the application stdout disappearing. I had similar results trying tee.
The system is running kernel 2.4.26, and busybox.
What is going on, and how can I accomplish my desired ends?

It turns out it was working exactly as I thought it should work, with one minor gotcha. stdout was being buffered, and without any fflush(stdout) commands, I never saw it. Had I been really patient, I would have suddenly seen a big gush of output when the stdout buffer filled up. A call to setlinebuf(3) fixed my problem.

Apparently, the application output doesn't end up on stdout...
The output is actually on stderr (which is usually also connected to the terminal)
./script.sh 2>&1 | ./logger
should then work
The application actively disconnects from stdin/stdout (e.g. by closing/reopening file descriptors 0,1(,2) or, using nohup, exec or similar utilities)
the script daemonizes (which also detaches from all standard streams)

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string