view output of already running processes in linux - linux

I have a process that is running in the background (sh script) and I wonder if it is possible to view the output of this process without having to interrupt it.
The process ran by some application otherwise I would have attached it to a screen for later viewing. It might take an hour to finish and i want to make sure it's running normally with no errors.

There is already an program that uses ptrace(2) in linux to do this, retty:
http://pasky.or.cz/dev/retty/
It works if your running program is already attached to a tty, I do not know if it will work if you run your program in background.
At least it may give some good hints. :)
You can probably retreive the exit code from the program using ptrace(2), otherwise just attach to the process using gdb -p <pid>, and it will be printed when the program dies.
You can also manipulate file descriptors using gdb:
(gdb) p close(1)
$1 = 0
(gdb) p creat("/tmp/stdout", 0600)
$2 = 1
http://etbe.coker.com.au/2008/02/27/redirecting-output-from-a-running-process/

You could try to hook into the /proc/[pid]/fd/[012] triple, but likely that won't work.
Next idea that pops to my mind is strace -p [pid], but you'll get "prittified" output. The possible solution is to strace yourself by writing a tiny program using ptrace(2) to hook into write(2) and writing the data somewhere. It will work but is not done in just a few seconds, especially if you're not used to C programming.
Unfortunately I can't think of a program that does precisely what you want, which is why I give you a hint of how to write it yourself. Good luck!

Related

Linux process in background - "Stopped" in jobs?

I'm currently running a process with the & sign.
$ example &
However, (please note i'm a newbie to Linux) I realised that pretty much a second after such command I'm getting a note that my process received a stopped signal. If I do
$ jobs
I'll get the list with my example process with a little note "Stopped". Is it really stopped and not working at all in the background? How does it exactly work? I'm getting mixed info from the Internet.
In Linux and other Unix systems, a job that is running in the background, but still has its stdin (or std::cin) associated with its controlling terminal (a.k.a. the window it was run in) will be sent a SIGTTIN signal, which by default causes the program to be completely stopped, pending the user bringing it to the foreground (fg %job or similar) to allow input to actually be given to the program. To avoid the program being paused in this way, you can either:
Make sure the programs stdin channel is no longer associated with the terminal, by either redirecting it to a file with appropriate contents for the program to input, or to /dev/null if it really doesn't need input - e.g. myprogram < /dev/null &.
Exit the terminal after starting the program, which will cause the association with the program's stdin to go away. But this will cause a SIGHUP to be delivered to the program (meaning the input/output channel experienced a "hangup") - this normally causes a program to be terminated, but this can be avoided by using nohup - e.g. nohup myprogram &.
If you are at all interested in capturing the output of the program, this is probably the best option, as it prevents both of the above signals (as well as a couple others), and saves the output for you to look at to determine if there are any issues with the programs execution:
nohup myprogram < /dev/null > ${HOME}/myprogram.log 2>&1 &
Yes it really is stopped and no longer working in the background. To bring it back to life type fg job_number
From what I can gather.
Background jobs are blocked from reading the user's terminal. When one tries to do so it will be suspended until the user brings it to the foreground and provides some input. "reading from the user's terminal" can mean either directly trying to read from the terminal or changing terminal settings.
Normally that is what you want, but sometimes programs read from the terminal and/or change terminal settings not because they need user input to continue but because they want to check if the user is trying to provide input.
http://curiousthing.org/sigttin-sigttou-deep-dive-linux has the gory technical details.
Just enter fg which will resolve the error when you then try to exit.

Shell script process is getting killed automatically

I am facing problem with shell script i have ascript which will be running in infinite loop so say its havin PID X.The process is running for 4-5 hours but automatically the process getting killed.This is happening only for some long time running system and i am observing some times after 2 hours also its getting killed.
I am not able to find the reason the why its going down why its getting killed.No one is using the system other than me.And i am running the process as a root user.
Can any one explain or suspect the reason who is killing the process.
Below is the sample script
#!/bin/bash
until ./test.tcl; do
echo "Server 'test.tcl' crashed with exit code $?. Respawing.." >&2
done
In test.tcl script i am running it for infinite loop and the script is used to trap signal and do some special operation.But we find that test.tcl is also going down.
So is there any way from where i capture who and how it gets killed.
Enable core dump in your system, it is the most commonly used method for app crash analysis. I know it is a bit painful to gdb core file, but more or less you can find something out of it.
Here is a reference link for you.(http://www.cyberciti.biz/tips/linux-core-dumps.html).
Another way to do this is tracing you script by "strace -p PID-X", note that it will slow down your system, espeically several hours in your case, but it can be the last resort.
Hope above helpful to you.
Better to check all the signals generated and caught by OS at that time by specific script might be one of signal is causing to kill your process.

How to find the reason for a dead process without log file on unix?

This is an interview question.
A developer started a process.
But when a customer wants to use the process, he found the process wasn't running.
The developer logged in and found the process died. How can the developer know what was wrong?
Follow up: a running process which is supposed to write logs to a file. But there are no logs in the file. How can the developer figure out what's going on in the process?
I think :
If the program can be re-run, i will use gdb to track the process.
If not, check the output file from the process (the application program).
or, add print to the code.
But, are there other ways to do it by referring some information generated by OS?
If you have the disk space and spare CPU power, you can leave strace following the program to catch the sequence leading up to exit.
One possible cause if the program died without leaving any trace is the Out-Of-Memory (OOM) killer. This will leave a message in the kernel log if it kills your process.
From the same answer, process accounting can be modified to provide some clues by telling you the exit code along with the exit time.
are there other ways to do it by referring some information generated
by OS?
core dump is one option.
Sometimes programs don't create core dumps. In this case knowing the exit code of your software may help.
So you can use this script below to start your software and log its exit status for finding its exit reason.
Example :
#!/bin/bash
./myprogram
#get exit code
exitvalue=$?
#log exit code value to /var/log/messages
logger -s "exit code of my program is " $exitvalue
... use a debugger like gdb ...

What do you do with badly behaving 3rd party processes in Linux?

Anytime I have a badly behaving process (pegging CPU, or frozen, or otherwise acting strangely) I generally kill it, restart it and hope it doesn't happen again.
If I wanted to explore/understand the problem (i.e. debug someone else's broken program as it's running) what are my options?
I know (generally) of things like strace, lsof, dmesg, etc. But I'm not really sure of the best way start poking around productively.
Does anyone have a systematic approach for getting to the bottom of these issues? Or general suggestions? Or is killing & restarting really the best one can do?
Thanks.
If you have debugging symbols of the program in question installed, you can attach to it with gdb and look what is wrong there. Start gdb, type attach pid where pid is the process id of the program in question (you can find it via top or ps). Then type Ctrl-C to stop it. Saying backtrace gives you the call stack, that means it tells which line of code is currently running and which functions called the currently running function.

Linux i/o to running daemon / process

Is it possible to i/o to a running process?
I have multiple game servers running like this:
cd /path/to/game/server/binary
./binary arg1 arg2 ... argn &
Is it possible to write a message to a server if i know the process id?
Something like this would be handy:
echo "quit" > process1234
Where process1234 is the process (with sid 1234).
The game server is not a binary written by me, but it is a Call of Duty binary. So i can't change anything to the code.
Yes, you can start up the process with a pipe as its stdin and then write to the pipe. You can used a named or anonymous pipe.
Normally a parent process would be needed to do this, which would create an anonmyous pipe and supply that to the child process as its stdin - popen() does this, many libraries also implement it (see Perl's IPC::Open2 for example)
Another way would be to run it under a pseudo tty, which is what "screen" does. Screen itself may also have a mechanism for doing this.
Only if the process is listening for some message somewhere. For instance, your game server can be waiting for input on a file, over a network connection, or from standard input.
If your process is not actively listening for something, the only things you can really do is halt or kill it.
Now if your process is waiting on standard input, and you ran it like so:
$ myprocess &
Then (in linux) you should be able to try the following:
$ jobs
[1]+ Running myprocess &
$ fg 1
And at this point you are typing standard input into your process.
You can only do that if the process is explicitly designed for that.
But since you example is requesting the process quit, I'd recommend trying signals. First try to send the TERM (i.e. terminate) signal which is the default:
kill _pid_
If that doesn't work, you can try other signals such as QUIT:
kill -QUIT _pid_
If all else fails, you can use the KILL signal. This is guaranteed (*) to stop the process but the process will have no change to clean up:
kill -KILL _pid_
* - in the past, kill -KILL would not work if the process was hung when on a flaky network file server. Don't know if they ever fixed this.
I'm pretty sure this would work, since the server has a console on stdin:
echo "quit" > /proc/<server pid>/fd/0
You mention in a comment below that your process does not appear to read from the console on fd 0. But it must on some fd. ls -l /proc/<server pid/>/fd/ and look for one that's pointing at /dev/pts/ if the process is running in a gnome-terminal or xterm or something.
If you want to do a few simple operations on your server, use signals as mentioned elsewhere. Set up signal handlers in the server and have each signal perform a different action e.g.:
SIGINT: Reread config file
SIGHUP: quit
...
Highly hackish, don't do this if you have a saner alternative, but you can redirect a process's file descriptors on the fly if you have ptrace permissions.
$ echo quit > /tmp/quitfile
$ gdb binary 1234
(gdb) call dup2(open("/tmp/quitfile", 0), 0)
(gdb) continue
open("/tmp/quitfile", O_RDONLY) returns a file descriptor to /tmp/quitfile. dup2(..., STDIN_FILENO) replaces the existing standard input by the new file descriptor.
We inject this code into the application using gdb (but with numeric constants, as #define constants may not be available), and taadaah.
Simply run it under screen and don't background it. Then you can either connect to it with screen interactively and tell it to quit, or (with a bit of expect hackery) write a script that will connect to screen, send the quit message, and disconnect.

Resources