Howto debug running bash script - linux

I have a bash script running on Ubuntu.
Is it possible to see the line/command executed now without script restart.
The issue is that script sometimes never exits. This is really hard to reproduce (now I caught it), so I can't just stop the script and start the debugging.
Any help would be really appreciated
P.S. Script logic is hard to understand, so I can't to figure out why it's frozen by power of thoughts.

Try to find the process id (pid) of the shell, you may use ps -ef | grep <script_name>
Let's set this pid in the shell variable $PID.
Find all the child processes of this $PID by:
ps --ppid $PID
You might find one or more (if for example it's stuck in a pipelined series of commands). Repeat this command couple of times. If it doesn't change this means the script is stuck in certain command. In this case, you may attach trace command to the running child process:
sudo strace -p $PID
This will show you what is being executed, either indefinite loop (like reading from a pipe) or waiting on some event that never happens.
In case you find ps --ppid $PID changes, this indicates that your script is advancing but it's stuck somewhere, e.g. local loop in the script. From the changing commands, it can give you a hint where in the script it's looping.

Related

How do I stop a scirpt running in the background in linux?

Let's say I have a silly script:
while true;do
touch ~/test_file
sleep 3
done
And I start the script into the background and leave the terminal:
chmod u+x silly_script.sh
./silly_script.sh &
exit
Is there a way for me to identify and stop that script now? The way I see it is, that every command is started in it's own process and I might be able to catch and kill one command like the 'sleep 3' but not the execution of the entire script, am I mistaken? I expected a process to appear with the scripts name, but it does not. If I start the script with 'source silly_script.sh' I can't find a process by the name of 'source'. Do I need to identify the instance of bash, that is executing the script? How would I do that?
EDIT: There have been a few creative solutions, but so far they require the PID of the script execution to be stored right away, or the bash session to not be left with ^D or exit. I understand, that this way of running scripts should maybe be avoided, but I find it hard to believe, that any low privilege user could, even by accident, start an annoying script into the background, that is for instance filling the drive with garbage files or repeatedly starting new instances of some software and even the admin has no other option, than to restart the server, because a simple script can hide it's identifier without even trying.
With the help of the fine people here I was able to derive the answer I needed:
It is true, that the script runs every command in it's own process, so for instance killing the sleep 3 command won't do anything to the script being run, but through a command like the sleep 3 you can find the bash instance running the script, by looking for the parent process:
So after doing the above, you can run ps axf to show all processes in a tree form. You will then find this section:
18660 ? S 0:00 /bin/bash
18696 ? S 0:00 \_ sleep 3
Now you have found the bash instance, that is running the script and can stop it: kill 18660
(Of course your PID will be different from mine)
The jobs command will show you all running background jobs.
You can kill background jobs by id using kill, e.g.:
$ sleep 9999 &
[1] 58730
$ jobs
[1]+ Running sleep 9999 &
$ kill %1
[1]+ Terminated sleep 9999
$ jobs
$
58730 is the PID of the backgrounded task, and 1 is the task id of it. In this case kill 58730 and kill %1` would have the same effect.
See the JOB CONTROL section of man bash for more info.
When you exit, the backgrounded job will get a kill signal and die (assuming that's how it handles the signal - in your simple example it is), unless you disown it first.
That kill will propogate to the sleep process, which may well ignore it and continue sleeping. If this is the case you'll still see it in ps -e output, but with a parent pid of 1 indicating its original parent no longer exists.
You can use ps -o ppid= <pid> to find the parent of a process, or pstree -ap to visualise the job hierarchy and find the parent visually.

How to find the PID of a running command in bash?

I've googled this question, but never found anything useful for my particular case.
So, what I'm trying to figure out is the PID of a certain command that is running so I can kill it if necessary. I know it's possible to get the PID of a command by typing echo $! So supposedly
my_command & echo $!
should give me the PID. But this isn't the case, and I think I know why:
My command is as follows:
screen -d -m -S Radio /path/to/folder -f frequency -r path/to/song
which opens a detached screen first and then types the command so that it gets executed and keeps on running in the background. This way the PID that echo shows me is the wrong one. I'm guessing it shows me PID of screen -d -m -S Radio /path/to/folder -f frequency -r path/to/song instead of the PID of the command run in the new terminal created by screen.
But there's another problem to it: when I run screen -ls in the terminal, the command that is running in the background doesn't show up! I'm fairly certain it's running because the Pi is constantly at 25% CPU usage (instead of the 0% or 1% usually) and when I type ps au I can actually see the command and the PID.
So now I'm calling the community: any idea on how I could find the PID of that certain command in the new terminal? I'm writing a bash script, so it has to be possible to obtain the PID through code. Perfect would be a command that stores the PID in a variable!
Thanks to #glennjackman I managed to get the PID I wanted with a simple pgrep search_word. At first it wasn't working, but somehow I made it work after some trial and error. For those wanting the PID on a variable, just type
pid=$(pgrep search_word)
Regarding the problem with screen -ls not showing my detached session, it's still not solved, but I'm not bothered with it. Thanks again for solving my problem #glennjackman !
EDIT:
Second problem solved, check the comments on berends answer.
You can easily find out all the running process and their PID by writing (for example):
ps aux
The process is run by screen so you can probably find it easier by writing:
ps aux | grep screen
For more info about ps and the parameters I used check (quick google) -> https://www.lifewire.com/g00/uses-of-linux-ps-command-4058715?i10c
EDIT: You can use this command with bash scripting as well.

How to get right PID of a group of background command and kill it?

Ok, just like in this thread, How to get PID of background process?, I know how to get the PID of background process. However, what I need to do countains more than one operation.
{
sleep 300;
echo "Still running after 5 min, killing process manualy.";
COMMAND COMMAND COMMAND
echo "Shutdown complete"
}&
PID_CHECK_STOP=$!
some stuff...
kill -9 $PID_CHECK_STOP
But it doesn't work. It seems i get either a bad PID or I just can't kill it. I tried to run ps | grep sleep and the pid it gives is always right next to the one i get in PID_CHECK_STOP. Is there a way to make it work? Can i wrap those commands an other way so i can kill them all when i need to?
Thx guys!
kill -9 kills the process before it can do anything else, including signalling its children to exit. Use a gentler signal (kill by itself, which sends a TERM, should be sufficient). You do need to have the process signal its children to exit (if any) explicitly, though, via a trap command.
I'm assuming sleep is a placeholder for the real command. sleep is tricky, however, as it ignores any signals until it returns (i.e., it is non-interruptible). To make your example work, put sleep itself in the background and immediately wait on it. When you kill the "outer" background process, it will interrupt the wait call, which will allow sleep to be killed as well.
{
trap 'kill $(jobs -p)' EXIT
sleep 300 & wait
echo "Still running after 5 min, killing process manualy.";
COMMAND COMMAND COMMAND
echo "Shutdown complete"
}&
PID_CHECK_STOP=$!
some stuff...
kill $PID_CHECK_STOP
UPDATE: COMMAND COMMAND COMMAND includes a command that runs via sudo. To kill that process, kill must also be run via sudo. Keep in mind that doing so will run the external kill program, not the shell built-in (there is little difference between the two; the built-in exists to allow you to kill a process when your process quota has been reached).
You can have another script containing those commands and kill that script. If you are dynamically generating code for the block, just write out a script, execute it and kill when you are done.
The { ... } surrounding the statements starts a new shell, and you get its PID afterwards. sleep and other commands within the block get separate PIDs.
To illustrate, look for your process in ps afux | less - the parent shell process (above the sleep) has the PID you were just given.

Bash script on background: how to kill child processes

Well, I'm basically trying to make a bash script runs a node script forever. I made the following bash script:
#!/bin/bash
while true ; do
cd /myscope/
unlink nohup.out
node myscript.js
sleep 6
done & echo $! > pid
I'm expecting that when it runs, it starts up node with the given script, checks if node exits, sleeps for 6 seconds if so and reopen node. Also, I'm expecting it to run in background and writes it's pid (the bash pid) on a file called "pid".
Everything explained above works as expected, apparently, but I'm also expecting that when the pid of the bash script is killed, the node script would stop running, I don't know why that made sense in my mind, but when it comes to practice, it doesn't work. The bash script is killed indeed, but the node script keeps running and that is freaking me out.
I've tested it in the terminal, by not sending the bash script to the background and entering ctrl+c, both scripts gets killed.
I'm obviously miss understanding something on the way the background process works. For god sake, can anybody help me?
There are lots of tools that let you do what you're trying, just two off the top of my head:
https://github.com/nodejitsu/forever - A simple CLI tool for ensuring that a given script runs continuously (i.e. forever)
https://github.com/remy/nodemon - Monitor for any changes in your node.js application and automatically restart the server - perfect for development
Maybe the second it's not what you're looking for, but still worth a look.
If you can't or don't want to use those then the problem is that if you kill the parent process the child one is still there, so, you should kill that too:
pkill -TERM -P $PID
where $PID is the parent PID.

linux: suspend process at startup

I would like to spawn a process suspended, possibly in the context of another user (e.g. via sudo -u ...), set up some iptables rules for the spawned process, continue running the process, and remove the iptable rules when the process exists.
Is there any standart means (bash, corutils, etc.) that allows me to achieve the above? In particular, how can I spawn a process in a suspended state and get its pid?
Write a wrapper script start-stopped.sh like this:
#!/bin/sh
kill -STOP $$ # suspend myself
# ... until I receive SIGCONT
exec $# # exec argument list
And then call it like:
sudo -u $SOME_USER start-stopped.sh mycommand & # start mycommand in stopped state
MYCOMMAND_PID=$!
setup_iptables $MYCOMMAND_PID # use its PID to setup iptables
sudo -u $SOME_USER kill -CONT $MYCOMMAND_PID # make mycommand continue
wait $MYCOMMAND_PID # wait for its termination
MYCOMMAND_EXIT_STATUS=$?
teardown_iptables # remove iptables rules
report $MYCOMMAND_EXIT_STATUS # report errors, if necessary
All this is overkill, however. You don't need to spawn your process in a suspended state to get the job done. Just make a wrapper script setup_iptables_and_start:
#!/bin/sh
setup_iptables $$ # use my own PID to setup iptables
exec sudo -u $SOME_USER $# # exec'ed command will have same PID
And then call it like
setup_iptables_and_start mycommand || report errors
teardown_iptables
You can write a C wrapper for your program that will do something like this :
fork and print child pid.
In the child, wait for user to press Enter. This puts the child in sleep and you can add the rules with the pid.
Once rules are added, user presses enter. The child runs your original program, either using exec or system.
Will this work?
Edit:
Actually you can do above procedure with a shell script. Try following bash script:
#!/bin/bash
echo "Pid is $$"
echo -n "Press Enter.."
read
exec $#
You can run this as /bin/bash ./run.sh <your command>
One way to do it is to enlist gdb to pause the program at the start of its main function (using the command "break main"). This will guarantee that the process is suspended fast enough (although some initialisation routines can run before main, they probably won't do anything relevant). However, for this you will need debugging information for the program you want to start suspended.
I suggest you try this manually first, see how it works, and then work out how to script what you've done.
Alternatively, it may be possible to constrain the process (if indeed that is what you're trying to do!) without using iptables, using SELinux or a ptrace-based tool like sydbox instead.
I suppose you could write a util yourself that forks, and wherein the child of the fork suspends itself just before doing an exec. Otherwise, consider using an LD_PRELOAD lib to do your 'custom' business.
If you care about making that secure, you should probably look at bigger guns (with chroot, perhaps paravirtualization, user mode linux etc. etc);
Last tip: if you don't mind doing some more coding, the ptrace interface should allow you to do what you describe (since it is used to implement debuggers with)
You probably need the PID of a program you're starting, before that program actually starts running. You could do it like this.
Start a plain script
Force the script to wait
You can probably use suspend which is a bash builitin but in the worst case you can make it stop itself with a signal
Use the PID of the bash process in every way you want
Restart the stopped bash process (SIGCONT) and do an exec - another builtin - starting your real process (it will inherit the PID)

Resources