Podman build cannot be stopped with CTRL+C in tmux - linux

I noticed that CTRL+C doesn't work well with podman build when I'm running it from within tmux. When I press CTRL+C I see a ^C appear in the terminal, but podman doesn't stop building the image immediately; I think it stops at some later point, but not right away.
Why is that?
//EDIT
I noticed that building with docker build allows me to stop with CTRL+C without any issue.

The process with PID 1 that runs inside your pod intercepts signals such as SIGTERM and SIGINT (from pressing CTRL+C).
It then sends a SIGTERM to all the processes under its management (anything running in the pod), so those processes have some time to terminate cleanly and gracefully.
If there are still processes around after a certain amount of time (10 seconds, I believe), the PID 1 process sends a SIGKILL to all remaining processes, which kills them instantly; after that, the PID 1 process itself can exit and let the entire PID namespace expire.
Basically, the short version is that the podman init process is trying to cleanly power things down instead of just ripping out the power cord.
If it bothers you, I imagine the 10-second delay is configurable somewhere (I've never looked). Another alternative is to send a stronger signal.
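The shutdown sequence described above can be sketched as a tiny init script. This is not podman's actual code; the 10-second grace period and the `sleep 2` stand-in workload are assumptions made for illustration:

```shell
#!/bin/sh
# Minimal init sketch: run one child, forward SIGTERM/SIGINT to it, and
# escalate to SIGKILL if it is still alive after a grace period.
# (Illustrative only: the 10-second figure and the sleep workload are
# placeholders, not podman's implementation.)

sleep 2 &                  # stands in for the container's real workload
child=$!

shutdown() {
    echo "forwarding SIGTERM to pid $child"
    kill -TERM "$child" 2>/dev/null       # ask the child to exit cleanly
    for _ in $(seq 1 10); do              # grace period: up to 10 seconds
        kill -0 "$child" 2>/dev/null || exit 0
        sleep 1
    done
    kill -KILL "$child" 2>/dev/null       # grace period expired
    exit 0
}
trap shutdown TERM INT

wait "$child"
```

A real init (tini, podman's conmon-managed runtime) also reaps reparented orphans; this sketch only shows the signal-forwarding part.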

Related

Long Running Python Script in VSCode Exits with 'Polite quit request'

I have a long running Python script which is running in Visual Studio Code.
After a while the script stops running, there are no errors just this statement:
"fish: “/usr/bin/python3 /home/ubuntu/.…” terminated by signal SIGTERM (Polite quit request)"
What is happening here?
If a process receives SIGTERM, some other process sent that signal. That is what happened in your case.
The SIGTERM signal is sent to a process to request its termination. Unlike the SIGKILL signal, it can be caught and interpreted or ignored by the process. This allows the process to perform nice termination releasing resources and saving state if appropriate. SIGINT is nearly identical to SIGTERM.
SIGTERM is not sent automatically by the system. There are a few signals that are sent automatically like SIGHUP when a terminal goes away, SIGSEGV/SIGBUS/SIGILL when a process does things it shouldn't be doing, SIGPIPE when it writes to a broken pipe/socket, etc.
SIGTERM is the signal that is typically used to administratively terminate a process.
That's not a signal that the kernel would send, but that's the signal a process would typically send to terminate (gracefully) another process. It is sent by default by the kill, pkill, killall, fuser -k commands.
Possible reasons why your process received such a signal are:
execution of the process takes too long
insufficient memory or system resources to continue the execution of the process
But these are just some possibilities; in your case, the root of the issue might be something different entirely. You can shield a process from SIGTERM by telling it to ignore the signal, but doing so is not recommended.
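Because SIGTERM can be caught (unlike SIGKILL), a process can run cleanup code before exiting. A minimal shell demonstration, sending the signal to itself:

```shell
#!/bin/sh
# SIGTERM can be caught: install a handler, then deliver the signal to
# ourselves. The handler runs instead of the default termination.

cleanup() {
    echo "caught SIGTERM, cleaning up"
    exit 0
}
trap cleanup TERM

kill -TERM $$       # send SIGTERM to this very process
echo "not reached"  # the handler exits before this line runs
```

The same pattern (minus the self-kill) is how daemons release resources and save state on an administrative shutdown.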

when will /proc/<pid> be removed?

Process A opened && mmapped thousands of files when running. Then kill -9 <pid of process A> is issued. I have a question about the ordering of the two events below.
a) /proc/<pid of process A> cannot be accessed.
b) all files opened by process A are closed.
More background about the question:
Process A is a multi-thread background service. It is started by cmd ./process_A args1 arg2 arg3.
There is also a watchdog process which periodically (every second) checks whether process A is still alive; if process A is dead, it restarts it. The watchdog checks process A as follows:
1) collect all numerical subdirectories under /proc/
2) compare /proc/<all-pids>/cmdline with the cmdline of process A. If some /proc/<some-pid>/cmdline matches, process A is alive and nothing is done; otherwise restart process A.
Process A does the following during initialization:
1) open fileA
2) flock fileA
3) mmap fileA into memory
4) close fileA
Process A will mmap thousands of files after initialization.
After several minutes, kill -9 <pid of process A> is issued.
The watchdog detects the death of process A and restarts it. But sometimes the new process A gets stuck at step 2, flock fileA. After some debugging, we found that the unlock of fileA happens when the old process A is killed, but sometimes that unlock happens only after the new process has already attempted step 2, flock fileA.
So we guess that checking process liveness by monitoring /proc/<pid of process A> is not correct.
then kill -9 is issued
This is a bad habit. You'd better send a SIGTERM first, because well-behaved processes and well-designed programs can catch it (and exit nicely and properly when getting a SIGTERM...). In some cases, I even recommend: send SIGTERM; wait two or three seconds; send SIGQUIT; wait two seconds; at last, send a SIGKILL (for those bad programs which have not been written properly or are misbehaving). Read signal(7) and signal-safety(7). In multi-threaded, but Linux-specific, programs, you might use signalfd(2) or the pipe(7)-to-self trick (well explained in the Qt documentation, but not Qt-specific).
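The escalation just described can be written down directly. In this sketch, `$1` is the PID of the target process (an assumption of the example), and the delays are the two-to-three seconds suggested above:

```shell
#!/bin/sh
# Graceful-kill sketch: SIGTERM first, a short wait, then SIGQUIT, and
# SIGKILL only as a last resort. Usage (assumed): ./gkill.sh <pid>

pid=$1

kill -TERM "$pid" 2>/dev/null
sleep 3
kill -0 "$pid" 2>/dev/null || exit 0    # already gone: done

kill -QUIT "$pid" 2>/dev/null
sleep 2
kill -0 "$pid" 2>/dev/null || exit 0

kill -KILL "$pid" 2>/dev/null           # cannot be caught or ignored
```

Well-behaved targets exit at the first step and the stronger signals are never sent.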
If your Linux system is systemd-based, you could imagine your program A being started through systemd facilities; then you would use systemd facilities to "communicate" with it. In some ways (I don't know the details), systemd is making signals almost obsolete. Notice that signals are not multi-thread friendly: they were designed, in the previous century, for single-threaded processes.
we guess the way to check process alive by monitor /proc/ is not correct.
The usual (and faster, and "atomic" enough) way to detect the existence of a process (on which you have enough privileges, e.g. which runs with your uid/gid) is to use kill(2) with a signal number (the second argument to kill) of 0. To quote that manpage:
If sig is 0, then no signal is sent, but existence and permission
checks are still performed; this can be used to check for the
existence of a process ID or process group ID that the caller is
permitted to signal.
Of course, that other process can still terminate before any further interaction with it, since Linux has preemptive scheduling.
Your watchdog process should rather use kill(pid-of-process-A, 0) to check the existence and liveness of process-A. Using /proc/pid-of-process-A/ is not the correct way to do that.
And whatever you code, that process-A could disappear asynchronously (in particular, if it has some bug that gives a segmentation fault). When a process terminates (even with a segmentation fault) the kernel is acting on its file locks (and "releases" them).
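From a shell, the same kill-with-signal-0 check is available as `kill -0`. A small sketch (the branch messages are illustrative; note that from the exit status alone you cannot distinguish "no such process" from "no permission"):

```shell
#!/bin/sh
# Liveness check with "signal 0": nothing is delivered, but the kernel
# still performs the PID lookup and the permission check, exactly as
# kill(2) describes.

pid=$$                       # check ourselves, as a demonstration
if kill -0 "$pid" 2>/dev/null; then
    echo "process $pid exists and we may signal it"
else
    echo "process $pid is gone (or we lack permission)"
fi
```

In C you would call `kill(pid, 0)` and inspect `errno` for ESRCH versus EPERM.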
Don't scan /proc/PID to find out if a specific process has terminated. There are lots of better ways to do that, such as having your watchdog program actually launch the server program and wait for it to terminate.
Or, have the watchdog listen on a TCP socket, and have the server process connect to it and send its PID. If either end dies, the other can notice the connection was closed (hint: send a heartbeat packet every so often, to detect a frozen peer). If the watchdog receives a connection from another server while the first is still running, it can decide to allow it or tell one of the instances to shut down (via TCP or kill()).
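The launch-and-wait approach suggested above can be sketched in a few lines of shell. The function name `run_watchdog` is an assumption of this sketch; `process_A` and its arguments are the ones from the question:

```shell
#!/bin/sh
# Watchdog sketch: instead of scanning /proc, start the server yourself
# and restart it whenever wait reports that it exited. Pass the server
# command line as arguments to run_watchdog.

run_watchdog() {
    while :; do
        "$@" &
        server=$!
        wait "$server"                  # blocks until the server terminates
        echo "server (pid $server) exited; restarting" >&2
        sleep 1                         # avoid a tight restart loop
    done
}

# Example invocation (from the question):
# run_watchdog ./process_A args1 arg2 arg3
```

Because the watchdog is the parent, `wait` is race-free: the PID cannot be reused before the watchdog has reaped it.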

Bash: Is it possible to stop a PID from being reused?

Is it possible to stop a PID from being reused?
For example if I run a job myjob in the background with myjob &, and get the PID using PID=$!, is it possible to prevent the linux system from re-using that PID until I have checked that the PID no longer exists (the process has finished)?
In other words I want to do something like:
myjob &
PID=$!
do_not_use_this_pid $PID
wait $PID
allow_use_of_this_pid $PID
The reasons for wanting to do this do not make much sense in the example given above, but consider launching multiple background jobs in series and then waiting for them all to finish.
Some programmer dude rightly points out that no two processes may share the same PID at the same time. That is correct, but not what I am asking here. I am asking for a method of preventing a PID from being re-used after a process has been launched with that PID, and then also a method of re-enabling its use later, after I have finished using it to check whether my original process finished.
Since it has been asked for, here is a use case:
launch multiple background jobs
get PID's of background jobs
prevent PID's from being re-used by another process after background job terminates
check for PID's of "background jobs" - ie, to ensure background jobs finish
[note: if PID re-use were disabled for the PIDs of the background jobs, those PIDs could not be taken by a new process launched after a background process terminated]*
re-enable PID of background jobs
repeat
*Further explanation:
Assume 10 jobs launched
Job 5 exits
New process started by another user, for example, they login to a tty
New process has same PID as Job 5!
Now our script checks for Job 5 termination, but sees PID in use by tty!
You can't "block" a PID from being reused by the kernel. However, I am inclined to think this isn't really a problem for you.
but consider launching multiple background jobs in series and then waiting for them all to finish.
A simple wait (without arguments) would wait for all the child processes to complete. So you don't need to worry about the PIDs being reused.
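A bare `wait` makes the whole PID-tracking question moot:

```shell
#!/bin/sh
# A bare `wait` blocks until every background child has exited, so there
# is no need to track (or re-check) individual PIDs at all.

sleep 1 & sleep 2 & sleep 3 &
wait                 # returns only after all three jobs have finished
echo "all jobs done"
```

Since the shell only reaps a child when you `wait` for it, none of these PIDs can be recycled while the loop-free `wait` is still pending.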
When you launch several background process, it's indeed possible that PIDs may be reused by other processes.
But it's not a problem because you can't wait on a process unless it's your child process.
Otherwise, checking whether one of the background jobs you started has completed by any means other than wait is always going to be unreliable.
Unless you've retrieved the exit status of the child process, it will continue to exist in the kernel (as a zombie). That also means its PID stays bound to it and can't be re-used during that time.
Further suggestion to work around this - if you suspect that a PID assigned to one of your background jobs is reassigned, check it in ps to see if it still is your process with your executable and has PPID (parent PID) 1.
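That ps cross-check can look like this (a sketch; here we only compare the command name, using a throwaway `sleep` as the job):

```shell
#!/bin/sh
# Double-check that a remembered PID still refers to the job we started:
# compare the command name (and, for an orphaned job, the parent PID)
# that ps reports against what we launched.

sleep 60 &
pid=$!
ps -o ppid=,comm= -p "$pid"    # should show this shell's PID and "sleep"
kill "$pid"
```

This narrows the race but does not eliminate it: a recycled PID could in principle belong to another instance of the same executable, which is why `wait` remains the reliable answer.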
If you are afraid of PIDs being reused (which won't happen if you wait, as the other answers explain), you can use
echo 4194303 > /proc/sys/kernel/pid_max
to decrease your fear ;-)

What special precautions must I make for docker apps running as pid 1?

From what I gather, programs that run as pid 1 may need to take special precautions such as capturing certain signals.
It's not altogether clear how to correctly write a pid 1. I'd rather not use runit or supervisor in my case. For example, supervisor is written in python and if you install that, it'll result in a much larger container. I'm not a fan of runit.
Looking at the source code for runit is interesting, but as usual, comments are virtually non-existent and don't explain what is being done, or why.
There is a good discussion here:
When the process with PID 1 dies for any reason, all other processes are killed with the KILL signal.
When any process having children dies for any reason, its children are reparented to the process with PID 1.
Many signals which have a default action of Term do not have one for PID 1.
The relevant part for your question:
you can't stop the process by sending SIGTERM or SIGINT if the process has not installed a signal handler

Child processes won't die in Jenkins environment

I'm developing code for Linux, and cannot seem to kill processes when running in a Jenkins environment.
I have test script that spawns processes and cleans them up as it goes through the tests. One of the processes also spawns and cleans up one of its own subprocesses. All of the "cleanup" is done by sending a SIGINT, followed by a wait. Everything works fine with this when run from a terminal, except when running through Jenkins.
When the same exact thing is run in Jenkins, processes killed with SIGINT do not die, and the call to wait blocks forever. This wreaks havoc on my test. I could update the logic to not do a blocking wait, but I don't feel I should have to change my production code to accommodate Jenkins.
Any ideas?
Process tree killer may be your answer - https://wiki.jenkins-ci.org/display/JENKINS/ProcessTreeKiller
In testing, this would usually work when I ran the tests from the command line, but it would almost always fail when that unit-test script was called from another script. Frankly, it was bizarre.
Then I realized that when I had stray processes, they would indeed go away when I killed them with SIGTERM. But WHY?????
I didn't find a 100%-definitive answer. But thinking about it logically, if the process is not attached to a terminal, then maybe the "terminal interrupt" signal (SIGINT), wouldn't work...?
In doing some reading, what I learned is that, basically, when it's a shell that executes a process, the SIGINT action may be set to 'ignore'. That makes sense (to me, anyway), because you wouldn't want CTRL-C at the command line to kill all of your background processes:
When the shell executes a process “in the background” (or when another background process executes another process), the newly executed process should ignore the interrupt and quit characters. Thus, before a shell executes a background process, it should set SIGINT and SIGQUIT to SIG_IGN.
Our production code isn't a shell, but it is started from a shell, and Jenkins uses /bin/sh to run stuff. So, this would add up.
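On Linux you can actually observe this: /proc/&lt;pid&gt;/status exposes the ignored-signal mask, and SIGINT (signal 2) corresponds to bit 0x2 of that mask. A small check (Linux-specific, and assuming a non-interactive shell without job control, which is what Jenkins's /bin/sh invocation gives you):

```shell
#!/bin/sh
# A non-interactive shell without job control should start background
# jobs with SIGINT and SIGQUIT set to SIG_IGN. SigIgn in
# /proc/<pid>/status is a hex bitmask; SIGINT is signal 2 -> bit 0x2.

sleep 30 &
grep SigIgn "/proc/$!/status"   # the 0x2 (SIGINT) bit should be set
kill "$!"
```

If the 0x2 bit is set, sending SIGINT to that job is a no-op, which matches the behaviour seen under Jenkins.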
So, since there is an implied association between SIGINT and the existence of a TTY, SIGTERM is a better option for killing your own background processes:
It should be noted that SIGINT is nearly identical to SIGTERM. (1)
I've changed the code that kills the proxyserver processes, and the Python unit test code, to use the SIGTERM signal. Now everything runs at the terminal and in Jenkins.
