Is there a simple way I can completely pause an unprivileged docker container from the inside while retaining the ability to unpause/exec it from the outside?
TL;DR
For a Linux container, the answer is yes, because these two have a comparable effect:
From host:
docker pause [container-id]
From the container:
kill -SIGSTOP [process(es)-id]
or, even shorter
kill -SIGSTOP -1
Mind that:
If your process ID (PID) is 1, you fall under an edge case, because PID 1, the init process, has a specific meaning and behaviour in Linux.
Some processes spawn child workers, as in the NGINX example below.
And these two are likewise comparable:
From host:
docker unpause [container-id]
From the container:
kill -SIGCONT [process(es)-id]
or, even shorter
kill -SIGCONT -1
Also mind that in some edge cases this won't work. An ordinary process cannot catch or ignore SIGSTOP, but the kernel does not deliver it to PID 1 of a PID namespace when it is sent from inside that namespace, so the container's main process keeps running.
In those cases, you will have to
either be a privileged user, because the cgroup freezer lives under a directory that is read-only by default in Docker; this will probably lead to a dead end anyway, because you will not be able to get into the container anymore,
or run your container with the --init flag, so that PID 1 is just a wrapper process started by Docker and you no longer need to pause it in order to pause the processes running inside your container.
You can use the --init flag to indicate that an init process should be used as the PID 1 in the container. Specifying an init process ensures the usual responsibilities of an init system, such as reaping zombie processes, are performed inside the created container.
The default init process used is the first docker-init executable found in the system path of the Docker daemon process. This docker-init binary, included in the default installation, is backed by tini.
This is definitely possible for Linux containers, and the documentation hints at it: docker pause [container-id] relies on a mechanism (the freezer cgroup) whose effect is comparable to sending the SIGSTOP signal to the processes running in your container.
The docker pause command suspends all processes in the specified containers. On Linux, this uses the freezer cgroup. Traditionally, when suspending a process the SIGSTOP signal is used, which is observable by the process being suspended. With the freezer cgroup the process is unaware, and unable to capture, that it is being suspended, and subsequently resumed. On Windows, only Hyper-V containers can be paused.
See the freezer cgroup documentation for further details.
Source: https://docs.docker.com/engine/reference/commandline/pause/
Here is an example with an NGINX Alpine container:
### For now, we are on the host machine
$ docker run -p 8080:80 -d nginx:alpine
f444eaf8464e30c18f7f83bb0d1bd07b48d0d99f9d9e588b2bd77659db520524
### Testing if NGINX answers, successful
$ curl -I -m 1 http://localhost:8080/
HTTP/1.1 200 OK
Server: nginx/1.19.0
Date: Sun, 28 Jun 2020 11:49:33 GMT
Content-Type: text/html
Content-Length: 612
Last-Modified: Tue, 26 May 2020 15:37:18 GMT
Connection: keep-alive
ETag: "5ecd37ae-264"
Accept-Ranges: bytes
### Jumping into the container
$ docker exec -ti f7a2be0e230b9f7937d90954ef03502993857c5081ab20ed9a943a35687fbca4 ash
### This is the container, now, let's see the processes running
/ # ps -o pid,vsz,rss,tty,stat,time,ruser,args
PID VSZ RSS TT STAT TIME RUSER COMMAND
1 6000 4536 ? S 0:00 root nginx: master process nginx -g daemon off;
29 6440 1828 ? S 0:00 nginx nginx: worker process
30 6440 1828 ? S 0:00 nginx nginx: worker process
31 6440 1828 ? S 0:00 nginx nginx: worker process
32 6440 1828 ? S 0:00 nginx nginx: worker process
49 1648 1052 136,0 S 0:00 root ash
55 1576 4 136,0 R 0:00 root ps -o pid,vsz,rss,tty,stat,time,ruser,args
### Now let's send the SIGSTOP signal to the NGINX workers, with an effect similar to docker pause
/ # kill -SIGSTOP 29 30 31 32
### Running ps again just to observe the T (stopped) state of the processes
/ # ps -o pid,vsz,rss,tty,stat,time,ruser,args
PID VSZ RSS TT STAT TIME RUSER COMMAND
1 6000 4536 ? S 0:00 root nginx: master process nginx -g daemon off;
29 6440 1828 ? T 0:00 nginx nginx: worker process
30 6440 1828 ? T 0:00 nginx nginx: worker process
31 6440 1828 ? T 0:00 nginx nginx: worker process
32 6440 1828 ? T 0:00 nginx nginx: worker process
57 1648 1052 136,0 S 0:00 root ash
63 1576 4 136,0 R 0:00 root ps -o pid,vsz,rss,tty,stat,time,ruser,args
/ # exit
### Back on the host to confirm NGINX doesn't answer anymore
$ curl -I -m 1 http://localhost:8080/
curl: (28) Operation timed out after 1000 milliseconds with 0 bytes received
$ docker exec -ti f7a2be0e230b9f7937d90954ef03502993857c5081ab20ed9a943a35687fbca4 ash
### Sending the SIGCONT signal, with an effect similar to docker unpause
/ # kill -SIGCONT 29 30 31 32
/ # ps -o pid,vsz,rss,tty,stat,time,ruser,args
PID VSZ RSS TT STAT TIME RUSER COMMAND
1 6000 4536 ? S 0:00 root nginx: master process nginx -g daemon off;
29 6440 1828 ? S 0:00 nginx nginx: worker process
30 6440 1828 ? S 0:00 nginx nginx: worker process
31 6440 1828 ? S 0:00 nginx nginx: worker process
32 6440 1828 ? S 0:00 nginx nginx: worker process
57 1648 1052 136,0 S 0:00 root ash
62 1576 4 136,0 R 0:00 root ps -o pid,vsz,rss,tty,stat,time,ruser,args 29 30 31 32
/ # exit
### Back on the host to confirm NGINX is back
$ curl -I http://localhost:8080/
HTTP/1.1 200 OK
Server: nginx/1.19.0
Date: Sun, 28 Jun 2020 11:56:23 GMT
Content-Type: text/html
Content-Length: 612
Last-Modified: Tue, 26 May 2020 15:37:18 GMT
Connection: keep-alive
ETag: "5ecd37ae-264"
Accept-Ranges: bytes
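The stop/continue cycle from the session above can be tried on a single throwaway process, without NGINX or Docker. This is only a sketch under a few assumptions: it runs on Linux with /proc mounted, the sleep process is a stand-in for whatever you want to pause, and proc_state is a helper name made up for this example.

```shell
#!/bin/sh
# Sketch (Linux only): pause and resume one process with SIGSTOP / SIGCONT.

# Print the one-letter state (R, S, T, ...) of a PID, read from /proc.
# proc_state is a hypothetical helper, not a standard tool.
proc_state() {
    awk '/^State:/ { print $2 }' "/proc/$1/status"
}

sleep 30 &                 # stand-in for the process to pause (assumption)
pid=$!

kill -STOP "$pid"          # pause
sleep 1                    # give the kernel a moment to apply the signal
echo "after SIGSTOP: $(proc_state "$pid")"

kill -CONT "$pid"          # resume
sleep 1
echo "after SIGCONT: $(proc_state "$pid")"

kill "$pid"                # clean up the stand-in process
```

The state letter should read T while the process is stopped and go back to a sleeping or running state after SIGCONT, matching the STAT column in the ps output above.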
For the cases where the meaningful process is PID 1, and is therefore protected by the Linux kernel, you might want to try the --init flag when running your container, so that Docker creates a wrapper process as PID 1 and your application runs as an ordinary child process that can be stopped.
$ docker run -p 8080:80 -d --init nginx:alpine
e61e9158b2aab95007b97aa50bc77fff6b5c15cf3b16aa20a486891724bec6e9
$ docker exec -ti e61e9158b2aab95007b97aa50bc77fff6b5c15cf3b16aa20a486891724bec6e9 ash
/ # ps -o pid,vsz,rss,tty,stat,time,ruser,args
PID VSZ RSS TT STAT TIME RUSER COMMAND
1 1052 4 ? S 0:00 root /sbin/docker-init -- /docker-entrypoint.sh nginx -g daemon off;
7 6000 4320 ? S 0:00 root nginx: master process nginx -g daemon off;
31 6440 1820 ? S 0:00 nginx nginx: worker process
32 6440 1820 ? S 0:00 nginx nginx: worker process
33 6440 1820 ? S 0:00 nginx nginx: worker process
34 6440 1820 ? S 0:00 nginx nginx: worker process
35 1648 4 136,0 S 0:00 root ash
40 1576 4 136,0 R 0:00 root ps -o pid,vsz,rss,tty,stat,time,ruser,args
See how nginx: master process nginx -g daemon off;, which was PID 1 in the previous use case, has now become PID 7?
This enables us to run kill -SIGSTOP -1 and be sure all meaningful processes are stopped, while still not being locked out of the container.
While digging into this, I found this blog post that seems like a good read on the topic: https://major.io/2009/06/15/two-great-signals-sigstop-and-sigcont/
Also related is this extract from the ps manual page about process state codes:
Here are the different values that the s, stat and state output
specifiers (header "STAT" or "S") will display to describe the state
of a process:
D uninterruptible sleep (usually IO)
I Idle kernel thread
R running or runnable (on run queue)
S interruptible sleep (waiting for an event to complete)
T stopped by job control signal
t stopped by debugger during the tracing
W paging (not valid since the 2.6.xx kernel)
X dead (should never be seen)
Z defunct ("zombie") process, terminated but not reaped by
its parent
For BSD formats and when the stat keyword is used, additional
characters may be displayed:
< high-priority (not nice to other users)
N low-priority (nice to other users)
L has pages locked into memory (for real-time and custom
IO)
s is a session leader
l is multi-threaded (using CLONE_THREAD, like NPTL
pthreads do)
+ is in the foreground process group
Source: https://man7.org/linux/man-pages/man1/ps.1.html#PROCESS_STATE_CODES
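To make the table above easier to use in a script, the first character of a STAT value can be mapped back to its description. The following is a minimal sketch; describe_state is a name invented for this example, and the descriptions are shortened from the manual page extract.

```shell
#!/bin/sh
# Map the first character of a ps STAT value (e.g. "Ss", "T", "S<")
# to a short description taken from the ps manual page.
# describe_state is a hypothetical helper name.
describe_state() {
    case "$1" in
        D*) echo "uninterruptible sleep" ;;
        I*) echo "idle kernel thread" ;;
        R*) echo "running or runnable" ;;
        S*) echo "interruptible sleep" ;;
        T*) echo "stopped by job control signal" ;;
        t*) echo "stopped by debugger" ;;
        W*) echo "paging" ;;
        X*) echo "dead" ;;
        Z*) echo "zombie" ;;
        *)  echo "unknown" ;;
    esac
}

describe_state "Ss"
describe_state "T"
```

Where a ps that supports it is available, the helper could be fed with something like describe_state "$(ps -o stat= -p "$pid")".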
Running the docker pause command from the inside is not possible for an unprivileged container. It would need access to the Docker daemon, which means mounting the socket.
You would need to build a custom solution. Just the basic idea: you could bind-mount a folder from the host. Inside this folder you create a file which acts as a lock. When you want to pause inside the container, you create the file. While the file exists, you actively wait/sleep. As soon as the host deletes the file at the mounted path, your code resumes. That is a rather naive approach because you actively wait, but it should do the trick.
You can also look into inotify to avoid active waiting.
https://lwn.net/Articles/604686/
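The lock-file idea described above could look roughly like this. It is a sketch, not a complete solution: pause_until_unlocked is a made-up name, the one-second polling interval is arbitrary, and the temporary directory plus the background rm only stand in for the bind-mounted folder and for the host deleting the file. Note that it pauses only the script that calls it, not every process in the container.

```shell
#!/bin/sh
# Create a lock file, then block until someone else removes it.
# pause_until_unlocked is a hypothetical helper name.
pause_until_unlocked() {
    lock="$1"
    mkdir -p "$(dirname "$lock")"
    : > "$lock"                 # "pause": create the lock file
    while [ -e "$lock" ]; do
        sleep 1                 # polling interval (arbitrary choice)
    done
}

# Demo: a temporary directory stands in for the bind-mounted folder,
# and a background job stands in for the host removing the lock.
demo_dir=$(mktemp -d)
( sleep 2; rm -f "$demo_dir/pause.lock" ) &
pause_until_unlocked "$demo_dir/pause.lock"
echo "resumed"
```

In a real setup the path would point into the directory mounted from the host, and the host would resume the container's script by deleting that file.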
I'm trying to stop the Apache2 service, but I get a PID error:
#service apache2 stop
[FAIL] Stopping web server: apache2 failed!
[....] There are processes named 'apache2' running which do not match your pid file which are left untouched in the name of safety, Please review the situation by hand. ... (warning).
Trying to kill those processes:
#kill -9 $(ps aux | grep apache2 | awk '{print $2}')
but they get re-spawned again:
#ps aux | grep apache2
root 19279 0.0 0.0 4080 348 ? Ss 05:10 0:00 runsv apache2
root 19280 0.0 0.0 4316 648 ? S 05:10 0:00 /bin/sh /usr/sbin/apache2ctl -D FOREGROUND
root 19282 0.0 0.0 91344 5424 ? S 05:10 0:00 /usr/sbin/apache2 -D FOREGROUND
www-data 19284 0.0 0.0 380500 2812 ? Sl 05:10 0:00 /usr/sbin/apache2 -D FOREGROUND
www-data 19285 0.0 0.0 380500 2812 ? Sl 05:10 0:00 /usr/sbin/apache2 -D FOREGROUND
And even though the processes are running, I can't connect to the server on port 80. /var/log/apache2/error.log.1 has no new messages when I do the kill -9.
Before I tried to restart, everything worked perfectly.
Running on Debian: Linux adara 3.2.0-4-amd64 #1 SMP Debian 3.2.54-2 x86_64 GNU/Linux
UPD:
I also tried apache2ctl:
#/usr/sbin/apache2ctl -k stop
AH00526: Syntax error on line 76 of /etc/apache2/apache2.conf:
PidFile takes one argument, A file for logging the server process ID
Action '-k stop' failed.
The Apache error log may have more information.
but there is no PID file in /var/run/apache2.
I'm new to Linux. It looks like it has something to do with startup scripts, but I can't figure out what exactly.
The command below shows which process is using port 80:
lsof -i tcp:80
Kill the process using its PID. Then restart the system once to check whether there is a startup script that runs and takes port 80, which would prevent you from starting your service.
For startup scripts, you can check
/etc/init.d/, /etc/rc.local, or crontab -e
You can also consult the official Apache documentation on stop/restart operations.
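When killed processes keep coming back, it can help to see what their parent is. In the ps output from the question, the first entry is runsv apache2; runsv is the per-service supervisor from runit, and a supervisor restarts its service whenever it exits. The sketch below walks the parent chain of a PID through /proc so that such a supervisor shows up; parent_chain is a name invented for this example, and Linux with /proc mounted is assumed.

```shell
#!/bin/sh
# Print "PID NAME" for a process and each of its ancestors, read from /proc.
# parent_chain is a hypothetical helper name.
parent_chain() {
    pid="$1"
    while [ "${pid:-0}" -gt 0 ] && [ -r "/proc/$pid/status" ]; do
        name=$(sed -n 's/^Name:[[:space:]]*//p' "/proc/$pid/status")
        echo "$pid $name"
        # Move on to the parent; PPid is 0 at the top of the tree.
        pid=$(awk '/^PPid:/ { print $2 }' "/proc/$pid/status")
    done
}

# Demo on the current shell; for the question it would be a PID
# such as 19282 taken from the ps output.
parent_chain "$$"
```

If the chain does lead to a supervisor such as runsv, the service would normally be stopped through that supervisor (for runit, something like sv down apache2) rather than with kill, but whether that applies here is an assumption based only on the ps output.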
$ jcmd -l
418 sun.tools.jcmd.JCmd -l
$ jstat -gcutil -t 10 250ms 1
10 not found
I am aware of the bug in the JDK related to attaching jstat as root to a process running as a different user.
Here, this Docker container has a single user, root, and as can be seen from the ps output below, Cassandra is running as root.
$ whoami
root
I have tried to do the following:
$ sudo -u root jcmd -l
Any help is appreciated.
Docker container is debian:jessie
running java version:
openjdk version "1.8.0_66-internal"
Here's the output of ps -ef:
UID PID PPID C STIME TTY TIME CMD
root 1 0 0 17:40 ? 00:00:00 /bin/bash /run.sh
root 10 1 11 17:40 ? 00:02:25 java -ea -javaagent:/usr/share/c
root 375 0 0 17:49 ? 00:00:00 bash
root 451 375 0 18:00 ? 00:00:00 ps -ef
Aside: jstack successfully dumps out the stack traces of the threads.
I know at least two possible reasons why this can happen.
Java is run with the -XX:+PerfDisableSharedMem option. This option sometimes helps to reduce JVM safepoint pauses, but it also makes the JVM invisible to jps and jstat. This is a very likely case, because you are running Cassandra, and recent Cassandra versions have this option ON by default.
The Java process has a different mount namespace, so that /tmp of the Java process is not physically the same directory as /tmp of your shell. The directory /tmp/hsperfdata_root must be accessible in order to use jps or jstat. This is also a plausible reason, since you are using Docker containers.
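Both possibilities can be checked from a shell in the container. This is a rough sketch under some assumptions: it only looks at the process command line (so it would miss the flag if it were set some other way), it assumes the perf data file is named after the PID under /tmp/hsperfdata_<user>, and has_perf_disabled and has_perf_data are names made up for this example.

```shell
#!/bin/sh
# Return 0 if the NUL-separated command line stored in FILE
# (e.g. /proc/<pid>/cmdline) contains -XX:+PerfDisableSharedMem.
# Hypothetical helper name.
has_perf_disabled() {
    tr '\0' '\n' < "$1" | grep -qxF -- '-XX:+PerfDisableSharedMem'
}

# Return 0 if the perf data file for PID exists for USER.
# Usage: has_perf_data PID USER [TMP_DIR]   (TMP_DIR defaults to /tmp)
# Hypothetical helper name.
has_perf_data() {
    [ -e "${3:-/tmp}/hsperfdata_$2/$1" ]
}
```

With the PID and user from the question, the checks would be has_perf_disabled /proc/10/cmdline and has_perf_data 10 root. If the first succeeds, the flag explains the behaviour; if the second fails from your shell, the /tmp directories probably differ.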
I'm new to Linux.
How can I show a list of all processes that tells me, for each process, whether it is running or suspended?
I've tried
ps -ef|grep myusername
but it doesn't say if the processes are running or not.
I also tried
ps ux
Same thing: it doesn't say if the processes are running or not.
I'm looking for something like this list:
I get this list when I move a process to background, I don't know how to see it otherwise...
You can use ps to list processes; ps aux lists all of them. Example output is shown below.
ps aux | more
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 1 0.0 0.1 189160 9376 ? Ss 15:51 0:04 /usr/lib/systemd/systemd --switched-root --system --deserialize 20
root 2 0.0 0.0 0 0 ? S 15:51 0:00 [kthreadd]
root 3 0.0 0.0 0 0 ? S 15:51 0:00 [ksoftirqd/0]
root 5 0.0 0.0 0 0 ? S< 15:51 0:00 [kworker/0:0H]
root 7 0.0 0.0 0 0 ? S 15:51 0:06 [rcu_sched]
root 8 0.0 0.0 0 0 ? S 15:51 0:00 [rcu_bh]
root 9 0.0 0.0 0 0 ? S 15:51 0:04 [rcuos/0]
By checking the STAT column you can identify each process's state. Below are some of the possible state codes.
R running or runnable (on run queue)
D uninterruptible sleep (usually IO)
S interruptible sleep (waiting for an event to complete)
Z defunct/zombie, terminated but not reaped by its parent
T stopped, either by a job control signal or because it is being
traced
You can type "man ps" to get more info.
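For a listing limited to your own processes, the same state letters can also be read straight from /proc. The sketch below assumes Linux with /proc mounted; list_states is a name invented for this example. Where procps is installed, something like ps -u "$(id -un)" -o pid,stat,args gives a similar view with less effort.

```shell
#!/bin/sh
# Print "PID STATE NAME" for every process owned by the given numeric UID,
# using the State, Name and Uid fields of /proc/<pid>/status.
# list_states is a hypothetical helper name.
list_states() {
    uid="$1"
    for d in /proc/[0-9]*; do
        [ -r "$d/status" ] || continue
        awk -v uid="$uid" -v pid="${d#/proc/}" '
            /^Name:/  { name = $2 }
            /^State:/ { state = $2 }
            /^Uid:/   { owner = $2 }
            END { if (owner == uid) print pid, state, name }
        ' "$d/status" 2>/dev/null || :   # the process may have exited meanwhile
    done
}

list_states "$(id -u)"
```

A suspended process shows up with state T, as in the table above.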
You can use htop to see the list of processes; it has a column for process state.
What does a C process status mean in htop?
http://www.howtogeek.com/howto/ubuntu/using-htop-to-monitor-system-processes-on-linux/
ps -p PID -o comm=
Run the command above, replacing PID with the ID of the process.
The following command will be more helpful to you.
Use the command: sudo lsof -i -n -P
This command lists the application name, PID, user, IP version, device ID, and the node with port name. It shows both TCP and UDP.
Variations:
To page through the output in a readable way, use:
sudo lsof -i -n -P | more
To view only TCP connections:
sudo lsof -i -n -P | grep TCP | more
To view only UDP connections:
sudo lsof -i -n -P | grep UDP | more