`ps` of a specific container from the host - Linux

On the host, is there any way to get the ps output of a specific container?
If a container whose cgroup is foo has processes bar, baz, and bam,
then something like ps --cgroup-id foo should print the result of ps as if run inside the container (cgroup), as follows:
PID USER TIME COMMAND
1 root 0:00 bar
60 root 0:00 baz
206 root 0:00 bam
It doesn't have to be ps though; I hope it can be done with just one or two commands.
Thanks!

There's a docker top command, e.g.:
$ docker top 9f2
UID PID PPID C STIME TTY TIME CMD
root 20659 20621 0 Oct08 ? 00:00:00 nginx: master process nginx -g daemon off;
systemd+ 20825 20659 0 Oct08 ? 00:00:00 nginx: worker process
systemd+ 20826 20659 0 Oct08 ? 00:00:00 nginx: worker process
systemd+ 20827 20659 0 Oct08 ? 00:00:00 nginx: worker process
systemd+ 20828 20659 0 Oct08 ? 00:00:00 nginx: worker process
systemd+ 20829 20659 0 Oct08 ? 00:00:00 nginx: worker process
systemd+ 20830 20659 0 Oct08 ? 00:00:00 nginx: worker process
systemd+ 20831 20659 0 Oct08 ? 00:00:00 nginx: worker process
systemd+ 20832 20659 0 Oct08 ? 00:00:00 nginx: worker process
Or you can exec into the container if the container ships with ps:
docker exec $container_name ps
And if ps isn't included in the container, you can run a different container in the same pid namespace:
$ docker run --pid container:9f2 busybox ps -ef
PID USER TIME COMMAND
1 root 0:00 nginx: master process nginx -g daemon off;
23 101 0:00 nginx: worker process
24 101 0:00 nginx: worker process
25 101 0:00 nginx: worker process
26 101 0:00 nginx: worker process
27 101 0:00 nginx: worker process
28 101 0:00 nginx: worker process
29 101 0:00 nginx: worker process
30 101 0:00 nginx: worker process
31 root 0:00 ps -ef
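If you prefer to stay entirely on the host, another option is to read the PIDs straight out of the container's cgroup and hand them to ps. This is only a sketch: the cgroup path below assumes cgroup v2 with Docker's default systemd cgroup driver (it differs on cgroup v1 and across distros), and the PIDs printed are host PIDs rather than the container's own:
# Resolve the full container ID from a prefix such as 9f2
FULL_ID=$(docker inspect --format '{{.Id}}' 9f2)
# cgroup.procs lists every PID that belongs to the container's cgroup
PROCS=/sys/fs/cgroup/system.slice/docker-${FULL_ID}.scope/cgroup.procs
# Join the PIDs with commas and let ps format them
ps -o pid,user,time,cmd -p "$(paste -sd, "$PROCS")"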

Related

How does CircleCI achieve "merged" Docker cgroup namespaces?

I was trying to debug a failed test job in a CircleCI workflow which had a config similar to this:
integration_tests:
  docker:
    - image: my-group/our-custom-image:latest
    - image: postgres:9.6.11
      environment:
        POSTGRES_USER: user
        POSTGRES_PASSWORD: password
        POSTGRES_DB: db
  steps:
    - stuff && things
When I ran the job with SSH debugging and SSH'ed to where the CircleCI app told me, I found myself in a strange maze of twisty little namespaces, all alike. I ran ps awwwx and could see processes from both of the two Docker containers:
root@90c93bcdd369:~# ps awwwx
PID TTY STAT TIME COMMAND
1 pts/0 Ss 0:00 /dev/init -- /bin/sh
6 pts/0 S+ 0:00 /bin/sh
7 pts/0 Ss+ 0:00 postgres
40 ? Ssl 0:02 /bin/circleci-agent ...
105 ? Ss 0:00 postgres: checkpointer process
106 ? Ss 0:00 postgres: writer process
107 ? Ss 0:00 postgres: wal writer process
108 ? Ss 0:00 postgres: autovacuum launcher process
109 ? Ss 0:00 postgres: stats collector process
153 pts/1 Ss+ 0:00 bash "stuff && things"
257 pts/1 Sl+ 0:31 /path/to/our/application
359 pts/2 Ss 0:00 -bash
369 pts/2 R+ 0:00 ps awwwx
It seems like they somehow "merged" the cgroup namespaces of the two Docker containers into a third namespace, under which the shell they provided me resides: PID 7 is running from one Docker container, and PID 257 is the application running inside the my-group/our-custom-image:latest container.
The cgroup view from /proc also looks like some kind of merging is going on:
root@90c93bcdd369:~# cat /proc/7/cgroup
12:devices:/docker/94376e7880579c6bde0622017594fcdb8d5767788bb4790c0f014db282198577/d8fc5294708fd4cf91fa405d6462571e1dc56413b55a6b6e5790b8f158fee632
11:blkio:/docker/94376e7880579c6bde0622017594fcdb8d5767788bb4790c0f014db282198577/d8fc5294708fd4cf91fa405d6462571e1dc56413b55a6b6e5790b8f158fee632
10:memory:/docker/94376e7880579c6bde0622017594fcdb8d5767788bb4790c0f014db282198577/d8fc5294708fd4cf91fa405d6462571e1dc56413b55a6b6e5790b8f158fee632
9:hugetlb:/docker/94376e7880579c6bde0622017594fcdb8d5767788bb4790c0f014db282198577/d8fc5294708fd4cf91fa405d6462571e1dc56413b55a6b6e5790b8f158fee632
8:perf_event:/docker/94376e7880579c6bde0622017594fcdb8d5767788bb4790c0f014db282198577/d8fc5294708fd4cf91fa405d6462571e1dc56413b55a6b6e5790b8f158fee632
7:net_cls,net_prio:/docker/94376e7880579c6bde0622017594fcdb8d5767788bb4790c0f014db282198577/d8fc5294708fd4cf91fa405d6462571e1dc56413b55a6b6e5790b8f158fee632
6:rdma:/
5:cpu,cpuacct:/docker/94376e7880579c6bde0622017594fcdb8d5767788bb4790c0f014db282198577/d8fc5294708fd4cf91fa405d6462571e1dc56413b55a6b6e5790b8f158fee632
4:cpuset:/docker/94376e7880579c6bde0622017594fcdb8d5767788bb4790c0f014db282198577/d8fc5294708fd4cf91fa405d6462571e1dc56413b55a6b6e5790b8f158fee632
3:pids:/docker/94376e7880579c6bde0622017594fcdb8d5767788bb4790c0f014db282198577/d8fc5294708fd4cf91fa405d6462571e1dc56413b55a6b6e5790b8f158fee632
2:freezer:/docker/94376e7880579c6bde0622017594fcdb8d5767788bb4790c0f014db282198577/d8fc5294708fd4cf91fa405d6462571e1dc56413b55a6b6e5790b8f158fee632
1:name=systemd:/docker/94376e7880579c6bde0622017594fcdb8d5767788bb4790c0f014db282198577/d8fc5294708fd4cf91fa405d6462571e1dc56413b55a6b6e5790b8f158fee632
0::/system.slice/containerd.service
root@90c93bcdd369:~# cat /proc/257/cgroup
12:devices:/docker/94376e7880579c6bde0622017594fcdb8d5767788bb4790c0f014db282198577/90c93bcdd3693a918adddf62939c5b31e86868864edabe7347a268149e797f43
11:blkio:/docker/94376e7880579c6bde0622017594fcdb8d5767788bb4790c0f014db282198577/90c93bcdd3693a918adddf62939c5b31e86868864edabe7347a268149e797f43
10:memory:/docker/94376e7880579c6bde0622017594fcdb8d5767788bb4790c0f014db282198577/90c93bcdd3693a918adddf62939c5b31e86868864edabe7347a268149e797f43
9:hugetlb:/docker/94376e7880579c6bde0622017594fcdb8d5767788bb4790c0f014db282198577/90c93bcdd3693a918adddf62939c5b31e86868864edabe7347a268149e797f43
8:perf_event:/docker/94376e7880579c6bde0622017594fcdb8d5767788bb4790c0f014db282198577/90c93bcdd3693a918adddf62939c5b31e86868864edabe7347a268149e797f43
7:net_cls,net_prio:/docker/94376e7880579c6bde0622017594fcdb8d5767788bb4790c0f014db282198577/90c93bcdd3693a918adddf62939c5b31e86868864edabe7347a268149e797f43
6:rdma:/
5:cpu,cpuacct:/docker/94376e7880579c6bde0622017594fcdb8d5767788bb4790c0f014db282198577/90c93bcdd3693a918adddf62939c5b31e86868864edabe7347a268149e797f43
4:cpuset:/docker/94376e7880579c6bde0622017594fcdb8d5767788bb4790c0f014db282198577/90c93bcdd3693a918adddf62939c5b31e86868864edabe7347a268149e797f43
3:pids:/docker/94376e7880579c6bde0622017594fcdb8d5767788bb4790c0f014db282198577/90c93bcdd3693a918adddf62939c5b31e86868864edabe7347a268149e797f43
2:freezer:/docker/94376e7880579c6bde0622017594fcdb8d5767788bb4790c0f014db282198577/90c93bcdd3693a918adddf62939c5b31e86868864edabe7347a268149e797f43
1:name=systemd:/docker/94376e7880579c6bde0622017594fcdb8d5767788bb4790c0f014db282198577/90c93bcdd3693a918adddf62939c5b31e86868864edabe7347a268149e797f43
0::/system.slice/containerd.service
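For what it's worth, the PID-namespace part of the question can be checked by comparing the namespace links of the two processes (PIDs taken from the listing above):
readlink /proc/7/ns/pid /proc/257/ns/pid
# identical pid:[...] values mean the two processes share a PID namespace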
Is there a standard Docker feature, or some cgroup feature being used to produce this magic? Or is this some custom, proprietary CircleCI feature?

Can an unprivileged docker container be paused from the inside?

Is there a simple way I can completely pause an unprivileged docker container from the inside while retaining the ability to unpause/exec it from the outside?
TL;DR
For a Linux container, the answer is definitely yes, because these two are equivalent:
From host:
docker pause [container-id]
From the container:
kill -SIGSTOP [process(es)-id]
or, even shorter
kill -SIGSTOP -1
Mind that:
If your process ID (PID) is 1, you fall under an edge case, because PID 1, the init process, has a specific meaning and behaviour in Linux.
Some processes might spawn child workers, as in the NGINX example below.
And those two are also equivalent:
From host:
docker unpause [container-id]
From the container:
kill -SIGCONT [process(es)-id]
or, even shorter
kill -SIGCONT -1
Also mind that, in some edge cases, this won't work. The main edge case is PID 1: the kernel does not deliver signals to the init process of a PID namespace unless it has installed a handler for them, and SIGSTOP cannot be handled by any process anyway.
In those cases, you will have to:
either be a privileged user, because the cgroup freezer lives under a folder that is read-only in Docker by default; and this will probably end in a dead end anyway, because you will not be able to jump into the container anymore.
or run your container with the --init flag, so PID 1 is just a wrapper process initialised by Docker and you no longer need to stop it in order to pause the processes running inside your container.
You can use the --init flag to indicate that an init process should be used as the PID 1 in the container. Specifying an init process ensures the usual responsibilities of an init system, such as reaping zombie processes, are performed inside the created container.
The default init process used is the first docker-init executable found in the system path of the Docker daemon process. This docker-init binary, included in the default installation, is backed by tini.
This is definitely possible for Linux containers, and is explained, more or less, in the documentation, which points out that running docker pause [container-id] means Docker uses a mechanism equivalent to sending the SIGSTOP signal to the processes running in your container.
The docker pause command suspends all processes in the specified containers. On Linux, this uses the freezer cgroup. Traditionally, when suspending a process the SIGSTOP signal is used, which is observable by the process being suspended. With the freezer cgroup the process is unaware, and unable to capture, that it is being suspended, and subsequently resumed. On Windows, only Hyper-V containers can be paused.
See the freezer cgroup documentation for further details.
Source: https://docs.docker.com/engine/reference/commandline/pause/
Here would be an example on an NGINX Alpine container:
### For now, we are on the host machine
$ docker run -p 8080:80 -d nginx:alpine
f444eaf8464e30c18f7f83bb0d1bd07b48d0d99f9d9e588b2bd77659db520524
### Testing if NGINX answers, successful
$ curl -I -m 1 http://localhost:8080/
HTTP/1.1 200 OK
Server: nginx/1.19.0
Date: Sun, 28 Jun 2020 11:49:33 GMT
Content-Type: text/html
Content-Length: 612
Last-Modified: Tue, 26 May 2020 15:37:18 GMT
Connection: keep-alive
ETag: "5ecd37ae-264"
Accept-Ranges: bytes
### Jumping into the container
$ docker exec -ti f7a2be0e230b9f7937d90954ef03502993857c5081ab20ed9a943a35687fbca4 ash
### This is the container, now, let's see the processes running
/ # ps -o pid,vsz,rss,tty,stat,time,ruser,args
PID VSZ RSS TT STAT TIME RUSER COMMAND
1 6000 4536 ? S 0:00 root nginx: master process nginx -g daemon off;
29 6440 1828 ? S 0:00 nginx nginx: worker process
30 6440 1828 ? S 0:00 nginx nginx: worker process
31 6440 1828 ? S 0:00 nginx nginx: worker process
32 6440 1828 ? S 0:00 nginx nginx: worker process
49 1648 1052 136,0 S 0:00 root ash
55 1576 4 136,0 R 0:00 root ps -o pid,vsz,rss,tty,stat,time,ruser,args
### Now let's send the SIGSTOP signal to the workers of NGINX, as docker pause would do
/ # kill -SIGSTOP 29 30 31 32
### Running ps again just to observe the T (stopped) state of the processes
/ # ps -o pid,vsz,rss,tty,stat,time,ruser,args
PID VSZ RSS TT STAT TIME RUSER COMMAND
1 6000 4536 ? S 0:00 root nginx: master process nginx -g daemon off;
29 6440 1828 ? T 0:00 nginx nginx: worker process
30 6440 1828 ? T 0:00 nginx nginx: worker process
31 6440 1828 ? T 0:00 nginx nginx: worker process
32 6440 1828 ? T 0:00 nginx nginx: worker process
57 1648 1052 136,0 S 0:00 root ash
63 1576 4 136,0 R 0:00 root ps -o pid,vsz,rss,tty,stat,time,ruser,args
/ # exit
### Back on the host to confirm NGINX doesn't answer anymore
$ curl -I -m 1 http://localhost:8080/
curl: (28) Operation timed out after 1000 milliseconds with 0 bytes received
$ docker exec -ti f7a2be0e230b9f7937d90954ef03502993857c5081ab20ed9a943a35687fbca4 ash
### Sending the SIGCONT signal as docker unpause would do
/ # kill -SIGCONT 29 30 31 32
/ # ps -o pid,vsz,rss,tty,stat,time,ruser,args
PID VSZ RSS TT STAT TIME RUSER COMMAND
1 6000 4536 ? S 0:00 root nginx: master process nginx -g daemon off;
29 6440 1828 ? S 0:00 nginx nginx: worker process
30 6440 1828 ? S 0:00 nginx nginx: worker process
31 6440 1828 ? S 0:00 nginx nginx: worker process
32 6440 1828 ? S 0:00 nginx nginx: worker process
57 1648 1052 136,0 S 0:00 root ash
62 1576 4 136,0 R 0:00 root ps -o pid,vsz,rss,tty,stat,time,ruser,args 29 30 31 32
/ # exit
### Back on the host to confirm NGINX is back
$ curl -I http://localhost:8080/
HTTP/1.1 200 OK
Server: nginx/1.19.0
Date: Sun, 28 Jun 2020 11:56:23 GMT
Content-Type: text/html
Content-Length: 612
Last-Modified: Tue, 26 May 2020 15:37:18 GMT
Connection: keep-alive
ETag: "5ecd37ae-264"
Accept-Ranges: bytes
For the cases where the meaningful process is PID 1, and is therefore protected by the Linux kernel, you might want to try the --init flag when running your container, so Docker creates a wrapper process that is able to pass the signal on to your application.
$ docker run -p 8080:80 -d --init nginx:alpine
e61e9158b2aab95007b97aa50bc77fff6b5c15cf3b16aa20a486891724bec6e9
$ docker exec -ti e61e9158b2aab95007b97aa50bc77fff6b5c15cf3b16aa20a486891724bec6e9 ash
/ # ps -o pid,vsz,rss,tty,stat,time,ruser,args
PID VSZ RSS TT STAT TIME RUSER COMMAND
1 1052 4 ? S 0:00 root /sbin/docker-init -- /docker-entrypoint.sh nginx -g daemon off;
7 6000 4320 ? S 0:00 root nginx: master process nginx -g daemon off;
31 6440 1820 ? S 0:00 nginx nginx: worker process
32 6440 1820 ? S 0:00 nginx nginx: worker process
33 6440 1820 ? S 0:00 nginx nginx: worker process
34 6440 1820 ? S 0:00 nginx nginx: worker process
35 1648 4 136,0 S 0:00 root ash
40 1576 4 136,0 R 0:00 root ps -o pid,vsz,rss,tty,stat,time,ruser,args
See how nginx: master process nginx -g daemon off;, which was PID 1 in the previous use case, has now become PID 7?
This allows us to kill -SIGSTOP -1 and be sure all meaningful processes are stopped, while still not being locked out of the container.
While digging into this, I found this blog post that seems like a good read on the topic: https://major.io/2009/06/15/two-great-signals-sigstop-and-sigcont/
Also related is this ps manual page extract about process state codes:
Here are the different values that the s, stat and state output
specifiers (header "STAT" or "S") will display to describe the state
of a process:
D uninterruptible sleep (usually IO)
I Idle kernel thread
R running or runnable (on run queue)
S interruptible sleep (waiting for an event to complete)
T stopped by job control signal
t stopped by debugger during the tracing
W paging (not valid since the 2.6.xx kernel)
X dead (should never be seen)
Z defunct ("zombie") process, terminated but not reaped by
its parent
For BSD formats and when the stat keyword is used, additional
characters may be displayed:
< high-priority (not nice to other users)
N low-priority (nice to other users)
L has pages locked into memory (for real-time and custom
IO)
s is a session leader
l is multi-threaded (using CLONE_THREAD, like NPTL
pthreads do)
+ is in the foreground process group
Source https://man7.org/linux/man-pages/man1/ps.1.html#PROCESS_STATE_CODES
Running the docker pause command from inside is not possible for an unprivileged container; it would need access to the Docker daemon, e.g. by mounting its socket.
You would need to build a custom solution. Just the basic idea: you could bind-mount a folder from the host and, inside this folder, create a file which acts as a lock. When you pause from inside the container you create the file, and while the file exists you actively wait/sleep. As soon as the host deletes the file at the mounted path, your code resumes. That is a rather naive approach because you actively wait, but it should do the trick; see the sketch after the link below.
You can also look into inotify to overcome active waiting.
https://lwn.net/Articles/604686/
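A minimal sketch of the idea, assuming the host directory is bind-mounted at /pause inside the container; the path and the inotify-tools dependency are assumptions, not part of the original setup:
# Pause from inside the container: create the lock file, then wait until the host removes it
touch /pause/paused
while [ -e /pause/paused ]; do
    sleep 1    # naive active wait
done
# execution resumes here once the host has run: rm /pause/paused

# With inotify-tools installed, the busy loop can be replaced by a blocking watch
[ -e /pause/paused ] && inotifywait -e delete_self /pause/paused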

Killing subprocess from inside a Docker container kills the entire container

On my Windows machine, I started a Docker container from docker compose. My entrypoint is a Go file watcher that runs a task of a task manager on every file change. The executed task builds and runs the Go program.
Before I can build and run the program again after a file change, I have to kill the previously running version. But every time I kill the app process, the container is gone as well.
The goal is to kill only the svc1 process with PID 74 in this example. I tried pkill -9 svc1 and kill $(pgrep svc1), but every time the parent processes are killed too.
The commandline output from inside the container:
root@bf073c39e6a2:/app/cmd/svc1# ps -aux
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 1 2.5 0.0 104812 2940 ? Ssl 13:38 0:00 /go/bin/watcher
root 13 0.0 0.0 294316 7576 ? Sl 13:38 0:00 /go/bin/task de
root 74 0.0 0.0 219284 4908 ? Sl 13:38 0:00 /svc1
root 82 0.2 0.0 18184 3160 pts/0 Ss 13:38 0:00 /bin/bash
root 87 0.0 0.0 36632 2824 pts/0 R+ 13:38 0:00 ps -aux
root@bf073c39e6a2:/app/cmd/svc1# ps -afx
PID TTY STAT TIME COMMAND
82 pts/0 Ss 0:00 /bin/bash
88 pts/0 R+ 0:00 \_ ps -afx
1 ? Ssl 0:01 /go/bin/watcher -cmd /go/bin/task dev -startcmd
13 ? Sl 0:00 /go/bin/task dev
74 ? Sl 0:00 \_ /svc1
root@bf073c39e6a2:/app/cmd/svc1# pkill -9 svc1
root@bf073c39e6a2:/app/cmd/svc1
Switching to the container log:
task: Failed to run task "dev": exit status 255
2019/08/16 14:20:21 exit status 1
"dev" is the name of the task in the taskmanger.
The Dockerfile:
FROM golang:stretch
RUN go get -u -v github.com/radovskyb/watcher/... \
&& go get -u -v github.com/go-task/task/cmd/task
WORKDIR /app
COPY ./Taskfile.yml ./Taskfile.yml
ENTRYPOINT ["/go/bin/watcher", "-cmd", "/go/bin/task dev", "-startcmd"]
I expect only the process with the target PID to be killed, and not the parent process that spawned it.
You can use a process manager like supervisord and configure it to re-execute your script or command even after its process is killed, which will keep your container up and running; a sketch follows below.
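A minimal sketch of such a supervisord.conf, assuming supervisord is installed in the image and used as the ENTRYPOINT instead of the watcher; the program name and command are taken from this Dockerfile and all values are illustrative:
[supervisord]
nodaemon=true

[program:watcher]
command=/go/bin/watcher -cmd "/go/bin/task dev" -startcmd
autorestart=true
With supervisord running as PID 1, killing svc1 (or even the whole watcher/task chain) no longer takes the container down; autorestart simply starts the supervised program again.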

Apache2: "Address already in use" when trying to start it ('httpd.pid' issue?)

Using Apache2 on Linux, I get this error message when trying to start it.
$ sudo /usr/local/apache2/bin/apachectl start
httpd not running, trying to start
(98)Address already in use: make_sock: unable to listen for connections on address 127.0.0.1:80
no listening sockets available, shutting down
Unable to open logs
$ sudo /usr/local/apache2/bin/apachectl stop
httpd (no pid file) not running
Some facts:
This is one of the last lines in my Apache logs:
[Mon Jun 19 18:29:01 2017] [warn] pid file /usr/local/apache2/logs/httpd.pid overwritten -- Unclean shutdown of previous Apache run?
My '/usr/local/apache2/conf/httpd.conf' contains
Listen 127.0.0.1:80
I have "Listen 80" configured at '/etc/apache2/ports.conf'
Disk is not full
I've checked that I do not have two or more "Listen" lines in '/usr/local/apache2/conf/httpd.conf'
Some outputs:
$ sudo ps -ef | grep apache
root 1432 1 0 17:35 ? 00:00:00 /usr/sbin/apache2 -k start
www-data 1435 1432 0 17:35 ? 00:00:00 /usr/sbin/apache2 -k start
www-data 1436 1432 0 17:35 ? 00:00:00 /usr/sbin/apache2 -k start
myuserr 1775 1685 0 17:37 pts/1 00:00:00 grep --color=auto apache
$ sudo grep -ri listen /etc/apache2
/etc/apache2/apache2.conf:# supposed to determine listening ports for incoming connections which can be
/etc/apache2/apache2.conf:# Include list of ports to listen on
/etc/apache2/ports.conf:Listen 80
/etc/apache2/ports.conf: Listen 443
/etc/apache2/ports.conf: Listen 443
What can I do to restart Apache? Should I repair 'httpd.pid'?
This error means that something is already using port 80.
If you really don't have two Listen 80 lines in your Apache configuration, then execute this command to see what is using port 80: netstat -antp | grep 80.
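If netstat isn't available, ss can do the same check:
# list listening TCP sockets with their owning process, then filter for port 80
ss -ltnp | grep ':80'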
I fixed it by killing the three processes
root 1621 1 0 18:46 ? 00:00:00 /usr/sbin/apache2 -k start
www-data 1624 1621 0 18:46 ? 00:00:00 /usr/sbin/apache2 -k start
www-data 1625 1621 0 18:46 ? 00:00:00 /usr/sbin/apache2 -k start
However, each time I reboot my server, I must kill these three processes again. What is starting them?

Multiple httpd processes running in Docker Container

This is the Dockerfile I created for installing httpd on CentOS:
#Installing HTTPD
FROM centos:latest
MAINTAINER xxx@gmail.com
RUN yum install -y httpd
EXPOSE 80
#ENTRYPOINT ["systemctl"]
ENTRYPOINT ["/usr/sbin/httpd"]
After building, when I run the container I can see many httpd processes running inside it:
docker run -d -p 80:80 httpd:4.0 -DFOREGROUND
Output of Docker top command:
UID PID PPID C STIME TTY TIME CMD
root 2457 2443 0 04:26 ? 00:00:00 /usr/sbin/httpd -DFOREGROUND
apache 2474 2457 0 04:26 ? 00:00:00 /usr/sbin/httpd -DFOREGROUND
apache 2475 2457 0 04:26 ? 00:00:00 /usr/sbin/httpd -DFOREGROUND
apache 2476 2457 0 04:26 ? 00:00:00 /usr/sbin/httpd -DFOREGROUND
apache 2477 2457 0 04:26 ? 00:00:00 /usr/sbin/httpd -DFOREGROUND
apache 2478 2457 0 04:26 ? 00:00:00 /usr/sbin/httpd -DFOREGROUND
apache 2491 2457 0 04:26 ? 00:00:00 /usr/sbin/httpd -DFOREGROUND
apache 2492 2457 0 04:26 ? 00:00:00 /usr/sbin/httpd -DFOREGROUND
apache 2493 2457 0 04:26 ? 00:00:00 /usr/sbin/httpd -DFOREGROUND
root 2512 2500 0 04:27 pts/0 00:00:00 /bin/bash
apache 2532 2457 0 04:27 ? 00:00:00 /usr/sbin/httpd -DFOREGROUND
Please let me know why so many httpd processes are running, and how to have only one process with PID 1.
Apache runs multiple processes so that it is ready to pick up a client request quickly; spawning a server process is slow, so it is better to have spare ones already running when a request comes in.
You can configure their number in httpd.conf through the StartServers, MinSpareServers, MaxSpareServers and ServerLimit directives, for example as shown below.
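A sketch of what such tuning could look like for the prefork MPM (the numbers are purely illustrative; if your build uses a threaded MPM such as event, the thread-based equivalents apply instead):
<IfModule mpm_prefork_module>
    StartServers        1
    MinSpareServers     1
    MaxSpareServers     3
    MaxRequestWorkers   10
    ServerLimit         10
</IfModule>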
