docker: different PID for `top` and `ps` - linux

I don't understand the difference between
$> docker top lamp-test
PID USER COMMAND
31263 root {supervisord} /usr/bin/python /usr/bin/supervisord -n
31696 root {mysqld_safe} /bin/sh /usr/bin/mysqld_safe
31697 root apache2 -D FOREGROUND
...
and
$> docker exec lamp-test ps
PID TTY TIME CMD
1 ? 00:00:00 supervisord
433 ? 00:00:00 mysqld_safe
434 ? 00:00:00 apache2
831 ? 00:00:00 ps
So, the question is: why are the PIDs different? I would say that the output from ps is namespaced, but if that is true, what is top showing?

docker exec lamp-test ps shows the PIDs inside the container's PID namespace.
docker top lamp-test shows the host PIDs of the same processes.
You can see a container's processes from the host, but you cannot address them by their in-container PIDs there. This "flawed" isolation actually has some great benefits, like the ability to monitor the processes running inside all your containers from a single monitoring process on the host machine.
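The two views can be tied together from the host. A minimal sketch, assuming the lamp-test container from the question is still running and the kernel is new enough (4.1+) to expose the NSpid field in /proc:

```shell
# Host PID of the container's main process (supervisord here):
HOST_PID=$(docker inspect -f '{{.State.Pid}}' lamp-test)

# NSpid lists that process's PID once per PID namespace, outermost first:
# the host PID that docker top reports, then the PID 1 that ps reports.
awk '/^NSpid:/ {print "host:", $2, "container:", $3}' "/proc/$HOST_PID/status"
```

Both tools report the same processes; docker top reads the host's process table, while ps runs inside the namespace.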

I don't think you should worry about this. You can't use the container-internal PID from the host environment, but you can kill the process from inside the container:
docker exec <CONTAINER NAME> ps    # note the PID you want
docker exec <CONTAINER NAME> kill <PID>

Related

Docker: attach to a specific bash

Let's say I have a container running and I do
docker exec -ti container-id /bin/bash
Then I detach from this container and want to attach again.
If I do this
docker attach container-id
I won't go back to the bash that I created. Instead, I will go to the container's main process.
How can I attach to that bash again?
You can't. While the docker exec documentation suggests it supports the same "detach" key sequence as docker run, the exec'd process doesn't have any Docker-level identity (beyond its host and container pids) and there's no way to re-attach to that shell.
(In the Docker API, "exec instance" is an actual object so this isn't technically impossible; the CLI just has no support for it.)
The workflow you're describing sounds more like what you'd run with screen or tmux in a virtual machine.
I have one container started, and I checked the PID of /bin/bash:
[root@ip-10-0-1-153 centos]# docker exec -ti 78c2e4a46b58 /bin/bash
root@78c2e4a46b58:/# ps -ef
UID PID PPID C STIME TTY TIME CMD
root 1 0 0 03:24 ? 00:00:00 bash
root 10 0 0 03:24 ? 00:00:00 /bin/bash
root 20 10 0 03:24 ? 00:00:00 ps -ef
Now I detach from the container using the Ctrl+P, Ctrl+Q sequence, and the container keeps running.
Now I reattach using the container ID, and I see the same PID for /bin/bash:
root@78c2e4a46b58:/# [root@ip-10-0-1-153 centos]# docker attach 78c2e4a46b58
root@78c2e4a46b58:/# ps -ef
UID PID PPID C STIME TTY TIME CMD
root 1 0 0 03:24 ? 00:00:00 bash
root 10 0 0 03:24 ? 00:00:00 /bin/bash
root 21 1 0 03:25 ? 00:00:00 ps -ef
root@78c2e4a46b58:/#
I hope you're using the Ctrl+P, Ctrl+Q sequence to detach from the container.

Docker container on Alpine Linux 3.7: Strange pid 1 not visible within the container's pid namespace

I am currently tracking down a weird issue we are experiencing with dockerd 17.10.0-ce on an Alpine Linux 3.7 host. It seems that for all containers on this host, the process tree started as the entrypoint/command of the Docker image is NOT visible within the container itself. By comparison, on an Ubuntu host the same image has its process tree visible, starting at PID 1.
Here is an example.
Run a container with an explicit known entrypoint/command:
% docker run -d --name testcontainer --rm busybox /bin/sh -c 'sleep 1000000'
Verify the processes are seen by dockerd properly:
% docker top testcontainer
PID USER TIME COMMAND
6729 root 0:00 /bin/sh -c sleep 1000000
6750 root 0:00 sleep 1000000
Now, start a shell inside that container and check the process list:
% docker exec -t -i testcontainer /bin/sh
/ # ps -ef
PID USER TIME COMMAND
6 root 0:00 /bin/sh
12 root 0:00 ps -ef
As can be observed, our entrypoint command (/bin/sh -c 'sleep 1000000') is not visible inside the container itself. Even running top will yield the same results.
Is there something I am missing here? On an Ubuntu host with the same docker engine version, the results are as I would expect. Could this be related to Alpine's hardened kernel causing an issue with how the container PID space is separated?
Any help appreciated for areas to investigate.
-b
It seems this problem is related to the grsecurity patch set, which the Alpine kernel includes. In this specific case, the GRKERNSEC_CHROOT_FINDTASK kernel setting is used to limit what processes inside a chroot can see and do outside of it. This is controlled by the kernel.grsecurity.chroot_findtask sysctl variable.
From the grsecurity docs:
kernel.grsecurity.chroot_findtask
If you say Y here, processes inside a chroot will not be able to kill,
send signals with fcntl, ptrace, capget, getpgid, setpgid, getsid, or
view any process outside of the chroot. If the sysctl option is
enabled, a sysctl option with name "chroot_findtask" is created.
The only workaround I have found for now is to disable this flag as well as the chroot_deny_mknod and chroot_deny_chmod flags in order to get the same behaviour as with a non-grsecurity kernel.
kernel.grsecurity.chroot_deny_mknod=0
kernel.grsecurity.chroot_deny_chmod=0
kernel.grsecurity.chroot_findtask=0
Of course this is less than ideal since it bypasses and disables security features of the system but might be a valid workaround for a development environment.

Finding Docker container processes? (from host point of view)

I am doing some tests on docker and containers and I was wondering:
Is there a method I can use to find all processes associated with a Docker container, by its name or ID, from the host's point of view?
After all, at the end of the day a container is a set of virtualized processes.
You can use the docker top command.
It lists all processes running within your container.
For instance, on a single-process container on my box it displays:
UID PID PPID C STIME TTY TIME CMD
root 14097 13930 0 23:17 pts/6 00:00:00 /bin/bash
All the methods mentioned by others are also possible to use, but this one should be the easiest.
Update:
To simply get the main process ID within the container, use:
docker inspect -f '{{.State.Pid}}' <container id>
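Combining that with ps gives the whole host-side process subtree of one container. A sketch, assuming a running container named testcontainer (the name is made up) and a procps-style ps:

```shell
# Host PID of the container's main process:
MAIN_PID=$(docker inspect -f '{{.State.Pid}}' testcontainer)

# Show the main process and its direct children as an ASCII tree
# (ps ORs the -p and --ppid selections together):
ps --forest -o pid,ppid,cmd -p "$MAIN_PID" --ppid "$MAIN_PID"
```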
Another way to get an overview of all Docker processes running on a host is using generic cgroup based systemd tools.
systemd-cgls shows all cgroups and the processes running in them as a tree, like this:
├─1 /usr/lib/systemd/systemd --switched-root --system --deserialize 21
├─docker
│ ├─070a034d27ed7a0ac0d336d72cc14671584cc05a4b6802b4c06d4051ce3213bd
│ │ └─14043 bash
│ ├─dd952fc28077af16a2a0a6a3231560f76f363359f061c797b5299ad8e2614245
│ │ └─3050 go-cron -s 0 0 * * * * -- automysqlbackup
As every Docker container has its own cgroup, you can also see Docker Containers and their corresponding host processes this way.
Two interesting properties of this method:
It works even if the Docker Daemon(s) are defunct.
It's a pretty quick overview.
You can also use systemd-cgtop to get an overview of the resource usage of Docker Containers, similar to top.
By the way: Since systemd services also correspond to cgroups these methods are also applicable to non-Dockerized systemd services.
I found a similar solution as a bash one-liner:
for i in $(docker container ls --format "{{.ID}}"); do docker inspect -f '{{.State.Pid}} {{.Name}}' $i; done
The process run in a Docker container is a child of a process named containerd-shim (as of Docker v18.09.4).
First, figure out the process IDs of the containerd-shim processes.
Then, for each of them, find its child process.
pgrep containerd-shim
7105
7141
7248
To find the child process of parent process 7105:
pgrep -P 7105
7127
In the end you could get the list with:
for i in $(pgrep containerd-shim); do pgrep -P $i; done
7127
7166
7275
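Putting the two steps together and resolving each child PID to its command line with ps (a sketch; the containerd-shim process name matches containerd-based daemons such as Docker v18.09, as noted above):

```shell
# For every containerd-shim, print each child as "<pid> <command>":
for shim in $(pgrep containerd-shim); do
  for child in $(pgrep -P "$shim"); do
    ps -o pid=,cmd= -p "$child"
  done
done
```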
When run on the host, this lists the child processes of the main process of the container <Container ID>, showing host PIDs instead of container PIDs:
DID=$(docker inspect -f '{{.State.Pid}}' <Container ID>);ps --ppid $DID -o pid,ppid,cmd
docker ps will list docker containers that are running.
docker exec <id|name> ps will tell you the processes it's running.
Since the following command shows only the container's main process ID (not its child processes):
docker inspect -f '{{.State.Pid}}' <container-name_or_ID>
you can also map in the other direction: given a host process ID, read its cgroup file under /proc (e.g. /proc/<PID>/cgroup), take the container hash from that file, cut the container ID from the hash (its first 12 characters), and then look up the container itself:
#!/bin/bash
# Usage: p2c <PID> - find the container a host process belongs to
containerID=$(fgrep 'pids:/docker/' "/proc/$1/cgroup" | sed -e 's#.*/docker/##g' | cut -c 1-12)
docker ps | fgrep "$containerID"
Save the above script in a file such as p2c and run it with:
p2c <PID>
For example:
p2c 85888
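The ID extraction inside p2c can be exercised in isolation with a sample cgroup line (the 64-character hash is borrowed from the systemd-cgls output in an earlier answer):

```shell
# A cgroup v1 line as seen in /proc/<PID>/cgroup for a containerized process:
line='12:pids:/docker/070a034d27ed7a0ac0d336d72cc14671584cc05a4b6802b4c06d4051ce3213bd'

# Strip everything up to /docker/, then keep the 12-character short ID:
echo "$line" | sed -e 's#.*/docker/##g' | cut -c 1-12
# → 070a034d27ed
```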
Another solution, combining docker ps and docker top:
docker ps --format "{{.ID}}" | xargs -I'{}' docker top {} -o pid | awk '!/PID/'
Note: awk '!/PID/' just removes the PID header line from the output of docker top.
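The filter's effect can be checked with a fabricated sample of docker top output (the PIDs are taken from the first question above):

```shell
# awk '!/PID/' prints only the lines that do not match "PID",
# i.e. it drops the header row:
printf 'PID\n31263\n31696\n' | awk '!/PID/'
# → 31263
# → 31696
```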
If you want to know the whole process tree of docker container, you can try it
docker ps --format "{{.ID}}" | xargs -I'{}' docker top {} -o pid | awk '!/PID/' | xargs -I'{}' pstree -psa {}
docker stats <container id>
shows the resource consumption (including a PIDS column counting the container's processes); or simply use docker ps.
Probably this cheat sheet can be of use.
http://theearlybirdtechnology.com/2017/08/12/docker-cheatsheet/

Can't kill supervisord inside of docker container

I have a Docker container with supervisord running inside it, and I wish to kill that process:
root 1 0.0 0.1 59768 13360 ? Ss+ 20:29 0:01 /usr/bin/python /usr/bin/supervisord
I log in:
sudo docker exec -ti blahblah bash
root# kill -KILL 1
It does not kill process 1, but I can kill any other process.
PID 1 inside a PID namespace is special: the kernel ignores any signal sent to it from within the namespace unless the process has installed a handler for that signal, which is why even kill -KILL 1 does nothing here. And if you could kill that process, the whole container would stop anyway, so you might as well run
docker stop containerName
or, if you want to force it, change stop to kill (or rm -f if you also want to remove the container).

docker attach vs lxc-attach

UPDATE: Docker 0.9.0 use libcontainer now, diverting from LXC see: Attaching process to Docker libcontainer container
I'm running an instance of elasticsearch:
docker run -d -p 9200:9200 -p 9300:9300 dockerfile/elasticsearch
Checking the process it show like the following:
$ docker ps --no-trunc
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
49fdccefe4c8c72750d8155bbddad3acd8f573bf13926dcaab53c38672a62f22 dockerfile/elasticsearch:latest /usr/share/elasticsearch/bin/elasticsearch java About an hour ago Up 8 minutes 0.0.0.0:9200->9200/tcp, 0.0.0.0:9300->9300/tcp pensive_morse
Now, when I try to attach to the running container, I get stuck:
$ sudo docker attach 49fdccefe4c8c72750d8155bbddad3acd8f573bf13926dcaab53c38672a62f22
[sudo] password for lsoave:
the tty doesn't connect and the prompt never comes back. Doing the same with lxc-attach works fine:
$ sudo lxc-attach -n 49fdccefe4c8c72750d8155bbddad3acd8f573bf13926dcaab53c38672a62f22
root@49fdccefe4c8:/# ps -ef
UID PID PPID C STIME TTY TIME CMD
root 1 0 49 20:37 ? 00:00:20 /usr/bin/java -Xms256m -Xmx1g -Xss256k -Djava.awt.headless=true -XX:+UseParNewGC -XX:+UseConcMa
root 88 0 0 20:38 ? 00:00:00 /bin/bash
root 92 88 0 20:38 ? 00:00:00 ps -ef
root@49fdccefe4c8:/#
Does anybody know what's wrong with docker attach?
NB. dockerfile/elasticsearch ends with:
ENTRYPOINT ["/usr/share/elasticsearch/bin/elasticsearch"]
You're attaching to a container that is running elasticsearch which isn't an interactive command. You don't get a shell to type in because the container is not running a shell. The reason lxc-attach works is because it's giving you a default shell. Per man lxc-attach:
If no command is specified, the current default shell of the user
running lxc-attach will be looked up inside the container and
executed. This will fail if no such user exists inside the container
or the container does not have a working nsswitch mechanism.
docker attach is behaving as expected.
As Ben Whaley notes this is expected behavior.
It's worth mentioning though that if you want to monitor the process you can do a number of things:
Start bash as the foreground process: e.g. $ES_DIR/bin/elasticsearch && /bin/bash will give you your shell when you attach. Mainly useful during development. Not so clean :)
Install an ssh server. Although I've never done this myself it's a good option. Drawback is of course overhead, and maybe a security angle. Do you really want ssh on all of your containers? Personally, I like to keep them as small as possible with single-process as the ultimate win.
Use the log files! You can use docker cp to get the logs locally, or better, the docker logs $CONTAINER_ID command. The latter gives you the accumulated stdout/stderr output for the entire lifetime of the container each time, though.
Mount the log directory. Just mount a directory on your host and have elasticsearch write to a logfile in that directory. You can have syslog on your host, Logstash, or whatever turns you on ;). Of course, the drawback here is that you are now using your host more than you might like. I also found a nice experiment using logstash in this blog.
FWIW, now that Docker 1.3 is released, you can use "docker exec" to open up a shell or other process on a running container. This should allow you to effectively replace lxc-attach when using the native driver.
http://blog.docker.com/2014/10/docker-1-3-signed-images-process-injection-security-options-mac-shared-directories/
