Docker - Access host /proc - linux

This is a duplicate of a post I created in the Docker forum; I am going to close this one or the other once the problem is solved. But since nobody answers in the Docker forum and my problem persists, I'm posting it again here, hoping to get an answer.
I would like to expose a server monitoring app as a docker container. The app I have written relies on /proc to read system information like CPU utilization or disk stats. Thus I have to forward the information provided by the host's /proc virtual file system to my docker container.
So I made a simple image (using the first or second intro on the docker website: Link) and started it:
docker run -v=/proc:/host/proc:ro -d hostfiletest
My assumption was that the running container could then read /host/proc to obtain information about the host system.
I fired up a console inside the container to check:
docker exec -it {one of the funny names the container gets} bash
And checked the content of /host/proc.
The easiest way to check was reading /host/proc/sys/kernel/hostname - that should yield the hostname of the VM I am working on.
But I get the hostname of the container, while /host/proc/uptime gives me the correct uptime of the VM.
Am I missing something here? Maybe something conceptual?
Docker version 17.05.0-ce, build 89658be running on Linux 4.4.0-97-generic (VM)
Update:
I found several articles describing how to run a specific monitoring app inside a container using the same approach I mentioned above.
Update:
Just tried using an existing Ubuntu image - same behavior. Running the image privileged and with --pid=host doesn't help either.
Greetings
Peepe

The reason for this behavior is that /proc is not a normal filesystem. According to procfs, it is an interface for accessing kernel data and system information. The interface provides a file-like structure, which can mislead people into thinking it is a normal directory tree.
Files in /proc are also not normal files. They are empty (size = 0). You can check this yourself:
$ stat /proc/sys/kernel/hostname
File: /proc/sys/kernel/hostname
Size: 0 Blocks: 0 IO Block: 1024 regular empty file
So the file doesn't hold any data; when you read it, the kernel dynamically generates and returns the corresponding system information.
To answer your question, /proc/sys/kernel/hostname is just an interface to access the hostname. Depending on where you access that interface from, on the host or in the container, you will get the corresponding hostname. This also applies when you use a bind mount like -v /proc:/host/proc:ro, since the bind mount only provides an alternative view of the same /proc interface. If you read /host/proc/sys/kernel/hostname, the kernel returns the hostname of the UTS namespace the reading process is in - here, the container.
In short, think of /proc/sys/kernel/hostname as a mirror: if the host stands in front of it, it reflects the host; if the container does, it reflects the container.
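A quick way to see this namespace behavior in action (a sketch; the ubuntu image is just an arbitrary example, and --uts=host tells docker to share the host's UTS namespace with the container):
$ cat /proc/sys/kernel/hostname                                      # on the host: the host's hostname
$ docker run --rm ubuntu cat /proc/sys/kernel/hostname               # the container's generated hostname
$ docker run --rm --uts=host ubuntu cat /proc/sys/kernel/hostname    # the host's hostname again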

I know it's a few months later now, but I came across the same problem today.
In my case I was using psutil in Python to read disk stats of the host from inside a docker container.
The solution was to mount the whole host file system read-only into the docker container with -v /:/rootfs:ro and specify the path to proc as psutil.PROCFS_PATH = '/rootfs/proc'.
Now psutil.disk_partitions() lists all partitions from the host file system. As the hostname is also contained within the proc hierarchy, I guess this also works for other host system information, as long as the retrieving call points to /rootfs/proc.
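For reference, the container in my case was started roughly like this (a sketch; the image and container names are placeholders, the only relevant part is the bind mount):
$ docker run -d --name host-monitor -v /:/rootfs:ro my-monitoring-image
$ docker exec host-monitor cat /rootfs/etc/hostname    # sanity check: the host's files are visible read-only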

Related

Does docker manage the filesystem like a standalone OS?

I have a program I'm running in a docker container. After 10-12 hours of running, the program terminated with filesystem-related errors (FileNotFoundError, or similar).
I'm wondering whether the disk space got filled up (or there was a similar filesystem-related issue) or whether there was a problem in my code (e.g. one process deleted the file prematurely).
I don't know much about docker's management of files and wonder whether docker creates and manages its own FS inside the container or not. Here are the three possibilities I'm considering; I mainly wonder whether #1 could be the case:
1. If docker manages its own filesystem, could it be that although disk space is available on the host machine, the docker container ran out of its own storage space? (I've seen similar issues regarding running out of memory for a process that has limited memory artificially imposed using cgroups.)
2. Could it be that the host filesystem ran out of space and the files got corrupted or didn't get written correctly?
3. There is some bug in my code.
This is likely a bug in your code. Most programs print the error they encounter, and when a program encounters out-of-space, the error returned by the filesystem is: "No space left on device" (errno 28 ENOSPC).
If you see FileNotFoundError, that means the file is missing. My best theory is that it's coming from your consumer process.
It's still possible though, that the file doesn't exist because the producer ran out of space and you didn't handle the error correctly - you'll need to check your logs.
It might also be a race condition, depending on your application. There's really not enough details to answer that.
As to the title question:
By default, docker overlay-mounts the image layers plus an (initially empty) writable directory from the host's filesystem into the container, so the amount of free space in the container is the same as the amount on the host.
If you're using volumes, that depends on the storage driver you use. As @Dan Serbyn mentioned, the default limit for the devicemapper driver is 10 GB. The overlay2 driver - the default driver - doesn't have that limitation.
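To check which storage driver is actually in use on a given host, a quick sketch with the docker CLI:
$ docker info --format '{{.Driver}}'    # e.g. overlay2 or devicemapper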
In the current Docker version, there is a default limitation on the Docker container storage of 10 GB.
You can check the disk space that containers are using by running the following command:
docker system df
It's also possible that the file your container is trying to access has access level restrictions. Try to make it available for docker or maybe everybody (chmod 777 file.txt).
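If you suspect an out-of-space condition, it can also help to compare the free space seen inside the container with the space left under docker's data directory on the host (a sketch; the container name is a placeholder and /var/lib/docker is the default data root):
$ docker exec my-container df -h /    # free space on the container's root filesystem
$ df -h /var/lib/docker               # free space on the host filesystem backing it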

Where does a container get its limits (nofile and similar) from when not set by the user?

This is something I've struggled to find anything more than anecdotal answers for and I'd really like to find some solid information on this.
This is my situation (I'll be focusing on the nofile limit as it's of principal interest to me):
I have a container.
Swarm is not being used
I can run docker inspect on it and see that "Ulimits": null.
I use a compose file
host and container are running linux (ubuntu)
I do not set ulimits via args to docker
I do not set ulimits in the compose file
I do not set limits via resources in the compose file
On the host I can see that all of /etc/default/docker is commented out
On the host I start docker with systemd which seems to ignore /etc/default/docker anyway
On the host there is no --config-file option being passed to dockerd
On the host there is no /etc/docker/daemon.json file (these last 2 being relevant due to this)
On the host ulimit -Hn gives 1048576
In the container ulimit -Hn gives 1048576
In the container /etc/sysctl.conf is commented out (And is ignored by docker containers anyway iirc)
In the container /etc/sysctl.d is an empty directory
In the container /etc/security/limits.conf is commented out (And is ignored by docker containers anyway iirc)
In the container /etc/security/limits.d is an empty directory
I'm aware (from answers like this one) that the host values do impact the container, but not which takes precedence.
The best leads I've found are as follows:
https://stackoverflow.com/a/50145392/574033
https://stackoverflow.com/a/46233327/574033
What then muddies the water is that /lib/systemd/system/docker.service has a line in the [Service] section that reads LimitNOFILE=1048576. So I'm not sure whether this is setting a hard limit on the docker daemon process (à la ulimits) and, if so, whether it takes precedence over the host's established limits (one would assume so, right?).
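The best check I could come up with myself (a sketch; it assumes a systemd-managed dockerd, a running container, and the stock ubuntu image) is to compare the daemon's limits with what PID 1 inside a container sees, and with an explicit override:
$ grep "open files" /proc/$(pidof dockerd)/limits             # what LimitNOFILE= gave the daemon
$ docker exec my-container grep "open files" /proc/1/limits   # what the container inherited
$ docker run --rm --ulimit nofile=1024:2048 ubuntu bash -c 'ulimit -Sn; ulimit -Hn'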
There is an extent to which it "doesn't matter", in that as long as the number is in the range I need, it should be fine. However, I'd really like to understand the actual behaviour here, and to know whether there is any official information on what the behaviour is (or should be).
Thank you.

Export Memory Dump Azure Kubernetes

I need to export a memory dump from an AKS cluster and save it to some location.
How can I do it? Is it easy to export it to a storage account? Is there another solution? Can someone give me a step-by-step guide?
EDIT: the previous answer was wrong, I didn't pay attention to the fact that you needed a dump. You'll actually need to get it from Boot Diagnostics or some command line:
https://learn.microsoft.com/en-us/azure/virtual-machines/troubleshooting/boot-diagnostics#enable-boot-diagnostics-on-existing-virtual-machine
This question is quite old, but let me nevertheless share how I eventually did it:
Linux has a per-process setting called RLIMIT_CORE which limits the size of the core dump you'll receive when your application crashes - this is the part you find quite quickly.
Next, you have to define where core files are saved, which is done in the file /proc/sys/kernel/core_pattern. The given path can either be a relative file name (the core is saved next to the binary which crashed), an absolute path (resolved in the mount namespace of the crashing process) or - here is where it gets interesting - a pipe followed by an absolute path to an executable (application or script). This script will (according to the docs - see the headline "Piping core dumps to a program") be started as user and group root - but furthermore, it will (according to this post on the Linux mailing list) also be executed in the initial, global namespace - in other words, outside of the container.
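Both values can be inspected directly on a node (RLIMIT_CORE shows up as the "core file size" ulimit):
$ ulimit -c                            # current RLIMIT_CORE; 0 means no core dumps are written
$ cat /proc/sys/kernel/core_pattern    # where and how core files are written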
If you are like me and do not have access to the image used for new nodes on your AKS cluster, you will want to set these values using a DaemonSet, which runs a pod on every node.
Armed with all this knowledge, you can do the following:
Create a DaemonSet - a pod running on every machine performing the initial setup.
This DaemonSet will run as a privileged container to allow it to switch to the root namespace.
After having switched namespaces successfully, it can change the value of /proc/sys/kernel/core_pattern.
The value should be something like |/bin/dd of=/core/%h.%e.%p.%t (dd reads the core file from stdin and saves it to the location given by its of= parameter; see the sketch after this list). Core files will now be saved under /core/. The file name is built from the variables described in the docs for core files.
After knowing that the files will be saved to /core/ of the root namespace, we can mount our storage there - in my case Azure File Storage. Here's a tutorial of how to mount AzureFileStorage.
Pods have the RestartPolicy set to Always. Since the job of your pod is done, and you don't want it to restart automatically, let it remain running using sleep infinity.
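A minimal sketch of what the DaemonSet's container could execute (assumptions: the pod runs with privileged: true and hostPID: true so that nsenter can reach the host's namespaces, and /core is the host directory where the Azure File Storage ends up mounted):
# switch into the host's namespaces (target PID 1 = the host's init) and set the global pattern
nsenter --target 1 --mount --uts --ipc --net --pid -- \
  sh -c "echo '|/bin/dd of=/core/%h.%e.%p.%t' > /proc/sys/kernel/core_pattern"
# keep the pod "running" so the Always restart policy does not restart it in a loop
sleep infinity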
This writeup is almost a copy of what I discovered while in contact with Microsoft support. Here's the thread in their forum, which contains an almost finished configuration for a DaemonSet.
I'll leave some links here which I used during my research:
how to generate core file in docker container?
How to access docker host filesystem from privileged container
https://medium.com/@patnaikshekhar/initialize-your-aks-nodes-with-daemonsets-679fa81fd20e
Sidenote:
I could also have just mounted the Azure File Storage into every container and set the value of /proc/sys/kernel/core_pattern to plain /core/%h.%e.%p.%t, but that would require me to add the mount to every container. Going this way, I keep the pod configurations free of this administrative task and put it where it (in my opinion) belongs: the initial machine setup.

Persisting content across docker restart within an Azure Web App

I'm trying to run a ghost docker image on Azure within a Linux Docker container. This is incredibly easy to get up and running using a custom Docker image for Azure Web App on Linux and pointing it at the official docker hub image for ghost.
Unfortunately the official docker image stores all data under the /var/lib/ghost path, which isn't persisted across restarts, so whenever the container is restarted all my content gets deleted and I end up back at a default ghost install.
Azure won't let me execute arbitrary commands - you basically point it at a docker image and it fires off from there - so I can't use the -v command line param to map a volume. The docker image does have an entry point configured, if that would help.
Any suggestions would be great. Thanks!
Set WEBSITES_ENABLE_APP_SERVICE_STORAGE to true in the app settings and the home directory will be mapped from your outer Kudu instance:
https://learn.microsoft.com/en-us/azure/app-service/containers/app-service-linux-faq
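For completeness, the setting can be applied with the Azure CLI roughly like this (a sketch; the resource group and app names are placeholders):
$ az webapp config appsettings set \
    --resource-group my-resource-group \
    --name my-ghost-app \
    --settings WEBSITES_ENABLE_APP_SERVICE_STORAGE=true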
You have a few options:
You could mount a file share inside the Docker container by creating a custom image, then storing data there. See these docs for more details.
You could switch to the new container instances, as they provide volume support.
You could switch to the Azure Container Service. This requires an orchestrator, like Kubernetes, and might be more work than you're looking for, but it also offers more flexibility, provides better reliability and scaling, and other benefits.
You have to use a shared volume that maps the content of the container's /var/lib/ghost directory to a host directory. This way, your data will persist in the host directory.
To do that, use the following command.
$ docker run -d --name some-ghost -p 3001:2368 -v /path/to/ghost/blog:/var/lib/ghost/content ghost:1-alpine
I've never worked with Azure, so I'm not 100 percent sure the following applies. But if you interface with docker via the CLI, there is a good chance it does.
Persistence in docker is handled with volumes. They are basically mounts inside the container's file system tree that point to a directory on the outside. From your text I understand that you want to store the content of the inside /var/lib/ghost path in /home/site/wwwroot on the outside. To do this you would call docker like this:
$ docker run [...] -v /home/site/wwwroot:/var/lib/ghost ghost
Unfortunately, setting the persistent storage (or bringing your own storage) to a specific path is currently not supported in Azure Web Apps on Linux.
That said, you can play with SSH and try to configure ghost to point to /home/ instead of /var/lib/.
I have prepared a docker image here: https://hub.docker.com/r/elnably/ghost-on-azure that adds the SSH capability; the Dockerfile and code can be found here: https://github.com/ahmedelnably/ghost-on-azure/tree/master/1/alpine.
Try it out by configuring your web app to use elnably/ghost-on-azure:latest, browse to the site (to start the container), and go to the SSH page at .scm.azurewebsites.net. To learn more about SSH, check this link: https://aka.ms/linux-ssh.

What's inside a Docker image/container?

Considering the fact that docker images/containers come in various flavours - Ubuntu, CentOS, CoreOS etc.... I'm curious what actually makes up an image/container, and what is shared with the host OS? Where is the dividing line?
For example, I can download the base Ubuntu image and launch it on a CentOS host. Then, when I poke around inside the Ubuntu container I can see that it looks and feels like an Ubuntu server (filesystem layout etc). But if I run a uname command I see the kernel and the likes of the CentOS host....
Obviously I understand that the underlying kernel is shared by all containers on the same host. But what else is shared with the host OS, and what is part of the image/container?
E.g. the kernel is part of the host, the filesystem layout is part of the image/container.... Is there a spec that defines this?
It can be helpful to distinguish between images and containers (docs). An image is static and lives only on disk. A container is a running instance of an image and it includes its own process tree as well as RAM and other runtime resources.
An image is a logical grouping of layers plus metadata about what to do when creating a container and how to assemble the layers. Part of that metadata is that each layer knows its parent's ID.
So, what goes into a layer? The files (and directories) you've added to the parent. There are also special files ("whiteout") that indicate that something was deleted from the parent.
When you docker run an image, docker creates a container: it unpacks all the layers in the correct order, creating a new "root" file system separate from the host. docker also reads the image metadata and starts either the "entrypoint" or "command" specified when the image was created -- that starts a new process sub-tree. From inside the container, that first process seems like the root of the tree, but from the host you can see it is a subtree of processes.
The root file system is what makes one Linux distro different from another (there can be some kernel module differences as well, and bootloader/boot file system differences, but these are usually invisible to the running processes). The kernel is shared with the host and is, in fact, still managing its usual responsibilities inside the container. But the root file system is different, and so when you're inside the container, it looks and feels like whatever distro was in the Docker image.
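A quick way to see this split for yourself (a sketch; any small image works, ubuntu is just an example):
$ uname -r                                    # the host's kernel version
$ docker run --rm ubuntu uname -r             # the same kernel, even inside an Ubuntu container
$ docker run --rm ubuntu cat /etc/os-release  # but the root filesystem/userland is Ubuntu's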
The container not only has its own file system and process tree, but also has its own logical network interface and, optionally, its own allocation of RAM and CPU time. You're in control over the container though, as the operator, so you can decide to share the host's network interface with the container, give it unlimited access to RAM and CPU, and even mount devices, files and directories from the host into the container. The default is to keep things separate, but you have the power to break the isolation model as much as you need to.
Docker was originally a wrapper over LXC (Linux Containers), and the documentation for that will let you know in detail what is shared and what is not.
In general the host machine sees/contains everything inside the containers from file system to processes etc. You can issue a ps command on the host vm and see processes inside the container.
Remember docker containers are not VMs - everything is actually running natively on the host and is using the host kernel directly. Each container has its own set of namespaces (similar to the chroot jails of old). There are tools/features which make sure containers only see their own processes, have their own file system layered onto the host file system, and get a networking stack which pipes to the host networking stack.
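To see that last point about processes in practice (a sketch; nginx is just an arbitrary long-running image):
$ docker run -d --name demo nginx
$ docker top demo          # the container's processes, with their host-side PIDs
$ ps -ef | grep [n]ginx    # the same processes show up in the host's process table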
