I have been reading up on Docker, and I understand that unlike VMs, Docker uses the host OS's kernel. Why is there a requirement that the base image be an OS? Why can't Docker use resources from the host OS (e.g. the filesystem) and rely on the isolation mechanisms the host OS supports? (I am assuming that the host OS provides a mechanism for isolation.)
It depends on how you define an OS. Docker images are not a full OS (unlike VMs): they do not have a kernel of their own, which means no kernel modules (device drivers for external hardware, etc.) need to be installed, since the host OS already provides them.
Images are essentially filesystem snapshots of popular Linux distributions (the binaries in the image are, of course, built for the target architecture). There are several reasons for this:
- A near-VM experience, since users like to work in their favorite Linux distribution
- Pre-configured libraries for each distribution, letting you run applications straight away with all distribution-specific dependencies taken care of
- The flexibility to run multiple distributions on the same host (great for dev/test sandboxing!)
- Greater isolation from other containers, since each image is self-sufficient and doesn't have to share a filesystem with others
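A quick way to see that an image is just a filesystem rather than an OS with its own kernel is to export one and list its contents. This is only a sketch: it assumes Docker is installed and the alpine image can be pulled, and the container name "probe" is arbitrary.

# Create (but don't start) a container from the image, dump its filesystem
# as a tar stream, and list the first few entries -- plain files, no kernel.
docker create --name probe alpine
docker export probe | tar -tf - | head -n 20
docker rm probe
# Every container shares the host kernel, regardless of the image:
uname -r
docker run --rm alpine uname -r   # prints the same kernel release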
I want to learn about LXC and came across this site: https://linuxcontainers.org/lxc/introduction/; it talks about LXC, LXD, and related projects.
I am a bit confused: I was under the impression that LXC is a Linux kernel feature, so it should be present in the kernel itself. However, looking at the site above, is what it describes the same thing as LXC the kernel feature? Or is LXC something provided on top of the Linux kernel by the project at https://linuxcontainers.org/lxc/introduction/?
How can I understand this subtle difference?
Most of the core features needed to run Linux in containers are built into the kernel -- namespaces, control groups, virtual roots, etc. However, assembling a usable container platform from these features requires a considerable amount of infrastructure: we need to manage container storage, create network links between containers, control per-container resource usage, and so on. User-space programs can be, and are, used to provide this infrastructure and the tooling that goes with it.
I have written a series of articles on building a container from scratch that explains some of these issues:
http://kevinboone.me/containerfromscratch.html
It's possible in principle to build and connect containers using nothing but the features built into the kernel, and a bunch of shell scripts. Tools like LXC, Docker, and Podman all use the same kernel features (so far as I know), but they manipulate these features in different ways.
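As a rough illustration (not how any of these tools actually implement it), a container-like environment can be assembled with nothing more than util-linux's unshare plus chroot. This sketch assumes ./rootfs holds an extracted distribution filesystem with a shell, mount, and hostname available:

# New mount, UTS, IPC, network and PID namespaces, then a chroot into the
# prepared root filesystem, with /proc mounted for the new PID namespace.
sudo unshare --mount --uts --ipc --net --pid --fork \
    chroot ./rootfs /bin/sh -c 'hostname container; mount -t proc proc /proc; exec /bin/sh'

Everything beyond that -- image distribution, layered storage, network plumbing between containers, resource limits -- is the user-space infrastructure that the various tools provide in their own ways.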
I've read that on Linux, Docker uses the underlying Linux kernel to create containers. This is an advantage because resources aren't wasted on creating virtual machines that each contain a full OS.
I'm confused, though, as to why most Dockerfiles specify the OS in the FROM line of the Dockerfile. I thought that since it uses the underlying OS, the OS wouldn't have to be defined.
I would like to know what actually happens if the OS specified doesn't match the OS flavour of the machine it's running on. If the machine is CentOS but the Dockerfile has FROM debian:latest in the first line, is a virtual machine containing a Debian OS actually created?
In other words, does this result in a performance reduction because it needs to create a virtual machine containing the specified OS?
I'm confused, though, as to why most Dockerfiles specify the OS in the FROM line of the Dockerfile. I thought that since it uses the underlying OS, the OS wouldn't have to be defined.
I think your terminology may be a little confused.
Docker indeed uses the host kernel, because Docker is nothing but a way of isolating processes running on the host (that is, it's not any sort of virtualization, and it can't run a different operating system).
However, the filesystem visible inside the container has nothing to do with the host's. A Docker container can run programs from any Linux distribution. So if I am on a Fedora 24 host, I can build a container that uses an Ubuntu 14.04 userspace by starting my Dockerfile with:
FROM ubuntu:14.04
Processes running in this container are still running on the host kernel, but their entire userspace comes from the Ubuntu distribution. This isn't another "operating system" -- it's still the same Linux kernel -- but it is a completely separate filesystem.
The fact that my host is running a different kernel version than what you would find on an actual Ubuntu 14.04 host is almost irrelevant. There are going to be a few utilities that expect a particular kernel version, but most applications just don't care as long as the kernel is "recent enough".
So no, there is no virtualization in Docker. Just various sorts of isolation (processes, filesystem, networking, etc.).
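You can see this for yourself from a shell; this assumes Docker is installed and the ubuntu:14.04 image can be pulled:

uname -r                                            # host kernel release
docker run --rm ubuntu:14.04 uname -r               # same release: there is only one kernel
docker run --rm ubuntu:14.04 cat /etc/lsb-release   # but the userspace is Ubuntu 14.04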
I know base images are minimal operating systems with limited kernel features. If I want to use the Ubuntu base image for my applications, how can I know whether the kernel features included are enough for my applications? Are there any commands to show the kernel features included in the base images? Thanks a lot!
This is a common misconception regarding containerization vs. virtualization.
A Docker image is just a packaged file structure with some additional metadata. A Docker container is simply an isolated process on the host (see cgroups) using the image as its root file system (see chroot). This is what makes containers so lightweight as compared to running a full VM.
To answer your question, a Docker container can only rely on the kernel features of the host system it is running on.
If your application requires uncommon kernel features, Docker might not be the best solution, though you could easily add a check for those features as part of the container startup to inform the user with further instructions.
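Because the container sees the host's kernel, you check kernel features on the host (or from inside a container, which reports the same kernel). A hedged sketch; CONFIG_IP_SET is just an example option, and the config file locations vary by distribution:

uname -r                                        # kernel version every container will run on
grep CONFIG_IP_SET "/boot/config-$(uname -r)"   # common location on Debian/Ubuntu hosts
zgrep CONFIG_IP_SET /proc/config.gz             # only present if the kernel was built with IKCONFIG
docker run --rm ubuntu uname -r                 # a container reports the host's kernel version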
Can Docker images created on one Linux distribution (say Ubuntu) be run without problems on ANY other distribution, e.g. CentOS?
So far I have not had problems in my testing, but I am new to this.
I'd like to know if there are any specific use cases that might make a Docker container non-functional on a host node due to the host's Linux version.
Thank you
Can Docker images created on one Linux distribution (say Ubuntu) be run without problems on ANY other distribution, e.g. CentOS?
Older kernels may not have the necessary namespace support for Docker to operate correctly, although at this point Docker seems to run fine on the current releases of most common distributions.
Obviously the host must be the appropriate architecture for whatever you're running in the container. E.g., you can't run an ARM container on an x86_64 host.
If you are running tools that are tightly coupled to a particular kernel version, you may run into problems if your host kernel is substantially newer or older than what the tools expect. E.g., you have a tool that wants to use ipset, but ipset support is not available in your host kernel.
You're only likely to have an issue if you have code that relies on a kernel feature that isn't present on another host. This is certainly possible, but unusual in everyday usage.
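If you want to check the architecture match up front, something like this works (a sketch; it assumes the image has already been pulled locally):

uname -m                                                          # host architecture
docker image inspect --format '{{.Architecture}}' debian:latest   # architecture the image was built for
uname -r                                                          # host kernel that every container will share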
I'm developing for a Linux-based embedded system, where I have a build process from hell, at least to generate a full flashable binary -- tons of dependencies, proprietary compiler, etc. To make development setup easier for new developers, and uniform across our development team, I've adopted Vagrant. But, there's a snag...
So Vagrant spins up a VM and provisions it with our dependencies and tools. It then mounts the host's source tree at /vagrant inside the VM. However, we cannot build within this directory: shared folders between the host and the VM do not support mmap (at least not in VirtualBox), which the build relies upon. For our developers who run OS X, things are even worse, as their host filesystem is HFS+, which is case-insensitive out of the box, and the build requires a case-sensitive filesystem. So developers are forced to work within the VM, which is constraining if you're used to particular development tools on OS X, say, and simply want to use the terminal for compilation.
Seems what's needed is a real-time (e.g. inotify based?) bi-directional sync mechanism that would keep /vagrant in sync with say /home/vagrant, which is not a mount point but simply part of the VM's ext4 fs, so produced/edited files and symlinks are synced. Is there such a mechanism? Closest thing I've found is aufs, but I'm not sure that does what we want.
I ended up using NFS (which ships with Vagrant and OS X), and it's been working well for a couple of months now.
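For reference, the relevant configuration looks roughly like this (a sketch; NFS synced folders in Vagrant require a private network, and the IP address is just an example):

# In the Vagrantfile (Ruby DSL), switch the synced folder to NFS; with the
# VirtualBox provider this requires a host-only private network:
#   config.vm.network "private_network", ip: "192.168.50.4"
#   config.vm.synced_folder ".", "/vagrant", type: "nfs"
# Then recreate the VM's exports and verify the mount type:
vagrant reload
vagrant ssh -c "mount | grep /vagrant"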