Using multiple docker containers on the same host securely like isolated instances - security

I know, multiple Docker containers can be used in the same host, but can they be used securely like isolated instances? I want to run multiple secure and sandboxed containers such that no container can affect or access others.
For instance, can I serve nginx and apache containers which listen to different ports, with full trust that each container can only access their own files, resources etc?

In some sense you are asking the million dollar question with containers, and to be clear, IMHO there is no black and white answer to the question "is the platform/technology secure enough." It is a big (and important) enough question that the list of startups--not to mention amount of funding they've received--around container security is an appreciable number!
As noted in another answer, isolation for containers is realized through an assortment of Linux kernel capabilities (namespaces and cgroups), and adding more security to these capabilities is yet another set of technologies like seccomp, apparmor (or SELinux), user namespaces, or general hardening of the container runtime & node it is installed on (e.g. via the CIS benchmark guidelines). Out of the box default installation and default runtime parameters are probably not good enough for generically trusting in the kernel isolation primitives of Linux. However, this depends greatly on the trust level of what you are running across your container workloads. For example, is this all in-house within one organization? Can workloads be submitted from external sources? Obviously the spectrum of possibilities may greatly impact your level of trust.
If your use case is potentially narrow (for example, you mention serving web content from nginx or apache), and you are willing to do some work on base image creation, minimization and hardening, you can go further: add a read-only root filesystem (docker run --read-only), a capability-limiting AppArmor and seccomp profile, and bind mounts for the served content plus a writeable area, with no executables and with ownership by an unprivileged user. All those things together might be enough for a specific use case.
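To make the list of restrictions above concrete, here is a minimal sketch that assembles a docker run command line with those options. The image name, the host content directory, the /srv/www mount target and the uid/gid are all hypothetical placeholders; the flags themselves (--read-only, --cap-drop, --security-opt no-new-privileges, --user, --mount, --tmpfs) are standard docker run options. It only builds and prints the command, it does not start anything.

```python
import shlex


def hardened_run_args(image, content_dir, uid=1000, gid=1000):
    """Build a `docker run` argument list with a restrictive set of options.

    `image`, `content_dir`, the /srv/www target and the uid/gid are
    illustrative assumptions, not values taken from the answer above.
    """
    return [
        "docker", "run", "--detach",
        "--read-only",                          # read-only root filesystem
        "--cap-drop", "ALL",                    # drop every Linux capability
        "--security-opt", "no-new-privileges",  # block setuid-style escalation
        "--user", f"{uid}:{gid}",               # run as an unprivileged user
        # served content is bind-mounted read-only
        "--mount", f"type=bind,source={content_dir},target=/srv/www,readonly",
        # small writable scratch area where nothing can be executed
        "--tmpfs", "/tmp:rw,noexec,nosuid",
        image,
    ]


if __name__ == "__main__":
    print(shlex.join(hardened_run_args("my-hardened-nginx", "/srv/site")))
```

With all capabilities dropped and a non-root user, the server inside the image would have to listen on an unprivileged port (above 1024). Custom seccomp or AppArmor profiles would be passed as further --security-opt values.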
However, there is no guarantee that a currently unknown escape will not become a "0day" for Linux containers in the future, and that has led to promotion of lightweight virtualization that marries container isolation with actual hardware-level virtualization through shims from hyper.sh or Intel Clear Containers, as two examples. This is a happy medium between running a full virtualized OS with another container runtime and trusting kernel isolation with a single daemon on a single node. There is still a performance cost and memory overhead to adding this layer of isolation, but it is much less than a fully virtualized OS and work continues to make this less of a performance impact.
For a deeper set of information on all the "knobs" available for tuning container security, a presentation I gave last year several times is available on slideshare as well as via video from Skillsmatter.
The incredibly thorough "Understanding and Hardening Linux Containers" by Aaron Grattafiori is also a great resource with exhaustive detail on many of the same topics.

Filesystem isolation (as well as memory and process isolation) is a core feature of Docker containers, based on Linux kernel capabilities.
But if you wanted to be completely sure, you would deploy your containers on different nodes (each managed by their own docker daemons), each node being a VM (Virtual Machine) on your host, ensuring a complete sandbox.
Then Docker Swarm or Kubernetes would be able to orchestrate those nodes and their containers, and make them communicate.
This is normally not needed when you have just a few linked containers: they should be able to be managed in isolation by a single docker daemon. You could use user namespaces for additional isolation.
Plus, using nodes to separate containers implies different machines or different VMs within the same machine.
And one big difference between a VM and a container is that a VM will preempt resources (allocate a fixed minimum amount of disk/memory/CPU), which means you cannot launch a hundred VMs, one per container. By contrast, with a single docker instance, a container that does nothing won't consume much disk space/memory/CPU at all.

Related

What is the difference between guestOS from VM and Base Image from Docker? [duplicate]

I keep rereading the Docker documentation to try to understand the difference between Docker and a full VM. How does it manage to provide a full filesystem, isolated networking environment, etc. without being as heavy?
Why is deploying software to a Docker image (if that's the right term) easier than simply deploying to a consistent production environment?
Docker originally used Linux Containers (LXC), but later switched to runC (formerly known as libcontainer), which runs in the same operating system as its host. This allows it to share a lot of the host operating system resources. Also, it uses a layered filesystem (AuFS) and manages networking.
AuFS is a layered file system, so you can have a read only part and a write part which are merged together. One could have the common parts of the operating system as read only (and shared amongst all of your containers) and then give each container its own mount for writing.
So, let's say you have a 1 GB container image; if you wanted to use a full VM, you would need to have 1 GB x number of VMs you want. With Docker and AuFS you can share the bulk of the 1 GB between all the containers and if you have 1000 containers you still might only have a little over 1 GB of space for the containers OS (assuming they are all running the same OS image).
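The arithmetic behind that comparison can be sketched in a few lines. The 100 KiB of private writable data per container is an assumed figure chosen only for illustration; real numbers depend entirely on what each container writes.

```python
KIB_PER_GIB = 1024 * 1024


def full_vm_disk_kib(instances, image_kib):
    """Every VM carries its own complete copy of the OS image."""
    return instances * image_kib


def layered_disk_kib(instances, image_kib, writable_kib_each):
    """Containers share one read-only image; each adds only its writable layer."""
    return image_kib + instances * writable_kib_each


if __name__ == "__main__":
    image = 1 * KIB_PER_GIB  # the 1 GB image from the example above
    vms = full_vm_disk_kib(1000, image)
    # 100 KiB of private data per container is an assumption for illustration
    containers = layered_disk_kib(1000, image, writable_kib_each=100)
    print(f"1000 VMs:        {vms / KIB_PER_GIB:.2f} GiB")
    print(f"1000 containers: {containers / KIB_PER_GIB:.2f} GiB")
```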
A full virtualized system gets its own set of resources allocated to it, and does minimal sharing. You get more isolation, but it is much heavier (requires more resources). With Docker you get less isolation, but the containers are lightweight (require fewer resources). So you could easily run thousands of containers on a host, and it won't even blink. Try doing that with Xen, and unless you have a really big host, I don't think it is possible.
A full virtualized system usually takes minutes to start, whereas Docker/LXC/runC containers take seconds, and often even less than a second.
There are pros and cons for each type of virtualized system. If you want full isolation with guaranteed resources, a full VM is the way to go. If you just want to isolate processes from each other and want to run a ton of them on a reasonably sized host, then Docker/LXC/runC seems to be the way to go.
For more information, check out this set of blog posts which do a good job of explaining how LXC works.
Why is deploying software to a docker image (if that's the right term) easier than simply deploying to a consistent production environment?
Deploying a consistent production environment is easier said than done. Even if you use tools like Chef and Puppet, there are always OS updates and other things that change between hosts and environments.
Docker gives you the ability to snapshot the OS into a shared image, and makes it easy to deploy on other Docker hosts. Locally, dev, qa, prod, etc.: all the same image. Sure you can do this with other tools, but not nearly as easily or fast.
This is great for testing; let's say you have thousands of tests that need to connect to a database, and each test needs a pristine copy of the database and will make changes to the data. The classic approach to this is to reset the database after every test either with custom code or with tools like Flyway - this can be very time-consuming and means that tests must be run serially. However, with Docker you could create an image of your database and run up one instance per test, and then run all the tests in parallel since you know they will all be running against the same snapshot of the database. Since the tests are running in parallel and in Docker containers they could run all on the same box at the same time and should finish much faster. Try doing that with a full VM.
From comments...
Interesting! I suppose I'm still confused by the notion of "snapshot[ting] the OS". How does one do that without, well, making an image of the OS?
Well, let's see if I can explain. You start with a base image, and then make your changes, and commit those changes using docker, and it creates an image. This image contains only the differences from the base. When you want to run your image, you also need the base, and it layers your image on top of the base using a layered file system: as mentioned above, Docker uses AuFS. AuFS merges the different layers together and you get what you want; you just need to run it. You can keep adding more and more images (layers) and it will continue to only save the diffs. Since Docker typically builds on top of ready-made images from a registry, you rarely have to "snapshot" the whole OS yourself.
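As a rough mental model of "only the diffs are saved", each layer can be pictured as a dictionary of paths, with lookups falling through from the top layer to the base. This is only a toy sketch using Python's ChainMap, not how AuFS is implemented; the file names and contents are made up.

```python
from collections import ChainMap


def run_container(*image_layers_top_first):
    """Stack read-only image layers under a fresh, empty writable layer.

    Reads fall through the layers from top to bottom; writes land only in
    the writable layer, so the image layers underneath are never modified.
    """
    writable = {}
    return ChainMap(writable, *image_layers_top_first)


# Hypothetical layers: a base image plus a committed diff on top of it.
base = {"/bin/sh": "shell", "/etc/motd": "welcome to base"}
my_image = {"/app/server.py": "print('hi')", "/etc/motd": "welcome to my app"}

container = run_container(my_image, base)
container["/tmp/scratch"] = "written at runtime"

assert container["/bin/sh"] == "shell"               # served by the base layer
assert container["/etc/motd"] == "welcome to my app"  # the diff shadows the base
assert "/tmp/scratch" not in base                     # base stays untouched
```

A second container started from the same layers would get its own empty writable dictionary while reusing base and my_image as they are.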
It might be helpful to understand how virtualization and containers work at a low level. That will clear up a lot of things.
Note: I'm simplifying a bit in the description below. See references for more information.
How does virtualization work at a low level?
In this case the VM manager takes over the CPU ring 0 (or the "root mode" in newer CPUs) and intercepts all privileged calls made by the guest OS to create the illusion that the guest OS has its own hardware. Fun fact: Before 1998 it was thought to be impossible to achieve this on the x86 architecture because there was no way to do this kind of interception. The folks at VMware were the first who had an idea to rewrite the executable bytes in memory for privileged calls of the guest OS to achieve this.
The net effect is that virtualization allows you to run two completely different OSes on the same hardware. Each guest OS goes through all the processes of bootstrapping, loading kernel, etc. You can have very tight security. For example, a guest OS can't get full access to the host OS or other guests and mess things up.
How do containers work at a low level?
Around 2006, people including some of the employees at Google implemented a new kernel-level feature called namespaces (although a similar idea had existed in FreeBSD long before). One function of the OS is to allow sharing of global resources like network and disks among processes. What if these global resources were wrapped in namespaces so that they are visible only to those processes that run in the same namespace? Say, you can get a chunk of disk and put that in namespace X, and then processes running in namespace Y can't see or access it. Similarly, processes in namespace X can't access anything in memory that is allocated to namespace Y. Of course, processes in X can't see or talk to processes in namespace Y. This provides a kind of virtualization and isolation for global resources. This is how Docker works: each container runs in its own namespace but uses exactly the same kernel as all other containers. The isolation happens because the kernel knows the namespace that was assigned to the process, and during API calls it makes sure that the process can only access resources in its own namespace.
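You can look at the namespaces a process belongs to yourself: on Linux the kernel exposes them as symbolic links under /proc/<pid>/ns. The small sketch below reads them for the current process; it assumes a Linux host and simply returns an empty result elsewhere. Two processes that print the same identifier for, say, pid or net are in the same namespace, while a process inside a container will normally show different identifiers from a process on the host.

```python
import os


def namespace_ids(pid="self"):
    """Return {namespace name: identifier} for a process, read from /proc.

    Each entry in /proc/<pid>/ns is a symlink whose target looks like
    'pid:[4026531836]'. On non-Linux systems, or when the directory cannot
    be read, an empty dict is returned.
    """
    ns_dir = f"/proc/{pid}/ns"
    ids = {}
    try:
        names = os.listdir(ns_dir)
    except OSError:
        return ids
    for name in sorted(names):
        try:
            ids[name] = os.readlink(os.path.join(ns_dir, name))
        except OSError:
            pass  # entry vanished or is not readable; skip it
    return ids


if __name__ == "__main__":
    for name, ident in namespace_ids().items():
        print(f"{name:20} {ident}")
```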
The limitations of containers vs VMs should be obvious now: You can't run completely different OSes in containers like in VMs. However, you can run different distros of Linux because they share the same kernel. The isolation level is not as strong as in a VM. In fact, there was a way for a "guest" container to take over the host in early implementations. Also, you can see that when you load a new container, an entire new copy of the OS doesn't start like it does in a VM. All containers share the same kernel. This is why containers are lightweight. Also, unlike a VM, you don't have to pre-allocate a significant chunk of memory to containers because we are not running a new copy of the OS. This enables running thousands of containers on one OS while sandboxing them, which might not be possible if we were running separate copies of the OS in their own VMs.
Good answers. Just to get an image representation of container vs VM, have a look at the one below.
I like Ken Cochrane's answer.
But I want to add an additional point of view, not covered in detail here. In my opinion Docker also differs in the whole process. In contrast to VMs, Docker is not (only) about optimal sharing of hardware resources; it also provides a "system" for packaging applications (preferably, but not necessarily, as a set of microservices).
To me it fits in the gap between developer-oriented tools like rpm, Debian packages, Maven, npm + Git on one side and ops tools like Puppet, VMware, Xen, you name it, on the other.
Why is deploying software to a docker image (if that's the right term) easier than simply deploying to a consistent production environment?
Your question assumes some consistent production environment. But how to keep it consistent?
Consider some number (>10) of servers and applications, and the stages in the pipeline.
To keep this in sync you'll start to use something like Puppet, Chef or your own provisioning scripts, unpublished rules and/or a lot of documentation... In theory servers can run indefinitely, and be kept completely consistent and up to date. In practice, it is hard to manage a server's configuration completely, so there is considerable scope for configuration drift and unexpected changes to running servers.
So there is a known pattern to avoid this, the so-called immutable server. But the immutable server pattern was not loved, mostly because of the limitations of the VMs that were used before Docker. Dealing with images several gigabytes in size, and moving those big images around just to change some fields in the application, was very laborious. Understandable...
With a Docker ecosystem, you will never need to move around gigabytes on "small changes" (thanks to aufs and the Registry) and you don't need to worry about losing performance by packaging applications into a Docker container at runtime. You don't need to worry about versions of that image.
And finally, you will often be able to reproduce even complex production environments on your Linux laptop (don't call me if it doesn't work in your case ;))
And of course you can start Docker containers in VMs (it's a good idea). Reduce your server provisioning on the VM level. All the above could be managed by Docker.
P.S. Meanwhile Docker uses its own implementation "libcontainer" instead of LXC. But LXC is still usable.
Docker isn't a virtualization methodology. It relies on other tools that actually implement container-based virtualization or operating system level virtualization. For that, Docker was initially using LXC driver, then moved to libcontainer which is now renamed as runc. Docker primarily focuses on automating the deployment of applications inside application containers. Application containers are designed to package and run a single service, whereas system containers are designed to run multiple processes, like virtual machines. So, Docker is considered as a container management or application deployment tool on containerized systems.
In order to know how it is different from other virtualizations, let's go through virtualization and its types. Then, it would be easier to understand what's the difference there.
Virtualization
In its conceived form, it was considered a method of logically dividing mainframes to allow multiple applications to run simultaneously. However, the scenario drastically changed when companies and open source communities were able to provide a method of handling the privileged instructions in one way or another and allow for multiple operating systems to be run simultaneously on a single x86 based system.
Hypervisor
The hypervisor handles creating the virtual environment on which the guest virtual machines operate. It supervises the guest systems and makes sure that resources are allocated to the guests as necessary. The hypervisor sits in between the physical machine and virtual machines and provides virtualization services to the virtual machines. To realize it, it intercepts the guest operating system operations on the virtual machines and emulates the operation on the host machine's operating system.
The rapid development of virtualization technologies, primarily in cloud, has driven the use of virtualization further by allowing multiple virtual servers to be created on a single physical server with the help of hypervisors, such as Xen, VMware Player, KVM, etc., and incorporation of hardware support in commodity processors, such as Intel VT and AMD-V.
Types of Virtualization
The virtualization method can be categorized based on how it mimics hardware to a guest operating system and emulates a guest operating environment. Primarily, there are three types of virtualization:
Emulation
Paravirtualization
Container-based virtualization
Emulation
Emulation, also known as full virtualization, runs the virtual machine OS kernel entirely in software. The hypervisor used in this type is known as a Type 2 hypervisor. It is installed on top of the host operating system, which is responsible for translating guest OS kernel code to software instructions. The translation is done entirely in software and requires no hardware involvement. Emulation makes it possible to run any non-modified operating system that supports the environment being emulated. The downside of this type of virtualization is an additional system resource overhead that leads to a decrease in performance compared to other types of virtualization.
Examples in this category include VMware Player, VirtualBox, QEMU, Bochs, Parallels, etc.
Paravirtualization
Paravirtualization, also known as Type 1 hypervisor, runs directly on the hardware, or “bare-metal”, and provides virtualization services directly to the virtual machines running on it. It helps the operating system, the virtualized hardware, and the real hardware to collaborate to achieve optimal performance. These hypervisors typically have a rather small footprint and do not, themselves, require extensive resources.
Examples in this category include Xen, KVM, etc.
Container-based Virtualization
Container-based virtualization, also known as operating system-level virtualization, enables multiple isolated executions within a single operating system kernel. It has the best possible performance and density and features dynamic resource management. The isolated virtual execution environment provided by this type of virtualization is called a container and can be viewed as a traced group of processes.
The concept of a container is made possible by the namespaces feature added to Linux kernel version 2.6.24. The container adds its ID to every process and adds new access control checks to every system call. The feature is accessed through the clone() system call, which allows creating separate instances of previously-global namespaces.
Namespaces can be used in many different ways, but the most common approach is to create an isolated container that has no visibility or access to objects outside the container. Processes running inside the container appear to be running on a normal Linux system although they are sharing the underlying kernel with processes located in other namespaces, same for other kinds of objects. For instance, when using namespaces, the root user inside the container is not treated as root outside the container, adding additional security.
The Linux Control Groups (cgroups) subsystem, the next major component to enable container-based virtualization, is used to group processes and manage their aggregate resource consumption. It is commonly used to limit the memory and CPU consumption of containers. Since a containerized Linux system has only one kernel and the kernel has full visibility into the containers, there is only one level of resource allocation and scheduling.
Several management tools are available for Linux containers, including LXC, LXD, systemd-nspawn, lmctfy, Warden, Linux-VServer, OpenVZ, Docker, etc.
Containers vs Virtual Machines
Unlike a virtual machine, a container does not need to boot the operating system kernel, so containers can be created in less than a second. This feature makes container-based virtualization unique and more desirable than other virtualization approaches.
Since container-based virtualization adds little or no overhead to the host machine, container-based virtualization has near-native performance.
For container-based virtualization, no additional software is required, unlike other types of virtualization.
All containers on a host machine share the scheduler of the host machine, saving the need for extra resources.
Container states (Docker or LXC images) are small in size compared to virtual machine images, so container images are easy to distribute.
Resource management in containers is achieved through cgroups. Cgroups do not allow containers to consume more resources than are allocated to them. However, as of now, all resources of the host machine are visible in containers, but can't be used. You can see this by running top or htop in containers and on the host machine at the same time. The output across all environments will look similar.
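Because top shows the host's resources even inside a container, one way to see the limit that actually applies is to read it from the cgroup filesystem. This is a hedged sketch: the two paths below are the usual locations for the memory limit under cgroup v2 and cgroup v1 as seen from inside a container, but the layout can differ between hosts, so treat them as assumptions.

```python
from pathlib import Path

# Typical locations inside a container; these are assumptions and may
# differ depending on the host's cgroup version and mount layout.
CGROUP_V2_FILE = Path("/sys/fs/cgroup/memory.max")
CGROUP_V1_FILE = Path("/sys/fs/cgroup/memory/memory.limit_in_bytes")


def parse_memory_limit(raw):
    """Turn the file content into a byte count, or None when unlimited.

    cgroup v2 writes the literal string 'max' when no limit is set.
    (cgroup v1 instead reports a very large number in that case.)
    """
    raw = raw.strip()
    if raw == "max":
        return None
    return int(raw)


def container_memory_limit():
    """Return the memory limit in bytes, or None if unlimited or unknown."""
    for path in (CGROUP_V2_FILE, CGROUP_V1_FILE):
        try:
            return parse_memory_limit(path.read_text())
        except (OSError, ValueError):
            continue  # file missing or unreadable; try the next location
    return None


if __name__ == "__main__":
    print("memory limit (bytes):", container_memory_limit())
```

Running this in a container started with a memory limit should print that limit, even though top in the same container still reports the host's total memory.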
Update:
How does Docker run containers in non-Linux systems?
If containers are possible because of the features available in the Linux kernel, then the obvious question is how non-Linux systems run containers. Both Docker for Mac and Docker for Windows use Linux VMs to run the containers. Docker Toolbox used to run containers in VirtualBox VMs. But the latest Docker uses Hyper-V on Windows and Hypervisor.framework on Mac.
Now, let me describe how Docker for Mac runs containers in detail.
Docker for Mac uses https://github.com/moby/hyperkit to emulate the hypervisor capabilities and Hyperkit uses hypervisor.framework in its core. Hypervisor.framework is Mac's native hypervisor solution. Hyperkit also uses VPNKit and DataKit to namespace network and filesystem respectively.
The Linux VM that Docker runs in Mac is read-only. However, you can bash into it by running:
screen ~/Library/Containers/com.docker.docker/Data/vms/0/tty
Now, we can even check the Kernel version of this VM:
# uname -a
Linux linuxkit-025000000001 4.9.93-linuxkit-aufs #1 SMP Wed Jun 6 16:… x86_64 Linux
All containers run inside this VM.
There are some limitations to hypervisor.framework. Because of that Docker doesn't expose docker0 network interface in Mac. So, you can't access containers from the host. As of now, docker0 is only available inside the VM.
Hyper-V is the native hypervisor in Windows. They are also trying to leverage Windows 10's capabilities to run Linux systems natively.
Most of the answers here talk about virtual machines. I'm going to give you a one-liner response to this question that has helped me the most over the last couple years of using Docker. It's this:
Docker is just a fancy way to run a process, not a virtual machine.
Now, let me explain a bit more about what that means. Virtual machines are their own beast. I feel like explaining what Docker is will help you understand this more than explaining what a virtual machine is. Especially because there are many fine answers here telling you exactly what someone means when they say "virtual machine". So...
A Docker container is just a process (and its children) that is compartmentalized, using namespaces and cgroups inside the host system's kernel, from the rest of the processes. You can actually see your Docker container processes by running ps aux on the host. For example, starting apache2 "in a container" is just starting apache2 as a special process on the host. It's just been compartmentalized from other processes on the machine. It is important to note that your containers do not exist outside of your containerized process' lifetime. When your process dies, your container dies. That's because Docker replaces pid 1 inside your container with your application (pid 1 is normally the init system). This last point about pid 1 is very important.
As far as the filesystem used by each of those container processes, Docker uses UnionFS-backed images, which is what you're downloading when you do a docker pull ubuntu. Each "image" is just a series of layers and related metadata. The concept of layering is very important here. Each layer is just a change from the layer underneath it. For example, when you delete a file in your Dockerfile while building a Docker container, you're actually just creating a layer on top of the last layer which says "this file has been deleted". Incidentally, this is why you can delete a big file from your filesystem, but the image still takes up the same amount of disk space. The file is still there, in the layers underneath the current one. Layers themselves are just tarballs of files. You can test this out with docker save --output /tmp/ubuntu.tar ubuntu and then cd /tmp && tar xvf ubuntu.tar. Then you can take a look around. All those directories that look like long hashes are actually the individual layers. Each one contains files (layer.tar) and metadata (json) with information about that particular layer. Those layers just describe changes to the filesystem which are saved as a layer "on top of" its original state. When reading the "current" data, the filesystem reads data as though it were looking only at the top-most layers of changes. That's why the file appears to be deleted, even though it still exists in "previous" layers, because the filesystem is only looking at the top-most layers. This allows completely different containers to share their filesystem layers, even though some significant changes may have happened to the filesystem on the top-most layers in each container. This can save you a ton of disk space, when your containers share their base image layers. However, when you mount directories and files from the host system into your container by way of volumes, those volumes "bypass" the UnionFS, so changes are not stored in layers.
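The "deleted but still taking up space" behaviour can be modelled in a few lines. This is a toy model, not Docker's actual storage format: each layer is a dictionary of path to size in bytes, and a special marker stands in for the "this file has been deleted" record. The file names and sizes are made up.

```python
WHITEOUT = object()  # stand-in marker for "this path was deleted in this layer"


def visible_files(layers_bottom_first):
    """Replay the layers in order and return what the top-most view shows."""
    view = {}
    for layer in layers_bottom_first:
        for path, size in layer.items():
            if size is WHITEOUT:
                view.pop(path, None)  # hide the file from the merged view
            else:
                view[path] = size
    return view


def stored_bytes(layers_bottom_first):
    """Total bytes kept on disk: every layer still holds everything it added."""
    return sum(
        size
        for layer in layers_bottom_first
        for size in layer.values()
        if size is not WHITEOUT
    )


layers = [
    {"/big.bin": 500_000_000, "/etc/conf": 1_000},  # lower layer adds a big file
    {"/big.bin": WHITEOUT},                         # upper layer "deletes" it
]

assert visible_files(layers) == {"/etc/conf": 1_000}  # the file looks gone
assert stored_bytes(layers) == 500_001_000            # but its bytes are still stored
```

This is why deleting a large file in a later build step does not shrink the image: the deletion is just one more layer on top.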
Networking in Docker is achieved by using an ethernet bridge (called docker0 on the host), and virtual interfaces for every container on the host. It creates a virtual subnet in docker0 for your containers to communicate "between" one another. There are many options for networking here, including creating custom subnets for your containers, and the ability to "share" your host's networking stack for your container to access directly.
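To picture the virtual subnet, here is a small sketch using Python's ipaddress module. It assumes the commonly seen default of 172.17.0.0/16 for docker0, with the bridge itself taking the first host address and containers taking the following ones; your installation may be configured differently, and Docker's real address allocation is not guaranteed to be strictly sequential.

```python
import ipaddress


def plan_bridge(subnet, container_count):
    """Return (gateway, [container addresses]) for a bridge subnet.

    The first usable host address is treated as the bridge/gateway and the
    next ones are handed to containers in order. This is a simplification
    for illustration, not Docker's actual allocator.
    """
    hosts = ipaddress.ip_network(subnet).hosts()
    gateway = next(hosts)
    containers = [next(hosts) for _ in range(container_count)]
    return gateway, containers


if __name__ == "__main__":
    # 172.17.0.0/16 is assumed here as the docker0 default
    gateway, containers = plan_bridge("172.17.0.0/16", 3)
    print("bridge (docker0):", gateway)
    for index, address in enumerate(containers, start=1):
        print(f"container {index}:", address)
```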
Docker is moving very fast. Its documentation is some of the best documentation I've ever seen. It is generally well-written, concise, and accurate. I recommend you check the documentation available for more information, and trust the documentation over anything else you read online, including Stack Overflow. If you have specific questions, I highly recommend joining #docker on Freenode IRC and asking there (you can even use Freenode's webchat for that!).
Through this post we are going to draw some lines of differences between VMs and LXCs. Let's first define them.
VM:
A virtual machine emulates a physical computing environment, but requests for CPU, memory, hard disk, network and other hardware resources are managed by a virtualization layer which translates these requests to the underlying physical hardware.
In this context the VM is called the guest, while the environment it runs on is called the host.
LXCs:
Linux Containers (LXC) are operating system-level capabilities that make it possible to run multiple isolated Linux containers on one control host (the LXC host). Linux Containers serve as a lightweight alternative to VMs as they don’t require hypervisors such as VirtualBox, KVM, Xen, etc.
Now unless you were drugged by Alan (Zach Galifianakis, from the Hangover series) and have been in Vegas for the last year, you will be well aware of the tremendous surge of interest in Linux container technology. To be specific, one container project that has created a buzz around the world in the last few months is Docker, leading to some echoing opinions that cloud computing environments should abandon virtual machines (VMs) and replace them with containers due to their lower overhead and potentially better performance.
But the big question is: is it feasible? Will it be sensible?
a. LXCs are scoped to an instance of Linux. It might be different flavors of Linux (e.g. an Ubuntu container on a CentOS host), but it’s still Linux. Similarly, Windows-based containers are scoped to an instance of Windows. VMs, by contrast, have a much broader scope: using hypervisors, you are not limited to the Linux or Windows operating systems.
b. LXCs have low overhead and better performance compared to VMs. Tools such as Docker, which are built on the shoulders of LXC technology, have provided developers with a platform to run their applications and at the same time have empowered operations people with a tool that allows them to deploy the same container on production servers or data centers. It tries to make the experience between a developer running, booting and testing an application and an operations person deploying that application seamless, because this is where all the friction lies, and the purpose of DevOps is to break down those silos.
So the best approach is for cloud infrastructure providers to advocate appropriate use of both VMs and LXC, as they are each suited to handle specific workloads and scenarios.
Abandoning VMs is not practical as of now. So both VMs and LXCs have their own individual existence and importance.
Docker encapsulates an application with all its dependencies.
A virtualizer encapsulates an OS that can run any applications it can normally run on a bare metal machine.
They are very different. Docker is lightweight and uses LXC/libcontainer (which relies on kernel namespacing and cgroups) and does not have machine/hardware emulation such as a hypervisor (KVM, Xen), which is heavy.
Docker and LXC are meant more for sandboxing, containerization, and resource isolation. They use the host OS's (currently only the Linux kernel's) clone API, which provides namespacing for IPC, NS (mount), network, PID, UTS, etc.
What about memory, I/O, CPU, etc.? That is controlled using cgroups where you can create groups with certain resource (CPU, memory, etc.) specification/restriction and put your processes in there. On top of LXC, Docker provides a storage backend (http://www.projectatomic.io/docs/filesystems/) e.g., union mount filesystem where you can add layers and share layers between different mount namespaces.
This is a powerful feature where the base images are typically readonly and only when the container modifies something in the layer will it write something to read-write partition (a.k.a. copy on write). It also provides many other wrappers such as registry and versioning of images.
With normal LXC you need to come with some rootfs or share the rootfs, and when it is shared, the changes are reflected in other containers. Due to a lot of these added features, Docker is more popular than LXC. LXC is popular in embedded environments for implementing security around processes exposed to external entities such as network and UI. Docker is popular in cloud multi-tenancy environments where a consistent production environment is expected.
A normal VM (for example, VirtualBox and VMware) uses a hypervisor, and related technologies either have dedicated firmware that becomes the first layer for the first OS (host OS, or guest OS 0) or software that runs on the host OS to provide hardware emulation such as CPU, USB/accessories, memory, network, etc., to the guest OSes. VMs are still (as of 2015) popular in high security multi-tenant environments.
Docker/LXC can be run on almost any cheap hardware (less than 1 GB of memory is also OK as long as you have a newer kernel), whereas normal VMs need at least 2 GB of memory, etc., to do anything meaningful. But Docker support on the host OS is not available in OSes such as Windows (as of Nov 2014), whereas many types of VMs can be run on Windows, Linux, and Macs.
Here is a pic from docker/rightscale :
1. Lightweight
This is probably the first impression for many docker learners.
First, Docker images are usually smaller than VM images, which makes them easy to build, copy, and share.
Second, Docker containers can start in several milliseconds, while a VM starts in seconds.
2. Layered File System
This is another key feature of Docker. Images have layers, and different images can share layers, which makes them even more space-saving and faster to build.
If all containers use Ubuntu as their base image, not every image has its own file system; they share the same underlying Ubuntu files and differ only in their own application data.
3. Shared OS Kernel
Think of containers as processes!
All containers running on a host are indeed a bunch of processes with different file systems. They share the same OS kernel; each one encapsulates only its system libraries and dependencies.
This is good for most cases (no extra OS kernel to maintain) but can be a problem if strict isolation is necessary between containers.
Why does it matter?
All these seem like improvements, not revolution. Well, quantitative accumulation leads to qualitative transformation.
Think about application deployment. If we want to deploy a new piece of software (a service) or upgrade one, it is better to change the config files and processes than to create a new VM, because creating a VM with the updated service, testing it (sharing it between Dev and QA), and deploying it to production takes hours, even days. If anything goes wrong, you have to start again, wasting even more time. So using a configuration management tool (Puppet, SaltStack, Chef, etc.) to install new software and download new files is preferred.
When it comes to Docker, it's possible to use a newly created Docker container to replace the old one. Maintenance is much easier! Building a new image, sharing it with QA, testing it, and deploying it takes only minutes (if everything is automated), hours in the worst case. This is called immutable infrastructure: do not maintain (upgrade) software, create a new one instead.
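As a rough sketch of what "replace instead of upgrade" looks like in practice, the Python helper below only builds the Docker CLI invocations; it does not run them. The container name and image reference are hypothetical, and a real rollout would add health checks and error handling.

```python
import shlex


def replace_container_commands(name, new_image):
    """Return the docker CLI invocations that swap a running container for a fresh one."""
    return [
        ["docker", "pull", new_image],   # fetch the newly built image
        ["docker", "stop", name],        # stop the old container
        ["docker", "rm", name],          # delete it instead of upgrading it in place
        ["docker", "run", "--detach", "--name", name, new_image],
    ]


# Hypothetical container name and image reference, for illustration only.
for argv in replace_container_commands("web", "registry.example.com/shop/web:1.4.2"):
    print(shlex.join(argv))
```

Nothing inside the old container is patched; it is thrown away and a container from the new image takes its place.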
It transforms how services are delivered. We want applications, but have to maintain VMs (which is a pain and has little to do with our applications). Docker lets you focus on applications and smooths everything else.
Docker (containers, basically) supports OS virtualization, i.e. your application feels that it has a complete instance of an OS, whereas a VM supports hardware virtualization: it feels like a physical machine on which you can boot any OS.
In Docker, the containers running share the host OS kernel, whereas in VMs they have their own OS files. The environment (the OS) in which you develop an application would be same when you deploy it to various serving environments, such as "testing" or "production".
For example, if you develop a web server that runs on port 4000, when you deploy it to your "testing" environment, that port is already used by some other program, so it stops working. In containers there are layers; all the changes you have made to the OS would be saved in one or more layers and those layers would be part of image, so wherever the image goes the dependencies would be present as well.
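The port clash described above is easy to reproduce. The sketch below (plain Python sockets, nothing Docker-specific) occupies a port the way "some other program" would and then checks whether a second bind to the same port succeeds. The helper name `port_is_free` is made up for this example.

```python
import socket


def port_is_free(port, host="127.0.0.1"):
    """Return True if a TCP socket can be bound to host:port right now."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        try:
            probe.bind((host, port))
        except OSError:
            return False
        return True


# Occupy a port the way "some other program" would; port 0 lets the OS pick one.
with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as other_program:
    other_program.bind(("127.0.0.1", 0))
    other_program.listen()
    taken_port = other_program.getsockname()[1]
    free_while_held = port_is_free(taken_port)

print(free_while_held)  # False: the port was already in use
```

With Docker's default bridge networking each container gets its own network namespace, so the server can keep listening on 4000 inside the container while you publish it on whatever host port happens to be free.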
In the example shown below, the host machine has three VMs. In order to provide the applications in the VMs complete isolation, they each have their own copies of OS files, libraries and application code, along with a full in-memory instance of an OS.
Whereas the figure below shows the same scenario with containers. Here, containers simply share the host operating system, including the kernel and libraries, so they don’t need to boot an OS, load libraries or pay a private memory cost for those files. The only incremental space they take is any memory and disk space necessary for the application to run in the container. While the application’s environment feels like a dedicated OS, the application deploys just like it would onto a dedicated host. The containerized application starts in seconds and many more instances of the application can fit onto the machine than in the VM case.
Source: https://azure.microsoft.com/en-us/blog/containers-docker-windows-and-trends/
There are three different setups that provide a stack to run an application on (this will help us recognize what a container is and what makes it so much more powerful than other solutions):
1) Traditional Servers(bare metal)
2) Virtual machines (VMs)
3) Containers
1) The traditional server stack consists of a physical server that runs an operating system and your application.
Advantages:
Utilization of raw resources
Isolation
Disadvantages:
Very slow deployment time
Expensive
Wasted resources
Difficult to scale
Difficult to migrate
Complex configuration
2) The VM stack consists of a physical server that runs an operating system and a hypervisor that manages your virtual machines, shared resources, and networking interface. Each VM runs a guest operating system and an application or set of applications.
Advantages:
Good use of resources
Easy to scale
Easy to backup and migrate
Cost efficiency
Flexibility
Disadvantages:
Resource allocation is problematic
Vendor lock-in
Complex configuration
3) The container setup: the key difference from the other stacks is that container-based virtualization uses the kernel of the host OS to run multiple isolated guest instances. These guest instances are called containers. The host can be either a physical server or a VM.
Advantages:
Isolation
Lightweight
Resource effective
Easy to migrate
Security
Low overhead
Mirror production and development environment
Disadvantages:
Same Architecture
Resource heavy apps
Networking and security issues.
By comparing the container setup with its predecessors, we can conclude that containerization is the fastest, most resource effective, and most secure setup we know to date. Containers are isolated instances that run your application. Docker spins up a container from image layers using the default storage driver (the overlay driver) and creates a copy-on-write layer on top of them for the container's changes; this is what lets containers start within seconds. A VM, in contrast, takes around a minute to load everything into the virtualized environment. These lightweight instances can be replaced, rebuilt, and moved around easily. This allows us to mirror the production and development environments and is a tremendous help in CI/CD processes. The advantages containers can provide are so compelling that they're definitely here to stay.
In relation to:-
"Why is deploying software to a docker image easier than simply
deploying to a consistent production environment ?"
Most software is deployed to many environments, typically a minimum of three of the following:
Individual developer PC(s)
Shared developer environment
Individual tester PC(s)
Shared test environment
QA environment
UAT environment
Load / performance testing
Live staging
Production
Archive
There are also the following factors to consider:
Developers, and indeed testers, will all have either subtly or vastly different PC configurations, by the very nature of the job
Developers can often develop on PCs beyond the control of corporate or business standardisation rules (e.g. freelancers who develop on their own machines (often remotely) or contributors to open source projects who are not 'employed' or 'contracted' to configure their PCs a certain way)
Some environments will consist of a fixed number of multiple machines in a load balanced configuration
Many production environments will have cloud-based servers dynamically (or 'elastically') created and destroyed depending on traffic levels
As you can see the extrapolated total number of servers for an organisation is rarely in single figures, is very often in triple figures and can easily be significantly higher still.
This all means that creating consistent environments in the first place is hard enough just because of sheer volume (even in a green field scenario), but keeping them consistent is all but impossible given the high number of servers, addition of new servers (dynamically or manually), automatic updates from o/s vendors, anti-virus vendors, browser vendors and the like, manual software installs or configuration changes performed by developers or server technicians, etc. Let me repeat that - it's virtually (no pun intended) impossible to keep environments consistent (okay, for the purist, it can be done, but it involves a huge amount of time, effort and discipline, which is precisely why VMs and containers (e.g. Docker) were devised in the first place).
So think of your question more like this: "Given the extreme difficulty of keeping all environments consistent, is it easier to deploy software to a docker image, even when taking the learning curve into account?". I think you'll find the answer will invariably be "yes" - but there's only one way to find out: post this new question on Stack Overflow.
There are many answers that explain the differences in more detail, but here is my very brief explanation.
One important difference is that VMs use a separate kernel to run the OS. That's the reason they are heavy and take time to boot, consuming more system resources.
In Docker, the containers share the kernel with the host; hence it is lightweight and can start and stop quickly.
In virtualization, the resources are allocated at the beginning of setup, and hence the resources are not fully utilized when the virtual machine is idle, which is much of the time.
In Docker, the containers are not allocated a fixed amount of hardware resources and are free to use resources depending on their requirements, and hence Docker is highly scalable.
Docker uses a union file system and copy-on-write technology to reduce the space consumed by containers.
With a virtual machine, we have a server, we have a host operating system on that server, and then we have a hypervisor. And then running on top of that hypervisor, we have any number of guest operating systems with an application and its dependent binaries, and libraries on that server. It brings a whole guest operating system with it. It's quite heavyweight. Also there's a limit to how much you can actually put on each physical machine.
Docker containers, on the other hand, are slightly different. We have the server. We have the host operating system. But instead of a hypervisor, we have the Docker engine in this case. We're not bringing a whole guest operating system with us. We're bringing a very thin layer of the operating system, and the container can talk down into the host OS in order to get to the kernel functionality there. And that allows us to have a very lightweight container.
All it has in there is the application code and any binaries and libraries that it requires. And those binaries and libraries can actually be shared across different containers if you want them to be as well. And what this enables us to do, is a number of things. They have much faster startup time. You can't stand up a single VM in a few seconds like that. And equally, taking them down as quickly.. so we can scale up and down very quickly and we'll look at that later on.
Every container thinks that it’s running on its own copy of the operating system. It’s got its own file system, own registry, etc. which is a kind of a lie. It’s actually being virtualized.
Source: Kubernetes in Action.
I have used Docker extensively in production and staging environments. When you get used to it, you will find it very powerful for building multi-container, isolated environments.
Docker was developed based on LXC (Linux Containers) and works perfectly on many Linux distributions, especially Ubuntu.
Docker containers are isolated environments. You can see it when you issue the top command in a Docker container that has been created from a Docker image.
Besides that, they are very lightweight and flexible thanks to the Dockerfile configuration.
For example, you can create a Docker image by writing a Dockerfile that tells Docker to wget 'this', apt-get 'that', run 'some shell script', set environment variables, and so on.
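To show roughly what such a Dockerfile contains, the sketch below renders one from Python. `FROM`, `ENV`, `RUN` and `CMD` are real Dockerfile instructions; the base image tag, the URL, the variable and the script path are placeholders invented for this example.

```python
import json


def render_dockerfile(base_image, env, run_steps, command):
    """Render Dockerfile text from a base image, env vars, build steps and a start command."""
    lines = [f"FROM {base_image}"]
    lines += [f"ENV {key}={value}" for key, value in env.items()]
    lines += [f"RUN {step}" for step in run_steps]  # executed while the image is built
    lines.append("CMD " + json.dumps(command))      # executed when a container starts
    return "\n".join(lines) + "\n"


# All values below are placeholders, not taken from a real project.
dockerfile_text = render_dockerfile(
    base_image="ubuntu:22.04",
    env={"APP_ENV": "staging"},
    run_steps=[
        "apt-get update && apt-get install -y wget",
        "wget -O /opt/tool.tar.gz https://example.com/tool.tar.gz",
    ],
    command=["/opt/start.sh"],
)
print(dockerfile_text)
```

Note the split: `RUN` steps happen once at build time and are baked into the image layers, while `CMD` is what runs each time a container is started from that image.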
In microservices projects and architectures, Docker is a very viable asset. You can achieve scalability, resiliency, and elasticity with Docker, Docker Swarm, Kubernetes, and Docker Compose.
Another important aspect of Docker is Docker Hub and its community.
For example, I implemented an ecosystem for monitoring kafka using Prometheus, Grafana, Prometheus-JMX-Exporter, and Docker.
For doing that, I downloaded preconfigured Docker images for ZooKeeper, Kafka, Prometheus, Grafana, and jmx-collector, then mounted my own configuration for some of them using YAML files; for others, I changed some files and configuration inside the Docker container. That way I built a whole system for monitoring Kafka using multiple Docker containers on a single machine, with isolation, scalability, and resiliency, and this architecture can easily be moved onto multiple servers.
Besides the Docker Hub site there is another site called quay.io, where you can have your own Docker images dashboard and pull/push to/from it. You can even import Docker images from Docker Hub to quay and then run them from quay on your own machine.
Note: Learning Docker seems complex and hard at first, but once you get used to it, you cannot work without it.
I remember the first days of working with Docker, when I issued the wrong commands or mistakenly removed my containers and all of their data and configuration.
This is how Docker introduces itself:
Docker is the company driving the container movement and the only
container platform provider to address every application across the
hybrid cloud. Today’s businesses are under pressure to digitally
transform but are constrained by existing applications and
infrastructure while rationalizing an increasingly diverse portfolio
of clouds, datacenters and application architectures. Docker enables
true independence between applications and infrastructure and
developers and IT ops to unlock their potential and creates a model
for better collaboration and innovation.
So Docker is container based, meaning you have images and containers which can be run on your current machine. It does not include an operating system the way VMs do, but is more like a bundle of different working packages such as Java, Tomcat, etc.
If you understand containers, you get what Docker is and how it's different from VMs...
So, what's a container?
A container image is a lightweight, stand-alone, executable package of
a piece of software that includes everything needed to run it: code,
runtime, system tools, system libraries, settings. Available for both
Linux and Windows based apps, containerized software will always run
the same, regardless of the environment. Containers isolate software
from its surroundings, for example differences between development and
staging environments and help reduce conflicts between teams running
different software on the same infrastructure.
So as you see in the image below, each container has its own package, and the containers running on a single machine share that machine's operating system... They are secure and easy to ship...
There are a lot of nice technical answers here that clearly discuss the differences between VMs and containers as well as the origins of Docker.
For me the fundamental difference between VMs and Docker is how you manage the promotion of your application.
With VMs you promote your application and its dependencies from one VM to the next: DEV to UAT to PRD.
Often these VMs will have different patches and libraries.
It is not uncommon for multiple applications to share a VM. This requires managing configuration and dependencies for all the applications.
Backout requires undoing changes in the VM. Or restoring it if possible.
With Docker the idea is that you bundle up your application inside its own container along with the libraries it needs and then promote the whole container as a single unit.
Except for the kernel the patches and libraries are identical.
As a general rule there is only one application per container which simplifies configuration.
Backout consists of stopping and deleting the container.
So at the most fundamental level with VMs you promote the application and its dependencies as discrete components whereas with Docker you promote everything in one hit.
And yes there are issues with containers including managing them although tools like Kubernetes or Docker Swarm greatly simplify the task.
| Feature | Virtual Machine | (Docker) Containers |
|---|---|---|
| OS | Each VM contains an operating system | Each Docker container does not contain an operating system |
| H/W | Each VM contains a virtual copy of the hardware that the OS requires to run | There is no virtualization of H/W with containers |
| Weight | VMs are heavy (for the reasons cited above) | Containers are lightweight and, thus, fast |
| Required S/W | Virtualization is achieved using software called a hypervisor | Containerization is achieved using software called Docker |
| Core | Virtual machines provide virtual hardware (or hardware on which an operating system and other programs can be installed) | Docker containers don’t use any hardware virtualization. It helps to use container |
| Abstraction | Virtual machines provide hardware abstractions so you can run multiple operating systems | Containers provide OS abstractions so you can run multiple containers |
| Boot-Time | VMs take a long time (often minutes) to create and require significant resource overhead because they run a whole operating system in addition to the software you want to use | Containers take less time because programs running inside Docker containers interface directly with the host’s Linux kernel |
Containers isolate libraries and software packages from the system so that you can install different versions of the same software and libraries without conflict. They use minimal storage and RAM, with almost no overhead, using the same base OS kernel and available libraries with a small delta if possible. You can expose your hardware directly or indirectly to containers so that you can use acceleration such as a GPU for computations.
In practice you use Docker for premade containers. You install them and run them in one line. Installing tensorflow-gpu is as easy as docker run -it tensorflow-gpu. Although I could not find many premade containers for LXD (LXC containers), I find them easier to customize and more stable and performant.
Both containers and VMs can be used to distribute the load. But since containers have almost no overhead, container management software focuses on creating container clusters so that you can easily distribute them, and thus the load, across bare metal machines.
Real Life example:
Suppose that you need more than 50 types of computation environment and 50 types of services such as MySQL, web hosting, and cloud-based services (like Jenkins and object storage), and you have more than 50 different bare metal servers. Typically it's an academic environment with many faculties. You need to use resources efficiently and you need high availability. When one server goes down, users should not experience any problem.
To solve this, you basically install all types of containers on all servers and distribute the load across all the bare metal machines. As one type of container is needed more, it is possible to automatically spawn more of them on one or more bare metal machines, so that many different users can use different services and environments continuously and flexibly.
In that setup, suppose there are 100 students using the system at the same time. 95 of them are using the servers for rudimentary services such as checking GPAs, curricula, the library database, etc. But 5 of them are performing 5 different types of engineering simulations. You will see that 49 bare metal servers are fully dedicated to engineering simulation, each running 5 different types of computation containers trying to race each other but balanced at 20% hardware resource use apiece. When you add 2500 more students for rudimentary tasks, even they will use only 5% of all bare metal machines. The rest will be used for computations.
Thus the most important distinguishing features of containers that provide such flexibility are:
ready to deploy premade containers, almost no overhead, fast spawnability
with live-adjustable quotas
using .cpu_allowencess , .ram_allowances or directly cgroup.
Kubernetes does all of this for you. After fiddling with docker and lxd you may want to check it out.
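Whatever tool sets the quotas, on Linux they end up as cgroup settings. As a small illustration, assuming a host that uses cgroup v2 (where the interface files are named `cpu.max` and `memory.max`), the Python sketch below parses the contents of those two files. The function names are my own, and the sample strings stand in for what you would read from the files.

```python
def parse_cpu_max(text):
    """Parse cgroup v2 cpu.max ("<quota> <period>" or "max <period>").

    Returns the limit as a number of CPUs, or None if unlimited.
    """
    quota, period = text.split()
    if quota == "max":
        return None
    return int(quota) / int(period)


def parse_memory_max(text):
    """Parse cgroup v2 memory.max (a byte count or "max"); None means unlimited."""
    value = text.strip()
    return None if value == "max" else int(value)


print(parse_cpu_max("50000 100000"))  # 0.5
print(parse_cpu_max("max 100000"))    # None
print(parse_memory_max("536870912"))  # 536870912
```

Because these are ordinary files that can be rewritten while the processes keep running, the quotas are adjustable live, which is the property the list above relies on.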
In my opinion, it depends on the needs of your application. One reason to deploy to Docker is that Docker breaks the application into small parts according to function; this is effective because when one application or function has an error, it has no effect on the other applications. Using a full VM, in contrast, is slower and more complex to configure, but in some ways safer than Docker.
The Docker documentation (and self-description) makes a distinction between "virtual machines" and "containers". They tend to interpret and use terms in slightly uncommon ways. They can do that because it is up to them what they write in their documentation, and because the terminology for virtualization is not yet really exact.
The fact is that what the Docker documentation means by "containers" is in reality paravirtualization (sometimes called "OS-level virtualization"), as opposed to hardware virtualization, which Docker is not.
Docker is a low quality paravirtualisation solution. The container vs. VM distinction was invented by the Docker developers to explain away the serious disadvantages of their product.
The reason it became so popular is that they "gave fire to the ordinary people", i.e. it made it simple to use typically server (= Linux) environments and software products on Win10 workstations. This is also a reason for us to tolerate their little "nuance". But it does not mean that we should also believe it.
The situation is made yet more cloudy by the fact that Docker on Windows hosts used an embedded Linux in Hyper-V, and its containers ran in that. Thus, Docker on Windows uses a combined hardware virtualization and paravirtualization solution.
In short, Docker containers are low-quality (para)virtual machines with a huge advantage and a lot of disadvantages.

what is a container? and gVisor?

I am trying to understand what containers are and what their purpose is.
I am a little bit confused. When I started to read about them I saw that they rely on Linux namespaces (is it true?), a way to isolate the process within the container from the other processes on the machine, and I got the impression that their main purpose is security.
For instance, let's say that I own a server that runs multiple services. I don't want a single hacked service to be able to compromise the whole system. So I put each service inside a container, which makes the service unable to interfere with the other processes on the machine (for example, to kill them or to play with their memory), and in that way eliminate the risk.
But later I saw other purposes, like being able to ship the app easily, or something like that. So what is their main purpose? I also read that if their main purpose is security, they have a problem, because they run directly on the host kernel (again, is it true?), and an exploit like "Dirty COW" could get out of the container and corrupt the machine. So I ended up reading about gVisor, which from what I understood tries to secure containers, and in some cases succeeds. So what does gVisor do differently that makes it able to secure containers? Is gVisor a container itself, or just a runtime environment for containers?
Finally, I always see comparisons between containers and VMs, and I ask why? And when should I use each?
I don't know if anything that I wrote is correct, and I will be glad if you will point out my mistakes, and answer my questions. Yes, I know that there are a lot of them and I am sorry, but Thanks!
The answer below is not guaranteed to be concise. Anyone is welcome to point out my mistakes.
It might be a little bit vague because many people mix these concepts nowadays.
1. LXC
When I first got to know these concepts, container still meant LXC, a long-existing technique in Linux. IMHO, a container is a complete process that does not simulate a kernel. The difference between a container and a normal process is that a container gets an isolated view of the system via namespaces (with cgroups limiting its resources), as if it were in a new operating system. But in fact, containers still share the host kernel (you are right), so people do worry about security, especially when deploying in a public cloud (I don't see people using LXC directly on public cloud yet).
Despite the potential insecurity, the convenience and light weight (fast boot, small memory footprint) of containers seem to outweigh their drawbacks in most security-insensitive situations. Tools like Docker and Kubernetes make large-scale deployment and management more efficient.
2. Virtual Machine & Hardware-assisted virtualization
In contrast to containers, the concept of a virtual machine represents another category of isolated execution environment. Considering that most VMs leverage hardware-acceleration techniques like VT-x, I will assume you are talking about hardware-assisted virtualization. A virtual machine usually contains a full kernel inside it.
See this picture from Doug Chamberlain
The Intel VT-x technique provides 2 modes, root mode(privileged) and non-root mode(not privileged). Each mode has its own ring0-ring3 (e.g, non-root ring3, non-root ring0, root ring3, root ring 0). The whole virtual machine runs in non-root mode, and the hypervisor(VMM, e.g., kvm) runs in root-mode.
In the classic qemu+kvm setup, qemu runs in root ring3, and kvm runs in root ring0.
The strong isolation and the existence of a guest kernel make virtual machines more secure and compatible. But, of course, the price is performance and efficiency (slower boot, etc.).
3. Container-based Virtualization
People want the isolation of hardware-assisted virtualization, but don't want to give up the convenience of containers. Therefore, a hybrid solution comes quite naturally.
There are 2 typical solutions at present: Kata Containers and gVisor.
Kata Containers tries to slim down the whole virtual machine stack to make it more lightweight. However, there is still Linux inside it, and it is still a virtual machine, just a more lightweight one.
gVisor claims to be a secure container, but it still leverages hardware virtualization techniques (or ptrace if you don't want virtualization). There is a component called the Sentry, which runs both in non-root ring0 and root ring3. The Sentry does part of the guest kernel's job but is much smaller than Linux. If the Sentry cannot finish a request itself, it proxies the request down to the host kernel.
The reason why most people believe gvisor is somewhat more secure is that it achieves "defense in depth" -- more layers of indirection lead people to believe it is more secure. This is usually true, but again, is not a guarantee.

How does Docker share resources

I've been looking into Docker and I understand from this post that running multiple docker containers is meant to be fast because they share kernel level resources through the "LXC Host," however, I haven't found any documentation about how this relationship works that is specific to the docker configuration, and at what level are resources shared.
What's the involvement of the Docker image and the Docker container with shared resources and how are resources shared?
Edit:
When talking about "the kernel" where resources are shared, which kernel is this? Does it refer to the host O.S (the level at which the docker binary lives) or does it refer to the kernel of the image the container is based on? Won't containers based on different linux distributions need to run on different types of kernels?
Edit 2:
One final edit to make my question a little more clear, I'm curious as to whether or not docker really does not run the full O.S of the image as they suggest on this page under the "How is Docker different then a VM"
The following statement seems to contradict the diagram above, taken from here:
A container consists of an operating system, user-added files, and
meta-data. As we've seen, each container is built from an image.
Strictly speaking, Docker no longer has to use LXC, the user tools. It still uses the same underlying technologies through its in-house container library, libcontainer. Actually, Docker can use various system tools for the abstraction between process and kernel:
The kernel need not be different for different distributions - but you cannot run a non-linux OS. The kernel of the host and of the containers is the same but it supports a sort of context awareness to separate these from one another.
Each container does contain a separate OS in every way beyond the kernel. It has its own user-space applications / libraries and for all intents and purposes it behaves as though it has its own kernel.
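One way to see this split between userland and kernel for yourself is to compare the distribution name from `/etc/os-release` with the kernel release reported by `uname`. The Python sketch below does that; the helper names are mine, and it assumes the usual `KEY=value` os-release format.

```python
import platform


def parse_os_release(text):
    """Parse the KEY=value lines of an os-release file into a dict."""
    info = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        info[key] = value.strip().strip("\"'")
    return info


def describe_environment(os_release_path="/etc/os-release"):
    """Return (userland distribution name, kernel release) for the current process."""
    try:
        with open(os_release_path, encoding="utf-8") as handle:
            distro = parse_os_release(handle.read()).get("PRETTY_NAME", "unknown")
    except OSError:
        distro = "unknown"  # file missing, e.g. on a non-Linux system
    return distro, platform.release()


print(describe_environment())
```

Run on the host and then inside containers built from different distributions, the first element changes with the image while the second stays the host's kernel release.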
It's not so much a question of which resources are shared as which resources aren't shared. LXC works by setting up namespaces with restricted visibility -- into the process table, into the mount table, into network resources, etc -- but anything that isn't explicitly restricted and namespaced is shared.
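On Linux you can inspect this directly: every process exposes its namespaces as symlinks under `/proc/<pid>/ns`, and two processes share a namespace exactly when the link targets match. The Python sketch below is a minimal illustration (the function names are mine); it returns an empty result on systems without that directory.

```python
import os


def namespace_ids(pid="self", proc_root="/proc"):
    """Map namespace type -> identifier for a process, read from /proc/<pid>/ns (Linux only)."""
    ns_dir = os.path.join(proc_root, str(pid), "ns")
    ids = {}
    try:
        names = os.listdir(ns_dir)
    except OSError:
        return ids  # not Linux, or no permission to look at that process
    for name in sorted(names):
        try:
            ids[name] = os.readlink(os.path.join(ns_dir, name))
        except OSError:
            continue
    return ids


def shared_namespaces(a, b):
    """Return the namespace types whose identifiers are equal in both mappings."""
    return sorted(name for name in a if name in b and a[name] == b[name])


print(namespace_ids())
```

Comparing the output for a host process with that of a process inside a container (for example via `shared_namespaces`) shows which views are private to the container; everything the kernel does not namespace is common to both.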
This means, of course, that the backends for all these components are also shared -- you aren't needing to pretend to have a different set of page tables per guest, because you aren't pretending to run more than one kernel; it's all the same kernel, all the same memory allocation pools, all the same hardware devices doing bit-twiddling (vs all the overhead of emulating hardware for a VM, and having each guest separately twiddle its virtual devices); the same block caches; etc etc etc.
Frankly, the question is almost too broad to be answered, as the only real answer as to what is shared is "almost everything", and to how it's shared is "by not doing duplicate work in the first place" (as conventional VMs do by emulating hardware rather than sharing just one kernel interacting with the real hardware). This is also why kernel exploits are so dangerous in LXC-based systems -- it's all one kernel, so there's no nontrivial distinction between ring 0 in one container and ring 0 in another.

docker and product versions

I am working for a product company and we make a lot of releases of the product. In the current approach to testing multiple releases, we create a separate VM and install all the infrastructure software (DB, app server, etc.) on top of it. Later we deploy the application WARs on the respective VM. Recently, I came across Docker and it seems very helpful, so I started exploring it with the examples listed on the site. But I am not able to find out how Docker can be applied to build environments suitable for various releases.
Each product version will have db schema changes.
Each application WARs will have enhancements/defects etc.
Consider below example.
Every month, our company releases a new version of the software, and in order to support it and fix defects we create a VM per release. The application's overall size is 2 GB and the OS takes close to 5 GB (apart from disk space, it also takes up system resources as extra overhead). The VMs are required to restore any release and test any support issues reported against it. But looking at the additional infrastructure requirements, it seems to be a very costly affair.
Can docker have everything required to run an application inside a container/image?
Can docker pack an application which consists of multiple WARs/DB schemas and when started allocate appropriate port?
Will there be any space/memory/speed differences compared to VM and docker assuming above scenario?
Do you think docker is still appropriate solution or should we continue using VMs? Can someone share pointers on how I can achieve above requirements with docker?
tl;dr: Yes, docker can run most applications inside a container.
Docker starts a single main process inside each container. When using VMs or real servers, this one process is usually the init system, which starts all system services. With docker it is usually your app.
This difference will get you faster startup times for your app (not starting the whole operating system). The trade off is that, if you depend on system services (such as cron, sshd…) you will need to start them yourself. There are some base images that provide a more "VM-like" environment… check phusion's baseimage for instance. To start more than a single process, you can also use a process manager such as supervisord.
Going forward, the recommended (although not required) approach is to start one process in each container (one per application server, one per database server, and so on) and not use containers as VMs.
Docker has no problems allocating ports either. It even has an explicit Dockerfile instruction for declaring them: EXPOSE. Exposed ports can also be published on the docker host with the --publish argument of run, so you don't even need to know the IP assigned to the container.
Regarding used space, you will probably see important savings. Docker images are created by stacking filesystem layers… this means that the common layers are only stored once on the server. In your setup, you will likely only have one copy of the base operating system layer (with VMs, you have a copy on each VM).
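To put rough numbers on the space saving, here is a back-of-the-envelope sketch using the figures from the question (about 5 GB for the OS, about 2 GB for the application). It assumes every release image shares one base layer and that nothing in the application layers is shared between releases; the function names are made up for illustration. A container base image is often much smaller than a full VM OS install, so real savings may be larger.

```python
# Rough disk-usage comparison using the figures quoted in the question.
OS_GB = 5   # size of the base operating system (assumed shared as one layer)
APP_GB = 2  # size of the application for a single release


def vm_disk_gb(releases):
    """Each VM carries its own full copy of the OS plus the application."""
    return releases * (OS_GB + APP_GB)


def docker_disk_gb(releases):
    """One shared base layer, plus one application layer per release.

    Assumption: application layers are not shared between releases.
    """
    if releases == 0:
        return 0
    return OS_GB + releases * APP_GB


if __name__ == "__main__":
    for n in (1, 6, 12):
        print(f"{n:>2} releases: VMs ~{vm_disk_gb(n)} GB, "
              f"docker ~{docker_disk_gb(n)} GB")
```

With a single release there is no difference; the gap grows with every extra release because the base layer is only stored once.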
On memory you will probably see less significant savings (mostly from not starting all the operating system services). Speed is still a subject of research… A few things that are clear so far: for faster IO you will need to use docker volumes, and for network-heavy use cases you should use host networking. Check the IBM research paper "An Updated Performance Comparison of Virtual Machines and Linux Containers" for details, or a summary like InfoQ's.

What is the benefit of Docker container for a memcached instance?

One of the Docker examples is for a container with Memcached configured. I'm wondering why one would want this versus a VM configured with Memcached? I'm guessing that it would make no sense to have more than one memcached docker container running under the same host, and that the only real advantage is speed advantage of "spinning up" the memcached stack in a docker container vs Memcached via a VM. Is this correct?
Also, how does one set the memory to be used by memcached in the docker container? How would this work if there were two or more docker containers with Memcached under one host? (I'm assuming again that two or more would not make sense).
I'm wondering why one would want this versus a VM configured with Memcached?
Security: If someone breaks into memcached and trojans the filesystem, it doesn't matter much: the container's filesystem gets thrown away when you start a new memcached container.
Isolation: You can hard-limit each container to prevent it from using too much RAM.
Standardization: Currently, each app/database/cache/load balancer must record what to install, what to configure and what to run. There is no standard (and no lack of tools such as puppet, chef, etc.). But these tools are very complex, not really OS independent (despite their claims), and carry the same complexity from development to deployment.
With docker, everything is just a container started with run BLAH. If your app has 5 layers, you just have 5 containers to run, with a tiny bit of orchestration on top. Developers never need to "look into the container" unless they are developing at that layer.
Resources: You can spin up thousands of docker containers on an ordinary PC, but you would have trouble spinning up hundreds of VMs. The limit is both CPU and RAM. Docker containers are just processes in an "enhanced" chroot. A VM runs dozens of background processes (cron, log rotation, syslog, etc.), whereas a docker container adds no extra processes of its own.
I'm guessing that it would make no sense to have more than one memcached docker container running under the same host
It depends. There are cases where you want to split up your RAM into parcels instead of sharing it globally (e.g. imagine you want to devote 20% of your cache to caching users, 40% to caching files, etc.).
Also, most sharding schemes are hard to expand, so people often start with many 'virtual' shards, then expand onto physical boxes when needed. So you might start with your app knowing about 20 memcached instances (chosen based on object ID). At first, all 20 run on one physical server. Later you split them onto 2 servers (10/10), then onto 5 servers (4/4/4/4/4), and finally onto 20 physical servers (1 memcached each). Thus, you can scale your app 20x just by moving the memcached instances around, without changing your app.
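The virtual-shard idea can be sketched in a few lines. This is only an illustration under assumptions: the answer says the instance is "chosen based on object ID" without giving the rule, so the modulo mapping, the host names and the function names below are all hypothetical. The point is that the object-to-shard mapping never changes; only the shard-to-server placement does.

```python
NUM_SHARDS = 20  # fixed number of memcached instances the app knows about


def shard_for(object_id):
    """Map an object ID to a virtual shard (assumed rule: simple modulo)."""
    return object_id % NUM_SHARDS


def place_shards(servers):
    """Assign the shards to servers in contiguous, equal-sized blocks."""
    if NUM_SHARDS % len(servers) != 0:
        raise ValueError("server count must divide the number of shards")
    per_server = NUM_SHARDS // len(servers)
    return {shard: servers[shard // per_server] for shard in range(NUM_SHARDS)}


def server_for(object_id, placement):
    """Look up which physical server currently hosts an object's shard."""
    return placement[shard_for(object_id)]


if __name__ == "__main__":
    one_box = place_shards(["host-1"])
    five_boxes = place_shards([f"host-{i}" for i in range(1, 6)])
    oid = 12345
    # The shard stays the same; only the hosting server changes.
    print(shard_for(oid), server_for(oid, one_box), server_for(oid, five_boxes))
```

In practice the placement table would live in the application's configuration, so moving an instance to another box is a configuration change rather than a code change.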
the only real advantage is speed advantage of "spinning up" the memcached stack in a docker container vs Memcached via a VM. Is this correct?
No, that's just a slight side benefit. See above.
Also, how does one set the memory to be used by memcached in the docker container?
In the docker run command, just use -m.
How would this work if there were two or more docker containers with Memcached under one host? (I'm assuming again that two or more would not make sense).
Same way. If you didn't set a memory limit, it would be exactly like running 2 memcached processes on the host. (If one fills up the memory, both will get out of memory errors.)
There seem to be two questions here...
1 - The benefit is as you describe. You can sandbox the memcached instance (and its configuration) into separate containers, so you could run multiple instances on a given host. In addition, moving a memcached instance to another host is pretty trivial and, in the worst case, just requires an update to the application configuration.
2 - docker run -m <inbytes> <memcached-image> would limit the amount of memory a memcached container could consume. You can run as many of these as you want under a single host.
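As a sketch of what running several limited instances on one host could look like, the helper below builds a docker run command line per container. It only constructs the commands; it does not run docker. The assumptions are: the image is named memcached and accepts the memcached command line as its arguments, the container names and host ports are invented, and memcached's own -m option (cache size in megabytes) is set below Docker's -m container limit to leave headroom.

```python
import shlex


def memcached_run_cmd(name, host_port, container_limit_mb, cache_mb,
                      image="memcached"):
    """Build a `docker run` command for one memory-limited memcached."""
    if cache_mb >= container_limit_mb:
        raise ValueError("cache size should stay below the container limit")
    return [
        "docker", "run", "-d",
        "--name", name,
        "-m", f"{container_limit_mb}m",    # Docker: container memory limit
        "-p", f"{host_port}:11211",        # publish memcached's default port
        image,                             # assumed image name
        "memcached", "-m", str(cache_mb),  # memcached: cache size in MB
    ]


if __name__ == "__main__":
    # Two hypothetical caches on one host, each with its own RAM parcel.
    for args in (("cache-users", 11211, 256, 192),
                 ("cache-files", 11212, 512, 448)):
        print(" ".join(shlex.quote(a) for a in memcached_run_cmd(*args)))
```

Each container gets a different host port, so the application can address the two caches separately even though both run on the same machine.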
I might be missing something here, but memcached is only about memory usage, right? Docker containers are very efficient in disk space usage as well. You don't need a full OS for every instance as you do with VMs; containers share resources instead. There is an insightful explanation with pictures on the docker.io website.

Resources