Simple Docker Concept - linux

I'm going through the getting started with Docker guide and understood most of the basics except for one concept.
I get how docker/whalesay takes up 247 MB: it needs to download a few layers, including an Ubuntu base image. But shouldn't hello-world be around the same size? It's a self-contained image that can be shipped anywhere.
When hello-world executes, there's still a Linux layer running it somewhere, and I also downloaded hello-world before docker/whalesay, so it couldn't have been using the Linux layer downloaded from docker/whalesay. What am I missing here?

It is not an Ubuntu instance. Check the Hub:
https://hub.docker.com/_/hello-world/
Here if you click on latest, you can see the dockerfile:
FROM scratch
COPY hello /
CMD ["/hello"]
The FROM line defines which image the new image is based on. scratch is an "empty" image, as described here: https://hub.docker.com/_/scratch/
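You can verify the result yourself (the exact size varies a little between versions of the image):
docker pull hello-world
docker history hello-world   # shows just the single COPY layer plus the CMD metadata
docker images hello-world    # the whole image is only a few kilobytes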

Looking into the Dockerfile clears things up - it's not using any base image (e.g. Ubuntu):
FROM scratch
COPY hello /
CMD ["/hello"]
The first directive FROM states the base image for the new image we intend to build. From the docs:
The FROM instruction sets the Base Image for subsequent instructions.
As such, a valid Dockerfile must have FROM as its first instruction.
The image can be any valid image – it is especially easy to start by
pulling an image from the Public Repositories. (Docker Hub)
And FROM scratch (meaning no base image at all, hence the tiny image size) is a special case - the term scratch is reserved. From the docs:
FROM scratch
This image is most useful in the context of building base images (such
as debian and busybox) or super minimal images (that contain only a
single binary and whatever it requires, such as hello-world).
Also:
As of Docker 1.5.0 (specifically, docker/docker#8827), FROM scratch is
a no-op in the Dockerfile, and will not create an extra layer in your
image (so a previously 2-layer image will be a 1-layer image instead).
EDIT 1 - OP's new comment to clarify it further:
To clarify, there's a very minimal Linux dist installed with Docker.
And this incredibly simple hello-world image uses that default Linux
dist that comes with Docker?
A good clarification by Paul Becotte:
No. Docker does not contain a kernel- it is not a virtual machine. It
is a way to run processes on your existing kernel in such a way as to
trick them into thinking they are completely isolated. The size of the
image is actually a "root file system" ... in this case, the file
system contains only a single file, which is why it is small. The
process actually gets executed on the kernel that is running the
Docker daemon (your Linux machine on which you installed Docker), with it chroot'ed to the container filesystem.
To clarify it further, here is an example of a minimal image, Alpine:
A minimal Docker image based on Alpine Linux with a complete package
index and only 5 MB in size!
P.S. In the case of hello-world there isn't any base image, not even a minimalistic one.
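And since the container shares your host's kernel rather than shipping one, you can check this yourself (assuming Docker is installed on a Linux host):
uname -r                          # kernel version on the host
docker run --rm alpine uname -r   # the same kernel version, reported from inside the container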

Related

Using Docker as a full OS?

Little intro:
I have two OSes on my PC: Linux and Windows. I need Linux for work, but it freezes on my PC while Windows does not. I've heard that this is a common issue with ASRock motherboards.
That's why I want to switch to Windows for work.
So my idea was to create a Docker image with everything I need for work, such as yarn, make, and a lot of other stuff, and run it on Windows to get the Linux functionality. You get the idea.
I know that Docker is designed to do only one thing per image, but I gave this a shot.
But there are constant problems. For example, right now I'm trying to install nvm in my image, but after building the image the command 'nvm' is not found in bash. It is a known problem, and running source ~/.profile makes the command available in the console, but running it while building the image doesn't affect the console you get when you run the image. So you need to do that manually every time you use the image.
People suggest putting this in .bashrc, which gives a segmentation error.
And that's just my problem for today; I've encountered many more, as I've been trying to create this image for a couple of days already.
So my question is basically this: is it possible to create a fully operational OS in one Docker image, or maybe connect multiple images to create an OS, or do I just need to stop this and use a virtual machine like a sensible person?
I would recommend using a virtual machine for your use case. Since you will be using this for work, modifying settings, and installing new software, these operations are better suited to a virtual machine, where it is expected that you change state and configuration.
In contrast, Docker containers are generally meant to be immutable: the running instance of the image should not be altered or reconfigured. This is so that others can pull down the image and it works "out of the box." Additionally, most Docker images available on Docker Hub are made to be lean, with only one or two use cases in mind and nothing extra (for security and image size), so I expect that you would frequently run into problems trying to set up a Docker image that you essentially work inside of. Lastly, since this is not done frequently, there is less help available online, and Docker-level virtualization does not really suit your situation.
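That said, if you do keep experimenting with Docker, the usual workaround for the nvm problem you describe is not to rely on ~/.profile at all, but to source nvm.sh explicitly during the build and at runtime. A rough sketch, assuming an Ubuntu base image, a pinned nvm release, and Node 20 (adjust all three to whatever you actually use):
FROM ubuntu:22.04
RUN apt-get update && apt-get install -y curl ca-certificates
ENV NVM_DIR=/root/.nvm
# the nvm release tag here is only an example; pin whichever one you actually use
RUN curl -o- https://raw.githubusercontent.com/nvm-sh/nvm/v0.39.7/install.sh | bash
# nvm is a shell function, so every RUN that needs it has to source nvm.sh first
RUN bash -c '. "$NVM_DIR/nvm.sh" && nvm install 20 && nvm alias default 20'
# source it again at runtime so nvm and node exist in the interactive shell
CMD ["bash", "-c", ". \"$NVM_DIR/nvm.sh\" && exec bash"]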

Why is a vendor/node_modules mapping in a volume considered a bad practice?

Could someone explain to me what is happening when you map (in a volume) your vendor or node_modules files?
I had some speed problems with my Docker environment and read that I don't need to map the vendor files there, so I excluded them in the docker-compose.yml file and things instantly got much faster.
So I wonder what is happening under the hood when you have the vendor files mapped in a volume, and what happens when you don't?
Could someone explain that? I think this information would be useful to more people than just me.
Docker does some complicated filesystem setup when you start a container. You have your image, which contains your application code; a container filesystem, which gets lost when the container exits; and volumes, which have persistent long-term storage outside the container. Volumes break down into two main flavors, bind mounts of specific host directories and named volumes managed by the Docker daemon.
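For reference, the two flavors look like this on the command line (the paths and names here are just placeholders):
docker run -v "$PWD/config:/etc/myapp:ro" myimage   # bind mount of a specific host directory
docker run -v appdata:/var/lib/myapp myimage        # named volume managed by the Docker daemon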
The standard design pattern is that an image is totally self-contained. Once I have an image I should be able to push it to a registry and run it on another machine unmodified.
git clone git@github.com:me/myapp
cd myapp
docker build -t me/myapp . # requires source code
docker push me/myapp
ssh me@othersystem
docker run me/myapp # source code is in the image
# I don't need GitHub credentials to get it
There are three big problems with using volumes to store your application code or your node_modules directory:
It breaks the "code goes in the image" pattern. In an actual production environment, you wouldn't want to push your image and also separately push the code; that defeats one of the big advantages of Docker. If you're hiding every last byte of code in the image during the development cycle, you're never actually running what you're shipping out.
Docker considers volumes to contain vital user data that it can't safely modify. That means that, if your node_modules tree is in a volume, and you add a package to your package.json file, Docker will keep using the old node_modules directory, because it can't modify the vital user data you've told it is there.
On macOS in particular, bind mounts are extremely slow, and if you mount a large application into a container it will just crawl.
I've generally found three good uses for volumes: storing actual user data across container executions; injecting configuration files at startup time; and reading out log files. Code and libraries are not good things to keep in volumes.
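As a concrete sketch of the "code and libraries go in the image" pattern for a Node app (the base image tag and file names are common conventions, not anything taken from the question):
FROM node:18-slim
WORKDIR /app
# install dependencies into the image itself, not into a volume
COPY package.json package-lock.json ./
RUN npm ci
# then copy the application code on top
COPY . ./
CMD ["node", "index.js"]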
For front-end applications in particular there doesn't seem to be much benefit to trying to run them in Docker. Since the actual application code runs in the browser, it can't directly access any Docker-hosted resources, and it makes no difference whether your dev server runs in Docker or not. The typical build chains involving tools like TypeScript and Webpack don't have additional host dependencies, so your Docker setup really just turns into a roundabout way to run Node against the source code that only exists on your host. The production path of building your application into static files and then serving them with a web server like nginx still works fine in Docker. I'd just run Node on the host to develop this sort of thing, and not have to think about questions like this one.
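For that production path, a typical multi-stage sketch builds the static files with Node and serves them with nginx; the "build" script name and the dist output directory are assumptions about your project:
FROM node:18 AS build
WORKDIR /app
COPY package.json package-lock.json ./
RUN npm ci
COPY . ./
RUN npm run build                 # assumes your project defines a "build" script
FROM nginx:alpine
# "dist" is whatever directory your build actually outputs
COPY --from=build /app/dist /usr/share/nginx/html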

How do Docker images and layers work?

Actually I am new to the Docker ecosystem, and I am trying to understand how exactly a container works on top of a base image. Does the base image get loaded into the container?
I have been through the Docker docs, where it's said that a read/write container layer is formed on top of the image layers. But what I am confused about is: the image is immutable, right? Then where is the image running? Is it inside the Docker engine in the VM, and how does the container actually come into play?
how exactly does a container work on a base image?
Does the base image get loaded into the container?
Docker containers wrap a piece of software in a complete filesystem that contains everything needed to run: code, runtime, system tools, system libraries – anything that can be installed on a server.
Like FreeBSD Jails and Solaris Zones, Linux containers are self-contained execution environments -- with their own, isolated CPU, memory, block I/O, and network resources (using the cgroups kernel feature) -- that share the kernel of the host operating system. The result is something that feels like a virtual machine, but sheds all the weight and startup overhead of a guest operating system.
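A quick way to see that isolation from the command line (standard docker run flags; the alpine image is just a convenient example):
docker run --rm --memory=256m --cpus=1 alpine ps
# the container sees only its own processes (its own PID namespace) and is capped by cgroup limits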
That being said, each distribution has its own official Docker image (library) that ships with minimal binaries, follows Docker's best practices, and is ready to build on.
What I am confused about is: the image is immutable, right? Where is the image running? Is it inside the Docker engine in the VM, and how does the container actually come into play?
Docker used to use AUFS; it still uses it on Debian, and it uses AUFS-like file systems such as OverlayFS on other distributions. AUFS provides layering. Each image consists of layers, and these layers are read-only. Each container has a read/write layer on top of its image layers. Read-only layers are shared between containers, so you get storage space savings. The container sees the union mount of all image layers plus its read/write layer.
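You can inspect this layering yourself (exact output depends on your storage driver; overlay2 is the usual default these days):
docker pull nginx
docker history nginx                          # one line per read-only image layer
docker run -d --name web nginx
docker inspect -f '{{ json .GraphDriver.Data }}' web
# with overlay2 this shows LowerDir (the shared image layers) and UpperDir (this container's read/write layer)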

Docker: How to create a stack, multiple images or one base image?

I am new to Docker, and I am unsure whether to use one base image for my whole stack or to define a separate image for each of my needs.
For example, reading a blog post about creating a website using Docker, the author suggests the following stack:
Image taken from http://project-webdev.blogspot.de/2015/05/create-site-based-on-docker-part4-docker-container-architecture.html
Now, seeing the structure: if we have base images in the Docker registry for technologies such as MongoDB, io.js, and nginx, why in this example do we not use those images instead of a single Docker base image for everything?
I'm the author of this blog post/series, so let me elaborate on the reason why I've chosen one base image. :)
Docker offers the possibility to use a common base image for subsequent images, so you can stack all images, and each image only contains the diff against the underlying image (that's the big advantage of Docker!). You save disk space and RAM. If you don't care about that (I mean, RAM and storage are cheap), you can also use multiple images.
Another advantage of one base image is that you can configure/secure that base image based on your needs. If you use a different base image for each container, you have to maintain all of them (e.g. the firewall), and Docker will download several different images (which uses more disk space, and container builds take longer).
But what's the difference when you look at the official images? The official MongoDB, Redis and MySQL images are based on the debian:wheezy image. So if you use these images, they will also be based on the same base image, won't they?
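You can check how much your local images actually share (the two pulls are just examples; any images built on a common base will do):
docker pull mongo
docker pull redis
docker system df -v   # the SHARED SIZE column shows how much of the images' layers is reused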
Anyway, if you want to use your own architecture, feel free... please consider this architecture/blog post as a possible idea for creating a Docker setup. Thanks to Docker and the way it shares the kernel, you can use as many images as you want. :)
Regards,
Sascha
I suppose it depends on the application, but in this case you are right: it does not make sense to have a large base image.
From my understanding, the author created a base image and then built a lot of other images from that base image. When you build an image for Docker, a lot of intermediate images are created, so if the base image is the same, it can be shared and potentially save some space on your machine.
But I don't think it makes any difference when you run the containers. Each container is independent, and they essentially only share the Linux kernel. And it's actually really bad, because the base image is huge in this case and it takes a lot of resources to spin up a container from an image like this. The dedicated images for each service are much smaller and use far fewer resources when you start containers from them. So yeah, whatever space you could save by building a large base image is not going to be worth it.

Docker Help: Creating Dockerfile and Image for Node.js App

I am new to Docker and followed the tutorials on Docker's website for installing boot2docker locally and building my own images for Node apps using their tutorial (https://docs.docker.com/examples/nodejs_web_app/). I was able to complete this successfully, but I have the following questions:
(1) Should I be using these Node Docker images (https://registry.hub.docker.com/_/node/) instead of CentOS 6 for the base of my Docker image? I am guessing the Docker tutorial is out of date?
(2) If I should be basing from the Node Docker images, does anyone have any thoughts on whether the slim or the regular official Node image is better to use? I would assume slim would be the best choice, but I am confused about why multiple versions exist.
(3) I don't want my Docker images to include my Node.js app source files directly and thus have to re-create my images on every commit. Instead, I want my Docker container, when started, to pull the source for a specific commit from my private Git repository. Is this possible? Could I use something like an entrypoint to specify my credentials and the commit when running the container, so it would then run a shell script to pull the code and start the Node app?
(4) I may end up running multiple different Docker containers on the same EC2 hosts. I imagine making sure the containers are all based on the same Linux distro would be preferred? That way I would avoid downloading multiple base images when first starting the instance and running the different containers?
Thanks!
It would have been best to ask 4 separate questions rather than put this all into one question. But:
1) Yes, use the Node image.
2) The "regular" image includes various development libraries that aren't in the slim image. Use the regular image if you need these libraries, otherwise use slim. More information on the libraries is here https://registry.hub.docker.com/_/buildpack-deps/
3) You would probably be better off putting the code into a data container that you add to the container with --volumes-from; there is a minimal sketch of that pattern after this answer. You can find more information on this technique here: https://docs.docker.com/userguide/dockervolumes/
4) I don't understand this question. Note that Amazon now has a container offering: https://aws.amazon.com/ecs/
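Regarding point 3, a minimal sketch of the data-container approach (the paths, names, and the server.js entry point are placeholders):
# a container whose only job is to own the /usr/src/app volume
docker create -v /usr/src/app --name app-code busybox /bin/true
# copy a checkout of your private repo into that volume
docker cp ./myapp/. app-code:/usr/src/app
# run the Node container against the shared volume
docker run --volumes-from app-code node:slim node /usr/src/app/server.js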
