I have a docker container in a hyperledger fabric setup. This stores all user credentials.
What happens if this container or machine goes down and is not available?
If I bring up a backup container, how can the entire state be restored?
I tried doing the commit option but on bringing it back up, it does not work as expected. More likely the CA functionality uses some container id to track since a CA server is highly secretive price of the setup.
Overall, this is more of a strategy question, there are many approaches to backing up critical data - and you may or may not choose one that is specific to Docker containers.
On the technical questions that you asked:
If the container 'goes down', its files remain intact and will be there when it is restarted (that is, if you re-start the same container and don't create a new one). If the machine goes down, the container will come 'back up' if and when the machine is restarted. Depending on how you created the container, you may need to start it yourself or Docker may restart it automatically. If it went down hard and won't come back - you lose all data on it, including files in containers.
You can create a 'backup container' (or more precisely, a backup image), but if it was left on the same machine it will die with that machine. You will need to save it elsewhere (e.g., with 'docker push', though I don't recommend that unless you have your own docker registry to use for backups).
If you do 'commit', this simply creates a new container image, which has the files as they were when you did the commit. You should commit a stopped container, if you want a proper copy of all files - I don't think you can do it while there are active open files. This copy lives on the same machine where the container was, so you still need to save it away from that machine to protect it from loss. Note that to use the saved image, you should tag it and use it to start a new container. The image from which you started the old container is untouched by the 'commit' (using that old image will start the container as it was then, when you first created it).
IMO, an option better than 'commit' (which saves the entire container file system, along with all the junk like logs and temp. files) is to mount a docker volume to the path where important files are stored (e.g., /var/lib/mysql, if you run a mysql database) - and back up only that volume.
Related
Could someone explain me what is happening when you map (in a volume) your vendor or node_module files?
I had some speed problems of docker environment and red that I don't need to map vendor files there, so I excluded it in docker-compose.yml file and the speed was much faster instantly.
So I wonder what is happening under the hood if you have vendor files mapped in your volume and what's happening when you don't?
Could someone explain that? I think this information would be useful to more than only me.
Docker does some complicated filesystem setup when you start a container. You have your image, which contains your application code; a container filesystem, which gets lost when the container exits; and volumes, which have persistent long-term storage outside the container. Volumes break down into two main flavors, bind mounts of specific host directories and named volumes managed by the Docker daemon.
The standard design pattern is that an image is totally self-contained. Once I have an image I should be able to push it to a registry and run it on another machine unmodified.
git clone git#github.com:me/myapp
cd myapp
docker build -t me/myapp . # requires source code
docker push me/myapp
ssh me#othersystem
docker run me/myapp # source code is in the image
# I don't need GitHub credentials to get it
There's three big problems with using volumes to store your application or your node_modules directory:
It breaks the "code goes in the image" pattern. In an actual production environment, you wouldn't want to push your image and also separately push the code; that defeats one of the big advantages of Docker. If you're hiding every last byte of code in the image during the development cycle, you're never actually running what you're shipping out.
Docker considers volumes to contain vital user data that it can't safely modify. That means that, if your node_modules tree is in a volume, and you add a package to your package.json file, Docker will keep using the old node_modules directory, because it can't modify the vital user data you've told it is there.
On MacOS in particular, bind mounts are extremely slow, and if you mount a large application into a container it will just crawl.
I've generally found three good uses for volumes: storing actual user data across container executions; injecting configuration files at startup time; and reading out log files. Code and libraries are not good things to keep in volumes.
For front-end applications in particular there doesn't seem to be much benefit to trying to run them in Docker. Since the actual application code runs in the browser, it can't directly access any Docker-hosted resources, and there's no difference if your dev server runs in Docker or not. The typical build chains involving tools like Typescript and Webpack don't have additional host dependencies, so your Docker setup really just turns into a roundabout way to run Node against the source code that's only on your host. The production path of building your application into static files and then using a Web server like nginx to serve them is still right in Docker. I'd just run Node on the host to develop this sort of thing, and not have to think about questions like this one.
We have On-premises software docker image.Also, We have licensing for application security and code duplication.
But to add extra security is it possible to do any of the below ?
Can we lock docker image such that no one can copy or save running container and start new docker container in another environment.
or is it possible to change something in docker image while build that may prevent user to login inside container.
Goal is to secure docker images as much as possible in terms of duplication of the docker images and stop login inside running container to see the configuration.
No. Docker images are a well known format with an open specification that is essentially a set of tar files and some json metadata. Once someone has this image, they can do with it what they want. This includes running it with any options they'd like, coping it, and extending it with their own changes.
I'm trying to figure out if best practices would dictate that when deploying a new version of my web app (nodejs running in its own container) I should:
Do a git pull from inside the container and update "in place"; or
Create a new container with the new code and perform a hot swap of the two docker containers
I may be missing some technical details as I'm very new to the idea of containers.
The second approach is the best practice: you would make a second version of your image (with the new code), stop your container, and run a second container based on that second version.
The idea is that you can easily roll-back as the first version of your image can be used to run the container that was initially in production at any time.
Trying to modify a running container is not a good idea as, once it is stopped and removed, running it again would be from the original image, with its original state. Unless you commit that container to a new image, those changes would be lost. And even if you did commit, you would not be able to easily rebuild that image. (plus you would commit the all container: its new code, but also a bunch of additional files created during the execution of the server: logs and other files: not very clean)
A container is supposed to be run from an image that you can precisely build from the specifications of a Dockerfile. It is not supposed to be modified at runtime.
Couple of caveat though:
if your container is used (--link) by other containers, you would beed to stop those first, stop your container and run a new one from a new version of the image, then restart your other containers.
don't forget to remount any data containers that you were using in order to get your persistent data.
For some scenarios a clustered file system is just too much. This is, if I got it right, the use case for the data volume container pattern. But even CoreOS needs updates from time to time. If I'd still like to minimise the down time of applications, I'd have to move the data volume container with the app container to an other host, while the old host is being updated.
Are there best practices existing? A solution mentioned more often is the "backup" of a container with docker export on the old host and docker import on the new host. But this would include scp-ing of tar-files to an other host. Can this be managed with fleet?
#brejoc, I wouldn't call this a solution, but it may help:
Alternative
1: Use another OS, which does have clustering, or at least - doesn't prevent it. I am now experimenting with CentOS.
2: I've created a couple of tools that help in some use cases. First tool, retrieves data from S3 (usually artifacts), and is uni-directional. Second tool, which I call 'backup volume container', has a lot of potential in it, but requires some feedback. It provides a 2-way backup/restore for data, from/to many persistent data stores including S3 (but also Dropbox, which is cool). As it is implemented now, when you run it for the first time, it would restore to the container. From that point on, it would monitor the relevant folder in the container for changes, and upon changes (and after a quiet period), it would back up to the persistent store.
Backup volume container: https://registry.hub.docker.com/u/yaronr/backup-volume-container/
File sync from S3: https://registry.hub.docker.com/u/yaronr/awscli/
(docker run yaronr/awscli aws s3 etc etc - read aws docs)
I'm planning to set up a jenkins-based CD workflow with Docker at the end.
My idea is to automatically build (by Jenkins) a docker image for every green build, then deploy that image either by jenkins or by 'hand' (I'm not yet sure whether I want to automatically run each green build).
Getting to the point of having a new image built is easy. My question is about the deployment itself. What's the best practice to 'reload' or 'restart' a running docker container? Suppose the image changed for the container, how do I gracefully reload it while having a service running inside? Do I need to do the traditional dance with multiple running containers and load balancing or is there a 'dockery' way?
Suppose the image changed for the container, how do I gracefully reload it while having a service running inside?
You don't want this.
Docker is a simple system for managing apps and their dependencies. It's simple and robust because ALL dependencies of an application are bundled with it. If your app runs today on your laptop, it will run tomorrow on your server. This is because we have captured 100% of the "inputs" for your application.
As soon as you introduce concepts like "upgrade" and "restart", your application can (accidentally) store state internally. That means it might behave differently tomorrow than it does today (after being restarted and upgraded 100 times).
It's better use a load balancer (or similar) to transition between your versions than to try and muck with the philosophy of Docker.
The Docker machine itself should always be immutable as you have to replace it for a new deployment. Storing state inside the Docker container will not work when you want to ship new releases often that you've built on your CI.
Docker supports Volumes which will let you write files that are permanent into some folder on the host. When you then upgrade the Docker container you use the same volume so you've got access to the same files written by the old container:
https://docs.docker.com/userguide/dockervolumes/