Running Docker Compose on Azure Batch Shipyard

I was wondering if anyone had any idea how to run Compose on Batch Shipyard, or, short of directly using Compose, how to allow multiple containers to work in concert on the same node in a job. Is this possible?
To clarify: I have several containers that work together to parse and process a file. The main container uses the other services via network calls. A simplified example of the compose file that I want to replicate looks like this:
version: "3"
services:
primary:
image: "primaryimage"
depends_on:
- secondary
environment:
SECONDARY_URL: secondary:8888
secondary:
image: secondaryimage
Within the code running in primary there are calls to the URL given in SECONDARY_URL to perform some transformations on the data, and a response is returned.

Azure Batch (Shipyard) does not have out-of-the-box support for Docker Compose.
But if you are already familiar with Docker Compose, then it's not too hard to convert it to Shipyard configuration files.
If you want to run MPI/multi-instance tasks (a cluster of nodes cooperating on solving parts of a computation), then take a look at this.
However, Service Fabric does support Docker Compose. So if Docker Compose support is a strict requirement, you could combine your Azure Batch setup with calls to a Service Fabric cluster.
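For what it's worth, a Compose file can be deployed to a Service Fabric cluster with the SF CLI; a rough sketch follows (the deployment name is a placeholder, and the exact sfctl syntax should be checked against the Service Fabric docs):
# Deploy a docker-compose.yml to a Service Fabric cluster via the SF CLI
sfctl compose create --deployment-name mydeployment --file-path docker-compose.yml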

Here is a workaround to allow this to work. Note that as a workaround, it is not officially supported and may break at any time.
Create a Docker image that has Docker itself installed (more specifically, only the client portion is needed) along with Docker Compose.
For the tasks specification in jobs.yaml:
docker_image is the image you created above
Either use resource_files or input_data to ingress your compose file for the task
additional_docker_run_options should have an item:
-v /var/run/docker.sock:/var/run/docker.sock
command would be your docker-compose up command with whatever arguments you require
Ensure that your config.yaml specifies all of the Docker images required by your Docker compose file and the Docker image that contains Docker/docker-compose itself.
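To make the task definition above concrete, here is a rough sketch. The image name and storage URL are placeholders, and the exact field layout should be verified against the current Batch Shipyard jobs.yaml schema:
job_specifications:
  - id: compose-job
    tasks:
      - docker_image: myregistry/docker-cli-compose:latest    # the image built above (hypothetical name)
        resource_files:
          - file_path: docker-compose.yml
            blob_source: https://mystorage.blob.core.windows.net/config/docker-compose.yml
        additional_docker_run_options:
          - -v /var/run/docker.sock:/var/run/docker.sock
        command: docker-compose up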

Related

K8s: deployment patterns for node.js apps with dbs

Hi!
My problem concerns the deployment of node.js apps via k8s, architecture patterns, and connecting them with DBs.
I have the following node.js app services; some of them are scalable with multiple app instances (like gamma), others are standalone. All of them are built into a single Docker image with one Dockerfile and run from it:
alpha | beta | gamma1 | gamma2
I also have non-cloud DBs, like Elastic & Mongo, running from their own containers with .env: mongo | elastic
As of now, my docker-compose.yml looks like a typical node.js example app, but with a common volume and bridge network (except that I have more than one node.js app):
version: '3'
services:
  node:
    restart: always
    build: .
    ports:
      - 80:3000
    volumes:
      - ./:/code
  mongo:
    image: mongo
    ports:
      - 27017:27017
    volumes:
      - mongodb:/data/db
volumes:
  mongodb:
networks:
  test-network:
    driver: bridge
Current deployment:
All these things are running on a single heavy VPS (X CPU cores, Y RAM, Z SSD, everything loaded to about 70%) from a single docker-compose.yml file.
What I want to ask and achieve:
Since one VPS is already not enough, I'd like to start using k8s with Rancher. So the question is about correct deployment:
For example, I have N VPSs connected within one private network; each VPS is a worker joined into one cluster (with Rancher, of course; one of them is the master node), which gives me X cores, Y RAM, and other shared resources.
Do I need another, separate cluster (or a VPS machine in the private network that is not part of the cluster) with the DB running on it? Or could I deploy the DB in the same cluster? And what if each VPS (worker) in the cluster has only a 40GB volume, and the DB grows beyond that? Do the shared resources from the workers include the shared volume space?
Is it right to have one image from which I can start all my apps, or in the case of k8s should I have a separate Docker image for each service? So if I have 5 node.js apps within one mono-repo, should I have 5 separate Docker images rather than one common one?
I understand that my question can have a complex answer, so I will be glad to see not just an answer but links or anything else related to the problem. It's much easier to find or google something if you know what and how to ask.
A purist answer:
Each of your five services should have their own image, and their own database. It's okay for the databases to be in the same cluster so long as you have a way to back them up, run migrations, and do other database-y things. If your cloud provider offers managed versions of these databases then storing the data outside the cluster is fine too, and can help get around some of the disk-space issues you cite.
I tend to use Helm for actual deployment mechanics as a way to inject things like host names and other settings at deploy time. Each service would have its own Dockerfile, its own Helm chart, its own package.json, and so on. Your CI system would build and deploy each service separately.
A practical answer:
There's nothing technically wrong with running multiple containers off the same image doing different work. If you have a single repository and a single build system now, and you don't mind a change in one service causing all of them to redeploy, this approach will work fine.
Whatever build system your repository has now, if you go with this approach, I'd put a single Dockerfile in the repository root and probably have a single Helm chart to deploy it. In the Helm chart Deployment spec you can override the command to run with something like
# This fragment appears under containers: in a Deployment's Pod spec
# (this is Helm chart, Go text/template templated, YAML syntax)
- image: {{ .Values.repository }}/{{ .Values.image }}:{{ .Values.tag }}
  command: ["node", "service3/index.js"]
Kubernetes's terminology here is slightly off from Docker's, particularly if you use an entrypoint wrapper script. Kubernetes command: overrides a Dockerfile ENTRYPOINT, and Kubernetes args: overrides CMD.
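As a small illustration of that mapping (the image name and script paths are hypothetical): if the shared Dockerfile ends in ENTRYPOINT ["node"] and CMD ["service1/index.js"], the per-service override can be just args:
containers:
  - name: service3
    image: myrepo/mono-image:abc123           # hypothetical shared image
    args: ["service3/index.js"]               # overrides CMD, keeps the ENTRYPOINT
    # command: ["node", "service3/index.js"]  # would override the ENTRYPOINT instead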
In either case:
Many things in Kubernetes allocate infrastructure dynamically. For example, you can set up a horizontal pod autoscaler to set the replica count of a Deployment based on load, or a cluster autoscaler to set up more (cloud) instances to run Pods if needed. If you have a persistent volume provisioner then a Kubernetes PersistentVolumeClaim object can be backed by dynamically allocated storage (on AWS, for example, it creates an EBS volume), and you won't be limited to the storage space of a single node. You can often find prebuilt Helm charts for the databases; if not, use a StatefulSet to have Kubernetes create the PVCs for you.
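As a tiny example of the autoscaling piece (the Deployment name is a placeholder):
# Scale the gamma Deployment between 2 and 10 replicas based on average CPU load
kubectl autoscale deployment gamma --min=2 --max=10 --cpu-percent=70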
Make sure your CI system produces images with a unique tag, maybe based on a timestamp or source control commit ID. Don't use ...:latest or another fixed string: Kubernetes won't redeploy on update unless the text of the image: string changes.
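A minimal sketch of that tagging scheme, with a hypothetical registry and image name:
# Tag every build with the source-control commit so the image: string always changes
TAG=$(git rev-parse --short HEAD)
docker build -t myregistry.example.com/myapp:"$TAG" .
docker push myregistry.example.com/myapp:"$TAG"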
Multiple clusters is tricky in a lot of ways. In my day job we have separate clusters per environment (development, pre-production, production) but the application itself runs in a single cluster and there is no communication between clusters. If you can manage the storage then running the databases in the same cluster is fine.
Several Compose options don't translate well to Kubernetes. I'd especially recommend removing the volumes: that bind-mount your code into the container, and validating that your image runs correctly on its own, before you do anything Kubernetes-specific. If you're replacing the entire source tree in the image then you're not really running the image, and it'll be much easier to debug locally. In Kubernetes you also have almost no control over networks:, but they're not really needed in Compose either.
I can't answer the part of your question about the VPS machine setup, but I can make some suggestions about the image setup.
While you have asked this question about a Node app, it's applicable to more than just Node.
Regarding the Docker image and having a common image or separate ones: generally it's up to you and/or your company whether you use a common image or separate ones.
There are pros and cons to both methods:
You could "bake in" the code into the image, and have a different image per app, but if you run into any security vulnerabilities, you have to patch, rebuild, and redeploy all the images. If you had 5 apps all using the same library, but that library was not in the base image, then you would have to patch it 5 times, once in each image, rebuild the image and redeploy.
Or you could just use a single base image which includes the libraries needed, and mount the codebase in (for example as a configmap), and that base image would never need to change unless you had to patch something in the underlying operating system. The same vulnerability mentioned in the paragraph above, would only need to be patched in the base image, and the affected pods could be respun (no need to redeploy).
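A rough sketch of that ConfigMap approach (all names are hypothetical; note that ConfigMaps are capped at roughly 1 MiB, so this only suits small codebases):
# Pod spec fragment: mount code from a ConfigMap instead of baking it into the image
volumes:
  - name: code
    configMap:
      name: app-code          # e.g. created with: kubectl create configmap app-code --from-file=./src
containers:
  - name: app
    image: mybase/node-runtime:1.0     # shared base image containing only the runtime and libraries
    volumeMounts:
      - name: code
        mountPath: /code
    command: ["node", "/code/index.js"]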

How do I create a custom Docker image?

I have to deploy my application software, which is a Linux-based package (.bin file), on a VM instance. As per the system requirements, it needs a minimum of 8 vCPUs and 32GB RAM.
Now, I was wondering if it is possible to deploy this software over multiple containers that share the CPU and RAM load in a Kubernetes cluster, rather than installing the software on a single VM instance.
Is that possible?
Yes, it's possible to achieve that.
You can start using Docker Compose to build your custom Docker images and then build your applications quickly.
First, I'll show you my GitHub docker-compose repo. You can inspect the folders; they are separated by application or server, so one docker-compose.yml builds the app, and you only have to run docker-compose up -d
If you need to create a custom image with Docker, you should use this docker command: docker build -t <user_docker>/<image_name> <path_of_files>
<user_docker> = your Docker user
<image_name> = the image name that you choose
<path_of_files> = some local path; if you need to build in the current folder, use . (dot)
So, after that, you can upload this image to Docker Hub using the following commands.
You must log in with your credentials:
docker login
You can check your images using the following command:
docker images
Upload the image to the Docker Hub registry:
docker push <user_docker>/<image_name>
Once the image is uploaded, you can use it in different projects; make sure to keep the image lightweight and useful.
Second, I'll show a similar repo, but this one has a k8s configuration in the folder called k8s. This configuration was made for Google Cloud, but I think you can analyze it and learn how to get started on your new project.
The Nginx service was replaced by an ingress service (ingress-service.yml), and an HTTPS certificate was added (certificate.yml and issuer.yml files).
If you need to dockerize DBs, make sure the DB is lightweight; you need to create a persistent volume using a PersistentVolumeClaim (database-persistent-volume-claim.yml file), or, if you store larger data in it, you should use a dedicated DB server or a managed DB service in the cloud.
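For reference, a minimal PersistentVolumeClaim along the lines of that database-persistent-volume-claim.yml might look like this (name and size are placeholders):
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: database-persistent-volume-claim
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi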
I hope this information will be useful to you.
There are two ways to achieve what you want to do. The first one is to write a Dockerfile and build the image; more information about how to write a Dockerfile can be found here. Apart from that, you can create a container from a base image, install all the software and packages in it, and export it as an image. Then you can upload it to a Docker image repository like Docker Registry or Amazon ECR.
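If you take the second route, the flow is roughly as follows (container, image and registry names are placeholders):
# Start a container from a base image and install your software interactively
docker run -it --name builder ubuntu:20.04 /bin/bash
#   ...inside the container: install packages, copy files, then exit...

# Snapshot the container's filesystem as a new image
docker commit builder myuser/myimage:v1

# Push the image to a registry (Docker Hub shown; ECR needs its own login and tag format)
docker push myuser/myimage:v1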

Has anyone done Hyperledger Fabric multi-org on multiple hosts using a Docker Compose file?

Has anyone done a Hyperledger Fabric multi-org setup on multiple hosts using a Docker Compose file?
I just need to know the feasibility; if possible, please share reference material as well.
I have also tried the docker-compose commands as mentioned in this link: https://medium.com/@wahabjawed/hyperledger-fabric-on-multiple-hosts-a33b08ef24f
It is definitely feasible, and we have tried it out.
There are two ways you can achieve this:
1. Either run docker-compose up with the files split per org, or pass arguments as parameters, where each org has to be brought up separately
2. Use docker stack deploy with a single file, keeping the constraints as per schema version 3.5 of the Compose file
The rest of the process remains the same.
You need to use docker swarm init to initialise the cluster and, using the generated token, add the other VMs. Refer to the Docker Swarm guide for help.
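A minimal sketch of that swarm setup (the IP address is a placeholder):
# On the first host (becomes a manager)
docker swarm init --advertise-addr 10.0.0.1
# The command prints a `docker swarm join --token ...` line;
# run that printed command on each of the other VMs to join them to the cluster.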

Limit useable host resources in Docker compose without swarm

I simply want to limit the resources of some Docker containers in a docker-compose file. The reason is simple: there are multiple apps/services running on the host, so I want to avoid a single container being able to use e.g. all the memory, which would harm the other containers.
From the docs I learned that this can be done using resources. But this sits under deploy, so I have to write my docker-compose file like the following example:
php:
  image: php:7-fpm
  restart: always
  volumes:
    - ./www:/www
  deploy:
    resources:
      limits:
        memory: 512M
This gave me the warning:
WARNING: Some services (php) use the 'deploy' key, which will be ignored. Compose does not support deploy configuration - use docker stack deploy to deploy to a swarm.
And that seems to be true: docker stats confirms that the container is able to use all the RAM of the host.
The documentation says:
Specify configuration related to the deployment and running of services. This only takes effect when deploying to a swarm with docker stack deploy, and is ignored by docker-compose up and docker-compose run.
But I don't need clustering. It seems that there is no other way to limit resources using a docker-compose file. Why is it not possible to specify some kind of memory key like the start parameter of docker run does?
Example: docker run --memory=1g $imageName
This works perfectly for a single container. But I can't use this (at least not without violating a clean separation of concerns), since I need to use two different containers.
Edit: Temp workaround
I found out that I'm able to use mem_limit directly after downgrading from version 3 to version 2 (placing version: '2' at the top). But we're currently on version 3.1, so this is not a long-term solution. And the docs say that deploy.resources is the new replacement for v2 keys like mem_limit.
Someday version 2 will be deprecated. So resource management isn't possible any more with the latest versions, at least without having a swarm? That seems like a regression to me; I can't believe this...
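For reference, the v2 workaround described above looks roughly like this:
version: '2'
services:
  php:
    image: php:7-fpm
    restart: always
    volumes:
      - ./www:/www
    mem_limit: 512m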
Since many Docker Compose users have complained about this incompatibility between the v3 and v2 Compose formats, the team has developed compatibility mode.
You can retain the same deploy structure that you provided and it will not be ignored, simply by adding the --compatibility flag to the docker-compose command (docker-compose --compatibility up), as explained here. I tested this with version 3.5, verified with docker stats, and can confirm that it works.
You can run the Docker daemon in swarm mode on a single host. It will add extra, unneeded features like the etcd service discovery, but that's all behind the scenes.
The Docker documentation has a "note" about it here https://docs.docker.com/engine/swarm/swarm-tutorial/#three-networked-host-machines
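A minimal sketch of that approach (the stack name is a placeholder):
# Turn the single host into a one-node swarm, then deploy the Compose file as a stack
# so that the deploy.resources limits are applied
docker swarm init
docker stack deploy -c docker-compose.yml mystack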

Docker security concerns using unofficial images

How can I ensure that a Docker container will be secure, especially when using third-party containers or base images?
Is it correct that, when using a base image, it may start arbitrary services or mount arbitrary partitions of the host filesystem under the hood, and potentially send sensitive data to an attacker?
So if I use a third-party container whose Dockerfile claims the container is safe, should I traverse the whole linked list of base images (potentially very long) to ensure the container is actually safe and does what it intends to do?
How can I ensure the trustworthiness of a Docker container in a systematic and definite way?
Consider Docker images similar to android/iOS mobile apps. You are never quite sure if they are safe to run, but the probability of it being safe is higher when it's from an official source such as Google play or App Store.
More concretely, Docker images coming from Docker Hub go through security scans, the details of which are as yet undisclosed. So the chances of pulling a malicious image from Docker Hub are low.
However, one can never be paranoid enough when it comes to security. There are two ways to make sure all images coming from any source are secure:
Proactive security: do a security source-code review of each Dockerfile corresponding to the Docker image, including the base images, as you have already noted in your question.
Reactive security: run Docker Bench, open-sourced by Docker Inc., which runs as a privileged container looking for known malicious runtime activities by containers.
In summary, whenever possible use Docker images from Docker Hub. Perform security code reviews of Dockerfiles. Run Docker Bench or any other equivalent tool that can catch malicious activities performed by containers.
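Running Docker Bench is roughly as follows (check the repository README below for the current invocation):
# Clone and run Docker Bench on the host you want to audit
git clone https://github.com/docker/docker-bench-security.git
cd docker-bench-security
sudo sh docker-bench-security.sh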
References:
Docker security scanning formerly known as Project Nautilus: https://blog.docker.com/2016/05/docker-security-scanning/
Docker bench: https://github.com/docker/docker-bench-security
Best practices for Dockerfile: https://docs.docker.com/engine/userguide/eng-image/dockerfile_best-practices/
Docker images are self-contained, meaning that unless you run them in a container with host volumes mounted or with host network mode, they have no way of accessing the network or memory stack of your host.
For example if I run an image inside a container by using the command:
docker run -it --network=none ubuntu:16.04
This will start a container from the ubuntu:16.04 image with no mounts of the host's storage, and it will not share any network stack with the host. You can test this by running ifconfig inside the container and on your host and comparing the results.
Regarding checking what the image/base image does: the conclusion from the above is that it can do nothing harmful to your host (unless you mount your /important/directory_on_host into the container and, after starting, the container removes it).
You can check what an image/base image contains by inspecting its Dockerfile(s) or docker-compose .yml files.
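Two standard commands also help with that kind of inspection without running the image at all:
# Show the layer-by-layer build commands of an image
docker history --no-trunc ubuntu:16.04
# Show the configured entrypoint, cmd, volumes, env vars and exposed ports
docker inspect ubuntu:16.04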
