Cannot setup multi-host Docker overlay network with etcd - linux

I am trying to connect two Docker hosts with an overlay network and am using etcd as a KV-store. etcd is running directly on the first host (not in a container). I finally managed to connect the Docker daemon of the first host to etcd but cannot manage to establish a connection the Docker daemon on the second host.
I downloaded etcd from the Github releases page and followed the instructions under the "Linux" section.
After starting etcd, it is listening to the following ports:
etcdmain: listening for peers on http://localhost:2380
etcdmain: listening for peers on http://localhost:7001
etcdmain: listening for client requests on http://localhost:2379
etcdmain: listening for client requests on http://localhost:4001
And I started the Docker daemon on the first host (on which etcd is running as well) like this:
docker daemon --cluster-advertise eth1:2379 --cluster-store etcd://127.0.0.1:2379
After that, I could also create an overlay network with:
docker network create -d overlay <network name>
But I can't figure out how to start the daemon on the second host. No matter which values I tried for --cluster-advertise and --cluster-store, I keep getting the following error message:
discovery error: client: etcd cluster is unavailable or misconfigured
Both my hosts are using the eth1 interface. The IP of host1 is 10.10.10.10 and the IP of host2 is 10.10.10.20. I already ran iperf to make sure they can connect to each other.
Any ideas?

So I finally figured out how to connect the two hosts and to be honest, I don't understand why it took me so long to solve the problem. But in case other people run into the same problem I will post my solution here. As mentioned earlier, I downloaded etcd from the Github release page and extracted the tar file.
I followed the instructions from the etcd documentation and applied it to my situation. Instead of running etcd with all the options directly from the command line I created a simple bash script. This makes it a lot easier to adjust the options and rerun the command. Once you figured out the right options it would be handy to place them separately in a config file and run etcd as a service as explaind in this tutorial. So here is my bash script:
#!/bin/bash
./etcd --name infra0 \
--initial-advertise-peer-urls http://10.10.10.10:2380 \
--listen-peer-urls http://10.10.10.10:2380 \
--listen-client-urls http://10.10.10.10:2379,http://127.0.0.1:2379 \
--advertise-client-urls http://10.10.10.10:2379 \
--initial-cluster-token etcd-cluster-1 \
--initial-cluster infra0=http://10.10.10.10:2380,infra1=http://10.10.10.20:2380 \
--initial-cluster-state new
I placed this file in the etcd-vX.X.X-linux-amd64 directory (that I just downloaded and extracted) which also contains the etcd binary. On the second host I did the same thing but changed the --name from infra0 to infra1 and adjusted the IP to that one the second host (10.10.10.20). The --initial-cluster option is not modified.
Then I executed the script on host1 first and then on host2. I'm not sure if the order matters, but in my case I got an error message when I did it the other way round.
To make sure your cluster is set up correctly you can run:
./etcdctl cluster-health
If the output looks similar to this (listing the two members) it should work.
member 357e60d488ae5ab3 is healthy: got healthy result from http://10.10.10.10:2379
member 590f234979b9a5ee is healthy: got healthy result from http://10.10.10.20:2379
If you want to be really sure, add a value to your store on host1 and retrieve it on host2:
host1$ ./etcdctl set myKey myValue
host2$ ./etcdctl get myKey
Setting up docker overlay network
In order to set up a docker overlay network I had to restart the Docker daemon with the --cluster-store and --cluster-advertise options. My solution is probably not the cleanest one but it works. So on both hosts first stopped the docker service and then restarted the daemon with the options:
sudo service docker stop
sudo /usr/bin/docker daemon --cluster-store=etcd://10.10.10.10:2379 --cluster-advertise=10.10.10.10:2379
Note that on host2 the IP addresses need to be adjusted. Then I created the overlay network like this on one of the hosts:
sudo docker network create -d overlay <network name>
If everything worked correctly, the overlay network can now be seen on the other host. Check with this command:
sudo docker network ls

Related

Docker Container Attached to Network, Network Inspect Shows No Containers

I am simply trying to connect a ROS2 node from my Ubuntu 22.04 VM on my laptop to another ROS2 node on another machine running Ubuntu 18.04. Ideally, I would only have Docker on the second machine (the first machine runs a trivial node that will never change), but I have been trying using a separate container on each.
Here is what I am doing and what I am seeing when I inspect:
(ssh into machine 2 from VM 1.)
A: start up network from machine 2.
sudo docker network create -d overlay --attachable my-attachable-ovrlay
B: start up container 1.
sudo docker run -it --rm test1
C: successfully attach container 1 to the network.
sudo docker network connect dwgyau64pvpenxoj2edu4liqu bold_murdock
D: Confirm the container lists network.
sudo docker inspect -f '{{range $key, $value := .NetworkSettings.Networks}}{{$key}} {{end}}' bold_murdock
prints:
bridge my-attachable-ovrlay
E: Check the network to see container.
sudo docker network inspect my-attachable-ovrlay
prints (among other things):
"Containers": null,
I am new to Docker AND networking, so I could be missing something huge, but I have tried all of the standard suggestions I found online including disabling my firewall, opening a ton of ports using ufw allow on both machines, making sure nodes are active, etc etc etc etc etc.
I tried joining the network from machine 2 and that works and the container is displayed when using network inspect. But when I do that, then machine 1 simply refuses to connect to network.
F: In this situation it gives an error.
sudo docker network connect dwgyau64pvpenxoj2edu4liqu objective_mendel
prints:
Error response from daemon: attaching to network failed, make sure your network options are correct and check manager logs: context deadline exceeded
Also, before trying any docker networking, I have tried plainly pinging from VM1 to machine 2 and that works, both ways. I have tried to use netcat to open an old-timey chat window on port 1234 (random port as per this resource) and that works one way only. I can communicate both ways, but only when machine 1 sends the initial netcat request and machine 2 listens. When machine 2 sends request and 1 listens, nothing happens.
I have been struggling to get this to work for 3 weeks now. I know it’s something stupid, I just know it. Any advice would be incredibly appreciated. Please explain like I know nothing about networking, because I just about do.
EDIT: I converted images (still hyperlinked) into code blocks.
If both PCs are on the same LAN, you could skip the whole network configuration entirely and use ROS2 auto-discovery.
E.g.
PC1:
docker run -it --rm --net=host -v /dev/shm:/dev/shm osrf/ros:foxy-desktop
export ROS_DOMAIN_ID=1
ros2 run demo_nodes_py talker
PC2:
docker run -it --rm --net=host -v /dev/shm:/dev/shm osrf/ros:foxy-desktop
export ROS_DOMAIN_ID=1
ros2 run demo_nodes_py listener
If the PCs are not on the same network, I usually use ZeroTier to create a virtual LAN between PC1, PC2, and PC(N), then repeat the above example.
The issue was that the router was set to 1969. When we updated the time by connecting to the internet for 15 seconds, then disconnected, it started working.

Not exposed ports in docker swarm mode

I'm trying to set up very simple service using docker swarm and I have a problem with exposing ports.
I have two machines, lets name them xxx and yyy. When I do simple
docker run -d -p 9200:9200 -p 9300:9300 elasticsearch:7.4.0
both of them works correctly, I can go to xxx:9200 to have instance of Elasticsearch
I tried to do the same with swarm mode so, on the xxx machine I did:
docker swarm init --advertise-addr [external IP of xxx machine]
I got correct token and I successfully joined yyy machine to the swarm.
Then I created new overlay network using
docker network create -d overlay dockerdemo
and created service in this swarm using
docker service create --name swarmelasticsearch --network dockerdemo --replicas 2 -p 9200:9200 -p 9300:9300 elasticsearch:7.4.0
Service is being created successfully, both of machines have running containers with Elasticsearch, but I cannot go to them from outside. When I go to the xxx:9200 or yyy:9200 or port-of-xxx:9200 nothing happens. I cannot reach my site. Why? Do I need to do anything more? Both of my machines are on Azure VM with Ubuntu + latest docker.

connecting to services on docker host from docker container

Apologies for asking two unrelated questions.
what is the best way of accessing the host machine of the docker container (i.e. I am trying to access a kafka instance running on the host, from my docker container so that I can publish some messages)
when I run docker run ..... on an image which I've modified that may have an issue/syntax error, it will naturally not start - is there a log file anywhere that I would be able to take a look at to debug the issue. (this question is somewhat related to the 1st question, since I did what was suggested on another post, but the image is still not starting)
This is an ongoing discussion on what to use and what not, I don't really know what is best. Using the docker run --net="host" is pretty easy but can be dangerous. See From inside of a Docker container, how do I connect to the localhost of the machine?.
Use docker logs containerid or lookup the raw data in /var/lib/docker/containers/containerid/ for Ubuntu.
You should have no problem connecting to the host using the local lan interface ip address. Suppose you have a host with ip 192.168.0.1:
docker run --rm -ti ubuntu bash
ping 192.168.0.1
should give you a response.
You can use docker logs to see the standard output of your container.

How to reach docker containers by name instead of IP address?

Is there a way I can reach my docker containers using names instead of ip addresses?
I've heard of pipework and I've seen some dns and hostname type options for docker, but I still am unable to piece everything together.
Thank you for your time.
I'm not sure if this is helpful, but this is what I've done so far:
installed docker container host using docker-machine and the vmwarevsphere driver
started up all the services with docker-compose
I can reach all of the services from any other machine on the network using IP and port
I've added a DNS alias entry to my private network DNS server and it matches the machine name that's used by docker-machine. But the machine always picks up a different IP address when it boots and connects to the network.
I'm just lost as to where to tackle this:
network DNS server
docker-machine hostname
docker container hostname
probably some combination of all of them
I'm probably looking for something similar to this question:
How can let docker use my network router to assign dhcp ip to containers easily instead of pipework?
Any general direction will be awesome...thanks again!
Docker 1.10 has a built in DNS. If your containers are connected to the same user defined network (create a network docker network create my-network and run your container with --net my-network) they can reference each other using the container name. (Docs).
Cool!
One caveat if you are using Docker compose you know that it adds a prefix to your container names, i.e. <project name>_<service name>-#. This makes your container names somewhat more difficult to control, but it might be ok for your use case. You can override the docker compose naming functionality by manually setting the container name in your compose template, but then you wont be able to scale with compose.
Create a new bridge network other than docker0, run your containers inside it and you can reference the containers inside that network by their names.
Docker daemon runs an embedded DNS server to provide automatic service
discovery for containers connected to user-defined networks. Name
resolution requests from the containers are handled first by the
embedded DNS server.
Try this:
docker network create <network name>
docker run --net <network name> --name test busybox nc -l 0.0.0.0:7000
docker run --net <network name> busybox ping test
First, we create a new network. Then, we run a busybox container named test listening on port 7000 (just to keep it running). Finally, we ping the test container by its name and it should work.
EDIT 2018-02-17: Docker may eventually remove the links key from docker-compose, therefore they suggest to use user-defined networks as stated here => https://docs.docker.com/compose/compose-file/#links
Assuming you want to reach the mysql container from the web container of your docker-compose.yml file, such as:
web:
build: .
links:
- mysql
mysqlservice:
image: mysql
You'll be pleased to know that Docker Compose already adds a mysqlservice domain name (in the web container /etc/hosts) which point to the mysql container.
Instead of looking for the mysql container IP address, you can just use the mysqlservice domain name.
If you want to add custom domain names, it's also possible with the extra_hosts parameter.
You might want to try out dnsdock. Looks straight forward and easy(!) to set up. Have a look at http://blog.brunopaz.net/easy-discover-your-docker-containers-with-dnsdock/ and https://github.com/tonistiigi/dnsdock .
If you want out of the box solution, you might want to check for example Kontena. It comes with network overlay technology from Weave and this technology is used to create virtual private LAN networks between services. Thanks to that every service/container can be reached by service_name.kontena.local.
I changed the --net parameter with --network parameter and it runs as expected:
docker network create <network name>
docker run --network <network name> --name <container name> <other container options>
docker run --network <network name> --name <container name> <other container options>
If you are using Docker Compose, and your docker-compose.yml file has a top-level services: block (you are not using the obsolete "version 1" file format), then Compose does all of the required setup automatically. The names underneath services: can be directly used as host names.
version: '3.8'
services:
database: # <-- "database" is a usable hostname
image: postgres
application: # <-- "application" is a usable hostname
build: .
environment:
PGHOST: database # <-- use the "database" hostname
Networking in Compose in the Docker documentation describes this setup further.
These host names only work for connections between containers, in the same Compose file. If you manually declare networks: then the two containers must have some network in common, but the easiest setup is to just not declare networks: at all. These connections will only use the "standard" port (for PostgreSQL, for example, always connect to port 5432); a ports: declaration is not required and is ignored if present.

Restarting named container assigns different IP

Im a trying to deploy my application using Docker and came across an issue that restarting named containers assigns a different IP to container. Maybe explaining what I am doing will better explain the issue:
Postgres runs inside a separate container named "postgres"
$ PG_ID=$(docker run --name postgres postgres/image)
My webapp container links to postgres container
$ APP_ID=$(docker run --link postgres:postgres webapp/image)
Linking postgres container image to webapp container inserts in webapp container a hosts file entry with the IP of the postgres container. This allows me to point to postgres db within my webapp using postgres:5432 (I am using Django btw). This all works well except if for some reason postgres crashes.
Before I manually stop postgres process to simulate postgres process crashing I verify IP of postgres container:
$ docker inspect --format "{{.NetworkSettings.IPAddress}}" $PG_ID
172.17.0.73
Now to simulate crash I stop postgres container:
$ docker stop $PG_ID
If now I restart postgres by using
$ docker start $PG_ID
the ip of the container changes:
$ docker inspect --format "{{.NetworkSettings.IPAddress}}" $PG_ID
172.17.0.74
Therefore the IP which points to postgres container in webapp container is no longer correct. I though that by naming container docker assigns a name to it with specific configs so that you can reliably link between containers (both network and volumes). If the IP changes this seems to defeat the purpose.
If I have to restart my webapp process each time I postgres restarts, this does not seem any better than just using a single container to run both processes. Then I can use supervisor or something similar to keep both of them running and use localhost to link between processes.
I am still new to Docker so am I doing something wrong or is this a bug in docker?
2nd UPDATE: maybe you already discovered this, but as workaround, I plan to map the service to share the database to the host interface (ej: with -p 5432:5432), and connect the webapps to the host IP (the IP of the docker0 interface: in my Ubuntu and CentOS, the IP is 172.17.42.1). If you restart the postgres container, the conteiner's IP will change, but I wil be accesible using 172.17.42.1:5432. The downside is that you are exposing that port to all the containers, and loose the fine-grained mapping that --link gives you.
--- OLD UPDATES:
CORRECTION: Docker will map 'postgres' to the container's IP in the /etc/hosts files, on the webapp container. So, in the webapp container, you can ping 'postgres', and it will be mapped to the IP.
1st UPDATE: I've seen that Docker generates and mounts /etc/hosts, /etc/resolv.conf, etc. to have always the correct information, but this does not apply when the linked container is restarted. So, I've assumed (wrongly) that Docker would update the hosts files.
-- ORIGINAL (wrong) response:
Add --hostname=postgres-db (you can use anythin, I'm using something different than 'postgres' to avoid confussion with the container name):
$ docker run --name postgres --hostname postgres-db postgres/image
Docker will map 'postgres-db' to the container's IP (check the contents of /etc/hosts on the webapp container).
This will allow you run 'ping postgres-db' from the webapp container. If the IP changes, Dockers will update /etc/hosts for you.
In the Django app, use 'postgres-db' instead of the IP (or whatever you use for --hostname of the container with PostgreSql).
Bye!
Horacio
According to https://docs.docker.com/engine/reference/commandline/run/, it should be possible to assign a static IP for your container -- at the time of container creation -- using the --ip option:
Example:
docker run -itd --ip 172.30.100.104 --name postgres postgres/image
....where 172.30.100.104 is a free IP address on a custom bridge/overlay network.
This should then retain the same IP address even if postgres container crashes/restarts.
Looks like this was released in Docker Engine v 1.10 or greater, therefore if you have a lower version, you have to upgrade first.
As of Docker 1.0 they implemented a stronger sense of linked containers. Now you can use the container instance name as if it were the host name.
Here is a link
I found a link that better describes your problem. And while that question was answered I wonder whether or not this ambassador pattern might not solve the problem... this assumes that the ambassador is more reliable than the services that link.

Resources