docker build fails on a cloud VM - azure

I have an Ubuntu 16.04 (Xenial) machine running inside an Azure VM. I have followed the instructions to install Docker and all seems fine and dandy.
One of the things that I need to do when I trigger docker run is to pass --net=host, which allows me to run apt-get update and other internet-dependent commands within the container.
The problem comes in when I try to trigger docker build based on an existing Ubuntu image. It fails:
The problem here is that there is no way to pass --net=host to the build command. I see that there are issues open on the Docker GitHub (#20987, #10324) but no clear resolution.
There is an existing answer on Stack Overflow that covers the scenario I want, but that doesn't work within a cloud VM.
Any thoughts on what might be happening?
UPDATE 1:
Here is the docker version output:
Client:
Version: 1.12.0
API version: 1.24
Go version: go1.6.3
Git commit: 8eab29e
Built: Thu Jul 28 22:11:10 2016
OS/Arch: linux/amd64
Server:
Version: 1.12.0
API version: 1.24
Go version: go1.6.3
Git commit: 8eab29e
Built: Thu Jul 28 22:11:10 2016
OS/Arch: linux/amd64
UPDATE 2:
Here is the output from docker network ls:
NETWORK ID NAME DRIVER SCOPE
aa69fa066700 bridge bridge local
1bd082a62ab3 host host local
629eacc3b77e none null local

Another approach is to let docker-machine provision the VM for you and see if that works. There is a driver for Azure, so you should be able to pass your subscription ID to a local Docker client (Windows or Linux) and follow the instructions to get a new VM provisioned with Docker. docker-machine also sets up your local environment variables so that the client communicates with the Docker instance on the VM remotely. Once it is set up, running docker ps or docker run locally executes the commands as if you were running them on the VM. Example:
#Name at end should be all lower case or it will fail.
docker-machine create --driver azure --azure-subscription-id <omitted> --azure-image canonical:ubuntuserver:16.04.0-LTS:16.04.201608150 --azure-size Standard_A0 azureubuntu
#Partial output, see docker-machine resource group in Azure portal
Running pre-create checks...
(azureubuntu) Completed machine pre-create checks.
Creating machine...
(azureubuntu) Querying existing resource group. name="docker-machine"
(azureubuntu) Resource group "docker-machine" already exists.
(azureubuntu) Configuring availability set. name="docker-machine"
(azureubuntu) Configuring network security group. location="westus" name="azureubuntu-firewall"
(azureubuntu) Querying if virtual network already exists. name="docker-machine-vnet" location="westus"
(azureubuntu) Configuring subnet. vnet="docker-machine-vnet" cidr="192.168.0.0/16" name="docker-machine"
(azureubuntu) Creating public IP address. name="azureubuntu-ip" static=false
(azureubuntu) Creating network interface. name="azureubuntu-nic"
(azureubuntu) Creating virtual machine. osImage="canonical:ubuntuserver:16.04.0-LTS:16.04.201608150" name="azureubuntu" location="westus" size="Standard_A0" username="docker-user"
Waiting for machine to be running, this may take a few minutes...
Detecting operating system of created instance...
Waiting for SSH to be available...
Detecting the provisioner...
Provisioning with ubuntu(systemd)...
Installing Docker...
Copying certs to the local machine directory...
Copying certs to the remote machine...
Setting Docker configuration on the remote daemon...
Checking connection to Docker...
Docker is up and running!
To see how to connect your Docker Client to the Docker Engine running on this virtual machine, run: docker-machine env azureubuntu
#Set environment using PowerShell (or log in to the new VM) and see containers on remote host
docker-machine env azureubuntu | Invoke-Expression
docker info
docker network inspect bridge
#Build a local docker project using the remote VM
docker build MyProject
docker images
#To clean up the Azure resources for a machine (you can create multiple, also check docker-machine resource group in Azure portal)
docker-machine rm azureubuntu
As far as I can tell, that works fine. I was able to build a debian:wheezy Dockerfile that uses apt-get on the Azure VM without any issues. This should also allow the containers to run on the default bridge network instead of the host network.

According to the question "I can't get Docker containers to access the internet?", running sudo systemctl restart docker might help, as might setting net.ipv4.ip_forward = 1 or disabling the firewall.
You may also need to update the DNS servers in /etc/resolv.conf on the VM.
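The host-side checks mentioned above can be scripted. This is only a minimal sketch, assuming a Linux host that exposes /proc/sys/net/ipv4/ip_forward; the function names list_nameservers and forwarding_state are made up for illustration and are not part of Docker.

```shell
#!/bin/sh
# Hypothetical helpers for inspecting the VM's network settings.

# Print the nameserver addresses found in a resolv.conf-style file ($1).
list_nameservers() {
    awk '$1 == "nameserver" { print $2 }' "$1"
}

# Translate the content of /proc/sys/net/ipv4/ip_forward ($1) into words.
# The kernel reports "1" when IPv4 forwarding is enabled.
forwarding_state() {
    if [ "$1" = "1" ]; then
        echo "enabled"
    else
        echo "disabled"
    fi
}

# Report the current state of the host, if the files are readable.
if [ -r /proc/sys/net/ipv4/ip_forward ]; then
    echo "ip_forward: $(forwarding_state "$(cat /proc/sys/net/ipv4/ip_forward)")"
fi
if [ -r /etc/resolv.conf ]; then
    echo "nameservers:"
    list_nameservers /etc/resolv.conf
fi
```

If forwarding shows up as disabled, or the nameserver list contains an address the containers cannot reach, those are the settings to look at first.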

Related

Azure DSVM: Cannot connect to the Docker daemon

We have been using the Data Science Virtual Machine in combination with a Virtual Machine scale set for our CI, and then running a custom Docker image in the connected Azure pipelines.
https://github.com/PyTorchLightning/metrics/blob/77e252ec6165ec94e23ce5c5cf9ffdad01bf54a1/azure-pipelines.yml#L29
Recently we have been observing the following failure message:
Starting: Initialize containers
/usr/bin/docker version --format '{{.Server.APIVersion}}'
Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?
'
##[error]Exit code 1 returned from process: file name '/usr/bin/docker', arguments 'version --format '{{.Server.APIVersion}}''.
see the full output here - https://dev.azure.com/PytorchLightning/Metrics/_build/results?buildId=9061&view=logs&j=fd70b5b8-241a-53bf-d137-3fd86cf9f066&t=a0ca1fe4-fde6-4a82-9888-52f5ae79d8fe
UPDATE: the issue was solved in the June 2021 release;
see the Azure DSVM release notes.
Based on the discussion on the post above, the solution (for now) is to pin the version of the scale set image to a previous version:
az vmss update -g <resource group> -n <vmss name> --set virtualMachineProfile.storageProfile.imageReference.version=21.01.21
Docker appears to be disabled in the latest version of the DSVM. Until that is corrected, pin the version. In general, pinning the version is probably a good idea for stability; you can then be deliberate about when you change versions, so that you know what is going on.
Docker is enabled by default in the latest image release (21.06.01) of the Data Science Virtual Machine - Ubuntu 18. This should probably resolve the issue.
The command below works on the latest Data Science Virtual Machine:
/usr/bin/docker --version
Docker version 20.10.6+azure, build 370c28948e3c12dce3d1df60b6f184990618553f
Although the above command works, the Docker daemon still has to be started with the commands below:
sudo systemctl unmask docker
sudo systemctl start docker
sudo chmod 777 /var/run/docker.sock

How to pull a docker image from Azure Registry in an Azure Ubuntu virtual machine

I have created an Azure Container Registry to which I deploy some Docker containers from the CI/CD pipeline in Azure DevOps.
Following the Microsoft documentation, I have created a service principal, so I have a username and password to use to pull images from the Azure Container Registry. I tried to pull the images locally and it works. To connect to the Container Registry I use this command:
docker login myazureregistry.azurecr.io --username --password
Now I want to create a virtual machine in Azure to make the application in the container publicly accessible.
I created an Ubuntu virtual machine and installed Docker. I ran the same command as before on the Ubuntu machine but got an error:
Got permission denied while trying to connect to the Docker daemon socket at unix:///var/run/docker.sock: Post http://%2Fvar%2Frun%2Fdocker.sock/v1.39/auth: dial unix /var/run/docker.sock: connect: permission denied
What is the problem? How do I have to configure Ubuntu to connect to the Azure Container Registry?
Maybe you don't have permission to use Docker.
Add your user to the docker group to run docker commands without sudo:
sudo groupadd docker
sudo usermod -aG docker $USER
After this, run your login command again. (Group changes only take effect in a new login session, so you may need to log out and back in, or restart the virtual machine.)
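To confirm that the group change has reached your current session, you can inspect the output of id -nG. The snippet below is a small sketch; has_group is a hypothetical helper name, not a Docker or system command.

```shell
#!/bin/sh
# Succeed if group name $1 appears as a whole word in the
# space-separated group list $2 (the format printed by `id -nG`).
has_group() {
    case " $2 " in
        *" $1 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Check the groups of the current login session.
if has_group docker "$(id -nG)"; then
    echo "current session is in the docker group"
else
    echo "current session is NOT in the docker group (log in again after usermod)"
fi
```

Note that id -nG reflects the running session, so it keeps reporting the old groups until you start a new one.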

userns-remap option in Docker Swarm (existing) installation

I decided to increase security by enabling the userns-remap option on Docker running in swarm mode.
The installation is not new; there are plenty of running services.
I followed the official manual for the configuration: https://docs.docker.com/engine/security/userns-remap/
The Docker service starts, but docker service ls throws an error:
Handler for GET /v1.40/services returned error: This node is not a swarm manager. Use docker swarm init or docker swarm join to connect this node to swarm and try again
Error getting services: This node is not a swarm manager. Use docker swarm init or docker swarm join to connect this node to swarm and try again
cat /etc/docker/daemon.json is as simple as
{
"userns-remap": "default"
}
cat /etc/subuid /etc/subgid
dockremap:100000:65536
dockremap:100000:65536
id dockremap
uid=1000(dockremap) gid=1000(dockremap) groups=1000(dockremap)
ls -ld /var/lib/docker/100000.100000/
drwx------ 11 231072 231072 26 Mar 21 20:19 /var/lib/docker/100000.100000/
Removing userns-remap from config brings services back to normal.
Running CentOS 7.7 and docker 19.03.8
How can I make it work?
From https://docs.docker.com/engine/security/userns-remap/:
Enabling userns-remap effectively masks existing image and container
layers, as well as other Docker objects within /var/lib/docker/. This
is because Docker needs to adjust the ownership of these resources and
actually stores them in a subdirectory within /var/lib/docker/. It is
best to enable this feature on a new Docker installation rather than
an existing one.
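Ergo, none of the existing images and containers will be available after enabling user namespaces. The same goes for the other Docker objects under /var/lib/docker/, presumably including the swarm state, which would explain the "not a swarm manager" error.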
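To see where the remapped daemon keeps its data, the directory name can be derived from the subordinate ID files. This is a minimal sketch, assuming (as the listing in the question suggests) that the directory is named after the first subordinate UID and GID of the remap user; first_sub_id and remap_root are hypothetical helper names.

```shell
#!/bin/sh
# Print the first subordinate ID allocated to user $1 in a
# subuid/subgid-style file $2 (lines look like "name:start:count").
first_sub_id() {
    awk -F: -v user="$1" '$1 == user { print $2; exit }' "$2"
}

# Print the data directory expected for remap user $1, using the
# subuid file $2 and subgid file $3. Fails if the user has no entry.
# Assumption: the directory is /var/lib/docker/<first uid>.<first gid>.
remap_root() {
    uid=$(first_sub_id "$1" "$2")
    gid=$(first_sub_id "$1" "$3")
    [ -n "$uid" ] && [ -n "$gid" ] || return 1
    echo "/var/lib/docker/${uid}.${gid}"
}

# Report the directory for the default remap user on this host.
if [ -r /etc/subuid ] && [ -r /etc/subgid ]; then
    remap_root dockremap /etc/subuid /etc/subgid || echo "no dockremap entry found"
fi
```

With the /etc/subuid and /etc/subgid contents from the question, this prints /var/lib/docker/100000.100000, which is a fresh, empty data root separate from the objects stored directly under /var/lib/docker/.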

How to reconnect to docker instance

I'll start from the beginning.
I've created an Ubuntu machine with docker installed on Azure.
On top of it I created two Docker containers, and I used to connect from an old computer using docker-machine for management tasks.
I've changed my computer, so I need to connect from the new one.
I've added my Azure subscription.
However, when I run a docker-machine command against the existing machine, I get the following error message:
PS C:\WINDOWS\system32> docker-machine ssh vm name
Host does not exist: "vm name".
The machine is running, but I guess I'll have to recreate the certificates used for the connection.
I've tried the following with no luck:
PS C:\WINDOWS\system32> docker-machine regenerate-certs vm name
Regenerate TLS machine certs?  Warning: this is irreversible. (y/n): y
Regenerating TLS certificates
Host does not exist: "vm name"
I no longer have access to the old machine.
Has anyone been in the same situation?
Any thoughts are welcome.
You'll have to recreate the machine using the generic driver:
docker-machine create \
--driver generic \
--generic-ip-address=203.0.113.81 \
--generic-ssh-key ~/.ssh/id_rsa \
vm
Replace the information accordingly.
Note that this does NOT remove any data on the target instance. It just configures Docker to talk to the machine if it isn't configured already, and generates new certificates so that it can communicate with the instance.

"Default" docker machine does not exist on Linux when Docker daemon is running

I'm running Docker on Linux Manjaro. No problem with running and using the service:
[luqo33#ltarasiewicz-pc containers]$ systemctl status docker
● docker.service - Docker Application Container Engine
Loaded: loaded (/usr/lib/systemd/system/docker.service; disabled; vendor preset: disabled)
Active: active (running) since Fri 2016-12-23 20:46:31 CET; 26s ago
However, docker-machine ls will always show this:
[luqo33#ltarasiewicz-pc containers]$ docker-machine ls
NAME ACTIVE DRIVER STATE URL SWARM DOCKER ERRORS
and
[luqo33#ltarasiewicz-pc containers]$ docker-machine env default
Host does not exist: "default"
Why isn't there the 'default' machine available?
Since you have installed docker on Linux, you can access it directly on the host with docker ps or any other docker commands. You will need to either run these commands as root (sudo) or add your user to the docker group for access to the docker socket.
Docker machine is used to quickly spin up cloud and virtual machine instances of docker, so it's not needed when you have installed it directly on the Linux host.
You have to create it, like this:
$ docker-machine create --driver virtualbox default
Running pre-create checks...
Creating machine...
...
...
...
To see how to connect Docker to this machine, run: docker-machine env default
$ docker-machine ls
NAME ACTIVE DRIVER STATE URL SWARM DOCKER ERRORS
default - virtualbox Running tcp://192.168.99.100:2376 v1.12.1
$ docker-machine env default
export DOCKER_TLS_VERIFY="1"
export DOCKER_HOST="tcp://192.168.99.100:2376"
export DOCKER_CERT_PATH="/Users/blahblah/.docker/machine/machines/default"
export DOCKER_MACHINE_NAME="default"
EDIT: You can also use other virtualization providers like Fusion, Hyper-V etc.

Resources