Docker - Cannot build multi-platform images with docker buildx - linux

I'm trying to build a multi-platform (amd64, arm64 and armv7) image using docker buildx. Since I'm using an amd64 machine running Ubuntu 18.04, I followed the instructions on the Docker website and installed qemu via:
sudo apt install qemu-user
However, a weird error appears when I execute the previous command. More specifically, there seems to be an issue with the binfmt-support service. Here's the full log:
Reading package lists... Done
Building dependency tree
Reading state information... Done
Starting pkgProblemResolver with broken count: 0
Starting 2 pkgProblemResolver with broken count: 0
Done
The following additional packages will be installed:
binfmt-support qemu-user-binfmt
The following NEW packages will be installed:
binfmt-support qemu-user qemu-user-binfmt
0 upgraded, 3 newly installed, 0 to remove and 1 not upgraded.
Need to get 0 B/7.409 kB of archives.
After this operation, 63,4 MB of additional disk space will be used.
Do you want to continue? [Y/n]
Selecting previously unselected package binfmt-support.
(Reading database ... 245278 files and directories currently installed.)
Preparing to unpack .../binfmt-support_2.1.8-2_amd64.deb ...
Unpacking binfmt-support (2.1.8-2) ...
Selecting previously unselected package qemu-user.
Preparing to unpack .../qemu-user_1%3a2.11+dfsg-1ubuntu7.21_amd64.deb ...
Unpacking qemu-user (1:2.11+dfsg-1ubuntu7.21) ...
Selecting previously unselected package qemu-user-binfmt.
Preparing to unpack .../qemu-user-binfmt_1%3a2.11+dfsg-1ubuntu7.21_amd64.deb ...
Unpacking qemu-user-binfmt (1:2.11+dfsg-1ubuntu7.21) ...
Setting up binfmt-support (2.1.8-2) ...
Job for binfmt-support.service failed because the control process exited with error code.
See "systemctl status binfmt-support.service" and "journalctl -xe" for details.
invoke-rc.d: initscript binfmt-support, action "start" failed.
● binfmt-support.service - Enable support for additional executable binary formats
Loaded: loaded (/lib/systemd/system/binfmt-support.service; enabled; vendor preset: enabled)
Active: failed (Result: exit-code) since Wed 2020-02-05 17:20:29 CET; 4ms ago
Docs: man:update-binfmts(8)
Process: 7766 ExecStart=/usr/sbin/update-binfmts --enable (code=exited, status=2)
Main PID: 7766 (code=exited, status=2)
feb 05 17:20:29 XPS-15-9570 systemd[1]: Starting Enable support for additional executable binary formats...
feb 05 17:20:29 XPS-15-9570 update-binfmts[7766]: update-binfmts: warning: unable to close /proc/sys/fs/binfmt_misc/register: No such file or directory
feb 05 17:20:29 XPS-15-9570 update-binfmts[7766]: update-binfmts: exiting due to previous errors
feb 05 17:20:29 XPS-15-9570 systemd[1]: binfmt-support.service: Main process exited, code=exited, status=2/INVALIDARGUMENT
feb 05 17:20:29 XPS-15-9570 systemd[1]: binfmt-support.service: Failed with result 'exit-code'.
feb 05 17:20:29 XPS-15-9570 systemd[1]: Failed to start Enable support for additional executable binary formats.
Setting up qemu-user (1:2.11+dfsg-1ubuntu7.21) ...
Setting up qemu-user-binfmt (1:2.11+dfsg-1ubuntu7.21) ...
update-binfmts: warning: current package is qemu-user-binfmt, but binary format already installed by qemu-user-static
update-binfmts: exiting due to previous errors
dpkg: error processing package qemu-user-binfmt (--configure):
installed qemu-user-binfmt package post-installation script subprocess returned error exit status 2
Processing triggers for man-db (2.8.3-2ubuntu0.1) ...
Processing triggers for ureadahead (0.100.0-21) ...
Processing triggers for neon-settings (0.0+p18.04+git20191212.1343) ...
Processing triggers for systemd (237-3ubuntu10.33) ...
Errors were encountered while processing:
qemu-user-binfmt
E: Sub-process /usr/bin/dpkg returned an error code (1)
Despite that, I tried to go on with the usual procedure, namely:
docker buildx create --name mybuilder
docker buildx use mybuilder
docker buildx inspect --bootstrap
Where the output of the last command is:
[+] Building 5.0s (1/1) FINISHED
=> [internal] booting buildkit 5.0s
=> => pulling image moby/buildkit:buildx-stable-1 4.3s
=> => creating container buildx_buildkit_mybuilder0 0.7s
Name: mybuilder
Driver: docker-container
Nodes:
Name: mybuilder0
Endpoint: unix:///var/run/docker.sock
Status: running
Platforms: linux/amd64, linux/386
As you can see, "linux/amd64" and "linux/386" are listed as the only available platforms. However, I need to build the image for "linux/arm64" and "linux/arm/v7" as well.
I've been looking for a solution to this problem for hours, but I haven't found anything that works.
------------------------------------ EDIT ------------------------------------
Looks like I was able to solve part of the issue by running:
sudo apt purge --auto-remove qemu-user qemu-user-binfmt binfmt-support
and then reinstalling them. In fact, running this command again:
sudo apt install qemu-user
gives no error at all:
Reading package lists... Done
Building dependency tree
Reading state information... Done
Starting pkgProblemResolver with broken count: 0
Starting 2 pkgProblemResolver with broken count: 0
Done
The following additional packages will be installed:
binfmt-support qemu-user-binfmt
The following NEW packages will be installed:
binfmt-support qemu-user qemu-user-binfmt
0 upgraded, 3 newly installed, 0 to remove and 0 not upgraded.
Need to get 0 B/7.409 kB of archives.
After this operation, 63,4 MB of additional disk space will be used.
Do you want to continue? [Y/n]
Selecting previously unselected package binfmt-support.
(Reading database ... 245437 files and directories currently installed.)
Preparing to unpack .../binfmt-support_2.1.8-2_amd64.deb ...
Unpacking binfmt-support (2.1.8-2) ...
Selecting previously unselected package qemu-user.
Preparing to unpack .../qemu-user_1%3a2.11+dfsg-1ubuntu7.21_amd64.deb ...
Unpacking qemu-user (1:2.11+dfsg-1ubuntu7.21) ...
Selecting previously unselected package qemu-user-binfmt.
Preparing to unpack .../qemu-user-binfmt_1%3a2.11+dfsg-1ubuntu7.21_amd64.deb ...
Unpacking qemu-user-binfmt (1:2.11+dfsg-1ubuntu7.21) ...
Setting up binfmt-support (2.1.8-2) ...
Created symlink /etc/systemd/system/multi-user.target.wants/binfmt-support.service → /lib/systemd/system/binfmt-support.service.
Setting up qemu-user (1:2.11+dfsg-1ubuntu7.21) ...
Setting up qemu-user-binfmt (1:2.11+dfsg-1ubuntu7.21) ...
Processing triggers for man-db (2.8.3-2ubuntu0.1) ...
Processing triggers for ureadahead (0.100.0-21) ...
Processing triggers for neon-settings (0.0+p18.04+git20191212.1343) ...
Processing triggers for systemd (237-3ubuntu10.38) ...
Similarly, the output of systemctl status binfmt-support.service is as expected:
● binfmt-support.service - Enable support for additional executable binary formats
Loaded: loaded (/lib/systemd/system/binfmt-support.service; enabled; vendor preset: enabled)
Active: active (exited) since Mon 2020-02-10 11:42:23 CET; 1min 11s ago
Docs: man:update-binfmts(8)
Main PID: 7161 (code=exited, status=0/SUCCESS)
Tasks: 0 (limit: 4915)
CGroup: /system.slice/binfmt-support.service
feb 10 11:42:23 XPS-15-9570 systemd[1]: Starting Enable support for additional executable binary formats...
feb 10 11:42:23 XPS-15-9570 systemd[1]: Started Enable support for additional executable binary formats.
However, part of the issue is still there, as the output after running these three commands:
docker buildx create --name mybuilder
docker buildx use mybuilder
docker buildx inspect --bootstrap
is the same as before, namely:
[+] Building 2.6s (1/1) FINISHED
=> [internal] booting buildkit 2.6s
=> => pulling image moby/buildkit:buildx-stable-1 2.0s
=> => creating container buildx_buildkit_mybuilder0 0.6s
Name: mybuilder
Driver: docker-container
Nodes:
Name: mybuilder0
Endpoint: unix:///var/run/docker.sock
Status: running
Platforms: linux/amd64, linux/386
Why is that? Why is it showing me linux/amd64 and linux/386 as the only available platforms?
EDIT #2 (concerning @LinPy's comment)
The output of docker context ls is:
NAME DESCRIPTION DOCKER ENDPOINT KUBERNETES ENDPOINT ORCHESTRATOR
default * Current DOCKER_HOST based configuration unix:///var/run/docker.sock swarm
I've also tried restarting Docker after installing qemu, but without success. Also, specifying the target platforms in the docker buildx command:
docker buildx build -t <mytag> --platform linux/amd64,linux/arm64,linux/arm/v7 --load .
results in this error:
[+] Building 0.6s (5/20)
=> [internal] load .dockerignore 0.0s
=> => transferring context: 2B 0.0s
=> [internal] load build definition from Dockerfile 0.0s
=> => transferring dockerfile: 32B 0.0s
=> [linux/arm/v7 internal] load metadata for docker.io/alegeno92/opencv_python3:3.4.2 0.6s
=> CANCELED [linux/arm64 internal] load metadata for docker.io/alegeno92/opencv_python3:3.4.2 0.6s
=> CANCELED [linux/amd64 internal] load metadata for docker.io/alegeno92/opencv_python3:3.4.2 0.6s
failed to solve: rpc error: code = Unknown desc = failed to solve with frontend dockerfile.v0: failed to load LLB: runtime execution on platform linux/arm/v7 not supported
By the way, my version of the kernel is 4.15.0-76-generic

Run the multiarch container first
docker run --rm --privileged multiarch/qemu-user-static --reset -p yes
docker buildx rm builder
docker buildx create --name builder --driver docker-container --use
docker buildx inspect --bootstrap
And you should have your alternate architectures.
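To confirm that the builder really picked up the emulated architectures, the Platforms line of the inspect output can be compared against the targets you need. This is only a minimal sketch: it assumes the output format shown in the question (a single comma-separated "Platforms:" line), and missing_platforms is a hypothetical helper name, not part of Docker.

```shell
#!/bin/sh
# Hypothetical helper: print each wanted platform that is absent from the
# "Platforms:" line of `docker buildx inspect` output read on stdin.
# Assumes the plain "Platforms: a, b, c" format shown in the question.
missing_platforms() {
    # Keep only the list after "Platforms:", one trimmed entry per line.
    available=$(sed -n 's/^[[:space:]]*Platforms:[[:space:]]*//p' \
        | tr ',' '\n' \
        | sed 's/^[[:space:]]*//; s/[[:space:]]*$//')
    for wanted in "$@"; do
        # -x matches the whole line, -F treats "linux/arm/v7" literally.
        printf '%s\n' "$available" | grep -qxF "$wanted" \
            || printf '%s\n' "$wanted"
    done
}

# Usage (needs a running builder):
#   docker buildx inspect --bootstrap | missing_platforms linux/arm64 linux/arm/v7
```

If the helper prints nothing, every requested platform is listed; with the output from the question it would print linux/arm64 and linux/arm/v7.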

Adding to this answer in response to the first error. The commands have been updated per https://docs.docker.com/buildx/working-with-buildx/.
QEMU is the emulator that runs binaries built for other architectures; the kernel hands those binaries to it through binfmt_misc handlers.
It will save some people time to start with this command:
docker run --privileged --rm tonistiigi/binfmt --install all

There are multiple binfmt packages, and there's a configuration that I think was missed when this question was asked.
For the various packages, I would opt for qemu-user-static over qemu-user-binfmt to avoid any dynamic linking issues. The two packages are doing the same thing, so you'll need to pick one or the other.
The next part should be fixed in current releases, but I think you were stumbling on this before. That's the fix binary or F flag you'll see when catting the files in /proc/sys/fs/binfmt_misc, e.g. see the F flag here:
$ cat /proc/sys/fs/binfmt_misc/qemu-arm
enabled
interpreter /usr/libexec/qemu-binfmt/arm-binfmt-P
flags: POCF
offset 0
magic 7f454c4601010100000000000000000002002800
mask ffffffffffffff00fffffffffffffffffeffffff
Details on what the F flag means can be found on this kernel.org post but the short of it is container namespaces include a different filesystem namespace, and trying to access the interpreter from that namespace will fail (unless you do something like bind mount /usr/libexec/qemu-binfmt into your container). Newer versions of the qemu packages automatically set this flag, so if your flags section doesn't have the F defined, see these bug reports for the version you'll need to upgrade to:
Debian: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=868030
Ubuntu: https://bugs.launchpad.net/ubuntu/+source/qemu/+bug/1815100
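Rather than reading each entry by eye, the flag check can be scripted. The following is a rough sketch that assumes the entry layout shown above (a "flags:" line per file) and that the handlers are named qemu-*; both function names are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeed if a binfmt_misc entry file lists the
# F (fix binary) flag on its "flags:" line.
has_fix_binary_flag() {
    # $1 is an entry such as /proc/sys/fs/binfmt_misc/qemu-arm
    grep -q '^flags:.*F' "$1"
}

# Report every qemu-* handler in a directory that lacks the F flag.
# The directory defaults to the usual binfmt_misc mount point.
check_qemu_handlers() {
    dir=${1:-/proc/sys/fs/binfmt_misc}
    for entry in "$dir"/qemu-*; do
        [ -e "$entry" ] || continue   # glob matched nothing
        has_fix_binary_flag "$entry" || printf 'missing F flag: %s\n' "$entry"
    done
}

# Usage:
#   check_qemu_handlers
```

No output means every registered qemu handler already carries the flag (or none are registered at all, which is worth ruling out with ls on the same directory).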
The easy button is to use the binaries from the multiarch image. This is good in CI if you have a dedicated VM (less ideal if you are modifying the host used by other builds). However if you reboot, it breaks until you run the container again. And it requires you to remember to update it for any upstream patches. So I wouldn't recommend it for a long running build host.
docker run --rm --privileged multiarch/qemu-user-static --reset -p yes

For GitHub CI, adding the following step solved this for me:
- name: Set up QEMU
  id: qemu
  uses: docker/setup-qemu-action@v1
  with:
    image: tonistiigi/binfmt:latest
    platforms: all

Related

How to create an x server with Singularity

Overall, I am trying to render images using Unity on a remote cluster.
The cluster does not have an X server; I don't have sudo permissions, or can start a Docker container, but I can start a Singularity container.
My plan is to create a container that would simulate the X Server. I created the following Singularity definition file:
Bootstrap: docker
From: nvidia/cuda:10.1-cudnn7-devel-ubuntu18.04
%post
# xvfb for rendering in headless mode
apt-get update
apt-get install -y xvfb mesa-utils xorg
echo "allowed_users = anybody" > /etc/X11/Xwrapper.config
I started the container with the option --containall. From the container, I launched the command /usr/bin/X :0, but it failed with the following error:
Singularity xvfb.sif:~> /usr/bin/X :0
_XSERVTransmkdir: Owner of /tmp/.X11-unix should be set to root
X.Org X Server 1.19.6
Release Date: 2017-12-20
X Protocol Version 11, Revision 0
Build Operating System: Linux 4.15.0-140-generic x86_64 Ubuntu
Current Operating System: Linux cooper 5.8.0-50-generic #56~20.04.1-Ubuntu SMP Mon Apr 12 21:46:35 UTC 2021 x86_64
Kernel command line: BOOT_IMAGE=/boot/vmlinuz-5.8.0-50-generic root=/dev/mapper/vgubuntu-root ro quiet splash vt.handoff=7
Build Date: 08 April 2021 01:57:21PM
xorg-server 2:1.19.6-1ubuntu4.9 (For technical support please see http://www.ubuntu.com/support)
Current version of pixman: 0.34.0
Before reporting problems, check http://wiki.x.org
to make sure that you have the latest version.
Markers: (--) probed, (**) from config file, (==) default setting,
(++) from command line, (!!) notice, (II) informational,
(WW) warning, (EE) error, (NI) not implemented, (??) unknown.
(==) Log file: "/home/pierre-louis/.local/share/xorg/Xorg.0.log", Time: Wed May 26 09:17:05 2021
(==) Using system config directory "/usr/share/X11/xorg.conf.d"
(EE)
Fatal server error:
(EE) parse_vt_settings: Cannot open /dev/tty0 (No such file or directory)
(EE)
(EE)
Please consult the The X.Org Foundation support
at http://wiki.x.org
for help.
(EE) Please also check the log file at "/home/pierre-louis/.local/share/xorg/Xorg.0.log" for additional information.
(EE)
(EE) Server terminated with error (1). Closing log file.
No /dev/tty* devices exist. I then tried to launch startx, only to get the same error message.
How can I launch an X Server using a Singularity image?
As mentioned in a separate discussion, Xvfb is not supposed to be started through startx or /usr/bin/X but rather with the supplied run script.
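A sketch of what that could look like, assuming the run script meant here is xvfb-run (the wrapper shipped with the xvfb package) and reusing the xvfb.sif image name from the question. The run_headless function, the screen geometry, and the binary path in the usage line are hypothetical and only there for illustration.

```shell
#!/bin/sh
# Hypothetical wrapper: run a command under Xvfb inside a Singularity image.
# With DRY_RUN=1 it only prints the command line (shell quoting is not
# reproduced in the printed form) instead of executing it.
run_headless() {
    image=$1
    shift
    # xvfb-run picks a free display number and passes the screen
    # geometry on to the Xvfb server it starts.
    set -- singularity exec "$image" xvfb-run --auto-servernum \
        "--server-args=-screen 0 1280x720x24" "$@"
    if [ "${DRY_RUN:-0}" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Usage (the binary name is a placeholder):
#   run_headless xvfb.sif ./MyUnityBuild.x86_64
```

Because xvfb-run sets DISPLAY for the wrapped command itself, nothing needs /dev/tty0 or a root-owned /tmp/.X11-unix, which is where /usr/bin/X failed above.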

Cannot install freeradius server installation exits with code (1)

Trying to install freeradius package on Debian 10 buster and it fails.
$ sudo apt install freeradius
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
Suggested packages:
freeradius-krb5 freeradius-ldap freeradius-mysql freeradius-postgresql freeradius-python3
The following NEW packages will be installed:
freeradius
0 upgraded, 1 newly installed, 0 to remove and 0 not upgraded.
Need to get 555 kB of archives.
After this operation, 2230 kB of additional disk space will be used.
Get:1 http://ftp.de.debian.org/debian sid/main amd64 freeradius amd64 3.0.21+dfsg-2+b2 [555 kB]
Fetched 555 kB in 0s (2753 kB/s)
Selecting previously unselected package freeradius.
(Reading database ... 140557 files and directories currently installed.)
Preparing to unpack .../freeradius_3.0.21+dfsg-2+b2_amd64.deb ...
Unpacking freeradius (3.0.21+dfsg-2+b2) ...
Setting up freeradius (3.0.21+dfsg-2+b2) ...
update-rc.d: warning: start and stop actions are no longer supported; falling back to defaults
install: invalid user ‘freerad’
dpkg: error processing package freeradius (--configure):
installed freeradius package post-installation script subprocess returned error exit status 1
Errors were encountered while processing:
freeradius
E: Sub-process /usr/bin/dpkg returned an error code (1)
install: invalid user ‘freerad’
It seems that the freerad user the installer expects doesn't exist.
Checking /etc/passwd clearly shows there's no such user:
$ cat /etc/passwd | grep freer*
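A slightly more direct check than grepping /etc/passwd is to ask id, which also sees accounts that come from other user databases. This is just a sketch; user_exists is a hypothetical helper, and the two names tested are the one the post-install script complains about (freerad) and the one created by hand later in the question (freeradius).

```shell
#!/bin/sh
# Hypothetical helper: succeed if the named account can be resolved.
user_exists() {
    id -u "$1" >/dev/null 2>&1
}

# Compare the account the installer asks for with the one added manually.
for name in freerad freeradius; do
    if user_exists "$name"; then
        echo "$name: present"
    else
        echo "$name: missing"
    fi
done
```

If freerad shows as missing while freeradius is present, the manually created account does not satisfy the install script, since the names differ.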
Checking the syslog shows:
Feb 10 00:38:45 server-1 systemd[1]: freeradius.service: Scheduled restart job, restart counter is at 277.
Feb 10 00:38:45 server-1 freeradius[10918]: FreeRADIUS Version 3.0.21
Feb 10 00:38:45 server-1 freeradius[10918]: Copyright (C) 1999-2019 The FreeRADIUS server project and contributors
Feb 10 00:38:45 server-1 freeradius[10918]: There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A
Feb 10 00:38:45 server-1 freeradius[10918]: PARTICULAR PURPOSE
Feb 10 00:38:45 server-1 freeradius[10918]: You may redistribute copies of FreeRADIUS under the terms of the
Feb 10 00:38:45 server-1 freeradius[10918]: GNU General Public License
Feb 10 00:38:45 server-1 freeradius[10918]: For more information about these matters, see the file named COPYRIGHT
Feb 10 00:38:45 server-1 freeradius[10918]: Errors reading /etc/freeradius/3.0: Permission denied
Feb 10 00:38:45 server-1 systemd[1]: freeradius.service: Control process exited, code=exited, status=1/FAILURE
Feb 10 00:38:45 server-1 systemd[1]: freeradius.service: Failed with result 'exit-code'.
I tried to work around this by adding the user manually with the command adduser freeradius
and then tried re-installing:
$ sudo apt install freeradius
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
freeradius is already the newest version (3.0.21+dfsg-2+b2).
0 upgraded, 0 newly installed, 0 to remove and 0 not upgraded.
1 not fully installed or removed.
After this operation, 0 B of additional disk space will be used.
Do you want to continue? [Y/n] y
Setting up freeradius (3.0.21+dfsg-2+b2) ...
update-rc.d: warning: start and stop actions are no longer supported; falling back to defaults
chown: cannot access '/etc/freeradius': No such file or directory
dpkg: error processing package freeradius (--configure):
installed freeradius package post-installation script subprocess returned error exit status 1
Errors were encountered while processing:
freeradius
E: Sub-process /usr/bin/dpkg returned an error code (1)
Now it fails with a new error: chown: cannot access '/etc/freeradius': No such file or directory
I tried purging, removing, cleaning the cache, and rebooting before restarting the installation, but the error keeps reappearing.
Let me know if there's more info I can provide to help you help me.
Cheers.
When removing freeradius (either by apt remove freeradius or apt purge freeradius), mind that the config files are in a separate package called freeradius-config.
So to completely wipe freeradius and do a reinstall, run:
apt purge freeradius freeradius-config
rm -rf /etc/freeradius/
apt install freeradius
After installing freeradius, add the following lines at the end of the clients.conf file to grant access to your clients:
# vi /etc/freeradius/3.0/clients.conf
For example:
client SWITCH-01 {
ipaddr = 192.168.0.10
secret = kamisama123
}
client LINUX-01 {
ipaddr = 192.168.0.20
secret = vegeto123
}
In our example, we are adding 2 client devices; both will offer a login prompt that authenticates against the freeradius server database.
Now, locate and edit the freeradius users configuration file:
Example:
# vi /etc/freeradius/3.0/users
Add your usernames and passwords like below:
freerad1 Cleartext-Password := "freerad1234"
freerad2 Cleartext-Password := "freerad2123"
Now, restart the freeradius server:
# service freeradius restart
Now, test your radius server configuration file:
# freeradius -CX
You can test your radius authentication locally on the radius server using the following command:
# radtest freerad1 freerad1234 localhost 0 testing123
Here is an example of a successful radius authentication:
Sent Access-Request Id 151 from 0.0.0.0:34857 to 127.0.0.1:1812 length 75
User-Name = "freerad1"
User-Password = "freerad1234"
NAS-IP-Address = 172.31.41.98
NAS-Port = 0
Message-Authenticator = 0x00
Cleartext-Password = "freerad1234"
Received Access-Accept Id 151 from 127.0.0.1:1812 to 0.0.0.0:0 length 20
We are using the freerad1 username and the freerad1234 password to authenticate the user account.
testing123 is the default shared secret for localhost included in the clients.conf file.
Now, go to the Linux server listed in the clients.conf configuration file as LINUX-01.
Install the freeradius-utils package.
# apt-get install freeradius-utils
Test your radius authentication remotely from the Linux server (LINUX-01, or whichever client you added to clients.conf) using the following command:
# radtest freerad1 freerad1234 192.168.0.50 0 vegeto123
** Congratulations, your freeradius server is up and running **

Postgresql 13 Installation failing on Ubuntu 20.04 (code=exited, status=203/EXEC)

I've been trying to install PostgreSQL on Ubuntu 20.04 (VPS) but I always get an error.
I followed this article to install it.
EDIT: I also followed the instructions from the PostgreSQL website to install it from repository. I get the same error.
The line that breaks the installation is here (2nd time I try to install it):
$ sudo apt install postgresql-13 postgresql-client-13
It outputs:
Reading package lists... Done
Building dependency tree
Reading state information... Done
postgresql-13 is already the newest version (13.1-1.pgdg20.04+1).
postgresql-client-13 is already the newest version (13.1-1.pgdg20.04+1).
0 upgraded, 0 newly installed, 0 to remove and 0 not upgraded.
2 not fully installed or removed.
After this operation, 0 B of additional disk space will be used.
Do you want to continue? [Y/n] Y
Setting up postgresql-common (223.pgdg20.04+1) ...
Not replacing deleted config file /etc/postgresql-common/createcluster.conf
Building PostgreSQL dictionaries from installed myspell/hunspell packages...
Removing obsolete dictionary files:
Job for postgresql.service failed because the control process exited with error code.
See "systemctl status postgresql.service" and "journalctl -xe" for details.
invoke-rc.d: initscript postgresql, action "start" failed.
● postgresql.service - PostgreSQL database server
Loaded: loaded (/etc/systemd/system/postgresql.service; enabled; vendor preset: enabled)
Active: failed (Result: exit-code) since Wed 2020-11-18 17:57:53 UTC; 13ms ago
Docs: man:postgres(1)
Process: 16459 ExecStart=/usr/local/pgsql/bin/postgres -D /usr/local/pgsql/data (code=exited, status=203/EXEC)
Main PID: 16459 (code=exited, status=203/EXEC)
Nov 18 17:57:53 localhost systemd[1]: Starting PostgreSQL database server...
Nov 18 17:57:53 localhost systemd[16459]: postgresql.service: Failed to execute command: No such file or directory
Nov 18 17:57:53 localhost systemd[16459]: postgresql.service: Failed at step EXEC spawning /usr/local/pgsql/bin/postgres: No such file or directory
Nov 18 17:57:53 localhost systemd[1]: postgresql.service: Main process exited, code=exited, status=203/EXEC
Nov 18 17:57:53 localhost systemd[1]: postgresql.service: Failed with result 'exit-code'.
Nov 18 17:57:53 localhost systemd[1]: Failed to start PostgreSQL database server.
dpkg: error processing package postgresql-common (--configure):
installed postgresql-common package post-installation script subprocess returned error exit status 1
dpkg: dependency problems prevent configuration of postgresql-13:
postgresql-13 depends on postgresql-common (>= 182~); however:
Package postgresql-common is not configured yet.
dpkg: error processing package postgresql-13 (--configure):
dependency problems - leaving unconfigured
No apport report written because the error message indicates its a followup error from a previous failure.
Errors were encountered while processing:
postgresql-common
postgresql-13
Then I try to check the status of postgresql.service:
$ systemctl status postgresql.service
● postgresql.service - PostgreSQL database server
Loaded: loaded (/etc/systemd/system/postgresql.service; enabled; vendor preset: enabled)
Active: failed (Result: exit-code) since Wed 2020-11-18 17:57:53 UTC; 15s ago
Docs: man:postgres(1)
Process: 16459 ExecStart=/usr/local/pgsql/bin/postgres -D /usr/local/pgsql/data (code=exited, status=203/EXEC)
Main PID: 16459 (code=exited, status=203/EXEC)
Can you give me some insights on how to deal with this issue?
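One thing the status output itself shows: the unit is loaded from /etc/systemd/system/postgresql.service, which takes precedence over a packaged unit under /lib/systemd/system, and its ExecStart points at /usr/local/pgsql/bin/postgres, which systemd reports as missing. Whether that file is a leftover from an earlier source build is only a guess. A small sketch for checking where a unit's ExecStart points (both function names are hypothetical):

```shell
#!/bin/sh
# Hypothetical helper: print the program named by the first ExecStart=
# line of a systemd unit file.
unit_exec_binary() {
    # Drop the key and any prefix characters (-, @, :, +, !), then keep
    # the first word, which is the program path.
    sed -n 's/^ExecStart=[-@:+!]*//p' "$1" | head -n 1 | cut -d' ' -f1
}

# Report whether that program exists and is executable.
check_unit() {
    bin=$(unit_exec_binary "$1")
    if [ -x "$bin" ]; then
        echo "ok: $bin"
    else
        echo "missing: $bin"
    fi
}

# Usage:
#   check_unit /etc/systemd/system/postgresql.service
```

A "missing" result for the file in /etc/systemd/system would match the status=203/EXEC failure in the log above.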

Gitlab CI 9.5 service is not running

I have been searching the web for a solution for 2 weeks and I really need some help.
I am facing 3 problems:
Linux Gitlab-runner is not running
I have tried installing gitlab-runner in every way (GitLab's official repository, manually, Docker).
Every time I launch the command "gitlab-runner status", the answer is always "The server is not running." I have tried a million times to uninstall the service and re-install it, but it does not want to work. I have registered runners of all kinds, with and without the sudo user, without any success. This is my server setup:
Config
Ubuntu 16.04.1
Docker container gitlab 9.4.3
Port:
webservice :8088
https : 4433
ssh : 2222
gitlab-runner 9.5.0
How to reproduce
Register a shell runner http://192.168.1.10:8088/
Launch the command "sudo service gitlab-runner status"
Loaded: loaded (/etc/systemd/system/gitlab-runner.service; enabled; vendor preset: enabled)
Active: activating (auto-restart) (Result: exit-code) since ven. 2017-08-25 15:17:45 CEST; 45s ago
Process: 13201 ExecStart=/usr/bin/gitlab-ci-multi-runner run --working-directory /home/gitlab-runner --config /etc/gitlab-runner/config.toml --service gitlab-runner --syslog --user gitlab-runner (code=exited, status=1/FAILURE)
Main PID: 13201 (code=exited, status=1/FAILURE)
systemd[1]: gitlab-runner.service: Unit entered failed state.
systemd[1]: gitlab-runner.service: Failed with result 'exit-code'.
Windows gitlab-runner Error 500
Because of my problems installing gitlab-runner on Linux, I tried installing it on another computer running Windows 10.
It worked, and the command gitlab-runner status finally answered "Service is running" (but this is just a temporary solution; I really need to make it work on Linux).
Anyway, I added a CI script to a test program and launched the job, but it kept looping over and over.
When I launch the command "gitlab-runner --debug run":
...
passfile: true
extension: cmd
job=183 project=19 runner=679ccd01
Using Shell executor... job=183 project=19 runner=679ccd01
Waiting for signals... job=183 project=19 runner=679ccd01
WARNING: Job failed: exit status 128 job=183 project=19 runner=679ccd01
WARNING: Submitting job to coordinator... failed job=183 runner=679ccd01 status=500 Internal Server Error
WARNING: Submitting job to coordinator... failed job=183 runner=679ccd01 status=500 Internal Server Error
...
Gitlab.com and run command
So I decided to add my project to gitlab.com to test it.
git@gitlab.com:sandbox_test/test_ci.git
Once again the job kept looping indefinitely until I launched the command "gitlab-runner run" on my Windows computer.
Dialing: tcp gitlab.com:443 ...
Feeding runners to channel builds=0
Checking for jobs... received job=30315630 repo_url=https://gitlab.com/sandbox_test/test_ci.git runner=d98c0af1
Failed to requeue the runner: builds=1 runner=d98c0af1
Running with gitlab-ci-multi-runner 9.5.0 (413da38)
on Windows_shell_gitlab_com (d98c0af1) job=30315630 project=3992201 runner=d98c0af1
Shell configuration: environment: []
dockercommand: []
command: cmd
arguments:
- /C
passfile: true
extension: cmd
job=30315630 project=3992201 runner=d98c0af1
Using Shell executor... job=30315630 project=3992201 runner=d98c0af1
Waiting for signals... job=30315630 project=3992201 runner=d98c0af1
Job succeeded job=30315630 project=3992201 runner=d98c0af1
Why is it necessary to launch the run command to make my job work on gitlab.com?
I expected that a new job would be picked up by itself, without having to launch gitlab-runner manually on the CI computer.
Script .gitlab-ci.yml
Validate on CI Lint
stages:
  - build
  - test
  - deploy
build:
  stage: build
  script:
    - echo "building"
test:
  stage: test
  script:
    - echo "test"
I really need answers very fast, thanks for your help.
Best Regards,Clement
UPDATE 1
I have resolved part of my problems:
Linux Gitlab-runner is not running
Launch the command "gitlab-runner run --working-directory /home/gitlab-runner --config /etc/gitlab-runner/config.toml --service gitlab-runner --syslog --user gitlab-runner"
First Error : chdir /home/gitlab-runner: no such file or directory
Solution: sudo mkdir /home/gitlab-runner
Second Error : open /etc/gitlab-runner/config.toml: permission denied
Solution : sudo chmod 755 /etc/gitlab-runner/config.toml
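Both failures can be checked up front before starting the service. A minimal sketch, with check_runner_prereqs as a hypothetical helper name; the paths in the usage line are the ones from the command above.

```shell
#!/bin/sh
# Hypothetical pre-flight check for the two failures described above:
# a missing working directory and an unreadable config file.
check_runner_prereqs() {
    workdir=$1
    config=$2
    status=0
    if [ ! -d "$workdir" ]; then
        echo "missing working directory: $workdir"
        status=1
    fi
    if [ ! -r "$config" ]; then
        echo "cannot read config: $config"
        status=1
    fi
    return $status
}

# Usage (run as the account the service runs under):
#   check_runner_prereqs /home/gitlab-runner /etc/gitlab-runner/config.toml
```

Since config.toml holds the runner token, it may be preferable to grant read access only to the account that runs the service rather than making the file world-readable.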

Unable to start Docker Service in Ubuntu 16.04

I've been trying to use Docker (1.10) on Ubuntu 16.04 but installation fails because Docker Service doesn't start.
I've already tried to install docker by docker.io, docker-engine apt packages and curl -sSL https://get.docker.com/ | sh but it doesn't work.
My Host info is:
Linux Xenial 4.5.3-040503-generic #201605041831 SMP Wed May 4 22:33:50 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
Here is systemctl status docker.service:
● docker.service - Docker Application Container Engine
Loaded: loaded (/lib/systemd/system/docker.service; enabled; vendor preset: enabled)
Active: failed (Result: exit-code) since sáb 2016-05-14 15:17:31 CEST; 12min ago
Docs: https://docs.docker.com
Process: 22479 ExecStart=/usr/bin/docker daemon -H fd:// (code=exited, status=1/FAILURE)
Main PID: 22479 (code=exited, status=1/FAILURE)
may 14 15:17:30 Xenial docker[22479]: time="2016-05-14T15:17:30.103601523+02:00" level=info msg="New containerd process, pid: 22485\n"
may 14 15:17:31 Xenial docker[22479]: time="2016-05-14T15:17:31.149064723+02:00" level=error msg="devmapper: Unable to delete device: devicemapper: Can't set task name /dev/mapper/docker-8:6-2101297-pool"
may 14 15:17:31 Xenial docker[22479]: time="2016-05-14T15:17:31.149127439+02:00" level=warning msg="devmapper: Usage of loopback devices is strongly discouraged for production use. Please use `--storage-opt dm.thinpooldev` or use `man docker` to refer to dm.thinpooldev section."
may 14 15:17:31 Xenial docker[22479]: time="2016-05-14T15:17:31.153010028+02:00" level=error msg="[graphdriver] prior storage driver \"devicemapper\" failed: devicemapper: Can't set task name /dev/mapper/docker-8:6-2101297-pool"
may 14 15:17:31 Xenial docker[22479]: time="2016-05-14T15:17:31.153130839+02:00" level=fatal msg="Error starting daemon: error initializing graphdriver: devicemapper: Can't set task name /dev/mapper/docker-8:6-2101297-pool"
may 14 15:17:31 Xenial systemd[1]: docker.service: Main process exited, code=exited, status=1/FAILURE
may 14 15:17:31 Xenial docker[22479]: time="2016-05-14T15:17:31+02:00" level=info msg="stopping containerd after receiving terminated"
may 14 15:17:31 Xenial systemd[1]: Failed to start Docker Application Container Engine.
may 14 15:17:31 Xenial systemd[1]: docker.service: Unit entered failed state.
may 14 15:17:31 Xenial systemd[1]: docker.service: Failed with result 'exit-code'.
Here is sudo docker daemon -D
DEBU[0000] docker group found. gid: 999
DEBU[0000] Listener created for HTTP on unix (/var/run/docker.sock)
INFO[0000] previous instance of containerd still alive (23050)
DEBU[0000] containerd connection state change: CONNECTING
DEBU[0000] Using default logging driver json-file
DEBU[0000] Golang's threads limit set to 55980
DEBU[0000] received past containerd event: &types.Event{Type:"live", Id:"", Status:0x0, Pid:"", Timestamp:0x57372cae}
DEBU[0000] containerd connection state change: READY
DEBU[0000] devicemapper: driver version is 4.34.0
DEBU[0000] devmapper: Generated prefix: docker-8:6-2101297
DEBU[0000] devmapper: Checking for existence of the pool docker-8:6-2101297-pool
DEBU[0000] devmapper: poolDataMajMin=7:0 poolMetaMajMin=7:1
DEBU[0000] devmapper: Major:Minor for device: /dev/loop0 is:7:0
DEBU[0000] devmapper: Major:Minor for device: /dev/loop1 is:7:1
DEBU[0000] devmapper: loadDeviceFilesOnStart()
DEBU[0000] devmapper: Skipping file /var/lib/docker/devicemapper/metadata/transaction-metadata
DEBU[0000] devmapper: loadDeviceFilesOnStart() END
DEBU[0000] devmapper: constructDeviceIDMap()
DEBU[0000] devmapper: constructDeviceIDMap() END
DEBU[0000] devmapper: Rolling back open transaction: TransactionID=1 hash= device_id=1
ERRO[0000] devmapper: Unable to delete device: devicemapper: Can't set task name /dev/mapper/docker-8:6-2101297-pool
WARN[0000] devmapper: Usage of loopback devices is strongly discouraged for production use. Please use `--storage-opt dm.thinpooldev` or use `man docker` to refer to dm.thinpooldev section.
DEBU[0000] devmapper: Initializing base device-mapper thin volume
DEBU[0000] devicemapper: CreateDevice(poolName=/dev/mapper/docker-8:6-2101297-pool, deviceID=1)
DEBU[0000] devmapper: Error creating device: devicemapper: Can't set task name /dev/mapper/docker-8:6-2101297-pool
DEBU[0000] devmapper: Error device setupBaseImage: devicemapper: Can't set task name /dev/mapper/docker-8:6-2101297-pool
ERRO[0000] [graphdriver] prior storage driver "devicemapper" failed: devicemapper: Can't set task name /dev/mapper/docker-8:6-2101297-pool
DEBU[0000] Cleaning up old mountid : start.
FATA[0000] Error starting daemon: error initializing graphdriver: devicemapper: Can't set task name /dev/mapper/docker-8:6-2101297-pool
Here is ./check-config.sh output:
warning: /proc/config.gz does not exist, searching other paths for kernel config ...
info: reading kernel config from /boot/config-4.5.3-040503-generic ...
Generally Necessary:
- cgroup hierarchy: properly mounted [/sys/fs/cgroup]
- apparmor: enabled and tools installed
- CONFIG_NAMESPACES: enabled
- CONFIG_NET_NS: enabled
- CONFIG_PID_NS: enabled
- CONFIG_IPC_NS: enabled
- CONFIG_UTS_NS: enabled
- CONFIG_DEVPTS_MULTIPLE_INSTANCES: enabled
- CONFIG_CGROUPS: enabled
- CONFIG_CGROUP_CPUACCT: enabled
- CONFIG_CGROUP_DEVICE: enabled
- CONFIG_CGROUP_FREEZER: enabled
- CONFIG_CGROUP_SCHED: enabled
- CONFIG_CPUSETS: enabled
- CONFIG_MEMCG: enabled
- CONFIG_KEYS: enabled
- CONFIG_MACVLAN: enabled (as module)
- CONFIG_VETH: enabled (as module)
- CONFIG_BRIDGE: enabled (as module)
- CONFIG_BRIDGE_NETFILTER: enabled (as module)
- CONFIG_NF_NAT_IPV4: enabled (as module)
- CONFIG_IP_NF_FILTER: enabled (as module)
- CONFIG_IP_NF_TARGET_MASQUERADE: enabled (as module)
- CONFIG_NETFILTER_XT_MATCH_ADDRTYPE: enabled (as module)
- CONFIG_NETFILTER_XT_MATCH_CONNTRACK: enabled (as module)
- CONFIG_NF_NAT: enabled (as module)
- CONFIG_NF_NAT_NEEDED: enabled
- CONFIG_POSIX_MQUEUE: enabled
Optional Features:
- CONFIG_USER_NS: enabled
- CONFIG_SECCOMP: enabled
- CONFIG_CGROUP_PIDS: enabled
- CONFIG_MEMCG_KMEM: missing
- CONFIG_MEMCG_SWAP: enabled
- CONFIG_MEMCG_SWAP_ENABLED: missing
(note that cgroup swap accounting is not enabled in your kernel config, you can enable it by setting boot option "swapaccount=1")
- CONFIG_BLK_CGROUP: enabled
- CONFIG_BLK_DEV_THROTTLING: enabled
- CONFIG_IOSCHED_CFQ: enabled
- CONFIG_CFQ_GROUP_IOSCHED: enabled
- CONFIG_CGROUP_PERF: enabled
- CONFIG_CGROUP_HUGETLB: enabled
- CONFIG_NET_CLS_CGROUP: enabled (as module)
- CONFIG_CGROUP_NET_PRIO: enabled
- CONFIG_CFS_BANDWIDTH: enabled
- CONFIG_FAIR_GROUP_SCHED: enabled
- CONFIG_RT_GROUP_SCHED: missing
- CONFIG_EXT3_FS: missing
- CONFIG_EXT3_FS_XATTR: missing
- CONFIG_EXT3_FS_POSIX_ACL: missing
- CONFIG_EXT3_FS_SECURITY: missing
(enable these ext3 configs if you are using ext3 as backing filesystem)
- CONFIG_EXT4_FS: enabled
- CONFIG_EXT4_FS_POSIX_ACL: enabled
- CONFIG_EXT4_FS_SECURITY: enabled
- Network Drivers:
- "overlay":
- CONFIG_VXLAN: enabled (as module)
- Storage Drivers:
- "aufs":
- CONFIG_AUFS_FS: missing
- "btrfs":
- CONFIG_BTRFS_FS: enabled (as module)
- "devicemapper":
- CONFIG_BLK_DEV_DM: enabled
- CONFIG_DM_THIN_PROVISIONING: enabled (as module)
- "overlay":
- CONFIG_OVERLAY_FS: enabled (as module)
- "zfs":
- /dev/zfs: missing
- zfs command: missing
- zpool command: missing
If someone could please help me I would be very thankful
Update
It seems that in newer versions of docker and Ubuntu the unit file for docker is simply masked (pointing to /dev/null).
You can verify it by running the following commands in the terminal:
sudo file /lib/systemd/system/docker.service
sudo file /lib/systemd/system/docker.socket
You should see that the unit file symlinks to /dev/null.
In this case, all you have to do is follow S34N's suggestion, and run:
sudo systemctl unmask docker.service
sudo systemctl unmask docker.socket
sudo systemctl start docker.service
sudo systemctl status docker
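As a rough sketch of the check above, the following shell functions test whether a unit file is a symlink to /dev/null. The function names are made up for illustration. The directory is a parameter because `systemctl mask` normally creates its link under /etc/systemd/system, so it may be worth checking there as well.

```shell
#!/bin/sh
# Sketch: report whether the Docker unit files in a directory are masked,
# i.e. are symlinks pointing at /dev/null.
# is_masked and report_masked_units are hypothetical helper names.

is_masked() {
    # readlink prints the symlink target and prints nothing for a regular file.
    [ "$(readlink "$1" 2>/dev/null)" = "/dev/null" ]
}

report_masked_units() {
    # $1: directory holding the unit files, e.g. /lib/systemd/system
    dir=$1
    for unit in docker.service docker.socket; do
        if is_masked "$dir/$unit"; then
            echo "$unit: masked"
        else
            echo "$unit: not masked"
        fi
    done
}
```

Running `report_masked_units /lib/systemd/system` prints one line per unit. The unmask commands are only relevant if a unit is reported as masked.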
I'll also keep the original post, which addresses the error log indicating that the storage driver should be replaced:
Original Post
I had the same problem, and I tried fixing it with Salva Cort's suggestion, but printing /etc/default/docker says:
# THIS FILE DOES NOT APPLY TO SYSTEMD
So here's a permanent fix that works for systemd (Ubuntu 15.04 and higher):
create a new file /etc/systemd/system/docker.service.d/overlay.conf with the following content:
[Service]
ExecStart=
ExecStart=/usr/bin/docker daemon -H fd:// -s overlay
flush changes by executing:
sudo systemctl daemon-reload
verify that the configuration has been loaded:
systemctl show --property=ExecStart docker
restart docker:
sudo systemctl restart docker
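The drop-in steps above could be scripted roughly like this. It is only a sketch: `write_storage_dropin` and its parameters are hypothetical names, and the target directory is an argument so the function can be tried on a scratch directory before touching /etc/systemd/system.

```shell
#!/bin/sh
# Sketch: generate the overlay.conf drop-in described above.
# write_storage_dropin is a hypothetical helper, not a Docker or systemd tool.

write_storage_dropin() {
    # $1: drop-in directory, e.g. /etc/systemd/system/docker.service.d
    # $2: storage driver (defaults to overlay, as in the answer)
    # $3: daemon command (defaults to the "docker daemon" form used in the answer)
    dropin_dir=$1
    driver=${2:-overlay}
    daemon_cmd=${3:-/usr/bin/docker daemon}
    mkdir -p "$dropin_dir" || return 1
    # The empty ExecStart= line clears the command inherited from the main
    # unit file; the second line sets the replacement.
    cat > "$dropin_dir/overlay.conf" <<EOF
[Service]
ExecStart=
ExecStart=$daemon_cmd -H fd:// -s $driver
EOF
}
```

Run as root, `write_storage_dropin /etc/systemd/system/docker.service.d` would produce the same file as the manual steps, after which the daemon-reload and restart commands still apply. On releases where `docker daemon` has been replaced by `dockerd`, the third argument would presumably be `/usr/bin/dockerd`.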
The following unmasking commands worked for me (Ubuntu 18). Hope it helps someone out there... :-)
sudo systemctl unmask docker.service
sudo systemctl unmask docker.socket
sudo systemctl start docker.service
I had the same problem after upgrading Docker from 17.05-ce to 17.06-ce via docker-machine.
Update /etc/systemd/system/docker.service.d/10-machine.conf, replacing
`docker daemon` with `dockerd`.
For example, change
[Service]
ExecStart=
ExecStart=/usr/bin/docker daemon -H tcp://0.0.0.0:2376 -H unix:///var/run/docker.sock --storage-driver aufs --tlsverify --tlscacert /etc/docker/ca.pem --tlscert /etc/docker/server.pem --tlskey /etc/docker/server-key.pem --label provider=generic
Environment=
to
[Service]
ExecStart=
ExecStart=/usr/bin/dockerd -H tcp://0.0.0.0:2376 -H unix:///var/run/docker.sock --storage-driver aufs --tlsverify --tlscacert /etc/docker/ca.pem --tlscert /etc/docker/server.pem --tlskey /etc/docker/server-key.pem --label provider=generic
Environment=
flush changes by executing:
sudo systemctl daemon-reload
restart docker:
sudo systemctl restart docker
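If several machines need the same edit, the replacement could be automated along these lines. This is a sketch under the assumption that the ExecStart line has the shape shown above; `migrate_execstart` is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: rewrite ".../docker daemon" to ".../dockerd" on ExecStart= lines
# of a systemd drop-in, keeping a .bak copy of the original.
# migrate_execstart is a hypothetical helper, not part of Docker or docker-machine.

migrate_execstart() {
    # $1: path to the drop-in, e.g. .../docker.service.d/10-machine.conf
    conf=$1
    tmp=$(mktemp) || return 1
    # Only lines starting with ExecStart= whose command path ends in /docker
    # and is followed by the word "daemon" are changed.
    sed 's#^\(ExecStart=[^ ]*/docker\) daemon #\1d #' "$conf" > "$tmp" || { rm -f "$tmp"; return 1; }
    # Keep the original for rollback.
    cp "$conf" "$conf.bak" || { rm -f "$tmp"; return 1; }
    # Overwrite in place so ownership and mode of the drop-in are preserved.
    cat "$tmp" > "$conf"
    status=$?
    rm -f "$tmp"
    return "$status"
}
```

The empty `ExecStart=` and `Environment=` lines are left untouched. The daemon-reload and restart steps are still needed afterwards.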
Well, I finally fixed it.
All you have to do is load a different storage driver; in my case I used overlay:
Disable Docker service: sudo systemctl stop docker.service
Start Docker Daemon (overlay driver): sudo docker daemon -s overlay
Run Demo container: sudo docker run hello-world
To make these changes permanent, edit the /etc/default/docker file and add the option:
DOCKER_OPTS="-s overlay"
The next time the Docker service is loaded, it will run docker daemon -s overlay
I've been able to get it working after a kernel upgrade by following the directions in this blog.
https://mymemorysucks.wordpress.com/2016/03/31/docker-graphdriver-and-aufs-failed-driver-not-supported-error-after-ubuntu-upgrade/
sudo apt-get update
sudo apt-get install linux-image-extra-$(uname -r) linux-image-extra-virtual
sudo modprobe aufs
sudo service docker restart
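Before restarting, it may help to confirm that the kernel actually exposes the filesystem the storage driver needs. A minimal sketch, assuming the usual /proc/filesystems layout (an optional `nodev` column followed by the filesystem name); `fs_registered` is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: check whether a filesystem (aufs, overlay, ...) is registered
# with the running kernel by looking it up in /proc/filesystems.
# fs_registered is a hypothetical helper name.

fs_registered() {
    # $1: filesystem name
    # $2: file to read (defaults to /proc/filesystems; overridable for testing)
    awk -v fs="$1" '$NF == fs { found = 1 } END { exit !found }' "${2:-/proc/filesystems}"
}
```

For example, `fs_registered aufs && echo "aufs available"` should only print after `sudo modprobe aufs` has succeeded (or if aufs is built into the kernel).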
After viewing some of the other answers it looks like the issue was that the service wasn't running with the -s overlay options.
I also happened to notice that docker tried to start up with ${DOCKER_OPTS} at the end of the call.
I was able to export DOCKER_OPTS="-s overlay" (because DOCKER_OPTS was empty by default) and get docker running.
I had a similar issue on a new Docker installation (version 19.03.3-rc1) on Ubuntu 18.04.3 LTS. By default, the /etc/docker/daemon.json file does not exist on a new installation. Following a tutorial, I changed the storage driver to devicemapper by creating a new daemon.json file. It worked, but then I deleted the daemon.json file, thinking that it would revert to the default. It did not, and the service would not start.
Creating the /etc/docker/daemon.json file again with the default storage driver fixed it for me.
{
"storage-driver": "overlay2"
}
Running sudo dockerd --debug will help you find the actual pain point. I fixed the same error this way on Ubuntu 20 LTS.
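Debug output is verbose, so a small filter can make the failing step easier to spot. The sketch below keeps only error and fatal lines in either of the two log shapes seen earlier in this thread (`ERRO[0000] ...` on a terminal, `level=error ...` in the journal); `only_failures` is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: keep only error/fatal lines from Docker daemon log output.
# only_failures is a hypothetical helper name; it reads from stdin.

only_failures() {
    grep -E '^(ERRO|FATA)\[|level=(error|fatal) '
}
```

One possible use is `sudo dockerd --debug 2>&1 | only_failures`, or `journalctl -u docker.service | only_failures` for a daemon started by systemd.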
In my case, I got this error:
Job for docker.service failed because the control process exited with error code. See "systemctl status docker.service" and "journalctl -xe" for details.
I finally found that the cause was a syntax error in /etc/docker/daemon.json, introduced when I added registry-mirrors:
{
"runtimes": {
"nvidia": {
"path": "/usr/bin/nvidia-container-runtime",
"runtimeArgs": []
}
}
# I forgot to add a comma here!
"registry-mirrors": ["https://docker.mirrors.ustc.edu.cn"]
}
After I added the comma and ran systemctl restart docker, the problem was solved.
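A syntax check before restarting would catch this kind of mistake. The sketch below assumes python3 is installed and uses its standard `json.tool` module purely as a JSON parser; `check_daemon_json` is a hypothetical helper name, not a Docker command.

```shell
#!/bin/sh
# Sketch: validate a daemon.json file before restarting Docker.
# check_daemon_json is a hypothetical helper; it relies on python3 being present.

check_daemon_json() {
    # $1: file to check (defaults to /etc/docker/daemon.json)
    file=${1:-/etc/docker/daemon.json}
    if ! command -v python3 >/dev/null 2>&1; then
        echo "python3 not found; cannot validate $file" >&2
        return 2
    fi
    # json.tool exits non-zero and reports the position of the first syntax error.
    if python3 -m json.tool "$file" >/dev/null; then
        echo "$file: valid JSON"
    else
        echo "$file: invalid JSON" >&2
        return 1
    fi
}
```

Something like `check_daemon_json && sudo systemctl restart docker` would then only restart the service when the file parses. Note that this checks JSON syntax only, not whether the keys are options the daemon accepts.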
In my case, I was getting the following error from the journalctl -xe command:
unable to configure the Docker daemon with file /etc/docker/daemon.json: invalid character 'â' looking for beginning of object key string
Just reset /etc/docker/daemon.json to an empty object:
{
}
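If you would rather keep the existing settings than empty the file, it may be enough to locate the offending character. One plausible cause (an assumption, not confirmed in this thread) is a typographic quote pasted from a web page. The sketch below lists lines containing bytes outside printable ASCII and whitespace; `find_non_ascii` is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: show lines of a file that contain non-ASCII bytes, with line numbers.
# find_non_ascii is a hypothetical helper name.

find_non_ascii() {
    # $1: file to scan (defaults to /etc/docker/daemon.json)
    # LC_ALL=C makes grep treat the file byte by byte; -a forces text mode.
    # Exit status is 0 if at least one such line was found, 1 otherwise.
    LC_ALL=C grep -an '[^[:print:][:space:]]' "${1:-/etc/docker/daemon.json}"
}
```

Each reported line can then be retyped with plain ASCII quotes. Legitimate non-ASCII text inside string values would also be listed, so the output needs a quick manual look.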
I had this issue today after an Ubuntu kernel upgrade and tried numerous solutions above. However, the only one that worked (Ubuntu 16.04.6 LTS) was to remove (or rename) the folder: /var/lib/docker
Please be aware, this will remove all your docker images, containers and volumes etc. So understand the implications before applying or take a backup!
There are more details here:
https://github.com/docker/for-linux/issues/162
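For the rename variant, a timestamped name keeps the old data around in case it is needed again. This is a minimal sketch; `set_aside_dir` is a hypothetical helper name, and it assumes the Docker service has already been stopped.

```shell
#!/bin/sh
# Sketch: rename a directory to <name>.bak-<timestamp> instead of deleting it.
# set_aside_dir is a hypothetical helper name.

set_aside_dir() {
    # $1: directory to set aside, e.g. /var/lib/docker
    src=${1%/}
    if [ ! -d "$src" ]; then
        echo "$src is not a directory" >&2
        return 1
    fi
    dest="$src.bak-$(date +%Y%m%d-%H%M%S)"
    # Print the new location on success so it can be restored or removed later.
    mv "$src" "$dest" && echo "$dest"
}
```

Run as root after `sudo systemctl stop docker`, `set_aside_dir /var/lib/docker` moves the data aside, and Docker should recreate an empty /var/lib/docker on its next start. Images, containers and volumes remain in the renamed directory but are no longer visible to Docker.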
