GitLab CI dind times out - gitlab

I have a GitLab installation running in Kubernetes, and suddenly my connections to dind have stopped working. The problem appears in a single project out of ~30; the others still work, and no changes have been made.
The builds give the following errors:
*** WARNING: Service runner-c542f8fe-project-3-concurrent-0-docker-0 probably didn't start properly.
Health check error:
service "runner-c542f8fe-project-3-concurrent-0-docker-0-wait-for-service" timeout
Health check container logs:
Service container logs:
2018-08-13T08:40:53.274661600Z mount: permission denied (are you root?)
2018-08-13T08:40:53.274713900Z Could not mount /sys/kernel/security.
2018-08-13T08:40:53.274730800Z AppArmor detection and --privileged mode might break.
2018-08-13T08:40:53.275949300Z mount: permission denied (are you root?)
*********
I am running the container privileged, as can be seen in my /etc/gitlab-runner/config.toml:
metrics_server = ":9252"
concurrent = 10
check_interval = 30
[[runners]]
name = "mothy-jackal-gitlab-runner-bb76cb464-7fq6z"
url = "[redacted]"
token = "[redacted]"
executor = "kubernetes"
[runners.cache]
[runners.kubernetes]
host = ""
image = "ubuntu:16.04"
namespace = "gitlab"
namespace_overwrite_allowed = ""
privileged = true
cpu_request = "100m"
memory_request = "128Mi"
service_cpu_request = "100m"
service_memory_request = "128Mi"
service_account_overwrite_allowed = ""
[runners.kubernetes.volumes]
The only other solution I've found that doesn't pertain to making sure that the runner is privileged is this one. I've tried setting the variables in my .gitlab-ci.yaml to this:
variables:
  DOCKER_HOST: "tcp://docker:2375"
  DOCKER_DRIVER: overlay
The error remains the same.
Worth noting is the output of the following commands, in accordance with the other post:
bash-4.3# find /lib/modules/`uname -r`/kernel/ -type f -name "overlay*"
find: /lib/modules/4.4.111-k8s/kernel/: No such file or directory
bash-4.3# lsmod | grep overlay
overlay 45056 12
Note the "No such file or directory" error.
I'm stumped, and with my builds failing in the registry stage, I can't make releases. Any pointers as to where to go?
Thanks.
EDIT
It's not a solution, but I noticed that this occurred because I had set a dedicated runner to this project. Once I removed that, it worked again. Not a fix, but important info to anyone having the same issue.
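Since removing the dedicated runner made the error go away, one thing worth comparing is whether every [[runners]] entry in the relevant config.toml really sets privileged = true, because "mount: permission denied" is what dind typically prints when it is not privileged. That this was the cause here is a guess, not something confirmed above. A minimal sketch, where list_runner_privileged is a hypothetical helper:

```shell
#!/usr/bin/env bash
# Hypothetical helper: print each [[runners]] entry from a GitLab Runner
# config.toml together with its "privileged" value ("unset" if absent).
list_runner_privileged() {
  awk '
    function emit() { if (seen) print name ": privileged=" priv }
    /^[ \t]*\[\[runners\]\]/ { emit(); seen = 1; name = "(unnamed)"; priv = "unset"; next }
    seen && /^[ \t]*name[ \t]*=/ {
      value = $0
      sub(/^[^=]*=[ \t]*"/, "", value); sub(/"[ \t]*$/, "", value)
      name = value
    }
    seen && /^[ \t]*privileged[ \t]*=/ {
      value = $0
      sub(/^[^=]*=[ \t]*/, "", value); sub(/[ \t]*$/, "", value)
      priv = value
    }
    END { emit() }
  ' "$1"
}
```

Usage: list_runner_privileged /etc/gitlab-runner/config.toml, run wherever the dedicated runner's configuration lives. A runner reported as "unset" or "false" would be the first place to look.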

Related

GitLab CE 15.7 / Docker registry with Self Signed Certificate.. not working

I want to use the Gitlab Docker registry. I am using GitLab CE 15.7
I created my own CA and signed a certificate. GitLab UI and GitLab runners are working fine!
When it comes to the Docker Registry I have some issues. I configured the gitlab.rb like this:
registry_external_url 'https://198.18.133.100:5000'
registry['enable'] = true
registry['username'] = "registry"
registry['group'] = "registry"
registry['registry_http_addr'] = "127.0.0.1:5000"
registry['debug_addr'] = "localhost:5001"
registry['env'] = {
'SSL_CERT_DIR' => "/etc/gitlab/ssl/"
}
registry['rootcertbundle'] = "/etc/gitlab/ssl/198.18.133.100.crt"
What also confuses me are the options for registry and registry_nginx.
I am not sure if I configured it correctly, and the documentation doesn't help me a lot. I didn't spin up any Docker container for the registry or anything; I believe the registry comes bundled with GitLab (if I am not mistaken). Port 5000 is reachable and I can telnet to it.
However, while pushing the image to the registry I get the following error:
WARNING! Using --password via the CLI is insecure. Use --password-stdin.
Error response from daemon: Get "https://198.18.133.100:5000/v2/": x509: certificate signed by unknown authority
Cleaning up project directory and file based variables
00:00
ERROR: Job failed: exit status 1
Any ideas? Thanks a lot!
I tried already quite a lot of different configs and reconfigured the gitlab server.
It was fixed by copying the CA certificate into the following directory:
mkdir -p /etc/docker/certs.d/<your_registry_host_name>:<your_registry_host_port>
together with the right config in gitlab.rb:
registry_nginx['enable'] = true
registry_nginx['listen_https'] = true
registry_nginx['redirect_http_to_https'] = true
registry_external_url 'https://registry.YOUR_DOMAIN.gtld'
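The certificate step above shows only the mkdir; the CA certificate then has to be copied into that directory. A minimal sketch, assuming the Docker daemon that runs docker login / docker push uses the default /etc/docker/certs.d location and that the file is named ca.crt, which is where Docker looks for a per-registry CA. The helper name install_registry_ca and the paths in the usage line are hypothetical:

```shell
#!/usr/bin/env bash
# Hypothetical helper: place a CA certificate where the Docker daemon looks
# for per-registry trust anchors (<base>/<host>:<port>/ca.crt).
install_registry_ca() {
  local registry="$1"                          # host:port of the registry
  local ca_file="$2"                           # your CA certificate (PEM)
  local base_dir="${3:-/etc/docker/certs.d}"   # Docker default location

  mkdir -p "$base_dir/$registry"
  cp "$ca_file" "$base_dir/$registry/ca.crt"
}
```

Usage: install_registry_ca 198.18.133.100:5000 /path/to/your-ca.crt, run as root on every host whose Docker daemon talks to the registry, for example the machine where the runner executes docker login.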
Thanks all for your help!

Job ends with error "WARNING: Uploading artifacts as "archive" to coordinator... failed"

I am trying to set up a GitLab CE server running in Docker, on my local Windows machine for the moment. While configuring GitLab CI, I am facing an issue when uploading the artifact at the end of the job:
WARNING: Uploading artifacts as "archive" to coordinator... failed id=245 responseStatus=500 Internal Server Error status=500 token=i3yfe7rf
Before showing more logs, this is my setup. I am using separate containers:
one for running GitLab
one for running the CI runners (gitlab-runner)
one for running a container registry
one recently added container to store artifacts on a local s3 server (minio)
This is the config.toml file for the only registered runner. Note that this version uses a local s3 server, but the same happens with local cache.
[[runners]]
name = "Docker Runner"
url = "http://192.168.1.18:6180/"
token = "JHubtvs8kFaQjJNC6r6Z"
executor = "docker"
clone_url = "http://192.168.1.18:6180/"
[runners.custom_build_dir]
[runners.cache]
Type = "s3"
Path = "mycustom-s3"
Shared = true
[runners.cache.s3]
ServerAddress = "192.168.1.18:9115"
AccessKey = "XXXXXX"
SecretKey = "XXXXXX"
BucketName = "runner"
Insecure = true
[runners.cache.gcs]
[runners.cache.azure]
[runners.docker]
tls_verify = false
image = "docker:19.03.1"
privileged = true
disable_entrypoint_overwrite = false
oom_kill_disable = false
disable_cache = false
volumes = ["/cache"]
shm_size = 0
This is my CI YAML file: I've taken this example from a youtube video. The same happens for all projects in GitLab.
image: "ruby:latest"
cache:
  paths:
    - output
before_script:
  - bundle install --path vendor # Install dependencies into ./vendor/ruby
build:
  stage: build
  tags:
    - docker,ruby
  artifacts:
    paths:
      - output/
    expire_in: 5 days
  script:
    - echo "In the build stage"
    - mkdir -p output
    - echo date > output/$(date +%s).txt
    - ls -l output
    - ls -l vendor
Running the job ends with the above mentioned error.
More errors can be seen in the log files:
In exceptions_json.log:
{"severity":"ERROR","time":"2020-12-16T11:24:11.865Z","correlation_id":"ZxQ4vVdD1J1","tags.correlation_id":"ZxQ4vVdD1J1","tags.locale":"en","exception.class":"Errno::ENOENT","exception.message":"No such file or directory # apply2files - /var/opt/gitlab/gitlab-rails/shared/artifacts/tmp/work/1608117851-2655-0006-1409/artifacts.zip"...
In Production.log
Started POST "/api/v4/jobs/245/artifacts/authorize?artifact_format=zip&artifact_type=archive&expire_in=5+days" for 172.17.0.1 at 2020-12-16 11:24:07 +0000
Started POST "/api/v4/jobs/245/artifacts?artifact_format=zip&artifact_type=archive&expire_in=5+days" for 172.17.0.1 at 2020-12-16 11:24:07 +0000
Processing by Gitlab::RequestForgeryProtection::Controller#index as HTML
Parameters: {"file.remote_url"=>"", "file.size"=>"389", "file.sha1"=>"da6c0be0e7a3a4791035bc9f851439dcb0e94135", "file.sha256"=>"6539358258571174fb3bed6ab68db78705efdd9ed4b7c423bab0b19eb9aea531", "file.path"=>"/var/opt/gitlab/gitlab-rails/shared/artifacts/tmp/uploads/artifacts.zip609500792", "file.remote_id"=>"", "file.name"=>"artifacts.zip", "file.md5"=>"d432c9507b8879dfad13342c6b60f73b", "file.sha512"=>"5ea4e5b6bcbbffb2d3f81e8c05ede92b630b6033ea3f09dc61a4a4bbc7919088cf4a1eab46cd54e9e994b35908065412779e77caf2612341fed3c36449947bdd", "file.gitlab-workhorse-upload"=>"...", "metadata.name"=>"metadata.gz", "metadata.path"=>"/var/opt/gitlab/gitlab-rails/shared/artifacts/tmp/uploads/metadata.gz123385207", "metadata.remote_url"=>"", "metadata.sha256"=>"93d549eb28b503108a4e9da0cb08cac02cd70041aedcbef418aa5c969d1a0d1e", "metadata.size"=>"175", "metadata.remote_id"=>"", "metadata.sha512"=>"3c7ff2a2a992695c2082c37340be7caa2955e9ba4ff50015c787f790146da1ac7f6884685797db1bc59eb8045bab1fac2fc1300114542059cddcec2593ea5934", "metadata.md5"=>"c7b52bc3b9b2d7dbf780aa919917b562", "metadata.sha1"=>"c71ab07f5bdf21d8d3b5a6507a0747167d4a80de", "metadata.gitlab-workhorse-upload"=>"...", "file"=>#<UploadedFile:0x00007fb805291cf0 #tempfile=#File:/var/opt/gitlab/gitlab-rails/shared/artifacts/tmp/uploads/artifacts.zip609500792, #size=389, #content_type="application/octet-stream", #original_filename="artifacts.zip", #sha256="6539358258571174fb3bed6ab68db78705efdd9ed4b7c423bab0b19eb9aea531", #remote_id="">, "artifact_format"=>"zip", "artifact_type"=>"archive", "expire_in"=>"5 days", "metadata"=>#<UploadedFile:0x00007fb804cdcfa8 #tempfile=#File:/var/opt/gitlab/gitlab-rails/shared/artifacts/tmp/uploads/metadata.gz123385207, #size=175, #content_type="application/octet-stream", #original_filename="metadata.gz", #sha256="93d549eb28b503108a4e9da0cb08cac02cd70041aedcbef418aa5c969d1a0d1e", #remote_id="">}
Can't verify CSRF token authenticity.
This CSRF token verification failure is handled internally by GitLab::RequestForgeryProtection
Unlike the logs may suggest, this does not result in an actual 422 response to the user
For API requests, the only effect is that current_user will be nil for the duration of the request
Completed 422 Unprocessable Entity in 8ms (ActiveRecord: 0.0ms | Elasticsearch: 0.0ms | Allocations: 241)
Errno::ENOENT (No such file or directory # apply2files - /var/opt/gitlab/gitlab-rails/shared/artifacts/tmp/work/1608117848-2659-0005-4872/artifacts.zip):
/opt/gitlab/embedded/lib/ruby/gems/2.7.0/gems/carrierwave-1.3.1/lib/carrierwave/sanitized_file.rb:320:in `chmod'...
I've spent the last 3 days searching for the root cause of this, and despite having read many articles (here or on the GitLab support site), I can't get this resolved.
The error suggests this is an issue with file /var/opt/gitlab/gitlab-rails/shared/artifacts/tmp/work/1608117848-2659-0005-4872/artifacts.zip.
Definitely, directory /var/opt/gitlab/gitlab-rails/shared/artifacts/tmp/work/ exists.
But sub-directory 1608117848-2659-0005-4872 doesn't.
I had the same problem this morning and finally solved it for me.
I was using bind-mounts for the data/config/log volumes in the gitlab container, which apparently cause a problem when uploading the artifacts.
I now switched to using docker volumes and now artifact upload works.
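For reference, the difference is only in the left-hand side of each --volume argument: a host path makes a bind mount, a bare name makes a Docker-managed named volume. A minimal sketch that builds the three volume arguments for the container paths used by the official GitLab image (/etc/gitlab, /var/log/gitlab, /var/opt/gitlab); the volume names and the helper name are assumptions:

```shell
#!/usr/bin/env bash
# Hypothetical helper: print --volume arguments that use named Docker volumes
# (e.g. gitlab_data) instead of host paths (bind mounts).
gitlab_named_volume_args() {
  local prefix="${1:-gitlab}"   # volume name prefix, an assumption
  printf -- '--volume %s_config:/etc/gitlab ' "$prefix"
  printf -- '--volume %s_logs:/var/log/gitlab ' "$prefix"
  printf -- '--volume %s_data:/var/opt/gitlab\n' "$prefix"
}

# Example (not run here); word splitting of the unquoted substitution is intended:
# docker run --detach --name gitlab $(gitlab_named_volume_args gitlab) gitlab/gitlab-ce:latest
```

Note that data in the old bind-mounted directories is not copied into the new volumes automatically.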

Gitlab CI/CD caching

I wanted to try out caching on my GitLab project following the documentation here - https://docs.gitlab.com/ee/ci/caching/#how-archiving-and-extracting-works. I have a project-specific runner and am using the docker executor, but I get the error:
cat: vendor/hello.txt: No such file or directory
How would I go about troubleshooting this problem? I set disable_cache = false in my runner config, but that did not help.
EDIT: using private gitlab instance 12.3.
I achieved this using distributed caching, which I found easy. First of all, you need an S3 bucket or S3-compatible storage like MinIO. You can set up MinIO locally, on the machine where the GitLab runner lives, with the following command:
docker run -it --restart always -p 9005:9000 \
-v /.minio:/root/.minio -v /export:/export \
--name minio \
minio/minio:latest server /export
Check the IP address of the server:
hostname --ip-address
Your cache server will be available at MY_CACHE_IP:9005
Create a bucket that will be used by the Runner:
sudo mkdir /export/runner
runner is the name of the bucket in that case. If you choose a different bucket, then it will be different. All caches will be stored in the /export directory.
Read the Access and Secret Key of MinIO and use it to configure the Runner:
sudo cat /export/.minio.sys/config/config.json | grep Key
The next step is to configure your runner to use the cache. The following is a sample config.toml:
[[runners]]
limit = 10
executor = "docker+machine"
[runners.cache]
Type = "s3"
Path = "path/to/prefix"
Shared = false
[runners.cache.s3]
ServerAddress = "s3.example.com"
AccessKey = "access-key"
SecretKey = "secret-key"
BucketName = "runner"
Insecure = false
I hope this answer will help you
Reference:
https://docs.gitlab.com/runner/install/registry_and_cache_servers.html
https://docs.gitlab.com/runner/configuration/autoscale.html#distributed-runners-caching
I managed to solve the issue thanks to this post https://gitlab.com/gitlab-org/gitlab-runner/-/issues/336#note_263931046.
Basically, I added
variables:
  GIT_CLEAN_FLAGS: none
and it worked.
@Bilal's answer is definitely correct, but I was looking for a slightly different solution.
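A plausible reason GIT_CLEAN_FLAGS: none helps: by default the runner runs git clean with -ffdx after checking out the sources, which deletes untracked files such as vendor/ left in a reused build directory. Whether that was the exact cause here is an assumption. The sketch below only demonstrates the git clean behaviour in a throwaway repository; demo_git_clean is a hypothetical name:

```shell
#!/usr/bin/env bash
# Hypothetical demo: create an untracked vendor/hello.txt in a scratch
# repository, run `git clean -ffdx`, and report whether the file survived.
demo_git_clean() {
  local repo
  repo="$(mktemp -d)"
  git -C "$repo" init -q 2>/dev/null
  mkdir -p "$repo/vendor"
  echo "hello" > "$repo/vendor/hello.txt"   # untracked, like leftover job output
  git -C "$repo" clean -ffdx -q
  if [ -e "$repo/vendor/hello.txt" ]; then
    echo "kept"
  else
    echo "removed"
  fi
  rm -rf "$repo"
}
```

With GIT_CLEAN_FLAGS set to none, the runner skips this cleanup, so files from a previous job on the same runner stay in place. That is a different mechanism from the cache itself, which may be why it counts as a "slightly different solution".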

Deploying Docker Image from Azure Container Registry to Web App Container "failed to register layer: Error processing tar file(exit status 1)"

I've tried latest Ubuntu and Python - they successfully extract and mount. However, as soon as I install further dependencies and add my app and push the image to Azure ACR - this error happens.
What is happening on my local machine? I have the Ubuntu image running for example, I install pip3 for example, and "docker commit" the changes locally, then tag the image and push it to ACR. This image will then fail to load with the above error. I can see that the segments in the previous image are already in the registry and only the latest image segment is actually pushed. So the error appears to occur with the latest change to the image.
The full error message is:-
2020-06-25T03:14:43.517Z ERROR - failed to register layer: Error processing tar file(exit status 1): Container ID 197609 cannot be mapped to a host IDErr: 0, Message: failed to register layer: Error processing tar file(exit status 1): Container ID 197609 cannot be mapped to a host ID
2020-06-25T03:14:43.589Z INFO - Pull Image failed, Time taken: 0 Minutes and 46 Seconds
2020-06-25T03:14:43.590Z ERROR - Pulling docker image *******.azurecr.io/seistech-1:v1.0.0.15 failed:
2020-06-25T03:14:43.590Z INFO - Pulling image from Docker hub: .azurecr.io/seistech-1:v1.0.0.15
2020-06-25T03:14:43.987Z ERROR - DockerApiException: Docker API responded with status code=InternalServerError, response={"message":"Get https://.azurecr.io/v2/*******-1/manifests/v1.0.0.15: unauthorized: authentication required, visit https://aka.ms/acr/authorization for more information."}
2020-06-25T03:14:44.020Z ERROR - Image pull failed: Verify docker image configuration and credentials (if using private repository)
2020-06-25T03:14:46.491Z INFO - Stopping site *******-dev-container because it failed during startup.
Note re' authentication message(s) - I have created system assigned identity in the web app and assigned image pull permissions in the ACR - so as far as I can tell, there should be no auth issue.
Suggestions appreciated - very little diagnostic info to work with.
Thanks
Andy, NZ
In my case, it was related to npm v9, and it only happened when using Azure App Service.
It may install modules into your node_modules directory with a high ID for the file owner/creator.
My Solution:
As a quick workaround, I reverted to an older version of npm (8.x). That works for me for the moment.
Found this article to be helpful
NPM-specific issues causing users remapping exceptions
The managed identity cannot be used to authenticate when deploying the Docker image from ACR. You must set environment variables such as:
DOCKER_REGISTRY_SERVER_USERNAME - The username for the ACR server.
DOCKER_REGISTRY_SERVER_URL - The full URL to the ACR server. (For example, https://my-server.azurecr.io.)
DOCKER_REGISTRY_SERVER_PASSWORD - The password for the ACR server.
This is the only way to pull the image from a private registry, and so also from ACR. The managed identity only works while the Web App is running, but the image is pulled before that.
For some reason it took me several days to find this documentation page
The key piece of info in the container error log is this:
...container id xxxxxx cannot be mapped...
And the shell script to find these items is this:
$ find / \( -uid 1000000 \) -ls 2>/dev/null
(where 1000000 is the high ID from the error message)
The solution then was simple, in the final step of my docker-compose Dockerfile I added:
&& chown -R root:root /home
which is where the find command showed all the problematic files were - they were created by a tar -x operation. I haven't dug further into whether this caused the permissions problem.
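The two steps above (read the ID from the error, then search for files owned by it) can be sketched as follows. Both function names are hypothetical, and checking the group ID as well as the user ID is an addition to the original find command, on the assumption that either kind of ID can trigger the mapping error:

```shell
#!/usr/bin/env bash
# Hypothetical helper: read log text on stdin and print the first ID that is
# reported as "cannot be mapped".
extract_unmappable_id() {
  sed -n 's/.*Container ID \([0-9][0-9]*\) cannot be mapped.*/\1/p' | head -n 1
}

# Hypothetical helper: list files under a directory whose owner or group is
# the given numeric ID. -xdev keeps the search on one filesystem, so /proc
# and mounted volumes are skipped. Run it inside the image.
find_files_with_id() {
  local id="$1" root="${2:-/}"
  find "$root" -xdev \( -uid "$id" -o -gid "$id" \) -print 2>/dev/null
}
```

Usage: save the container log to a file, then run id="$(extract_unmappable_id < container.log)" followed by find_files_with_id "$id" inside the image (container.log is a placeholder name). Whatever directory the results point to is the one to chown in the Dockerfile.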

GitLab-Runner "listen_address not defined" error

I'm running a Laravel API on my server, and I wanted to use GitLab Runner for CD. The first two runs were good, but then I started to see this problem: listen_address not defined, session endpoints disabled builds=0
I'm running a Linux server on shared web hosting, so I can access a terminal and get some privileges, but I can't do things that need sudo, like installing a service. That's why I've been running gitlab-runner in user mode.
Error info
Configuration loaded builds=0
listen_address not defined, metrics & debug endpoints disabled builds=0
[session_server].listen_address not defined, session endpoints disabled builds=0
.gitlab-runner/config.toml
concurrent = 1
check_interval = 0
[session_server]
session_timeout = 1800
[[runners]]
name = "CD API REST Sistema SIGO"
url = "https://gitlab.com/"
token = "blablabla"
executor = "shell"
listen_address="my.server.ip.address:8043"
[runners.custom_build_dir]
[runners.cache]
[runners.cache.s3]
[runners.cache.gcs]
I literally wasted 2 days on this. I followed the steps below to get the runners configured and executing jobs successfully.
I am using Mac OS X 10.13 and GitLab 12. However, people on other operating systems can check this out too.
I stopped the runners and uninstalled them, then deleted all references and files related to the GitLab runner, including the executable itself.
I found the GitLab Runner executable paths at https://docs.gitlab.com/runner/configuration/advanced-configuration.html
I installed the runner again following the official GitLab documentation.
The runners then showed as online in the GitLab portal. However, the jobs were not getting executed; they simply showed as stuck. I tried to get information from the logs using
gitlab-runner -debug run
That is how I came across listen_address not defined. After a lot of trying, I found that simply enabling Run untagged jobs did the trick. The jobs started and completed successfully. I still see listen_address not defined in the debug output, so that message had misled me.
Although it seems that the last step alone solved my problem, it was doing all the steps together that did the trick.
Conversely, an alternative to Avinash's solution is to include the tags you created when registering the runner in the .gitlab-ci.yml file:
stages:
  - testing
testing:
  stage: testing
  script:
    - echo 'Hello world'
  tags:
    - my-tags
