GitLab: There has been a runner system failure

I use GitLab for Continuous Integration and Deployment, and all of a sudden I get this error message: "There has been a runner system failure, please try again".
There's no real error message or error code.
I've tried restarting the GitLab runner using gitlab-runner restart, and I've rebooted the server it's running on, but I keep getting this error message on GitLab whenever I push a code change.

After a couple of hours, I realized that the server GitLab Runner is running on had no disk space left.
I logged into the server in question and looked at the GitLab Runner logs using the following command:
journalctl -u gitlab-runner
And it showed me the following logs:
May 21 08:20:41 gitlab-runner[18936]: Checking for jobs... received job=178911 repo_url=https://.......git runner=f540b942
May 21 08:20:41 gitlab-runner-01 gitlab-runner[18936]: WARNING: Failed to process runner builds=0 error=open /tmp/trace543210445: no space left on device executor=docker runner=f540b942
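As a rough sketch, the relevant line can be picked out of a long journal by filtering for the error text shown in the log above. The function name find_disk_full_errors is made up for illustration; it only wraps grep.

```shell
# Hypothetical helper (the name is an assumption): keep only log lines that
# report a full disk, reading the log from standard input.
find_disk_full_errors() {
  grep -F 'no space left on device'
}

# Example usage on the runner host (not executed here):
#   journalctl -u gitlab-runner --no-pager | find_disk_full_errors
```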
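To fix this issue I ran docker container prune, which removes all stopped containers.
Alternatively, you could use docker system prune, which also removes unused networks, dangling images and build cache (add -a to remove all unused images and --volumes to include volumes).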
See https://linuxize.com/post/how-to-remove-docker-images-containers-volumes-and-networks/ for more information about those docker commands.
Afterwards, I no longer got the error on Gitlab when pushing changes.
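Before pruning, it may help to confirm that the disk really is full. The sketch below reads the usage of the filesystem that holds a given path from df. The helper name disk_usage_percent is hypothetical, and the awk parsing assumes the POSIX output format of df -P with no spaces in the device name.

```shell
# Hypothetical helper: print the used percentage (without the % sign)
# of the filesystem that contains the given path.
disk_usage_percent() {
  # df -P prints one header line and one data line; column 5 is "Capacity".
  df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Example usage on the runner host (not executed here):
#   disk_usage_percent /tmp      # the runner's trace file lived in /tmp
#   docker system df             # how much space Docker itself is using
#   docker container prune       # then clean up as described above
```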

Related

Azure DevOps Artifact Agent log "Error: UPGRADE FAILED: timed out waiting for the condition"

So the CD part was working perfectly fine for a whole year, now without any changes it started giving this error. Any help would be appreciated.
This is the command that gets executed:
The issue here had nothing to do with Kubernetes or the AKS cluster (that was my first thought). I tried deploying manually using the same commands I had set up in the CD pipeline, and figured out that my Docker image was broken and didn't match the latest Helm release.
I solved it by replicating the issue manually (both CI and CD) and inspecting the pipeline in debug mode.

gitlab: Runner has never contacted this instance

I added a new virtualbox runner to my gitlab self hosted solution and I'm getting this warning on it:
Runner has never contacted this instance
and it never runs any jobs.
Restarting the runner should help; if it doesn't, re-register the runner.
Also, you should check the status of the runner with the command below.
gitlab-runner status
If you are running the runner on Windows Server, go to the directory where you stored the .exe file and run the command below:
.\<.exe> status
If the runner is in a stopped state, start it using the same command, replacing status with start.
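A minimal sketch of that check-then-start flow follows. It assumes the status command prints a line containing "Service is running" when the runner is up; the exact wording may differ between versions, so treat both the matched pattern and the helper name runner_is_running as assumptions.

```shell
# Hypothetical helper: succeed if the status text on standard input says
# the service is running. The matched wording is an assumption.
runner_is_running() {
  grep -qi 'service is running'
}

# Example usage (not executed here):
#   gitlab-runner status 2>&1 | runner_is_running || sudo gitlab-runner start
#   gitlab-runner verify     # checks that registered runners can reach GitLab
```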

Error response from daemon: join session keyring: create session key: disk quota exceeded

I tried installing docker on a server of mine using this tutorial.
I want to run docker images remotely and use the portainer web-interface to administrate everything.
However, when I get to the point where I need to test my installation and I enter the command $ sudo docker run hello-world, I only get the following error:
docker: Error response from daemon: OCI runtime create failed: container_linux.go:344: starting container process caused "process_linux.go:424: container init caused \"join session keyring: create session key: disk quota exceeded\"": unknown. ERRO[0000] error waiting for container: context canceled
I tried the following methods:
"Install Docker CE / Install using the convenience script"
"Install Docker CE / Install using the repository"
This also happens when I try to run other images (eg. portainer).
I hope this is enough information.
I am new to docker, so I don't know how I should debug it efficiently.
Try increasing the maxkeys kernel parameter:
echo 50000 > /proc/sys/kernel/keys/maxkeys
see: https://discuss.linuxcontainers.org/t/error-with-docker-inside-lxc-container/922/2
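To see whether the key quota is actually the limit being hit, /proc/key-users lists per-user usage next to the maximum. The sketch below flags any user whose key count or key bytes have reached the limit. The name key_quota_exhausted is hypothetical, and the field layout is assumed to be the one documented for /proc/key-users (uid, usage, nkeys/nikeys, qnkeys/maxkeys, qnbytes/maxbytes).

```shell
# Hypothetical helper: succeed if any line of /proc/key-users style input
# shows a user at or above its key-count or key-bytes quota.
key_quota_exhausted() {
  awk '{
    split($4, keys, "/"); split($5, bytes, "/")
    if (keys[1] + 0 >= keys[2] + 0 || bytes[1] + 0 >= bytes[2] + 0) found = 1
  } END { exit (found ? 0 : 1) }'
}

# Example usage (not executed here):
#   key_quota_exhausted < /proc/key-users && echo "key quota reached"
#   sudo sysctl -w kernel.keys.maxkeys=50000   # same setting as the echo above
```

Note that prefixing the echo command with sudo does not work from a non-root shell, because the redirection is performed by the calling shell; sysctl -w avoids that.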
So, as it turns out, I connected to the wrong vServer.
The one I was connected to is using LXD (as you might have seen in my previous comment), which doesn't support Docker (at least not the way this guide advises).
When I ran the same setup on a vServer using a bare-metal (type 1) hypervisor, it worked without a problem.
I think this has to do with automatic storage allocation under LXD, but this is just a guess.

Gitlab on OpenShift Origin gets stuck on "Symlinking existing certificates found in /etc/gitlab/trusted-certs"

I'm running the OpenShift Origin all-in-one, and the various project templates seem to work fine, except for GitLab. When restarting the machine, or trying to restart the gitlab-ce pod, it either takes a really long time (just under 10 minutes) or fails due to a timeout. It always seems to get stuck at the same place.
How can I troubleshoot this deployment?
Thank you for using GitLab Docker Image!
Current version: gitlab-ce=8.14.1-ce.1
Configure GitLab for your system by editing /etc/gitlab/gitlab.rb file
And restart this container to reload settings.
To do it use docker exec:
docker exec -it gitlab vim /etc/gitlab/gitlab.rb
docker restart gitlab
For a comprehensive list of configuration options please see the Omnibus GitLab readme
https://gitlab.com/gitlab-org/omnibus-gitlab/blob/master/README.md
If this container fails to start due to permission problems try to fix it by executing:
docker exec -it gitlab update-permissions
docker restart gitlab
Preparing services...
Starting services...
Configuring GitLab package...
/opt/gitlab/embedded/bin/runsvdir-start: line 24: ulimit: pending signals: cannot modify limit: Operation not permitted
/opt/gitlab/embedded/bin/runsvdir-start: line 34: ulimit: max user processes: cannot modify limit: Operation not permitted
/opt/gitlab/embedded/bin/runsvdir-start: line 37: /proc/sys/fs/file-max: Read-only file system
Configuring GitLab...
* Moving existing certificates found in /opt/gitlab/embedded/ssl/certs
* Symlinking existing certificates found in /etc/gitlab/trusted-certs
Your process hangs on the wrapper line of the Dockerfile (https://hub.docker.com/r/gitlab/gitlab-ce/~/dockerfile/).
You can run it manually by executing
docker run -it gitlab/gitlab-ce 'bash'
and then running wrapper inside the container.
If you manage to see
Starting Chef...
then everything after that should be OK.
For me, the root cause turned out to be a lack of memory. I was running an AWS EC2 t1.micro (1GB). To fix it, I stopped the EC2 instance and upgraded to a t2.small (2GB). I started the instance again, ran free to check that the memory was available, and then ran the docker run -ti gitlab/gitlab-ce command again.
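As an alternative to reading the output of free by eye, total memory can be pulled out of /proc/meminfo. This is only a sketch: the helper name mem_total_mb is hypothetical, and it assumes a Linux host where MemTotal is reported in kB.

```shell
# Hypothetical helper: print total RAM in MiB, read from a meminfo-style
# file (defaults to /proc/meminfo, where MemTotal is given in kB).
mem_total_mb() {
  awk '/^MemTotal:/ { print int($2 / 1024) }' "${1:-/proc/meminfo}"
}

# Example usage (not executed here):
#   mem_total_mb      # compare against the instance size you expect
```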
Also, the following has saved me a couple of times by clearing up a corrupted Docker state:
sudo service docker stop
sudo rm -rf /var/run/docker
sudo rm /var/run/docker.*
sudo service docker start

Jenkins CI - SSL CA error

NOTE: This is .NET Core on Linux (Ubuntu)
I am setting up some CI infrastructure for .NET Core code and am running into a strange issue. Specifically, when it comes to package restore (dotnet restore).
The Jenkins instance that I am running is hosted on Azure via a Bitnami Jenkins image. In my build, I have a number of build steps. One of them (the one that actually triggers dotnet restore) is an Execute Shell.
When I run whoami from it, I get that I am the user tomcat. When dotnet restore is triggered from within the build step, however, I get the following error:
06:22:37 log : Retrying 'FindPackagesByIdAsync' for source 'https://api.nuget.org/v3-flatcontainer/system.runtime.serialization.primitives/index.json'.
06:22:37 log : An error occurred while sending the request.
06:22:37 log : Problem with the SSL CA cert (path? access rights?)
That same issue does not happen if I SSH into the box and do a sudo su - tomcat: running dotnet restore on the same folder from an SSH session works, but it doesn't work in the Execute Shell step (despite the fact that both run in the same user context).
What am I missing that might be causing this?
