GitLab/GitLab-CI Omnibus package: configure Sidekiq concurrency

My GitLab install has far more Sidekiq processes running than I need; both GitLab and GitLab-CI were running a ton of them. It runs on a DigitalOcean droplet (1 GB RAM, 20 GB SSD disk, Ubuntu 14.04 x64), it was regularly telling me I needed to restart the server, and when I checked htop I had 17-30 Sidekiq processes running gitlab-rails [0 of 25 busy].
There is no clear documentation on how to change the number of sidekiq processes, or the concurrency, for the Omnibus install of GitLab/GitLab-CI.
What is the best way to adjust this and have it persist through upgrades?

I still have a problem with the number of processes slowly growing over time, but the best solution I have come up with so far for limiting the concurrency setting is to alter these two files:
/opt/gitlab/embedded/service/gitlab-rails/config/initializers/4_sidekiq.rb
/opt/gitlab/embedded/service/gitlab-ci/config/initializers/3_sidekiq.rb
By adding config.options[:concurrency] = 2 inside Sidekiq.configure_server do |config|
So, for example, my final 4_sidekiq.rb file looks like this:
# Custom Redis configuration
config_file = Rails.root.join('config', 'resque.yml')

resque_url = if File.exists?(config_file)
               YAML.load_file(config_file)[Rails.env]
             else
               "redis://localhost:6379"
             end

Sidekiq.configure_server do |config|
  config.options[:concurrency] = 2

  config.redis = {
    url: resque_url,
    namespace: 'resque:gitlab'
  }

  config.server_middleware do |chain|
    chain.add Gitlab::SidekiqMiddleware::ArgumentsLogger if ENV['SIDEKIQ_LOG_ARGUMENTS']
    chain.add Gitlab::SidekiqMiddleware::MemoryKiller if ENV['SIDEKIQ_MEMORY_KILLER_MAX_RSS']
  end
end

Sidekiq.configure_client do |config|
  config.redis = {
    url: resque_url,
    namespace: 'resque:gitlab'
  }
end

At least for the GitLab Omnibus package, this can be done easily in /etc/gitlab/gitlab.rb:
##################
# GitLab Sidekiq #
##################
# sidekiq['log_directory'] = "/var/log/gitlab/sidekiq"
# sidekiq['shutdown_timeout'] = 4
# sidekiq['concurrency'] = 25
sidekiq['concurrency'] = 5
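On a standard Omnibus install the change should only take effect after reconfiguring, which restarts Sidekiq with the new setting:
sudo gitlab-ctl reconfigure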
So now it says "[0 of 5 busy]"

Check out the hardware requirements for GitLab. Killing Sidekiq processes is a no-go; GitLab depends on them to perform a lot of async actions.
1 GB of memory is not enough!

Related

Packer failed when executed on Gitlab-runner

I have a Packer file to deploy CentOS 7 using the vsphere-iso builder. It works fine when executed directly on a Linux server, but when I try to run the same Packer file using a gitlab-runner it fails because it does not wait until the OS is installed. It fails after waiting for 1 minute, but if I run the packer command with -on-error=run-cleanup-provisioner the OS install finishes successfully, so clearly the issue is that Packer is just not waiting.
2021/07/20 12:02:40 packer.io plugin: [INFO] Waiting for IP, up to total timeout: 30m0s, settle timeout: 5m0s
==> vsphere-iso.autogenerated_1: Waiting for IP...
==> vsphere-iso.autogenerated_1: Clear boot order...
==> vsphere-iso.autogenerated_1: Power off VM...
==> vsphere-iso.autogenerated_1: Destroying VM...
2021/07/20 12:03:12 [INFO] (telemetry) ending
==> Wait completed after 1 minute 2 seconds
2021/07/20 12:03:12 machine readable: error-count []string{"1"}
==> Some builds didn't complete successfully and had errors:
My boot command is the following as I do not use DHCP.
boot_command = ["<up><tab> text inst.ks=http://{{ .HTTPIP }}:{{ .HTTPPort }}/vmware-ks.cfg ip=10.118.12.117::10.118.12.1:255.255.255.0:{{ .Name }}.localhost:ens192:none<enter><wait>"]
I have tested using options like ssh_host, ip_wait_address, ip_settle_timeout, ssh_wait_timeout, pause_before_connecting but nothing seems to work.
As I said, the same Packer pkr.hcl file works fine if I run it manually on a regular Linux box, but not on my gitlab-runner, which is installed directly on my GitLab server. (Yes, I know that is not best practice, but I only use this runner for this task.)
Packer versions 1.7.2 and 1.7.3 tested, gitlab-runner 14.0.0 and 14.0.1 tested.
I managed to make it work by changing the last <wait> in my boot command to <wait5m>. This gives the OS enough time to be installed and the VM rebooted.
New boot command boot_command = ["<up><tab> text inst.ks=http://{{ .HTTPIP }}:{{ .HTTPPort }}/vmware-ks.cfg ip=10.118.12.117::10.118.12.1:255.255.255.0:{{ .Name }}.localhost:ens192:none<enter><wait5m>"]
All the other wait options from packer are no longer needed with this boot command.
Doing some tests I also managed to make it work by creating a firewall drop rule for the VM just after the kickstart file was loaded and removing the rule once the OS was installed. Definitely, Packer is just ignoring all of its native wait mechanisms when running on the gitlab-runner.
EDIT: After having the same issue with my Windows templates, I tested using a different gitlab-runner installed on a different server instead of the one on the GitLab server itself, and it worked perfectly with my initial configuration for both Windows and CentOS.

How do I view my sidekiq console output locally using the default queue?

I'm using Rails 5. I would like to create a sidekiq process running locally using the default queue. My worker class looks roughly like the below ...
module Accounting::Workers
  class FileGenerationWorker
    include Sidekiq::Worker

    def perform
      print "starting work ...\n"
      ...
I have set up my config/sidekiq.yml file like so, in hopes of running the worker daily at a specific time (11:05 am) ...
:concurrency: 20
:queues:
  - default
  ...
:schedule:
  Accounting::Workers::FileGenerationWorker:
    cron: "0 5 11 * *"
    queue: default
However, when I start my rails server ("rails s"), I don't see my print statement output to the console or any of the work performed, which tells me my worker isn't running. What else am I missing in order to get this worker scheduled properly locally?
Run the workers with
bundle exec sidekiq
You may need to provide the path to the worker module. For example,
bundle exec sidekiq -r ./worker.rb
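If your settings live in config/sidekiq.yml as in the question, you can also point the CLI at that file explicitly with Sidekiq's standard -C flag (this combined command is an illustration, not part of the original answer):
bundle exec sidekiq -C config/sidekiq.yml -r ./worker.rb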
Sidekiq by itself doesn't support a :schedule: map entry in the Sidekiq configuration file.
Periodic job functionality is provided in extensions such as sidekiq-scheduler.
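Assuming sidekiq-scheduler is the extension you pick, it needs to be in your Gemfile and installed before it can be required (this line is an assumption about your project, not part of the original answer):
gem 'sidekiq-scheduler'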
You need to use the Sidekiq module as extended by sidekiq-scheduler, which means requiring 'sidekiq-scheduler' before configuring Sidekiq. For example,
./worker.rb
require 'sidekiq-scheduler'
require './app/workers/accounting'

Sidekiq.configure_client do |config|
  config.redis = { db: 1 }
end

Sidekiq.configure_server do |config|
  config.redis = { db: 1 }
end
./app/workers/accounting.rb
module Accounting
  # ...
end

module Accounting::Workers
  class FileGenerationWorker
    include Sidekiq::Worker

    def perform
      puts "starting work ...\n"
    end
  end
end

GitLab-Runner "listen_address not defined" error

I'm running a Laravel API on my server, and I wanted to use gitlab-runner for CD. The first two runs were good, but then I started to see this problem: listen_address not defined, session endpoints disabled builds=0
I'm running a Linux server on shared web hosting, so I can access a terminal and have some privileges, but I can't do sudo things like installing a service. That's why I've been running gitlab-runner in user mode.
Error info
Configuration loaded builds=0
listen_address not defined, metrics & debug endpoints disabled builds=0
[session_server].listen_address not defined, session endpoints disabled builds=0
.gitlab-runner/config.toml
concurrent = 1
check_interval = 0

[session_server]
  session_timeout = 1800

[[runners]]
  name = "CD API REST Sistema SIGO"
  url = "https://gitlab.com/"
  token = "blablabla"
  executor = "shell"
  listen_address = "my.server.ip.address:8043"
  [runners.custom_build_dir]
  [runners.cache]
    [runners.cache.s3]
    [runners.cache.gcs]
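Note that in gitlab-runner's config.toml the "[session_server].listen_address" message refers to a listen_address key under the [session_server] section rather than under [[runners]]; a sketch of that layout, reusing the address from the config above, would be:
[session_server]
  session_timeout = 1800
  listen_address = "my.server.ip.address:8043"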
I literally wasted 2 days on this subject. I followed the steps below to get the runners configured and executing jobs successfully.
I am using Mac OS X 10.13 and GitLab 12. However, people on other OSes can also check this out.
I stopped the runners and uninstalled them, then deleted all references and files related to gitlab-runner, including the gitlab-runner executable itself.
I found the GitLab Runner executable paths at https://docs.gitlab.com/runner/configuration/advanced-configuration.html
I installed them again following the official GitLab documentation.
Then the runners showed as online in the GitLab portal. However, the jobs were not getting executed; they simply showed as stuck. I tried to get more information from the logs using
gitlab-runner -debug run
That is where I saw listen_address not defined. After a lot of trying I found that simply enabling Run untagged jobs did the trick. The jobs started and completed successfully. I still see listen_address not defined in the debug output, so that message misled me.
Though it seems that last step alone solved my problem, doing all the steps as a batch did the trick.
As an alternative to Avinash's solution, you can include the tags you created when registering the runner in your .gitlab-ci.yml file:
stages:
  - testing

testing:
  stage: testing
  script:
    - echo 'Hello world'
  tags:
    - my-tags
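For reference, those tags come from how the runner was registered. A registration sketch (the token and tag name are placeholders, and the exact flags depend on your gitlab-runner version):
gitlab-runner register --url https://gitlab.com/ --registration-token <token> --executor shell --tag-list my-tags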

Phoenix Deployment with EXRM

I am trying to deploy a Phoenix app on an Ubuntu server with exrm.
The release runs perfectly and the website is accessible, but when I ping the release it says:
Node 'myapp@myhost' not responding to pings.
vm.args file
## Name of the node
-sname pxblog
## Cookie for distributed erlang
-setcookie pxblog
## Heartbeat management; auto-restarts VM if it dies or becomes unresponsive
## (Disabled by default..use with caution!)
##-heart
## Enable kernel poll and a few async threads
##+K true
##+A 5
## Increase number of concurrent ports/sockets
##-env ERL_MAX_PORTS 4096
## Tweak GC to run more often
##-env ERL_FULLSWEEP_AFTER 10
Updated vm.args (Solved)
## Name of the node
-sname pxblog@localhost
## Cookie for distributed erlang
-setcookie pxblog
## Heartbeat management; auto-restarts VM if it dies or becomes unresponsive
## (Disabled by default..use with caution!)
##-heart
## Enable kernel poll and a few async threads
##+K true
##+A 5
## Increase number of concurrent ports/sockets
##-env ERL_MAX_PORTS 4096
## Tweak GC to run more often
##-env ERL_FULLSWEEP_AFTER 10
Check the vm.args file. Look for a line similar to this:
## Name of the node
-name test@127.0.0.1
I suspect the name you'll find there is "myapp@myhost". Try changing it to yourappname@localhost or yourappname@127.0.0.1. NB: I do not mean you should put the literal string yourappname there; substitute the name of your app.
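Once the name in vm.args matches what you ping, the release should answer. For example, assuming the app is called pxblog as in the question and uses the boot script exrm generates, this should reply with pong:
bin/pxblog ping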

Specifying Parallel Environment on Google Compute Engine using Elasticluster

I recently created a Grid Engine cluster on Compute Engine using Elasticluster (http://googlegenomics.readthedocs.org/en/latest/use_cases/setup_gridengine_cluster_on_compute_engine/index.html).
I was wondering what the appropriate command is to run shared-memory multithreaded batch jobs on a cluster of Compute Engine virtual machines running Grid Engine.
In other words, what is the name (i.e. pe_name) of the Grid Engine parallel environment?
Let's say I want to run a job requesting 4 CPUs on 1 node; what would be the right qsub command?
So far I tried the following command:
qsub -cwd -l h_vmem=800G -pe smp 6 run.sh
Unable to run job: job rejected: the requested parallel environment "smp" does not exist.
qsub -cwd -l h_vmem=800G -pe omp 6 run.sh
Unable to run job: job rejected: the requested parallel environment "omp" does not exist.
Thank you for your help!
I don't believe that Elasticluster's Ansible playbook includes a parallel environment. You can see the main configuration run on the master here:
https://github.com/gc3-uzh-ch/elasticluster/blob/master/elasticluster/providers/ansible-playbooks/roles/gridengine/tasks/master.yml
I believe you can simply connect to the master and issue the "add parallel environment" command:
$ qconf -ap smp
and write a configuration file like:
pe_name smp
slots 9999
user_lists NONE
xuser_lists NONE
start_proc_args /bin/true
stop_proc_args /bin/true
allocation_rule $fill_up
control_slaves FALSE
job_is_first_task FALSE
urgency_slots min
accounting_summary FALSE
and then modify the queue configuration for all.q:
$ qconf -mq all.q
...
pe_list make smp
...
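Once the smp parallel environment is attached to all.q, a request like the one in the question should be accepted, for example for the 4-CPU case:
qsub -cwd -l h_vmem=800G -pe smp 4 run.sh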
I would also suggest filing an issue with Elasticluster here:
https://github.com/gc3-uzh-ch/elasticluster/issues
I would expect that someone has already done this in a fork of Elasticluster and may be able to provide a pull request to the master fork.
Hope that helps.
-Matt
