Run jobs in parallel in .gitlab-ci.yml

I currently have 5 test licenses on a server. I have a pipeline that runs test scripts when I trigger it manually. It connects to the license server and acquires a floating license. This approach is fine for now, but soon I will want to expand it so that, when the application needs testing, I can run multiple pipelines in parallel and have multiple tests running at once. The catch is that I only sometimes want to run them in parallel, depending on what I need to test, and I want to trigger each pipeline manually. For example, one day I might only want to run the test scripts in one job, which requires one running pipeline. Another day I might want to run 3 jobs at the same time, and another day I might want to run 5 jobs throughout the day that may overlap with an already running pipeline, so each needs its own pipeline.
My question is: how do I go about setting this up in a GitLab YAML file?
If anyone could also provide a basic example, that would be helpful!

As a general rule, any job within the same stage is run in parallel. Within those jobs, you can define rules that specify when a given job runs. Note that needing to manually kick off jobs will cause your overall pipeline to be listed as "blocked" when the manual jobs are reached. Similarly, you'll need to set some jobs as allow_failure: true or they will block the next stage from executing. Example pipeline:
stages:
  - build
  - test
build_job:
  image: alpine:latest
  stage: build
  script:
    - echo "hello world"
test_job_1:
  image: alpine:latest
  stage: test
  rules:
    - when: manual
  script:
    - echo "test 1"
test_job_2:
  image: alpine:latest
  stage: test
  rules:
    - when: manual
  script:
    - echo "test 2"
When you run the above pipeline, you will have to manually click "play" on both jobs to start them.
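If the "blocked" status is a problem, one option is to mark the manual jobs as allowed to fail. A minimal sketch, reusing test_job_1 from the example above and assuming a GitLab version that supports allow_failure inside rules:

```yaml
test_job_1:
  image: alpine:latest
  stage: test
  rules:
    # Manual and optional: the pipeline is not reported as "blocked",
    # and later stages do not wait for this job.
    - when: manual
      allow_failure: true
  script:
    - echo "test 1"
```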
I will note, though, that this feels like an anti-pattern. It sounds like you want to conserve your 5 test licenses so that you don't, for example, have 6 jobs running and one of them failing because no license is free. If so, this is exactly the use case the resource_group keyword is meant to address. Jobs in the same resource group do not run concurrently, even if they are in different pipelines. You could therefore have a test_1 resource group, a test_2 resource group, and so on: the jobs would run in parallel with each other automatically, but no more than one instance of a given job would run at once, even across pipelines. This lets you ensure that you use at most 5 licenses while your tests still run automatically without manual triggering, which also makes it easier to define downstream jobs that run when your tests pass.
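A minimal sketch of the resource_group approach. The job names, the group names license_slot_1 and license_slot_2, and the run_tests.sh script are hypothetical; only the resource_group keyword itself comes from GitLab:

```yaml
stages:
  - test

test_job_1:
  stage: test
  # Only one job in the "license_slot_1" group runs at a time,
  # even when several pipelines are active at once.
  resource_group: license_slot_1
  script:
    - ./run_tests.sh suite-1   # hypothetical test script

test_job_2:
  stage: test
  resource_group: license_slot_2
  script:
    - ./run_tests.sh suite-2   # hypothetical test script
```

With five such jobs, each in its own group, at most five licenses are in use at any moment: a second pipeline's test_job_1 waits until the first pipeline's test_job_1 has finished.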

Related

Activate a GitLab CI job only when an earlier job has failed

I could not find a solution to our problem.
We have many CI pipelines running at scheduled times, and we need to add a job that runs when an earlier job fails.
For example, let's say that the job "deploy-job1" in the attached picture failed (unlike in the picture).
We want a "sleeping job" that is activated and runs only when a previous job did not succeed.
Gitlab pipeline
Are there any suggestions for handling this kind of task?
We have tried to handle this within the scripts we run, but we want a general "sleeping job" that works the same way for all stages.

Something like this might help:
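.sleeping_job:
  # Hidden job (leading dot): it only runs when a concrete job extends it.
  needs: [deploy-job1]
  when: on_failure
  script:
    - echo "do stuff"  # placeholder for the real recovery steps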

Can I schedule only one specific GitLab CI/CD job?

At the moment, I have a couple of CI/CD jobs configured in gitlab-ci.yaml within a pipeline. For instance, I have the build and deploy configured within one branch. One of the jobs is creating a backup.
I would like to schedule only the backup job every night. However, in the schedule section of GitLab CI/CD, I can only select a branch or a tag. This will result in all the stages/jobs to be triggered, whereas I only want the backup to be triggered.
I've seen the option to configure rules and excepts to only trigger the jobs in a certain context. That, however, feels a bit like a patch solution. Is there another way to trigger only a single job in the Gitlab CI/CD scheduler?

No, there's no other way than using rules: or only:/except: to control which jobs run when the pipeline is triggered by a schedule.
You can make this more manageable by splitting your pipeline configuration into multiple files and then using the appropriate rules: on the include:
include:
  - local: /do-backup.yml
    rules:
      - if: '$CI_PIPELINE_SOURCE == "schedule"'
  - local: /do-pipeline.yml
    rules:
      - if: '$CI_PIPELINE_SOURCE != "schedule"'
That way you don't have to necessarily apply the same rules: to every single job and you can define additional rules: per job with less conflict.
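For comparison, the per-job form looks roughly like this. It is only a sketch: the job names and the make commands are placeholders, while CI_PIPELINE_SOURCE is a predefined GitLab variable:

```yaml
backup:
  stage: deploy
  rules:
    # Run only when the pipeline was started by a schedule.
    - if: '$CI_PIPELINE_SOURCE == "schedule"'
  script:
    - make backup   # placeholder for the real backup command

build:
  stage: build
  rules:
    # Skip this job in scheduled pipelines, run it otherwise.
    - if: '$CI_PIPELINE_SOURCE == "schedule"'
      when: never
    - when: on_success
  script:
    - make build    # placeholder
```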

How to manage success/failure of independent scheduled jobs in Gitlab CI

I have a few independent scheduled CI jobs.
Check that the infrastructure matches Terraform code.
Check for npm vulnerabilities.
Check that external APIs pass tests.
These are not hermetic. They do not test the fitness of the code in isolation. They could succeed at commit 12345 and then fail at commit 12345 and then succeed again.
I run these daily.
Gitlab lacks the ability to have multiple pipeline types (unlike, say, Buildkite), so I use a variable to control which steps run.
However, I am left with the problem that these checks interfere with the main passed/failed status of commits.
For example, if the Terraform infra check fails, the pipeline is marked as broken, people are notified, and whoever pushes next is told that they fixed it.
These kinds of checks can't be that uncommon, right? How should these be managed?

It sounds like you're looking for the allow_failure keyword.
https://docs.gitlab.com/ce/ci/yaml/#allow_failure
When allow_failure is set to true and the job fails, the job shows an orange warning in the UI. However, the logical flow of the pipeline considers the job a success/passed, and is not blocked.
job1:
  stage: test
  script:
    - execute_script_that_will_fail
  allow_failure: true
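Applied to the scheduled checks from the question, this could look like the sketch below. The variable name CHECK_TYPE and its value are assumptions (they would be set on the pipeline schedule), and check_infra.sh is a hypothetical script:

```yaml
terraform_drift_check:
  stage: test
  rules:
    # CHECK_TYPE is a hypothetical variable defined on the pipeline schedule.
    - if: '$CI_PIPELINE_SOURCE == "schedule" && $CHECK_TYPE == "terraform"'
  script:
    - ./check_infra.sh   # hypothetical script that exits non-zero on drift
  # A failure shows as a warning instead of failing the whole pipeline.
  allow_failure: true
```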

Can a Gitlab CI pipeline job be configured as automatic when the prior stage succeeds, but manual otherwise?

I use Gitlab CI to deploy my service. Currently I have a single deploy job that releases my changes to every server behind a load balancer.
I would like to do a phased rollout where I deploy to one server in my load balancer, give it a few minutes to bake and set off any alarms if there is an issue, and then automatically continue deploying to the remaining servers. If any issue occurred before the delayed full automatic deploy happened, I would manually cancel that job to prevent the bad change from going out more widely.
With this goal in mind I configured my pipeline with the following .gitlab-ci.yml:
stages:
  - canary_deploy
  - full_deploy
canary:
  stage: canary_deploy
  allow_failure: false
  when: manual
  script: make deploy-canary
full:
  stage: full_deploy
  when: delayed
  start_in: 10 minutes
  script: make deploy-full
This works relatively well but I ran into a problem when I tried to push a critical change out quickly. The canary deploy script was hanging and this prevented the second job from starting as it must wait for the first stage to complete. In this case I would have preferred to skip the canary entirely but because of the way the pipeline is configured it was not possible to manually invoke the full deploy.
Ideally I would like the full_deploy stage to run on the typical delay but allow me to forcefully start it if I didn't want to wait. I've reviewed the rules and needs and when configuration options hoping to find a way to achieve my goal but I haven't been able to find a working solution.
Some things I've tried, without luck:
I could create a duplicate full_deploy job which is manual and does not depend on the canary_deploy stage but it feels a bit hacky. And in reality my configuration is a bit more complex than what I've distilled here so there are actually several region-specific deploy jobs and I would prefer not to have to duplicate each of them.
I tried to use rules to consider the status of the prior stage and make the full_deploy manual unless the prior stage was successful. This isn't possible because rules are executed on pipeline creation and cannot dynamically adjust this property at runtime.
I changed the canary_deploy to allow failure, which effectively unblocked the second stage immediately. The problem here is that it caused the delay timer to start counting down immediately upon pipeline creation rather than waiting for the first stage to complete.

One thing you could do to make duplicating the full_deploy job feel a little bit less "hacky" is to define it once and then use extends two times:
stages:
  - canary_deploy
  - full_deploy
.full:
  script: make deploy-full
canary:
  stage: canary_deploy
  allow_failure: false
  when: manual
  script: make deploy-canary
full_automatic:
  extends: .full
  stage: full_deploy
  when: delayed
  start_in: 10 minutes
full_manual:
  stage: full_deploy
  extends: .full
  when: manual
  needs: []
This way, you only need to define the script section once, and both the full_manual and the full_automatic job use it. When running the pipeline, you can choose which job to run first (full_manual or canary):
Screenshot of the GitLab UI for selecting which job to run
By specifying needs: [], you tell GitLab that the full_manual job does not depend on any other jobs and can be executed immediately without running jobs from canary_deploy before.
When executing full_manual, the canary job is not executed:
Overview of executed pipeline jobs

Gitlab CI in multiple platforms simultaneously

I have a C++ project that is compiled and packaged for multiple OS (Linux, Windows, MacOS) as well as multiple CPU architectures (i386, x86_64, arm, Aarch64)
For this I'm using Jenkins to grab the source code and run the build script in parallel on each system. It's a simple working solution, since my build script deals with the system differences.
Now I'm looking into Gitlab CI/CD, and it has many things I find appealing (being able to keep the build script as part of the repository, very good integration with the git repo and ticketing system, natural use of Docker containers, etc.), but I cannot find any way to run the same pipeline on multiple architectures/systems in parallel.
So, say that my build script is:
build:
  stage: build
  script:
    - uname -m > arch.txt
  artifacts:
    paths:
      - arch.txt
How can I tell Gitlab that I want to run this job in multiple runners/Docker containers/systems at the same time? All the documentation I've read so far deals with running multiple tests on one build, integrating multiple projects or deploying in different environments depending on branches. Nothing I've read so far tries to do many separate builds, test and package them individually and report on their independent results. Is this feasible on Gitlab CI/CD?

GitLab uses "runners" to execute CI jobs. Runners are installed wherever you want to run a CI job, so if you want to run on multiple architectures then you will need to install runners on systems for each architecture. Runner install documentation can be found here:
https://docs.gitlab.com/runner/install/index.html
For Linux-based jobs it is common to use Docker for job execution. This doesn't give architectural flexibility, but it does let you test on different distributions and with different software through containerisation. For other architectures you may need to install runners yourself, or use other people's shared runners.
While you are installing the runner software there are some key steps:
You have the opportunity to link each runner to your GitLab project, which means it will show up in the runners list under Project > Settings > CI/CD.
You have the opportunity to assign "tags" to the runners. Tags can be used to help identify a runner or group of runners by an arbitrary name (e.g. you could add "Windows x86_64" as a tag, or "Windows" and "x86_64" tags). These tags can be used in jobs to select a runner.
Once you have your runners installed you can get editing your .gitlab-ci.yml file.
GitLab CI files are broken up into "stages". Jobs in each stage can run in parallel. Stage names are defined at the top of the file.
stages:
  - build
  - deploy
Each CI job can be attached to a stage by using the stage: entry:
build job:
  stage: build
  script:
    - echo "I am a build stage job"
In your case you will need to create multiple jobs, one for each architecture you want to build for. Attaching these to the same stage will allow them to run in parallel.
To control where each job runs you have two main mechanisms:
Tags - Tags allow you to pin a job to runners with a given tag. You can specify multiple tags using the tags: entry, which forms an AND list (e.g. win tag AND x86_64 tag). When that job runs, GitLab will find a runner that has all the required tags and run the job there.
Image - When running on Docker or Kubernetes, you can specify the Docker image the job runs in. To do so, the job must first land on a runner that can run Docker images (e.g. a runner using the Docker or Kubernetes executor), which might, for example, be tagged with docker or kubernetes. Then you use the image: entry to specify the Docker image.
Here's an example showing both tags and images:
build win x86_64:
  stage: build
  tags:
    - win
    - x86_64
  script:
    - echo "I am a build stage job for win x86_64"
build win 32:
  stage: build
  tags:
    - win
    - 32-bit
  script:
    - echo "I am a build stage job for win 32"
build debian:
  stage: build
  tags:
    - docker
  image: debian:stretch
  script:
    - echo "I am a build stage job for debian, running on docker using debian:stretch image"
There is currently no support for dynamic jobs, or running one job on multiple runners / architectures, so this involves a bit of manual effort. On the positive side it makes GitLab CI files easy to read, and easy to see what will run during CI execution.
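Later GitLab releases added parallel:matrix, and runner tags can reference variables, which may remove some of that manual effort. The following is a sketch rather than a tested configuration: the tag names and the RUNNER_TAG variable are hypothetical and must match tags assigned to your own runners, and it assumes a GitLab version recent enough to support both features:

```yaml
build:
  stage: build
  parallel:
    matrix:
      # One job is created per value; RUNNER_TAG is a made-up variable name.
      - RUNNER_TAG: [linux-x86_64, linux-aarch64, windows-x86_64]
  tags:
    - $RUNNER_TAG
  script:
    - uname -m > arch.txt
  artifacts:
    paths:
      - arch.txt
```

Each generated job is routed to a runner carrying the matching tag and reports its own result in the pipeline view.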

Check out the latest GitLab release 11.5 (Nov. 2018) with:
Parallel attribute for faster pipelines
The speed of pipelines is an important factor for any team, and running tests or other parallelizable tasks tends to take a big chunk of the time for any build.
Adding this new keyword gives teams the ability to simply parallelize tests, allowing everyone to accelerate their software delivery process.
To use this feature, simply set the parallel attribute to how many copies of the task you’d like to split it into, and GitLab will handle the work of automatically creating the appropriate number of jobs running your task.
As commented by March, this is not an exact fit for the OP (which is about running the same pipeline in multiple architectures/systems parallel to each other).
See documentation and issue.
parallel allows you to configure how many instances of a job to run in parallel.
This value has to be greater than or equal to two (2) and less than or equal to 50.
This creates N instances of the same job that run in parallel.
They’re named sequentially from job_name 1/N to job_name N/N.
test:
  script: rspec
  parallel: 5
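Each of those jobs can tell which slice it is through the predefined variables CI_NODE_INDEX and CI_NODE_TOTAL. A sketch of how a job might use them; run_tests.sh and its flags are hypothetical:

```yaml
test:
  stage: test
  parallel: 5
  script:
    # CI_NODE_INDEX runs from 1 to CI_NODE_TOTAL (5 here).
    - echo "Running slice $CI_NODE_INDEX of $CI_NODE_TOTAL"
    - ./run_tests.sh --shard "$CI_NODE_INDEX" --total "$CI_NODE_TOTAL"   # hypothetical
```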
