GitLab's only:changes and the first commit in a branch

I have a multi-module build with one "leading" module and one additional one. I have set things up so that the additional module is only built when either it or the build files have changed:
build:sbt:module-main:
  extends: .build-sbt
build:sbt:module-a:
  extends: .build-sbt
  only:
    changes:
      - module-a/**/*
      - project/**/*
      - "*.yml"
      - "*.sbt"
The behaviour I observe is that in a pipeline resulting from pushing a new branch, both modules are always built, regardless of the actual changes. Then when new commits are pushed to the branch, the pipelines triggered off of those will behave according to my rule, i.e. module-a will only be built when there was a change that affects it.
I would expect the same behaviour from the start.
I assume that "change" means "what git thinks has changed between this branch and the branch it was based off of". Is that not what change means in this context?

While searching for another answer I discovered by accident that Gitlab is working as designed in this case.

Related

How do I label pipelines in GitLab?

How do I add a label to the GitLab pipelines when they run?
This would be extremely helpful when you run a few nightly (scheduled) pipelines for different configurations on the main branch. For example, we run a nightly main branch with several submodules, each set at a point in their development (a commit point SHA) and I want to label that 'MAIN'. We run a second pipeline that I want to label 'HEADs', which is a result of pulling all of the HEAD's of the submodule to see if changes will break the main trunk when they are merged in.
Currently it shows:
Last commit message.
Pipeline #
commit SHA
Branch name
'Scheduled'
That is helpful, but it is very difficult to tell them apart because only the pipeline # changes between the pipelines.
I have good news!!
Our friends at GitLab have been working on this feature. There is now a way to label your pipeline in release 15.5.1-ee.0!
It uses the workflow keyword with a new sub-keyword, name:
workflow:
  name: 'Pipeline for branch: $CI_COMMIT_BRANCH'
You can even use workflow:rules to give your pipeline different names:
variables:
  PIPELINE_NAME: 'Default pipeline name'
workflow:
  name: '$PIPELINE_NAME'
  rules:
    - if: '$CI_PIPELINE_SOURCE == "merge_request_event"'
      variables:
        PIPELINE_NAME: 'MR pipeline: $CI_COMMIT_BRANCH'
    - if: '$CI_MERGE_REQUEST_LABELS =~ /pipeline:run-in-ruby3/'
      variables:
        PIPELINE_NAME: 'Ruby 3 pipeline'
Find the docs here: https://docs.gitlab.com/ee/ci/yaml/#workflow
This feature is disabled by default in 15.5 because it is so new.
You can enable the feature flag, which is named pipeline_name.
See this link to enable: https://docs.gitlab.com/ee/administration/feature_flags.html
(You need to use the Rails Console to enable it. Pretty easy.)
Note: Remember that the workflow keyword affects the entire pipeline instance.
This seems to be officially supported with GitLab 15.7 (December 2022)
Add custom names to pipelines with workflow:name:
For some projects, the same pipeline can be configured to run differently for different variables or conditions, creating very distinct outcomes for successful pipelines.
It can be hard for you to determine which version of that pipeline ran since there is no indication about the inputs used for that particular run.
While labels like scheduled and API help, it is sometimes still difficult to identify specific pipelines.
Now you can set a pipeline name using the keyword workflow:name to better identify the pipeline with a string, a CI/CD variable, or a combination of both.
See Documentation and Issue.
Note:
If the name is an empty string, the pipeline is not assigned a name.
A name consisting of only CI/CD variables could evaluate to an empty string if all the variables are also empty.

How to read labels in Gitlab CI script

I have a few use cases in my Gitlab setup I would like to be able to support:
If a certain label (let's call it “skip_build”) is set, the deployment steps should not be run when I merge an MR to a main branch. This would be useful when we have multiple MRs being merged right after another and only need the last one built.
If another label (we'll call it “skip_tests”) is set, I should be able to read it as an env var from within the script and alter the flow within the script accordingly (using normal bash syntax), e.g. to alter the package command parameters used a bit. This is useful for small changes where it might not make sense to run a lengthy test suite.
Is this possible with Gitlab, and if so, how?
I’ve tried experimenting with CI_MERGE_REQUEST_LABELS, but it doesn’t seem to be able to read that as an env var from within the script.
You have to use merge request pipelines for the CI_MERGE_REQUEST_LABELS variable (and other MR-related variables) to be present as documented in predefined variables.
You could use a rules: clause to skip jobs. Something like
build:
  rules: # only run this job if the regex pattern does not match
    - if: $CI_MERGE_REQUEST_LABELS !~ /skip_build/
You can also do this on any other kind of predefined (or user-defined) variable, like branch name, commit messages, MR titles, etc. Whatever works for you.
For example, a built in feature of GitLab is that if your commit message contains [ci skip] it will prevent the pipeline from running. You could implement similar functionality for your jobs and/or pipelines through rules: or workflow:rules:.
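For the second use case (reading a label from inside the script), a minimal sketch follows. It assumes the job runs in a merge request pipeline, so that CI_MERGE_REQUEST_LABELS is populated as a comma-separated list. The job name, package.sh, and its --skip-tests flag are hypothetical placeholders, not part of the original question.

```yaml
# Sketch only: package.sh and its --skip-tests flag are hypothetical placeholders.
package:
  rules:
    # Run only in merge request pipelines, where CI_MERGE_REQUEST_LABELS is set.
    - if: $CI_PIPELINE_SOURCE == "merge_request_event"
  script:
    - |
      # Labels arrive as a comma-separated list, e.g. "bug,skip_tests".
      # Split on commas and look for an exact match on one label.
      if echo "$CI_MERGE_REQUEST_LABELS" | tr ',' '\n' | grep -qx "skip_tests"; then
        ./package.sh --skip-tests
      else
        ./package.sh
      fi
```

Matching a whole label with grep -x avoids accidental hits on labels that merely contain the text skip_tests.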

Gitlab only running tests based on changed modules?

Let's say I have a test block for all my microservices (15-20+). Tests take a long time since there are so many disparate modules in this monorepo.
Let's say I only want to run 1 or maybe 2 at a time, if and only if specific code changes have been made underneath a path. How can I best do this? For assembling I do something like this (not sure if this is terrible or not).
Ultimately, I'm trying to only build and test relevant things if they're relevant (based on if they or a related module I can define change)
Module-specific assembles
x:
  stage: build
  image: gradle:6.0-jdk11
  script:
    - gradle :x:assemble
  artifacts:
    paths:
      - x/build/libs
  only:
    changes:
      - x
      - x/*
      - x/*/**
y_build:
  stage: build
  image: gradle:6.0-jdk11
  script:
    - gradle :y:assemble
  artifacts:
    paths:
      - y/build/libs
  only:
    changes:
      - y
      - y/*
      - y/*/**
Current block for testing
test:
  stage: test
  image: gradle:6.0-jdk11
  services:
    - name: gitlab-registry.company.com/nap/dynamodb-local:1252954
      command: [ "-inMemory", "-sharedDb" ]
      alias: dynamodb
  script: gradle check
There are many ways that a Gitlab CI Pipeline can be triggered, but the basic way is when a commit is pushed, no matter what the change is. Currently, there isn't a way to inspect which parts of the code are changed, and only run some steps or others based on the result, but you can control which steps run based on the branch or tag name.
So for example, you could have steps that only run when the branch name starts with something like "microservice_3_", and then make sure that when editing Microservice 3, you always start the branch name with "microservice_3_", though this could get complicated the more microservices you have to support.
The easier option (in terms of the pipeline definition) is to maintain the microservices in separate repositories, each with their own pipelines and tests. This way each individual pipeline only runs against a specific component and doesn't care about changes to the others. Then if you need to, you can combine the microservices in another repository (included as git submodules) and have a pipeline that only cares about the services as a whole.
This would add extra overhead while developing the project(s), but it makes the pipelines easier to manage.
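As an alternative to splitting repositories, the only:changes filter that the question already uses for its build jobs could in principle be applied to per-module test jobs as well. The sketch below is untested. The job name x_test and the shared module directory common are hypothetical, and it assumes that module x exposes the standard Gradle check task.

```yaml
# Sketch only: "common" is a hypothetical shared module that x depends on.
x_test:
  stage: test
  image: gradle:6.0-jdk11
  script:
    # Run only module x's checks instead of the whole monorepo's.
    - gradle :x:check
  only:
    changes:
      - x/**/*
      - common/**/*
```

Listing a related module's path alongside the module's own path is one way to express "this module or something it depends on changed".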

How can I prevent jobs from running based on the files changed compared with master?

I have some GitLab jobs in my pipeline which are slow, and I'd like to prevent them from running when the changes will not affect the job's outcome.
This is what I have tried:
run_tests:
  stage: checks
  script:
    - cargo test
  except:
    - master
    - tags
  only:
    changes:
      - "**/*.rs"
      - "**/Cargo.toml"
      - "**/Cargo.lock"
This sort of works. If a merge request has multiple commits, this job will not run on any of the commits after the first one, unless a Rust source file has changed.
But the job will still always run on the first commit of the branch, even if there are no changes to Rust source files between this branch and master. Even worse, if the tests fail on the first commit, subsequent commits might skip the tests so broken code could get merged.
How can I change this filter so that the diff is done against the target branch of the merge request?
Hidden quite deep in the documentation for only: changes, there is this snippet:
Without pipelines for merge requests, pipelines run on branches or tags that don’t have an explicit association with a merge request. In this case, a previous SHA is used to calculate the diff, which is equivalent to git diff HEAD~. This could result in some unexpected behavior, including:
When pushing a new branch or a new tag to GitLab, the policy always evaluates to true.
When pushing a new commit, the changed files are calculated using the previous commit as the base SHA
Which is the scenario you are currently experiencing, where the job always runs on the first commit of the new branch.
For this to work as you want, I believe you need to add only: merge_requests. However, this would mean reworking your entire pipeline around merge request pipelines.
From the documentation:
With pipelines for merge requests, it’s possible to define a job to be created based on files modified in a merge request.
In order to deduce the correct base SHA of the source branch, we recommend combining this keyword with only: [merge_requests]. This way, file differences are correctly calculated from any further commits, thus all changes in the merge requests are properly tested in pipelines.
Various issues from GitLab:
https://gitlab.com/gitlab-org/gitlab/-/issues/11427
https://gitlab.com/gitlab-org/gitlab/-/issues/27875
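Applied to the job from the question, the documentation's recommendation might look like the sketch below. It is untested and assumes merge request pipelines are acceptable for this project. The except: entries from the question are dropped here on the assumption that they are no longer needed once the job is restricted to merge requests.

```yaml
run_tests:
  stage: checks
  script:
    - cargo test
  only:
    # Restrict the job to merge request pipelines so the diff is
    # calculated against the merge request rather than HEAD~.
    refs:
      - merge_requests
    changes:
      - "**/*.rs"
      - "**/Cargo.toml"
      - "**/Cargo.lock"
```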

In GitLab CI, is there a variable for a Merge Request's target branch?

In my pipeline, I'd like to have a job run only if the Merge Requests target branch is a certain branch, say master or release.
Is this possible?
I've read through https://docs.gitlab.com/ee/ci/variables/ and unless I missed something, I'm not seeing anything that can help.
Update: 2019-03-21
GitLab has had variables for merge request info since version 11.6 (see the variables starting with CI_MERGE_REQUEST_ at https://docs.gitlab.com/ce/ci/variables/). However, these variables are only available in merge request pipelines (https://docs.gitlab.com/ce/ci/merge_request_pipelines/index.html).
To configure a CI job for merge requests, we have to set:
only:
  - merge_requests
And then we can use CI_MERGE_REQUEST_* variables in those jobs.
The biggest pitfall here is that only: merge_requests behaves completely differently from the normal only/except parameters.
usual only/except parameters:
(https://docs.gitlab.com/ce/ci/yaml/README.html#onlyexcept-basic)
only defines the names of branches and tags for which the job will run.
except defines the names of branches and tags for which the job will not run.
only: merge_requests: (https://docs.gitlab.com/ce/ci/merge_request_pipelines/index.html#excluding-certain-jobs)
The behavior of the only: merge_requests parameter is such that only jobs with that parameter are run in the context of a merge request; no other jobs will be run.
I found it hard to reorganize the jobs so that they work as before once only: merge_requests exists on any job. Thus I'm still using the one-liner from my original answer to get MR info in a CI job.
Original answer:
No.
But GitLab have a plan for this feature in 2019 Q2: https://gitlab.com/gitlab-org/gitlab-ce/issues/23902#final-assumptions
Currently, we can use a workaround to achieve this. The method is as Rekovni's answer described, and it actually works.
There's a simple one-liner, get the target branch of an MR from the current branch:
script: # in any script section of .gitlab-ci.yml
  - 'CI_TARGET_BRANCH_NAME=$(curl -LsS -H "PRIVATE-TOKEN: $AWESOME_GITLAB_API_TOKEN" "https://my.gitlab-instance.com/api/v4/projects/$CI_PROJECT_ID/merge_requests?source_branch=$CI_COMMIT_REF_NAME" | jq --raw-output ".[0].target_branch")'
Explanation:
CI_TARGET_BRANCH_NAME is a newly defined variable which stores the resolved target branch name. Depending on how you use the result, defining a variable may not be necessary.
AWESOME_GITLAB_API_TOKEN is a variable configured in the repository's CI/CD variable settings. It is a GitLab personal access token (created in User Settings) with api scope.
About the curl options: -L makes curl follow HTTP redirects. -sS makes curl silent (-s) but still show errors (-S). -H adds the header that authenticates the request against the GitLab API.
The API used is documented at https://docs.gitlab.com/ce/api/merge_requests.html#list-project-merge-requests. We use the source_branch attribute to figure out which MR the current pipeline is running on. Thus, if a source branch has multiple MRs to different target branches, you may want to change the part after | and apply your own logic.
About jq (https://stedolan.github.io/jq/): it's a simple CLI utility for processing JSON (which is what the GitLab API returns). You could use node -p or any other method you want.
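For the case where one source branch has several merge requests, a sketch of "your own logic" after the | could list every target branch and check for a specific one. This is untested. It reuses the placeholder host and token variable from the one-liner above and assumes the merge requests API's state filter is available on your instance. The job name and the TARGETS variable are hypothetical.

```yaml
# Sketch only: job name and TARGETS are illustrative, not from the original answer.
check_target_branch:
  script:
    - |
      # List the target branches of all open MRs for the current source branch.
      TARGETS=$(curl -LsS \
        -H "PRIVATE-TOKEN: $AWESOME_GITLAB_API_TOKEN" \
        "https://my.gitlab-instance.com/api/v4/projects/$CI_PROJECT_ID/merge_requests?state=opened&source_branch=$CI_COMMIT_REF_NAME" \
        | jq --raw-output '.[].target_branch')
    - |
      # Continue only if one of those MRs targets master.
      if echo "$TARGETS" | grep -qx "master"; then
        echo "Open MR targeting master found"
      else
        echo "No open MR targets master; skipping the remaining commands"
        exit 0
      fi
```

Because jq prints one target branch per line, grep -x can match a branch name exactly rather than as a substring.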
Because of the new env variables in 11.6 $CI_MERGE_REQUEST_SOURCE_BRANCH_NAME and $CI_MERGE_REQUEST_TARGET_BRANCH_NAME jobs can be included or excluded based on the source or target branch.
Using the only and except (complex) expressions, we can build a rule to filter merge requests. A couple of examples:
Merge request where the target branch is master:
only:
  refs:
    - merge_requests
  variables:
    - $CI_MERGE_REQUEST_TARGET_BRANCH_NAME == "master"
Merge request except if the source branch is master or release:
only:
  - merge_requests
except:
  variables:
    - $CI_MERGE_REQUEST_SOURCE_BRANCH_NAME == "master"
    - $CI_MERGE_REQUEST_SOURCE_BRANCH_NAME == "release"
If you want to use multiple refs (let's say merge_requests and tags) and multiple variables, the refs will be OR'd, the variables will be OR'd, and the result will be AND'd:
If any of the conditions in variables evaluates to truth when using only, a new job is going to be created. If any of the expressions evaluates to truth when except is being used, a job is not going to be created.
If you use multiple keys under only or except, they act as an AND. The logic is:
(any of refs) AND (any of variables) AND (any of changes) AND (if kubernetes is active)
Variable expressions are also quite primitive, only supporting equality and (basic) regex. Because the variables will be OR'd, you cannot specify both a source and a target branch as of GitLab 11.6, just one or the other.
As of GitLab 11.6, there is CI_MERGE_REQUEST_TARGET_BRANCH_NAME.
If this is what you're really after, there could be an extremely convoluted way (untested) you could achieve this using the merge request API and CI variables.
With a workflow / build step something like:
Create merge request from feature/test to master
Start a build
Using the API (in a script), grab all open merge requests from the current project using CI_PROJECT_ID variable, and filter by source_branch and target_branch.
If there is a merge request open with the source_branch and target_branch being feature/test and master respectively, continue with the build, otherwise just skip the rest of the build.
For using the API, I don't believe you can use the CI_JOB_TOKEN variable to authenticate, so you'll probably need to create your own personal access token and store it as a CI variable to use in the build job.
Hope this helps!
Another example, but using rules:
rules:
  # pipeline should run on merge request to master branch
  - if: $CI_PIPELINE_SOURCE == 'merge_request_event' && $CI_MERGE_REQUEST_TARGET_BRANCH_NAME == 'master'
    when: always
  # pipeline should run on merge request to release branch
  - if: $CI_PIPELINE_SOURCE == 'merge_request_event' && $CI_MERGE_REQUEST_TARGET_BRANCH_NAME == 'release'
    when: always
  - when: never
Gitlab CI is agnostic of Merge Requests (for now). Since the pipeline runs on the origin branch you will not be able to retrieve the destination.

Resources