How to speed up a GitLab CI job with cache and artifacts - Rust

I want the GitLab test job for my Rust project to run faster.
Locally it rebuilds quickly, but in the GitLab job, every build runs as slowly as the first one.
I am looking for a way to use artifacts or a cache from a previous pipeline to speed up the Rust test and build process.
# .gitlab-ci.yml
stages:
  - test

test:
  stage: test
  image: rust:latest
  script:
    - cargo test

GitLab CI/CD supports caching between CI jobs using the cache keyword in your .gitlab-ci.yml. It can only cache files inside the project directory, so you need to set the CARGO_HOME environment variable if you also want to cache the cargo registry.
You can add a cache setting at the top level to set up a cache for all jobs that don't define one themselves, and you can add it inside a job definition to configure the cache for that job alone.
See the keyword reference for all possible configuration options.
Here is one example configuration that caches the cargo registry and the temporary build files and configures the clippy job to only use the cache but not write to it:
stages:
  - test

cache: &global_cache              # Default cache configuration, with the YAML anchor
                                  # `global_cache` pointing to this block
  key: ${CI_COMMIT_REF_SLUG}      # Share the cache between all jobs on one branch/tag
  paths:                          # Paths to cache
    - .cargo/bin
    - .cargo/registry/index
    - .cargo/registry/cache
    - target/debug/deps
    - target/debug/build
  policy: pull-push               # All jobs not configured otherwise pull from
                                  # and push to the cache

variables:
  CARGO_HOME: ${CI_PROJECT_DIR}/.cargo   # Move cargo data into the project
                                         # directory so it can be cached
# ...
test:
  stage: test
  image: rust:latest
  script:
    - cargo test

# only for demonstration, you can remove this block if not needed
clippy:
  stage: test
  image: rust:latest
  script:
    - cargo clippy # ...
  only:
    - master
  needs: []
  cache:
    <<: *global_cache   # Inherit the cache configuration `&global_cache`
    policy: pull        # But only pull from the cache, don't push changes to it
If you want to run cargo publish from CI, you should also add .cargo to your .gitignore file. Otherwise cargo publish will report an error about an uncommitted .cargo directory in your project.
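For illustration, here is a minimal sketch of such a publish job, reusing the `&global_cache` anchor from above. The job layout, the `deploy` stage, and the tag-only trigger are assumptions, not part of the original answer; cargo reads the registry token from the CARGO_REGISTRY_TOKEN environment variable, which you would set as a masked CI/CD variable:

publish:
  stage: deploy                # assumes a `deploy` stage was added to `stages:`
  image: rust:latest
  cache:
    <<: *global_cache
    policy: pull               # reuse the build cache, don't write back to it
  script:
    - cargo publish            # reads CARGO_REGISTRY_TOKEN from the environment
  only:
    - tags                     # publish only on tagged releases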

Related

GitLab CI - avoid unnecessary rebuilds of the React portion of a project

I have a stage in my CI pipeline (gitlab-ci) as follows:
build_node:
  stage: Build Prerequisites
  only:
    - staging
    - production
    - ci
  image: node:15.5.0
  artifacts:
    paths:
      - http
  cache:
    key: "node_modules"
    paths:
      - ui/node_modules
  script:
    - cd ui
    - yarn install --network-timeout 600000
    - CI=false yarn build
    - mv build ../http
The UI, however, is not the only part of the project. There are other files with their own build processes, so whenever we commit changes that touch only those other files, this stage gets rerun every time, even if nothing in the ui folder changed.
Is there a way to have GitLab cache the result, or otherwise skip the rebuild, when nothing under the ui folder has changed? Any change that should trigger a rebuild would be under the ui folder. Can it just reuse the older build?
This is possible in recent GitLab versions using the rules:changes keyword.
rules:
  - changes:
      - ui/**/*   # `**` also matches nested paths; `ui/*` would only match direct children
Link: https://docs.gitlab.com/ee/ci/jobs/job_control.html#variables-in-ruleschanges
This makes the job run only when something inside the ui folder changes.
Check this link for more info: https://docs.gitlab.com/ee/ci/yaml/#ruleschanges
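Applied to the build_node job from the question, a sketch could look like this (the rules: block replaces the only: block, since the two keywords cannot be combined in one job; the branch list is taken from the question):

build_node:
  stage: Build Prerequisites
  image: node:15.5.0
  rules:
    - if: '$CI_COMMIT_BRANCH =~ /^(staging|production|ci)$/'
      changes:
        - ui/**/*              # run only when something under ui/ changed
  artifacts:
    paths:
      - http
  cache:
    key: "node_modules"
    paths:
      - ui/node_modules
  script:
    - cd ui
    - yarn install --network-timeout 600000
    - CI=false yarn build
    - mv build ../http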

Why does GitLab CI not find my cached folder?

I have a list of CI jobs running in my GitLab instance, and the caching does not work as expected:
This is how my docu-generation job ends:
[09:19:33] Documentation generated in ./documentation/ in 4.397 seconds using gitbook theme
Creating cache angular...
00:02
WARNING: frontend/node_modules: no matching files
frontend/documentation: found 136 matching files
No URL provided, cache will be not uploaded to shared cache server. Cache will be stored only locally.
Created cache
Job succeeded
I then start a deployment job (to GitLab Pages), but it fails because it doesn't find the documentation folder:
$ cp -r frontend/documentation .public/frontend
cp: cannot stat 'frontend/documentation': No such file or directory
This is the cache config of the generation job:
generate_docu_frontend:
  image: node:12.19.0
  stage: build
  cache:
    key: angular
    paths:
      - frontend/node_modules
      - frontend/documentation
  needs: ["download_angular"]
and this is for deployment:
deploy_documentation:
  stage: deploy
  cache:
    - key: angular
      paths:
        - frontend/node_modules
        - frontend/documentation
      policy: pull
    - key: laravel
      paths:
        - backend/vendor
        - backend/public/docs
      policy: pull
does anyone know why my documentation folder is missing?
The message in your job output No URL provided, cache will be not uploaded to shared cache server. Cache will be stored only locally. just means that your runners are not using Amazon S3 (or something similar, like MinIO) to store your cache.
Without S3/MinIO, the cache only lives on the runner that first ran the job and cached the resources. The next time the job runs, if it happens to be picked up by a different runner, that runner won't have the cache, and you run into an error like this.
There are a couple of ways around this:
1. Configure your runners to use S3/MinIO (MinIO has an open-source, free-to-use license if you're interested in hosting it yourself).
2. Only use one runner (not a great solution, since more runners generally means faster pipelines and this would slow things down considerably, though it would solve the cache problem).
3. Use tags. Tags ensure that a job runs on specific runners. Say, for example, that 1 out of your 10 runners has access to your production servers, but all of them have access to your lower-environment servers. Your lower-env jobs can run on any runner, but your production deployment job has to run on the one runner with prod access. You can do this by putting a tag on the runner, called let's say prod-access, and putting the same tag on the prod deploy job. This ensures that job runs on the runner with prod access. The same mechanism can be used here to ensure the cache is available (see the sketch after this list).
4. Use artifacts instead of cache. I'll explain this option below, as it's really what you should be using for this use case.
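As a minimal sketch of the tags option, assuming a hypothetical docs-cache tag on the runner that holds the cache:

generate_docu_frontend:
  tags:
    - docs-cache     # hypothetical tag; pins the job to that specific runner
  # ...

deploy_documentation:
  tags:
    - docs-cache     # same tag, so this job sees the same local cache
  # ...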
Let's briefly explain the difference between Cache and Artifacts:
Cache is generally best used for dependency installation like npm or composer (for PHP projects). When you have a job that runs npm ci or composer install, you don't want it to run every single time your pipeline runs when the dependencies haven't necessarily changed, as that wastes time. Use the cache keyword to cache the dependencies so that subsequent pipelines don't have to install them again.
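As an illustration of that pattern, a minimal sketch of a dependency-install job (the job name and the lockfile-based cache key are assumptions, not from the question):

install_dependencies:
  image: node:12.19.0
  stage: build
  cache:
    key:
      files:
        - package-lock.json    # cache is invalidated when the lockfile changes
    paths:
      - .npm/
  script:
    - npm ci --cache .npm --prefer-offline   # downloads come from the cached .npm directory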
Artifacts are best used when you need to share files or directories between jobs in the same pipeline. For example, after installing npm dependencies, you might need the node_modules directory in another job in the pipeline. Artifacts are also uploaded to the GitLab server by the runner at the end of the job, as opposed to being stored locally on the runner that ran the job. Artifacts from all previous stages are downloaded for all subsequent jobs, unless controlled with either dependencies or needs.
Artifacts are the better choice for your use case.
Let's update your .gitlab-ci.yml file to use artifacts instead of cache:
stages:
  - build
  - deploy

generate_docu_frontend:
  image: node:12.19.0
  stage: build
  script:
    - ./generate_docs.sh   # this is just a representation of whatever steps you run to generate the docs
  artifacts:
    paths:
      - frontend/node_modules
      - frontend/documentation
    expire_in: 6 hours     # your GitLab instance will have a default, you can override it like this
    when: on_success       # don't attempt to upload the docs if generating them failed

deploy_documentation:
  stage: deploy
  script:
    - ls          # just an example showing that frontend/node_modules and frontend/documentation are present
    - deploy.sh   # whatever else you need to run this job

SonarQube GitLab integration issue with sonar-scanner.properties file

I have two projects in GitLab and I am trying to integrate SonarQube with my GitLab projects.
Project 1
I have added the 'sonar-scanner.properties' file to Project 1, and it's as follows:
sonar-scanner.properties
# SonarQube server
# sonar.host.url & sonar.login are set by the Scanner CLI.
# See https://docs.sonarqube.org/latest/analysis/gitlab-cicd/.
# Project settings.
sonar.projectKey=Trojanwall
sonar.projectName=Trojanwall
sonar.projectDescription=My new interesting project.
sonar.links.ci=https://gitlab.com/rmesi/trojanwallg2-testing/-/pipelines
#sonar.links.issue=https://gitlab.com/rmesi/trojanwallg2-testing/
# Scan settings.
sonar.projectBaseDir=./
#sonar.sources=./
sonar.sources=./
sonar.sourceEncoding=UTF-8
sonar.host.url=http://sonarqube.southeastasia.cloudapp.azure.com:31000
sonar.login=4f4cbabd17914579beb605c3352349229b4fd57b
#sonar.exclusions=,**/coverage/**
# Fail CI pipeline if Sonar fails.
sonar.qualitygate.wait=true
Then I added the sonar scanner job in the gitlab-ci.yml file:
gitlab-ci.yml
sonar-scanner-trojanwall:
  stage: sonarqube:scan
  image:
    name: sonarsource/sonar-scanner-cli:4.5
    entrypoint: [""]
  variables:
    # Defines the location of the analysis task cache
    SONAR_USER_HOME: "${CI_PROJECT_DIR}/.sonar"
    # Shallow cloning needs to be disabled.
    # See https://docs.sonarqube.org/latest/analysis/gitlab-cicd/.
    GIT_DEPTH: 0
  cache:
    key: "${CI_JOB_NAME}"
    paths:
      - .sonar/cache
  script:
    - sonar-scanner
  only:
    - Production
    - /^[\d]+\.[\d]+\.1$/
  when: on_success
After this, I configured the two variables 'SONAR_HOST_URL' and 'SONAR_TOKEN' and ran the pipeline. It worked perfectly for Project 1.
Project 2
Then I needed to do the same for Project 2: the sonar scanner should go into Project 2, scan, and analyze it. For that, I created another project in SonarQube with a new token.
I needed to configure it so that when the pipeline for Project 1 is triggered, it scans both Project 1 and Project 2.
For that, I added another job to Project 1's pipeline.
It's as follows:
gitlab-ci.yml
sonar-scanner-test-repo:
  stage: sonarqube:scan
  trigger:
    include:
      - project: 'rmesi/test-repo'
        ref: master
        file: 'sonarscanner.gitlab-ci.yml'
  only:
    - Production
    - /^[\d]+\.[\d]+\.1$/
  when: on_success
I tried to set up a downstream pipeline that triggers a YAML file in Project 2: when the pipeline in Project 1 is triggered and the job 'sonar-scanner-test-repo' runs, another YAML file in Project 2 runs as a downstream pipeline. That YAML file is as follows:
sonarscanner.gitlab-ci.yml
stages:
  - sonarqube:scan

variables:
  CI_PROJECT_DIR: /builds/rmesi/test-repo

sonar-scanner:
  stage: sonarqube:scan
  image:
    name: sonarsource/sonar-scanner-cli:4.5
    entrypoint: [""]
  variables:
    # Defines the location of the analysis task cache
    SONAR_USER_HOME: "${CI_PROJECT_DIR}/.sonar"
    # Shallow cloning needs to be disabled.
    # See https://docs.sonarqube.org/latest/analysis/gitlab-cicd/.
    GIT_DEPTH: 0
  cache:
    key: "${CI_JOB_NAME}"
    paths:
      - .sonar/cache
  script:
    - cd /builds/rmesi/
    - git clone https://gitlab.com/rmesi/test-repo.git test-repo
    - sonar-scanner
Then I added the 'sonar-project.properties' file to Project 2, which is as follows:
sonar-project.properties
# SonarQube server
# sonar.host.url & sonar.login are set by the Scanner CLI.
# See https://docs.sonarqube.org/latest/analysis/gitlab-cicd/.
# Project settings.
sonar.projectKey=test-repo
sonar.projectName=test-repo
sonar.projectDescription=My new interesting project.
sonar.links.ci=https://gitlab.com/rmesi/test-repo/-/pipelines
#sonar.links.issue=https://gitlab.com/rmesi/test-repo/
# Scan settings.
sonar.projectBaseDir=/builds/rmesi/test-repo/
sonar.sources=/builds/rmesi/test-repo/, ./
sonar.sourceEncoding=UTF-8
sonar.host.url=http://sonarqube.southeastasia.cloudapp.azure.com:31000
sonar.login=b0c40e44fd59155d27ee43ae375b9ad7bf39bbdb
#sonar.exclusions=,**/coverage/**
# Fail CI pipeline if Sonar fails.
sonar.qualitygate.wait=true
The issue is that the downstream pipeline fails: from the error output I figured out that it is not locating the 'sonar-scanner.properties' file in Project 2.
Whereas in Project 1, the same step shows:
INFO: Project root configuration file: /builds/rmesi/trojanwallg2-testing/sonar-project.properties
But in Project 2 it's not working.
Does anyone know how to fix this?
I found the solution to this myself.
I needed to add
- cd /builds/rmesi/test-repo ; sonar-scanner
to the script section of the job in the 'sonarscanner.gitlab-ci.yml' file.
That way, the runner changes into the desired directory and executes the 'sonar-scanner' command there.
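Putting that together, the script section of the downstream sonar-scanner job would end up like this sketch (using the /builds/rmesi/ prefix from the rest of the question):

script:
  - cd /builds/rmesi/
  - git clone https://gitlab.com/rmesi/test-repo.git test-repo
  - cd /builds/rmesi/test-repo   # run the scanner from the project root...
  - sonar-scanner                # ...so it finds the properties file there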

GitLab CI - Run pipeline when the contents of a file changes

I have a mono-repo with several projects (not my design choice).
Each project has a .gitlab-ci.yml set up to run a pipeline when a "version" file is changed. This is nice because a user can check in to stage or master (for a hot-fix) and a build is created and deployed to a test environment.
The problem is when a user does a merge from master to stage and commits back to stage (to pull in any hot-fixes). This causes ALL the pipelines to run, even for projects that have no actual content changes.
How do I allow the pipeline to run from master and/or stage but ONLY when the contents of the "version" file change? Like when a user changes the version number.
Here is an example of the .gitlab-ci.yml (I have 5 of these, one for each project in the mono-repo):
#
# BUILD-AND-TEST - initial build
#
my-project-build-and-test:
  stage: build-and-test
  script:
    - cd $MY_PROJECT_DIR
    - dotnet restore
    - dotnet build
  only:
    changes:
      - "MyProject/.gitlab-ci.VERSION.yml"
  # no needs: here because this is the first step

#
# PUBLISH
#
my-project-publish:
  stage: publish
  script:
    - cd $MY_PROJECT_DIR
    - dotnet publish --output $MY_PROJECT_OUTPUT_PATH --configuration Release
  only:
    changes:
      - "MyProject/.gitlab-ci.VERSION.yml"
  needs:
    - my-project-build-and-test

... and so on ...
I am still new to git, GitLab, and CI/pipelines. Any help would be appreciated! (I have little say in changing the mono-repo)
The following .gitlab-ci.yml will run test_job only if the file named version changes.
test_job:
  script: echo hello world
  rules:
    - changes:
        - version
See https://docs.gitlab.com/ee/ci/yaml/#ruleschanges
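To also keep the master/stage branch restriction from the original jobs, the branch check can be folded into the same rule. A sketch applied to the first job from the question (branch names assumed from the description); note that rules: and only: cannot be mixed in the same job, so the only:changes: blocks would be replaced, not extended:

my-project-build-and-test:
  stage: build-and-test
  script:
    - cd $MY_PROJECT_DIR
    - dotnet restore
    - dotnet build
  rules:
    - if: '$CI_COMMIT_BRANCH == "master" || $CI_COMMIT_BRANCH == "stage"'
      changes:
        - "MyProject/.gitlab-ci.VERSION.yml"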
See also
Run jobs only/except for modifications on a path or file

Stop GitLab runner from removing a directory

I have a directory which is generated during a build and should not be deleted in subsequent builds. I tried to keep the directory using cache in .gitlab-ci.yml:
cache:
  key: "$CI_BUILD_REF_NAME"
  untracked: true
  paths:
    - target_directory/

build-runner1:
  stage: build
  script:
    - ./build-platform.sh target_directory
In the first build a cache.zip is generated, but for the next builds the target_directory is deleted and the cache.zip is extracted, which takes a very long time. Here is a log of the second build:
Running with gitlab-ci-multi-runner 1.11.
on Runner1
Using Shell executor...
Running on Runner1...
Fetching changes...
Removing target_directory/
HEAD is now at xxxxx Update .gitlab-ci.yml
From xxxx
Checking out xxx as master...
Skipping Git submodules setup
Checking cache for master...
Successfully extracted cache
Is there a way to make the GitLab runner not remove the directory in the first place?
What you need is to use job artifacts:
Artifacts is a list of files and directories which are attached to a
job after it completes successfully.
.gitlab-ci.yml file:
your job:
  before_script:
    - do something
  script:
    - do another thing
    - do something to generate your zip file (example: myFiles.zip)
  artifacts:
    paths:
      - myFiles.zip
After a job finishes, if you visit the job's specific page, you can see that there is a button for downloading the artifacts archive.
Note
If you need to pass artifacts between different jobs, you need to use dependencies.
GitLab has good documentation about this if you really have that need: http://docs.gitlab.com/ce/ci/yaml/README.html#dependencies
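A minimal sketch of passing the directory between jobs with artifacts and dependencies, reusing the build script from the question (the deploy job and stage names are assumptions):

build-runner1:
  stage: build
  script:
    - ./build-platform.sh target_directory
  artifacts:
    paths:
      - target_directory/

deploy:
  stage: deploy
  dependencies:
    - build-runner1     # download only the artifacts produced by build-runner1
  script:
    - ls target_directory/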
