I have a directory that is generated during a build and should not be deleted in subsequent builds. I tried to keep the directory using the cache in .gitlab-ci.yml:
cache:
  key: "$CI_BUILD_REF_NAME"
  untracked: true
  paths:
    - target_directory/

build-runner1:
  stage: build
  script:
    - ./build-platform.sh target_directory
In the first build a cache.zip is generated, but for the next builds the target_directory is deleted and the cache.zip is extracted, which takes a very long time. Here is a log of the second build:
Running with gitlab-ci-multi-runner 1.11.
on Runner1
Using Shell executor...
Running on Runner1...
Fetching changes...
Removing target_directory/
HEAD is now at xxxxx Update .gitlab-ci.yml
From xxxx
Checking out xxx as master...
Skipping Git submodules setup
Checking cache for master...
Successfully extracted cache
Is there a way to make the GitLab runner not remove the directory in the first place?
What you need is to use job artifacts:
Artifacts is a list of files and directories which are attached to a job after it completes successfully.
.gitlab-ci.yml file:
your job:
  before_script:
    - do something
  script:
    - do another thing
    - do something to generate your zip file (example: myFiles.zip)
  artifacts:
    paths:
      - myFiles.zip
After a job finishes, if you visit the job's specific page, you can see that there is a button for downloading the artifacts archive.
Note
If you need to pass artifacts between different jobs, you need to use dependencies.
GitLab has good documentation about that if you really have this need: http://docs.gitlab.com/ce/ci/yaml/README.html#dependencies
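For illustration, here is a minimal sketch of passing an artifact between two jobs with dependencies (the job names, stages, and myFiles.zip are placeholders, not taken from your project):

build job:
  stage: build
  script:
    - do something to generate your zip file (example: myFiles.zip)
  artifacts:
    paths:
      - myFiles.zip

deploy job:
  stage: deploy
  dependencies:
    - build job
  script:
    - unzip myFiles.zip # the artifact from "build job" is downloaded before this script runs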
I have a stage in my CI pipeline (gitlab-ci) as follows:
build_node:
  stage: Build Prerequisites
  only:
    - staging
    - production
    - ci
  image: node:15.5.0
  artifacts:
    paths:
      - http
  cache:
    key: "node_modules"
    paths:
      - ui/node_modules
  script:
    - cd ui
    - yarn install --network-timeout 600000
    - CI=false yarn build
    - mv build ../http
The UI, however, is not the only part of the project. There are other files with their own build processes, so whenever we commit changes to only those other files, this stage gets rerun every time, even if nothing in the ui folder changed.
Is there a way to have GitLab cache the result, or otherwise not rebuild this stage every time, if there were no changes? Any changes that should trigger a rebuild would all be under the ui folder. Ideally it would just use the older build if possible.
This is possible in recent GitLab versions using the rules:changes keyword.
rules:
  - changes:
      - ui/*
Link: https://docs.gitlab.com/ee/ci/jobs/job_control.html#variables-in-ruleschanges
This will check for changes inside the ui folder and trigger the job only when something there has changed.
Check this link for more info: https://docs.gitlab.com/ee/ci/yaml/#ruleschanges
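As a rough sketch of how this could look in the build_node job from the question (note that rules cannot be combined with only in the same job, so the branch restriction is expressed as a rules condition here; the branch names and paths are taken from the question, and ui/**/* is used so that nested files also count as changes):

build_node:
  stage: Build Prerequisites
  image: node:15.5.0
  rules:
    - if: '$CI_COMMIT_BRANCH =~ /^(staging|production|ci)$/'
      changes:
        - ui/**/*
  artifacts:
    paths:
      - http
  cache:
    key: "node_modules"
    paths:
      - ui/node_modules
  script:
    - cd ui
    - yarn install --network-timeout 600000
    - CI=false yarn build
    - mv build ../http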
I have a list of CI jobs running in my GitLab, and the caching does not work as expected:
This is how my docu-generation job ends:
[09:19:33] Documentation generated in ./documentation/ in 4.397 seconds using gitbook theme
Creating cache angular...
00:02
WARNING: frontend/node_modules: no matching files
frontend/documentation: found 136 matching files
No URL provided, cache will be not uploaded to shared cache server. Cache will be stored only locally.
Created cache
Job succeeded
I then start a deployment job (to GitLab Pages), but it fails because it doesn't find the documentation folder:
$ cp -r frontend/documentation .public/frontend
cp: cannot stat 'frontend/documentation': No such file or directory
This is the cache config of the generation job:
generate_docu_frontend:
  image: node:12.19.0
  stage: build
  cache:
    key: angular
    paths:
      - frontend/node_modules
      - frontend/documentation
  needs: ["download_angular"]
and this is for deployment:
deploy_documentation:
  stage: deploy
  cache:
    - key: angular
      paths:
        - frontend/node_modules
        - frontend/documentation
      policy: pull
    - key: laravel
      paths:
        - backend/vendor
        - backend/public/docs
      policy: pull
Does anyone know why my documentation folder is missing?
The message in your job output No URL provided, cache will be not uploaded to shared cache server. Cache will be stored only locally. just means that your runners are not using Amazon S3 to store your cache, or something similar like Minio.
Without S3/Minio, the cache only lives on the runner that first ran the job and cached the resources. This means that the next time the job runs and it happens to be picked up by a different runner, it won't have the cache. In that case, you'd run into an error like this.
There are a couple of ways around this:
Configure your runners to use S3/Minio (Minio has an open source, free-to-use license if you're interested in hosting it yourself).
Only use one runner (not a great solution since generally more runners means faster pipelines and this would slow things down considerably, though it would solve the cache problem).
Use tags. Tags are used to ensure that a job runs on a specific runner (or set of runners). Say, for example, that 1 out of your 10 runners has access to your production servers, but all of them have access to your lower-environment servers. Your lower-env jobs can run on any runner, but your production deployment job has to run on the one runner with prod access. You can do this by putting a tag on that runner called, say, prod-access, and putting the same tag on the prod deploy job. This ensures the job will run on the runner with prod access. The same mechanism can be used here to ensure the cache is available (see the sketch after this list).
Use artifacts instead of cache. I'll explain this option below as it's really what you should be using for this use case.
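As a rough illustration of the tags approach (the docs-cache-runner tag name is made up, and generate_docs.sh is just a stand-in for whatever generates your documentation):

generate_docu_frontend:
  stage: build
  tags:
    - docs-cache-runner # hypothetical tag assigned to one specific runner
  cache:
    key: angular
    paths:
      - frontend/node_modules
      - frontend/documentation
  script:
    - ./generate_docs.sh

deploy_documentation:
  stage: deploy
  tags:
    - docs-cache-runner # same tag, so this job runs on the same runner and can read its local cache
  cache:
    key: angular
    paths:
      - frontend/documentation
    policy: pull
  script:
    - cp -r frontend/documentation .public/frontend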
Let's briefly explain the difference between Cache and Artifacts:
Cache is generally best used for dependency installation, like npm or composer (for PHP projects). When you have a job that runs npm ci or composer install, you don't want it to run every single time your pipeline runs when the dependencies haven't necessarily changed, as that wastes time. Use the cache keyword to cache the dependencies so that subsequent pipelines don't have to install them again.
Artifacts are best used when you need to share files or directories between jobs in the same pipeline. For example, after installing npm dependencies, you might need to use the node_modules directory in another job in the pipeline. Artifacts are also uploaded to the GitLab server by the runner at the end of the job, as opposed to being stored locally on the runner that ran the job. All previous artifacts will be downloaded for all subsequent jobs, unless controlled with either dependencies or needs.
Artifacts are the better choice for your use case.
Let's update your .gitlab-ci.yml file to use artifacts instead of cache:
stages:
  - build
  - deploy

generate_docu_frontend:
  image: node:12.19.0
  stage: build
  script:
    - ./generate_docs.sh # this is just a representation of whatever steps you run to generate the docs
  artifacts:
    paths:
      - frontend/node_modules
      - frontend/documentation
    expire_in: 6 hours # your GitLab instance will have a default, you can override it like this
    when: on_success # don't attempt to upload the docs if generating them failed

deploy_documentation:
  stage: deploy
  script:
    - ls # just an example showing that frontend/node_modules and frontend/documentation are present
    - deploy.sh # whatever else you need to run this job
While trying to run the GitLab pipeline, I am getting an error
"Error: Could not find or load main class Testing\GitLab-Runner\builds\EgKZ847y\0\sandeshmms\LearningSelenium..m2.repository"
Also, it is giving this message:
No URL provided, cache will not be downloaded from shared cache server. Instead a local version of cache will be extracted.
Below is the console message:
Running with gitlab-runner 14.2.0 (58ba2b95)
on my-runner1 EgKZ847y
Preparing the "shell" executor 00:00
Using Shell executor...
Preparing environment
Running on HOMEPC...
Getting source from Git repository 00:10
Fetching changes with git depth set to 50...
Reinitialized existing Git repository in D:/Java Testing/GitLab-Runner/builds/EgKZ847y/0/sandeshmms/LearningSelenium/.git/
Checking out 41ee697d as develop...
git-lfs/2.12.1 (GitHub; windows 386; go 1.14.10; git 85b28e06)
Skipping Git submodules setup
Restoring cache 00:02
Version: 14.2.0
Git revision: 58ba2b95
Git branch: 14-2-stable
GO version: go1.13.8
Built: 2021-08-22T19:47:56+0000
OS/Arch: windows/386
Checking cache for default-14...
Runtime platform arch=386 os=windows pid=5420 revision=58ba2b95 version=14.2.0
No URL provided, cache will not be downloaded from shared cache server. Instead a local version of cache will be extracted.
Successfully extracted cache
Executing "step_script" stage of the job script 00:03
$ echo "Testing Job Triggered"
Testing Job Triggered
$ echo $CI_PROJECT_DIR
D:\Java Testing\GitLab-Runner\builds\EgKZ847y\0\sandeshmms\LearningSelenium
$ mvn $MAVEN_OPTS clean test
Picked up JAVA_TOOL_OPTIONS: -Dfile.encoding=UTF-8
Error: Could not find or load main class Testing\GitLab-Runner\builds\EgKZ847y\0\sandeshmms\LearningSelenium..m2.repository
Uploading artifacts for failed job 00:02
Version: 14.2.0
Git revision: 58ba2b95
Git branch: 14-2-stable
GO version: go1.13.8
Built: 2021-08-22T19:47:56+0000
OS/Arch: windows/386
Uploading artifacts...
Runtime platform arch=386 os=windows pid=4312 revision=58ba2b95 version=14.2.0
WARNING: target/surefire-reports/*: no matching files
ERROR: No files to upload
Cleaning up file based variables 00:01
ERROR: Job failed: exit status 1
Below is the complete YAML file:
stages:
  - test

variables:
  # This will suppress any download for dependencies and plugins or upload messages which would clutter the console log.
  # `showDateTime` will show the passed time in milliseconds. You need to specify `--batch-mode` to make this work.
  MAVEN_OPTS: "-Dmaven.repo.local=$CI_PROJECT_DIR/.m2/repository"

# Cache downloaded dependencies and plugins between builds.
# To keep cache across branches add 'key: "$CI_JOB_NAME"'
cache:
  paths:
    - .m2/repository

test job:
  stage: test
  tags:
    - testing
  script:
    - echo "Testing Job Triggered"
    - echo $CI_PROJECT_DIR
    - 'mvn $MAVEN_OPTS clean test'
    - echo "Testing Job Finished"
  artifacts:
    when: always
    paths:
      - target/surefire-reports/*
But if I remove the variables and cache sections from the YAML file and just use mvn clean test in the script section, then the build runs fine.
Also, it is downloading the Maven repository to 'C:\Windows\System32\config\systemprofile\.m2\repository'. Any reason why it is downloading to this directory?
Can anyone please help with this?
The message No URL provided, cache will not be downloaded from shared cache server. Instead a local version of cache will be extracted. just means that your GitLab instance isn't configured to use a service like AWS S3 or Minio to store your cached items. Without that, the cache can only be stored locally, on the machine where a given gitlab-runner is running. This also means that a cache stored on one runner cannot be shared with another runner, which is most likely how you ran into the error you have. You also don't have a cache key, so the runner doesn't know when to download which cached items.
Here's an example of a job building NPM dependencies that uses the cache with a key for the specific ref name (a branch, commit, or tag):
...
Run NPM Install:
  stage: build
  cache:
    key: $CI_COMMIT_REF_NAME
    paths:
      - node_modules
  script:
    - npm ci
  artifacts:
    paths:
      - node_modules
...
In this job, on the first pipeline for the branch, commit, or tag under CI_COMMIT_REF_NAME, it will run npm ci, cache node_modules, and upload it as an artifact for jobs later in the pipeline to use. However, if a pipeline for the same branch, commit, or tag is run again, the cached node_modules directory is restored instead of being rebuilt from scratch, and is uploaded as an artifact again.
See Caching in GitLab CI/CD for more information on caching, and Distributed Caching for information on using S3 or Minio to distribute your cache across all runners.
I have a .gitlab-ci.yml file which I want to use to run a script for merge request validation. The same script should be used in CI, but only there should the result be published to GitLab Pages. Also, only for CI should the result be cached.
This is a simplified version of the current .gitlab-ci.yml:
pages:
  stage: deploy
  script:
    - mkdir public/
    - touch public/file.txt
  artifacts:
    paths:
      - public
  only:
    - master
  cache:
    paths:
      - fdroid
(The real-world code is in the fdroid-firefox gitlab repo.)
There are two ways the pipeline is triggered, and depending on which one it is, I do or do not want to publish to Pages:
by merge request validation. In this case, I want to execute the script part, but I don't want to publish or cache the result (otherwise, anyone with permission to create a merge request could overwrite the GitLab Pages content).
by CI (which is triggered both after a check-in to the master branch and on a schedule). In this case, I want the result to be cached and the GitLab Pages content to be updated.
I already tried splitting up the stages:
stages:
  - build
  - deploy

build_repo:
  stage: build
  script:
    - mkdir public/
    - touch public/file.txt

pages:
  stage: deploy
  script: echo "publish to Gitlab pages"
  artifacts:
    paths:
      - public
  only:
    - master
  cache:
    paths:
      - fdroid
(Original .gitlab-ci.yml file)
But by doing this, the pages job in the deploy stage failed because it does not have access to the result of the build stage. The job shows an error symbol, and the tooltip says missing pages artifacts. (real world log)
The log says:
Uploading artifacts for successful job
00:01
Uploading artifacts...
WARNING: public: no matching files
ERROR: No files to upload
What am I doing wrong that I don't have access to the result of the build stage?
How can I run the script section in both cases but still deploy to pages only from master branch?
You don't save your public path as artifacts in your build job, and that's why they are missing in the pages job at the next deploy stage.
You have this:
build_repo:
  stage: build
  script:
    - your script
Try to save artifacts in your build job like this:
build_repo:
  stage: build
  script:
    - your script
  artifacts:
    when: always
    paths:
      - public
That way they will be passed on to the deploy stage, and the pages job can see them.
I have a project on GitHub; some issues have already been solved and it has merged pull requests.
I am trying to integrate the project with CircleCI by adding a CircleCI config at the root of the project (I created a new branch and pushed it) as .circleci/config.yml:
version: 2
jobs:
  build:
    working_directory: ~/circleci
    docker:
      - image: circleci/openjdk:8-jdk
    environment:
      MAVEN_OPTS: -Xmx3200m
    steps:
      - checkout
      - restore_cache:
          keys:
            - v1-dependencies-{{ checksum "pom.xml" }}
            - v1-dependencies-
      - run: mvn dependency:go-offline
      - save_cache:
          paths:
            - ~/.m2
          key: v1-dependencies-{{ checksum "pom.xml" }}
      - run: mvn test
And I get this error:
#!/bin/sh -eo pipefail
# No configuration was found in your project. Please refer to https://circleci.com/docs/2.0/ to get started with your configuration.
#
# -------
# Warning: This configuration was auto-generated to show you the message above.
# Don't rerun this job. Rerunning will have no effect.
false
Exited with code 1
It tries to run a job on a merged pull request.
How do I make CircleCI run builds for my new pull request, in which I added the CircleCI config?
P.S. I've tried adding the CircleCI config to my main branch; it doesn't help.
Thanks!
CircleCI will look for a .circleci/config.yml in the branch that triggers a webhook from GitHub.
This means in a PR that the config must exist in the branch, and once merged, will be included in master.
When first added via the UI, CircleCI only looks at master, but subsequent pushes to any branch (as long as .circleci/config.yml is present in that branch) should work.
It looks like your working_directory is set incorrectly. Perhaps you mean ~/.circleci? By the way, people typically set the working directory to the root directory of the project, not the .circleci directory.
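As a rough sketch of that suggestion (the ~/project directory name here is just an illustrative choice, not something CircleCI requires):

version: 2
jobs:
  build:
    working_directory: ~/project # the repository is checked out here by the checkout step
    docker:
      - image: circleci/openjdk:8-jdk
    environment:
      MAVEN_OPTS: -Xmx3200m
    steps:
      - checkout
      - run: mvn test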