Get artifacts from previous GIT jobs - gitlab

I have three stages in my pipeline, and every job in all three stages creates XML data files. These jobs run in parallel.
I want to merge all the XML data files in a fourth stage. Below is my YAML code:
stages:
  - deploy
  - test
  - execute
  - artifact
script:
  - XYZ
artifacts:
  name: datafile.xml
  paths:
    - data/
Problem: how can I collect all the XML files from the previous jobs in order to merge them? The file names are unique.

Here is a .gitlab-ci.yml file that collects artifacts into a final artifact (it takes the files generated by earlier stages and puts them all together).
The key is the needs attribute, which fetches the artifacts from the earlier jobs (with artifacts: true).
stages:
  - stage_one
  - stage_two
  - generate_content
apple:
  stage: stage_one
  script: echo apple > apple.txt
  artifacts:
    paths:
      - apple.txt
banana:
  stage: stage_two
  script: echo banana > banana.txt
  artifacts:
    paths:
      - banana.txt
put_it_all_together:
  stage: generate_content
  needs:
    - job: apple
      artifacts: true
    - job: banana
      artifacts: true
  script:
    - cat apple.txt banana.txt > fruit.txt
  artifacts:
    paths:
      - fruit.txt
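Applied to the layout in the question, the same idea might look like the sketch below. The job names (deploy_job, test_job, execute_job, merge_xml) and the echo commands are placeholders made up for illustration; only the stage names and the data/ path come from the question. Because the file names are unique, the data/ directories from the earlier jobs unpack side by side without overwriting each other.

```yaml
stages:
  - deploy
  - test
  - execute
  - artifact

deploy_job:                  # hypothetical job name
  stage: deploy
  script:
    - mkdir -p data
    - echo "<deploy/>" > data/deploy.xml     # stand-in for the real script
  artifacts:
    paths:
      - data/

test_job:                    # hypothetical job name
  stage: test
  script:
    - mkdir -p data
    - echo "<test/>" > data/test.xml         # stand-in for the real script
  artifacts:
    paths:
      - data/

execute_job:                 # hypothetical job name
  stage: execute
  script:
    - mkdir -p data
    - echo "<execute/>" > data/execute.xml   # stand-in for the real script
  artifacts:
    paths:
      - data/

merge_xml:                   # hypothetical job name
  stage: artifact
  needs:
    - job: deploy_job
      artifacts: true
    - job: test_job
      artifacts: true
    - job: execute_job
      artifacts: true
  script:
    - ls data/*.xml          # all three files are available here
    # replace the line above with the real merge command
  artifacts:
    paths:
      - data/
```

If needs is left out, a job downloads the artifacts of all jobs in earlier stages by default, so listing the jobs explicitly mainly documents the dependency and lets the merge job start as soon as those jobs have finished.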

Related

GitLab pipeline overriding artifacts attribute

We have a base GitLab CI template that defines certain artifacts by default. We now need to include this template in one of our pipelines, and in one job we do not want to pass on the artifacts.
We tried this:
artifacts: []
EDIT
Here is my example:
base.yaml
build:
  stage: build
  script:
    - echo "build..."
  artifacts:
    expire_in: 3 weeks
    reports:
      dotenv: VERSION.env
    paths:
      - '$env:BUILD_OUTPUT_DIR\**\webconfigs'
      - '$env:MSBUILD_OUTPUT_DIR\**\_PublishedWebsites\**\*.zip'
child.yaml
include: 'base.yaml'
build:child:
  extends: [build]
  before_script: []
  script:
    - *run-nuget-restore
    - *build-release
  artifacts: [] # I don't need any of the attributes of the base template, but this does not work
But it is not valid! How can I set the artifacts attribute to empty?
Keys that a job inherits through extends can be excluded by setting them to null.
To exclude a key from the extended content, you must assign it to null...
https://docs.gitlab.com/ee/ci/yaml/yaml_optimization.html#exclude-a-key-from-extends
Example:
.base:
  script: ...
  artifacts:
    paths:
      - ...
test:
  extends: .base
  artifacts: null
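Applied to the child.yaml from the question, this might look as follows. The script line is a placeholder standing in for the question's own build commands; everything else is taken from the question.

```yaml
include: 'base.yaml'

build:child:
  extends: build
  before_script: []
  script:
    - echo "child build..."   # placeholder for the real build steps
  artifacts: null             # drops expire_in, reports and paths inherited from build
```

If only part of the inherited block should go, it should also be possible to redefine artifacts with just the keys that are wanted, but note that extends merges hashes key by key, so keys that are not overridden would still come from the base job.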

How to use the "cache" in gitlab?

I have created the following test pipeline in gitlab:
job1:
  stage: stage1
  rules:
    - if: $RUN != "run2"
      when: always
    - when: never
  script:
    - echo "Job stage1 updating file dates.txt"
    - date >> data/dates.txt
  cache:
    key: statusfiles
    paths:
      - data/dates.txt
job2:
  stage: stage2
  rules:
    - if: $RUN == "run2"
      when: always
  script:
    - echo "Running stage 2"
    - cat data/dates.txt
  cache:
    key: statusfiles
    paths:
      - data/dates.txt
where I want to use the "cache" feature of GitLab.
I first run job1, which updates the file dates.txt by appending an entry, so that it contains two lines.
However, when I run a new pipeline with job2 alone, the file contains only ONE line. It seems to be the original, unmodified file.
How can I "upload" or "save" the file into the cache in job1, so that I can use the updated file in a later run of job2?
First check whether the section "Share caches between jobs in the same branch" is relevant:
To have jobs in each branch use the same cache, define a cache with the key: $CI_COMMIT_REF_SLUG:
cache:
  key: $CI_COMMIT_REF_SLUG
This configuration prevents you from accidentally overwriting the cache.
In your case, from the discussion, and from "How archiving and extracting works":
stages:
  - build
  - test
before_script:
  - echo "Hello"
job A:
  stage: build
  rules:
    - if: $RUN != "run3"
      when: always
    - when: never
  script:
    - mkdir -p data/
    - date > data/hello.txt
  cache:
    key: build-cache
    paths:
      - data/
  after_script:
    - echo "World"
job B:
  stage: test
  rules:
    - if: $RUN == "run3"
  script:
    - cat data/hello.txt
  cache:
    key: build-cache
    paths:
      - data/
By using a single runner on a single machine, you don’t have the issue where job B might execute on a runner different from job A.
This setup guarantees the cache can be reused between stages.
It only works if the execution goes from the build stage to the test stage in the same runner/machine. Otherwise, the cache might not be available.
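Mapped back onto the pipeline from the question, a sketch could look like this. The stages list and the mkdir -p line are additions for illustration; policy is a standard cache keyword, and splitting it into pull-push and pull is only one possible choice. The sketch still relies on job2 landing on a runner that can see the cache job1 wrote (the same runner, or runners that share a distributed cache).

```yaml
stages:
  - stage1
  - stage2

job1:
  stage: stage1
  rules:
    - if: $RUN != "run2"
      when: always
    - when: never
  script:
    - mkdir -p data                 # added: make sure the directory exists
    - date >> data/dates.txt
  cache:
    key: statusfiles
    paths:
      - data/dates.txt
    policy: pull-push               # the default: restore first, save afterwards

job2:
  stage: stage2
  rules:
    - if: $RUN == "run2"
      when: always
  script:
    - cat data/dates.txt
  cache:
    key: statusfiles                # must match the key used by job1
    paths:
      - data/dates.txt
    policy: pull                    # read-only: never overwrite what job1 saved
```

With policy: pull, job2 cannot accidentally replace the cached file with an older copy when it finishes.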

How to delete artifacts directory on gitlab runner after uploading them to gitlab?

I'm trying to create a gitlab job that shows a metric for test code coverage. To do that, I'm creating a .coverage file and placing it in a directory that uploads artifacts. In a subsequent stage the artifacts are downloaded and consumed by a coverage tool to produce a coverage report. I noticed that the artifacts are not deleted when the gitlab runner finishes the job and are bloating my filesystem. How can I remove the artifacts directory after the artifacts are uploaded?
Here's what we currently have
stages:
  - test
  - build
before_script:
  - export GITLAB_ARTIFACT_DIR="$(pwd)"/artifacts
[...]
some-test:
  stage: test
  script:
    - [some script that puts something in ${GITLAB_ARTIFACTS_DIR}
  artifacts:
    expire_in: 4 days
    paths:
      - artifacts/
some-other-test:
  stage: test
  script:
    - [some script that puts something in ${GITLAB_ARTIFACTS_DIR}
  artifacts:
    expire_in: 4 days
    paths:
      - artifacts/
[...]
coverage:
  stage: build
  before_script:
  script:
    - [our coverage script]
  coverage: '/TOTAL.*\s+(\d+%)$/'
  artifacts:
    expire_in: 4 days
    paths:
      - artifacts/
    when: always
[...]
after_script:
  - sudo rm -rf "${GITLAB_ARTIFACT_DIR}"
According to https://gitlab.com/gitlab-org/gitlab-runner/issues/4146, after_script does not have access to environment variables set in before_script or script.
A solution could be to use cache and artifacts simultaneously.
This config creates a new directory named after the job ID ($CI_JOB_ID) for each job execution:
stages:
  - test
remote:
  stage: test
  script:
    - mkdir cache-$CI_JOB_ID
    - echo hello > cache-$CI_JOB_ID/foo.txt
  cache:
    key: build-cache
    paths:
      - cache-$CI_JOB_ID/
  artifacts:
    paths:
      - cache-$CI_JOB_ID/foo.txt
    expire_in: 1 week
On the next run, the previous cache-$CI_JOB_ID directory is removed and replaced by a new one (as $CI_JOB_ID will be different). This keeps only one instance of your cached file until the next job execution.
Note: you need to prefix the directory name with cache-, otherwise the .gitlab-ci.yml is invalid.
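Regarding the after_script limitation mentioned in this answer: one way around it, sketched here, is to declare the directory under variables: instead of exporting it in before_script, since variables defined in the YAML are visible in after_script as well. The job body is a placeholder; GITLAB_ARTIFACT_DIR is the name from the question and CI_PROJECT_DIR is a predefined GitLab variable.

```yaml
variables:
  # Defined here rather than via "export" in before_script,
  # so that after_script can see it too.
  GITLAB_ARTIFACT_DIR: "$CI_PROJECT_DIR/artifacts"

some-test:
  stage: test
  script:
    - mkdir -p "$GITLAB_ARTIFACT_DIR"
    - echo "placeholder" > "$GITLAB_ARTIFACT_DIR/.coverage"   # stand-in for the real test script
  after_script:
    - ls -la "$GITLAB_ARTIFACT_DIR"    # the variable resolves here as well
  artifacts:
    expire_in: 4 days
    paths:
      - artifacts/
```

As far as I know, the runner uploads artifacts only after after_script has finished, so an rm -rf of the artifacts directory at that point would also remove the files from the uploaded artifact. That is worth verifying on your runner version before relying on it for cleanup.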

Gitlab CI: create multiple builds from single commit

My current gitlab configuration is very simple as below
stages:
  - build
before_script:
  - some commands here
build-after-commit:
  stage: build
  script:
    - some command here
  artifacts:
    expire_in: 1 day
    when: on_success
    name: name here
    paths:
      - build/*.zip
I want to run build-after-commit part twice with different settings. I am expecting something like this
stages:
  - build
before_script:
  - some commands here
build-after-commit:
  stage: build
  script:
    - some command here
  artifacts:
    expire_in: 1 day
    when: on_success
    name: name1 here
    paths:
      - build/*.zip
  # run it again with different settings
  stage: build
  script:
    - Different script here
  artifacts:
    expire_in: 1 day
    when: on_success
    name: name2 here
    paths:
      - build/*.zip
So basically, in the second run the script will be different and the name of the output file will be different. How can I do this?
The straightforward approach would be to just have another job in the build stage.
E.g.
stages:
  - build
before_script:
  - some commands here
build-after-commit:
  stage: build
  script:
    - some command here
  artifacts:
    expire_in: 1 day
    when: on_success
    name: name1 here
    paths:
      - build/*.zip
build-after-commit2:
  stage: build
  script:
    - Different script here
  artifacts:
    expire_in: 1 day
    when: on_success
    name: name2 here
    paths:
      - build/*.zip
If you define build-after-commit2 in the same stage (build) it will even be run in parallel to build-after-commit.
In this case, I don't think having two jobs is bad design, as they are actually quite different from each other i.e. different script and different artifact name.
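If the duplicated settings ever become a nuisance, one option (not part of the answer above, just a sketch) is to move the shared keys into a hidden job and pull them in with extends. The name .build-base is hypothetical; since extends merges hashes key by key, each job only has to add its own script and artifact name.

```yaml
stages:
  - build

before_script:
  - some commands here

.build-base:                 # hypothetical hidden job holding the shared settings
  stage: build
  artifacts:
    expire_in: 1 day
    when: on_success
    paths:
      - build/*.zip

build-after-commit:
  extends: .build-base
  script:
    - some command here
  artifacts:
    name: name1 here         # merged into the inherited artifacts block

build-after-commit2:
  extends: .build-base
  script:
    - Different script here
  artifacts:
    name: name2 here
```

Both jobs still run in parallel in the build stage, exactly as with the fully written-out version.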

Stop cleanup between two stages in gitlab-runner

Here is my .gitlab-ci.yml
stages:
  - build
  - unit_test_1
  - unit_test_2
  - perf_test
job1:
  stage: build
  script:
    - bash build.sh
  allow_failure: true
job2:
  stage: unit_test_1
  script:
    - bash ./all/deployment/testframwork/unit_test_1.sh
  allow_failure: true
Here build.sh creates a build and stores all binaries in the build directory. But after job1 completes, this directory is deleted.
I need that directory to run my second job.
How can I achieve this?
Use build artifacts. You should use expire_in with the artifacts so the build dir is not stored in your GitLab forever. To control which jobs receive which artifacts, use dependencies:
job1:
  artifacts:
    paths:
      - build
    expire_in: 1 week
job2:
  dependencies:
    - job1
job3:
  dependencies: []
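Put together with the pipeline from the question, a minimal sketch could look like this. It assumes build.sh writes its output to a directory named build at the project root, as the question describes; the later stages are left out for brevity.

```yaml
stages:
  - build
  - unit_test_1

job1:
  stage: build
  script:
    - bash build.sh
  allow_failure: true
  artifacts:
    paths:
      - build/               # assumed output directory of build.sh
    expire_in: 1 week

job2:
  stage: unit_test_1
  dependencies:
    - job1                   # download only job1's artifacts
  script:
    - bash ./all/deployment/testframwork/unit_test_1.sh
  allow_failure: true
```

Note that artifacts are uploaded only when the job succeeds unless artifacts:when says otherwise, so with allow_failure: true a failed job1 would leave job2 without a build directory.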
