How does Gitlab's "pages" job work internally?

I have a Gitlab project like this (.gitlab-ci.yml):
# Sub-jobs listed as comments
stages:
  - check-in-tests        # shellcheck, pylint, unit-tests
  - several-other-things  # foo, bar, baz
  - release               # release

# Run some shell code static tests and generate logs/badges
shellcheck:
  stage: check-in-tests
  script:
    - bash run_shellcheck.sh
  artifacts:
    paths:
      - logs/shellcheck.log
      - logs/shellcheck.svg

# Run some python code static tests and generate logs/badges
pylint:
  stage: check-in-tests
  script:
    - bash run_pylint.sh
  artifacts:
    paths:
      - logs/pylint.log
      - logs/pylint.svg

# <snip>
On my project page I'd like to render the .svg files produced during check-in-tests as badges.
The Gitlab badges tool requires a URL to an image file. It is incapable of loading images from URLs with query strings. Unfortunately, the syntax for accessing specific job artifacts ends in a query string. This effectively means that we can't link to job artifacts as badges.
The most popular workaround is to abuse Gitlab's pages feature to store artifacts as static content. From there we can get clean URLs to our artifacts that don't contain query strings.
My confusion involves the underlying mechanism behind the "pages" job defined in .gitlab-ci.yml. The official documentation here is very sparse. There are a million examples for deploying an actual webpage with various frameworks, but I'm not interested in any of them since I'm just using my project's "page" for file hosting.
The assumption seems to be that I want to deploy my page at the end of the pipeline. However, I want to upload the shellcheck and pylint artifacts near the beginning of the pipeline. Furthermore, I want those artifacts to be uploaded even if the pipeline stages fail.
Syntactically the pages job looks identical to any other job. There's nothing there to describe how it's magically picked up by Gitlab's internals. This leaves me with the following questions:
Can I change the stage from "deploy" to "check-in-tests", or is the deploy stage specifically part of the hidden magic that Gitlab looks for when parsing for a pages job?
If I'm tied to the deploy stage, can I re-arrange the stages to make it come earlier in the pipeline without breaking the magic?
Does the pages job deploy artifacts from the local machine (default behavior for a job), or are the listed paths coming from artifacts which have already been uploaded to the Gitlab pipeline by earlier jobs?
If the pages job is only looking for artifacts locally how can I ensure that it runs on the same machine as the earlier jobs so that it finds the artifacts which they produced? Let's assume that the Gitlab executors all come from a pool with the same tag and aren't tagged individually.
Is there any chance of getting the pages job to run within the same Docker container that originally produced the artifacts?

The magic around GitLab Pages is in the name of the job: it has to be named "pages", and nothing else. It is possible to move the job to a different stage. As soon as the "pages" job has finished successfully, a special job called "pages:deploy" runs; it is shown in the deploy stage even if you change the stage that the "pages" job runs in.
If you put the "pages" job in an early stage, jobs in later stages can fail, but the "pages:deploy" job will still run and update GitLab Pages.
Other than that, the "pages" job is just like a normal job in GitLab. If you need artifacts from other jobs, you can get these by using artifacts and dependencies:
https://docs.gitlab.com/ee/ci/yaml/#dependencies
The "pages" job should create a folder named "public" and give that folder as an artifact.

Related

Gitlab external job

I need to integrate security scans of my project files (SAST) in my Gitlab CI/CD pipeline, and it's easy to do with just another job in .gitlab-ci.yml, like:
security-scan:
  stage: test
  image: my_image:latest
  script:
    - scan run project/folder
But the problem is that developers can easily comment out this part of the config and prevent the job from running.
How can I create some kind of external job which will always be run by a trigger and which developers would not be able to modify?
I found this discussion on the Gitlab forum, but I don't get it.

Gitlab CI: Why next stage is allowed to run

I have one gitlab CI file like this
stages:
  - build
  - deploy

build-job:
  stage: build
  script:
    - echo "Compiling the code..."
    - echo "Compile complete."
  when: manual

deploy-bridge:
  stage: deploy
  trigger:
    project: tests/ci-downstream
What I understand is that the deploy-bridge job should not run unless the manual build-job has run successfully, but that is not the case here. Is this normal?
Jobs in the same stage run in parallel. Jobs in the next stage run after the jobs from the previous stage complete successfully.
You're not defining your deploy-bridge job as a dependent job, or that it needs another job to finish first, so it can run right away as soon as it reaches the stage. Since the previous stage is all manual jobs, GitLab CI/CD sort of interprets it as 'done', at least enough so that other stages can start.
Since it doesn't look like you're uploading the compiled code from build-job as an artifact, we can't use the dependencies keyword here. All that keyword does is control which jobs' artifacts this job needs; if it needs the artifacts of a prior job, that job will need to run and finish successfully for this job to start. By default, all available artifacts from all prior jobs are downloaded and available to all jobs in the pipeline; the dependencies keyword can also be used to limit which artifacts this job actually needs. However, if there are no artifacts available in the job we "depend" on, it will throw an error. Luckily there's another keyword we can use.
The needs keyword controls the "flow" of the pipeline, so much so that if a job anywhere in the pipeline (even in the last of, say, 1,000 stages) has needs: [], it will run as soon as the pipeline starts (and as soon as there is an available runner). We can use needs here to make the pipeline flow the way you need.
...
deploy-bridge:
  stage: deploy
  needs: ['build-job']
  trigger:
    project: tests/ci-downstream
Now, the deploy-bridge job won't run until the build-job has finished successfully. If build-job fails, deploy-bridge will be skipped.
One other use for needs is that it has the same functionality as dependencies, in that it can control what artifacts are downloaded in which jobs, but it won't fail if the "needed" jobs don't have artifacts at all.
Both dependencies and needs accept an empty array which equates to 'don't download any artifacts' and for needs, run as soon as a runner is available.
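For example, a job declared like this (the job name is made up for illustration) starts as soon as the pipeline does, even though it sits in the last stage:
smoke-test:
  stage: deploy
  needs: []   # downloads no artifacts and runs immediately once a runner is free
  script:
    - echo "I don't wait for any earlier stage"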

How can I prevent Gitlab CI multiple yml includes from overriding a stage's jobs?

In my Gitlab project, I'm including multiple .yml files. One of them is remote and the other is a template provided by Gitlab for Code Quality.
The .yml configuration is written like so:
include:
  - template: Code-Quality.gitlab-ci.yml
  - remote: 'https://raw.githubusercontent.com/checkmarx-ltd/cx-flow/develop/templates/gitlab/v3/Checkmarx.gitlab-ci.yml'
Both of these templates are accessible. The first is located here, and the second Checkmarx one is here.
Both of these .yml configs define jobs that run in the test pipeline stage.
I'm having an issue where only the second include's jobs are running in the test stage, and the Gitlab Code Quality job is completely ignored. If I remove the external Checkmarx include, the Code Quality job runs just fine.
Normally I would just define separate stages, but since these .yml files do not belong to me, I cannot change the stage in which they run.
Is there a way to ensure the jobs all run in the test stage? If not, is there a way to override the stage a job from an external .yml runs in?
Oddly, there seems to be some sort of rules conflict between the two templates, possibly due to the variables that the checkmarx template sets. Even though the CI Lint shows that all 4 jobs should run successfully, I can reproduce your issue with the above code.
Given that it's likely a rules issue, I overrode the rules for running the code_quality job and was able to get both running within the same pipeline:
include:
  - template: Code-Quality.gitlab-ci.yml
  - remote: 'https://raw.githubusercontent.com/checkmarx-ltd/cx-flow/develop/templates/gitlab/v3/Checkmarx.gitlab-ci.yml'

code_quality:
  rules:
    - when: on_success
You can lint the above changes to confirm they're successful (though GitLab will warn you that without any workflow:rules, you'll wind up with duplicate pipelines inside MRs, which is true).
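If you want to silence that warning, a common workflow:rules block (a generic sketch, not part of either template) switches between branch and merge request pipelines so only one runs:
workflow:
  rules:
    # Run MR pipelines, but skip branch pipelines when an MR is already open for the branch
    - if: $CI_PIPELINE_SOURCE == "merge_request_event"
    - if: $CI_COMMIT_BRANCH && $CI_OPEN_MERGE_REQUESTS
      when: never
    - if: $CI_COMMIT_BRANCH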
You can also see the pipeline running with both jobs here, though checkmarx fails because I don't have a subscription to test it with.

Define no-sources files for Gitlab CI

Is it possible to specify non-source files that should not trigger Gitlab CI?
When I make changes in README.md, the pipeline triggers, even though that file is only documentation inside GitLab and is not packaged into any output artifact.
You can control when each of your jobs is added to a running pipeline using the only, except, or rules keywords. The easiest way to prevent jobs from running when only the README is changed is with except:
build_job:
  stage: build
  except:
    changes:
      - README.md
With this job syntax, if the only file that changes in a push is README.md, this job will not run. Unfortunately, you can only set these rules at the job level, not the pipeline level, so you'd have to put this in each of your jobs to prevent them all from running.
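If you prefer the newer rules syntax, a rough equivalent (an untested sketch) would be:
build_job:
  stage: build
  rules:
    # Skip the job when README.md is among the changed files
    - changes:
        - README.md
      when: never
    - when: on_success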

In Gitlab CI, can you "pull" artifacts up from triggered jobs?

I have a job in my gitlab-ci.yml that triggers an external Pipeline that generates artifacts (in this case, badges).
I want to be able to get those artifacts and add them as artifacts to the bridge job (or some other job) on my project so that I can reference them.
My triggered job looks like this:
myjob:
  stage: test
  trigger:
    project: other-group/other-repo
    strategy: wait
I'd like something like this:
myjob:
  stage: test
  trigger:
    project: other-group/other-repo
    strategy: wait
  artifacts:
    # how do I get artifacts from the job(s) on other-repo?
    paths:
      - badge.svg
Gitlab has an endpoint that can be used as the badge URL, for downloading the artifact from the latest pipeline/job of a project:
https://gitlabserver/namespace/project/-/jobs/artifacts/master/raw/badge.svg?job=myjob
Is there a way to get the artifacts from the triggered job and add them to my project?
The artifacts block only archives files from the current job's workspace. To get artifacts from a different job, you would handle that in the script section; once you have the artifact, you can archive it within the artifacts block as usual.
You can use wget to download artifacts from a different project, as described here.
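As a rough sketch (the job name, token variable, and server URL are assumptions), a follow-up job could fetch the badge through the jobs-artifacts API and re-archive it in your own project:
collect-badge:
  stage: test
  needs: ['myjob']   # wait for the bridge job that triggered the downstream pipeline
  script:
    # Download the artifact from the other project's latest successful master run.
    # READ_API_TOKEN is an assumed CI/CD variable with read access to that project.
    - wget --header "PRIVATE-TOKEN: $READ_API_TOKEN" -O badge.svg "https://gitlabserver/api/v4/projects/other-group%2Fother-repo/jobs/artifacts/master/raw/badge.svg?job=myjob"
  artifacts:
    paths:
      - badge.svg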
I know it's a bit late, but maybe this could help.
Add this to your job; it tells the job that it needs the artifacts from a specific project.
(You need to be an owner of the project.)
needs:
  - project: <FULL_PATH_TO_PROJECT>   # without the hosting website
    job: <JOB_NAME>
    ref: <BRANCH>
    artifacts: true
