How can I avoid downloading the source code for every job in an Azure DevOps pipeline? In other words, how can I download the source code once and then use it in all jobs? I set up a parallel launch of my jobs in the pipeline, and now I have to spend time downloading the code in every job. Thanks.
How can I download the source code once, and then use it in all jobs?
If you use Microsoft-hosted agents, it cannot be done. Each job in your pipeline gets a fresh virtual machine when the pipeline runs, and that virtual machine is discarded after one use, so the source code downloaded in one job is not available to another job.
However, it is possible with a self-hosted agent. You can try creating a self-hosted agent and running your pipeline on it. See the example below.
I have the below pipeline for testing on my self-hosted agent:
pool: Default  # run pipeline on self-hosted agent

stages:
- stage: Build
  jobs:
  - job: A
    steps:
    - checkout: self
    - powershell: |
        echo "job1" > job1.txt
        ls
  - job:
    dependsOn: A
    steps:
    - checkout: none
    - powershell: |
        echo "job2" > job2.txt
        ls
See the output of the second powershell task: the source code is downloaded only once, in the first job, and the following jobs can use it too.
If you want to skip downloading the source code for your whole pipeline, you can follow the steps below.
Click the 3 dots on your YAML pipeline edit page --> select Triggers --> go to the YAML tab --> go to the Get sources section --> check Don't sync sources.
But if you want to download the source code in only some of the jobs, you can add a script task in those jobs that runs git clone to clone the source (e.g. git clone https://$(System.AccessToken)@dev.azure.com/org/pro/_git/rep), as sketched below.
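A rough sketch of such a job (the job and display names are made up, the repository URL is the placeholder from above, and this assumes the job access token has read access to the repository):

- job: CloneManually
  steps:
  - checkout: none        # skip the built-in checkout for this job
  - script: |
      git clone https://$(System.AccessToken)@dev.azure.com/org/pro/_git/rep
    displayName: 'Clone source with git'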
If you want to skip downloading the source code for some of your jobs, you can also use the checkout step (i.e. checkout: none):
stages:
- stage: Build
  jobs:
  - job:
    steps:
    - checkout: none  # skip downloading the source in this job
  - job:
    steps:
    - checkout: self  # download the source in this job
This is not possible, because of how jobs work:
A stage contains one or more jobs. Each job runs on an agent. A job represents an execution boundary of a set of steps. All of the steps run together on the same agent. For example, you might build two configurations - x86 and x64. In this case, you have one build stage and two jobs.
So technically they run on separate machines.
The question is whether you need the source code in all of those jobs. If not, you can disable downloading the source code in a job by adding the step checkout: none.
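For instance, a minimal sketch of a job that skips the checkout (the job name is only illustrative):

- job: NoSource
  steps:
  - checkout: none   # do not download the repository in this job
  - script: echo "this job does not need the source code"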
Related
I am working on a multi-pipeline project and am using the trigger keyword to trigger a downstream pipeline, but I'm not able to pass artifacts created in the upstream project. I am using needs to get the artifact, like so:
Downstream Pipeline block to get artifacts:
needs:
  - project: workspace/build
    job: build
    ref: master
    artifacts: true
Upstream Pipeline block to trigger:
build:
  stage: build
  artifacts:
    paths:
      - ./policies
    expire_in: 2h
  only:
    - master
  script:
    - echo 'Test'
  allow_failure: false

triggerUpstream:
  stage: deploy
  only:
    - master
  trigger:
    project: workspace/deploy
But I am getting the following error:
This job depends on other jobs with expired/erased artifacts:
I'm not sure what's wrong.
It looks like there is a problem sharing artifacts between pipelines as well as between projects. It is a known bug and has been reported here:
https://gitlab.com/gitlab-org/gitlab/-/issues/228586
You can find a workaround there, but since it requires adding an access token to the project, it is not the best solution.
Your upstream pipeline job build is set to store its artifacts for only 2 hours (from the expire_in: 2h line). Your downstream pipeline must have run at least 2 hours after the artifacts were created, so the artifacts expired and were erased, generating that error.
To solve it, you can either update the expire_in field to however long you need the artifacts to remain available (for example, if you know the downstream pipeline will run up to 5 days later, set it to 5d), or rerun the build job to recreate the artifacts.
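For example, a minimal sketch of the upstream job with a longer retention window (the 5d value is only an illustration):

build:
  stage: build
  script:
    - echo 'Test'
  artifacts:
    paths:
      - ./policies
    expire_in: 5d   # keep the artifacts long enough for the downstream pipeline to consume them
  only:
    - master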
You can read more about the expire_in keyword and artifacts in general in the docs.
It isn't a problem with expired artifacts; the error message is misleading. In my case I am able to download the artifacts as a zip directly from the UI on the executed job. My expire_in is set to 1 week, yet I am still getting this message.
I am using an Open-Source project Magda (https://magda.io/docs/building-and-running) and want to make an Azure CI/CD Pipeline.
For this project, there are some prerequisites like having sbt + yarn + docker + java installed.
How can I specify those requirements in the azure-pipelines.yml file?
Is it possible, in the azure-pipelines.yml file, to just write scripts, without any use of jobs or tasks? And what's the difference between them (tasks, jobs, ...)?
(I'm currently starting with it, so I don't have much experience)
This is my current azure-pipelines.yml file (if there is something wrong, please tell me):
# Node.js
# Build a general Node.js project with npm.
# Add steps that analyze code, save build artifacts, deploy, and more:
# https://learn.microsoft.com/azure/devops/pipelines/languages/javascript

trigger:
- release

pool:
  vmImage: 'ubuntu-latest'

steps:
- task: NodeTool@0
  inputs:
    versionSpec: '10.0.0'
  displayName: 'Install Node.js'

- script: |
    npm install
    npm run build
  displayName: 'npm install and build'

- script: |
    curl -fsSL -o get_helm.sh https://raw.githubusercontent.com/helm/helm/master/scripts/get-helm-3
    chmod 700 get_helm.sh
    ./get_helm.sh
  displayName: 'Install Helm'

- script: |
    yarn global add lerna
    yarn global add @gov.au/pancake
    yarn install
  displayName: 'Install lerna & pancake packages'

- script: |
    export NODE_OPTIONS=--max-old-space-size=8192
  displayName: 'Set env variable'

- script: |
    lerna run build --stream --concurrency=1 --include-dependencies
    lerna run docker-build-local --stream --concurrency=4 --include-filtered-dependencies
  displayName: 'Build lerna'
I recommend you read Key concepts for new Azure Pipelines users.
It is possible to put all your stuff in one script step, but keeping logically separate steps makes the file easier to navigate and read than one really long step.
Here are some basics from the above-mentioned documentation:
A trigger tells a Pipeline to run.
A pipeline is made up of one or more stages. A pipeline can deploy to one or more environments.
A stage is a way of organizing jobs in a pipeline and each stage can have one or more jobs.
Each job runs on one agent. A job can also be agentless.
Each agent runs a job that contains one or more steps.
A step can be a task or script and is the smallest building block of a pipeline.
A task is a pre-packaged script that performs an action, such as invoking a REST API or publishing a build artifact.
An artifact is a collection of files or packages published by a run.
But I really recommend that you go through it.
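To make that hierarchy concrete, here is a minimal skeleton (the stage, job, and step names are only illustrative):

trigger:
- master

pool:
  vmImage: 'ubuntu-latest'

stages:
- stage: Build              # a stage groups jobs
  jobs:
  - job: BuildJob           # a job runs on one agent
    steps:
    - task: NodeTool@0      # a task step: a pre-packaged script
      inputs:
        versionSpec: '10.x'
    - script: echo "a plain script step"   # a script step
      displayName: 'Run a script'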
For this project, there are some prerequisites like having sbt + yarn + docker + java installed. How can I specify those requirements in the azure-pipelines.yml file?
If you are using Microsoft-hosted agents, you cannot specify demands:
Demands and capabilities apply only to self-hosted agents. When using Microsoft-hosted agents, you select an image for the hosted agent. You cannot use capabilities with hosted agents.
So if you need something that is not already on the agent, you can install it in a step and use that new piece of software; when your job is finished, the agent is restored to its original state. If you go for a self-hosted agent, you can specify demands, and your job will be assigned to an agent whose capabilities satisfy them.
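A rough sketch of demands on a self-hosted pool (the pool name and capability names are made up and assume the agents actually advertise those capabilities):

pool:
  name: MySelfHostedPool    # hypothetical self-hosted agent pool
  demands:
  - java                    # the agent must advertise a 'java' capability
  - yarn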
I'm trying to run 2 pipelines for a project in GitLab, but I can't find any way to do it.
In GitLab CI you can't explicitly create multiple pipelines for one project. There are cases where multiple pipelines will run simultaneously, such as when you have jobs that run only for merge requests and other jobs that do not.
That said, there are ways to obtain the effect of running multiple series of jobs independently from one another.
The hacky way, before gitlab-ce 12.2
If you want to start 2 pipelines for the same project, you can use pipeline triggers. This method is limited to 2 pipelines, and GitLab CI is not meant to be used this way; usually, triggers are used to start pipelines on another project.
All in your .gitlab-ci.yml:
stages:
  - start
  - build

###################################################
#                First pipeline                   #
###################################################
start_other_pipeline:
  stage: start
  script:
    # Trigger a pipeline for current project on current branch
    - curl --request POST --form "token=$CI_JOB_TOKEN" --form ref=$CI_COMMIT_REF_NAME $CI_API_V4_URL/projects/$CI_PROJECT_ID/trigger/pipeline
  except:
    - pipelines

build_first_pipeline:
  stage: build
  script:
    - echo "Building first pipeline"
  except:
    - pipelines

###################################################
#                Second pipeline                  #
###################################################
# Will run independently of the first pipeline.
build_second_pipeline:
  stage: build
  script:
    - echo "Building second pipeline"
  only:
    - pipelines
To clean up this mess of a .gitlab-ci.yml, you can use the include keyword:
# .gitlab-ci.yml
include:
  - '/first-pipeline.yml'
  - '/second-pipeline.yml'

stages:
  - start
  - build

# This starts the second pipeline. The first pipeline is already running.
start_other_pipeline:
  stage: start
  script:
    # Trigger a pipeline for current project on current branch
    - curl --request POST --form "token=$CI_JOB_TOKEN" --form ref=$CI_COMMIT_REF_NAME $CI_API_V4_URL/projects/$CI_PROJECT_ID/trigger/pipeline
  except:
    - pipelines

# first-pipeline.yml
build_first_pipeline:
  stage: build
  script:
    - echo "Building first pipeline"
  except:
    - pipelines

# second-pipeline.yml
build_second_pipeline:
  stage: build
  script:
    - echo "Building second pipeline"
  only:
    - pipelines
The reason this works is the use of only and except in the jobs. The jobs marked with
except:
  - pipelines
do not run when the pipeline has started because of a trigger coming from another pipeline, so they don't run in the second pipeline. On the other hand,
only:
  - pipelines
does the exact opposite, therefore those jobs run only when the pipeline is triggered by another pipeline, so they only run in the second pipeline.
The probably right way, depending on your needs ;)
Since GitLab CE 12.2, it is possible to define a Directed Acyclic Graph to specify the order in which your jobs run. This way, a job can start as soon as the job it depends on (declared with needs) finishes.
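As a small sketch of that (the job names are only illustrative), two independent chains can run side by side in the same pipeline:

stages:
  - build
  - test

build_a:
  stage: build
  script: echo "build a"

build_b:
  stage: build
  script: echo "build b"

test_a:
  stage: test
  needs: ["build_a"]   # starts as soon as build_a finishes, regardless of build_b
  script: echo "test a"

test_b:
  stage: test
  needs: ["build_b"]   # starts as soon as build_b finishes, regardless of build_a
  script: echo "test b"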
As of GitLab 12.7 it is also possible to use parent-child pipelines for this:
# .gitlab-ci.yml
trigger_child:
  trigger:
    include: child.yml

do_some_stuff:
  script: echo "doing some stuff"

# child.yml
do_some_more_stuff:
  script: echo "doing even more stuff"
The trigger_child job completes successfully once the child pipeline has been created.
In a project I'm running two stages with these jobs:
build
  compile & test
  generate sonar report
deploy
  deploy to staging environment [manual]
  deploy to production [manual]
The jobs in the deploy stage depend on the outputs of the compile & test job. However the generate sonar report job is not required to finish before I can start any job in the deploy stage. Nevertheless, GitLab insists that all jobs in the build phase have finished before I can launch any job in the deploy phase.
Is there a way I can tell GitLab that the generate sonar report job should not block subsequent pipeline stages? I already tried allow_failure: true on this job but this does not seem to have the desired effect. This job takes a long time to finish and I really don't want to wait for it all the time before being able to deploy.
We have a similar situation, and while we do use allow_failure: true, this does not help when the Sonar job simply takes a long time to run, whether it fails or succeeds.
Since you do not want your deploy stage to be gated by the outcome of the generate sonar report job, I suggest moving the generate sonar report job to the deploy stage, so your pipeline becomes:
build
  compile & test
deploy
  deploy to staging environment [manual]
  deploy to production [manual]
  generate sonar report [allow_failure: true]
This way the generate sonar report job does not delay your deploy stage jobs.
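A rough .gitlab-ci.yml sketch of that layout (the job names follow the list above and the script bodies are placeholders):

stages:
  - build
  - deploy

compile_and_test:
  stage: build
  script:
    - echo "compile and run the tests here"   # placeholder

deploy_staging:
  stage: deploy
  when: manual
  script:
    - echo "deploy to staging here"           # placeholder

deploy_production:
  stage: deploy
  when: manual
  script:
    - echo "deploy to production here"        # placeholder

generate_sonar_report:
  stage: deploy
  allow_failure: true    # a slow or failing sonar job no longer blocks anything
  script:
    - echo "run the sonar analysis here"      # placeholder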
The other benefit of running generate sonar report after build & test is that you can save coverage reports from the build & test job as GitLab job artifacts, and then have the generate sonar report job consume them as dependencies, so Sonar can monitor your coverage too.
Finally, we find it useful to separate build & test into build, then test, so we can tell build failures apart from test failures, and we can then also run multiple test jobs in parallel, all in the same test stage. Note that you will need to convey the artifacts from the build job to the test job(s) via GitLab job artifacts & dependencies if you choose to do this, roughly as sketched below.
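A minimal sketch of passing the build output to parallel test jobs via artifacts and dependencies (all job names, paths, and commands here are placeholders):

stages:
  - build
  - test

build:
  stage: build
  script:
    - echo "build the project here"            # placeholder
    - mkdir -p build && echo "binary" > build/app
  artifacts:
    paths:
      - build/

unit_tests:
  stage: test
  dependencies: ["build"]   # download the build job's artifacts
  script:
    - test -f build/app && echo "run the unit tests here"

integration_tests:
  stage: test
  dependencies: ["build"]
  script:
    - test -f build/app && echo "run the integration tests here"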
From my point of view, it depends on your stage semantics. You should decide what is more important in your pipeline: clarity of the stages, or getting the job done quickly.
GitLab has many handy features, like the needs keyword, which you can use to specify direct edges on the dependency graph.
stages:
  - build
  - deploy

build:package:
  stage: build
  script:
    - echo "compile and test"
    - mkdir -p target && echo "hello" > target/file.txt
  artifacts:
    paths:
      - ./**/target

build:report:
  stage: build
  script:
    - echo "consume the target artifacts"
    - echo "waiting for 120 seconds to continue"
    - sleep 120
    - mkdir -p target/reports && echo "reporting" > target/reports/report.txt
  artifacts:
    paths:
      - ./**/target/reports

deploy:
  stage: deploy
  needs: ["build:package"]
  script:
    - echo "deploy your package on remote site"
    - cat target/file.txt
Unless I'm mistaken, this is currently not possible; there is an open feature proposal, and another similar one, to add what you are suggesting.
We have a project hosted on an internal Gitlab installation.
The Pipeline of the project has 3 stages:
Build
Tests
Deploy
The objective is to hide or disable the Deploy stage when Tests fail.
The problem is that we can't use artifacts because they are lost each time our machines reboot.
My question: Is there an alternative solution to artifacts to achieve this task?
The used .gitlab-ci.yml looks like this:
stages:
  - build
  - tests
  - deploy

build_job:
  stage: build
  tags:
    # - ....
  before_script:
    # - ....
  script:
    # - ....
  when: manual
  only:
    - develop
    - master

all_tests:
  stage: tests
  tags:
    # - ....
  before_script:
    # - ....
  script:
    # - ....
  when: manual
  only:
    - develop
    - master

prod:
  stage: deploy
  tags:
    # - ....
  script:
    # - ....
  when: manual
  environment: prod
I think you might have misunderstood the purpose of the built-in CI. The goal is to have building and testing automated on each commit, or at least on every push. Having all tasks set to manual execution gives you almost no advantage over external CI tools like Jenkins or Bamboo; your only advantage over running the targets locally right now is having visibility in a central place.
That said, there is no way to conditionally show or hide CI jobs, because it's against the basic idea. If you insist on your approach, you could look up the artifacts of the previous stages and abort the manual execution in case something is wrong.
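As a very rough sketch of that workaround (the marker file name is made up, the commands are placeholders, and it assumes the tests job writes the marker as an artifact only when the tests pass):

all_tests:
  stage: tests
  when: manual
  script:
    - echo "run the real test commands here"   # placeholder for the actual tests
    - touch tests_passed                        # only reached if the tests above succeeded
  artifacts:
    paths:
      - tests_passed

prod:
  stage: deploy
  when: manual
  dependencies: ["all_tests"]   # fetch the marker artifact from the tests job
  script:
    - test -f tests_passed || { echo "tests did not pass, aborting"; exit 1; }
    - echo "run the real deploy commands here"  # placeholder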
The problem is that we can't use artifacts because they are lost each time our machines reboot
AFAIK artifacts are uploaded to the GitLab server and not saved on the runners, so you should be fine having your artifacts passed from stage to stage.
By the way, the default for when is on_success, which means a job is executed only when all jobs from prior stages succeed.
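So, as a sketch of that idea (reusing the prod job from the question, with a placeholder script), simply dropping when: manual lets the default gate the deploy on the earlier stages succeeding:

prod:
  stage: deploy
  when: on_success   # the default: run only if all jobs in prior stages succeeded
  environment: prod
  script:
    - echo "run the real deploy commands here"   # placeholder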