Consider this .gitlab-ci.yml:
variables:
  var1: "bob"
  var2: "bib"

job1:
  script:
    - "[[ ${var1} == ${var2} ]]"

job2:
  script:
    - echo "hello"
  after_script:
    - "[[ ${var1} == ${var2} ]]"
In this example, job1 fails as expected, but job2 succeeds, which makes no sense to me. Can I force a job to fail in its after_script section?
Note: exit 1 has the same effect as "[[ ${var1} == ${var2} ]]".
The status of a job is determined solely by its script:/before_script: sections (the two are simply concatenated together to form the job script).
after_script: is a completely different construct -- it is not part of the job script. It is mainly for taking actions after a job is completed. after_script: runs even when jobs fail beforehand, for example.
Per the docs: (emphasis added on the last bullet)
Scripts you specify in after_script execute in a new shell, separate from any before_script or script commands. As a result, they:
- Have the current working directory set back to the default (according to the variables which define how the runner processes Git requests).
- Don’t have access to changes done by commands defined in the before_script or script, including:
  - Command aliases and variables exported in script scripts.
  - Changes outside of the working tree (depending on the runner executor), like software installed by a before_script or script script.
- Have a separate timeout, which is hard-coded to 5 minutes.
- Don’t affect the job’s exit code. If the script section succeeds and the after_script times out or fails, the job exits with code 0 (Job Succeeded).
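So if the comparison is meant to decide whether the job fails, it has to run as part of the job script itself; a minimal sketch based on the example above:

job2:
  script:
    - echo "hello"
    # part of script:, so a non-zero exit status here fails the job
    - "[[ ${var1} == ${var2} ]]"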
Related
Problem
I have a GitLab job in a stage that will prompt for a manual trigger. I only want this job to be manually runnable if an automatically run job shows that migrations are needed.
Example
As an example, here job1 looks for a word using grep. I want job2 to be manually triggerable only if the grep finds results, and skipped otherwise.
job1:
  stage: .pre
  script:
    - grep "bob_and_alice" file.txt

job2:
  stage: run-after
  script:
    - echo "Bob and Alice were not in this story"
  when: manual
I don't want to have to source my .sh script every time before I run the packer build command, because I always forget to do this. These tasks are repeatable, so it makes sense to create a shell script for them.
Problem: If the command
$source env.sh
was executed once, I don't want to execute this command again but continue with the others. Is there any solution for that? My script:
#!/bin/bash
set -e
echo "Today is $(date)"
echo "--------------------"
sleep 1.5
echo "1. Pass env variables"
source env.sh
echo "2. Check the configuration of Packer template"
packer validate example.pkr.hcl
echo "3. Build the image"
#packer build example.pkr.hcl
Set a variable inside your script, then test for it at the top of the script. For example:
#!/bin/bash
if [ "${ALREADY_LOADED}" -ne "YES" ]; then
...
# Put commands here
...
ALREADY_LOADED=YES
fi
If you want this to persist to child processes, then use an environment variable instead of a local one, by exporting it. But be aware: Some things are not inherited by child processes. For example, if your script sets an array variable, the child will not be able to see the array. So you may want to leave some commands outside the if...then...fi clause.
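For example, a minimal guard placed inside env.sh itself (the variable names and values below are placeholders):

#!/bin/bash
# env.sh -- only does its work the first time it is sourced in a given shell
if [ "${ALREADY_LOADED}" != "YES" ]; then
  export PKR_VAR_region="eu-west-1"   # placeholder for whatever env.sh actually sets
  export ALREADY_LOADED=YES           # exported so child processes see the guard too
fi

With the guard in place, the wrapper script can source env.sh unconditionally; sourcing it a second time in the same shell is a cheap no-op.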
I have a situation where I need to add extra information to every output line from an execution in a Linux shell environment for which I can control the initialization, but I cannot (do not want to) control the execution of the scripts/commands in that context.
To better explain the problem, imagine that I have a Linux shell environment that is made available for executing steps (jobs) of a CI (continuous integration) pipeline.
The commands executed in the pipeline step are stored in different repositories managed by different people/teams and use the provided shell environment for execution, eg:
echo "command 1"
echo "command 2"
The output for the execution will look like:
command 1
command 2
The output for the commands above should have the pattern <timestamp>:<project_id>:<pipeline_id>:<pipeline_step>:<message> and look like:
143437560909100:9876:123456789:UnitTests:command 1
143437560909110:9876:123456789:UnitTests:command 2
The information in the pattern is available in the shell as:
<timestamp>: simple date +%H%M%S%N shell execution
<project_id>: available as environment variable PROJECT_ID
<pipeline_id>: available as environment variable PIPELINE_ID
<pipeline_step>: available as environment variable PIPELINE_STEP
The logs from a build/release execution are sent to a central logging service that allows me to correlate and analyze the behavior of the pipelines.
As I execute a lot of steps, some in parallel, from different pipelines (pipelines triggered by other pipelines) on different machines and sometimes even on different CI servers, I need to add to every log entry extra information related to the executed CI pipeline. The information to be added and the formatting must be configurable in the platform and without any change necessary to be applied to the source repositories where the pipeline definitions are stored.
The infrastructure information where the CI pipeline steps are executed can be automatically injected by the infrastructure platform (e.g. for Kubernetes K8S based runners information like hostname, pod name, STDOUT/STDERR), while the CI pipeline information must be added by the CI platform.
From man bash:
REDIRECTION
Before a command is executed, its input and output may be
redirected using a special notation interpreted by the shell.
...
This means redirection can be used to forward the output sent to stdout to a string processor that can manipulate the information and send it back to stdout in a modified format.
The commands:
exec {STDOUT}>&1
exec {STDERR}>&2
will assign new file descriptors that are duplicates of the current descriptors 1 (/dev/stdout) and 2 (/dev/stderr) and store the values of those new descriptors in the shell variables $STDOUT and $STDERR respectively.
This allows us to redirect the initial file descriptors 1 and 2, which the executed commands write to, to a different place. That new place can be a small command (or script) that rewrites each log message. Once a message has been reformatted, we redirect it back to the original stdout and stderr, which remain reachable through the new shell variables $STDOUT and $STDERR respectively.
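For example, a quick way to see the descriptor variables in action (the echoed text is only illustrative):

exec {STDOUT}>&1                                   # bash picks a free descriptor (>= 10) and stores it in $STDOUT
echo "original stdout is now also reachable as fd ${STDOUT}"
echo "this line goes to the saved descriptor" >&${STDOUT}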
Now that we know what we have to do, we need to ensure that the redirection is applied to every command execution done in the shell environment.
The exec builtin normally replaces the shell with the given command; when it is given only redirections and no command, the redirections instead take effect in the current shell and apply to every command executed afterwards in that shell.
The final command would look like:
exec {STDOUT}>&1 \
{STDERR}>&2 \
1> >(while IFS= read -r line; do echo "$(date +%H%M%S%N):${PROJECT_ID}:${PIPELINE_ID}:${PIPELINE_STEP}:$line" ; done >&${STDOUT}) \
2> >(while IFS= read -r line; do echo "$(date +%H%M%S%N):${PROJECT_ID}:${PIPELINE_ID}:${PIPELINE_STEP}:$line" ; done >&${STDERR})
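To try this locally, the same exec can sit at the top of a test script with the pipeline variables stubbed (the values below are placeholders):

#!/bin/bash
export PROJECT_ID=9876 PIPELINE_ID=123456789 PIPELINE_STEP=UnitTests   # stub values for local testing

exec {STDOUT}>&1 \
     {STDERR}>&2 \
     1> >(while IFS= read -r line; do echo "$(date +%H%M%S%N):${PROJECT_ID}:${PIPELINE_ID}:${PIPELINE_STEP}:$line" ; done >&${STDOUT}) \
     2> >(while IFS= read -r line; do echo "$(date +%H%M%S%N):${PROJECT_ID}:${PIPELINE_ID}:${PIPELINE_STEP}:$line" ; done >&${STDERR})

echo "command 1"   # printed as <timestamp>:9876:123456789:UnitTests:command 1
echo "command 2"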
P.S. if you want to change all the log entries from the Gitlab CI pipeline job output, just configure the pre-build script for the Gitlab CI runner
RUNNER_PRE_BUILD_SCRIPT='exec {STDIN}>&1 1> >(while IFS= read -r line; do echo "$(date +%H%M%S%N):${PROJECT_ID}:${PIPELINE_ID}:${PIPELINE_STEP}:$line" ; done >&${STDIN})'
The Bash manual says "Each command in a pipeline is executed as a separate process (i.e., in a subshell)". I tested two simple commands.
Scene 1
cd /home/work
str=hello
echo $str | tee a.log
It outputs:
hello
It seems that the echo command is not executed in a subshell, as it can access the non-exported variable $str.
Scene 2
cd /home/work
cd src | pwd
pwd
It outputs:
/home/work
It looks like the cd command is executed in a subshell, as it doesn't affect the working directory of the original shell.
Can anyone explain why the behaviors are not consistent?
Can anyone explain why the behaviors are not consistent?
Well, because this is how it was designed. A "subshell" inherits the whole context, not only exported variables.
Bash manual says "Each command in a pipeline is executed as a separate process (i.e., in a subshell)"
The Bash manual is available here. The sentence you are mentioning literally has a link to solve the mystery:
Each command in a pipeline is executed in its own subshell, which is a separate process (see Command Execution Environment).
Then you can check the "Command Execution Environment", from it (emphasis mine):
The shell has an execution environment, which consists of the following:
...
shell parameters that are set by variable assignment ...
...
...
Command substitution, commands grouped with parentheses, and asynchronous commands are invoked in a subshell environment that is a duplicate of the shell environment, ....
A subshell has all of the environment (well, except traps). Commands, on the other hand:
When a simple command other than a builtin or shell function is to be executed, it is invoked in a separate execution environment that consists of the following. ....
...
shell variables and functions marked for export, ...
If bash pipeline commands run in subshell, why echo command can access the non-exported variables?
Because a subshell inherits the parent environment, including all the non-exported variables.
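A quick way to see the difference in an interactive bash session:

str=hello                        # not exported

# The left side of the pipe runs in a subshell, which is a duplicate of the
# whole shell environment, so the unexported variable is visible:
echo "$str" | cat                # prints: hello

# A separate program is a new execution environment and only receives
# exported variables:
bash -c 'echo "str=[$str]"'      # prints: str=[]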
The documentation isn't clear on whether the after_script is executed for cancelled jobs:
after_script is used to define the command that will be run after for all jobs, including failed ones.
I'm doing potentially critical cleanup in the after_script and while cancelled jobs should be rare, I'd like to know that my clean up is guaranteed to happen.
No. I ran some tests and here are the behaviours I observed:
after_script:
  - echo "This is not executed when a job is cancelled."
  - echo "A failing command, like this one, doesn't fail the job." && false
  - echo "This is not executed because the previous command failed."
1. after_script is not executed when a job is cancelled
There's an open issue for this on gitlab.com, so if this is affecting you, head over there and make some noise.
2. If a command in the after_script fails, the rest aren't executed
This is quite easy to work around:
after_script:
  - potentially failing command || true
  - next command
Replace potentially failing command with your command and the next command will execute regardless of whether potentially failing command passed or failed.
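For instance, with a hypothetical cleanup command in place of the placeholder:

after_script:
  - docker rm -f my-test-container || true   # the trailing || true swallows any failure
  - echo "this still runs even if the previous cleanup failed"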
One could argue that this behaviour is actually desired, as it gives some flexibility to the user, but it might be counterintuitive to some.