Gitlab-CI avoid unnecessary rebuilds of react portion of project - node.js

I have a stage in my CI pipeline (gitlab-ci) as follows:
build_node:
  stage: Build Prerequisites
  only:
    - staging
    - production
    - ci
  image: node:15.5.0
  artifacts:
    paths:
      - http
  cache:
    key: "node_modules"
    paths:
      - ui/node_modules
  script:
    - cd ui
    - yarn install --network-timeout 600000
    - CI=false yarn build
    - mv build ../http
The UI, however, is not the only part of the project. Other files have their own build processes, so whenever we commit changes that touch only those files, this stage reruns every time, even though nothing in the ui folder changed.
Is there a way to have GitLab cache the result, or otherwise skip the rebuild, when nothing relevant changed? Any change that should trigger a rebuild would be under the ui folder. Can it just reuse the older build?

It is possible in recent GitLab versions using the rules:changes keyword:
rules:
  - changes:
      - ui/**/*
Link: https://docs.gitlab.com/ee/ci/jobs/job_control.html#variables-in-ruleschanges
This will run the job only when something under the ui folder changes. Note that ui/* matches only files directly inside the folder; ui/**/* also matches files in subdirectories.
Check this link for more info: https://docs.gitlab.com/ee/ci/yaml/#ruleschanges
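Applied to the build_node job from the question, this could look like the following sketch (untested; rules: replaces only:, since the two keywords cannot be combined in one job, so the branch filter moves into the rule):
build_node:
  stage: Build Prerequisites
  image: node:15.5.0
  rules:
    # run only on these branches, and only when something under ui/ changed
    - if: '$CI_COMMIT_BRANCH =~ /^(staging|production|ci)$/'
      changes:
        - ui/**/*
  artifacts:
    paths:
      - http
  cache:
    key: "node_modules"
    paths:
      - ui/node_modules
  script:
    - cd ui
    - yarn install --network-timeout 600000
    - CI=false yarn build
    - mv build ../http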

Related

Gitlab CI/CD cache expires and therefore build fails

I have an AWS CDK application in TypeScript and a pretty simple GitLab CI/CD pipeline with 2 stages that takes care of the deployment:
image: node:latest

stages:
  - dependencies
  - deploy

dependencies:
  stage: dependencies
  only:
    refs:
      - master
    changes:
      - package-lock.json
  script:
    - npm install
    - rm -rf node_modules/sharp
    - SHARP_IGNORE_GLOBAL_LIBVIPS=1 npm install --arch=x64 --platform=linux --libc=glibc sharp
  cache:
    key:
      files:
        - package-lock.json
    paths:
      - node_modules
    policy: push

deploy:
  stage: deploy
  only:
    - master
  script:
    - npm run deploy
  cache:
    key:
      files:
        - package-lock.json
    paths:
      - node_modules
    policy: pull
npm run deploy is just a wrapper for the cdk command.
But sometimes the node_modules cache (probably) expires: the deploy stage simply cannot fetch it, and therefore it fails:
Restoring cache
Checking cache for ***-protected...
WARNING: file does not exist
Failed to extract cache
I checked that the cache name is the same as the one built previously by the dependencies stage in the last pipeline run.
I suppose this happens because the CI/CD often does not run for multiple weeks, since I contribute to that repo rarely. I tried to find the root cause but failed miserably. I understand that a cache can expire after some time (30 days by default, from what I found), but I would expect the CI/CD to recover from that by running the dependencies stage even though package-lock.json wasn't updated.
So my question is simply: what am I missing? Is my understanding of caching in GitLab's CI/CD completely wrong? Do I have to turn on some feature switch?
Basically, my ultimate goal is to skip building node_modules as often as possible, but not to fail on a non-existent cache even if I don't run the pipeline for multiple months.
A cache is only a performance optimization and is not guaranteed to always work. Your expectation that the cache expired is most likely correct, so you'll need a fallback in your deploy script.
One thing you could do is change your dependencies job so that it:
- Always runs
- Both pushes and pulls the cache
- Short-circuits if the cache was found
E.g. something like this:
dependencies:
  stage: dependencies
  only:
    refs:
      - master
  script:
    - |
      if [[ -d node_modules ]]; then
        exit 0
      fi
    - npm install
    - rm -rf node_modules/sharp
    - SHARP_IGNORE_GLOBAL_LIBVIPS=1 npm install --arch=x64 --platform=linux --libc=glibc sharp
  cache:
    key:
      files:
        - package-lock.json
    paths:
      - node_modules
If you want to avoid spinning up unnecessary jobs, you could also consider merging the dependencies and deploy jobs, and taking a similar approach in the combined job.
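A sketch of what that combined job could look like, reusing the scripts and cache key from the question (untested; the install branch only runs when the cache did not restore node_modules):
deploy:
  stage: deploy
  only:
    refs:
      - master
  script:
    - |
      # fall back to a fresh install if the cache was missing or expired
      if [[ ! -d node_modules ]]; then
        npm install
        rm -rf node_modules/sharp
        SHARP_IGNORE_GLOBAL_LIBVIPS=1 npm install --arch=x64 --platform=linux --libc=glibc sharp
      fi
    - npm run deploy
  cache:
    key:
      files:
        - package-lock.json
    paths:
      - node_modules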

GitLab CI - Run pipeline when the contents of a file changes

I have a mono-repo with several projects (not my design choice).
Each project has a .gitlab-ci.yml setup to run a pipeline when a "version" file is changed. This is nice because a user can check-in to stage or master (for a hot-fix) and a build is created and deployed to a test environment.
The problem is when a user does a merge from master to stage and commits back to stage (to pull in any hot-fixes). This causes ALL the pipelines to run; even for projects that do not have actual content changes.
How do I allow the pipeline to run from master and/or stage but ONLY when the contents of the "version" file change? Like when a user changes the version number.
Here is an example of the .gitlab-ci.yml (I have 5 of these, 1 for each project in the mono-repo)
#
# BUILD-AND-TEST - initial build
#
my-project-build-and-test:
  stage: build-and-test
  script:
    - cd $MY_PROJECT_DIR
    - dotnet restore
    - dotnet build
  only:
    changes:
      - "MyProject/.gitlab-ci.VERSION.yml"
  # no needs: here because this is the first step

#
# PUBLISH
#
my-project-publish:
  stage: publish
  script:
    - cd $MY_PROJECT_DIR
    - dotnet publish --output $MY_PROJECT_OUTPUT_PATH --configuration Release
  only:
    changes:
      - "MyProject/.gitlab-ci.VERSION.yml"
  needs:
    - my-project-build-and-test
... and so on ...
I am still new to git, GitLab, and CI/pipelines. Any help would be appreciated! (I have little say in changing the mono-repo)
The following .gitlab-ci.yml will run test_job only if the version file changes:
test_job:
  script: echo hello world
  rules:
    - changes:
        - version
See https://docs.gitlab.com/ee/ci/yaml/#ruleschanges
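Applied to one of the jobs from the question, that could look like this sketch (the rules: block replaces the only: block):
my-project-build-and-test:
  stage: build-and-test
  script:
    - cd $MY_PROJECT_DIR
    - dotnet restore
    - dotnet build
  rules:
    # run only when this project's version file changed
    - changes:
        - "MyProject/.gitlab-ci.VERSION.yml"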
See also
Run jobs only/except for modifications on a path or file

GitLab CI job takes 20+ minutes

With gitlab-ci I am using a simple .yml file. I have defined various stages to run synchronously, and I have set a cache for node_modules. But the problem is that the node_modules cache is actually slowing down the process. The cache is required to keep node_modules the same across stages. (Each stage automatically clears node_modules for some reason.)
When building locally, this whole process takes less than 2 minutes. But on the CI machine it takes between 20 and 25 minutes. Looking into how GitLab CI works internally, I learned that it zips the node_modules files (about 36K small files), and that process is extremely slow.
tl;dr: What is the proper way to handle node_modules caching with GitLab CI without uploading node_modules to artifacts? I would like to avoid uploading artifacts that are over 400 MB.
See configuration below:
cache:
  untracked: true
  key: "%CI_COMMIT_REF_NAME%"
  paths:
    - node_modules

stages:
  - install
  - eslint-check
  - eslint
  - prettier
  - test
  - dist

# install dependencies
install:
  stage: install
  script:
    - yarn install
  environment:
    name: development

# run eslint-check
eslint-check:
  stage: eslint-check
  script:
    - yarn eslint-check
  environment:
    name: development

# Other scripts below
# Other scripts below
It would appear that there will be a solution for this in the future, as the issue has been discussed here for almost two years. A milestone has been set, so this should eventually be resolved.
https://gitlab.com/gitlab-org/gitlab-runner/issues/1797
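Until then, one common workaround (a sketch, assuming a Yarn 1 project) is to cache Yarn's package cache instead of node_modules, since it contains far fewer files, and rerun yarn install in each stage so it restores from that local cache rather than the network:
variables:
  # make yarn keep its package cache inside the project directory
  YARN_CACHE_FOLDER: .yarn-cache

cache:
  key: "%CI_COMMIT_REF_NAME%"
  paths:
    - .yarn-cache

# every stage reinstalls, but from the cached packages, not the registry
before_script:
  - yarn install --frozen-lockfile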

Gitlab CI not invoking the 'pages' job

I have a project hosted on GitLab. The project website lives in the pages branch and is a Jekyll-based site.
My .gitlab-ci.yml looks like:
pages:
  script:
    - gem install jekyll
    - jekyll build -d public/
  artifacts:
    paths:
      - public
  only:
    - pages

image: node:latest

cache:
  paths:
    - node_modules/

before_script:
  - npm install -g gulp-cli
  - npm install

test:
  script:
    - gulp test
When I pushed this configuration file to master, the pipeline executed only the test job and not the pages job. I thought maybe pushing to master didn't invoke the pages job because only specifies the pages branch, so I tried pushing to the pages branch, but to no avail.
How can I trigger the pages job?
You're right to assume that the only constraint makes the job run only on the refs or branches specified in the only clause.
See https://docs.gitlab.com/ce/ci/yaml/README.html#only-and-except
It could be that there's a conflict because the branch and the job have the same name. Could you try renaming the job to something different, just to test?
I'd try a couple of things.
First, I'd put this stages snippet at the top of the YML:
stages:
  - test
  - pages
This explicitly tells the CI to run the pages stage after the test stage succeeds.
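Note that declaring stages alone is not enough: each job also needs a stage: key, otherwise it defaults to the test stage. A sketch of how the jobs from the question could be assigned:
stages:
  - test
  - pages

test:
  stage: test
  script:
    - gulp test

pages:
  stage: pages   # runs after test succeeds
  script:
    - gem install jekyll
    - jekyll build -d public/
  artifacts:
    paths:
      - public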
If that doesn't work, I'd remove the only tag and see what happens.
Complementing @rex's answer:
You can do either:
pages:
  script:
    - gem install jekyll
    - jekyll build -d public/
  artifacts:
    paths:
      - public
which will deploy your site regardless of the branch name, or:
pages:
  script:
    - gem install jekyll
    - jekyll build -d public/
  artifacts:
    paths:
      - public
  only:
    - master # or whatever branch you want to deploy Pages from
which will deploy Pages from master.
Pls let me know if this helps :)

Stop gitlab runner to not remove a directory

I have a directory which is generated during a build and it should not be deleted in the next builds. I tried to keep the directory using cache in .gitlab-ci.yml:
cache:
  key: "$CI_BUILD_REF_NAME"
  untracked: true
  paths:
    - target_directory/

build-runner1:
  stage: build
  script:
    - ./build-platform.sh target_directory
In the first build a cache.zip is generated, but in the next builds the target_directory is deleted and the cache.zip is extracted again, which takes a very long time. Here is a log of the second build:
Running with gitlab-ci-multi-runner 1.11.
on Runner1
Using Shell executor...
Running on Runner1...
Fetching changes...
Removing target_directory/
HEAD is now at xxxxx Update .gitlab-ci.yml
From xxxx
Checking out xxx as master...
Skipping Git submodules setup
Checking cache for master...
Successfully extracted cache
Is there a way to make the GitLab runner not remove the directory in the first place?
What you need is to use job artifacts:
"Artifacts is a list of files and directories which are attached to a job after it completes successfully."
.gitlab-ci.yml file:
your job:
  before_script:
    - do something
  script:
    - do another thing
    - do something to generate your zip file (example: myFiles.zip)
  artifacts:
    paths:
      - myFiles.zip
After a job finishes, if you visit the job's specific page, you can see that there is a button for downloading the artifacts archive.
Note
If you need to pass artifacts between different jobs, you need to use dependencies.
GitLab has good documentation about that if you really have this need: http://docs.gitlab.com/ce/ci/yaml/README.html#dependencies
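A minimal sketch of passing a directory between jobs this way (job names are made up for illustration):
build:
  stage: build
  script:
    - ./build-platform.sh target_directory
  artifacts:
    paths:
      - target_directory/

test:
  stage: test
  dependencies:
    - build   # fetch the artifacts produced by the build job
  script:
    - ls target_directory/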
