I have the following logic in my GitLab-ci:
stages:
  - build
  - deploy

job_make_zip:
  tags:
    - test123
  image: node:10.19
  stage: build
  script:
    - npm install
    - make
    - make source-package
  cache:
    key: ${CI_COMMIT_REF_SLUG}
    paths:
      - node_modules/
  artifacts:
    when:
    paths:
      - test.bz2
    expire_in: 2 days
When the job runs, I see the following message:
Restoring cache
Checking cache for master...
No URL provided, cache will not be downloaded from shared cache server. Instead a local version of cache will be extracted.
Successfully extracted cache
I'm new to GitLab, so I can't tell whether this is an error or not. I basically don't want to download the same npm modules every single time this build runs.
I found a similar post here: GitLab CI caching key
But I'm already using the correct gitlab CI variable.
Any suggestions would be appreciated.
In my GitLab CI setup at home I get this warning in all of my build jobs, and in my case I don't consider it an error. According to https://gitlab.com/gitlab-org/gitlab/-/issues/201861 and https://gitlab.com/gitlab-org/gitlab-runner/-/issues/16097, though, there are cases where the message should be taken seriously.
That is especially true if the cache is uploaded to (and later downloaded and extracted from) a particular URL that several runners use to share and sync it. In the general case, where the cache is stored on a single GitLab Runner rather than on a shared location used by several runners, I don't think the message has any real meaning. On my runners, which are usually project- or group-specific, this was never a problem and the cache was always extracted properly from the local copy.
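To check that the local cache really is being reused, a job can report whether node_modules/ already exists before npm install runs. This is only a sketch based on the job from the question; the echo messages are my own additions, and nothing here changes where the runner stores the cache.

```yaml
job_make_zip:
  tags:
    - test123
  image: node:10.19
  stage: build
  cache:
    key: ${CI_COMMIT_REF_SLUG}
    paths:
      - node_modules/
  script:
    # Diagnostic only: tells you whether the cache extraction restored anything.
    - |
      if [ -d node_modules ]; then
        echo "node_modules restored from cache"
      else
        echo "no cached node_modules yet (expected on the first run for this key)"
      fi
    - npm install
```

If the first message shows up on the second and later runs of the same branch on the same runner, the warning can be ignored for that setup.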
Related
I am self-hosting a private Gitlab 15.0.2 on Gentoo using this overlay: https://gitlab.awesome-it.de/overlays/gitlab
It's basically an installation from source (no Omnibus). I have also configured a GitLab Runner (Docker-based) and a CI pipeline in one of my projects (a homepage generated with Hugo). The pipeline works fine up to the point where it is supposed to upload the artifact, which is currently about 11 GB in size.
Initially this gave me a "413 Request Entity Too Large" error, so I raised the artifact size limits in GitLab and increased client_max_body_size in Nginx. Now I am seeing this error instead:
Uploading artifacts for successful job
Using docker image sha256:c20c992e5d83348903a6f8d18b4005ed1db893c4f97a61e1cd7a8a06c2989c40 for registry.gitlab.com/gitlab-org/gitlab-runner/gitlab-runner-helper:x86_64-latest with digest registry.gitlab.com/gitlab-org/gitlab-runner/gitlab-runner-helper#sha256:edc1bf6ab9e1c7048d054b270f79919eabcbb9cf052b3e5d6f29c886c842bfed ...
Uploading artifacts...
public: found 907 matching files and directories
WARNING: Uploading artifacts as "archive" to coordinator... 404 Not Found id=112 responseStatus=404 Not Found status=404 token=X8QjapaV
WARNING: Retrying... context=artifacts-uploader error=invalid argument
WARNING: Uploading artifacts as "archive" to coordinator... 404 Not Found id=112 responseStatus=404 Not Found status=404 token=X8QjapaV
WARNING: Retrying... context=artifacts-uploader error=invalid argument
WARNING: Uploading artifacts as "archive" to coordinator... 404 Not Found id=112 responseStatus=404 Not Found status=404 token=X8QjapaV
FATAL: invalid argument
ERROR: Job failed: exit code 1
It tries 3 times before eventually giving up. Each attempt takes a few minutes.
I am not seeing any messages related to this in Gitlab's production.log which leaves me a bit stumped. The 404 error code does not seem to make much sense in this context. I have tested the build pipeline by branching and removing lots of content to create a much smaller artifact. The upload works in that branch on first try, so the upload URL must be fine.
Are there any other configuration settings that I need to be aware of? Perhaps some timeout for the upload?
EDIT:
Here's my current .gitlab-ci.yaml to give you better idea of what I am doing. It's rather ugly with those NodeJS dependencies being installed every time the pipeline is run, but that's currently not the issue.
image: cibuilds/hugo

variables:
  GIT_SUBMODULE_STRATEGY: recursive

build:
  stage: build
  script:
    - curl -sL https://deb.nodesource.com/setup_16.x -o /tmp/nodesource_setup.sh
    - sudo bash /tmp/nodesource_setup.sh
    - sudo apt update
    - sudo apt install nodejs
    - npm install autoprefixer postcss-cli
    - hugo
  artifacts:
    paths:
      - public
I am planning to add another step to the pipeline for the deployment using rsync over ssh.
I have a GitLab job that does not seem to update the repository before it runs. Sometimes it leaves some files in their old state and runs the script anyway. Any ideas?
For instance, when I have a job like this:
packagePython:
  stage: package
  script:
    - .\scripts\PackagePython.ps1
  tags:
    - myServer
  cache:
    paths:
      - .\python\cache\
  only:
    changes:
      - python/**/*
I finally managed to understand what was happening:
I realised that gitlab-runner did not use exactly the same path for each run on my server, and my script assumed that it did. So I ended up pointing at a build made in the wrong path.
If you think the repository is not being updated (like I did), make sure your scripts do not reference hardcoded paths or packages that could point to previous versions!
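One way to avoid the hardcoded path is to hand the checkout directory to the script through the predefined CI_PROJECT_DIR variable. This is a sketch, not what my script actually looks like: the -SourceRoot parameter is hypothetical and would have to be added to PackagePython.ps1, and the $env: syntax assumes the runner uses a PowerShell shell.

```yaml
packagePython:
  stage: package
  script:
    # Print the directory the runner checked out into for this run.
    - Write-Host "Building in $env:CI_PROJECT_DIR"
    # -SourceRoot is a hypothetical parameter; the script would resolve all
    # of its paths relative to it instead of using a fixed location.
    - .\scripts\PackagePython.ps1 -SourceRoot $env:CI_PROJECT_DIR
  tags:
    - myServer
  only:
    changes:
      - python/**/*
```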
According to the docs:
Since the cache is shared between jobs, if you’re using different paths for different jobs, you should also set a different cache:key otherwise cache content can be overwritten.
This sounds weird to me.
So if I'm "using different paths for different jobs" like this
job_a:
  paths:
    - binaries/

job_b:
  paths:
    - node_modules/
How could the cache be overwritten..?
Does it mean node_modules will overwrite binaries ?? because the cache key is the same?
Does anyone know the details of how the cache is implemented in GitLab?
Does it work like this?
$job_cache_key = $job_cache_key || 'default';
if ($cache[$job_cache_key]){
    return $cache[$job_cache_key];
}
$cache[$job_cache_key] = $job_cache;
return $job_cache;
Cache keys in GitLab mimic Rails caching, although, as app/models/concerns/faster_cache_keys.rb mentions:
# Rails' default "cache_key" method uses all kind of complex logic to figure
# out the cache key. In many cases this complexity and overhead may not be
# needed.
#
# This method does not do any timestamp parsing as this process is quite
# expensive and not needed when generating cache keys. This method also relies
# on the table name instead of the cache namespace name as the latter uses
# complex logic to generate the exact same value (as when using the table
# name) in 99% of the cases.
The pipeline itself starts with initializing its local cache: lib/gitlab/ci/pipeline/seed/build/cache.rb
You can see a cache example in spec/lib/gitlab/ci/pipeline/seed/build/cache_spec.rb
Does it mean node_modules will overwrite binaries ?? because the cache key is the same?
No: each job uses its own set of paths, which overrides any paths defined in a global cache.
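If you want to follow the advice from the docs quoted in the question and rule out any clash between the two jobs, each one can get its own cache:key. A sketch for job_a and job_b; the key prefixes and the echo scripts are placeholders I made up for illustration.

```yaml
job_a:
  script:
    - echo "build binaries here"   # placeholder
  cache:
    key: binaries-${CI_COMMIT_REF_SLUG}       # hypothetical key prefix
    paths:
      - binaries/

job_b:
  script:
    - echo "install node modules here"   # placeholder
  cache:
    key: node-modules-${CI_COMMIT_REF_SLUG}   # hypothetical key prefix
    paths:
      - node_modules/
```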
gitlab-org/gitlab-runner issue 2838 asks about a cache per job and gives this example:
stages:
  - build
  - build-image

# the following line is the global cache configuration but also defines an anchor with the name of "cache"
# you can refer to the anchor and reuse this cache configuration in your jobs.
# you can also add and replace properties
# In the job definitions you will find examples.
# for more information regarding reuse in YAML files, see https://blog.daemonl.com/2016/02/yaml.html
cache: &cache
  paths:
    - api/node_modules/
    - global/node_modules/
    - frontend/node_modules/

# first job, it does not have an explicit cache definition:
# therefore it uses the global cache definition!
build-app:
  stage: build
  image: node:8
  before_script:
    - yarn
    - cd frontend
  script:
    - npm run build

# a job in a later stage, have a look at the cache block!
# it "inherits" from the global cache block and adds the "policy: pull" key / value
build-image-api:
  stage: build-image
  image: docker
  dependencies: []
  cache:
    <<: *cache
    policy: pull
  before_script:
    # .... and so on
That inheritance mechanism is also documented in the "Inherit global config, but override specific settings per job" section of the caching documentation:
You can override cache settings without overwriting the global cache by using anchors.
For example, if you want to override the policy for one job:
cache: &global_cache
  key: ${CI_COMMIT_REF_SLUG}
  paths:
    - node_modules/
    - public/
    - vendor/
  policy: pull-push

job:
  cache:
    # inherit all global cache settings
    <<: *global_cache
    # override the policy
    policy: pull
1+ year later (Q2 2021):
See GitLab 13.11 (April 2021)
Use multiple caches in the same job
GitLab CI/CD provides a caching mechanism that saves precious development time when your jobs are running. Previously, it was impossible to configure multiple cache keys in the same job. This limitation may have caused you to use artifacts for caching, or use duplicate jobs with different cache paths. In this release, we provide the ability to configure multiple cache keys in a single job which will help you increase your pipeline performance.
See Documentation and Issue.
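With that feature, cache takes a list of entries, each with its own key and paths. A minimal sketch, assuming GitLab 13.11 or later; the job name, key names and the echo script are placeholders chosen to match the binaries/ and node_modules/ paths from the question.

```yaml
build-all:
  script:
    - echo "build steps go here"   # placeholder
  cache:
    # First cache: compiled binaries, stored under their own key.
    - key: binaries-${CI_COMMIT_REF_SLUG}
      paths:
        - binaries/
    # Second cache: npm dependencies, stored under a separate key.
    - key: node-modules-${CI_COMMIT_REF_SLUG}
      paths:
        - node_modules/
```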
I am trying to host a reveal.js presentation via gitlab pages. The repository can be found here: https://gitlab.com/JanGregor/demo-slides
My .gitlab-ci.yml is fairly simple:
image: node:4.2.2

pages:
  cache:
    paths:
      - node_modules/
  script:
    - npm install
    - node_modules/.bin/gulp
  artifacts:
    paths:
      - build
  only:
    - master
After a commit to master though, something goes wrong. The pages task itself is executed and runs just fine. It even shows in the logs that my build directory has been scanned and that the artefacts have been found.
Oddly, the subsequent pages:deploy task fails. It only says :
pages failed to extract
Any help would be greatly appreciated, since I have no clue where to look next. The documentation itself isn't really helpful when trying to implement a deployment flow with npm.
Thanks in advance folks !
Apparently a site can only be published from an artifacts folder called "public".
From the GitLab Pages documentation:
To make use of GitLab Pages, the contents of .gitlab-ci.yml must follow the rules below:
- A special job named pages must be defined
- Any static content which will be served by GitLab Pages must be placed under a public/ directory
- artifacts with a path to the public/ directory must be defined
Also mentioned (somewhat tangentially) in the "GitLab Pages from A to Z" guide:
... and GitLab Pages will only consider files in a directory called public.
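Applied to the .gitlab-ci.yml from the question, that could look like the sketch below. It assumes gulp keeps writing its output to build/ and that the repository has no public/ directory of its own; alternatively, the gulp task could be changed to write to public/ directly.

```yaml
image: node:4.2.2

pages:
  cache:
    paths:
      - node_modules/
  script:
    - npm install
    - node_modules/.bin/gulp
    # GitLab Pages only serves content from public/, so rename the gulp output.
    - mv build public
  artifacts:
    paths:
      - public
  only:
    - master
```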
I want to set up a build pipeline in Concourse for my web application. The application is built using Node.
The plan is to do something like this:
                                         ,-> build style guide -> dockerize
source code -> npm install -> npm test -|
                                         `-> build website -> dockerize
The problem is, after npm install, a new container is created so the node_modules directory is lost. I want to pass node_modules into the later tasks but because it is "inside" the source code, it doesn't like it and gives me
invalid task configuration:
you may not have more than one input or output when one of them has a path of '.'
Here's my jobs set up
jobs:
  - name: test
    serial: true
    disable_manual_trigger: false
    plan:
      - get: source-code
        trigger: true
      - task: npm-install
        config:
          platform: linux
          image_resource:
            type: docker-image
            source: {repository: node, tag: "6" }
          inputs:
            - name: source-code
              path: .
          outputs:
            - name: node_modules
          run:
            path: npm
            args: [ install ]
      - task: npm-test
        config:
          platform: linux
          image_resource:
            type: docker-image
            source: {repository: node, tag: "6" }
          inputs:
            - name: source-code
              path: .
            - name: node_modules
          run:
            path: npm
            args: [ test ]
Update 2016-06-14
Inputs and outputs are just directories. You put whatever you want to pass on into an output directory, and you can then hand it to another task in the same job. Inputs and outputs cannot overlap, so to do this with npm you have to copy either node_modules or the entire source folder from the input folder to an output folder, then use that in the next task.
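Roughly, the plan from above then becomes the following sketch. The output name "installed" is made up for illustration, and the whole source folder is copied rather than just node_modules.

```yaml
jobs:
  - name: test
    serial: true
    plan:
      - get: source-code
        trigger: true
      - task: npm-install
        config:
          platform: linux
          image_resource:
            type: docker-image
            source: {repository: node, tag: "6"}
          inputs:
            - name: source-code
          outputs:
            - name: installed        # hypothetical output name
          run:
            path: sh
            args:
              - -exc
              - |
                # Copy the sources into the output dir, then install there.
                cp -R source-code/. installed/
                cd installed
                npm install
      - task: npm-test
        config:
          platform: linux
          image_resource:
            type: docker-image
            source: {repository: node, tag: "6"}
          inputs:
            - name: installed        # sources plus node_modules from the previous task
          run:
            path: sh
            args:
              - -exc
              - |
                cd installed
                npm test
```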
This doesn't work between jobs though. Best suggestion I've seen so far is to use a temporary git repository or bucket to push everything up. There has to be a better way of doing this since part of what I'm trying to do is avoid huge amounts of network IO.
There is a resource specifically designed for this use case of npm between jobs. I have been using it for a couple of weeks now:
https://github.com/ymedlop/npm-cache-resource
It basically allows you to cache the first npm install and inject it as a folder into the next job of your pipeline. You could quite easily set up your own caching resource by reading the source of that one, if you want to cache more than node_modules.
I am actually using this npm-cache-resource in combination with a Nexus proxy to speed up the initial npm install further.
Be aware that some npm packages have native bindings that need to be built against standard libraries matching those of the container's Linux distribution. If you move between different types of containers a lot, you may run into issues with musl libc and the like. In that case I recommend either standardizing on the same container type throughout the pipeline or rebuilding the affected node_modules.
There is a similar one for Gradle (on which the npm one is based):
https://github.com/projectfalcon/gradle-cache-resource
This doesn't work between jobs though.
This is by design. Each step (get, task, put) in a Job is run in an isolated container. Inputs and outputs are only valid inside a single job.
What connects Jobs is Resources. Pushing to git is one way. It'd almost certainly be faster and easier to use a blob store (eg S3) or file store (eg FTP).
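As a rough sketch of the blob-store route using the s3 resource: one job packs node_modules and puts the tarball, the next job gets it. The resource name, bucket name and credential variable names are hypothetical, versioned_file needs versioning enabled on the bucket, the source-code resource is assumed to be defined as in the question, and the ((...)) placeholder syntax depends on your Concourse version.

```yaml
resources:
  - name: node-modules               # hypothetical resource name
    type: s3
    source:
      bucket: my-ci-cache            # hypothetical bucket
      versioned_file: node_modules.tgz
      access_key_id: ((aws_access_key_id))
      secret_access_key: ((aws_secret_access_key))

jobs:
  - name: install
    plan:
      - get: source-code
        trigger: true
      - task: npm-install
        config:
          platform: linux
          image_resource:
            type: docker-image
            source: {repository: node, tag: "6"}
          inputs:
            - name: source-code
          outputs:
            - name: packed
          run:
            path: sh
            args:
              - -exc
              - |
                cd source-code
                npm install
                # Pack the installed modules into the output directory.
                tar -czf ../packed/node_modules.tgz node_modules
      - put: node-modules
        params:
          file: packed/node_modules.tgz

  - name: test
    plan:
      - get: source-code
        passed: [install]
        trigger: true
      - get: node-modules
        passed: [install]
      - task: npm-test
        config:
          platform: linux
          image_resource:
            type: docker-image
            source: {repository: node, tag: "6"}
          inputs:
            - name: source-code
            - name: node-modules
          run:
            path: sh
            args:
              - -exc
              - |
                # Unpack the tarball fetched from S3 next to the sources.
                tar -xzf node-modules/node_modules.tgz -C source-code
                cd source-code
                npm test
```

The passed constraints make the test job pick up the same source version and tarball that went through install.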