Bitbucket Pipelines Maven cache is not caching all artifacts

I am using the Maven cache in my pipeline and I have a question. In my settings.xml I define my private JFrog repositories for lib_release and lib_snapshot.
definitions:
  steps:
    - step: &compile
        name: compile
        caches:
          - maven
        script:
          - mvn -s settings.xml clean compile package
        artifacts:
          - target/**
I can see that in the Build stage the artifacts are downloaded from the Maven cache:
>Cache "maven": Downloading.
>Cache "maven": Downloaded 103.5 MiB in 4 seconds.
>Cache "maven": Extracting.
>Cache "maven": Extracted in 1 seconds
But during the build I see that some .pom, maven-metadata.xml, and .jar files are still being downloaded from my private JFrog Artifactory.
For example:
>Downloaded from snapshots: https://jfrog.com/libs-snapshot/my-data/1.5.1-SNAPSHOT/my-data--1.pom (6.3 kB at 8.7 kB/s)
*So the question is: why is this data not cached?*

It is a snapshot. Snapshots are always retrieved from the remote repository to make sure the build has the latest version; only the server knows which build of a snapshot is the latest. (You can run mvn clean install and look at the generated pom in your local repository to see how this works.)
https://maven.apache.org/guides/getting-started/index.html#What_is_a_SNAPSHOT_version
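If you do not actually need the newest snapshot builds on every CI run, Maven's --no-snapshot-updates (-nsu) flag suppresses the remote update check and reuses whatever SNAPSHOT artifacts are already in the cached local repository. A sketch of the step above with that flag added (a trade-off: fewer downloads, but the build may use a stale snapshot):

definitions:
  steps:
    - step: &compile
        name: compile
        caches:
          - maven
        script:
          # -nsu / --no-snapshot-updates: reuse the SNAPSHOT artifacts restored
          # from the "maven" cache instead of asking the remote for newer ones
          - mvn -s settings.xml -nsu clean compile package
        artifacts:
          - target/**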

Related

How do you add JOOQ's 3.17 snapshot build to Maven?

Does JOOQ push the latest non-stable builds to Maven Central? Or is there some snapshot repo I need to add to pom.xml?
Does JOOQ push the latest non-stable builds to Maven Central?
Snapshot builds are offered only to paying customers, available from here: https://www.jooq.org/download/versions
Or is there some snapshot repo I need to add to pom.xml?
The above website offers ZIP file downloads. The ZIP file contains two scripts (each in .sh and .bat variants):
maven-install.sh and maven-install.bat, to install the snapshot artifacts into your local repository (e.g. ~/.m2)
maven-deploy.sh and maven-deploy.bat, to deploy the snapshot artifacts to your artifact repository
There are other options, of course, e.g. using the maven-install-plugin in your build, or building and installing the snapshots directly from source code.

Caching Rust/Wasm tools in Gitlab CI?

I'm working with Wasm and Rust, and I'm deploying the page with GitLab Pages.
I'm using a .gitlab-ci.yml file that looks like this:
image: "rust:latest"
variables:
PUBLIC_URL: "/repo-name"
pages:
stage: deploy
script:
- rustup target add wasm32-unknown-unknown
- cargo install wasm-pack
- wasm-pack build --target web
- mkdir public
- mv ./pkg ./public/pkg
- cp ./index.html ./public/index.html
artifacts:
paths:
- public
But even for a "Hello World" app, this takes ~12 minutes.
~11 minutes of that is taken by the cargo install wasm-pack step.
Is there any way I can cache the intermediate step, to avoid doing this every time?
This page: Caching in GitLab CI/CD talks about caching and/or using artifacts to persist files between jobs. You may be able to make use of that.
It then becomes a question of how to get cargo install to use that cache or the saved artifacts.
Alternatively, you can define your own base build image (run the cargo install steps in that) and store it in GitLab's Docker registry; see https://docs.gitlab.com/ee/user/packages/container_registry/.
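For the caching option, one common pattern is to relocate Cargo's home directory into the project workspace (GitLab can only cache paths inside it) and cache that directory between runs; once .cargo/ has been restored, the already-built wasm-pack binary is reused and the long cargo install step is skipped. A sketch under those assumptions, not tested against this exact project:

image: "rust:latest"

variables:
  PUBLIC_URL: "/repo-name"
  # Caches only work for paths inside the project directory, so move
  # Cargo's registry and installed binaries there.
  CARGO_HOME: $CI_PROJECT_DIR/.cargo

cache:
  key: "$CI_JOB_NAME"
  paths:
    - .cargo/

pages:
  stage: deploy
  script:
    - export PATH="$CARGO_HOME/bin:$PATH"
    - rustup target add wasm32-unknown-unknown
    # Skip the ~11 minute build once a cached wasm-pack binary is present
    - command -v wasm-pack || cargo install wasm-pack
    - wasm-pack build --target web
    - mkdir public
    - mv ./pkg ./public/pkg
    - cp ./index.html ./public/index.html
  artifacts:
    paths:
      - public

The rustup target is still added on every run, but that step is quick compared to compiling wasm-pack from source.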

How to save jar files as gitlab artifacts

I am trying to get the files generated by Maven stored as artifacts. To my understanding, I have edited the CI YAML as follows:
stages:
  - build
  - package
maven-build:
  stage: build
  script:
    - mvn install
  artifacts:
    paths:
      - art/
maven-package:
  stage: package
  artifacts:
    paths:
      - art/
  script:
    - mvn package -U
By default, Maven builds artifacts into a directory called target. You are not including this path in your artifacts declaration.
To deal with this, you can:
Add target/ to the paths: array in your YAML (sketched below), OR
Add a script step to move/copy the target directory to ./art, which you already include in your paths: array, OR
Change the directory into which Maven places built files.
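A minimal sketch of the first option, changing only the artifacts path (assuming Maven's default target directory):

maven-build:
  stage: build
  script:
    - mvn install
  artifacts:
    paths:
      # target/ is where Maven writes the built jars by default
      - target/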

How to store node modules between jobs and stages in gitlab with continuous integration

I am fairly new to GitLab CI and I've been trying different approaches to use the node_modules directory across my entire pipeline. From what I've read in the official docs, cache and artifacts seem to be valid approaches to pass files between jobs:
cache is used to specify a list of files and directories which should
be cached between jobs. You can only use paths that are within the
project workspace.
However, my issue with the caching method is that the node_modules would be persisted between pipelines by default:
cache can be set globally and per-job.
from GitLab 9.0, caching is enabled and shared between pipelines and jobs by default.
I do not want to persist the node_modules between pipelines. What I actually want is to trigger a fresh install with npm in my setup stage and then allow all further jobs in the pipeline to use these modules. Hence, I started using artifacts instead of cache, which is described similarly:
artifacts is used to specify a list of files and directories which
should be attached to the job after success. [...]
The artifacts will be sent to GitLab after the job finishes
successfully and will be available for download in the GitLab UI.
The dependency feature should be used in conjunction with artifacts
and allows you to define the artifacts to pass between different jobs.
The artifact-dependency method seems to be usable in my case. However, both cache and artifacts are extremely inefficient and slow. The node_modules are installed and usable, but the entire directory then gets uploaded somewhere and is re-downloaded between each job. (I would really love to know what happens here... Where do the modules go?)
Is there a better approach to run npm install only once at the beginning of the pipeline and then keep the node_modules in the pipeline during its entire runtime? I do not want to keep the node_modules after all jobs are finished so they don't need to be uploaded or downloaded anywhere.
Sample pipeline configuration file to reproduce the behavior:
image: node:lts
stages:
  - setup
  - build
  - test
node:
  stage: setup
  script:
    - npm install
  artifacts:
    paths:
      - node_modules/
build:
  stage: build
  script:
    - npm run build
  dependencies:
    - node
test:
  stage: test
  script:
    - npm run lint
    - npm run test
  dependencies:
    - node
Where do the modules go?
By default, artifacts are saved on the main GitLab machine:
/var/opt/gitlab/gitlab-rails/shared/artifacts
Is there a better approach to run npm install only once at the beginning of the pipeline and then keep the node_modules in the pipeline during its entire runtime?
There are some options that you can try:
Merge the setup and build stages into one stage (a sketch follows this list).
Use a local npm cache on the builder machines for faster npm install times, or a private npm proxy registry (for example, Nexus or Artifactory).
Check that the GitLab main machine and the builders are on the same network, so uploads and downloads are faster.
Consider packaging your build in Docker. You will get reusable Docker images between your GitLab stages. (Of course, there is the overhead of uploading the images to a Docker registry.)
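A minimal sketch of the first option applied to the sample pipeline above: each job that needs the modules installs them itself, so node_modules never has to be uploaded as an artifact or cache entry. Whether this is faster depends on your registry; combining it with the second option (a local cache or proxy registry) keeps the per-job install cheap:

image: node:lts

stages:
  - build
  - test

build:
  stage: build
  script:
    # install and build in the same job, so node_modules stays on the runner
    - npm install
    - npm run build

test:
  stage: test
  script:
    - npm install
    - npm run lint
    - npm run test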

Publish Gitlab artifacts to Artifactory

I want to publish my GitLab project's artifact (not a Maven project) to JFrog Artifactory. The artifact size is 4.2 GB.
I searched for this but mostly got links about publishing a GitLab Maven project to Artifactory. My project is not a Maven project.
I have a requirement to keep all source code in GitLab and artifacts (.war, .tar.gz) in Artifactory.
How do I achieve this?
It sounds like you're looking for Git LFS. This is an extension to Git that allows your GitLab repository to track artifacts without actually storing them in the repository, instead using some external filestore or artifact management server.
Artifactory supports Git LFS repositories; see the Artifactory documentation for setting it up.
