Jenkins - Artifact handling - linux

I have a Jenkins setup consisting of one master and two slaves. I have Jenkins jobs (which run only on the slaves) that create binaries on every commit. Currently, Jenkins archives these artifacts somewhere on the Jenkins master. When I want to download the binaries from a bash shell script, I use wget url_link_to_particular_artifact. I wish to change this. I want to copy all the generated artifacts into one common location on the master node, so that the URL stays the same and only the last part changes according to the generated binary's name. I label my binaries with tags so it is easy to retrieve them later on. Now, is there a plugin that will copy artifacts to the master node, but into a location that I can provide? The master and slave nodes are all Red Hat Linux machines.
I have already gone through the Artifactory Plugin and I do not wish to use it. I want something really simple to implement. Is there really a need for a web server to be running at the location on the master where I wish to copy the artifacts? Can I transfer the artifacts from slave to master over SSH? If yes, how?
EDIT:
I have made some progress and I am sort of stuck now: assuming we have a web server running on the Jenkins master node, is it possible for the slave nodes to send the artifacts to this location, with the web server writing them into the file system at that location on the master?

This is, of course, possible, but let me explain why it is a bad idea.
Jenkins is not your artifact repository. You can indeed store your artifacts in Jenkins, but it was not designed for that. If you do so for most of your jobs, you will run into problems with disk space, or even race conditions between artifact names.
Not to mention that you don't want to have hundreds or thousands of files in one directory.
A better approach would be to use an artifact repository such as Nexus to store your artifacts. You can manage and retrieve them easily through different channels.
Keep in mind that it would be nice to keep your Jenkins in stateless mode and version control your configuration for easy restoration.
If you still want to store your artifacts in one web location, I'd suggest setting up an nginx server that proxies /jenkins calls to Jenkins and serves /artifacts from your artifacts directory.
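If you do go with a single artifacts directory, one way to sidestep the name collisions and the huge flat directory mentioned above is to give every build its own subdirectory. The following is only a minimal sketch of that idea: the function names, the job/build-number layout and the example paths are assumptions for illustration, not something Jenkins provides.

```python
import shutil
from pathlib import Path


def artifact_destination(root, job, build_number, filename):
    """Return root/<job>/<build_number>/<basename of filename> as a Path."""
    return Path(root) / job / str(build_number) / Path(filename).name


def store_artifact(source, root, job, build_number):
    """Copy source into its per-build directory without overwriting.

    The destination is opened in exclusive-create mode ("x"), so two
    builds racing for the same name cannot silently clobber each other:
    the second one gets FileExistsError.
    """
    destination = artifact_destination(root, job, build_number, source)
    destination.parent.mkdir(parents=True, exist_ok=True)
    with open(source, "rb") as src, open(destination, "xb") as out:
        shutil.copyfileobj(src, out)
    return destination
```

With a hypothetical root of /var/www/artifacts, calling store_artifact("app-v1.2.bin", "/var/www/artifacts", "nightly", 42) would place the file at /var/www/artifacts/nightly/42/app-v1.2.bin, so the URL prefix stays stable and only the tail changes per build.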

Related

GitLab multiple runners, exchanging artifacts

I'm already using gitlab CI on smaller projects, but now I'm looking into using gitlab as CI for a larger project.
How can I pass build artifacts (bunch of binary files etc) between two gitlab-runners running on two different physical machines?
Context:
I have a large repository, which produces a lot of artifacts during the build. Obviously this takes time, so I'd like to build on a beefy multi-core machine. If the build passes, I want to test in parallel across many other (smaller) machines. These test-machines are hooked up to many different kinds of equipment. Equipment that I don't want to bother the beefy machine with.
I understand artifacts: and dependencies: should address this, but that uses a local cache as far as I can tell.
The build artefacts weigh in at ~4GB so somehow that data must be transferred.
Can gitlab help with this natively, or do I need a pattern of build+push followed by a fetch+test (i.e. Artifactory, Ceph, NFS, etc.)?
I imagine my needs aren't unique so something must already exist for this.
You are on the right path: artifacts is what you are looking for. Runners do not store the artifacts they build, but they upload them to the GitLab instance.
Now, where GitLab stores them is a different topic, and if you manage your GitLab installation, you can take a look at the administration documentation: https://docs.gitlab.com/ee/administration/job_artifacts.html
You can also retrieve artifacts through APIs, if you have any special need, but artifacts and dependencies should be more than enough for your use case: https://docs.gitlab.com/ee/api/job_artifacts.html#get-job-artifacts
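For the API route, a request for a job's artifacts archive could be built roughly as below. This is a sketch under assumptions: the base URL, project ID, job ID and token are placeholders, and it relies on the documented GET /projects/:id/jobs/:job_id/artifacts endpoint with a PRIVATE-TOKEN header.

```python
import urllib.request


def job_artifacts_request(base_url, project_id, job_id, token):
    """Build (but do not send) a request for one job's artifacts archive."""
    url = (
        f"{base_url.rstrip('/')}/api/v4/projects/{project_id}"
        f"/jobs/{job_id}/artifacts"
    )
    # The token is sent as a header so it does not end up in the URL.
    return urllib.request.Request(url, headers={"PRIVATE-TOKEN": token})
```

Passing the returned object to urllib.request.urlopen and writing the response body to a file would give you the archive; inside a pipeline, artifacts and dependencies do this transfer for you.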

Is it possible to have multiple gitlab-runners all execute the same jobs?

I'm hoping to leverage GitLab CI/CD / gitlab-runner to keep custom code up to date on a fleet of servers.
Desired effect is that when a commit is made against a certain project in GitLab, several servers then automatically pull those changes down.
Is it possible to leverage gitlab-runners in this way, so that every runner registered with the project executes the contents of the .gitlab-ci.yml file? Or is there a better tool to accomplish this?
I could use Ansible to push updated files down to each server, but I was looking for something easier to solve for - something inherent in GitLab.
Edit: Alternative Solution
I decided to go the route of pre- and post-hook files in my repos as described here:
https://gist.github.com/noelboss/3fe13927025b89757f8fb12e9066f2fa
Basically I will be denoting a primary server as the main source for code pushes into the master repo, and have defined my entire fleet as remote repos in .git/config there. Using post-hooks inside of bare repos on all of my servers, I can then copy my code into the proper execution path.
@ahxn81 Runners aren't really intended to be used in the pull way you describe. The Ansible push method you proposed is more in line with a typical deploy flow. I can see why you might prefer the simplicity of the pull method over pushing via script. I guess a fleet of servers these days is managed via Kubernetes or Docker Swarm, which can simplify deployment after an initial setup headache.

About gitlab CI runners

I am new to gitlab CI and I am fascinated with it. I managed already to get the pipelines working even using docker containers, so I am familiar with the flow for setting jobs and artifacts. I just wish now to understand how this works. My questions are about the following:
Runners
Where is everything actually happening? I mean, which computer is running my builds and executables? I understand that GitLab has its own shared runners that are available to users. Does this mean that if a shared runner grabs my jobs, they run wherever those runners are hosted? If I register my own runner on my laptop and use that specific runner, will my builds and binaries run on my computer?
Artifacts
In order to run/test code, we need the binaries, which are grabbed from the build stage as artifacts. For the build part, if I use cmake, for example, in the script part of the CI.yml file I create a build directory and call cmake .. and so on. Once my job is successful, if I want the binary I have to go into GitLab and retrieve it myself. So my question is, where is everything saved? I notice that the runner, within my project, creates something like refs/pipeline/, but where is this actually? How could I get those files and new directories onto my laptop?
Working space
Pretty much, where is everything happening? The runners, the execution, the artifacts?
Thanks for your time
Everything that happens in each job/step in a pipeline happens on the runner host itself, and depends on the executor you're using (shell, docker, etc.), or on the Gitlab server directly.
If you're using gitlab.com, they have a number of shared runners that the Gitlab team maintains and you can use for your project(s), but as they are shared with everyone on gitlab.com, it can be some time before your jobs are run. However, no matter if you self host or use gitlab.com, you can create your own runners specific for your project(s).
If you're using the shell executor, while the job is running you could see the files on the filesystem somewhere, but they are cleaned up after that job finishes. It's not really intended for you to access the filesystem while the job is running. That's what the job script is for.
If you're using the docker executor, the gitlab-runner service will start a docker instance from the image you specify in .gitlab-ci.yml (or use the default that is configurable). Then the job is run inside that docker instance, and it's deleted immediately after the job finishes.
You can add your own runners anywhere -- AWS, spare machine lying around, even your laptop, and jobs would be picked up by any of them. You can also turn off shared runners and force it to be run on one of your runners if needed.
In cases where you need an artifact after a build/preparatory step, it's created on the runner as part of the job as above, but then the runner automatically uploads the artifact to the GitLab server (or another service that implements the S3 protocol, like AWS S3 or Minio). Unless you're using S3/Minio, it will only be accessible through the GitLab UI or through the API. In the UI, however, it will show up on any related MRs and also on the Pipeline page, so it's fairly accessible.
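Once an artifacts archive has been downloaded from the UI or the API, getting the files onto a laptop is just a matter of unpacking it, since the archive is a zip file. A minimal sketch, assuming the archive was saved locally (the file and directory names in the usage note are placeholders):

```python
import zipfile
from pathlib import Path


def extract_artifacts(archive_path, destination):
    """Unpack a downloaded artifacts archive and list what it contained."""
    destination = Path(destination)
    destination.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive_path) as archive:
        archive.extractall(destination)
        # Names are relative to the project root, as they were on the runner.
        return sorted(archive.namelist())
```

For example, extract_artifacts("artifacts.zip", "downloaded") would recreate paths such as build/my_binary under downloaded/, mirroring whatever the artifacts: paths: entries in .gitlab-ci.yml captured.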

How do I get a Jenkins server to push bash code to a different server?

I have Jenkins installed on a Linux server. It can run builds on itself. I want to create either a Freestyle Project or an External Job that transfers a bash script and runs it on two separate linux servers. Where in the GUI do I configure the destination server when I create a build? I have added "nodes" in the GUI. I can see the free space of the servers in the Jenkins GUI, so I know the credentials work. But when I create a build, I see no field that would tell Jenkins to push the bash scripts and run them on certain servers.
Are Jenkins nodes just servers that lend computing power to the master server? Or are they the targets of Jenkins builds? I believe that Jenkins "slaves" provide computing power to the Jenkins master server.
Normally Jenkins is used to integrate code. What do you call the servers that Jenkins pushes code into? They would be called Chef clients or Puppet agents if I was using Chef or Puppet for integrating code. I've been doing my own research, but I don't seem to know the specific vocabulary.
I've been working with such tools for several years, and as far as I know there isn't a ubiquitous language for this.
The nodes you can configure in Jenkins itself to add 'computing power' are indeed called build slaves.
Usually, external machines that you copy to, deploy to, or otherwise use in jobs are called "target machines", as they are the target of an action in your job.
Nodes can be used in several forms. You can use agents, which require a small installation on the node machine that creates a running agent service with which Jenkins can communicate.
Another way is to simply allow Jenkins to connect to a machine via ssh and let it execute commands there. Both are called nodes and could be called build slaves, but the first are usually dedicated nodes, while the second can be any kind of machine as long as the ssh user can execute the build.
I also have not found any different terms for these two types.
It's probably not a real answer to your questions, but I do hope it helped.
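To make the "connect via ssh and execute commands" idea concrete, here is a rough sketch of what a job step could do to run one local bash script on several target machines. It assumes key-based ssh access is already in place; the host names, user name and function names are made up for illustration.

```python
import subprocess


def remote_script_command(host, user):
    """Return the argv that runs a script fed on stdin via bash on host."""
    return ["ssh", f"{user}@{host}", "bash", "-s"]


def run_script_on_targets(script_path, hosts, user):
    """Run the local script on every host and collect the exit codes."""
    results = {}
    for host in hosts:
        # bash -s reads the script from stdin, so nothing has to be
        # copied to the target machine beforehand.
        with open(script_path, "rb") as script:
            completed = subprocess.run(
                remote_script_command(host, user), stdin=script
            )
        results[host] = completed.returncode
    return results
```

A call such as run_script_on_targets("deploy.sh", ["web01.example.com", "web02.example.com"], "deploy") would return a dict mapping each host to the script's exit code there, which a job can use to fail the build if any target reports a non-zero code.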

Deploying and scheduling changes with Ansible OSS

Please note: I am not interested in any enterprise/for-pay (Tower?) solutions here, only solutions available via Ansible's OSS offering.
OK so I've got my Ansible project configured and working perfectly, woo hoo! Looks something like this:
myansible01.example.com:/opt/ansible/
    site.yml
    fizz.yml
    buzz.yml
    group_vars/
    roles/
        common/
            tasks/
                main.yml
            handlers/
                main.yml
        foos/
            tasks/
                main.yml
            handlers/
                main.yml
There are several things I need to accomplish to get this working in a production environment:
I need to be able to automate the deployment of changes to this project
I need to schedule playbooks to be run, say, every 30 seconds (to ensure all managed nodes are always in compliance)
So my concerns:
How are changes usually deployed to live Ansible projects? Say the project is located at myansible01.example.com:/opt/ansible (my Ansible server). Is it sufficient to simply delete the Ansible project root (rm -rf /opt/ansible) and then copy the latest (containing changes) Ansible project back to the same location? What happens if Ansible is currently running any plays while I perform this "drop-n-swap"?
It looks like the commercial offering (Ansible Tower) has a scheduling feature built into it, but not the OSS offering. How can I schedule Ansible OSS to run plays at certain times? For instance, I might want certain plays to be run every 30 seconds, so as to ensure nodes are always within compliance. Is cron sufficient to do this, or is there a more standard approach?
For this kind of task you typically want an orchestration engine such as Jenkins to do all your, well, orchestration.
You can set Jenkins to run playbooks on timers or other events such as a push to an SCM such as git.
Typically a job starts by checking out a tag/branch of our Ansible code base and then applying it to all of our specified servers so you always know what is being run. If you want, this can simply be the head on master (in git terms) so it's always applying the most recent changes. If you were also to have this to hook into your SCM repo then a simple push will force those changes to be applied to all of your servers.
Because of that immediacy you might want to consider only doing this on some test servers that then have some form of testing done against them (such as Serverspec) to verify that your changes are good before rolling them out to a production environment.
Jenkins, by default, will not run a job while the same job is running (or if you are maxed out on executor slots), so you can always be sure that it will only pull the repo (including any changes) after your Ansible run is complete. If you have multiple jobs running you can use blocking to prevent jobs running at the same time (both trying to apply potentially different configurations to the servers), but you don't have to worry about a new job starting and pulling the repo into the already running job, as Jenkins separates these into separate workspaces.
We use Jenkins for manual runs of Ansible against our environment, but we also have a "self healing" Jenkins job that simply runs a tagged commit of our Ansible code base against our environment, forcing it to an idempotent state to prevent natural drift of configurations. When we need to do something different to the environment, or are running a slightly further ahead commit of our code base into it, we can easily disable the self healing job until we're happy with things and then either just re-enable the job to put things back or advance the tag that Jenkins is using to now use the more recent commit.
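The "never start a run while the previous one is still going" guarantee is the part that matters most for a frequently repeated playbook. Purely as an illustration of that idea (this is not how Jenkins implements it), a scheduler-agnostic wrapper could take a non-blocking file lock and skip the run if the lock is already held. The lock path and function name are assumptions, and fcntl is only available on Unix-like systems.

```python
import fcntl
import subprocess


def run_exclusive(lock_path, command):
    """Run command unless another run already holds the lock.

    Returns the command's exit code, or None when the run was skipped
    because a previous run is still in progress.
    """
    with open(lock_path, "w") as lock_file:
        try:
            # Non-blocking exclusive lock: fail at once instead of queueing.
            fcntl.flock(lock_file, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except BlockingIOError:
            return None
        # The lock is released when lock_file is closed on leaving the block.
        return subprocess.run(command).returncode
```

Something like run_exclusive("/tmp/ansible-site.lock", ["ansible-playbook", "site.yml"]) could then be triggered by whatever scheduler you use. Note that standard cron cannot fire more often than once a minute, so a 30-second interval would need a long-running loop or a different scheduler around this call.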

Resources