GitLab: how to configure build and test to run on different machines?

I'm new to GitLab and am asking for advice / best practice here.
I have a program that I build on my build machine. The program can't run on the build machine, because it needs to be installed on a test machine that has the special hardware/environment the program requires. I want to run some system tests (memory leak tests etc.) on the test machine. How is this best done?
I think this can be accomplished with the "multi project pipeline" feature. Is this the simplest/best way?
Here is my plan:
I could have one (ssh/shell) runner that builds my program on my build machine, and a different runner that runs tests on my test machine. The two would be connected using the "multi project pipeline" feature. The artifacts from the build pipeline would be installed on the test machine, and then the system tests would run on the test machine.
Is this the best way to solve this? Or is there a simpler/better way?

Answering my own question here. "Multi project pipeline" is not necessary here. You simply have a single project and mark jobs with different "tags". You can then register runners for these different tags on different machines.
(Artifacts are transferred from one job to the next in the same way, regardless of whether the runners are on the same machine or on different ones.)
https://docs.gitlab.com/ee/ci/yaml/README.html#tags
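For illustration, a minimal sketch of what such a `.gitlab-ci.yml` could look like. The tag names (`build-machine`, `test-machine`), the `build/` directory and all script commands are assumptions made up for this example; substitute the tags your runners were actually registered with.

```yaml
# One project, two jobs, each routed to a different machine via tags.
stages:
  - build
  - test

build:
  stage: build
  tags:
    - build-machine          # hypothetical tag of the runner on the build machine
  script:
    - make                   # hypothetical build command
  artifacts:
    paths:
      - build/               # hypothetical output directory, uploaded to GitLab

system_test:
  stage: test
  tags:
    - test-machine           # hypothetical tag of the runner on the test machine
  script:
    # Artifacts from the earlier stage are downloaded before the script runs.
    - ./install.sh build/    # hypothetical install step
    - ./run_system_tests.sh  # hypothetical test entry point
```

A job is only picked up by a runner that carries all of the job's tags, so the test job never lands on the build machine.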

Related

Dynamically create an environment and run Cypress in parallel

We are using Cypress to run our end-to-end tests in GitLab. Before we run the tests we create a dynamic environment. A dynamic environment is an environment that is created with docker-compose inside the GitLab runner that executes the Cypress tests. Once the dynamic environment is up, we fire the tests against it. Everything happens in one GitLab runner, so no external deployment to a test environment takes place.
Now we want to move forward and parallelize the Cypress run. It's documented here https://docs.cypress.io/guides/guides/parallelization and it works under the assumption that the environment is already there. It creates several GitLab runners, and Cypress takes care of distributing the scenarios between the runners.
The question is: how do we set up a dynamic environment with GitLab that can be shared between GitLab runners? Is it only possible with a dummy deployment to a Kubernetes environment which is prepared for this use case? Do I need to create a dynamic environment in each runner? Or any other hints?

Our solution now looks like this.
We start the environment from docker-compose in each running GitLab job. This of course reduces the time saved by running the tests in parallel, since the GitLab runners are blocked longer. But since it happens in parallel and we are not short on runners, it's OK for us. Another advantage is that we could keep our current setup with only minimal modifications. In the end we reduced our "run all end-to-end tests" execution time by about 30%.
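A rough sketch of how such a job might be written, not our exact configuration. It assumes the runner host has docker-compose and Cypress available, that a record key is supplied through a `CYPRESS_RECORD_KEY` CI/CD variable (Cypress requires `--record` for `--parallel`), and that three parallel jobs are wanted; the job name and the number 3 are made up for the example.

```yaml
e2e:
  stage: test
  parallel: 3                  # GitLab starts three copies of this job
  variables:
    # Give each job its own compose project name so that copies running
    # on the same host do not share containers.
    COMPOSE_PROJECT_NAME: "e2e-$CI_JOB_ID"
  before_script:
    - docker-compose up -d     # each job brings up its own environment
  script:
    # The shared build id lets Cypress group the three jobs into one run
    # and distribute the spec files between them.
    - npx cypress run --record --parallel --ci-build-id "$CI_PIPELINE_ID"
  after_script:
    - docker-compose down      # tear the environment down again
```

If several copies can land on the same host, published ports in the compose file may still collide, so this works most simply when each copy gets its own runner machine.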

About gitlab CI runners

I am new to gitlab CI and I am fascinated with it. I managed already to get the pipelines working even using docker containers, so I am familiar with the flow for setting jobs and artifacts. I just wish now to understand how this works. My questions are about the following:
Runners
Where is everything actually happening? I mean, which computer is running my builds and executables? I understand that GitLab has its own shared runners that are available to users. Does this mean that if a shared runner grabs my jobs, they run wherever those runners are hosted? If I register my own runner on my laptop and use that specific runner, will my builds and binaries run on my computer?
Artifacts
In order to run/test code, we need the binaries, which are grabbed from the build stage as artifacts. For the build part, if I use cmake, for example, in the script part of the CI.yml file I create a build directory and call cmake .. and so on. Once my job is successful, if I want the binary I have to go into GitLab and retrieve it myself. So my question is: where is everything saved? I notice that the runner, within my project, creates something like refs/pipeline/, but where is this actually? How could I get those files and new directories onto my laptop?
Working space
Pretty much, where is everything happening? the runners, the execution, the artifacts?
Thanks for your time

Everything that happens in each job/step of a pipeline happens on the runner host itself, in a way that depends on the executor you're using (shell, docker, etc.), or on the GitLab server directly.
If you're using gitlab.com, there are a number of shared runners that the GitLab team maintains and that you can use for your project(s), but as they are shared with everyone on gitlab.com, it can be some time before your jobs are run. However, whether you self-host or use gitlab.com, you can create your own runners specific to your project(s).
If you're using the shell executor, while the job is running you could see the files on the filesystem somewhere, but they are cleaned up after that job finishes. It's not really intended for you to access the filesystem while the job is running. That's what the job script is for.
If you're using the docker executor, the gitlab-runner service will start a Docker container from the image you specify in .gitlab-ci.yml (or use the default, which is configurable). The job then runs inside that container, and the container is deleted immediately after the job finishes.
You can add your own runners anywhere -- AWS, spare machine lying around, even your laptop, and jobs would be picked up by any of them. You can also turn off shared runners and force it to be run on one of your runners if needed.
If you need an artifact from a build or preparatory step, it is created on the runner as part of the job as described above, and the runner then automatically uploads it to the GitLab server (or to another service that implements the S3 protocol, such as AWS S3 or MinIO). Unless you're using S3/MinIO, it will only be accessible through the GitLab UI or through the API. In the UI, however, it shows up on any related MRs and on the Pipeline page, so it's fairly accessible.
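As a small illustration of the artifact part, here is a sketch of a cmake build job that keeps its build directory as an artifact. The job name, the `build/` path and the one-week expiry are assumptions chosen for the example, and the runner (or its image) is assumed to have cmake and a compiler installed.

```yaml
build:
  stage: build
  script:
    - mkdir -p build
    - cd build
    - cmake ..
    - cmake --build .
  artifacts:
    paths:
      - build/          # uploaded to the GitLab server when the job succeeds
    expire_in: 1 week   # hypothetical retention period
```

After the job succeeds, the contents of `build/` can be downloaded from the job or pipeline page, and jobs in later stages receive them automatically.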

How do I get a Jenkins server to push bash code to a different server?

I have Jenkins installed on a Linux server. It can run builds on itself. I want to create either a Freestyle Project or an External Job that transfers a bash script and runs it on two separate linux servers. Where in the GUI do I configure the destination server when I create a build? I have added "nodes" in the GUI. I can see the free space of the servers in the Jenkins GUI, so I know the credentials work. But when I create a build, I see no field that would tell Jenkins to push the bash scripts and run them on certain servers.
Are Jenkins nodes just servers that lend computing power to the master server? Or are they the targets of Jenkins builds? I believe that Jenkins "slaves" provide computing power to the Jenkins master server.
Normally Jenkins is used to integrate code. What do you call the servers that Jenkins pushes code into? They would be called Chef clients or Puppet agents if I was using Chef or Puppet for integrating code. I've been doing my own research, but I don't seem to know the specific vocabulary.

I've been working with such tools for several years, and as far as I know there is no ubiquitous language for this.
The nodes you configure in Jenkins itself to add 'computing power' are indeed called build slaves.
External machines that you copy to, deploy to, or otherwise use in jobs are usually called "target machines", since they are the target of an action in your job.
Nodes come in several forms. You can use agents, which require a small installation on the node machine; this creates a running agent service that Jenkins can communicate with.
Another way is simply to allow Jenkins to connect to a machine via ssh and execute commands there. Both are called nodes and could be called build slaves, but the first are usually dedicated nodes, while the second can be any kind of machine as long as the ssh user can execute the build.
I have not found any different terms for these two types either.
It's probably not a real answer to your questions, but I do hope it helped.

GitLab CI and Distributed Build Confusion

I'm relatively new to continuous integration servers. I've been using GitLab (v6.5) for a while to manage projects, but I'd like to begin using the GitLab CI to ensure tests pass and builds succeed.
My testing setup consists of two virtual machines: one machine for GitLab and another machine for the GitLab CI (and runners). However, in production I only have a single machine, which is running GitLab. The GitLab team posted an interesting blog post a while back that emphasized:
If you are running tests on the CI server you are doing it wrong!
It was a very informative post, but I didn't come away feeling like I understood this specific point. Does this mean one shouldn't run GitLab and GitLab CI on the same server? Does it mean one shouldn't run GitLab CI and GitLab CI runners on the same server? Or both-- Do I need three servers, one for each task?
From the same post:
Anybody who can push to a branch that is tested on a CI server can easily own that server.
This implies to me that the runners are the security risk since they can run stuff contained in a commit. If that's the case, what's the typical implementation? Put GitLab and GitLab CI on the same machine, but the runners on a separate machine? Wouldn't it still suck if the runner machine was compromised? So people are okay losing their runner machine as long as their code machine is safe?
I would really like to understand this a bit more-- definitely before I implement it in production. Is there any possible yet safe way to implement GitLab, GitLab CI, and GitLab CI runners all on the same machine?

Ideally you're fine running gitlab-ci and gitlab on the same host. Others may disagree with me, but the orchestrator (the gitlab-ci node) doesn't do any of the heavy lifting. It's strictly job meta IO and warehousing the results.
With that being said, I would not put the runners on the same machine. GitLab CI runners are resource intensive and will be executing at full tilt on whichever machine you place them on. If you're running in production, it's a good idea to put these on spot instances to help curb some of the costs of running the often CPU/memory-hungry builds, but this can be impractical, as your instances are not always on at that point.
I've had some success with putting my gitlab-ci runners on small instances in Digital Ocean. I'm not doing HUGE builds, but the idea is to distribute the workload across several servers so your CI server:
Is responsive
Can build multiple project builds at once
Can exercise isolation (this is kind of arbitrary in this list)
and a few other things that don't come to mind right away.
Hope this helps!

How to secure Ant builds?

Our company uses ANT to automate build scripts.
Now somebody has raised the question of how to secure such build scripts against (accidental or intended) threats.
Example 1: someone checks in a build script that deletes everything under Windows drive T:\ because that is where the Apache deployment directory is mounted for a particular development machine. Months later, someone else might run the build script and erase everything on T:\ which is a shared drive on this machine.
Example 2: an intruder modifies the default build target in a single project to scan the entire local hard disk. The Continuous Integration machine (e.g. Jenkins) is configured to execute the default build target and will therefore send its entire local directory structure to the intruder, even for projects that the intruder should not have access to.
Any suggestions how to prevent such scenarios (besides "development policies" or "do not mount shared drives")?
My only idea is to use chroot environments for builds?!

The issues you describe are the same for any code that you execute on the build machine - you could do the same thing using a unit test.
In this case the best solution may be to place your build scripts under source control and have a code review prior to check in.

At my company, the build scripts (usually a build folder) are an svn:external to another subversion repository that is only controlled by build/release engineers. Developers can control variables such as servers it can deploy to, but not what those functions do. This same code is reused amongst multiple projects in flight, and only a few devops folks can alter it, not the entire development staff.
Addition: When accessing shared resources, we use a system account that has only read access to those resources. Further: Jenkins, development projects, and build/deploy code are written to handle complete loss of the Jenkins project workspace and deploy environments. This is basic build automation/deploy automation that leads to infrastructure automation.
Basic rule: Murphy's law is going to happen. You should write scripts that are robust and handle cold start scenarios and not worry about wild intruder theories.
