How many agents should I have? - gitlab

I'm trying to build a branch-based GitOps declarative infrastructure for Kubernetes. I plan to create clusters on a cloud provider with Crossplane, and those cluster definitions will be stored in GitLab. However, as I start building, I seem to be running into gitlab-agent sprawl.
Every application I will be deploying to each of my environments is stored in a separate git repo, and I'm wondering if I need a separate agent for each repo and environment. For example, I have my three clusters (prod, stage, and dev) and my three apps (API, Kafka, and DB). I've started with three agents per repo (gitlab-agent-api-prod, gitlab-agent-kafka-stage, ...), which seems a bit excessive. Do I really need 9 agents?
Additionally, I now have to install as many agents as I have apps onto each of my clusters, which already eats up significant resources. I'd imagine I can get away with one GitLab agent per cluster, but I'm just not seeing how that is done. Any help would be appreciated!
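Conceptually, what I'm hoping is possible is a single agent registration per cluster that all of the app repos can share, something like the sketch below (the project paths are made up, and I'm not sure this is even the right mechanism):

```yaml
# .gitlab/agents/gitlab-agent-prod/config.yaml  (hypothetical sketch)
# One agent for the whole prod cluster, authorized for every app project.
ci_access:
  projects:
    - id: my-group/api     # made-up project paths
    - id: my-group/kafka
    - id: my-group/db
```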
PS: if anyone has a guide on how to automatically add gitlab agents to new clusters created with crossplane, I'm all ears. Thanks!

Related

Azure web app deployment using vscode is faster than devops pipeline

Currently, I am working on a Django-based project that is deployed to Azure App Service. There are two options for deploying to App Service: one via DevOps and another via the VS Code plugin. Both scenarios work fine, but strangely, deploying to App Service via DevOps is slower than deploying via VS Code. Via DevOps it usually takes around 17-18 minutes, whereas via VS Code it takes less than 14 minutes.
Is there any reason behind this?
Assuming you're using Microsoft-hosted build agents, the following statements are true:
With Microsoft-hosted agents, maintenance and upgrades are taken care of for you. Each time you run a pipeline, you get a fresh virtual machine. The virtual machine is discarded after one use.
and
Parallel jobs represents the number of jobs you can run at the same time in your organization. If your organization has a single parallel job, you can run a single job at a time in your organization, with any additional concurrent jobs being queued until the first job completes. To run two jobs at the same time, you need two parallel jobs.
Microsoft provides a free tier of service by default in every organization that includes at least one parallel job. Depending on the number of concurrent pipelines you need to run, you might need more parallel jobs to use multiple Microsoft-hosted or self-hosted agents at the same time.
The first statement might cause an Azure Pipeline to be slower because the agent does not have any cached information about your project. If you're only talking about deploying, the pipeline first needs to download (and extract?) an artifact before it can deploy it. If you're also building, it might need to bring in the entire source code and/or external packages before it can build.
The second statement might make it slower because there might be less parallelization possible than on the local machine.
Besides these two possible reasons, the agents will most probably not have the specs of your development machine, causing them to run tasks more slowly than they would run locally.
You could look into hosting your own agents to eliminate these possible reasons.
Do self-hosted agents have any performance advantages over Microsoft-hosted agents?
In many cases, yes. Specifically:
If you use a self-hosted agent, you can run incremental builds. For example, if you define a pipeline that does not clean the repo and does not perform a clean build, your builds will typically run faster. When you use a Microsoft-hosted agent, you don't get these benefits because the agent is destroyed after the build or release pipeline is completed.
A Microsoft-hosted agent can take longer to start your build. While it often takes just a few seconds for your job to be assigned to a Microsoft-hosted agent, it can sometimes take several minutes for an agent to be allocated depending on the load on our system.
More information: Azure Pipelines Agents
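As a rough illustration, if you do move to a self-hosted agent, an incremental setup could look something like this fragment (the pool name is an assumption):

```yaml
# Hypothetical azure-pipelines.yml fragment for a self-hosted pool.
# Skipping the clean lets the workspace (sources, packages, build output)
# survive between runs, enabling incremental builds.
pool:
  name: MySelfHostedPool   # assumed pool name

steps:
  - checkout: self
    clean: false           # keep the previously fetched sources
```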
When you deploy via a DevOps pipeline, you go through a lot more steps. See below:
Process the pipeline --> Request an agent (wait for an available agent to be allocated to run the jobs) --> Download all the tasks needed to run the job --> Run each step in the job (download source code, restore, build, publish, deploy, etc.).
If you deploy the project in a release pipeline, the above process has to be repeated again for the release pipeline.
You can check the document Pipeline run sequence for more information.
However, when you deploy via the VS Code plugin, your project is restored and built on your local machine and then deployed to the Azure web app directly from there. So deploying via the VS Code plugin is faster, since far fewer steps are needed.
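To make those extra steps concrete, a minimal hosted-agent pipeline for a Django app could look roughly like the sketch below (task versions, the service connection, and the app name are all assumptions):

```yaml
# Hypothetical azure-pipelines.yml: every run starts on a fresh VM, so
# dependencies are reinstalled and the package is rebuilt each time.
trigger:
  - main

pool:
  vmImage: ubuntu-latest         # fresh Microsoft-hosted VM per run

steps:
  - task: UsePythonVersion@0
    inputs:
      versionSpec: '3.11'
  - script: pip install -r requirements.txt   # no cache on a fresh VM
    displayName: Install dependencies
  - task: ArchiveFiles@2                      # package the app for deployment
    inputs:
      rootFolderOrFile: '$(Build.SourcesDirectory)'
      includeRootFolder: false
      archiveFile: '$(Build.ArtifactStagingDirectory)/app.zip'
  - task: AzureWebApp@1                       # deploy to App Service
    inputs:
      azureSubscription: 'my-service-connection'   # assumed connection name
      appName: 'my-django-app'                     # assumed app name
      package: '$(Build.ArtifactStagingDirectory)/app.zip'
```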

Can test and production share the same cloud kubernetes environment?

I have created a kubernetes cluster and I successfully deployed my spring boot application + nginx reverse proxy for testing purposes.
Now I'm moving to production. The only difference between test and prod is the database connection and the nginx basic auth (of course, the scaling parameters are also different).
In this case, considering I'm using a cloud provider infrastructure, what are the best practices for kubernetes?
Should I create a new cluster only for prod? Or could I use the same cluster and use labels to identify test and production machines?
For now, having 2 clusters seems like a waste to me: the provider assures me that I have the hardware capabilities, and I can set different request/limit/replication parameters according to the environment. Also, for now, I just have 2 images to deploy per environment (even though for production I will opt for a horizontal scale of 2).
I would absolutely 100% set up a separate test cluster. (...assuming a setup large enough where Kubernetes makes sense; I might consider an easier deployment system for a simple three-tier app like what you're describing.)
At a financial level this shouldn't make much difference to you. You'll need some amount of hardware to run the test copy of your application, and your organization will be paying for it whether it's in the same cluster or a different cluster. The additional cost will only be the cost of the management plane, which shouldn't be excessive.
At an operational level, there are all kinds of things that can go wrong during a deployment, and in particular there are cases where one Kubernetes resource can "step on" another. Deploying to a physically separate cluster helps minimize the risk of accidents in production; you won't accidentally overwrite the prod deployment's ConfigMap holding its database configuration, for example. If you have some sort of crash reporting or alerting set up, "it came from the test cluster" is a very clear check you can use to not wake up the DevOps team. It also gives you a place to try out possibly risky configuration changes: if you run your update script once in the test cluster and it passes then you can re-run it in prod, but if the first time you run it is in prod and it fails, that's an outage.
Depending on what you're using for a CI system, the other thing you can set up is fully automated deploys to the test environment. If a commit passes its own unit tests, you can have the test environment always running current master and run integration tests there. If and only if those integration tests pass, you can promote to the production environment.
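For example, in GitLab CI (just one possible CI system; the stage names, scripts, and environment names below are assumptions) that promotion flow might look roughly like:

```yaml
# Hypothetical pipeline: master is always deployed to test, integration tests
# run there, and only a green run proceeds to the production deploy.
stages:
  - test
  - deploy-test
  - integration-test
  - deploy-prod

unit-tests:
  stage: test
  script: ./run-unit-tests.sh          # placeholder test command

deploy-to-test:
  stage: deploy-test
  script: ./deploy.sh test             # placeholder deploy script
  environment: test
  only: [master]

integration-tests:
  stage: integration-test
  script: ./run-integration-tests.sh   # placeholder
  only: [master]

deploy-to-prod:
  stage: deploy-prod
  script: ./deploy.sh prod             # placeholder deploy script
  environment: production
  only: [master]                       # runs only if the earlier stages passed
```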
It is true that it is definitely better practice to use a different cluster, as in your test cluster you could do something wrong (especially resource-wise) and take down your prod environment. But if you can't afford that, and if you feel confident with k8s, you can put your prod environment in a different namespace.
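A minimal sketch of what that separation could look like (the namespace name and quota numbers are assumptions):

```yaml
# Hypothetical: a dedicated namespace for the non-prod environment, with a
# ResourceQuota so its workloads can't starve the prod namespace.
apiVersion: v1
kind: Namespace
metadata:
  name: test
---
apiVersion: v1
kind: ResourceQuota
metadata:
  name: test-quota
  namespace: test
spec:
  hard:
    requests.cpu: "2"
    requests.memory: 4Gi
    limits.cpu: "4"
    limits.memory: 8Gi
```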
I don't know about Azure, but on GKE you can scale the number of nodes down to zero. If that is possible on Azure, maybe you can scale the test environment's nodes to zero whenever you're not using it and still keep 2 clusters.
It's better to use different clusters for production and dev/testing. Please refer here for best practices.

Handling multiple environments, spinning up environment for testing in Azure Devops

We have a large-scale service architecture with multiple features. Currently there are 3 environments apart from production, which leads to issues sharing environments across multiple teams working on the same application but on different features.
It would be nice if every programmer were able to test his/her code in an integrated environment before merging to master.
Is there a way in Azure DevOps to spin up a new temporary environment for testing that can be torn down after the code is merged to master?
I am not sure how to get started on this. It would be great if someone could help me.

How to manage patching on multiple AWS accounts with different schedules

I'm looking for the best way to manage patching Linux systems across AWS accounts with the following things to consider:
Separate schedules to roll patches through Dev, QA, Staging and Prod sequentially
Production patches to be released on approval, not automatic
No newer patches can be deployed to Production than what was already deployed to lower environments (as new patches come out periodically throughout the month)
We have started by caching all patches in all environments on the first Sunday of every month. The goal is then to install patches from that cache. This helps prevent unvetted patches from being installed in prod.
Most, but not all, instances are managed by OpsWorks, though there are numerous OpsWorks stacks. We have some other instances managed by Chef Server. Still others are not managed at all and are just simple EC2 instances created from the EC2 console. This means that using recipes requires us to kick off approved patches on a stack-by-stack or instance-by-instance basis. Not optimal.
More recently, we have looked at the newer SSM features, using a central AWS account to manage instances. However, this causes problems with some applications because the AssumeRole for SSM adds credentials to the .aws/config file, which interferes with other tasks we need to run.
We have considered other tools, such as Ansible, but we would like to explore staying within the toolset we currently have, which is largely OpsWorks and Chef Server. I'm looking for ideas at a higher level: an architecture for how one would approach this scenario.
Thanks for any thoughts or ideas.
This sounds like one of the exact scenarios RunCommand was designed for.
You can create multiple groups of servers with different schedules based on tags. More importantly, you don't need to rely on secrets/keys being deployed anywhere.
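As one possible wiring, tag-selected instances can be patched on a schedule via a Maintenance Window that runs the AWS-RunPatchBaseline document through Run Command; the CloudFormation sketch below is purely illustrative (names, schedule, tag values, and concurrency settings are assumptions):

```yaml
# Hypothetical per-environment patch window (repeat per env/account with its
# own schedule; prod could instead be kicked off manually after approval).
Resources:
  DevPatchWindow:
    Type: AWS::SSM::MaintenanceWindow
    Properties:
      Name: dev-monthly-patching
      Schedule: cron(0 4 ? * SUN#1 *)   # assumed: first Sunday, 04:00
      Duration: 3
      Cutoff: 1
      AllowUnassociatedTargets: false
  DevPatchTargets:
    Type: AWS::SSM::MaintenanceWindowTarget
    Properties:
      WindowId: !Ref DevPatchWindow
      ResourceType: INSTANCE
      Targets:
        - Key: tag:Environment          # assumed tag key/value
          Values:
            - Dev
  DevPatchTask:
    Type: AWS::SSM::MaintenanceWindowTask
    Properties:
      WindowId: !Ref DevPatchWindow
      TaskType: RUN_COMMAND
      TaskArn: AWS-RunPatchBaseline
      Priority: 1
      MaxConcurrency: "25%"
      MaxErrors: "1"
      Targets:
        - Key: WindowTargetIds
          Values:
            - !Ref DevPatchTargets
      TaskInvocationParameters:
        MaintenanceWindowRunCommandParameters:
          Parameters:
            Operation:
              - Install
```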

Implementing a CI/Deployment Pipeline for a Node app

I will shortly be in the process of rewriting a node app with the intention of implementing Continuous Integration and TDD.
I also want to design and set up a deployment pipeline for development, staging, and production.
Currently I'm using Shipit to push changes to different instances that have pre-configured environments. I've heard about deploying Docker containers with the needed environments, and I'd like to learn more about that.
I'm looking at TravisCI for automated testing/builds, and from my understanding, one can push the Docker image to a registry after the build succeeds.
I'm also learning about scaling, and looking at a design for production that incorporates Google Cloud servers/services serving 3 clustered instances of the node app, a Redis cluster, and 2 PostgreSQL nodes, with each service behind a load balancer.
I've heard of Kubernetes being used to manage and deploy containerized applications, but I'm curious how it all fits together.
In my head I think that it would seem like the process would be as follows:
Commit changes on the dev machine and push to the repository.
TravisCI builds and runs the tests (what about migrations and pushing changes to the PostgreSQL service?), then pushes the image to Google Cloud Container Registry.
Log into Google Container Engine and run the app with Kubernetes.
What about the Redis Cluster? The PostgreSQL nodes?
I apologize in advance if this question is lacking in clarity and knowledge, but I'm trying to learn more before I move along.
Have you considered Google Cloud Container Builder? It's very easy to set up a trigger from your GitHub repository, which will start a new build on changes (to a branch or tag).
As part of the build, you can push the new image to GCR.
And you could also deploy to Kubernetes as part of the same build.
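A rough sketch of what that cloudbuild.yaml could look like (the image, deployment, cluster, and zone names are assumptions):

```yaml
# Hypothetical build: build and push the image, then point the existing
# Kubernetes deployment at the new tag.
steps:
  - name: gcr.io/cloud-builders/docker
    args: ['build', '-t', 'gcr.io/$PROJECT_ID/node-app:$COMMIT_SHA', '.']
  - name: gcr.io/cloud-builders/docker
    args: ['push', 'gcr.io/$PROJECT_ID/node-app:$COMMIT_SHA']
  - name: gcr.io/cloud-builders/kubectl
    args:
      - set
      - image
      - deployment/node-app
      - node-app=gcr.io/$PROJECT_ID/node-app:$COMMIT_SHA
    env:
      - CLOUDSDK_COMPUTE_ZONE=us-central1-a     # assumed zone
      - CLOUDSDK_CONTAINER_CLUSTER=my-cluster   # assumed cluster name
images:
  - gcr.io/$PROJECT_ID/node-app:$COMMIT_SHA
```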
