Why am I getting long docker build times in Azure pipeline? - azure

I'm having issues on troubleshooting long build times for my ADO pipeline. Build steps are taking much longer than expected. We're using self-hosted agents - I suspect there may be a problem there, but I wanted input on any other directions I should take in investigating.

According to your description, do you mean that same pipeline run via microsoft hosted agent takes much shorter time?
It's suggested that you could check the pipeline debug logs from microsoft-hosted agent and self-hosted agent to see which part of the task and which step would perform different speed. And you could also share the logs with us for investigations.
And you could configure another self hosted agent on another local machine to test again. Usually, the time spent on same pipeline run through different agent would vary from agent performance
And for more information, pipeline run on self-hosted agent would be effected by multiple factors, like hardware property, network transmission and other concept.

Related

Docker containers runs great locally. Now I need it on schedule in cloud

I've containerized a logic that I have to run on a schedule. If I do my docker run locally (whatever my image is local or it is using the one from the hub) everything works great.
Now I need though to run that "docker run" on a scheduled base, on the cloud.
Azure would be preferred, but honestly, I'm looking for the easier and cheapest way to achieve this goal.
Moreover, my schedule can change, so maybe today that job runs once a day, in the future that can change.
What do you suggest?
You can create an Azure Logic app to trigger the start of a Azure Container Instance. As you have a "run-once" (every N minute/hour/..) container, the restart-policy should be set to "Never", so that the container only executes and then stops after the scheduling.
The Logic app needs to have the permissions to start the Container, so add a role assignment on the ACI to the managed identity of the logic App.
Screenshot shows the workflow with a Recurrence trigger, that starts an existing container every minute.
Should be quite cheap and utilizes only Azure services, without any custom infrastructure
Professionally I used 4 ways to run cron jobs/ scheduled builds. I give a quick summary of all with it pros/cons.
GitLab scheduled builds (free)
My personal preference would be to setup a scheduled pipeline in GitLab. Simply add the script to a .gitlab-ci.yml, configure the scheduled build and you are done. This is the lightweight option and works in most cases, if the execution time is not too long. I used this approach for scraping simple pages.
Jenkins scheduled builds (not-free)
I used the same approach as GitLab with Jenkins. But Jenkins comes with more overhead and you have to configure the entire Jenkins on multiple machines.
Kubernetes CronJob (expensive)
My third approach would be using a kubernetes cronjob. However, I would only use this if I consume a lot of memory/ram, or have a long execution time. I used this approach for dumping really large data sets.
Run a cron job from a container (expensive)
My last option would be to deploy a docker container on either a VM or a Kubernetes cluster and configure a cron job from within that docker container. You can even use docker-in-docker for that. This gives maximum flexibility, but comes with some challenges. Personally I like the separation of concerns when it comes to down-times etc. That's why never run a cron job as main process.

Azure web app deployment using vscode is faster than devops pipeline

Currently, I am working on Django based project which is deployed in the azure app service. While deploying into the azure app service there were two options, one via using DevOps and another via vscode plugin. Both the scenario is working fine, but strangle while deploying into app service via DevOps is slower than vscode deployment. Usually, via DevOps, it takes around 17-18 minutes whereas via vscode it takes less than 14 min.
Any reason behind this.
Assuming you're using Microsoft hosted build agents, the following statements are true:
With Microsoft-hosted agents, maintenance and upgrades are taken care of for you. Each time you run a pipeline, you get a fresh virtual machine. The virtual machine is discarded after one use.
and
Parallel jobs represents the number of jobs you can run at the same time in your organization. If your organization has a single parallel job, you can run a single job at a time in your organization, with any additional concurrent jobs being queued until the first job completes. To run two jobs at the same time, you need two parallel jobs.
Microsoft provides a free tier of service by default in every organization that includes at least one parallel job. Depending on the number of concurrent pipelines you need to run, you might need more parallel jobs to use multiple Microsoft-hosted or self-hosted agents at the same time.
This first statement might cause an Azure Pipeline to be slower because it does not have any cached information about your project. If you're only talking about deploying, the pipeline first needs to download (and extract?) an artifact to be able to deploy it. If you're also building, it might need to bring in the entire source code and/or external packages before being able to build.
The second statement might make it slower because there might be less parallelization possible than on the local machine.
Next to these two possible reasons, the agents will most probably not have the specs of your development machine, causing them to run tasks slower than they can on your local machine.
You could look into hosting your own agents to eliminate these possible reasons.
Do self-hosted agents have any performance advantages over Microsoft-hosted agents?
In many cases, yes. Specifically:
If you use a self-hosted agent, you can run incremental builds. For example, if you define a pipeline that does not clean the repo and does not perform a clean build, your builds will typically run faster. When you use a Microsoft-hosted agent, you don't get these benefits because the agent is destroyed after the build or release pipeline is completed.
A Microsoft-hosted agent can take longer to start your build. While it often takes just a few seconds for your job to be assigned to a Microsoft-hosted agent, it can sometimes take several minutes for an agent to be allocated depending on the load on our system.
More information: Azure Pipelines Agents
When you deploy via DevOps pipeline. you will go through a lot more steps. See below:
Process the pipeline-->Request Agents(wait for an available agent to be allocated to run the jobs)-->Downloads all the tasks needed to run the job-->Run each step in the job(Download source code, restore, build, publish, deploy,etc.).
If you deploy the project in the release pipeline. Above process will need to be repeated again in the release pipeline.
You can check the document Pipeline run sequence for more information.
However, when you deploy via vscode plugin. Your project will get restored, built on your local machine, and then it will be deployed to azure web app directly from your local machine. So we can see deploying via vscode plugin is faster, since much less steps are needed.

Multi Self-hosted Azure pipeline agents on same server

We currently have more than one self-hosted Azure pipeline agents running on one server. Recently we have notices the pipelines are failing with "Network Path Issues", looks like all the steps run on one agent and somehow one of the steps jumps to a different agent causing it to fail. Is there a way to separate this other than creating new servers for each agent?
Looks like all the steps run on one agent and somehow one of the steps
jumps to a different agent causing it to fail. Is there a way to
separate this other than creating new servers for each agent?
I can't reproduce same issue on my side. I assume the self-agents you mentioned above are in same agent pool, if so, as I know Devops doesn't have one option to separate agents from same agent pool when these agents are installed in same machine.
About the strange behavior you met, you can try this to resolve it:
1.Since you may have more than one self-agents in same Agent Pool running in same server, I suggest you can try separate those agents in different Agent Pool. Since your agents are running in same server, in this situation one agent pool for one agent can be more suitable.
2.Assuming your steps may not in same agent job, check and make sure your different agent jobs use same agent pool.
Hope it helps:)
After going through lot of logs and all the pipelines that were having issues was able to find some similarities. Most of the issues were having on the step where we were using Powershell task and the task was outdated (replaced with new one by Azure). After updating all the powershell tasks the issue seems to have gone.

Setting for running pipelines in sequence - Azure Devops

Is there a parameter or a setting for running pipelines in sequence in azure devops?
I currently have a single dev pipeline in my azure DevOps project. I use this for infrastructure because I build, test, and deploy using scripts in multiple stages in my pipeline.
My issue is that my stages are sequential, but my pipelines are not. If I run my pipeline multiple times back-to-back, agents will be assigned to every run and my deploy scripts will therefore run in parallel.
This is an issue if our developers commit close together because each commit kicks off a pipeline run.
You can reduce the number of parallel jobs to 1 in your project settings.
I swear there was a setting on the pipeline as well but I can't find it. You could do an API call as part or your build/release to pause and start the pipeline as well. Pause as the first step and start as the last step. This will ensure the active pipeline is the only one running.
There is a new update to Azure DevOps that will allow sequential pipeline runs. All you need to do is add a lockBehavior parameter to your YAML.
https://learn.microsoft.com/en-us/azure/devops/release-notes/2021/sprint-190-update
Bevan's solution can achieve what you want, but there has an disadvantage that you need to change the parallel number manually back and forth if sometimes need parallel job and other times need running in sequence. This is little unconvenient.
Until now, there's no directly configuration to forbid the pipeline running. But there has a workaruond that use an parameter to limit the agent used. You can set the demand in pipeline.
After set it, you'll don't need to change the parallel number back and forth any more. Just define the demand to limit the agent used. When the pipeline running, it will pick up the relevant agent to execute the pipeline.
But, as well, this still has disadvantage. This will also limit the job parallel.
I think this feature should be expand into Azure Devops thus user can have better experience of Azure Devops. You can raise the suggestion in our official Suggestion forum. Then vote it. Our product group and PMs will review it and consider taking it into next quarter roadmap.

Limit azure pipeline to only run one after the other rather than in parallel

I have set up a PR Pipeline in Azure. As part of this pipeline I run a number of regression tests. These run against a regression test database - we have to clear out the database at the start of the tests so we are certain what data is in there and what should come out of it.
This is all working fine until the pipeline runs multiple times in parallel - then the regression database is being written to multiple times and the data returned from it is not what is expected.
How can I stop a pipeline running in parallel - I've tried Google but can't find exactly what I'm looking for.
If the pipeline is running, the the next build should wait (not for all pipelines - I want to set it on a single pipeline), is this possible?
Depending on your exact use case, you may be able to control this with the right trigger configuration.
In my case, I had a pipeline scheduled to kick off every time a Pull Request is merged to the main branch in Azure. The pipeline deployed the code to a server and kicked off a suite of tests. Sometimes, when two merges occurred just minutes apart, the builds would fail due to a shared resource that required synchronisation being used.
I fixed it by Batching CI Runs
I changed my basic config
trigger:
- main
to use the more verbose syntax allowing me to turn batching on
trigger:
batch: true
branches:
include:
- main
With this in place, a new build will only be triggered for main once the previous one has finished, no matter how many commits are added to the branch in the meantime.
That way, I avoid having too many builds being kicked off and I can still use multiple agents where needed.
One way to solve this is to model your test regression database as an "environment" in your pipeline, then use the "Exclusive Lock" check to prevent concurrent "deployment" to that "environment".
Unfortunately this approach comes with several disadvantages inherent to "environments" in YAML pipelines:
you must set up the check manually in the UI, it's not controlled in source code.
it will only prevent that particular deployment job from running concurrently, not an entire pipeline.
the fake "environment" you create will appear in alongside all other environments, cluttering the environment view if you happen to use environments for "real" deployments. This is made worse by this view being a big sack of all environments, there's no grouping or hierarchy.
Overall the initial YAML reimplementation of Azure Pipelines mostly ignored the concepts of releases, deployments, environments. A few piecemeal and low-effort aspects have subsequently been patched in, but without any real overarching design or apparent plan to get to parity with the old release pipelines.
You can use "Trigger Azure DevOps Pipeline" extension by Maik van der Gaag.
It needs to add to you DevOps and configure end of the main pipeline and point to your test pipeline.
Can find more details on Maik's blog.
According to your description, you could use your own self-host agent.
Simply deploy your own self-hosted agents.
Just need to make sure your self host agent environment is the same as your local development environment.
Under this situation, since your agent pool only have one available build agent. When multiple builds triggered, only one build will be running simultaneously. Others will stay in queue with a specific order for agents. Unless the prior build finished, it will not run with next build.
For other pipeline, just need to keep use the host agent pool.

Resources