I've created a new onetime pipeline in Azure Data Factory using Copy Data wizard.
The pipeline has 4 activities and it was run just fine - all 4 activities succeeded.
Now I'd like to rerun the pipeline. So I do:
Clone the pipeline.
Change name to [name]_rev2.
Remove start and end properties.
Deploy the cloned pipeline.
Now the status of the new cloned pipeline is Running.
But no activities are executed at all.
What's wrong?
Mmmmm. Where to start!
In short. You can't just clone the pipeline. If its a one time data factory that you've created it won't have a schedule attached and therefore won't have any time slices provisioned that now require execution.
If your unsure how time slices in ADF work a recommend some reading around this concept.
Next, I recommend opening up a Visual Studio 2015 solution and downloading the data factory that did run as a project. Check out the various JSON blocks for scheduling and interval availability in the datasets, activities and the time frame in the pipeline.
Here's a handy guide for scheduling and execution to understand how to control your pipelines.
https://learn.microsoft.com/en-us/azure/data-factory/data-factory-scheduling-and-execution
Once you've made all the required changes (not just the name) publish the project to a new ADF service.
Hope this helps
Related
I was hoping to get some feedback on using Azure Pipelines and what the best practices are for my situation.
We have recently migrated from TFS 2017 and we are in the process of re-writing all our pipelines. We were using builds and releases prior to the upgrade in the legacy build tasks. We would like to setup more useful YAML pipelines.
Let me set the stage with what we currently have
10+ microservices
10 individual builds that trigger from a folder in the repo for each one
10 releases that get created on successful build
3 environments per release (Dev, QA, UAT)
So in summary... a build of a single microservice triggers off of a commit to a folder in the branch. The successful build then triggers a release to Dev. Dev completes and a user go go start a QA build by clicking the release.
In the new Azure Pipeline world... what would be the best approach to doing this model.
We would like to have all the builds happen in a single pipeline (each stage would be a microservice?)
How do we trigger only on a commit to that folder?
What would the CD look like? Should it be in the same pipeline and be a new stage?
How can we easily just add environments without having to just keep copy/pasting all the code for each environment? Ideally i would like to just be able to add a variable and a new environment can be deployed to
I am open to any suggestions here. I am ok if I am way off here, I am looking for the best practices and best approach to this.
TIA
For my team where we have partner teams providing us SW pieces that need to be integrated on HW systems and tested together, our code footprint is small and hence churn is small, while number of changes from partner teams is frequent. In such a scenario, I see the need to trigger the release part of the yaml many more times than the build part. Is multi-stage pipelines the way to go? I want to trigger new release instances using RestAPI invoke only the Release stages on the YAML file, using AzureDevOps Rest API.
Regards,
You don't have to use multi-stage pipelines to be able to trigger repeated releases, it just makes the management of the pipeline cleaner.
It's possible to create a pipeline that include a build stage and release stages for each of your environments, trigger the build stage (manually or based on a CI trigger), and then from that Pipeline "Run", deploy as many times as you see fit to whatever environments you like. That can be done from API or portal.
It's also possible to create a pipeline that is "release-only" - that is, it gets created manually, or as the result of seeing a specified build having been run.
Personally, I like the multi-stage build because it's a little easier to see what build created the release that you're deploying around. It's not a requirement, though.
We have a self-hosted build agent on an on-prem server.
We typically have a large codebase, and in the past followed this mechanism with TFS2013 build agents:
Daily check-ins were built to c:\work\tfs\ (taking about 5 minutes)
Each night a batch file would run that did the same build to those folders, using the same sources (they were already 'latest' from the CI build), and build the installers. Copy files to a network location, and send an email to the team detailing the build success/failures. (Taking about 40 minutes)
The key thing there is that for the nightly build there would be no need to get the latest sources, and the disk space required wouldn't grow much. Just by the installer sizes.
To replicate this with Azure Devops, I created two pipelines.
One pipeline that did the CI using MSBuild tasks in the classic editor- works great
Another pipeline in the classic editor that runs our existing powershell script, scheduled at 9pm - works great
However, even though my agent doesn't support parallel builds what's happening is that:
The CI pipeline's folder is c:\work\1\
The Nightly build folder is c:\work\2\
This doubles the amount of disk space we need (10gb to 20gb)
They are the same code files, just built differently.
I have struggled to find a way to say to the agent "please use the same sources folder for all pipelines"
What setting is this, as we have to pay our service provider for extra GB storage otherwise.
Or do I need to change my classic pipelines into Yaml and somehow conditionally branch the build so it knows it's being scheduled and do something different?
Or maybe, stop using a Pipeline for the scheduled build, and use task scheduler in Windows as before?
(I did try looking for the same question - I'm sure I can't be the only one).
There is "workingDirectory" directive available for running scripts in pipeline. This link has details of this - https://learn.microsoft.com/en-us/azure/devops/pipelines/tasks/utility/command-line?view=azure-devops&tabs=yaml
The number '1' '2'...'6' of work folder c:\work\1\, c:\work\2\... c:\work\6\ in your build agent which stands for a particular pipeline.
Agent.BuildDirectory
The local path on the agent where all folders for a given build
pipeline are created. This variable has the same value as
Pipeline.Workspace. For example: /home/vsts/work/1
If you have two pipelines, there will also be two corresponding work folders. It's an except behavior. We could not configure pipelines to share the same build folde. This is by designed.
If you need to use less disk space to save cost, afraid stop using a Pipeline for the scheduled build, and use task scheduler in Windows as before is a better way.
So to give you a bit of context we have a service which has been split into two different services ie one for the read and one for the write side operations. The read side is called ProductStore and the write side is called ProductCatalog. The issue were facing was down the write side as the load tests create 100 products in the write side resource web app and then they are transferred to the read side for the load test to then read x number of times. If a build is launched in the product catalog because something new was merged to master then this will cause issues in the product store pipeline if it gets run concurrently.
The question I want to ask is there a way in the ProductStore yaml file to directly query via a specified azure task or via an AzurePowershell script to check if a build is currently running in the ProductCatalog pipeline.
The second part of this would be to loop/wait until that pipeline has successfully finished before resuming the product store pipeline.
Hope this is clear as I'm not sure how to best ask this question as I'm very new to the DevOps pipelines flow but this would massively help if there was a good way of checking this sort of thing.
As a workaround , you can set Pipeline completion trigger in ProductStore pipeline.
To trigger a pipeline upon the completion of another, specify the triggering pipeline as a pipeline resource.
Or configure build completion triggers in the UI, choose Triggers from the settings menu, and navigate to the YAML pane.
I have a data factory that I would like to publish, however I want to delay one of the pipelines from running as it uses a shared resource that isn't quite ready.
If possible I would like to allow the previous pipelines to run and then enable the downstream pipeline when the resource is ready for it.
How can I disable a pipeline so that I can re-enable it at a later time?
Edit your trigger and make sure Activated is checked NO. And of course don't forget to publish your changes!
Its not really possible in ADF directly. However, I think you have a couple of options to dealing with this.
Option 1.
Chain the datasets in the activities to enforce a fake dependency making the second activity wait. This is a bit clunky and requires the provisioning of fake datasets. But could work.
Option 2.
Manage it at a higher level with something like PowerShell.
For example:
Use the following cmdlet to check the status of the first activity and wait maybe in some sort of looping process.
Get-AzureRmDataFactoryActivityWindow
Next, use the following cmdlet to pause/unpause the downstream pipeline as required.
Suspend-AzureRmDataFactoryPipeline
Hope this helps.
You mentioned publishing, so if you are publishing trough Visual Studio, it is possible to disable a pipeline by setting its property "isPaused" to true in .json pipeline configuration file.
Property for making pipeline paused
You can disable pipeline by clicking Monitor & Manage in Data Factory you are using. Then click on the pipeline and in the upper left corner you have two options:
Pause: Will not terminate current running job, but will not start next
Terminate: Terminates all job instances (as well as not starting future ones)
GUI disabling pipeline
(TIP: Paused and Terminated pipeline have orange color, resumed have green color)
Use the powershell cmdlet to check the status of the activity
Get-AzureRmDataFactoryActivityWindow
Use the powershell cmdlet to pause/unpause a pipeline as required.
Suspend-AzureRmDataFactoryPipeline
Right click on the pipeline in the "Monitor and Manage" application and select "Pause Pipeline".
In case you're using ADF V2 and your pipeline is scheduled to run using a trigger, check which trigger your pipeline uses. Then go to the Manage tab and click on Author->Triggers. There you will get an option to stop the trigger. Publish the changes once you've stopped the trigger.