Export ARM template for ADF and deploy it to another ADF instance using DevOps - Azure

I have two instances of Azure Data Factory, one of which is integrated with a Git repo and one of which is not.
I have exported ARM templates from the non-Git ADF instance using the Export ARM template option.
I now have to deploy this ARM template to the second ADF instance using an Azure DevOps pipeline.
How should I set up this process with a DevOps pipeline?
Is there any way to fetch these downloaded ARM templates from a DevOps pipeline and deploy them to the other ADF instance?
Also, before deployment I have to replace some text in the JSON files (e.g. instance01/tmp/file to instance02/tmp/file). Is there a shell script that can be used here?

Deploying the code to the other Data Factory will not be a problem.
After downloading the ARM template for the non-Git-integrated instance, there should be a parameter for which Data Factory you are deploying to. It would be as simple as updating that parameter to point to the other instance and deploying the ARM template, either via Azure DevOps or PowerShell, with an Incremental deployment. This works because the deployment computes a delta, and each pipeline, linked service, and dataset is its own resource, so new resources are simply added. Just be careful about any conflicts in the naming standard.
The issue you will run into is the source-code definition. The way the CI/CD integration works, a collaboration branch (usually master) is defined, and when you click Publish the changes are consolidated and pushed to adf_publish by default. adf_publish is the definition of the Data Factory that is published. Since we are publishing outside of a repository, neither master nor adf_publish will have any knowledge of the changes. Unfortunately, given that adf_publish cannot merge back into master, the best option may be to stand up a new repo, or re-import the Data Factory definition after the manual publish has occurred.
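For the text-replacement step the question asks about, a small shell script can rewrite the environment-specific strings before the deployment runs. This is only a sketch: the JSON fragment, property name, resource group, and factory name are made-up placeholders, and the deployment is gated behind a flag so the replacement part can be dry-run on a machine without the Azure CLI.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Demo input: a fragment of an exported ARM template containing an
# environment-specific path (hypothetical property and value).
cat > factory.json <<'EOF'
{"typeProperties": {"folderPath": "instance01/tmp/file"}}
EOF

# Step 1: rewrite environment-specific strings in the exported JSON,
# e.g. instance01/tmp/file -> instance02/tmp/file.
sed -i 's|instance01/tmp/file|instance02/tmp/file|g' factory.json

# Step 2: incremental deployment -- only the delta is applied, and
# existing resources in the target factory are left untouched.
# Set RUN_DEPLOY=true on a machine with the Azure CLI and the right
# permissions; the resource group and factory name are placeholders.
if [ "${RUN_DEPLOY:-false}" = true ]; then
  az deployment group create \
    --resource-group rg-adf-02 \
    --template-file factory.json \
    --parameters factoryName=instance02 \
    --mode Incremental
fi
```

The same `sed` pattern can be looped over every `.json` file in the exported template folder if the string appears in more than one place.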

Related

ARM template deployment does not copy over files or settings in DevOps release pipeline

I'm trying to set up and test CI/CD integration in Azure Data Factory. So far I have connected the data factory to Git and set up a new environment with all the same resources (data factory, blob storage, SQL database). I have set up my ARM template to copy the "test" data factory over to the dev/publish data factory in a DevOps release pipeline, by connecting directly to the Git repository (the publish branch). When I run the release pipeline, everything seems fine and the template works (it connects to the new resources in the new environment, etc.). The publish pipeline is set up with the same settings; however, when I try to run it, it does not work.
I have narrowed it down to the fact that the ARM template does not seem to copy over my CSV files or make new entries in the database, even though several of the guides/YouTube videos I have seen appear to get this right.
Does anyone know what the issue might be, and whether it is possible to get an ARM template deployment to copy over the files? If not, are there any other functions in a pipeline release that can do this instead?
I have tried manually adding the containers in blob storage and populating the database table with a query, and then the rest of the pipeline seems to work fine.

Azure Data Factory Environment Set up

I am setting up an environment for ADF. I have multiple projects and segments that I need to support, and I have noticed that I can only set up one publish branch in ADF.
Should we create an ADF instance for each project?
What is the recommended approach for setting up an environment?
Multiple ADF instances are required only if there are completely different projects with different data. If you want to work in the same project and just need to manage different environments, such as development, testing, and production, all of this can be managed in one ADF.
1. A developer creates a feature branch to make a change, and debugs their pipeline runs with their most recent changes. For more information on how to debug a pipeline run, see Iterative development and debugging with Azure Data Factory.
2. After a developer is satisfied with their changes, they create a pull request from their feature branch to the main or collaboration branch to get their changes reviewed by peers.
3. After a pull request is approved and the changes are merged in the main branch, the changes get published to the development factory.
4. When the team is ready to deploy the changes to a test or UAT (User Acceptance Testing) factory, the team goes to their Azure Pipelines release and deploys the desired version of the development factory to UAT. This deployment takes place as part of an Azure Pipelines task and uses Resource Manager template parameters to apply the appropriate configuration.
5. After the changes have been verified in the test factory, deploy to the production factory by using the next task of the pipelines release.
Note: Only the development factory is associated with a Git repository. The test and production factories shouldn't have a Git repository associated with them and should only be updated via an Azure DevOps pipeline or via a Resource Manager template.
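A minimal CLI sketch of that promotion flow is below. In a real Azure Pipelines release these steps would typically be ARM template deployment tasks; the resource-group and factory names here are placeholders, and the actual calls are gated behind a flag since they need a logged-in Azure CLI.

```shell
#!/usr/bin/env bash
set -euo pipefail

# The template and parameter files produced by publishing the
# development factory (standard file names from adf_publish).
TEMPLATE=ARMTemplateForFactory.json
PARAMS=ARMTemplateParametersForFactory.json

# One template, different parameter values per environment.
deploy_to () {
  local rg=$1 factory=$2
  az deployment group create \
    --resource-group "$rg" \
    --template-file "$TEMPLATE" \
    --parameters "$PARAMS" \
    --parameters factoryName="$factory" \
    --mode Incremental
}

# Promote in order: verify in UAT before the production step runs.
if [ "${RUN_DEPLOY:-false}" = true ]; then
  deploy_to rg-adf-uat  adf-uat
  deploy_to rg-adf-prod adf-prod
fi
```

Overriding `factoryName` (and any connection-string parameters) per environment is what lets the same template serve UAT and production.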

DevOps deployment based on releases

I have a Data Factory instance that is linked to GitHub and used for development.
I have two different changes in two different branches of the data factory: change01 and change02.
I have merged these two changes into the master branch and done a publish.
Even though both changes are now available in the dev Data Factory instance after CI/CD, is it possible to deploy only change01 to the other environments?
How can we control which release/change goes out for deployment to the other environments?
Can we run a build directly from a branch and push it to prod?
To best accomplish this you will have to publish outside of the Data Factory editor. Each branch contains the necessary ARM components to publish the Data Factory ARM templates. The issue is that when you click the Publish button, Data Factory/ADO behind the scenes consolidates the ARM templates into a single .json file to make deployment easier, while simultaneously deploying to the destination Data Factory.
The best course of action here may be to work out how to publish the ARM templates without clicking the Publish button. This can be done using ARM deployments or PowerShell.
Furthermore, I'd consider the options you have for how to manage and deploy Data Factory under CI/CD.
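One way to generate the consolidated ARM template from any branch without the Publish button is Microsoft's `@microsoft/azure-data-factory-utilities` npm package. The sketch below assumes that package is wired up as the `build` script in a `package.json` at the repo root, per Microsoft's automated-publish guidance; the subscription ID, resource group, and factory name are placeholders.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Placeholders -- substitute your own identifiers.
SUBSCRIPTION_ID=00000000-0000-0000-0000-000000000000
RESOURCE_GROUP=rg-adf-dev
FACTORY_NAME=adf-dev
FACTORY_ID="/subscriptions/$SUBSCRIPTION_ID/resourceGroups/$RESOURCE_GROUP/providers/Microsoft.DataFactory/factories/$FACTORY_NAME"

# Set RUN_EXPORT=true inside a checkout of the branch you want to
# publish; requires Node.js and the npm package mentioned above.
if [ "${RUN_EXPORT:-false}" = true ]; then
  npm install
  # Validate every resource definition in the branch.
  npm run build validate . "$FACTORY_ID"
  # Emit the same consolidated ARM template the Publish button would
  # have pushed to adf_publish, into ./ArmTemplate.
  npm run build export . "$FACTORY_ID" "ArmTemplate"
fi
```

Running this in a build pipeline per branch gives you a deployable artifact per branch, which removes the dependency on the manual Publish click.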
I would suggest using a separate branch and configuring your builds to use the proper one. Verify your builds in Azure DevOps.
It can also be helpful to cherry-pick the changes that should be deployed.
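As a toy illustration of cherry-picking only the change you want to release (the repository, file name, and commit messages below are all made up): main carries both change01 and change02, but the release branch takes only change01.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Build a throwaway repo whose main branch has both changes.
repo=$(mktemp -d)
cd "$repo"
git init -q -b main
git config user.email ci@example.com
git config user.name ci

echo base > pipeline.json
git add . && git commit -qm "base"

echo change01 >> pipeline.json && git commit -qam "change01"
c1=$(git rev-parse HEAD)                # remember change01's commit
echo change02 >> pipeline.json && git commit -qam "change02"

# Release branch starts from the base commit and takes only change01;
# change02 stays on main until it is ready.
git checkout -qb release/change01 HEAD~2
git cherry-pick "$c1"
```

A build pipeline pointed at `release/change01` then deploys only that change, independently of what has been merged to main.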

How to use the build pipeline to deploy azure data factory resources?

I have an Azure Data Factory to deploy using Azure DevOps, so I have created a build pipeline with the "Publish Build Artifacts" task and an artifacts folder named "drop" that contains the resources I want to deploy.
I need help with the release pipeline for this. Which task should I use in my release pipeline to deploy the artifact folder "drop"?
I initially tried ARM template deployment, but it doesn't make use of the drop folder and deploys everything, i.e. the entire data factory, every time. So I created a build folder containing only the limited set of things to deploy, but I am now stuck on the release pipeline task for it.
Any help would be great. Thanks.
You need to use version 2 of the Azure Resource Group Deployment task.
Then you need to select the "Create Or Update Resource Group" action.
Provide the path to the ADF ARM templates in the template path text box.
Thanks,
Pratik
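The CLI equivalent of that task configuration, pointed at the templates inside the downloaded "drop" artifact, might look like this. The resource-group name is a placeholder, `SYSTEM_ARTIFACTSDIRECTORY` is the environment variable Azure DevOps release agents set for the artifacts folder, and the deployment is gated behind a flag.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Where the release downloads the build artifact; the fallback path
# is only a placeholder for local experimentation.
DROP="${SYSTEM_ARTIFACTSDIRECTORY:-/tmp/artifacts}/drop"

# Same effect as the Azure Resource Group Deployment (v2) task with
# the "Create Or Update Resource Group" action.
if [ "${RUN_DEPLOY:-false}" = true ]; then
  az deployment group create \
    --resource-group rg-adf-target \
    --template-file "$DROP/ARMTemplateForFactory.json" \
    --parameters "$DROP/ARMTemplateParametersForFactory.json" \
    --mode Incremental
fi
```

Incremental mode is what keeps the deployment from touching resources that are not described in the drop folder's templates.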

DevOps with Azure Data Factory

I have created an Azure Data Factory with a Copy Activity using C# and the Azure SDK.
How can I deploy it using CI/CD?
Any URL or link would help.
Data Factory continuous integration and delivery is now possible directly through the web user interface using ARM templates, or even Git (GitHub or Azure DevOps).
Just click "Set up Code Repository" and follow the steps.
Check the following link for more information, including a video demonstration: https://aka.ms/azfr/401/02
One idea I got from Microsoft was that, using the same Azure SDK, you could serialize the objects and save the JSON files, following the official directory structure, into your local GitHub/Git working directory.
In other words, you would mimic what the UI's Save All/Save button does from the portal.
Then, using Git Bash, you can just commit and push to your working branch (e.g. develop), and from the UI you can just publish (this will create an adf_publish release branch with the ARM objects).
Official reference for CI using VSTS and the UI Publish feature: https://learn.microsoft.com/en-us/azure/data-factory/continuous-integration-deployment
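The Git side of that workflow might look like the sketch below. The clone path and branch name are assumptions, and the folder names follow the per-resource-type layout the ADF UI writes (pipeline/, dataset/, linkedService/, trigger/); the push is gated behind a flag since it needs a real clone.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Placeholder: local clone of the repo the factory is attached to,
# already populated with the SDK-serialized JSON definitions.
CLONE="${CLONE:-/path/to/local/clone}"
BRANCH=develop

if [ "${RUN_SYNC:-false}" = true ]; then
  cd "$CLONE"
  git checkout "$BRANCH"
  # ADF expects one folder per resource type at the repo root.
  git add pipeline/ dataset/ linkedService/ trigger/
  git commit -m "Sync factory definitions generated via the SDK"
  git push origin "$BRANCH"
fi
```

After the push, publishing from the ADF UI consolidates these definitions into adf_publish as usual.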
Unfortunately, CI/CD for ADF is not very intuitive at first glance.
Check out this blog post where I describe the what/how/why step by step:
Deployment of Azure Data Factory with Azure DevOps
Let me know if you have any questions or concerns, and finally, whether it works for you.
Good luck!
My resource on how to enable CI/CD using Azure DevOps and Data Factory is the Microsoft article below:
Continuous integration and delivery (CI/CD) in Azure Data Factory
I am still new to DevOps and CI/CD, but I do know that other departments have this set up, and it looks to be working for them.