DevOps with Azure Data Factory

I have created an Azure Data Factory with a Copy activity using C# and the Azure SDK.
How can I deploy it using CI/CD?
Any URL or link would help.

Data Factory continuous integration and delivery is now possible directly through the web user interface, using ARM templates or Git (GitHub or Azure DevOps).
Just click on "Set up Code Repository" and follow the steps.
Check the following link for more information, including a video demonstration: https://aka.ms/azfr/401/02

One idea that I got from Microsoft was that, using the same Azure SDK, you could serialize the objects and save the JSON files into your local GitHub/Git working directory, following the official directory structure.
In other words, you would have to mimic what the UI Save All/Save button does from the portal.
Then, using Git Bash, you can just commit and push to your working branch (e.g. develop), and from the UI you can just publish (this will create an adf_publish release branch with the ARM objects).
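For what it's worth, here is a minimal PowerShell sketch of that idea. It assumes the Az.DataFactory module and an authenticated session; the resource group, factory name, and repo path are hypothetical placeholders.

```powershell
# Minimal sketch: export existing factory objects as JSON files laid out the
# way ADF Git integration expects (pipeline/, dataset/, ...). Assumes the
# Az.DataFactory module and Connect-AzAccount has already been run;
# all names and paths below are hypothetical placeholders.
$rg      = "my-rg"
$factory = "my-adf"
$repoDir = "C:\src\my-adf-repo"   # local Git working directory

foreach ($folder in "pipeline", "dataset") {
    New-Item -ItemType Directory -Force -Path (Join-Path $repoDir $folder) | Out-Null
}

# Serialize each object to its own JSON file. Caveat: ConvertTo-Json on the
# SDK object approximates, but may not exactly match, the schema the ADF UI
# writes on Save; compare against a UI-saved file before relying on it.
foreach ($p in Get-AzDataFactoryV2Pipeline -ResourceGroupName $rg -DataFactoryName $factory) {
    $p | ConvertTo-Json -Depth 20 |
        Set-Content (Join-Path $repoDir "pipeline\$($p.Name).json")
}
foreach ($d in Get-AzDataFactoryV2Dataset -ResourceGroupName $rg -DataFactoryName $factory) {
    $d | ConvertTo-Json -Depth 20 |
        Set-Content (Join-Path $repoDir "dataset\$($d.Name).json")
}
# Repeat for Get-AzDataFactoryV2LinkedService and Get-AzDataFactoryV2Trigger,
# then commit and push from $repoDir with Git, as described above.
```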
Official reference for CI using VSTS and the UI Publish feature: https://learn.microsoft.com/en-us/azure/data-factory/continuous-integration-deployment

Unfortunately, CI/CD for ADF is not very intuitive at first glance.
Check out this blog post where I describe the what, how, and why step by step:
Deployment of Azure Data Factory with Azure DevOps
Let me know if you have any questions or concerns, and whether that works for you.
Good luck!

My resources on how to enable CI/CD using Azure DevOps and Data Factory come from the Microsoft site below:
Continuous integration and delivery (CI/CD) in Azure Data Factory
I am still new to DevOps and CI/CD, but I do know that other departments have this set up, and it appears to be working for them.

Related

Azure Synapse Analytics - CI/CD workspace and infrastructure - Design question

I have an Azure Repos project with IaC code and CI/CD YAML pipelines to set up Azure Synapse infrastructure. Can you recommend the right approach for integrating the workspace with Git? Should I create a new project in Azure Repos for the Synapse artifacts, or should I use the same repository as the infrastructure project?
I will be setting up CI/CD pipelines to deploy the Azure Synapse artifacts as well.
Thanks!
Here's an answer to a similar question I posted:
You'll want to follow Microsoft's guide on CI/CD with Synapse. This is a great walkthrough video for the process.
Work in your development environment and commit to your collaboration branch.
A pipeline triggers off of your workspace_publish branch, releasing your code to your next environment (preferably with an approval gate).
Continue releasing the same code to higher environments, again with approval gates.
For Azure DevOps, you can use variable groups to parameterize your pipeline. Also make sure to read through the custom parameters section in that guide, to parameterize parts of the template that are not parameterized by default (see the sketch below).
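As an illustration of that last point, here is a minimal PowerShell sketch of deploying the template that Synapse publishes to the workspace_publish branch while overriding a parameter. The resource group and workspace names are hypothetical placeholders; in a release pipeline they would come from a variable group.

```powershell
# Minimal sketch: deploy the ARM template from the workspace_publish branch
# to a higher environment, overriding the workspace-name parameter. All
# names are placeholders; feed them from a variable group in Azure DevOps.
$deploy = @{
    ResourceGroupName     = "rg-synapse-test"
    TemplateFile          = "TemplateForWorkspace.json"
    TemplateParameterFile = "TemplateParametersForWorkspace.json"
    workspaceName         = "synapse-ws-test"   # overrides the template parameter
}
New-AzResourceGroupDeployment @deploy -Mode Incremental
```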

Load pipelines from repository into ADF

I have a newly created ADF and I have to configure the repository with the ADF. I have almost 100 pipelines, plus their related datasets, linked services, and triggers, in the repository. How can I load all the pipelines and their related resources into the ADF? Once I configure Git with the ADF, I am unable to see the pipelines. Any thoughts?
Did you set the options correctly when configuring the repo?
In particular, double-check the Root folder path. If it's not set accurately, you will not be able to see pipelines, datasets, etc.
I followed the steps below to configure a GitHub repo in ADF. Please make sure you followed the same steps.
Enter the name of the GitHub repository owner. After this, you will be redirected to the GitHub login page.
Note: authoring directly with the Data Factory service is disabled in the Azure Data Factory UX when a Git repository is configured. Changes made via PowerShell or an SDK are published directly to the Data Factory service, and are not entered into Git.
Select your repository and fill in the required details.
You will then see all the pipelines and datasets from the repository in the authoring UI.
For more information, see the GitHub integration best practices.

Export ARM template for ADF and deploy it in another ADF instance using DevOps

I have two instances of Azure Data Factory, one of which is integrated with a Git repo and one of which is not.
I have taken ARM templates from the non-Git ADF instance using the Export ARM template option.
I have to deploy this ARM template to the second ADF instance using an Azure DevOps pipeline.
How should I do this using the DevOps pipeline?
Is there any way to fetch these downloaded ARM templates in the DevOps pipeline and deploy them to the other ADF instance?
Also, before deployment I have to replace some text in the JSON files (e.g. instance01/tmp/file to instance02/tmp/file). Is there any shell script that can be used here?
Deploying the code to the other Data Factory will not be a problem.
After downloading the ARM template for the non-Git-integrated factory, there should be a parameter for which Data Factory you are deploying to. It is as simple as updating that parameter to point to the other instance of Data Factory and deploying the ARM template, either via Azure DevOps or via PowerShell with an incremental deployment. This works because the deployment does a delta, and since each pipeline, linked service, and dataset is its own resource, it should deploy the new ones. Just be careful if there are any conflicts in the naming standard.
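A minimal PowerShell sketch of both steps (the text replacement you asked about, then the incremental deployment) could look like the following. The file names, resource group, factory name, and parameter name are placeholder assumptions based on what ADF's Export ARM template typically produces; adjust them to your actual export.

```powershell
# Minimal sketch: rewrite environment-specific strings in the exported ARM
# parameters file, then deploy incrementally to the target factory.
# File names and values below are placeholders; adjust to your export.
(Get-Content "arm_template_parameters.json" -Raw) `
    -replace 'instance01/tmp/file', 'instance02/tmp/file' |
    Set-Content "arm_template_parameters.json"

$deploy = @{
    ResourceGroupName     = "rg-adf-02"
    TemplateFile          = "arm_template.json"
    TemplateParameterFile = "arm_template_parameters.json"
    factoryName           = "adf-instance-02"   # parameter pointing at the target ADF
}
New-AzResourceGroupDeployment @deploy -Mode Incremental
```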
The issue you will run into is the source code definition. The way the CI/CD integration works, a collaboration branch (usually master) is defined, and when you click publish the changes are consolidated and pushed to adf_publish by default. The adf_publish branch is the definition of the Data Factory that is published. Since we are publishing outside of a repository, neither master nor adf_publish will have any knowledge of the changes. Unfortunately, given that adf_publish cannot merge back into master, the best way might be to stand up a new repo or re-import the data factory definition after the manual publish has occurred.

What are the standards of Azure SQL development with Azure DevOps?

What are the industrial standards for developing a CI/CD pipelines for Azure SQL database?
I have an existing Azure SQL database (a DEV instance, including schemas, tables, functions, stored procedures, etc.). The code for these is handwritten (meaning it was not generated using SSDT compare, not scripted from the existing tables/SPs/functions, and not a DACPAC/BACPAC file; it is just the code the developers wrote) and is maintained in a Git repo.
Now, my users want to create another database using the scripts which were uploaded into Git (Bitbucket) by the developers, i.e. identify all the dependencies of the DB objects and execute them in order to create the new database. Is this the correct approach? Consider this approach 1.
After investing lots of time in deployments, I am now fairly convinced that it is advisable to follow the approach below instead; let's call it approach 2:
Create a solution and clone your existing Git repo in Visual Studio.
Import the DB objects in Solution Explorer and push the solution to Git.
Create a build pipeline with steps to build the solution and copy/publish the artifact.
Create a new release pipeline, use the "Azure SQL Data Warehouse deployment" task, and link the DACPAC file (which is generated dynamically by the step above).
Now, for incremental changes, my assumption is: change the code -> upload to Git -> build the solution -> run the release (the DACPAC file generated by the build pipeline will be compared with the current QA DB and only new changes will be applied; behind the scenes, SqlPackage is used by the "Azure SQL Data Warehouse deployment" task to do the comparison at release).
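If it helps to validate that assumption, here is a minimal sketch of the SqlPackage call the deployment task effectively performs; the server, database, credentials, and file paths are placeholders.

```powershell
# Minimal sketch: publish a DACPAC with SqlPackage, which diffs the package
# against the target database and applies only the incremental changes.
# Server, database, user, and file paths are placeholders.
& SqlPackage.exe /Action:Publish `
    /SourceFile:"bin\Release\MyDatabase.dacpac" `
    /TargetServerName:"myserver-qa.database.windows.net" `
    /TargetDatabaseName:"MyDatabase" `
    /TargetUser:"sqladmin" `
    /TargetPassword:"$env:SQL_PASSWORD"
```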
Links I have gone thru:
Configure CD of Azure SQL database using Azure DevOps and Visual Studio
Please correct me if my understanding is wrong.
Thanks a ton,
A DevOps newbie here.
Azure DevOps Services provides the Azure SQL Database deployment task to deploy an Azure SQL database.
So approach 2 is the common way. With this task we can deploy an Azure SQL database using a DACPAC, or run scripts using SQLCMD (a sketch of the SQLCMD variant follows below).
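For the SQLCMD variant, a minimal sketch, assuming the SqlServer PowerShell module is installed; the server, database, credentials, and script name are placeholders:

```powershell
# Minimal sketch: run a deployment script against the target database, as
# the task's SQLCMD option does. All names and credentials are placeholders.
Invoke-Sqlcmd -ServerInstance "myserver-qa.database.windows.net" `
    -Database "MyDatabase" `
    -Username "sqladmin" `
    -Password $env:SQL_PASSWORD `
    -InputFile "deploy\create_objects.sql"
```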
You can also reference the following links:
Tutorial: Deploy your ASP.NET app and Azure SQL Database code by using Azure DevOps Starter
DevOps for Azure SQL
Azure SQL Database CI/CD with Azure DevOps

Continuous integration and continuous deployment in Azure Data Factory

I want to do continuous integration and deployment in Azure Data Factory. I'm not able to find any specific document explaining this.
How can I do it or where can I read about it?
To build your project, you can use MSBuild, just like it's done in Visual Studio. It will validate syntax, check references between JSON configurations, and check all dependencies. If you are using Visual Studio Team Services as your CI server, you can use the Visual Studio Build step in the build configuration to do this. However, it requires installing the ADF tools for Visual Studio on the build agent machine.
To deploy, you can try:
PowerShell. For example, you can use Set-AzureRmDataFactoryV2Dataset to deploy datasets (see the sketch after this list). There are similar commands for all other configuration types, and for version 1 of Azure Data Factory as well.
If you are using VSTS, you can try this third-party extension. It allows you to deploy JSON configurations and start/pause pipelines. I'm not sure if it works with ADF v2.
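A minimal sketch of the PowerShell option from the list above; the resource group, factory, dataset, and file names are placeholders:

```powershell
# Minimal sketch: deploy a single dataset definition from a JSON file using
# the cmdlet named above; analogous cmdlets exist for pipelines, linked
# services, and triggers. All names below are placeholders.
Set-AzureRmDataFactoryV2Dataset `
    -ResourceGroupName "my-rg" `
    -DataFactoryName "my-adf" `
    -Name "MyDataset" `
    -DefinitionFile "dataset\MyDataset.json" `
    -Force   # overwrite an existing dataset without prompting
```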
You can use the VSTS Git integration with the ADF v2 UX to do continuous deployment and continuous integration. The VSTS Git integration allows you to choose a feature/development branch or create a new one in your VSTS Git repo. You can work in your feature/development branch and create a PR in VSTS Git to merge your changes to the master branch. You can then publish to your data factory using the ADF v2 UX. Please try this and let us know if it doesn't work for you or you face any issues.