Azure Data Factory: Visual Studio - Azure

I am learning Azure Data Factory and would really like to do its development in the Visual Studio environment. I have VS 2019 installed on my machine, and I don't see an option to develop ADF in it.
Is there any version of VS that ADF can be developed in, or are we stuck with developing it in the web UI for the time being?
I know the BI development tools needed an additional plug-in for the VS environment to work. Does ADF need something similar to that too?
If not, how can we back up our work done in the ADF web UI? Is there an option to link it somehow with Azure Repos or Git?

Starting with ADF V2, development is really intended to be done completely in the web interface. I had the same question as you at the time, but now the web tools are quite good and I don't give it a second thought. While I'm sure there are other options for developing and deploying the ARM templates, do yourself a favor and use the web UI.
By default, Data Factory only saves code changes on "Publish". An optional configuration enables source control via Git integration; you can use either Azure DevOps or GitHub. I highly recommend this approach, even if you only ever work in the main branch (fine for lone developers, a bad idea for collaboration). In this case, Publish takes the current state of the main branch and surfaces your artifacts to the ADF service. That means you will still need to Publish for your changes to be live.
NOTE: Git integration is also supported in Azure Synapse, where it has tremendous value for collaboration across a wide variety of artifact types.

Related

Closing the project in Azure DevOps

We have multiple small projects within our organization. Ever since we adopted Azure DevOps recently, we started creating individual Azure DevOps projects for every one of these projects. The ALM process has been going on very well, with end-to-end traceability established within the online tool.
However, as we get to the end of each of these projects, we have started to realize that each individual project and its code need to be maintained further for bug fixes and hotfixes. Unfortunately, Microsoft doesn't give a clear road map for a project's code once the project is completed.
Since we have a separate DevOps project for all the maintenance applications, this becomes much more confusing.
So, could you suggest the best practice for maintaining the code after the project is completed? Here are some options that I can think of.
Just keep the code in the same project and keep doing the bug fixes there. But this will create an administrative nightmare of keeping track of multiple projects for our maintenance application landscape (especially since we already have a single project with multiple repos and teams maintained).
Migrate the code from the completed project's repository to the maintenance project. But this again is a migration from one repo to another, so I am not sure if this is right.
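For reference, a rough, hypothetical sketch of what option 2 could look like. As far as I know, Azure DevOps has no built-in "move repo between projects" operation, so a mirror clone/push is the usual workaround; the organization, project and repo names below are made up, and an empty repo must exist in the maintenance project first:

```powershell
# Hypothetical example: move a finished project's repo into the maintenance project
# while keeping full history (create the empty target repo "orders-app" beforehand).
git clone --mirror https://dev.azure.com/contoso/OrdersProject/_git/orders-app
cd orders-app.git
git push --mirror https://dev.azure.com/contoso/Maintenance/_git/orders-app
```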

Azure Data Factory V2 multiple environments like in SSIS

I'm coming from a long SSIS background, and we're looking to use Azure Data Factory v2, but I'm struggling to find any (clear) way of working with multiple environments. In SSIS we would have project parameters tied to the Visual Studio project configuration (e.g. development/test/production etc.), and say there were two parameters, SourceServerName and DestinationServerName, these would point to different servers depending on whether we were in development or test.
From my initial playing around I can't see any way to do this in Data Factory. I've searched Google, of course, but any information I've found seems to be about CI/CD, then talks about Git 'branches' and is difficult to follow.
I'm basically looking for a very simple explanation and example of how this would be achieved in Azure Data Factory v2 (if it is even possible).
It works differently. You create an instance of data factory per environment and your environments are effectively embedded in each instance.
So here's one simple approach:
Create three data factories: dev, test, prod
Create your linked services in the dev environment pointing at dev sources and targets
Create the same named linked services in test, but of course these point at your test systems
Now when you "migrate" your pipelines from dev to test, they use the same logical name (just like a connection manager)
So you don't designate an environment at execution time or map variables or anything... everything in test just runs against test because that's the way the linked services have been defined.
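For illustration, a minimal sketch of that idea using the Az.DataFactory PowerShell module (the resource group, factory and file names are made up; each definition file holds the connection details for its own environment):

```powershell
# The same logical linked service name ("SqlSource") is created in every factory,
# but each environment's definition file points at that environment's own server.
Set-AzDataFactoryV2LinkedService -ResourceGroupName "rg-etl-dev" `
    -DataFactoryName "adf-etl-dev" -Name "SqlSource" `
    -DefinitionFile ".\linkedServices\SqlSource.dev.json"

Set-AzDataFactoryV2LinkedService -ResourceGroupName "rg-etl-test" `
    -DataFactoryName "adf-etl-test" -Name "SqlSource" `
    -DefinitionFile ".\linkedServices\SqlSource.test.json"
```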
That's the first step.
The next step is to connect only the dev ADF instance to Git. If you're a newcomer to Git it can be daunting but it's just a version control system. You save your code to it and it remembers every change you made.
Once your pipeline code is in git, the theory is that you migrate code out of git into higher environments in an automated fashion.
If you go through the links provided in the other answer, you'll see how you set it up.
I do have an issue with this approach though - you have to look up all of your environment values in the key store, which to me is silly, because why do we need to specify the test server's hostname every time we deploy to test?
One last thing: if you have a pipeline that doesn't use a linked service (say a REST pipeline), I haven't found a way to make that environment aware. I ended up building logic around the current data factory's name to dynamically change endpoints.
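As a hypothetical sketch of that last point (the factory names and URLs are made up, and the expression is ADF dynamic content, wrapped in a here-string only so it can be pasted into a pipeline definition), the built-in pipeline().DataFactory system variable can drive the endpoint:

```powershell
# Pick an endpoint based on which factory the pipeline is running in,
# e.g. factories named "adf-sales-dev" and "adf-sales-test" (made-up names).
$webActivityUrl = @'
@if(contains(pipeline().DataFactory, '-dev'),
    'https://api-dev.example.com/items',
    'https://api-test.example.com/items')
'@
```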
This is a bit of a brain dump but feel free to ask questions.
Although it's not recommended - yes, you can do it.
Take a look at the Linked Service definition - in this case, I have a connection to an Azure SQL Database.
You can use dynamic content for both the server name and the database name.
Just add a parameter to your pipeline, pass it to the Linked Service and use it in the required field.
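For example, a hypothetical definition along these lines (the names are made up and the authentication part of the connection string is omitted; the exact shape of your linked service may differ). It could be created with Set-AzDataFactoryV2LinkedService from the Az.DataFactory module:

```powershell
# Parameterised Azure SQL Database linked service: server and database names
# come from linked-service parameters resolved at runtime.
$definition = @'
{
    "name": "AzureSqlDb",
    "properties": {
        "type": "AzureSqlDatabase",
        "parameters": {
            "serverName":   { "type": "String" },
            "databaseName": { "type": "String" }
        },
        "typeProperties": {
            "connectionString": "Server=tcp:@{linkedService().serverName}.database.windows.net,1433;Database=@{linkedService().databaseName};"
        }
    }
}
'@
Set-Content -Path .\AzureSqlDb.json -Value $definition
Set-AzDataFactoryV2LinkedService -ResourceGroupName "rg-adf" -DataFactoryName "adf-demo" `
    -Name "AzureSqlDb" -DefinitionFile .\AzureSqlDb.json
```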
Let me know whether I explained it clearly enough?
Yes, it's possible, although not as simple as it was in VS for SSIS.
1) First of all: there is no desktop application for developing ADF, only the browser.
Therefore developers should make their changes in the DEV environment, and for many reasons the best way to do that is to work with a Git repository connected.
2) Then, you need "only":
a) Publish the changes (this creates/updates the adf_publish branch in Git)
b) With Azure DevOps, deploy the code from adf_publish, replacing the required parameters for the target environment.
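A minimal sketch of step (b), assuming the ARM template files that Publish generates in the adf_publish branch and the Az PowerShell module (the resource group, factory name and connection-string parameter name below are made up; the generated parameter names in your template will differ):

```powershell
# Deploy the factory ARM template from adf_publish to the test environment,
# overriding the factory name and an example connection-string parameter.
New-AzResourceGroupDeployment -ResourceGroupName "rg-adf-test" `
    -TemplateFile ".\ARMTemplateForFactory.json" `
    -TemplateParameterFile ".\ARMTemplateParametersForFactory.json" `
    -factoryName "adf-sales-test" `
    -LS_AzureSqlDb_connectionString "Server=tcp:sql-test.database.windows.net,1433;Database=Sales;"
```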
I know that at the beginning it sounds horrible, but the sooner you set up an environment like this the more time you save while developing pipelines.
How to do these things step by step?
I describe all the steps in the following posts:
- Setting up Code Repository for Azure Data Factory v2
- Deployment of Azure Data Factory with Azure DevOps
I hope this helps.

Azure release management with Visual Studio Online

I'm looking for some advice concerning release management in Azure.
I've found a lot of resources, but I still have some questions.
I have an ASP.NET 4 solution (to keep it simple: one ASP.NET project, one database project, one test project).
I'm using Git in Visual Studio Online.
At the moment I have one App Service and one SQL Server database in Azure.
I have a build that downloads NuGet packages, builds, executes a DACPAC for the database, executes the tests from the test project (I have integration tests that use the database) and finally deploys the app to an Azure App Service.
What I want to do seems like "normal stuff":
I want to create the build, then deploy it to a "dev" environment in Azure, then to a "qa" environment, then to "staging" and "prod".
In my web project, I created different web.config transformations (one for each environment).
I've seen Releases in Visual Studio Online, and I get that they cover the deployment part across the different environments.
What I have questions about:
In Azure:
Do I create one App Service per environment, or do I create a single App Service and use slots?
Do I create one SQL Server for each environment, or is it better to use a single SQL Server and have one database for each environment?
In Visual Studio Online:
How do I set up the tasks?
In the build part, what configuration do I use? Which environment do I select?
In the build, how do I manage the database project? I think the deployment part should be in the release part, but I don't see how to configure the connection string.
For the tests: do I execute them in the release part? How do I change the connection strings and app settings? There are no transformations for the app.config.
I've seen that there are Test Plans as well, but I don't really get them.
Can somebody help me see all of this a little more clearly?
I cannot answer all of these, but I can answer a few.
I would create separate Web App instances for your separate environments. With slots, your slots exist on the same Web App and share computing resources. If something goes horribly wrong (your staging code pegs CPU to 100% or eats all of your RAM), this will cause problems for your production slot. I see slots as part of A/B testing or to aid in deployment.
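If you go that route, a minimal sketch with the Az PowerShell module (the names, location and tier are made up); each environment gets its own plan and app so they never share compute:

```powershell
# One App Service plan and Web App per environment, so QA/staging load
# cannot starve production of CPU or memory.
New-AzAppServicePlan -ResourceGroupName "rg-myapp-qa" -Name "plan-myapp-qa" `
    -Location "WestEurope" -Tier "Standard"
New-AzWebApp -ResourceGroupName "rg-myapp-qa" -Name "myapp-qa" `
    -Location "WestEurope" -AppServicePlan "plan-myapp-qa"
```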
You will likely need to create a separate database per environment as well. This is almost always required if you will be upgrading your database schema at any point in the future and introducing breaking changes. For example, what happens if your production code requires a specific field in a database table, but your next version of the database removes that field?
You mentioned you're using web.config transforms, but I want to throw out another option that we've found to be easier and have fewer moving parts and sources for error. If you're just changing connection strings and AppSettings, consider using the Web App's application settings per environment. These override whatever is in your web.config. Doing so means you can forget about web.config transforms and not have one more thing that could possibly go wrong in a deployment.
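For example, a minimal sketch with the Az PowerShell module (the web app and setting names are made up; note that -AppSettings replaces the whole collection, so include every setting you want to keep):

```powershell
# Point the QA web app at QA resources via per-environment application settings,
# instead of relying on web.config transforms.
Set-AzWebApp -ResourceGroupName "rg-myapp-qa" -Name "myapp-qa" -AppSettings @{
    "Environment" = "QA"
    "ApiBaseUrl"  = "https://api-qa.example.com"
}
```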
Since you're using a Database project, to deploy your database, check out the VSTS Azure SQL Database Deployment task. It'll use your database project to create a DACPAC, and then deploy that to your target server.
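That task essentially runs SqlPackage to publish a DACPAC, so a rough equivalent looks like the sketch below (the server, database, credentials and path are made up); the release can run it once per environment with a different target connection string:

```powershell
# Publish the DACPAC produced by the build to the QA database.
& "SqlPackage.exe" /Action:Publish `
    /SourceFile:".\MyDatabase.dacpac" `
    /TargetConnectionString:"Server=tcp:sql-qa.database.windows.net,1433;Database=MyAppDb;User ID=deploy;Password=$env:SQL_DEPLOY_PASSWORD;"
```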

Is there a way to back up or export Azure IoT components?

I've got a large Azure IoT Suite implementation and I'd like to back it up or export the various jobs/components for safekeeping. We had a few instances where someone deleted something incorrectly and it took some time to recreate it.
Thanks!
Nick
Are you using azureiotsuite.com to deploy your Azure IoT suite?
Microsoft provides an alternative approach of deploying it locally, and you can also customize and extend the solution to meet your specific requirements.
See this repository for more details.
Use a DevOps approach, i.e. script all the components and maintain the scripts in GitHub or some other version control tool.
Unfortunately, support for reverse engineering existing components is very limited.
I think I finally figured this out. With the new Resource Manager, you can click on "Automation Script" and it will build out the ARM template that can be used to recreate the resources / settings as needed.
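The same thing can be scripted, which makes it easier to keep the templates in version control. A minimal sketch with the Az PowerShell module (the resource group name is made up):

```powershell
# Export an ARM template describing everything in the resource group -
# roughly what the portal's "Automation Script" blade produces.
Export-AzResourceGroup -ResourceGroupName "rg-iot-suite" -Path ".\iot-suite-template.json"
```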

Deploy the same Azure binaries to multiple subscriptions

We are trying to work out a good continuous deployment setup using TFS, Visual Studio and Azure. At our company, each developer has their own Azure subscription that we use for testing, as well as shared QA1/QA2/PROD subscriptions that we can deploy to. We have matching TFS XAML build definitions for each of these, running Powershell scripts with parameters and PublishSettings files.
This all gives us a set of .cspkg and .cscfg files, and in theory we can deploy the right cspkg with the correct cscfg to any Azure system.
The problem we are encountering now though is that we want to start using the Redis Cache service. Installing the NuGet package writes subscription-specific settings into the web.config, to point at the cache. This means that the cspkg is now compiled specifically for the Azure subscription.
We could use SlowCheetah to merge web.config files on build, but this means that we would have to compile the package for each build definition, and as the number of developers increases this is obviously going to become unsustainable.
I am looking for a way to keep our old generic packages and still use the Redis Cache. We can connect to the cache in code during app_start, but then we can't use it to store IIS session state. I understand that the Azure Load Balancer is meant to keep users on the same server, but I'm unsure how that will work as we swap servers in/out.
It feels like we are approaching the problem wrong and there should be a simple solution that we are overlooking.
We are using Azure Tools 2.6, Visual Studio 2013, TFS 2015r2.
I think there are always 3 ways of doing this.
The 1st is config during build, which means building one package per environment, as you described; this is not desirable in most scenarios.
The 2nd is config during deployment, which means you open the cspkg file, change the config, then put it back before uploading, without recompiling.
The 3rd is config after deployment: have a configuration management tool adjust the config file for you on the fly.
We use Octopus Deploy to achieve #2 above: our CI tool feeds Octopus the cspkg and cscfg, and Octopus handles the rest. I would definitely not go after #1, but #3 is a valid option too.
As of today we store all our connection settings in .cscfg files, although for security reasons we avoid storing any production connection strings in source control, only QA. We have CI for QA, but not production. This works well for us; we just maintain a different .cscfg for each environment (subscription).
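For illustration, a minimal sketch of deploying one package with a per-environment .cscfg, assuming the classic Azure Service Management PowerShell module that cloud services used (the service and file names are made up):

```powershell
# Same .cspkg for every environment; only the service configuration file differs.
New-AzureDeployment -ServiceName "myapp-qa" -Slot "Production" `
    -Package ".\MyApp.cspkg" -Configuration ".\ServiceConfiguration.QA.cscfg"
```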
However, in the near future I think we will move to Key Vault for this.
