How to Migrate Azure DataFactory(v2) Job to another DataFactory(2) - azure

How to move all existing job to another Azure Datafactory.
I am trying to move existing job from one data Factory to another but not able to find the solution any suggestions, please.

As far as I know there is no easy import/export facility.
I recommend connecting your Data Factories to source control (GIT). You can then copy and paste the JSON definitions between the two repo's using a text editor.
For propagating pipelines between environments, you can look in to the documentation for CI/CD in Azure Data Factory.

Related

What is the best way to automate creating SQL .bak or .bacpac backup files and saving them to an Azure cloud storage container?

Currently I am tasked with researching a solution to easily copying data from one environment to another (QA to DEV for example) as well as having the flexibility of going to different times to compare our data. It is an easy task to do locally with SSMS and I am looking for the best ways to do it using Azure and it's tools.
These are the options that I found so far:
Backup Service and Backup Vault (The MS solution that I am not asking for. They don't generate .bak files)
Azure Function to execute generate and transfer SQL (flexible but the code needs to be maintained + manage authentication)
Powershell process with Azure Automate (Flexible too but needs to be maintained)
Datafactory/SSIS (Still learning and researching)
Anyone got any tools/methods that are worth looking into before I dive deeper with a solution?
For Azure SQL database, SQL Data Sync is one of feature for the data sync between Azure SQL and SQL server(on-premise). Some limits are that Azure SQL database must be hub and each must have a primary key. That may not suit you.
Per my experience, Data Factory is the best one for you. You can copy the data between different environment, in Sink settings, we can using upsert(insert or update) operation to sync the data.
If you only want to schedule the backup automatically for the SQL, the third-part tool also could feed your request: SQL Backup and FTP.
Since you have searched a lot and found almost all the options in Azure, all the ways can achieve that. You need to know your real request, data sync or auto backup create the .bacpac file to storage. That's not a good question to help you find the best way. The way you like, the way is the best.
I went with writing an Azure Automate powershell script. including cmdlts like New-AzureRmSqlDatabaseExport and passing in the parameters was ticky but it finally did the job.

Azure data factory data flow task cannot take on prem as source

Hey,
I am having trouble in creating a Data Flows task which uses an on-prem source. Is this not possible in the Preview version?
I have created a self hosted IR to connect ADF to my laptop, and that is what I want to use. In the pic below I am trying to create a dataset off self hosted IR. It works great in Copy task but for Data Flows it is greyed out.
For this question, I asked Azure support for help and they replied me with the answer:
Answer:
This means on-premise SQL server is not supported as dataset in data flow in current stage.
Screen shot:
Update:
Data flow now only support Azure IR so it doesn’t support on-premise dataset.
Refer to Integration runtime types.
Hope this helps.
If your goal is to use visual data transformations in ADF using Mapping Data Flows with on-prem data, then build a pipeline with a Copy Activity first. Use the Self-Hosted Integration Runtime with the Copy Activity to stage your data in Blob Store. Then add a subsequent Execute Data Flow activity to transform that data.
I made video on how to do this:
https://www.youtube.com/watch?v=IN-4v0e7UIs
Reproduce your issue on my side ,however nothing about this feature is claimed on the official document. As you can see everywhere about data flow cliams that:
You could submit any voice here:
Also found a feedback for data flow in ADF for your reference.If you need push the progress of it,you could vote up it. Also,i would suggest you referring to the comments in the link:
For access to the 80+ ADF connectors, use Copy Activity to stage data
for transformation.
Data Flows will access data in your lake (Blob, ADB, ADW, ADLS) for
transformation.
Think of Copy Activity as your data heavy-lifting activity and Data
Flow as your data transformation engine.

How to import your example files into my Azure Data Factory

I believe many of you have ADF experiences, and may be have seen Mark Kromer's example of azure data flows (https://github.com/kromerm/adfdataflowdocs)
I am a beginner using ADF and azure Data Flows especially. I am very curious of these examples, and I really want to import all your examples files (json) to my newly created data factory. It must be an easier way than creating all activities, connections, datasets and others manually. The templates are of course good but I want to test out your example code in my azure portal and in my data factory.
And just a second question: I am a SSIS man used control flows with master packages executing other packages. When I am now building a data warehouse in ADF with many dimension tables and fact tables, is it best practice to have separate data flows or should I build general data flows that either have many parallel upserts to different dimensions? I think I need some guiding here
Thank you
Regards Geir
I've been publishing those example data flows as ADF pipeline templates.
In your browser, go to ADF and then click Factory Resources > Pipeline from Template. You should see a Data Flow category. Many of the examples are in there.
In ADF, you can also have master / child packages. You'll use Execute Pipeline in ADF instead of execute package.
ADF supports re-use and generalization of data flow patterns, so that you could process multiple dimension tables in a single data flow. You'll have to weigh the value of doing that vs. supporting it long-term as well as the danger of creating a very complex data flow.

Copy files from on-prem to azure

I'm new to Azure eco system. I'm doing some research on copying data from on-prem to azure. I found following options:
AzCopy
Azure Data Factory (Copy Data Tool)
Data Management Gateway
Ours is a Microsoft shop; so, I'm looking for tools that gel with MS platform. Also, down the line, we want to automate the entire thing as much as we can. So, I think, Azure Storage Explorer is out of the question. Is there a preference among the above 3. Or, are there any better tools?
I think you are mixing stuff, Copy Data Tool is just an Azure Data Factory Wizard to make some sample data moving between resources. Azure Data Factory uses the data management gateway to get on premises resources such as files and databases.
What you want to do can be made with Azure Data Factory. I recommend using version 2 (even in its preview version) because its Authoring is easier to understand if you are new to the tool. You can graphically configure linked services, datasets and pipelines from there.
I hope this helped, if you need further help just ask away!
If you're already familiar with SSIS, there's also the option to use SSIS in ADF that enables on-prem data access via VNet.

Azure Data Sync - Copy Each SQL Row to Blob

I'm trying to understand the best way to migrate a large set of data - ~ 6M text rows from (an Azure Hosted) SQL Server to Blob storage.
For the most part, these records are archived records, and are rarely accessed - blob storage made sense as a place to hold these.
I have had a look at Azure Data Factory and it seems to be the right option, but I am unsure of it fulfilling requirements.
Simply the scenario is, for each row in the table, I want to create a blob, with the contents of 1 column from this row.
I see the tutorial (i.e. https://learn.microsoft.com/en-us/azure/data-factory/data-factory-copy-activity-tutorial-using-azure-portal) is good at explaining migration of bulk-to-bulk data pipeline, but I would like to migrate from a bulk-to-many dataset.
Hope that makes sense and someone can help?
As of now, Azure Data Factory does not have anything built in like a For Each loop in SSIS. You could use a custom .net activity to do this but it would require a lot of custom code.
I would ask, if you were transferring this to another database, would you create 6 million tables all with the same structure? What is to be gained by having the separate items?
Another alternative might be converting it to JSON which would be easy using Data Factory. Here is an example I did recently moving data into DocumentDB.
Copy From OnPrem SQL server to DocumentDB using custom activity in ADF Pipeline
SSIS 2016 with the Azure Feature Pack, giving Azure Tasks such as Azure Blob Upload Task and Azure Blob Destination. You might be better off using this, maybe an OLEDB command or the For Each loop with an Azure Blob destination could be another option.
Good luck!
Azure has a ForEach activity which can be place after LookUp or Metadata to get the each row from SQL to blob
ForEach

Resources