Is it possible to store variables in Azure Data Factory pipelines?

In my Azure Data Factory pipeline, I want to use a variable, which gets updated on each run and which is also read on each run. At the moment, I am using a Database to achieve that. But it would be much simpler if Azure Data Factory provided a way of storing variables. So, my question is, is there any such facility in Azure Data Factory?

As @Joel Cochran says, ADF doesn't support persisting a variable across pipeline runs. You need to write the value to external storage, e.g. a database or Azure Storage, and then use a Lookup activity to read it back from the blob file or database on the next run. :)
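Persisting the value outside the pipeline can be as small as a single JSON blob. Here is a minimal sketch (Python, azure-storage-blob) assuming a hypothetical container pipeline-state and blob counter.json; a Lookup activity pointed at that blob would read the value back on the next run.

```python
# Minimal sketch: keep one pipeline "variable" in a small JSON blob.
# Container and blob names below are assumptions, not fixed ADF conventions.
import json
from azure.storage.blob import BlobClient

conn_str = "<storage-connection-string>"  # assumption: connection-string auth
blob = BlobClient.from_connection_string(
    conn_str, container_name="pipeline-state", blob_name="counter.json"
)

# Read the current value (fall back to a default on the very first run).
try:
    state = json.loads(blob.download_blob().readall())
except Exception:
    state = {"run_counter": 0}

# Update it and persist it for the next run; ADF's Lookup activity can read this file.
state["run_counter"] += 1
blob.upload_blob(json.dumps(state), overwrite=True)
```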

Related

How to remove special characters from XML stored in ADLS using Azure data factory or any other option?

I have a scenario where I need to remove some characters from XML tags in a file stored in ADLS. I am looking for an option within ADF. Can someone help me with the approach I should follow?
This is not possible with ADF alone; you could write a piece of code to do it in Azure Functions. Azure Data Factory does data movement and data transformation only, and editing the contents of XML tags does not fall under that.
You can use the Azure Function activity in a Data Factory pipeline to run Azure Functions. To invoke an Azure Function, you must first set up a linked service connection and an activity that specifies the Azure Function you want to run.
The Microsoft documentation on the Azure Function activity in ADF covers this in more depth.
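If you go the Azure Functions route, the core logic such a function could run when invoked by the Azure Function activity might look roughly like the sketch below (Python, azure-storage-file-datalake). The account URL, filesystem, path and the character-stripping rule are all assumptions you would adapt.

```python
# Rough sketch of cleaning an XML file in ADLS Gen2; names and the regex rule
# are assumptions, not part of the original question.
import re
from azure.storage.filedatalake import DataLakeServiceClient

def clean_xml_file(account_url: str, credential, file_system: str, path: str) -> None:
    """Download an XML file from ADLS Gen2, strip unwanted characters, write it back."""
    service = DataLakeServiceClient(account_url=account_url, credential=credential)
    file_client = service.get_file_system_client(file_system).get_file_client(path)

    xml_text = file_client.download_file().readall().decode("utf-8")

    # Example rule: drop characters outside printable ASCII plus whitespace.
    cleaned = re.sub(r"[^\x20-\x7E\r\n\t]", "", xml_text)

    file_client.upload_data(cleaned.encode("utf-8"), overwrite=True)
```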

Execute SQL script stored in Azure Blob container via Azure Data Factory

I have a SQL script stored in an Azure Blob container as a ".sql" file. I want to execute/invoke this code using Azure Data Factory. Please note that the script already contains the SQL query I wish to execute; I simply intend to point to it and invoke it using ADF. How can we achieve this?
Data Factory focuses on data movement and transformation, not on executing scripts directly; it can't do this for now. You need to handle it at the code level and call that function from ADF.
Like you said, you will have to write a function for this and execute the function using ADF.
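A minimal sketch of such a function (for example hosted as an Azure Function and called from ADF via the Azure Function activity) is below. The connection strings, container and blob names are assumptions, and scripts containing GO batch separators would need to be split first.

```python
# Sketch: download a .sql file from Blob storage and run it against Azure SQL.
import pyodbc
from azure.storage.blob import BlobClient

def run_sql_from_blob(storage_conn_str: str, sql_conn_str: str,
                      container: str = "scripts", blob_name: str = "query.sql") -> None:
    """Fetch the script from Blob storage and execute it as a single batch."""
    blob = BlobClient.from_connection_string(
        storage_conn_str, container_name=container, blob_name=blob_name
    )
    sql_text = blob.download_blob().readall().decode("utf-8")

    # Note: a script with GO separators must be split into batches before executing.
    conn = pyodbc.connect(sql_conn_str)
    cursor = conn.cursor()
    cursor.execute(sql_text)
    conn.commit()
    conn.close()
```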

Is there any Azure service that can simulate the concept of a global Azure 'variable' to hold a single value?

I am looking for an Azure service that can store a single value which I can then fetch from any other Azure service. It is storage, but extremely lightweight storage: it should let me define a variable for a given subscription whose value can be updated from any other Azure service. Azure Data Factory recently introduced global parameters at the data factory level; even that could serve the purpose to a limited extent if it were mutable, but it is a parameter, not a variable, so its value can't be updated. Even a solution that works only within Data Factory would be fine. One could always store such a value in SQL or a blob, but that sounds like overkill. Having a global Azure variable is a genuine requirement, so I'm wondering if there is anything like that.
Please consider Azure Key Vault. You can define a secret there to hold this value. However, I'm not sure what integration with other Azure services you need.
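A minimal sketch of treating a Key Vault secret as a mutable "global variable" follows (Python, azure-keyvault-secrets); the vault URL and secret name are assumptions. Any service with access to the vault, including ADF via a Web activity, can read the same value.

```python
# Sketch: one Key Vault secret acts as the globally readable/writable value.
from azure.identity import DefaultAzureCredential
from azure.keyvault.secrets import SecretClient

client = SecretClient(
    vault_url="https://<your-vault>.vault.azure.net",
    credential=DefaultAzureCredential(),
)

# Read the current value (any service with vault access can do the same).
current = client.get_secret("global-counter").value

# Update it; Key Vault keeps the old value as a previous secret version.
client.set_secret("global-counter", str(int(current) + 1))
```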
You have several options:
Cosmos DB Table API
Redis
Table Storage (see the sketch below)
Ref: https://learn.microsoft.com/en-us/azure/architecture/guide/technology-choices/data-store-overview#keyvalue-stores
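For the Table Storage option, for instance, a single entity can act as the "variable". A minimal sketch with azure-data-tables; the connection string, table name and keys are assumptions.

```python
# Sketch: one Table Storage entity holds the global value.
from azure.data.tables import TableServiceClient

service = TableServiceClient.from_connection_string("<storage-connection-string>")
table = service.create_table_if_not_exists("globalvars")

# Upsert the value; PartitionKey + RowKey identify the single "variable" row.
table.upsert_entity({"PartitionKey": "vars", "RowKey": "run-counter", "Value": 42})

# Read it back from anywhere that has the connection string.
entity = table.get_entity(partition_key="vars", row_key="run-counter")
print(entity["Value"])
```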

Create a generic data factory with multiple linked services

Use Case: To create a generic data factory that can read flat files from different Azure Blob containers into Azure SQL. I have created a data pipeline which uses stored procedures to populate the Azure SQL tables.
Issue: The trouble I have is that I want to execute this data factory from my code, change the database and blob container on the fly, and run the same data factory with these new parameters. The table names will remain the same on the Azure SQL side, and the file name will also remain the same in blob storage. What changes is the container, or the folder name inside the container, which will be known beforehand.
Please help me out, or point me in the right direction as to how this can be achieved, if it can be achieved at all.
You would need to use parameterized datasets and linked services. Define parameters on your data factory pipeline (the values you want to pass from your code, e.g. the container name or folder name, the connection string for Azure SQL, and the connection string for blob storage). Once these are defined, pass the values downstream all the way to the linked service,
i.e. something like this
Pipeline Parameters > Dataset Parameters > Linked Service Parameters
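From the code side, the parameterized pipeline can then be triggered with azure-mgmt-datafactory. In the sketch below the resource group, factory and pipeline names and the parameter names are assumptions; they must match whatever you defined in ADF.

```python
# Sketch: trigger the parameterized pipeline from code, passing the
# container/folder and connection strings as pipeline parameters.
from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient

adf = DataFactoryManagementClient(DefaultAzureCredential(), "<subscription-id>")

run = adf.pipelines.create_run(
    resource_group_name="my-rg",
    factory_name="my-adf",
    pipeline_name="GenericCopyPipeline",
    parameters={
        "containerName": "customer-a",
        "folderName": "2023/10",
        "sqlConnectionString": "<azure-sql-connection-string>",
    },
)
print(run.run_id)
```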

Generating and storing JSON files from the run-time parameters passed to Azure Data Factory v2 pipeline?

Can we create a file (preferably JSON) and store it in one of ADF's supported storage sinks (like Blob, Azure Data Lake Store, etc.) using the parameters that are passed to an Azure Data Factory v2 pipeline at run time? I suppose it can be done via Azure Batch, but that seems like overkill for such a trivial task. Is there a better way to do it?
Looking at the transformation activities ADFv2 currently ships with, I'm afraid there isn't a direct way to create a file in ADFv2. You could leverage the Custom activity to achieve this by running your own code logic on an Azure Batch pool of virtual machines. Hope it helps a little.
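A rough sketch of what the Custom activity code could do is below. It assumes the pipeline values are forwarded to the activity via extendedProperties, which ADF writes into an activity.json file in the task's working directory; the container and blob names are also assumptions.

```python
# Sketch: read pipeline values forwarded as extendedProperties and emit them
# as a JSON file in Blob storage. Names below are illustrative assumptions.
import json
from azure.storage.blob import BlobClient

# Pipeline parameters forwarded to the Custom activity as extendedProperties.
with open("activity.json") as f:
    props = json.load(f)["typeProperties"]["extendedProperties"]

# Write them back out as a JSON file in Blob storage.
blob = BlobClient.from_connection_string(
    "<storage-connection-string>",
    container_name="output",
    blob_name="run-parameters.json",
)
blob.upload_blob(json.dumps(props, indent=2), overwrite=True)
```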