Parameterize Linked Services in Azure Data Factory

I have a database username, server name, host, and other connection details stored in a table. I want to create a Linked Service that can take its connection details from this table and store them in parameters.
As of now I am hardcoding these details in parameters created in the linked service, but I want a generic linked service that can take the details from a table or from a pipeline parameter.

AFAIK, there is no feature in Azure Data Factory that allows you to parameterize a linked service or a pipeline with values stored in an outside table or file. The values need to be defined within ADF itself.
The standard (and only) approach is to parameterize a linked service and pass dynamic values at run time, with the values defined in ADF. For example, if you want to connect to different databases on the same logical SQL server, you can parameterize the database name in the linked service definition. This saves you from having to create a separate linked service for each database on the logical SQL server.
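As an illustration, a parameterized Azure SQL Database linked service of this kind could look roughly like the sketch below. This is a minimal sketch, not the asker's actual configuration: the server name myserver, the parameter name dbName and the credential placeholders are all assumptions.

{
    "name": "AzureSqlDatabaseGeneric",
    "properties": {
        "type": "AzureSqlDatabase",
        "parameters": {
            "dbName": {
                "type": "String"
            }
        },
        "typeProperties": {
            "connectionString": "Server=tcp:myserver.database.windows.net,1433;Database=@{linkedService().dbName};User ID=myadmin;Password=<store securely, e.g. in Key Vault>;Encrypt=True;Connection Timeout=30"
        }
    }
}

A dataset referencing this linked service would then supply dbName (for example "dbName": "@dataset().dbName" under linkedServiceName.parameters), and the pipeline would in turn pass its own parameter value into the dataset.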
You can use parameters to pass external values into pipelines, datasets, linked services, and data flows. Once the parameter has been passed into the resource, it cannot be changed. By parameterizing resources, you can reuse them with different values each time. Parameters can be used individually or as a part of expressions. JSON values in the definition can be literals or expressions that are evaluated at runtime.
The official document Parameterize linked services in Azure Data Factory will help you understand the fundamentals in full.

Related

Automatically adding data to Cosmos DB through an ARM template

I made an ARM template which runs through an Azure DevOps pipeline to create a new Cosmos DB instance and put two collections inside it. I'd like to put some data inside the collections (fixed values, the same every time). Everything is created in the standard way, e.g. the collections are using
"type": "Microsoft.DocumentDb/databaseAccounts/apis/databases/containers"
I think these are the relevant docs.
I haven't found much mention of automatically adding data, but it's such an obviously useful thing that I'm sure it will have been added somewhere. If I need to add another step to my pipeline to add the data, that's an option too.
ARM templates are not able to insert data into Cosmos DB or any service with a data plane for many of the reasons listed in the comments and more.
If you need to both provision a Cosmos DB resource and then insert data into it, you may want to consider creating another ARM template to deploy an Azure Data Factory resource and then invoking the pipeline using PowerShell to copy the data from Blob Storage into the Cosmos DB collection. Based upon the ARM doc you referenced above, it sounds as though you are creating a MongoDB collection resource. ADF supports MongoDB, so this should work very well.
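A rough sketch of the copy activity such a pipeline might contain is shown below. The dataset names and sink settings are assumptions (a delimited-text source file and the MongoDB API collection sink), not a definitive implementation:

{
    "name": "CopySeedDataToCosmos",
    "type": "Copy",
    "inputs": [
        { "referenceName": "SeedDataBlobDataset", "type": "DatasetReference" }
    ],
    "outputs": [
        { "referenceName": "CosmosMongoCollectionDataset", "type": "DatasetReference" }
    ],
    "typeProperties": {
        "source": { "type": "DelimitedTextSource" },
        "sink": { "type": "CosmosDbMongoDbApiSink", "writeBehavior": "insert" }
    }
}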
You can find the ADF ARM template docs here and the ADF PowerShell docs can be found here. If you're new to using ARM to create ADF resources, I recommend first creating it using the Azure Portal, then export it and examine the properties you will need to drive with parameters or variables during deployment.
PS: I'm not sure why, but the container resource path you pointed to in your question (below) should not be used, as it breaks a few things in ARM; namely, you cannot put a resource lock on it or use Azure Policy. Please use the latest api-version, which as of this writing is 2021-04-15.
"type": "Microsoft.DocumentDb/databaseAccounts/apis/databases/containers"

Is there any Azure service that can simulate the concept of a global Azure 'variable' to hold a single value?

I am looking for some Azure service that can store a value which I can then fetch from any other Azure service. It's storage, basically, but extremely lightweight storage: it should allow one to define a variable for a given subscription whose value can then be updated from any other Azure service. Azure Data Factory recently introduced global parameters at the data factory level; even this could serve the purpose to some limited extent if it were mutable, but it's a parameter, not a variable, so its value can't be updated. Even a solution that works only within Data Factory would be fine. One could always store such a value in SQL or blob storage, but that sounds like overkill. Having a global Azure variable is a genuine requirement, so I'm wondering if there is anything like that.
Please consider Azure Key Vault. You can define a secret there to hold this value. However, I'm not sure what integration with other Azure services you need.
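If you go the Key Vault route, the secret itself can be provisioned with a small ARM resource along these lines. This is only a sketch under assumed names (vault name, secret name global-flag, initial value); updating the value later would be done through the Key Vault API or one of its SDKs:

{
    "type": "Microsoft.KeyVault/vaults/secrets",
    "apiVersion": "2019-09-01",
    "name": "[concat(parameters('vaultName'), '/global-flag')]",
    "properties": {
        "value": "[parameters('initialValue')]"
    }
}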
You have several options:
Cosmos DB Table API
Redis
Table Storage
ref: https://learn.microsoft.com/en-us/azure/architecture/guide/technology-choices/data-store-overview#keyvalue-stores

Create a generic data factory with multiple linked services

Use Case: To create a generic data factory which can read data from different Azure blob containers containing flat files into Azure SQL. I have created a data pipeline which uses stored procedures to populate the Azure SQL tables.
Issue: The trouble that I have is that I want to execute this data factory from my code, change the database and blob container on the fly, and execute the same data factory with these new parameters. The table names will remain the same on the Azure SQL side, and the file name will also remain the same in blob storage. The change will be to the container, or the folder name inside the container, which will be known beforehand.
Please help me out or point me in the right direction as to what could help me achieve this, and whether it can be achieved at all.
You would need to use parameterized datasets and linked services. Define parameters on your data factory pipeline (the values you want to pass from your code, e.g. the container name or the folder name, the connection string for Azure SQL, and the connection string for Blob Storage). Once these are defined, you would need to pass the values downstream all the way to the linked service,
i.e. something like this
Pipeline Parameters > Dataset Parameters > Linked Service Parameters
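A hedged sketch of the middle link in that chain is below, assuming a delimited-text blob dataset. The names GenericBlobLS, containerName, storageConnectionString and input.csv are illustrative only, and the referenced linked service is assumed to declare its own connectionString parameter. The copy activity in the pipeline would set the dataset parameters from @pipeline().parameters.* in the same way.

{
    "name": "GenericBlobDataset",
    "properties": {
        "type": "DelimitedText",
        "parameters": {
            "containerName": { "type": "String" },
            "storageConnectionString": { "type": "String" }
        },
        "linkedServiceName": {
            "referenceName": "GenericBlobLS",
            "type": "LinkedServiceReference",
            "parameters": {
                "connectionString": "@dataset().storageConnectionString"
            }
        },
        "typeProperties": {
            "location": {
                "type": "AzureBlobStorageLocation",
                "container": {
                    "value": "@dataset().containerName",
                    "type": "Expression"
                },
                "fileName": "input.csv"
            }
        }
    }
}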

Dynamically generate Extract scripts from metadata in USQL

I have a requirement to read metadata information that comes in JSON format and dynamically generate EXTRACT statements to further transform data for that table.
I have currently loaded the metadata information into an Azure SQL DB. So, I would need to read this data, create EXTRACT statements on the fly, and pass them to the U-SQL as a parameter.
Need some help in how to proceed with this and also whether this is the correct approach that I am following.
Thanks in advance.
Don't equate executing U-SQL to something like Stored Procedures in SQL Server: the two are quite different under the covers. For instance, passing parameters is kinda supported, but not like you may think, and [to the best of my knowledge] dynamic script elements aren't supported.
I do, however, think you could accomplish this with Azure Data Factory (ADF) and some custom code.
ADF executes U-SQL scripts by referencing a blob in Blob Storage, so you could have an ADF custom activity (Azure Batch) that reads your metadata and dynamically generates the U-SQL script to an Azure Blob.
Once available, the Data Factory can execute the generated script based on a pipeline parameter that holds the script name.
Doing this in ADF allows you to perform this complex operation dynamically. If you go this route, be sure to use ADF V2.
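For instance, the U-SQL activity in the ADF V2 pipeline could read the generated script's path from a pipeline parameter, roughly like the sketch below; the linked service names and the scriptPath parameter are assumptions, not the asker's setup:

{
    "name": "RunGeneratedUSqlScript",
    "type": "DataLakeAnalyticsU-SQL",
    "linkedServiceName": {
        "referenceName": "AzureDataLakeAnalyticsLS",
        "type": "LinkedServiceReference"
    },
    "typeProperties": {
        "scriptPath": {
            "value": "@pipeline().parameters.scriptPath",
            "type": "Expression"
        },
        "scriptLinkedService": {
            "referenceName": "ScriptBlobStorageLS",
            "type": "LinkedServiceReference"
        }
    }
}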

Azure Search default database type

I am new to Azure Search and I have just seen this tutorial https://azure.microsoft.com/en-us/documentation/articles/search-howto-dotnet-sdk/ on how to create/delete an index, upload and search for documents. However, I am wondering what type of database is behind the Azure Search functionality. In the given example I couldn't see it specified. Am I right if I assume it is implicitly DocumentDb?
At the same time, how could I specify the type of another database inside the code? How could I possibly use a SQL Server database? Thank you!
However, I am wondering what type of database is behind the Azure Search functionality.
Azure Search is offered to you as a service. The team hasn't made the underlying storage mechanism public, so it's not possible to know what kind of database they are using to store the data. However, you interact with the service in the form of JSON records. Each document in your index is sent/retrieved (and possibly saved) in the form of JSON.
At the same time, how could I specify the type of another database inside the code? How could I possibly use a SQL Server database?
Short answer: you can't. Because it is a service, you can't specify which database the service uses underneath. However, what you can do is ask the search service to populate its database (read: index) from multiple sources - SQL databases, DocumentDB collections and blob containers (currently in preview). This is achieved through something called Data Sources and Indexers. Once configured properly, the Azure Search service will constantly update the index data with the latest data in the specified data source.
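As a rough illustration of how a data source and indexer pair is set up against the REST API (the names, connection string and api-version below are assumptions, and each payload is POSTed to the service's /datasources and /indexers endpoint respectively):

Data source payload (POST .../datasources?api-version=2020-06-30):
{
    "name": "sql-datasource",
    "type": "azuresql",
    "credentials": { "connectionString": "Server=tcp:myserver.database.windows.net;Database=mydb;User ID=myuser;Password=<placeholder>" },
    "container": { "name": "MyTable" }
}

Indexer payload (POST .../indexers?api-version=2020-06-30):
{
    "name": "sql-indexer",
    "dataSourceName": "sql-datasource",
    "targetIndexName": "my-index",
    "schedule": { "interval": "PT1H" }
}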
