I have multiple IRs (one per client), and I want to use a single pipeline to copy data from each client's SAP system to Blob storage.
I have created all the configuration parameters as in the code below, but I am not able to create a parameter for the Integration Runtime.
Is there any JSON syntax that would make it dynamic?
{
    "name": "LS_SAP_TBL",
    "type": "Microsoft.DataFactory/factories/linkedservices",
    "properties": {
        "type": "SapTable",
        "annotations": [],
        "typeProperties": {
            "clientId": "#{linkedService().ClientId}",
            "language": "",
            "sncMode": false,
            "userName": "#{linkedService().userName}",
            "password": {
                "type": "AzureKeyVaultSecret",
                "store": {
                    "referenceName": "KAJDSKJDHLSJDFHALKFD",
                    "type": "LinkedServiceReference"
                },
                "secretName": "#{linkedService().SecretName}"
            },
            "server": "#{linkedService().server}",
            "systemNumber": "#{linkedService().systemNumber}"
        },
        "connectVia": {
            "referenceName": "#IntegrationRuntime_Param - Need to pass this dynamically",
            "type": "IntegrationRuntimeReference"
        },
        "parameters": {
            "SecretName": {
                "type": "String"
            },
            "ClientId": {
                "type": "String"
            },
            "userName": {
                "type": "String"
            },
            "server": {
                "type": "String"
            },
            "systemNumber": {
                "type": "String"
            },
            "IntegrationRuntime_Param": {
                "type": "String"
            }
        }
    }
}
Unfortunately, as of now there is no option to parameterize the Integration Runtime.
You can parameterize the different properties of a single IR so it can be reused for multiple resources, but the IR reference itself can't be parameterized.
Please refer to this similar request on Microsoft Q&A for more detail:
Need help on parameterization of Integration Runtime and Azure Key vault Secrets on link service in Azure Data Factory pipeline
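What is supported today, as a minimal sketch (the IR name and the trimmed-down property list below are placeholders): keep connectVia pointing at one fixed, shared IR and parameterize only the connection properties with ADF's @{linkedService().…} expressions.

{
    "name": "LS_SAP_TBL_Shared",
    "properties": {
        "type": "SapTable",
        "typeProperties": {
            "server": "@{linkedService().server}",
            "systemNumber": "@{linkedService().systemNumber}",
            "clientId": "@{linkedService().ClientId}"
        },
        "connectVia": {
            "referenceName": "SharedSelfHostedIR",
            "type": "IntegrationRuntimeReference"
        },
        "parameters": {
            "server": { "type": "String" },
            "systemNumber": { "type": "String" },
            "ClientId": { "type": "String" }
        }
    }
}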
Yes, I second UtkarshPal's response. I hope we get this option ASAP.
In our case, we have multiple ADFs (one per project), and each ADF has its own SHIR.
The only way to reuse linked services with a self-hosted IR is to give all SHIRs the same name.
P.S. If you have multiple SHIRs within a single ADF, it's not possible to give them the same name.
Related
There is an option to create a Managed Identity from Terraform for a Stream Analytics job (azurerm_stream_analytics_job, using the identity block).
And it is possible to use a Managed Identity to connect to databases (as explained here).
But I could not find how to use a managed identity when creating an input with azurerm_stream_analytics_reference_input_mssql.
UPDATE:
To be clear, what I am after is a reference input that authenticates with the job's managed identity instead of a SQL username and password.
As of July 2022:
It does not look like Terraform supports it (see the documentation).
With this ARM template, I was able to deploy it ("authenticationMode": "Msi"):
{
    "$schema": "https://schema.management.azure.com/schemas/2019-04-01/deploymentTemplate.json#",
    "contentVersion": "1.0.0.0",
    "parameters": {
        "streamAnalyticsJobName": {
            "type": "string"
        },
        "streamAnalyticsJobNameInputName": {
            "type": "string"
        },
        "sqlServerName": {
            "type": "string"
        },
        "databaseName": {
            "type": "string"
        }
    },
    "resources": [
        {
            "type": "Microsoft.StreamAnalytics/streamingjobs/inputs",
            "apiVersion": "2017-04-01-preview",
            "name": "[format('{0}/{1}', parameters('streamAnalyticsJobName'), parameters('streamAnalyticsJobNameInputName'))]",
            "properties": {
                "type": "Reference",
                "datasource": {
                    "type": "Microsoft.Sql/Server/Database",
                    "properties": {
                        "authenticationMode": "Msi",
                        "server": "[parameters('sqlServerName')]",
                        "database": "[parameters('databaseName')]",
                        "refreshType": "Static",
                        "fullSnapshotQuery": "SELECT Id, Name, FullName\nFrom dbo.Device\nFOR SYSTEM_TIME AS OF #snapshotTime --Optional, available if table Device is temporal"
                    }
                }
            }
        }
    ]
}
So you can always use the azurerm_template_deployment resource to deploy it from Terraform, as sketched below.
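For illustration, a minimal sketch of that wiring, written in Terraform's JSON configuration syntax to stay consistent with the rest of this thread; the resource group and job references, the deployment name, and the template file name are all placeholders:

{
    "resource": {
        "azurerm_template_deployment": {
            "asa_msi_reference_input": {
                "name": "asa-msi-reference-input",
                "resource_group_name": "${azurerm_resource_group.example.name}",
                "deployment_mode": "Incremental",
                "template_body": "${file(\"${path.module}/reference_input.json\")}",
                "parameters": {
                    "streamAnalyticsJobName": "${azurerm_stream_analytics_job.example.name}",
                    "streamAnalyticsJobNameInputName": "device-reference-input",
                    "sqlServerName": "my-sql-server",
                    "databaseName": "my-database"
                }
            }
        }
    }
}

Note that newer versions of the azurerm provider deprecate azurerm_template_deployment in favour of azurerm_resource_group_template_deployment, which achieves the same thing with slightly different arguments.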
I was in the process of configuring DevOps to deploy my Dev ADF to the UAT ADF instance.
I had come across the standard issue of the deployment not deleting outdated pipelines, and attempted to use the "Complete" deployment mode to resolve that.
Whereupon DevOps entirely deleted the UAT ADF instance!
Looking further at the docs, it appears that this is the expected behaviour if the factories are not in the ARM Templates.
And looking at my ARM Template (generated entirely by ADF, and with [AFAIK] entirely standard settings), it confirms that the factory itself is NOT amongst the documented resources to be created.
This seems ... odd.
Am I missing something?
How do I get the factory to be included in the ARM Template?
Or alternatively, how can I use the "Complete" deployment mode without it deleting the target ADF instance?
Note that the reason I don't want to use the "define a separate script to solve this" approach is that it seems excessively complex when the "Complete" mode sounds like it should do exactly what I want :) (if it weren't for this one oddity about deleting the factory).
You are correct. I've run into this issue before. To work around it, I recommend creating a core ARM template that contains the Data Factory and any linked services used solely by Data Factory. This ensures the "infrastructure/connections" are deployed when creating a new instance.
If you are following Azure Data Factory CI/CD, this would be an additional Azure Resource Group Deployment task that runs before the pipelines are deployed and references the ARM template, which should live in a separate repository (a minimal parameters file for that task is sketched after the template).
Here's a template for Data Factory with Log Analytics to get you started. I included Log Analytics because most people don't think about log retention until they need it, and it's a best practice. Just update the system name; the template creates a naming standard of adf-systemName-environment-regionAbbreviation, with the region abbreviation looked up dynamically from the deployment location.
{
    "$schema": "http://schema.management.azure.com/schemas/2015-01-01/deploymentTemplate.json#",
    "contentVersion": "1.0.0.0",
    "parameters": {
        "environment": {
            "type": "string",
            "metadata": {
                "description": "Name of the environment being deployed to"
            }
        },
        "location": {
            "type": "string",
            "defaultValue": "[resourceGroup().location]",
            "metadata": {
                "description": "Location for all resources."
            }
        }
    },
    "variables": {
        "systemName": "DataFactoryBaseName",
        "regionReference": {
            "centralus": "cus",
            "eastus": "eus",
            "westus": "wus"
        },
        "dataFactoryName": "[toLower(concat('adf-', variables('systemName'),'-', parameters('environment'),'-',variables('regionDeployment')))]",
        "logAnalyticsName": "[toLower(concat('law-', variables('systemName'),'-', parameters('environment'),'-',variables('regionDeployment')))]",
        "regionDeployment": "[toLower(variables('regionReference')[parameters('location')])]"
    },
    "resources": [
        {
            "name": "[variables('dataFactoryName')]",
            "type": "Microsoft.DataFactory/factories",
            "apiVersion": "2018-06-01",
            "location": "[parameters('location')]",
            "tags": {
                "displayName": "Data Factory",
                "ProjectName": "[variables('systemName')]",
                "Environment": "[parameters('environment')]"
            },
            "identity": {
                "type": "SystemAssigned"
            }
        },
        {
            "type": "Microsoft.OperationalInsights/workspaces",
            "name": "[variables('logAnalyticsName')]",
            "tags": {
                "displayName": "Log Analytics",
                "ProjectName": "[variables('systemName')]",
                "Environment": "[parameters('environment')]"
            },
            "apiVersion": "2020-03-01-preview",
            "location": "[parameters('location')]"
        },
        {
            "type": "microsoft.datafactory/factories/providers/diagnosticsettings",
            "name": "[concat(variables('dataFactoryName'),'/Microsoft.Insights/diagnostics')]",
            "location": "[parameters('location')]",
            "apiVersion": "2017-05-01-preview",
            "dependsOn": [
                "[resourceId('Microsoft.OperationalInsights/workspaces',variables('logAnalyticsName'))]",
                "[resourceId('Microsoft.DataFactory/factories',variables('dataFactoryName'))]"
            ],
            "properties": {
                "name": "diagnostics",
                "workspaceId": "[resourceId('Microsoft.OperationalInsights/workspaces',variables('logAnalyticsName'))]",
                "logAnalyticsDestinationType": "Dedicated",
                "logs": [
                    {
                        "category": "PipelineRuns",
                        "enabled": true,
                        "retentionPolicy": {
                            "enabled": false,
                            "days": 0
                        }
                    },
                    {
                        "category": "TriggerRuns",
                        "enabled": true,
                        "retentionPolicy": {
                            "enabled": false,
                            "days": 0
                        }
                    },
                    {
                        "category": "ActivityRuns",
                        "enabled": true,
                        "retentionPolicy": {
                            "enabled": false,
                            "days": 0
                        }
                    }
                ],
                "metrics": [
                    {
                        "category": "AllMetrics",
                        "timeGrain": "PT1M",
                        "enabled": true,
                        "retentionPolicy": {
                            "enabled": false,
                            "days": 0
                        }
                    }
                ]
            }
        }
    ]
}
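If it helps, here is a minimal parameters file for the template above that the Resource Group Deployment task can reference (the environment value is just an example):

{
    "$schema": "https://schema.management.azure.com/schemas/2019-04-01/deploymentParameters.json#",
    "contentVersion": "1.0.0.0",
    "parameters": {
        "environment": {
            "value": "uat"
        }
    }
}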
I'm finding absolutely zero documentation about how to configure Azure Synapse Analytics as an output from Stream Analytics... I've got it configured in the portal, but when I export the template, all of the details about this output are lost besides the name.
I tried building it from scratch and there's no documentation at all. How do I write this into an ARM template? I have the following...
"outputs": [
{
"name": "synapse-output",
"properties": {
"datasource": {
"type": "",
"properties": {
}
}
}
}
]
And there are no details about how to fill it in... What is the datasource type for this kind of output, and how do I fill in its properties with zero documentation?
There is a reference for all template resources in the Microsoft documentation; see the Microsoft.StreamAnalytics/streamingjobs/outputs template reference.
It shows the code to create Stream Analytics outputs, but there is no reference for an Azure Synapse Analytics output (a hedged guess at its shape follows at the end of this answer):
{
    "name": "string",
    "type": "Microsoft.StreamAnalytics/streamingjobs/outputs",
    "apiVersion": "2016-03-01",
    "properties": {
        "datasource": {
            "type": "string",
            "properties": {
            }
        },
        "serialization": {
            "type": "string",
            "properties": {
            }
        }
    }
}
You can use the reference() function to expose the output's details:
"output": {
"type": "object",
"value": "[reference(resourceId('Microsoft.StreamAnalytics/streamingjobs/outputs', 'JobName', 'outputName'), '2016-03-01', 'Full')]"
}
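Not an authoritative answer, but as a starting point: the Synapse (formerly SQL DW) output appears to use the Microsoft.Sql/Server/DataWarehouse datasource type. Treat the type string, apiVersion and property names below as assumptions to verify against your own portal export; the server, database, table and user values are placeholders.

{
    "name": "synapse-output",
    "type": "Microsoft.StreamAnalytics/streamingjobs/outputs",
    "apiVersion": "2017-04-01-preview",
    "properties": {
        "datasource": {
            "type": "Microsoft.Sql/Server/DataWarehouse",
            "properties": {
                "server": "my-synapse-server",
                "database": "my-sql-pool",
                "table": "dbo.StreamOutput",
                "user": "sqladminuser",
                "password": "<set via a secure parameter>"
            }
        }
    }
}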
I need to dynamically call an Azure Function from inside my ADF pipeline.
Currently I'm able to parameterize the functionName through the Azure Function activity, but I'm not able to parameterize either the functionKey or the URL.
The URL is not a problem, since I can host all the functions under the same URL, but parameterizing the functionKey really is a must for this.
Do you know of any option to do that?
What I've tried
A parameter inside the JSON, as with DataStoreLinkedServices:
{
    "properties": {
        "type": "AzureFunction",
        "annotations": [],
        "parameters": {
            "functionSecret": {
                "type": "String"
            }
        },
        "typeProperties": {
            "functionAppUrl": "https://<myurl>.azurewebsites.net",
            "functionKey": {
                "type": "AzureKeyVaultSecret",
                "store": {
                    "referenceName": "KeyVaultLinkedService",
                    "type": "LinkedServiceReference"
                },
                "secretName": "#{linkedService().functionSecret}"
            }
        }
    }
}
ErrorMsg:
"code":"BadRequest","message":"No value provided for Parameter 'functionSecret'"
Is there a way to achieve this? It seems not obvious, and I didn't find anything while searching the web. The most similar was this.
I'll answer myself in case someone has the same problem. What we did to manage this was to parameterize the needed information from the pipeline itself.
So we have a pipeline that just calls a generic Azure Function. In the caller pipeline, there is a process that obtains the desired parameters from Key Vault and passes them to the Azure Function pipeline (a rough sketch of that caller side follows the linked service below).
The linked service remains as follows:
{
    "properties": {
        "annotations": [],
        "type": "AzureFunction",
        "typeProperties": {
            "functionAppUrl": "https://#{linkedService().functionAppUrl}.azurewebsites.net",
            "functionKey": {
                "type": "AzureKeyVaultSecret",
                "store": {
                    "referenceName": "KeyVaultLinkedService",
                    "type": "LinkedServiceReference"
                },
                "secretName": "#{linkedService().functionKey}"
            }
        },
        "parameters": {
            "functionAppUrl": {
                "type": "String",
                "defaultValue": "#pipeline().parameters.functionAppUrl"
            },
            "functionKey": {
                "type": "String",
                "defaultValue": "#pipeline().parameters.functionKey"
            }
        }
    }
}
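For context, a rough sketch of the caller side under this approach (the pipeline, activity, vault and secret names are all made up): a Web activity reads the function key from Key Vault using the factory's managed identity, and an Execute Pipeline activity forwards it to the generic Azure Function pipeline.

{
    "name": "PL_Caller",
    "properties": {
        "activities": [
            {
                "name": "GetFunctionKey",
                "type": "WebActivity",
                "typeProperties": {
                    "url": "https://my-keyvault.vault.azure.net/secrets/my-function-key?api-version=7.0",
                    "method": "GET",
                    "authentication": {
                        "type": "MSI",
                        "resource": "https://vault.azure.net"
                    }
                }
            },
            {
                "name": "CallGenericFunctionPipeline",
                "type": "ExecutePipeline",
                "dependsOn": [
                    {
                        "activity": "GetFunctionKey",
                        "dependencyConditions": [ "Succeeded" ]
                    }
                ],
                "typeProperties": {
                    "pipeline": {
                        "referenceName": "PL_Generic_AzureFunction",
                        "type": "PipelineReference"
                    },
                    "parameters": {
                        "functionAppUrl": "myfunctionapp",
                        "functionKey": {
                            "value": "@activity('GetFunctionKey').output.value",
                            "type": "Expression"
                        }
                    }
                }
            }
        ]
    }
}

One caveat with this pattern: the key shows up in the Web activity's output in the monitoring view unless you enable "Secure output" on that activity (and "Secure input" on the activity that consumes it).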
I have a Data Factory (v1) which downloads some files from an HTTP server.
Within the dataset pointing to the file location on this server, we add an API key as an additional header to the HTTP request. We don't want this key to be visible in the portal, similar to how linked services mask credentials after they have been deployed.
The following JSON files define the source linked service, the source dataset and the copy activity.
HTTP_source_linkedservice.json
{
    "name": "HTTPSourceLinkedService",
    "properties": {
        "hubName": "this_is_a_hubname",
        "type": "Http",
        "typeProperties": {
            "url": "https://website.com",
            "authenticationType": "Anonymous"
        }
    }
}
HTTP_source_dataset
{
    "name": "HTTPSourceDataset",
    "properties": {
        "published": false,
        "type": "Http",
        "linkedServiceName": "HTTPSourceLinkedService",
        "typeProperties": {
            "relativeUrl": "/main_file_to_download",
            "additionalHeaders": "X-api-key: API_KEY_HERE\n"
        },
        "availability": {
            "frequency": "Day",
            "interval": 1
        },
        "external": true,
        "policy": {}
    }
}
Copy Activity
{
    "type": "Copy",
    "typeProperties": {
        "source": {
            "type": "HttpSource"
        },
        "sink": {
            "type": "BlobSink",
            "writeBatchSize": 0,
            "writeBatchTimeout": "00:00:00"
        }
    },
    "inputs": [
        {
            "name": "HTTPSourceDataset"
        }
    ],
    "outputs": [
        {
            "name": "HTTPSinkDataset"
        }
    ],
    "scheduler": {
        "frequency": "Day",
        "interval": 1
    },
    "name": "CopyFileFromServer"
}
I know we could use a Custom Activity to make the request itself and fetch the API key from Key Vault, but I really want to use the standard Copy Activity.
Is there a way to achieve this?
Unfortunately, I think this is not possible. Header fields are defined as plain strings, and v1 does not even have the secure string type that was introduced in v2 to mark a field as credentials.
But I don't think this can be achieved in v2 either, as the model type is fixed.