I created an object parameter named DWH in the dev environment of my ADF instance, with a value like {"SuperSet":"SuperSet/SuperSet.csv"}. Referencing it as pipeline().parameters.DWH.SuperSet works fine there, but after deploying to the production instance via Azure CI/CD, the scheduled trigger fails with the error below.
Operation on target SuperSet failed: The expression 'pipeline().parameters.DWH.SuperSet' cannot be evaluated because property 'SuperSet' cannot be selected. Property selection is not supported on values of type 'String'.
My arm-template-parameters-definition.json file has:
"Microsoft.DataFactory/factories/pipelines": {
"properties": {
"activities": [
{
"policy": {
"retry": "-",
"retryIntervalInSeconds": "-"
}
}
],
"parameters": {
"*": {
"defaultValue": "-::string"
}
}
}
}
I had to update arm-template-parameters-definition.json as shown below, and it worked. The wildcard rule "*": { "defaultValue": "-::string" } was exposing every pipeline parameter default, including the DWH object, as a string parameter in the generated ARM template, which is why production ended up with a string.
"Microsoft.DataFactory/factories/pipelines": {
"properties": {
"activities": [
{
"policy": {
"retry": "-",
"retryIntervalInSeconds": "-"
}
}
],
"parameters": {
"*": {
"defaultValue": "-::string"
},
"DWH": {
"type": "object",
"defaultValue": "=::object"
}
}
}
}
After the adf_publish branch was updated, the generated ARM template parameter changed as shown below.
Before:
"Aport_Import_properties_parameters_DWH": {
"type": "string"
}
After:
"Aport_Import_properties_parameters_DWH": {
"type": "object",
"defaultValue": {
"FILENAME": "techbit.csv"
}
}
"FILENAME": "techbit.csv" is the object that i had declared in the parameters in pipeline.
I have an exported template of a Logic App built in Azure. The JSON file contains values such as the subscription ID, which will vary depending on the environment. How do I use a variable instead of the literal value? As an example:
"defaultValue": "/subscriptions/**328974123908741329180713290587125**/resourceGroups/rg-management/providers/Microsoft.Web/connections/azureblob-5"
being replaced with something such as:
"defaultValue": "/subscriptions/**var.subscription**/resourceGroups/rg-management/providers/Microsoft.Web/connections/azureblob-5"
I'm not sure of two things: first, how to actually use this to create the Logic App; and second, my JSON template exported from the Azure Portal doesn't contain any information about the connections to the storage accounts, so I'm not sure how to work that part out. The JSON follows my updated module; it's fairly large.
Here's my updated terraform module:
terraform {
required_providers {
azurerm = {
configuration_aliases = [azurerm.env, azurerm.mgmt]
}
}
}
resource "azurerm_logic_app_workflow" "storage_replication" {
name = "logic-app-storage-replication-${var.environment_name}-${var.resource_location}"
location = var.resource_location
resource_group_name = var.resource_group
}
templatefile("${path.module}/LogicAppStorageReplicationTemplate.json", }"
{
subscription = var.subscription_id,
resource_group = var.resource_group,
resource_location = var.resource_location,
container = var.container
}
)
Here is the exported JSON template with the variables added in; I'll deal with Key Vault later:
{
"$schema": "https://schema.management.azure.com/schemas/2019-04-01/deploymentTemplate.json#",
"contentVersion": "1.0.0.0",
"parameters": {
"workflows_ReplicateStorage_name": {
"defaultValue": "ReplicateStorage",
"type": "String"
},
"connections_azureblob_4_externalid": {
"defaultValue": "/subscriptions/${subscription}/resourceGroups/${resource_group}/providers/Microsoft.Web/connections/azureblob-4",
"type": "String"
},
"connections_azureblob_5_externalid": {
"defaultValue": "/subscriptions/${subscription}/resourceGroups/${resource_group}/providers/Microsoft.Web/connections/azureblob-5",
"type": "String"
}
},
"variables": {},
"resources": [
{
"type": "Microsoft.Logic/workflows",
"apiVersion": "2017-07-01",
"name": "[parameters('workflows_ReplicateStorage_name')]",
"location": "${resource_location}",
"identity": {
"principalId": "#############",
"tenantId": "################",
"type": "SystemAssigned"
},
"properties": {
"state": "Enabled",
"definition": {
"$schema": "https://schema.management.azure.com/providers/Microsoft.Logic/schemas/2016-06-01/workflowdefinition.json#",
"contentVersion": "1.0.0.0",
"parameters": {
"$connections": {
"defaultValue": {},
"type": "Object"
}
},
"triggers": {
"When_a_blob_is_added_or_modified_(properties_only)_(V2)_2": {
"recurrence": {
"frequency": "Minute",
"interval": 10
},
"evaluatedRecurrence": {
"frequency": "Minute",
"interval": 10
},
"splitOn": "#triggerBody()",
"type": "ApiConnection",
"inputs": {
"host": {
"connection": {
"name": "#parameters('$connections')['azureblob']['connectionId']"
}
},
"method": "get",
"path": "/v2/datasets/#{encodeURIComponent(encodeURIComponent('AccountNameFromSettings'))}/triggers/batch/onupdatedfile",
"queries": {
"checkBothCreatedAndModifiedDateTime": false,
"folderId": "/${container}",
"maxFileCount": 1
}
}
}
},
"actions": {
"Create_blob_(V2)": {
"runAfter": {
"Get_blob_content_using_path_(V2)": [
"Succeeded"
]
},
"type": "ApiConnection",
"inputs": {
"body": "#body('Get_blob_content_using_path_(V2)')",
"headers": {
"ReadFileMetadataFromServer": true
},
"host": {
"connection": {
"name": "#parameters('$connections')['azureblob_1']['connectionId']"
}
},
"method": "post",
"path": "/v2/datasets/#{encodeURIComponent(encodeURIComponent('AccountNameFromSettings'))}/files",
"queries": {
"folderPath": "#{replace(body('Get_Blob_Metadata_using_path_(V2)')?['Path'], body('Get_Blob_Metadata_using_path_(V2)')?['Name'], '')}",
"name": "#body('Get_Blob_Metadata_using_path_(V2)')?['Name']",
"queryParametersSingleEncoded": true
}
},
"runtimeConfiguration": {
"contentTransfer": {
"transferMode": "Chunked"
}
}
},
"Get_Blob_Metadata_using_path_(V2)": {
"runAfter": {},
"type": "ApiConnection",
"inputs": {
"host": {
"connection": {
"name": "#parameters('$connections')['azureblob']['connectionId']"
}
},
"method": "get",
"path": "/v2/datasets/#{encodeURIComponent(encodeURIComponent('AccountNameFromSettings'))}/GetFileByPath",
"queries": {
"path": "#triggerBody()?['Path']",
"queryParametersSingleEncoded": true
}
}
},
"Get_blob_content_using_path_(V2)": {
"runAfter": {
"Get_Blob_Metadata_using_path_(V2)": [
"Succeeded"
]
},
"type": "ApiConnection",
"inputs": {
"host": {
"connection": {
"name": "#parameters('$connections')['azureblob']['connectionId']"
}
},
"method": "get",
"path": "/v2/datasets/#{encodeURIComponent(encodeURIComponent('AccountNameFromSettings'))}/GetFileContentByPath",
"queries": {
"inferContentType": true,
"path": "#body('Get_Blob_Metadata_using_path_(V2)')?['Path']",
"queryParametersSingleEncoded": true
}
}
}
},
"outputs": {}
},
"parameters": {
"$connections": {
"value": {
"azureblob": {
"connectionId": "[parameters('connections_azureblob_4_externalid')]",
"connectionName": "azureblob-4",
"id": "/subscriptions/${subscription}/providers/Microsoft.Web/locations/${resource_location}/managedApis/azureblob"
},
"azureblob_1": {
"connectionId": "[parameters('connections_azureblob_5_externalid')]",
"connectionName": "azureblob-5",
"id": "/subscriptions/${subscription}/providers/Microsoft.Web/locations/${resource_location}/managedApis/azureblob"
}
}
}
}
}
}
]
}
You can do this with the templatefile function.
First, in your .json file, replace the id with an interpolation expression, like this:
"defaultValue": "/subscriptions/${subscription}/resourceGroups/rg-management/providers/Microsoft.Web/connections/azureblob-5"
Then in Terraform, you would generate the final JSON like this:
templatefile("${path.module}/my_template.json", {
subscription = "328974123908741329180713290587125"
})
You mentioned you want the value to come from an Azure Key Vault secret, so that would look like this:
templatefile("${path.module}/my_template.json", {
subscription = azurerm_key_vault_secret.my_secret.value
})
Of course the result of the templatefile() function call needs to be assigned to a local variable, a resource property, or a module input, but without seeing any of your code it's difficult to give a more complete answer.
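For example, one way to turn the rendered template into an actual Logic App deployment is to hand it to an azurerm_resource_group_template_deployment resource. A minimal sketch, reusing the variable names from the question (the resource and deployment names here are made up):
# Deploy the rendered ARM template; this creates the Logic App workflow
# and its connections parameters exactly as defined in the exported JSON.
resource "azurerm_resource_group_template_deployment" "logic_app" {
  name                = "logic-app-storage-replication"
  resource_group_name = var.resource_group
  deployment_mode     = "Incremental"

  # templatefile() substitutes the ${...} placeholders with the
  # environment-specific values before the template is deployed.
  template_content = templatefile("${path.module}/LogicAppStorageReplicationTemplate.json", {
    subscription      = var.subscription_id
    resource_group    = var.resource_group
    resource_location = var.resource_location
    container         = var.container
  })
}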
My question is in regard to API schemas in Azure API Management, resource type Microsoft.ApiManagement/service/apis/schemas.
A schema created through the Azure Portal is created with the content type application/vnd.oai.openapi.components+json and written to document/components/schemas, which is the correct path for schema definitions in an OpenAPI definition.
Sample:
{
"value": [
{
"id": "/subscriptions/xxx/resourceGroups/xxx/providers/Microsoft.ApiManagement/service/xxx/apis/xxx/schemas/1629566051926",
"type": "Microsoft.ApiManagement/service/apis/schemas",
"name": "1629566051926",
"properties": {
"contentType": "application/vnd.oai.openapi.components+json",
"document": {
"components": {
"schemas": {
"patch-components-id-request-1": {
"type": "object",
"properties": {
"name": {
"type": "string"
},
"description": {
"type": "string"
},
"documentationURL": {
"type": "string"
},
"iacURL": {
"type": "string"
},
"duration": {
"type": "integer"
},
"statusID": {
"type": "integer"
},
"owner": {
"type": "string"
}
}
}
}
}
}
}
}
],
"count": 1
}
A schema created through the REST API or the Go SDK is set using properties.document.definitions, which leads to it being written to document/definitions, no matter the contentType.
Sample:
{
"value": [
{
"id": "/subscriptions/xxx/resourceGroups/xxx/providers/Microsoft.ApiManagement/service/xxx/apis/marble-dev-fctn/schemas/marbleschemas",
"type": "Microsoft.ApiManagement/service/apis/schemas",
"name": "marbleschemas",
"properties": {
"contentType": "application/vnd.oai.openapi.components+json",
"document": {
"definitions": {
"patch-components-id-request-1": {
"type": "object",
"properties": {
"name": {
"type": "string"
},
"description": {
"type": "string"
},
"documentationURL": {
"type": "string"
},
"iacURL": {
"type": "string"
},
"duration": {
"type": "integer"
},
"statusID": {
"type": "integer"
},
"owner": {
"type": "string"
}
}
}
}
}
}
}
],
"count": 1
}
When the contentType is set to application/vnd.ms-azure-apim.swagger.definitions+json this is fine, but when it is set to application/vnd.oai.openapi.components+json two problems arise: the definition is not included when exporting the OpenAPI schema (at least when exporting from the Azure Portal; I have not tried any other way), since definitions is not a valid OpenAPI field, and it is not shown in the Developer Portal.
As far as I understand it, the definitions would need to be written to document/components/schemas for application/vnd.oai.openapi.components+json and to document/definitions for application/vnd.ms-azure-apim.swagger.definitions+json.
I can import the definition to the correct path by setting it manually in an ARM template or a REST API call, but I am working on getting the Terraform resource to work correctly, which relies on the Go SDK. There might be a workaround for that as well, but I would really like to find out whether my understanding is wrong or whether there is a problem here.
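For reference, this is roughly the request body of that manual workaround, a PUT on the Microsoft.ApiManagement/service/apis/schemas resource via the REST API (schema content trimmed to a single property; the schema name is taken from the samples above), which places the schema under document/components/schemas where the portal and the OpenAPI export expect it:
{
  "properties": {
    "contentType": "application/vnd.oai.openapi.components+json",
    "document": {
      "components": {
        "schemas": {
          "patch-components-id-request-1": {
            "type": "object",
            "properties": {
              "name": {
                "type": "string"
              }
            }
          }
        }
      }
    }
  }
}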
I need to dynamically call an Azure Function from inside my ADF pipeline.
Currently I'm able to parameterize the functionName through the Azure Function activity, but I'm not able to parameterize the functionKey or the URL.
The URL is not a problem, since I can host all the functions under the same URL, but parameterizing the functionKey is really a must.
Do you know of any option to do that?
What I've tried
A parameter inside the linked service JSON, as with data store linked services:
{
"properties": {
"type": "AzureFunction",
"annotations": [],
"parameters": {
"functionSecret": {
"type": "String"
}
},
"typeProperties": {
"functionAppUrl": "https://<myurl>.azurewebsites.net",
"functionKey": {
"type": "AzureKeyVaultSecret",
"store": {
"referenceName": "KeyVaultLinkedService",
"type": "LinkedServiceReference"
},
"secretName": "#{linkedService().functionSecret}"
}
}
}
}
ErrorMsg:
"code":"BadRequest","message":"No value provided for Parameter 'functionSecret'"
Is there a way to achieve this? It doesn't seem obvious, and I didn't find anything searching the web. The most similar thing I found was this.
I'll answer myself in case someone has the same problem: what we did to manage this was to parameterize the needed information from the pipeline itself.
So we have a pipeline that just calls a generic Azure Function. In the caller pipeline, there is a process that obtains the desired parameters from Key Vault and passes them to the Azure Function pipeline.
The linked service remains as follows:
{
"properties": {
"annotations": [],
"type": "AzureFunction",
"typeProperties": {
"functionAppUrl": "https://#{linkedService().functionAppUrl}.azurewebsites.net",
"functionKey": {
"type": "AzureKeyVaultSecret",
"store": {
"referenceName": "KeyVaultLinkedService",
"type": "LinkedServiceReference"
},
"secretName": "#{linkedService().functionKey}"
}
},
"parameters": {
"functionAppUrl": {
"type": "String",
"defaultValue": "#pipeline().parameters.functionAppUrl"
},
"functionKey": {
"type": "String",
"defaultValue": "#pipeline().parameters.functionKey"
}
}
}
}
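For completeness, a minimal sketch of the caller pipeline's two activities (the vault, secret, pipeline, and function app names in angle brackets are placeholders, not values from our setup): a Web activity reads the key from Key Vault with the data factory's managed identity, and an Execute Pipeline activity passes it on to the generic Azure Function pipeline.
[
  {
    "name": "GetFunctionKey",
    "type": "WebActivity",
    "typeProperties": {
      "url": "https://<your-key-vault>.vault.azure.net/secrets/<your-secret-name>?api-version=7.0",
      "method": "GET",
      "authentication": {
        "type": "MSI",
        "resource": "https://vault.azure.net"
      }
    }
  },
  {
    "name": "CallAzureFunctionPipeline",
    "type": "ExecutePipeline",
    "dependsOn": [
      {
        "activity": "GetFunctionKey",
        "dependencyConditions": [
          "Succeeded"
        ]
      }
    ],
    "typeProperties": {
      "pipeline": {
        "referenceName": "<generic-azure-function-pipeline>",
        "type": "PipelineReference"
      },
      "waitOnCompletion": true,
      "parameters": {
        "functionAppUrl": "<your-function-app-name>",
        "functionKey": "@activity('GetFunctionKey').output.value"
      }
    }
  }
]
The factory's managed identity needs Get permission on the vault's secrets, and it is worth enabling the secure output option on the Web activity so the key does not end up in the activity run output.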
I am looking at the link below.
https://azure.microsoft.com/en-us/updates/data-factory-supports-wildcard-file-filter-for-copy-activity/
We are supposed to have the ability to use wildcard characters in folder paths and file names. If we click on the 'Activity' and click 'Source', we see this view.
I would like to loop through months and days, so it should be something like this view.
Of course that doesn't actually work. I'm getting errors that read: ErrorCode: 'PathNotFound'. Message: 'The specified path does not exist.'. How can I get the tool to recursively iterate through all files in all folders, given a specific pattern of strings in a file path and file name? Thanks.
I would like to loop through months and days
In order to do this, you can pass two parameters to the activity from your pipeline so that the path can be built dynamically from those parameters. ADF V2 allows you to pass parameters.
Let's go through the process step by step:
1. Create a pipeline and pass two parameters in it for your month and day.
Note: These parameters can also be passed from the output of other activities if needed. Reference: Parameters in ADF
2. Create two datasets.
2.1 Sink dataset - Blob Storage here. Link it to your linked service and provide the container name (make sure it exists). Again, if needed, this can be passed as a parameter.
2.2 Source dataset - Blob Storage here again, or whatever fits your need. Link it to your linked service and provide the container name (make sure it exists). Again, if needed, this can be passed as a parameter.
Note:
1. The folder path determines where the data is copied. If the container does not exist, the activity creates it for you, and if the file already exists it is overwritten by default.
2. Pass parameters to the dataset if you want to build the output path dynamically. Here I have created two dataset parameters named monthcopy and datacopy.
3. Create a Copy activity in the pipeline.
Wildcard Folder Path:
@{concat(formatDateTime(adddays(utcnow(),-1),'yyyy'),'/',string(pipeline().parameters.month),'/',string(pipeline().parameters.day),'/*')}
The path resolves to current-year/month-passed/day-passed/* (the * matches any folder one level down), e.g. 2021/07/05/* if the run year is 2021, month is 07 and day is 05.
JSON Template for the pipeline:
{
"name": "pipeline2",
"properties": {
"activities": [
{
"name": "Copy Data1",
"type": "Copy",
"dependsOn": [],
"policy": {
"timeout": "7.00:00:00",
"retry": 0,
"retryIntervalInSeconds": 30,
"secureOutput": false,
"secureInput": false
},
"userProperties": [],
"typeProperties": {
"source": {
"type": "DelimitedTextSource",
"storeSettings": {
"type": "AzureBlobStorageReadSettings",
"recursive": true,
"wildcardFolderPath": {
"value": "#{concat(formatDateTime(adddays(utcnow(),-1),'yyyy'),'/',string(pipeline().parameters.month),'/',string(pipeline().parameters.day),'/*')}",
"type": "Expression"
},
"wildcardFileName": "*.csv",
"enablePartitionDiscovery": false
},
"formatSettings": {
"type": "DelimitedTextReadSettings"
}
},
"sink": {
"type": "DelimitedTextSink",
"storeSettings": {
"type": "AzureBlobStorageWriteSettings"
},
"formatSettings": {
"type": "DelimitedTextWriteSettings",
"quoteAllText": true,
"fileExtension": ".csv"
}
},
"enableStaging": false
},
"inputs": [
{
"referenceName": "DelimitedText1",
"type": "DatasetReference"
}
],
"outputs": [
{
"referenceName": "DelimitedText2",
"type": "DatasetReference",
"parameters": {
"monthcopy": {
"value": "#pipeline().parameters.month",
"type": "Expression"
},
"datacopy": {
"value": "#pipeline().parameters.day",
"type": "Expression"
}
}
}
]
}
],
"parameters": {
"month": {
"type": "string"
},
"day": {
"type": "string"
}
},
"annotations": []
}
}
JSON Template for the source dataset (DelimitedText1, the copy activity's input):
{
"name": "DelimitedText1",
"properties": {
"linkedServiceName": {
"referenceName": "AzureBlobStorage1",
"type": "LinkedServiceReference"
},
"annotations": [],
"type": "DelimitedText",
"typeProperties": {
"location": {
"type": "AzureBlobStorageLocation",
"container": "corpdata"
},
"columnDelimiter": ",",
"escapeChar": "\\",
"quoteChar": "\""
},
"schema": []
}
}
JSON Template for the sink dataset (DelimitedText2, the copy activity's parameterized output):
{
"name": "DelimitedText2",
"properties": {
"linkedServiceName": {
"referenceName": "AzureBlobStorage1",
"type": "LinkedServiceReference"
},
"parameters": {
"monthcopy": {
"type": "string"
},
"datacopy": {
"type": "string"
}
},
"annotations": [],
"type": "DelimitedText",
"typeProperties": {
"location": {
"type": "AzureBlobStorageLocation",
"folderPath": {
"value": "#concat(formatDateTime(adddays(utcnow(),-1),'yyyy'),dataset().monthcopy,'/',dataset().datacopy)",
"type": "Expression"
},
"container": "copycorpdata"
},
"columnDelimiter": ",",
"escapeChar": "\\",
"quoteChar": "\""
},
"schema": []
}
}
I am in the process of integrating an existing Azure Data Factory project into my solution. While reviewing the data factory pipelines, I noticed that they all use SqlSource and the destination is AzureQueueSink.
The input datasets are:
1. An on-premises table
2. The output of a stored procedure
The output is an Azure SQL table.
Now I am confused as to when to use this AzureQueueSink. I searched on Google but did not find any information about its use case.
Below is the sample pipeline activity.
{
"$schema": "http://datafactories.schema.management.azure.com/schemas/2015-09-01/Microsoft.DataFactory.Pipeline.json",
"name": "OnPremToAzureList",
"properties": {
"activities": [
{
"type": "SqlServerStoredProcedure",
"typeProperties": {
"storedProcedureName": "dbo.TruncateStgTable",
"storedProcedureParameters": { "TableName": "[dbo].[List]" }
},
"inputs": [
{
"name": "AzureSqlTableStart"
}
],
"outputs": [
{
"name": "AzureSqlTableTruncate"
}
],
"scheduler": {
"frequency": "Day",
"interval": 1
},
"name": "SPTruncateStgTable"
},
{
"name": "CopyActivityList",
"type": "Copy",
"inputs": [
{
"name": "OnPremList"
},
{
"name": "AzureSqlTableTruncate"
}
],
"outputs": [
{
"name": "AzureSqlTableList"
}
],
"typeProperties": {
"source": {
"type": "SqlSource",
"sqlReaderQuery": "select * from dbo.List"
},
"sink": {
"type": "AzureQueueSink",
"writeBatchSize": 1000,
"writeBatchTimeout": "00:30:00"
}
},
"policy": {
"concurrency": 1,
"executionPriorityOrder": "OldestFirst",
"retry": 1,
"timeout": "01:00:00"
},
"scheduler": {
"frequency": "Day",
"interval": 1
}
}
]
}
}
Any help is greatly appreciated.
Please do not use AzureQueueSink: copying into an Azure Queue never shipped, and we have no plans to bring it back. It leaked into our SDK/schema by mistake. :)
This sink type currently gives you nondeterministic behavior that happens to work, but that behavior will not last.
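Since the output dataset in the sample above is an Azure SQL table, the supported sink type for that copy activity would be SqlSink; a minimal sketch of the sink section under that assumption, keeping the original batch settings:
"sink": {
  "type": "SqlSink",
  "writeBatchSize": 1000,
  "writeBatchTimeout": "00:30:00"
}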