I'm trying to get data from Azure Table Storage using Azure Data Factory. I have a table called Orders which has 30 columns. I want to take only 3 columns from this table (PartitionKey, RowKey and DeliveryDate). The DeliveryDate column contains mixed values: some rows hold the string "DateTime.Null" while others hold actual datetime values. When I try to preview the data I get the following error:
The dataset looks like this:
{
"name": "Orders",
"properties": {
"linkedServiceName": {
"referenceName": "AzureTableStorage",
"type": "LinkedServiceReference"
},
"annotations": [],
"type": "AzureTable",
"structure": [
{
"name": "PartitionKey",
"type": "String"
},
{
"name": "RowKey",
"type": "String"
},
{
"name": "DeliveryDate",
"type": "String"
}
],
"typeProperties": {
"tableName": "Orders"
}
},
"type": "Microsoft.DataFactory/factories/datasets"}
I tested your scenario and it works. Can you share more detail, or is there something different about my test?
Below is my test data:
The dataset code:
{
"name": "Order",
"properties": {
"linkedServiceName": {
"referenceName": "AzureTableStorage",
"type": "LinkedServiceReference"
},
"annotations": [],
"type": "AzureTable",
"structure": [
{
"name": "PartitionKey",
"type": "String"
},
{
"name": "RowKey",
"type": "String"
},
{
"name": "DeliveryDate",
"type": "String"
}
],
"typeProperties": {
"tableName": "Table7"
}
},
"type": "Microsoft.DataFactory/factories/datasets"
}
I need to maintain a folder structure that stores files in yyyy/MM/dd format. I am getting a date like "2021-12-01T00:00:00Z"; I need to extract the year into one variable, the month into another, and the day into a third, so that I can concat these variables in the Copy activity Sink section.
Yes, you can maintain the folder structure by using split().
First create a pipeline parameter and give it a value.
Then create three Set variable activities for year, month and date.
Inside the year, month and date activities, add these dynamic content values:
Year: @split(pipeline().parameters.fileName,'-')[0]
Month: @split(pipeline().parameters.fileName,'-')[1]
Date: @split(split(pipeline().parameters.fileName,'-')[2],'T')[0]
For reference, here is the JSON representation of the pipeline:
{
"name": "pipeline1",
"properties": {
"activities": [
{
"name": "year",
"type": "SetVariable",
"dependsOn": [],
"userProperties": [],
"typeProperties": {
"variableName": "year",
"value": {
"value": "#split(pipeline().parameters.fileName,'-')[0]",
"type": "Expression"
}
}
},
{
"name": "month",
"type": "SetVariable",
"dependsOn": [
{
"activity": "year",
"dependencyConditions": [
"Succeeded"
]
}
],
"userProperties": [],
"typeProperties": {
"variableName": "month",
"value": {
"value": "#split(pipeline().parameters.fileName,'-')[1]",
"type": "Expression"
}
}
},
{
"name": "date",
"type": "SetVariable",
"dependsOn": [
{
"activity": "month",
"dependencyConditions": [
"Succeeded"
]
}
],
"userProperties": [],
"typeProperties": {
"variableName": "date",
"value": {
"value": "#split(split(pipeline().parameters.fileName,'-')[2],'T')[0]",
"type": "Expression"
}
}
}
],
"parameters": {
"fileName": {
"type": "string",
"defaultValue": "2021-12-01T00:00:00Z.csv"
}
},
"variables": {
"year": {
"type": "String"
},
"month": {
"type": "String"
},
"date": {
"type": "String"
}
},
"annotations": [],
"lastPublishTime": "2023-02-15T05:42:23Z"
},
"type": "Microsoft.DataFactory/factories/pipelines"
}
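To build the folder path in the Copy activity sink, as the question asks, the three variables can then simply be concatenated. A minimal sketch, assuming the sink dataset exposes a folder-path parameter (named folderPath here for illustration; it is not part of the pipeline above) that the Copy activity sink fills with:
@concat(variables('year'),'/',variables('month'),'/',variables('date'))
With the default fileName value above, this resolves to 2021/12/01.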
Use a metadata activity to get the file list, and then a foreach activity to copy each file to the appropriate folder:
Metadata activity settings:
On the ForEach activity, loop through all the child items by setting Items to:
@activity('Get files Metadata').output.childItems
Inside the ForEach loop, add a Copy activity to copy each file:
Source:
Source dataset (with a parameter on the file name, to copy one file only):
Sink settings:
Expression to pass to folder parameter:
@concat(
formatDateTime(item().name ,'yyyy'),
'/',
formatDateTime(item().name ,'MM'),
'/',
formatDateTime(item().name ,'dd')
)
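Note that formatDateTime expects a parseable timestamp, so if the child item names include a file extension (for example the 2021-12-01T00:00:00Z.csv name from the question), the expression above would likely fail. A hedged variant that strips an assumed .csv extension first:
@concat(
formatDateTime(replace(item().name,'.csv',''),'yyyy'),
'/',
formatDateTime(replace(item().name,'.csv',''),'MM'),
'/',
formatDateTime(replace(item().name,'.csv',''),'dd')
)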
Sink dataset, with a parameter on the folder name to create the hierarchy:
The result:
I'm trying to access the values in the JSON output that I received from two Graph API calls, but each time I try to use them I get this error:
ExpressionEvaluationFailed. The execution of template action 'For_each' failed: the result of the evaluation of 'foreach' expression '@body('Parse_JSON_-_Managed_Devices')?['body']?['value']' is of type 'Null'. The result must be a valid array.
I have validated that my Graph API calls are properly formatted, and the output is exactly what I expect from both calls. I get this error every time I try to access the parsed JSON, whether from an Azure Runbook or from any other Logic App action.
Has anyone experienced this before, and how can it be solved?
Graph query: https://graph.microsoft.com/beta/deviceManagement/managedDevices/?$select=id,userId,deviceName,userDisplayName,azureADDeviceId,managedDeviceName,emailAddress&$filter=operatingSystem eq 'windows'
JSON schema for managed devices
{
"properties": {
"body": {
"properties": {
"##odata.context": {
"type": "string"
},
"##odata.count": {
"type": "integer"
},
"##odata.nextLink": {
"type": "string"
},
"value": {
"items": {
"properties": {
"azureADDeviceId": {
"type": "string"
},
"deviceName": {
"type": "string"
},
"emailAddress": {
"type": "string"
},
"id": {
"type": "string"
},
"managedDeviceName": {
"type": "string"
},
"userDisplayName": {
"type": "string"
},
"userId": {
"type": "string"
}
},
"required": [
"id",
"userId",
"deviceName",
"userDisplayName",
"azureADDeviceId",
"managedDeviceName",
"emailAddress"
],
"type": "object"
},
"type": "array"
}
},
"type": "object"
}
},
"type": "object"
}
Graph query: https://graph.microsoft.com/beta/users?$select=id,displayName,mail,officeLocation&$filter=accountEnabled eq true
JSON schema used for users
{
"properties": {
"body": {
"properties": {
"##odata.context": {
"type": "string"
},
"##odata.nextLink": {
"type": "string"
},
"value": {
"items": {
"properties": {
"displayName": {
"type": "string"
},
"id": {
"type": "string"
},
"mail": {
"type": "string"
},
"officeLocation": {
"type": "string"
}
},
"required": [
"id",
"displayName",
"mail",
"officeLocation"
],
"type": "object"
},
"type": "array"
}
},
"type": "object"
}
},
"type": "object"
}
I am looking at the link below.
https://azure.microsoft.com/en-us/updates/data-factory-supports-wildcard-file-filter-for-copy-activity/
We are supposed to have the ability to use wildcard characters in folder paths and file names. If we click on the 'Activity' and click 'Source', we see this view.
I would like to loop through months and days, so it should be something like this view.
Of course that doesn't actually work. I'm getting errors that read: ErrorCode: 'PathNotFound'. Message: 'The specified path does not exist.'. How can I get the tool to recursively iterate through all files in all folders, given a specific pattern of strings in a file path and file name? Thanks.
I would like to loop through months and days
In order to do this you can pass two parameters to the activity from your pipeline so that the path can be built dynamically based on those parameters. ADF V2 allows you to pass parameters.
Let's go through the process step by step:
1. Create a pipeline and add two parameters to it for your month and day.
Note: These parameters can also be populated from the output of other activities if needed. Reference: Parameters in ADF
2. Create two datasets.
2.1 Sink dataset - Blob Storage here. Link it with your linked service and provide the container name (make sure it exists). Again, if needed, this can be passed as a parameter.
2.2 Source dataset - Blob Storage here again, or whatever fits your need. Link it with your linked service and provide the container name (make sure it exists). Again, if needed, this can be passed as a parameter.
Note:
1. The folder path determines where the data is copied. If the container does not exist, the activity will create it for you, and if the file already exists it will be overwritten by default.
2. Pass parameters to the dataset if you want to build the output path dynamically. Here I have created two parameters on the sink dataset, named monthcopy and datacopy.
3. Create a Copy activity in the pipeline.
Wildcard Folder Path:
@{concat(formatDateTime(adddays(utcnow(),-1),'yyyy'),'/',string(pipeline().parameters.month),'/',string(pipeline().parameters.day),'/*')}
The path resolves to: <year of yesterday's UTC date>/<month parameter>/<day parameter>/* (the * matches any single folder at that level).
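For example, if the pipeline runs some time in 2020 (so adddays(utcnow(),-1) still falls in 2020) with month = 12 and day = 07, the expression resolves to roughly:
2020/12/07/*
and the Copy activity reads every folder one level below 2020/12/07, picking up files that match *.csv.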
Test Result:
JSON Template for the pipeline:
{
"name": "pipeline2",
"properties": {
"activities": [
{
"name": "Copy Data1",
"type": "Copy",
"dependsOn": [],
"policy": {
"timeout": "7.00:00:00",
"retry": 0,
"retryIntervalInSeconds": 30,
"secureOutput": false,
"secureInput": false
},
"userProperties": [],
"typeProperties": {
"source": {
"type": "DelimitedTextSource",
"storeSettings": {
"type": "AzureBlobStorageReadSettings",
"recursive": true,
"wildcardFolderPath": {
"value": "#{concat(formatDateTime(adddays(utcnow(),-1),'yyyy'),'/',string(pipeline().parameters.month),'/',string(pipeline().parameters.day),'/*')}",
"type": "Expression"
},
"wildcardFileName": "*.csv",
"enablePartitionDiscovery": false
},
"formatSettings": {
"type": "DelimitedTextReadSettings"
}
},
"sink": {
"type": "DelimitedTextSink",
"storeSettings": {
"type": "AzureBlobStorageWriteSettings"
},
"formatSettings": {
"type": "DelimitedTextWriteSettings",
"quoteAllText": true,
"fileExtension": ".csv"
}
},
"enableStaging": false
},
"inputs": [
{
"referenceName": "DelimitedText1",
"type": "DatasetReference"
}
],
"outputs": [
{
"referenceName": "DelimitedText2",
"type": "DatasetReference",
"parameters": {
"monthcopy": {
"value": "#pipeline().parameters.month",
"type": "Expression"
},
"datacopy": {
"value": "#pipeline().parameters.day",
"type": "Expression"
}
}
}
]
}
],
"parameters": {
"month": {
"type": "string"
},
"day": {
"type": "string"
}
},
"annotations": []
}
}
JSON Template for the source dataset (DelimitedText1, the Copy activity input that the wildcard path is read from):
{
"name": "DelimitedText1",
"properties": {
"linkedServiceName": {
"referenceName": "AzureBlobStorage1",
"type": "LinkedServiceReference"
},
"annotations": [],
"type": "DelimitedText",
"typeProperties": {
"location": {
"type": "AzureBlobStorageLocation",
"container": "corpdata"
},
"columnDelimiter": ",",
"escapeChar": "\\",
"quoteChar": "\""
},
"schema": []
}
}
JSON Template for the sink dataset (DelimitedText2, the Copy activity output whose folder path is built from the parameters):
{
"name": "DelimitedText2",
"properties": {
"linkedServiceName": {
"referenceName": "AzureBlobStorage1",
"type": "LinkedServiceReference"
},
"parameters": {
"monthcopy": {
"type": "string"
},
"datacopy": {
"type": "string"
}
},
"annotations": [],
"type": "DelimitedText",
"typeProperties": {
"location": {
"type": "AzureBlobStorageLocation",
"folderPath": {
"value": "#concat(formatDateTime(adddays(utcnow(),-1),'yyyy'),dataset().monthcopy,'/',dataset().datacopy)",
"type": "Expression"
},
"container": "copycorpdata"
},
"columnDelimiter": ",",
"escapeChar": "\\",
"quoteChar": "\""
},
"schema": []
}
}
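To start a run with concrete values for month and day, the pipeline parameters are supplied at trigger time. A hedged sketch using the Data Factory createRun REST API (subscription, resource group and factory names are placeholders):
POST https://management.azure.com/subscriptions/<subscription-id>/resourceGroups/<resource-group>/providers/Microsoft.DataFactory/factories/<factory-name>/pipelines/pipeline2/createRun?api-version=2018-06-01
{
"month": "12",
"day": "07"
}
The same values can of course also be supplied from a trigger or from the Debug run prompt in the ADF UI.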
How do I get the subject property from the payload below?
I've got an http-triggered logic app:
I want to be able to grab the contents of the subject property.
The schema as shown above in the image looks like this:
{
"type": "array",
"items": {
"type": "object",
"properties": {
"topic": {
"type": "string"
},
"subject": {
"type": "string"
},
"eventType": {
"type": "string"
},
"eventTime": {
"type": "string"
},
"id": {
"type": "string"
},
"data": {
"type": "object",
"properties": {
"api": {
"type": "string"
},
"clientRequestId": {
"type": "string"
},
"requestId": {
"type": "string"
},
"eTag": {
"type": "string"
},
"contentType": {
"type": "string"
},
"contentLength": {
"type": "integer"
},
"blobType": {
"type": "string"
},
"url": {
"type": "string"
},
"sequencer": {
"type": "string"
},
"storageDiagnostics": {
"type": "object",
"properties": {
"batchId": {
"type": "string"
}
}
}
}
},
"dataVersion": {
"type": "string"
},
"metadataVersion": {
"type": "string"
}
},
"required": [
"topic",
"subject",
"eventType",
"eventTime",
"id",
"data",
"dataVersion",
"metadataVersion"
]
}
}
How do I get the subject property from this payload?
Go to your Logic App designer in the Azure portal; there you can assign the JSON values to variables in your flow.
Here is the link on how to do this:
With the Request trigger, if you want to get a property you need to parse the request body as JSON, because the triggerBody() value is a string and does not support selecting properties. Set up the Parse JSON action as shown in the picture below.
Your JSON then wraps the data in an array, which is the next problem you will run into. So when you select the property you need to add an index, for example with the expression: body('Parse_JSON')[0]['subject'].
I tested with a short JSON containing just the two properties subject and topic.
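For context, here is a trimmed-down, illustrative payload of the shape involved (not the exact event; real Event Grid blob events carry more fields, as in the schema above):
[
{
"topic": "/subscriptions/<subscription-id>/resourceGroups/<rg>/providers/Microsoft.Storage/storageAccounts/<account>",
"subject": "/blobServices/default/containers/<container>/blobs/<blob-name>",
"eventType": "Microsoft.Storage.BlobCreated"
}
]
Because the outer payload is an array, body('Parse_JSON')[0]['subject'] picks the subject of the first (and here only) event.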
I am setting up some Azure budget alerts to call a Logic App webhook to perform an action.
In the budget I have specified alert conditions to fire an action group at 25%, 50% and 75% of budget. The action group has an action to call the Logic App webhook using the common alert schema.
I have a "When a HTTP request is received" Logic App set up with the simple alert payload and a step to process the request.
In this processing step I would like to have access to details of the budget that triggered the alert (budget name, % of budget etc) but the sample schema does not contain that information:
{
"properties": {
"data": {
"properties": {
"alertContext": {
"properties": {
"condition": {
"properties": {
"allOf": {
"items": {
"properties": {
"dimensions": {
"items": {
"properties": {
"name": {
"type": "string"
},
"value": {
"type": "string"
}
},
"required": [
"name",
"value"
],
"type": "object"
},
"type": "array"
},
"metricName": {
"type": "string"
},
"metricNamespace": {
"type": "string"
},
"metricValue": {
"type": "number"
},
"operator": {
"type": "string"
},
"threshold": {
"type": "string"
},
"timeAggregation": {
"type": "string"
}
},
"required": [
"metricName",
"metricNamespace",
"operator",
"threshold",
"timeAggregation",
"dimensions",
"metricValue"
],
"type": "object"
},
"type": "array"
},
"windowSize": {
"type": "string"
}
},
"type": "object"
},
"conditionType": {
"type": "string"
},
"properties": {}
},
"type": "object"
},
"essentials": {
"properties": {
"alertContextVersion": {
"type": "string"
},
"alertId": {
"type": "string"
},
"alertRule": {
"type": "string"
},
"alertTargetIDs": {
"items": {
"type": "string"
},
"type": "array"
},
"description": {
"type": "string"
},
"essentialsVersion": {
"type": "string"
},
"firedDateTime": {
"type": "string"
},
"monitorCondition": {
"type": "string"
},
"monitoringService": {
"type": "string"
},
"originAlertId": {
"type": "string"
},
"resolvedDateTime": {
"type": "string"
},
"severity": {
"type": "string"
},
"signalType": {
"type": "string"
}
},
"type": "object"
}
},
"type": "object"
},
"schemaId": {
"type": "string"
}
},
"type": "object"
}
Is there somewhere a schema template with all of the possible fields for a budget alert, so that my Logic App can use those budget fields as dynamic content in subsequent steps?
Thanks
I created a Logic App that writes the incoming JSON to blob storage and added it as a webhook in the budget alert action group.
I received the following message, which looks like the schema for budget alerts:
{
"schemaId": "AIP Budget Notification",
"data": {
"SubscriptionName": "",
"SubscriptionId": "",
"EnrollmentNumber": "",
"DepartmentName": "",
"AccountName": "",
"BillingAccountId": "",
"BillingProfileId": "",
"InvoiceSectionId": "",
"ResourceGroup": "",
"SpendingAmount": "",
"BudgetStartDate": "",
"Budget": "",
"Unit": "",
"BudgetCreator": "",
"BudgetName": "",
"BudgetType": "",
"NotificationThresholdAmount": ""
}
}
It looks like Microsoft does mention this schema in their documentation, though in a slightly hidden manner (look for the JSON in the article below):
https://learn.microsoft.com/en-us/azure/billing/billing-cost-management-budget-scenario#create-an-azure-logic-app-for-orchestration
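Based on that sample payload, a minimal Parse JSON schema sketch for the budget notification could look like the one below (all fields are assumed to be strings because the sample shows empty strings; adjust the types if your actual payload sends numbers). The remaining fields from the sample (EnrollmentNumber, DepartmentName, AccountName, BillingAccountId, BillingProfileId, InvoiceSectionId, ResourceGroup, BudgetStartDate, BudgetCreator) follow the same pattern.
{
"type": "object",
"properties": {
"schemaId": {
"type": "string"
},
"data": {
"type": "object",
"properties": {
"SubscriptionName": {
"type": "string"
},
"SubscriptionId": {
"type": "string"
},
"SpendingAmount": {
"type": "string"
},
"Budget": {
"type": "string"
},
"Unit": {
"type": "string"
},
"BudgetName": {
"type": "string"
},
"BudgetType": {
"type": "string"
},
"NotificationThresholdAmount": {
"type": "string"
}
}
}
}
}
Once this schema is used in a Parse JSON action on the webhook body, fields such as BudgetName and NotificationThresholdAmount appear as dynamic content in later steps.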