How to create an audit table in Azure Data Factory which will hold the status of the pipeline run - azure

I have a requirement where an Azure Data Factory pipeline is running, and inside it we have a data flow that loads different tables from ADLS to Azure SQL Database. The issue is that I want to store the status of the pipeline run (success or failure) in an audit table, together with the primary key column ID that is present in the Azure SQL Database table, so that when I want to filter jobs on the primary key (for example, to see for which ID the job succeeded) I can get that from the audit table. I managed to do something with a stored procedure and store the status in a table, but I am unable to add a column like ID. Below is a screenshot of the pipeline.
The Report_id column comes from the table that is loaded by the Dataload pipeline. How do I add it to the audit table so that every time the pipeline runs, Report_id is captured and stored in the audit table?
Audit table where I want to add Report_id
Any help will be appreciated. Thanks.

The Data Flow must have a sink. So, after the Data Flow completes, you need to use a Lookup activity to get the value of that Report_Id from the sink. Then you can set that value to a variable and pass it into your Stored Procedure. (You could also pass it directly to the Stored Procedure from the Lookup, using the same expression you would use to set the variable.)
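A minimal sketch of what the audit table and logging procedure could look like on the Azure SQL side (the table, column, and procedure names here are placeholders, not taken from the original pipeline):

-- Hypothetical audit table; adjust names and types to your schema
CREATE TABLE dbo.PipelineAudit
(
    AuditId      INT IDENTITY(1,1) PRIMARY KEY,
    PipelineName NVARCHAR(200),
    RunId        NVARCHAR(100),
    Report_Id    INT,           -- value captured by the Lookup activity
    Status       NVARCHAR(20),  -- 'Success' or 'Failure'
    LoggedAt     DATETIME2 DEFAULT SYSUTCDATETIME()
);

-- Hypothetical procedure called by the Stored Procedure activity
CREATE PROCEDURE dbo.usp_LogPipelineRun
    @PipelineName NVARCHAR(200),
    @RunId        NVARCHAR(100),
    @Report_Id    INT,
    @Status       NVARCHAR(20)
AS
BEGIN
    INSERT INTO dbo.PipelineAudit (PipelineName, RunId, Report_Id, Status)
    VALUES (@PipelineName, @RunId, @Report_Id, @Status);
END;

In the Stored Procedure activity, the @Report_Id parameter would then be fed from the Lookup output with an expression such as @activity('LookupReportId').output.firstRow.Report_Id (the activity name is illustrative).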

Related

How can an Azure Data Factory data flow create related CRM entity records in a single transaction?

I am trying to implement an Azure Data Factory data flow to create CRM entity records in multiple entities in a single transaction. If any error occurs in the second entity, the first entity's record should be rolled back. Please share your ideas.
I tried a JSON file with multiple hierarchies as input, representing multiple CRM entities. I used a data flow with a JSON source dataset and 3 CRM sinks, but I am unable to achieve a single transaction when an error occurs.
ADF does not support a rollback option. You can have a watermark column or flag in the target table which indicates the records that were inserted during the current pipeline run, and delete only those records if an error occurs.
A watermark column is a column which holds the timestamp at which the row was inserted, or it can be an incrementing key. Before running the pipeline, the maximum value of the watermark column is noted. Whenever the pipeline fails, the rows inserted after that maximum watermark value can be deleted, as in the sketch below.
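A rough T-SQL illustration of that clean-up, assuming the target table has an InsertedAt watermark column (the table and column names are made up for the example):

-- Captured before the load starts, e.g. with a Lookup activity
DECLARE @MaxWatermark DATETIME2 =
    (SELECT MAX(InsertedAt) FROM dbo.TargetEntity);

-- Run on the failure path: remove only the rows inserted during this run
DELETE FROM dbo.TargetEntity
WHERE InsertedAt > @MaxWatermark;

In a real pipeline the two statements run at different points (before the load and on failure), with the watermark value carried in a variable or a small control table.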
Instead of deleting all records from the current pipeline run, if only the records which were not copied to some entities need to be deleted, then the rows can be deleted based on the key field. Below is the approach (an equivalent SQL statement is shown after the walkthrough).
Source1 and source2 are taken with entity1 and entity2 data respectively.
img1: entity1 data
img2: entity2 data
Id=6 is not copied to entity2, so it should be deleted from entity1.
An Exists transformation is added, and the left and right streams are set to source1 and source2 respectively. The Exists type is "Doesn't exist". The Exists condition is: source1@id == source2@id.
img3: exists transformation settings
An Alter Row transformation is added, and the condition is given as Delete if: true().
img4: Alter Row transformation settings
In the sink settings, Allow delete is selected and the key column is set to id.
Img5: Sink settings
img6: Sink data preview.
When the pipeline with this data flow is run, all rows which are in entity1 but not in entity2 are deleted.
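For comparison, the net effect of this data flow is roughly the following T-SQL, assuming id is the key column (a sketch only; the data flow performs the delete through its sink rather than by issuing this statement):

DELETE e1
FROM dbo.entity1 AS e1
WHERE NOT EXISTS (SELECT 1 FROM dbo.entity2 AS e2 WHERE e2.id = e1.id);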

Querying a SQL table with a dynamic date filter in Azure Data Factory (Azure Synapse Analytics)

How do I pass a datetime parameter to a SQL query in the source of a data flow activity in ADF/Synapse Analytics?
I am building a Synapse Analytics pipeline that performs a delta load into a fact table. First, the table is queried with a Lookup activity to get the latest LoadDate value. The returned value is then set as a variable and passed as a parameter to a data flow activity.
I am struggling to get the data flow running properly. I have tried to concatenate the SQL query with the filter value in the Set Variable activity, but I get the error 'The store configuration is not defined.' The same happens when I pass only the converted LoadDate value to the source query in the data flow activity:
"SELECT top 10 * FROM dbo.facts WHERE timestamp > #pipeline().parameters.LastLoadedDate"
After many trial-and-error attempts, this syntax worked for me:
concat("SELECT * FROM dbo.facts WHERE timestamp > CONVERT(datetime2, '" , $LastLoadedDate, "')")
The key was to use double quotes to wrap the concatenated strings.
Please try this SQL:
concat('select top 10 * FROM dbo.facts WHERE timestamp >', $yourParameterName)
In a data flow, you can't use a pipeline expression like @pipeline().parameters.LastLoadedDate; you should use the data flow parameter value instead.
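The pipeline still has to hand the value over to the data flow: in the Data Flow activity's settings, the data flow parameter (here assumed to be called LastLoadedDate) would typically be set from the pipeline variable with an expression along these lines:

@variables('LastLoadedDate')

The concat(...) expression shown above then runs inside the data flow's source query, referring to that parameter as $LastLoadedDate.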

Azure Data Factory Copy Data pipeline stuck at In Progress status, but not loading any data

Really confused about what is going on.
Source:
Azure Table Storage - I have done the "Preview data" and I can see the data. I added a couple of filters in the Query window, as shown in the description.
Target
Azure SQL Server
Mapping
Successful
Pipeline Status
In Progress for the last 15 minutes. I don't mind that, but it hasn't loaded any data into the SQL destination so far.
If you click on the pipeline name, it will drill down to activity-level monitoring and you can see the details for each activity.
Once you click on the Details button, it will show you the copy info.
If this didn't help, please provide your pipeline run ID and/or activity run ID and I'll take a look in our logs.
Thanks,
Paco
I think the error happened in your source dataset query:
When you first choose Table Storage as the source dataset, you can see all the data with Preview data, for example:
But when you add the query filter, the data shown in Preview data changes:
I added the filter: PartitionKey eq '1' and RowKey eq '1'
Click Preview data again to check whether the filter works:
The filtered data is the data which will be transferred to Azure SQL.
As you know, Table Storage only supports querying on PartitionKey, RowKey and Timestamp; there isn't a column like DSP_Status.
That means your query DSP_Status eq '***' and Timestamp ge datatime '2019-05-01' is wrong, and the filter result must be null.
That's why no data is loaded into the SQL destination.
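For comparison, a filter that stays within the supported properties would look something like this (the values are placeholders):

PartitionKey eq '1' and Timestamp ge datetime'2019-05-01T00:00:00Z'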
Please see: Design for querying
Hope this helps.

Azure Data Factory passing a parameter into a function (string replace)

I'm trying to use ADF to create Azure Table Storage tables from one source SQL table.
Within my pipeline..
I can query a distinct list of customers and pass this into a ForEach task.
Inside the ForEach, I select the data for each customer.
But when I try to create an Azure table for each customer's data, with the table name based on the customer ID, I hit errors.
The customer ID is a GUID, so I'm trying to format it to remove the dashes, which are invalid in a table name...
Something along the lines of
@replace('@{item().orgid}','-','')
So 7fb6d90f-2cc0-40f5-b4d0-00c82b9935c4 becomes 7fb6d90f2cc040f5b4d000c82b9935c4
I can't seem to get the syntax right.
Any ideas?
Try this: @replace(item().orgid,'-','').
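If the goal is to name the target table per customer, the same expression can then be supplied as the value of a table-name parameter on the sink dataset inside the ForEach (the parameter itself is an assumption, not something shown in the question), for example:

@replace(item().orgid,'-','')

Each iteration then writes to a table named after the dash-free GUID.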

How to achieve dynamic column mapping in Azure Data Factory when Dynamics CRM is used as the sink

I have a requirement where I need to pass the column mapping dynamically from a stored procedure to the copy activity. This copy activity performs an update operation in Dynamics CRM. The source is SQL Server (2014) and the sink is Dynamics CRM.
I am fetching the column mapping from a stored procedure using a Lookup activity and passing it as a parameter to the copy activity.
When I directly provide the JSON value below as the default value of the parameter, the copy activity updates the mapped fields correctly.
{"type":"TabularTranslator","columnMappings":{"leadid":"leadid","StateCode":"statecode"}}
But when the JSON value is fetched from the stored procedure, it does not work; I get the error 'ColumnName is read only'.
Please suggest whether any conversion is required on the output of the Lookup activity before passing the parameter to the copy activity. Below is the output of the Lookup activity.
{\"type\":\"TabularTranslator\",\"columnMappings\":{\"leadid\":\"leadid\",\"StateCode\":\"statecode\"}}
Appreciate a quick turnaround.
Using the parameter directly and using the Lookup output are different. Can you share how you set the parameter from the output of the Lookup activity?
You can refer to this doc: https://learn.microsoft.com/en-us/azure/data-factory/control-flow-lookup-activity
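One thing worth checking, as a hedged suggestion: the Lookup returns the mapping as an escaped JSON string (as shown above), while the copy activity's translator expects an object, so the string usually needs to be parsed before it is passed on, for example (the activity and column names are illustrative):

@json(activity('LookupColumnMapping').output.firstRow.ColumnMapping)

The parsed object can then be supplied to the parameter that feeds the copy activity's translator property.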
