Azure Data Factory - Use system variable in Dynamic Content - azure

I'm trying to use the system variable '@pipeline().TriggerTime' in a dynamic content field.
I have a 'Copy Data' activity which has a sink dataset pointing to a folder.
Inside this sink dataset, I try to set the file path to
@concat('Trigger_',formatDateTime(@pipeline().TriggerTime, 'ddMMyyyyHHmmss'), '.trg')
But I get the following error message.
The activity is contained in an 'If Condition' block, which itself is contained in a 'ForEach', but this variable should be global in the pipeline, so I don't see why it shouldn't work.
Thanks for any help.

As Joel comments, just change "@pipeline" to "pipeline". The @ prefix is only needed once, at the start of the expression; functions nested inside it are written without it.
@concat('Trigger_',formatDateTime(pipeline().TriggerTime, 'ddMMyyyyHHmmss'), '.trg')
If you want to use multiple functions, you only add @ at the beginning.
If you want the expression text itself as a literal string, escape it with a double @: "Answer is: @@{pipeline().parameters.myNumber}" returns the string Answer is: @{pipeline().parameters.myNumber}.
For more detail, you can refer to this documentation.
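To see what file name the corrected expression is expected to produce, here is a minimal Python sketch that mimics it. The function name `trigger_file_name` is hypothetical, and translating the format string 'ddMMyyyyHHmmss' to `strftime`'s `%d%m%Y%H%M%S` is an assumption made for illustration; it is not ADF code.

```python
from datetime import datetime


def trigger_file_name(trigger_time: datetime) -> str:
    """Mimic concat('Trigger_', formatDateTime(..., 'ddMMyyyyHHmmss'), '.trg')."""
    # dd -> %d, MM -> %m, yyyy -> %Y, HH -> %H, mm -> %M, ss -> %S
    stamp = trigger_time.strftime("%d%m%Y%H%M%S")
    return "Trigger_" + stamp + ".trg"


# Example trigger time chosen purely for illustration.
print(trigger_file_name(datetime(2019, 3, 13, 8, 36, 47)))
```

For the example trigger time above, this prints `Trigger_13032019083647.trg`.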

Related

Copy CSV File with Multiline Attribute with Azure Synapse Pipeline

I have a CSV file in the following format which I want to copy from an external share to my data lake:
Test; Text
"1"; "This is a text
which goes on on a second line
and on on a third line"
"2"; "Another Test"
I now want to load it with a Copy Data task in an Azure Synapse pipeline. The result is the following:
Test; Text
"1";" \"This is a text"
"which goes on on a second line";
"and on on a third line\"";
"2";" \"Another Test\""
So, you see, it is not handling the multi-line text correctly. I also do not see an option to handle multi-line text within a Copy Data task. Unfortunately, I'm not able to use a Data Flow task, because it cannot run with an external Azure runtime, which I'm forced to use for security reasons.
Of course, I'm not talking about this single test file; I have thousands of files.
My settings for the CSV file look as follows:
CSV Connector Settings
Can someone tell me how to handle this kind of multiline data correctly?
Do I have any other options within Synapse (apart from the Dataflows)?
Thanks a lot for your help
Well, it turns out this is not possible with a CSV file.
The pragmatic solution is to use "binary" files instead to transfer the CSV files, and only load and transform them later on with a Python notebook in Synapse.
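As a rough sketch of what the later notebook step could look like, Python's standard `csv` module can read quoted fields that span several lines. The sample text below is the file from the question; the function name `parse_multiline_csv` is hypothetical, and reading from an in-memory string instead of the data lake is an assumption made to keep the example self-contained.

```python
import csv
import io

# Sample content taken from the question (semicolon-delimited, quoted fields).
SAMPLE = (
    'Test; Text\n'
    '"1"; "This is a text\n'
    'which goes on on a second line\n'
    'and on on a third line"\n'
    '"2"; "Another Test"\n'
)


def parse_multiline_csv(text):
    """Parse semicolon-delimited CSV text whose quoted fields may contain newlines."""
    # newline="" leaves line endings untouched so the csv module can handle
    # newlines that occur inside quoted fields.
    buffer = io.StringIO(text, newline="")
    reader = csv.reader(
        buffer,
        delimiter=";",
        quotechar='"',
        skipinitialspace=True,  # the sample has a space after each ';'
    )
    return [row for row in reader]


rows = parse_multiline_csv(SAMPLE)
for row in rows:
    print(row)
```

With a real file, the same reader settings would apply to a handle opened with `open(path, newline="")`.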
You can achieve this in Azure Data Factory by iterating through all lines and checking for the delimiter in each line, then using string manipulation functions with Set Variable activities to convert the multi-line data to a single line.
Look at the following example. I have a Set Variable activity that initializes the req variable with an empty value (taken from a parameter).
In the Lookup activity, create a dataset with the following configuration pointing to the multi-line CSV:
In the ForEach activity, I iterate over each row by setting the items value to @range(0,sub(activity('Lookup1').output.count,1)), which covers every row except the last. Inside the ForEach, I have an If Condition activity with the following condition:
@contains(activity('Lookup1').output.value[item()]['Prop_0'],';')
If this is true, I append the current line to the req variable using two Set Variable activities.
temp: @if(contains(activity('Lookup1').output.value[add(item(),1)]['Prop_0'],';'),concat(variables('req'),activity('Lookup1').output.value[item()]['Prop_0'],decodeUriComponent('%0D%0A')),concat(variables('req'),activity('Lookup1').output.value[item()]['Prop_0'],' '))
actual (req variable): @variables('val')
If it is false, I handle the concatenation in the following way:
temp1: @concat(variables('req'),activity('Lookup1').output.value[item()]['Prop_0'],' ')
actual1 (req variable): @variables('val2')
Finally, I use a final variable to handle the last line of the file, with the following dynamic content:
@if(contains(last(activity('Lookup1').output.value)['Prop_0'],';'),concat(variables('req'),decodeUriComponent('%0D%0A'),last(activity('Lookup1').output.value)['Prop_0']),concat(variables('req'),last(activity('Lookup1').output.value)['Prop_0']))
Finally, I use a Copy Data activity with a sample source file that has 1 column and 1 row (it is only used as a vehicle to copy our actual data).
Now, configure the source file as shown below:
Create an additional column whose value is the final variable's value:
Create a sink with the following configuration and select the mapping for only the column created above:
When I run the pipeline, I get the data as required. The following is an output image for reference.
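The line-merging idea behind this pipeline can be sketched in Python. This is a simplified variant, not a translation of the expressions above: it does not look ahead at the next line, it simply treats any line containing the delimiter as the start of a new record and appends every other line to the previous record with a space. The function name `merge_multiline_records` is hypothetical.

```python
def merge_multiline_records(lines, delimiter=";"):
    """Join continuation lines (those without the delimiter) onto the previous record."""
    records = []
    for line in lines:
        if delimiter in line or not records:
            # A line with the delimiter starts a new record.
            records.append(line)
        else:
            # A line without the delimiter continues the previous record.
            records[-1] = records[-1] + " " + line
    return records


# Lines of the sample file from the question.
sample_lines = [
    'Test; Text',
    '"1"; "This is a text',
    'which goes on on a second line',
    'and on on a third line"',
    '"2"; "Another Test"',
]

for record in merge_multiline_records(sample_lines):
    print(record)
```

Like the pipeline approach, this heuristic would misread a continuation line that happens to contain the delimiter itself.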

ADF Azure Data-Factory loop over folder syntax - wildcard?

I'm trying to loop over different country folders that each have a fixed subfolder named survey (i.e. Spain/survey, USA/survey).
Where and how do I need to define a wildcard or parameter for the countries so that I can loop over all the files in the survey folders?
What is the right wildcard syntax (the equivalent of LIKE 'survey%' in SQL)?
I tried several ways to define it with no success and would be happy to get some help on this. Thanks!
If the list of paths is static, you can create a parameter, or store the paths in a SQL database and fetch them with a Lookup activity.
Pass the output to a ForEach activity, and within the ForEach activity use a Copy activity.
You can parameterize the input dataset to receive the file paths, so you do not need any wildcard characters and can use the actual paths themselves.
Hope this is helpful.
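As a small illustration of the parameterized approach, the Python sketch below builds the list of folder paths that a ForEach loop would iterate over. The country names come from the question; the function name `survey_paths` and the idea of holding the countries in a plain list are assumptions for the example.

```python
def survey_paths(countries, subfolder="survey"):
    """Build one folder path per country, e.g. 'Spain/survey'."""
    return [f"{country}/{subfolder}" for country in countries]


# Each path would be passed to the parameterized input dataset in turn.
for path in survey_paths(["Spain", "USA"]):
    print(path)
```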

DELETE using Pre-copy Script in Copy Data activity

I have a simple Copy Data activity in Azure Data Factory with a table as both source and destination. Before inserting, I run a delete script in the pre-copy script option. The delete should be based on parameters passed to the pipeline.
I tried it this way but I am getting an error.
DELETE FROM [dbo].[StgMetricLoad] where TransactionKey in(pipeline().parameters.TransactionKey)
In my experience, you can't merge a pipeline string parameter into a SQL string directly like that. It should be configured as dynamic content with the @concat built-in function.
I tested it in a Set Variable activity:
@concat('DELETE FROM [dbo].[StgMetricLoad] where TransactionKey in(',
pipeline().parameters.keystring,
')')
Test Output:
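As a rough illustration of the string that the @concat expression evaluates to, here is a Python sketch of the same concatenation. The function names `to_key_string` and `build_delete_statement` are hypothetical, and the assumption that the keys are integers is mine; the source does not state the column type.

```python
def to_key_string(keys):
    """Turn keys into a comma-separated string, assuming integer keys."""
    # int() rejects anything that is not a whole number, which is a basic
    # guard because the value is concatenated straight into SQL text.
    return ",".join(str(int(key)) for key in keys)


def build_delete_statement(key_string):
    """Mimic concat('DELETE ... in(', <parameter>, ')')."""
    return (
        "DELETE FROM [dbo].[StgMetricLoad] where TransactionKey in("
        + key_string
        + ")"
    )


# Example keys chosen purely for illustration.
print(build_delete_statement(to_key_string([101, 102, 103])))
```

Because the statement is built by plain concatenation, the parameter value must already be valid SQL list content (and quoted, if the keys were strings).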

Add Dynamic Content - Azure Data Factory ADF V2

I need to add dynamic content in ADF so that it reads the files from a folder named ‘StartDateOfMonth-EndDateOfMonth’, in the format below.
Result: 20190601-20190630
Here are a few steps for how you can achieve that:
In the dataset:
create a parameter "Date"
set up a connection to one selected file
now replace the "File" field with an expression similar to the following:
@concat('filedata_',dataset().Date,'.csv')
In the pipeline:
when using the dataset above, you just have to pass the value, which you can set up with 'Set Variable'
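The value passed to the "Date" parameter has to be the month range string itself. The Python sketch below shows one way that 'StartDateOfMonth-EndDateOfMonth' string could be derived, and how the dataset expression would combine it into a file name. Both function names are hypothetical, and this is an illustration of the intended result rather than ADF expression code.

```python
import calendar
from datetime import date


def month_range_folder(day: date) -> str:
    """Return 'yyyyMMdd-yyyyMMdd' for the first and last day of day's month."""
    # monthrange returns (weekday of first day, number of days in the month).
    last_day = calendar.monthrange(day.year, day.month)[1]
    start = day.replace(day=1)
    end = day.replace(day=last_day)
    return f"{start:%Y%m%d}-{end:%Y%m%d}"


def dataset_file_name(date_param: str) -> str:
    """Mimic concat('filedata_', dataset().Date, '.csv')."""
    return "filedata_" + date_param + ".csv"


folder = month_range_folder(date(2019, 6, 15))
print(folder)
print(dataset_file_name(folder))
```

For any date in June 2019, the first line printed is `20190601-20190630`, matching the result given in the question.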

Unable to copy file from SFTP in Azure Data Factory when using wildcard(*) in the filename

I am unable to copy csv files from an SFTP connection to blob storage when using the wildcard(*) in the filename.
More specifically, I receive csv files in the SFTP on a daily basis, and they are of the format "ddMMyyyyxxxxxx.csv", where "xxxxxx" is the timestamp. For example, my csv file for the 13th of March is "13032019083647.csv", while for the 14th of March it is "14032019083556.csv". The timestamp is different every day, so I want to copy the file regardless of whatever string exists between the date and the file extension.
In the "File" subfield of the "File path" of the "Connection" tab of my dataset, I give as input "13032019*.csv", as instructed by the help icon next to the field:
When I do so, my Debug run fails with:
{"errorCode": "2200", "message":
"ErrorCode=UserErrorInvalidCopyBehaviorBlobNameNotAllowedWithPreserveOrFlattenHierarchy,'Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException,Message=Cannot
adopt copy behavior PreserveHierarchy when copying from folder to a
single file.,Source=Microsoft.DataTransfer.ClientLibrary}
I receive a similar error no matter which type of copy behaviour I choose. I have also tried experimenting with the fileFilter parameter (even though ADF warns that the same behaviour can be achieved with the fileName option), but I still end up getting the same error.
For further clarification, I am attaching the Code segment that ADF produces for this configuration:
I should also mention, that when using the full fileName in the corresponding field, namely the value: "13032019083647.csv", copying works normally.
Any help would be greatly appreciated!
My guess is that the wildcard operation might match two files.
In such cases we need to use a Get Metadata activity, a Filter activity and a ForEach activity to copy these files.
1. Get Metadata activity: use a dataset in this activity that points to the location of the files, and pass the child items as the parameter.
2. Filter activity: use a filter to select the files based on your needs.
3. ForEach activity: in the ForEach activity, take the items from the previous activity and add a Copy activity inside the ForEach.
In the Copy activity, the file name of the source dataset should be @item().name.
I hope this will solve your issue.
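The Get Metadata, Filter and ForEach steps can be sketched in Python as follows. The child items are modeled as dictionaries with `name` and `type` keys, the function name `select_files` is hypothetical, and using `fnmatchcase` to stand in for the Filter condition is an assumption for illustration.

```python
from fnmatch import fnmatchcase


def select_files(child_items, pattern):
    """Keep only files (not folders) whose name matches the wildcard pattern."""
    return [
        item["name"]
        for item in child_items
        if item["type"] == "File" and fnmatchcase(item["name"], pattern)
    ]


# Stand-in for the child items a metadata listing of the folder might return.
child_items = [
    {"name": "13032019083647.csv", "type": "File"},
    {"name": "14032019083556.csv", "type": "File"},
    {"name": "archive", "type": "Folder"},
]

# Stand-in for the ForEach loop: one copy per matching file name.
for name in select_files(child_items, "13032019*.csv"):
    print("copy", name)
```

Each selected name plays the role of `@item().name` in the source dataset of the Copy activity.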
What worked for me was the following: I kept the same wildcard pattern for the input file, but set "Copy behaviour: Merge Files". Since, as mentioned, only 1 file satisfies the pattern, only 1 file was created as output. I am aware that this is a sort of "dirty" solution, but it did the trick for me.
