Substring of a file name in ADF - Azure

In Azure Data Factory, I am getting "Common_EUR_AP_COMPCODE_YYY_MM_DD" as the file name from a "Get Metadata" activity, which then goes through a "ForEach" loop. Inside the ForEach, in a "Set variable" activity, I want to take just the "COMPCODE" part of the name and ignore the rest. Can somebody please help on how to do this?
I tried many approaches; the closest I got was "@substring(item().name,add(indexof(item().name,''),3),add(lastindexof(item().name,''),1))".

Please try this expression:
@split(item().name,'_')[3]
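As an illustration with the sample name from the question, split breaks the string on '_' into zero-indexed parts, so index 3 is the company code:
split('Common_EUR_AP_COMPCODE_YYY_MM_DD', '_') -> ['Common', 'EUR', 'AP', 'COMPCODE', 'YYY', 'MM', 'DD']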
Result: COMPCODE
Reference:
https://learn.microsoft.com/en-us/azure/data-factory/control-flow-expression-language-functions#split

Related

ADF Azure Data Factory loop over folder syntax - wildcard?

I'm trying to loop over different country folders that each have a fixed sub-folder named survey (i.e. Spain/survey, USA/survey).
Where and how do I need to define a wildcard / parameter for the countries so I can loop over all the files in the survey folders?
What is the right wildcard syntax (the equivalent of "like 'survey%'" in SQL)?
I tried several ways to define it with no success and I would be happy to get some help on this - Thanks!
If the list of paths is static, you can create a parameter, or add the paths to a SQL database and retrieve them with a Lookup activity.
Pass the output to a ForEach activity and, within the ForEach, use a Copy activity.
You can parameterize the input dataset to build the file paths, so you don't need any wildcard characters at all but use the actual paths themselves, as in the sketch below.
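For example (a minimal sketch; the dataset parameter "country" and the lookup column "CountryName" are illustrative names, not from the original post), the dataset's folder path could be the expression
@concat(dataset().country, '/survey')
and inside the ForEach the dataset parameter would be passed as
@item().CountryName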
Hope this is helpful.

Azure Data Factory - Use system variable in Dynamic Content

I'm trying to use the system variable '@pipeline().TriggerTime' in a dynamic content field.
I have a 'Copy Data' activity which has a sink dataset to a folder.
Inside this Sink dataset, I try to set the filepath to
@concat('Trigger_',formatDateTime(@pipeline().TriggerTime, 'ddMMyyyyHHmmss'), '.trg')
But I get the following error message.
The activity is contained in an 'If Condition' block which itself is contained in a 'ForEach' but this variable should be global in the pipeline so I don't see why it shouldn't work.
Thanks for any help.
As Joel comments, just change "@pipeline" to "pipeline" inside the expression.
@concat('Trigger_',formatDateTime(pipeline().TriggerTime, 'ddMMyyyyHHmmss'), '.trg')
If you want to use multiple functions, you just add a single @ at the beginning.
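For example (an illustrative expression, not from the original answer), nesting several functions still needs only one leading @:
@concat('run_', replace(substring(pipeline().TriggerTime, 0, 10), '-', ''))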
If you want the literal string rather than the evaluated expression, you need to add a double @: for example, "Answer is: @@{pipeline().parameters.myNumber}" returns the string Answer is: @{pipeline().parameters.myNumber}.
For more detail, you can refer to the documentation.

Checking the file count returned by the Get Metadata activity in Azure Data Factory

How do I get how many file names/folder names are returned by the Get Metadata activity in Azure Data Factory?
I want to get the number of files/folders returned by the Get Metadata activity and, based on this count, decide which activities will execute.
Does anyone have any idea how to get this count?
Thanks to those who commented, and special thanks to Steve Zhao.
Actually, I wanted to check whether files are present, i.e. if the length is greater than 0 then go to flow A, else choose flow B.
I tried the greater and length functions in my expression (greater(length(activity('GETFILENAMESFROMBLOB').output.childItems), 0)) to calculate the length, but when there are no input files the previous Get Metadata activity does not contain a childItems array in its output, so my If Condition activity fails with an error that the property childItems does not exist.
I also tried the empty function, but the main issue remained: when there are no input files for Get Metadata, we cannot expect a childItems array in its output.
So here is how I solved the problem: first I check whether the childItems array is present in the output of my Get Metadata activity; if it is, I get the actual count using the length function, otherwise the expression falls back to a value that evaluates as false. Below is the expression used for the If Condition activity.
Expression:
@if( contains(activity('GETFILENAMESFROMBLOB').output,'childitems'), length(activity('GETFILENAMESFROMBLOB').output.childitems), equals(2,3))
(equals(2,3) is simply a way to always return false when childItems is absent.)
Hope this may help you!
Please try something like this:
Screenshots of the pipeline:
Setting of Get Metadata1
Setting of If Condition1
Expression: @greater(length(activity('Get Metadata1').output.childItems),100)
Hope this can help you:).
The code below in the If Condition activity's expression should suffice for the file count > 0 logic.
@greater(string(length(activity('Get Metadata1').output.childitems)),'0')
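As a side note (my own suggestion, not from the original answers), the string conversion is not strictly necessary; comparing the numeric length directly should also work:
@greater(length(activity('Get Metadata1').output.childItems), 0)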

Logic App: ActionFailed. An action failed. No dependent actions succeeded

I am facing an issue with a for-each loop execution in a Logic App in Azure. The complete playbook executes successfully and functionally it works fine. However, I am getting this error because the loop takes the "body" parameter from the previous step as input and nothing else. The body is a long JSON document and therefore should not be the right input for the for-each loop. I tried adding the account or IP address as input, but that fails as well.
(Input and output screenshots omitted.)
Please help here
As you mentioned, there is just one item in your JSON data array that contains "MachineId", so I assume the first item contains it. Please refer to the solution below; it will let you use only that "MachineId" across the 24 cycles of your loop.
We can input an expression to use the "MachineId" in first item:
body('Parse_JSON')[0].MachineId
(In the screenshot above, I just used a "Set variable" action to replace your two actions in the "For each" loop, but I think there is no difference between them.)
Please give this solution a try.

Unable to copy file from SFTP in Azure Data Factory when using wildcard(*) in the filename

I am unable to copy csv files from an SFTP connection to blob storage when using the wildcard(*) in the filename.
More specifically, I receive csv files in the SFTP location on a daily basis, and they are of the format "ddMMyyyyxxxxxx.csv", where "xxxxxx" is the timestamp. More concretely, my csv file for the 13th of March is "13032019083647.csv", while for the 14th of March it is "14032019083556.csv". Obviously, the timestamp is different every day, so I want to copy the file independently of whatever string exists between the date and the file extension.
In the "File" subfield of the "File path" of the "Connection" tab of my subset, I give as input: "13032019*.csv", as instructed by the help icon next to the field:
When I do so, my Debug run fails with:
{"errorCode": "2200", "message":
"ErrorCode=UserErrorInvalidCopyBehaviorBlobNameNotAllowedWithPreserveOrFlattenHierarchy,'Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException,Message=Cannot
adopt copy behavior PreserveHierarchy when copying from folder to a
single file.,Source=Microsoft.DataTransfer.ClientLibrary}
I receive a similar error no matter which type of copy behaviour I choose. I have also tried experimenting with the fileFilter parameter (even though ADF warns that the same behaviour can be achieved with the fileName option), but I still end up getting the same error.
For further clarification, I am attaching the Code segment that ADF produces for this configuration:
I should also mention that when using the full fileName in the corresponding field, namely the value "13032019083647.csv", copying works normally.
Any help would be greatly appreciated!
My guess is that it might pick up two files with the wildcard operation.
In such cases, we need to use a Get Metadata activity, a Filter activity and a ForEach activity to copy these files.
1. Get Metadata activity: use a dataset in this activity that points to the location of the files, and retrieve the Child Items field.
2. Filter activity: use the filter to select the files based on your needs.
3. ForEach activity: in the ForEach activity, get the Items from the previous activity and add a Copy activity inside the ForEach.
In the Copy activity, the source dataset's file name should be @item().name, as in the sketch below.
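As a rough sketch (the activity names Get Metadata1 and Filter1, and the filter condition, are illustrative assumptions, not from the original answer), the settings could look like this:
Filter activity, Items: @activity('Get Metadata1').output.childItems
Filter activity, Condition: @startsWith(item().name, '13032019')
ForEach activity, Items: @activity('Filter1').output.value
Copy activity source dataset, file name: @item().name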
I hope this will solve your issue.
What worked for me was the following: I kept the same wildcard pattern for the input file, but I set the copy behaviour to "Merge Files". Since, as mentioned, only one file satisfies the pattern, only one file was created as output. I am aware that this is a sort of "dirty" solution, but it did the trick for me.
