How to split the country and year parts separately? Folder creation is not working properly in ADLS and I can't find where the bug is raised

The folder structure for the file is supposed to be /yyyy/MM/dd, but it is coming out like yyyy/MM/dd. The pipeline structure and the result I am getting are shown below.

@concat('originalData/SFMC/CampaignLevelData/',pipeline().parameters.jobName,'/',variables('YMD'),'/',variables('MM'),'/',variables('Day'),'/')
There is no error in the expression; I tried the same in my environment and the folder structure is ....AU/2023/02/17.
The error could be in the value of variables('YMD'). Check the input data from your end.
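If the three variables are meant to hold parts of the current UTC date, one way to populate them (a minimal sketch using Set Variable activities; adjust if the values come from your input data instead):
YMD: @formatDateTime(utcnow(),'yyyy')
MM: @formatDateTime(utcnow(),'MM')
Day: @formatDateTime(utcnow(),'dd')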

Azure Data Factory Counting number of Files in Folder

I am attempting to determine if a folder is empty.
My current method involves using a Get Metadata activity and running the following to set a Boolean.
@greater(length(activity('Is Staging Folder Empty').output.childItems), 0)
This works great when files are present.
When the folder is empty (a state I want to test for) I get
"The required Blob is missing".
Can I trap this condition?
What alternatives are there to determine if a folder is empty?
I have reproduced the above and got the same error.
This error occurs when the folder is empty and the source is Blob storage; it works fine for me when the source is ADLS.
As a sample, I used a Set Variable activity inside the False branch of an If activity, which runs when the folder is empty.
Can I trap this condition?
What alternatives are there to determine if a folder is empty?
One alternative is to use ADLS instead of Blob storage as the source.
(or)
If you want to avoid this error with Blob storage as the source, you can do the following: add an If activity on the failure of the Get Metadata activity and check the error with the expression.
@startswith(string(activity('Get Metadata1').error.message), 'The required Blob is missing')
In the True activities (the required error occurred and the folder is empty), I used a Set Variable activity for the demo.
In the False activities (any other error occurred), use a Fail activity to fail the pipeline.
Fail message: @string(activity('Get Metadata1').error.message)
On success of the Get Metadata activity, there is no need to check the count of childItems, because Get Metadata fails if the folder is empty. So, on success, go on with your activities flow.
An alternative would be:
Blob dataset, where test is the container and test is the folder inside the container which I am trying to scan (which ideally doesn't exist, as seen above).
Use a Get Metadata activity to check whether the folder exists. If false, exit; else, count the files.
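A minimal sketch of the two expressions this flow needs (assuming the Get Metadata activity is named Get Metadata1 and its field list includes Exists and Child items):
If condition: @activity('Get Metadata1').output.exists
File count: @length(activity('Get Metadata1').output.childItems)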

Parametrization using Azure Data Factory

I have a pipeline job in Azure Data Factory which I want to run while passing through only the files for a specific month, for example.
I have a folder called 2020/01; inside this folder are numerous files with different names.
The question is: can one pass a parameter through to only extract and load the files for 2020/01/01 and 2020/01/02, if that makes sense?
Excellent, thanks Jay, it worked and I can now run my pipeline jobs passing through the month or even day level.
Really appreciate your response, have a fantastic day.
Regards
Rayno
The question is: Can one pass a parameter through to only extract and load the files for 2020/01/01 and 2020/01/02 if that makes sense?
You didn't mention which connector you are using in the pipeline job, but you mentioned a folder in your question. As far as I know, most folder paths can be parameterized in the ADF Copy activity configuration.
You could create a parameter:
Then apply it in the wildcard folder path:
Even if your files' names have the same prefix, you could apply 01*.json to the wildcard file name property.
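A sketch of how those settings could look (the parameter name folderPath is illustrative):
Wildcard folder path: @pipeline().parameters.folderPath (pass 2020/01 for the whole month, or 2020/01/01 for a single day)
Wildcard file name: 01*.json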

How to delete files older than a specified date in Azure Data Lake

I have data folders created on a daily basis in the data lake. The folder path is dynamic, driven by a JSON format.
Source Folder Structure
SAPBW/Master/Text
Destination Folder Structure
SAP_BW/Master/Text/2019/09/25
SAP_BW/Master/Text/2019/09/26
SAP_BW/Master/Text/2019/09/27
..
..
..
SAP_BW/Master/Text/2019/10/05
SAP_BW/Master/Text/2019/09/06
SAP_BW/Master/Text/2019/09/07
..
..
SAP_BW/Master/Text/2019/09/15
SAP_BW/Master/Text/2019/09/16
SAP_BW/Master/Text/2019/09/17
I want to delete the folders created more than 5 days ago for each folder of SinkTableName.
So, in Data Factory, I have called the folder path in a ForEach loop as
@concat(item().DestinationPath,item().SinkTableName,'/',item().LoadTypeName,'/',formatDateTime(adddays(utcnow(),-5),item().LoadIntervalFormat),'/')
I need the syntax to delete the files in each folder based on the JSON. I am unable to find a way to delete folder-wise and to set up the Delete activity depending on dates prior to five days from now.
I see that you are doing a concatenation, which I think is the way to go. But I see that you are using the expression formatDateTime(adddays(utcnow(),-5)), which will give you something like 2019-10-15T08:23:18.9482579Z, which I don't think is desired. I suggest trying @formatDateTime(adddays(utcnow(),-5),'yyyy/MM/dd'). Let me know how it goes.
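Put together, the Delete activity's folder path could then be set to something like the following sketch (assuming each JSON item carries DestinationPath, SinkTableName and LoadTypeName, as in the question):
@concat(item().DestinationPath,item().SinkTableName,'/',item().LoadTypeName,'/',formatDateTime(adddays(utcnow(),-5),'yyyy/MM/dd'),'/')
Note that this path points at the folder dated exactly five days back; deleting everything older than that would need a loop over the older dates.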

cv2.VideoCapture directory of images

I thought this would be simple, but I have been caught by the simplest of puzzles, which I can't find the answer to anywhere.
I have some code which reads images, and then OpenCV looks for differences.
I read files with the following command:
vs = cv2.VideoCapture("/home/andrew/images/image_%6d.jpg")
and this works perfectly with images called image_000000.jpg, image_000001.jpg, etc.
However, I don't want to rename my images, so I would like to read files called MDAlarm_20180921-031140.jpg, which contain the date and then the time.
What is the printf format for this? Whatever I try does not work, i.e. no files are found. Or do the files need to start from 0, so I need to append an index starting at 000000?
Lastly, once I have this working, how can I tell which file is being processed?
Many Thanks
Andrew
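One workaround, sketched below under the assumption that the images live in /home/andrew/images: instead of relying on cv2.VideoCapture's printf-style sequence pattern (which expects a numeric index), list the matching files with glob and read them one at a time with cv2.imread; the loop variable then tells you exactly which file is being processed.

import glob
import cv2

# Collect the alarm images; lexicographic sort gives chronological order
# because the timestamp is embedded in the file name.
paths = sorted(glob.glob("/home/andrew/images/MDAlarm_*.jpg"))

for path in paths:
    frame = cv2.imread(path)      # BGR ndarray, like VideoCapture.read() returns
    if frame is None:             # skip unreadable or corrupt files
        continue
    print("processing", path)     # answers "which file is being processed?"
    # ... run the difference detection on frame here ...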

Problems came up in the following areas during load: Table

I have generated an Excel file from XML, but I cannot open it with Excel. Excel gives the following error when opening it:
Problems came up in the following areas during load:
Table
Then it shows a message that the log file corresponding to the error can be found at: C:/Documents and Settings/myUserName/Local Settings/Temporary Internet Files/Content.MSO/xxxxx.log
But I cannot find the Content.MSO folder in my Windows. I checked the folder settings and made all folders visible, but I still cannot access this folder, so I cannot analyse the log file.
How could I find the generated log file?
I found the problem without analysing the log file. I still cannot access the log file in Temporary Internet Files, but I realised that I had put string (non-number) characters in a number-styled cell in the Excel XML. So if you are having similar issues with your Excel file generated from XML, check whether your cell values are appropriate for your cell data types.
If you type or paste the path of the log file into Explorer or your text editor of choice, you may find that the folder does exist, despite being invisible.
In my case it was a <Row> with an incorrect ss:Index.
I was using a template, and the last row had a fixed Index=100. If the number of rows I added exceeded 100, this last row had a wrong index, and Excel threw the error without any other message or log (macOS, Excel 15.25.1). I wish they printed more informative error messages; what a waste of our time.
Excel 2016: my error message was "Worksheet Settings", and the path was pointing to a non-existent file.
My cause of the problem was ExpandedRowCount not being big enough for the number of rows in the Worksheet. If you add rows to the XML directly (i.e. on a machine where Excel is not installed), make sure to increment the row count in ExpandedRowCount.
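For illustration, both of the above live on the SpreadsheetML Table element and its rows; the counts and the index here are made-up values:
<Table ss:ExpandedColumnCount="3" ss:ExpandedRowCount="120">
  <Row ss:Index="100">...</Row>
  <!-- Excel errors if actual rows exceed ss:ExpandedRowCount,
       or if an ss:Index is inconsistent with the row order. -->
</Table>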
Yes, even I faced the same problem, and the problem was with the data type of the cells of the Excel file generated using XSLT.
In addition to checking the data being used vs the "Type" assigned, make sure that the characters that need to be encoded for XML are indeed encoded.
I had a system that appeared to be working, but then some user data including & and < was throwing this error.
If you're not sure what's going on with your file, try http://www.xmlvalidation.com/ - that helped me spot the issue in a large file immediately.
I used this function to fix it, modified from this post:
function xmlsafe($s) {
    // Escape the characters XML requires to be encoded; '&' must be replaced first,
    // otherwise the '&' introduced by the other replacements would be re-escaped.
    return str_replace(array('&', '>', '<', '"'), array('&amp;', '&gt;', '&lt;', '&quot;'), $s);
}
and then ran echo xmlsafe($myvalue) wherever the script was previously just echoing $myvalue.
This seems more appropriate for XML than htmlentities() or the other options built into PHP.
I had the same issue, and the answer was that the type of the cell was Number, and some values didn't convert to this type on my backend.
I had the SAME problem, and it's because the file is TOO BIG.
I tried an extract from SAP, smaller than the one that caused the error, and saved it in an XML file, and it WORKED; no more error.
So maybe if you can save into 2 Excel XML files instead of 1, it will be good ;)
ALicia
