Azure Data Factory pipeline failed inside a scheduled trigger - azure

I have created 2 pipelines in Azure Data Factory. We have a custom activity that runs a Python script inside the pipeline. When the pipeline is executed manually, it runs successfully any number of times. But I have created a scheduled trigger with an interval of 15 minutes in order to run the 2 pipelines. The first execution runs successfully, but on the next interval I get the error "Operation on target PyScript failed: Hit unexpected exception and execution failed." We are blocked with this; any input on this would be really helpful.
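For context, a 15-minute schedule trigger like the one I created looks roughly like this (a minimal sketch following the standard ADF schedule-trigger schema; the trigger name, start time, and pipeline names are illustrative):

{
    "name": "Every15MinTrigger",
    "properties": {
        "type": "ScheduleTrigger",
        "typeProperties": {
            "recurrence": {
                "frequency": "Minute",
                "interval": 15,
                "startTime": "2021-01-01T00:00:00Z",
                "timeZone": "UTC"
            }
        },
        "pipelines": [
            { "pipelineReference": { "referenceName": "Pipeline1", "type": "PipelineReference" } },
            { "pipelineReference": { "referenceName": "Pipeline2", "type": "PipelineReference" } }
        ]
    }
}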

From the ADF troubleshooting guide:
Custom Activity:
The following table applies to Azure Batch.
Error code: 2500
Message: Hit unexpected exception and execution failed.
Cause: Can't launch command, or the program returned an error code.
Recommendation: Ensure that the executable file exists. If the program started, make sure stdout.txt and stderr.txt were uploaded to the storage account. It's a good practice to emit copious logs in your code for debugging.
Related helpful doc: Tutorial: Run Python scripts through Azure Data Factory using Azure Batch
Hope this helps.
If you are still blocked, please share the failed pipeline run ID & failed activity run ID for further analysis.
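To act on the recommendation above, here is a minimal sketch of a custom-activity Python script that emits logs to stdout/stderr, so that stdout.txt and stderr.txt in the Batch task's storage capture a full traceback on failure (the work inside run() is a placeholder):

import sys
import traceback

def run():
    # Placeholder for the real work the custom activity performs.
    print("starting work")  # stdout lines end up in stdout.txt

def main():
    print("PyScript starting")
    try:
        run()
        print("PyScript finished successfully")
    except Exception:
        # Write the full traceback to stderr so it lands in stderr.txt,
        # which is what ADF points you to for error 2500.
        traceback.print_exc(file=sys.stderr)
        sys.exit(1)  # non-zero exit marks the activity as failed

if __name__ == "__main__":
    main()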

Related

Azure Machine Learning - Every pipeline run is canceled automatically after 30 minutes

For the last 5 days, every pipeline run has been canceled (not failed) automatically after roughly 30 minutes.
The pipeline stage shows the error message:
Response status code does not indicate success: 403 (Identity does not have permissions for Microsoft.MachineLearningServices/workspaces/experiments/runs).
Microsoft.RelInfra.Common.Exceptions.ErrorResponseException: Identity does not have permissions for Microsoft.MachineLearningServices/workspaces/experiments/runs/read actions.
I verified that my user has the rights of the role (Owner and Contributor), so all read/write access should be there.
I created a completely new Machine Learning resource and tried with different users (role: Contributor).
The error message does not make sense, because the pipeline starts and runs for 30 minutes (so every user right is there?!) and is then canceled automatically. If I re-run the pipeline step from the ML studio, the step succeeds.
I have been working with Azure Machine Learning for 4 months and everything went fine until now.
The pipeline is created with the local Python SDK.
The subscription is on the paid tier.
EDIT:
The 70_driver_log does not show any error message - it stops after printing some of my own messages.
70_driver_log.txt
7%|█ | 10388/139445 [12:55<2:00:20, 17.87it/s]
7%|█ | 10390/139445 [12:55<3:04:23, 11.67it/s]
executionlogs.txt:
[2021-05-25 12:00:47Z] Job is running, job runstatus is Running
[2021-05-25 12:02:49Z] Cancelling the job
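One thing worth checking (an assumption on my part, not something the logs above confirm): the 403 may refer to the identity the job actually runs under, e.g. the compute cluster's managed identity, rather than the signed-in user. A quick Azure CLI sketch to list what roles a given principal holds on the workspace (all placeholders to be filled in):

# List role assignments for the identity suspected of lacking
# Microsoft.MachineLearningServices/workspaces/experiments/runs/read.
# Principal ID, subscription, resource group and workspace are placeholders.
az role assignment list \
  --assignee "<principal-id>" \
  --scope "/subscriptions/<sub-id>/resourceGroups/<rg>/providers/Microsoft.MachineLearningServices/workspaces/<workspace>" \
  --output table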

Azure Function Error: The operation has timed out

An error has started popping up in my Azure Data Factory Pipeline. I have a few Azure Function steps in the pipeline, but for some reason, one of the Azure Function steps has started returning an error. In Azure Data Factory, the error is a 3608 code after running for 1 minute 40 seconds:
Failure type: User configuration issue
Details: Call to provided Azure function 'CollateSheetsHTTPTrigger' failed with status-'InternalServerError' and message - 'Invoking Azure function failed with HttpStatusCode - InternalServerError.'.
However, in a prior sub-pipeline run, this Azure Function ran successfully on the same data (the parameters and worksheet are the only difference). The subsequent 3 pipeline runs fail immediately (after 2 seconds) at the first Azure Function step in each (a different AZ function now), with the same 3608 error code but different details:
Call to provided Azure function '???????????????' failed with status-'NotFound'
and message - '<html> <head><title>404 Not Found</title></head> <body
bgcolor="white"> <center><h1>404 Not Found</h1></center> <hr><center>nginx</center>
</body> </html> '.
Now it gets even stranger. After these 3 failed pipelines, the next pipeline, which is pretty much the same as the previous 4 except for a few parameters, runs successfully, even though it has the same 2 AZ functions that failed before. And then the next 2 pretty similar pipelines also run successfully.
I then went and looked at the monitoring page for the 2 Azure Functions:
The first AZ function that failed had 2 errors, even though it only failed once in AZ Data Factory... the timing is slightly different for the 2 errors, but they could only have come from the first failed pipeline, so why does it say there are 2 errors? And if you look at the actual error, all it says is "The operation timed out". The function was not running for more than 150 seconds, so this is strange. Additionally, I have a bunch of error-catching code and nothing comes up there.
The other failed AZ function steps from the other function do not show up on the monitoring page at all. It seems as if the first error crashed the AZ Function app and it eventually restarted?
I'm sorry I can't help, but I did have a similar problem with an Azure Function that executes a SOAP call to a web service every minute. For the last 4 days this has also been failing with a timeout. If I run the function within my debugger it runs without problems, but the Azure Function fails every time, after 20 seconds.
I'll follow this question and hope someone else can help...
An Azure Support Engineer identified the issue: it was due to a change to the azure-functions-host library. The relevant issue is here https://protect-eu.mimecast.com/s/-CT5C3QxrTmREBwhgPxLm?domain=github.com and was fixed last week.
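Separately, for transient failures like the InternalServerError/404 above, a retry policy on the ADF Azure Function activity can ride out brief host restarts. A minimal sketch using the standard ADF activity policy fields (the function name comes from the question; the linked service name and the policy values are illustrative):

{
    "name": "CollateSheets",
    "type": "AzureFunctionActivity",
    "policy": {
        "timeout": "0.00:10:00",
        "retry": 3,
        "retryIntervalInSeconds": 60
    },
    "typeProperties": {
        "functionName": "CollateSheetsHTTPTrigger",
        "method": "POST"
    },
    "linkedServiceName": {
        "referenceName": "AzureFunctionLinkedService",
        "type": "LinkedServiceReference"
    }
}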

Azure File Copy failing on second run in build pipeline

I am using the "Azure File Copy" task in Azure DevOps, which, as far as I can see, uses an AzCopy command to copy a file into Azure storage.
Here's my task definition for this:
[Note - this is v3 of the task]
This works fine on the first run of the task within a build pipeline, and creates the file in the container as expected.
When I run the task in the pipeline on subsequent occasions, it fails. I can see from the error that it seems to be prompting for overwrite options - Yes/No/All (see the full error below).
My Question:
Does anyone know how I can give the task arguments that will tell it to force overwrite each time? The documentation for this on the Microsoft website isn't great, and I can't find an example in the GitHub repo.
Thanks in advance for any pointers!
Full Error:
& "AzCopy\AzCopy.exe" /Source:"D:\a\1\s\TestResults\Coverage\Reports" /Dest:"https://project1.blob.core.windows.net/examplecontainer" /#:"D:\a\_temp\36c17ff3-27da-46a2-95d7-7f3a01eab368" /SetContentType:image/png /Pattern:"Example.png"
[2020/04/18 21:29:18][ERROR] D:\a\1\s\TestResults\Coverage\Reports\Example.png: No input is received when user needed to make a choice among several given options.
Overwrite https://project1.blob.core.windows.net/examplecontainer/Example.png with D:\a\1\s\TestResults\Coverage\Reports\Example.png? (Yes/No/All) [2020/04/18 21:29:18] Transfer summary:
-----------------
Total files transferred: 1
Transfer successfully: 0
Transfer skipped: 0
Transfer failed: 1
Elapsed time: 00.00:00:01
##[error]Upload to container: 'examplecontainer' in storage account: 'project1' with blob prefix: '' failed with error: 'AzCopy.exe exited with non-zero exit code while uploading files to blob storage.' For more info please refer to https://aka.ms/azurefilecopyreadme
Not so much a solution as a workaround, but I set this to version 1 of the task and it worked for me!
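If staying on v3, passing AzCopy's /Y flag through the task's optional-arguments input should suppress the overwrite prompt. A hedged sketch, not a verified config (input names vary across task versions, and values are illustrative):

# Task v3 drives the classic AzCopy.exe, which accepts /Y to suppress
# all confirmation prompts (the Yes/No/All overwrite question above).
# Older task versions name this input additionalArguments; v4+ runs
# AzCopy 10, where the rough equivalent is --overwrite=true.
- task: AzureFileCopy@3
  inputs:
    SourcePath: 'TestResults/Coverage/Reports/Example.png'
    azureSubscription: 'my-arm-service-connection'
    Destination: 'AzureBlob'
    storage: 'project1'
    ContainerName: 'examplecontainer'
    additionalArgumentsForBlobCopy: '/Y'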

Azure Data Factory Integration runtimes will not start

I have an issue where Azure Data Factory Integration runtimes will not start.
When I trigger the pipeline I get the following error in Monitor -> Pipeline runs: "InternalServerError executing request"
In "view activity run" I can see that it's the Data Flow that failed with the error
{
    "errorCode": "1006",
    "message": "Hit unexpected exception and execution failed.",
    "failureType": "SystemError",
    "target": "data_wrangling_ks",
    "details": []
}
(the two successful runs are from a Self-Hosted IR)
When I try to start "Data flow debug", it just disappears without any information.
This issue started earlier today without any changes in Data Factory config or the pipeline.
Please help and thank you for your time.
SOLVED:
I changed the Compute type from General Purpose to Compute Optimized and that solved the problem.
By looking at the error message, it seems this issue occurred due to an ADF-related service outage in the West Europe region. The issue has been resolved by the product team. Please open an MSDN thread if you ever encounter this issue again.
Ref: Azure Data Factory Pipeline failed while running data flows with error message: Hit unexpected exception and execution failed
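For anyone hitting the same thing: the compute-type change from the SOLVED note above is made on the Azure integration runtime's data-flow properties. A minimal sketch of the relevant fragment, per the managed IR schema (core count and time-to-live are illustrative):

{
    "name": "AutoResolveIntegrationRuntime",
    "properties": {
        "type": "Managed",
        "typeProperties": {
            "computeProperties": {
                "location": "AutoResolve",
                "dataFlowProperties": {
                    "computeType": "ComputeOptimized",
                    "coreCount": 8,
                    "timeToLive": 10
                }
            }
        }
    }
}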

Azure Data Factory - Use GetRunRecord(runid) to get complete Error Details

I just tried running my first data copy job inside Azure Data Factory - it failed almost immediately and displayed the message:
Failed Execution: Error message too large to be returned. Use
GetRunRecord(runid) to get complete Error Details.
Can someone tell me where exactly I'm supposed to use this GetRunRecord command? Googling this error brought up exactly one relevant result, and it was no help.
Thanks.
Do you have a RunID in your error messages which you could pass to GetRunRecord(runid)?
If yes, you might try the API call described here and pass in the RunID: https://learn.microsoft.com/en-us/rest/api/datafactory/data-factory-slice-run#save-run-log
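If PowerShell is more convenient than raw REST, the legacy ADF v1 module exposed the same run-log download; a rough sketch (cmdlet and parameter names from memory of the old AzureRM.DataFactories module, so verify them against the linked doc; resource group, factory name, and output path are illustrative):

# Hedged sketch using the legacy AzureRM.DataFactories module (ADF v1);
# downloads the full run logs for the run ID shown in the error.
Save-AzureRmDataFactoryLog -ResourceGroupName "MyResourceGroup" `
    -DataFactoryName "MyDataFactory" `
    -Id "<run-id-from-the-error>" `
    -DownloadLogs -Output "C:\adf-logs"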
