How to force PythonScriptStep to run in Azure ML

I'm relatively new to Azure ML and trying to run a model via PythonScriptStep.
I can publish pipelines and run the model. However, once it has run once, I can't re-submit the step: it states "This run reused the output from a previous run".
My code declares allow_reuse to be False, but this doesn't seem to make a difference, and I simply cannot resubmit the step even though the underlying data is changing.
from azureml.pipeline.steps import PythonScriptStep

train_step = PythonScriptStep(
    name='model_train',
    script_name="model_train.py",
    compute_target=aml_compute,
    runconfig=pipeline_run_config,
    source_directory=train_source_dir,
    allow_reuse=False)  # reuse is disabled, yet the step output is still reused
Many thanks for your help
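A minimal workaround sketch, assuming the Azure ML v1 SDK: besides setting allow_reuse=False on the step, submitting the pipeline with regenerate_outputs=True forces every step to run and ignore cached outputs. The workspace config and experiment name below are illustrative:

from azureml.core import Experiment, Workspace
from azureml.pipeline.core import Pipeline

ws = Workspace.from_config()
pipeline = Pipeline(workspace=ws, steps=[train_step])
# regenerate_outputs=True disallows output reuse for every step in this run
run = Experiment(ws, "model_train").submit(pipeline, regenerate_outputs=True)
run.wait_for_completion(show_output=True)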

Related

I can not register a model in my Azure ml experiment using run context

I am trying to register a model inside one of my Azure ML experiments. I am able to register it via Model.register but not via run_context.register_model.
These are the two code statements I use; the commented one is the one that fails:
from pathlib import Path
from azureml.core import Model, Run

run_context = Run.get_context()
learn.path = Path('./outputs').absolute()
Model.register(run_context.experiment.workspace, "outputs/login_classification.pkl", "login_classification", tags=metrics)
run_context.register_model("login_classification", "outputs/login_classification.pkl", tags=metrics)  # this one fails
I receive the following error:
Message: Could not locate the provided model_path outputs/login_classification.pkl
But the model is stored at that path.
Before calling run_context.register_model(), make sure you have run_context = Run.get_context().
I was able to fix the problem by explicitly uploading the model into the run history record before trying to register the model:
run.upload_file("output/model.pickle", "output/model.pickle")
Check the documentation for the error "Could not locate the provided model_path", and see the Run class reference for details.
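Putting the two fixes together, a minimal sketch of the working flow, assuming the script runs inside an Azure ML job and metrics is an existing dict:

from azureml.core import Run

run_context = Run.get_context()
# Upload the model file into the run history record first...
run_context.upload_file("outputs/login_classification.pkl", "outputs/login_classification.pkl")
# ...then register it from the run; the model_path is now resolvable.
run_context.register_model("login_classification", "outputs/login_classification.pkl", tags=metrics)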

Rerunning a failed ADF pipeline automatically

I have multiple pipelines in Azure Data Factory that get data from APIs and then push it to a data lake. I get alerts in case one of the pipelines fails; I then go to the ADF instance and rerun the failed pipeline manually. I am trying to come up with an automated way of rerunning a pipeline in case it fails. Any suggestions or guidance would be helpful. I thought of Azure Logic Apps or Power Automate, but it turns out they don't have the right actions to trigger a failed pipeline.
If the pipeline design can be modified, one method is to:
1. Set a parameter pMax_rerun_count (this is to ensure the pipeline doesn't go into an indefinite loop).
2. Set 2 variables:
(2.a) Pipeline_status, default value: Fail
(2.b) Max_loop_count, default value: 0. This is to ensure the pipeline doesn't run in endless loops; the maximum permissible retry count (i.e. pMax_rerun_count) is passed as a parameter into the pipeline.
3. All activities should be inside an Until activity whose expression is or(equals(Pipeline_status, 'Success'), equals(pMax_rerun_count, Max_loop_count)).
4. The first activity inside the Until activity is a Set Variable activity that increments the value of the variable Max_loop_count by 1.
5. The final activity inside the Until activity is a Set Variable activity that sets Pipeline_status to "Success".
The purpose here is to run all intended activities inside the Until block until they complete successfully; pMax_rerun_count ensures the pipeline doesn't go into indefinite loops.
This setup can be considered a framework if all pipelines need to rerun in case of failure.
I came up with a streamlined way of rerunning failed pipelines. I decided to use the Azure Data Factory REST API alongside Azure Logic Apps to solve the problem.
I run the Logic App on a scheduled recurrence and then use the following API calls:
GET https://management.azure.com/subscriptions/{subscriptionId}/resourceGroups/{resourceGroupName}/providers/Microsoft.DataFactory/factories/{factoryName}/pipelineruns/?api-version=2018-06-01
This API query gives us all the pipeline runs. If we want to filter it down to failed runs, we can add the following body to it:
{
  "lastUpdatedAfter": "2018-06-16T00:36:44.3345758Z",
  "lastUpdatedBefore": "2018-06-16T00:49:48.3686473Z",
  "filters": [
    {
      "operand": "status",
      "operator": "Equals",
      "values": [
        "Failed"
      ]
    }
  ]
}
After getting the failed pipeline runs, we can then invoke the following API on each failed pipeline to rerun it:
POST https://management.azure.com/subscriptions/{subscriptionId}/resourceGroups/{resourceGroupName}/providers/Microsoft.DataFactory/factories/{factoryName}/pipelines/{pipelineName}/createRun?api-version=2018-06-01
This solution can be built with a scripting language, a Power Automate workflow, or Azure Logic Apps.
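The same flow as a rough Python sketch, assuming a pre-acquired Azure AD bearer token and the URL placeholders filled in; note that the documented query endpoint is the POST queryPipelineRuns variant, which is what the sketch uses:

import requests

TOKEN = "<azure-ad-bearer-token>"  # assumed to be acquired elsewhere, e.g. via azure-identity
BASE = ("https://management.azure.com/subscriptions/{subscriptionId}"
        "/resourceGroups/{resourceGroupName}/providers/Microsoft.DataFactory"
        "/factories/{factoryName}")
HEADERS = {"Authorization": f"Bearer {TOKEN}"}

# Query pipeline runs in the time window, filtered down to failures.
body = {
    "lastUpdatedAfter": "2018-06-16T00:36:44.3345758Z",
    "lastUpdatedBefore": "2018-06-16T00:49:48.3686473Z",
    "filters": [{"operand": "status", "operator": "Equals", "values": ["Failed"]}],
}
runs = requests.post(f"{BASE}/queryPipelineRuns?api-version=2018-06-01",
                     headers=HEADERS, json=body).json().get("value", [])

# Rerun each failed pipeline by name.
for run in runs:
    requests.post(f"{BASE}/pipelines/{run['pipelineName']}/createRun?api-version=2018-06-01",
                  headers=HEADERS)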
As of now there is no built-in method in ADF to automate "rerunning from the failed activity", but each activity has a Retry option that you should certainly employ. With it, any activity in the pipeline can be reattempted as many times as necessary if it fails.
Let the trigger point to a new pipeline with an Execute Pipeline activity that points to the current Data Factory pipeline containing the copy activity:
Then choose the Advanced -> Wait for completion option.
After the executed pipeline is complete, the webhook action should contain logic to halt the DW.

Is there a way to "wait" for "Azure Data Factory" Execution task to complete before executing next steps of Azure Logic Apps

I'm trying to load some Excel data using an ADF pipeline via Logic Apps. However, when triggering through Logic Apps, the task triggers and the workflow moves to the next step immediately. I'm looking for a solution where the next step waits for the "Execute Data Factory Pipeline" action to complete before proceeding.
-Thanks
For this requirement, I provide a sample of my Logic App below for your reference:
1. Add a "Create a pipeline run" action and initialize a variable named status (set its value to "InProgress").
2. Then add an "Until" action and set the break condition to status equals "Succeeded". Inside the "Until" loop, add a "Get a pipeline run" action and set the status variable to the Status value returned by "Get a pipeline run".
3. After that, run your Logic App. The steps after the "Until" action will run only once your pipeline completes.
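The "Until" loop is just a polling pattern. For reference, a minimal Python sketch of the same idea, assuming the ADF "Get Pipeline Run" REST endpoint, a pre-acquired bearer token, and placeholder IDs:

import time
import requests

TOKEN = "<azure-ad-bearer-token>"  # assumed to be acquired elsewhere
RUN_URL = ("https://management.azure.com/subscriptions/{subscriptionId}"
           "/resourceGroups/{resourceGroupName}/providers/Microsoft.DataFactory"
           "/factories/{factoryName}/pipelineruns/{runId}?api-version=2018-06-01")

# Keep polling until the run reaches a terminal state.
status = "InProgress"
while status not in ("Succeeded", "Failed", "Cancelled"):
    time.sleep(30)  # equivalent to a Delay action inside the Until loop
    status = requests.get(RUN_URL, headers={"Authorization": f"Bearer {TOKEN}"}).json()["status"]
print(f"Pipeline finished with status: {status}")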
By the way:
You can also do it in Data Factory: you can delete the data after completion. Please refer to this document.

Azure DevOps REST API: how to get release stage's test results?

I have an Azure DevOps release pipeline which contains 10+ stages (environments). Each release stage runs a set of test cases, for example a BVT Test stage, a Performance Test stage, etc.
Now, I would like to automatically query the test results from each of the test stages via the REST API.
I can use the "Runs" API to query the test runs for this release, and the "Release" API to query the release stages including their stage names.
But the problem is, I am not able to link the test result from the test run back to the release stage.
For example, I have release stages like "BVT Test", "Performance Test", etc., but the test run is named something like "VSTest_TestResults_2234523".
Thanks!
how to get release stage's test results?
Try the API below:
GET https://vstmr.dev.azure.com/{org name}/{project name}/_apis/testresults/resultdetailsbyrelease?releaseId={release id}&releaseEnvId={environment id}&api-version=5.2-preview.1
To get the test results of one specific environment (stage), you must provide the environment id along with its corresponding release id.
Each test run has a member that points to the release's environmentId, which lets you map a run back to its stage name, e.g. in PowerShell:
$stageName = $stageTable[$($oneRun.release.environmentId)]
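A minimal sketch of calling that endpoint from Python, assuming a personal access token with test (read) scope; the organization, project, and IDs are placeholders:

import base64
import requests

ORG, PROJECT = "myorg", "myproject"  # placeholders
RELEASE_ID, ENV_ID = 123, 456        # taken from the Release API response
PAT = "<personal-access-token>"
auth = base64.b64encode(f":{PAT}".encode()).decode()  # PAT goes in basic auth

url = (f"https://vstmr.dev.azure.com/{ORG}/{PROJECT}/_apis/testresults/"
       f"resultdetailsbyrelease?releaseId={RELEASE_ID}"
       f"&releaseEnvId={ENV_ID}&api-version=5.2-preview.1")
resp = requests.get(url, headers={"Authorization": f"Basic {auth}"})
resp.raise_for_status()
print(resp.json())  # test result details for that one stage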

Azure Datafactory Pipeline Failed inside a scheduled trigger

I have created 2 pipelines in Azure Data Factory. We have a custom activity that runs a Python script inside the pipeline. When a pipeline is executed manually, it runs successfully any number of times. But I created a scheduled trigger with a 15-minute interval to run the 2 pipelines. The first execution runs successfully, but in the next interval I get the error "Operation on target PyScript failed: Hit unexpected exception and execution failed." We are blocked by this; any input would be really helpful.
From the ADF troubleshooting guide:
Custom Activity:
The following applies to Azure Batch.
Error code: 2500
Message: Hit unexpected exception and execution failed.
Cause: Can't launch command, or the program returned an error code.
Recommendation: Ensure that the executable file exists. If the program started, make sure stdout.txt and stderr.txt were uploaded to the storage account. It's a good practice to emit copious logs in your code for debugging.
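Following that recommendation, a hedged sketch of what the Python script launched by the custom activity might do, so that failures leave useful traces in stdout.txt/stderr.txt:

import logging
import sys

# Azure Batch captures the process's stdout/stderr into stdout.txt/stderr.txt.
logging.basicConfig(stream=sys.stdout, level=logging.DEBUG,
                    format="%(asctime)s %(levelname)s %(message)s")

def main():
    logging.info("Custom activity started")
    # ... actual work of the script goes here ...
    logging.info("Custom activity finished")

if __name__ == "__main__":
    try:
        main()
    except Exception:
        logging.exception("Unhandled exception")  # full traceback to stdout.txt
        sys.exit(1)  # a nonzero exit code surfaces as ADF error code 2500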
Related helpful doc: Tutorial: Run Python scripts through Azure Data Factory using Azure Batch
Hope this helps.
If you are still blocked, please share the failed pipeline run ID and failed activity run ID for further analysis.
