How to use Pipeline parameters on AzureML - azure

I've built a pipeline on AzureML Designer and I'm trying to use pipeline parameters but I'm not able to get the values of those parameters on a python script module.
https://learn.microsoft.com/en-us/azure/machine-learning/how-to-create-your-first-pipeline
This documentation contains a section called "Use pipeline parameters for arguments that change at inference time" but, unfortunately, it is empty.
I'm defining the parameters on the pipeline setting, see the screenshot on the bottom. Does anyone know how to use the parameters when using the Designer to build the pipeline?

You can correlate each pipeline stage's outputs with its inputs. For example, given the results of a model evaluation, you should be able to easily identify all the artifacts (model evaluation configuration, model specification, model parameters, training script, training data, etc.) pertaining to that evaluation.
Azure Machine Learning Pipelines Referenced Article:
https://github.com/Azure/MachineLearningNotebooks/blob/4a3f8e7025334ea8c0de0bada69b031ce54c24a0/how-to-use-azureml/machine-learning-pipelines/intro-to-pipelines/aml-pipelines-use-databricks-as-compute-target.ipynb
We have an AMLS pipeline that we are trying to parameterize with a date string so that we can run it for historical dates.
Here's the code we're using to submit the pipeline:
from azureml.core.authentication import InteractiveLoginAuthentication
import requests
# published_pipeline is assumed to be an already published pipeline object
auth = InteractiveLoginAuthentication()
aad_token = auth.get_authentication_header()
rest_endpoint = published_pipeline.endpoint
print("You can perform HTTP POST on URL {} to trigger this pipeline".format(rest_endpoint))
# specify the param when running the pipeline
response = requests.post(
    rest_endpoint,
    headers=aad_token,
    json={
        "ExperimentName": "dtpred-Dock2RTEG-EX-param",
        "RunSource": "SDK",
        "DataPathAssignments": {
            "input_datapath": {
                "DataStoreName": "erpgen2datastore",
                "RelativePath": "teams/PredictiveInsights/DatePrediction/2019/10/10",
            },
        },
        "ParameterAssignments": {"param_inputDate": "2019/10/10"},
    },
)
run_id = response.json()["Id"]
print('Submitted pipeline run: ', run_id)
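Since the date appears both in the data path and in the parameter assignment, it may help to derive both from a single value. The following is a minimal sketch, not part of the original answer: the helper name build_pipeline_payload is hypothetical, and the experiment name, datastore name, and path prefix are simply copied from the request above.

```python
from datetime import date


def build_pipeline_payload(run_date, experiment_name="dtpred-Dock2RTEG-EX-param"):
    """Build the JSON body for the pipeline REST call for one historical date.

    Hypothetical helper: field names and values mirror the request shown above.
    """
    date_path = run_date.strftime("%Y/%m/%d")
    return {
        "ExperimentName": experiment_name,
        "RunSource": "SDK",
        "DataPathAssignments": {
            "input_datapath": {
                "DataStoreName": "erpgen2datastore",
                "RelativePath": "teams/PredictiveInsights/DatePrediction/" + date_path,
            }
        },
        "ParameterAssignments": {"param_inputDate": date_path},
    }


payload = build_pipeline_payload(date(2019, 10, 10))
print(payload["ParameterAssignments"])
```

The resulting dictionary could then be passed as the json argument of requests.post, so the data path and the parameter cannot drift apart.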

Related

Azure DevOps build pipeline logid for a specific task - dynamically [duplicate]

In Azure DevOps, I have a multistage build pipeline. In that pipeline, I have a task named Terraform_init. I need the log ID of this task dynamically. How can I find the log ID dynamically if I know the display name of the task?
Current situation:
Right now I have figured out the log ID for that task. But later I will add more tasks before the Terraform_init task in my build pipeline, so the log ID for that task will change.
Why I need it:
After the Terraform_init task, I have another task called Get_logs. This task gets the logs of the Terraform_init task and saves them in a blob. For that, I have to use the following line:
$logs_url = ('https://dev.azure.com/bmw-ai-big-data-platform/{0}/_apis/build/builds/{1}/logs/27?api-version=6.0' -f $($env:SYSTEM_TEAMPROJECTID), $($env:BUILD_BUILDID) )
I will have more tasks before the Terraform_init task, so the ../logs/27.. part would need to be updated every time. I want to avoid this.
Thanks in advance.
Use this code to get the log IDs:
import requests
url = "https://dev.azure.com/<orgname>/<project name>/_apis/build/builds/<build id>/timeline?api-version=6.0"
headers = {
    'Authorization': 'Basic <base64encoded PAT>'
}
response = requests.get(url, headers=headers)
response_json = response.json()
records = response_json['records']
for record in records:
    # replace 'Initialize job' with the display name of your task
    if record['name'] == 'Initialize job':
        print(record['log']['id'])
After that, use a logging command to output the log ID as a variable, so that you can use it in the following tasks. (This works only at runtime; it cannot be done at compile time.)
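The lookup and the logging command can be combined roughly as follows. This is a sketch under assumptions: the helper names and the variable name terraformInitLogId are hypothetical, the sample records are made up to stand in for the timeline response, and the guard for records without a log is a precaution rather than something stated in the answer.

```python
def find_log_id(records, task_name):
    """Return the log ID of the first timeline record named task_name, or None."""
    for record in records:
        log = record.get("log")
        # Skip records that have no log attached.
        if record.get("name") == task_name and log:
            return log["id"]
    return None


def set_variable_command(name, value):
    """Format an Azure DevOps logging command that sets a pipeline variable."""
    return "##vso[task.setvariable variable={}]{}".format(name, value)


# Made-up sample data standing in for response.json()['records'].
sample_records = [
    {"name": "Initialize job", "log": {"id": 5}},
    {"name": "Checkout", "log": None},
    {"name": "Terraform_init", "log": {"id": 27}},
]

log_id = find_log_id(sample_records, "Terraform_init")
command = set_variable_command("terraformInitLogId", log_id)
print(command)
```

Printing the command to standard output from a script task is what makes the variable available to later tasks in the same job.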

Add run id when registering ml.azure model via python (pipeline)

I have registered a model in this way:
from azureml.core.model import Model
model = Model.register(model_path="sklearn_regression_model.pkl",
                       model_name="sklearn_regression_model",
                       tags={'area': "diabetes", 'type': "regression"},
                       description="Ridge regression model to predict diabetes",
                       workspace=ws)
However, I would like to add the run ID from the experiment, so that I can always trace the model back to the experiment that created it. In Azure ML there is a column indicating that a run ID can be attached to a registered model, but the Model class doesn't have this parameter.
In order to see the Experiment name and the Run ID in the Azure Machine Learning Studio, I had to use Run.register_model() in the outer pipeline file instead.
It is in some way even better since at that place we get access to the Dataset objects which we can link to the model.
from azureml.core import Experiment
run = Experiment(workspace, "rgb_finetune").submit(pipeline)
run.wait_for_completion(show_output=True)
eval_metrics = run.get_metrics()["Fine-Tuned Evaluation"]
if eval_metrics["AP50"] > 0.5:
    # register from the step run so the model is linked to its Run ID
    run.find_step_run("finetune.py")[0].register_model(
        model_name="92c5e1a1d1",
        model_path="outputs/model_export",
        properties={"AP50": round(float(eval_metrics["AP50"]), 3)},
        description="RGB model",
        datasets=[("images", images), ("labels", labels)],
    )
First of all, the Model class has a run ID, which you can check with:
azureml.core.Model.run_id, which contains the ID of the Run that created the Model.
When listing models, run_id is also an optional ID used to filter the returned results.
So, if you register the model first, you should be able to query its run_id.
Alternatively, you can query the run_id from the Run that generated your model, and then register the model with a tag such as tags={'run_id': '{your-run-id}'}

How can I change Azure data factory pipeline parameter dynamically? I want to assign new value to pipeline parameter from 'Metadata activity'

I want to obtain a parameter from the pipeline output. As far as I know, Azure Data Factory pipeline output cannot currently be customized. Hence I want to pass my output string in a pipeline parameter, so that I can extract it from the pipeline output JSON.
The output of a pipeline is the output of an activity within that pipeline. You can use the "Execute Pipeline" activity to trigger a new pipeline.

'Delay until' finish time of 'Queue a new build' not working in Azure Logic App

I'm triggering an Azure Logic App from an https webhook for a docker image in Azure Container Registry.
The workflow is roughly:
When a HTTP request is received
Queue a new build
Delay until
FinishTime of Queue a new build
See: Workflow image
The Delay until action doesn't work because the queried FinishTime is 0001-01-01T00:00:00.
It complains about the wrong format, so I manually added a Z after the FinishTime keyword.
Now the time stamp is in the right format, however, the timestamp 0001-01-01T00:00:00Z obviously doesn't make sense and subsequent steps are executed without delay.
Anything that I am missing?
edit: Queue a new build queues an Azure pipeline build. I.e. the FinishTime property comes from the pipeline.
You need to set a timestamp in the future; the timestamp 0001-01-01T00:00:00Z that you passed to the "Delay until" action is not a future time. If you set a timestamp such as 2020-04-02T07:30:00Z, the "Delay until" action will take effect.
Update:
I don't think "Delay until" can do what you expect, but you could try the following: add a "Condition" action to check whether the FinishTime is greater than the current time.
The expression in the "Condition" is:
sub(ticks(variables('FinishTime')), ticks(utcNow()))
In short, if the FinishTime is greater than the current time, run the "Delay until" action; if it is less than the current time, do whatever else you want. (Also pay attention to the time zone of your timestamps; you may need to convert them all to UTC.)
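To see why the placeholder timestamp never triggers a delay, the condition above can be mimicked in Python. This is only an analogy, not Logic App code: the ticks helper below is a hypothetical reimplementation of the ticks() expression function (100-nanosecond intervals since 0001-01-01T00:00:00Z), and should_delay is an assumed name for the check.

```python
from datetime import datetime, timezone

TICKS_PER_SECOND = 10_000_000  # one tick is 100 nanoseconds


def ticks(timestamp):
    """Ticks elapsed since 0001-01-01T00:00:00Z for an ISO 8601 UTC timestamp."""
    dt = datetime.fromisoformat(timestamp.replace("Z", "+00:00"))
    delta = dt - datetime(1, 1, 1, tzinfo=timezone.utc)
    whole_seconds = delta.days * 86400 + delta.seconds
    return whole_seconds * TICKS_PER_SECOND + delta.microseconds * 10


def should_delay(finish_time, now):
    """True only when finish_time lies after now, i.e. the difference is positive."""
    return ticks(finish_time) - ticks(now) > 0


# The placeholder FinishTime has zero ticks, so it is never in the future.
print(should_delay("0001-01-01T00:00:00Z", "2020-04-02T07:00:00Z"))
print(should_delay("2020-04-02T07:30:00Z", "2020-04-02T07:00:00Z"))
```

The current time is passed in explicitly here so that the check is reproducible; in the Logic App, utcNow() plays that role.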
I've been in touch with an Azure support engineer, who has confirmed that the Delay until action should work as I intended to use it, however, that the FinishTime property will not hold a value that I can use.
In the meantime, I have found a workaround, where I'm using some logic and quite a few additional steps. Inconvenient but at least it does what I want.
Here are the most important steps that are executed after the workflow gets triggered from a webhook (docker base image update in Azure Container Registry).
Essentially, I'm initializing the following variables and queuing a new build:
buildStatusCompleted: String value containing the target value completed
jarsBuildStatus: String value containing the initial value notStarted
jarsBuildResult: String value containing the default value failed
Then, I'm using an Until action to monitor when the jarsBuildStatus's value is switching to completed.
In the Until action, I'm repeating the following steps until jarsBuildStatus changes its value to buildStatusCompleted:
Delay for 15 seconds
HTTP request to Azure DevOps build, authenticating with personal access token
Parse JSON body of previous raw HTTP output for status and result keywords
Set jarsBuildStatus = status
After breaking out of the Until action (loop), the jarsBuildResult is set to the parsed result.
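The polling logic described above can be sketched in Python as follows. This is an illustration of the same loop, not the Logic App itself: wait_for_build, fetch_build, and max_polls are hypothetical names, and the fake responses stand in for the parsed JSON of the Azure DevOps build request.

```python
import time


def wait_for_build(fetch_build, delay_seconds=15, max_polls=240, sleep=time.sleep):
    """Poll fetch_build() until the build status is 'completed' or polls run out."""
    build_status_completed = "completed"
    jars_build_status = "notStarted"  # initial value, as in the workflow
    jars_build_result = "failed"      # default value, as in the workflow
    polls = 0
    while jars_build_status != build_status_completed and polls < max_polls:
        sleep(delay_seconds)          # "Delay for 15 seconds"
        build = fetch_build()         # HTTP request plus JSON parsing
        jars_build_status = build["status"]
        polls += 1
    if jars_build_status == build_status_completed:
        jars_build_result = build["result"]
    return jars_build_status, jars_build_result


# Fake responses so the sketch runs without network access or real delays.
fake_responses = iter([
    {"status": "notStarted"},
    {"status": "inProgress"},
    {"status": "completed", "result": "succeeded"},
])
status, result = wait_for_build(lambda: next(fake_responses), sleep=lambda seconds: None)
print(status, result)
```

The max_polls limit is an addition that keeps the loop from running forever if the build never completes; if the limit is reached, the default result failed is returned.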
All these steps are part of a larger build orchestration workflow, where I'm repeating the given steps multiple times for several different Azure DevOps build pipelines.
The final action in the workflow is sending all the status, result and other relevant data as a build summary to Azure DevOps.
To me, this is only a workaround and I'll leave this question open to see if others have suggestions as well or in case the Azure support engineers can give more insight into the Delay until action.
Here's an image of the final workflow (at least, the part where I implemented the Delay until action):
edit: Turns out, I can simplify the workflow because there's a dedicated Azure DevOps action in the Logic App called Send an HTTP request to Azure DevOps, which omits the need for manual authentication (Azure support engineer pointed this out).
The workflow now looks like this:
That is, I can query the build status directly and set the jarsBuildStatus as
@{body('Send_an_HTTP_request_to_Azure_DevOps:_jar''s')['status']}
The code snippet above is automagically converted to a value for the Set variable action. Thus, no need to use an additional Parse JSON action.

Azure Data Factory v2: Activity execute pipeline output

Is there a way to reference the output of an executed pipeline in the activity "Execute pipeline"?
I.e.: master pipeline executes 2 pipelines in sequence. The first pipeline generates an own created run_id that needs to be forwarded as a parameter to the second pipeline.
I've read the documentation and checked that the master pipeline logs the output of the first pipeline, but it looks like this is not directly possible?
We've used until now only 2 pipelines without a master pipeline, but we want to re-use the logic more. Currently we have 1 pipeline that calls the next pipeline and forwards the run_id.
ExecutePipeline currently cannot pass anything from its insides to its output. You can only get the run ID or name.
For some reason, the output of ExecutePipeline is returned not as a JSON object but as a string. So if you try to select a property of the output like this: @activity('ExecutePipelineActivityName').output.something, you get this error:
Property selection is not supported on values of type 'String'
I found that I had to use the following to get the run ID:
@json(activity('ExecutePipelineActivityName').output).pipelineRunId
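The string-versus-object issue can be illustrated with a Python analogy. This is not Data Factory code: get_run_id is a hypothetical helper, and the raw output string below is made-up sample data whose only grounded field is pipelineRunId.

```python
import json


def get_run_id(activity_output):
    """Return pipelineRunId, parsing the output first if it arrives as a string."""
    if isinstance(activity_output, str):
        # Property access on a plain string fails, so parse it into an object.
        activity_output = json.loads(activity_output)
    return activity_output["pipelineRunId"]


# Made-up sample output, serialized as a string as described above.
raw_output = '{"pipelineRunId": "hypothetical-run-id"}'
run_id = get_run_id(raw_output)
print(run_id)
```

Parsing first plays the same role as wrapping the activity output in json() before selecting the property.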
The execute pipeline activity is just another activity with outputs that can be captured by other activities. https://learn.microsoft.com/en-us/azure/data-factory/control-flow-execute-pipeline-activity#type-properties
If you want to use the runId of the previously executed pipeline, it would look like this:
@activity('ExecutePipelineActivityName').output.pipeline.runId
Hope this helped!
