I have a Synapse Pipeline which runs a notebook containing unit tests before executing the business job (another notebook). The unit test notebook references the functions using the mssparkutils.notebook.run() command, and works fine when I run the notebook on its own in Synapse Studio. However, when I trigger the notebook in a pipeline, it fails with the error:
{
"errorCode": "6002",
"message": "NameError: name 'get_latest_report_per_user' is not defined",
"failureType": "UserError",
"target": "Run Tests",
"details": []
}
get_latest_report_per_user is defined in the referenced "dependency" notebook, and the reference resolves fine when the unit test notebook is run on its own outside of the pipeline; the error above is raised only when the pipeline runs the unit test notebook.
I tried using the magic %run command instead of mssparkutils.notebook.run() to no avail.
Why is this only failing when executed as part of a pipeline?
I've been having similar issues, and some of them were resolved by publishing the notebooks. As far as I can tell, a pipeline executes the published version of a notebook, so unpublished changes (including references to unpublished dependency notebooks) are not picked up. You could also try the "Enable unpublished notebook reference" option under Notebook properties, but I can't recall whether this option is respected by notebooks invoked by a Synapse pipeline.
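For what it's worth, the pattern that makes definitions visible to the caller is the %run magic: mssparkutils.notebook.run() executes the target notebook in a separate session and only returns its exit value, so functions defined there do not persist in the calling notebook. A minimal sketch, where the notebook path and input_df are placeholders:

# notebook path below is a placeholder
%run /unit_tests/dependency_notebook

# After %run completes, definitions from the dependency notebook are
# available in this session (input_df is a placeholder DataFrame):
result = get_latest_report_per_user(input_df)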
Related
I can't seem to run my Synapse notebook in a pipeline; however, if I run it by itself it works fine.
The error I get is simply: "Operation on target LoadFactSalesDoc_without_DimResource failed: MessageQueueFullException: The message queue is full or is completed and cannot accept more items."
Has anyone had this error before?
I tried running the notebook in a different pipeline, and I also tried allocating more resources to the notebook. It always fails with the same error.
Is there a way to run (or convert) .ipynb files on a Databricks cluster without using the Databricks import UI? Basically, I want to be able to develop in Jupyter but also be able to run the file on Databricks, where it is pulled through Git.
It's possible to import Jupyter notebooks into the Databricks workspace as Databricks notebooks and then execute them. You can use:
the Workspace Import REST API
the databricks workspace import command of the databricks-cli.
P.S. Unfortunately you can't open an .ipynb file just by committing it into a Repo; it will be treated as raw JSON. So you need to import it so that it is converted into a Databricks notebook.
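A minimal sketch of the REST API route in Python; the host, token, and workspace paths are placeholders:

import base64
import requests

# placeholders: fill in your workspace URL and a personal access token
host = "https://<databricks-instance>"
token = "<personal-access-token>"

# Read the local .ipynb and base64-encode it, as the import API expects
with open("my_notebook.ipynb", "rb") as f:
    content = base64.b64encode(f.read()).decode("utf-8")

resp = requests.post(
    f"{host}/api/2.0/workspace/import",
    headers={"Authorization": f"Bearer {token}"},
    json={
        "path": "/Users/me@example.com/my_notebook",
        "format": "JUPYTER",   # converts the .ipynb into a Databricks notebook
        "language": "PYTHON",
        "content": content,
        "overwrite": True,
    },
)
resp.raise_for_status()

The databricks-cli equivalent should be databricks workspace import with --format JUPYTER.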
So far I can run the pipeline manually, and it runs my TestCafe tests with the "node myTests.js" command.
(screenshot: my pipeline run)
My file myTests.js looks like this:
(screenshot: myTests.js)
I followed this tutorial: https://learn.microsoft.com/en-us/azure/devops/test/run-automated-tests-from-test-hub?view=azure-devops
I tried to associate a test with my test plan via the REST API; I guess there's some problem there, because I can name the test whatever I want and it still runs without errors (screenshot: my testcase association).
When I run a test case, it says it found an automated test and runs it without errors; the VsTest job runs, but with a warning:
2021-05-18T09:16:32.7619103Z Source filter: *test.dll,!*TestAdapter.dll,!\obj*
2021-05-18T09:16:32.7879061Z ##[warning]No test sources found matching the given filter '*test.dll,!*TestAdapter.dll,!\obj*'
Any ideas what I'm doing wrong? I just want my pipeline or my tests to run when I run a test case in my test plans.
Azure DevOps: How can I connect TestCafe tests to a test case?
I am afraid you cannot connect TestCafe tests to a test case at this moment.
According to the document Run automated tests from test plans:
You will need:
A Team Build pipeline that generates builds containing the test binaries.
That is the reason why you get the error No test sources found matching the given filter '*test.dll,!*TestAdapter.dll,!\obj*'.
We could instead build and run the TestCafe tests directly in the Azure DevOps pipeline.
Please check the document Integrate TestCafe with Azure DevOps for some more details.
I'm new to Azure Machine Learning and am trying to create a simple ML pipeline. AzureML supports defining ML pipelines in YAML, as described here (https://learn.microsoft.com/en-us/azure/machine-learning/reference-pipeline-yaml).
The error I face is that when I create a pipeline via "az ml pipeline create" with a YAML file, it returns the message below, even though I specify "download" for the bind_mode of the data_references.
Message: "<class 'azureml.data.tabular_dataset.TabularDataset'> does not support mount. Only FileDataset supports mount"
Environment:
OS: Windows 10
Azure CLI: 2.11.1
It seems that bind_mode is not working with a Tabular dataset, or I am missing something. The reason I'm confused is that, as you can see in the sample YAML file in the link above, a dataset with "bind_mode: download" should work.
My sample YAML is below, with a registered Tabular dataset called "dataset1".
Sample YAML:
pipeline:
    name: "Sample ML pipeline YAML"
    data_references:
        sampleDS:
            dataset_name: dataset1
            bind_mode: download
    default_compute: compute-name
    steps:
        SampleStep:
            type: PythonScriptStep
            name: SampleProcessing
            script_name: processing.py
            allow_reuse: True
            source_directory: ".\\src\\pipeline\\steps"
            inputs:
                input_ds:
                    source: sampleDS
When data_references is changed to the following (specifying the path in a datastore directly, rather than going through a registered dataset), it works.
    name: "Sample ML pipeline YAML"
    data_references:
        sampleDS:
            datastore: workspaceblobstore
            path_on_datastore: path/of/sampeDS/sample.csv
Yes, you are right: TabularDataset does not support download or mount. You can create and register a FileDataset, and the code sample will work.
Learn more about dataset types here.
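A minimal sketch of registering the same file as a FileDataset with the Python SDK, reusing the datastore and path from the working example above so the original data_references block with bind_mode: download can resolve dataset1:

from azureml.core import Workspace, Dataset

ws = Workspace.from_config()
datastore = ws.datastores['workspaceblobstore']

# Create a FileDataset from the file path and register it under the
# name referenced by the pipeline YAML
file_ds = Dataset.File.from_files(path=(datastore, 'path/of/sampeDS/sample.csv'))
file_ds.register(workspace=ws, name='dataset1', create_new_version=True)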
I create experiments in my workspace using the Python SDK (azureml-sdk). I now have a lot of 'test' experiments littering our workspace. How can I delete individual experiments, either through the API or in the portal? I know I can delete the whole workspace, but there are some good experiments we don't want to lose.
https://learn.microsoft.com/en-us/azure/machine-learning/service/how-to-export-delete-data#delete-visual-interface-assets suggests it is possible, but my workspace view does not look anything like what is shown there.
Experiment deletion is a common request, and we on the Azure ML team are working on it. Unfortunately it's not quite supported yet.
Starting from the 2021-08-24 Azure ML workspace release you can delete an experiment, but only by clicking in the UI (select the experiment in the Experiments view -> 'Delete').
Watch out: deleting an experiment deletes all the underlying runs, and deleting a run deletes the child runs, run metrics, metadata, outputs, logs and working directories!
Only for experiments without any underlying runs can you use the Python SDK (azureml-core==1.34.0), via the Experiment class's delete static method. Example:
from azureml.core import Workspace, Experiment

# Look up the experiment's id, then delete the (empty) experiment by id
aml_workspace = Workspace.from_config()
experiment_id = Experiment(aml_workspace, '<experiment_name>').id
Experiment.delete(aml_workspace, experiment_id)
If an experiment has runs you will get an error:
CloudError: Azure Error: UserError
Message: Only empty Experiments can be deleted. This experiment contains run(s)
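If you are scripting this, you can guard against that error by checking for runs first; a minimal sketch, assuming azureml-core 1.34.0 or later:

from azureml.core import Workspace, Experiment

ws = Workspace.from_config()
exp = Experiment(ws, '<experiment_name>')

# Experiment.delete only works on empty experiments, so check first
if next(exp.get_runs(), None) is None:
    Experiment.delete(ws, exp.id)
else:
    print(f"'{exp.name}' still has runs; delete them in the Studio UI first.")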
I hope the Azure ML team brings this functionality to the Python SDK soon!
Also, on a sad note: it would be great if the deletion were optimized. For now the implementation seems extremely slow, and the call is synchronous only (an async variant would be welcome as well)...
You can archive your experiment with the following code (note that archiving hides the experiment from the default list; it is not a permanent delete):

# Declare your experiment
from azureml.core import Experiment
experiment = Experiment(workspace=ws, name="<your_experiment>")

# Archive the experiment
experiment.archive()

# Now check the list of experiments in your AML workspace and see that it no longer appears
This issue is still open at the moment. What I have figured out to avoid littering the workspace with many experiments is to run locally with the Python SDK and then upload the output files to the run's outputs folder when the run completes.
You can do that like this:
run.upload_file(name='outputs/sample.csv', path_or_stream='./sample.csv')
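For context, a minimal end-to-end sketch of that workaround; the experiment name and file names are placeholders:

from azureml.core import Workspace, Experiment

ws = Workspace.from_config()
exp = Experiment(workspace=ws, name='local-dev')  # placeholder experiment name

run = exp.start_logging()  # start an interactive (local) run
# ... do the actual work locally, producing ./sample.csv ...
run.upload_file(name='outputs/sample.csv', path_or_stream='./sample.csv')
run.complete()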
Follow these two steps:
1. Delete the experiment's child jobs in Azure ML Studio (select the runs in the experiment's view and delete them).
2. Delete the (now empty) experiment with the Python API:
from azureml.core import Workspace, Experiment

# choose the workspace and experiment
ws = Workspace.from_config()
exp_name = 'digits_recognition'
exp = Experiment(ws, exp_name)

# delete the experiment (its child jobs were already deleted in step 1)
Experiment.delete(ws, exp.id)
Note: for more fine-grained control over deletions, use the Azure CLI.