I am using the code below to return some information from an Azure Databricks notebook, but runOutput isn't appearing even after the Notebook activity completes successfully.
Here is the code I used:
import json

dbutils.notebook.exit(json.dumps({
    "num_records": dest_count,
    "source_table_name": table_name
}))
The Databricks notebook exits properly, but the Notebook activity isn't showing runOutput.
Can someone please help me figure out what is wrong here?
When I tried the above in my environment, it worked fine for me.
These are my linked service configurations:
Result:
I suggest you try troubleshooting steps such as using a different notebook, creating a new Databricks workspace, or using an existing cluster in the linked service.
If it still gives the same result, it's better to raise a support ticket for your issue.
Related
I'm trying to connect to Azure Data Explorer but I keep getting a non-descriptive error. I'm following this tutorial:
https://learn.microsoft.com/en-us/sql/azure-data-studio/notebooks/notebooks-kqlmagic?view=sql-server-ver16
Has anyone seen this?
I was trying to connect to Azure Data Explorer from Azure Machine Learning Studio notebooks. I also tried it in Jupyter notebooks with an Anaconda environment and got the same error.
However, the command %reload_ext Kqlmagic worked for me (see the sketch below).
Maybe it's because that Azure login has multiple directories?
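For reference, a minimal sketch of the notebook cells that worked for me, assuming the Kqlmagic package is already installed; the cluster and database names are placeholders, not values from this thread:

# Reload the Kqlmagic extension if %load_ext fails or hangs
%reload_ext Kqlmagic

# Connect to Azure Data Explorer with device-code authentication
# ('<your-cluster>' and '<your-database>' are placeholders)
%kql azureDataExplorer://code;cluster='<your-cluster>';database='<your-database>'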
Using Azure Data Factory version 2, we created a Spark activity (a simple Hello World example), but it throws an error with error code 2312.
Our configuration is an HDInsight cluster with Azure Data Lake as primary storage.
We also tried spinning up an HDInsight cluster with Azure Blob Storage as primary storage, and we face the same issue there as well.
We further tried replacing the Scala code with a Python script (a simple hello world example), but face the same issue.
Has anyone encountered this issue? Are we missing any basic setting?
Thanks in advance.
Maybe it's too late and you have already solved your issue. However, you can try the below.
Use Azure Databricks. Create a new Databricks instance and run your sample hello world in a notebook; if it works in the notebook, then call the same notebook from ADF (see the sketch after this answer).
Hope it helps.
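As a rough sketch of that idea (the exit message is just an illustrative placeholder, not something from the original question), a minimal notebook to validate on the Databricks side before wiring it into an ADF Notebook activity could be:

# Minimal "hello world" cell to confirm the Databricks workspace and cluster work
print("hello world")

# Optionally exit with a value so the calling ADF Notebook activity receives a runOutput
dbutils.notebook.exit("hello world succeeded")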
@Yogesh, have you tried debugging the issue through ADF using the Debug option shown in the screenshot? That might help you get to the exact root cause. I would also suggest running spark-submit with the jar on a Linux box to find out the exact cause.
Also, you can find more info at https://learn.microsoft.com/en-us/azure/data-factory/data-factory-troubleshoot-guide#error-code-2312
When trying to reference/load a .dsource or .dprep file generated from a data source file in blob storage, I receive the error "No files for given path(s)".
Tested with .py and .ipynb files. Here's the code:
# Use the Azure Machine Learning data source package
from azureml.dataprep import datasource

df = datasource.load_datasource('POS.dsource')  # Error generated here

# Remove this line and add code that uses the DataFrame
df.head(10)
Please let me know what other information would be helpful. Thanks!
Encountered the same issue and it took some research to figure out!
Currently, data source files from blob storage are only supported for two cluster types: Azure HDInsight PySpark and Docker (Linux VM) PySpark.
To get this to work, it's necessary to follow the instructions in Configuring Azure Machine Learning Experimentation Service.
I also ran az ml experiment prepare -c <compute_name> to install all dependencies on the cluster before submitting the first command, since that deployment takes quite a bit of time (at least 10 minutes for my D12 v2 cluster).
I got the .py files to run with the HDInsight PySpark compute cluster (for data stored in Azure blobs), but .ipynb files are still not working on my local Jupyter server: the cells never finish.
I'm from the Azure Machine Learning team. Sorry you are having issues with the Jupyter notebook. Have you tried running the notebook from the CLI? If you run from the CLI, you should see stderr/stdout; the IFrame in WB swallows the actual error messages. This might help you troubleshoot.
I am trying to create an Azure Batch service. While creating a pool, I am trying to specify a start task that should run when the VMs are spun up for the first time. After the pool is committed, when I observe the progress in the Azure portal, the state of the nodes appears as starttaskfailed, and I can see the scheduling error inside starttaskinfo. The error info is given below.
CATEGORY - ServerError
CODE - BlobDownloadMiscError
MESSAGE - Miscellaneous error encountered while downloading one of the specified azure blob's.
Here I am trying to run a simple executable as a start task, which creates a container and writes a blob.
I have already tried running the exe standalone from my machine, and it performs the operation as expected.
But when I try to run the same thing as a start task, I get the aforementioned error.
P.S. I have already verified that all the paths and the required dependencies (DLLs) are uploaded to the blob.
Please help me identify the root cause of the problem. Even getting a descriptive error message would be of great help.
I am able to solve this issue now. I had provided the wrong container name, so I was getting this error because the files could not be located in the given location. But while running a start task, it was not giving me any meaningful error, neither in the portal nor in code. To rectify this, I tried running the same thing as a job and added it as a task, which correctly reported the error in the ExecutionInformation->SchedulingError property of CloudTask.
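For anyone hitting the same wall, here is a rough Python sketch of that debugging approach using the azure-batch SDK. All account, pool, job, and task names and the executable below are placeholders (not from this thread), and the exact execution-info property names (scheduling_error vs. failure_info) vary between SDK versions:

import azure.batch as batch
import azure.batch.models as batchmodels
from azure.batch import batch_auth

# Placeholder credentials and account URL; replace with your own Batch account details
credentials = batch_auth.SharedKeyCredentials('<account-name>', '<account-key>')
client = batch.BatchServiceClient(credentials, 'https://<account>.<region>.batch.azure.com')

# Run the same command line as an ordinary task on an existing pool instead of a start task
client.job.add(batchmodels.JobAddParameter(
    id='debug-job',
    pool_info=batchmodels.PoolInformation(pool_id='<existing-pool-id>')))

client.task.add('debug-job', batchmodels.TaskAddParameter(
    id='debug-task',
    command_line='MyStartTask.exe'))  # hypothetical executable; attach the same resource files as the start task

# The task's execution information carries the descriptive scheduling/failure error
task = client.task.get('debug-job', 'debug-task')
print(task.execution_info)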
When I try to test my Azure ML model, I get the following error: “Error code: InternalError, Http status code: 500”, so it appears something is failing inside of the machine learning service. How do I get around this error?
I've run into this error before, and unfortunately the only workaround I found was to create a new ML workspace backed by a storage account that you know is online. Then copy your experiment over to the new workspace, and things should work. It can be a bit cumbersome, but it should get rid of the error message. Since the service is relatively new, things sometimes get corrupted as updates are made, so I recommend checking the box labeled "disable updates" within your experiment. Hope that helps!