How to exit an Azure Databricks notebook while a job is running - databricks

I am executing an Azure Databricks notebook and using a try/catch block for exception handling. I want to exit the notebook run when an exception is caught. I am calling the notebook exit command, but execution still moves on to the next cell.
Below is my code and the output when I executed the notebook. Can you please check and help me if I'm doing something wrong anywhere?
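For reference, a minimal sketch of the pattern the question is aiming for, assuming the standard dbutils.notebook.exit call; the table names and message text are placeholders, not taken from the question's code:

# Hypothetical example: end the notebook run as soon as an exception is caught.
# dbutils.notebook.exit must actually be reached (here, inside the except block);
# it ends the run with the value you pass to it.
try:
    df = spark.read.table("source_db.source_table")        # placeholder table
    df.write.mode("overwrite").saveAsTable("target_db.target_table")
except Exception as e:
    dbutils.notebook.exit(f"Notebook exited on exception: {e}")

One caveat: the early exit reliably takes effect when the notebook is executed as a job or through dbutils.notebook.run; when cells are run interactively one at a time, a later cell can still be executed by hand afterwards.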

Related

Amazon EMR Step keeps running without error

I have a Spark EMR job that runs fine until a point in the code where it seems to get stuck. The step remains in the Running state and I see no error in the logs. I believe the code is fine, as I have tested it in my Jupyter notebook. I am unable to debug or figure out what the issue is here, since the step is not failing and there are no error logs produced in the stderr file.
When I try to execute only the part of the code that my step is stuck on, it executes and the step ends successfully. Also, the monitoring shows no issues with memory, nodes, etc. Any help identifying the issue?

Update databricks job status through API

We need to execute a long-running exe on a Windows machine and are thinking of ways to integrate it with the workflow. The plan is to include the exe as a task in the Databricks workflow.
We are considering a couple of approaches:
Create a DB table and insert a row when this particular task starts in the workflow. The exe running on the Windows machine polls the table for new records. Once a new record is found, the exe proceeds with the actual execution and updates the status after completion. Databricks queries this table continuously for the status and, once it shows completed, the task finishes.
Using the Databricks API, check whether the task has started executing, and let the exe continue with its work. After the application finishes, update the task status to complete; until then the Databricks task runs in a while (true) loop. However, the current API does not appear to support updating a task's execution status to complete (not 100% sure).
Please share your thoughts or alternate solutions.
This is an interesting problem. Is there a reason you must use Databricks to execute an EXE?
Regardless, I think you have the right kind of idea. Here is how I would do it with the Jobs API:
Have your EXE process write a status file to a staging location, probably in DBFS, since that will be locally accessible inside of Databricks.
Build a notebook to load this file; having a table is optional but may give you additional logging capabilities if needed. Your notebook should return its output via the dbutils.notebook.exit method, which lets you output a value such as a string or an array. You could return "In Progress" and "Success", or the latest line from the file you've written (see the sketch after this list).
Wrap that notebook in a Databricks job, run it on an interval with a cron schedule (you said 1 minute), and retrieve the output value of each run via the get-output endpoint.
Additional note: the benefit of abstracting this into return values from a notebook is that you can orchestrate it via other workflow tools, e.g. Databricks Workflows or Azure Data Factory inside an Until condition. There are no limits, as long as you can orchestrate a notebook in that tool.
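A minimal sketch of what that polling notebook could look like, assuming the EXE writes a one-line status file under DBFS; the path and status strings below are illustrative only:

# Hypothetical polling notebook: read the latest status written by the EXE and
# return it as the notebook's output. The value passed to dbutils.notebook.exit
# is what the Jobs API runs/get-output endpoint reports as notebook_output.result.
status_path = "/dbfs/tmp/exe_status/status.txt"   # placeholder staging location

try:
    with open(status_path) as f:
        status = f.read().strip() or "In Progress"
except FileNotFoundError:
    status = "In Progress"   # the EXE has not written anything yet

dbutils.notebook.exit(status)

The orchestrator (a Databricks Workflow, or an ADF Until activity) would then trigger this job on its schedule and poll runs/get-output until the returned value is "Success".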

Azure Databricks cell execution stuck on waiting to run state

I am using Azure Databricks to connect to an SAP system and ADLS. For the SAP connection I am installing the latest version of the JDBC library (ngdbc-2.10.14.jar). After installing the library, the notebook cells have stopped executing. When I try to run a cell, it gets stuck in a waiting-to-run state.
You cannot run any further commands in a notebook attached to a Databricks Runtime cluster after cancelling a running streaming cell. The commands are stuck in a "waiting to execute" state, and you'll have to clear the notebook's state or detach and reattach the cluster before you can run commands on it.
This problem only happens when you cancel a single cell; it does not occur when you run all cells and cancel all of them.
To fix an impacted notebook without having to restart the cluster, go to the Clear menu and choose Clear State:

Using Databricks for Twitter sentiment analysis - issue running the official tutorial

I am starting to use Databricks and tried to implement one of the official tutorials (https://learn.microsoft.com/en-gb/azure/azure-databricks/databricks-sentiment-analysis-cognitive-services) from the website. However, I run into an issue - not even sure if I can call it an issue - when I run the second notebook (analysetweetsfromeventhub): all subsequent commands (2nd, 3rd, 4th ...) are shown as waiting to run, but never run. See the picture. Any idea what it might be? Thanks.
After you cancel a running streaming cell in a notebook attached to a Databricks Runtime cluster, you cannot run any subsequent commands in the notebook. The commands are left in the “waiting to run” state, and you must clear the notebook’s state or detach and reattach the cluster before you can successfully run commands on the notebook.
Note that this issue occurs only when you cancel a single cell; it does not apply when you run all and cancel all cells.
In the meantime, you can do either of the following:
To remediate an affected notebook without restarting the cluster, go to the notebook’s Clear menu and select Clear State:
If restarting the cluster is acceptable, you can solve the issue by turning off idle context tracking. Set the following Spark configuration value on the cluster:
spark.databricks.chauffeur.enableIdleContextTracking false
Then restart the cluster.

Can I start another cluster from the current notebook in Databricks?

I have notebook1 assigned to cluster1 and notebook2 assigned to cluster2.
I want to trigger notebook2 from notebook1 but notebook2 should use only cluster2 for execution.
Currently it is getting triggered using cluster1.
Please let me know if you need more information.
Unfortunately, you cannot start another cluster from the current notebook.
This is expected behaviour: when you trigger notebook2 from notebook1, it will use cluster1 and not cluster2.
Reason: any command you run from notebook1 always runs on the cluster that notebook1 is attached to.
Notebooks cannot be statically assigned to a cluster; that's actually runtime state only. If you want to run some code on a different cluster (in this case, the code is a notebook), then you have to do it by having your first notebook submit a separate job, rather than using dbutils.notebook.run or %run.
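A minimal sketch of that approach, assuming the Jobs API runs/submit endpoint and a personal access token; the workspace URL, token, cluster ID and notebook path are placeholders, not values from the question:

import requests

# Placeholders - substitute your own workspace URL, token, cluster ID and notebook path.
host = "https://<your-workspace>.azuredatabricks.net"
token = "<personal-access-token>"

payload = {
    "run_name": "run notebook2 on cluster2",
    "tasks": [
        {
            "task_key": "run_notebook2",
            "existing_cluster_id": "<cluster2-id>",                  # forces execution on cluster2
            "notebook_task": {"notebook_path": "/Users/me/notebook2"},
        }
    ],
}

# Submit a one-time run; it executes on cluster2 regardless of which cluster
# notebook1 (the caller) is attached to.
resp = requests.post(
    f"{host}/api/2.1/jobs/runs/submit",
    headers={"Authorization": f"Bearer {token}"},
    json=payload,
)
resp.raise_for_status()
print(resp.json())   # returns a run_id you can poll with /api/2.1/jobs/runs/get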
Notebook Job Details:
Hope this helps.
