How to rename Databricks job cluster name during runtime - Azure

I have created an ADF pipeline with a Notebook activity. This activity automatically creates Databricks job clusters with autogenerated job cluster names.
1. Rename Job Cluster during runtime from ADF
I'm trying to rename this job cluster to a process-specific name at runtime, from ADF or the ADF linked service.
Instead of job-59, I want it to be replaced with <process_name>_
2. Rename ClusterName Tag
I want to replace the default generated ClusterName tag with the required process name.

Settings for the job can be updated using the Reset or Update endpoints.
Cluster tags allow you to easily monitor the cost of cloud resources used by various groups in your organization. You can specify tags as key-value pairs when you create a cluster, and Azure Databricks applies these tags to cloud resources like VMs and disk volumes, as well as DBU usage reports.
For detailed information about how pool and cluster tag types work together, see Monitor usage using cluster, pool, and workspace tags.
For convenience, Azure Databricks applies four default tags to each cluster: Vendor, Creator, ClusterName, and ClusterId.
These tags propagate to detailed cost analysis reports that you can access in the Azure portal.
Check out an example of how billing works.
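As a sketch of the tag approach, here is what an Update call against the Jobs 2.1 API might carry. The job_id, job_cluster_key, and tag name are placeholders, and the spark_version/node_type_id values are examples only:

{
  "job_id": 123,
  "new_settings": {
    "job_clusters": [
      {
        "job_cluster_key": "process_cluster",
        "new_cluster": {
          "spark_version": "10.4.x-scala2.12",
          "node_type_id": "Standard_DS3_v2",
          "num_workers": 2,
          "custom_tags": {
            "ProcessName": "<process_name>"
          }
        }
      }
    ]
  }
}

POST this body to /api/2.1/jobs/update. Note that the default ClusterName tag is applied by Azure Databricks itself and, as far as I know, cannot be overridden; a custom tag like the one above propagates to the same cost reports.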

Related

"No Isolation Shared" Databricks job cluster through CLI

I turned on Unity Catalog for our workspace. Now a job cluster has an access mode setting (docs). I can manually change this setting in the UI.
But how do I control this setting when creating the job through databricks jobs create --json-file X.json?
You need to specify data_security_mode with the value "NONE" in the cluster definition (for some reason it's missing from the API docs, but you can find details in the Terraform provider docs). Really, though, it should be the default value, so you may not need to specify it explicitly.
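A minimal sketch of what X.json might contain; everything except data_security_mode is a placeholder (this follows the Jobs 2.0 create format implied by --json-file):

{
  "name": "no-isolation-job",
  "new_cluster": {
    "spark_version": "11.3.x-scala2.12",
    "node_type_id": "Standard_DS3_v2",
    "num_workers": 2,
    "data_security_mode": "NONE"
  },
  "notebook_task": {
    "notebook_path": "/path/to/notebook"
  }
}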

How to create Azure databricks cluster using Service Principal

I have an Azure Databricks workspace, and I added a service principal to that workspace using the Databricks CLI. I have been trying to create a cluster using the service principal but haven't been able to figure it out. Can anyone help me?
I am able to create a cluster using my own account, but I want to create it using the service principal, and I want it to be the owner of the cluster, not me.
Also, is there a way I can transfer the ownership of my cluster to the service principal?
First, answering the second question: no, you can't change the owner of a cluster.
To create a cluster that has the service principal as its owner, you need to execute the creation operation under its identity. To do this, perform the following steps (a combined sketch follows the list):
Prepare a JSON file with the cluster definition as described in the documentation
Set the DATABRICKS_HOST environment variable to the address of your workspace:
export DATABRICKS_HOST=https://adb-....azuredatabricks.net
Generate an AAD token for the service principal as described in the documentation, and assign its value to the DATABRICKS_TOKEN or DATABRICKS_AAD_TOKEN environment variable (see docs).
Create the Databricks cluster using the databricks-cli, providing the name of the JSON file with the cluster specification (docs):
databricks clusters create --json-file create-cluster.json
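Putting the steps together, a combined sketch. The tenant ID, client ID/secret, and cluster sizing are placeholders; the GUID in the scope is the well-known Azure Databricks application ID used when requesting AAD tokens, and jq is assumed to be installed:

export DATABRICKS_HOST=https://adb-....azuredatabricks.net

# Acquire an AAD token for the service principal via the client-credentials flow
export DATABRICKS_TOKEN=$(curl -s -X POST \
  "https://login.microsoftonline.com/<tenant-id>/oauth2/v2.0/token" \
  -d "client_id=<sp-client-id>" \
  -d "client_secret=<sp-client-secret>" \
  -d "grant_type=client_credentials" \
  -d "scope=2ff814a6-3304-4ab8-85cb-cd0e6f879c1d/.default" \
  | jq -r .access_token)

# create-cluster.json: a minimal cluster spec; adjust the runtime and node type for your workspace
cat > create-cluster.json <<'EOF'
{
  "cluster_name": "sp-owned-cluster",
  "spark_version": "10.4.x-scala2.12",
  "node_type_id": "Standard_DS3_v2",
  "num_workers": 2
}
EOF

databricks clusters create --json-file create-cluster.json

Because the token belongs to the service principal, the cluster is created under its identity and it becomes the owner.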
P.S. Another approach (really recommended) is to use the Databricks Terraform provider to script your Databricks infrastructure - it's used by a significant number of Databricks customers and is much easier to use than command-line tools.

Azure Databricks Execution Fail - CLOUD_PROVIDER_LAUNCH_FAILURE

I'm using Azure Data Factory for my data ingestion and using an Azure Databricks notebook through ADF's Notebook activity.
The notebook uses an existing instance pool of Standard DS3_V2 (2-5 nodes, autoscaled) with the 7.3 LTS Spark runtime version. The same Azure subscription is used by multiple teams for their respective data pipelines.
During the ADF pipeline execution, I'm frequently seeing the notebook activity fail with the error message below:
{
  "reason": {
    "code": "CLOUD_PROVIDER_LAUNCH_FAILURE",
    "type": "CLOUD_FAILURE",
    "parameters": {
      "azure_error_code": "SubnetIsFull",
      "azure_error_message": "Subnet /subscriptions/<Subscription>/resourceGroups/<RG>/providers/Microsoft.Network/virtualNetworks/<VN>/subnets/<subnet> with address prefix 10.237.35.128/26 does not have enough capacity for 2 IP addresses."
    }
  }
}
Can anyone explain what this error is and how I can reduce its occurrence? (The documents I found are not very explanatory.)
It looks like your Databricks workspace has been created within a VNet (see this link or this link). When this is done, the Databricks instances are created within one of the subnets of this VNet. It seems that at the time of triggering, all the IPs within the subnet were already in use.
You cannot and should not extend the IP space. Please do not attempt to change the existing VNet configuration, as this will affect your Databricks clusters.
You have the following options.
Check when fewer Databricks instances are being instantiated and schedule your ADF pipeline during that time. Aim to distribute execution over time so you don't exceed the available IPs in the subnet.
Request your IT department to create a new VNet and subnet, and create a new Databricks workspace in this VNet.
The problem arises from the fact that when your workspace was created, the network and subnet sizes weren't planned correctly (see docs). As a result, when you try to launch a cluster, there aren't enough IP addresses in the given subnet, which produces this error.
Unfortunately, it's currently not possible to expand the network/subnet size, so if you need a bigger network you need to deploy a new workspace and migrate into it.
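For a rough sense of the capacity involved (this reflects general VNet-injection behaviour, not something stated in the error itself): a /26 prefix contains 2^(32-26) = 64 addresses, of which Azure reserves 5 per subnet, leaving 59 usable. Since each Databricks node takes one IP in the host subnet and one in the container subnet, a pair of /26 subnets can back at most about 59 concurrently running nodes across all clusters in the workspace.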

Azure Databricks move Log Analytics

The Databricks VMs are pointing to the default Log Analytics workspace, but I want to point them to another one.
If I try to move the VMs to another workspace, it tells me that it's locked:
Error: cannot perform delete operation because following scope(s) are locked
Unfortunately, you are not allowed to move Log Analytics for the managed resource group created by Azure Databricks using the Azure portal.
Reason: By default, you cannot perform any write operation on the managed resource group that was created by Azure Databricks.
If you try to modify anything in the managed resource group, you will see this error message:
{"details":[{"code":"ScopeLocked","message":"The scope '/subscriptions/xxxxxxxxxxxxxxxx/resourceGroups/databricks-rg-chepra-d7ensl75cgiki' cannot perform write operation because following scope(s) are locked: '/subscriptions/xxxxxxxxxxxxxxxxxxxx/resourceGroups/databricks-rg-chepra-d7ensl75cgiki'. Please remove the lock and try again."}]}
Possible way: You can specify tags as key-value pairs when creating or modifying clusters, and Azure Databricks will apply these tags to cloud resources.
Possible way: Configure your Azure Databricks cluster to use the monitoring library.
This article shows how to send application logs and metrics from Azure Databricks to a Log Analytics workspace. It uses the Azure Databricks Monitoring Library.
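As a sketch of what the cluster side of that setup looks like, assuming the mspnp/spark-monitoring library: its init script is copied to DBFS and reads the Log Analytics workspace ID and key from cluster environment variables. The variable names follow that project's README; the DBFS path and placeholder values are assumptions:

{
  "init_scripts": [
    {
      "dbfs": {
        "destination": "dbfs:/databricks/spark-monitoring/spark-monitoring.sh"
      }
    }
  ],
  "spark_env_vars": {
    "LOG_ANALYTICS_WORKSPACE_ID": "<log-analytics-workspace-id>",
    "LOG_ANALYTICS_WORKSPACE_KEY": "<log-analytics-workspace-key>"
  }
}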
Hope this helps.

Update WadCfg "only" of existing Azure Service Fabric cluster?

I want to monitor performance metrics of an existing Service Fabric cluster.
Here is the link of Performance metrics -
https://learn.microsoft.com/en-us/azure/service-fabric/service-fabric-diagnostics-event-generation-perf
I went through this Microsoft documentation -
https://learn.microsoft.com/en-us/azure/service-fabric/service-fabric-diagnostics-perf-wad
My problem is, the ARM template I downloaded when the Service Fabric cluster was created is quite big and contains a lot of parameters, and I don't have the template-parameters file. I think it is possible to rebuild the parameters file, but it would be time-consuming.
Is it possible to download the template and template-parameters file of an existing Service Fabric cluster?
If not, is it possible to just update the "WadCfg" section to add new performance counters?
You can export your entire resource group with all definitions and parameters; there you can find all the parameters (as default parameters) for the resources deployed in the resource group. I've never done it for an SF cluster, but from a quick look at one of my existing resource groups, I could see the cluster definition included.
This link explains how: https://learn.microsoft.com/en-us/azure/azure-resource-manager/resource-manager-export-template
In Summary:
Find the resource group where your cluster is deployed
Open the resource group and navigate to 'Automation Scripts'
Click 'Download' on top bar
Open the ARM template with all definitions
Make the modifications and save
Publish the updates
You could also add it to a library and deploy from there, as guided in the link above.
From the docs: Not all resource types support the export template function. To resolve this issue, manually add the missing resources back into your template.
To be honest, I've never deployed this way other than in test environments, so I am not sure if it is safe for production.
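For reference, the part of the exported template you'd edit looks roughly like this. The counters and sample rates below are only examples, following the structure described in the WAD performance-counter doc linked in the question:

"WadCfg": {
  "DiagnosticMonitorConfiguration": {
    "overallQuotaInMB": 50000,
    "PerformanceCounters": {
      "scheduledTransferPeriod": "PT1M",
      "PerformanceCounterConfiguration": [
        {
          "counterSpecifier": "\\Processor(_Total)\\% Processor Time",
          "sampleRate": "PT15S"
        },
        {
          "counterSpecifier": "\\Memory\\Available MBytes",
          "sampleRate": "PT15S"
        }
      ]
    }
  }
}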
