I would like to know whether the AML workspace ID needs to be configured at the initial creation of a Databricks workspace, or whether it can be added as an update afterwards without destroying the workspace.
No, it's not required at creation time - the link to Azure ML can be added after the workspace is created. See the documentation for the steps.
I need to find out who starts a pipeline (triggered manually). In the pipeline runs section there is no information about the user, only about the parent pipeline, if applicable (the "Triggered by" column).
Am I missing something, or is this information not accessible?
EDIT:
More specifically, I would like to know who launched a pipeline that has the status "Triggered by" = "Manual Trigger"
Yes, the process you are following is correct. You can check who ran a pipeline in Azure Synapse, but because of a Synapse RBAC permission issue you do not have the required access to see that information.
Please follow the steps below to solve the permission issue:
Open Synapse Studio -> your workspace, expand the Security section on the left, and select Access control -> Add a Synapse role assignment. Grant your user a role that can view pipeline run details (for example, Synapse Monitoring Operator).
Then check whether you can see who ran your pipelines in Azure Synapse. A CLI equivalent is sketched below.
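For reference, here is a sketch of the same role assignment using the Azure CLI; the workspace name, role, and assignee are placeholders you would replace with your own values:

# Assign a Synapse RBAC role so the user can view pipeline run details
az synapse role assignment create \
  --workspace-name my-synapse-ws \
  --role "Synapse Monitoring Operator" \
  --assignee user@example.com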
References:
https://github.com/MicrosoftDocs/azure-docs/blob/main/articles/synapse-analytics/security/how-to-manage-synapse-rbac-role-assignments.md
https://learn.microsoft.com/en-us/azure/data-factory/monitor-visually
https://learn.microsoft.com/en-us/azure/synapse-analytics/security/synapse-workspace-synapse-rbac-roles
I have an Azure Databricks workspace, and I added a service principal to that workspace using the Databricks CLI. I have been trying to create a cluster using the service principal and am not able to figure it out. Can anyone help me?
I am able to create a cluster using my account, but I want to create it using the service principal and want it to be the owner of the cluster, not me.
Also, is there a way I can transfer the ownership of my cluster to the service principal?
First, answering the second question - no, you can't change the owner of the cluster.
To create a cluster that will have the service principal as its owner, you need to execute the creation operation under the service principal's identity. To do this, perform the following steps:
Prepare a JSON file with the cluster definition as described in the documentation.
Set the DATABRICKS_HOST environment variable to the address of your workspace:
export DATABRICKS_HOST=https://adb-....azuredatabricks.net
Generate an AAD token for the service principal as described in the documentation and assign its value to the DATABRICKS_TOKEN or DATABRICKS_AAD_TOKEN environment variable (see docs).
Create the Databricks cluster using the databricks-cli, providing the name of the JSON file with the cluster specification (docs); an end-to-end sketch follows the command below:
databricks clusters create --json-file create-cluster.json
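Putting the steps together, here is an end-to-end sketch; the service principal credentials, workspace URL, node type, and Spark version are placeholders, and 2ff814a6-3304-4ab8-85cb-cd0e6f879c1d is the well-known Azure Databricks resource ID used when requesting AAD tokens:

# 1. Minimal cluster spec (adjust node type / Spark version to your workspace)
cat > create-cluster.json <<'EOF'
{
  "cluster_name": "sp-owned-cluster",
  "spark_version": "11.3.x-scala2.12",
  "node_type_id": "Standard_DS3_v2",
  "num_workers": 1
}
EOF

# 2. Log in as the service principal and get an AAD token for Azure Databricks
az login --service-principal -u "$SP_CLIENT_ID" -p "$SP_CLIENT_SECRET" --tenant "$TENANT_ID"
export DATABRICKS_HOST=https://adb-1234567890123456.7.azuredatabricks.net
export DATABRICKS_TOKEN=$(az account get-access-token \
  --resource 2ff814a6-3304-4ab8-85cb-cd0e6f879c1d --query accessToken -o tsv)

# 3. The creation runs under the service principal's identity, so it becomes the owner
databricks clusters create --json-file create-cluster.json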
P.S. Another approach (highly recommended) is to use the Databricks Terraform provider to script your Databricks infrastructure - it's used by a significant number of Databricks customers and is much easier to use than the command-line tools.
I have created an ADF pipeline with a Notebook activity. This Notebook activity automatically creates Databricks job clusters with autogenerated job cluster names.
1. Rename Job Cluster during runtime from ADF
I'm trying to rename this job cluster to a process-specific name at runtime, from ADF / the ADF linked service.
Instead of job-59, I want it to be replaced with <process_name>_
2. Rename ClusterName Tag
I want to replace the default generated ClusterName tag with the required process name.
Settings for the job can be updated using the Jobs API Reset or Update endpoints.
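As an illustration, a minimal call to the Update endpoint might look like the following; the job ID, host, token, and new name are placeholder assumptions, not values from the question:

# Partially update a job's settings via the Jobs API Update endpoint
curl -X POST "$DATABRICKS_HOST/api/2.1/jobs/update" \
  -H "Authorization: Bearer $DATABRICKS_TOKEN" \
  -d '{"job_id": 123, "new_settings": {"name": "process_name_job"}}'

Note that Reset replaces the full job settings, while Update patches only the fields you pass.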
Cluster tags allow you to easily monitor the cost of cloud resources used by various groups in your organization. You can specify tags as key-value pairs when you create a cluster, and Azure Databricks applies these tags to cloud resources like VMs and disk volumes, as well as DBU usage reports.
For detailed information about how pool and cluster tag types work together, see Monitor usage using cluster, pool, and workspace tags.
For convenience, Azure Databricks applies four default tags to each cluster: Vendor, Creator, ClusterName, and ClusterId.
These tags propagate to detailed cost analysis reports that you can access in the Azure portal.
Check out an example of how billing works.
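For illustration, custom tags can be supplied in the cluster spec at creation time; here is a minimal sketch against the Clusters API, with placeholder tag keys, values, node type, and Spark version. The default tags, including ClusterName, are applied by Azure Databricks itself, so custom tags are added alongside them rather than replacing them:

# Create a cluster with custom tags applied to its VMs, disks, and DBU reports
curl -X POST "$DATABRICKS_HOST/api/2.0/clusters/create" \
  -H "Authorization: Bearer $DATABRICKS_TOKEN" \
  -d '{
    "cluster_name": "process-cluster",
    "spark_version": "11.3.x-scala2.12",
    "node_type_id": "Standard_DS3_v2",
    "num_workers": 1,
    "custom_tags": {"ProcessName": "my_process", "CostCenter": "1234"}
  }'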
Databricks VMs are pointing to the default Log Analytics workspace, but I want to point them to another one.
If I try to move the VMs to another workspace, it tells me that it's locked:
Error: cannot perform delete operation because following scope(s) are locked
Unfortunately, you are not allowed to move Log Analytics for the managed resource group created by Azure Databricks using the Azure portal.
Reason: by default, you cannot perform any write operation on the managed resource group created by Azure Databricks.
If you try to modify anything in the managed resource group, you will see this error message:
{"details":[{"code":"ScopeLocked","message":"The scope '/subscriptions/xxxxxxxxxxxxxxxx/resourceGroups/databricks-rg-chepra-d7ensl75cgiki' cannot perform write operation because following scope(s) are locked: '/subscriptions/xxxxxxxxxxxxxxxxxxxx/resourceGroups/databricks-rg-chepra-d7ensl75cgiki'. Please remove the lock and try again."}]}
Possible way: You can specify tags as key-value pairs when creating/modifying clusters, and Azure Databricks will apply these tags to cloud resources.
Possible way: Configure your Azure Databricks cluster to use the monitoring library.
This article shows how to send application logs and metrics from Azure Databricks to a Log Analytics workspace. It uses the Azure Databricks Monitoring Library; a sketch of the setup follows.
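As a rough sketch of that setup, the monitoring library is attached through a cluster-scoped init script plus two Spark environment variables that identify the target Log Analytics workspace; the DBFS path, variable names, and cluster values below follow the library's documentation but should be treated as assumptions:

# Upload the library's init script to DBFS (path per the library's docs)
databricks fs cp spark-monitoring.sh dbfs:/databricks/spark-monitoring/spark-monitoring.sh

# Create a cluster that ships its logs and metrics to the target workspace
curl -X POST "$DATABRICKS_HOST/api/2.0/clusters/create" \
  -H "Authorization: Bearer $DATABRICKS_TOKEN" \
  -d '{
    "cluster_name": "monitored-cluster",
    "spark_version": "11.3.x-scala2.12",
    "node_type_id": "Standard_DS3_v2",
    "num_workers": 1,
    "init_scripts": [{"dbfs": {"destination": "dbfs:/databricks/spark-monitoring/spark-monitoring.sh"}}],
    "spark_env_vars": {
      "LOG_ANALYTICS_WORKSPACE_ID": "<target-workspace-id>",
      "LOG_ANALYTICS_WORKSPACE_KEY": "<target-workspace-key>"
    }
  }'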
Hope this helps.
I tried following the quickstart "Run a Spark job on Azure Databricks using the Azure portal", described at https://learn.microsoft.com/en-us/azure/azure-databricks/quickstart-create-databricks-workspace-portal
But when I later tried to delete the resource group for that Databricks resource, I got the following two errors:
Delete resource group databricks-rg-mydatabricksws-5mlo3dio7wef2 failed: The resource group databricks-rg-mydatabricksws-5mlo3dio7wef2 is locked and can't be deleted. Click here to manage locks for this resource group. UnauthorizedApplicationId "The management lock ... is owned by system application". See: https://aka.ms/arm-lock

Lock Deletion Failure: The lock named mydatabricksws was unable to be deleted for the following reasons: {"errorThrown":"Unavailable in batch","jqXHR":{"responseJSON":{"error":{"code":"UnauthorizedApplicationId","message":"The management lock 'mydatabricksws' is owned by system application(s) 'd9327919-6775-4843-9037-3fb0fb0473cb'.
I also encountered the same problem before. I got the answer from this link.
Log in to your Azure Databricks workspace as the account owner (the user who created the service), and click the user profile Account icon at the top right.
Select Manage Account.
On the Azure Databricks service page in the Azure portal, click Delete and then OK. Deleting the workspace itself (rather than the managed resource group) also removes the system-owned lock.
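If you prefer to script it, the workspace (and with it the locked managed resource group) can be deleted with the Azure CLI; the resource group and workspace names below are placeholders, and the command requires the Azure CLI's databricks extension:

# One-time install of the extension: az extension add --name databricks
az databricks workspace delete --resource-group my-rg --name mydatabricksws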
You can also get the Azure Databricks code demo from this document.