Azure Databricks move Log Analytics

Azure Databricks move Log Analytics - azure

Databricks VMs are pointing to Default Log Analytics but I want to point them to another one
If I try to move VMs to antoher workpacks it tells me that its locked
Error: cannot perform delete operation because following scope(s) are locked

Unfortunately, you are not allowed to move Log Analytics for the Managed Resource Group created in Azure Databricks using Azure portal.
Reason: By default, you cannot perform any write operation on the managed resource group which created by Azure Databricks.
If you try to modify anything in the managed resource group, you will see this error message:
{"details":[{"code":"ScopeLocked","message":"The scope '/subscriptions/xxxxxxxxxxxxxxxx/resourceGroups/databricks-rg-chepra-d7ensl75cgiki' cannot perform write operation because following scope(s) are locked: '/subscriptions/xxxxxxxxxxxxxxxxxxxx/resourceGroups/databricks-rg-chepra-d7ensl75cgiki'. Please remove the lock and try again."}]}
Possible way: You can specify tags as key-value pairs when while creating/modifying clusters, and Azure Databricks will apply these tags to cloud resources.
Possible way: Configure your Azure Databricks cluster to use the monitoring library.
This article shows how to send application logs and metrics from Azure Databricks to a Log Analytics workspace. It uses the Azure Databricks Monitoring Library.
Hope this helps.

Related

Upscaling/Downscaling provisioned RU for cosmos containers at specific time

As mentioned in the Microsoft documentation there is support to increase/decrease the provisioned RU of cosmos containers using cosmosDB Java SDK but when I am trying to perform the steps I am getting below error:
com.azure.cosmos.CosmosException: {"innerErrorMessage":"\"Operation 'PUT' on resource 'offers' is not allowed through Azure Cosmos DB endpoint. Please switch on such operations for your account, or perform this operation through Azure Resource Manager, Azure Portal, Azure CLI or Azure Powershell\"\r\nActivityId: 86fcecc8-5938-46b1-857f-9d57b7, Microsoft.Azure.Documents.Common/2.14.0, StatusCode: Forbidden","cosmosDiagnostics":{"userAgent":"azsdk-java-cosmos/4.28.0 MacOSX/10.16 JRE/1.8.0_301","activityId":"86fcecc8-5938-46b1-857f-9d57b74c6ffe","requestLatencyInMs":89,"requestStartTimeUTC":"2022-07-28T05:34:40.471Z","requestEndTimeUTC":"2022-07-28T05:34:40.560Z","responseStatisticsList":[],"supplementalResponseStatisticsList":[],"addressResolutionStatistics":{},"regionsContacted":[],"retryContext":{"statusAndSubStatusCodes":null,"retryCount":0,"retryLatency":0},"metadataDiagnosticsContext":{"metadataDiagnosticList":null},"serializationDiagnosticsContext":{"serializationDiagnosticsList":null},"gatewayStatistics":{"sessionToken":null,"operationType":"Replace","resourceType":"Offer","statusCode":403,"subStatusCode":0,"requestCharge":"0.0","requestTimeline":[{"eventName":"connectionAcquired","startTimeUTC":"2022-07-28T05:34:40.472Z","durationInMicroSec":1000},{"eventName":"connectionConfigured","startTimeUTC":"2022-07-28T05:34:40.473Z","durationInMicroSec":0},{"eventName":"requestSent","startTimeUTC":"2022-07-28T05:34:40.473Z","durationInMicroSec":5000},{"eventName":"transitTime","startTimeUTC":"2022-07-28T05:34:40.478Z","durationInMicroSec":60000},{"eventName":"received","startTimeUTC":"2022-07-28T05:34:40.538Z","durationInMicroSec":1000}],"partitionKeyRangeId":null},"systemInformation":{"usedMemory":"71913 KB","availableMemory":"3656471 KB","systemCpuLoad":"empty","availableProcessors":8},"clientCfgs":{"id":1,"machineId":"uuid:248bb21a-d1eb-46a5-a29e-1a2f503d1162","connectionMode":"DIRECT","numberOfClients":1,"connCfg":{"rntbd":"(cto:PT5S, nrto:PT5S, icto:PT0S, ieto:PT1H, mcpe:130, mrpc:30, cer:false)","gw":"(cps:1000, nrto:PT1M, icto:PT1M, p:false)","other":"(ed: true, cs: false)"},"consistencyCfg":"(consistency: Session, mm: true, prgns: [])"}}}
at com.azure.cosmos.BridgeInternal.createCosmosException(BridgeInternal.java:486)
at com.azure.cosmos.implementation.RxGatewayStoreModel.validateOrThrow(RxGatewayStoreModel.java:440)
at com.azure.cosmos.implementation.RxGatewayStoreModel.lambda$toDocumentServiceResponse$0(RxGatewayStoreModel.java:347)
at reactor.core.publisher.FluxMap$MapSubscriber.onNext(FluxMap.java:106)
at reactor.core.publisher.FluxSwitchIfEmpty$SwitchIfEmptySubscriber.onNext(FluxSwitchIfEmpty.java:74)
at reactor.core.publisher.FluxPeek$PeekSubscriber.onNext(FluxPeek.java:200)
at reactor.core.publisher.FluxHandle$HandleSubscriber.onNext(FluxHandle.java:119)
Message says to switch on such operations for your accounts but I could not find any page to do that. Can I use Azure functions to do the same thing at a specific time?
Code snippet:
CosmosAsyncContainer container = client.getDatabase("DatabaseName").getContainer("ContainerName");
ThroughputProperties autoscaleContainerThroughput = container.readThroughput().block().getProperties();
container.replaceThroughput(ThroughputProperties.createAutoscaledThroughput(newAutoscaleMaxThroughput)).block();

This is because disableKeyBasedMetadataWriteAccess is set to true on the account. You will need to contact either your subscription owner or someone with DocumentDB Account Contributor to modify the throughput using PowerShell or azure cli, links to samples. You can also do this by redeploying the ARM template or Bicep file used to create the account (be sure to do a GET first on the resource so you don't accidentally change something.
If you are looking for a way to automatically scale resources up and down on a schedule, please refer to this sample here, Scale Azure Cosmos DB throughput by using Azure Functions Timer trigger
To learn more about the disableKeyBasedMetadataWriteAccess property and it's impact to control plane operations from the data plane SDK's see, Preventing changes from the Azure Cosmos DB SDKs

How to create Azure databricks cluster using Service Principal

I have azure databricks workspace and I added service principal in that workspace using databricks cli. I have been trying to create cluster using service principal and not able to figure it. Can any help me?
I am able to create cluster using my account but I want to create using Service Principal and want it to be the owner of the cluster not me.
Also, it there a way I can transfer the ownership of my cluster to Service Principal?

First, answering the second question - no, you can't change the owner of the cluster.
To create a cluster that will have Service Principal as owner you need to execute creation operation under its identity. To do this you need to perform following steps:
Prepare a JSON file with cluster definition as described in the documentation
Set DATABRICKS_HOST environment variable to an address of your workspace:
export DATABRICKS_HOST=https://adb-....azuredatabricks.net
Generate AAD token for Service principal as described in documentation and assign its value to DATABRICKS_TOKEN or DATABRICKS_AAD_TOKEN environment variables (see docs).
Create Databricks cluster using databricks-cli providing name of JSON file with cluster specification (docs):
databricks clusters create --json-file create-cluster.json
P.S. Another approach (really recommended) is to use Databricks Terraform provider to script your Databricks infrastructure - it's used by significant number of Databricks customers, and much easier to use compared with command-line tools.

Azure Synapse severless SQL pool - query execution fails

After completing tutorial 1, I am working on this tutorial 2 from Microsoft Azure team to run the following query (shown in step 3). But the query execution gives the error shown below:
Question: What may be the cause of the error, and how can we resolve it?
Query:
SELECT
TOP 100 *
FROM
OPENROWSET(
BULK 'https://contosolake.dfs.core.windows.net/users/NYCTripSmall.parquet',
FORMAT='PARQUET'
) AS [result]
Error:
Warning: No datasets were found that match the expression 'https://contosolake.dfs.core.windows.net/users/NYCTripSmall.parquet'. Schema cannot be determined since no files were found matching the name pattern(s) 'https://contosolake.dfs.core.windows.net/users/NYCTripSmall.parquet'. Please use WITH clause in the OPENROWSET function to define the schema.
NOTE: The path of the file in the container is correct, and actually I generated the following query just by right clicking the file inside container and generated the script as shown below:
Remarks:
Azure Data Lake Storage Gen2 account name: contosolake
Container name: users
Firewall settings used on the Azure Data lake account:
Azure Data Lake Storage Gen2 account is allowing public access (ref):
Container has required access level (ref)
UPDATE:
The owner of the subscription is someone else, and I did not get the option Check the "Assign myself the Storage Blob Data Contributor role on the Data Lake Storage Gen2 account" box described in item 3 of Basics tab > Workspace details section of tutorial 1. I also do not have permissions to add roles - although I'm the owner of synapse workspace. So I am using workaround described in the Configure anonymous public read access for containers and blobs from Azure team.

--Workaround
If you are unable to granting Storage Blob Data Contributor, use ACL to grant permissions.
All users that need access to some data in this container also needs
to have the EXECUTE permission on all parent folders up to the root
(the container). Learn more about how to set ACLs in Azure Data Lake
Storage Gen2.
Note:
Execute permission on the container level needs to be set within the
Azure Data Lake Gen2. Permissions on the folder can be set within
Azure Synapse.
Go to the container holding NYCTripSmall.parquet.
--Update
As per your update in comments, it seems you would have to do as below.
Contact the Owner of the storage account, and ask them to perform the following tasks:
Assign the workspace MSI to the Storage Blob Data Contributor role on
the storage account
Assign you to the Storage Blob Data Contributor role on the storage
account
--
I was able to get the query results following the tutorial doc you have mentioned for the same dataset.
Since you confirm that the file is present and in the right path, refresh linked ADLS source and publish query before running, just in case if a transient issue.
Two things I suspect are
Try setting Microsoft network routing in Network Routing settings in ADLS account.
Check if built-in pool is online and you have atleast contributer roles on both Synapse workspace and Storage account. (If the current credentials using to run the query has not created the resources)

Fail to create in-demand hadoop cluster in Azure Data Factory; additionalProperties is not active

It's my first time trying out the Azure data factory so I hope this is not a bad question to ask.
So I'm using the Azure portal trying to create an on-demand hadoop cluster as one of the linked service in Azure Data Factory following the steps in the tutorial.
But whenever I click create, the following error message pops up.
Failed to save HDinisghtLinkedService. Error: An additional property 'subnetName' has been specified but additionalProperties is not active.The relevant property is 'HDInsightOnDemandLinkedServiceTypeProperties'.The error occurred at the location 'body/properties/typeProperties' in the request.;An additional property 'virtualNetworkId' has been specified but additionalProperties is not active.The relevant property is 'HDInsightOnDemandLinkedServiceTypeProperties'.The error occurred at the location 'body/properties/typeProperties' in the request.
I couldn't understand why it requires the 'subnetName' and 'virtualNetworkId'. But I tried putting values under Advanced Properties -> Chose Vnet and Subnet -> From Azure subscription -> and put in the existing vitrual network ID and subnet name. But the problem still present and the same error message shows up.
Other background information:
For the tutorial I posted above, I did not use its powershell code. I have existing resource group and created a new storage account on the Azure portal.
I also created a new app registration in Azure Active Directory and retrieve principal service application ID and authentication key following this link
Some parameters:
Type: On-demand HDInsight
Azure Storage Linked Service: the one listed in the connection
Cluster size: 1 (for testing)
Service principal id/service principal key: described above
Version: 3.6
...
Any thoughts or anything I might be doing wrong?

From the error message, it clearly states that “subnetName” is not active, which means it has not created at all.
Note: If you want to create on-demand cluster within your Vnet, then first create Vnet and Subnet and the pass the following values.
Advanced Properties are not mandatory to create a on-demand cluster.
Have you tried created on-demand cluster without passing the Vnet and Subnet?
Hope this helps. Do let us know if you any further queries.

How do I delete Azure Databricks resource group?

I tried following the Quickstart: Run a Spark job on Azure Databricks using the Azure portal as described at: https://learn.microsoft.com/en-us/azure/azure-databricks/quickstart-create-databricks-workspace-portal
But when I later try to delete resource group for that databricks resource I got the following two errors:
Delete resource group databricks-rg-mydatabricksws-5mlo3dio7wef2
failed The resource group databricks-rg-mydatabricksws-5mlo3dio7wef2
is locked and can't be deleted. Click here to manage locks for this
resource group.
UnauthorizedApplicationId "The management lock ... is owned by system
application"
See: https://aka.ms/arm-lock
Lock Deletion Failure The lock named mydatabricksws was unable to be
deleted for the following reasons: {"errorThrown":"Unavailable in
batch","jqXHR":{"responseJSON":{"error":{"code":"UnauthorizedApplicationId","message":"The
management lock 'mydatabricksws' is owned by system application(s)
'd9327919-6775-4843-9037-3fb0fb0473cb'.

I also encountered the same problem before. I get the answer from this link.
Log into your Azure Databricks workspace as the account owner (the user who created the service), and click the user profile Account icon at the top right.
Select Manage Account.
In the Azure Databricks service, click Azure Delete and then OK.
You also could get the Azure Databricks code demo from this document.

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string