Error while starting the azure databricks cluster - databricks

I set up a Databricks instance on Azure using Terraform. The deployment appears to have succeeded, but I get the following error when creating or starting a new cluster:
Message
Cluster terminated. Reason: Cloud provider launch failure
Help
A cloud provider error was encountered while launching worker nodes.
See Knowledge base: Cloud provider initiated terminations for remediations.
Search for error code NetworkingInternalOperationError
Details
Azure error code: NetworkingInternalOperationError
Azure error message: An unexpected error occured while processing the network profile of the VM. Please retry later.
Any idea why this is happening?

Such errors are usually returned when there are temporary problems with the underlying VM infrastructure. They are usually mitigated quickly, so you just need to try again later. It also makes sense to check the Azure Databricks and Azure status pages, which may show whether an outage is in progress.
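The "try again later" advice can be automated with a retry loop that waits longer after each failure. The following is only a minimal sketch: `retry_with_backoff` and its parameters are illustrative names, not part of any Databricks or Azure SDK, and `operation` stands in for whatever call you use to start the cluster.

```python
import time


def retry_with_backoff(operation, max_attempts=5, base_delay=1.0,
                       retry_on=(Exception,), sleep=time.sleep):
    """Call operation() until it succeeds, doubling the wait after each failure.

    retry_on should be narrowed to the exceptions that signal a transient
    failure; catching every Exception is only a placeholder assumption.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except retry_on:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error to the caller
            # Waits base_delay, then 2x, 4x, ... before the next attempt.
            sleep(base_delay * 2 ** (attempt - 1))
```

The `sleep` parameter is injectable so the waiting behaviour can be checked without real delays. For cluster launches a much larger `base_delay` (minutes, not seconds) is probably more appropriate.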

Related

Azure pipelines is giving timeout for helm upgrade

At some point, Azure DevOps pipelines started to return a timeout for Helm tasks after spending 3+ minutes executing the task:
UPGRADE FAILED: timed out waiting for the condition
This was somewhat misleading, because the error occurred at the same time they had an incident with some of the services in Europe.
The actual cause was a lack of capacity in the target AKS cluster. Apparently, the task timed out while waiting for some provisioning to happen, or something like that.

Error with 'auditIfNotExists' while scaling up and scaling down Azure SQL DB using Rest API from Azure Data factory

I have been using the REST API from Azure Data Factory to scale Azure SQL DB up and down.
This ran fine for the last couple of months, but recently I started getting an intermittent error. Sometimes the REST API call from ADF returns the error below:
"internalServerError","message: an unexpected error occurred while processing the request"
But when I check the database, I find that it was scaled up or down successfully.
Even so, the Azure SQL DB activity log shows an error related to the 'auditIfNotExists' policy action.
Can anyone help me with this?
I am using api-version 2017-10-01. Is this policy error related to the API version? How do I decide which API version to use? Should I always use the latest version?
Or is the error related to something else?

Azure data factory data lake analytics linked service - failed because the connected party did not properly respond

I have created an Azure Data Factory to perform some U-SQL activities. While creating a new linked service (Azure Data Lake Analytics), I get the following error.
While searching for this issue, I found that developers are facing various kinds of issues with ADF.
Am I missing anything?
Error: Failed to connect to ADLA account 'ad-cxp-analytics-c11' with error 'An error occurred while sending the request.'. An error occurred while sending the request. Unable to connect to the remote server A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond 40.90.138.193:443 Activity ID: c218e103-f0a8-4b07-811d-014d39607dcc.
I would suggest looking into the IR logs. It appears that the IR is having issues, and the logs should give you more information. Capturing performance metrics on the IR server will also help.
As a short-term solution, you could also try enabling the retry option on the activity.

Azure storage queue polling stops after connectivity issues

I'm experiencing intermittent 503 Service Unavailable errors from Azure Storage.
The WebJobs runner is hosted as a Topshelf service. Because I used JobHost.Start() instead of JobHost.RunAndBlock(), every time I got a 503 from Azure Storage the service ended up in a corrupted stopping state.
After I switched to JobHost.RunAndBlock(), the service runs continuously, but after a 503 exception the queue trigger stops polling the queues.
I use the standard Azure queue trigger bindings, with no manual setup.
Has anyone experienced similar behaviour? How can I recover from such connectivity errors?
Assuming you are using C# for your WebJob, I would suggest using something like Polly or Enterprise Library's Transient Fault Handling Application Block to implement retry logic for occasional errors when using an Azure service, as you might be hitting a throttling threshold (the resource limit of your selected service tier).
Hope it helps!
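To illustrate the recovery behaviour being asked for, here is a minimal, language-neutral sketch in Python of a polling loop that backs off on a transient failure instead of stopping. It is not the WebJobs SDK or Polly (both are .NET libraries); `TransientError`, `poll_queue`, `fetch`, `handle`, and `should_stop` are all hypothetical names introduced for this example.

```python
import time


class TransientError(Exception):
    """Hypothetical stand-in for a retryable failure such as an HTTP 503."""


def poll_queue(fetch, handle, should_stop, idle_delay=1.0, max_delay=60.0,
               sleep=time.sleep):
    """Poll fetch() until should_stop() is true, surviving transient errors."""
    delay = idle_delay
    while not should_stop():
        try:
            message = fetch()
        except TransientError:
            # Back off and keep polling instead of letting the loop die.
            sleep(delay)
            delay = min(delay * 2, max_delay)
            continue
        delay = idle_delay  # a successful call resets the backoff
        if message is None:
            sleep(idle_delay)  # queue is empty: wait before polling again
        else:
            handle(message)
```

The key design choice is that the `except` branch ends in `continue`, so a 503 only delays the next poll; any other exception type still propagates and stops the loop.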

Azure Data Factory (SSIS) Execution Using Integration Runtime Throws "Unexpected Termination"

I have been using Azure Data Factory V2 for a while to execute SSIS packages from the SSISDB catalog.
Today (16-11-2018) I encountered an "Unexpected Termination" failure message without any warning or error message.
Things that I have done:
Executing the SSIS package manually from the SSISDB catalog in SQL Server Management Studio (SSMS). What I noticed is that it took an exceptionally long time to assign the task to a machine. Once the package is assigned to a machine, it throws back the failure message within one or two minutes.
There are 3 SSIS packages that are executed sequentially by the Azure Data Factory pipeline. Often the 1st package executes successfully, but the 2nd and 3rd packages never succeed.
Another error message that I got is "Failed pull task from SSISDB, please check if SSISDB has exceeded its limit".
I hope someone can help me with this issue. I have been searching the web and could not find anything on this subject.
What tier of Azure SQL Server have you provisioned for the SSISDB to run on? If it's too small, it may take too long to start and throw a timeout.
Personally, I've had no problems provisioning an S3 Azure SQL Server.
Hope this helped!
Martin
