Can't list HDInsight clusters - azure

I'm trying to use the azure command-line interface.
I imported the manifest file and can run azure hdinsight -h and azure account list (which shows the correct credentials).
However, I'm unable to list my HDInsight clusters with
azure hdinsight cluster list
This returns the following error:
- Getting HDInsight servers
error: tunneling socket could not be established, cause=1500:error:140770FC:SSL routines:SSL23_GET_SERVER_HELLO:unknown protocol:openssl\ssl\s23_clnt.c:766:
info: Error information has been recorded to azure.err
error: hdinsight cluster list command failed
I get a similar error message when doing azure hdinsight account storage create storagename
Did I miss a step in the installation, or is something else going wrong? I'm working behind a proxy and have http_proxy and https_proxy set correctly.
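For reference, a minimal sanity check of the proxy setup looks like the sketch below (the proxy host and port are placeholders). The Node-based azure CLI tunnels HTTPS calls through the proxy taken from these environment variables, and this particular "unknown protocol" OpenSSL failure often points at an https:// proxy URL when the proxy itself only speaks plain HTTP, so the scheme is worth double-checking:

# Hedged sketch; replace proxy.contoso.com:8080 with your corporate proxy.
# Note the http:// scheme; an https:// scheme against an HTTP-only proxy can
# produce exactly the "SSL23_GET_SERVER_HELLO:unknown protocol" error above.
$env:http_proxy  = "http://proxy.contoso.com:8080"
$env:https_proxy = "http://proxy.contoso.com:8080"
azure hdinsight cluster list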

To proceed with the project, you could also launch PowerShell from the portal itself and run the command from there (it is called Cloud Shell in Azure; click the Cloud Shell icon in the portal's top toolbar). I launched the PowerShell window from the portal, executed "azure hdinsight cluster list", and it returned the list of my clusters.
More details about Azure PowerShell in Cloud Shell:
https://azure.microsoft.com/en-us/blog/powershell-comes-to-azure-cloud-shell/
and
https://learn.microsoft.com/en-us/azure/cloud-shell/quickstart-powershell
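(The classic "azure" xplat CLI used above has since been retired from Cloud Shell; a rough present-day equivalent in a Cloud Shell PowerShell session, assuming the Az.HDInsight module, would be:)

# Hedged sketch; lists the HDInsight clusters in the current subscription with the Az module.
Get-AzHDInsightCluster | Select-Object Name, Location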

Related

Run ADX script on cluster scope with bicep

I use Azure DevOps pipelines. You can run a script at the database level with Bicep; that is clearly documented. But I want to run a script at the cluster level to update the workload_group policy and increase the allowed concurrent queries. When I run the command as part of the Bicep deployment (on the database script property), it fails with the following error:
Reason: Not a database-scope command
How can I run this command (which should indeed run at the cluster level) as part of the Bicep deployment? I use the following command, which does work when I run it in the query window in the Azure portal.
.create-or-alter workload_group ['default'] ```
<<workgroupConfig>>
```.
I also know there are Azure DevOps tasks for running scripts against the database, but I would prefer not to use those, since the Data Explorer cluster is in a private network and not publicly accessible.
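For illustration only: one way to issue a cluster-scope control command from a pipeline step outside Bicep is to post it to the cluster's management endpoint directly (the agent still needs network access to the private cluster). The cluster URI and the policy values below are hypothetical placeholders, not the <<workgroupConfig>> used above:

# Hedged sketch; all names and values are placeholders, and Az.Accounts is assumed to be available.
$clusterUri = "https://mycluster.westeurope.kusto.windows.net"
$token = (Get-AzAccessToken -ResourceUrl $clusterUri).Token
# Cluster-scope control command; the policy JSON is passed as a KQL multi-line string literal.
$csl = @'
.create-or-alter workload_group ['default'] ```
{ "RequestRateLimitPolicies": [ { "IsEnabled": true, "Scope": "WorkloadGroup",
    "LimitKind": "ConcurrentRequests", "Properties": { "MaxConcurrentRequests": 50 } } ] }
```
'@
$body = @{ db = "NetDefaultDB"; csl = $csl } | ConvertTo-Json
Invoke-RestMethod -Method Post -Uri "$clusterUri/v1/rest/mgmt" `
    -Headers @{ Authorization = "Bearer $token" } -ContentType "application/json" -Body $body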

Azure pipeline 'WinRMCustomScriptExtension' underlying connection was closed in non-public VM

In an Azure pipeline, when creating a VM through a deployment template, we have the option 'Configure with WinRM agent'.
This acts as a custom script extension behind the scenes, but the download of that extension can be blocked by an internal VNet in Azure. This is the error we are getting:
<datetime> Adding extension 'WinRMCustomScriptExtension' on virtual machine <vmname>
<datetime> Failed to add the extension to the vm: <vmname>. Error: "VM has reported a failure when processing extension 'WinRMCustomScriptExtension'. Error message: \"Failed to download all specified files. Exiting. Error Message: The underlying connection was closed: An unexpected error occurred on a send.\"\r\n\r\nMore information on troubleshooting is available at https://aka.ms/VMExtensionCSEWindowsTroubleshoot "
Since the files cannot be downloaded, I am thinking of a couple of solutions:
1. How can I know which PowerShell files Azure is using to set up WinRM?
2. Store the files in a storage account (in the same VNet as the VM).
3. Perhaps not use WinRM at all and use a custom script extension to do everything (with all files from the storage account). I hope an error from the extension stops the pipeline if it happens.
Is there a better solution? To me it looks like a design gap on Azure's side, since non-public VMs are not covered.
EDIT:
Found the answer to #1: https://aka.ms/vstsconfigurewinrm. This was shown in the raw logs of the pipeline when diagnostics were enabled.
Even if you know, how does it help you? It won't be able to download them anyway, and you can't really tell it to use local files.
If you enable service endpoints and allow your subnet to talk to the storage account, it should work (see the sketch below).
There is a way to configure WinRM when you create the VM. Keyvault example
You could use the script extension as you wanted to as well, but the script extension also has to download files to the VM. Example
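As a rough illustration of the service-endpoint suggestion above (resource names are placeholders, and the Az modules are assumed):

# Hedged sketch; enable the Microsoft.Storage service endpoint on the VM subnet,
# then allow that subnet through the storage account firewall so the extension files can be fetched.
$vnet   = Get-AzVirtualNetwork -Name "my-vnet" -ResourceGroupName "my-rg"
$subnet = Get-AzVirtualNetworkSubnetConfig -Name "vm-subnet" -VirtualNetwork $vnet
Set-AzVirtualNetworkSubnetConfig -Name "vm-subnet" -VirtualNetwork $vnet `
    -AddressPrefix $subnet.AddressPrefix -ServiceEndpoint "Microsoft.Storage" | Set-AzVirtualNetwork
Add-AzStorageAccountNetworkRule -ResourceGroupName "my-rg" -Name "mystorageacct" `
    -VirtualNetworkResourceId $subnet.Id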

How to Pass Variables into Azure Databricks Cluster Init Script

I'm trying to use workspace environment variables to pass access tokens into my custom cluster init scripts.
It appears that there are only a few supported environment variables that we can access in our custom cluster init scripts as described at https://docs.databricks.com/clusters/init-scripts.html#environment-variables
I've attempted to write to the base cluster configuration using
Microsoft.Azure.Databricks.Client.SparkEnvironmentVariables.Add("WORKSPACE_ID", workspaceId)
My init script is still failing to pick up this variable in the following line:
[[ -z "${WORKSPACE_ID}" ]] && LOG_ANALYTICS_WORKSPACE_ID='default' || LOG_ANALYTICS_WORKSPACE_ID="${WORKSPACE_ID}"
With the above lines of code, my init script causes the cluster to fail with the following error:
Spark Error: Spark encountered an error on startup. This issue can be caused by
invalid Spark configurations or malfunctioning init scripts. Please refer to the Spark
driver logs to troubleshoot this issue, and contact Databricks if the problem persists.
Internal error message: Spark error: Driver down
The logs don't say that any part of my bash script is failing, so I'm assuming that it's just failing to pick up the variable from the environment variables.
Has anyone else dealt with this problem? I realize that I could write this information to DBFS and then read it in the init script, but I'd like to avoid that since I'll be passing in access tokens. What other approaches can I try?
Thanks for any help!
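One documented route worth sketching (all names below are hypothetical): environment variables set on the cluster itself, via the spark_env_vars field in the Clusters API or the "Environment Variables" box under Advanced Options in the UI, are exported to init scripts, and the value can reference a Databricks secret rather than a raw token. A minimal cluster-create call might look like:

# Hedged sketch; host, token, secret scope/key, runtime and node types are placeholders.
$headers = @{ Authorization = "Bearer $env:DATABRICKS_TOKEN" }
$body = @{
    cluster_name   = "demo-cluster"
    spark_version  = "11.3.x-scala2.12"
    node_type_id   = "Standard_DS3_v2"
    num_workers    = 1
    # Exported to init scripts; the secret reference keeps the token out of the cluster spec.
    spark_env_vars = @{ WORKSPACE_ID = "{{secrets/monitoring/log-analytics-workspace-id}}" }
    init_scripts   = @(@{ dbfs = @{ destination = "dbfs:/databricks/init/my-init.sh" } })
} | ConvertTo-Json -Depth 5
Invoke-RestMethod -Method Post -Uri "https://$env:DATABRICKS_HOST/api/2.0/clusters/create" `
    -Headers $headers -ContentType "application/json" -Body $body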
This article shows how to send application logs and metrics from Azure Databricks to a Log Analytics workspace. It uses the Azure Databricks Monitoring Library, which is available on GitHub.
Prerequisites: Configure your Azure Databricks cluster to use the monitoring library, as described in the GitHub readme.
Steps to build the Azure monitoring library and configure an Azure Databricks cluster:
Step 1: Build the Azure Databricks monitoring library.
Step 2: Create and configure the Azure Databricks cluster.
For more details, refer to "Monitoring Azure Databricks".
Hope this helps.

Failed to create on-demand Hadoop cluster in Azure Data Factory; additionalProperties is not active

It's my first time trying out Azure Data Factory, so I hope this is not a bad question to ask.
I'm using the Azure portal to create an on-demand Hadoop cluster as one of the linked services in Azure Data Factory, following the steps in the tutorial.
But whenever I click create, the following error message pops up.
Failed to save HDinisghtLinkedService. Error: An additional property 'subnetName' has been specified but additionalProperties is not active.The relevant property is 'HDInsightOnDemandLinkedServiceTypeProperties'.The error occurred at the location 'body/properties/typeProperties' in the request.;An additional property 'virtualNetworkId' has been specified but additionalProperties is not active.The relevant property is 'HDInsightOnDemandLinkedServiceTypeProperties'.The error occurred at the location 'body/properties/typeProperties' in the request.
I don't understand why it requires 'subnetName' and 'virtualNetworkId'. I tried putting values under Advanced Properties -> Choose Vnet and Subnet -> From Azure subscription, and entered the existing virtual network ID and subnet name, but the problem is still present and the same error message shows up.
Other background information:
For the tutorial linked above, I did not use its PowerShell code. I have an existing resource group and created a new storage account in the Azure portal.
I also created a new app registration in Azure Active Directory and retrieved the service principal application ID and authentication key following this link.
Some parameters:
Type: On-demand HDInsight
Azure Storage Linked Service: the one listed in the connection
Cluster size: 1 (for testing)
Service principal id/service principal key: described above
Version: 3.6
...
Any thoughts or anything I might be doing wrong?
The error message clearly states that 'subnetName' is not active, which means it has not been created at all.
Note: If you want to create the on-demand cluster within your VNet, first create the VNet and subnet and then pass those values.
Advanced Properties are not mandatory for creating an on-demand cluster.
Have you tried creating the on-demand cluster without passing the VNet and subnet?
Hope this helps. Do let us know if you have any further queries.
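To illustrate that suggestion, a linked-service definition without the optional VNet properties (every value below is a placeholder) could also be published with Az.DataFactory instead of the portal:

# Hedged sketch; all values are placeholders.
$definition = @'
{
  "name": "HDInsightOnDemandLinkedService",
  "properties": {
    "type": "HDInsightOnDemand",
    "typeProperties": {
      "clusterType": "hadoop",
      "clusterSize": 1,
      "version": "3.6",
      "timeToLive": "00:30:00",
      "hostSubscriptionId": "<subscription-id>",
      "clusterResourceGroup": "<resource-group>",
      "tenant": "<tenant-id>",
      "servicePrincipalId": "<application-id>",
      "servicePrincipalKey": { "type": "SecureString", "value": "<authentication-key>" },
      "linkedServiceName": { "referenceName": "AzureStorageLinkedService", "type": "LinkedServiceReference" }
    }
  }
}
'@
Set-Content -Path .\HDInsightOnDemandLinkedService.json -Value $definition
Set-AzDataFactoryV2LinkedService -ResourceGroupName "<resource-group>" -DataFactoryName "<factory-name>" `
    -Name "HDInsightOnDemandLinkedService" -DefinitionFile .\HDInsightOnDemandLinkedService.json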

Azure Point-to-Site VPN Resource Manager powershell

I found the link https://azure.microsoft.com/en-us/documentation/articles/vpn-gateway-howto-point-to-site-rm-ps/, which gives detailed instructions on how to create a Point-to-Site VPN connection using PowerShell in the new Azure Resource Manager.
While attempting to run this script, I get the error message: "The term 'Add-AzureRmVpnClientRootCertificate' is not recognized as the name of a cmdlet".
I am currently running Azure Powershell version 1.0.1 and this reference, https://msdn.microsoft.com/en-us/library/mt653593.aspx, indicates that it should be available in version 1.0.
What am I doing wrong?
It looks like you need at least Azure PowerShell 1.0.4 to get this cmdlet. If you look at the GitHub source for this cmdlet at https://github.com/Azure/azure-powershell/blob/master/src/ResourceManager/Network/Commands.Network/VirtualNetworkGateway/AddAzureVpnClientRootCertificateCommand.cs, it looks like it was added with the commit for 1.0.4: https://github.com/Azure/azure-powershell/commit/09b5f57ff798ca90aeb84e73fbd88f406d7edd7c.
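A quick local check (assuming the AzureRM modules came from the PowerShell Gallery) is to confirm the installed AzureRM.Network version and whether the cmdlet is present:

# Hedged sketch; assumes a PowerShell Gallery installation of the AzureRM modules.
Get-Module -ListAvailable AzureRM.Network | Select-Object Name, Version
Get-Command Add-AzureRmVpnClientRootCertificate -ErrorAction SilentlyContinue
# If the cmdlet is missing, updating to Azure PowerShell 1.0.4 or later should add it, for example:
# Install-Module AzureRM -MinimumVersion 1.0.4 -Force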
