HDInsight region is not supported. Region code: ln

I receive the following error on my output dataset in Azure Data Factory:
"HDInsight region is not supported. Region code: ln."
It's a little odd, as I'm not using HDInsight: the pipeline consists of a custom activity in C# running on Azure Batch, plus two storage accounts for experimentation purposes.
The data factory is in North Europe and the rest of the resources are in UK South.
Does HDInsight perhaps power the data movement?
Reading the FAQ, it sounds as though the compute and storage resources can be in separate regions?
Edit:
Here is the activity JSON from inside the pipeline:
"activities": [
{
"type": "DotNetActivity",
"typeProperties": {
"assemblyName": "AzureBatchDemoActivity.dll",
"entryPoint": "AzureBatchDemoActivity.DemoActivity",
"packageLinkedService": "AzureStorageLinkedService",
"packageFile": "/demoactivitycontainer/AzureBatchDemoActivity.zip",
"extendedProperties": {
"SliceStart": "$$Text.Format('{0:yyyyMMddHH-mm}', Time.AddMinutes(SliceStart, 0))"
}
},
"inputs": [
{
"name": "InputDataset"
}
],
"outputs": [
{
"name": "OutputDataset"
}
],
"policy": {
"timeout": "00:30:00",
"concurrency": 2,
"retry": 3
},
"scheduler": {
"frequency": "Hour",
"interval": 1
},
"name": "DemoActivity",
"linkedServiceName": "AzureBatchLinkedService"
}
],

I've been in contact with Azure support in tandem; a very prompt response from them!
It appears to be an incorrect error message when using custom activities along with storage accounts in regions that don't support data movement.
Re-reading the documentation, I see there is a subtlety:
the service powering the data movement in Data Factory is available
globally in several regions.
-- (supported regions)
I read "globally" incorrectly as meaning everywhere, but I should have read it as meaning particular regions around the globe.
I assume that even though I'm using a custom activity, because source and destination storage accounts are involved it's implicitly considered a "data movement" operation.

I had a similar issue (identical error message) running HDInsightOnDemand. There were no problems with the regions of the storage account.
The problem was that the cluster details were not specified in the linked service. I guess ADF was confused about which cluster to create: Linux or Windows, Hadoop or Spark.
Anyway, the solution was to add the following properties to the HDInsightLinkedService:
"properties": {
"type": "HDInsightOnDemand",
"typeProperties": {
"clusterType": "Hadoop",
"osType": "linux",
"version": "3.5",
...
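For context, an on-demand HDInsight linked service also needs the cluster size, a time-to-live, and the storage linked service that backs the cluster. The following is only a minimal sketch in the v1-style JSON used elsewhere on this page, with placeholder values; in Data Factory v2 the storage linked service is supplied as a LinkedServiceReference object rather than a plain name.
{
  "name": "HDInsightOnDemandLinkedService",
  "properties": {
    "type": "HDInsightOnDemand",
    "typeProperties": {
      "clusterType": "Hadoop",
      "osType": "linux",
      "version": "3.5",
      "clusterSize": 1,
      "timeToLive": "00:30:00",
      "linkedServiceName": "AzureStorageLinkedService"
    }
  }
}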

I had this exact issue and found out it is an Azure bug. 'du' is an internal code for a data center in the North Europe region.
HDInsight or storage of Azure Batch region is not supported. Region code: du.
Two resource groups deployed via the same script to the same region produced one working and one broken Data Factory resource. An Azure support engineer told me it was because a data center in that region was new and had not been whitelisted yet.
The recommended workaround was to redeploy the environment and hope the storage account would be deployed to a different data center in that region that is whitelisted.

Related

Want to enable audit diagnostic settings for AKS node resource group resources (NSG and virtual machine scale set) using an ARM template

I am able to enable audit diagnostic settings for AKS using ARM (snippet below, inside the ARM template), but I need to enable the same thing, in the same way, for all the resources in the node resource group, such as the network security group and the virtual machine scale set.
"resources": [
{
"condition": "[parameters('audit_enable')]",
"type": "Microsoft.ContainerService/managedClusters/providers/diagnosticSettings",
"apiVersion": "2021-05-01-preview",
"name": "[clustername]",
"dependsOn": [
"[resourceId('Microsoft.ContainerService/managedClusters', clutername)]"
],
"properties": {
"storageAccountId": "[variables('storageAccountId')]",
"logs": [
{
"categoryGroup": "allLogs",
"enabled": true,
"retentionPolicy": {
"days": 30,
"enabled": true
}
}
],
"metrics": [
{
"category": "AllMetrics",
"enabled": true,
"retentionPolicy": {
"days": 30,
"enabled": true
}
}
]
}
}
]
The statements below are based on our observations and the Azure documentation. We tested this in our local environment by creating a virtual machine scale set and trying to enable a diagnostic setting for it; unfortunately, the diagnostic settings feature is not available for virtual machine scale sets.
As per the Azure documentation, the Azure Diagnostics agent is available for virtual machines only.
Azure Diagnostics extension collects monitoring data from the guest operating system and workloads of Azure virtual machines and other compute resources. It primarily collects data into Azure Storage but also allows you to define data sinks to also send data to other destinations such as Azure Monitor Metrics and Azure Event Hubs.
Here is the reference documentation for creating the diagnostic setting for a virtual machine using an ARM template.
We tried searching for sample ARM templates that create a diagnostic setting for a network security group, but unfortunately we didn't find any. We would suggest going through this documentation on a basic ARM template for diagnostic settings and adapting it to your requirements.
You can also refer to the ARM template samples for diagnostic settings in Azure Monitor.
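As a starting point, the same nested provider pattern used above for the AKS cluster can be pointed at a network security group. The snippet below is only a sketch based on that pattern; parameters('nsgName'), parameters('settingName') and the storage account variable are assumed placeholders, and NSGs expose log categories only (no metrics).
{
  "condition": "[parameters('audit_enable')]",
  "type": "Microsoft.Network/networkSecurityGroups/providers/diagnosticSettings",
  "apiVersion": "2021-05-01-preview",
  "name": "[concat(parameters('nsgName'), '/Microsoft.Insights/', parameters('settingName'))]",
  "dependsOn": [
    "[resourceId('Microsoft.Network/networkSecurityGroups', parameters('nsgName'))]"
  ],
  "properties": {
    "storageAccountId": "[variables('storageAccountId')]",
    "logs": [
      {
        "categoryGroup": "allLogs",
        "enabled": true,
        "retentionPolicy": {
          "days": 30,
          "enabled": true
        }
      }
    ]
  }
}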

Azure Data Factory publishing error "404 - File or directory not found"

I have three Data Factories in Azure. I've made several changes to pipelines in the Data Factories (different in each) and now I am no longer able to publish from the Data Factory UI. Previously, publishing worked just fine. I believe the issue started after making changes in the UI and running a DevOps pipeline. The pipeline, however, does not deploy anything to the data factories. It simply makes an artifact of the ADF content.
In two out of the three data factories I've made changes to:
Pipelines: changing the target of the copy activity from blob storage to ADLS
Adding a linked service for an on-premises SQL server.
In the other data factory I made no changes, but the error shows up there as well.
It displays the following error (I've removed sensitive details) in all ADFs:
Publishing error
{"_body":"\r\n\r\n\r\n\r\n404 - File or directory not found.\r\n\r\n\r\n\r\n
Server Error
\r\n
\r\n
\r\n
404 - File or directory not found.
\r\n
The resource you are looking for might have been removed, had its name changed, or is temporarily unavailable.
\r\n
\r\n
\r\n\r\n\r\n","status":404,"ok":false,"statusText":"OK","headers":{},"url":"https://management.azure.com/subscriptions/<subscription id>/resourcegroups/<resource group>/providers/Microsoft.DataFactory/factories/<adf name>/mds/databricks%20notebooks.md?api-version=2018-06-01"}
Clicking on 'Details' gives the following information on the error:
Error code: OK
Inner error code: undefined
Message: undefined
The data factories are almost exact replicas, apart from some additional pipelines and linked services. One of the data factories has a databricks instance in the same resource group and is connected to that. Pipelines have always run successfully. The other data factories have the same linked service for databricks, but have no databricks workspace. It's only there as a template.
The JSON of the databricks linked service looks like this, after removing secret names:
{
  "properties": {
    "type": "AzureDatabricks",
    "annotations": [],
    "typeProperties": {
      "domain": {
        "type": "AzureKeyVaultSecret",
        "store": {
          "referenceName": "LS_keyvault",
          "type": "LinkedServiceReference"
        },
        "secretName": ""
      },
      "authentication": "MSI",
      "workspaceResourceId": {
        "type": "AzureKeyVaultSecret",
        "store": {
          "referenceName": "LS_keyvault",
          "type": "LinkedServiceReference"
        },
        "secretName": ""
      },
      "existingClusterId": {
        "type": "AzureKeyVaultSecret",
        "store": {
          "referenceName": "LS_keyvault",
          "type": "LinkedServiceReference"
        },
        "secretName": ""
      }
    }
  }
}
Solutions I've tried
Added Databricks as a resource provider in the subscriptions, but the same error still shows.
In the data factory that is actually connected to databricks, updated the databricks notebooks path to reference the true location of the notebooks.
The error suggests to me that the issue is related to databricks, but I can't pinpoint the problem. Has anyone solved this issue before?
Thanks!
I've seen similar issues when working directly against the main branch. The publish branch can get stale/out of sync with main, specifically when items get moved or renamed. Here is another post on a related issue; the solution there may help with your situation.
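One more thing that may help: the publish branch itself is configurable through a publish_config.json file in the root folder of the collaboration branch, which tells Data Factory which branch to generate the ARM templates into. A minimal sketch is below (the branch name is a placeholder). If adf_publish has gone stale, disconnecting and reconnecting the Git repository, or deleting the publish branch and publishing again so it is regenerated, is a commonly suggested remediation.
{
  "publishBranch": "factory/adf_publish"
}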

How do I provision throughput on a container?

I created a Cosmos Db account, database, and containers using this ARM template. Deployment via Azure DevOps Release pipeline is working.
I used this ARM template to adjust the database throughput. It is also in a Release pipeline and is working.
Currently the throughput is provisioned at the database level and shared across all containers. How do I provision throughput at the container level? I tried running this ARM template to update throughput at the container level. It appears that once shared throughput is provisioned at the database level there's no way to provision throughput at the container level.
I found this reference document but throughput is not listed. Am I missing something super obvious or is the desired functionality not implemented yet?
UPDATE:
When attempting to update the container with the above template I get the following:
2019-05-29T20:25:10.5166366Z There were errors in your deployment. Error code: DeploymentFailed.
2019-05-29T20:25:10.5236514Z ##[error]At least one resource deployment operation failed. Please list deployment operations for details. Please see https://aka.ms/arm-debug for usage details.
2019-05-29T20:25:10.5246027Z ##[error]Details:
2019-05-29T20:25:10.5246412Z ##[error]NotFound: {
"code": "NotFound",
"message": "Entity with the specified id does not exist in the system.\r\nActivityId: 7ba84...b52b2, Microsoft.Azure.Documents.Common/2.4.0.0"
} undefined
2019-05-29T20:25:10.5246730Z ##[error]Task failed while creating or updating the template deployment.
I was also experiencing the same error:
"code": "NotFound",
"message": "Entity with the specified id does not exist in the system.
I was deploying an ARM template via DevOps pipeline to change the configuration of an existing resource in Azure.
The existing resource had dedicated throughput defined at the container/collection level, and my ARM template was trying to define the throughput at the database level...
Once adjusted, my deployment pipeline worked.
Here is some info on my throughput provisioning fix: https://github.com/MicrosoftDocs/azure-docs/issues/30853
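In other words, the "options" block carrying the throughput moves off the database resource and onto the container resource. A rough sketch of the adjusted container, using the same 2016-03-31 resource shapes that appear elsewhere in this thread (variable names are placeholders), with no throughput set in the database resource's options:
{
  "type": "Microsoft.DocumentDb/databaseAccounts/apis/databases/containers",
  "name": "[concat(variables('accountName'), '/sql/', variables('databaseName'), '/', variables('containerName'))]",
  "apiVersion": "2016-03-31",
  "dependsOn": [ "[resourceId('Microsoft.DocumentDB/databaseAccounts/apis/databases', variables('accountName'), 'sql', variables('databaseName'))]" ],
  "properties": {
    "resource": { "id": "[variables('containerName')]" },
    "options": { "throughput": "[variables('containerThroughput')]" }
  }
}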
I believe you have to create the container with a dedicated throughput, first. I have not seen any documentation for changing a container from shared to dedicated throughput. In the Microsoft documentation, the example is creating containers with both shared and dedicated throughput.
Set throughput on a database and a container
You can combine the two models. Provisioning throughput on both the database and the container is allowed. The following example shows how to provision throughput on an Azure Cosmos database and a container:
You can create an Azure Cosmos database named Z with provisioned throughput of "K" RUs.
Next, create five containers named A, B, C, D, and E within the database. When creating container B, make sure to enable Provision dedicated throughput for this container option and explicitly configure "P" RUs of provisioned throughput on this container. Note that you can configure shared and dedicated throughput only when creating the database and container.
The "K" RUs throughput is shared across the four containers A, C, D, and E. The exact amount of throughput available to A, C, D, or E varies. There are no SLAs for each individual container’s throughput.
The container named B is guaranteed to get the "P" RUs throughput all the time. It's backed by SLAs.
There is a prereq ARM template in a subfolder for the 101-cosmosdb-sql-container-ru-update. In the prereq version, the container has the throughput property set when the container is created. After the container is created with dedicated throughput, the update template works without error. I have tried it out and verified that it works.
{
  "type": "Microsoft.DocumentDB/databaseAccounts/apis/databases",
  "name": "[concat(variables('accountName'), '/sql/', variables('databaseName'))]",
  "apiVersion": "2016-03-31",
  "dependsOn": [ "[resourceId('Microsoft.DocumentDB/databaseAccounts/', variables('accountName'))]" ],
  "properties": {
    "resource": {
      "id": "[variables('databaseName')]"
    },
    "options": { "throughput": "[variables('databaseThroughput')]" }
  }
},
{
  "type": "Microsoft.DocumentDb/databaseAccounts/apis/databases/containers",
  "name": "[concat(variables('accountName'), '/sql/', variables('databaseName'), '/', variables('containerName'))]",
  "apiVersion": "2016-03-31",
  "dependsOn": [ "[resourceId('Microsoft.DocumentDB/databaseAccounts/apis/databases', variables('accountName'), 'sql', variables('databaseName'))]" ],
  "properties": {
    "resource": {
      "id": "[variables('containerName')]",
      "partitionKey": {
        "paths": [
          "/MyPartitionKey1"
        ],
        "kind": "Hash"
      }
    },
    "options": { "throughput": "[variables('containerThroughput')]" }
  }
}
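As an aside, in the newer Microsoft.DocumentDB resource model (API versions 2019-08-01 and later), dedicated container throughput is exposed as a throughputSettings child resource named default, which can be redeployed on its own to change the RU/s once the container exists with dedicated throughput. A sketch with placeholder variables:
{
  "type": "Microsoft.DocumentDB/databaseAccounts/sqlDatabases/containers/throughputSettings",
  "apiVersion": "2021-04-15",
  "name": "[concat(variables('accountName'), '/', variables('databaseName'), '/', variables('containerName'), '/default')]",
  "properties": {
    "resource": {
      "throughput": "[variables('containerThroughput')]"
    }
  }
}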

Copy data from private s3 implementation using Azure Data Factory

I'm looking into using Azure Data Factory for copying files from an S3 bucket that is not hosted at Amazon. The current solution that we are using includes some Azure Functions, a logic app and a web app. With Azure Data Factory this should all be much simpler and easier to maintain.
I've looked into the properties that can be given for an Amazon S3 linked service.
"name": "AmazonS3LinkedService",
"properties": {
"type": "AmazonS3",
"typeProperties": {
"accessKeyId": "<access key id>",
"secretAccessKey": {
"type": "SecureString",
"value": "<secret access key>"
}
},
"connectVia": {
"referenceName": "<name of Integration Runtime>",
"type": "IntegrationRuntimeReference"
}
}
But I do not see a property in the documentation for setting a different host.
My question is, is this possible at all?
I think it is just a connector for Amazon S3. At least at the model layer, there is no property for a different host.

How to delete an Azure Table after it is copied to another storage using ADF

I have 100 tables which I want to copy to another storage account frequently. After the copy is over, I want to delete the source tables. I am able to copy the entities inside the tables to another storage account using the ADF Copy Activity, but I couldn't figure out a way to delete the source tables after a successful copy.
I am using the Data Factory .NET API to create pipelines, datasets etc. I thought of a Custom Activity as the solution, but I'm not sure how to plug this activity into the pipeline through the API?
Any code samples or suggestions are highly appreciated.
As mentioned, we could do that with a Custom Activity.
but not sure how to plug this activity into the pipeline through the API?
We could use the Create or Update Pipeline API to create or update the pipeline.
We can get more info about how to use custom activities in an Azure Data Factory pipeline from this tutorial.
The following is a snippet from the tutorial.
1. Create a custom activity .NET class library project that implements the IDotNetActivity interface.
2. Launch Windows Explorer and navigate to the bin\debug or bin\release folder.
3. Zip all of the files under the bin\release folder and upload the zip to the Azure Storage container customactivitycontainer.
4. Create an Azure Storage linked service.
5. Create an Azure Batch linked service.
Then we can use the Create or Update Pipeline API to create a pipeline that uses the custom activity:
{
  "name": "ADFTutorialPipelineCustom",
  "properties": {
    "description": "Use custom activity",
    "activities": [
      {
        "Name": "MyDotNetActivity",
        "Type": "DotNetActivity",
        "Inputs": [
          {
            "Name": "InputDataset"
          }
        ],
        "Outputs": [
          {
            "Name": "OutputDataset"
          }
        ],
        "LinkedServiceName": "AzureBatchLinkedService",
        "typeProperties": {
          "AssemblyName": "MyDotNetActivity.dll",
          "EntryPoint": "MyDotNetActivityNS.MyDotNetActivity",
          "PackageLinkedService": "AzureStorageLinkedService",
          "PackageFile": "customactivitycontainer/MyDotNetActivity.zip",
          "extendedProperties": {
            "SliceStart": "$$Text.Format('{0:yyyyMMddHH-mm}', Time.AddMinutes(SliceStart, 0))"
          }
        },
        "Policy": {
          "Concurrency": 2,
          "ExecutionPriorityOrder": "OldestFirst",
          "Retry": 3,
          "Timeout": "00:30:00",
          "Delay": "00:00:00"
        }
      }
    ],
    "start": "2016-11-16T00:00:00Z",
    "end": "2016-11-16T05:00:00Z",
    "isPaused": false
  }
}
For how to operate on Azure Storage tables (such as deleting a table), please refer to the Azure Storage documentation.
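To sketch the deletion step itself: assuming the classic WindowsAzure.Storage SDK used by the v1 custom-activity samples, and that the source account's connection string is made available to the activity (for example via extendedProperties or by resolving the linked service), something along these lines could be called from the custom activity's Execute method after the copy has succeeded. This is illustrative code, not part of the tutorial.
using Microsoft.WindowsAzure.Storage;
using Microsoft.WindowsAzure.Storage.Table;

public static class TableCleanup
{
    // Deletes a source table once its contents have been copied elsewhere.
    public static void DeleteSourceTable(string connectionString, string tableName)
    {
        // Connect to the source storage account.
        CloudStorageAccount account = CloudStorageAccount.Parse(connectionString);
        CloudTableClient tableClient = account.CreateCloudTableClient();

        // Delete the table if it still exists; this is a no-op otherwise.
        CloudTable table = tableClient.GetTableReference(tableName);
        table.DeleteIfExists();
    }
}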
