ADF activities and on-demand HDInsight instances - azure-hdinsight

I am new to using on-demand HDInsight and have a basic question.
I have multiple activities running simultaneously in separate ADF pipelines, each using an on-demand HDInsight linked service. How many HDInsight instances get created? Is it one instance per activity?
I got a bit confused because the documentation states that each instance created has a time-to-live value; if a new job arrives within that window, the instance processes it. Does the new job need to come from an activity in the same pipeline that originally created the instance, or is the instance shared with activities in other pipelines?
Also, I just wanted to confirm my understanding that the cores used by on-demand instances do not count toward the subscription's core usage count.
Sorry if the questions are very basic, but any help is much appreciated.

Partial answers to my questions are provided below; refer to the comments section above for the open points.
The answer on sharing an instance across pipelines is in the documentation: "If the timetolive property value is appropriately set, multiple pipelines can share the instance of the on-demand HDInsight cluster."
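For reference, here is the shape of such a linked service as a hedged sketch, written as a Python dict mirroring the ADF v2 JSON; every ID and name below is a placeholder. The key setting is timeToLive: any HDInsight activity that references this same linked service while the cluster is still alive reuses it, regardless of which pipeline the activity belongs to.

    # Minimal sketch of an on-demand HDInsight linked service definition
    # (Python dict mirroring the ADF JSON); all IDs/names are placeholders.
    on_demand_hdinsight = {
        "name": "HDInsightOnDemandLinkedService",
        "properties": {
            "type": "HDInsightOnDemand",
            "typeProperties": {
                "clusterType": "hadoop",
                "clusterSize": 4,          # number of worker nodes
                "timeToLive": "00:15:00",  # idle time before teardown
                "version": "3.6",
                "hostSubscriptionId": "<subscription-id>",
                "clusterResourceGroup": "<resource-group>",
                "tenant": "<tenant-id>",
                "servicePrincipalId": "<service-principal-id>",
                "servicePrincipalKey": {"type": "SecureString", "value": "<key>"},
                # Storage account the transient cluster uses as its filesystem:
                "linkedServiceName": {
                    "referenceName": "AzureStorageLinkedService",
                    "type": "LinkedServiceReference",
                },
            },
        },
    }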
Regarding my other question on CPU limits for HDInsight: as per the Azure limits documentation, on-demand HDInsight cores are limited to 60 per subscription, and this is different from the general core limit per subscription.
Also, interestingly, a CPU limit exists for manually created HDInsight clusters as well, as mentioned in a Stack Overflow answer. As of today it is 170 per subscription, obtainable by issuing the PowerShell command Get-AzureHDInsightProperties. Again, I understand this limit is different from the subscription's general core limit.

Related

Using an Azure Cosmos DB container to move an object to a different status

I wonder if it's a good idea to use Azure Cosmos DB containers to manage an entity's status. For example, an employee's reimbursement can have different statuses like submitted, approved, paid, etc. Do you see any problem with creating a separate container for "Submitted", "Approved", etc.? They would contain similar reimbursement objects, but with slightly different data points depending on status. For example, the Submitted container could have the manager's name as the approver, while the Paid container could have the payment method.
In other words, it's like a persistent queue: an item is moved out of one container and into the next as the workflow progresses.
Are there any concerns with this approach? Does the Azure "provisioned throughput" pricing model charge by the container? Meaning the more containers you have, the more expensive it gets? Or is it at the database level, so that I can have as many containers as I want and only the queries are charged?
Sorry for the newbie questions; I'm just learning Cosmos. Thanks for any advice!
It's a two-part question :).
The first part (single container vs. multiple containers) is basically an opinion-based question. I would have opted for the single-container approach, as it would have given me just one place to look for the current status of an item. But that's just my opinion :).
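To make the single-container option concrete, here is a hedged sketch with the azure-cosmos Python SDK; the account, container, and field names are all made up for illustration, not taken from the question:

    from azure.cosmos import CosmosClient, PartitionKey

    client = CosmosClient("<account-url>", credential="<account-key>")
    db = client.create_database_if_not_exists("expenses")
    container = db.create_container_if_not_exists(
        id="reimbursements",
        partition_key=PartitionKey(path="/employeeId"),
    )

    # One document per reimbursement; status is just a field, and the
    # status-specific data points live on the same document.
    container.upsert_item({
        "id": "reimb-1001",
        "employeeId": "emp-42",
        "status": "Submitted",
        "approver": "Jane Manager",  # only meaningful in Submitted/Approved
    })

    # "Moving" an item through the workflow is a field update,
    # not a cross-container migration:
    item = container.read_item("reimb-1001", partition_key="emp-42")
    item["status"] = "Approved"
    container.replace_item(item, item)

    # Everything in a given status is one query away:
    approved = container.query_items(
        "SELECT * FROM c WHERE c.status = 'Approved'",
        enable_cross_partition_query=True,
    )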
Regarding your question about the pricing model, Azure Cosmos DB offers you both.
You can provision throughput at the container level as well as at the database level. When you provision throughput at the database level, all containers (up to 25) in that database share the database's throughput.
You can even mix and match the two approaches, i.e. you can have throughput provisioned at the database level and then have some containers share the database's throughput while other containers have dedicated throughput.
Please note that once the throughput type (fixed, shared, or auto-scale) is configured at the database/container level, it can't be changed. You will have to delete the resource and create a new one with the changed throughput type.
You can learn more about throughput here: https://learn.microsoft.com/en-us/azure/cosmos-db/set-throughput.
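Both modes in one hedged sketch with the azure-cosmos Python SDK; the RU/s numbers and names are arbitrary, and offer_throughput is what decides where the throughput lives:

    from azure.cosmos import CosmosClient, PartitionKey

    client = CosmosClient("<account-url>", credential="<account-key>")

    # Database-level (shared) throughput: containers created without their
    # own throughput share these 1000 RU/s.
    shared_db = client.create_database_if_not_exists("shared-db", offer_throughput=1000)
    shared_db.create_container_if_not_exists(
        id="shares-db-throughput",
        partition_key=PartitionKey(path="/pk"),
    )

    # Mix and match: a container in the same database gets dedicated
    # throughput simply by passing offer_throughput at the container level.
    shared_db.create_container_if_not_exists(
        id="dedicated-throughput",
        partition_key=PartitionKey(path="/pk"),
        offer_throughput=400,
    )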

Cannot create Compute Instance Microsoft Azure ML

I am new to working with Microsoft Azure and I am trying to open a notebook from the Azure Machine Learning studio.
Every time I try to create a new compute instance, it says "Creation failed", so I cannot work. My region is francecentral and I have tried different virtual machine sizes.
The reason might be explained here:
"As demand continues to grow, if we are faced with any capacity constraints in any region during this time, we have established clear criteria for the priority of new cloud capacity. Top priority will be going to first responders, health and emergency management services, critical government infrastructure organizational use, and ensuring remote workers stay up and running with the core functionality of Teams."
If you qualify for this category, you should reach out to Azure Support or your Microsoft representative. If not, you need to keep retrying (might work better at night) or try a different region.

I am getting this error while creating a new HDInsight cluster

"You do not have the minimum cores available,12, to create a cluster in Est des Etats-unis, please select different location or subscription".
I tried with two subscriptions and many locations, but I got the same error.
You need to follow this document to request a quota increase for the specific region you are interested in. This is a "routine" operation; it usually takes a couple of days (faster if you have EA support). Once your request is fulfilled, you can proceed with creating your cluster.
Each subscription in Azure has its own quotas you have to adhere to. By default you are limited to 10 cores for most VM types.
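Before filing the request, it may help to see what is actually free in a region. A hedged sketch with azure-identity and azure-mgmt-compute (the subscription ID and region are placeholders); it mirrors what `az vm list-usage` shows:

    from azure.identity import DefaultAzureCredential
    from azure.mgmt.compute import ComputeManagementClient

    client = ComputeManagementClient(DefaultAzureCredential(), "<subscription-id>")

    # List per-family core quotas for the region and flag how much is free.
    for usage in client.usage.list("eastus"):
        if "Cores" in usage.name.value:
            free = usage.limit - usage.current_value
            print(f"{usage.name.localized_value}: "
                  f"{usage.current_value}/{usage.limit} used, {free} free")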

Compare: Azure Functions vs Azure Batch

Can we use Azure Functions along with Azure Batch? Please advise.
I am working on a POC to decide which one to use for our background processes.
I too was in a similar dilemma until I tried both of them for my use case.
The major difference between the two is that an Azure Function (on the Consumption plan) has a hard timeout limit, 5 minutes by default and at most 10 minutes, which you cannot exceed. What I mean is that if your script runs beyond that timeout, the Functions runtime will kill it automatically.
Azure Batch, on the other hand, is essentially a configuration of pools of VMs in which you can run long-running jobs without worrying about execution time. The pools are ordinary VMs (at low cost too, especially with low-priority instances). The difference between Batch and raw Azure VMs is that Batch lets you configure periodic jobs out of the box, whereas on raw Azure VMs you have to write the code so that it executes like a periodic job.
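To illustrate the pool/job model, a hedged sketch with the azure-batch Python SDK; the account URL, keys, VM size, and image below are placeholders, and low-priority nodes are used to show the low-cost angle:

    from azure.batch import BatchServiceClient
    from azure.batch.batch_auth import SharedKeyCredentials
    import azure.batch.models as batchmodels

    creds = SharedKeyCredentials("<account-name>", "<account-key>")
    client = BatchServiceClient(
        creds, batch_url="https://<account>.<region>.batch.azure.com")

    # A small pool of cheap low-priority Linux VMs for long-running work.
    client.pool.add(batchmodels.PoolAddParameter(
        id="demo-pool",
        vm_size="standard_a1_v2",
        virtual_machine_configuration=batchmodels.VirtualMachineConfiguration(
            image_reference=batchmodels.ImageReference(
                publisher="canonical", offer="ubuntuserver", sku="18.04-lts"),
            node_agent_sku_id="batch.node.ubuntu 18.04"),
        target_low_priority_nodes=2,
    ))

    # A job groups tasks and binds them to the pool.
    client.job.add(batchmodels.JobAddParameter(
        id="demo-job",
        pool_info=batchmodels.PoolInformation(pool_id="demo-pool"),
    ))

    # Tasks can run for hours; there is no Functions-style timeout here.
    client.task.add("demo-job", batchmodels.TaskAddParameter(
        id="task-1",
        command_line="/bin/bash -c 'echo doing long-running work'",
    ))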
And yes, it is possible to use Functions with Azure Batch. You can expose your script as an HTTP-triggered Function, which you can then call (GET/POST) from Azure Batch VMs.
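For example, a Batch task (or any script) kicking off the Function over HTTP; the URL and key below are placeholders:

    import requests

    FUNCTION_URL = "https://<your-app>.azurewebsites.net/api/<function-name>"

    resp = requests.post(
        FUNCTION_URL,
        params={"code": "<function-key>"},  # key-based auth for the HTTP trigger
        json={"jobId": "batch-job-001"},    # whatever payload your function expects
        timeout=30,
    )
    resp.raise_for_status()
    print(resp.text)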
Hope it helps.
Maybe we should expand this topic to Azure services for batch processing in general. I did come across an article from Microsoft that goes through these options (which include WebJobs and Kubernetes).
But frankly, even after reading the article, the confusion remains. For example, Azure Batch jobs can be scheduled, but I am not sure whether they can be triggered by other Azure services the way Azure WebJobs handles it. I get the feeling that Azure Batch is pitched where you need highly parallel computing at low cost, because none of the other options directly give you low-priority, low-cost compute instances. Correct me, please!
#AzureBatch #AzureWebJobs #AzureAKS #AzureFunctions

Azure: Do not deploy a role by configuration

We have written a highly scalable cloud service for Microsoft Azure with two roles: "WebsiteRole" and "WebsiteWorkerRole". For better performance we deploy this cloud service in multiple regions (2x US, 2x EU, 1x JP). We have different configuration files for each region (EuWestProductive.azurePubxml, ServiceConfiguration.CloudEuWest.cscfg, Web.ReleaseEuWest.config).
Now the problem: in each region both the "WebsiteRole" and the "WebsiteWorkerRole" are running. But the "WebsiteWorkerRole" has only very small tasks, so one extra-small instance in a single region is more than enough.
We tried to set the role instance count to zero in ServiceConfiguration.CloudEuWest.cscfg, but this is not allowed:
Azure Feedback: Allow a Role instance count of 0
Is there another way to remove a role when deploying the cloud service?
No, as you've discovered, a cloud service does not allow scaling to zero; you have to effectively remove the deployment. To make the minimum change to what you already have in place, you could separate the two roles into two different deployments, then have an Azure Automation script, or a set of scripts run elsewhere, that handles deploying the worker role when needed and decommissioning it when it's not.
Depending on the type of workload that worker is doing, you could also take another route and use something like Azure Automation to perform the work itself. This is especially true if it's a small amount of processing that occurs only a few times a day. You're charged by the minute for the automation script, so just make sure it will cost less than the current always-on instance does.
It really boils down to what that worker is doing, how much processing it really needs to do, how many resources it needs, and how often it needs to run. There are a lot of options, such as Azure Automation, another thread on the web role, a separate cloud service deployment, etc., each with its own pros and cons. One option might even be to look at the new Azure Functions they just announced (in preview and charged per execution).
The short answer: separate the worker from the WebsiteRole deployment, then decide on the best hosting mechanism for that worker role, making sure the option includes the ability to run only when you need it to.
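If the worker's tasks are small and periodic, the Functions route mentioned above could look roughly like this; a hedged sketch using the current Python v2 programming model (the schedule and names are made up, and Functions has evolved a lot since the preview mentioned here):

    import azure.functions as func

    app = func.FunctionApp()

    # Runs every 6 hours; nothing runs (or bills, on Consumption) in between.
    @app.timer_trigger(schedule="0 0 */6 * * *", arg_name="timer")
    def worker_task(timer: func.TimerRequest) -> None:
        # Do the small amount of work the WebsiteWorkerRole used to do.
        print("processing queued work...")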
Thanks @MikeWo, your idea to separate the deployments was great!
I have verified this with a small example project and it works just fine. Now it is also possible to change the VM size and other configurations per region.
