Autoscaling with queues does not start - Azure

My environment is a Cloud Service with 2 instances of a worker role that process messages from a Service Bus queue. I have also set up the Autoscaling Application Block to add instances when each instance has more than 10 messages to handle.
Here are the steps I take:
I push about 1,000 messages to the queue.
At this point all of my messages are unprocessed, as my instances are not up yet.
I publish the worker role with 2 instances, and when they are up, they start reading messages correctly.
Then I configure autoscaling with the rule stated above: scale on queue length, 10 messages per instance.
What I expected: since the instances already have more messages than they can handle, Azure should start spinning up new instances. But this doesn't happen until at least 10-15 minutes after my first two instances are up.
What could be the reason behind this? Is there a known algorithm on Microsoft's side?
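For clarity, the rule I configured amounts to the following logic (a minimal sketch, assuming the autoscaler samples the queue length on a polling interval; the names are illustrative, not the actual Autoscaling Application Block API):

```csharp
using System;

// Minimal sketch of the scaling rule described above: one instance per
// 10 unprocessed messages, evaluated each time the autoscaler samples
// the queue. Illustrative names, not the real Autoscaling Block API.
class ScaleRuleSketch
{
    const int MessagesPerInstance = 10;

    static int DesiredInstanceCount(long queueLength, int minInstances, int maxInstances)
    {
        int desired = (int)Math.Ceiling(queueLength / (double)MessagesPerInstance);
        return Math.Min(Math.Max(desired, minInstances), maxInstances);
    }

    static void Main()
    {
        // ~1000 queued messages against a 2-25 instance range => target 25.
        Console.WriteLine(DesiredInstanceCount(1000, 2, 25));
    }
}
```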

Related

Azure Functions Elastic Premium Plan and Queue Triggers

We have a solution where we use an Azure Storage queue to process messages that each take approximately 6 minutes.
I've read that the maximum batchSize of queue messages processed concurrently is 32 per VM.
If the function app scales out to multiple VMs, each VM could run one instance of each queue-triggered function.
https://learn.microsoft.com/en-us/azure/azure-functions/functions-bindings-storage-queue?tabs=in-process%2Cextensionv5%2Cextensionv3&pivots=programming-language-csharp#host-json
How does that translate to the Azure Functions Premium plan?
Let's say we want to be able to process 64 messages at once using the Azure Functions Premium plan with always-ready instances. If we have 2 ready instances, can they process 2 * 32 concurrent messages? Or do they, under the hood, really need to be on separate VMs, and will 2 instances not make any difference?
In the Premium plan, you can have your app always ready on a specified number of instances. The maximum number of always ready instances is 20. When events begin to trigger the app, they are first routed to the always ready instances.
https://learn.microsoft.com/en-us/azure/azure-functions/functions-premium-plan?tabs=portal#always-ready-instances
Yes. In the Azure Functions Premium plan, if you have a pre-warmed instance, it is given a dedicated VM. So if you have 2 VM instances running your function app, they can process 2 * (batchSize + newBatchThreshold) concurrent queue messages!
The Azure platform scales the function app out onto new VMs as the existing instances get busier.
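To make the arithmetic concrete, here is a worked example of that formula (a sketch assuming batchSize is raised to its maximum of 32 in host.json, and that newBatchThreshold keeps its documented default of batchSize / 2):

```csharp
using System;

// Worked example of the formula from the answer above. Assumes batchSize
// is set to its maximum of 32 in host.json and newBatchThreshold keeps
// its documented default of batchSize / 2.
class ConcurrencySketch
{
    static void Main()
    {
        int instances = 2;                      // always-ready instances
        int batchSize = 32;                     // host.json: queues.batchSize
        int newBatchThreshold = batchSize / 2;  // host.json default

        int maxConcurrent = instances * (batchSize + newBatchThreshold);
        Console.WriteLine(maxConcurrent);       // 2 * (32 + 16) = 96
    }
}
```

So by that formula, 2 always-ready instances would already cover the 64 concurrent messages asked about.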

Azure Functions is not adding more servers to process the queue

I'm working on a queue-trigger-based Azure Function on the Consumption plan. Currently I have millions of messages in my queue which should be processed. Based on the Azure Functions documentation, it should add more servers to be able to process my messages. However, it's running only 145 servers, and more than 2 million messages have been stuck in the queue for the last 24 hours. Why is Azure not adding more servers to speed up the message processing? Is there any limit to the number of servers that can be allocated to each Function App?

Azure Service Bus messages when a web app scales down to fewer instances

What happens if you have, let's say, 4 instances of a web app and a lot of messages queued up in the Service Bus queue (meant to reach all 4 instances), and the web app scales down to only 2 instances? Will the messages in the Service Bus that were meant for the two instances that were removed be stuck in the queue until the time-to-live is exceeded and then removed, or does the Service Bus understand that there are no longer 4 instances and therefore doesn't need to send out messages to 4 instances?
I'm not sure if this is correct, but from my understanding there usually is a topic and then multiple subscriptions? Will the Service Bus understand when one of the instances (which has a subscription) is gone, and then remove the message meant for that subscription while the message is queued up (with a lot of other messages before it)?
Sorry if the question is a little dumb, but I couldn't find any answers on the internet.
Will the messages in the Service Bus that were meant for the other two instances that were removed be stuck in the queue until the time-to-live is exceeded and then removed?
Service Bus queues and topic subscriptions provide a secondary sub-queue, called a dead-letter queue (DLQ).
The purpose of the dead-letter queue is to hold messages that cannot be delivered to any receiver, or simply messages that could not be processed. For more details, please refer to the documentation.
Does the Service Bus understand that there are no longer 4 instances and therefore doesn't need to send out messages to 4 instances?
Service Bus queues offer First In, First Out (FIFO) delivery, and each queue message is received and processed by only one message consumer. If we have multiple web app instances, when a request arrives the web app load balancer (ARR) assigns it to one of the instances, which then processes the message. If an instance has been removed, the load balancer (ARR) assigns the request to one of the remaining instances.
Will the Service Bus understand when one of the instances (which has a subscription) is gone, and then remove the message meant for that subscription while the message is queued up (with a lot of other messages before it)?
As mentioned above, requests are assigned by the web app load balancer, not by Service Bus. How messages are consumed depends on your web app.
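A minimal sketch of the competing-consumer pattern this describes, using the Azure.Messaging.ServiceBus client (the connection string and queue name are placeholders): every instance runs the same receive loop, the queue hands each message to exactly one receiver, and no message is tied to a particular instance.

```csharp
using System;
using System.Threading.Tasks;
using Azure.Messaging.ServiceBus;

// Competing consumers: every web app / worker instance runs this same
// receive loop. A queue message is delivered to exactly one receiver,
// so removing instances just means the remaining ones pick up the work.
// Undeliverable or expired messages move to the dead-letter sub-queue.
class CompetingConsumerSketch
{
    static async Task Main()
    {
        // Placeholders: substitute your own connection string and queue.
        await using var client = new ServiceBusClient("<connection-string>");
        ServiceBusReceiver receiver = client.CreateReceiver("<queue-name>");

        while (true)
        {
            ServiceBusReceivedMessage message =
                await receiver.ReceiveMessageAsync(TimeSpan.FromSeconds(30));
            if (message == null)
                continue; // queue is empty right now; keep polling

            Console.WriteLine(message.Body.ToString());
            await receiver.CompleteMessageAsync(message); // settle the message
        }
    }
}
```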

How to handle non-transient exceptions in an Azure worker role

We have two Azure worker roles: A and B.
A is a Quartz scheduler which runs jobs every minute. It reads some IDs from a Redis cache every minute and executes jobs for those IDs.
A publishes its output to a Service Bus queue, which is subscribed to by worker role B.
Worker role B reads values from the queue and executes some more operations on them.
Both worker roles have to build a cache on startup.
Now here are a few issues regarding Azure component failure:
If the Redis cache goes down, how can we handle that? We need to stop our execution until it is up again, and then we need to rebuild our cache. Worker role B should stop pulling messages from the Service Bus until Redis comes up again.
How do we handle a Service Bus failure in worker role B?
You don't need to stop any of the worker roles.
Worker role A should be resilient to issues in the Redis cache, meaning that your code should handle any exception thrown by Redis (or network exceptions), either by retrying or by swallowing the exception.
Worker role B should constantly pull messages from the Service Bus. If worker role A doesn't publish data, then worker role B should handle empty results.
Stopping your service on a Redis/Azure glitch would require you to handle more complicated scenarios - for example, automatically detecting when Redis is up again and automatically restarting your service.
One potential solution would be to configure an external health service that your workers check before pulling from the Service Bus or cache. If the health service says the cache or Service Bus is down, your workers simply don't attempt to process anything.
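For example, a minimal retry sketch for the Redis read in worker role A (hand-rolled for illustration; a library such as Polly would normally handle this, and ReadIdsFromCache below is a hypothetical stand-in for your real cache call):

```csharp
using System;
using System.Threading.Tasks;

// Hand-rolled retry with backoff around the Redis read in worker role A,
// per the answer above: transient failures are retried, and once the
// attempts are exhausted the job skips this run instead of stopping the
// whole worker role.
class RedisRetrySketch
{
    static async Task<string[]> ReadIdsWithRetryAsync(Func<Task<string[]>> readIdsFromCache)
    {
        for (int attempt = 1; attempt <= 3; attempt++)
        {
            try
            {
                return await readIdsFromCache();
            }
            catch (Exception ex) // Redis or network glitch
            {
                Console.WriteLine($"Cache read failed ({ex.Message}), attempt {attempt}/3");
                if (attempt < 3)
                    await Task.Delay(TimeSpan.FromSeconds(attempt)); // 1s, 2s backoff
            }
        }
        return Array.Empty<string>(); // swallow: skip this run, Quartz fires again next minute
    }

    static async Task Main()
    {
        // Hypothetical cache call that always fails, to show the fallback.
        string[] ids = await ReadIdsWithRetryAsync(
            () => throw new Exception("Redis is down"));
        Console.WriteLine($"Got {ids.Length} ids");
    }
}
```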

Azure worker role + number of message queue consumers

I am trying to understand the best practice for hosting message queue consumers in an Azure worker role. I have many different types of message consumers that subscribe to different Azure Service Bus subscriptions (or queues, if you prefer to call them that). I am wondering whether I should instantiate multiple threads for each consumer in one worker role, or deploy a separate worker role per consumer.
This is really dependent on your app and workload. If you have tasks that are I/O-bound, then you should be running several threads; otherwise, you'll have a virtual machine instance that isn't being used efficiently. If the work is primarily CPU-bound, you may find that you can run efficiently with a lower number of threads.
You should only scale out your worker instances if you can't handle the capacity in a single instance (or if you need high availability, in which case you'd need at least two instances). Just remember that a worker role instance is a full VM, so adding one VM per queue consumer scales in cost, and you still might not see great throughput in an I/O-bound app (or one that blocks on other things, such as a web service call).
You'll need to do a bit of experimenting to see how many threads work well on the worker side.
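As a starting point, here is a minimal sketch of the single-worker-role approach, with one long-running task per queue (queue names are placeholders and the processing body is a stub; the modern Azure.Messaging.ServiceBus client is used here for brevity, though the classic worker-role-era SDK differs):

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;
using Azure.Messaging.ServiceBus;

// One worker role hosting several queue consumers, one task per queue.
// Queue names are placeholders; the processing body is a stub.
class MultiConsumerSketch
{
    static async Task Main()
    {
        await using var client = new ServiceBusClient("<connection-string>");

        string[] queues = { "orders", "invoices", "notifications" };
        Task[] consumers = queues.Select(q => ConsumeAsync(client, q)).ToArray();

        await Task.WhenAll(consumers); // runs until the role is shut down
    }

    static async Task ConsumeAsync(ServiceBusClient client, string queueName)
    {
        ServiceBusReceiver receiver = client.CreateReceiver(queueName);
        while (true)
        {
            var message = await receiver.ReceiveMessageAsync(TimeSpan.FromSeconds(30));
            if (message == null)
                continue; // nothing waiting on this queue right now

            Console.WriteLine($"{queueName}: {message.Body}"); // real work goes here
            await receiver.CompleteMessageAsync(message);
        }
    }
}
```

Whether the tasks are I/O-bound enough to share one VM efficiently is exactly what the experimenting above should tell you.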
