What is the optimal architecture design on Azure for an infrequently used backend that needs a robust configuration? - azure

I'm trying to find the optimal cloud architecture to host a software on Microsoft Azure.
The scenario is the following:
A (containerised) REST API is exposed to the users through which they can submit POST and GET requests. POST requests trigger a backend that needs a robust configuration to operate properly and GET requests are sent to fetch the result of the backend, if any. This component of the solution is currently hosted on an Azure Web App Service which does the job perfectly.
The (containerised) backend (triggered by POST requests) perform heavy calculations during a short amount of time (typically 5-10 minutes are allotted for the calculation). This backend needs (at least) 4 cores and 16 Gb RAM, but the more the better.
The current configuration consists in the backend hosted together with the REST API on the App Service with a plan that accommodates the backend's requirements. This is clearly not very cost-efficient, as the backend is idle ~90% of the time. On top of that it's not really scalable despite an automatic scaling rule to spawn new instances based on the CPU use: it's indeed possible that if several POST requests come at the same time, they are handled by the same instance and make it crash due to a lack of memory.
Azure Functions doesn't seem to be an option: the serverless (consumption plan) solution they propose is restricted to 1.5 Gb RAM and doesn't have Docker support.
Azure Container Instances neither, because first the max number of CPUs is 4 (which is really few for the needs here, although acceptable) and second there are cold starts of approximately 2 minutes (I imagine due to the creation of the container group, pull of the image, and so on). Despite the process is async from a user perspective, a high latency is not allowed as the result is expected within 5-10 minutes, so cold starts are a problem.
Azure Batch, which at first glance appears to be a perfect fit (beefy configurations available, made for hpc, cost effective, made for time limited tasks, ...) seems to be slow too (it takes a couple of minutes to create a pool and jobs don't run immediately when submitted).
Do you have any idea what I could use?
Thanks in advance!

Azure Functions
You could look at Azure Functions Elastic Premium plan. EP3 has 4 cores, 14GB of RAM and 250GB of storage.
Premium plan hosting provides the following benefits to your functions:
Avoid cold starts with perpetually warm instances
Virtual network connectivity.
Unlimited execution duration, with 60 minutes guaranteed.
Premium instance sizes: one core, two core, and four core instances.
More predictable pricing, compared with the Consumption plan.
High-density app allocation for plans with multiple function apps.
https://learn.microsoft.com/en-us/azure/azure-functions/functions-premium-plan?tabs=portal
Batch Considerations
When designing an application that uses Batch, you must consider the possibility of Batch not being available in a region. It's possible to encounter a rare situation where there is a problem with the region as a whole, the entire Batch service in the region, or your specific Batch account.
If the application or solution using Batch always needs to be available, then it should be designed to either failover to another region or always have the workload split between two or more regions. Both approaches require at least two Batch accounts, with each account located in a different region.
https://learn.microsoft.com/en-us/azure/batch/high-availability-disaster-recovery

Related

Google Cloud Run not scaling as expected

I'm using Google Cloud Run to run a pretty basic Express / Node JS backend container. I receive fairly low number of requests per day, and only the occasional concurrent request.
However, I can see on my Cloud Run dashboard that Cloud Run sometimes scale up to 4 instances, most of the time to at least 2 instances. I know that my app load is so low that I'll pretty much never need more than 1 instance, so why is Cloud Run being so wasteful?
My settings is set as maximum 40 requests concurrently; minimum 0 containers and maximum 4 containers.
Container instance counts fluctuates substantially. Green line is idle containers and blue line is active containers.
My CPU usage is also very low:
You know your workload profile and the expected request. Cloud Run autoscaler does not. Therefore, it over provisions additional instances in case of traffic spike.
Of course, YOU know that will never happen, but IT doesn't.
Cloud Run is pretty well designed for average traffic. If you are at one extremity of this standard usage (very low traffic or very high, very spiky traffic), yes, the Cloud Run autoscaler provisioning model doesn't work so well.
However, what's the problem? You pay only when a request is processed on an instance. If there are over provisioned and not used instances, you won't pay them. It's a waste of money for Google, not for you.
Your only concern could be for the earth and the resource saving, and you have absolutely right.

How to get better performance with Azure ServiceBus Standard plan

I don't manage to get over 14 msg/second with the Azure ServiceBus Standard Plan. I'm running some benchmark tests with the Azure-Sample tool that I found in this question:
The test is done with a ServiceBus resource with a single Queue and all default configurations:
If I read this correctly, you've got the maximum concurrency of one (MaxInflightReceives) with 5 receivers (ReceiverCount). Increasing concurrency and enabling prefetch on the clients will increase the overall throughput. But,
Testing should be done within the same Azure data centre. If you're testing from a local machine, you're introducing a substantial latency that cannot be avoided.
The receive mode used is PeekLock. It is slower than ReceiveAndDelete. Not suggesting to switch, but this needs to be taken into consideration as you're trading throughput for safety by using PeekLock.
The standard tier has a cap on the number of operations per second. In addition to that, your namespace is deployed in a shared environment with entities scattered in various deployment containers. Performance will vary and cannot be guaranteed. If you want to have a guaranteed throughput, use Premium SKU.

Azure Functions - Node vs .Net Performance - ColdStart

I don't know whether it is purposefully done or Azure is poor by performance than AWS. Whenever I cold-start an Azure Function it takes close to a minute.
The same function cold starts less than a second (close to 250 msecs) with AWS.
What I see is, Azure stores all the function code in Azure Storage account and load it over the network creating this latency. This is with consumption plan.
If I use App Service Plan for functions, so junky to even have that in modern applications. It can reduce to 3 seconds but not even close to what AWS performs.
What are the other way I can improve performance with Azure, so that I can cold start my functions quickly?
I'm a member of the Azure Functions team. I can assure you we're not deliberately making JavaScript slow. It just comes with some challenges we're still working through.
As you noted, the 60 second cold start performance you are getting is due to the network latency incurred when loading a very large number of small files which is typical for node.js applications.
Our current recommendation is for your to take advantage of Azure-Functions-Pack. It uses webpack to dramatically decrease the number of files loaded by your application.
We are working on some improvements designed to make the manual process of running Functions Pack unnecessary. We aim to have some of these improvements in production sometime later in 2017.

Is the maximum scalability of an Azure Function capped at 10?

We're trying to test scalability of Azure functions (it's a bear). We came across this https://azure.microsoft.com/en-in/documentation/articles/functions-reference/#parallel-execution
If a function app is using the Dynamic Service Plan, the function app
could scale out automatically up to 10 concurrent instances. Each
instance of the function app, whether the app runs on the Dynamic
Service Plan or a regular App Service Plan
Does this mean that the maximum scalability of a single function is just 10? we've never been able to get over 10 units running... (previous question on the algorithm to determine adding another consumption unit, this to determine the upper end of scalability).
Thanks
UPDATE: There is no official maximum number of instances. We see customers who are able to scale out to hundreds. The number you achieve depends mostly on your workload, but partly on the region you're running in (some regions have more capacity than others). The 10 instance limit mentioned in previous versions of the docs has been removed.
You can find more information about our consumption plan and scaling here: https://learn.microsoft.com/en-us/azure/azure-functions/functions-scale#how-the-consumption-plan-works
Also note that each instance in Azure Functions can run multiple function executions in parallel. For example, if you have a function app which has a single function that runs quickly, you could expect to see dozens or even hundreds of concurrent executions on a single instance. This is unlike other services such as AWS Lambda which only execute a single function at a time per instance. New instances are added only when the system decides that the current number of instances is insufficient to handle the current load (more details on that in my answer to your other question).

deploying CPU intensive web service on cloud

I have an application which I want to expose as a web service (SaaS). The application is CPU intensive and is a multithreaded application which takes good amount of time for the execution(on an average 15-20secs). Since, I want to expose it as a SaaS and want to use existing cloud services available in the market like Amazon, Google App Engine etc. so that the cost involved and the work involved while scaling my service is not much. I have couple of questions in my mind like:
1.) Since the application is multithreaded and the number of threads invoked depends on the number of results thrown by the service(so basically number of threads is a dynamic entity). Right now I have a 6 core processor so I have kept the threadpool size to be 6 but since I am moving onto the cloud, how can I optimally use the cloud infrastructure?
2.) Do the cloud service providers(which?) give the option to select number of CPU cores required for each request (or something similar to serve my purpose)?
3.) What changes are needed in the code (related to the threads)?
4.) Any other specific area which I should give a sight for moving to the cloud?
In Amazon EC2 you are basically paying for different types of instances - you are free to pick one with only single core and one with sixteen. You get what you pay for.
how can I optimally use the cloud infrastructure?
Your approach is fine, if your task is CPU-intenstive, have a thread pool with the same number of threads as CPU cores/CPUs.
select number of CPU cores required for each request
No, at least not Amazon. You run your application on a given instance and that's all you get. You have to pick instance type in advance, but of course you are free to switch between them, add new, etc. at any time. The cloud!
In Google App Engine you can't create threads, so it's a no-option for you. See also: Why does Google App Engine support a single thread of execution only?
3.) What changes are needed in the code (related to the threads)?
None. It's a standard PC, after all.
4.) Any other specific area which I should give a sight for moving to the cloud?
Well, see above, some services are completely useless for you, like GAE. Make some research before you actually pay for something.

Resources