So I've been going through Azure Signal R Service for blazor apps and I've noticed they have their pricing according to units as well. The free version allows up to one unit where as the standard version has up to 100 units. I'm currently clueless as to what a "Unit" is, with this regard so it would be nice if someone would be kind enough to give a brief explanation on this. P.s: I am relatively new to Blazor however I have experience with .Net Core & Asp.Net Mvc .
A unit is a sub-instance that processes your SignalR messages. Units are used to increase the performance and connections count.
An instance is what you need to create first to use SignalR.
Think unit this way: Let’s say you have a web server that is not enough to handle the web traffic. You can add two more servers to load balance the traffic. This increases the performance and number of requests your environment can handle. In this example, the environment is an INSTANCE. Each server is a UNIT. Before adding new servers, you have 1 instance and 1 unit in that instance. After adding new servers, you have 1 instance and 3 units in that instance.
SignalR Pricing
In FREE plan, you can use only 1 unit and this unit can handle maximum 20 concurrent connections
In STANDARD plan, you can use 100 units. Each unit can handle 1,000 concurrent connections
(Please note the difference: The unit in FREE plan supports maximum 20 connections while a unit in STANDARD plan supports 1,000 connections. In terms of pricing, FREE plan unit and STANDARD plan unit are not the same)
Source: What is the difference between SignalR unit and instance? How SignalR pricing works?
Azure SignalR Unit has to be thought as a nodes available for processing messages for you app.
As you can see on the screenshot below, you can only select multiple units when using the "Standard" pricing tier (the free tier only allows one Unit with limited throughput).
When you select the Standard tier, you can then add up to 100 Units, which theoretically can allow you to
handle 1000 connections per Unit (with 100 Units, then 100,000 connections),
manage 1 million messages per day (with 100 Units, then 100 million connections).
You can scale up to you needs anytime, all depends on your app!
Related
I'm running three MEAN stack programmes. Each application receives over 10,000 monthly users. Could you please assist me in finding an EC2 instance for my apps?
I've been using a "t3.large" instance with two vCPUs and eight gigabytes of RAM, but it costs $62 to $64 per month.
I need help deciding which EC2 instance to use for three Nodejs applications.
First check CloudWatch metrics for the current instances. Is CPU and memory usage consistent over time? Analysing the metrics could help you to decide whether you should select a smaller/bigger instance or not.
One way to avoid too unnecessary costs is to use auto scaling groups and load balancers. By using them and finding and applying proper settings, you could have always right amount of computing power for your applications.
https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/working_with_metrics.html
https://docs.aws.amazon.com/autoscaling/ec2/userguide/auto-scaling-groups.html
Depends on your applications. If your apps need more compute power or more memory or more storage? Deciding a server is similar to installing an app on system. Check what are basic requirements for it & then proceed to choose server.
If you have 10k+ monthly customers, think about using ALB so that traffic gets distributed evenly. Try caching to server some content if possible. Use unlimited burst mode of t3 servers if CPU keeps hitting 100%. Also, try to optimize code so that fewer resources are consumed. Once you are comfortable with ec2 choice, try to purchase saving plans or RIs for less cost.
Also, do monitor the servers & traffic using Cloudwatch agent, internet monitor etc features.
I'm trying to find the optimal cloud architecture to host a software on Microsoft Azure.
The scenario is the following:
A (containerised) REST API is exposed to the users through which they can submit POST and GET requests. POST requests trigger a backend that needs a robust configuration to operate properly and GET requests are sent to fetch the result of the backend, if any. This component of the solution is currently hosted on an Azure Web App Service which does the job perfectly.
The (containerised) backend (triggered by POST requests) perform heavy calculations during a short amount of time (typically 5-10 minutes are allotted for the calculation). This backend needs (at least) 4 cores and 16 Gb RAM, but the more the better.
The current configuration consists in the backend hosted together with the REST API on the App Service with a plan that accommodates the backend's requirements. This is clearly not very cost-efficient, as the backend is idle ~90% of the time. On top of that it's not really scalable despite an automatic scaling rule to spawn new instances based on the CPU use: it's indeed possible that if several POST requests come at the same time, they are handled by the same instance and make it crash due to a lack of memory.
Azure Functions doesn't seem to be an option: the serverless (consumption plan) solution they propose is restricted to 1.5 Gb RAM and doesn't have Docker support.
Azure Container Instances neither, because first the max number of CPUs is 4 (which is really few for the needs here, although acceptable) and second there are cold starts of approximately 2 minutes (I imagine due to the creation of the container group, pull of the image, and so on). Despite the process is async from a user perspective, a high latency is not allowed as the result is expected within 5-10 minutes, so cold starts are a problem.
Azure Batch, which at first glance appears to be a perfect fit (beefy configurations available, made for hpc, cost effective, made for time limited tasks, ...) seems to be slow too (it takes a couple of minutes to create a pool and jobs don't run immediately when submitted).
Do you have any idea what I could use?
Thanks in advance!
Azure Functions
You could look at Azure Functions Elastic Premium plan. EP3 has 4 cores, 14GB of RAM and 250GB of storage.
Premium plan hosting provides the following benefits to your functions:
Avoid cold starts with perpetually warm instances
Virtual network connectivity.
Unlimited execution duration, with 60 minutes guaranteed.
Premium instance sizes: one core, two core, and four core instances.
More predictable pricing, compared with the Consumption plan.
High-density app allocation for plans with multiple function apps.
https://learn.microsoft.com/en-us/azure/azure-functions/functions-premium-plan?tabs=portal
Batch Considerations
When designing an application that uses Batch, you must consider the possibility of Batch not being available in a region. It's possible to encounter a rare situation where there is a problem with the region as a whole, the entire Batch service in the region, or your specific Batch account.
If the application or solution using Batch always needs to be available, then it should be designed to either failover to another region or always have the workload split between two or more regions. Both approaches require at least two Batch accounts, with each account located in a different region.
https://learn.microsoft.com/en-us/azure/batch/high-availability-disaster-recovery
I don't manage to get over 14 msg/second with the Azure ServiceBus Standard Plan. I'm running some benchmark tests with the Azure-Sample tool that I found in this question:
The test is done with a ServiceBus resource with a single Queue and all default configurations:
If I read this correctly, you've got the maximum concurrency of one (MaxInflightReceives) with 5 receivers (ReceiverCount). Increasing concurrency and enabling prefetch on the clients will increase the overall throughput. But,
Testing should be done within the same Azure data centre. If you're testing from a local machine, you're introducing a substantial latency that cannot be avoided.
The receive mode used is PeekLock. It is slower than ReceiveAndDelete. Not suggesting to switch, but this needs to be taken into consideration as you're trading throughput for safety by using PeekLock.
The standard tier has a cap on the number of operations per second. In addition to that, your namespace is deployed in a shared environment with entities scattered in various deployment containers. Performance will vary and cannot be guaranteed. If you want to have a guaranteed throughput, use Premium SKU.
I have a node.js web application / website hosted in Google Cloud App Engine. The website will have no more than 10 users per day and does not have any complex resource consuming feature.
I used app.yaml file given in tutorial
# [START app_yaml]
runtime: nodejs
env: flex
manual_scaling:
instances: 1
resources:
cpu: 1
memory_gb: 0.5
disk_size_gb: 10
# [END app_yaml]
But this is costing around 40 USD per month which is too high for basic application. Can you please suggest minimum possible lowest cost resource configuration? It would be helpful if you can provide app.yaml sample for it.
Google Cloud Platform's Pricing Calculator shows that the specs in your app.yaml turn out to be Total Estimated Cost: $41.91 per 1 month so your costs seem right.
AppEngine Flexible instances are charged for their resources by hour. With manual_scaling option set your instance is up all the time, even when there is no traffic and it is not doing any work. So, not turning your instance down during the idle time is the reason for the $40 bill. You might want to look into using Automatic or Basic scaling to minimize the time your instance is running, which will likely reduce your bill considering you don't have traffic 24/7 (you will find examples of proper app.yaml settings via the link).
Note that with automatic/basic scaling you get to select instance classes with less than 1 dedicated core (i.e. 0.2 & 0.5 CPUs). Not sure if setting CPU to be > 0 and < 1 with manual_scaling here would also work, you might give want to give it a try as well.
Also, don't forget to have a detailed look at your bills to see what else you are potentially being charged for.
After few searches, that seems to be the lowest possible configurations. See related answer here:
Can you use fractional vCPUs with GAE Flexible Environment?
At least for now, there is no shared CPUs so you'll pay for one even if your app is using an average 2% of it. Maybe adding few star here will help changing that in a near future:
https://issuetracker.google.com/issues/62011060
After reading articles on the internet I have created 1 f1-micro (1 vCPU, 0.6 GB memory) VM instance of bitnami MEAN stack which costs ~$5.5/month. I was able to host 1 Mongo DB instance and 2 Node.JS web applications in it. Both the applications have different domain names.
I have implemented reverse proxy using Apache HTTP server to route traffic to appropriate Node.JS application by it domain-name/hostname. I have documented the steps I followed here: https://medium.com/#prasadkothavale/host-multiple-web-applications-on-single-google-compute-engine-instance-using-apache-reverse-proxy-c8d4fbaf5fe0
Feel free to suggest if you have any other ways to implement this scenario.
The cheapest way to host a Node JS application is through Google Compute Engine, not Google App Engine.
This is because you can host it for 100% free on Compute Engine!
I have many Node apps that have been running for the last 2 years, and I have been charged a maximum of a few cents per month, if any at all.
As long as you are fine with a low spec machine (shared vCPU) and no scaling, look into the Compute Engine Always Free options.
https://cloud.google.com/free/docs/always-free-usage-limits#compute_name
The only downside is that you have to set up the server (installing Node, setting up firewalls etc). But it is a one time job, and easily repeatable after you have done it once.
App Engine Standard environment would be the best route for your use case. The standard environment runs directly on Google's Infrastructure, scales quickly and scales down to zero when there's no traffic. The free quota might be sufficient enough for this uses case as well.
App Engine Flexible environment runs as a container in a GCE VM (1 VM per instance/container). This makes it slower to scale compared to the standard environment as scaling up would require new VMs to boot up before the instance containers can be pulled and started. Flex also has the requirement of having minimum 1 instance running all the time (where as standard scales down to 0).
Flex is useful when your requirements of runtime/resources go beyond the limitations of standard environment.
You can understand more about the differences between the standard and flex environments at https://cloud.google.com/appengine/docs/the-appengine-environments
Use the Basic, not Flexible. It is a better fit and far cheaper for you.
I currently have a web application deployed to "Web Sites" - This is configured in standard mode and it performs really well from what I have seen so far.
I have a few questions:
1)My instance size is currently small - however I can scale out to 10 instances. Does this also mean that if I change my instance size to medium or large, I can still have 10 instances?
2)What is the maximum number of instances I can have for an azure web site?
3)Is there any SLA for a single azure instance?
4)Is it possible to change the instance size programatically or is better to just change the instance count
1) Yes
2) 10 for standard.
3) Yes, for Websites Basic and Standard, MS guarantee a 99.9% monthly availability.
4) It depends on a lot of factors. The real question is "Is it better for your app to scale up or scale out?"
Yes, the default limit is 10 instances regardless of the size.
The default limit is 10 instances, but you can contact Azure Support to have the limit increased. Default and "real" limits for Azure services are documented here.
According to the Websites pricing page Free and Shared sites have no SLA and Basic and Standard sites have 99.9% uptime SLA. Having a single instance means that during the 0.1% outage time (43.8 minutes per month) your site will be down. If you have multiple instances then most likely at least one will be up at any given time.
Typically instance auto-scaling is used to handle variation in demand while instance size would be used for application performance. If you only get 100 requests per day but each request is slow because it's maxing out CPU then adding more instances won't help you. Likewise if you're getting millions of requests that are being processed quickly but the volume is maxing out your resources then adding more instances is probably the better solution.