What happens if a spot instance isn't available for an AWS autoscaling group?

If I have an autoscaling group with a minimum of 4 on-demand instances and the extra capacity consisting of spot instances, what happens if it needs to scale up with a spot instance and there isn't one available (because I've been outbid, or because there aren't any spare instances to fulfil the spot request)?
Will it still scale up using an on-demand instance?
Will the autoscaling group fail to scale up?
Other info:
I'm using a "Lowest Price" Spot Allocation Strategy
The max_spot_price is capped at the on-demand price.
My Google-fu seems to be failing me, as I can't find any answers on the web. I would appreciate it if anybody could shed some light on this issue.
Thanks in advance!

An Auto Scaling group in AWS will not fail over to On-Demand if there's no Spot capacity. This is essentially the trade-off you're getting for the lower price of Spot instances. To work around this, try adding more AZs and/or instance types (not as much of an issue now that weights are supported and ALB can route based on Least Outstanding Requests).
If you have multiple instance types and AZs set up in the ASG, this is what happens after your on-demand base is met (a configuration sketch follows the list):
1) Tries to launch the spot instance(s) based on your allocation strategy and number of spot pools
2) If the desired instance type(s) aren't available, it tries all the other types in that AZ
3) If no spot instances are available in that AZ, that launch request will fail and it will try again in another enabled AZ
4) If there are no spot instances of any of the types you have selected, in any of the AZs on the ASG, then nothing will launch and the ASG will periodically retry until it reaches the desired capacity.
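As a rough illustration, here is how such a mixed-instances ASG might be set up with boto3. This is a minimal sketch: the launch template name, subnets, and instance types are placeholders, not taken from the question.

```python
import boto3

autoscaling = boto3.client("autoscaling")

# Sketch of a mixed-instances ASG: 4 On-Demand instances as the base,
# everything above that as Spot, using the lowest-price strategy.
autoscaling.create_auto_scaling_group(
    AutoScalingGroupName="my-mixed-asg",          # hypothetical name
    MinSize=4,
    MaxSize=12,
    VPCZoneIdentifier="subnet-aaa111,subnet-bbb222,subnet-ccc333",
    MixedInstancesPolicy={
        "LaunchTemplate": {
            "LaunchTemplateSpecification": {
                "LaunchTemplateName": "my-launch-template",
                "Version": "$Latest",
            },
            # More overrides = more Spot pools the retry logic above
            # can fall back to before giving up.
            "Overrides": [
                {"InstanceType": "m5.large"},
                {"InstanceType": "m5a.large"},
                {"InstanceType": "m4.large"},
            ],
        },
        "InstancesDistribution": {
            "OnDemandBaseCapacity": 4,                 # always 4 On-Demand
            "OnDemandPercentageAboveBaseCapacity": 0,  # the rest is all Spot
            "SpotAllocationStrategy": "lowest-price",
            "SpotInstancePools": 2,
            # Omitting SpotMaxPrice caps the price at the On-Demand price.
        },
    },
)
```

The more instance types and subnets (AZs) you list, the more pools steps 2) and 3) have to work through before the launch fails outright.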
Think of it like this: there are only so many servers in their data centers. If spot evictions are happening because AWS needs capacity for On-Demand instances, and everyone running Spot failed over to On-Demand for that instance type, there would probably suddenly also be an On-Demand capacity issue in that AZ.

Related

AWS Aurora Serverless v2 will not scale down to 0.5 ACU even though there are 0 connections

I'm running a v2 instance, and the AWS documentation states you should only be paying for resources that you are actually using. I have an instance that is at 0 connections most of the time, but it never scales down under 2 ACUs. I have the instance set up to scale between 0.5 and 16 ACUs, but no matter the load it always stays at a baseline of 2 ACUs.
I had to turn off the AI monitoring on the DB and then restart the instance. The DB then started at the minimum capacity.
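A sketch of that workaround with boto3, assuming "AI monitoring" refers to Performance Insights (the instance identifier is hypothetical):

```python
import boto3

rds = boto3.client("rds")

# Disable Performance Insights on the instance (assumed to be the
# "AI monitoring" mentioned above) and apply the change immediately.
rds.modify_db_instance(
    DBInstanceIdentifier="my-aurora-v2-instance",
    EnablePerformanceInsights=False,
    ApplyImmediately=True,
)

# Reboot so the instance comes back up at the configured minimum ACU.
rds.reboot_db_instance(DBInstanceIdentifier="my-aurora-v2-instance")
```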
I can confirm this behaviour but as yet can't explain it. We have three databases running, all with the same schema but with different ACU limits set. Our production and staging databases sit at a near flatline close to the maximum capacity allowed, whilst the other behaves as we would expect and only scales up when we actually send it load.
We have tried rebooting the instances, but they immediately scale up and do not appear willing to scale down.
We have full support with AWS, so we will raise a ticket with them and report back here if we get an explanation/solution.

Schedule based start/stop of EC2 Instances in Autoscaling groups

Our requirement is that we have TIBCO BW components on top of Amazon EC2 instances, and we need to start and stop the instances at the timings provided by the business. Please note all EC2 instances are within Auto Scaling groups.
I was able to start and stop EC2 instances when no Auto Scaling group was involved. I had built a Lambda function and triggered it from CloudWatch, which worked fine. But I am not sure how to extend that to EC2 instances that are in Auto Scaling groups.
The expected result is that the applications on the EC2 instances will be stopped according to the schedule provided by the business. All the EC2 instances are within Auto Scaling groups.
You can use Scheduled Scaling to modify an Auto Scaling group so that it adds/removes instances.
You can configure it to change one of three variables:
The Minimum number of instances. For example, increasing the minimum might launch additional instances.
The Maximum number of instances, which might cause instances to be terminated.
The Desired number of instances, which will set the quantity 'now', but the quantity might change later based upon other rules you have in place (e.g. when things get busy).
It is quite common for companies to increase the minimum quantity at the start of the day to provide more instances before things get busy. Similarly, it is common to decrease the minimum number of instances at night or on weekends to allow instances to scale-in if there are scaling rules in place to detect idle capacity.
Please note that Auto Scaling will either Launch new instances or Terminate existing instances. It does not start or stop instances.
See: Scheduled Scaling for Amazon EC2 Auto Scaling
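For example, a pair of scheduled actions that raise the minimum in the morning and lower it at night might look like this with boto3 (the group name, sizes, and times are illustrative, not from the question):

```python
import boto3

autoscaling = boto3.client("autoscaling")

# Raise the minimum before business hours (Recurrence is a cron
# expression: minute hour day-of-month month day-of-week, in UTC).
autoscaling.put_scheduled_update_group_action(
    AutoScalingGroupName="my-asg",
    ScheduledActionName="scale-up-weekday-mornings",
    Recurrence="0 8 * * MON-FRI",
    MinSize=6,
)

# Lower the minimum at night so scale-in rules can remove idle instances.
autoscaling.put_scheduled_update_group_action(
    AutoScalingGroupName="my-asg",
    ScheduledActionName="scale-down-weekday-nights",
    Recurrence="0 20 * * MON-FRI",
    MinSize=2,
)
```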

How do I determine the number of node types, number of nodes, and VM size in a Service Fabric cluster for a relatively simple but high-throughput API?

I have an ASP.NET Core 2.0 Web API with relatively simple logic (a simple select on a SQL Azure DB, returning about 1000-2000 records; no joins, aggregates, functions, etc.). I have only one GET endpoint, which is called from an Angular SPA. Both are deployed in Service Fabric as stateless services, hosted in Kestrel as self-hosted executables.
Considering the number of users and how often they refresh, I've determined there will be around 15,000 requests per minute, in other words 250 req/sec.
I'm trying to understand the different settings when creating my Service Fabric cluster.
I want to know:
How many Node Types? (I've tentatively settled on Front-End and Back-End.)
How many nodes per node type?
What is the VM size I need to select?
I have read the Azure documentation on cluster capacity planning. While I understand the concepts, I don't have a frame of reference to determine the actual values I need to provide for the questions above.
In most places where you read about planning a cluster, they will suggest that this subject is part science and part art, because there is no easy answer. It's hard to answer because it depends a lot on the complexity of your application; without knowing the internals of how it works, we can only guess at a solution.
Based on your questions, the best guidance I can give you is: measure first, measure again, measure... plan later. Your application might be memory intensive, network intensive, CPU intensive, disk intensive, and so on; the only way to find the best configuration is to understand it.
To understand your application before you make any decision on the SF structure, you could deploy a simple cluster with multiple node types containing one node of each VM size and measure your application's behavior on each of them. Then you would add more nodes, spread multiple instances of your service across these nodes, and see which configuration is the best fit for each service.
1. How many Node Types?
I like to map node types 1:1 to the roles in your application, but that is not a law; it depends on how much resource each service consumes. If a service consumes enough resources to keep a single VM (node) busy (memory, CPU, disk, IO), it is a good candidate for its own node type. In other cases there are lightweight services for which provisioning an entire VM (node) would be a waste of resources; examples are scheduled jobs, backups, and so on. In those cases you could provision a set of machines shared between these services. One important thing to keep in mind when you share a node type between multiple services is that they will compete for resources (memory, CPU, network, disk), so the performance measurements you took for each service in isolation might no longer hold and they may require more resources; the only option is to test them together.
Another point is the number of replicas: having a single instance of your service is not reliable, so you will have to create replicas of it (the right number is described in the next answer). In that case you end up with the service's load split across multiple nodes, possibly leaving the node type underutilized; this is where you would consider placing multiple services on the same node type.
2. How many nodes per node type?
As stated before, it will depend on your service resource consumption, but a very basic rule is a minimum of 3 per node type.
Why 3?
Because 3 is the lowest number where you can do a rolling update and still guarantee a quorum of 51% of nodes/services/instances running.
1 Node: If you have a service running 1 instance on a node type of 1 node, when you deploy a new version of your service you have to bring down this instance before the new one comes up, so you have no instance to serve the load while upgrading.
2 Nodes: Similar to 1 node, but in this case you keep only 1 node running during the upgrade. In case of failure, you have no failover to handle the load until the new instance comes up. It is worse if you are running a stateful service, because you have only one copy of your data during the upgrade, and in case of failure you might lose data.
3 Nodes: During an update you still have 2 nodes available. When the one being updated comes back, the next one is brought down, and you still have 2 nodes running. In case of failure of one node, the other nodes can support the load until a new node is deployed.
3 nodes does not mean your cluster will be highly reliable; it means the chances of failure and data loss will be lower. You might be unlucky and lose 2 nodes at the same time. As suggested in the docs, in production it is better to keep the number of nodes at 5 or more, and to plan for a quorum of 51% of nodes/services remaining available. So I would recommend 5, 7 or 9 nodes in cases where you really need higher uptime (99.9999...%).
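The arithmetic behind these numbers is the standard majority-quorum rule (nothing Service Fabric specific); a quick sketch:

```python
# Majority-quorum arithmetic: with n nodes, a quorum needs a strict
# majority, so the cluster tolerates n - quorum simultaneous losses.
for n in range(1, 10):
    quorum = n // 2 + 1
    tolerated = n - quorum
    print(f"{n} nodes -> quorum {quorum}, tolerates {tolerated} down")

# 3 nodes -> quorum 2, tolerates 1 down  (a rolling upgrade is safe)
# 5 nodes -> quorum 3, tolerates 2 down  (survives a failure *during* an upgrade)
```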
3. What is the VM size I need to select?
As said before, only measurements will give this answer.
Observations:
These recommendations do not take into account planning for the primary node type. It is recommended to have at least 5 nodes on the primary node type; it is where the SF system services are placed, and they are responsible for managing the cluster, so they must be highly reliable, otherwise you risk losing control of your cluster. If you plan to share these nodes with your application services, keep in mind that your services might impact them, so you have to monitor them continually to check for any impact.

Azure scaling of roles and pricing

I have been digging without being able to wrap my head around this. It seems like once a role is deployed, you are charged for it in full, whether or not you scale it up or down?
Why would anyone scale down with this? I don't see the incentive not to just leave the role at the maximum number of instances.
I can see why an availability set with several roles might want to distribute the cores between them depending on load. But is that all it's for?
You pay the price of one instance of the chosen size (A0 to D14) multiplied by the number of instances that are running.
Try the Azure Pricing Calculator: increasing the number of instances increases the charges.
When you try to use autoscaling, it clearly states:
Autoscale is saving you XX% off your current bill
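A back-of-the-envelope calculation makes the incentive concrete (the hourly rate below is made up for illustration; real rates come from the Azure Pricing Calculator):

```python
# Hypothetical per-instance hourly rate, billed for every running instance.
hourly_rate = 0.20          # USD per instance-hour (illustrative only)
hours_per_month = 730       # average hours in a month

for instances in (2, 4, 8):
    monthly = hourly_rate * hours_per_month * instances
    print(f"{instances} instances: ${monthly:,.2f}/month")

# 2 instances: $292.00/month
# 4 instances: $584.00/month
# 8 instances: $1,168.00/month  -> scaling down directly cuts the bill
```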

Reduce costs of Azure availability set

I am planning on running SharePoint Foundation on one VM of size A3 and SQL Server on another of size A6. As far as I understand, this is not enough to achieve the SLA, and I should use 2 more instances - one for SharePoint and one for SQL Server - configured in 2 separate availability sets.
Can I use scaling (by CPU usage) to turn off one instance and leave only one running at a time in an availability set? This would reduce the costs, but I wonder if this solution would be good enough to achieve Azure's SLA. The way I see it, one instance is running at a time while the other is shut down, so I am billed for one instance. When there is an update or failure going on, the instance that until then has been running is shut down and the other one comes online. Is this the way it works? Can I cut the costs of availability sets like this?
No, the SLA requires two running instances. However, if you want to control your costs, the approach you have in place will work. Just keep in mind that the duration/window of a disruption will depend on how quickly you detect that the primary VM has failed and how fast you can start the secondary VM. And depending on the nature of the service disruption, it may not be possible for you to start the secondary at all. So it's a risk.
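A minimal sketch of the detect-and-start step using the azure-mgmt-compute SDK. The subscription ID, resource group, and VM names are placeholders, and real monitoring would be more robust than a single power-state check:

```python
from azure.identity import DefaultAzureCredential
from azure.mgmt.compute import ComputeManagementClient

SUBSCRIPTION_ID = "00000000-0000-0000-0000-000000000000"  # placeholder
RESOURCE_GROUP = "my-rg"                                  # placeholder

client = ComputeManagementClient(DefaultAzureCredential(), SUBSCRIPTION_ID)

def is_running(vm_name: str) -> bool:
    """Check the VM's power state via its instance view."""
    view = client.virtual_machines.instance_view(RESOURCE_GROUP, vm_name)
    return any(s.code == "PowerState/running" for s in view.statuses)

# If the primary is down, bring the cold standby online. The gap between
# the failure and the standby serving traffic is the availability risk
# described above.
if not is_running("sharepoint-primary"):
    poller = client.virtual_machines.begin_start(
        RESOURCE_GROUP, "sharepoint-secondary"
    )
    poller.result()  # block until the start operation completes
```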
