Amazon EC2 boot time - node.js

Our web app performs a variable number of tasks for a user-initiated action. We have built a small system where a master server calculates the number of worker servers needed to complete the task, and that number of EC2 instances is "turned on" to pick up and perform the tasks.
"Turned On" because the time taken to span an instance from an AMI is extremely high. So the idea is have a pool of worker instances and start and stop them as per requirement.
This also accounts for how Amazon charges when you start an instance (you are billed for a full hour every time you turn one on): once started, a worker stays active for the hour and accepts other tasks during that period.
We have managed to get this architecture up and running, but the boot time still bothers us, as it fluctuates between 40 and 80 seconds. Is there some way we can reduce it?
Below is the stack running on each worker instance:
Ubuntu AMI
Node JS (using forever-service for auto startup on boot)
Docker (the tasks are performed inside individual docker containers)
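For reference, a minimal sketch of the "turn on" step using the AWS SDK for Node.js (v2); the region, tag filter, and pool size below are placeholders, not part of the actual setup:

    // start-workers.js - wake N stopped workers from the pool (sketch)
    const AWS = require('aws-sdk');
    const ec2 = new AWS.EC2({ region: 'us-east-1' }); // placeholder region

    async function startWorkers(count) {
      // Find stopped instances tagged as pool workers (tag is a placeholder)
      const { Reservations } = await ec2.describeInstances({
        Filters: [
          { Name: 'tag:role', Values: ['worker'] },
          { Name: 'instance-state-name', Values: ['stopped'] },
        ],
      }).promise();

      const ids = Reservations
        .flatMap(r => r.Instances)
        .slice(0, count)
        .map(i => i.InstanceId);

      if (ids.length) {
        await ec2.startInstances({ InstanceIds: ids }).promise();
      }
      return ids;
    }

    startWorkers(3).then(ids => console.log('started:', ids));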

Have you taken a look at AWS Lambda (https://aws.amazon.com/lambda)?
Lambda supports Node.js and will automatically scale the required worker infrastructure with the number of requests. This avoids your "one hour bill" problem: you pay only for the processing time you actually use.
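For illustration, a worker expressed as a Lambda function might look like the minimal sketch below; the { taskId, payload } event shape and the performTask function are assumptions, not your actual task format:

    // Hypothetical Lambda handler: one invocation performs one task.
    exports.handler = async (event) => {
      const { taskId, payload } = event; // assumed event fields
      const result = await performTask(payload);
      return { taskId, result };
    };

    // Placeholder for the real task logic (e.g. whatever currently
    // runs inside your Docker containers).
    async function performTask(payload) {
      return `processed ${JSON.stringify(payload)}`;
    }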

Related

Google Cloud Run not scaling as expected

I'm using Google Cloud Run to run a pretty basic Express / Node.js backend container. I receive a fairly low number of requests per day, and only the occasional concurrent request.
However, I can see on my Cloud Run dashboard that Cloud Run sometimes scales up to 4 instances, and most of the time runs at least 2. I know my app's load is so low that I'll pretty much never need more than 1 instance, so why is Cloud Run being so wasteful?
My settings are: maximum 40 concurrent requests; minimum 0 containers; maximum 4 containers.
The container instance count fluctuates substantially (in the dashboard chart, the green line is idle containers and the blue line is active containers). My CPU usage is also very low.
You know your workload profile and the expected request pattern. The Cloud Run autoscaler does not, so it over-provisions additional instances in case of a traffic spike.
Of course, YOU know that will never happen, but IT doesn't.
Cloud Run is pretty well designed for average traffic. If you are at one extreme of this standard usage (very low traffic, or very high and very spiky traffic), then yes, the Cloud Run autoscaler's provisioning model doesn't work so well.
However, what's the problem? You pay only when a request is being processed on an instance. If instances are over-provisioned and unused, you won't pay for them. It's a waste of money for Google, not for you.
Your only concern might be environmental impact and resource savings, and on that point you are absolutely right.

Changing date/time on an AWS EC2 instance causes the server to become unresponsive within about an hour

I have a program which runs on an EC2 instance. The program is time-driven and written in Go, so to test it I can't use faketime; I have to change the time on the server. The server is built from the Amazon Linux 2 AMI with some extra agents, and runs fine in many other cases (hundreds to thousands of instances) where the time is not modified. The instance type is m5a.xlarge.
Google can't find any other reports of issues changing the time for testing on EC2 Linux instances (or my search skills are not up to it).
If I change the time, the server runs OK for up to an hour or so, and then the load average starts to climb and it quickly becomes unresponsive.
The server has a single ENI (I did originally write that it had a second, but have simplified by removing that), and is using EFS.
Does anyone have experience setting time forwards on EC2 instances?
Update: having top running is useful, as it shows that although the load average is climbing, no single process is responsible (as far as I can see); nothing in the process table is using more than 1%.
Further update: removing the second ENI makes no difference. At some random time, but less than around an hour, the server becomes unresponsive.

Schedule based start/stop of EC2 Instances in Autoscaling groups

Our requirement: we have TIBCO BW components running on Amazon EC2 instances, and we need to start and stop the instances at timings provided by the business. Please note that all EC2 instances are within Auto Scaling groups.
I was able to start and stop the EC2 instances when there was no Auto Scaling group involved: I built a Lambda function and triggered it from CloudWatch, which worked fine. But I am not sure how to extend that to EC2 instances that are in Auto Scaling groups.
The expected result is that applications on EC2 instances will be stopped depending on the schedule provided by the business. All the EC2 instances are within Auto Scaling groups.
You can use Scheduled Scaling to modify an Auto Scaling group so that it adds/removes instances.
You can configure it to change one of three variables:
The Minimum number of instances. For example, increasing the minimum might launch additional instances.
The Maximum number of instances, which might cause instances to be terminated.
The Desired number of instances, which will set the quantity 'now', but the quantity might change later based upon other rules you have in place (e.g. when things get busy).
It is quite common for companies to increase the minimum quantity at the start of the day to provide more instances before things get busy. Similarly, it is common to decrease the minimum number of instances at night or on weekends to allow instances to scale-in if there are scaling rules in place to detect idle capacity.
Please note that Auto Scaling will either Launch new instances or Terminate existing instances. It does not start or stop instances.
See: Scheduled Scaling for Amazon EC2 Auto Scaling
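As an illustrative sketch (not the only way to set this up), scheduled actions like these could be created with the AWS SDK for Node.js; the group name, recurrence expressions, and sizes below are placeholders:

    // schedule-asg.js - scale the group out each weekday morning
    // and back in each evening (sketch).
    const AWS = require('aws-sdk');
    const autoscaling = new AWS.AutoScaling({ region: 'us-east-1' });

    async function scheduleBusinessHours() {
      await autoscaling.putScheduledUpdateGroupAction({
        AutoScalingGroupName: 'tibco-bw-asg',   // placeholder name
        ScheduledActionName: 'scale-out-morning',
        Recurrence: '0 8 * * MON-FRI',          // 08:00 UTC weekdays
        MinSize: 2, MaxSize: 4, DesiredCapacity: 2,
      }).promise();

      await autoscaling.putScheduledUpdateGroupAction({
        AutoScalingGroupName: 'tibco-bw-asg',
        ScheduledActionName: 'scale-in-evening',
        Recurrence: '0 18 * * MON-FRI',         // 18:00 UTC weekdays
        MinSize: 0, MaxSize: 4, DesiredCapacity: 0,
      }).promise();
    }

    scheduleBusinessHours().catch(console.error);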

Clustering Node.js on Bluemix

Will a Node.js app on Bluemix automatically be scaled to run on multiple processors, or do I need to implement that myself using Node's clustering API? And if I do use clustering, will there be more than one CPU available?
Short answer: you need to use Node's cluster module to take full advantage of all cores in each instance. Or you can simply increase the number of instances.
Long answer: Each instance of your application that you push to Bluemix runs in a Warden container. Resource control is managed by Linux cgroups. The number of cores per instance is not something you can control. Running a quick test on Bluemix, os.cpus() showed 4 cores. If you want to take advantage of all 4 cores in one Bluemix instance (Warden container) of your Node.js application, then you should use Node's cluster module.
Keep in mind, you can also just increase the number of instances (horizontal scaling), which can achieve near-linear results depending on where your bottleneck lies (for example, in calls to external services). So if you have 3 instances, each of those instances has 4 cores, and the built-in load balancer distributes traffic among the 3 instances.
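For reference, a minimal sketch of that cluster pattern (the port and request handler are placeholders):

    // cluster-sketch.js - fork one worker per core
    const cluster = require('cluster');
    const http = require('http');
    const os = require('os');

    if (cluster.isMaster) {
      os.cpus().forEach(() => cluster.fork());
      cluster.on('exit', () => cluster.fork()); // replace crashed workers
    } else {
      http.createServer((req, res) => {
        res.end(`handled by worker ${process.pid}\n`);
      }).listen(8080); // placeholder port
    }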
The hybrid model that Ram suggested makes sense. You might want to run some benchmarks to determine how many processes to run in one application container. You can use "cf app <app-name>" to monitor the CPU utilization of each app instance under load; if it isn't fully consuming the CPU, then it may make sense to spawn more processes.
However, please note:
* CPU might not be the bottleneck, in which case spawning more processes in the app container, or scaling out more app container instances, won't help;
* The more processes you spawn in one container, the more memory they consume, so make sure you do not spawn so many that you exceed the allocated memory limit (otherwise the app container will be killed).

Long running (or forever) task on Windows Azure

I need to write some data to a database every 50 seconds or so, similar to a Windows service running in the background and silently doing its job. Starting and stopping is not an option in my case, as I need a small amount of previously inserted data to be kept in memory. What's the best solution for this on Windows Azure or AWS?
Thank you.
With Windows Azure, you can choose either a Web or Worker role (both essentially Windows Server 2008 R2 or SP2) and use some type of timed event, as @Lucifure suggested. You could also run a scheduler, like Quartz.NET, or take advantage of Windows Azure queues or Service Bus queues to have messages show up at a certain time. However: you cannot have a "forever" task in a given role instance, because your VM instances will periodically be rebooted (e.g. for host OS maintenance every month). With role shutdowns you'll get notice, and you can handle these shutdown notices in Stopping() or OnStop(). If you have multiple instances, you can use a scheduler or a queue to ensure your events still trigger every 50 seconds or so and get handled across multiple instances (but only by one instance at any given time).
To preserve your in-memory information, one idea is to store that information in a cache. You have 2 choices:
Distributed (shared) cache service, which has been around for some time now. It runs independently of your role instances.
In-memory cache, just introduced in June 2012. Assuming you have more than one instance, the cache is spread across those instances. You can even run the cache inside of memory of your existing roles.
More information on caching is here.
There are a few StackOverflow answers regarding Quartz.net and Windows Azure, such as this one.
On Windows Azure, you can use a Worker Role, which can do this. It can be as simple as a while loop.
Try this article for an introduction.
http://www.c-sharpcorner.com/uploadfile/40e97e/windows-azu-creating-and-deploying-worker-role/
You could set up a System.Threading.Timer to fire every 50 seconds or so, and do your work whenever the event fires.
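Since most of this page is Node.js-flavored, here is the same timer pattern sketched in Node (using setInterval rather than System.Threading.Timer); the writeToDatabase function and the size of the in-memory window are placeholders:

    // Sketch: run a task every 50 seconds, keeping recent rows in memory.
    const INTERVAL_MS = 50 * 1000;
    const recent = []; // small in-memory window of previously inserted data

    // Placeholder for the real database write.
    async function writeToDatabase(row) {
      console.log('writing', row);
    }

    setInterval(async () => {
      const row = { at: new Date().toISOString(), previous: recent.length };
      await writeToDatabase(row);
      recent.push(row);
      if (recent.length > 100) recent.shift(); // cap the in-memory window
    }, INTERVAL_MS);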
