Google Cloud node.js flexible environment - node.js

I deployed a Node.js app as a learning tool and noticed that I'm getting billed for the project (around $1/day). I know Node.js on Google Cloud uses Compute Engine to run the VMs. Google says the flexible environment has all the advantages of the App Engine platform, but it seems the instances don't automatically stop and start to reduce billing when not in use.
I have a Java project that's been running on App Engine for years and I've never been billed anything. I'm guessing that's because the instances are shut down automatically when not in use. So my questions are:
Is there a way to configure the flexible environment to mimic the standard environment to reduce the operating costs?
Am I misusing something with the flexible environment?

According to Google App Engine Documentation,
Instances within the standard environment have access to a daily limit
of resource usage that is provided at no charge defined by a set of
quotas...
Instances within the flexible environment are charged the cost of the
underlying Google Compute Engine Virtual Machines.
According to this article,
Currently, the Flexible Environment needs at least one instance
running to serve traffic and there is no free tier.
This means that at any one time, you have at least one instance running, if you're using a Flexible VM. That should explain the billing.
Please note that by default App Engine launches two g1-small instances. Depending on your application's needs, this may be overkill. You should configure the compute resource settings in your app.yaml with appropriate values for RAM, disk size and CPU to save costs. You may also want to set min_num_instances to 1 in your service scaling settings.

I had the same problem. You can try to use Google's pricing calculator to figure out which configuration you need and how to minimize the cost of your application.
According to the calculator, the minimum cost for a flexible environment app is a little less than $40 per month. There is nothing to be done about it right now.
I eventually moved to Heroku because of that.

Related

Please suggest Google Cloud App Engine's smallest configuration

I have a Node.js web application / website hosted in Google Cloud App Engine. The website will have no more than 10 users per day and does not have any complex, resource-consuming features.
I used the app.yaml file given in the tutorial:
# [START app_yaml]
runtime: nodejs
env: flex
manual_scaling:
  instances: 1
resources:
  cpu: 1
  memory_gb: 0.5
  disk_size_gb: 10
# [END app_yaml]
But this is costing around 40 USD per month, which is too high for a basic application. Can you please suggest the lowest-cost resource configuration possible? It would be helpful if you could provide a sample app.yaml for it.
Google Cloud Platform's Pricing Calculator shows a Total Estimated Cost of $41.91 per month for the specs in your app.yaml, so your costs seem right.
App Engine Flexible instances are charged for their resources by the hour. With the manual_scaling option set, your instance is up all the time, even when there is no traffic and it is not doing any work. So not turning your instance down during idle time is the reason for the $40 bill. You might want to look into using Automatic or Basic scaling to minimize the time your instance is running, which will likely reduce your bill considering you don't have traffic 24/7 (you will find examples of proper app.yaml settings via the link).
Note that with automatic/basic scaling you get to select instance classes with less than 1 dedicated core (i.e. 0.2 & 0.5 CPUs). I'm not sure whether setting CPU to a value > 0 and < 1 with manual_scaling would also work here; you might want to give it a try as well.
Also, don't forget to have a detailed look at your bills to see what else you are potentially being charged for.
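To make the by-the-hour billing concrete, here is a rough sketch in Python that turns the calculator's monthly figure into an implied hourly and daily rate. It assumes a 730-hour billing month and a flat rate (both are my assumptions, not something stated in the calculator output), and the function names are made up for illustration.

```python
HOURS_PER_MONTH = 730  # assumption: average number of hours in a month


def hourly_rate(monthly_cost, hours_per_month=HOURS_PER_MONTH):
    """Implied flat hourly rate for an instance that is up the whole month."""
    return monthly_cost / hours_per_month


def daily_cost(monthly_cost, hours_per_month=HOURS_PER_MONTH):
    """Cost of keeping that same instance up for 24 hours."""
    return hourly_rate(monthly_cost, hours_per_month) * 24


if __name__ == "__main__":
    estimate = 41.91  # monthly estimate quoted above
    print(f"per hour: ${hourly_rate(estimate):.4f}")
    print(f"per day:  ${daily_cost(estimate):.2f}")
```

Under these assumptions a single always-on instance works out to a bit under 6 cents per hour, so every idle hour the instance stays up is billed at the same rate as a busy one.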
After a few searches, that seems to be the lowest possible configuration. See the related answer here:
Can you use fractional vCPUs with GAE Flexible Environment?
At least for now, there are no shared CPUs, so you'll pay for a full one even if your app uses only 2% of it on average. Maybe adding a few stars here will help change that in the near future:
https://issuetracker.google.com/issues/62011060
After reading articles on the internet I created one f1-micro (1 vCPU, 0.6 GB memory) VM instance running the Bitnami MEAN stack, which costs ~$5.5/month. I was able to host one MongoDB instance and two Node.js web applications on it. The two applications have different domain names.
I implemented a reverse proxy using Apache HTTP Server to route traffic to the appropriate Node.js application by its domain name/hostname. I have documented the steps I followed here: https://medium.com/#prasadkothavale/host-multiple-web-applications-on-single-google-compute-engine-instance-using-apache-reverse-proxy-c8d4fbaf5fe0
Feel free to suggest if you have any other ways to implement this scenario.
The cheapest way to host a Node.js application is through Google Compute Engine, not Google App Engine.
This is because you can host it for 100% free on Compute Engine!
I have many Node apps that have been running for the last 2 years, and I have been charged a maximum of a few cents per month, if any at all.
As long as you are fine with a low spec machine (shared vCPU) and no scaling, look into the Compute Engine Always Free options.
https://cloud.google.com/free/docs/always-free-usage-limits#compute_name
The only downside is that you have to set up the server (installing Node, setting up firewalls, etc.). But it is a one-time job, and easily repeatable after you have done it once.
App Engine Standard environment would be the best route for your use case. The standard environment runs directly on Google's infrastructure, scales quickly and scales down to zero when there's no traffic. The free quota might be sufficient for this use case as well.
App Engine Flexible environment runs as a container in a GCE VM (1 VM per instance/container). This makes it slower to scale than the standard environment, as scaling up requires new VMs to boot before the instance containers can be pulled and started. Flex also requires a minimum of 1 instance running all the time (whereas standard scales down to 0).
Flex is useful when your runtime/resource requirements go beyond the limitations of the standard environment.
You can understand more about the differences between the standard and flex environments at https://cloud.google.com/appengine/docs/the-appengine-environments
Use the Basic, not Flexible. It is a better fit and far cheaper for you.

What is the Azure equivalent of AWS Lambda?

At the moment we are running our application on AWS Beanstalk but are trying to determine the suitability of Azure.
Our biggest issue is the amount of wasted CPU time we are paying for but not using. We are running on t2.small instances as these have the minimum amount of RAM we need, but we never use even the base amount of CPU time allotted (20% for a t2.small). We need lots of CPU power during short bursts of the day, and bringing more instances online in advance of this is the only way we can handle it.
AWS Lambda looks like a good solution for us, but we have dependencies on Windows components like SAPI, so we have to run inside Windows VMs.
Looking at Azure cloud services we thought using a Web role would be best fit for our app but it seems a Web role is nothing more than a Win 2012 VM with IIS enabled. So as the app scales it just brings on more of these VMs which is exactly what we have at the moment. Does Azure have a service similar to Lambda where you just pay for the CPU processing time you use?
The reason for our inefficient use of CPU resources is that our speech generation app uses lots of 3rd party voices but can only run single-threaded when calling into SAPI, because the voice engine is prone to crashing when multithreading. We have no control over this voice engine. It must have access to a system registry and Windows SAPI, so the ideal solution is to somehow wrap all dependencies in a package, deploy this onto Azure, and then kick off multiple instances of it. What "this" is, I have no idea.
Microsoft just announced a new serverless compute service as an alternative to AWS Lambda, called Azure Functions:
https://azure.microsoft.com/en-us/services/functions/
http://www.zdnet.com/article/microsoft-releases-preview-of-new-azure-serverless-compute-service-to-take-on-aws-lambda/
With Azure Functions you only pay for what you use, with compute metered to the nearest 100 ms at a per-GB price based on the time your function runs and the memory size of the function space you choose. Function space size can range from 128 MB to 1536 MB, and the first 400k GB-seconds are free.
Azure Function requests are charged per million requests, with the first 1 million requests free.
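As a rough illustration of how that metering adds up, here is a sketch in Python. The free grants and the memory range come from the description above; the two price parameters are placeholders you would have to fill in from the current pricing page, the function names are made up, and rounding each run up to the next 100 ms is my reading of "metered to the nearest 100 ms".

```python
import math

FREE_GB_SECONDS = 400_000   # free compute grant mentioned above
FREE_REQUESTS = 1_000_000   # free request grant mentioned above


def billable_gb_seconds(executions, duration_ms, memory_mb):
    """GB-seconds consumed, assuming each run is rounded up to 100 ms."""
    if not 128 <= memory_mb <= 1536:
        raise ValueError("memory_mb must be between 128 and 1536")
    billed_ms = math.ceil(duration_ms / 100) * 100
    return executions * billed_ms / 1000 * memory_mb / 1024


def monthly_cost(executions, duration_ms, memory_mb,
                 price_per_gb_second, price_per_million_requests):
    """Estimated bill after subtracting the free grants.

    Both price arguments are placeholders supplied by the caller.
    """
    gb_seconds = billable_gb_seconds(executions, duration_ms, memory_mb)
    compute = max(0.0, gb_seconds - FREE_GB_SECONDS) * price_per_gb_second
    paid_requests = max(0, executions - FREE_REQUESTS)
    requests = paid_requests / 1_000_000 * price_per_million_requests
    return compute + requests
```

For a bursty workload like the one in the question, the point of the sketch is that cost scales with executions times duration times memory, not with the number of hours a VM sits idle.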
Based on the documentation on Azure website here: https://azure.microsoft.com/en-in/campaigns/azure-vs-aws/mapping/, the services equivalent to AWS Lambda are Web Jobs and Logic Apps.
The most direct equivalent of Lambda on Azure is Azure Automation, which does a lot of what Lambda does except that it runs PowerShell instead of Node etc. It isn't as tightly integrated with other services as Lambda is, but it has the same model, i.e. you write a script and it is executed on demand.
I presume by SAPI you are referring to the Speech API? If so, you can create PowerShell modules for Azure, and they can include DLL files. In that case you could create a module that wraps the SAPI DLL, and that should do what you are looking for.
If you want a full compute environment without the complexity of managing multiple machines yourself, you could use Azure Batch, which would be the Azure-recommended way of running what you are looking for.
The cost benefit you need to evaluate is how much quicker your solution would run on a native .NET stack (in Batch), and whether performance is significantly degraded when run from PowerShell.
Personally I would give Automation a try, it is surprisingly powerful.
There is something called "Cloud Service" in Azure which allows you to run code on a pure VM. Scaling options on these include such things as CPU%, queue size, etc. If you can schedule your needs, Azure allows you to easily set up a scheduled scaler, e.g. 4 VMs from 8:00 AM until 8:10 AM, and of course, in Azure you pay by the minute, so it could be a feasible solution.
I'd say more, but the documentation in Azure is really so great that I'd be offending them by offering my "translation" here. Check out azure.com for more info :)

What does the Azure Web Apps architecture look like?

I've had a few outages of 10 to 15 minutes because, apparently, Microsoft had a 'blip' on their storage. They told me that it is because of a shared file system between the instances (making it a single point of failure?).
I didn't understand it and asked how a file share is involved, because I would assume a really dumb stateless IIS app that communicates with SQL Azure for its data.
I would assume the situation below:
This is their reply to my question (I didn't include the drawing)
The file shares are not necessarily for your web app to communicate to
another resources but they are on our end where the app content
resides on. That is what we meant when we suggested that about storage
being unavailable on our file servers. The reason the restarts would
be triggered for your app that is on both the instances is because the
resources are shared, the underlying storage would be the same for
both the instances. That’s the reason if it goes down on one, the
other would also follow eventually. If you really want the
availability of the app to be improved, you can always use a traffic
manager. However, there is no guarantee that even with traffic manager
in place, the app doesn’t go down but it improves overall availability
of your app. Also we have recently rolled out an update to production
that should take care of restarts caused by storage blips ideally, but
for this feature to be kicked it you need to make sure that there is
ample amount of memory needs to be available in the cases where this
feature needs to kick in. We have couple of options that you can have
set up in order to avoid any unexpected restarts of the app because of
a storage blip on our end:
You can evaluate if you want to move to a bigger instance so that
we might have enough memory for the overlap recycling feature to be
kicked in.
If you don’t want to move to a bigger instance, you can always use
local cache feature as outlined by us in our earlier email.
Because of the time differences the communication takes ages. Can anyone tell me what is wrong in my thinking?
The only thing I can think of is that when you enable two instances, they run on the same physical server. But that makes very little sense to me.
I have two instances, each with one core and 1.75 GB of memory.
My presumption for App Service Plans was that they were automatically split into availability sets (see below for a brief description), largely based on the Web Apps sales spiel, which states:
App Service provides availability and automatic scale on a global data centre infrastructure. Easily scale applications up or down on demand, and get high availability within and across different geographical regions.
Following on from David Ebbo's answer and comments, the underlying architecture of Web Apps appears to be that the VMs themselves are separated into availability sets. However, all of the instances use the same file server to share the underlying disk space, which makes this file server a significant single point of failure.
To mitigate this, Azure has created the WEBSITE_LOCAL_CACHE_OPTION setting, which caches the contents of the file server on the individual Web App instances. In other words, it uses caching in lieu of solid, high-availability engineering principles.
The problem here is that as a customer we have no visibility into this issue, we've no idea if there is a plan to fix it, or if or when it will ever be fixed since it seems unlikely that Azure is going to issue a document that admits to how badly this has been engineered, even if it is to say that it is fixed.
I also can't imagine that this issue would be any different between ASM and ARM. It seems exceptionally unlikely that there was originally a high availability solution at the backend that they scrapped when ARM came along. So it is very likely that cloud services would suffer the exact same issue.
The small upside is that now that we know this is an issue, one possible solution would be to deploy multiple web apps and have a traffic manager between them. Even if they are in the same region, different apps should have different backend file servers.
My first action would be to reply to that email, with a link to the Web Apps page, (and this question) with a copy of the quote and ask how to enable high availability within a geographic region.
After that you'll likely need to rearchitect your solution!
Availability sets
For virtual machines Azure will let you specify an availability set. An availability set will automatically split VMs into separate update and fault domains. Meaning that servers will end up in different server racks, and those server racks won't get updates at the same time. (it is a little more complex than that, but that's the basics!)
Azure Web Apps do use shared file storage. The best way to think about it is that all the instances of your app map to the same network share that has your files. So if you modify the files by any means (e.g. FTP, msdeploy, git, ...), all the instances instantly get the new files (since there is only one set of files).
And to answer your final question, each instance does run on a separate VM.

Create azure VM on my local machine

Is it possible to create one or several Azure VMs on my local machine? I want to create a web app and load test it locally, without the need to put it in the cloud. I'm thinking of the following scenario: I have a local VM running an IIS server with my web app; I use a tool to generate a lot of load; I need to deploy a second VM containing the same things as the first VM. The downtime of the web app should be equal to 0 (hopefully).
Clarification (update):
I want to achieve the following: create a web app and a monitoring app (CPU, memory) and deploy them on one VM. During a load test, if the VM cannot handle it (e.g. CPU goes above 80%), I want to programmatically deploy a new VM (with the same configuration, having both the web app and the monitoring app), such that no downtime occurs.
Azure has several ways for you to host sites.
Virtual Machines is just that, normal VMs. You can create them locally and upload them, but everything is up to you, including how to handle upgrades. If that is what you need to do then I don't know how you would handle upgrades with no down time; though, you can add multiple VMs to a load balancer and then upgrade them one at a time.
It sounds like what you really want to explore is Cloud Services. You can run one or more VMs locally in the emulator, upgrade with no down time once in the cloud, implement auto scaling (you will have to use a tool or write some code).
Alternatively you may want to look at Azure Web sites, but that is a completely different concept and you can't really test load and load balancing locally the same way.
Based on your statement that you essentially want to auto-scale your application you want to look at Cloud Services with Auto Scaling. However, you can't fully test this in the cloud emulator - but you can test your logic.
Background
Azure Cloud Services is designed for this kind of thing. You don't really work with VMs in the way you may be used to; instead you create a package that Azure then deploys to as many servers as you like. Once up and running, you can manually go into the management console and increase or decrease the number of active servers simply by moving a slider. Of course, you want to do this automatically, so you have a few options.
There is a management API you can use to change the number of servers. So it would be quite simple to write a bit of code that you spin up in another thread from WebRole.Start and that simply sits and monitors the CPU on the machine, then calls the management API to spin up a new server instance if your CPU goes over a certain threshold. Locally you can only test that the call to the management API is made; you won't actually see the new server coming up. But if you grab your free trial of Azure and just try it, you will see that you really don't need to test that part - it just works.
However, in practice there is an awful lot more to auto scaling. Here are some of the things you need to consider:
Even relatively idle web servers will often spike briefly to 100%, so just having a simple threshold is unlikely to be good enough. You need to decide how long the server needs to be over a certain threshold before you spin up another server instance.
What happens when you have more than one server? And, on Azure, you should always have at least two servers to ensure you have resilience. Note that the idea with Cloud Services really is to have many small servers rather than a few big servers. You pay per core, not per number of servers.
Imagine you currently have three servers and one is really busy for some reason and the other two are idle. Do you want to spin up a fourth server?
Imagine you currently have two servers and they are both quite busy. Do you really want them both to start a new server so you end up with four servers running?
There are several ways to handle these challenges. For starters, rather than having monitor programs running locally on each server, you are better off moving that monitoring outside. Azure comes with the ability to dump performance metrics to table storage at whatever interval you choose. You can then run an external program that retrieves the performance data over time from all your current servers and reasons about the overall workload before deciding to spin up or shut down additional servers. Now, you can of course host that external monitor program in a separate thread on each of your web roles to give your monitoring resilience, but the key point is that the monitoring program doesn't monitor the server it runs on; it monitors all the servers. You will, of course, still have to stop multiple monitoring program instances from all starting and stopping servers. One way to do this is to place stop/start commands onto an Azure "message queue" (there are a few different types) and use the built-in "de-duper", which will automatically delete identical commands that are put on the queue within a certain time window (I am oversimplifying, but you get the idea).
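To illustrate the "reason about the overall workload" step, here is a minimal sketch in Python of a scaling decision made from CPU samples collected for every server. It does not call any Azure API; the function name, the input shape and the 20% scale-down threshold are all assumptions made for illustration (the 80% figure is taken from the question), and fetching the metrics and issuing the actual scale command are left out.

```python
from statistics import mean


def scaling_decision(samples_by_server, high=80.0, low=20.0, min_servers=2):
    """Return +1 (add a server), -1 (remove one) or 0 (do nothing).

    samples_by_server maps a server name to its CPU% samples for the
    observation window, oldest first. All names here are hypothetical.
    """
    if not samples_by_server:
        raise ValueError("need samples for at least one server")
    window = min(len(samples) for samples in samples_by_server.values())
    if window == 0:
        return 0
    recent = [samples[-window:] for samples in samples_by_server.values()]
    # Average across the whole fleet at each point in time, so one busy
    # server next to idle ones does not trigger a scale-up on its own.
    fleet_avg = [mean(series[i] for series in recent) for i in range(window)]
    # Require the threshold to hold for the entire window, so a brief
    # spike to 100% is ignored.
    if all(value > high for value in fleet_avg):
        return 1
    if all(value < low for value in fleet_avg) and len(recent) > min_servers:
        return -1
    return 0
```

Because the decision is computed from fleet-wide data, every monitor instance looking at the same samples reaches the same answer, which is what makes de-duplicating the resulting commands on a queue workable.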
The actual answer
Really, though, you want to look at the Auto Scaling Application Block which will do most of this for you. I guess that is the real answer to your question, but I wanted to provide a bit of context first.
Again, I recognise you asked for how to test this locally - but I believe that that question doesn't really make sense in the context of Azure and I hope the above information helps.
I'm pretty sure you can't do that, and it wouldn't make sense anyway. If you want load testing, you need to run it in an environment as similar to production as possible, and that means you have to run your application in the Azure cloud. How else would you know that the load will actually be handled fine on the real cloud?

deploying CPU intensive web service on cloud

I have an application which I want to expose as a web service (SaaS). The application is CPU-intensive and multithreaded, and it takes a good amount of time to execute (15-20 seconds on average). I want to expose it as SaaS using one of the existing cloud services on the market, such as Amazon or Google App Engine, so that the cost and work involved in scaling my service stay low. I have a couple of questions in mind:
1.) The application is multithreaded and the number of threads invoked depends on the number of results returned by the service (so the number of threads is basically dynamic). Right now I have a 6-core processor, so I have kept the thread pool size at 6, but since I am moving to the cloud, how can I optimally use the cloud infrastructure?
2.) Do the cloud service providers (which ones?) give the option to select the number of CPU cores required for each request (or something similar to serve my purpose)?
3.) What changes are needed in the code (related to the threads)?
4.) Is there any other specific area I should look into when moving to the cloud?
In Amazon EC2 you are basically paying for different types of instances - you are free to pick one with only a single core or one with sixteen. You get what you pay for.
how can I optimally use the cloud infrastructure?
Your approach is fine. If your task is CPU-intensive, have a thread pool with the same number of threads as CPU cores/CPUs.
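Rather than hard-coding 6, the pool size can be read from the machine at startup, so the same build adapts to whatever instance type you pick. The sketch below uses Python only for illustration (the question does not say which language the application is written in), and the helper names are made up.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def pool_size():
    """One worker per CPU core reported by the OS, with a floor of 1."""
    return os.cpu_count() or 1


def run_all(tasks):
    """Run zero-argument callables on a pool sized to the core count."""
    with ThreadPoolExecutor(max_workers=pool_size()) as pool:
        # map() returns results in the same order as the input tasks
        return list(pool.map(lambda task: task(), tasks))


if __name__ == "__main__":
    print("workers:", pool_size())
    print(run_all([lambda: sum(range(1000)), lambda: max(range(10))]))
```

One caveat specific to this illustration: in CPython, threads running pure-Python CPU-bound code do not execute in parallel, so there you would swap in ProcessPoolExecutor, which accepts the same max_workers argument. In a language with true parallel threads the sizing rule applies to the thread pool directly.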
select number of CPU cores required for each request
No, at least not Amazon. You run your application on a given instance and that's all you get. You have to pick instance type in advance, but of course you are free to switch between them, add new, etc. at any time. The cloud!
In Google App Engine you can't create threads, so it's not an option for you. See also: Why does Google App Engine support a single thread of execution only?
3.) What changes are needed in the code (related to the threads)?
None. It's a standard PC, after all.
4.) Is there any other specific area I should look into when moving to the cloud?
Well, see above: some services are completely useless for you, like GAE. Do some research before you actually pay for something.

Resources