Pricing of Google App Engine Flexible env, a $500 lesson

I followed the Node.js on App Engine Flexible env tutorial:
https://cloud.google.com/appengine/docs/flexible/nodejs/create-app
Having successfully deployed and tested the tutorial, I changed the code to experiment a little and successfully deployed it... and then left it running since this was a testing environment (not public).
A month later, I received a bill from Google for over $370!
In the transaction details I see the following:
Oct 1 – 31, 2017 App Engine Flex Instance RAM: 5948.774 Gibibyte-hours ([MYPROJECT]) $42.24
Oct 1 – 31, 2017 App Engine Flex Instance Core Hours: 5948.774 Hours ([MYPROJECT]) $312.91
How did this testing environment with almost 0 requests require about 6,000 hours of resources? At worst, I would have assumed 720 hours running full-time for a month at $0.05 per hour, which would cost me ~$40.
https://cloud.google.com/appengine/pricing
Can someone help shed light on this? I have not been able to find out why so many resources were needed.
Thanks for the help!
For more data, this is the traffic over the last month (basically 0):
And instance data
UPDATE:
Note that I did make one modification to the package.json: I added nodemon as a dependency and made it part of my "npm start" script. Though I doubt this explains the 6000 hours of resources:
"scripts": {
  "deploy": "gcloud app deploy",
  "start": "nodemon app.js",
  "dev": "nodemon app.js",
  "lint": "samples lint",
  "pretest": "npm run lint",
  "system-test": "samples test app",
  "test": "npm run system-test",
  "e2e-test": "samples test deploy"
},
app.yaml (default, no change from the tutorial):
runtime: nodejs
env: flex

After much back and forth with Google, and hours of reading blogs and looking at reports, I've finally found an explanation for what happened. I will post it here with my suggestions so that other people do not also fall victim to this problem.
Note, this may seem obvious to some, but as a new GAE user, all of this was brand new to me.
In short, when you deploy to GAE with the command "$ gcloud app deploy", it creates a new version and sets it as the default but, more importantly, it does NOT remove the previously deployed version.
More info about versions and instances can be found here: https://cloud.google.com/appengine/docs/standard/python/an-overview-of-app-engine
So in my case, without knowing it, I had created multiple versions of my simple node app. These versions are still running in case one needs to switch following an error. But these versions also require instances, and the default, unless stated in the app.yaml, is 2 instances.
Google says:
App Engine by default scales the number of instances running up and
down to match the load, thus providing consistent performance for your
app at all times while minimizing idle instances and thus reducing
cost.
However, from my experience, this was not the case. As I said earlier, I pushed my node app with nodemon which it seems was causing errors.
In the end, following the tutorial and not shutting down the project, I had 4 versions, each with 2 instances running full-time for 1.5 months serving 0 requests and generating lots of error messages and it cost me $500.
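As a rough sanity check on the invoice, dividing the billed hours by the number of hours in October gives the average number of instances that were running around the clock. The sketch below uses only the figures quoted in the question; the variable names are mine, and the per-hour rates it prints are simply what the invoice implies, not official list prices.

```javascript
// Figures copied from the October 2017 invoice quoted in the question.
const billedCoreHours = 5948.774;
const coreHoursCharge = 312.91; // USD
const ramCharge = 42.24;        // USD, billed for 5948.774 GiB-hours

// October has 31 days.
const hoursInOctober = 31 * 24;

// Average number of instances running full-time during the month.
const averageInstances = billedCoreHours / hoursInOctober;

// Effective rates implied by the invoice (not official list prices).
const impliedCoreRate = coreHoursCharge / billedCoreHours;
const impliedRamRate = ramCharge / billedCoreHours;

console.log(`hours in October: ${hoursInOctober}`);
console.log(`average instances: ${averageInstances.toFixed(2)}`);
console.log(`implied $/core-hour: ${impliedCoreRate.toFixed(4)}`);
console.log(`implied $/GiB-hour: ${impliedRamRate.toFixed(4)}`);
```

The average comes out just under 8 instances, which is consistent with 4 versions running 2 instances each for the whole month.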
RECOMMENDATIONS IF YOU STILL WANT TO USE GAE FLEX ENV:
First and foremost, setup a billing budget & alerts so that you do not get surprised by an expensive invoice that is automatically charged to your CC: https://cloud.google.com/billing/docs/how-to/budgets
In a testing env, you most likely do not need multiple versions, so while deploying use the following command:
$ gcloud app deploy --version v1
Update your app.yaml to force only 1 instance with minimal resources:
runtime: nodejs
env: flex
# This sample incurs costs to run on the App Engine flexible environment.
# The settings below are to reduce costs during testing and are not appropriate
# for production use. For more information, see:
# https://cloud.google.com/appengine/docs/flexible/nodejs/configuring-your-app-with-app-yaml
manual_scaling:
  instances: 1
resources:
  cpu: 1
  memory_gb: 0.5
  disk_size_gb: 10
Set daily spending limit
See this blog post for more info: https://medium.com/google-cloud/three-simple-steps-to-save-costs-when-prototyping-with-app-engine-flexible-environment-104fc6736495
I wish some of these steps had been included in the tutorial in order to protect those who are trying to learn and experiment, but they were not.
Google App Engine Flex env can be tricky if one does not know all these details. A friend pointed me to Heroku, which has both set pricing and Free/Hobby offers. I was able to quickly push a new node app there, and it worked like a charm!
https://www.heroku.com/pricing
It "only" cost me $500 to learn this lesson, but I do hope this helps others looking at Google App Engine Flex Env.

If you want to reduce your GAE costs please do not use manual_scaling as suggested in this article or the accepted answer!
The beautiful thing about Google App Engine is that it can scale up and down to hundreds of machines within milliseconds based on demand. And you only pay for instances that are running.
To be able to optimize your costs you need to understand the different scaling options and instance types:
1. App engine flex vs standard:
The details about differences can be found here, but one important difference relevant for this question is:
[Standard is] Intended to run for free or at very low cost, where you pay only for
what you need and when you need it. For example, your application can
scale to 0 instances when there is no traffic.
2. Scaling Options:
Automatic scaling: Google will scale your app depending on demand and configuration you provided.
Manual scaling: No scaling at all. GAE will run the exact # of instances you asked for, all the time (very misleading naming).
Basic scaling: It will scale up to limit you set and will also scale down after certain time
3. Instance Types:
There are 2 instance types, and they basically differ in the time it takes to spin up a new instance. F class instances (used in automatic scaling) can be created when needed within ~0.1 seconds, and B class instances (used in manual/basic scaling) within ~0.7 seconds.
Now that you understand the basics, let's go back to the accepted answer:
manual_scaling:
  instances: 1
resources:
  cpu: 1
  memory_gb: 0.5
  disk_size_gb: 10
This instructs GAE to run a custom instance class (more costly) all the time. That is not the cheapest option: a B1/F1 instance class with lower specs could be used instead, and here the instance also runs constantly.
The cheapest option is to turn off the instance when there is no traffic. If you don't mind the ~0.1 second spin-up time, you could go with this instead:
instance_class: F1
automatic_scaling:
  max_instances: 1  # you can adjust this as you wish
  min_instances: 0  # will scale to 0 when there is no traffic, so it won't incur costs
This will fall within the free quotas Google provides, and it should not cost you anything if you don't have any real traffic.
PS: It's also highly recommended to set up a daily spending limit in case you forget something running or have some costly settings somewhere (daily spending limits are deprecated but will be available until July 24, 2021).

We had code deployed to GAE FE go absolutely nuts due to a cascading, exponential failure (bounced emails generated bounced-email emails, etc.) and we could NOT turn off the GAE instances that were bugged. Mailgun just would NOT let us disable the account: it said "Please wait up to 24 hours for the password change to go into effect", and revoking API keys did nothing. After 4+ hours and 1M+ emails sent, with the redis VM stopped, the DB down, and all the site's code reduced to a single "Down For Maintenance" static 503 page, the emails kept being sent.
I determined that GAE FE just simply does not end either docker VMs or Cloud Compute VMs (redis) that are under CPU load. Maybe never! Once we actually deleted the Compute VM (instead of "merely" stopping it), the emails instantly stopped.
But, our DB continued to get filled with "could not send email" notices for up to 2 more hours, despite the GAE app reporting 100% of the versions and instances to be "Stopped". I ended up having to change the Google Cloud SQL password.
We kept checking the bill, and the 7 rogue instances kept using up CPU and so we cancelled the card used on that account, and the site did, in fact, go down when the bill was past due, but so did the rogue instances. We never were able to resolve the situation with GAE email support.
Update (30 Sep 2020): This is still the worst moment of my 22-year career!! An entire company of 15 crack genius devs couldn't figure out how to turn off GAE. We knew customers were receiving MILLIONS of emails when one of my devs couldn't access her GMail account. Couldn't unplug it, couldn't turn it off. It was quite a "Terminator" moment!
It wouldn't have been nearly so bad, except for expenses, if MailGun had allowed us to actually disable the API access or change the password. But it would have still been bad expense-wise on GAE.
I no longer trust servers I can't issue a reboot on.
In the end, MailGun only charged us about $50. GAE, however... If I had just assumed "OK, mails stopped, we can stop", we could have ended up with a $20,000 excess bill! As it was, it "only" cost $1,500. And we never could get in contact with anyone to dispute it. So the CEO just ate it.

Also note that if you still want your app to have automatic scaling but you don't want the default minimum of 2 instances running at all times, you can configure your app.yaml like so:
runtime: nodejs
env: flex
automatic_scaling:
  min_num_instances: 1

Since no one has mentioned them, here are the gcloud commands for managing versions:
# List all versions
$ gcloud app versions list
SERVICE  VERSION.ID       TRAFFIC_SPLIT  LAST_DEPLOYED              SERVING_STATUS
default  20200620t174631  0.00           2020-06-20T17:46:56+03:00  SERVING
default  20200620t174746  0.00           2020-06-20T17:48:12+03:00  SERVING
default  prod             1.00           2020-06-20T17:54:51+03:00  SERVING
# Delete these 2 versions (you can't delete all versions, you have to have at least one remaining)
$ gcloud app versions delete 20200620t174631 20200620t174746
# Help
$ gcloud app versions --help
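If there are many leftover versions, picking them out by hand gets tedious. Here is a minimal sketch that parses the text table from the listing above and builds the matching delete command. It assumes a single service and the column order shown in that sample output; the function name idleVersions is hypothetical, and the script only prints the command rather than running it.

```javascript
// Sample output of `gcloud app versions list`, copied from the listing above.
const listing = `
SERVICE  VERSION.ID       TRAFFIC_SPLIT  LAST_DEPLOYED              SERVING_STATUS
default  20200620t174631  0.00           2020-06-20T17:46:56+03:00  SERVING
default  20200620t174746  0.00           2020-06-20T17:48:12+03:00  SERVING
default  prod             1.00           2020-06-20T17:54:51+03:00  SERVING
`;

// Return the IDs of versions that are still serving but receive no traffic.
function idleVersions(text) {
  return text
    .trim()
    .split("\n")
    .slice(1) // skip the header row
    .map((line) => line.trim().split(/\s+/))
    .filter(([, , trafficSplit, , status]) =>
      Number(trafficSplit) === 0 && status === "SERVING")
    .map(([, versionId]) => versionId);
}

const idle = idleVersions(listing);

// Build the delete command for review; nothing is executed here.
const deleteCommand = idle.length > 0
  ? `gcloud app versions delete ${idle.join(" ")}`
  : null;

console.log(deleteCommand);
```

For the sample listing this prints the same delete command as shown above, leaving the prod version that carries all the traffic untouched.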

For dev environments where I don't mind a little latency, I'm using the following settings:
instance_class: B1
basic_scaling:
  max_instances: 1
  idle_timeout: 1m
And if you use your instance more than the free backend instance allowance, try this:
instance_class: F1
automatic_scaling:
  max_instances: 1
In the App Engine dashboard, watch the Instances, take note of the start time, and make sure that after the idle_timeout period has passed the instance count drops to zero and you see the message "This version has no instances deployed".

These options don't work in the flex env:
app.yaml:
# 1.
resources:
  cpu: .5
  memory_gb: .18
  disk_size_gb: 10
# 2.
automatic_scaling:
  min_instances: 1
  max_instances: 1
# 3.
beta_settings:
  machine_type: f1-micro
Related errors:
1.
Error Response: [3] App Engine Flexible validation error: Memory GB (0.58) per VCPUs must be between 0.90 and 6.50
2.
ERROR: (gcloud.app.deploy) INVALID_ARGUMENT: VM-based automatic scaling should NOT have the following parameter(s): [standard_scheduler_settings.min_instances, standard_scheduler_settings.max_instances]
'@type': type.googleapis.com/google.rpc.BadRequest
fieldViolations:
  description: 'VM-based automatic scaling should NOT have the following parameter(s): [standard_scheduler_settings.min_instances, standard_scheduler_settings.max_instances]'
  field: version.automatic_scaling
3.
ERROR: (gcloud.app.deploy) INVALID_ARGUMENT: Unrecognized or unpermitted key(s) in configuration "beta_settings"
'@type': type.googleapis.com/google.rpc.BadRequest
fieldViolations:
  description: beta_setting key can not be used with env:flex
  field: machine_type
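The first error can be caught before deploying. The sketch below rests on two assumptions that are inferred from the error text rather than confirmed anywhere in this thread: that the flexible environment adds about 0.4 GB of overhead to the requested memory_gb (0.18 + 0.4 = 0.58, the figure in the message), and that a fractional cpu value is counted as one core. The function name and constants are hypothetical.

```javascript
// Bounds taken from the error message: memory per vCPU must be between 0.90 and 6.50 GB.
const MIN_GB_PER_VCPU = 0.9;
const MAX_GB_PER_VCPU = 6.5;

// Assumption: overhead added to the requested memory (0.58 - 0.18 in the error above).
const ASSUMED_OVERHEAD_GB = 0.4;

function checkFlexResources({ cpu, memoryGb }) {
  // Assumption: fractional cores are counted as at least one vCPU.
  const vcpus = Math.max(1, Math.ceil(cpu));
  const totalGb = memoryGb + ASSUMED_OVERHEAD_GB;
  const perVcpu = totalGb / vcpus;
  return {
    perVcpu: Number(perVcpu.toFixed(2)),
    ok: perVcpu >= MIN_GB_PER_VCPU && perVcpu <= MAX_GB_PER_VCPU,
  };
}

// Option 1 from this answer, which was rejected at deploy time.
console.log(checkFlexResources({ cpu: 0.5, memoryGb: 0.18 }));

// The tutorial's resources block (cpu: 1, memory_gb: 0.5).
console.log(checkFlexResources({ cpu: 1, memoryGb: 0.5 }));
```

Under those assumptions the check flags option 1 and accepts the tutorial's settings, which matches what was observed, but treat it as a guess at the rule rather than the validator Google actually runs.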

Related

Google Cloud Run not scaling as expected

I'm using Google Cloud Run to run a pretty basic Express / Node JS backend container. I receive a fairly low number of requests per day, and only the occasional concurrent request.
However, I can see on my Cloud Run dashboard that Cloud Run sometimes scales up to 4 instances, and most of the time to at least 2 instances. I know that my app load is so low that I'll pretty much never need more than 1 instance, so why is Cloud Run being so wasteful?
My settings are a maximum of 40 concurrent requests, a minimum of 0 containers and a maximum of 4 containers.
Container instance counts fluctuate substantially. Green line is idle containers and blue line is active containers.
My CPU usage is also very low:

You know your workload profile and the expected requests. The Cloud Run autoscaler does not. Therefore, it over-provisions additional instances in case of a traffic spike.
Of course, YOU know that will never happen, but IT doesn't.
Cloud Run is pretty well designed for average traffic. If you are at one extreme of this standard usage (very low traffic or very high, very spiky traffic), yes, the Cloud Run autoscaler provisioning model doesn't work so well.
However, what's the problem? You pay only when a request is processed on an instance. If there are over-provisioned and unused instances, you won't pay for them. It's a waste of money for Google, not for you.
Your only concern could be for the earth and saving resources, and you are absolutely right.

Do GC App Engine deployments of class B1 go idle, even if they have min of 1 instance?

I have the following configuration in my app.yaml
runtime: nodejs10
manual_scaling:
  instances: 1
instance_class: B1
With this configuration, will my instance go idle even if it does not receive requests?
My node app is not really an HTTPS server. It runs a couple of jobs based on data from a Firestore DB. I want this to always run and was wondering if this is possible with App Engine.

According to the App Engine Standard documentation, the instances of a service with manual scaling won't be shut down even if there is no workload:
Manual scaling
Manual scaling specifies the number of instances that continuously run regardless of the load level. This allows tasks such as complex initializations and applications that rely on the state of the memory over time.
But notice that instances might be shut down and restarted at some point due to maintenance tasks. I would suggest reading that full documentation page to learn more about how instances are managed with the different scaling types.

How can I decrease deployment time of Node app on Google App Engine

Right now the time is around 10 minutes, but my app uses 2 minutes on npm install, which App Engine does on every deploy, and then runs in about 5 seconds. Why does it take such a long time, and are there any tricks that can be done to lower this?
I have heard elsewhere that this is because of changing routes, and that Docker slows things down. But I would believe a company like Google could manage to at least cut this down to 1/3 of the current time.
There are some older questions, but I would like to have an up-to-date answer:
Google cloud deploy so slow
why does google appengine deployment take several minutes to update service
https://groups.google.com/forum/#!topic/google-appengine/hZMEkmmObDU

At the moment, App Engine Flexible deployments are indeed quite slow but as stated in the links you provided (this still stands true), most of the deployment time needed is incurred by actions you can't act upon (load balancer and network configuration, etc...). What you CAN do to speed it up is to:
limit the size of the app you're deploying
limit the complexity of the build necessary in the Dockerfile, if present
ensure you have a fast and reliable internet connection during deployment
Now, there is one option to bypass most of the new setting-up overheads during development. You may specify an already existing version name as parameter during deployment and also specify --no-promote flag:
gcloud app deploy --version <existing-version-number> --no-promote
I've tried it myself and it drastically reduced the deployment time, to ~1m30 for a Hello World app. It does an in-place replacement instead of a new one. Of course, most of the saved time is due to skipped overhead and you'll have to manually direct traffic to that new version. Also, versioning clarity will obviously be impacted, that's why I wouldn't recommend it for production deployment.

Please suggest Google Cloud App Engine's smallest configuration

I have a node.js web application / website hosted in Google Cloud App Engine. The website will have no more than 10 users per day and does not have any complex resource consuming feature.
I used app.yaml file given in tutorial
# [START app_yaml]
runtime: nodejs
env: flex
manual_scaling:
  instances: 1
resources:
  cpu: 1
  memory_gb: 0.5
  disk_size_gb: 10
# [END app_yaml]
But this is costing around 40 USD per month, which is too high for a basic application. Can you please suggest the lowest-cost resource configuration possible? It would be helpful if you could provide a sample app.yaml for it.

Google Cloud Platform's Pricing Calculator shows that the specs in your app.yaml turn out to be Total Estimated Cost: $41.91 per 1 month so your costs seem right.
AppEngine Flexible instances are charged for their resources by hour. With manual_scaling option set your instance is up all the time, even when there is no traffic and it is not doing any work. So, not turning your instance down during the idle time is the reason for the $40 bill. You might want to look into using Automatic or Basic scaling to minimize the time your instance is running, which will likely reduce your bill considering you don't have traffic 24/7 (you will find examples of proper app.yaml settings via the link).
Note that with automatic/basic scaling you get to select instance classes with less than 1 dedicated core (i.e. 0.2 & 0.5 CPUs). Not sure if setting CPU to be > 0 and < 1 with manual_scaling here would also work; you might want to give it a try as well.
Also, don't forget to have a detailed look at your bills to see what else you are potentially being charged for.
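To see where a figure of roughly $40 per month comes from, here is a small estimate for an always-on instance. The two hourly rates are assumptions: they are the effective rates implied by the October 2017 invoice quoted earlier on this page ($312.91 and $42.24, each for 5948.774 hours), not official prices, and disk and network charges are ignored. The function name monthlyCost is hypothetical.

```javascript
// Assumed rates, derived from the October 2017 invoice quoted earlier on this page.
// They are not official list prices and may be out of date.
const ASSUMED_USD_PER_CORE_HOUR = 0.0526;
const ASSUMED_USD_PER_GIB_HOUR = 0.0071;
const HOURS_PER_MONTH = 730; // average month

// Estimate the monthly charge for instances that never shut down.
function monthlyCost({ instances, cpu, memoryGb }) {
  const hourly =
    cpu * ASSUMED_USD_PER_CORE_HOUR + memoryGb * ASSUMED_USD_PER_GIB_HOUR;
  return instances * hourly * HOURS_PER_MONTH;
}

// The app.yaml from the question: one instance, 1 CPU, 0.5 GB.
const oneInstance = monthlyCost({ instances: 1, cpu: 1, memoryGb: 0.5 });

// The same resources with two instances running.
const twoInstances = monthlyCost({ instances: 2, cpu: 1, memoryGb: 0.5 });

console.log(oneInstance.toFixed(2), twoInstances.toFixed(2));
```

With these assumed rates a single always-on instance lands at about $41 per month, in line with the calculator estimate above, and the CPU charge accounts for nearly all of it.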

After a few searches, that seems to be the lowest possible configuration. See the related answer here:
Can you use fractional vCPUs with GAE Flexible Environment?
At least for now, there are no shared CPUs, so you'll pay for a full one even if your app uses an average of 2% of it. Maybe adding a few stars here will help change that in the near future:
https://issuetracker.google.com/issues/62011060

After reading articles on the internet I have created 1 f1-micro (1 vCPU, 0.6 GB memory) VM instance of bitnami MEAN stack which costs ~$5.5/month. I was able to host 1 Mongo DB instance and 2 Node.JS web applications in it. Both the applications have different domain names.
I have implemented a reverse proxy using Apache HTTP Server to route traffic to the appropriate Node.JS application by its domain name/hostname. I have documented the steps I followed here: https://medium.com/@prasadkothavale/host-multiple-web-applications-on-single-google-compute-engine-instance-using-apache-reverse-proxy-c8d4fbaf5fe0
Feel free to suggest if you have any other ways to implement this scenario.
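The routing idea behind that setup is a lookup from the request's Host header to the local port of the right app. The answer does this with Apache; the sketch below only illustrates the same lookup in JavaScript and is not the author's configuration. The hostnames, ports and the name resolveBackend are all hypothetical.

```javascript
// Hypothetical mapping from public hostname to the local port of each Node.js app.
const backends = new Map([
  ["app-one.example.com", 3000],
  ["app-two.example.com", 3001],
]);

// Resolve the backend port for an incoming Host header,
// ignoring letter case and any ":port" suffix. Returns null if unknown.
function resolveBackend(hostHeader) {
  if (!hostHeader) return null;
  const hostname = hostHeader.toLowerCase().split(":")[0];
  return backends.has(hostname) ? backends.get(hostname) : null;
}

console.log(resolveBackend("app-one.example.com"));
console.log(resolveBackend("APP-TWO.example.com:80"));
console.log(resolveBackend("unknown.example.com"));
```

A real proxy would then forward the request to the resolved port and answer unknown hosts with an error page; that part is left out here.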

The cheapest way to host a Node JS application is through Google Compute Engine, not Google App Engine.
This is because you can host it for 100% free on Compute Engine!
I have many Node apps that have been running for the last 2 years, and I have been charged a maximum of a few cents per month, if any at all.
As long as you are fine with a low spec machine (shared vCPU) and no scaling, look into the Compute Engine Always Free options.
https://cloud.google.com/free/docs/always-free-usage-limits#compute_name
The only downside is that you have to set up the server (installing Node, setting up firewalls etc). But it is a one time job, and easily repeatable after you have done it once.

App Engine Standard environment would be the best route for your use case. The standard environment runs directly on Google's infrastructure, scales quickly and scales down to zero when there's no traffic. The free quota might be sufficient for this use case as well.
App Engine Flexible environment runs as a container in a GCE VM (1 VM per instance/container). This makes it slower to scale than the standard environment, as scaling up requires new VMs to boot before the instance containers can be pulled and started. Flex also requires a minimum of 1 instance running all the time (whereas standard scales down to 0).
Flex is useful when your requirements of runtime/resources go beyond the limitations of standard environment.
You can understand more about the differences between the standard and flex environments at https://cloud.google.com/appengine/docs/the-appengine-environments

Use the Basic, not Flexible. It is a better fit and far cheaper for you.

GCE managed VM: how to configure the number and location of instances? (nodejs)

Okay, so I am testing a hello world application with Node.js and a GCE VM instance.
This is the tutorial I followed:
https://cloud.google.com/nodejs/getting-started/hello-world
When I created the app following the tutorial, it created 20 VM instances automatically.
My questions are:
Is it normal for GCE to create 20 instances? Will I be charged for 20 small instances since it created them automatically? (I was just thinking about testing the MEAN stack, so should it just create micro instances?)
How would I configure it to use different instances? For example, I want to automatically create micro instances in a different location, with a maximum of around 5, for testing.
Any help would be appreciated, and sorry for being a newbie.

Yikes! It's not supposed to go straight to 20 instances. The autoscaler is supposed to ramp up the number of instances based on load. I would take a look at the 'versions' list in the cloud developers console, and make sure you don't have instances sitting around. Sometimes old deployed versions stick around, leaving you with a bunch of abandoned VMs.
On configuring scaling - you want this doc:
https://cloud.google.com/appengine/docs/flexible/nodejs/configuring-your-app-with-app-yaml
You can use manual scaling, which sets a static number of instances by putting this in your app.yaml:
manual_scaling:
  instances: 5
Or you can change the range of the instance count (it's 2-20 by default):
automatic_scaling:
  min_num_instances: 5
  max_num_instances: 20
You can control the resources used in the VM like this:
resources:
  cpu: .5
  memory_gb: 1.3
  disk_size_gb: 10
Hope this all helps!
