Google Cloud Functions write quota problems with 40+ functions - node.js

We are using Google Cloud Functions quite a bit (around 40 functions currently deployed). They all live in one repository, a monorepo for our Node.js backend, and are deployed via GitHub Actions whenever a new feature or bugfix is merged. Our problem is that every merge deploys all of the functions. We dealt with concurrency so that multiple deploys run in parallel, but we hit a wall: we are hitting the write quota (which is 80 requests per 100 seconds) and we are not sure why. It seems that a single function deploy sends around 40 write requests, which is insane, and deploying the functions more slowly (2 at a time max) is not acceptable, as the deploy would then take 40+ minutes.
While searching for info about the quota I found that a single function deploy should issue 1 write request (which makes sense), but it issues many more for us and I couldn't find any way to debug this.
Example command used for deploying:
gcloud functions deploy functionName --runtime nodejs10 --memory=2048MB --timeout=540s --set-env-vars FN_NAME=functionName --trigger-http --allow-unauthenticated --project our-project --set-env-vars APP_ENV=production
Our functions structure looks like this (names have been replaced):
functions/src
├── fns
│   │  
│   ├── atlas
│   │   ├── some-function.fn.ts
│   │   ├── some-function.fn.ts
│   │   ├── newsletter
│   │   │   ├── some-function.fn.ts
│   │   │   └── some-function.fn.ts
│   │   ├── suggestion
│   │   │   ├── some-function.fn.ts
│   │   │   └── some-function.fn.ts
│   │   ├── some-function.fn.ts
│   │   └── some-function.fn.ts
│   │  
│   ├── leads
│   │   ├── some-function.fn.ts
│   │   ├── some-function.fn.ts
│   │   ├── some-function.fn.ts
│   │   ├── some-function.fn.ts
│   │   ├── some-function.fn.ts
│   │   └── some-function.fn.ts
│   │  
│   ├── utils
│   │   ├── some-function.fn.ts
│   │   ├── some-function.fn.ts
│   │   ├── some-function.fn.ts
│   │   └── some-function.fn.ts
│   └── development.fn.ts
├── utils
│   ├── some-file-used-by-multiple-functions.ts
│   └── some-file-used-by-multiple-functions.ts
└── index.ts
development.fn.ts contains code which is run only on the local machine and is ignored during deploy. It basically starts all the functions.
Every .fn.ts exports a single variable named after the function, which is simply a function handling the request. This is wrapped in our "bootstrap", which handles connecting to the database, the PubSub client and so on.
index.ts is the entry file for Google Cloud with this content:
import { fns, getFnDefinition } from './bootstrap/get-fns';

// should export util
const ENV_FUNCTION_NAME = process.env.FN_NAME;

const shouldExportFn = (fnName: string) => {
  if (!ENV_FUNCTION_NAME) {
    return true;
  }
  return ENV_FUNCTION_NAME === fnName;
};

// export cycle
for (const fn of fns) {
  if (shouldExportFn(fn.name)) {
    const fnDefinition = getFnDefinition(fn);
    exports[fn.name] = fnDefinition.handler;
  }
}

export default exports;
Where fns is an array of { name, absolutePath } for our functions. It's read from the filesystem (so no imports), and getFnDefinition requires the file and, based on the exported object, decides whether the function is triggered by an HTTP request or a PubSub message.
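To give an idea of the shape of that helper, here is a simplified sketch of what such a get-fns module could look like (illustrative only, not our exact code; it assumes compiled .fn.js files and derives the export name from the file name):

import * as fs from 'fs';
import * as path from 'path';

export interface FnFile {
  name: string;         // illustrative: derived from the file name
  absolutePath: string; // absolute path to the compiled .fn.js file
}

const FNS_DIR = path.join(__dirname, '..', 'fns');

// Recursively collect every *.fn.js file under the fns directory.
const collect = (dir: string): FnFile[] =>
  fs.readdirSync(dir, { withFileTypes: true }).flatMap((entry) => {
    const absolutePath = path.join(dir, entry.name);
    if (entry.isDirectory()) {
      return collect(absolutePath);
    }
    if (entry.name.endsWith('.fn.js')) {
      return [{ name: path.basename(entry.name, '.fn.js'), absolutePath }];
    }
    return [];
  });

export const fns = collect(FNS_DIR);

// Require the file; the real version also inspects the export to decide
// whether it is an HTTP or a PubSub trigger before wrapping it.
export const getFnDefinition = (fn: FnFile) => ({
  handler: require(fn.absolutePath)[fn.name],
});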
Also I saw the --entry-point=ENTRY_POINT option, but I'm not sure if that would solve our problem. Would it help if every function had its own entry point instead of the index.js?
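For example (if I understand the flag correctly), something like:
gcloud functions deploy functionName --entry-point=functionName --runtime nodejs10 --trigger-http --allow-unauthenticated --project our-project
where --entry-point names the exported function to execute, instead of selecting it via the FN_NAME environment variable.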

The issue is how you are deploying them. You have all 40 functions in one GitHub repo, but how are you deploying them when one function requires a change? Do you resync/redeploy the whole thing? That would explain the 40 writes, since you have 40 functions. I would recommend having them in individual repos, or making sure an individual update doesn't cause all the functions to be redeployed; a rough sketch of the latter is below.
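For instance (an illustrative sketch only, assuming each function lives in its own *.fn.ts file and the deployed function name can be derived from the file name; changes to shared utils would still need a full redeploy):

# Deploy only the functions whose source changed in the last merge commit.
changed=$(git diff --name-only HEAD~1 HEAD -- functions/src/fns | grep '\.fn\.ts$' || true)

for file in $changed; do
  fn=$(basename "$file" .fn.ts)   # assumes the file name matches the function name
  gcloud functions deploy "$fn" \
    --runtime nodejs10 --memory=2048MB --timeout=540s \
    --trigger-http --allow-unauthenticated \
    --set-env-vars FN_NAME="$fn",APP_ENV=production \
    --project our-project
done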

Running into this too, but with READS! I hit 150k reads in a few hours just deploying (about 24 functions) about two dozen times. Looks like I'll have to optimize my deployment strategy as well...

Our current solution is kind of dumb, but it works.
It turns out the limit we were hitting (the write quota) can be easily bypassed. What we do now is create a zip of the functions bundle, upload it to Cloud Storage and then pass it as a parameter during the deploy. This means we no longer hit the write quota (as no files are being uploaded) and everything works. We will, however, need to solve this in a better way in the future, as there is a limit of 60 when deploying functions and we currently have 48.
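In rough terms it looks like this (a sketch; the bucket name and the use of $GITHUB_SHA are just illustrative):

# Build once, zip the compiled bundle, upload it, then point every deploy at the same object.
zip -r functions.zip build/ package.json
gsutil cp functions.zip gs://our-deploy-bucket/functions-$GITHUB_SHA.zip

gcloud functions deploy functionName \
  --source gs://our-deploy-bucket/functions-$GITHUB_SHA.zip \
  --runtime nodejs10 --trigger-http --allow-unauthenticated \
  --set-env-vars FN_NAME=functionName,APP_ENV=production \
  --project our-project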

Related

Server Side Rendering, Client Side Rendering and REST API Separation

I'm new to server-side rendering and I'm wondering how to apply best practices, separation of concerns and naming conventions for SSR, CSR and REST API code. Currently my folder structure looks like this:
.
├── client
│   ├── src
│   └── webpack.config.js - // This is the config for CSR
├── routes
│   ├── api
│   │   ├── controllers
│   │   │   ├── users.controller.js
│   │   └── routing
│   │       └── users.route.js
│   └── dao
│       └── usersDAO.js
├── app.js
├── server.js
├── webpack.client.js
└── webpack.server.js - // This one uses the app.js as an entry point
The REST API is implemented in /server.js and the SSR routes are in /app.js. In /server.js I have a conditional that uses an environment variable and webpack's global constants to determine whether I'm doing SSR or CSR; for SSR, server.js does not listen or connect to the database and instead just initializes the routes, while the listen call and the database connection are invoked in /app.js rather than server.js. Is this a good practice, or is there a better approach for separating these routes? Should I also make a folder and separate out each SSR route? Also, how exactly would the folder structure for separating controllers, routes and the DAO look?
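To make the conditional concrete, here is a simplified sketch of the idea (not my exact code; IS_SSR and connectToDatabase are placeholders):

// server.js (simplified sketch)
const express = require('express');
const usersRoute = require('./routes/api/routing/users.route');

// Placeholder for the real database connection logic.
const connectToDatabase = async () => {};

const app = express();
app.use('/api/users', usersRoute);

// IS_SSR stands in for the webpack global constant / environment variable.
if (!process.env.IS_SSR) {
  // CSR / plain API mode: server.js owns the DB connection and the listener.
  connectToDatabase().then(() => app.listen(process.env.PORT || 3000));
}

// SSR mode: app.js imports this, connects to the database and calls listen() itself.
module.exports = app;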

Terragrunt Best practice: dependencies between modules

I have a Terragrunt project like this:
├── common_vars.hcl
├── envs
│   ├── dev
│   │   ├── env_vars.hcl
│   │   ├── rds-aurora
│   │   │   └── terragrunt.hcl
│   │   ├── rds-sg
│   │   │   └── terragrunt.hcl
│   │   └── vpc
│   │       └── terragrunt.hcl
│   └── prod
│       ├── env_vars.hcl
│       ├── rds-sg
│       │   └── terragrunt.hcl
│       └── vpc
│           └── terragrunt.hcl
├── modules
│   ├── aws-data
│   │   ├── main.tf
│   │   └── outputs.tf
│   ├── rds-aurora
│   │   └── main.tf
│   ├── rds-sg
│   │   └── main.tf
│   └── vpc
│       └── main.tf
└── terragrunt.hcl
rds-sg is the security group, and it depends on the vpc module.
The terragrunt.hcl files under dev and prod contain the same code:
terraform {
  source = format("%s/modules//%s", get_parent_terragrunt_dir(), path_relative_to_include())
}

include {
  path = find_in_parent_folders()
}

dependencies {
  paths = ["../vpc"] # not dry
}

dependency "vpc" {
  config_path = "../vpc" # not dry
}

inputs = {
  vpc_id = dependency.vpc.outputs.vpc_id # if something changes or we need more inputs
}
As described in the comments, some of this code is not DRY. If I want to change something, like switching to another VPC or adding more inputs, I need to modify this file in every environment.
So I want something like this in main.tf under modules/rds-sg:
module "rds-sg" {
source = "terraform-aws-modules/security-group/aws//modules/mysql"
name = "${var.name_prefix}-db-sg"
description = "Security group for mysql 3306 port open within VPC"
vpc_id = ""
# I want something like
# vpc_id = dependency.vpc.outputs.vpc_id
}
Is that possible? Or are there better practices to solve this problem?
Thanks very much.
Maybe using terraform_remote_state can fix this problem. Any better ideas?
This comment may explain the problem better:
https://github.com/gruntwork-io/terragrunt/issues/759#issuecomment-687610130
I would use data sources to read the IDs of existing resources.
Add this to the module that uses the VPC ID:
data "aws_vpc" "this" {
filter {
name = "tag:Name"
values = [var.name]
}
}
...
vpc_id = data.aws_vpc.this.id
This way you make sure the ID is read from the AWS API rather than from a state file, and it is also validated at plan time.
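Wired into the rds-sg module it could look roughly like this (a sketch; var.name and var.name_prefix would be passed in via inputs in terragrunt.hcl):

variable "name" {
  description = "Name tag of the VPC to look up"
  type        = string
}

variable "name_prefix" {
  type = string
}

data "aws_vpc" "this" {
  filter {
    name   = "tag:Name"
    values = [var.name]
  }
}

module "rds-sg" {
  source      = "terraform-aws-modules/security-group/aws//modules/mysql"
  name        = "${var.name_prefix}-db-sg"
  description = "Security group for mysql 3306 port open within VPC"
  vpc_id      = data.aws_vpc.this.id
}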

Azure WebApps React: the environment variables exist but are not present when I call process.env

Prerequisite
This is my first time using React/Node.js/Azure App Service. I usually deploy apps using Flask/Jinja2/Gunicorn.
The use case
I would like to use the environment variables stored in the Configuration of my App Service on Azure.
Unfortunately, process.env only shows 3 environment variables (NODE_ENV, PUBLIC_URL and FAST_REFRESH) instead of the several dozen I configured.
Partial content of the Azure App Service app settings:
[
  {
    "name": "REACT_APP_APIKEY",
    "value": "some key",
    "slotSetting": false
  },
  {
    "name": "REACT_APP_APPID",
    "value": "an app id",
    "slotSetting": false
  },
  {
    "name": "REACT_APP_AUTHDOMAIN",
    "value": "an auth domain",
    "slotSetting": false
  },
  {
    "name": "APPINSIGHTS_INSTRUMENTATIONKEY",
    "value": "something",
    "slotSetting": false
  },
  {
    "name": "APPLICATIONINSIGHTS_CONNECTION_STRING",
    "value": "something else",
    "slotSetting": false
  },
  {
    "name": "ApplicationInsightsAgent_EXTENSION_VERSION",
    "value": "some alphanumeric value",
    "slotSetting": false
  },
  {
    "name": "KUDU_EXTENSION_VERSION",
    "value": "78.11002.3584",
    "slotSetting": false
  }
]
The CI/CD process
I am using Azure DevOps to build and deploy the app on Azure.
The process runs npm install and npm run build before generating the zip file containing the build (see the directory tree below).
How do I run the app?
The startup command is npx serve -l 8080
The Issue
I display the environment variables with
console.log('process.env', process.env);
The content of the process.env is
{
  "NODE_ENV": "production",
  "PUBLIC_URL": "",
  "FAST_REFRESH": true
}
The weird part
I use SSH on Azure and I run:
printenv | grep APPINS and the result is:
APPSETTING_APPINSIGHTS_INSTRUMENTATIONKEY=something
APPINSIGHTS_INSTRUMENTATIONKEY=something
printenv | grep APPLICATION and the result is:
APPSETTING_APPLICATIONINSIGHTS_CONNECTION_STRING=something else
APPLICATIONINSIGHTS_CONNECTION_STRING=something else
Misc
Directory Tree list
.
├── asset-manifest.json
├── favicon.ico
├── images
│   ├── app
│   │   └── home_page-ott-overthetop-platform.png
│   ├── films
│   │   ├── children
│   │   │   ├── despicable-me
│   │   │   │   ├── large.jpg
│   │   │   │   └── small.jpg
│   ├── icons
│   │   ├── add.png
│   ├── misc
│   │   ├── home-bg.jpg
│   ├── series
│   │   ├── children
│   │   │   ├── arthur
│   │   │   │   ├── large.jpg
│   │   │   │   └── small.jpg
│   └── users
│       ├── 1.png
├── index.html
├── static
│   ├── css
│   │   ├── 2.679831fc.chunk.css
│   │   └── 2.679831fc.chunk.css.map
│   ├── js
│   │   ├── 2.60c35184.chunk.js
│   │   ├── 2.60c35184.chunk.js.LICENSE.txt
│   │   ├── 2.60c35184.chunk.js.map
│   │   ├── main.80f5c16d.chunk.js
│   │   ├── main.80f5c16d.chunk.js.map
│   │   ├── runtime-main.917a28e7.js
│   │   └── runtime-main.917a28e7.js.map
│   └── media
│       └── logo.623fc416.svg
└── videos
    └── bunny.mp4
74 directories, 148 files
When you run your application locally, you can use a .env file to configure your environment variables, in the format name=value (without quotes).
Here is a sample:
REACT_APP_APIKEY=REACT_APP_APIKEY
REACT_APP_APPID=REACT_APP_APPID
REACT_APP_AUTHDOMAIN=REACT_APP_AUTHDOMAIN
When I call console.log('process.env', process.env); in the index.js file, it works well.
After configuring the app settings in the portal, deploy the Node.js web app to Azure.
You can view log output (calls to console.log) from the app directly in the VS Code output window: just right-click the app node and choose Start Streaming Logs in the AZURE APP SERVICE explorer.
It shows the environment variables configured in the portal's Application settings.
By the way:
If you are new to using Node with Azure Web Apps, you could have a look at this: https://learn.microsoft.com/en-us/azure/app-service/quickstart-nodejs?pivots=platform-windows
About how to use a .env file, see this: https://holycoders.com/node-js-environment-variable/
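For reference, in the React code the values are read like this (an illustrative config.js; Create React App only exposes variables prefixed with REACT_APP_, and they are inlined into the bundle when npm run build runs):

// src/config.js (illustrative)
// The values come from a local .env file or from the environment of the machine
// that runs `npm run build`; they are baked into the bundle at build time.
export const config = {
  apiKey: process.env.REACT_APP_APIKEY,
  appId: process.env.REACT_APP_APPID,
  authDomain: process.env.REACT_APP_AUTHDOMAIN,
};

console.log('process.env', process.env); // only REACT_APP_* (plus NODE_ENV and PUBLIC_URL) show up here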

Ignore hidden files while recursively scanning directories

How do I ignore hidden files while recursively traversing directories?
My file structure is of the following type:
7_jan
├── 7_jan_25_cropped
│   ├── 1.tiff
|
│  
│  
├── 7_jan_50_cropped
│   ├── 1.tiff
│   ├── 10.tiff
│   ├── 11.tiff
│   ├── 12.tiff
│   ├── 13.tiff
│   ├── 14.tiff
│
└── 7_jan_75_cropped
    ├── 1.tiff
    ├── 10.tiff
    ├── 11.tiff
    ├── 12.tiff
I am recursively storing each file path so that I can operate on the files later, but the .DS_Store file is also being stored, which I don't want. How do I exclude it?
folders = []
files = []
rec_folders = []
for entry in os.scandir('/Users/swastik/csre/dataset'):
    if entry.is_dir():
        folders.append(entry.path)
        for recentry in os.scandir(entry.path):
            if not recentry.path.startswith('.'):
                rec_folders.append(recentry.path)
    elif entry.is_file():
        files.append(entry.path)
print('Folders:')
print(folders)
print('Further files:')
print(rec_folders)
Output:
Folders:
['/Users/swastik/csre/dataset/7_jan']
Further folders:
['/Users/swastik/csre/dataset/7_jan/7_jan_75_cropped',
'/Users/swastik/csre/dataset/7_jan/.DS_Store',
'/Users/swastik/csre/dataset/7_jan/7_jan_50_cropped',
'/Users/swastik/csre/dataset/7_jan/7_jan_25_cropped']
Here, it is also storing the .DS_Store file, which I don't want.
You can just replace if not recentry.path.startswith('.'): with if not recentry.name.startswith('.'): so that it ignores your .DS_Store file (the full path starts with /Users/..., never with a dot, but the file name does).
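Applied to the snippet from the question, and also skipping hidden entries at the top level, it becomes (a sketch using the same paths):

import os

folders = []
files = []
rec_folders = []

for entry in os.scandir('/Users/swastik/csre/dataset'):
    if entry.name.startswith('.'):
        continue  # skip hidden entries such as .DS_Store at the top level
    if entry.is_dir():
        folders.append(entry.path)
        for recentry in os.scandir(entry.path):
            # compare the file name, not the full path, against the dot prefix
            if not recentry.name.startswith('.'):
                rec_folders.append(recentry.path)
    elif entry.is_file():
        files.append(entry.path)

print('Folders:')
print(folders)
print('Further files:')
print(rec_folders)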

Setting up environment directories - getting "Could not find default node or by name with XXX" on an environment other than "production"

I'm currently trying to configure directory environments to manage different clients.
Puppet master version: puppet-server-3.8.1-1 (CentOS 6)
Here is my tree from the puppet master's /etc/puppet:
├── organisation
│   ├── environment.conf
│   ├── manifests
│   │   ├── accounts.pp
│   │   ├── lab_accounts.pp
│   │   ├── lab_nodes.pp
│   │   └── nodes.pp
│   └── modules
│       ├── account
│       │   ├── files
│       │   ├── lib
│       │   ├── spec
│       │   │   └── classes
│       │   └── templates
│       └── dns
│           ├── manifests
│           │   └── init.pp
│           └── templates
│               ├── resolv.conf.erb
│               └── resolv.conf.fqdn.erb
├── production
│   ├── environment.conf
│   ├── manifests
│   │   ├── accounts.pp
│   │   ├── lab_accounts.pp
│   │   └── lab_nodes.pp
│   └── modules
│       ├── account
│       │   ├── CHANGELOG
│       │   ├── files
│       │   ├── lib
│       │   ├── LICENSE
│       │   ├── manifests
│       │   │   ├── init.pp
│       │   │   └── site.pp
│       │   ├── metadata.json
│       │   ├── Modulefile
│       │   ├── Rakefile
│       │   ├── README.mkd
│       │   ├── spec
│       │   │   ├── classes
│       │   │   ├── defines
│       │   │   │   └── account_spec.rb
│       │   │   └── spec_helper.rb
│       │   └── templates
│       ├── dns
│       │   ├── manifests
│       │   │   └── init.pp
│       │   └── templates
│       │       ├── resolv.conf.erb
│       │       └── resolv.conf.fqdn.erb
│       └── sshkeys
│           └── manifests
│               └── init.pp
└── README.md
Now the configuration files:
/etc/puppet/puppet.conf
[main]
logdir = /var/log/puppet
rundir = /var/run/puppet
ssldir = $vardir/ssl
dns_alt_names = centos66a.local.lab,centos66a,puppet,puppetmaster
[master]
environmentpath = $confdir/environments
basemodulepath = $confdir/modules:/opt/puppet/share/puppet/modules
[agent]
classfile = $vardir/classes.txt
localconfig = $vardir/localconfig
server = puppet
Here is the environment I called "organisation":
/etc/puppet/environments/organisation/environment.conf
modulepath = /etc/puppet/environments/organisation/modules
environment_timeout = 5s
Now I declare my nodes in "nodes.pp":
/etc/puppet/environments/organisation/manifests/nodes.pp
node 'centos66a.local.lab' {
  include dns
}

node 'gcacnt02.local.lab' {
  include dns
}
Here is the output when I try to sync my node with the master:
gcacnt02:~ # hostname
gcacnt02.local.lab
gcacnt02:~ # puppet agent -t
Info: Creating a new SSL key for gcacnt02.local.lab
Info: csr_attributes file loading from /etc/puppet/csr_attributes.yaml
Info: Creating a new SSL certificate request for gcacnt02.local.lab
Info: Certificate Request fingerprint (SHA256): 49:73:11:78:99:6F:50:BD:6B:2F:5D:B9:92:7C:6F:A9:63:52:92:53:DB:B8:A1:AE:86:21:AF:36:BE:B0:94:DB
Info: Caching certificate for gcacnt02.local.lab
Info: Caching certificate for gcacnt02.local.lab
Info: Retrieving pluginfacts
Info: Retrieving plugin
Error: Could not retrieve catalog from remote server: Error 400 on SERVER: Could not find default node or by name with 'gcacnt02.local.lab, gcacnt02.local, gcacnt02' on node gcacnt02.local.lab
Warning: Not using cache on failed catalog
Error: Could not retrieve catalog; skipping run
If I move /etc/puppet/environments/organisation/manifests/nodes.pp to /etc/puppet/environments/production/manifests/nodes.pp, it works just fine.
When I print "manifest" for "organisation" and "production" I get correct output as well:
[root@centos66a environments]# puppet config print manifest --section master --environment production
/etc/puppet/environments/production/manifests
[root@centos66a environments]# puppet config print manifest --section master --environment organisation
/etc/puppet/environments/organisation/manifests
I'm probably missing something here but can't put my finger on it...
Thank you
Problem resolved.
The configuration on the master is OK.
Since Puppet scans for directories in the environment path set by the "environmentpath" variable, I thought the master would automatically reply to nodes set up in each environment. This is false.
The default environment is: production.
If you use any other environment, you have to configure each Puppet agent node to query that specific environment.
In my case, my node is gcacnt02.local.lab. So to fix the issue I had to add the following variable to /etc/puppet/puppet.conf on the agent:
[main]
logdir = /var/log/puppet
rundir = /var/run/puppet
ssldir = $vardir/ssl
[agent]
classfile = $vardir/classes.txt
localconfig = $vardir/localconfig
environment = lan
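Alternatively (if I read the docs correctly), the environment can also be passed for a single run, which is handy for testing before touching puppet.conf (substitute whichever environment the node should use):
puppet agent -t --environment organisation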
