We are running a larger backend application in Node.js/TypeScript on Firebase, with about 180 Cloud Functions and Firestore as the database. Firebase has been good for our needs so far, but we have reached a level of usage where even small amounts of downtime can cause a lot of damage.

Because of the number of functions, a full deploy can take up to 30 minutes, so we usually do partial deploys of only the changed functions, which still take about 10 minutes. I am trying to find a way to quickly roll back to the previous version of a given function if a bug is discovered after a production deploy. Firebase does not seem to provide rollback functionality, so the only option is to re-deploy the code of the previous version. One issue is the deploy time (up to 10 minutes for a single function); the other is git versioning when doing partial deploys. Normally there would be a branch reflecting exactly what is in prod that could be used, but with partial deploys this is no longer the case. The only way to keep a one-to-one branch with prod is to do a full deploy every time, which takes a prohibitive amount of time (30+ minutes, not including retries).

Firebase deploys also regularly fail or exceed the deployment quota, which makes things like CI pipelines very difficult: the pipeline would have to automatically retry failed functions, and the time is still an issue, since 30+ minutes to deploy is not acceptable when production is down. Has anyone found a good solution for rollback (versioning) and a git structure that works well with Firebase at scale?
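To make the CI retry problem concrete, this is roughly the kind of wrapper a pipeline would need around the CLI; the project ID, function names and retry counts below are placeholders, and it's only a sketch, not something we run today:

// Sketch only: a CI step that retries partial deploys per function, since
// `firebase deploy` regularly fails or hits deployment quota.
// Project ID, function names and retry counts are placeholders.
import { spawnSync } from "child_process";

const PROJECT = "my-prod-project";
const FUNCTIONS = ["createOrder", "sendEmail"];
const MAX_ATTEMPTS = 3;

function deployFunction(name: string): boolean {
  const result = spawnSync(
    "firebase",
    ["deploy", "--only", `functions:${name}`, "--project", PROJECT],
    { stdio: "inherit" }
  );
  return result.status === 0;
}

async function main(): Promise<void> {
  let failed = false;
  for (const fn of FUNCTIONS) {
    let ok = false;
    for (let attempt = 1; attempt <= MAX_ATTEMPTS && !ok; attempt++) {
      console.log(`Deploying ${fn} (attempt ${attempt}/${MAX_ATTEMPTS})`);
      ok = deployFunction(fn);
      if (!ok) {
        // crude backoff before retrying, to get past transient quota errors
        await new Promise((r) => setTimeout(r, attempt * 30_000));
      }
    }
    if (!ok) {
      console.error(`Giving up on ${fn} after ${MAX_ATTEMPTS} attempts`);
      failed = true;
    }
  }
  if (failed) process.exit(1);
}

main();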
Cloud Functions for Firebase is built on top of Cloud Functions, and the two behave the same. Today it's not possible to route traffic to a previous version (and therefore to perform a rollback). (I can also tell you that Node.js 16 is now GA, even though it is still listed as Beta in the Cloud Functions for Firebase documentation.)
The next Cloud Functions runtime is in the works (and available in preview). That runtime is based on Cloud Run under the hood, which allows traffic splitting/routing and therefore supports rollback.
So, for now, there is no way to perform a simple rollback with Firebase Functions. A good change would be to use the Cloud Functions V2 runtime directly, or even Cloud Run, but that is a big change in your code base.
Another solution could be to use a load balancer in front of all your functions and to:
Deploy the new function under a new name (don't update the current deployment; create a new service each time you deploy a new version)
Create a new serverless backend with the new functions
Update the URL map to take into account the new backend.
After a while, delete the old function versions.
That also requires a lot of work to put in place. And the propagation delay when you update your URL map is typically between 3 and 5 minutes, so it's not such a great advantage compared to your current solution.
It looks like you're not the only one; similar questions have been answered before. I recommend setting up proper version control. I would solve the failing-deploy issues first, which should reduce deploy and redeploy times, especially when multiple functions are involved. You could also use a dedicated deploy branch or set up a staging environment. I would invest the time in getting the git setup turnkey.
Per user Ariel:
Each time you make a deploy to a cloud function you get an output line like this:
sourceArchiveUrl: gs://my-store-bucket/us-central1-function_name-xxoxtdxvxaxx.zip
I entered my Google Cloud Platform Developer Console -> Cloud Functions -> function_name -> Source tab
and there almost at the bottom it says: Source location
my-store-bucket/us-central1-function_name-xxoxtdxvxaxx.zip
the same as shown in the CLI, but without the gs:// prefix. That link led me to the following: https://storage.cloud.google.com/my-store-bucket/us-central1-function_name-........
I removed from the link everything that came after
https://storage.cloud.google.com/my-store-bucket
and that led me to a huge list of files, each of which represented an image of all my cloud functions at the point in time of each deploy I had made: exactly what I needed!
The only thing left to do was to locate the file with the last date before my mistaken deploy
source: Retrieving an old version of a Google Cloud function source
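To make that concrete, here is a small sketch using @google-cloud/storage to list those archives and pull down an older one. The bucket name and prefix are placeholders taken from the sourceArchiveUrl pattern above, and the "second-to-last archive" pick is just an example:

// Sketch: list the source archives Cloud Functions uploaded on past deploys
// and download one of them. Bucket name and prefix are placeholders.
import { Storage } from "@google-cloud/storage";

const storage = new Storage();
const BUCKET = "my-store-bucket";             // placeholder
const PREFIX = "us-central1-function_name-";  // placeholder

async function main(): Promise<void> {
  const [files] = await storage.bucket(BUCKET).getFiles({ prefix: PREFIX });

  // Sort by creation time so the most recent archives come last.
  files.sort(
    (a, b) =>
      new Date(a.metadata.timeCreated as string).getTime() -
      new Date(b.metadata.timeCreated as string).getTime()
  );

  for (const file of files) {
    console.log(`${file.metadata.timeCreated}  ${file.name}`);
  }

  // Download the second-to-last archive (the version before the bad deploy).
  const previous = files[files.length - 2];
  if (previous) {
    await previous.download({ destination: "previous-version.zip" });
    console.log(`Downloaded ${previous.name} to previous-version.zip`);
  }
}

main().catch(console.error);

You can then unzip that archive and re-deploy the old code from it.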
As of 2019: Rolling back to an older version of a firebase function (google cloud function)
As of 2021: Roll back Firebase hosting and functions deploy jointly?
You can roll back a Firebase Hosting deployment, but not the functions, without resorting to git version control or similar. Using partial deploys you can deploy multiple functions or groups of functions (see the grouping sketch after the links below). You can also check out Remote Config templates to roll them back; they are kept for up to 90 days.
https://firebase.google.com/docs/remote-config/templates
Firebase partial deploy multiple grouped functions
https://firebase.google.com/docs/cli#roll_back_deploys
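For the grouped partial deploys mentioned above, the usual pattern is to group exports in index.ts and then deploy a single group, e.g. firebase deploy --only functions:orders. The group and function names below are made up; this is only a sketch:

// index.ts — sketch of grouping functions so they can be deployed per group.
// Group and function names are placeholders.
import * as functions from "firebase-functions";

// Group "orders": deploy with `firebase deploy --only functions:orders`
export const orders = {
  create: functions.https.onRequest((req, res) => {
    res.send("order created");
  }),
  cancel: functions.https.onRequest((req, res) => {
    res.send("order cancelled");
  }),
};

// Group "emails": deploy with `firebase deploy --only functions:emails`
export const emails = {
  sendReceipt: functions.firestore
    .document("orders/{orderId}")
    .onCreate(async (snap) => {
      console.log("would send receipt for", snap.id);
    }),
};

Deployed function names become orders-create, orders-cancel, and so on, and each group can be deployed (or re-deployed for a rollback) on its own.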
Related
I have a webserver hosted on cloud run that loads a tensorflow model from cloud file store on start. To know which model to load, it looks up the latest reference in a psql db.
Occasionally a retrain script runs using google cloud functions. This stores a new model in cloud file store and a new reference in the psql db.
Currently, in order to use this new model I would need to redeploy the cloud run instance so it grabs the new model on start. How can I automate using the newest model instead? Of course something elegant, robust, and scalable is ideal, but if something hacky/clunky but functional is much easier that would be preferred. This is a throw-away prototype but it needs to be available and usable.
I have considered a few options but I'm not sure how possible either of them are:
Create some sort of Postgres trigger/notification that the Cloud Run server listens to. I guess this would require another thread, which adds complexity, and I'm unsure how multiple threads work with Cloud Run.
Similar, but using HTTP pub/sub: make an endpoint on the server that re-looks up and loads the latest model, and publish to it when the retrainer finishes.
I could deploy a new instance and remove the old one after the retrainer runs. Simple in some regards, but it seems riskier and might be hard to accomplish programmatically.
Your current pattern is essentially a cache-management problem (you cache a model in memory). So: how can you invalidate that cache?
Restart the instances? Cloud Run doesn't let you control the instances. The easiest way is to deploy a new revision to force the current instances to stop and new ones to start.
Set a TTL? It's an option: load a model for XX hours, then reload it from the source. Problem: you can have glitches (some instances serving the new model and some the old one) until the cache TTL expires on all instances.
Offer a cache-invalidation mechanism? As said before, it's hard because Cloud Run doesn't let you talk to all the instances directly, so a push mechanism is tricky to implement (not impossible, but I don't recommend wasting time on it). A pull mechanism is an option: check a "latest updated date" somewhere (a record in Firestore, a file in Cloud Storage, an entry in Cloud SQL, ...) and compare it with your model's updated date. If they match, great; if not, reload the latest model (sketched below).
You have several options; which one fits depends on what you want.
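To illustrate the pull mechanism, here is a minimal sketch (written in TypeScript purely for illustration; your server may well be in another language). The bucket and pointer-object names and the loadModel() helper are assumptions, not part of your setup:

// Minimal sketch of the pull-based check: before serving a prediction,
// compare a "model pointer" object's contents in Cloud Storage with the model
// currently in memory, and reload it if it changed.
import { Storage } from "@google-cloud/storage";

const storage = new Storage();
const BUCKET = "my-models-bucket";    // placeholder
const POINTER = "latest-model.txt";   // placeholder: contains the path of the latest model

let loadedModelPath: string | null = null;
let lastCheck = 0;
const CHECK_INTERVAL_MS = 60_000;     // don't hit Cloud Storage on every request

async function loadModel(path: string): Promise<void> {
  // placeholder: download and deserialize the model here
  console.log(`(re)loading model from ${path}`);
  loadedModelPath = path;
}

export async function ensureLatestModel(): Promise<void> {
  const now = Date.now();
  if (now - lastCheck < CHECK_INTERVAL_MS && loadedModelPath) return;
  lastCheck = now;

  const [contents] = await storage.bucket(BUCKET).file(POINTER).download();
  const latestPath = contents.toString().trim();

  if (latestPath !== loadedModelPath) {
    await loadModel(latestPath);
  }
}

The same idea works with your existing Postgres table as the pointer instead of a Cloud Storage object.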
But there is another solution, which is my preference: every time you have a new model, build a new container with the new model already baked into it (with Cloud Build) and deploy that new container to Cloud Run.
That solution solves your cache-management issue, and you get better cold-start latency for all your new instances (in addition to easier rollback, A/B testing or canary release capability, version management and control, portability, local/other-environment testing, ...).
I'm fairly new to Function Apps. We have about a dozen small programs currently running as Windows scheduled tasks on an Azure VM, and we are looking to migrate these to PaaS. Most of these are small console-type background processes that might make an API call, perform a calculation and store the result in a db, or read some data from a db and then send out an email. We have a mixture of pwsh, Python, and .NET.
I was wondering how many repos I should have. I assume I would need at least 3 (one for each runtime stack). I also didn't want to create a separate repo per app and end up with 50 git repos eventually. Would it be best just to make root-level folders named after the apps within a repo to keep the structure separate?
Lastly, should each of these apps be hosted in its own function app (Azure resource), or can I host several of the apps in a single function app? I'm not sure whether splitting everything up into separate function apps would make deployment easier or not. Ease of long-term support/maintenance is the most important aspect to me.
Short version: What is considered the best practice for logically grouping and creating the relationship/mapping between your apps, number of git repos, and azure function app resources?
I'm trying to write a script that will create a new google cloud project, then provision firebase resources to it, all using the Node SDK.
The first step is calling google.cloudresourcemanager("v1").projects.create, and I use an organization-level service account for that with the proper permissions which returns the proper Operation object on success.
The issue is that after this call there is often a delay of up to several hours before a call to google.cloudresourcemanager("v1").projects.get or google.firebase({version: "v1beta1"}).projects.addFirebase works, and if you check in the console the project isn't there. The issue is not with permissions (authentication/authorization): when I manually verify that a project exists and then call those two functions, they work as expected.
Has anybody else experienced something like this?
thanks!
The official API documentation mentions this:
Request that a new Project be created. The result is an Operation which can be used to track the creation process. This process usually takes a few seconds, but can sometimes take much longer. The tracking Operation is automatically deleted after a few hours, so there is no need to call operations.delete.
This means that the method works asynchronously: the response can be used to track the creation of the project, but the project hasn't been fully created when the API answers.
The documentation also mentions that it can take a long time.
Since this is how the API works, all the SDKs, including the Node.js one, share this behavior.
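As a sketch of how that tracking could look with the same Node SDK: poll operations.get on the Operation returned by projects.create until it is done, and only then call addFirebase. The project ID, polling interval and error handling below are illustrative, not a definitive recipe:

// Sketch: create the project, then poll the returned Operation until it is
// done before calling addFirebase. Auth setup is omitted; project ID and
// polling interval are placeholders.
import { google } from "googleapis";

const crm = google.cloudresourcemanager("v1");
const firebase = google.firebase({ version: "v1beta1" });

async function createProjectAndWait(projectId: string): Promise<void> {
  const createRes = await crm.projects.create({
    requestBody: { projectId, name: projectId },
  });

  // projects.create returns a long-running Operation, e.g. "operations/cp.123..."
  const operationName = createRes.data.name!;

  // Poll the Operation until done (the docs say this can sometimes take much longer).
  for (;;) {
    const op = await crm.operations.get({ name: operationName });
    if (op.data.done) {
      if (op.data.error) {
        throw new Error(`Project creation failed: ${op.data.error.message}`);
      }
      break;
    }
    await new Promise((r) => setTimeout(r, 10_000)); // wait 10s between polls
  }

  // Only now should the project be visible to other APIs.
  await firebase.projects.addFirebase({
    project: `projects/${projectId}`,
    requestBody: {},
  });
}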
I'm working on a project that has many small tasks. Some of these tasks are related and require overlapping apis.
task_1/
main.py
task_2/
main.py
apis/
api_1/
api_2/
api_3/
test/
test_api_1.py
test_api_2.py
test_task_1.py
test_task_2.py
test_task_3.py
For example, task_1 needs api_1 and api_3, while task_2 needs api_1 and api_2. At first I tried using Google Cloud Functions to execute these tasks, but I ran into the issue that GCF needs local dependencies installed in the same folder as the task. This would mean duplicating the code from api_1 into task_1. Further, local testing would become more complicated because of the way GCF does imports (as opposed to .mylocalpackage.myscript):
You can then use code from the local dependency, mylocalpackage:
from mylocalpackage.myscript import foo
Is there a way to structure my codebase to enable easier deployment of GCF? Due to my requirements, I cannot deploy each API as its own GCF. Will Google Cloud Run remedy my issues?
Thanks!
To use Cloud Functions for this, you will need to arrange your code in such a way that all the code a function depends on is present within that function's directory at the time of deployment. This might be done as a custom build/packaging step to move files around.
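One possible shape for such a packaging step (the task-to-api mapping below is an assumption based on the example layout in the question, and the script is just a sketch):

// Sketch of a packaging step: before deploying each task as its own function,
// copy the shared api packages it depends on into the task's directory so the
// deployed source is self-contained.
import * as fs from "fs";
import * as path from "path";

const DEPENDENCIES: Record<string, string[]> = {
  task_1: ["api_1", "api_3"],
  task_2: ["api_1", "api_2"],
};

for (const [task, apis] of Object.entries(DEPENDENCIES)) {
  for (const api of apis) {
    const src = path.join("apis", api);
    const dest = path.join(task, api);
    // Node 16.7+ provides fs.cpSync for recursive copies
    fs.cpSync(src, dest, { recursive: true });
    console.log(`copied ${src} -> ${dest}`);
  }
  // Each task directory can now be deployed on its own, e.g.
  //   gcloud functions deploy task_1 --source=task_1 ...
}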
To use Cloud Run for this, you need to create a minimal HTTP webserver to route requests to each of your "functions". This might be best done by creating a path for each function you want to support. At that point, you've recreated a traditional web service with multiple resources.
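A minimal sketch of that kind of routing service, written here in TypeScript with Express purely for illustration (the paths and task handlers are made up; the same shape works in any language Cloud Run supports):

// Sketch: one Cloud Run service exposing each task as a path, so all tasks can
// share the apis/ code in a single container. Handler names are placeholders.
import express from "express";

// In a real setup these would import from the shared apis/ packages.
async function runTask1(): Promise<string> {
  return "task_1 done";
}
async function runTask2(): Promise<string> {
  return "task_2 done";
}

const app = express();

app.post("/tasks/task_1", async (_req, res) => {
  res.send(await runTask1());
});

app.post("/tasks/task_2", async (_req, res) => {
  res.send(await runTask2());
});

// Cloud Run provides the port via the PORT environment variable.
const port = Number(process.env.PORT) || 8080;
app.listen(port, () => console.log(`listening on ${port}`));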
If these tasks were meant as Background Functions, you can wire up Pub/Sub Push integration.
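If you go that route with Cloud Run, the push subscription simply POSTs an envelope to an endpoint on your service. A rough sketch of such a handler (the /pubsub path and the task attribute are assumptions):

// Sketch: handling a Pub/Sub push delivery on the same Cloud Run service.
// The push subscription would be configured to POST to /pubsub.
import express from "express";

const app = express();
app.use(express.json());

app.post("/pubsub", (req, res) => {
  const envelope = req.body;
  if (!envelope || !envelope.message) {
    res.status(400).send("Bad Pub/Sub push request");
    return;
  }

  // Pub/Sub push delivers the payload base64-encoded in message.data.
  const data = envelope.message.data
    ? Buffer.from(envelope.message.data, "base64").toString()
    : "";
  const task = envelope.message.attributes?.task ?? "unknown";

  console.log(`received task=${task} payload=${data}`);
  // ...dispatch to the right task handler here...

  res.status(204).send(); // acknowledge the message
});

app.listen(Number(process.env.PORT) || 8080);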
I am currently changing our database deployment strategy to use FluentMigrator and have been reading up on how to run it. Some people have suggested that it can be run from Application_Start. I like this idea, but other people are saying no without specifying reasons, so my questions are:
Is it a bad idea to run the database migration on application start and if so why?
We are planning on moving our sites to Azure cloud services, and if we don't run the migration from Application_Start, how and when should we run it, considering we want to keep the deployment as simple as possible?
Wherever it is run, how do we ensure it runs only once, given that we will have a website and multiple worker roles? (We could ensure the migration code is only called from the website, but in the future we may scale to 2 or more instances; would that mean it could run more than once?)
I would appreciate any insight on how others handle the migration of the database during deployment, particularly from the perspective of deployments to azure cloud services.
EDIT:
Looking at the comment below, I can see the potential problems of running during Application_Start. Perhaps the issue is that I am trying to solve the problem with the wrong tool. FluentMigrator may not be the way to go in our case: we have a large number of stored procedures, views, etc., so as part of the migration I would have to use SQL scripts to keep them at the right version, and I don't think migrating down would be possible.
What I liked about the idea of running during Application_Start was that I could build a single deployment package for Azure, upload it to staging, have the database migration run automatically, and then just swap into production, rather than running manual scripts.
Running migrations during Application_Start can be a viable approach. Especially during development.
However there are some potential problems:
Application_Start will take longer and FluentMigrator will be run every time the App Pool is recycled. Depending on your IIS configuration this could be several times a day.
If you do this in production, users might be affected; e.g. trying to access a table while it is being changed will result in an error.
DBA's don't usually approve.
What happens if the migrations fail on startup? Is your site down then?
My opinion ->
For a site with a decent amount of traffic I would prefer to have a build script and more control over when I change the database schema. For a hobby (or small non-critical project) this approach would be fine.
An alternative approach that I've used in the past is to make your migrations non-breaking: write them in such a way that they can be deployed before any code changes and still work with the existing code. This way code and migrations can be deployed independently 95% of the time. For example, instead of changing an existing stored procedure you create a new one, or if you want to rename a table column you add a new one.
The benefits of this are:
Your database changes can be applied before any code changes. You're then free to roll back any breaking code changes or breaking migrations.
Breaking migrations won't take the existing site down.
DBAs can run the migrations independently.