I am using Firebase in my Node.js app. I have a JS API which takes imprints of website users (kind of like Google Analytics); it updates data in Firebase, and event listeners bound in Node.js compute the analytics.
Problem:
Currently, the app is hosted on GAE and I have to manage a minimum of 300k concurrent connections.
I am facing an issue with scaling the app: because Firebase works over sockets, every running instance receives the same event, so the same request is processed once per instance.
Sol 1 - Queues: I tried using firebase-queues, but it could not take the heavy load. It logged Firebase disconnects for all the queues, because so many tasks got queued that they timed out.
Sol 2 - Cloud Functions: I suppose Cloud Functions uses GAE in the background, which would incur the same duplicate-request issue.
How do functions actually scale? Do they add CPU and memory, or add servers?
Can anyone suggest how I should scale with Firebase? Otherwise my last option is to remove Firebase and make it REST-based.
I have recently set up an 'Azure Database for MySQL flexible server' using the burstable tier. The database is queried by a React frontend via a Node.js API, each of which runs on its own separate Azure App Service.
I've noticed that when I come to the app first thing in the morning, there is a delay before database queries complete. The React app is clearly running when I first come to it, which is serving the html front-end with no delays, but queries to the database do not return any data for maybe 15-30 seconds, like it is warming up. After this initial slow performance though, it then runs with no delays.
The database contains about 10 records at the moment, and 5 tables, so it's tiny.
This delay could conceivably be due to some delay with the node.js server, but as the React server is running on the same type of infrastructure (an app service), configured in the same way, and is immediately available when I go to its URL, I don't think this is the issue. I also have no such delays in my dev environment which runs on my local PC.
I therefore suspect there is some delay with the database server, but I'm not sure how to troubleshoot it. Before I dive down that rabbit hole though, I was wondering whether a delay when you first start querying a database (after, say, 12 hours of inactivity) is simply a characteristic of the burstable tier on Azure?
There may be more factors affecting this (see comments from people on my original question), but my solution has been to set two global variables which cache data, improving initial load times. The following should be set to ON in the Azure config:
'innodb_buffer_pool_dump_at_shutdown'
'innodb_buffer_pool_load_at_startup'
This is explained further in the following best practices documentation: https://learn.microsoft.com/en-us/azure/mysql/single-server/concept-performance-best-practices in the section marked 'Use InnoDB buffer pool Warmup'
We built a web application in which we use Firebase functions for lightweight work such as login and profile updates, and we deployed 2 functions to App Engine with Node.js.
Function 1: Downloading an audio/video file from Firebase Storage, converting it with ffmpeg, and uploading the converted version back to Storage.
But App Engine terminates with a signal in the middle of the download (after ~40 seconds) if the file is large (>500MB).
Function 2: Calling the Google API (ASR), waiting for the response (progress %), and writing this response to Firestore until it's completed (100%).
This process may take between 1 and 20 minutes depending on the file length. But here we get two different problems.
Either App Engine creates a new instance in the middle of the API call and kills the current instance (since we set the number of instances to 1), even though there are no concurrent requests.
I don't understand this behavior, since this should not require intensive CPU or memory usage. App Engine gives the following Info in the logs:
This request caused a new process to be started for your application,
and thus caused your application code to be loaded for the first time.
This request may thus take longer and use more CPU than a typical
request for your application
Or App Engine terminates due to idle_timeout even though it is waiting for the async API response and writing it to the db.
It looks like when there are no incoming requests, App Engine considers itself idle and terminates after a while (~10 minutes).
We are new to GCP and App Engine, so maybe we are using the wrong product (e.g. should it be Compute Engine?) or implementing it the wrong way. We also saw PubSub, Cloud Tasks, etc., which look like possible solutions for our case.
Thus I wonder what the most elegant way to approach the problem and implement a solution would be.
Any comment or feedback is appreciated.
Regards
A. Faruk Acar
App Engine app.yaml Configuration
runtime: nodejs10
manual_scaling:
  instances: 1
App Engine has a maximum timeout per request
10 minutes for App Engine Standard
60 minutes for App Engine Flex
In both cases the default values are lower.
So for the process you describe, it is not the optimal solution.
Cloud Tasks has similar limitations to App Engine, so you might run into similar problems.
Compute Engine can work for you, as it gives you virtual machines whose configuration you control. To keep it cost effective, find the smallest machine type that can run your app.
I have an Azure function app triggered by an HttpRequest. The function app reads the request, tosses one copy of it into a storage table for safekeeping and sends another copy to a queue for further processing by another element of the system. I have a client running an ApacheBench test that reports approximately 148 requests per second processed. That rate of processing will not be enough for our expected load.
My understanding of function apps is that they should spawn as many instances as are needed to handle the load sent to them. But this function app might not be scaling out quickly enough, as it's only handling that 148 requests per second. I need it to handle at least 200 requests per second.
I’m not 100% sure the problem is on my end, though. In analyzing the performance of my function app I found a LOT of 429 errors. What I found online, particularly https://learn.microsoft.com/en-us/azure/azure-resource-manager/resource-manager-request-limits, suggests that these errors could be due to too many requests being sent from a single IP. Would several ApacheBench 10K and 20K request load tests within a given day cause the 429 error?
However, if that’s not it, if the problem is with my function app, how can I force my function app to spawn more instances more quickly? I assume this is the way to get more throughput per second. But I’m still very new at working with function apps so if there is a different way, I would more than welcome your input.
Maybe the Premium app service plan that’s in public preview would handle more throughput? I’ve thought about switching over to that and running a quick test but am unsure if I’d be able to switch back?
Maybe EventHub is something I need to investigate? Is that something that might increase my apparent throughput by catching more requests and holding on to them until the function app could accept and process them?
Thanks in advance for any assistance you can give.
You don't provide much context about your app, but here are a few steps you can take to improve it:
If you want more control, use an App Service plan with Always On to avoid cold starts. You will also need to configure autoscaling, since you are responsible for it in this plan and it is not enabled by default.
Your Azure function must be fully async: you have external dependencies, so you don't want to block a thread while you are calling them.
Look at the limits. You can tweak them in host.json.
A 429 error means the function is too busy to process your request, so you are probably not using async when writing to the table and are blocking the thread.
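To illustrate the "fully async" point, here is a minimal sketch in Python (not the Azure Functions API). The names `save_to_table`, `send_to_queue` and `handle_request` are hypothetical stand-ins for the table write and queue send described in the question, and the sleeps only simulate I/O latency. The idea is that both writes are awaited together, so no thread sits blocked while they are in flight.

```python
import asyncio


async def save_to_table(payload):
    # Hypothetical stand-in for a non-blocking storage-table write.
    await asyncio.sleep(0.05)  # simulated I/O latency
    return "table-ok"


async def send_to_queue(payload):
    # Hypothetical stand-in for a non-blocking queue send.
    await asyncio.sleep(0.05)  # simulated I/O latency
    return "queue-ok"


async def handle_request(payload):
    # Start both writes and await them together; while they are pending,
    # the event loop is free to serve other requests.
    results = await asyncio.gather(save_to_table(payload), send_to_queue(payload))
    return list(results)


if __name__ == "__main__":
    print(asyncio.run(handle_request({"id": 1})))
```

In a real function you would swap the stand-ins for the async methods of whatever storage and queue clients you use; the structure of the handler stays the same.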
Function apps work very well and scale as advertised. The problem could be that the requests come from a single IP and Azure may be treating it as a DDoS attack. You can try the following:
Azure DevOps Load Test
You can load test using one of the Azure services. I am fairly sure they handle source IPs better.
Provision VM in Azure
The way I normally do it is to provision a VM (Windows 10 Pro) in Azure and use JMeter to load test. I have used this method and it works fine. You can provision a couple of them and split the load between them.
Use professional load testing services
If possible, you may use a service like Loader.io. They use sophisticated algorithms to run the load test and provision a bunch of VMs to run the same test.
Use Application Insights
If you aren't already, you should be using Application Insights to get a better look from the server's perspective. Go to Live Stream and see how many instances it provisions to handle the load test. You can easily look into any events and error logs that arise, and dig into each associated dependency to investigate the problem.
I'm running an Azure Function app on the Consumption Plan and I want to monitor the number of instances currently running. Using a REST API endpoint of the format
https://management.azure.com/subscriptions/{subscr}/resourceGroups/{rg}/providers/Microsoft.Web/sites/{appname}/instances?api-version=2015-08-01
I'm able to retrieve the instances. However, the result doesn't match the information that I see in Application Insights / Live Metrics Stream.
For example, right now App Insights shows 4 servers online, while API call returns just one (the GUID of this 1 instance is also among App Insights guids).
Who can I trust? Is there a better way to get instance count (e.g. from App Insights)?
UPDATE: It looks like the data from the REST API is wrong.
I sent 10000 messages to the queue, logging each function call with the ID of the instance that processed it.
While messages keep coming in and the backlog grows, the instance count from the REST API seems to be correct (it scaled from 1 to 12). After sending stops, the reported instance count rapidly goes down (eventually back to 1, while the processors are still busy).
But based on the speed and the execution logs, I can tell that the actual instance count kept growing and ended up at 15 instances by the time the last message was processed.
UPDATE2: It looks like the SDK refuses to report more than 20 servers. The metric flattens out at 20, while App Insights kept growing steadily and is already showing 41.
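For reference, the log-based cross-check described in the first update can be sketched roughly like this in Python. It assumes each log entry has already been reduced to a `(timestamp_in_seconds, instance_id)` pair; the function name, the 60-second window and the sample data are my own illustration, not anything produced by Azure.

```python
from collections import defaultdict


def instances_per_window(log_entries, window_seconds=60):
    """Map each window start (in seconds) to the number of distinct instance IDs seen in it."""
    seen = defaultdict(set)
    for timestamp, instance_id in log_entries:
        # Round the timestamp down to the start of its window.
        window_start = int(timestamp // window_seconds) * window_seconds
        seen[window_start].add(instance_id)
    return {start: len(ids) for start, ids in sorted(seen.items())}


# Made-up sample: (seconds since test start, instance ID that handled the call).
sample = [(0, "a"), (10, "b"), (59, "a"), (60, "a"), (61, "c"), (125, "c")]

if __name__ == "__main__":
    print(instances_per_window(sample))
```

This only counts instances that actually processed something in a given window, so an idle but still allocated instance would not show up.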
Who can I trust? Is there a better way to get instance count (e.g. from App Insights)?
Based on my understanding, we need to use the REST API endpoint to retrieve the instances. App Insights can be configured for multiple Web Apps, so the number of servers online in App Insights may cover multiple Web Apps.
Updated:
Based on my test, the number in Application Insights may not be real time.
During my test, when the Function App scaled out, I could get multiple instances from the REST API, and I could also check the number of servers online in App Insights.
https://management.azure.com/subscriptions/{subscriptionId}/resourceGroups/{resourcegroup}/providers/Microsoft.Web/sites/{functionname}/instances?api-version=2016-08-01
But after I finished the test, the REST API returned an instance count of 1, which to my understanding is the right result.
At the same time, Application Insights still showed the maximum number of servers online reached during my test.
After a while, the number of servers online in Application Insights also dropped to 1.
So if we want to get the number of instances for an Azure Function, my suggestion is to use the REST API.
Update2:
As DavidEbbo mentioned, the REST API is not always reliable:
Unfortunately, the REST API is not always reliable. Specifically, when a Function App scales across multiple scale units, only the instances from the 'home' scale unit are reflected. You probably will not see this in a smallish test, but likely will if you start scaling out widely (say over 20 instances).
I'm fairly new to Google Cloud and Node. Based on Google's recommendation (because of the requirement to watch Firebase at all times), I deployed a managed VM Node app instead of just App Engine. There are now 22-23 instances every time I deploy. Is this expected? I thought it would only scale when necessary.
This Node app has a method that watches Firebase variables; when one changes, the script fires off a notification.
What happens now is that multiple notifications are fired when I only expect one. I suspect it's because there are multiple instances of this app.
What is the right way to do this so that only one instance is watching?
Thanks.
You can use the method suggested by Google for flexible server environments and Firebase: https://cloud.google.com/solutions/mobile/mobile-firebase-app-engine-flexible and https://cloudplatform.googleblog.com/2016/06/learn-to-build-a-mobile-backend-service-with-Firebase-and-App-Engine.html
Have each instance "claim" users by writing its instance ID, in a transaction, to a location the user can read. The user then sends updates to that instance by including the instanceID in the path.
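The claiming idea can be sketched in Python as follows. This is only an in-memory simulation: `ClaimStore` is a hypothetical stand-in for a database location that supports transactions, and a lock plays the role of the transaction. It is not the Firebase SDK. The point it shows is that when every instance tries to claim the same user, exactly one claim wins, and only the winner sends the notification.

```python
import threading


class ClaimStore:
    """In-memory stand-in for a database location that supports transactions."""

    def __init__(self):
        self._lock = threading.Lock()
        self._owners = {}

    def claim(self, user_id, instance_id):
        """Atomically record instance_id as owner if the user is unclaimed.

        Returns the owner after the attempt (the earlier claimant if one exists).
        """
        with self._lock:
            return self._owners.setdefault(user_id, instance_id)


def should_notify(store, user_id, instance_id):
    # Only the instance whose claim succeeded handles this user's events.
    return store.claim(user_id, instance_id) == instance_id


if __name__ == "__main__":
    store = ClaimStore()
    # Simulate 22 instances all reacting to the same user's change.
    winners = [i for i in range(22) if should_notify(store, "user-1", f"instance-{i}")]
    print(winners)
```

With a real database the lock would be replaced by the database's own transaction primitive, and you would also need a way to release claims when an instance shuts down, which this sketch leaves out.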