Firebase cloud functions - what happens with multiple HTTP triggers at once - node.js

I have a firebase cloud function that is an endpoint for an external API, and it handles a POST request.
This external API POSTs data to my cloud function endpoint at random intervals (the function receives a POST request whenever the external API returns a result; several can arrive at once, and the timing is unpredictable).
exports.handleResults = functions.https.onRequest((req, res) => {
  if (req.method === 'POST') {
    // run code here that handles the POST payload
  }
})
What happens when more than one POST request comes in at the same time?
Is there a queue? Does it finish the first request before moving on to the next?
Or if another request comes in while the function is running, does it block/ignore the request until the function is done?

Cloud Functions will automatically scale up the server instances running your functions when it determines that more capacity is needed. Those instances will run your function concurrently. The instances will be scaled down when they are no longer needed. The exact behavior is not documented - it should be considered an implementation detail that may change over time.
To learn more about this, watch my video about Cloud Functions scaling and isolation.
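Because several instances can run the function at the same time, the handler should not rely on the order in which requests arrive or on state held in memory. A minimal sketch of a handler written that way, assuming a hypothetical processPayload helper and assuming that non-POST requests should be rejected with 405:

```javascript
// Hypothetical helper: stands in for whatever the POST payload handling does.
// It works only on its argument, so concurrent invocations cannot interfere.
async function processPayload(payload) {
  return { received: Object.keys(payload || {}).length };
}

// Handler with the same (req, res) shape that functions.https.onRequest expects.
async function handleResults(req, res) {
  if (req.method !== 'POST') {
    res.status(405).send('Method Not Allowed');
    return;
  }
  try {
    const result = await processPayload(req.body);
    res.status(200).json(result);
  } catch (err) {
    res.status(500).send('Internal error');
  }
}

// Wiring in index.js (requires the firebase-functions package):
// const functions = require('firebase-functions');
// exports.handleResults = functions.https.onRequest(handleResults);
```

Each invocation gets its own req and res, so two requests handled in parallel, whether on one instance or on several, do not share anything unless the code introduces shared state itself.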

Related

Firebase Functions: How to maintain 'app-global' API client?

How can I achieve an 'app-wide' global variable that is shared across Cloud Function instances and function invocations? I want to create a truly 'global' object that is initialized only once per the lifetime of all my functions.
Context:
My app's entire backend is Firestore + Firebase Cloud Functions. That is, I use a mix of background (Firestore) triggers and HTTP functions to implement backend logic. Additionally, I rely on a 3rd-party location service to continually listen to location updates from sensors. I want just a single instance of the client on which to subscribe to these updates.
The problem is that Firebase/Google Cloud Functions are stateless, meaning that function instances don't share memory/objects/state. If I call functionA, functionB, functionC, there's going to be at least 3 instances of locationService clients created, each listening separately to the 3rd party service so we end up with duplicate invocations of the location API callback.
Sample code:
// index.js
const functions = require("firebase-functions");
exports.locationService = require('./location_service');
this.locationService.initClient();
// define callable/HTTP functions & Firestore triggers
...
and
// location_service.js
var tracker = require("third-party-tracker-js");
const self = (module.exports = {
  initClient: function () {
    tracker.initialize('apiKey')
      .then((client)=>{
        client.setCallback(async function(payload) {
          console.log("received location update: ", payload)
          // process the payload ...
          // with multiple function instances running at once, we receive as many callbacks for each location update
        })
        client.subscribeProject()
          .then((subscription)=>{
            subscription.subscribe()
              .then((subscribeMsg)=>{
                console.log("subscribed to project with message: ", subscribeMsg); // success
              });
            // subscription.unsubscribe(); // ??? at what point should we unsubscribe?
          })
          .catch((err)=>{
            throw(err)
          })
      })
      .catch((err)=>{
        throw(err)
      })
  },
});
I realize what I'm trying to do is roughly equivalent to implementing a daemon in a single-process environment, and it appears that serverless environments like Firebase/Google Cloud Functions aren't designed to support this need because each instance runs as its own process. But I'd love to hear any contrary ideas and possible workarounds.
Another idea...
Inspired by this related SO post and the official GCF docs on stateless functions, I thought about using Firestore to persist a tracker value that allows us to conditionally initialize the API client. Roughly like this:
// read value from db; only initialize the client if there's no valid subscription
let locSubscriberActive = await getSubscribeStatusFromDb();
if (!locSubscriberActive) {
this.locationService.initClient();
}
// in `location_service.js`, do setSubscribeStatusToDb(); // set flag to true when we call subscribe(). reset when we get terminated
The problem faced: at what point do I unset/reset that value? Intuitively, I would do so the moment the function instance that initialized the client gets recycled/killed. However, it appears that it is not possible to know when a Firebase Cloud Function instance is terminated? I searched everywhere but couldn't find docs on how to detect such an event...
What you're trying to do is not at all supported in Cloud Functions. It's important to realize that there may be any number of server instances allocated for each deployed function. That's how Cloud Functions scales up and down to match the load on the function in a cost-effective way. These instances might be terminated at any time for any reason. You have no indication when an instance terminates.
Also, instances are not capable of performing any computation when they are idle. CPU resources are clamped down after a function terminates, and are spun up again when the next function is invoked on that instance. You can't have any "daemon" code running when a function is not actively being invoked. I don't know what your locationService does, but it is certainly doing nothing at all after a function terminates, regardless of how it terminated.
For any sort of long-running or daemon-like code, Cloud Functions is not a suitable product. You should instead consider also using another product that lets you run code 24/7 without disruptions. App Engine and Compute Engine are viable alternatives, and you will have to think carefully about if and how you want their server instances to scale with load.
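Within a single instance, module-level state does survive between invocations, so a client can at least be created once per instance instead of once per invocation. A minimal sketch of that per-instance memoization follows; createClient is a hypothetical stand-in for the third-party initialization in the question. This does not give you one client across instances, and nothing runs while the instance is idle.

```javascript
// Cached promise lives in module scope: one per server instance, not one per app.
let clientPromise = null;

// Hypothetical stand-in for tracker.initialize('apiKey') from the question.
let initCount = 0;
async function createClient() {
  initCount += 1;
  return { id: initCount };
}

// Returns the same promise to every caller inside this instance.
function getClient() {
  if (!clientPromise) {
    clientPromise = createClient().catch((err) => {
      clientPromise = null; // allow a retry on the next invocation
      throw err;
    });
  }
  return clientPromise;
}
```

Caching the promise rather than the resolved client means two invocations that overlap during startup still share one initialization.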

Error "ReferenceError: request is not defined" from basic Firebase Functions run

I'm trying to send a webhook to Slack whenever onWrite() is triggered on my Firebase DB. Going off a few other posts/guides I was able to deploy the code below, but I get the error ReferenceError: request is not defined on execution. I can't figure out how to fix it.
const functions = require('firebase-functions');
const webhookURL = "https://hooks.slack.com/services/string/string";
exports.firstTest = functions.database.ref('first').onWrite( event => {
  return request.post(
    webhookURL,
    {json: {text: "Hello"}}
  );
});
Calling your Cloud Function via a URL and sending back a response
By doing exports.firstTest = functions.database.ref('first').onWrite() you trigger your firstTest Cloud Function when data is created, updated, or deleted in the Realtime Database. It is called a background trigger, see https://firebase.google.com/docs/functions/database-events?authuser=0
With this trigger, everything happens in the back-end and you do not have access to a Request (or a Response) object. The Cloud Function doesn't have any notion of a front-end: for example it can be triggered by another back-end process that writes to the database. If you want to detect, in your front-end, the result of the Cloud Function (for example the creation of a new node) you would have to set a listener to listen to this new node location.
If you want to call your function through an HTTP request (possibly from your front-end, or from another "API consumer") and receive a response to the HTTP Request, you need to use another type of Cloud Function, the HTTP Cloud Function, see https://firebase.google.com/docs/functions/http-events. See also the other type of Cloud Function that you can call directly: the Callable Cloud Functions.
Finally, note that:
With .onWrite( event => {}), you are using the old syntax, see https://firebase.google.com/docs/functions/beta-v1-diff?authuser=0
The Firebase video series on Cloud Function is a good point to start to learn more on all these concepts, see https://firebase.google.com/docs/functions/video-series?authuser=0
Calling, from your Cloud Function, an external URL
If you want, from a Cloud Function, to call an external URL (the Slack webhook mentioned in your question, for example) you need to use a library like request-promise (https://github.com/request/request-promise).
See "How to fetch a URL with Google Cloud functions? request?" or "Google Cloud functions call URL hosted on Google App Engine" for some examples.
Important: Note that you need to be on the "Flame" or "Blaze" pricing plan.
As a matter of fact, the free "Spark" plan "allows outbound network requests only to Google-owned services". See https://firebase.google.com/pricing/ (hover your mouse on the question mark situated after the "Cloud Functions" title)
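A minimal sketch of the outbound call. Note that it swaps in the fetch function built into Node.js 18 and later in place of request-promise, so it assumes a runtime that provides a global fetch. The names postToSlack and fetchImpl are illustrative, and the URL is the placeholder from the question.

```javascript
// Placeholder URL copied from the question.
const webhookURL = 'https://hooks.slack.com/services/string/string';

// Posts a JSON body to the webhook and returns a promise, so a background
// trigger can return it and stay alive until the request completes.
// fetchImpl is injectable only so the function can be exercised offline.
function postToSlack(url, text, fetchImpl = globalThis.fetch) {
  return fetchImpl(url, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ text }),
  }).then((response) => {
    if (!response.ok) {
      throw new Error(`Slack webhook responded with status ${response.status}`);
    }
    return response.status;
  });
}

// Wiring (requires firebase-functions; same trigger as in the question):
// exports.firstTest = functions.database.ref('first')
//   .onWrite(() => postToSlack(webhookURL, 'Hello'));
```

Returning the promise from the trigger matters: if the function returns before the request finishes, the call may be cut off.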

API that will continuously return data

Beginner here. I'm using the Firebase Realtime Database, and I need my API to keep returning the value whenever something is added. See my code below.
apiCalls.get('/api/getallusers',function(req,res){
  userFunc.getAllUsers(function(err,result){
    if (err) return res.status(500).send('internal server error!');
    res.status(200).write(JSON.stringify(result));
    res.end();
    return res;
  })
})
this will return the error
Error [ERR_STREAM_WRITE_AFTER_END]: write after end
But if I remove res.end, it shows 1 record and keeps loading until the page times out.
Is what I'm doing possible, or are there different ways to do it?
Also, I'm using Firebase Cloud Functions for this API.
UPDATE:
Uploaded the API but it does not return anything...
here is the link https://us-central1-testproject-e6819.cloudfunctions.net/api1/api/getUser
tried axios and Event Source
Firebase functions logs the values but it does not return it..
If you're viewing the API response like a web page, your browser is buffering the data it's received until there's enough of it to form a more full page. Your browser is expecting content that ends, not some endless stream of data.
You should remove .end() if you expect to be able to continue to write to the output stream.
Also, I recommend using the Server-Sent Events (SSE) protocol for this. https://developer.mozilla.org/en-US/docs/Web/API/Server-sent_events It provides a nice standards-based abstraction that makes it very easy to handle event streams client-side.
const eventSource = new EventSource('https://api.example.com/someApi');
eventSource.addEventListener('userupdate', (e) => {
console.log(e.data);
});
Server-side, there are a couple of Express-based middleware packages that make this even easier than it already is.
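For reference, the server side of SSE can also be written without any middleware. A minimal sketch using only Node's built-in http module, assuming a long-running Node.js process; the subscribe hook is a hypothetical stand-in for whatever notifies you of new users:

```javascript
const http = require('http');

// Formats one Server-Sent Events frame: an event name, a data line, a blank line.
function formatSseEvent(eventName, data) {
  return `event: ${eventName}\ndata: ${JSON.stringify(data)}\n\n`;
}

// Creates (but does not start) a server that streams 'userupdate' events.
// subscribe(listener) is a hypothetical hook that calls listener(user) on each
// change and returns an unsubscribe function.
function createSseServer(subscribe) {
  return http.createServer((req, res) => {
    res.writeHead(200, {
      'Content-Type': 'text/event-stream',
      'Cache-Control': 'no-cache',
      Connection: 'keep-alive',
    });
    const unsubscribe = subscribe((user) => {
      // No res.end() here: the stream stays open for further events.
      res.write(formatSseEvent('userupdate', user));
    });
    req.on('close', unsubscribe); // stop writing once the client disconnects
  });
}

// Usage: createSseServer(mySubscribe).listen(8080);
```

The event name matches the 'userupdate' listener in the client snippet above, and JSON.stringify keeps each payload on a single data line.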
Operations in Cloud Functions must be relatively short-lived and end deterministically. There is no way to keep a connection open from Cloud Functions to the client.
Typically, consider what triggers the need to send new data. For example, if it is triggered by a new user registering, you can trigger your Cloud Function from Firebase Authentication. Then the function could, for example, write to the Realtime Database (or Cloud Firestore), and your client/app listens to the database for realtime updates. That way you're using all the pieces of Firebase in the way they're designed: Cloud Functions for short-lived updates triggered from events in the system, and the Realtime Database or Cloud Firestore for sending realtime updates.
If that doesn't work for your use-case, you'll need a runtime environment that allows you to keep processes alive. Something like App Engine flex, Kubernetes, or many other options come to mind for that.

The Syntax to call a Google Cloud Function from another Google Cloud Function

I want to make a Google Cloud Function with an HTTP trigger that calls another function (example: changeString). I know that I can include the function changeString in index.js. However, I want to reuse changeString so other Google Cloud Functions can call it.
exports.helloWorld = function helloWorld(req, res) {
  var result = changeString(req.body.string);
  res.send(result);
};
I know that there is a similar question, but it did not solve my problem.
I was wondering about this myself and I think the answer is that you don’t call the function. Instead, you should send a payload to the PubSub service from your HTTP Cloud Function. A secondary Cloud Function subscribes to the PubSub topic and consumes the payload (which is Base64 encoded).
As @user3158158 points out, you would publish a message to a Pub/Sub service. This is highlighted here: https://cloud.google.com/functions/docs/calling/pubsub#publishing_a_message_from_within_a_function
It shows the sample syntax at the above link; I needed to do the same thing today.
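The Base64 step is the part that tends to trip people up, so here is a minimal sketch of the subscriber-side decoding. It assumes the message arrives as an object with a Base64 data field; the body of changeString is made up for illustration, and the publishing call itself is left to the linked documentation.

```javascript
// Encodes a payload as Base64 of its JSON text, mirroring how the data arrives.
function encodePayload(payload) {
  return Buffer.from(JSON.stringify(payload)).toString('base64');
}

// Decodes the Base64 `data` field of a message back into an object.
function decodePayload(message) {
  return JSON.parse(Buffer.from(message.data, 'base64').toString('utf8'));
}

// Hypothetical body for the shared helper named in the question.
function changeString(s) {
  return s.toUpperCase();
}

// Subscriber logic: decode the message, then call the shared helper.
function handleMessage(message) {
  const { string } = decodePayload(message);
  return changeString(string);
}
```

For example, handleMessage({ data: encodePayload({ string: 'abc' }) }) decodes the payload and passes 'abc' to changeString.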

Node app that fetches, processes, and formats data for consumption by a frontend app on another server

I currently have a frontend-only app that fetches 5-6 different JSON feeds, grabs some necessary data from each of them, and then renders a page based on said data. I'd like to move the data fetching / processing part of the app to a server-side node application which outputs one simple JSON file which the frontend app can fetch and easily render.
There are two noteworthy complications for this project:
1) The new backend app will have to live on a different server than its frontend counterpart
2) Some of the feeds change fairly often, so I'll need the backend processing to constantly check for changes (every 5-10 seconds). Currently with the frontend-only app, the browser fetches the latest versions of the feeds on load. I'd like to replicate this behavior as closely as possible
My thought process for solving this took me in two directions:
The first is to setup an express application that uses setTimeout to constantly check for new data to process. This data is then sent as a response to a simple GET request:
const express = require('express');
const port = process.env.PORT || 3000; // assumed default; set PORT to override
let app = express();
let processedData = {};
const getData = () => {...}; // returns a promise that fetches and processes data
/* use an immediately invoked function with setTimeout to fetch the data
 * when the program starts and then once every 5 seconds after that */
(function refreshData() {
  getData().then((data) => {
    processedData = data;
  });
  setTimeout(refreshData, 5000);
})();
// serve the most recently processed data
app.get('/', (req, res) => {
  res.send(processedData);
});
app.listen(port, () => {
  console.log(`Started on port ${port}`);
});
I would then run a simple get request from the client (after properly adjusting CORS headers) to get the JSON object.
My questions about this approach are pretty generic: Is this even a good solution to this problem? Will this drive up hosting costs based on processing / client GET requests? Is setTimeout a good way to have a task run repeatedly on the server?
The other solution I'm considering would deal with setting up an AWS Lambda that writes the resulting JSON to an s3 bucket. It looks like the minimum interval for scheduling an AWS Lambda function is 1 minute, however. I imagine I could set up 3 or 4 identical Lambda functions and offset them by 10-15 seconds, however that seems so hacky that it makes me physically uncomfortable.
Any suggestions / pointers / solutions would be greatly appreciated. I am not yet a super experienced backend developer, so please ELI5 wherever you deem fit.
A few pointers.
Use cron tasks for periodic processing of data. This is far preferable, especially if you are formatting a lot of data.
Don't set up multiple Lambda functions for the same task. It's going to be messy to maintain all those functions.
After processing / fetching the feed, you can store the JSON file on your own server or in S3. Note that if it's S3, then you are paying and waiting for a network operation. You can read the file from your express app and just send the response back to your clients.
Depending on the file size and the load on your server, you might want to add a caching server so that you can cache the response until new JSON data is available.
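As a rough illustration of the caching pointer, here is a minimal in-process sketch that reuses a loaded result until a time-to-live expires. It is a simplified stand-in for a separate caching server; createCachedLoader and its parameters are illustrative names.

```javascript
// Wraps a loader so its result is reused until ttlMs has elapsed.
// now is injectable so the behavior can be checked without waiting.
function createCachedLoader(loader, ttlMs, now = Date.now) {
  let value;
  let expiresAt = 0;
  return async function get() {
    if (now() >= expiresAt) {
      value = await loader(); // e.g. read the processed JSON file
      expiresAt = now() + ttlMs;
    }
    return value;
  };
}

// Usage inside an express route (readProcessedJson is hypothetical):
// const getFeed = createCachedLoader(readProcessedJson, 5000);
// app.get('/', async (req, res) => res.send(await getFeed()));
```

With a 5 second TTL, clients hitting the endpoint in bursts share one read of the file. Requests that arrive while a reload is in flight may each trigger their own load, which is acceptable for a sketch but worth deduplicating under heavy load.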
