Avoid triggering Firebase Functions from the Realtime Database in special cases - node.js

Sometimes we use Firebase Functions triggered by the Realtime Database (onCreate/onDelete/onUpdate, ...) to perform some logic (like counting, etc.).
My question: would it be possible to avoid this trigger in some cases? Mainly, when I would like to allow a user to import a huge JSON into Firebase.
Example:
A function E is triggered on the creation of a new child in /examples. Normally, users add examples one by one to /examples and function E runs its logic. However, I would like to allow a user (from the front end) to import 2000 children into /examples, where the logic normally done by E can be performed at import time, without the need for E. In that case I do not want E to be triggered, since a high number of function executions would result. (Note: I am aware of the 1000/second limit.)
Update:
Based on the accepted answer, I submitted my own answer below.

As far as I know, there is no way to disable a Cloud Function programmatically without deleting it outright. However, deleting it introduces an edge case where data added to the database while the import is taking place goes unprocessed.
A compromise would be to signal that the data you are uploading should be post-processed. Say you are uploading to /examples/{pushId}: instead of attaching the database trigger to /examples/{pushId}, attach it to /examples/{pushId}/needsProcessing (or something similar). Unfortunately, this has the trade-off of not being able to make use of the change objects for onUpdate() and onWrite().
const result = await firebase.database().ref('/examples').push({
  title: "Example 1A",
  desc: "This is an example",
  attachments: { /* ... */ },
  class: "-MTjzAKMcJzhhtxwUbFw",
  author: "johndoe1970",
  needsProcessing: true
});
async function handleExampleProcessing(snapshot, context) {
  // Only post-process if needsProcessing is truthy
  if (!snapshot.exists() || !snapshot.val()) {
    console.log('No processing needed, exiting.');
    return;
  }
  const exampleRef = snapshot.ref.parent; // /examples/{pushId}, runs with admin privileges
  const data = (await exampleRef.once('value')).val();
  // do something with data, like mutate it
  // commit the changes and clear the flag
  return exampleRef.update({
    ...data,
    needsProcessing: null /* removes the needsProcessing key */
  });
}
const functionsExampleProcessingRef = functions.database.ref("examples/{pushId}/needsProcessing");
export const handleExampleNeedingProcessingOnCreate = functionsExampleProcessingRef.onCreate(handleExampleProcessing);
// This second trigger is only needed if you ever intend to write `needsProcessing = /* some falsy value */`.
// I recommend just creating and deleting the key, in which case the onCreate trigger above suffices.
export const handleExampleNeedingProcessingOnUpdate = functionsExampleProcessingRef.onUpdate((change, context) => handleExampleProcessing(change.after, context));
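For the bulk-import case, the client can then write the children without the needsProcessing flag, so neither trigger fires. A minimal sketch, assuming the import payload is an array named importedExamples (a hypothetical name):
// Multi-location update: none of the children include needsProcessing,
// so the triggers above stay silent during the import.
const updates = {};
for (const example of importedExamples) {
  const key = firebase.database().ref('/examples').push().key;
  updates[`/examples/${key}`] = example;
}
await firebase.database().ref().update(updates);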

An alternative to Sam's approach is to use feature flags to determine whether a Cloud Function performs its main task. I often have this in my code:
exports.onUpload = functions.database
  .ref("/uploads/{uploadId}")
  .onWrite((event) => {
    return ifEnabled("transcribe").then(() => {
      console.log("transcription is enabled: calling Cloud Speech");
      ...
    })
  });
The ifEnabled is a simple helper function that checks (also in the Realtime Database) whether the feature is enabled:
function ifEnabled(feature) {
  console.log("Checking if feature '" + feature + "' is enabled");
  return new Promise((resolve, reject) => {
    admin.database().ref("/config/features")
      .child(feature)
      .once('value')
      .then(snapshot => {
        if (snapshot.val()) {
          resolve(snapshot.val());
        }
        else {
          reject("No value or 'falsy' value found");
        }
      })
      .catch(reject); // propagate read errors so the promise can't hang
  });
}
Most of my usage of this is during talks at conferences, to enable the Cloud Functions at the right time (as a deploy takes a bit longer than we'd like for a demo). But the same approach should work to temporarily disable features during, for example, a data import.
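For a data import, that could look like the following sketch, assuming admin access to the /config/features path used above (runBulkImport is a hypothetical import routine):
// Turn the feature off, import, then turn it back on.
const featureRef = admin.database().ref('/config/features/transcribe');
await featureRef.set(false);
await runBulkImport(); // hypothetical: performs the bulk write to the database
await featureRef.set(true);
Note that the functions still fire for every record written during the import; they just reject early in ifEnabled(), so you still pay for the (short) invocations.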

Okay, another solution would be:
A: Add a new node in Firebase, like /triggers-queue, to which all CRUD operations that should fire a background function are written. In this node, we add a key for each node that should have triggers (in our example, /examples). Any key that represents a node should also have /created, /updated, and /deleted children, as follows.
/examples
    /example-id-1
/triggers-queue
    /examples
        /created
            /example-id
        /updated
            /example-id
                old-value
        /deleted
            /example-id
                old-value
Note that the old-value must be added by the app (front end, etc.).
We always set onCreate triggers on:
/triggers-queue/examples/created/{exampleID} (simulates onCreate)
/triggers-queue/examples/updated/{exampleID} (simulates onUpdate)
/triggers-queue/examples/deleted/{exampleID} (simulates onDelete)
The fired function can derive all the information it needs to handle the logic as follows:
Operation type: from the path (created, updated, or deleted)
Key of the object: from the path
Current data: by reading the corresponding node (i.e., /examples/{id})
Old data: from the triggers-queue node
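For illustration, here is a minimal sketch of the simulated onCreate trigger under the layout above (the processing logic itself is left as a placeholder):
// Fires only when the app explicitly queues a created event;
// bulk imports that write straight to /examples never reach here.
exports.onExampleCreated = functions.database
  .ref('/triggers-queue/examples/created/{exampleID}')
  .onCreate(async (snapshot, context) => {
    const exampleID = context.params.exampleID; // key of the object, from the path
    const current = await admin.database().ref(`/examples/${exampleID}`).once('value'); // current data
    // ... run the counting or other logic on current.val() here ...
    return snapshot.ref.remove(); // dequeue the processed event
  });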
Good points:
You can import huge amounts of data into /examples without firing any functions, since nothing is written to /triggers-queue.
You can fan out functions to get past the 1000/second limit, for example by setting on-create triggers on both:
/triggers-queue/examples/created0/{exampleID} and
/triggers-queue/examples/created1/{exampleID}
Bad points:
It is more difficult to implement.
The app has to write more data to Firebase (like the old data).
B: Another way (although not an answer to this question) is to move the logic from the background function into an HTTP function and call it on every CRUD operation.
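A minimal sketch of that variant (the endpoint name is hypothetical); a bulk import would simply never call it:
// Same logic as the background function, exposed over HTTP.
exports.processExample = functions.https.onRequest(async (req, res) => {
  const { exampleID, operation } = req.body; // supplied by the app on every CRUD op
  // ... run the counting or other logic for exampleID here ...
  res.status(200).send('ok');
});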

Related

Number of reads for multiple Firebase trigger functions doing similar things

I have an onUpdate Firestore trigger function that does multiple things:
functions.firestore.document('document').onUpdate((change, context) => {
  const updatedObject = change.after.data()
  if (updatedObject.first) {
    doFirst()
  }
  if (updatedObject.second) {
    doSecond()
  }
})
I am thinking of splitting this trigger into two smaller triggers to keep my functions more concise.
functions.firestore.document('document').onUpdate((change, context) => {
  const updatedObject = change.after.data()
  if (!updatedObject.first) {
    return
  }
  doFirst()
})
functions.firestore.document('document').onUpdate((change, context) => {
  const updatedObject = change.after.data()
  if (!updatedObject.second) {
    return
  }
  doSecond()
})
The Firestore pricing docs mention the following:
When you listen to the results of a query, you are charged for a read each time a document in the result set is added or updated. You are also charged for a read when a document is removed from the result set because the document has changed. (In contrast, when a document is deleted, you are not charged for a read.)
Would this increase the number of reads from 1 to 2?
The docs do not clearly state the behavior when multiple functions listen to the same event.
A more general question: would increasing the number of functions listening to the same event increase the number of reads, and hence my bill?
Is there a best practice for this?
firebaser here
The document data passed to Cloud Functions as part of the trigger (so change.before and change.after) comes out of the existing flow and is not a charged read. Only additional reads that you perform inside your Cloud Functions code are charged.
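To make that concrete, here is a small sketch (the collection and document paths are hypothetical):
functions.firestore.document('items/{itemId}').onUpdate(async (change, context) => {
  const updatedObject = change.after.data(); // delivered with the trigger: not a charged read
  const previous = change.before.data();     // also free
  const settings = await admin.firestore().doc('config/settings').get(); // extra read: charged
  // ...
});
So splitting the trigger in two does not, by itself, add document reads; both functions receive the same payload at no charge, though you do pay for the additional function invocation itself.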

How to listen to realtime database changes (like a stream) in firebase cloud functions?

I am trying to listen for changes to a specific field in a specific node in the Firebase Realtime Database. I am using Node.js for the Cloud Functions. This is the function:
const functions = require("firebase-functions");
const admin = require('firebase-admin');
admin.initializeApp();
const delay = ms => new Promise(res => setTimeout(res, ms));
const payload = { notification: { title: 'cond changed', body: '...' } }; // placeholder: not defined in the original question
exports.condOnUpdate = functions.database.ref('data/').onWrite(async (change, context) => {
  const dataB = change.before.val();
  const data = change.after.val();
  const prev = dataB['cond'];
  const curr = data['cond'];
  // terminate if there is no change
  if (prev === curr) {
    return;
  }
  if (curr === 0) {
    // send notifications every 10 seconds until the value becomes 1
    while (true) {
      admin.messaging().sendToTopic("all", payload);
      await delay(10000);
    }
  } else if (curr === 1) {
    // send one notification
    admin.messaging().sendToTopic("all", payload);
    return;
  }
});
The function works as expected, but the loop never stops since it never exits; instead, the function runs again (with a new instance, I suppose).
So is there any way to listen to data changes within one function, just like streams in other languages, or perhaps to stop all running Cloud Functions?
Thanks in advance!
From the comments it seems that the cond property is updated from outside of the Cloud Function. When that happens, it triggers a new instance of the Cloud Function with its own curr variable, but it does not update the curr value in the already-running instance. So the curr variable in the original instance of your code never changes, and that instance continues to run until it times out.
If you want the current instance to detect the change to the property, you will need to monitor that property inside the code too, by calling onValue on a reference to it.
An easier approach, though, might be to use an interval (scheduled) trigger rather than a database trigger, to:
have code execute every minute, and then in there
query the database for the relevant cond value, and then
send a notification for each of those.
This requires no endless loop or timeouts in your code, which is typically a better approach when it comes to Cloud Functions.
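A rough sketch of that scheduled approach, assuming the pending entries can be found with a query (the path and field layout here are illustrative, not taken from the question):
// Runs every minute; no long-lived loop inside a single invocation.
exports.notifyPendingConds = functions.pubsub
  .schedule('every 1 minutes')
  .onRun(async () => {
    const snap = await admin.database()
      .ref('data')
      .orderByChild('cond')
      .equalTo(0) // only entries still awaiting action
      .once('value');
    const sends = [];
    snap.forEach((child) => {
      sends.push(admin.messaging().sendToTopic('all', payload));
    });
    return Promise.all(sends);
  });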

Can I commit a Firestore batch write without waiting?

Overview
I want to create some document references in a Cloud Function and return them to be used in another document. My app is time critical, so I don't want to wait for the batch to commit before returning the references.
Current solution
I currently create the references and the destination document in one Cloud Function and then commit the whole batch. This makes my code repetitive, as I need to create these references in other places too.
My question
If I omit the .then from batch.commit(), can I simply pass the references straight back and leave Cloud Firestore to write the documents in its own time?
I've created this test script, which works. Is there a problem with this approach, or should I always wait for a batch to finish writing before continuing code execution?
My sample code
// Set the data to be written
let myData = { test: '123' };
// Create the document references and return them for future processing
let docRefs = writeData(myData);
// Write these references to a master document
let myDoc = {
  name: 'A document containing references to other documents',
  doc0Ref: docRefs[0],
  doc1Ref: docRefs[1],
  doc2Ref: docRefs[2]
};
return db.collection('masterCollection').add(myDoc).then(response => {
  console.log('Success');
  return Promise.resolve();
}).catch(err => {
  console.error(err);
  return Promise.reject(err);
});

// Create the batch and write the data
function writeData(myData) {
  let batch = firestore.batch();
  let doc1Ref = firestore.collection('test').doc();
  let doc2Ref = firestore.collection('test').doc();
  let doc3Ref = firestore.collection('test').doc();
  console.log(`doc1Ref: ${doc1Ref.id}, doc2Ref: ${doc2Ref.id}, doc3Ref: ${doc3Ref.id}`);
  batch.set(doc1Ref, myData);
  batch.set(doc2Ref, myData);
  batch.set(doc3Ref, myData);
  batch.commit(); // No .then to wait for the batch to be written
  return [doc1Ref, doc2Ref, doc3Ref];
}
If your Cloud Function doesn't handle all of its asynchronous work correctly (typically, with promises), there is a very good chance that the work will not complete successfully.
For HTTP triggers, you must send your final response to the client only after all the pending work is complete.
For all other types of triggers, you must return a promise that resolves only after all the async work in that function is complete.
What you have right now is a "dangling" promise that isn't handled according to these rules. If you're using ESLint or TSLint to check your code, the linter will likely detect this and complain about it.
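A minimal sketch of writeData reworked to satisfy those rules; the caller must then await it (or chain it) before returning:
// The commit is awaited, so the returned promise only resolves
// once the documents are actually written.
async function writeData(myData) {
  const batch = firestore.batch();
  const docRefs = [
    firestore.collection('test').doc(),
    firestore.collection('test').doc(),
    firestore.collection('test').doc()
  ];
  docRefs.forEach(ref => batch.set(ref, myData));
  await batch.commit();
  return docRefs;
}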

Cloud Functions for Firebase - event.data read special value

I'm currently learning to use Cloud Functions for Firebase and have the following problem:
In my database, the structure I'll be referring to looks like this:
fruits
    RandomFruitID
        fruitID: RandomFruitID
In my index.js I want to create the function:
exports.newFruit = functions.database.ref("fruits").onWrite(event => {
  (...)
  // INSIDE HERE I WANT TO ACCESS THE "fruitID" VALUE, MEANING THE "RandomFruitID"
});
How can I achieve that?
Best wishes
Your current function triggers on any change under /fruits, so there is no single current fruitID value.
If you want to trigger when a specific fruit gets written, change the trigger to fruits/{fruitId}. This also makes the value of fruitId available in your code:
exports.newFruit = functions.database.ref("fruits/{fruitId}").onWrite(event => {
  if (!event.data.previous.exists()) {
    var newFruitKey = event.params.fruitId;
    ...
  }
});
I recommend reading the Firebase documentation on database-triggered functions, which covers many cases like this.
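For reference, the answer above uses the pre-1.0 event-based API; with firebase-functions 1.0 and later, the same idea looks roughly like this:
exports.newFruit = functions.database.ref("fruits/{fruitId}")
  .onCreate((snapshot, context) => {
    const newFruitKey = context.params.fruitId; // the {fruitId} wildcard value
    // snapshot.val() contains the newly written fruit
  });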

Is it considered bad practice to manipulate a queried database document before sending to the client in Mongoose?

So I spent too long trying to figure out how to manipulate a returned database document (using Mongoose) with transforms and virtuals, but for my purposes those aren't options. The behaviour I want is very similar to a transform (in which I delete a property), but I only want to delete the property from the returned document if and only if it satisfies a requirement calculated using the req.session.user/req.user object (I'm using PassportJS, but any equivalent session user suffices). Obviously there is no access to the request object in a virtual or transform, so I can't do the calculation there.
Then it dawned on me that I could just query normally and manipulate the returned object in the callback before sending it to the client. I could put this in a middleware function that looks tidy, but something tells me it's a hacky thing to do: I'd be presenting an API to the client that does not directly reflect the data stored in and retrieved from the database. It may also clutter up my route configuration and make the code harder to maintain if I have middleware like this all over. Below is an example of what the manipulation looks like:
app.route('/api/items/:id').get(manipulateItem, sendItem);
app.param('id', findUniqueItem);
function findUniqueItem(req, res, next, id) {
  Item.findUniqueById(id, function(err, item) {
    if (!err) { req.itemFound = item; }
    next();
  });
}
function manipulateItem(req, res, next) {
  if (req.itemFound.people.indexOf(req.user) === -1) {
    req.itemFound.userIsInPeopleArray = false;
  } else {
    req.itemFound.userIsInPeopleArray = true;
  }
  delete req.itemFound.people;
  next(); // without this the request never reaches sendItem
}
function sendItem(req, res, next) {
  res.json(req.itemFound);
}
I feel like this is a workaround to a problem with a simpler solution, but I'm not sure what that solution is.
There's nothing hacky about the act of modifying it; it's all a matter of when you modify it.
For toy servers and learning projects, the answer is: whenever you want.
In production environments, you want to do your transform on the way out of your system and into the next system (the next system might be the end user; it might be another server; it might be another big block of functionality in your own server that shouldn't have access to more information than it needs to do its job).
getItemsFromSomewhere()
  .then(transformToTypeICanUse)
  .then(filterBasedOnMyExpectations)
  .then(doOperations)
  .then(transformToTypeIPromisedYou)
  .then(outputToNextSystem);
That example might not be super helpful in terms of the actual how, but that's sort of the point.
As you can see, you could link that system of events up to another system of events (which does its own transform to its own data structure, its own filtering/mapping, transforms the data into whatever its API promises, and passes it along to the next system, eventually out to the end user).
I think part of the sense of "hacking" comes from bolting the result of the async process onto req, where req gets injected from step to step through the middleware.
That said:
function eq(a) {
  return function (b) { return a === b; };
}
function makeOutputObject(inputObject, personWasFound) {
  // return whatever you want
}
var personFound = req.itemFound.people.some(eq(req.user));
var outputObject = makeOutputObject(req.itemFound, personFound);
Now you aren't using the actual delete keyword or mutating the call-to-call state of that itemFound object.
You're separating your view-based logic from your app-based logic, but without the formal barriers (which can always be added later, if they're needed).
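Wiring that back into the question's middleware might look like this sketch (makeOutputObject is deliberately left open, as above):
function manipulateItem(req, res, next) {
  var personFound = req.itemFound.people.some(eq(req.user));
  // build a separate view object instead of mutating the document
  req.outputItem = makeOutputObject(req.itemFound, personFound);
  next();
}
function sendItem(req, res) {
  res.json(req.outputItem);
}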
