How to stop firebase functions from propagating triggers - node.js

I have a Firebase function to decrease the commentCount when a comment is deleted, like this:
export const onArticleCommentDeleted = functions.firestore.document('articles/{articleId}/comments/{uid}').onDelete((snapshot, context) => {
  return db.collection('articles').doc(context.params.articleId).update({
    commentCount: admin.firestore.FieldValue.increment(-1)
  })
})
I also have a Firebase function that recursively deletes an article's comments when the article is deleted:
export const onArticleDeleted = functions.firestore.document('articles/{id}').onDelete((snapshot, context) => {
  const commentsRef = db.collection('articles').doc(snapshot.id).collection('comments');
  // Return the promise so the function is not terminated before the delete finishes.
  return db.recursiveDelete(commentsRef); // this triggers onArticleCommentDeleted multiple times
})
When I delete an article, onArticleCommentDeleted is triggered and tries to update the article that has already been deleted. Of course I can check whether the article exists before updating it, but that is cumbersome and a waste of resources.
Is there any way to avoid propagating further triggers?

I think the problem arises from the way I use the trigger. In general, it's not a good idea to implement an onDelete trigger on a child document that updates its own parent; this is bound to cause conflicts.
Instead, I use a transaction on the client side:
...
runTransaction(async trans => {
  trans.delete(commentRef);
  trans.update(articleRef, {
    commentCount: admin.firestore.FieldValue.increment(-1)
  })
})
This makes sure that if one of the operations fails, they both fail, and it removes the need for the triggers. Relying on the client side is not the best idea, but I think the trade-off is acceptable.

There is no way to prevent triggering the Cloud Function on the comments when you delete the comments for that article. You will have to check for that condition in the function code itself, as you already said.
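A minimal sketch of such a check, assuming the article reference behaves like an Admin SDK DocumentReference (get() resolving to a snapshot with an exists flag, plus update()). The helper name decrementIfArticleExists is hypothetical, and the extra read is exactly the cost the question mentions.

```javascript
// Hypothetical helper: skip the counter update when the parent article is gone.
// `articleRef` is assumed to expose get() and update() like a Firestore
// DocumentReference in the Admin SDK; `decrement` is the value to write.
async function decrementIfArticleExists(articleRef, decrement) {
  const snapshot = await articleRef.get();
  if (!snapshot.exists) {
    // The article was deleted, so there is no counter left to maintain.
    return false;
  }
  await articleRef.update({ commentCount: decrement });
  return true;
}
```

Inside onArticleCommentDeleted this could be called with db.collection('articles').doc(context.params.articleId) and admin.firestore.FieldValue.increment(-1). The article can still disappear between the read and the update, so a real function may also need to tolerate a failed update.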

Related

How to handle multiple database connections for 2 or 3 SELECT queries in AWS Lambda with nodejs?

The lambda's job is to see if a query returns any results and alert subscribers via an SNS topic. If no rows are returned, all good, no action needed. This has to be done every 10 minutes.
For some reason, I was told that we can't have any triggers added on the database, and no on-prem environment is suitable to host a cron job.
Here comes Lambda.
This is what I have in the handler, inside a loop for each database:
sequelize.authenticate()
  .then(() => {
    for (let j = 0; j < databases[i].rawQueries.length; j++) {
      sequelize.query(databases[i].rawQueries[j]).then(results => {
        // For a raw query, results[0] is the array of returned rows
        if (results[0].length > 0) {
          let message = "Temporary message for testing purposes" // + query results
          publishSns("Auto Query Alert", message)
        }
      }).catch(err => {
        publishSns("Auto Query SQL Error", `The following query could not be executed: ${databases[i].rawQueries[j]}\n${err}`)
      })
    }
  })
  .catch(err => {
    publishSns("Auto Query DB Connection Error", `The following database could not be accessed: ${databases[i].database}\n${err}`)
  })
  .then(() => sequelize.close())
// sns publisher
function publishSns(subject, message) {
  const params = {
    Message: message,
    Subject: subject,
    TopicArn: process.env.SNStopic
  }
  // Return the promise so callers can wait for the publish to complete
  return SNS.publish(params).promise()
}
I have 3 separate database configurations, and for those few SELECT queries, I thought I could just loop through the connection instances inside a single lambda.
The process is asynchronous and takes 9 to 12 seconds per invocation, which I assume is far from optimal.
The whole thing feels very suboptimal, but that's my current level :)
To make things worse, I now read that Lambda and sequelize don't really play well together.
I am using sequelize because that's the only way I could get 3 connections to the database in the same invocation to work without issues. I tried the mssql and tedious packages and couldn't get it to work with either of them.
It now feels like using an ORM is overkill for this very simple task of a SELECT query, and I would really like to at least have the connections and their queries done asynchronously to save some execution time.
I am looking into different ways to accomplish this, went down the rabbit hole, and now have more questions than before! Generators: are they still useful? Observables with RxJS: could they apply here? Async/await or just Promises? Do I even need sequelize?
Any guidance/opinion/criticism would be much appreciated.
I'm not familiar with sequelize.js, but I hope I can help. I don't know your level with RxJS and Observables, but it's worth a try.
I think you could definitely use Observables and RxJS.
I would start with an interval() that runs the code at whatever period you define.
Since the interval is an Observable, you can pipe it, do the auth bit, and use map() to build an array of Observables, one per .query call. (I am assuming all your calls, authenticate and query, return Promises, so they can be turned into Observables with from().) You can then pass that array to something like forkJoin() to get a single response after all calls are done.
In the .subscribe at the end, you would call publishSns().
You can pipe a catchError() too to process errors.
The map() part might not be necessary: since the queries don't depend on a value from authenticate, you could build the array beforehand and store it in a variable.
I'm certain my solution isn't the only one or the best, but I think it would work.
Hope it helps, and let me know if it works!
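For comparison, the same "run every query, then react once all are done" flow can be sketched with plain Promises instead of RxJS. This is a swapped-in technique, not what the answer above proposes. Here runQuery and notify are hypothetical stand-ins for sequelize.query and publishSns, and each query is assumed to resolve to an array of rows.

```javascript
// Run every query concurrently and report per-query outcomes.
// `runQuery(sql)` and `notify(subject, message)` are hypothetical stand-ins
// for the real database call and the SNS publisher.
async function checkQueries(queries, runQuery, notify) {
  // allSettled waits for all queries, even if some of them reject.
  const outcomes = await Promise.allSettled(queries.map((sql) => runQuery(sql)));

  const notifications = [];
  outcomes.forEach((outcome, index) => {
    if (outcome.status === 'rejected') {
      notifications.push(
        notify('Auto Query SQL Error', `${queries[index]}\n${outcome.reason}`)
      );
    } else if (outcome.value.length > 0) {
      notifications.push(
        notify('Auto Query Alert', `${queries[index]} returned ${outcome.value.length} row(s)`)
      );
    }
  });

  // Wait for the alerts too, so the caller can close the connection afterwards.
  await Promise.all(notifications);
  return notifications.length;
}
```

With three databases, the function could be called once per connection and the three calls wrapped in another Promise.all, closing each connection only after its own call has resolved.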

DataSnapshot.ref in Functions Emulators only points to default database

Let's say I have a node in a secondary realtime database called "test" with a value of "foobar".
I want to set up a function that prevents it from being deleted. More realistically this node would have several child nodes, where the function first checks if it can be deleted or not. However, here we never allow it to be deleted to keep the code as short as possible.
So I add a function that triggers onDelete and just rewrites the value.
In short:
Secondary database has: {"test":"foobar"}
onDelete function:
exports.testDelete = functions.database
  .instance("secondary")
  .ref("test")
  .onDelete(async (snap, context) => {
    await snap.ref.set(snap.val());
  });
When running this with emulators, I would expect that when I delete the node, the node would just reappear in the secondary database, which is what happens when deployed to production. In the emulators, the node reappears, but in the main database instead of the secondary database. The only way I see to fix this is to replace snap.ref.set(snap.val()) with admin.app().database("https://{secondarydatabasedomain}.firebasedatabase.app").ref().child("test").set(snap.val()) which looks a little cumbersome just to get emulators to work.
Am I doing something wrong here?
I am using node 14, and firebase CLI version 9.23.0
To specify the instance and path, you have followed this syntax:
Instance named "my-app-db-2": functions.database.instance('my-app-db-2').ref('/foo/bar')
You have specified the instance name (without it, the function is attached to the default database), so the syntax seems correct.
To handle the event data, the syntax is:
onDelete(handler: (snapshot: DataSnapshot, context: EventContext) => any): CloudFunction
For an example, you can refer to the documentation:
// Listens for new messages added to /messages/:pushId/original and creates an
// uppercase version of the message to /messages/:pushId/uppercase
exports.makeUppercase = functions.database.ref('/messages/{pushId}/original')
  .onCreate((snapshot, context) => {
    // Grab the current value of what was written to the Realtime Database.
    const original = snapshot.val();
    functions.logger.log('Uppercasing', context.params.pushId, original);
    const uppercase = original.toUpperCase();
    // You must return a Promise when performing asynchronous tasks inside a Functions such as
    // writing to the Firebase Realtime Database.
    // Setting an "uppercase" sibling in the Realtime Database returns a Promise.
    return snapshot.ref.parent.child('uppercase').set(uppercase);
  });
If all of the above syntax has been followed correctly, then I recommend reporting a bug on the repo with a minimal repro, including the entire Cloud Function, as mentioned by Frank in a similar scenario.

How to handle Firebase Cloud Functions infinite loops?

I have a Firebase Cloud Function that is triggered by an update to some data in a Firebase Realtime Database. When the data is updated, I want to read the data, perform some calculations on that data, and then save the results of the calculations back to the Realtime Database. It looks like this:
exports.onUpdate = functions.database.ref("/some/path").onUpdate((change) => {
  const values = change.after.val();
  const newValues = performCalculations(values);
  return change.after.ref.update(newValues);
});
My concern is that this may create an indefinite loop of updates. I saw a note on the Cloud Firestore Triggers that says:
"Any time you write to the same document that triggered a function,
you are at risk of creating an infinite loop. Use caution and ensure
that you safely exit the function when no change is needed."
So my first question is: Does this same problem apply to the Firebase Realtime Database?
If it does, what is the best way to prevent the infinite looping?
Should I be comparing before/after snapshots, the key/value pairs, etc.?
My idea so far:
exports.onUpdate = functions.database.ref("/some/path").onUpdate((change) => {
  // Get old values
  const beforeValues = change.before.val();
  // Get current values
  const afterValues = change.after.val();
  // Something like this???
  if (beforeValues === afterValues) return null;
  const newValues = performCalculations(afterValues);
  return change.after.ref.update(newValues);
});
Thanks
Does this same problem apply to the Firebase Realtime Database?
Yes, the chance of infinite loops occurs whenever you write back to the same location that triggered your Cloud Function to run, no matter what trigger type was used.
To prevent an infinite loop, you have to detect its condition in the code. You can:
either flag the node/document after processing it by writing a value into it, and check for that flag at the start of the Cloud Function.
or you can detect whether the Cloud Function code made any effective change/improvement to the data, and not write it back to the database when there was no change/improvement.
Either of these can work, and which one to use depends on your use-case. Your if (beforeValues === afterValues) return null is a form of the second approach, and can indeed work - but that depends on details about the data that you haven't shared.
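A minimal sketch of the second approach, assuming the node holds a plain object and the calculation is deterministic. When val() returns objects, === compares references rather than contents, so this sketch compares the calculated fields against the stored ones by value instead. The helper name pendingUpdate is hypothetical, and calculate stands in for performCalculations.

```javascript
const { isDeepStrictEqual } = require('node:util');

// Returns the fields that still need writing, or null when the stored data
// already matches the calculated result (so the function can exit early).
// `calculate` is a hypothetical stand-in for performCalculations.
function pendingUpdate(afterValues, calculate) {
  const newValues = calculate(afterValues);
  const changed = Object.keys(newValues).some(
    (key) => !isDeepStrictEqual(newValues[key], afterValues[key])
  );
  return changed ? newValues : null;
}
```

In the trigger, the result of pendingUpdate(change.after.val(), performCalculations) would be passed to change.after.ref.update() when it is non-null, and the function would return null otherwise. The invocation caused by the function's own write then finds nothing left to change and stops.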

Firebase doc changes

Thanks for your help. I am new to Firebase and I am designing an application with Node.js. Every time a change is detected in a document, I want to invoke a function that creates or updates the file system according to the new structure of the data in the Firebase document. Everything works fine, but if the document is updated with 2 or more attributes, the makeBotFileSystem function is invoked the same number of times. This can cause performance problems or file-overwriting problems, since the function generates or updates multiple files.
Is there any way to wait until all the information in the document has finished updating, rather than reacting attribute by attribute? This is my code:
let botRef = firebasebotservice.db.collection('bot');
botRef.onSnapshot(querySnapshot => {
  querySnapshot.docChanges().forEach(change => {
    if (change.type === 'modified') {
      console.log('bot-changes ' + change.doc.id);
      const botData = change.doc.data();
      botData.botId = change.doc.id;
      //HERE I CREATE OR UPDATE FILESYSTEM STRUCTURE, ACCORDING Data changes
      fsbotservice.makeBotFileSystem(botData);
    }
  });
});
The onSnapshot function will notify you any time a document changes. If property changes are committed one by one instead of updating the document all at once, you will receive multiple snapshots.
One way to partially solve the multiple-snapshot problem is to change the code that updates the document so that it commits all property changes in a single operation; then you receive only one snapshot.
Nonetheless, you should design the function triggered by the snapshot so that it can handle multiple document changes without breaking. Document updates will happen whether they come from single or multiple property changes, so your code should be able to handle them. IMHO the problem is the filesystem update rather than how many snapshots are received.
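One way to make the handler tolerant of bursts is to coalesce changes per document and process only the latest data after a short quiet period. This is a debounce, a swapped-in technique rather than something Firestore provides. The name createCoalescer and the delay value are hypothetical.

```javascript
// Collects the latest data per document id and hands it to `handler`
// once no new change for that id has arrived for `delayMs` milliseconds.
function createCoalescer(handler, delayMs) {
  const timers = new Map(); // document id -> pending timer

  return function push(id, data) {
    // A newer change replaces the pending one for the same document.
    clearTimeout(timers.get(id));
    const timer = setTimeout(() => {
      timers.delete(id);
      handler(data);
    }, delayMs);
    timers.set(id, timer);
  };
}
```

In the listener from the question, a push function created once with createCoalescer(data => fsbotservice.makeBotFileSystem(data), 500) could be called as push(change.doc.id, botData) in place of the direct makeBotFileSystem call. The trade-off is that the file system lags behind the document by the chosen delay.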
You should use the docChanges() method like this:
db.collection("cities").onSnapshot(querySnapshot => {
  let changes = querySnapshot.docChanges();
  for (let change of changes) {
    const data = change.doc.data();
    console.log(data);
  }
});

Node Js Complex Design Principle (Promise, async/await)

This is a common pattern from my previous work: I usually have a very complex use case, for example:
async function doThis() {
  for (let i = 0; i < 100; i++) { // repeated 100 times
    try {
      insertToDatabase()
      await selectAndManipulateData()
      createEmailWorker()
      /** and many more **/
    } catch {
      logToAFile()
    }
  }
}
The code works, but it's complicated: one function does everything. The only reason I do this is that I can verify in real time whether a function fails, and if it does, make sure the next functions won't run, so there won't be any incorrect data.
What I want to know is: what is the best architecture for structuring a project without sacrificing data integrity? (Or is this already good enough?)
const doThis = async () => {
  try {
    for (let i = 0; i < 100; i++) { // repeated 100 times
      await insertToDatabase();
      await selectAndManipulateData();
      await createEmailWorker();
      /** and many more **/
    }
  } catch {
    await logToAFile();
  }
}
The best way of doing this is to always use await when calling any asynchronous function, and to use ES6+ syntax, as it offers many more features. The enclosing function must be declared async.
Always put your loop in a try/catch so that an error thrown by any of the called functions ends up in the catch block.
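To illustrate what awaiting every step buys you, here is a small sketch in which a rejected step skips the remaining steps and lands in the catch block. The step functions and the runSteps helper are hypothetical placeholders, not part of the code above.

```javascript
// Runs the steps strictly in order; the first failure stops the chain.
// Each step is a hypothetical async function standing in for
// insertToDatabase, selectAndManipulateData, and so on.
async function runSteps(steps, onError) {
  const completed = [];
  try {
    for (const step of steps) {
      await step(); // a rejection here jumps straight to catch
      completed.push(step.name);
    }
  } catch (err) {
    await onError(err);
  }
  return completed;
}
```

If a step were called without await, its rejection would not reach the catch block and the following steps would still start, which is the data-integrity risk the question is worried about.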
Actually, I would separate the persistence, manipulation, and email jobs. Treat storing your data as a single responsibility. In addition, your manipulation and email workers should run as scheduled jobs. Once triggered, each job should check whether there is data related to its responsibility.
Another way is to replace these scheduled jobs with triggered jobs. You can build a chain of responsibility in which each job triggers the next, and each job decides whether to do its work or not.

Resources