How to iterate through every document in every collection in Firestore? - node.js

Let's say my app has a collection for every restaurant that uses it. Each document in each collection is basically a type of food that has an expiration timestamp on it. What I need to do is query every 'food' in every 'restaurant' and delete each 'food' if the current time on my Node.js server is past its expiration timestamp. What is an efficient way to do this?

First you need an HTTP-triggered Cloud Function that you can invoke from a cron service, e.g. https://cron-job.org/
In this Cloud Function, use the Admin JS SDK to loop through the collections and fetch all their documents.
You can get all root collections with getCollections() (renamed to listCollections() in newer SDK versions). See https://cloud.google.com/nodejs/docs/reference/firestore/0.9.x/Firestore#getCollections
import * as functions from 'firebase-functions'
import * as admin from 'firebase-admin'

try {
  admin.initializeApp(functions.config().firebase)
} catch (e) {}

const db = admin.firestore()

// getCollections() is called listCollections() in newer Admin SDK versions
export const getCollections = functions.https.onRequest(async (req, res) => {
  const collections = await db.getCollections()
  res.status(200).send(collections.map(collection => collection.id))
})
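Once the function can reach every 'food' document, the deletion decision itself is just a timestamp comparison. A minimal sketch of that check in plain Node (the { id, expiresAt } shape with epoch milliseconds is an assumption, not something stated in the question):

```javascript
// Return the ids of docs whose expiration timestamp has already passed.
// Each doc is assumed to look like { id, expiresAt } with epoch milliseconds.
function findExpired(docs, nowMs) {
  return docs.filter(doc => doc.expiresAt <= nowMs).map(doc => doc.id);
}

const foods = [
  { id: 'milk', expiresAt: 1000 },
  { id: 'bread', expiresAt: 5000 },
];
console.log(findExpired(foods, 2000)); // [ 'milk' ]
```

In the Cloud Function itself you would not filter in memory at all: if the expiration is stored as a timestamp field, a query such as collection.where('expiresAt', '<=', Date.now()) lets Firestore return only the expired documents, which you can then delete in a batch.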

How to recursively list subcollections in Firestore

We back up all our Firestore collections daily, using an empty array [] in:
const client = new firestore.v1.FirestoreAdminClient();
return client
  .exportDocuments({
    name: databaseName,
    outputUriPrefix: storageName,
    collectionIds: [], // an empty array backs up all collections and subcollections
  })
However, according to the docs, were we ever to want to import data, we would need to import ALL collections and subcollections, which is substantial. In order to import more granularly (import only the necessary collections), we need to provide exportDocuments.collectionIds with an array of all our collections' and subcollections' names, i.e. ['Coll1', 'Coll2', 'SubCollX', etc.]. Since our collections are subject to change, we need a way to get an array of the names of all collections and subcollections programmatically.
This code gets the names of only the root collections, not any subcollections:
const functions = require('firebase-functions');
const admin = require('firebase-admin');
admin.initializeApp();

exports.getCollections = functions.https.onRequest(async (req, res) => {
  var rootCollectionIds = [];
  var subCollectionIds = [];
  const rootCollections = await admin.firestore().listCollections();
  rootCollections.map(async rootColl => {
    rootCollectionIds.push(rootColl.id);
    // without a named doc, this returns empty:
    const subCollections = await rootColl.doc().listCollections();
    subCollectionIds = subCollections.map(subColl => subColl.id);
  });
  res.status(200).send({"rootCollectionIds": rootCollectionIds, "subCollectionIds": subCollectionIds});
});
This must be doable. In the console (https://console.cloud.google.com/firestore/import-export?project={myproject}), Google lists all collections and subcollections when using
Export one or more collection groups
Best I can tell, listCollections() only works at the root or on a specific document. Since we do not know which specific documents contain subcollections, we need a way to find the subcollections globally. Is there some kind of allDescendents param that could be used?
Running node.js in a Cloud Function.
Any help greatly appreciated.
No, no such option exists. The only way you can find out about subcollections is to build a DocumentReference and call listCollections on it to find which subcollections are available under that one document.
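The recursive walk described above can be sketched with stand-in objects that mimic the Admin SDK's listCollections()/listDocuments() surface (the tree data here is hypothetical; against real Firestore you would start from admin.firestore().listCollections()):

```javascript
// Stand-ins mimicking the Admin SDK: a collection exposes listDocuments(),
// a document exposes listCollections(); both return promises, as the SDK does.
function makeDoc(id, subcollections = []) {
  return { id, listCollections: async () => subcollections };
}
function makeColl(id, docs = []) {
  return { id, listDocuments: async () => docs };
}

// Recursively collect every collection id, at any depth.
async function listAllCollectionIds(collections, found = []) {
  for (const coll of collections) {
    found.push(coll.id);
    for (const doc of await coll.listDocuments()) {
      await listAllCollectionIds(await doc.listCollections(), found);
    }
  }
  return found;
}

// Hypothetical structure: Coll1 > docA > SubCollX, plus a root Coll2.
const tree = [
  makeColl('Coll1', [makeDoc('docA', [makeColl('SubCollX')])]),
  makeColl('Coll2'),
];
listAllCollectionIds(tree).then(ids => console.log(ids)); // [ 'Coll1', 'SubCollX', 'Coll2' ]
```

On a real database this walk costs one listCollections() call per document at every level, so it can be slow and read-heavy on large datasets.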

Node / Express generate calendar URL

I have a database with a bunch of dates and an online overview where you can view them. I know I can copy a URL from my Google Calendar and import it into other calendar clients so I can view the events there.
I want to build an Express endpoint that fetches every event each time it is called and returns them in a format other calendar clients can import. With packages like iCal-generator I could generate, read, and return the file whenever a user requests the URL, but it feels redundant to write a file to my storage only to read it, return it, and delete it every time it's requested.
What is the most efficient way to go about this?
Instead of generating the calendar data on every request, you could implement a simple caching mechanism: on startup of your Node app, generate the calendar data and put it in your cache with a corresponding time-to-live (TTL) value. Once the data has expired, or new entries are inserted into your DB, you invalidate the cache, re-generate the data, and cache it again.
Here's a very simple example for an in-memory cache that uses the node-cache library:
const NodeCache = require('node-cache');
const cacheService = new NodeCache();
// ...
const calendarDataCacheKey = 'calender-data';

// at the start of your app, generate the calendar data and cache it with a TTL of 30 min
cacheCalendarData(generateCalendarData());

function cacheCalendarData (calendarData) {
  cacheService.set(calendarDataCacheKey, calendarData, 1800); // TTL in seconds
}

// in your express handler, first try to get the value from the cache;
// if it is not there, generate it and cache it
app.get('/calendar-data', (req, res) => {
  let calendarData = cacheService.get(calendarDataCacheKey);
  if (calendarData === undefined) {
    calendarData = generateCalendarData();
    cacheCalendarData(calendarData);
  }
  res.send(calendarData);
});
If your app is scaled horizontally, you should consider a shared cache such as Redis instead of an in-memory one.
100% untested, but I have code similar to this that exports to a .csv from a db query, and it might get you close:
const { Readable } = require('stream');

async function getCalendar(req, res) {
  const events = await db.getCalendarEvents();
  const filename = 'some_file.ics';
  res.set({
    'Content-Type': 'text/calendar',
    'Content-Disposition': `attachment; filename=${filename}`,
  });
  // read() is a no-op because data is pushed in manually below
  const input = new Readable({ objectMode: true, read() {} });
  input.pipe(res)
    .on('error', (err) => {
      console.error('SOME ERROR', err);
      res.status(500).end();
    });
  // each chunk must be a string/Buffer by the time it reaches res
  events.forEach(e => input.push(e));
  input.push(null);
}
If you were going to use the iCal-generator package, you would do your transforms within the forEach callback, before pushing each event to the stream.
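If you would rather skip the library entirely, an iCalendar feed is just text, so it can be built in memory and sent directly. A rough sketch (the event field names id/start/end/title are assumptions about your DB shape):

```javascript
// Format a Date as UTC in the basic iCalendar form: YYYYMMDDTHHMMSSZ.
function toICalDate(date) {
  return date.toISOString().replace(/[-:]/g, '').replace(/\.\d{3}/, '');
}

// Build a minimal iCalendar document from an array of events.
function buildCalendar(events) {
  const lines = ['BEGIN:VCALENDAR', 'VERSION:2.0', 'PRODID:-//example//EN'];
  for (const event of events) {
    lines.push(
      'BEGIN:VEVENT',
      `UID:${event.id}`,
      `DTSTART:${toICalDate(event.start)}`,
      `DTEND:${toICalDate(event.end)}`,
      `SUMMARY:${event.title}`,
      'END:VEVENT',
    );
  }
  lines.push('END:VCALENDAR');
  return lines.join('\r\n');
}

const ics = buildCalendar([{
  id: 'evt-1',
  start: new Date(Date.UTC(2023, 0, 1, 12)),
  end: new Date(Date.UTC(2023, 0, 1, 13)),
  title: 'Lunch',
}]);
console.log(ics.includes('DTSTART:20230101T120000Z')); // true
```

The Express handler then becomes res.type('text/calendar').send(buildCalendar(events)), with no temporary file involved. A production feed would also want the line folding and text escaping that RFC 5545 requires.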

Does a method similar to getReferenceFromUrl (from the Java client) exist for use with Cloud Functions?

I have a function that deletes a document from Firestore using Pub/Sub; however, before deleting the document (through its document reference), I want to get the Storage reference for a link that is saved in a field of that document.
I'll give an example to make it easier: there is a document Joseph that has the fields
username: "Joseph"; SexUser: "Male";
and urlProfileUser: "any-valid-link-to-download-to-storage-image-uploaded".
Before deleting the document, I want to take this field from the selected document, get the reference through the link (in Java I use storage.getReferenceFromUrl(urlProfileUser)), and use it to delete that photo from Storage; only after that do I delete the document from Firestore.
Code for the Cloud Function that deletes the documents I want:
(I still need to delete the image referenced by the storage link...)
import * as functions from 'firebase-functions';
import * as admin from 'firebase-admin';
admin.initializeApp();

// Scheduled job executed every day at 23:00
exports.removeUsersUnavailable = functions.pubsub.schedule('0 23 * * *').onRun((context) => {
  const db = admin.firestore();
  const dateEvent = Date.now();
  const cutOff = dateEvent - 24 * 60 * 60 * 1000; // delete documents after 24 hours (one day)
  db.collection("userManagers").orderBy('dateCreated').endAt(cutOff)
    .get()
    .then(snapshot => {
      if (snapshot.empty) {
        console.log('Nothing expired yet');
        return;
      }
      snapshot.forEach(doc => {
        //console.log(doc.id, "=>", doc.data);
        console.log('Expired, deleting document');
        // Here I should delete the photo from Storage, since I already have the document data at this point
        // {...} <- delete the image from Storage with a reference from the link (the link is the string doc.data().urlProfile)
        // Delete from Firestore
        doc.ref.delete()
          .then(response => {
            console.log('Document deleted successfully', response);
          })
          .catch(error => {
            console.log('Error occurred while deleting data', error);
          });
      });
    })
    .catch(error => {
      console.log('Error while getting the documents', error);
    });
});
I'm using TypeScript to write my Cloud Functions.
Cloud Storage server SDKs don't offer an equivalent of getReferenceFromUrl. That's a client SDK operation only.
What you should probably do instead is store the full path of the file in the storage bucket along with the URL, and use that path, rather than the URL you generated with the client SDK, to delete the object. So for Android clients you would store the value of StorageReference.getPath(), then feed that path to the Storage SDK's Bucket.file() to build another reference and delete the object.
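If existing documents only store the download URL, the object path can usually be recovered from the URL itself, since download URLs embed the URL-encoded object path after the /o/ segment. A sketch of that fallback (it assumes the standard firebasestorage.googleapis.com/v0/b/&lt;bucket&gt;/o/&lt;path&gt; format and would break if that format ever changes):

```javascript
// Extract the storage object path from a Firebase Storage download URL.
function pathFromDownloadUrl(url) {
  const encodedPath = new URL(url).pathname.split('/o/')[1];
  if (!encodedPath) throw new Error('Not a recognized download URL');
  return decodeURIComponent(encodedPath);
}

const url = 'https://firebasestorage.googleapis.com/v0/b/my-app.appspot.com/o/profiles%2Fjoseph.jpg?alt=media&token=abc';
console.log(pathFromDownloadUrl(url)); // profiles/joseph.jpg
```

With the path in hand, admin.storage().bucket().file(path).delete() removes the object before you delete the Firestore document.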

Auto Delete Mongo Document if Email is Not Confirmed in NodeJS

I have a working user registration system in NodeJS using Express and Mongoose. At the moment, when the user creates an account, a confirmation email is sent to that address; when the link is clicked, it confirms their account and lets them set a password. However, I want to delete the entire Mongo document if the user does not confirm their account within 60 minutes. What is the best method of going about this?
Well, Mongoose is a JavaScript wrapper over MongoDB, so you can just use regular JavaScript to do this.
I would take advantage of the pre hook:
https://mongoosejs.com/docs/middleware.html#pre
I would set a 60-minute timeout that runs a delete query through Mongoose, and store the timer id you get back from setTimeout on the document that gets created.
Then, if the account is validated, I would clear the timer and remove that entry from the new user's document.
In case nothing happens, the document gets programmatically deleted.
schema.pre("save", function(next) {
  const id = setTimeout(async function() {
    const document = await Model.findOne(/* use some unique value with this */);
    await document.remove();
  }, 3600000); // 60 minutes
  this.timerId = id;
  next();
});
Then in case of validation you run:
const document = await Model.findOne(/* find the document */);
clearTimeout(document.timerId);
This should be enough to handle the whole flow.
Please let me know if it's enough! :)
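The schedule-then-cancel flow above can be sketched without Mongoose at all. This shows only the in-process timer part (the helper names are made up), with the caveat that setTimeout timers do not survive a server restart:

```javascript
// Map of userId -> pending deletion timer.
const pendingDeletions = new Map();

// Schedule deleteFn to run after ttlMs unless the user confirms first.
function scheduleCleanup(userId, ttlMs, deleteFn) {
  const timerId = setTimeout(() => {
    pendingDeletions.delete(userId);
    deleteFn(userId);
  }, ttlMs);
  pendingDeletions.set(userId, timerId);
}

// Called on confirmation: cancel the pending deletion if one exists.
function confirmUser(userId) {
  const timerId = pendingDeletions.get(userId);
  if (timerId === undefined) return false;
  clearTimeout(timerId);
  pendingDeletions.delete(userId);
  return true;
}

scheduleCleanup('joseph', 60 * 60 * 1000, id => console.log(`deleting ${id}`));
console.log(confirmUser('joseph')); // true: timer cancelled, nothing deleted
```

For something restart-proof, the same expiry can be enforced in the database itself, e.g. a MongoDB TTL index on a createdAt field.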

Firebase Cloud Function onDelete variable param length

I have the following code
//index.js
export const deletePicture = functions.region("europe-west1").database
  .ref("galleries/{galleryId}/{pictureId}")
  .onDelete(pictures.deletePicture)

//pictures.js
export const deletePicture = (snap, {params: {galleryId}}) => {
  console.log("Image deletion detected in the database, deleting images from the Storage...")
  const {fileName} = snap.val()
  const bucket = storage.bucket()
  const baseURL = `galleries/${galleryId}`
  // Thumbnails (no extra array wrapper, so Promise.all actually awaits each delete)
  const promises = sizes
    .map(size =>
      bucket.file(`${baseURL}/thumb_${size}_${fileName}`).delete())
  // Original
  promises.push(bucket.file(`${baseURL}/${fileName}`).delete())
  return Promise
    .all(promises)
    .then(() => console.log(`All versions of ${fileName} are now deleted.`))
}
In my realtime database and Storage, I have the following structure:
galleries
|_rooms
| |_roomId
| |_picture
|_foods
|_picture
Is there any way that the above mentioned onDelete Cloud Function would trigger for the deletion of either of the pictures? The difference here is that the rooms picture is one level deeper, so I think that pictureId does not match roomId/picture.
Cloud Functions has no knowledge of the meaning of your JSON, so a trigger on galleries/{galleryId}/{pictureId} really just means that the function fires whenever a node at the second level under galleries gets deleted.
In the structure you show, that means this function will trigger whenever /galleries/rooms/roomId or /galleries/foods/picture gets deleted. The first of these fires when you delete the last picture from a room.
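That matching rule can be made concrete with a small path matcher. This is a stand-in illustration, not Cloud Functions' real implementation: each {wildcard} matches exactly one path segment, so the depth must line up exactly:

```javascript
// Check whether a database path matches a trigger pattern such as
// "galleries/{galleryId}/{pictureId}". A {wildcard} matches one segment.
function matchesTrigger(pattern, path) {
  const patternParts = pattern.split('/');
  const pathParts = path.split('/');
  if (patternParts.length !== pathParts.length) return false;
  return patternParts.every(
    (part, i) => part.startsWith('{') || part === pathParts[i]
  );
}

const pattern = 'galleries/{galleryId}/{pictureId}';
console.log(matchesTrigger(pattern, 'galleries/foods/picture'));        // true
console.log(matchesTrigger(pattern, 'galleries/rooms/roomId/picture')); // false: one level too deep
```

To also catch the deeper rooms pictures, a common workaround is to register a second function on galleries/{galleryId}/{roomId}/{pictureId}.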
