Firebase cloud function to count and update collections - node.js

I have three collections in my Firebase project: one contains locations that users have checked in from, and the other two are intended to hold leaderboards with the cities and suburbs with the most check-ins.
However, as a bit of a newbie to NoSQL databases, I'm not quite sure how to do the queries I need to get and set the data I want.
Currently, my checkins collection has this structure:
{
    Suburb: ...,
    City: ...,
    Leaderboard: ...
}
The Leaderboard entry is a boolean marking whether the check-in has already been added to the leaderboard.
What I want to do is query for all results where Leaderboard is false, count the entries per city, count the entries per suburb, add the city and suburb tallies to a separate collection, and then update the Leaderboard boolean to indicate they've been counted.
exports.updateLeaderboard = functions.pubsub.schedule('30 * * * *').onRun(async context => {
    db.collection('Bears')
        .where('Leaderboard', '==', false)
        .get()
        .then(snap => {
            snap.forEach(x => {
                // Count unique cities and return an object, i.e. the equivalent of
                // SELECT cities, COUNT(*) AS `count` FROM Bears GROUP BY cities
            })
        })
        .then(() => {
            console.log({result: 'success'});
        })
        .catch(error => {
            console.error(error);
        });
})
Unfortunately, I've come to about the limit of my knowledge here and would love some help.

Firebase is meant to be a real-time platform, and most of your business logic is going to be expressed in Functions. Because the ability to query is so limited, problems like this are usually solved with triggers and data denormalization.
For instance, if you want a count of all mentions of a city, you have to maintain that count at event time.
// On document create
await admin.firestore()
    .collection("city-count")
    .doc(doc.city)
    .set({
        count: admin.firestore.FieldValue.increment(1),
    }, { merge: true });
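For context, here is a minimal sketch of the full trigger such a snippet could live in; the checkins collection name, the onCheckinCreate export name, and the city field are assumptions, so adjust them to your schema:
const functions = require('firebase-functions');
const admin = require('firebase-admin');
admin.initializeApp();

exports.onCheckinCreate = functions.firestore
    .document('checkins/{checkinId}')
    .onCreate(async (snap, context) => {
        const doc = snap.data();
        // maintain the per-city count at event time
        await admin.firestore()
            .collection('city-count')
            .doc(doc.city)
            .set({
                count: admin.firestore.FieldValue.increment(1),
            }, { merge: true });
    });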
Since it's a serverless platform, it's built to run lots of very small, very fast functions like this. Firebase is very bad at doing large computations -- you can quickly run into MB/minute and document/minute write limits.
Edit: Here is how Firebase solved this exact problem from the perspective of a SQL-trained developer: https://www.youtube.com/watch?v=vKqXSZLLnHA

As clarified in this other post from the Community, Firestore doesn't have a built-in API for counting documents via a query. You will need to read the whole collection, load it into a variable, and work with the data there, counting how many documents have false as the value of their Leaderboard field. While doing this, you can start adding the cities and suburbs to arrays that will later be written to the database, updating the other two collections.
The sample code below (untested) returns the values from the database where Leaderboard is false, increments a counter, and shows where you need to copy the City and Suburb values to the other collections. I basically reordered parts of your code and renamed the variables to generic ones for better understanding, and added a comment where the values should be copied to the other collections.
...
// Create a reference to the check-in collection
let checkinRef = db.collection('cities');
// Create a query against the collection
let queryRef = checkinRef.where('Leaderboard', '==', false);
var count = 0;
queryRef.get()
    .then(snap => {
        snap.forEach(x => {
            // add the cities and suburbs to their collections here and update the counter
            count++;
        })
    })
...
You are very close to the solution; you just need to copy the values from one collection to the others once you have all the documents that have false in Leaderboard. You can find some good examples of copying documents from one collection to another in this other post from the Community: Cloud Functions: How to copy Firestore Collection to a new document?
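If it helps, here is a minimal, untested sketch of that final step: tallying the check-ins per city and marking each one as counted in a single batch. The collection names Checkins and CityLeaderboard are placeholders for illustration, and note that a Firestore batch is capped at 500 writes, so larger result sets would need to be committed in chunks:
const snap = await db.collection('Checkins')
    .where('Leaderboard', '==', false)
    .get();
const cityCounts = {};
const batch = db.batch();
snap.forEach(doc => {
    const city = doc.data().City;
    cityCounts[city] = (cityCounts[city] || 0) + 1;
    // mark this check-in as counted
    batch.update(doc.ref, { Leaderboard: true });
});
Object.entries(cityCounts).forEach(([city, n]) => {
    // add this run's tally onto the existing leaderboard count
    batch.set(db.collection('CityLeaderboard').doc(city),
        { count: admin.firestore.FieldValue.increment(n) },
        { merge: true });
});
await batch.commit();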
Let me know if the information helped you!

Related

How to get multiple collection groups' documents at once from Firestore?

So here I have multiple sub-collections (subjects) in different docs (grades), and I want to get all of the sub-collections' documents (questions) at once. I tried to get them using collection group queries; the only problem I am facing is that sometimes my code returns all the docs (questions) and sometimes it doesn't. What is the issue?
This is what I have tried:
const getAllQuestions = (request, response) => {
    const subjects = ['Maths', 'English']
    const questionsArray = []
    subjects.forEach((subject, index) => {
        db.collectionGroup(subject)
            .get()
            .then((querySnapshot) => {
                querySnapshot.forEach((doc) => {
                    questionsArray.push({ ...doc.data(), id: doc.id })
                })
                if (index == subjects.length - 1) {
                    response.status(200).json({
                        status: 200,
                        data: questionsArray,
                        length: questionsArray.length
                    })
                }
            })
    })
}
If you don't want to get the subcollections from all grades, but only from one of them, you should not use a collection group query but instead specify the entire path to the collection you want to query/read:
db.collection('quizQuesDb/Grade 5/'+subject)
.get()
If you want to perform a query across all collections of a certain name under a specific path, see: CollectionGroupQuery but limit search to subcollections under a particular document
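Incidentally, the intermittent results in the posted code come from sending the response as soon as the last-indexed query resolves, while earlier queries may still be pending. A sketch that waits for all of them with Promise.all (reusing the full-path form above, which is an assumption about your structure) could look like this:
const getAllQuestions = async (request, response) => {
    const subjects = ['Maths', 'English']
    try {
        // run one query per subject and wait for all of them to resolve
        const snapshots = await Promise.all(
            subjects.map(subject =>
                db.collection('quizQuesDb/Grade 5/' + subject).get()
            )
        )
        const questionsArray = []
        snapshots.forEach(snap =>
            snap.forEach(doc => questionsArray.push({ ...doc.data(), id: doc.id }))
        )
        response.status(200).json({
            status: 200,
            data: questionsArray,
            length: questionsArray.length
        })
    } catch (error) {
        response.status(500).json({ status: 500, error: error.message })
    }
}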

TypeORM QueryBuilder update: get the updated result

I'm running a query-builder that updates multiple users based on last logged in date, meaning I don't know which users are getting updated. Here is my code, which does not provide information about which users it updated.
await getConnection()
    .createQueryBuilder()
    .update(User)
    .set({
        isDeactivated: true
    })
    .where('lastConnected < :someTimeAgo', { someTimeAgo })
    .andWhere('isDeactivated = :isDeactivated', { isDeactivated: false })
    .execute()
    .then(result => {
        // Result: UpdateResult { generatedMaps: [], raw: undefined }
    })
How can I access the updated data? Database is SQL Server.
Normally you cannot find out which rows were updated by an UPDATE statement in SQL, hence TypeORM cannot tell you either.
Here are several solutions if you REALLY need to know which rows were updated.
Before going ahead, ask WHY you need to know. Do you REALLY need to know?
If, after careful consideration, you find you do need to know which rows were updated, there are several solutions:
In your code, find the users to be deactivated first, then deactivate them one at a time, logging info on each one as you go.
Create a table in the database containing the user ids to be deactivated. Populate this table first:
INSERT INTO deactivatedusers (userid) SELECT userid FROM users WHERE ...
then run:
UPDATE users SET isDeactivated = 1 WHERE userid IN (SELECT userid FROM deactivatedusers)
then, to find which users were deactivated:
SELECT userid FROM deactivatedusers
and finally clear deactivatedusers ready for next time, with either DELETE FROM deactivatedusers or TRUNCATE TABLE deactivatedusers
Since you are using MS SQL Server, it provides the OUTPUT clause specifically to do what you are asking (non-standard SQL, so it only works with this DBMS). If you decide to use this approach, you could write a stored procedure to do the update and return the updated data back to the caller, then call this stored proc from TypeORM.
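As a sketch of that last option without a stored procedure: recent TypeORM versions expose .returning() on the update query builder, which (assuming your driver/version supports it) is rendered as an OUTPUT clause on SQL Server, with the updated rows landing in UpdateResult.raw:
const result = await getConnection()
    .createQueryBuilder()
    .update(User)
    .set({ isDeactivated: true })
    // rendered as OUTPUT INSERTED.id, INSERTED.lastConnected on SQL Server
    .returning(['id', 'lastConnected'])
    .where('lastConnected < :someTimeAgo', { someTimeAgo })
    .andWhere('isDeactivated = :isDeactivated', { isDeactivated: false })
    .execute();

// result.raw now holds the rows that were actually updated
console.log(result.raw);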

How to start Firestore query from a particular document number without using OFFSET?

I have a Firestore collection named 'users' that contains one document per user, named after each user.
I want to retrieve a list of 25 users at a time, in alphabetical order, and this is what I tried:
const allUsersRef = admin.firestore().collection('users').orderBy('name').offset(0).limit(25)
allUsersRef.get().then((top25Users) => {
    let usersList = '``` << Users LIST >>\n'
    if (!top25Users.empty) {
        top25Users.forEach(eachUser => {
            usersList = usersList + `\n${eachUser.data().name} \n${eachUser.data().id}`
        })
        console.log(usersList)
        return
    } else {
        message.channel.send('Looks like we have no users at the moment!')
        return
    }
}).catch((error) => {
    console.log(error)
    return
})
This way I can get the top 25 users easily! But what if I want the next 25? This is a Discord bot, not an Android application where I could add a [view more] button and then continue the results with query.start() as shown in this Firebase video.
I could use OFFSET, but the number of users is large, so using offset(500) won't be affordable :'(
I also need to fetch users in alphabetical order, and when new users register, the order changes.
TL;DR: If I had a list of my users in alphabetical order, how would I get users from the 126th position to the 150th position on the list (which is essentially page 5 of my 25-per-page query), without using offset, because that just uses more resources!
I had this in the Firebase Realtime Database first, but then I needed some more advanced querying, so I have migrated here :)
Database structure: just a single collection named USERS with documents named by username in it.
PS:
const startAtRes = await db.collection('cities')
    .orderBy('population')
    .startAt(1000000)
    .get();
Using something like this ^ from the Firebase documentation is not possible, because I won't know where to start from, as the list changes whenever new users register!
Firestore does not support efficient offset-based pagination. When you use offset(), you pay for reads of all the documents up to that point. The only available efficient pagination requires that you provide an anchor document, or the properties of the anchor document, to navigate between pages, as described in the documentation.
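For illustration, a minimal sketch of that anchor-document pattern against your users collection; lastVisible is whatever you carried over from the previous page (for a Discord bot, you would stash it per command invocation):
// page 1
const first = await admin.firestore().collection('users')
    .orderBy('name')
    .limit(25)
    .get()
// remember the last document of this page as the anchor
const lastVisible = first.docs[first.docs.length - 1]

// next page: start right after the anchor document
const next = await admin.firestore().collection('users')
    .orderBy('name')
    .startAfter(lastVisible)
    .limit(25)
    .get()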

In a Cloud Function, how can I join from another collection to get data?

I am using a Cloud Function to send notifications to mobile devices. I have two collections in Firestore, clientDetail and clientPersonalDetail. I have the same clientID in both collections, but the date is stored in clientDetail and the name is stored in clientPersonalDetail.
Take a look:
ClientDetail         -- startDate
                     -- clientID
                     .......
ClientPersonalDetail -- name
                     -- clientID
                     .........
Here is my full code:
exports.sendDailyNotifications = functions.https.onRequest((request, response) => {
    var getApplicants = getApplicantList();
    console.log('getApplicants', getApplicants);
    cors(request, response, () => {
        admin
            .firestore()
            .collection("clientDetails")
            //.where("clientID", "==", "wOqkjYYz3t7qQzHJ1kgu")
            .get()
            .then(querySnapshot => {
                const promises = [];
                querySnapshot.forEach(doc => {
                    let clientObject = {};
                    clientObject.clientID = doc.data().clientID;
                    clientObject.monthlyInstallment = doc.data().monthlyInstallment;
                    promises.push(clientObject);
                });
                return Promise.all(promises);
            }) // below: code for the notification
            .then(results => {
                response.send(results);
                results.forEach(user => {
                    //sendNotification(user);
                });
                return "";
            })
            .catch(error => {
                console.log(error);
                response.status(500).send(error);
            });
    });
});
The function above produces an object like this:
{clientID: xxxxxxxxx, startDate: 23/1/2019}
But I need the name, not just the clientID, to show in the notification, so I'll have to join to the clientPersonalDetail collection in order to get the name using the clientID.
What should I do?
How can I create another function that solely returns the name when passed the clientID as an argument, and that waits until the name is returned?
Can anybody please help?
But I need the name, not just the clientID, to show in the notification, so I'll have to join to the clientPersonal collection in order to get the name using the clientID. What should I do?
Unfortunately, there is no JOIN clause in Firestore. Queries in Firestore are shallow: they only get items from the collection that the query is run against. There is no way to get documents from two top-level collections in a single query. Firestore doesn't support queries across different collections in one go; a single query may only use properties of documents in a single collection.
How can I create another function that solely returns the name when passed the clientID as an argument, and that waits until the name is returned?
So the simplest solution I can think of is to first query the database to get the clientID. Once you have this id, make another database call (inside the callback) so you can get the corresponding name.
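A minimal sketch of such a helper, assuming clientID is stored as a field on the documents in clientPersonalDetail (adjust the collection and field names to your schema):
async function getClientName(clientID) {
    const snap = await admin.firestore()
        .collection('clientPersonalDetail')
        .where('clientID', '==', clientID)
        .limit(1)
        .get();
    // resolves with the name, or null if no matching document exists
    return snap.empty ? null : snap.docs[0].data().name;
}
You could then call it inside the existing .then(results => ...) block, e.g. const name = await getClientName(user.clientID), after making that callback async.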
Another solution would be to add the name of the user as a new property under ClientDetail, so you query the database only once. This practice is called denormalization and is common when it comes to Firebase. If you are new to NoSQL databases, I recommend the video Denormalization is normal with the Firebase Database for a better understanding. It is about the Firebase Realtime Database, but the same rules apply to Cloud Firestore.
Also, when you duplicate data, there is one thing to keep in mind: in the same way you add the data, you need to maintain it. In other words, if you want to update/delete an item, you need to do it in every place it exists.
The "easier" solution would probably be the duplication of data. This is quite common in NoSQL world.
More precisely you would add in your documents in the ClientDetail collection the value of the client name.
You can use two extra functions in this occasion to have your code clear. One function that will read all the documents form the collection ClientDetail and instead of getting all the fields, will get only the ClientID. Then call the other function, that will be scanning all the documents in collection ClientPersonalDetail and retrieve only the part with the ClientID. Compare if those two match and then do any operations there if they do so.
You can refer to Get started with Cloud Firestore documentation on how to create, add and load documents from Firestore.
Your package.json should look something like this:
{
    "name": "sample-http",
    "version": "0.0.1",
    "dependencies": {
        "firebase-admin": "^6.5.1"
    }
}
I did a bit of coding myself, and here is my example code on GitHub. Deploying this Function will scan all the documents from one collection and compare the clientID against the documents in the other collection. When it finds a match it will log a message; otherwise it will log that the IDs don't match. You can use the idea of how this function operates in your own code.

Massive inserts with pg-promise

I'm using pg-promise and I want to perform multiple inserts into one table. I've seen some solutions like Multi-row insert with pg-promise and How do I properly insert multiple rows into PG with node-postgres?, and I could use pgp.helpers.concat in order to concatenate multiple queries.
But now I need to insert a lot of measurements into a table, more than 10,000 records, and https://github.com/vitaly-t/pg-promise/wiki/Performance-Boost says:
"How many records you can concatenate like this - depends on the size of the records, but I would never go over 10,000 records with this approach. So if you have to insert many more records, you would want to split them into such concatenated batches and then execute them one by one."
I read the whole article, but I can't figure out how to "split" my inserts into batches and then execute them one by one.
Thanks!
UPDATE
Best is to read the following article: Data Imports.
As the author of pg-promise, I felt compelled to finally provide the right answer to the question, as the one published earlier didn't really do it justice.
In order to insert a massive/infinite number of records, your approach should be based on the method sequence, which is available within tasks and transactions.
var cs = new pgp.helpers.ColumnSet(['col_a', 'col_b'], {table: 'tableName'});

// returns a promise with the next array of data objects,
// while there is data, or an empty array when no more data left
function getData(index) {
    if (/*still have data for the index*/) {
        // - resolve with the next array of data
    } else {
        // - resolve with an empty array, if no more data left
        // - reject, if something went wrong
    }
}

function source(index) {
    var t = this;
    return getData(index)
        .then(data => {
            if (data.length) {
                // while there is still data, insert the next bunch:
                var insert = pgp.helpers.insert(data, cs);
                return t.none(insert);
            }
            // returning nothing/undefined ends the sequence
        });
}

db.tx(t => t.sequence(source))
    .then(data => {
        // success
    })
    .catch(error => {
        // error
    });
This is the best approach to inserting a massive number of rows into the database, from both a performance point of view and for load throttling.
All you have to do is implement your function getData according to the logic of your app, i.e. where your large data is coming from, based on the index of the sequence, to return some 1,000 - 10,000 objects at a time, depending on the size of the objects and data availability.
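As an illustration only, here is a hypothetical getData that pages through an in-memory array named records in batches of 5,000; in a real import the data would come from a file, a stream, or another database:
const BATCH_SIZE = 5000;

function getData(index) {
    // slice out the next batch; an empty array ends the sequence
    const batch = records.slice(index * BATCH_SIZE, (index + 1) * BATCH_SIZE);
    return Promise.resolve(batch);
}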
See also some API examples:
spex -> sequence
Linked and Detached Sequencing
Streaming and Paging
Related question: node-postgres with massive amount of queries.
And in case you need to acquire the generated ids of all the inserted records, you would change the two lines as follows:
// return t.none(insert);
return t.map(insert + ' RETURNING id', [], a => +a.id);
and
// db.tx(t => t.sequence(source))
db.tx(t => t.sequence(source, {track: true}))
Just be careful, as keeping too many record ids in memory can create an overload.
I think the naive approach would work.
Try to split your data into multiple pieces of 10,000 records or fewer. I would try splitting the array using the solution from this post.
Then, multi-row insert each array with pg-promise and execute them one by one in a transaction.
Edit: Thanks to #vitaly-t for the wonderful library and for improving my answer.
Also, don't forget to wrap your queries in a transaction, or else it will deplete the connections.
To do this, use the batch function from pg-promise to resolve all queries asynchronously:
// split your array here to get splittedData
var cs = new pgp.helpers.ColumnSet(['col_a', 'col_b'], {table: 'tmp'})
// values = [..,[{col_a: 'a1', col_b: 'b1'}, {col_a: 'a2', col_b: 'b2'}]]
let queries = []
for (var i = 0; i < splittedData.length; i++) {
    var query = pgp.helpers.insert(splittedData[i], cs)
    queries.push(query)
}
db.tx(function (t) {
    // batch resolves all the insert queries within the transaction
    return t.batch(queries)
})
.then(function (data) {
    // all records inserted successfully!
})
.catch(function (error) {
    // error;
});
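For completeness, here is a minimal sketch of the splitting step itself (plain JavaScript, nothing library-specific), producing the splittedData used above:
function chunk(arr, size) {
    const out = [];
    for (let i = 0; i < arr.length; i += size) {
        out.push(arr.slice(i, i + size));
    }
    return out;
}

// e.g. batches of at most 10,000 records, per the pg-promise wiki advice
const splittedData = chunk(values, 10000);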
