node.js mongodb cursor looping on client request

I know how to query a collection. But I have a collection with 100,000 records and I want to show only 100 items per page. The user can then select the next 100 records, and so on...
Since this request comes from the user, how do I keep the cursor open in node.js so I can loop over the next 100 items when the client requests them?
What is the standard practice?
Thanks!

The standard practice for what you are describing is pagination.
You don't need to keep the cursor open at all. All you need to ensure is that each query continues from the place the previous one left off.
The client would retain the number of records that have already been displayed and pass that number to the cursor's skip() method.
For example:
1. The client is given 10 records; record_count = 10.
2. The client requests more records, sending record_count with the request.
3. The server uses the supplied record_count as the skip value in the next query.
4. The server returns another 10 records to the client.
5. The client updates record_count to 20.
6. Rinse, repeat...
Keep in mind that you'll want the results sorted on a stable key so that each query returns the next 10 records rather than an arbitrary set.
I'm not too familiar with the node drivers for mongo, but in the mongo shell you would execute the query as follows:
db.collection.find().sort( { "time": 1 } ).skip( record_count ).limit( 10 )
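Translated to the official MongoDB Node.js driver, a minimal sketch might look like the following (the connection string, database, and collection names are placeholders; record_count comes from the client's request):
const { MongoClient } = require('mongodb');

async function getPage(recordCount) {
  // Placeholder connection string, database, and collection.
  const client = await MongoClient.connect('mongodb://localhost:27017');
  const page = await client.db('mydb').collection('items')
    .find({})
    .sort({ time: 1 })   // stable sort so pages don't overlap
    .skip(recordCount)   // records the client has already seen
    .limit(10)           // page size
    .toArray();
  await client.close();
  return page;
}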

Related

Use an array of values to query Firestore and setup a snapshot listener

Here is my problem:
I have a Firestore collection that holds a number of documents. About 500 documents are generated/updated every hour and saved to the collection.
I would like to query the collection and set up a real-time snapshot listener for a subset of document IDs that are provided by the client.
I think maybe I could do something like this (the syntax is likely not correct... I'm just trying to get a feel for whether it's even possible... but isn't "in" limited to an array of 10 items?):
const subbedDocs = ["doc1", "doc2", "doc3", "doc4", "doc5"];
docsRef.where('docID', 'in', subbedDocs).onSnapshot((doc) => {
  handleSnapshot(doc);
});
I'm sorry, that code probably doesn't make sense... I'm still trying to learn all the ins and outs of Firestore.
Essentially, what I am trying to do is take an array of IDs and set up an onSnapshot listener for those IDs. This list of IDs could be upwards of 40-50 items. Is this even possible? I am trying to avoid setting up a listener on the whole collection and filtering out things I am not "subscribed" to, as that seems wasteful from a resources perspective.
If you have the doc IDs in your array (it looks like you do), you can loop over them and start a listener for each one:
const subbedDocs = ["doc1", "doc2", "doc3", "doc4", "doc5"];
for (let i = 0; i < subbedDocs.length; i++) {
  const docID = subbedDocs[i];
  docsRef.doc(docID).onSnapshot((doc) => {
    handleSnapshot(doc);
  });
}
It would be better to listen to a single query that covers all the filtered docs at once. But if you want to listen to each of them with an explicit listener, this will do the trick; see the sketch below for detaching those listeners again.
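Since onSnapshot returns an unsubscribe function, a minimal sketch for keeping those handles so the listeners can be detached later (the variable names are just illustrative):
const unsubscribers = subbedDocs.map((docID) =>
  docsRef.doc(docID).onSnapshot((doc) => handleSnapshot(doc))
);

// Later, when the client no longer needs updates:
unsubscribers.forEach((unsubscribe) => unsubscribe());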
As you've discovered, Firestore's in operator only allows up to 10 entries in the array. I'm also guessing you've added docID as a field in the document, since I don't believe 'docID' references the actual document ID.
I would not take this approach, because of the 10-entry limitation. Instead, as the client selects documents to follow, write a unique ID for that client into a field (the same field in each document), so your query avoids the limitation entirely. If you store the client IDs in an array field (called something like "ListenerArray"), you can allow an unlimited number of client listeners (up to the implementation limits of Firestore). Your query would look more like:
docsRef.where('ListenerArray', 'array-contains', clientID).onSnapshot((doc) => {
  handleSnapshot(doc);
});
array-contains checks a single value against every entry in a document's array field, without limit, so each client can mark any number of documents to subscribe to. A sketch of the subscribe step follows.
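As an illustration of that subscribe step, here is a minimal sketch assuming the namespaced (v8-style) Firebase SDK; subscribeClient is an illustrative helper name and ListenerArray is the field proposed above:
const firebase = require('firebase');

// Mark a document as followed by this client.
function subscribeClient(docID, clientID) {
  return docsRef.doc(docID).update({
    ListenerArray: firebase.firestore.FieldValue.arrayUnion(clientID)
  });
}

// A single listener then covers everything the client follows.
docsRef.where('ListenerArray', 'array-contains', clientID)
  .onSnapshot((querySnapshot) => {
    querySnapshot.forEach((doc) => handleSnapshot(doc));
  });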

How to update 10 users as winners in backend?

I have a use case where I have to pick 10 winners out of 100 participants and update them in JanusGraph. I generate the winners using Math.ceil(Math.random()) and keep their IDs in an array (say winners[10]). This winners[10] array is sent in the request body, and the game ID as a query parameter, from the front end to a POST endpoint. I just need to add 500 points to each winner and retrieve their data.
So what I have tried is
g.V().hasLabel('Game').has('active', true).
as('game').
outE('participated').inV().hasLabel('User').
has('userdId', id).as('winner').
addE('won').property('points', 500).
to('game').
select('winner').
valueMap()
The above query executes for only one user. I want to make it work for all the users. I have researched the repeat(), loop(), and iterate() steps but got stuck with no option. The result should be an array with the 10 winners' data.
Thanks in advance!
You can filter the vertices by multiple IDs using within:
g.V().hasLabel('Game').has('active', true).
as('game').
outE('participated').inV().hasLabel('User').
has('userdId', within(1, 2, 3)).as('winner').
addE('won').property('points', 500).
to('game').
select('winner').
valueMap()
example: https://gremlify.com/9j071eajda4
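Assuming the POST endpoint is implemented in Node.js, a rough sketch with the gremlin npm package might look like the following (the connection URL is a placeholder; the winners array from the request body is spread into within()):
const gremlin = require('gremlin');
const { P } = gremlin.process;
const traversal = gremlin.process.AnonymousTraversalSource.traversal;
const DriverRemoteConnection = gremlin.driver.DriverRemoteConnection;

// Placeholder Gremlin Server / JanusGraph endpoint.
const g = traversal().withRemote(
  new DriverRemoteConnection('ws://localhost:8182/gremlin'));

async function awardWinners(winners) {
  return g.V().hasLabel('Game').has('active', true).as('game')
    .outE('participated').inV().hasLabel('User')
    .has('userdId', P.within(...winners)).as('winner')  // match all 10 users at once
    .addE('won').property('points', 500)
    .to('game')
    .select('winner')
    .valueMap()
    .toList();
}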

How to get Salesforce REST API to paginate?

I'm using the simple_salesforce python wrapper for the Salesforce REST API. We have hundreds of thousands of records, and I'd like to split up the pull of the salesforce data so all records are not pulled at the same time.
I've tried passing a query like:
results = salesforce_connection.query_all("SELECT my_field FROM my_model limit 2000 offset 50000")
to see records 50K through 52K but receive an error that offset can only be used for the first 2000 records. How can I use pagination so I don't need to pull all records at once?
You're looking to use salesforce_connection.query(query=SOQL) and then .query_more(nextRecordsUrl, True).
Since .query() only returns 2000 records, you need to use .query_more to get the next page of results.
From the simple-salesforce docs
SOQL queries are done via:
sf.query("SELECT Id, Email FROM Contact WHERE LastName = 'Jones'")
If, due to an especially large result, Salesforce adds a nextRecordsUrl to your query result, such as "nextRecordsUrl" : "/services/data/v26.0/query/01gD0000002HU6KIAW-2000", you can pull the additional results with either the ID or the full URL (if using the full URL, you must pass ‘True’ as your second argument)
sf.query_more("01gD0000002HU6KIAW-2000")
sf.query_more("/services/data/v26.0/query/01gD0000002HU6KIAW-2000", True)
Here is an example of using this:
data = []  # list to hold all the records
SOQL = "SELECT my_field FROM my_model"
results = sf.query(query=SOQL)  # initial api call

## loop through the results and add the records
for rec in results['records']:
    rec.pop('attributes', None)  # remove extra metadata
    data.append(rec)  # add the record to the list

## check the 'done' attribute in the response to see if there are more records;
## while 'done' is False (more records to fetch), get the next page of records
while not results['done']:
    ## the 'nextRecordsUrl' attribute holds the url to the next page of records
    results = sf.query_more(results['nextRecordsUrl'], True)
    ## repeat the loop of adding the records
    for rec in results['records']:
        rec.pop('attributes', None)
        data.append(rec)
Looping through the records and using the data:
## loop through the records and get their field values
for rec in data:
    # the key will always be the salesforce api name for that field
    print(rec['my_field'])
As the other answer says, though, this can start to use up a lot of resources, but it's what you're looking for if you want to achieve pagination.
Maybe create a more focused SOQL statement to get only the records needed for your use case at that specific moment.
LIMIT and OFFSET aren't really meant to be used like that: what if somebody inserts or deletes a record at an earlier position (not to mention you don't have an ORDER BY in there)? SF will open a proper cursor for you; use it.
The https://pypi.org/project/simple-salesforce/ docs for "Queries" say that you can either call query and then query_more, or you can go with query_all. query_all will loop and keep calling query_more until it exhausts the cursor - but this can easily eat your RAM.
Alternatively, look into the bulk query stuff; there's some magic in the API, but I don't know if it fits your use case. It'd be asynchronous calls and might not be implemented in the library. It's called PK Chunking; I wouldn't bother unless you have millions of records.

Insert records into database and at the same time notify users about number of record inserted or failed in node.js

I have a requirement like this:
Maximum 500 records.
I have to insert records into a table. However, before inserting them I have to check whether that same record or its parents are already inserted.
What I want to achieve: how can I notify the user in node.js, once the records are inserted, of how many succeeded?
Example: if I upload 400 records and 5 records are inserted, the user should be notified that 5 records were inserted; if any failed, the failed record count should be reported as well.
Any help would be really appreciated.
Igor already told you how to write a question, so you should read through what he wrote.
Now, answering your question: you basically need to control the insertion and keep two counters, for example let inserted and let notInserted (or var, if you prefer).
For each insertion, check whether the record already exists; if it does, increment notInserted, otherwise increment inserted.
At the end, return the result to the user: res.json({ message: "Inserted: " + inserted + ", not inserted: " + notInserted });
Something like this, as sketched below!
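A minimal sketch of that idea, assuming an Express route and two hypothetical data-access helpers, recordExists and insertRecord, standing in for whatever your database layer provides:
app.post('/records', async (req, res) => {
  const records = req.body.records; // up to 500 records from the client
  let inserted = 0;
  let notInserted = 0;

  for (const record of records) {
    // recordExists and insertRecord are hypothetical db-layer helpers.
    if (await recordExists(record)) {
      notInserted++; // record (or its parent) already present, skip it
    } else {
      await insertRecord(record);
      inserted++;
    }
  }

  res.json({ message: `Inserted: ${inserted}, not inserted: ${notInserted}` });
});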

Mongoose/MongoDB batchSize(30) still returning all results

The following query returns all my users. I was hoping it would be batched.
statics.findAllUsers = function findAllUsers(callback) {
  this.find({}, callback).batchSize(30);
};
batchSize() instructs the driver to retrieve a certain number of items per round trip. It'll still get everything from the DB, just one batch at a time.
To make it clearer: if you use batchSize(30), it'll ask for 30 items; then, when you need the 31st, it'll query the next 30, and so forth.
If you need only that number of items, use limit() (and skip() to set which item is the first).
Docs: http://docs.mongodb.org/manual/reference/method/cursor.batchSize/
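For example, a paged variant of the query above might look like this sketch, with an assumed page size of 30 (findUsersPage is an illustrative name):
statics.findUsersPage = function findUsersPage(pageNumber, callback) {
  this.find({})
    .sort({ _id: 1 })       // stable order so pages don't overlap
    .skip(pageNumber * 30)  // jump past the pages already returned
    .limit(30)              // return only this page's 30 users
    .exec(callback);
};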
