This is what I want to achieve:
I have a collection with a large number of documents. A user will query a certain field and this will return a large number of documents.
But for bandwidth/processing reasons, I don't want to get all documents and send them to user (browser) at once.
Let's say the user makes a GET request with a search field. This would give 1000 results, but I only want to send the first 10 to the user. Then the user would request the next 10, and so on.
Is there a way to achieve this in MongoDB? Can I query with a certain counter and increment it in successive queries?
What is a good way to achieve this mechanism?
Thank you.
MongoDB natively supports paging through the skip() and limit() cursor methods.
// first 10 results -> page 1
collection.find().limit(10)

// next 10 results -> page 2
collection.find().skip(10).limit(10)

// next 10 results -> page 3
collection.find().skip(20).limit(10)
More generally, if number is the 1-based page number, the number of documents to skip is (number - 1) * 10, so number = 1 returns the first 10 records:
db.students.find().skip(number > 0 ? ((number - 1) * 10) : 0).limit(10);

// first 10 records, number = 1
db.students.find().skip(1 > 0 ? ((1 - 1) * 10) : 0).limit(10);

// next 10 records, number = 2
db.students.find().skip(2 > 0 ? ((2 - 1) * 10) : 0).limit(10);
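To connect this to the GET-request flow in the question, here is a minimal sketch of a paginated search endpoint, assuming Express and the official Node.js MongoDB driver; the connection URL, database, collection, and query parameter names are placeholders, not from the question:

// A minimal sketch: paginated search endpoint (Express + Node.js MongoDB driver).
// The connection URL, database, collection, and parameter names are assumptions.
const express = require('express');
const { MongoClient } = require('mongodb');

const app = express();
const client = new MongoClient('mongodb://localhost:27017');

app.get('/search', async (req, res) => {
  const page = Math.max(parseInt(req.query.page, 10) || 1, 1); // 1-based page number
  const pageSize = 10;

  const results = await client.db('mydb').collection('documents')
    .find({ someField: req.query.q })  // the field the user searches on
    .skip((page - 1) * pageSize)       // skip the previous pages
    .limit(pageSize)                   // send back only 10 documents
    .toArray();

  res.json(results);
});

client.connect().then(() => app.listen(3000));

Note that skip() still walks past the skipped documents on the server, so for very deep pages a range query on an indexed field scales better.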
I have a mongo collection which stores orders for a shopping site. The order number starts from 0 and goes up 1 by 1. I want to fetch documents, 10 at a time, starting from the max order number and going all the way down to 0.
To do this, I had the idea of getting the max order number and then simply querying for the order numbers from max-10 to max.
Is there a better way to do this? And also, how do I get the maximum order number?
You can use an aggregation query to achieve this:
var limit = 10;
var offset = page_no > 1 ? (page_no - 1) * limit : 0;

Orders.aggregate([
  { $sort: { order_no: -1 } },
  { $skip: offset },
  { $limit: limit }
]);
Here order_no is the Number field in your schema, and offset is computed from page_no, which you increment from 1 to N as the user pages through. Because the sort is descending, page 1 starts at the maximum order number. If you also need the maximum itself, see the sketch below.
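As for getting the maximum order number on its own, sorting descending and taking a single document is enough. A minimal Mongoose sketch (handleError is a hypothetical placeholder):

// Sketch: fetch the maximum order_no by sorting descending and taking one document.
Orders.findOne()
  .sort({ order_no: -1 })  // highest order_no first
  .select('order_no')      // only return the field we need
  .exec(function (err, doc) {
    if (err) return handleError(err);           // hypothetical error handler
    var maxOrderNo = doc ? doc.order_no : null; // null if the collection is empty
    console.log('max order number:', maxOrderNo);
  });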
I am currently working with the MEAN stack. I have around 99,000 records in MongoDB. Each record contains an image array holding image URLs, with a maximum length of 10, so every record has at most 10 image URLs.
Now I want to fetch every record, compare the images of each record with each other using resemble.js, and then save their average value in that same record.
I used the async module and tried to implement this, but it is taking too much time even with 5 records. I also used async's forEachLimit, but it didn't help.
So basically, how can I work with this kind of large dataset with Node and Mongo?
Is there any way to do it in batches? Any other solution?
loop1 ==> all records (response) {
  loop2 ==> convert all images of one record to base64 (resemble can't use images from urls)
            ==> saved in new array = TempArray1 <== loop ends
  loop3 ==> TempArray1.length (TempArray1[i]) {
    loop4 ==> TempArray1.length (TempArray1[j]) {
      count += resemble(TempArray1[i], TempArray1[j]);
    }
    avg[i] = count / (TempArray1.length - 1);
  }
}
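One way to avoid holding all 99,000 records in memory at once is to stream them with a Mongoose cursor and process a bounded number at a time. A minimal sketch, assuming a Mongoose model named Record and a hypothetical processRecord() that does the base64 conversion and resemble comparisons for one document:

// Sketch: stream documents with a cursor instead of loading them all at once.
// `Record`, `processRecord`, and the `average` field are assumed names.
Record.find()
  .cursor()
  .eachAsync(async function (doc) {
    const avg = await processRecord(doc); // base64 + resemble comparisons for one record
    await Record.updateOne({ _id: doc._id }, { $set: { average: avg } });
  }, { parallel: 5 }) // process up to 5 documents concurrently
  .then(function () {
    console.log('done');
  });

The cursor fetches documents from MongoDB in batches under the hood, so memory use stays flat regardless of collection size.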
I am trying to limit the number of records returned in a query:
Property.find(searchParams).nin('_id', prop_ids).limit(5).exec(function(err, properties) {
When the first call comes in, I get 5 records back. Then I make a second call and pass in an array of ids (prop_ids). This array holds the ids of all the records returned by the first call. In this case I get no records back. I have a total of 7 records in my database, so the second call should return 2 records. How should I go about doing this?
I think Mongoose might apply the limit before the nin query is applied, so you will always just get those five. If it's a type of pagination you want, where you get 5 objects and then the next 5, you can use the skip option instead:
var SKIP = ... // 0, 5, 10...

Property.find(searchParams, null, {
  skip: SKIP,
  limit: 5,
}, function (err, properties) {
});
This is what I took from your question; maybe you had something else in mind with the nin call?
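The same query can also be written with Mongoose's chained query builder, which reads closer to the original snippet (a sketch; SKIP grows by 5 on each request):

// Sketch: the same skip/limit pagination via the chained query builder.
Property.find(searchParams)
  .skip(SKIP)  // 0 on the first request, 5 on the second, and so on
  .limit(5)
  .exec(function (err, properties) {
    // ...
  });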
I am trying to query the WADPerformanceCountersTable generated by Azure Diagnostics, which has a PartitionKey based on tick counts accurate to the minute. This PartitionKey is stored as a string (which I do not have any control over).
I want to be able to query against this table to get data points for every minute, every hour, every day, etc., so I don't have to pull all of the data (I just want a sampling to approximate it). I was hoping to use the modulus operator for this, but since the PartitionKey is stored as a string and this is an Azure Table, I am having issues.
Is there any way to do this?
Non-working example:
var query =
    (from entity in ServiceContext.CreateQuery<PerformanceCountersEntity>("WADPerformanceCountersTable")
     where
         long.Parse(entity.PartitionKey) % interval == 0 && // bad for a variety of reasons
         String.Compare(entity.PartitionKey, partitionKeyEnd, StringComparison.Ordinal) < 0 &&
         String.Compare(entity.PartitionKey, partitionKeyStart, StringComparison.Ordinal) > 0
     select entity)
    .AsTableServiceQuery();
If you just want to get a single row based on two different points in time (now and N minutes back), you can use the following query, which returns a single row:
// Partition keys spanning the last 10 minutes
DateTime now = DateTime.UtcNow;

// Current partition key (WAD partition keys are tick counts prefixed with "0")
string partitionKeyNow = string.Format("0{0}", now.Ticks.ToString());

// Partition key for 10 minutes back
DateTime tenMinutesSpan = now.AddMinutes(-10);
string partitionKeyTenMinutesBack = string.Format("0{0}", tenMinutesSpan.Ticks.ToString());

// Get a single row sampled from the last 10 minutes
CloudTableQuery<PerformanceCountersEntity> cloudTableQuery =
    (
        from entity in ServiceContext.CreateQuery<PerformanceCountersEntity>("WADPerformanceCountersTable")
        where
            entity.PartitionKey.CompareTo(partitionKeyNow) < 0 &&
            entity.PartitionKey.CompareTo(partitionKeyTenMinutesBack) > 0
        select entity
    ).Take(1).AsTableServiceQuery();
The only way I can see to do this would be to create a process to keep the Azure table in sync with another version of itself. In this table, I would store the PartitionKey as a number instead of a string. Once done, I could use a method similar to what I wrote in my question to query the data.
However, this is a waste of resources, so I don't recommend it. (I'm not implementing it myself, either.)
I need to read data from all of the rows of a large table, but I don't want to pull all of the data into memory at one time. Is there a SQLAlchemy function that will handle paging? That is, pull several rows into memory and then fetch more when necessary.
I understand you can do this with limit and offset as this article suggests, but I'd rather not handle that if I don't have to.
If you are using Flask-SQLAlchemy, see the paginate method of Query. paginate offers several methods to simplify pagination:
record_query = Record.query.paginate(page, per_page, False)
total = record_query.total
record_items = record_query.items
The first page should be 1, otherwise .total raises a division-by-zero exception.
If you aren't using Flask, you can use SQLAlchemy's slice function, or a combination of limit and offset, e.g.:
some_query = Query([TableBlaa])

query = some_query.limit(number_of_rows_per_page).offset(page_number * number_of_rows_per_page)
# -- OR --
query = some_query.slice(page_number * number_of_rows_per_page,
                         (page_number * number_of_rows_per_page) + number_of_rows_per_page)

current_pages_rows = session.execute(query).fetchall()
If you are building an API to use with React, Vue, or another front-end framework, you can proceed like this:
Notice:
page: the current page you need
error_out: when False, does not raise errors
max_per_page or per_page: the limit
Documentation: SQLAlchemy pagination
record_query = Record.query.paginate(page=*Number*, error_out=False, max_per_page=15)

result = dict(datas=record_query.items,
              total=record_query.total,
              current_page=record_query.page,
              per_page=record_query.per_page)
On record_query you can use:
next(error_out=False): returns a Pagination object for the next page.
next_num: number of the next page.
page: the current page number (1-indexed).
pages: the total number of pages.
per_page: the number of items to be displayed on a page.
prev(error_out=False): returns a Pagination object for the previous page.
prev_num: number of the previous page.
query: the unlimited query object that was used to create this pagination object.
total: the total number of items matching the query.
Hope it helps!