SQLAlchemy and going through a large result set [duplicate] - pagination

This question already has answers here:
memory-efficient built-in SqlAlchemy iterator/generator?
(7 answers)
Closed 3 years ago.
I need to read data from all of the rows of a large table, but I don't want to pull all of the data into memory at one time. Is there a SQLAlchemy function that will handle paging? That is, pull several rows into memory and then fetch more when necessary.
I understand you can do this with limit and offset as this article suggests, but I'd rather not handle that if I don't have to.

If you are using Flask-SqlAlchemy, see the paginate method of query. paginate offers several method to simplify pagination.
record_query = Record.query.paginate(page, per_page, False)
total = record_query.total
record_items = record_query.items
First page should be 1 otherwise the .total returns exception divided by zero

If you aren't using Flask, you can use SqlAlchemy function 'slice' or a combo of 'limit' & 'offset', as mentioned here. E.g.:
some_query = Query([TableBlaa])
query = some_query.limit(number_of_rows_per_page).offset(page_number*number_of_rows_per_page)
# -- OR --
query = some_query.slice(page_number*number_of_rows_per_page, (page_number*number_of_rows_per_page)+number_of_rows_per_page)
current_pages_rows = session.execute(query).fetchall()

If you are building an api to use with ReactJs, vueJs or other frontEnd framework, you can process like:
Notice:
page: current page that you need
error_out: Not display errors
max_per_page or per_page : the limit
Documentaion: SqlAchemy pagination
record_query = Record.query.paginate(page=*Number*, error_out=False, max_per_page=15)
result = dict(datas=record_query.items,
total=record_query.total,
current_page=record_query.page,
per_page=record_query.per_page)
On record_query you can use :
next(error_out=False)
Returns a Pagination object for the next page.
next_num
Number of the next page
page = None
the current page number (1 indexed)
pages
The total number of pages
per_page = None
the number of items to be displayed on a page.
prev(error_out=False)
Returns a Pagination object for the previous page.
prev_num
Number of the previous page.
query = None
the unlimited query object that was used to create this pagination
object.
total = None
the total number of items matching the query
Hope it help you!

Related

TYPO3: Performance issue with pagination

I am currently building some kind of video channel based on an extension I created. It consists of videos and playlist that contains videos (obviously).
I have to create a page which contains a list of videos AND playlist by category. You also can sort those items by date. Finally, the page is paginated with an infinite scrolling that should load items 21 by 21.
To do so, I created on both Video and Playlist repositories a "findByCategory" function which is really simple :
$query = $this->createQuery();
return $query->matching($query->equals('categorie.uid',$categoryUid))->execute()->toArray();
Once I requested the items I need, I merge them in one array and do my sorting stuff. Here is my controller show action :
if ($this->request->hasArgument('sort'))
$sort = $this->request->getArgument('sort');
else
$sort = 'antechrono';
//Get videos in repositories
$videos = $this->videoRepository->findByCategorie($categorie->getUid());
$playlists = $this->playlistRepository->findByCategorie($categorie->getUid());
//Merging arrays then sort it
if ($videos && $playlists)
$result = array_merge($videos, $playlists);
else if ($videos)
$result = $videos;
else if ($playlists)
$result = $playlists;
if ($sort == "chrono")
usort($result, array($this, "sortChrono"));
else if ($sort == "antechrono" || $sort == null)
{
usort($result, array($this, "sortAnteChrono"));
$sort="antechrono";
}
$this->view->assignMultiple(array('categorie' => $categorie, 'list' => $result, 'sort' => $sort));
Here is my view :
<f:widget.paginate objects="{list}" as="paginatedList" configuration="{addQueryString: 'true', addQueryStringMethod: 'GET,POST', itemsPerPage: 21}">
<div class="videos row">
<f:for each="{paginatedList}" as="element">
<f:render partial="Show/ItemCat" arguments="{item: element}"/>
</f:for>
</div>
</f:widget.paginate>
The partial render shows stuff including a picture used as a cover. So I need at least this relation in the view.
This works fine and shows only the items from the category that is requested. Unfortunatly I have a huge performance issue : I tried to show a category that contains more than 3000 records and It takes about one minute to load. It's a little bit long.
By f:debugging my list variable, I see that it contains every records even through it shouldn't be the case (that's the point of pagination...). So the first question is : is there something wrong in the way I did my pagination ?
I tried to simplify my requests by enabling the rawQuery thing ($query->execute(true)) : I get way better performance, but I can't get the link for the pictures (in my view, I get 1 or 0 but not the picture's uid...). Second question : is there a way to fix this issue ?
I hope my description is clear enough. Thanks for your help :-)
When you execute a query, it will not actually fetch the data from the database until the results are accessed. If the paginate widget gets a query result it will add limits and offset to the query and then fetch the data from the database, so you will only get the records that are shown on a page.
In your case you added toArray() after execute(), which accesses the results, so the data is fetched from the database and you get all records. The best solution I can think of is to combine the 2 tables into 1 so you can do it with a single query and don't have to merge and order them in PHP.
As long as you sort the data after the query you have to handle all data (request all records and especially resolve all relations).
Try to sort the data in the query itself (order by), so you could restrict the data to only those records which are needed for the current 'page' (limit <offset>,<number>).
Here a complex query with join and limit could be faster than a full query and filtering in PHP.

Suitescript Pagination

Ive been trying to create a suitelet that allows for a saved search to be run on a collection of item records in netsuite using suitescript 1.0
Pagination is quite easy everywhere else, but i cant get my head around how to do it in NetSuite.
For instance, we have 3,000 items and I'm trying to limit the results to 100 per page.
I'm struggling to understand how to apply a start row and a max row parameter as a filter so i can run the search to return the number of records from my search
I've seen plenty of scripts that allow you to exceed the limit of 1,000 records, but im trying to throttle the amount shown on screen. but im at a loss to figure out how to do this.
Any tips greatly appreciated
function searchItems(request,response)
{
var start = request.getParameter('start');
var max = request.getParameter('max');
if(!start)
{
start = 1;
}
if(!max)
{
max = 100;
}
var filters = [];
filters.push(new nlobjSearchFilter('category',null,'is',currentDeptID));
var productList = nlapiSearchRecord('item','customsearch_product_search',filters);
if(productList)
{
response.write('stuff here for the items');
}
}
You can approach this a couple different ways. Either way, you will definitely need to sort your search results by something meaningful and consistent, like by internal ID. Make sure you've got your results sorted either in your saved search definition or by adding a search column in your script.
You can continue building your search exactly like you are, and then just using the native slice method on the productList Array. You would use your start and end parameters to pass as the arguments to slice appropriately.
Another approach is to use the async API for searches. It will look similar to this:
var search = nlapiLoadSearch("item", "customsearch_product_search");
search.addFilter(new nlobjSearchFilter('category',null,'is',currentDeptID));
var productList = search.runSearch().getResults(start, end);
For more references on this approach, check out the NetSuite Help page titled "Search APIs" and the reference page for nlobjSearch.

Filtering Haystack (SOLR) results by django_id

With Django/Haystack/SOLR, I'd like to be able to restrict the result of a search to those records within a particular range of django_ids. Getting these IDs is not a problem, but trying to filter by them produces some unexpected effects. The code looks like this (extraneous code trimmed for clarity):
def view_results(request,arg):
# django_ids list is first calculated using arg...
sqs = SearchQuerySet().facet('example_facet') # STEP_1
sqs = sqs.filter(django_id__in=django_ids) # STEP_2
view = search_view_factory(
view_class=SearchView,
template='search/search-results.html',
searchqueryset=sqs,
form_class=FacetedSearchForm
)
return view(request)
At the point marked STEP_1 I get all the database records. At STEP_2 the records are successfully narrowed down to the number I'd expect for that list of django_ids. The problem comes when the search results are displayed in cases where the user has specified a search term in the form. Rather than returning all records from STEP_2 which match the term, I get all records from STEP_2 plus all from STEP_1 which match the term.
Presumably, therefore, I need to override one/some of the methods in for SearchView in haystack/views.py, but what? Can anyone suggest a means of achieving what is required here?
After a bit more thought, I found a way around this. In the code above, the problem was occurring in the view = search_view_factory... line, so I needed to create my own SearchView class and override the get_results(self) method in order to apply the filtering after the search has been run with the user's search terms. The result is code along these lines:
class MySearchView(SearchView):
def get_results(self):
search = self.form.search()
# The ID I need for the database search is at the end of the URL,
# but this may have some search parameters on and need cleaning up.
view_id = self.request.path.split("/")[-1]
view_query = MyView.objects.filter(id=view_id.split("&")[0])
# At this point the django_ids of the required objects can be found.
if len(view_query) > 0:
view_item = view_query.__getitem__(0)
django_ids = []
for thing in view_item.things.all():
django_ids.append(thing.id)
search = search.filter_and(django_id__in=django_ids)
return search
Using search.filter_and rather than search.filter at the end was another thing which turned out to be essential, but which didn't do what I needed when the filtering was being performed before getting to the SearchView.

MVC 5 Paging a long list (50 000 object), i used listPaged but it takes time to charge list

I has to display a list of books that containes more than 50 000 book.
I want to display paged list where for each page i invoke a method that gives me 20 books.
List< Books > Ebooks = Books.GetLibrary(index);
But using PagedList doesnt match with my want because it creates a subset of the collection of objects given and accesse to each subset with the index. And refering to the definition of its methode, i had to charge the hole list from the begining.
I also followed this article
var EBooks = from b in db.Books select b;
int pageSize = 20;
int pageNumber = (page ?? 1);
return View(Ebooks.ToPagedList(pageNumber, pageSize));
But doing so, i has to invoke (var Books = from b in db.Books select b; ) on each index
**EDIT****
I'm searching for indications to achieve this
List< Books > Ebooks = Books.GetLibrary(index);
and of course i has the number of all the books so i know the number of pages
So i'm searching for indication that leads me to achieve it: for each index, i invoke GetLibrary(index)
any suggestions ?
Have you tried something like:
var pagedBooks = Books.GetLibrary().Skip(pageNumber * pageSize).Take(pageSize);
This assumes a 0-based pageNumber.
If that doesn't work, can you add a new method to the Books class that gets a paged set directly from the data source?
Something like "Books.GetPage(pageNumber, pageSize);" that way you don't get the entire collection every time.
Other than that, you may have to find a way to cache the initial result of Books.GetLibrary() somewhere.

Initiate ApexPage.StandardSetController with List causes exception being thrown in later pagination calls

I have a Paginated List displayed on the visual force page and in the backend I was using a StandardSetController to control the pagination. However, one column on the table is an aggregated field whose calculation is done in a wrapper class. Recently, I want to sort the paginated list against the calculated field. And unfortunately the calculated result cannot be done on the data model(SObject) level.
So I am thinking to passed a sorted list of SObject to the StandardSetController constructor. That is to sort the record before it has been pass into the StandardSetController.
The code is like below:
List<Job__c> jobs = new List<Job__c>();
List<Job__c> tempJobs = Database.Query(basicQuery + filterExpression);
//sort with values
List<JobWrapper> jws = createJobWrappers(tempJobs);
JobWrapper.sortBy = JobWrapper.SORTBY_CALCULATEDFIELD_ASC;
jws.sort();
for(JobWrapper jw : jws){
jobs.add(jw.JobRecord);
}
jobs = jobs.deepClone(true, true, true);
StandardSetController con = new ApexPages.StandardSetController(jobs);
con.setPageSize(10);
However after executing the last line system throw exception:Modified rows exist in the records collection!
I did not modify any rows in the controller. Could anyone help me understanding the exception?

Resources