CouchDB view collation sorted by date - couchdb

I am using a CouchDB database.
I can get all documents by category and paginate the results with a key like ["category","document_id"] and a query like startkey=["category","document_id"]&endkey=["category",{}]
Now I want to sort those results by date so that the latest documents come first.
I tried a lot of keys such as ["category","date","document_id"]
but nothing works (or I can't get it working).
I would use something like
startkey=["queried_category","queried_date","queried_document_id"]&endkey=["queried_category"]
but have the "queried_date" key part ignored when filtering (sort by date, but do not return documents where "document_id" > "queried_document_id").
EDIT:
Example :
With a key like :
startkey=["apple","2012-12-27","ZZZ"]&endkey=["apple",{}]&descending=true
I will have (and it is the normal behavior)
"apple","2012-12-27","ABC"
"apple","2012-05-01","EFG"
...
"apple","2012-02-13","ZZZ"
...
But the result set I want should start with
"apple","2012-02-13","ZZZ"

Emit the category and the timestamp as an array key (you don't need the document_id):
emit([category, timestamp], null);
And then filter on the category:
?startkey=[":category"]&endkey=[":category",{}]
You must understand that this is only a sort, so the startkey has to come before the first row you want, and the endkey after the last row.
Last but not least, make sure the timestamp is stored in a representation that sorts correctly.
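As a rough sketch of the startkey/endkey rule above, the helper below builds the query string for a view keyed as [category, timestamp]. The function name categoryRange is hypothetical (it is not part of CouchDB), and it assumes the array key form of the emit. The point it illustrates is that with descending=true the view is read backwards, so the high end of the range becomes the startkey.

```javascript
// Hypothetical helper, not a CouchDB API: builds the query string for a
// view whose rows are keyed as [category, timestamp].
function categoryRange(category, descending) {
  const low = [category];      // collates before every [category, timestamp]
  const high = [category, {}]; // {} collates after numbers and strings
  // With descending=true the index is walked backwards, so the range has
  // to start at the high end and stop at the low end.
  const [startkey, endkey] = descending ? [high, low] : [low, high];
  const params = new URLSearchParams({
    startkey: JSON.stringify(startkey),
    endkey: JSON.stringify(endkey),
  });
  if (descending) params.set("descending", "true");
  return params.toString();
}

const latestFirst = categoryRange("apple", true);
const oldestFirst = categoryRange("apple", false);
```

The keys are JSON-encoded and then URL-encoded, which is what a view query expects for array keys.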

The problem with paginating on the timestamp instead of the doc ID is that the timestamp is not unique. That's why you will have problems paging with Aurélien's solution.
I would stay with what you tried, but store the timestamp as a number (standard UNIX milliseconds since 1970). You can reverse the order of a single numeric field just by multiplying it by -1:
emit([category, -timestamp, doc_id], null)
This way the rows, sorted in ascending key order, come out the way you need:
first by date, descending,
then by document ID, ascending.
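The ordering described above can be checked locally with a small sketch. The emit and compareKeys functions here are stand-ins written for illustration (CouchDB supplies its own emit and uses ICU collation for strings), and the field names category and timestamp are assumptions about the document shape.

```javascript
// Stand-in for the emit function that CouchDB provides to map functions.
const rows = [];
function emit(key, value) {
  rows.push({ key, value });
}

// Map function as it might appear in a design document.
// doc.category and doc.timestamp are assumed field names.
function map(doc) {
  emit([doc.category, -doc.timestamp, doc._id], null);
}

// Simplified array-key comparison: element by element, left to right.
// Real CouchDB collation handles more types and uses ICU for strings.
function compareKeys(a, b) {
  for (let i = 0; i < Math.min(a.length, b.length); i++) {
    if (a[i] < b[i]) return -1;
    if (a[i] > b[i]) return 1;
  }
  return a.length - b.length;
}

const docs = [
  { _id: "ZZZ", category: "apple", timestamp: Date.parse("2012-02-13T00:00:00Z") },
  { _id: "EFG", category: "apple", timestamp: Date.parse("2012-05-01T00:00:00Z") },
  { _id: "ABC", category: "apple", timestamp: Date.parse("2012-12-27T00:00:00Z") },
  { _id: "AAA", category: "apple", timestamp: Date.parse("2012-05-01T00:00:00Z") },
];
docs.forEach((doc) => map(doc));
rows.sort((a, b) => compareKeys(a.key, b.key));

// Newest date first; documents sharing a timestamp are ordered by ID.
const orderedIds = rows.map((row) => row.key[2]);
```

Because the document ID is the last key element, two documents with the same timestamp still have distinct keys, which is what makes paging on the key reliable.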

Related

solr query to sort result in descending order on basis of price

I am a complete beginner in Solr and I am trying to query my data. I want to find documents with name=plant and sort them by price, highest first.
In my schema, both name and price are of type text.
For example, let's say the data is
name:abc, price:25;
name:plant, price:35;
name:plant, price:45; //1000 other data
My approach:
/query?q=(name:"Plant")&stopwords=true
The above returns the plants, but I am not sure how to sort the results by the price field.
Any help will be appreciated.
You can use the sort parameter to sort the results.
Your query would look like q=(name:"Plant")&sort=price desc
The sort parameter arranges search results in either ascending (asc)
or descending (desc) order. The parameter can be used with either
numerical or alphabetical content. The directions can be entered in
either all lowercase or all uppercase letters (i.e., both asc or ASC).
Solr can sort query responses according to document scores or the
value of any field with a single value that is either indexed or uses
DocValues (that is, any field whose attributes in the Schema include
multiValued="false" and either docValues="true" or indexed="true" – if
the field does not have DocValues enabled, the indexed terms are used
to build them on the fly at runtime), provided that:
the field is non-tokenized (that is, the field has no analyzer and its
contents have not been parsed into tokens, which would make the sorting
inconsistent), or
the field uses an analyzer (such as the KeywordTokenizer) that
produces only a single term.
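A minimal sketch of building the sorted request follows. The helper name solrSortedQuery is hypothetical, and the /query handler path is simply the one used in the question. Whether the result comes back in numeric order depends on how the price field is typed in the schema.

```javascript
// Hypothetical helper: builds a Solr request path with a sort clause.
// The /query handler path is taken from the question.
function solrSortedQuery(name, sortField, direction) {
  const params = new URLSearchParams({
    q: `name:"${name}"`,
    sort: `${sortField} ${direction}`, // for example "price desc"
  });
  return `/query?${params.toString()}`;
}

const url = solrSortedQuery("Plant", "price", "desc");
```

URLSearchParams takes care of encoding the quotes and the space in the sort clause, which are easy to get wrong when the URL is assembled by hand.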

How to retrieve item closest to another item in DynamoDB?

I have a DynamoDB table where the sort key has a numeric value.
I need to retrieve the first item whose sort key is lower than a value that I already have.
I have gone through http://docs.aws.amazon.com/amazondynamodb/latest/APIReference/API_UpdateItem.html#API_UpdateItem_Examples docs but I can see no way to:
- sort the output
- limit the result to 1 entry
Is there any way to actually achieve what I want with DynamoDB?
EDIT:
According to this: http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/Query.html
The results are sorted by the sort key, and when it is numeric they are sorted in numeric order (ascending by default). Which is great, but I still can't find any way to get only a single result [I don't want to "pay" for a full table scan in some cases].
Are you searching for the next item which has a lower sort key within the same partition key?
In that case, you can use Query as you've found: add a key condition that the sort key is less than your value, request descending order (ScanIndexForward set to false) and set Limit to 1. This will not scan the entire table.
Alternatively, if you wish to search across partitions, unfortunately a table Scan is the only way to do this.
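The Query approach could be expressed roughly as follows. This only builds the request input as a plain object in the document-client style (native JavaScript values). The helper name closestLowerItemQuery and the attribute names pk and sk are placeholders, not names from the question.

```javascript
// Hypothetical helper: returns the input for a DynamoDB Query request.
// "pk" and "sk" are placeholder attribute names for the partition and
// sort keys; replace them with the table's real key attributes.
function closestLowerItemQuery(tableName, partitionValue, sortValue) {
  return {
    TableName: tableName,
    KeyConditionExpression: "#pk = :pk AND #sk < :sk",
    ExpressionAttributeNames: { "#pk": "pk", "#sk": "sk" },
    ExpressionAttributeValues: { ":pk": partitionValue, ":sk": sortValue },
    ScanIndexForward: false, // read the sort key in descending order
    Limit: 1, // stop after the first matching item
  };
}

const queryInput = closestLowerItemQuery("my-table", "user-1", 1500);
```

Since the condition is on the key itself, the first item read in descending order is already the closest lower one, so Limit: 1 returns it without reading the rest of the partition.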

Query Couchdb by date while maintaining sort order

I am new to CouchDB. I have looked at the docs and SO posts, but for some reason this simple query is still eluding me.
SELECT TOP 10 * FROM x WHERE DATE BETWEEN startdate AND enddate ORDER BY score
UPDATE: It cannot be done. This is unfortunate since to get this type
of data you have to pull back potentially millions of records (a few
fields) from couch then do either filtering, sorting or limiting
yourself to get the desired results. I am now going back to my
original solution of using _changes to capture and store elsewhere the data I need to perform that query on.
Here is my updated view (thanks to Dominic):
emit([d.getUTCFullYear(), d.getUTCMonth() + 1, d.getUTCDate(), doc.score], doc.name);
What I need to do is:
Always sort by score descending
Optionally filter by date range (for instance, TODAY only)
Limit by x
Update: Thanks to Dominic I am much closer - but still having an
issue.
?startkey=[2017,1,13,{}]&endkey=[2017,1,10]&descending=true&limit=10&include_docs=true
This brings back documents between the dates, sorted by score.
However, if I want the top 10 regardless of date, then I only get back the top 10 sorted by date (and not score).
For starters, when using complex keys in CouchDB, you can only sort from left to right. This is a common misconception, but read up on Views Collation for a more in-depth explanation. (while you're at it, read the entire Guide to Views as well since you're getting started)
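The left-to-right rule can be illustrated with a small local sketch. The comparison function is a simplification written for this example (it only handles numeric key elements, unlike CouchDB's full collation), and the sample keys are made up.

```javascript
// Compares two array keys element by element, left to right.
// Simplification: numeric elements only.
function compareNumericKeys(a, b) {
  for (let i = 0; i < Math.min(a.length, b.length); i++) {
    if (a[i] !== b[i]) return a[i] - b[i];
  }
  return a.length - b.length;
}

// Made-up keys of the form [year, month, day, score].
const keys = [
  [2017, 1, 11, 10],
  [2017, 1, 10, 90],
  [2017, 1, 10, 40],
  [2017, 1, 12, 70],
];

const sortedKeys = [...keys].sort(compareNumericKeys);

// The score only decides the order between keys that share the same date.
const scoresInKeyOrder = sortedKeys.map((key) => key[3]);
```

Within 10 January the scores are ordered, but across the whole range the date wins, which is why a multi-day query does not come back ranked by score.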
If you want to be able to sort by score, but filter by date only, you can accomplish this by breaking down your timestamp to only show the degree you care about.
function (doc) {
  var d = new Date(doc.date);
  // Date parts first (for filtering), score last (for sorting within a day).
  emit([d.getUTCFullYear(), d.getUTCMonth() + 1, d.getUTCDate(), doc.score]);
}
You'll end up outputting a more complex key than what you currently have, but you query it like so:
startkey=[2017,1,1]&endkey=[2017,1,1,{}]
This will pick out all the documents from 1 January 2017, already sorted by score! (The order is ascending; to get descending order, swap startkey and endkey and add descending=true. No change to the view is needed.)
As an aside, avoid emitting the entire doc as the value in your view. It is likely more efficient to leverage the include_docs=true parameter and leave the value of your emit empty. (please refer to this SO question for more information)
With this exact setup, you'd need separate views in order to query by different precisions. For example, to query by month you just use the year/month and so on.
However, if you are willing/able to sort your scores in your application, you can use a single view to get all the date precision you want. For example:
function (doc) {
  var d = new Date(doc.date);
  // One key element per level of date precision, coarsest first.
  emit([d.getUTCFullYear(), d.getUTCMonth() + 1, d.getUTCDate(), d.getUTCHours(), d.getUTCMinutes(), d.getUTCSeconds(), d.getUTCMilliseconds()]);
}
With this view and the group_level parameter (which only applies when the view has a reduce function, such as the built-in _count), you can group the rows by year, month, date, hour, etc. As I mentioned, in this case the results won't be sorted by score yet, but maybe this opens up other queries to you. (eg: what users participated this month?)
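Sorting the scores in the application, as suggested above, might look like the sketch below. The helper name topByScore is hypothetical; it assumes the rows come from a view response fetched with include_docs=true and that each document has a numeric score field.

```javascript
// Hypothetical helper: `rows` is the rows array of a view response fetched
// with include_docs=true; doc.score is an assumed numeric field.
function topByScore(rows, limit) {
  return [...rows] // copy, so the response itself is not reordered
    .sort((a, b) => b.doc.score - a.doc.score)
    .slice(0, limit);
}

// Made-up rows standing in for a date-range query result.
const sampleRows = [
  { id: "a", key: [2017, 1, 10], doc: { name: "first", score: 5 } },
  { id: "b", key: [2017, 1, 11], doc: { name: "second", score: 42 } },
  { id: "c", key: [2017, 1, 12], doc: { name: "third", score: 17 } },
];

const topTwo = topByScore(sampleRows, 2).map((row) => row.id);
```

This keeps the view simple, at the cost of transferring every row in the date range before the limit is applied.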

MongoDB API pagination

Imagine a situation where a client has a feed of objects with a limit of 10.
When the next 10 are required, it sends a request with skip 10 and limit 10.
But what if some objects were added to (or deleted from) the collection since the first request with offset == 0?
Then on the second request (with offset == 10) the response may contain the wrong objects or the wrong order.
Sorting on creation time does not work here, because some of my feeds are sorted on a numeric field.
You can add a time field like created_at or updated_at. It must be updated whenever the document is created or modified, and its values must be unique.
Then query the DB for a time range using $gte and $lte, along with a sort on this time field.
This ensures that changes made outside the time window are not reflected in the pagination, provided that the time field has no duplicates. If you include microtime, duplicates are unlikely.
It really depends on what you want the result to be.
If you want the original objects in their original order regardless of Delete and Add operations then you need to make a copy of the list (or at least of the order) and then page through that. Copy every Id to a new collection that doesn't change once the page has loaded and then paginate through that.
Alternatively, and perhaps more likely, what you want is to see the next 10 after the last one in the current set, including any Delete or Add operations that have taken place since. For this, you can use the sorted order in which you are viewing them and a filter: $gt whatever the last item was. BUT that doesn't work when there are duplicates in the field on which you are sorting. To get around that, you will need to index on that field PLUS some other field which is unique per record, for example the _id field. Now you can take the last record in the first set and look for records that either are $eq the indexed value and $gt the _id, OR are simply $gt the indexed value.
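The filter described in the last paragraph can be sketched as a plain query object. The helper name nextPageQuery and the rank field are hypothetical; the returned parts are meant to be passed to whatever driver call you use for find, sort and limit.

```javascript
// Hypothetical helper: builds the filter, sort and limit for the page that
// follows `last`, the final document of the previous page.
function nextPageQuery(sortField, last, pageSize) {
  return {
    filter: {
      $or: [
        // Strictly after the last value of the sort field ...
        { [sortField]: { $gt: last[sortField] } },
        // ... or the same value, with _id as the tie-breaker.
        { [sortField]: { $eq: last[sortField] }, _id: { $gt: last._id } },
      ],
    },
    // The sort must match the compound index on (sortField, _id).
    sort: { [sortField]: 1, _id: 1 },
    limit: pageSize,
  };
}

// "rank" is a made-up numeric field used for illustration.
const page2 = nextPageQuery("rank", { _id: "doc-10", rank: 7 }, 10);
```

Unlike skip, this filter is anchored to the last item the client actually saw, so documents inserted or deleted before that point do not shift the next page.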

Get Max Date Using CAML Query From a list

How can I get the max date and min date from a list's Date column?
The brute force approach is to create two queries that retrieve the list content sorted by date, one ascending and one descending. I know that this sucks, but at least you can move on with your project and refine the query later on.
If only it was possible to retrieve top 1 then it might even work in production.
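The two queries of the brute force approach differ only in sort direction, so they could be generated by one helper. The function name camlOrderBy and the field name Date are assumptions; depending on the API you call, the fragment may need to be wrapped in a Query element.

```javascript
// Hypothetical helper: returns a CAML OrderBy fragment for one field.
// The field name is an assumption; use the column's internal name.
function camlOrderBy(fieldName, ascending) {
  const flag = ascending ? "TRUE" : "FALSE";
  return `<OrderBy><FieldRef Name="${fieldName}" Ascending="${flag}" /></OrderBy>`;
}

// First row of the ascending query holds the min date,
// first row of the descending query holds the max date.
const minDateQuery = camlOrderBy("Date", true);
const maxDateQuery = camlOrderBy("Date", false);
```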
