CouchDB reducing sums with date filter - couchdb

I'm pretty new to couchdb and map/reduce in general. I have the following view:
{
"_id": "_design/keys",
"views": {
"keys": {
"map": "function(doc) { for (var thing in doc) { if (doc.created_at != null) { emit([thing, doc.created_at],1); } } }",
"reduce": "function(key,values) { return sum(values); }"
}
}
}
This works well to give me a sum of the count of all document keys in the database with the proper group_level:
.../_design/keys/_view/keys?group_level=1
{"rows":[
{"key":["_id"],"value":2},
{"key":["_rev"],"value":2},
{"key":["created_at"],"value":2},
{"key":["testing"],"value":2}
]}
Now what I want to do is reduce these mapped documents by date, which is an ISO 8601 string:
{"rows":[
{"key":["_id","2015-11-25T21:13:58Z"],"value":1},
{"key":["_id","2015-11-25T21:14:39Z"],"value":1},
{"key":["_rev","2015-11-25T21:13:58Z"],"value":1},
{"key":["_rev","2015-11-25T21:14:39Z"],"value":1},
{"key":["created_at","2015-11-25T21:13:58Z"],"value":1},
{"key":["created_at","2015-11-25T21:14:39Z"],"value":1},
{"key":["testing","2015-11-25T21:13:58Z"],"value":1},
{"key":["testing","2015-11-25T21:14:39Z"],"value":1}
]}
But I still want the results grouped by the first part of the key. That is, I want to specify a start time of 2015-11-25T21:13:57Z and an end time of 2015-11-25T21:13:59Z, and get back everything with the time stamp of 2015-11-25T21:13:58Z, like so:
{"rows":[
{"key":["_id"],"value":1},
{"key":["_rev"],"value":1},
{"key":["created_at"],"value":1},
{"key":["testing"],"value":1}
]}
How can I do this?

You should use your view function to emit the date component of the timestamp (which as you note is conveniently already in hierarchical structure) as a complex key:
Instead of "2015-11-26T...", emit the key as [2015, 11, 26, 21, 13, 58]
Then you can range query on the complex keys to different levels (year, month, date, time). Note that if you use times other than Zulu time you may need to use the view function to read the tz and emit in Zulu time so that all sort correctly.
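A minimal sketch of that idea, assuming created_at always holds a parseable ISO 8601 string. dateParts is a hypothetical helper, not a CouchDB function, and the emit stub exists only so the snippet runs outside CouchDB. Parsing through Date converts any offset to UTC, which should cover the Zulu-time caveat above.

```javascript
// Hypothetical helper: turn an ISO 8601 timestamp into
// [year, month, day, hour, minute, second] in UTC.
function dateParts(iso) {
  var d = new Date(iso);
  return [
    d.getUTCFullYear(),
    d.getUTCMonth() + 1, // getUTCMonth() is zero-based
    d.getUTCDate(),
    d.getUTCHours(),
    d.getUTCMinutes(),
    d.getUTCSeconds()
  ];
}

// Stand-in for CouchDB's emit(), only for running outside the view server.
var rows = [];
function emit(key, value) {
  rows.push({ key: key, value: value });
}

// Map function: same loop as in the question, with the date as a complex key.
function map(doc) {
  if (doc.created_at != null) {
    var parts = dateParts(doc.created_at);
    for (var thing in doc) {
      emit([thing].concat(parts), 1);
    }
  }
}

map({ _id: "a", created_at: "2015-11-25T21:13:58Z", testing: true });
console.log(JSON.stringify(rows[0].key)); // ["_id",2015,11,25,21,13,58]
```

In a design document the helper would have to live inside the map function body. With keys of this shape, a query such as ?startkey=["testing",2015,11,25]&endkey=["testing",2015,11,25,{}]&group_level=1 should sum one day for a single field.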

I had a similar problem just a few days ago and found that List Functions are a pretty easy way to solve this. You could simply use the date as key, the things as values, do the counting in the list function and can still use all the regular view features to define start and end keys.
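A rough sketch of the counting step, assuming the view emits the date as key and the field name as value. countThings and the sample rows are hypothetical. getRow() and send() are supplied by CouchDB's query server, so the list function is defined here but not called.

```javascript
// Pure counting step: rows are assumed to look like
// { key: <ISO date string>, value: <field name> }.
function countThings(rows) {
  var counts = Object.create(null); // no inherited keys to collide with
  rows.forEach(function (row) {
    counts[row.value] = (counts[row.value] || 0) + 1;
  });
  return counts;
}

// Sketch of a list function built on the same logic.
// It only runs inside CouchDB, where getRow() and send() exist.
function list(head, req) {
  var rows = [];
  var row;
  while ((row = getRow())) {
    rows.push(row);
  }
  send(JSON.stringify(countThings(rows)));
}

// Sample rows, as the view might return them for
// startkey="2015-11-25T21:13:57Z"&endkey="2015-11-25T21:13:59Z".
var sample = [
  { key: "2015-11-25T21:13:58Z", value: "_id" },
  { key: "2015-11-25T21:13:58Z", value: "_rev" },
  { key: "2015-11-25T21:13:58Z", value: "created_at" },
  { key: "2015-11-25T21:13:58Z", value: "testing" }
];
console.log(JSON.stringify(countThings(sample)));
// {"_id":1,"_rev":1,"created_at":1,"testing":1}
```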

Related

Query CosmosDB when document contains Dictionary

I have a problem with querying CosmosDB document which contains a dictionary. This is an example document:
{
"siteAndDevices": {
"4cf0af44-6233-402a-b33a-e7e35dbbee6a": [
"f32d80d9-e93a-687e-97f5-676516649420",
"6a5eb9fa-c961-93a5-38cc-ecd74ada13ac",
"c90e9986-5aea-b552-e532-cd64a250ad10",
"7d4bfdca-547a-949b-ccb3-bbf0d6e5d727",
"fba51bfe-6a5e-7f25-e58a-7b0ced59b5d8",
"f2caac36-3590-020f-ebb7-5ccd04b4412c",
"1b446af7-ba74-3564-7237-05024c816a02",
"7ef3d931-131e-a639-10d4-f4dd5db834ca"
]
},
"id": "f9ef9fb6-4b70-7d3f-2bc8-c3d335018624"
}
I need to get all documents where the provided GUID is in the list, that is, in a dictionary value (I don't know the dictionary key). I found some information here saying that it is not possible to iterate through the keys of a dictionary in CosmosDB (maybe that has changed since then, but I didn't find anything in the documentation), but maybe someone will have an idea. I cannot change the form of the document.
I tried to do it in Linq, but I didn't get any results.
var query = _documentClient
.CreateDocumentQuery<Dto>(DocumentCollectionUri())
.Where(d => d.SiteAndDevices.Any(x => x.Value.Contains("f32d80d9-e93a-687e-97f5-676516649420")))
.AsDocumentQuery();
Not sure of the Linq query, but with SQL, you'd need something like this:
SELECT * FROM c
WHERE ARRAY_CONTAINS(c.siteAndDevices['4cf0af44-6233-402a-b33a-e7e35dbbee6a'], "f32d80d9-e93a-687e-97f5-676516649420")
This is a strange document format though, as you've named your key with an id:
"siteAndDevices": {
"4cf0af44-6233-402a-b33a-e7e35dbbee6a": ["..."]
}
Your key is "4cf0af44-6233-402a-b33a-e7e35dbbee6a", which forces you to use a different syntax to reference it:
c.siteAndDevices['4cf0af44-6233-402a-b33a-e7e35dbbee6a']
You'd save yourself a lot of trouble refactoring this to something like:
{
"id": "dictionary1",
"siteAndDevices": {
"deviceId": "4cf0af44-6233-402a-b33a-e7e35dbbee6a",
"deviceValues": ["..."]
}
}
You can refactor further, such as using an array to contain multiple device id + value combos.
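One way the array form could look, as a sketch only. toArrayForm and containsDevice are hypothetical helpers, and the field names deviceId and deviceValues are borrowed from the example above rather than from any CosmosDB convention.

```javascript
// Convert { siteAndDevices: { <key>: [values...] } } into an array of
// { deviceId, deviceValues } entries, one per dictionary key.
function toArrayForm(doc) {
  var entries = Object.keys(doc.siteAndDevices).map(function (key) {
    return { deviceId: key, deviceValues: doc.siteAndDevices[key] };
  });
  return { id: doc.id, siteAndDevices: entries };
}

// With the array form, a value can be looked up without knowing the key.
function containsDevice(arrayDoc, value) {
  return arrayDoc.siteAndDevices.some(function (entry) {
    return entry.deviceValues.indexOf(value) !== -1;
  });
}

var original = {
  id: "f9ef9fb6-4b70-7d3f-2bc8-c3d335018624",
  siteAndDevices: {
    "4cf0af44-6233-402a-b33a-e7e35dbbee6a": [
      "f32d80d9-e93a-687e-97f5-676516649420",
      "6a5eb9fa-c961-93a5-38cc-ecd74ada13ac"
    ]
  }
};

var converted = toArrayForm(original);
console.log(containsDevice(converted, "f32d80d9-e93a-687e-97f5-676516649420")); // true
```

Documents in this shape should also be queryable on the server with a JOIN ... IN over the array, so the key no longer has to be known in advance.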

reduce output must shrink more rapidly, on adding new document

I have a couple of documents in CouchDB, each having a cId field, such as:
{
"_id": "ccf8a36e55913b7cf5b015d6c50009f7",
"_rev": "8-586130996ad60ccef54775c51599e73f",
"cId": 1,
"Status": true
}
I have a simple view, which tries to return max of cId with map and reduce functions as follows -
Map
function(doc) {
emit(null, doc.cId);
}
Reduce
function(key, values, rereduce){
return Math.max.apply(null, values);
}
This works fine (output is 1) until I add one more document with cId = 2 to the database. I expect the output to be 2, but the view starts failing with the error "Reduce output must shrink more rapidly". When I delete this document, things go back to normal. What could be the issue here? Is there an alternative way to achieve this?
Note: There are more views in the database that perform different roles, and a few return JSON as well. They also start failing after this change.
You could simply use the built-in _stats reduce function to get the maximum value. It is returned in the "max" field.
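To show the shape of the result, here is a small emulation of what _stats reports for a list of numbers. This is an illustration, not CouchDB's implementation, and it assumes the map function emits only numeric values (for example by checking typeof doc.cId === "number" before emitting).

```javascript
// Emulates the fields reported by CouchDB's built-in _stats reduce
// (sum, count, min, max, sumsqr) for a list of numeric values.
function stats(values) {
  return values.reduce(function (acc, v) {
    return {
      sum: acc.sum + v,
      count: acc.count + 1,
      min: Math.min(acc.min, v),
      max: Math.max(acc.max, v),
      sumsqr: acc.sumsqr + v * v
    };
  }, { sum: 0, count: 0, min: Infinity, max: -Infinity, sumsqr: 0 });
}

// Values as emitted by the map function for two documents with cId 1 and 2.
console.log(stats([1, 2]).max); // 2
```

In the design document, the reduce would then simply be the string "_stats" in place of the custom function.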

Sorting CouchDB result by value

I'm brand new to CouchDB (and NoSQL in general), and am creating a simple Node.js + express + nano app to get a feel for it. It's a simple collection of books with two fields, 'title' and 'author'.
Example document:
{
"_id": "1223e03eade70ae11c9a3a20790001a9",
"_rev": "2-2e54b7aa874059a9180ac357c2c78e99",
"title": "The Art of War",
"author": "Sun Tzu"
}
Map function:
function(doc) {
if (doc.title && doc.author) {
emit(doc.title, doc.author);
}
}
Since CouchDB sorts by key and supports a 'descending=true' query param, it was easy to implement a filter in the UI to toggle sort order on the title, which is the key in my results set. Here's the UI:
[Screenshot: list of books with a link to sort titles in ascending or descending order]
But I'm at a complete loss on how to do this for the author field.
I've seen this question, which helped a poster sort by a numeric reduce value, and I've read a blog post that uses a list to also sort by a reduce value, but I've not seen any way to do this on a string value without a reduce.
If you want to sort by a particular property, you need to ensure that that property is the key (or, in the case of an array key, the first element in the array).
I would recommend using the sort key as the key, emitting a null value and using include_docs to fetch the full document to allow you to display multiple properties in the UI (this also keeps the deserialized value consistent so you don't need to change how you handle the return value based on sort order).
Your map functions would be as simple as the following.
For sorting by author:
function(doc) {
if (doc.title && doc.author) {
emit(doc.author, null);
}
}
For sorting by title:
function(doc) {
if (doc.title && doc.author) {
emit(doc.title, null);
}
}
Now you just need to change which view you call based on the selected sort order and ensure you use the include_docs=true parameter on your query.
You could also use a single view for this by emitting both at once...
emit(["by_author", doc.author], null);
emit(["by_title", doc.title], null);
... and then using the composite key for your query.
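A sketch of how the query parameters for that single view could be built. sortParams is a hypothetical helper. It relies on CouchDB's collation, where ["by_author"] sorts before every ["by_author", "..."] key and ["by_author", {}] sorts after them, and on the rule that startkey and endkey swap when descending=true.

```javascript
// Build the query string for one "half" of the combined view.
function sortParams(field, descending) {
  var low = JSON.stringify(["by_" + field]);
  var high = JSON.stringify(["by_" + field, {}]);
  var params = {
    // With descending=true the range is walked from the high end.
    startkey: descending ? high : low,
    endkey: descending ? low : high,
    include_docs: "true"
  };
  if (descending) {
    params.descending = "true";
  }
  return Object.keys(params).map(function (name) {
    return name + "=" + encodeURIComponent(params[name]);
  }).join("&");
}

// Append the result to the view URL, e.g. .../_view/<view>?<params>
console.log(sortParams("author", false));
console.log(sortParams("title", true));
```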

Sort by date with null first

I'm trying to find documents in a collection, ordered by their date. This works, but I get the documents with null in the date field at the bottom, I want these first.
MyModel.find({ }, null, { sort: { date: -1 } }, function(err, models) {
// Models sorted with the "largest" date first and models with null dates last
});
If I change the sorting to { date: 1 } I do get the documents with null at first, but the order otherwise is reverse, which I do not want.
How can I achieve the desired behaviour?
Unfortunately it appears there's no trivial way to do this, akin to SQL's "nulls first" / "nulls last" syntax. This feature is mentioned in this Mongo bug:
https://jira.mongodb.org/browse/SERVER-153
There it is lumped in with the larger feature of custom sorting functions. (I really wish this were broken out into a separate feature request, because I would think implementing only the nulls first/last bit should be much easier than custom sorting functions and would still provide a lot of value.)
Anyway, the workaround for the time being is to query separately for the null values in addition to your existing query (see also: How are null values in a MongoDB index sorted?):
MyModel.find({ date: null });
MyModel.find({ date: { $ne: null } }).sort({ date: -1 });
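If the result set is small enough to sort in application code, the desired order can also be written as a comparator. This is a client-side sketch rather than the two-query workaround itself. nullsFirstNewestNext is a hypothetical name, the sample documents are made up, and date is assumed to be a Date object when present.

```javascript
// Comparator: documents without a date come first,
// the rest follow with the most recent date first.
function nullsFirstNewestNext(a, b) {
  var aMissing = a.date == null; // matches null and undefined
  var bMissing = b.date == null;
  if (aMissing && bMissing) return 0;
  if (aMissing) return -1;
  if (bMissing) return 1;
  return b.date.getTime() - a.date.getTime();
}

var models = [
  { id: 1, date: new Date("2020-01-01T00:00:00Z") },
  { id: 2, date: null },
  { id: 3, date: new Date("2021-01-01T00:00:00Z") },
  { id: 4 }
];

var ordered = models.slice().sort(nullsFirstNewestNext);
console.log(ordered.map(function (m) { return m.id; }).join(","));
```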

CouchDB reduce function useful in this scenario?

I want to store votes in CouchDB. To get round the problem of incrementing a field in one document and having millions of revisions, each vote will be a separate document:
{
"_id": "xyz",
"type": "thumbs_up",
"vote_id": "test"
}
So the actual document itself is the vote. The result I'd like is basically an array of: vote_id, sumOfThumbsUp, sumOfThumbsDown
Now I think my map function would need to look like:
function (doc) {
if (doc.type == "thumbs_up" || doc.type == "thumbs_down") {
emit(doc.vote_id, doc.type);
}
}
Here's the part I can't figure out: should I build a reduce function to somehow sum the vote types, keeping in mind that there are two types of votes?
Or should I just take what's been emitted from the map function and put it straight into an array to work on, ignoring the reduce function completely?
This is a perfect case for map-reduce! Having each document represent a vote is the right way to go in my opinion, and will work with CouchDB's strengths.
I would recommend a document structure like this:
Documents
UPVOTE
{
"type": "vote",
"vote_id": "test",
"vote": 1
}
DOWNVOTE
{
"type": "vote",
"vote_id": "test",
"vote": -1
}
I would use a document type of "vote", so you can have other document types in your database (like the vote category information, user information, etc)
I kept "vote_id" the same
I made the value field called "vote", and just used 1/-1 instead of "thumbs_up" or "thumbs_down" (really doesn't matter, you can do whatever you want and it will work just fine)
View
Map
function (doc) {
if (doc.type === "vote") {
emit(doc.vote_id, doc.vote);
}
}
Reduce
_sum
You end up with a result like this for your map function:
[Image: view output of the map function]
And if you reduce it:
[Image: view output after the reduce]
As you add more vote documents with more vote_id variety, you can query for a specific vote_id by using: /:db/_design/:ddoc/_view/:view?reduce=true&group=true&key=":vote_id"
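To make the behavior concrete, here is a small stand-in for the view that runs the map logic and a _sum-style reduce per key, roughly what group=true would return. tally is a hypothetical helper, and the "other" vote and the user document are made-up sample data.

```javascript
// Run the map function over the documents and sum the emitted
// values per key, the way _sum with group=true would.
function tally(docs) {
  var totals = Object.create(null);
  function emit(key, value) {
    totals[key] = (totals[key] || 0) + value;
  }
  docs.forEach(function (doc) {
    if (doc.type === "vote") {
      emit(doc.vote_id, doc.vote);
    }
  });
  return totals;
}

var docs = [
  { type: "vote", vote_id: "test", vote: 1 },
  { type: "vote", vote_id: "test", vote: -1 },
  { type: "vote", vote_id: "other", vote: 1 }, // hypothetical extra vote
  { type: "user", name: "someone" }            // ignored by the map
];

console.log(JSON.stringify(tally(docs))); // {"test":0,"other":1}
```

This yields the net score per vote_id. If separate up and down totals are needed, one option is to emit [doc.vote_id, doc.vote] as the key with a value of 1 and query with group=true.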

Resources