Couchdb: filter and group in a single view - couchdb

I have a CouchDB database with documents of the form: { Name, Timestamp, Value }
I have a view that shows a summary grouped by name with the sum of the values. This is a straightforward reduce function.
Now I want to filter the view to only take into account documents where the timestamp occurred in a given range.
AFAIK this means I have to include the timestamp in the emitted key of the map function, e.g. emit([doc.Timestamp, doc.Name], doc)
But as soon as I do that, the reduce function no longer sees the rows grouped together to calculate the sum. If I put the name first I can group at level 1 only, but how do I filter at level 2?
Is there a way to do this?

I don't think this is possible with only one HTTP fetch and/or without additional logic in your own code.
If you emit([time, name]) you would be able to query startkey=[timeA]&endkey=[timeB]&group_level=2 to get items between timeA and timeB grouped where their timestamp and name were identical. You could then post-process this to add up whenever the names matched, but the initial result set might be larger than you want to handle.
An alternative would be to emit([name,time]). Then you could first query with group_level=1 to get a list of names [if your application doesn't already know what they'll be]. Then for each one of those you would query startkey=[nameN,timeA]&endkey=[nameN,timeB]&group_level=1 to get the summary for that name within the time range.
(Note that in my query examples I've left the JSON start/end keys unencoded, so as to make them more human readable, but you'll need to apply your language's equivalent of JavaScript's encodeURIComponent on them in actual use.)
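As a rough sketch of the first approach (keys emitted as [time, name]), the snippet below builds the encoded query string and then adds up the group_level=2 rows per name on the client. The function names and the sample rows are made up for illustration; only encodeURIComponent, JSON.stringify and the startkey/endkey/group_level parameters come from the discussion above.

```javascript
// Build the query string for a view whose keys are [time, name].
// Keys are JSON-encoded and then URL-encoded, as the note above recommends.
function rangeQuery(timeA, timeB) {
  const enc = (value) => encodeURIComponent(JSON.stringify(value));
  return "startkey=" + enc([timeA]) + "&endkey=" + enc([timeB]) + "&group_level=2";
}

// Post-process rows shaped like a CouchDB reduce response at group_level=2:
// each row has key [time, name] and value = sum for that exact pair.
function sumByName(rows) {
  const totals = {};
  for (const row of rows) {
    const name = row.key[1];
    totals[name] = (totals[name] || 0) + row.value;
  }
  return totals;
}

// Hypothetical rows, invented for this example.
const sampleRows = [
  { key: [100, "a"], value: 2 },
  { key: [120, "b"], value: 5 },
  { key: [150, "a"], value: 3 }
];
const totals = sumByName(sampleRows);
const query = rangeQuery(100, 200);
```

The client-side loop is the "post-process" step mentioned above, so its cost grows with the number of distinct (time, name) pairs inside the range.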

You cannot build a view on top of another view. You need to write another map-reduce view that does the filtering in the map step and the grouping in the reduce step. Something like:
map:
function(doc) {
  // start and end must be literal values written into the function itself;
  // a map function cannot receive query parameters
  if (doc.timestamp > start && doc.timestamp < end) {
    emit(doc.name, doc.value);
  }
}
reduce:
function(key, values, rereduce) {
  return sum(values);
}
Because the range is hard-coded, I suppose you cannot store this view, and have to run it as an ad-hoc query from your application.

Related

How to filter view query results based on reduce value (and not just on key)

Using map/reduce functions only (not Mango), and the following example from the documentation, one may obtain the count for each unique label with the map and reduce functions below:
Documents returned by the view
{"total_rows":9,"offset":0,"rows":[
{"id":"3525ab874bc4965fa3cda7c549e92d30","key":"bike","value":null},
{"id":"3525ab874bc4965fa3cda7c549e92d30","key":"couchdb","value":null},
{"id":"53f82b1f0ff49a08ac79a9dff41d7860","key":"couchdb","value":null},
{"id":"da5ea89448a4506925823f4d985aabbd","key":"couchdb","value":null},
{"id":"3525ab874bc4965fa3cda7c549e92d30","key":"drums","value":null},
{"id":"53f82b1f0ff49a08ac79a9dff41d7860","key":"hypertext","value":null},
{"id":"da5ea89448a4506925823f4d985aabbd","key":"music","value":null},
{"id":"da5ea89448a4506925823f4d985aabbd","key":"mustache","value":null},
{"id":"53f82b1f0ff49a08ac79a9dff41d7860","key":"philosophy","value":null}
]}
Map function
function(doc) {
if(doc.name && doc.tags) {
doc.tags.forEach(function(tag) {
emit(tag, 1);
});
}
}
Reduce function
function(keys, values) {
return sum(values);
}
Response with grouping
{"rows":[
{"key":"bike","value":1},
{"key":"couchdb","value":3},
{"key":"drums","value":1},
{"key":"hypertext","value":1},
{"key":"music","value":1},
{"key":"mustache","value":1},
{"key":"philosophy","value":1}
]}
Now my question is: using map/reduce views only (not Mango), how can I query the view to select only rows having a specific value after the reduce (for example "3")? It looks like all view parameters focus on filtering based on the key, but I need to filter based on the value. Ideally, I would also like to use greater-than and less-than comparisons when filtering on the reduce value.
The ability to filter based on the value is essential for scenarios like the one above, but also for more advanced scenarios involving linked documents. Of course, I am not interested in filtering in memory in the application layer since in real world scenarios, the result set would be much larger than a dozen lines.

learning mapreduce in Fauxton

I am brand new to NoSQL, CouchDB, and MapReduce and need some help.
I have the same question discussed here {How to use reduce in Fauxton} but do not understand the answer :(
I have a working map function:
function (foo) {
if(foo.type == "blog post");
emit(foo)
}
which returns 11 individual documents. I want to modify this to return foo.type along with a count of 1.
I have tried:
function (doc) {
if(doc.type == "blog post");
return count(doc)
}
and "_count" from the Reduce panel, but clearly am doing something wrong as the View does not return anything.
Thanks in advance for any assistance or guidance!
In Fauxton, the Reduce step is kind of awkward and unintuitive to find.
Select _count in the "Reduce (optional)" popup below where you type in your Map.
Select "Save Document and then Build Index". That will display your map results.
Find the "Options" button at the top next to a gears icon. If you see a green band instead, close the green band with the X.
Select Options, then the "Reduce" check-circle. Select Run Query.
Map
So when you build a map function, you are literally creating a dictionary, or map, which is a key:value data structure.
Your map function should emit the keys that you will query. You can also emit a value, but if you simply intend to get the associated document, you don't have to emit any value. Why? Because there is a query parameter that returns the associated document (?include_docs=true).
Reduce
Then you can have a reduce function, which is called for the results that share the same key. All results with the same key are passed through your reduce function, which reduces their values to a single value.
Corrected example
So in your case, I suppose you want to map the documents by type.
You could create a function that emits every document that has a type property.
function(doc) {
  if (doc.type) {
    // the key is the type; no value is needed because _count only counts rows
    emit(doc.type);
  }
}
If you query this view, you will see that the key of each row is the type of the document. If you choose the _count reduce function, you get the number of documents per type.
When querying the view, you have to specify: group=true&reduce=true
Also, you can get all the documents of type blog post by querying with this parameter: ?key="blog post" (add reduce=false, plus include_docs=true if you want the full documents, since a view with a reduce function returns the reduced value by default).
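To see what the map plus _count combination produces, here is a small local simulation in plain JavaScript. It does not talk to CouchDB: emit is passed in as a callback, countByKey is a hypothetical stand-in for querying with reduce=true&group=true, and the sample documents are invented.

```javascript
// Hypothetical sample documents, invented for illustration.
const docs = [
  { _id: "a", type: "blog post" },
  { _id: "b", type: "blog post" },
  { _id: "c", type: "comment" },
  { _id: "d" } // no type property, so the map function skips it
];

// Same logic as the map function above, with emit passed in
// so that it can run outside CouchDB.
function map(doc, emit) {
  if (doc.type) {
    emit(doc.type, null);
  }
}

// Local stand-in for the built-in _count reduce with group=true:
// count the emitted rows per key and return them sorted by key.
function countByKey(documents) {
  const counts = new Map();
  for (const doc of documents) {
    map(doc, (key) => counts.set(key, (counts.get(key) || 0) + 1));
  }
  return [...counts.entries()]
    .sort(([a], [b]) => (a < b ? -1 : a > b ? 1 : 0))
    .map(([key, value]) => ({ key, value }));
}

const rows = countByKey(docs);
console.log(JSON.stringify(rows));
```

Each element of rows has the same key/value shape as a row in a grouped view response.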

Cloudant Custom Sort

I have my data as follows
{
  "key": "adasd",
  "col1": 23,
  "col2": 3
}
I want to see the results sorted in descending order of the ratio col1/sum(col2),
where sum(col2) refers to the sum of all values of col2. I am a bit new to Cloudant so I don't know what the best way to approach this is. I can think of a few options.
Create a new column for sum(col2) and keep updating it with each new value of col2.
For each record, also create a new column col1/sum(col2). Then I can sort on this column.
Use views to calculate the ratio and sum on the fly. This way I don't have to store new columns, and I don't have to perform costly calculations on each update.
I tried to create a view and the map function is easy enough
function (doc) {
emit(doc._id, {"col1_value":doc.col1,"col2_value":doc.col2});
}
but I am confused by the reduce template
function (keys, values, rereduce) {
if (rereduce) {
return sum(values);
} else {
return values.length;
}
}
I have no idea on how to access the values of the two columns and then aggregate here. Is this even possible? Is there any other way to achieve the result I need?
Two comments:
Ordering by X/sum(Y) is the same as ordering by X (or by -X if sum(Y) is negative). So for ordering purposes, just order by X and save yourself a bunch of hassle.
Assuming you actually want to know the value of X/sum(Y), and not just order by it, there's no one-step way to accomplish this in CouchDB. The best I can think of is to create a map/reduce view that gives you the global sum(Y). Then you can fetch that sum with a simple query, and do the math in your application, when fetching your documents.
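The two-step approach from the second comment could look roughly like this on the application side. This is only a sketch: globalSum stands in for a view that returns the total of col2 (for example a map emitting doc.col2 with a sum reduce), and the documents and function names are invented.

```javascript
// Hypothetical sample documents, invented for illustration.
const docs = [
  { _id: "a", col1: 23, col2: 3 },
  { _id: "b", col1: 10, col2: 5 },
  { _id: "c", col1: 40, col2: 2 }
];

// Stand-in for step 1: the single value a sum-reducing view would return.
function globalSum(documents) {
  return documents.reduce((total, doc) => total + doc.col2, 0);
}

// Step 2, done in the application: divide each col1 by the global sum
// and sort in descending order of the ratio.
function rankByRatio(documents) {
  const total = globalSum(documents);
  return documents
    .map((doc) => ({ id: doc._id, ratio: doc.col1 / total }))
    .sort((x, y) => y.ratio - x.ratio);
}

const ranked = rankByRatio(docs);
console.log(ranked.map((entry) => entry.id).join(","));
```

Because the divisor is the same for every document, the resulting order matches a plain descending sort on col1 whenever the sum is positive, which is the point of the first comment.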

Couchdb - date range + multiple query parameters

I want to be able query the couchdb between dates, I know that this can be done with startkey and endkey (it works fine), but is it possible to do query for example like this:
SELECT *
FROM TABLENAME
WHERE
DateTime >= '2011-04-12T00:00:00.000' AND
DateTime <= '2012-05-25T03:53:04.000'
AND
Status = 'Completed'
AND
Job_category = 'Installation'
Generally speaking, establishing indexes on multiple fields grows in complexity as the number of fields increases.
My main question is: do Status and Job_category need to be queried dynamically too? If not, your view is simple:
function (doc) {
if (doc.Status === 'Completed' && doc.Job_category === 'Installation') {
emit(doc.DateTime); // this line may change depending on how you break up and emit the datetimes
}
}
Views are fairly cheap (depending on the size of your database), so don't be afraid to establish several that cover different cases. I would expect something like Status to have a predefined list of available options, as opposed to Job_category, which seems like it could be more related to user input.
If you need those fields to be dynamic, you can just add them to the index as well:
function (doc) {
emit([ doc.Status, doc.Job_category, doc.DateTime ]);
}
Then you can use an array as your start_key. For example:
start_key=["Completed", "Installation", ...]
tl;dr: use "static" views where you have a predetermined list of values for a given field. While it is possible to query "dynamic" views with multiple fields, the complexity grows very quickly.
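For the "dynamic" view keyed on [Status, Job_category, DateTime], one way to build the range query is to fix the first two array elements and vary only the date in the start and end keys. The helper name buildRangeQuery is made up for this sketch; the field values are the ones from the SQL in the question.

```javascript
// Build startkey/endkey for a view whose keys are
// [Status, Job_category, DateTime]. Each key is JSON-encoded and then
// URL-encoded so that it can be placed in the query string.
function buildRangeQuery(status, category, from, to) {
  const enc = (value) => encodeURIComponent(JSON.stringify(value));
  return (
    "startkey=" + enc([status, category, from]) +
    "&endkey=" + enc([status, category, to])
  );
}

const query = buildRangeQuery(
  "Completed",
  "Installation",
  "2011-04-12T00:00:00.000",
  "2012-05-25T03:53:04.000"
);
console.log(query);
```

This works because the equality fields come first in the key and the range field comes last; a range on an earlier element would not restrict the later ones.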

Couchdb query for values calculated from key input

Suppose I have the following data in my database:
[1,2],[2,1],[1,3],[3,1]...
where the numbers represent the a and b values of the formula a*x+b.
What I now want is a query that returns the difference to a given point x,y.
For example: the point [2,6] is given. I want my query to return
[1,2] = -2 (1*2+2=4, 4-6=-2)
[2,1] = -1 (2*2+1=5, 5-6=-1)
[1,3] = -1 (1*2+3=5, 5-6=-1)
[3,1] = 1 (3*2+1=7, 7-6=1)
I know how to do this in SQL but the data is already in CouchDB. I'm quite new to the NoSQL world and was wondering if something like this would be possible in CouchDB.
What you can do is use the standard MapReduce functionality of CouchDB.
Map is a function you put in a view, which finds your data. You can have various criteria for locating the docs you need. Next, if you specify reduce=true in the query, a reduce function is executed on the rows emitted by the map function. You can use JavaScript to perform various operations on the emitted values.
In your case, the map can look something like this:
function(doc) {
if(doc.a && doc.b) {
emit(doc._id,[doc.a, doc.b]);
}
}
then, the reduce gets called, like this:
function(keys, values, rereduce) {
var res;
//do something with values...
return res;
}
In your case, keys will be a list of [key, document ID] pairs (the key here is itself the document ID), and values will be a list of the [a, b] arrays.
When you call the MapReduce (depending on what method you use to access the DB), you should specify reduce=true.
Good resources on MapReduce (and on Views, Sorting and List functions) are:
http://guide.couchdb.org/draft/views.html
http://www.slideshare.net/okurow/couchdb-mapreduce-13321353
Another way to go is to use a list function on the Map result, if you want to output the result in HTML. A good reason to use a list function is that you can pass arguments to it in the query string; in your case that may be the point for which you want to calculate distances.
For detailed description on List functions, have a look here:
http://guide.couchdb.org/draft/transforming.html
Hope this helps.
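As a sketch of the arithmetic only, the function below takes rows shaped like the output of the map function above (value = [a, b]) and a point x,y, and returns a*x+b-y for each row. It runs locally in plain JavaScript; the function name and the sample rows are invented, and inside a list function the same calculation would be applied row by row, with x and y taken from the query string.

```javascript
// Rows shaped like the map output above: value is [a, b].
// The ids are hypothetical.
const rows = [
  { id: "doc1", value: [1, 2] },
  { id: "doc2", value: [2, 1] },
  { id: "doc3", value: [1, 3] },
  { id: "doc4", value: [3, 1] }
];

// For each row, evaluate a*x + b and subtract y.
function differencesToPoint(viewRows, x, y) {
  return viewRows.map((row) => {
    const [a, b] = row.value;
    return { pair: [a, b], diff: a * x + b - y };
  });
}

const result = differencesToPoint(rows, 2, 6);
console.log(result.map((entry) => entry.diff).join(","));
```

For the point [2,6] this gives the four differences listed in the question.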
