How can I query multiple key criteria? - couchdb

Using couchdb, with the following json:
{"total_rows":3,"offset":0,"rows":[ {"id":"bc26e5eae7f8c8c3486818e7e7971df0","key":{"user":"lili#abc.com","pal":["igol ≠ eagle"],"fecha":"10/5/2014"},"value":null},{"id":"cf0dc2e2874776958c59f2f544b5a750","key":{"user":"lili#abc.com","pal":["kat ≠cat"],"fecha":"10/6/2014"},"value":null},{"id":"df4ec96088ed52096db064f2ebd2310b","key":{"user":"dum#ghi.com","pal":["dok ≠ duck"],"fecha":"10/7/2014"},"value":null}]}
I would like to query for specific user AND specific date:
for example:
?user="lili#def.com"&fecha:"10/6/2014"
I also tried:
?user%3Dlili%40def.com%26fecha%3A10%2F6%2F2014
Needless to say, it isn't working as I expected: all rows are returned, not only the record I need.
my view func is:
function(doc) {
if (doc.USER){
emit({user:doc.USER, pal:doc.palabras, fecha:doc.fecha});
}
}
Regards.

Remember that CouchDB views are simply key/value lookups that are built at index time, not query time. At the moment you are emitting an object as the key and no value. If you want to look something up by two values, you'll need to emit a composite key (an array):
function(doc) {
if (doc.USER) {
emit([doc.USER, doc.fecha], doc);
}
}
Then you can look up matching documents by passing the array as the key:
?key=%5B%22lili%40def.com%22%2C%20%2210%2F6%2F2014%22%5D
There are optimisations you can make to this (e.g. emitting a null value and using include_docs to reduce the size of the view) but this should set you off on the right track.
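As a rough illustration of the lookup above, the key can be JSON-encoded and URL-escaped in code rather than by hand. This is only a sketch: the database name `mydb`, the design document `app` and the view name `by_user_fecha` are made-up placeholders, and the `include_docs=true` variant assumes the view emits a null value as suggested.

```javascript
// Build the query URL for a view whose map function emits [doc.USER, doc.fecha].
// The path pieces (db, design doc, view name) are hypothetical placeholders.
function buildViewUrl(base, keyParts, includeDocs) {
  // View keys must be valid JSON, so strings keep their double quotes.
  var key = encodeURIComponent(JSON.stringify(keyParts));
  var url = base + "?key=" + key;
  if (includeDocs) {
    url += "&include_docs=true";
  }
  return url;
}

var url = buildViewUrl(
  "/mydb/_design/app/_view/by_user_fecha",
  ["lili@def.com", "10/6/2014"],
  true
);
console.log(url);
```

Encoding the whole JSON array in one step avoids the usual mistakes of forgetting the quotes around strings or escaping only part of the key.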

I do the same thing as Ant P but I tend to use strings.
function ( doc ) {
if ( doc.USER ) {
emit( 'user-' + doc.USER + '-' + doc.fecha, doc );
}
}
I would also highly recommend emitting null instead of doc as a value.
Remember, you can always emit more than once depending on what kind of queries you need.
For example, if you're looking for all posts by a specific user between two dates, you could do the following view.
function ( doc ) {
if ( doc.type == "post" ) {
emit( 'user-' + doc.nombre, null );
emit( 'fecha-' + doc.fecha, null );
}
}
Then you would query the view twice: once with _view/posts?key="user-miUsario" and once with _view/posts?start_key="fecha-1413040000000"&end_key="fecha-1413049452904". Once you have the ids from both queries, you take the intersection and use _all_docs to get your original documents.
You end up making three requests but it saves disk space in the view, the payloads are smaller because you return null, and your code is simpler because you can query the same view multiple ways.
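The intersection step could look roughly like this on the client. It is a sketch under assumptions: the two response objects are invented sample data in the shape a view returns, and the helper names `idsOf` and `intersectIds` are mine.

```javascript
// Collect the document ids from a view response ({ rows: [{ id, key, value }, ...] }).
function idsOf(viewResult) {
  return viewResult.rows.map(function (row) { return row.id; });
}

// Keep only the ids that appear in both lists, without duplicates.
function intersectIds(a, b) {
  var inB = new Set(b);
  var seen = new Set();
  return a.filter(function (id) {
    if (!inB.has(id) || seen.has(id)) { return false; }
    seen.add(id);
    return true;
  });
}

// Sample responses, invented for illustration only.
var byUser = { rows: [{ id: "a", key: "user-miUsario", value: null },
                      { id: "b", key: "user-miUsario", value: null }] };
var byFecha = { rows: [{ id: "b", key: "fecha-1413040000001", value: null },
                       { id: "c", key: "fecha-1413049000000", value: null }] };

// Body for the third request: POST /db/_all_docs?include_docs=true
var allDocsBody = { keys: intersectIds(idsOf(byUser), idsOf(byFecha)) };
console.log(JSON.stringify(allDocsBody));
```

Only document "b" appears in both sample responses, so it is the only id sent to _all_docs.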

Related

Reduce output must shrink more rapidly -- Reducing to a list of documents

I have a few documents in my couch db with json as below. The cId will change for each. And I have created a view with map/reduce function to filter out few documents and return a list of json documents.
Document structure -
{
"_id": "ccf8a36e55913b7cf5b015d6c50009f7",
"_rev": "8-586130996ad60ccef54775c51599e73f",
"cId": 1,
"Status": true
}
Here is the sample map:
function(doc) {
if(doc.Key && doc.Value && doc.Status == true)
emit(null, doc);
}
Here is the sample reduce:
function(key, values, rereduce){
var kv = [];
values.forEach(function(value){
if(value.cId != <some_val>){
kv.push({"k": value.cId, "v" : value});
}
});
return kv;
}
If there are two documents and the reduce output is a list containing 1 document, this works fine. But if I add one more document (with cId = 2), it throws the error "reduce output must shrink more rapidly". What causes this? And how can I achieve what I intend to do?
The cause of the error is that the reduce function does not actually reduce anything; it just collects objects. The documentation mentions this:
The way the B-tree storage works means that if you don’t actually
reduce your data in the reduce function, you end up having CouchDB
copy huge amounts of data around that grow linearly, if not faster
with the number of rows in your view.
CouchDB will be able to compute the final result, but only for views
with a few rows. Anything larger will experience a ridiculously slow
view build time. To help with that, CouchDB since version 0.10.0 will
throw an error if your reduce function does not reduce its input
values.
It is unclear to me what you intend to achieve.
Do you want to retrieve a list of docs based on certain criteria? In this case, a view without reduce should suffice.
Edit: If the desired result depends on a value stored in a certain document, CouchDB has a feature called list functions. A list function is a design-document function that iterates over the rows of a given view; if you pass include_docs=true, each row also carries the full document.
A list URL follows this pattern:
/db/_design/foo/_list/list-name/view-name
Like views, lists are defined in a design document:
{
"_id" : "_design/foo",
"lists" : {
"bar" : "function(head, req) {
var row;
while (row = getRow()) {
if (row.doc._id === 'baz') { /* Do stuff based on a certain doc */ }
}
}"
},
... // views and other design functions
}
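For the simpler route mentioned above (a view without reduce), a minimal sketch might look like the following. It assumes the documents have only the fields shown in the question (`cId`, `Status`); the `emit` stand-in exists solely so the map function can run outside CouchDB, and `excludedCId` is a hypothetical placeholder for the value the question leaves unspecified.

```javascript
// Map function for a view without a reduce step: the key is cId, the value is null.
function map(doc) {
  if (doc.Status === true) {
    emit(doc.cId, null);
  }
}

// Tiny stand-in for CouchDB's emit(), only so the sketch can run outside CouchDB.
var rows = [];
function emit(key, value) {
  rows.push({ key: key, value: value });
}

// Sample documents, invented for illustration.
[
  { _id: "d1", cId: 1, Status: true },
  { _id: "d2", cId: 2, Status: true },
  { _id: "d3", cId: 3, Status: false }
].forEach(map);

// Drop the unwanted cId after the fact (excludedCId is a hypothetical placeholder).
var excludedCId = 1;
var wanted = rows.filter(function (row) { return row.key !== excludedCId; });
console.log(JSON.stringify(wanted));
```

Because nothing is collected in a reduce step, the view stays a plain index and the "must shrink more rapidly" check never applies.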

Sorting CouchDB result by value

I'm brand new to CouchDB (and NoSQL in general), and am creating a simple Node.js + express + nano app to get a feel for it. It's a simple collection of books with two fields, 'title' and 'author'.
Example document:
{
"_id": "1223e03eade70ae11c9a3a20790001a9",
"_rev": "2-2e54b7aa874059a9180ac357c2c78e99",
"title": "The Art of War",
"author": "Sun Tzu"
}
Map function:
function(doc) {
if (doc.title && doc.author) {
emit(doc.title, doc.author);
}
}
Since CouchDB sorts by key and supports a 'descending=true' query param, it was easy to implement a filter in the UI to toggle sort order on the title, which is the key in my results set. Here's the UI:
[Screenshot: list of books with a link to sort the title in ascending or descending order]
But I'm at a complete loss on how to do this for the author field.
I've seen this question, which helped a poster sort by a numeric reduce value, and I've read a blog post that uses a list to also sort by a reduce value, but I've not seen any way to do this on a string value without a reduce.
If you want to sort by a particular property, you need to ensure that that property is the key (or, in the case of an array key, the first element in the array).
I would recommend using the sort key as the key, emitting a null value and using include_docs to fetch the full document to allow you to display multiple properties in the UI (this also keeps the deserialized value consistent so you don't need to change how you handle the return value based on sort order).
Your map functions would be as simple as the following.
For sorting by author:
function(doc) {
if (doc.title && doc.author) {
emit(doc.author, null);
}
}
For sorting by title:
function(doc) {
if (doc.title && doc.author) {
emit(doc.title, null);
}
}
Now you just need to change which view you call based on the selected sort order and ensure you use the include_docs=true parameter on your query.
You could also use a single view for this by emitting both at once...
emit(["by_author", doc.author], null);
emit(["by_title", doc.title], null);
... and then using the composite key for your query.
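One way to query such a combined view is a key range that covers a single prefix. The sketch below only builds the query string; the helper name `sortQuery` is made up, and it relies on CouchDB's view collation, in which an object sorts after any string.

```javascript
// Build the query string that selects one "prefix" of the combined view.
// In CouchDB's collation an object sorts after any string, so
// [prefix] .. [prefix, {}] spans every key that starts with that prefix.
function sortQuery(prefix, descending) {
  var low = JSON.stringify([prefix]);
  var high = JSON.stringify([prefix, {}]);
  // With descending=true the view is walked backwards, so the bounds swap.
  var start = descending ? high : low;
  var end = descending ? low : high;
  return "startkey=" + encodeURIComponent(start) +
    "&endkey=" + encodeURIComponent(end) +
    "&include_docs=true" +
    (descending ? "&descending=true" : "");
}

console.log(sortQuery("by_author", false));
console.log(sortQuery("by_title", true));
```

The swap of the two bounds for descending order is the part that is easiest to get wrong: with the bounds left in ascending order, a descending query returns no rows.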

How do I create an entry with a compound key with Couchbase?

I have some code running in NodeJS that sets the doc in the database:
cb.set(req.body.id, req.body.value, function (err, meta) {
res.send(req.body);
});
I have read about compound keys and it seems that feature can simplify my life. The question is how to properly add an entry with a compound key. The code below fails with a message that a string was expected, not an array.
cb.set([req.body.id, generate_uuid()], req.body.value, function (err, meta) {
res.send(req.body);
});
So should I convert my array to a string like '["patrick_bateman", "uuid_goes_here"]'?
If you're speaking about these "compound keys"...
These compound keys aren't set by the user directly; Couchbase Server builds them when you use a view. In a Couchbase view you can write map functions that emit compound keys. Example:
map: function (doc, meta) {
if (doc.type === "mytype"){
emit([doc.body.id, doc.uuid], null);
}
}
In this case Couchbase will build an index on that compound key, and when you query the view you'll be able to specify both parts of the key.
This is useful, for example, when you need to fetch documents that fall within a time range. Say you have docs with type "message" and you want all docs created between time 4 and time 7.
In this case map function will look like:
map: function (doc, meta) {
if (meta.type === "json"){
emit([doc.type, doc.timestamp], null);
}
}
and the query will contain the parameters startkey=["message", 4] and endkey=["message", 7].
You can also build string keys like "message:4" and fetch them with a simple get. For example, if you use sequential IDs for those messages (via the increment function), you can iterate through them with a plain for loop and the couchbase.get function.
Also check this blog post by Tug Grall about creating chat application with nodejs and couchbase.
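The string-key idea can be sketched without touching the SDK at all. The helper names and the ":" separator below are illustrative choices, not anything Couchbase prescribes; the resulting IDs would then be passed to whatever get call your client library provides.

```javascript
// Join the parts of a logical key into one string document ID, e.g. "message:4".
// The ":" separator is a convention chosen here, not something the server requires.
function makeDocId(type, seq) {
  return type + ":" + seq;
}

// List the IDs for a range of sequence numbers (inclusive), ready to be fetched one by one.
function idsInRange(type, from, to) {
  var ids = [];
  for (var seq = from; seq <= to; seq++) {
    ids.push(makeDocId(type, seq));
  }
  return ids;
}

console.log(idsInRange("message", 4, 7));
```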

CouchDB and multiple keys

Is it possible to use a similar query in CouchDB, i.e. one that uses two keys?
SELECT field FROM table WHERE value1="key1" OR value2="key2"
I was always using only one key.
function(doc) {
emit(doc.title, doc);
}
Thank you.
In CouchDB 0.9 and above you can retrieve the set of rows with the matching keys by POSTing to a view (or _all_docs) with a body like:
{"keys": ["key1", "key2", ...]}
Yes. Something like this should do the trick if I understand your question:
function(doc) {
var a = (doc.value1 && doc.value1 == "key1");
var b = (doc.value2 && doc.value2 == "key2");
if (a || b) {
emit(doc._id,doc.title);
}
}
Only emit the documents or values you need.
I'd add this to duluthian's reply:
emit(doc.title, null)
You can always pull out the "_id" and doc values using the view api.
You could create a view like this:
function(doc){
if(doc.value1) emit(doc.value1, doc.field);
if(doc.value2) emit(doc.value2, doc.field);
}
Then query it using llasram's suggestion to POST to the view with:
{"keys": ["key1", "key2", ...]}
Your client will have to be wary of duplicates though. A document where doc.value1 == "key1" && doc.value2 == "key2" will show up twice. Just use the row's id to filter the results.
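Filtering the duplicates on the client might look like this. The rows are invented sample data in the shape a view returns, and `uniqueById` is a hypothetical helper name.

```javascript
// Remove rows that refer to a document already seen, keeping the first occurrence.
function uniqueById(rows) {
  var seen = new Set();
  return rows.filter(function (row) {
    if (seen.has(row.id)) { return false; }
    seen.add(row.id);
    return true;
  });
}

// Sample rows, invented for illustration: document "x" matched both keys.
var rows = [
  { id: "x", key: "key1", value: "field of x" },
  { id: "y", key: "key1", value: "field of y" },
  { id: "x", key: "key2", value: "field of x" }
];
var unique = uniqueById(rows);
console.log(unique.map(function (row) { return row.id; }));
```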
One needs to expand a little bit on llasram's answer; the index must contain the values for both fields:
function(doc) {
emit("value1:"+doc.value1); // add check for undefined, null, etc.
emit("value2:"+doc.value2);
}
then query with
keys=["value1:key1","value2:key2"]
EDIT: This will however report the same document multiple times if it contains the matching value+key pairs.
You could do this ( assuming you want "dynamic parameters" ) by using 2 separate views, and a little client-side processing:
You would have one view on "field1", which you would query with "value1".
( getting a list of document IDs )
Then you query a second view on "field2", passing "value2", and getting another list of Doc IDs.
Now, you simply have to find the "intersection" of the 2 lists of ID numbers
( left as an exercise for the reader )

Count related documents in CouchDB

I'm pretty new to CouchDB and I still have some problems wrapping my head around the whole MapReduce way of querying my data...
To stay with the traditional "Blog" example, let's say I have 2 types of documents: post and comment... each comment document has a post_id field...
Is there a way I can get a list of posts with the number of comments for each of these posts with only 1 query? Let's say I want to display a list of post titles with the number of comments for each post like this:
My First Post: 4 comments
My Second Post: 6 comments
....
I know I can do the following:
function(doc) {
if(doc.type == "comment") {
emit(doc.post_id, 1);
}
}
and then reduce it like this:
function (key, values, rereduce) {
return sum(values);
}
which gives me a list of blog post ids, with the number of comments for each post. But then I need to fetch the blog post titles separately, since the only thing I have right now is their id...
So, is there a way I could retrieve a list of blog post titles, with the number of comments for each post, by doing only 1 query?
Have a look at View Collation:
http://wiki.apache.org/couchdb/View_collation?action=show&redirect=ViewCollation
You could do something like this:
function(doc) {
if(doc.type == "post") {
emit([doc._id, 'title', doc.title], 0);
}
if(doc.type == "comment") {
emit([doc.post_id, 'comments'], 1);
}
}
Then, with the sum reduce from the question and group=true, you'd get a view where each post gets two rows, one with the title and one with the comment count.
You can merge the rows together on the client, or you can use a "list" function to merge these groups of rows together within couchdb:
http://wiki.apache.org/couchdb/Formatting_with_Show_and_List
function list(head, req) {
var post;
var row;
var outputRow = function() {
if(post) { send(JSON.stringify(post) + "\n"); }
}
while(row = getRow()) {
if(!post || row.key[0] != post.id) {
outputRow();
post = {id:row.key[0]};
}
/* If the key is a triple, use part 3 as the value; otherwise assume it's a count */
var value = row.key.length === 3 ? row.key[2] : row.value;
post[row.key[1]] = value;
}
outputRow();
}
Note: not tested code!
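If you prefer to merge the rows on the client instead, something along these lines could work. It is a sketch, not tested against a live database: the sample rows are invented, and `mergeRows` is a made-up helper name.

```javascript
// Merge collated view rows into one object per post.
// Rows for the same post are adjacent because their keys share the first element.
function mergeRows(rows) {
  var posts = [];
  var post = null;
  rows.forEach(function (row) {
    if (!post || post.id !== row.key[0]) {
      post = { id: row.key[0], title: null, comments: 0 };
      posts.push(post);
    }
    if (row.key[1] === "title") {
      post.title = row.key[2];
    } else if (row.key[1] === "comments") {
      // Adding up works for grouped reduce output (one row with the total)
      // and for unreduced output (one row with value 1 per comment).
      post.comments += row.value;
    }
  });
  return posts;
}

// Sample rows, invented for illustration, in view order
// ("comments" sorts before "title" within a post).
var merged = mergeRows([
  { key: ["p1", "comments"], value: 4 },
  { key: ["p1", "title", "My First Post"], value: 0 },
  { key: ["p2", "comments"], value: 6 },
  { key: ["p2", "title", "My Second Post"], value: 0 }
]);
merged.forEach(function (p) {
  console.log(p.title + ": " + p.comments + " comments");
});
```

A post without comments simply keeps its count at 0, since only the title row is present for it.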
My experience is that in most "normal" cases you are better off having one big document containing both the post and the comments.
Of course, I am aware that it's not a good idea if you have thousands of comments. That's why I said "most normal cases". Don't throw out this option right off, as "improper".
You get all kinds of goodies: counting comments directly in the map function, easy (one-request) retrieval of the whole page from the database, ACID per post (with comments), etc. Plus, you don't need to think about tricks like view collation right now.
If it gets slow, you can always transform your data structure later on (hell, we used to do it every day with RDBMS).
If your use case is not totally unsuitable for this, I really advise you to try it. It works remarkably well.

Resources