How do I perform a parameterized query on CouchDB - couchdb

I would like to use CouchDB to store some data for me and then use RESTful api calls to get the data that I need. My database is called "test" and my documents all have a similar structure and look something like this (where hello_world is the document ID):
"hello_world" : {"id":123, "tags":["hello", "world"], "text":"Hello World"}
"foo_bar" :{"id":124, "tags":["foo", "bar"], "text":"Foo Bar"}
What I'd like to be able to do is have my users send a query such as "Give me all the documents that contain the words 'hello world'". I've been playing around with views, but it looks like they will only allow me to move one or more of those values into the "key" portion of the map function. That gives me the ability to do something like this:
http://localhost:5984/test/_design/search/_view/search_view?key="hello"
But this doesn't allow me to let my users specify their query string. For example, what if they searched for "hello world"? I'd have to do two queries, one for "hello" and one for "world", and then write a bunch of JavaScript to combine the results, remove duplicates, etc. (YUCK!). What I really want is to be able to do something like this:
http://localhost:5984/test/_design/search/_view/search_view?term="hello world"
Then use the parameter "hello world" in the view's map/reduce functions to find all the documents that contain both "hello" and "world" in the tags array. Is this sort of thing even possible with CouchDB? Is there another way to accomplish this inside a view that I'm not thinking of?

CouchDB views do not support faceted search, full-text search, or result intersection. The couchdb-lucene plugin lets you do all of these things.
http://github.com/rnewson/couchdb-lucene/tree/master

Technically this is possible if, for each document, you emit every set in the power set of the document's tags as a key. The elements of each key must be ordered, and your query would have to supply its tags in the same order.
function map(doc) {
  function powerset(array) { ... }
  var powerset_of_tags = powerset(doc.tags);
  for (var i = 0; i < powerset_of_tags.length; i++) {
    emit(powerset_of_tags[i], doc);
  }
}
For the doc "hello_world" : {"id":123, "tags":["hello", "world"], "text":"Hello World"} this would emit:
{ key: [], doc: ... }
{ key: ['hello'], doc: ... }
{ key: ['world'], doc: ... }
{ key: ['hello', 'world'], doc: ... }
Although this is possible, I would consider it a rather awkward solution. I don't want to imagine the disk usage of the view for a larger number of tags: the number of emitted keys grows as 2^n for a document with n tags.
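For illustration, here is a minimal sketch of what the elided powerset helper could look like, together with a way to build the matching query key. Both function names are hypothetical, and lowercasing the search terms is my own assumption, not part of the answer above. Sorting on both sides is what keeps the emitted keys and the query key in the same order.

```javascript
// Hypothetical helper: returns every subset of `items`, each in sorted order.
function powerset(items) {
  var sorted = items.slice().sort();
  var subsets = [[]];
  sorted.forEach(function (item) {
    // Every subset found so far also exists in a variant that contains `item`.
    var withItem = subsets.map(function (subset) {
      return subset.concat([item]);
    });
    subsets = subsets.concat(withItem);
  });
  return subsets;
}

// Hypothetical helper: turns a user's search string into a JSON view key.
// Assumption: tags are stored in lowercase, so the terms are lowercased too.
function queryKey(searchString) {
  var terms = searchString.toLowerCase().split(/\s+/).filter(Boolean);
  return JSON.stringify(terms.sort());
}
```

The view would then be queried with key= followed by encodeURIComponent(queryKey("hello world")). A document with 10 tags already produces 1024 rows, which is the growth problem described above.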

Under the hood, CouchDB stores data in a B-tree, so you should use views to pre-process it. The limitation in this case is that you cannot search by regex. As an alternative, you can search the keys of a view by prefix (or by suffix, if you emit the keys reversed).
Note: don't use emit(key, doc), because it copies the whole document into the index. Use emit(key, null) or emit(key) and add include_docs=true when you query.
You can use your tags as keys to query.
// view (map) function: emit each tag as a key, with no value
function (doc) {
  if (doc.tags) {
    for (var i = 0; i < doc.tags.length; i++) {
      emit(doc.tags[i], null);
    }
  }
}
// view query (this is a view query, not a Mango query)
db.query(your_view_name, { startkey: startkey, endkey: endkey, include_docs: true });
Note:
startkey = "h", "he", "hell", ... (the prefix to search for)
endkey = startkey + "\uffff";
Plus: never use a Mango query with a regex if you don't want performance to go to hell. I fixed a performance issue, from 2 minutes down to 2 seconds, by switching to a view function.
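As a rough sketch of the prefix trick described above, assuming a view whose keys are plain strings: the two helper names below are made up for illustration. The second helper reflects the fact that CouchDB expects key parameters in the URL to be JSON-encoded.

```javascript
// Hypothetical helper: view parameters that match every key starting with `prefix`.
function prefixRange(prefix) {
  return {
    startkey: prefix,
    // "\uffff" sorts after ordinary characters, so it closes the range.
    endkey: prefix + "\uffff",
    include_docs: true
  };
}

// Hypothetical helper: JSON-encode each value and build a URL query string.
function toQueryString(params) {
  return Object.keys(params).map(function (name) {
    return name + "=" + encodeURIComponent(JSON.stringify(params[name]));
  }).join("&");
}
```

For example, prefixRange("he") could be passed as the options object of the query call shown above, or turned into a query string for a plain HTTP request against the view.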

Related

Query CosmosDB when document contains Dictionary

I have a problem with querying CosmosDB document which contains a dictionary. This is an example document:
{
"siteAndDevices": {
"4cf0af44-6233-402a-b33a-e7e35dbbee6a": [
"f32d80d9-e93a-687e-97f5-676516649420",
"6a5eb9fa-c961-93a5-38cc-ecd74ada13ac",
"c90e9986-5aea-b552-e532-cd64a250ad10",
"7d4bfdca-547a-949b-ccb3-bbf0d6e5d727",
"fba51bfe-6a5e-7f25-e58a-7b0ced59b5d8",
"f2caac36-3590-020f-ebb7-5ccd04b4412c",
"1b446af7-ba74-3564-7237-05024c816a02",
"7ef3d931-131e-a639-10d4-f4dd5db834ca"
]
},
"id": "f9ef9fb6-4b70-7d3f-2bc8-c3d335018624"
}
I need to get all documents where a provided GUID is in the list, i.e. in a dictionary value (I don't know the dictionary key). I read somewhere here that it is not possible to iterate through the keys of a dictionary in CosmosDB (maybe this has changed since then, but I didn't find anything in the documentation), but maybe someone has an idea. I cannot change the format of the document.
I tried to do it in Linq, but I didn't get any results.
var query = _documentClient
.CreateDocumentQuery<Dto>(DocumentCollectionUri())
.Where(d => d.SiteAndDevices.Any(x => x.Value.Contains("f32d80d9-e93a-687e-97f5-676516649420")))
.AsDocumentQuery();
Not sure of the Linq query, but with SQL, you'd need something like this:
SELECT * FROM c
where array_contains(c.siteAndDevices['4cf0af44-6233-402a-b33a-e7e35dbbee6a'],"f32d80d9-e93a-687e-97f5-676516649420")
This is a strange document format though, as you've named your key with an id:
"siteAndDevices": {
"4cf0af44-6233-402a-b33a-e7e35dbbee6a": ["..."]
}
Your key is "4cf0af44-6233-402a-b33a-e7e35dbbee6a", which forces you to use a different syntax to reference it:
c.siteAndDevices['4cf0af44-6233-402a-b33a-e7e35dbbee6a']
You'd save yourself a lot of trouble refactoring this to something like:
{
"id": "dictionary1",
"siteAndDevices": {
"deviceId": "4cf0af44-6233-402a-b33a-e7e35dbbee6a",
"deviceValues": ["..."]
}
}
You can refactor further, such as using an array to contain multiple device id + value combos.
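If the documents can be migrated at some point, the array variant mentioned last could be produced by a small one-off transform. This is only a sketch: the function name is hypothetical, and the deviceId / deviceValues property names are simply carried over from the example above. The point of the array shape is that a query can iterate an array without knowing any key in advance.

```javascript
// Hypothetical migration helper: turns the dictionary into an array of
// { deviceId, deviceValues } entries so that no property name is dynamic.
function toDeviceList(doc) {
  var entries = Object.keys(doc.siteAndDevices).map(function (key) {
    return { deviceId: key, deviceValues: doc.siteAndDevices[key] };
  });
  return { id: doc.id, siteAndDevices: entries };
}
```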

Reduce output must shrink more rapidly -- Reducing to a list of documents

I have a few documents in my CouchDB database with JSON as below. The cId differs for each. I have created a view with a map/reduce function to filter out a few documents and return a list of JSON documents.
Document structure -
{
"_id": "ccf8a36e55913b7cf5b015d6c50009f7",
"_rev": "8-586130996ad60ccef54775c51599e73f",
"cId": 1,
"Status": true
}
Here is the sample map:
function(doc) {
if(doc.Key && doc.Value && doc.Status == true)
emit(null, doc);
}
Here is the sample reduce:
function(key, values, rereduce){
var kv = [];
values.forEach(function(value){
if(value.cId != <some_val>){
kv.push({"k": value.cId, "v" : value});
}
});
return kv;
}
If there are two documents and the reduce output is a list containing one document, this works fine. But if I add one more document (with cId = 2), it throws the error "reduce output must shrink more rapidly". Why does this happen? And how can I achieve what I intend to do?
The cause of the error is that the reduce function does not actually reduce anything (rather, it collects objects). The documentation mentions this:
The way the B-tree storage works means that if you don’t actually
reduce your data in the reduce function, you end up having CouchDB
copy huge amounts of data around that grow linearly, if not faster
with the number of rows in your view.
CouchDB will be able to compute the final result, but only for views
with a few rows. Anything larger will experience a ridiculously slow
view build time. To help with that, CouchDB since version 0.10.0 will
throw an error if your reduce function does not reduce its input
values.
It is unclear to me what you intend to achieve.
Do you want to retrieve a list of docs based on certain criteria? In this case, a view without reduce should suffice.
Edit: If the desired result depends on a value stored in a certain document, then CouchDB has a feature called a list function. It is a design function that provides access to all docs of a given view if you pass include_docs=true.
A list URL follows this pattern:
/db/_design/foo/_list/list-name/view-name
Like views, lists are defined in a design document:
{
"_id" : "_design/foo",
"lists" : {
"bar" : "function(head, req) {
var row;
while (row = getRow()) {
if (row.doc._id === 'baz') {
// Do stuff based on a certain doc
}
}
}"
},
... // views and other design functions
}
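To make the "view without reduce" suggestion concrete, here is a minimal sketch. It assumes the filter only needs the Status and cId fields from the sample document, which is my reading of the question rather than something it states. The emit stub and the excludeCId helper are hypothetical and exist only so the sketch runs outside CouchDB.

```javascript
// Stub of CouchDB's emit(), present only so this runs outside CouchDB.
var rows = [];
function emit(key, value) {
  rows.push({ key: key, value: value });
}

// Map function: one row per active document, keyed by cId, with no reduce.
function map(doc) {
  if (doc.Status === true && doc.cId !== undefined) {
    emit(doc.cId, null);
  }
}

// Hypothetical client-side step: drop the rows for one particular cId.
function excludeCId(viewRows, excluded) {
  return viewRows.filter(function (row) {
    return row.key !== excluded;
  });
}

// Simulate indexing three documents.
[
  { _id: "a", cId: 1, Status: true },
  { _id: "b", cId: 2, Status: true },
  { _id: "c", cId: 3, Status: false }
].forEach(function (doc) { map(doc); });
```

Because the map emits null as the value, the view would be queried with include_docs=true to get the full documents back.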

Sorting CouchDB result by value

I'm brand new to CouchDB (and NoSQL in general), and am creating a simple Node.js + express + nano app to get a feel for it. It's a simple collection of books with two fields, 'title' and 'author'.
Example document:
{
"_id": "1223e03eade70ae11c9a3a20790001a9",
"_rev": "2-2e54b7aa874059a9180ac357c2c78e99",
"title": "The Art of War",
"author": "Sun Tzu"
}
Map function:
function(doc) {
if (doc.title && doc.author) {
emit(doc.title, doc.author);
}
}
Since CouchDB sorts by key and supports a 'descending=true' query param, it was easy to implement a filter in the UI to toggle sort order on the title, which is the key in my results set. Here's the UI:
[Screenshot: list of books with a link to sort the title in ascending or descending order]
But I'm at a complete loss on how to do this for the author field.
I've seen this question, which helped a poster sort by a numeric reduce value, and I've read a blog post that uses a list to also sort by a reduce value, but I've not seen any way to do this on a string value without a reduce.
If you want to sort by a particular property, you need to ensure that that property is the key (or, in the case of an array key, the first element in the array).
I would recommend using the sort key as the key, emitting a null value and using include_docs to fetch the full document to allow you to display multiple properties in the UI (this also keeps the deserialized value consistent so you don't need to change how you handle the return value based on sort order).
Your map functions would be as simple as the following.
For sorting by author:
function(doc) {
if (doc.title && doc.author) {
emit(doc.author, null);
}
}
For sorting by title:
function(doc) {
if (doc.title && doc.author) {
emit(doc.title, null);
}
}
Now you just need to change which view you call based on the selected sort order and ensure you use the include_docs=true parameter on your query.
You could also use a single view for this by emitting both at once...
emit(["by_author", doc.author], null);
emit(["by_title", doc.title], null);
... and then using the composite key for your query.
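A sketch of how the query parameters for such a combined view might be built follows. The helper name is hypothetical. It relies on CouchDB's collation, in which an object such as {} sorts after any string, so ["by_author", {}] closes the range for the "by_author" prefix. With descending=true, the start and end keys have to be swapped.

```javascript
// Hypothetical helper: view parameters for one "section" of the combined view.
function sortParams(prefix, descending) {
  var low = [prefix];        // sorts before every [prefix, value] key
  var high = [prefix, {}];   // {} sorts after any string in CouchDB collation
  return {
    startkey: descending ? high : low,
    endkey: descending ? low : high,
    descending: descending,
    include_docs: true
  };
}
```

With nano, an object like sortParams("by_author", false) could be passed as the params argument of db.view, so the UI only needs to switch the prefix and the descending flag.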

How do I create an entry with a compound key with Couchbase?

I have some code running in NodeJS that sets the doc in the database:
cb.set(req.body.id, req.body.value, function (err, meta) {
res.send(req.body);
});
I have read about compound keys, and it seems that feature could simplify my life. The question is how to properly add an entry with a compound key. The code below fails with a message that a string was expected, not an array.
cb.set([req.body.id, generate_uuid()], req.body.value, function (err, meta) {
res.send(req.body);
});
So should I convert my array to a string like '["patrick_bateman", "uuid_goes_here"]'?
If you're talking about these "compound keys"...
These compound keys aren't set by the user directly; they are created by the Couchbase server when you use a view. In a Couchbase view you can write map functions that emit compound keys. Example:
map: function (doc, meta) {
  if (doc.type === "mytype") {
    emit([doc.body.id, doc.uuid], null);
  }
}
In this case Couchbase will build an index on that compound key, and when you query the view you'll be able to specify both parts of the key.
This is useful, for example, when you need the documents that fall within some time range. Say you have docs with type "message" and you want all docs created between time 4 and time 7.
In this case map function will look like:
map: function (doc, meta) {
  if (meta.type === "json") {
    emit([doc.type, doc.timestamp], null);
  }
}
and the query will contain the params startKey=["message", 4] and endKey=["message", 7].
You can also create complex string keys like "message:4" and then fetch them with a simple get. For example, if you use sequential IDs (by using the increment function) for those messages, you can easily iterate through them using a simple for loop and the couchbase.get function.
Also check this blog post by Tug Grall about creating chat application with nodejs and couchbase.
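A small sketch of the string-key alternative: the "message:" prefix follows the example above, while both helper names are made up. Each generated key would then be passed to a get call in a loop.

```javascript
// Hypothetical helper: builds the document key for message number n.
function messageKey(n) {
  return "message:" + n;
}

// Hypothetical helper: lists the keys for messages first..last (inclusive).
function messageKeys(first, last) {
  var keys = [];
  for (var n = first; n <= last; n++) {
    keys.push(messageKey(n));
  }
  return keys;
}
```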

CouchDB and multiple keys

Is it possible to use a similar query in CouchDB, i.e. one with two keys?
SELECT field FROM table WHERE value1="key1" OR value2="key2"
I was always using only one key.
function(doc) {
emit(doc.title, doc);
}
Thank you.
In CouchDB 0.9 and above you can POST to a view (or _all_docs) with a body like:
{"keys": ["key1", "key2", ...]}
This retrieves the set of rows with the matching keys.
Yes. Something like this should do the trick if I understand your question:
function(doc) {
  var a = (doc.value1 && doc.value1 == "key1");
  var b = (doc.value2 && doc.value2 == "key2");
  if (a || b) {
    emit(doc._id, doc.title);
  }
}
Only emit the documents or values you need.
I'd add this to duluthian's reply:
emit(doc.title, null)
You can always pull out the "_id" and doc values using the view api.
You could create a view like this:
function(doc){
if(doc.value1) emit(doc.value1, doc.field);
if(doc.value2) emit(doc.value2, doc.field);
}
Then query it using llasram's suggestion to POST to the view with:
{"keys": ["key1", "key2", ...]}
Your client will have to be wary of dups though. A document where doc.value1 == "key1" && doc.value2 == "key2" will show up twice. Just use _id to filter the results.
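Filtering the duplicates on the client could look roughly like this. It assumes the usual view row shape, where each row carries the document's id, and the helper name is hypothetical.

```javascript
// Hypothetical helper: keeps only the first row for each document id.
function uniqueById(rows) {
  var seen = Object.create(null);
  return rows.filter(function (row) {
    if (seen[row.id]) {
      return false;
    }
    seen[row.id] = true;
    return true;
  });
}
```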
One needs to expand a little bit on llasram's answer; the index must contain the values for both fields:
function(doc) {
emit("value1:"+doc.value1); // add check for undefined, null, etc.
emit("value2:"+doc.value2);
}
then query with
keys=["value1:key1","value2:key2"]
EDIT: This will however report the same document multiple times if it contains the matching value+key pairs.
You could do this ( assuming you want "dynamic parameters" ) by using 2 separate views, and a little client-side processing:
You would have one view on "field1", which you would query with "value1".
( getting a list of document IDs )
Then you query a second view on "field2", passing "value2", and getting another list of Doc IDs.
Now, you simply have to find the "intersection" of the 2 lists of ID numbers
( left as an exercise for the reader )
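For what it's worth, the exercise is only a few lines. This sketch assumes both inputs are plain arrays of document ID strings, and the function name is made up.

```javascript
// Hypothetical helper: IDs present in both lists, in the order of the first
// list, with duplicates removed.
function intersectIds(idsA, idsB) {
  var inB = Object.create(null);
  idsB.forEach(function (id) {
    inB[id] = true;
  });
  var seen = Object.create(null);
  return idsA.filter(function (id) {
    if (!inB[id] || seen[id]) {
      return false;
    }
    seen[id] = true;
    return true;
  });
}
```

Note that an intersection gives you the documents matching both conditions (AND). For the OR in the original question, you would merge the two lists and remove duplicates instead.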
