Searching required data in couchdb - couchdb

I have documents like,
{_id:1,
name:"john"
}
{_id:2,
name:"john boss"
}
{_id:3,
name:"jim"
}
I have to search the data where ever john is stored in documents. Suppose, if i search "john" the documents should get _id:1 & _id:2 related data. Please guide me to get the result.
I appreciate if any one provide the solutions.

I suggest a CouchDB view to show you all "words" from the "name" field.
function(doc) {
// map function: _design/example/_view/names
if(!doc.name) // Optionally do more testing for doc type, etc. here.
return
// Emit one row per word in the name field (first name, last name, etc.).
var words = doc.name.split(/\s+/)
for(var i = 0; i < words.length; i++)
emit(words[i].toLowerCase(), doc._id)
}
Now if you query /db/_design/example/_view/names?key="john", you will get two rows: one for doc id 1, and another for id 2. I also added a conversion to lower case, so searching for "john" will match people named "John".
Duplicates are possible: the same doc ID listed multiple times, e.g. for {"name":"John John"}; however you are guaranteed that all duplicate rows will be adjacent.
You can also add ?include_docs=true to your request to get the full document for each row.

Related

How to list and remove duplicate values using mongoose

Following the comments on Mongoose: how to define a combination of fields to be unique?
First let's get the array of data sorted by all values which supposed to be unique.
Assuming we're talking about strings (as in this question), we can combine them to create one long string that is supposed to be unique.
Being sorted, if there are duplicate values they'll show up right after the other, so let's look for results that repeat themselves:
var previousName;
Person.find().sort('firstName lastName').exec().each(function (person) {
var name = person.firstName + person.lastName;
if (name == previousName) {
console.log(name);
person.remove();
}
previousName = name;
})

How to get from CouchDB only certain fields of certain documents for a single request?

For example I have a thousands of documents with same structure, for example:
{
"key_1":"value_1",
"key_2":"value_2",
"key_3":"value_3",
...
...
}
And I need to get, let's say key_1, key_3 and key_23 from some set of documents with known IDs, for example, I need to process only 5 documents while my DB contains several thousands. Each time I have a different set of keys and document IDs. Is it possible to get that information for a one request?
You can use a list function (see: this, this, and this).
Since you know the ids, you can then query _all_docs with the list function:
POST /{db}/_design/{ddoc}/_list/{func}/_all_docs?include_docs=true&columns=["key_1","key_2","key_3"]
Accept: application/json
Content-Length: {whatever}
{
"keys": [
"docid002",
"docid005"
]
}
The list function needs to look at documents, and send the appropriate JSON for each one. Not tested:
(function (head, req) {
send('{"total_rows":' + head.total_rows + ',"offset":' + head.offset + ',"rows":[');
var columns = JSON.parse(req.query.columns);
var delim = '';
var row;
while (row = getRow()) {
var doc = {};
for (var k in columns) {
doc[k] = row.doc[k];
}
row.doc = doc;
send(delim + toJSON(row));
delim = ',';
}
send(']}');
})
Whether this is a good idea, I'm not sure. If your documents are big, and bandwidth savings important, it might.
Yes, that’s possible. Your question can be broken up into two distinct problems:
Getting only a part of the document (in your example: key_1, key_3 and key_23). This can be done using a view. A view is saved into a design document. See the wiki for more info on how to create views.
Retrieving only certain documents, which are defined by their ID. When querying views, you cannot only specify a single ID (or rather key), but also an array of keys, which is what you would need here. Again, see the section on querying views in the wiki for explanations and examples.
Even though you only need a subset of values from a document, you may find that the system as a whole performs better if you just ask for the entire document then select the values you need from that result.
To only get the specific key value pairs you need to create a view that has view entries with a multipart key consisting of the doc id and doc item name, with value of the corresponding doc item.
So your map function would look something like:
function(doc){
for(var i = 1; i < doc.keysInDoc; i++){
var k = "key_"+i;
emit([doc._id, k], doc.[k]);
}
}
You can then use multi key lookup with each key being of the form ["docid12345", "key_1"], ["docid56789", "key_23"], etc.
So a query like:
http://host:5984/db/_design/design/_view/view?&keys=[["docid002","key_8"],["docid005","key_7"]]
will return
{"total_rows":84,"offset":67,"rows":[
{"id":"docid002","key":["docid002","key_8"],"value":"value d2_k8"},
{"id":"docid005","key":["docid005","key_12"],"value":"value d5_k12"}
]}

Cloudant 1 to many function

I’ve just started to use Cloudant and I just can’t get my head around the map functions. I’ve been fiddling with the data below but it isn’t working out as I expected.
The relationship is, a user can have many vehicles. A vehicle belongs to 1 user. The vehicle ‘userId’ is the key of the user. There is a bit of redundancy as in user the _id and userId is the same, guess later is not required.
Anyhow, how can I find for a/every user, the vehicles which belong to it? The closest I’ve come through trial and error is a result which displays the owner of every vehicle, but I would like it the other way round, the user and the vehicles belonging to it. All the examples I’ve found use another document which ‘joins’ two or more documents, but I don’t need to do that?
Any point in the right direction appreciated - I really have no idea.
function (doc) {
if (doc.$doctype == "vehicle")
{
emit(doc.userId, {_id: doc.userId});
}
}
EDIT: Getting closer. I'm not sure exactly what I was expecting, but the result seems a bit 'messy'. Row[0] is the user document, row[n > 0] are the vehicle documents. I guess it's fine when a startkey/endkey is used, but without the results are a bit jumbled up.
function (doc) {
if (doc.$doctype == 'user') {
emit([doc._id, 0], doc);
} else if (doc.$doctype == 'vehicle') {
emit([doc.userId, 1, doc._id], doc);
}
}
A user is described as,
{
"_id": "user:10",
"firstname": “firstnamehere",
"secondname": “secondnamehere",
"userId": "user:10",
"$doctype": "user"
}
a vehicle is described as,
{
"_id": "vehicle:4002”,
“name”: “avehicle”,
"userId": "user:10",
"$doctype": "vehicle",
}
You're getting in the right direction! You already got that right with the global IDs. Having the type of the document as part of the ID in some form is a very good idea, so that you don't get confused later (all documents are in the same "pot").
Here are some minor problems with your current solution (before getting to your actual question):
Don't emit the doc as value in emit(key, value). You can always ask for the document that belongs to a view row by querying with include_docs=true. Having the doc as view value increases the view indexes a lot. When you don't need a specific value, use emit(key, null).
You also don't need the ID in the emit value. You'll get the ID of the document that belongs to a view row as part of the row anyway.
View Collation
Now to your problem of aggregating the vehicles with their user. You got the basic pattern right. This pattern is called view collation, you can read more about it in the CouchDB docs (ignore that it is in the "Couchapp" section).
The trick with view collation is that you return two or more types of documents, but make sure that they are sorted in a way that allows for direct grouping. Thus it is important to understand how CouchDB sorts the view result. See the collation specification for more information on that one. An important key to understanding view collation is that rows with array keys are sorted by key elements. So when two rows have the same key[0], they sort by key[1]. If that's equal as well, key[2] is considered, and so on.
Your map function frist groups users and vehicles by user ID (key[0]). Your map function then uses the fact that 0 sorts before 1 in the second element of the key, so your view will contain the following:
user 1
vehicle of user 1
vehicle of user 1
vehicle of user 1
user 2
user 3
vehicle of user 3
user 4
etc.
As you can see, the vehicles of a user immediately follow their user. Thus you can group this result into aggregates without performing expensive sort or lookup operations.
Note that users are sorted according to their ID, and vehicles within users also according to their ID. This is because you use the IDs in the key array.
Creating Queries
Now that view isn't worth much if you can't query according to your needs. A view as you have it supports the following queries:
Get all users with their vehicles
Get a range of users with their vehicles
Get a single user with its vehicles
Get a single user without vehicles (you could also use the _all_docs view for that though)
Example query for "all users between user 1 and user 3 (inclusive) with their vehicles"
We want to query for a range, so we use startkey and endkey in the query:
startkey=["user:1", 0]
endkey=["user:3", 1, {}]
Note the use of {} as sentinel value, which is required so that the end key is larger than any row that has a key of ["user:3", 1, (anyConceivableVehicleId)]

How to search and sort with CouchDB in one map function

I'm stumbling a bit with my CouchDB knowledge.
I have a database of content that is tagged with an array of tags and has a created date.
I want to create a view that pulls a limited number of newest stories tagged with a specific tag.
For example, the newest 6 stories tagged "Business."
Ran across this question, which seems to get me almost to where I need to go, but I'm missing one key element, which I think is how to craft the query string to sort by one key while searching by the other.
Here's my map function.
function(doc) {
if (doc.published == "yes" && doc.type == "news") {
for (var i = 0; i < doc.tags.length; i++) {
if (doc.tags[i]) {
emit([doc.created, doc.tags[i]], doc);
}
}
}
}
So how do I query that view for a all documents tagged "Business" that are the newest documents based on created.
The created attribute is a date sortable format.
First, I would switch the order of your emit:
emit([doc.tags[i], doc.created]);
(leave out doc as well, you can just add include_docs=true to get the entire document, and your view won't take up so much disk-space in the process)
Now you can query for the all the stories tagged as "Business" by using the following querystring:
startkey=["Business"]&endkey=["Business",{}]
You'll get all the documents with the tag business, and they'll be sorted by date.
This takes advantage of view collation, which basically is the rules governing how indexes are sorted/queried. For complex keys like this, the sorting is done for each item of the array separately. (ie. the first key is sorted first, the second key is sorted second, etc) This is why the order matters, as you must always move from left to right when querying a view index.
If you want the 6 most recent, your querystring will need to change:
descending=true&limit=6&endkey=["Business"]&startkey=["Business",{}]
NOTICE You need to swap the startkey/endkey values, due to how the descending parameter works. See the View reference page on the wiki for further explanation.
OK, I think I figured this out, but I'm not quite certain I fully understand it.
I found this story about complex keys and searching and sorting.
My map function looks like this:
function(doc) {
if (doc.published == "yes" && doc.type == "news") {
for (var i = 0; i < doc.tags.length; i++) {
if (doc.tags[i]) {
emit([doc.tags[i], doc.created], doc);
}
}
}
}
And to query and sort using it, the query looks like this.
http://localhost:5984/database/_design/story/_view/tagged?limit=10&startkey=["Business"]&endkey=["Business",{}]&descending=false
I'm getting the results I want, but I'm not entirely certain I understand it all.

How to restrict rows in a list inside a document?

I have a document something like:
{"name":"Stock levels",
"content":[
{"sku":"328143",
"name":"Battery",
"stocklevel":"100",
"warehouse":"london"},
{"sku":"328143",
"name":"Battery",
"stocklevel":"20",
"warehouse":"manchester"},
{"sku":"328143",
"name":"Battery",
"stocklevel":"30",
"warehouse":"brighton"}]}
Where the list "content" could have quite a lot of rows.
What I want to do is return an internal row count and just one row from the list.
e.g.
{"name":"Stock levels",
"rows" : "2300",
"content":[
{"sku":"328143",
"name":"Battery",
"stocklevel":"100",
"warehouse":"london"}]}
How might I achieve this in CouchDb? My initial thought is using a list to effectively rebuild the document and inserting the extra rows field and restricting the number of rows return internally, but I am not sure if this is the best approach.
Thanks
You can use a view,
the following example allow you to search based on document id
(which is emit as key)
function(doc)
{
if (doc._id == "xxx")
{
emit(doc._id, {name:doc.name, rows:doc.content.length, content:doc.content[0]});
}
}

Resources