How to use a CouchDB Mango query (/db/_find) with an index to select multiple _id keys

I am using CouchDB 3.1.1 to perform Mango queries against a database containing a large number of documents. A very common requirement in my application is to query a very specific, dynamic set of documents. As far as I understand, these are my only choices:
1. Make multiple requests to /db/_find, each with a distinct "_id".
2. Make a single call to /db/_find.
For the second choice, I can either:
- use an "$or" array of "_id": value pairs, or
- use an "$or" array of values under the "_id" key.
I would prefer the second choice, since making multiple POST requests incurs overhead. Unfortunately, using "$or" seems to prevent the query engine from making use of the "_id" index.
Thus, choice #1 returns in a speedy 2 ms per request, but the results are not sorted (requiring my application to do the sorting). Choice #2, given an array of 2 _ids, takes over 2 seconds to return regardless of the "$or" syntax.
What is the most efficient way to use a CouchDB Mango query index against a specific set of documents?
Fast Example: Results using a single _id
{
"selector": {
"_id": "184094"
},
"fields": [
"_id"
]
}
documents examined: 26,312
results returned: 1
execution time: 2 ms
Slow Example: Results using $or of key / value pairs
{
"selector": {
"$or": [
{
"_id": "184094"
},
{
"_id": "157533"
}
]
},
"fields": [
"_id"
]
}
documents examined: 26,312
results returned: 2
execution time: 2,454 ms
Slow Example: Results using $or array of values
{
"selector": {
"_id": {
"$or": [
"184094",
"157533"
]
}
},
"fields": [
"_id"
]
}
documents examined: 26,312
results returned: 2
execution time: 2,522 ms
Slow Example: Results using $in (which is illegal but still returns results)
{
"selector": {
"_id": {
"$in": [
"184094",
"157533"
]
}
},
"fields": [
"_id"
]
}
documents examined: 26,312
results returned: 2
execution time: 2,618 ms
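The three multi-_id selectors above differ only in where the list of ids goes. Here is a minimal Python sketch that builds each request body from a list of ids, so the variants can be timed side by side. The helper name multi_id_selectors is made up for this illustration, and nothing here sends a request.

```python
import json


def multi_id_selectors(ids):
    """Build the three multi-_id request bodies shown above.

    Hypothetical helper: it only constructs JSON bodies for POST /db/_find.
    """
    ids = list(ids)
    return {
        # "$or" over {"_id": value} pairs
        "or_pairs": {
            "selector": {"$or": [{"_id": i} for i in ids]},
            "fields": ["_id"],
        },
        # "$or" over bare values under the "_id" key
        "or_values": {"selector": {"_id": {"$or": ids}}, "fields": ["_id"]},
        # "$in" over bare values under the "_id" key
        "in_values": {"selector": {"_id": {"$in": ids}}, "fields": ["_id"]},
    }


if __name__ == "__main__":
    for name, body in multi_id_selectors(["184094", "157533"]).items():
        print(name, json.dumps(body))
```

Generating the bodies from one list keeps the comparison fair, because every variant is guaranteed to ask for exactly the same ids.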
Index: The registered index for _id
{
"_id": "_design/508b5b51e6085c2f96444b82aced1e5dfec986b2",
"_rev": "1-f951eb482f9a521752adfdb6718a6a59",
"language": "query",
"views": {
"foo-index": {
"map": {
"fields": {
"_id": "asc"
},
"partial_filter_selector": {}
},
"reduce": "_count",
"options": {
"def": {
"fields": [
"_id"
]
}}}}}
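For reference, a design document of this shape is what CouchDB stores when an index is created through POST /db/_index. Below is a sketch of a request body that should define an equivalent index on _id. The helper name json_index_request is hypothetical, and the body is only built, not sent.

```python
def json_index_request(fields, name, ddoc=None):
    """Build a body for POST /db/_index describing a "json" index.

    Hypothetical helper for illustration; "ddoc" is only included when
    a design document name is given.
    """
    body = {"index": {"fields": list(fields)}, "name": name, "type": "json"}
    if ddoc is not None:
        body["ddoc"] = ddoc
    return body


if __name__ == "__main__":
    # The index name is taken from the design document above.
    print(json_index_request(["_id"], "foo-index"))
```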
Explain: An 'explain' summary for one of the slow queries. Note that the registered index was used.
{
"dbname": "dnp_person_comment",
"index": {
"ddoc": "_design/508b5b51e6085c2f96444b82aced1e5dfec986b2",
"name": "foo-index",
"type": "json",
"partitioned": false,
"def": {
"fields": [
{
"_id": "asc"
}
]
}
},
"partitioned": false,
"selector": {
"$or": [
{
"_id": {
"$eq": "184094"
}
},
{
"_id": {
"$eq": "157533"
}
}
]
},
"opts": {
"use_index": [],
"bookmark": "nil",
"limit": 25,
"skip": 0,
"sort": {},
"fields": [
"_id"
],
"partition": "",
"r": [
49
],
"conflicts": false,
"stale": false,
"update": true,
"stable": false,
"execution_stats": false
},
"limit": 25,
"skip": 0,
"fields": [
"_id"
],
"mrargs": {
"include_docs": true,
"view_type": "map",
"reduce": false,
"partition": null,
"start_key": [],
"end_key": [
"<MAX>"
],
"direction": "fwd",
"stable": false,
"update": true,
"conflicts": "undefined"
}
}
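One possible reading of this output: the index was selected, but the mrargs range runs from an empty start_key to "<MAX>". I assume (the output does not say so explicitly) that such a range covers every row of the index, which would be consistent with the 26,312 documents examined. A small heuristic, written as a Python sketch with a made-up function name, that flags this pattern in an explain response:

```python
def scans_whole_index(explain):
    """Heuristic: True when the explain output's key range looks unbounded.

    Assumption: start_key == [] together with end_key == ["<MAX>"] means
    the view range is not narrowed by the selector.
    """
    mrargs = explain.get("mrargs", {})
    return mrargs.get("start_key") == [] and mrargs.get("end_key") == ["<MAX>"]


if __name__ == "__main__":
    slow = {"mrargs": {"start_key": [], "end_key": ["<MAX>"]}}
    print(scans_whole_index(slow))
```

Running the same check against the explain output of the fast single-_id query would show whether its range is narrower.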

Related

Unable to run a Mango query

This is my query (I need to find all documents in the ratings partition):
{
"selector": {
"_id": {
"$regex": "^rati"
}
},
"fields": [
"MovieID",
"UserId",
"Rating"
],
"limit": 10,
"sort": [
{
"MovieID": "asc"
}
]
}
When I run this query I get this error: Error running query. Reason: (no_usable_index) No global index exists for this sort, try indexing by the sort fields.
If I remove
"sort": [
{
"MovieID": "asc"
}
]
everything works fine. Honestly, I'm going crazy; I can't understand where I'm going wrong.
I've also tried this query:
{
"selector": {
"_id": {
"$regex": "^rati"
},
"MovieID": {
"$gte": 0
}
},
"fields": [
"_id",
"MovieID",
"UserId",
"Rating"
],
"limit": 10,
"sort": [
{
"_id": "asc"
}
]
}
but the result is the same.
You need to create an index for the field MovieID.
Create it via POST /db/_index with Content-Type: application/json:
{
"index": {
"fields": ["MovieID"]
},
"name" : "MovieID-index",
"type" : "json"
}
Afterward, include the field MovieID in the selector (if MovieID is non-numeric, change the type of the comparison value accordingly).
Try this:
{
"selector": {
"_id": {
"$regex": "^rati"
},
"MovieID": {"$gte": 0}
},
"fields": [
"MovieID",
"UserId",
"Rating"
],
"limit": 10,
"sort": [
{
"MovieID": "asc"
}
]
}
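The rule used in this answer (a field you sort on should also appear in the selector) can be checked before sending a query. Here is a rough Python sketch. The function name is hypothetical, it only looks at top-level selector keys, and passing the check does not guarantee that a usable index exists.

```python
def sort_fields_missing_from_selector(query):
    """Return sort fields that do not appear as top-level selector keys."""
    selector = query.get("selector", {})
    missing = []
    for entry in query.get("sort", []):
        # A sort entry is either "field" or {"field": "asc" | "desc"}.
        names = [entry] if isinstance(entry, str) else list(entry)
        missing.extend(name for name in names if name not in selector)
    return missing


if __name__ == "__main__":
    failing = {
        "selector": {"_id": {"$regex": "^rati"}},
        "sort": [{"MovieID": "asc"}],
    }
    print(sort_fields_missing_from_selector(failing))
```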

CouchDB index with $or and $and not working but just $and does

I have the following .find() calls and I am getting inconsistent indexing errors. The first example below works, where I only try to get one type of document. But if I try to get two types of documents, as in the second example, it doesn't work.
Does anyone know why this would be the case?
My index file:
{
"_id": "_design/index",
"_rev": "3-ce41abcc481f0a180eb722980d68f103",
"language": "query",
"views": {
"index": {
"map": {
"fields": {
"type": "asc",
"timestamp": "asc"
},
"partial_filter_selector": {}
},
"reduce": "_count",
"options": {
"def": {
"fields": [
"type",
"timestamp"
]
}
}
}
}
}
Works:
var result = await POUCHDB_DB.find({
selector:{
$and: [{type:"document"},{uid:"123"}]
},
limit:50,
bookmark: bookmark,
sort: [{timestamp: "desc"}]
});
Doesn't work:
var result = await POUCHDB_DB.find({
selector:{
$or: [
{$and: [{type:"document"},{uid:"123"}]},
{$and: [{type:"page"},{uid:"123"}]}
]
},
limit:50,
bookmark: bookmark,
sort: [{timestamp: "desc"}]
});
Missing timestamp in selector
In order to use the timestamp to sort, it must be in your selector. You can simply add it with a "$gte": null condition.
Redundant condition
The uid condition is repeated in both branches of your $or. For this reason, I would move it into a separate condition.
Finally, in order to use an index for this query, you should create one with the following fields: uid, timestamp, type (I think this last one is optional).
{
"selector": {
"$and": [{
"uid": "123",
"timestamp": {
"$gte": null
}
},
{
"$or": [{
"type": "document"
},
{
"type": "page"
}
]
}
]
},
"sort": [{
"timestamp": "desc"
}]
}
Recommendation
If you want your queries to use your index, I recommend specifying the "use_index" field. If you can version your indexes and queries, it will make the queries faster.
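As a sketch of the use_index recommendation: use_index takes either a design document name or a [design document, index name] pair. The Python helper below only builds the request body. The function name and the index names "by-uid" and "uid-timestamp" are invented for the example, so substitute whatever you actually called your index.

```python
def find_with_index(selector, sort, ddoc, name=None):
    """Build a find body that pins the query to one index via use_index."""
    body = {"selector": selector, "sort": sort}
    # Either "<design_document>" or ["<design_document>", "<index_name>"].
    body["use_index"] = [ddoc, name] if name else ddoc
    return body


if __name__ == "__main__":
    selector = {
        "$and": [
            {"uid": "123", "timestamp": {"$gte": None}},
            {"$or": [{"type": "document"}, {"type": "page"}]},
        ]
    }
    # "by-uid" and "uid-timestamp" are hypothetical names.
    print(find_with_index(selector, [{"timestamp": "desc"}], "by-uid", "uid-timestamp"))
```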

Speeding up Cloudant query for type text index

We have a table with this type of structure:
{_id:15_0, createdAt: 1/1/1, task_id:[16_0, 17_0, 18_0], table:"details", a:b, c: d, more}
We created indexes using
{
"index": {},
"name": "paginationQueryIndex",
"type": "text"
}
It auto created
{
"ddoc": "_design/28e8db44a5a0862xxx",
"name": "paginationQueryIndex",
"type": "text",
"def": {
"default_analyzer": "keyword",
"default_field": {
},
"selector": {
},
"fields": [
],
"index_array_lengths": true
}
}
We are using the following query
{
"selector": {
"createdAt": { "$gt": 0 },
"task_id": { "$in": [ "18_0" ] },
"table": "details"
},
"sort": [ { "createdAt": "desc" } ],
"limit": 20
}
The query takes 700-800 ms the first time; after that it drops to 500-600 ms.
Why does it take longer the first time?
Is there any way to speed up the query?
Is there any way to add indexes on specific fields when the type is "text" (instead of indexing all the fields in these records)?
You could try creating the index more explicitly, defining the type of each field you wish to index e.g.:
{
"index": {
"fields": [
{
"name": "createdAt",
"type": "string"
},
{
"name": "task_id",
"type": "string"
},
{
"name": "table",
"type": "string"
}
]
},
"name": "myindex",
"type": "text"
}
Then your query becomes:
{
"selector": {
"createdAt": { "$gt": "1970/01/01" },
"task_id": { "$in": [ "18_0" ] },
"table": "details"
},
"sort": [ { "createdAt": "desc" } ],
"limit": 20
}
Notice that I used strings where the data type is a string.
If you're interested in performance, try removing clauses from your query one at a time to see if one of them is causing the performance problem. You can also look at the explanation of your query to see if it is using your index correctly.
See the Cloudant Query documentation on creating an explicit text index.
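To make the "remove clauses one at a time" suggestion concrete, here is a small Python sketch that produces one variant of the query per selector clause, each with that clause dropped. The function name is hypothetical. Each variant still has to be sent and timed by hand, and a variant may itself be rejected (for example, if it drops the field you sort on).

```python
import copy


def drop_one_clause(query):
    """Return {field: copy of the query without that selector clause}."""
    variants = {}
    for field in query["selector"]:
        variant = copy.deepcopy(query)  # leave the original query untouched
        del variant["selector"][field]
        variants[field] = variant
    return variants


if __name__ == "__main__":
    query = {
        "selector": {
            "createdAt": {"$gt": "1970/01/01"},
            "task_id": {"$in": ["18_0"]},
            "table": "details",
        },
        "sort": [{"createdAt": "desc"}],
        "limit": 20,
    }
    for dropped, variant in drop_one_clause(query).items():
        print("without", dropped, "->", sorted(variant["selector"]))
```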

index and query items in an array with mango query for cloudant and couchdb 2.0

I have the following db structure:
{"_id": "0096874","genre": ["Adventure","Comedy", "Sci-Fi" ]}
{"_id": "0099088","genre": ["Comedy", "Sci-Fi", "Western"]}
and would like to query it as I could in MongoDB:
db.movies.find({genre: {$in: ["Comedy"]}})
It works when I use a text index for the whole database, but that seems very wasteful:
// index
{
"index": {},
"type": "text"
}
//query
{
"selector": {
"genre": {
"$in": ["Comedy"]
}
},
"fields": [
"_id",
"genre"
]
}
The following index does not work:
{
"index": {
"fields": [
"genre"
]
},
"type": "json"
}
What is the correct index for Cloudant Query that does not index the whole database?
Thanks for your help
You almost had it. Your index is right, but you need to add a selector on _id that matches all IDs; see https://cloudant.com/blog/mango-json-vs-text-indexes/.
This isn't a great solution performance-wise. As Tony says:
"The reason this works is, again, because Mango performs the above $in operation as a filtering mechanism against all the documents. As we saw in the conclusion of the previous section on JSON syntax, the performance tradeoff with the query above is that it, essentially, performs a full index scan and then applies a filter."
{
"selector": {
"_id": {
"$gt": null
},
"genre": {
"$in": ["Western"]
}
},
"fields": [
"_id",
"genre"
],
"sort": [
{
"_id": "asc"
}
]
}
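To illustrate what "a full index scan and then a filter" costs, here is an in-memory Python sketch using the two documents from the question. It is not how CouchDB or Cloudant is implemented. It only mimics the behaviour described in the quote, and it assumes that $in matches an array field when any element of the array is in the list.

```python
MOVIES = [
    {"_id": "0096874", "genre": ["Adventure", "Comedy", "Sci-Fi"]},
    {"_id": "0099088", "genre": ["Comedy", "Sci-Fi", "Western"]},
]


def scan_then_filter(docs, field, wanted):
    """Visit every document, keep the ids whose field overlaps `wanted`."""
    wanted = set(wanted)
    examined = 0
    matches = []
    for doc in docs:  # every document is visited, as in a full scan
        examined += 1
        values = doc.get(field, [])
        if not isinstance(values, list):
            values = [values]
        if wanted.intersection(values):
            matches.append(doc["_id"])
    return matches, examined


if __name__ == "__main__":
    print(scan_then_filter(MOVIES, "genre", ["Western"]))
```

The number of documents examined equals the size of the database no matter how few match, which is the tradeoff the quote points out.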

cloudant searching index by multiple values

Cloudant is returning error message:
{"error":"invalid_key","reason":"Invalid key use-index for this request."}
whenever I try to query against an index with the combination operator, "$or".
A sample of what my documents look like is:
{
"_id": "28f240f1bcc2fbd9e1e5174af6905349",
"_rev": "1-fb9a9150acbecd105f1616aff88c26a8",
"type": "Feature",
"properties": {
"PageName": "A8",
"PageNumber": 1,
"Lat": 43.051523,
"Long": -71.498852
},
"geometry": {
"type": "Polygon",
"coordinates": [
[
[
-71.49978935969642,
43.0508382914137
],
[
-71.49978564033566,
43.052210148524
],
[
-71.49791499857444,
43.05220740550381
],
[
-71.49791875962663,
43.05083554852429
],
[
-71.49978935969642,
43.0508382914137
]
]
]
}
}
The index that I created is on the field "properties.PageName". It works fine when I'm querying for just one document, but as soon as I query for multiple ones, I receive the error response quoted at the beginning.
If it helps any, here is the call:
POST https://xyz.cloudant.com/db/_find
request body:
{
"selector": {
"$or": [
{ "properties.PageName": "A8" },
{ "properties.PageName": "M30" },
{ "properties.PageName": "AH30" }
]
},
"use-index": "pagename-index"
}
In order to perform an $or query you need to create a text (full text) index, rather than a json index. For example, I just created the following index:
{
"index": {
"fields": [
{"name": "properties.PageName", "type": "string"}
]
},
"type": "text"
}
I was then able to perform the following query:
{
"selector": {
"$or": [
{ "properties.PageName": "A8" },
{ "properties.PageName": "M30" },
{ "properties.PageName": "AH30" }
]
}
}
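One more thing worth checking: the error message names the key "use-index", while the opts in the explain output near the top of this page spell the option "use_index", and the working query here omits the key altogether. The Python sketch below flags unexpected top-level keys in a request body before it is sent. The allowed set is an assumption based on that explain output plus "selector", and the list accepted by a given CouchDB or Cloudant version may differ.

```python
# Assumed set of top-level keys, based on the "opts" of the explain
# output shown earlier on this page, plus "selector".
FIND_KEYS = {
    "selector", "use_index", "bookmark", "limit", "skip", "sort",
    "fields", "r", "conflicts", "stale", "update", "stable",
    "execution_stats",
}


def unknown_find_keys(body):
    """Return the top-level keys of `body` that are not in FIND_KEYS."""
    return sorted(key for key in body if key not in FIND_KEYS)


if __name__ == "__main__":
    body = {
        "selector": {"$or": [{"properties.PageName": "A8"}]},
        "use-index": "pagename-index",
    }
    print(unknown_find_keys(body))
```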

Resources