How would I query keys such that it would partially match? - couchdb

Let's take this document for example:
{
"id":1,
"planet":"earth-616",
"data":[
["wolverine","mutant"],
["Storm","mutant"],
["Mark Zuckerberg","human"]]
}
I created a search index on the name and type, so a search for name:wolverine or type:mutant returns the document that contains it. However, I don't want the whole document; I only want ["wolverine","mutant"]. I've created a view whose output looks like this:
{
"id":1,
"key":"earth-616",
"value":["earth-616","wolverine","mutant"]
}
Then I found out that I can only query views by key. (Is it possible to create search indexes on views? I couldn't find anything in the documentation.)
Or should I create additional views alongside the one above, like this:
{
"id":1,
"key":"wolverine",
"value":["earth-616","wolverine","mutant"]
}
And
{
"id":1,
"key":"mutant",
"value":["earth-616","wolverine","mutant"]
}
This way I can query with the keys I want, but I can't seem to partially match keys. (Am I missing something?)

If you need the output to be exactly as described then I believe you have to use views, and to support wildcard searches I believe you will have to index every substring of a key.
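As a rough illustration of indexing every substring, the sketch below is a map function in the style of a CouchDB view. It assumes the document shape from the question. The name mapDoc, the lower-casing of names, and passing emit as a parameter (so the function can run outside CouchDB) are illustrative choices, not part of the original answer; in a real design document the function would take only doc and call the global emit.

```javascript
// Emits one view row per distinct substring of each name, so that an
// exact key lookup on any fragment (e.g. key="zuck") finds the pair.
function mapDoc(doc, emit) {
  (doc.data || []).forEach(function (pair) {
    var name = String(pair[0]).toLowerCase();
    var seen = Object.create(null); // avoids clashes with Object.prototype keys
    for (var start = 0; start < name.length; start++) {
      for (var end = start + 1; end <= name.length; end++) {
        var fragment = name.slice(start, end);
        if (!seen[fragment]) {
          seen[fragment] = true;
          emit(fragment, [doc.planet, pair[0], pair[1]]);
        }
      }
    }
  });
}

// Local harness standing in for CouchDB's view engine (assumption).
var sampleDoc = {
  id: 1,
  planet: "earth-616",
  data: [["wolverine", "mutant"], ["Storm", "mutant"], ["Mark Zuckerberg", "human"]]
};
var rows = [];
mapDoc(sampleDoc, function (key, value) {
  rows.push({ key: key, value: value });
});
var zuckHits = rows.filter(function (row) { return row.key === "zuck"; });
console.log(zuckHits);
```

The number of rows grows roughly quadratically with the length of each name, so this is only practical for short keys. If prefix matching is enough, a common alternative is to emit the whole name once and query a key range, for example startkey="wol" and endkey="wol\ufff0".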
One alternative is to use Cloudant Query, although admittedly you cannot get the exact output you are looking for. If you issue a query like so:
{
"selector": {
"_id": {
"$gt": 0
},
"data": {
"$elemMatch": {
"$elemMatch": {
"$regex": "(?i)zuck"
}
}
}
},
"fields": [
"data"
]
}
The result will be the entire data array:
{
"data": [
["wolverine", "mutant"],
["Storm", "mutant"],
["Mark Zuckerberg", "human"]
]
}
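Since the query returns the whole data array, one way to get down to the matching pair is to filter the result on the client. This is a minimal sketch, assuming the response shape shown above; matchingPairs is a hypothetical helper, not part of Cloudant Query.

```javascript
// Keeps only the inner pairs in which at least one element matches the pattern.
// The pattern should not use the "g" flag, so that test() stays stateless.
function matchingPairs(data, pattern) {
  return data.filter(function (pair) {
    return pair.some(function (item) {
      return pattern.test(String(item));
    });
  });
}

// Result document as returned by the query above.
var result = {
  data: [["wolverine", "mutant"], ["Storm", "mutant"], ["Mark Zuckerberg", "human"]]
};
var pairs = matchingPairs(result.data, /zuck/i);
console.log(pairs);
```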

Related

How can I obtain a document from a Cosmos DB using a field in an array as a filter?

I have a Cosmos DB with documents that look like the following:
{
"name": {
"productName": "someProductName"
},
"identifiers": [
{
"identifierCode": "1234",
"identifierLabel": "someLabel1"
},
{
"identifierCode": "432",
"identifierLabel": "someLabel2"
}
]
}
I would like to write a SQL query that returns an entire document, using "identifierLabel" as the filter.
I attempted to write a query based on an example I found from the following blog:
SELECT c,t AS identifiers
FROM c
JOIN t in c.identifiers
WHERE t.identifierLabel = "someLabel2"
However, when the result is returned, it appends the following to the end of the document:
{
"name": {
"productName": "someProductName"
},
"identifiers": [
{
"identifierCode": "1234",
"identifierLabel": "someLabel1"
},
{
"identifierCode": "432",
"identifierLabel": "someLabel2"
}
]
},
{
"identifierCode": "432",
"identifierLabel": "someLabel2"
}
How can I avoid this and get the result that I desire, i.e. the entire document with nothing appended to it?
Thanks in advance.
Using ARRAY_CONTAINS(), you should be able to do something like this to retrieve the entire document, without any need for a self-join:
SELECT *
FROM c
WHERE ARRAY_CONTAINS(c.identifiers, {"identifierLabel":"someLabel2"}, true)
Note that ARRAY_CONTAINS() can search for either scalar values or objects. The third parameter is a boolean that enables partial matching: when it is true, an array element matches if it contains the specified properties, even if it also has others. So the query above matches any document whose identifiers array has an object with identifierLabel set to "someLabel2", and it should return the original document unchanged, avoiding the issue you ran into with the self-join.
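To make the partial-match behaviour concrete, here is a small local emulation in JavaScript. It is not Cosmos DB code: arrayContainsPartial is a hypothetical helper that only mirrors the idea for top-level scalar properties.

```javascript
// An element matches when every property of `fragment` is present on it
// with a strictly equal value. Extra properties on the element are ignored.
function arrayContainsPartial(array, fragment) {
  return array.some(function (element) {
    if (element === null || typeof element !== "object") {
      return false;
    }
    return Object.keys(fragment).every(function (key) {
      return element[key] === fragment[key];
    });
  });
}

// Document from the question.
var productDoc = {
  name: { productName: "someProductName" },
  identifiers: [
    { identifierCode: "1234", identifierLabel: "someLabel1" },
    { identifierCode: "432", identifierLabel: "someLabel2" }
  ]
};

var found = arrayContainsPartial(productDoc.identifiers, { identifierLabel: "someLabel2" });
var notFound = arrayContainsPartial(productDoc.identifiers, { identifierLabel: "otherLabel" });
console.log(found, notFound);
```

Without the partial-match flag, the searched object would have to equal a whole array element, so {"identifierLabel":"someLabel2"} on its own would not match an element that also carries identifierCode.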

Mango index "does not contain a valid index for this query" even when specified manually

I'm trying to efficiently query data via Mango (which seems to be the only option given my requirements; see "Searching for sub-objects with a date range containing the queried date value"), but I can't even get a very simple index/query pair to work. Although I specify my index manually in the query, I'm told that my index "was not used because it does not contain a valid index for this query. No matching index found, create an index to optimize query time."
(I'm doing all of this via Fauxton on CouchDB v. 3.0.0)
Let's say my documents look like this:
{
"tenant": "TNNT_a",
"$doctype": "JobOpening",
// a bunch of other fields
}
All documents with a $doctype of "JobOpening" are guaranteed to have a tenant property. The searches I wish to perform will only ever be for documents with $doctype of "JobOpening" and a tenant selector will always be provided when querying.
Here's the test index I've configured:
{
"index": {
"fields": [
"tenant",
"$doctype"
],
"partial_filter_selector": {
"\\$doctype": {
"$eq": "JobOpening"
}
}
},
"ddoc": "job-openings-doctype-index",
"type": "json"
}
And here's the query
{
"selector": {
"tenant": "TNNT_a",
"\\$doctype": "JobOpening"
},
"use_index": "job-openings-doctype-index"
}
Why isn't the index being used for the query?
I've tried not using a partial index, and I think the $doctype escaping is done properly in the requisite places, but nothing seems to keep CouchDB from performing a full scan.
The index isn't being used because the $doctype field is not being recognized by the query planner as expected.
Changing the fields declaration from $doctype to \\$doctype in the design document solves the issue.
{
"index": {
"fields": [
"tenant",
"\\$doctype"
],
"partial_filter_selector": {
"\\$doctype": {
"$eq": "JobOpening"
}
}
},
"ddoc": "job-openings-doctype-index",
"type": "json"
}
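When the index definition is built in code rather than typed into Fauxton, the escaping is easy to get wrong by one level. The sketch below assumes JavaScript as the client language; buildIndexDefinition is a hypothetical helper, and the body it produces would be sent with POST to the database's _index endpoint (the request itself is not shown).

```javascript
// JS string literal for the 9-character field name \$doctype.
// After JSON.stringify it appears in the request body as "\\$doctype",
// which is the form used in the index definition above.
var escapedField = "\\$doctype";

function buildIndexDefinition(ddoc) {
  var partialFilter = {};
  partialFilter[escapedField] = { $eq: "JobOpening" };
  return {
    index: {
      fields: ["tenant", escapedField],
      partial_filter_selector: partialFilter
    },
    ddoc: ddoc,
    type: "json"
  };
}

var body = JSON.stringify(buildIndexDefinition("job-openings-doctype-index"));
console.log(body);
```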
After that small refactor, the query
{
"selector": {
"tenant": "TNNT_a",
"\\$doctype": "JobOpening"
},
"use_index": "job-openings-doctype-index"
}
Returns the expected result, and produces an "explain" which confirms the job-openings-doctype-index was queried:
{
"dbname": "stack",
"index": {
"ddoc": "_design/job-openings-doctype-index",
"name": "7f5c5cea5acd90f11fffca3e3355b6a03677ad53",
"type": "json",
"def": {
"fields": [
{
"tenant": "asc"
},
{
"\\$doctype": "asc"
}
],
"partial_filter_selector": {
"\\$doctype": {
"$eq": "JobOpening"
}
}
}
},
// etc etc etc
Whether this change is intuitive is debatable, but it is consistent, and it suggests that field names starting with a "special" character may be best avoided.
Regarding indexing the filtered field, the documentation for partial_filter_selector says:
Technically, we don’t need to include the filter on the "status" [e.g. $doctype here] field in the query selector - the partial index ensures this is always true - but including it makes the intent of the selector clearer and will make it easier to take advantage of future improvements to query planning (e.g. automatic selection of partial indexes).
Despite that, I would not choose to index a field whose value is constant.

How to define an index to use in a Mango Query

I am trying to create a CouchDB Mango Query with an index with the hope that the query runs faster. At the moment I have the following Mango Query which returns what I am looking for but it's slow. Therefore, I assume, I need to create an index to make it faster. I need help figuring out how to create that index.
selector: {
categoryIds: {
$in: categoryIds,
},
},
sort: [{ publicationDate: 'desc' }],
You can assume that my documents are, let's say, news articles from different categories. Each document therefore has a field containing the one or more categories that the article belongs to, stored as an array of categoryIds. My query needs to be optimized for requests like "Give me all news that have categoryId1 in their array of categoryIds, sorted by publicationDate". What I don't know is: 1. how to define an index, 2. what that index should be, and 3. how to use that index in the "use_index" field of the Mango Query. Any help is appreciated.
Update after Alexis Côté's answer:
If I define the index like this:
{
"_id": "_design/0f11ca4ef1ea06de05b31e6bd8265916c1bbe821",
"_rev": "6-adce50034e870aa02dc7e1e075c78361",
"language": "query",
"views": {
"categoryIds-json-index": {
"map": {
"fields": {
"categoryIds": "asc"
},
"partial_filter_selector": {}
},
"reduce": "_count",
"options": {
"def": {
"fields": [
"categoryIds"
]
}
}
}
}
}
And run the Mango Query like this:
{
"selector": {
"categoryIds": {
"$in": [
"e0bd5f97ac35bdf6893351337d269230"
]
}
},
"use_index": "categoryIds-json-index"
}
It still returns the results, but they are not sorted by publicationDate as I want. So I am not clear on what solution you are suggesting.
You can create an index as documented here
In your case, you will need an index on the "categoryIds" field.
You can specify the index using "use_index": "_design/<name>"
Note: The query planner should automatically pick this index if it's compatible.
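The two request bodies can be sketched as below. The design document name news-by-category is hypothetical, and the array form of use_index (design document, then index name) follows the CouchDB documentation as I understand it. Whether the planner can actually use this index for $in on an array field is not confirmed here and is worth checking with the _explain endpoint.

```javascript
// Hypothetical names, chosen for illustration only.
var DDOC = "news-by-category";
var INDEX_NAME = "categoryIds-json-index";

// Body for POST /{db}/_index: a JSON index on the categoryIds field.
var indexDefinition = {
  index: { fields: ["categoryIds"] },
  ddoc: DDOC,
  name: INDEX_NAME,
  type: "json"
};

// Body for POST /{db}/_find: use_index names the design document,
// not just the index inside it.
function buildFindRequest(categoryIds) {
  return {
    selector: { categoryIds: { $in: categoryIds } },
    use_index: [DDOC, INDEX_NAME]
  };
}

var findRequest = buildFindRequest(["e0bd5f97ac35bdf6893351337d269230"]);
console.log(JSON.stringify(indexDefinition));
console.log(JSON.stringify(findRequest));
```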

Query data where userID in multiples ID

I'm trying to write a query and I don't know the right way to do it.
Each document in the Mongo collection contains multiple user IDs (uid), and I want a query that returns all the data ("Albums") where a given user ID matches one of the uid values.
I don't know whether the collection structure is suitable for that, and I would like to know if I should do it differently.
{
"_id": ObjectId("55814a9799677ba44e7826d1"),
"album": "album1",
"pictures": [
"1434536659272.jpg",
"1434552570177.jpg",
"1434552756857.jpg",
"1434552795100.jpg"
],
"uid": [
"12814a8546677ba44e745d85",
"e745d677ba4412814e745d7b",
"28114a85466e745d677d85qs"
],
"__v": 0
}
I searched the internet and found this documentation http://docs.mongodb.org/manual/reference/operator/query/in/ but I'm not certain that this is the right way.
In short, I need to know whether I'm using the right structure for the collection and whether the "$in" operator is the right solution (knowing that there may be a lot of user IDs: between 2 and 2000).
You don't need $in unless you are matching for more than one possible value in a field, and that field does not have to be an array. $in is in fact shorthand for $or.
You just need a simple query here:
Model.find({ "uid": "12814a8546677ba44e745d85" },function(err,results) {
})
If you want to match "multiple" user IDs, then you can use $in:
Model.find(
{ "uid": { "$in": [
"12814a8546677ba44e745d85",
"e745d677ba4412814e745d7b",
] } },
function(err,results) {
}
)
Which is short for $or in this way:
Model.find(
{
"$or": [
{ "uid": "12814a8546677ba44e745d85" },
{ "uid": "e745d677ba4412814e745d7b" }
]
},
function(err,results) {
}
)
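The relationship between the two forms can be sketched as a small helper that expands an $in list into the equivalent $or clauses. inToOr is a hypothetical function written for illustration; it is not part of Mongoose or MongoDB.

```javascript
// Expands { field: { $in: values } } into the equivalent $or query object.
function inToOr(field, values) {
  return {
    $or: values.map(function (value) {
      var clause = {};
      clause[field] = value;
      return clause;
    })
  };
}

var orQuery = inToOr("uid", [
  "12814a8546677ba44e745d85",
  "e745d677ba4412814e745d7b"
]);
console.log(JSON.stringify(orQuery));
```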
Just to answer your question, you can use the query below to get the desired result.
db.mycollection.find( {uid : {$in : ["28114a85466e745d677d85qs"] } } )
However, you need to revisit your data structure. It looks like a many-to-many problem, and you might want to think about introducing an intermediate collection for that.

Query the number of elements matching a filter using elastic.js?

I'm building a leaderboard with elasticsearch. I'd like to query all documents that have more than a given number of points, using the following query:
{
"constant_score" : {
"filter" : {
"range" : {
"totalPoints" : {
"gt": 242
}
}
}
}
}
This works perfectly -- elasticsearch appropriately returns all documents with points greater than 242. However, all I really need is the count of elements matching this query. Since I'm sending the result over the network, it would be helpful if the query simply returned the count, as opposed to all of the documents matching the filter.
How do I get elasticsearch to only report the count of documents matching the filter?
EDIT: I've learned that what I'm looking for is setting search_type to count. However, I'm not sure how to do this with elastic.js. Any noders willing to pitch in their advice?
You can use the search type count for exactly that purpose:
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-request-search-type.html#count
This is an example that should help you:
GET /mymusic/itunes/_search?search_type=count
{
"query": {
"filtered": {
"query": {
"match_all": {}
},
"filter": {
"range": {
"year": {
"gt": 2000
}
}
}
}
}
}
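On the client side, the count is then read from hits.total in the response rather than from the list of hits. The helper below is a minimal sketch with hand-written sample responses (not real output); it also accepts the object form of hits.total used by newer Elasticsearch versions. totalHits is a hypothetical name and is not part of elastic.js.

```javascript
// Reads the match count from a search response.
// Older responses carry hits.total as a number, newer ones as { value, relation }.
function totalHits(response) {
  var total = response.hits.total;
  return typeof total === "number" ? total : total.value;
}

// Hand-written sample responses, for illustration only.
var olderResponse = { hits: { total: 17, max_score: 0, hits: [] } };
var newerResponse = { hits: { total: { value: 17, relation: "eq" }, hits: [] } };

console.log(totalHits(olderResponse), totalHits(newerResponse));
```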

Resources