Cloudant Search: what are the conditions for using the count facet? - search

I am trying to set up a search index using Cloudant, but I find the documentation pretty confusing. It states:
FACETING
In order to use facets, all the documents in the index must include all the fields that have faceting enabled. If your documents do not include all the fields, you will receive a bad_request error with the following reason, “dim field_name does not exist.”
If each document does not contain all the fields for facets, it is recommended that you create separate indexes for each field. If you do not create separate indexes for each field, you must include only documents that contain all the fields. Verify that the fields exist in each document using a single if statement.
Counts
The count facet syntax takes a list of fields, and returns the number of query results for each unique value of each named field.
The count operation works only if the indexed values are strings. The indexed values cannot be mixed types. For example, if 100 strings are indexed, and one number, then the index cannot be used for count operations. You can check the type using the typeof operator, and convert using parseInt, parseFloat and .toString() functions.
Specifically, what does it mean that "all the documents in the index must include all the fields that have faceting enabled"?
For example, if my database consists of the following doc:
{
"_id": "mydoc",
"subjects": [ "subject A", "subject B" ]
}
And I write a search index like so:
function (doc) {
for(var i=0; i < doc.subjects.length; i++)
index("hasSubject", doc.subjects[i], {facet: true});
}
Would this be illegal because mydoc doesn't have a field called hasSubject? And if we rewrite the document to look like this:
{
"_id": "mydoc",
"hasSubject": true,
"subjects": [ "subject A", "subject B" ]
}
Would that suddenly make it OK...?

The newer documentation is at https://console.ng.bluemix.net/docs/services/Cloudant/api/search.html#faceting ; however, the entry on faceting is unchanged, so that does not help much.
To answer your question though, I think what the documentation is saying is that all the JSON docs in your database must contain the subjects field, which is what you're declaring you want to facet on in your example.
So I would also consider defining your search index like so:
function (doc) {
if (doc.subjects) {
for(var i=0; i < doc.subjects.length; i++) {
if (typeof doc.subjects[i] == "string") {
index("hasSubject", doc.subjects[i], {facet: true});
}
}
}
}
And if you had a doc like this in your database:
{
"_id": "mydoc",
"hasSubject": true
}
I think that would suddenly make your facets NOT ok.
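To see which documents the guarded index function above would contribute to the facet, here is a minimal sketch that runs it locally against a stubbed index() call. The stub, the calls array, the searchIndex name and the sample documents are assumptions for illustration only; on Cloudant, index() is provided by the service, and this sketch cannot show whether Cloudant accepts or rejects the count facet.

```javascript
// Local stand-in for Cloudant's index(); it only records what would be indexed.
const calls = [];
function index(name, value, options) {
  calls.push({ name: name, value: value, facet: Boolean(options && options.facet) });
}

// The guarded index function from the answer above (named here for testing).
function searchIndex(doc) {
  if (doc.subjects) {
    for (var i = 0; i < doc.subjects.length; i++) {
      if (typeof doc.subjects[i] == "string") {
        index("hasSubject", doc.subjects[i], { facet: true });
      }
    }
  }
}

// Hypothetical sample documents.
const docs = [
  { _id: "mydoc", subjects: ["subject A", "subject B"] },
  { _id: "nosubjects", hasSubject: true },       // no subjects array: nothing is indexed
  { _id: "mixed", subjects: ["subject C", 42] }  // the number is skipped by the typeof check
];
docs.forEach(searchIndex);

console.log(calls.map(function (c) { return c.value; }));
```

Only string values ever reach the hasSubject field, so the indexed values are never of mixed types, which is the condition the documentation states for count operations.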

Related

How to make $elemMatch work for json array data in mango query?

I have a field in my application like below.
{
"Ct": "HH",
"Val": {
"Count":"A",
"Branch":"A"
}
}
When I try to retrieve this with the query below in CouchDB, no records are returned.
{
"selector" : {
"Val":{
"$elemMatch":{
"Count":"A"
}
}
}
}
From the CouchDB documentation on $elemMatch[1]:
Matches and returns all documents that contain an array field with at
least one element that matches all the specified query criteria.
Val.Count is not an array field so $elemMatch is not appropriate.
Consider the CouchDB documentation regarding subfield queries[2]:
1.3.6.1.3. Subfields
A more complex selector enables you to specify the values for field of
nested objects, or subfields. For example, you might use a standard
JSON structure for specifying a field and subfield.
Example of a field and subfield selector, using a standard JSON
structure:
{
"imdb": {
"rating": 8
}
}
An abbreviated equivalent uses a dot notation to combine the field and
subfield names into a single name.
{
"imdb.rating": 8
}
Specifically,
selector: {
"Val.Count": "A"
}
1 CouchDB: 1.3.6.1.7. Combination Operators
2 CouchDB: 1.3.6.1.3. Subfields
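As a rough illustration of why the dotted selector fits here, the following sketch resolves "Val.Count" against the document from the question in plain JavaScript. The getPath helper is hypothetical and is not how CouchDB evaluates selectors internally; it only shows that the path leads to a plain string and that Val is an object, not an array.

```javascript
// Hypothetical helper: resolve a dotted path such as "Val.Count" against a document.
function getPath(doc, path) {
  return path.split(".").reduce(function (value, key) {
    return value == null ? undefined : value[key];
  }, doc);
}

// The document from the question, with the Val key quoted.
const doc = { Ct: "HH", Val: { Count: "A", Branch: "A" } };

console.log(getPath(doc, "Val.Count")); // the value a "Val.Count" selector compares against
console.log(Array.isArray(doc.Val));    // $elemMatch is meant for array fields
```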

Query CosmosDB when document contains Dictionary

I have a problem with querying CosmosDB document which contains a dictionary. This is an example document:
{
"siteAndDevices": {
"4cf0af44-6233-402a-b33a-e7e35dbbee6a": [
"f32d80d9-e93a-687e-97f5-676516649420",
"6a5eb9fa-c961-93a5-38cc-ecd74ada13ac",
"c90e9986-5aea-b552-e532-cd64a250ad10",
"7d4bfdca-547a-949b-ccb3-bbf0d6e5d727",
"fba51bfe-6a5e-7f25-e58a-7b0ced59b5d8",
"f2caac36-3590-020f-ebb7-5ccd04b4412c",
"1b446af7-ba74-3564-7237-05024c816a02",
"7ef3d931-131e-a639-10d4-f4dd5db834ca"
]
},
"id": "f9ef9fb6-4b70-7d3f-2bc8-c3d335018624"
}
I need to get all documents where the provided GUID is in the list, that is, in a dictionary value (I don't know the dictionary key). I read somewhere here that it is not possible to iterate through the keys of a dictionary in CosmosDB (maybe that has changed since then, but I didn't find anything in the documentation), but maybe someone has an idea. I cannot change the form of the document.
I tried to do it in Linq, but I didn't get any results.
var query = _documentClient
.CreateDocumentQuery<Dto>(DocumentCollectionUri())
.Where(d => d.SiteAndDevices.Any(x => x.Value.Contains("f32d80d9-e93a-687e-97f5-676516649420")))
.AsDocumentQuery();
Not sure of the Linq query, but with SQL, you'd need something like this:
SELECT * FROM c
where array_contains(c.siteAndDevices['4cf0af44-6233-402a-b33a-e7e35dbbee6a'],"f32d80d9-e93a-687e-97f5-676516649420")
This is a strange document format though, as you've named your key with an id:
"siteAndDevices": {
"4cf0af44-6233-402a-b33a-e7e35dbbee6a": ["..."]
}
Your key is "4cf0af44-6233-402a-b33a-e7e35dbbee6a", which forces you to use a different syntax to reference it:
c.siteAndDevices['4cf0af44-6233-402a-b33a-e7e35dbbee6a']
You'd save yourself a lot of trouble refactoring this to something like:
{
"id": "dictionary1",
"siteAndDevices": {
"deviceId": "4cf0af44-6233-402a-b33a-e7e35dbbee6a",
"deviceValues": ["..."]
}
}
You can refactor further, such as using an array to contain multiple device id + value combos.
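For illustration only, here is a small JavaScript sketch of the two ideas in this thread: checking every dictionary value for a GUID without knowing the key, and reshaping the dictionary into an array of device id + value entries. It runs on a document already loaded into memory and is not a CosmosDB query; containsDevice and toEntries are hypothetical helper names, and the deviceId / deviceValues field names are taken from the refactoring suggested above.

```javascript
// Shortened copy of the document from the question.
const doc = {
  siteAndDevices: {
    "4cf0af44-6233-402a-b33a-e7e35dbbee6a": [
      "f32d80d9-e93a-687e-97f5-676516649420",
      "6a5eb9fa-c961-93a5-38cc-ecd74ada13ac"
    ]
  },
  id: "f9ef9fb6-4b70-7d3f-2bc8-c3d335018624"
};

// Client-side check: is the GUID in any of the dictionary's value lists?
function containsDevice(d, guid) {
  return Object.values(d.siteAndDevices).some(function (list) {
    return list.includes(guid);
  });
}

// Reshape the dictionary into an array of { deviceId, deviceValues } entries,
// so the former key becomes an ordinary, queryable property.
function toEntries(d) {
  return Object.entries(d.siteAndDevices).map(function (pair) {
    return { deviceId: pair[0], deviceValues: pair[1] };
  });
}

console.log(containsDevice(doc, "f32d80d9-e93a-687e-97f5-676516649420"));
console.log(JSON.stringify(toEntries(doc)));
```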

MongoDB nested array update multiple documents [duplicate]

I am trying to update a value in the nested array but can't get it to work.
My object is like this
{
"_id": {
"$oid": "1"
},
"array1": [
{
"_id": "12",
"array2": [
{
"_id": "123",
"answeredBy": [], // need to push "success"
},
{
"_id": "124",
"answeredBy": [],
}
],
}
]
}
I need to push a value to "answeredBy" array.
In the example below, I tried pushing the string "success" to the "answeredBy" array of the object with _id "123", but it does not work.
callback = function(err,value){
if(err){
res.send(err);
}else{
res.send(value);
}
};
conditions = {
"_id": 1,
"array1._id": 12,
"array2._id": 123
};
updates = {
$push: {
"array2.$.answeredBy": "success"
}
};
options = {
upsert: true
};
Model.update(conditions, updates, options, callback);
I found this link, but its answer only says I should use an object-like structure instead of arrays. That cannot be applied in my situation; I really need my objects to be nested in arrays.
It would be great if you can help me out here. I've been spending hours to figure this out.
Thank you in advance!
General Scope and Explanation
There are a few things wrong with what you are doing here. Firstly your query conditions. You are referring to several _id values where you should not need to, and at least one of which is not on the top level.
In order to get into a "nested" value, and presuming that the _id value is unique and would not appear in any other document, your query form should be like this:
Model.update(
{ "array1.array2._id": "123" },
{ "$push": { "array1.0.array2.$.answeredBy": "success" } },
function(err,numAffected) {
// something with the result in here
}
);
Now that would actually work, but really it is only a fluke that it does as there are very good reasons why it should not work for you.
The important reading is in the official documentation for the positional $ operator under the subject of "Nested Arrays". What this says is:
The positional $ operator cannot be used for queries which traverse more than one array, such as queries that traverse arrays nested within other arrays, because the replacement for the $ placeholder is a single value
Specifically what that means is the element that will be matched and returned in the positional placeholder is the value of the index from the first matching array. This means in your case the matching index on the "top" level array.
So if you look at the query notation as shown, we have "hardcoded" the first ( or 0 index ) position in the top level array, and it just so happens that the matching element within "array2" is also the zero index entry.
To demonstrate this you can change the matching _id value to "124" and the result will $push a new entry onto the element with _id "123", as they are both in the zero index entry of "array1" and that is the value returned to the placeholder.
So that is the general problem with nesting arrays. You could remove one of the levels and you would still be able to $push to the correct element in your "top" array, but there would still be multiple levels.
Try to avoid nesting arrays as you will run into update problems as is shown.
The general case is to "flatten" the things you "think" are "levels" and actually make these "attributes" on the final detail items. For example, the "flattened" form of the structure in the question should be something like:
{
"answers": [
{ "by": "success", "type2": "123", "type1": "12" }
]
}
Or even when accepting the inner array is $push only, and never updated:
{
"array": [
{ "type1": "12", "type2": "123", "answeredBy": ["success"] },
{ "type1": "12", "type2": "124", "answeredBy": [] }
]
}
Both of these lend themselves to atomic updates within the scope of the positional $ operator.
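As a sketch of what "flattening" means in practice, the following plain JavaScript converts a nested document like the one in the question into the second flattened form shown above. The flatten function is a hypothetical helper for a one-off migration, and the type1 / type2 names simply follow the example; nothing here talks to MongoDB.

```javascript
// Nested document shaped like the one in the question.
const nested = {
  _id: "1",
  array1: [
    {
      _id: "12",
      array2: [
        { _id: "123", answeredBy: [] },
        { _id: "124", answeredBy: [] }
      ]
    }
  ]
};

// Turn each inner element into one flat entry that carries both ids as attributes.
function flatten(doc) {
  const out = [];
  doc.array1.forEach(function (outer) {
    outer.array2.forEach(function (inner) {
      out.push({
        type1: outer._id,
        type2: inner._id,
        answeredBy: inner.answeredBy.slice() // copy so the source is not shared
      });
    });
  });
  return { _id: doc._id, array: out };
}

const flat = flatten(nested);
console.log(JSON.stringify(flat.array));
```

With only one array level left, a query condition on that array and a single positional $ in the update path refer to the same element.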
MongoDB 3.6 and Above
From MongoDB 3.6 there are new features available to work with nested arrays. This uses the positional filtered $[<identifier>] syntax in order to match the specific elements and apply different conditions through arrayFilters in the update statement:
Model.update(
{
"_id": 1,
"array1": {
"$elemMatch": {
"_id": "12","array2._id": "123"
}
}
},
{
"$push": { "array1.$[outer].array2.$[inner].answeredBy": "success" }
},
{
"arrayFilters": [{ "outer._id": "12" },{ "inner._id": "123" }]
}
)
The "arrayFilters" as passed to the options for .update() or even .updateOne(), .updateMany(), .findOneAndUpdate() or .bulkWrite() method specifies the conditions to match on the identifier given in the update statement. Any elements that match the condition given will be updated.
Because the structure is "nested", we actually use "multiple filters" as is specified with an "array" of filter definitions as shown. The marked "identifier" is used in matching against the positional filtered $[<identifier>] syntax actually used in the update block of the statement. In this case inner and outer are the identifiers used for each condition as specified with the nested chain.
This new expansion makes the update of nested array content possible, but it does not really help with the practicality of "querying" such data, so the same caveats apply as explained earlier.
What you typically "mean" to express are "attributes", even if your brain initially thinks "nesting"; that is usually just a reaction to how you believe the "previous relational parts" come together. In reality you need more denormalization.
Also see How to Update Multiple Array Elements in mongodb, since these new update operators actually match and update "multiple array elements" rather than just the first, which has been the previous action of positional updates.
NOTE Somewhat ironically, since this is specified in the "options" argument for .update() and like methods, the syntax is generally compatible with all recent release driver versions.
However this is not true of the mongo shell, since the way the method is implemented there ( "ironically for backward compatibility" ) the arrayFilters argument is not recognized and removed by an internal method that parses the options in order to deliver "backward compatibility" with prior MongoDB server versions and a "legacy" .update() API call syntax.
So if you want to use the command in the mongo shell or other "shell based" products ( notably Robo 3T ) you need the latest version from either the development branch or a production release of 3.6 or greater.
See also positional all $[] which also updates "multiple array elements" but without applying to specified conditions and applies to all elements in the array where that is the desired action.
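As a rough mental model of the effect the arrayFilters update above has on a single document, here is a plain JavaScript sketch. It is not MongoDB's implementation and does not use any driver; pushFiltered and the sample document are hypothetical, with the outerId and innerId parameters playing the role of the outer and inner identifiers.

```javascript
// Toy model: push `value` onto answeredBy of every array2 element whose _id is
// innerId, inside every array1 element whose _id is outerId.
function pushFiltered(doc, outerId, innerId, value) {
  doc.array1
    .filter(function (outer) { return outer._id === outerId; })
    .forEach(function (outer) {
      outer.array2
        .filter(function (inner) { return inner._id === innerId; })
        .forEach(function (inner) { inner.answeredBy.push(value); });
    });
  return doc;
}

// Sample document shaped like the one in the question.
const doc = {
  _id: 1,
  array1: [
    {
      _id: "12",
      array2: [
        { _id: "123", answeredBy: [] },
        { _id: "124", answeredBy: [] }
      ]
    }
  ]
};

// Target the second inner element, which a hardcoded index 0 would miss.
pushFiltered(doc, "12", "124", "success");
console.log(JSON.stringify(doc.array1[0].array2));
```

Every element that satisfies both filters is touched, which mirrors the point above that these operators can update multiple array elements rather than only the first match.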
I know this is a very old question, but I just struggled with this problem myself, and found, what I believe to be, a better answer.
One way to solve this problem is to use sub-documents. This is done by nesting schemas within your schemas:
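// Define the innermost schema first so each one exists before it is referenced.
const Array2Schema = new mongoose.Schema({
  answeredBy: [...]
});
const Array1Schema = new mongoose.Schema({
  array2: [Array2Schema]
});
const MainSchema = new mongoose.Schema({
  array1: [Array1Schema]
});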
This way the object will look like the one you show, but now each array is filled with sub-documents. This makes it possible to dot your way into the sub-document you want. Instead of using .update, you then use .find or .findOne to get the document you want to update.
Main.findOne({ _id: 1 })
  .exec(function(err, result) {
    // .id() looks up a sub-document by its _id within the document array
    result.array1.id(12).array2.id(123).answeredBy.push('success');
    result.save(function(err) {
      console.log(result);
    });
  });
I haven't used the .push() function this way myself, so the syntax might not be right, but I have used both .set() and .remove(), and both work perfectly fine.

Sorting CouchDB result by value

I'm brand new to CouchDB (and NoSQL in general), and am creating a simple Node.js + express + nano app to get a feel for it. It's a simple collection of books with two fields, 'title' and 'author'.
Example document:
{
"_id": "1223e03eade70ae11c9a3a20790001a9",
"_rev": "2-2e54b7aa874059a9180ac357c2c78e99",
"title": "The Art of War",
"author": "Sun Tzu"
}
Map function:
function(doc) {
if (doc.title && doc.author) {
emit(doc.title, doc.author);
}
}
Since CouchDB sorts by key and supports a 'descending=true' query param, it was easy to implement a filter in the UI to toggle sort order on the title, which is the key in my results set. Here's the UI:
[Screenshot: list of books with a link to sort the titles in ascending or descending order]
But I'm at a complete loss on how to do this for the author field.
I've seen this question, which helped a poster sort by a numeric reduce value, and I've read a blog post that uses a list to also sort by a reduce value, but I've not seen any way to do this on a string value without a reduce.
If you want to sort by a particular property, you need to ensure that that property is the key (or, in the case of an array key, the first element in the array).
I would recommend using the sort key as the key, emitting a null value and using include_docs to fetch the full document to allow you to display multiple properties in the UI (this also keeps the deserialized value consistent so you don't need to change how you handle the return value based on sort order).
Your map functions would be as simple as the following.
For sorting by author:
function(doc) {
if (doc.title && doc.author) {
emit(doc.author, null);
}
}
For sorting by title:
function(doc) {
if (doc.title && doc.author) {
emit(doc.title, null);
}
}
Now you just need to change which view you call based on the selected sort order and ensure you use the include_docs=true parameter on your query.
You could also use a single view for this by emitting both at once...
emit(["by_author", doc.author], null);
emit(["by_title", doc.title], null);
... and then using the composite key for your query.
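To make the composite-key idea concrete, here is a minimal local simulation in plain JavaScript. The emit stub, the rows array, the query helper and the sample books are assumptions for illustration; real CouchDB collects and sorts the rows itself and uses its own collation rules rather than the simple string comparison below.

```javascript
// Local stand-ins for what CouchDB does when it runs a map function.
const rows = [];
let currentId = null;
function emit(key, value) {
  rows.push({ id: currentId, key: key, value: value });
}

// Single view emitting one row per sort order, as suggested above.
function map(doc) {
  if (doc.title && doc.author) {
    emit(["by_author", doc.author], null);
    emit(["by_title", doc.title], null);
  }
}

// Hypothetical sample data.
const books = [
  { _id: "a", title: "The Art of War", author: "Sun Tzu" },
  { _id: "b", title: "Meditations", author: "Marcus Aurelius" },
  { _id: "c", title: "Hamlet", author: "William Shakespeare" }
];
books.forEach(function (doc) { currentId = doc._id; map(doc); });

// Select the rows for one prefix and order them by the second key element.
function query(prefix, descending) {
  const selected = rows
    .filter(function (r) { return r.key[0] === prefix; })
    .sort(function (a, b) {
      return a.key[1] < b.key[1] ? -1 : a.key[1] > b.key[1] ? 1 : 0;
    });
  return descending ? selected.reverse() : selected;
}

console.log(query("by_author", false).map(function (r) { return r.key[1]; }));
console.log(query("by_title", true).map(function (r) { return r.key[1]; }));
```

Against a real view, the prefix is usually selected with a key range such as startkey=["by_author"] and endkey=["by_author", {}], with the two swapped when descending=true is used.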

CouchDb view - key in a list

I want to query CouchDB and I have a specific need: my query should return the name field of the documents that satisfy this condition: the id is equal to, or contained in, a document field (a list).
For example, the field output is the following :
"output": [
"doc_s100",
"doc_s101",
"doc_s102",
"doc_s103",
],
I want to get all the documents having in their output field "doc_s102" for example.
I wrote a view in a design document :
"backward_by_docid": {
"map": "function(doc) {if(doc.output) emit(doc.output, doc.name)}"
}
but this view works only when I have a unique value in the output field.
How can I resolve this query ?
Thanks !
You have to iterate over the array and emit one row per element:
if (doc.output) {
  for (var i = 0; i < doc.output.length; i++) {
    emit(doc.output[i], doc.name);
  }
}
Make sure that output is always an array (at least []).
And, of course, query with key="xxx" instead of key=["xxx"].
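A small local simulation may help show why emitting one row per array element makes the lookup work. The emit stub, the rows array and the sample documents are assumptions for illustration; in CouchDB the view engine provides emit() and performs the key lookup.

```javascript
// Local stand-ins for CouchDB's emit() and the resulting view rows.
const rows = [];
let currentId = null;
function emit(key, value) {
  rows.push({ id: currentId, key: key, value: value });
}

// Map function: one row per entry in the output array.
function map(doc) {
  if (Array.isArray(doc.output)) {
    doc.output.forEach(function (out) {
      emit(out, doc.name);
    });
  }
}

// Hypothetical sample documents.
const docs = [
  { _id: "d1", name: "first", output: ["doc_s100", "doc_s102"] },
  { _id: "d2", name: "second", output: ["doc_s102", "doc_s103"] },
  { _id: "d3", name: "third" } // no output field: emits nothing
];
docs.forEach(function (doc) { currentId = doc._id; map(doc); });

// Equivalent of querying the view with key="doc_s102".
const names = rows
  .filter(function (r) { return r.key === "doc_s102"; })
  .map(function (r) { return r.value; });
console.log(names);
```

Because each key is now a single string rather than the whole array, a plain string key matches every document that lists that id.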

Resources