MongoDB compare users based on nested array - node.js

I have a collection of users, each with a nested answers array. I need to compare their responses: every time two users gave the same answer, I add that answer's coeff value to a running total. If the total is above a threshold (for example 10), I send back all the matching users, each with their own total coeff.
So the question is: how do I compare users by walking the nested array (answers), checking whether the same field (answerChoice) has the same value for the same answer (answerNumber), adding another value from the nested array (answerCoeff) to a running total, and returning the total coeff together with the users that reach a certain coeff? A user document looks like this:
{
"id": "string",
"birthdate": "2021-06-18T13:53:30.443Z",
"userName": "string",
"pictures": [
"string"
],
"answers": [
{
"answerNumber": 0,
"answerChoice": 3,
"answerCoeff": 2
},
{
"answerNumber": 1,
"answerChoice": 2,
"answerCoeff": 5
}
...
],
}
Expected output:
{
"matchs": [
{
"ids": [
"string"
],
"userName": "string",
"pictures": [
"string"
],
"coeff": 0
}
]
}
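One way to approach this in application code, as a rough sketch rather than a tuned MongoDB query: load the users, then for a reference user sum answerCoeff over every answer where another user picked the same answerChoice for the same answerNumber. The names sharedCoeff and findMatches, the threshold parameter, reading the coefficient from the reference user's answer, and putting only the other user's id in ids are all assumptions; the question does not settle them.

```javascript
// Sum answerCoeff over answers where both users picked the same answerChoice
// for the same answerNumber. The coefficient is read from the reference
// user's answer (an assumption; the question does not say whose counts).
function sharedCoeff(reference, other) {
  const choices = new Map(
    (other.answers || []).map((a) => [a.answerNumber, a.answerChoice])
  );
  let total = 0;
  for (const a of reference.answers || []) {
    if (choices.has(a.answerNumber) && choices.get(a.answerNumber) === a.answerChoice) {
      total += a.answerCoeff;
    }
  }
  return total;
}

// Return every other user whose total is strictly above `threshold`,
// shaped like the expected output in the question.
function findMatches(reference, users, threshold = 10) {
  const matchs = [];
  for (const other of users) {
    if (other.id === reference.id) continue; // do not compare a user with itself
    const coeff = sharedCoeff(reference, other);
    if (coeff > threshold) {
      matchs.push({
        ids: [other.id],
        userName: other.userName,
        pictures: other.pictures,
        coeff,
      });
    }
  }
  return { matchs };
}

// Made-up sample data for illustration only.
const sampleUsers = [
  { id: "a", userName: "ann", pictures: [], answers: [
    { answerNumber: 0, answerChoice: 3, answerCoeff: 6 },
    { answerNumber: 1, answerChoice: 2, answerCoeff: 5 } ] },
  { id: "b", userName: "bob", pictures: [], answers: [
    { answerNumber: 0, answerChoice: 3, answerCoeff: 6 },
    { answerNumber: 1, answerChoice: 2, answerCoeff: 5 } ] },
  { id: "c", userName: "cy", pictures: [], answers: [
    { answerNumber: 0, answerChoice: 3, answerCoeff: 6 },
    { answerNumber: 1, answerChoice: 1, answerCoeff: 5 } ] },
];

const matchResult = findMatches(sampleUsers[0], sampleUsers);
console.log(JSON.stringify(matchResult, null, 2));
```

With this sample data only user "b" is returned, with a coeff of 11; user "c" shares a single answer (coeff 6) and stays below the threshold. Comparing every user against every other user this way is quadratic, so for a large collection it may be worth pushing the comparison into an aggregation pipeline instead.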

Related

Unable to fetch the entire column index based on the value using JSONPath finder in npm

I have the response payload below. I want to check whether the amount equals 1000 and, if it does, get the entire column array as output.
Sample Input:
{
"sqlQuery": "select SET_UNIQUE, amt as AMOUNT from transactionTable where SET_USER_ID=11651 ",
"message": "2 rows selected",
"row": [
{
"column": [
{
"value": "22621264",
"name": "SET_UNIQUE"
},
{
"value": "1000",
"name": "AMOUNT"
}
]
},
{
"column": [
{
"value": "226064213",
"name": "SET_UNIQUE"
},
{
"value": "916",
"name": "AMOUNT"
}
]
}
]
}
Expected Output:
"column": [
{
"value": "22621264",
"name": "SET_UNIQUE"
},
{
"value": "1000",
"name": "AMOUNT"
}
]
In the sample above, I want to fetch the entire column array when the AMOUNT value is 1000.
I tried the expressions below, but with no luck:
1. row[*].column[?(@.value==1000)].column
2. row[*].column[?(@.value==1000)]
I don't want to do this by index, because the index will change.
Any ideas, please?

I think you'd need nested expressions, which isn't something that's widely supported. Something like
$.row[?(@.column[?(@.value==1000)])]
The inner expression returns matches for value==1000, then the outer expression checks for existence of those matches.
Another alternative that might work is
$.row[?(@.column[*].value==1000)]
but this assumes some implicit type conversions that may or may not be supported.
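If the JSONPath library in use does not support nested filter expressions, the same selection can be done with plain array methods. This is only a sketch of that fallback, not a JSONPath solution; the function name columnsWhere is hypothetical, and values are compared as strings because the payload stores "1000" as a string.

```javascript
// Return the `column` array of every row that contains a column entry with
// the given name and value. Both sides are converted to strings so that
// 1000 and "1000" compare equal.
function columnsWhere(payload, name, value) {
  return (payload.row || [])
    .filter((r) =>
      (r.column || []).some(
        (c) => c.name === name && String(c.value) === String(value)
      )
    )
    .map((r) => r.column);
}

// The two rows from the sample input in the question.
const samplePayload = {
  row: [
    { column: [
      { value: "22621264", name: "SET_UNIQUE" },
      { value: "1000", name: "AMOUNT" } ] },
    { column: [
      { value: "226064213", name: "SET_UNIQUE" },
      { value: "916", name: "AMOUNT" } ] },
  ],
};

const matchingColumns = columnsWhere(samplePayload, "AMOUNT", 1000);
console.log(JSON.stringify(matchingColumns, null, 2));
```

Matching on the column name as well as the value avoids picking up a row where some other column happens to hold 1000.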

Can I index an array in a composite index in Azure Cosmos DB?

I have a problem indexing an array in Azure Cosmos DB.
I am trying to save this indexing policy via the portal:
{
"indexingMode": "consistent",
"automatic": true,
"includedPaths": [
{
"path": "/*"
}
],
"excludedPaths": [
{
"path": "/\"_etag\"/?"
}
],
"compositeIndexes": [
[
{
"path": "/DeviceId",
"order": "ascending"
},
{
"path": "/TimeStamp",
"order": "ascending"
},
{
"path": "/Items/[]/Name/?",
"order": "ascending"
},
{
"path": "/Items/[]/DoubleValue/?",
"order": "ascending"
}
]
]
}
I get the error "Failed to update container DeviceEvents:
Message: {"code":"BadRequest","message":"Message: {"Errors":["The indexing path '\/Items\/[]\/Name\/?' could not be accepted, failed near position '8'."
It seems to be the array [] syntax that is causing the error.
On a side note, I am not sure that what I am doing makes sense at all, but I have a query that looks like this:
SELECT SUM(de0["DoubleValue"])
FROM root JOIN de0 IN root["Items"]
WHERE root["ApplicationId"] = 57 AND root["DeviceId"] = 126 AND root["TimeStamp"] >= "2021-02-21T17:55:29.7389397Z" AND de0["Name"] = "Use Case"
Here ApplicationId is the partition key, and a saved item looks like this:
{
"id": "59ab9323-26ca-436f-8d29-e1ddd826f025",
"DeviceId": 3,
"ApplicationId": 3,
"RawData": "640F7A000A00E30142000000",
"TimeStamp": "2021-02-20T18:36:52.833174Z",
"Items": [
{
"Name": "Battery Status",
"StringValue": "Full",
"DoubleValue": null
},
{
"Name": "Use Case",
"StringValue": null,
"DoubleValue": 12
},
{
"Name": "Battery Voltage",
"StringValue": null,
"DoubleValue": 3.962
},
{
"Name": "Rain Gauge Count",
"StringValue": null,
"DoubleValue": 10
}
],
"_rid": "CgdVAO7B0DNkAAAAAAAAAA==",
"_self": "dbs/CgdVAA==/colls/CgdVAO7B0DM=/docs/CgdVAO7B0DNkAAAAAAAAAA==/",
"_etag": "\"61008771-0000-0d00-0000-603156c50000\"",
"_attachments": "attachments/",
"_ts": 1613846213
}
I need to aggregate over some of the items in the array, for example get the MAX of a temperature reading (I am using Use Case for testing, although it doesn't make sense). I reasoned that if all the data the query needs were in a single composite index, the database could do the aggregation without reading the documents themselves. However, I can't seem to add a composite index containing an array at all.

No, a composite index can't contain an array path. Every path in it must lead to a scalar value. From the documentation:
"Unlike with included or excluded paths, you can't create a path with the /* wildcard. Every composite path has an implicit /? at the end of the path that you don't need to specify. Composite paths lead to a scalar value and this is the only value that is included in the composite index."
Reference: https://learn.microsoft.com/en-us/azure/cosmos-db/index-policy#composite-indexes
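As a local sanity check before saving a policy in the portal, the rule above can be approximated in a few lines. This is a hypothetical helper, not part of any Cosmos DB SDK, and it only looks for the two things the answer rules out: an array marker ([]) and a wildcard (*).

```javascript
// List the composite index paths that contain an array marker ([]) or a
// wildcard (*), i.e. paths that do not lead to a single scalar value.
function findNonScalarCompositePaths(policy) {
  const rejected = [];
  for (const index of policy.compositeIndexes || []) {
    for (const entry of index) {
      if (entry.path.includes("[]") || entry.path.includes("*")) {
        rejected.push(entry.path);
      }
    }
  }
  return rejected;
}

// The composite index from the question.
const questionPolicy = {
  compositeIndexes: [
    [
      { path: "/DeviceId", order: "ascending" },
      { path: "/TimeStamp", order: "ascending" },
      { path: "/Items/[]/Name/?", order: "ascending" },
      { path: "/Items/[]/DoubleValue/?", order: "ascending" },
    ],
  ],
};

const rejectedPaths = findNonScalarCompositePaths(questionPolicy);
console.log(rejectedPaths);
```

For the policy in the question this flags the two /Items/[] paths and leaves /DeviceId and /TimeStamp, which are scalar.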

Is there a significant performance difference between querying for keyword tags using keywords as fields vs. values?

I have 4 documents:
[
{
"id": "doc1",
"keywords": [
{
"keyword": "keyword1",
"weight": 1
},
{
"keyword": "keyword2",
"weight": 2
}
]
},
{
"id": "doc2",
"keywords": [
{
"keyword": "keyword1",
"weight": 2
},
{
"keyword": "keyword3",
"weight": 4
}
]
},
{
"id": "doc3",
"keywords": {
"keyword1": {
"weight": 3
},
"keyword4": {
"weight": 5
}
}
},
{
"id": "doc4",
"keywords": {
"keyword4": {
"weight": 1
},
"keyword5": {
"weight": 2
}
}
}
]
The first two have a "keywords" field that is a list of dictionaries, each containing a keyword and a weight. The last two have a "keywords" field that is a dictionary keyed by the keywords themselves, with each value a dictionary containing the weight.
When I want to find documents that contain a particular keyword, I run this query:
SELECT c FROM c
JOIN k IN c.keywords
WHERE k.keyword = "keyword1"
Of the first two documents, this returns those that have the keyword "keyword1".
I can perform a similar query for the last two documents (although I may be adding unnecessary overhead with the weight check):
SELECT d FROM d
WHERE d.keywords.keyword1.weight > 0
The RU cost is slightly lower for the second one, but that is with only 4 documents. I will be scaling this up to around 10-20 million documents. Is one of these formats significantly more scalable than the other?

Both approaches are scalable, but the second approach using nested properties will be better overall in terms of efficiency/RUs.
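If existing documents use the array form, they can be reshaped into the nested-property form before being written back. A minimal sketch, assuming each keyword appears at most once per document (a later duplicate would overwrite an earlier one); the function name keywordsToMap is hypothetical.

```javascript
// Convert { keywords: [{ keyword, weight }, ...] } into
// { keywords: { <keyword>: { weight }, ... } }, leaving other fields as-is.
function keywordsToMap(doc) {
  const keywords = {};
  for (const { keyword, weight } of doc.keywords) {
    keywords[keyword] = { weight };
  }
  return { ...doc, keywords };
}

// doc1 from the question.
const doc1 = {
  id: "doc1",
  keywords: [
    { keyword: "keyword1", weight: 1 },
    { keyword: "keyword2", weight: 2 },
  ],
};

const reshaped = keywordsToMap(doc1);
console.log(JSON.stringify(reshaped, null, 2));
```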

Marklogic 8 Node.js API - How can I scope a search on a property child of root?

[updated 17:15 on 28/09]
I'm working with JSON data of this shape:
[
{
"id": 1,
"title": "Sun",
"seeAlso": [
{
"id": 2,
"title": "Rain"
},
{
"id": 3,
"title": "Cloud"
}
]
},
{
"id": 2,
"title": "Rain",
"seeAlso": [
{
"id": 3,
"title": "Cloud"
}
]
},
{
"id": 3,
"title": "Cloud",
"seeAlso": [
{
"id": 1,
"title": "Sun"
}
]
},
];
After loading this into the database, a Node.js search using
db.documents.query(
  q.where(
    q.collection('test films'),
    q.value('title','Sun')
  ).withOptions({categories: 'none'})
)
.result( function(results) {
  console.log(JSON.stringify(results, null,2));
});
will return both the film titled 'Sun' and the films which have a seeAlso/title property (forgive the XPath syntax) = 'Sun'.
I need to find, separately: 1/ films with title = 'Sun'; 2/ films with seeAlso/title = 'Sun'.
I tried a container query using q.scope() with no success. I can't find how to scope to the root object node (first case), and for the second case,
q.where(q.scope(q.property('seeAlso'), q.value('title','Sun')))
returns, as its first result, an item whose match covers all the text inside the root object node:
{
"index": 1,
"uri": "/1.json",
"path": "fn:doc(\"/1.json\")",
"score": 137216,
"confidence": 0.6202662,
"fitness": 0.6701325,
"href": "/v1/documents?uri=%2F1.json&database=Documents",
"mimetype": "application/json",
"format": "json",
"matches": [
{
"path": "fn:doc(\"/1.json\")/object-node()",
"match-text": [
"Sun Rain Cloud"
]
}
]
},
which seems crazy.
Any idea how to do such searches on denormalized JSON data?

Laurent:
XPaths on JSON are supported by MarkLogic.
In particular, you might consider setting up a path range index to match /title at the root:
http://docs.marklogic.com/guide/admin/range_index#id_54948
Scoped property matching requires either filtering or indexed positions to be accurate. An alternative is to set up another path range index on /seeAlso/title.
For the match issue it would be useful to know the MarkLogic version and to see the entire query.
Hoping that helps,

How to search through data with arbitrary amount of fields?

I have a web-form builder for science events. The event moderator creates a registration form with an arbitrary number of boolean, integer, enum and text fields.
The created form is used to:
register a new member for the event;
search through registered members.
What is the best search tool for the second task (searching the members of an event)? Is Elasticsearch well suited to this task?

I wrote a post about how to index arbitrary data into Elasticsearch and then to search it by specific fields and values. All this, without blowing up your index mapping.
The post is here: http://smnh.me/indexing-and-searching-arbitrary-json-data-using-elasticsearch/
In short, you will need to take the following steps to get what you want:
1. Create a special index described in the post.
2. Flatten the data you want to index using the flattenData function:
https://gist.github.com/smnh/30f96028511e1440b7b02ea559858af4.
3. Create a document with the original and flattened data and index it into Elasticsearch:
{
"data": { ... },
"flatData": [ ... ]
}
4. Optional: use Elasticsearch aggregations to find which fields and types have been indexed.
5. Execute queries on the flatData object to find what you need.
Example
Based on your original question, let's assume that the first event moderator created a form with the following fields to register members for the science event:
name string
age long
sex long - 0 for male, 1 for female
In addition to this data, the related event probably has some sort of id, let's call it eventId. So the final document could look like this:
{
"eventId": "2T73ZT1R463DJNWE36IA8FEN",
"name": "Bob",
"age": 22,
"sex": 0
}
Now, before we index this document, we will flatten it using the flattenData function:
flattenData(document);
This will produce the following array:
[
{
"key": "eventId",
"type": "string",
"key_type": "eventId.string",
"value_string": "2T73ZT1R463DJNWE36IA8FEN"
},
{
"key": "name",
"type": "string",
"key_type": "name.string",
"value_string": "Bob"
},
{
"key": "age",
"type": "long",
"key_type": "age.long",
"value_long": 22
},
{
"key": "sex",
"type": "long",
"key_type": "sex.long",
"value_long": 0
}
]
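The real flattenData lives in the gist linked above and is not reproduced here. Purely to illustrate the shape of the array just shown, a minimal sketch could look like the following. It is named flattenDataSketch to make clear that it is not the author's implementation; it handles only top-level fields, and the "boolean" and "double" type names are assumptions (only "string" and "long" appear in the example).

```javascript
// Turn a flat object into an array of { key, type, key_type, value_<type> }
// entries. Nested objects, arrays and null values are skipped in this sketch.
function flattenDataSketch(doc) {
  const flat = [];
  for (const [key, value] of Object.entries(doc)) {
    let type;
    if (typeof value === "string") type = "string";
    else if (typeof value === "boolean") type = "boolean"; // assumed type name
    else if (Number.isInteger(value)) type = "long";
    else if (typeof value === "number") type = "double"; // assumed type name
    else continue;
    flat.push({
      key,
      type,
      key_type: `${key}.${type}`,
      [`value_${type}`]: value,
    });
  }
  return flat;
}

const flatBob = flattenDataSketch({
  eventId: "2T73ZT1R463DJNWE36IA8FEN",
  name: "Bob",
  age: 22,
  sex: 0,
});
console.log(JSON.stringify(flatBob, null, 2));
```

Storing the value under a type-specific field (value_string, value_long) is what lets a field such as sex be a number in one form and a string in another without the two fighting over one mapping.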
Then we will wrap this data in a document, as shown earlier, and index it.
Then a second event moderator creates another form with a new field (city), a field with the same name and type as before (name), and a field with the same name but a different type (sex):
name string
city string
sex string - "male" or "female"
This event moderator decided that instead of having 0 and 1 for male and female, his form will allow choosing between two strings - "male" and "female".
Let's try to flatten the data submitted by this form:
flattenData({
"eventId": "F1BU9GGK5IX3ZWOLGCE3I5ML",
"name": "Alice",
"city": "New York",
"sex": "female"
});
This will produce the following data:
[
{
"key": "eventId",
"type": "string",
"key_type": "eventId.string",
"value_string": "F1BU9GGK5IX3ZWOLGCE3I5ML"
},
{
"key": "name",
"type": "string",
"key_type": "name.string",
"value_string": "Alice"
},
{
"key": "city",
"type": "string",
"key_type": "city.string",
"value_string": "New York"
},
{
"key": "sex",
"type": "string",
"key_type": "sex.string",
"value_string": "female"
}
]
Then, after wrapping the flattened data in a document and indexing it into Elasticsearch we can execute complicated queries.
For example, to find members named "Bob" registered for the event with ID 2T73ZT1R463DJNWE36IA8FEN we can execute the following query:
{
"query": {
"bool": {
"must": [
{
"nested": {
"path": "flatData",
"query": {
"bool": {
"must": [
{"term": {"flatData.key": "eventId"}},
{"match": {"flatData.value_string.keyword": "2T73ZT1R463DJNWE36IA8FEN"}}
]
}
}
}
},
{
"nested": {
"path": "flatData",
"query": {
"bool": {
"must": [
{"term": {"flatData.key": "name"}},
{"match": {"flatData.value_string": "bob"}}
]
}
}
}
}
]
}
}
}
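Because every condition in the query above has the same nested shape, the clauses can be generated rather than written by hand. A small sketch, with a hypothetical helper name nestedFieldClause; it only builds the request body as a plain object and does not talk to Elasticsearch.

```javascript
// Build one nested clause that matches a flattened entry by key and value.
// `valueField` is the field under flatData that holds the value, for example
// "value_string" or "value_string.keyword".
function nestedFieldClause(key, valueField, value) {
  return {
    nested: {
      path: "flatData",
      query: {
        bool: {
          must: [
            { term: { "flatData.key": key } },
            { match: { [`flatData.${valueField}`]: value } },
          ],
        },
      },
    },
  };
}

// Rebuild the query from the example: members named "bob" registered for
// the event with ID 2T73ZT1R463DJNWE36IA8FEN.
const bobQuery = {
  query: {
    bool: {
      must: [
        nestedFieldClause("eventId", "value_string.keyword", "2T73ZT1R463DJNWE36IA8FEN"),
        nestedFieldClause("name", "value_string", "bob"),
      ],
    },
  },
};

console.log(JSON.stringify(bobQuery, null, 2));
```

Each condition gets its own nested clause so that the key and the value are matched within the same flatData entry, rather than a key from one entry and a value from another.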

Elasticsearch automatically detects the field content in order to index it correctly, even if the mapping hasn't been defined previously. So, yes: Elasticsearch is well suited to these cases.
However, you may want to fine-tune this behavior, or the default mapping applied by Elasticsearch may not correspond to what you need. In that case, take a look at the default mapping or, for even further control, the dynamic templates feature.

If you let your end users decide the keys you store things in, you'll have an ever-growing mapping and cluster state, which is problematic.
This case and a suggested solution are covered in this article on common problems with Elasticsearch.
Essentially, you want to have everything that can possibly be user-defined as a value. Using nested documents, you can have a key-field and differently mapped value fields to achieve pretty much the same.
