Get max of grouped documents in CosmosDb - azure

I want to query Azure Cosmos DB documents with a SQL API query. The documents shall be filtered and grouped by specific values. From each group, only the document with the maximum value of a given field shall be returned.
Example
Azure CosmosDb Documents
{
"id": "1",
"someValue": "shall be included",
"group": "foo",
"timestamp": "1668907312"
}
{
"id": "2",
"someValue": "shall be included",
"group": "foo",
"timestamp": "1668907314"
}
{
"id": "3",
"someValue": "shall be included",
"group": "bar",
"timestamp": "1668907312"
}
{
"id": "4",
"someValue": "don't include",
"group": "bar",
"timestamp": "1668907312"
}
Query
I want to get all documents
with "someValue": "shall be included"
grouped by group
from group only max of timestamp
Example response
{
"id": "2",
"someValue": "shall be included",
"group": "foo",
"timestamp": "1668907314"
},
{
"id": "3",
"someValue": "shall be included",
"group": "bar",
"timestamp": "1668907312"
}
Question
What is the best way to do this? It would be optimal if
it is possible in one query
and executable with Azure SDK with use of SqlParameter (to prevent injection)
What I've tried
My current approach consists of 2 queries and uses ARRAY_CONTAINS, which does not allow the use of SqlParameter for the document paths.
{
"id": "2",
"some-value": "shall be included",
"group": "foo",
"timestamp": "1668907314"
}
First Query
SELECT c.group AS group,
MAX(c.timestamp) AS maxValue
FROM c
WHERE c.someValue = 'shall be included'
GROUP BY c.group
Second Query
SELECT * FROM c WHERE ARRAY_CONTAINS(
<RESULT-FIRST-QUERY>,
{
"group": c.group,
"maxValue": c.timestamp
},
false
)

I would use the MAX() function in conjunction with GROUP BY, i.e.:
SELECT *
FROM c
WHERE c.someValue = "shall be included"
GROUP BY c.group
HAVING MAX(c.timestamp)
I haven't run this yet (I would need to create a collection first), but it seems like it should do the trick.
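If that query is rejected (I am not certain the SQL API accepts HAVING, or SELECT * together with GROUP BY), the same selection can be done client-side after fetching the filtered documents. The following is only a minimal sketch in plain JavaScript over the sample documents from the question; latestPerGroup is a hypothetical helper and not part of any Azure SDK.

```javascript
// Sample documents copied from the question.
const docs = [
  { id: "1", someValue: "shall be included", group: "foo", timestamp: "1668907312" },
  { id: "2", someValue: "shall be included", group: "foo", timestamp: "1668907314" },
  { id: "3", someValue: "shall be included", group: "bar", timestamp: "1668907312" },
  { id: "4", someValue: "don't include", group: "bar", timestamp: "1668907312" },
];

// Hypothetical helper: for each group, keep the matching document
// with the largest timestamp.
function latestPerGroup(documents, someValue) {
  const best = new Map();
  for (const doc of documents) {
    if (doc.someValue !== someValue) continue;
    const current = best.get(doc.group);
    // Timestamps are stored as strings, so compare them numerically.
    if (!current || Number(doc.timestamp) > Number(current.timestamp)) {
      best.set(doc.group, doc);
    }
  }
  return [...best.values()];
}

const result = latestPerGroup(docs, "shall be included");
console.log(result.map((d) => d.id));
```

Because the filter value is an ordinary function argument here, it could equally be passed to the server-side filter as a query parameter, which keeps the injection concern from the question covered.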

Related

CosmosDb query to return arrays

I have the following data in my Collection
{
"id": "00000000-0000-0000-454c-4b74472b01d8",
"GroupId": 1,
"Location": "London",
"Status": "Ok"
},
{
"id": "d129adeb-d1bf-4a89-afe3-93e3f60589fb",
"GroupId": 1,
"Location": "Liverpool",
"Status": "Ok"
},
{
"id": "85ecf875-0e32-40b5-823a-a2545694f9b6",
"GroupId": 2,
"Location": "Manchester",
"Status": "Nok"
}
I need to build a query that returns all possible values per group, to be used for filtering.
Let's say for "GroupId": 1 I need a result like
{
"Location": [
"London",
"Liverpool"
],
"Status": [
"Ok"
]
}
for "GroupId": 2 the response:
{
"Location": [
"Manchester"
],
"Status": [
"Nok"
]
}
Could you please help me build such a query? I don't even know if it is possible with CosmosDb.
So far I have tried something like this, but it doesn't work:
select
(
select VALUE c.Location
FROM c
WHERE c.GroupId = 1
GROUP BY c.Location
) as Location,
(
select VALUE c.Status
FROM c
WHERE c.GroupId = 1
GROUP BY c.Status
) as Status
from c
WHERE c.GroupId = 1
and this
select
[
(SELECT VALUE [c.Location] from c)
] as Location,
[
(SELECT VALUE [c.Status] from c)
] as Status
from c
where c.GroupId = 1
Please help or suggest how to solve that. Thank you in advance.
It's not possible to do this with the way your data is modeled.
With the ARRAY expression you can do this in a subquery for arrays within a single document, but not when the data spans multiple documents, as is the case here.
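One workaround is to fetch the documents of one group (for example with a parameterized filter on GroupId) and build the arrays in application code. The following is a minimal sketch in plain JavaScript over the sample data from the question; distinctValuesForGroup is a hypothetical helper, not a Cosmos DB API.

```javascript
// Sample documents copied from the question.
const items = [
  { id: "00000000-0000-0000-454c-4b74472b01d8", GroupId: 1, Location: "London", Status: "Ok" },
  { id: "d129adeb-d1bf-4a89-afe3-93e3f60589fb", GroupId: 1, Location: "Liverpool", Status: "Ok" },
  { id: "85ecf875-0e32-40b5-823a-a2545694f9b6", GroupId: 2, Location: "Manchester", Status: "Nok" },
];

// Hypothetical helper: collect the distinct values of each requested
// field across all documents that belong to the given group.
function distinctValuesForGroup(documents, groupId, fields) {
  const result = {};
  for (const field of fields) {
    const seen = new Set();
    for (const doc of documents) {
      if (doc.GroupId === groupId) seen.add(doc[field]);
    }
    result[field] = [...seen];
  }
  return result;
}

console.log(distinctValuesForGroup(items, 1, ["Location", "Status"]));
```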

Merge documents by fields

I have two types of docs. Main docs and additional info for it.
{
"id": "371",
"name": "Mike",
"location": "Paris"
},
{
"id": "371-1",
"age": 20,
"lastname": "Piterson"
}
I need to merge them by id, to get result doc. The result should look like:
{
"id": "371",
"name": "Mike",
"location": "Paris",
"age": 20,
"lastname": "Piterson"
}
Using COLLECT / INTO, SPLIT(), and MERGE():
FOR doc IN collection
COLLECT id = SPLIT(doc.id, '-')[0] INTO groups
RETURN MERGE(MERGE(groups[*].doc), {id})
Result:
[
{
"id": "371",
"location": "Paris",
"name": "Mike",
"lastname": "Piterson",
"age": 20
}
]
This will:
Split each id attribute at any - and return the first part
Group the results into separate arrays (groups)
Merge #1: Merge all objects into one
Merge #2: Merge the id into the result
See REMOVE & INSERT or REPLACE for write operations.
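For readers less familiar with AQL, the same split, group and merge steps can be sketched in plain JavaScript. This is only an illustration of the logic on the two sample documents; mergeByIdPrefix is a hypothetical helper and is not how the database executes the query.

```javascript
// Sample documents copied from the question.
const collection = [
  { id: "371", name: "Mike", location: "Paris" },
  { id: "371-1", age: 20, lastname: "Piterson" },
];

// Hypothetical helper mirroring the AQL steps.
function mergeByIdPrefix(docs) {
  const groups = new Map();
  for (const doc of docs) {
    // Split the id at "-" and keep the first part as the group key.
    const id = doc.id.split("-")[0];
    if (!groups.has(id)) groups.set(id, []);
    groups.get(id).push(doc);
  }
  // Merge #1: merge all objects of a group into one.
  // Merge #2: overwrite id with the group key.
  return [...groups].map(([id, members]) => Object.assign({}, ...members, { id }));
}

console.log(mergeByIdPrefix(collection));
```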

How to use pagination on dynamoDB

How can you make a paginated request (limit, offset, and sort_by) using DynamoDB?
In MySQL you can write:
SELECT ... ORDER BY created_date ASC LIMIT 10 OFFSET 1
I'm trying to do this using Node.js. In this case created_date isn't the primary key. Can I query using created_date as the sort key?
This is my users table
{
"user_id": "asa2311",
"created_date": "2019/01/18 15:05:59",
"status": "A",
"rab_item_id": "0",
"order_id": "1241241",
"description": "testajabroo",
"id": "e3f46600-1af7-11e9-ac22-8d3a3e79a693",
"title": "test"
},
{
"user_id": "asa2311",
"status_id": "D",
"created_date": "2019/01/18 14:17:46",
"order_id": "1241241",
"rab_item_id": "0",
"description": "testajabroo",
"id": "27b5b0d0-1af1-11e9-b843-77bf0166a09f",
"title": "test"
},
{
"user_id": "asa2311",
"created_date": "2019/01/18 15:05:35",
"status": "A",
"rab_item_id": "0",
"order_id": "1241241",
"description": "testajabroo",
"id": "d5879e70-1af7-11e9-8abb-0fa165e7ac53",
"title": "test"
}
Pagination in DynamoDB is handled by setting the ExclusiveStartKey parameter to the LastEvaluatedKey returned from the previous result. There is no way to start after a specific number of items like you can with OFFSET in MySQL.
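A minimal sketch of that loop in JavaScript follows. It assumes a hypothetical async function queryPage that takes query parameters and resolves to an object with Items and LastEvaluatedKey, which is the shape of a DynamoDB query response; fakeQueryPage is an in-memory stand-in used only so the sketch runs without AWS.

```javascript
// Hypothetical helper: keep requesting pages, feeding each page's
// LastEvaluatedKey back in as ExclusiveStartKey, until no key is returned.
async function fetchAllPages(queryPage, params) {
  const items = [];
  let startKey;
  do {
    const page = await queryPage(
      startKey ? { ...params, ExclusiveStartKey: startKey } : params
    );
    items.push(...page.Items);
    startKey = page.LastEvaluatedKey; // undefined once the last page is reached
  } while (startKey);
  return items;
}

// In-memory stand-in for a table sorted by created_date (assumption for the demo).
const rows = ["2019/01/18 14:17:46", "2019/01/18 15:05:35", "2019/01/18 15:05:59"]
  .map((created_date) => ({ user_id: "asa2311", created_date }));

async function fakeQueryPage({ Limit, ExclusiveStartKey }) {
  const start = ExclusiveStartKey
    ? rows.findIndex((r) => r.created_date === ExclusiveStartKey.created_date) + 1
    : 0;
  const Items = rows.slice(start, start + Limit);
  const end = start + Items.length;
  const last = Items[Items.length - 1];
  return {
    Items,
    LastEvaluatedKey: end < rows.length
      ? { user_id: last.user_id, created_date: last.created_date }
      : undefined,
  };
}

fetchAllPages(fakeQueryPage, { Limit: 2 }).then((items) => console.log(items.length));
```

To serve one page per request instead of collecting everything, the caller would return page.LastEvaluatedKey to the client and accept it back as the cursor for the next request.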

how to implement algolia autocomplete on a single index, but i want results to show based on facets

I have an index on algolia, each document like this.
{
"title": "sample title",
"slug": "sample slug",
"content": "Head towards Rajinder Da Dhaba for some insanely delicious Kebabs!!",
"Tags": ["fashion", "shoes"],
"created": "2017-03-30T12:10:08.815Z",
"city": "delhi",
"user": {
"_id": "58b6f3ea884fdc682a820dad",
"description": "Roughly, somewhere between insanity and zen. Mostly the guy at the window seat!",
"displayName": "Jon Doe"
},
"type": "Post",
"places": [
{
"name": "Rajinder Da Dhaba",
"slug": "Rajinder-Da-Dhaba-safdarjung-9e9ffe",
"location": {
"_geoloc": [
{
"name": "Safdarjung",
"_id": "59611a2c2094b56a39afcbce",
"coordinates": {
"lng": 77.2030268,
"lat": 28.5685586
}
}
]
}
}
],
"objectID": "58dcf5a0355b590560d6ad68"
}
I want to implement autocomplete on this.
However, when I looked at the demos in the Algolia dashboard, I found that they return the complete documents.
I want to match only on user.displayName, place.name, and title,
and return only these fields as suggestions in the autocomplete results instead of the complete matching documents.
I know I can create separate indexes for users and places,
but is this possible with only a single index?
Did you have a look at http://algolia.com/doc/tutorials/search-ui/autocomplete/auto-complete/ ?
It shows how to build a custom display from an index.
To match on user.displayName, place.name, and title,
you can configure the "searchable attributes" from the Algolia dashboard.
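As a rough sketch of both halves: the settings object below uses Algolia's searchableAttributes setting with attribute paths taken from the sample record (the question writes place.name, but the record's array is called places), and toSuggestions is a hypothetical helper that reduces a returned hit to just the three fields for display. Neither is taken from the linked tutorial.

```javascript
// Index settings fragment: restrict matching to the three attributes.
// The attribute paths are assumptions based on the record in the question.
const indexSettings = {
  searchableAttributes: ["title", "user.displayName", "places.name"],
};

// Hypothetical helper: turn one hit into plain suggestion entries,
// dropping every other field of the record.
function toSuggestions(hit) {
  const suggestions = [];
  if (hit.title) suggestions.push({ type: "title", value: hit.title });
  if (hit.user && hit.user.displayName) {
    suggestions.push({ type: "user", value: hit.user.displayName });
  }
  for (const place of hit.places || []) {
    if (place.name) suggestions.push({ type: "place", value: place.name });
  }
  return suggestions;
}

// Trimmed copy of the sample record from the question.
const sampleHit = {
  title: "sample title",
  city: "delhi",
  user: { _id: "58b6f3ea884fdc682a820dad", displayName: "Jon Doe" },
  places: [{ name: "Rajinder Da Dhaba", slug: "Rajinder-Da-Dhaba-safdarjung-9e9ffe" }],
  objectID: "58dcf5a0355b590560d6ad68",
};

console.log(toSuggestions(sampleHit));
```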

Marklogic 8 Node.js API - How can I scope a search on a property child of root?

[updated 17:15 on 28/09]
I'm manipulating json data of type:
[
{
"id": 1,
"title": "Sun",
"seeAlso": [
{
"id": 2,
"title": "Rain"
},
{
"id": 3,
"title": "Cloud"
}
]
},
{
"id": 2,
"title": "Rain",
"seeAlso": [
{
"id": 3,
"title": "Cloud"
}
]
},
{
"id": 3,
"title": "Cloud",
"seeAlso": [
{
"id": 1,
"title": "Sun"
}
]
},
];
After inclusion in the database, a node.js search using
db.documents.query(
q.where(
q.collection('test films'),
q.value('title','Sun')
).withOptions({categories: 'none'})
)
.result( function(results) {
console.log(JSON.stringify(results, null,2));
});
will return both the film titled 'Sun' and the films which have a seeAlso/title property (forgive the xpath syntax) = 'Sun'.
I need to find 1/ films with title = 'Sun' 2/ films with seeAlso/title = 'Sun'.
I tried a container query using q.scope() with no success. I can't find how to scope the root object node (first case), and for the second case,
q.where(q.scope(q.property('seeAlso'), q.value('title','Sun')))
returns as first result an item which matches all text inside the root object node
{
"index": 1,
"uri": "/1.json",
"path": "fn:doc(\"/1.json\")",
"score": 137216,
"confidence": 0.6202662,
"fitness": 0.6701325,
"href": "/v1/documents?uri=%2F1.json&database=Documents",
"mimetype": "application/json",
"format": "json",
"matches": [
{
"path": "fn:doc(\"/1.json\")/object-node()",
"match-text": [
"Sun Rain Cloud"
]
}
]
},
which seems crazy.
Any idea how to do such searches on denormalized JSON data?
Laurent:
XPaths on JSON are supported by MarkLogic.
In particular, you might consider setting up a path range index to match /title at the root:
http://docs.marklogic.com/guide/admin/range_index#id_54948
Scoped property matching requires either filtering or indexed positions to be accurate. An alternative is to set up another path range index on /seeAlso/title.
For the match issue it would be useful to know the MarkLogic version and to see the entire query.
Hoping that helps,
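To make the two intended result sets concrete, here is a plain JavaScript sketch over the sample data from the question. It does not use the MarkLogic API; byRootTitle and bySeeAlsoTitle are hypothetical helpers that only illustrate what a search scoped to /title versus /seeAlso/title would be expected to return.

```javascript
// Sample data copied from the question.
const films = [
  { id: 1, title: "Sun", seeAlso: [{ id: 2, title: "Rain" }, { id: 3, title: "Cloud" }] },
  { id: 2, title: "Rain", seeAlso: [{ id: 3, title: "Cloud" }] },
  { id: 3, title: "Cloud", seeAlso: [{ id: 1, title: "Sun" }] },
];

// Case 1: match only the title property directly under the root object.
const byRootTitle = (docs, title) => docs.filter((d) => d.title === title);

// Case 2: match only title properties nested inside seeAlso.
const bySeeAlsoTitle = (docs, title) =>
  docs.filter((d) => (d.seeAlso || []).some((s) => s.title === title));

console.log(byRootTitle(films, "Sun").map((d) => d.id));
console.log(bySeeAlsoTitle(films, "Sun").map((d) => d.id));
```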
