CosmosDB - list in aggregate query response - azure

I have following document structure:
{
"id": "1",
"aId": "2",
"bId": "3",
....
},
{ "id":"2",
"aId": "2",
"bId": "4"
}
How do I return, for a given aId, a JSON document that contains the list of all bIds sharing that aId and, as an additional field, the count of those bIds? So for the example above and the condition WHERE aId = "2", the response would be:
{
"aId": "2",
"bIds" : ["4","3"],
"bIds count" : 2
}
Assuming I only pass one aId as a parameter.
I tried something like:
select
(select 'something') as aId,
(select distinct value c.bId from c where c.aId='something') as bIds
from TableName c
But for the love of me, I can't figure out how to get that list, its count, and the hardcoded aId in a single JSON response (a single row).
For example, this query:
select
(select distinct value 'someId') as aId,
(select distinct value c.bId) as bIds
from c where c.aId='someId'
will return
{ { 'aId': 'someId', 'bIds':'2'},{'aId':'someId','bIds':'4'}}
while what I actually want is
{ {'aId':'someId', 'bIds':['2','4']}}
Here is the query that comes closest to what I want:
select
c.aId as aId,
count(c2) as bIdCount,
array(select distinct value c2.bId from c2)
from c join (select c.bId from c) as c2
where c.aId = 'SOME_ID'
Only the line with array makes this query fail; if I delete that line, the query works (it correctly returns the id and count in one row). But I also need to select the content of this list, and I am lost as to why it's not working. The example is almost copy-pasted from "How to perform array projection Cosmos Db":
https://azurelessons.com/array-in-cosmos-db/#How_to_perform_array_projection_Azure_Cosmos_DB

Here is how you'd return an array of bId:
SELECT distinct value c.bId
FROM c
where c.aId = "2"
This yields:
[
"3",
"4"
]
Removing the value keyword:
SELECT distinct c.bId
FROM c
where c.aId = "2"
yields:
[
{ "bId" : "3" },
{ "bId" : "4" }
]
From either of these, you can count the number of array elements returned. If your payload must include count and aId, you'll need to add those to your JSON output.
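One way to add those fields is to assemble the payload in client code after running the first query. This is a minimal Python sketch, assuming the query result has already been fetched into a list; the function name build_payload and the field name bIdCount are made up for illustration:

```python
def build_payload(a_id, b_ids):
    """Wrap the result of SELECT DISTINCT VALUE c.bId ... in one JSON-ready dict."""
    # The query already de-duplicates; dict.fromkeys keeps this safe
    # (and order-preserving) if the list comes from somewhere else.
    unique_b_ids = list(dict.fromkeys(b_ids))
    return {
        "aId": a_id,
        "bIds": unique_b_ids,
        "bIdCount": len(unique_b_ids),
    }


# Rows as returned by the first query above for aId = "2".
rows = ["3", "4"]
payload = build_payload("2", rows)
print(payload)
```

The count is derived from the list itself, so the two fields cannot drift apart.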

Related

Cosmos DB json array that matchs all the words

I need a query that returns only the documents that match every word in a list. For example, if I use
select c from c join (SELECT distinct VALUE c.id FROM c JOIN word IN c.words WHERE word in('word1','word2') and tag in('motorcycle')) ORDER BY c._ts desc
it returns both documents. I want to retrieve only the first one, because it matches both words and not just one.
Document 1
"c": {
"id": "d0f1723c-0a55-454a-9cf8-3884f2d8d61a",
"words": [
"word1",
"word2",
"word3",
]}
Document 2
"c": {
"id": "d0f1723c-0a55-454a-9cf8-3884f2d8d61a",
"words": [
"word1",
"word4",
"word5",
]}
You should be able to cover this with two ARRAY_CONTAINS expressions in your WHERE clause (and no need for a JOIN):
SELECT c.id FROM c
WHERE ARRAY_CONTAINS(c.words, 'word1')
AND ARRAY_CONTAINS(c.words, 'word2')
This should return the id of your first document.
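If the word list varies, the same WHERE clause can be generated rather than typed by hand. The following is a sketch in Python, assuming the words arrive as a list; build_all_words_query and the @w0, @w1 parameter names are hypothetical, and the name/value dictionaries follow the shape commonly used for parameterized Cosmos DB queries:

```python
def build_all_words_query(words):
    """Return (query_text, parameters) requiring every word to be in c.words."""
    if not words:
        raise ValueError("at least one word is required")
    clauses = []
    parameters = []
    for i, word in enumerate(words):
        name = f"@w{i}"  # hypothetical parameter naming scheme
        clauses.append(f"ARRAY_CONTAINS(c.words, {name})")
        parameters.append({"name": name, "value": word})
    query = "SELECT c.id FROM c WHERE " + " AND ".join(clauses)
    return query, parameters


def matches_all(doc_words, words):
    """Local equivalent of the filter, handy for checking expectations."""
    return all(word in doc_words for word in words)


query, parameters = build_all_words_query(["word1", "word2"])
print(query)
print(matches_all(["word1", "word2", "word3"], ["word1", "word2"]))
print(matches_all(["word1", "word4", "word5"], ["word1", "word2"]))
```

Passing the words as parameters instead of concatenating them into the query text avoids quoting problems when a word contains an apostrophe.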

How do I select a numerical field in Cosmos DB?

I have data that looks like: {"id": "abc", "1":"2", "3":"5"}
I'm trying to select this data with this SQL query:
SELECT c.3 FROM c WHERE c.id = '102'
This gives me a syntax error. I also tried c.'3', "c.3", and c."3", but none of those worked.
Is there a way to do this?
Please try something like:
SELECT c["1"], c["3"] FROM c WHERE c.id = '102'
It will produce an output like:
[
{
"1": "2",
"3": "5"
}
]
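The bracket form works for any property name, so it can be generated mechanically when names are not valid identifiers. Here is a small Python sketch; property_ref and select_properties are hypothetical helpers, and json.dumps is used only to produce the double-quoted key:

```python
import json


def property_ref(alias, key):
    """Build a bracket-notation reference such as c["3"] for any property name."""
    # json.dumps adds the double quotes and escapes embedded quotes/backslashes.
    return f"{alias}[{json.dumps(key)}]"


def select_properties(alias, keys):
    """Build a SELECT over the given property names using bracket notation."""
    columns = ", ".join(property_ref(alias, key) for key in keys)
    return f"SELECT {columns} FROM {alias}"


print(select_properties("c", ["1", "3"]))
# SELECT c["1"], c["3"] FROM c
```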

What happens when adding a field in UDT in Cassandra?

For example, suppose I have a basic_info type:
CREATE TYPE basic_info (first_name text, last_name text, nationality text)
And table like this:
CREATE TABLE student_stats (id int PRIMARY KEY, grade text, basics FROZEN<basic_info>)
And I have millions of records in the table.
If I add a field to basic_info like this:
ALTER TYPE basic_info ADD address text;
What happens in Cassandra when you add a new field to a UDT that is currently used as a column in a table? I ask because I am afraid of side effects when the table contains a lot of data (millions of records). Ideally, please explain what happens from start to end.
The fields of a UDT are described in the table system_schema.types. When you add a new field, the entry for that type is updated inside Cassandra, but the data on disk does not change (SSTables are immutable). Instead, when Cassandra reads data, it checks whether the field is present. If it is not (because it was never set, or because it is a new field of the UDT), Cassandra returns null for that value without modifying the data on disk.
For example, if I have the following type and a table that uses it:
CREATE TYPE test.udt (
id int,
t1 int
);
CREATE TABLE test.u2 (
id int PRIMARY KEY,
u udt
)
And I have some data in the table, so I get:
cqlsh> select * from test.u2;
id | u
----+----------------
5 | {id: 1, t1: 3}
If I add a field to the UDT with ALTER TYPE test.udt ADD t2 int; I immediately see null as the value of the new UDT field:
cqlsh> select * from test.u2;
id | u
----+--------------------------
5 | {id: 1, t1: 3, t2: null}
And if I run sstabledump on the SSTable, I can see that it contains only the old data:
[
{
"partition" : {
"key" : [ "5" ],
"position" : 0
},
"rows" : [
{
"type" : "row",
"position" : 46,
"liveness_info" : { "tstamp" : "2019-07-28T09:33:12.019Z" },
"cells" : [
{ "name" : "u", "path" : [ "id" ], "value" : 1 },
{ "name" : "u", "path" : [ "t1" ], "value" : 3 }
]
}
]
}
]
See also my answer about adding/removing columns
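The read-path behaviour described above can be pictured with a toy model. This Python sketch is only an illustration, not Cassandra's code: the stored cells stay untouched, and any field missing from them is filled with None (null) when the value is read against the current type definition. All names here are invented.

```python
def read_udt(type_fields, stored_cells):
    """Materialize a UDT value from stored cells using the current field list."""
    # Fields absent from the stored data (never set, or added later) read as None.
    return {field: stored_cells.get(field) for field in type_fields}


# Cells written before the ALTER TYPE; reading never rewrites them.
stored = {"id": 1, "t1": 3}

fields_before = ["id", "t1"]
fields_after = fields_before + ["t2"]  # models: ALTER TYPE test.udt ADD t2 int

print(read_udt(fields_before, stored))  # {'id': 1, 't1': 3}
print(read_udt(fields_after, stored))   # {'id': 1, 't1': 3, 't2': None}
print(stored)                           # {'id': 1, 't1': 3}
```

In this model the schema change touches only the field list, which mirrors why the sstabledump output above still shows just the id and t1 cells.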

Sub query and GroupBy in Azure cosmosDB

I have a situation where I need to get container items based on a GROUP BY subquery. This looks simple, but it is not working for me. Help is appreciated! Below is the SQL query:
SELECT * FROM my_container
WHERE my_container.item.Id
IN (SELECT VALUE c.item.Id FROM c WHERE c.item.name = 'ABC'
GROUP BY c.item.Id )
It gives an error because the subquery result is not in the format that IN accepts, e.g. IN ('a', 'b').
The items in my_container look something like this:
[
{
item : {
name: "ABC",
id: "1",
address1: "address1",
city: "city1"
},
item : {
name: "ABC",
id: "2",
address1: "address2",
city: "city2"
},
item : {
name: "ABC",
id: "3",
address1: "address3",
city: "city3"
},
}
]
The result of your subquery is an array ([]), but the IN keyword only accepts a parenthesized list (()).
I tried this query:
SELECT * FROM c WHERE ARRAY_CONTAINS((SELECT VALUE c.item.id FROM c WHERE c.item.name = 'ABC' GROUP BY c.item.id),c.item.id,false)
But it returns 0 rows. The reason is that the ARRAY_CONTAINS() function does not support a subquery as an argument.
As a workaround, you can use two queries to achieve the goal.
First, execute SELECT VALUE c.item.id FROM c WHERE c.item.name = 'ABC' GROUP BY c.item.id to get the output array.
Then pass the result of the first step to ARRAY_CONTAINS() and execute the query below:
SELECT * FROM c WHERE ARRAY_CONTAINS(['1','2','3'],c.item.id,false)
By the way, subqueries in Cosmos DB differ from those in relational databases. To learn more about subqueries, please refer to this document.
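The two steps can be chained in client code so that nothing is pasted by hand. This is a sketch in Python under some assumptions: run_query stands in for whatever executes a query against the container (for example a thin wrapper around an SDK call), and the ids are passed as a single array parameter named @ids instead of an inline literal. Both names are hypothetical, and fake_run_query exists only so the sketch runs on its own.

```python
def find_items_by_name(run_query, name):
    """Step 1: collect the ids. Step 2: fetch the items whose id is in that list."""
    ids = list(run_query(
        "SELECT VALUE c.item.id FROM c WHERE c.item.name = @name GROUP BY c.item.id",
        [{"name": "@name", "value": name}],
    ))
    if not ids:
        return []  # nothing matched, so skip the second round trip
    return list(run_query(
        "SELECT * FROM c WHERE ARRAY_CONTAINS(@ids, c.item.id, false)",
        [{"name": "@ids", "value": ids}],
    ))


def fake_run_query(query, parameters):
    """In-memory stand-in for a container; it only mimics the two queries above."""
    params = {p["name"]: p["value"] for p in parameters}
    docs = [
        {"item": {"name": "ABC", "id": "1"}},
        {"item": {"name": "ABC", "id": "2"}},
        {"item": {"name": "XYZ", "id": "3"}},
    ]
    if "@name" in params:
        return sorted({d["item"]["id"] for d in docs if d["item"]["name"] == params["@name"]})
    return [d for d in docs if d["item"]["id"] in params["@ids"]]


result = find_items_by_name(fake_run_query, "ABC")
print([d["item"]["id"] for d in result])
```

The cost of this approach is two round trips, and the id list has to fit comfortably in a single query parameter.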
In addition to Steve's answer, you can call the ARRAY function on your subquery instead of copying and pasting the query result, like this:
SELECT * FROM c WHERE ARRAY_CONTAINS(
ARRAY(SELECT VALUE c.item.id FROM c WHERE c.item.name = 'ABC' GROUP BY c.item.id),
c.item.id,
false
)

Azure Document Query Sub Dictionaries

I have stored the following JSON document in the Azure Document DB:
"JobId": "04e63d1d-2af1-42af-a349-810f55817602",
"JobType": 3,
"
"Properties": [
{
"Key": "Value1",
"Value": "testing1"
},
{
"Key": "Value",
"Value": "testing2"
}
]
When I try to query the document back, I can easily run:
Select f.id,f.Properties, C.Key from f Join C IN f.Properties where C.Key = 'Value1'
However when i try to query:
Select f.id,f.Properties, C.Key from f Join C IN f.Properties where C.Value = 'testing1'
I get an error that the query cannot be computed. I assume this is due to 'VALUE' being a reserved keyword within the query language.
I cannot rely on a specific order in the property array, because different subclasses can add different properties in different orders as they need them.
Does anybody have a suggestion for how I can still complete this query?
To escape keywords in DocumentDB, you can use the [] syntax. For example, the above query would be:
Select f.id,f.Properties, C.Key from f Join C IN f.Properties where C["Value"] = 'testing1'
