Matching multiple values using Azure Cosmos DB

I have the following JSON in my Cosmos DB:
[
{
"FirstName": "FirstName",
"LastName": "LastName",
"TechnologyRatings": [
{
"Technology": {
"Name": "C#",
"id": "d76d59a7-c9a3-404d-91dd-cf2596ee7501"
},
"Rating": 1
},
{
"Technology": {
"Name": "SQL",
"id": "5686189b-ccfc-41c6-bcdb-b56f80130b45"
},
"Rating": 2
}
],
"id": "7c34718f-ef01-4b40-9a03-f0880f424fd4",
"ModifiedAt": "2021-05-28T09:55:37.6260562Z",
"_rid": "GyRkALN-kZcCAAAAAAAAAA==",
"_self": "dbs/GyRkAA==/colls/GyRkALN-kZc=/docs/GyRkALN-kZcCAAAAAAAAAA==/",
"_etag": "\"00000000-0000-0000-53a7-9c3d693501d7\"",
"_attachments": "attachments/",
"_ts": 1622195737
}
]
Now I want to filter on both Technology.id and Rating. For example, I want to select all entries that have C# with Rating = 1 and all entries that have SQL with Rating = 2.
Something like:
(Technology.id = "d76d59a7-c9a3-404d-91dd-cf2596ee7501" and Rating = 1) OR (Technology.id = "5686189b-ccfc-41c6-bcdb-b56f80130b45" and Rating = 2)
Because TechnologyRatings is an array, that doesn't work as written.
I also experimented with ARRAY_CONTAINS, but I couldn't get it to work:
SELECT VALUE c FROM c JOIN t IN c.TechnologyRatings WHERE ARRAY_CONTAINS([{"id": "d76d59a7-c9a3-404d-91dd-cf2596ee7501", "Rating": 1}, {"id": "5686189b-ccfc-41c6-bcdb-b56f80130b45", "Rating": 2}], {"id": t.Technology.id, "Rating": t.Rating}, true)
How can I write such a query?

You can try this SQL:
SELECT
Distinct VALUE c
FROM c
JOIN t IN c.TechnologyRatings
WHERE (t.Technology.id = "d76d59a7-c9a3-404d-91dd-cf2596ee7501" and t.Rating = 1) OR (t.Technology.id = "5686189b-ccfc-41c6-bcdb-b56f80130b45" and t.Rating = 2)
or
SELECT
VALUE c
FROM c
WHERE
ARRAY_CONTAINS(c.TechnologyRatings, {"Technology": {"id": "d76d59a7-c9a3-404d-91dd-cf2596ee7501"}, "Rating": 1}, true)
OR
ARRAY_CONTAINS(c.TechnologyRatings, {"Technology": {"id": "5686189b-ccfc-41c6-bcdb-b56f80130b45"}, "Rating": 2}, true)
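
Whichever form is used, the id and the rating have to be tested against the same array element. The following is a minimal Python sketch of that point, run in memory on the sample document; it is not the Cosmos DB SDK, and the function names are made up for illustration. It contrasts a per-element test, which is what the JOIN query performs, with looking up the id and the rating separately across the array.

```python
# Sample document, trimmed to the fields the filter touches.
doc = {
    "id": "7c34718f-ef01-4b40-9a03-f0880f424fd4",
    "TechnologyRatings": [
        {"Technology": {"Name": "C#", "id": "d76d59a7-c9a3-404d-91dd-cf2596ee7501"}, "Rating": 1},
        {"Technology": {"Name": "SQL", "id": "5686189b-ccfc-41c6-bcdb-b56f80130b45"}, "Rating": 2},
    ],
}

# (technology id, rating) pairs to look for; matching any one pair is enough (OR).
wanted = {
    ("d76d59a7-c9a3-404d-91dd-cf2596ee7501", 1),
    ("5686189b-ccfc-41c6-bcdb-b56f80130b45", 2),
}

# Hypothetical "crossed" pair: C# with rating 2, which no single element carries.
crossed = {("d76d59a7-c9a3-404d-91dd-cf2596ee7501", 2)}


def matches_same_element(document, pairs):
    """True if one array element carries both the id and the rating of a pair."""
    return any(
        (t["Technology"]["id"], t["Rating"]) in pairs
        for t in document["TechnologyRatings"]
    )


def matches_split_checks(document, pairs):
    """True if the id and the rating each occur somewhere, possibly on different elements."""
    ids = {t["Technology"]["id"] for t in document["TechnologyRatings"]}
    ratings = {t["Rating"] for t in document["TechnologyRatings"]}
    return any(tech_id in ids and rating in ratings for tech_id, rating in pairs)


print(matches_same_element(doc, wanted))   # True
print(matches_same_element(doc, crossed))  # False
print(matches_split_checks(doc, crossed))  # True (a false positive)
```

The last line is the case to watch for: separate lookups report a match even though C# is rated 1, not 2.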

Here's the query:
SELECT VALUE root FROM root JOIN (SELECT VALUE EXISTS(SELECT VALUE tRatings FROM root JOIN tRatings IN root["TechnologyRatings"]
WHERE ((tRatings["Technology"]["id"] = "5686189b-ccfc-41c6-bcdb-b56f80130b45") OR (tRatings["Technology"]["id"] = "d76d59a7-c9a3-404d-91dd-cf2596ee7501")))) AS found WHERE found
Note that you should include the partition key in the query to avoid the extra latency and cost of a cross-partition query.
If the partition key were the 'id' field, the query would look like this:
SELECT VALUE root FROM root JOIN (SELECT VALUE EXISTS(SELECT VALUE tRatings FROM root JOIN tRatings IN root["TechnologyRatings"]
WHERE ((tRatings["Technology"]["id"] = "5686189b-ccfc-41c6-bcdb-b56f80130b45") OR (tRatings["Technology"]["id"] = "d76d59a7-c9a3-404d-91dd-cf2596ee7501")))) AS found
WHERE ((root["id"] = "5686189b-ccfc-41c6-bcdb-b56f80130b45") AND found)
The query with the partition key has the following stats: (screenshot of the query stats not included)

Related

Cosmos Db (need any sort of iteration mechanism)

I want to check whether all the objects in array A of my document have the same value, for example:
{
"id": "1234-wrew-1234314",
"_ts": 1672840679,
"A": [
{
"Id": "123",
"values": 167273168512
},
{
"Id": "1234",
"values": 1672731685
},
{
"Id": "123456",
"values": 1673461685
}
]
}
Given this document, I want to check whether all the values entries are the same. Is there any way to do this?
What I have already tried:
select EXISTS(
SELECT VALUE n
FROM n IN c.A
WHERE c.A[0].values= c.A[1].values) as a
from c
where c.id ="1234-wrew-1234314"
It works fine if there are only 2 records in A, but I want a generic solution that handles any number of records.
I also tried ARRAY_CONTAINS, but it did not work.
Thanks in advance.
Does this do what you need?
SELECT d.MaxValue = d.MinValue ? 'All the same' : 'Not the same'
FROM
(
SELECT MAX(a.values) AS MaxValue ,
MIN(a.values) AS MinValue
FROM c
JOIN a IN c.A
WHERE c.id = "1234-wrew-1234314"
) d
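
The idea behind the MAX/MIN comparison is that a set of numbers is all equal exactly when its maximum equals its minimum. A minimal Python sketch of the same check on the sample document follows (in memory, not the Cosmos DB SDK; the function name is hypothetical, and treating an empty array as "all the same" is an assumption of this sketch):

```python
doc = {
    "id": "1234-wrew-1234314",
    "A": [
        {"Id": "123", "values": 167273168512},
        {"Id": "1234", "values": 1672731685},
        {"Id": "123456", "values": 1673461685},
    ],
}


def all_values_equal(document):
    """Return True when every entry in A has the same 'values' number."""
    values = [item["values"] for item in document["A"]]
    if not values:
        # Assumption: an empty array counts as "all the same".
        return True
    # All entries are equal exactly when the largest equals the smallest.
    return max(values) == min(values)


print("All the same" if all_values_equal(doc) else "Not the same")  # Not the same
```

This works for any number of entries, which is what the indexed comparison of A[0] and A[1] could not do.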

select non unique record based on a unique combination of 2 fields cosmos db

I have many records in Cosmos DB container with the following structure (sample):
{
"id": "aaaa",
"itemCode": "1234",
"itemDesc": "TEST",
"otherfileds": ""
}
{
"id": "bbbb",
"itemCode": "1234",
"itemDesc": "TEST2",
"otherfileds": ""
}
{
"id": "cccc",
"itemCode": "5678",
"itemDesc": "HELLO",
"otherfileds": ""
}
{
"id": "dddd",
"itemCode": "5678",
"itemDesc": "HELLO",
"otherfileds": ""
}
{
"id": "eeee",
"itemCode": "9012",
"itemDesc": "WORLD",
"otherfileds": ""
}
{
"id": "ffff",
"itemCode": "9012",
"itemDesc": "WORLD",
"otherfileds": ""
}
Now I want to select the records whose item code has more than one distinct item description. Based on the sample records above, I would like to return item code 1234, since its records have different item descriptions.
{
"id": "aaaa",
"itemCode": "1234",
"itemDesc": "TEST",
"otherfileds": ""
}
{
"id": "bbbb",
"itemCode": "1234",
"itemDesc": "TEST2",
"otherfileds": ""
}
I tried the query below, but realised that it only returns duplicate entries that share both the item code and the description.
select count(1) from (select distinct value d.itemCode FROM (SELECT
c.itemCode, c.itemDesc, COUNT(1) as dupcount
FROM
c where c.itemCode<>null
GROUP BY
c.itemCode, c.itemDesc) d where d.dupcount>1 )
But I need to find records where the same item code has different item descriptions (the query above returns only item code/description pairs that occur more than once, i.e. item codes 9012 and 5678).
EDIT
I think I managed to form a query that filters these results using 2 subqueries (though I think it could be improved).
select e.itemCode from (select d.itemCode, count(1) as dupcount FROM
(SELECT
c.itemCode, c.itemDesc
FROM
c where c.itemCode<>null
GROUP BY
c.itemCode, c.itemDesc) d group by d.itemCode )e where e.dupcount>1
I think I managed to form the query to filter these results by 2 sub-queries (I think this could be improved though).
select distinct e.itemCode from (select d.itemCode, count(1) as dupcount FROM
(SELECT
c.itemCode, c.itemDesc
FROM
c where c.itemCode<>null
GROUP BY
c.itemCode, c.itemDesc) d group by d.itemCode )e where e.dupcount>1
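
The two-level grouping can be checked locally against the sample records. The following is a minimal Python sketch (in memory, not the Cosmos DB SDK; the function name is hypothetical) that collects the distinct descriptions per item code and keeps the codes that have more than one:

```python
from collections import defaultdict

records = [
    {"id": "aaaa", "itemCode": "1234", "itemDesc": "TEST"},
    {"id": "bbbb", "itemCode": "1234", "itemDesc": "TEST2"},
    {"id": "cccc", "itemCode": "5678", "itemDesc": "HELLO"},
    {"id": "dddd", "itemCode": "5678", "itemDesc": "HELLO"},
    {"id": "eeee", "itemCode": "9012", "itemDesc": "WORLD"},
    {"id": "ffff", "itemCode": "9012", "itemDesc": "WORLD"},
]


def codes_with_conflicting_descriptions(rows):
    """Return item codes that appear with more than one distinct description."""
    descriptions = defaultdict(set)
    for row in rows:
        # Skip rows without an item code, like the itemCode<>null filter.
        if row.get("itemCode") is not None:
            # Inner GROUP BY itemCode, itemDesc: a set keeps each pair once.
            descriptions[row["itemCode"]].add(row["itemDesc"])
    # Outer GROUP BY itemCode with dupcount > 1.
    return sorted(code for code, descs in descriptions.items() if len(descs) > 1)


print(codes_with_conflicting_descriptions(records))  # ['1234']
```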

Cosmos DB Query Array value using SQL

I have several JSON documents with the structure below in my Cosmos DB.
[
{
"USA": {
"Applicable": "Yes",
"Location": {
"City": [
"San Jose",
"San Diego"
]
}
}
}]
I want to query all the documents whose City array contains the value "San Diego".
I've tried the SQL queries below:
SELECT DISTINCT *
FROM c["USA"]["Location"]
WHERE ["City"] IN ('San Diego')
SELECT DISTINCT *
FROM c["USA"]["Location"]
WHERE ["City"] = 'San Diego'
SELECT c
FROM c JOIN d IN c["USA"]["Location"]
WHERE d["City"] = 'San Diego'
Each of them returns no results (0 - 0).
You need to query the whole document and filter on whether its USA.Location.City array contains the value. For example:
SELECT *
FROM c
WHERE ARRAY_CONTAINS (c.USA.Location.City, "San Jose")
This will give you what you're trying to achieve.
Note: You have a slight anti-pattern in your schema: using "USA" as the key means you can't easily query across all the locations. You should replace it with something like:
{
"Country": "USA",
"CountryMetadata": {
"Applicable": "Yes",
"Location": {
"City": [
"San Jose",
"San Diego"
]
}
}
}
This lets you query all the different countries. And the query above would then need only a slight change:
SELECT *
FROM c
WHERE c.Country = "USA"
AND ARRAY_CONTAINS (c.CountryMetadata.Location.City, "San Jose")
Note that the query now works for any country, and you can pass in country value as a parameter (vs needing to hardcode the country name into the query because it's an actual key name).
TL;DR: don't put values in your keys.
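
As a rough illustration of why the restructured schema helps, here is a minimal Python sketch (in memory, not the Cosmos DB SDK). The function name and the second document are hypothetical and are not taken from the question. Because the country is now a value rather than a key, the same filter works for any country passed in as a parameter:

```python
documents = [
    {
        "Country": "USA",
        "CountryMetadata": {
            "Applicable": "Yes",
            "Location": {"City": ["San Jose", "San Diego"]},
        },
    },
    # Hypothetical second document, added only to show the filter generalises.
    {
        "Country": "Canada",
        "CountryMetadata": {
            "Applicable": "Yes",
            "Location": {"City": ["Toronto"]},
        },
    },
]


def find_by_country_and_city(docs, country, city):
    """Return documents for the given country whose City array contains the city."""
    return [
        d for d in docs
        if d["Country"] == country
        and city in d["CountryMetadata"]["Location"]["City"]
    ]


print(len(find_by_country_and_city(documents, "USA", "San Diego")))  # 1
print(len(find_by_country_and_city(documents, "Canada", "Toronto")))  # 1
print(len(find_by_country_and_city(documents, "Canada", "San Diego")))  # 0
```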

Cosmos DB high RU

I have simple JSON data with an array. When I query it with ARRAY_CONTAINS or a JOIN, the Request Unit (RU) charge is too high.
select value count(0)
From c
where ARRAY_CONTAINS(c.signerDetails, {'displayStatus':0}, true)
and c.orgId = "e27dd002-bad3-4444-aa2b-855e5d4e79c8"
The ARRAY_CONTAINS query above costs 1882 RU for a count of 40518.
select count(0)
From c WHERE c.orgId = "e27dd002-bad3-4444-aa2b-855e5d4e79c8"
and EXISTS(SELECT VALUE t FROM t IN c.signerDetails WHERE t.displayStatus = 0)
The EXISTS query above costs 1960 RU for a count of 40518.
{
"id": "1b675bb2-f783-48a0-93bb-967a990b5204f901795b-d18d-4c2d-a68c-61adf7e8e8da",
"orgId": "e27dd002-bad3-4444-aa2b-855e5d4e79c8", //Partition key
"userDetails": [
{
"userName": "",
"userEmail": "",
"displayStatus": 4
},
{
"userName": "",
"userEmail": "",
"displayStatus": 1
}
],
"status": "Completed",
"_ts": 1619535494
}
Please share your thoughts on reducing the RU charge. Do I need to change the JSON data format, or should I consider another database for this kind of operation?
PS: My real queries on the array are more complex; I have removed that from the sample above for brevity.

How can I get just one row where some column values are the same in Knex.js

I tried to get distinct values from my table
let records = db
.select("*")
.from("user_technical_skill")
.distinct('technical_skill_id')
In the user_technical_skill table I have, for example:
[{
"id": "84ed9c04-b1d3-4e69-b2d2-c569ad94545f",
"user_id": "5dfbf2cc-38f9-4388-a077-11480e62d893",
"technical_skill_id": "111",
"created_at": "2021-04-11T15:31:39.552Z",
"updated_at": "2021-04-11T15:31:39.552Z"
},
{
"id": "4b0fcdd6-cbab-4fdf-ada6-0154c7956630",
"user_id": "a3b91e2a-5d7e-4528-b496-3a0807299db7",
"technical_skill_id": "111",
"created_at": "2021-04-11T15:48:49.145Z",
"updated_at": "2021-04-11T15:48:49.145Z"
}]
That is two rows where technical_skill_id is 111,
but it does not work: I still get both rows back from my query.
How can I fix this?
If you need all columns you can use group by instead of distinct().
let records = db
.select("*")
.from("user_technical_skill")
.groupBy('technical_skill_id')
Otherwise, if you need just technical_skill_id in the result, you can use
let records = db.from("user_technical_skill").distinct('technical_skill_id')
You don't need a select("*") here, since distinct() acts as an alternative to select().
Ref:
http://knexjs.org/#Builder-distinct
http://knexjs.org/#Builder-groupBy
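
To make the difference between the two results concrete, here is a minimal in-memory Python sketch (not Knex; the helper names are hypothetical). It contrasts the distinct list of technical_skill_id values with keeping one full row per value. Keeping the first row seen for each value is an arbitrary choice made for this sketch.

```python
rows = [
    {
        "id": "84ed9c04-b1d3-4e69-b2d2-c569ad94545f",
        "user_id": "5dfbf2cc-38f9-4388-a077-11480e62d893",
        "technical_skill_id": "111",
    },
    {
        "id": "4b0fcdd6-cbab-4fdf-ada6-0154c7956630",
        "user_id": "a3b91e2a-5d7e-4528-b496-3a0807299db7",
        "technical_skill_id": "111",
    },
]


def distinct_skill_ids(table):
    """Distinct values of a single column, like distinct('technical_skill_id')."""
    return sorted({row["technical_skill_id"] for row in table})


def one_row_per_skill(table):
    """One full row per technical_skill_id, keeping the first row seen."""
    first_seen = {}
    for row in table:
        # setdefault stores the row only if this skill id is new.
        first_seen.setdefault(row["technical_skill_id"], row)
    return list(first_seen.values())


print(distinct_skill_ids(rows))      # ['111']
print(len(one_row_per_skill(rows)))  # 1
```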

Resources