Azure Document Query Sub Dictionaries

I have stored the following JSON document in the Azure Document DB:
{
  "JobId": "04e63d1d-2af1-42af-a349-810f55817602",
  "JobType": 3,
  "Properties": [
    {
      "Key": "Value1",
      "Value": "testing1"
    },
    {
      "Key": "Value",
      "Value": "testing2"
    }
  ]
}
When I try to query the document back, I can easily run the following:
Select f.id,f.Properties, C.Key from f Join C IN f.Properties where C.Key = 'Value1'
However when i try to query:
Select f.id,f.Properties, C.Key from f Join C IN f.Properties where C.Value = 'testing1'
I get an error saying that the query cannot be computed. I assume this is because 'VALUE' is a reserved keyword in the query language.
I cannot rely on a specific order in the Properties array, because different subclasses can add different properties in different orders as they need them.
Does anybody have a suggestion for how I can still run this query?

To escape keywords in DocumentDB, you can use the [] syntax. For example, the above query would be:
Select f.id,f.Properties, C.Key from f Join C IN f.Properties where C["Value"] = 'testing1'

Related

CosmosDB - list in aggregate query response

I have the following document structure:
{
"id": "1",
"aId": "2",
"bId": "3",
....
},
{ "id":"2",
"aId": "2",
"bId": "4"
}
How do I return, for a given aId, a JSON object that contains the list of all bIds sharing that aId and, as an additional field, the count of those bIds? For the example above and the condition WHERE aId = "2", the response would be:
{
"aId": "2",
"bIds" : ["4","3"],
"bIds count" : 2
}
Assume I only pass one aId as a parameter.
I tried something like:
select
(select 'something') as aId,
(select distinct value c.bId from c where c.aId='something') as bIds
from TableName c
But for the love of me, I can't figure out how to get that list, its count, and the hardcoded aId in a single JSON response (a single row).
For example this query:
select
(select distinct value 'someId') as aId,
(select distinct value c.bId) as bIds
from c where c.aId='someId'
will return
{ { 'aId': 'someId', 'bIds':'2'},{'aId':'someId','bIds':'4'}}
while what I actually want is
{ {'aId':'someId', 'bIds':['2','4']}}
Here is the query that is closest to what I want:
select
c.aId as aId,
count(c2) as bIdCount,
array(select distinct value c2.bId from c2)
from c join (select c.bId from c) as c2
where c.aId = 'SOME_ID'
The only problem is the line with array(...): it makes the query fail. If I delete that line, the query works (it correctly returns the id and the count in one row). But I also need to select the contents of this list, and I am lost as to why it is not working; the example is copied almost verbatim from "How to perform array projection Cosmos Db":
https://azurelessons.com/array-in-cosmos-db/#How_to_perform_array_projection_Azure_Cosmos_DB
Here is how you'd return an array of bId:
SELECT distinct value c.bId
FROM c
where c.aId = "2"
This yields:
[
"3",
"4"
]
Removing the value keyword:
SELECT distinct c.bId
FROM c
where c.aId = "2"
yields:
[
{ "bId" : "3" },
{ "bId" : "4" }
]
From either of these, you can count the number of array elements returned. If your payload must include count and aId, you'll need to add those to your JSON output.
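The last step, counting the elements and attaching aId on the client, could look like the minimal Python sketch below. It assumes the result of the DISTINCT VALUE query has already been fetched into a plain list; the function name build_summary and the output field names bIds and bIdCount are illustrative choices, not part of any SDK.

```python
from typing import Any, Dict, List


def build_summary(a_id: str, b_ids: List[str]) -> Dict[str, Any]:
    """Wrap the bId list returned by the query into a single response object."""
    # De-duplicate defensively while keeping first-seen order.
    unique_b_ids = list(dict.fromkeys(b_ids))
    return {"aId": a_id, "bIds": unique_b_ids, "bIdCount": len(unique_b_ids)}


# Stand-in for the result of: SELECT DISTINCT VALUE c.bId FROM c WHERE c.aId = "2"
query_result = ["3", "4"]
summary = build_summary("2", query_result)
print(summary)
```

Since only one aId is passed in as a parameter, the caller already knows it and can add it to the payload without asking the database to echo it back.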

Cosmos db null value

I have the two kinds of records shown below in my table staudentdetail in Cosmos DB. In the example below, previousSchooldetail is a nullable field: it may or may not be present for a student.
Sample records:
{
"empid": "1234",
"empname": "ram",
"schoolname": "high school ,bankur",
"class": "10",
"previousSchooldetail": {
"prevSchoolName": "1763440",
"YearLeft": "2001"
} --(Nullable)
}
{
"empid": "12345",
"empname": "shyam",
"schoolname": "high school",
"class": "10"
}
I am trying to access the above records from Azure Databricks using PySpark or Scala code. But when we build the DataFrame by reading from Cosmos DB, it does not bring previousSchooldetail into the DataFrame. When we change the query to filter on the empid for which previousSchooldetail exists, the field does show up in the DataFrame.
Case 1:-
val Query = "SELECT * FROM c "
Result when the query is run directly:
empid
empname
schoolname
class
Case2:-
val Query = "SELECT * FROM c where c.empid=1234"
Result when the query is run with a WHERE clause:
empid
empname
schoolname
class
previousSchooldetail
prevSchoolName
YearLeft
Could you please tell me why I am not able to get previousSchooldetail in case 1, and how I should proceed?
As @Jayendran mentioned in the comments, the first query will give you previousSchooldetail wherever it is available. Otherwise, the column would not be present.
You can have this column present for all the scenarios by using the IS_DEFINED function. Try tweaking your query as below:
SELECT c.empid,
c.empname,
IS_DEFINED(c.previousSchooldetail) ? c.previousSchooldetail : null
as previousSchooldetail,
c.schoolname,
c.class
FROM c
If you are looking to get the result as a flat structure, it can be tricky and would need two separate queries, such as:
Query 1
SELECT c.empid,
c.empname,
c.schoolname,
c.class,
p.prevSchoolName,
p.YearLeft
FROM c JOIN c.previousSchooldetail p
Query 2
SELECT c.empid,
c.empname,
c.schoolname,
c.class,
null as prevSchoolName,
null as YearLeft
FROM c
WHERE not IS_DEFINED (c.previousSchooldetail) or
c.previousSchooldetail = null
Unfortunately, Cosmos DB does not support LEFT JOIN or UNION. Hence, I'm not sure if you can achieve this in a single query.
Alternatively, you can create a stored procedure to return the desired result.
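Because there is no UNION, another option is to combine the two result sets on the client. The following is a minimal Python sketch, assuming both queries have already been run and their rows loaded as lists of dictionaries; the names union_rows, query1_rows and query2_rows are illustrative, and the sample rows are taken from the example documents in the question.

```python
from typing import Any, Dict, List

Row = Dict[str, Any]

COLUMNS = ["empid", "empname", "schoolname", "class", "prevSchoolName", "YearLeft"]


def union_rows(with_prev: List[Row], without_prev: List[Row]) -> List[Row]:
    """Concatenate both result sets, giving every row the same columns."""
    combined = []
    for row in with_prev + without_prev:
        # Missing columns become None so the rows form a uniform table.
        combined.append({col: row.get(col) for col in COLUMNS})
    return combined


# Stand-ins for the results of Query 1 and Query 2.
query1_rows = [{"empid": "1234", "empname": "ram", "schoolname": "high school ,bankur",
                "class": "10", "prevSchoolName": "1763440", "YearLeft": "2001"}]
query2_rows = [{"empid": "12345", "empname": "shyam", "schoolname": "high school",
                "class": "10"}]

rows = union_rows(query1_rows, query2_rows)
for r in rows:
    print(r)
```

Filling absent columns with None on the client means the flat result has a fixed schema even if Query 2 omits the two null projections.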

Cosmos DB JSON array that matches all the words

I need a query that returns the documents matching every word in a list. For example, if I use
select c from c join (SELECT distinct VALUE c.id FROM c JOIN word IN c.words WHERE word in('word1','word2') and tag in('motorcycle')) ORDER BY c._ts desc
it returns both documents. I want to retrieve only the first one, because it matches both words and not just one.
Document 1
"c": {
  "id": "d0f1723c-0a55-454a-9cf8-3884f2d8d61a",
  "words": [
    "word1",
    "word2",
    "word3"
  ]}
Document 2
"c": {
  "id": "d0f1723c-0a55-454a-9cf8-3884f2d8d61a",
  "words": [
    "word1",
    "word4",
    "word5"
  ]}
You should be able to cover this with two ARRAY_CONTAINS expressions in your WHERE clause (and no need for a JOIN):
SELECT c.id FROM c
WHERE ARRAY_CONTAINS(c.words, 'word1')
AND ARRAY_CONTAINS(c.words, 'word2')
This should return the id of your first document.
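If the list of words varies, the same WHERE clause can be generated rather than written by hand. Below is a minimal Python sketch that builds one ARRAY_CONTAINS condition per word. The helper name build_all_words_query and the @w0, @w1 parameter names are hypothetical, and the name/value parameter list is an assumption about how your client library accepts query parameters, so adapt it as needed.

```python
from typing import Any, Dict, List, Tuple


def build_all_words_query(words: List[str]) -> Tuple[str, List[Dict[str, Any]]]:
    """Build a query matching documents whose words array contains every word."""
    if not words:
        raise ValueError("at least one word is required")
    conditions = []
    parameters = []
    for i, word in enumerate(words):
        name = f"@w{i}"
        # One ARRAY_CONTAINS per word; AND-ing them requires all words to match.
        conditions.append(f"ARRAY_CONTAINS(c.words, {name})")
        parameters.append({"name": name, "value": word})
    query = "SELECT c.id FROM c WHERE " + " AND ".join(conditions)
    return query, parameters


query, parameters = build_all_words_query(["word1", "word2"])
print(query)
print(parameters)
```

Passing the words as parameters instead of concatenating them into the query text avoids quoting problems when a word contains an apostrophe.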

Azure Stream Analytics query language get value by key from array of key value pairs

I am trying to extract a specific value from an array property in the Stream Analytics query language.
My data looks as follows:
"context": {
"custom": {
"dimensions": [{
"MacAddress": "ma"
},
{
"IpAddress": "ipaddr"
}]
}
}
I am trying to obtain a result that has "MacAddress", "IpAddress" as column titles and "ma", "ipaddr" as rows.
I am currently achieving this with this query:
SELECT
GetRecordPropertyValue(GetArrayElement(MySource.context.custom.dimensions, 0), 'MacAddress') AS MacAddress,
GetRecordPropertyValue(GetArrayElement(MySource.context.custom.dimensions, 1), 'IpAddress') AS IpAddress
FROM
[ffapi-track-events] as MySource
I am trying to use CROSS APPLY but so far no luck. Below the CROSS APPLY query:
SELECT
flat.ArrayValue.MacAddress as MacAddress,
flat.ArrayValue.IpAddress as IpAddress
FROM
[ffapi-track-events] as MySource
CROSS APPLY GetArrayElements(MySource.context.custom.dimensions) as flat
This one produces two rows instead of one:
MacAddress, IpAddress
ma ,
, ipaddr
so I'm missing precisely the flattening when writing it like that.
I would like to avoid hardcoding the index 0, as it's not guaranteed that "MacAddress" won't switch places with "IpAddress". So I need something like FindElementInArray by condition, or some means to join with the dimensions array.
Is there such a thing?
Thank you.

DocumentDb "where" clause with mathematical expression

I would like to understand how to write query WHERE clauses in DocumentDB that contain a mathematical comparison.
For example, I used this demonstrator to understand how to make a "greater than" comparison: the expression AND food.version > 0 seems to work very well.
Below is what I tried in the portal.azure.com DocumentDB query explorer, along with the results. I don't understand why I get an error in some cases (QUERY3), nor (optionally) how to get the error details on portal.azure.com.
Tested:
>>> QUERY1 >>
SELECT d.id,
d.name,
d.lastUpdateTime
FROM d
>>> RESULT1 >>
[
{
"id": "558d6007b909e8dfb2286e7b",
"name": "cSimpleSIMS_ici",
"lastUpdateTime": 1435589982672
},
{
"id": "558d6009b909e8df18296e7b",
"name": "didier",
"lastUpdateTime": 1435330811285
},
{
"id": "558d600ab909e8df28296e7b",
"name": "cDoubleSIMD_ici",
"lastUpdateTime": 1435331176750
},
{
"id": "558d600bb909e8df55296e7b",
"name": "george",
"lastUpdateTime": 1435330813519
}
(...)
]
>>> QUERY2 >>
SELECT d.id,
d.name,
d.lastUpdateTime
FROM d
WHERE (d.name='george')
>>> RESULT2 >>
[
{
"id": "558d600bb909e8df55296e7b",
"name": "george",
"lastUpdateTime": 1435330813519
}
]
>>> QUERY3 >>
SELECT d.id,
d.name,
d.lastUpdateTime
FROM d
WHERE (d.lastUpdateTime > 14)
>>> RESULT3 IN ERROR!
>>> QUERY4 >>
SELECT d.id,
d.name,
d.lastUpdateTime
FROM d
WHERE (d.name='george' AND d.lastUpdateTime > 14)
>>> RESULT4 >>
[
{
"id": "558d600bb909e8df55296e7b",
"name": "george",
"lastUpdateTime": 1435330813519
}
]
>>> QUERY5 >>
SELECT d.id,
d.name,
d.lastUpdateTime
FROM d
WHERE (d.name='george' AND d.lastUpdateTime > 1435330813519)
>>> RESULT5 >>
[]
Here's the gist...
Today, all JSON properties in DocumentDB get automatically indexed by a Hash index; which means queries with equality operators (e.g. WHERE d.name= "george") are extremely fast.
On the other hand, range queries (e.g. WHERE d.lastUpdateTime > 14) require a range index to operate efficiently. Without a range index, the range query will require a scan across all documents (which we allow if the header, x-ms-documentdb-query-enable-scan, is passed in by the request).
The queries you issued that had both an equality and a range filter (e.g. WHERE d.name='george' AND d.lastUpdateTime > 14) succeeded, because the equality filter greatly narrowed down the set of documents to scan through.
TL;DR: There are two things you can do here to get rid of the error:
1. Create a custom index policy to add a range index for numeric types. The documentation for indexing policies can be found here.
2. Issue your query programmatically (not through the Azure Portal) to set the x-ms-documentdb-query-enable-scan header to allow scans on range queries.
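For the first option, a custom indexing policy with a range index on numbers might look roughly like the Python sketch below, which only builds the policy document and prints it as JSON. The property names (indexingMode, includedPaths, kind, dataType, precision) follow the indexing policy JSON shape as I understand it from later DocumentDB documentation; treat them as an assumption and verify them against the documentation for the API version you are using.

```python
import json

# Hypothetical policy: Range index on numbers and Hash index on strings,
# applied to every path. Field names are an assumption; verify before use.
indexing_policy = {
    "indexingMode": "consistent",
    "automatic": True,
    "includedPaths": [
        {
            "path": "/*",
            "indexes": [
                {"kind": "Range", "dataType": "Number", "precision": -1},
                {"kind": "Hash", "dataType": "String", "precision": 3},
            ],
        }
    ],
    "excludedPaths": [],
}

policy_json = json.dumps(indexing_policy, indent=2)
print(policy_json)
```

With a range index on numeric values, a filter such as d.lastUpdateTime > 14 should no longer need the scan header.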
P.S. I will push to improve the Azure Portal for you.
Now... there appear to be a few issues in the Azure Portal - which I will push to get fixed for you.
Bug: Exception message is truncated
Looks like the meaningful part of the exception message gets truncated out when using the Azure Portal - which is no bueno. What SHOULD have been displayed is:
Microsoft.Azure.Documents.DocumentClientException: Message: {"Errors":["An invalid query has been specified with filters against path(s) that are not range-indexed. Consider adding allow scan header in the request."]}
Missing Feature: Enabling scans in query explorer
The ability to set the x-ms-documentdb-query-enable-scan header is currently not exposed in the Azure Portal's query explorer. We will add a checkbox or something for this.
To add to aliuy's answer, we're working on a change that will improve the developer experience here - Default indexing policy for numbers will be changed from Hash to Range index, so you do not need the header or override indexing policy in order to perform range queries.

Resources