I would like to understand how to write WHERE clauses in DocumentDB queries that contain mathematical comparison operators.
For example, I used this demonstrator to understand how to make a "greater than" comparison: the expression AND food.version > 0 seems to work very well.
Below is what I tried in the portal.azure.com DocumentDB query explorer, along with the results. I don't understand why I get an error in some cases (QUERY3), and, as a secondary question, how to get error details on portal.azure.com.
Tested:
>>> QUERY1 >>
SELECT d.id,
d.name,
d.lastUpdateTime
FROM d
>>> RESULT1 >>
[
{
"id": "558d6007b909e8dfb2286e7b",
"name": "cSimpleSIMS_ici",
"lastUpdateTime": 1435589982672
},
{
"id": "558d6009b909e8df18296e7b",
"name": "didier",
"lastUpdateTime": 1435330811285
},
{
"id": "558d600ab909e8df28296e7b",
"name": "cDoubleSIMD_ici",
"lastUpdateTime": 1435331176750
},
{
"id": "558d600bb909e8df55296e7b",
"name": "george",
"lastUpdateTime": 1435330813519
}
(...)
]
>>> QUERY2 >>
SELECT d.id,
d.name,
d.lastUpdateTime
FROM d
WHERE (d.name='george')
>>> RESULT2 >>
[
{
"id": "558d600bb909e8df55296e7b",
"name": "george",
"lastUpdateTime": 1435330813519
}
]
>>> QUERY3 >>
SELECT d.id,
d.name,
d.lastUpdateTime
FROM d
WHERE (d.lastUpdateTime > 14)
>>> RESULT3 IN ERROR!
>>> QUERY4 >>
SELECT d.id,
d.name,
d.lastUpdateTime
FROM d
WHERE (d.name='george' AND d.lastUpdateTime > 14)
>>> RESULT4 >>
[
{
"id": "558d600bb909e8df55296e7b",
"name": "george",
"lastUpdateTime": 1435330813519
}
]
>>> QUERY5 >>
SELECT d.id,
d.name,
d.lastUpdateTime
FROM d
WHERE (d.name='george' AND d.lastUpdateTime > 1435330813519)
>>> RESULT5 >>
[]
Here's the gist...
Today, all JSON properties in DocumentDB get automatically indexed by a Hash index, which means queries with equality operators (e.g. WHERE d.name = "george") are extremely fast.
On the other hand, range queries (e.g. WHERE d.lastUpdateTime > 14) require a range index to operate efficiently. Without a range index, the range query will require a scan across all documents (which we allow if the header, x-ms-documentdb-query-enable-scan, is passed in by the request).
The queries you issued that had both an equality and a range filter (e.g. WHERE d.name='george' AND d.lastUpdateTime > 14) succeeded because the equality filter greatly narrowed down the set of documents to scan through.
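To make the difference concrete, here is a minimal Python sketch of the idea, not of DocumentDB's actual index implementation. A dict stands in for the Hash index and a sorted list with binary search stands in for a Range index; the function names are made up for illustration, and the documents are the ones from RESULT1.

```python
from bisect import bisect_right

# Sample documents copied from RESULT1 in the question.
docs = [
    {"id": "558d6007b909e8dfb2286e7b", "name": "cSimpleSIMS_ici", "lastUpdateTime": 1435589982672},
    {"id": "558d6009b909e8df18296e7b", "name": "didier", "lastUpdateTime": 1435330811285},
    {"id": "558d600ab909e8df28296e7b", "name": "cDoubleSIMD_ici", "lastUpdateTime": 1435331176750},
    {"id": "558d600bb909e8df55296e7b", "name": "george", "lastUpdateTime": 1435330813519},
]

# Stand-in for a Hash index: exact value -> matching documents.
# Good for equality, but it carries no ordering information.
hash_index = {}
for doc in docs:
    hash_index.setdefault(doc["name"], []).append(doc)

# Stand-in for a Range index: values kept sorted so a comparison
# becomes a binary search instead of a full scan.
range_index = sorted(docs, key=lambda d: d["lastUpdateTime"])
range_keys = [d["lastUpdateTime"] for d in range_index]


def where_name_equals(name):
    """Equality filter answered by a single hash lookup."""
    return hash_index.get(name, [])


def where_last_update_greater_than(threshold):
    """Range filter answered from the sorted index."""
    return range_index[bisect_right(range_keys, threshold):]


def scan_greater_than(candidates, threshold):
    """Fallback without a range index: test every candidate document."""
    return [d for d in candidates if d["lastUpdateTime"] > threshold]


# QUERY4: the equality filter narrows the candidates first, so the scan is tiny.
query4 = scan_greater_than(where_name_equals("george"), 14)
# QUERY5: george's own timestamp is not strictly greater than itself.
query5 = scan_greater_than(where_name_equals("george"), 1435330813519)
print([d["name"] for d in query4], query5)
```

QUERY3 corresponds to calling scan_greater_than on the whole collection, which is the case the service refuses unless scans are explicitly allowed.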
TL;DR: There are two things you can do here to get rid of the error:
Create a custom index policy to add a range index for numeric types. The documentation for indexing policies can be found here.
Issue your query programmatically (not through the Azure Portal) to set the x-ms-documentdb-query-enable-scan header to allow scans on range queries.
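As a rough sketch of both options in Python: the indexing policy below follows the same path/indexes shape used elsewhere on this page, and the request builder only assembles headers and a body. The x-ms-documentdb-query-enable-scan header is the one named above; the other header names, the "/*" path, the precision values, and the helper name build_query_request are assumptions for illustration, and authentication is left out entirely.

```python
import json

# Assumption: policy shape with a Range index on numbers for every path.
# Check the indexing policy documentation before using these values.
indexing_policy = {
    "indexingMode": "consistent",
    "automatic": True,
    "includedPaths": [
        {
            "path": "/*",
            "indexes": [
                {"kind": "Range", "dataType": "Number", "precision": -1},
                {"kind": "Hash", "dataType": "String", "precision": 3},
            ],
        }
    ],
}


def build_query_request(query, enable_scan=False):
    """Return (headers, body) for a query request; auth headers are omitted."""
    headers = {
        "Content-Type": "application/query+json",  # assumed content type
        "x-ms-documentdb-isquery": "True",          # assumed header
    }
    if enable_scan:
        # Header named in the answer: allows range filters without a range index.
        headers["x-ms-documentdb-query-enable-scan"] = "True"
    body = json.dumps({"query": query, "parameters": []})
    return headers, body


headers, body = build_query_request(
    "SELECT d.id, d.name, d.lastUpdateTime FROM d WHERE (d.lastUpdateTime > 14)",
    enable_scan=True,
)
print(json.dumps(indexing_policy, indent=2))
print(headers)
```

The first option changes how the collection is indexed once; the second has to be repeated on every request and still pays for the scan.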
P.S. I will push to improve the Azure Portal for you.
Now... there appear to be a few issues in the Azure Portal - which I will push to get fixed for you.
Bug: Exception message is truncated
Looks like the meaningful part of the exception message gets truncated out when using the Azure Portal - which is no bueno. What SHOULD have been displayed is:
Microsoft.Azure.Documents.DocumentClientException: Message: {"Errors":["An invalid query has been specified with filters against path(s) that are not range-indexed. Consider adding allow scan header in the request."]}
Missing Feature: Enabling scans in query explorer
The ability to set the x-ms-documentdb-query-enable-scan header is currently not exposed in the Azure Portal's query explorer. We will add a checkbox or something for this.
To add to aliuy's answer, we're working on a change that will improve the developer experience here: the default indexing policy for numbers will change from a Hash index to a Range index, so you will not need the header or a custom indexing policy in order to perform range queries.
Related
How can I retrieve objects which match order_id = 9234029m, given this document in CosmosDB:
{
"order": {
"order_id": "9234029m",
"order_name": "name"
}
}
I have tried to query in CosmosDB Data Explorer, but it's not possible to simply query the nested order_id object like this:
SELECT * FROM c WHERE c.order.order_id = "9234029m"
(Err: "Syntax error, incorrect syntax near 'order'")
This seems like it should be so simple, yet it's not! (In CosmosDB Data Explorer, all queries need to start with SELECT * FROM c, but REST SQL is an alternative as well.)
As you discovered, order is a reserved keyword, which was tripping up the query parsing. However, you can get past that, and still query your data, with slightly different syntax (bracket notation):
SELECT *
FROM c
WHERE c["order"].order_id = "9234029m"
This was due, apparently, to order being a reserved keyword in CosmosDB SQL, even if used as above.
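If the query text is assembled in code, one way to sidestep reserved words altogether is to quote every path segment with bracket notation. The helper below is a small Python sketch; bracket_path is a made-up name, and passing the value as an @order_id parameter is an assumption about how the query would be issued.

```python
import json


def bracket_path(alias, *segments):
    """Quote every segment with bracket notation, e.g. c["order"]["order_id"]."""
    # json.dumps adds the double quotes and escapes any special characters.
    return alias + "".join("[" + json.dumps(segment) + "]" for segment in segments)


path = bracket_path("c", "order", "order_id")
query = "SELECT * FROM c WHERE " + path + " = @order_id"
parameters = [{"name": "@order_id", "value": "9234029m"}]
print(query)
```

Quoting all segments is more verbose than c["order"].order_id, but it means the code never needs a list of reserved keywords.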
We are using the Cosmos DB C# SDK.
We tried both Microsoft.Azure.Cosmos 3.4.1 and Microsoft.Azure.DocumentDB.Core (2.9.1 and 2.4.2).
We are getting invalid results, and the main problem is the ResponseContinuation:
[{"token":null,"range":{"min":"05C1DFFFFFFFFC","max":"FF"}}]
This started showing up in one of our smaller services, which has only 14 documents.
In all queries we use the following headers:
"x-ms-documentdb-query-enablecrosspartition" = true
"x-ms-max-item-count" = 100
Query 1:
The query is SELECT * FROM c.
We get the following response:
- 7 items
- ResponseContinuation [{"token":null,"range":{"min":"05C1DFFFFFFFFC","max":"FF"}}]
Then we use the continuation token to get the other 7 items.
Query 2:
If we modify the query to SELECT * FROM c ORDER BY c.property ASC, the order gets messed up! (responses are simplified)
- we get the following result ["A", "B", "C", "D", "F"]
- and the second query ["C", "D", "G"]
Query 3:
If we want to find only one item with SELECT TOP 1 * FROM c WHERE c.name = #name, and the item is in the "second query result", we get:
- nothing, and a RequestContinuation of {"top":1,"sourceToken":"[{\"token\":null,\"range\":{\"min\":\"05C1DFFFFFFFFC\",\"max\":\"FF\"}}]"}
This is all really unexpected behaviour.
Why do ORDER BY and TOP even exist if we can't use them properly?
We can't afford to load all the data from Cosmos onto our server and then do the ordering there, especially for bigger containers.
Edit: github issue link: https://github.com/Azure/azure-cosmos-dotnet-v3/issues/1001
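One observation about the simplified Query 2 results: each page is sorted on its own, and only the combination is out of order. That is the pattern you would see if each continuation range were ordered independently, which is an assumption here and not something the thread confirms. Under that assumption, the Python sketch below shows that a merge of the two sorted pages restores the global order without re-sorting everything.

```python
import heapq

# Pages copied from the simplified Query 2 results in the question.
first_page = ["A", "B", "C", "D", "F"]
second_page = ["C", "D", "G"]

# Appending the pages keeps each page's order but not the global order.
concatenated = first_page + second_page

# Merging two already-sorted sequences yields one globally sorted sequence
# while only ever comparing the current head of each page.
merged = list(heapq.merge(first_page, second_page))

print(concatenated)
print(merged)
```

A merge like this only needs the next item from each source, so it does not require loading a whole container into memory first.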
This query costs 265 RUs:
SELECT top 1 * FROM c
WHERE c.CollectPackageId = 'd0613cbb-492b-4464-b66b-3634b5571826'
ORDER BY c.StartFetchDateTimeUtc DESC
StartFetchDateTimeUtc is a string property, serialized by using the Cosmos API
This query costs 5 RUs:
SELECT top 1 * FROM c
WHERE c.CollectPackageId = 'd0613cbb-492b-4464-b66b-3634b5571826'
ORDER BY c._ts DESC
_ts is a built in field, a Unix-based numeric timestamp.
Example result (only including this field and _ts):
"StartFetchDateTimeUtc": "2017-08-08T03:35:04.1654152Z",
"_ts": 1502163306
The index is in place and follows the suggestions and tutorials on how to configure a sortable string/timestamp. It looks like this:
{
"path": "/StartFetchDateTimeUtc/?",
"indexes": [
{
"kind": "Range",
"dataType": "String",
"precision": -1
}
]
}
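As a sanity check that both ORDER BY clauses ask for the same "newest first" ordering, here is a small Python sketch. It relies on the fact that fixed-width ISO 8601 UTC strings sort lexicographically in chronological order. Only the first timestamp comes from the example above; the others are made up, and the sketch says nothing about the RU charge itself.

```python
from datetime import datetime, timezone

# The first value is the _ts from the example result; the rest are illustrative.
unix_timestamps = [1502163306, 1435330813, 1502163305, 946684800]


def to_iso_utc(ts):
    """Format a Unix timestamp as a fixed-width ISO 8601 UTC string."""
    return datetime.fromtimestamp(ts, tz=timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


pairs = [(ts, to_iso_utc(ts)) for ts in unix_timestamps]

# Descending by number (like ORDER BY c._ts DESC) ...
by_number = [iso for ts, iso in sorted(pairs, key=lambda p: p[0], reverse=True)]
# ... and descending by string (like ORDER BY c.StartFetchDateTimeUtc DESC).
by_string = sorted((iso for _, iso in pairs), reverse=True)

print(by_number == by_string)
```

So the two queries should return the same document as long as both fields describe the same moment; the difference in cost has to come from how each ordering is served.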
According to this article, the following variables affect the RU charge: item size, item property count, data consistency, indexed properties, document indexing, query patterns, and script usage.
So it is very strange that ordering by a different property costs a different number of RUs.
I also created a test demo on my side (with your index and the same document properties). I inserted 1000 records into DocumentDB, and the two queries cost the same number of RUs. I suggest you create a new collection and test again.
The result is like this:
Order by StartFetchDateTimeUtc
Order by _ts
I am trying to extract a specific value from an array property in the Stream Analytics query language.
My data looks as follows:
"context": {
"custom": {
"dimensions": [{
"MacAddress": "ma"
},
{
"IpAddress": "ipaddr"
}]
}
}
I am trying to obtain a result that has "MacAddress", "IpAddress" as column titles and "ma", "ipaddr" as rows.
I am currently achieving this with this query:
SELECT
GetRecordPropertyValue(GetArrayElement(MySource.context.custom.dimensions, 0), 'MacAddress') AS MacAddress,
GetRecordPropertyValue(GetArrayElement(MySource.context.custom.dimensions, 1), 'IpAddress') AS IpAddress
FROM
[ffapi-track-events] as MySource
I am trying to use CROSS APPLY but so far no luck. Below the CROSS APPLY query:
SELECT
flat.ArrayValue.MacAddress as MacAddress,
flat.ArrayValue.IpAddress as IpAddress
FROM
[ffapi-track-events] as MySource
CROSS APPLY GetArrayElements(MySource.context.custom.dimensions) as flat
This one produces two rows instead of one:
MacAddress, IpAddress
ma ,
, ipaddr
so I'm missing precisely the flattening when writing it like that.
I would like to bypass hardcoding the index 0 as it's not guaranteed that MacAddress won't switch places with "IpAddress"... So I need something like FindElementInArray by condition, or some means to join with the dimensions array.
Is there such thing?
Thank you.
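To pin down the result being asked for, here is the desired flattening written as plain Python. This is only an illustration of the target output, not Stream Analytics syntax, and flatten_dimensions is a made-up name. The point is that the merge looks records up by key, so it does not depend on array positions.

```python
# Data copied from the question.
event = {
    "context": {
        "custom": {
            "dimensions": [
                {"MacAddress": "ma"},
                {"IpAddress": "ipaddr"},
            ]
        }
    }
}


def flatten_dimensions(dimensions):
    """Merge single-key records into one row, regardless of their order."""
    row = {}
    for record in dimensions:
        row.update(record)
    return row


dimensions = event["context"]["custom"]["dimensions"]
row = flatten_dimensions(dimensions)
# Reversing the array gives the same row, unlike indexing with 0 and 1.
swapped = flatten_dimensions(list(reversed(dimensions)))
print(row["MacAddress"], row["IpAddress"])
```

The CROSS APPLY query above corresponds to emitting one row per loop iteration; the single-row result needs the per-element rows to be combined per event afterwards.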
I have stored the following JSON document in the Azure Document DB:
"JobId": "04e63d1d-2af1-42af-a349-810f55817602",
"JobType": 3,
"
"Properties": [
{
"Key": "Value1",
"Value": "testing1"
},
{
"Key": "Value",
"Value": "testing2"
}
]
When I try to query the document back, I can easily run:
Select f.id,f.Properties, C.Key from f Join C IN f.Properties where C.Key = 'Value1'
However, when I try to query:
Select f.id,f.Properties, C.Key from f Join C IN f.Properties where C.Value = 'testing1'
I get an error that the query cannot be computed. I assume this is due to 'VALUE' being a reserved keyword in the query language.
I cannot rely on a specific order in the property array, because different subclasses can add different properties in different orders as they need them.
Does anybody have a suggestion for how I can still complete this query?
To escape keywords in DocumentDB, you can use the [] syntax. For example, the above query would be:
Select f.id,f.Properties, C.Key from f Join C IN f.Properties where C["Value"] = 'testing1'
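For readers unsure what the JOIN ... IN does here, this is a rough Python emulation of the escaped query over one document shaped like the one in the question. The "id" value is a made-up placeholder, and the function only mimics the semantics; it is not how DocumentDB evaluates the query.

```python
# Document modeled on the question; the "id" value is a made-up placeholder.
documents = [
    {
        "id": "doc-1",
        "JobId": "04e63d1d-2af1-42af-a349-810f55817602",
        "JobType": 3,
        "Properties": [
            {"Key": "Value1", "Value": "testing1"},
            {"Key": "Value", "Value": "testing2"},
        ],
    }
]


def join_properties_where_value(docs, wanted):
    """Emulate: SELECT f.id, f.Properties, C.Key FROM f JOIN C IN f.Properties
    WHERE C["Value"] = wanted."""
    rows = []
    for f in docs:
        # JOIN C IN f.Properties: one candidate row per array element.
        for c in f["Properties"]:
            if c["Value"] == wanted:
                rows.append({"id": f["id"], "Properties": f["Properties"], "Key": c["Key"]})
    return rows


rows = join_properties_where_value(documents, "testing1")
print([row["Key"] for row in rows])
```

Because the filter runs per array element, the position of the matching entry inside Properties does not matter.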