How to query CosmosDB for nested object value

How can I retrieve objects which match order_id = 9234029m, given this document in CosmosDB:
{
"order": {
"order_id": "9234029m",
"order_name": "name"
}
}
I have tried to query in CosmosDB Data Explorer, but it's not possible to simply query the nested order_id property like this:
SELECT * FROM c WHERE c.order.order_id = "9234029m"
(Err: "Syntax error, incorrect syntax near 'order'")
This seems like it should be so simple, yet it's not! (In CosmosDB Data Explorer, all queries need to start with SELECT * FROM c, but REST SQL is an alternative as well.)

As you discovered, order is a reserved keyword, which was tripping up the query parsing. However, you can get past that, and still query your data, with slightly different syntax (bracket notation):
SELECT *
FROM c
WHERE c["order"].order_id = "9234029m"

The error occurs because order is a reserved keyword in Cosmos DB SQL (it is part of ORDER BY), even when it is used as a property name as above.
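As a rough sketch, the bracket-notation query can also be written as a parameterized query spec so the order id is not concatenated into the SQL text. The helper name build_order_query is hypothetical, and the dict shape (a query string plus a list of name/value parameters) is assumed to match what your SDK accepts; nothing here contacts a database.

```python
def build_order_query(order_id: str) -> dict:
    """Build a parameterized query spec that filters on the nested order_id.

    `order` is a reserved keyword, so the path is written c["order"]
    instead of c.order.
    """
    return {
        "query": 'SELECT * FROM c WHERE c["order"].order_id = @order_id',
        "parameters": [{"name": "@order_id", "value": order_id}],
    }


spec = build_order_query("9234029m")
print(spec["query"])
```

The resulting spec would then be handed to whichever query call your client library provides.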

Related

Cosmos: DISTINCT results with JOIN and ORDER BY

I'm trying to write a query that uses a JOIN to perform a geo-spatial match against locations in an array. I got it working, but added DISTINCT in order to de-duplicate (Query A):
SELECT DISTINCT VALUE
u
FROM
u
JOIN loc IN u.locations
WHERE
ST_WITHIN(
{'type':'Point','coordinates':[loc.longitude,loc.latitude]},
{'type':'Polygon','coordinates':[[[-108,-43],[-108,-40],[-110,-40],[-110,-43],[-108,-43]]]})
However, I then found that combining DISTINCT with continuation tokens isn't supported unless you also add ORDER BY:
System.ArgumentException: Distict query requires a matching order by in order to return a continuation token. If you would like to serve this query through continuation tokens, then please rewrite the query in the form 'SELECT DISTINCT VALUE c.blah FROM c ORDER BY c.blah' and please make sure that there is a range index on 'c.blah'.
So I tried adding ORDER BY like this (Query B):
SELECT DISTINCT VALUE
u
FROM
u
JOIN loc IN u.locations
WHERE
ST_WITHIN(
{'type':'Point','coordinates':[loc.longitude,loc.latitude]},
{'type':'Polygon','coordinates':[[[-108,-43],[-108,-40],[-110,-40],[-110,-43],[-108,-43]]]})
ORDER BY
u.created
The problem is, the DISTINCT no longer appears to be taking effect because it returns, for example, the same record twice.
To reproduce this, create a single document with this data:
{
"id": "b6dd3e9b-e6c5-4e5a-a257-371e386f1c2e",
"locations": [
{
"latitude": -42,
"longitude": -109
},
{
"latitude": -42,
"longitude": -109
}
],
"created": "2019-03-06T03:43:52.328Z"
}
Then run Query A above. You will get a single result, despite the fact that both locations match the predicate. If you remove the DISTINCT, you'll get the same document twice.
Now run Query B and you'll see it returns the same document twice, despite the DISTINCT clause.
What am I doing wrong here?
I reproduced your issue. Based on my research, it appears to be a defect in Cosmos DB's handling of DISTINCT queries. Please refer to this feedback item: Provide support for DISTINCT.
This feature is broke in the data explorer. Because cosmos can only
return 100 results per page at a time, the distinct keyword will only
apply to a single page. So, if your result set contains more than 100
results, you may still get duplicates back - they will simply be on
separately paged result sets.
You could describe your own situation and vote up this feedback case.
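Until the service de-duplicates across pages, one possible client-side workaround (not part of the answer above) is to drop repeats while reading the pages. This sketch assumes each page is a list of dicts and that "id" uniquely identifies a document in the result set; if ids can repeat across partitions, key on the partition key plus id instead. The function name dedupe_by_id is hypothetical.

```python
from typing import Dict, Iterable, Iterator, List


def dedupe_by_id(pages: Iterable[List[Dict]]) -> Iterator[Dict]:
    """Yield each document once, keyed on its "id", across all result pages."""
    seen = set()
    for page in pages:
        for doc in page:
            doc_id = doc["id"]
            if doc_id not in seen:
                seen.add(doc_id)
                yield doc


# Two pages that both contain document "a", as in the duplicate scenario above.
pages = [[{"id": "a"}, {"id": "a"}], [{"id": "b"}, {"id": "a"}]]
unique_ids = [doc["id"] for doc in dedupe_by_id(pages)]
print(unique_ids)
```

The set of seen ids grows with the result size, so this only suits result sets that fit comfortably in memory.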

Azure CosmosDB: how to ORDER BY id?

Using a vanilla CosmosDB collection (all default), adding documents like this:
{
"id": "3",
"name": "Hannah"
}
I would like to retrieve records ordered by id, like this:
SELECT c.id FROM c
ORDER BY c.id
This gives me the error: Order-by item requires a range index to be defined on the corresponding index path.
I expect this is because /id is hash indexed and not range indexed. I've tried to change the Indexing Policy in various ways, but any change I make which would touch / or /id gets wiped when I save.
How can I retrieve documents ordered by ID?
The best way to do this is to store a duplicate property e.g. id2 that has the same value of id, and is indexed using a range index, then use that for sorting, i.e. query for SELECT * FROM c ORDER BY c.id2.
PS: The reason this is not supported is that id is part of a composite index (on partition key and row key; id is the row key part). The Cosmos DB team is working on a change that will allow sorting by id.
EDIT: new collections now support ORDER BY c.id as of 7/12/19
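For collections that predate that change, the id2 workaround amounts to copying the id into a second property before each write. A minimal sketch, assuming documents are plain dicts; the helper name with_sortable_id is hypothetical, and id2 is simply the property name suggested above.

```python
def with_sortable_id(doc: dict, copy_field: str = "id2") -> dict:
    """Return a copy of doc with the value of "id" duplicated into copy_field."""
    if "id" not in doc:
        raise ValueError('document has no "id" property')
    enriched = dict(doc)  # shallow copy so the caller's dict is left untouched
    enriched[copy_field] = doc["id"]
    return enriched


doc = with_sortable_id({"id": "3", "name": "Hannah"})
print(doc)
```

Documents written this way can then be sorted with ORDER BY c.id2, provided id2 is covered by a range index.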
I found this page, CosmosDB Indexing Policies, which has the following note that may be helpful:
Azure Cosmos DB returns an error when a query uses ORDER BY but
doesn't have a Range index against the queried path with the maximum
precision.
Some other information from elsewhere in the document:
Range supports efficient equality queries, range queries (using >, <,
>=, <=, !=), and ORDER BY queries. ORDER BY queries by default also
require maximum index precision (-1). The data type can be String or
Number.
Some guidance on the types of queries served by Range indexes:
Range: a Range index over /prop/? (or /) can be used to serve the following
queries efficiently:
SELECT * FROM collection c WHERE c.prop = "value"
SELECT * FROM collection c WHERE c.prop > 5
SELECT * FROM collection c ORDER BY c.prop
And a code example from the docs also:
var rangeDefault = new DocumentCollection { Id = "rangeCollection" };
// Override the default policy for strings to Range indexing and "max" (-1) precision
rangeDefault.IndexingPolicy = new IndexingPolicy(new RangeIndex(DataType.String) { Precision = -1 });
await client.CreateDocumentCollectionAsync(UriFactory.CreateDatabaseUri("db"), rangeDefault);
Hope this helps,
J

cosmos db sql query with non alphanumeric field name

My data structure in Cosmos DB is as follows:
{
"_id": {
"$oid": "554f7dc4e4b03c257a33f75c"
},
.................
}
and I need to sort the collection by the $oid field. How should I form my SQL query?
A normal query such as SELECT TOP 10 * FROM collection c ORDER BY c._id.fieldname does not work if fieldname starts with $, as $oid does.
I am using query explorer from azure portal.
To use a special character, like $, you need to use bracket notation:
SELECT c._id FROM c
order by c._id["$oid"]
You can do this with each property in the hierarchy, so the following also works:
SELECT c._id FROM c
order by c["_id"]["$oid"]
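When property names come from data rather than being typed by hand, the bracketed path can be generated. A minimal sketch: the helper name bracket_path is hypothetical, and it assumes that JSON-style double-quoted strings (as produced by json.dumps) are acceptable string literals inside the brackets, which is worth verifying for names containing quotes or non-ASCII characters.

```python
import json
from typing import Sequence


def bracket_path(alias: str, segments: Sequence[str]) -> str:
    """Render alias["seg1"]["seg2"]... with every segment in bracket notation."""
    if not segments:
        raise ValueError("at least one path segment is required")
    # json.dumps returns a double-quoted string and escapes embedded quotes.
    return alias + "".join(f"[{json.dumps(seg)}]" for seg in segments)


path = bracket_path("c", ["_id", "$oid"])
query = f"SELECT c._id FROM c ORDER BY {path}"
print(query)
```

Quoting every segment, rather than only the ones with special characters, also sidesteps reserved keywords used as property names.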

In Azure DocumentDb, how do I write an "IN" statement using LINQ to SQL

I want to retrieve about 50 - 100 documents by their ID from DocumentDb. I have the list of IDs in a List<string>. How do I use LINQ to SQL to retrieve those documents? I don't want to write the actual SQL query as a string, as in:
IQueryable<Family> results = client.CreateDocumentQuery<Family>(collectionUri, "SELECT * FROM family WHERE State IN ('TX', 'NY')", DefaultOptions);
I want to be able to use lambda expressions to create the query, because I don't want to hard-code the names of the fields as strings.
Since you do not want to build and pass a query string such as SELECT * FROM family WHERE State IN ('TX', 'NY'), you could try the following code, which calls Contains on the list inside a Where clause:
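// Sample values only; replace them with the document IDs you want to fetch.
List<string> ids = new List<string>() { "TX", "NY" };
IQueryable<Family> results = client.CreateDocumentQuery<Family>(collectionUri).Where(d => ids.Contains(d.id));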

SELECT with multiple values in DocumentDB

I have an Employees collection and I want to retrieve the full documents of 10 employees whose IDs I'd like to send to my SQL SELECT. How do I do that?
To further clarify, I have 10 EmployeeIds and I want to pull these employees' information from my Employees collection. I'd appreciate your help with this.
Update:
As of 5/6/2015, DocumentDB supports the IN keyword, which accepts up to 100 parameters.
Example:
SELECT *
FROM Employees
WHERE Employees.id IN (
"01236", "01237", "01263", "06152", "21224",
"21225", "21226", "21227", "21505", "22903",
"14003", "14004", "14005", "14006", "14007"
)
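A minimal sketch of generating such an IN clause from a list of ids, using one named parameter per value instead of inlining the literals. The helper name build_in_query and the constant MAX_IN_VALUES are hypothetical; the limit of 100 is the figure quoted above and should be checked against current service limits.

```python
from typing import Dict, Sequence

MAX_IN_VALUES = 100  # figure quoted in the answer above; may differ today


def build_in_query(ids: Sequence[str]) -> Dict:
    """Build a parameterized 'WHERE e.id IN (...)' query spec for the given ids."""
    if not ids:
        raise ValueError("ids must not be empty")
    if len(ids) > MAX_IN_VALUES:
        raise ValueError(f"at most {MAX_IN_VALUES} ids per query")
    names = [f"@id{i}" for i in range(len(ids))]
    return {
        "query": f"SELECT * FROM Employees e WHERE e.id IN ({', '.join(names)})",
        "parameters": [{"name": n, "value": v} for n, v in zip(names, ids)],
    }


spec = build_in_query(["01236", "01237", "01263"])
print(spec["query"])
```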
Original Answer:
Adding on to Ryan's answer... Here's an example:
Create the following UDF:
var containsUdf = {
id: "contains",
body: function(arr, obj) {
if (arr.indexOf(obj) > -1) {
return true;
}
return false;
}
};
Use your contains UDF in a SQL query (current versions of the service require the udf. prefix when calling a user-defined function):
SELECT * FROM Employees e WHERE udf.contains(["1","2","3","4","5"], e.id)
For documentation on creating UDFs, check out the DocumentDB SQL reference.
You can also vote for implementing the "IN" keyword for "WHERE" clauses at the DocumentDB Feedback Forums.
You could also achieve this by using OR. Below is a sample:
SELECT *
FROM Employees e
WHERE e.EmployeeId = 1 OR e.EmployeeId = 2 OR e.EmployeeId = 3
If you need more ORs than DocumentDB allows in one query, you have to break the work into multiple smaller queries, each covering a subset of the employeeId values. You can also issue those queries in parallel from the client and gather all the results.
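The split-and-gather approach can be sketched as follows. The names chunked and query_in_batches are hypothetical, and run_query stands in for whatever function actually sends one batch of ids to the database and returns the matching documents; here it is faked so the example runs on its own.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence


def chunked(values: Sequence, size: int) -> List[Sequence]:
    """Split values into consecutive slices of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    return [values[i:i + size] for i in range(0, len(values), size)]


def query_in_batches(
    ids: Sequence,
    run_query: Callable[[Sequence], List[dict]],
    batch_size: int = 10,
    max_workers: int = 4,
) -> List[dict]:
    """Run one query per batch of ids in parallel and concatenate the results."""
    batches = chunked(ids, batch_size)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map returns results in the same order as the input batches.
        pages = list(pool.map(run_query, batches))
    return [doc for page in pages for doc in page]


def fake_run_query(batch: Sequence) -> List[dict]:
    """Stand-in for a real database call: echoes one document per id."""
    return [{"EmployeeId": employee_id} for employee_id in batch]


results = query_in_batches(list(range(1, 8)), fake_run_query, batch_size=3)
print(len(results))
```

Threads fit here because each batch spends its time waiting on the network; the worker count is an arbitrary starting point and should be tuned against the provisioned throughput.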
The best way to do this today would be to create a Contains() UDF that takes the array of ids to search on, and to use that in the WHERE clause.
Does
Select * from Employees where EmployeeId in (1,3,5,6,...)
not work?
Thanks to ryancrawcour, we know it doesn't.
Another method is to use the ARRAY_CONTAINS function in the SQL API.
Here is the sample code:
SELECT *
FROM Employees
WHERE ARRAY_CONTAINS(["01236", "01237", "01263", "06152", "21224"], Employees.id)
I ran both queries (this one and the IN version) against a sample dataset, and both consumed the same number of RUs.
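With ARRAY_CONTAINS the whole id list can travel as a single array-valued parameter, which keeps the query text the same regardless of how many ids are passed. A minimal sketch; the helper name build_array_contains_query is hypothetical, and the dict shape is assumed to match what your SDK accepts for parameterized queries.

```python
from typing import Dict, Iterable


def build_array_contains_query(ids: Iterable[str]) -> Dict:
    """Build a query spec that passes all ids as a single array parameter."""
    return {
        "query": "SELECT * FROM Employees e WHERE ARRAY_CONTAINS(@ids, e.id)",
        "parameters": [{"name": "@ids", "value": list(ids)}],
    }


spec = build_array_contains_query(["01236", "01237", "01263"])
print(spec["parameters"][0]["value"])
```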
