Why does this Cosmos SQL query require a subquery?

I'm trying to understand why my query below will only work when using a subquery.
Sample document structure:
{
    "id": "78832-fsdfdf-3242",
    "type": "Specific",
    "title": "JavaScript vs TypeScript",
    "summary": "Explain the differences between JavaScript and TypeScript.",
    "products": [
        "javascript v6",
        "typescript v1",
        "node.js"
    ]
}
Query requirements:
Find the id of all documents where the terms 'javascript' or 'csharp' or 'coding' are contained in either the title, summary or in one of the listed products.
To solve this, I'm using CONTAINS(). To avoid repeating the CONTAINS() for each combination of field and search term, I create a concatenation of the fields in question and name it searchField.
Working query
This is the query I came up with. It's using a subquery sub to add the concatenated fields and products to the result set. Then, I can use CONTAINS() on sub.searchField.
SELECT sub.id
FROM
(
    SELECT
        o.id,
        o.type,
        CONCAT(o.title, " ", o.summary, " ", p) as searchField
    FROM o
    JOIN p in o.products
) sub
WHERE
    sub.type = "Specific"
    AND
    (
        CONTAINS(sub.searchField, "javascript", true)
        OR CONTAINS(sub.searchField, "csharp", true)
        OR CONTAINS(sub.searchField, "coding", true)
    )
Non-working query
Originally, I had the query written as seen below. I expected it to work as in other SQL dialects, but I cannot access searchField in the WHERE clause.
"Error: Identifier 'searchField' could not be resolved."
SELECT o.id, CONCAT(o.title, " ", o.summary, " ", p) as searchField
FROM o
WHERE
    o.type = "Specific"
    AND
    (
        CONTAINS(searchField, "javascript", true)
        OR CONTAINS(searchField, "csharp", true)
        OR CONTAINS(searchField, "coding", true)
    )
Questions
Is there a better way to achieve the result needed? (Although, surprisingly, the query consumes only 230 RUs)
Why is the subquery needed? I really want to understand this so I can learn when to use subqueries and potentially write queries that would otherwise not be possible.
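One way to picture the behaviour, assuming Cosmos DB follows the usual SQL logical order (FROM/JOIN first, then WHERE, then the SELECT projection): the alias searchField is only created by a SELECT, so a WHERE at the same level runs too early to see it, while the WHERE of an outer query runs on the subquery's output. The Python below is only a toy model of that ordering, not how the engine is implemented. The function names and the second document are made up for illustration.

```python
# Toy model of the logical order FROM/JOIN -> WHERE -> SELECT.
docs = [
    {
        "id": "78832-fsdfdf-3242",
        "type": "Specific",
        "title": "JavaScript vs TypeScript",
        "summary": "Explain the differences between JavaScript and TypeScript.",
        "products": ["javascript v6", "typescript v1", "node.js"],
    },
    {
        "id": "hypothetical-2",  # made-up document that should not match
        "type": "Specific",
        "title": "Baking bread",
        "summary": "Flour, water, salt.",
        "products": ["oven"],
    },
]

TERMS = ("javascript", "csharp", "coding")


def from_join(documents):
    """FROM o JOIN p IN o.products: one row per (document, product) pair."""
    for o in documents:
        for p in o["products"]:
            yield {"o": o, "p": p}


def inner_select(rows):
    """The subquery's SELECT: this is the step that creates searchField."""
    for row in rows:
        o, p = row["o"], row["p"]
        yield {
            "id": o["id"],
            "type": o["type"],
            "searchField": f'{o["title"]} {o["summary"]} {p}',
        }


def outer_where(rows):
    """Outer WHERE: it sees the subquery output, so searchField exists."""
    for sub in rows:
        text = sub["searchField"].lower()  # case-insensitive, like CONTAINS(..., true)
        if sub["type"] == "Specific" and any(term in text for term in TERMS):
            yield sub


def flat_where(rows):
    """WHERE of the flat query: rows only hold o and p at this point."""
    for row in rows:
        row["searchField"]  # raises KeyError, the alias does not exist yet
        yield row


matching_ids = [sub["id"] for sub in outer_where(inner_select(from_join(docs)))]
print(matching_ids)

try:
    next(flat_where(from_join(docs)))
except KeyError as missing:
    print("could not be resolved:", missing)
```

In this model the matching id comes back once per product row, because the JOIN fans each document out before the filter is applied.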

Related

Cosmos db null value

I have two kinds of records, shown below, in my Cosmos DB table staudentdetail. In the example below, previousSchooldetail is a nullable field that may or may not be present for a student.
Sample records:
{
    "empid": "1234",
    "empname": "ram",
    "schoolname": "high school ,bankur",
    "class": "10",
    "previousSchooldetail": {
        "prevSchoolName": "1763440",
        "YearLeft": "2001"
    } --(Nullable)
}
{
    "empid": "12345",
    "empname": "shyam",
    "schoolname": "high school",
    "class": "10"
}
I am trying to read these records from Azure Databricks using PySpark or Scala. When we build the DataFrame by reading from Cosmos DB, it does not contain previousSchooldetail. When we change the query to filter on an id for which previousSchooldetail exists, the field does show up in the DataFrame.
Case 1:
val Query = "SELECT * FROM c "
Result when the query is fired directly:
empid
empname
schoolname
class
Case 2:
val Query = "SELECT * FROM c where c.empid=1234"
Result when the query is fired with a where clause:
empid
empname
schoolname
class
previousSchooldetail
    prevSchoolName
    YearLeft
Could you please tell me why I am not getting previousSchooldetail in case 1, and how I should proceed?
As @Jayendran mentioned in the comments, the first query returns previousSchooldetail wherever it is available. Otherwise, the column is not present.
You can have this column present for all the scenarios by using the IS_DEFINED function. Try tweaking your query as below:
SELECT c.empid,
       c.empname,
       IS_DEFINED(c.previousSchooldetail) ? c.previousSchooldetail : null
           as previousSchooldetail,
       c.schoolname,
       c.class
FROM c
If you want the result as a flat structure, it gets trickier and needs two separate queries, such as:
Query 1
SELECT c.empid,
       c.empname,
       c.schoolname,
       c.class,
       p.prevSchoolName,
       p.YearLeft
FROM c JOIN c.previousSchooldetail p
Query 2
SELECT c.empid,
       c.empname,
       c.schoolname,
       c.class,
       null as prevSchoolName,
       null as YearLeft
FROM c
WHERE not IS_DEFINED(c.previousSchooldetail) or
      c.previousSchooldetail = null
Unfortunately, Cosmos DB does not support LEFT JOIN or UNION. Hence, I'm not sure if you can achieve this in a single query.
Alternatively, you can create a stored procedure to return the desired result.
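Another option, if running two queries is awkward, is to read the documents as they are and flatten them on the client. The sketch below is plain Python and assumes the documents have already been fetched as dictionaries. flatten_student is a hypothetical helper, not part of any Cosmos DB or Spark API.

```python
def flatten_student(doc):
    """Return one flat row; a missing or null previousSchooldetail gives None values."""
    prev = doc.get("previousSchooldetail") or {}
    return {
        "empid": doc.get("empid"),
        "empname": doc.get("empname"),
        "schoolname": doc.get("schoolname"),
        "class": doc.get("class"),
        "prevSchoolName": prev.get("prevSchoolName"),
        "YearLeft": prev.get("YearLeft"),
    }


# The two sample records from the question.
records = [
    {
        "empid": "1234",
        "empname": "ram",
        "schoolname": "high school ,bankur",
        "class": "10",
        "previousSchooldetail": {"prevSchoolName": "1763440", "YearLeft": "2001"},
    },
    {
        "empid": "12345",
        "empname": "shyam",
        "schoolname": "high school",
        "class": "10",
    },
]

rows = [flatten_student(doc) for doc in records]
for row in rows:
    print(row)
```

Every row then carries the same six keys, which is what a fixed DataFrame schema needs.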

How to query CosmosDB for nested object value

How can I retrieve objects which match order_id = 9234029m, given this document in CosmosDB:
{
    "order": {
        "order_id": "9234029m",
        "order_name": "name"
    }
}
I have tried to query in CosmosDB Data Explorer, but it's not possible to simply query the nested order_id object like this:
SELECT * FROM c WHERE c.order.order_id = "9234029m"
(Err: "Syntax error, incorrect syntax near 'order'")
This seems like it should be so simple, yet it's not! (In CosmosDB Data Explorer, all queries need to start with SELECT * FROM c, but REST SQL is an alternative as well.)
As you discovered, order is a reserved keyword, which was tripping up the query parsing. However, you can get past that, and still query your data, with slightly different syntax (bracket notation):
SELECT *
FROM c
WHERE c["order"].order_id = "9234029m"
This was due, apparently, to order being a reserved keyword in CosmosDB SQL, even if used as above.

Possible to add dynamic WHERE clause with a QueryFile?

I have a complex query stored in an SQL file and I would like to reuse it for various routes but change up the WHERE clause depending on the route. This would be instead of having a large complex query in multiple files with the only difference being the WHERE statement.
Is it possible to dynamically add a WHERE when using QueryFile? Simplified example below:
SELECT "id", "userId", "locationId", "title", "body",
    (
        SELECT row_to_json(sqUser)
        FROM (
            SELECT "id", "firstname", "lastname"
            FROM "users"
            WHERE "users"."id" = "todos"."userId"
        ) sqUser
    ) as "user"
FROM "todos"
const queryIndex = new pgp.QueryFile('sql/todos/index.pgsql', queryOptions);
// 1. Use as is to get a list of all todos
// 2. OR Append WHERE "locationId" = $1 to get list filtered by location
// 3. OR Append WHERE "id" = $1 to get a specific item
// without having three separate SQL files?
It seems like (maybe?) you could get away with adding the below in the query file, but that still feels limiting (it would still need two files for = and LIKE, and it still limits to only one WHERE condition). It also feels weird to do something like WHERE 1 = 1 to get all records to return.
WHERE $1 = $2
I would be interested in hearing people's thoughts on this, or whether there is a better approach.
You can inject dynamic condition into a query-file as Raw Text:
SELECT "id", "userId", "locationId", "title", "body",
    (
        SELECT row_to_json(sqUser)
        FROM (
            SELECT "id", "firstname", "lastname"
            FROM "users"
            ${condition:raw}
        ) sqUser
    ) as "user"
FROM "todos"
Pre-formatted parameters, based on the condition:
// Generate a condition, based on the business logic:
const condition = pgp.as.format('WHERE col_1 = $1 AND col_2 = $2', [111, 222]);
Executing your query-file:
await db.any(myQueryFile, {condition});
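For the "generate a condition" step, the business logic could look roughly like the sketch below. It is written in Python only to keep it library-neutral, and build_condition with its two parameters is a hypothetical helper based on the three routes in the question, not part of pg-promise. It returns the clause text with positional placeholders plus the matching values.

```python
def build_condition(location_id=None, todo_id=None):
    """Return (where_clause, values); an empty clause means no filtering."""
    clauses = []
    values = []
    if location_id is not None:
        values.append(location_id)
        clauses.append(f'"locationId" = ${len(values)}')
    if todo_id is not None:
        values.append(todo_id)
        clauses.append(f'"id" = ${len(values)}')
    where = "WHERE " + " AND ".join(clauses) if clauses else ""
    return where, values


print(build_condition())                 # route 1: all todos
print(build_condition(location_id=7))    # route 2: filter by location
print(build_condition(todo_id=42))       # route 3: one specific item
```

The returned pair has the same shape as the two arguments given to pgp.as.format above, and the empty clause covers the "all todos" route without resorting to WHERE 1 = 1.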
Advanced
The above is for the scenario when you have a simple dynamic condition that you want to generate in the code. But sometimes you may have complex static conditions that you want to alternate. In this case, you can have your master query file refer to the condition from a slave query file (nested query files are supported right out of the box). And in this case you do not even need to use the :raw filter, because query files are injected as raw text by default:
Master query:
SELECT * FROM table ${condition}
Load your slave query files with complex conditions (when the app starts):
const conditionQueryFile1 = new QueryFile(...);
const conditionQueryFile2 = new QueryFile(...);
Selecting the right slave query, based on the business logic:
const condition = conditionQueryFile1; // some business logic here;
Executing master query with a slave as parameter:
await db.any(myQueryFile, {condition});

cosmos db sql query with non alphanumeric field name

My data structure in Cosmos DB is as follows:
{
    "_id": {
        "$oid": "554f7dc4e4b03c257a33f75c"
    },
    .................
}
and I need to sort the collection by the $oid field. How should I form my SQL query?
The normal query SELECT TOP 10 * FROM collection c ORDER BY c._id.fieldname does not work if fieldname starts with $, like $oid.
I am using query explorer from azure portal.
To use a special character, like $, you need to use bracket notation:
SELECT c._id FROM c
order by c._id["$oid"]
You can do this with each property in the hierarchy, so the following also works:
SELECT c._id FROM c
order by c["_id"]["$oid"]
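If the query text is assembled in code, the bracket form can be generated for every level of the hierarchy. The following is a minimal Python sketch. bracket_path is a hypothetical helper, and it simply rejects names containing a double quote or backslash instead of guessing at escaping rules.

```python
def bracket_path(alias, *names):
    """Build alias["name1"]["name2"]... for use inside a query string."""
    parts = []
    for name in names:
        if '"' in name or "\\" in name:
            # Escaping is deliberately left out of this sketch.
            raise ValueError(f"unsupported character in property name: {name!r}")
        parts.append(f'["{name}"]')
    return alias + "".join(parts)


query = f'SELECT c._id FROM c ORDER BY {bracket_path("c", "_id", "$oid")}'
print(query)
```

Quoting every level is the same as the second query above, so the helper does not need to know which names contain special characters.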

Azure Document Query Sub Dictionaries

I have stored the following JSON document in the Azure Document DB:
"JobId": "04e63d1d-2af1-42af-a349-810f55817602",
"JobType": 3,
"Properties": [
    {
        "Key": "Value1",
        "Value": "testing1"
    },
    {
        "Key": "Value",
        "Value": "testing2"
    }
]
When I query the document back, I can easily run:
Select f.id,f.Properties, C.Key from f Join C IN f.Properties where C.Key = 'Value1'
However when i try to query:
Select f.id,f.Properties, C.Key from f Join C IN f.Properties where C.Value = 'testing1'
I get an error that the query cannot be computed. I assume this is due to 'VALUE' being a reserved keyword within the query language.
I cannot specify a specific order in the property array because different subclasses can add different property in different orders as they need them.
Does anybody have a suggestion for how I can still complete this query?
To escape keywords in DocumentDB, you can use the [] syntax. For example, the above query would be:
Select f.id,f.Properties, C.Key from f Join C IN f.Properties where C["Value"] = 'testing1'
