I have two kinds of records, shown below, in my table staudentdetail in Cosmos DB. In this example, previousSchooldetail is a nullable field: it may or may not be present for a given student.
Sample records:
{
"empid": "1234",
"empname": "ram",
"schoolname": "high school ,bankur",
"class": "10",
"previousSchooldetail": {
"prevSchoolName": "1763440",
"YearLeft": "2001"
} --(Nullable)
}
{
"empid": "12345",
"empname": "shyam",
"schoolname": "high school",
"class": "10"
}
I am trying to access the above records from Azure Databricks using PySpark or Scala code. But when we build the DataFrame by reading from Cosmos DB, previousSchooldetail is not included in the DataFrame. When we change the query to filter on an id for which previousSchooldetail exists, the field does show up in the DataFrame.
Case 1:
val Query = "SELECT * FROM c "
Result when the query is run directly:
empid
empname
schoolname
class
Case 2:
val Query = "SELECT * FROM c where c.empid=1234"
Result when the query is run with a WHERE clause:
empid
empname
schoolname
class
previousSchooldetail
prevSchoolName
YearLeft
Could you please tell me why I am not able to get previousSchooldetail in Case 1, and how I should proceed?
As @Jayendran mentioned in the comments, the first query returns previousSchooldetail only for the documents where it is defined. For documents without it, the property is simply absent from the result.
You can have this column present in all cases by using the IS_DEFINED function. Try tweaking your query as below:
SELECT c.empid,
c.empname,
IS_DEFINED(c.previousSchooldetail) ? c.previousSchooldetail : null
as previousSchooldetail,
c.schoolname,
c.class
FROM c
If you are looking to get the result as a flat structure, it is trickier and needs two separate queries, such as:
Query 1
SELECT c.empid,
c.empname,
c.schoolname,
c.class,
p.prevSchoolName,
p.YearLeft
FROM c JOIN c.previousSchooldetail p
Query 2
SELECT c.empid,
c.empname,
c.schoolname,
c.class,
null as prevSchoolName,
null as YearLeft
FROM c
WHERE not IS_DEFINED (c.previousSchooldetail) or
c.previousSchooldetail = null
Unfortunately, Cosmos DB does not support LEFT JOIN or UNION. Hence, I'm not sure if you can achieve this in a single query.
Alternatively, you can create a stored procedure to return the desired result.
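Since the two queries cannot be combined on the server, another option is to flatten the documents on the client after reading them. The following is only a minimal Python sketch, assuming the documents have already been fetched as plain dictionaries; flatten_student is a hypothetical helper name, not part of any Cosmos DB SDK.

```python
from typing import Any, Dict, Optional


def flatten_student(doc: Dict[str, Any]) -> Dict[str, Optional[str]]:
    """Return one flat row; the previous-school fields are None when absent."""
    # A missing property and an explicit null are treated the same way.
    prev = doc.get("previousSchooldetail") or {}
    return {
        "empid": doc.get("empid"),
        "empname": doc.get("empname"),
        "schoolname": doc.get("schoolname"),
        "class": doc.get("class"),
        "prevSchoolName": prev.get("prevSchoolName"),
        "YearLeft": prev.get("YearLeft"),
    }


# The two sample records from the question.
docs = [
    {
        "empid": "1234",
        "empname": "ram",
        "schoolname": "high school ,bankur",
        "class": "10",
        "previousSchooldetail": {"prevSchoolName": "1763440", "YearLeft": "2001"},
    },
    {"empid": "12345", "empname": "shyam", "schoolname": "high school", "class": "10"},
]

rows = [flatten_student(d) for d in docs]
```

Every row then has the same six keys, so a DataFrame built from rows gets a stable schema regardless of which documents carry previousSchooldetail.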
I'm trying to understand why my query below will only work when using a subquery.
Sample document structure:
{
"id": "78832-fsdfdf-3242",
"type": "Specific",
"title": "JavaScript vs TypeScript",
"summary": "Explain the differences between JavaScript and TypeScript.",
"products": [
"javascript v6",
"typescript v1",
"node.js"
]
}
Query requirements:
Find the id of all documents where the terms 'javascript' or 'csharp' or 'coding' are contained in either the title, summary or in one of the listed products.
To solve this, I'm using CONTAINS(). To avoid repeating the CONTAINS() for each combination of field and search term, I create a concatenation of the fields in question and name it searchField.
Working query
This is the query I came up with. It's using a subquery sub to add the concatenated fields and products to the result set. Then, I can use CONTAINS() on sub.searchField.
SELECT sub.id
FROM
(
SELECT
o.id,
o.type,
CONCAT(o.title, " ", o.summary, " ", p) as searchField
FROM o
JOIN p in o.products
) sub
WHERE
sub.type = "Specific"
AND
(
CONTAINS(sub.searchField, "javascript", true)
OR CONTAINS(sub.searchField, "csharp", true)
OR CONTAINS(sub.searchField, "coding", true)
)
Non-working query
Originally, I had the query written as seen below. I expected it to work as in other SQL dialects, but I cannot access searchField in the WHERE clause.
"Error: Identifier 'searchField' could not be resolved."
SELECT o.id, CONCAT(o.title, " ", o.summary, " ", p) as searchField
FROM o
WHERE
o.type = "Specific"
AND
(
CONTAINS(searchField, "javascript", true)
OR CONTAINS(searchField, "csharp", true)
OR CONTAINS(searchField, "coding", true)
)
Questions
Is there a better way to achieve the result needed? (Although, surprisingly, the query consumes only 230 RUs)
Why is the subquery needed? I really want to understand this so I can learn when to use subqueries and potentially write queries that would otherwise not be possible.
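To make the intent of the working query concrete, here is a rough Python emulation of what it computes, run against the sample document. It is only an illustration of the matching logic as I understand it, not what Cosmos DB does internally; matching_ids and TERMS are hypothetical names.

```python
from typing import Any, Dict, Iterable, List

TERMS = ("javascript", "csharp", "coding")


def matching_ids(
    docs: Iterable[Dict[str, Any]],
    terms: Iterable[str] = TERMS,
    doc_type: str = "Specific",
) -> List[str]:
    """Emulate the JOIN + CONCAT + CONTAINS(..., true) query on the client."""
    wanted = [t.lower() for t in terms]
    ids: List[str] = []
    for doc in docs:
        if doc.get("type") != doc_type:
            continue
        # JOIN p IN o.products yields one row per product.
        for product in doc.get("products", []):
            search_field = f'{doc["title"]} {doc["summary"]} {product}'.lower()
            # The third CONTAINS argument (true) means case-insensitive.
            if any(term in search_field for term in wanted):
                ids.append(doc["id"])
    return ids


sample = {
    "id": "78832-fsdfdf-3242",
    "type": "Specific",
    "title": "JavaScript vs TypeScript",
    "summary": "Explain the differences between JavaScript and TypeScript.",
    "products": ["javascript v6", "typescript v1", "node.js"],
}

found = matching_ids([sample])
```

In this emulation the id appears once per product row whose searchField matches, so a document whose title already matches is listed once for each of its products, and a document with an empty products array is never returned.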
I want to return both the count and the data with a single Cosmos DB SQL query.
Something like:
Select *, count() from c
Or, if possible, I want to get the count in a JSON document:
[
{
"Count" : 1111
},
{
"Name": "Jon",
"Age" : 30
}
]
You're going to have to issue two separate queries - one to get the total number of documents matching your query, and a second to get a page of documents.
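A minimal Python sketch of the two-query approach, producing the JSON shape asked for in the question. The run_query callable is an assumption: it stands in for whatever SDK call executes a query and returns the results as a list. It is faked here so the example runs on its own.

```python
from typing import Any, Callable, Dict, List

COUNT_QUERY = "SELECT VALUE COUNT(1) FROM c"
PAGE_QUERY = "SELECT * FROM c"


def count_and_page(run_query: Callable[[str], List[Any]]) -> List[Dict[str, Any]]:
    """Run the count query and the data query, then merge the results."""
    total = run_query(COUNT_QUERY)[0]  # SELECT VALUE returns the bare number
    page = run_query(PAGE_QUERY)
    return [{"Count": total}, *page]


def fake_run_query(query: str) -> List[Any]:
    """Stand-in for a real query call (hypothetical, for illustration only)."""
    data = [{"Name": "Jon", "Age": 30}]
    if query == COUNT_QUERY:
        return [len(data)]
    return data


result = count_and_page(fake_run_query)
```

With a real client, any filter in the data query would need to be repeated in the count query so that both describe the same set of documents.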
How can I retrieve objects which match order_id = 9234029m, given this document in CosmosDB:
{
"order": {
"order_id": "9234029m",
"order_name": "name",
}
}
I have tried to query in CosmosDB Data Explorer, but it's not possible to simply query the nested order_id object like this:
SELECT * FROM c WHERE c.order.order_id = "9234029m"
(Err: "Syntax error, incorrect syntax near 'order'")
This seems like it should be so simple, yet it's not! (In CosmosDB Data Explorer, all queries need to start with SELECT * FROM c, but REST SQL is an alternative as well.)
As you discovered, order is a reserved keyword, which was tripping up the query parsing. However, you can get past that, and still query your data, with slightly different syntax (bracket notation):
SELECT *
FROM c
WHERE c["order"].order_id = "9234029m"
This was due, apparently, to order being a reserved keyword in CosmosDB SQL, even if used as above.
My data structure in Cosmos DB is as follows:
{
"_id": {
"$oid": "554f7dc4e4b03c257a33f75c"
},
.................
}
and I need to sort the collection by the $oid field. How should I form my SQL query?
A normal query such as SELECT TOP 10 * FROM collection c ORDER BY c._id.fieldname does not work if the field name starts with $, like $oid.
I am using the Query Explorer in the Azure portal.
To use a special character, like $, you need to use bracket notation:
SELECT c._id FROM c
order by c._id["$oid"]
You can do this with each property in the hierarchy, so the following also works:
SELECT c._id FROM c
order by c["_id"]["$oid"]
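If you build such queries in code, the bracket form can be generated for every property so that special characters never need special handling. A small Python sketch follows; property_path is a hypothetical helper, and it assumes that JSON-style string quoting is acceptable inside the brackets.

```python
import json


def property_path(alias: str, *names: str) -> str:
    """Build alias["a"]["b"]... with every property name quoted."""
    # json.dumps adds the double quotes and escapes embedded quotes/backslashes.
    return alias + "".join(f"[{json.dumps(name)}]" for name in names)


sort_query = f'SELECT c._id FROM c ORDER BY {property_path("c", "_id", "$oid")}'
```

Here sort_query ends with c["_id"]["$oid"], the same form as the second query above.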
I want to retrieve about 50 - 100 documents by their ID from DocumentDb. I have the list of IDs in a List<string>. How do I use LINQ to SQL to retrieve those documents? I don't want to write the actual SQL query as a string, as in:
IQueryable<Family> results = client.CreateDocumentQuery<Family>(collectionUri, "SELECT * FROM family WHERE State IN ('TX', 'NY')", DefaultOptions);
I want to be able to use lambda expressions to create the query, because I don't want to hard-code the names of the fields as strings.
It seems that you do not want to build and pass the query string SELECT * FROM family WHERE State IN ('TX', 'NY') to query documents. You could try the following code:
List<string> ids = new List<string>() { "TX", "NY" };
client.CreateDocumentQuery<Family>(collectionUri).Where(d => ids.Contains(d.id));