Correct way to query a Cosmos DB Table - azure

I am trying to use Cosmos DB Tables. What I am noticing is that if I query on the Timestamp property, no data is returned.
Here's the query I am using:
Timestamp ge datetime'2010-01-01T00:00:00'
I believe my query is correct because the same query runs perfectly fine against a table in my Storage Account.
If I query on any other attribute, the query runs perfectly fine.
I tried running this query in both Cerebrata Cerulean and in Microsoft Storage Explorer and I am getting no results in both places.
However, when I run the same query in the Azure Portal's Data Explorer, data is returned. I opened developer tools in the Azure Portal and noticed that the Portal is not making an OData query; instead, it is making a SQL API query. For example, in the above case the query being sent is:
Select * from c where c._ts > [epoch value indicating time]
Similarly if I query on an attribute using the tools above:
AttributeName eq 'Some Attribute Value'
The same query is sent by the Azure Portal as:
SELECT * FROM c WHERE c.AttributeName["$v"] = 'Some Attribute Value'
All the documentation states that I should be able to write OData queries and they should work, but I am not finding that to be the case.
So what's the correct way of querying Cosmos DB Tables?
UPDATE
It seems this is not a problem with just the Timestamp property but with all Edm.DateTime properties.
UPDATE #2
So I opened up my Cosmos DB Table account as a SQL API account to see how the data is actually stored under the hood.
The first thing I observed is that the Timestamp property is not stored as a regular property at all. The value of Timestamp (of the Storage Table entity) is instead stored in the _ts system property, as epoch seconds.
The next thing I noticed is that all date/time properties are converted into 20-character-long strings and stored something like the following:
"SourceTimestamp": {
"$t": 9,
"$v": "00637219463290953744"
},
I am wondering if that has something to do with not being able to issue OData queries directly.
BTW, I forgot to mention that I am using the Azure Storage Node SDK to access my Cosmos DB Table account (as this is what Microsoft recommends, considering there's no Node SDK specifically for the Table API).

Thanks for your patience while I looked into this.
The root cause of this behavior is that Storage Tables store timestamps with tick granularity, while Cosmos DB's _ts property has second-level granularity. This isn't OData related. We actually block queries on timestamp properties because they were confusing customers, and Timestamp-based queries are not recommended for Storage Tables in general.
The workaround for this is to add your own custom datetime or long data type property and set the value yourself from the client.
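One possible shape for that workaround (my own sketch, not from the answer; the `EventTicks` property name and the zero-padded tick-string encoding are assumptions, modeled on the 20-character strings the question observed under the hood):

```javascript
// Workaround sketch: persist your own timestamp on each entity and
// query on it instead of the built-in Timestamp property.
// .NET-style ticks = 100-ns intervals since 0001-01-01T00:00:00Z.
const EPOCH_TICKS = 621355968000000000n; // ticks at the Unix epoch

// Zero-pad to 20 chars so lexicographic string order matches time order
// (the same shape the question observed for stored Edm.DateTime values).
function toPaddedTicks(date) {
  const ticks = BigInt(date.getTime()) * 10000n + EPOCH_TICKS;
  return ticks.toString().padStart(20, "0");
}

// Build the OData filter against the custom property (name is made up).
function eventTicksFilter(since) {
  return `EventTicks ge '${toPaddedTicks(since)}'`;
}

console.log(eventTicksFilter(new Date("2020-01-01T00:00:00Z")));
// → EventTicks ge '00637134336000000000'
```

Set `EventTicks` via `toPaddedTicks(...)` when inserting each entity; a plain numeric epoch-milliseconds property works just as well if you don't need string ordering.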
We will address this in a future update but this work is not currently scheduled.
Thanks.

Related

How to view Insert/Update operations performed on Cosmos DB SQL API account on Diagnostic logs?

I have a Cosmos DB NoSQL API account and I want to see the Insert/Update operations performed on the account.
I have enabled diagnostic logs and am using the CDBQueryRuntimeStatistics table to get the details. It shows only the Query operations in the diagnostic logs, not all the operations.
Am I missing something here?
I am expecting this to show all the operations, including Insert/Update as well.
It looks like you are using the wrong table in this case. As defined here:
This table details query operations executed against a SQL API
account. By default, the query text and its parameters are obfuscated
to avoid logging PII data with full text query logging available by
request.
Instead, you should use the CDBDataPlaneRequests table:
The DataPlaneRequests table captures every data plane operation for
the Cosmos DB account. Data Plane requests are operations executed to
create, update, delete or retrieve data within the account.
It should give you the required information, including all Create/Upsert operations. Here are some sample queries.
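As a hedged illustration of such a query (a sketch; the exact OperationName values and columns should be checked against your workspace's CDBDataPlaneRequests schema), something along these lines surfaces the write operations:

```
CDBDataPlaneRequests
| where TimeGenerated > ago(1h)
| where OperationName in ("Create", "Upsert", "Replace", "Delete")
| summarize count() by OperationName, bin(TimeGenerated, 5m)
```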

Ordering data in Azure Cosmos Table API

Azure Storage Tables have been superseded by the Azure Cosmos Table API, at a significantly higher price point but also with new features like automatic secondary indexing.
One of the pain points of using Azure Storage Tables was that, in order to achieve custom ordering of query results, we had to redundantly store the data with different Partition/Row keys, as the documentation states that
Query results returned by the Table service are sorted in ascending
order based on PartitionKey and then by RowKey.
However, the next paragraph states that
Query results returned by the Azure Table API in Azure DB are not
sorted by partition key or row key. For a detailed list of feature
differences, see differences between Table API in Azure Cosmos DB and
Azure Table storage.
Following the link, I find that
Query results returned by the Table API aren't sorted in partition
key/row key order as they're in Azure Table storage.
So I am a bit confused now about how to achieve ordering when using the Cosmos Table API. Is there no ordering at all? Or can I specify ordering in my queries?
For the Azure Cosmos Table API, this one is correct: "Query results returned by the Azure Table API in Azure DB are not sorted by partition key or row key".
So the returned results are unsorted as of now.
Somebody asked about this issue on GitHub here.
The MS team suggests voting on this UserVoice item; they may add this basic sort feature in the future.
Hope it helps.
Additional information on this topic I found in the GitHub thread:
The latest preview of the Cosmos DB Tables SDK (0.11.0-preview) has OrderBy support:
https://github.com/MicrosoftDocs/azure-docs/issues/26228#issuecomment-471095278
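Until server-side ordering is available to you, one option (my own suggestion, not from the thread) is to sort the fetched results client-side into the order Storage Tables would have returned them in. A minimal sketch, assuming plain `{ PartitionKey, RowKey }` objects rather than SDK-wrapped values:

```javascript
// Entities come back from the Table API in no guaranteed order; sort a
// fetched page client-side by PartitionKey, then RowKey (the order
// Azure Storage Tables guarantees).
function byStorageTableOrder(a, b) {
  if (a.PartitionKey !== b.PartitionKey) {
    return a.PartitionKey < b.PartitionKey ? -1 : 1;
  }
  return a.RowKey < b.RowKey ? -1 : a.RowKey > b.RowKey ? 1 : 0;
}

const page = [
  { PartitionKey: "b", RowKey: "1" },
  { PartitionKey: "a", RowKey: "2" },
  { PartitionKey: "a", RowKey: "1" },
];
console.log(page.slice().sort(byStorageTableOrder));
```

Note that this only orders the entities you have already fetched; with continuation tokens you must drain all pages before sorting.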

Retrieving latest inserted _id(ObjectId) in azure cosmos db(mongo db API)

I would like to know if we can retrieve the latest inserted _id (ObjectId) created by Cosmos DB (MongoDB API) on the same connection (similar to SCOPE_IDENTITY() in SQL Server). I'm inserting the document from Azure Functions using the Cosmos DB output binding.
To my knowledge, there is no function similar to SQL Server's SCOPE_IDENTITY() in the MongoDB API.
We can get the latest document by sorting on Azure Cosmos DB's internal timestamp (_ts) property, which is a number representing the number of elapsed seconds since January 1, 1970.
The query will be like:
db.YourCollection.find().sort({"_ts": -1}).limit(1)

How can I verify data uploaded to CosmosDB?

I have dataset of 442k JSON documents in single ~2.13GB file in Azure Data Lake Store.
I've uploaded it to a collection in Cosmos DB via an Azure Data Factory pipeline. The pipeline completed successfully.
But when I went to Cosmos DB in the Azure Portal, I noticed that the collection size is only 1.5 GB. I've tried to run SELECT COUNT(c.id) FROM c for this collection, but it returns only 19k. I've also seen complaints that this count function is not reliable.
If I open collection preview, first ~10 records match my expectations (ids and content are the same as in ADLS file).
Is there a way to quickly get real record count? Or some other way to be sure that nothing is lost during import?
According to this article, you can find:
When using the Azure portal's Query Explorer, note that aggregation queries may return partially aggregated results over a query page. The SDKs produce a single cumulative value across all pages.
In order to perform aggregation queries using code, you need .NET SDK 1.12.0, .NET Core SDK 1.1.0, or Java SDK 1.9.5 or above.
So I suggest you first try using the Azure DocumentDB SDK to get the count value.
For more details about how to use it, you can refer to this article.
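To make the "partially aggregated results" point concrete, here is a toy sketch (the page values are invented) of what the SDKs do under the hood: a COUNT runs per query page, and the SDK drains all continuation tokens and sums the partial aggregates, while the portal may show only a single page's partial value:

```javascript
// Each query page returns a partial COUNT; the true total is the sum
// across all pages. `pages` stands in for the per-page results an SDK
// would iterate via continuation tokens.
function cumulativeCount(pages) {
  return pages.reduce((sum, partial) => sum + partial, 0);
}

const pages = [19000, 201000, 222000]; // invented partial counts
console.log(cumulativeCount(pages)); // the cumulative total across pages
```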

Azure Search default database type

I am new to Azure Search and I have just seen this tutorial https://azure.microsoft.com/en-us/documentation/articles/search-howto-dotnet-sdk/ on how to create/delete an index, upload documents, and search for documents. However, I am wondering what type of database is behind the Azure Search functionality. In the given example I couldn't see it specified. Am I right if I assume it is implicitly DocumentDB?
At the same time, how could I specify another database type in the code? How could I possibly use a SQL Server database? Thank you!
However, I am wondering what type of database is behind the Azure
Search functionality.
Azure Search is offered to you as a service. The team hasn't made the underlying storage mechanism public, so it's not possible to know what kind of database they are using to store the data. However, you interact with the service in the form of JSON records. Each document in your index is sent/retrieved (and possibly saved) in the form of JSON.
At the same time, how could I specify the type of another database
inside the code? How could I possibly use a Sql Server database?
Short answer: you can't. Because it is a service, you can't choose or configure the database the service uses internally. However, what you can do is have the search service populate its database (read: index) from multiple sources - SQL databases, DocumentDB collections, and blob containers (currently in preview). This is achieved through Data Sources and Indexers. Once configured properly, the Azure Search service will continually update the index with the latest data from the specified data source.
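For instance, a SQL Database data source for an indexer is defined through the Azure Search REST API with a JSON body roughly like this (a sketch with placeholder names; check the REST API reference for the exact fields and API version):

```
{
  "name": "my-sql-datasource",
  "type": "azuresql",
  "credentials": { "connectionString": "<your connection string>" },
  "container": { "name": "MyTable" }
}
```

You then create an indexer that points at this data source and your target index.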
