Partition key is ignored in Cosmos DB - Azure

I have a flow where I send a JSON document to the Service Bus, and a function listens to the topic and creates a document in my Cosmos DB.
The Cosmos DB collection has the partition key "targetid".
When I insert the document from the Function, it is created and I can pull it again from C# using CreateDocumentQuery, but I can't see the document in the portal, and no logical partitions have been created based on the value of the targetid property.
If I create a document directly from the portal and pull it with CreateDocumentQuery in my application, that document also has a completely different format than the documents created by the application itself through Service Bus and Functions.

Cosmos DB Change Feed (what the Cosmos DB Trigger reads) is a feature of Cosmos DB surfaced on the Core/SQL API; at this point it is not available for MongoDB API accounts.
You can verify the compatibility matrix in the official documentation.
As a side note, the fact that you are also using CreateDocumentQuery means that you are using the Core/SQL SDK. It would make sense to use a Core/SQL API account instead if you are not going to use the MongoDB drivers or clients.
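As a hedged illustration of the Core/SQL path, here is a minimal sketch of inserting a document whose partition key is simply a property on the body, using the same Microsoft.Azure.DocumentDB SDK that CreateDocumentQuery comes from; the endpoint, key, and database/collection names are placeholders.

using System;
using System.Threading.Tasks;
using Microsoft.Azure.Documents.Client;

class PartitionKeyDemo
{
    static async Task Main()
    {
        var client = new DocumentClient(
            new Uri("https://your-account.documents.azure.com:443/"), "<your-key>");
        var collectionUri = UriFactory.CreateDocumentCollectionUri("mydb", "mycollection");

        // On a Core/SQL API collection partitioned on /targetid, the logical
        // partition is derived from the document body itself: this document
        // lands in the logical partition "target-42".
        var doc = new { id = Guid.NewGuid().ToString(), targetid = "target-42", payload = "hello" };
        await client.CreateDocumentAsync(collectionUri, doc);
    }
}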

Related

Retrieving the latest inserted _id (ObjectId) in Azure Cosmos DB (MongoDB API)

I would like to know if we can retrieve the latest inserted _id (ObjectId) created by Cosmos DB (MongoDB) on the same connection (similar to SCOPE_IDENTITY() in SQL Server). I'm inserting the document from Azure Functions using the Cosmos DB output binding.
To my knowledge, there is no equivalent of SQL Server's SCOPE_IDENTITY() in the MongoDB API.
We can get the latest document by sorting on Azure Cosmos DB's internal timestamp (_ts) property, which is a number representing the elapsed seconds since January 1, 1970.
The query would look like this (note the descending sort, so the newest document comes first):
db.YourCollection.find().sort({"_ts": -1}).limit(1)
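For completeness, here is a minimal sketch of the same query from C# with the MongoDB .NET driver, since the question inserts from Azure Functions; the connection string, database, and collection names are placeholders. Keep in mind that _ts has one-second resolution, so ties between documents written in the same second are possible.

using MongoDB.Bson;
using MongoDB.Driver;

// Connect with your Cosmos DB MongoDB API connection string (placeholder below).
var client = new MongoClient("<your-cosmos-mongo-connection-string>");
var collection = client.GetDatabase("mydb").GetCollection<BsonDocument>("YourCollection");

// Sort descending on _ts so the most recently written document comes first.
var latest = collection.Find(FilterDefinition<BsonDocument>.Empty)
                       .Sort(Builders<BsonDocument>.Sort.Descending("_ts"))
                       .Limit(1)
                       .FirstOrDefault();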

Microsoft Cosmos DB (DocumentDB API) vs. Cosmos DB (Table API)

Microsoft Cosmos DB includes the DocumentDB API, the Table API, and others. I have about 10 TB of data and would like fast key-value lookups (very little updating and writing, mostly reading). Here is a link for Microsoft Cosmos DB:
https://learn.microsoft.com/en-us/azure/cosmos-db/
So how should I choose between DocumentDB API and Table API?
Or when should I choose DocumentDB API? When should I choose Table API?
Is it a good practice to use DocumentDB API to store 10 TB of data?
The Azure Cosmos DB Table API was introduced to make Cosmos DB and its advanced indexing, geo-distribution, etc. features available to the Azure Table storage community. The idea is that someone using Azure Table storage who needs more advanced features only offered by Cosmos DB can literally just change their connection string and their existing code will work with Cosmos DB.
But if you are a greenfield customer, then I would recommend using the SQL API (formerly called DocumentDB API), which is a superset of the Table API. We are constantly investing in more advanced features and capabilities for the SQL API, whereas for the Table API we are just looking to maintain compatibility with Azure Table storage's API, which hasn't changed in many years.
How much data you have doesn't have any effect on which API you choose. They both run on the same multi-model infrastructure and can handle the same data sizes, query loads, distribution, etc.
So how should I choose between DocumentDB API and Table API?
Choosing between the DocumentDB API and the Table API primarily depends on the kind of data you're going to store. The DocumentDB API provides a schema-less JSON database engine with SQL querying capabilities, whereas the Table API provides key-value storage. Since you mentioned that your data is key-value based, the recommendation is to use the Table API.
Or when should I choose DocumentDB API? When should I choose Table API?
Same as above.
Is it a good practice to use DocumentDB API to store 10 TB of data?
Both the DocumentDB API and the Table API are designed to store huge amounts of data.
However, you may want to look into Azure Table storage as well. Cosmos DB lets you fine-tune the throughput you need and offers robust indexing/querying support, and that comes at a price. Azure Tables, on the other hand, comes with fixed throughput and limited indexing/querying support, and is extremely cheap compared to Cosmos DB.
You may find this link helpful to explore more about Cosmos DB: https://learn.microsoft.com/en-us/azure/cosmos-db/introduction.
It might help to know this in advance: if you are considering the document interface, there is a case-insensitivity behavior that can affect how DataContract classes (and, I believe, all others) are transformed to and from Cosmos.
In the discussion linked below, you will see that there is a case insensitivity in Newtonsoft.Json that can affect how objects you pass to or get directly from the API are handled. Not that Cosmos has any flaws; in fact it is excellent. But with a document API you might (like me) start simply passing DataContract objects into Cosmos (which is not wrong, and in fact very much expected from the object API), and there are some serializer and naming-strategy options that you are better off at least being aware of up front.
So, just a note to make you aware of this behavior with an object interface. The discussion is here on GitHub:
https://github.com/JamesNK/Newtonsoft.Json/issues/815
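As a small illustration of the behavior described above (the class and property names here are made up), note how Json.NET matches property names case-insensitively when deserializing, and how a naming strategy changes what actually gets stored:

using System.Runtime.Serialization;
using Newtonsoft.Json;
using Newtonsoft.Json.Serialization;

[DataContract]
class Order
{
    [DataMember] public string TargetId { get; set; }
}

class Demo
{
    static void Main()
    {
        // Case-insensitive match: the lower-case field "targetid" still binds
        // to the TargetId property, which can mask naming mismatches until you
        // query Cosmos by the raw field name.
        var order = JsonConvert.DeserializeObject<Order>("{\"targetid\": \"t-1\"}");

        // Controlling what is written: a camel-case naming strategy changes
        // the stored field names, which matters for raw SQL queries.
        var settings = new JsonSerializerSettings
        {
            ContractResolver = new CamelCasePropertyNamesContractResolver()
        };
        var json = JsonConvert.SerializeObject(order, settings); // {"targetId":"t-1"}
    }
}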

How can we create an Azure Data Factory pipeline with Cosmos DB (Graph API) as the data sink?

How can we create an Azure Data Factory pipeline with Cosmos DB (Graph API) as the data sink? (The data source is also Cosmos DB, using the DocumentDB API.)
One option available to you is to simply continue using the Document API against the graph-enabled Cosmos DB sink. If you transform your documents into GraphSON format and write them into the destination as regular documents, they will automatically be usable as vertices and edges in future graph traversals (see the sketch below).
The ability to use both the DocumentSQL and Gremlin APIs against the same collection is one of the most exciting and powerful features of Cosmos DB, in my opinion (and the team plans to support more APIs interacting with the same dataset in the future).
Not only is this possible, but I've personally observed significant throughput improvements when importing large datasets into a graph-enabled Cosmos collection using the Document APIs instead of Gremlin. I plan to publish a blog post describing this process in more detail in the near future.
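To make the GraphSON approach above concrete, here is an illustrative sketch of what such documents might look like when written through the Document API. The exact internal layout Cosmos DB uses for vertices and edges is not officially documented, so treat these shapes as an assumption to verify against documents that Gremlin itself creates in your account.

using System;
using System.Threading.Tasks;
using Microsoft.Azure.Documents.Client;

class GraphSonImport
{
    static async Task Run(DocumentClient client)
    {
        var collectionUri = UriFactory.CreateDocumentCollectionUri("graphdb", "graphcoll");

        // A vertex: properties appear as arrays of { id, _value } objects.
        var vertex = new
        {
            id = "thomas",
            label = "person",
            firstName = new[] { new { id = Guid.NewGuid().ToString(), _value = "Thomas" } }
        };
        await client.CreateDocumentAsync(collectionUri, vertex);

        // An edge: flagged with _isEdge and pointing from a source vertex to a sink.
        var edge = new
        {
            id = Guid.NewGuid().ToString(),
            label = "knows",
            _isEdge = true,
            _vertexId = "thomas",
            _vertexLabel = "person",
            _sink = "mary",
            _sinkLabel = "person"
        };
        await client.CreateDocumentAsync(collectionUri, edge);
    }
}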
Cosmos DB Graph API is not supported as a sink yet; we will add it to our product backlog.

Backup in DocumentDB

I need to take a snapshot of my current collection in DocumentDB and then restore it at any later point in time. One available option is the Azure migration tool. Is it possible to do this by calling an API from my application built in Node.js?
Yes, you can use the Read Document Feed API to scan the documents within a collection (https://learn.microsoft.com/en-us/rest/api/documentdb/list-documents). The API supports "change feed" for incremental retrieval of documents from the collection. The method in Node.js is readDocuments.
Aside from this, the DocumentDB team has announced that self-service backup will be available, which will have its own API. You can find the status of the feature at https://feedback.azure.com/forums/263030-documentdb/suggestions/6331712-backup-solution-for-documentdb
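If you also work from .NET, here is a minimal sketch of scanning a collection's document feed for a snapshot with the Core/SQL SDK; the Node.js equivalent is readDocuments, as noted above, and the database and collection names below are placeholders.

using System.Threading.Tasks;
using Microsoft.Azure.Documents.Client;

class SnapshotScan
{
    static async Task Run(DocumentClient client)
    {
        var collectionUri = UriFactory.CreateDocumentCollectionUri("mydb", "mycollection");
        string continuation = null;

        do
        {
            // Page through the document feed, carrying the continuation token.
            var feed = await client.ReadDocumentFeedAsync(
                collectionUri,
                new FeedOptions { MaxItemCount = 100, RequestContinuation = continuation });

            foreach (var doc in feed)
            {
                // Persist each document to your snapshot store here.
            }

            continuation = feed.ResponseContinuation;
        } while (continuation != null);
    }
}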

Azure Search default database type

I am new to Azure Search and I have just seen this tutorial https://azure.microsoft.com/en-us/documentation/articles/search-howto-dotnet-sdk/ on how to create/delete an index and upload and search for documents. However, I am wondering what type of database is behind the Azure Search functionality. In the given example I couldn't see it specified. Am I right to assume it is implicitly DocumentDB?
At the same time, how could I specify the type of another database in the code? How could I possibly use a SQL Server database? Thank you!
However, I am wondering what type of database is behind the Azure Search functionality.
Azure Search is offered to you as a service. The team hasn't made the underlying storage mechanism public, so it's not possible to know what kind of database they use to store the data. You interact with the service in the form of JSON records: each document in your index is sent and retrieved (and presumably stored) as JSON.
At the same time, how could I specify the type of another database in the code? How could I possibly use a SQL Server database?
Short answer: you can't. Because it is a service, you can't tell it which database to use under the hood. However, what you can do is ask the search service to populate its database (read: index) from multiple sources - SQL databases, DocumentDB collections, and Blob containers (currently in preview). This is achieved through something called Data Sources and Indexers. Once configured properly, the Azure Search service will constantly update the index with the latest data from the specified data source.
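As a hedged sketch of the Data Sources and Indexers setup described above, using the classic Microsoft.Azure.Search .NET SDK from the linked tutorial's era; the service name, admin key, connection string, and index name are all placeholders, and the target index is assumed to exist already.

using Microsoft.Azure.Search;
using Microsoft.Azure.Search.Models;

class SqlIndexerSetup
{
    static void Main()
    {
        var serviceClient = new SearchServiceClient(
            "your-search-service", new SearchCredentials("<admin-api-key>"));

        // Point Azure Search at a SQL Server table as a data source...
        var dataSource = DataSource.AzureSql(
            name: "sql-datasource",
            sqlConnectionString: "<sql-connection-string>",
            tableOrViewName: "MyTable");
        serviceClient.DataSources.CreateOrUpdate(dataSource);

        // ...and create an indexer that pulls from it into an existing index.
        var indexer = new Indexer(
            name: "sql-indexer",
            dataSourceName: "sql-datasource",
            targetIndexName: "my-index");
        serviceClient.Indexers.CreateOrUpdate(indexer);

        // Kick off a first run; thereafter it can run on a schedule.
        serviceClient.Indexers.Run("sql-indexer");
    }
}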
