Back up in documentdb - node.js

I need to take a snapshot of my current collection and then restore it at any point in time later in DocumentDB. The available option I have found is the Azure migration tool. Is it possible to do this by calling an API from my application built in Node?

Yes, you can use the Read Document Feed API (https://learn.microsoft.com/en-us/rest/api/documentdb/list-documents) to scan the documents within a collection. The API supports "change feed" for incremental retrieval of documents from the collection. In the Node.js SDK, the corresponding method is readDocuments.
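If it helps, here is a minimal sketch using the documentdb Node.js SDK; the account endpoint, key and collection link are placeholders you would replace with your own values:

```js
// A minimal sketch with the 'documentdb' Node.js SDK. The endpoint, key and
// collection link below are placeholders, not values from the question.
const DocumentClient = require('documentdb').DocumentClient;

const client = new DocumentClient('https://<your-account>.documents.azure.com:443/', {
  masterKey: '<your-master-key>'
});

const collectionLink = 'dbs/<your-db>/colls/<your-collection>';

// readDocuments returns a QueryIterator; toArray pages through the whole
// collection, which you can then persist (file, blob, etc.) as a snapshot.
client.readDocuments(collectionLink, { maxItemCount: 100 }).toArray((err, docs) => {
  if (err) throw err;
  console.log('Read ' + docs.length + ' documents');
  // write `docs` wherever you keep your backups
});
```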
Aside from this, the DocumentDB team has announced that self-service backup will be available, which will have its own API. You can find the status of the feature at https://feedback.azure.com/forums/263030-documentdb/suggestions/6331712-backup-solution-for-documentdb

Related

Partitionkey is ignored in CosmosDB

I have a flow where I send a JSON document to the Service Bus, and a Function listens to the topic and creates a document in my Cosmos DB.
The Cosmos DB collection has the partition key "targetid".
When I provide the document from the Function, the document is inserted and I can pull it again from C# using CreateDocumentQuery, but I can't see the document in the portal, and no logical partitions have been created based on the value in the targetid property.
If I create a document directly from the portal and pull it with CreateDocumentQuery in my application, it also has a completely different format than the documents created by the application itself through Service Bus and Functions.
Cosmos DB Change Feed (what the Cosmos DB Trigger reads) is a feature of Cosmos DB surfaced on the Core/SQL API and, at this point, is not available on Mongo DB API accounts.
You can verify the compatibility matrix on the official documentation.
As a side note, the fact that you are also using CreateDocumentQuery means that you are using the Core/SQL SDK. It would make sense for you to use a Core/SQL API account instead if you are not going to use the Mongo DB drivers or clients.

How to implement REST API using NodeJs for group/multiple/batch insert in table of azure storage?

I went through the documentation in the REST Batch Save reference, but it doesn't give any example or steps the way the insert/update/delete/fetch documentation does, so I'm not able to understand how to design the REST API for batch save. I'm done with insert, update, delete and fetch of an entity in an Azure Table storage table, but got stuck on batch/group/multiple insert.
Need help!!!
Here is an example implementation in the Microsoft Azure Storage Node.js client library; please look into:
tablebatch.js
batchresult.js
tableservice.js executeBatch function
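Based on those pieces, a minimal sketch of a batch insert with the azure-storage package could look like this; the table name, keys and entity fields are placeholders:

```js
// A minimal sketch of a batch insert with the 'azure-storage' package. The
// table name, keys and entity fields are placeholders.
const azure = require('azure-storage');

// Picks up AZURE_STORAGE_CONNECTION_STRING from the environment.
const tableSvc = azure.createTableService();
const entGen = azure.TableUtilities.entityGenerator;

const batch = new azure.TableBatch();

// All entities in one batch must share the same PartitionKey,
// and a single batch is limited to 100 operations.
for (let i = 0; i < 3; i++) {
  batch.insertEntity({
    PartitionKey: entGen.String('demo'),
    RowKey: entGen.String(String(i)),
    value: entGen.String('item-' + i)
  }, { echoContent: true });
}

tableSvc.executeBatch('mytable', batch, (error, result) => {
  if (error) throw error;
  console.log('Inserted ' + result.length + ' entities');
});
```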
Would you also share your concern about directly using the azure-storage Node.js package?
Best Wishes

Microsoft Cosmos DB (DocumentDB API) vs. Cosmos DB (Table API)

Microsoft Cosmos DB includes the DocumentDB API, the Table API and others. I have about ~10 TB of data and would like fast key-value lookups (very little updating and writing, mostly reading). Here is a link for Microsoft Cosmos DB:
https://learn.microsoft.com/en-us/azure/cosmos-db/
So how should I choose between DocumentDB API and Table API?
Or when should I choose DocumentDB API? When should I choose Table API?
Is it a good practice to use DocumentDB API to store 10 TB of data?
The Azure Cosmos DB Table API was introduced to make Cosmos DB and its advanced indexing, geo-distribution, etc. features available to the Azure Table storage community. The idea is that someone using Azure Table storage who needs more advanced features only offered by Cosmos DB can literally just change their connection string and their existing code will work with Cosmos DB.
But if you are a greenfield customer then I would recommend using the SQL API (formerly called DocumentDB API), which is a superset of the Table API. We are constantly investing in providing more advanced features and capabilities to the SQL API, whereas for the Table API we are just looking to maintain compatibility with Azure Table storage's API, which hasn't changed in many years.
How much data you have doesn't have any effect on which API you choose. They both have the same multi-model infrastructure and can handle the same sizes of data, query loads, distribution, etc.
So how should I choose between DocumentDB API and Table API?
Choosing between DocumentDB API and Table API will primarily depend on the kind of data you're going to store. DocumentDB API provides a schema-less JSON database engine with SQL querying capabilities, whereas Table API provides a key-value storage service. Since you mentioned that your data is key-value based, the recommendation is to use Table API.
Or when should I choose DocumentDB API? When should I choose Table API?
Same as above.
Is it a good practice to use DocumentDB API to store 10 TB of data?
Both DocumentDB API and Table API are designed to store huge amounts of data.
However, you may want to look into Azure Table storage as well. Cosmos DB lets you fine-tune the throughput you need and offers robust indexing/querying support, but that comes at a price. Azure Tables, on the other hand, comes with fixed throughput and limited indexing/querying support, and is extremely cheap compared to Cosmos DB.
You may find this link helpful to explore more about Cosmos DB: https://learn.microsoft.com/en-us/azure/cosmos-db/introduction.
Please don't flag this as off-topic.
It might help for you to know in advance: if you are considering the document interface, there is in fact a case insensitivity that can affect how DataContract classes (and, I believe, all others) are transformed to and from Cosmos.
In the linked discussion below, you will see that there is a case insensitivity in Newtonsoft.Json that can affect your handling of objects that you pass to or get directly from the API. Not that Cosmos has ANY flaws; in fact it is totally excellent. But with a document API you might (like me) start simply passing DataContract objects into Cosmos (which is obviously not wrong, and in fact very much expected from the object API), yet there are some serializer and naming-strategy handler options that you are probably better off at least being aware of up front.
So this is just a note to make you aware of this behavior with an object interface. The discussion is here on GitHub:
https://github.com/JamesNK/Newtonsoft.Json/issues/815

How can I verify data uploaded to CosmosDB?

I have a dataset of 442k JSON documents in a single ~2.13 GB file in Azure Data Lake Store.
I've uploaded it to a collection in Cosmos DB via an Azure Data Factory pipeline. The pipeline completed successfully.
But when I went to Cosmos DB in the Azure Portal, I noticed that the collection size is only 1.5 GB. I've tried to run SELECT COUNT(c.id) FROM c for this collection, but it returns only 19k. I've also seen complaints that this count function is not reliable.
If I open collection preview, first ~10 records match my expectations (ids and content are the same as in ADLS file).
Is there a way to quickly get real record count? Or some other way to be sure that nothing is lost during import?
According to this article, you will find:
When using the Azure portal's Query Explorer, note that aggregation queries may return the partially aggregated results over a query page. The SDKs produces a single cumulative value across all pages.
In order to perform aggregation queries using code, you need .NET SDK 1.12.0, .NET Core SDK 1.1.0, or Java SDK 1.9.5 or above.
So I suggest you first try to use the Azure DocumentDB SDK to get the count value.
For more details about how to use it, you could refer to this article.
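As a rough sketch, a cross-partition count with the documentdb Node.js SDK could look like the following; the credentials and collection link are placeholders, and aggregate support assumes a reasonably recent SDK version:

```js
// A rough sketch with the 'documentdb' Node.js SDK; credentials and the
// collection link are placeholders, and cross-partition aggregates assume a
// reasonably recent SDK version.
const DocumentClient = require('documentdb').DocumentClient;

const client = new DocumentClient('https://<your-account>.documents.azure.com:443/', {
  masterKey: '<your-master-key>'
});

const collectionLink = 'dbs/<your-db>/colls/<your-collection>';

// Unlike the portal's Query Explorer, the SDK accumulates the partial counts
// from every result page into a single value.
client.queryDocuments(collectionLink, 'SELECT VALUE COUNT(1) FROM c', {
  enableCrossPartitionQuery: true
}).toArray((err, result) => {
  if (err) throw err;
  console.log('Document count: ' + result[0]);
});
```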

Change Tracking in an Azure SQL DB for Azure Search for non-dbo Schema

I am associating an Azure SQL DB Table to my Azure Search using an Indexer. I am setting this all up using Azure's website: https://portal.azure.com
When I try and create the Indexer in Azure Search, I get the warning about "Consider enabling integrated change tracking on your database." However, I have enabled integrated change tracking on my database and table.
I have successfully set up several tables this way, in the same database, and they're working just fine with Azure Search. However, this table has a schema other than [dbo], and the others with change tracking were [dbo]. The same SQL user is being used for all the tables, and it has been granted the change tracking permission on this table, too.
Is there a problem with the Azure website where I cannot do this via the UI? Can this be done otherwise? Is there a permission issue with my DB's schema? Something else?
Because of this warning, I have not actually created this Azure Search Index.
Any help is appreciated!
It's a limitation of the Azure Search portal: it doesn't support enabling integrated change tracking for non-default schemas. The workaround is to create the indexer programmatically, using the REST API or the .NET SDK. For a walkthrough, see https://learn.microsoft.com/azure/search/search-howto-connecting-azure-sql-database-to-azure-search-using-indexers.
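As a rough sketch, the data-source half of that REST call from Node.js could look like the following; the service name, admin key, api-version and connection string are placeholders, and the indexer itself is created with a similar POST to /indexers:

```js
// A rough sketch of the data-source call with Node's built-in https module.
// Service name, admin key, api-version and connection string are placeholders;
// the indexer itself is created with a similar POST to /indexers.
const https = require('https');

const dataSource = {
  name: 'mytable-datasource',
  type: 'azuresql',
  credentials: { connectionString: '<your-sql-connection-string>' },
  // Schema-qualified table name for a non-dbo schema
  container: { name: '[myschema].[MyTable]' },
  // Integrated change tracking must already be enabled on the DB and table
  dataChangeDetectionPolicy: {
    '@odata.type': '#Microsoft.Azure.Search.SqlIntegratedChangeTrackingPolicy'
  }
};

const body = JSON.stringify(dataSource);
const req = https.request({
  hostname: '<your-service>.search.windows.net',
  path: '/datasources?api-version=2017-11-11',
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    'api-key': '<your-admin-api-key>',
    'Content-Length': Buffer.byteLength(body)
  }
}, (res) => console.log('Status: ' + res.statusCode));

req.on('error', (err) => console.error(err));
req.write(body);
req.end();
```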
