Use CosmosDB Database Triggers with Azure Data Factory Copy Activity

I have an ADF copy activity copying rows of data from Azure SQL to Azure Cosmos DB.
I need to manipulate the generated documents, so I wrote the logic inside a pre-create database trigger that should execute whenever a new document is created.
The trigger is not getting executed.
I was not able to understand what the problem is, and I couldn't find any documentation either. The Cosmos DB client APIs for creating a document require the trigger to be specified explicitly in order for it to execute. I'm not sure whether something similar can be done for the ADF copy activity as well. Please help.
I am trying to avoid writing a custom activity (so as to leverage built-in scaling and error handling capabilities).
This seems similar to Azure cosmos db trigger, but the answers are not applicable to this question.
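For reference, this is what "specified explicitly" looks like with the client SDK; a minimal sketch using the Python SDK, where the account endpoint, key, database, container, and trigger names are placeholders rather than anything from the original setup:

```python
# Sketch only: illustrates that Cosmos DB pre-triggers run only when the caller
# opts in per request. Endpoint, key, "mydb", "orders", and "preCreateShape"
# are placeholder names, not from the original question.
from azure.cosmos import CosmosClient

client = CosmosClient("https://<account>.documents.azure.com:443/", credential="<key>")
container = client.get_database_client("mydb").get_container_client("orders")

doc = {"id": "1", "partitionKey": "p1", "value": 42}

# Without pre_trigger_include the server-side trigger is NOT invoked,
# which is why a trigger alone cannot reshape documents written by ADF.
container.create_item(body=doc, pre_trigger_include="preCreateShape")
```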

Related

Best way to export data from one Cosmos DB account into another

What I want to do is similar to what we can easily do with Azure SQL Server databases, where we can click the copy functionality, which creates the same database on another SQL Server.
I don't see that functionality in the Azure Cosmos DB resource.
Looking at the Microsoft documentation, it seems to point to a Data migration tool,
but if we already have many containers/collections and millions of records, running this locally might be impractical.
Is there any other suggestion?
You can use an Azure Data Factory pipeline to move data from one Azure Cosmos DB container to another.
Here are the steps I followed to move data from one container to another using ADF.
I have two containers in Cosmos DB as shown below,
The Employee container has data,
Initially staff1 has no data,
Created an Azure Data Factory resource.
Created a linked service of type Azure Cosmos DB.
Created two datasets of type Azure Cosmos DB: one for the source and another for the sink.
Created a pipeline as shown below,
Selected the sink as shown below,
Ran the pipeline successfully, and the data is inserted into the target container.
Data is inserted into the target as shown below,
Reference link
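As an optional sanity check after the run, here is a small sketch that compares document counts in the two containers with the Cosmos DB Python SDK; the account endpoint, key, and database name are placeholders, while Employee and staff1 are the containers from the walkthrough:

```python
# Sketch: compare document counts in the source and sink containers after the
# pipeline run. Endpoint, key, and "mydb" are placeholders; "Employee" and
# "staff1" are the containers from the steps above.
from azure.cosmos import CosmosClient

client = CosmosClient("https://<account>.documents.azure.com:443/", credential="<key>")
db = client.get_database_client("mydb")

def count_items(container_name: str) -> int:
    container = db.get_container_client(container_name)
    result = container.query_items(
        query="SELECT VALUE COUNT(1) FROM c",
        enable_cross_partition_query=True,
    )
    return next(iter(result))

print("Employee:", count_items("Employee"))
print("staff1:  ", count_items("staff1"))
```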

Solution architecture for data transfer from SQL Server database to external API and back

I am looking for a proper solution architecture for a data transfer scenario from SQL Server to an external API and then from the API back to SQL Server. We are thinking of using Azure technologies.
We have a database hosted on an Azure VM. When the value of the author in the book table changes, we would like to get all the data for that book from the related tables and transfer it to an external API. The quantity of rows to be transferred (the select-join) is huge, so the select-join query takes a long time to execute. After this data is read, it is transformed and then sent to an external API (over which we have no control). The transfer of the data to the API could take up to an hour. After the data is written into this API, we read some reports from it and write these reports back into the original database.
We must repeat this process more than 50 times per day.
We are thinking of using a Logic App to detect the change in SQL Server (as it is hosted on an Azure VM), publish this event to Azure Event Grid, and then use Azure Durable Functions to handle the read-SQL-data, transform, and send-to-external-API steps.
Does this make sense? Does anybody have any better ideas?
Thanks in advance
At this moment, the Logic App SQL connector can't detect when a particular row changes; it will perform a select (which you'll provide) and then check for changes at every interval you specify.
In other words, SQL Database doesn't offer a change feed like Cosmos DB does, where you can subscribe to events and trigger an Azure Function.
Things you can do:
1 - Add a trigger on SQL Server (after insert/update) that inserts the new/changed row into a separate table; you can then use a Logic App / Azure Functions to query this table and retrieve the data.
2 - Migrate to Cosmos DB and use the change feed + Azure Functions.
3 - Change your code so that, after inserting into the SQL database, it also adds a message with the identifier of the row you're about to insert/update to a queue, which will be consumed by an Azure Function (see the sketch after this list).
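For option 3, a rough sketch of the "add a message to a queue" step using an Azure Storage queue; the connection string, queue name, and row identifier are assumptions for illustration:

```python
# Sketch for option 3: after writing the row to SQL, enqueue its identifier so a
# queue-triggered Azure Function can pick it up. The connection string, queue
# name, and row id are placeholders.
import json
from azure.storage.queue import QueueClient

queue = QueueClient.from_connection_string(
    conn_str="<storage-connection-string>",
    queue_name="changed-books",
)

def notify_row_changed(row_id: int) -> None:
    # The queue-triggered Azure Function reads this message, runs the
    # select-join for that row only, transforms it, and calls the external API.
    queue.send_message(json.dumps({"bookId": row_id}))

notify_row_changed(12345)
```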

Azure data factory - Continue on conflict on insert

We are building a data migration pipeline using Azure Data Factory (ADF). We are transferring data from one Cosmos DB instance to another. We plan to enable dual writes, so that we write to both databases before the migration begins, to ensure that if any data point changes during the migration, both databases get the most up-to-date data. However, ADF only offers Insert or Upsert options. Our case is: on Insert, if it gets a conflict, continue rather than fail the pipeline. Can anyone give any pointers on how to achieve that in ADF?
The other option would be to create our own custom tool using the Cosmos DB libraries to transfer the data.
If you are doing a live migration, ADF is not the right tool to use, as it is intended for offline migrations. If you are migrating from one Cosmos DB account to another, your best option is to use the Cosmos DB Live Data Migrator.
This tool also provides dead-letter support, which is another requirement you have.
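If you do end up writing the custom tool mentioned in the question, here is a hedged sketch of "continue on conflict" with the Cosmos DB Python SDK, catching the 409 and moving on; the endpoint, key, database, and container names are placeholders:

```python
# Sketch of the "custom tool" option: insert each document and skip 409
# conflicts instead of failing. Endpoint, key, database, and container names
# are placeholders.
from azure.cosmos import CosmosClient, exceptions

client = CosmosClient("https://<account>.documents.azure.com:443/", credential="<key>")
source = client.get_database_client("mydb").get_container_client("source")
target = client.get_database_client("mydb").get_container_client("target")

for doc in source.query_items(query="SELECT * FROM c",
                              enable_cross_partition_query=True):
    try:
        target.create_item(body=doc)  # insert only, no overwrite
    except exceptions.CosmosResourceExistsError:
        # 409: the document was already written by the dual-write path; skip it.
        pass
```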

Is there a way to create a CosmosDb Snapshot and restore it using a C# Console app?

I searched everywhere and I couldn't find a single article that shows a way to create a script/console app that creates a snapshot of a CosmosDb database and restores it -- is this even possible?
Cosmos DB doesn't have the ability to snapshot a database. You'd need to create this on your own.
While "how" you accomplish this is a bit off-topic, as it's very broad, there are two built-in Azure approaches:
Change Feed. Cosmos DB has a change feed you may subscribe to in order to consume content from a container in a streaming fashion. By consuming the change feed, you could effectively re-create a container's data in another container (see the sketch after this list). There are several write-ups around this very topic.
Data Factory. You can copy content between containers via an Azure Data Factory pipeline (Cosmos DB is available as both a source and a sink for pipelines).
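To make the change-feed idea concrete, here is a rough sketch using the Cosmos DB Python SDK that replays a container's change feed into a second container; the endpoint, key, and container names are placeholders, and a production version would persist the continuation token between runs:

```python
# Sketch of the change-feed approach: replay a container's change feed into a
# second container. Endpoint, key, and container names are placeholders; a real
# tool would store the continuation token between runs instead of replaying
# from the beginning every time.
from azure.cosmos import CosmosClient

client = CosmosClient("https://<account>.documents.azure.com:443/", credential="<key>")
db = client.get_database_client("mydb")
source = db.get_container_client("live")
snapshot = db.get_container_client("snapshot")

# Read changes from the beginning of the retained change feed and write each
# document into the snapshot container.
for doc in source.query_items_change_feed(is_start_from_beginning=True):
    snapshot.upsert_item(body=doc)
```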

Cosmos DB data migration

I wanted to implement my own backup mechanism for Cosmos DB. In order to do that, I wanted to just grab the data every x hours and put it onto some other storage account / a different Cosmos DB instance.
Since I can't use Data Factory (it is not available in my region), is there any other easy way to get data from Cosmos DB and put it somewhere else?
The first thing that comes to my mind is just some SQL queries that would go through all collections and copy them. Is there an easier way?
Since you can't use Data Factory (even though it may be the most suitable option for you), I suggest the two solutions below:
1. Azure Timer Trigger Function.
It supports CRON expressions, so you could query the data and copy it into the target collection via the Cosmos DB SDK (see the sketch after this list). However, please note that Azure Functions have an execution time limitation.
2. Azure Cosmos DB Data Migration Tool.
The tool can be executed from the command line, so you can package the commands into a .bat file and then use a Windows scheduled task to execute it. Alternatively, you could use an Azure WebJob to meet the same requirement.
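To illustrate option 1, here is a minimal sketch of a timer-triggered Azure Function in Python that copies documents with the Cosmos DB SDK; the schedule (e.g. every x hours) would live in function.json, and the endpoint, key, and container names are placeholders:

```python
# Sketch for option 1: a timer-triggered Azure Function that copies documents
# from one container to another with the Cosmos DB SDK. The schedule is
# configured in function.json; endpoint, key, and names are placeholders.
import azure.functions as func
from azure.cosmos import CosmosClient

ENDPOINT = "https://<account>.documents.azure.com:443/"
KEY = "<key>"

def main(mytimer: func.TimerRequest) -> None:
    client = CosmosClient(ENDPOINT, credential=KEY)
    db = client.get_database_client("mydb")
    source = db.get_container_client("source")
    backup = db.get_container_client("backup")

    # Copy everything; for large collections this must finish within the
    # function's execution time limit, as noted above.
    for doc in source.query_items(query="SELECT * FROM c",
                                  enable_cross_partition_query=True):
        backup.upsert_item(body=doc)
```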
