In one of my projects, we created some Amazon Rekognition collections in one region (us-east-1). Now we need to move these collections to another region (eu-west-1). I went through the AWS docs but couldn't find any API or action that can be used to achieve this. Is it possible to migrate collections from one region to another? If yes, then how?
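As far as I know, Rekognition does not expose an API to export a face collection or copy it across regions, so the usual workaround is to create a new collection in the target region and re-run IndexFaces against the original source images (Rekognition only reads images from S3 buckets in its own region, so the images need to be available in eu-west-1). A minimal sketch with the AWS SDK for .NET, where the collection ID, bucket and image keys are placeholders:

```csharp
using Amazon;
using Amazon.Rekognition;
using Amazon.Rekognition.Model;

// Rekognition client pointed at the target region (eu-west-1).
var target = new AmazonRekognitionClient(RegionEndpoint.EUWest1);

// Recreate the collection in the new region.
await target.CreateCollectionAsync(new CreateCollectionRequest
{
    CollectionId = "my-collection"
});

// Re-index every original image. The generated face IDs will differ from the
// old collection, so keep ExternalImageId stable to correlate records.
string[] imageKeys = { "faces/user-1.jpg", "faces/user-2.jpg" }; // placeholder keys
foreach (var key in imageKeys)
{
    await target.IndexFacesAsync(new IndexFacesRequest
    {
        CollectionId = "my-collection",
        Image = new Image
        {
            S3Object = new S3Object { Bucket = "my-face-images-eu", Name = key }
        },
        ExternalImageId = key.Replace("/", "-") // ExternalImageId does not allow '/'
    });
}
```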
I am getting ready to create a brand-new mobile application that communicates with Cosmos DB, and I will probably go the serverless way. Serverless has some disadvantages compared to provisioned throughput (e.g. only 50 GB per container, no geo-redundancy, no multi-region writes, etc.).
If I later need to convert my database to a provisioned-throughput one, can I do it somehow?
I know that I can probably use the change feed and (I guess) recreate a new database from it (a provisioned-throughput one), but this might open Pandora's box, especially since a mobile app connects to a specific database.
As Gaurav mentioned, there is no way to change from a serverless to a provisioned-throughput plan once you create an account.
You will need to recreate the account with the desired throughput type and use one of the following ways to migrate the data:
(i) Data Migration Tool - you can easily migrate from one account to another.
(ii) Change feed and restore - push the changes to the new instance of Azure Cosmos DB (see the sketch below).
Once you are synced, switch over to the new account.
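To make option (ii) concrete, here is a minimal sketch using the .NET SDK's Change Feed Processor. The account connection strings, database/container names and the lease container are placeholders, and you would still have to handle deletes and the final cut-over separately:

```csharp
using System;
using Microsoft.Azure.Cosmos;
using Newtonsoft.Json.Linq;

// Source = existing serverless account, target = new provisioned-throughput account.
var sourceClient = new CosmosClient("<serverless-account-connection-string>");
var targetClient = new CosmosClient("<provisioned-account-connection-string>");

Container source = sourceClient.GetContainer("appdb", "items");
Container target = targetClient.GetContainer("appdb", "items");
Container leases = sourceClient.GetContainer("appdb", "leases"); // lease container for the processor

ChangeFeedProcessor processor = source
    .GetChangeFeedProcessorBuilder<JObject>("migration", async (changes, cancellationToken) =>
    {
        // Copy every changed document into the new account.
        foreach (JObject item in changes)
        {
            await target.UpsertItemAsync(item, cancellationToken: cancellationToken);
        }
    })
    .WithInstanceName("migration-worker-1")
    .WithStartTime(DateTime.MinValue.ToUniversalTime()) // replay the container from the beginning
    .WithLeaseContainer(leases)
    .Build();

await processor.StartAsync();
```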
Based on the documentation available here: https://learn.microsoft.com/en-us/azure/cosmos-db/serverless#using-serverless-resources, it is currently not possible to change a Cosmos DB serverless account to provisioned throughput.
I wonder if it's a good idea to use Azure Cosmos DB containers to manage an entity's status. For example, an employee's reimbursement can have different statuses like submitted, approved, paid, etc. Do you see any problem with creating a separate container for "Submitted", "Approved", etc.? They would contain similar reimbursement objects but with slightly different data points depending on their status. For example, the Submitted container could have the manager's name as the approver, and the Paid container could have the payment method.
In other words, it's like a persistent queue: a reimbursement is moved out of one container and into the next as it progresses through the workflow.
Are there any concerns with this approach? Does the "provisioned throughput" pricing model charge by the container, meaning the more containers you have, the more expensive it gets? Or is it at the database level, so that I can have as many containers as I want and only the queries are charged?
Sorry for the newbie questions, learning about Cosmos. Thanks for any advice!
It's a two-part question :).
The first part (single container vs. multiple containers) is basically an opinion-based question. I would have opted for a single-container approach, as it would give me just one place to look for the current status of an item. But that's just my opinion :).
Regarding your question about pricing model, Azure Cosmos DB offers you both.
You can provision throughput at the container level as well as on the database level. When you provision throughput at the database level, all containers (up to 25) in that database will share the throughput of the database.
You can even mix and match the approaches, i.e. you can have throughput provisioned at the database level and then have some containers share the throughput of the database while other containers have dedicated throughput.
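A minimal sketch of that mix-and-match setup with the .NET SDK (the account credentials, database, container names and partition keys below are placeholders): the database gets 400 RU/s shared by its containers, one container shares it, and another gets its own dedicated 400 RU/s.

```csharp
using Microsoft.Azure.Cosmos;

var client = new CosmosClient("<account-endpoint>", "<account-key>");

// Database-level (shared) throughput: containers created without their own
// throughput share these 400 RU/s.
Database database = await client.CreateDatabaseIfNotExistsAsync("expenses", throughput: 400);

// This container shares the database-level throughput.
Container shared = await database.CreateContainerIfNotExistsAsync(
    new ContainerProperties("status-history", "/reimbursementId"));

// This container gets its own dedicated 400 RU/s in addition to the database throughput.
Container dedicated = await database.CreateContainerIfNotExistsAsync(
    new ContainerProperties("reimbursements", "/employeeId"),
    throughput: 400);
```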
Please note that once the throughput type (fixed, shared, or auto-scale) is configured at the database/container level, it can't be changed. You will have to delete the resource and create a new one with the desired throughput type.
You can learn more about throughput here: https://learn.microsoft.com/en-us/azure/cosmos-db/set-throughput.
I have managed to get the C# and database setup working using ListMappings. However, when I try to deploy the split/merge tool to an Azure classic cloud service, it states 'The requested VM tier is currently not available in East US for this subscription. Please try another tier or deploy to a different location.' We tried a few other regions with the same result. Do you know if there is a workaround or an updated version? Is the split/merge service even still relevant? Has anyone got this service to run on Azure lately?
https://learn.microsoft.com/en-us/azure/azure-sql/database/elastic-scale-overview-split-and-merge
The answer to the question of whether it is still relevant is, in my opinion, no. Split/merge is no longer relevant with the maturation of elastic pools. Elastic pools with one database per tenant seem to be the sustainable way to implement multi-tenancy with legacy code. The initial plan was to add keys to each of our tables to have multiple tenants per database. Elastic pools give us the same flexibility without having to make breaking changes to our existing code.
Late post here, but we are implementing ElasticScale for a client to split ~50 clients into a database-per-tenant model. I don't think the SplitMerge tool will be used over the long term, just for the initial data migration from one database to many shards, but it has been handy for that purpose. We are using the ElasticScale SDK to allow a single API to route queries to the appropriate shard(s) based on the sharding key (see the sketch below). Happy to compare notes with you if you are still working on this.
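For reference, the routing piece looks roughly like this with the Elastic Database Client library. The shard map name, key value and connection strings are illustrative, and since the library only supports a limited set of key types (int, long, Guid, byte[], etc.), a string tenant code is typically mapped to a numeric key:

```csharp
using System.Data.SqlClient;
using Microsoft.Azure.SqlDatabase.ElasticScale.ShardManagement;

// The shard map manager lives in a small routing database.
ShardMapManager shardMapManager = ShardMapManagerFactory.GetSqlShardMapManager(
    "<shard-map-manager-connection-string>",
    ShardMapManagerLoadPolicy.Lazy);

// One point mapping per tenant id -> shard database.
ListShardMap<int> shardMap = shardMapManager.GetListShardMap<int>("TenantShardMap");

int tenantId = 42; // illustrative sharding key

// Data-dependent routing: open a connection to whichever shard holds this tenant.
using (SqlConnection connection = shardMap.OpenConnectionForKey(
    tenantId,
    "<shard-user-credentials-connection-string>",
    ConnectionOptions.Validate))
using (var command = connection.CreateCommand())
{
    command.CommandText = "SELECT COUNT(*) FROM dbo.Orders WHERE TenantId = @tenantId";
    command.Parameters.AddWithValue("@tenantId", tenantId);
    var orderCount = (int)command.ExecuteScalar();
}
```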
I have a single Azure SQL Server and a single database in it. I want a solution to store specific records of selected tables in this database in different regions.
As an example, I have a users table with PII data in it. These users can be from anywhere in the world, but I want the records of users who are from the EU to be stored only in the EU region.
In addition, I want all the other table records related to a specific user to be stored in that user's region as well.
From the application's perspective, I should be able to query across all users and all related tables to build dashboard data for the global user base.
Any pointers for solving this scenario would be helpful.
Another approach could be sharding the database. Use horizontal sharding to store the rows for each country/region in a separate database in that country/region. The Elastic Database Client library will use a shard map to do most of the sharding work for you (assuming you are using .NET). You can use the country code in your shard map to split regional data (see the sketch after the links below).
Reference Architecture: https://learn.microsoft.com/en-us/azure/architecture/patterns/sharding
Elastic Database Client: https://learn.microsoft.com/en-us/azure/sql-database/sql-database-elastic-database-client-library
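A rough sketch of what setting up such a shard map could look like with the Elastic Database Client. The library only supports a limited set of key types, so the country/region is represented here by a numeric id; server, database and map names are placeholders:

```csharp
using Microsoft.Azure.SqlDatabase.ElasticScale.ShardManagement;

// Create the shard map manager in a dedicated routing database.
ShardMapManager manager = ShardMapManagerFactory.CreateSqlShardMapManager(
    "<shard-map-manager-connection-string>");

// A list shard map keyed by a numeric country/region id (e.g. 1 = EU, 2 = US).
ListShardMap<int> countryShardMap = manager.CreateListShardMap<int>("CountryShardMap");

// Register one shard database per region.
Shard euShard = countryShardMap.CreateShard(
    new ShardLocation("myserver-eu.database.windows.net", "users-eu"));
Shard usShard = countryShardMap.CreateShard(
    new ShardLocation("myserver-us.database.windows.net", "users-us"));

// Map each country/region id to the shard that should hold its rows.
countryShardMap.CreatePointMapping(1, euShard); // EU users -> EU database
countryShardMap.CreatePointMapping(2, usShard); // US users -> US database
```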
Here is one approach... When your user/tenant registers for your service, they will need to pick where their data should reside. This is referred to as data residency. Then, on subsequent requests to read or write data, your application's repository layer needs to be aware of who the request is executing as, so it can look up the appropriate connection string and connect to that database to retrieve/write the data.
The routing data can be replicated to multiple regions and/or housed in a single location, as it would not contain PII. The Azure Web App can be hosted in a single region, or it can be replicated to multiple regions with traffic routed to it via a global traffic manager.
This approach supports the case where a European user picks to have their data reside in France but happens to be visiting the United States.
Barry Luijbregts has a nice Pluralsight course that delves into this approach: https://www.pluralsight.com/courses/azure-paas-building-global-app
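A minimal sketch of what that repository-layer lookup might look like; the residency store, region names and connection strings here are all illustrative (in practice the tenant-to-region mapping would live in a replicated routing store rather than in code):

```csharp
using System.Collections.Generic;
using System.Data.SqlClient;

// Non-PII routing data: tenant -> home region (replicated or centrally hosted).
var tenantHomeRegion = new Dictionary<string, string>
{
    ["tenant-123"] = "france-central",
    ["tenant-456"] = "east-us"
};

// One connection string per regional database.
var regionConnectionStrings = new Dictionary<string, string>
{
    ["france-central"] = "<france-central-sql-connection-string>",
    ["east-us"] = "<east-us-sql-connection-string>"
};

// Repository layer: resolve the caller's tenant, then connect to that tenant's home region.
SqlConnection OpenConnectionForTenant(string tenantId)
{
    string region = tenantHomeRegion[tenantId];
    var connection = new SqlConnection(regionConnectionStrings[region]);
    connection.Open();
    return connection;
}

using (var connection = OpenConnectionForTenant("tenant-123"))
{
    // Read/write this tenant's PII against the database in their home region.
}
```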
Good luck!
We are planning to use a Cosmos DB single-master deployment where all master data is maintained from a single region. The application is spread across various regions and we need to provide read access to the individual regions. However, we would like to have filtered replication, as not all regions will be interested in all the data in Cosmos DB. Is there any way to use selective, region-specific replication? I am aware that we could use a Cosmos DB trigger and a Function App to replicate traffic, but that is an overhead in terms of maintenance and monitoring. Hence, I would be interested to know if we can make use of any native functionality.
The built-in geo-replication mechanism is completely transparent to you. You can't see it and you can't do anything about it. There is no way to do what you described without writing something custom.
If you really want to have selected data replicated, then you would need to do the following (it's a terrible solution and you should NOT go with it):
Create a main source of truth Cosmos DB account. That's "single master" that you described.
Create a few other accounts in whichever region you want.
Use an Azure Function with a Cosmos DB trigger, or the Change Feed Processor library, to listen to changes on the main account and then use your filtering logic to replicate them into the other accounts that need them (see the sketch after this list).
Use a different connection string per application based on its deployment environment.
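A rough sketch of that filtering function, using the Azure Functions Cosmos DB trigger (v3-style bindings). The database, container and connection-setting names, and the `region` property used for filtering, are assumptions for illustration:

```csharp
using System.Collections.Generic;
using System.Threading.Tasks;
using Microsoft.Azure.Cosmos;
using Microsoft.Azure.Documents;
using Microsoft.Azure.WebJobs;
using Microsoft.Extensions.Logging;
using Newtonsoft.Json.Linq;

public static class FilteredReplication
{
    // Client and container for the secondary (region-specific) Cosmos DB account.
    private static readonly CosmosClient EuClient = new CosmosClient("<eu-account-connection-string>");
    private static readonly Container EuContainer = EuClient.GetContainer("appdb", "items");

    [FunctionName("ReplicateToEu")]
    public static async Task Run(
        [CosmosDBTrigger(
            databaseName: "appdb",
            collectionName: "items",
            ConnectionStringSetting = "MainCosmosConnection",
            LeaseCollectionName = "leases",
            CreateLeaseCollectionIfNotExists = true)] IReadOnlyList<Document> changes,
        ILogger log)
    {
        foreach (Document doc in changes)
        {
            // Filtering logic: only copy documents tagged for this region.
            if (doc.GetPropertyValue<string>("region") != "eu")
            {
                continue;
            }

            await EuContainer.UpsertItemAsync(JObject.Parse(doc.ToString()));
            log.LogInformation("Replicated document {Id} to the EU account.", doc.Id);
        }
    }
}
```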
What's wrong with just having your data replicated across all regions though? There are no drawbacks.