How to manage throughput of Azure Table Storage? (like AWS) - azure

AWS dynamo db has a throughput parameter you can set.
How does Azure Table Storage scale in that regard?

Windows Azure Table Storage does not provide a throughput parameter; instead, throughput targets are already set for Azure Table Storage, as described in this article.
Single Table Partition – a table partition is all of the entities in a table with the same partition key value, and most tables have many partitions. The throughput target for a single partition is:
Up to 500 entities per second
Note, this is for a single partition, not a single table. Therefore, a table with good partitioning can process up to a few thousand requests per second (up to the storage account target).
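As a back-of-envelope illustration of these targets, aggregate throughput scales with the number of partitions under load, up to the account-level cap. The account cap used here (20,000 entities per second) is an assumption for illustration; check the current scalability targets for your account type.

```python
# Rough estimate of aggregate Table Storage throughput using the
# per-partition target quoted above. The account-level cap is an
# assumption; verify against the current scalability targets.
PER_PARTITION_TARGET = 500   # entities/sec per partition (from the article)
ACCOUNT_TARGET = 20_000      # entities/sec per storage account (assumed)

def estimated_throughput(hot_partitions: int) -> int:
    """Aggregate target, assuming load is spread evenly across partitions."""
    return min(hot_partitions * PER_PARTITION_TARGET, ACCOUNT_TARGET)

print(estimated_throughput(1))    # 500
print(estimated_throughput(10))   # 5000
print(estimated_throughput(100))  # capped at 20000
```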

Related

How to look up logical partition count and size in Cosmos DB

In Cosmos DB, is it possible to see how many logical partitions my database has created and the current fill capacity of each partition? I'd like to know how evenly my partitioning strategy is dividing data. Also, I'd like to prepare for data re-distribution before I hit the 20GB limit on a hot partition.
The accepted answer doesn't give you size metrics by partition. You can find the partition breakdown by following these steps:
Go to the Overview page of your Cosmos DB instance in Azure Portal
In the Monitoring section select "Metrics (Classic)"
Click on the Storage tab
Select the database and container of interest
In the "Data + Index storage consumed by top partition keys" panel is a chart of data consumption by key.
Alternatively, use Azure Monitor for Cosmos DB and check out the metrics:
https://learn.microsoft.com/en-us/azure/cosmos-db/use-metrics

Unique list of PartitionKeys in Azure Table Storage Account

I would like to get a unique list of the PartitionKeys used in an existing Azure Table Storage account with a large amount of data. In this table there will be more than 20 unique partition keys, and I would like to get this unique list back. Is there any filter condition or API that supports this?
Unfortunately no. There's no API available in Azure Table Storage that will give you the list of partition keys back.
You will need to fetch all entities and extract the unique partition keys from the result set.
To optimize this process, you can do a few things:
Run your application in a VM in the same region as the storage account. That way you reduce latency and avoid data egress charges.
Use Query Projection to only return PartitionKey in the result as you're only interested in getting that information.
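A sketch of that approach in Python, assuming the azure-data-tables SDK's `list_entities(select=[...])` projection; the result set is simulated here so the deduplication logic runs offline:

```python
# Sketch: collect unique partition keys by scanning all entities with a
# projection. With the azure-data-tables SDK the scan would look like:
#   entities = table_client.list_entities(select=["PartitionKey"])
# Here we simulate the projected result set so the logic is runnable offline.

def unique_partition_keys(entities):
    """Extract the distinct PartitionKey values from a full-table scan."""
    return sorted({e["PartitionKey"] for e in entities})

# Simulated projected result set (what the service would return).
scanned = [
    {"PartitionKey": "device-01"},
    {"PartitionKey": "device-02"},
    {"PartitionKey": "device-01"},
    {"PartitionKey": "device-03"},
]
print(unique_partition_keys(scanned))  # ['device-01', 'device-02', 'device-03']
```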

Is Azure table storage data retrieval faster than Sql Azure

There is a requirement to store large XML data; each record (row) is nearly 1 MB in size. The question is which storage to use for the data: Azure Table Storage (a storage account) or SQL Azure.
So which storage will make data storage and retrieval faster?
When looking at sheer volume, Table Storage is today far more scalable than SQL Azure. Given that a storage account (storage accounts hold blobs, queues, and tables) is allowed to be 100TB in size, in theory your table could consume all 100TB. At first glance, a 100TB chunk of data may seem overwhelming. However, Table Storage can be partitioned. Each partition of Table Storage can be moved to a separate server by the Azure controller, thereby reducing the load on any single server. As demand lessens, the partitions can be reconsolidated. Reads of Azure Table Storage are load balanced across three replicas to help performance.
Entities in Table Storage are limited to 1 MB each, with no more than 255 properties (3 of which are required: partition key, row key, and timestamp).
Today, SQL Azure databases are limited to 1GB or 10GB. However, sometime this month (June 2010), a 50GB limit is supposed to be available. What happens if your database is larger than 10GB today (or 50GB tomorrow)? Options include repartitioning your database into multiple smaller databases or sharding (Microsoft’s generally recommended approach). Without getting into the database details of both of these database design patterns, both of these approaches are not without issue and complexity, some of which must be resolved at the application level.
It's hard to say that Azure Table Storage data retrieval must be faster than SQL Azure; it depends on your data structure and size.
As you said, each record (row) of your XML data is nearly 1 MB; as long as it doesn't exceed the 1 MB entity limit, you can consider Table Storage first.
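One practical caveat worth verifying against the current service limits: while an entity can be up to 1 MB, a single string or binary property is limited to 64 KB, so a ~1 MB XML document would need to be split across several properties. A minimal sketch of that chunking, with hypothetical property names Data0, Data1, …:

```python
# Sketch: a ~1 MB XML document exceeds the 64 KB per-property limit, so one
# option (an assumption, not the only design) is to split the payload across
# multiple binary properties while staying within the 1 MB entity limit.
MAX_PROP_BYTES = 64 * 1024  # 64 KB per binary property

def split_payload(xml_bytes: bytes) -> dict:
    """Chunk a payload into entity properties: Data0, Data1, ..."""
    return {
        f"Data{i}": xml_bytes[off:off + MAX_PROP_BYTES]
        for i, off in enumerate(range(0, len(xml_bytes), MAX_PROP_BYTES))
    }

def join_payload(entity: dict) -> bytes:
    """Reassemble the chunks in order."""
    chunks = sorted(
        (k for k in entity if k.startswith("Data")),
        key=lambda k: int(k[4:]),
    )
    return b"".join(entity[k] for k in chunks)

doc = b"<root>" + b"x" * 200_000 + b"</root>"
entity = split_payload(doc)
assert join_payload(entity) == doc
print(len(entity))  # number of 64 KB chunks used
```

Blob Storage (with the table entity holding only a blob reference) is another common alternative for payloads this large.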
You can refer to this document for more comparisons between Azure Table Storage and SQL Azure: Azure Table Storage vs. Windows SQL Azure
Hope this helps.

Provision throughput on Database level using Table API in cosmos db

I have come across the requirement where I have to choose the API for Cosmos DB.
I have gone through all the APIs: SQL, Graph, Mongo, and Table. My current project structure is based on Table storage, where I am storing IoT device data.
In Current structure (Table storage) :
I have a separate Table for each Device with payload like below
{
    Timestamp,
    Parameter name,
    Value
}
Now, if I plan to use Cosmos DB, I can see that I have to provision RU/s (throughput) against each table, which I think is going to be a big cost. I have not found any way to assign RUs at the database level so that my allocated RUs can be shared across all tables.
Please let me know if there is a way to do this, or is this a limitation I have to accept for Cosmos DB with the Table API?
As far as I can see with the SQL API, considering my use case, I could create a single database and then multiple collections (named after each table); then I have both options for RU provisioning, at the database level as well as at the device level, which gives me more control over cost.
You can set the throughput on the account level.
You can optionally provision throughput at the account level to be shared by all tables in this account, to reduce your bill. These settings can be changed ONLY when you don't have any tables in the account. Note, throughput provisioned at the account level is billed for, whether you have tables created or not. The estimate below is approximate and does not include any discounts you may be entitled to.
Azure Cosmos DB pricing
The throughput configured on the database is shared across all the containers of the database. You can choose to explicitly exclude certain containers from database provisioning and instead provision throughput for those containers at container level.
A Cosmos DB database maps to the following: a database while using SQL or MongoDB APIs, a keyspace while using Cassandra API or a database account while using Gremlin or Table storage APIs.
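To see why database- or account-level shared throughput matters for a many-tables design like the one in the question, here is a rough cost comparison. The 400 RU/s minimum and the hourly rate are assumptions for illustration only; check the current Cosmos DB pricing page.

```python
# Back-of-envelope cost comparison: dedicated RU/s per table vs. one RU/s
# pool shared across all tables. The minimum (400 RU/s) and the hourly
# rate are assumptions for illustration; check current Cosmos DB pricing.
RATE_PER_100RU_HOUR = 0.008   # USD per 100 RU/s per hour (assumed)
HOURS_PER_MONTH = 730
MIN_RU = 400                  # minimum provisioned RU/s (assumed)

def monthly_cost(ru_per_sec: int) -> float:
    return ru_per_sec / 100 * RATE_PER_100RU_HOUR * HOURS_PER_MONTH

tables = 20
dedicated = monthly_cost(MIN_RU) * tables   # 400 RU/s on each of 20 tables
shared = monthly_cost(2_000)                # one 2,000 RU/s pool for all tables

print(f"dedicated: ${dedicated:.2f}/month")
print(f"shared:    ${shared:.2f}/month")
```

With one device table per partition key instead of per device, a single shared pool can serve spiky per-device traffic far more cheaply than 20 dedicated minimums.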
You can also use Cerebrata; the tool lets you assign any throughput value after choosing the throughput type (fixed, auto-scale, or no throughput).
Disclaimer: It’s purely based on my experience

Azure Table Storage partition limit

What is the limit on the number of partitions that can belong to an Azure table?
The only decent sources of information regarding Table Storage limitations I have found are here (2012) and here (2010).
There is no limit. As your links suggest, there are only scalability targets, plus limits on the keys and entity properties that define the table schema.

Resources