When do physical partitions get created? (Azure Cosmos DB)

I have a partitioned collection that uses a 5-digit membership code as its partition key. There could be thousands of partition keys in the collection.
I upserted about 32K documents in it. Using the Partition Stats sample:
Summary:
partitions: 1
documentsCount: 32,190
documentsSize: 0.045 GB
But there is only a single physical partition! The portal metrics show a similar picture.
Doesn't this mean that all my queries are going against a single physical partition? When does Cosmos add more physical partitions?
The reason I ask is that I am seeing really poor performance that seriously deteriorates under load testing. For example, this simple count method starts off fast in a light test and then takes seconds when the system is under pressure (ignore the handler stuff):
private async Task<int> RetrieveDigitalMembershipRefreshItemsCount(string code, string correlationId)
{
    var error = "";
    double cost = 0;
    var handler = HandlersFactory.GetProfilerHandler(_loggerService, _settingService);
    handler.Start(LOG_TAG, "RetrieveDigitalMembershipRefreshItemsCount", correlationId);

    try
    {
        if (this._docDbClient == null)
            throw new Exception("No singleton DocDb client!");

        // Check that a collection name is configured
        if (string.IsNullOrEmpty(_docDbDigitalMembershipsCollectionName))
            throw new Exception("No Digital Memberships collection defined!");

        // Scope the query to a single logical partition via the partition key
        FeedOptions queryOptions = new FeedOptions
        {
            MaxItemCount = 1,
            PartitionKey = new Microsoft.Azure.Documents.PartitionKey(code.ToUpper())
        };

        return await _docDbClient.CreateDocumentQuery<DigitalMembershipDeviceRegistrationItem>(
                UriFactory.CreateDocumentCollectionUri(_docDbDatabaseName, _docDbDigitalMembershipsCollectionName),
                queryOptions)
            .Where(e => e.CollectionType == DigitalMembershipCollectionTypes.RefreshItem
                        && e.Code.ToUpper() == code.ToUpper())
            .CountAsync();
    }
    catch (Exception ex)
    {
        error = ex.Message;
        throw new Exception(error);
    }
    finally
    {
        handler.Stop(error, cost, new { Code = code });
    }
}
Here is the log for this method as the test progresses, ordered by highest duration. Initially it takes only a few milliseconds.
I tried most of the performance tips, i.e. direct mode, same region, singleton client. Any help would be really appreciated.
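A rough sketch of that client setup (v2 SDK); the endpoint and key below are placeholders, not my real values:

// Singleton DocumentClient in direct mode; "<account>" and "<auth-key>" are placeholders.
private static readonly DocumentClient _docDbClient = new DocumentClient(
    new Uri("https://<account>.documents.azure.com:443/"),
    "<auth-key>",
    new ConnectionPolicy
    {
        ConnectionMode = ConnectionMode.Direct,   // direct TCP instead of the gateway
        ConnectionProtocol = Protocol.Tcp
    });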
Thanks.

The partition key defines logical partitions and drives how data is distributed across physical partitions; the physical partitions themselves are managed by Azure Cosmos DB.
A physical partition can be split automatically, as long as the container was created with a partition key and at least 1,000 RU/s of throughput. A split redistributes the logical partitions of one physical partition across new physical partitions, and the process is transparent to you.
There are two scenarios that cause a physical partition split:
You provision throughput higher than the existing partitions can serve: Azure Cosmos DB splits one or more of your physical partitions to support the higher throughput (see the sketch below).
A physical partition p reaches its storage limit: Azure Cosmos DB seamlessly splits p into two new physical partitions. If p holds only one logical partition, the split cannot occur.
So if those conditions are not met, you have only a single physical partition. But queries that supply a partition key still go against the specified logical partition.
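For illustration, a minimal sketch of the first scenario using the v3 .NET SDK (account, database, and container names are placeholders):

// Raising manual throughput on a container. If the new value exceeds what the
// current physical partitions can serve (10,000 RU/s each), Azure Cosmos DB
// splits one or more of them.
using Microsoft.Azure.Cosmos;

var client = new CosmosClient("<endpoint>", "<key>");
Container container = client.GetContainer("MyDatabase", "MyContainer");
await container.ReplaceThroughputAsync(20000); // e.g. 20,000 RU/s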
For more details, please refer to the Azure Cosmos DB partitioning documentation.

Related

Why is this zero result Cosmos DB query so expensive?

I'm investigating why we're consuming so many RUs in Cosmos. Our writes cost the expected number of RUs, but our reads are through the roof, an order of magnitude more than our writes. I tried to strip it down to the simplest scenario: a single request querying a partition with no results uses up 2,000 RUs. Why is this so expensive?
var query = new QueryDefinition(
        "SELECT * FROM c WHERE c.partitionKey = @partitionKey ORDER BY c._ts ASC, c.id ASC")
    .WithParameter("@partitionKey", id.Value);

using var queryResultSetIterator = container.GetItemQueryIterator<MyType>(query,
    requestOptions: new QueryRequestOptions
    {
        PartitionKey = new PartitionKey(id.Value.ToString()),
    });

while (queryResultSetIterator.HasMoreResults)
{
    foreach (var response in await queryResultSetIterator.ReadNextAsync())
    {
        yield return response.Data;
    }
}
The partition key of the collection is /partitionKey. The RU capacity is set directly on the container, not shared. We have a composite index matching the ORDER BY clause (_ts asc, id asc), although I'm not sure how that would make any difference when no records are returned.
The SDK does expose the spent RUs per page via the FeedResponse returned by ReadNextAsync, though I've mostly been using Azure Monitor to observe RU usage. A minimal sketch of capturing the charge, reusing the query and container from above:
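// Same iteration as above, but keeping the FeedResponse so the per-page
// request charge (in RUs) can be read off it.
double totalRUs = 0;
using var iterator = container.GetItemQueryIterator<MyType>(query,
    requestOptions: new QueryRequestOptions { PartitionKey = new PartitionKey(id.Value.ToString()) });
while (iterator.HasMoreResults)
{
    FeedResponse<MyType> page = await iterator.ReadNextAsync();
    totalRUs += page.RequestCharge; // RUs consumed by this page
}
Console.WriteLine($"Query consumed {totalRUs} RUs");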
Is anyone able to shed any light on why this query, returning zero records and limited to a single partition would take 2k RUs?
Update:
I just ran this query on another instance of the database in the same account, both configured identically. DB1 has 0 MB in it, DB2 has 44 MB. For the exact same operation returning no records, DB1 used 111 RUs and DB2 used 4,730 RUs, over 40 times more for the same no-result query.
Adding some more detail: The consistency is set to consistent prefix. It's single region.
Another Update:
I've replicated the issue just querying via Azure Portal and it's related to the number of records in the container. Looking at the query stats it's as though it's loading every single document in the container to search on the partition key. Is the partition key not the most performant way to search? Doesn't Cosmos know exactly where to find documents belonging to a partition key by design?
Request charge: 2445.38 RUs
Showing results: 0 - 0
Retrieved document count: 65671
Retrieved document size: 294343656 bytes
Output document count: 0
Output document size: 147 bytes
Index hit document count: 0
Index lookup time: 0 ms
Document load time: 8804.06 ms
Query engine execution time: 133.11 ms
System function execution time: 0 ms
User defined function execution time: 0 ms
Document write time: 0 ms
I eventually got to the bottom of the issue: in order to filter on the partition key, it needs to be indexed. That strikes me as odd, considering the partition key is used to decide where a document is stored, so you'd think Cosmos would inherently know the location of every partition key.
Including the partition key in the list of indexed paths solved my problem. It also explains why performance degraded over time as the database grew in size: it was scanning through every single document.
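A hedged sketch of the fix with the v3 .NET SDK (database and container names are placeholders):

// Add the partition key path to the existing container's indexing policy.
using Microsoft.Azure.Cosmos;

var client = new CosmosClient("<endpoint>", "<key>");
Container container = client.GetContainer("MyDatabase", "MyContainer");

ContainerProperties props = (await container.ReadContainerAsync()).Resource;
props.IndexingPolicy.IncludedPaths.Add(new IncludedPath { Path = "/partitionKey/?" });
await container.ReplaceContainerAsync(props);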

How do I get the list of physical partitions in Azure Cosmos DB? Is there a way to enumerate them?

Although I am setting high RUs, I am not getting the required results.
Background: I am working on an IoT application and unfortunately the partition key chosen is very bad ({deviceID} + {dd/mm/yyyy hh:mm:ss}), which means each logical partition holds very few items and will never reach the 10 GB limit. But I suspect a huge number of physical partitions got created, which forces my RUs to be split among them. How do I get the physical partition list?
You can't control partitions, nor can you get a partition list, but you don't actually need them; it's not as if each partition is placed on a separate box. If you are suffering from low performance, you need to identify what is causing throttling. You can use the Metrics blade to identify throttled partitions and figure out why they are throttled. You can also use diagnostic settings and stream them to Log Analytics to gain additional insights.
We can get the list of partition key ranges using this API. Partition key ranges might change in the future as the data changes.
Physical partitions are an internal implementation detail. We have no control over the size or number of physical partitions, and we can't control the mapping between logical and physical partitions.
But we can control the distribution of data over logical partitions by choosing an appropriate partition key that spreads data evenly across many logical partitions.
This information used to be displayed straightforwardly in the portal, but it was removed in a redesign.
I feel that this is a mistake, since provisioning RUs requires knowing the peak RU per partition multiplied by the number of partitions, so this number should be easily accessible.
The information is returned in the JSON returned to the portal but not shown to us. For collections provisioned with dedicated throughput (i.e. not using database provisioned throughput) this javascript bookmark shows the information.
javascript:(function () { var ss = ko.contextFor($(".ext-quickstart-tabs-left-margin")[0]).$rawData().selectedSection(); var coll = ss.selectedCollectionId(); if (coll === null) { alert("Please drill down into a specific container"); } else { alert("Partition count for container " + coll + " is " + ss.selectedCollectionPartitionCount()); } })();
Visit the Metrics tab in the portal, select the database and container, and then run the bookmark to see the count in an alert box.
You can also see this information from the pkranges REST endpoint, which is what the SDK uses. Some code that works with the V2 SDK is below:
var documentClient = new DocumentClient(new Uri(endpointUrl), authorizationKey,
    new ConnectionPolicy
    {
        ConnectionMode = ConnectionMode.Direct
    });

var partitionKeyRangesUri = UriFactory.CreatePartitionKeyRangesUri(dbName, collectionName);
FeedResponse<PartitionKeyRange> response = null;

do
{
    response = await documentClient.ReadPartitionKeyRangeFeedAsync(partitionKeyRangesUri,
        new FeedOptions { MaxItemCount = 1000 });

    foreach (var pkRange in response)
    {
        //TODO: Something with the pkRange
    }
} while (!string.IsNullOrEmpty(response.ResponseContinuation));
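In the v3 SDK the closest equivalent is feed ranges, which map roughly onto physical partitions; a hedged sketch, reusing the same variable names:

// Feed ranges roughly correspond to physical partitions (v3 SDK).
using Microsoft.Azure.Cosmos;

var client = new CosmosClient(endpointUrl, authorizationKey);
Container container = client.GetContainer(dbName, collectionName);
IReadOnlyList<FeedRange> ranges = await container.GetFeedRangesAsync();
Console.WriteLine($"Feed range count (approximate physical partition count): {ranges.Count}");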

Efficiently retrieving large numbers of entities from Azure Table Storage

What are some ways to optimize the retrieval of large numbers of entities (~250K) from a single partition from Azure Table Storage to a .NET application?
As far as I know, there are two ways to optimize the retrieval of large numbers of entities from a single partition from Azure Table Storage to a .NET application.
1. If you don't need all of an entity's properties, use server-side projection.
A single entity can have up to 255 properties and be up to 1 MB in size. When you query the table and retrieve entities, you may not need all the properties, and you can avoid transferring data unnecessarily (to help reduce latency and cost). Use server-side projection to transfer just the properties you need.
From: Azure Storage Table Design Guide: Designing Scalable and Performant Tables (Server-side projection)
For details, see the following code:
string filter = TableQuery.GenerateFilterCondition(
    "PartitionKey", QueryComparisons.Equal, "Sales");
List<string> columns = new List<string>() { "Email" };
TableQuery<EmployeeEntity> employeeQuery =
    new TableQuery<EmployeeEntity>().Where(filter).Select(columns);

var entities = employeeTable.ExecuteQuery(employeeQuery);
foreach (var e in entities)
{
    Console.WriteLine("RowKey: {0}, EmployeeEmail: {1}", e.RowKey, e.Email);
}
2. If you just want to display the table's data, you don't need to retrieve all the entities at the same time.
You can retrieve part of the result set.
When you need the rest, use the continuation token to fetch the next segment.
This improves query performance.
A query against the table service may return a maximum of 1,000 entities at one time and may execute for a maximum of five seconds. If the result set contains more than 1,000 entities, if the query did not complete within five seconds, or if the query crosses the partition boundary, the Table service returns a continuation token to enable the client application to request the next set of entities. For more information about how continuation tokens work, see Query Timeout and Pagination.
From: Azure Storage Table Design Guide: Designing Scalable and Performant Tables (Retrieving large numbers of entities from a query)
By using continuation tokens explicitly, you can control when your application retrieves the next segment of data.
For details, see the following code:
string filter = TableQuery.GenerateFilterCondition(
    "PartitionKey", QueryComparisons.Equal, "Sales");
TableQuery<EmployeeEntity> employeeQuery =
    new TableQuery<EmployeeEntity>().Where(filter);

TableContinuationToken continuationToken = null;
do
{
    var employees = employeeTable.ExecuteQuerySegmented(
        employeeQuery, continuationToken);
    foreach (var emp in employees)
    {
        // ...
    }
    continuationToken = employees.ContinuationToken;
} while (continuationToken != null);
Also, pay attention to the table partition scalability targets:
Target throughput for a single table partition (1 KB entities): up to 2,000 entities per second.
If you exceed the scalability target for a partition, the storage service will throttle your requests.

Service Fabric - reaching MaxReplicationMessageSize with a huge amount of data in a reliable dictionary

EDIT - question summary:
I want to expose an endpoint that can return portions of XML data selected by query parameters.
I have a stateful service that keeps the XML data, converted to DTOs, in a reliable dictionary.
I use a single, named partition (I can't tell which partition holds the data from the query parameters passed, so I can't implement a smarter partitioning strategy).
I am using service remoting for communication between the stateless WebApi service and the stateful one.
The XML data may reach 500 MB.
Everything is OK while the XML is only around 50 MB.
When the data gets larger, Service Fabric complains about MaxReplicationMessageSize.
And the summary of my questions below: how can one store a large amount of data in a reliable dictionary?
TL;DR:
Apparently, I am missing something...
I want to parse huge XMLs and load them into a reliable dictionary for later queries over them.
I am using a single, named partition.
I have an XMLData stateful service that loads these XMLs into a reliable dictionary in its RunAsync method via this piece of code:
var myDictionary = await this.StateManager
    .GetOrAddAsync<IReliableDictionary<string, List<HospitalData>>>("DATA");

using (var tx = this.StateManager.CreateTransaction())
{
    var result = await myDictionary.TryGetValueAsync(tx, "data");
    ServiceEventSource.Current.ServiceMessage(this, "data status: {0}",
        result.HasValue ? "loaded" : "not loaded yet, starts loading");

    if (!result.HasValue)
    {
        Stopwatch timer = new Stopwatch();
        timer.Start();

        // DataConverter parses the XML files into DTOs
        var converter = new DataConverter(XmlFolder);
        List<HospitalData> data = converter.LoadData();
        await myDictionary.AddOrUpdateAsync(tx, "data", data, (key, value) => data);

        timer.Stop();
        ServiceEventSource.Current.ServiceMessage(this,
            string.Format("Loading of data finished in {0} ms",
                timer.ElapsedMilliseconds));
    }
    await tx.CommitAsync();
}
I have a stateless WebApi service that is communicating with the above stateful one via service remoting and querying the dictionary via this code:
ServiceUriBuilder builder = new ServiceUriBuilder(DataServiceName);
IDataService dataServiceClient = ServiceProxy.Create<IDataService>(builder.ToUri(),
    new Microsoft.ServiceFabric.Services.Client.ServicePartitionKey("My.single.named.partition"));

try
{
    var data = await dataServiceClient.QueryData(SomeQuery);
    return Ok(data);
}
catch (Exception ex)
{
    ServiceEventSource.Current.Message("Web Service: Exception: {0}", ex);
    throw;
}
It works really well when the XMLs do not exceed 50 MB.
Beyond that I get errors like:
System.Fabric.FabricReplicationOperationTooLargeException: The replication operation is larger than the configured limit - MaxReplicationMessageSize ---> System.Runtime.InteropServices.COMException
Questions:
I am almost certain this is about the partitioning strategy and that I need to use more partitions. But how do I reference a particular partition from within the RunAsync method of the stateful service? (The stateful service is invoked via RPC from the WebApi, where I explicitly point at a partition, so there I can easily choose among partitions when using the ranged partitioning strategy - but how do I do that during the initial loading of data in RunAsync?)
Are these thoughts of mine correct: the code in a stateful service operates on a single partition, so loading a huge amount of data and partitioning it should happen outside the stateful service (for example in an Actor); then, after determining the partition key, I invoke the stateful service via RPC, pointing it at that particular partition?
Is this actually a partitioning problem at all, and what (where, who) defines the size of a replication message? I.e. does the partitioning strategy influence replication message sizes?
Would extracting the loading logic into a stateful Actor help in any way?
For any help on this - thanks a lot!
The issue is that you're trying to add a large amount of data into a single dictionary record. When Service Fabric tries to replicate that data to other replicas of the service, it encounters a quota of the replicator, MaxReplicationMessageSize, which indeed defaults to 50MB (documented here).
You can increase the quota by specifying a ReliableStateManagerConfiguration:
internal sealed class Stateful1 : StatefulService
{
    public Stateful1(StatefulServiceContext context)
        : base(context, new ReliableStateManager(context,
            new ReliableStateManagerConfiguration(new ReliableStateManagerReplicatorSettings
            {
                MaxReplicationMessageSize = 1024 * 1024 * 200
            })))
    { }
}
But I strongly suggest you change the way you store your data. The current method won't scale very well and isn't the way Reliable Collections were meant to be used.
Instead, you should store each HospitalData in a separate dictionary item. Then you can query the items in the dictionary (see this answer for details on how to use LINQ). You will not need to change the above quota.
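A minimal sketch of that shape, reusing the DataConverter from your RunAsync and assuming HospitalData exposes a unique Id property (an assumption, adjust to your model):

// One dictionary entry per HospitalData, so each replication operation stays small.
var dict = await this.StateManager
    .GetOrAddAsync<IReliableDictionary<string, HospitalData>>("DATA");

using (var tx = this.StateManager.CreateTransaction())
{
    foreach (HospitalData item in converter.LoadData())
    {
        await dict.AddOrUpdateAsync(tx, item.Id, item, (key, existing) => item);
    }
    // For very large loads, consider committing in batches instead of one
    // giant transaction.
    await tx.CommitAsync();
}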
PS - You don't necessarily have to use partitioning for 500MB of data. But regarding your question - you could use partitions even if you can't derive the key from the query, simply by querying all partitions and then combining the data.
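A hedged sketch of that fan-out, assuming ranged (Int64) partitions and reusing IDataService, ServiceUriBuilder, and SomeQuery from the question:

// Query every partition of the stateful service and merge the results.
// Assumes QueryData returns an IEnumerable<HospitalData>.
using System.Fabric;
using System.Fabric.Query;
using Microsoft.ServiceFabric.Services.Client;
using Microsoft.ServiceFabric.Services.Remoting.Client;

var fabricClient = new FabricClient();
Uri serviceUri = new ServiceUriBuilder(DataServiceName).ToUri();
ServicePartitionList partitions =
    await fabricClient.QueryManager.GetPartitionListAsync(serviceUri);

var combined = new List<HospitalData>();
foreach (Partition partition in partitions)
{
    var info = (Int64RangePartitionInformation)partition.PartitionInformation;
    var proxy = ServiceProxy.Create<IDataService>(
        serviceUri, new ServicePartitionKey(info.LowKey));
    combined.AddRange(await proxy.QueryData(SomeQuery));
}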

Slow Azure Table Search and Insert Operations on small tables

I am trying to benchmark search/read and insert queries on a small Azure Table Storage table (~500 entities). The average insert time is 400 ms and the average search + retrieve time is 190 ms.
When searching, I query on the partition key and the condition is a single predicate: [PartitionKey] eq <value> (no additional ands/ors). I am also returning only one property.
What could be the reason for such results?
Search code:
TableQuery<DynamicTableEntity> projectionQuery = new TableQuery<DynamicTableEntity>()
    .Select(new string[] { "State" });
projectionQuery.Where(TableQuery.GenerateFilterCondition(
    "PartitionKey", QueryComparisons.Equal, "" + msg.PartitionKey));

// Define an entity resolver to work with the entity after retrieval.
EntityResolver<bool?> resolver = (pk, rk, ts, props, etag) =>
    props.ContainsKey("State") ? (props["State"].BooleanValue) : null;

Stopwatch sw = new Stopwatch();
sw.Start();
List<bool?> sList = table.ExecuteQuery(projectionQuery, resolver, null, null).ToList();
sw.Stop();
Insert Code:
CloudTable table = tableClient.GetTableReference("Messages");
TableOperation insertOperation = TableOperation.Insert(msg);

Stopwatch sw = new Stopwatch();
// Execute the insert operation.
sw.Start();
table.Execute(insertOperation);
sw.Stop();
You can refer to this post for possible performance issues: Microsoft Azure Storage Performance and Scalability Checklist.
The reason you only get one property back is that you're using an EntityResolver; try removing it. See Windows Azure Storage Client Library 2.0 Tables Deep Dive for when you should use an EntityResolver and how to use it correctly.
From the SLA Document:
Storage
We guarantee that at least 99.99% of the time, we will successfully process requests to read data from Read Access-Geo Redundant Storage (RA-GRS) accounts, provided that failed attempts to read data from the primary region are retried on the secondary region.
We guarantee that at least 99.9% of the time, we will successfully process requests to read data from Locally Redundant Storage (LRS), Zone Redundant Storage (ZRS), and Geo Redundant Storage (GRS) accounts.
We guarantee that at least 99.9% of the time, we will successfully process requests to write data to Locally Redundant Storage (LRS), Zone Redundant Storage (ZRS), and Geo Redundant Storage (GRS) accounts and Read Access-Geo Redundant Storage (RA-GRS) accounts.
And from the document referenced there:
Table Query / List Operations - Maximum Processing Time: ten (10) seconds (to complete processing or return a continuation)
There is no commitment to fast or low response times, nor any commitment to being faster on smaller tables.
