Performance degradation after upgrading to Microsoft.Azure.Cosmos.Table to access Azure Table storage

We upgraded to the newer SDK for accessing our Azure Table storage.
After that, we observed a performance degradation in our application. We even created test applications with an identical usage pattern to isolate it, and we still see the performance hit.
We are using .NET Framework code that reads data from an Azure table.
Old client: Microsoft.WindowsAzure.Storage - 9.3.2
New client: Microsoft.Azure.Cosmos.Table - 1.0.6
Here is one of the sample tests we tried to run:
public async Task ComparisionTest1()
{
    var partitionKey = CompanyId.ToString();
    {
        // Microsoft.Azure.Cosmos.Table
        var storageAccount = Microsoft.Azure.Cosmos.Table.CloudStorageAccount.Parse(ConnectionString);
        var tableClient = Microsoft.Azure.Cosmos.Table.CloudStorageAccountExtensions.CreateCloudTableClient(storageAccount);
        var tableRef = tableClient.GetTableReference("UserStatuses");
        var query = new Microsoft.Azure.Cosmos.Table.TableQuery<Microsoft.Azure.Cosmos.Table.TableEntity>()
            .Where(Microsoft.Azure.Cosmos.Table.TableQuery.GenerateFilterCondition("PartitionKey", "eq", partitionKey));
        var result = new List<Microsoft.Azure.Cosmos.Table.TableEntity>(20000);
        var stopwatch = Stopwatch.StartNew();
        var tableQuerySegment = await tableRef.ExecuteQuerySegmentedAsync(query, null);
        result.AddRange(tableQuerySegment.Results);
        while (tableQuerySegment.ContinuationToken != null)
        {
            tableQuerySegment = await tableRef.ExecuteQuerySegmentedAsync(query, tableQuerySegment.ContinuationToken);
            result.AddRange(tableQuerySegment.Results);
        }
        stopwatch.Stop();
        Trace.WriteLine($"Cosmos table client. Elapsed: {stopwatch.Elapsed}");
    }
    {
        // Microsoft.WindowsAzure.Storage
        var storageAccount = Microsoft.WindowsAzure.Storage.CloudStorageAccount.Parse(ConnectionString);
        var tableClient = storageAccount.CreateCloudTableClient();
        var tableRef = tableClient.GetTableReference("UserStatuses");
        var query = new Microsoft.WindowsAzure.Storage.Table.TableQuery<Microsoft.WindowsAzure.Storage.Table.TableEntity>()
            .Where(Microsoft.WindowsAzure.Storage.Table.TableQuery.GenerateFilterCondition("PartitionKey", "eq", partitionKey));
        var result = new List<Microsoft.WindowsAzure.Storage.Table.TableEntity>(20000);
        var stopwatch = Stopwatch.StartNew();
        var tableQuerySegment = await tableRef.ExecuteQuerySegmentedAsync(query, null);
        result.AddRange(tableQuerySegment.Results);
        while (tableQuerySegment.ContinuationToken != null)
        {
            tableQuerySegment = await tableRef.ExecuteQuerySegmentedAsync(query, tableQuerySegment.ContinuationToken);
            result.AddRange(tableQuerySegment.Results);
        }
        stopwatch.Stop();
        Trace.WriteLine($"Old table client. Elapsed: {stopwatch.Elapsed}");
    }
}
Has anyone else observed this? Any thoughts?

The performance issue will be resolved in Table SDK 1.0.7, as verified with large entities.
On 1.0.6, the workaround is to disable Table SDK tracing by adding a diagnostics section to app.config, if it is a .NET Framework app. It will still be slower than the Storage SDK but, depending on the usage, much better than without the workaround.

I think your data is stored in the legacy Storage Table. In case the account is actually backed by Cosmos DB Table, you may get better performance by setting TableClientConfiguration.UseRestExecutorForCosmosEndpoint to true.
If it is the legacy Storage Table store, Cosmos DB Table SDK 1.0.6 is about 15% slower than Storage Table SDK 9.3.3. In addition, it adds about a second of overhead on the first CRUD operation. The higher query duration has been resolved in 1.0.7, which is on par with the Storage SDK. The one-second initialization is still required when using Cosmos DB Table SDK 1.0.7, which should be acceptable.
We are planning to release 1.0.7 during the week of 4/13.
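For a Cosmos DB backed account, the UseRestExecutorForCosmosEndpoint setting mentioned above could be applied roughly as follows. This is a minimal sketch, assuming the Microsoft.Azure.Cosmos.Table NuGet package and its CreateCloudTableClient overload that accepts a TableClientConfiguration; TableClientFactory is a hypothetical helper name, not part of the SDK.

```csharp
using Microsoft.Azure.Cosmos.Table;

// Hypothetical helper (not part of the SDK): builds a table client that
// uses the REST executor when talking to a Cosmos DB Table endpoint.
public static class TableClientFactory
{
    public static CloudTableClient Create(string connectionString)
    {
        CloudStorageAccount storageAccount = CloudStorageAccount.Parse(connectionString);

        var configuration = new TableClientConfiguration
        {
            // Only relevant for Cosmos DB backed accounts, per the answer above.
            UseRestExecutorForCosmosEndpoint = true
        };

        return storageAccount.CreateCloudTableClient(configuration);
    }
}
```

Whether this helps depends on the workload, so it is worth timing it with the same test as in the question.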

Related

Azure data lake query acceleration error - One or more errors occurred. (XML specified is not syntactically valid.)

I am trying to filter data from an Azure storage account (Azure Data Lake Storage Gen2) using an ADLS query, but I cannot get the filtered result back. I have been stuck on this issue, and even Microsoft support has not been able to crack it. Any help is greatly appreciated.
Tutorial Link: https://www.c-sharpcorner.com/article/azure-data-lake-storage-gen2-query-acceleration/
Solution: .NET Core 3.1 console app
Error: One or more errors occurred. (XML specified is not syntactically valid.)
Status: 400 (XML specified is not syntactically valid.)
private static async Task MainAsync()
{
    var connectionString = "DefaultEndpointsProtocol=https;AccountName=gfsdlstestgen2;AccountKey=0AOkFckONVYkTh9Kpr/VRozBrhWYrLoH7y0mW5wrw==;EndpointSuffix=core.windows.net";
    var blobServiceClient = new BlobServiceClient(connectionString);
    var containerClient = blobServiceClient.GetBlobContainerClient("test");
    await foreach (var blobItem in containerClient.GetBlobsAsync(BlobTraits.Metadata, BlobStates.None, "ds_measuringpoint.json"))
    {
        var blobClient = containerClient.GetBlockBlobClient(blobItem.Name);
        // Both the input blob and the query output are treated as JSON.
        var options = new BlobQueryOptions
        {
            InputTextConfiguration = new BlobQueryJsonTextOptions(),
            OutputTextConfiguration = new BlobQueryJsonTextOptions()
        };
        var result = await blobClient.QueryAsync(@"SELECT * FROM BlobStorage WHERE measuringpointid = 547", options);
        var jsonString = await new StreamReader(result.Value.Content).ReadToEndAsync();
        Console.WriteLine(jsonString);
        Console.ReadLine();
    }
}
After looking everywhere and testing almost every variation of the ADLS query for .NET, Microsoft support told us that Azure.Storage.Blobs version 12.10 is a broken version.
Downgrading the package to 12.8.0 worked.

Azure Storage Queue performance

We are migrating a transaction-processing service that used to process messages from MSMQ and store transactions in a SQL Server database. It now uses Azure Storage Queue to hold the IDs of the messages, while the actual messages are placed in Azure Blob storage.
We should be able to process at least 200,000 messages per hour, but at the moment we barely reach 50,000.
Our application requests batches of 250 messages from the queue (it currently takes about 2 seconds to get the IDs from the Azure queue and about 5 seconds to get the actual data from Azure Blob storage), and we store the data in the database in a single call, using a stored procedure that accepts a DataTable.
Our service also runs in Azure, on a virtual machine, and we use the NuGet libraries Azure.Storage.Queues and Azure.Storage.Blobs suggested by Microsoft to access the queue and blob storage.
Does anyone have suggestions for speeding up reading messages from the Azure queue and then retrieving the data from Azure Blob storage?
var managedIdentity = new ManagedIdentityCredential();
UriBuilder fullUri = new UriBuilder()
{
    Scheme = "https",
    Host = string.Format("{0}.queue.core.windows.net", appSettings.StorageAccount),
    Path = string.Format("{0}", appSettings.QueueName),
};
queue = new QueueClient(fullUri.Uri, managedIdentity);
queue.CreateIfNotExists();
...
var result = await queue.ReceiveMessagesAsync(1);
...
UriBuilder fullUri = new UriBuilder()
{
    Scheme = "https",
    Host = string.Format("{0}.blob.core.windows.net", storageAccount),
    Path = string.Format("{0}", containerName),
};
_blobContainerClient = new BlobContainerClient(fullUri.Uri, managedIdentity);
_blobContainerClient.CreateIfNotExists();
...
public async Task<BlobMessage> GetBlobByNameAsync(string blobName)
{
    Ensure.That(blobName).IsNotNullOrEmpty();
    var blobClient = _blobContainerClient.GetBlobClient(blobName);
    if (!blobClient.Exists())
    {
        _log.Error($"Blob {blobName} not found.");
        throw new InfrastructureException($"Blob {blobName} not found.");
    }
    BlobDownloadInfo download = await blobClient.DownloadAsync();
    return new BlobMessage
    {
        BlobName = blobClient.Name,
        BaseStream = download.Content,
        Content = await GetBlobContentAsync(download)
    };
}
Thanks,
Vincent.
Based on the code you posted, I can suggest two improvements:
1. Receive 32 messages at a time instead of 1. Currently you are getting just one message per call (var result = await queue.ReceiveMessagesAsync(1);). You can receive a maximum of 32 messages from the top of the queue in a single call, so change the code to var result = await queue.ReceiveMessagesAsync(32);. This saves 31 round trips to the storage service for every 32 messages, which should improve performance.
2. Don't try to create the blob container every time. Currently you are trying to create the blob container every time you process a message (_blobContainerClient.CreateIfNotExists();), which is unnecessary. Fetching 32 messages at a time already saves 31 of these calls per batch, but you can simply move this code to application startup so that it runs only once during the application's lifetime.
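Since the question needs batches of 250 and a single receive call returns at most 32 messages, the first suggestion can be sketched as a loop that keeps requesting up to 32 messages until the batch is full or the queue is drained. This is only a sketch, assuming the Azure.Storage.Queues package; QueueBatchReader, NextRequestSize and ReceiveBatchAsync are hypothetical names, and the QueueClient is assumed to have been created (and CreateIfNotExists called) once at startup.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;
using Azure.Storage.Queues;
using Azure.Storage.Queues.Models;

// Hypothetical helper (not part of the SDK).
public static class QueueBatchReader
{
    // A single receive call returns at most 32 messages.
    public const int MaxMessagesPerCall = 32;

    // How many messages to ask for in the next call.
    public static int NextRequestSize(int remaining) =>
        Math.Min(remaining, MaxMessagesPerCall);

    // Collects up to batchSize messages using as few receive calls as possible.
    public static async Task<List<QueueMessage>> ReceiveBatchAsync(QueueClient queue, int batchSize)
    {
        var messages = new List<QueueMessage>(batchSize);
        while (messages.Count < batchSize)
        {
            int requestSize = NextRequestSize(batchSize - messages.Count);
            QueueMessage[] received = (await queue.ReceiveMessagesAsync(requestSize)).Value;
            if (received.Length == 0)
            {
                break; // nothing visible in the queue right now
            }
            messages.AddRange(received);
        }
        return messages;
    }
}
```

With a batch size of 250, a full queue needs 8 receive calls (7 of 32 messages plus one of 26) instead of 250 single-message calls.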

Is there a way to programmatically change TTL on a cosmos db Table

As the title describes, I'm trying to change the TTL of a Cosmos DB table.
I couldn't find anything for C#, PowerShell, or ARM templates.
Here is what I'm trying to achieve
The only thing I was able to find is the API call that the Azure portal triggers, but I'm wondering whether it is safe to use this API directly.
In the Cosmos DB Table API, tables are essentially containers, so you can use the Cosmos DB SQL API SDK to manipulate the table. Here is sample code to do so:
var cosmosClient = new CosmosClient(CosmosConnectionString);
var database = cosmosClient.GetDatabase(Database);
var container = database.GetContainer("test");
var containerResponse = await container.ReadContainerAsync();
var containerProperties = containerResponse.Resource;
Console.WriteLine("Current TTL on the container is: " + containerProperties.DefaultTimeToLive);
containerProperties.DefaultTimeToLive = 120; // TTL in seconds
containerResponse = await container.ReplaceContainerAsync(containerProperties);
containerProperties = containerResponse.Resource;
Console.WriteLine("Current TTL on the container is: " + containerProperties.DefaultTimeToLive);
Console.ReadKey();
Setting the TTL is now supported directly through Microsoft.Azure.Cosmos.Table, version 1.0.8 or later.
// Get the table reference for table operations
CloudTable table = <tableClient>.GetTableReference(<tableName>);
table.CreateIfNotExists(defaultTimeToLive: <ttlInSeconds>);

Azure Monitor in .NET Core

Using the preview package for Microsoft.Azure.Management.Monitor, I am trying to get metrics from Azure into a .NET Core application, but I am uncertain about what to input as "resourceUri".
var serviceCreds = await ApplicationTokenProvider.LoginSilentAsync(tenantId, clientId, secret);
var monitorClient = new MonitorManagementClient(serviceCreds);
monitorClient.SubscriptionId = subscriptionId;
var resourceUri = "";
var metrics = await monitorClient.Metrics.ListAsync(resourceUri: resourceUri, cancellationToken: CancellationToken.None);
What should I insert in the resourceUri variable, and where do I get this uri from in Azure? A lot of things are great about Azure, but not documentation 🤨
Good question.
The resourceUri has the following format (this example is for a web app; replace the subscription ID, resource group name, and so on with your real values):
/subscriptions/4d7e91d4-e930-4bb5-a93d-163aa358e0dc/resourceGroups/Default-Web-westus/providers/microsoft.web/serverFarms/DefaultServerFarm
You can find this information in the source code, here.
The format differs slightly between resource types. Here is another resourceUri, for blob storage:
/subscriptions/xxx/resourceGroups/xxx/providers/Microsoft.Storage/storageAccounts/xxx/blobServices/default/providers/Microsoft.Insights/metrics/ContainerCount
If you still have issues, please feel free to let me know.
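To avoid typos when assembling the first form by hand, the web app example above can be built from its parts. This is a small sketch rather than anything from the SDK: ResourceUriBuilder and its parameter names are hypothetical, and it only covers the simple /providers/{namespace}/{type}/{name} shape, not nested resources such as the blob service example.

```csharp
using System;

// Hypothetical helper (not part of Microsoft.Azure.Management.Monitor).
public static class ResourceUriBuilder
{
    // Builds /subscriptions/{id}/resourceGroups/{group}/providers/{namespace}/{type}/{name}
    public static string Build(
        string subscriptionId,
        string resourceGroup,
        string providerNamespace,
        string resourceType,
        string resourceName)
    {
        return $"/subscriptions/{subscriptionId}" +
               $"/resourceGroups/{resourceGroup}" +
               $"/providers/{providerNamespace}/{resourceType}/{resourceName}";
    }
}

public static class Program
{
    public static void Main()
    {
        // Values taken from the web app example above.
        string resourceUri = ResourceUriBuilder.Build(
            "4d7e91d4-e930-4bb5-a93d-163aa358e0dc",
            "Default-Web-westus",
            "microsoft.web",
            "serverFarms",
            "DefaultServerFarm");

        Console.WriteLine(resourceUri);
    }
}
```

The resulting string is what would be passed as resourceUri to monitorClient.Metrics.ListAsync in the question.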

Support for RetryPolicy in Azure Table Storage for WindowsAzure.Storage SDK version 7.0.0.0

I need to apply a custom retry policy to all table operations. This is what I have been using:
_account = CloudStorageAccount.Parse(PhoenixConfiguration.AzureBlobStorageConnection);
var _tableClient = this._account.CreateCloudTableClient();
IRetryPolicy linearRetryPolicy = new LinearRetry(TimeSpan.FromSeconds(5), 10);
_tableClient.RetryPolicy = linearRetryPolicy;
I was using WindowsAzure.Storage SDK version 6. After upgrading my project to WindowsAzure.Storage SDK version 7, this code breaks. What is the correct way to implement a custom retry policy in the new SDK? Is there any documentation I can refer to?
Your code fails to compile because the RetryPolicy member on CloudTableClient was deprecated in version 6.0 and removed in 7.0. (Surprisingly, it is still present on CloudBlobClient, though deprecated.)
To use retry policies, you have to use TableRequestOptions and specify the retry policy there. For example, this is how you could use it when creating a table:
var storageAccount = new CloudStorageAccount(new StorageCredentials(accountName, accountKey), true);
// Retry up to 10 times, waiting 5 seconds between attempts.
IRetryPolicy linearRetry = new LinearRetry(TimeSpan.FromSeconds(5), 10);
var tableClient = storageAccount.CreateCloudTableClient();
var table = tableClient.GetTableReference("MyTable");
var tableRequestOptions = new TableRequestOptions()
{
    RetryPolicy = linearRetry
};
table.CreateIfNotExists(tableRequestOptions);
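Since the question asks for a policy on all table operations, one option may be to set it once on the client's default request options instead of passing TableRequestOptions to every call. A minimal sketch, assuming the DefaultRequestOptions property on CloudTableClient in WindowsAzure.Storage 7.x; RetryingTableClientFactory is a hypothetical helper name.

```csharp
using System;
using Microsoft.WindowsAzure.Storage;
using Microsoft.WindowsAzure.Storage.RetryPolicies;
using Microsoft.WindowsAzure.Storage.Table;

// Hypothetical helper (not part of the SDK).
public static class RetryingTableClientFactory
{
    public static CloudTableClient Create(CloudStorageAccount storageAccount)
    {
        CloudTableClient tableClient = storageAccount.CreateCloudTableClient();

        // Applied to every operation issued through this client unless a call
        // passes its own TableRequestOptions with a different RetryPolicy.
        tableClient.DefaultRequestOptions.RetryPolicy =
            new LinearRetry(TimeSpan.FromSeconds(5), 10);

        return tableClient;
    }
}
```

Tables obtained from this client through GetTableReference should then pick up the retry policy without further changes at each call site.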

Resources