Meta-Question:
We're pulling data from EventHub, running some logic, and saving it off to Cosmos. Currently, Cosmos inserts are our bottleneck. How do we maximize our throughput?
Details
We're trying to optimize our Cosmos throughput and there seems to be some contention in the SDK that makes parallel inserts only marginally faster than serial inserts.
We're logically doing:
for (int i = 0; i < insertCount; i++)
{
    taskList.Add(InsertCosmos(sdkContainerClient));
}
var parallelTimes = await Task.WhenAll(taskList);
Here are the results comparing serial inserts, parallel inserts, and "faking" an insert (with Task.Delay):
Serial took: 461ms for 20
- Individual times 28,8,117,19,14,11,10,12,5,8,9,11,18,15,79,23,14,16,14,13
Cosmos Parallel
Parallel took: 231ms for 20
- Individual times 17,15,23,39,45,52,72,74,80,91,96,98,108,117,123,128,139,146,147,145
Just Parallel (no cosmos)
Parallel took: 27ms for 20
- Individual times 27,26,26,26,26,26,26,25,25,25,25,25,25,24,24,24,23,23,23,23
Serial is obvious (the total is just the sum of the individual times).
No-cosmos (the last timing) is also obvious (the total is roughly the longest individual time).
But parallel Cosmos doesn't parallelize nearly as well, indicating there's some contention.
We're running this on a VM in Azure (same datacenter as Cosmos), have enough RUs so aren't getting 429s, and using Microsoft.Azure.Cosmos 3.2.0.
Full Code Sample
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Threading.Tasks;
using Microsoft.Azure.Cosmos;

class Program
{
    public static void Main(string[] args)
    {
        CosmosWriteTest().Wait();
    }

    public static async Task CosmosWriteTest()
    {
        var cosmosClient = new CosmosClient("todo", new CosmosClientOptions { ConnectionMode = ConnectionMode.Direct });
        var database = cosmosClient.GetDatabase("<ourdatabase>");
        var sdkContainerClient = database.GetContainer("<ourcontainer>");

        int insertCount = 25;

        // Warmup
        await sdkContainerClient.CreateItemAsync(new TestObject());

        // ---Serial inserts into Cosmos---
        List<long> serialTimes = new List<long>();
        var serialTimer = Stopwatch.StartNew();
        Console.WriteLine("Cosmos Serial");
        for (int i = 0; i < insertCount; i++)
        {
            serialTimes.Add(await InsertCosmos(sdkContainerClient));
        }
        serialTimer.Stop();
        Console.WriteLine($"Serial took: {serialTimer.ElapsedMilliseconds}ms for {insertCount}");
        Console.WriteLine($" - Individual times {string.Join(",", serialTimes)}");

        // ---Parallel inserts into Cosmos---
        Console.WriteLine(Environment.NewLine + "Cosmos Parallel");
        var parallelTimer = Stopwatch.StartNew();
        var taskList = new List<Task<long>>();
        for (int i = 0; i < insertCount; i++)
        {
            taskList.Add(InsertCosmos(sdkContainerClient));
        }
        var parallelTimes = await Task.WhenAll(taskList);
        parallelTimer.Stop();
        Console.WriteLine($"Parallel took: {parallelTimer.ElapsedMilliseconds}ms for {insertCount}");
        Console.WriteLine($" - Individual times {string.Join(",", parallelTimes)}");

        // ---Testing parallelism minus Cosmos---
        Console.WriteLine(Environment.NewLine + "Just Parallel (no cosmos)");
        var justParallelTimer = Stopwatch.StartNew();
        var noCosmosTaskList = new List<Task<long>>();
        for (int i = 0; i < insertCount; i++)
        {
            noCosmosTaskList.Add(InsertCosmos(sdkContainerClient, true));
        }
        var justParallelTimes = await Task.WhenAll(noCosmosTaskList);
        justParallelTimer.Stop();
        Console.WriteLine($"Parallel took: {justParallelTimer.ElapsedMilliseconds}ms for {insertCount}");
        Console.WriteLine($" - Individual times {string.Join(",", justParallelTimes)}");
    }

    // Inserts one item (or just delays, to measure task overhead without Cosmos)
    private static async Task<long> InsertCosmos(Container sdkContainerClient, bool justDelay = false)
    {
        var timer = Stopwatch.StartNew();
        if (!justDelay)
            await sdkContainerClient.CreateItemAsync(new TestObject());
        else
            await Task.Delay(20);
        timer.Stop();
        return timer.ElapsedMilliseconds;
    }

    // Test object to save to Cosmos
    public class TestObject
    {
        public string id { get; set; } = Guid.NewGuid().ToString();
        public string pKey { get; set; } = Guid.NewGuid().ToString();
        public string Field1 { get; set; } = "Testing this field";
        public double Number { get; set; } = 12345;
    }
}
This is the scenario for which Bulk is being introduced. Bulk mode is in preview at this moment and available in the 3.2.0-preview2 package.
What you need to do to take advantage of Bulk is turn the AllowBulkExecution flag on:
new CosmosClient(endpoint, authKey, new CosmosClientOptions() { AllowBulkExecution = true } );
This mode was made precisely for the scenario you describe: a list of concurrent operations that need throughput.
We have a sample project here: https://github.com/Azure/azure-cosmos-dotnet-v3/tree/master/Microsoft.Azure.Cosmos.Samples/Usage/BulkSupport
And we are still working on the official documentation, but the idea is that when concurrent operations are issued, instead of executing them as individual requests as you are seeing right now, the SDK will group them by partition affinity and execute them as grouped (batch) operations, reducing the number of backend service calls and potentially increasing throughput by 50%–100%, depending on the volume of operations. This mode consumes more RU/s because it pushes a higher volume of operations per second than issuing the operations individually (so if you hit 429s, it means the bottleneck is now the provisioned RU/s).
var cosmosClient = new CosmosClient("todo", new CosmosClientOptions { AllowBulkExecution = true });
var database = cosmosClient.GetDatabase("<ourdatabase>");
var sdkContainerClient = database.GetContainer("<ourcontainer>");

// The more operations the better; just 25 might not yield a great difference vs non-bulk
int insertCount = 10000;

// Don't do any warmup

List<Task> operations = new List<Task>();
var timer = Stopwatch.StartNew();
for (int i = 0; i < insertCount; i++)
{
    operations.Add(sdkContainerClient.CreateItemAsync(new TestObject()));
}
await Task.WhenAll(operations);
timer.Stop();
Important: this is a feature that is still in preview. Since this mode is optimized for throughput (not latency), no single individual operation will have great latency.
If you want to optimize even further, and your data source lets you access streams (avoiding serialization), you can use the CreateItemStreamAsync SDK methods for even better throughput.
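For illustration, here is a minimal sketch of a stream-based insert. It assumes the TestObject class from the question, a container partition key path of /pKey to match it, and System.Text.Json for serialization (any serializer that produces UTF-8 JSON works):
private static async Task InsertCosmosStream(Container container, TestObject item)
{
    // Serialize manually; the SDK performs no serialization of its own for stream APIs
    byte[] payload = System.Text.Json.JsonSerializer.SerializeToUtf8Bytes(item);
    using (var stream = new System.IO.MemoryStream(payload))
    using (ResponseMessage response = await container.CreateItemStreamAsync(
        stream, new PartitionKey(item.pKey)))
    {
        // Stream APIs don't throw on failure status codes; check them explicitly
        if (!response.IsSuccessStatusCode)
            Console.WriteLine($"Insert failed with {response.StatusCode}");
    }
}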
I am using ListBlobsSegmentedAsync in my C# code to list all the blobs. Is there a way I can separate the images and videos from the response of ListBlobsSegmentedAsync?
Here is an example from this link. You should be able to optimize the code to use yield return, which returns results iteratively rather than leaving your calling code waiting for all the results to be returned.
public static String WildCardToRegular(String value)
{
    return "^" + Regex.Escape(value).Replace("\\*", ".*") + "$";
}
Then, using it with ListBlobsSegmentedAsync:
var blobList = await container.ListBlobsSegmentedAsync(blobFilePath, true, BlobListingDetails.None, 1000, token, null, null);
var items = blobList.Results.OfType<CloudBlockBlob>();

// Filter items by search pattern, if specified
if (!string.IsNullOrEmpty(searchPattern))
{
    items = items
        .Where(i => Regex.IsMatch(Path.GetFileName(i.Name), WildCardToRegular(searchPattern), RegexOptions.IgnoreCase))
        .ToList();
}
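To address the original question of separating images and videos, one option is to partition the listed blobs by file extension. This is a hedged sketch; the extension sets here are illustrative assumptions, so adjust them to your data:
var imageExtensions = new HashSet<string>(StringComparer.OrdinalIgnoreCase)
    { ".jpg", ".jpeg", ".png", ".gif", ".bmp" };
var videoExtensions = new HashSet<string>(StringComparer.OrdinalIgnoreCase)
    { ".mp4", ".avi", ".mov", ".wmv", ".mkv" };

// Reuse the blobList from the segmented listing above
var blobs = blobList.Results.OfType<CloudBlockBlob>().ToList();
var images = blobs.Where(b => imageExtensions.Contains(Path.GetExtension(b.Name))).ToList();
var videos = blobs.Where(b => videoExtensions.Contains(Path.GetExtension(b.Name))).ToList();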
I am developing a Quartz.NET job which runs every hour. It executes the following method. I am calling a web API inside a loop. I want to make sure I return from the GetScripts() method only after all calls have completed. How do I do this, or have I done it right?
Job
public void Execute(IJobExecutionContext context)
{
    try
    {
        var scripts = _scriptService.GetScripts().GetAwaiter().GetResult();
    }
    catch (Exception ex)
    {
        _logProvider.Error("Error while executing Script Changed Notification job : " + ex);
    }
}
Service method:
public async Task<IEnumerable<ChangedScriptsByChannel>> GetScripts()
{
    var result = new List<ChangedScriptsByChannel>();
    var currentTime = _systemClock.CurrentTime;
    var channelsToProcess = _lastRunReader.GetChannelsToProcess().ToList();
    if (!channelsToProcess.Any()) return result;

    foreach (var channel in channelsToProcess)
    {
        var changedScripts = await _scriptRepository.GetChangedScriptAsync(queryString);
        if (changedScripts.Any())
        {
            result.Add(new ChangedScriptsByChannel()
            {
                ChannelCode = channel.ChannelCode,
                ChangedScripts = changedScripts
            });
        }
    }
    return result;
}
As of 8 days ago there was a formal announcement from the Quartz.NET team stating that the latest version, 3.0 Alpha 1, has full support for async and await. I would suggest upgrading to that if at all possible. This would help your approach in that you wouldn't have to use .GetAwaiter().GetResult(), which is typically a code smell.
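A minimal sketch of what the job could look like on Quartz.NET 3.0, assuming the same _scriptService and _logProvider fields as in the question (the class name here is hypothetical):
public class ScriptChangedNotificationJob : IJob
{
    // In Quartz.NET 3.0, Execute returns a Task, so the service call
    // can be awaited directly instead of blocking with GetResult()
    public async Task Execute(IJobExecutionContext context)
    {
        try
        {
            var scripts = await _scriptService.GetScripts();
        }
        catch (Exception ex)
        {
            _logProvider.Error("Error while executing Script Changed Notification job : " + ex);
        }
    }
}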
How can I use await in a for loop?
Did you mean a foreach loop? If so, you're already doing that. If not, the change isn't anything earth-shattering.
for (int i = 0; i < channelsToProcess.Count; ++i)
{
    var changedScripts =
        await _scriptRepository.GetChangedScriptAsync(queryString);
    if (changedScripts.Any())
    {
        var channel = channelsToProcess[i];
        result.Add(new ChangedScriptsByChannel()
        {
            ChannelCode = channel.ChannelCode,
            ChangedScripts = changedScripts
        });
    }
}
Doing these in either a for or a foreach loop, though, executes the calls serially. Another approach is to use LINQ's .Select to map out the desired tasks and then use Task.WhenAll.
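A hedged sketch of that concurrent version, under the assumption that queryString is built the same way as in the question's code (it isn't defined in the posted snippet):
// Start one task per channel, then await them all; the queries run concurrently
var tasks = channelsToProcess.Select(async channel => new ChangedScriptsByChannel
{
    ChannelCode = channel.ChannelCode,
    ChangedScripts = await _scriptRepository.GetChangedScriptAsync(queryString)
}).ToList();

var perChannel = await Task.WhenAll(tasks);
// Keep only channels that actually had changes
result.AddRange(perChannel.Where(c => c.ChangedScripts.Any()));
Note that this issues all repository calls at once, so make sure the repository and any underlying HTTP client can handle that degree of concurrency.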
I am working on a GA for a project: I am trying to solve the Travelling Salesman Problem using a GA. I used arrays to store data, since I think arrays are much faster than List. But for some reason it takes too much time, e.g. with MaxPopulation = 100000 and StartPopulation = 1000 the program takes about 1 min to complete. I want to know if this is a problem, and if it is, how can I fix it?
A code excerpt from my implementation:
public void StartAsync()
{
    Task.Run(() =>
    {
        CreatePopulation();
        currentPopSize = startPopNumber;
        while (currentPopSize < maxPopNumber)
        {
            Tour[] elits = ElitChromosoms();
            for (int i = 0; i < maxCrossingOver; i++)
            {
                if (currentPopSize >= maxPopNumber)
                    break;
                int x = rnd.Next(elits.Length - 1);
                int y = rnd.Next(elits.Length - 1);
                Tour parent1 = elits[x];
                Tour parent2 = elits[y];
                Tour child = CrossingOver(parent1, parent2);
                int mut = rnd.Next(100);
                if (mutPosibility >= mut)
                {
                    child = Mutation(child);
                }
                population[currentPopSize] = child;
                currentPopSize++;
            }
            progress = currentPopSize * 100 / population.Length;
            this.Progress = progress;
            GC.Collect();
        }
        if (GACompleted != null)
            GACompleted(this, EventArgs.Empty);
    });
}
Here, "elits" are the chromosomes whose fitness is greater than the average fitness of the population.
Scientific papers suggest smaller populations; maybe you should follow what other authors have written, since having a big population does not give you any advantage. TSP can be solved by a GA, but maybe it is not the most efficient approach to this problem. Look at this visual representation of TSP-GA: http://www.obitko.com/tutorials/genetic-algorithms/tsp-example.php
OK, I have just found a solution. Instead of using an array with a size of maxPopulation, replace the old chromosomes that have bad fitness with the new generations. Now I am working with a smaller array, which has a length of 10,000; the length was 1,000,000 before, and it was taking too much time. Now, in every iteration, I select the best 1,000 chromosomes, create new chromosomes using these as parents, and replace the old, bad ones. This works perfectly.
Code sample:
public void StartAsync()
{
    CreatePopulation(); // Creates the starting chromosomes
    currentProducedPopSize = popNumber; // Produced chromosome count; starts at the size of the starting population
    while (currentProducedPopSize < maxPopNumber && !stopped)
    {
        Tour[] elits = ElitChromosoms(); // Gets the best 1000 chromosomes
        Array.Reverse(population); // Orders by descending
        this.Best = elits[0];

        // Create as many new chromosomes as there are bad chromosomes
        for (int i = 0; i < population.Length - elits.Length; i++)
        {
            if (currentProducedPopSize >= maxPopNumber || stopped)
                break;
            int x = rnd.Next(elits.Length - 1);
            int y = rnd.Next(elits.Length - 1);
            Tour parent1 = elits[x];
            Tour parent2 = elits[y];
            Tour child = CrossingOver(parent1, parent2);
            int mut = rnd.Next(100);
            if (mutPosibility <= mut)
            {
                child = Mutation(child);
            }
            population[i] = child; // Replace an old chromosome with the new one
            currentProducedPopSize++; // Increase the produced-chromosome count
        }
        progress = currentProducedPopSize * 100 / maxPopNumber;
        this.Progress = progress;
        GC.Collect();
    }
    stopped = false;
    this.Best = population[population.Length - 1];
    if (GACompleted != null)
        GACompleted(this, EventArgs.Empty);
}

Tour[] ElitChromosoms()
{
    Array.Sort(population);
    Tour[] elits = new Tour[popNumber / 10];
    Array.Copy(population, elits, elits.Length);
    return elits;
}
Here is what I am trying to achieve :
On the service bus I have a topic which contains 5005 messages.
I need to peek all the messages without completing them and add them to a list (List<BrokeredMessage>)
Here is what I am trying :
IEnumerable<BrokeredMessage> dlIE = null;
List<BrokeredMessage> bmList = new List<BrokeredMessage>();
long i = 0;
while (i < count) // count is the total number of messages in the subscription
{
    dlIE = deadLetterClient.ReceiveBatch(100);
    bmList.AddRange(dlIE);
    i = i + dlIE.Count();
}
In the above code I can only fetch 100 messages at a time since there is a batch size limit to retrieving the messages.
I have also tried to do this asynchronously, but it always returns 0 messages in the list. This is the code for that:
static List<BrokeredMessage> messageList = new List<BrokeredMessage>();

long i = 0;
while (i < count)
{
    var task = ReceiveMessagesBatchForSubscription(deadLetterClient);
    i = i + 100;
}
Task.WaitAny();

public async static Task ReceiveMessagesBatchForSubscription(SubscriptionClient deadLetterClient)
{
    while (true)
    {
        var receivedMessage = await deadLetterClient.ReceiveBatchAsync(100);
        messageList.AddRange(receivedMessage);
    }
}
Can anyone please suggest a better way to do this?
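One possible approach, sketched under the assumption that the WindowsAzure.ServiceBus SDK is in use and that the goal really is to peek rather than receive: SubscriptionClient exposes PeekBatchAsync, which reads messages without locking or completing them, and awaiting each batch before requesting the next keeps the list fully populated before the method returns.
public static async Task<List<BrokeredMessage>> PeekAllMessagesAsync(
    SubscriptionClient client, long count)
{
    var messages = new List<BrokeredMessage>();
    while (messages.Count < count)
    {
        // Peek advances an internal cursor, so each call returns the next batch
        var batch = (await client.PeekBatchAsync(100)).ToList();
        if (batch.Count == 0)
            break; // nothing left to peek
        messages.AddRange(batch);
    }
    return messages;
}
The 100-per-call batch size limit still applies; the loop simply awaits each batch in turn instead of firing untracked tasks, which is why the earlier attempt saw an empty list.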