Controlling the number of threads when using C# 5 async/await - multithreading

I am looking into using the new async/await keywords in C# 5, and reading this article I see the following example:
async void ArchiveDocuments(List<Url> urls)
{
    Task archive = null;
    for (int i = 0; i < urls.Count; ++i)
    {
        var document = await FetchAsync(urls[i]);
        if (archive != null)
            await archive;
        archive = ArchiveAsync(document);
    }
}
Presumably if the urls list is VERY long, we might get into a situation where the thread count gets out of control.
What I'd like to know is the recommended way to control the number of threads used. Is there a way to specify a thread pool or a maximum number?
With the TPL you can use ParallelOptions.MaxDegreeOfParallelism to control the maximum number of threads. Perhaps some way of combining await and Task might be possible.
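For reference, the TPL option I mean looks roughly like this; ProcessUrl here is a hypothetical synchronous worker, and the cap of 4 is arbitrary:
using System.Threading.Tasks;

var options = new ParallelOptions { MaxDegreeOfParallelism = 4 };
Parallel.ForEach(urls, options, url =>
{
    // Each body runs on a thread-pool thread; at most 4 run at a time.
    ProcessUrl(url); // hypothetical synchronous worker
});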

The point of that example is to show a "pipeline" pattern where there is at most one Fetch and one Archive executing concurrently.
async does not mean concurrent; it means non-blocking.

The async and await keywords do not create any threads themselves; that's entirely up to the method that produces the task.
In fact, operations like file-system I/O or network connections usually don't need any threads to run asynchronously. The hardware does what the hardware does, and when it's done the related event is fired (back in C#).
If you want to control the number of threads, you need to change the method that creates the threads - it has nothing to do with async and await.
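That said, if what you actually want to limit is the number of concurrent asynchronous operations (rather than threads), a common pattern is SemaphoreSlim.WaitAsync. A minimal sketch reusing FetchAsync/ArchiveAsync from the question; the cap of 4 is an arbitrary assumption:
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

async Task ArchiveDocumentsAsync(List<Url> urls)
{
    var throttle = new SemaphoreSlim(4); // at most 4 fetch/archive pairs in flight

    var tasks = urls.Select(async url =>
    {
        await throttle.WaitAsync();
        try
        {
            var document = await FetchAsync(url); // FetchAsync/ArchiveAsync as in the question
            await ArchiveAsync(document);
        }
        finally
        {
            throttle.Release();
        }
    }).ToList();

    await Task.WhenAll(tasks);
}
Note this changes the example's semantics: items may now fetch and archive out of order, which is usually the point of raising the concurrency cap.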

Related

What would be the right way to go for my scenario: thread array, thread pool, or tasks?

I am working on a small microfinance application that processes financial transactions. The frequency of these transactions is quite high, which is why I am planning to make it a multi-threaded application that can process multiple transactions in parallel.
I have already designed all the workers to be thread safe.
What I need help with is how to manage these threads. Here are some of my options:
1. Make a specified number of thread-pool threads at startup and keep them running in an infinite loop, where they keep looking for new transactions and start processing if any are found.
example code:
void Start_Job()
{
    for (int l_ThreadId = 0; l_ThreadId < PaymentNoOfWorkerThread; l_ThreadId++)
    {
        ThreadPool.QueueUserWorkItem(Execute, (object)l_ThreadId);
    }
}

void Execute(object l_TrackingId)
{
    while (true)
    {
        var new_txns = Get_New_Txns(); // get new txns if any; returns a queue
        while (new_txns.Count > 0)
        {
            process_txn(new_txns.Dequeue());
        }
        Thread.Sleep(some_time);
    }
}
2. Look for new transactions and assign a thread-pool thread to each transaction (my understanding is that these threads would be reused for new txns after their execution completes).
example code:
void Start_Job()
{
    while (true)
    {
        var new_txns = Get_New_Txns(); // get new txns if any; returns a queue
        for (int l_ThreadId = 0; l_ThreadId < new_txns.Count; l_ThreadId++)
        {
            ThreadPool.QueueUserWorkItem(Execute, (object)new_txns.Dequeue());
        }
        Thread.Sleep(some_time);
    }
}

void Execute(object txn)
{
    process_txn(txn);
}
3. Do the above, but with tasks.
Which option would be most efficient and well suited for my application?
Thanks in advance :)
ThreadPool.QueueUserWorkItem is an older API and you shouldn't be using it directly anymore. Tasks are the way to go, and the thread pool is managed automatically for you.
What may suit your application depends on what happens in process_txn and is subjective, so this is a very generic guideline:
If process_txn is a compute-bound operation, for example it performs only CPU-bound calculations, then look at the Task Parallel Library. It will help you use the CPU cores more efficiently.
If process_txn is less CPU- and more IO-bound, meaning it reads/writes files or a database or connects to some other remote service, then what you should look at is asynchronous programming: make sure your IO operations are all asynchronous, which means your threads are never blocked on IO. This will help your service be more scalable. Also, depending on what your queue is, see if you can await on the queue asynchronously, so that none of your application threads are blocked just waiting on the queue; a sketch of this follows below.
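As a sketch of that IO-bound route, assuming a hypothetical Txn type and ProcessTxnAsync worker, System.Threading.Channels gives you a queue you can await on:
using System.Threading.Channels;
using System.Threading.Tasks;

class Txn { } // hypothetical transaction type

class TxnProcessor
{
    // Bounded channel: producers await when it is full and the consumer
    // awaits when it is empty - no thread is ever blocked on the queue.
    private readonly Channel<Txn> _queue = Channel.CreateBounded<Txn>(1000);

    public async Task EnqueueAsync(Txn txn) =>
        await _queue.Writer.WriteAsync(txn);

    public async Task ConsumeAsync()
    {
        await foreach (var txn in _queue.Reader.ReadAllAsync())
        {
            await ProcessTxnAsync(txn); // assumed async version of process_txn
        }
    }

    private Task ProcessTxnAsync(Txn txn) => Task.CompletedTask; // placeholder
}
Starting a fixed number of ConsumeAsync loops at startup gives you option 1's shape without any sleeping or blocked threads.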

Azure Durable Functions fan-out scale limit?

I have a durable function orchestrator that fans out into multiple activity functions to handle some workload. The following code is an example where Function_2 is the one that fans out to handle the workload:
public static async Task Run(DurableOrchestrationContext ctx)
{
    // get a list of N work items to process in parallel
    object[] workBatch = await ctx.CallActivityAsync<object[]>("Function_1");
    var parallelTasks = new List<Task<int>>();
    for (int i = 0; i < workBatch.Length; i++)
    {
        Task<int> task = ctx.CallActivityAsync<int>("Function_2", workBatch[i]);
        parallelTasks.Add(task);
    }
    // How many instances of Function_2 will handle the workload?
    await Task.WhenAll(parallelTasks);
    // aggregate all N outputs and send the result to Function_3
    int sum = parallelTasks.Sum(t => t.Result);
    await ctx.CallActivityAsync("Function_3", sum);
}
My question is how many instances of Function_2 will be spawned to handle the work. I know that it depends on the number of tasks, so let's say I have 5000 tasks. I doubt it would spawn 5000 instances, but what is the upper limit, and can I control it? I read through the documentation multiple times but was unable to find information on this subject. I know that by definition I should not care, as it is handled for me; however, my tasks can overload a backing resource they all depend upon.
Behind the scenes, each CallActivityAsync call becomes a message in a Storage queue, so 5000 messages in your example. The messages will then be consumed by the Function App.
It will run multiple invocations in parallel, but of course not all at the same time. You won't see the exact numbers anywhere in the docs, since they are defined by internal scale-controller logic. They will also depend on the duration of each activity call, its CPU usage, etc.
The results may change over time too.
So your mileage may vary, and you should test your scenario.
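If the worry is the backing resource, one option is to throttle from the orchestrator itself by fanning out in fixed-size waves; the batch size of 100 below is an arbitrary assumption. (Per-host concurrency can also be capped with the durableTask maxConcurrentActivityFunctions setting in host.json, though that limits each host rather than the total.)
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static async Task RunBatched(DurableOrchestrationContext ctx)
{
    object[] workBatch = await ctx.CallActivityAsync<object[]>("Function_1");

    const int batchSize = 100; // assumed cap; size it to what the resource tolerates
    var results = new List<int>();

    for (int offset = 0; offset < workBatch.Length; offset += batchSize)
    {
        // Schedule one wave of activities and drain it before starting the next.
        var wave = workBatch
            .Skip(offset)
            .Take(batchSize)
            .Select(item => ctx.CallActivityAsync<int>("Function_2", item))
            .ToList();

        await Task.WhenAll(wave);
        results.AddRange(wave.Select(t => t.Result));
    }

    await ctx.CallActivityAsync("Function_3", results.Sum());
}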

Parallel requests at different paths in NodeJS: long-running path 1 is blocking other paths

I am trying out a simple NodeJS app to understand its async nature.
My problem is that as soon as I hit "/home" from the browser it waits for the response, and when "/" is hit at the same time, it waits for "/home"'s response first and only then responds to the "/" request.
My concern is that if one request needs heavy processing, we can't serve another request in parallel. Is this correct?
app.get("/", function(request, response) {
console.log("/ invoked");
response.writeHead(200, {'Content-Type' : 'text/plain'});
response.write('Logged in! Welcome!');
response.end();
});
app.get("/home", function(request, response) {
console.log("/home invoked");
var obj = {
"fname" : "Dead",
"lname" : "Pool"
}
for (var i = 0; i < 999999999; i++) {
for (var i = 0; i < 2; i++) {
// BS
};
};
response.writeHead(200, {'Content-Type' : 'application/json'});
response.write(JSON.stringify(obj));
response.end();
});
Good question.
Now, although Node.js has its asynchronous nature, this piece of code:
for (var i = 0; i < 999999999; i++) {
    for (var j = 0; j < 2; j++) {
        // BS
    }
}
is not asynchronous; it actually blocks the Node main thread. Therefore, all other requests have to wait until this big for loop ends.
In order to do some heavy calculation without blocking other requests, I recommend using setTimeout or setInterval to achieve your goal:
var i = 0;
var interval = setInterval(function() {
    if (i++ >= 999999999) {
        clearInterval(interval);
    }
    // do stuff here
}, 5);
For more information I recommend searching for "Node.js event loop".
As Stasel stated, code like this will block the event loop. Basically, whenever JavaScript is running on the server, nothing else runs. Asynchronous I/O events such as disk I/O might be processing in the background, but their handlers/callbacks won't be called until your synchronous code has finished running. As soon as it finishes, Node checks for pending events and calls their handlers respectively.
You actually have a couple of choices to fix this problem.
Break the work into pieces and let the pending events be executed in between. This is almost the same as Stasel's recommendation, except that 5 ms between single iterations is huge; for something like 999999999 items, that takes forever. I suggest batch-processing the loop for some amount of time, then scheduling the next batch with setImmediate. setImmediate schedules the callback after pending I/O events are handled, so if there is no new I/O event to be handled (like no new HTTP requests) it executes almost immediately. Now the question is how much processing we should do per batch. I suggest first measuring manually how long the work takes on average, and then scheduling about 50 ms of work per batch. For example, if you find that 1000 items take 100 ms, let each batch process 500 items, so each batch takes about 50 ms. You can break it down further, but the more broken down, the more total time it takes, so be careful. Also, since you are processing a huge number of items, try not to create too much garbage, so the garbage collector won't block much. In this not-so-similar question, I've explained how to insert 10000 documents into MongoDB without blocking the event loop.
Use threads. There are actually a couple of nice thread implementations that won't let you shoot yourself in the foot. This is really a good idea for this case if you are looking for performance on huge processing jobs, since, as I said above, it is tricky to make a CPU-bound task play nice with everything else happening in the same process; asynchronous events are perfect for data-bound tasks, not CPU-bound tasks. There's the nodejs-threads-a-gogo module you can use. You can also use node-webworker-threads, which is built on threads-a-gogo but exposes the WebWorker API. There's also nPool, which looks a bit nicer but is less popular. They all support thread pools, and it should be straightforward to implement a work queue with them.
Use several processes instead of threads. This might be slower than threads, but for huge jobs it is still way better than iterating in the main process. There are different ways. Using processes gives you a design that you can extend to multiple machines instead of just multiple CPUs. You can use a job queue (basically, pull the next job from the queue whenever a task finishes), a multi-process map-reduce or AWS Elastic MapReduce, or the Node.js cluster module. Using the cluster module, you can have each worker listen on a Unix domain socket and, for each job, just make a request to that socket; whenever the worker finishes processing the job, it writes back to that particular request. You can search for this; there are many existing implementations and modules. You can use 0MQ, RabbitMQ, Node's built-in IPC, Unix domain sockets, or a Redis queue for multi-process communication.

Trying to batch AddMessage to an Azure Queue

I've got about 50K messages I wish to add to an Azure queue.
I'm not sure if the code I have is safe. It feels/smells bad.
Basically, given a collection of POCOs, serialize each POCO to some JSON, then add that JSON text to the queue.
public void AddMessage(T content)
{
    content.ShouldNotBe(null);
    var json = JsonConvert.SerializeObject(content);
    var message = new CloudQueueMessage(json);
    Queue.AddMessage(message);
}

public void AddMessages(ICollection<T> contents)
{
    contents.ShouldNotBe(null);
    Parallel.ForEach(contents, AddMessage);
}
Can someone tell me what I should be doing to fix this up -- and most importantly, why?
I feel that the Queue might not be thread safe in this scenario.
A few things I have observed regarding Parallel.ForEach and dealing with Azure Storage (my experience has been with uploading blobs/blocks in parallel):
Azure Storage operations are network (IO) based operations, not processor-intensive operations. If I am not mistaken, Parallel.ForEach is more suitable for processor-intensive workloads.
Another thing we noticed when uploading a large number of blobs (or blocks) using Parallel.ForEach is that we started to get a lot of timeout exceptions, which actually slowed down the entire operation. I believe the reason is that when you iterate over a collection with a large number of items this way, you're essentially handing control to the underlying framework, which decides how to deal with that collection. In this case, a lot of context switching takes place, which slows down the operation. I'm not sure how this would play out in your scenario, considering the payload is smaller.
My recommendation would be to have the application control the number of parallel threads it can spawn. A good criterion would be the number of logical processors. Another criterion would be the number of ports IE can open. So you would spawn that many parallel threads, and then either wait for all threads to finish before spawning the next set, or start a new thread as soon as one task finishes.
Pseudo Code:
ICollection<string> messageContents;

private void AddMessages()
{
    int maxParallelThreads = Math.Min(Environment.ProcessorCount, messageContents.Count);
    if (maxParallelThreads > 0)
    {
        // Materialize the batch so each task captures its own item,
        // not a shared loop index.
        var itemsToAdd = messageContents.Take(maxParallelThreads).ToList();
        List<Task> tasks = new List<Task>();
        foreach (var item in itemsToAdd)
        {
            tasks.Add(Task.Factory.StartNew(() =>
            {
                AddMessage(item);
                RemoveItemFromCollection(item);
            }));
        }
        Task.WaitAll(tasks.ToArray());
        AddMessages();
    }
}
Your code looks fine to me at a high level. Gaurav's additions make sense, giving you more control over the parallel processing of your requests. Make sure you add some form of retry logic, and perhaps set DefaultConnectionLimit to something greater than its default value (which is 2). You may also consider adding multiple Azure queues across multiple storage accounts if you hit some form of throttling, depending on the type of errors you are getting.
For anyone looking to add a large number of non-POCO/string messages to a queue in bulk/batch, an alternate/better solution would be to add the list of messages as a single message or blob, and then in a queue/blob trigger traverse & add each message to a [separate] queue.
var maxDegreeOfParallelism = Math.Min(Environment.ProcessorCount, cloudQueueMessageCollection.Count());
var parallelOptions = new ParallelOptions { MaxDegreeOfParallelism = maxDegreeOfParallelism };

// Note: Parallel.ForEach does not await async lambdas (they compile to
// async void and run fire-and-forget), so block on the task per item.
Parallel.ForEach(cloudQueueMessageCollection, parallelOptions,
    m => AddMessageAsync(queue, connectionStringOrKey, m).GetAwaiter().GetResult());

GPars report status on large number of async functions and wait for completion

I have a parser, and after gathering the data for a row, I want to fire an async function and let it process the row while the main thread continues on and gets the next row.
I've seen this post: How do I execute two tasks simultaneously and wait for the results in Groovy? but I'm not sure it is the best solution for my situation.
What I want to do is, after all the rows are read, wait for all the async functions to finish before I go on. One concern with using a collection of Promises is that the list could be large (100,000+).
Also, I want to report status as we go. And finally, I'm not sure I want to automatically wait for a timeout (like on a get()), because the file could be huge; however, I do want to allow the user to kill the process for various reasons.
So what I've done for now is record the number of rows parsed (as they occur, via rowsRead), then use a callback from the Promise to record another row finishing processing, like this:
def promise = processRow(row)
promise.whenBound {
    rowsProcessed.incrementAndGet()
}
Where rowsProcessed is an AtomicInteger.
Then in the code invoked at the end of the sheet, after all parsing is done and I'm waiting for the processing to finish, I'm doing this:
boolean test = true
while (test) {
    Thread.sleep(1000) // No need to pound the CPU with this check
    println "read: ${sheet.rowsRead}, processed: ${sheet.rowsProcessed.get()}"
    if (sheet.rowsProcessed.get() == sheet.rowsRead) {
        test = false
    }
}
The nice thing is, I don't have an explosion of Promise objects here - just a simple count to check. But I'm not sure sleeping every so often is as efficient as checking get() on each Promise object.
So, my questions are:
If I used the collection of Promises instead, would a get() react and return if the thread executing the while loop above was interrupted with Thread.interrupt()?
Would using the collection of Promises and calling get() on each be more efficient than trying to sleep and check every so often?
Is there another, better approach that I haven't considered?
Thanks!
A call to allPromises*.get() will throw InterruptedException if the waiting (main) thread gets interrupted.
Yes; the promises have been created anyway, so grouping them in a list should not impose additional memory requirements, in my opinion.
The suggested solutions with a CountDownLatch or a Phaser are, IMO, much more suitable than busy waiting.
An alternative to an AtomicInteger is to use a CountDownLatch. It avoids both the sleep and the large collection of Promise objects. You could use it like this:
latch = new CountDownLatch(sheet.rowsRead)
...
def promise = processRow(row)
promise.whenBound {
    latch.countDown()
}
...
while (!latch.await(1, TimeUnit.SECONDS)) {
    println "read: ${sheet.rowsRead}, processed: ${sheet.rowsRead - latch.count}"
}
