Run multiple tasks at the same time using TAP - multithreading

My code is supposed to start sorting 3 different lists simultaneously, each with a different method, and return the first one to finish. However, it always returns the result of the first task in the list instead. How can I fix that?
Below is part of my code which seemed relevant to show.
static List<Task<List<int>>> listoftasks = new List<Task<List<int>>>() { QuickSortAsync(list1), BubbleSortAsync(list2), SelectionSortAsync(list3) };
public async static void caller()
{
List<int> result = await Task.WhenAny(listoftasks).Result;
foreach (var item in result)
Console.Write(item + ", ");
}
static Task<List<int>> QuickSortAsync(List<int> l)
{
return Task.Run<List<int>>(() =>
{
l.Sort();
return l;
});
}

Since your list of tasks is static, you're starting all three tasks very early. Then, when you call WhenAny, it's likely that they've already all completed.
I suggest you start the tasks when you call WhenAny:
public static async Task CallerAsync()
{
List<int> result = await await Task.WhenAny(QuickSortAsync(list1),
BubbleSortAsync(list2), SelectionSortAsync(list3));
foreach (var item in result)
Console.Write(item + ", ");
}

For await x of y using an AsyncIterator causes memory leak

When using an AsyncIterator in a for-await-of loop, I get a substantial memory leak.
I need this when scraping an HTML page which includes the information about the next HTML page to be scraped:
Scrape Data
Evaluate Data
Scrape Next Data
The async part is needed since axios is used to obtain the HTML.
Here is a repro which shows the memory rising from ~4MB to ~25MB by the end of the script. The memory is not freed until the program terminates.
const scraper = async ():Promise<void> => {
let browser = new BrowserTest();
let parser = new ParserTest();
for await (const data of browser){
console.log(await parser.parse(data))
}
}
class BrowserTest {
private i: number = 0;
public async next(): Promise<IteratorResult<string>> {
this.i += 1;
return {
done: this.i > 1000,
value: 'peter '.repeat(this.i)
}
}
[Symbol.asyncIterator](): AsyncIterator<string> {
return this;
}
}
class ParserTest {
public async parse(data: string): Promise<string[]> {
return data.split(' ');
}
}
scraper()
It looks like the data of the for-await-of loop is left dangling in memory. The call stack gets huge as well.
In the repro the problem is still manageable. But in my actual code a whole HTML page of ~250kb stays in memory on each call.
In this screenshot you can see the heap memory on the first iteration compared to the heap memory after the last iteration
Cannot post inline Screenshots yet
The expected workflow would be the following:
Obtain Data
Process Data
Extract Info for the next "Obtain Data"
Free all Memory from the last "Obtain Data"
Use extracted information to restart the loop with new Data obtained.
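The workflow listed above can be sketched with an async generator instead of a hand-written iterator class. This is only an illustration under assumptions: `Page`, `fetchPage`, `pages` and `collect` are hypothetical names, and `fetchPage` is a stand-in for the real axios call. It shows the shape of the loop (only the cursor is carried over between iterations); it is not a claim that this form avoids the memory growth described here.

```typescript
// Hypothetical page shape: the payload plus a cursor pointing at the next page.
interface Page {
  html: string;
  next: string | null;
}

// Stand-in for the real HTTP request (axios in the original); serves three fake pages.
async function fetchPage(cursor: string): Promise<Page> {
  const n = Number(cursor);
  return { html: `page ${n}`, next: n < 3 ? String(n + 1) : null };
}

// Obtain a page, extract the cursor for the next one, then hand the payload out.
// Only `cursor` is carried from one pass to the next; `page` is a fresh binding each time.
async function* pages(start: string): AsyncGenerator<string, void, undefined> {
  let cursor: string | null = start;
  while (cursor !== null) {
    const page: Page = await fetchPage(cursor);
    cursor = page.next;
    yield page.html;
  }
}

// Consumer: the "process data" step is reduced to an upper-case conversion.
async function collect(start: string): Promise<string[]> {
  const seen: string[] = [];
  for await (const html of pages(start)) {
    seen.push(html.toUpperCase());
  }
  return seen;
}
```

Calling `collect("1")` walks the three fake pages in order and stops when `next` is `null`.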
I am unsure whether an AsyncIterator is the right choice to achieve what is needed here.
Any help or hint would be appreciated!
In Short
When using an AsyncIterator the Memory is rising drastically. It drops once the Iteration is done.
The x in `for await (x of y)` is not freed until the iteration is done. Also, every Promise awaited inside the for loop is not freed.
I came to the conclusion that the Garbage Collector cannot catch the contents of Iteration, since the Promises generated by the AsyncIterator will only fully resolve once the Iteration is done.
I think this might be a Bug.
Workaround Repro
As a workaround to free the contents of the parser result, we encapsulate the result in a lightweight container. We then free the contents, so only the container itself remains in memory.
The data object cannot be freed even if you use the same technique to encapsulate it; at least that is how it looks when debugging.
const scraper = async ():Promise<void> => {
let browser = new BrowserTest();
for await (const data of browser){
let parser = new ParserTest();
let result = await parser.parse(data);
console.log(result);
/**
* This avoids memory leaks, due to a garbage collector bug
* of async iterators in js
*/
result.free();
}
}
class BrowserTest {
private i: number = 0;
private value: string = "";
public async next(): Promise<IteratorResult<string>> {
this.i += 1;
this.value = 'peter '.repeat(this.i);
return {
done: this.i > 1000,
value: this.value
}
}
public [Symbol.asyncIterator](): AsyncIterator<string> {
return this;
}
}
/**
* Result class for wrapping the result of the parser.
*/
class Result {
private result: string[] = [];
constructor(result: string[]){
this.setResult(result);
}
public setResult(result: string[]) {
this.result = result;
}
public getResult(): string[] {
return this.result;
}
public free(): void {
delete this.result;
}
}
class ParserTest {
public async parse(data: string): Promise<Result>{
let result = data.split(' ');
return new Result(result);
}
}
scraper()
Workaround in actual context
What is not shown in the repro solution is that we also try to free the result of the iteration itself. This does not seem to have any effect, though.
public static async scrape<D,M>(scraper: IScraper<D,M>, callback: (data: DataPackage<Object,Object> | null) => Promise<void>) {
let browser = scraper.getBrowser();
let parser = scraper.getParser();
for await (const parserFragment of browser) {
const fragment = await parserFragment;
const json = await parser.parse(fragment);
await callback(json);
json.free();
fragment.free();
}
}
See: https://github.com/demokratie-live/scapacra/blob/master/src/Scraper.ts
To test with an actual Application: https://github.com/demokratie-live/scapacra-bt (yarn dev ConferenceWeekDetail)
References
Github NodeJs: https://github.com/nodejs/node/issues/30298
Github DEMOCRACY: https://github.com/demokratie-live/democracy-client/issues/926
Conclusion
We found a feasible solution for our case, so I am closing this issue. The follow-up is directed at the Node.js repository so that this potential bug can be fixed:
https://github.com/nodejs/node/issues/30298

How to concatWith using information from previous Observable for pagination

Let's say I have a blocking method called List<UUID> listOf(int page).
If I want to paginate something like this, one idea is to do something like this:
public Observable<UUID> allOf(int initialPage) {
return fromCallable( () -> listOf(initialPage))
.concatWith( fromCallable( () -> allOf(initialPage + 1)))
.flatMap(x -> from(x));
}
If my service doesn't use the page number but the last element of the list to find next elements, how can I achieve it with RxJava?
I would still like to obtain the effect of doing something like allOf(0).take(20) and obtain, with concatWith, the call to the second Observable when the first one has completed.
But how can I do it when I need information from the previous call?
You could use a subject to send back the next page number to the beginning of a sequence:
List<Integer> service(int index) {
System.out.println("Reading " + index);
List<Integer> list = new ArrayList<>();
for (int i = index; i < index + 20; i++) {
list.add(i);
}
return list;
}
Flowable<List<Integer>> getPage(int index) {
FlowableProcessor<Integer> pager = UnicastProcessor.<Integer>create()
.toSerialized();
pager.onNext(index);
return pager.observeOn(Schedulers.trampoline(), true, 1)
.map(v -> {
List<Integer> list = service(v);
pager.onNext(list.get(list.size() - 1) + 1);
return list;
})
;
}
@Test
public void testPager() {
getPage(0).take(20)
.subscribe(System.out::println, Throwable::printStackTrace);
}

C# how to call async await in a for loop

I am developing a Quartz.NET job which runs every hour. It executes the following method, in which I call a Web API inside a loop. I want to make sure I return from the GetScripts() method only after all the calls have completed. How do I do this, or have I already done it right?
Job
public void Execute(IJobExecutionContext context)
{
try
{
var scripts = _scriptService.GetScripts().GetAwaiter().GetResult();
}
catch (Exception ex)
{
_logProvider.Error("Error while executing Script Changed Notification job : " + ex);
}
}
Service method:
public async Task<IEnumerable<ChangedScriptsByChannel>> GetScripts()
{
var result = new List<ChangedScriptsByChannel>();
var currentTime = _systemClock.CurrentTime;
var channelsToProcess = _lastRunReader.GetChannelsToProcess().ToList();
if (!channelsToProcess.Any()) return result;
foreach (var channel in channelsToProcess)
{
var changedScriptsList = await _scriptRepository.GetChangedScriptAsync(queryString);
if (changedScriptsList.Any())
{
result.Add(new ChangedScriptsByChannel()
{
ChannelCode = channel.ChannelCode,
ChangedScripts = changedScriptsList
});
}
}
return result;
}
As of 8 days ago there was a formal announcement from the Quartz.NET team stating that the latest version, 3.0 Alpha 1, has full support for async and await. I would suggest upgrading to that if at all possible. It would help your approach in that you would no longer need the .GetAwaiter().GetResult() call, which is typically a code smell.
How can I use await in a for loop?
Did you mean a foreach loop? If so, you're already doing that. If not, the change isn't anything earth-shattering.
for (int i = 0; i < channelsToProcess.Count; ++i)
{
var changedScriptsList =
await _scriptRepository.GetChangedScriptAsync(queryString);
if (changedScriptsList.Any())
{
var channel = channelsToProcess[i];
result.Add(new ChangedScriptsByChannel()
{
ChannelCode = channel.ChannelCode,
ChangedScripts = changedScriptsList
});
}
}
Note that awaiting inside either a for or a foreach loop runs the calls one after another: each call starts only once the previous one has completed. Another approach would be to use LINQ's .Select to create one task per channel and then await them all together with Task.WhenAll.

Let the tasks race

Suppose I have a BlockingCollection, OutputQueue, which has many items. Currently my code is:
public void Consumer()
{
foreach (var workItem in OutputQueue.GetConsumingEnumerable())
{
PlayMessage(workItem);
Console.WriteLine("Works on {0}", workItem.TaskID);
OutLog.Write("Works on {0}", workItem.TaskID);
Thread.Sleep(500);
}
}
Now I want PlayMessage(workItem) to run as multiple parallel tasks, because some work items need much more time than others. The difference is huge.
As for the method PlayMessage(workItem), it has a few service calls, play text to speech and some logging.
bool successRouting = serviceCollection.SvcCall_GetRoutingData(string[] params, out ex);
bool successDialingService = serviceCollection.SvcCall_GetDialingServiceData(string[] params, out excep);
PlayTTS(workItem.TaskType); // playing text to speech
So how to change my code?
What I thought was:
public async Task Consumer()
{
foreach (var workItem in OutputQueue.GetConsumingEnumerable())
{
await PlayMessage(workItem);
Console.WriteLine("Works on {0}", workItem.TaskID);
OutLog.Write("Works on {0}", workItem.TaskID);
Thread.Sleep(500);
}
}
Since you want parallelism for PlayMessage, I would suggest looking into TPL Dataflow, as it combines parallel work with async, so you can await your work properly.
TPL Dataflow is constructed of Blocks, and each block has its own characteristics.
Some popular ones are:
ActionBlock<TInput>
TransformBlock<T, TResult>
I would construct something like the following:
var workItemBlock = new ActionBlock<WorkItem>(
workItem =>
{
PlayMessage(workItem);
Console.WriteLine("Works on {0}", workItem.TaskID);
OutLog.Write("Works on {0}", workItem.TaskID);
}, new ExecutionDataflowBlockOptions
{
MaxDegreeOfParallelism = Environment.ProcessorCount // example value; set max parallelism as you wish
});
foreach (var workItem in OutputQueue.GetConsumingEnumerable())
{
workItemBlock.Post(workItem);
}
workItemBlock.Complete();
Here's another solution, not based on TPL Dataflow. It uses SemaphoreSlim to throttle the number of parallel playbacks (warning, untested):
public async Task Consumer()
{
var semaphore = new SemaphoreSlim(NUMBER_OF_PORTS);
var pendingTasks = new HashSet<Task>();
var syncLock = new Object();
Action<Task> queueTaskAsync = async(task) =>
{
// be careful with exceptions inside "async void" methods
// keep failed/cancelled tasks in the list
// they will be observed outside
lock (syncLock)
pendingTasks.Add(task);
await semaphore.WaitAsync().ConfigureAwait(false);
try
{
await task;
}
catch
{
if (!task.IsCancelled && !task.IsFaulted)
throw;
// the error will be observed later,
// keep the task in the list
return;
}
finally
{
semaphore.Release();
}
// remove successfully completed task from the list
lock (syncLock)
pendingTasks.Remove(task);
};
foreach (var workItem in OutputQueue.GetConsumingEnumerable())
{
var item = workItem;
Func<Task> workAsync = async () =>
{
await PlayMessage(item);
Console.WriteLine("Works on {0}", item.TaskID);
OutLog.Write("Works on {0}", item.TaskID);
Thread.Sleep(500);
};
var task = workAsync();
queueTaskAsync(task);
}
await Task.WhenAll(pendingTasks.ToArray());
}

Task.wait in task array

I need the "completed" line to execute only after all the tasks have completed. I thought Task.WaitAll(tasks) would take care of that, but the "completed" line does not wait for the callback methods. Is there a way to block the main thread until the whole task array, including the callbacks, has completed?
TaskProcessor.BatchStart(definition)
public void BatchStart(List<TaskDefinition> definition)
{
int i = 0;
tasks = new Task[definition.Count];
definition.ForEach((a) =>
{
tasks[i] = Task<TaskResult>.Factory.StartNew(() => (TaskResult)a.MethodTocall.DynamicInvoke(a.ARguments));
tasks[i].ContinueWith(task => RunTaskRetObjResultIns((Task<TaskResult>)task, a.CompleteMethod));
i++;
});
Task.WaitAll(tasks);
Console.WriteLine("completed");
}
I would try this:
public void BatchStart(List<TaskDefinition> definition)
{
Task.WaitAll(
definition.Select
(a => Task<TaskResult>.Factory.StartNew(
() => (TaskResult)a.MethodTocall.DynamicInvoke(a.ARguments)).ContinueWith(task => RunTaskRetObjResultIns((Task<TaskResult>)task, a.CompleteMethod))
).ToArray()
);
Console.WriteLine("completed");
}
I think the problem is that ContinueWith returns a new Task, and that's the one you want to Wait for. You're waiting for the original tasks but not the continuation.
You could just use PLINQ, as in:
List<Object> items = new List<Object>();
items.AsParallel().ForAll(obj => {
// Write whatever the object is to the console, but do it in parallel
Console.WriteLine(obj.ToString());
});
Console.WriteLine("Done");
That will execute all of your tasks in parallel, and then return when complete.
