Task Parallel Library - Know when all tasks are finished - c#-4.0

I use Task Parallel Library to start some tasks, like so:
public static void Main()
{
for (var i = 0; i < 10; i++)
{
var x = i;
Task.Factory.StartNew(() => new WorkerClass(x).Do());
}
// (*) Here I'd like to wait for all tasks to finish
Task.WaitAll();
Console.WriteLine("Ready.");
Console.ReadLine();
}
The problem is that some tasks can create new tasks themselves. This is what WorkerClass looks like:
public class WorkerClass
{
private static readonly NLog.Logger Log = NLog.LogManager.GetCurrentClassLogger();
private readonly int _i;
public WorkerClass(int i)
{
_i = i;
}
public void Do()
{
if (_i % 3 == 0)
Task.Factory.StartNew(() => new WorkerClass(_i + 101).Do());
Log.Info("Started {0}", _i);
Thread.Sleep(2000);
Log.Info("Done {0}", _i);
}
}
For every value of i that's a multiple of 3, a new Task is started.
I'd like to be able to wait until all tasks (including the ones created by other tasks) are finished.
Is there a clean/built-in way to do this (with or without TPL)?

Keep a reference to all top-level tasks and then just use WaitAll:
var tasks = new Task[10];
for (var i = 0; i < 10; i++)
{
var x = i;
tasks[i] = Task.Factory.StartNew(() => new WorkerClass(x).Do());
}
Task.WaitAll( tasks );
As for the child tasks, attach them to the parent task when you create them inside WorkerClass.Do. A parent task does not enter the completed state until all of its attached child tasks have finished, so waiting on the top-level tasks also waits for their children.
Task.Factory.StartNew(() => { }, TaskCreationOptions.AttachedToParent);

Related

Local variable can be mutated from child threads in Scala

Today I was trying to learn about memory management in the JVM and I came across the following question: can I mutate a local variable from two threads spawned within the same function?
In Java, if you try something like this the code will not compile, yielding an error with message "local variables referenced from an inner class must be final or effectively final"
public class MyClass {
static void f() throws Exception {
int x = 0;
Thread t1 = new Thread(new Runnable() {
public void run() {
for(int i = 0; i < 1000; i++) {
x = x + 1;
}
}
});
Thread t2 = new Thread(new Runnable() {
public void run() {
for(int i = 0; i < 1000; i++) {
x = x - 1;
}
}
});
t1.start();
t2.start();
t1.join();
t2.join();
System.out.println(x);
}
public static void main(String args[]) throws Exception {
for(int i = 0; i < 20; i++) {
f();
}
}
}
However, the equivalent code in Scala compiles and runs without complaint (although it may be subject to race conditions):
def f(): Unit = {
var x = 0
val t1 = new Thread(new Runnable {
override def run(): Unit =
(1 to 1000).foreach(_ => {x = x + 1})
})
t1.start()
val t2 = new Thread(new Runnable {
override def run(): Unit =
(1 to 1000).foreach(_ => {x = x - 1})
})
t2.start()
t1.join()
t2.join()
println(x)
}
(1 to 20).foreach(_ => f())
Why is the behavior different in each case?
In Scala, lambdas (and, by extension, anonymous classes) can capture and mutate local variables. The scala.runtime package contains some extra classes for that purpose. They effectively lift the local variable into an instance variable of another class whose instances can be shared: https://github.com/scala/scala/blob/v2.13.3/src/library/scala/runtime/ObjectRef.java
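For comparison, here is a rough Java sketch of the same lifting done by hand. IntHolder is a hypothetical stand-in for the holder classes in scala.runtime, not the code the Scala compiler actually emits, and the synchronized blocks are my own addition so that the result is deterministic (the Scala version in the question has no such locking).

```java
// Sketch only: IntHolder is a hypothetical stand-in for the holder classes in
// scala.runtime; it is not what the Scala compiler actually generates.
class CapturedLocalSketch {

    /** Heap-allocated box that takes the place of the local variable. */
    static final class IntHolder {
        int elem;

        IntHolder(int initial) {
            elem = initial;
        }
    }

    static int f() {
        // The reference x never changes, so Java allows the lambdas to capture it.
        // The mutable state lives in the object that x points to.
        final IntHolder x = new IntHolder(0);

        Thread t1 = new Thread(() -> {
            for (int i = 0; i < 1000; i++) {
                // Lock added for this sketch (assumption) to avoid lost updates.
                synchronized (x) {
                    x.elem = x.elem + 1;
                }
            }
        });
        Thread t2 = new Thread(() -> {
            for (int i = 0; i < 1000; i++) {
                synchronized (x) {
                    x.elem = x.elem - 1;
                }
            }
        });

        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return x.elem;
    }

    public static void main(String[] args) {
        for (int i = 0; i < 20; i++) {
            System.out.println(f());
        }
    }
}
```

Without the synchronized blocks this would still compile, but the increments and decrements could interleave and the printed value could differ from run to run.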

Java thread execution time not consistent

I'm just experimenting with multithreading: I fill an array with random numbers and compare how long it takes with two threads versus one thread. The thing is that the time for the first thread is much longer than for the second.
code:
class createList extends Thread
{
int[] array = new int[25000000];
public void run() {
for (int i = 0; i < 25000000; i++)
{
array[i] = randomNumber();
}
}
public static int randomNumber()
{
Random random = new Random();
return random.nextInt(50);
}
}
public class Main {
public static void main(String[] args) {
createList listcreator1 = new createList();
createList listcreator2 = new createList();
listcreator1.start();
listcreator2.start();
Stopwatch sw = new Stopwatch();
listcreator1.run();
System.out.println(sw.elapsedTime());
Stopwatch sw3 = new Stopwatch();
listcreator2.run();
System.out.println(sw3.elapsedTime());
Stopwatch sw2 = new Stopwatch();
int[] array = new int[50000000];
for (int i = 0; i < 50000000; i++)
{
array[i] = randomNumber();
}
System.out.println(sw2.elapsedTime());
}
public static int randomNumber()
{
Random random = new Random();
return random.nextInt(50);
}
}
and the console output is
5.024,
0.945,
1.889
what is the reason for the large difference?
In the first measurement the calculation actually runs three times at once. You started two threads in the background (by calling start) and then ran the same work once more on the current thread (by calling run). run does not create a new thread; it is an ordinary method call that executes on the current thread.
The other measurements run with no jobs in the background (those threads have most likely finished by then), which is why they finish faster.
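A small sketch that makes the difference between start and run visible. The Recorder class and the method names are made up for illustration and are not from the question.

```java
// Sketch only: class and method names are illustrative, not from the question.
class StartVersusRunSketch {

    /** Remembers which thread actually executed run(). */
    static final class Recorder extends Thread {
        volatile Thread executedOn;

        @Override
        public void run() {
            executedOn = Thread.currentThread();
        }
    }

    /** Calling run() directly is a plain method call on the caller's thread. */
    static boolean runStaysOnCallerThread() {
        Recorder r = new Recorder();
        r.run();
        return r.executedOn == Thread.currentThread();
    }

    /** Calling start() makes the JVM invoke run() on the new thread. */
    static boolean startUsesNewThread() {
        Recorder r = new Recorder();
        r.start();
        try {
            r.join(); // wait until the background thread has finished
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return r.executedOn == r;
    }

    public static void main(String[] args) {
        System.out.println(runStaysOnCallerThread());
        System.out.println(startUsesNewThread());
    }
}
```

To time the two-thread version fairly, one option is to start the stopwatch before both start calls and stop it after both join calls, without calling run at all.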

How to execute a single method and passing list of value concurrently

Suppose I have a List with 2000 values. I want to split the list and pass the values to a method concurrently, so that I can improve performance.
I applied the multithreading approach below, but it still takes 10 minutes:
Thread t = new Thread(new Runnable() {
public void run() {
for (int i = 1; i < visitList.size()/2; i = i + 2) {
auditedT1 += accuracyDao.SumOfChartsAudited(visitList.get(i));
}
}
});
Thread t1 = new Thread(new Runnable() {
public void run() {
for (int i = 0; i < visitList.size()/2; i = i + 2) {
auditedT2 += accuracyDao.SumOfChartsAudited(visitList.get(i));
}
}
});
Thread t3 = new Thread(new Runnable() {
public void run() {
for (int i = visitList.size()/2; i < visitList.size(); i = i + 2) {
auditedT3 += accuracyDao.SumOfChartsAudited(visitList.get(i));
}
}
});
Thread t4 = new Thread(new Runnable() {
public void run() {
for (int i = visitList.size()/2+1; i < visitList.size(); i = i + 2) {
auditedT4 += accuracyDao.SumOfChartsAudited(visitList.get(i));
}
}
});
t.start();
t1.start();
t3.start();
t4.start();
t.join();
t1.join();
t3.join();
t4.join();
To answer your main question, check out Java parallel streams. It's only one way to do this, but it's fairly straightforward: you can use a map operation to perform the database calls, then a collect to sum the results together.
https://docs.oracle.com/javase/tutorial/collections/streams/parallelism.html
There is no guarantee that this will improve performance, though. You may have to profile to see where the performance issue really lies. It could be in your database, the network, or somewhere else (I'm assuming you're using a database, since there's a DAO in your method names).
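A minimal sketch of the parallel-stream approach. fakeSumOfChartsAudited is a hypothetical stand-in for accuracyDao.SumOfChartsAudited (its real signature isn't shown in the question), and the visit list is assumed to hold integers. I've used mapToLong with sum here rather than map with collect; it amounts to the same reduction.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

// Sketch only: fakeSumOfChartsAudited is a hypothetical stand-in for
// accuracyDao.SumOfChartsAudited, whose real signature is not shown.
class ParallelSumSketch {

    /** Placeholder for the per-visit database call (assumption). */
    static long fakeSumOfChartsAudited(int visitId) {
        return visitId * 2L;
    }

    /** Builds a sample list 1..count to play the role of visitList. */
    static List<Integer> sampleVisits(int count) {
        return IntStream.rangeClosed(1, count).boxed().collect(Collectors.toList());
    }

    /** Runs the per-visit call in parallel and adds up the results. */
    static long totalAudited(List<Integer> visitList) {
        return visitList.parallelStream()
                .mapToLong(ParallelSumSketch::fakeSumOfChartsAudited)
                .sum();
    }

    public static void main(String[] args) {
        System.out.println(totalAudited(sampleVisits(2000)));
    }
}
```

Each element is mapped independently and the partial sums are combined by the stream, so no shared counter fields are needed. Parallel streams run on the common fork-join pool by default, which is sized for CPU-bound work, so blocking database calls may not speed up as much as hoped.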

List getting threadsafe with removing items

I'm trying to remove items from a list until it's empty, using multiple threads.
Code:
public void testUsers() {
final List<User> users = userDao.findAll();
final int availableProcessors = Runtime.getRuntime().availableProcessors() * multiplier;
final List<String> loggingList = Lists.newArrayList();
final List<Integer> sizeChecked = Lists.newArrayList();
int totalSizeChecked = 0;
int sizeList = users.size();
ExecutorService executorService = Executors.newFixedThreadPool(availableProcessors);
for (int i = 0; i < availableProcessors; i++) {
createThread(executorService, users, loggingList, sizeChecked);
}
executorService.shutdown();
try {
// wait for all threads to die
executorService.awaitTermination(1, TimeUnit.HOURS);
} catch (InterruptedException ex) {
}
for (Integer count : sizeChecked) {
totalSizeChecked += count;
}
Assert.assertTrue(totalSizeChecked==sizeList);
}
private void createThread(ExecutorService executorService, final List<User> users,
final Collection<String> loggingList, final List<Integer> sizeChecked) {
executorService.execute(new Runnable() {
@Override
public void run() {
int totalChecked = 0;
while (!users.isEmpty()) {
User user = null;
synchronized (users) {
if (!users.isEmpty()) {
user = users.remove(0);
}
}
totalChecked++;
if (user != null) {
String reason = checkUser(user);
if (reason != null) {
loggingList.add(reason);
}
} else {
LOGGER.info("user is null");
}
}
sizeChecked.add(totalChecked);
}
});
}
I thought this couldn't be far wrong, because I synchronize on the list when removing the first item.
I'm testing with a multiplier of 6 (in production it will be lowered to 1-2).
I get this in the email:
The batch was not correctly executed.
Size of accounts that must be checked : 28499. Size of accounts that have been checked: 25869
What am I doing wrong? How do I make this thread-safe?
List<Integer> sizeChecked is not thread-safe, so you cannot safely add elements to it from several threads in parallel. The same applies to loggingList, which several threads also add to.
Synchronize your add operation or use a thread-safe structure. If sizeChecked is just a counter, use an AtomicLong instead and have each thread increment it.
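A minimal sketch of the AtomicLong variant. The class and method names are made up, plain integers stand in for User objects, and I've swapped the synchronized ArrayList removal for a ConcurrentLinkedQueue to keep the example short; that swap is my choice, not part of the original code.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

// Sketch only: names are illustrative, and integers stand in for User objects.
class AtomicCounterSketch {

    /** Drains itemCount items using threadCount workers; returns how many were checked. */
    static long drain(int itemCount, int threadCount) {
        // Thread-safe queue used instead of a synchronized ArrayList (assumption).
        Queue<Integer> users = new ConcurrentLinkedQueue<>();
        for (int i = 0; i < itemCount; i++) {
            users.add(i);
        }

        // One shared counter that every worker increments atomically.
        AtomicLong checked = new AtomicLong();

        ExecutorService pool = Executors.newFixedThreadPool(threadCount);
        for (int t = 0; t < threadCount; t++) {
            pool.execute(() -> {
                // poll() returns null once the queue is empty.
                while (users.poll() != null) {
                    checked.incrementAndGet(); // count only items actually taken
                }
            });
        }

        pool.shutdown();
        try {
            if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return checked.get();
    }

    public static void main(String[] args) {
        System.out.println(drain(28499, 8));
    }
}
```

Because the counter is only incremented when an item was actually taken from the queue, the total matches the number of items regardless of how the workers interleave.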

C#: starting a thread with a method that takes two parameters

I have a construct like this:
private readonly List<Thread> thr = new List<Thread>();
In a class I have a method with one parameter that I want to run on a separate thread:
public void testthr(object xxx)
{
......
}
On button click I start the threads:
for (Int32 i = 0; i < textBox8.Lines.Length; i++)
{
var thr1 = new Thread(testthr);
thr1.Start(textBox8.Lines[i].Trim());
thr.Add(thr1);
}
How can I start a thread with a method that takes more than one parameter? For example:
public void testthr(object xxx, string yyy)
{
......
}
How do I pass both values when starting the thread?
If you want to pass multiple values to a thread proc, you need to create an object to contain them. There are several ways to do that. The easiest is probably to use a Tuple:
for (Int32 i = 0; i < textBox8.Lines.Length; i++)
{
var thr1 = new Thread(testthr);
var data = new Tuple<string, string>(textBox8.Lines[i].Trim(), "hello");
thr1.Start(data);
thr.Add(thr1);
}
public void testthr(object state)
{
var data = (Tuple<string,string>)state;
var item1 = data.Item1;
var item2 = data.Item2;
...
}
