Call each task in a separate thread. ThreadPerTaskScheduler task scheduler - multithreading

I want to run each TPL task in a separate thread (the idea is to keep the advantages of the TPL while still working with separate threads). It looks like this task scheduler is exactly what I'm looking for: ThreadPerTaskScheduler.
I made several local tests and I see that it works as I expected including the ability to call Task.WaitAll.
var task = Task.Factory.StartNew(() =>
{
Thread.Sleep(10000);
}, CancellationToken.None, TaskCreationOptions.None, new ThreadPerTaskScheduler());
Task.WaitAll(task);
But, I have a question about this line from the task scheduler implementation:
protected override void QueueTask(Task task)
{
new Thread(() => TryExecuteTask(task)) { IsBackground = true }.Start();
}
As far as I can see, we just create a new thread without saving a reference to it anywhere. If so, how does Task.WaitAll work?
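As a rough analogy only (this is Java, not the TPL implementation): waiting on a task generally means waiting on the task object's own completion state, not on the thread that happens to run it. TryExecuteTask runs the task and the task records that it has finished, which is what Task.WaitAll observes. The sketch below shows the same idea with a FutureTask started on a thread whose reference is thrown away. The class and method names are made up for illustration.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.FutureTask;

final class ThreadPerTaskDemo {
    // Runs the task on a brand-new thread and deliberately keeps no reference to that thread.
    static String runOnFreshThread() {
        FutureTask<String> task = new FutureTask<>(() -> Thread.currentThread().getName());
        new Thread(task, "per-task-thread").start();
        try {
            // get() blocks on the task's completion state, so the thread reference is not needed.
            return task.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        }
    }

    public static void main(String[] args) {
        System.out.println(runOnFreshThread());
    }
}
```

The waiter only needs the task object; the thread is an implementation detail of whoever executes it.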

Related

Kotlin: Why isn't job.invokeOnCompletion() block running on main thread?

In my Android application I have code that should run periodically in its own coroutine and should be cancelable.
for this I have the following functions:
startJob(): Initializes the job, sets up invokeOnCompletion() and starts the work loop in the respective scope
private fun startJob() {
if (::myJob.isInitialized && myJob.isActive) {
return
}
myJob= Job()
myJob.invokeOnCompletion {
it?.message.let {
var msg = it
if (msg.isNullOrBlank()) {
msg = "Job stopped. Reason unknown"
}
myJobCompleted(msg)
}
}
CoroutineScope(Dispatchers.IO + myJob).launch {
workloop()
}
}
workloop(): The main work loop. Do some work in a loop with a set delay in each iteration:
private suspend fun workloop() {
while (true) {
// doing some stuff here
delay(setDelayInMilliseconds)
}
}
myJobCompleted: do some finalizing. For now simply log a message for testing.
private fun myJobCompleted(msg: String) {
try {
mainActivityReference.logToGUI(msg)
}
catch (e:Exception){
println("debug: " + e.message)
}
}
Running this and calling myJob.cancel() throws the following exception in myJobCompleted():
debug: Only the original thread that created a view hierarchy can touch its views.
I'm curious as to why this code isn't running on the main thread, since startJob() IS called from the main thread?
Furthermore: is there an option similar to using a CancellationTokenSource in C#, where the job is not immediately cancelled, but a cancellation request can be checked in each iteration of the while loop?
Immediately breaking off the job, regardless of what it is doing (although it will pretty much always be waiting for the delay on cancellation) doesn't seem like a good idea to me.
It is not the contract of Job.invokeOnCompletion to run on the same thread where Job is created. Moreover, such a contract would be impossible to implement.
You can't expect an arbitrary piece of code to run on an arbitrary thread just because there was some earlier method invocation on that thread. The ability of the Android main GUI thread to execute code submitted from the outside is special, and relies on the existence of a top-level event loop.
In the world of coroutines, what controls thread assignment is the coroutine context, while clearly you are outside of any context when creating the job. So the way to fix it is to explicitly launch(Dispatchers.Main) a coroutine from within invokeOnCompletion.
About your question on cancellation, you can use withContext(NonCancellable) to surround the part of the code you want to protect from cancellation.
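The same principle can be sketched outside Kotlin. The following Java analogy (not the coroutines API) uses a CompletableFuture: a plain completion callback runs on whichever thread completes the future, while the async variant hands the callback to an executor chosen by the caller, which plays the role that Dispatchers.Main plays in the answer above. The thread names and the class name are hypothetical.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class CompletionThreadDemo {
    // Returns { thread of the plain callback, thread of the callback dispatched to an executor }.
    static String[] callbackThreads() {
        // "main-like" stands in for a UI thread with its own event loop (an assumption for this sketch).
        ExecutorService mainLike = Executors.newSingleThreadExecutor(r -> new Thread(r, "main-like"));
        ExecutorService worker = Executors.newSingleThreadExecutor(r -> new Thread(r, "worker"));
        try {
            CompletableFuture<Void> job = new CompletableFuture<>();
            // Registered here, but executed by the thread that later completes the job.
            CompletableFuture<String> direct =
                job.handle((v, err) -> Thread.currentThread().getName());
            // Explicitly dispatched to the chosen executor, regardless of who completes the job.
            CompletableFuture<String> hopped =
                job.handleAsync((v, err) -> Thread.currentThread().getName(), mainLike);
            worker.execute(() -> job.complete(null));
            return new String[] { direct.join(), hopped.join() };
        } finally {
            worker.shutdown();
            mainLike.shutdown();
        }
    }

    public static void main(String[] args) {
        String[] names = callbackThreads();
        System.out.println(names[0] + " / " + names[1]);
    }
}
```

Registering the callback on one thread says nothing about where it runs; only an explicit dispatch does.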

Interrupt parallel Stream execution

Consider this code :
Thread thread = new Thread(() -> tasks.parallelStream().forEach(Runnable::run));
tasks are a list of Runnables that should be executed in parallel.
When we start this thread, and it begins its execution, then depending on some calculations we need to interrupt (cancel) all those tasks.
Interrupting the Thread will only stop one of the executions. How do we handle the others? Should Streams perhaps not be used this way, or do you know a better solution?
You can use a ForkJoinPool to interrupt the threads:
@Test
public void testInterruptParallelStream() throws Exception {
final AtomicReference<InterruptedException> exc = new AtomicReference<>();
final ForkJoinPool forkJoinPool = new ForkJoinPool(4);
// use the pool with a parallel stream to execute some tasks
forkJoinPool.submit(() -> {
Stream.generate(Object::new).parallel().forEach(obj -> {
synchronized (obj) {
try {
// task that is blocking
obj.wait();
} catch (final InterruptedException e) {
exc.set(e);
}
}
});
});
// wait until the stream got started
Thread.sleep(500);
// now we want to interrupt the task execution
forkJoinPool.shutdownNow();
// wait for the interrupt to occur
Thread.sleep(500);
// check that we really got an interruption in the parallel stream threads
assertTrue(exc.get() instanceof InterruptedException);
}
The worker threads do really get interrupted, terminating a blocking operation. You can also call shutdown() within the Consumer.
Note that those sleeps are not tuned for a proper unit test; you may have better ways to wait only as long as necessary. But it is enough to show that it works.
You aren't actually running the Runnables on the Thread you are creating. You are running a thread which will submit to a pool, so:
Thread thread = new Thread(() -> tasks.parallelStream().forEach(Runnable::run));
In this example you are, roughly speaking, doing
List<Runnable> tasks = ...;
Thread thread = new Thread(new Runnable(){
public void run(){
for(Runnable r : tasks){
ForkJoinPool.commonPool().submit(r);
}
}
});
This is because you are using a parallelStream that delegates to a common pool when handling parallel executions.
As far as I know, you cannot get a handle on the Threads that execute your tasks with a parallelStream, so you may be out of luck. You can always do tricky stuff to get hold of the threads, but that probably isn't the best idea.
Something like the following should work for you:
AtomicBoolean shouldCancel = new AtomicBoolean();
...
tasks.parallelStream().allMatch(task->{
task.run();
return !shouldCancel.get();
});
The documentation for the method allMatch specifically says that it "may not evaluate the predicate on all elements if not necessary for determining the result." So if the predicate doesn't match when you want to cancel, then it doesn't need to evaluate any more. Additionally, you can check the return result to see if the loop was cancelled or not.
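A self-contained sketch of that allMatch approach follows. The class name, the counter, and the rule that the third executed task requests cancellation are assumptions made purely so the example can run on its own. How many tasks execute after the flag is set is not deterministic with a parallel stream; the sketch only relies on the return value being false once cancellation was requested.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

final class AllMatchCancelDemo {
    static final AtomicBoolean shouldCancel = new AtomicBoolean();
    static final AtomicInteger executed = new AtomicInteger();

    // Returns true if every task ran, false if the run was cut short by the cancel flag.
    static boolean runAll(int taskCount) {
        shouldCancel.set(false);
        executed.set(0);
        List<Runnable> tasks = new ArrayList<>();
        for (int i = 0; i < taskCount; i++) {
            tasks.add(() -> {
                // Hypothetical trigger: the third task to execute asks for cancellation.
                if (executed.incrementAndGet() == 3) {
                    shouldCancel.set(true);
                }
            });
        }
        // allMatch may stop evaluating further elements as soon as one predicate returns false.
        return tasks.parallelStream().allMatch(task -> {
            task.run();
            return !shouldCancel.get();
        });
    }

    public static void main(String[] args) {
        boolean completed = runAll(1000);
        System.out.println("completed=" + completed + ", executed=" + executed.get());
    }
}
```

Tasks that are already running are not interrupted by this; the flag only prevents tasks that have not started yet from being picked up.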

Using worker threads to add new tasks to a taskPool in D

This is a simplification and narrowing of another of my questions: Need help parallel traversing a dag in D
Say you've got some code that you want to parallelize. The problem is, some of the things you need to do have prerequisites. So you have to make sure that those prerequisites are done before you add the new task into the pool. The simple conceptual answer is to add new tasks as their prerequisites finish.
Here I have a little chunk of code that emulates that pattern. The problem is, it throws an exception because pool.finish() gets called before a new task is put on the queue by the worker thread. Is there a way to just wait 'till all threads are idle or something? Or is there another construct that would allow this pattern?
Please note: this is a simplified version of my code to illustrate the problem. I can't just use taskPool.parallel() in a foreach.
import std.stdio;
import std.parallelism;
void simpleWorker(uint depth, uint maxDepth, TaskPool pool){
writeln("Depth is: ",depth);
if (++depth < maxDepth){
pool.put( task!simpleWorker(depth,maxDepth,pool));
}
}
void main(){
auto pool = new TaskPool();
auto t = task!simpleWorker(0,5,pool);
pool.put(t);
pool.finish(true);
if (t.done()){ //rethrows the exception thrown by the thread.
writeln("Done");
}
}
I fixed it: http://dpaste.dzfl.pl/eb9e4cfc
I changed the for loop to:
void cleanNodeSimple(Node node, TaskPool pool){
node.doProcess();
foreach (cli; pool.parallel(node.clients,1)){ // using parallel to make it concurrent
if (cli.canProcess()) {
cleanNodeSimple(cli, pool);
// no explicit task creation (already handled by parallel)
}
}
}
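For comparison, here is one way the "workers add new tasks, main waits until everything is idle" pattern from the question could be sketched in Java rather than D. It is not the fix used above: it counts pending tasks and releases a latch when the count drops to zero. All names are hypothetical, and the depth logic mirrors simpleWorker from the question.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

final class SelfSchedulingDemo {
    private final ExecutorService pool = Executors.newFixedThreadPool(4);
    private final AtomicInteger pending = new AtomicInteger();
    private final AtomicInteger visited = new AtomicInteger();
    private final CountDownLatch idle = new CountDownLatch(1);

    private void submit(int depth, int maxDepth) {
        // Count the task before it is queued, so the counter cannot reach zero too early.
        pending.incrementAndGet();
        pool.execute(() -> {
            try {
                visited.incrementAndGet();
                if (depth + 1 < maxDepth) {
                    // A worker schedules its follow-up task before it reports itself as done.
                    submit(depth + 1, maxDepth);
                }
            } finally {
                if (pending.decrementAndGet() == 0) {
                    idle.countDown();
                }
            }
        });
    }

    // Runs the chain of tasks and returns how many tasks were executed.
    static int run(int maxDepth) {
        SelfSchedulingDemo demo = new SelfSchedulingDemo();
        demo.submit(0, maxDepth);
        try {
            // Wait until no task is pending, instead of shutting the pool down right away.
            demo.idle.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            demo.pool.shutdown();
        }
        return demo.visited.get();
    }

    public static void main(String[] args) {
        System.out.println(run(5));
    }
}
```

The key ordering is that a child is counted before its parent is uncounted, so the pool is only shut down once nothing can add more work.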

Task is ignoring Thread.Sleep

I'm trying to grasp the TPL.
Just for fun I tried to create some Tasks with a random sleep to see how they were processed. I was aiming for a fire-and-forget pattern.
static void Main(string[] args)
{
Console.WriteLine("Demonstrating a successful transaction");
Random d = new Random();
for (int i = 0; i < 10; i++)
{
var sleep = d.Next(100, 2000);
Action<int> succes = (int x) =>
{
Thread.Sleep(x);
Console.WriteLine("sleep={2}, Task={0}, Thread={1}: Begin successful transaction",
Task.CurrentId, Thread.CurrentThread.ManagedThreadId, x);
};
Task t1 = Task.Factory.StartNew(() => succes(sleep));
}
Console.ReadLine();
}
But I don't understand why it outputs all lines to the Console ignoring the Sleep(random)
Can someone explain that to me?
Important:
The TPL default TaskScheduler does not guarantee Thread per Task - one thread can be used for processing several tasks.
Calling Thread.Sleep might impact other tasks performance.
You can construct your task with the TaskCreationOptions.LongRunning hint; this way the TaskScheduler can assign a dedicated thread to the task, and it will be safe to block on it.
Your code uses the value of i instead of the generated random number. It does not ignore the sleep but rather sleeps between 0 and 10ms each iteration.
Try:
Thread.Sleep(sleep);
The sentence
Task t1 = Task.Factory.StartNew(() => succes(sleep));
will create the Task and start it automatically, and the for loop then moves on to the next iteration without waiting for the task to finish. So by the time the second task is created and executed, the first one may or may not have finished. In other words, you are not waiting for the tasks to end.
You should try
Task t1 = Task.Factory.StartNew(() => succes(sleep));
t1.Wait();
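The fire-and-forget behaviour described in these answers can be reproduced in a small Java sketch (an analogy, not TPL code). Tasks are started without waiting for each other, so they sleep concurrently and report in whatever order their sleeps end, not in the order they were submitted. The class and method names are hypothetical, and at least one sleep value must be passed.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

final class SleepOrderDemo {
    // Starts one task per sleep value and returns the values in the order the tasks finished.
    static List<Integer> completionOrder(int... sleepsMillis) {
        List<Integer> finished = Collections.synchronizedList(new ArrayList<>());
        ExecutorService pool = Executors.newFixedThreadPool(sleepsMillis.length);
        for (int sleep : sleepsMillis) {
            // Each lambda captures its own copy of the loop variable.
            pool.execute(() -> {
                try {
                    Thread.sleep(sleep);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                finished.add(sleep);
            });
        }
        // The loop above never waits; only here do we wait for all tasks together.
        pool.shutdown();
        try {
            pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return new ArrayList<>(finished);
    }

    public static void main(String[] args) {
        System.out.println(completionOrder(600, 200, 400));
    }
}
```

Waiting on each task right after starting it, as in the t1.Wait() suggestion, would serialize the sleeps instead; tasks with shorter sleeps would then no longer overtake earlier ones.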

Parallel Task advice

I am trying to use the parallel task library to kick off a number of tasks like this:
var workTasks = _schedules.Where(x => x.Task.Enabled);
_tasks = new Task[workTasks.Count()];
_cancellationTokenSource = new CancellationTokenSource();
_cancellationTokenSource.Token.ThrowIfCancellationRequested();
int i = 0;
foreach (var schedule in _schedules.Where(x => x.Task.Enabled))
{
_log.InfoFormat("Reading task information for task {0}", schedule.Task.Name);
if(!schedule.Task.Enabled)
{
_log.InfoFormat("task {0} disabled.", schedule.Task.Name);
i++;
continue;
}
schedule.Task.ServiceStarted = true;
_tasks[i] = Task.Factory.StartNew(() =>
schedule.Task.Run()
, _cancellationTokenSource.Token);
i++;
_log.InfoFormat("task {0} has been added to the worker threads and has been started.", schedule.Task.Name);
}
I want these tasks to sleep and then wake up every 5 minutes and do their stuff, at the moment I am using Thread.Sleep in the Schedule object whose Run method is the Action that is passed into StartNew as an argument like this:
_tasks[i] = Task.Factory.StartNew(() =>
schedule.Task.Run()
, _cancellationTokenSource.Token);
I read somewhere that Thread.Sleep is a bad solution for this. Can anyone recommend a better approach?
By my understanding, Thread.Sleep is bad generally, because it force-shifts everything out of memory even when that's not necessary. It won't be a big deal in most cases, but it could be a performance issue.
I'm in the habit of using this snippet instead:
new System.Threading.EventWaitHandle(false, EventResetMode.ManualReset).WaitOne(1000);
Fits on one line, and isn't overly complicated -- it creates an event handle that will never be set, and then waits for the full timeout period before continuing.
Anyway, if you're just trying to have something repeat every 5 minutes, a better approach would probably be to use a Timer. You could even make a class to neatly wrap everything if your repeated work methods are already factored out:
using System;
using System.Timers;
public class WorkRepeater
{
private readonly Timer m_Timer;
public WorkRepeater(Action workToRepeat, TimeSpan interval)
{
// Use TotalMilliseconds: Milliseconds is only the 0-999 component of the interval.
m_Timer = new Timer(interval.TotalMilliseconds);
// Run the supplied work every time the interval elapses.
m_Timer.Elapsed += (o, ea) => workToRepeat();
}
public void Start()
{
m_Timer.Start();
}
public void Stop()
{
m_Timer.Stop();
}
}
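The same "repeat on a timer instead of sleeping in a loop" idea could look roughly like this in Java, using a ScheduledExecutorService in place of the .NET Timer. This is a sketch under assumptions: the class name, the millisecond interval parameter, and the ticksWithin helper are invented for illustration.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

final class WorkRepeaterSketch {
    private final ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
    private final Runnable work;
    private final long intervalMillis;
    private ScheduledFuture<?> handle;

    WorkRepeaterSketch(Runnable work, long intervalMillis) {
        this.work = work;
        this.intervalMillis = intervalMillis;
    }

    synchronized void start() {
        if (handle == null) {
            // Note: if work throws, scheduleAtFixedRate suppresses all later runs.
            handle = scheduler.scheduleAtFixedRate(
                work, intervalMillis, intervalMillis, TimeUnit.MILLISECONDS);
        }
    }

    synchronized void stop() {
        if (handle != null) {
            handle.cancel(false); // let a run that is in progress finish
        }
        scheduler.shutdown();
    }

    // Helper for the demo: true if the work ran at least 'ticks' times within ten seconds.
    static boolean ticksWithin(int ticks, long intervalMillis) {
        CountDownLatch latch = new CountDownLatch(ticks);
        WorkRepeaterSketch repeater = new WorkRepeaterSketch(latch::countDown, intervalMillis);
        repeater.start();
        try {
            return latch.await(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            repeater.stop();
        }
    }

    public static void main(String[] args) {
        System.out.println(ticksWithin(3, 20));
    }
}
```

No thread sits blocked in a sleep between runs; the scheduler wakes the work up when the interval has passed.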
Tasks are a bad fit here. A Task should be used for short-lived operations, such as asynchronous I/O. If you want to control the lifetime of the work, you should use a Thread and sleep as much as you like, because a Thread is dedicated to your work, whereas Tasks are rotated through the thread pool, which is shared.
