I have a scenario where some functions need to complete as quickly as possible and be given computation resources at the expense of other tasks (i.e. they are high-priority). Specifically, graphics rendering and any tasks spawned for rendering should run as quickly as possible, although they do not consume the full CPU capacity. At the same time, I want to fill the CPU's idle cycles with other work that is not as time-critical, without stealing cycles from the rendering tasks.
The basic idea is fairly simple, but I cannot figure out how to do what I want through PPL. I have found how to set the default scheduler to different priorities, but I don't want to globally change the priority. Rather, I want to have two distinct scheduling policies that I can add tasks to at any time.
The ideal situation is if I could create two task_group instances with different priorities and add tasks to the relevant group as needed, but I don't see how to do that. I linked the most relevant documentation I found, which does what I want, but uses agents in a way that leaves me unsure how to do the simple action of just adding a task. I would also rather not add the complexity of agents and message passing if I can use the basic facilities in PPL.
https://msdn.microsoft.com/en-us/library/dd984038.aspx
It is also important that I can ensure that any sub-tasks spawned from a thread inherit the priority of the parent. Specifically, I call parallel_for from both high and low priority tasks and the parallel_for blocks should keep the same priority.
The task constructor (and create_task function) can take a task_options parameter with a custom scheduler.
https://msdn.microsoft.com/en-us/library/dn237306.aspx
Related
I have created a tbb::task_group and added multiple tasks to it. At the end I wait() for the tasks to complete. While profiling the code I saw that the number of threads used by my application had increased (as visible in Windows Task Manager). However, when the tbb::task_group object is destroyed, the thread count does not decrease.
Additionally if I call the same code block again (without restarting the application), the number of threads sometimes increases and sometimes not.
Is this an expected behavior? If yes, how can I make sure the threads created previously are reused?
Yes, this is expected behavior. It is done specifically so that threads are reused between parallel algorithms. You can verify it by marking threads with thread-local variables (TBB provides the combinable class) or by looking at the callbacks of task_scheduler_observer.
TBB always creates the full number of threads specified at initialization time, though lazily, even if you run only a single task. By default the number of TBB worker threads equals the number of hardware threads (cores*HT) minus one for the application thread.
BTW, I wouldn't recommend using tbb::task, which is for advanced cases; check out tbb::parallel_invoke or tbb::task_group first, which are high-level interfaces to tasks. Better still, see whether your algorithm can be expressed at an even higher level using things like parallel_for, parallel_reduce (possibly with a custom Range), parallel_pipeline, flow::graph, etc.
I thought starting order meant the predetermined order of the threads ( at what moment thread X will run), but I started realizing it didn't make any sense, because native threads can't be predetermined.
Isn't the running order of the native threads determined by the operating system and therefore random? I don't understand why we're talking about starting order if everything is "random" or rather determined by the operating system's scheduling service.
When we do not care about the order of execution of certain blocks of statements in a computer program, that is the situation in which we can think about using threads. Code that uses threads, but expects them to execute in a particular order, is usually broken. If it ensures that threads execute in a certain order, then it's wasting the power of threads.
There are no absolutes; there are probably situations in some real-time programming where some select actions have to be done in order, and the most convenient way is to keep those actions in their associated threads (for reasons of context or whatever).
Another example is the use of priority. Priority is a tool that we use when we still don't care about specific orders of execution, but we want more important actions to complete ahead of less important actions, in cases where there is a scheduling conflict.
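To make the point about priority concrete, here is a minimal sketch in standard C++ (not PPL or TBB). It assumes a single thread draining the queue, and the names PriorityRunner, Priority and run_priority_demo are made up for illustration. Tasks are submitted in arbitrary order; priority only decides which pending task is picked next when more than one is waiting.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <queue>
#include <string>
#include <utility>
#include <vector>

// Hypothetical two-level priority used only in this sketch.
enum class Priority { Low = 0, High = 1 };

class PriorityRunner {
public:
    // Queue a task; nothing runs until drain() is called.
    void submit(Priority p, std::function<void()> fn) {
        assert(fn);
        pending_.push(Entry{p, next_seq_++, std::move(fn)});
    }

    // Run all queued tasks on the calling thread: highest priority first,
    // submission order within the same priority level.
    void drain() {
        while (!pending_.empty()) {
            // top() returns a const reference, so copy the callable before popping.
            std::function<void()> fn = pending_.top().fn;
            pending_.pop();
            fn();
        }
    }

private:
    struct Entry {
        Priority priority;
        std::size_t seq;  // submission counter, used as a tie-breaker
        std::function<void()> fn;
    };
    struct RunsLater {
        // Returns true when a should run after b.
        bool operator()(const Entry& a, const Entry& b) const {
            if (a.priority != b.priority) return a.priority < b.priority;
            return a.seq > b.seq;
        }
    };
    std::priority_queue<Entry, std::vector<Entry>, RunsLater> pending_;
    std::size_t next_seq_ = 0;
};

// Demo: returns the order in which the tasks actually ran.
std::vector<std::string> run_priority_demo() {
    PriorityRunner runner;
    std::vector<std::string> order;
    runner.submit(Priority::Low, [&] { order.push_back("index"); });
    runner.submit(Priority::High, [&] { order.push_back("render-1"); });
    runner.submit(Priority::Low, [&] { order.push_back("compress"); });
    runner.submit(Priority::High, [&] { order.push_back("render-2"); });
    runner.drain();
    return order;
}
```

Note that the code never says "run render-1 before index"; it only resolves the conflict when both are pending. With real worker threads the same queue would be protected by a mutex, and the relative order of tasks running on different threads would again be up to the OS.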
I've got a service that runs scans of various servers. The networks in question can be huge (hundreds of thousands of network nodes).
The current version of the software uses a queueing/threading architecture of our own design, which works but isn't as efficient as it could be (not least because jobs can spawn children, which isn't handled well).
V2 is coming up and I'm considering using the TPL. It seems like it should be ideally suited.
I've seen this question, the answer to which implies there's no limit to the tasks TPL can handle. In my simple tests (Spin up 100,000 tasks and give them to TPL), TPL barfed fairly early on with an Out-Of-Memory exception (fair enough - especially on my dev box).
The Scans take a variable length of time but 5 mins/task is a good average.
As you can imagine, scans for huge networks can take a considerable length of time, even on beefy servers.
I've already got a framework in place which allows the scan jobs (stored in a Db) to be split between multiple scan servers, but the question is how exactly I should pass work to the TPL on a specific server.
Can I monitor the size of TPL's queue and (say) top it up if it falls below a couple of hundred entries? Is there a downside to doing this?
I also need to handle the situation where a scan needs to be paused. This seems easier to do by not giving the work to TPL than by cancelling/resetting tasks which may already be partially processed.
All of the initial tasks can be run in any order. Children must be run after the parent has started executing but since the parent spawns them, this shouldn't ever be a problem. Children can be run in any order. Because of this, I'm currently envisioning that child tasks be written back to the Db not spawned directly into TPL. This would allow other servers to "work steal" if required.
Has anyone had any experience with using the TPL in this way? Are there any considerations I need to be aware of?
TPL is about starting small units of work and running them in parallel. It is not about monitoring, pausing, or throttling this work.
You should see TPL as a low-level tool to start "work" and to synchronize threads.
Key point: TPL tasks != logical tasks. In your case the logical tasks are scan tasks ("scan an IP range from x to y"). Such a task should not correspond to a physical task (a System.Threading.Tasks.Task) because the two are different concepts.
You need to schedule, orchestrate, monitor and pause the logical tasks yourself because TPL does not understand them and cannot be made to.
Now the more practical concerns:
TPL can certainly start 100k tasks without OOM. The OOM happened because your tasks' code exhausted memory.
Scanning networks sounds like a great case for asynchronous code because while you are scanning you are likely to wait on results while having a great degree of parallelism. You probably don't want to have 500 threads in your process all waiting for a network packet to arrive. Asynchronous tasks fit well with the TPL because every task you run becomes purely CPU-bound and small. That is the sweet spot for TPL.
I want to use TPL in a Worker process on Windows Azure. I'm looking to add an IJob to the queue; it has a Run method, so the worker will consist of:
loop
    get item off queue
    use TPL to call IJob.Run (this is an async call)
But I'm a bit concerned about the maximum number of items I can add to TPL. I'm happy to build my own TPL pool of some sort if required; I'm just checking its capabilities.
Cheers,
Ash.
One of the main goals of the TPL is to remove the need to worry about this. By decomposing your work into Tasks instead of Threads, you're allowing the scheduler to handle the balancing of this more appropriately.
There is no fixed upper limit to the number of "tasks" you can schedule. They are (by default, with the default TaskScheduler) scheduled using the ThreadPool, which as of .NET 4, scales based on the work. I would strongly suggest not trying to build your own pool - it's highly unlikely that you'll do better than the default. That being said, if your tasks have a very non-standard behavior, you may want to consider writing a custom TaskScheduler.
Also - realize that you should, ideally, make your tasks as "large as possible". There is overhead associated with an individual task - having them be too small (in terms of work) will cause the overhead to have a larger impact on performance than if you have an appropriate number of larger "tasks".
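As a rough illustration of the granularity point, here is a sketch in standard C++ rather than .NET (std::async stands in for starting a task, and chunked_sum is a made-up helper). Instead of one task per element, the input is split into a small number of contiguous chunks so that the per-task overhead is paid only a few times.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <future>
#include <numeric>
#include <vector>

// Sum `data` using at most `chunks` tasks, each covering a contiguous range.
long long chunked_sum(const std::vector<int>& data, std::size_t chunks) {
    assert(chunks > 0);
    const std::size_t n = data.size();
    const std::size_t step = (n + chunks - 1) / chunks;  // ceil(n / chunks)

    std::vector<std::future<long long>> parts;
    for (std::size_t begin = 0; begin < n; begin += step) {
        const std::size_t end = std::min(n, begin + step);
        // One task per chunk; each task does a plain sequential sum.
        parts.push_back(std::async(std::launch::async, [&data, begin, end] {
            const auto first = data.begin() + static_cast<std::ptrdiff_t>(begin);
            const auto last = data.begin() + static_cast<std::ptrdiff_t>(end);
            return std::accumulate(first, last, 0LL);
        }));
    }

    long long total = 0;
    for (auto& part : parts) total += part.get();  // wait for and combine results
    return total;
}
```

Calling chunked_sum(data, 4) starts four tasks regardless of input size, whereas passing data.size() as the chunk count degenerates into one task per element, which is the case the paragraph above warns about. The right chunk count depends on the workload and is something to measure, not something this sketch decides.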
My question might sound a bit naive but I'm pretty new with multi-threaded programming.
I'm writing an application which processes incoming external data. For each data item that arrives, a new task is created in the following way:
System.Threading.Tasks.Task.Factory.StartNew(() => methodToActivate(data));
The items of data arrive very fast (every second, half second, etc.), so many tasks are created. Handling each task might take around a minute. When testing it I saw that the number of threads keeps increasing. How can I limit the number of tasks created, so that the number of actual working threads is stable and efficient? My computer is only dual core.
Thanks!
One of your issues is that the default scheduler sees tasks that last for a minute and assumes that they are blocked on other tasks that have yet to be executed. To try to unblock things it schedules more pending tasks, hence the thread growth. There are a couple of things you can do here:
Make your tasks shorter (probably not an option).
Write a scheduler that deals with this scenario and doesn't add more threads.
Use SetMaxThreads to prevent unbounded thread pool growth.
See the section on Thread Injection here:
http://msdn.microsoft.com/en-us/library/ff963549.aspx
You should look into using the producer/consumer pattern with a BlockingCollection<T> around a ConcurrentQueue<T> where you set the BoundedCapacity to something that makes sense given the characteristics of your workload. You can make your BoundedCapacity configurable and then tweak as you run through some profiling sessions to find the sweet spot.
While it's true that the TPL will take care of queueing up the tasks you create, creating too many tasks does not come without penalties. Also, what's the point in producing more work than you can consume? You want to produce enough work that the consumers will never be starved, but you don't want to get too far ahead of yourself because that's just wasting resources and potentially stealing those very same resources from your consumers.
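The bounded producer/consumer idea can be sketched as follows. This is a standard C++ analogue, not the .NET BlockingCollection<T> API; BoundedQueue, BoundedDemoResult and run_bounded_demo are hypothetical names, and the capacity plays the role that BoundedCapacity plays in the answer above.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>

template <typename T>
class BoundedQueue {
public:
    explicit BoundedQueue(std::size_t capacity) : capacity_(capacity) {
        assert(capacity > 0);
    }

    // Blocks while the queue is full, so the producer cannot run far ahead.
    void push(T value) {
        std::unique_lock<std::mutex> lock(mutex_);
        not_full_.wait(lock, [this] { return items_.size() < capacity_; });
        items_.push(std::move(value));
        if (items_.size() > peak_) peak_ = items_.size();
        not_empty_.notify_one();
    }

    // Blocks until an item is available. Returns false once the queue has
    // been closed and fully drained.
    bool pop(T& out) {
        std::unique_lock<std::mutex> lock(mutex_);
        not_empty_.wait(lock, [this] { return !items_.empty() || closed_; });
        if (items_.empty()) return false;
        out = std::move(items_.front());
        items_.pop();
        not_full_.notify_one();
        return true;
    }

    // Signals that no more items will be pushed.
    void close() {
        std::lock_guard<std::mutex> lock(mutex_);
        closed_ = true;
        not_empty_.notify_all();
    }

    // Largest number of items that were ever queued at once.
    std::size_t peak_size() {
        std::lock_guard<std::mutex> lock(mutex_);
        return peak_;
    }

private:
    const std::size_t capacity_;
    std::mutex mutex_;
    std::condition_variable not_full_;
    std::condition_variable not_empty_;
    std::queue<T> items_;
    std::size_t peak_ = 0;
    bool closed_ = false;
};

struct BoundedDemoResult {
    long long sum;
    std::size_t peak;
};

// One producer (the calling thread) and one consumer thread.
BoundedDemoResult run_bounded_demo(int item_count, std::size_t capacity) {
    BoundedQueue<int> queue(capacity);
    long long sum = 0;
    std::thread consumer([&] {
        int item = 0;
        while (queue.pop(item)) sum += item;  // stands in for the real work
    });
    for (int i = 1; i <= item_count; ++i) queue.push(i);
    queue.close();
    consumer.join();
    return BoundedDemoResult{sum, queue.peak_size()};
}
```

The producer is throttled by push() blocking, not by any explicit rate limit, so the backlog never exceeds the capacity. Making the capacity a constructor argument mirrors the suggestion to keep it configurable and tune it while profiling.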
You can create a custom TaskScheduler for the Task Parallel library and then schedule tasks on that by passing an instance of it to the TaskFactory constructor.
Here's one example of how to do that: Task Scheduler with a maximum degree of parallelism.
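For readers who want the gist without the .NET specifics, the effect of a scheduler with a maximum degree of parallelism can be approximated in standard C++ by a fixed number of worker threads pulling from a shared list of tasks. This is only a sketch of the idea under that assumption, not a TaskScheduler implementation, and run_limited is a made-up name.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Runs every task using exactly max_parallel worker threads.
// Returns the highest number of tasks observed running at the same time.
int run_limited(const std::vector<std::function<void()>>& tasks, int max_parallel) {
    assert(max_parallel > 0);
    std::atomic<std::size_t> next{0};  // index of the next unclaimed task
    std::atomic<int> running{0};       // tasks executing right now
    std::atomic<int> peak{0};          // maximum value `running` has reached

    auto worker = [&] {
        for (;;) {
            const std::size_t i = next.fetch_add(1);
            if (i >= tasks.size()) return;  // nothing left to claim

            const int now = running.fetch_add(1) + 1;
            int seen = peak.load();
            // Raise the recorded peak if this worker observed a higher count.
            while (now > seen && !peak.compare_exchange_weak(seen, now)) {
            }

            tasks[i]();
            running.fetch_sub(1);
        }
    };

    std::vector<std::thread> workers;
    for (int w = 0; w < max_parallel; ++w) workers.emplace_back(worker);
    for (auto& t : workers) t.join();
    return peak.load();
}
```

Because the thread count is fixed up front, long-running tasks cannot cause the thread growth described in the question: at most max_parallel tasks run at once, and the rest simply wait their turn.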