GCD dispatch_sync priority over previously queued dispatch_async - multithreading

I have a class that wraps a data model and is accessed/modified by multiple threads, and I need to make sure modifications to the data model are synchronized. I am using a serial queue created with dispatch_queue_create(..., DISPATCH_QUEUE_SERIAL), and this is working really well for my needs.
Most of the methods on my class internally call "dispatch_async(queue, ^{...});". There are a few places where I need to return a snapshot result. This is a simplified example of how that looks:
- (NSArray *)getSomeData {
    __block NSArray *result = nil;
    dispatch_sync(queue, ^{
        // ... Do Stuff ...
        result = blah.blah;
    });
    return result;
}
Now, let's assume that 5 "async tasks" are queued and one is currently executing, and then a "sync task" is submitted. When will the "sync task" execute?
What I would like is for the "sync task" to be executed ahead of any pending "async tasks". Is this what happens by default? If not, is there a way to prioritize the "sync task"?
BTW, I know I can set an overall queue priority, but that is not what this question is about; normal queue priority is fine for me. I just want my synchronous tasks to happen before any pending asynchronous tasks.

There's no generic setting for "perform sync tasks first" or for setting relative priority between enqueued blocks in a single queue. To recap what may be obvious, a serial queue is going to work like a queue: first in, first out. That said, it's pretty easy to conceive of how you might achieve this effect using multiple queues and targeting. For example:
dispatch_queue_t realQueue = dispatch_queue_create(NULL, DISPATCH_QUEUE_SERIAL);
dispatch_queue_t asyncOpsQueue = dispatch_queue_create(NULL, DISPATCH_QUEUE_SERIAL);
dispatch_set_target_queue(asyncOpsQueue, realQueue);

for (NSUInteger i = 0; i < 10; i++)
{
    dispatch_async(asyncOpsQueue, ^{
        NSLog(@"Doing async work block %@", @(i));
        sleep(1);
    });
}

// Then whenever you have high-priority sync work to do, suspend the async
// queue, do your work, and then resume it.
dispatch_suspend(asyncOpsQueue);
dispatch_sync(realQueue, ^{
    NSLog(@"Doing sync work block");
});
dispatch_resume(asyncOpsQueue);
One thing to know is that an executing block effectively can't be canceled/suspended/terminated (from the outside) once it's begun. So any async enqueued block that's in flight has to run to completion before your sync block can start, but this arrangement of targeting allows you to pause the flow of async blocks and inject your sync block. Note that it also doesn't matter that you're doing a sync block. It could, just as easily, be an async block of high priority, but in that case you would probably want to move the dispatch_resume into the block itself.
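For instance, that async variant might look like this (a sketch reusing the queue names from the example above):
dispatch_suspend(asyncOpsQueue);
dispatch_async(realQueue, ^{
    NSLog(@"Doing high-priority async work block");
    // Resume inside the block so that pending async blocks can't start
    // before this injected block has finished.
    dispatch_resume(asyncOpsQueue);
});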

Related

Linux kernel: how to wait in multiple wait queues?

I know how to wait in Linux kernel queues using wait_event and how to wake them up.
Now I need to figure out how to wait in multiple queues at once. I need to multiplex multiple event sources, basically in a way similar to poll or select, but since the sources of events don't have the form of a pollable file descriptor, I wasn't able to find inspiration in the implementation of these syscalls.
My initial idea was to adapt the code from the wait_event macro, using DEFINE_WAIT and prepare_to_wait multiple times, once per queue.
However, given how prepare_to_wait is implemented, I'm afraid the internal linked list of a queue would become corrupted if the same "waiter" were added multiple times (which could perhaps happen if one queue causes a wakeup, but the wait condition isn't met and the wait is restarted).
One possible scenario for waiting on several waitqueues:
int ret = 0; // Result of waiting, in the form 0/-err.

// Define wait objects, one object per waitqueue.
DEFINE_WAIT_FUNC(wait1, default_wake_function);
DEFINE_WAIT_FUNC(wait2, default_wake_function);

// Add ourselves to all waitqueues.
add_wait_queue(wq1, &wait1);
add_wait_queue(wq2, &wait2);

// Waiting cycle
while (1) {
    // Change the task state for waiting.
    // NOTE: this should come **before** the condition check to avoid races.
    set_current_state(TASK_INTERRUPTIBLE);
    // Check the condition(s) we are waiting for.
    if (cond)
        break;
    // Need to wait.
    schedule();
    // Check whether the wait was interrupted by a signal.
    if (signal_pending(current)) {
        ret = -ERESTARTSYS;
        break;
    }
}

// Remove ourselves from all waitqueues.
remove_wait_queue(wq1, &wait1);
remove_wait_queue(wq2, &wait2);

// Restore the task state.
__set_current_state(TASK_RUNNING);

// 'ret' contains the result of the wait.
Note that this scenario is slightly different from that of wait_event:
wait_event uses autoremove_wake_function for the wait object (created with DEFINE_WAIT). This function, called from wake_up(), removes the wait object from the queue, so the wait object would need to be re-added to the queue on every iteration.
But with multiple waitqueues it is impossible to know which waitqueue has fired, so following that strategy would require re-adding every wait object on every iteration, which is inefficient.
Instead, our scenario uses default_wake_function for the wait objects, so an object is not removed from its waitqueue by the wake_up() call, and it is sufficient to add each wait object to its queue only once, before the loop.
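If this comes up in more than one place, the scenario could be packaged into a helper along these lines (a sketch only; wait_on_two_queues and the cond callback are names invented here, not kernel APIs):
#include <linux/sched.h>
#include <linux/wait.h>

/* Wait on two waitqueues until cond(arg) becomes true or a signal
 * arrives; returns 0 on success or -ERESTARTSYS. Same structure as
 * the scenario above. */
static int wait_on_two_queues(wait_queue_head_t *wq1, wait_queue_head_t *wq2,
                              bool (*cond)(void *arg), void *arg)
{
    int ret = 0;
    DEFINE_WAIT_FUNC(wait1, default_wake_function);
    DEFINE_WAIT_FUNC(wait2, default_wake_function);

    add_wait_queue(wq1, &wait1);
    add_wait_queue(wq2, &wait2);
    for (;;) {
        set_current_state(TASK_INTERRUPTIBLE);
        if (cond(arg))
            break;
        schedule();
        if (signal_pending(current)) {
            ret = -ERESTARTSYS;
            break;
        }
    }
    remove_wait_queue(wq1, &wait1);
    remove_wait_queue(wq2, &wait2);
    __set_current_state(TASK_RUNNING);
    return ret;
}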

Serial Dispatch Queue with Asynchronous Blocks

Is there ever any reason to add blocks to a serial dispatch queue asynchronously as opposed to synchronously?
As I understand it, a serial dispatch queue only starts executing the next task in the queue once the preceding task has completed. If this is the case, I can't see what you would gain by submitting some blocks asynchronously: the act of submission may not block the thread (since it returns straight away), but the task still won't be executed until the tasks ahead of it finish, so it seems to me that you don't really gain anything.
This question has been prompted by the following code - taken from a book chapter on design patterns. To prevent the underlying data array from being modified simultaneously by two separate threads, all modification tasks are added to a serial dispatch queue. But note that returnToPool adds tasks to this queue asynchronously, whereas getFromPool adds its tasks synchronously.
class Pool<T> {
    private var data = [T]();
    // Create a serial dispatch queue
    private let arrayQ = dispatch_queue_create("arrayQ", DISPATCH_QUEUE_SERIAL);
    private let semaphore:dispatch_semaphore_t;

    init(items:[T]) {
        data.reserveCapacity(items.count);
        for item in items {
            data.append(item);
        }
        semaphore = dispatch_semaphore_create(items.count);
    }

    func getFromPool() -> T? {
        var result:T?;
        if (dispatch_semaphore_wait(semaphore, DISPATCH_TIME_FOREVER) == 0) {
            dispatch_sync(arrayQ, {() in
                result = self.data.removeAtIndex(0);
            })
        }
        return result;
    }

    func returnToPool(item:T) {
        dispatch_async(arrayQ, {() in
            self.data.append(item);
            dispatch_semaphore_signal(self.semaphore);
        });
    }
}
Because there's no need to make the caller of returnToPool() block; it can continue doing other useful work.
The thread that called returnToPool() is presumably not working only with this pool. It presumably has other things it could be doing, and that work can proceed simultaneously with the work in the asynchronously submitted task.
Typical modern computers have multiple CPU cores, so a design like this improves the chances that CPU cores are utilized efficiently and useful work is completed sooner. The question isn't whether tasks submitted to the serial queue operate simultaneously (they can't, by the nature of serial queues); it's whether other work can be done simultaneously.
Yes, there are reasons why you'd add tasks to a serial queue asynchronously. It's actually extremely common.
The most common example would be when you're doing something in the background and want to update the UI. You'll often dispatch that UI update asynchronously back to the main queue (which is a serial queue). That way the background thread doesn't have to wait for the main thread to perform its UI update, but rather it can carry on processing in the background.
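For example, a sketch of that pattern (doExpensiveWork and updateUIWithResult: are hypothetical methods):
dispatch_async(dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0), ^{
    // Heavy work happens off the main thread.
    NSDictionary *result = [self doExpensiveWork];
    // Hop back to the main (serial) queue asynchronously; the background
    // thread does not wait for the UI update to run.
    dispatch_async(dispatch_get_main_queue(), ^{
        [self updateUIWithResult:result];
    });
});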
Another common example is the one you've demonstrated: using a GCD queue to synchronize interaction with some object. If you're dealing with immutable objects, you can dispatch updates asynchronously to this synchronization queue (why make the current thread wait? let it carry on instead). You'll do reads synchronously (because you're obviously going to wait until you get the synchronized value back), but writes can be done asynchronously.
(You actually see this latter example frequently implemented with the "reader-writer" pattern and a custom concurrent queue, where reads are performed synchronously on the concurrent queue with dispatch_sync, but writes are performed asynchronously with a barrier via dispatch_barrier_async. The idea is equally applicable to serial queues, though.)
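A minimal sketch of that reader-writer variant (the _syncQueue and _data ivars are made-up names; _syncQueue would be created with DISPATCH_QUEUE_CONCURRENT):
// _syncQueue = dispatch_queue_create("com.example.rw", DISPATCH_QUEUE_CONCURRENT);

- (id)currentValue {
    __block id result;
    // Reads may run concurrently with other reads.
    dispatch_sync(_syncQueue, ^{
        result = self->_data;
    });
    return result;
}

- (void)setCurrentValue:(id)newValue {
    // The barrier waits for in-flight reads, runs alone, and returns
    // to the caller immediately.
    dispatch_barrier_async(_syncQueue, ^{
        self->_data = newValue;
    });
}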
The choice of synchronous vs. asynchronous dispatch has nothing to do with whether the destination queue is serial or concurrent. It's simply a question of whether the current code has to block until the other queue finishes its task.
Regarding your code sample, that is correct. getFromPool should dispatch synchronously (because you have to wait for the synchronization queue to actually return the value), but returnToPool can safely dispatch asynchronously. I'm wary of code waiting on semaphores if it might be called from the main thread (so make sure you don't call getFromPool from the main thread!), but with that one caveat, this code should achieve the desired purpose: reasonably efficient synchronization of this pool object, with a getFromPool that will block, if the pool is empty, until something is added to the pool.

Do I understand these concepts correctly?

In most of my interviews, I've been asked about web services and multithreading. I've done neither, so I decided to learn more about Web Services and Multithreading using Grand Central Dispatch.
For web services, the way I understand it is that you need to fetch the data using a class such as NSURLConnection: basically, you set up an NSURL, then a request, then a connection. You also need to implement the connection's delegate methods, such as connection:didReceiveData: and connection:didFailWithError:. After you receive the data, which is generally in JSON or XML format and stored as an NSData object, you can store it and parse it. There are multiple ways to parse it, such as by using SBJSON or NSXMLParser. You can then do with it what you need.
For multithreading, Grand Central Dispatch is a C-style API for multithreading. Basically, you use it when you need to do heavy lifting away from the main thread to avoid the app freezing. You can dispatch synchronously or asynchronously: asynchronously means that the code on the main thread continues executing; synchronously means that it does not. You never need to use GCD alongside NSURLConnection, because NSURLConnection already does its work in the background and then calls its delegates on the main thread. But for things like saving and unzipping files, you should use GCD. When you call dispatch_async, you pass in a dispatch queue, which can be either a serial queue or a concurrent queue. A serial queue executes its tasks one at a time, in the order they arrived; that is what you get by default when you create a queue. With concurrent queues, multiple tasks might be executed at the same time.
My first question is: do I have a proper understanding of these two concepts? I know that there is a lot to learn about GCD, but I just want to make sure that I have the basic ideas correct. Also, with GCD, why would someone ever want to dispatch synchronously? Wouldn't that defeat the purpose of multithreading?
The only reason to dispatch synchronously is to prevent the current code from continuing until the critical section finishes.
For example, if you wanted to get some value from the shared resource and use it right away, you would need to dispatch synchronously. If the current code does not need to wait for the critical section to complete, or if it can simply submit additional follow-up tasks to the same serial queue, submitting asynchronously is generally preferred.
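A minimal sketch of the contrast (syncQueue, items, and newItem are made-up names):
// Need the value right away: dispatch_sync blocks until the read completes.
__block NSUInteger count;
dispatch_sync(syncQueue, ^{
    count = items.count;
});
NSLog(@"Count right now: %lu", (unsigned long)count);

// Fire-and-forget mutation: dispatch_async returns immediately, and the
// caller carries on while the serial queue performs the write in order.
dispatch_async(syncQueue, ^{
    [items addObject:newItem];
});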
You can make a synchronous URL request and wrap it in a dispatch_async call so that it runs entirely in the background:
- (void)requestSomething:(NSString *)url
{
    NSString *queue_id = @"queue_identifier";
    dispatch_queue_t queue = dispatch_queue_create([queue_id UTF8String], 0);
    dispatch_queue_t main = dispatch_get_main_queue();
    dispatch_async(queue, ^{
        NSURLRequest *theRequest = [NSURLRequest requestWithURL:[NSURL URLWithString:url]];
        NSError *serviceError = nil;
        NSURLResponse *serviceResponse = nil;
        NSData *dataResponse = [NSURLConnection sendSynchronousRequest:theRequest
                                                     returningResponse:&serviceResponse
                                                                 error:&serviceError];
        if (serviceError)
        {
            dispatch_sync(main, ^{
                // Do UI work, like removing an activity indicator or showing the
                // user an alert describing the error, using the serviceError object.
            });
        }
        else
        {
            // Use the dataResponse object and parse it here; this part of the
            // code does not execute on the main thread.
            dispatch_sync(main, ^{
                // Do UI work, like updating a table view or labels with the
                // parsed data, or removing an activity indicator.
            });
        }
    });
    // If your project is not built with ARC, add the following line:
    dispatch_release(queue);
}

.NET - Multiple Timers instances mean Multiple Threads?

I already have a Windows service running with a System.Timers.Timer that does a specific piece of work. But I want several jobs to run at the same time, in different threads.
I've been told to create a separate System.Timers.Timer instance for each. Is this correct? Will the work run in parallel this way?
For instance:
System.Timers.Timer tmr1 = new System.Timers.Timer();
tmr1.Elapsed += new ElapsedEventHandler(DoWork1);
tmr1.Interval = 5000;
System.Timers.Timer tmr2 = new System.Timers.Timer();
tmr2.Elapsed += new ElapsedEventHandler(DoWork2);
tmr2.Interval = 5000;
Will tmr1 and tmr2 run on different threads so that DoWork1 and DoWork2 can run at the same time, i.e., concurrently?
Thanks!
It is not incorrect.
Be careful: System.Timers.Timer raises the Elapsed event on a thread-pool thread, and you'll get in trouble when your Elapsed event handler takes too long. The handler will be called again on another thread, even though the previous call hasn't completed yet, which tends to produce hard-to-diagnose bugs. You can avoid this by setting the AutoReset property to false. Also be sure to use try/catch in your event handler; exceptions are swallowed without any diagnostic.
Multiple timers might mean multiple threads. If two timer ticks occur at the same time (i.e. one is running and another fires), those two timer callbacks will execute on separate threads, neither of which will be the main thread.
It's important to note, though, that the timers themselves don't "run" on a thread at all. The only time a thread is involved is when the timer's tick or elapsed event fires.
On another note, I strongly discourage you from using System.Timers.Timer. The timer's elapsed event squashes exceptions, meaning that if an exception escapes your event handler, you'll never know it. It's a bug hider. You should use System.Threading.Timer instead. System.Timers.Timer is just a wrapper around System.Threading.Timer, so you get the same timer functionality without the bug hiding.
See Swallowing exceptions is hiding bugs for more info.
Will tmr1 and tmr2 run on different threads so that DoWork1 and DoWork2 can run at the same time, i.e., concurrently?
At the start, yes. However, what guarantees that both DoWork1 and DoWork2 finish within 5 seconds? Perhaps you know the code inside DoWorkX and assume it will finish within the 5-second interval, but it may happen that the system is under load and one of the jobs takes more than 5 seconds. That breaks the assumption that both DoWorkX handlers start at the same time on subsequent ticks. And even if subsequent start times stayed in sync, there is a danger of the current execution overlapping an execution still running from the last tick.
If you disable/enable the respective timers inside DoWorkX, however, the start times will drift out of sync with each other; ultimately the two jobs could even get scheduled on the same thread, one after the other. So if you are OK with subsequent start times not being in sync, my answer ends here.
If not, this is something you can attempt:
static void Main(string[] args)
{
    var t = new System.Timers.Timer();
    t.Interval = TimeSpan.FromSeconds(5).TotalMilliseconds;
    t.Elapsed += (sender, evtArgs) =>
    {
        var timer = (System.Timers.Timer)sender;
        timer.Enabled = false; // disable until the work is done

        // Attempt concurrent execution of both jobs.
        Task work1 = Task.Factory.StartNew(() => DoWork1());
        Task work2 = Task.Factory.StartNew(() => DoWork2());
        Task.Factory.ContinueWhenAll(new[] { work1, work2 },
            _ => timer.Enabled = true); // re-enable the timer for the next iteration
    };
    t.Enabled = true;
    Console.ReadLine();
}
Kind of. First, check out the MSDN page for System.Timers.Timer: http://msdn.microsoft.com/en-us/library/system.timers.timer.aspx
The section you need to be concerned with is quoted below:
If the SynchronizingObject property is null, the Elapsed event is raised on a ThreadPool thread. If processing of the Elapsed event lasts longer than Interval, the event might be raised again on another ThreadPool thread. In this situation, the event handler should be reentrant.
Basically, this means that each Timer does not get its own thread; rather, by default, the Elapsed handlers are run on the system ThreadPool.
If you want several things to kick off at the same time and run concurrently, you cannot simply attach multiple handlers to the Elapsed event. For example, I tried this in VS2012:
static void testMethod(string[] args)
{
    System.Timers.Timer mytimer = new System.Timers.Timer();
    mytimer.AutoReset = false;
    mytimer.Interval = 3000;
    mytimer.Elapsed += (x, y) => {
        Console.WriteLine("First lambda. Sleeping 3 seconds");
        System.Threading.Thread.Sleep(3000);
        Console.WriteLine("After sleep");
    };
    mytimer.Elapsed += (x, y) => { Console.WriteLine("second lambda"); };
    mytimer.Start();
    Console.WriteLine("Press any key to go to end of method");
    Console.ReadKey();
}
The output was this:
Press any key to go to end of method
First lambda. Sleeping 3 seconds
After sleep
second lambda
So it executes them consecutively, not concurrently. If you want "a bunch of things to happen" upon each timer tick, you have to launch a bunch of tasks (or queue Actions to the ThreadPool) in your Elapsed handler. The runtime may multi-thread them, or it may not, but in my simple example, it did not.
Try my code yourself, it's quite simple to illustrate what's happening.

java - avoid unnecessary thread wake-ups

I have a set of 12 threads executing work (Runnable) in parallel. In essence, each thread does the following:
Runnable r;
while (true) {
    synchronized (work) {
        // Block until there is work available.
        while (work.isEmpty()) {
            work.wait();
        }
        r = work.removeFirst();
    }
    r.run();
}
Work is added as follows:
Runnable r = ...;
synchronized (work) {
    work.add(r);
    work.notify();
}
When new work is available, it is added to the list and the lock is notified. If there is a thread waiting, it is woken up, so it can execute this work.
Here lies the problem. When a thread is woken up, it is very likely that another thread will execute this work. This happens when the latter thread is done with its previous work and re-enters the while(true)-loop. The smaller/shorter the work actions, the more likely this will happen.
This means I am waking up a thread for nothing. As I need high throughput, I believe this behavior will hurt performance.
How would you solve this? In theory, I need a mechanism which allows me to cancel a pending thread wake-up notification, but of course that is not possible in Java.
I thought about introducing a work list for each thread: instead of pushing the work into one single list, the work would be spread over 12 work lists. But I believe this would introduce other problems; for example, one thread might have a lot of work pending while another has none. In essence, I believe that a solution which assigns work to a particular thread in advance might become very complex and sub-optimal.
Thanks!
What you are doing is thread pooling. Take a look at the pre-Java-5 concurrency framework, specifically its PooledExecutor class:
http://gee.cs.oswego.edu/dl/classes/EDU/oswego/cs/dl/util/concurrent/intro.html
In addition to my previous answer, here is another solution; this question made me curious.
Here, I added a check with a volatile boolean.
It does not completely avoid the situation of uselessly waking up a thread, but it helps. Actually, I do not see how this could be completely avoided without additional restrictions like "we know that after 100 ms a job will most likely be done".
// Shared field visible to both the workers and the producer
// (volatile is only legal on fields, not on local variables).
volatile boolean free = false;

// Worker loop:
Runnable r;
while (true) {
    synchronized (work) {
        free = false; // new rev.2
        while (work.isEmpty()) {
            work.wait();
        }
        r = work.removeFirst();
    }
    r.run();
    free = true; // new
}

// Producer side:
synchronized (work) {
    work.add(r);
    if (!free) { // new
        work.notify();
    } // new
    free = false; // new rev.2
}
