Understanding Threads Swift - multithreading

I sort of understand threads, correct me if I'm wrong.
Is a single thread allocated to a piece of code until that code has completed?
Are the threads prioritised to whichever piece of code is run first?
What is the difference between main queue and thread?
My most important question:
Can threads run at the same time? If so how can I specify which parts of my code should run at a selected thread?

Let me start this way. Unless you are writing a special kind of application (and you will know if you are), forget about threads. Working with threads is complex and tricky. Use dispatch queues… it's simpler and easier.
Dispatch queues run tasks. Tasks are closures (blocks) or functions. When you need to run a task off the main dispatch queue, you call one of the dispatch_ functions, the primary one being dispatch_async(). When you call dispatch_async(), you need to specify which queue to run the task on. To get a queue, you call one of the dispatch_queue_create() or dispatch_get_, the primary one being dispatch_get_global_queue.
NOTE: Swift 3 changed this from a function model to an object model. The dispatch_ functions are instance methods of DispatchQueue. The dispatch_get_ functions are turned into class methods/properties of DispatchQueue
// Swift 3
DispatchQueue.global(qos: .background).async {
var calculation = arc4random()
}
// Swift 2
dispatch_async(dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_BACKGROUND, 0)) {
var calculation = arc4random()
}
The trouble here is any and all tasks which update the UI must be run on the main thread. This is usually done by calling dispatch_async() on the main queue (dispatch_get_main_queue()).
// Swift 3
DispatchQueue.global(qos: .background).async {
var calculation = arc4random()
DispatchQueue.main.async {
print("\(calculation)")
}
}
// Swift 2
dispatch_async(dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_BACKGROUND, 0)) {
var calculation = arc4random()
dispatch_async(dispatch_get_main_queue()) {
print("\(calculation)")
}
}
The gory details are messy. To keep it simple, dispatch queues manage thread pools. It is up to the dispatch queue to create, run, and eventually dispose of threads. The main queue is a special queue which has only 1 thread. The operating system is tasked with assigning threads to a processor and executing the task running on the thread.
With all that out of the way, now I will answer your questions.
Is a single thread allocated to a piece of code until that code has completed?
A task will run in a single thread.
Are the threads prioritised to whichever piece of code is run first?
Tasks are assigned to a thread. A task will not change which thread it runs on. If a task needs to run in another thread, then it creates a new task and assigns that new task to the other thread.
What is the difference between main queue and thread?
The main queue is a dispatch queue which has 1 thread. This single thread is also known as the main thread.
Can threads run at the same time?
Threads are assigned to execute on processors by the operating system. If your device has multiple processors (they all do now-a-days), then multiple threads are executing at the same time.
If so how can I specify which parts of my code should run at a selected thread?
Break you code into tasks. Dispatch the tasks on a dispatch queue.

Related

Kotlin coroutines multithread dispatcher and thread-safety for local variables

Let's consider this simple code with coroutines
import kotlinx.coroutines.*
import java.util.concurrent.Executors
fun main() {
runBlocking {
launch (Executors.newFixedThreadPool(10).asCoroutineDispatcher()) {
var x = 0
val threads = mutableSetOf<Thread>()
for (i in 0 until 100000) {
x++
threads.add(Thread.currentThread())
yield()
}
println("Result: $x")
println("Threads: $threads")
}
}
}
As far as I understand this is quite legit coroutines code and it actually produces expected results:
Result: 100000
Threads: [Thread[pool-1-thread-1,5,main], Thread[pool-1-thread-2,5,main], Thread[pool-1-thread-3,5,main], Thread[pool-1-thread-4,5,main], Thread[pool-1-thread-5,5,main], Thread[pool-1-thread-6,5,main], Thread[pool-1-thread-7,5,main], Thread[pool-1-thread-8,5,main], Thread[pool-1-thread-9,5,main], Thread[pool-1-thread-10,5,main]]
The question is what makes these modifications of local variables thread-safe (or is it thread-safe?). I understand that this loop is actually executed sequentially but it can change the running thread on every iteration. The changes done from thread in first iteration still should be visible to the thread that picked up this loop on second iteration. Which code does guarantee this visibility? I tried to decompile this code to Java and dig around coroutines implementation with debugger but did not find a clue.
Your question is completely analogous to the realization that the OS can suspend a thread at any point in its execution and reschedule it to another CPU core. That works not because the code in question is "multicore-safe", but because it is a guarantee of the environment that a single thread behaves according to its program-order semantics.
Kotlin's coroutine execution environment likewise guarantees the safety of your sequential code. You are supposed to program to this guarantee without any worry about how it is maintained.
If you want to descend into the details of "how" out of curiosity, the answer becomes "it depends". Every coroutine dispatcher can choose its own mechanism to achieve it.
As an instructive example, we can focus on the specific dispatcher you use in your posted code: JDK's fixedThreadPoolExecutor. You can submit arbitrary tasks to this executor, and it will execute each one of them on a single (arbitrary) thread, but many tasks submitted together will execute in parallel on different threads.
Furthermore, the executor service provides the guarantee that the code leading up to executor.execute(task) happens-before the code within the task, and the code within the task happens-before another thread's observing its completion (future.get(), future.isCompleted(), getting an event from the associated CompletionService).
Kotlin's coroutine dispatcher drives the coroutine through its lifecycle of suspension and resumption by relying on these primitives from the executor service, and thus you get the "sequential execution" guarantee for the entire coroutine. A single task submitted to the executor ends whenever the coroutine suspends, and the dispatcher submits a new task when the coroutine is ready to resume (when the user code calls continuation.resume(result)).

Serial Dispatch Queue with Asynchronous Blocks

Is there ever any reason to add blocks to a serial dispatch queue asynchronously as opposed to synchronously?
As I understand it a serial dispatch queue only starts executing the next task in the queue once the preceding task has completed executing. If this is the case, I can't see what you would you gain by submitting some blocks asynchronously - the act of submission may not block the thread (since it returns straight-away), but the task won't be executed until the last task finishes, so it seems to me that you don't really gain anything.
This question has been prompted by the following code - taken from a book chapter on design patterns. To prevent the underlying data array from being modified simultaneously by two separate threads, all modification tasks are added to a serial dispatch queue. But note that returnToPool adds tasks to this queue asynchronously, whereas getFromPool adds its tasks synchronously.
class Pool<T> {
private var data = [T]();
// Create a serial dispath queue
private let arrayQ = dispatch_queue_create("arrayQ", DISPATCH_QUEUE_SERIAL);
private let semaphore:dispatch_semaphore_t;
init(items:[T]) {
data.reserveCapacity(data.count);
for item in items {
data.append(item);
}
semaphore = dispatch_semaphore_create(items.count);
}
func getFromPool() -> T? {
var result:T?;
if (dispatch_semaphore_wait(semaphore, DISPATCH_TIME_FOREVER) == 0) {
dispatch_sync(arrayQ, {() in
result = self.data.removeAtIndex(0);
})
}
return result;
}
func returnToPool(item:T) {
dispatch_async(arrayQ, {() in
self.data.append(item);
dispatch_semaphore_signal(self.semaphore);
});
}
}
Because there's no need to make the caller of returnToPool() block. It could perhaps continue on doing other useful work.
The thread which called returnToPool() is presumably not just working with this pool. It presumably has other stuff it could be doing. That stuff could be done simultaneously with the work in the asynchronously-submitted task.
Typical modern computers have multiple CPU cores, so a design like this improves the chances that CPU cores are utilized efficiently and useful work is completed sooner. The question isn't whether tasks submitted to the serial queue operate simultaneously — they can't because of the nature of serial queues — it's whether other work can be done simultaneously.
Yes, there are reasons why you'd add tasks to serial queue asynchronously. It's actually extremely common.
The most common example would be when you're doing something in the background and want to update the UI. You'll often dispatch that UI update asynchronously back to the main queue (which is a serial queue). That way the background thread doesn't have to wait for the main thread to perform its UI update, but rather it can carry on processing in the background.
Another common example is as you've demonstrated, when using a GCD queue to synchronize interaction with some object. If you're dealing with immutable objects, you can dispatch these updates asynchronously to this synchronization queue (i.e. why have the current thread wait, but rather instead let it carry on). You'll do reads synchronously (because you're obviously going to wait until you get the synchronized value back), but writes can be done asynchronously.
(You actually see this latter example frequently implemented with the "reader-writer" pattern and a custom concurrent queue, where reads are performed synchronously on concurrent queue with dispatch_sync, but writes are performed asynchronously with barrier with dispatch_barrier_async. But the idea is equally applicable to serial queues, too.)
The choice of synchronous v asynchronous dispatch has nothing to do with whether the destination queue is serial or concurrent. It's simply a question of whether you have to block the current queue until that other one finishes its task or not.
Regarding your code sample code, that is correct. The getFromPool should dispatch synchronously (because you have to wait for the synchronization queue to actually return the value), but returnToPool can safely dispatch asynchronously. Obviously, I'm wary of seeing code waiting for semaphores if that might be called from the main thread (so make sure you don't call getFromPool from the main thread!), but with that one caveat, this code should achieve the desired purpose, offering reasonably efficient synchronization of this pool object, but with a getFromPool that will block if the pool is empty until something is added to the pool.

Play Framework: thread-pool-executor vs fork-join-executor

Let's say we have a an action below in our controller. At each request performLogin will be called by many users.
def performLogin( ) = {
Async {
// API call to the datasource1
val id = databaseService1.getIdForUser();
// API call to another data source different from above
// This process depends on id returned by the call above
val user = databaseService2.getUserGivenId(id);
// Very CPU intensive task
val token = performProcess(user)
// Very CPU intensive calculations
val hash = encrypt(user)
Future.successful(hash)
}
}
I kind of know what the fork-join-executor does. Basically from the main thread which receives a request, it spans multiple worker threads which in tern will divide the work into few chunks. Eventually main thread will join those result and return from the function.
On the other hand, if I were to choose the thread-pool-executor, my understanding is that a thread is chosen from the thread pool, this selected thread will do the work, then go back to the thread pool to listen to more work to do. So no sub dividing of the task happening here.
In above code parallelism by fork-join executor is not possible in my opinion. Each call to the different methods/functions requires something from the previous step. If I were to choose the fork-join executor for the threading how would that benefit me? How would above code execution differ among fork-join vs thread-pool executor.
Thanks
This isn't parallel code, everything inside of your Async call will run in one thread. In fact, Play! never spawns new threads in response to requests - it's event-based, there is an underlying thread pool that handles whatever work needs to be done.
The executor handles scheduling the work from Akka actors and from most Futures (not those created with Future.successful or Future.failed). In this case, each request will be a separate task that the executor has to schedule onto a thread.
The fork-join-executor replaced the thread-pool-executor because it allows work stealing, which improves efficiency. There is no difference in what can be parallelized with the two executors.

In Silverlight UI thread, when doing InvokeAsync what happens?

I am trying to solve SL performance issues.
Up until now I had WCF calls which were executed by InvokeAsync.
Now, I changed it to use the BackgroundWorker.
Performance is greatly improved.
what can cause this? what does InvokeAsync did exactly that affected the UI thread? is it opening another UI thread?
Thanks
It comes down to Synchronization contexts. A thread may be associated with SynchronizationContext such as the DispatcherSynchronizationContext (which is the context of the UI thread and only contains this one thread). WCF will complete an operation in the same Synchronization context that it began in, if there is no synchronization context associated with the thread it will use any thread in the thread pool.
Hence if you have several outstanding async operations all invoked from the UI Thread then all those operations will want to run their completion code in the UI Thread. If a number of them complete at the same time the completion code will have to queue up waiting to be dispatched into this single UI thread.
Whereas when you invoke async operations in a Background worker its running in a thread from the thread pool and does not have special synchronisation context. When those operations complete their completion code may run on any available thread in the pool (of which there are several). So near simultaneous completions can all run in parallel on different threads.
In WPF and Silverlight i recommend to use SynchronazationContext to save the main thread, all other thread will use this instance of SynchronazationContext to access the main thread (UI). You use it in this manner (Note: i generated a method that do this and all other methods will access this method to update the UI):
SynchronazationContext ctx = null;
void DoSomething()
{
ctx = SynchronazationContext.Current;
//Some algorithm here
this.UpdatePic("Success !");
}
void ThreadProc()
{
SendOrPostCallback callBack = new SendOrPostCallback(UpdatePic);
ctx.Post(callBack, String.Format("Put here the pic path");
}
void UpdatePic(string _text)
{
//This method run under the main method
}
In .NET 5.0 you can call this complicated functions by mark the method as async and write 'await' when you call the synchronous method - that make the synchronous method as asynchronous method and update the UI with the main thread.

What are working threads?

What are this working threads? How to implement them? And when to use them. I ask this because many people mention them but I dont find an the net some example of them. Or is just a saying for creating threads? Thanks.
Working threads isn't itself a meaningful term in the thread world.
I guess you mean to say," What are worker threads" ?
In that case, let me tell you that a worker thread is commonly used to handle background tasks that the user shouldn't have to wait for to continue using your application.
e.g Recalculation and background printing.
For implementing the worker thread, the controlling function should be defined which defines the thread. When this function is entered, the thread starts, and when it exits, the thread terminates. This function should have the following prototype : More Information
UINT MyControllingFunction( LPVOID pParam );
A short snippet to implement the controlling function of worker thread,
UINT MyThreadProc( LPVOID pParam )
{
CMyObject* pObject = (CMyObject*)pParam;
if (pObject == NULL ||
!pObject->IsKindOf(RUNTIME_CLASS(CMyObject)))
return 1; // if pObject is not valid
// do something with 'pObject'
return 0; // thread completed successfully
}
// inside a different function in the program
.
.
.
pNewObject = new CMyObject;
AfxBeginThread(MyThreadProc, pNewObject);
.
.
.
"Worker thread" is a generic term for a thread which performs some task independent of some primary thread. Depending on usage, it may simply mean any thread other than the primary UI thread, or it may mean a thread which performs a well-scoped task (i.e. a 'job' rather than a continuous operation which lasts the lifetime of the application).
For example, you might spawn a worker thread to retrieve a file from a remote computer over a network. It might send progress updates the application's main thread.
I use a worker, or background thread, any time that I want to perform a lengthy task without tying up my user interface. Threads often allow me to simplify my code by making a continuous series of statements, rather than a convoluted, non-blocking architecture.

Resources