Why is performSelector:onThread:withObject:waitUntilDone: not recommended for frequent inter-thread communication - multithreading

Apple's Threading Programming Guide states that:
Although good for occasional communication between threads, you should not use the performSelector:onThread:withObject:waitUntilDone: method for time critical or frequent communication between threads.
Which raises the questions: what, then, is the acceptable method for frequent inter-thread communication, and why is performSelector:onThread:withObject:waitUntilDone: specifically not recommended?
P.S.: Not waiting until done, naturally.

The reason they don't recommend it is probably the overhead: each call has to be packaged up and scheduled on the target thread's run loop, which also means it only works with threads that have an NSRunLoop running. It is really good for updating the UI from a secondary thread, though.
For heavier lifting you should use shared memory (with locks or lock-free algorithms) for inter-thread communication. Or, better yet, use something like NSOperationQueue or Grand Central Dispatch and don't worry about doing the communication and synchronization yourself, if your problem permits that.
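As a sketch of the GCD route: work is handed to a background queue and the result is hopped back to the main thread, with no run loop required on the worker side. This uses libdispatch's plain C API (shown in C++ to match the other sketches on this page; the printf bodies stand in for real work):

    #include <dispatch/dispatch.h>
    #include <cstdio>

    // Runs on the main queue, i.e. the main thread: the place for UI work.
    static void update_ui(void *ctx) {
        std::printf("update the UI on the main thread\n");
    }

    // Runs on a background queue; unlike the performSelector variant,
    // the worker thread needs no run loop for this to be delivered.
    static void do_work(void *ctx) {
        std::printf("heavy work on a background thread\n");
        dispatch_async_f(dispatch_get_main_queue(), nullptr, update_ui);
    }

    int main() {
        dispatch_queue_t bg =
            dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0);
        dispatch_async_f(bg, nullptr, do_work);  // fire and forget
        dispatch_main();  // park the main thread; it now drains the main queue
    }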

Related

How can threads synchronize their operation?

How can threads synchronize their operation? Since threads within the same process share resources, how can a thread behave in such a way as not to interfere with other threads? Kindly explain in easily understandable language. Thanks to all experts!
Imagine a narrow door that people can pass through one at a time. Sometimes, when there are too many people, they form a queue. Correctly programmed threads do the same: they obey conventions such as not trying to pass through the door while it is occupied. Badly programmed threads, like bad-mannered people, ignore the conventions and create disorder.
The main abstraction in multithreaded programming is a resource: an area of memory that can belong to at most one thread at a time. Threads request resources, wait for them, own them (and may only read or write that memory while owning them), and free them.
There are many synchronization primitives for dealing with resources; the most important are semaphores, monitors, and blocking queues.
Programmers who want to design a multithreaded program should first plan what kinds of resources will be used and how threads will exchange them, then choose which standard synchronization facilities to use, or design and program new ones. Specialized facilities are usually built with monitors.
Teaching of multithreaded programming often starts with how to use monitors for thread interaction. This is backwards. A student should first master the standard means, semaphores and blocking queues, which are sufficient for 95% of cases, and only then learn to design specialized facilities using monitors.
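As an illustration of the blocking-queue primitive recommended above, a minimal C++ sketch (not production code; a real one would add shutdown and bounded capacity):

    #include <condition_variable>
    #include <mutex>
    #include <queue>

    // One thread pushes, another pops; the mutex guards the queue and the
    // condition variable lets a consumer sleep until an item arrives.
    template <typename T>
    class BlockingQueue {
    public:
        void push(T value) {
            {
                std::lock_guard<std::mutex> lock(mutex_);
                items_.push(std::move(value));
            }
            not_empty_.notify_one();
        }

        T pop() {  // blocks while the queue is empty
            std::unique_lock<std::mutex> lock(mutex_);
            not_empty_.wait(lock, [this] { return !items_.empty(); });
            T value = std::move(items_.front());
            items_.pop();
            return value;
        }

    private:
        std::mutex mutex_;
        std::condition_variable not_empty_;
        std::queue<T> items_;
    };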

Why must/should UI frameworks be single threaded?

Closely related questions have been asked before:
Why are most UI frameworks single threaded?
Should all event-driven frameworks be single-threaded?
But the answers to those questions still leave me unclear on some points.
The asker of the first question asked if multi-threading would help performance, and the answerers mostly said that it would not, because it is very unlikely that the GUI would be the bottleneck in a 2D application on modern hardware. But this seems to me a sneaky debating tactic. Sure, if you have carefully structured your application to do nothing other than UI calls on the UI thread you won't have a bottleneck. But that might take a lot of work and make your code more complicated, and if you had a faster core or could make UI calls from multiple threads, maybe it wouldn't be worth doing.
A commonly advocated architectural design is to have view components that don't have callbacks and don't need to lock anything except maybe their descendants. Under such an architecture, can't you let any thread invoke methods on view objects, using per-object locks, without fear of deadlock?
I am less confident about the situation with UI controls, but as long as their callbacks are only invoked by the system, why should they cause any special deadlock issues? After all, if the callbacks need to do anything time-consuming, they will delegate to another thread, and then we're right back to the multiple-threads case.
How much of the benefit of a multi-threaded UI would you get if you could just block on the UI thread? Because various emerging abstractions over async in effect let you do that.
Almost all of the discussion I have seen assumes that concurrency will be dealt with using manual locking, but there is a broad consensus that manual locking is a bad way to manage concurrency in most contexts. How does the discussion change when we take into consideration the concurrency primitives that the experts are advising us to use more, such as software transactional memory, or eschewing shared memory in favor of message passing (possibly with synchronization, as in Go)?
TL;DR
It is a simple way to force sequencing in an activity that is ultimately going to happen in sequence anyway (the screen draws X times per second, in order).
Discussion
Handling long-held resources that have a single identity within a system is typically done by representing them with a single thread, process, "object", or whatever else represents an atomic unit with regard to concurrency in a given language. Back in the non-preemptive, negligent-kernel, non-timeshared, One True Thread days this was managed manually by polling/cycling or by writing your own scheduling system. In such a system you still had a 1:1 mapping between function/object/thingy and singular resources (or you went mad before 8th grade).
This is the same approach used with handling network sockets, or any other long-lived resource. The GUI itself is but one of many such resources a typical program manages, and typically long-lived resources are places where the ordering of events matters.
For example, in a chat program you would usually not write a single thread. You would have a GUI thread, a network thread, and maybe another thread that deals with logging resources or whatever. It is not uncommon for a typical system to be fast enough that it's easier to just put the logging and input into the same thread that makes GUI updates, but this is not always the case. In all cases, though, each category of resources is most easily reasoned about by granting it a single thread, and that means one thread for the network, one thread for the GUI, and however many other threads are necessary for long-lived operations or resources to be managed without blocking the others.
To make life easier it's common not to share data directly among these threads any more than necessary. Queues are much easier to reason about than resource locks and can guarantee sequencing. Most GUI libraries either queue events to be handled (so they can be evaluated in order) or commit the data changes required by events immediately but take a lock on the state of the GUI prior to each pass of the repaint loop. It doesn't matter what happened before; the only thing that matters when painting the screen is the state of the world right then. This is slightly different from the typical network case, where all the data needs to be sent in order and forgetting about some of it is not an option.
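A minimal sketch of the first variant described above: one GUI thread drains a queue of events in order, then paints the current state of the world (all names illustrative):

    #include <mutex>
    #include <queue>

    struct Event { int kind; };
    struct GuiState { /* widget tree, dirty flags, ... */ };

    std::mutex gui_mutex;
    std::queue<Event> pending;  // other threads push events here, under gui_mutex
    GuiState state;

    // The single GUI thread: apply queued events in order, then paint
    // the state of the world "right then".
    void gui_loop() {
        for (;;) {
            {
                std::lock_guard<std::mutex> lock(gui_mutex);
                while (!pending.empty()) {
                    Event e = pending.front();
                    pending.pop();
                    // apply(e, state);  // mutate GUI state in event order
                }
            }
            // repaint(state);  // the GUI thread is the only reader/writer here
            // vsync_wait();    // X times per second
        }
    }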
So GUI frameworks are not multi-threaded per se; it is the GUI loop that needs to be a single thread to sanely manage that single long-held resource. Programming examples, typically being trivial by nature, are often single-threaded, with all the program logic running in the same process/thread as the GUI loop, but this is not typical of more complex programs.
To sum up
Because scheduling is hard, shared data management is even harder, and a single resource can only be accessed serially anyway, a single thread used to represent each long-held resource and each long-running procedure is a typical way to structure code. GUIs are only one resource among several that a typical program will manage. So "GUI programs" are by no means single-threaded, but GUI libraries typically are.
In trivial programs there is no real penalty for putting other program logic in the GUI thread, but this approach falls apart when significant load is experienced or when resource management requires a lot of blocking or polling. That is why you will often see event queues, signal/slot message abstractions, or other approaches to multi-threading/processing mentioned in the dusty corners of nearly any GUI library (and here I'm including game libraries; while game libs typically expect you to build your own widgets around your own UI concept, the basic principles are very similar, just a bit lower-level).
[As an aside, I've been doing a lot of Qt/C++ and Wx/Erlang lately. The Qt docs do a good job of explaining approaches to multi-threading, the role of the GUI loop, and where Qt's signal/slot approach fits into the abstraction (so you don't have to think about concurrency/locking/sequencing/scheduling very much). Erlang is inherently concurrent, but wx itself is typically started as a single OS process that manages a GUI update loop; Erlang posts update events to it as messages, and GUI events are sent to the Erlang side as messages, thus permitting normal Erlang concurrent coding while providing a single point of GUI event sequencing so that wx can do its GUI update looping thing.]
Because the GUI main-thread code is old. Very old, and therefore very much designed for low resource usage. If someone were to write everything from scratch again (and even Android, the most recent GUI OS, didn't), it could work well and be better at multithreading.
For example, the two improvements that would help most with multithreading are:
Now we have the MVVM (Model-View-ViewModel) pattern, which adds an extra duplication of data. When the toolkits were developed, even a single duplication in MVC was highly debated. MVVM makes multithreading much easier. IMHO this, not the data binding, was the main reason for Microsoft to invent it in .NET in the first place.
The scene-graph approach. Android, iOS, Windows UWP (based on CoreWindow rather than hWnd, until Windows 11's Project Reunion), and Gtk4 decouple the GPU part from the model. Yes, it is in fact MVVMGM now (Model-View-ViewModel-GPUModel), so another memory-intensive layer. If you duplicate data, you need less synchronisation. Jetpack Compose on Android and SwiftUI on macOS/iOS now use immutability of GUI widgets to further improve this View->GPUModel step.
Especially with the GPU model/scene graph, the statement that GUIs are single-threaded is no longer entirely true.
Two reasons, as far as I can tell:
It is much easier to reason about single-threaded code; thus the event loop model reduces the likelihood of bugs.
2D user interfaces are not CPU-intensive. An old computer with a wimpy graphics card can smoothly render all the windows, frames, widgets, etc. you could possibly desire without skipping a beat.
Basically, if single-threaded code is easier and tends to have fewer bugs, favor that over multithreaded code unless you have a compelling need for parallelization or speed. Your typical GUI frameworks don't have this need.
Now, of course, we've all experienced lagginess and freezes in GUI applications. I'd argue that the vast majority of the time this is the developer's fault: running long synchronous code in response to an event that should have been handled asynchronously (a mechanism all the major UI frameworks have).

Usefulness of Lockfree programming on Userlevel Threads

AFAIK, as opposed to time-slice-based scheduler preemption, pure user-level threads (ULTs) yield the processor to other threads cooperatively. However, from my surfing of the internet I see that several preemptive user-thread mechanisms now exist.
With this in mind, I wanted to start a discussion on the benefits of lock-free programming on user-level threads. My understanding is that, irrespective of the presence of a preemptive scheduler, the performance of lock-free programming should surpass that of mutex/semaphore-based programs.
However, I am still confused: since the acquire operation on a mutex also takes a fast path in the absence of contention, the performance gain need not be attractive enough to migrate to a lock-free approach.
In the case of semaphores, there is a system call involved, leading to a context switch, and hence lock-free approaches can be seen as a much better option.
Please comment on both situations: a ULT equipped with a preemption mechanism and one without it.
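To make the comparison concrete, here is a sketch contrasting a mutex-guarded counter with a lock-free counter built on C++ std::atomic. On most platforms an uncontended mutex acquire is itself little more than one atomic operation, which is why the gap can be small without contention (illustrative code, not a benchmark):

    #include <atomic>
    #include <cstdio>
    #include <mutex>
    #include <thread>
    #include <vector>

    std::mutex m;
    long guarded = 0;               // protected by m
    std::atomic<long> lockfree{0};  // no lock at all

    void bump(int n) {
        for (int i = 0; i < n; ++i) {
            { std::lock_guard<std::mutex> lk(m); ++guarded; }  // lock path
            lockfree.fetch_add(1, std::memory_order_relaxed);  // lock-free path
        }
    }

    int main() {
        std::vector<std::thread> ts;
        for (int i = 0; i < 4; ++i) ts.emplace_back(bump, 100000);
        for (auto &t : ts) t.join();
        std::printf("%ld %ld\n", guarded, lockfree.load());  // both 400000
    }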
This is not an easy question to answer; it is very general, and it will boil down to what your requirements are.
I have recently been working with systems where the use of lock-free structures was considered, but when we sat down and wrote out our requirements, we realized that they are in fact not what we want. Our system didn't really require them; in fact, locking helps us, because we typically have a producer/consumer architecture where, if there is nothing being produced (i.e. nothing being added to a queue), the consumer should be idle (i.e. blocked).
I recently wrote about this in more detail:
http://blog.chrisd.info/a-simple-thread-safe-queue-for-use-in-multi-threaded-c-applications/

Instant Messaging Server Design

Let's suppose we have an instant messaging application, client-server based, not P2P. The actual protocol doesn't matter; what matters is the server architecture. The server can be coded to operate in single-threaded, non-parallel mode using non-blocking sockets, which by definition allow us to perform operations like read and write effectively immediately. This very feature of non-blocking sockets allows us to use some sort of select/poll function at the very core of the server, waste next to no time in the actual socket read/write operations, and spend the time processing the information instead. Properly coded, this can be very fast, as far as I understand.
But there is a second approach: to multithread aggressively, creating new threads (obviously via some sort of thread pool, because thread creation can be very slow on some platforms and under some circumstances) and letting those threads work in parallel, while the main background thread handles accept() and such. I've seen this approach explained in various places over the net, so it obviously does exist.
Now the question is: if we have non-blocking sockets, immediate read/write operations, and a simple, easily coded design, why does the second variant even exist? What problems are we trying to overcome with the second design, i.e. with threads? AFAIK threads are usually used to work around slow and possibly blocking operations, but no such operations seem to be present here!
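For reference, the single-threaded design described in the question boils down to a loop like the following POSIX sketch (error handling, accept, and close paths elided; names are illustrative):

    #include <sys/select.h>
    #include <unistd.h>

    // One thread multiplexes every client socket: wait until some socket
    // is readable, do the (non-blocking, hence quick) reads, and go back
    // to waiting. All CPU time goes to processing, none to blocking.
    void serve(int listen_fd, int *clients, int nclients) {
        for (;;) {
            fd_set readable;
            FD_ZERO(&readable);
            FD_SET(listen_fd, &readable);
            int maxfd = listen_fd;
            for (int i = 0; i < nclients; ++i) {
                FD_SET(clients[i], &readable);
                if (clients[i] > maxfd) maxfd = clients[i];
            }
            select(maxfd + 1, &readable, nullptr, nullptr, nullptr);
            if (FD_ISSET(listen_fd, &readable)) {
                // accept() the new client and add it to the set
            }
            for (int i = 0; i < nclients; ++i)
                if (FD_ISSET(clients[i], &readable)) {
                    // recv() without blocking, parse, queue replies
                }
        }
    }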
I'm assuming you're not talking about having a thread per client, as such a design is usually chosen for completely different reasons, but rather about a pool of threads, each handling several concurrent clients.
The reason for that architecture versus a single-threaded server is simply to take advantage of multiple processors. You're doing more work than just I/O. You have to parse the messages, do various kinds of work, maybe even run some fairly heavyweight crypto algorithms. All this takes CPU. If you want to scale, taking advantage of multiple processors lets you scale further and/or keep the latency lower per client.
Some of the gain in such a design can be offset a bit by the fact that you might need more locking in a multithreaded environment, but done right, and certainly depending on what you're doing, it can be a huge win, at the expense of more complexity.
Also, this might help overcome OS limitations. The I/O paths in the kernel might be more distributed among the processors. Not all operating systems can fully thread the I/O from a single-threaded application. Back in the old days there weren't all the great alternatives to the old *nix select(), which usually had a file-descriptor limit of 1024, and similar APIs started degrading severely once you told them to monitor too many sockets. Spreading all those clients over multiple threads or processes helped overcome that limit.
As for a 1:1 mapping between threads and clients, there are several reasons to implement that architecture:
Easier programming model, which might lead to fewer hard-to-find bugs and faster implementation.
Support for blocking APIs. These are all over the place. Having one thread handle many/all of the clients and then go on to make a blocking call to a database is going to stall everyone. Even reading files can block your application, and you usually can't monitor regular file handles/descriptors for I/O events; when you can, the programming model is often exceptionally complicated.
The drawback here is that it won't scale, at least not with the most widely used languages/frameworks. Having thousands of native threads will hurt performance. Some languages do provide a much more lightweight approach here, though, such as Erlang and, to some extent, Go.

When is multi-threading not a good idea? [closed]

I was recently working on an application that sent and received messages over Ethernet and serial. I was then tasked with adding the monitoring of DIO discretes. I thought,
"No reason to interrupt the main thread, which is involved in message processing; I'll just create another thread that monitors DIO."
This decision, however, proved to be a poor one. Sometimes the main thread would be interrupted between a send and a receive of a serial message. This interruption would disrupt the timing and, alas, messages would be lost (forever).
I found another way to monitor the DIO without using another thread, and Ethernet and serial communication were restored to correct functionality.
The whole fiasco, however, got me thinking. Are there any general guidelines about when not to use multiple threads, and/or does anyone have more examples of situations in which using multiple threads is not a good idea?
EDIT: Based on your comments, and after scouring the internet for information, I have composed a blog post entitled "When is multi-threading not a good idea?"
On a single-processor machine running a desktop application, you use multiple threads so you don't freeze the app, but not for much else.
On a single-processor server running a web-based app, there is no need for multithreading because the web server handles most of it.
On a multi-processor machine running a desktop app, you are advised to use multiple threads and parallel programming. Make as many threads as there are processors.
On a multi-processor server running a web-based app, there is again no need for multithreading because the web server handles it.
In total, if you use multiple threads for anything other than keeping desktop apps responsive, you will make the app slower on a single-core machine due to the threads interrupting each other.
Why? Because of context switches. It takes time to switch between threads. On a multi-core box, go ahead and use one thread per core and you will see a real speed-up.
To paraphrase an old quote: A programmer had a problem. He thought, "I know, I'll use threads." Now the programmer has two problems. (Often attributed to JWZ, but it seems to predate his use of it talking about regexes.)
A good rule of thumb is "Don't use threads, unless there's a very compelling reason to use threads." Multiple threads are asking for trouble. Try to find a good way to solve the problem without using multiple threads, and only fall back to using threads if avoiding it is as much trouble as the extra effort to use threads. Also, consider switching to multiple threads if you're running on a multi-core/multi-CPU machine, and performance testing of the single threaded version shows that you need the performance of the extra cores.
Multi-threading is a bad idea if:
Several threads access and update the same resource (set a variable, write to a file), and you don't understand thread safety (see the race sketch after this list).
Several threads interact with each other and you don't understand mutexes and similar thread-management tools.
Your program uses static variables (threads typically share them by default).
You haven't debugged concurrency issues.
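The race sketch referenced above: the classic lost-update bug, two threads incrementing one shared variable without synchronization (a minimal C++ illustration; run it a few times and the total will usually come out short):

    #include <cstdio>
    #include <thread>

    long counter = 0;  // shared and unsynchronized: this is the bug

    void bump() {
        for (int i = 0; i < 1000000; ++i)
            ++counter;  // read-modify-write is not atomic: updates get lost
    }

    int main() {
        std::thread a(bump), b(bump);
        a.join();
        b.join();
        std::printf("%ld (expected 2000000)\n", counter);  // usually less
    }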
Actually, multithreading is not scalable and is hard to debug, so it should not be used in any case where you can avoid it. There are a few cases where it is mandatory: when performance on a multi-CPU machine matters, or when you deal with a server that has a lot of clients taking a long time to answer.
In other cases, you can use alternatives such as a queue plus cron jobs.
You might want to take a look at Dan Kegel's "The C10K problem" web page about handling multiple data sources/sinks.
Basically it is best to use minimal threads, which for sockets can be done in most OSes with some event system (or asynchronously on Windows using IOCP).
When you run into cases where the OS and/or libraries do not offer a way to perform communication in a non-blocking manner, it is best to use a thread pool to handle them while reporting back to the same event loop.
Example diagram of layout:
    Per CPU [*] EVENT LOOP ---- handles non-blocking I/O using OS/library utilities
                 |
                 +--- thread pool for various blocking events
                 +--- thread pool for handling the I/O messages that would take long
Multithreading is bad except in the single case where it is good. That case is when:
The work is CPU-bound, or parts of it are CPU-bound.
The work is parallelisable.
If either or both of these conditions are missing, multithreading is not going to be a winning strategy.
If the work is not CPU-bound, then you are waiting not for threads to finish work but for some external event, such as network activity, for the process to complete its work. With threads, there is the additional cost of context switches between threads, the cost of synchronization (mutexes, etc.), and the irregularity of thread preemption. The alternative in most common use is asynchronous I/O, in which a single thread listens on several I/O ports and acts on whichever happens to be ready, one at a time. If by some chance these slow channels all happen to become ready at the same time, it might seem that you would experience a slowdown, but in practice this is rarely true. The cost of handling each port individually is often comparable to or better than the cost of synchronizing state across multiple threads as each channel is emptied.
Many tasks may be compute-bound yet still not practical candidates for a multithreaded approach, because the process must synchronise on its entire state. Such a program cannot benefit from multithreading because no work can be performed concurrently. Fortunately, most programs that require enormous amounts of CPU can be parallelized to some degree.
Multi-threading is not a good idea if you need to guarantee precise physical timing (like in your example). Other cons include intensive data exchange between threads. I would say multi-threading is good for really parallel tasks if you don't care much about their relative speed/priority/timing.
A recent application I wrote that had to use multithreading (though not an unbounded number of threads) was one where I had to communicate in several directions over two protocols and also monitor a third resource for changes. Both protocol libraries required a thread to run their respective event loops, and once those were accounted for, it was easy to create a third loop for the resource monitoring. In addition to the event-loop requirements, the messages going over the wires had strict timing requirements, and one loop couldn't be allowed to risk blocking another; this was further alleviated by using a multicore CPU (SPARC).
There was further discussion about whether each message-processing step should be considered a job given to a thread from a thread pool, but in the end that was an extension that wasn't worth the work.
All in all, threads should, if possible, only be considered when you can partition the work into well-defined jobs (or series of jobs) such that the semantics are relatively easy to document and implement, and when you can put an upper bound on the number of threads in use and on their interactions. Systems where this is best applied are essentially message-passing systems.
In principle, every time there is no overhead for the caller to wait in a queue.
A couple more possible reasons to use threads:
Your platform lacks asynchronous I/O operations, e.g. Windows ME (no completion ports or overlapped I/O, a pain when porting XP applications that use them) or Java 1.3 and earlier.
A third-party library function can hang, e.g. if a remote server is down, the library provides no way to cancel the operation, and you can't modify it.
Keeping a GUI responsive during intensive processing doesn't always require additional threads. A single callback function is usually sufficient.
If none of the above apply and I still want parallelism for some reason, I prefer to launch an independent process if possible.
I would say multi-threading is generally used to:
Allow data processing in the background while a GUI remains responsive
Split very big data analysis onto multiple processing units so that you can get your results quicker.
When you're receiving data from some hardware and need something to continuously add it to a buffer while some other element decides what to do with it (write to disk, display on a GUI etc.).
So if you're not solving one of those problems, it's unlikely that adding threads will make your life easier. In fact, it will almost certainly make it harder, because, as others have mentioned, debugging multithreaded applications is considerably more work than a single-threaded solution.
Security might be a reason to avoid using multiple threads in favor of multiple processes. See Google Chrome for an example of multi-process safety features.
Multi-threading is scalable and will allow your UI to maintain its responsiveness while doing very complicated things in the background. I don't understand where the other responses are getting their information on multi-threading.
"When shouldn't I multi-thread?" is a misleading question for your problem. Your problem is this: why did multi-threading my application cause serial/Ethernet communications to fail?
The answer to that question will depend on the implementation, which should be discussed in another question. I know for a fact that you can have both Ethernet and serial communications happening in a multi-threaded application at the same time as numerous other tasks without causing any data loss.
The one reason not to use multi-threading is:
There is one task, and no user interface with which the task will interfere.
The reasons to use multi-threading are:
Provides superior responsiveness to the user
Performs multiple tasks at the same time to decrease overall execution time
Makes use of today's multi-core CPUs, and the many-core CPUs of the future.
There are three basic methods of multi-threaded programming that make thread safety easy to implement; you only need one of them for success (a sketch of the second method follows this list):
Thread-safe data types passed between threads.
Thread-safe methods in the threaded object to modify the data passed between threads.
PostMessage capabilities to communicate between threads.
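A sketch of the second method, as an illustration only: an object whose public methods take an internal lock, so callers on any thread can use it without external locking (names are hypothetical):

    #include <mutex>
    #include <string>
    #include <vector>

    // A "monitor"-style object: every public method acquires the internal
    // mutex, so the object is safe to share among threads as-is.
    class SharedLog {
    public:
        void append(std::string line) {
            std::lock_guard<std::mutex> lock(mutex_);
            lines_.push_back(std::move(line));
        }

        std::vector<std::string> snapshot() const {
            std::lock_guard<std::mutex> lock(mutex_);
            return lines_;  // copy out under the lock
        }

    private:
        mutable std::mutex mutex_;
        std::vector<std::string> lines_;
    };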
Are the processes parallel? Is performance a real concern? Are there multiple 'threads' of execution, as on a web server? I don't think there is a definitive answer.
A common source of threading issues is the usual approach employed to synchronize data. Having threads share state and then implementing locking at all the appropriate places is a major source of complexity for both design and debugging. Getting the locking right, balancing stability, performance, and scalability, is always a hard problem to solve. Even the most experienced experts get it wrong frequently. Alternative techniques for dealing with threading can alleviate much of this complexity. The Clojure programming language, for example, implements several interesting techniques for dealing with concurrency.