How to take control of the primary OS thread in Haskell?

I am trying to embed a ruby1.9 interpreter in a program. I am currently using forkOS in my hruby package, but it seems this only works for ruby 1.8 and 2.x. It looks like 1.9 needs to execute in the primary thread. As a side note, there is no documentation on how to do such a thing, so the only pointer to my current problem is here.
Is there a way to take control of the primary thread to run all my FFI calls?

Having done some testing and reading of documentation, I have come to the following conclusions. The Haskell report says all of this is implementation-defined, so there is no standard way. The documentation for Control.Concurrent states that main runs in a bound thread; however, it does not require that this be the same as the primary OS thread.
Experimentally (at least on 64-bit Linux with GHC 7.8 and 7.10-rc3), the main thread does run on the primary OS thread. Given that the main thread is bound, there seems to be no reason for this to differ on other GHC platforms, although I cannot test other platforms.
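You can at least check the bound-thread half of this directly; a minimal sketch (it verifies that the calling thread is bound, not that it is the primary OS thread):

import Control.Concurrent (isCurrentThreadBound, rtsSupportsBoundThreads)

main :: IO ()
main = do
  print rtsSupportsBoundThreads   -- True when linked with the threaded RTS
  print =<< isCurrentThreadBound  -- under the threaded RTS, main is bound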
In terms of actually implementing this: if you want to program as if Ruby were running in a different thread, you can run most of the non-Ruby work in other threads and communicate with the main thread (which talks to the Ruby interpreter) via either MVars or TVars. See the comment by @chi for an example of how this is done in gtk.
In terms of a library interface, you can have an initialisation function that takes a continuation. Your library hijacks the calling thread at initialisation and then runs the continuation on another thread. Of course, you then need to document to users that the initialisation function must be called from the main thread.
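A minimal sketch of such an interface, assuming hypothetical names withMainThread and runOnMain and omitting exception handling:

import Control.Concurrent (forkIO)
import Control.Concurrent.MVar

-- A handle for submitting actions to the hijacked main thread.
newtype MainThread = MainThread (MVar (IO ()))

-- Hijack the calling thread (documented to be main) and run the
-- user's continuation on a fresh green thread instead.
withMainThread :: (MainThread -> IO ()) -> IO ()
withMainThread user = do
  queue <- newEmptyMVar
  _ <- forkIO (user (MainThread queue))
  let loop = takeMVar queue >>= id >> loop  -- serve requests forever
  loop

-- Run an action on the main thread and wait for its result.
runOnMain :: MainThread -> IO a -> IO a
runOnMain (MainThread queue) act = do
  result <- newEmptyMVar
  putMVar queue (act >>= putMVar result)
  takeMVar result

Every Ruby FFI call would then be wrapped in runOnMain, so it executes on the primary OS thread no matter which green thread requests it.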

Related

Doing UI on a background thread

The SDL documentation for threading states:
NOTE: You should not expect to be able to create a window, render, or receive events on any thread other than the main one.
The GLFW documentation for glfwCreateWindow states:
Thread safety: This function must only be called from the main thread.
I have read about issues from people who have tried to run the GLUT library's windowing functions on a second thread.
I could go on with these examples, but I think you get the point I'm trying to make. A lot of cross-platform libraries don't allow you to create a window on a background thread.
Now, two of the libraries I mentioned are designed with OpenGL in mind, and I get that OpenGL is not designed for multithreading and you shouldn't do rendering on multiple threads. That's fine. The thing that I don't understand is why the rendering thread (the single thread that does all the rendering) has to be the main one of the application.
As far as I know, neither Windows nor Linux nor macOS imposes any restrictions on which threads can create windows. I do know that windows have affinity to the thread that creates them (only that thread can receive input for them, etc.), but that thread does not need to be the main one.
So, I have three questions:
Why do these libraries impose such restrictions? Is it because there is some obscure operating system that mandates that all windows be created on the main thread, and so all operating systems have to pay the price? (Or did I get it wrong?)
Why do we have this imposition that you should not do UI on a background thread? What do threads have to do with windowing, anyways? Is it not a bad abstraction to tie your logic to a specific thread?
If this is what we have and can't get rid of it, how do I overcome this limitation? Do I make a ThreadManager class and yield the main thread to it so it can schedule what needs to be done in the main thread and what can be done in a background thread?
It would be amazing if someone could shed some light on this topic. All the advice I see thrown around is to just do input and UI both on the main thread. But that's just an arbitrary restriction if there isn't a technical reason why it isn't possible to do otherwise.
PS: Please note that I am looking for a cross platform solution. If it can't be found, I'll stick to doing UI on the main thread.
While I'm not quite up to date on the latest releases of macOS/iOS, as of 2020 Apple's UIKit and AppKit were not thread-safe. Only one thread can safely change UI objects, and unless you go to a lot of trouble, that's going to be the main thread. Even if you do go to all the trouble of closing the window manager connection and so on, you're still going to end up with only one thread doing UI. So the limitation still applies on at least one major system.
While it's possibly unsafe to directly modify the contents of a window from any other thread, you can do software rendering to an offscreen bitmap image from any thread you like, taking as long as you like, and then hand the finished image over to the main thread for display. (The "possibly" is why cross-platform toolkits disallow it or tell you not to: sometimes it might work, but you can't say why, or even be sure that it will keep working.)
With Vulkan and DirectX 12 (and, I think but am not sure, Metal) you can render from multiple threads. Woohoo! Of course, now you have to figure out how to do all the coordination, locking, and cross-synchronisation without making the whole thing slower than single-threaded, but at least you have the option to try.
Adding to the excellent answer by Matt: with Qt programs you can use invokeMethod and postEvent to have background threads update the UI safely.
It's highly unlikely that any of these frameworks actually care about which thread is the 'main thread', i.e., the one that called the entry point to your code. The real restriction is that you have to do all your UI work on the thread that initialized the framework, i.e., the one that called SDL_Init in your case. You will usually do this in your main thread. Why not?
Multithreaded code is difficult to write and difficult to understand, and in UI work, introducing multithreading makes it difficult to reason about when things happen. A UI is a very stateful thing, and when you're writing UI code, you usually need to have a very good idea about what has happened already and what will happen next -- those things are often undefined when multithreading is involved. Also, users are slow, so multithreading the UI is not really necessary for performance in normal cases. Because of all this, making a UI framework thread-safe isn't usually considered beneficial. (multithreading compute-intensive parts of your rendering pipeline is a different thing)
Single-threaded UI frameworks have a dispatcher of some sort that you can use to enqueue activities that should happen on the main thread when it next has time. In SDL, you use SDL_PushEvent for this. You can call that from any thread.
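The same dispatcher pattern can be sketched in a few lines of Haskell, with a channel of IO actions standing in for the event queue (a conceptual sketch, not how SDL itself is organised):

import Control.Concurrent (forkIO, threadDelay)
import Control.Concurrent.Chan
import Control.Monad (forever, join)

main :: IO ()
main = do
  uiQueue <- newChan :: IO (Chan (IO ()))
  -- A worker thread enqueues UI work instead of touching the UI itself.
  _ <- forkIO $ do
    threadDelay 100000  -- pretend to compute something
    writeChan uiQueue (putStrLn "widget updated from worker")
  -- The main thread drains the queue, so every UI action runs here.
  forever (join (readChan uiQueue))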

Do IO operations run in green threads?

Given the example from Control.Concurrent.Async:
do a1 <- async (getURL url1)
   a2 <- async (getURL url2)
   page1 <- wait a1
   page2 <- wait a2
Do the two getURL calls run on different OS threads, or just different green threads?
In case my question doesn't make sense... say the program is running on only one OS thread: will these calls still be made at the same time? Do blocking IO operations block the whole OS thread, and with it all the green threads on that OS thread, or just one green thread?
From the documentation of Control.Concurrent.Async:
This module provides a set of operations for running IO operations asynchronously and waiting for their results. It is a thin layer over the basic concurrency operations provided by Control.Concurrent.
and Control.Concurrent:
Scheduling of Haskell threads is done internally in the Haskell runtime system, and doesn't make use of any operating system-supplied thread packages.
This last may be a bit misleading if not interpreted carefully: although the scheduling of Haskell threads -- that is, the choice of which Haskell code to run next -- is done without using any OS facilities, GHC can and does use multiple OS threads to actually execute whatever code is chosen to be run, at least when using the threaded runtime system.
It should all be green threads.
If your program is compiled (or rather, linked) with the single-threaded RTS, then all green threads run in a single OS thread. If your program is compiled (linked) with the multi-threaded RTS, then some arbitrary number of green threads are scheduled across (by default) one OS thread per CPU core.
As far as I'm aware, in either case blocking I/O calls should only block one green thread. Other green threads should be completely unaffected.
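A quick experiment along these lines (a sketch; threadDelay stands in for a blocking call, and like network reads it suspends only its own green thread):

import Control.Concurrent

main :: IO ()
main = do
  -- This green thread "blocks" for two seconds...
  _ <- forkIO $ threadDelay 2000000 >> putStrLn "slow thread done"
  -- ...while the others keep running, even on the single-threaded RTS.
  mapM_ (forkIO . print) [1 .. 5 :: Int]
  threadDelay 3000000  -- keep main alive long enough to see the output

Linking with ghc -threaded swaps in the multi-threaded RTS; the interleaving you observe is similar either way.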
This isn't as simple as the question seems to imply. Haskell is a more capable programming language than most you will have run into. In particular, IO operations that appear to block from an internal point of view may be implemented as the sequence: start a non-blocking IO operation, suspend the thread, wait for that operation to complete in an IO manager that serves multiple Haskell threads, then queue the thread for resumption once the IO device is ready.
See waitRead# and waitWrite# for the primitives that provide that functionality with the standard global IO manager.
Using green threads or not is mostly irrelevant with this pattern. IO operations can be written to use non-blocking IO behind the scenes, with proper multiplexing, while appearing to present a blocking interface to their users.
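The user-facing wrappers over those primitives are threadWaitRead and threadWaitWrite in Control.Concurrent; a minimal sketch of a blocking-looking wait built on them (the helper name waitThenHandle is hypothetical):

import Control.Concurrent (threadWaitRead)
import System.Posix.Types (Fd)

-- Looks blocking to the caller, but only this green thread is
-- suspended; the IO manager wakes it when the descriptor is readable.
waitThenHandle :: Fd -> IO a -> IO a
waitThenHandle fd handler = threadWaitRead fd >> handler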
Unfortunately, it's not that simple either. The fact is that OS limitations get in the way. Until very recently (I think the 5.1 kernel was released yesterday, maybe?), Linux has provided no good interface for non-blocking disk operations. Sure, there were things that looked like they should work, but in practice they weren't very good. So disk reads/writes are actual blocking operations in GHC. (Not just on Linux, either. GHC doesn't have a lot of developers supporting it, so a lot of things are written with the same code that works on Linux, even if there are other alternatives.)
But it's not even as simple as "network operations are hidden non-blocking, disk operations are blocking". At least maybe not. I don't actually know, because it's so hard to find documentation on the non-threaded runtime. I know the threaded runtime actually maintains a separate thread pool for performing FFI calls marked as "safe", which prevents them from blocking execution of green threads. I don't know if the same is true with the non-threaded runtime.
But for your example, I can say - assuming getURL uses the standard network library (it's a hypothetical function anyway), it'll be doing non-blocking IO with proper multiplexing behind the scenes. So those operations will be truly concurrent, even without the threaded runtime.

Tcl interpreter parameters contaminated

I have a multithreaded C++ program in which the main thread creates two Tcl interpreters, interp#1 and interp#2. While running in parallel, the main thread and one slave thread each invoke different commands through interp#1 and interp#2 separately. At some point a memory error occurs and the program crashes.
The log file tells me that some value of kObjv[] for interp#1 is contaminated by that for interp#2.
I also ran helgrind to check for possible data races, and it dumps plenty of data-race warnings on underlying Tcl library APIs such as Tcl_NewStringObj/TclFreeObj/ResetObjResult/TclNREvalObjv, etc.
It looks like the underlying memory is shared by interpreters created from the same thread. Is that true? My program links the static Tcl 8.6 library, which was built with threads enabled.
The Tcl library uses thread-bound memory pooling to (hugely!) reduce pressure on global locks, with the consequence that every Tcl interpreter object is also strongly bound to the thread that created it. (This is the Apartment Threading Model, if you're familiar with that.) You cannot safely use a Tcl interpreter from any other thread. If you want to have access to a Tcl interpreter in each thread, each thread should create its own interpreter and use that.
There are a few operations that allow safe inter-thread communication, specifically Tcl_ThreadQueueEvent() and Tcl_ThreadAlert(), which allow you to lodge a message for the other thread to handle when it is ready (every thread with a Tcl interpreter on it has an event queue associated with it inside the Tcl library; this is in the core of the Tcl event notifier engine).
You're recommended to use the Tcl thread package (which should be part of any good Tcl 8.6 installation and is available for older versions too) for inter-thread working in Tcl. Apart from the complexity of getting each side to know what the handle for the other thread is, it's really quite easy to use.

forkIO threads and OS threads

If I create a thread using forkIO, I need to provide a function to run and I get back an identifier (ThreadId). I can then communicate with this animal via e.g. workloads, MVars, etc. However, to my understanding, the created thread is very limited and can only work in a sort of SIMD fashion, where the function that was provided at thread creation is the instruction. I cannot change the function that I provided when the thread was initiated. I understand that these user threads are eventually mapped by the OS to OS threads.
I would like to know how Haskell threads and OS threads interface. Why can Haskell threads that do completely different things be mapped to one and the same OS thread? Why was there no need to initiate the OS thread with a fixed instruction (as is needed in forkIO)? How does the scheduler(?) recognize user threads in an application that could possibly be distributed? In other words, why are OS threads so flexible?
Last, is there any way to dump the heap of a selected thread from within the application?
First, let's address one quick misconception:
I understand that these user threads are eventually mapped by the OS to OS threads.
Actually, the Haskell runtime is in charge of choosing which Haskell thread a particular OS thread from its pool is executing.
Now the questions, one at a time.
Why can Haskell threads that do completely different things be mapped to one and the same OS thread?
Ignoring FFI for the moment, all OS threads are actually running the Haskell runtime, which keeps track of a list of ready Haskell threads. The runtime chooses a Haskell thread to execute, and jumps into the code, executing until the thread yields control back to the runtime. At that moment, the runtime has a chance to continue executing the same thread or pick a different one.
In short: many Haskell threads can be mapped to a single OS thread because in reality that OS thread is doing only one thing, namely, running the Haskell runtime.
Why was there no need to initiate the OS thread with a fixed instruction (as is needed in forkIO)?
I don't understand this question (and I think it stems from a second misconception). You start OS threads with a fixed instruction in exactly the same sense that you start Haskell threads with a fixed instruction: for each thing, you just give a chunk of code to execute and that's what it does.
How does the scheduler(?) recognize user threads in an application that could possibly be distributed?
"Distributed" is a dangerous word: usually, it refers to spreading code across multiple machines (presumably not what you meant here). As for how the Haskell runtime can tell when there's multiple threads, well, that's easy: you tell it when you call forkIO.
In other words, why are OS threads so flexible?
It's not clear to me that OS threads are any more flexible than Haskell threads, so this question is a bit strange.
Last, is there any way to dump the heap of a selected thread from within the application?
I actually don't really know of any tools for dumping the Haskell heap at all, in multithreaded applications or otherwise. You can dump a representation of the part of the heap reachable from a particular object, if you like, using a package like vacuum. I've used vacuum-cairo to visualize these dumps with great success in the past.
For further information, you may enjoy the middle two sections, "Conventions" and "Foreign Imports", from my intro to multithreaded gtk2hs programming, and perhaps also bits of the section on "The Non-Threaded Runtime".
Instead of trying to directly answer your question, I will try to provide a conceptual model for how multi-threaded Haskell programs are implemented. I will ignore many details and complexities.
Operating systems implement preemptive multithreading using hardware interrupts to allow multiple "threads" of computation to run logically on the same core at the same time.
The threads provided by operating systems tend to be heavyweight. They are well suited to certain types of "multi-threaded" applications and, on systems like Linux, are fundamentally the same tool that allows multiple programs to run at the same time (a task they excel at).
But these threads are a bit heavyweight for many uses in high-level languages such as Haskell. Essentially, the GHC runtime works as a mini-OS, implementing its own "threads" on top of OS threads, in the same way an OS implements threads on top of cores.
It is conceptually easy to imagine that a language like Haskell would be implemented in this way. Evaluating Haskell consists of "forcing thunks" where a thunk is a unit of computation that might 1. depend on another value (thunk) and/or 2. create new thunks.
Thus, one can imagine multiple threads each evaluating thunks at the same time. One would construct a queue of thunks to be evaluated. Each thread would pop the top of the queue, and evaluate that thunk until it was completed, then select a new thunk from the queue. The operation par and its ilk can "spark" new computation by adding a thunk to that queue.
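A sketch of sparking with Control.Parallel from the parallel package (whether a spark actually runs in parallel is up to the runtime; compile with -threaded and run with +RTS -N to give it the chance):

import Control.Parallel (par, pseq)

-- Spark `a` for parallel evaluation, force `b` here, then combine.
parSum :: Int -> Int
parSum n = a `par` (b `pseq` (a + b))
  where
    a = sum [1 .. n]          -- sparked: may be picked up by another worker
    b = sum [n + 1 .. 2 * n]  -- evaluated by the current thread

main :: IO ()
main = print (parSum 1000000)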
Extending this model to IO actions is not particularly hard to imagine either. Instead of each thread simply forcing pure thunks, we imagine the unit of Haskell computation being somewhat more complicated. Pseudo-Haskell for such a runtime:
type Spark = (ThreadId,Action)
data Action = Compute Thunk | Perform IOAction
Note: this is for conceptual understanding only; don't think things are implemented this way.
When we run a Spark, we look for exceptions "thrown" to that thread ID. Assuming we have none, execution consists of either forcing a thunk or performing an IO action.
Obviously, my explanation here has been very hand-wavy and ignores some complexity. For more, the GHC team has written excellent articles such as "Runtime Support for Multicore Haskell" by Marlow et al. You might also want to look at a textbook on operating systems, as they often go into some depth on how to build a scheduler.

In Apple's Cocoa API, why is it important that NSApplicationMain be called from the main thread?

In the documentation for NSApplicationMain, it says:
Creates the application, loads the main nib file from the application’s main bundle, and runs the application. You must call this function from the main thread of your application [...].
The "main thread" obviously refers to the first thread of the program, where main(argc, argv) starts. A quick look through the NSThread documentation reveals + (BOOL)isMainThread, which can be used to determine whether the current thread is the "main" one or not. I ran some tests: this method works regardless of whether NSApplicationMain has been called yet.
My question has two (somewhat related) parts:
What is so special about the main thread for NSApplicationMain?
How does Cocoa identify the main thread in the first place?
Here is a good place to study NSApplicationMain by following a re-implementation of the function. NSApplicationMain must be called from the main thread primarily because:
It handles the primary interface
UI elements (in several systems, not just OS X) need to all be called within the same thread to function correctly.
Graphical elements provided within the Cocoa framework assume they'll be running in the main thread.
So, pretty much: since Cocoa calls things on the main thread, and the UI needs to all run on the same thread, you need to work within the main thread for anything touching the UI, including NSApplicationMain.
