How do you share programming language interpreter state between threads for parallel interpretation?

I wrote a simple parallel multithreaded interpreter for an imaginary assembly language that can communicate variables and method calls between threads - it sends jump addresses - using an underlying actor mailbox implementation. (The code is very straightforward and can be found in my multiversion-concurrency-control repository.)
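A minimal sketch of the mailbox idea, with Python's queue.Queue standing in for the actor mailbox (the names here are illustrative, not the actual repository code):

```python
import queue
import threading

# Sketch: each interpreter thread owns an actor-style mailbox, and peer
# threads post messages such as jump addresses into it.
class InterpreterThread(threading.Thread):
    def __init__(self, name):
        super().__init__(name=name, daemon=True)
        self.mailbox = queue.Queue()  # this thread's actor mailbox

    def send(self, message):
        # Called by OTHER threads: post a message (e.g. a jump address).
        self.mailbox.put(message)

    def run(self):
        pc = 0  # program counter into this thread's instruction stream
        while True:
            # Drain pending messages before the next fetch/decode/execute.
            try:
                message = self.mailbox.get_nowait()
                if message["kind"] == "jump":
                    pc = message["address"]  # jump address sent by a peer
                elif message["kind"] == "halt":
                    return
            except queue.Empty:
                pass
            # ... fetch, decode, and execute the instruction at pc ...
            pc += 1
```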
When I implement objects and classes à la object-oriented programming, I will face a new problem.
I want parallel execution of interpreted code in the interpreter loop in each thread.
Python has the same problem - you have an object hierarchy, and the class objects are loaded into the interpreter's in-memory context. Each thread parses the objects and classes independently, so the memory identity of these objects differs between threads.
One idea I had was to hash the code of an implementation and use that as the hash for equivalence; then we can send an object hash and do a zero-copy transfer of thread ownership from one thread to another.
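A rough sketch of the hashing idea, assuming SHA-256 over the source text as the equivalence hash (all names here are illustrative):

```python
import hashlib

# Each thread computes the same hash from the same source text, so the hash
# is a thread-independent identity even though the in-memory class objects
# differ per thread.
def code_hash(source_text: str) -> str:
    return hashlib.sha256(source_text.encode("utf-8")).hexdigest()

class ThreadContext:
    def __init__(self):
        self.classes_by_hash = {}  # hash -> this thread's class object

    def load_class(self, source_text, cls):
        self.classes_by_hash[code_hash(source_text)] = cls

    def resolve(self, hash_value):
        # A peer thread sent us a hash; look up OUR copy of the class.
        return self.classes_by_hash[hash_value]
```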
For a compiled language, the solution would need to account for garbage collection, reference counting, or some other memory management scheme such as RAII across threads.
For simplicity, my idea is that when an object is sent from one thread to another, the sender's reference to the object is removed so the variable can no longer be used. This assumes a zero-copy implementation.
Alternatively, we can keep separate bookkeeping per thread, which means each thread maintains its own view of ownership. If every thread's reference count drops to 0, the object is cleared.
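A sketch of both ideas, assuming a hypothetical ownership table guarded by a single lock (a real implementation would want something less contended):

```python
import threading
from collections import defaultdict

# 1. Sending an object "moves" it - the sender's binding is invalidated.
# 2. Per-thread refcounts are kept separately; the object is freed only
#    when every thread's count has dropped to zero.
class OwnershipTable:
    def __init__(self):
        self.lock = threading.Lock()
        self.counts = defaultdict(dict)  # object id -> {thread id -> count}

    def retain(self, obj_id, thread_id):
        with self.lock:
            per_thread = self.counts[obj_id]
            per_thread[thread_id] = per_thread.get(thread_id, 0) + 1

    def release(self, obj_id, thread_id):
        with self.lock:
            per_thread = self.counts[obj_id]
            per_thread[thread_id] -= 1
            if per_thread[thread_id] == 0:
                del per_thread[thread_id]
            if not per_thread:        # all threads' counts reached zero
                del self.counts[obj_id]
                return True           # caller may now free the object
            return False

def send(sender_vars, var_name, dest_mailbox):
    # Zero-copy "move": hand the reference over and delete the sender's
    # binding so the variable can no longer be used on this thread.
    obj = sender_vars.pop(var_name)
    dest_mailbox.put(obj)
```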

Related

Can a singleton object have a (thread-safe) method executing in different threads simultaneously?

If so, how will different threads share the same instance or chunk of memory that represents the object? Will different threads somehow "copy" the single-instance object method code to be run on its own CPU resources?
EDIT - to clarify this question further:
I understand that different threads can be in the "process" of executing a singleton object's method at the same time, while they might not all be actively executing - they may be waiting to be scheduled for execution by the OS and in various states of execution within the method. This question is specifically for multiple active threads that are executing each on a different processor. Can multiple threads be actively executing the same exact code path (same region of memory) at the same time?
Can a singleton object have a (thread-safe) method executing in different threads simultaneously?
Yes, of course. Note that this is no different from multiple threads running a method on the same non-singleton object instance.
how will different threads share the same instance or chunk of memory that represents the object?
(Ignoring NUMA and processor caches) the object exists in only one place in memory (hence, it's a singleton), so the different threads are all reading from the same memory addresses.
Now, if the object is immutable (and doesn't have external side-effects, like IO) then that's okay: multiple threads reading from the same memory and never changing it doesn't introduce any problems.
By analogy, this is like having a single (single-sided) piece of paper on a desk and 10 people reading it simultaneously: no problem.
If the object is mutable (or does have external side-effects like IO) then that's a problem :) You need to synchronize each thread's changes, otherwise they'll overwrite each other - this is bad.
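For illustration, a minimal Python sketch of the distinction (hypothetical Counter class), since the same point holds in any language: concurrent reads of immutable state need no coordination, but the read-modify-write on shared mutable state must be guarded:

```python
import threading

class Counter:
    _instance = None
    _instance_lock = threading.Lock()

    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    @classmethod
    def instance(cls):
        with cls._instance_lock:      # guard lazy singleton creation too
            if cls._instance is None:
                cls._instance = cls()
            return cls._instance

    def increment(self):
        with self._lock:              # serialize the read-modify-write
            self.value += 1

# Four threads all executing the SAME method body on the SAME instance:
threads = [threading.Thread(target=lambda: [Counter.instance().increment()
                                            for _ in range(1000)])
           for _ in range(4)]
for t in threads: t.start()
for t in threads: t.join()
assert Counter.instance().value == 4000
```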
Note that in many languages and platforms, like C#/.NET and C++, documentation can often assert or claim that a method is "thread-safe", but this is not necessarily absolutely guaranteed by the runtime (excepting the case where a function is provably "const-correct" and has no IO, which is something C++ can do but C# currently cannot).
Note that just because a single method is thread-safe doesn't mean an entire logical operation (e.g. calling multiple object methods in sequence) is thread-safe. For example, consider a Dictionary<K,V>: many threads can simultaneously call TryGetValue without issues, but as soon as one thread calls .Add it messes up all of the other threads calling TryGetValue (because Dictionary<K,V> does not guarantee that .Add is atomic and blocks TryGetValue). This is why we use ConcurrentDictionary, which has a different API - or we wrap our logical business/domain operations in lock (Monitor) statements.
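A Python analogue of the same pitfall (hypothetical get_or_add helper): the individual dict operations may each be safe in isolation, but the check-then-act sequence is only atomic if one lock spans the whole logical operation:

```python
import threading

cache = {}
cache_lock = threading.Lock()

def get_or_add(key, factory):
    with cache_lock:                  # lock spans the entire check-then-act
        if key not in cache:          # check ...
            cache[key] = factory(key) # ... then act, atomically together
        return cache[key]
```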
Will different threads somehow "copy" the single-instance object method code to be run on its own CPU resources?
No.
You can make that happen with ThreadLocal<T> and [ThreadStatic] (and AsyncLocal<T>), but proponents of functional programming techniques (like myself) will strongly recommend against this for many reasons. If you want a thread to "own" something, then pass the object on the stack as a parameter and not inside hidden static state.
(Also note this does not apply to value-types (e.g. struct) which are always copied between uses, but I assume you're referring to class object instances, fwiw - as .NET discourages using mutable structs anyway).

multiple threads vs reference counting: does each thread count variables separately

I've been playing around with glib, which utilizes reference counting to manage memory for its objects and supports multiple threads.
What I can't understand is how they play together.
Namely:
In glib, each thread doesn't seem to increase the refcount of objects passed on its input, AFAIK (I'll call them thread-shared objects). Is this true? (Or have I just failed to find the right piece of code?) Is it common practice not to increase the refcount of a thread-shared object for each thread that shares it, leaving the main thread responsible for refcounting it?
Still, each thread increases the reference counts of objects it dynamically creates itself. Should the programmer take care not to use the same variable names in each thread, in order to prevent name collisions and memory leaks? (E.g. in my illustration, thread2 shouldn't create a heap variable called output_object or it will collide with thread1's heap variable of the same name.)
UPDATE: The answer to question 2 is no, because the visibility scopes of those variables don't intersect:
Is dynamically allocated memory (heap) local to a function, or can all functions in a thread access it even without passing a pointer as an argument?
An illustration of my questions was attached here (image omitted).
I think that threads are irrelevant to understanding the use of reference counters. The point is rather ownership and lifetime, and a thread is just one thing that is affected by this. This is a bit difficult to explain, hopefully I'll make this clearer using examples.
Now, let's look at the given example where main() creates an object and starts two threads using that object. The question is, who owns the created object? The simple answer is that main() and both threads share this object, so this is shared ownership. In order to model this, you should increment the refcounter before each call to pthread_create(). If the call fails, you must decrement it again, otherwise it is the responsibility of the started thread to do that when it is done with the object. Then, when main() terminates, it should also release ownership, i.e. decrement the refcounter. The general rule is that when adding an owner, increment the refcounter. When an owner is done with the object, it decrements the refcounter and the last one destroys the object with that.
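A Python rendering of that ownership rule (an illustrative RefCounted class, not glib's API): increment before handing the object to the new thread, decrement if the hand-off fails, and let each owner, including main, release when done:

```python
import threading

class RefCounted:
    def __init__(self):
        self._refs = 1                     # creator owns one reference
        self._lock = threading.Lock()

    def ref(self):
        with self._lock:
            self._refs += 1
        return self

    def unref(self):
        with self._lock:
            self._refs -= 1
            last = self._refs == 0
        if last:
            self.destroy()                 # last owner frees the object

    def destroy(self):
        print("object destroyed")

def worker(obj):
    try:
        pass                               # ... use obj ...
    finally:
        obj.unref()                        # thread releases its ownership

obj = RefCounted()
t = threading.Thread(target=worker, args=(obj.ref(),))  # +1 for the thread
t.start()
obj.unref()                                # main releases its ownership
t.join()
```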
Now, why does the code not do this? Firstly, you can get away with adding the first thread as owner and then passing main()'s ownership to the second thread. This will save one increment/decrement operation. This still isn't what's happening though. Instead, no reference counting is done at all, and the simple reason is that it isn't needed. The point of refcounting is to coordinate the lifetime of a dynamically allocated object between different owners that are peers. Here though, the object is created and owned by main(); the two threads are not peers but rather slaves of main. Since main() is the master that controls start/stop of the threads, it doesn't have to coordinate the lifetime of the object with them.
Lastly, though that might be due to the example-ness of your code, I think that main simply leaks the reference, relying on the OS to clean up. While this isn't beautiful, it doesn't hurt. In general, you can allocate objects once and then use them forever without any refcounting in some cases. An example for this is the main window of an application, which you only need once and for the whole runtime. You shouldn't repeatedly allocate such objects though, because then you have a significant memory leak that will increase over time. Both cases will be caught by tools like valgrind though.
Concerning your second question, the heap variable name clash you expect doesn't exist. Function-local variable names cannot collide. This is not because they are used by different threads; even if the same function is called twice by the same thread (think recursion!), the local variables in each call to the function are distinct. Also, variable names are for the human reader; the compiler completely eradicates these.
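A tiny Python illustration of that point: every call, whether from another thread or a recursive call on the same thread, gets its own fresh locals:

```python
def depth_label(n):
    label = f"call at depth {n}"   # a fresh local on every call
    if n > 0:
        depth_label(n - 1)         # the recursive call's label is distinct
    return label

assert depth_label(3) == "call at depth 3"
```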
UPDATE:
As matthias says below, GObject is not thread-safe; only the reference counting functions are.
Original content:
GObject is supposed to be thread safe, but I've never played with that myself…

guarding a critical section with a mutex

Let's say that I have a shared object that has a piece of code protected by a critical section, and more than one thread is accessing the object for read/write. When a thread is inside the critical section, the other threads are waiting. Once the thread gets out of the CS, the OS gives access to any of the waiting threads.
If I am confined to only one process, is the CS alone good protection for the shared object?
I ask because I have seen on the web that the right way to do it is to use a kernel object (e.g. mutex, semaphore) to guard the CS. A thread wishing to use the shared resource needs to obtain the mutex/semaphore first with a WaitForSingleObject type of function. If a mutex is used then only one thread at a time can access the resource. Once the mutex is obtained, the thread Enters the CS, does what it is supposed to do, then Leaves the CS and Releases the mutex. Then the OS allows any other waiting thread to obtain the mutex, and so on and so forth.
But isn't this the same as using only the CS?
Also, using a mutex is supposed to be significantly slower than using a CS alone. The only problem I see with using only a CS is that if the thread crashes inside the CS then the other threads may never access the shared resource.
Is there any other reason why this approach is better?
thanks in advance
It sounds like you're discussing some Windows-specific terminology in a way that's getting it mixed up with some general computer science terminology.
In computer science the term "critical section" is used for areas of code that must run exclusively (usually due to data sharing). In Windows, there's a synchronization object called CRITICAL_SECTION that can be used to provide exclusive access to areas of execution. Another attribute of a CRITICAL_SECTION object on Windows is that it is limited to being used within a single process.
In computer science, the term 'mutex' is often used to describe an object that can be used to provide synchronization among parallel or concurrent threads of execution. In Windows, there is also a mutex object which can be created with the CreateMutex() function (which returns a HANDLE representing the mutex). That object can be used to synchronize access among threads in the same or different processes, so in many ways it can be used similarly to a CRITICAL_SECTION (but with different APIs). If you want to synchronize threads of execution that are in different processes, a mutex object can be used, while a CRITICAL_SECTION object cannot.
So to answer your question (I think): if you are only concerned with protecting a critical section among threads that are part of the same process, a CRITICAL_SECTION object should be adequate. A mutex object can be used instead, but it may be somewhat less performant. There should be no need to use both types of objects.
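A rough cross-language analogy in Python, if it helps: threading.Lock only coordinates threads inside one process (like a CRITICAL_SECTION), while multiprocessing.Lock can be shared with child processes (loosely like a Windows mutex). Use the cheaper in-process lock when that is all you need:

```python
import threading
import multiprocessing

in_process_lock = threading.Lock()        # threads of THIS process only

def guarded_update(shared_list, item):
    with in_process_lock:                 # the "critical section"
        shared_list.append(item)

# Usable across child processes (must be passed to them at creation time):
cross_process_lock = multiprocessing.Lock()
```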

Thread Locking in Large Parallel Applications

I have a slightly more general question about parallelisation and thread-locking synchronisation in large applications. I am working on an application with a large number of object types and a deep architecture that also parallelises most key tasks. At present, synchronisation is done with thread-locking management inside each object of the system. The problem is that the locking scope is only as large as each object, whereas the object attributes are being passed through many different objects, where they lose synchronisation protection.
What is best practice for thread management, 'synchronisation contexts' &c. in large applications? It seems the only foolproof solution is to make data synchronisation application-wide, such that data can be consumed safely by any object at any time, but this seems to violate object-oriented coding concepts.
How is this problem best managed?
One approach is to make your objects read-only; a read-only object doesn't need any synchronization because there is no chance of any thread reading it while another thread writes to it (because no thread ever writes to it). Object lifetime issues can be handled using lock-free reference counting (using atomic counters for thread safety).
Of course, the downside is that if you actually want to change an object's state you can't; you have to create a new object that is a copy of the old object except for the changed part. Depending on what your application does, that overhead may or may not be acceptable.
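A small Python sketch of the read-only approach (a hypothetical Config type): any number of threads may read the frozen object without locks, and a "change" produces a new object instead of mutating the shared one:

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class Config:
    host: str
    port: int

shared = Config(host="localhost", port=8080)  # safe to read from any thread

# To "change" it, build a modified copy and publish the new reference
# (rebinding a name is atomic in CPython; other runtimes may need an
# atomic pointer swap):
updated = replace(shared, port=9090)
```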

Has anyone seen a programming language that handles threads like this?

Most of the multithreaded work I have done has been in C/C++, Python, or Delphi (Object Pascal). All on Windows. I'll use Delphi for my discussion here. Delphi has a nice class called TThread which abstracts the thread creation process. The class provides an Execute method which is the created thread's thread function. You override that method and typically create a loop within it that exits when the thread is terminated. You do the thread's work inside the loop.
One of the recurring tasks that crops up is keeping track (carefully) of which code gets executed in the thread's context and which code gets executed by external threads in an external context, with synchronization objects guarding data shared by the threads. All basic thread programming stuff. One of the recurring annoyances is creating functions to allow external threads to submit or retrieve data, and moving data from public thread-safe memory objects to those private to the thread and back the other way.
I was wondering if anyone has ever seen a programming language that makes this simpler? Here's what I would love as a programming thread idiom. Let's take a Delphi TThread sub-class created for this discussion. Suppose I could mark class methods with one of three keywords like Private, PublicExecuteInAnyContext, or PublicExecuteInPrivateContext. Here's how they would work.
Private: Private methods would only execute in the thread's context. The compiler would automatically add code that would raise an exception if a code path led to that method being executed in a context outside of the host thread. (E.g. - "Error, attempt to execute method private to thread $AEB from thread $EE0").
PublicExecuteInAnyContext: methods marked as such could be called by the thread that owns the method and any external thread. Any data objects referenced in these methods would automatically be guarded with synchronization objects, with the option to override the default choice and supply your own. (Mutex or semaphore instead of Critical Section, etc.)
PublicExecuteInPrivateContext: Methods marked with this keyword would execute in the thread's context, but are callable by any thread. This option would allow for two strategies for dealing with calls to such methods by external threads:
1) Mode 1 - Block the calling thread: the calling thread would block until the method returned. In other words, the compiler would automatically write code to make the calling thread block. Any parameters passed by the calling thread would be copied into variables private to the host thread. The method would not execute until the host thread received control. When the host thread exited the method, the calling thread would be released and would have any results returned by the method copied into its own private variable space.
2) Mode 2 - Do not block the calling thread: this would allow for the additional argument of a callback function. Any parameters passed to the method by an external thread would be copied into the thread's private variable space. The compiler would again hold the execution of the method until the host thread received control, but would let the calling thread continue without blocking. When the host thread completed execution of the method, if a callback function was specified, it would call that function ** but in the context of the original thread that called the method, not in the host thread context **.
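For what it's worth, here is a Python approximation of this idiom (all names hypothetical): a host thread executes queued calls in its own context, with a blocking mode and a callback mode. One simplification to note: in this sketch the callback runs on the host thread, not back on the caller's thread as the proposal specifies:

```python
import queue
import threading

class HostThread(threading.Thread):
    def __init__(self):
        super().__init__(daemon=True)
        self._inbox = queue.Queue()

    def run(self):
        while True:
            func, args, done, callback = self._inbox.get()
            result = func(*args)            # runs in the host's context
            if done is not None:
                done.result = result
                done.set()                  # Mode 1: wake the blocked caller
            if callback is not None:
                callback(result)            # Mode 2 (simplified: fires here,
                                            # not on the caller's thread)

    def _private_step(self, x):
        # "Private": must only execute in the host thread's context.
        assert threading.current_thread() is self
        return x * 2

    def call_blocking(self, x):
        # "PublicExecuteInPrivateContext", Mode 1: the caller blocks.
        done = threading.Event()
        self._inbox.put((self._private_step, (x,), done, None))
        done.wait()
        return done.result

    def call_async(self, x, callback):
        # Mode 2: the caller continues; the callback fires when done.
        self._inbox.put((self._private_step, (x,), None, callback))

host = HostThread()
host.start()
print(host.call_blocking(21))               # 42, computed on the host thread
```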
A language that provided this kind of automatic thread handling would, at least to me, be a lot more fun to do multithreading with than the way I have to do it now. Has anybody seen a programming language, or a module/hack for one of the mainstream languages, that provides this kind of multi-threading model?
I don't think it matches your description exactly, but Erlang provides one of the easiest concurrency models I've seen. It uses a share-nothing approach, which may or may not sound appealing to you, but I found it really interesting. It's also really easy to create distributed systems with it, if you ever have the need. Check out "Getting Started with Erlang", it's a great tutorial that covers almost all parts of the language.
Wanted to mention: LabVIEW really makes starting with multithreading a breeze. I just found an article on the Internet which explains this in a better way:
If you have programmed in a traditional, textual language before, the data-flow paradigm of LabVIEW can be somewhat hard to embrace. The data-flow paradigm stipulates that it does not matter where on the 2D surface of the block diagram a particular component is placed in relation to other components, but what other components it is wired to. A particular component-node does not execute until all its inputs are available; it executes, however, as soon as all its inputs are available, regardless of what else might be executing at the same time, that is - in parallel with potentially many other things.
You need extensive experience with multi-threading programming in a traditional language, however, to truly appreciate how easy it is to achieve multi-threading with LabVIEW and how naturally the parallelism arises - whether intentionally, or not. While LabVIEW does not magically dissolve all the challenges related to parallel processing, it certainly makes it considerably simpler to get started with it. In fact, in contrast with the traditionally sequential textual programming languages, it is the sequential execution that comes at a price in LabVIEW - parallelism is almost free.
Taken from http://saberrobotics.org/?id=34
