Preemptive Multithreading in Delphi - multithreading

I've read about Preemptive Multithreading here and here.
Is there a way to do this in Delphi and how does this compare (advantages and disadvantages) to other methods of threading in Delphi?

The "other methods" you're referring to all seem to be using the operating system's underlying threading capability -- which is preemptive. In other words, choose whichever you find most convenient, and it'll be preemptive.
Getting non-preemptive (aka cooperative) threading requires a bit of extra work, typically by converting threads to "fibers".

Modern versions of Windows are all preemptive multitasking operating systems. This means that threads and processes (where a process to exist requires at least one thread of execution) are all scheduled and preemptively run.
So "is there a way to do this in Delphi" has the following answers:
Your singlethreaded Delphi application is already preemptively scheduled with the other applications
If you write a multithreaded Delphi application, it also will be. You would have to go to considerable effort to write a non-preemptive model, such as a cooperative threading model in your application. One approach might be to use coroutines; here is an example using Delphi 7.
The best answer is use TThread or any native Windows thread or wrapper around them. You will have preemptive multithreading.
All the models in your link use normal Windows threads and I suspect your question means you're confused about different threading techniques, which are mostly techniques for communication or running tasks (jobs of work that are run on other threads.) If this is the case, you might want to either update your question or ask another looking for an explanation of these models.

Have you looked at User-Mode Scheduling which was introduced in Windows 7. Fibers basically don't really work. There's lots of information on this on the MSDN site and I seem to recall a few videos on Channel 9.

Related

Recursion calls on different cores?

So I recently was writing program to confirm that Dijkstra mutual exclusion algorithm is working. I decided to use Kotlin coz I didn't want to use C++ and manage memory myself. Today I saw that despite the fact my program is running sequential all cores are in 100% use, is it some JVM optimization? Or maybe Kotlin optimized my recursion calls? I need to mention that my recursion is not tail recursion. Do any of you know why this happens? I did not use threads or coroutines just to be clear.
Short answer: don't worry!
If you're not using coroutines or explicit threads, then your code should continue to execute in the same thread.
However, there's no guarantee that your thread will always be executed on the same core; the OS is free to schedule it on whichever core it thinks best at each moment.  (I don't know what criteria various OSs may use to make those decisions, but it's likely to take account of other threads and processes. And your thread can move between codes quickly enough to confuse any monitoring you may do.)
Also, even if you don't start any other threads, there are likely to be several background threads handling garbage collection, running finalisers, and other housekeeping.  If you use a GUI toolkit such as JavaFX or Swing, that will use many threads, as will a framework such as Spring.  (These will usually be marked as ‘dæmon’ theads, so they don't prevent the JVM from exiting.)
Finally, the JVM itself is likely to use system threads for background compilation of bytecode, monitoring, and so on.  (Different JVM implementations may do this differently, of course.)
(None of this is specific to Kotlin, by the way; it's the same for all Java apps.  But it'd almost certainly be different for Kotlin/JS and Kotlin/native.)
So no, activity on multiple cores does not imply that your code has been transformed or rewritten.  It just means that you're not running on the bare metal, and should trust the JVM to take care of things!
It was GC, here is Reddit post I created there.
https://www.reddit.com/r/Kotlin/comments/bp9y1t/recursion_calls_on_different_cores/

Is X threads implemented via X processes?

I know that there is multithreading vs multiprocessing approach.
But I was under the impression that threads are implemented as processes by the OS. So the threading model is just a programming construct on top of processes.
At least in Java (hence the tag although this question is language agnostic) I know that the threads are implemented by the linux as processes
Is it not the general case? Does it depend on the OS?
UPDATE for Java asked in comment by #SotiriosDelimanolis: One to one mapping of Java Thread to Linux thread (LWP)
Threads in modern versions of Java are "native" and are implemented, scheduled and handled by the OS the JVM is running on. So the answer depends on which OS you are using.
Distinguishing between Java threads and OS threads?
EDIT
In general, not just java, the rules for how threads are created is determined by the language, the OS and and language libraries that are used (or some combination of those).
But in general, on modern OSes, multiple threads often share a single process for performance reasons. Threads are sometimes called light weight processes.
This link has an overview of threads and C libraries for writing multithreaded apps for various OSes.

What are some of the core principles needed to master multi-threading using Delphi?

I am kind of new to programming in general (about 8 months with on and off in Delphi and a little Python here and there) and I am in the process of buying some books.
I am interested in learning about concurrent programming and building multi-threaded apps using Delphi. Whenever I do a search for "multithreading Delphi" or "Delphi multithreading tutorial" I seem to get conflicting results as some of the stuff is about using certain libraries (Omnithread library) and other stuff seems to be more geared towards programmers with more experience.
I have studied quite a few books on Delphi and for the most part they seem to kind of skim the surface and not really go into depth on the subject. I have a friend who is a programmer (he uses c++) who recommends I learn what is actually going on with the underlying system when using threads as opposed to jumping into how to actually implement them in my programs first.
On Amazon.com there are quite a few books on concurrent programming but none of them seem to be made with Delphi in mind.
Basically I need to know what are the main things I should be focused on learning before jumping into using threads, if I can/should attempt to learn them using books that are not specifically aimed at Delphi developers (don't want to confuse myself reading books with a bunch of code examples in other languages right now) and if there are any reliable resources/books on the subject that anyone here could recommend.
Short answer
Go to OmnyThreadLibrary install it and read everything on the site.
Longer answer
You asked for some info so here goes:
Here's some stuff to read:
http://delphi.about.com/od/kbthread/Threading_in_Delphi.htm
I personally like: Multithreading - The Delphi Way.
(It's old, but the basics still apply)
Basic principles:
Your basic VCL application is single threaded.
The VCL was not build with multi-threading in mind, rather thread-support is bolted on so that most VCL components are not thread-safe.
The way in which this is done is by making the CPU wait, so if you want a fast application be careful when and how to communicate with the VCL.
Communicating with the VCL
Your basic thread is a decendent of TThread with its own members.
These are per thread variables. As long as you use these you don't have any problems.
My favorite way of communicating with the main window is by using custom windows Messages and postmessage to communicate asynchronically.
If you want to communicate synchronically you will need to use a critical section or a synchonize method.
See this article for example: http://edn.embarcadero.com/article/22411
Communicating between threads
This is where things get tricky, because you can run into all sorts of hard to debug synchonization issues.
My advice: use OmnithreadLibrary, also see this question: Cross thread communication in Delphi
Some people will tell you that reading and writing integers is atomic on x86, but this is not 100% true, so don't use those in a naive way, because you'll most likely get subtle issues wrong and end up with hard to debug code.
Starting and stopping threads
In old Delphi versions Thread.suspend and Thread.resume were used, however these are no longer recommended and should be avoided (in the context of thread synchronization).
See this question: With what delphi Code should I replace my calls to deprecated TThread method Suspend?
Also have a look at this question although the answers are more vague: TThread.resume is deprecated in Delphi-2010 what should be used in place?
You can use suspend and resume to pause and restart threads, just don't use them for thread synchronization.
Performance issues
Putting wait_for... , synchonize etc code in your thread effectively stops your thread until the action it's waiting for has occured.
In my opinion this defeats a big purpose of threads: speed
So if you want to be fast you'll have to get creative.
A long time ago I wrote an application called Life32.
Its a display program for conways game of life. That can generate patterns very fast (millions of generations per second on small patterns).
It used a separate thread for calculation and a separate thread for display.
Displaying is a very slow operation that does not need to be done every generation.
The generation thread included display code that removes stuff from the display (when in view) and the display thread simply sets a boolean that tells the generation thread to also display the added stuff.
The generation code writes directly to the video memory using DirectX, no VCL or Windows calls required and no synchronization of any kind.
If you move the main window the application will keep on displaying on the old location until you pause the generation, thereby stopping the generation thread, at which point it's safe to update the thread variables.
If the threads are not 100% synchronized the display happens a generation too late, no big deal.
It also features a custom memory manager that avoids the thread-safe slowness that's in the standard memory manager.
By avoiding any and all forms of thread synchronization I was able to eliminate the overhead from 90%+ (on smallish patterns) to 0.
You really shouldn't get me started on this, but anyway, my suggestions:
Try hard to not use the following:
TThread.Synchronize
TThread.WaitFor
TThread.OnTerminate
TThread.Suspend
TThread.Resume, (except at the end of constructors in some Delphi versions)
TApplication.ProcessMessages
Use the PostMessage API to communicate to the main thread - post objects in lParam, say.
Use a producer-consumer queue to communicate to secondary threads, (not a Windows message queue - only one thread can wait on a WMQ, making thread pooling impossible).
Do not write directly from one thread to fields in another - use message-passing.
Try very hard indeed to create threads at application startup and to not explicitly terminate them at all.
Do use object pools instead of continually creating and freeing objects for inter-thread communication.
The result will be an app that performs well, does not leak, does not deadlock and shuts down immediately when you close the main form.
What Delphi should have had built-in:
TWinControl.PostObject(anObject:TObject) and TWinControl.OnObjectRx(anObject:TObject) - methods to post objects from a secondary thread and fire a main-thread event with them. A trivial PostMessage wrap to replace the poor performing, deadlock-generating, continually-rewritten TThread.Synchronize.
A simple, unbounded producer-consumer class that actually works for multiple producers/consumers. This is, like, 20 lines of TObjectQueue descendant but Borland/Embarcadero could not manage it. If you have object pools, there is no need for complex bounded queues.
A simple thread-safe, blocking, object pool class - again, really simple with Delphi since it has class variables and virtual constructors, eg. creating a lot of buffer objects:
myPool:=TobjectPool.create(1024,TmyBuffer);
I thought it might be useful to actually try to compile a list of things that one should know about multithreading.
Synchronization primitives: mutexes, semaphores, monitors
Delphi implementations of synchronization primitives: TCriticalSection, TMREWSync, TEvent
Atomic operations: some knowledge about what operations are atomic and what not (discussed in this question)
Windows API multithreading capabilities: InterlockedIncrement, InterlockedExchange, ...
OmniThreadLibrary
Of course this is far from complete. I made this community wiki so that everyone can edit.
Appending to all the other answers I strongly suggest reading a book like:
"Modern Operating Systems" or any other one going into multithreading details.
This seems to be an overkill but it would make you a better programmer and
you defenitely get a very good insight
into threading/processes in an abstract way - so you learn why and how to
use critical section or semaphores on examples (like the
dining philosophers problem or the sleeping barber problem)

Lua :: How to write simple program that will load multiple CPUs?

I haven't been able to write a program in Lua that will load more than one CPU. Since Lua supports the concept via coroutines, I believe it's achievable.
Reason for me failing can be one of:
It's not possible in Lua
I'm not able to write it ☺ (and I hope it's the case )
Can someone more experienced (I discovered Lua two weeks ago) point me in right direction?
The point is to write a number-crunching script that does hi-load on ALL cores...
For demonstrative purposes of power of Lua.
Thanks...
Lua coroutines are not the same thing as threads in the operating system sense.
OS threads are preemptive. That means that they will run at arbitrary times, stealing timeslices as dictated by the OS. They will run on different processors if they are available. And they can run at the same time where possible.
Lua coroutines do not do this. Coroutines may have the type "thread", but there can only ever be a single coroutine active at once. A coroutine will run until the coroutine itself decides to stop running by issuing a coroutine.yield command. And once it yields, it will not run again until another routine issues a coroutine.resume command to that particular coroutine.
Lua coroutines provide cooperative multithreading, which is why they are called coroutines. They cooperate with each other. Only one thing runs at a time, and you only switch tasks when the tasks explicitly say to do so.
You might think that you could just create OS threads, create some coroutines in Lua, and then just resume each one in a different OS thread. This would work so long as each OS thread was executing code in a different Lua instance. The Lua API is reentrant; you are allowed to call into it from different OS threads, but only if are calling from different Lua instances. If you try to multithread through the same Lua instance, Lua will likely do unpleasant things.
All of the Lua threading modules that exist create alternate Lua instances for each thread. Lua-lltreads just makes an entirely new Lua instance for each thread; there is no API for thread-to-thread communication outside of copying parameters passed to the new thread. LuaLanes does provide some cross-connecting code.
It is not possible with the core Lua libraries (if you don't count creating multiple processes and communicating via input/output), but I think there are Lua bindings for different threading libraries out there.
The answer from jpjacobs to one of the related questions links to LuaLanes, which seems to be a multi-threading library. (I have no experience, though.)
If you embed Lua in an application, you will usually want to have the multithreading somehow linked to your applications multithreading.
In addition to LuaLanes, take a look at llthreads
In addition to already suggested LuaLanes, llthreads and other stuff mentioned here, there is a simpler way.
If you're on POSIX system, try doing it in old-fashioned way with posix.fork() (from luaposix). You know, split the task to batches, fork the same number of processes as the number of cores, crunch the numbers, collate results.
Also, make sure that you're using LuaJIT 2 to get the max speed.
It's very easy just create multiple Lua interpreters and run lua programs inside all of them.
Lua multithreading is a shared nothing model. If you need to exchange data you must serialize the data into strings and pass them from one interpreter to the other with either a c extension or sockets or any kind of IPC.
Serializing data via IPC-like transport mechanisms is not the only way to share data across threads.
If you're programming in an object-oriented language like C++ then it's quite possible for multiple threads to access shared objects across threads via object pointers, it's just not safe to do so, unless you provide some kind of guarantee that no two threads will attempt to simultaneously read and write to the same data.
There are many options for how you might do that, lock-free and wait-free mechanisms are becoming increasingly popular.

Is there an advantage of the operating system understanding the characteristics of how a thread may be used?

Is there an advantage of the operating system understanding the characteristics of how a thread may be used? For example, what if there were a way in Java when creating a new thread to indicate that it would be used for intensive CPU calculations vs will block for I/O. Wouldn't thread scheduling improve if this were a capability?
I'm not sure what you're actually expecting the OS to do with the information that a thread is I/O or compute. The things which actually make the most difference to how threads get scheduled (ie thread priority and thread CPU affinity) are already exposed by APIs (and support for NUMA aspects are starting to appear in mainstream OS APIs too).
If by a "compute thread" you mean it's something doing background processing and less important than a GUI thread (from the point of view of maintaining app responsiveness) probably the most useful thing you can do is lower the priority of the compute threads a little.
That's what OS processes do. The OS has sophisticated scheduling for the processes. The OS tracks I/O use and CPU use and dynamically adjusts priorities so that CPU-intensive processing doesn't interfere with I/O.
If you want those features, use a proper OS process.
Is that even necessary? Threads blocking on I/O will cause CPU-intensive threads to run. The operating system decides how to schedule threads. AFAIK there's no way to give any hints with Java.
Yes, it is very important to understand them specially if you are one of those architects who like opening lot of threads, specially on windows.
Jeff Richter over at Wintellect has a library called PowerThreading. It is very useful if you are developing applications on .NET, but since you are talking about JAVA, it is still better to understand OS threads, kernel models and how the interrupts work.

Resources