How to disable parallel execution in Rust? - multithreading

I need to prevent some programs from running in parallel (last time I checked there was RUST_THREADS=1 option but can't find any docs for it) because they screw up the display.
I'm running a program from cargo, but I'd prefer to do it for executive form of program (i.e. instead of cargo run I could use ./myprogram).

By default Rust uses native threading, and you can't just disable it. It just doesn't make sense and will almost certainly break multithreaded program completely - it is because native thread scheduling is done by OS, so hypothetical "disabling" of native threading will leave all programs with their main thread only.
It is possible to disable green threading, but Rust does not use it by default for a long time already. This is possible because green threads rely on user-land scheduler, usually implemented in a language runtime, and it usually can be configured to use single OS thread. This is the default mode of operation of Go, for example. Rust switched to native threading as it is closer to underlying OS, more effective and much easier to implement and support, so RUST_THREADS does not work anymore (even if it ever worked).

Related

Recursion calls on different cores?

So I recently was writing program to confirm that Dijkstra mutual exclusion algorithm is working. I decided to use Kotlin coz I didn't want to use C++ and manage memory myself. Today I saw that despite the fact my program is running sequential all cores are in 100% use, is it some JVM optimization? Or maybe Kotlin optimized my recursion calls? I need to mention that my recursion is not tail recursion. Do any of you know why this happens? I did not use threads or coroutines just to be clear.
Short answer: don't worry!
If you're not using coroutines or explicit threads, then your code should continue to execute in the same thread.
However, there's no guarantee that your thread will always be executed on the same core; the OS is free to schedule it on whichever core it thinks best at each moment.  (I don't know what criteria various OSs may use to make those decisions, but it's likely to take account of other threads and processes. And your thread can move between codes quickly enough to confuse any monitoring you may do.)
Also, even if you don't start any other threads, there are likely to be several background threads handling garbage collection, running finalisers, and other housekeeping.  If you use a GUI toolkit such as JavaFX or Swing, that will use many threads, as will a framework such as Spring.  (These will usually be marked as ‘dæmon’ theads, so they don't prevent the JVM from exiting.)
Finally, the JVM itself is likely to use system threads for background compilation of bytecode, monitoring, and so on.  (Different JVM implementations may do this differently, of course.)
(None of this is specific to Kotlin, by the way; it's the same for all Java apps.  But it'd almost certainly be different for Kotlin/JS and Kotlin/native.)
So no, activity on multiple cores does not imply that your code has been transformed or rewritten.  It just means that you're not running on the bare metal, and should trust the JVM to take care of things!
It was GC, here is Reddit post I created there.
https://www.reddit.com/r/Kotlin/comments/bp9y1t/recursion_calls_on_different_cores/

What is the downside of XInitThreads()?

I know XInitThreads() will allow me to do calls to the X server from threads other than the main thread, and that concurrent thread support in Xlib is necessary if I want to make OpenGL calls from secondary threads using Qt. I have such a need for my application, but in a very rare context. Unfortunately, XInitThreads() needs to be called at the very beginning of my application's execution and will therefore affect it whether I need it or not for a particular run (I have no way of knowing before I run the app if I do need multithreaded OpenGL support or not).
I'm pretty sure the overall behavior the the application will remain unchanged if I uselessly call XInitThread(), but programming is all about tradeoffs, and I'm pretty sure there's a reason multithread support is not the default behavior for Xlib.
The man page says that it is recommended that single-threaded programs not call this function, but it doesn't say why. What's the tradeoff when calling XInitThreads()?
It creates a global lock and also one on each Display, so each call into Xlib will do a lock/unlock. If you instead do your own locking (keep your Xlib usage in a single thread at a time with your own locks) then you could in theory have less locking overhead by taking your lock, doing a lot of Xlib stuff at once, and then dropping your lock. Or in practice most toolkits use a model where they don't lock at all but just require apps to use X only in the main UI thread.
Since many Xlib calls are blocking (they wait for a reply from the server) lock contention could be an issue.
There are also some semantic pains with locking per-Xlib-call on Display; between each Xlib call on thread A, in theory any other Xlib call could have been made on thread B. So threads can stomp on each other in terms of confusing the display state, even though only one of them makes an Xlib request at a time. That is, XInitThreads() is preventing crashes/corruption due to concurrent Display access, but it isn't really taking care of any of the semantic considerations of having multiple threads sharing an X server connection.
I think the need to make your own semantic sense out of concurrent display access is one reason that people don't bother with XInitThreads per-Xlib-call locking. Since they end up with application-level or toolkit-level locks anyway.
One approach here is to open a separate Display connection in each thread, which could make sense depending on what you are doing.
Another approach is to use the newer xcb API rather than Xlib; xcb is designed from scratch to be multithreaded and nonblocking.
Historically there have also been some bugs in XInitThreads() in some OS/Xlib versions, though I don't remember any specifics, I do remember seeing those go by.

Replacing system calls (syscalls) in Linux 2.6+

I'm looking into writing a userland threading library, since there seems to be no active work in this area, and I believe the C++0x promises and futures may give this model some power. Unfortunately, in order to make this model work, it is essential to ensure a context switch on blocking calls. As such, I would like to intercept every syscall in order to replace it with an asynchronous version. There are some caveats:
I know there are asynchronous syscalls for just about every regular syscall, but for backwards compatibility reasons this is not a viable solution.
I know that in Linux 2.4 or earlier it was possible to directly change the sys_call_table, but this has vanished.
As I would like my library to be statically linked if desired, the LD_PRELOAD trick isn't viable.
Similarly, kernel modules are not an option because this is supposed to be a userland library.
Finally, ptrace() is also not an option for similar reasons. I can't have my library forking a new process just in order to be used.
Is this possible?
I'm looking into writing a userland threading library, since there seems to be no active work in this area
You might want to take a look at the thread libraries Marcel (and its publications) and MPC, which implement hybrid (kernel and user-level) threads, mainly in the purpose of High-Performance Computing, so they had to find some solution for this blocking system calls.
So as to avoid the blocking of kernel threads when the application
makes blocking system calls, Marcel uses Scheduler Activations when
they are available, or just intercepts such blocking calls at dynamic
symbols level.

Lua :: How to write simple program that will load multiple CPUs?

I haven't been able to write a program in Lua that will load more than one CPU. Since Lua supports the concept via coroutines, I believe it's achievable.
Reason for me failing can be one of:
It's not possible in Lua
I'm not able to write it ☺ (and I hope it's the case )
Can someone more experienced (I discovered Lua two weeks ago) point me in right direction?
The point is to write a number-crunching script that does hi-load on ALL cores...
For demonstrative purposes of power of Lua.
Thanks...
Lua coroutines are not the same thing as threads in the operating system sense.
OS threads are preemptive. That means that they will run at arbitrary times, stealing timeslices as dictated by the OS. They will run on different processors if they are available. And they can run at the same time where possible.
Lua coroutines do not do this. Coroutines may have the type "thread", but there can only ever be a single coroutine active at once. A coroutine will run until the coroutine itself decides to stop running by issuing a coroutine.yield command. And once it yields, it will not run again until another routine issues a coroutine.resume command to that particular coroutine.
Lua coroutines provide cooperative multithreading, which is why they are called coroutines. They cooperate with each other. Only one thing runs at a time, and you only switch tasks when the tasks explicitly say to do so.
You might think that you could just create OS threads, create some coroutines in Lua, and then just resume each one in a different OS thread. This would work so long as each OS thread was executing code in a different Lua instance. The Lua API is reentrant; you are allowed to call into it from different OS threads, but only if are calling from different Lua instances. If you try to multithread through the same Lua instance, Lua will likely do unpleasant things.
All of the Lua threading modules that exist create alternate Lua instances for each thread. Lua-lltreads just makes an entirely new Lua instance for each thread; there is no API for thread-to-thread communication outside of copying parameters passed to the new thread. LuaLanes does provide some cross-connecting code.
It is not possible with the core Lua libraries (if you don't count creating multiple processes and communicating via input/output), but I think there are Lua bindings for different threading libraries out there.
The answer from jpjacobs to one of the related questions links to LuaLanes, which seems to be a multi-threading library. (I have no experience, though.)
If you embed Lua in an application, you will usually want to have the multithreading somehow linked to your applications multithreading.
In addition to LuaLanes, take a look at llthreads
In addition to already suggested LuaLanes, llthreads and other stuff mentioned here, there is a simpler way.
If you're on POSIX system, try doing it in old-fashioned way with posix.fork() (from luaposix). You know, split the task to batches, fork the same number of processes as the number of cores, crunch the numbers, collate results.
Also, make sure that you're using LuaJIT 2 to get the max speed.
It's very easy just create multiple Lua interpreters and run lua programs inside all of them.
Lua multithreading is a shared nothing model. If you need to exchange data you must serialize the data into strings and pass them from one interpreter to the other with either a c extension or sockets or any kind of IPC.
Serializing data via IPC-like transport mechanisms is not the only way to share data across threads.
If you're programming in an object-oriented language like C++ then it's quite possible for multiple threads to access shared objects across threads via object pointers, it's just not safe to do so, unless you provide some kind of guarantee that no two threads will attempt to simultaneously read and write to the same data.
There are many options for how you might do that, lock-free and wait-free mechanisms are becoming increasingly popular.

Multithreading in Lua

I was having a discussion with my friend the other day. I was saying how that, in pure Lua, you couldn't build a preemptive multitasking system. He claims you can, because of the following reason:
Both C and Lua have no inbuilt threading libraries [OP's note: well, Lua technically does, but AFAIK it's not useful for our purposes]. Windows, which is written in mostly C(++) has pre-emptive multitasking, which they built from scratch. Therefore, you should be able to do the same in Lua.
The big problem i see with that is that the main way preemptive multitasking works (to my knowledge) is that it generates regular interrupts which the manager uses to get control and determine what code it should be working on next. I also don't think Lua has any facility that can do that.
My question is: is it possible to write a pure-Lua library that allows people to have pre-emptive multitasking?
I can't see how to do it, although without a formal semantics of Lua (like the semantics of yield for example), it's really hard to come up with an ironclad argument why it can't be done. (I've been wanting a formal semantics for ages, but evidently Roberto and lhf have better things to do.)
If I wanted pre-emptive multitasking for Lua, I wouldn't even try to do it in pure Lua. Instead I'd use an old trick I first saw 20 years ago in Standard ML of New Jersey:
Interrupt sets a flag in the lua_State saying "current coroutine has been preempted".
Alter the VM so that on every loop and every function call, it checks the flag and yields if necessary.
This patch would be easy to write and easy to maintain. It doesn't solve the problem of the long-running C function that can't be pre-empted, but if you have to solve that problem, you are wandering into much harder territory, and you may as well do all your threading at the C level, not the Lua level.
No. It's not possible to write a preemptive scheduler in pure Lua. At some point a preemptive scheduler needs some mechanism like an interrupt service routine to take control away from the current thread and give it to the scheduler which can then give it to another thread. Pure Lua doesn't have this mechanism.
You mention that Windows is written in mostly C/C++. The keyword is mostly. You can't write a preemptive scheduler in pure ANSI C/C++. Usually, part of the interrupt service routine is written in assembly language. Or, the C/C++ compiler implements a non-standard extension that allows interrupt service routines to be written in C/C++. Some compilers allow you to declare a functions with an __interrupt modifier that that causes the compiler to generate a prolong / epilog that allows the function to be used as an interrupt service routine.
Also, code that sets up the interrupt service routine fiddles with CPU registers with memory mapped IO, or a IO instructions. None of this code is portable ANSI C/C++. And, depends on the CPU architecture.
Not that I know of, no. It would almost be absurdly simple if you could yield from hooks set on coroutines with debug.sethook though, but it doesn't work. You can yield from C hooks set from C (lua_sethook), but I couldn't figure out exactly to do that, and it's not pure Lua anyways.
Even if it were possible, it wouldn't be true threading. Everything would still run within the same operating system thread, for example. Your hook would take a variety of factors into account (such as time, perhaps memory, etc.) and then determine whether to yield. The yielded-to coroutine then would decide which child coroutine to run next. You'd also need to decide on when the hook should be called. Most frequent would be on every Lua instruction, but that carries a performance penalty. And if the coroutine calls into a C function, Lua has no jurisdiction. If that C call takes a long time, there's nothing you can do about it.
Here's a related thread from the Lua-L mailing list which you might find interesting.

Resources