Does the compiled (with Simulink Coder Toolbox) Simulink model run with multiple threads or just with one thread/process? As far as I know, the simulation runs as a single process if you do not have the Parallel Computing Toolbox, but what about multithreading?
I am also curious how Simulink handles different step sizes for simulation time in one model. For example, if there are 2 parallel paths in a model with different step sizes (1 x complex work with a 0.1 s step time and 100 x light work with a 0.001 s step time), do these paths run one after the other, or somehow in parallel with threads to save execution time?
The Simulink Coder generates pretty plain vanilla C code, and by default compiles it as such. There is no inherent multithreading or parallelism going on in the code itself.
Different sample rates in the model are given task IDs, and each step through the code will execute the code associated with the currently executing ID. Tasks can also be split into different files, allowing easier multitasking execution when deployed on an RTOS.
How the multiple tasks execute is largely dependent on the target OS and the compilation process. If you're compiling to a shared library, or an exe, deployed on a non-real-time OS (e.g. Windows) then you're not getting any multitasking. If you have an RTOS, have generated the code in an appropriate way, and compile appropriately, then you will have multitasking.
There is a discussion of how this works in the doc: Model Single-Core, Multitasking Platform Execution
You have access to the code, and access to the build file (and can modify both should you wish.) The easiest way to see what is going on is to look at that code.
Related
I am trying to ensure that a PyTorch program built in C++ uses only a single thread. The program runs on CPU.

It has a fairly small model, and multithreading doesn't help; it actually causes problems because my program is already multithreaded. I have called:
at::set_num_interop_threads(1);
at::set_num_threads(1);
torch::set_num_threads(1);
omp_set_num_threads(1);
omp_set_dynamic(0);
omp_set_nested(0);
In addition, I have set the environment variable
OPENBLAS_NUM_THREADS to 1.
Still, when I spawn a single thread, a total of 16 threads show up in htop, and 16 of the machine's processors go to 100%.
Am I missing something? What?
From the PyTorch docs, one can do:
torch.set_num_threads(1)
To be on the safe side, do this before you instantiate any models etc (so immediately after the import). This worked for me.
More info: https://jdhao.github.io/2020/07/06/pytorch_set_num_threads/
I am working on a program where I am required to download a large amount of JSON files from different URLs.
Currently, my program creates multiple threads, and in each thread it calls the libcurl curl_easy_perform() function, but I am running into issues where the program occasionally fails with a "double free" error. It seems to be some sort of Heisenbug, but I have been able to catch it in GDB, which confirms (via backtrace) that the error originates in libcurl.
While I would love suggestions on the issue I am having, my actual question is this: would it be better to change the structure of my code to use the libcurl multi interface on one thread instead of calling the easy interface across multiple threads? What are the trade-offs of using one over the other?
Note: By "better", I mean: is it faster and less taxing on my CPU? Is it more reliable, given that the multi interface was designed for this?
EDIT:
The three options I have as I understand it are these:
1) Reuse the same easy_handle in a single thread. The connections won't need to be re-established, making it faster.
2) Call curl_easy_perform() in each individual thread. They all run in parallel, again, making it faster.
3) Call curl_multi_perform() in a single thread. This is non-blocking, so I imagine all of the files are downloaded in parallel, making it faster?
Which of these options is the most time efficient?
curl_easy_perform is a blocking operation. That means if you run it in one thread you have to download files sequentially. In a multithreaded application you can run many operations in parallel, which usually means faster download times (if speed is not limited by the network or the destination server).
But there is a non-blocking variant that may work better for you if you want to go the single-threaded route: curl_multi_perform.
From the curl man page:
You can do any amount of calls to curl_easy_perform while using the same easy_handle. If you intend to transfer more than one file, you are even encouraged to do so. libcurl will then attempt to re-use the same connection for the following transfers, thus making the operations faster, less CPU intense and using less network resources. Just note that you will have to use curl_easy_setopt between the invokes to set options for the following curl_easy_perform.
In short: reusing the same easy handle gives you a few of the benefits you want compared to calling curl_easy_perform with a fresh handle each time.
I've been playing with the Linux kernel recently and diving back into the days of OS courses from college.
Just like back then, I'm playing around with threads and the like. All this time I had been assuming that threads were automatically running concurrently on multiple cores but I've recently discovered that you actually have to explicitly code for handling multiple cores.
So what's the point of multi-threading on a single core? The only example I can think of is from college when writing a client/server program but that seems like a weak point.
All this time I had been assuming that threads were automatically running concurrently on multiple cores but I've recently discovered that you actually have to explicitly code for handling multiple cores.
The above is incorrect for any widely used, modern OS. All of Linux's schedulers, for example, will automatically schedule threads on different cores and even automatically move threads from one core to another when necessary to maximize core usage. There are some APIs that allow you to modify the schedulers' behavior, but these APIs are generally used to disable automatic thread-to-core scheduling, not to enable it.
So what's the point of multi-threading on a single core?
Imagine you have a GUI program whose purpose is to execute an expensive computation (for example, render a 3D image or a Mandelbrot set) and then display the result. Let's say this computation takes 30 seconds to complete on this particular CPU. If you implement that program the obvious way, and use only a single thread, then the user's GUI controls will be unresponsive for 30 seconds while the calculation is executing -- the user will be unable to do anything with your program, and possibly unable to do anything with his computer at all. Since users expect GUI controls to be responsive at all times, that would be a poor user experience.
If you implement that program with two threads (one GUI thread and one rendering thread), on the other hand, the user will be able to click buttons, resize the window, quit the program, choose menu items, etc, even while the computation is executing, because the OS is able to wake up the GUI thread and allow it to handle mouse/keyboard events when necessary.
Of course, it is possible to write this program with a single thread and keep its GUI responsive, by writing your single thread to do just a few milliseconds worth of computation, then check to see if there are GUI events available to process, handling them, then going back to do a bit more computation, etc. But if you code your app this way, you are essentially writing your own (very primitive) thread scheduler inside your app anyway, so why reinvent the wheel?
The first versions of MacOS were designed to run on a single core and had no real concept of preemptive multithreading; they relied on cooperative multitasking instead. This forced every application developer to implement a form of manual thread management correctly -- even if their app did not have any extended computations, they had to explicitly indicate when they were done using the CPU, e.g. by calling WaitNextEvent. This lack of preemptive multithreading made early (pre-MacOS-X) versions of MacOS famously unreliable at multitasking, since just one poorly written application could bring the whole computer to a grinding halt.
First, a program not only computes but also waits for input/output, which is handled by separate I/O hardware. So even a single-core machine is effectively a multi-processor machine, and employing multithreading is justified.
Second, a task can be divided into several threads for the sake of modularity.
Multithreading is not only for taking advantage of multiple cores.
You need multiple processes for multitasking. For similar reasons you are allowed to have multiple threads, which are lightweight compared with processes.
You probably don't want to spawn processes all the time for things like blocking I/O. That may be overkill.
And there are fibers, which are even more lightweight. So we have processes, threads, and fibers for different levels of need.
Well, when you say multithreading on a single core, there are things you need to consider. For example, the thread API that you are using: is it user-level or kernel-level? From your question, I believe you are most probably using user-level threads.
Now, user-level threads, depending upon the host OS or the API itself, may map to a single kernel thread or to multiple kernel threads. Several mappings are possible: 1:1, many:1, or many:many.
Now, even if there is only a single core, your OS can still provide several kernel-level threads, which behave like multiple processes as far as the CPU is concerned. In that case, the OS will time-slice (and multi-program) the kernel threads, leading to fast context switches, and via the user-level API your code will appear to have multithreaded features.
Also note that even though your processor is a single core, depending on the make it can be hyper-threaded and have deep pipelines, allowing kernel threads to run concurrently with very low overhead.
For reference, check the Intel/AMD architecture documentation and how various OSes provide kernel threads.
I was wondering if there are any slight differences between the definitions of:
Multiprogramming
Multithreading
Parallel processing
As I understand it, we use multithreading to achieve multiprogramming. Is parallel processing the same as multiprogramming, or is it related to hardware?
Thanks
Multiprogramming means you are able to run multiple programs on a computer at the same time (compared to an old system, e.g. DOS, where only one program could run at a time); it is also sometimes referred to as multitasking -> multiprogramming
Multithreading has to be looked at from two angles: -> multithreading
Hardware multithreading (architecture): a processor is able to run multiple threads truly in parallel (as opposed to the interleaving of multiprogramming).
Software multithreading: one process consists of multiple threads. Unlike processes, those threads are not independent of each other; in particular, they can have race conditions while working on the same data (-> difference between thread & process).
Parallel processing describes some ( > 1) CPUs working together in any form. This includes one PC with a multi-core CPU, one server with multiple processors (e.g. on cards), or even a network of computers -> parallel processing
The way I've usually seen your 2nd and 3rd terms used:
Parallel processing refers to two or more threads running at the same time, each working with their own data. That is, beyond starting and stopping, there are few, if any, synchronization problems. Multithreading refers to much the same thing, except that the threads share data and must be very careful about this. That is, synchronization is everything.
Proper parallel processing is not much harder than running a single thread. (Most platforms provide all kinds of support to help keep it simple.) Multithreading is a lot of very hard work.
What is the meaning of "parallel software" and what are the differences between "parallel software" and "regular software"?
What are its advantages and disadvantages?
Does writing "parallel software" require specific hardware or a specific programming language?
Does "parallel software" require specific hardware or a programming language?

Yes and yes.
The first one is trivially easy. Most modern CPUs (say, anything newer than the M6800) have hardware features that make it possible to do more than one thing at a time, though not necessarily both at the same instant. For instance, when a timer interrupt goes off, a CPU can save what it's doing and then start doing something else. Those tasks run concurrently.
Even without that, you could just get two machines with some sort of connection to each other (like a simple serial connection via a Null modem adapter) and they can both work on the same task in parallel.
Most new (not just modern but recent) CPUs have parallel computing resources built in. These multi-core CPUs can actually work on two or more tasks at the same time, one task per core, and have special features that make it a bit more efficient for those tasks to cooperate.
The second one, requiring special software tools such as a parallel enabled language, is in some ways the hardest part of parallel computing. If you're the only person in the kitchen, it's pretty easy to cook a meal, by following each recipe from start to finish, one after the next, until all dishes are cooked. If you want to speed that up by adding more cooks, you have to be a bit more careful to not step on each other's toes.
The simplest way this is handled is by using a threading library that offers some tools so that multiple tasks can arrange to not clobber each other. This is not as easy as flagging a program as parallel and letting the system take care of the rest; rather, you have to write each task to communicate with every other task at every place where there is the possibility of them interfering.
http://en.wikipedia.org/wiki/Thread_(computer_science)
In computer science, a thread of execution results from a fork of a computer program into two or more concurrently running tasks. The implementation of threads and processes differs from one operating system to another, but in most cases, a thread is contained inside a process. Multiple threads can exist within the same process and share resources such as memory, while different processes do not share these resources.
Most modern programming languages support multithreading in one way or another (even JavaScript in its newest versions). :-)
Advantages and Disadvantages can depend on the task. If you have a lot of processing that you need to do, then multithreading can help you break that up into smaller units of work that each CPU can work on independently at the same time. However, multithreaded code will generally be more complex to write and maintain than single threaded code.
You can still write and run multithreaded code on a machine that has only one processor. Although there is only one processor to execute the tasks, the operating system makes them appear to run simultaneously by rapidly switching context and executing a few instructions for each thread at a time.
Some specialized hardware you may be familiar with which does parallel tasks is the GPU which can be found on most new computers. In this video, the Mythbusters demonstrate the difference between drawing on a single-threaded CPU, and a multi-threaded GPU:
http://www.youtube.com/watch?v=XtGf0HaW7x4&feature=player_embedded
Parallel software can natively take advantage of multiple cores/CPUs on a computer, or sometimes across multiple computers. Examples include graphics rendering software and circuit design software.
Not so sure about disadvantages, other than that multi-processor-aware software tends to be a CPU hog.