Why does running many copies of the same command take a very long time? - Linux

This isn't so much a programming question as a problem I've encountered lately and am trying to understand.
For example, running an ls command on Linux takes maybe 1 second.
But when I spawn a few thousand ls commands simultaneously, I notice that some of the processes don't run right away and take a very long time to finish.
Why is that? And how can we work around it?
Thanks in advance.
UPDATE:
I did a ps and saw that a couple of the ls commands were in the D< state. I looked it up a bit and understand that it means uninterruptible sleep. What is that? When does it happen? And how can I avoid it?

The number of processes or threads that can execute concurrently is limited by the number of cores in your machine.
If you spawn thousands of processes or threads simultaneously, the kernel can only run n of them (where n is the number of available cores) at the same time; the rest have to wait to be scheduled.
If you want to run more processes or threads truly concurrently, you need to increase the number of available cores in the system (e.g. by adding CPUs, or enabling hyperthreading if available).
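One practical workaround for the original question is to throttle the spawning yourself rather than launching thousands of processes at once. A minimal sketch (the helper name `run_throttled` is hypothetical; it assumes a Unix-like system where the command exits on its own), keeping roughly one live process per core:

```python
import os
import subprocess

def run_throttled(cmd, copies, max_parallel=None):
    # Launch `copies` instances of `cmd`, but keep at most `max_parallel`
    # of them alive at once so the run queue doesn't pile up.
    if max_parallel is None:
        max_parallel = os.cpu_count() or 1
    running = []
    finished = 0
    for _ in range(copies):
        running.append(subprocess.Popen(cmd, stdout=subprocess.DEVNULL))
        if len(running) >= max_parallel:
            # Wait for the oldest process before spawning another.
            running.pop(0).wait()
            finished += 1
    for p in running:
        p.wait()
        finished += 1
    return finished
```

With this pattern the total work is the same, but no process sits runnable for a long time behind thousands of siblings.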

Related

With an n-core machine, why does htop continuously claim only 1 process running?

While reading about htop:
"In the top right corner, htop shows the total number of processes and
how many of them are running."
If I have an 8-core machine, and I'm currently running over 100 processes, why is htop always indicating 1 process running at a time?
Shouldn't I have the potential to run more?
I'd expect that value to be... 8.
I must be misunderstanding what that value means.
What does it mean for that value to always be 1?
Am I really not running anything in parallel?
why is htop always indicating 1 process running at a time?
Probably because, on average, there is only 1 process actually running at any given instant.
Shouldn't I have the potential to run more?
You do have the potential to run more!
I'd expect that value to be... 8. I must be misunderstanding what that value means.
The value is actually a reflection of the amount of work available for your system to do. If there is little work to do, most of the cores will be idle most of the time.
Technically, the load average is the average number of threads on the system's run queue. This includes threads/processes that are running and those that are waiting to run. Most of the time, a thread/process on a non-busy system will be in a wait state: "D", which means it is waiting for a device or file system, or "S", which means it is sleeping, e.g. waiting for user or network I/O.
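You can read these numbers for yourself on Linux. A small sketch (Linux-specific, assuming the standard `/proc/loadavg` format; the function name is mine) that parses the load averages and the running/total counts htop draws from:

```python
def load_average(path="/proc/loadavg"):
    # /proc/loadavg looks like: "0.42 0.30 0.25 1/123 4567"
    # - three run-queue averages (1, 5, 15 minutes)
    # - "running/total" scheduling entities
    # - the PID of the most recently created process (ignored here)
    with open(path) as f:
        one, five, fifteen, running_total, _ = f.read().split()
    running, total = running_total.split("/")
    return float(one), float(five), float(fifteen), int(running), int(total)
```

On a lightly loaded 8-core machine you will typically see the running count hover around 1, which is exactly what the htop display reflects.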
Am I really not running anything in parallel?
That is correct.
If you are expecting your system, or a specific application to be running in parallel, you should probably investigate ...

What will be going on inside my computer if I run a python and matlab program at once?

Suppose I have a multi-core laptop.
I write some code in python, and run it;
then while my python code is running, I open my matlab and run some other code.
What is going on underneath? Will these two processes be processed in parallel on multiple cores automatically?
Or does the computer wait for one to finish and then process the other?
Thank you!
P.S. The two programs I am referring to can be considered the simplest in nature, e.g. calculate 1+2+3.....+10000000
The answer is... it depends!
Your operating system is constantly switching which processes are running. There are tons of processes always running in the background - refreshing the screen, posting sound to the speakers, checking for updates, polling the mouse, etc. - and those processes can only actually execute if they get some amount of processor time. If you have many cores, the OS will use some sort of heuristics to figure out which processes should get some time on the cores. You have the illusion that everything is running at the same time because (1) in some sense, things are running at the same time because you have multiple cores, and (2) the switching happens so fast that you can't notice it happen.
The reason I'm bringing this up is that if you run both Python and MATLAB at the same time, while in principle they could easily run at the same time, it's not guaranteed that that happens because you may have a ton of other things going on as well. It might be that both Python and MATLAB run for a bit concurrently, then both temporarily get paused to allow some program that's playing music to load the next sound clip to be played into memory, then one pauses while the OS pages in some memory from disk and another takes over, etc.
Can you assume that they'll run in parallel? Sure! Most reasonable OSes will figure that out and do it correctly. Can you assume that they are running exclusively in parallel and nothing else is? Not necessarily.

Difference between executing a thread and running a program multiple times

This may be a beginner's question. Is there a difference between executing multiple threads and running a program multiple times? By running a program multiple times, I mean literally starting up a terminal and running the program multiple times. I read that there is a limit of 1 thread per CPU, and I have a quad-core machine, so I guess that means I have 4 CPUs. Is there a limit of programs per CPU also?
Generally, if a program uses multiple threads, the threads will divide the work of the program between themselves. For example, one thread might work on half of a giant data set and another thread might take the other half, or multiple threads might talk to separate machines across a network. Running a program 2 times won't have that effect; you'll get two webservers or two games of Minecraft that have nothing to do with each other. It's possible for a program to communicate with other copies of itself, and some programs do that, but it's not the usual approach.
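The "threads divide the work" idea can be sketched in a few lines. A toy example (the helper is mine; note that in CPython the GIL means threads won't speed up pure-Python arithmetic, so this shows the work-splitting pattern rather than a speedup):

```python
import threading

def threaded_sum(data, n_threads=2):
    # Split `data` into roughly equal slices; each thread sums its own
    # slice into a private cell of `results`, then the parts are combined.
    results = [0] * n_threads
    chunk = (len(data) + n_threads - 1) // n_threads

    def worker(i):
        results[i] = sum(data[i * chunk:(i + 1) * chunk])

    threads = [threading.Thread(target=worker, args=(i,))
               for i in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sum(results)
```

Running the program twice from two terminals gives you two independent processes, each with its own `data` and its own result; nothing is divided between them unless you build that communication yourself.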
Multiple threads let a single program execute different parts of an action at the same time.
If you run multiple independent programs, they don't share work with each other; within one program, threads can increase processing speed through parallel processing.

The result of Linux time command

I got this result from the Linux time command.
real 119m10.626s
user 133m0.952s
sys 20m32.155s
From what I've read, it seems that user + sys should be less than real, but that is not the case here.
Does somebody know why?
Multiple CPUs.
A multi-threaded application can run simultaneously on multiple CPU cores, and thus accumulate CPU time at a multiple of real (wall-clock) time.
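You can reproduce this effect directly. A sketch (function names are mine; assumes a Unix-like system with at least two cores): spin two child processes that each burn a fixed amount of CPU time, then compare wall-clock time against the children's combined user + sys time, which is what `time` reports as user/sys for a pipeline.

```python
import multiprocessing as mp
import os
import time

def spin(seconds):
    # Busy-loop until this process has consumed `seconds` of CPU time.
    end = time.process_time() + seconds
    while time.process_time() < end:
        pass

def cpu_vs_wall(workers=2, seconds=0.2):
    start_wall = time.perf_counter()
    procs = [mp.Process(target=spin, args=(seconds,)) for _ in range(workers)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    wall = time.perf_counter() - start_wall
    # os.times() accumulates CPU time of waited-for children:
    # this is the analogue of `time`'s user + sys columns.
    t = os.times()
    cpu = t.children_user + t.children_system
    return wall, cpu
```

On a multi-core machine the two workers run simultaneously, so `cpu` comes out near 2 × `seconds` while `wall` stays near 1 × `seconds`, i.e. user + sys exceeds real, just like in the question.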

the number of pthread_mutex in running system

I have a strange question. I have to count the number of
pthread_mutexes in a running system, for example Debian, Ubuntu, a
system on a microcontroller, etc. I have to do it without LD_PRELOAD,
interrupting, overloading of functions, etc. I have to count them
at a random time.
Does somebody have an idea how I can do this? Can you show me a way?
To count the threads:
ps -eLf will give you a list of all the threads and processes currently running on the system.
However, you ask for a list of all threads that HAVE executed on the system, presumably since some arbitrary point in the past - are you sure that is what you mean? You could run ps as a cron job and poll the system every X minutes, but you would miss threads that were born and died between runs. You would also have a huge amount of data to deal with.
As for counting the mutexes: that is impossible from outside the processes, because a pthread_mutex is just a data structure in each process's own address space, not something the kernel keeps a registry of.
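The thread-counting half of the answer can be done without spawning `ps` at all, by reading `/proc` directly. A Linux-only sketch (the function name is mine): every directory under `/proc/<pid>/task/` is one kernel thread, which is the same data `ps -eLf` reports.

```python
import os

def count_threads(proc="/proc"):
    # Count kernel threads: one /proc/<pid>/task/<tid> entry per thread.
    total = 0
    for pid in os.listdir(proc):
        if not pid.isdigit():
            continue
        try:
            total += len(os.listdir(os.path.join(proc, pid, "task")))
        except (FileNotFoundError, PermissionError):
            # The process exited between listings, or is not readable.
            continue
    return total
```

Like the cron-job idea, this is only a snapshot at the moment you call it; threads created and destroyed between snapshots are invisible.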
