From what I learned in class, two of the main reasons why we want to use threads are:
1. Parallelization
2. They can share the same global address space
I assume that the reason we like them sharing data is that we don't have to clear the cache every time we run a different thread from the same process.
My question is: in order to have parallelization, threads have to run on different CPUs, so how do they share data when CPUs don't share caches? Is my assumption wrong? Or can we only have one of the properties listed above at a time? Thank you.
I have a vector of entities. In the update cycle I iterate through the vector and update each entity: read its position, calculate the current speed, write the updated position. During the update I may also change some other objects in other parts of the program, but each such object is related only to the current entity; other entities will not touch it.
So I want to run this code in threads. I separate the vector into a few chunks and update each chunk in a different thread. As I see it, the threads are fully independent: on each iteration, every thread works with its own memory regions and doesn't affect the other threads' work.
Do I need any locks here? I assume that everything should work without any mutexes, etc. Am I right?
Short answer
No, you do not need any lock or synchronization mechanism, as your problem appears to be an embarrassingly parallel task.
Longer answer
A race condition can only appear if two threads might access the same memory at the same time and at least one of the accesses is a write operation. If your program has this characteristic, then you need to make sure that the threads access the memory in an ordered fashion. One way to do that is with locks (it is not the only way, though). Otherwise the result is UB.
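For illustration, here is a minimal C++ sketch of exactly that situation (the counter and function names are made up): two threads writing the same memory, made safe with a lock:

```cpp
#include <iostream>
#include <mutex>
#include <thread>

int counter = 0;          // shared state, touched by both threads
std::mutex counter_mutex; // guards every access to counter

void increment_many() {
    for (int i = 0; i < 100000; ++i) {
        std::lock_guard<std::mutex> lock(counter_mutex); // ordered access
        ++counter; // without the lock this read-modify-write is a data race (UB)
    }
}

int main() {
    std::thread t1(increment_many);
    std::thread t2(increment_many);
    t1.join();
    t2.join();
    std::cout << counter << '\n'; // always 200000 with the lock in place
}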
It seems that you found a way to split the work among your threads such that each thread can work independently of the others. This is the best-case scenario for concurrent programming, as it does not require any synchronization: the complexity of the code drops dramatically, and the speedup is usually substantial.
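Here is a sketch of the chunking scheme described in the question, assuming a simple Entity type and update logic (both illustrative, not from the original code). Each thread owns a disjoint range of the vector, so no locks are needed; the join() calls at the end are the only synchronization point:

```cpp
#include <algorithm>
#include <thread>
#include <vector>

struct Entity { float x = 0, y = 0, vx = 1, vy = 1; }; // illustrative entity

// Each thread owns a disjoint half-open range [begin, end) of the vector,
// so no two threads ever touch the same Entity.
void update_range(std::vector<Entity>& entities, size_t begin, size_t end, float dt) {
    for (size_t i = begin; i < end; ++i) {
        entities[i].x += entities[i].vx * dt;
        entities[i].y += entities[i].vy * dt;
    }
}

int main() {
    std::vector<Entity> entities(10000);
    const size_t n_threads = std::max<size_t>(1, std::thread::hardware_concurrency());
    const size_t chunk = entities.size() / n_threads;

    std::vector<std::thread> workers;
    for (size_t t = 0; t < n_threads; ++t) {
        size_t begin = t * chunk;
        size_t end = (t + 1 == n_threads) ? entities.size() : begin + chunk;
        workers.emplace_back(update_range, std::ref(entities), begin, end, 0.016f);
    }
    for (auto& w : workers) w.join(); // join() is the only synchronization needed
}
```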
Please note that, as @acelent pointed out in the comment section, if you need changes made by one thread to be visible in another thread, then you might need some sort of synchronization: depending on the memory model and on the hardware, changes made in one thread might not be immediately visible in another.
This means that you might write to a variable from Thread 1 and, some time later, read the same memory from Thread 2 and still not see the write made by Thread 1.
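A small sketch of one way to make such a write reliably visible, using a C++ std::atomic flag with release/acquire ordering (the variable names are illustrative):

```cpp
#include <atomic>
#include <iostream>
#include <thread>

int payload = 0;                // plain data written by one thread
std::atomic<bool> ready{false}; // flag that publishes the write

int main() {
    std::thread writer([] {
        payload = 42;                                  // plain write
        ready.store(true, std::memory_order_release);  // publish it
    });
    std::thread reader([] {
        while (!ready.load(std::memory_order_acquire)) {} // wait for the flag
        std::cout << payload << '\n'; // guaranteed to print 42: the acquire
                                      // load synchronizes with the release store
    });
    writer.join();
    reader.join();
}
```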
"I separate vector into few chunks and update each chunk in different threads" - in this case you do not need any lock or synchronization mechanism, however, the system performance might degrade considerably due to false sharing depending on how the chunks are allocated to threads. Note that the compiler may eliminate false sharing using thread-private temporal variables.
You can find plenty of information in books and on the wiki. Here is some info: https://software.intel.com/en-us/articles/avoiding-and-identifying-false-sharing-among-threads
There is also a related Stack Overflow post: does false sharing occur when data is read in openmp?
I read about threads in some operating system books, and I am confused about the following:
A. Some books, when talking about the mapping between user threads and kernel threads, list:
1. Many-to-one: many threads in user space map to one thread in the kernel.
2. One-to-one: one thread in user space maps to one thread in the kernel.
3. Many-to-many: some threads in user space are multiplexed onto an equal or smaller number of threads in kernel space.
B. On the other hand, some books talk about four relationships between threads and processes:
1. Many-to-one: a process defines an address space and dynamic resource ownership. Multiple threads may be created and executed within that process.
2. One-to-one: each thread of execution is a unique process with its own address space and resources.
3. One-to-many: a thread may migrate from one process environment to another. This allows a thread to be easily moved among distinct systems.
4. Many-to-many: combines attributes of the many-to-one and one-to-many cases.
The cases in A are clear, but in B I didn't understand number 3. Would you please explain it?
Thanks.
I am not sure which book you are reading, but it seems it was written a long time ago and no longer has much practical relevance. For instance, there is no system I know of that allows thread migration between processes. I doubt one was ever in practical use.
As for user-space threads, modern systems do not use them. All platforms I know of use threads that are managed by the kernel (i.e., kernel threads). All threads within the same process have access to that process's memory, but cannot go outside of it.
A thread is a piece of a process, while a process is a program in execution that owns resources.
Is it simply because they only need a stack and storage for registers, so they are cheap to create?
Is the fact that threads can share common data, i.e. they don't need to use interprocess communication, a factor here? Would this result in less need for protection?
Or do threads take advantage of multiprocessors better than processes do?
Who says it is? On some operating systems there is little difference. Are you thinking of Windows, where threads are much lighter-weight than processes?
I suspect you would learn more by consulting this Stack Overflow question.
If we speak of heavyweight threads (Windows threads, for example): a process has threads, and it has at least one thread (the main thread), so clearly it's heavier, or at least not lighter :-) (the sum is always >= the part).
There are many "tables" that a process must have (the open file table, the table that describes how memory is mapped (the LDT, Local Descriptor Table), ...). If you create a process, all these tables have to be initialized. If you create a thread, they don't (because the thread uses the ones of its process). And a new process also has to load all of its DLLs again, check them for remapping, and so on.
From the Windows perspective, a process can take longer to create if it is loading many DLLs and also moving them around in memory due to conflicts in the base address. Then see all of the other reasons listed in the link from David Heffernan's answer.
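For a rough feel of how cheap thread creation is, here is an unscientific C++ timing sketch. It measures thread creation and joining only; a fair comparison against process creation would need platform-specific calls such as CreateProcess or fork, which are omitted here:

```cpp
#include <chrono>
#include <iostream>
#include <thread>

int main() {
    const int n = 1000;
    auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < n; ++i) {
        std::thread t([] {}); // empty body: measures only creation overhead
        t.join();
    }
    auto elapsed = std::chrono::steady_clock::now() - start;
    std::cout << "avg thread create+join: "
              << std::chrono::duration_cast<std::chrono::microseconds>(elapsed).count() / n
              << " us\n";
}
```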
A process switch requires a change in the CS/DS registers. Changing the value of these registers requires fetching a new descriptor from the Global Descriptor Table, which is an expensive operation in terms of CPU time.
This question is not really programming related, but I still hope it somehow fits here :).
I wrote the following sentence in my work:
Multithreading refers to the ability of an OS to subdivide an application into threads, where each of them is capable of executing independently.
I was told that this definition of a thread is too narrow. I am not really sure why this is the case; could somebody be so kind as to explain what I missed?
Thank you
Usually, it is the application that decides when to create threads, not the OS. Also, you may want to mention that threads share an address space, while each process has its own.
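A tiny sketch of the shared-address-space point: a second thread writes an ordinary global directly, with no interprocess communication involved (a child process, by contrast, would get its own copy of the address space):

```cpp
#include <iostream>
#include <thread>

int shared_value = 0; // visible to every thread in this process

int main() {
    std::thread t([] { shared_value = 7; }); // writes the global directly, no IPC
    t.join();                                // join also makes the write visible here
    std::cout << shared_value << '\n';       // prints 7
    // A child *process* would get its own copy of the address space and would
    // need IPC (pipes, shared memory, ...) to pass this value back.
}
```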
A thread, fundamentally, is a saved execution context: a set of saved registers and a stack that you can resume and continue executing. A thread executes on a processor (these days, of course, many machines can execute multiple threads at the same time).
The critical aspect of "multithreading" is that an operating system can emulate the execution of many threads at the same time by preempting (stopping) a thread once it has run for a certain amount of time (a "quantum") and then scheduling another thread to run, based on an OS-specific algorithm.
Do current architectures provide support for running the same single thread on multiple cores of a single system? What kind of issues would be involved in such a situation?
Not that I know of.
A thread can be stopped and started again on a different core, but a thread in and of itself cannot run in parallel.
If you have code in a thread that could run in parallel, you should split it up into two threads.
This would actually slow the thread down. Every time a thread switches cores, all the state built up on the previous core needs to be transferred. Ideally a thread would stay on one core.
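If keeping a thread on one core matters, the usual remedy is to pin it. Here is a Linux-specific sketch using the GNU extension pthread_setaffinity_np (other systems have their own calls, e.g. SetThreadAffinityMask on Windows); pinning to core 0 is an arbitrary choice:

```cpp
// Compile on Linux with: g++ -pthread pin.cpp
#include <pthread.h>
#include <sched.h>
#include <iostream>
#include <thread>

int main() {
    std::thread worker([] {
        cpu_set_t set;
        CPU_ZERO(&set);
        CPU_SET(0, &set); // allow this thread to run on core 0 only
        // pthread_setaffinity_np is a GNU/Linux extension, not portable.
        if (pthread_setaffinity_np(pthread_self(), sizeof(set), &set) != 0)
            std::cerr << "failed to set affinity\n";
        // ... do the work; the scheduler will no longer migrate this thread ...
    });
    worker.join();
}
```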
What advantages are you thinking will come from running on multiple cores?
Nope, I don't think such a thing exists.
It does not exist because the hardware doesn't permit it, but this is a bottleneck that could theoretically be eliminated with some ingenuity from Intel.
You're basically asking: are sequential instructions automatically parallelized? There are many ways to implement parallel execution, with different levels of efficiency depending on the workload. This can also happen at different levels: the microarchitecture (related to your question), the ISA, the OS, etc.
If I may assume correctly and attempt to answer what I believe you're asking:
It is theoretically possible, but it hasn't been implemented on commercially available commodity hardware. Hence the many higher-level methods for parallelization.