Do threads have access to stack of other threads of same process? - multithreading

I am aware that the threads of a process share the address space (all the segments) except the stack. Each thread has its own stack. I have also read that, being in the same address space, threads can access memory locations belonging to the stack of another thread (here and here).
If the above statements are correct, I have this question: what happens when the OS sees such an address from a thread? What if threads write something into each other's stacks and corrupt them? Why doesn't the OS generate some sort of permission denied error?
If the answer to this question depends on the operating system, please consider Linux.

As you state, threads share the same address space. Thus each thread has the same access to the address space as does any other thread.
OS sees such address from a thread?
I can't tell what you are asking here.
What if threads write something in other's stack and corrupt them?
This is entirely possible and the results are unpredictable. It is just like a race condition.
why OS doesn't generate some sort of permission denied error?
Memory access is controlled by processor mode (e.g., user, kernel, and on some systems additional modes). Generally mode level protection is designed to prevent user access to the shared system range of the logical address space. Processes are protected from each other by having different user address space ranges.
Generally, you want threads to be able to read and write within the same address space. Otherwise, you might as well just use separate processes.
The stack is simply a block of read/write memory. There is nothing sacred about a stack. Any block of memory can be a stack just by using it as a stack.
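To make the lack of per-thread protection concrete, here is a minimal sketch in Python. It is only an analogy: CPython keeps local variables in frame objects rather than on the native machine stack, and sys._current_frames() is a CPython-specific debugging hook. The function name peek_other_thread_local and the variable secret are made up for illustration. The point it shows is that one thread can reach into another thread's call stack without the OS raising any error.

```python
import sys
import threading


def peek_other_thread_local():
    """Read a local variable that belongs to another thread's call stack."""
    ready = threading.Event()
    done = threading.Event()

    def worker():
        secret = 1234      # local variable living in worker's own frame
        ready.set()        # tell the main thread the local now exists
        done.wait()        # stay alive so the frame is still on the stack

    t = threading.Thread(target=worker)
    t.start()
    ready.wait()

    # CPython-specific: map of thread id -> topmost frame of that thread.
    frame = sys._current_frames()[t.ident]
    # Walk down the other thread's stack until we reach worker's frame.
    while frame is not None and frame.f_code.co_name != "worker":
        frame = frame.f_back
    value = frame.f_locals["secret"]

    done.set()
    t.join()
    return value


if __name__ == "__main__":
    print(peek_other_thread_local())
```

In C the same idea is even more direct: a pointer to a local variable can be handed to another thread, which can then read or overwrite it for as long as the owning frame is alive.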

Related

Multi-thread context switching in register-less machine

As far as I know, multi-thread context switching is pre-emptive, initiated by the OS, and transparent from the perspective of the thread. Generally, when context switching, the OS saves all the register values and restores them later when switching back to that thread. This includes the stack pointer of that thread.
But consider a hypothetical register-less machine. In this, I can use a fixed address in memory to store the stack pointer. But here, when the context switches, the stack pointer is not guaranteed to be preserved. The other thread stores its stack pointer at the same address. Since it is a global variable, it will corrupt the stack pointer of all other threads in the same process. How to avoid this? How can I store stack pointers without necessarily needing registers, while also keeping the stack pointer valid after a context switch?
I'm asking this because, since every computer is equivalent to basic Turing machine, and basic Turing machine does not contain registers this should be somehow doable. I thought about it for some time now, and I was unable to come up with anything.
EDIT: As @JérômeRichard mentioned in comments, all modern processors have registers, and their ISAs are dependent on them. So even memory copying is not possible in them without registers. So here I am going to define a simple architecture for argument purposes.
It is a machine with 2^x addressable units, with each of them having a unique x-bit address. For simplicity, assume there is no concept of virtual memory, and entire address space is mapped directly to physical memory. So, there is no need for requesting memory, process can use its entire x-bit address space freely. Let OS be present outside this address-space and not interfere with user-process.
Also, there is only one user-process running on the machine. But it can have multiple threads. All threads share the same address space. But they all can independently execute different instructions and on different data. Again, for simplicity, switching of the instruction pointer is managed by the OS.
This processor's ISA is capable of reading, writing, copying data on any part of memory given their address. It can also do any arithmetic, logic, bit-manip operations on the data. Let me leave out all other floating-point and vector instructions.
When we have a single thread running, we can fix some global address and store the current function's data like frame pointer and return address and use them. Now, the challenge is to store and restore thread-specific data like these across context switches. Assume this OS does pre-emptive context switches.

Is there a separate kernel level thread for handling system calls by user processes?

I understand that user level threads are implemented in user space and kernel level threads in kernel space. I have also read that user level threads are mapped onto kernel level threads to actually run the user level threads.
What exactly is meant by "implemented"? Does this mean the thread control blocks are defined in user and kernel space respectively?
What happens when a system call is made? Which kernel thread (or user thread, I don't know) does this system call run on? And does each kernel level thread have its own stack?
I have an understanding that threads are just parts of a process. When we deal with kernel threads, what is the corresponding process here? And what are the kernel processes and can you give examples?
I have referred to other answers as well, but haven't received satisfaction.
It depends on the implementation of the OS.
But usually, like in Linux, the system call is executed on the thread that called it. And each thread has a user stack and a kernel stack.
See How does a system call work and How is the system call in Linux implemented? for more details. And I hope this link can clear up your question about "kernel threads".

How an OS figures out an identity of a thread calling the GetCurrentThreadId()?

I'm trying to understand how an OS figures out what thread is a current one (for example, when the thread calls gettid() or GetCurrentThreadId()). Since a process address space is shared between all threads, keeping a thread id there is not an option. It must be something unique to each thread (i.e. stored in its context). If I was an OS developer, I would store it in some internal CPU register readable only in kernel mode. I googled a lot but haven't found any similar question (as if it was super obvious).
So how is it implemented in real operating systems like Linux or Windows?
You are looking for the Thread Control Block (TCB).
It is a data structure that holds information about threads.
A light reading material can be found here about the topic:
https://www.cs.duke.edu/courses/fall09/cps110/slides/threads2.3.ppt
But I would recommend getting a copy of Modern Operating Systems by Andrew S. Tanenbaum if you are interested in OS.
Chapter 2 Section 2.2 Threads:
Implementing Threads in User Space - "When threads are managed in user space, each process needs its own private
thread table to keep track of the threads in that process."
Implementing threads in the Kernel - "The kernel has a thread table that keeps track
of all the threads in the system."
Just an edit: you might also want to read "SCHEDULING". In general, you can say the kernel decides which thread/process should be using the CPU. Thus the kernel knows which thread/process made a system call. I am not going into detail because it depends on which OS we are talking about.
I believe this has already been very well explained in this question: how kernel distinguishes between thread and process
If you want to find out more, you can also google for the kernel task structure and see what info is stored about each type of processes running in the user space
The answer to your question is entirely system specific. However, most processors know nothing about threads. They only support processes. Threads are generally implemented by creating separate processes that share the same address space.
When you do a system service call to get a thread ID it is going to be implemented in the same general fashion as system service to get the process id. Imagine how a get process ID function could work in a system that does not support threads. And to keep it simple, let's assume a single processor.
You are going to have some kind of data structure to represent the current process and the kernel is going to have some means of identifying the current process (e.g. a pointer in the kernel address space to that process). On some processors there is a current task register that points to a structure defined by the processor specification. An operating system can usually add its own data to the end of this structure.
So now I want to upgrade this operating system to support threads. To do that I must have a data structure that describes the thread. In that structure I have a pointer to a structure that defines the process.
Then get thread ID works the same way get process ID worked before. But now Get Process ID has an additional step that I have to translate the thread to the process to get its id (which may even be included in the thread block).
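The layout described above can be sketched as a toy model. Everything here is hypothetical and for illustration only (the Process, Thread and Kernel classes, their fields, and the single "current" pointer standing in for a current task register on a single-processor system); no real kernel is being quoted.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Process:
    pid: int


@dataclass
class Thread:
    tid: int
    process: Process   # pointer from the thread block to its owning process


class Kernel:
    """Toy single-CPU kernel that only tracks which thread is running."""

    def __init__(self) -> None:
        # Plays the role of the kernel's "current task" pointer.
        self.current: Optional[Thread] = None

    def switch_to(self, thread: Thread) -> None:
        # A real context switch would also save and restore registers.
        self.current = thread

    def get_thread_id(self) -> int:
        return self.current.tid

    def get_process_id(self) -> int:
        # The extra step: translate the thread to its process first.
        return self.current.process.pid


if __name__ == "__main__":
    proc = Process(pid=10)
    kernel = Kernel()
    kernel.switch_to(Thread(tid=101, process=proc))
    print(kernel.get_thread_id(), kernel.get_process_id())
```

Because the kernel updates the current pointer on every context switch, a thread never needs to store its own ID anywhere in the shared address space.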

where thread is implemented in memory?

We know that thread has its own stack it's implemented within the process. But my question is that when thread is implemented in his own stack that time it is the same stack which used by process or any other function?
One more doubt: a thread shares its global variables, file descriptors, signal handlers, etc. But how does it share all these parameters within the same address space where all the threads are executed?
Brief explanation will be appreciated.
when thread is implemented in his own stack that time it is the same stack which used by process or any other?
Can't quite parse this but I get the gist I think.
In most cases, under Linux in a multithreaded application, all of the threads share the same address space. Each thread if it is running on a separate processor may have local cached memory but the overall address space is shared by all threads. Even per-thread stack space is shared by all threads -- just that each thread gets a different contiguous memory area.
But how it's share all these parameters within same address?
This is also true of the global variables, file descriptors, etc. They are all shared.
Most thread implementations running under Linux use the clone(2) syscall to create new thread processes. To quote from the clone man page:
clone() creates a new process, in a manner similar to fork(2). It is actually a library function layered on top of the underlying clone() system call, hereinafter referred to as sys_clone. A description of sys_clone is given toward the end of this page.
Unlike fork(2), these calls allow the child process to share parts of its execution context with the calling process, such as the memory space, the table of file descriptors, and the table of signal handlers.
You can see the cloned processes by using ps -eLf under Linux.
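As a rough illustration of the sharing described above, the following sketch starts a few threads that all append to one module-level list and record their process ID together with their kernel-level thread ID. The names shared_log and run_threads are made up for this example; threading.get_native_id() requires Python 3.8 or later. The barrier keeps all threads alive at the same time so that their native IDs cannot be reused.

```python
import os
import threading

shared_log = []                 # global data, visible to every thread
log_lock = threading.Lock()     # serialises writes to the shared list


def run_threads(n=3):
    """Start n threads; each records (pid, native thread id) in shared_log."""
    shared_log.clear()
    barrier = threading.Barrier(n)

    def worker():
        with log_lock:
            shared_log.append((os.getpid(), threading.get_native_id()))
        barrier.wait()          # hold every thread until all have recorded

    threads = [threading.Thread(target=worker) for _ in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return list(shared_log)


if __name__ == "__main__":
    for pid, tid in run_threads():
        print("pid", pid, "native thread id", tid)
```

Every entry carries the same PID but a different native thread ID, which on Linux is what the LWP column of ps -eLf reports.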

Process VS thread : can two processes share the same shared memory ? can two threads ?

After thinking about the whole concept of shared memory, a question came up:
Can two processes share the same shared memory segment? Can two threads share the same shared memory?
After thinking about it a little more clearly, I'm almost positive that two processes can share the same shared memory segment, where the first is the father and the second is the son that was created with a fork(), but what about two threads?
Thanks
can two processes share the same shared memory segment?
Yes and no. Typically with modern operating systems, when another process is forked from the first, they share the same memory space with a copy-on-write set on all pages. Any updates made to any of the read-write memory pages causes a copy to be made for the page so there will be two copies and the memory page will no longer be shared between the parent and child process. This means that only read-only pages or pages that have not been written to will be shared.
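The visible effect of copy-on-write can be sketched with os.fork() on a Unix-like system. This is a minimal illustration, not a measurement of page sharing: the function name fork_private_copy is made up, and the pipe exists only so the child can report what it saw.

```python
import os


def fork_private_copy():
    """Return (parent's value, child's value) after the child writes."""
    data = [0]                          # ordinary, non-shared process memory
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:                        # child process
        os.close(read_fd)
        data[0] = 99                    # the write lands in the child's own copy
        os.write(write_fd, str(data[0]).encode())
        os._exit(0)
    os.close(write_fd)                  # parent process
    child_value = int(os.read(read_fd, 16).decode())
    os.close(read_fd)
    os.waitpid(pid, 0)
    return data[0], child_value


if __name__ == "__main__":
    print(fork_private_copy())
```

The function returns (0, 99): the child sees its own update, while the parent's copy of the same variable is untouched.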
If a process has not been forked from another then they typically do not share any memory. One exception is if you are running two instances of the same program then they may share code and maybe even static data segments but no other pages will be shared. Another is how some operating systems allow applications to share the code pages for dynamic libraries that are loaded by multiple applications.
There are also specific memory-map calls to share the same memory segment. The call designates whether the map is read-only or read-write. How to do this is very OS dependent.
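As one example of such a call on Unix-like systems, Python's mmap module can create an anonymous mapping that is shared (MAP_SHARED is the default there) and therefore stays common to parent and child after fork(). The sketch below assumes a Unix platform; the function name fork_shared_mapping is made up for illustration.

```python
import mmap
import os


def fork_shared_mapping():
    """Return what the parent reads after the child writes to a shared map."""
    # fileno -1 requests an anonymous mapping; on Unix it defaults to
    # MAP_SHARED with read/write protection.
    buf = mmap.mmap(-1, mmap.PAGESIZE)
    buf[:5] = b"empty"
    pid = os.fork()
    if pid == 0:                 # child process
        buf[:5] = b"child"       # no private copy is made for a shared mapping
        os._exit(0)
    os.waitpid(pid, 0)           # parent waits, then reads the same page
    result = bytes(buf[:5])
    buf.close()
    return result


if __name__ == "__main__":
    print(fork_shared_mapping())
```

Unlike ordinary memory after a fork, the child's write is visible to the parent here, so the function returns b"child".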
can two threads share the same shared memory?
Certainly. Typically all of the memory inside of a multi-threaded process is "shared" by all of the threads except for some relatively small stack spaces which are per-thread. That is usually the definition of threads in that they all are running within the same memory space.
Threads also have the added complexity of having cached memory segments in high speed memory tied to the processor/core. This cached memory is not shared and updates to memory pages are flushed into central storage depending on synchronization operations.
In general, a major point of processes is to prevent memory being shared! Inter-process comms via a shared memory segment is certainly possible on the most common OS, but the mechanisms are not there by default. Failing to set up, and manage, the shared area correctly will likely result in a segFault/AV if you're lucky and UB if not.
Threads belonging to the same process, however, do not have such hardware memory-management protection and can pretty much share whatever they like, the obvious downside being that they can corrupt pretty much whatever they like. I've never actually found this to be a huge problem, esp. with modern OO languages that tend to 'structure' pointers as object instances (Java, C#, Delphi).
Yes, two processes can both attach to a shared memory segment. A shared memory segment wouldn't be much use if that were not true, as that is the basic idea behind a shared memory segment - that's why it's one of several forms of IPC (inter-Process communication).
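A minimal sketch of two processes attaching to one named segment, using Python's multiprocessing.shared_memory module (Python 3.8 or later) on a Unix-like system. The function name attach_from_child is made up, and os.fork() is used only to obtain a second process quickly; an unrelated process could attach the same way if it were told the segment name.

```python
import os
from multiprocessing import shared_memory


def attach_from_child():
    """Create a segment, let a child attach by name and write, then read it."""
    seg = shared_memory.SharedMemory(create=True, size=16)
    try:
        pid = os.fork()
        if pid == 0:                                   # child process
            try:
                # Attach to the existing segment using only its name.
                peer = shared_memory.SharedMemory(name=seg.name)
                peer.buf[:2] = b"hi"
                peer.close()
            finally:
                os._exit(0)                            # skip parent's cleanup
        os.waitpid(pid, 0)                             # parent process
        return bytes(seg.buf[:2])
    finally:
        seg.close()
        seg.unlink()                                   # remove the segment


if __name__ == "__main__":
    print(attach_from_child())
```

The parent reads back b"hi", the bytes the child wrote through its own attachment.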
Two threads in the same process could also both attach to a shared memory segment, but given that they already share the entire address space of the process they are part of, there likely isn't much point (although someone will probably see that as a challenge to come up with a more-or-less valid use case for doing so).
In general terms, each process occupies a memory space isolated from all others in order to avoid unwanted interactions (including those which would represent security issues). However, there is usually a means for processes to share portions of memory. Sometimes this is done to reduce RAM footprint ("installed files" in VAX/VMS is/was one such example). It can also be a very efficient way for co-operating processes to communicate. How that sharing is implemented/structured/managed (e.g. parent/child) depends on the features provided by the specific operating system and design choices implemented in the application code.
Within a process, each thread has access to exactly the same memory space as all other threads of the same process. The only thing a thread has unique to itself is "execution context", part of which is its stack (although nothing prevents one thread from accessing or manipulating the stack "belonging to" another thread of the same process).

Resources