A program is divided into 4 parts: Stack, data, code, heap.
I know what each of them are as data structures (like used in Java), but what is their difference (and definition) in operating systems?
A program is divided into 4 parts: Stack, data, code, heap.
That is not an accurate starting point.
A program is divided into program sections with various attributes.
Read Only/No execute (which you call data)
Read Only/Execute (which you call code)
Read/Write (which encompasses both heap and stack).
A stack is simply a block of memory that is allocated and freed using push and pop operations.The allocations and frees are usually implemented using a stack pointer register.
A heap is one or more blocks of memory that can be allocated and freed in any order and various sizes. The operating system has no knowledge at all of program heaps. The are managed by libraries linked to the code (although the operating system will have heaps of its own). The operating system simply sees these a blocks of memory.
Related
From Understanding the Linux Kernel:
Segmentation has been included in 80x86 microprocessors to encourage programmers to split their applications into logically related entities, such as subroutines or global and local data areas. However, Linux uses segmentation in a very limited way. In fact, segmentation and paging are somewhat redundant, because both can be used to separate the physical address spaces of processes: segmentation can assign a different linear address space to each process, while paging can map the same linear address space into different physical address spaces. Linux prefers paging to segmentation for the following reasons:
Memory management is simpler when all processes use the same segment register values—that is, when they share the same set of linear addresses.
One of the design objectives of Linux is portability to a wide range of architectures; RISC architectures, in particular, have limited support for segmentation.
The 2.6 version of Linux uses segmentation only when required by the 80x86 architecture.
The x86-64 architecture does not use segmentation in long mode (64-bit mode). As the x86 has segments, it is not possible to not use them. Four of the segment registers: CS, SS, DS, and ES are forced to 0, and the limit to 2^64. If so, two questions have been raised:
Stack data (stack segment) and heap data (data segment) are mixed together, then pop from the stack and increase the ESP register is not available.
How does the operating system know which type of data is (stack or heap) in a specific virtual memory address?
How do different programs share the kernel code by sharing memory?
Stack data (stack segment) and heap data (data segment) are mixed together, then pop from the stack and increase the ESP register is not available.
As Peter states in the above comment, even though CS, SS, ES and DS are all treated as having zero base, this does not change the behavior of PUSH/POP in any way. It is no different than any other segment descriptor usage really. You could get overlapping segments even in 32-bit multi-segment mode if you point multiple selectors to the same descriptor. The only thing that "changes" in 64-bit mode is that you have a base forced by the CPU, and RSP can be used to point anywhere in addressable memory. PUSH/POP operations will work as usual.
How does the operating system know which type of data is (stack or heap) in a specific virtual memory address?
User-space programs can (and will) move the stack and heap around as they please. The operating system doesn't really need to know where stack and heap are, but it can keep track of those to some extent, assuming the user-space application does everything according to convention, that is uses the stack allocated by the kernel at program startup and the program break as heap.
Using the stack allocated by the kernel at program startup, or a memory area obtained through mmap(2) with MAP_GROWSDOWN, the kernel tries to help by automatically growing the memory area when its size is exceeded (i.e. stack overflow), but this has its limits. Manual MAP_GROWSDOWN mappings are rarely used in practice (see 1, 2, 3, 4). POSIX threads and other more modern implementations use fixed-size mappings for threads.
"Heap" is a pretty abstract concept in modern user-space applications. Linux provides user-space applications with the basic ability to manipulate the program break through brk(2) and sbrk(2), but this is rarely in a 1-to-1 correspondence with what we got used to call "heap" nowadays. So in general the kernel does not know where the heap of an application resides.
How do different programs share the kernel code by sharing memory?
This is simply done through paging. You could say there is one hierarchy of page tables for the kernel and many others for user-space processes (one for each task). When switching to kernel-space (e.g. through a syscall) the kernel changes the value of the CR3 register to make it point to the kernel's page global directory. When switching back to user-space, CR3 is changed back to point to the current process' page global directory before giving control to user-space code.
I'm studying operating system theory, and I know that heap allocation involves a specific syscall and I know that compilers usually optimize for this requesting more than needed beforehand.
But I don't find information about stack allocation. What about it? It involves a specific syscall every time you read from it or write to it (for example when you call a function with some parameters)? Or there is some other mechanism that don't involve syscall perhaps?
Typically when the OS starts your program it examines the executable file's headers and arranges various areas for various things (an area for your executable's code, and area for your executable's data, etc). This includes setting up an initial stack (and a lot more - e.g. finding shared libraries and doing dynamic linking).
After the OS has done all this, your executable starts executing. At this point you already have memory for a stack and can just use it without any system calls.
Note 1: If you create threads, then there will probably be a system call involved to create the thread and that system call will probably allocate memory for the new thread's stack.
Note 2: Typically there's "virtual memory" (what your program sees) and "physical memory" (what the hardware sees); and in between typically the OS does lots of tricks to improve performance and avoid wasting physical memory, and to hide resource limits (so you don't have to worry so much about running out of physical memory). One of these tricks is to allocate virtual memory (e.g. for a large stack) without allocating any actual physical memory, and then allocate the physical memory if/when the virtual memory is first modified. Other tricks include various "swap space" schemes, and memory mapped files. These tricks rely on requests generated by the CPU on your program's behalf (e.g. page fault exceptions) which aren't system calls, but have similar ("ask kernel to do something") characteristics.
Note 3: All of the above depends on which OS. Different operating systems do things differently. I've chosen words carefully - e.g. "Typically" means that most modern operating systems work like I've described (but "typically" does not imply that all possible operating systems work like that; and some operating systems do not work like I've described).
No, stack is normal memory. For process point of view, there is no difference (and so the nasty security bug, where you return a pointer to a data in stack, but stack now is changed.
As Brendan wrote, OS will setup stack for the process at program loading. But if you access a non-allocated page of stack (e.g. if your stack if growing), kernel may allocate automatically for you a new stack page. (not much different as when you try to allocate new memory in heap, and there is no more memory available on program space: but in this case you explicitly do a syscall to tell kernel you want more heap memory).
You will notice that usually stack go in one direction and heap (allocated memory) in the other direction (usually toward each others). So if you program need more stack you have space, but if you program do not need much stack, you can use memory for e.g. huge array. Or the contrary: if you do a lot of recursion, you allocate much stack (but you probably need less heap memory).
Two additional consideration: CPU may have special stack instruction. But you can see them as syntactic sugar (you can simulate PUSH and POP with MOV. CALL and RET with JMP (and simulated PUSH and POP).
And kernel may use a special stack for his own purposes (especially important for interrupts).
I am puzzled as to how Linux is able to have so many segments and it can still have bounds checking. To my knowledge, modern CPUs have a couple of segment data registers (code, data, etc).
But Linux has multiple segments of its own: Stack, BSS, heap, code, globals, and many more (especially if the heap is large and composed of many segments). Not every CPU has enough registers to track all these segments.
If I am not mistaken, Linux stores each segment in a separate page, so how is it able to prevent one of these pages from reading or writing out of bounds?
My only possible explanations are that Linux:
performs some manual checking on every write
places all the pages close together in a way such that they can be tracked with a few registers
With the advent of 64-bit Intel, the concept of hardware segments has died the death that should have taken place in the 1970's.
But Linux has multiple segments of its own: Stack, BSS, heap, code, globals, and many more (especially if the heap is large and composed of many segments).
These are pedagogical concepts that have little relationship to reality outside the implementation of linkers--but which bad books on operating systems persist in using.
A stack is just memory. Heap is just memory. The operating system has no knowledge if memory is being used for a stack whether it is being used for a heap. The operating system simply allocates memory to a process with different attributes (e.g., read/write, read only, read/execute). What the process does with that memory is its own business.
I want to understand how OOP programming languages differs from procedural languages in terms of memory utilization. To be more specific, let's assume we are talking about Java and C as examples:
Is it true that objects are automatically stored in heap while in procedural languages you have to explicitly define the heap usage such as in C malloc?
If I write a program in C, OS will create a virtual page of this program including heap and stack spaces. If I don't use malloc in my code, this means my program does not utilize the heap allocated for it, is that correct?
Since Stack is used to store local variables and function call addresses, what if a program ran out of Stack space, does OS extend the paging size of this program or it just uses the heap as an extension?
Is it true that objects are automatically stored in heap while in procedural languages you have to explicitly define the heap usage such as in C malloc?
That depends upon the language. Some, such as Object Pascal, require all "objects" to be allocated on the heap. Others, such as C++, allow objects to exist in static, heap, or stack. There are advantages and disadvantages of both approaches.
If I write a program in C, OS will create a virtual page of this program including heap and stack spaces. If I don't use malloc in my code, this means my program does not utilize the heap allocated for it, is that correct?
Probably not. It is likely the run-time library will use the heap behind your back.
Since Stack is used to store local variables and function call addresses, what if a program ran out of Stack space, does OS extend the paging size of this program or it just uses the heap as an extension?
That depends upon the operating system. Generally, an OS will attempt to expand the stack if it can. t will not use the heap to extend the stack. Stacks are usually protected by an inaccessible page at either end (like the first page to catch null pointers). The likely outcome from running out of stack is some kind of access violation.
I was wondering that if the space required on heap is not large enough
such that there is no need for a brk/sbrk system all (to shift the break pointer (brk) of data segment), how does a library function (such as malloc) allocates space on heap.
I am not asking about the data-structures and algorithms for heap management. I am just asking how does malloc get the address of the first location of the heap if it doesn't invoke a system call. I am asking this because I have heard that it is not always necessary to invoke a system call (brk/sbrk) as these are only required to expand the space.Please correct me if I am wrong.
The basic idea is that when your program starts, the heap is very small, but not necessarily zero. If you only allocate (malloc) a small amount of memory, the library is able to handle it within the small amount of space it has when it is loaded. However, when malloc runs out of that space, it needs to make a system call to get more memory.
That system call is often sbrk(), which moves the top of the heap's memory region up by a certain amount. Usually, the malloc library routine increases the heap by larger than what is needed for the current allocation, with the hope that future allocations can be performed w/o making a system call.
Other implementations of malloc use mmap() instead -- this allows the program to create a sparse virtual memory mapping. However, mmap() based malloc implementations do the same thing as the sbrk()-based ones: each system call reserves more memory than what is necessarily needed for the current call.
One way to look at this is to trace a program that uses malloc: you'll see that for N calls to malloc, you will see M system calls (where M is much smaller than N).
The short answer is that it uses sbrk() to allocate a big hunk, which at that point belongs to your app process. It can then further parcel out subsections of that as individual malloc calls without needing to ask the system for anything, until it exhausts that space and needs to resort sbrk() again.
You said you didn't want the details on the data structures, but suffice it to say that the implementation of malloc (i.e. your own process, not the OS kernel) is keeping track of which space in the region it got from the system is spoken for and which is still available to dole out as individual mallocs. It's like buying a big tract of land, then subdividing it into lots for individual houses.
Use sbrk() or mmap() — http://linux.die.net/man/2/sbrk, http://linux.die.net/man/2/mmap