Disclaimer: I am not a very experienced guy, and many questions might seem stupid or badly phrased.
I have heard about stacks and heaps and read a bit about them, but still a few things I don't quite understand:
How does a program find empty memory to store new variables/objects in physical memory?
How does a program know where an object starts and where it ends in memory? With numeric variables, I imagine some extra information is stored in memory that tells the program how many bits the variable occupies, but correct me if I'm wrong.
This is similar to my first question, but: when a variable has a value represented only by zeros, how does the program not confuse that with free memory?
Does the object value null mean that the address of an object is a bunch of 0's, or does the object point to literally nothing? And if so, how is the "reference" stored so that it can be assigned an address later on?
How does a program find empty memory to store new variables/objects in physical memory.
Modern operating systems use logical address translation. A process sees a range of logical addresses—its address space. The system hardware breaks the address range into pages. The size of the page is system dependent and is often configurable. The operating system manages page tables that map logical pages to physical page frames of the same size.
The address space is divided into a range of pages that is the system space, shared by all processes, and a user space, that is generally unique to each process.
Within the user and system spaces, pages may be valid or invalid. An invalid page has not yet been mapped to the process address space. Most pages are likely to be invalid.
Memory is always allocated from the operating system in pages. The operating system will have system services that transform invalid pages into valid pages with mappings to physical memory. In order to map pages, the operating system needs to find (or the application needs to specify) a range of pages that are invalid and then has to allocate physical page frames to map to those pages. Note that physical page frames do not have to be mapped contiguously to logical pages.
You mention stacks and heaps. Stacks and heaps are just memory. The operating system cannot tell whether memory is a stack, heap or something else. User mode libraries for memory allocation (such as those that implement malloc/free) allocate memory in pages to create heaps. The only thing that makes this memory a heap is that there is a heap manager controlling it. The heap manager can then allocate smaller blocks of memory from the pages allocated to the heap.
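To make the idea of a heap manager carving small blocks out of larger pages concrete, here is a minimal sketch in C. It is not how malloc is implemented; it is the simplest possible scheme (a "bump" allocator that never frees). The names arena, ARENA_SIZE and arena_alloc are hypothetical, and a static array stands in for pages that a real manager would request from the operating system.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical arena standing in for one page obtained from the OS. */
#define ARENA_SIZE 4096

static _Alignas(max_align_t) unsigned char arena[ARENA_SIZE];
static size_t arena_used = 0;   /* bytes already handed out */

/* Hand out the next `size` bytes, rounded up so every block stays aligned.
   Returns NULL when the arena cannot satisfy the request. */
void *arena_alloc(size_t size)
{
    size_t align = _Alignof(max_align_t);
    size_t rounded = (size + align - 1) / align * align;

    assert(arena_used <= ARENA_SIZE);
    if (rounded < size || rounded > ARENA_SIZE - arena_used)
        return NULL;

    void *block = &arena[arena_used];
    arena_used += rounded;
    return block;
}
```

Nothing about the arena itself makes it a heap; only the bookkeeping in arena_used does. A real heap manager adds the ability to free and reuse blocks.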
A stack is simpler. It is just a contiguous range of pages. Typically an operating system service that creates a thread or process will allocate a range of pages for a stack and assign the hardware stack pointer register to the high end of the stack range.
How does a program know where an object starts and where it ends in memory? With numeric variables, I imagine some extra information is stored in memory that tells the program how many bits the variable occupies, but correct me if I'm wrong.
This depends upon how the program is created and how the object is created in memory. For typed languages, the linker binds variables to addresses. The linker also generates instructions for mapping those addresses to the address space. For stack/auto variables, the compiler generates offsets from a pointer to the stack. When a function/subroutine gets called, the compiler-generated code allocates the memory required by the procedure, which it does by simply subtracting from the stack pointer. The memory gets freed by simply adding that value back to the stack pointer.
In the case of typeless languages, such as assembly language or Bliss, the programmer has to keep track of the type for each location. When memory is allocated dynamically, the programmer also has to keep track of the type. Most programming languages help with this by having pointers with types.
This is similar to my first question, but: when a variable has a value represented only by zeros, how does the program not confuse that with free memory?
Free memory is invalid. Accessing free memory causes a hardware exception.
Does the object value null mean that the address of an object is a bunch of 0's, or does the object point to literally nothing? And if so, how is the "reference" stored so that it can be assigned an address later on?
The linker defines the initial state of a program's user address space. Most linkers do not map the first page (or even more than one page). That page is then invalid. That means a null pointer, as you say, references absolutely nothing. If you try to dereference a null pointer you will usually get some kind of access violation exception.
Most operating systems will allow the user to map the first page. Some linkers will allow the user to override the default setting and map the first page. This is not commonly done, as it makes detecting memory errors difficult.
How does a program find empty memory to store new variables/objects in physical memory.
Physical memory is managed by the OS, which knows which parts of the memory are used by processes and which parts are free. When it needs memory, a program asks the operating system for it. If this memory is for the heap, extra operations are needed. The operating system delivers memory in fixed-size blocks called pages. A page is typically 4 KB, so if the user mallocs just a few bytes, the allocator has to know which parts of the page are used or available, and has to track the page content across successive malloc and free calls, in order to use memory efficiently. There are specific data structures to describe used space and algorithms to find space whilst avoiding fragmentation.
How does a program know where an object starts and where it ends in memory? With numeric variables, I imagine some extra information is stored in memory that tells the program how many bits the variable occupies, but correct me if I'm wrong.
The program knows the address (i.e. the start) of every variable. For global or static variables it is generated by the linker when it places variables in memory. For local variables, the processor has means to compute it given the stack position. For allocated variables, it is stored in another variable (a pointer) when memory is allocated. Concerning the end, it depends on the type of the variable. For known types (like int) or compositions of known types (like structs) the size can be computed at compile time. In other situations, the program has no way to know the entity size. For instance, a declaration like int * a may describe an array, but the program has no way to know the array size. The programmer must keep track of this information, for instance by storing the number of elements of the array in another variable.
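The last point can be illustrated with a short C sketch. The type and function names (int_array, int_array_make, int_array_sum) are made up for this example; the idea is only that a bare int * carries no length, so the length is kept next to it by the programmer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* A bare int * carries no length, so the length travels alongside it. */
struct int_array {
    int *data;      /* start address of the elements */
    size_t count;   /* number of elements, tracked by the programmer */
};

/* Allocate `count` zero-initialised ints; data is NULL on failure. */
struct int_array int_array_make(size_t count)
{
    struct int_array a = { calloc(count, sizeof(int)), 0 };
    if (a.data != NULL)
        a.count = count;
    return a;
}

/* Sum the elements using the stored count; nothing in memory marks the end. */
long int_array_sum(const struct int_array *a)
{
    assert(a != NULL);
    long total = 0;
    for (size_t i = 0; i < a->count; i++)
        total += a->data[i];
    return total;
}
```

Here sizeof(int) and sizeof(struct int_array) are known to the compiler, while the number of elements behind data exists only as the value stored in count.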
This is similar to my first question, but: when a variable has a value represented only by zeros, how does the program not confuse that with free memory?
The program never looks at the memory to know if it is free or not. That is managed by other means (see question 1).
Does the object value null mean that the address of an object is a bunch of 0's, or does the object point to literally nothing? And if so, how is the "reference" stored so that it can be assigned an address later on?
An address is never a bunch of zeros, except for address '0' of memory. It is the content that is set to zero. On most systems it is not possible to read or write address 0. Doing so generates a hardware exception, typically reported as a segmentation fault or "bus error" (and maybe you have already encountered it). Pointing to address zero is exactly like "pointing to literally nothing" and generates an error if dereferenced in a program. These variables hold addresses of other variables (they are pointers). So the address of the pointer itself is well defined. What may not be defined is what it points to. That can be changed by assigning something to the pointer (for instance what malloc returned, or the address of another variable).
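A small C sketch of this distinction, with hypothetical helper names (points_somewhere, retarget): the pointer variable has its own storage from the start, and only the address stored in it changes when it is assigned later.

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 if the pointer refers to an object, 0 if it is null. */
int points_somewhere(const int *p)
{
    return p != NULL;
}

/* Retarget a pointer. `slot` is the address of the pointer variable itself,
   which is well defined even while the pointer holds NULL. */
void retarget(int **slot, int *target)
{
    assert(slot != NULL);   /* the pointer's own storage must exist */
    *slot = target;
}
```

For example, after int x = 7; int *p = NULL; the call retarget(&p, &x) makes p refer to x, and *p can then be read safely.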
I want to find all accesses to heap memory in an application. I need to store each allocation and, consequently, cannot only check for addresses in the [heap] range (which also does not include heap memory areas allocated by mmap()). Therefore, I wrote a pintool and captured all calls to malloc(), calloc(), realloc() and free(). Because of optimizations such as tail-call elimination, Pin cannot detect the last instruction of these calls. Therefore, I manually added callbacks after (precisely, I used IPOINT_TAKEN_BRANCH) the ret... instructions in each of the probable direct/indirect jump targets out of these functions (e.g., malloc(), indirectly, jumps to malloc_hook_ini(). So I added instrumentation code after all ret... instructions in malloc_hook_ini()). These targets, themselves, may have outgoing direct/indirect jumps and, again, I tried to capture them.
But, there are still some accesses in the [heap] range (and also in mmap() ranges) which do not pertain to any of the previously captured allocations. To clear up any doubts, I used Pwngdb to display all currently allocated heap chunks, right before the access point. The access address was clearly in an allocated heap chunk. Of course, knowing the allocation IP for these heap chunks will be a great help. But this is not supported in Pwngdb or any other similar tools.
In many cases analyzed by Pin, the access address does not belong to any address range allocated (even those removed in the meantime) during the whole program execution. How can I determine which allocation function was missed during Pin analysis?
It seems that there are two possible situations:
1) There exists some omitted function other than malloc(), calloc(), realloc() and free().
2) There are some missed return points for malloc(), calloc(), realloc() and free().
The second candidate is not possible, because I put a counter before and after each of these allocation functions and, at the end, they had equal values.
UPDATE:
Here is the backtrace for one such access point and also the value for the RSI register:
I had a debate with a friend who said that in C, memory is allocated at compile time.
I said that cannot be true, since memory can only be allocated when the process is loaded into memory.
From there we started talking about stack allocation. After some reading, I know it is an OS implementation detail,
but usually the stack starts at a default size, say 1 MB (contiguous), of reserved virtual address space, and when a function is called (no matter where we are on that thread's stack),
a block of X bytes (all local variables for that function?) is committed (allocated) from the reserved range.
What I would like to understand: when we enter the function, how does the OS know how much to allocate (commit from the reserved range)?
Does it happen dynamically during the function's execution, or does the compiler calculate in advance what size each function needs?
The compiler can and does (at compile time) count up the sizes of all the local variables that the function declares on the stack, and thereby it knows (again, at compile time) how much it will need to increase the stack-pointer when the function is entered (and decrease it when the function returns). This computed amount-to-increase-the-stack-pointer-by value will be written into the executable code directly at compile time, so that it doesn't need to be re-computed when the function is called at run time.
The exception to the above is when your C program is using C99's variable-length-arrays (VLA) feature. As suggested by the name, variable-length-arrays are arrays whose size is not known until run-time, and as such the compiler will have to emit special code for any function that contains one or more VLAs, such that the amount by which to increase the stack-pointer is calculated at run-time.
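The contrast between the two cases can be sketched in C as follows. The function names are hypothetical, and the example assumes a compiler that supports VLAs (they are optional since C11, though GCC and Clang provide them). It reports sizes with sizeof rather than inspecting the stack pointer, which plain C cannot do portably.

```c
#include <assert.h>
#include <stddef.h>

/* Fixed-size local: its size, and hence the frame size, is a
   compile-time constant baked into the generated code. */
size_t fixed_frame_bytes(void)
{
    int buf[16];
    buf[0] = 0;
    (void)buf;
    return sizeof buf;      /* constant expression: 16 * sizeof(int) */
}

/* VLA local: the size depends on n, so the compiler must emit code
   that computes the stack adjustment when the function runs. */
size_t vla_frame_bytes(size_t n)
{
    assert(n > 0);          /* a zero-length VLA is undefined behaviour */
    int buf[n];
    buf[0] = 0;
    (void)buf;
    return sizeof buf;      /* evaluated at run time: n * sizeof(int) */
}
```

Calling vla_frame_bytes(10) and vla_frame_bytes(3) returns different values from the same compiled function, which is exactly why the adjustment cannot be a constant in that case.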
Note that the physical act of mapping virtual stack addresses to physical RAM addresses (and making sure that the necessary RAM is allocated) is done at run time, and is handled by the operating system, not by the compiler. In particular, if a process tries to access a virtual address (on the stack or otherwise) that is not currently mapped to any physical address, a page fault will be generated by the MMU. The process's execution will be temporarily paused while a page-fault-handler routine executes. The page-fault-handler will evaluate the legality of the virtual address the process tried to access; if the virtual address was a legal one, the page-fault-handler will map it to an appropriate page of physical RAM, and then let the process continue executing. If the virtual address was not one the process is allowed to access (or if the attempt to procure a page of physical RAM failed, e.g. because the computer's memory is full), then the mapping will fail and the OS will halt/crash the process.
I am writing a simple VM and I have a question on implementing object and structure member access.
Since the start address of a program is arbitrary on each run, the address of each and every one of its objects is arbitrary too.
Thus the only way I can think of to access an object or its member object is by accessing an offset from the "base" pointer, which means an arithmetic operation is needed to access anything in a program structure.
My question is whether this is the way it is done in professional compilers, because obviously this approach adds some overhead at run time, and I can't think of any way to offload this work from the run time, given the lack of guarantees about where memory will be allocated.
Most computers for many decades provide addressing modes that let you specify the address as a combination of a base and an offset, and the actual calculation is carried out in the hardware for no additional cost in CPU clock cycles.
More recent (past few decades) computers offer hardware for virtualizing memory layout, meaning that even though the physical address of an item is different on every run, its address in the virtual address space remains the same. Again, there is no additional cost for using the base address, because the calculations are performed implicitly and invisibly to the executing binary code of a program.
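At the source level, the base-plus-offset scheme looks like the following C sketch. The struct point3 and the helper read_int_at are invented for illustration; offsetof is the standard macro that yields the compile-time constant offset of a member.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct point3 {
    int x;
    int y;
    int z;
};

/* Read an int member the way generated code does conceptually:
   base address plus a compile-time constant offset. */
int read_int_at(const void *base, size_t offset)
{
    int value;
    assert(base != NULL);
    memcpy(&value, (const unsigned char *)base + offset, sizeof value);
    return value;
}
```

With struct point3 p = { 10, 20, 30 }; the call read_int_at(&p, offsetof(struct point3, y)) reads the same storage as p.y. Only the base changes from run to run; the offset is fixed when the program is compiled, and a typical compiler folds it into a single load instruction using a base-plus-displacement addressing mode.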
If we have a string "A" and a number 65, since they look identical in memory, how does the OS know which is the string and which is the number?
Another question - assume that a program allocates some memory (say, one byte). How does the OS remember where that memory has been allocated?
Neither of these details is handled by the operating system. They're handled by user programs.
For your first question, internally in memory there is absolutely no distinction between the character 'A' and the numeric value 65 (assuming, of course, that you're just looking at one byte of data). The difference arises when you see how those bits are interpreted by the program. For example, if the user program tries to print the string to the screen, it will probably make some system call to the OS to ask the OS to print the character. In that case, the code in the OS consists of a series of assembly instructions to replicate those bits somewhere in the display device. The display is then tasked with rendering a set of appropriate pixels to draw the character 'A.' In other words, at no point did the program ever "know" that the value was an 'A.' Instead, the hardware simply pushed around bits which controlled another piece of code that ultimately was tasked with turning those bits into pixels.
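As a small illustration of "the difference is in the interpretation", the C sketch below formats one and the same byte in two ways. The function name show_byte is hypothetical, and the expected characters assume an ASCII-compatible character set.

```c
#include <assert.h>
#include <stdio.h>

/* Format the same byte either as a character or as a decimal number.
   The byte itself is identical; only the conversion requested differs. */
void show_byte(unsigned char byte, int as_char, char *out, size_t out_size)
{
    assert(out != NULL && out_size > 0);
    if (as_char)
        snprintf(out, out_size, "%c", byte);   /* interpret as a character */
    else
        snprintf(out, out_size, "%d", byte);   /* interpret as a number */
}
```

Calling show_byte(65, 1, buf, sizeof buf) leaves "A" in buf on an ASCII system, while show_byte(65, 0, buf, sizeof buf) leaves "65". The stored value never changed.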
For your second question, that really depends on the memory manager. There are many ways for a program to allocate memory and know where it's stored. I'm not fully sure I understand what you're asking, but I believe that this answer should be sufficient:
At the OS level, the OS kernel doesn't even know that the byte was allocated. Instead, the OS just allocates giant blocks of memory for the user program to use as it runs. When the program terminates, all that memory is reclaimed.
At the program level, most programs contain a memory manager, a piece of code tasked with allocating and divvying up that large chunk of memory into smaller pieces that can then be used by the program. This usually keeps track of allocated memory as a list of "chunks," where each chunk of memory is treated as a doubly-linked list of elements. Each chunk is usually annotated with information indicating that it's in use and how large the chunk is, which allows the memory manager to reclaim the memory once it's freed.
At the user code level, when you ask for memory, you typically store it in a pointer to keep track of where the memory is. This is just a series of bytes in memory storing the address, which the OS and memory manager never look at unless instructed to.
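The "annotated chunk" idea at the program level can be sketched in C like this. It is an illustration only, not the layout used by any particular malloc: the names chunk_header, chunk_alloc and chunk_header_of are made up, and the sketch borrows storage from the standard malloc instead of managing a large block from the OS itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical per-chunk bookkeeping stored just before the payload. */
union chunk_header {
    struct {
        size_t size;    /* payload size in bytes */
        int in_use;     /* 1 while allocated, 0 once released */
    } info;
    max_align_t align;  /* keeps the payload suitably aligned */
};

/* Allocate a payload with a header in front of it; returns the payload. */
void *chunk_alloc(size_t size)
{
    if (size > SIZE_MAX - sizeof(union chunk_header))
        return NULL;                    /* size computation would overflow */
    union chunk_header *h = malloc(sizeof *h + size);
    if (h == NULL)
        return NULL;
    h->info.size = size;
    h->info.in_use = 1;
    return h + 1;                       /* payload starts right after header */
}

/* Recover the header from a payload pointer by stepping back one header. */
union chunk_header *chunk_header_of(void *payload)
{
    assert(payload != NULL);
    return (union chunk_header *)payload - 1;
}
```

The user code only ever holds the payload pointer; the manager finds the size and the in-use flag by looking just in front of it, which is how it knows how much to reclaim on free.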
Hope this helps!
No. 2 - the system keeps a record of all allocations (of a certain process) and can thus remove them e.g. when the process terminates. I suggest you read a book on operating system principles (e.g. Tanenbaum's "Modern Operating Systems").
The character 'A' and the integer number 65 are stored the same way (at least on 32-bit systems) in memory. The string "A", however, is stored differently, and that can depend on the system or the programming language. Take for example C, which stores strings as essentially an array of characters followed by the null character.
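For the C case, the layout of "A" can be checked with a short sketch. The helper c_string_storage is a hypothetical name; it simply counts the characters plus the terminating null character.

```c
#include <assert.h>
#include <string.h>

/* Count the bytes a C string occupies, including its terminating '\0'. */
size_t c_string_storage(const char *s)
{
    assert(s != NULL);
    return strlen(s) + 1;
}
```

For "A" this gives 2: one byte holding the character (65 in ASCII) and one byte holding 0. The terminator is how C code finds the end of the string, since no length is stored.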
Operating systems use memory managers to keep track of which processes are using which parts of memory.
To a computer, a string is just numbers. The simplest example is the ASCII table, which assigns a number to every letter. So if you're familiar with C, you could write printf("%c", 65) (65 is 0x41 in hexadecimal) and actually get an A instead of a number. Hope that made sense.
The OS doesn't remember the location of the memory the program has allocated. That's what pointers are for!
The 'OS' applies an algorithm, which will look something like: "if every character in the string is a number, then the string is a number", and gets more complicated for decimals, +/-, etc!
http://en.wikipedia.org/wiki/Dynamic_memory_allocation
When executed, program will start running from virtual address 0x80482c0. This address doesn't point to our main() procedure, but to a procedure named _start which is created by the linker.
My Google research so far just led me to some (vague) historical speculations like this:
There is folklore that 0x08048000 once was STACK_TOP (that is, the stack grew downwards from near 0x08048000 towards 0) on a port of *NIX to i386 that was promulgated by a group from Santa Cruz, California. This was when 128MB of RAM was expensive, and 4GB of RAM was unthinkable.
Can anyone confirm/deny this?
As Mads pointed out, in order to catch most accesses through null pointers, Unix-like systems tend to make the page at address zero "unmapped". Thus, accesses immediately trigger a CPU exception, in other words a segfault. This is much better than letting the application go rogue. The exception vector table, however, can be at any address, at least on x86 processors (there is a special register for that, loaded with the lidt instruction).
The starting point address is part of a set of conventions which describe how memory is laid out. The linker, when it produces an executable binary, must know these conventions, so they are not likely to change. Basically, for Linux, the memory layout conventions are inherited from the very first versions of Linux, in the early 90's. A process must have access to several areas:
The code must be in a range which includes the starting point.
There must be a stack.
There must be a heap, with a limit which is increased with the brk() and sbrk() system calls.
There must be some room for mmap() system calls, including shared library loading.
Nowadays, the heap, where malloc() goes, is backed by mmap() calls which obtain chunks of memory at whatever address the kernel sees fit. But in older times, Linux was like previous Unix-like systems, and its heap required a big area in one uninterrupted chunk, which could grow towards increasing addresses. So whatever was the convention, it had to stuff code and stack towards low addresses, and give every chunk of the address space after a given point to the heap.
But there is also the stack, which is usually quite small but could grow quite dramatically on some occasions. The stack grows down, and when the stack is full, we really want the process to predictably crash rather than overwrite some data. So there had to be a wide area for the stack, with, at the low end of that area, an unmapped page. And lo! There is an unmapped page at address zero, to catch null pointer dereferences. Hence it was defined that the stack would get the first 128 MB of address space, except for the first page. This means that the code had to go after those 128 MB, at an address similar to 0x080xxxxx.
As Michael points out, "losing" 128 MB of address space was no big deal because the address space was very large compared with what could actually be used. At that time, the Linux kernel was limiting the address space for a single process to 1 GB, out of a maximum of 4 GB allowed by the hardware, and that was not considered to be a big issue.
Why not start at address 0x0? There are at least two reasons for this:
Because address zero is famously known as the NULL pointer, and is used by programming languages to sanity-check pointers. You can't use an address value for that if you're going to execute code there.
The actual content at address 0 is often (but not always) the exception vector table, and is hence not accessible in non-privileged modes. Consult the documentation of your specific architecture.
As for the entrypoint _start vs main:
If you link against the C runtime (the C standard libraries), the library wraps the function named main, so it can initialize the environment before main is called. On Linux, these are the argc and argv parameters to the application, the env variables, and probably some synchronization primitives and locks. It also makes sure that returning from main passes on the status code, and calls the _exit function, which terminates the process.