How memory addresses are assigned at compilation time

I'm new to the IT field; recently I played a bit with debuggers and disassemblers.
Looking at a binary inside the Hopper tool, I noticed that memory addresses are assigned to it statically. My question is: if the operating system manages memory (and changes the addresses of functions etc. every time I run a binary), why does the binary also contain static memory addresses? A practical example:
Open the binary in Hopper and it shows that printf is at address 0x11ed; next, run the program in gdb and obviously the printf is at a different address. Is it the compiler that assigns static addresses to the binary, and why? Any recommended resources to learn more?

There are two different things that seem like they are confusing you.
1. The difference between virtual addresses and real memory addresses. Processes are assigned their own address space starting at 0. This has no relationship to real memory address 0. The mapping is managed by the OS, generally with hardware assistance from a 'memory management unit'. This part of my answer addresses the part of the question about 'operating systems managing memory'.
2. Dynamically-loaded libraries. An address may be relative to such a library; the mapping from library-relative address to virtual address is assigned at the time the library is loaded into an address space. Language runtime libraries are often distributed as dynamically-loaded shared libraries. This likely accounts for the difference in address between running your program standalone versus under the debugger. I don't know what 'hopper' is so I can't say exactly what it's showing you.

Related

Reading a value in a physical address via the kernel

I'm working on an old Linux operating system which has one kernel for all processes (it is basically an exo-kernel type).
While implementing debugging features from user space, I would like to disassemble other processes' instructions. Therefore, I have created a system call which takes a virtual address in the target process and prints the value stored there (so I can disassemble the bytes).
My idea was to switch to the target's pgdir, call a pagewalk, and then access the data at the physical address pointer. I get a kernel panic while trying to access the latter.
If I switch to the target process and then access the virtual address (without a pagewalk), the bytes of the instruction are printed without any problem (with printf("%04x", *va), for example).
My question is: why does the virtual address contain the actual instruction but the physical address doesn't (and why does it panic)?
Thank you!
Note: This is an XY-answer ... I'm aware I'm not answering your question ('how to twiddle with hardware MMU setup to read ... memory somewhere') but I'm suggesting a solution to your stated problem (how to read from another process' address space).
Linux provides a facility to do what you ask for - read memory from another process' address space - via the use of ptrace(),
PTRACE_PEEKTEXT, PTRACE_PEEKDATA
Read a word at the address addr in the tracee's memory, returning
the word as the result of the ptrace() call. Linux does not have
separate text and data address spaces, so these two requests are
currently equivalent. (data is ignored; but see NOTES.)
See https://stackoverflow.com/search?q=ptrace+PTRACE_PEEKDATA for some references.

Linux management of memory as exposed to the user

I built a bank-affinity malloc implementation in linux. It translates virtual addresses to physical addresses using /proc/[pid]/pagemap. I know the bits that convey the bank number in the physical address, so I am able to give only pages that correspond to the bank I want to a process, mapping the pages into contiguous virtual address space with mremap.
The results are unexpected, however. Running multiple processes with my malloc, each with affinity for a different bank, gives no performance improvement over running with the system's stock malloc. There should be some improvement, in theory, due to absence of bank contention. A similar kernel-based bank-affinity malloc did give quantifiable performance improvement.
Is there something I'm unaware of? Some translation, buffer, or the like that is keeping my user-level system from working, whereas the kernel-based system works?
Thanks

passing data from 32bit user program to 64bit linux

If a 32-bit user program is running on a 64-bit Linux kernel and wants to pass a pointer to data in user space to kernel code, and the same structure is defined in both user space and kernel space, will the kernel-space code be able to decode the data correctly?
If yes, how is it done?
Yes. The 32-bit addresses that you use (or any addresses you use; it's the same in 64-bit) are virtual addresses. In other words, any kind of address that you use and pass to anyone (including the kernel) is a "fantasy" thing; it does not correspond to real addresses in any obvious way. You never see anything but virtual addresses.
To make this work, the kernel (usually with help from the MMU) routinely translates virtual addresses to physical addresses. For that, every process has a table of all pages that are valid for that process (managed by the kernel).
The kernel maps and remaps virtual addresses to existing or non-existing locations at pretty much every page fault (so, basically, all the time).
The kernel can consequently do any translation that may be necessary for any pointer you pass it, whenever that is the case.

Is it possible to allocate a certain sector of RAM under Linux?

I have recently got some faulty RAM, and despite having already found this out, I would like to try a much easier approach: write a program that allocates the faulty regions of RAM and never releases them. It might not work well if they get allocated before the program runs, but it would be much easier to reboot on failure than to build a kernel with patches.
So the question is:
How to write a program that would allocate given sectors (or pages containing given sectors)
and (if possible) report if it was successful.
This will be problematic. To understand why, you have to understand the relationship between physical and virtual memory.
On any modern Operating System, programs will get a very large address space for themselves, with the remainder of the address space being used for the OS itself. Other programs are simply invisible: there's no address at which they're found. How is this possible? Simple: processes use virtual addresses. A virtual address does not correspond directly to physical RAM. Instead, there's an address translation table, managed by the OS. When your process runs, the table only contains mappings for RAM that's allocated to you.
Now, that implies that the OS decides what physical RAM is allocated to your program. It can (and will) change that at runtime. For instance, swapping is implemented using the same mechanism. When swapping out, a page of RAM is written to disk, and its mapping deleted from the translation table. When you try to use the virtual address, the OS detects the missing mapping, restores the page from disk to RAM, and puts back a mapping. It's unlikely that you get back the same page of physical RAM, but the virtual address doesn't change during the whole swap-out/swap-in. So, even if you happened to allocate a page of bad memory, you couldn't keep it. Programs don't own RAM, they own a virtual address space.
Now, Linux does offer some specific kernel functions that allocate memory in a slightly different way, but it seems that you want to bypass the kernel entirely. You can find a much more detailed description in http://lwn.net/images/pdf/LDD3/ch08.pdf
Check out BadRAM: it seems to do exactly what you want.
Well, it's not an answer on how to write a program, but it fixes the issue without compiling a kernel:
Use memmap or mem parameters:
http://gquigs.blogspot.com/2009/01/bad-memory-howto.html
I will edit this answer when I get it running and give details.
One option is to write your own kernel module, which can allocate specific physical addresses, and make the memory unswappable with mlock(2).
I've never tried it. No warranty.

Dynamic memory management under Linux

I know that under Windows there are API functions like GlobalAlloc(), which allocate memory and return a handle; the handle can then be locked to obtain a pointer, and unlocked again. While unlocked, the system can move this piece of memory around when it runs low on space, optimising memory usage.
My question is that is there something similar under Linux, and if not, how does Linux optimize its memory usage?
Those Windows functions come from a time when all programs were running in the same address space in real mode. Linux, and modern versions of Windows, run programs in separate address spaces, so they can move them about in RAM by remapping what physical address a particular virtual address resolves to in the page tables. No need to burden the programmer with such low level details.
Even on Windows, it's no longer necessary to use such functions except when interacting with a small number of old APIs. I believe Raymond Chen's blog and book have some discussions of the topic if you are interested in more detail. Eg here's part 4 of a series on the history of GlobalLock.
Not sure what the Linux equivalent is, but in AT&T UNIX there are "scatter/gather" memory-management functions in the memory manager of the core OS. In a virtual-memory operating environment there are no absolute addresses, so applications don't have an equivalent function. The executable loader (which loads an executable file into memory, where it becomes a process) uses addresses from the memory manager, all tracked in virtual memory blocks maintained in its page table (which contains the physical memory addresses). The bottom line is that your application's physical memory layout is likely never linear or directly accessible.

Resources