Disassembling the Linux kernel using objdump

If I try to disassemble the Linux kernel, it takes quite a long time because the kernel ELF binary is so large.
Is there a way to disassemble only one function or symbol, for instance the start_kernel function?
I don't want to filter the full output with grep, since that still disassembles everything and takes just as long.

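objdump needs a symbol table to locate a function. The compressed boot image (vmlinuz or bzImage) does not give it one, but the uncompressed vmlinux produced by a kernel build normally does, unless it has been stripped. Full debugging information (DWARF) is present only if you specifically built the kernel with it enabled, and it is needed only for source-level output such as objdump -S.
If your vmlinux has symbols, list them with nm (for example nm -n vmlinux, which sorts by address) to find the address of the function and of the symbol that follows it, then limit the disassembly with objdump -d --start-address=<start> --stop-address=<end> vmlinux. Note that objdump -j selects a section, not a symbol. Recent binutils versions also accept objdump --disassemble=<symbol>.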
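One way to restrict objdump to a single function is to take the function's address range from nm and pass it as --start-address and --stop-address. The following is a minimal sketch of that lookup in Python, not a definitive tool: the helper name objdump_command_for and the sample addresses are made up for illustration, and it assumes the usual three-column "address type name" output of nm -n.

```python
import shlex


def objdump_command_for(symbol, nm_output, binary="vmlinux"):
    """Build an objdump command line that disassembles only `symbol`.

    `nm_output` is the text printed by `nm -n <binary>`: one
    "address type name" triple per line.
    """
    entries = []
    for line in nm_output.splitlines():
        parts = line.split()
        if len(parts) != 3:
            continue  # e.g. undefined symbols, which have no address column
        addr, _kind, name = parts
        entries.append((int(addr, 16), name))
    entries.sort()

    for i, (addr, name) in enumerate(entries):
        if name == symbol:
            # Assume the function ends where the next higher symbol begins.
            end = next((a for a, _ in entries[i + 1:] if a > addr), None)
            cmd = ["objdump", "-d", f"--start-address={addr:#x}"]
            if end is not None:
                cmd.append(f"--stop-address={end:#x}")
            cmd.append(binary)
            return cmd
    raise KeyError(symbol)


# Hypothetical nm output; real addresses differ from build to build.
SAMPLE_NM = """\
ffffffff81000000 T _text
ffffffff82a01000 T start_kernel
ffffffff82a01a40 T setup_arch
"""

print(shlex.join(objdump_command_for("start_kernel", SAMPLE_NM)))
```

The printed command can be pasted into a shell. Taking the next symbol as the end of the function is only an approximation, so the last few lines of output may be padding.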

Instead of disassembling the full kernel image, you can disassemble only the object file that contains the function you are interested in, which is much faster.
For example:
objdump -DS XYZ.o > XYZ.S

Related

How addr2line can locate the source file and the line of code?

addr2line translates addresses into file names and line numbers. I am still a beginner in debugging and have some questions about addr2line.
If I am debugging a certain .so (binary) file, how can the tool locate its source code file (where does it get it from)? What if the source doesn't exist?
What is the relation between an address in a binary and the line number in its source, so that addr2line can do this kind of mapping?
In general, addr2line works best on ELF executables or shared libraries that contain debug information. That debug information is emitted by the compiler when you pass -g (or -g2, etc.) to GCC. It notably provides a mapping between source code locations (source file name, line number, column number) and machine code addresses, and it also describes functions, variable names, the organization of call stack frames, and so on. Today the debug information is in the DWARF format (which is also processed by the gdb debugger, the libbacktrace library, etc.). Notice that the debug information contains source file paths, not the source files themselves, so addr2line can report a file name and line number even when the source is not available.
In practice, you can (and often should) pass the -g (or -g2) debugging option to GCC even with optimization flags like -O2. In that case, the debug information is slightly less precise but still usable in practice. In some cases, stack frames may disappear (inlined function calls, tail call optimizations, etc.).
You can use the strip(1) utility to remove debug information (and other symbol tables, etc.) from an ELF executable.
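To make the address-to-line relation concrete, here is a deliberately simplified model in Python. It is not how DWARF is encoded and not how addr2line is implemented. It only shows the idea that the debug information boils down to a table sorted by address, and that a lookup finds the last row starting at or before the queried address. The table contents, paths, and the lookup name are invented for illustration.

```python
import bisect

# Hypothetical, simplified line table: (start_address, source_path, line).
# Each row covers the addresses from its start up to the next row's start.
LINE_TABLE = [
    (0x1130, "/src/demo/util.c", 10),
    (0x1138, "/src/demo/util.c", 11),
    (0x1150, "/src/demo/main.c", 42),
    (0x1164, "/src/demo/main.c", 44),
]
END_ADDRESS = 0x1170  # first address past the code covered by the table


def lookup(address):
    """Return "path:line" for `address`, or None if it is not covered."""
    if not LINE_TABLE[0][0] <= address < END_ADDRESS:
        return None
    starts = [row[0] for row in LINE_TABLE]
    # Index of the last row whose start address is <= address.
    index = bisect.bisect_right(starts, address) - 1
    _start, path, line = LINE_TABLE[index]
    return f"{path}:{line}"


print(lookup(0x1154))  # falls in the row starting at 0x1150
```

Only the path string is stored, which is why a tool built on such a table can print a location without ever opening the source file.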

How does the Linux kernel determine ld.so's load address?

I know that the dynamic linker uses mmap() to load libraries. I guess it is the kernel that loads both the executable and its interpreter into the same address space, but how does it determine where? I noticed that ld.so's load address with ASLR disabled is 0x555555554000 (on x86_64). Where does this address come from? I tried following do_execve()'s code path, but it branches too much for me not to be confused as hell.
Read more about ELF, in particular elf(5), and about the execve(2) syscall.
An ELF file may contain an interpreter. elf(5) mentions:
PT_INTERP The array element specifies the location and
size of a null-terminated pathname to invoke
as an interpreter. This segment type is
meaningful only for executable files (though
it may occur for shared objects). However it
may not occur more than once in a file. If
it is present, it must precede any loadable
segment entry.
In practice that interpreter is almost always ld-linux(8) (e.g. with GNU glibc), more precisely (on my Debian/Sid) /lib64/ld-linux-x86-64.so.2. If you compile musl-libc and then build some software with it, you'll get a different interpreter, /lib/ld-musl-x86_64.so.1. That ELF interpreter is the dynamic linker.
The execve(2) syscall is using that interpreter:
If the executable is a dynamically linked ELF executable, the
interpreter named in the PT_INTERP segment is used to load the needed
shared libraries. This interpreter is typically /lib/ld-linux.so.2
for binaries linked with glibc.
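To see which interpreter a given executable requests, the PT_INTERP entry can be read straight from the program header table. The sketch below assumes a 64-bit little-endian ELF file and uses the field offsets from elf(5). The function name read_interp is chosen here for illustration.

```python
import struct

PT_INTERP = 3  # program header type for the interpreter path, per elf(5)


def read_interp(data):
    """Return the PT_INTERP path of a 64-bit little-endian ELF image.

    `data` is the file content as bytes. Returns None when the file has
    no PT_INTERP entry (for example a statically linked executable).
    """
    if data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    if data[4] != 2 or data[5] != 1:
        raise ValueError("only 64-bit little-endian ELF is handled here")

    # ELF64 header: e_phoff at 0x20, e_phentsize at 0x36, e_phnum at 0x38.
    (e_phoff,) = struct.unpack_from("<Q", data, 0x20)
    e_phentsize, e_phnum = struct.unpack_from("<HH", data, 0x36)

    for i in range(e_phnum):
        offset = e_phoff + i * e_phentsize
        # Elf64_Phdr starts with p_type, p_flags, p_offset, p_vaddr,
        # p_paddr, p_filesz.
        p_type, _flags, p_offset, _vaddr, _paddr, p_filesz = struct.unpack_from(
            "<IIQQQQ", data, offset
        )
        if p_type == PT_INTERP:
            raw = data[p_offset:p_offset + p_filesz]
            return raw.rstrip(b"\0").decode()
    return None
```

Calling read_interp(open("/bin/ls", "rb").read()) on a typical glibc-based x86-64 system should return the ld-linux path mentioned above, though the exact path depends on the distribution.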
See also Levine's book on Linkers and loaders, and Drepper's paper: How To Write Shared Libraries
Notice that execve is also handling the shebang (i.e. first line starting with #!); see the Interpreter scripts section of execve(2). BTW, for ELF binaries, execve is doing the equivalent of mmap(2) on some segments.
Read also about vdso(7), proc(5) & ASLR. Type cat /proc/self/maps in your shell.
(My guess, though I am not sure, is that the 0x555555554000 address comes from the ELF program header of your executable, or perhaps of ld-linux.so; it might also come from the kernel, since 0x55555555 seems to appear in the kernel source code.)
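On the kernel hypothesis: as far as I can tell, x86-64 kernels load position-independent ELF images at a base called ELF_ET_DYN_BASE, defined as two-thirds of the user address space size and then rounded down to a page boundary. That is my reading of the kernel sources rather than something established above, and the constants below are assumptions (47-bit user address space, 4 KiB pages). Under those assumptions the arithmetic reproduces the observed address:

```python
PAGE_SIZE = 0x1000          # assumed: 4 KiB pages
TASK_SIZE = 0x7FFFFFFFF000  # assumed: 47-bit user address space on x86-64


def pie_base(task_size=TASK_SIZE, page_size=PAGE_SIZE):
    """Two-thirds of the user address space, rounded down to a page."""
    base = task_size // 3 * 2        # integer arithmetic, as in C
    return base & ~(page_size - 1)   # clear the low bits to page-align


print(hex(pie_base()))  # 0x555555554000
```

With ASLR enabled a random offset is added, which would explain why the round value only shows up when randomization is turned off.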

Can nasm generate debug symbol to binary file?

I have a binary file made with nasm -f that I want to debug, or something close to it. As far as I know, nasm doesn't generate proper debugging symbols for a binary file, right? Which approach could I use to, e.g., see each value passed through a register or memory, one at a time? I have an "array" in an assembly program and I want to see each of its values. Is there any tool that would help with this task?
If you are on Linux, you should use nasm -f elf -F dwarf (or -f elf64 for 64-bit code) to get debug information, and make sure you are not stripping it during linking.
Also, you don't need debug info to see register or memory contents: a debugger such as gdb can show them with info registers and the x command.

Intel binary to ELF

Really quick question here. I'm working in Ubuntu, and I have a simple "Hello World!" program in assembly which I have assembled into x86 machine code. Now I want to turn that machine code into an ELF executable which my computer can run. I am aware that I could just assemble directly to ELF; the purpose of my inquiry is to discover how to make ELF binaries out of assembled machine code.
Thanks!
Final ELF executable files are typically built out of other ELF files, reorganized by the linker. The easiest way, of course, would be to specify ELF as the output format of your assembler.
1) If you really want to do this, you could start with an "empty" ELF file (that you get from compiling or assembling nothing, etc.). Then you could use objcopy --add-section, which allows you to add an arbitrary file as a section in an existing ELF file.
This will create a minimal ELF file:
$ echo "" | gcc -c -o empty.out -xc -
2) Alternatively, you could include your raw binary into another assembly file using something like nasm's incbin, which would then need to be assembled as an ELF.
3) A third option (the best so far) would be to provide your raw binary to the linker, and use a custom linker script to tell it what section to put it in (determined from the input file name). The -b flag before an input file tells ld what format that file is in (for example, -b binary for raw data). This should let you use your flat binary file.
One of the first obstacles you're going to face is getting the entry point to point to your code. Off the top of my head I'm not sure how to edit that.
There is a Python library, pyelftools that may help you in your quest.
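For comparison, the ELF wrapper can also be written by hand, which sidesteps the entry point problem because the header is filled in directly. The sketch below is one possible approach rather than the method any of the tools above use. It assumes 64-bit x86 Linux, a fixed load address of 0x400000, and flat code that is either position-independent or assembled for the address it ends up at (load address plus 120 bytes of headers). The names wrap_in_elf64 and EXIT_42 are made up for illustration, and a 32-bit program would need the ELF32 layout instead.

```python
import struct

LOAD_ADDR = 0x400000             # assumed load address of the single segment
EHDR_SIZE, PHDR_SIZE = 64, 56    # sizes of Elf64_Ehdr and Elf64_Phdr


def wrap_in_elf64(code, load_addr=LOAD_ADDR):
    """Wrap flat x86-64 machine code in a minimal static ELF64 executable."""
    code_offset = EHDR_SIZE + PHDR_SIZE
    total = code_offset + len(code)

    # e_ident: magic, ELFCLASS64, little-endian, version 1, System V ABI.
    ident = b"\x7fELF" + bytes([2, 1, 1, 0]) + b"\0" * 8
    ehdr = struct.pack(
        "<16sHHIQQQIHHHHHH",
        ident,
        2,                        # e_type: ET_EXEC
        0x3E,                     # e_machine: EM_X86_64
        1,                        # e_version
        load_addr + code_offset,  # e_entry: first byte of the code
        EHDR_SIZE,                # e_phoff: program header follows the header
        0, 0,                     # e_shoff, e_flags: no section headers
        EHDR_SIZE, PHDR_SIZE, 1,  # e_ehsize, e_phentsize, e_phnum
        0, 0, 0,                  # e_shentsize, e_shnum, e_shstrndx
    )
    # One PT_LOAD segment mapping the whole file read+execute.
    phdr = struct.pack(
        "<IIQQQQQQ",
        1, 5,                     # p_type: PT_LOAD, p_flags: PF_R | PF_X
        0, load_addr, load_addr,  # p_offset, p_vaddr, p_paddr
        total, total, 0x1000,     # p_filesz, p_memsz, p_align
    )
    return ehdr + phdr + code


# mov edi, 42 ; mov eax, 60 (exit) ; syscall
EXIT_42 = bytes.fromhex("bf2a000000" "b83c000000" "0f05")
image = wrap_in_elf64(EXIT_42)
print(len(image), "bytes")
```

Writing image to a file and marking it executable (for example with os.chmod(path, 0o755)) should give a program that exits with status 42 on x86-64 Linux. A "Hello World!" that refers to its string by absolute address would have to be assembled with a matching origin.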
If it's really assembled, then it's already in the ELF format (compilers targeting Linux generally store the object code in ELF object files as well).
However, if you want a fully-functioning executable, you have to feed the object file to a linker.

Changing Linux memory protection

Is there a way to check which memory protection mechanism is used by the OS?
I have a program that fails with a segmentation fault on one computer (Ubuntu) but not on another (RH6).
One of the explanations was the memory protection mechanism used by the OS.
Is there a way I can find or change it?
Thanks,
You might want to learn more about virtual memory, system calls, the Linux kernel, and ASLR.
Then you could study the role and usage of the mmap and munmap system calls (and also mprotect). They are the syscalls used to obtain memory from the kernel and give it back (e.g. to implement malloc and free), sometimes alongside obsolete syscalls like sbrk (which is increasingly useless).
You should use the gdb debugger (its watch command may be handy) and the valgrind utility. strace could also be useful.
Look also inside the /proc pseudo file system. Try to understand what cat /proc/self/maps is telling you (about the process running that cat). Look also inside /proc/$(pidof your-program)/maps, and consider using the pmap utility.
If it is your own source code, always compile it with all warnings and debugging info, e.g. gcc -Wall -Wextra -g, and improve it until the compiler gives no warnings. Use a recent version of gcc (e.g. 4.7) and of gdb (e.g. 7.4).
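The /proc/<pid>/maps listing can also be checked programmatically, which is handy once gdb has told you the faulting address. This small Python sketch parses the documented maps format and reports which mapping, if any, contains a given address and with what permissions. The sample text and the helper names are invented for illustration. On a live system you would read /proc/<pid>/maps instead.

```python
from collections import namedtuple

Mapping = namedtuple("Mapping", "start end perms path")


def parse_maps(text):
    """Parse the text of a /proc/<pid>/maps file into Mapping tuples."""
    mappings = []
    for line in text.splitlines():
        # Columns: address range, perms, offset, device, inode, [path]
        fields = line.split(None, 5)
        if len(fields) < 5:
            continue
        start, end = (int(part, 16) for part in fields[0].split("-"))
        path = fields[5].strip() if len(fields) > 5 else ""
        mappings.append(Mapping(start, end, fields[1], path))
    return mappings


def find_mapping(mappings, address):
    """Return the mapping containing `address`, or None if it is unmapped."""
    for mapping in mappings:
        if mapping.start <= address < mapping.end:
            return mapping
    return None


# Hypothetical excerpt; real addresses and inode numbers will differ.
SAMPLE_MAPS = """\
555555554000-555555556000 r--p 00000000 08:01 1234    /usr/bin/cat
555555556000-55555555b000 r-xp 00002000 08:01 1234    /usr/bin/cat
7ffffffde000-7ffffffff000 rw-p 00000000 00:00 0       [stack]
"""

hit = find_mapping(parse_maps(SAMPLE_MAPS), 0x555555557000)
print(hit.perms, hit.path)
```

If the faulting address is in no mapping at all, the program followed a bad pointer. If it is in a mapping whose permissions lack w or x, the access was refused by the protection bits that mprotect controls.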
