Kernel symbol table mapped into virtual address space - why? - linux

What is /proc/ksyms and /proc/kallsyms, and why is it mapped into a processes address space? What purpose does it serve? Is it used in context switching of the kernel during a system call?

The Solaris manpage for ksyms(7d) explains this. The data is informative-only, the kernel exposes its currently-used symbol table to kernel debuggers and/or the kernel module loader this way, through /dev/ksyms.
Linux does the same through /proc/kallsyms; /proc/ksyms - if present - is a "traditional" file presenting a subset of the same data (i.e. it's deprecated).
The difference, as usual for Linux/Solaris, is that the Linux version presents text while the Solaris one is binary. You can run nm /dev/ksyms on the Solaris one to get the same type of output you get from cat /proc/kallsyms on Linux.

Related

Where can I find the Process Control Block (PCB) and GDTR/LDTR contents using GDB and QEMU?

I have a barebone linux kernel with buildroot setup for debugging using QEMU and GDB. I am using the x86_64 architecture.
I want to check how the memory protection works for each process. So basically, I need to find the base and limit values that govern the access to the physical memory.
If I understood correctly, the GDTR register in the x86 architecture "holds the base address (32 bits in protected mode; 64 bits in IA-32e mode) and the 16-bit table limit for the GDT." If not, please let me know where such information is held.
I tried using the i r in GDB but the output does not show the GDTR/LDTR contents. I read somewhere that we can use another method while inside the kernel in order to display the results of these registers.
I also need to check the PCB (Process Control Block) contents. I can't seem to find a way to do so. I read somewhere that if we do memory dump in the kernel, we can get the PCB contents, but I can't figure out how to do so.
So, how can I check the contents of PCB and GDTR/LDTR from gdb?
The setup is a simple qemu that launches the linux kernel with buildroot, connects gdb by using target remote :1234 and execute a simple C program that has fork and exec inside of it.

How to get distro name and version from linux kernel code?

I am looking if there is any API through which we can get OS distribution name and version from a Linux kernel module?
For example, I am using SLES 12 service pack 4. This information is present in /etc/os-release file. I want to know if there is any way to get this information from kernel code.
linux:/ # cat /etc/os-release
NAME="SLES"
VERSION="12-SP4"
VERSION_ID="12.4"
PRETTY_NAME="SUSE Linux Enterprise Server 12 SP4"
ID="sles"
ANSI_COLOR="0;32"
CPE_NAME="cpe:/o:suse:sles:12:sp4"
linux:/ #
There's no kernel API for detecting the current OS distribution, simply because it's not really needed. The Linux kernel itself is distribution-agnostic, and it couldn't care less which distribution is being run on top of it (having the kernel depend on what's being run on top of it wouldn't make much sense).
If you really want, you can open, read and parse the file yourself from kernel space. See more in this other post for an example, and in particular this answer for modern kernels. In any case, remember that filesystem interaction from kernel space is generally discouraged, and could easily lead to bugs and compromise the security of the kernel if done wrong, so be careful.
If you are developing a kernel module, I would suggest you to parse the /etc/os-release file from userspace when compiling/installing the module and use a set of #defines, or even module parameters. In any case, you should ask yourself why you need this information in kernel code in the first place, as you really shouldn't.

How can we tell an instruction is from application code or library code on Linux x86_64

I wanted to know whether an instruction is from the application itself or from the library code.
I observed some application code/data are located at about 0x000055xxxx while libraries and mmaped regions are by default located at 0x00007fcxxxx. Can I use for example, 0x00007f00...00 as a boundary to tell instruction is from the application itself or from the library?
How can I configure this boundary in Linux kernel?
Updated.
Can I prevent (or detect) a syscall instruction being issued from application code (only allow it to go through libc). Maybe we can do a binary scan, but due to the variable length of instruction, it's hard to prevent unintended syscall instruction.
Do it the other way. You need to learn a lot.
First, read a lot more about operating systems. So read the Operating Systems: Three Easy Pieces textbook.
Then, learn more about ASLR.
Read also Drepper's How to write shared libraries and Levine's Linkers and loaders book.
You want to use pmap(1) and proc(5).
You probably want to parse the /proc/self/maps pseudo-file from inside your program. Or use dladdr(3).
To get some insight, run cat /proc/$$/maps and cat /proc/self/maps in a Linux terminal
I wanted to know whether an instruction is from userspace or from library code.
You are confused: both library code and main executable code are userspace.
On Linux x86_64, you can distinguish kernel addresses from userpsace addresses, because the kernel addresses are in the FFFF8000'00000000 through FFFFFFFF'FFFFFFFF range on current (48-bit) implementations. See canonical form address description here.
I observed some application code/data are located at about 0x000055xxxx while libraries and mmaped regions are by default located at 0x00007fcxxxx. Can I use for example, 0x00007f00...00 as a boundary to tell instruction is from the application itself or from the library?
No, in general you can't. An application can be linked to load anywhere within canonical address space (though most applications aren't).
As Basile Starynkevitch already answered, you'll need to parse /proc/$pid/maps, or know what address the executable is linked to load at (for non-PIE binary).

Mac equivalent bless on Linux

Is there any equivalent mac "bless" command in Linux. I am specifically interested in the option "--bless-folder".
No. You'll need to configure things like grub to specify where it should find your kernel, and what arguments to pass to the kernel. Quite often the arguments to the kernel would include:
root=/dev/sda1 which means the filesystem can be found on a particular partition of a particular disk;
init=/sbin/init which is the very first file the kernel executes, which loads the whole of the rest of the OS.

Arguments to kernel

Is there anything that the kernel need to get from the boot loader.Usually the kernel is capable of bringing up a system from scratch, so why does it require anything from boot-loader?
I have seen boot messages from kernel like this.
"Fetching vars from bootloader... OK"
So what exactly are the variables being passed?
Also how are the variables being passed from the boot-loader? Is it through stack?
The kernel accept so called command-line options, that are text based. This is very useful, because you can do a lot of thing without having to recompile your kernel. As for the argument passing, it is architecture dependent. On ARM it is done through a pointer to a location in memory, or a fixed location in memory.
Here is how it is done on ARM.
Usually a kernel is not capable of booting the machine from scratch. May be from the bios, but then it is not from scratch. It needs some initialisation, this is the job of the bootloader.
There are some parametres that the Linux kernel accepts from the bootloader, of which what I can remember now is the vga parametre. For example:
kernel /vmlinuz-2.6.30 root=/dev/disk/by-uuid/3999cb7d-8e1e-4daf-9cce-3f49a02b00f2 ro vga=0x318
Have a look at 10 boot time parameters you should know about the Linux kernel which explains some of the common parametres.
For the Linux kernel, there are several things the bootloader has to tell the kernel. It includes things like the kernel command line (as several other people already mentioned), where in the memory the initrd has been loaded and its size, if an initrd is being used (the kernel cannot load it by itself; often when using an initrd, the modules needed to acess storage devices are within the initrd, and it can also have to do some quite complex setup before being able to access the storage), and several assorted odds and ends.
See Documentation/x86/boot.txt (link to 2.6.30's version) for more detail for the traditional x86 architecture (both 32-bit and 64-bit), including how these variables are passed to the kernel setup code.
The bootloader doesn't use a stack to pass arguments to the kernel. At least in the case of Linux, there is a rather complex memory structure that the bootloader fills in that the kernel knows how to parse. This is how the bootloader points the kernel to its command line. See Documentaion/x86/boot.txt for more info.
Linux accepts variables from the boot loader to allow certain options to be used. I know that one of the things you can do is make it so that you don't have to log-in (recovery mode) and there are several other options. It mainly just allows fixes to be done if there's an issue with something or for password changing. This is how the Ubuntu Live-CD boots Linux if you select to use another option.
Normally the parameters called command line parameters, which is passed to kernel module from boot loader. Bootloader use many of the BIOS interrupts to detect,
memory
HDD
Processor
Keyboard
Screen
Mouse
ETC...
and all harwares details are going to be detected at boot time, that is in real mode, then pass this parameters to Kernel.

Resources