Grub bootloader with shared library support

Grub bootloader with shared library support - shared-libraries

I'd like to load a shared library (closed-source binary user-space library) at boot stage with grub boot-loader. Are there any chances for this or I must write a custom-elf-loader (grub module) to do it?
29/08/2014: For more detail, this is a programming problem in which I
want to customize or add some new features to Grub boot-loader
project. Thank you for your all supporting!

So, you don't make it crystal clear what you are trying to do, but:
Loading a userspace (assuming Linux SysV ELF type) shared library straight into GRUB is not possible. GRUB modules are indeed in ELF format, but they contain additional headers. Among the information contained in that header is an explicit license statement - GRUB will refuse to load any modules that are not explicitly GPLv2+, GPLv3 or GPLv3+.
It should be possible to write an ELF loader, but an easier way might be to write a tool to convert a userspace library to a GRUB module. There would of course be several restrictions here:
You would need to ensure the userspace library performed no system calls - GRUB would have nothing in place to handle them.
You would need to abide by the licensing rules (so only above three licenses would be acceptable).
You would need to ensure these libraries were not dependent on a global offset table being set up by glibc for them.
If recompiling is an option, GRUB also provides a POSIX emulation layer - add CPPFLAGS_POSIX to your CPPFLAGS, and use core standard POSIX header files. Have a look at the gcrypt support for an example.

Related

How can we tell an instruction is from application code or library code on Linux x86_64

I wanted to know whether an instruction is from the application itself or from the library code.
I observed some application code/data are located at about 0x000055xxxx while libraries and mmaped regions are by default located at 0x00007fcxxxx. Can I use for example, 0x00007f00...00 as a boundary to tell instruction is from the application itself or from the library?
How can I configure this boundary in Linux kernel?
Updated.
Can I prevent (or detect) a syscall instruction being issued from application code (only allow it to go through libc). Maybe we can do a binary scan, but due to the variable length of instruction, it's hard to prevent unintended syscall instruction.

Do it the other way. You need to learn a lot.
First, read a lot more about operating systems. So read the Operating Systems: Three Easy Pieces textbook.
Then, learn more about ASLR.
Read also Drepper's How to write shared libraries and Levine's Linkers and loaders book.
You want to use pmap(1) and proc(5).
You probably want to parse the /proc/self/maps pseudo-file from inside your program. Or use dladdr(3).
To get some insight, run cat /proc/$$/maps and cat /proc/self/maps in a Linux terminal

I wanted to know whether an instruction is from userspace or from library code.
You are confused: both library code and main executable code are userspace.
On Linux x86_64, you can distinguish kernel addresses from userpsace addresses, because the kernel addresses are in the FFFF8000'00000000 through FFFFFFFF'FFFFFFFF range on current (48-bit) implementations. See canonical form address description here.
I observed some application code/data are located at about 0x000055xxxx while libraries and mmaped regions are by default located at 0x00007fcxxxx. Can I use for example, 0x00007f00...00 as a boundary to tell instruction is from the application itself or from the library?
No, in general you can't. An application can be linked to load anywhere within canonical address space (though most applications aren't).
As Basile Starynkevitch already answered, you'll need to parse /proc/$pid/maps, or know what address the executable is linked to load at (for non-PIE binary).

Can i use gcov for kernel modules without recompiling the whole kernel?

I have ubuntu OS and i installed gcov in it.
I am able to use gcov for my c-program which is in user space and i am getting the desired results.
When i want to use gcov for my .ko files(kernel space) i am getting an error.
I googled and from the below mentioned link i found that i will have to recompile my whole kernel by enabling CONFIG_DEBUG_FS, CONFIG_GCOV_KERNEL, CONFIG_GCOV_FORMAT_AUTODETECT and CONFIG_GCOV_PROFILE_ALL.
http://techvolve.blogspot.in/2014/03/how-to-gcovlcov-for-linux-kernel-modules.html
Do i have any other way to integrate gcov for my kernel loadable modules without recompiling kernel?
If more information is required from my end please let me know. I will update it.
Thank You

Without support from the Linux kernel core you cannot collect coverage from a kernel module. So, if you current kernel has no such support, you have to recompile it.
CONFIG_GCOV_PROFILE_ALL isn't needed for coverage from the kernel module, however other configuration options are needed:
CONFIG_GCOV_KERNEL - enables coverage counters in the kernel space,
CONFIG_DEBUG_FS - enables debugfs filesystem, the only way for extract those counters into the user space,
CONFIG_GCOV_FORMAT_AUTODETECT - describes the format of the collected coverage (you may chose configuration option which selects specific format instead of autodetecting).

How to build the elf interpreter (ld-linux.so.2/ld-2.17.so) as static library?

I apologize if my question is not precise because I don't have a lot
of Linux related experience. I'm currently building a Linux from
scratch (mostly following the guide at linuxfromscratch.org version
7.3). I ran into the following problem: when I build an executable it
gets a hardcoded path to something called ELF interpreter.
readelf -l program
shows something like
[Requesting program interpreter: /lib/ld-linux.so.2]
I traced this library ld-linux-so.2 to be part of glibc. I am not very
happy with this behaviour because it makes the binary very unportable
- if I change the location of /lib/ld-linux.so.2 the executable no
longer works and the only "fix" I found is to use the patchelf utility
from NixOS to change the hardcoded path to another hardcoded path. For
this reason I would like to link against a static version of the ld
library but such is not produced. And so this is my question, could
you please explain how could I build glibc so that it will produce a
static version of ld-linux.so.2 which I could later link to my
executables. I don't fully understand what this ld library does, but I
assume this is the part that loads other dynamic libraries (or at
least glibc.so). I would like to link my executables dynamically, but
I would like the dynamic linker itself to be statically built into
them, so they would not depend on hardcoded paths. Or alternatively I
would like to be able to set the path to the interpreter with
environment variable similar to LD_LIBRARY_PATH, maybe
LD_INTERPRETER_PATH. The goal is to be able to produce portable
binaries, that would run on any platform with the same ABI no matter
what the directory structure is.
Some background that may be relevant: I'm using Slackware 14 x86 to
build i686 compiler toolchain, so overall it is all x86 host and
target. I am using glibc 2.17 and gcc 4.7.x.

I would like to be able to set the path to the interpreter with environment variable similar to LD_LIBRARY_PATH, maybe LD_INTERPRETER_PATH.
This is simply not possible. Read carefully (and several times) the execve(2), elf(5) & ld.so(8) man pages and the Linux ABI & ELF specifications. And also the kernel code doing execve.
The ELF interpreter is responsible for dynamic linking. It has to be a file (technically a statically linked ELF shared library) at some fixed location in the file hierarchy (often /lib/ld.so.2 or /lib/ld-linux.so.2 or /lib64/ld-linux-x86-64.so.2)
The old a.out format from the 1990s had a builtin dynamic linker, partly implemented in old Linux 1.x kernel. It was much less flexible, and much less powerful.
The kernel enables, by such (in principle) arbitrary dynamic linker path, to have various dynamic linkers. But most systems have only one. This is a good way to parameterize the dynamic linker. If you want to try another one, install it in the file system and generate ELF executables mentioning that path.
With great pain and effort, you might make your own ld.so-like dynamic linker implementing your LD_INTERPRETER_PATH wish, but that linker still has to be an ELF shared library sitting at some fixed location in the file tree.
If you want a system not needing any files (at some predefined, and wired locations, like /lib/ld.so, /dev/null, /sbin/init ...), you'll need to build all its executable binaries statically. You may want (but current Linux distributions usually don't do that) to have a few statically linked executables (like /sbin/init, /bin/sash...) that will enable you to repair a system broken to the point of not having any dynamic linker.
BTW, the /sbin/init -or /bin/sh - path is wired inside the kernel itself. You may pass some argument to the kernel at boot load time -e.g. with GRUB- to overwrite the default. So even the kernel wants some files to be here!
As I commented, you might look into MUSL-Libc for an alternative Libc implementation (providing its own dynamic linker). Read also about VDSO and ASLR and initrd.
In practice, accept the fact that modern Linuxes and Unixes are expecting some non-empty file system ... Notice that dynamic linking and shared libraries are a huge progress (it was much more painful in the 1990s Linux kernels and distributions).
Alternatively, define your own binary format, then make a kernel module or a binfmt_misc entry to handle it.
BTW, most (or all) of Linux is free software, so you can improve it (but this will take months -or many years- of work to you). Please share your improvements by publishing them.
Read also Drepper's Hwo to Write Shared Libraries paper; and this question.

I ran into the same issue. In my case I want to bundle my application with a different GLIBC than comes system installed. Since ld-linux.so must match the GLIBC version I can't simply deploy my application with the according GLIBC. The problem is that I can't run my application on older installations that don't have the required GLIBC version.
The path to the loader interpreter can be modified with --dynamic-linker=/path/to/interp. However, this needs to be set at compile time and therefore would require my application to be installed in that location (or at least I would need to deploy the ld-linux.so that goes with my GLIBC in that location which goes against a simple xcopy deployment.
So what's needed is an $ORIGIN option equivalent to what the -rpath option can handle. That would allow for a fully dynamic deployment.
Given the lack of a dynamic interpreter path (at runtime) leaves two options:
a) Use patchelf to modify the path before the executable gets launched.
b) Invoke the ld-linux.so directly with the executable as an argument.
Both options are not as 'integrated' as a compiled $ORIGIN path in the executable itself.

Loading ELF shared library and custom binfmt executable into same Linux address space

I am working on a project to load and run a custom binary format executable (PE, in my case) on a Linux platform. I've done this pretty successfully so far by first loading the executable and then loading a small ELF shared library that calls the start address of the executable and then exits safely.
I would really like not doing the ELF loading myself for a few reasons, though. First, the shared library I use is written in assembly (I can't use anything else because I'm not linking to libc, etc.), which will be very platform-specific, and I'd like to move away from that and use C so I can compile for any platform. Also, it will be easier and safer to use Linux's native ELF loader instead of my own simplified version.
I'm wondering if there is a way to use my binfmt handler, an installed kernel module, to load my executable and then ask Linux to load my shared library (and its dependencies) into the same address space without overwriting my executable code. I first thought that the uselib syscall might be useful, but the description on the man page is unclear about whether or not this will serve my purposes:
From libc 4.4.4 on only the library "/lib/ld.so" is loaded, so that
this dynamic library can load the remaining libraries needed (again
using this call). This is also the state of affairs in libc5.
glibc2 does not use this call.
I've also never seen an example of its use, and I'm always wary of using syscalls that I don't understand.
Is there a good way to achieve what I've described? Can I use Linux's existing capabilities to load a shared library (written in C) into an address space already containing executable code, and, if so, how can I use that library without knowing where it has been loaded?

there is already a project like this called binfmt_pe (by me!) which is a kernel module and will have it's own linker (similar to /lib/ld). check it out here.
As for your question about making modules and the loader/linker, there are links below. I also included links with info about ELF and PE executables.
I hope this helps. :)
Useful information for making a Linux Kernel Module
The Linux Kernel Module Programming Guide
Writing Your Own Loadable Kernel Module
Linux Data Structures
The Linux kernel: The kernel source
Kernel Support for miscellaneous Binary Formats
binfmt_elf.c
binfmt_misc.c
Information About Dynamnic Loading/Linking
Understanding ld-linux.so.2
ld.so : Dynamic-Link Library support
How To Write Shared Libraries - PDF
ld-linux(8) - Linux man page
rpath
Listing Shared Library Dependencies
Information About ELF and PE Formats
OSRC: Executable File Formats
EXE Format
How Windows NT Recognizes MS-DOS - Based Applications
Common Object File Format (COFF)
Peering Inside the PE: A Tour of the Win32 Portable Executable File Format
Executable and Linkable Format
An In-Depth Look into the Win32 Portable Executable File Format
An In-Depth Look into the Win32 Portable Executable File Format, Part 2
Injective Code Inside an Import Table
The PE Format | Hackers Library
IMAGE_NT_HEADERS structure (Windows)
x86 Disassembly/Windows Executable Files
the Portable Executable Format on Windows

What's the difference between /usr/include/linux and the include folder in linux kernel source?

On a freshly installed Ubuntu , i found kernel headers in both /usr/include/linux , and /usr/src/kernel-version-headers/include/linux
Are they mutually the same ?

They are very different; the /usr/include/linux headers are the headers that were used when compiling the system's standard C library. They are owned by the C library packaging, and updated in lockstep with the standard C library. They exist to provide the userland interface to the kernel, as understood and "brokered"1 by the C library.
The /usr/src/linux-headers-$(uname -r)/include/linux headers are used via the /lib/modules/$(uname -r)/build symbolic links. They are owned by the kernel headers packages and updated in lockstep with the kernel. These are a subset of the kernel headers and enough of the Kbuild system required to build out-of-tree kernel modules. These files represent the kernel internals -- modules must build against these if they are to properly understand in-memory objects. See the kernel's Documentation/kbuild/modules.txt file for some details.
1: "Mediated" was my first word choice, but it implies some sort of access controls, which isn't the case. "Brokered" implies a third-party process, but that is also not the case. Consider: when a C program calls _exit(), it is actually calling the Standard C library's _exit() wrapper, which calls the exit(2) system call. The select(2) interface has an upper limit on the number of file descriptors that can be tracked, and that limit is compiled into the standard C library. Even if the kernel's interface were extended, the C library would also need to be recompiled.

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string