Executing binaries without execve? - linux

I saw it mentioned somewhere that one can "emulate" execve (primarily with open and mmap) in order to load some other binary, without issuing an actual "execve" syscall.
Are there any already implemented examples for it?
Can we load both static and dynamic binaries?
Can it be done portably?
Such a feature may be useful for delegating work to arbitrary binaries while ignoring filesystem execute bits, or under an installed seccomp policy that does not allow the actual execve.
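For the static case the core of such an emulation is small: read the ELF program headers, mmap each PT_LOAD segment, and jump to the entry point. The following is only a minimal sketch under strong assumptions (x86_64, 4 KiB pages, a static non-PIE ET_EXEC target that does not require the full SysV startup stack of argc/argv/envp/auxv, and no address collision with the loader itself, so build the loader as PIE); a real replacement for execve must also build that startup stack and handle PT_INTERP for dynamically linked binaries.

    /* exec_noexecve.c -- hedged sketch of "execve without execve" for a
     * minimal *static, non-PIE* ELF binary: map its PT_LOAD segments
     * with open+mmap and jump to e_entry.  Error handling abbreviated. */
    #include <elf.h>
    #include <fcntl.h>
    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/mman.h>
    #include <unistd.h>

    int main(int argc, char **argv)
    {
        if (argc < 2) {
            fprintf(stderr, "usage: %s <static-elf-binary>\n", argv[0]);
            return 1;
        }

        int fd = open(argv[1], O_RDONLY);   /* no execute permission needed */
        if (fd < 0) { perror("open"); return 1; }

        Elf64_Ehdr eh;
        if (pread(fd, &eh, sizeof eh, 0) != sizeof eh ||
            memcmp(eh.e_ident, ELFMAG, SELFMAG) != 0 || eh.e_type != ET_EXEC) {
            fprintf(stderr, "not a static ET_EXEC ELF file\n");
            return 1;
        }

        for (int i = 0; i < eh.e_phnum; i++) {
            Elf64_Phdr ph;
            if (pread(fd, &ph, sizeof ph, eh.e_phoff + i * sizeof ph) != sizeof ph)
                return 1;
            if (ph.p_type != PT_LOAD)
                continue;

            /* mmap needs page-aligned addresses; assume 4 KiB pages. */
            uintptr_t page = ph.p_vaddr & ~0xfffUL;
            size_t    skew = ph.p_vaddr - page;
            int prot = ((ph.p_flags & PF_R) ? PROT_READ  : 0) |
                       ((ph.p_flags & PF_W) ? PROT_WRITE : 0) |
                       ((ph.p_flags & PF_X) ? PROT_EXEC  : 0);

            /* Map anonymous zero-filled memory and copy the file contents
             * in, so the p_memsz > p_filesz tail (.bss) is already zeroed. */
            void *seg = mmap((void *)page, skew + ph.p_memsz, prot | PROT_WRITE,
                             MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);
            if (seg == MAP_FAILED) { perror("mmap"); return 1; }
            pread(fd, (char *)seg + skew, ph.p_filesz, ph.p_offset);
            mprotect(seg, skew + ph.p_memsz, prot);
        }
        close(fd);

        /* Transfer control; a real loader would first build the startup
         * stack (argc, argv, envp, auxv) that glibc's _start expects. */
        ((void (*)(void))eh.e_entry)();
        return 0;
    }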

Related

Is it possible to share memory using the SysV shmat() interface in one application and the Posix shm_open() interface in another?

Ignoring some details, there are two low-level SHM APIs available in Linux.
We have the older SysV interface (see e.g. System V IPC vs POSIX IPC), using:
ftok
shmctl
shmget
shmat
shmdt
and the newer Posix interface (though Posix seems to standardize the SysV one as well):
shm_open
shm_unlink
Is it possible and safe to share memory such that one program uses shm_open() while the other uses shmget()?
I think the answer is no, though someone wiser may know better.
shm_open(path,...) maps one file to a shared memory segment whereas ftok(path,id,...) maps a named placeholder file to one or more segments.
See this related question - Relationship between shared memory and files
So on the one hand you have a one-to-one mapping between filenames and segments, and on the other a one-to-many mapping, as in the linked question.
Also, the path used for shmget() (via ftok()) is just a placeholder. For shm_open() the mapping might be the actual file (though this is implementation-defined).
I'm not sure there is any way to make shm_open() and shmat() refer to the same memory location.
Even if you could mix them somehow it would probably be undefined behaviour.
If you look at the glibc implementation of shm_open, it is simply a wrapper around opening a file.
shmget and shmat, by contrast, are implemented as their own system calls.
It may be that they share an implementation further down in the Linux kernel, but this is not a detail that should be exposed or relied upon.
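To see the incompatibility concretely, here is a hedged sketch (the names "/myshm" and "/tmp/myshm.key" are made up, error checking is omitted, and older glibc needs -lrt for shm_open): the two APIs create distinct kernel objects, so a write through one is not visible through the other.

    #include <fcntl.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/ipc.h>
    #include <sys/mman.h>
    #include <sys/shm.h>
    #include <sys/stat.h>
    #include <unistd.h>

    int main(void)
    {
        /* POSIX: a named object, typically a file under /dev/shm on Linux. */
        int fd = shm_open("/myshm", O_CREAT | O_RDWR, 0600);
        ftruncate(fd, 4096);
        char *posix_mem = mmap(NULL, 4096, PROT_READ | PROT_WRITE,
                               MAP_SHARED, fd, 0);

        /* SysV: ftok() only hashes the path (and project id) into a key;
         * the file's contents are never used as backing store. */
        close(open("/tmp/myshm.key", O_CREAT | O_WRONLY, 0600)); /* key file must exist */
        key_t key = ftok("/tmp/myshm.key", 'A');
        int id = shmget(key, 4096, IPC_CREAT | 0600);
        char *sysv_mem = shmat(id, NULL, 0);

        strcpy(posix_mem, "written via shm_open");
        /* Prints an empty string: the SysV segment is a different object. */
        printf("SysV segment sees: \"%.32s\"\n", sysv_mem);

        shmdt(sysv_mem);
        shmctl(id, IPC_RMID, NULL);
        munmap(posix_mem, 4096);
        shm_unlink("/myshm");
        return 0;
    }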

How can we tell an instruction is from application code or library code on Linux x86_64

I wanted to know whether an instruction is from the application itself or from the library code.
I observed some application code/data are located at about 0x000055xxxx while libraries and mmaped regions are by default located at 0x00007fcxxxx. Can I use for example, 0x00007f00...00 as a boundary to tell instruction is from the application itself or from the library?
How can I configure this boundary in Linux kernel?
Update:
Can I prevent (or detect) a syscall instruction being issued from application code (i.e. only allow it to go through libc)? Maybe we could do a binary scan, but due to the variable length of x86 instructions it is hard to rule out unintended syscall instructions.
Do it the other way around: query the process's actual memory map instead of assuming a fixed address boundary. You need to learn a lot first.
First, read a lot more about operating systems. So read the Operating Systems: Three Easy Pieces textbook.
Then, learn more about ASLR.
Read also Drepper's How to write shared libraries and Levine's Linkers and loaders book.
You want to use pmap(1) and proc(5).
You probably want to parse the /proc/self/maps pseudo-file from inside your program. Or use dladdr(3).
To get some insight, run cat /proc/$$/maps and cat /proc/self/maps in a Linux terminal.
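For example, a program can answer "which mapping does this instruction address fall in?" by walking /proc/self/maps itself. A minimal hedged sketch (field layout as documented in proc(5); error handling is thin):

    #include <inttypes.h>
    #include <stdio.h>

    static void find_mapping(uintptr_t target)
    {
        FILE *f = fopen("/proc/self/maps", "r");
        if (!f) { perror("fopen"); return; }

        char line[512];
        while (fgets(line, sizeof line, f)) {
            uintptr_t start, end;
            char perms[8], path[256] = "";
            /* "start-end perms offset dev inode pathname"; pathname is
             * optional (anonymous mappings have none). */
            if (sscanf(line, "%" SCNxPTR "-%" SCNxPTR " %7s %*s %*s %*s %255[^\n]",
                       &start, &end, perms, path) >= 3 &&
                target >= start && target < end) {
                printf("%#" PRIxPTR " is in %s (%s)\n",
                       target, path[0] ? path : "[anonymous]", perms);
                break;
            }
        }
        fclose(f);
    }

    int main(void)
    {
        find_mapping((uintptr_t)&main);   /* address in the main executable */
        find_mapping((uintptr_t)&fopen);  /* usually an address in libc */
        return 0;
    }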
I wanted to know whether an instruction is from userspace or from library code.
You are confused: both library code and main executable code are userspace.
On Linux x86_64, you can distinguish kernel addresses from userspace addresses, because the kernel addresses are in the FFFF8000'00000000 through FFFFFFFF'FFFFFFFF range on current (48-bit) implementations. See the canonical form address description here.
I observed some application code/data are located at about 0x000055xxxx while libraries and mmaped regions are by default located at 0x00007fcxxxx. Can I use for example, 0x00007f00...00 as a boundary to tell instruction is from the application itself or from the library?
No, in general you can't. An application can be linked to load anywhere within canonical address space (though most applications aren't).
As Basile Starynkevitch already answered, you'll need to parse /proc/$pid/maps, or know what address the executable is linked to load at (for a non-PIE binary).
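If parsing maps feels heavy, dladdr(3), mentioned above, reports the containing object's file name directly. A hedged sketch (glibc; link with -ldl on older glibc; note that in a non-PIE executable the address of a libc function taken this way may be its PLT stub inside the executable rather than the copy in libc):

    #define _GNU_SOURCE
    #include <dlfcn.h>
    #include <stdio.h>

    /* Report which loaded object contains the given instruction address. */
    static void classify(void *addr)
    {
        Dl_info info;
        if (dladdr(addr, &info) && info.dli_fname)
            printf("%p belongs to %s (nearest symbol: %s)\n",
                   addr, info.dli_fname,
                   info.dli_sname ? info.dli_sname : "?");
        else
            printf("%p is not in any loaded object\n", addr);
    }

    int main(void)
    {
        classify((void *)&main);    /* inside the main executable */
        classify((void *)&printf);  /* usually inside libc */
        return 0;
    }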

Gulp/Node: error while loading shared libraries: cannot allocate memory in static TLS block

Trying to run gulp and getting this output
$ gulp
node: error while loading shared libraries: cannot allocate memory in static TLS block
From what I have found, this seems to relate to gcc or g++, not sure how it pertains to node or gulp. Either way I can't seem to run gulp anymore. Should also mention, this just popped up today. It was running fine yesterday.
EDIT: seems like it's for all node commands. Just tried running npm -v to get the version number and it has the same output. Same with node -v
Running CentOS 6.9
The GNU toolchain supports various kinds of TLS, and one of them (the initial-exec model) involves what is essentially a fixed offset from the thread control block. At program startup, the dynamic linker computes all the offsets and makes sure that all threads have sufficient space for all the required thread local variables.
However, with dlopen, this does not work in general because it is not possible to move the thread control block around to make room for more thread-local variables. The current glibc dynamic linker has a heuristic which reserves some space for future dlopen calls, but if you load a number of shared objects, each with their own thread-local variables, this is not enough.
The usual workaround is to use the LD_DEBUG=files environment variable (or strace) to find relevant shared objects loaded with dlopen (unfortunately, the error message you quoted does not provide this information). After that, you can use the LD_PRELOAD environment variable to tell the dynamic linker to load them early. (It is sufficient to do this for the shared object which is dlopened, its dependencies are processed automatically.) This has the side effect that the computation at program startup takes into account their TLS needs, and when the dlopen call happens later at run time, no additional TLS variables have to be allocated. However, this approach does not work for all shared objects because it affects symbol lookup and the order in which ELF constructors run.
In the general case, it may be necessary to switch some shared objects to the global-dynamic TLS model (which requires recompiling them), or use a glibc build with an increased TLS reserve. Unfortunately, the reserve cannot currently be set at run time.
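To make the failure mode concrete, here is a hedged two-file sketch (the file names and the 64 KiB size are made up): a dlopened object built with the initial-exec TLS model demands more static TLS than the dynamic linker's reserve, so dlopen fails with the same "cannot allocate memory in static TLS block" message.

    /* File 1: tlsdemo.c -- hypothetical shared object with a large
     * initial-exec TLS demand.  Assumed build command:
     *   gcc -shared -fPIC -ftls-model=initial-exec -o libtlsdemo.so tlsdemo.c */
    __thread char big_tls_buffer[64 * 1024];
    char *touch(void) { return big_tls_buffer; }

    /* File 2: loader.c -- dlopen the object at run time.  Assumed build:
     *   gcc -o loader loader.c -ldl */
    #include <dlfcn.h>
    #include <stdio.h>

    int main(void)
    {
        void *h = dlopen("./libtlsdemo.so", RTLD_NOW);
        if (!h) {
            /* Typically: "... cannot allocate memory in static TLS block" */
            fprintf(stderr, "dlopen failed: %s\n", dlerror());
            return 1;
        }
        puts("loaded (the static TLS reserve happened to be large enough)");
        dlclose(h);
        return 0;
    }

Rebuilding such an object with -ftls-model=global-dynamic, or arranging for it to be loaded at startup via LD_PRELOAD, avoids the allocation at dlopen time, as described above.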

Is a core dump executable by itself?

The Wikipedia page on Core dump says
In Unix-like systems, core dumps generally use the standard executable image-format:
a.out in older versions of Unix,
ELF in modern Linux, System V, Solaris, and BSD systems,
Mach-O in OS X, etc.
Does this mean a core dump is executable by itself? If not, why not?
Edit: Since #WumpusQ.Wumbley mentions a coredump_filter in a comment, perhaps the above question should be: can a core dump be produced such that it is executable by itself?
In older Unix variants the default was to include the text as well as the data in the core dump, but that dump was also in the a.out format, not ELF. Today's default behaviour (in Linux for sure, not 100% sure about the BSD variants, Solaris etc.) is to have the core dump in ELF format without the text sections, but that behaviour can be changed.
However, a core dump cannot be executed directly in any case without some help. The reason is that two things are missing from a simple core file: one is the entry point, the other is code to restore the CPU state to the state at or just before the dump occurred (by default the text sections are missing as well).
In AIX there used to be a utility called undump, but I have no idea what happened to it. It doesn't exist in any standard Linux distribution I know of. As mentioned in the comments (#WumpusQ), there is also an attempt at a similar project for Linux; however, that project is not complete and doesn't restore the CPU state to the original state. It is, however, still good enough in some specific debugging cases.
It is also worth mentioning that there exist other ELF-formatted files that cannot be executed either and which are not core files, such as object files (compiler output) and .so (shared object) files. Those require a linking stage before being run, to resolve external addresses.
I emailed this question to the creator of the undump utility for his expertise, and got the following reply:
As mentioned in some of the answers there, it is possible to include
the code sections by setting the coredump_filter, but it's not the
default for Linux (and I'm not entirely sure about BSD variants and
Solaris). If the various code sections are saved in the original
core-dump, there is really nothing missing in order to create the new
executable. It does, however, require some changes in the original
core file (such as including an entry point and pointing that entry
point to code that will restore CPU registers). If the core file is
modified in this way it will become an executable and you'll be able
to run it. Unfortunately, though, some of the state is not going to
be saved, so the new executable will not be able to run directly. Open
files, sockets, pipes, etc. are not going to be open and may even point
to other FDs (which could cause all sorts of weird things). However,
it will most probably be enough for most debugging tasks, such as
running small functions from gdb (so that you don't get the "not
running an executable" error).
As others have said, I don't think you can execute a core dump file without the original binary.
In case you're interested in debugging the binary (and it has debugging symbols included, in other words it is not stripped), then you can run gdb binary core.
Inside gdb you can use the bt command (backtrace) to get the stack trace from when the application crashed.

How does AppArmor do "Environment Scrubbing"?

The AppArmor documentation mentions giving applications the ability to execute other programs with or without environment scrubbing. Apparently a scrubbed environment is more secure, but the documentation doesn't seem to specify exactly how environment scrubbing happens.
What is environment scrubbing and what does AppArmor do to scrub the environment?
"Environment scrubbing" is the removal of various "dangerous" environment variables which may be used to affect the behaviour of a binary - for example, LD_PRELOAD can be used to make the dynamic linker pull in code which can make essentially arbitrary changes to the running of a program; some variables can be set to cause trace output to files with well-known names; etc.
This scrubbing is normally performed for setuid/setgid binaries as a security measure, but the kernel provides a hook to allow security modules to enable it for arbitrary other binaries as well.
The kernel's ELF loader code uses this hook to set the AT_SECURE entry in the "auxiliary vector" of information which is passed to the binary. (See here and here for the implementation of this hook in the AppArmor code.)
As execution starts in userspace, the dynamic linker picks up this value and uses it to set the __libc_enable_secure flag; you'll see that the same routine also contains the code which sets this flag for setuid/setgid binaries. (There is equivalent code elsewhere for binaries which are statically linked.)
__libc_enable_secure affects a number of places in the main body of the dynamic linker code, and causes a list of specific environment variables to be removed.
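A quick hedged way to observe this from userspace (glibc 2.16+ for getauxval(3)): run the program below with LD_PRELOAD set, once normally and once in a situation that triggers secure mode (setuid, or an AppArmor profile with environment scrubbing), and compare the output.

    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/auxv.h>

    int main(void)
    {
        /* AT_SECURE is the auxiliary-vector flag the kernel (and the
         * AppArmor hook described above) sets for "secure" execution. */
        unsigned long secure = getauxval(AT_SECURE);
        const char *preload = getenv("LD_PRELOAD");

        printf("AT_SECURE  = %lu\n", secure);
        printf("LD_PRELOAD = %s\n", preload ? preload : "(unset or scrubbed)");
        return 0;
    }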
