How to determine the effective user id of a process in Rust? - linux

On Linux and other POSIX systems, a program can be executed under the identity of another user (the effective user id, or euid). Normally, you'd call geteuid and friends to reliably determine the current identities of the process. However, I couldn't figure out a reliable way to determine these identities using only Rust's standard library.
The only thing I found that was close is std::os::unix::MetadataExt.
Is it currently possible to determine the euid (and other ids) of a process using Rust's standard library? Is there a function or trait I'm missing?

This has to come from an OS-specific dependency, as the concept does not exist (or does not do what you think it will!) on most of the targets you can build Rust code for. In particular, you will find it in the libc crate, which is, as the name suggests, a very thin wrapper over libc.
The std::os namespace is typically limited to the bare minimum needed to get process and filesystem functionality going for the std::process, std::thread and std::fs modules, so it would not be in there. MetadataExt is, for a similar reason, targeted at filesystem usage.
As you could have expected, the call itself is, unimaginatively, geteuid.
It is an unsafe extern import, so you'll have to wrap it in an unsafe block.

It appears that Rust 1.46.0 doesn't expose this functionality in the standard library. If you're using a POSIX system and don't want to rely on an extra dependency, you have four options:
1. You can use libc directly:
#[link(name = "c")]
extern "C" {
    fn geteuid() -> u32;
    fn getegid() -> u32;
}
// Calling either function requires an unsafe block, e.g. `unsafe { geteuid() }`.
If you're using GNU/Linux in particular, you can usually drop the link attribute: Rust's standard library already links against libc there, so the symbols resolve with a plain extern block. (The vDSO is not involved; it only exports a handful of time-related functions.)
2. Read /proc/self/status (potentially Linux only?). This file contains a line that starts with Uid:. This line lists the real, effective, saved set, and filesystem user ids. Refer to man proc for more information.
3. If you're using a normal GNU/Linux system, you can access the metadata of the /proc/self directory itself. As pointed out in this question, the owner of this directory should match the effective user id of the process. You can get the euid as follows:
use std::os::unix::fs::MetadataExt;
println!("metadata for {:?}", std::fs::metadata("/proc/self").map(|m| m.uid()));
A benefit of this approach is that it is relatively cheap compared to option #2, since it's only a single stat syscall (as opposed to opening a file and reading/parsing its contents).
4. If you're not using a normal GNU/Linux system, you might find success in creating a new dummy file and obtaining the owner id normally via Metadata.

Related

Is it possible to share memory using the SysV shmat() interface in one application and the Posix shm_open() interface in another?

Ignoring some details, there are two low-level SHM APIs available in Linux.
We have the older (e.g System V IPC vs POSIX IPC) SysV interface using:
ftok
shmctl
shmget
shmat
shmdt
and the newer Posix interface (though Posix seems to standardize the SysV one as well):
shm_open
shm_unlink
Is it possible and safe to share memory such that one program uses shm_open() while the other uses shmget()?
I think the answer is no, though someone wiser may know better.
shm_open(path,...) maps one file to a shared memory segment whereas ftok(path,id,...) maps a named placeholder file to one or more segments.
See this related question - Relationship between shared memory and files
So on the one hand you have a one to one mapping between filenames and segments and on the other a one to many - as in the linked question.
Also the path used by shmget() is just a placeholder. For shm_open() the map might be the actual file (though this is implementation defined).
I'm not sure there is any way to make shm_open() and shmat() refer to the same memory location.
Even if you could mix them somehow, it would probably be undefined behaviour.
If you look at the glibc implementation of shm_open, it is simply a wrapper around opening a file.
shmget and shmat, by contrast, are implemented as system calls.
It may be that they share an implementation further down in the Linux kernel, but this is not a detail that should be exposed or relied upon.

How can a shared library know where it resides?

I'm developing a shared library for linux machines, which is dynamically loaded relative to the main executable with rpath.
Now, the library itself tries to load other libraries dynamically relative to its location but without rpath (I use scandir to search for shared libraries in a certain folder - I don't know their names yet).
This only works if the working directory is set to the shared library's location; otherwise I end up looking in a different directory than intended.
Is there any practical, reliable way for the shared library to determine where it resides?
I know I could use /proc/self/maps or something like that to get the loaded files, but this only works as long as the library knows its own name.
Another idea is to use dlinfo(), but to use it, the shared library needs to know its own handle.
"Is there any practical, reliable way for the shared library to determine where it resides?"
I'd use dlinfo and /proc/self/maps (proc may not always be mounted, especially in containers).
"I know I could use /proc/self/maps or something like that to get the loaded files, but this only works as long as the library knows its own name."
Not really: you can take the address of some code inside the library (preferably of some internal label, to avoid messing with the PLT/GOT) and compare it against the memory ranges obtained from /proc/self/maps (or dlinfo).
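For the /proc/self/maps variant, a minimal C sketch might look like the following. The helper name mapping_path_for and the 4096-byte line buffer are assumptions made for illustration; the line layout (start-end perms offset dev inode path) is the one documented in man proc.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: find the file backing the mapping that contains addr.
   Returns 0 and fills out on success, -1 otherwise. */
int mapping_path_for(const void *addr, char *out, size_t outlen) {
    FILE *maps = fopen("/proc/self/maps", "r");
    if (!maps) return -1;                    /* /proc may not be mounted */

    uintptr_t target = (uintptr_t)addr;
    char line[4096];                         /* assumed to be long enough */
    int result = -1;

    while (fgets(line, sizeof line, maps)) {
        unsigned long start = 0, end = 0;
        int path_off = 0;
        /* Parse the address range; %n records where the path column begins. */
        if (sscanf(line, "%lx-%lx %*s %*s %*s %*s %n",
                   &start, &end, &path_off) < 2)
            continue;
        if (target < start || target >= end)
            continue;                        /* addr is not in this mapping */
        if (path_off <= 0 || line[path_off] != '/')
            break;                           /* anonymous or special mapping */
        line[strcspn(line, "\n")] = '\0';    /* drop the trailing newline */
        if (strlen(line + path_off) < outlen) {
            strcpy(out, line + path_off);
            result = 0;
        }
        break;
    }
    fclose(maps);
    return result;
}
```

Inside the library you would pass the address of one of its own functions, for example mapping_path_for((const void *)(uintptr_t)&some_internal_function, buf, sizeof buf), where some_internal_function is a placeholder name.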
No need to muck around with the page mappings or dlinfo. You can use dladdr on any symbol defined in the shared library:
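#define _GNU_SOURCE
#include <dlfcn.h>
#include <stdlib.h>

/* Returns a malloc'd absolute path to this library, or NULL on failure. */
static char *lib_path(void) {
    Dl_info info;
    /* Look up the loaded object that contains lib_path itself. */
    if (!dladdr((void *) lib_path, &info)) return NULL;
    return realpath(info.dli_fname, NULL);
}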
Technically this isn't portable; in practice, it works even on non-Linux systems such as macOS. You may want to manually allocate the storage for realpath to avoid non-standard behaviour there (on Linux and macOS, realpath itself mallocs the storage, which needs to be freed by the caller).
This returns the path to the shared library itself. If you want to obtain the directory, you could use something like dirname (careful: it may modify its argument and is not guaranteed to be reentrant) or modify the string yourself.
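To sidestep the dirname caveats, one option is to run it on a private copy and duplicate the result. This is only a sketch; the helper name dir_of is made up for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <libgen.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return a newly allocated copy of the directory part
   of path, or NULL on allocation failure. The caller frees the result. */
char *dir_of(const char *path) {
    char *copy = strdup(path);          /* dirname may modify its argument */
    if (!copy) return NULL;
    /* The result may point into copy or into static storage, so duplicate
       it before releasing copy. */
    char *dir = strdup(dirname(copy));
    free(copy);
    return dir;
}
```

Combined with the function above, dir_of(lib_path()) would give the library's directory (remember to free both strings).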

How to prohibit system calls, GNU/Linux

I'm currently working on the back-end of an ACM-like public programming contest system. In such a system, any user can submit source code, which will be compiled and run automatically (which means no human pre-moderation is performed) in an attempt to solve some computational problem.
The back-end is a dedicated GNU/Linux machine, where a user will be created for each contestant, all such users being part of the users group. Sources sent by any particular user will be stored in that user's home directory, then compiled and executed to be verified against various test cases.
What I want is to prohibit the use of Linux system calls in submitted sources. That's because problems require platform-independent solutions, while allowing system calls from untrusted sources is a potential security breach. Such sources may be successfully placed in the FS, even compiled, but never run. I also want to be notified whenever a source containing system calls is sent.
By now, I see the following places where such checker may be placed:
Front-end/pre-compilation analysis - source already checked in the system, but not yet compiled. Simple text checker against system calls names. Platform-dependent, compiler-independent, language-dependent solution.
Compiler patch - crash GCC (or any other compiler included in the tool-chain) whenever system call is encountered. Platform-dependent, compiler-dependent, language-independent solution (if we place checker "far enough"). Compatibility may also be lost. In fact, I dislike this alternative most.
Run-time checker - whenever system call is invoked from the process, terminate this process and report. This solution is compiler and language independent, but depends on the platform - I'm OK with that, since I will deploy the back-end on similar platforms in short- and mid-terms.
So the question is: does GNU/Linux provide a way for an administrator to prohibit system call usage for a user group, a user, or a particular process? It may be a security policy or a lightweight GNU utility.
I tried to Google, but Google disliked me today.
Mode 1 seccomp (strict mode) allows a process to limit itself to exactly four syscalls: read, write, sigreturn, and _exit. This can be used to severely sandbox code, as seccomp-nurse does.
Mode 2 seccomp (filter mode; at the time of writing, found in Ubuntu 12.04, or you can patch your own kernel) provides more flexibility in filtering syscalls. You can, for example, first set up filters, then exec the program under test. Appropriate use of chroot or unshare can prevent it from re-execing anything else "interesting".
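As a rough illustration of mode 1, the sketch below forks a child, switches it to strict mode with prctl, and lets it attempt a syscall outside the allowed set; the kernel is expected to kill it with SIGKILL. The function name and its return convention are assumptions for this example, and enabling strict mode can fail in kernels or sandboxes without seccomp support, or where a seccomp filter is already installed.

```c
#define _GNU_SOURCE
#include <linux/seccomp.h>
#include <signal.h>
#include <sys/prctl.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical demo. Returns 1 if a child in strict mode was killed for
   making a forbidden syscall, 0 if strict mode could not be enabled,
   and -1 on any other outcome. */
int strict_seccomp_kills_forbidden_syscall(void) {
    pid_t pid = fork();
    if (pid < 0) return -1;

    if (pid == 0) {
        /* Child: from here on only read, write, sigreturn and exit are allowed. */
        if (prctl(PR_SET_SECCOMP, SECCOMP_MODE_STRICT) != 0)
            _exit(2);                /* strict mode unavailable here */
        syscall(SYS_getpid);         /* not allowed: kernel sends SIGKILL */
        syscall(SYS_exit, 3);        /* only reached if the child survived */
    }

    /* Parent: inspect how the child ended. */
    int status = 0;
    if (waitpid(pid, &status, 0) != pid) return -1;
    if (WIFSIGNALED(status) && WTERMSIG(status) == SIGKILL) return 1;
    if (WIFEXITED(status) && WEXITSTATUS(status) == 2) return 0;
    return -1;
}
```

Note that a contestant's program started this way could not even open its input files, which is why the more flexible filter mode is usually the better fit for a judge.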
I think you need to define system call better. I mean,
cat <<EOF > hello.c
#include <stdio.h>
int main(int argc, char **argv) {
    fprintf(stdout, "Hello world!\n");
    return 0;
}
EOF
gcc hello.c
strace -q ./a.out
demonstrates that even an apparently trivial program makes ~27 system calls.
You (I assume) want to allow calls to the "standard C library", but those in turn will be implemented in terms of system calls. I guess what I'm trying to say is that run-time checking is less feasible than you might think (using strace or similar anyway).

How/where is the working directory of a program stored?

When a program accesses files, uses system(), etc., how and where is the current working directory of that program physically known/stored? Since logically the working directory of a program is similar to a global variable, it should ideally be thread-local, especially in languages like D where "global" variables are thread-local by default. Would it be possible to make the current working directory of a program thread-local?
Note: If you are not familiar with D specifically, even a language-agnostic answer would be useful.
On Linux, each process is represented by a process descriptor - a task_struct. This structure is defined in include/linux/sched.h in the kernel source.
One of the fields of task_struct is a pointer to an fs_struct, which stores filesystem-related information. fs_struct is defined in include/linux/fs_struct.h.
fs_struct has a field called pwd, which stores information about the current working directory (the filesystem it is on, and the details of the directory itself).
The current directory is maintained by the OS, not by the language or framework. See the description of the GetCurrentDirectory WinAPI function for details.
From the description:
"Multithreaded applications and shared library code should not use the GetCurrentDirectory function and should avoid using relative path names. The current directory state written by the SetCurrentDirectory function is stored as a global variable in each process, therefore multithreaded applications cannot reliably use this value without possible data corruption from other threads that may also be reading or setting this value."
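On POSIX systems, one way to follow the advice about relative path names is to resolve them against an explicit directory file descriptor with openat, so each thread can carry its own "base directory" without touching the process-wide one. A minimal sketch, with a made-up helper name:

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: create name inside dir_path without calling chdir,
   so the process-wide working directory is left alone.
   Returns 0 on success, -1 on failure (including "already exists"). */
int create_in_dir(const char *dir_path, const char *name) {
    int dir_fd = open(dir_path, O_RDONLY | O_DIRECTORY);
    if (dir_fd < 0) return -1;

    /* name is resolved relative to dir_fd, not to the current directory. */
    int fd = openat(dir_fd, name, O_WRONLY | O_CREAT | O_EXCL, 0600);
    int result = (fd >= 0) ? 0 : -1;

    if (fd >= 0) close(fd);
    close(dir_fd);
    return result;
}
```

A thread that keeps its own dir_fd around gets something close to a thread-local working directory for every call that has an *at variant.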

Linking symbols to fixed addresses on Linux

How would one go about linking (some) symbols to specific fixed addresses using GNU ld so that the binary could still be executed as normal in Linux (x86)? There will not be any accesses to those symbols, but their addresses are important.
For example, I'd have the following structure:
struct FooBar {
    Register32 field_1;
    Register32 field_2;
    //...
};
struct FooBar foobar;
I'd like to link foobar to address 0x76543210, but link the standard libraries and the rest of the application normally. The application will then make use of the address of foobar, but will not reference the (possibly non-existent) memory behind it.
The rationale for this request is that the same source can be used on two platforms. On the native platform, Register32 can simply be a volatile uint32_t. On Linux, Register32 is a C++ object with the same size as a uint32_t that defines e.g. operator=, which uses the address of the object and sends a request to a communication framework with that address (and the data) to perform the actual access on remote hardware. The linker would thus ensure that the Register32 fields of the struct refer to the correct "addresses".
The suggestion by litb to use --defsym symbol=address does work, but is a bit cumbersome when you have a few dozen such instances to map. However, --just-symbols=symbolfile does just the trick. It took me a while to find out the syntax of the symbolfile, which is
symbolname1 = address;
symbolname2 = address;
...
The spaces seem to be required, as otherwise ld reports file format not recognized; treating as linker script.
Try it with
--defsym symbol=expression
As with this:
gcc -Wl,--defsym,foobar=0x76543210 file.c
And make foobar in your code an extern declaration:
extern struct FooBar foobar;
This looks promising. However, it's a bad idea to do such a thing (unless you really know what you are doing). Why do you need it?
I'll give you the hot tip... GNU LD can do this (assuming the system libs don't need the address you want). You just need to build your own linker script instead of using the compiler's autogenerated one. Read the man page for ld. Also, building a linker script for a complex piece of software is no easy task when you involve the GLIBC too.

Resources