fork and execve to inherit unprivileged parent process' capabilities - linux

On a Linux system, an unprivileged user launches a program. The resulting process has the capabilities CAP_NET_RAW and CAP_NET_ADMIN in its effective, permitted, and inheritable sets.
This process then creates a child by calling fork and execv to run another program, udhcpc, but the child does not inherit CAP_NET_RAW and CAP_NET_ADMIN as expected, even though I called prctl(PR_SET_KEEPCAPS, 1) before setting the capabilities.
Any suggestions on how to inherit an unprivileged parent process' capabilities across fork followed by execve?

On execve(), the file capability sets of the file being executed (in this case, udhcpc) are inspected and combined with the thread's capability sets. In particular, the file's Inheritable set is AND-ed with the thread's Inheritable set to determine the new Permitted set, and the file's Effective bit must be set in order for the new Effective set to be copied from the Permitted set.
This implies that in your case you must use setcap cap_net_raw,cap_net_admin=ei /path/to/udhcpc to obtain the effect you want (in addition to setting the capabilities in the parent process - the prctl() is not necessary).

According to "The Linux Programming Interface" by Michael Kerrisk (No Starch Press, 2010):
Since kernel 2.6.24, it is possible to attach capabilities to a file.
Various other features were added in kernels 2.6.25 and 2.6.26 in
order to complete the capabilities implementation.
The tools sucap and execcap are what you should look up. However, if I recall correctly, they are limited to restricting capabilities, not granting them. Look at:
http://www.linuxjournal.com/article/5737
and
http://lkml.indiana.edu/hypermail/linux/kernel/0503.1/2540.html

Quoting from the manual: there have been some changes. According to it, fork does not change capabilities, and there is now an ambient set, which seems to be meant for what you are trying to do.
Ambient (since Linux 4.3):
This is a set of capabilities that are preserved across an execve(2) of a program that is not privileged. The ambient capability set obeys the invariant that no capability can ever
be ambient if it is not both permitted and inheritable.
The ambient capability set can be directly modified using
prctl(2). Ambient capabilities are automatically lowered if
either of the corresponding permitted or inheritable
capabilities is lowered.
Executing a program that changes UID or GID due to the set-
user-ID or set-group-ID bits or executing a program that has
any file capabilities set will clear the ambient set. Ambient
capabilities are added to the permitted set and assigned to
the effective set when execve(2) is called.
A child created via fork(2) inherits copies of its parent's
capability sets. See below for a discussion of the treatment of
capabilities during execve(2).
…
P'(ambient) = (file is privileged) ? 0 : P(ambient)
P'(permitted) = (P(inheritable) & F(inheritable)) |
(F(permitted) & cap_bset) | P'(ambient)
P'(effective) = F(effective) ? P'(permitted) : P'(ambient)
P'(inheritable) = P(inheritable) [i.e., unchanged]
where:
P denotes the value of a thread capability set before the
execve(2)
P' denotes the value of a thread capability set after the
execve(2)
F denotes a file capability set
cap_bset is the value of the capability bounding set (described
below).
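As a rough illustration of the "directly modified using prctl(2)" part, here is a minimal sketch in Python using ctypes. It assumes the capabilities are already both permitted and inheritable in the caller, as in the question; otherwise the raise fails with EPERM. The helper names (_prctl, ambient_is_set, exec_with_ambient) are made up for this example, and the numeric constants are the values from linux/prctl.h and linux/capability.h.

```python
import ctypes
import os
import sys

# Values from <linux/prctl.h> and <linux/capability.h>.
PR_CAP_AMBIENT = 47
PR_CAP_AMBIENT_IS_SET = 1
PR_CAP_AMBIENT_RAISE = 2
CAP_NET_ADMIN = 12
CAP_NET_RAW = 13


def _prctl(option, arg2, arg3):
    """Call prctl(2) through libc and raise OSError on failure."""
    if not sys.platform.startswith("linux"):
        raise OSError("prctl(2) is Linux-specific")
    libc = ctypes.CDLL(None, use_errno=True)
    ulong = ctypes.c_ulong
    result = libc.prctl(option, ulong(arg2), ulong(arg3), ulong(0), ulong(0))
    if result < 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))
    return result


def ambient_is_set(cap):
    """Return True if `cap` is in the calling thread's ambient set."""
    return _prctl(PR_CAP_AMBIENT, PR_CAP_AMBIENT_IS_SET, cap) == 1


def exec_with_ambient(caps, argv):
    """Raise each capability into the ambient set, then execv argv[0].

    Assumption: every capability in `caps` is already permitted and
    inheritable in the caller; the kernel rejects the raise otherwise.
    """
    for cap in caps:
        _prctl(PR_CAP_AMBIENT, PR_CAP_AMBIENT_RAISE, cap)
    os.execv(argv[0], argv)
```

In the child after fork, something like exec_with_ambient([CAP_NET_RAW, CAP_NET_ADMIN], ["/path/to/udhcpc"]) would then replace the plain execv call, provided udhcpc itself has no file capabilities or set-user-ID/set-group-ID bits (those clear the ambient set).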

It is useful to have a wrapper program that can execute any program with specific capabilities, without having to set capabilities on target programs. Such a wrapper is particularly useful to run software from a build directory (where setcap would be cumbersome) or to run interpreters like Python (where it would be inappropriate).
As explained in other answers, ambient capabilities solve this, but they are only available since kernel 4.3. It is possible to work around this problem by having the wrapper load the target program directly instead of using exec. By that, I mean open the executable, map relevant sections, set up the stack, etc., and jump to its code. This is a pretty complicated task, but luckily the wine-preloader program from the Wine project does exactly that (and some other things that are irrelevant for this purpose).
Run something like this as root to set up the wrapper:
cp /usr/bin/wine-preloader /path/to/wrapper
setcap cap_net_raw+ep /path/to/wrapper # set whatever capabilities you need
Now we have a copy of wine-preloader that is able to run any program with those capabilities:
/path/to/wrapper /path/to/executable arguments...
This works but there are some pitfalls:
The target program must be given as a path to an executable; it is not looked up in PATH.
It does not work if the target program is a script with an interpreter (#!).
The wine-preloader prints a message about not being able to find something (but it still runs the program fine).

Related

Change or hide process name in htop

It seems that htop shows all running processes to every user, and process names in htop contain all the file names that I include in the command line. Since I usually use very long file names that actually contains a lot of detailed information about my project, I do not want such information to be visible to every one (but I am OK that other users see what software that I am running).
How can I hide the details in the process name?
Since kernel 3.3, you can mount procfs with the hidepid option set to 1 or 2.
The kernel documentation file proc.txt describes this option:
The following mount options are supported:
hidepid= Set proc access mode.
hidepid=0 means classic mode - everybody may access all /proc/<pid>/ directories
(default).
hidepid=1 means users may not access any /proc/<pid>/ directories but their own. Sensitive files like cmdline, sched*, status are now protected against other users. This makes it impossible to learn whether any user runs specific program (given the program doesn't reveal itself by its behaviour). As an additional bonus, as /proc/<pid>/cmdline is unaccessible for other users, poorly written programs passing sensitive information via program arguments are now protected against local eavesdroppers.
hidepid=2 means hidepid=1 plus all /proc/<pid>/ will be fully invisible to other users. It doesn't mean that it hides a fact whether a process with a specific pid value exists (it can be learned by other means, e.g. by "kill -0 $PID"), but it hides process' uid and gid, which may be learned by stat()'ing /proc/<pid>/ otherwise. It greatly complicates an intruder's task of gathering information about running processes, whether some daemon runs with elevated privileges, whether other user runs some sensitive program, whether other users run any program at all, etc.

Purpose of issetugid?

According to the man pages for issetugid, the call is supposed to either (1) alert to uid/gid changes; or (2) alert to a possible tainted environment. The function name suggests a third purpose.
First question: what is its purpose?
When I look at the implementations available (for example, on Linux system as a library since Linux kernel does not provide the API), I find the following:
if (getuid() != geteuid()) return 1;
if (getgid() != getegid()) return 1;
return 0;
On Solaris, it looks as follows:
return ((curproc->p_flag & SUGID) != 0);
I'm a bit suspicious, but that's partially because it's difficult to understand what functions like geteuid and getegid return across all platforms - for example, BSD, Linux, Unix and Solaris.
Second question: is the Linux code semantically equivalent to Solaris code?
Third question: are geteuid and getegid implemented the same across platforms? How about for systems that have three ids in play - real, effective, and saved?
Fourth question: are the effective ids the only ids that matter here?
If a process starts as UID = 0 and temporarily drops privileges, then the saved id's come into play. A process that temporarily drops root does not need to exec and should not be tainted.
Fifth question: is a process that temporarily drops root tainted?
Sixth question: should a process whose effective id is the saved id be considered tainted?
Six questions is a bit much to answer in a system designed for one question to answer, especially if no one person knows the answers to all six, but I'll try...
1) The purpose of issetugid() is to let libraries know if they're being used in a program that was run with raised privileges so they can avoid risky behavior such as trusting LD_LIBRARY_PATH, NLSPATH, etc. environment variables that would let the caller load modules that can abuse the raised privileges. You can see some historical discussions on it like this ncurses 4.1 security bug thread.
2) That code appears to be less secure than the BSD & Solaris versions, since it doesn't take into account the saved setid bits.
3) They probably have different implementations on different kernels - look at the platform source code to find out.
4, 5 & 6) No, yes, yes - a process that can change its euid or egid back to higher levels should still not trust environment variables that cause it to load user-provided code to exploit them.
I don't know issetugid(), but I can learn by reading BSD or Solaris manual pages. The function comes from OpenBSD.
1) OpenBSD's manual for issetugid(2) says, "The issetugid() function returns 1 if the process was made setuid or setgid as the result of the last or other previous execve() system calls. Otherwise it returns 0." It then suggests using issetugid() to check whether files named in environment variables are safe to open.
2) No, your Linux and Solaris code are not equivalent. A process running setuid might set its real uid to its effective uid without cleaning its environment variables. For example, uid_t uid = geteuid(); setresuid(uid, uid, uid); would set both real uid and saved uid to effective uid. Then your Linux issetugid() would return 0, but Solaris issetugid() would return 1.
Solaris sets the SUGID process flag at exec time. Illumos, the free fork of Solaris, sets SUGID in src/uts/common/os/exec.c when executing a file. OpenBSD has similar logic. OpenBSD's manual says,
If a child process executes a new executable file, a new issetugid status will be determined. This status is based on the existing process's uid, euid, gid, and egid permissions and on the modes of the executable file. If the new executable file modes are setuid or setgid, or if the existing process is executing the new image with uid != euid or gid != egid, the new process will be considered issetugid.
Solaris and OpenBSD compare the ids at exec time. Your Linux code delays the comparison until the call to issetugid(), so it is not equivalent.
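To make the difference concrete, here is a toy model in Python (user ids only, group ids omitted for brevity). It is not any real kernel's code: Proc, model_execve, model_setresuid and the two issetugid variants are hypothetical names, and the exec rule simply follows the OpenBSD manual text quoted above.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Proc:
    ruid: int
    euid: int
    suid: int
    sugid: bool = False  # flag recorded at exec time (Solaris/OpenBSD style)


def model_execve(proc, file_owner, setuid_bit):
    """Exec a file: the flag is decided here, from the file mode and old ids."""
    tainted = setuid_bit or proc.ruid != proc.euid
    euid = file_owner if setuid_bit else proc.euid
    return Proc(ruid=proc.ruid, euid=euid, suid=euid, sugid=tainted)


def model_setresuid(proc, ruid, euid, suid):
    """Change the ids; the exec-time flag is left untouched."""
    return replace(proc, ruid=ruid, euid=euid, suid=suid)


def naive_issetugid(proc):
    """The library emulation from the question: compare ids at call time."""
    return proc.ruid != proc.euid


def flag_issetugid(proc):
    """The Solaris/OpenBSD behaviour: report the flag set at exec time."""
    return proc.sugid


# User 1000 runs a setuid-root program: both checks report "tainted".
p = model_execve(Proc(1000, 1000, 1000), file_owner=0, setuid_bit=True)
# The program then does setresuid(euid, euid, euid), as in the example above.
q = model_setresuid(p, p.euid, p.euid, p.euid)
print(naive_issetugid(q), flag_issetugid(q))
```

After the setresuid step the call-time comparison no longer sees a difference between the ids, while the exec-time flag still reports that the environment came from a less privileged caller.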
3) The geteuid() and getegid() functions seem to do the same thing everywhere; they simply return the effective user id and the effective group id.
4) The saved ids don't matter. The process might have changed those ids without cleaning its environment variables. None of the real, effective, or saved ids tell us who set the environment variables for the current process.
5) At least on OpenBSD and Solaris, a process that temporarily drops root does not become tainted. OpenBSD's manual page says,
The issetugid() system call's result is unaffected by calls to setuid(), setgid(), or other such calls. In case of a fork(), the child process inherits the same status.
The status of issetugid() is only affected by execve().
When a process temporarily drops root with setuid() or seteuid(), it does not execute a file, so its issetugid() value does not change.
But FreeBSD, DragonFly BSD, and NetBSD define issetugid() more strictly. FreeBSD's manual for issetugid(2) says,
A process is tainted if it was created as a result of an execve(2) system call which had either of the setuid or setgid bits set (and extra privileges were given as a result) or if it has changed any of its real, effective or saved user or group ID's since it began execution.
With these systems, a process dropping root does force its issetugid() value to 1.
6) No, an effective id equal to a saved id does not taint a process. If it did, then every process would be tainted, because every process has its saved id set to its effective id at exec time.

How to execve a process, retaining capabilities in spite of missing filesystem-based capabilities?

I want to make a system usable without setuid, file "+p" capabilities, and in general without anything that is disabled when I set PR_SET_NO_NEW_PRIVS.
With this approach (init sets PR_SET_NO_NEW_PRIVS and filesystem-based capability elevation is no longer possible) you cannot "refill" your capabilities and only need to be careful not to "splatter" them.
How to execve some other process without "splattering" any granted capabilities (such as if the new program's file is setcap =ei)? Just "I trust this new process as I trust myself". For example, a capability is given to a user (and the user wants to exercise it in any program he starts)...
Can I make the entire filesystem permanently =ei? I want to keep the filesystem just not interfering with the scheme, not capable of granting or revoking capabilities; controlling everything through parent->child things.
I am not saying that I recommend this for what you are doing, but here it is.
Quoting from the manual: there have been some changes. According to it, fork does not change capabilities, and an ambient set was added in Linux kernel 4.3, which seems to be meant for what you are trying to do.
Ambient (since Linux 4.3):
This is a set of capabilities that are preserved across an execve(2) of a program that is not privileged. The ambient capability set obeys the invariant that no capability can ever
be ambient if it is not both permitted and inheritable.
The ambient capability set can be directly modified using
prctl(2). Ambient capabilities are automatically lowered if
either of the corresponding permitted or inheritable
capabilities is lowered.
Executing a program that changes UID or GID due to the set-
user-ID or set-group-ID bits or executing a program that has
any file capabilities set will clear the ambient set. Ambient
capabilities are added to the permitted set and assigned to
the effective set when execve(2) is called.
A child created via fork(2) inherits copies of its parent's
capability sets. See below for a discussion of the treatment of
capabilities during execve(2).
Transformation of capabilities during execve()
During an execve(2), the kernel calculates the new capabilities of
the process using the following algorithm:
P'(ambient) = (file is privileged) ? 0 : P(ambient)
P'(permitted) = (P(inheritable) & F(inheritable)) |
(F(permitted) & cap_bset) | P'(ambient)
P'(effective) = F(effective) ? P'(permitted) : P'(ambient)
P'(inheritable) = P(inheritable) [i.e., unchanged]
where:
P denotes the value of a thread capability set before the
execve(2)
P' denotes the value of a thread capability set after the
execve(2)
F denotes a file capability set
cap_bset is the value of the capability bounding set (described
below).
A privileged file is one that has capabilities or has the set-user-ID
or set-group-ID bit set.
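The quoted algorithm can be tried out with a few lines of Python. This is only a sketch of the formulas as quoted, with capability sets represented as integer bitmasks; the function name execve_caps and its parameter names are made up for illustration.

```python
CAP_NET_ADMIN = 1 << 12
CAP_NET_RAW = 1 << 13


def execve_caps(p_inh, p_amb, f_perm, f_inh, f_eff, bset, privileged):
    """Return (permitted, effective, inheritable, ambient) after execve().

    p_* are the thread's sets before the exec, f_perm/f_inh are the file's
    sets, f_eff is the file's effective bit, bset is the bounding set, and
    `privileged` is True for a file with capabilities or set-user-ID /
    set-group-ID bits.
    """
    new_amb = 0 if privileged else p_amb
    new_perm = (p_inh & f_inh) | (f_perm & bset) | new_amb
    new_eff = new_perm if f_eff else new_amb
    return new_perm, new_eff, p_inh, new_amb


wanted = CAP_NET_ADMIN | CAP_NET_RAW
full_bset = (1 << 64) - 1

# A plain file (no file capabilities) executed without an ambient set:
# everything is lost except the inheritable set.
lost = execve_caps(wanted, 0, 0, 0, False, full_bset, privileged=False)

# The same file executed with the capabilities raised into the ambient set:
# permitted and effective both keep them.
kept = execve_caps(wanted, wanted, 0, 0, False, full_bset, privileged=False)
```

Here lost is (0, 0, wanted, 0) and kept is (wanted, wanted, wanted, wanted), which is the "I trust this new process as I trust myself" behaviour the question asks for, without touching any file on the filesystem.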
There is currently no simple way to do that, if you refer to the capabilities' man page:
During an execve(2), the kernel calculates the new capabilities of the process
using the following algorithm:
P'(permitted) = (P(inheritable) & F(inheritable)) | (F(permitted) & cap_bset)
P'(effective) = F(effective) ? P'(permitted) : 0
P'(inheritable) = P(inheritable) [i.e., unchanged]
where:
P denotes the value of a thread capability set before the execve(2)
P' denotes the value of a capability set after the execve(2)
F denotes a file capability set
cap_bset is the value of the capability bounding set
If the file you want to execute doesn't have its fP bit set, or if its fI bits aren't set, your process will have no permitted and therefore no effective capabilities.
Setting the whole file system permitted and inheritance bits would be technically possible but that would not make much sense since it would strongly reduce the security on the system, (edit: and as you mentioned that won't work for new executables).
You can indeed give some capabilities to a user with pam_cap, but you can't let them execute any file they just compiled using that. Capabilities are by design made to give power to programs and not to users, you can read in Hallyn's paper:
A key insight is the observation that programs, not people, exercise
privilege. That is, everything done in a computer is via
agents—programs—and only if these programs know what to do with
privilege can they be trusted to wield it.
See also the POSIX draft 1003.1e, which defines POSIX capabilities, page 310:
It is also not appropriate to establish for a process chain (a
sequence of programs within a single process) a set of capabilities
that remains fixed and active throughout the life of that chain. [...]
This is an application of the principle of least privilege, and it
applies equally to users and to processes.
Someone recently (December 2012) asked on the Linux kernel mailing list for what you want to do to be introduced as a feature, and some very interesting answers were given. Some people argue that dropping file capabilities from the inheritance rules across exec would introduce security problems and that capabilities are not designed for such a feature, although no explanation is given of which security issue it would introduce.
The only way to do that currently is to modify the way capabilities are inherited in the Linux kernel (2 files to modify; I tested it successfully on a 3.7 kernel), but as mentioned above it is not clear whether that is secure.
On old kernels (before 2.6.33) there was an option to compile without file capabilities (CONFIG_SECURITY_FILE_CAPABILITIES), but I doubt working with such an old kernel is an option for you.
I think (my understanding), that the best way to use capabilities is:
For programs that need capabilities and are trusted, including trusted not to leak capabilities: e.g. the packet-sniffing part of Wireshark, or a web server that needs to listen on port 80.
new programs, capabilities aware: set permitted.
legacy programs, not capabilities aware: set permitted and effective
For programs that will leak capabilities, and have code that could (sometimes) use a capability: set inherited
e.g. for chmod set inherit CAP_FOWNER, if user needs super powers (those normally held by root), then they need to use setpriv (or equivalent, this could be rolled into sudo), else it works in unprivileged mode.
When a process needs to fork and share some capabilities, then and only then use ambient. Probably same executable; if it was a different one, then this new one would have permitted or inherited set on the file. [Edit: I have just realised that you do not need ambient if you do not exec. If I think of a use-case for ambient, in a well set up system, then I will add it here. Ambient can be used as a transitional mechanism, when inherited is not set on files that could use it.]
Uses of ambient:
On a system where files do not have the correct capabilities. ( a transitional technique).
For shell scripts, that can not have capabilities (as they can not have setuid), except on systems that have fixed and then allow setuid on scripts.
Add more here.

How do I ensure my Linux program doesn't produce core dumps?

I've got a program that keeps security-sensitive information (such as private keys) in memory, since it uses them over the lifetime of the program. Production versions of this program set RLIMIT_CORE to 0 to try to ensure that a core dump that might contain this sensitive information is never produced.
However, while this isn't mentioned in the core(5) manpage, the apport documentation on the Ubuntu wiki claims,
Note that even if ulimit is set to disabled core files (by specifying a
core file size of zero using ulimit -c 0), apport will still capture
the crash.
Is there a way within my process (i.e., without relying on configuration of the system external to it) that I can ensure that a core dump of my process is never generated?
Note: I'm aware that there are plenty of methods (such as those mentioned in the comments below) where a user with root or process owner privileges could still access the sensitive data. What I'm aiming at here is preventing unintentional exposure of the sensitive data through it being saved to disk, being sent to the Ubuntu bug tracking system, or things like that. (Thanks to Basile Starynkevitch for making this explicit.)
According to the POSIX spec, core dumps only happen in response to signals whose action is the default action and whose default action is to "terminate the process abnormally with additional actions".
So, if you scroll down to the list in the description of signal.h, everything with an "A" in the "Default Action" column is a signal you need to worry about. Use sigaction to catch all of them and just call exit (or _exit) in the signal handler.
I believe these are the only ways POSIX lets you generate a core dump. Conceivably, Linux might have other "back doors" for this purpose; unfortunately, I am not enough of a kernel expert to be sure...

Is it possible to configure Linux capabilities per user? [closed]

There appears to be support for fine-grained capabilities in Linux kernel, which allows granting privileges to a process to do things like, for example, opening raw sockets or raising thread priority without granting the process root privileges.
However, what I'd like to know is whether there is a way to grant per-user capabilities, that is, to allow non-root and non-suid processes to acquire those capabilities.
It can sort of be done with libcap - it provides a PAM module pam_cap.so.
However it's not quite that simple :)
Each process has three capability sets:
Effective (the caps that this process actually has)
Permitted (the caps that this process can possibly have - a superset of Effective)
Inheritable (the caps that this process can pass to a child process)
Each file has the same capability sets. When a new binary is exec()'d, the capabilities of the process change according to the following rules, where:
pI/pP are the process's initial Inheritable/Permitted capabilities
pI'/pP'/pE' are the process's new Inheritable/Permitted/Effective capabilities
fI/fP/fE are the file's Inheritable/Permitted/Effective capabilities
& represents intersection
| represents union
pI' = pI
pP' = fP | (pI & fI)
pE' = fE & pP'
(simplified from http://www.friedhoff.org/posixfilecaps.html)
In most scenarios, pE' is the only result we care about. Programs that are linked against libcap can call cap_set_proc() to change their Effective caps (as long as the caps they try to request are in the Permitted set), but the vast majority of programs don't explicitly touch their caps so we have to arrange for the cap to be effective post-exec().
Having a concrete example will help understanding here... I got fed up with having to 'su' to run openvpn, so I wanted to grant myself the CAP_NET_ADMIN capability to allow the setting of routes and such.
Looking at the last rule (pE' = fE & pP') it's clear that to have CAP_NET_ADMIN in the process's Effective set, CAP_NET_ADMIN must be in the file's Effective set. So, the capabilities system doesn't allow us to simply say "grant CAP_NET_ADMIN to user sqweek" - the program's capabilities are always important.
Being in the file's Effective set isn't enough though; the cap also needs to be in the process's new Permitted set. Let's look at that rule: pP' = fP | (pI & fI). So there are two ways we can get the cap into pP': either we add CAP_NET_ADMIN to the file's Permitted set, or we add it to the file's Inheritable set and make sure it is in the process's Inheritable set.
If we add it to the file's Permitted set, then the process's initial capabilities become irrelevant - openvpn will get CAP_NET_ADMIN every time it runs, regardless of who runs it. This is similar to setuid, but provides a more fine-grained approach. Still, it is not a per-user granularity, so let's look at the other option.
Note the first rule, pI' = pI. The process's Inheritable capabilities are unaffected by exec(). What this means is, all we need is a single libcap aware program to set CAP_NET_ADMIN as an Inheritable cap, and every process spawned from there will also have CAP_NET_ADMIN Inheritable. This is the role the pam module plays - it modifies the Inheritable set during login, which is then inherited for all of that user's processes.
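The per-user effect of these three rules can be checked with a small Python sketch that treats each capability set as a set of names. The function exec_caps is a made-up name, and the two "users" are simply two different starting Inheritable sets, as pam_cap would arrange at login.

```python
def exec_caps(p_inh, f_perm, f_inh, f_eff):
    """Apply the simplified rules; return (pI', pP', pE') as sets of names."""
    new_inh = set(p_inh)                                # pI' = pI
    new_perm = set(f_perm) | (set(p_inh) & set(f_inh))  # pP' = fP | (pI & fI)
    new_eff = set(f_eff) & new_perm                     # pE' = fE & pP'
    return new_inh, new_perm, new_eff


# openvpn with cap_net_admin in its Inheritable and Effective sets only.
openvpn = {"f_perm": set(), "f_inh": {"cap_net_admin"}, "f_eff": {"cap_net_admin"}}

# sqweek got cap_net_admin in pI at login; another user did not.
sqweek = exec_caps({"cap_net_admin"}, **openvpn)
other = exec_caps(set(), **openvpn)
```

With the file's Permitted set left empty, sqweek ends up with cap_net_admin in pE' while the other user's pE' is empty, which is the per-user granularity described above.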
To summarise:
Install libcap
Configure the pam_cap module (add the line cap_net_admin sqweek to /etc/security/capability.conf. If the file did not previously exist, add another line none * for a sensible default).
Enable the PAM module during login (add auth required pam_cap.so to /etc/pam.d/login). Make sure to test your login in a separate terminal BEFORE logging out when making PAM changes so you don't lock yourself out!
Add CAP_NET_ADMIN to the Effective and Inheritable sets for openvpn (setcap cap_net_admin+ie /usr/sbin/openvpn)
openvpn calls ip to change the routing table and such, so that needs the same treatment (setcap cap_net_admin+ie /sbin/ip)
Note that /etc/pam.d/login only governs local logins - you might want to give eg. /etc/pam.d/sshd similar treatment. Also, any capabilities you add via setcap will be blown away when your package manager installs a new version of the target binary so you'll have to re-add them.
Yes, you can use setcap to specify a capability set for an executable, which can grant specific capabilities when that executable is run.
From the capabilities(7) man page:
File Capabilities
Since kernel 2.6.24,
the kernel supports associating
capability sets with an executable
file using setcap(8). The file
capability sets are stored in an
extended attribute (see setxattr(2))
named security.capability. Writing to
this extended attribute requires the
CAP_SETFCAP capability. The file
capability sets, in conjunction with
the capability sets of the thread,
determine the capabilities of a thread
after an execve(2).
The way to grant capabilities per-user (or even per-group) would be with a PAM module. sqweek's answer shows how to do this using pam_cap.
I've not confirmed, but I think that this aspect of SELinux may be your answer:
http://www.lurking-grue.org/writingselinuxpolicyHOWTO.html#userpol5.1
Have a look at CapOver - it should do what you want.
Note: I haven't used this as it's not (yet?) been ported to the 2.6.30ish kernel API.
