How to compute the minimal set of capabilities for a process? - linux

What's the best way to compute a minimal set of Linux capabilities for any process?
Suppose you're hardening an operating system, and some of your tools may require CAP_NET_ADMIN and related network privileges while other tools may require CAP_SYS_NICE. There should be a way to tell, for each executable, which capabilities are really required.

Two possible approaches to determine the required capabilities at runtime:
1) Run your program repeatedly under strace without root privileges. Determine which system calls fail with EPERM and add the corresponding capabilities to your program. Repeat until all capabilities are gathered.
2) Use SystemTap, DTrace or Kprobes to log or intercept the capability checks the kernel makes for your program (e.g. use capable from the BCC tools suite, as described here).
Unit tests with good coverage will help a lot, I guess. Also note that the capabilities(7) manual page lists system calls that may require each capability (although the list is not complete).
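The strace approach can be partly automated. Below is a minimal sketch, assuming the trace was saved in strace's default text format (for example with strace -f -o trace.log ./your-program); the function name eperm_syscalls and the sample lines are made up for illustration. It only lists the system calls that failed with EPERM. Mapping each one to a capability still has to be done by hand with capabilities(7).

```python
import re
from collections import Counter

# Matches strace lines such as:
#   setpriority(PRIO_PROCESS, 0, -5) = -1 EPERM (Operation not permitted)
#   1234 socket(AF_PACKET, SOCK_RAW, 768) = -1 EPERM (Operation not permitted)
#   [pid  1234] socket(AF_PACKET, SOCK_RAW, 768) = -1 EPERM (...)
# The optional prefix covers the pid columns that "strace -f" adds.
_EPERM_LINE = re.compile(r"^(?:\[pid\s+\d+\]\s+|\d+\s+)?(\w+)\(.*=\s+-1\s+EPERM\b")


def eperm_syscalls(lines):
    """Count the system calls that failed with EPERM in strace output."""
    counts = Counter()
    for line in lines:
        match = _EPERM_LINE.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Hypothetical sample lines, only to show the call shape.
sample = [
    'openat(AT_FDCWD, "/etc/hosts", O_RDONLY) = 3',
    "setpriority(PRIO_PROCESS, 0, -5) = -1 EPERM (Operation not permitted)",
]
for name, count in eperm_syscalls(sample).most_common():
    print(name, count)
```

A failed setpriority call, for instance, would point at CAP_SYS_NICE. Note that some programs react to EPERM by skipping later privileged calls, which is why the run has to be repeated after each capability is added.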
Update:
The article referenced by @RodrigoBelem mentions the capable_probe module, which is based on Kprobes.
The original article with this module was "POSIX file capabilities: Parceling the power of root", and it's not available now (it was hosted here). But you can find the source code and some docs on the Internet.

Related

Configure Chef to only report issues

I'm trying to configure an automated testing system on Linux that reports any inconsistency, such as incorrect file permissions, failed services, etc., on a custom Linux OS. I can write my own script to do that, but I need a general solution that supports a wide variety of situations and systems.
So, I was wondering if I can configure Chef to only report problems and inconsistencies on Linux, but not fix them?
Kind of. We have a feature called "why-run mode" (chef-client --why-run) which tries to do a dry-run check of what Chef would probably do if executed. Unfortunately, because Chef code is, at heart, arbitrary executable Ruby code, we can't be 100% certain. That said, I would try it out and see if the output is enough for your use.

Developmental testing of programs using Linux's POSIX capabilities

I'm developing a project where the executables use Linux's POSIX capabilities rather than being setuid root. So far I've had to keep one root shell open so that each time I recompile I can redo the setcap command to give the needed capability to the executable file so that I can test the results. That's getting tedious, plus if I ever hope that anyone else would want to contribute to the project's development I'll have to come up with a better way of doing it.
So far I've come up with two ways of dealing with this:
1) Have a single make target to be run as root to create a special setuid program which will be used by the makefiles to give the capability to the executables. The program will be compiled from a template modified via sed so that it will only run if used by the non-root user the developer is working as, and will only modify files owned by the developer (and which are sitting in directories owned by the developer which aren't world writeable).
The problem with this is that I'm using GNU autotools to generate my make files, and I can't figure out how to get the makefiles to run a program on a linked executable after it's been linked. I could create a setcap-all target which has all the executables as its dependencies, with a rule that runs the setuid program on them, but then you can't simply do make executable-1 if that's all you want to build.
2) Have a single make target to be run as root to create a setuid daemon which will use inotify to monitor the src directory and grant the capability to any new executables (and which has security considerations similar to the setuid program from #1).
My problem with this is that I can't figure out how to get the build system to automatically and transparently start up the daemon, plus my intuition that This Is Not The Way Things Are Done in a proper build system.
Are there any better ways of doing this?
Maybe I'm a bit confused about the question, but it seems you're trying to use the build-system to solve an installation problem.
Whether you're packaging your project using dpkg, rpm or anything else, there should be a rule to enforce usage of setcap, which will set the capabilities of the installed binary using the Filesystem Extended Attributes (xattrs).
# Post-install rule example
setcap cap_net_raw=+pe /usr/bin/installed-binary
However, if you're installing a system daemon, you may count on the init script already having all the capabilities, so it's a matter of letting your process drop the capabilities it doesn't need.
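Either way, it helps to check which capabilities a running process actually ended up with. The sketch below assumes a Linux /proc filesystem and decodes the CapEff bitmask from /proc/<pid>/status; the helper names decode_caps and effective_caps are made up for illustration, and the name table follows the bit numbering in linux/capability.h (bits the table does not know about are ignored).

```python
# Capability names indexed by bit number, as in <linux/capability.h>.
CAP_NAMES = [
    "CAP_CHOWN", "CAP_DAC_OVERRIDE", "CAP_DAC_READ_SEARCH", "CAP_FOWNER",
    "CAP_FSETID", "CAP_KILL", "CAP_SETGID", "CAP_SETUID", "CAP_SETPCAP",
    "CAP_LINUX_IMMUTABLE", "CAP_NET_BIND_SERVICE", "CAP_NET_BROADCAST",
    "CAP_NET_ADMIN", "CAP_NET_RAW", "CAP_IPC_LOCK", "CAP_IPC_OWNER",
    "CAP_SYS_MODULE", "CAP_SYS_RAWIO", "CAP_SYS_CHROOT", "CAP_SYS_PTRACE",
    "CAP_SYS_PACCT", "CAP_SYS_ADMIN", "CAP_SYS_BOOT", "CAP_SYS_NICE",
    "CAP_SYS_RESOURCE", "CAP_SYS_TIME", "CAP_SYS_TTY_CONFIG", "CAP_MKNOD",
    "CAP_LEASE", "CAP_AUDIT_WRITE", "CAP_AUDIT_CONTROL", "CAP_SETFCAP",
    "CAP_MAC_OVERRIDE", "CAP_MAC_ADMIN", "CAP_SYSLOG", "CAP_WAKE_ALARM",
    "CAP_BLOCK_SUSPEND", "CAP_AUDIT_READ", "CAP_PERFMON", "CAP_BPF",
    "CAP_CHECKPOINT_RESTORE",
]


def decode_caps(mask):
    """Return the names of the capabilities whose bits are set in mask."""
    return [name for bit, name in enumerate(CAP_NAMES) if mask & (1 << bit)]


def effective_caps(pid="self"):
    """Read the effective capability set of a process from /proc (Linux only)."""
    with open(f"/proc/{pid}/status") as status:
        for line in status:
            if line.startswith("CapEff:"):
                return decode_caps(int(line.split()[1], 16))
    return []
```

Calling effective_caps() inside the process, or effective_caps(pid) from a test script, shows whether the setcap rule took effect and whether the daemon really dropped what it does not need.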

Can I load a library or process with limited permissions?

This is an imaginary example of what I'd like to do. Don't take it too literally.
Let's say my process is being run as www-data and I have a lua script called thedevil.lua. It will try to delete, corrupt and cause as many problems as possible. I'd like to fire up a process (or load a shared object) that has a lua interpreter, and it will try to ruin all my websites, as the user is www-data.
Is there a way I can say let's create this process (or load a library) with LIMITED permissions? Say the script is in /var/www/devilscript/thedevil.lua. I'd like to give it permissions for /tmp/www/devilscript and /var/www/devilscript/. Is that possible? I don't want to create a new user called devilscript, give it limited permissions, and then run the process as that user. I just want to say I am www-data but I only want to give this process/lib a subset of what I can do.
-edit- Could you give me the names of the functions to execute the said .so or binary with lower permissions?
-edit2- Can windows do something like I asked?
Yes. Modern Unix systems offer various sandboxing methods; which ones are available depends on the operating system you are running. Under Linux there are almost too many: SELinux, AppArmor, TOMOYO, and others. FreeBSD has a Mandatory Access Control system as well as the Capsicum capability system. Mac OS X has a sandboxing system as well.
Most such systems allow you to reduce the privilege that a particular process gets in a fairly granular manner. In general, capability systems are easier to work with than Mandatory Access Control (MAC) systems, but they are less frequently available.
A primitive way of doing this sort of privilege restriction in older Unix systems was "chrooting" a process, that is, running it in a restricted part of the file hierarchy using the chroot system call. Unfortunately, that remains the only truly portable form of privilege reduction available in Unix systems -- you thus encounter it in the configuration systems of many system daemons.
SELinux will allow you to create a domain that has restricted access to various file contexts and resources, regardless of the user the process is running as (even root).

How can I sandbox filesystem activity, particularly writes?

Gentoo has a feature in Portage that prevents and logs writes outside of the build and packaging directories.
Checkinstall is able to monitor writes, and package up all the generated files after completion.
Autotools have the DESTDIR variable, which usually lets you direct most of the filesystem activity to an alternate location.
How can I do this myself with the safety of the Gentoo sandboxing method?
Can I use SELinux, rlimit, or some other resource limiting API?
What APIs are available to do this from C, Python?
Update0
The mechanism used will not require root privileges or any involved/persistent system modification. This rules out creating users and using chroot().
Please link to the documentation for APIs that you mention; for some reason they're exceptionally difficult to find.
Update1
This is to prevent accidents. I'm not worried about malicious code, only the poorly written variety.
The way Debian handles this sort of problem is to not run the installation code as root in the first place. Package build scripts are run as a normal user, and install scripts are run using fakeroot. This LD_PRELOAD library redirects permission-checking calls to make it look like the installer is actually running as root, so the resulting file ownership and permissions are right (i.e., if you run /usr/bin/install from within the fakeroot environment, further stats from within the environment show proper root ownership), but in fact the installer is run as an ordinary user.
Builds are also, in some cases (primarily for development), done in chroots using e.g. pbuilder. This is likely easier on a binary distribution, however, as each build using pbuilder reinstalls all dependencies beyond the base system, acting as a test that all necessary dependencies are specified (this is the primary reason for using a chroot, not protection against accidental installs).
One approach is to virtualize a process, similar to how wine does it, and reinterpret file paths. That's rather heavy duty to implement though.
A more elegant approach is to use the chroot() system call which sets a subtree of the filesystem as a process's root directory. Create a virtual subtree, including /bin, /tmp, /usr, /etc as you want the process to see them, call chroot with the virtual tree, then exec the target executable. I can't recall if it is possible to have symbolic links within the tree reference files outside, but I don't think so. But certainly everything needed could be copied into the sandbox, and then when it is done, check for changes against the originals.
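The "check for changes against the originals" step can be sketched in a few lines. This is only an illustration, assuming the tree is small enough to hash in full; the function names snapshot and diff_snapshots are made up, and symbolic links and special files are skipped.

```python
import hashlib
import os


def snapshot(root):
    """Map each regular file under root (by relative path) to its SHA-256 digest."""
    digests = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks and anything that is not a regular file.
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            with open(path, "rb") as handle:
                digest = hashlib.sha256(handle.read()).hexdigest()
            digests[os.path.relpath(path, root)] = digest
    return digests


def diff_snapshots(before, after):
    """Return (added, removed, changed) lists of relative paths."""
    added = sorted(set(after) - set(before))
    removed = sorted(set(before) - set(after))
    changed = sorted(p for p in set(before) & set(after) if before[p] != after[p])
    return added, removed, changed
```

Take one snapshot of the sandbox tree before running the target, another after it exits, and pass both to diff_snapshots to see what the process wrote, deleted or modified.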
Maybe get the sandbox safety with regular user permissions? So the process running the show has specific access to specific directories.
chroot would be an option, but I can't figure out how to track attempts to write outside the root.
Another idea would be along the lines of intercepting system calls. I don't know much about this, but strace is a start: try running a program through it and check if you see something you like.
edit:
Is using kernel modules an option? You could replace the write system call with your own, so you could prevent whatever you needed and also log it.
It sounds a bit like what you are describing is containers. Once you've got the container infrastructure set up, it's pretty cheap to create containers, and they're quite secure.
There are two methods to do this. One is to use LD_PRELOAD to hook the library calls that result in syscalls, such as those in libc, and to reach the original functions through dlsym/dlopen. This will not allow you to directly hook syscalls.
The second method, which allows hooking syscalls, is to run your executable under ptrace, which provides options to stop and examine syscalls when they occur. This can be set up programmatically to sandbox calls to restricted areas of the filesystem, among other things.
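Whichever hooking method is used, the hook still needs a policy check that decides whether a path is inside the allowed area. A minimal sketch of that check alone follows; the function name is_within and the /srv/build paths are hypothetical, and the interception itself (LD_PRELOAD or ptrace) is not shown.

```python
import os


def is_within(path, allowed_roots):
    """Return True if path resolves to a location inside one of allowed_roots."""
    # realpath collapses ".." components and follows symlinks, so a path
    # cannot escape an allowed root by either trick.
    real = os.path.realpath(path)
    for root in allowed_roots:
        real_root = os.path.realpath(root)
        # commonpath compares whole path components, so /srv/build-evil
        # is not mistaken for something under /srv/build.
        if os.path.commonpath([real, real_root]) == real_root:
            return True
    return False


# Hypothetical policy: only /srv/build may be written to.
print(is_within("/srv/build/out/a.o", ["/srv/build"]))
print(is_within("/srv/build/../etc/passwd", ["/srv/build"]))
```

Since the goal stated in the question is to catch accidents rather than malicious code, a check like this is probably enough; it does not guard against a path being swapped between the check and the actual syscall.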
LD_PRELOAD can not intercept syscalls, but only libcalls?
Dynamic linker tricks: Using LD_PRELOAD to cheat, inject features and investigate programs
Write Yourself an Strace in 70 Lines of Code

Running external code in a restricted environment (linux)

For reasons beyond the scope of this post, I want to run external (user submitted) code similar to the computer language benchmark game. Obviously this needs to be done in a restricted environment. Here are my restriction requirements:
Can only read/write to current working directory (will be large tempdir)
No external access (internet, etc)
Anything else I probably don't care about (e.g., processor/memory usage, etc).
I myself have several restrictions. A solution which uses standard *nix functionality (specifically RHEL 5.x) would be preferred, as then I could use our cluster for the backend. It is also difficult to get software installed there, so something in the base distribution would be optimal.
Now, the questions:
Can this even be done with externally compiled binaries? It seems like it could be possible, but also like it could just be hopeless.
What about if we force the code itself to be submitted, and compile it ourselves. Does that make the problem easier or harder?
Should I just give up on home directory protection, and use a VM/rollback? What about blocking external communication (isn't the VM usually talked to over a bridged LAN connection?)
Something I missed?
Possibly useful ideas:
rssh. Doesn't help with compiled code though
Using a VM with rollback after code finishes (can network be configured so there is a local bridge but no WAN bridge?). Doesn't work on cluster.
I would examine and evaluate both a VM and a special SELinux context.
I don't think you'll be able to do what you need with simple file system protection, because you won't be able to prevent access to syscalls which will allow access to the network etc. You can probably use AppArmor to do what you need though. It enforces its policy in the kernel and confines the foreign binary.
