Process-local override of name resolution? (Linux)

I have test code in which I want a couple of hostnames to resolve to the loopback address while testing. When deployed, this code will use the normal system name resolution as appropriate. The test and deployment hosts are recent Linux distros (SLES11 SP1, e.g.).
I'd like to override hostname resolution for a single process, without being superuser. Is there a way to manipulate the nsswitch/hosts behavior in such a narrow fashion?
Yes, of course I could override the hostnames themselves, but I prefer not to (unless this feature really isn't available).
EDIT:
glibc's HOSTALIASES feature sounds like exactly what I want -- but its availability/effectiveness seems inconsistent among the hosts I surveyed. At some point, it was added to a list of insecure environment variables. But does that mean it's ignored globally or only in suid binaries? Will it still work for programs which do getnameinfo()?
More edit:
IMO, HOSTALIASES wins hands down. Disabling nscd is a workaround for platforms which don't respect it -- like mine (SuSE). And maybe they will release a fix.
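A quick way to check whether a given box honours HOSTALIASES for the newer resolver calls is a small probe along these lines (a sketch; the alias file path and the name "testhost" are placeholders, and note that HOSTALIASES maps a name to another name, not to an address):

/* Probe: does this host honour HOSTALIASES for getaddrinfo()?
 * Assumes a file /tmp/host_aliases containing a line such as
 *   testhost localhost
 * Exporting HOSTALIASES in the shell before starting the process is the
 * usual approach; the setenv() here is a convenience and may be read too
 * late by some glibc versions. nscd can also get in the way, as noted above. */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <netdb.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

int main(void)
{
    setenv("HOSTALIASES", "/tmp/host_aliases", 1);

    struct addrinfo hints, *res;
    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_INET;

    int err = getaddrinfo("testhost", NULL, &hints, &res);
    if (err != 0) {
        fprintf(stderr, "getaddrinfo: %s\n", gai_strerror(err));
        return 1;
    }
    char buf[INET_ADDRSTRLEN];
    struct sockaddr_in *sin = (struct sockaddr_in *) res->ai_addr;
    printf("testhost -> %s\n", inet_ntop(AF_INET, &sin->sin_addr, buf, sizeof buf));
    freeaddrinfo(res);
    return 0;
}

If it prints the address of whatever the alias points at, HOSTALIASES is being honoured for getaddrinfo() on that box.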

LD_LIBRARY_PATH for the win!
http://tldp.org/HOWTO/Program-Library-HOWTO/shared-libraries.html
Also:
What is the LD_PRELOAD trick?
Also:
http://www.linuxjournal.com/article/7795

Assuming you want to intercept e.g. gethostbyname(), and have it return 127.0.0.1 for certain hostnames ...
If your code is C++, the simplest answer might be to use gMock.
If you can't, you may want to interpose gethostbyname. A sample interposer is documented here.
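For illustration (this is not the interposer from the link above), a minimal gethostbyname() interposer might look like the sketch below; "test.example.com" is a placeholder, only IPv4 is handled, and the static buffers are not thread-safe:

/* fake_dns.c - LD_PRELOAD interposer sketch.
 * Build:  gcc -shared -fPIC -o fake_dns.so fake_dns.c -ldl
 * Run:    LD_PRELOAD=./fake_dns.so ./your_program
 */
#define _GNU_SOURCE
#include <dlfcn.h>
#include <string.h>
#include <netdb.h>
#include <netinet/in.h>
#include <arpa/inet.h>

struct hostent *gethostbyname(const char *name)
{
    static struct hostent he;
    static struct in_addr loopback;
    static char *addr_list[2];
    static char *aliases[] = { NULL };

    if (strcmp(name, "test.example.com") == 0) {
        /* Hand back 127.0.0.1 for the hostname under test. */
        loopback.s_addr = htonl(INADDR_LOOPBACK);
        addr_list[0] = (char *) &loopback;
        addr_list[1] = NULL;
        he.h_name = (char *) "test.example.com";
        he.h_aliases = aliases;
        he.h_addrtype = AF_INET;
        he.h_length = sizeof(struct in_addr);
        he.h_addr_list = addr_list;
        return &he;
    }

    /* Everything else goes to the real resolver. */
    struct hostent *(*real)(const char *) =
        (struct hostent *(*)(const char *)) dlsym(RTLD_NEXT, "gethostbyname");
    return real(name);
}

Code that calls getaddrinfo() instead would need a similar wrapper around that function.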

Brian, another option would be to use chroot. You could create a directory with a bunch of mount --rbind mounts, one for each of the directories usr, lib, home, etc. - enough to simulate a working root directory. Then use mount -t aufs to "layer" the existing /etc together with a writable empty layer. In essence, after all that, whatever you change in /etc ends up changing only inside that chroot environment. You could override and simulate all kinds of environments that way.
If this is of any interest and you need me to elaborate further, let me know.

Are there any security issues in using environment variables like this?

I'm writing a collection of utilities in bash that has a common library. Each script I write has to have a block of code that determines the path of the library relative to the executable. Not the actual code, but an example:
#!/bin/bash
DIR="$( cd -P "$( dirname "${BASH_SOURCE[0]}" )" && pwd )"
. "$DIR/../lib/utilities/functions"
Instead of a preamble that tries to search for the library, I had the bright idea to use an environment variable to indicate the libraries location.
#!/bin/bash
. "$TOOLS_LIBRARY_PATH"
I could use a wrapper program to set that environment variable, or I could set it in my own path. There may be better ways to organize a bash toolset, but the question is:
Can I trust my environment variables?
This is one of those, I've-never-really-thought-about-it kind of questions. When programming in other languages, paths are used to find libraries, (e.g. LD_LIBRARY_PATH, PYTHONPATH, PERLLIB, RUBYLIB, CLASSPATH, NODE_PATH), but I've never had to stop and think about how that could be insecure.
In fact, LD_LIBRARY_PATH has a whole article, Why LD_LIBRARY_PATH is bad, discouraging its use. The Ruby and Perl library path environment variables are ignored if their security mechanisms are invoked, $SAFE and -T (taint mode) respectively.
My thoughts so far...
The user could set TOOLS_LIBRARY_PATH to a library of their choosing, but the utility will run under their uid. They could simply run their malicious library directly with bash.
My tools sudo some things. Someone could set their TOOLS_LIBRARY_PATH to something that takes advantage of this. But the tools are not run via sudo; they only invoke sudo here and there. The user would have to be a sudoer in any case, so they could just call sudo directly.
If I can't trust TOOLS_LIBRARY_PATH, then I can't trust PATH. All program invocations must use absolute paths.
I've seen shell programs use variables holding absolute paths for programs, so that instead of calling ls they call something like $LS with LS=/bin/ls. From what I've read, this is to protect against users redefining programs as aliases. See: PATH, functions and security. Bash scripting best practices.
Perl's taint mode treats all environment variables as "tainted", which is foreboding, and is why I'm trying to reason about the risks of relying on the environment.
It is not possible for one user to change another user's environment, unless that user is root. Thus, I'm only concerned about a user changing their own environment to escalate privileges. See: Is there a way to change another process's environment variables?
I've rubber-ducked this into an answer of sorts, but I'm still going to post it, since it isn't a pat answer.
Update: What are the security issues surrounding the use of environment variables to specify paths to libraries and executables?
While mechanisms exist in various programs to prevent the modification of environment variables, the bottom line is: no, you can't trust environment variables. The security concern is very basic:
Anytime a user can change what is expected to be executed, the possibility for a vulnerability to be exploited arises.
Case in point, have a look at CVE-2010-3847. With that, an unprivileged attacker with write access to a file system containing setuid or setgid binaries could use this flaw to escalate their privileges. It involves an environment variable being modified by the user.
CVE-2011-1095 is another example, and doesn't involve SUID binaries. Do a google search for "glibc cve environment" to see the kinds of stuff that people were able to do with environment variable modifications. Pretty crafty.
Your concerns really boil down to your statement of:
The user could set TOOLS_PATH_LIBRARY to a library of their choosing, but the utility will run under their uid. They could simply run their malicious library directly with bash.
Key phrase here - run their malicious library. This assumes their library is owned by their UID as well.
This is where a security framework will do you lots of good. One that I have written that focuses exclusively on this problem you can find here:
https://github.com/cormander/tpe-lkm
The module stops execve/mmap/mprotect calls on files that are writable, or not owned by a trusted user (root). As long as they're not able to put malicious code into a file/directory owned by the trusted user, they can't exploit the system in this way.
If you're using SUID binaries or sudo in ways that pass those variables through, you might want to consider enabling the "paranoid" and "strict" options to stop even root from trusting non-root-owned binaries.
I should mention that this Trusted Path Execution method protects direct execution of binaries and shared libraries. It does little (if anything) against interpreted languages, since the interpreter reads and parses the script rather than the kernel executing the file directly. So you still need some degree of care with PYTHONPATH, PERLLIB, CLASSPATH, etc., and should use the languages' safety mechanisms that you mentioned.
Short answer:
Assuming the user is able to run programs and code of her own choice anyway, you do not have to trust anything they feed you, including the environment. If the account is limited in some ways (no shell access, no write access to file systems that allow execution), that may change the picture, but as long as your scripts are only doing things the user could do herself, why protect against malicious interference?
Longer answer:
There are at least two separate issues to consider in terms of security problems:
What and whom do we need to guard against?
(closely related question: What can our program actually do and what could it potentially break?)
How do we do that?
If and as long as your program is running under the user ID of the user starting the program and providing all the input (i.e., the only one who could mount any attack), there are only rare situations where it makes sense at all to harden the program against attacks. Anti-copy protection and shared high-score lists come to mind, but that group of things is not just concerned with inputs; it is probably even more concerned with stopping the user from reading code and memory. Hiding the code of a shell script without some kind of suid/sgid trickery is not something I'd know how to do; the best I could think of would be obscuring the code.
In this situation, anything the user could trick the program into doing, they could do without the help of the tool, too, so trying to “protect” against attacks by this user is moot. Your description does not read as if you'd actually need any protection against attacks.
Assuming you do need protection, you simply cannot rely on environment variables – and if you fail to reset things like LD_LIBRARY_PATH and LD_PRELOAD, even calling tools with absolute paths like /bin/id or /bin/ls won't give you a reliable answer, unless that tool happens to be statically compiled. This is why sudo has env_reset enabled by default and why running suid programs has to ignore certain environment variables. Note that this means your point that TOOLS_LIBRARY_PATH and PATH are equally trustworthy may be true in your situation, but is not necessarily true in other situations or border cases: a sysadmin may reset PATH for sudo usage, but let non-standard environment variables pass through.
As pointed out above, argv[0] (or its bash equivalent ${BASH_SOURCE[0]}) is no more reliable than environment variables. Not only can the user simply make a copy or symlink of your original file; execve or bash's exec -a foo bar allows putting anything into argv[0].
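To make the argv[0] point concrete, here is a trivial sketch of a caller putting an arbitrary name there:

/* argv[0] is entirely under the caller's control, so $0 (or ${BASH_SOURCE[0]})
 * cannot be trusted any more than an environment variable can. */
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    /* Runs /bin/ls, but the child sees "totally-not-ls" as argv[0]. */
    execl("/bin/ls", "totally-not-ls", "-l", (char *) NULL);
    perror("execl");   /* only reached if the exec fails */
    return 1;
}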
You can never trust someone else's environment.
Why not simply create a new user that owns all of this important code? Then you can either get the information directly from /etc/passwd or use the ~foo syntax to find that user's home directory.
# One way to get the home directory of util_user
DIR=$(/usr/bin/awk -F: '$1 == "util_user" {print $6}' /etc/passwd)
# Another way, which works in BASH and Kornshell
[ -d ~util_user ] && DIR=~util_user
# Make sure DIR is set!
if [ -z "$DIR" ]
then
echo "Something's wrong!"
exit 1
fi

Suppressing system calls when using gcc/g++

I have a portal in my university LAN where people can upload code to programming puzzles in C/C++. I would like to make the portal secure so that people cannot make system calls via their submitted code. There might be several workarounds but I'd like to know if I could do it simply by setting some clever gcc flags. libc by default seems to include <unistd.h>, which appears to be the basic file where system calls are declared. Is there a way I could tell gcc/g++ to 'ignore' this file at compile time so that none of the functions declared in unistd.h can be accessed?
Some particular reason why chroot("/var/jail/empty"); setuid(65534); isn't good enough (assuming 65534 has sensible limits)?
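For illustration, a rough sketch of that suggestion (everything here is hypothetical: the launcher must run as root, /var/jail is assumed to contain only a statically linked copy of the submission at /var/jail/submission, and UID/GID 65534 is assumed to be nobody):

/* jailrun.c - run a submitted program inside a chroot jail as "nobody".
 * Error handling and resource limits are deliberately minimal. */
#include <stdio.h>
#include <unistd.h>
#include <sys/resource.h>

int main(void)
{
    struct rlimit cpu = { 2, 2 };                   /* 2 seconds of CPU time   */
    struct rlimit mem = { 64u << 20, 64u << 20 };   /* 64 MiB of address space */
    setrlimit(RLIMIT_CPU, &cpu);
    setrlimit(RLIMIT_AS, &mem);

    if (chroot("/var/jail") != 0 || chdir("/") != 0) {
        perror("chroot");
        return 1;
    }
    if (setgid(65534) != 0 || setuid(65534) != 0) { /* drop privileges */
        perror("setuid");
        return 1;
    }

    execl("/submission", "submission", (char *) NULL);
    perror("execl");
    return 1;
}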
Restricting access to the header file won't prevent you from accessing libc functions: they're still available if you link against libc - you just won't have the prototypes (and macros) to hand; but you can replicate them yourself.
And not linking against libc won't help either: system calls could be made directly via inline assembler (or even tricks involving jumping into data).
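To make that concrete, here is a sketch that reaches write(2) on x86-64 Linux with no <unistd.h> and no libc prototype at all:

/* Direct syscall via inline assembler; x86-64 Linux only (syscall 1 = write). */
long raw_write(int fd, const void *buf, unsigned long len)
{
    long ret;
    __asm__ volatile ("syscall"
                      : "=a" (ret)
                      : "0" (1L), "D" ((long) fd), "S" (buf), "d" (len)
                      : "rcx", "r11", "memory");
    return ret;
}

int main(void)
{
    static const char msg[] = "hello from a raw syscall\n";
    raw_write(1, msg, sizeof msg - 1);
    return 0;
}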
I don't think this is a good approach in general. Running the uploaded code in a completely self-contained virtual sandbox (via QEMU or something like that, perhaps) would probably be a better way to go.
-D can be used to knock out individual function names. For example:
gcc file.c -Dchown -Dchdir
Or you can set the include guard yourself:
gcc file.c -D_UNISTD_H
However, their effects can easily be reverted with #undef by clever submitters :)

Recommended FHS compliant application test/install workflow under Linux?

I'm in the process of switching to Linux for development, and I'm puzzled about how to maintain a good FHS compliancy in my programs.
For example, under Windows, I know that all the resources (bitmaps, audio data, etc.) that my program will need can be found with relative paths from the executable, so it's the same whether I'm running the program from my development directory or from an installation (under "Program Files", for example): the program will be able to locate all its files.
Now, under Linux, I see that usually the executable goes under /usr/local/bin and its resources under /usr/local/share. (And the truth is that I'm not even sure of this.)
For convenience reasons (such as version control) I'd like to have all the files pertaining to the project under a same path, say, for example, project/src for the source and project/data for resource files.
Is there any standard or recommended way to let me just rebuild the binary for testing and use the files on the project/data directory, while also being able to locate the files when they are under /usr/local/share?
I thought, for example, of setting a symlink under /usr/local/share pointing to my resources dir, and then just hardcoding that path inside my program, but I feel it's quite hackish and not very portable.
Also, I thought of running an install script that copies all the resources to /usr/local/share every time I change or add resources, but I also feel it's not a good way to do it.
Could anyone tell me or point me to where it tells how this issue is usually resolved?
Thanks!
For convenience reasons (such as version control) I'd like to have all the files pertaining to the project under a same path, say, for example, project/src for the source and project/data for resource files.
You can organize your source tree as you wish — it need not bear any resemblance to the FHS layout desired of installed software.
I see that usually the executable goes under /usr/local/bin and its resources on /usr/local/share. (And the truth is that I'm not even sure of this)
The standard prefix is /usr. /usr/local is for, well, "local installations" as the FHS spec reiterates.
Is there any standard or recommended way to let me just rebuild the binary for testing and use the files on the project/data directory
Definitely. Running ./configure --datadir=$PWD/share, for example, is the way to point your build at the data files from the source tree (substitute the proper path), and using something like -DDATADIR="'${datadir}'" in AM_CFLAGS makes the value known to the (presumably C) code. (All of that, provided you are using autoconf/automake. Similar options may be available in other build systems.)
This sort of hardcoding is what is used in practice, and it suffices. For a development build within your own working copy, having a hardcoded path should not be a problem, and final builds (those done by a packager) will simply use the standard FHS paths.
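As a sketch of how the compiled-in value is then used in the code (the macro name, fallback path and file name below are placeholders):

/* Locating a resource via a compile-time data directory. The macro is
 * normally injected by the build, e.g.
 *   -DDATADIR='"/usr/local/share/myapp"'    for an installed build
 *   -DDATADIR='"'$PWD'/data"'               for a development build
 */
#include <stdio.h>

#ifndef DATADIR
#define DATADIR "/usr/local/share/myapp"   /* fallback if the build didn't set it */
#endif

int main(void)
{
    FILE *f = fopen(DATADIR "/blabla.conf", "r");
    if (f == NULL) {
        perror(DATADIR "/blabla.conf");
        return 1;
    }
    /* ... read resources ... */
    fclose(f);
    return 0;
}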
You could just test a few locations. For example, first check if you have a data directory within the directory you're currently running the program from. If so, just go ahead and use it. If not, try /usr/local/share/yourproject/data, and so on.
For developing/testing, you can use the data directory within your project folder, and for deploying, use the stuff in /usr/local/share/. Of course, you can test for even more locations (e.g. /usr/share).
Basically the requirement for this method is that you have a function that builds the correct paths for all filesystem accesses. Instead of fopen("data/blabla.conf", "w") use something like fopen(path("blabla.conf"), "w"). path() will construct the correct path from the path determined using the directory tests when the program started. E.g. if the path was /usr/local/share/yourproject/data/, the string returned by path("blabla.conf") would be "/usr/local/share/yourproject/data/blabla.conf" - and there is your nice absolute path.
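A minimal sketch of such a path() helper, assuming the base directory has been chosen once at startup (the directory probing itself is omitted, and the names are placeholders):

#include <stdio.h>

/* Set once at program start, after testing the candidate locations. */
static const char *data_dir = "/usr/local/share/yourproject/data/";

/* Returns a static buffer: fine for simple sequential use, not thread-safe. */
static const char *path(const char *filename)
{
    static char full[1024];
    snprintf(full, sizeof full, "%s%s", data_dir, filename);
    return full;
}

int main(void)
{
    FILE *f = fopen(path("blabla.conf"), "w");
    if (f != NULL)
        fclose(f);
    return 0;
}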
That's how I'd do it. HTH.
My preferred solution in cases like this is to use a configuration file, along with a command-line option that overrides its location.
For example, a configuration file for a fully deployed application named myapp could reside in /etc/myapp/settings.conf and a part of it could look like this:
...
confdir=/etc/myapp/
bindir=/usr/bin/
datadir=/usr/share/myapp/
docdir=/usr/share/doc/myapp/
...
Your application (or a launcher script) can parse this file to determine where to find the rest of the needed files.
I believe that you can reasonably assume in your code that the location of the configuration file is fixed under /etc/myapp - or any other location specified at compile time. Then you provide a command line option to allow that location to be overridden:
myapp --configfile=/opt/myapp/etc/settings.conf ...
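A sketch of the launcher side, using the hypothetical default location and the datadir key from the example above:

#include <stdio.h>
#include <string.h>

int main(int argc, char *argv[])
{
    const char *conf = "/etc/myapp/settings.conf";   /* compile-time default */
    char datadir[512] = "";

    /* Allow --configfile=... to override the fixed location. */
    for (int i = 1; i < argc; i++)
        if (strncmp(argv[i], "--configfile=", 13) == 0)
            conf = argv[i] + 13;

    FILE *f = fopen(conf, "r");
    if (f == NULL) {
        fprintf(stderr, "cannot open configuration file %s\n", conf);
        return 1;   /* abort loudly rather than guessing */
    }

    char line[1024];
    while (fgets(line, sizeof line, f) != NULL)
        sscanf(line, "datadir=%511s", datadir);      /* pick out one key */
    fclose(f);

    printf("data directory: %s\n", datadir);
    return 0;
}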
It might also make sense to have options for some of the directory paths as well, so that the user can easily override any of the configuration file settings. This approach has a couple of advantages:
Your users can relocate the application very easily - just by moving the files, modifying the paths in the configuration file and then using e.g. a wrapper script to call the main application with the proper --configfile option.
You can easily support FHS, as well as any other scheme you need to.
While developing, you can have your testsuite use a specially crafted configuration file with the paths being wherever you need them to be.
Some people advocate probing the system at runtime to resolve issues like this. I usually suggest avoiding such solutions for at least the following reasons:
It makes your program non-deterministic. You can never tell at a first glance which configuration file it picks up - especially if you have multiple versions of the application on your system.
At any installation mix-up, the application will remain fat and happy - and so will the user. In my opinion, the application should look at one specific and well-documented location and abort with an informative message if it cannot find what it is looking for.
It's highly unlikely that you will always get everything right. There will always be unexpected rare environments or corner cases that the application will not handle.
Such behaviour is against the Unix philosophy. Even when command shells probe multiple locations, they do so because every one of those locations can hold a file that should be parsed.
EDIT:
This method is not mandated by any formal standard that I know of, but it is the prevalent solution in the Unix world. Most major daemons (e.g. BIND, sendmail, postfix, INN, Apache) will look for a configuration file at a certain location, but will allow you to override that location and - through the file - any other path.
This is mostly to allow the system administrator to implement whatever scheme they want or to set up multiple concurrent installations, but it does help during testing as well. This flexibility is what makes it a best practice, if not a proper standard.

Pseudo filesystems on *nix

I need some opinions and pointers on creating pseudo-filesystems for Linux/*nix systems.
Firstly when I say pseudo-filesystem I mean something like /proc where the structure within does not represent actual files on disks or such but the state of the kernel. I would like to try something similar as an interface to an application.
As an example you could say, mount a ftp url to your filesystem and your browser app could then allow you to interact with the remote system doing ls et al on it and translating the standard filesystem requests into ftp ones.
So the first question is: how does one go about doing that? I have read a bit about it and it looks like you need to implement a new kernel module. If possible I would like to avoid that - my thinking being that someone may already have provided a tool (and the accompanying kernel module) for doing this sort of thing.
My second question is: does anyone have a good list of examples of applications/services/whatever using this sort of technique to provide a filesystem based interface.
Lastly if anyone has any opinions on why this might be a good/bad idea to do such a thing on a generic level I would like to hear it.
A userspace filesystem via fuse would probably be your best way to go.
Regarding the next part of your question (which applications use this method), there is the window manager wmii, it uses the 9p filesystem via v9fs, which is a port of 9p to Linux. There are many examples on plan9, most notably acme. I suggested fuse because it seems more actively developed and mainstream in the Linux world, but plan9 is pretty much the reference for this approach as far as I know.
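To give an idea of the shape of such a thing, here is a minimal read-only sketch against the high-level FUSE 2.x API (a hypothetical example: it exposes a single file /hello whose contents come from program state rather than disk):

/* hello_fs.c - build roughly with:
 *   gcc hello_fs.c `pkg-config fuse --cflags --libs` -o hello_fs
 * Mount with "./hello_fs /some/mountpoint", unmount with "fusermount -u". */
#define FUSE_USE_VERSION 26
#include <fuse.h>
#include <string.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>

static const char *hello_str = "application state goes here\n";
static const char *hello_path = "/hello";

static int fs_getattr(const char *path, struct stat *st)
{
    memset(st, 0, sizeof(*st));
    if (strcmp(path, "/") == 0) {
        st->st_mode = S_IFDIR | 0755;
        st->st_nlink = 2;
    } else if (strcmp(path, hello_path) == 0) {
        st->st_mode = S_IFREG | 0444;
        st->st_nlink = 1;
        st->st_size = strlen(hello_str);
    } else {
        return -ENOENT;
    }
    return 0;
}

static int fs_readdir(const char *path, void *buf, fuse_fill_dir_t filler,
                      off_t offset, struct fuse_file_info *fi)
{
    if (strcmp(path, "/") != 0)
        return -ENOENT;
    filler(buf, ".", NULL, 0);
    filler(buf, "..", NULL, 0);
    filler(buf, hello_path + 1, NULL, 0);
    return 0;
}

static int fs_open(const char *path, struct fuse_file_info *fi)
{
    if (strcmp(path, hello_path) != 0)
        return -ENOENT;
    if ((fi->flags & O_ACCMODE) != O_RDONLY)
        return -EACCES;
    return 0;
}

static int fs_read(const char *path, char *buf, size_t size, off_t offset,
                   struct fuse_file_info *fi)
{
    size_t len = strlen(hello_str);
    if (strcmp(path, hello_path) != 0)
        return -ENOENT;
    if ((size_t) offset >= len)
        return 0;
    if (offset + size > len)
        size = len - offset;
    memcpy(buf, hello_str + offset, size);
    return (int) size;
}

static struct fuse_operations fs_ops = {
    .getattr = fs_getattr,
    .readdir = fs_readdir,
    .open    = fs_open,
    .read    = fs_read,
};

int main(int argc, char *argv[])
{
    return fuse_main(argc, argv, &fs_ops, NULL);
}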

Are there good reasons not to exploit '#!/bin/make -f' at the top of a makefile to give an executable makefile?

Mostly for my amusement, I created a makefile in my $HOME/bin directory called rebuild.mk, and made it executable, and the first lines of the file read:
#!/bin/make -f
#
# Comments on what the makefile is for
...
all: ${SCRIPTS} ${LINKS} ...
...
I can now type:
rebuild.mk
and this causes make to execute.
What are the reasons for not exploiting this on a permanent basis, other than this:
The makefile is tied to a single directory, so it really isn't appropriate in my main bin directory.
Has anyone ever seen the trick exploited before?
Collecting some comments, and providing a bit more background information.
Norman Ramsey reports that this technique is used in Debian; that is interesting to know. Thank you.
I agree that typing 'make' is more idiomatic.
However, the scenario (previously unstated) is that my $HOME/bin directory already has a cross-platform main makefile in it that is the primary maintenance tool for the 500+ commands in the directory.
However, on one particular machine (only), I wanted to add a makefile for building a special set of tools. So, those tools get a special makefile, which I called rebuild.mk for this question (it has another name on my machine).
I do get to save typing 'make -f rebuild.mk' by using 'rebuild.mk' instead.
Fixing the position of the make utility is problematic across platforms.
The #!/usr/bin/env make -f technique is likely to work, though I believe the official rules of engagement are that the line must be less than 32 characters and may only have one argument to the command.
@dF comments that the technique might prevent you passing arguments to make. That is not a problem on my Solaris machine, at any rate. The three different versions of 'make' I tested (Sun, GNU, mine) all got the extra command line arguments that I typed, including options ('-u' on my home-brew version), targets ('someprogram') and macros (CC='cc' WFLAGS=-v, to use a different compiler and cancel the GCC warning flags which the Sun compiler does not understand).
I would not advocate this as a general technique.
As stated, it was mostly for my amusement. I may keep it for this particular job; it is most unlikely that I'd use it in distributed work. And if I did, I'd supply and apply a 'fixin' script to fix the pathname of the interpreter; indeed, I did that already on my machine. That script is a relic from the first edition of the Camel book ('Programming Perl' by Larry Wall).
One problem with this for generally distributable Makefiles is that the location of make is not always consistent across platforms. Also, some systems might require an alternate name like gmake.
Of course one can always run the appropriate command manually, but this sort of defeats the whole purpose of making the Makefile executable.
I've seen this trick used before in the debian/rules file that is part of every Debian package.
To address the problem of make not always being in the same place (on my system for example it's in /usr/bin), you could use
#!/usr/bin/env make -f
if you're on a UNIX-like system.
Another problem is that by using the Makefile this way you cannot override variables by doing, for example, make CFLAGS=....
"make" is shorter than "./Makefile", so I don't think you're buying anything.
The reason I would not do this is that typing "make" is more idiomatic for building Makefile-based projects. Imagine if, for every project you built, you had to search for the differently named makefile someone created instead of just typing "make && make install".
You could use a shell alias for this too.
We can look at this another way: is it a good idea to design a language whose interpreter looks for a fixed filename if you don't give it one? What if python looked for Pythonfile in the absence of a script name? ;)
You don't need such a mechanism in order to have a convention based around a known name. Example: Autoconf's ./configure script.
