Are there any security issues in using environment variables like this?


I'm writing a collection of utilities in bash that has a common library. Each script I write has to have a block of code that determines the path of the library relative to the executable. Not the actual code, but an example:
#!/bin/bash
DIR="$( cd -P "$( dirname "${BASH_SOURCE[0]}" )" && pwd )"
. "$DIR/../lib/utilities/functions"
Instead of a preamble that tries to search for the library, I had the bright idea to use an environment variable to indicate the library's location.
#!/bin/bash
. "$TOOLS_PATH_LIBRARY"
I could use a wrapper program to set that environment variable, or I could set it in my own path. There may be better ways to organize a bash toolset, but my question is:
Can I trust my environment variables?
This is one of those, I've-never-really-thought-about-it kind of questions. When programming in other languages, paths are used to find libraries, (e.g. LD_LIBRARY_PATH, PYTHONPATH, PERLLIB, RUBYLIB, CLASSPATH, NODE_PATH), but I've never had to stop and think about how that could be insecure.
In fact, there is an article, "Why LD_LIBRARY_PATH is bad", written to discourage its use. The Ruby and Perl library path environment variables are ignored if their security mechanisms are invoked, $SAFE and -T (taint mode) respectively.
My thoughts so far...
The user could set TOOLS_PATH_LIBRARY to a library of their choosing, but the utility will run under their uid. They could simply run their malicious library directly with bash.
My tools sudo some things. Someone could set their TOOLS_PATH_LIBRARY to something that takes advantage of this. But, the tools are not run via sudo, they only invoke sudo here and there. The user would have to be a sudoer in any case, they could just call sudo directly.
If I can't trust TOOLS_PATH_LIBRARY, then I can't trust PATH. All program invocations must use absolute paths.
I've seen shell programs use variables that hold absolute paths to programs, so that instead of calling ls they use a variable like LS=/bin/ls. From what I've read, this is to protect against users redefining program defaults as aliases. See: "PATH, functions and security" and "Bash scripting best practices".
Perl's taint mode treats all environment variables as "tainted", which is foreboding, and is why I'm trying to reason about the risks of the environment.
It is not possible for one user to change another's environment, unless that user is root. Thus, I'm only concerned about a user changing their own environment to escalate privileges. See: Is there a way to change another process's environment variables?
I've rubber ducked this into an answer of sorts, but I'm still going to post it, since it isn't a pat answer.
Update: What are the security issues surrounding the use of environment variables to specify paths to libraries and executables?

While mechanisms exist in various programs to prevent the modification of environment variables, the bottom line is, no you can't trust environment variables. The security concern is very basic:
Anytime a user can change what is expected to be executed, the possibility for a vulnerability to be exploited arises.
Case in point: have a look at CVE-2010-3847. With that, an underprivileged attacker with write access to a file system containing setuid or setgid binaries could use this flaw to escalate their privileges. It involves an environment variable being modified by the user.
CVE-2011-1095 is another example, and doesn't involve SUID binaries. Do a Google search for "glibc cve environment" to see the kinds of things that people were able to do with environment variable modifications. Pretty crafty.
Your concerns really boil down to your statement of:
The user could set TOOLS_PATH_LIBRARY to a library of their choosing, but the utility will run under their uid. They could simply run their malicious library directly with bash.
Key phrase here - run their malicious library. This assumes their library is owned by their UID as well.
This is where a security framework will do you lots of good. One that I have written, which focuses exclusively on this problem, can be found here:
https://github.com/cormander/tpe-lkm
The module stops execve/mmap/mprotect calls on files that are writable, or not owned by a trusted user (root). As long as they're not able to put malicious code into a file/directory owned by the trusted user, they can't exploit the system in this way.
If you're using SUID binaries or sudo that include those variables, you might want to consider enabling the "paranoid" and "strict" options to stop even root from trusting non-root owned binaries.
I should mention that this Trusted Path Execution method protects against direct execution of binaries and shared libraries. It does little (if anything) for interpreted languages, because the interpreter reads and runs the code itself instead of the kernel executing it directly. So you still need some degree of care with PYTHONPATH, PERLLIB, CLASSPATH, etc., and should use the language's safety mechanisms that you mentioned.
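For a bash library that gets sourced rather than executed, a similar "trusted owner, not writable by others" rule can be approximated in the script itself. This is only a rough userland sketch of the idea, not how the kernel module works; the function names is_trusted_file and source_trusted are made up for illustration, and it does not check the permissions of parent directories.

```bash
#!/bin/bash
# Hypothetical helper: succeed only if FILE is a regular file owned by OWNER
# (a user name or numeric uid, default root) and is not writable by group
# or others.
is_trusted_file() {
    local file=$1 owner=${2:-root}
    [ -n "$(find "$file" -prune -type f -user "$owner" \
            ! -perm -020 ! -perm -002 -print 2>/dev/null)" ]
}

# Hypothetical helper: source the library only when it passes the check.
source_trusted() {
    if is_trusted_file "$1" "${2:-root}"; then
        . "$1"
    else
        echo "refusing to source untrusted file: $1" >&2
        return 1
    fi
}
```

A script could then call source_trusted "$TOOLS_PATH_LIBRARY" instead of sourcing the variable blindly. Whether that buys anything depends on the threat model discussed in the other answers.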

Short answer:
Assuming the user is able to run programs and code of her own choice anyway, you do not need to worry about whether you can trust what she feeds you, including the environment. If the account is limited in some ways (no shell access, no write access to file systems that allow execution), that may change the picture, but as long as your scripts are only doing things the user could do herself, why protect against malicious interference?
Longer answer:
There are at least two separate issues to consider in terms of security problems:
What and whom do we need to guard against?
(closely related question: What can our program actually do and what could it potentially break?)
How do we do that?
If and as long as your program is running under the user ID of the user starting the program and providing all the input (i.e., the only one who could mount any attack), there are only rare situations where it makes sense at all to harden the program against attacks. Anti-copy protection and sharing high scores come to mind, but that group of things is not just concerned with inputs, but probably even more so with stopping the user from reading code and memory. Hiding the code of a shell script without some kind of suid/sgid trickery is nothing I'd know how to do; the best I could think of would be obscuring the code.
In this situation, anything the user could trick the program into doing, they could do without the help of the tool, too, so trying to “protect” against attacks by this user is moot. Your description does not read as if you'd actually need any protection against attacks.
Assuming you do need protection, you simply cannot rely on environment variables – and if you fail to reset things like LD_LIBRARY_PATH and LD_PRELOAD, even calling tools with absolute paths like /bin/id or /bin/ls won't give you a reliable answer, unless that tool happens to be statically compiled. This is why sudo has env_reset enabled by default and why running suid programs has to ignore certain environment variables. Note that this means that your point that TOOLS_PATH_LIBRARY and PATH are equally trustworthy may be true in your situation, but is not necessarily true in other situations' border cases: a sysadmin may reset PATH for sudo usage, but let non-standard environment variables pass through.
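If you do decide you need that protection, resetting the environment at the top of the script might look roughly like the sketch below. The function name sanitize_environment and the exact list of variables are my own assumptions, not a complete or authoritative list. Note also that LD_PRELOAD and LD_LIBRARY_PATH have already been applied to bash itself by the time this runs, so unsetting them only protects the child processes the script starts.

```bash
#!/bin/bash
# Hypothetical helper: put the environment into a known state before the
# script runs anything else.
sanitize_environment() {
    # Fixed search path (assumption: adjust to where your system tools live).
    PATH=/usr/bin:/bin
    export PATH
    # Variables that influence the dynamic linker, shell startup and cd.
    unset LD_LIBRARY_PATH LD_PRELOAD BASH_ENV ENV CDPATH
    # Forget aliases and any remembered command locations.
    unalias -a 2>/dev/null
    hash -r
}

sanitize_environment
```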
As pointed out above, argv[0] (or its bash equivalent ${BASH_SOURCE[0]}) is no more reliable than environment variables. Not only can the user simply make a copy or symlink of your original file; execve or bash's exec -a foo bar allows putting anything into argv[0].
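To see how little argv[0] can be trusted, here is a small demonstration using bash's exec -a. The function name show_spoofed_argv0 is invented for this example; the child bash should simply report whatever name the caller chose as its $0.

```bash
#!/bin/bash
# Start a child bash whose argv[0] is whatever the caller chooses.
# The child then prints its own $0.
show_spoofed_argv0() {
    bash -c 'exec -a "$1" bash -c "echo \"\$0\""' _ "$1"
}

show_spoofed_argv0 totally-not-bash
```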

You can never trust someone else's environment.
Why not simply create a new user that owns all of this important code? Then you can either get the information directly from /etc/passwd or use the ~foo syntax to find the user's home directory.
# One way to get home directory of util_user
DIR=$(/usr/bin/awk -F: '$1 == "util_user" {print $6}' /etc/passwd)
# Another way, which works in bash and KornShell
[ -d ~util_user ] && DIR=~util_user
# Make sure DIR is set!
if [ -z "$DIR" ]
then
    echo "Something's wrong!" >&2
    exit 1
fi

Related

How to modify standard linux commands?

I am looking for a way to edit the source code of common Linux commands (passwd, cd, rm, cat)
Ex. Every time the 'cat' command is called (by any user), it performs its regular function, but also prints "done" to stdout after.

If you're only looking to "augment" the commands as in your example, you can create e.g. /opt/bin/cat.sh:
/bin/cat "$@" && echo "done"
and then either:
change the default PATH (in /etc/bash.bashrc in Ubuntu) as follows:
PATH=/opt/bin:$PATH
or rename cat to e.g. cat.orig and move cat.sh to /bin/cat.
(If you do the latter, your script needs to call cat.orig not cat).
If you want to actually change the behavior, then you need to look at the sources here:
https://ftp.gnu.org/gnu/coreutils/
and then build them and replace the installed versions.
All this assumes, of course, that you have root permissions, seeing how you want to change that behavior for any user.

The answer to how to modify the source is not to, unless you have a REALLY good reason to. There’s a plethora of reasons why you shouldn’t, but a big one is that you should try to avoid modifying the source of anything that could receive an update. The update breaks, if not erases, your code and you’re left with a lot of work.
Alternatively, you can use things like aliases for quick customizations and write scripts that call and rely upon the command being available, instead of worrying about its implementation. I've over-explained, but that's because I'm coming to you as someone with only a little experience with Linux but much more in development, and what I've said extends beyond an operating system's CLI capabilities and lands further into general concepts of development.

process-local override of name resolution?

I have test code that I want to have a couple of hostnames resolve to the loopback while testing. When deployed, this code will use the normal system name resolution as appropriate. Test and deployment host are recent linux distros (SLES11SP1, e.g.).
I'd like to override hostname resolution for a single process, without being superuser. Is there a way to manipulate the nsswitch/hosts behavior in such a narrow fashion?
Yes, of course I could override the hostnames themselves, but I prefer not to (unless this feature really isn't available).
EDIT:
glibc's HOSTALIASES feature sounds like exactly what I want -- but its availability/effectiveness seems inconsistent among the hosts I surveyed. At some point, it was added to be among a list of insecure environment variables. But does that mean it's ignored globally or only in suid binaries? Will it still work for programs which do getnameinfo()?
More edit:
IMO, HOSTALIASES wins hands down. Disabling nscd is a workaround for platforms which don't respect it -- like mine (SuSE). And maybe they will release a fix.
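For reference, a per-process setup could be sketched like this. The alias file holds one "alias real-hostname" pair per line, and aliases must not contain dots. The names with_host_aliases, testdb, testcache and ./run_tests are placeholders of mine, and as noted above, whether the resolver honors the variable varies between platforms.

```bash
#!/bin/bash
# Hypothetical helper: run a command with a private alias file.
# $1 = alias file, remaining arguments = command to run.
# Only the child process sees HOSTALIASES.
with_host_aliases() {
    local aliases=$1
    shift
    HOSTALIASES=$aliases "$@"
}

# Hypothetical alias file mapping two test names to the loopback host.
aliases_file=$(mktemp)
printf '%s\n' 'testdb localhost' 'testcache localhost' > "$aliases_file"

# Example use (./run_tests is a placeholder for the test program):
#   with_host_aliases "$aliases_file" ./run_tests
```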

LD_LIBRARY_PATH for the win!
http://tldp.org/HOWTO/Program-Library-HOWTO/shared-libraries.html
Also:
What is the LD_PRELOAD trick?
Also:
http://www.linuxjournal.com/article/7795

Assuming you want to intercept e.g. gethostbyname(), and have it return 127.0.0.1 for certain hostnames ...
If your code is C++, the simplest answer might be to use gMock.
If you can't, you may want to interpose gethostbyname. A sample interposer is documented here.

Brian, another option would be to use chroot. You could create a directory with a bunch of mount --rbind for each of the directories usr, lib, home, etc. - enough to simulate a working root directory. Then use mount -t aufs to "layer" mount the existing etc together with a writable empty layer. In essence, after all that, whatever you change in etc ends up changing only inside that chroot environment. You could override and simulate all kinds of environments that way.
If this is of any interest and you need me to elaborate further, let me know.

Recommended FHS compliant application test/install workflow under Linux?

I'm in the process of switching to Linux for development, and I'm puzzled about how to maintain good FHS compliance in my programs.
For example, under Windows, I know that all the resources (bitmaps, audio data, etc.) that my program will need can be found with relative paths from the executable, so it's the same if I'm running the program from my development directory, or from an installation (under "Program Files" for example): the program will be able to locate all its files.
Now, under Linux, I see that usually the executable goes under /usr/local/bin and its resources on /usr/local/share. (And the truth is that I'm not even sure of this)
For convenience reasons (such as version control) I'd like to have all the files pertaining to the project under a same path, say, for example, project/src for the source and project/data for resource files.
Is there any standard or recommended way to let me just rebuild the binary for testing and use the files on the project/data directory, while also being able to locate the files when they are under /usr/local/share?
I thought, for example, of setting a symlink under /usr/local/share pointing to my resources dir, and then just hardcoding that path inside my program, but I feel it's quite hackish and not very portable.
Also, I thought of running an install script that copies all the resources to /usr/local/share every time I change or add resources, but I also feel it's not a good way to do it.
Could anyone tell me or point me to where it tells how this issue is usually resolved?
Thanks!

For convenience reasons (such as version control) I'd like to have all the files pertaining to the project under a same path, say, for example, project/src for the source and project/data for resource files.
You can organize your source tree as you wish — it need not bear any resemblance to the FHS layout desired of installed software.
I see that usually the executable goes under /usr/local/bin and its resources on /usr/local/share. (And the truth is that I'm not even sure of this)
The standard prefix is /usr. /usr/local is for, well, "local installations" as the FHS spec reiterates.
Is there any standard or recommended way to let me just rebuild the binary for testing and use the files on the project/data directory
Definitely. For example, running ./configure --datadir=$PWD/share is the way to point your build at the data files from the source tree (substitute the proper path), and something like -DDATADIR='"${datadir}"' in AM_CFLAGS makes the value known to the (presumably C) code. (All of that, provided you are using autoconf/automake. Similar options may be available in other build systems.)
This sort of hardcoding is what is used in practice, and it suffices. For a development build within your own working copy, having a hardcoded path should not be a problem, and final builds (those done by a packager) will simply use the standard FHS paths.

You could just test a few locations. For example, first check if you have a data directory within the directory you're currently running the program from. If so, just go ahead and use it. If not, try /usr/local/share/yourproject/data, and so on.
For developing/testing, you can use the data directory within your project folder, and for deploying, use the stuff in /usr/local/share/. Of course, you can test for even more locations (e.g. /usr/share).
Basically the requirement for this method is that you have a function that builds the correct paths for all filesystem accesses. Instead of fopen("data/blabla.conf", "w") use something like fopen(path("blabla.conf"), "w"). path() will construct the correct path from the path determined using the directory tests when the program started. E.g. if the path was /usr/local/share/yourproject/data/, the string returned by path("blabla.conf") would be "/usr/local/share/yourproject/data/blabla.conf" - and there is your nice absolute path.
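The same idea, sketched in shell rather than C just to show the logic. The names find_data_dir and data_path are made up for illustration, and the candidate directories are the ones mentioned above.

```bash
#!/bin/bash
# Print the first candidate that is an existing directory
# (without a trailing slash). Fails if none of them exists.
find_data_dir() {
    local dir
    for dir in "$@"; do
        if [ -d "$dir" ]; then
            printf '%s\n' "${dir%/}"
            return 0
        fi
    done
    return 1
}

# Build the full path of a resource inside the chosen data directory.
data_path() {
    printf '%s/%s\n' "$DATA_DIR" "$1"
}

# At startup: prefer ./data (development), then the installed locations.
DATA_DIR=$(find_data_dir ./data \
    /usr/local/share/yourproject/data \
    /usr/share/yourproject/data) || echo "no data directory found" >&2
```

After that, every file access goes through data_path, e.g. data_path blabla.conf.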
That's how I'd do it. HTH.

My preferred solution in cases like this is to use a configuration file, along with a command-line option that overrides its location.
For example, a configuration file for a fully deployed application named myapp could reside in /etc/myapp/settings.conf and a part of it could look like this:
...
confdir=/etc/myapp/
bindir=/usr/bin/
datadir=/usr/share/myapp/
docdir=/usr/share/doc/myapp/
...
Your application (or a launcher script) can parse this file to determine where to find the rest of the needed files.
I believe that you can reasonably assume in your code that the location of the configuration file is fixed under /etc/myapp - or any other location specified at compile time. Then you provide a command line option to allow that location to be overridden:
myapp --configfile=/opt/myapp/etc/settings.conf ...
It might also make sense to have options for some of the directory paths as well, so that the user can easily override any of the configuration file settings. This approach has a couple of advantages:
Your users can relocate the application very easily - just by moving the files, modifying the paths in the configuration file and then using e.g. a wrapper script to call the main application with the proper --configfile option.
You can easily support FHS, as well as any other scheme you need to.
While developing, you can have your testsuite use a specially crafted configuration file with the paths being wherever you need them to be.
Some people advocate probing the system at runtime to resolve issues like this. I usually suggest avoiding such solutions for at least the following reasons:
It makes your program non-deterministic. You can never tell at a first glance which configuration file it picks up - especially if you have multiple versions of the application on your system.
At any installation mix-up, the application will remain fat and happy - and so will the user. In my opinion, the application should look at one specific and well-documented location and abort with an informative message if it cannot find what it is looking for.
It's highly unlikely that you will always get everything right. There will always be unexpected rare environments or corner cases that the application will not handle.
Such behaviour is against the Unix philosophy. Even command shells probe multiple locations only because every one of those locations can hold a file that should be parsed.
EDIT:
This method is not mandated by any formal standard that I know of, but it is the prevalent solution in the Unix world. Most major daemons (e.g. BIND, sendmail, postfix, INN, Apache) will look for a configuration file at a certain location, but will allow you to override that location and - through the file - any other path.
This is mostly to allow the system administrator to implement whatever scheme they want or to set up multiple concurrent installations, but it does help during testing as well. This flexibility is what makes it a Best Practice if not a proper standard.
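As a rough illustration of the approach for a launcher script: one fixed default location, a --configfile override, and a simple key=value lookup. The function names config_file_from_args and config_get are hypothetical, and the parsing is deliberately minimal (no comments, quoting or whitespace handling).

```bash
#!/bin/bash
# Default location, fixed at "build time" (assumption taken from the example).
DEFAULT_CONFIG=/etc/myapp/settings.conf

# Print the configuration file to use: --configfile=PATH overrides the default.
config_file_from_args() {
    local config=$DEFAULT_CONFIG arg
    for arg in "$@"; do
        case $arg in
            --configfile=*) config=${arg#--configfile=} ;;
        esac
    done
    printf '%s\n' "$config"
}

# Print the value stored for a key in a key=value file, without sourcing it.
# $1 = configuration file, $2 = key. Fails if the key is not present.
config_get() {
    local key value
    while IFS='=' read -r key value || [ -n "$key" ]; do
        if [ "$key" = "$2" ]; then
            printf '%s\n' "$value"
            return 0
        fi
    done < "$1"
    return 1
}
```

A launcher would call config_file_from_args "$@" once, then abort with a clear message if config_get cannot find a required key such as datadir.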

Gurus say that LD_LIBRARY_PATH is bad - what's the alternative?

I read some articles about problems in using the LD_LIBRARY_PATH, even as a part of a wrapper script:
http://linuxmafia.com/faq/Admin/ld-lib-path.html
http://blogs.oracle.com/ali/entry/avoiding_ld_library_path_the
In this case - what are the recommended alternatives?
Thanks.

You can try adding:
-Wl,-rpath,path/to/lib
to the linker options. This will save you the need to worry about the LD_LIBRARY_PATH environment variable, and you can decide at compile time to point to a specific library.
For a path relative to the binary, you can use $ORIGIN, e.g.
-Wl,-rpath,'$ORIGIN/../lib'
($ORIGIN may not work when statically linking to shared libraries with ld; use -Wl,--allow-shlib-undefined to fix this.)

I've always set LD_LIBRARY_PATH, and I've never had a problem.
To quote your first link:
When should I set LD_LIBRARY_PATH? The short answer is never. Why? Some users seem to set this environment variable because of bad advice from other users or badly linked code that they do not know how to fix.
That is NOT what I call a definitive problem statement. In fact it brings to mind I don't like it. [YouTube, but SFW].
That second blog entry (http://blogs.oracle.com/ali/entry/avoiding_ld_library_path_the) is much more forthcoming on the nature of the problem... which appears to be, in a nutshell, library version clashes: ThisProgram requires Foo1.2, but ThatProgram requires Foo1.3, hence you can't run both programs (easily). Note that most of these problems are negated by a simple wrapper script which sets LD_LIBRARY_PATH for just the executing shell, which is (almost always) a separate child process of the interactive shell.
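Such a wrapper can be as small as the sketch below. The function name run_with_libs and the /opt/thisprogram paths are placeholders, not anything from the articles.

```bash
#!/bin/bash
# Hypothetical wrapper: run a command with an extra library directory in
# front of LD_LIBRARY_PATH. The variable is set for the child process only,
# so the calling shell is left untouched.
# $1 = library directory, remaining arguments = command to run.
run_with_libs() {
    local libdir=$1
    shift
    LD_LIBRARY_PATH=$libdir${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH} "$@"
}

# Example use (placeholder paths):
#   run_with_libs /opt/thisprogram/lib /opt/thisprogram/bin/thisprogram "$@"
```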
Note also that the alternatives are pretty well explained in the post.
I'm just confused as to why you would post a question containing links to articles which apparently answer your question... Do you have a specific question which wasn't covered (clearly enough) in either of those articles?

the answer is in the first article you quoted.
In UNIX the location of a library can be specified with the -L dir option to the compiler.
....
As an alternative to using the -L and -R options, you can set the environment variable LD_RUN_PATH before compiling the code.

I find that the existing answers do not actually answer the question in a straightforward way:
LD_RUN_PATH is used by the linker (see ld) at the time you link your software. It is used only if you have no -rpath ... on the command line (-Wl,-rpath ... on the gcc command line). The path(s) defined in that variable are added to the RPATH entry in your ELF binary file. (You can see that RPATH using objdump -x binary-filename; in most cases it is not there, though! It appears in my development binaries, but once the final version gets installed RPATH gets removed.)
LD_LIBRARY_PATH is used at runtime, when you want to specify a directory that the dynamic linker (see ldd) needs to search for libraries. Specifying the wrong path could lead to loading the wrong libraries. This is used in addition to the RPATH value defined in your binary (see LD_RUN_PATH above).
LD_RUN_PATH really causes no security threat unless you are a programmer and don't know how to use it. As I am using CMake to build my software, the -rpath is used all the time. That way I do not have to install everything to run my software. ldd can find all the .so files automatically. (the automake environment was supposed to do that too, but it was not very good at it, in comparison.)
LD_LIBRARY_PATH is a runtime variable and thus you have to be careful with it. That being said, many shared objects would be really difficult to deal with if we did not have that special feature. Is it a security threat? Probably not. If a hacker takes hold of your computer, LD_LIBRARY_PATH is accessible to that hacker anyway. What could happen is that you use the wrong path(s) in that variable: your binary may not load, and if it does load you may end up with a crashing binary or at least a binary that does not work quite right. One concern is that over time you get new versions of the library and you are likely to forget to remove the LD_LIBRARY_PATH setting, which means you may be using an insecure version of the library.
The one other possibility for security is if the hacker installs a fake library of the same name as what the binary is searching, library that includes all the same functions, but that has some of those functions replaced with sneaky code. He can get that library loaded by changing the LD_LIBRARY_PATH variable. Then it will eventually get executed by the hacker. Again, if the hacker can add such a library to your system, he's already in and probably does not need to do anything like that in the first place (since he's in he has full control of your system anyway.) Because in reality, if the hacker can only place the library in his account he won't do anything much (unless your Unix box is not safe overall...) If the hacker can replace one of your /usr/lib/... libraries, he already has full access to your system. So LD_LIBRARY_PATH is not needed.

Are there good reasons not to exploit '#!/bin/make -f' at the top of a makefile to give an executable makefile?

Mostly for my amusement, I created a makefile in my $HOME/bin directory called rebuild.mk, and made it executable, and the first lines of the file read:
#!/bin/make -f
#
# Comments on what the makefile is for
...
all: ${SCRIPTS} ${LINKS} ...
...
I can now type:
rebuild.mk
and this causes make to execute.
What are the reasons for not exploiting this on a permanent basis, other than this:
The makefile is tied to a single directory, so it really isn't appropriate in my main bin directory.
Has anyone ever seen the trick exploited before?
Collecting some comments, and providing a bit more background information.
Norman Ramsey reports that this technique is used in Debian; that is interesting to know. Thank you.
I agree that typing 'make' is more idiomatic.
However, the scenario (previously unstated) is that my $HOME/bin directory already has a cross-platform main makefile in it that is the primary maintenance tool for the 500+ commands in the directory.
However, on one particular machine (only), I wanted to add a makefile for building a special set of tools. So, those tools get a special makefile, which I called rebuild.mk for this question (it has another name on my machine).
I do get to save typing 'make -f rebuild.mk' by using 'rebuild.mk' instead.
Fixing the position of the make utility is problematic across platforms.
The #!/usr/bin/env make -f technique is likely to work, though I believe the official rules of engagement are that the line must be less than 32 characters and may only have one argument to the command.
@dF comments that the technique might prevent you passing arguments to make. That is not a problem on my Solaris machine, at any rate. The three different versions of 'make' I tested (Sun, GNU, mine) all got the extra command line arguments that I type, including options ('-u' on my home-brew version) and targets 'someprogram' and macros CC='cc' WFLAGS=-v (to use a different compiler and cancel the GCC warning flags which the Sun compiler does not understand).
I would not advocate this as a general technique.
As stated, it was mostly for my amusement. I may keep it for this particular job; it is most unlikely that I'd use it in distributed work. And if I did, I'd supply and apply a 'fixin' script to fix the pathname of the interpreter; indeed, I did that already on my machine. That script is a relic from the first edition of the Camel book ('Programming Perl' by Larry Wall).
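For the curious, the core of such a 'fixin'-style script can be sketched in a few lines of shell. This is not the Camel book script, just an illustration of the idea; the function name fix_interpreter is made up, and it assumes the interpreter line has the form "#!/path/to/prog [args]" with no space after "#!".

```bash
#!/bin/bash
# Rewrite the "#!" line of a file so it points at wherever the named
# interpreter lives on this machine, keeping any arguments (e.g. "-f").
# $1 = file to fix, $2 = interpreter name to look up on PATH.
fix_interpreter() {
    local file=$1 name=$2 path first rest args tmp status
    path=$(command -v "$name") || return 1
    IFS= read -r first < "$file" || [ -n "$first" ] || return 1
    case $first in
        '#!'*) rest=${first#??} ;;
        *) return 1 ;;              # no interpreter line to fix
    esac
    case $rest in
        *' '*) args=" ${rest#* }" ;;
        *) args= ;;
    esac
    tmp=$(mktemp) || return 1
    # Write the new first line plus the rest of the file, then overwrite the
    # original in place so that its mode bits (executable) are kept.
    { printf '#!%s%s\n' "$path" "$args"; tail -n +2 "$file"; } > "$tmp" &&
        cat "$tmp" > "$file"
    status=$?
    rm -f "$tmp"
    return "$status"
}
```

Running fix_interpreter rebuild.mk make would then replace /bin/make with the local path of make while keeping the -f.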

One problem with this for generally distributable Makefiles is that the location of make is not always consistent across platforms. Also, some systems might require an alternate name like gmake.
Of course one can always run the appropriate command manually, but this sort of defeats the whole purpose of making the Makefile executable.

I've seen this trick used before in the debian/rules file that is part of every Debian package.

To address the problem of make not always being in the same place (on my system for example it's in /usr/bin), you could use
#!/usr/bin/env make -f
if you're on a UNIX-like system.
Another problem is that by using the Makefile this way you cannot override variables by running, for example, make CFLAGS=....

"make" is shorter than "./Makefile", so I don't think you're buying anything.

The reason I would not do this is that typing "make" is more idiomatic to building Makefile based projects. Imagine if every project you built you had to search for the differently named makefile someone created instead of just typing "make && make install".

You could use a shell alias for this too.

We can look at this another way: is it a good idea to design a language whose interpreter looks for a fixed filename if you don't give it one? What if python looked for Pythonfile in the absence of a script name? ;)
You don't need such a mechanism in order to have a convention based around a known name. Example: Autoconf's ./configure script.
