Python mechanism to load site-packages relative to program/script location? - python-3.x

For a set of programs written in most languages (C, for instance), a script can normally run those programs without any interference between dynamic link libraries and with no special hand-holding, so long as they are all found on PATH. That is, the following will work:
#!/bin/bash
prog1
prog2
prog3
However, if these three programs are written in Python and they import conflicting package versions, then to run each one successfully it must either be installed into a virtualenv or each must have a separate site-packages directory referenced by PYTHONPATH. Either way they need a setup, and possibly a teardown, before running. That is, for virtualenv:
#!/bin/bash
source $PROG1_ROOT/bin/activate
prog1
deactivate
source $PROG2_ROOT/bin/activate
prog2
deactivate
source $PROG3_ROOT/bin/activate
prog3
deactivate
and for separate site-packages:
#!/bin/bash
export PYTHONPATH=$PROG1_ROOT/lib/python3.6/site-packages
prog1
export PYTHONPATH=$PROG2_ROOT/lib/python3.6/site-packages
prog2
export PYTHONPATH=$PROG3_ROOT/lib/python3.6/site-packages
prog3
This problem arises because
import pkg_resources
(at least through Python 3.6) cannot reliably import the proper versions when multiple versions of a package share the same site-packages directory, even if it is preceded by a __requires__ assignment listing all the version restrictions.
It occurs to me that if PYTHONPATH, or some equivalent, could be specified relative to the program instead of to $PWD, and some consistency in directory layout were observed, then it would only have to be set once. That is, if prog1 is in $PROG1_ROOT/bin and its libraries are in $PROG1_ROOT/lib/python3.6/site-packages, then setting PYTHONPATH to "../lib/python3.6/site-packages" would work not only for prog1 but also for prog2, prog3, and as many more as are needed, through progN.
However, PYTHONPATH is normally given as an absolute path, and relative paths are, I believe, resolved with respect to $PWD, not to the Python program (prog1). Is there some other Python path variable with the desired property? Failing that, is there some type of file which could be dropped into $PROG1_ROOT/bin that a Python program would normally pick up when it starts and which could direct it to use $PROG1_ROOT/lib/python3.6/site-packages? Either a relative or an absolute path in that file would be OK, although the former would still be preferred, because then one could move the entire PROG1_ROOT directory tree to another location in the file system without rewriting this special file. I really want to avoid solutions which require modifying the programs themselves (i.e., prog1 in the example).
Thanks.
EDITED:
I wrote this:
https://sourceforge.net/projects/python-devirtualizer/
to implement some of these ideas. At this point it is Linux (or at least POSIX) specific. It slightly modifies Python scripts in a package's "bin" directory by changing the first line, and it "wraps" everything in that directory with a replacement native binary which injects a custom PYTHONPATH into the true target's environment. That binary looks up its own location using a function from libSDL2 and then specifies PYTHONPATH relative to that. So far it has worked pretty well, and the "programs" in installed Python packages (the "bin" directory's contents) run based on PATH just like any other program, with no futzing about with PYTHONPATH in the shell.
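For illustration, a minimal POSIX-shell sketch of the same wrapper idea (the real tool uses a native binary and libSDL2; the name prog1.real here is hypothetical, standing for the renamed original script):
#!/bin/bash
# Hypothetical wrapper installed as $PROG1_ROOT/bin/prog1; assumes the
# original script has been renamed to prog1.real in the same directory.
here="$(cd "$(dirname "$0")" && pwd)"    # resolve the wrapper's own directory
export PYTHONPATH="$here/../lib/python3.6/site-packages"
exec "$here/prog1.real" "$@"
Because the path is computed from the wrapper's location at run time, the whole $PROG1_ROOT tree can be moved without editing anything.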

Making search paths relative to the executable is a Very Bad Idea (TM). Move the executable or the libraries around and all hell breaks loose. Some enterprising miscreant might notice the path settings and place a script just right to get their own doctored libraries (or just flawed old versions) used instead. And so on.
Clean up the misbehaving scripts. Chances are that by using old versions they are vulnerable to by-now-fixed security boo-boos, or other misbehaviours. Or find a way to load the right versions in the scripts themselves.

Related

Add the software's bin directory, or just add a soft link for the executable file in bin, when installing software on Linux?

I'm not root on the Linux server, so I chose to install software into $HOME/local/bin. I have already added the $HOME/local/bin directory to the PATH environment variable in my .bashrc.
Some software installs this way, like:
tar xvzf ncurses-5.9.tar.gz
cd ncurses-5.9
./configure --prefix=$HOME/local
make
make install
cd ..
So it installs directly into my $HOME/local/bin.
But some software, like sbt-1.2.1.zip (based on Java), after download and decompression shows just a folder sbt, which contains three folders: bin, conf, and lib. Its bin contains one executable file named sbt, along with java9-rt-export.jar, sbt-launch-lib.bash, sbt-launch.jar, and sbt.bat.
Here I wonder:
Should I just soft-link this executable sbt file under my $HOME/local/bin, then source my .bashrc?
Or, after decompression, add one line to my .bashrc: export PATH="downloadpath/sbt/bin:$PATH"?
Since there is just one executable in downloadpath/sbt/bin, I'm not sure it is right to add the whole bin folder to PATH; if a software's bin folder contains executable files (one or many), I think it is more convenient to just add its bin in .bashrc, but even so, I'm not sure that's right. The two options would look like the sketch below.
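(downloadpath is a placeholder for the real unpack location):
# Option 1: symlink just the sbt executable into a directory already on PATH
ln -s downloadpath/sbt/bin/sbt $HOME/local/bin/sbt   # downloadpath should be absolute
# Option 2: add the whole bin directory to PATH in ~/.bashrc
export PATH="downloadpath/sbt/bin:$PATH"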
I'm not familiar with installing software; right now I usually know the way but not the why. Here I have shown two ways to install (more ways are not shown here). Is the executable file always placed in bin or src? Some software has no bin, just src, with no executable files in it...
Slurm systems can also use modules to install software, and conda is another way, but I want to confirm that the two traditional ways I mentioned can still be used on Slurm or with conda?
However, any suggestion, even a reminder about one aspect, will be appreciated!
For precompiled software, or, in general, software that does not offer configure scripts or (C)Make files, it is often better to leave it in its target directory and adapt the *PATH environment variables (PATH for binaries, but also LD_LIBRARY_PATH and LIBRARY_PATH for libraries, CPATH for include files, and MANPATH for the man pages).
The reason is that the software might be configured to read files with hardcoded paths, relative to the position of the executable, such as libraries, etc.
In your case, you might also need to set the CLASSPATH environment variable to the directory with the jar files.
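For example, hypothetical .bashrc entries for the sbt case (the paths assume the archive was unpacked under $HOME/local/sbt; adjust them to the real layout):
# Executables from sbt's bin directory
export PATH="$HOME/local/sbt/bin:$PATH"
# Jar files from sbt's lib directory
export CLASSPATH="$HOME/local/sbt/lib:$CLASSPATH"
# For C software installed the same way, the analogous variables are
# LD_LIBRARY_PATH / LIBRARY_PATH (libraries), CPATH (headers), MANPATH (man pages)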
To ease software installation, you can use tools such as EasyBuild, which can help and can even create user modules just like the system modules installed by the system administrators.
There is something wrong, in my opinion, with your setup. If you don't have a root account on your server, wouldn't it be better to test what you have to test in a safer environment, for example a VM or container on your development machine?
However, in your situation it may be better to start sbt from a separate bash script rather than modifying your .bashrc.
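A hedged sketch of such a wrapper, saved for example as $HOME/local/bin/sbt (the unpack location is hypothetical):
#!/bin/bash
# Thin wrapper that puts sbt on PATH without touching .bashrc;
# assumes the archive was unpacked to $HOME/local/sbt.
exec "$HOME/local/sbt/bin/sbt" "$@"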

Building install packages with shared objects and symbolic links

I am working on a project and we are starting to release binaries. We are using CMake to generate build files and CPack to create the binaries. Our binaries work, but we run into problems with shared objects. Essentially, many of the issues arise from symbolic linking on the system, especially shared objects with multiple links. So, let's say the RPATH results from ldd (or otool) for some executable include libmpich.so.10, I've linked /usr/lib/x86_64-linux-gnu/libmpich.so from CMake, and these files are related like this:
/usr/lib/x86_64-linux-gnu/libmpich.so -> libmpich.so.10
/usr/lib/x86_64-linux-gnu/libmpich.so.10 -> libmpich.so.10.0.4
/usr/lib/x86_64-linux-gnu/libmpich.so.10.0.4
Now, for some reason the RPATH uses the intermediate link (so.10) but readlink on libmpich.so (or get_filename_component(... REALPATH)) returns libmpich.so.10.0.4. So if I install libmpich.so.10.0.4 under the name libmpich.so OR libmpich.so.10.0.4 (or create the symlink from one to the other), I've still missed the library asked for in the RPATH.
I've been playing whack-a-mole when dealing with these, and/or using a file glob to try to grab the intermediate link, but I would like to do something more robust. Does anyone have a good design pattern for this?
I have been looking into using functions like GET_PREREQUISITES, but those require the object to be built, so I would need to add them into the install scripts somehow... and it feels like there should be a better way.
-Jameson
P.S. I've also been looking for a best-practices guide for building binaries, either with CMake or in general. We are producing binaries on Windows, Linux, and Mac. If you know of some good links, please post them as well.
I just recently dealt with this issue myself. The CMake command get_filename_component(... REALPATH) resolves ALL levels of a symlink in one call.
To resolve just a single level of a symlink, you can call 'readlink' directly from CMake, since it's available on every symlink-enabled platform that you're likely to build on (Linux, Mac OS X, and *BSD).
So, if you want to reproduce the complete chain of symbolic links,
you'd code up something like this in your cmake script:
# If given the following library path:
set(lib "/usr/lib/x86_64-linux-gnu/libmpich.so")
# Make sure the initial path is absolute.
get_filename_component(lib "${lib}" ABSOLUTE)
# Store the initial path as the first element in the list.
set(symlist "${lib}")
while(UNIX AND IS_SYMLINK "${lib}")
    # Grab the path to the directory containing the current symlink.
    get_filename_component(sym_path "${lib}" DIRECTORY)
    # Resolve one level of symlink, storing the resolved path back in lib.
    execute_process(COMMAND readlink "${lib}"
                    RESULT_VARIABLE errMsg
                    OUTPUT_VARIABLE lib
                    OUTPUT_STRIP_TRAILING_WHITESPACE)
    # Check that readlink executed correctly.
    if(errMsg AND (NOT "${errMsg}" EQUAL "0"))
        message(FATAL_ERROR "Error calling readlink on library.")
    endif()
    # Convert the resolved path to an absolute path, if it isn't one already.
    if(NOT IS_ABSOLUTE "${lib}")
        set(lib "${sym_path}/${lib}")
    endif()
    # Append the resolved path to the symlink resolution list.
    list(APPEND symlist "${lib}")
endwhile()
#Now symlist will contain the following:
# [...]/libmpich.so;[...]/libmpich.so.10;[...]/libmpich.so.10.0.4
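For comparison, the same one-level-at-a-time resolution can be sketched in plain shell (paths hypothetical); this can be handy for checking by hand what the CMake loop should produce:
#!/bin/bash
# Walk a symlink chain one level at a time with readlink,
# printing each element of the chain (mirrors the CMake loop above).
lib=/usr/lib/x86_64-linux-gnu/libmpich.so
echo "$lib"
while [ -L "$lib" ]; do
    target=$(readlink "$lib")
    case "$target" in
        /*) lib="$target" ;;                    # already absolute
        *)  lib="$(dirname "$lib")/$target" ;;  # make a relative target absolute
    esac
    echo "$lib"
done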

uic can't find shared library

I am trying to make Qt5 part of my source tree, so I haven't installed it on my machine, just copied it from source control. I am having a problem when I try to run uic:
stiopa#stiopa-VirtualBox:~/ct/LinuxLibs/Qt/bin > ./uic
./uic: error while loading shared libraries: libQt5Core.so.5: cannot open shared object file: No such file or directory
I am still getting the same error even when I copy the libQt5Core library to bin directory. How is uic looking for shared libraries? Is there any environment variable I need to set to fix it?
This is yet another case of the dependent shared libraries not being in a defined location that the program's run-time linker searches.
If you're planning on doing the 'copy the files to the same directory as the executable' approach, the quick solution is to reference that directory in the library load path; e.g. if the binary is in $HOME/foo, you do:
export LD_LIBRARY_PATH=${LD_LIBRARY_PATH}${LD_LIBRARY_PATH:+:}$HOME/foo
This appends $HOME/foo to the run-time linker's load path (or makes it the load path if the variable was empty). As a result, any programs you run will look in this directory for libraries, as well as in the default set for the OS (defined by the ld.so configuration) and in the paths defined within the application itself (the rpath).
If you're going to follow this route, what you can do is move the binary to target.bin and create a target bash script which invokes the .bin file automatically; e.g.
#!/bin/bash -p
# Prepend this script's own directory to the linker search path,
# then hand off to the real binary (named <script>.bin).
export LD_LIBRARY_PATH=${LD_LIBRARY_PATH}${LD_LIBRARY_PATH:+:}$(dirname "$0")
exec "$0.bin" "$@"
A second mechanism, which lets you change the search locations for a binary without inserting an environment variable, is to modify the binary itself so that it searches in different locations than it usually does; this takes advantage of some features of the run-time linker (which looks for the libraries).
There is a program called chrpath, installable through various package managers, which allows you to edit the rpath directly. In this case, you can change the additional search path of the binary using:
chrpath -r '$ORIGIN' foo
This means that the program will look in the same directory as the binary for .so files, thus allowing it to run.
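To verify the change took effect, a quick check of the binary's dynamic section (assuming GNU binutils is available):
readelf -d foo | grep -iE 'rpath|runpath'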

multiple binaries with same name in ubuntu/linux

I have recently installed a web framework, Play (http://www.playframework.com/), and want to have the play executable on the system path, i.e. $PATH. But Ubuntu already defines a command called play. How do I override the system-defined command with my framework's binary path so that the play command on the command line calls my framework rather than the old application?
Installation: I downloaded the zipped file of the framework and unzipped it into one of my personal folders, which contains the docs and the executable.
Never alter the contents of installed packages. Such changes can provoke hard-to-find problems in the system and, anyway, they will most likely be overwritten in subsequent updates. There are other alternatives:
obviously you can choose another name for your executable
place the executable in another part of your $PATH; if it's a "personal installation", typically ~/bin is used for such an approach. Remember that the order of entries in the $PATH variable is important: first come, first served.
use the traditional /usr/local/bin location for locally added "wild" installations; this way there is some form of clean separation between clean packages and wild-installed files in the system
store your software in some other location and prepend that to your personal or system-wide $PATH variable
store your executable under another name and create an alias (see man alias for an explanation) for it, which allows you to call it by some name that "hides" the original command. For this, the executable can be addressed with an absolute path, so it does not have to be found via the $PATH variable.
In my personal opinion, options 2 and 5 are the best when it comes to "personal installations"; a sketch of both follows.
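A minimal sketch of both (the framework's location, ~/frameworks/play, is hypothetical):
# Option 2: a personal bin directory searched before the system paths
mkdir -p ~/bin
ln -s ~/frameworks/play/play ~/bin/play   # link the framework's executable
export PATH="$HOME/bin:$PATH"             # usually set once in ~/.profile
# Option 5: or hide the system command behind an alias in ~/.bashrc
alias play="$HOME/frameworks/play/play"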
If you are sure you'll never use the original play command, you could just remove the binary. But in general this isn't a good idea, since some system component you haven't thought of might need it, and the next update will probably restore it.
The best thing to do is to prepend the directory of your play command to the PATH, for example, using PATH=/opt/framework/bin:$PATH in your .profile (assuming your play command installs to /opt/framework/bin/play), or the script that starts your web server, or wherever you need your play command.
Remember that this does not make your play command global. A common mistake is to add the path in one's .profile file and then call the program from crontab: crontab scripts will not execute .profile or .bashrc.
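If the command must be available to cron jobs, one workaround is to set PATH inside the crontab itself (the framework path here is hypothetical):
# At the top of the crontab:
PATH=/opt/framework/bin:/usr/bin:/bin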

Running a program from the source tree

Should it generally be possible to run a program from the source directory (src) after having invoked ./configure and make (but not make install)? I'm trying to fix a bug in an application and it seems unnecessary to run make install after each code change. Unfortunately I can't run the application in the source directory since it tries to access files in the lib installation directory (which do not exist before make install). Is the application wrongly configured or do I have to reinstall it after each change to the source code?
It all depends on the application and what components or files it expects to be visible and where. But assuming no required configuration or dependencies, then yes, you can run the program in-place.
To add a directory to your lib search path, add to the environment variable LD_LIBRARY_PATH. Like so:
LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/home/user/myproject/lib" ./someprogram
Note that specifying a variable assignment on the command line in front of the program you run sets that variable for that run only. (Note: no semicolon, as this is a single command.) If you want to set the variable for the entire session, use
export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/home/user/myproject/lib"
I'd recommend against this, though. It can lead to problems and confusion.
