How can I audit and minimize requirements.txt?

I'm looking for a way to audit and minimize requirements.txt. I've taken over a project that has grown bloated over several iterations, and I'm trying to make it more maintainable. The current virtual environment I'm in was created from the previous requirements.txt; that file lists packages that are no longer imported in any script.
In the past, I've done this manual process:
Search through the project directory to find all Python files in all subfolders
Search each Python file found for import x and from x import y statements, and add those packages to a list
pip show each of the packages on the list, adding any dependencies to the end of the list
Once the list is exhausted, sort it and compare it to requirements.txt
Remove requirements that aren't on the list.
Assuming that my code performs no relative imports, is there a way to automate this process? I can't imagine I'd be the first person looking for such a tool (or gist, or script), but I couldn't find any. I use Windows, but I'm happy to run Linux commands on Windows Subsystem for Linux.
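For reference, steps 1-2 above boil down to something like this bash sketch (fine on WSL); import names don't always match PyPI distribution names, so the output is only a starting point:
# collect top-level names from import/from statements in every .py file
find . -name '*.py' -print0 \
  | xargs -0 grep -hE '^[[:space:]]*(import|from)[[:space:]]' \
  | sed -E 's/^[[:space:]]*(from|import)[[:space:]]+([A-Za-z0-9_]+).*/\2/' \
  | sort -u > imported.txt
# strip comments, pip options, and version specifiers from requirements.txt
grep -vE '^[[:space:]]*(#|-)' requirements.txt | sed -E 's/[<>=!~].*//' | sort -u > required.txt
# requirements that nothing imports directly (candidates to investigate, not delete blindly;
# case and naming differences between PyPI and import names will cause false positives)
comm -13 imported.txt required.txt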

I think a tool that would help with this would be pip_missing_reqs.
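For what it's worth, a hedged usage sketch: as far as I recall the tool is distributed on PyPI as pip-check-reqs these days and installs two console scripts, but check its documentation for the exact names and options.
pip install pip-check-reqs
pip-missing-reqs .    # imports under ./ with no matching requirement
pip-extra-reqs .      # requirements that nothing under ./ imports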

Related

How do I run a .py script?

I just started learning Python last week to automate some stuff I do (thanks to automatetheboringstuff.com). Assume I know nothing about programming. The only thing I know is HTML and CSS.
I created a simple automation workflow already, and I want to improve not the code (maybe in the future, because it's not yet finished) but how I maintain my setup/program on two laptops -- both Macs running High Sierra.
I have a .py file that contains my automated workflow. I don't know where to place it. It currently resides in my Dropbox so I can use it on laptop1 and laptop2.
I also created a virtualenv for each machine and did the requirements.txt thing as well (just to prep for the future). The directory on both is username/python/project_name.
I read in some posts that these files and other resources can live anywhere, inside each virtualenv or not, and that it's just a preference. I also read that it's not recommended to place the virtualenv itself inside apps like Dropbox (that's why I made a separate one on each laptop).
I switch between both laptops frequently. The environment which contains the packages doesn't really concern me that much when switching. It's the other files that are bothering me. For example, there's an image I need that has to be available on both laptops, so my solution is to have a Resources folder inside Dropbox as well. It currently looks like this:
Dropbox
    Projects
        Project 1 files (images, etc.)
        Project 2 files (images, etc.)
    Workflows (this would contain my completed .py files)
I read some stuff about virtualenvwrapper, but haven't looked at it yet. Maybe in the future when I do have more projects to manage, because right now it's just this one.
Lastly, I noticed that every time I open up Terminal and activate my virtualenv, the working directory is Users/username.
How can I set it to default to Dropbox/Projects/project_name? I always have to set it using chdir(). That way, when I do have multiple projects (and virtualenvs) I don't have to worry about where the files load/save.
Finally, how do I run the .py script? If I open IDLE, open the .py file there, and press F5, it runs properly. But as far as I know, that doesn't look into the virtualenv I set up. Is that correct?
I tried right-clicking the .py file, then Open With > Python Launcher, and I'm getting an error saying there are no modules found. It seems it's not loading the right virtualenv, so there must be something wrong with the file I made.
Then I read about the #! you place at the beginning of .py files, but I don't understand it. Can someone explain that further? Is that why my file isn't loading properly?
Thanks for helping out!
You can run .py scripts from the command line using:
python test.py
That tells the terminal to run test.py in the Python interpreter and send the output to your terminal, just like when you run it in IDLE. If your .py script is not in your current directory and you don't want to change directories, you can access it using its absolute path:
python /Users/username/Dropbox/Workflows/test.py
As long as you have already activated your virtualenv, it should run your script using only the libraries you have added to your virtualenv. Also, once your virtualenv is activated, you can move around directories using "cd" and it will bring your virtualenv with you.
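On the #! question: the shebang is the very first line of a script and tells the OS which interpreter to use when the file is executed directly, so it is also a way to pin a script to a particular virtualenv. A rough sketch, assuming your virtualenv's interpreter lives at /Users/username/python/project_name/bin/python (adjust the path to your setup). Make this the first line of test.py:
#!/Users/username/python/project_name/bin/python
Then mark the file executable once and run it by path, without activating anything first:
chmod +x /Users/username/Dropbox/Workflows/test.py
/Users/username/Dropbox/Workflows/test.py
Python Launcher, by contrast, typically runs a system-wide Python rather than your virtualenv, which would explain the missing-module errors. And if you want activation to drop you into the project folder, one common (unofficial) hack is to append a cd to the end of the virtualenv's activate script:
echo 'cd /Users/username/Dropbox/Projects/project_name' >> /Users/username/python/project_name/bin/activate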

How to copy an executable with all needed libraries?

I have two fairly identical (Linux) systems, but one has only a minimum set of packages installed. On one system I have a running (binary/ELF) executable which I want to copy over to the other system (the one with the minimum setup).
Now I need a way to copy all needed shared libraries as well. Currently I start the application on the source system and then go through the output of
lsof | grep <PID>
or
ldd <FILE>
to get a list of all libraries currently loaded by the application and copy them over manually.
Now my question is: before I start to automate this approach, run into lots of little problems, and end up with yet another reinvented wheel -- is there a tool which already automates this for me? The tool I'm dreaming of right now would work like this:
$ pack-bin-for-copy <MY_EXE>
which creates a .tgz with all the shared libraries needed to run this executable.
or
$ cp-bin <MY_EXE> user@target:/target/path/
which would just copy the binary over in one go.
Note: I do NOT need a way to professionally deploy an application (via RPM/apt/etc.). I'm looking for a 'just for now' solution.
One tool that does something similar to what you suggest is linuxdeploy. While the tool is intended to ease the creation of an AppImage (see here for more information), it will pack your executable with any dependencies into a directory. Then you can just create a 'tgz' file of that directory instead of an AppImage.
ldd usage is correct if you also enable -Wl,--no-dynamic-lookup at link time.
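If you do end up scripting it yourself, here is a rough sketch of the ldd-based packing idea (the script name is made up, ldd will not see libraries loaded via dlopen(), and ldd should not be run on untrusted binaries):
#!/bin/sh
# pack-bin-for-copy.sh <executable> -- bundle an ELF binary with the libraries ldd resolves for it
set -e
exe="$1"
# resolved "=>" paths plus the dynamic loader line; assumes the paths contain no spaces
libs=$(ldd "$exe" | awk '$3 ~ /^\// {print $3} $1 ~ /^\// {print $1}')
# -h dereferences symlinks such as libfoo.so.1 -> libfoo.so.1.2.3
tar czhf "$(basename "$exe").tgz" "$exe" $libs
Copy the resulting .tgz to the target, unpack it, and point LD_LIBRARY_PATH at the unpacked directory before starting the binary.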

How to create a Debian package which updates only required files while updating the package

After a few weeks of struggle I am able to create a medium-sized native Debian package which works well for installing and removing the package.
Quite good material for beginners: http://www.quietsche-entchen.de/cgi-bin/wiki.cgi/-wiki/CreatingDebianPackages, the Debian wiki (http://wiki.debian.org/HowToPackageForDebian), and http://www.debian.org/doc/manuals/maint-guide/.
I have a basic problem: when upgrading the package, all the files in data.tar.gz are updated by default.
I want only a few of the files in data.tar.gz to be updated, based on a key variable stored in all the files.
By the time the preinst script has executed and the package has been unpacked, all the files from data.tar.gz are already updated.
My idea was to take a backup of the files initially, before upgrading the package, and then check the key variable in each file; if the key variable is greater than the current one, replace the file.
That means writing a simple backup script and executing it from the postinst file.
I don't think this is a good idea, and moreover the limitations of dash scripting make it a very tough job.
What are you trying to accomplish here? During the reinstallation (or upgrading) of a Debian package, replacement of all of the non-conffiles with the latest version is exactly what's supposed to happen. If a file hasn't changed since the last installed version of the package then there's no harm in updating it anyway, and if it has changed, it's supposed to be updated.
If you have specific files which might be modified by the user and should be preserved across upgrades, make them conffiles. The package system will prompt the user and ask whether they want to keep the package maintainer's version or the locally modified version.
(But if you're going to make every file a conffile, then you're probably doing something wrong.)
To make a file a conffile, list it in debian/conffiles. But if the file is going to be installed under /etc then you don't need to do this because dh_installdeb will do it for you.
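For illustration, debian/conffiles is just a plain list of absolute paths, one per line (hypothetical path shown; files under /etc are picked up automatically as noted above):
/opt/mypackage/settings.ini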
EDIT following additional information in comment:
Suppose you have files test1.sh and test2.sh (among others) in your package. In the Debian world, they are either conffiles that are intended to be modified by the end user, or they're not.
conffiles should be relatively few in number and as short as possible, to minimize the burden of having to reconcile changes made by the package maintainer with conflicting changes made by the end user.
If there are things mixed into the code that the end user is likely to want to tune, try to factor them out into a configuration file. If you put that file in /etc, you don't even have to manually designate it as a conffile.
If the end user needs to make a change to a non-conffile, they should use the dpkg-divert protocol to (1) move the original file aside, and (2) edit a copy. Diverted files are respected by package upgrades. The end user who uses dpkg-divert should be aware that things might break after upgrades as a result, because the package maintainer hasn't foreseen that these files would be modified by end users and the locally modified version might be incompatible with a newly upgraded version of a different file. dpkg-divert should be used carefully and sparingly.
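To make the dpkg-divert protocol concrete, the local-modification dance looks roughly like this (the path is hypothetical; check the dpkg-divert man page on your system):
# register a local diversion; --rename moves the packaged file to test1.sh.distrib
dpkg-divert --local --rename --add /usr/share/mypackage/test1.sh
# start your local copy from the diverted original and edit it as needed
cp /usr/share/mypackage/test1.sh.distrib /usr/share/mypackage/test1.sh
Future upgrades will keep updating test1.sh.distrib and leave your test1.sh alone. To undo, delete your local copy first and then run dpkg-divert --local --rename --remove on the same path.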

let ./configure find library files in specific directory

I'm currently installing R software on a shared space across several servers. After installation I found that when I log in on different servers, R is not guaranteed to run, because some library files are missing on different machines.
Here is what I'm trying to do: since the installation of R is machine-dependent, I'd like to put all the missing library files, like libtermcap.so.2, libg2c.so.1, etc., into a single directory on the shared space, so that when I run ./configure it will also search this directory. Since this directory is shared, the installation would become machine-independent, so I won't need to add missing files on each server.
Is there an option to achieve this when I run ./configure? Thanks.
Assuming you have copied the library files to /shared/lib/ and the header files to /shared/include/, you can run
./configure LDFLAGS=-L/shared/lib CPPFLAGS=-I/shared/include ...other options...
Note, however, that you are bound to run into trouble at run time, when you have to convince your installation to use the shared libraries from the right directory, especially in case someone decides to upgrade the default version on the respective host. That whole business is platform and installation dependent. I think if your hosts are not at least mostly identical, you ought to install your software (R) locally in a way suitable to the respective system.
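One way to soften that run-time problem, assuming a GNU toolchain, is to bake the shared directory into the binaries' run-time search path when configuring, or to export it in each user's environment:
./configure LDFLAGS="-L/shared/lib -Wl,-rpath,/shared/lib" CPPFLAGS=-I/shared/include ...other options...
export LD_LIBRARY_PATH=/shared/lib:$LD_LIBRARY_PATH
Neither removes the versioning problem described above; it only controls where the lookup happens.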
Peter's answer is correct (+1), and please take special note of his suggestion to install locally. Using the local package management system and auto updating on each box is (in the long run) a much easier solution than trying to get compatible binaries/libraries on a shared drive. To simplify using Peter's solution, note that you can place the appropriate arguments in /shared/share/config.site. For example:
$ cat > /shared/share/config.site << EOF
: ${LDFLAGS=-L/shared/lib}
: ${CPPFLAGS=-I/shared/include}
EOF
Whenever you run configure with --prefix=/shared, the config.site file will be read and defaults will be set.
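If you configure with a different prefix, you can still point configure at the same file explicitly via the CONFIG_SITE variable, which autoconf-generated scripts consult before falling back to $prefix/share/config.site:
$ CONFIG_SITE=/shared/share/config.site ./configure ...other options...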

Capturing all the data that has changed during a Linux install

I am trying to figure out which files were changed when I run an app install via make install. I can look at the script, but that calls other scripts and may or may not touch other files, etc. How can I do this programmatically?
One existing implementation of this idea is checkinstall: http://asic-linux.com.mx/~izto/checkinstall/
Several ways come to mind. First, use some sort of LD_PRELOAD hook to track all the files that get opened. Second, compare the filesystem before and after.
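A low-tech version of the before/after comparison, assuming the install lands somewhere under /usr/local:
touch /tmp/install-marker
make install
# list everything created or modified since the marker was touched
find /usr/local -newer /tmp/install-marker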
If your kernel supports it, you can use inotify (a handy interface is inotify-tools) and watch your home directory, if the package was configured with --prefix=/home/myusername.
I've noticed that checkinstall (using installwatch via LD_PRELOAD) does not always catch everything; the last time I used it, it did not catch empty directories that were created for spooling, which caused the subsequently generated .debs to break.
Note: don't use inotify if you are installing to /; in that case you have to use installwatch, or just read all of the makefiles and install scripts closely.
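With inotify-tools installed and the --prefix=/home/myusername setup mentioned above, the watch could look roughly like this (run it from an interactive shell so the kill %1 job reference works):
inotifywait -m -r -e create -e modify -o /tmp/install.log /home/myusername &
make install
# stop the watcher and review the log
kill %1
less /tmp/install.log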
