Less tedious way to generate poetry dependencies? - python-3.x

Every time I start a new poetry project, I have to go through a tedious process of listing dependencies. These include:
Running poetry add for every dependency, one by one, even though each is already listed in my import block
Source diving to figure out the actual minimum version of each package, given the minimal functionality I use
Going down the rabbit hole of CPython code to figure out the minimum version of Python
I don't really like the Poetry approach of just requiring whatever version I have installed. As a developer, I tend to install bleeding-edge versions of packages and Python, which many of my users don't have. I then get annoying bug reports that come down to "the Python version is wrong", but the user is often very confused by the error messages. The process of finding minimum dependency versions is typically not very complicated; it's just tedious and doesn't scale.
Surely there is a tool out there that can do some static analysis and get me started with a basic dependency list? I understand that a perfect solution would likely be a lot of work, but a partial solution would be good enough for me. So long as it takes care of the bulk of the tedious work, I don't mind dealing with the handful of remaining corner cases by hand.
PyCharm seems able to at least compare the package names in requirements.txt to my imports. Unfortunately this doesn't work for Poetry dependencies, not even with the Poetry PyCharm Plugin installed.
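To make concrete what I mean by a partial solution, here is a rough sketch of the comparison step. It assumes Python 3.11+ (for tomllib and sys.stdlib_module_names) and a src/ layout, and it glosses over the fact that import names don't always match PyPI package names:
import ast
import sys
import tomllib  # standard library from Python 3.11; earlier versions need the tomli package
from pathlib import Path

def imported_names(root="src"):
    """Collect the top-level module names imported anywhere under `root`."""
    names = set()
    for path in Path(root).rglob("*.py"):
        tree = ast.parse(path.read_text(encoding="utf-8"))
        for node in ast.walk(tree):
            if isinstance(node, ast.Import):
                names.update(alias.name.split(".")[0] for alias in node.names)
            elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
                names.add(node.module.split(".")[0])
    return names

def declared_dependencies(pyproject="pyproject.toml"):
    """Read the names already declared under [tool.poetry.dependencies]."""
    with open(pyproject, "rb") as f:
        data = tomllib.load(f)
    deps = data.get("tool", {}).get("poetry", {}).get("dependencies", {})
    return {name.lower() for name in deps if name.lower() != "python"}

missing = {name.lower() for name in imported_names()}
missing -= {name.lower() for name in sys.stdlib_module_names}  # drop standard-library modules
missing -= declared_dependencies()
print("Imported but not declared in pyproject.toml:", sorted(missing))
Something along these lines would already remove most of the poetry add busywork; mapping import names to distribution names and hunting down minimum versions is the part I'd still have to do by hand.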

Related

Support several versions of a package if version change breaks code

I have a Python 3 project which uses the Biopython package. One of its modules was removed in the latest version, so I have to change a small piece of code to support this change. On the other hand, this change would break my code for all "old" versions of Biopython (which are heavily used on production systems).
My questions:
What is the proper way to deal with this?
If this makes sense: How do I support old and new package versions at the same time? Do I perform a runtime check to see which version I have and then run different code? Or is this a bad idea? If you think this is the way to go: Is there a standard way to do this?
The simplest way to ensure a specific version is present is to pin that version in your requirements.txt file (or other dependency specifications). There are plenty of systems which rely on legacy versions of packages, and especially for a package without any security implications this is totally reasonable.
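For example, a pinned requirement is a single line (the version number here is purely illustrative; pin whichever release you have actually tested against):
biopython==1.76  # illustrative pin; use the release your code is known to work with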
If supporting multiple versions is your goal, you could perform some basic checks during your package import process, in an __init__.py file or elsewhere. This pattern is somewhat common, and it was especially useful for version compatibility between Python 2 and 3:
def foo_function():
    # Fallback implementation used when the library doesn't provide foo.
    return

try:
    # Prefer the library's implementation if this version still ships it.
    import biopython.foo as foo
except (ImportError, AttributeError):
    # Versions without the module fall back to our own implementation.
    foo = foo_function

foo()
I have seen this countless times in the wild on GitHub--of course now that I try to find an example I cannot--but I will update this answer with an example when I do.
EDIT: If it's good enough for NumPy, it's probably good enough for the rest of us. numpy_base.pyi L7-13
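If you would rather make the version explicit instead of relying on the try/except import above, a runtime check also works. A minimal sketch, assuming Python 3.8+ for importlib.metadata; the threshold below is just a placeholder for whichever release removed the module in your case:
from importlib.metadata import version

REMOVED_IN = (1, 78)  # placeholder: the release in which the module you need was removed

def biopython_version():
    # "1.79" -> (1, 79); good enough for Biopython's MAJOR.MINOR scheme
    return tuple(int(part) for part in version("biopython").split(".")[:2])

if biopython_version() >= REMOVED_IN:
    # code path for newer releases that no longer ship the module
    ...
else:
    # code path for older releases that still have it
    ...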

What does cabal mean when it says "The following packages are likely to be broken by the reinstalls"

I've seen this message pop up a couple times when running cabal v1-install with a suggestion to use --force-reinstalls to install anyway. As I don't know that much about cabal, I'm not sure why a package would break due to a reinstall. Could someone please fill me in on the backstory behind this message?
Note for future readers: this discussion is about historical matters. For practical purposes, you can safely ignore all of that if you are using Cabal 3.
The problem had to do with transitive dependencies. For instance, suppose we had the following three packages installed at specific versions:
A-1.0;
B-1.0, which depends on A; and
C-1.0, which depends on B, but not explicitly on A.
Then, we would install A-1.1, which seemingly would work fine:
A-1.1 would be installed, but the older A-1.0 version would be kept around, solely for the sake of other packages built using it;
B-1.0 would keep using A-1.0; and
C-1.0 would keep using B-1.0.
However, there would be trouble if we, for whatever reason, attempted to reinstall B-1.0 (as opposed to, say, updating to B-1.1):
A-1.1 and A-1.0 would still be available for other packages needing them;
B-1.0, however, would be rebuilt against A-1.1, there being no way of keeping around a second installation of the same version of B; and
C-1.0, which was built against the replaced B-1.0 (which depended on A-1.0), would now be broken.
v1-install provided a safeguard against this kind of dangerous reinstall. Using --force-reinstalls would disable that safeguard.
For a detailed explanation of the surrounding issues, see Albert Y. C. Lai's Storage and Identification of Cabalized Packages (in particular, the example I used here is essentially a summary of its Corollary: The Pigeon Drop Con section).
While Cabal 1, in its later versions, was able to, in the scenario above, detect that the reinstall changed B even though the version number remained the same (which is what made the safeguard possible), it couldn't keep around the two variants of B-1.0 simultaneously. Cabal 3, on the other hand, is able to do that, which eliminates the problem.

How does one find and understand excess data dependencies in a Haskell program

How does one find and understand excess data dependencies in a Haskell program so that one is able to eliminate them?
I once used ghc-vis to investigate data dependencies in a Haskell program, but Stack has moved on and ghc-vis no longer installs alongside most current development setups, so it is no longer an option, and I wonder what people use these days instead.
Try to fix ghc-vis (or actually, its dependencies).
From the logs you reported on the ghc-vis issue tracker (https://github.com/def-/ghc-vis/issues/24), the errors all fall into the two categories below. Neither requires expertise specific to the broken packages, so you should be able to fix them yourself; that's the beauty of open source:
Failed to load interface... There are files missing: this might be related to your Haskell distribution. How did you install Haskell? For example, Haskell packages on Arch are dynamically linked: https://wiki.archlinux.org/index.php/Haskell
Ambiguous occurrence: at least one package you depend on exports a name which clashes with the actually intended name. Look at the broken package and fix its version bounds or fix its imports.
At this point, the problems you are encountering have little to do with ghc-vis, but with wl-pprint-text, polyparse, and cairo.

Should I use Anaconda on Ubuntu WSL for other reasons than datascience?

I've been away from Python for a while (just a normal guy trying to learn it) and wanted to start learning again. I came across Anaconda and I am trying to figure out whether to use it or stick to pip. There are several existing questions about this, but they are all aimed at people using Python for data science, which I use R for.
So, is there any reason for me to use Anaconda if I am not planning on working with data science?
Sorry if this question seems easy to answer by searching, but I cannot find any information about this that isn't related to data science.
Off-hand, I'd say no. Anaconda simply bundles many (not all) of the standard data-science packages with Python. If you go with base Python, you get a much smaller install. One thing that is useful, though, is conda: I find it's pretty good at managing package versions, which can be pretty problematic otherwise. But you can probably get by without it.

Is there a generic way to consume my dependency's grunt build process?

Let's say I have a project where I want to use Lo-Dash and jQuery, but I don't need all of the features.
Sure, both of these projects have build tools so I can compile exactly the versions I need to save valuable bandwidth and parsing time, but I think it's quite uncomfortable and ugly to install both of them locally, generate my custom versions, and then check them into my repository.
I'd much rather integrate their Grunt process into my own and create custom builds on the fly, which would be much more maintainable.
The Lo-Dash team offers this functionality with a dedicated CLI and even wraps it in a Grunt task. That's very nice indeed, but I want a generic solution to this problem, as it shouldn't be necessary for every package author to replicate this.
I tried to achieve this with some grunt-shell hackery, but as far as I know it's not possible to install devDependencies more than one level deep, which makes it even more ugly to execute the required Grunt tasks.
So what's your take on this, or should I just move this over to the 0.5.0 discussion of grunt?
What you ask assumes that the package has:
A dependency on Grunt to build a distribution; most popular libraries have this, but some of the less common ones may still use shell scripts or the npm run command for general minification/compression.
Some way of generating a custom build in the first place with a dedicated tool, as Modernizr and Lo-Dash have.
You could perhaps substitute number 2 with a generic one that parses both your source code and the library code and uses code coverage to eliminate unnecessary functions from the library. This is already being developed (see goldmine), however I can't make any claims about how good that is because I haven't used it.
Also, I'm not sure how that would work in an AMD context where there are a lot of interconnected dependencies; ideally you'd be able to run the r.js optimiser and get an almond build for production, then filter that for unnecessary functions (most likely with Istanbul), and then make sure that the filtered script still passes all your unit/integration tests. I'm not sure how that would end up looking, but it'd be pretty cool if it could happen. :-)
However, there is a task especially for running Grunt tasks from 'sub-gruntfiles' that you might like to have a look at: grunt-subgrunt.
