FYI, I'm quite new to Python, and its packaging and dependency tools seem confusing.
I am going to be writing a series of Python packages that support DAGs running in Apache Airflow. As these packages share some common functionality, I want to extract the commonalities into separate supporting modules. In turn, these supporting modules will rely on at least two other supporting modules. All of the modules/packages in question will be published as source distributions on an internal repository.
Is there a way for me to install the main packages such that all of the direct and indirect dependencies are installed from the private repo?
I have made use of install_requires in setup.py to install modules available via PyPI, and it seems like I could do something similar to achieve my goals. However, this seems like it could get messy when I need to update, say, the version of the indirect dependencies. Is there a better way to handle this? Would adding the dependencies to requirements.txt with an --extra-index-url argument be a valid approach?
The hierarchy of dependencies can be represented loosely as:
MainPackage
-> SupportingPackage
-> CommonUtilities
It is possible to use a git repository as a Python package source.
Just add git+{REPO_LINK}@{TAG_OR_SHA1} to requirements.txt and run pip install -r requirements.txt.
See How to add git source in requirements.txt.
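As a rough illustration of the format above, the following sketch builds such requirement lines in Python. The package names, repository URLs, and tags are made-up placeholders. The `name @ git+URL@ref` form is the direct-reference syntax that recent pip versions accept in requirements.txt, so treat this as a sketch under those assumptions rather than a recommended layout.

```python
def git_requirement(name, repo_url, ref):
    """Return a requirement line that installs `name` from a git repo at `ref`."""
    return f"{name} @ git+{repo_url}@{ref}"


# Hypothetical internal packages, each pinned to a tag (a commit SHA also works).
requirements = [
    git_requirement(
        "supporting-package",
        "https://git.example.com/team/supporting-package.git",
        "v1.4.0",
    ),
    git_requirement(
        "common-utilities",
        "https://git.example.com/team/common-utilities.git",
        "v0.9.2",
    ),
]

# These lines could be written to requirements.txt as-is.
print("\n".join(requirements))
```

When pip installs one of these packages, it also resolves whatever that package declares in its own install_requires, so indirect dependencies follow from each package's metadata.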
Related
I currently have a Dockerfile that installs Python libraries in the container, which I eventually use to execute code. For every release, I need to add or update dependencies, which means rebuilding the image.
The issue is that during a rebuild, many internal transitive libraries create version conflicts that affect my functionality. For example, some library can bring in a new numpy version, which can break the code.
How should I handle this problem? Should I create a new base image for every release and update it in the Dockerfile?
Edit: Caching does not help me, because the moment my requirements.txt file changes, a rebuild will happen.
Also, I can't specify versions for all libraries. Transitive libraries are a challenge here.
This is not related to Docker. You can either pin the package versions in requirements.txt or use Poetry to manage the dependencies. Poetry uses a lock file, which ensures that the same version of every dependency, including transitive ones, is installed each time.
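One way to pin everything, including transitive dependencies, is to record the exact versions from an environment that is known to work, similar in spirit to pip freeze. Below is a minimal sketch using only the standard library (Python 3.8+). The function name is an assumption made for illustration.

```python
from importlib.metadata import distributions


def pinned_requirements():
    """Return 'name==version' lines for every distribution installed in this environment."""
    pins = {}
    for dist in distributions():
        meta = dist.metadata
        name = meta["Name"] if meta is not None else None
        if name:  # skip broken installs that have no usable metadata
            pins[name.lower()] = f"{name}=={dist.version}"
    # Sort by lowercased package name so the output is stable between runs.
    return [pins[key] for key in sorted(pins)]


if __name__ == "__main__":
    # Redirect this output to a file and install from it in the Dockerfile.
    print("\n".join(pinned_requirements()))
```

Because the output lists transitive packages as well, a library can no longer pull in a newer numpy on rebuild unless the pinned file itself changes.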
I am trying to create a Python wheel for Great_Expectations. The .whl provided by Great_Expectations is available at https://pypi.org/project/great-expectations/#files (great-expectations 0.13.25). Unfortunately, it appears that this .whl doesn't contain all the libraries I need in order to work with Great_Expectations in an Azure Synapse Apache Spark pool.
It looks like I will either have to create my own Great_Expectations package (a Python project with all of its dependencies, as a .whl for offline install) or, at the very least, establish which libraries are contained in the existing great-expectations 0.13.25 package.
Can someone let me know how to create a Python wheel (i.e. a Python package with all of its dependencies) for Great_Expectations? Alternatively, can someone let me know how to determine which modules/dependencies are contained in a package?
Thanks
To add new dependencies, update requirements.txt. (Normally you would update install_requires in setup.py, but this project reads the requirements file to populate it.)
You will need to clone the git repo in order to update that list.
Then to create a new wheel out of that source, just run:
python setup.py bdist_wheel
(You may need to run pip install wheel first if wheel isn't installed.)
Docs: wheel
To the second question: What modules / libraries are within a Python Wheel?
Just the package itself. Its dependencies are not bundled in the wheel; they are listed in the wheel's metadata and are downloaded and installed when you install the package.
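To see which dependencies a wheel declares without installing it, you can read its metadata directly, since a .whl file is a zip archive containing a *.dist-info/METADATA file. The sketch below assumes that layout. The function name and the example file name are hypothetical.

```python
import zipfile
from email.parser import Parser


def wheel_requirements(wheel_file):
    """Return the Requires-Dist entries declared in a wheel's METADATA file.

    `wheel_file` may be a path to a .whl file or an open binary file object.
    """
    with zipfile.ZipFile(wheel_file) as wheel:
        # Every wheel carries exactly one <name>-<version>.dist-info/METADATA entry.
        metadata_name = next(
            name for name in wheel.namelist() if name.endswith(".dist-info/METADATA")
        )
        text = wheel.read(metadata_name).decode("utf-8")
    # METADATA uses RFC 822 style headers, so the email parser can read it.
    return Parser().parsestr(text).get_all("Requires-Dist") or []


# Example usage (the file name is a placeholder for a wheel you downloaded):
# for requirement in wheel_requirements("great_expectations-0.13.25-py3-none-any.whl"):
#     print(requirement)
```

Each returned entry is a requirement string. To build an offline bundle, each of those packages (and their own dependencies) would need to be fetched as well.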
Consider using conda-pack. It was created explicitly for this use case: making Python/Conda environments easily portable.
I am interested in understanding how to install node-sqlite3 but provide my own precompiled package of sqlite3: I just want to install the Node client and skip the build phase entirely during install.
How can I do this?
(Reasoning: I am going to test the module in multiple environments and have already read countless posts of people having build issues in various environments, so I'd rather manually compile myself.)
It turns out that I was looking for a package like this one:
dblite on npm, GitHub
I am currently building some docker images.
I found that the Linux distribution I was using was hard to adapt to Docker multi-stage builds until I found Nix.
With Nix, I can copy files among images (COPY --from=source/image /nix/store /nix/store) without worrying about conflicts and breaking things.
But I found that running the nix-env -i curl command installed too many things:
warning: there are multiple derivations named 'curl-7.60.0'; using the first one
installing 'curl-7.60.0'
these paths will be fetched (49.44 MiB download, 203.64 MiB unpacked):
/nix/store/0yaiablzxhd8ki5qan156ydz78grlav7-nghttp2-1.32.0-bin
/nix/store/0zvcf4dnlcd4bk84qmxcxm1pbc534chv-openssl-1.0.2o-bin
/nix/store/3xvnr0y2mx7g8b796rb9p77bjfbaw03h-linux-headers-4.15
/nix/store/4bikvz91b83sycavf35lmby65m6zxgch-libssh2-1.8.0-dev
/nix/store/504vcw350rp1yh31razv0mq2vsgp0izh-libkrb5-1.15.2-dev
/nix/store/5gzy6cacylfb0lha2yd0i0as0k1d0d5v-libev-4.24
/nix/store/5xnniwzazzlg6qinhrwammxxwsq5c1di-nghttp2-1.32.0-dev
/nix/store/7l1smzwil1kxyyfayzl6lg1hw9m4iwmw-nghttp2-1.32.0
/nix/store/8zkg9ac4s4alzyf4a8kfrig1j73z66dw-bash-4.4-p23
/nix/store/93ljbaqhsipwamcn1acrv94jm6rjpcnd-acl-2.2.52
/nix/store/dgp8mnf40pmwh8ghpcfda1vcwcy34w6z-curl-7.60.0-devdoc
/nix/store/gbddfvxzjjqpgkr17whn8ynh9z8afz8l-curl-7.60.0-debug
/nix/store/imfm3gk3qchmyv7684pjpm8irvkdrrkk-gcc-7.3.0
/nix/store/jg9yh6cm4iwcpl4l18g7mr9y7sdwav5q-curl-7.60.0-dev
/nix/store/jsmnk16iwb9xrm3c6jv2fyxkh7xr7q3j-curl-7.60.0-man
/nix/store/lyd89mv72m8a0aw1a4idfimyi0rb2b13-glibc-2.27-dev
/nix/store/n7qp8pffvcb5ff52l2nrc3g2wvxfrk75-coreutils-8.29
/nix/store/pa4q0szxz23bd6srry91gmw08fmwgfw2-libkrb5-1.15.2
/nix/store/q239yikz665n4a5rff7rg2vc7jpay6xb-openssl-1.0.2o-dev
/nix/store/rmq6gnybmxxzpssj3s63sfjivlq4inrm-attr-2.4.47
/nix/store/szdi35clpzj13c8dhfzh55fj6hk0z8j6-glibc-2.27-bin
/nix/store/v5xh3glylamhfg586hcykn6hlk4n41dh-nghttp2-1.32.0-lib
/nix/store/vawc9a89l53mf05yq0k1910q7dakd99w-perl-5.24.3
/nix/store/vl5k9m1pjkd6cm9125afic1kj06y4i6b-curl-7.60.0-bin
/nix/store/y8cfvcvya61l260jil989lcmkia5b5gh-zlib-1.2.11-dev
/nix/store/z4k2pbdd8pz9mjc0p5394j0zp435fcc5-curl-7.60.0
It is important to keep Docker images slim, and I do not think curl needs dependencies like gcc or linux-headers at runtime.
Is there a way for Nix to exclude the dependencies of these source or dev libraries?
Build dependencies become runtime dependencies whenever a path name to the build dependency is included in the package. This is necessary because there is no general way to tell whether such a reference is actually used by a program.
The best way to avoid having build dependencies in your closures is by not referencing them in the first place. The next best thing is to understand why the reference is there and, if safe, modify the package build script to remove the reference.
In order to figure out where these references come from, you can make use of the Nix 2.0 nix why-depends command. It will tell you the shortest path, or all paths that lead from the first argument package to the second argument package. You can also use store paths instead of the attribute paths in the examples of nix why-depends --help.
The method for removing the dependency depends on the referencing package, so there's no general formula for that. General hacks to remove the reference in unsafe ways exist, but they are probably not worth the risk.
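To make the first point concrete: a build dependency counts as a runtime dependency as soon as its store path appears anywhere in the package's files. The sketch below is a simplified imitation of that kind of scan, not Nix's actual implementation (Nix looks for the hash part of the paths it already knows about). The function name and the sample bytes are made up for illustration.

```python
import re

# A store path is /nix/store/<32-char hash>-<name>. The hash uses Nix's
# base-32 alphabet, which omits the letters e, o, t and u.
STORE_PATH = re.compile(rb"/nix/store/[0-9a-df-np-sv-z]{32}-[0-9A-Za-z+._?=-]+")


def find_store_references(data):
    """Return the sorted, de-duplicated store paths mentioned in a blob of bytes."""
    return sorted({match.group(0).decode() for match in STORE_PATH.finditer(data)})


# Made-up binary content that embeds a path to gcc, e.g. in an RPATH entry.
sample = b"\x7fELF...RPATH=/nix/store/imfm3gk3qchmyv7684pjpm8irvkdrrkk-gcc-7.3.0/lib\x00..."
print(find_store_references(sample))
```

Any path found this way stays in the closure, whether or not the program ever uses it, which is why removing the reference from the build output is the only way to drop the dependency.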
I am coming over to Go from Node.js, and I am used to adding all my modules and then having people just run npm install when cloning my package.
What is the equivalent of this with Go? I have a few imports and don't want people to have to manually install it if they use my package.
I am also not sure whether creating a simple Go application with just a package main allows people to simply go get it. I have really picked up on the Go way of sharing repos, much like Node.js.
You don't have to do anything. People will not have to manually install the packages you import. When someone does
go get github.com/FrickeFresh/awesome
all of the dependencies you import in your awesome package will be downloaded automatically as needed.
By default, go get skips the packages needed only to build tests, but a user can download those too by including -t:
go get -t github.com/FrickeFresh/awesome
But that's not something you need to worry about.
If you want to delve into vendoring specific versions of dependencies, there are a number of articles/tools available. One such tool is dep, the official experiment, which has since been superseded by Go modules:
https://github.com/golang/dep
Basically you should take a look at vendoring. There exist tools that help you with versioning. Personally, I use vendetta which is just a little tool that "go gets" the referenced packages as git submodules into the vendor folder. So if anyone checks out my repo they simply do git submodule update --init --recursive. The package version can be specified as a git commit id in the respective submodule.
There are also tools where you maintain the dependencies in a file.