data_files differences between pip and setuptools

I have a Python application that comes with a setup.py script and can be installed via Pip or setuptools. However, I'm finding some annoying differences between the two methods, and I want to know the correct way of distributing data files.
import glob
import setuptools

long_description = ''

setuptools.setup(
    name='creator-build',
    version='0.0.3-dev',
    description='Meta Build System for Ninja',
    long_description=long_description,
    author='Niklas Rosenstein',
    author_email='rosensteinniklas@gmail.com',
    url='https://github.com/creator-build/creator',
    py_modules=['creator'],
    packages=setuptools.find_packages('.'),
    package_dir={'': '.'},
    # Ship the built-in unit files alongside the package.
    data_files=[
        ('creator', glob.glob('creator/builtins/*.crunit')),
    ],
    scripts=['scripts/creator'],
    classifiers=[
        "Development Status :: 5 - Production/Stable",
        "Programming Language :: Python",
        "Intended Audience :: Developers",
        "Topic :: Utilities",
        "Topic :: Software Development :: Libraries",
        "Topic :: Software Development :: Libraries :: Python Modules",
    ],
    license="MIT",
)
Using Pip, the files specified in data_files end up in sys.prefix + '/creator'.
Using setuptools (that is, running setup.py directly), the files end up in lib/python3.4/site-packages/creator_build-0.0.3.dev0-py3.4.egg/creator.
Ideally, I would like the files to always end up in the same location, independent from the installation method. I would also prefer the files to be put into the module directory (the way setuptools does it), but that could lead to problems if the package is installed as a zipped Python Egg.
How can I make sure the data_files end up in the same location with both installation methods? Also, how would I know if my module was installed as a zipped Python Egg and how can I load the data files then?

I've been asking around, and the general consensus, including the official docs, is:
Warning: data_files is deprecated. It does not work with wheels, so it should be avoided.
Everyone appears to be pointing towards include_package_data instead.
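For reference, a minimal sketch of that route, assuming the .crunit files are moved inside the creator package (the MANIFEST.in pattern is an assumption, not from the original post):

# setup.py (sketch)
import setuptools

setuptools.setup(
    name='creator-build',
    packages=setuptools.find_packages(),
    include_package_data=True,  # pull in the files matched by MANIFEST.in
)

# MANIFEST.in (sketch) would contain:
# recursive-include creator/builtins *.crunit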
There's a drawback here: it doesn't allow including anything outside your source root, which means that if creator sits outside creator-build, it won't be included. Even package_data has this limitation.
The only workaround, if your data files live outside your source files (for instance, I'm trying to include examples/*.py for reasons we don't need to discuss), is to hot-swap them in, run the setup, and then remove them:
import glob
import shutil
import setuptools

with open("README.md", "r") as fh:
    long_description = fh.read()

# Temporarily copy the examples into the package so package_data can see them.
shutil.copytree('examples', 'archinstall/examples')

setuptools.setup(
    name="archinstall",
    version="2.0.3rc4",
    author="Anton Hvornum",
    author_email="anton@hvornum.se",
    description="Arch Linux installer - guided, templates etc.",
    long_description=long_description,
    long_description_content_type="text/markdown",
    url="https://github.com/Torxed/archinstall",
    packages=setuptools.find_packages(),
    classifiers=[
        "Programming Language :: Python :: 3.8",
        "License :: OSI Approved :: GNU General Public License v3 (GPLv3)",
        "Operating System :: POSIX :: Linux",
    ],
    python_requires='>=3.8',
    # The patterns are resolved relative to the archinstall package, so
    # 'examples/*.py' matches the files just copied to archinstall/examples.
    package_data={'archinstall': glob.glob('examples/*.py')},
)

# Remove the temporary copy again.
shutil.rmtree('archinstall/examples')
This is at best ugly, but works.
My folder structure for reference is (in the git repo):
.
├── archinstall
│   ├── __init__.py
│   ├── lib
│   │   ├── disk.py
│   │   └── exceptions.py
│   └── __main__.py
├── docs
│   └── logo.png
├── examples
│   ├── guided.py
│   └── minimal.py
├── LICENSE
├── profiles
│   ├── applications
│   │   ├── awesome.json
│   │   ├── gnome.json
│   │   ├── kde.json
│   │   └── postgresql.json
│   ├── desktop.py
│   ├── router.json
│   ├── webserver.json
│   └── workstation.json
├── README.md
└── setup.py
And this is the only way I can see to include, for instance, my profiles as well as my examples without moving them out of the root of the repository (which I'd prefer not to do, as I want users to easily find them when navigating to the repo on GitHub).
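Extending the same hot-swap trick to profiles might look like this (a sketch; the glob patterns are assumptions based on the tree above):

import glob
import shutil
import setuptools

# Temporarily copy both directories into the package.
shutil.copytree('examples', 'archinstall/examples')
shutil.copytree('profiles', 'archinstall/profiles')

setuptools.setup(
    name="archinstall",
    packages=setuptools.find_packages(),
    # Patterns are resolved relative to the archinstall package directory.
    package_data={'archinstall': (
        glob.glob('examples/*.py')
        + glob.glob('profiles/*.py')
        + glob.glob('profiles/*.json')
        + glob.glob('profiles/applications/*.json')
    )},
)

# Clean up the temporary copies again.
shutil.rmtree('archinstall/examples')
shutil.rmtree('archinstall/profiles')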
One final note: if you don't mind polluting the src directory (in my case that's just archinstall), you could symlink in whatever you need to include instead of copying it.
cd archinstall
ln -s ../examples ./examples
ln -s ../profiles ./profiles
That way, when setup.py or pip installs the package, the linked files end up inside the package directory as their root.
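Once the files live inside the package directory, they can be read in a zip-safe way, which also answers the zipped-egg concern from the first question (the file name below is just an example):

import pkgutil

# Works whether the package is installed flat, as a wheel, or as a zipped egg.
data = pkgutil.get_data('archinstall', 'examples/guided.py')
if data is not None:
    print(data.decode('utf-8'))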

Related

Where to place "common command line" tools in a standard library Rust package structure?

I'm developing a library. For internal development experiments I need a bunch of binary command line entry points that I can execute. I'm using a standard project layout as described in this answer under "Flexible".
What I'd like to achieve: Certain functionality in my command line tools is similar, and I'd like to move that out into its own module. Ideally I'd like to introduce a cli_common.rs module containing some helper functions here in the package structure:
.
├── Cargo.toml
└── src
    ├── bin
    │   ├── cli_common.rs
    │   ├── run_foo.rs (uses cli_common)
    │   └── run_bar.rs (uses cli_common)
    ├── lib.rs
    └── <various-lib-related-sub-modules>
This doesn't seem to be possible, because the compiler expects every module under bin to have a main function.
This suggests that I have to move the functionality of cli_common.rs into the library itself. This doesn't feel great, because it mixes "peripheral" logic with "core" logic, and I'd like to avoid having this functionality as part of the public interface of the library.
Is there a trick to have such a cli_common.rs without having to move it into the library?
I found a solution based on this answer.
For simple cases, an extra crate can be avoided by moving cli_common.rs somewhere the compiler doesn't expect a main function. In my example, I modified the structure to:
.
├── Cargo.toml
└── src
    ├── bin
    │   ├── cli_common
    │   │   └── cli_common.rs
    │   ├── run_foo.rs
    │   └── run_bar.rs
    ├── lib.rs
    └── <various-lib-related-sub-modules>
Within run_foo.rs and run_bar.rs, I can now use mod combined with a #[path] attribute:
#[path="cli_common/cli_common.rs"]
mod cli_common;
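For illustration, a minimal sketch of the two files (parse_verbosity is a hypothetical helper, not from the original question):

// src/bin/cli_common/cli_common.rs
pub fn parse_verbosity(args: &[String]) -> usize {
    // Count the -v flags passed on the command line.
    args.iter().filter(|a| a.as_str() == "-v").count()
}

// src/bin/run_foo.rs
#[path = "cli_common/cli_common.rs"]
mod cli_common;

fn main() {
    let args: Vec<String> = std::env::args().collect();
    println!("verbosity: {}", cli_common::parse_verbosity(&args));
}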
For more complex scenarios, going for a separate crate may be preferred.

geopandas read shp file with fiona.errors.DriverError when wrapping up as a package

I have a package structure like this:
├── LICENSE
├── README.md
├── main
│   ├── __init__.py
│   ├── application.py
│   └── core_function
│       ├── __init__.py
│       └── maps
│           ├── Taiwan
│           ├── Taiwan_detailed
│           └── taiwan.txt
└── setup.py
I try to install this package in development mode with python setup.py develop.
When application.py runs
Taiwan = gpd.read_file(pkg_resources.resource_stream(__name__, 'maps/Taiwan/COUNTY_MOI_1090820.shp'))
it fails with
fiona.errors.DriverError: '/vsimem/9b633f8a8a3f457eadf710539afd2a22' not recognized as a supported file format.
or
fiona._err.CPLE_OpenFailedError: '/vsimem/9b633f8a8a3f457eadf710539afd2a22' not recognized as a supported file format.
It reads perfectly when I run it as a script on my machine, but it fails as a package.
Since a .shp file must be read along with all its companion files in the same folder, in my setup.py I also include them:
packages=setuptools.find_packages(),
package_data={'maps': ['main/core_function/maps/*', 'core_function/maps/Taiwan/*']},
I was thinking the problem is about the path, but taiwan.txt can be read.
Any suggestion is appreciated. Thanks in advance.
So far I have not found the reason.
Instead, I used the to_file method and now work with only one file:
.to_file("package.gpkg", driver="GPKG")
This works in my package. The problem may be due to reading multiple files.
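A sketch of that workaround (assumptions: the conversion to GeoPackage is done once during development, and pkg_resources.resource_filename is used at runtime because it yields a real filesystem path that fiona can open):

import geopandas as gpd
import pkg_resources

# One-off conversion: merge the multi-file shapefile into a single GeoPackage.
gdf = gpd.read_file('main/core_function/maps/Taiwan/COUNTY_MOI_1090820.shp')
gdf.to_file('main/core_function/maps/package.gpkg', driver='GPKG')

# At runtime, resolve the bundled file to a real path before reading it.
Taiwan = gpd.read_file(
    pkg_resources.resource_filename('main.core_function', 'maps/package.gpkg')
)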

Can I swap out the library used by a binary with a wrapper when building a third-party crate?

Let's say there is a vendored third-party cargo project consisting of a library plem and a binary plem_main that I want to extend with some functionality of my own. Crucially, the functionality needs to go in the library plem, not the binary plem_main (which can stay the same). I could write a wrapper my_plem around the library that offers the same interface to the binary, but with the extra functionality included. The project would be set up like this:
.
├── Cargo.toml
├── my_plem
│   ├── Cargo.toml
│   └── src
│       └── lib.rs
└── third-party
    ├── plem
    │   ├── Cargo.toml
    │   └── src
    │       └── lib.rs
    └── plem_main
        ├── Cargo.toml
        └── src
            └── main.rs
my_plem/src/lib.rs would depend on things in third-party/plem/src/lib.rs and reexport or overwrite the functions exported by the latter. Is there a good way to get cargo to build the binary plem_main on top of my_plem instead of plem?
"Best" here means that the solution has no or minimal merge conflicts when updating plem in my project and doesn't duplicate the code of plem_main. Ideally it does not touch third-party at all.

Updating the script present in Poky Source code

meta/recipes-core/initrdscripts/files/init-install-efi.sh is used for formatting and creating partitions.
I have modified this file to create one more partition for software updates.
Can I copy the newly updated script file into my own custom layer as recipes-core/initrdscripts/files/init-install-efi.sh?
Will it update init-install-efi.sh? If not, how do I achieve this? I don't want to touch the poky source code, as that is fetched using the repo utility.
$ tree meta-ncr/
meta-ncr/
├── conf
│   ├── bblayers.conf
│   ├── layer.conf
│   └── machine
│       └── panther2.conf
├── recipes-core
│   └── initrdscripts
│       ├── files
│       │   └── init-install-efi.sh
│       └── initramfs-live-install-efi_1.0.bbappend
└── scripts
    └── setup-environment
$ cat meta-ncr/recipes-core/initrdscripts/initramfs-live-install-efi_1.0.bbappend
FILESEXTRAPATHS_prepend := "${THISDIR}/files:"
SRC_URI = "file://init-install-efi.sh"
After debugging, I found that it is copying the script present in the meta-intel layer and not the one from my layer.
This is from the output of bitbake-layers show-appends
initramfs-live-install-efi_1.0.bb:
/home/jamal/repo_test/sources/meta-intel/recipes-core/initrdscripts/initramfs-live-install-efi_%.bbappend
/home/jamal/repo_test/sources/meta-ncr/recipes-core/initrdscripts/initramfs-live-install-efi_1.0.bbappend
Can you please tell me what changes are required for my bbappend to take effect instead of meta-intel's?
Yocto provides the bbappend mechanism to achieve your goal without touching the metadata from poky. Please follow these few steps:
Create a new layer or use your existing one.
In this layer, create a bbappend file for initramfs-module-install-efi_1.0.bb or initramfs-live-install-efi_1.0.bb (I found that these recipes are based on this script), with the content:
$ cat meta-test/recipes-core/initrdscripts/initramfs-live-install-efi_1.0.bbappend
FILESEXTRAPATHS_prepend := "${THISDIR}/files:"
SRC_URI = "file://init-install-efi.sh"
Move the modified script file under the files directory. Your meta layer structure should look like:
$ tree meta-test/
meta-test/
├── conf
│   └── layer.conf
├── COPYING.MIT
├── README
└── recipes-core
    └── initrdscripts
        ├── files
        │   └── init-install-efi.sh
        └── initramfs-live-install-efi_1.0.bbappend
4 directories, 5 files
Finally, after running the do_unpack task on the initramfs-live-install-efi recipe, you will find your modified file in the recipe workspace:
$ bitbake -c unpack initramfs-live-install-efi
Test:
$ cat tmp/work/i586-poky-linux/initramfs-live-install-efi/1.0-r1/init-install-efi.sh
#!/bin/bash
echo "hello"
FILESEXTRAPATHS is used to extend the search path for the do_fetch and do_patch tasks.

any specific example of making a dictionary using logios

I followed this tutorial trying to build a dictionary, but I think the logios package is generally used with gram and classes files given as input. If I understood it correctly, the dictionary should be built by feeding in either a sample text file or a vocabulary file; there shouldn't be any gram file involved. So my questions are:
A) Is my understanding about the input correct?
B) Based on the above assumption, I guess the correct way of using the package is not building it all, but rather making use of a component inside the Tools directory named MakeDict. But I couldn't find any solid example of how to use this tool. So I tried modifying the test_MakeDict.sh file, but it failed with no further information on which component fails. So what shall I do next? Can anyone give any solid command-line examples of using this tool, and shall I build the whole package first before I can use any individual component?
The directory layout in MakeDict is as follows:
.
├── AUTHORS
├── bin
│   ├── x86-linux
│   │   └── pronounce
│   └── x86-nt
│       ├── libgnurx-0.dll
│       └── pronounce.exe
├── dict
│   ├── command.dic
│   ├── command.txt
│   ├── command.vocab
│   ├── hand.dic
│   └── pronunciation.log
├── lextool.pl
├── lib
│   ├── lexdata
│   │   ├── lexicon.data
│   │   ├── lexicon.key
│   │   ├── ltosru.bin
│   │   ├── nslex.900
│   │   └── nslex.901
│   └── Pronounce.pm
├── logios.log
├── make_pronunciation.pl
├── README
├── test
│   ├── example
│   ├── example.dic.test
│   ├── hand.dict
│   └── pronunciation.log.test
└── test_MakeDict.sh
You have to use the pronounce executable. Depending on your OS, you will use either the linux or the nt version.
You can use it as follows (from the MakeDict root directory):
./bin/x86-linux/pronounce -d [name of the dictionary you want to use from the dict folder] -i [words file] -o [destination dictionary]
The words file must be a file containing the words you want to include in the dictionary, one per line.
The dictionary I used was cmudict_SPHINX_40. I don't know which one you should use.
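For instance, a hypothetical invocation from the MakeDict root directory (words.txt and my.dic are placeholder names):
$ ./bin/x86-linux/pronounce -d cmudict_SPHINX_40 -i words.txt -o my.dic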
