What is the proper way to organize a PIP package? - python-3.x

I have 4 files: main.py, helper.py, clf.pkl, and tests.py.
Main.py has core classes. It needs to import helper.py for some methods and clf.pkl for data.
What is the minimal structure I can have for a Python library with 4 files?

I would use a package to hold your files, along with a pyproject.toml to describe your project, like this:
.
├── pyproject.toml
├── MANIFEST.in
├── your_package_name
│   ├── __main__.py
│   ├── helper.py
│   └── __init__.py
└── tests
└── tests.py
In your __init__.py file write at least:
"""A short description of your project"""
__version__ = "0.0.1"
(Change description and version accordingly).
To create your pyproject.toml you can use flit init:
pip install flit
flit init
Name your entry point __main__.py in the package so you can run it using:
python -m your_package_name
(Yes it's still a good idea to use an if __name__ == "__main__": in it, so you can import your main from your tests if needed).
You can import helper.py from __main__.py using:
from your_package_name import helper
or:
from . import helper
(I prefer the first one but I don't know if there a concensus.)
For your clf.pkl to be included in your package you'll need to create a MANIFEST.in with:
include your_package_name/clf.pkl
Your pkl will be available at:
os.path.join(os.path.dirname(os.path.abspath(__file__)), "clf.pkl")
To test it use flit install -s and to publish it on PyPI flit publish.

Related

Module not recognising root directory for Python imports

I have a Python project that uses the MicroKernel pattern where I want each of the modules to be completely independent. I import each of the modules into the kernel and that works fine. However, when I am in a module I want the root of the module to be the module dir. This is the part that is not working.
Project structure;
.
├── requirements.txt
├── ...
├── kernel
│   ├── config.py
│   ├── main.py
│   ├── src
│   │   ├── __init__.py
│   │   ├── ...
│   └── test
│   ├── __init__.py
│   ├── ...
├── modules
│   └── img_select
│   ├── __init__.py
│   ├── config.py
│   ├── main.py
│   └── test
│   ├── __init__.py
│   └── test_main.py
If I import from main import somefunction in modules/img_select/test/test_main.py I get the following error:
ImportError: cannot import name 'somefunction' from 'main' (./kernel/main.py)
So it clearly does not see the modules/img_select as the root of the module, which leads to the following question:
How can I set the root for imports in a module?
Some additional info, I did add the paths with sys.path in the config files;
kernel/config.py;
import os
import sys
ROOT_DIR = os.path.dirname(os.path.abspath(__file__))
MODULES_DIR = os.path.join(ROOT_DIR, '../modules')
sys.path.insert(0, os.path.abspath(MODULES_DIR))
modules/img_select/config.py;
import os
import sys
ROOT_DIR = os.path.dirname(os.path.abspath(__file__))
sys.path.insert(0, os.path.abspath(ROOT_DIR))
And my python version is 3.7.3
I do realise that there are a lot of excellent resources out there, but I have tried most approaches and can't seem to get it to work.
I'm not sure what main you are trying to import from. I think python is confused from the pathing as well. How does test_main.py choose which main to run? Typically when you have a package (directory with __init__.py) you import from the package and not individual modules.
# test_main.py
# If img_select is in the path and has __init__.py
from img_select.main import somefunction
If img_select does not have __init__.py and you have img_select in the path then you can import from main.
# test_main.py
# If img_select is in the path without __init__.py
from main import somefunction
In your case I do not know how you are trying to indicate which main.py to import from. How are you importing and calling the proper config.py?
You might be able to get away with changing the current directory with os.chdir. I think your main problem is that img_select is a package with __init__.py. Python doesn't like to use from main import ... when main is in a package. Python is expecting from img_select.main import ....
Working Directory
If you are in the directory modules/img_select/test/ and call python test_main.py then this directory is known as your working directory. Your working directory is wherever you call python. If you are in the top level directory (where requirements.txt lives) and call python modules/img_select/test/test_main.py then the top level directory is the working directory. Python uses this working directory as path.
If kernel has an __init__.py then python will find kernel from the top level directory. If kernel is not a package then you need add the kernel directory to the path in order for python to see kernel/main.py. One way is to modify sys.path or PYTHONPATH like you suggested. However, if your working directory is modules/img_select/test/ then you have to go up several directories to find the correct path.
# test_main.py
import sys
TEST_DIR = os.path.dirname(__file__) # modules/img_select/test/
IMG_DIR = os.path.dirname(TEST_DIR)
MOD_DIR = os.path.dirname(IMG_DIR)
KERNEL_DIR = os.path.join(os.path.dirname(MOD_DIR), 'kernel')
sys.path.append(KERNEL_DIR)
from main import somefunction
If your top level directory (where requirements.txt lives) is your working directory then you still need to add kernel to the path.
# modules/img_select/test/test_main.py
import sys
sys.path.append('kernel')
As you can see this can change depending on your working directory, and you would have to modify every running file manually. You can get around this with abspath like you are doing. However, every file needs the path modified. I do not recommend manually changing the path.
Libraries
Python pathing can be a pain. I suggest making a library.
You just make a setup.py file to install the kernel or other packages as a library. The setup.py file should be at the same level as requirements.txt
# setup.py
"""
setup.py - Setup file to distribute the library
See Also:
* https://github.com/pypa/sampleproject
* https://packaging.python.org/en/latest/distributing.html
* https://pythonhosted.org/an_example_pypi_project/setuptools.html
"""
from setuptools import setup, Extension, find_packages
setup(name='kernel',
version='0.0.1',
# Specify packages (directories with __init__.py) to install.
# You could use find_packages(exclude=['modules']) as well
packages=['kernel'], # kernel needs to have __init__.py
include_package_data=True,
)
The kernel directory needs an __init__.py. Install the library as editable if you are still working on it. Call pip install -e . in the top level directory that has the setup.py file.
After you install the library python will have copied or linked the kernel directory into its site-packages path. Now your test_main.py file just needs to import kernel correctly
# test_main.py
from kernel.main import somefunction
somefunction()
Customizing init.py
Since kernel now has an __init__.py you can control the functions available from importing kernel
# __init__.py
# The "." indicates a relative import
from .main import somefunction
from .config import ...
try:
from .src.mymodule import myfunc
except (ImportError, Exception):
def myfunc(*args, **kwargs):
raise EnvironmentError('Function not available. Missing dependency "X".')
After changing the __init__.py you can import from kernel instead of kernel.main
# test_main.py
from kernel import somefunction
somefunction()
If you delete the NumPy (any library) from the site manager and save that folder in another location then use:
import sys
sys.path.append("/home/shubhangi/numpy") # path of numpy dir (which is removed from site manager and paste into another directory)
from numpy import __init__ as np
import numpy as np
arr = np.array([1, 2, 3, 4, 5])
print(arr)
print(type(arr))

problems importing sub packages in python: how should I write the __init__.py files

I am new to building packages so bear with me. I am having a problem importing the subpackages of my latest python project.
My directory structure is the following:
├── package
│   ├── __init__.py
│   ├── subpackage_a
│   │   ├── __init__.py
│   │   └── functions_a.py
│   └── subpackage_b
│   ├── __init__.py
│   └── functions_b.py
└── setup.py
The files look as follows
setup.py
:
from setuptools import setup
setup(name='test_package',
version='0.3',
description='',
author='me',
packages=['package']
)
package/__init__.py: empty.
subpackage_a/__init__.py: from .functions_a import *
subpackage_b/__init__.py: from .functions_b import *
functions_a.py
contains
def hello_world_a():
print('hello its a')
and functions_b.py contains
def hello_world_b():
print('hello its b')
Now I open a virtualenv go to the setup.py's directory and I pip install .. I was expecting to access the functions contained in the subpackages a and b. But when I try to import the functions I get a module not found error.
from package.subpackage_a import hello_world_a
ModuleNotFoundError: No module named 'package.subpackage_a'
and the same thing holds for subpackage_b. But if I import package this is recognised. I have a feeling that this approach used to work, as I have some old packages written this way which don't work any longer.
Perhaps I have to change my init.py files ? What am I doing wrong ?
setuptools.setup doesn't know that subpackage_a and subpackage_b exist. You only specified the top-level package. So it won't include these subpackages in the installation. Instead you should also specify them:
setup(
...,
packages=['package', 'subpackage_a', 'subpackage_b']
)
This process can be automatized via find_packages():
from setuptools import find_packages
setup(
...,
packages=find_packages()
)

Python: Generate function stubs from C module

I made a Python module in C/C++ with Python C API. I use setuptools.Extension in my setup.py.
It creates one .py file which loads a python module from some compiled .pyd file:
def __bootstrap__():
global __bootstrap__, __loader__, __file__
import sys, pkg_resources, imp
__file__ = pkg_resources.resource_filename(__name__, 'zroya.cp36-win32.pyd')
__loader__ = None; del __bootstrap__, __loader__
imp.load_dynamic(__name__,__file__)
__bootstrap__()
But it does not generate python stubs for IDE autocomplete feature. I would like all exported functions and classes to be visible from .py file:
def myfunction_stub(*args, **kwargs):
"""
... function docstring
"""
pass
Is it possible? Or do I have to create some python "preprocessor" which loads data from .pyd file and generate stubs with docstrings?
Source code is available on github.
This question is old but since it didn't contain an answer as I was looking into this issue I thought I'd provide what worked in my case.
I had a python module developed using the c-api with the following structure:
my_package/
├── docs
│   └── source
├── package_libs
│   ├── linux
│   └── win
│   ├── amd64
│   └── i386
├── package_src
│   ├── include
│   └── source
└── tests
the typegen command of the mypy package can generate stubs for packages pretty well.
The steps used were to first compile the package as you normally would with your existing setup.py for example.
Then generated the stubs for the generated .pyd or .so.
In my case the easiest was to install the whole package using pip for example and then calling stubgen on the whole module e.g:
pip install my_package
pip install mypy
stubgen my_package
This generates a my_package.pyi file which can then be included in the package data of your setup.py file as follows:
.
.
.
setup(
.
.
.
package=["my_package"],
package_data={"my_package": ["py.typed", "my_package.pyi", "__init__.pyi"]},
.
.
.
)
.
.
.
In there I include an empty py.typed file to let utilities know that the package contains type stubs, the generated my_package.pyi file and an __init__.pyi file containing only the import of the stubs to make them available at the top level of my package as they are in module.
from my_package import *
This works for me and is reproducible even in CI environments where we generate the stubs before publishing the package so that they don't need to be manually updated or checked for discrepancy.
The final source repository looks like this with the added files :
my_package/
├── docs
│   └── source
├── my_package
│   ├── __init__.pyi
│   ├── my_package.pyi # generated by stubgen upon successful CI build in my case
│   └── py.typed
├── package_libs
│   ├── linux
│   └── win
│   ├── amd64
│   └── i386
├── package_src
│   ├── include
│   └── source
└── tests
Unfortunately, mypy's stubgen does not (yet) include docstrings and signatures. However, it is relatively easy to automatically generate your own stub's using the Python native inspect package. For example, I use something along the lines of:
import my_package
import inspect
with open('my_package.pyi', 'w') as f:
for name, obj in inspect.getmembers(nf):
if inspect.isclass(obj):
f.write('\n')
f.write(f'class {name}:\n')
for func_name, func in inspect.getmembers(obj):
if not func_name.startswith('__'):
try:
f.write(f' def {func_name} {inspect.signature(func)}:\n')
except:
f.write(f' def {func_name} (self, *args, **kwargs):\n')
f.write(f" '''{func.__doc__}'''")
f.write('\n ...\n')
So essentially, first install your package and afterwards you can run this script to create a .pyi. You can change it easily to your liking! Note that you must correctly define docstrings in your C/C++ code: https://stackoverflow.com/a/41245451/4576519

Why isn't this .pyx file being recognized as a module?

I'm having trouble with relative imports, but I can't seem to figure out what's wrong in this case. It seems like a straightforward relative import from another module in the same package, so I'm at a loss for how to debug this.
My project is set up like so:
.
├── ckmeans
│   ├── __init__.py
│   ├── _ckmeans.pxd
│   ├── _ckmeans_wrapper.pyx
│   ├── _ckmeans.py
│   ├── _evaluation.py
│   └── _utils.py
└── setup.py
At the top of __init__.py:
from ._ckmeans import ckmeans # _ckmeans.py
And at the top of _ckmeans.py:
from . import _ckmeans_wrapper # _ckmeans_wrapper.pyx
And at the top of _ckmeans_wrapper.pyx:
cimport _ckmeans # _ckmeans.pxd
I run pip install --ignore-installed --upgrade -e ., and everything seems to go smoothly. Then when I try to run my test suite, or import ckmeans in the interpreter, I get the error:
ImportError: cannot import name '_ckmeans_wrapper'
When I comment out the import statement from __init__.py and then import ckmeans in the interpreter, it does indeed seem to be missing the _ckmeans_wrapper module. I suspect that something is failing silently in the Cython build, but I don't have any idea how to debug.
Here's the setup.py:
import numpy as np
from Cython.Build import cythonize
from setuptools import setup, Extension
extension = Extension(
name='_ckmeans_wrapper',
sources=['ckmeans/_ckmeans_wrapper.pyx'],
language="c++",
include_dirs=[np.get_include()]
)
setup(
name='ckmeans',
version='1.0.0',
packages=['ckmeans'],
ext_modules = cythonize(extension),
install_requires=['numpy', 'Cython']
)
The name argument to Extension was incorrect. It should be name='ckmeans._ckmeans_wrapper'.

How to import the own model into myproject/alembic/env.py?

I want to use alembic revision --autogenerate with my own model classes. Because of that I need to import them in myproject/alembic/env.py as described in the docs. But this doesn't work even if I tried a lot of variations.
I am not sure in which context (don't know if this is the correct word) does alembic run the env.py. Maybe that causes some errors.
This is the directory and file structure I use.
myproject/
common/
__init__.py
model.py
alembic/
env.py
The error is kind of that
from .common import model
SystemError: Parent module '' not loaded, cannot perform relative import
myproject itself is just a repository/working directory. It is not installed into the system (with pip3, apt-get, easyinstall or anything else).
You can set the PYTHONPATH environment variable to control what python sees as the top level folder, eg. if you are in the root folder of your project:
PYTHONPATH=. alembic revision -m "..."
Then you can use a "normal" import in your alembic env.py, relative to your root folder, in your example:
from src.models.base import Base
Fiddling around few hours with this same issue, I found out a solution. First, this is my structure right now:
. ← That's the root directory of my project
├── alembic.ini
├── dev-requirements.txt
├── requirements.txt
├── runtime.txt
├── setup.cfg
├── src
│   └── models
│   ├── base.py
│   ...
│   └── migrations
│   ├── env.py
│   ├── README
│      ├── script.py.mako
│      └── versions
│     
└── tests
in env.py I simply did this:
import sys
from os.path import abspath, dirname
sys.path.insert(0, dirname(dirname(dirname(abspath(__file__))))) # Insert <.>/src
import models # now it can be imported
target_metadata = models.base.Base.metadata
Hope you find this useful! :)
EDIT: I then did my first revision with the database empty (with no tables yet), alembic filled everything automatically for upgrade() and downgrade(). I did that in this way because not all my tables were automagically detected by alembic.
Put this in your env.py to put the working directory onto the Python path:
import sys
import os
sys.path.insert(0, os.getcwd())
For alembic 1.5.5 and above, add the following to your alembic.ini:
prepend_sys_path = .
From alembic documentation: this will be prepended to sys.path if present, defaults to the current working directory.

Resources