Problem with this "minimalistic" python packaging that has an import in source code - python-3.x

I'm not a programmer, and my audience/users are not programmers either. So I'm trying to have the most minimalistic setup for my python package. I liked this structure below, which is endorsed in this video:
python-mypackage/
└── src/
    └── mypackage.py
The file mypackage.py:
import numpy as np

class myclass():
    def __init__(self, var_a, var_b):
        self.var_a = var_a
        self.var_b = var_b

    def mult(self):
        return np.matmul(self.var_a, self.var_b)
When I build this with python setup.py bdist_wheel, with the setup.py:
from setuptools import setup

setup(
    name='mypackage',
    version='0.0.1',
    description='Test package',
    py_modules=["mypackage"],
    package_dir={'': 'src'},
)
and install it with pip install -e ., I get a NameError: name 'np' is not defined when I run
from mypackage import myclass
test = myclass([1,2], [3,4])
# Raises NameError: name 'np' is not defined
But for a reason I don't understand, if I have mypackage.py:
import numpy as np

def mult(var_a, var_b):
    return np.matmul(var_a, var_b)
And I build it and pip install it, the code below works:
from mypackage import mult
mult([1,2], [3,4])
#Returns 11
Is this behavior correct? Or do I have something funky in my installation? If it's normal, how do I fix the failing numpy import in the class case? (I know that from mypackage import * will work, but I'd rather use from mypackage import myclass.)
I'm sure there are best programming practices involving __init__.py and __main__.py files, but I'd rather stay away from them, since I think they make the folder organization a little hard to follow for fellow non-programmers. But if there's no way around it without __init__.py and/or __main__.py files, that would be good to know too.

I'm under the impression that this was an installation error of some sort. When I created a new environment and reinstalled everything, I was able to call myclass without error using from mypackage import myclass.
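One way to check for the kind of stale or duplicate installation suspected above is to ask Python which file it would actually load for a given module name. The following is only a minimal sketch using the standard library; module_origin is a hypothetical helper name, and json stands in for mypackage so that the snippet runs in any environment.

```python
import importlib.util


def module_origin(name):
    """Return the file Python would import `name` from, or None if it is not found."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


# "json" is a stand-in for "mypackage"; replace it with your own module name.
origin = module_origin("json")
print(origin)
```

If the printed path points at an old build or an unexpected site-packages folder instead of your src/mypackage.py, the environment is probably importing a leftover copy.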

Related

Problem importing python file from folder above

I know there are heaps of questions and answers about this. I tried a multitude of Stack Overflow links, but none of them seem to help.
My project structure is:
volume_price_analysis/
    README.md
    TODO.md
    build/
    docs/
    requirements.txt
    setup.py
    vpa/
        __init__.py
        database_worker.py
        utils.py
        test/
            __init__.py
            test_utils.py
            input/
                input_file.txt
I want to load utils.py inside test_utils.py
my test_utils.py is:
import unittest
import logging
import os
from .vpa import utils

class TestUtils(unittest.TestCase):
    def test_read_file(self):
        input_dir = os.path.join(os.path.join(os.getcwd()+"/test/input"))
        file_name = "input_file.txt"
        with open(os.path.join(input_dir+"/"+file_name)) as f:
            file_contents = f.read()
            f.close()
        self.assertEqual(file_contents, "Hello World!\n")

if __name__ == '__main__':
    unittest.main()
I want to run (say inside test folder):
python3 -m test_utils.py
I cannot do that; I get a bunch of errors regarding the import of utils. I tried many variations: with a leading ., without it, from this import that, etc.
Why is this so bloody complicated?
I am using Python 3.7 if that helps.
As per this answer, you can do it using importlib.
In spec = importlib.util.spec_from_file_location("module.name", "/path/to/file.py"), you can use ../utils.py instead of /path/to/file.py. Also, since importlib already provides a submodule named util, which is easy to confuse with your own utils, give one of them a different name: either don't use utils as module.name, or import importlib.util under another name.
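The importlib approach described above can be sketched as follows. This is an illustration under assumptions, not the asker's actual code: load_module_from_path and the module name vpa_utils are hypothetical, and a throwaway utils.py is written to a temporary folder so the snippet is self-contained.

```python
import importlib.util
import os
import tempfile


def load_module_from_path(module_name, file_path):
    """Load a Python source file as a module object without touching sys.path."""
    spec = importlib.util.spec_from_file_location(module_name, file_path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)  # actually runs the file's top-level code
    return module


# Demo: create a stand-in "utils.py" and load it by its path.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "utils.py")
    with open(path, "w") as f:
        f.write("def greet():\n    return 'Hello World!'\n")
    # The module is registered under a name that cannot be confused with importlib.util.
    vpa_utils = load_module_from_path("vpa_utils", path)
    greeting = vpa_utils.greet()
```

In a real test file the path would be something like os.path.join(os.path.dirname(__file__), "..", "utils.py") rather than a temporary file.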
I figured it out. It turns out Python prefers that you run your code from the top-level folder, in my case the volume_price_analysis folder. All I had to do was make a shell script that calls
python3 -m unittest vpa.test.test_utils
Inside test_utils I can import whatever I want, as long as I remember that the code is executed from the top-level folder, so loading utils.py from test_utils.py becomes
from vpa import utils
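To see the effect of running from the top-level folder, here is a small self-contained sketch. It builds a hypothetical miniature of the layout above in a temporary directory (the double function and the test body are made up for illustration) and then runs the same unittest command with that directory as the working directory.

```python
import os
import subprocess
import sys
import tempfile

# Hypothetical miniature of the project: vpa/utils.py plus vpa/test/test_utils.py.
FILES = {
    "vpa/__init__.py": "",
    "vpa/utils.py": "def double(x):\n    return 2 * x\n",
    "vpa/test/__init__.py": "",
    "vpa/test/test_utils.py": (
        "import unittest\n"
        "from vpa import utils\n"
        "\n"
        "class TestUtils(unittest.TestCase):\n"
        "    def test_double(self):\n"
        "        self.assertEqual(utils.double(2), 4)\n"
    ),
}

with tempfile.TemporaryDirectory() as top:
    for rel, text in FILES.items():
        path = os.path.join(top, *rel.split("/"))
        os.makedirs(os.path.dirname(path), exist_ok=True)
        with open(path, "w") as f:
            f.write(text)
    # Run from the top-level folder, so that "vpa" is importable as a package.
    result = subprocess.run(
        [sys.executable, "-m", "unittest", "vpa.test.test_utils"],
        cwd=top, capture_output=True, text=True,
    )
```

Because python -m puts the current working directory on sys.path, the absolute import from vpa import utils resolves no matter how deeply the test file is nested.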

Importing module named same as package

I have run into this so many times. I always struggle and then forget, so this time I'm asking. This is Python 3.
repo/
    setup.py
    abyss/
        __init__.py
        abyss.py
        some.py
# abyss.py
from abyss import some
print(some.x)
# some.py
x = 2
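When I run ./abyss/abyss.py I get
ImportError: cannot import name 'some' from 'abyss'
some.py is at the same level as abyss.py, so just using import some works. When abyss.py is run directly as a script, its own directory is placed first on sys.path, so the name abyss resolves to the script abyss.py itself rather than to the package.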
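The difference between running the file as a script and running it as a module can be reproduced with a short sketch. It recreates the layout from the question in a temporary folder and runs abyss.py both ways; the run helper is a hypothetical convenience, and running with -m is an alternative that the answer above does not mention.

```python
import os
import subprocess
import sys
import tempfile

# Same three files as in the question.
FILES = {
    "abyss/__init__.py": "",
    "abyss/some.py": "x = 2\n",
    "abyss/abyss.py": "from abyss import some\nprint(some.x)\n",
}


def run(args, cwd):
    """Run the current Python interpreter with `args` inside `cwd`."""
    return subprocess.run([sys.executable, *args], cwd=cwd,
                          capture_output=True, text=True)


with tempfile.TemporaryDirectory() as repo:
    for rel, text in FILES.items():
        path = os.path.join(repo, *rel.split("/"))
        os.makedirs(os.path.dirname(path), exist_ok=True)
        with open(path, "w") as f:
            f.write(text)
    # As a script: repo/abyss/ is first on sys.path, so "abyss" means abyss.py.
    as_script = run([os.path.join("abyss", "abyss.py")], cwd=repo)
    # As a module: repo/ is first on sys.path, so "abyss" means the package.
    as_module = run(["-m", "abyss.abyss"], cwd=repo)
```

Here as_script fails with an ImportError, while as_module prints the value of some.x, without any change to the import statement.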

Add numpy.get_include() argument to setuptools without preinstalled numpy

I am currently developing a python package that uses cython and numpy and I want the package to be installable using the pip install command from a clean python installation. All dependencies should be installed automatically. I am using setuptools with the following setup.py:
import setuptools

my_c_lib_ext = setuptools.Extension(
    name="my_c_lib",
    sources=["my_c_lib/some_file.pyx"]
)

setuptools.setup(
    name="my_lib",
    version="0.0.1",
    author="Me",
    author_email="me@myself.com",
    description="Some python library",
    packages=["my_lib"],
    ext_modules=[my_c_lib_ext],
    setup_requires=["cython >= 0.29"],
    install_requires=["numpy >= 1.15"],
    classifiers=[
        "Programming Language :: Python :: 3",
        "Operating System :: OS Independent"
    ]
)
This has worked great so far. The pip install command downloads cython for the build and is able to build my package and install it together with numpy.
Now I want to improve the performance of my cython code, which leads to some changes in my setup.py. I need to add include_dirs=[numpy.get_include()] to either the call of setuptools.Extension(...) or setuptools.setup(...), which means that I also need to import numpy. (See http://docs.cython.org/en/latest/src/tutorial/numpy.html and Make distutils look for numpy header files in the correct place for the rationale.)
This is bad. Now the user cannot call pip install from a clean environment, because import numpy will fail. The user needs to pip install numpy before installing my library. Even if I move "numpy >= 1.15" from install_requires to setup_requires the installation fails, because the import numpy is evaluated earlier.
Is there a way to evaluate the include_dirs at a later point of the installation, for example, after the dependencies from setup_requires or install_requires have been resolved? I would really like to have all dependencies resolved automatically, and I don't want the user to type multiple pip install commands.
The following snippet works, but it is not officially supported because it uses an undocumented (and private) method:
class NumpyExtension(setuptools.Extension):
    # setuptools calls this function after installing dependencies
    def _convert_pyx_sources_to_lang(self):
        import numpy
        self.include_dirs.append(numpy.get_include())
        super()._convert_pyx_sources_to_lang()

my_c_lib_ext = NumpyExtension(
    name="my_c_lib",
    sources=["my_c_lib/some_file.pyx"]
)
The article How to Bootstrap numpy installation in setup.py proposes using a cmdclass with custom build_ext class. Unfortunately, this breaks the build of the cython extension because cython also customizes build_ext.
First question: when is numpy needed? It is needed during the setup (i.e. when the build_ext functionality is called) and after the installation, when the module is used. That means numpy should be in both setup_requires and install_requires.
There are the following alternatives to solve the issue for the setup:
using PEP 517/518 (which is more straightforward IMO)
using the setup_requires argument of setup and postponing the import of numpy until setup's requirements are satisfied (which is not the case at the start of setup.py's execution)
PEP 517/518 solution:
Put a pyproject.toml file next to setup.py, with the following content:
[build-system]
requires = ["setuptools", "wheel", "Cython>=0.29", "numpy >= 1.15"]
which defines packages needed for building, and then install using pip install . in the folder with setup.py. A disadvantage of this method is that python setup.py install no longer works, as it is pip that reads pyproject.toml. However, I would use this approach whenever possible.
Postponing import
This approach is more complicated and somewhat hacky, but works also without pip.
First, let's take a look at unsuccessful tries so far:
pybind11-trick
@chrisb's "pybind11"-trick, which can be found here: With help of an indirection, one delays the call to import numpy until numpy is present during the setup-phase, i.e.:
class get_numpy_include(object):
    def __str__(self):
        import numpy
        return numpy.get_include()
...
my_c_lib_ext = setuptools.Extension(
    ...
    include_dirs=[get_numpy_include()]
)
Clever! The problem: it doesn't work with the Cython-compiler: somewhere down the line, Cython passes the get_numpy_include-object to os.path.join(...,...) which checks whether the argument is really a string, which it obviously isn't.
This could be fixed by inheriting from str, but the above shows the dangers of the approach in the long run - it doesn't use the designed mechanics, is brittle and may easily fail in the future.
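The failure mode described above can be reproduced without numpy or Cython. In the following sketch, LazyInclude is a hypothetical stand-in for get_numpy_include that returns a made-up path; it only shows that os.path.join rejects an object that merely defines __str__, not how Cython itself calls it.

```python
import os


class LazyInclude:
    """Stand-in for get_numpy_include: yields a path only when str() is called."""

    def __str__(self):
        # A real version would import numpy here and return numpy.get_include().
        return "/hypothetical/numpy/include"


lazy = LazyInclude()
as_text = str(lazy)  # explicit conversion works as intended

try:
    # os.path.join does not call str() on its arguments; it expects str,
    # bytes or os.PathLike and raises TypeError for anything else.
    os.path.join("build", lazy)
    join_error = None
except TypeError as exc:
    join_error = exc
```

Any code path that treats the entries of include_dirs as real strings will therefore break on the lazy object, which is why the trick depends on implementation details of the build tools.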
the classical build_ext-solution
It looks as follows:
...
from setuptools.command.build_ext import build_ext as _build_ext

class build_ext(_build_ext):
    def finalize_options(self):
        _build_ext.finalize_options(self)
        # Prevent numpy from thinking it is still in its setup process:
        __builtins__.__NUMPY_SETUP__ = False
        import numpy
        self.include_dirs.append(numpy.get_include())

setuptools.setup(
    ...
    cmdclass={'build_ext':build_ext},
    ...
)
Yet this solution doesn't work with Cython extensions either, because .pyx files don't get recognized.
The real question is: how did .pyx files get recognized in the first place? The answer is this part of setuptools.command.build_ext:
...
try:
    # Attempt to use Cython for building extensions, if available
    from Cython.Distutils.build_ext import build_ext as _build_ext
    # Additionally, assert that the compiler module will load
    # also. Ref #1229.
    __import__('Cython.Compiler.Main')
except ImportError:
    _build_ext = _du_build_ext
...
That means setuptools tries to use Cython's build_ext if possible, and because the import of the module is delayed until build_ext is called, it finds Cython present.
The situation is different when setuptools.command.build_ext is imported at the beginning of setup.py: Cython isn't present yet, and a fallback without Cython functionality is used.
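The timing issue can be illustrated in isolation. The sketch below does not use setuptools or Cython; chooser and fancy_backend are hypothetical module names that mimic the try/except ImportError pattern quoted above, to show that such a fallback is decided once, at the moment the module is first imported.

```python
import importlib
import os
import sys
import tempfile

# A module that picks its backend at import time, like the quoted setuptools code.
CHOOSER = (
    "try:\n"
    "    import fancy_backend  # hypothetical optional dependency\n"
    "    BACKEND = 'fancy'\n"
    "except ImportError:\n"
    "    BACKEND = 'fallback'\n"
)

with tempfile.TemporaryDirectory() as tmp:
    sys.path.insert(0, tmp)
    try:
        with open(os.path.join(tmp, "chooser.py"), "w") as f:
            f.write(CHOOSER)
        importlib.invalidate_caches()
        import chooser
        early = chooser.BACKEND  # imported before the dependency exists

        # The dependency becomes available later (as after setup_requires).
        with open(os.path.join(tmp, "fancy_backend.py"), "w") as f:
            f.write("")
        importlib.invalidate_caches()
        still = chooser.BACKEND  # the earlier decision is not revisited

        late = importlib.reload(chooser).BACKEND  # a fresh import sees it
    finally:
        sys.path.remove(tmp)
        sys.modules.pop("chooser", None)
        sys.modules.pop("fancy_backend", None)
```

This is the reason the next attempt delays the import of setuptools.command.build_ext instead of performing it at the top of setup.py.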
mixing up pybind11-trick and classical solution
So let's add an indirection, so we don't have to import setuptools.command.build_ext directly at the beginning of setup.py:
....
# factory function
def my_build_ext(pars):
    # import delayed:
    from setuptools.command.build_ext import build_ext as _build_ext

    # include_dirs adjusted:
    class build_ext(_build_ext):
        def finalize_options(self):
            _build_ext.finalize_options(self)
            # Prevent numpy from thinking it is still in its setup process:
            __builtins__.__NUMPY_SETUP__ = False
            import numpy
            self.include_dirs.append(numpy.get_include())

    # object returned:
    return build_ext(pars)
...
setuptools.setup(
    ...
    cmdclass={'build_ext' : my_build_ext},
    ...
)
One (hacky) suggestion would be using the fact that extension.include_dirs is first requested in build_ext, which is called after the setup dependencies are downloaded.
class MyExt(setuptools.Extension):
    def __init__(self, *args, **kwargs):
        self.__include_dirs = []
        super().__init__(*args, **kwargs)

    @property
    def include_dirs(self):
        import numpy
        return self.__include_dirs + [numpy.get_include()]

    @include_dirs.setter
    def include_dirs(self, dirs):
        self.__include_dirs = dirs

my_c_lib_ext = MyExt(
    name="my_c_lib",
    sources=["my_c_lib/some_file.pyx"]
)

setup(
    ...,
    setup_requires=['cython', 'numpy'],
)
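The property-based suggestion relies on the getter being evaluated only when include_dirs is read, not when the extension object is created. The sketch below checks that behaviour with stand-ins: FakeExtension imitates only the relevant part of setuptools.Extension, and fake_get_include is a hypothetical replacement for numpy.get_include() that records each call.

```python
class FakeExtension:
    """Stand-in for setuptools.Extension: assigns include_dirs in __init__."""

    def __init__(self, name, include_dirs=None):
        self.name = name
        self.include_dirs = include_dirs or []


calls = []


def fake_get_include():
    """Hypothetical replacement for numpy.get_include() that logs its calls."""
    calls.append("called")
    return "/hypothetical/numpy/include"


class LazyExt(FakeExtension):
    def __init__(self, *args, **kwargs):
        self._own_include_dirs = []
        super().__init__(*args, **kwargs)  # goes through the setter below

    @property
    def include_dirs(self):
        # Evaluated on every read, so the lookup happens as late as possible.
        return self._own_include_dirs + [fake_get_include()]

    @include_dirs.setter
    def include_dirs(self, dirs):
        self._own_include_dirs = dirs


ext = LazyExt("my_c_lib", include_dirs=["src"])
calls_after_init = len(calls)  # nothing has been looked up yet
dirs = ext.include_dirs        # first read triggers the lookup
```

One consequence of this design is that code which does ext.include_dirs.append(...) mutates a temporary list and is silently lost, which is part of why the answer calls the approach hacky.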
Update
Another (less, but I guess still pretty hacky) solution would be overriding build instead of build_ext, since we know that build_ext is a subcommand of build and will always be invoked by build on installation. This way, we don't have to touch build_ext and leave it to Cython. This will also work when invoking build_ext directly (e.g., via python setup.py build_ext to rebuild the extensions inplace while developing) because build_ext ensures all options of build are initialized, and by coincidence, Command.set_undefined_options first ensures the command has finalized (I know, distutils is a mess).
Of course, now we're misusing build - it runs code that belongs to build_ext finalization. However, I'd still probably go with this solution rather than with the first one, ensuring the relevant piece of code is properly documented.
import setuptools
from distutils.command.build import build as build_orig

class build(build_orig):
    def finalize_options(self):
        super().finalize_options()
        # I stole this line from ead's answer:
        __builtins__.__NUMPY_SETUP__ = False
        import numpy
        # or just modify my_c_lib_ext directly here, ext_modules should contain a reference anyway
        extension = next(m for m in self.distribution.ext_modules if m == my_c_lib_ext)
        extension.include_dirs.append(numpy.get_include())

my_c_lib_ext = setuptools.Extension(
    name="my_c_lib",
    sources=["my_c_lib/some_file.pyx"]
)

setuptools.setup(
    ...,
    ext_modules=[my_c_lib_ext],
    cmdclass={'build': build},
    ...
)
I found a very easy solution in this post:
Or you can stick to https://github.com/pypa/pip/issues/5761. Here you install cython and numpy using setuptools.dist before actual setup:
from setuptools import dist
dist.Distribution().fetch_build_eggs(['Cython>=0.15.1', 'numpy>=1.10'])
Works well for me!

Importing submodules

I am new to Python and I'm having a really hard time overcoming a problem with the import system.
Let's say I have the file system presented below:
/src
/src/main.py
/src/submodules/
/src/submodules/submodule.py
/src/submodules/subsubmodules
/src/submodules/subsubmodules/subsubmodule.py
All the folders (src, submodules, subsubmodules) have an empty __init__.py file.
In submodule.py I have:
from subsubmodules import subsubmodule
In main.py I have:
from submodules import submodule
When I run submodule.py, Python accepts the import. But when I run main.py, Python raises an error for the import of subsubmodule.py, because the /src/submodules/ folder, which contains subsubmodules, is not on the path.
The only solution is to change the import in submodule.py to
from submodules.subsubmodules import subsubmodule
This seems to me an awful solution, because after that I cannot run submodule.py, and I'm sure that something else is the key to this.
Another solution is to add the following code to the __init__.py file:
import os
import sys
import inspect

cmd_subfolder = os.path.split(inspect.getfile(inspect.currentframe()))[0]
if cmd_subfolder not in sys.path:
    sys.path.insert(0, cmd_subfolder)
Is there any way to do this using just Python's import system, and not other methods that do it manually using, for example, sys.path or other modules like os, inspect, etc.?
How can I import modules without caring about the modules they import?
You can run subsubmodule.py (from the src folder) as
python3 -m submodules.subsubmodules.subsubmodule
If you want a shorter way to invoke it, you're free to add a shell or Python script for that on the top level of your package.
This is how imports work in Python 3; there are reasons for that.
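As a rough check of this advice, the sketch below rebuilds the layout from the question in a temporary folder, keeps the absolute import from submodules.subsubmodules import subsubmodule in submodule.py, and runs both main.py and submodule.py from the src folder. The VALUE constant and the run helper are made up for illustration.

```python
import os
import subprocess
import sys
import tempfile

FILES = {
    "main.py": "from submodules import submodule\n",
    "submodules/__init__.py": "",
    "submodules/submodule.py": (
        "from submodules.subsubmodules import subsubmodule\n"
        "print(subsubmodule.VALUE)\n"
    ),
    "submodules/subsubmodules/__init__.py": "",
    "submodules/subsubmodules/subsubmodule.py": "VALUE = 42\n",
}


def run(args, cwd):
    """Run the current Python interpreter with `args` inside `cwd`."""
    return subprocess.run([sys.executable, *args], cwd=cwd,
                          capture_output=True, text=True)


with tempfile.TemporaryDirectory() as src:
    for rel, text in FILES.items():
        path = os.path.join(src, *rel.split("/"))
        os.makedirs(os.path.dirname(path), exist_ok=True)
        with open(path, "w") as f:
            f.write(text)
    # Both invocations start in src, so "submodules" is importable in both.
    via_main = run(["main.py"], cwd=src)
    via_module = run(["-m", "submodules.submodule"], cwd=src)
```

With one import style and one working directory, both entry points work, so no sys.path manipulation is needed in __init__.py.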
You can avoid this issue by modifying sys.path in your program.
import sys
sys.path.insert(0, './lib')
import subsubmodule
With this approach, you can put all the modules you import into a lib folder.
You can read the official documentation on Python packages where this is explained in depth.

Python 3.5 - Smart module imports in the file tree

I was wondering if it was possible for modules in a project to be smart about their imports...
Say I have the following data structure :
/parent-directory
    /package
        __init__.py
        main.py
        /modules
            __init__.py
            one.py
            /submodules-one
                __init__.py
                oneone.py
                onetwo.py
            two.py
Files higher in the hierarchy are supposed to import Classes from those lower in the hierarchy.
For instance main.py has
import modules.one
import modules.two
Now I'd like to be able to directly run not only main.py, but also one.py (and all the others)
Except that it doesn't work like I hoped :
If I run from main.py, I need to have in one.py
import modules.submodules-one.oneone
import modules.submodules-one.onetwo
But if I run from one.py, I'll get an error, and I need to have instead
import submodules-one.oneone
import submodules-one.onetwo
I've found a hacky way to get around it using in one.py :
if __name__ == '__main__':
    import submodules-one.oneone
    import submodules-one.onetwo
else:
    import modules.submodules-one.oneone
    import modules.submodules-one.onetwo
But isn't there a better solution?
P.S.: I also have an additional complication: I'm using pint, which needs a single instance of the unit registry to work properly, so I have in the top-level __init__.py:
from pint import UnitRegistry
ur = UnitRegistry()
And obviously
from .. import ur
will fail if running from one of the files of the subfolders.
Thank you in advance for your answer.

Resources