Add numpy.get_include() argument to setuptools without preinstalled numpy - python-3.x

I am currently developing a python package that uses cython and numpy and I want the package to be installable using the pip install command from a clean python installation. All dependencies should be installed automatically. I am using setuptools with the following setup.py:
import setuptools
my_c_lib_ext = setuptools.Extension(
    name="my_c_lib",
    sources=["my_c_lib/some_file.pyx"]
)
setuptools.setup(
    name="my_lib",
    version="0.0.1",
    author="Me",
    author_email="me@myself.com",
    description="Some python library",
    packages=["my_lib"],
    ext_modules=[my_c_lib_ext],
    setup_requires=["cython >= 0.29"],
    install_requires=["numpy >= 1.15"],
    classifiers=[
        "Programming Language :: Python :: 3",
        "Operating System :: OS Independent"
    ]
)
This has worked great so far. The pip install command downloads cython for the build and is able to build my package and install it together with numpy.
Now I want to improve the performance of my cython code, which leads to some changes in my setup.py. I need to add include_dirs=[numpy.get_include()] to either the call of setuptools.Extension(...) or setuptools.setup(...), which means that I also need to import numpy. (See http://docs.cython.org/en/latest/src/tutorial/numpy.html and Make distutils look for numpy header files in the correct place for the rationale.)
This is bad. Now the user cannot call pip install from a clean environment, because import numpy will fail. The user needs to pip install numpy before installing my library. Even if I move "numpy >= 1.15" from install_requires to setup_requires the installation fails, because the import numpy is evaluated earlier.
Is there a way to evaluate the include_dirs at a later point of the installation, for example, after the dependencies from setup_requires or install_requires have been resolved? I would really like to have all dependencies resolved automatically, and I don't want the user to type multiple pip install commands.
The following snippet works, but it is not officially supported because it uses an undocumented (and private) method:
class NumpyExtension(setuptools.Extension):
    # setuptools calls this function after installing dependencies
    def _convert_pyx_sources_to_lang(self):
        import numpy
        self.include_dirs.append(numpy.get_include())
        super()._convert_pyx_sources_to_lang()
my_c_lib_ext = NumpyExtension(
    name="my_c_lib",
    sources=["my_c_lib/some_file.pyx"]
)
The article How to Bootstrap numpy installation in setup.py proposes using a cmdclass with custom build_ext class. Unfortunately, this breaks the build of the cython extension because cython also customizes build_ext.

First question: when is numpy needed? It is needed during the setup (i.e. when the build_ext functionality is called) and after installation, when the module is used. That means numpy should be in both setup_requires and install_requires.
There are the following alternatives to solve the issue for the setup:
using PEP 517/518 (which is more straightforward IMO)
using the setup_requires argument of setup and postponing the import of numpy until setup's requirements are satisfied (which is not the case at the start of setup.py's execution)
PEP 517/518-solution:
Put a pyproject.toml file next to setup.py, with the following content:
[build-system]
requires = ["setuptools", "wheel", "Cython>=0.29", "numpy >= 1.15"]
which defines packages needed for building, and then install using pip install . in the folder with setup.py. A disadvantage of this method is that python setup.py install no longer works, as it is pip that reads pyproject.toml. However, I would use this approach whenever possible.
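As a minimal sketch of what setup.py can then look like (assuming the pyproject.toml above sits next to it and the build goes through pip): since pip installs the [build-system] requirements before it runs setup.py, numpy can simply be imported. The main() wrapper is my own addition so the file can be imported without starting a build; a real setup.py would call main() at module level.

```python
# setup.py, meant to be used together with the pyproject.toml shown above.
# Assumption: the build is started by pip, which installs the packages listed
# under [build-system] requires before this file is executed.


def main():
    # Expected to succeed at build time because of the assumption above.
    import numpy
    import setuptools

    my_c_lib_ext = setuptools.Extension(
        name="my_c_lib",
        sources=["my_c_lib/some_file.pyx"],
        include_dirs=[numpy.get_include()],
    )

    setuptools.setup(
        name="my_lib",
        version="0.0.1",
        packages=["my_lib"],
        ext_modules=[my_c_lib_ext],
        # numpy is still needed at run time, so it stays in install_requires.
        install_requires=["numpy >= 1.15"],
    )
```

With this layout no custom build_ext or delayed import is needed, which is the main reason to prefer it where pip can be assumed.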
Postponing import
This approach is more complicated and somewhat hacky, but it also works without pip.
First, let's take a look at the unsuccessful attempts so far:
pybind11-trick
@chrisb's "pybind11"-trick, which can be found here: with the help of an indirection, one delays the call to import numpy until numpy is present during the setup phase, i.e.:
class get_numpy_include(object):
    def __str__(self):
        import numpy
        return numpy.get_include()
...
my_c_lib_ext = setuptools.Extension(
    ...
    include_dirs=[get_numpy_include()]
)
Clever! The problem: it doesn't work with the Cython-compiler: somewhere down the line, Cython passes the get_numpy_include-object to os.path.join(...,...) which checks whether the argument is really a string, which it obviously isn't.
This could be fixed by inheriting from str, but the above shows the dangers of the approach in the long run - it doesn't use the designed mechanics, is brittle and may easily fail in the future.
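The failure mode is easy to reproduce without numpy or Cython. The sketch below uses a stand-in resolver (fake_get_include, a hypothetical name, as are the two classes) in place of numpy.get_include(): an object that only defines __str__ is rejected by os.path.join, while one that also implements __fspath__ is accepted by os.path.join. Whether every other code path in distutils or Cython would accept such an object is not something this shows.

```python
import os


def fake_get_include():
    # Stand-in for numpy.get_include(); the path itself is made up.
    return os.path.join("site-packages", "numpy", "core", "include")


class LazyIncludeDir:
    """Resolves the include directory only when converted with str()."""

    def __init__(self, resolver):
        self._resolver = resolver

    def __str__(self):
        return self._resolver()


class LazyIncludePath(LazyIncludeDir):
    """Same idea, but also usable wherever os.PathLike objects are accepted."""

    def __fspath__(self):
        return self._resolver()


lazy = LazyIncludeDir(fake_get_include)
try:
    os.path.join(lazy, "numpy")
    join_rejected = False
except TypeError:
    # os.path.join insists on str, bytes or os.PathLike arguments.
    join_rejected = True

joined = os.path.join(LazyIncludePath(fake_get_include), "arrayobject.h")
print(join_rejected, joined)
```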
the classical build_ext-solution
It looks as follows:
...
from setuptools.command.build_ext import build_ext as _build_ext
class build_ext(_build_ext):
    def finalize_options(self):
        _build_ext.finalize_options(self)
        # Prevent numpy from thinking it is still in its setup process:
        __builtins__.__NUMPY_SETUP__ = False
        import numpy
        self.include_dirs.append(numpy.get_include())
setuptools.setup(
    ...
    cmdclass={'build_ext': build_ext},
    ...
)
Yet this solution doesn't work with Cython extensions either, because .pyx files don't get recognized.
The real question is: how did .pyx files get recognized in the first place? The answer is this part of setuptools.command.build_ext:
...
try:
    # Attempt to use Cython for building extensions, if available
    from Cython.Distutils.build_ext import build_ext as _build_ext
    # Additionally, assert that the compiler module will load
    # also. Ref #1229.
    __import__('Cython.Compiler.Main')
except ImportError:
    _build_ext = _du_build_ext
...
That means setuptools tries to use Cython's build_ext if possible, and because the import of the module is delayed until build_ext is called, it finds Cython present.
The situation is different when setuptools.command.build_ext is imported at the beginning of setup.py: Cython isn't present yet, and a fallback without Cython functionality is used.
mixing up pybind11-trick and classical solution
So let's add an indirection, so we don't have to import setuptools.command.build_ext directly at the beginning of setup.py:
...
# factory function
def my_build_ext(pars):
    # import delayed:
    from setuptools.command.build_ext import build_ext as _build_ext
    # include_dirs adjusted:
    class build_ext(_build_ext):
        def finalize_options(self):
            _build_ext.finalize_options(self)
            # Prevent numpy from thinking it is still in its setup process:
            __builtins__.__NUMPY_SETUP__ = False
            import numpy
            self.include_dirs.append(numpy.get_include())
    # object returned:
    return build_ext(pars)
...
setuptools.setup(
    ...
    cmdclass={'build_ext': my_build_ext},
    ...
)

One (hacky) suggestion would be using the fact that extension.include_dirs is first requested in build_ext, which is called after the setup dependencies are downloaded.
class MyExt(setuptools.Extension):
    def __init__(self, *args, **kwargs):
        self.__include_dirs = []
        super().__init__(*args, **kwargs)
    @property
    def include_dirs(self):
        import numpy
        return self.__include_dirs + [numpy.get_include()]
    @include_dirs.setter
    def include_dirs(self, dirs):
        self.__include_dirs = dirs
my_c_lib_ext = MyExt(
    name="my_c_lib",
    sources=["my_c_lib/some_file.pyx"]
)
setup(
    ...,
    setup_requires=['cython', 'numpy'],
)
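One caveat of the property approach can be shown with a small stand-in, so it runs without setuptools or numpy. FakeExtension, LazyExt and fake_get_include are hypothetical names that mimic only the relevant behaviour: because the getter builds a new list on every access, in-place changes such as include_dirs.append(...) made by other code are silently lost, while assignments go through the setter and persist.

```python
class FakeExtension:
    """Stand-in for setuptools.Extension: assigns include_dirs in __init__."""

    def __init__(self, name, sources, include_dirs=None):
        self.name = name
        self.sources = sources
        self.include_dirs = include_dirs or []


def fake_get_include():
    # Stand-in for numpy.get_include(); the path is a placeholder.
    return "/path/to/numpy/include"


class LazyExt(FakeExtension):
    def __init__(self, *args, **kwargs):
        self._static_include_dirs = []
        super().__init__(*args, **kwargs)

    @property
    def include_dirs(self):
        # A fresh list is built on every access.
        return self._static_include_dirs + [fake_get_include()]

    @include_dirs.setter
    def include_dirs(self, dirs):
        self._static_include_dirs = list(dirs)


ext = LazyExt("my_c_lib", ["my_c_lib/some_file.pyx"], include_dirs=["vendor/include"])
first_read = ext.include_dirs

ext.include_dirs.append("lost/include")  # mutates a temporary list only
second_read = ext.include_dirs

ext.include_dirs = ["vendor/include", "kept/include"]  # goes through the setter
third_read = ext.include_dirs
```

If some build step relies on appending to include_dirs, this is one more reason to treat the trick as fragile.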
Update
Another (less, but I guess still pretty hacky) solution would be overriding build instead of build_ext, since we know that build_ext is a subcommand of build and will always be invoked by build on installation. This way, we don't have to touch build_ext and leave it to Cython. This will also work when invoking build_ext directly (e.g., via python setup.py build_ext to rebuild the extensions inplace while developing) because build_ext ensures all options of build are initialized, and by coincidence, Command.set_undefined_options first ensures the command has finalized (I know, distutils is a mess).
Of course, now we're misusing build - it runs code that belongs to build_ext finalization. However, I'd still probably go with this solution rather than with the first one, ensuring the relevant piece of code is properly documented.
import setuptools
from distutils.command.build import build as build_orig
class build(build_orig):
    def finalize_options(self):
        super().finalize_options()
        # I stole this line from ead's answer:
        __builtins__.__NUMPY_SETUP__ = False
        import numpy
        # or just modify my_c_lib_ext directly here, ext_modules should contain a reference anyway
        extension = next(m for m in self.distribution.ext_modules if m == my_c_lib_ext)
        extension.include_dirs.append(numpy.get_include())
my_c_lib_ext = setuptools.Extension(
    name="my_c_lib",
    sources=["my_c_lib/some_file.pyx"]
)
setuptools.setup(
    ...,
    ext_modules=[my_c_lib_ext],
    cmdclass={'build': build},
    ...
)

I found a very easy solution in this post:
Or you can stick to https://github.com/pypa/pip/issues/5761. Here you install cython and numpy using setuptools.dist before actual setup:
from setuptools import dist
dist.Distribution().fetch_build_eggs(['Cython>=0.15.1', 'numpy>=1.10'])
Works well for me!

Related

Problem with this "minimalistic" python packaging that has an import in source code

I'm not a programmer, and my audience/users are not programmers either. So I'm trying to have the most minimalistic setup for my python package. I liked this structure below, which is endorsed in this video:
python-mypackage/
└── src/
└── mypackage.py
The file mypackage.py:
import numpy as np
class myclass():
    def __init__(self, var_a, var_b):
        self.var_a = var_a
        self.var_b = var_b
    def mult(self):
        return np.matmul(self.var_a, self.var_b)
When I build this with python setup.py bdist_wheel, with the setup.py:
from setuptools import setup
setup(
    name='mypackage',
    version='0.0.1',
    description='Test package',
    py_modules=["mypackage"],
    package_dir={'': 'src'},
)
and install it with pip install -e ., I get a NameError: name 'np' is not defined if I run
from mypackage import myclass
test = myclass([1,2], [3,4])
#Returns, NameError: name 'np' is not defined
But for a reason I don't understand, if I have mypackage.py:
import numpy as np
def mult(var_a, var_b):
    return np.matmul(var_a, var_b)
And I build it and pip install it, the code below works:
from mypackage import mult
mult([1,2], [3,4])
#Returns 11
Is this behavior correct? Or do I have something funky in my installation? If it's normal, how do I correct the import failure of numpy in the class case (I know that from mypackage import * will work, but I would rather use from mypackage import myclass)?
I'm sure there are best programming practices involving __init__.py and __main__.py files, but I would rather stay away from them, since I think they make the folder organization a little hard to follow for fellow non-programmers. But if there's no way around it without __init__.py and/or __main__.py files, that would be good to know too.
I'm under the impression that this was an installation error of some sort. When I created a new environment and reinstalled everything, I was able to call myclass without error using from mypackage import myclass.
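If a stale or duplicate install is suspected, one quick check is to ask Python which file it would actually load for a given module name. This is only a sketch using the standard library; module_origin is a hypothetical helper name, and "json" is used here just because it exists everywhere (in the question, the name to check would be mypackage).

```python
import importlib.util


def module_origin(name):
    """Return the file a top-level module would be loaded from, or None if it is not found."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


# Compare the printed path with the source file you are editing.
print(module_origin("json"))
print(module_origin("surely_not_installed_module_xyz"))
```

If the path points somewhere other than the src folder being edited, an older copy is shadowing the current code.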

Making .exe from the Python Script that uses GIS libraries such as geopandas, folium

I know it's a very straightforward and broad question, but I have very little time, so I have to ask. I created an interface to do some GIS calculations, and for that I used the libraries below in the backend.
import osmnx as ox, networkx as nx, geopandas as gpd, pandas as pd
from shapely.geometry import LineString, Point
from fiona.crs import from_epsg
import branca.colormap as cm
import folium
from folium.plugins import MarkerCluster
import pysal as ps
and these for frontend
import tkinter as tk
from tkinter import ttk
from tkinter.filedialog import askopenfilename, asksaveasfilename, askdirectory
import backend as bk
I'm trying to make it an executable program and I've tried PyInstaller but it did not work because of the dependencies. Is there any way to do it with PyInstaller? or any other libraries? Or what should I do?
p.s : I'm using python 3.6
2nd EDIT:
I tried cx_Freeze, created a setup.py and built it. After that, when I double-click on the program, it simply does nothing. No error messages, nothing. My code is below:
import cx_Freeze
import sys
import os
PYTHON_INSTALL_DIR = os.path.dirname(sys.executable)
os.environ['TCL_LIBRARY'] = os.path.join(PYTHON_INSTALL_DIR, 'tcl', 'tcl8.6')
os.environ['TK_LIBRARY'] = os.path.join(PYTHON_INSTALL_DIR, 'tcl', 'tk8.6')
include_files = [(os.path.join(PYTHON_INSTALL_DIR, 'DLLs', 'tk86t.dll'), os.path.join('lib', 'tk86t.dll')),
                 (os.path.join(PYTHON_INSTALL_DIR, 'DLLs', 'tcl86t.dll'), os.path.join('lib', 'tcl86t.dll'))]
packages = ["pandas", "numpy", "tkinter", "matplotlib", "osmnx", "networkx",
            "geopandas", "shapely", "fiona", "branca", "folium",
            "pysal"]
base = None
if sys.platform == "win32":
    base = "Win32GUI"
executables = [cx_Freeze.Executable("frontend.py", base=base, icon="transport.ico")]
cx_Freeze.setup(
    name = "Network_Analyst",
    options = {"build_exe": {"packages":packages,
                             "include_files":include_files}},
    version = "0.01",
    description = "Network analyst",
    executables = executables
)
My program consists of two scripts, frontend and backend. I'm importing backend in the frontend script; should I add it somewhere in the setup code? One more thing: I'm doing all of this inside an environment. Does that affect building the setup?
I'm giving a sample from my code to make your understanding better:
In frontend part I'm calling backend as
import backend as bk
and in the script:
class Centrality(tk.Frame):
    def degree_cent(self):
        print("Calculating Degree Centrality")
        G = self.findG()
        try:
            bk.degree_cent(G, self.t3.get("1.0",'end-1c'), self.t2.get("1.0",'end-1c'))
        except:
            bk.degree_cent(G, self.t3.get("1.0",'end-1c'))
In backend I don't use OOP, I just write the functions such as:
import osmnx as ox, networkx as nx, geopandas as gpd, pandas as pd
def degree_cent(G, outpath, *args):
    G_proj = ox.project_graph(G)
    nodes, edges = ox.graph_to_gdfs(G_proj)
    nodes["x"] = nodes["x"].astype(float)
    degree_centrality = nx.degree_centrality(G_proj)
    degree = gpd.GeoDataFrame(pd.Series(degree_centrality), columns=["degree"])
The executable still doesn't respond when I click on it. No response at all, and no Windows event either (I've checked in the Windows Event Viewer).
As far as another library is concerned: you can use cx_Freeze to make an executable out of your Python program.
You can install cx_Freeze by issuing the command
python -m pip install cx_Freeze --upgrade
in a terminal or command prompt. You'll find links to the cx_Freeze documentation and source code on the cx_Freeze entry page.
To create an executable, you need to create a setup script setup.py for your application and then issue the command
python setup.py build
You can find a working example using tkinter in this question
tkinter program compiles with cx_Freeze but program will not launch
and its accepted answer. It also contains useful links.
In order to use pandas in your main script, you'll need to modify the setup.py script of the example linked above by adding
packages = ['numpy']
and replacing the options argument in the setup call by
options={'build_exe': {'include_files': include_files, 'packages': packages}}
You also might need further tweaking for the other modules you are using (geopandas, folium, ...). If it does not work with the example described above, please edit your question and add the setup.py script you are using and the error message reported to get further help.
EDIT:
For cx_Freeze version 5.1.1, the TCL/TK DLLs need to be included in a lib subdirectory of the build directory. You can do that by passing a tuple (source, destination) to the corresponding entry of the include_files list option:
include_files = [(os.path.join(PYTHON_INSTALL_DIR, 'DLLs', 'tk86t.dll'), os.path.join('lib', 'tk86t.dll')),
                 (os.path.join(PYTHON_INSTALL_DIR, 'DLLs', 'tcl86t.dll'), os.path.join('lib', 'tcl86t.dll'))]
As far as the backend is concerned, if you use import backend in frontend.py, it should be no problem, cx_Freeze should freeze it correctly.
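For a program that starts and then silently does nothing, it may help to temporarily build a console executable: with base = None the frozen program keeps a console, so anything written to stdout or stderr is visible when it is started from a command prompt, which a "Win32GUI" build does not display. The sketch below only shows how the base value could be switched; choose_base and its debug flag are hypothetical names, not part of cx_Freeze.

```python
import sys


def choose_base(debug, platform=sys.platform):
    """Pick the cx_Freeze 'base' value.

    None builds a console executable (output visible in a command prompt);
    "Win32GUI" hides the console and only makes sense on Windows.
    """
    if platform == "win32" and not debug:
        return "Win32GUI"
    return None


# While hunting the start-up problem, build with debug=True and run the
# resulting .exe from a command prompt instead of double-clicking it.
base = choose_base(debug=True)
```

The returned value would then be passed as base=base to cx_Freeze.Executable, exactly as in the setup script above.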

How to use Cython with pytest?

The goal is to use the pytest unit test framework for a Python3 project that uses Cython. This is not a plug-and-play thing, because pytest by default is not able to import the Cython modules.
One unsuccessful solution would be to use the pytest-cython plugin, but it simply does not work for me:
> py.test --doctest-cython
usage: py.test [options] [file_or_dir] [file_or_dir] [...]
py.test: error: unrecognized arguments: --doctest-cython
inifile: None
rootdir: /censored/path/to/my/project/dir
To verify that I have the package installed:
> pip freeze | grep pytest-cython
pytest-cython==0.1.0
UPDATE:
I'm using PyCharm and it seems that it is not using my pip-installed packages but rather uses a custom(?) PyCharm repository for packages used by my project. Once I added pytest-cython to that repository, the command runs, but strangely enough it doesn't recognize the Cython module anyway, although the package/add-on is specifically designed for that purpose:
> pytest --doctest-cython
Traceback:
tests/test_prism.py:2: in <module>
from cpc_naive.prism import readSequence, processInput
cpc_naive/prism.py:5: in <module>
from calculateScore import calculateScore, filterSortAlphas, calculateAlphaMatrix_c#, incrementOverlapRanges # cython code
E ImportError: No module named 'calculateScore'
Another unsuccessful solution I got here is to use pytest-runner, but this yields:
> python3 setup.py pytest
usage: setup.py [global_opts] cmd1 [cmd1_opts] [cmd2 [cmd2_opts] ...]
or: setup.py --help [cmd1 cmd2 ...]
or: setup.py --help-commands
or: setup.py cmd --help
error: invalid command 'pytest'
UPDATE:
At first I had forgotten to add setup_requires=['pytest-runner', ...] and tests_require=['pytest', ...] to the setup script. Once I did that, I got another error:
> python3 setup.py pytest
Traceback (most recent call last):
File "setup.py", line 42, in <module>
tests_require=['pytest']
(...)
AttributeError: type object 'test' has no attribute 'install_dists'
UPDATE 2 (setup.py):
from distutils.core import setup
from distutils.extension import Extension
from setuptools import find_packages
from Cython.Build import cythonize
import numpy
try:  # try to build the .c file
    from Cython.Distutils import build_ext
except ImportError:  # if the end-user doesn't have Cython that's OK; you should have shipped the .c files anyway.
    use_cython = False
else:
    use_cython = True
cmdclass = {}
ext_modules = []
if use_cython:
    ext_modules += [
        Extension("cpc_naive.calculateScore", ["cpc_naive/calculateScore.pyx"],
                  extra_compile_args=['-g'],  # -g for debugging
                  define_macros=[('CYTHON_TRACE', '1')]),
    ]
    cmdclass.update({'build_ext': build_ext})
else:
    ext_modules += [
        Extension("cpc_naive.calculateScore", ["cpc_naive/calculateScore.c"],
                  define_macros=[('CYTHON_TRACE', '1')]),  # compiled C files are stored in /home/pdiracdelta/.pyxbld/
    ]
setup(
    name='cpc_naive',
    author=censored,
    author_email=censored,
    license=censored,
    packages=find_packages(),
    cmdclass=cmdclass,
    ext_modules=ext_modules,
    install_requires=['Cython', 'numpy'],
    include_dirs=[numpy.get_include()],
    setup_requires=['pytest-runner'],
    tests_require=['pytest']
)
UPDATE 3 (partial fix):
As suggested by @hoefling I downgraded pytest-runner to a version <4 (in fact 3.0.1) and this resolves the error in update 1, but now I get the same Exception as with the pytest-cython solution:
E ImportError: No module named 'calculateScore'
It just doesn't seem to recognize the module. Perhaps this is due to some absolute/relative import mojo I don't understand.
How can I use pytest with Cython? How can I discover why these methods aren't working and then fix it?
FINAL UPDATE:
After taking both the original problem and the question updates into consideration (thanks @hoefling for solving these issues!), this question is now reduced to the question of:
why can pytest not import the Cython module calculateScore, even though running the code just with python (no pytest) works just fine?
As @hoefling suggested, one should use pytest-runner version <4 to avoid the
AttributeError: type object 'test' has no attribute 'install_dists'
To then answer the actual and final question (in addition to partial, off-topic, user-specific fixes added to the question post itself) of why pytest cannot import the Cython module calculateScore, even though running the code just with python (no pytest) works just fine:
that remaining issue is solved here.

Nosetests gives ImportError when __init__.py is included (using cython)

I just came across a very strange error while using nose and cython inside my virtualenv with python3. For some reason nosetests started giving me an ImportError even though python -m unittest basic_test.py was working. I made a new directory to reproduce the error to make sure there wasn't something weird in that directory.
Here are the three files: fileA.pyx, setup.py, and basic_test.py
fileA.pyx:
class FileA:
    def __init__(self):
        self.temp = {}
setup.py:
from distutils.core import setup
from distutils.extension import Extension
from Cython.Distutils import build_ext
ext_modules = [
    Extension('fileA', ['fileA.pyx'],)
]
setup(
    name='test',
    ext_modules=ext_modules,
    cmdclass={'build_ext': build_ext},
)
basic_test.py:
def test():
    from fileA import FileA
    FileA()
    assert True
Fresh directory. I run python setup.py build_ext --inplace. It compiles. I run nosetests and the single test passes.
Then I do touch __init__.py and then run nosetests again and it fails with this error:
ImportError: No module named 'fileA'
Is this a bug, or do I not understand how __init__.py affects imports?
Update:
I found this post about import traps and read something that might explain how adding __init__.py breaks it. I still don't get exactly how it fits in, though.
This is an all new trap added in Python 3.3 as a consequence of fixing
the previous trap: if a subdirectory encountered on sys.path as part
of a package import contains an __init__.py file, then the Python
interpreter will create a single directory package containing only
modules from that directory, rather than finding all appropriately
named subdirectories as described in the previous section.
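One way to probe this is sketched below. It rests on an assumption that is not confirmed in the post: that nose, on finding __init__.py, treats the directory as a package and puts its parent directory on sys.path instead of the directory itself. Under that assumption, a top-level import of a module inside the directory can no longer succeed, while the package-qualified import does. All file and directory names here (project_dir, file_a_demo) are made up for the experiment.

```python
import importlib
import os
import sys
import tempfile


def can_import(module_name, path_entry):
    """Try importing module_name with path_entry temporarily added to sys.path."""
    saved_path = list(sys.path)
    sys.path.insert(0, path_entry)
    importlib.invalidate_caches()
    try:
        importlib.import_module(module_name)
        return True
    except ImportError:
        return False
    finally:
        # Undo all side effects so each attempt starts from a clean state.
        sys.path[:] = saved_path
        root = module_name.split(".")[0]
        for loaded in [m for m in sys.modules if m == root or m.startswith(root + ".")]:
            del sys.modules[loaded]


with tempfile.TemporaryDirectory() as parent:
    project = os.path.join(parent, "project_dir")
    os.mkdir(project)
    with open(os.path.join(project, "file_a_demo.py"), "w") as fh:
        fh.write("class FileA:\n    pass\n")
    open(os.path.join(project, "__init__.py"), "w").close()

    # Directory itself on sys.path: the plain import works.
    found_from_project = can_import("file_a_demo", project)
    # Only the parent on sys.path: the plain import fails ...
    found_from_parent = can_import("file_a_demo", parent)
    # ... but the package-qualified import works.
    found_as_submodule = can_import("project_dir.file_a_demo", parent)

print(found_from_project, found_from_parent, found_as_submodule)
```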

ImportError when using cx_Freeze with scipy

I'm trying to use cx_Freeze to generate a .app from a python project. Generally I have it working, but some of my modules which depend on scipy have an import error when executed:
No module named '_csr'
under the build folder I see a file:
scipy.sparse.sparsetools._csr.so
and watching the output of the build command seems to suggest that it's copying csr:
$ python3 setup.py bdist_mac | grep csr
m scipy.sparse.csr /usr/local/lib/python3.3/site-packages/scipy/sparse/csr.py
m scipy.sparse.sparsetools._csr /usr/local/lib/python3.3/site-packages/scipy/sparse/sparsetools/_csr.so
m scipy.sparse.sparsetools.csr /usr/local/lib/python3.3/site-packages/scipy/sparse/sparsetools/csr.py
? _csr imported from scipy.sparse.sparsetools.csr
? os.path imported from NIF_WRF.util.StopPow, distutils.file_util, matplotlib.backends.backend_tkagg, matplotlib.cbook, numpy.core.memmap, numpy.distutils.command.scons, os, pkg_resources, pkgutil, scipy.lib.blas.scons_support, scipy.lib.blas.setup, scipy.lib.lapack.scons_support, scipy.linalg.setup, scipy.sparse.csgraph.setup, scipy.sparse.linalg.dsolve.setup, scipy.sparse.linalg.eigen.arpack.setup, scipy.sparse.linalg.isolve.setup, scipy.sparse.sparsetools.bsr, scipy.sparse.sparsetools.coo, scipy.sparse.sparsetools.csc, scipy.sparse.sparsetools.csgraph, scipy.sparse.sparsetools.csr, scipy.sparse.sparsetools.dia, scipy.special.setup, shutil, sysconfig
? scipy.lib.six.moves imported from scipy.integrate.quadrature, scipy.interpolate.interpolate, scipy.interpolate.polyint, scipy.linalg.special_matrices, scipy.misc.common, scipy.optimize.anneal, scipy.optimize.linesearch, scipy.optimize.nonlin, scipy.sparse.base, scipy.sparse.compressed, scipy.sparse.coo, scipy.sparse.csc, scipy.sparse.csr, scipy.sparse.dok, scipy.sparse.lil, scipy.sparse.linalg.eigen.lobpcg.lobpcg, scipy.sparse.linalg.isolve.lgmres, scipy.spatial.distance, scipy.special.basic, scipy.stats.stats
copying /usr/local/lib/python3.3/site-packages/scipy/sparse/sparsetools/_csr.so -> build/exe.macosx-10.8-x86_64-3.3/scipy.sparse.sparsetools._csr.so
The problem seems to be related to this other question but that user seemed to solve it by building again, which hasn't helped here. Any ideas?
UPDATE
I mucked around in the .app package contents and found that renaming scipy.sparse.sparsetools._csr.so to _csr.so solves that error (though generates another similar one for another scipy component). It seems like the cx_Freeze script is not properly naming scipy inputs.
Also, here are the versions I'm using:
cx_Freeze: 4.3.2
scipy: 0.13.0
python: 3.3.2
