Python: Generate function stubs from C module - python-3.x

I made a Python module in C/C++ with the Python C API. I use setuptools.Extension in my setup.py.
It creates one .py file which loads the Python module from a compiled .pyd file:
def __bootstrap__():
    global __bootstrap__, __loader__, __file__
    import sys, pkg_resources, imp
    __file__ = pkg_resources.resource_filename(__name__, 'zroya.cp36-win32.pyd')
    __loader__ = None; del __bootstrap__, __loader__
    imp.load_dynamic(__name__,__file__)
__bootstrap__()
But it does not generate Python stubs for IDE autocompletion. I would like all exported functions and classes to be visible from the .py file:
def myfunction_stub(*args, **kwargs):
    """
    ... function docstring
    """
    pass
Is it possible? Or do I have to create some Python "preprocessor" which loads data from the .pyd file and generates stubs with docstrings?
Source code is available on github.

This question is old, but since it had no answer when I was looking into this issue, I thought I'd share what worked in my case.
I had a Python module developed using the C API with the following structure:
my_package/
├── docs
│   └── source
├── package_libs
│   ├── linux
│   └── win
│       ├── amd64
│       └── i386
├── package_src
│   ├── include
│   └── source
└── tests
The stubgen command of the mypy package can generate stubs for packages pretty well.
The steps were to first compile the package as you normally would, for example with your existing setup.py.
Then generate the stubs for the resulting .pyd or .so.
In my case the easiest way was to install the whole package, with pip for example, and then call stubgen on the whole module, e.g.:
pip install my_package
pip install mypy
stubgen -p my_package
This generates a my_package.pyi file (stubgen writes to an out/ directory by default), which can then be included in the package data of your setup.py file as follows:
...
setup(
    ...,
    packages=["my_package"],
    package_data={"my_package": ["py.typed", "my_package.pyi", "__init__.pyi"]},
    ...,
)
...
In there I include three files: an empty py.typed file to let tools know that the package ships type information, the generated my_package.pyi file, and an __init__.pyi file containing only an import of the stubs, which makes them available at the top level of my package as they are in the module:
from my_package import *
This works for me and is reproducible even in CI environments where we generate the stubs before publishing the package so that they don't need to be manually updated or checked for discrepancy.
The final source repository, with the added files, looks like this:
my_package/
├── docs
│   └── source
├── my_package
│   ├── __init__.pyi
│   ├── my_package.pyi # generated by stubgen upon successful CI build in my case
│   └── py.typed
├── package_libs
│   ├── linux
│   └── win
│       ├── amd64
│       └── i386
├── package_src
│   ├── include
│   └── source
└── tests

Unfortunately, mypy's stubgen does not (yet) include docstrings and signatures. However, it is relatively easy to generate your own stubs automatically with the standard-library inspect module. For example, I use something along the lines of:
import inspect
import my_package

with open('my_package.pyi', 'w') as f:
    for name, obj in inspect.getmembers(my_package):
        if inspect.isclass(obj):
            f.write('\n')
            f.write(f'class {name}:\n')
            for func_name, func in inspect.getmembers(obj):
                if not func_name.startswith('__'):
                    try:
                        f.write(f'    def {func_name}{inspect.signature(func)}:\n')
                    except (TypeError, ValueError):
                        # No introspectable signature (common for C functions)
                        f.write(f'    def {func_name}(self, *args, **kwargs):\n')
                    f.write(f"        '''{func.__doc__}'''")
                    f.write('\n        ...\n')
So essentially, first install your package and afterwards you can run this script to create a .pyi. You can change it easily to your liking! Note that you must correctly define docstrings in your C/C++ code: https://stackoverflow.com/a/41245451/4576519
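The script above only walks classes. A slightly more general sketch, which also covers module-level functions and returns the stub text instead of writing a file, could look like the following. The helper names (generate_stub, _signature, _docstring) are made up for this illustration and are not part of inspect or mypy; treat it as a starting point rather than a complete stub generator.

```python
import inspect


def _signature(obj, fallback):
    """Return obj's signature as text, or a generic fallback."""
    try:
        return str(inspect.signature(obj))
    except (TypeError, ValueError):
        # Many C-implemented callables expose no introspectable signature.
        return fallback


def _docstring(obj, indent):
    """Return an indented docstring line, or '' when there is no docstring."""
    doc = inspect.getdoc(obj)
    # repr() yields a valid string literal even for multi-line docstrings.
    return f"{indent}{doc!r}\n" if doc else ""


def generate_stub(module):
    """Build .pyi-style text for a module's public functions and classes."""
    lines = []
    for name, obj in inspect.getmembers(module):
        if name.startswith("_"):
            continue
        if inspect.isroutine(obj):
            lines.append(f"def {name}{_signature(obj, '(*args, **kwargs)')}:\n")
            lines.append(_docstring(obj, "    "))
            lines.append("    ...\n\n")
        elif inspect.isclass(obj):
            lines.append(f"class {name}:\n")
            for meth_name, meth in inspect.getmembers(obj, inspect.isroutine):
                if meth_name.startswith("_"):
                    continue
                sig = _signature(meth, "(self, *args, **kwargs)")
                lines.append(f"    def {meth_name}{sig}:\n")
                lines.append(_docstring(meth, "        "))
                lines.append("        ...\n")
            lines.append("    ...\n\n")
    return "".join(lines)
```

After installing your package you would call something like generate_stub(my_package) and write the result to my_package.pyi. Default values whose repr is not valid Python would still need manual cleanup.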

Related

How to solve python moduleNotFoundError

I'm having trouble understanding the module layout in python. Here is my directory / file structure
Project2.1/
├── project2
│   ├── data_mining
│   │   ├── process.py
│   │   └── __init__.py
│   └── __init__.py
└── tests
   ├── data
   │   └── data.csv
   ├── data_mining
   │   ├── __init__.py
   │   └── test_process.py
   └── __init__.py
The file test_process.py contains the following import:
from project2.data_mining.process import ClassP
I run tests/data_mining/test_process.py from the Project2.1 directory with the following commands:
$ cd Project2.1
$ python3 tests/data_mining/test_process.py
This generates the error:
File "tests/data_mining/test_process.py", line 7, in <module>
from project2.data_mining.process import ClassP
ModuleNotFoundError: No module named 'project2'
ClassP is a class inside project2/data_mining/process.py
When you run the script this way, Python puts the script's own directory (tests/data_mining) on the import path, not Project2.1, so only the files inside that directory can be imported directly and the project2 package is not found. You need to add the path of the project2 data_mining folder yourself: get the exact path of data_mining (the one inside project2) and then
import sys
sys.path.append(exact_path)  # exact_path: the absolute path of project2/data_mining, as a string
from process import ClassP
This appends that folder to the import path and makes all files inside it importable.
Also remember that we don't write .py or any other extension when importing;
it is just like importing from any other module, for instance from random import randint.
:D
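As an alternative to editing sys.path, the same script can be started as a module with -m from the project root, which puts the current directory on the import path. The sketch below rebuilds a cut-down copy of the tree from the question in a temporary directory and runs the test file both ways; the file contents are placeholders, and build_layout and run are names invented for this example.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def build_layout(root):
    """Recreate a cut-down version of the Project2.1 tree from the question."""
    files = {
        "project2/__init__.py": "",
        "project2/data_mining/__init__.py": "",
        "project2/data_mining/process.py": "class ClassP:\n    pass\n",
        "tests/__init__.py": "",
        "tests/data_mining/__init__.py": "",
        "tests/data_mining/test_process.py": (
            "from project2.data_mining.process import ClassP\n"
            "print('imported', ClassP.__name__)\n"
        ),
    }
    for rel, text in files.items():
        path = Path(root, rel)
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(text)


def run(args, cwd):
    """Run the current interpreter with args inside cwd."""
    return subprocess.run(
        [sys.executable, *args], cwd=cwd, capture_output=True, text=True
    )


with tempfile.TemporaryDirectory() as tmp:
    build_layout(tmp)
    # Script path: sys.path[0] is tests/data_mining, so project2 is not found.
    as_script = run(["tests/data_mining/test_process.py"], tmp)
    # -m: the current directory goes on sys.path, so project2 is found.
    as_module = run(["-m", "tests.data_mining.test_process"], tmp)
```

In the real project this corresponds to running python3 -m tests.data_mining.test_process from inside Project2.1, with the import in test_process.py left unchanged.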

problems importing sub packages in python: how should I write the __init__.py files

I am new to building packages so bear with me. I am having a problem importing the subpackages of my latest python project.
My directory structure is the following:
├── package
│   ├── __init__.py
│   ├── subpackage_a
│   │   ├── __init__.py
│   │   └── functions_a.py
│   └── subpackage_b
│       ├── __init__.py
│       └── functions_b.py
└── setup.py
The files look as follows.
setup.py:
from setuptools import setup
setup(name='test_package',
      version='0.3',
      description='',
      author='me',
      packages=['package']
      )
package/__init__.py: empty.
subpackage_a/__init__.py: from .functions_a import *
subpackage_b/__init__.py: from .functions_b import *
functions_a.py contains
def hello_world_a():
    print('hello its a')
and functions_b.py contains
def hello_world_b():
    print('hello its b')
Now I open a virtualenv, go to the directory containing setup.py, and run pip install . there. I was expecting to be able to access the functions contained in subpackages a and b. But when I try to import the functions I get a module not found error.
from package.subpackage_a import hello_world_a
ModuleNotFoundError: No module named 'package.subpackage_a'
The same thing holds for subpackage_b. But if I import package, it is recognised. I have a feeling that this approach used to work, as I have some old packages written this way which no longer work.
Perhaps I have to change my __init__.py files? What am I doing wrong?
setuptools.setup doesn't know that subpackage_a and subpackage_b exist. You only specified the top-level package, so the subpackages are not included in the installation. You should list them as well, using their full dotted names:
setup(
    ...,
    packages=['package', 'package.subpackage_a', 'package.subpackage_b']
)
This process can be automated via find_packages():
from setuptools import setup, find_packages
setup(
    ...,
    packages=find_packages()
)
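To see which dotted names end up in packages, here is a rough stdlib-only approximation of that kind of discovery: walk the tree and keep every directory that contains an __init__.py. This is not how setuptools is implemented, and discover_packages is a name made up for this sketch; it only illustrates why the subpackages need their parent as a prefix.

```python
import os
from pathlib import Path


def discover_packages(root):
    """Return dotted names of the regular packages found under root."""
    root = Path(root)
    found = []
    for dirpath, dirnames, filenames in os.walk(root):
        current = Path(dirpath)
        if current == root:
            continue  # the project root itself is not a package
        if "__init__.py" not in filenames:
            dirnames[:] = []  # not a package, so nothing below it counts either
            continue
        found.append(".".join(current.relative_to(root).parts))
    return sorted(found)
```

Run against the layout from the question, this should list package, package.subpackage_a and package.subpackage_b, which are the names the packages argument needs.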

What is the proper way to organize a PIP package?

I have 4 files: main.py, helper.py, clf.pkl, and tests.py.
Main.py has core classes. It needs to import helper.py for some methods and clf.pkl for data.
What is the minimal structure I can have for a Python library with 4 files?
I would use a package to hold your files, along with a pyproject.toml to describe your project, like this:
.
├── pyproject.toml
├── MANIFEST.in
├── your_package_name
│   ├── __main__.py
│   ├── helper.py
│   └── __init__.py
└── tests
    └── tests.py
In your __init__.py file write at least:
"""A short description of your project"""
__version__ = "0.0.1"
(Change description and version accordingly).
To create your pyproject.toml you can use flit init:
pip install flit
flit init
Name your entry point __main__.py in the package so you can run it using:
python -m your_package_name
(Yes it's still a good idea to use an if __name__ == "__main__": in it, so you can import your main from your tests if needed).
You can import helper.py from __main__.py using:
from your_package_name import helper
or:
from . import helper
(I prefer the first one, but I don't know if there is a consensus.)
For your clf.pkl to be included in your package you'll need to create a MANIFEST.in with:
include your_package_name/clf.pkl
Your pkl will be available at:
os.path.join(os.path.dirname(os.path.abspath(__file__)), "clf.pkl")
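The same lookup can be written with pathlib. A minimal sketch, assuming the pickle sits next to the module that loads it; load_packaged_pickle is a hypothetical helper name, not something flit or setuptools provides.

```python
import pickle
from pathlib import Path


def load_packaged_pickle(module_file, name="clf.pkl"):
    """Load a pickle stored in the same directory as module_file."""
    data_path = Path(module_file).resolve().parent / name
    with data_path.open("rb") as fh:
        # Only unpickle files you ship yourself; pickle can execute code.
        return pickle.load(fh)
```

Inside the package you would call it as load_packaged_pickle(__file__).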
To test it use flit install -s and to publish it on PyPI flit publish.

python git-submodule importing from other git-submodule

Using Python 3.6
I have created multiple modules (like DBmanager or jsonParser, etc.) which I use across multiple different Python projects.
For simplicity: I have created a module, let's call it 'gitmodule03'.
Internally it is supposed to use yet another module from GitHub, 'gitmodule01', for parsing data. I have added 'gitmodule01' to 'gitmodule03' with
'git submodule add http://git/gitmodule01'
Separately, I am developing my 'MainPackage', which will directly use 'gitmodule03' and 'gitmodule01' (among others). I've added them all to my main program with
'git submodule add http://git/gitmodule01'
'git submodule add http://git/gitmodule02'
'git submodule add http://git/gitmodule03'
and my package looks like this:
.
└── MainPackage
    ├── modules
    │   ├── __init__.py
    │   ├── gitmodule01
    │   │   ├── __init__.py
    │   │   └── mymodule01.py
    │   ├── gitmodule02
    │   │   ├── __init__.py
    │   │   └── mymodule02.py
    │   ├── gitmodule03
    │   │   ├── __init__.py
    │   │   ├── mymodule03.py
    │   │   └── gitmodule01
    │   │       └──
    │   └── mymodule04.py
    └── myMainProgram.py
At this moment 'gitmodule03' is NOT importing 'gitmodule01' internally. I was hoping that importing it in main myMainProgram.py would propagate across submodules (which is not the case)
If my myMainProgram.py imports them all:
from modules.gitmodule01.mymodule01 import my01class
from modules.gitmodule02.mymodule02 import my02class
from modules.gitmodule03.mymodule03 import my03class
my03class() # will work
my02class() # is internally using 'my03class()' and will error out:
NameError: name 'my03class' is not defined
How can I design these so they work independently as well as within a bigger package, in a clean, Pythonic way?
I would like to keep these modules independent so they don't have to rely on any hard-coded sys.path manipulation.
Edit Test Cases:
1.
myMainProgram.py
sys.path.insert(0, "modules/gitmodule03/gitmodule01/")
from mymodule01 import my01class
from modules.gitmodule03.mymodule03 import my03class
my01class() #works
my03class() # NameError: name 'my01class' is not defined
2.
myMainProgram.py
from modules.gitmodule03.gitmodule01.mymodule01 import my01class
from modules.gitmodule03.mymodule03 import my03class
my01class() #works
my03class() # NameError: name 'my01class' is not defined
3.
mymodule03.py
from gitmodule01.mymodule01 import my01class
my01class() #works
myMainProgram.py
from modules.gitmodule01.mymodule01 import my01class
from modules.gitmodule03.mymodule03 import my03class
my03class() # ModuleNotFoundError: No module named 'gitmodule01'
4.
mymodule03.py
from .gitmodule01.mymodule01 import my01class
my01class() # ModuleNotFoundError: No module named '__main__.gitmodule01'; '__main__' is not a package
myMainProgram.py
from modules.gitmodule03.mymodule03 import my03class
my03class() # works
With Test Case #4 it looks like I could make myMainProgram.py work, but I would have to break the module when it runs on its own.
So far I could not find a better option that keeps both myMainProgram.py and a standalone mymodule03.py working.
At the moment I am checking the __name__ variable to see whether the module is running on its own or is being imported from another package:
mymodule03.py
if __name__ == '__main__':
    from gitmodule01.mymodule01 import my01class
    my01class() # works
else:
    from .gitmodule01.mymodule01 import my01class
myMainProgram.py
from modules.gitmodule03.mymodule03 import my03class
my03class() # works
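A common variant of this workaround, offered here only as a sketch and not as a recommended design, is to try the relative import first and fall back to the plain import when it fails. The script below recreates a small part of the tree in a temporary directory and runs it both ways; the class bodies and print statements are placeholders invented for the demo.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

# Body of mymodule03.py: try the package-relative import first and fall
# back to the plain import when the file is executed directly.
MYMODULE03 = '''\
try:
    from .gitmodule01.mymodule01 import my01class  # imported as part of a package
except ImportError:
    from gitmodule01.mymodule01 import my01class  # run directly as a script


class my03class:
    def __init__(self):
        self.helper = my01class()


if __name__ == "__main__":
    print("standalone ok:", type(my03class().helper).__name__)
'''

FILES = {
    "modules/__init__.py": "",
    "modules/gitmodule03/__init__.py": "",
    "modules/gitmodule03/mymodule03.py": MYMODULE03,
    "modules/gitmodule03/gitmodule01/__init__.py": "",
    "modules/gitmodule03/gitmodule01/mymodule01.py": "class my01class:\n    pass\n",
    "myMainProgram.py": (
        "from modules.gitmodule03.mymodule03 import my03class\n"
        "print('package ok:', type(my03class().helper).__name__)\n"
    ),
}


def run_script(root, script):
    """Write the demo tree under root, run one script, return its stdout."""
    for rel, text in FILES.items():
        path = Path(root, rel)
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(text)
    result = subprocess.run(
        [sys.executable, script], cwd=root, capture_output=True, text=True
    )
    return result.stdout.strip()


with tempfile.TemporaryDirectory() as tmp:
    from_main = run_script(tmp, "myMainProgram.py")
    standalone = run_script(tmp, "modules/gitmodule03/mymodule03.py")
```

One caveat: catching ImportError also hides genuine import failures inside gitmodule01, so the fallback can mask real bugs.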

How to import the own model into myproject/alembic/env.py?

I want to use alembic revision --autogenerate with my own model classes. Because of that I need to import them in myproject/alembic/env.py as described in the docs. But this doesn't work, even though I tried a lot of variations.
I am not sure in which context (I don't know if this is the correct word) alembic runs env.py. Maybe that causes some errors.
This is the directory and file structure I use.
myproject/
    common/
        __init__.py
        model.py
    alembic/
        env.py
The error looks like this:
from .common import model
SystemError: Parent module '' not loaded, cannot perform relative import
myproject itself is just a repository/working directory. It is not installed into the system (with pip3, apt-get, easy_install or anything else).
You can set the PYTHONPATH environment variable to control what Python sees as the top-level folder, e.g. if you are in the root folder of your project:
PYTHONPATH=. alembic revision -m "..."
Then you can use a "normal" import in your alembic env.py, relative to your root folder, in your example:
from src.models.base import Base
After fiddling around for a few hours with this same issue, I found a solution. First, this is my structure right now:
. ← That's the root directory of my project
├── alembic.ini
├── dev-requirements.txt
├── requirements.txt
├── runtime.txt
├── setup.cfg
├── src
│   └── models
│       ├── base.py
│       ...
│       └── migrations
│           ├── env.py
│           ├── README
│           ├── script.py.mako
│           └── versions
│
└── tests
In env.py I simply did this:
import sys
from os.path import abspath, dirname
sys.path.insert(0, dirname(dirname(dirname(abspath(__file__))))) # Insert <.>/src
import models # now it can be imported
target_metadata = models.base.Base.metadata
Hope you find this useful! :)
EDIT: I then did my first revision with the database empty (with no tables yet), and alembic filled in everything automatically for upgrade() and downgrade(). I did it this way because not all my tables were automagically detected by alembic.
Put this in your env.py to put the working directory onto the Python path:
import sys
import os
sys.path.insert(0, os.getcwd())
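The effect of that insert can be checked outside alembic. A small sketch, assuming a project root that contains a regular package; import_from_root is a hypothetical helper written for this illustration.

```python
import importlib
import sys
from pathlib import Path


def import_from_root(project_root, module_name):
    """Put project_root at the front of sys.path, then import module_name."""
    root = str(Path(project_root).resolve())
    if root not in sys.path:
        sys.path.insert(0, root)
    # Make sure directories created after startup are noticed by the finders.
    importlib.invalidate_caches()
    return importlib.import_module(module_name)
```

With the layout from the question, import_from_root("/path/to/myproject", "common.model") would be the equivalent of writing from common import model in env.py once the root is on the path; the path shown here is a placeholder.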
For alembic 1.5.5 and above, add the following to your alembic.ini:
prepend_sys_path = .
From alembic documentation: this will be prepended to sys.path if present, defaults to the current working directory.
