In my company, we have a Python project containing a large hierarchy of packages and modules shared by our different applications. But what seemed like a good idea for pooling common code has become horribly difficult to use and maintain.
Depending on the end project, we use a single module from this library, or a single package, or many. Some modules/packages are independent, but others depend on other packages and modules from the same library. And of course those modules depend on third-party packages as well.
I would like to make it as modular as possible, i.e. I would like to deal with the following cases:
use the whole library
use a single package from that library (whether it is a top level package or not)
use a single module from the library
use multiple packages/modules from the library (possibly interdependent)
Moreover, a strong constraint is that I must not break existing code, so that I can carry out the transformation without breaking all of my coworkers' projects...
Here is an example file tree that represents the situation:
library
├── a
│   └── i
│       ├── alpha.py  # each module may depend on any other package / module
│       └── beta.py
├── b
│   ├── delta.py
│   ├── gamma.py
│   └── j
│       └── epsilon.py
├── c
│   ├── mu.py
│   └── nu.py
├── requirements.txt
└── setup.py
The best solution I found is to add a setup.py and a requirements.txt in every folder of the tree. But this has serious limitations:
I cannot use a single module (I have to use a package).
When I use a package, I have to change its import statements. For example, if before any change I used from library.a.i import alpha, I would like not to have to modify that line afterwards.
Moreover, I am quite sure I am forgetting some of the constraints I have...
So is what I am trying to achieve feasible, or is it utopian?
What you can do is the following:
You need to have PYTHONPATH pointing at library, or prepend it to sys.path, e.g. sys.path.insert(0, 'path_to_library').
If you create an __init__.py at each level of your folder tree, you will be able to pick whatever level/module you are interested in, e.g.:
in folder b's __init__.py:
from .delta import *
from .gamma import *
from .j import *
in folder j's __init__.py:
from .epsilon import *
You can now do, in any Python script:
from b import *: imports everything that b's modules expose
from b.j import *: imports only epsilon's contents
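Putting that together, a minimal end-to-end sketch (the library path is an assumption; adjust it to wherever library actually lives):

import sys
sys.path.insert(0, '/path/to/library')  # or set PYTHONPATH instead

from b import *    # everything re-exported by b/__init__.py (delta, gamma, and j)
from b.j import *  # only what epsilon.py defines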
Related
I am aware there are plenty of similar questions out there. But trust me, I have tried a lot of things and I don't know why they are not working for me. So I just want a step-by-step answer to this problem.
I created a simplified representation of my file structure, so it is easy to duplicate for testing:
imp
├── __init__.py
├── app1
│   ├── __init__.py
│   └── func1.py
└── app2
    └── func2.py
inside func1.py
def hello():
    print("Hello there!")
and inside func2.py
from app1 import func1
func1.hello()
Now I mainly want to run func2.py from the root directory, i.e. imp/. And if possible, I also want to make it executable from its own folder, i.e. imp/app2/.
what I tried
added __init__.py
tried relative import
tried appending and inserting path
sys.path.append('..') in func2.py
sys.path.insert(0, '..') in func2.py
None of them worked, and I don't know what I am missing.
PS: By the way, I am using Python 3. Also, I would prefer a solution that does not edit the path (as that is not considered good practice).
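For reference, a sketch of the two usual fixes for this layout (both assume the imp tree above). The reason sys.path.append('..') does not work is that '..' is resolved relative to the current working directory, not relative to func2.py.

# Option 1: no path editing; run func2 as a module from the imp/ directory:
#     python -m app2.func2

# Option 2: inside func2.py, compute the project root from the file itself:
import os
import sys
sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))

from app1 import func1
func1.hello()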
I am reading other people's code and found an extensions.py in their package.
I can see that the modules imported in extensions.py are imported in __init__.py as well.
I could not figure out how extensions.py works with __init__.py, or in what situations you need an extensions.py.
Could anyone give me an explanation, or provide a link that explains it?
In __init__.py
from flask_app.extensions import cors, guard
In extensions.py
from flask_cors import CORS
from flask_praetorian import Praetorian
cors = CORS()
guard = Praetorian()
According to Python's packaging tutorial, this is the minimal structure:
packaging_tutorial/
├── LICENSE
├── pyproject.toml
├── README.md
├── setup.cfg
├── src/
│   └── example_package/
│       ├── __init__.py
│       └── example.py
└── tests/
It seems like it's just there as a convenience for setting up things like requirements, or for readability. Maybe share where you found extensions.py, and I can take a deeper look. You could also dig into the docs and see exactly what flask-praetorian does.
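For what it's worth, here is a minimal sketch of how the two files usually fit together, assuming a standard Flask application factory (the create_app function and the User model comment are assumptions, not from the question): extensions.py instantiates the extension objects once, unbound, and __init__.py binds them to the app when it is created. Keeping the instances in their own module lets any other module import cors or guard without importing the app itself, which avoids circular imports.

# flask_app/extensions.py: create the extension objects, not yet bound to an app
from flask_cors import CORS
from flask_praetorian import Praetorian

cors = CORS()
guard = Praetorian()

# flask_app/__init__.py: bind the extensions to the app when it is created
from flask import Flask
from flask_app.extensions import cors, guard

def create_app():
    app = Flask(__name__)
    cors.init_app(app)
    # guard.init_app(app, User)  # flask-praetorian also needs your user model class
    return app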
I've established a private GitHub organization for common Python repos; each repo is basically a unique homegrown third-party package (like numpy, for example). These are to be used across different projects.
At the moment, the repos are just source packages, not built into wheels or sdists for releases; each has a setup.py and a directory structure for the modules/business logic of the library. Basically the repos look somewhat like this: https://packaging.python.org/tutorials/packaging-projects/
At the moment, I don't want to address building releases or a private PyPI server. What I need help/guidance on is: what if it's not just a library, but also has a CLI tool (that uses the library)?
I expect the user to do one of several things: clone it, set PYTHONPATH/PATH accordingly, and use it; or package and pip install it. But should the CLI tool be included inside that repo or outside? And how does one call it (i.e. python -m )?
What's strange to me is that packaging seems geared more toward pure libraries than libraries plus tools. Any help with my thought process on this, and on how to invoke the tool?
Thanks to #phd for helping me walk the dog.
For my package project, I define a setup.py (a surrogate makefile, in Python's parlance) which declares this entry point:
import setuptools

setuptools.setup(
    name="pkg_name",  # Replace with your package name
    version="0.0.1",  # see PEP 440
    ...
    scripts=['bin/simple_cli.py'],  # callable script to register (installed onto PATH)
    ...
)
Now, in the package project itself, the basic structure is as follows; note the bin/ directory:
$ tree -L 3
.
├── bin
│   └── simple_cli.py
├── contributing.md
├── LICENSE
├── makefile
├── pkg_name
│   ├── example_module.py
│   └── __init__.py
├── README.md
├── requirements.txt
├── setup.py
└── tests
    └── test_main.py
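For illustration, a minimal sketch of what bin/simple_cli.py might contain (the argparse wiring is an assumption; pkg_name.example_module is the module from the tree above). The point is that the script stays thin and delegates to the library, so the same code can also be imported normally:

#!/usr/bin/env python
"""Thin CLI wrapper: parse arguments, then delegate to the library."""
import argparse

from pkg_name import example_module  # the library does the real work

def main():
    parser = argparse.ArgumentParser(prog='simple_cli.py')
    parser.add_argument('--version', action='version', version='0.0.1')
    args = parser.parse_args()
    # ... call into example_module here ...

if __name__ == '__main__':
    main()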
Once this is built (sdist, wheel, etc.), we can pip install it. I tested this in a virtual environment: the end state is that simple_cli.py is installed into the venv's bin/ directory, which is on PATH while the venv is activated.
How do you start a pyscaffold project?
I used this command to create the project: putup sampleProject
But I don't know how to start the project.
You don't start a pyscaffold project per se; its goal is simply to create the files and folders that you will commonly need for your project. See my structure below, generated from putup MyTestProject. Look at all the nice stuff already created that you now don't have to write by hand.
To get started, add your packages/code under src/mytestproject and run that code as you normally would.
Might I recommend the use of a good IDE, such as PyCharm? I think you will find it makes starting your journey much easier.
A second recommendation: if you are just getting started, you might skip pyscaffold for now. While it is a great tool, it might add confusion that you don't need right now.
MyTestProject/
├── AUTHORS.rst
├── CHANGELOG.rst
├── docs
│   ├── authors.rst
│   ├── changelog.rst
│   ├── conf.py
│   ├── index.rst
│   ├── license.rst
│   ├── Makefile
│   └── _static
├── LICENSE.txt
├── README.rst
├── requirements.txt
├── setup.cfg
├── setup.py
├── src
│   └── mytestproject
│       ├── __init__.py
│       └── skeleton.py
└── tests
    ├── conftest.py
    └── test_skeleton.py
[Edit]
With respect to why "python skeleton.py" gives an output: the library is simply providing an example to show you where to start adding code, and how that code relates to the tests (test_skeleton.py). The intent is that skeleton.py will be erased and replaced with your own code structure, whether that is a few .py files or packages and subpackages of .py files. Read it this way: "Your code goes here ... and here is an arbitrary example to get you started."
But you have to ask yourself what you are trying to accomplish. If you are just creating a few scripts for yourself, for nobody else in the world to see, do you need the additional stuff (docs, setup, licensing, etc.)? If the answer is no, don't use pyscaffold; just create your scripts in a venv and be on your way. The scaffolding is meant to give you most of what you need to create a full, GitHub-worthy project you could share with the world. Based on what I gather your Python experience to be, I don't think you want pyscaffold.
But specific to your question: were I starting with pyscaffold, I would erase skeleton.py, replace it with mytester.py, use the begins library to parse the incoming command-line arguments, and then write individual methods to respond to the command-line calls, as sketched below.
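For illustration, here is a minimal sketch of such a mytester.py, using the standard library's argparse rather than the begins library mentioned above (the file name and the greet command are arbitrary examples):

import argparse

def greet(args):
    # stand-in for your real logic
    print(f"Hello, {args.name}!")

def main():
    parser = argparse.ArgumentParser(description="mytester command line")
    subparsers = parser.add_subparsers(dest="command", required=True)
    greet_parser = subparsers.add_parser("greet", help="print a greeting")
    greet_parser.add_argument("name")
    greet_parser.set_defaults(func=greet)
    args = parser.parse_args()
    args.func(args)

if __name__ == "__main__":
    main()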
Setup
I have the following tree structure in my project:
Cineaste/
├── cineaste/
│   ├── __init__.py
│   ├── metadata_errors.py
│   ├── metadata.py
│   └── tests/
│       └── __init__.py
├── docs/
├── LICENSE
├── README.md
└── setup.py
metadata.py imports metadata_errors.py with the expression:
from .metadata_errors import *
thus using a relative path to the module in the same directory (note the dot prefix).
I can run metadata.py in the PyCharm 2016 editor just fine with my run configuration.
Problem
However, with this configuration I cannot debug metadata.py. PyCharm returns the following error message (partial stack trace):
from .metadata_errors import *
SystemError: Parent module '' not loaded, cannot perform relative import
PyCharm debugger is being called like so:
/home/myself/.pyenv/versions/cineaste/bin/python /home/myself/bin/pycharm-2016.1.3/helpers/pydev/pydevd.py --multiproc --module --qt-support --client 127.0.0.1 --port 52790 --file cineaste.metadata
Question
How can I set up this project so that PyCharm is able to run and debug a file that makes relative imports?
Today (PyCharm 2018.3) it is really easy, but not obvious.
You can choose the target to run, a script path or a module name, by clicking the "Script path" label in the Edit Configuration window.
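Setting the target to "Module name" (here, cineaste.metadata) makes PyCharm launch the file the way python -m does: the cineaste package is imported first, so the relative import in metadata.py resolves.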
One possible solution is to run your module through an intermediate script, which you then run in debug mode.
E.g. test_runner.py:
import runpy
runpy.run_module('cineaste.metadata', run_name='__main__')  # run_name mimics "python -m"
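Because run_module imports the cineaste package before executing metadata, the relative import resolves while the debugger attaches to test_runner.py. Passing run_name='__main__' mimics python -m; it matters only if metadata.py guards its entry point with if __name__ == '__main__'.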
You might also try removing the last node (/cineaste) from the Working Directory. This configuration works for me (run and debug) in PyCharm 2017.2.2.
I would also suggest not using import *, since it can cause many problems in the future, e.g. two classes or methods with the same name shadowing each other.