Using argparse scripts with a larger project - python-3.x

My objective is to use an argparse command-line script alongside a larger project. In particular, I want to use some models from my project inside argparse_command.py.
Let's say the project structure is as follows:
myproject/
├── app
│   ├── db.py
│   └── __init__.py
├── management
│   └── argparse_command.py
└── main.py
And the body of argparse_command.py:
import argparse
from app.db import engine
print('Print this when run from command-line.')
When I run the code from the command line I get an error:
Traceback (most recent call last):
  File "argparse_command.py", line 2, in <module>
    from app.db import engine
ModuleNotFoundError: No module named 'app'
So the problem is importing things from within my project.
I have two questions:
Is it possible at all, and if so, how do I do it?
What is the difference between running main.py from an IDE (PyCharm) and running argparse_command.py from the command line (why does it behave differently)?

To be able to import app, you'd either have to install app as a Python package (using a setup.py to specify how to install it, for example), or run the script from the myproject working directory so your Python interpreter has direct access to app on its path. So to answer question 1: yes, it's possible.
The difference between running main.py from an IDE and running on the command line comes down to the working directory the script runs in, which Python executable (and virtual environment, if any) is used, and which flags are passed. Check your IDE's run configuration to see how it invokes the script, and replicate that on the command line if you want.
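One way to satisfy that (a sketch, not part of the original answer, assuming the myproject layout above) is to have argparse_command.py put the project root on sys.path before importing app:

```python
import os
import sys

# Project root = parent of the directory holding this script
# (management/argparse_command.py -> myproject/).
project_root = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))

# Prepend it so "from app.db import engine" resolves regardless of the
# directory the script is launched from.
if project_root not in sys.path:
    sys.path.insert(0, project_root)
```

Alternatively, run the script as a module from the project root (python -m management.argparse_command), which puts myproject on sys.path automatically.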

Related

Python execution order in executable directory

I am trying to create an executable directory. It appears that the code in the __init__.py in one of my subpackages is executing before the main.py file in the root directory. Why is that?
Since you didn't describe any particular package structure in your question, allow me to conjure one up for the sake of example. Let's say your package structure looks like the following:
package/
├── __init__.py
├── __main__.py
└── subpackage
    ├── __init__.py
    └── submodule.py
and that package/__main__.py contains
print("before import in", __name__)
import package.subpackage.submodule
print("after import in", __name__)
while the files package/__init__.py, package/subpackage/__init__.py, and package/subpackage/submodule.py all contain
print(__name__)
(Note that __name__ is just a fancy global variable that holds the name of the current module).
If we try to run our package using the command
$ python3 -m package
we get the following output
package
before import in __main__
package.subpackage
package.subpackage.submodule
after import in __main__
This tells us that the package's top-level __init__ module was the first to be loaded by the interpreter, followed by __main__. In the process of running __main__, we encounter an import statement, which causes the interpreter to briefly halt execution to load the desired module. When loading a module, Python checks whether each intermediate package has already been loaded. Any packages that haven't been loaded yet will be loaded first, so importing package.subpackage.submodule results in package/subpackage/__init__.py being run, followed by package/subpackage/submodule.py. Only once all this is completed does control return back to __main__.
In your package, the __init__.py of your subpackage is not executing before __main__.py per se. Rather, your main module is (presumably) importing a module from the subpackage, which results in the subpackage's __init__ module being loaded, as demonstrated above.
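The "already loaded" check goes through the sys.modules cache; a minimal illustration using the standard json package (not the example package above):

```python
import sys

# The first import executes json/__init__.py and caches the module object.
import json
assert "json" in sys.modules
first = sys.modules["json"]

# A second import finds the cached entry; nothing is re-executed, and the
# very same module object is returned.
import json
assert sys.modules["json"] is first
```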

Local directory shadowing 3rd party package

I'm trying to figure out if this is an error in my design, or an error in the redis-py library. Essentially, my understanding of namespaces in Python is that packages should be designed such that all components are under the package's namespace. Meaning, if I have a queue in packageA and a queue in packageB, there should be no collision since they are namespaced (packageA.queue and packageB.queue). However, I'm running into an error in a package I am building.
This is the directory structure for the package I am building:
○ → tree
.
├── __init__.py
├── net
│   ├── __init__.py
│   └── rconn.py
└── test.py
The __init__.py files are all empty. Here's the code of my test.py file:
○ → cat test.py
from net import rconn
and here's the code from my net/rconn.py file:
○ → cat net/rconn.py
import redis
Running test.py, everything works, no errors. However, if I add a queue directory and create an empty __init__.py within it, here's the new tree:
○ → tree
.
├── __init__.py
├── net
│   ├── __init__.py
│   └── rconn.py
├── queue
│   └── __init__.py
└── test.py
Running test.py results in the following error:
Traceback (most recent call last):
  File "test.py", line 1, in <module>
    from net import rconn
  File "/Users/yosoyunmaricon/python_test/net/rconn.py", line 1, in <module>
    import redis
  File "/usr/local/lib/python3.7/site-packages/redis/__init__.py", line 1, in <module>
    from redis.client import Redis, StrictRedis
  File "/usr/local/lib/python3.7/site-packages/redis/client.py", line 10, in <module>
    from redis._compat import (b, basestring, bytes, imap, iteritems, iterkeys,
  File "/usr/local/lib/python3.7/site-packages/redis/_compat.py", line 139, in <module>
    from queue import Queue
ImportError: cannot import name 'Queue' from 'queue' (/Users/yosoyunmaricon/python_test/queue/__init__.py)
So, I get what's happening. The Redis code says from queue import Queue, and when I create an empty queue directory (i.e., no Queue), it breaks the package. My question is this: Is that good design? Should the Redis package be more explicit and say something along the lines of from redis.queue import Queue, or is this simply an error in my own design?
It's not the Redis package that should adjust here: it cannot know or handle all the different ways users might integrate it into their own applications, such as your similarly named queue package. Furthermore, there is no redis.queue because that queue is not part of redis; it is Python's built-in queue module. You can go to /usr/local/lib/python3.7/site-packages/redis/_compat.py and print out queue.__file__, which would give you the path to Python's queue. The code expects to import the built-in queue module.
Unfortunately for you, when Python builds the module search paths for resolving imports, it builds it in the following order:
The directory containing the input script (or the current directory when no file is specified).
PYTHONPATH (a list of directory names, with the same syntax as the shell variable PATH).
The installation-dependent default.
...which puts your own queue at the start of the list and that's what gets imported. So, yes, getting an ImportError because you shadowed the built-in queue is more of an error in your own design.
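A quick way to see this search order (a sketch; run it from a directory without a shadowing queue/ folder):

```python
import sys

# sys.path[0] is the directory of the running script (or '' meaning the
# current directory in interactive mode); it is searched before
# site-packages and the standard library locations.
print(sys.path[0])

# With no local queue/ directory shadowing it, this resolves to the
# standard library's queue module, which does provide Queue.
import queue
print(queue.Queue)
```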
You could probably do some tricks here with sys.path or PYTHONPATH, but why bother when you can just rename your queue to something else. Or, what I usually do is group my own packages into a parent folder, named after an acronym for the project ("abcdlibs"), some app identifier, or something like "mylibs":
.
├── __init__.py
├── mylibs
│   └── queue
│       └── __init__.py
├── mynet
│   ├── __init__.py
│   └── rconn.py
└── test.py
That way, you could make it clear that mylibs.queue is different from queue.

How to import from a sibling directory in python3?

I have the following file structure:
bot
├── LICENSE.md
├── README.md
├── bot.py # <-- file that is executed from command line
├── plugins
│   ├── __init__.py
│   ├── debug.py
│   └── parsemessages.py
├── helpers
│   ├── __init__.py
│   ├── parse.py
│   └── greetings.py
└── commands
   ├── __init__.py
   └── search.py
bot.py, when executed from the command line, will load in everything in the plugins directory.
I want plugins/parsemessages.py to import parse from the helpers directory, so I do this:
# parsemessages.py
from ..helpers import parse
parse.execute("string to be parsed")
I run python3 bot.py from the command line.
I get the following error:
File "/home/bot/plugins/parsemessages.py", line 2, in <module>
from ..helpers import parse
ValueError: attempted relative import beyond top-level package
So I change two dots to one:
# parsemessages.py
from .helpers import parse
parse.execute("string to be parsed")
...but I get another error:
File "/home/bot/plugins/parsemessages.py", line 2, in <module>
from .helpers import parse
ImportError: No module named 'plugins.helpers'
How can I get this import to work?
It's worth noting that I'm not attempting to make a package here, this is just a normal script. That being said, I'm not willing to mess around with sys.path - I want this to be clean to use.
Additionally, I want parse to be imported as parse - so for the example above, I should be typing parse.execute() and not execute().
I found this post and this post, but they start with a file that's quite deep in the file structure (mine is right at the top). I also found this post, but it seems to be talking about a package rather than just a regular .py.
What's the solution here?
You could remove the dots, and it should work:
# parsemessages.py
from helpers import parse
parse.execute("string to be parsed")
That's probably your best solution if you really don't want to make it a package. You could also nest the entire project one directory deeper, and call it like python3 foo/bot.py.
Explanation:
When you're not working with an actual installed package and just importing stuff relative to your current working directory, everything in that directory is considered a top-level package. In your case, bot, plugins, helpers, and commands are all top-level packages/modules. Your current working directory itself is not a package.
So when you do ...
from ..helpers import parse
... helpers is considered a top-level package, because it's in your current working directory, and you're trying to import from one level higher than that (from your current working directory itself, which is not a package).
When you do ...
from .helpers import parse
... you're importing relative to plugins. So .helpers resolves to plugins.helpers.
When you do ...
from helpers import parse
... it finds helpers as a top-level package because it's in your current working directory.
If you want to execute your code from the root, my best answer is to add your root folder to the path using os.getcwd().
Be sure your sibling folder has an __init__.py file.
import os
import sys

sys.path.insert(0, os.getcwd())
from sibling import module

Why does importing a module from the same directory give an error?

directory structure:
test/
├── __init__.py
├── line.py
└── test.py
test.py:
from . import line
output:
Traceback (most recent call last):
File "test.py", line 1, in <module>
from . import line
ImportError: cannot import name 'line'
I know I can just write import line, but that may pick up a standard-library module in Python 3.
Why did this error happen? Does Python 3 support this syntax?
PS: I'm not in the interactive console, and the test directory is already a package.
I'm not sure whether you can directly import a module with the same name as a standard-library module. However, you could consider putting your modules inside a package, so your directory structure would look like:
.
├── main.py
└── package
    ├── __init__.py
    ├── line.py
    └── test.py
Where __init__.py might have some setup commands (you can omit this file if you have none) and in main.py you can do the following:
from package import line
...
Within the directory package if you'd like to say, import line.py in test.py you can use the syntax:
from . import line
Note that relative imports (using the from . notation) only work when the module is loaded as part of a package, not in the interactive console. Running test.py directly (if it uses relative imports) will also not work, although importing it from main.py will.
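Underneath, relative imports rely on __package__, which is only set to a package name when the module is loaded as part of a package (e.g. python -m package.test); a small check you can drop into any file:

```python
# When a file is executed directly (python somefile.py), __package__ is
# None or "", so "from . import line" has no package to be relative to
# and fails. Under "python -m package.test" it would be "package" instead.
print(__name__)     # "__main__" when executed directly
print(__package__)  # None or "" when executed directly
```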

Import py file in another directory in Jupyter notebook

My question is related to this. I am using Python 3.6 in Jupyter Notebook. My project directory is /user/project. In this directory I'm building a number of models and each has its own folder. However, there is a common functions.py file with functions that I want to use across all models. So I want to keep the functions.py file in /user/project but be able to call it from an .ipynb file in /user/project/model1, /user/project/model2, etc... How can I do this?
There is no simple way to import Python files from another directory; this is unrelated to Jupyter Notebook.
Here are three solutions to your problem:
You can add the directory containing the file you want to import to your path and then import the file like this:
import sys
sys.path.insert(0, '/path/to/application/app/folder')
import file
You can create a local module by adding an empty __init__.py file to the folder you want to import from. There are some rules regarding the folder hierarchy that you have to take into consideration.
You can create a module for the file you wish to import and install it globally.
Assuming you have a folder named Jupyter and you wish to import modules (employee) from another folder named nn_webserver, do this:
import sys
import os

# Absolute path of the parent directory, then its nn_webserver subfolder
module_path = os.path.abspath(os.path.join('..'))
if module_path not in sys.path:
    sys.path.append(os.path.join(module_path, "nn_webserver"))
from employee import motivation_to_work
See @metakermit's answer for additional information.
I've been thinking about this problem because I don't like the sys.path.append() answers. A solution I propose uses the built-in Jupyter magic command to change the current working directory. Assuming you have this file structure:
project
├── model1
| └── notebook1.ipynb
├── model2
| └── notebook2.ipynb
└── functions.py
Whether you wanted to import functions from notebook1.ipynb or notebook2.ipynb, you could simply add a cell with the following line before the cell that has your package imports:
%cd ..
This changes the current working directory to the parent directory of the notebook, which then adds the path of the functions module to the default locations that Python will check for packages. To import functions:
import functions
This would work similarly if you had multiple modules in the same package directory that you wanted to import:
project
├── model1
| └── notebook1.ipynb
├── model2
| └── notebook2.ipynb
└── package
├── functions1.py
└── functions2.py
You can import both modules functions1 and functions2 from package like this:
from package import functions1, functions2
EDIT: As pointed out below, the local imports will no longer work if the cell containing the magic command is run more than once (each rerun changes the current working directory up another level). To prevent this, the %cd .. command should go in its own cell (not in the same cell as the imports) at the top of the notebook, before the imports, so it won't be run multiple times. Restarting the kernel and running all cells will reset the current working directory, and will still produce the desired imports/results.
I've solved this problem in the past by creating a symbolic link in the directory where the Jupyter notebook is located to the library it wants to load, so that python behaves as if the module is in the correct path. So for the example above, you would run the following command once per directory inside a Jupyter cell:
!ln -s /user/project/functions.py functions.py
and then you could import with
import functions
Note: I've only tried this on Linux and macOS, so I can't vouch for Windows.
I would suggest installing functions.py as a package in your virtual environment. There are some benefits to this:
You can access the functions.py file from any iPython notebook located anywhere, as long as it uses the given environment (kernel).
Once you change any function in the functions.py file, you don't need to reload your iPython notebook again and again; it will automatically pick up every change.
This is how it can be done:
Create a setup.py file (https://docs.python.org/2/distutils/setupscript.html) in your project folder.
Activate your virtual environment, go to your project location, and run pip install -e .
Then, in your iPython notebook:
%load_ext autoreload
%autoreload 1
%aimport yourproject.functions
from yourproject.functions import *
That's it!
In addition to the answer from adhg, I recommend using pathlib for compatibility between Linux/Windows/WSL path formats:
Assuming the following folder structure:
.
├── work
│   ├── notebook.ipynb
│   └── my_python_file.py
├── py
│   ├── modules
│   │   ├── __init__.py  # empty
│   │   └── preparations.py
│   ├── __init__.py  # empty
│   └── tools.py
├── .git
└── README.md
To load tools.py or preparations.py in my_python_file.py (or in the notebook notebook.ipynb):
import sys
from pathlib import Path

# in jupyter (lab / notebook), based on the notebook's working directory
module_path = str(Path.cwd().parents[0] / "py")
# in standard python, based on this file's location
module_path = str(Path(__file__).resolve().parents[1] / "py")

if module_path not in sys.path:
    sys.path.append(module_path)

from modules import preparations
import tools
...
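The parents indices differ between the two lines above because the notebook's working directory is already a directory while __file__ is a file; a small illustration with a made-up path (PurePosixPath keeps the separators deterministic across platforms):

```python
from pathlib import PurePosixPath

# For a file inside work/, parents[0] is work/ itself and parents[1] is
# the project root; for a cwd that is already work/, parents[0] is the
# root. The path below is purely illustrative.
p = PurePosixPath("/repo/work/my_python_file.py")
assert str(p.parents[0]) == "/repo/work"
assert str(p.parents[1]) == "/repo"
```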
I found myself in the exact same situation as the OP: about to create several notebooks, hence the wish to organise them in different subfolders.
I tried this, which seems to do what I need and seems cleaner to me:
import os
os.chdir(os.path.dirname(os.path.dirname(os.getcwd())))
My functions file lives two levels above, so I nested two os.path.dirname calls (with a different folder structure it could be only one, or more).
I just implemented it and it's working fine; by the way, I'm using JupyterLab started two levels above where the functions file resides.
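Each os.path.dirname call strips one trailing path component, which is why two nested calls climb two levels; a small illustration with a made-up path (posixpath keeps the separators deterministic across platforms):

```python
import posixpath

# Hypothetical working directory, two levels below the project root.
cwd = "/repo/project/model1/notebooks"
one_up = posixpath.dirname(cwd)
two_up = posixpath.dirname(one_up)
assert one_up == "/repo/project/model1"
assert two_up == "/repo/project"
```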
