Multi Level import in Python not working with Airflow - python-3.x

My file structure looks like this:
Module
|
|--Common
| |
| utils.py
| credentials.ini
|-- Folder2
|--Folder3
|--Folder4
|
Folder5
|
comp.py
I need to import utils.py functions in the comp.py file, but the problem is that utils itself needs the credentials.ini file for it to work.
I solved the problem in utils.py by giving it an absolute path, like this: path=join(dirname(os.path.realpath(__file__)), 'credentials.ini')
and in the comp.py file I added the package path to sys.path using
import sys
sys.path.append("../../")
This worked when I ran comp.py directly, but I need to schedule it on Airflow. Whenever Airflow schedules comp.py to run, it can't find utils.py (Airflow and the module package are in different paths). Any idea how I can resolve it? I don't want to manually add the utils.py path to the env.
P.S. The whole directory is initialized as a package. I have added __init__.py to the main module folder as well as to all the subdirectories in it.

Airflow loads DAGs in a sandboxed environment, and it does not handle all the various ways importing works when you run a Python file as a script. This is due to security and to the way the different components of the distributed system work.
See https://airflow.apache.org/docs/apache-airflow/stable/modules_management.html for details. More detailed information (especially the "best practices" section) is in the "development" version of the documentation that will be released in 2.2:
http://apache-airflow-docs.s3-website.eu-central-1.amazonaws.com/docs/apache-airflow/latest/modules_management.html#best-practices-for-module-loading
There are some best practices to follow:
Place all your python files in one of the modules that are already on pythonpath
Always use absolute imports, do not use "relative" references
Don't rely on your current working directory setting (this is likely what your problem really was: your current working directory was different from what you expected).
In your case what will likely work is:
write a function in your "utils.py", for example "get_credentials_folder()".
in this function, use __file__ to derive the path of "utils.py" and find the absolute path of the expected folder containing it (use pardir and abspath)
add the absolute path you get to sys.path
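The three steps above could be sketched roughly as follows. This is only a sketch under assumptions: get_credentials_folder() is the name suggested above, while add_package_root_to_path(), the module_file parameter and the levels_up parameter are hypothetical names added for illustration. How many levels to climb depends on which folder your imports are written relative to.

```python
import sys
from os.path import abspath, dirname, join, pardir


def get_credentials_folder(module_file=None):
    """Return the absolute path of the folder holding utils.py and credentials.ini."""
    if module_file is None:
        module_file = __file__  # path of this utils.py, independent of the working directory
    return dirname(abspath(module_file))


def add_package_root_to_path(module_file=None, levels_up=1):
    """Append an ancestor of the credentials folder to sys.path (only once).

    levels_up is an assumption: 1 gives the parent of the folder that
    contains utils.py; adjust it to match how your imports are written.
    """
    root = get_credentials_folder(module_file)
    for _ in range(levels_up):
        root = abspath(join(root, pardir))
    if root not in sys.path:
        sys.path.append(root)
    return root
```

Inside utils.py the credentials file would then be join(get_credentials_folder(), "credentials.ini"), and the DAG file could call add_package_root_to_path() before importing from the package.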

Related

Is there any way of changing the relative path of a mjs file (to make it see a different node_modules folder)?

When I'm executing pure node.js scripts I can change the path where it searches for node_modules as the following:
export NODE_PATH="/home/user/node_modules"
node "/home/user/different/path/file.js"
This way, I can make scripts located inside /home/user/different/path/ see the node_modules folder located in /home/user/ when they are executed.
So far everything is fine, the problem starts with .mjs files. If I try running:
export NODE_PATH="/home/user/node_modules"
node "/home/user/different/path/file.mjs"
I'll receive the error Error [ERR_MODULE_NOT_FOUND] for the modules that I use in my code. The workaround that I know for that is creating a symbolic link inside my script's folder. Something like:
ln -s "/home/user/node_modules" "/home/user/different/path/node_modules"
After doing that, if I run node "/home/user/different/path/file.mjs" it'll work as expected and I'll be able to use libraries installed on /home/user/node_modules with the import statement in my script. However, I'd like to find a solution that doesn't require me to create a symbolic link of the node_modules folder. Is there any alternative solution when I'm working with .mjs files that allows me to change its relative path?

Faust doesn't like relative path

I'm trying to clean up my code and have moved the models.py file to the top level, as other modules other than the faust ones will use this now.
The folder structure is below (albeit cut down for simplicity)
App
|
├── models
| ├── models.py
|
├── kafka
| ├── agent_a.py
|
├── servers
| ├── fastapi_server.py
Both the fastapi_server.py and the agent_a.py need access to models.py. If I run the server from the App directory it works ok. But when I try to run the following to start the faust agent also from the App directory it returns a No module named 'kafka.agent_a' error:
C:\path\to\App> faust -A kafka.agent_a:app worker
What is strange is that when I run the same command from a completely different directory that just has the faust/kafka stuff in there it works. What could possibly be happening for it to report the error?
But also note that when I run the server using:
C:\path\to\App> uvicorn servers.fastapi_server:app
it doesn't complain about the module at all. And if I try to run the faust application using:
C:\path\to\App> python kafka\agent_a.py worker
It then complains about the models not being a module. So I'm just completely confused about why one python script runs ok and the other doesn't... but it does run normally in a different directory
I've always found imports in python ridiculous to get my head around, but this one is significantly more stupefying.
Renaming the directory as suggested above did kind of solve the issue, but we still do have a relative imports issue that is a python problem, not a Faust one. But the import issue is completely resolvable with the use of the terminal commands, so we'll just stick with that.
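The thread does not show which rename was suggested. One possible explanation (an assumption, not something confirmed above) is that the local kafka folder competes with an installed package of the same name. A small diagnostic like the following, run from the App directory, shows where Python would resolve a top-level name from. where_is() is a hypothetical helper name.

```python
import importlib.util


def where_is(name):
    """Return the location(s) Python would load the top-level module `name` from.

    Returns an empty list when the name cannot be found at all.
    """
    spec = importlib.util.find_spec(name)
    if spec is None:
        return []
    if spec.submodule_search_locations is not None:
        # Packages (including namespace packages) report their folders here.
        return list(spec.submodule_search_locations)
    # Plain modules report a single origin (a file path, or "built-in").
    return [spec.origin]


if __name__ == "__main__":
    for candidate in ("kafka", "models", "servers"):
        print(candidate, where_is(candidate))
```

If the path printed for kafka is not the folder inside App, the name is being taken by something else on sys.path, which would be consistent with the rename helping.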

How can I set up my imports in order to run my python application without installing it, and still be able to run tox using poetry?

I have a python 3.6 code-base which needs to be installed in the environment's site-packages directory in order to be called and used. After moving it to docker, I decided that I should set up a shared volume between the docker container and the host machine in order to avoid copying and installing the code on the container and having to rebuild every time I made a change to the code and wanted to run it. In order to achieve this, I had to change a lot of the import statements from relative to absolute. Here is the structure of the application:
-root
-src
-app
-test
In order to run the application from the root directory without installing it, I had to change a lot of the import statements from
from app import something
to:
import src.app.something
The problem is that I use poetry to build the app on an azure build agent, and tox to run the tests. The relevant part of my pyproject.toml file looks like this:
[tool.poetry]
name = "app"
version = "0.1.0"
packages = [{include = 'app', from='src'}]
The relevant part of my tox.ini file looks like this:
[tox]
envlist = py36, bandit, black, flake8, safety
isolated_build = True
[testenv:py36]
deps =
pytest
pytest-cov
pytest-env
pytest-mock
fakeredis
commands =
pytest {posargs} -m "not external_service_required" --junitxml=junit_coverage.xml --cov-report=html --cov-report=xml:coverage.xml
I'm not an expert in tox or poetry, but from what I could tell, the problem was that the src directory wasn't being included in the build artifact, only the inner app directory was, so I added a parent directory and changed the directory structure to this:
-root
-app
-src
-app
-test
And then changed the poetry configuration to the following in order to include the src directory
[tool.poetry]
name = "app"
version = "0.1.0"
packages = [{include = 'src', from='app'}]
Now when I change the imports in the tests from this:
from app import something
to this:
from app.src.app import something
The import is recognized in PyCharm, but when I try to run tox -r, I get the following error:
E ModuleNotFoundError: No module named 'app'
I don't understand how tox installs the application, and what kind of package structure I need to specify in order to be able to call the code both from the code-base directory and from site packages. I looked at some example projects, and noticed that they don't use the isolated_build flag, but rather the skip_dist flag, but somehow they also install the application in site packages before running their tests.
Any help would be much appreciated.
Specs:
poetry version: 1.1.6
python version:3.6.9
tox version:3.7
environment: azure windows build agent
You have to change the imports back to from app import something. With respect to the code as a deliverable, the src part is completely transient. The same goes for adding another app directory: your initial project structure was fine.
You were right about going from relative imports to absolute ones though, so all that is necessary thereafter is telling your python runtime within the container that root/src should be part of the PYTHONPATH:
export PYTHONPATH="${PYTHONPATH}:/path/to/app/src"
Alternatively, you can also update the path within your python code right before importing your package:
import sys
sys.path.append("/path/to/root/src")
import app # can be found now
Just to state the obvious, meddling with the interpreter in this way is a bit hacky, but as far as I'm aware it should work without any issues.
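For the PYTHONPATH variant, here is a minimal sketch of the same idea driven from Python, which can be handy for checking that a given src directory really makes the package importable. run_with_pythonpath() is a hypothetical helper, not part of poetry or tox, and the paths in the usage block are the placeholder paths from above.

```python
import os
import subprocess
import sys


def run_with_pythonpath(extra_dir, code):
    """Run `code` in a child interpreter with extra_dir appended to PYTHONPATH."""
    env = dict(os.environ)
    existing = env.get("PYTHONPATH", "")
    # os.pathsep is ":" on POSIX and ";" on Windows, so the same code
    # works inside a Linux container and on a Windows build agent.
    env["PYTHONPATH"] = os.pathsep.join(p for p in (existing, extra_dir) if p)
    return subprocess.run(
        [sys.executable, "-c", code],
        env=env,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,  # text mode, spelled so it also works on Python 3.6
    )


if __name__ == "__main__":
    result = run_with_pythonpath("/path/to/root/src", "import app; print(app.__file__)")
    print(result.returncode, result.stdout, result.stderr)
```

A non-zero return code with a ModuleNotFoundError in stderr would suggest that the directory passed in is not the one that directly contains the app package.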

PhpStorm doesn't recognize package.json name of local directories

I'm using ReactNative and I have package.json in my local directories so I can have easier imports.
Example:
I have src/components folder and I want to import all components as :
import Button from 'components/Button';
and not use relative path as
import Button from '../../../components/Button';
I created package.json file in my components folder with name 'components' and now I can access Button component as needed.
But, there is problem with PhpStorm. PhpStorm doesn't recognize this as valid path. Is there any workaround for this?
This React Native hack for specifying absolute paths (not officially documented anywhere, as far as I can tell) has never been supported. If you miss this feature, please follow WEB-23221 for updates. As a workaround, you can try creating a dummy webpack config as suggested in https://youtrack.jetbrains.com/issue/WEB-23221#focus=streamItem-27-2719626.0-0 and specifying a path to it in Settings | Languages & Frameworks | JavaScript | Webpack.
Another workaround (if you aren't renaming paths, just making them shorter) is marking the parent folder of the components directory as a Resource root (note: not the subdirectory itself, but its parent dir!)

How can I include the parent folder structure on a library distribution in Python 3.6 using setuptools?

I am using setuptools to distribute a Python library. I have the following directory structure:
/src
/production
setup.py
/prod-library
/package1
/package2
The folder structure has to stay like this because there will be multiple libraries living under src in the future and need to have their own setup.py files. So the traditional answer of having 1 parent folder and moving out setup.py to the root folder will not work in this case.
I am using the following in the setup.py of the library to export the library (which is working)
package_dir={'': '.'},
packages=find_packages()
Inside the project tar.gz it looks like this:
/prod-library
/package1
/package2
But inside the prod-library package Python files, imports referencing other modules need to be structured as follows:
import src.production.prod-library.package1
import src.production.prod-library.package2
The problem:
After importing one of those libraries to a different project, errors are raised as follows:
ModuleNotFoundError: No module named 'src.production'
Since the build only drops in the /prod-library package, the project importing the code fails due to the missing folder structure (src/production) since the built distribution only has /prod-library.
What I need to do is include the src/production folder in the distribution build so the resulting tar.gz file looks like this:
/src
/production
/prod-library
/package1
/package2
I am not sure how I can get those in the build structure since they are above the setup.py location. How can that be accomplished?
If it can’t, then I am open to suggestions about fixing the imports if that can be a solution.
I found the solution to the problem. It has to do with how the package_dir was configured:
package_dir={'': '.'}
Although the above package_dir built the files and included all subfolders as expected, the egg-info file's SOURCES.txt was incorrect and showing as follows:
./prod-library/__init__.py
./prod-library/package1/__init__.py
etc...
When the package was imported into another API, the imports could not be found when attempting import prod-library.package1.file.py
After changing the package_dir as follows, I was able to use the library normally:
package_dir={'.': ''}
The above effectively removed the ./ prefix in the SOURCES.txt file which was breaking the imports. Now the egg-info's SOURCES.txt looks correct:
prod-library/__init__.py
prod-library/package1/__init__.py
etc...
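To illustrate why imports written against the source tree fail once only the top-level package is installed, here is a small self-contained sketch. It uses the hypothetical name prod_library, because a hyphen is not valid in a Python import statement. build_demo_layout() and can_import() are illustration-only helpers, and the temporary folder stands in for site-packages.

```python
import importlib
import os
import sys
import tempfile


def build_demo_layout(base):
    """Create base/prod_library/package1 as importable packages (hypothetical names)."""
    top = os.path.join(base, "prod_library")
    sub = os.path.join(top, "package1")
    os.makedirs(sub)
    for folder in (top, sub):
        # An empty __init__.py marks the folder as a regular package.
        open(os.path.join(folder, "__init__.py"), "w").close()
    return base


def can_import(name):
    """Return True if `name` can be imported with the current sys.path."""
    try:
        importlib.import_module(name)
    except ImportError:
        return False
    return True


if __name__ == "__main__":
    site = build_demo_layout(tempfile.mkdtemp())
    sys.path.append(site)  # stands in for the folder the distribution is installed into
    print(can_import("prod_library.package1"))
    print(can_import("src.production.prod_library.package1"))
```

Only the name that starts at the installed top-level package is expected to resolve, so imports inside the library would need to be written relative to that name rather than to the repository layout.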
