PyTest and PyLint cannot import source code - python-3.x

I have a structure like below:
src/py_meta/
    handlers/
    internals/
    __init__.py
    main.py
tests/
    __init__.py
    test_main.py
pyproject.toml
test_main.py
"""Test the main function
"""
import requests
from src.py_meta import main as py_meta
from src.py_meta.internals.metadata import MetaData
image = requests.get(
    "https://file-examples.com/storage/fea8fc38fd63bc5c39cf20b/2017/10/file_example_JPG_500kB.jpg",
    timeout=6,
)
with open("demo_file.jpg", "wb") as f:
    f.write(image.content)
def test_read():
    """Tests for a MetaData object returned."""
    assert isinstance(py_meta.read("demo_file.jpg"), MetaData)
I then get a ModuleNotFoundError on the import statements:
from src.py_meta import main as py_meta
from src.py_meta.internals.metadata import MetaData
# I have tried without the src.
And the relevant contents of pyproject.toml:
[build-system]
requires = ["hatchling"]
build-backend = "hatchling.build"
------------------------------------
[tool.pytest.ini_options]
pythonpath = "src"
addopts = ["--import-mode=importlib"]
[tool.pylint.ini_options]
pythonpath = "src"
Full code is on github: https://github.com/pyscripter99/python-metadata
I have tried following the pytest documentation and also followed the Python guide. I expect the imports to work without any issues from pylint or pytest, and for the project to remain usable as a Python package in the future.
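A minimal sketch of what pythonpath = "src" is meant to achieve, assuming the option simply places the src directory at the head of sys.path (pytest 7.0 and later): the package then becomes importable as py_meta rather than src.py_meta. The temporary layout and the stub read function are hypothetical stand-ins for the real project, and this may not be the only cause of the error.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def import_from_src_layout():
    """Build a throwaway src/py_meta package and import it via sys.path."""
    with tempfile.TemporaryDirectory() as root:
        package_dir = Path(root, "src", "py_meta")
        package_dir.mkdir(parents=True)
        (package_dir / "__init__.py").write_text("")
        # Hypothetical stub standing in for the real main.py
        (package_dir / "main.py").write_text("def read(path):\n    return path\n")

        src_dir = str(Path(root, "src"))
        sys.path.insert(0, src_dir)  # roughly what pythonpath = "src" does
        try:
            main = importlib.import_module("py_meta.main")
            return main.__name__, main.read("demo_file.jpg")
        finally:
            # Undo the changes so the sketch leaves no trace behind
            sys.path.remove(src_dir)
            for name in ("py_meta.main", "py_meta"):
                sys.modules.pop(name, None)


print(import_from_src_layout())
```

If that assumption holds, the test imports would read from py_meta import main as py_meta, without the src. prefix.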

Related

Trying to extract geographic coordinates from a .pdf file with python3

I am trying to extract geographic coordinates in UTM format from a .pdf file with python3 on the Ubuntu operating system, with the following code:
from pathlib import Path
import textract
import numpy as np
import re
import os
import pdfminer
def main(_file):
    try:
        text = textract.process(_file, method="pdfminer")
    except textract.exceptions.ShellError as ex:
        print(ex)
        return
    with open("%s.csv" % Path(_file).name[: -len(Path(_file).suffix)],
              "w+") as _file:
        # find orders and DNIs
        coords = re.compile(r"\d?\.?\d+\.+\d+\,\d{2}")
        results = re.findall(coords, text.decode())
        if results:
            _file.write("|".join(results))
if __name__ == "__main__":
    _file = "/home/cristian33/python_proj/folder1/buscarco.pdf"
    main(_file)
When I run it, it gives me the following error:
The command pdf2txt.py /home/cristian33/python_proj/folder1/buscarco.pdf failed because the executable
pdf2txt.py is not installed on your system. Please make
sure the appropriate dependencies are installed before using
textract:
http://textract.readthedocs.org/en/latest/installation.html
Does anybody know why this error occurs?
Thanks
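The error message says that textract could not launch the pdf2txt.py executable, so one quick check is whether that script is visible on PATH from the same environment that runs the code. The sketch below uses only shutil.which from the standard library; the helper name find_executable is made up for illustration. pdf2txt.py is normally installed as a command-line script by the pdfminer.six package, but whether that applies to this setup is an assumption.

```python
import shutil


def find_executable(name="pdf2txt.py"):
    """Return the full path of `name` if it is on PATH, otherwise None."""
    return shutil.which(name)


location = find_executable()
if location is None:
    print("pdf2txt.py is not on PATH for this interpreter's environment")
else:
    print("pdf2txt.py found at", location)
```

If the script is reported as missing, the textract installation page linked in the error message lists the dependencies to install.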

Where to do package imports when importing multiple python scripts?

This might have been answered before, but I could not find anything that addresses my issue.
So, I have 2 files.
|
|-- test.py
|-- test1.py
test1.py is as below
def fnc():
    return np.ndarray([1,2,3,4])
I'm trying to call test1 from test and calling the function like
from test1 import *
x = fnc()
Now naturally I'm getting NameError: name 'np' is not defined.
I tried to write the import both in test and test1 as
import numpy as np
But still, I'm getting the error. This might be silly, but what exactly am I missing?
Any help is appreciated. Thanks in advance.
Each Python module has its own namespace, so if some functions in test1.py depend on numpy, you have to import numpy in test1.py:
# test1.py
import numpy as np
def fnc():
    # NB: np.ndarray([1,2,3,4]) allocates an uninitialised array of shape
    # (1, 2, 3, 4); use np.array([1,2,3,4]) if you want those four values
    return np.ndarray([1,2,3,4])
If test.py doesn't directly use numpy, you don't have to import it again, i.e.:
# test.py
# NB: do NOT use 'from xxx import *' in production code, be explicit
# about what you import
from test1 import fnc
if __name__ == "__main__":
    result = fnc()
    print(result)
Now if test.py also wants to use numpy, it has to import it too - as I say, each module has its own namespace:
# test.py
# NB: do NOT use 'from xxx import *' in production code, be explicit
# about what you import
import numpy as np
from test1 import fnc
def other():
    return np.ndarray([3, 44, 5])
if __name__ == "__main__":
    result1 = fnc()
    print(result1)
    result2 = other()
    print(result2)
Note that if you were testing your code in a python shell, just modifying the source and re-importing it in the python shell will not work (modules are only loaded once per process, subsequent imports fetch the already loaded module from the sys.modules cache), so you have to exit the shell and open a new one.
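The caching described above can be observed directly. The sketch below writes a throwaway module (the name demo_cached_mod is made up for illustration), imports it, changes the source on disk, and imports it again. It also calls importlib.reload, which the answer does not mention, as an alternative to restarting the shell; reload has its own caveats, for example names bound with from ... import keep pointing at the old objects.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Keep the demo independent of cached .pyc files
sys.dont_write_bytecode = True


def observe_module_cache():
    with tempfile.TemporaryDirectory() as tmp:
        module_path = Path(tmp) / "demo_cached_mod.py"
        module_path.write_text("VALUE = 1\n")
        sys.path.insert(0, tmp)
        try:
            import demo_cached_mod
            first = demo_cached_mod.VALUE

            # Change the source, then import again: the cached module is reused
            module_path.write_text("VALUE = 22\n")
            import demo_cached_mod
            second = demo_cached_mod.VALUE

            # reload re-executes the module body from the updated source
            importlib.reload(demo_cached_mod)
            third = demo_cached_mod.VALUE
        finally:
            sys.path.remove(tmp)
            sys.modules.pop("demo_cached_mod", None)
    return first, second, third


print(observe_module_cache())
```

The second value stays at the old one because the repeated import only looks the module up in sys.modules; only the reload picks up the edit.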
Mostly you need to have an __init__.py in the directory where you have these files.
Just try creating an __init__.py file like below in the directory where your .py files are present and see if it helps.
touch __init__.py

Importing a module does not run the code inside it when running a few unittest files using pytest

I have a few files with tests and an imported file:
# test1.py
print("running test1.py")
import unittest
import imported
class MyTest1(unittest.TestCase):
    def test_method1(self):
        assert "love" == "".join(["lo", "ve"])
    def test_fail1(self):
        assert True, "I failed!"
and
# test2.py
print("running test2.py")
import unittest
import imported
class MyTest2(unittest.TestCase):
    def test_ok2(self):
        self.assertEqual(True, 2==2, "I will not appear!")
    def test_fail2(self):
        self.assertFalse(False, "Doh!")
and
# imported.py
import sys
# Here I am accessing the module that has imported this file
importing = sys.modules[sys._getframe(6).f_globals['__name__']]
print(importing.__name__ + " imported this file")
When I run pytest -s test1.py test2.py, I get this result:
=============================== test session starts ================================
platform darwin -- Python 3.6.0, pytest-3.7.1, py-1.5.4, pluggy-0.7.1
rootdir: /Users/ibodi/MyDocs/progs/Python/unittest_test, inifile:
collecting 0 items running test1.py
test1 imported this file
collecting 2 items running test2.py
collected 4 items
test1.py ..
test2.py ..
============================= 4 passed in 0.08 seconds =============================
The importing variable in imported.py refers to the module that imported the imported.py module.
The problem is that the line import imported in test2.py doesn't cause "test2 imported this file" to appear in the logs. Why is that?
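A likely explanation, not stated in the post itself: within one Python process a module's body is executed only on its first import, and later imports receive the already loaded object from sys.modules. Since pytest collects test1.py and test2.py in the same process, imported.py would run only once. The sketch below reproduces that with a throwaway module; the name demo_imported is made up for illustration.

```python
import contextlib
import importlib
import io
import sys
import tempfile
from pathlib import Path


def count_body_runs():
    """Import the same module twice and count how often its body printed."""
    with tempfile.TemporaryDirectory() as tmp:
        Path(tmp, "demo_imported.py").write_text('print("module body executed")\n')
        sys.path.insert(0, tmp)
        captured = io.StringIO()
        try:
            with contextlib.redirect_stdout(captured):
                first = importlib.import_module("demo_imported")
                second = importlib.import_module("demo_imported")
        finally:
            sys.path.remove(tmp)
            sys.modules.pop("demo_imported", None)
    runs = captured.getvalue().count("module body executed")
    return runs, first is second


runs, same_object = count_body_runs()
print(runs, same_object)
```

Both imports return the same module object and the print statement fires once, which matches the single "test1 imported this file" line in the log above.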

Can't pickle/dill SwigPyObject when serializing dict imported by importlib

I try to serialize (dill) a list containing dill-able objects which is nested inside a dict. The dict itself is imported into my main script using importlib. Calling dill.dump() raises a TypeError: can't pickle SwigPyObject objects. Here is some code with which I managed to reproduce the error for more insight.
some_config.py located under config/some_config.py:
from tensorflow.keras.optimizers import SGD
from app.feature_building import Feature
config = {
    "optimizer": SGD(lr=0.001),
    "features": [
        Feature('method', lambda v: v + 1)
    ],
}
Here is the code which imports the config and tries to dill config["features"]:
import dill
import importlib.util
from config.some_config import config
# the module name must be passed as a string
spec = importlib.util.spec_from_file_location("undillable.config", "config/some_config.py")
module = importlib.util.module_from_spec(spec)
spec.loader.exec_module(module)
undillable_config = module.config
# Works perfectly fine
with open("dillable_config.pkl", "wb") as f:
    dill.dump(config["features"], f)
# Raises TypeError: can't pickle SwigPyObject objects
with open("undillable_config.pkl", "wb") as f:
    dill.dump(undillable_config["features"], f)
Now the part that made me wonder: when the config dict is imported with importlib, dill raises the error, and after some debugging I found that not only config["features"] but also config["optimizer"] gets dilled. With a normal import, however, it works and dill only tries to serialize config["features"].
So my question is: why does dill try to serialize the whole dict, instead of only the feature list, when it is imported by importlib, and how can this error be fixed?
After reading the answer to this question I managed to get it working by avoiding importlib and instead import the config using __import__.
import os
import sys
import dill
filename = "config/some_config.py"
dir_name = os.path.dirname(filename)
if dir_name not in sys.path:
    sys.path.append(dir_name)
file = os.path.splitext(os.path.basename(filename))[0]
config_module = __import__(file)
# Works perfectly fine now
with open("dillable_config.pkl", "wb") as f:
    dill.dump(config_module.config["features"], f)

how to loop through all subfolders images then display them

I have a folder named:
'LIDC-IDRI'
inside this folder I have some other folders named:
'LIDC-IDRI-0001','LIDC-IDRI-0002','LIDC-IDRI-0003', ...
each of these subfolders contains a number of images.
What I want to do is to iterate through all images inside all subfolders and display them using 'imshow' function, can anyone help me do that?
Any help would be appreciated.
@honar.cs, based on your problem statement, I have tried to solve your problem.
Here I want to display all the png and jpg images present inside LIDC-IDRI-0001, LIDC-IDRI-0002, LIDC-IDRI-0003, LIDC-IDRI-0004 directories.
File structure »
H:\RISHIKESHAGRAWANI\PROJECTS\SOF\DISPLAYIMAGES
└───LIDC-IDRI
    │   show_images.md
    │   show_images.py
    │   show_images_temp.py
    │
    ├───LIDC-IDRI-0001
    │       download.jpg
    │       Hacker.jpg
    │
    ├───LIDC-IDRI-0002
    │       images.jpg
    │
    ├───LIDC-IDRI-0003
    │       internet.jpg
    │       Internet.png
    │
    └───LIDC-IDRI-0004
            RishikeshAgrawani-Hygull-Python.jpg
            wallpaper-strange-funny-weird-crazy-absurd-awesome-592.jpg
            waterfalls.jpg
Requirements »
numpy - pip install numpy
matplotlib - pip install matplotlib
Pillow - pip install Pillow
» Python code (Python 3.6)
show_images.py
import os
import json
import glob
import numpy as np
import matplotlib.image as mpimg
import matplotlib.pyplot as plt
image_formats = ["png", "jpg"]  # Let's suppose we want to display png & jpg images (specify more if you want)
def show_images(image_file_name):
    print("Displaying ", image_file_name)
    img = mpimg.imread(image_file_name)
    imgplot = plt.imshow(img)
    plt.show()
def get_image_paths(current_dir):
    files = os.listdir(current_dir)
    paths = []  # To store relative paths of all png and jpg images
    for file in files:
        file = file.strip()
        # Resolve the entry against current_dir, not the working directory
        if os.path.isdir(os.path.join(current_dir, file)) and 'LIDC-IDRI-' in file:
            for image_format in image_formats:
                image_paths = glob.glob(os.path.join(current_dir, file, "*." + image_format))
                if image_paths:
                    paths.extend(image_paths)
    return paths
if __name__ == "__main__":
    image_paths = get_image_paths(".")
    print(json.dumps(image_paths, indent=4))
    # Display all images inside image_paths
    for image_path in image_paths:
        show_images(image_path)
        print('\n')
How to run?
Open terminal and navigate inside LIDC-IDRI directory using cd command and run the below command.
python show_images.py
Output on console »
Images will be opened one by one (once you close 1st image, 2nd image will be displayed and so on).
[
".\\LIDC-IDRI-0001\\download.jpg",
".\\LIDC-IDRI-0001\\Hacker.jpg",
".\\LIDC-IDRI-0002\\images.jpg",
".\\LIDC-IDRI-0003\\Internet.png",
".\\LIDC-IDRI-0003\\internet.jpg",
".\\LIDC-IDRI-0004\\RishikeshAgrawani-Hygull-Python.jpg",
".\\LIDC-IDRI-0004\\wallpaper-strange-funny-weird-crazy-absurd-awesome-592.jpg",
".\\LIDC-IDRI-0004\\waterfalls.jpg"
]
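The path-collection step above can also be sketched with pathlib, as an alternative to the os.listdir and glob calls in the answer. This is an illustration under assumptions: the find_images helper and the temporary folders are made up for the demo, and the display step (plt.imshow) is left out so the snippet runs without matplotlib.

```python
import tempfile
from pathlib import Path

IMAGE_SUFFIXES = {".png", ".jpg"}


def find_images(root, prefix="LIDC-IDRI-"):
    """Return image paths found directly inside subfolders named prefix*."""
    found = []
    for subfolder in sorted(Path(root).iterdir()):
        if subfolder.is_dir() and subfolder.name.startswith(prefix):
            found.extend(
                sorted(
                    p for p in subfolder.iterdir()
                    if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
                )
            )
    return found


# Build a tiny throwaway folder tree to exercise the helper
with tempfile.TemporaryDirectory() as tmp:
    for folder, name in [
        ("LIDC-IDRI-0001", "a.jpg"),
        ("LIDC-IDRI-0002", "b.png"),
        ("other", "c.jpg"),
    ]:
        Path(tmp, folder).mkdir()
        Path(tmp, folder, name).touch()
    Path(tmp, "LIDC-IDRI-0001", "notes.txt").touch()
    names = [p.name for p in find_images(tmp)]

print(names)
```

Each returned path could then be passed to show_images from the answer; comparing the suffix in lower case also picks up files such as IMAGE.JPG.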

Resources