How to implement a Meta Path Importer on Python 3 - python-3.x

I'm struggling to refactor some working import-hook-functionality that served us very well on Python 2 the last years... And honestly I wonder if something is broken in Python 3? But I'm unable to see any reports of that around so confidence in doing something wrong myself is still stronger! Ok. Code:
Here is a cooked down version for Python 3 with PathFinder from importlib.machinery:
import sys
from importlib.machinery import PathFinder
class MyImporter(PathFinder):
def __init__(self, name):
self.name = name
def find_spec(self, fullname, path=None, target=None):
print('MyImporter %s find_spec fullname: %s' % (self.name, fullname))
return super(MyImporter, self).find_spec(fullname, path, target)
sys.meta_path.insert(0, MyImporter('BEFORE'))
sys.meta_path.append(MyImporter('AFTER'))
print('sys.meta_path:', sys.meta_path)
# import an example module
import json
print(json)
So you see: I insert an instance of the class right in front and one at the end of sys.meta_path. Turns out ONLY the first one triggers! I never see any calls to the last one. That was different in Python 2!
Looking at the implementation in six I thought, well THEY need to know how to do this properly! ... 🤨 I don't see this working either! When I try to step in there or just put some prints... Nada!
After all:IF I actually put my Importer first in the sys.meta_path list, trigger on certain import and patch my module (which all works fine) It still gets overridden by the other importers in the list!
* How can I prevent that?
* Do I need to do that? It seems dirty!

I have been heavily studying the meta_path in Python3.8
The entire import mechanism has been moved from C to Python and manifests itself as sys.meta_path which contains 3 importers. The Python import machinery is cleverly stupid. i.e. uncomplex.
While the source code of the entire python import is to be found in importlib/
meta_path[1] pulls the importer from frozen something: bytecode=?
underscore import is still the central hook called when you "import mymod"
--import--() first checks if the module has already been imported in which case it retrieves it from sys.modules
if that doesn't work it calls find_spec() on each "spec finder" in meta_path.
If the "spec finder" is successful it return a "spec" needed by the next stage
If none of them find it, import fails
sys.meta_path is an array of "spec finders"
0: is the builtin spec finder: (sys, _sre)
1: is the frozen import lib: It imports the importer (importlib)
2: is the path finder and it finds both library modules: (os, re, inspect)
and your application modules based on sys.path
So regarding the question above, it shouldn't be happening. If your spec finder is first in the meta_path and it returns a valid spec then the module is found, and remaining entries in sys.meta_path won't even be asked.

Related

import module with name same as built-in module in python 3

I meet a similar problem which can be simplified as following:
For example I have a file structure as following:
----folder
---- main.py
---- math.py
I define a function in math.py and I want to import this math.py in main.py .
The codes in math.py is following
# math.py
def f(x) :
return x**3
If I write codes in main.py as following
# main.py
import math
def main() :
print(math.f(3))
if __name__ == "__main__":
main()
then it returns AttributeError: module 'math' has no attribute 'f'
If I write codes in main.py as following
# main.py
from . import math
def main() :
print(math.f(3))
if __name__ == "__main__":
main()
Then it returns ImportError: cannot import name 'math' from '__main__' (main.py)
My question:
If I only want to import the module math.py in path folder which has the same name as build-in module, what should I do?
If in the main.py I want to use both math.f(x) defined in my math.py and built-in math.acos(x), what should I do?
PS: I meet similar problem since I have a long codes written by someone ten years ago. At that time there is no built-in module with such name (In fact it's not math module. I just simplify the problem by above question). And the functions of this module have been used at many places. Therefore it's almost impossible to change module's name since if so I need to carefully change all sites module.function().
It's pretty bad practice to name your modules after built-in modules. I'd recommend naming your math.py something else.
That being said, you could import it using the path with imp:
import imp
math = imp.load_source('math', './math.py')
Dove into a bit of a rabbit hole but here we go. As a disclaimer, like Jack said naming modules after builtins is very bad practice, and this can more easily be accomplished using imp as he suggested.
The reason you're having problems come from the interaction of a few things. When you type
import math
What your python does is look at sys.path. It will check in all the locations in sys.path for a module named math, and import the first one it finds. In your case, it finds your local math module first. After the import is completed, it adds it to sys.modules, but we'll get back to that.
Since you want to use both, first you can import your local math as you have. I would suggest importing it as a different name to keep it separate.
from . import math as local_math
After that, we need to mess with our sys.path to find the builtin math
sys.path = sys.path[::-1]
This reverses the order of sys.path, meaning it will look in your local directory first instead of last.
Now you might think this was enough, but if you try to import math here, python will see it in sys.modules and not bother trying to import again, so first
del sys.modules['math']
Then we can import the default math module
import math
and finally clean up our sys.path
sys.path = sys.path[::-1]
now we have access to everything we need
>>>math.cos(10)
-0.8390715290764524
>>>local_math.f(10)
1000
According to official docs: https://docs.python.org/3/library/imp.html
imp will be removed in version 3.12
Deprecated since version 3.4, will be removed in version 3.12: The imp module is deprecated in favor of importlib.
From imp approach (Deprecated):
import imp
m = imp.load_source('math', './math.py')
m.foo()
To importlib approach with the minimum code 'impact' would be:
def load_module(name, path):
from importlib import util as importlib_util
module_spec = importlib_util.spec_from_file_location(name, path)
module = importlib_util.module_from_spec(module_spec)
module_spec.loader.exec_module(module)
return module
m = load_module('math', './math.py')
m.foo()

using python3 libs in ros kinetic

I am working with project, connected with ROS Kinetic. I wrote my nodes, using Python3 and tried to transfer some information with ros-service from 1 node to another. This information represents a huge object that couldn't be easilly formatted to normal ROS types, so I used pickle.dumps(object, 0).decode() and send it like a string.In the server side I couldn't use pickle and met an exception about: No module named search.
The code of the server side:
#!/usr/bin/env python3
from visualization.srv import *
import rospy
import pickle
megafoo = []
def handle_nodes(req):
global megafoo
print(type(req.nodes))
megafoo.extend(pickle.loads(req.nodes.encode()))
print(len(megafoo))
a=1
print("A request type: {0}".format(type(req)))
return ListNodesResponse(a)
def nodes_creater_server():
rospy.init_node('nodes_server')
s = rospy.Service('draw_some_nodes', ListNodes, handle_nodes)
print('ready to draw nodes')
rospy.spin()
if __name__ == "__main__":
nodes_creater_server()
I tried to make this without calling pickle and problem was resolved so I think that I can't call pickle from server somehow
Quite late to the party, but "No module named search" sounds like your pickle string contains a custom class "search" that isn't available at the receiving side. Would this be correct?
To verify you can try to transfer some simple native Python value like (123, "abc") using pickle in ROS (which should work).
Another option is to write two separate programs, one that writes the pickle to disk, and another one that reads the pickle from that file. Then you can experiment with loading, and eg decide which modules you need to import.

python 3 tkinter Pycharm - error on messagebox

Im writing a GUI and I say:
from tkinter import *
Further in the program theres a function wich is:
def nameFunc():
messagebox.showinfo(........)
The problem is that by running the code in the latest Pycharm, it tells me that messagebox is not defined even if I already imported everything from tkinter, it only works if I explicitly say:
from tkinter import messagebox
This only occurs when I run the code on Pycharm, in the standard python IDLE its fine.
Why?
PyCharm is behaving exactly as it should, if you take a look at the documentation on packages:
what happens when the user writes from sound.effects import *? Ideally, one would hope that this somehow goes out to the filesystem, finds which submodules are present in the package, and imports them all. This could take a long time and importing sub-modules might have unwanted side-effects that should only happen when the sub-module is explicitly imported.
The only solution is for the package author to provide an explicit index of the package. The import statement uses the following convention: if a package’s __init__.py code defines a list named __all__, it is taken to be the list of module names that should be imported when from package import * is encountered.
tkinter does not define a __all__ to automatically import submodules and you should be glad it doesn't import them all automatically:
import tkinter.__main__
print("this will only print after you close the test window")
the program only continues to run after a window pops up with the current tcl/Tk version and some other content is closed, to import submodules of the package you must explicitly import them with:
from tkinter import messagebox
however as I describe in my other answer here, because of how IDLE is built it has already loaded some of the submodules when your code is being executed in the idle Shell.

restart python (or reload modules) in py.test tests

I have a (python3) package that has completely different behaviour depending on how it's init()ed (perhaps not the best design, but rewriting is not an option). The module can only be init()ed once, a second time gives an error. I want to test this package (both behaviours) using py.test.
Note: the nature of the package makes the two behaviours mutually exclusive, there is no possible reason to ever want both in a singular program.
I have serveral test_xxx.py modules in my test directory. Each module will init the package in the way in needs (using fixtures). Since py.test starts the python interpreter once, running all test-modules in one py.test run fails.
Monkey-patching the package to allow a second init() is not something I want to do, since there is internal caching etc that might result in unexplained behaviour.
Is it possible to tell py.test to run each test module in a separate python process (thereby not being influenced by inits in another test-module)
Is there a way to reliably reload a package (including all sub-dependencies, etc)?
Is there another solution (I'm thinking of importing and then unimporting the package in a fixture, but this seems excessive)?
To reload a module, try using the reload() from library importlib
Example:
from importlib import reload
import some_lib
#do something
reload(some_lib)
Also, launching each test in a new process is viable, but multiprocessed code is kind of painful to debug.
Example
import some_test
from multiprocessing import Manager, Process
#create new return value holder, in this case a list
manager = Manager()
return_value = manager.list()
#create new process
process = Process(target=some_test.some_function, args=(arg, return_value))
#execute process
process.start()
#finish and return process
process.join()
#you can now use your return value as if it were a normal list,
#as long as it was assigned in your subprocess
Delete all your module imports and also your tests import that also import your modules:
import sys
for key in list(sys.modules.keys()):
if key.startswith("your_package_name") or key.startswith("test"):
del sys.modules[key]
you can use this as a fixture by configuring on your conftest.py file a fixture using the #pytest.fixture decorator.
Once I had similar problem, quite bad design though..
#pytest.fixture()
def module_type1():
mod = importlib.import_module('example')
mod._init(10)
yield mod
del sys.modules['example']
#pytest.fixture()
def module_type2():
mod = importlib.import_module('example')
mod._init(20)
yield mod
del sys.modules['example']
def test1(module_type1)
pass
def test2(module_type2)
pass
The example/init.py had something like this
def _init(val):
if 'sample' in globals():
logger.info(f'example already imported, val{sample}' )
else:
globals()['sample'] = val
logger.info(f'importing example with val : {val}')
output:
importing example with val : 10
importing example with val : 20
No clue as to how complex your package is, but if its just global variables, then this probably helps.
I have the same problem, and found three solutions:
reload(some_lib)
patch SUT, as the imported method is a key and value in SUT, you can patch the
SUT. Example, if you use f2 of m2 in m1, you can patch m1.f2 instead of m2.f2
import module, and use module.function.

Importing one module from different other modules only executes it once. Why?

I am confused about some behavior of Python. I always thought importing a module basically meant executing it. (Like they say here: Does python execute imports on importation) So I created three simple scripts to test something:
main.py
import config
print(config.a)
config.a += 1
print(config.a)
import test
print(config.a)
config.py
def get_a():
print("get_a is called")
return 1
a = get_a()
test.py
import config
print(config.a)
config.a += 1
The output when running main.py is:
get_a is called
1
2
2
3
Now I am confused because I expected get_a() to be called twice, once from main.py and once from test.py. Can someone please explain why it is not? What if I really wanted to import config a second time, like it was in the beginning with a=1?
(Fortunately, for my project this behavior is exactly what I wanted, because get_a() corresponds to a function, which reads lots of data from a database and of course I only want to read it once, but it should be accessible from multiple modules.)
Because the config module is already loaded so there's no need to 'run' it anymore, just return the loaded instance.
Some standard library modules make use of this, from example random. It creates an object of class Random on first import and reuses it when it gets imported again. A comment on the module reads:
# Create one instance, seeded from current time, and export its methods
# as module-level functions. The functions share state across all uses
#(both in the user's code and in the Python libraries), but that's fine
# for most programs and is easier for the casual user than making them
# instantiate their own Random() instance.

Resources