I have a template engine named Contemplate which has implementations for PHP, Node/JS and Python.
All work fine, except lately the Python implementation has been giving me some issues. Specifically, the problem appears when a template is first parsed and the generated template Python code is then dynamically imported as a module. When the template has already been created everything works fine, but when the template needs to be parsed, saved to disk and THEN imported, it raises an error, e.g.
ModuleNotFoundError: No module named 'blah blah'
(Note this error appears to be random; it is not guaranteed to be raised. Many times the import works even if the template was created just before importing, other times it fails, and then if run again with the template already created it succeeds.)
Is there any way I can bypass this issue, maybe by adding a delay between saving a parsed template and importing it as a module, or something else?
The code to import the module (the parsed template, which is now a Python class) is below:
def import_tpl( filename, classname, cacheDir, doReload=False ):
    # http://www.php2python.com/wiki/function.import_tpl/
    # http://docs.python.org/dev/3.0/whatsnew/3.0.html
    # http://stackoverflow.com/questions/4821104/python-dynamic-instantiation-from-string-name-of-a-class-in-dynamically-imported
    #_locals_ = {'Contemplate': Contemplate}
    #_globals_ = {'Contemplate': Contemplate}
    #if 'execfile' in globals():
    #    # Python 2.x
    #    execfile(filename, _globals_, _locals_)
    #    return _locals_[classname]
    #else:
    #    # Python 3.x
    #    exec(read_file(filename), _globals_, _locals_)
    #    return _locals_[classname]
    # http://docs.python.org/2/library/imp.html
    # http://docs.python.org/2/library/functions.html#__import__
    # http://docs.python.org/3/library/functions.html#__import__
    # http://stackoverflow.com/questions/301134/dynamic-module-import-in-python
    # http://stackoverflow.com/questions/11108628/python-dynamic-from-import
    # also: http://code.activestate.com/recipes/473888-lazy-module-imports/
    # using import instead of execfile usually takes advantage of Python's cached compiled code
    global _G
    getTplClass = None
    # add the dynamic import paths to sys.path
    basename = os.path.basename(filename)
    directory = os.path.dirname(filename)
    os.sys.path.append(cacheDir)
    os.sys.path.append(directory)
    currentcwd = os.getcwd()
    os.chdir(directory)  # change working directory so we know the import will work
    if os.path.exists(filename):
        modname = basename[:-3]  # remove .py extension
        mod = __import__(modname)
        if doReload: reload(mod)  # might be out of date; on Python 3, reload must come from importlib
        # a trick in order to pass the Contemplate super-class in a cross-module way
        getTplClass = getattr( mod, '__getTplClass__' )
    # restore the current dir
    os.chdir(currentcwd)
    # remove the dynamic import paths from sys.path
    del os.sys.path[-1]
    del os.sys.path[-1]
    # return the tplClass if found
    if getTplClass: return getTplClass(Contemplate)
    return None
Note the engine creates an __init__.py file in cacheDir if it is not there already.
If needed I can change the import_tpl function to something else, I don't mind.
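For illustration, here is a minimal sketch of such an alternative, using importlib.util to load the generated file directly by its path (this assumes os is imported and Contemplate is in scope, as in the function above), which avoids the sys.path and chdir juggling entirely:
import importlib.util

def import_tpl(filename, classname, cacheDir, doReload=False):
    # load the generated template module straight from its file path,
    # without touching sys.path or the current working directory
    if not os.path.exists(filename):
        return None
    modname = os.path.basename(filename)[:-3]  # strip the .py extension
    spec = importlib.util.spec_from_file_location(modname, filename)
    if spec is None:
        return None
    mod = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(mod)  # executes the freshly written file
    getTplClass = getattr(mod, '__getTplClass__', None)
    return getTplClass(Contemplate) if getTplClass else None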
The Python tested is Python 3.6 on Windows, but I don't think this is a platform-specific issue.
To test the issue you can download the GitHub repository (linked above) and run the /tests/test.py test after clearing all cached templates from the /tests/_tplcache/ folder.
UPDATE:
I am thinking of adding a while loop with some counter in import_tpl that catches the error raised, if any, and retries a specified number of times until it succeeds in importing the module. But I am also wondering whether this is a good solution or whether there is something else I am missing here.
UPDATE (20/02/2019):
Added a loop to retry a specified number of times, plus a small delay of 1 sec if the initial import failed (see the online repository code), but it still raises the same error sometimes when templates are first created before being imported. Any solutions?
Right, using a "while" loop to handle the exception would be one way:
while True:
    try:
        # the module importing
        break
    except ModuleNotFoundError:
        print("NOPE! Module not found")
If it works for some module files and not others, the likely suspect is the template files themselves.
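A more targeted variant of that retry loop (a minimal sketch; import_with_retry, modname and the retry parameters are placeholders): Python's path-based finders cache directory listings, so a module file written a moment ago can be invisible to import. Calling importlib.invalidate_caches() before retrying addresses exactly that symptom:
import importlib
import time

def import_with_retry(modname, retries=3, delay=0.5):
    # assumes the directory containing modname is already on sys.path
    for attempt in range(retries):
        try:
            # drop the stale directory-listing caches held by the path finders,
            # so a file written just before this call becomes visible
            importlib.invalidate_caches()
            return importlib.import_module(modname)
        except ModuleNotFoundError:
            if attempt == retries - 1:
                raise
            time.sleep(delay)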
Related
Is there anything in the MLflow Python API that lets you alter the artifact subdirectories? For example, I have a .json file stored here:
s3://mlflow/3/1353808bf7324824b7343658882b1e45/artifacts/feature_importance_split.json
MLflow creates a 3/ key (the experiment ID) in S3. Is there a way to modify this key to something else (a date or the name of the experiment)?
As I commented above, yes, mlflow.create_experiment() does allow you to set the artifact location using the artifact_location parameter.
However, and sort of related, the problem with setting the artifact_location using the create_experiment() function is that once you create an experiment, MLflow will throw an error if you run the create_experiment() function again.
I didn't see this in the docs, but it's confirmed that if an experiment already exists in the backend store, MLflow will not allow you to run the same create_experiment() function again. And as of this post, MLflow does not have a check_if_exists flag or a create_experiments_if_not_exists() function.
To make things more frustrating, you cannot set the artifact_location in the set_experiment() function either.
So here is a pretty easy workaround; it also avoids the "ERROR mlflow.utils.rest_utils..." stdout logging:
import os
from random import random, randint

import mlflow
from mlflow import log_metric, log_param, log_artifacts
from mlflow.exceptions import MlflowException

try:
    experiment = mlflow.get_experiment_by_name('oof')
    experiment_id = experiment.experiment_id
except AttributeError:
    experiment_id = mlflow.create_experiment('oof', artifact_location='s3://mlflow-minio/sample/')

with mlflow.start_run(experiment_id=experiment_id) as run:
    mlflow.set_tracking_uri('http://localhost:5000')
    print("Running mlflow_tracking.py")
    log_param("param1", randint(0, 100))
    log_metric("foo", random())
    log_metric("foo", random() + 1)
    log_metric("foo", random() + 2)
    if not os.path.exists("outputs"):
        os.makedirs("outputs")
    with open("outputs/test.txt", "w") as f:
        f.write("hello world!")
    log_artifacts("outputs")
If it is the user's first time creating the experiment, the code will run into an AttributeError, since get_experiment_by_name() returns None and None has no experiment_id attribute, so the except block gets executed and creates the experiment.
If it is the second, third, etc. time the code is run, it will only execute the code under the try statement, since the experiment now exists. MLflow will then create a 'sample' key in your S3 bucket. Not fully tested, but it works for me at least.
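An equivalent variant that makes the control flow explicit rather than relying on the exception (a sketch using the same experiment name and artifact location as above):
import mlflow

# get_experiment_by_name() returns None when the experiment does not exist yet
experiment = mlflow.get_experiment_by_name('oof')
if experiment is None:
    experiment_id = mlflow.create_experiment('oof', artifact_location='s3://mlflow-minio/sample/')
else:
    experiment_id = experiment.experiment_id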
Well, I've got the need to automate a process in my job (actually I'm an intern), and I just wondered if I could use Python for such a process. I'm still working out my ideas of how to do these things, and right now I'm trying to understand how to download a file from a web URL using Python 3. I found a guide on another website, but there's no active help there. I was told to use the requests module to download the actual file, and the re module to get the real file name.
The code was working fine, but then I tried to add some features like a GUI, and it just stopped working. I took the GUI code back out, and it still didn't work. Now I have no idea what to do to get the code working. Please, someone help me, thanks :)
code:
import os
import re
import requests  # needed for requests.get below; without this import the script raises NameError

# gets the real file name from the Content-Disposition response header
def getFilename(cd):
    if not cd:
        print("check 1")
        return None
    fname = re.findall('filename=(.+)', cd)
    if len(fname) == 0:
        print("check 2")
        return None
    return fname[0]

def download(url):
    # get request
    response = requests.get(url)
    # get the real file name from the header
    filename = getFilename(response.headers.get('content-disposition'))
    print(filename)
    # open in binary mode and write to file
    #open(filename, "wb").write(response.content)

download("https://pixabay.com/get/57e9d14b4957a414f6da8c7dda353678153fd9e75b50704b_1280.png?attachment=")
os.system("pause")
I'm struggling to refactor some working import-hook functionality that served us very well on Python 2 over the last years... And honestly, I wonder if something is broken in Python 3? But I'm unable to find any reports of that, so my confidence that I'm doing something wrong myself is still stronger! Ok. Code:
Here is a boiled-down version for Python 3 using PathFinder from importlib.machinery:
import sys
from importlib.machinery import PathFinder

class MyImporter(PathFinder):
    def __init__(self, name):
        self.name = name

    def find_spec(self, fullname, path=None, target=None):
        print('MyImporter %s find_spec fullname: %s' % (self.name, fullname))
        return super(MyImporter, self).find_spec(fullname, path, target)

sys.meta_path.insert(0, MyImporter('BEFORE'))
sys.meta_path.append(MyImporter('AFTER'))
print('sys.meta_path:', sys.meta_path)

# import an example module
import json
print(json)
So you see: I insert an instance of the class right at the front and one at the end of sys.meta_path. It turns out ONLY the first one triggers! I never see any calls to the last one. That was different in Python 2!
Looking at the implementation in six I thought, well, THEY must know how to do this properly! ... 🤨 I don't see that working either! When I try to step in there or just put in some prints... Nada!
After all: IF I actually put my importer first in the sys.meta_path list, trigger on a certain import and patch my module (which all works fine), it still gets overridden by the other importers in the list!
* How can I prevent that?
* Do I need to do that? It seems dirty!
I have been studying sys.meta_path heavily in Python 3.8.
The entire import mechanism has been moved from C to Python and manifests itself as sys.meta_path, which contains 3 importers. The Python import machinery is cleverly stupid, i.e. uncomplex.
The source code of the entire Python import machinery is to be found in importlib/.
__import__() is still the central hook called when you write "import mymod":
__import__() first checks whether the module has already been imported, in which case it retrieves it from sys.modules.
If that doesn't work, it calls find_spec() on each "spec finder" in meta_path.
If a "spec finder" is successful, it returns a "spec" needed by the next stage.
If none of them find it, the import fails.
sys.meta_path is an array of "spec finders":
0: the builtin spec finder, for built-in modules (sys, _sre)
1: the frozen importlib finder: it imports the importer itself (importlib) from frozen bytecode
2: the path finder: it finds both library modules (os, re, inspect) and your application modules, based on sys.path
So regarding the question above: this shouldn't be happening. If your spec finder is first in meta_path and it returns a valid spec, then the module is found, and the remaining entries in sys.meta_path won't even be asked.
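To illustrate that "first valid spec wins" rule, here is a minimal sketch (the module name mymod and the StubFinder/StubLoader names are made up for the example) of a finder at the front of sys.meta_path; later finders are never consulted for the name it claims:
import sys
from importlib.machinery import ModuleSpec

class StubLoader:
    def create_module(self, spec):
        return None  # use the default module creation
    def exec_module(self, module):
        module.answer = 42  # "execute" the module by filling in its contents

class StubFinder:
    # claims the name 'mymod' and nothing else
    def find_spec(self, fullname, path=None, target=None):
        if fullname != 'mymod':
            return None
        return ModuleSpec(fullname, StubLoader())

sys.meta_path.insert(0, StubFinder())

import mymod
print(mymod.answer)  # 42 -- the remaining finders were never asked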
I have a (Python 3) package that has completely different behaviour depending on how it's init()ed (perhaps not the best design, but rewriting is not an option). The module can only be init()ed once; a second time gives an error. I want to test this package (both behaviours) using py.test.
Note: the nature of the package makes the two behaviours mutually exclusive; there is no possible reason to ever want both in a single program.
I have several test_xxx.py modules in my test directory. Each module will init the package in the way it needs (using fixtures). Since py.test starts the Python interpreter once, running all test modules in one py.test run fails.
Monkey-patching the package to allow a second init() is not something I want to do, since there is internal caching etc. that might result in unexplained behaviour.
* Is it possible to tell py.test to run each test module in a separate Python process (thereby not being influenced by inits in another test module)?
* Is there a way to reliably reload a package (including all sub-dependencies, etc.)?
* Is there another solution (I'm thinking of importing and then unimporting the package in a fixture, but this seems excessive)?
To reload a module, try using reload() from the importlib library.
Example:
from importlib import reload
import some_lib

# do something
reload(some_lib)
Also, launching each test in a new process is viable, but multiprocessed code is kind of painful to debug.
Example:
import some_test
from multiprocessing import Manager, Process

# create a new return-value holder, in this case a list
manager = Manager()
return_value = manager.list()

# create a new process (arg is a placeholder for whatever your function takes)
process = Process(target=some_test.some_function, args=(arg, return_value))

# execute the process
process.start()

# wait for the process to finish
process.join()

# you can now use your return value as if it were a normal list,
# as long as it was assigned in your subprocess
Delete all your module imports, and also the test modules that import your modules:
import sys

for key in list(sys.modules.keys()):
    if key.startswith("your_package_name") or key.startswith("test"):
        del sys.modules[key]
You can use this as a fixture by configuring a fixture in your conftest.py file with the @pytest.fixture decorator, as sketched below.
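A minimal sketch of what that conftest.py fixture could look like (your_package_name is a placeholder, as above, and fresh_package is a made-up fixture name that your tests would request as a parameter):
# conftest.py
import sys
import pytest

@pytest.fixture
def fresh_package():
    yield
    # after the test, evict the package and the test modules from the import cache,
    # so the next test can import and init() the package from scratch
    for key in list(sys.modules.keys()):
        if key.startswith("your_package_name") or key.startswith("test"):
            del sys.modules[key]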
I once had a similar problem; quite bad design, though...
import importlib
import sys
import pytest

@pytest.fixture()
def module_type1():
    mod = importlib.import_module('example')
    mod._init(10)
    yield mod
    del sys.modules['example']

@pytest.fixture()
def module_type2():
    mod = importlib.import_module('example')
    mod._init(20)
    yield mod
    del sys.modules['example']

def test1(module_type1):
    pass

def test2(module_type2):
    pass
The example/__init__.py had something like this:
import logging

logger = logging.getLogger(__name__)

def _init(val):
    if 'sample' in globals():
        logger.info(f'example already imported, val: {sample}')
    else:
        globals()['sample'] = val
        logger.info(f'importing example with val: {val}')
output:
importing example with val : 10
importing example with val : 20
No clue as to how complex your package is, but if it's just global variables, then this probably helps.
I had the same problem and found three solutions:
1. reload(some_lib)
2. Patch the SUT: since an imported function becomes an attribute of the importing module, you can patch it there. For example, if you use f2 of m2 in m1, you can patch m1.f2 instead of m2.f2 (see the sketch below).
3. Import the module and use module.function, so the lookup happens at call time.
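A sketch of option 2 with unittest.mock, assuming hypothetical modules m1 and m2 where m1 does `from m2 import f2` (some_function_that_calls_f2 is also made up for the example):
from unittest.mock import patch

import m1  # hypothetical module that did `from m2 import f2`

# m1 holds its own reference to f2, so patch that reference,
# not the original in m2
with patch('m1.f2') as fake_f2:
    fake_f2.return_value = 'stubbed'
    m1.some_function_that_calls_f2()  # sees the stub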
I am confused about some behavior of Python. I always thought importing a module basically meant executing it (like they say here: Does Python execute imports on importation). So I created three simple scripts to test something:
main.py
import config
print(config.a)
config.a += 1
print(config.a)
import test
print(config.a)
config.py
def get_a():
    print("get_a is called")
    return 1

a = get_a()
test.py
import config
print(config.a)
config.a += 1
The output when running main.py is:
get_a is called
1
2
2
3
Now I am confused because I expected get_a() to be called twice, once from main.py and once from test.py. Can someone please explain why it is not? What if I really wanted to import config a second time, like it was in the beginning with a=1?
(Fortunately, for my project this behavior is exactly what I wanted, because get_a() corresponds to a function, which reads lots of data from a database and of course I only want to read it once, but it should be accessible from multiple modules.)
Because the config module is already loaded, there's no need to 'run' it anymore; Python just returns the already-loaded instance from sys.modules.
Some standard library modules make use of this, for example random. It creates an object of class Random on first import and reuses it when it gets imported again. A comment in the module reads:
# Create one instance, seeded from current time, and export its methods
# as module-level functions. The functions share state across all uses
# (both in the user's code and in the Python libraries), but that's fine
# for most programs and is easier for the casual user than making them
# instantiate their own Random() instance.
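To address the last part of the question: if you really want config.py to execute a second time, you have to go through the module cache. A minimal sketch:
import importlib
import sys

import config
print(sys.modules['config'] is config)  # True: repeat imports are served from this cache

# re-execute config.py in place (get_a() runs again, config.a is reset to 1):
importlib.reload(config)

# or evict it, so the next import statement starts completely fresh:
del sys.modules['config']
import config  # runs get_a() again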