Instantiating classes from dynamically loaded modules - python-3.5

What is the best way to dynamically load classes from Python source files on disk? (Meaning I cannot just import className.)
So far, I have been able to load the source code into memory and make a Python module object out of it using the following:
spec = importlib.util.spec_from_file_location(modname, sourcepath)
module = importlib.util.module_from_spec(spec)
That code runs without errors. What I did next was use the inspect.getmembers() function to extract only the classes from the loaded module, but, and this is the issue, the function returns an empty list.
classes = inspect.getmembers(module, inspect.isclass)
This code actually worked in Python 2, where the only difference was that I loaded the source file into a module using the imp library and its imp.load_source(modname, modpath) function.
What I need is for the inspect.getmembers call to return the list of classes of the specific module, or an alternative way to implement this dynamic class lookup.
The reason I'm doing this is that I'm writing a small program that loads plugins, in the form of python classes, that it then interfaces with using predefined function calls.

After a lot of messing around, I found that the solution was pretty simple.
I only had to change the way the module gets loaded into a spec like so:
loader = importlib.machinery.SourceFileLoader(filename, dirpath+"/"+filename)
spec = importlib.util.spec_from_loader(loader.name, loader)
module = importlib.util.module_from_spec(spec)
loader.exec_module(module)
This approach led inspect.getmembers to return valid classes from the module.
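For reference, here is a minimal end-to-end sketch of the whole pattern (the module name, path, and helper name are illustrative, not from the original post). The step that was missing in the question's code is executing the module, which is what actually runs the class definitions:

```python
import importlib.util
import inspect

def load_plugin_classes(modname, sourcepath):
    """Load a source file as a module and return the classes it defines."""
    spec = importlib.util.spec_from_file_location(modname, sourcepath)
    module = importlib.util.module_from_spec(spec)
    # Without this call the module body never runs, the module object
    # stays empty, and inspect.getmembers finds no classes.
    spec.loader.exec_module(module)
    # Keep only classes defined in this module, not ones it imported.
    return [cls for _, cls in inspect.getmembers(module, inspect.isclass)
            if cls.__module__ == modname]

plugin_classes = load_plugin_classes("myplugin", "/path/to/myplugin.py")
```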


Why is doctest skipping tests on imported methods?

I have a python module some_module with an __init__.py file that imports methods, like this:
from .some_python_file import some_method
some_method has a docstring that includes doctests:
def some_method():
    """
    >>> assert False
    """
But when I run doctest on the module, it passes even if the tests should fail.
import doctest

import some_module

# How do I get this to consistently fail, regardless of whether
# `some_module.some_method` was declared inline or imported?
assert doctest.testmod(some_module).failed == 0
If I instead define some_method within the __init__.py file, the doctest correctly fails.
Why are these two situations behaving differently? The method is present and has the same __doc__ attribute in both cases.
How do I get doctest to run the tests defined in the docstrings of methods that were imported into the module?
In Python, a module is defined by a single file; in your case some_python_file is one module and __init__ is another. Doctest has a check so that it only runs tests for examples reachable from the module under test.
The best way to see this behaviour in practice is to use PDB and pdb.set_trace() right before you call doctest.testmod(some_module) and step inside to follow the logic.
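For example, a minimal sketch (assuming some_module is importable from the current directory):

```python
import doctest
import pdb

import some_module

pdb.set_trace()  # `step` into testmod() to watch which docstrings get collected
print(doctest.testmod(some_module))
```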
Later edit:
Doctest deliberately ignores imported methods (the check lives in doctest's DocTestFinder). If you want your tests to run, you should probably define a main function in your module and run the tests with python some_module.py.
To achieve your expected behaviour you need to manually create a __test__ dict in your __init__.py file:
from .some_python_file import some_method
__test__ = {"some_method": some_method}
The docs on which docstrings are examined state: "Objects imported into the module are not searched."
You can inject the imported function into the module's __test__ attribute to have imported objects tested:
__test__ = {'some_method': some_method}
I stumbled upon this question because I was hacking a __doc__ attribute of an imported object.
from foo import bar
bar.__doc__ = """
>>> assert True
"""
and I was also wondering why the doctest of bar did not get executed by the doctest runner.
The previously given answer to add a `__test__` mapping solved it for good.
```python
__test__ = dict(bar=bar.__doc__)
```
I think the explanation for this behaviour is the following: if you are using a library, let's say NumPy, you do not want all of its doctests to be collected and run in your own code.
Simply, because it would be redundant.
You should trust the developers of the library to (continuously) test their code, so you do not have to do it.
If you have tests defined in your own code, you should have a test collector (e.g. pytest) descend into all files of your project structure and run these.
You would end up testing all doctests in used libraries, which takes a lot of time. So the decision to ignore imported doctests is very sane.
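Putting the fix together, here is a minimal sketch using the names from the question (the runner file is illustrative):

```python
# some_module/__init__.py
from .some_python_file import some_method

# doctest only searches objects defined in this module, so imported
# objects must be registered explicitly via the __test__ mapping.
__test__ = {"some_method": some_method}
```

```python
# run_doctests.py
import doctest
import some_module

# With the __test__ mapping in place, the `>>> assert False` example in
# some_method's docstring is collected, so failed is now non-zero.
print(doctest.testmod(some_module))
```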

Python Modules Replacing Themselves During Load

I've come across some code recently that uses a trick that makes me rather nervous. The system I'm looking at has a Python extension file, Moo.so, stored outside the module search path, and the developer wants to import it with just import Moo. For various reasons neither the file location nor sys.path can be changed, and the extension must be loaded with ExtensionFileLoader anyway.
So what has been done is to have a Moo.py in the path that loads the extension module and then replaces itself in sys.modules with the extension module, along the following lines:
# Moo.py
import sys
from importlib.machinery import ExtensionFileLoader

loader = ExtensionFileLoader('AnotherNameForMoo', '/path/to/Moo.so')
module = loader.load_module()
sys.modules['Moo'] = module
Now this does actually work. (I have some tests of it in rather gory detail in this repo if you want to have a look.) It appears that, at least in CPython 3.4 through 3.7, import Moo does not bind the name Moo to the module that the import machinery loaded and put into sys.modules['Moo']; instead it binds the current value of sys.modules['Moo'] after the module's top-level script returns, regardless of whether or not that is what was originally put there.
I can't find anything in any Python documentation that indicates that this is required behaviour rather than just an accident of implementation.
How safe is this? What are other ways that one might try to achieve a similar "bootstrap" effect?
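For comparison, here is a sketch of the same bootstrap written against the documented spec-based importlib APIs instead of the load_module() shortcut (Python 3.5+; the path is illustrative); note that it still relies on exactly the same sys.modules rebinding behaviour the question asks about:

```python
# Moo.py
import sys
import importlib.util
from importlib.machinery import ExtensionFileLoader

_path = '/path/to/Moo.so'
_loader = ExtensionFileLoader('Moo', _path)
_spec = importlib.util.spec_from_loader('Moo', _loader, origin=_path)
_module = importlib.util.module_from_spec(_spec)
_spec.loader.exec_module(_module)

# Same trick as before: whatever sys.modules['Moo'] holds when this
# file's top-level code finishes is what `import Moo` will bind.
sys.modules['Moo'] = _module
```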

Groovy - extensions structure

I'd like to extend String's asType method to handle LocalDateTime. I know how to override this method, but I've no idea where I should put it in the project structure for it to work globally, for all strings in my project. Is it enough to put such an extension anywhere on the classpath? I know that there's a special convention for extensions (META-INF/services); how does it work for method overriding?
All documentation regarding this topic can be found in the Groovy documentation on extension modules; the relevant part is quoted below.
Module extension and module descriptor
For Groovy to be able to load your extension methods, you must declare your extension helper classes. You must create a file named org.codehaus.groovy.runtime.ExtensionModule into the META-INF/services directory:

moduleName=Test module for specifications
moduleVersion=1.0-test
extensionClasses=support.MaxRetriesExtension
staticExtensionClasses=support.StaticStringExtension

The module descriptor requires 4 keys:
moduleName: the name of your module
moduleVersion: the version of your module. Note that version number is only used to check that you don’t load the same module in two different versions.
extensionClasses: the list of extension helper classes for instance methods. You can provide several classes, given that they are comma separated.
staticExtensionClasses: the list of extension helper classes for static methods. You can provide several classes, given that they are comma separated.
Note that it is not required for a module to define both static helpers and instance helpers, and that you may add several classes to a single module. You can also extend different classes in a single module without problem. It is even possible to use different classes in a single extension class, but it is recommended to group extension methods into classes by feature set.
Module extension and classpath
It’s worth noting that you can’t use an extension which is compiled at the same time as code using it. That means that to use an extension, it has to be available on classpath, as compiled classes, before the code using it gets compiled. Usually, this means that you can’t have the test classes in the same source unit as the extension class itself. Since in general, test sources are separated from normal sources and executed in another step of the build, this is not an issue.

RequireJS - when to specify module id in define()

In the RequireJS documentation (http://requirejs.org/docs/api.html#modulename), I couldn't understand this sentence:
You can explicitly name modules yourself, but it makes the modules less portable
My questions are:
Why does explicitly naming a module make it less portable?
When is explicitly naming a module needed?
Why does explicitly naming a module make it less portable?
If you do not give the module a name explicitly, RequireJS is free to name it whichever way it wants, which gives you more freedom regarding the name you can use to refer to the module. Let's say you have a module in the file bar.js. You could give RequireJS this path:
paths: {
    "foo": "bar"
}
And you could load the module under the name "foo". If you had given a name to the module in the define call, then you'd be forced to use that name. An excellent example of this problem is jQuery. It so happens that the jQuery developers have decided (for no good reason I can discern) to hardcode the module name "jquery" in the code of jQuery. Once in a while someone comes on SO complaining that their code won't work, and their paths configuration has this:
paths: {
    jQuery: "path/to/jquery"
}
This does not work because of the hardcoded name. The paths configuration has to use the name "jquery", all lower case. (A map configuration can be used to map "jquery" to "jQuery".)
When is explicitly naming a module needed?
It is needed when there is no other way to name the module. A good example is r.js when it concatenates multiple modules together into one file. If the modules were not named during concatenation, there would be no way to refer to them. So r.js adds explicit names to all the modules it concatenates (unless you tell it not to do it or unless the module is already named).
Sometimes I use explicit naming for what I call "glue" or "utility" modules. For instance, suppose that jQuery is already loaded through a script element before RequireJS, but I also want my RequireJS modules to be able to require the module jquery to access jQuery rather than rely on the global $. Then, if I ever want to run my code in a context where there is no global jQuery, I don't have to modify my modules. I might have a main file like this:
define('jquery', function () {
    return $;
});
require.config({ ... });
The jquery module is there only to satisfy modules that need jQuery. There's nothing gained by putting it into a separate file, and to be referred to properly, it has to be named explicitly.
Here's why named modules are less portable, from SitePen's "AMD: The Definitive Source":
AMD is also “anonymous”, meaning that the module does not have to hard-code any references to its own path, the module name relies solely on its file name and directory path, greatly easing any refactoring efforts.
http://www.sitepen.com/blog/2012/06/25/amd-the-definitive-source/
And from Addy Osmani's "Writing Modular JavaScript":
When working with anonymous modules, the idea of a module's identity is DRY, making it trivial to avoid duplication of filenames and code. Because the code is more portable, it can be easily moved to other locations (or around the file-system) without needing to alter the code itself or change its ID. The module_id is equivalent to folder paths in simple packages and when not used in packages. Developers can also run the same code on multiple environments just by using an AMD optimizer that works with a CommonJS environment such as r.js.
http://addyosmani.com/writing-modular-js/
And here is why one would need an explicitly named module, again from Addy Osmani's "Writing Modular JavaScript":
The module_id is an optional argument which is typically only required when non-AMD concatenation tools are being used (there may be some other edge cases where it's useful too).

Is it possible to hide specific functions from appearing in the documentation using haddock?

I use haddock and don't want all of my exported functions to be displayed in the documentation. Is it possible to hide specific functions?
I found the prune attribute at http://www.haskell.org/haddock/doc/html/module-attributes.html, but this is not what I want since some functions which shall be exported don't have a documentation annotation.
Assuming your current module is Foo.Bar, one solution would be to break it up into Foo.Bar and Foo.Bar.Internal. You could move all the definitions related to the functions you don't want to show (perhaps even all the definitions) into Foo.Bar.Internal. Then, in Foo.Bar, you would re-export only the definitions you want the world to see.
This approach has a couple of advantages. It lets you export everything you need, while still giving the user a clear sign that certain things shouldn't be used. It also lets you document your special functions inside the Internal module, which is going to be useful (if only for your future self :P).
You could simply not expose Foo.Bar.Internal in your .cabal file, hiding it from the world. However, this is not necessarily the best approach; look at the answers to "How, why and when to use the ".Internal" modules pattern?", in particular luqui's.
Another possibility is to make a module Foo.Bar.Hidden exporting all of the stuff you want to hide, and then re-export the whole Foo.Bar.Hidden module from Foo.Bar:
module Foo.Bar (
    blah1,
    blah2,
    blah3,
    module Foo.Bar.Hidden
) where

import Foo.Bar.Hidden

blah1 = ...
blah2 = ...
blah3 = ...
This way, the hidden stuff will be exported from Foo.Bar, but not included in the documentation. The documentation will only include one relatively unobtrusive link to the Foo.Bar.Hidden module as a whole.
