How to assemble a tool that parses reStructuredText with additional directives - python-3.x

Let's say I want to add a single new directive to the reStructuredText standard directives.
There is a how-to for creating reStructuredText directives which outlines how each directive is essentially derived from the Directive class or its subclasses. It also suggests that the interested reader should check how the standard directives are implemented and use that as a starting point.
So far, so good, but the tutorial actually stops at that point. How can that new directive be incorporated into a new tool; how am I supposed to actually use it?
The "Hacker's Guide", which outlines how the front-end tools are assembled out of a Reader, a Parser, a Transformer and a Writer, seems incomplete and contains absolutely no code.
So, how am I supposed to assemble a new tool out of the standard reader + my extended Parser which recognizes the new directive + possibly a new Writer which appropriately handles the new directive in the output it produces?
I would very much appreciate a nudge in the right direction.

The docutils.core.Publisher class is "A facade encapsulating the high-level logic of a Docutils system." According to its documentation:
Calling the publish_* convenience functions (or instantiating a Publisher object) with component names will result in default behavior. For custom behavior (setting component options), create custom component objects first, and pass them to publish_*/Publisher.
So, in order to build a tool which, for example, parses an .rst file and outputs html5 to the standard output, using standard components, one can write:
import docutils.core
import docutils.parsers.rst
import docutils.writers.html5_polyglot

# With no destination given, publish_file writes the rendered output to stdout.
output = docutils.core.publish_file(
    source=open("filename.rst", 'r'),
    parser=docutils.parsers.rst.Parser(),
    writer=docutils.writers.html5_polyglot.Writer())
Replace the standard parser or writer with your own extended classes and you're done.
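For example, here is a minimal sketch of the parser half: it registers a hypothetical note-box directive (the directive name and CSS class are made up for illustration) and then publishes with the standard components from above.

import docutils.core
import docutils.parsers.rst
import docutils.writers.html5_polyglot
from docutils import nodes
from docutils.parsers.rst import Directive, directives

class NoteBox(Directive):
    has_content = True

    def run(self):
        # wrap the directive body in a container node the html5 writer already knows
        node = nodes.container(classes=["note-box"])
        self.state.nested_parse(self.content, self.content_offset, node)
        return [node]

# make ".. note-box::" available to every rst Parser in this process
directives.register_directive("note-box", NoteBox)

docutils.core.publish_file(
    source=open("filename.rst", 'r'),
    parser=docutils.parsers.rst.Parser(),
    writer=docutils.writers.html5_polyglot.Writer())

A custom Writer only becomes necessary when your directive emits node types the standard writers don't know how to visit.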

Related

How to generate API documentation from docstrings, for functional code

All I want is to generate API docs from the function docstrings in my source code, presumably through Sphinx's autodoc extension, to serve as my lean API documentation. My code follows the functional programming paradigm, not OOP, as demonstrated below.
I'd probably, as a second step, add one or more documentation pages for the project, hosting things like introductory comments, code examples (leveraging doctest I guess) and of course linking to the API documentation itself.
What might be a straightforward flow to accomplish documentation from docstrings here? Sphinx is a great popular tool, yet I find its getting started pages a bit dense.
What I've tried, from within my source directory:
$ mkdir documentation
$ sphinx-apidoc -f --ext-autodoc -o documentation .
No error messages, yet this doesn't find (or handle) the docstrings in my source files; it just creates an rst file per source, with contents like the following:
tokenizer module
================

.. automodule:: tokenizer
   :members:
   :undoc-members:
   :show-inheritance:
Basically, my source files look like the following, without much module ceremony or object-oriented content (I like functional programming, even though it's Python this time around). I've of course truncated the sample source file below; it contains more functions than shown.
tokenizer.py
"""
Basic tokenization for text

not supported:
+ forms of pseudo ellipsis (...)

support for the above should be added only as part of an automata rewrite
"""
# the module docstring must be the first statement, before any imports
from hltk.util import clean, safe_get, safe_same_char

always_swallow_separators = u" \t\n\v\f\r\u200e"
always_separators = ",!?()[]{}:;"

def is_one_of(char, chars):
    '''
    Returns whether the input `char` is any of the characters of the string `chars`
    '''
    return char in chars
Or would you recommend a different tool and flow for this use case?
Many thanks!
If you find Sphinx too cumbersome and particular to use for simple projects, try pdoc:
$ pdoc --html tokenizer.py

Creating libraries that can be imported and used in Groovy

Currently, I am working on a project to transpile my company's in-house scripting language, which is object-oriented and borrows quite a few features from other languages, into Groovy, which has many similar features.
To keep code as close to original as possible, I am trying to leave certain function names and parameters the same. To cater for this, I would like to write a set of libraries that can be imported.
For example, say I have an inbuilt method in the original scripting language. I would like to be able to write the definition for this method in a Groovy file that can be imported when needed, so that the method may be called.
Tools.groovy
// filename: Tools.groovy
public String foo(String bar) {
    return bar;
}
and in another file
Main.groovy
// filename: Main.groovy
import Tools;
String bat = foo("bar")
I know you can compile class files into jars and put them on the classpath, but a lot of the methods I will need to implement will either require metaprogramming or won't be associated with an object.
Sorry if it's either a bad question or not clear enough; I'm not sure whether it's even possible.
Cheers
I believe you should be able to create libraries and reuse them when needed.
All you need to do is create a class and add static methods (if you don't need to create instances; otherwise use non-static methods). It sounds like you already know how to proceed from there.
For instance, you could create utility classes for String, List, etc., based on your description.
By the way, even without creating libraries, a Groovy one-liner can achieve what you need in most cases.
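As a minimal sketch of that (the package and file names are my own, not from the question), a library class with a static method plus a static import keeps the bare call name:

// file: tools/Tools.groovy
package tools

class Tools {
    // static, so no instance is needed at the call site
    static String foo(String bar) {
        return bar
    }
}

// file: Main.groovy
import static tools.Tools.foo   // the static import preserves the bare name

String bat = foo("bar")
assert bat == "bar"

Compile Tools.groovy first (or pack it into a jar on the classpath) and Main.groovy can call foo exactly as in the original scripting language.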

How to do custom python imports?

Is there a way to have custom behaviour for import statements in Python? How? E.g.:
import "https://github.com/kennethreitz/requests"
import requests#2.11.1
import requests#7322a09379565bbeba9bb40000b41eab8856352e
Alternatively, in case this isn't possible... Can this be achieved in a standard way with function calls? How? E.g.:
import gitloader
repo = gitloader.repo("https://github.com/kennethreitz/requests")
requests = repo.from_commit("7322a09379565bbeba9bb40000b41eab8856352e")
There are two reasons why I would like to do this. The first is convenience (Golang-style imports). The second is that I need cryptographic verification of plugin modules for a project. I'm using Python 3.x.
What you're essentially asking for is customization of the import steps. This was made possible with PEP 302, which introduced hooks for customizing the import machinery.
That PEP is currently not the source from which you should learn how importing works; rather, look at the documentation for the import statement for reference. There it states:
Python includes a number of default finders and importers. The first one knows how to locate built-in modules, and the second knows how to locate frozen modules. A third default finder searches an import path for modules. The import path is a list of locations that may name file system paths or zip files. It can also be extended to search for any locatable resource, such as those identified by URLs.
The import machinery is extensible, so new finders can be added to extend the range and scope of module searching.
In short, you'll need to define the appropriate finder (to find the modules you're looking for) and an appropriate loader for loading these.
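As a minimal sketch of that (the module name and digest are placeholders, and a real verifier would need to handle packages and caching), a meta path finder can delegate the search to the standard PathFinder and reject modules whose source hash doesn't match:

import hashlib
import importlib.abc
import importlib.machinery
import sys

class VerifyingFinder(importlib.abc.MetaPathFinder):
    """Refuse to import listed modules unless their source matches a known hash."""

    def __init__(self, allowed):
        self.allowed = allowed  # {module name: expected sha256 hex digest}

    def find_spec(self, fullname, path, target=None):
        if fullname not in self.allowed:
            return None  # let the default finders handle everything else
        # delegate the actual search to the standard path-based finder
        spec = importlib.machinery.PathFinder.find_spec(fullname, path)
        if spec is None or not spec.origin:
            return None
        with open(spec.origin, "rb") as f:
            digest = hashlib.sha256(f.read()).hexdigest()
        if digest != self.allowed[fullname]:
            raise ImportError("hash mismatch for " + fullname)
        return spec

# installed first, so it runs before the default finders on every import
sys.meta_path.insert(0, VerifyingFinder({"requests": "<expected sha256>"}))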

Groovy - extensions structure

I'd like to extend String's asType method to handle LocalDateTime. I know how to override this method, but I've no idea where I should put it in the project structure for it to work globally, for all strings in my project. Is it enough to put such an extension anywhere on the classpath? I know that there's a special convention for extensions (META-INF/services); how does it work for method overriding?
All documentation regarding this topic can be found here; the exact relevant part is here.
Module extension and module descriptor
For Groovy to be able to load your extension methods, you must declare your extension helper classes. You must create a file named org.codehaus.groovy.runtime.ExtensionModule into the META-INF/services directory:

moduleName=Test module for specifications
moduleVersion=1.0-test
extensionClasses=support.MaxRetriesExtension
staticExtensionClasses=support.StaticStringExtension

The module descriptor requires 4 keys:
moduleName: the name of your module
moduleVersion: the version of your module. Note that the version number is only used to check that you don’t load the same module in two different versions.
extensionClasses: the list of extension helper classes for instance methods. You can provide several classes, given that they are comma separated.
staticExtensionClasses: the list of extension helper classes for static methods. You can provide several classes, given that they are comma separated.
Note that it is not required for a module to define both static helpers and instance helpers, and that you may add several classes to a single module. You can also extend different classes in a single module without problem. It is even possible to use different classes in a single extension class, but it is recommended to group extension methods into classes by feature set.
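To make the descriptor above concrete, here is a minimal sketch of what a helper class such as support.MaxRetriesExtension could look like (the method body is my illustration, not from the documentation); the first parameter of each static method is the receiver being extended:

// file: src/support/MaxRetriesExtension.groovy
package support

class MaxRetriesExtension {
    // adds "3.maxRetries { ... }" to every Integer
    static void maxRetries(Integer self, Closure code) {
        int attempts = 0
        while (true) {
            try {
                code.call()
                return
            } catch (Throwable t) {
                if (++attempts >= self) {
                    throw t   // out of retries: rethrow the last failure
                }
            }
        }
    }
}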
Module extension and classpath
It’s worth noting that you can’t use an extension which is compiled at the same time as code using it. That means that to use an extension, it has to be available on the classpath, as compiled classes, before the code using it gets compiled. Usually, this means that you can’t have the test classes in the same source unit as the extension class itself. Since, in general, test sources are separated from normal sources and executed in another step of the build, this is not an issue.
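A minimal sketch of that build ordering (all paths are assumptions):

$ groovyc -d build/classes src/support/MaxRetriesExtension.groovy
$ cp -r src/META-INF build/classes/    # the module descriptor must be on the classpath too
$ groovy -cp build/classes src/Main.groovy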

How to prevent dead-code removal of utility libraries in Haxe?

I've been tasked with creating conformance tests of user input; the task is fairly tricky and we need very high levels of reliability. The server runs on PHP, the client runs on JS, and I thought Haxe might reduce duplicative work.
However, I'm having trouble with dead-code elimination. Since I am just creating helper functions (utilObject.isMeaningOfLife(42)), I don't have a main program that calls each one. I tried adding @:keep to a utility class, but it was cut out anyway.
I tried to specify that utility class through the -main switch, but I had to add a dummy main() method and this doesn't scale beyond that single class.
You can force all the files defined in a given package and its sub-packages to be included in the build using a compiler argument:
haxe --macro "include('my.package')" ..etc
This is a shortcut to the macro.Compiler.include function.
As you can see from the signature, this function lets you include recursively and also exclude packages:
static include(pack:String, rec:Bool = true, ?ignore:Array<String>, ?classPaths:Array<String>):Void
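For example, to include my.package recursively while excluding a hypothetical sub-package (add your usual target arguments after it):

haxe --macro "include('my.package', true, ['my.package.internal'])" ..etc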
I think you don't have to use @:keep on each library class in that case.
I'm not sure if this is what you are looking for, I hope it helps.
Otherwise, these checks could be helpful:
Is it actually a problem that unused code is cut away?
Could the code in question be inlined in the final output?
Compile your code using the compiler flag -dce std, as mentioned in the comments.
If you use the static analyzer, try building without it.
Add @:keep and reference the class and function somewhere, as sketched below.
Otherwise, provide a minimal setup so the problem can be reproduced.
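If you go the @:keep route, a minimal sketch (the class and function echo the question's example):

@:keep
class UtilObject {
    public static function isMeaningOfLife(n:Int):Bool {
        return n == 42;
    }
}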
