There are quite a lot of posts on removing namespaces in Python, but nearly all of them use the lxml package. It seems very nice, but I've had trouble getting it to work on Windows.
What I want to achieve with my tags is similar to:
removing namespace aliases from xml
but that answer is oriented toward JSON and doesn't seem to be Python-based.
Similarly, I'm unclear on how to implement this:
https://stackoverflow.com/a/61786754/9249533
This older post is somewhat helpful:
https://stackoverflow.com/a/18160058/9249533
But I'm wondering what the options are as of July 2021.
FWIW, my data are Excel exports in SpreadsheetML (hence the urn:schemas-microsoft-com:office:spreadsheet namespace below). My aim is just to access the data; I do not care to move it back to Excel.
Presently, if I run:

from xml.etree import ElementTree as ET

tree = ET.parse(in_path + 'myfile.xml')
root = tree.getroot()
for child in root.iter():
    print(child.tag)

it returns this for the 'Data' tag:
{urn:schemas-microsoft-com:office:spreadsheet}Data
I'd like it to just be:
Data
The prefix/namespace is hampering my very modest XML interpretation skills. Any guidance on doing this with packages that are more readily Windows-compatible (or, better still, conda-installable) would be much appreciated.
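To make the goal concrete, here is a minimal stdlib-only sketch of the kind of thing I'm after (the file name is a placeholder): strip the {namespace} part from every tag after parsing.

from xml.etree import ElementTree as ET

tree = ET.parse('myfile.xml')   # placeholder path
root = tree.getroot()

for el in root.iter():
    # tags arrive as '{urn:schemas-microsoft-com:office:spreadsheet}Data';
    # keep only the local name after the closing brace
    if el.tag.startswith('{'):
        el.tag = el.tag.split('}', 1)[1]

for child in root.iter():
    print(child.tag)   # now prints plain names such as 'Data'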
I am very new to Python and have started using the PyCharm IDE for development.
How do I display all the functions/APIs available in a class or file in Python?
For example, when coding in Scala/Java in an IDE, we get all the functions/methods available for a class or object just by typing object_name. (dot), and it shows all the APIs.
I am not able to get that for Python in PyCharm.
For example, I wanted to see all the functions available on a Python list. I found dir(module-name), but that is not what I need while developing quickly.
Thanks in advance.
dir(module-name) and dir(class-name) are exactly how you do this. Yes, the output isn't returned in a user-friendly manner, and you may need to dig down further for nested modules, but the only better option would be to learn from the package/API docs.
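For example (plain built-ins, nothing PyCharm-specific):

# Explore the list API from any Python REPL
print(dir(list))                                        # every attribute name, dunders included
print([n for n in dir(list) if not n.startswith('_')])  # just the public methods
help(list.append)                                       # the docstring for one method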
Here is some Python code that illustrates the problem:
from pyscipopt import Model
master = Model("master LP")
relax = master.relax()
This generates the error:
builtins.AttributeError: 'pyscipopt.scip.Model' object has no attribute 'relax'
These Python statements are taken from the SCIP documentation --- section "Column generation method for the cutting stock problem".
Note: I am using Python 3.6.5 and pyscipopt 3.0.2.
The relax method does not exist in PySCIPOpt.
What you link to is a book that, I think, was originally written for Gurobi's Python interface. The authors started translating it to use with SCIP, but this is not ready yet; you can actually see that in the Todo at the beginning of the page you link to.
In any case, this is not the documentation of SCIP.
Check the PySCIPOpt GitHub page, and if you have further problems or questions, please open an issue there.
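If in doubt, you can verify directly against your installed version (plain introspection, not official PySCIPOpt API docs; the False claim assumes pyscipopt 3.0.2, as in the question):

from pyscipopt import Model

master = Model("master LP")
print(hasattr(master, "relax"))   # False on pyscipopt 3.0.2
# list what the Model object actually provides
print([name for name in dir(master) if not name.startswith("_")])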
All I want is to generate API docs from the function docstrings in my source code, presumably through Sphinx's autodoc extension, to form my lean API documentation. My code follows the functional programming paradigm, not OOP, as demonstrated below.
I'd probably, as a second step, add one or more documentation pages for the project, hosting things like introductory comments, code examples (leveraging doctest I guess) and of course linking to the API documentation itself.
What might be a straightforward flow to accomplish documentation from docstrings here? Sphinx is a great popular tool, yet I find its getting started pages a bit dense.
What I've tried, from within my source directory:
$ mkdir documentation
$ sphinx-apidoc -f --ext-autodoc -o documentation .
There are no error messages, yet this doesn't find (or handle) the docstrings in my source files; it just creates one .rst file per source file, with contents like the following:
tokenizer module
================

.. automodule:: tokenizer
   :members:
   :undoc-members:
   :show-inheritance:
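For context: sphinx-apidoc only writes these stub files; the docstrings are pulled in later when you run sphinx-build, which needs a conf.py that enables autodoc and puts the sources on sys.path. A minimal sketch, assuming the sources live one level above documentation/ and that an index.rst with a toctree listing the generated pages also exists:

# documentation/conf.py, a minimal sketch; paths and the project name are assumptions
import os
import sys
sys.path.insert(0, os.path.abspath('..'))   # so autodoc can import tokenizer.py

project = 'hltk'
extensions = ['sphinx.ext.autodoc']

# then build with: sphinx-build -b html documentation documentation/_build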
Basically, my source files look like the following, without much module ceremony or object-oriented content in them (I like functional programming, even though it's Python this time around). I've truncated the sample source file below, of course; it contains more functions than shown.
tokenizer.py
from hltk.util import clean, safe_get, safe_same_char

"""
Basic tokenization for text

not supported:
+ forms of pseudo-ellipsis (...)

support for the above should be added only as part of an automata rewrite
"""

always_swallow_separators = u" \t\n\v\f\r\u200e"
always_separators = ",!?()[]{}:;"

def is_one_of(char, chars):
    '''
    Returns whether the input `char` is any of the characters of the string `chars`
    '''
    return chars.count(char)
Or would you recommend a different tool and flow for this use case?
Many thanks!
If you find Sphinx too cumbersome and particular to use for simple projects, try pdoc:
$ pdoc --html tokenizer.py
Is there a way to have custom behaviour for import statements in Python? How? E.g.:
import "https://github.com/kennethreitz/requests"
import requests#2.11.1
import requests#7322a09379565bbeba9bb40000b41eab8856352e
Alternatively, in case this isn't possible... Can this be achieved in a standard way with function calls? How? E.g.:
import gitloader
repo = gitloader.repo("https://github.com/kennethreitz/requests")
requests = repo.from_commit("7322a09379565bbeba9bb40000b41eab8856352e")
There are two reasons why I would like to do this. The first is convenience (Golang-style imports). The second is that I need cryptographic verification of plugin modules for a project. I'm using Python 3.x.
What you're essentially asking for is customization of the import steps. This was made possible by PEP 302, which introduced hooks for customizing the import machinery.
That PEP is currently not the source from which you should learn how importing works; rather, look at the documentation for the import statement for reference. There it states:
Python includes a number of default finders and importers. The first one knows how to locate built-in modules, and the second knows how to locate frozen modules. A third default finder searches an import path for modules. The import path is a list of locations that may name file system paths or zip files. It can also be extended to search for any locatable resource, such as those identified by URLs.
The import machinery is extensible, so new finders can be added to extend the range and scope of module searching.
In short, you'll need to define the appropriate finder (to find the modules you're looking for) and an appropriate loader for loading these.
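A minimal sketch of those moving parts (GitFinder, GitLoader, and demo_module are made-up names for illustration; a real version would fetch the source for a pinned commit and verify it inside exec_module):

import sys
from importlib.abc import Loader, MetaPathFinder
from importlib.machinery import ModuleSpec

class GitLoader(Loader):
    def create_module(self, spec):
        return None                        # fall back to the default module object
    def exec_module(self, module):
        # A real loader would download the pinned source here, check its
        # hash/signature, and only then execute it.
        source = "answer = 42"
        exec(source, module.__dict__)

class GitFinder(MetaPathFinder):
    def find_spec(self, fullname, path, target=None):
        if fullname != "demo_module":      # only claim the module we handle
            return None
        return ModuleSpec(fullname, GitLoader())

sys.meta_path.insert(0, GitFinder())       # register the custom finder

import demo_module
print(demo_module.answer)                  # -> 42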
In an IPython notebook, one would expect the following code to cause Raphael.js to load successfully into the global namespace.
from IPython.display import Javascript
raphael_url = "https://cdnjs.cloudflare.com/ajax/libs/raphael/2.1.0/raphael-min.js"
Javascript('alert(Raphael);', lib=[raphael_url])
However, it does not work in recent versions of IPython, which use require.js. It turns out that Raphael.js, which IPython loads using jQuery.getScript(), detects the presence of require.js and therefore does not insert itself into the global namespace. In fact, if one first runs JavaScript code removing the window.define object, Raphael no longer realizes require.js is present, and it inserts itself into the global namespace as I would like. In other words, the code above works after running the following:
Javascript('window.define = undefined;')
Thus, the only way I am able to get Raphael to load in a recent version of the IPython notebook is to delete (or set aside) window.define.
Having identified the problem, I am not familiar enough with require.js to know which piece of software is acting against protocol. Is Raphael using a poor way of testing for the existence of require.js? Should IPython be using require.js directly, instead of jQuery.getScript(), when it loads user-provided JavaScript libraries? Or is there a way I, as the user, ought to be embracing require.js that will give me the Raphael object without any special hacks? (If the answer to the last question is yes, is there a way I can also support older versions of the IPython notebook, which do not use require.js?)
The first part of my answer won't please you: the loading and requiring of JavaScript libraries in the IPython notebook web app has not been solved yet, so for now I would suggest not building too much on the assumption that you can load libraries like that, and relying more on custom.js for the time being.
That being said, if Raphael is not in the global namespace, require is smart enough to load and cache it and give you a reference to it. Then, in the callback, you can just assign it to a global:
require(['raphael'], function (raph) {
    window.raphael = raph;
});
Or something like that should do the trick.
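For completeness, a sketch of how embracing require.js from a notebook cell might look (this assumes require.js is on the page; note that the path mapping must omit the trailing .js, and the CDN URL is the one from the question):

from IPython.display import Javascript

Javascript("""
require.config({
    paths: {
        // no trailing .js here; require appends it
        raphael: "https://cdnjs.cloudflare.com/ajax/libs/raphael/2.1.0/raphael-min"
    }
});
require(["raphael"], function (raph) {
    window.Raphael = raph;   // expose globally for later cells
    alert(Raphael);
});
""")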