yaml octal integer generates errors - python-3.x

In my YAML file I have the below entry:
- type: dir
  name: .ssh
  chmod: 0o700
According to the YAML 1.2 specification, section 3.2.1.3, 0o700 is the way to specify octals (there is also an example in section 2.4).
However when I process the loaded file and do:
import os
import yaml
filename = "in.yml"
with open(filename) as fp:
    for e in yaml.load(fp):
        if e['type'] == 'dir':
            os.mkdir(e['name'], e['chmod'])
I get TypeError: an integer is required. What is going wrong here?
I am using Python 3.5

What's wrong is that you assume that your YAML library supports the latest version, 1.2. That YAML version is from 2009, but you are using PyYAML, which still only supports 1.1. Judging from the inactivity of the last few years it seems to be a dead project, so don't expect this to be solved any time soon.
You can add
import re
from yaml.resolver import Resolver

Resolver.add_implicit_resolver(
    'tag:yaml.org,2002:int',
    re.compile(r'''^(?:[-+]?0b[0-1_]+
                   |[-+]?0o?[0-7_]+
                   |[-+]?0[0-7_]+
                   |[-+]?(?:0|[1-9][0-9_]*)
                   |[-+]?0x[0-9a-fA-F_]+
                   |[-+]?[1-9][0-9_]*(?::[0-5]?[0-9])+)$''', re.X),
    list('-+0123456789'))
in your program to add recognition of 0o123-style octals (it also still recognizes the 1.1 octals).
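A quick check that the resolver took effect (a sketch; the mapping key is arbitrary):

import yaml

# With the resolver registered, '0o700' resolves to an int instead of a str
print(yaml.safe_load('chmod: 0o700'))  # -> {'chmod': 448}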
Please note that the above only works for Python 3, as PyYAML has different code for Python 2.
You should also consider using pathlib.Path objects and their .mkdir() instead of os.mkdir().
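For example, a minimal sketch of the pathlib variant (assuming the same loaded entries as in the question):

from pathlib import Path

for e in yaml.load(fp):
    if e['type'] == 'dir':
        # Path.mkdir takes the permission bits as the mode keyword
        Path(e['name']).mkdir(mode=e['chmod'])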

Install ruamel.yaml ( pip install ruamel.yaml ). It defaults to loading 1.2 as documented here:
unless the YAML document is loaded with an explicit version==1.1 or the document starts with %YAML 1.1, ruamel.yaml will load the document as version 1.2.
and
YAML 1.2 no longer accepts strings that start with a 0 and solely consist of number characters as octal, you need to specify such strings with 0o[0-7]+ (zero + lower-case o for octal + one or more octal characters).
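A minimal loading sketch using ruamel.yaml's API (file name taken from the question):

from ruamel.yaml import YAML

yaml = YAML()  # YAML 1.2 behavior by default
with open("in.yml") as fp:
    data = yaml.load(fp)
print(data[0]['chmod'])  # 448, i.e. 0o700 parsed as an integer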

Related

ValueError: Comments are not supported by the python backend

The ijson module has a documented option allow_comments=True, but when I include it,
an error message is produced:
ValueError: Comments are not supported by the python backend
Below is a transcript using the file test.py:
import ijson

for o in ijson.items(open(0), 'item'):
    print(o)
Please note that I have no problem with a similar documented option, multiple_values=True.
Transcript
$ python3 --version
Python 3.10.9
$ python3 test.py <<< [1,2]
1
2
# Now change the call to: ijson.items(open(0), 'item', allow_comments=True)
$ python3 test.py <<< [1,2]
Traceback (most recent call last):
  File "/Users/user/test.py", line 5, in <module>
    for o in ijson.items(open(0), 'item', allow_comments=True):
  File "/usr/local/lib/python3.10/site-packages/ijson/utils.py", line 51, in coros2gen
    f = chain(events, *coro_pipeline)
  File "/usr/local/lib/python3.10/site-packages/ijson/utils.py", line 29, in chain
    f = coro_func(f, *coro_args, **coro_kwargs)
  File "/usr/local/lib/python3.10/site-packages/ijson/backends/python.py", line 284, in basic_parse_basecoro
    raise ValueError("Comments are not supported by the python backend")
ValueError: Comments are not supported by the python backend
$
Take a look at the Backends section of the documentation, which says:
Ijson provides several implementations of the actual parsing in the form of backends located in ijson/backends:
yajl2_c: a C extension using YAJL 2.x. This is the fastest, but might require a compiler and the YAJL development files to be present when installing this package. Binary wheel distributions exist for major platforms/architectures to spare users from having to compile the package.
yajl2_cffi: wrapper around YAJL 2.x using CFFI.
yajl2: wrapper around YAJL 2.x using ctypes, for when you can’t use CFFI for some reason.
yajl: deprecated YAJL 1.x + ctypes wrapper, for even older systems.
python: pure Python parser, good to use with PyPy
And later on in the FAQ it says:
Q: Are there any differences between the backends?
...
The python backend doesn't support allow_comments=True. It also internally works with str objects, not bytes, but this is an internal detail that users shouldn't need to worry about, and might change in the future.
If you want support for allow_comments=True, you need to be using one of the yajl based backends. According to the docs:
Importing the top level library as import ijson uses the first available backend in the same order of the list above, and its name is recorded under ijson.backend. If the IJSON_BACKEND environment variable is set its value takes precedence and is used to select the default backend.
You'll need the necessary libraries, etc, installed on your system in order for this to work.
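A minimal sketch of selecting a yajl backend explicitly (assuming a yajl2_c build is installed; ijson.get_backend is the documented helper):

import ijson

# Load a specific backend instead of relying on the default search order
backend = ijson.get_backend('yajl2_c')  # or 'yajl2_cffi' / 'yajl2'
with open('test.json') as f:
    for o in backend.items(f, 'item', allow_comments=True):
        print(o)

Alternatively, run the original script unchanged with IJSON_BACKEND=yajl2_c set in the environment before importing ijson.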

What are Python3 libraries which replace "from scikits.audiolab import Format, Sndfile"

I hope you are all doing well. I am new to Python. I am trying to use the scikits.audiolab library with Python 3. I have a working version of the code in 2.7 (with scikits.audiolab). When I run it under Python 3 I get ImportError: No Module Named 'Version'. From what I gather, scikits.audiolab no longer supports Python 3 (if I am not wrong). Can anyone suggest a replacement library for scikits.audiolab that offers the same functionality, OR any other solution that might help me? Thanks in advance.
2.7 Version Code :
import numpy as np
from tempfile import mkstemp
from scikits.audiolab import Format, Sndfile
from scipy.signal import firwin, lfilter

# `all`, `RawRate` and `SendSpeech` come from the surrounding script (see link below)
array = np.array(all)
fmt = Format('flac', 'pcm16')
nchannels = 1
cd, FileNameTmp = mkstemp('TmpSpeechFile.wav')
# making the file .flac
afile = Sndfile(FileNameTmp, 'w', fmt, nchannels, RawRate)
# writing in the file
afile.write_frames(array)
SendSpeech(FileNameTmp)
To see the entire code, please visit the Google Asterisk Reference Code (I am modifying based on this code).
I want to modify this code to use Python 3-supported libraries. I am doing this for the Asterisk-Microsoft-Speech-To-Text SDK.
Firstly, the code you linked is Asterisk-Google-Speech-Recognition, not Microsoft-Speech-To-Text; if you want a sample of Microsoft-Speech-To-Text, you could refer to the official doc: Recognize speech from an audio file.
And about the problem you mention: yes, it's not completely compatible. There is a solution for it in the GitHub issue; you could refer to this comment.
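For illustration only (an assumption on my part, not the fix from the linked comment): one commonly used Python 3 library for writing FLAC/WAV files is soundfile. A minimal sketch, assuming 16 kHz mono samples:

# Assumption: the soundfile package (pip install soundfile) as a
# Python 3 stand-in for scikits.audiolab's Format/Sndfile.
import numpy as np
import soundfile as sf

samples = np.zeros(16000)  # placeholder for the real audio array
sf.write('TmpSpeechFile.flac', samples, samplerate=16000, subtype='PCM_16')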

web2py XML helper sanitize line breaks under python3

In my web2py app I'm processing a list of items, where the user can click on a link for each item to select it. An item has a UUID, a title and a description. For better orientation the item description is also displayed as the link title. To prevent injections and to escape tags in the description I'm using the XML sanitizer as follows:
A(this_item.title,
  callback=URL('item', 'select',
               vars=dict(uuid=this_item.uuid), user_signature=True),
  _title=XML(str_replace(this_item.description,
                         {'\r\n': '&#13;', '<': '&lt;', '>': '&gt;'}),
             sanitize=True))
Using Python 2 everything was fine. Since I switched to Python 3 I have the following problem: when the description contains line breaks, the sanitizer is not working anymore. For example, the following string produced by my str_replace routine is fine to be sanitized by the XML helper under Python 2, but not under Python 3:
Header

Line1
Line2
Line3
Sanitizing line breaks escaped as &#13; is the problem with Python 3 (but not with Python 2). Everything else is no problem for the XML helper to sanitize (e.g. less than or greater than; I need these since, if there is no description, it is generated as <no description>).
How can be line breaks sanitized by the XML helper running web2py under Python3?
Thanks for any support!
Best regards
Clemens
This is down to a change in python's HTMLParser class between 3.4 and 3.5, where convert_charrefs started defaulting to True:
Python 3.4 DeprecationWarning convert_charrefs
I think the following fix in your web2py yatl source should correct it:
https://github.com/web2py/yatl/compare/master...timnyborg:patch-1
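For illustration (my own sketch, not part of the yatl patch), the underlying behavior change can be reproduced with html.parser directly:

from html.parser import HTMLParser

class Collector(HTMLParser):
    def handle_data(self, data):
        print('data:', repr(data))
    def handle_charref(self, name):
        print('charref:', name)

# Since Python 3.5, convert_charrefs defaults to True, so &#13; is decoded
# into a real carriage return before the handlers ever see it.
p = Collector(convert_charrefs=True)
p.feed('Line1&#13;Line2')
p.close()  # prints: data: 'Line1\rLine2'

p = Collector(convert_charrefs=False)
p.feed('Line1&#13;Line2')
p.close()  # prints: data: 'Line1', charref: 13, data: 'Line2'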

Loading .npz with Python 3.5 always crashes

In this simple tutorial written in Python 2.7, they have a line loading the numpy array.
train_data = np.load(open('../musicnet.npz','rb'))
Then, they get the data by calling different keys
X,Y = train_data['2494']
Everything works well in python 2.7
Data type of train_data is numpy.lib.npyio.NpzFile
My problem
However, whenever I try to do the same in Python 3.5, most of the lines work fine, except the line X,Y = train_data['2494'], which just freezes forever. I would like to use Python 3.5 because my other projects are written in Python 3.5.
How to rewrite this line so that it runs with Python 3.5?
Error Message
I finally managed to get the error message in the terminal.
It freezes because there's tons of output right after the error message; my Jupyter notebook just cannot handle that much information.
Solution
Change the encoding to 'bytes'
train_data = np.load('../musicnet.npz', encoding='bytes')
Then everything works fine.
You first said things crashed; now you say it freezes when trying to access a specific array. numpy has the same syntax in 3.5 as in 2.7. You shouldn't have to rewrite anything.
np.load does have a couple of parameters that deal with differences between Py2 and Py3. But I'm not sure these are an issue for you.
fix_imports : bool, optional
Only useful when loading Python 2 generated pickled files on Python 3,
which includes npy/npz files containing object arrays. If `fix_imports`
is True, pickle will try to map the old Python 2 names to the new names
used in Python 3.
encoding : str, optional
What encoding to use when reading Python 2 strings. Only useful when
loading Python 2 generated pickled files in Python 3, which includes
npy/npz files containing object arrays. Values other than 'latin1',
'ASCII', and 'bytes' are not allowed, as they can corrupt numerical
data. Default: 'ASCII'
Try
print(list(train_data.keys()))
This should show the array names that were saved to the zip archive. Do they match the names in the Py2 load? Do they include the '2494' name?
A couple of things are unusual about:
X,Y = train_data['2494']
Naming an array in the zip archive by a string number, and unpacking the load into two variables.
Do you know anything about how this was saved with savez? What was saved?
Another question - are you loading this file from the same machine that Py2 worked on? Or has the file been transferred from another machine, and possibly corrupted?
As those parameters indicate, there are differences in the pickle code between Py2 and Py3. If the original save included object dtype arrays, or non-array objects, then they will be pickled and there might be incompatibilities in the pickle versions.
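For instance, a sketch combining those parameters (allow_pickle is my addition; newer numpy versions require it for object arrays):

import numpy as np

# encoding='bytes' keeps Python 2 str data as bytes instead of mis-decoding it;
# allow_pickle=True is needed for object arrays on numpy >= 1.16.3
with np.load('../musicnet.npz', allow_pickle=True, encoding='bytes') as train_data:
    print(list(train_data.keys()))
    X, Y = train_data['2494']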
Try this,
with np.load('../musicnet.npz') as train_data:
    X, Y = train_data['2494']
There are two ways out, in my view:
1. Re-edit your code from
train_data = np.load(open('../musicnet.npz','rb'))
to
train_data = np.load(open('../musicnet.npz','r'))
because the difference between the 'r' and 'rb' modes in Python 2.7 vs 3.5 matters in your situation.
2. Use the default debugger to pinpoint the significant error (this usually works, in my experience).

Encoding issue with python3 and click package

When the click lib detects that the runtime is Python 3 but the encoding is ASCII, it ends the Python program abruptly:
RuntimeError: Click will abort further execution because Python 3 was configured to use ASCII as encoding for the environment. Either switch to Python 2 or consult http://click.pocoo.org/python3/ for mitigation steps.
I found the cause of this issue in my case: when I connect to my Linux host from my Mac, Terminal.app sets the SSH session locale to my Mac locale (es_ES.UTF-8), but my Linux host doesn't have that locale installed (only en_US.utf-8).
I applied an initial workaround to fix it (but It had many issues, see accepted answer):
import locale, codecs, os

# locale.getpreferredencoding() == 'ANSI_X3.4-1968'
if codecs.lookup(locale.getpreferredencoding()).name == 'ascii':
    os.environ['LANG'] = 'en_US.utf-8'
EDIT: For a better patch see my accepted answer.
All my Linux hosts have 'en_US.utf-8' installed as a locale (Fedora uses it as the default).
My question is: is there a better (more robust) way to choose/force the locale in a Python 3 script? For instance, setting one of the locales available on the system.
Maybe there is a different approach to fix this issue but I didn't find it.
If you have python version >= 3.7, then you should not need to do anything. If you have python 3.6 see the original solution.
EDIT 2017-12-08
I've seen that there is a PEP 538 for py3.7, that will change the entire behavior of python3 encoding management during startup, I think that the new approach will fix the original problem: https://www.python.org/dev/peps/pep-0538/
IMHO the changes targeted at Python 3.7 for encoding issues should have been planned years ago, but better late than never, I guess.
EDIT 2015-09-01
There is an open issue (enhancement), http://bugs.python.org/issue15216, that will allow changing the encoding of an already-created (but not yet used) stream (sys.std*) easily. But it is targeted at Python 3.7, so we'll have to wait for a while.
Original solution that targets python version 3.6
NOTE: this solution should not be needed for anyone running python version >= 3.7 see PEP 538
Well, my initial workaround had many flaws. It got past the click library's check on the encoding, but the encoding itself was not fixed, so I got exceptions when the input parameters or the output contained non-ascii characters.
I had to implement a more complex method with 3 steps: set the locale, correct the encoding on std in/out, and re-encode the command line parameters. Besides that, I've added a "friendly" exit if the first attempt to set the locale doesn't work as expected:
def prevent_ascii_env():
    """
    To avoid issues reading unicode chars from stdin or writing to stdout, we need
    to ensure that the python3 runtime is correctly configured; if not, we try to
    force utf-8, and if that isn't possible we exit with a friendlier message than
    the original one.
    """
    import locale, codecs, os, sys
    # locale.getpreferredencoding() == 'ANSI_X3.4-1968'
    if codecs.lookup(locale.getpreferredencoding()).name == 'ascii':
        os.environ['LANG'] = 'en_US.utf-8'
    if codecs.lookup(locale.getpreferredencoding()).name == 'ascii':
        print("The current locale is not correctly configured in your system")
        print("Please set the LANG env variable to the proper value before calling this script")
        sys.exit(-1)
    # Once we have the proper locale.getpreferredencoding() we can change the current stdin/out streams
    _, encoding = locale.getdefaultlocale()
    import io
    sys.stderr = io.TextIOWrapper(sys.stderr.detach(), encoding=encoding, errors="replace", line_buffering=True)
    sys.stdout = io.TextIOWrapper(sys.stdout.detach(), encoding=encoding, errors="replace", line_buffering=True)
    sys.stdin = io.TextIOWrapper(sys.stdin.detach(), encoding=encoding, errors="replace", line_buffering=True)
    # And finally we need to re-encode the input parameters
    for i, p in enumerate(sys.argv):
        sys.argv[i] = os.fsencode(p).decode()
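A usage sketch (my assumption about the entry point; the call must happen before click validates the environment):

import click

prevent_ascii_env()  # fix the environment before click inspects it

@click.command()
@click.argument('name')
def greet(name):
    click.echo('Hola, %s' % name)

if __name__ == '__main__':
    greet()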
This patch solves almost all issues. However, it has a caveat: the method shutil.get_terminal_size() raises a ValueError because sys.__stdout__ has been detached, and the click lib uses that method to print the help. To fix it I had to apply a monkey-patch on the click lib:
def wrapper_get_terminal_size():
    """
    Replace the original function termui.get_terminal_size (click lib) with a new
    one that falls back to os.get_terminal_size() if a ValueError is raised.
    """
    from click import termui, formatting
    old_get_term_size = termui.get_terminal_size

    def _wrapped_get_terminal_size():
        try:
            return old_get_term_size()
        except ValueError:
            import os
            sz = os.get_terminal_size()
            return sz.columns, sz.lines

    termui.get_terminal_size = _wrapped_get_terminal_size
    formatting.get_terminal_size = _wrapped_get_terminal_size
With these changes all my scripts now work fine when the environment has a wrong locale configured but the system supports en_US.utf-8 (the Fedora default locale).
If you find any issue on this approach or have a better solution, please add a new answer.
It's an old thread, but this answer might help others (or myself) in the future. If it's *nix, check:
env | grep LC_ALL
If it's set, do the following. That's all there is to it:
unset LC_ALL
If you are running python 3.6 then you will still get this error. Here is a simple solution that the authors of click recommend:
#!/bin/bash
# before your python code executes set two environment variables
export LANG=en_US.utf8
export LC_ALL=en_US.utf8
NOTE: replace the values with whatever your locale is configured to
NOTE: this solution is even given in the PEP 538 document seen here.
I haven't found this simple method (re-exec the script with a proper environment before doing anything else) mentioned, so I'll add it for future travellers using an old Python version for some reason. Add it below the imports so it runs first:
import os, sys

# use os.environ.get() so a missing variable doesn't raise KeyError
if os.environ.get("LC_ALL") != "C.UTF-8" or os.environ.get("LANG") != "C.UTF-8":
    os.execve(sys.executable,
              [sys.executable, os.path.realpath(__file__)] + sys.argv[1:],
              {"LC_ALL": "C.UTF-8", "LANG": "C.UTF-8"})
