Python 3.5 not handling unicode input from CLI argument - python-3.x

I have a simple script that I'm attempting to use to automate some of the Japanese translation I do for my job.
import requests
import sys
import json
base_url = 'https://www.googleapis.com/language/translate/v2?key=CANT_SHARE_THAT&source=ja&target=en&q='
print(sys.argv[1])
base_url += sys.argv[1]
request = requests.get( base_url )
if request.status_code != 200:
    print("Error on request")
print( json.loads(request.text)['data']['translations'][0]['translatedText'])
When the first argument is a string like 初期設定クリア this script will explode at line
print(sys.argv[1])
With the message:
line 5, in encode
return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode characters in
position 0-6: character maps to <undefined>
So the bug can be reduced to
import sys
print(sys.argv[1])
Which seems like an encoding problem. I'm using Python 3.5.1, and the terminal is MINGW64 under Windows7 x64.
When I write the same program in Rust 1.8 (and the executable is run under the same conditions, i.e. MINGW64 under Windows 7 x64):
use std::env;
fn main() {
    let args: Vec<String> = env::args().skip(1).collect();
    print!("First arg: {}", &args[0] );
}
It produces the proper output:
$ rustc unicode_example.rs
$ ./unicode_example.exe 初期設定クリア
First arg: 初期設定クリア
So I'm trying to understand what is happening here. MINGW64 claims to have proper UTF-8 support, which it appears to. Does Python 3.5.1 not have full UTF-8 support? I was under the assumption that the move to Python 3.x was because of Unicode support.

Changing
print(sys.argv[1])
to
print(sys.argv[1].encode("utf-8"))
will cause Python to print the bytes representation instead:
$ python google_translate.py 初期設定クリア
b'\xe5\x88\x9d\xe6\x9c\x9f\xe8\xa8\xad\xe5\xae\x9a\xe3\x82\xaf\xe3\x83\xaa\xe3\x82\xa2'
Nonetheless, it works. So the bug, if it is a bug, happens when Python encodes the internal string to print it to the terminal, not when the argument is decoded into a Python string.
Also, simply removing the print statement fixes the bug as well.
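The failure can be reproduced without a Windows terminal by writing the same text through streams with different encodings. This is only a sketch: the helper name write_text is made up for illustration, and cp1252 is an assumed stand-in for whatever "charmap" code page the terminal in the question actually reported.

```python
import io


def write_text(text, encoding, errors="strict"):
    """Write text through a text stream with the given encoding; return the raw bytes."""
    raw = io.BytesIO()
    stream = io.TextIOWrapper(raw, encoding=encoding, errors=errors)
    stream.write(text)   # the str -> bytes encoding step happens here
    stream.flush()
    return raw.getvalue()


arg = "初期設定クリア"

# A UTF-8 stream can represent every character of the argument.
utf8_bytes = write_text(arg, "utf-8")
print(utf8_bytes == arg.encode("utf-8"))

# A single-byte code page (cp1252 is an assumption) cannot, so encoding fails.
try:
    write_text(arg, "cp1252")
except UnicodeEncodeError as exc:
    print("cp1252 stream failed:", exc.reason)

# With errors="replace", each unencodable character is replaced instead of raising.
print(write_text(arg, "cp1252", errors="replace"))
```

If this is indeed the cause, setting the PYTHONIOENCODING environment variable to utf-8 before running the script, or writing sys.argv[1].encode("utf-8") to sys.stdout.buffer, may avoid the error on a terminal that really does display UTF-8.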

Related

How can I use python dictionary?

I already made a JSON secrets file like this:
json_data = {
'creon' :
{'token' : ["abcd"]}}
I want to use it exactly like this:
token = app_info['creon']['token']
print(token)
> "abcd"
But the result looks like this:
print(token)
> abcd
How can I get the output I want?
My latest attempt and its result:
import os
import json
app_info = os.getenv('App_info')
with open(app_info, 'r') as f:
    app_info = json.load(f)
token = '"'+app_info['creon']['token']+'"'
print(token)
TypeError: expected str, bytes or os.PathLike object, not NoneType
I see a couple of problems. First, app_info['creon']['token'] is a list (["abcd"]), not a string, so concatenating it with '"' cannot work; you need to take its first element with app_info['creon']['token'][0]. In Python 3, json.load already returns str values, so no extra encode/decode step is needed.
I modified the code to look like this:
import os
import json
app_info = os.getenv('App_info')
with open(app_info, 'r') as f:
    app_info = json.load(f)
t = app_info['creon']['token'][0]
token = f'"{t}"'
print(token)
The second problem, TypeError: expected str, bytes or os.PathLike object, not NoneType, most likely occurs because the environment variable has not been set to the path of your JSON file, so os.getenv returns None. I set it in my terminal with export App_info=example.json (the name is case-sensitive and must match the one passed to os.getenv), and the code above then worked when I ran python3 example.py in the same terminal session.
If your question is how to print out the quotation marks along with the value, you can do:
print('"' + token + '"')
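Both points from the answers above can be sketched in one small example. The helper names quoted_token and config_path are hypothetical, the JSON is inlined as a string instead of being read from a file, and the environment is passed in as a plain mapping so the example does not depend on the real shell.

```python
import json
import os


def quoted_token(app_info):
    """Return the first token wrapped in double quotes."""
    token = app_info["creon"]["token"][0]   # "token" holds a list, so index it
    return '"{}"'.format(token)


def config_path(env_name="App_info", environ=os.environ):
    """Look up the config path and fail with a clear message if it is unset."""
    path = environ.get(env_name)
    if path is None:
        raise RuntimeError("environment variable {} is not set".format(env_name))
    return path


sample = json.loads('{"creon": {"token": ["abcd"]}}')
print(quoted_token(sample))                        # the token with quotes around it
print(json.dumps(sample["creon"]["token"][0]))     # json.dumps also adds the quotes
```

Checking for None before calling open turns the confusing TypeError into a message that names the missing variable.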

syntax error in AWS Greengrass V2 hello_world.py

I am experimenting with AWS IoT greengrass V2. I am just following the manual that has the following python code:
import sys
import datetime
message = f"Hello, {sys.argv[1]}! Current time: {str(datetime.datetime.now())}."
# Print the message to stdout.
print(message)
# Append the message to the log file.
with open('/tmp/Greengrass_HelloWorld.log', 'a') as f:
    print(message, file=f)
According to my logging there is a syntax error in line 4. Replacing line 4 with the following works fine:
message = "Hello"
Does anyone see what is wrong with this line:
message = f"Hello, {sys.argv[1]}! Current time: {str(datetime.datetime.now())}."
Thanks.
I'm one of the documentation writers for AWS IoT Greengrass.
Formatted string literals (f"some content") are a feature of Python 3.6+, and this syntax results in a syntax error in earlier versions. The getting started tutorial requirements incorrectly list Python 3.5 as a requirement, but Python 3.5 doesn't support formatted string literals. We'll update this requirement to say 3.6 or update the script to remove the formatted string literal.
To resolve this issue, you can upgrade to Python 3.6+ or modify the script to remove the formatted string literal. Thank you for finding this issue!
For the record: I altered the code, avoiding the f"stringliteral" :
import sys
import datetime
constrmessage ="Hello, ",str(sys.argv[1])," "+str(datetime.datetime.now())
#change from tuple to string
message = ''.join(constrmessage)
#print message to screen
print(message)
original_stdout = sys.stdout
# Append the message to the log file.
with open('/tmp/Greengrass_HelloWorld.log', 'a') as f:
    sys.stdout = f
    print(message)
    sys.stdout = original_stdout
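Another way to stay compatible with Python 3.5 is str.format, which keeps the message template close to the original f-string. This is a sketch rather than the tutorial's official script; the function name build_message and its optional now parameter are made up for illustration.

```python
import datetime


def build_message(name, now=None):
    """Build the greeting without an f-string, so it also parses on Python 3.5."""
    if now is None:
        now = datetime.datetime.now()
    # str.format fills the {} placeholders in order; a datetime is rendered via str().
    return "Hello, {}! Current time: {}.".format(name, now)


print(build_message("world"))
```

The print(message, file=f) call from the original script already works on Python 3.5, so only the line that builds the message needs to change.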

Get versions of a .txt list of packages using pip.main

I'm trying to iterate over a list of packages in a text file to get the version of each package. The list in the pkgs.txt file appears like this:
{
"package1",
"package2",
...
}
Here is my most recent code:
with open("pkgs.txt", "r") as pkgs:
    for line in pkgs:
        version = subprocess.check_call([sys.executable, '-m', 'pip', 'search', line])
        with open("versions.txt", "w+") as versions:
            for ver in version:
                version.write(ver)
The error I'm getting is: CalledProcessError: Command '['/opt/conda/bin/python', '-m', 'pip', 'search', '{\n']' returned non-zero exit status 23.
Could the issue be that I first need to remove the quotation marks and commas before I can loop through this list?
(There are too many aspects in this question to cover it all at once. I will try to cover the initial points here, the rest should be asked in a new question.)
First, pip doesn't have an API. The preferred way of using pip from your own program is to call it directly as a subprocess.
The input format is not a standard one (not CSV, not JSON; maybe YAML could cover it). Coincidentally, though, it is readable as a Python set literal of strings, so one could try parsing the content with ast.literal_eval.
pkgs.txt:
{
"package1",
"package2",
}
main.py:
#!/usr/bin/env python3
import ast
import pathlib
import subprocess
import sys
def main():
    file_name = 'pkgs.txt'
    file_path = pathlib.Path(file_name)
    file_content = file_path.read_text()
    pkgs = ast.literal_eval(file_content)
    print("pkgs", pkgs)
    for pkg in pkgs:
        print("pkg", pkg)
        command = [
            sys.executable,
            '-m',
            'pip',
            'search',
            pkg,
        ]
        print("command", command)
        command_output = subprocess.check_output(command)
        print("command_output")
        print('"""')
        print(command_output.decode())
        print('"""')
if __name__ == '__main__':
    main()
Extracting the correct information out of command_output could be treated in a follow-up question. (I suspect calling pip search is not the best choice here; it might be worth going with one of PyPI's APIs directly instead.)
command_output
"""
package2 (0.0.0) -
101703383-python-package2 (0.0.3) - This is my second package in python.
Vigenere-cipher-package2 (0.5) - An example of Vigenere cipher
WuFeiLiuGuang-first-package2 (2.0.0) - test pkg ignore
"""
References:
https://docs.python.org/3/library/ast.html#ast.literal_eval
https://stackoverflow.com/a/1894296/11138259
https://docs.python.org/3/library/pathlib.html#pathlib.Path.read_text
https://pip.pypa.io/en/stable/user_guide/#using-pip-from-your-program
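The parsing step from the answer above can be tried on its own, without calling pip. This is a minimal sketch: the function name parse_pkgs is hypothetical, and it assumes the file content is a valid Python set (or list) literal of strings, as in the cleaned-up pkgs.txt.

```python
import ast


def parse_pkgs(text):
    """Parse a Python-literal collection of package names into a sorted list."""
    value = ast.literal_eval(text)   # evaluates literals only, never arbitrary code
    return sorted(value)             # a set has no order, so sort for stable output


content = '{\n"package1",\n"package2",\n}\n'
for name in parse_pkgs(content):
    print(name)
```

Sorting is a design choice made here so that repeated runs process the packages in the same order.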

python3 pySerial TypeError: unicode strings are not supported, please encode to bytes:

In Python 3 I imported the pySerial library so I could communicate with my Arduino Uno by serial commands.
It worked very well in Python 2.7, but in Python 3 I keep running into this error:
TypeError: unicode strings are not supported, please encode to bytes: 'allon'
In Python 2.7 the only thing I did differently is use raw_input but I don't know what is happening in Python 3. Here is my code
import serial, time
import tkinter
import os
def serialcmdw():
    os.system('clear')
    serialcmd = input("serial command: ")
    ser.write (serialcmd)
    serialcmdw()
ser = serial.Serial()
os.system('clear')
ser.port = "/dev/cu.usbmodem4321"
ser.baudrate = 9600
ser.open()
time.sleep(1)
serialcmdw()
Encode the data you are writing to the serial port (in your case serialcmd) to bytes. Try the following:
ser.write(serialcmd.encode())
I ran into the same problem while learning Arduino serial communication with Python. Another way to do it is:
ser.write(str.encode('allon'))
If the data is a literal rather than a variable, you can write it as a bytes literal directly:
ser.write(b'\x0101')
The b prefix makes the literal a bytes object, so no separate encoding step is needed.
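The conversion itself can be checked without a serial port attached. The helper below is a hypothetical convenience wrapper, not part of pySerial; it only normalizes whatever the user typed into the bytes that ser.write expects.

```python
def to_serial_bytes(command, encoding="utf-8"):
    """Return command as bytes, encoding it first if it is a str."""
    if isinstance(command, (bytes, bytearray)):
        return bytes(command)        # already binary data, pass it through
    return command.encode(encoding)  # str -> bytes


print(to_serial_bytes("allon"))
print(to_serial_bytes(b"allon"))
```

With such a helper, the call in the question would become ser.write(to_serial_bytes(serialcmd)).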

Python 3 writing to a pipe

I'm trying to write some code to put data into a pipe, and I'd like the solution to be python 2.6+ and 3.x compatible.
Example:
from __future__ import print_function
import subprocess
import sys
if(sys.version_info > (3,0)):
    print ("using python3")
    def raw_input(*prmpt):
        """in python3, input behaves like raw_input in python2"""
        return input(*prmpt)
class pipe(object):
    def __init__(self,openstr):
        self.gnuProcess=subprocess.Popen(openstr.split(),
                                         stdin=subprocess.PIPE)
    def putInPipe(self,mystr):
        print(mystr, file=self.gnuProcess.stdin)
if(__name__=="__main__"):
    print("This simple program just echoes what you say (control-d to exit)")
    p=pipe("cat -")
    while(True):
        try:
            inpt=raw_input()
        except EOFError:
            break
        print('putting in pipe:%s'%inpt)
        p.putInPipe(inpt)
The above code works on python 2.6 but fails in python 3.2 (Note that the above code was mostly generated with 2to3 -- I just messed with it a little to make it python 2.6 compatible.)
Traceback (most recent call last):
File "test.py", line 30, in <module>
p.putInPipe(inpt)
File "test.py", line 18, in putInPipe
print(mystr, file=self.gnuProcess.stdin)
TypeError: 'str' does not support the buffer interface
I've tried the bytes function (e.g. print(bytes(mystr, 'ascii'))) suggested in the question "TypeError: 'str' does not support the buffer interface", but that doesn't seem to work.
Any suggestions?
The print function converts its arguments to a string representation, and outputs this string representation to the given file. The string representation is always of type str in both Python 2.x and Python 3.x. In Python 3.x, a pipe only accepts bytes or buffer objects, so this won't work. (Even if you pass a bytes object to print, it will be converted to a str.)
A solution is to use the write() method instead (and to flush after writing):
self.gnuProcess.stdin.write(bytes(mystr + "\n", "ascii"))
self.gnuProcess.stdin.flush()
However, Python 2 will complain about
bytes("something", "ascii")
If you use the mutable bytearray instead, the code works unaltered in both Python 2 and Python 3:
self.gnuProcess.stdin.write(bytearray(mystr + "\n", "ascii"))
self.gnuProcess.stdin.flush()
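Put together, the write-and-flush approach might look like the sketch below. To keep it runnable on systems without cat, the child process is an assumed stand-in: a small Python one-liner that copies its standard input to its standard output. The function name send_lines is hypothetical.

```python
import subprocess
import sys

# Stand-in for "cat -": copy stdin to stdout in binary mode.
ECHO_CHILD = "import sys; sys.stdout.buffer.write(sys.stdin.buffer.read())"


def send_lines(lines):
    """Write each line to the child's stdin as bytes and return what it echoes back."""
    child = subprocess.Popen(
        [sys.executable, "-c", ECHO_CHILD],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
    )
    for line in lines:
        # The pipe wants bytes, so encode explicitly instead of using print().
        child.stdin.write(bytearray(line + "\n", "ascii"))
        child.stdin.flush()
    child.stdin.close()            # signal end of input to the child
    output = child.stdout.read()
    child.stdout.close()
    child.wait()
    return output


print(send_lines(["hello", "world"]))
```

Reading the child's output only after closing its stdin is fine for small inputs like these; for large amounts of data, Popen.communicate is the safer choice because it avoids pipe-buffer deadlocks.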
