Python console does not display ü - python-3.x

Dear fellow developers,
I have tried to get the Python console to display ü, but it insists on displaying ü instead. I have tried it with Python 3.5 and Python 3.6, and the result is the same. If I run a .py file containing the line print("ü") with the F5 command, it displays
ü
instead of ü. If I type in the console
print("ü")
it displays
ü
I know this has been discussed many times, but most of the methods I have come across during the last 5 hours have not helped me, or I have not applied them properly. The problem also occurs with other non-ASCII characters. I appreciate your help!

Try adding the following line at the top of your source file: # coding: utf-8
Also, check that your file is saved with the correct encoding (always choose UTF-8).
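
The symptom itself is a hint: "ü" is what the UTF-8 bytes of "ü" look like when they are decoded with a Windows code page. Here is a minimal sketch of that mismatch, assuming cp1252 is the wrong encoding involved (the encoding your console actually uses may differ). The helper name mojibake is made up for illustration.

```python
import sys

# The encoding Python uses when it writes to the console or IDLE shell.
print("stdout encoding:", sys.stdout.encoding)

def mojibake(text, wrong_encoding="cp1252"):
    """Encode text as UTF-8, then decode the bytes with a different encoding."""
    return text.encode("utf-8").decode(wrong_encoding)

# "\u00fc" is the letter u with diaeresis; escapes keep this file pure ASCII.
garbled = mojibake("\u00fc")
print(ascii(garbled))  # ascii() makes the output independent of the console
```

If the reported stdout encoding is not UTF-8 and the source file is, that would be consistent with the output you are seeing.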

Related

Display Spanish characters with accents properly

The file I am working on has everything written in Spanish. The format is .sav. What I want to do is open it in JMP and export it to Excel in .csv format. I am using a Mac running macOS Sierra.
Here is the problem. I made sure to open the file as UTF-8 in JMP, but some characters are corrupted. I changed the default language of the Mac to Spanish, and it did not help. I also exported the file from JMP to a text editor, where the corruption remained, saved a copy as UTF-8, and imported it into Excel. That did not work either. Changing to UTF-16 was another of my attempts and did not work at all. I used Numbers instead of Excel, but this also failed.
What else can I do to display the characters properly?
FYI, the file is taken from http://evaluacion.oportunidades.gob.mx:8010/EVALUACION/en/eval_cuant/p_bases_cuanti.php
Any suggestion is much appreciated!
Thank you in advance!!

√ not recognized in Terminal

For a class of mine I have to make a very basic calculator. I want to write the code in such a way that the user can just enter what they want to do (e.g. √64), press return, and get the answer. I wrote this:
if '√' in operation:
    squareRoot = operation.replace('√', '')
    squareRootFinal = math.sqrt(int(squareRoot))
When I run this in IDLE, it works like a charm. However, when I run this in Terminal I get the following error:
SyntaxError: Non-ASCII character '\xe2' in file x.py on line 50, but no encoding declared;
Any suggestions?
Just declare the encoding. Python is being a bit cautious here and not guessing the encoding of your text file. IDLE is a text editor and so has already guessed the encoding and stored things internally as Unicode, which it can pass directly to the Python interpreter.
Put
#!/usr/bin/env python3
# -*- coding: UTF-8 -*-
at the top of the file. (It's pretty unlikely nowadays that your encoding is not UTF-8.)
For reference, in past times files had a variety of different possible encodings. That means that the same text could be stored in different ways in binary, when written to disk. Almost all encodings have the same interpretation of bytes 0 to 127—the ASCII subset. But if any other bytes occur in the file, their meaning is potentially ambiguous.
However, in recent years, UTF-8 has become by far the most common encoding, so it's almost always a safe guess.
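
To make the ambiguity concrete, here is a small sketch that decodes the same three bytes in two ways. It assumes cp1252 as the competing encoding, which is only one of many legacy encodings. The leading byte 0xE2 is the one named in the SyntaxError above.

```python
# UTF-8 bytes of the square root sign (U+221A): 0xE2 0x88 0x9A.
data = "\u221a".encode("utf-8")

# Decoded with the encoding it was written in, the text comes back intact.
as_utf8 = data.decode("utf-8")

# Decoded as cp1252, each byte becomes a separate, unrelated character.
as_cp1252 = data.decode("cp1252")

# Bytes 0 to 127 mean the same thing in both encodings.
ascii_part_agrees = b"64".decode("utf-8") == b"64".decode("cp1252")

print(data, ascii(as_utf8), ascii(as_cp1252), ascii_part_agrees)
```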

How to detect/remove 'UTF-8' code in Python (Windows 10 console)

I have seen some answers on this, but they were not really useful.
I do some coding with Python using Blender 2.6, and on occasion I copy text from a web page or from a LibreOffice text document, so some stray control characters may get copied into Blender's Python text editor.
The problem is that I cannot detect where these characters are located, either in the Blender text editor or in an outside text editor such as Notepad2 or Notepad++.
So my question is: is there a simple way to detect and remove these characters? I mean on Windows 10, in Blender using some Python commands, or using an external text editor. I need something quick or a simple trick here, if possible.
In Blender, at least, the error is reported as "utf-8", which is very annoying. When I run a script in Blender with some unknown characters, it gives an error. The problem is how to find where the offending character is and remove it. Blender reports the error but does not always show where it is located in the Python script. I would even settle for a text editor that can locate these characters and help remove them, if that is possible at all.
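
One possible approach, sketched here under the assumption that the script itself should contain only ASCII, is to read the file as raw bytes and report every byte above 0x7F with its line and column. The function names are hypothetical and nothing here is Blender-specific.

```python
def find_non_ascii(raw):
    """Return (line, column, byte) for every byte above 0x7F in raw bytes."""
    hits = []
    for line_no, line in enumerate(raw.splitlines(), start=1):
        for col, byte in enumerate(line, start=1):
            if byte > 0x7F:
                hits.append((line_no, col, byte))
    return hits

def strip_non_ascii(raw):
    """Return a copy of raw with every byte above 0x7F dropped."""
    return bytes(b for b in raw if b <= 0x7F)

# Sample data: a stray 0xA0 byte of the kind that pasted text can carry.
sample = b"x = 1\ny = 2 \xa0# copied text\n"
for line_no, col, byte in find_non_ascii(sample):
    print("line", line_no, "column", col, "byte", hex(byte))
```

For a real script, the bytes would come from something like open(path, "rb").read(), where path is whatever file you want to check. Stripping bytes blindly also removes legitimate accented text inside strings, so it is safer to review the reported positions first.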

How to convert Linux Python 3.4 code with national characters into executable code in Windows

My national language is Polish.
I have a program in Python 3.4 that I wrote on Linux. It mostly works with text, Polish text. Of course, variable names don't contain any special characters, but sometimes I assign strings with Polish characters to them, the user types strings with Polish characters on the keyboard, and my program reads files that contain strings with Polish characters.
Everything works well on Linux. I didn't think about encoding; it just worked. But now I want to make it work on Windows. Can you help me understand what I should actually do to make this transition?
Or maybe there is a workaround, since I just need a Windows executable file. The perfect tool for this would be PyInstaller, but it works only with Python 2.7, not 3.4. That's why I want to get the program working on Windows and then, in VirtualBox, compile it into an executable with py2exe. But if someone knows a way to do this on Linux without these encoding problems, that would be great.
If not, back to my question. I tried converting my Python scripts in gedit to ISO, CP1250, or CP1252, and I declared the encoding I was using in the file header. It actually helped a little: now the Windows error points me to the text files from which I read some data, so I converted them too... But it didn't work.
So I decided there is no more time for blind trials and I need to ask for help. I need to understand which encoding is used on Windows and which on Linux, what the best way to convert one into the other is, and how to make the program read characters the right way.
The best way, I guess, would be not to change anything about the encoding, but just to make Python on Windows understand which encoding I'm using. Is that possible?
A complete answer to my question would be great, but anything that points me in the right direction will also help me a lot.
OK. I'm not sure I understood your answer in the comments, but I tried sending the text to myself by mail, copying it into Notepad in VirtualBox, and saving it as UTF-8. I still get this message:
C:\Users\python\Documents>py pytania.py
Traceback (most recent call last):
  File "pytania.py", line 864, in <module>
    start_probny()
  File "pytania.py", line 850, in start_probny
    utworzenie_danych()
  File "pytania.py", line 740, in utworzenie_danych
    utworzenie_pytania_piwo('a')
  File "pytania.py", line 367, in utworzenie_pytania_piwo
    for line in f: # Czytam po jednej linii
  File "C:\Python34\lib\encodings\cp1252.py", line 23, in decode
    return codecs.charmap_decode(input,self.errors,decoding_table)[0]
UnicodeDecodeError: 'charmap' codec can't decode byte 0x9d in position 1134: character maps to <undefined>
As mentioned by Zero Piraeus in a comment: The default source encoding for Python 3.x is UTF-8, regardless of what platform it's running on...
If you have problems, that's probably because your source code has an incorrect encoding. You should stick to UTF-8 only (even though PEP 0263 -- Defining Python Source Code Encodings allows changing it).
The error message you provided is clear:
UnicodeDecodeError: 'charmap' codec can't decode byte 0x9d in position 1134
The traceback passes through cp1252.py, so at this point Python is decoding the file being read with the Windows code page cp1252 rather than UTF-8, and byte 0x9d has no mapping in cp1252. To check whether a file really is valid UTF-8, use iconv(1) on a Linux machine and do a dummy conversion, which reports any invalid sequences:
iconv -f utf8 -t iso8859-2 -o /dev/null < test.py
You can try to reproduce the problem by creating a very simple Python file, typically: print("test €uro encoding")
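
Since the failing frame in the traceback is the loop over a file object, one thing worth trying is passing the encoding explicitly to open() instead of relying on the platform default. The sketch below assumes the data files are saved as UTF-8. The helper read_lines, the file name dane.txt, and the sample text are made up for illustration.

```python
import os
import tempfile

def read_lines(path, encoding="utf-8"):
    """Read a text file with an explicit encoding, not the platform default."""
    with open(path, encoding=encoding) as f:
        return [line.rstrip("\n") for line in f]

# Sample text in Polish-style quotation marks. The closing mark (U+201D)
# is stored in UTF-8 as E2 80 9D, and 0x9D is undefined in cp1252.
data = "\u201epiwo\u201d\n".encode("utf-8")

try:
    data.decode("cp1252")
    cp1252_failed = False
except UnicodeDecodeError:
    cp1252_failed = True

# Write the bytes to a temporary file and read them back as UTF-8.
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "dane.txt")  # hypothetical file name
    with open(path, "wb") as f:
        f.write(data)
    lines = read_lines(path)

print(cp1252_failed, [ascii(line) for line in lines])
```

With the encoding stated in the call, the same code should behave identically on Linux and Windows, provided the files really are UTF-8.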

wxFormBuilder and Unicode labels

Is there a way to get Unicode characters into label code generated by wxFormBuilder?
For example, to get an Angstrom character the generated string should read u"\u212b".
I tried entering \u212b in the label property field but the resulting string reads u"u212b". So I tried escaping the backslash as \\u212b but that gave me u"\\u212b".
I'm using wxFormBuilder v3.5 - beta. Generating Python code, although the C++ code shows the same behaviour.
By default, wxFormBuilder includes this line (# -*- coding: utf-8 -*-) on the first line, at least for the Python code it generates.
So I went into MS Word and inserted the Angstrom character Å, then copied it into a wxFormBuilder (version 3.5 - RC1) statictext control, and it worked when I ran the code.
Try my approach above instead of typing "u212b". Or type the character directly in your code, like so: u"Hello... Å"
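
One caveat when pasting the character: Unicode has both the Angstrom sign (U+212B) and the ordinary letter A with ring above (U+00C5). They usually look identical, and which one a given application inserts is not something the code below can tell you. This small sketch, using only the standard unicodedata module, shows how to check which one a string contains and that NFC normalization maps one onto the other.

```python
import unicodedata

angstrom = "\u212b"  # the escape sequence from the question
a_ring = "\u00c5"    # the letter that keyboards and many editors produce

# The two code points have different official names.
names = (unicodedata.name(angstrom), unicodedata.name(a_ring))

# They are distinct strings, but NFC normalization turns U+212B into U+00C5.
equal_raw = angstrom == a_ring
equal_after_nfc = unicodedata.normalize("NFC", angstrom) == a_ring

print(names, equal_raw, equal_after_nfc)
```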

Resources