emacs shell interprets ipython characters wrongly - linux

I'm using gnome-terminal, emacs -nw, eshell inside emacs, and ipython.
For some reason the emacs shell is interpreting characters wrongly.
Here's what I see (please note the last 3 lines):
$ ipython
Python 3.5.2 (default, Jun 28 2016, 08:46:01)
Type "copyright", "credits" or "license" for more information.
IPython 5.0.0 -- An enhanced Interactive Python.
? -> Introduction and overview of IPython's features.
%quickref -> Quick reference.
help -> Python's own help system.
object? -> Details about 'object', use 'object??' for extra details.
^[[?12l^[[?25hprint("hi")
^[[J^[[?7h^[[?12l^[[?25h^[[?2004lhi
^[[?12l^[[?25h
I believe this must be some encoding problem, but I'm not sure how to diagnose and fix it.
Here's my env output if it helps:
$ env
XDG_VTNR=2
XDG_SESSION_ID=c3
TERM=xterm-256color
SHELL=/bin/bash
XDG_MENU_PREFIX=gnome-
VTE_VERSION=4402
GJS_DEBUG_OUTPUT=stderr
WINDOWID=29360134
GJS_DEBUG_TOPICS=JS ERROR;JS LOG
USER=adrin
SSH_AUTH_SOCK=/run/user/1000/keyring/ssh
SESSION_MANAGER=local/mydarlingarch:#/tmp/.ICE-unix/498,unix/mydarlingarch:/tmp/.ICE-unix/498
USERNAME=adrin
MOZ_PLUGIN_PATH=/usr/lib/mozilla/plugins
PATH=/usr/local/sbin:/usr/local/bin:/usr/bin:/usr/bin/site_perl:/usr/bin/vendor_perl:/usr/bin/core_perl
MAIL=/var/spool/mail/adrin
DESKTOP_SESSION=gnome
QT_QPA_PLATFORMTHEME=qgnomeplatform
XDG_SESSION_TYPE=x11
PWD=/home/adrin
LANG=en_US.UTF-8
GDM_LANG=en_US.UTF-8
GDMSESSION=gnome
XDG_SEAT=seat0
HOME=/home/adrin
SHLVL=1
GNOME_DESKTOP_SESSION_ID=this-is-deprecated
XDG_SESSION_DESKTOP=gnome
LOGNAME=adrin
DBUS_SESSION_BUS_ADDRESS=unix:path=/run/user/1000/bus
WINDOWPATH=2
XDG_RUNTIME_DIR=/run/user/1000
DISPLAY=:0
XDG_CURRENT_DESKTOP=GNOME
COLORTERM=truecolor
XAUTHORITY=/run/user/1000/gdm/Xauthority
_=/usr/bin/env

Thanks to @brian-malehorn, the problem was indeed the control characters sent by ipython.
To rule out color handling, try echoing colored text with echo -e '\033[0;31mhello\033[1;0m'; in my case this printed colored text correctly. If colored text had been the problem, it could have been fixed with:
ipython --colors=NoColor
Since color was not my problem, the cause had to be the other control sequences that ipython sends to the terminal. These can be disabled with:
ipython --simple-prompt
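To confirm that the stray characters are terminal control sequences rather than an encoding problem, here is a minimal sketch that strips them from captured output. The name strip_csi and the sample strings are illustrative assumptions; the pattern covers only CSI sequences (ESC followed by [), which is what the session above shows.

```python
import re

# CSI sequence: ESC [ , then parameter bytes, intermediate bytes, one final byte.
CSI_PATTERN = re.compile(r"\x1b\[[0-?]*[ -/]*[@-~]")


def strip_csi(text):
    """Return text with CSI control sequences removed."""
    return CSI_PATTERN.sub("", text)


# The garbled lines from the session, with ^[ written as the ESC byte.
samples = [
    '\x1b[?12l\x1b[?25hprint("hi")',
    "\x1b[J\x1b[?7h\x1b[?12l\x1b[?25h\x1b[?2004lhi",
]
for line in samples:
    print(repr(strip_csi(line)))
```

If only readable text remains after stripping, the terminal (here eshell) simply does not interpret those sequences, which is consistent with --simple-prompt fixing the display.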

How to specify python version in a conda virtual env

I am running a working project in my new position.
I believe a virtual environment was created for it, as I see:
$ head bm3.py
#!/usr/bin/env /opt/bm3_venv/bin/python3
bm3_venv is the name of the env created from requirements.txt (using virtualenv?)
$ ls -la /usr/bin/env
-rwxr-xr-x. 1 root root 28992 Jun 30 2016 /usr/bin/env
The project presumably uses Python 3 throughout: besides the first line of bm3.py above, other scripts use print('asdf'), which is Python 3 syntax.
However, I also see Python 2 syntax such as print 'asdf' in the project, i.e.
/data/cloudera/parcels/CDH-5.12.0-1.cdh5.12.0.p0.29/bin/../lib/impala-shell/impala_shell.py is used when executing bm3.py, and impala-shell.py is written in Python 2 syntax.
That means that running bm3.py uses Python 3, yet Python 2 is somehow also used during the same run.
How could this happen?
BTW, where can I download the original copy of impala-shell.py for the parcel of CDH-5.12.0-1.cdh5.12.0.p0.29?
Thank you very much.
UPDATE:
In the existing environment the first line of bm3.py is:
/usr/bin/env /opt/al2_venv/bin/python3
This specifies using python3 in this bm3.py
In the impala-shell.py used in the existing environment the first line is:
/usr/bin/env /usr/bin/env python
This specifies using python2 in this impala-shell.py
Now the question becomes: how does /usr/bin/env work here?
If I run it in the existing environment, I get a list of variable settings like the one below:
XDG_SESSION_ID=224064
SELINUX_ROLE_REQUESTED=
TERM=xterm
SHELL=/bin/bash
HISTSIZE=1000
SSH_CLIENT=192.168.103.81 50182 22
PATH=/usr/lib64/qt-3.3/bin:/home/xxxx/perl5/bin:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/opt/tableau/tabcmd/bin:/home/rxie/.local/bin:/home/rxie/bin
PWD=/home/xxxx
JAVA_HOME=/usr/java/latest
LANG=en_US.UTF-8
KDEDIRS=/usr
SELINUX_LEVEL_REQUESTED=
HISTCONTROL=ignoredups
KRB5CCNAME=FILE:/tmp/krb5cc_1377008653_sw88z6
SHLVL=1
HOME=/home/xxxx
PERL_LOCAL_LIB_ROOT=:/home/xxxx/perl5
LOGNAME=xxxx
QTLIB=/usr/lib64/qt-3.3/lib
SSH_CONNECTION=192.168.103.81 50182 192.168.101.231 22
LESSOPEN=||/usr/bin/lesspipe.sh %s
XDG_RUNTIME_DIR=/run/user/1377008653
QT_PLUGIN_PATH=/usr/lib64/kde4/plugins:/usr/lib/kde4/plugins
PERL_MM_OPT=INSTALL_BASE=/home/rxie/perl5
_=/usr/bin/env
What is this env for and how do I use it? Thanks.
I think you are running Python 2, which you can verify by running python -V in Bash. Python 2 code can still use the Python 3 style print() by putting from __future__ import print_function at the top of the file; from Python 2.6 onward this makes the Python 3 print function available in Python 2.
I think I have the answer now:
I believe this is by design: any script, whatever Python syntax it uses, can name its own interpreter on its first line, which starts with #!.
For example, #!/usr/bin/env /opt/bm3_venv/bin/python3 in bm3.py means that the whole script runs under Python 3. Meanwhile, when impala-shell.py is used while the job runs, its own first line selects its interpreter, which is Python 2.6.6, the Python that ships with Cloudera's CDH.
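To see which interpreter each script asks for, one option is to read its first line directly. The sketch below is only an illustration: shebang_of is a hypothetical helper, and the two throwaway files stand in for bm3.py and impala-shell.py, using the first lines quoted in the question.

```python
import os
import tempfile


def shebang_of(path):
    """Return the interpreter command from a script's #! line, or None."""
    with open(path, "rb") as handle:
        first_line = handle.readline()
    if not first_line.startswith(b"#!"):
        return None
    return first_line[2:].decode("utf-8", errors="replace").strip()


# Hypothetical stand-ins for the two scripts discussed above.
scripts = {
    "bm3.py": "#!/usr/bin/env /opt/bm3_venv/bin/python3\nprint('asdf')\n",
    "impala-shell.py": "#!/usr/bin/env python\nprint 'asdf'\n",
}
with tempfile.TemporaryDirectory() as tmp:
    for name, body in scripts.items():
        path = os.path.join(tmp, name)
        with open(path, "w") as handle:
            handle.write(body)
        print(name, "->", shebang_of(path))
```

Each file reports a different interpreter, which is how one job can mix Python 3 and Python 2 scripts, provided each script is started as its own process.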

What's the default encoding in bash standard input? [duplicate]

I am using Gina Trapani's excellent todo.sh to organize my todo-list.
However, being a Dane, it would be nice if the script accepted special Danish characters like ø and æ.
I am an absolute UNIX-n00b, so it would be a great help if anybody could tell me how to fix this! :)
Slowly, the Unix world is moving from ASCII and other regional encodings to UTF-8. You need to be running a UTF-8 terminal, such as a modern xterm or PuTTY.
In your ~/.bash_profile, set your language to one of the UTF-8 variants.
export LANG=C.UTF-8
or
export LANG=en_AU.UTF-8
etc.
You should then be able to write UTF-8 characters in the terminal, and include them in bash scripts.
#!/bin/bash
echo "UTF-8 is græat ☺"
See also: https://serverfault.com/questions/11015/utf-8-and-shell-scripts
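As a rough illustration of why the terminal and the locale have to agree on UTF-8, the sketch below encodes ø and æ as UTF-8 and then decodes the bytes as Latin-1, which is what a non-UTF-8 terminal would effectively do. The helper name misread is an assumption made for this example.

```python
def misread(text, wrong_encoding="latin-1"):
    """Encode text as UTF-8, then decode it the way a non-UTF-8 terminal would."""
    return text.encode("utf-8").decode(wrong_encoding)


for char in ("ø", "æ"):
    raw = char.encode("utf-8")
    # ascii() keeps the demo printable even when stdout itself is not UTF-8.
    print(ascii(char), "->", raw, "->", ascii(misread(char)))
```

Each Danish letter becomes two bytes in UTF-8, and a terminal that expects a single-byte encoding shows those two bytes as two unrelated characters.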
What does this command show?
locale
It should show something like this for you:
LC_CTYPE="da_DK.UTF-8"
LC_NUMERIC="da_DK.UTF-8"
LC_TIME="da_DK.UTF-8"
LC_COLLATE="da_DK.UTF-8"
LC_MONETARY="da_DK.UTF-8"
LC_MESSAGES="da_DK.UTF-8"
LC_PAPER="da_DK.UTF-8"
LC_NAME="da_DK.UTF-8"
LC_ADDRESS="da_DK.UTF-8"
LC_TELEPHONE="da_DK.UTF-8"
LC_MEASUREMENT="da_DK.UTF-8"
LC_IDENTIFICATION="da_DK.UTF-8"
LC_ALL=
If not, you might try doing this before you run your script:
LANG=da_DK.UTF-8
You don't say what happens when you run the script and it encounters these characters. Are they in the todo file? Are they entered at a prompt? Is there an error message? Is something output in place of the expected output?
Try this and see what you get:
read -p "Enter some characters" string
echo "$string"

Using sys.argv in Python 3 with Python interpreter

I’m trying to figure out how to use sys.argv in Python 3.6, but can’t figure out how to make it work using the Python interpreter (I’m not even 100% sure I’m actually using the interpreter, a bit confused around the terminology with interpreter, shell, terminal etc.)
Question 1: to access the Python interpreter, can I simply type $ python into the Terminal (I’m on a Mac)? If not, how do I access it?
Seems that when I go to find what I believe to be the interpreter in my files (I’ve downloaded Python via Anaconda), I find a program called “pythonw”, and starting this launches the Terminal, with what looks to be the Python interpreter already running. Is this the interpreter? The code chunk below is what is printed in a Terminal window when I run the "pythonw" program:
Last login: Tue Aug 7 18:26:37 on ttys001
Users-MacBook-Air:~ Username$ /anaconda3/bin/pythonw ; exit;
Python 3.6.5 |Anaconda, Inc.| (default, Mar 29 2018, 13:14:23)
[GCC 4.2.1 Compatible Clang 4.0.1 (tags/RELEASE_401/final)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>>
Question 2: Let’s assume that I have the Python interpreter running for the sake of argument. Assume also that I have the following script/module saved as test.py.
import sys
print('Number of arguments:', len(sys.argv), 'arguments.')
print('Argument List:', str(sys.argv))
If I simply import this module in the command line of the interpreter, I get the printout:
Number of arguments: 1 arguments.
Argument List: ['']
But how do I actually supply the module with arguments in the command line?
I've been looking around on the internet, and every source shows this way of doing it, but it does not work.
Question 3: can the sys.argv only be used when arguments are written in the command line of the interpreter, or is there a way to supply a module with arguments in Spyder for instance?
Thank you for taking the time to read through it all, would make me so happy if I could get an answer to this! Been struggling for days now without being able to grasp it.
The Python interpreter is just a piece of code which translates and runs Python code. You can interact with it in different ways. The most straightforward is probably to put some Python code in a file, and pass that as the first argument to python:
bash$ cat <<\: >./myscript.py
from sys import argv
print(len(argv), argv[1:])
:
bash$ # in real life you would use an editor instead to create this file
bash$ python ./myscript.py one two three
4 ['one', 'two', 'three']
If you don't want to put the script in a file, perhaps because you just need to check something quickly, you can also pass Python a -c command-line option where the option argument is a string containing your Python code, and any non-option arguments are exposed to that code in sys.argv as before:
bash$ python -c 'from sys import argv; print(len(argv), argv[1:])' more like this
4 ['more', 'like', 'this']
(Single quotes probably make the most sense with Bash. Some other shells use other conventions to wrap a piece of longer text as a single string; in particular, Windows works differently.)
In both of these cases, the Python interpreter was started with a program to execute; it interpreted and executed that Python program, and then it quit. If you want to talk to Python more directly in an interactive Read-Eval-Print-Loop (which is commonly abbreviated REPL) that's what happens when you type just python:
bash$ python
Python 3.5.1 (default, Dec 26 2015, 18:08:53)
[GCC 4.2.1 Compatible Apple LLVM 7.0.2 (clang-700.1.81)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> 1+2
3
>>>
As you can see, anything you type at the >>> prompt gets read, evaluated, and printed, and Python loops back to the >>> to show that it's ready to do it again. (If you type something incomplete, the prompt changes to .... It will sometimes take a bit of puzzling to figure out what's missing - it could be indentation or a closing parenthesis to go with an opening parenthesis you typed on a previous line, for example.)
There is nothing per se which prevents you from assigning a value to sys.argv yourself:
>>> import sys
>>> sys.argv = ['ick', 'poo', 'ew']
At this point, you can import the script file you created above, and it will display the arguments after the first:
>>> import myscript
3 ['poo', 'ew']
You'll notice that the code ignores the first element of sys.argv which usually contains the name of the script itself (or -c if you used python -c '...').
... but the common way to talk to a module you import is to find its main function and call it with explicit parameters. So if you have a script otherscript.py and inspect its contents, it probably contains something like the following somewhere near the end:
def main():
    import sys
    return internal_something(*sys.argv[1:])

if __name__ == '__main__':
    main()
and so you would probably instead simply
>>> import otherscript
>>> otherscript.internal_something('ick', 'poo')
Your very first script doesn't need to have this structure, but it's a common enough arrangement that you should get used to seeing it; and in fact, one of the reasons we do this is so that we can import code without having it start running immediately. The if __name__ == '__main__' condition specifically evaluates to False when you import the file which contains this code, so you can control how it behaves under import vs when you python myscript.py directly.
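A common variation on this arrangement lets main accept the argument list as a parameter, so the same function works from the command line and from the REPL. This is a minimal sketch; internal_something here is a hypothetical stand-in that just joins its arguments.

```python
import sys


def internal_something(*words):
    """Hypothetical worker: join the words it is given."""
    return " ".join(words)


def main(argv=None):
    # Fall back to the real command line only when no list is passed in.
    if argv is None:
        argv = sys.argv[1:]
    result = internal_something(*argv)
    print(result)
    return result


if __name__ == "__main__":
    main()
```

After importing the module in the REPL, you can call main(['ick', 'poo']) directly without touching sys.argv at all.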
Returning to your question, let's still examine how to do this from a typical IDE.
An IDE usually shields you from these things, and simply allows you to edit a file and show what happens when the IDE runs the Python interpreter on the code in the file. Of course, behind the scenes, the IDE does something quite similar to python filename.py when you hit the Execute button (or however it presents this; a function key or menu item perhaps).
A way to simulate what we did above is to edit two files in the IDE. Given myscript.py from above, the second file could be called something like iderun.py and contain the same code we submitted to the REPL above.
import sys
sys.argv = ['easter egg!', 'ick', 'poo', 'ew']
import myscript
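An alternative to the import trick is the standard library's runpy module, which runs a file as if it had been started with python filename.py. The sketch below is one possible wrapper, not the only way to do this: run_with_args is a hypothetical helper, and the demo script is created on the fly so that the example is self-contained.

```python
import os
import runpy
import sys
import tempfile


def run_with_args(script_path, args):
    """Run script_path as __main__ with sys.argv set, then restore sys.argv."""
    saved_argv = sys.argv
    sys.argv = [script_path] + list(args)
    try:
        # run_path returns the globals of the executed script.
        return runpy.run_path(script_path, run_name="__main__")
    finally:
        sys.argv = saved_argv


# Throwaway script equivalent to myscript.py from above.
with tempfile.TemporaryDirectory() as tmp:
    script = os.path.join(tmp, "myscript.py")
    with open(script, "w") as handle:
        handle.write("from sys import argv\nprint(len(argv), argv[1:])\n")
    run_with_args(script, ["ick", "poo", "ew"])
```

Unlike import, this runs the script every time it is called and also triggers any if __name__ == '__main__' block.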

Console2 not using PS1 colours

I am using Cygwin in Console2 with the following PS1
export PS1='\[\e]2;\w\a\e[1;32m\e[40m\n\w\n\d - \# > \[\e[0;00m\]'
The prompt has the correct text content, but all the colors are ignored.
~/wd
Tue Mar 18 - 01:14 PM >
Screenshot showing Console2:
When I use mintty, the colours are perfect.
TERM is set the same in both Console2 and mintty:
Tue Mar 18 - 06:29 PM > env | grep TERM
TERM=cygwin
TERMCAP=SC|screen|VT 100/ANSI X3.64 virtual terminal:\
You have not shown your screenshots, so I'm not sure what you mean.
But I believe it is a Cygwin feature (bug). It assumes that ANSI is not available in a Windows terminal (which is true for Console2, but of course not if you are using ANSICON or ConEmu). That means Cygwin processes all ANSI sequences internally (it does not send them to the terminal). So if any problem occurs, it is a problem in Cygwin's implementation.

How to get terminal's Character Encoding

I changed my gnome-terminal's character encoding to "GBK" (the default is UTF-8). How can I read the current character encoding back on Linux?
The terminal uses environment variables to determine which character set to use, therefore you can determine it by looking at those variables:
echo $LC_CTYPE
or
echo $LANG
The locale command with no arguments will print the values of all of the relevant environment variables except for LANGUAGE.
For current encoding:
locale charmap
For available locales:
locale -a
For available encodings:
locale -m
Check encoding and language:
$ echo $LC_CTYPE
ISO-8859-1
$ echo $LANG
pt_BR
Get all languages:
$ locale -a
Change to pt_PT.utf8:
$ export LC_ALL=pt_PT.utf8
$ export LANG="$LC_ALL"
If you have Python:
python -c "import sys; print(sys.stdout.encoding)"
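A slightly fuller version of the one-liner, as a sketch: it gathers the encodings Python derives from the locale together with the environment variables that drive them. The function name encoding_report is an assumption for this example. Note that it reports what the process is told by its environment, which is not necessarily what the terminal emulator actually uses for display.

```python
import locale
import os
import sys


def encoding_report():
    """Collect the encodings Python derives from the current locale settings."""
    return {
        "stdout": sys.stdout.encoding,
        "preferred": locale.getpreferredencoding(False),
        "LC_ALL": os.environ.get("LC_ALL"),
        "LC_CTYPE": os.environ.get("LC_CTYPE"),
        "LANG": os.environ.get("LANG"),
    }


for key, value in encoding_report().items():
    print(key, "=", value)
```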
To my knowledge, no.
Circumstantial indications from $LC_CTYPE, locale and such might seem alluring, but these are completely separated from the encoding the terminal application (actually an emulator) happens to be using when displaying characters on the screen.
The only way to detect the encoding for sure is to output something only present in that encoding, e.g. ä, take a screenshot, analyze that image and check whether the output character is correct.
So no, it's not possible, sadly.
To see the current locale information, use the locale command. Below is an example on RHEL 7.8:
[usr@host ~]$ locale
LANG=en_US.UTF-8
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"
LC_ALL=
Examination of https://invisible-island.net/xterm/ctlseqs/ctlseqs.html, the xterm control character documentation, shows that it follows the ISO 2022 standard for character set switching. In particular ESC % G selects UTF-8.
So to force the terminal to use UTF-8, this command would need to be sent. I find no way of querying which character set is currently in use, but there are ways of discovering if the terminal supports national replacement character sets.
However, from charsets(7), it doesn't look like GBK (or GB2312) is an encoding supported by ISO 2022 and xterm doesn't support it natively. So your best bet might be to use iconv to convert to UTF-8.
Further reading shows that a (significant) subset of GBK is EUC, which is an ISO 2022 code, so ISO 2022 capable terminals may be able to display GBK natively after all. However, I can't find any mention of activating this programmatically, so the terminal's user interface would be the only recourse.
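For the conversion route, here is a minimal Python sketch of what iconv -f GBK -t UTF-8 does to a byte stream. The function name gbk_to_utf8 and the sample text are assumptions for illustration; it relies on Python's built-in gbk codec.

```python
def gbk_to_utf8(data):
    """Re-encode GBK bytes as UTF-8 bytes, like `iconv -f GBK -t UTF-8`."""
    return data.decode("gbk").encode("utf-8")


# Two Chinese characters, written as escapes so the source stays ASCII.
sample = "\u4e2d\u6587".encode("gbk")
converted = gbk_to_utf8(sample)
print(len(sample), "GBK bytes ->", len(converted), "UTF-8 bytes")
```

Output converted this way can be sent to a terminal left in its default UTF-8 mode, instead of trying to detect or switch the terminal's own encoding.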
