Accessing filenames with accents / Unicode in the terminal

I've got a number of files like:
Camera.txt
Cámera.txt
C我mera.txt
Given that I don't know how to type accented or Chinese characters, how do I access them in the terminal?
Normally, I'd use tab completion when using the terminal in Linux.
nano Cam<TAB>
Will auto complete the filename (if it exists):
nano Camera.txt
But there seems to be no way to do that if I'm unable to type the non-ASCII character.
Output of locale is :
LANG=en_GB.UTF-8
LANGUAGE=en_GB:en
LC_CTYPE="en_GB.UTF-8"
LC_NUMERIC=en_GB.UTF-8
LC_TIME=en_GB.UTF-8
LC_COLLATE="en_GB.UTF-8"
LC_MONETARY=en_GB.UTF-8
LC_MESSAGES="en_GB.UTF-8"
LC_PAPER=en_GB.UTF-8
LC_NAME=en_GB.UTF-8
LC_ADDRESS=en_GB.UTF-8
LC_TELEPHONE=en_GB.UTF-8
LC_MEASUREMENT=en_GB.UTF-8
LC_IDENTIFICATION=en_GB.UTF-8
LC_ALL=

If your locale is set correctly (which effectively means that it is a UTF-8 one), there are a few ways to access your files, some more tedious than others:
If you are using bash or another shell that uses GNU readline, follow this advice to enable Tab completion iteration, so that pressing Tab several times cycles through every file matching the given prefix.
Use the shell wildcards * and ?. If Cámara.txt is not a misprint (note the second a, not e), it can be selected as C?mara.txt, while the other two don't match this pattern.
Finally, in most terminals, mouse selection, copy and paste are your friends.
That's all for now; give us more details (at least the output of locale and which shell you are using) to get better advice.
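As a rough illustration of the wildcard approach, the sketch below recreates the three filenames from the question in a scratch directory and selects them with a glob. The scratch directory and the variable names are assumptions made for the example. It uses * rather than ? because ? only matches a whole accented character when the locale is a UTF-8 one.

```shell
# Work in a scratch directory so no real files are touched.
dir=$(mktemp -d) || exit 1
cd "$dir" || exit 1

# Recreate the three filenames from the question.
touch 'Camera.txt' 'Cámera.txt' 'C我mera.txt'

# '*' matches any run of characters, so no accented letter has to be typed.
set -- C*mera.txt
count=$#

# List what the pattern selected, one name per line.
printf '%s\n' "$@"

# Clean up the scratch directory.
cd "$OLDPWD" && rm -rf "$dir"
```

In bash, Tab iteration of the kind mentioned in the first suggestion is usually obtained with readline's menu-complete function, for example by running bind 'TAB:menu-complete' in an interactive session. Whether that is what the linked advice recommends is an assumption.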

Related

How to set TERM environment variable for linux shell

I've got a very odd problem when I set export TERM=xterm-256color in ~/.bash_profile. When I try to run nano or emacs I get the following errors.
nano:
.rror opening terminal: xterm-256color
emacs:
is not defined.type xterm-256color
If that is not the actual type of terminal you have,
use the Bourne shell command `TERM=... export TERM' (C-shell:
`setenv TERM ...') to specify the correct type. It may be necessary
to do `unset TERMINFO' (C-shell: `unsetenv TERMINFO') as well.
If I manually enter the following into the shell it works
export TERM=xterm-256color
I'm stumped.

Looks like you have DOS line endings (CR LF) in your .bash_profile. Don't edit files on Windows, and/or use a proper tool to copy them to your Linux system.
Better yet, get rid of Windows.
In more detail, you probably can't see it, but the erroneous line actually reads
export TERM=xterm-256color^M
where ^M is a literal DOS carriage return.
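The stray carriage return is easy to check for and strip. The sketch below does so on a scratch file rather than on the real ~/.bash_profile; the temporary file and the variable names are assumptions made for the example.

```shell
# Build a scratch file with a DOS line ending, standing in for ~/.bash_profile.
profile=$(mktemp) || exit 1
printf 'export TERM=xterm-256color\r\n' > "$profile"

# A literal carriage return to search for.
cr=$(printf '\r')

# Count the lines that contain a carriage return.
before=$(grep -c "$cr" "$profile")

# Strip every carriage return into a cleaned copy and count again.
tr -d '\r' < "$profile" > "$profile.fixed"
after=$(grep -c "$cr" "$profile.fixed")

echo "lines with CR before: $before, after: $after"
rm -f "$profile" "$profile.fixed"
```

On a real system, cat -v ~/.bash_profile shows each carriage return as ^M at the end of its line.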
As @EtanReisner mentions in a comment, you should not be hard-coding this value in your login files anyway. Linux tries very hard to set it to a sane value depending on things like which terminal you are actually using and how you are connected. At most, you might want to override a particular value which the login process often chooses but which is not to your liking. Let's say you want to change to xterm-256color if and only if the value is xterm:
case $TERM in xterm) TERM=xterm-256color;; esac
This is not a programming question and yet an extremely common question on StackOverflow. Please google before asking.

How do I make commands like cat and less keep tab characters?

Is there a way to make cat, less etc. print tab characters instead of tabs being converted to spaces? I am annoyed by this when I copy code from the terminal to an editor.

I see two possible problems here.
First, the destination editor may convert each TAB to a number of spaces. Some editors do this by default. If you disable that feature, a TAB character copied from the terminal will be pasted as a TAB (instead of spaces) into the editor. Notepad++ on Windows has such a setting. If you are using vim, this page on vim tab and space conversion will be helpful.
Second, the source (in your case the terminal) may be representing tabs as spaces, so check that first. You can use cat -t filename to see whether the source file contains any TAB characters; that command displays each TAB as ^I.
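A quick way to try this out is sketched below on a small scratch file. The file contents and the variable names are assumptions made for the example, and the cat -t behaviour shown is that of the common GNU and BSD versions.

```shell
# Create a scratch file whose second line starts with a real TAB.
sample=$(mktemp) || exit 1
printf 'if x:\n\treturn 1\n' > "$sample"

# A literal TAB character to search for.
tab=$(printf '\t')

# Count the lines that contain at least one TAB.
tab_lines=$(grep -c "$tab" "$sample")
echo "lines containing a TAB: $tab_lines"

# Show the file with each TAB made visible as ^I.
cat -t "$sample"

rm -f "$sample"
```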

It seems this is not possible with less (see answer to the same question on unix.stackexchange).
As a workaround, it works with cat or, for some minimal paging capabilities, with the more command.

Dynamic abbrev expand for the shell

Is there a function on one of the linux shells like the emacs dabbrev-expand?

First to give a definition:
M-x describe-function Enter dabbrev-expand Enter
...
Expands to the most recent, preceding word for which this is a prefix.
Given that bash seems to be most heavily influenced by Emacs, looking there first reveals a few possibilities:
man bash(1), readline section
dynamic-complete-history (M-TAB)
Attempt completion on the text before point, comparing the text
against lines from the history list for possible completion matches.
dabbrev-expand
Attempt menu completion on the text before point, comparing the text
against lines from the history list for possible completion matches.
By default (on my system at least), M-/ is already bound to complete-filename:
$ bind -p | grep /
"\e/": complete-filename
You could re-bind it by putting
"\e/": dabbrev-expand
in your ~/.inputrc or /etc/inputrc.
Note that it only seems to complete the first word (the command), and only from history, not from the current command line as far as I can tell.
In zsh, I can't see anything in the man page that does this, but it should be possible to make it happen by figuring out the appropriate compctl command (Google mirror).

Using vim+LaTeX with Scandinavian characters

I want to create a lab write-up with LaTeX in Ubuntu; however, my text includes Scandinavian characters, and at present I have to type them in using \"a, \"o etc. Is it possible to get the LaTeX compiler to read these special characters when they are typed in as is? Additionally, I would like vim to "read" Finnish: at the moment, when I open a .tex document containing Scandinavian characters, they are not displayed at all in vim. How can I correct this?

For LaTeX, use the inputenc package:
\usepackage[utf8]{inputenc}
Instead of utf8, you may use whichever encoding matches your file, such as latin1.
Now the trick is to make your terminal run the same character encoding. It seems that it currently runs a character/input encoding that doesn't fit your input.
For this, refer to the "Locale" settings of your distribution. You can always check the locale settings in the terminal by issuing locale. These days, UTF-8 locales are preferred as they work with every character imaginable. If your terminal's environment is set up correctly, vim should happily work with all your special characters without complaining.

To find out which encoding Vim is using internally (the encoding it detected for the file itself is shown by :set fenc?), try:
:set enc
To set the encoding to UTF-8, try:
:set enc=utf8

I can't help with vim, but for LaTeX I recommend you check out XeTeX, which is an extension of TeX that is designed to support Unicode input. XeTeX is now part of TeX Live, so if you have TeX installed, chances are you already have it.

I use the UCS unicode support: http://iamleeg.blogspot.com/2007/10/nice-looking-latex-unicode.html

Why do my keystrokes turn into crazy characters after I dump a bunch of binary data into my terminal?

If I do something like:
$ cat /bin/ls
into my terminal, I understand why I see a bunch of binary data, representing the ls executable. But afterwards, when I get my prompt back, my own keystrokes look crazy. I type "a" and I get a weird diagonal line. I type "b" and I get a degree symbol.
Why does this happen?

Because somewhere in your binary data were some control sequences that your terminal interpreted as requests to, for example, change the character set used to draw. You can restore everything to normal like so:
reset

Just copy and paste
echo -e '\017'
into bash and the characters will return to normal. If you don't run bash, try the following keystrokes:
<Ctrl-V><Ctrl-O><Enter>
and hopefully your terminal's state will return to normal when the shell complains that it can't find a <Ctrl-V><Ctrl-O> or <Ctrl-O> command to run.
<Ctrl-N>, or character 14, when sent to your terminal, tells it to switch to a special graphics mode in which letters and numbers are replaced with symbols. <Ctrl-O>, or character 15, restores things to normal.
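The two control characters can also be produced with printf, which is generally more portable than echo -e. The sketch below builds them, prints their byte values and then sends Ctrl-O to the terminal; the variable names are assumptions made for the example.

```shell
# Ctrl-N (shift out) and Ctrl-O (shift in), written as octal escapes.
so=$(printf '\016')
si=$(printf '\017')

# Show the byte value of each character as a decimal number.
so_code=$(printf '%s' "$so" | od -An -tu1 | tr -d ' \n')
si_code=$(printf '%s' "$si" | od -An -tu1 | tr -d ' \n')
echo "Ctrl-N is byte $so_code, Ctrl-O is byte $si_code"

# Send Ctrl-O to the terminal to switch back to the normal character set.
printf '%s' "$si"
```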

The terminal will try to interpret the binary data thrown at it as control codes, and garble itself up in the process, so you need to sanitize your tty.
Run:
stty sane
And things should be back to normal. Even if the command looks garbled as you type it, the actual characters are being stored correctly, and when you press return the command will be invoked.
You can find more information about the stty command here.

You're getting some control characters sent to the terminal that are telling it to alter its behavior and print things differently.

VT100 is pretty much the standard command set used for terminal windows, but there are a lot of extensions. Some commands control the character set used, the keyboard mapping, etc.
When you send a lot of binary data to such a terminal, a lot of settings change. Some terminals have options to 'clear' the settings back to their defaults, but in general they simply weren't made for binary data.
VT100 and its successors are what allow Linux to print colored text (such as colored ls listings) in a simple terminal program.
-Adam

If you really must dump binary data to your terminal, you'd have much better luck if you pipe it to a pager like less, which will display it in a slightly more readable format. (You may also be interested in strings and od, both can be useful if you're fiddling around with binary files.)
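As a small sketch of the od route, the example below dumps a few bytes that include a Ctrl-N control character. The sample bytes and the variable name are assumptions made for the example. od -c prints printable characters as themselves and other bytes as escapes or octal numbers, so nothing reaches the terminal as a raw control code.

```shell
# Dump four bytes: 'a', Ctrl-N (octal 016), 'b' and a newline.
dump=$(printf 'a\016b\n' | od -c)

# The control character appears as its octal value instead of being
# interpreted by the terminal.
printf '%s\n' "$dump"
```

For a real binary, od -c /bin/ls | less combines this with the pager suggestion above.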
