Parsing Linux color control sequences

I'm trying to render the output of a linux shell command in HTML. For example, systemctl status mysql looks like this in my terminal:
From what I understand from Floz'z Misc, I expected the underlying character stream to contain control codes. But looking at it in, say, hexyl (systemctl status mysql | hexyl), I can't see any:
Looking near the bottom, on lines 080 and 090 where the text "Active: failed" is displayed, I was hoping to find some control sequences that change the color to red. Although the output is not necessarily ASCII, I used some ASCII tables to help me.
Looking at the second group of 8 bytes on line 090, where the letters "ive: fa" are displayed, I find:
69 = i
76 = v
65 = e
3a = :
20 = space
66 = f
61 = a
69 = i
There are no bytes for control sequences.
I wondered if hexyl is choosing not to display them so I wrote a Java program which outputs the raw bytes after executing the process as a bash script and the results are the same - no control sequences.
The Java is roughly:
Process p = Runtime.getRuntime().exec(new String[]{"/bin/sh", "-c", "systemctl status mysql"}); // runs in the shell
byte[] bytes = p.getInputStream().readAllBytes(); // read before waitFor() so a full pipe buffer cannot block the child
p.waitFor();
for (byte b : bytes) {
    System.out.println(b + "\t" + ((char) b));
}
That outputs:
...
32
32
32
32
32
65 A
99 c
116 t
105 i
118 v
101 e
58 :
32
102 f
97 a
105 i
108 l
101 e
100 d
...
So the question is: How does bash know that it has to display the word "failed" red?

systemctl detects that its output is not a terminal and leaves the color codes out of the output.
Related: Detect if stdin is a terminal or pipe? , https://unix.stackexchange.com/questions/249723/how-to-trick-a-command-into-thinking-its-output-is-going-to-a-terminal , https://superuser.com/questions/1042175/how-do-i-get-systemctl-to-print-in-color-when-being-interacted-with-from-a-non-t
Some tools (not all) come with an option to always emit color codes, like ls --color=always or grep --color=always, or, in the case of systemd, the SYSTEMD_COLORS environment variable.
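The detection step can be sketched in a few lines of shell. This only illustrates the general technique (testing whether standard output is a terminal with [ -t 1 ]); it is not systemctl's actual code, and both print_status and the FORCE_COLOR variable are hypothetical names made up for this example.

```shell
# Hypothetical helper: color the word only when stdout is a terminal,
# or when the made-up FORCE_COLOR override is set to 1.
print_status() {
    if [ -t 1 ] || [ "${FORCE_COLOR:-0}" = "1" ]; then
        # ESC [ 0;1;31 m = reset, bold, red; ESC [ 0 m = reset
        printf '\033[0;1;31m%s\033[0m\n' "$1"
    else
        printf '%s\n' "$1"
    fi
}

print_status failed          # colored when run in a terminal
print_status failed | cat    # plain text: stdout is a pipe here
```

Piping the second call through cat plays the same role as piping systemctl into hexyl: the program sees a pipe instead of a terminal and drops the escape sequences.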
What tool can I use to see them?
You can use hexyl to see them.
how does bash know that it has to mark the word "failed" red?
Bash is the shell, it is completely unrelated.
Your terminal, the graphical window that you are viewing the output with, knows to mark it red because of ANSI escape sequences in the output. There is no interaction with Bash.
$ SYSTEMD_COLORS=1 systemctl status dbus.service | grep runn | hexdump -C
00000000 20 20 20 20 20 41 63 74 69 76 65 3a 20 1b 5b 30 | Active: .[0|
00000010 3b 31 3b 33 32 6d 61 63 74 69 76 65 20 28 72 75 |;1;32mactive (ru|
00000020 6e 6e 69 6e 67 29 1b 5b 30 6d 20 73 69 6e 63 65 |nning).[0m since|
00000030 20 53 61 74 20 32 30 32 32 2d 30 31 2d 30 38 20 | Sat 2022-01-08 |
00000040 31 39 3a 35 37 3a 32 35 20 43 45 54 3b 20 35 20 |19:57:25 CET; 5 |
00000050 64 61 79 73 20 61 67 6f 0a |days ago.|
00000059
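Since the goal in the question is HTML, here is a rough sketch of how the sequences in the dump above could be turned into markup. It handles only ESC[0;1;32m (seen in the dump), ESC[0;1;31m (an assumption, by analogy, for the red "failed" case) and ESC[0m; ansi_to_html is a hypothetical name, and a real converter would need to cover many more sequences.

```shell
# Hypothetical filter: escape HTML metacharacters, then map a few SGR
# sequences to <span> tags. Any other escape sequence is left untouched.
ansi_to_html() {
    esc=$(printf '\033')
    sed -e 's/&/\&amp;/g' -e 's/</\&lt;/g' -e 's/>/\&gt;/g' \
        -e "s/${esc}\[0;1;32m/<span style=\"color:green;font-weight:bold\">/g" \
        -e "s/${esc}\[0;1;31m/<span style=\"color:red;font-weight:bold\">/g" \
        -e "s/${esc}\[0m/<\/span>/g"
}

printf '     Active: \033[0;1;32mactive (running)\033[0m since Sat\n' | ansi_to_html
```

The command being rendered still has to be run with colors forced on (for example with SYSTEMD_COLORS=1 as shown above), otherwise there is nothing to convert.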

Related

Unable to append group to user groups with usermod

I created a group named support to manage access to some scripts that should be run with sudo. I created the group and verified that it exists in /etc/group:
# groupadd support
# cat /etc/group | grep support
support:x:1002:
Then I tried to add user1 to the group:
# usermod –a –G support user1; echo $?
Usage: usermod [options] LOGIN
...
<usermod help page>
...
2
The command returned exit code 2 and printed only the usage text, with no error message. I thought the problem might be with the group support, so I tried adding user1 to the group sudo (just for testing) and the problem persisted. Am I doing something wrong or missing something? I can't figure out where the problem is. Thank you
OS: Kubuntu 20.04 LTS (5.4.0-58-generic)
BASH: GNU bash, version 5.0.17(1)-release (x86_64-pc-linux-gnu)
Your '-' character is incorrect. Maybe you copied and pasted it from somewhere, or you're using a non-standard keyboard layout.
Take a look at your command hex-dump:
echo 'usermod –a –G support user1' | hd
00000000 75 73 65 72 6d 6f 64 20 e2 80 93 61 20 e2 80 93 |usermod ...a ...|
00000010 47 20 73 75 70 70 6f 72 74 20 75 73 65 72 31 0a |G support user1.|
00000020
But the correct one is:
echo 'usermod -a -G support user1' | hd
00000000 75 73 65 72 6d 6f 64 20 2d 61 20 2d 47 20 73 75 |usermod -a -G su|
00000010 70 70 6f 72 74 20 75 73 65 72 31 0a |pport user1.|
0000001c
Notice the - character (2d) in the second hex dump and compare it with the e2 80 93 sequence in yours, which is the UTF-8 encoding of an en dash (U+2013).
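A quick way to catch this kind of problem before running a pasted command is to count the bytes that fall outside plain ASCII. The sketch below assumes a POSIX shell with tr and wc; count_non_ascii is a hypothetical helper name.

```shell
# Hypothetical helper: count bytes outside the 7-bit ASCII range in a string.
count_non_ascii() {
    n=$(printf '%s' "$1" | LC_ALL=C tr -d '\000-\177' | wc -c)
    echo $((n))
}

bad=$(printf 'usermod \342\200\223a \342\200\223G support user1')  # en dashes (e2 80 93)
good='usermod -a -G support user1'                                 # ASCII hyphens (2d)
count_non_ascii "$bad"    # 6: two en dashes, three bytes each
count_non_ascii "$good"   # 0
```

Any result other than 0 means the command line contains something that is not a plain ASCII character and deserves a closer look in a hex dump.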

Groovy gives error expecting EOF, found '?' @ line 9, column 25

I'm using the following code to generate a random number in Groovy. It works when I run it in, e.g., the Groovy Web Console (https://groovyconsole.appspot.com/), but it fails when I try to run it in Mule. Here is the code I use:
log.info ">>run"
Random random = new Random()
def ranInt = random.nextInt()
def ran = Math.abs(​ranInt)​%20​0;
log.info ">>sleep counter:"+flowVars.counter+" ran: "+ran
sleep(ran)
And here is an exception that gets thrown:
Caused by:
org.codehaus.groovy.control.MultipleCompilationErrorsException:
startup failed: Script26.groovy: 9: expecting EOF, found '?' @ line 9,
column 25. def ran = Math.abs(?400)?%20?0;
^
1 error
You have some extra Unicode characters in the fourth line of your code (the def ran line). If you convert that line to hex, you get:
64 65 66 20 72 61 6e 20 3d 20 4d 61 74 68 2e 61 62 73 28 e2 80 8b 72 61 6e 49 6e 74 29 e2 80 8b 25 32 30 e2 80 8b 30 3b
If you convert this hex back to text, you get:
def ran = Math.abs(​ranInt)​%20​0;
A zero-width space (U+200B, encoded in UTF-8 as e2 80 8b) has been added after the first (, after ), and after the first 0. If you remove these characters, your code will compile correctly.
Here is the hex of the cleaned line:
64 65 66 20 72 61 6e 20 3d 20 4d 61 74 68 2e 61 62 73 28 72 61 6e 49 6e 74 29 25 32 30 30 3b
And the line itself:
def ran = Math.abs(ranInt)%200;
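If the script lives in a file, the invisible characters can also be stripped mechanically instead of by retyping the line. This is a minimal sketch assuming a POSIX shell and sed; strip_zwsp is a hypothetical name, and it removes only the e2 80 8b byte sequence shown in the hex above.

```shell
# Hypothetical filter: delete every UTF-8 encoded zero-width space (e2 80 8b).
strip_zwsp() {
    zwsp=$(printf '\342\200\213')   # octal escapes for the bytes e2 80 8b
    LC_ALL=C sed "s/${zwsp}//g"
}

# The broken line from the question, with the three zero-width spaces written
# as octal escapes (%% is a literal percent sign in a printf format).
printf 'def ran = Math.abs(\342\200\213ranInt)\342\200\213%%20\342\200\2130;\n' | strip_zwsp
```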

Unable to replace Unicode characters with sed or vim

I have a file containing what I believe are Unicode characters, and I would like to remove them with sed or some other Unix utility. I have tried a few options and for some reason I am unable to remove those characters. The test cases below use a single line (head -n1).
Attempt 1:
> head -n1 file1.txt | hexdump -C # Hexdump line 1
output:
00000000 47 72 6f 75 70 c2 a0 20 20 20 53 69 67 6e 61 6c |Group.. Signal|
00000010 c2 a0 6e 61 6d 65 c2 a0 20 20 20 20 20 20 20 20 |..name.. |
00000020 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 | |
00000030 55 6e 69 74 c2 a0 20 74 79 70 65 c2 a0 44 65 73 |Unit.. type..Des|
00000040 63 72 69 70 74 69 6f 6e c2 a0 0d 0a |cription....|
0000004c
Now replace "c2 a0" above
> head -n1 file1.txt | sed 's/\xc2\xa0//g' | hexdump -C
or
> head -n1 file1.txt | sed 's/\x{c2a0}//g' | hexdump -C
00000000 47 72 6f 75 70 c2 a0 20 20 20 53 69 67 6e 61 6c |Group.. Signal|
00000010 c2 a0 6e 61 6d 65 c2 a0 20 20 20 20 20 20 20 20 |..name.. |
00000020 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 | |
00000030 55 6e 69 74 c2 a0 20 74 79 70 65 c2 a0 44 65 73 |Unit.. type..Des|
00000040 63 72 69 70 74 69 6f 6e c2 a0 0d 0a |cription....|
No replacements happened.
Attempt 2: Using vim
vim file1.txt
:set nobomb
:set fileencoding=utf-8
:wq
Used sed again and no replacements happened. How do I replace or remove those characters (hex "c2a0")?
I finally ended up using Perl which successfully removed the unicode chars.
> perl -v
This is perl 5, version 18, subversion 2 (v5.18.2) built for darwin-thread-multi-2level
> perl -pi -e 's/\x{c2}\x{a0}//g' file1.txt
> head -n1 file1.txt | hexdump -C
00000000 47 72 6f 75 70 20 20 20 53 69 67 6e 61 6c 6e 61 |Group Signalna|
00000010 6d 65 20 20 20 20 20 20 20 20 20 20 20 20 20 20 |me |
00000020 20 20 20 20 20 20 20 20 20 20 55 6e 69 74 20 74 | Unit t|
00000030 79 70 65 44 65 73 63 72 69 70 74 69 6f 6e 0d 0a |ypeDescription..|
00000040
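A likely explanation for the sed attempts doing nothing (not confirmed in the thread) is that the sed on this system does not understand \x escapes. A workaround that avoids them is to let printf produce the two bytes and paste them into the sed expression. The sketch below also swaps each no-break space for an ordinary space instead of deleting it, so that words such as "Signal" and "name" stay separated; nbsp_to_space is a hypothetical name.

```shell
# Hypothetical filter: replace each UTF-8 no-break space (c2 a0) with a space.
nbsp_to_space() {
    nbsp=$(printf '\302\240')   # octal escapes for the bytes c2 a0
    LC_ALL=C sed "s/${nbsp}/ /g"
}

printf 'Signal\302\240name\302\240Unit\n' | nbsp_to_space
```

To delete the characters outright, as the Perl one-liner does, the replacement part of the s command can simply be left empty.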

Compare/Diff two files with different line terminators

I have two text files that I want to verify are the same. The problem is that file1 (SELECT_20150210.txt) is generated on a Windows platform and file2 (sel.txt) is generated on a Mac, so the two files have different line-terminating characters even though they look the same:
The first line:
Eriks-MacBook-Air:hftdump erik$ head -n 1 sel.txt
SystemState 0x04 25 03:03:48.800 O
Eriks-MacBook-Air:hftdump erik$ head -n 1 SELECT_20150210.txt
SystemState 0x04 25 03:03:48.800 O
cmp says they are different:
Eriks-MacBook-Air:hftdump erik$ cmp sel.txt SELECT_20150210.txt
sel.txt SELECT_20150210.txt differ: char 35, line 1
But it's only the terminating characters that differ:
Eriks-MacBook-Air:hftdump erik$ head -n 1 SELECT_20150210.txt | hexdump -C
00000000 53 79 73 74 65 6d 53 74 61 74 65 09 30 78 30 34 |SystemState.0x04|
00000010 09 32 35 09 30 33 3a 30 33 3a 34 38 2e 38 30 30 |.25.03:03:48.800|
00000020 09 4f 0d 0a |.O..|
00000024
Eriks-MacBook-Air:hftdump erik$ head -n 1 sel.txt | hexdump -C
00000000 53 79 73 74 65 6d 53 74 61 74 65 09 30 78 30 34 |SystemState.0x04|
00000010 09 32 35 09 30 33 3a 30 33 3a 34 38 2e 38 30 30 |.25.03:03:48.800|
00000020 09 4f 0a |.O.|
00000023
So is there a way to cmp or diff these two files while telling the tool to ignore the different line-terminating characters? Thank you
ASSUMPTION: you don't want to alter the line-endings of the original files
To avoid creating temporary files, you could use process substitution:
diff my_unix_file <(dos2unix < my_dos_file)
diff my_unix_file <(sed 's/\r//' my_dos_file)
diff my_unix_file <(tr -d '\r' < my_dos_file)
UPDATE (Comments converted into answer): Some improvements done thanks to anishsane
On OSX you can use this diff:
diff osx-file.txt <(tr -d '\r' < win-file.txt)
tr -d '\r' < win-file.txt strips every carriage return (\r) from the contents of win-file.txt before diff sees them; the file itself is not modified.
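The same idea works with cmp, which the question asks about, and without process substitution, because cmp accepts - to mean standard input. This is a sketch with made-up file names; same_ignoring_cr is a hypothetical helper. Note that tr removes every carriage return in the file, not only those at line ends.

```shell
# Hypothetical helper: succeed when the DOS file equals the Unix file once
# every carriage return has been removed from the DOS file.
same_ignoring_cr() {
    unix_file=$1
    dos_file=$2
    tr -d '\r' < "$dos_file" | cmp -s "$unix_file" -
}

dir=$(mktemp -d)
printf 'SystemState\t0x04\n' > "$dir/unix.txt"
printf 'SystemState\t0x04\r\n' > "$dir/dos.txt"
if same_ignoring_cr "$dir/unix.txt" "$dir/dos.txt"; then
    echo "same apart from line endings"
fi
rm -rf "$dir"
```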

Linux Bash shell: Leaving some ANSI codes (mostly color) and not interpreting some others from string in a function call

This is an example of my case function:
function SendToScreen(){
    echo -e "$*"
}
So I call it by:
SendToScreen "Hello"
And, if I want to add color codes:
VioletForeGroundColor="\033[38;5;99m"
NormalColor="\033[0m"
SendToScreen "Hello"$VioletForeGroundColor" violet "$NormalColor" word."
That gives me a correct:
But the problem comes when I want to send a DOS-style path (which contains a backslash):
VioletForeGroundColor="\033[38;5;99m"
NormalColor="\033[0m"
MyDOSPath="d:\vivisector"
SendToScreen "Hello"$VioletForeGroundColor" violet "$NormalColor" word. The path is $MyDOSPath"
Because \v is some sort of ANSI code, so this time I obtain:
I need my function to output colored and styled text (bold, italic, underline, etc.), so I must use echo -e.
How can I stop escape sequences such as \v (I suppose there are others) from being interpreted when they just happen to appear in the text?
I would like to fix the issue by modifying the function, but I am not sure this is the proper approach.
Thanks.
EDIT-1: We will choose \033 also known as \e as the only ANSI code that needs to remain.
New answer:
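function SendToScreen() {
    # Double every backslash, then restore the ones that start a color sequence (\033[).
    # The outer quotes keep runs of whitespace in the text intact.
    echo -e "$(echo "${*//\\/\\\\}" | sed 's/\\\\033\[/\\033\[/g')"
}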
This one escapes everything, then un-escapes anything that looks like a color sequence (\033[). The possibility of sending filenames as color sequences is greatly reduced. You can reduce it even further by white-listing only those color sequences that you want to allow, and changing the sed command to a sequence of sed commands that un-escapes those exact sequences.
Old answer:
Let's say you want to escape \v and \n, you can do this:
function SendToScreen(){
    a="${*//\\v/\\\\v}"
    a="${a//\\n/\\\\n}"
    echo -e "$a"
}
You can extend this with whatever other escapes you don't want to process.
The echo -e simply interprets sequences starting with backslash, so you simply need to ensure that the $MyDOSPath argument has all backslashes doubled up. That could be:
SendToScreen "Hello ${VioletForeGroundColor}violet${NormalColor} word." \
"The path is ${MyDOSPath//\\/\\\\}"
which uses a pattern-substitution parameter expansion. The leading // means 'replace every match', so each backslash in the value becomes a double backslash.
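For shells that lack the ${var//pattern/replacement} expansion, the same doubling can be done with sed. This is only a portable sketch of the idea; double_backslashes is a hypothetical helper name, and printf '%b' stands in for echo -e as the step that expands the escapes.

```shell
# Hypothetical helper: double every backslash so that a later escape-expanding
# step (echo -e, printf %b) turns each pair back into one literal backslash.
double_backslashes() {
    printf '%s\n' "$1" | sed 's/\\/\\\\/g'
}

MyDOSPath='d:\vivisector'
escaped=$(double_backslashes "$MyDOSPath")
printf '%b\n' "The path is $escaped"
```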
As discussed in various comments, maybe the design of SendToScreen is sub-optimal. One possible alternative design uses:
SendToScreen [-e "string-to-expand"][-p "plain-string"] [-- "plain strings"]
Arguments that need to be expanded are, and those that should not be expanded are not. By default, they're not. So, example usage:
$ VioletForeGroundColor="\033[38;5;99m"
$ NormalColor="\033[0m"
$ MyDOSPath="C:\new\table\value\alert\form\033.txt"
$ echo "$MyDOSPath"
C:\new\table\value\alert\form\033.txt
$ bash SendToScreen.sh -e "${VioletForeGroundColor}violet${NormalColor}" -e "The path is ${MyDOSPath//\\/\\\\}" -p "Or $MyDOSPath" "Plain $MyDOSPath"
violet The path is C:\new\table\value\alert\form\033.txt Or C:\new\table\value\alert\form\033.txt Plain C:\new\table\value\alert\form\033.txt
$ bash SendToScreen.sh -e "${VioletForeGroundColor}violet${NormalColor}" -e "The path is ${MyDOSPath//\\/\\\\}" -p "Or $MyDOSPath" -e "Oops! $MyDOSPath" "Plain $MyDOSPath"
violet The path is C:\new\table\value\alert\form\033.txt Or C:\new\table\value\alert\form\033.txt Oops! C: ew able
aluelert
orm.txt Plain C:\new\table\value\alert\form\033.txt
$
A hex dump of the last lot of output was:
0x0000: 1B 5B 33 38 3B 35 3B 39 39 6D 76 69 6F 6C 65 74 .[38;5;99mviolet
0x0010: 1B 5B 30 6D 20 54 68 65 20 70 61 74 68 20 69 73 .[0m The path is
0x0020: 20 43 3A 5C 6E 65 77 5C 74 61 62 6C 65 5C 76 61 C:\new\table\va
0x0030: 6C 75 65 5C 61 6C 65 72 74 5C 66 6F 72 6D 5C 30 lue\alert\form\0
0x0040: 33 33 2E 74 78 74 20 4F 72 20 43 3A 5C 6E 65 77 33.txt Or C:\new
0x0050: 5C 74 61 62 6C 65 5C 76 61 6C 75 65 5C 61 6C 65 \table\value\ale
0x0060: 72 74 5C 66 6F 72 6D 5C 30 33 33 2E 74 78 74 20 rt\form\033.txt
0x0070: 4F 6F 70 73 21 20 43 3A 20 65 77 20 61 62 6C 65 Oops! C: ew able
0x0080: 0B 61 6C 75 65 07 6C 65 72 74 0C 6F 72 6D 1B 2E .alue.lert.orm..
0x0090: 74 78 74 20 50 6C 61 69 6E 20 43 3A 5C 6E 65 77 txt Plain C:\new
0x00A0: 5C 74 61 62 6C 65 5C 76 61 6C 75 65 5C 61 6C 65 \table\value\ale
0x00B0: 72 74 5C 66 6F 72 6D 5C 30 33 33 2E 74 78 74 0A rt\form\033.txt.
0x00C0:
You'll have to take my word for it that violet appeared in violet.
Clearly, the user (caller) of SendToScreen has to know which arguments should be expanded and which should not. However, it makes it very explicit.
Here's the code I used as a script. Repackaging as a function is left as an exercise for the reader. Extending it to add -c colour (or maybe -f foreground and -b background) is an exercise for the reader.
#!/bin/bash
output=()
while getopts "p:e:" opt
do
    case "$opt" in
    (e) output+=( $(echo -e "$OPTARG") );;   # expand backslash escapes (unquoted, so the result is word-split)
    (p) output+=( "$OPTARG" );;              # keep the string verbatim
    esac
done
shift $(($OPTIND - 1))
echo "${output[@]}" "$@"
Have fun!
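Another way to keep "expanded" and "plain" arguments apart, offered here only as a sketch and not as part of the answers above, is to let printf do the separation: an argument consumed by %b has its backslash escapes expanded, while one consumed by %s is printed literally. SendColored is a hypothetical name.

```shell
# Hypothetical variant of SendToScreen: the first argument is a color code that
# is expanded (%b); everything after it is printed literally (%s).
SendColored() {
    color=$1
    shift
    printf '%b%s%b\n' "$color" "$*" '\033[0m'
}

VioletForeGroundColor='\033[38;5;99m'
MyDOSPath='d:\vivisector'
SendColored "$VioletForeGroundColor" "The path is $MyDOSPath"
```

With this split the caller never has to double backslashes, because the path is never passed through an escape-expanding step.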
