Decoding ANSI escape sequences in Dart - linux

I'm writing some code for Flutter Desktop targeting linux_x64.
I'm extracting logs from some applications; these logs contain sequences like the ones below.
Inspecting the log file using less logfile I get this:
ESC(BESC[mauthentication-msESC(BESC[m
Inspecting the log file using less -r logfile I can see colored text in my terminal.
Inspecting the log file using cat logfile I can see colored text in my terminal.
Inspecting the log file using cat -vte logfile I get this:
^[(B^[[mauthentication-ms^[(B^[[m$
In Flutter, using this code
Future<String> readAsString = file.readAsString();
readAsString.then((String value) => _log = utf8.decode(value.runes.toList()));
I get this output in a SelectableText widget:
(B[mauthentication-ms(B[m
I'm really confused by this behaviour, so if someone has experience with this, suggestions are welcome!
I see 2 options:
Cleaning all the logs and displaying plain text
Decoding the text just as less -r does and displaying colored text in the Flutter application.
EDIT:
I solved it by importing the tint plugin: tint: ^2.0.0
and changing the Dart code (using the strip() method from the tint plugin) as follows:
Future<String> readAsString = file.readAsString();
readAsString.then((String value) => _log = value.strip());

Those funny characters are called escape sequences, and programs use them to print colours and italics and all of that.
Terminals are designed to decode these escape sequences, but regular programs don't know what to do with them. cat and less -r pass the bytes in the file through unchanged; it's the terminal you run them in that decodes them. Plain less shows the escape character visibly as ESC instead.
You'll have to make your program go through the text and remove all of the escape sequences with a piece of code like this (in Python):
m = "h\x1b[34mello\x1b(A.\x1b[H"  # Text full of random escape sequences
p = True  # Are we not in an escape sequence?
o = ""  # The output variable
for l in m:
    if l == "\x1b":
        p = False  # ESC starts an escape sequence
    elif p:
        o += l  # Ordinary text: keep it
    elif l in "QWERTYUIOPASDFGHJKLZXCVBNMqwertyuiopasdfghjklzxcvbnm":  # Most (maybe all) escape sequences end in letters.
        p = True  # The sequence is over; resume copying text
print(o)  # Text without escape sequences
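The same clean-up can also be sketched with a regular expression. This is only a minimal sketch: it assumes the logs contain just CSI sequences (ESC [ ... final byte) and character-set designations such as ESC ( B, which are the two kinds visible in the question. The names ANSI_RE and strip_ansi are made up for this example, and other escape families (OSC titles, for instance) are not handled.

```python
import re

# Assumption: only CSI sequences and charset designations occur in the input.
ANSI_RE = re.compile(
    r"\x1b(?:"
    r"\[[0-?]*[ -/]*[@-~]"   # CSI: ESC [ parameters, intermediates, final byte
    r"|[()*+][0-9A-Za-z]"    # charset designation, e.g. ESC ( B
    r")"
)


def strip_ansi(text):
    """Return text with the escape sequences matched by ANSI_RE removed."""
    return ANSI_RE.sub("", text)


sample = "\x1b(B\x1b[mauthentication-ms\x1b(B\x1b[m"
print(strip_ansi(sample))
```

A regular expression keeps the rule for where a sequence ends in one place, which makes it easier to extend if other sequences turn up in the logs.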

Is there a function from terminal that removes repetition and concatenates the output on the same line?

With this input
x 1
x 2
x 3
y 1
y 2
y 3
I'd like to have this output
x 1;2;3
y 1;2;3
Thank you in advance,
Simone
If by terminal you mean something natively built in, you might be out of luck. However, you could run a Python file from the terminal which could do what you want and more. If having a standalone file isn't possible, you can always run Python in REPL mode for purely terminal usage.
If you have Python installed, all you need to do to access the REPL is run "py", and you could manually set up a processor there. If you can use a file, then something like the code below should be able to take any input text and output the formatted text to the terminal.
# Read every line of the input file
with open("data.txt", "r") as file:
    lines = file.readlines()
same_starts = {}
# Parse each line in the file and get the starting and trailing data for grouping
for line in lines:
    # Remove trailing/leading whitespace and newlines
    line_norm = line.strip()
    # Split the data at the first space in the line;
    # lines with formatting errors (no space) get skipped
    try:
        start, end = line_norm.split(' ', 1)
    except ValueError:
        continue
    # Check if dictionary same_starts already has this start
    if start in same_starts:
        same_starts[start].append(end)
    else:
        # Add a new list with its first element being this ending
        same_starts[start] = [end]
# Format the final data into the needed output
final_output = ""
for key in same_starts:
    # Joining with ";" avoids a trailing separator at the end of each line
    final_output += key + ' ' + ";".join(same_starts[key]) + '\n'
print(final_output)
NOTE: final_output is the text in the final formatting.
Assuming you have Python installed, this file only needs to be run with the current directory set to the folder where it is stored, along with a text file called "data.txt" in the same folder which contains the values you want processed. Then you would run "py FILE_NAME.ex", replacing FILE_NAME.ex with the exact name of the Python file, extension included.
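The grouping step can also be sketched as a small reusable function. This is a minimal sketch, not the answer's exact script: group_lines is a hypothetical name, and it assumes every useful line has the form "key value" and that keys should come out in the order they first appear.

```python
def group_lines(lines):
    """Group "key value" lines by key and join the values with ';'."""
    groups = {}
    for line in lines:
        parts = line.split(None, 1)  # split at the first run of whitespace
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        key, value = parts
        groups.setdefault(key, []).append(value.strip())
    return [key + " " + ";".join(values) for key, values in groups.items()]


sample = ["x 1\n", "x 2\n", "x 3\n", "y 1\n", "y 2\n", "y 3\n"]
grouped = group_lines(sample)
print("\n".join(grouped))
```

Because the function accepts any iterable of lines, passing it sys.stdin instead of the sample list would let it sit at the end of a shell pipeline.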

How can I print "\n" using exec()?

ab = open("bonj.txt","w")
exec(f'''print("Hi I'm Mark\n", file=ab)
print("\tToday I'm tired", file=ab)
''')
ab.close()
I absolutely need to use exec() to print some information to a txt file. The problem is that when I use exec(), I lose the ability to put newlines or tabs in my text, and I don't understand why. Could you help me?
This is the error message that I receive: "SyntaxError: EOL while scanning string literal"
You just need to escape \n and \t properly:
ab = open("bonj.txt","w")
exec(f'''print("Hi I'm Mark\\n", file=ab)
print("\\tToday I'm tired", file=ab)
''')
ab.close()
You need to prevent Python from interpreting the \n early, before the string is handed to exec.
This can be done by specifying the string as a raw string, using the r prefix:
ab = open("bonj.txt","w")
exec(rf'''print("Hi I'm Mark\n", file=ab)
print("\tToday I'm tired", file=ab)
''')
ab.close()
That said, using exec here is unusual; it is worth checking whether the code can be written as something like:
lines = ["Hi I'm Mark\n", "\tToday I'm tired"]
with open("bonj.txt", "w") as f:
    f.write("\n".join(lines))
Note that "\n".join is needed to get the same line breaks as with print, because print adds a newline after each call by default (see its end="\n" argument). The only difference is that the file written this way does not end with a trailing newline.
Also, when handling files, using the context manager syntax (with open ...) is good practice.
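To see why the original call failed, it may help to compare the two source strings directly. The sketch below is illustrative only: it uses compile() to check whether each string is valid Python, and writes to an in-memory io.StringIO buffer (an assumption made so the example needs no file on disk) instead of bonj.txt.

```python
import io

# Without the r prefix, \n becomes a real line break inside the code string,
# so the string literal in the generated code is cut in half.
broken = '''print("Hi\n")'''
fixed = r'''print("Hi\n")'''

try:
    compile(broken, "<demo>", "exec")
    broken_ok = True
except SyntaxError:
    broken_ok = False

compile(fixed, "<demo>", "exec")  # compiles without error

# Same statements as in the question, written to a buffer instead of a file
buf = io.StringIO()
code = r'''print("Hi I'm Mark\n", file=ab)
print("\tToday I'm tired", file=ab)
'''
exec(code, {"ab": buf})
result = buf.getvalue()
print(repr(result))
```

Passing an explicit namespace to exec, as done here with {"ab": buf}, also makes it clear which names the executed code can see.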

Python: Write to file diacritical marks as escape character sequence

I read text lines from an input file, and after cutting them I have strings like these:
-pokaż wszystko-
–ყველას გამოჩენა–
and I must write something like this to another file:
-poka\017C wszystko-
\2013\10E7\10D5\10D4\10DA\10D0\10E1 \10D2\10D0\10DB\10DD\10E9\10D4\10DC\10D0\2013
My Python script starts like this:
file_input = open('input.txt', 'r', encoding='utf-8')
file_output = open('output.txt', 'w', encoding='utf-8')
Unfortunately, what gets written to the file is not what is expected.
I got a tip explaining why I have to change it, but I can't figure out the conversion:
Diacritic marks saved in UTF-8 ("-pokaż wszystko-"), it works correctly only if NLS_LANG = AMERICAN_AMERICA.AL32UTF8
If the output file has diacritics saved in escaping form ("-poka\017C wszystko-"), the script works correctly for any NLS_LANG settings
Python 3.6+ solution: format the characters outside the ASCII range:
#coding:utf8
s = ['-pokaż wszystko-','–ყველას გამოჩენა–']
def convert(s):
    # Keep ASCII characters; write everything else as a backslash plus at least 4 hex digits
    return ''.join(x if ord(x) < 128 else f'\\{ord(x):04X}' for x in s)
for t in s:
    print(convert(t))
Output:
-poka\017C wszystko-
\2013\10E7\10D5\10D4\10DA\10D0\10E1 \10D2\10D0\10DB\10DD\10E9\10D4\10DC\10D0\2013
Note: I don't know if or how you want to handle Unicode characters outside the basic multilingual plane (BMP, > U+FFFF), but this code probably won't handle them. Need more information about your escape sequence requirements.
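Applied to files, as in the question, the conversion could be sketched as follows. This is a sketch under assumptions: convert_file is a hypothetical helper, the input is UTF-8 as in the question's open() call, and a temporary directory stands in for the real input.txt and output.txt so the example can run anywhere.

```python
import os
import tempfile


def convert(s):
    # Same conversion as in the answer above
    return ''.join(x if ord(x) < 128 else f'\\{ord(x):04X}' for x in s)


def convert_file(src_path, dst_path):
    """Copy src_path to dst_path, escaping every non-ASCII character."""
    with open(src_path, 'r', encoding='utf-8') as src, \
         open(dst_path, 'w', encoding='ascii') as dst:
        for line in src:
            dst.write(convert(line))


# Demo with temporary files standing in for input.txt / output.txt
with tempfile.TemporaryDirectory() as tmp:
    src_path = os.path.join(tmp, 'input.txt')
    dst_path = os.path.join(tmp, 'output.txt')
    with open(src_path, 'w', encoding='utf-8') as f:
        f.write('-pokaż wszystko-\n')
    convert_file(src_path, dst_path)
    with open(dst_path, encoding='ascii') as f:
        converted = f.read()
print(converted)
```

Opening the output with encoding='ascii' is a deliberate check: if any non-ASCII character slipped through unescaped, the write would fail instead of silently producing a file that depends on NLS_LANG.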

Detect arrow keys being pressed on console in Python

I am trying to write a console program in Python 3 that provides some sort of shell for the user, just like the Python 3 shell in a console. I was able to achieve this relatively quickly by using the input() method. However, it would be nice if, in that shell, one could use the arrow keys to cycle through the most recently typed commands, just like you can in other shells. The input() method does not provide this feature, and I did not find any other simple tools to do this, except for the curses module, which needs to take over the whole screen to work.
One of my approaches was to read the typed text from stdin byte by byte and then check it against the codes for the special characters I'm looking for. This works pretty well, but it would run into problems when the user (for some reason) types a weird Unicode character whose encoding contains the code for a key like an arrow key somewhere in the middle. While this is still an acceptable solution for me, I feel like this is a problem which ought to have been solved (better) before, given how often it must occur.
In Python 3, sys.stdin.read(1) returns one whole Unicode character rather than one byte, so a multi-byte character cannot be mistaken for part of an escape sequence. Escape sequences for arrow keys are delivered as multiple ASCII characters. Here is an example program, using tty and termios, which parses inputs accordingly.
import sys, tty, termios
# Commands and escape codes
END_OF_TEXT = chr(3)   # CTRL+C (prints nothing)
END_OF_FILE = chr(4)   # CTRL+D (prints nothing)
CANCEL      = chr(24)  # CTRL+X
ESCAPE      = chr(27)  # Escape
CONTROL     = ESCAPE + '['
# Escape sequences for terminal keyboard navigation
ARROW_UP    = CONTROL + 'A'
ARROW_DOWN  = CONTROL + 'B'
ARROW_RIGHT = CONTROL + 'C'
ARROW_LEFT  = CONTROL + 'D'
KEY_END     = CONTROL + 'F'
KEY_HOME    = CONTROL + 'H'
PAGE_UP     = CONTROL + '5~'
PAGE_DOWN   = CONTROL + '6~'
# Escape sequences to match
commands = {
    ARROW_UP   : 'up arrow',
    ARROW_DOWN : 'down arrow',
    ARROW_RIGHT: 'right arrow',
    ARROW_LEFT : 'left arrow',
    KEY_END    : 'end',
    KEY_HOME   : 'home',
    PAGE_UP    : 'page up',
    PAGE_DOWN  : 'page down',
}
# Blocking read of one input character, detecting appropriate interrupts
def getch():
    k = sys.stdin.read(1)[0]
    if k in {END_OF_TEXT, END_OF_FILE, CANCEL}: raise KeyboardInterrupt
    print('raw input 0x%X' % ord(k), end='\r\n')
    return k
# Println for raw terminal mode
def println(*args):
    print(*args, end='\r\n', flush=True)
# Preserve current terminal settings (we will restore these before exiting)
fd = sys.stdin.fileno()
old_settings = termios.tcgetattr(fd)
try:
    # Enter raw mode (key events sent directly as characters)
    tty.setraw(sys.stdin.fileno())
    # Loop, waiting for keyboard input
    while True:
        # Parse known command escape sequences
        read = getch()
        while any(k.startswith(read) for k in commands.keys()):
            if read in commands:
                println('detected command (%s)' % commands[read])
                read = ''
                break
            read += getch()
        # Interpret all other inputs as text input
        for c in read:
            println('detected character 0x%X %c' % (ord(c), c))
# Always clean up
finally:
    termios.tcsetattr(fd, termios.TCSADRAIN, old_settings)
    println('')
    sys.exit(0)
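The matching logic in the loop above can be tried without a terminal by feeding it a plain string. The sketch below is an illustration rather than part of the original program: parse_keys is a hypothetical helper, and only a subset of the command table is repeated to keep it short.

```python
ESC = "\x1b"

# Subset of the escape sequences from the program above
COMMANDS = {
    ESC + "[A": "up arrow",
    ESC + "[B": "down arrow",
    ESC + "[C": "right arrow",
    ESC + "[D": "left arrow",
    ESC + "[5~": "page up",
    ESC + "[6~": "page down",
}


def parse_keys(chars):
    """Split a stream of characters into named commands and plain characters."""
    events = []
    pending = ""
    for ch in chars:
        pending += ch
        if pending in COMMANDS:
            events.append(("command", COMMANDS[pending]))
            pending = ""
        elif not any(seq.startswith(pending) for seq in COMMANDS):
            # Not the start of any known sequence: treat as ordinary text
            events.extend(("char", c) for c in pending)
            pending = ""
    # Flush whatever is left if the input ended mid-sequence
    events.extend(("char", c) for c in pending)
    return events


events = parse_keys("a" + ESC + "[A" + "é" + ESC + "[6~")
for event in events:
    print(event)
```

Because the input is a sequence of str characters, a non-ASCII character such as "é" arrives as a single item and is never compared byte by byte against the escape codes.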

Python: match a substring only if it is not preceded by any [a-zA-Z0-9_] character

I have a string called programs_cache that contains multiple program names and descriptions:
abcde - A Better CD Encoder
abcm2ps - Translates ABC music description files to PostScript (or SVG)
abcmidi - converter from ABC to MIDI format and back
abcmidi-yaps - yet another ABC to PostScript converter
cd-discid - CDDB DiscID utility
cl-launch - uniform frontend to running Common Lisp code from the shell
cppcheck - tool for static C/C++ code analysis
grabc - simple program to determine the color string in hex by clicking on a pixel
gregorio - command-line tool to typeset Gregorian chant
I want an if statement that is True when the program_name I search for is in the programs_cache string, but it should not be True if the search term is not the full program name.
For example: a search for abc should return False, but a search for grabc should return True.
I was trying this:
if program_name+" " in programs_cache and not re.search([w]+program_name+" ",programs_cache):
But I'm getting the error NameError: global name 'w' is not defined
The idea of using \w was to match any single word character before program_name.
As described in the basic patterns:
\w matches a "word" character: a letter or digit or underbar [a-zA-Z0-9_].
It only matches a single character, not a whole word.
I know that I'm using re.search() and the basic pattern incorrectly, but I haven't figured out how to use them properly in this case.
# do this once during program start
# (programs_cache is one string, so split it into lines first)
program_names = set(line.partition(' - ')[0] for line in programs_cache.splitlines())
# do this for each lookup
if program_name in program_names:
    print("got it")
else:
    print("don't got it")
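If a regular expression is preferred, as in the question's attempt, one possible sketch anchors the name at the start of a line instead of checking the preceding character. The helper name has_program is made up for this example, and it assumes every entry has the form "name - description" on its own line.

```python
import re

programs_cache = """abcde - A Better CD Encoder
abcm2ps - Translates ABC music description files to PostScript (or SVG)
abcmidi - converter from ABC to MIDI format and back
abcmidi-yaps - yet another ABC to PostScript converter
grabc - simple program to determine the color string in hex by clicking on a pixel
"""


def has_program(program_name, cache):
    """True if a line of cache starts with exactly program_name + ' - '."""
    # re.escape keeps characters such as '+' or '.' in a name literal;
    # re.MULTILINE lets ^ match at the start of every line.
    pattern = r"^" + re.escape(program_name) + r" - "
    return re.search(pattern, cache, re.MULTILINE) is not None


print(has_program("abc", programs_cache))
print(has_program("grabc", programs_cache))
```

Note that the pattern is passed as a string; the NameError in the question came from writing [w] as bare Python code rather than inside a string.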
