How to escape special characters (e.g., string.whitespace) in log messages? - python-3.x

I'm trying to log strings that contain special characters. I know Python has a logging module, but for simplicity I've defined the following log function, which takes a file handle and the message to be logged:
def log(logfp, msg):
    logfp.write(f'{msg}\n')
fp = open('logfile.txt', 'w')
log(fp, 'Hello World!')
log(fp, 'World:\nHello, Bob!')
fp.close()
logfile.txt
Hello World!
World:
Hello, Bob!
What I would like is:
Hello World!
World:\nHello, Bob!
So that each line of the logfile corresponds exactly to a single call to log().
I tried using msg.replace() to double the backslashes, but that did not work:
def log(logfp, msg):
    msg = msg.replace(r'\\', r'\\\\')
    logfp.write(f'{msg}\n')
I tried Cid's suggestion which worked for \n but not other whitespace chars:
import os
import string
def log(fp, msg):
    msg = msg.replace("\n", "\\n")
    fp.write(f'{msg}\n')
# Replaces \t \n \r \x0b \x0c with a backslash counterpart (not including space chars)
def log2(fp, msg):
    replacement = {ch:f'\\{ch}' for ch in string.whitespace[1:]}
    for ch in string.whitespace[1:]:
        msg = msg.replace(ch, replacement[ch])
    fp.write(f'{msg}\n')
os.chdir(r'C:\Users\mtran\Desktop') # Change working directory
logfp = open('logfile.txt', 'w')
log(logfp, 'Hello World!')
log(logfp, 'World:\nHello, Bob!')
log2(logfp, 'World:\t\n\r\x0b\x0cHello, Everyone!')
logfp.close()
logfile.txt
Hello World!
World:\nHello, Bob!
World:\ \
\
\Hello, Everyone!

In the string "\n", you can't replace \ with \\ directly, because \n is a single character. It is written as \ followed by n in source code, but Python interprets the pair as one newline character.
Instead, replace \n with \\n:
def log(msg):
    msg = msg.replace("\n", "\\n")
    print(msg)
log('Hello World!')
log('World:\nHello, Bob!')
This outputs
Hello World!
World:\nHello, Bob!
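The same idea can be extended to the other whitespace characters the question mentions. The following is only a sketch, not the answerer's code: the names ESCAPES and escape are made up for illustration, and it assumes that a literal backslash should be doubled as well so that the log stays unambiguous. It builds a translation table from string.whitespace (skipping the plain space) and uses repr() to obtain each character's escaped spelling.

```python
import io
import string

# Map every whitespace character except the plain space to its escaped
# spelling, e.g. the newline character becomes backslash + 'n'.
# repr(ch) yields the quoted escape; [1:-1] drops the surrounding quotes.
ESCAPES = {ord(ch): repr(ch)[1:-1] for ch in string.whitespace if ch != ' '}
# Assumption: a literal backslash is doubled so escapes cannot be confused.
ESCAPES[ord('\\')] = '\\\\'

def escape(msg):
    """Return msg with whitespace control characters made visible."""
    return msg.translate(ESCAPES)

def log(logfp, msg):
    logfp.write(f'{escape(msg)}\n')

# io.StringIO stands in for the log file in this sketch.
buf = io.StringIO()
log(buf, 'Hello World!')
log(buf, 'World:\t\n\r\x0b\x0cHello, Everyone!')
print(buf.getvalue(), end='')
```

Each call to log() now produces exactly one line, because the only real newline written is the one appended by log() itself.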

Related

python3 pexpect spawn object line reference gets skipped

Python version: 3.8.0
pexpect Version: 4.8.0
def runCmd( self, cmd, thisTimeout = None ):
    output = ""
    if not thisTimeout:
        thisTimeout = self.conn.timeout
    try:
        print("debug: %s" % cmd)
        self.conn.sendline(cmd)
        print( "before: %s " % self.conn.before.decode() )
        index = self.conn.expect( self.expList, timeout = thisTimeout )
        output += self.conn.before.decode()
        print( "after: %s " % self.conn.after.decode() )
        print( "before after: %s" % self.conn.before.decode() )
    except Exception as e:
        # expect exception thrown
        print( "Error running command %s" % cmd )
        print( e )
        output = "Error: %s" % str(self.conn)
    print("yo man %s" % self.conn.before.decode() )
    output = output.replace(cmd, "").strip()
    print("this has to print %s " % output)
    return output
This function executes the cmd through the pexpect interface and returns the output.
Versions of Python/pexpect that worked:
Python version: 3.6.9
pexpect version: 4.2.1
After updating the Python script to run on Python 3.8.0/pexpect 4.8.0, the first command sent to pexpect sometimes returns empty output. It looks as though the lines that reference self.conn.before.decode() are either not executed or have no effect.
An example output from described situation:
debug: cat /etc/hostname
before:
after: ubuntu#ip-172-31-1-219:~$
this has to print
An expected behavior:
debug: cat /etc/hostname
after: ubuntu#ip-172-31-1-219:~$
before after: cat /etc/hostname
ip-172-31-1-219
yo man cat /etc/hostname
ip-172-31-1-219
this has to print ip-172-31-1-219
But this time, the line before: gets skipped.
What is going on here?!
Downgrading is not possible: pexpect <= 4.2.1 uses async as a function/variable name, and async became a reserved keyword in Python 3.7.
Update:
The lines are being executed; the content becomes visible when I print it as a byte string:
before after: b' \r\x1b[K\x1b]0;ubuntu#ip-172-31-1-219: ~\x07'
Where the correct one is printing out
before after: b' cat /etc/hostname\r\nip-172-31-1-219\r\n\x1b]0;ubuntu#ip-172-31-1-219: ~\x07'
The reason the before and before after lines get skipped is that they contain the carriage return character \r and the escape sequence \x1b[K.
The carriage return is used to move the cursor to the start of the line. If there are characters after it in the string to be written, they get printed from the position of the cursor onward replacing existing printed characters.
The ANSI Control sequence \x1b[K erases the line from the position of the cursor to the end of the line. This clears the already printed strings in your particular case.
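One way to keep such output readable is to inspect it with repr() and strip the control sequences before printing. The snippet below is a rough sketch rather than part of pexpect: the helper name clean and the regular expression are assumptions made for illustration, and the pattern covers only CSI sequences (such as \x1b[K) and OSC title sequences terminated by \x07, which are the two kinds that appear in the byte strings above.

```python
import re

# CSI sequences (ESC [ ... final byte) and OSC sequences (ESC ] ... BEL).
ANSI_RE = re.compile(r'\x1b\[[0-9;?]*[ -/]*[@-~]|\x1b\][^\x07]*\x07')

def clean(raw: bytes) -> str:
    """Decode terminal output, dropping escape sequences and carriage returns."""
    text = raw.decode(errors='replace')
    text = ANSI_RE.sub('', text)
    return text.replace('\r', '')

# Byte string taken from the question.
raw = b' cat /etc/hostname\r\nip-172-31-1-219\r\n\x1b]0;ubuntu#ip-172-31-1-219: ~\x07'

print(repr(raw.decode()))  # repr() shows the control characters instead of acting on them
print(clean(raw))
```

Printing repr() of the decoded text is usually enough for debugging; the cleaning step is only needed when the text is processed further.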

Regex multiline - Python3 - Match everything inside curly braces [duplicate]

This question already has an answer here:
Regular expression works on regex101.com, but not on prod
(1 answer)
Closed 2 years ago.
I tried this code but it doesn't work; I cannot catch anything. I need to get a multiline match and have been working 3 days on it now. Thanks for your help!!
My regex:
print(re.findall(r'^ltm\s+pool\s+/Common/[0-9-A-Z_.-]+\s+\{([\s\S]*?)^\}',file.read(), re.MULTILINE))
print(re.findall(r'^ltm\s+pool\s+/Common/[0-9-A-Z_.-]+\s+\{(.*?)^\}',file.read(), re.DOTALL))
My code:
#!/usr/bin/env python3
import re, os, sys
### We create a new file
f = open("bigip.txt", "w")
### Default stdout value copied to a variable
orig_stdout = sys.stdout
### Stdout transfered to a file in write mode
sys.stdout = open("bigip.txt", "w")
file = open("bigiptemp", "r")
#for line in file:
#if re.findall(r'^ltm\spool\s\/Common\/([A-Z-a-z]+)', line):
#print(line)
print(re.findall(r'^ltm\s+pool\s+/Common/[0-9-A-Z_.-]+\s+\{([\s\S]*?)^\}',file.read(), re.MULTILINE))
### Default stdout reset
sys.stdout = orig_stdout
The file below is an extract:
ltm pool /Common/GEOG.GD {
members {
/Common/
address
}
/Common/
address
}
monitor
}
ltm pool /Common/HAP_NAODE_DEV {
members {
/Common
address
}
/Common
address
}
}
monitor
}
The expected behavior is shown below, but I cannot share the full content of bigiptemp. My previous question was closed as a duplicate of "Regular expression works on regex101.com, but not on prod".
Expected result
try this
#!/usr/bin/env python3
import re, os, sys
### We create a new file
f = open("bigip.txt", "w")
### Default stdout value copied to a variable
orig_stdout = sys.stdout
### Stdout transfered to a file in write mode
sys.stdout = open("bigip.txt", "w")
file = open("bigiptemp", "r")
#for line in file:
#if re.findall(r'^ltm\spool\s\/Common\/([A-Z-a-z]+)', line):
#print(line)
print(*re.findall(r'^ltm\s+pool\s+/Common/[0-9-A-Z_.-]+\s+\{([\s\S]*?)^\}',file.read(), re.MULTILINE))
### Default stdout reset
sys.stdout = orig_stdout
It seems to work and matches the expected results.
My solution is this (three days to approximately understand how regex works):
regex = r'(^ltm\s+pool\s+/Common/[0-9-A-Z_.-]+\s+\{[\s\S]*?^\}\n?)'
print(*(re.findall(regex, file.read(), re.MULTILINE)))
ltm pool /Common/GEOG.GD {
members {
/Common/
address
}
/Common/
address
}
monitor
}
ltm pool /Common/HAP_NAODE_DEV {
members {
/Common
address
}
/Common
address
}
}
monitor
}
Don't iterate with 'for line in file', as I did, if you want to match several lines at once: each iteration sees only a single line, so a multiline pattern can never match.
A very good regex-topic book for the insomniacs ... lol:
=> https://www.princeton.edu/~mlovett/reference/Regular-Expressions.pdf
Thanks to you all!!
Thanks to @Wiktor Stribiżew
@gueug
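The point about reading the whole file at once can be checked with a small self-contained example. This is a sketch under assumptions: the sample text is invented, with nested braces indented so that only the closing brace of each pool starts a line, and the character class is written as [0-9A-Z_.-], which should match the same names as the one used above.

```python
import re

# Invented sample; inner braces are indented, outer closing braces are not.
sample = """ltm pool /Common/GEOG.GD {
    members {
        /Common/
            address
        }
    }
    monitor
}
ltm pool /Common/HAP_NAODE_DEV {
    members {
    }
    monitor
}
"""

pattern = r'^ltm\s+pool\s+/Common/[0-9A-Z_.-]+\s+\{([\s\S]*?)^\}'

# Whole text at once: re.MULTILINE lets ^ match at the start of every line.
blocks = re.findall(pattern, sample, re.MULTILINE)
print(len(blocks))

# Line by line: no single line contains both the header and the closing brace.
per_line = [m for line in sample.splitlines(True)
            for m in re.findall(pattern, line, re.MULTILINE)]
print(len(per_line))
```

The first call finds both pool blocks, while the per-line loop finds none.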

How to run a process that has an argument containing new-lines?

I have a command that has the structure:
xrdcp "root://server/file?authz=ENVELOPE&Param1=Val1" local_file_path
The problem is that ENVELOPE is text that must appear unquoted on the command line, and it contains many newlines.
I cannot use repr, as it would replace each newline with \n.
Moreover, subprocess seems to apply repr automatically to the items in the argument list.
In bash this command is usually run with
xrdcp "root://server/file?authz=$(<ENVELOPE)&Param1=Val1" local_file
So, is there a way to run a command while keeping the new lines in the arguments?
Thank you!
Later Edit:
my actual code is :
envelope = server['envelope']
complete_url = "\"" + server['url'] + "?" + "authz=" + "{}".format(server['envelope']) + xrdcp_args + "\""
xrd_copy_list = []
xrd_copy_list.extend(xrdcp_cmd_list)
xrd_copy_list.append(complete_url)
xrd_copy_list.append(dst_final_path_str)
xrd_job = subprocess.Popen(xrd_copy_list, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
stdout, stderr = xrd_job.communicate()
print(stdout)
print(stderr)
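When subprocess is given a list and no shell is involved, each list item is handed to the child process unchanged, embedded newlines included. The sketch below illustrates this without xrdcp: a small Python child process reports the argument it received. The envelope text and the URL are placeholders, not real values. Note also that with a list there is no shell to strip quotes, so the literal quote characters that the code above adds around complete_url would be passed to the program as part of the argument.

```python
import subprocess
import sys

# Placeholder standing in for server['envelope']; the real text is not known here.
envelope = "-----BEGIN ENVELOPE-----\nline one\nline two\n-----END ENVELOPE-----"
url = f"root://server/file?authz={envelope}&Param1=Val1"

# The child prints repr() of its first argument so the newlines are visible.
child_code = "import sys; sys.stdout.write(repr(sys.argv[1]))"

result = subprocess.run(
    [sys.executable, "-c", child_code, url],  # list form, no shell, no added quotes
    capture_output=True,
    text=True,
    check=True,
)
received = result.stdout
print(received)
```

The child's repr() output matches repr(url) in the parent, which suggests that the newlines survive the call as long as the argument is passed as a single list item.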

Format data in a textfile

I have a text file containing data in this format :
[-0.00287209 -0.00815337 -0.00322895 -0.00015178]
[-0.0038058 -0.01238539 -0.00082072 0.00040815]
[-0.00922925 -0.00394288 0.00325778 0.00083047]
[-0.01221899 0.01573175 0.00569081 0.00079524]
[0.02409868 0.02623219 0.00364268 0.00026268]
[ 0.04754814 0.00664801 -0.00204411 -0.00044964]
[-0.02286798 -0.02860896 -0.00671971 -0.00086068]
[-0.079635 -0.03532551 -0.00594647 -0.00067338]
[ 1.13691452e-03 4.88425646e-04 -3.44116748e-05 -1.08364051e-05]
I want to reformat it (remove the brackets and replace the spaces between the numbers with commas) so that it looks like this:
-0.00287209,-0.00815337,-0.00322895,-0.00015178
-0.0038058,-0.01238539,-0.00082072,0.00040815
-0.00922925,-0.00394288,0.00325778,0.00083047
-0.01221899,0.01573175,0.00569081,0.00079524
0.02409868,0.02623219,0.00364268,0.00026268
0.04754814,0.00664801,-0.00204411,-0.00044964
-0.02286798,-0.02860896,-0.00671971,-0.00086068
-0.079635,-0.03532551,-0.00594647,-0.00067338
1.13691452e-03,4.88425646e-04,-3.44116748e-05,-1.08364051e-05
Something basic like this works:
import csv
# assuming the input is in input.txt
with open("input.txt") as input_file:
    lines = input_file.readlines() # read in the entire file
fixed_lines = []
for line in lines: # for each line
    line = line.strip() # remove the newline at the end
    line = line.lstrip("[") # remove brackets from the left
    line = line.rstrip("]") # remove brackets from the right
    fixed_lines.append(line.strip().split()) # make sure there are no left over spaces and split by whitespace
# write out using the csv module (newline='' avoids blank lines on Windows)
with open("output.txt", 'w', newline='') as f:
    csv_writer = csv.writer(f)
    csv_writer.writerows(fixed_lines)
Output:
-0.00287209,-0.00815337,-0.00322895,-0.00015178
-0.0038058,-0.01238539,-0.00082072,0.00040815
-0.00922925,-0.00394288,0.00325778,0.00083047
-0.01221899,0.01573175,0.00569081,0.00079524
0.02409868,0.02623219,0.00364268,0.00026268
0.04754814,0.00664801,-0.00204411,-0.00044964
-0.02286798,-0.02860896,-0.00671971,-0.00086068
-0.079635,-0.03532551,-0.00594647,-0.00067338
1.13691452e-03,4.88425646e-04,-3.44116748e-05,-1.08364051e-05
You could do it with a regexp like this
import re
s = """[-0.00287209 -0.00815337 -0.00322895 -0.00015178]
[-0.0038058 -0.01238539 -0.00082072 0.00040815]
[-0.00922925 -0.00394288 0.00325778 0.00083047]
[-0.01221899 0.01573175 0.00569081 0.00079524]
[0.02409868 0.02623219 0.00364268 0.00026268]
[ 0.04754814 0.00664801 -0.00204411 -0.00044964]
[-0.02286798 -0.02860896 -0.00671971 -0.00086068]
[-0.079635 -0.03532551 -0.00594647 -0.00067338]
[ 1.13691452e-03 4.88425646e-04 -3.44116748e-05 -1.08364051e-05]
"""
fouine = re.compile(r'^\[\s*(-?\d\.?\d+(?:e-\d+)?) \s*(-?\d\.?\d+(?:e-\d+)?) \s*(-?\d\.?\d+(?:e-\d+)?) \s*(-?\d\.?\d+(?:e-\d+)?)]$', re.M)
print(re.sub(fouine, r'\1,\2,\3,\4', s))
Another way is to split your content by line and by "column":
import re
s = """[-0.00287209 -0.00815337 -0.00322895 -0.00015178]
[-0.0038058 -0.01238539 -0.00082072 0.00040815]
[-0.00922925 -0.00394288 0.00325778 0.00083047]
[-0.01221899 0.01573175 0.00569081 0.00079524 ]
[0.02409868 0.02623219 0.00364268 0.00026268]
[ 0.04754814 0.00664801 -0.00204411 -0.00044964]
[-0.02286798 -0.02860896 -0.00671971 -0.00086068]
[-0.079635 -0.03532551 -0.00594647 -0.00067338]
[ 1.13691452e-03 4.88425646e-04 -3.44116748e-05 -1.08364051e-05]
"""
# remove the brackets
def remove_brackets(l): return l.strip('[]')
# split the columns and join with a comma
# (strip first so padding inside the brackets does not produce empty fields)
def put_commas(l): return ','.join(re.split(r'\s+', l.strip()))
raw_lines = s.splitlines()
clean_lines = map(remove_brackets, raw_lines)
clean_lines = map(put_commas, clean_lines)
print('\n'.join(clean_lines))

split() on one character OR another

Python 3.6.0
I have a program that parses output from Cisco switches and routers.
I get to a point in the program where I am returning output from the 'sh ip int brief'
command.
I place it in a list so I can split on the '>' character and extract the hostname.
It works perfectly. Pertinent code snippet:
ssh_channel.send("show ip int brief | exc down" + "\n")
# ssh_channel.send("show ip int brief" + "\n")
time.sleep(0.6)
outp = ssh_channel.recv(5000)
mystring = outp.decode("utf-8")
ipbrieflist = mystring.splitlines()
hostnamelist = ipbrieflist[1].split('>')
hostname = hostnamelist[0]
If the router is in 'enable' mode the command prompt has a '#' character after the hostname.
If I change my program to split on the '#' character:
hostnamelist = ipbrieflist[1].split('#')
it still works perfectly.
I need for the program to handle if the output has the '>' character OR the '#' character in 'ipbrieflist'.
I have found several valid references for how to handle this. Ex:
import re
text = 'The quick brown\nfox jumps*over the lazy dog.'
print(re.split(r'; |, |\*|\n', text))
The above code works perfectly.
However, when I modify my code as follows:
hostnamelist = ipbrieflist[1].split('> |#')
It does not work. By 'does not work' I mean it does not split on either character. No splitting at all.
The following debug is from PyCharm:
ipbrieflist = mystring.splitlines() ipbrieflist={list}: ['terminal length 0', 'rtr-1841>show ip int brief | exc down', 'Interface IP-Address OK? Method Status Protocol', 'FastEthernet0/1 192.168.1.204 YES NVRAM up up ', 'Loopback0 172.17.0.1 YES NVRAM up up ', '', 'rtr-1841>']
hostnamelist = ipbrieflist[1].split('> |#') hostnamelist={list}: ['rtr-1841>show ip int brief | exc down']
hostname = {str}'rtr-1841>show ip int brief | exc down'
As you can see the hostname variable still contains the 'show ip int brief | exc down' appended to it.
I get the same exact behavior if the hostname is followed by the '#' character.
What am I doing wrong?
Thanks.
Instead of this:
ipbrieflist[1].split('> |#')
You want this:
re.split('>|#', ipbrieflist[1])
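The difference is that str.split() treats its argument as one literal separator, whereas re.split() interprets it as a regular expression. A minimal sketch, using the prompt line from the debug output above plus an invented enable-mode variant, and a character class [>#] as an equivalent way of writing '>|#':

```python
import re

user_mode = 'rtr-1841>show ip int brief | exc down'
enable_mode = 'rtr-1841#show ip int brief | exc down'  # invented enable-mode variant

# str.split() looks for the literal text '> |#', which never occurs,
# so the line comes back as a single unsplit item.
unsplit = user_mode.split('> |#')
print(unsplit)

# re.split() with a character class splits on either prompt character.
# maxsplit=1 leaves any later '>' or '#' in the command text untouched.
hostnames = [re.split(r'[>#]', line, maxsplit=1)[0]
             for line in (user_mode, enable_mode)]
print(hostnames)
```

Both prompt styles yield the hostname rtr-1841.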
