Python3.6.6 argparse for negative string values with list of arguments - python-3.x

I'm having issues with negative strings being fed into argparse and resulting in errors. I am wondering if anyone has figured out a way around this. Unfortunately, I need the strings to be prepended by a negative sign in some cases, so I cannot fix this by removing the negation part.
I've taken a look at some other stackoverflow pages, such as How to parse positional arguments with leading minus sign (negative numbers) using argparse but still have no solution for this. I'm sure that someone must have a solution for this!
Here is what I have tried and am seeing:
PyDev console: starting.
Python 3.6.6 |Anaconda, Inc.| (default, Jun 28 2018, 11:07:29)
[GCC 4.2.1 Compatible Clang 4.0.1 (tags/RELEASE_401/final)] on darwin
import argparse
argparser = argparse.ArgumentParser()
# arguments:
argparser.add_argument("--configfile", "--config", type=str, default=None,
help="A config file to parse (see src/configs/sample_config.ini for more details).")
argparser.add_argument("--starttime", "--st", type=str, default="-1h#m",
help="Starting time for the dump; default: one hour ago rounded to the last minute. \n"
"Supported structure: -[0-9]+[h,m,s]+(#[h,m,s])?; for example: -1h#m, -10h, ... OR now")
argparser.add_argument("--endtime", "--et", type=str, default="#m",
help="Ending time for the dump; default: current time rounded to the last minute. \n"
"Supported structure: -[0-9]+[h,m,s]+(#[h,m,s])?; for example: -1h#m, -10h, ... OR now")
argparser.add_argument("--indexes", "--i", type=str, nargs="+", default="-config-",
help="Provide one or more indexes (comma-separated with quotes around the list; "
"ex: \"index1, index two, index3\") to search on. Default is \"-config-\" "
"which means that the indeces will be gathered from file names; see the sample "
"config for details.")
argparser.add_argument("--format", "--f", type=str, default="csv",
help="Write data out to CSV file.")
import shlex
args = argparser.parse_args(shlex.split("--configfile /somepath/sample_config_test04.ini --endtime now --indexes \"index one\" index2 index3 --format csv --starttime -5h#m"))
usage: pydevconsole.py [-h] [--configfile CONFIGFILE] [--starttime STARTTIME]
[--endtime ENDTIME] [--indexes INDEXES [INDEXES ...]]
[--format FORMAT]
pydevconsole.py: error: argument --starttime/--st: expected one argument
Process finished with exit code 2
As you can see the above fails, but the following works fine, so I am sure that it's an issue with the "-":
PyDev console: starting.
Python 3.6.6 |Anaconda, Inc.| (default, Jun 28 2018, 11:07:29)
[GCC 4.2.1 Compatible Clang 4.0.1 (tags/RELEASE_401/final)] on darwin
import argparse
argparser = argparse.ArgumentParser()
# arguments:
argparser.add_argument("--configfile", "--config", type=str, default=None,
help="A config file to parse (see src/configs/sample_config.ini for more details).")
argparser.add_argument("--starttime", "--st", type=str, default="-1h#m",
help="Starting time for the dump; default: one hour ago rounded to the last minute. \n"
"Supported structure: -[0-9]+[h,m,s]+(#[h,m,s])?; for example: -1h#m, -10h, ... OR now")
argparser.add_argument("--endtime", "--et", type=str, default="#m",
help="Ending time for the dump; default: current time rounded to the last minute. \n"
"Supported structure: -[0-9]+[h,m,s]+(#[h,m,s])?; for example: -1h#m, -10h, ... OR now")
argparser.add_argument("--indexes", "--i", type=str, nargs="+", default="-config-",
help="Provide one or more indexes (comma-separated with quotes around the list; "
"ex: \"index1, index two, index3\") to search on. Default is \"-config-\" "
"which means that the indeces will be gathered from file names; see the sample "
"config for details.")
argparser.add_argument("--format", "--f", type=str, default="csv",
help="Write data out to CSV file.")
import shlex
args = argparser.parse_args(shlex.split("--configfile /somepath/sample_config_test04.ini --endtime now --indexes \"index one\" index2 index3 --format csv --starttime 5h#m"))
print(args)
Namespace(configfile='/somepath/sample_config_test04.ini', endtime='now', format='csv', indexes=['index one', 'index2', 'index3'], starttime='5h#m')
You may be wondering why I'm calling the code like this; that's because I need to do some unittest-ing of the argparse calls that I am running, so I need to be able to call it from both the commandline as well as from my unittesting code.
If I call the same code from the command line without any quotes or a backslash in front of -5h#m, it works, but only there (the value gets converted to \-5h#m). I have tried --starttime \"-5h#m\", --starttime \-5h#m, -5h#m, '\-5h#m', etc., but argparse does not accept or correctly parse any of them; only the command-line input works.
The error is typically:
test.py: error: argument --starttime/--st: expected one argument
Any help would be greatly appreciated!
Update: changing the input to the form --configfile=/somepath/sample_config_test04.ini --endtime=now --indexes="index one, index2, index3" --format=csv --starttime=-5h#m seems to work from the command line.
NOTE: I'd like to keep this answer mainly because the other suggested answer has a very oddly phrased title that I would not have been able to find by doing a Google search for what I needed. I also updated the question to reflect that it involves list arguments.
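The update above can be reproduced from test code as well. The following is a minimal sketch, assuming a trimmed-down parser with only two of the original options (build_parser is a hypothetical helper, not part of the original script). It shows that attaching the value with "=" lets argparse accept a string that starts with "-", while the space-separated form is still rejected.

```python
import argparse
import shlex


def build_parser():
    # Hypothetical helper: a reduced version of the parser from the question.
    parser = argparse.ArgumentParser()
    parser.add_argument("--starttime", "--st", type=str, default="-1h#m")
    parser.add_argument("--indexes", "--i", type=str, nargs="+", default="-config-")
    return parser


parser = build_parser()

# "--starttime=-5h#m" binds the value to the option explicitly, so argparse
# never has to decide whether "-5h#m" is itself an option string.
args = parser.parse_args(shlex.split('--indexes "index one" index2 --starttime=-5h#m'))
print(args.starttime)
print(args.indexes)

# The space-separated form still fails: argparse treats "-5h#m" as an
# unknown option and exits with "expected one argument".
try:
    parser.parse_args(["--starttime", "-5h#m"])
    rejected = False
except SystemExit:
    rejected = True
print("space-separated form rejected:", rejected)
```

Because the same argument list goes through parse_args in both cases, a unit test can use the "=" form directly without any shell quoting.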

Related

SED style Multi address in Python?

I have an app that parses multiple Cisco show tech files. These files contain the output of multiple router commands in a structured way. Here is a snippet of a show tech output:
`show clock`
20:20:50.771 UTC Wed Sep 07 2022
Time source is NTP
`show callhome`
callhome disabled
Callhome Information:
<SNIPET>
`show module`
Mod Ports Module-Type Model Status
--- ----- ------------------------------------- --------------------- ---------
1 52 16x10G + 32x10/25G + 4x100G Module N9K-X96136YC-R ok
2 52 16x10G + 32x10/25G + 4x100G Module N9K-X96136YC-R ok
3 52 16x10G + 32x10/25G + 4x100G Module N9K-X96136YC-R ok
4 52 16x10G + 32x10/25G + 4x100G Module N9K-X96136YC-R ok
21 0 Fabric Module N9K-C9504-FM-R ok
22 0 Fabric Module N9K-C9504-FM-R ok
23 0 Fabric Module N9K-C9504-FM-R ok
<SNIPET>
My app currently uses both sed and Python scripts to parse these files. I use sed to search the show tech file for a specific command's output, and once I find it, I stop sed. This way I don't need to read the whole file (these files can get very big). This is a snippet of my sed script:
sed -E -n '/`show running-config`|`show running`|`show running config`/{
p
:loop
n
p
/`show/q
b loop
}' $1/$file
As you can see, I am using a multi-address range in sed. My question is: how can I achieve something similar in Python? I have tried several combinations of the DOTALL and MULTILINE flags, but I can't get the result I'm expecting. For example, I can get a match for the command I'm looking for, but the Python regex won't stop after the first match and keeps going until the end of the file.
I am looking for something like this
sed -n '/`show clock`/,/`show/p'
I would like the regex match to stop parsing the file and print the results immediately after seeing `show again. I hope that makes sense. Thank you all for reading and for your help.
You can use nested loops.
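import re

def process_file(filename):
    with open(filename) as f:
        for line in f:
            if re.search(r'`show running-config`|`show running`|`show running config`', line):
                # lines already end with a newline, so suppress print's own
                print(line, end='')
                # the inner loop continues on the same file iterator
                for line1 in f:
                    print(line1, end='')
                    if re.search(r'`show', line1):
                        return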
The inner for loop will start from the next line after the one processed by the outer loop.
You can also do it with a single loop using a flag variable.
import re

def process_file(filename):
    in_show = False
    with open(filename) as f:
        for line in f:
            if in_show:
                print(line, end='')
                # stop at the next command header after the start line
                if re.search(r'`show', line):
                    return
            elif re.search(r'`show running-config`|`show running`|`show running config`', line):
                in_show = True
                print(line, end='')
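If the caller needs the matched lines rather than printed output, the same idea can be written as a generator. This is only a sketch: extract_section, START and END are hypothetical names, and the sample text (including the "hostname demo" line) is made up for illustration. Like the sed script, it stops reading as soon as the next `show header appears.

```python
import io
import re

# Assumed patterns, mirroring the ones used in the sed script.
START = re.compile(r'`show (running-config|running config|running)`')
END = re.compile(r'`show')


def extract_section(lines, start=START, end=END):
    """Yield lines from the first start match through the next end match."""
    it = iter(lines)
    for line in it:
        if start.search(line):
            yield line
            break
    else:
        return  # start pattern never found
    for line in it:
        yield line
        if end.search(line):
            return  # stop reading here, like sed's q command


# Made-up sample input standing in for a real show tech file.
sample = io.StringIO(
    "`show clock`\n"
    "20:20:50.771 UTC Wed Sep 07 2022\n"
    "`show running-config`\n"
    "hostname demo\n"
    "`show module`\n"
    "Mod Ports Module-Type Model Status\n"
)
section = list(extract_section(sample))
print(''.join(section), end='')
```

Passing an open file object instead of the StringIO reads the file lazily, so the rest of a large file is never loaded.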

SyntaxError when using print(""" with a list of numbers to populate a file with GROMACS patched with PLUMED

I am using GROMACS with PLUMED to run MD simulations. While setting up my PLUMED file for collecting the S2/SH CV from Omar (https://www.plumed.org/doc-v2.8/user-doc/html/_s2_c_m.html), I get the following error:
File "makingplumed.py", line 25
""" % (x,i)file=f)
^
SyntaxError: invalid syntax
Here is the code I am trying to run:
# here we create the PLUMED input file with python
with open("plumed.dat","w") as f:
# print initial stuff
#K# from __future__ import print_function
# Define Atoms which are Oxygen hydrogen bond acceptors
ATOMS=[21,35,45,62,76,97,109,133,152,174,188,202,213,227,239,253,269,280,292,311,323,339,353,377,401,416,426,447,466,477,488,503,518,538,560,575,597,617,624,641,655,677,692,702,722,743,765,784,798,820,844,866,883,897,919,939,961,978,988,1004,1021,1040]
#Define heavy atoms for S2CM CV (protein and backbone and not hydrogen)
heavy_atoms_nh: GROUP ATOMS=1,5,7,10,12,16,20,21,22,23,26,29,32,34,35,36,38,40,44,45,46,48,50,53,54,55,57,59,61,62,63,64,67,70,73,75,76,79,81,84,85,87,89,90,92,94,96,97,98,100,102,105,106,107,108,109,110,112,114,117,120,123,124,125,126,129,132,133,134,136,138,141,143,147,151,152,153,155,157,160,163,166,169,173,174,175,177,179,181,185,187,188,189,191,193,195,199,201,202,203,205,207,210,212,213,214,216,217,218,220,224,226,227,228,230,232,235,236,237,238,239,240,241,244,247,250,252,253,254,256,258,260,264,268,269,270,272,274,277,279,280,281,283,285,288,291,392,293,295,297,299,303,306,310,311,312,314,316,319,320,321,322,323,326,328,330,334,338,339,340,342,344,346,350,352,353,354,356,358,361,364,367,369,370,373,376,377,378,380,382,385,388,391,393,397,400,401,402,404,406,409,412,415,416,417,419,421,425,426,427,429,431,434,345,437,439,442,444,446,447,448,450,452,455,457,461,465,466,467,469,471,474,476,477,478,480,482,485,487,489,491,493,496,499,500,501,502,503,504,506,508,511,514,515,516,517,518,519,521,523,526,527,529,531,533,535,537,538,539,541,543,546,549,552,555,559,560,561,563,265,268,571,572,573,574,575,576,578,580,583,586,589,592,596,597,598,600,602,605,606,608,610,612,614,616,617,618,620,623,624,625,627,629,632,635,636,640,641,642,644,646,648,652,654,655,656,658,660,663,666,669,672,676,677,678,680,682,685,688,691,692,693,695,697,701,702,703,705,707,710,711,713,715,717,719,721,722,723,725,727,730,731,733,735,738,740,742,743,744,746,748,751,754,757,760,764,765,766,768,770,773,775,779,783,784,785,786,789,792,795,797,798,799,801,803,806,809,812,815,819,820,821,823,825,828,829,831,833,834,836,838,840,842,843,844,845,847,849,852,855,858,861,865,866,867,869,871,874,877,878,879,882,883,884,886,888,891,892,893,896,867,898,900,902,905,908,911,914,918,919,920,922,924,927,928,930,932,934,936,938,939,940,942,944,947,950,953,956,960,961,962,964,966,969,975,973,977,978,979,981,983,987,988,989,991,993,995,999,1003,1004,1005,1007,1009,1012,1015,1016,1017,1020,1021,1022,1024,1026,1029,1031,1035,1039,1040,1041,1043,1045,1048,1049,1051,1053,1055,1057,1059,1060,1061
for x in range(len(ATOMS)):
for i in range(1, 60):
print("""
S2CM ...
NH_ATOMS=x,x+2
HEAVY_ATOMS=heavy_atoms_nh
LABEL=S2nh-%d
R_EFF=0.10
PREFACTOR_A=0.80
EXPONENT_B=1.0
OFFSET_C=0.10
N_I=1
NOPBC
... S2CM
""" % (x,i)file=f)
I am just learning python and Linux this summer as I am getting involved with computational biochemistry research, so if there is a simple fix I am very sorry for the waste of time, and I appreciate any and all time and attention to this matter.
python3 --version
Python 3.6.13
Thank You,
David Cummins
Masters Student at Western Washington University
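
The SyntaxError itself comes from the missing comma between the formatted string and file=f in the print call. Two further problems would show up once that is fixed: the template has a single %d but receives two values, and the x in NH_ATOMS=x,x+2 is literal text, not the loop variable. Below is a minimal sketch of one way to write it. It assumes the intent is one S2CM block per entry of ATOMS, with NH_ATOMS pairing each atom number x with x+2; that reading is a guess, and the shortened atom lists and the write_plumed helper are hypothetical.

```python
import io

# Shortened, illustrative atom lists; substitute the full lists from the question.
ATOMS = [21, 35, 45]
HEAVY_ATOMS = [1, 5, 7, 10, 12]

# One placeholder per value: two for NH_ATOMS (assumed to be x and x+2),
# one for the label.
TEMPLATE = """
S2CM ...
NH_ATOMS=%d,%d
HEAVY_ATOMS=heavy_atoms_nh
LABEL=S2nh-%d
R_EFF=0.10
PREFACTOR_A=0.80
EXPONENT_B=1.0
OFFSET_C=0.10
N_I=1
NOPBC
... S2CM
"""


def write_plumed(f, atoms, heavy_atoms):
    # The GROUP line is PLUMED syntax, so Python must write it as text.
    group = ",".join(str(a) for a in heavy_atoms)
    print("heavy_atoms_nh: GROUP ATOMS=" + group, file=f)
    for i, x in enumerate(atoms, start=1):
        # Note the comma between the formatted string and file=f.
        print(TEMPLATE % (x, x + 2, i), file=f)


buffer = io.StringIO()  # stand-in for open("plumed.dat", "w")
write_plumed(buffer, ATOMS, HEAVY_ATOMS)
plumed_text = buffer.getvalue()
print(plumed_text)
```

To produce the real file, the StringIO buffer can be replaced by with open("plumed.dat", "w") as f: followed by write_plumed(f, ATOMS, HEAVY_ATOMS).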

Change locale for Google Colab

I want to change the locale setting (to change the date format) in Google Colab.
The following works for me in Jupyter Notebook but not in Google Colab:
locale.setlocale(locale.LC_TIME, 'de_DE.UTF-8')
It always returns the error: unsupported locale setting
I have already looked at many other solutions and tried everything.
One solution to change only the time zone I have seen is this one:
!rm /etc/localtime
!ln -s /usr/share/zoneinfo/Asia/Bangkok /etc/localtime
!date
I figured this one out after a long time:
In Colab, you will have to install the desired locales. You do this with:
!sudo dpkg-reconfigure locales
This will prompt for a numeric input, e.g. 268 and 269 for Hungarian.
So you enter 268 269.
After installation, it will also prompt for the default locale. Here you need to select your desired custom locale. This time it is a numeric selection out of 3-5 options, depending on how many you selected in the previous step. In my case, I selected 3, and the default locale became hu_HU.
You need to restart the Colab runtime: Ctrl + M then .
You need to activate the locale:
import locale
locale.setlocale(locale.LC_ALL, 'hu_HU')  # make sure you do it for the LC_ALL context
The custom locale is now ready to use with pandas:
pd.to_datetime('2021-01-01').day_name() returns Friday, but
pd.to_datetime('2021-01-01').day_name('hu_HU') returns Péntek
I wasn't successful using German locale on Google Colab, but desired formatting could be obtained as a combination of overriding locale for decimal separator and date formatting.
German formatting rules can be found here.
For custom string formatting nice cheatsheet is here.
from datetime import datetime, timedelta
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import numpy as np
import locale
german_format_str_full = '%Y-%m-%d, %H.%M Uhr'
german_format_str_date = '%Y-%m-%d'
# generating plot data; xs are dates with a non-obvious step
xs = np.arange(datetime(year=2021, month=11, day=28, hour=23, minute=59, second=59),
datetime(year=2021, month=12, day=6, hour=23, minute=59, second=59),
timedelta(hours=5,minutes=47,seconds=27))
ys = np.sin(np.arange(0,len(xs),1)) # whatever
# use overwritten locale for comma as decimal point -- German formatting
plt.rcParams['axes.formatter.use_locale'] = True
locale._override_localeconv["decimal_point"]= ','
# plot
fig, ax = plt.subplots(figsize=(9,4))
ax.plot(xs,ys, 'o-')
# set formatting string using mdates from matplotlib
ax.xaxis.set_major_formatter(mdates.DateFormatter(german_format_str_date))
# rotate formatted ticks or use autoformat 'fig.autofmt_xdate()'
plt.xticks(rotation=70)
plt.title('Google Colab plot with German locale style')
plt.show()
It gives me this plot:
If you need to check what the formatting settings look like on your machine, you can use locale.nl_langinfo(locale.D_T_FMT). For example:
import locale
from datetime import datetime
now = datetime.now()
# find local date time formatting on Google Colab
local_format_str = locale.nl_langinfo(locale.D_T_FMT)
print('local_format_str on Google Colab: ', local_format_str)
print('now in Google Colab default format:', now.strftime(local_format_str))
german_format_str_full = '%Y-%m-%d, %H.%M Uhr'
german_format_str_date = '%Y-%m-%d'
print('now in German format, full:',now.strftime(german_format_str_full))
print('now in German format, only date:',now.strftime(german_format_str_date))
ridiculous_format = '%Y->%m-->%d'
print('now ridiculous_format:',now.strftime(ridiculous_format))
Based on this answer, I was able to load German locales. However, it needs to be done in two steps: installing the new German locale, then restarting the kernel and loading the German locale.
In short:
import os
# Install de_DE
!/usr/share/locales/install-language-pack de_DE
!dpkg-reconfigure locales
# Restart Python process to pick up the new locales
os.kill(os.getpid(), 9)
More detailed version:
It turned out that the list of available locales is pretty short which can be checked like this:
import locale
from datetime import datetime
now = datetime.now()
# find local date time formatting on Google Colab
local_format_str = locale.nl_langinfo(locale.D_T_FMT)
print('local_format_str on Google Colab: ', local_format_str)
print('now in Google Colab default format:', now.strftime(local_format_str))
print('Loading avaliable locales via real names...')
for real_name in set(locale.locale_alias.values()):
    try:
        locale.setlocale(locale.LC_ALL, real_name)
        print('success: real_name = ', real_name)
    except:
        pass
print('Loading avaliable locales via aliases...')
for alias , real_name in locale.locale_alias.items():
    try:
        locale.setlocale(locale.LC_ALL, alias)
        print('success: alias = ' , alias, ' , real_name = ', real_name)
    except:
        pass
With output:
local_format_str on Google Colab: %a %b %e %H:%M:%S %Y
now in Google Colab default format: Wed Dec 1 12:10:52 2021
Loading avaliable locales via real names...
success: real_name = en_US.UTF-8
success: real_name = C
Loading avaliable locales via aliases...
As we can see there is no german locale, so it needs to be installed with code:
import os
# Install de_DE
!/usr/share/locales/install-language-pack de_DE
!dpkg-reconfigure locales
# Restart Python process to pick up the new locales
os.kill(os.getpid(), 9)
giving an output:
Generating locales (this might take a while)...
de_DE.ISO-8859-1... done
Generation complete.
dpkg-trigger: error: must be called from a maintainer script (or with a --by-package option)
Type dpkg-trigger --help for help about this utility.
Generating locales (this might take a while)...
de_DE.ISO-8859-1... done
en_US.UTF-8... done
Generation complete.
Then we load the German locale with locale.setlocale(locale.LC_ALL, 'german'), and the same code as at the beginning (remember to import the packages again after the restart) gives us:
Loading avaliable locales via real names...
success: real_name = C
success: real_name = en_US.UTF-8
success: real_name = de_DE.ISO8859-1
Loading avaliable locales via aliases...
success: alias = deutsch , real_name = de_DE.ISO8859-1
success: alias = german , real_name = de_DE.ISO8859-1
and the default formatting is more German:
local_format_str on Google Colab: %a %d %b %Y %T %Z
now in Google Colab default format: Mi 01 Dez 2021 12:12:03
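Since which locale names are installed varies between machines, it can help to try several spellings and fall back gracefully instead of failing with "unsupported locale setting". The following is a minimal sketch; try_set_locale is a hypothetical helper and the candidate list is an assumption, with 'C' included as a last resort because it is always available.

```python
import locale
from datetime import datetime


def try_set_locale(candidates, category=locale.LC_TIME):
    """Return the first locale name that setlocale accepts, or None."""
    for name in candidates:
        try:
            locale.setlocale(category, name)
            return name
        except locale.Error:
            continue  # not installed on this machine, try the next one
    return None


# Assumed spellings of the German locale, followed by a safe fallback.
chosen = try_set_locale(['de_DE.UTF-8', 'de_DE', 'german', 'C'])
print('locale in use:', chosen)

# Day and month names follow whichever locale was accepted.
print(datetime(2021, 12, 1).strftime('%A, %d. %B %Y'))
```

On a fresh Colab runtime this would be expected to fall through to 'C' until the German locale has been installed and the runtime restarted.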

How to write Chinese characters to file based on unicode code point in Python3

I am trying to write Chinese characters to a CSV file based on their Unicode code points found in a text file in unicode.org/Public/zipped/13.0.0/Unihan.zip. For instance, one example character is U+9109.
In the example below I can get the correct output by hard coding the value (line 8), but keep getting it wrong with every permutation I've tried at generating the bytes from the code point (lines 14-16).
I'm running this in Python 3.8.3 on a Debian-based Linux distro.
Minimal working (broken) example:
1 #!/usr/bin/env python3
2
3 def main():
4
5 output = open("test.csv", "wb")
6
7 # Hardcoded values work just fine
8 output.write('\u9109'.encode("utf-8"))
9
10 # Comma separation
11 output.write(','.encode("utf-8"))
12
13 # Problem is here
14 codepoint = '9109'
15 u_str = '\\' + 'u' + codepoint
16 output.write(u_str.encode("utf-8"))
17
18 # End with newline
19 output.write('\n'.encode("utf-8"))
20
21 output.close()
22
23 if __name__ == "__main__":
24 main()
Executing and viewing results:
example $
example $./test.py
example $
example $cat test.csv
鄉,\u9109
example $
The expected output would look like this (Chinese character occurring on both sides of the comma):
example $
example $./test.py
example $cat test.csv
鄉,鄉
example $
chr is used to convert integers to code points in Python 3. Your code could use:
output.write(chr(0x9109).encode("utf-8"))
But if you specify the encoding in the open instead of using binary mode you don't have to manually encode everything. print to a file handles newlines for you as well.
with open("test.txt",'w',encoding='utf-8') as output:
    for i in range(0x4e00,0x4e10):
        print(f'U+{i:04X} {chr(i)}',file=output)
Output:
U+4E00 一
U+4E01 丁
U+4E02 丂
U+4E03 七
U+4E04 丄
U+4E05 丅
U+4E06 丆
U+4E07 万
U+4E08 丈
U+4E09 三
U+4E0A 上
U+4E0B 下
U+4E0C 丌
U+4E0D 不
U+4E0E 与
U+4E0F 丏
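To tie this back to the question, where the code point arrives as a string such as '9109', the string can be parsed as a hexadecimal integer and passed to chr. This is a minimal sketch: codepoint_to_char is a hypothetical helper, and it writes the CSV row to an in-memory buffer rather than test.csv so that it stays self-contained.

```python
import csv
import io


def codepoint_to_char(codepoint):
    """Convert a hex code point string such as '9109' or 'U+9109' to a character."""
    hex_digits = codepoint.upper().replace('U+', '')
    return chr(int(hex_digits, 16))


buffer = io.StringIO()  # stand-in for open("test.csv", "w", encoding="utf-8", newline="")
writer = csv.writer(buffer, lineterminator='\n')

# Hard-coded escape on the left, character built from the string on the right.
writer.writerow(['\u9109', codepoint_to_char('9109')])
csv_text = buffer.getvalue()
print(csv_text, end='')
```

The escape '\u9109' is resolved when Python parses the source code, which is why concatenating '\\' + 'u' + codepoint at run time only produces the six literal characters.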

How do I change the line endings used by PExpect output

The output returned by pexpect.run() includes \r\n at the end of every line. Printing to the terminal with print(returnVal.decode()) correctly prints one line for each line returned. When I examine the output, I see that the byte string contains \r\n, and when I log it to a file I get double line breaks in the log file. I'm on a Mac using Python 3.7. Is there a way to set the preferred newline when writing the output? I am using Python's logging module and its info() method to write the string. The output looks like this:
total 80

-rw-r--r-- 1 xxxx admin 1048 Nov 12 00:41 Constants.py

-rw-r--r-- 1 xxxx admin 5830 Nov 12 13:33 file1.py

-rw-r--r-- 1 xxxx admin 2255 Nov 12 00:51 file2.py
When it should look like:
total 80
-rw-r--r-- 1 xxxx admin 1048 Nov 12 00:41 Constants.py
-rw-r--r-- 1 xxxx admin 5830 Nov 12 13:33 file1.py
-rw-r--r-- 1 xxxx admin 2255 Nov 12 00:51 file2.py
Here is a simplified version of my original Logger class:
import logging

class Logger():
    def __init__( self, path ):
        msgFormat = '%(asctime)s.%(msecs)d\t%(message)s'
        dateFormat = '%m/%d/%Y %H:%M:%S'
        logging.basicConfig( format=msgFormat, datefmt=dateFormat, filename=path, level=logging.INFO )

    def Log ( self, theStr ):
        logging.info( str( theStr ))
The string being returned from Pexpect looks something like:
Line1\r\nLine2
It's advisable to normalize the newlines before sending the text to the logger. However, if you must override the newline parameter that the logging module's FileHandler uses, you can do so, as an experiment, by monkey patching its _open method, since this option isn't available by default.
I used the Python 3.8 source code to get the definition of the _open function.
import logging

def custom_open(self):
    """
    Monkey patched _open function of class logging.FileHandler (Python 3.8)
    """
    return open(self.baseFilename, self.mode, encoding=self.encoding, newline='')

logging.FileHandler._open = custom_open

if __name__ == "__main__":
    pexpect_return = "Output\nTest"
    my_log = logging.getLogger("test_logger")
    my_log.setLevel(logging.INFO)
    my_log.addHandler(logging.FileHandler("test.log"))
    my_log.info(pexpect_return)
How it works
Python's logging module has a class FileHandler, which uses a method _open to create the file object it uses to write and append to log files on disk. Its default implementation as of version 3.8 does not pass the newline parameter, so it uses the default newline handling.
Monkey patching is when you replace or update a method or function in one of your imported classes while the program is running. The line logging.FileHandler._open = custom_open tells Python to replace the _open method of the FileHandler class with my custom_open function. Later, when I call my_log.addHandler(logging.FileHandler("test.log")), the new custom_open method is used to open the file with the newline parameter.
You can further confirm that the new method is used to open the file by adding a suffix to the file name like this:
return open(self.baseFilename+"__Monkey_Patched", self.mode, encoding=self.encoding, newline='')
If you now run that demo code, the file name will be "test.log__Monkey_Patched".
This code, however, will not replace any newline characters which you pass to the logger as part of the string to log. You need to process that beforehand.
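That preprocessing step can be as small as the sketch below, which decodes the bytes returned by pexpect and converts \r\n (and any bare \r) to \n before the text reaches the logger. The helper name normalize_newlines is hypothetical, and the sample bytes stand in for real pexpect.run() output.

```python
import logging


def normalize_newlines(raw):
    """Decode bytes if needed and convert CRLF and bare CR to LF."""
    text = raw.decode() if isinstance(raw, bytes) else raw
    return text.replace("\r\n", "\n").replace("\r", "\n")


# Stand-in for the value returned by pexpect.run().
pexpect_return = b"Line1\r\nLine2"
cleaned = normalize_newlines(pexpect_return)

# The cleaned text can now be logged without producing doubled line breaks.
logging.getLogger("test_logger").info(cleaned)
print(repr(cleaned))
```

With the input normalized this way, the FileHandler monkey patch is not needed for this particular problem.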
