debugging a python script taking input from sys.stdin in pycharm - python-3.x

I want to debug a small python script that takes input from stdin and sends it to stdout. Used like this:
filter.py < in.txt > out.txt
There does not seem to be a way to configure PyCharm debugging to pipe input from my test data file.
This question has been asked before, and the answer has been, basically "you can't--rewrite the script to read from a file."
I modified the code to take a file, more or less doubling the code size, with this:
import argparse

if __name__ == '__main__':
    cmd_parser = argparse.ArgumentParser()
    cmd_parser.add_argument('path', nargs='?', default='/dev/stdin')
    args = cmd_parser.parse_args()
    with open(args.path) as f:
        filter(f)
where filter() now takes a file object open for reading as a parameter. This permits backward compatibility so it can be used as above, while I am also able to invoke it under the debugger with input from a file.
I consider this an ugly solution. Is there a cleaner alternative? Perhaps something that leaves the ugliness in a separate file?

If you want something simpler, you can forgo argparse entirely and just use the sys.argv list to get the first argument.
import sys

if len(sys.argv) > 1:
    # A filename was supplied: open it for reading.
    with open(sys.argv[1]) as f:
        filter(f)
else:
    # No argument: fall back to standard input.
    filter(sys.stdin)
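If the goal is mainly to keep the ugliness out of filter.py itself, another option is a tiny wrapper used only for debugging. This is a minimal sketch (the names debug_filter.py and in.txt are illustrative): it swaps sys.stdin for the test file and then runs the unmodified script.
# debug_filter.py -- run this file under the PyCharm debugger instead of filter.py
import runpy
import sys

with open('in.txt') as test_input:
    sys.stdin = test_input
    # Execute filter.py as if it were the main script, with stdin redirected.
    runpy.run_path('filter.py', run_name='__main__')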

Related

Split stdout between the terminal and variable

Consider the following function
import time

def foo():
    for i in range(5):
        print(f"{i}. Hello world!")
        time.sleep(1)
I would like to save all these print calls in a variable without preventing them from reaching the terminal in real time. Essentially, print would output to stdout and a variable.
I have tried:
from contextlib import redirect_stdout
import io

stdout = io.StringIO()
with redirect_stdout(stdout):
    foo()

stdout_content = stdout.getvalue()
print(stdout_content)
However, this blocks printing to the terminal until foo returns.
I would like foo to keep printing to the terminal in real time while an object stores the calls.
How can this be achieved?
One approach is to provide your own file-like object to redirect_stdout. Your class implements write() by writing both to the in-memory buffer and to the original sys.stdout.
You can read about sys.stdout in the Python docs.
The io module documentation has solid examples of the various file-like classes.
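A minimal sketch of that idea, assuming foo() from the question is in scope (the Tee class name is just illustrative):
import io
import sys
from contextlib import redirect_stdout

class Tee:
    # File-like object that writes everything to a buffer and to the real terminal.
    def __init__(self, buffer, terminal):
        self.buffer = buffer
        self.terminal = terminal

    def write(self, text):
        self.buffer.write(text)
        self.terminal.write(text)
        self.terminal.flush()  # keep the terminal output real-time

    def flush(self):
        self.buffer.flush()
        self.terminal.flush()

captured = io.StringIO()
with redirect_stdout(Tee(captured, sys.__stdout__)):
    foo()  # prints to the terminal in real time

stdout_content = captured.getvalue()  # the same text is also stored here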

Automate python argparse script with pre-generated inputs

I have a program which takes folder paths and other inputs through the command line with argparse. I want this script to run automatically on a server, but I also want to keep its argparse functionality in case I want to run the script manually. Is there a way to have the script use pre-generated inputs from a file but also retain its flag based input system with argparse? Here is my current implementation:
parser = argparse.ArgumentParser(description='runs batch workflow on root directory')
parser.add_argument("--root", type=str, default='./',
                    help="the path to the root directory to process")
parser.add_argument("--data", type=str, default='MS', help="The type of data to calculate")
args = parser.parse_args()
root_dir = args.root
option = args.data
I'm pretty new to this, and neither the argparse documentation nor this Stack Overflow question is really what I want. If possible I would like to keep the root and data flags, and not just replace them with an input file or stdin.
If using argparse, the default keyword argument is a good, standard way to approach the problem; it embeds the default behavior of the program in the script source rather than an external configuration file. However, if you have multiple configuration files that you want to deploy differently, the approach you mentioned (pre-generated inputs from a file) is desirable.
argparse to dictionary
The argparse namespace can be converted to a dictionary. This is convenient because we can write a function that accepts keyword arguments and drives the program through a single convenient signature. File parsers can just as easily load dictionaries and call the same function. The Python json module is used as the example here; of course, others can be used.
Example Python
def main(arg1=None, arg2=None, arg3=None):
    print(f"{arg1}, {arg2}, {arg3}")

if __name__ == "__main__":
    import sys
    import json
    import argparse

    # script called with nothing -- load default
    if len(sys.argv) == 1:
        with open("default.json", "r") as dfp:
            conf = json.load(dfp)
        main(**conf)
    else:  # parse arguments
        parser = argparse.ArgumentParser()
        parser.add_argument('-a1', dest='arg1', metavar='arg1', type=str)
        parser.add_argument('-a2', dest='arg2', metavar='arg2', type=str)
        parser.add_argument('-a3', dest='arg3', metavar='arg3', type=str)
        args = parser.parse_args()
        conf = vars(args)
        main(**conf)
default.json
{
    "arg1" : "str1",
    "arg2" : "str2",
    "arg3" : "str3"
}
Using Fire
The Python Fire module can also be used to make this more convenient. It offers several ways to interact with the script with minimal effort. The GitHub repo is available here.
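For reference, a minimal sketch of the Fire variant (it assumes the third-party fire package is installed with pip install fire; the flag names mirror the argparse example above):
import fire

def main(arg1=None, arg2=None, arg3=None):
    print(f"{arg1}, {arg2}, {arg3}")

if __name__ == "__main__":
    # Exposes main() on the command line, e.g.
    #   python script.py --arg1 str1 --arg2 str2 --arg3 str3
    fire.Fire(main)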

Printing from other thread when waiting for input()

I am trying to write a shell that needs to run socket connections on a separate thread. In my tests, when print() is used while cmd.Cmd.cmdloop() is waiting for input, the output is displayed wrongly.
from core.shell import Shell
import time
import threading

def test(shell):
    time.sleep(2)
    shell.write('Doing test')

if __name__ == '__main__':
    shell = Shell(None, None)
    testThrd = threading.Thread(target=test, args=(shell,))
    testThrd.start()
    shell.cmdloop()
When the above command runs, here is what happens:
python test.py
Welcome to Test shell. Type help or ? to list commands.
>>asd
*** Unknown syntax: asd
>>[17:59:25] Doing test
As you can see, printing from another thread adds output after the prompt >>, not on a new line. How can I make it appear on a new line, with the prompt shown again after it?
What you can do is redirect stdout from your core.shell.Shell to a file-like object such as StringIO. You would also redirect the output from your thread into a different file-like object.
Now, you can have some third thread read both of these objects and print them out in whatever fashion you want.
You said core.shell.Shell inherits from cmd.Cmd, which allows redirection as a parameter to the constructor:
import io
import time
import threading
from core.shell import Shell

def test(output_obj):
    time.sleep(2)
    print('Doing test', file=output_obj)

cmd_output = io.StringIO()
thr_output = io.StringIO()

shell = Shell(stdout=cmd_output)
testThrd = threading.Thread(target=test, args=(thr_output,))
testThrd.start()

# in some other process/thread
cmd_line = cmd_output.readline()
thr_line = thr_output.readline()
That's quite difficult. Both your threads share the same stdout, so the output from each of them is sent concurrently to the stdout buffer, where it is printed in some arbitrary order.
What you need to do is coordinate the output from both threads, and that's a tough nut to crack. Even bash doesn't do that!
That said, maybe you can try using a lock to make sure your threads access stdout in a controlled manner. Check out: http://effbot.org/zone/thread-synchronization.htm
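A minimal sketch of the lock idea (the helper name synced_print is illustrative): every thread writes to the terminal through the same lock, so whole lines are emitted without interleaving.
import sys
import threading

print_lock = threading.Lock()

def synced_print(*args, **kwargs):
    # All threads call this instead of print(), so each line is written
    # in one piece rather than interleaved with output from other threads.
    with print_lock:
        print(*args, **kwargs)
        sys.stdout.flush()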

Output data from subprocess command line by line

I am trying to read a large data file (millions of rows, in a very specific format) using a pre-built (in C) routine. I then want to yield the results, line by line, via a generator function.
I can read the file OK, but whereas just running:
<command> <filename>
directly in Linux will print the results line by line as it finds them, I've had no luck trying to replicate this within my generator function. It seems to output the entire lot as a single string that I need to split on newlines, and of course everything has to be read before I can yield line 1.
This code will read the file, no problem:
import subprocess
import config

file_cmd = '<command> <filename>'

for rec in subprocess.check_output([file_cmd], shell=True).decode(config.ENCODING).split('\n'):
    yield rec
(ENCODING is set in config.py to iso-8859-1 - it's a Swedish site)
The code I have works, in that it gives me the data, but in doing so, it tries to hold the whole lot in memory. I have larger files than this to process which are likely to blow the available memory, so this isn't an option.
I've played around with bufsize on Popen, but not had any success (and also, I can't decode or split after the Popen, though I guess the fact I need to split right now is actually my problem!).
I think I have this working now, so will answer my own question in the event somebody else is looking for this later ...
# Inside the generator function; shlex, subprocess and config are imported at module level.
proc = subprocess.Popen(shlex.split(file_cmd), stdout=subprocess.PIPE)
while True:
    output = proc.stdout.readline()
    if output == b'' and proc.poll() is not None:
        break
    if output:
        yield output.decode(config.ENCODING).strip()
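A slightly shorter variant of the same idea (a sketch, reusing file_cmd, shlex and config.ENCODING from above) is to iterate over proc.stdout directly inside the generator, which also reads one line at a time:
proc = subprocess.Popen(shlex.split(file_cmd), stdout=subprocess.PIPE)
for raw_line in proc.stdout:  # the pipe is read lazily, line by line
    yield raw_line.decode(config.ENCODING).strip()
proc.wait()  # reap the child once its output is exhausted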

Pass a Python input parameter to a Batch File

Is it at all possible to build a Python GUI (let's say using Tkinter) and then pass the user's input from the Python GUI into a Windows batch file?
My objective is to make batch files have a nice front end using Python.
Simple example:
In the Python code the user will be asked for a date
date = inputInt("Please enter Date yyyymmdd")
Now I need to put this date value into a windows batchfile.
When running the Python program you should use a pipe to redirect its stdout to the stdin of the batch file. In the batch file you can just wait on stdin until something is output by the Python program. Take a look here to see how to read an input stream in batch. It would look something like this:
python myprogram.py | batch_file.bat
I used the following code to send the data to a text file
import sys
startdate = input("Please Enter StartDate YYYYMMDD ")
orig_stdout = sys.stdout
f = open('startdate.txt', 'w')
sys.stdout = f
print(startdate)
sys.stdout = orig_stdout
f.close()
I then used the following in my batch file to read the text file contents
@echo off
set /p startdate=<startdate.txt
echo %startdate%
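If the temporary file feels roundabout, the value can also be handed to the batch file as a command-line argument. A minimal sketch (the batch file name is illustrative); inside the batch file the value arrives as %1:
import subprocess

startdate = input("Please Enter StartDate YYYYMMDD ")
# cmd /c runs the batch file; startdate is available inside it as %1
subprocess.run(["cmd", "/c", "batch_file.bat", startdate])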
