I made a function in Python 3 which takes the path to some txt files and returns a list containing the names of all the txt files.
here is the function:
import os

def files(path):
    folder = os.fsencode(path)
    filenames = []
    for file in os.listdir(folder):
        filename = os.fsdecode(file)
        if filename.endswith('.txt'):
            filenames.append(filename)
    filenames.sort()
    return filenames
To run this function I can do the following, which works perfectly:
if __name__ == "__main__":
    path = '/home/final_test'
    file_list = files(path)
    print(file_list)
But the problem starts here: I am trying to make the script runnable from the command line using argparse. To do so I added the following code at the end of the script, but it does not return anything. Do you know how to fix it?
def main():
    ap = argparse.ArgumentParser(description="")
    ap.add_argument('-P', '--path', required=True)
    ap.add_argument('-o', '--outlist', required=True)
    args = ap.parse_args()
    file_list = files(path)
    return file_list

if __name__ == "__main__":
    from signal import signal, SIGPIPE, SIG_DFL
    signal(SIGPIPE, SIG_DFL)
    try:
        main()
    except IOError as e:
        if e.errno != 32:
            raise
    except KeyboardInterrupt as e:
        pass
Your main() function does compute and return file_list, but the call site main() discards the return value, so nothing is ever printed. There is also a second bug: files(path) references an undefined name; it should be files(args.path). Either print file_list inside main(), or capture the return value outside main() and print it there.
That said, I don't see why you need the signal lines.
You should print your result:
result = main()
print(result)
or
print(main())
Try this:
def main():
    ap = argparse.ArgumentParser(description="")
    ap.add_argument('-P', '--path', required=True)
    # ap.add_argument('-o', '--outlist', required=True)
    args = ap.parse_args()
    print(args)
    file_list = files(args.path)
    return file_list

if __name__ == "__main__":
    filelist = main()
    print(filelist)
I corrected the if __name__ indenting; I print the return value from main; I set the path correctly via args.path. And for debugging purposes I added print(args).
I inherited some code written in Python 2.7 that I updated to 3.7. However, a piece of the logging broke in the process. The code uses a wrapper script (A) to invoke the main script (B) with subprocess.Popen(). Script B uses multiprocessing.Pool(). Activity in the Pool() used to write to the logfile; after switching to Python 3, it does not. No portion of this code was modified in the switch to 3.7. I tried adding sys.stdout = open(logfile, 'w') to script A per this thread, with no luck either.
Script A:
def runTool(path, args):
    env = pathToPythonExe
    logfile = getLogFile()  # other function to create text file
    proc = subprocess.Popen(scriptB, env=env, shell=False, stdout=logfile, stderr=logfile)
    out = proc.communicate(None)
    if out is not None and len(out) > 0:
        print(out)
Script B:
def processMyData(arg1, arg2, data_dictionary):
    index = str(data_dictionary[1][2])
    print('Processing item: ' + index)  # No longer logged at python3
    doWork

def mainFunction():
    print('This message gets logged successfully')
    pool = multiprocessing.Pool(processes=4, maxtasksperchild=1)
    pool.map(partial(processMyData, arg1, arg2), data_dictionary.items(), chunksize=1)  # Prints in here do not get logged anymore
    pool.close()
    pool.join()
I found an incredibly inelegant solution by modifying script B.
import glob
import os
import sys

fileList = glob.glob(r'C:\Temp\*.log')
newest_file = max(fileList, key=os.path.getctime)
sys.stdout = open(newest_file, 'a+')
...

def processMyData(arg1, arg2, data_dictionary):
    index = str(data_dictionary[1][2])
    print('Processing item: ' + index)
    sys.stdout.flush()  # Added line
    doWork
I'm creating a tool to execute the command host for multiple IP ranges.
These IP ranges are located in a file in the same folder as my script.
I can read the file with the IPs and I can create my threads, but the command runs with the literal text {i} instead of the loop value:
Host 10.123.204.{i} not found: 3(NXDOMAIN)
Here is my code:
import threading
from argparse import ArgumentParser
from itertools import product
import subprocess

def check_host(host: str):
    subprocess.run(["host", host])
    # status = 'up' if return_code == 0 else 'down'
    # print(f'{host} : is {status}')

def start_threads(addr_range):
    for addr in addr_range:
        t = threading.Thread(target=check_host, args=(addr,),
                             name=f'Thread:{addr}')
        t.start()
        yield t

def ping_network_range(net_class: str):
    myFile = open('../findRoute/ip.txt', 'r')
    net_class = net_class.upper()
    for line in myFile:
        if net_class == 'A':
            newLine = line + ''
            newLine = newLine[:-1]
            threads = list(start_threads(f''+newLine+'.{i}' for i in range(256)))  # here is the error
        elif net_class == 'B':  # TBD
            threads = list(start_threads(f'127.0.{i}.{j}'
                                         for i, j in product(range(256), range(256))))
        else:
            raise ValueError(f'Wrong network class name {net_class}')
        for t in threads:
            t.join()

if __name__ == "__main__":
    parser = ArgumentParser(description='Host network addresses by network class')
    parser.add_argument('-c', '--nclass', choices=('A', 'B'),
                        required=True, help='Choose class A or B')
    args = parser.parse_args()
    ping_network_range(args.nclass)
You aren't making proper use of f-strings: you made the first, empty string an f-string, but not the string you actually want to format.
Instead of:
f''+newLine+'.{i}'
You meant:
f'{newLine}.{i}'
Note how it uses one string (not multiple concatenated), and everything that should be substituted is in {}s.
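A quick way to see the difference:

```python
newLine = "10.123.204"
i = 7

# Only the empty string is an f-string, so the braces in the
# second literal are never substituted:
print(f'' + newLine + '.{i}')   # 10.123.204.{i}

# One f-string with the variable inside the braces:
print(f'{newLine}.{i}')         # 10.123.204.7
```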
I've written some Python code that needs to read a config file at /etc/myapp/config.conf. I want to write a unit test for what happens if that file isn't there, or contains bad values, the usual stuff. Let's say it looks like this:
""" myapp.py
"""
def readconf()
""" Returns string of values read from file
"""
s = ''
with open('/etc/myapp/config.conf', 'r') as f:
s = f.read()
return s
And then I have other code that parses s for its values.
Can I, through some magic Python functionality, make any calls that readconf makes to open redirect to custom locations that I set as part of my test environment?
Example would be:
main.py
def _open_file(path):
    with open(path, 'r') as f:
        return f.read()

def foo():
    return _open_file("/sys/conf")
test.py
from unittest.mock import patch
from main import foo

def test_when_file_not_found():
    with patch('main._open_file') as mopen_file:
        # Set up the mock to raise the error you want
        mopen_file.side_effect = FileNotFoundError()
        # Run the actual function (assumes foo() catches the
        # error and returns a message)
        result = foo()
        # Assert the result is what you expect
        assert result == "Sorry, missing file"
Instead of hard-coding the config file path, you can externalize or parameterize it. There are two ways to do it:
Environment variables: use a $CONFIG environment variable that contains the location of the config file. Tests can set it via os.environ['CONFIG'] before the code under test reads it.
CLI params: initialize the module with command-line params. For tests, you can set sys.argv and let the config property be picked up from that.
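A minimal sketch of the environment-variable approach (the variable name CONFIG and the default path follow the suggestion above):

```python
import os

def readconf():
    # Fall back to the packaged default when $CONFIG is not set;
    # a test points CONFIG at a fixture file instead.
    path = os.environ.get("CONFIG", "/etc/myapp/config.conf")
    with open(path) as f:
        return f.read()
```

A test can then do os.environ['CONFIG'] = '/tmp/fixture.conf' before calling readconf(), with no mocking involved.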
In order to mock just calls to open in your function, while not replacing the call with a helper function, as in Nf4r's answer, you can use a custom patch context manager:
from contextlib import contextmanager
from types import CodeType

@contextmanager
def patch_call(func, call, replacement):
    fn_code = func.__code__
    try:
        # Note: CodeType's signature changed in Python 3.8
        # (co_posonlyargcount was added); this matches 3.7.
        func.__code__ = CodeType(
            fn_code.co_argcount,
            fn_code.co_kwonlyargcount,
            fn_code.co_nlocals,
            fn_code.co_stacksize,
            fn_code.co_flags,
            fn_code.co_code,
            fn_code.co_consts,
            tuple(
                replacement if call == name else name
                for name in fn_code.co_names
            ),
            fn_code.co_varnames,
            fn_code.co_filename,
            fn_code.co_name,
            fn_code.co_firstlineno,
            fn_code.co_lnotab,
            fn_code.co_freevars,
            fn_code.co_cellvars,
        )
        yield
    finally:
        func.__code__ = fn_code
Now you can patch your function:
def patched_open(*args):
    raise FileNotFoundError

with patch_call(readconf, "open", "patched_open"):
    ...
You can use mock to patch a module's instance of the 'open' built-in to redirect to a custom function.
""" myapp.py
"""
def readconf():
s = ''
with open('./config.conf', 'r') as f:
s = f.read()
return s
""" test_myapp.py
"""
import unittest
from unittest import mock
import myapp
def my_open(path, mode):
return open('asdf', mode)
class TestSystem(unittest.TestCase):
#mock.patch('myapp.open', my_open)
def test_config_not_found(self):
try:
result = myapp.readconf()
assert(False)
except FileNotFoundError as e:
assert(True)
if __name__ == '__main__':
unittest.main()
You could also do it with a lambda like this, if you wanted to avoid declaring another function.
@mock.patch('myapp.open', lambda path, mode: open('asdf', mode))
def test_config_not_found(self):
    ...
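On Python 3 you can also skip the dummy file and have the mock raise the error directly. A sketch, standalone rather than the myapp module above, patching the open built-in with a side_effect:

```python
from unittest import mock

def readconf():
    with open('./config.conf', 'r') as f:
        return f.read()

# side_effect makes every call to open() raise; no disk access happens.
with mock.patch('builtins.open', side_effect=FileNotFoundError):
    try:
        readconf()
        raised = False
    except FileNotFoundError:
        raised = True

print(raised)  # True
```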
I'm parsing the last line of a continuously updating log file. If it matches, I want to return the match to a list and start another function using that data. I need to keep watching for new entries and parse them even while the new function continues.
I've been working this from a few different angles for about a week with varying success. I tried threading, but ran into issues getting the return value, I tried using a global var but couldn't get it working. I'm now trying asyncio, but having even more issues getting that to work.
def tail():
    global match_list
    f.seek(0, os.SEEK_END)
    while True:
        line = f.readline()
        if not line:
            time.sleep(0.1)
            continue
        yield line

def thread():
    while True:
        tail()

def somefun(list):
    global match_list
    # do things here
    pass

def main():
    match_list = []
    f = open(r'file.txt')
    thread = threading.Thread(target=thread, args=(f,))
    thread.start()
    while True:
        if len(match_list) >= 1:
            somefun(match_list)

if __name__ == '__main__':
    main()
Wrote the above from memory...
I want tail() to return each line to a list that somefun() can use.
I'm having issues getting it to work; I'll use threading or asyncio, anything to get it running at this point.
In asyncio you might use two coroutines, one that reads from the file and the other that processes it. Since they communicate using a queue, they don't need a global variable. For example:
import os, asyncio

async def tail(f, queue):
    f.seek(0, os.SEEK_END)
    while True:
        line = f.readline()
        if not line:
            await asyncio.sleep(0.1)
            continue
        await queue.put(line)

async def consume(queue):
    lines = []
    while True:
        next_line = await queue.get()
        lines.append(next_line)
        # it is not clear if you want somefun to receive the next
        # line or *all* lines, but it's easy to do either
        somefun(next_line)

def somefun(line):
    # do something with line
    print(f'line: {line!r}')

async def main():
    queue = asyncio.Queue()
    with open('file.txt') as f:
        await asyncio.gather(tail(f, queue), consume(queue))

if __name__ == '__main__':
    asyncio.run(main())
    # or, on Python older than 3.7:
    # asyncio.get_event_loop().run_until_complete(main())
The beauty of an asyncio-based solution is that you can easily start an arbitrary number of such coroutines in parallel (e.g. you could start gather(main1(), main2()) in an outer coroutine, and run that), and have them all share the same thread.
With a few small fixes you can almost run this :) (comments inline)
match_list = []  # should be at the module scope

def tail(f):  # take the open file as a parameter
    f.seek(0, os.SEEK_END)
    while True:
        line = f.readline()
        if not line:
            time.sleep(0.1)
            continue
        yield line

def thread(f):
    for line in tail(f):
        match_list.append(line)  # append line
    print("thread DONE!")

def somefun(mlist):
    # do things here
    while mlist:
        line = mlist.pop(0)
        print(line)

def main():
    f = open(r'file.txt')
    t = threading.Thread(target=thread, args=(f,))
    t.start()
    while True:
        if match_list:
            somefun(match_list)
        time.sleep(0.1)  # <-- don't burn the CPU :)
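For the threading route, queue.Queue is the usual way to hand lines between threads without polling a shared list. A stripped-down sketch: tail is stubbed with a fixed list of lines, somefun is stood in by an .upper() call, and None is used as an end-of-stream sentinel.

```python
import threading
import queue

def tail_stub(lines, q):
    # Stand-in for the real tail(): push each new line onto the queue.
    for line in lines:
        q.put(line)
    q.put(None)  # sentinel: no more lines

def consume(q, out):
    # Blocks on q.get(), so no sleep/poll loop is needed.
    while True:
        line = q.get()
        if line is None:
            break
        out.append(line.upper())  # stand-in for somefun()

q = queue.Queue()
out = []
t = threading.Thread(target=tail_stub, args=(["a\n", "b\n"], q))
t.start()
consume(q, out)
t.join()
print(out)  # ['A\n', 'B\n']
```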
I am creating a Python script that will identify changes to a log file and print some data from the new logs.
I use watchdog to create an event handler and everything seems to work fine, except that I get duplicate events every time I modify the file. I checked creation and deletion; they both work as expected and trigger once.
I have read the similar question which explains getting a created and a modified event when saving a file, but this is not my case; I just get two modification events.
Here is my code:
import os, sys, time
import subprocess
import threading
import win32print
from tkinter import filedialog
from tkinter import *
from watchdog.observers import Observer
from watchdog.events import FileSystemEventHandler

class Handler(FileSystemEventHandler):
    # docstring for FileSystemEventHandler
    def __init__(self, observer, filename, dirname):
        # super(Handler, FileSystemEventHandler).__init__(self,)
        self.observer = observer
        self.filename = filename
        self.dirname = dirname
        print("Handler filename = ", self.filename)
        print("Handler dirname = ", self.dirname)

    def on_modified(self, event):
        if self.filename == event.src_path:
            print("The file was modified")
            print(event.src_path)
            # go get the last line and print the data
            # try:
            #     hJob = win32print.StartDocPrinter(hPrinter, 1, ("test of raw data", None, "RAW"))
            #     try:
            #         win32print.StartPagePrinter(hPrinter)
            #         win32print.WritePrinter(hPrinter, raw_data)
            #         win32print.EndPagePrinter(hPrinter)
            #     finally:
            #         win32print.EndDocPrinter(hPrinter)
            # finally:
            #     win32print.ClosePrinter(hPrinter)

    def on_created(self, event):
        print("A file was created (", event.src_path, ")")

    def on_deleted(self, event):
        print("A file was deleted (", event.src_path, ")")

if __name__ == "__main__":
    Flags = 2
    Name = None
    Level = 1
    printers = win32print.EnumPrinters(Flags, Name, Level)
    print("\nChoose a printer to use:")
    i = 1
    for p in printers:
        print(i, ')', p[2])
        i = i + 1
    if sys.version_info >= (3,):
        raw_data = bytes("This is a test", "utf-8")
    else:
        raw_data = "This is a test"
    printer = int(input())
    printer_name = printers[printer-1][2]  # win32print.GetDefaultPrinter()
    print("You chose ", printer_name, "\nI will now print from the specified file with this printer")
    hPrinter = win32print.OpenPrinter(printer_name)
    # root = Tk()
    # root.filename = filedialog.askopenfilename(initialdir="/Desktop", title="Select file", filetypes=(("log files", "*.log"), ("all files", "*.*")))
    file_path = "some_file_path"  # root.filename
    file_directory = os.path.dirname(file_path)
    # print(file_path)
    print(file_directory)
    observer = Observer()
    event_handler = Handler(observer, file_path, file_directory)
    observer.schedule(event_handler, path=file_directory, recursive=False)
    observer.start()
    observer.join()
Any ideas would be appreciated.
EDIT:
After some debugging I found out that Windows 10 is changing the file modification time twice every time I save it.
The proof of concept code is this:
prev_modification_time = os.path.getmtime(file_path)
while True:
    current_mod_time = os.path.getmtime(file_path)
    if prev_modification_time != current_mod_time:
        print("the file was modified, last modification time is: ", current_mod_time)
        prev_modification_time = current_mod_time
    pass
Final edit:
After testing my code on Linux (Debian Stretch, to be exact) it worked like a charm. So this, combined with the previous edit, probably shows that watchdog works fine and it is Windows 10 that has the issue. Should I post it as a different question or here?
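If the duplicate events turn out to be unavoidable on Windows 10, a common workaround is to debounce them: remember when each path was last handled and drop a second event arriving within a short window. A sketch of the idea (the 0.5 s window is an arbitrary choice, not something from watchdog itself):

```python
import time

class Debouncer:
    """Suppress repeat events for the same path within `window` seconds."""
    def __init__(self, window=0.5):
        self.window = window
        self._last = {}  # path -> time of last seen event

    def should_handle(self, path):
        now = time.monotonic()
        last = self._last.get(path)
        self._last[path] = now
        # Handle the first event for a path, or any event that arrives
        # after the quiet window has elapsed.
        return last is None or (now - last) >= self.window
```

In on_modified you would then start with: if not self.debouncer.should_handle(event.src_path): return.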