I want to call a Python script from bash in these two ways without any error:
./arg.py
./arg.py TEST
That means the parameter (here with the value TEST) should be optional.
With argparse I only know how to create optional parameters when they have a switch (like --name).
Is there a way to do this?
#!/usr/bin/env python3
import sys
import argparse

parser = argparse.ArgumentParser(description=__file__)

# required positional -- fails when the script is called without an argument
#parser.add_argument('name', metavar='NAME', type=str)

# optional, but only with a switch (--name), which I don't want
#parser.add_argument('--name', metavar='NAME', type=str)

# store all arguments as variables in the local namespace
locals().update(vars(parser.parse_args()))

print(name)
sys.exit()
I think all you need is nargs='?'.
parser = argparse.ArgumentParser(description=__file__)
parser.add_argument('name', nargs='?', default='mydefault')
args = parser.parse_args()
I'd expect args to be either:
Namespace(name='mydefault')
Namespace(name='TEST')
depending on whether you call the script with or without an argument.
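Applied to the script from the question, a minimal sketch could look like this (the default value 'mydefault' is just a placeholder):

#!/usr/bin/env python3
import argparse

parser = argparse.ArgumentParser(description=__file__)
# positional, but optional: nargs='?' means "zero or one value"
parser.add_argument('name', metavar='NAME', nargs='?', default='mydefault')
args = parser.parse_args()

print(args.name)

Called as ./arg.py it prints mydefault; called as ./arg.py TEST it prints TEST.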
My case is a little specific: I'm trying to run one Python program from another Python program for testing purposes. The setup is as follows:
# file1.py
print("Hello world")
# file1.test.py
import io
import sys
import os
import unittest

EXPECTED_OUTPUT = "Hello world"

class TestHello(unittest.TestCase):
    def test_hello(self):
        sio = io.StringIO()
        sys.stdout = sio
        os.system("python3 path/to/file1.py")
        sys.stdout = sys.__stdout__
        print("captured value:", sio.getvalue())
        self.assertEqual(sio.getvalue(), EXPECTED_OUTPUT)

if __name__ == "__main__":
    unittest.main()
But nothing ends up in the sio variable. This approach and similar ones are suggested online, but they don't seem to work for me. My Python version is 3.8.10, but if this works better in some other version, I can switch to that.
Note: I know that if I were using an importable object this might be easier, but right now I need to know how to capture the output of another file.
Thanks!
stdout redirection does not work like this: assigning to sys.stdout only changes the stdout variable inside your Python process. By using os.system you are running another process, and that process reuses the same terminal pseudo-files your parent process is using.
If you want to capture a subprocess's output, use the subprocess module's calls, which allow you to redirect the subprocess output: https://docs.python.org/3/library/subprocess.html
Also, the subprocess won't be able to use a StringIO object from the parent process (it is not an OS-level object, just an in-process Python object with a write method). The docs above include instructions on using the special object subprocess.PIPE, which allows for in-memory communication, or you can just pass an ordinary filesystem file, which you can read afterwards.
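As a rough sketch of that approach applied to the test above (subprocess.run with capture_output needs Python 3.7+, so it works on the 3.8.10 mentioned; the path to file1.py is taken from the question):

# file1.test.py -- sketch using subprocess instead of os.system
import subprocess
import unittest

EXPECTED_OUTPUT = "Hello world"

class TestHello(unittest.TestCase):
    def test_hello(self):
        # capture_output=True wires the child's stdout/stderr to pipes we can read
        result = subprocess.run(
            ["python3", "path/to/file1.py"],
            capture_output=True,
            text=True,
        )
        print("captured value:", result.stdout)
        # strip the trailing newline that print() in file1.py adds
        self.assertEqual(result.stdout.strip(), EXPECTED_OUTPUT)

if __name__ == "__main__":
    unittest.main()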
I'm a bit confused about how global variables work. I have a large project with around 50 files, and I need to define global variables that all those files can use.
What I did was define them in my project's main.py file, as follows:
# ../myproject/main.py
# Define global myList
global myList
myList = []
# Imports
import subfile
# Do something
subfile.stuff()
print(myList[0])
I'm trying to use myList in subfile.py, as follows:
# ../myproject/subfile.py

# Save "hey" into myList
def stuff():
    globals()["myList"].append("hey")
Another way I tried, which didn't work either:
# ../myproject/main.py
# Import globfile
import globfile
# Save myList into globfile
globfile.myList = []
# Import subfile
import subfile
# Do something
subfile.stuff()
print(globfile.myList[0])
And inside subfile.py I had this:
# ../myproject/subfile.py
# Import globfile
import globfile
# Save "hey" into myList
def stuff():
    globfile.myList.append("hey")
But again, it didn't work. How should I implement this? I understand that it cannot work like that when the two files don't really know each other (well, subfile doesn't know main), but I can't think of how to do it without writing to a file or using pickle, which I don't want to do.
The problem is that you defined myList in main.py, but subfile.py needs to use it. Here is a clean way to solve this: move all globals into one file; I call this file settings.py. It is responsible for defining the globals and initializing them:
# settings.py
def init():
    global myList
    myList = []
Next, your subfile can import globals:
# subfile.py
import settings
def stuff():
    settings.myList.append('hey')
Note that subfile does not call init(); that task belongs to main.py:
# main.py
import settings
import subfile
settings.init() # Call only once
subfile.stuff() # Do stuff with global var
print(settings.myList[0])  # Check the result
This way, you achieve your objective while avoiding initializing the global variables more than once.
See Python's document on sharing global variables across modules:
The canonical way to share information across modules within a single program is to create a special module (often called config or cfg).
config.py:
x = 0 # Default value of the 'x' configuration setting
Import the config module in all modules of your application; the module then becomes available as a global name.
main.py:
import config
print(config.x)
In general, don’t use from modulename import *. Doing so clutters the importer’s namespace, and makes it much harder for linters to detect undefined names.
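For illustration (the module name mod.py below is not from the quoted docs, just an example), any other module that imports config can read or rebind the shared value:

# mod.py
import config

config.x = 1   # rebinding config.x is seen by every module that did "import config"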
You can think of Python global variables as "module" variables - and as such they are much more useful than the traditional "global variables" from C.
A global variable is actually defined in a module's __dict__ and can be accessed from outside that module as a module attribute.
So, in your example:
# ../myproject/main.py
# Define global myList
# global myList -- there is no need for a "global" declaration at module level;
#                  it is only used inside functions and methods
myList = []
# Imports
import subfile
# Do something
subfile.stuff()
print(myList[0])
And:
# ../myproject/subfile.py
# Save "hey" into myList
def stuff():
    # You have to make the module main available to the code here.
    # Placing the import inside the function body will usually avoid
    # import cycles - unless you happen to call this function from
    # either main's or subfile's module body (i.e. not from inside a function or method).
    import main
    main.myList.append("hey")
Using from your_file import * should fix your problems. It makes everything in that file globally available (with the exception of names that are local to functions, of course).
For example:
# test.py
from pytest import *
print(hello_world)
and:
# pytest.py
hello_world = "hello world!"
Hai Vu's answer works great; just one comment:
If you are using the global in another module and you want to set it dynamically, make sure you import the other modules after you set the global variables, for example:
# settings.py
def init(arg):
    global myList
    myList = []
    myList.append(arg)
# subfile.py
import settings
def print_value():
    print(settings.myList[0])
# main.py
import settings
settings.init("1st") # global init before used in other imported modules
# Or else they will be undefined
import subfile
subfile.print_value() # global usage
Your 2nd attempt will work perfectly, and is actually a really good way to handle variable names that you want to have available globally. But you have a name error in the last line. Here is how it should be:
# ../myproject/main.py
# Import globfile
import globfile
# Save myList into globfile
globfile.myList = []
# Import subfile
import subfile
# Do something
subfile.stuff()
print(globfile.myList[0])
See the last line? myList is an attribute of globfile, not subfile. This will work as you want.
I just came across this post and thought I'd post my solution, in case anyone is in the same situation as me: the program has quite a few files, and you don't have the time to think through the whole import sequence of your modules (if you didn't plan that properly right from the start, as I didn't).
In such cases, in the script where you initialize your global(s), simply define a class like this:
class My_Globals:
    def __init__(self):
        self.global1 = "initial_value_1"
        self.global2 = "initial_value_2"
        ...
and then, in the script where you initialized your globals, instead of
global1 = "initial_value_1"
use
my_globals = My_Globals()
I was then able to retrieve or change the values of any of these globals via
my_globals.desired_global
in any script, and these changes were automatically visible in all the other scripts using them. Everything now worked with the exact same import statements that previously failed, due to the problems mentioned in this discussion. The object's attributes can change dynamically without any need to rethink the import logic, in contrast to importing plain global variables, and that was definitely the quickest and easiest approach (for later access) to solve this kind of problem for me.
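A minimal two-module sketch of this idea; the file and attribute names below are placeholders rather than anything from the original post:

# shared.py
class My_Globals:
    def __init__(self):
        self.global1 = "initial_value_1"

my_globals = My_Globals()   # single shared instance, created once at import time

# consumer.py
import shared

def bump():
    # mutating attributes on the shared instance is visible to every module
    shared.my_globals.global1 = "changed_value"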
Based on the above answers and the links within them, I created a new module called global_variables.py:
#!/usr/bin/env python
# -*- coding: utf-8 -*-
# ==============================================================================
#
# global_variables.py - Global variables shared by all modules.
#
# ==============================================================================
USER = None # User ID, Name, GUID varies by platform
def init():
    """ This should only be called once by the main module.
        Child modules will inherit values. For example if they contain
            import global_variables as g
        later on they can reference 'g.USER' to get the user ID.
    """
    global USER
    import getpass
    USER = getpass.getuser()
# End of global_variables.py
Then in my main module I use this:
import global_variables as g
g.init()
In another child imported module I can use:
import global_variables as g
# hundreds of lines later....
print(g.USER)
I've only spent a few minutes testing this in two different multi-module Python programs, but so far it's working perfectly.
Namespace nightmares arise when you do from config import mySharedThing. That can't be stressed enough.
It's OK to use from in other places.
You can even have a config module that's totally empty.
# my_config.py
pass
# my_other_module.py
import my_config
def doSomething():
    print(my_config.mySharedThing.message)
# main.py
from dataclasses import dataclass
from my_other_module import doSomething
import my_config
@dataclass
class Thing:
    message: str
my_config.mySharedThing = Thing('Hey everybody!')
doSomething()
result:
$ python3 main.py
Hey everybody!
But using objects you pulled in with from will take you down a path of frustration.
# my_other_module.py
from my_config import mySharedThing
def doSomething():
    print(mySharedThing.message)
result:
$ python3 main.py
ImportError: cannot import name 'mySharedThing' from 'my_config' (my_config.py)
And maybe you'll try to fix it like this:
# my_config.py
mySharedThing = None
result:
$ python3 main.py
AttributeError: 'NoneType' object has no attribute 'message'
And then maybe you'll find this page and try to solve it by adding an init() method.
But the whole problem is the from.
I have a program which takes folder paths and other inputs through the command line with argparse. I want this script to run automatically on a server, but I also want to keep its argparse functionality in case I want to run it manually. Is there a way to have the script use pre-generated inputs from a file while also retaining its flag-based input system with argparse? Here is my current implementation:
parser = argparse.ArgumentParser(description='runs batch workflow on root directory')
parser.add_argument("--root", type=str, default='./', help="the path to the root directory
to process")
parser.add_argument("--data", type=str, default='MS', help="The type of data to calculate ")
args = parser.parse_args()
root_dir = args.root
option = args.data
I'm pretty new to this, and the argparse documentation and this Stack Overflow question are not really what I want. If possible, I would like to keep the root and data flags and not just replace them with an input file or stdin.
If using argparse, the default keyword argument is a good, standard way to approach the problem; embed the default behavior of the program in the script source, not an external configuration file. However, if you have multiple configuration files that you want to deploy differently, the approach you mentioned (pre-generated from an input) is desirable.
argparse to dictionary
The argparse namespace can be converted to a dictionary. This is convenient: we can write a function that accepts a dictionary, or keyword arguments, and drives the program through one convenient signature, and file parsers can just as easily load dictionaries and call the same function. The Python json module is used as an example; of course, others can be used.
Example Python
def main(arg1=None, arg2=None, arg3=None):
    print(f"{arg1}, {arg2}, {arg3}")

if __name__ == "__main__":
    import sys
    import json
    import argparse

    # script called with nothing -- load defaults from file
    if len(sys.argv) == 1:
        with open("default.json", "r") as dfp:
            conf = json.load(dfp)
        main(**conf)
    else:  # parse arguments
        parser = argparse.ArgumentParser()
        parser.add_argument('-a1', dest='arg1', metavar='arg1', type=str)
        parser.add_argument('-a2', dest='arg2', metavar='arg2', type=str)
        parser.add_argument('-a3', dest='arg3', metavar='arg3', type=str)
        args = parser.parse_args()
        conf = vars(args)
        main(**conf)
default.json
{
    "arg1": "str1",
    "arg2": "str2",
    "arg3": "str3"
}
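For illustration, assuming the script above is saved as prog.py next to default.json, the two modes would behave roughly like this:

$ python3 prog.py
str1, str2, str3
$ python3 prog.py -a1 foo -a2 bar
foo, bar, None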
Using Fire
The Python Fire module can also be used conveniently. It offers several modes of interacting with the file with minimal effort. The GitHub repo is available here.
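A minimal sketch of that idea, assuming Fire is installed (pip install fire); the file name prog_fire.py is just a placeholder:

# prog_fire.py
import fire

def main(arg1=None, arg2=None, arg3=None):
    print(f"{arg1}, {arg2}, {arg3}")

if __name__ == "__main__":
    # exposes main() as a CLI, e.g.: python3 prog_fire.py --arg1 str1 --arg2 str2
    fire.Fire(main)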
I am writing a Python program, and I want it to have a command-line interface that behaves in a particular way.
The command line interface should accept the following invocations:
my_prog test.svg foo
my_prog --font=Sans test.svg foo
(it will generate an svg with the word foo written in the specified or default font)
Now I want to be able to also have this command accept the following invocation...
my_prog --list-fonts
which will list all of the valid options to --font as determined by the fonts available on the system.
I am using argparse, and I have something like this:
parser = argparse.ArgumentParser()
parser.add_argument('output_file')
parser.add_argument('text')
parser.add_argument('--font', help='list options with --list-fonts')
parser.add_argument('--list-fonts', action='store_true')
args = parser.parse_args()
However, this does not make the --list-fonts option behave as I would like, because the two positional arguments are still required.
I have also tried using subparsers, but these still need a workaround to prevent the other options from being required every time.
How do I get the desired behaviour with argparse?
argparse allows you to define arbitrary actions to take when encountering an argument, based on the action keyword argument to add_argument (see the docs)
You can define an action to list your fonts and then abort argument parsing, which will avoid checking for the other required arguments.
This could look like this:
class ListFonts(argparse.Action):
    def __call__(self, parser, namespace, values, option_string):
        print("list of fonts here")
        parser.exit()  # exits the program with no more arg parsing and checking
Then you can add it to your argument like so:
parser.add_argument('--list-fonts', nargs=0, action=ListFonts)
Note nargs=0 has been added so that this argument doesn't require a value (the code in the question achieved this with action='store_true')
This solution has a side effect: invocations like the following will also list the fonts and exit without running the main program:
my_prog --font Sans test.svg text --list-fonts
This is likely not a problem as it's not a typical use case, especially if the help text explains this behaviour.
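For illustration, with the placeholder print above, the behaviour would look roughly like this:

$ my_prog --list-fonts
list of fonts here
$ my_prog test.svg foo      # a normal run: both positionals are still required here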
If defining a new class for each such option feels too heavyweight, or perhaps you have more than one option that has this behaviour, then you could consider having a function that implements the desired action for each argument and then have a kind of factory function that returns a class that wraps the function. A complete example of this is shown below.
def list_fonts():
    print("list of fonts here")

def override(func):
    """ Returns an argparse action that stops parsing and calls a function
        whenever a particular argument is encountered. The program is then exited. """
    class OverrideAction(argparse.Action):
        def __call__(self, parser, namespace, values, option_string):
            func()
            parser.exit()
    return OverrideAction
parser = argparse.ArgumentParser()
parser.add_argument('output_file')
parser.add_argument('text')
parser.add_argument('--font', help='list options with --list-fonts')
parser.add_argument('--list-fonts', nargs=0, action=override(list_fonts),
                    help='list the font options then stop, don\'t generate output')
args = parser.parse_args()
I want to debug a small Python script that takes input from stdin and sends it to stdout. It is used like this:
filter.py < in.txt > out.txt
There does not seem to be a way to configure PyCharm debugging to pipe input from my test data file.
This question has been asked before, and the answer has been, basically "you can't--rewrite the script to read from a file."
I modified the code to take a file, more or less doubling the code size, with this:
import argparse

if __name__ == '__main__':
    cmd_parser = argparse.ArgumentParser()
    cmd_parser.add_argument('path', nargs='?', default='/dev/stdin')
    args = cmd_parser.parse_args()
    with open(args.path) as f:
        filter(f)
where filter() now takes a file object open for reading as a parameter. This preserves backward compatibility, so it can be used as above, while I can also invoke it under the debugger with input from a file.
I consider this an ugly solution. Is there a cleaner alternative? Perhaps something that leaves the ugliness in a separate file?
If you want something simpler, you can forgo argparse entirely and just use the sys.argv list to get the first argument.
import sys

if len(sys.argv) > 1:
    filename = sys.argv[1]
else:
    filename = '/dev/stdin'  # fall back to standard input (Unix-like systems)

with open(filename) as f:
    filter(f)
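Either way, the script can then be invoked with piped input or with an explicit file (file names are just examples):

$ ./filter.py < in.txt > out.txt     # original usage: read from stdin
$ ./filter.py in.txt > out.txt       # pass the file explicitly, e.g. when debugging in PyCharm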