The following parser should let me handle some subcommands. A command line such as
% my_script acmd a_val
is processed in my_script.py roughly like this (using a list in place of an actual command line):
import argparse
parser = argparse.ArgumentParser(description='example')
subparsers = parser.add_subparsers()
acmd_parser = subparsers.add_parser('acmd')
acmd_parser.add_argument('a_arg')
bcmd_parser = subparsers.add_parser('bcmd')
bcmd_parser.add_argument('b_arg')
args = parser.parse_args(['acmd','a_val'])
print(args)
The result is this:
Namespace(a_arg='a_val')
How do I tell whether I ran acmd or bcmd? Do I just have to figure it out from the arguments?
Provide a dest parameter to the add_subparsers() call, as documented at
https://docs.python.org/3/library/argparse.html#sub-commands
>>> parser = argparse.ArgumentParser()
>>> subparsers = parser.add_subparsers(dest='subparser_name')
>>> subparser1 = subparsers.add_parser('1')
>>> subparser1.add_argument('-x')
>>> subparser2 = subparsers.add_parser('2')
>>> subparser2.add_argument('y')
>>> parser.parse_args(['2', 'frobble'])
Namespace(subparser_name='2', y='frobble')
That also documents the use of set_defaults.
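A minimal sketch of that set_defaults pattern, reusing the acmd subparser from the question (run_acmd is just a placeholder handler name):
import argparse

def run_acmd(args):
    print('acmd called with', args.a_arg)

parser = argparse.ArgumentParser(description='example')
subparsers = parser.add_subparsers(dest='subparser_name')
acmd_parser = subparsers.add_parser('acmd')
acmd_parser.add_argument('a_arg')
acmd_parser.set_defaults(func=run_acmd)   # attach a handler to this subcommand

args = parser.parse_args(['acmd', 'a_val'])
print(args.subparser_name)   # 'acmd'
args.func(args)              # dispatches to run_acmd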
To understand the value of each variable, I reworked a replacement script from a Udacity class: I converted the code inside a function into plain top-level code. However, my code does not work, while the code in the function does. I would appreciate it if anyone could explain why. Please pay particular attention to the function "tokenize".
The code below is from the Udacity class (copyright belongs to Udacity).
# download necessary NLTK data
import nltk
nltk.download(['punkt', 'wordnet'])

# import statements
import re
import pandas as pd
from nltk.tokenize import word_tokenize
from nltk.stem import WordNetLemmatizer

url_regex = 'http[s]?://(?:[a-zA-Z]|[0-9]|[$-_#.&+]|[!*\(\),]|(?:%[0-9a-fA-F][0-9a-fA-F]))+'

def load_data():
    df = pd.read_csv('corporate_messaging.csv', encoding='latin-1')
    df = df[(df["category:confidence"] == 1) & (df['category'] != 'Exclude')]
    X = df.text.values
    y = df.category.values
    return X, y

def tokenize(text):
    detected_urls = re.findall(url_regex, text)  # here, "detected_urls" is a list for sure
    for url in detected_urls:
        text = text.replace(url, "urlplaceholder")  # I do not understand why this works here but does not work in my code unless I convert url to a string
    tokens = word_tokenize(text)
    lemmatizer = WordNetLemmatizer()
    clean_tokens = []
    for tok in tokens:
        clean_tok = lemmatizer.lemmatize(tok).lower().strip()
        clean_tokens.append(clean_tok)
    return clean_tokens

X, y = load_data()
for message in X[:5]:
    tokens = tokenize(message)
    print(message)
    print(tokens, '\n')
Below is its output:
I want to understand the variables' values in the function "tokenize()". The following is my code.
X, y = load_data()

detected_urls = []
for message in X[:5]:
    detected_url = re.findall(url_regex, message)
    detected_urls.append(detected_url)
print("detected_urls: ", detected_urls)  # outputs a list without problems

# replace each url in the text string with a placeholder
i = 0
for url in detected_urls:
    text = X[i].strip()
    i += 1
    print("LN1.url= ", url, "\ttext= ", text, "\n type(text)=", type(text))
    url = str(url).strip()  # if I do not convert it to a string, it is a list; it does not work in text.replace() below, but it works in the function above
    if url in text:
        print("yes")
    else:
        print("no")  # always prints "no"
    text = text.replace(url, "urlplaceholder")
    print("\nLN2.url=", url, "\ttext= ", text, "\n type(text)=", type(text), "\n===============\n\n")
The output is shown below.
The outputs for "LN1" and "LN2" are the same, and the "if" condition always outputs "no". I do not understand why this happens.
Any further help and advice would be highly appreciated.
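As a minimal sketch of the data shapes involved (the two sample messages below are made up for illustration): inside tokenize(), re.findall runs on a single message, so detected_urls is a flat list of strings; in the top-level rewrite, one findall result is appended per message, so each url in the outer loop is itself a list, and str(url) produces something like "['http://...']", which never occurs in the text.
import re

url_regex = r'http[s]?://(?:[a-zA-Z]|[0-9]|[$-_#.&+]|[!*\(\),]|(?:%[0-9a-fA-F][0-9a-fA-F]))+'
messages = ["see http://example.com for details", "no links here"]  # made-up sample messages

# shape inside tokenize(): findall on ONE message -> flat list of strings
per_message = re.findall(url_regex, messages[0])
print(per_message)          # ['http://example.com']

# shape in the top-level rewrite: one findall result appended PER message -> list of lists
accumulated = [re.findall(url_regex, m) for m in messages]
print(accumulated)          # [['http://example.com'], []]
print(str(accumulated[0]))  # ['http://example.com'] -- this string is not a substring of the message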
I have a data augmentation script that has a class with a bunch of optional methods that are triggered by argparse arguments. I am curious how I can structure my code to process the argparse commands based on the order they are passed in from the terminal.
Goal: if I pass the arguments as python maths.py --add --multiply, I want it to add 10 first and then multiply by 5.
If I pass them as python maths.py --multiply --add, I want it to multiply by 5 first and then add 10.
For example:
import argparse

class Maths:
    def __init__(self):
        self.counter = 0

    def addition(self, num):
        self.counter += num
        return self

    def multiply(self, num):
        self.counter *= num
        return self

def main():
    parser = argparse.ArgumentParser()
    parser.add_argument('--add', required=False, action='store_true')
    parser.add_argument('--multiply', required=False, action='store_true')
    args = parser.parse_args()

    maths = Maths()
    maths.addition(10)
    maths.multiply(5)
    print(maths.counter)

if __name__ == "__main__":
    main()
How can I accomplish ordering based on the order of how the arguments are passed in? Thank you!
This parser provides two ways of inputting lists of strings:
In [10]: parser = argparse.ArgumentParser()
    ...: parser.add_argument('--cmds', nargs='*', choices=['add','mult'])
    ...: parser.add_argument('--add', dest='actions', action='append_const', const='add')
    ...: parser.add_argument('--multiply', dest='actions', action='append_const', const='mult')
    ...: parser.print_help()
usage: ipython3 [-h] [--cmds [{add,mult} [{add,mult} ...]]] [--add]
                [--multiply]

optional arguments:
  -h, --help            show this help message and exit
  --cmds [{add,mult} [{add,mult} ...]]
  --add
  --multiply
As values of a '--cmds' argument:
In [11]: parser.parse_args('--cmds mult add'.split())
Out[11]: Namespace(actions=None, cmds=['mult', 'add'])
As separate flagged arguments:
In [12]: parser.parse_args('--mult --add'.split())
Out[12]: Namespace(actions=['mult', 'add'], cmds=None)
In both cases I create a list of strings. In the second, the const values could instead be functions or methods, e.g.
const=maths.addition
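For example, a minimal sketch along those lines, reusing the Maths class from the question (functools.partial is my addition here, just to bind the 10 and 5 operands from the stated goal):
import argparse
from functools import partial

class Maths:
    def __init__(self):
        self.counter = 0

    def addition(self, num):
        self.counter += num
        return self

    def multiply(self, num):
        self.counter *= num
        return self

def main():
    maths = Maths()
    parser = argparse.ArgumentParser()
    # append_const preserves the order in which the flags appear on the command line
    parser.add_argument('--add', dest='actions', action='append_const',
                        const=partial(maths.addition, 10))
    parser.add_argument('--multiply', dest='actions', action='append_const',
                        const=partial(maths.multiply, 5))
    args = parser.parse_args()

    for action in args.actions or []:
        action()
    print(maths.counter)

if __name__ == "__main__":
    main()
With this, python maths.py --add --multiply prints 50, while python maths.py --multiply --add prints 10.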
I am calling a GET API to retrieve some data. For the GET call, I need to convert my keyword
keyword = "mahinder singh dhoni"
into
caption%3Amahinder%2Ccaption%3Asingh%2Ccaption%3Adhoni
I am new to Python and don't know the pythonic way. This is what I am doing:
caption_heading = "caption%3A"
caption_tail = "%2Ccaption%3A"
keyword = "mahinder singh dhoni"
x = keyword.split(" ")
new_caption_keyword = []
new_caption_keyword.append(caption_heading)
for data in x:
    new_caption_keyword.append(data)
    new_caption_keyword.append(caption_tail)
search_query = ''.join(new_caption_keyword)
search_query = search_query[:-13]
print("new transformed keyword", search_query)
Is there a better way to do this? I mean, this feels like hard-coding.
Thanks
Best to turn our original string into a list:
>>> keyword = "mahinder singh dhoni"
>>> keyword.split()
['mahinder', 'singh', 'dhoni']
Then your actual string looks like caption:...,caption:...,caption:..., which can be done with a join and a format:
>>> # if you're < python3.6, use 'caption:{}'.format(part)
>>> [f'caption:{part}' for part in keyword.split()]
['caption:mahinder', 'caption:singh', 'caption:dhoni']
>>> ','.join([f'caption:{part}' for part in keyword.split()])
'caption:mahinder,caption:singh,caption:dhoni'
And finally you'll urlencode using urllib.parse:
>>> import urllib.parse
>>> urllib.parse.quote(','.join([f'caption:{part}' for part in keyword.split()]))
'caption%3Amahinder%2Ccaption%3Asingh%2Ccaption%3Adhoni'
So try this way: instead of split, you can replace the " " (space) with "%2Ccaption%3A" and start your string with "caption%3A".
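A minimal sketch of that replace-based approach (hard-coding the already-encoded pieces, as in the question):
keyword = "mahinder singh dhoni"
search_query = "caption%3A" + keyword.replace(" ", "%2Ccaption%3A")
print(search_query)  # caption%3Amahinder%2Ccaption%3Asingh%2Ccaption%3Adhoni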
Or, using split and quote, for Python 2.x:
>>> from urllib import quote
>>> keyword = "mahinder singh dhoni"
>>> quote(','.join(['caption:%s'%i for i in keyword.split()]))
For Python 3.x:
>>> from urllib.parse import quote
>>> keyword = "mahinder singh dhoni"
>>> quote(','.join(['caption:%s'%i for i in keyword.split()]))
I am looking to assert a sequence of calls without caring what arguments are given. Is there any way to accomplish the following?
self.mocker = Mock()
self.mocker.increment = Mock()
self.mocker.decrement = Mock()
self.mocker.increment(2)
self.mocker.decrement(4)
expected_calls = [call.increment(ANY_ARGS), call.decrement(ANY_ARGS)]
self.mocker.assert_has_calls(expected_calls, any_order=False)
You'd want to look at the mock_calls list and extract the names for each call recorded. You can then assert that the right method names are called, in order:
self.assertEqual([c[0] for c in self.mocker.mock_calls], ['increment', 'decrement'])
Quick demo:
>>> from unittest import mock
>>> mocker = mock.Mock()
>>> mocker.increment(2)
<Mock name='mock.increment()' id='4546399144'>
>>> mocker.decrement(4)
<Mock name='mock.decrement()' id='4546398752'>
>>> mocker.mock_calls
[call.increment(2), call.decrement(4)]
>>> mocker.mock_calls[0][0]
'increment'
>>> [c[0] for c in mocker.mock_calls]
['increment', 'decrement']
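Each entry in mock_calls is a call object that can also be unpacked as a (name, args, kwargs) triple, which is why indexing with [0] yields the method name; continuing the same session:
>>> name, args, kwargs = mocker.mock_calls[0]
>>> name, args, kwargs
('increment', (2,), {})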
I am filtering my file list using this line:
MyList = filter(lambda x: x.endswith(('.doc','.txt','.dat')), os.listdir(path))
The line above only matches files with lowercase extensions. Is there an elegant way to make it also match files with uppercase extensions?
You just need to add a .lower() to your lambda function
MyList = filter(lambda x: x.lower().endswith(('.doc','.txt','.dat')), os.listdir(path))
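Note that on Python 3, filter returns a lazy iterator rather than a list, so if you need to index or reuse MyList, wrap it in list():
MyList = list(filter(lambda x: x.lower().endswith(('.doc', '.txt', '.dat')), os.listdir(path)))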
I'd prefer to use os.path.splitext with a list comprehension
from os.path import splitext
my_list = [x for x in os.listdir(path) if splitext(x)[1].lower() in {'.doc', '.txt', '.dat'}]
Still a bit much for a single line, so perhaps
from os.path import splitext
def valid_extension(x, valid={'.doc', '.txt', '.dat'}):
    return splitext(x)[1].lower() in valid
my_list = [x for x in os.listdir(path) if valid_extension(x)]
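A regex-based alternative that matches the extension case-insensitively via re.IGNORECASE: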
import os
import re

# match a .doc/.txt/.dat extension at the end of the filename, ignoring case
pat = re.compile(r'[.](doc|txt|dat)$', re.IGNORECASE)
filenames = [filename for filename in os.listdir(path)
             if re.search(pat, filename)]
print(filenames)