Builder with optional source - scons

I want to make a Builder with one or more optional sources.
I tried this:
env.Append(BUILDERS = {'my_builder': Builder(action = Action(do_something))})
def do_something(target, source, env):
    if source[1]:
        do_optional_stuff(source[1])
    do_other_stuff(target, source[0])
...
env.my_builder(target.txt, [source1, None]) # Fails
env.my_builder(target.txt, [source2, source3]) # Okay
The trouble is, I get 'NoneType' object has no attribute 'get_ninfo' when I pass in None, because scons is expecting Node arguments, and None isn't acceptable.
Is there anything I can do?
Edit:
As noted in the answer below, the simple case of one optional argument can be solved by varying the length of the source list. This doesn't work for making arbitrary arguments optional, so I'd still be interested in a way to do that.

Instead of adding a bogus element, check the length of the source list (or better yet, iterate over the list starting after the first element):
def do_something(target, source, env):
    if len(source) > 1:
        do_optional_stuff(source[1])
    # or:
    # for opt_src in source[1:]:
    #     do_optional_stuff(opt_src)
    do_main_stuff(target, source[0])
env.Append(BUILDERS = {'my_builder': Builder(action = Action(do_something))})
env.my_builder('target-2.txt', ['foo.txt'])
env.my_builder('target-1.txt', ['foo.txt', 'bar.txt'])
One issue with this approach is that you need to ensure that your sources are listed in the right order. Depending on the details of what you're doing, you might be able to filter the source list by matching file names or extensions. After all, this is Python code, so you have the full power of the language at your disposal.
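To illustrate the filtering idea, here is a minimal sketch that partitions a source list by file extension instead of relying on position. The helper name split_sources and the ".cfg" extension are hypothetical, chosen only for the example. It assumes that str() on each source yields its path, which holds for plain strings and is expected to hold for SCons nodes.

```python
import os

def split_sources(source, optional_exts=(".cfg",)):
    """Partition sources into (required, optional) lists by file extension.

    Hypothetical helper: the name and the default extension are assumptions
    made for illustration, not part of SCons.
    """
    required, optional = [], []
    for node in source:
        # str(node) is assumed to give the file path of the source.
        ext = os.path.splitext(str(node))[1].lower()
        if ext in optional_exts:
            optional.append(node)
        else:
            required.append(node)
    return required, optional
```

Inside the action function you could then call required, optional = split_sources(source) and handle each group separately, so the order in which sources are passed no longer matters.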

Ask a question about a strange Python3 list index error

I am new to Python 3 and I want to collect the suffixes of all the files in a directory using:
dir_files = set(map(lambda f: f.split(sep='.')[1], os.listdir()))
but it fails with an error:
IndexError: list index out of range
However, if I change [1] to [0], I get all the file names correctly.
Why is that? Please help me.
If you want the suffix of every file, you should use a different approach. With this one, your program will fail for file names like these:
my_doc - This file has no suffix, so split returns the single-element list ['my_doc'], and indexing it with [1] raises an IndexError.
my.doc.txt - This file name contains more than one '.', so split returns ['my', 'doc', 'txt']. Your code will report doc as the suffix even though the real suffix is txt.
There are more problems, because os.listdir() also lists directories and hidden files, but I won't go into them since I don't know the details of your task.
This is one possible solution that will work in most cases (not all cases):
dir_files = set(map(lambda f: os.path.splitext(f)[1][1:].lower(), os.listdir())) - {''}
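To see how this behaves on the problem cases above, here is a small sketch that applies the same expression to a fixed list of names instead of os.listdir(). The function name suffixes and the sample file names are made up for the example.

```python
import os

def suffixes(names):
    """Return the set of lower-cased suffixes (without the dot) found in names."""
    # splitext splits at the last dot only, and treats a leading dot
    # (as in ".bashrc") as part of the name rather than as a suffix.
    return {os.path.splitext(name)[1][1:].lower() for name in names} - {""}

found = suffixes(["my_doc", "my.doc.txt", "photo.JPG", ".bashrc"])
# found == {"txt", "jpg"}
```

Names without a suffix produce an empty string, which is why the empty string is subtracted from the result.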

Flask-Mongoengine custom order-by comparator

I am using Flask-Mongoengine as an ORM for my flask application. I define a Document to store unique versions in a string:
class MyDocument(db.Document):
    version = db.StringField(primary_key=True, required=True)
The value stored in the StringField is in the following format:
a.b.c.d
I run into issues when it comes to using order_by for my queries:
MyDocument.objects.order_by('version')
More precisely, if the first part of the version (a in the format example above) contains multiple digits, it doesn't sort properly (e.g. 15.20.142 < 1.7.9). I could obviously do something like:
tmp = MyDocument.objects.all()
result = some_merge_sort(tmp)
However, that requires writing, testing and maintaining a sorting algorithm. This brings me to my next idea, which is to override the comparator used in order_by. The issue is that I'm not sure how to do this properly. Do I have to define a custom attribute type, or is there some hook I'm supposed to override?
In any case, I thought I would ask the good folks on Stack Overflow before I go reinventing the wheel.
After doing some digging into the Mongoengine source code, I found out that it uses the PyMongo cursor sort which itself wraps the queries made to MongoDB. In other words, my Document definition is too high in the chain to even affect the order_by result.
As such, the solution I went with was to write a classmethod. This simply takes in a mongoengine.queryset.QuerySet instance and applies a sort on the version attribute.
from mongoengine.queryset import QuerySet
from packaging import version
class MyDocument(db.Document):
    version = db.StringField(primary_key=True, required=True)
    @classmethod
    def order_by_version(cls, queryset, reverse=False):
        """
        Wrapper function to order a given queryset by this class's
        version attribute.
        :params:
        - `QuerySet` :queryset: - The queryset whose elements are to be ordered.
        - `bool` :reverse: - If the results should be reversed.
        :raises:
        - `TypeError` if the given queryset is not a `QuerySet` instance.
        :returns:
        - `List` containing the elements sorted by the version attribute.
        """
        # Instance check the queryset.
        if not isinstance(queryset, QuerySet):
            raise TypeError("order_by_version requires a QuerySet instance!")
        # Sort by the version attribute.
        return sorted(queryset, key=lambda obj: version.parse(obj.version), reverse=reverse)
Then to utilize this method, I can simply do as follows:
MyDocument.order_by_version(MyDocument.objects)
As for my reason for taking the QuerySet instance rather than running the query inside the classmethod, it was simply to allow other Mongoengine methods to be called before the ordering. Because sorted is stable, documents whose versions compare equal keep their existing order, so I can still use the built-in order_by for other attributes.
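For comparison, here is a standard-library sketch of the same idea that does not need the packaging module. It assumes every version is strictly dot-separated integers, as in the a.b.c.d format above; anything else (such as 1.0rc1) would raise a ValueError. The name version_key and the sample versions are made up for the example.

```python
def version_key(version_string):
    """Turn '15.20.142' into (15, 20, 142) so that parts compare numerically."""
    return tuple(int(part) for part in version_string.split("."))

versions = ["15.20.142", "1.7.9", "2.0.0", "10.1.3"]

# Plain string sorting compares character by character.
by_string = sorted(versions)
# by_string == ["1.7.9", "10.1.3", "15.20.142", "2.0.0"]

# Sorting with the key compares each numeric part as an integer.
by_version = sorted(versions, key=version_key)
# by_version == ["1.7.9", "2.0.0", "10.1.3", "15.20.142"]
```

The key could be used in the classmethod as key=lambda obj: version_key(obj.version) if the format assumption holds for all stored documents.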

How to initialise a list that contains custom functions without Python running those functions during initialisation?

Short version:
How do you store functions in a list and only have them be executed when they are called using their index position in the list?
Long Version:
So I am writing a program that rolls a user-chosen number of six-sided dice, stores the results in a list and then organizes the results/ data in a dictionary.
After the data is gathered the program gives the user options from 0-2 to choose from and asks the user to type a number corresponding to the option they want.
After this input by the user, it is assigned to a variable, let's say TT. I want the program to use TT to identify which function to run from a list called "Executable_options", by using TT as the index position of that function within the list.
The problem I am having is that the list has to come after the function definitions, and when I initialize the list it executes all the functions within it in order, which I don't want. I just want them to be in the list for calling at a later time.
I tried to initialise the list without any functions in it and then append the functions individually, but every time a function is appended to the list it is also executed.
def results():
    ...
def Rolling_thunder():
    ...
def roll_again():
    ...
The functions contain code, but it is unnecessary to show it for the question at hand.
Executable_options = []
Executable_options.append(results())
Executable_options.append(Rolling_thunder())
Executable_options.append(roll_again)
options = len(Executable_options)
I am relatively new to Python so I am still getting my head around it. I have tried searching for the answer to this on existing posts, but couldn't find anything so I assume I am just using the wrong key words in my search.
Thank you very much for taking the time to read this and for the answers provided.
Edit: Code now works
The () on the end of the function name calls it, i.e. results() is a call to the results function.
Simply append to the list without the call, i.e.:
Executable_options.append(results)
You can then call it by doing e.g.:
Executable_options[0]()
As per your given code, it will look like this:
def results():
    ...
def Rolling_thunder():
    ...
def roll_again():
    ...
Executable_options = []
Executable_options.append(results)
Executable_options.append(Rolling_thunder)
Executable_options.append(roll_again)
for option in Executable_options:
    option()
This will work for you.
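Since the question is about running only the function the user picked, here is a small sketch of dispatching by index. The function bodies are placeholders, and run_option is a hypothetical helper name; the real functions would do the dice work described in the question.

```python
def results():
    return "showing results"

def rolling_thunder():
    return "rolling thunder"

def roll_again():
    return "rolling again"

# The list holds the function objects themselves; nothing runs here.
executable_options = [results, rolling_thunder, roll_again]

def run_option(index):
    """Call the function stored at the given index and return its result."""
    if not 0 <= index < len(executable_options):
        raise ValueError(f"option must be between 0 and {len(executable_options) - 1}")
    return executable_options[index]()
```

With the user's choice stored in TT, the call would be run_option(int(TT)), since input() returns a string.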

Is there a pandas filter that allows any value? [duplicate]

I have discovered the pandas DataFrame.query method, and it does almost exactly what I need. (I had implemented my own parser because I hadn't realized it existed, but really I should be using the standard method.)
I would like my users to be able to specify the query in a configuration file. The syntax seems intuitive enough that I can expect my non-programmer (but engineer) users to figure it out.
There's just one thing missing: a way to select everything in the dataframe. Sometimes what my users want to use is every row, so they would put 'All' or something into that configuration option. In fact, that will be the default option.
I tried df.query('True') but that raised a KeyError. I tried df.query('1') but that returned the row with index 1. The empty string raised a ValueError.
The only things I can think of are 1) put an if clause every time I need to do this type of query (probably 3 or 4 times in the code) or 2) subclass DataFrame and either reimplement query, or add a query_with_all method:
import pandas as pd
class MyDataFrame(pd.DataFrame):
    def query_with_all(self, query_string):
        if query_string.lower() == 'all':
            return self
        else:
            return self.query(query_string)
And then use my own class every time instead of the pandas one. Is this the only way to do this?
Keep things simple, and use a function:
def query_with_all(data_frame, query_string):
    if query_string == "all":
        return data_frame
    return data_frame.query(query_string)
Whenever you need to use this type of query, just call the function with the data frame and the query string. There's no need to use any extra if statements or to subclass pd.DataFrame.
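To tie this to the configuration file mentioned in the question, here is a rough sketch that reads the query string with the standard configparser module and falls back to "all" when nothing is set. The section name "selection" and the option name "rows" are hypothetical, and the comparison is made case-insensitive so that 'All' also works. Interpolation is disabled so that a % in a query is not treated specially.

```python
import configparser

def load_row_filter(config_text, section="selection", option="rows"):
    """Read the query string from INI-style text, defaulting to 'all'."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(config_text)
    # fallback covers both a missing section and a missing option.
    return parser.get(section, option, fallback="all")

def query_with_all(data_frame, query_string):
    """Return the frame unchanged for 'all', otherwise delegate to .query()."""
    if query_string.strip().lower() == "all":
        return data_frame
    return data_frame.query(query_string)
```

A call would then look like query_with_all(df, load_row_filter(text)), where df is the DataFrame and text is the content of the user's configuration file.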
If you're restricted to using df.query, you can use a global variable:
ALL = slice(None)
df.query('@ALL', engine='python')
If you're not allowed to use global variables, and if your DataFrame isn't MultiIndexed, you can use:
df.query('tuple()')
All of these will properly handle NaN values.
df.query('ilevel_0 in ilevel_0') will always return the full dataframe, also when the index contains NaN values or even when the dataframe is completely empty.
In your particular case you could then define a global variable all_true = 'ilevel_0 in ilevel_0' (as suggested in the comments by Zero) so that your engineers could use the name of the global variable in their config file instead.
This statement is just a dirty way to properly query True like you already tried. ilevel_0 is a more formal way of making sure you are referring to the index. See the docs for more details on using in and ilevel_0: https://pandas.pydata.org/pandas-docs/stable/indexing.html#the-query-method

python argparse add_argument_group required

In this question
argparse: require either of two arguments
I find a reference to the solution I want, but it isn't right.
I need at least 1 of 2 options to be present, option1, option2 or both...
The add_argument_group function doesn't have a required argument.
The add_mutually_exclusive_group function has it, but it forces me to choose between the 2 options, which is not what I want.
argument_group just controls the help display. It does not affect the parsing or check for errors. mutually_exclusive_group affects usage display and tests for occurrence, but as you note, its logic is not what you want.
There is a Python bug issue requesting some form of nested 'inclusive' group. But a general form that allows nesting and all versions of and/or/xor logic is not a trivial addition.
I think your simplest solution is to test the args after parsing. If there is a wrong mix of defaults, then raise an error.
Assuming the default for both arguments is None:
if args.option1 is None and args.option2 is None:
    parser.error('at least one of option1 and option2 is required')
What would be a meaningful usage line? A required mutually exclusive group uses (opt1 | opt2). (opt1 & opt2) might indicate that both are required. Your case is a non-exclusive or.
usage: PROG [-h] (--opt1 OPT1 ? --opt2 OPT2)
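Putting the post-parse check together, a minimal sketch could look like this. The option names --opt1 and --opt2 follow the usage line above, and the helper name parse_args is made up for the example. Both options are left with their default of None so the check can tell whether they were given.

```python
import argparse

def parse_args(argv=None):
    """Parse arguments and require at least one of --opt1 and --opt2."""
    parser = argparse.ArgumentParser(prog="PROG")
    parser.add_argument("--opt1")
    parser.add_argument("--opt2")
    args = parser.parse_args(argv)
    # argparse has no inclusive-or group, so test after parsing.
    if args.opt1 is None and args.opt2 is None:
        # parser.error prints the usage line plus the message and exits.
        parser.error("at least one of --opt1 and --opt2 is required")
    return args
```

Calling parse_args(["--opt1", "x"]) or passing both options succeeds, while an empty argument list makes the parser exit with an error message.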
