How to modify the signature of a function dynamically - python-3.x

I am writing a framework in Python. When a user declares a function, they do:
def foo(row, fetch=stuff, query=otherStuff): ...
def bar(row, query=stuff): ...
def bar2(row): ...
When the backend sees query= value, it executes the function with the query argument depending on value. This way the function has access to the result of something done by the backend in its scope.
Currently I build my arguments each time by checking whether query, fetch and the other items are None, and call the function with a set of args that exactly matches what the user asked for. Otherwise I get the "got an unexpected keyword argument" error. This is the code in the backend:
# fetch and query are computed by the backend
if fetch is None and query is None:
    userfunction(row)
elif fetch is None:
    userfunction(row, query=query)
elif query is None:
    userfunction(row, fetch=fetch)
else:
    userfunction(row, fetch=fetch, query=query)
This is not good: for each additional "service" the backend offers, I need to write out every combination with the previous ones.
Instead, I would like to take the function and add a named parameter to it before executing it, which would remove all the code that does these checks. The user would then just declare the services they actually want.
I don't want the user to have to modify the function by adding parameters they don't need (nor do I want them to specify a kwarg every time).
So, if this is doable, I would like an example: a function addNamedVar(name, function) that adds the variable name to the function function.
I want to do it this way because the users' functions are called many times. Otherwise I would have to, for example, build a dict of each function's named variables (with inspect) and then call it with **dict every time. I would much rather modify the function once to avoid any kind of overhead.

This is indeed doable with the ast module, and that is what I am going to do because it suits my use case better. However, what I asked for can be done more simply by cloning the function, as in the code snippet I show here. Note that this code returns the same function with different default values. You can use it as a starting point for whatever you need.
This works for Python 3 (the code.replace() call needs Python 3.8 or later):
import inspect
import types

def copyTransform(f, name, **args):
    signature = inspect.signature(f)
    params = list(signature.parameters)
    numberOfParam = f.__code__.co_argcount  # positional parameters only
    listTuple = list(f.__defaults__ or ())
    numberOfDefault = len(listTuple)
    # __defaults__ holds the values of the last numberOfDefault positional parameters
    firstDefaultIndex = numberOfParam - numberOfDefault
    for key, val in args.items():
        # raises ValueError if key is not a positional parameter that has a default
        toChangeIndex = params.index(key, firstDefaultIndex, numberOfParam)
        listTuple[toChangeIndex - firstDefaultIndex] = val
    newTuple = tuple(listTuple)
    # The positional arguments of types.CodeType() differ between Python versions,
    # so copy the code object with replace() and change only its name
    newCode = f.__code__.replace(co_name=name)
    newFunction = types.FunctionType(newCode, f.__globals__, name, newTuple, f.__closure__)
    newFunction.__kwdefaults__ = f.__kwdefaults__  # keep keyword-only defaults
    newFunction.__qualname__ = name  # also needed for serialization
    return newFunction
You need to set both the name and __qualname__ like this if you want to pickle your cloned function.
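As a complement to the cloning approach, here is a minimal sketch of the alternative the question mentions: inspecting each user function once, when it is registered, so that no inspection happens per call. The helper name make_caller and the service_names tuple are hypothetical, not part of any framework, and the sketch assumes the backend always passes services as keyword arguments.

```python
import inspect


def make_caller(function, service_names=("fetch", "query")):
    """Inspect `function` once and return a caller that only passes
    the services the function actually declares (hypothetical helper)."""
    declared = inspect.signature(function).parameters
    wanted = tuple(name for name in service_names if name in declared)

    def call(row, **services):
        # Keep only the services this function asked for in its signature.
        kwargs = {name: services[name] for name in wanted if name in services}
        return function(row, **kwargs)

    return call


def bar(row, query=None):
    return (row, query)


def bar2(row):
    return (row,)


call_bar = make_caller(bar)    # signature inspected here, once
call_bar2 = make_caller(bar2)

print(call_bar(1, fetch="F", query="Q"))
print(call_bar2(1, fetch="F", query="Q"))
```

The backend can then always pass every service it computed. Each call still builds a small dict, so this trades a little per-call work for not touching code objects at all.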

Related

.get_dummies() works alone but doesn't save within function

I have a dataset and I want to make a function that does the .get_dummies() so I can use it in a pipeline for specific columns.
When I run
dataset = pd.get_dummies(dataset, columns=['Embarked','Sex'], drop_first=True)
on its own, it works: when I run df.head() I can still see the dummified columns. But when I have a function like this,
def dummies(df):
    df = pd.get_dummies(df, columns=['Embarked','Sex'], drop_first=True)
    return df
and I run dummies(dataset), it shows me the dummified columns in that same cell, but when I then try dataset.head() it isn't dummified anymore.
What am I doing wrong?
Thanks.
You should assign the result of the function back to your variable. Call the function like this:
dataset=dummies(dataset)
Functions have their own independent namespace for the variables defined in them, whether in the signature or in the body.
For example:
a = 0
def fun(a):
    a=23
    return a
fun(a)
print("a is",a) #a is 0
Here you might think that a will have the value 23 at the end, but that is not the case, because the a inside fun is not the same a as the one outside. When you call fun(a), you pass into the function a reference to the real object that is somewhere in memory, so the a inside starts with the same reference and thus the same value.
With a=23 you change what the inner a points to, which in this example is 23. The outer a is not affected.
And fun(a) itself returns a value, but since that result is not saved anywhere, it is lost.
To update the variable outside, you need to reassign it to the result of the function:
a = 0
def fun(a):
    a=23
    return a
a = fun(a)
print("a is",a) #a is 23
which in your case would be dataset=dummies(dataset)
If you want your function to make in-place changes to the object it receives, you can't use =. You need to use something that the object itself provides to allow in-place modification. For example,
this would not work:
a = []
def fun2(a):
    a=[23]
    return a
fun2(a)
print("a is",a) #a is []
but this would:
a = []
def fun2(a):
    a.append(23)
    return a
fun2(a)
print("a is",a) #a is [23]
because we are using an in-place modification method that the object provides, in this example the append method of list.
But such in-place modification can have unforeseen results, especially if the object being modified is shared between processes, so I would rather recommend the previous approach.

Python get @property.setter decorated method in a class

In Python there is no switch/case. It is suggested to use dictionaries: What is the Python equivalent for a case/switch statement?
In Python it is good practice to use @property to implement getter/setter: What's the pythonic way to use getters and setters?
So, if I want to build a class with a list of properties to switch so I can get or update values, I can use something like:
class Obj():
    """property demo"""
    @property
    def uno(self):
        return self._uno
    @uno.setter
    def uno(self, val):
        self._uno = val*10
    @property
    def options(self):
        return dict(vars(self))
But calling
o=Obj()
o.uno=10 # o.uno is now 100
o.options
I obtain {'_uno': 100} and not {'uno': 100}.
Am I missing something?
vars is really a tool for introspection: it gives you the local variables of the current scope, or the attributes stored on a given object. It is not a good way to get attributes and variables ready for final consumption.
So your options code must be a bit more sophisticated. One way to go
is to search the class for any properties and use getattr to get
their values, so that the getter code runs, and then to
introspect the instance variables to pick up any attributes assigned directly,
discarding the ones starting with _:
@property
def options(self):
    results = {}
    # search in all class attributes for properties, including superclasses:
    for name in dir(self.__class__):
        # obtain the object that is associated with this name in the class
        attr = getattr(self.__class__, name)
        # skip "options" itself, otherwise reading it here would recurse forever
        if isinstance(attr, property) and name != "options":
            # ^ if you want to also retrieve other "property like"
            # attributes, it is better to check if it has the `__get__` method and is not callable:
            # "if hasattr(attr, '__get__') and not callable(attr):"
            # retrieves the attribute - ensuring the getter code is run:
            value = getattr(self, name)
            results[name] = value
    # check for the attributes assigned directly to the instance:
    for name, value in self.__dict__.items():
        # ^ here, vars(self) could have been used instead of self.__dict__
        if not name.startswith("_"):
            results[name] = value
    return results
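To check the approach end to end, here is a self-contained sketch that puts such an options property back into the Obj class from the question. The extra attribute label is hypothetical and only there to show that directly assigned public attributes are picked up as well.

```python
class Obj:
    """property demo"""

    @property
    def uno(self):
        return self._uno

    @uno.setter
    def uno(self, val):
        self._uno = val * 10

    @property
    def options(self):
        results = {}
        for name in dir(type(self)):
            attr = getattr(type(self), name)
            # Skip "options" itself so the property does not call itself.
            if isinstance(attr, property) and name != "options":
                results[name] = getattr(self, name)  # runs the getter
        for name, value in vars(self).items():
            if not name.startswith("_"):
                results[name] = value
        return results


o = Obj()
o.uno = 10          # stored as _uno == 100 by the setter
o.label = "demo"    # hypothetical plain attribute
print(o.options)
```

The private _uno no longer appears in the result; the value is reported under the property name uno instead.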
about switch..case
On a side note to your question, regarding the "switch...case" construction: please disregard all content you read saying "in Python one should use dictionaries instead of switch/case". This is incorrect.
The correct construct to replace "switch...case" in Python is "if..elif..else". A plain "if-else" tree gives you all the expressiveness of a C-like "switch", and actually goes much beyond that, as the test expression in if...elif can be arbitrary, and not just a matching value.
option = get_some_user_option()
if option == "A":
    ...
elif option == "B":
    ...
elif option in ("C", "D", "E"):
    # common code for C, D, E
    ...
    if option == "E":
        # specialized code for "E"
        ...
else:
    # option does not exist.
    ...
While it is possible to use a dictionary as a call table, with functions that perform the actions as the dictionary values, this construct is obviously not a "drop-in" replacement for a plain switch/case: the "case" functions can't be written inline in the dictionary unless they can be written as a lambda function and, more importantly, they won't have direct access to the variables of the function calling them.
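For comparison, here is a minimal sketch of the dictionary call table discussed above. The handler names and the dispatch function are hypothetical. It illustrates the limitation: each handler only sees what is passed to it explicitly.

```python
def handle_a(value):
    return f"A got {value}"


def handle_b(value):
    return f"B got {value}"


# Call table: maps each option to the function that handles it.
HANDLERS = {"A": handle_a, "B": handle_b}


def dispatch(option, value):
    handler = HANDLERS.get(option)
    if handler is None:
        return "option does not exist"
    # The handler cannot read dispatch()'s local variables the way an
    # inline if/elif branch could; everything must be passed as arguments.
    return handler(value)


print(dispatch("A", 1))
print(dispatch("Z", 1))
```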

Write a recursive function to list all paths of parts.txt

Write a function list_files_recursive that returns a list of the paths of all the parts.txt files without using the os module's walk generator. Instead, the function should use recursion. The input will be a directory name.
Here is the code I have so far and I think it's basically right, but what's happening is that the output is not one whole list?
def list_files_recursive(top_dir):
    rec_list_files = []
    list_dir = os.listdir(top_dir)
    for item in list_dir:
        item_path = os.path.join(top_dir, item)
        if os.path.isdir(item_path):
            list_files_recursive(item_path)
        else:
            if os.path.basename(item_path) == 'parts.txt':
                rec_list_files.append(os.path.join(item_path))
    print(rec_list_files)
    return rec_list_files
This is part of the output I'm getting (from the print statement):
['CarItems/Honda/Accord/1996/parts.txt']
[]
['CarItems/Honda/Odyssey/2000/parts.txt']
['CarItems/Honda/Odyssey/2002/parts.txt']
[]
So the problem is that it's not one list and that there are empty lists in there. I don't quite know why this isn't working and have tried everything to work through it. Any help is much appreciated on this!
This is very close, but the issue is that list_files_recursive's child calls don't pass results back to the parent. One way to do this is to concatenate all of the lists together from each child call, or to pass a reference to a single list all the way through the call chain.
Note that in rec_list_files.append(os.path.join(item_path)), there's no point in os.path.join with only a single parameter. print(rec_list_files) should be omitted as a side effect that makes the output confusing to interpret--only print in the caller. Additionally,
else:
    if ... :
can be more clearly written here as elif: since they're logically equivalent. It's always a good idea to reduce nesting of conditionals whenever possible.
Here's the approach that works by extending the parent list:
import os
def list_files_recursive(top_dir):
    files = []
    for item in os.listdir(top_dir):
        item_path = os.path.join(top_dir, item)
        if os.path.isdir(item_path):
            files.extend(list_files_recursive(item_path))
            #     ^^^^^^ add child results to parent
        elif os.path.basename(item_path) == "parts.txt":
            files.append(item_path)
    return files
if __name__ == "__main__":
    print(list_files_recursive("foo"))
Or by passing a result list through the call tree:
import os
def list_files_recursive(top_dir, files=None):
    if files is None:
        # create a fresh list per top-level call; a mutable default
        # (files=[]) would be shared between calls and keep accumulating
        files = []
    for item in os.listdir(top_dir):
        item_path = os.path.join(top_dir, item)
        if os.path.isdir(item_path):
            list_files_recursive(item_path, files)
            #                               ^^^^^ pass our result list recursively
        elif os.path.basename(item_path) == "parts.txt":
            files.append(item_path)
    return files
if __name__ == "__main__":
    print(list_files_recursive("foo"))
A major problem with these functions is that they only work for finding files named precisely parts.txt, since that string literal is hard coded. That makes them pretty much useless for anything but the immediate purpose. We should add a parameter that lets the caller specify the target file they want to search for, making the function general-purpose.
Another problem is that the function doesn't do what its name claims: list_files_recursive should really be called find_file_recursive, or, due to the hardcoded string, find_parts_txt_recursive.
Beyond that, the function is a strong candidate for turning into a generator function, which is a common Python idiom for traversal, particularly for situations where the subdirectories may contain huge amounts of data that would be expensive to keep in memory all at once. Generators also allow the flexibility of using the function to cancel the search after the first match, further enhancing its (re)usability.
The yield keyword also makes the function code itself very clean--we can avoid the problem of keeping a result data structure entirely and just fire off result items on demand.
Here's how I'd write it:
import os
def find_file_recursive(top_dir, target):
    for item in os.listdir(top_dir):
        item_path = os.path.join(top_dir, item)
        if os.path.isdir(item_path):
            yield from find_file_recursive(item_path, target)
        elif os.path.basename(item_path) == target:
            yield item_path
if __name__ == "__main__":
    print(list(find_file_recursive("foo", "parts.txt")))
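The early-exit benefit of the generator version can be sketched as follows. The directory layout is a hypothetical one built in a temporary folder, purely so the example runs anywhere. next() asks the generator for a single result, so the search stops at the first match instead of walking the whole tree.

```python
import os
import tempfile


def find_file_recursive(top_dir, target):
    for item in os.listdir(top_dir):
        item_path = os.path.join(top_dir, item)
        if os.path.isdir(item_path):
            yield from find_file_recursive(item_path, target)
        elif os.path.basename(item_path) == target:
            yield item_path


with tempfile.TemporaryDirectory() as root:
    # Hypothetical layout: root/a/parts.txt and root/a/b/parts.txt
    os.makedirs(os.path.join(root, "a", "b"))
    for folder in (os.path.join(root, "a"), os.path.join(root, "a", "b")):
        with open(os.path.join(folder, "parts.txt"), "w"):
            pass

    # Pull only the first match; the default None covers "nothing found".
    first_match = next(find_file_recursive(root, "parts.txt"), None)
    # Exhaust the generator when every match is needed.
    all_matches = list(find_file_recursive(root, "parts.txt"))

print(os.path.basename(first_match), len(all_matches))
```

Which of the two files comes first depends on the order os.listdir returns entries, so callers should not rely on it.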

Why can't I access a variable that's being returned from a function?

I am new to Python and am at a loss as to what I'm doing wrong. I am trying to use the fqdn variable that is returned to the caller, which is main(), but I'm getting NameError: name 'fqdn' is not defined
I'm betting this is some type of global variable statement issue or something like that, but I've been researching this and can't figure it out.
If a function from a module returns a value, and the caller is main(), shouldn't main() be able to use that returned value?
Here's the layout:
asset.py
def import_asset_list():
    # Open the file that contains FQDNs
    openfile = open(r"FQDN-test.txt")
    if openfile.mode == 'r':
        # Remove CR from end of each item
        fqdn = openfile.read().splitlines()
        # Add https to the beginning of every item in list
        fqdn = ["https://" + item for item in fqdn]
    openfile.close()
    return fqdn
tscan.py
def main():
    import asset
    asset.import_asset_list()
    # Iterate through list
    for i in fqdn:
        if SCHEDULED_SCAN == 1:
            create_scheduled_scan(fqdn)
            launch_scan(sid)
            check_status_scan(uuid)
        else:
            create_scan(fqdn)
            launch_scan(sid)
            check_status_scan(uuid)
Short Explanation
Yes, main() should be able to use the returned value, but it's only the value that is returned, not the variable name. You have to define a variable of your own name, to receive the value, and use that instead.
Long Explanation
The name of a variable inside any function is simply a "label" valid only within the scope of this function. A function is an abstraction which means "Give me some input(s), and I will give you some output(s)". Within the function, you need to reference the inputs somehow and, potentially, assign some additional variables to perform whatever it is you would like to. These variable names have no meaning whatsoever outside the function, other than to, at most, convey some information as to the intended use of the function.
When a function returns a value, it does not return the "name" of the variable, only its value (or the reference in memory). You can define your own variable at the point where you call the function, give it your own name and assign the returned result of the function to it, so you simply have to write:
def main():
    import asset
    my_asset_list = asset.import_asset_list()
    # Iterate through list
    for i in my_asset_list:
        if SCHEDULED_SCAN == 1:
            create_scheduled_scan(my_asset_list)
            launch_scan(sid)
            check_status_scan(uuid)
        else:
            create_scan(my_asset_list)
            launch_scan(sid)
            check_status_scan(uuid)
I don't know where the uuid and the sid variables are defined.
To make sure you have understood this properly, remember:
You can have multiple functions in the same file and use identically named variables within all of them without any problem, because a variable (with its name) only exists within its own function's scope.
Variable names do not "cross" the boundaries of the scope; only variable values/references do, and for that a special construct is used: the return [something] statement.
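The two points above can be sketched in a few lines. The URLs are placeholder values and broken_main is a hypothetical function that reproduces the mistake from the question, catching the error only so the example can show its message.

```python
def import_asset_list():
    # "fqdn" is a local name; it exists only while this function runs.
    fqdn = ["https://a.example", "https://b.example"]
    return fqdn  # only the list object leaves the function


def main():
    # Bind the returned value to a name of our own choosing.
    my_asset_list = import_asset_list()
    return my_asset_list


def broken_main():
    import_asset_list()  # the returned list is discarded
    try:
        return fqdn  # no such name in this scope
    except NameError as error:
        return str(error)


print(main())
print(broken_main())
```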

How to pass arguments from one function to other functions?

I have created three functions. The first function is used in the other two functions but I am passing it a hardcoded filepath. I want to be able to pass this in as a parameter, but I seem to be getting an issue.
Essentially, given a file_path, my function will get the first item in the list and then the second item.
So far my code is as follows :
def sort_files(file_path):
    """Sort files in descending order"""
    files = os.listdir(file_path)
    return sorted(files, reverse=True)
def current_day():
    """Get the current day file"""
    return sort_files(file_path)[0]
def previous_day():
    """Get the previous day file"""
    return sort_files(file_path)[1]
If you want a function to accept an argument, you need to define it as doing so by specifying the parameter name it will be known as in the function (as you did with sort_files).
How are you calling current_day and previous_day? You should define them as functions that take a parameter.
Also please post the code that you are using to execute the whole setup.
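A minimal sketch of what that could look like, assuming the caller knows the directory and passes it along. The temporary folder and the dated file names are hypothetical and only there to make the example runnable.

```python
import os
import tempfile


def sort_files(file_path):
    """Return the file names in file_path, sorted in descending order."""
    return sorted(os.listdir(file_path), reverse=True)


def current_day(file_path):
    """Get the current day file."""
    return sort_files(file_path)[0]


def previous_day(file_path):
    """Get the previous day file."""
    return sort_files(file_path)[1]


with tempfile.TemporaryDirectory() as folder:
    # Hypothetical file names that sort by date.
    for name in ("2020-01-01.csv", "2020-01-02.csv"):
        open(os.path.join(folder, name), "w").close()
    newest = current_day(folder)     # the path is passed in by the caller
    second = previous_day(folder)

print(newest, second)
```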
