multiple assignment for functions with *args - python-3.x

I am trying to write a function that will be used on multiple dictionaries of dataframes. My hope is to perform multiple assignments and do it all in one line. For example:
x, y, z = function(x, y, z)
However, my function doesn't return multiple values for the multiple assignment. This is what I currently have:
def split_pre(*args):
    for arg in args:
        newdict = {}
        for key, sheet in arg.items():
            if isinstance(sheet, str):
                continue
            else:
                newdict[key] = sheet[sheet.Year < 2000]
        return newdict
My thinking is that for each arg it would return the dictionary I created, but I get:
ValueError: too many values to unpack (expected 2)
The inputs to this function would be a dictionary made up of dataframes, e.g.,
x = {1:df, 2:df, 3:df...}
and the desired output would be of the same structure, but with the altered dfs from the function.
I'm still quite new to Python and this isn't super important, but I was wondering if anyone knew of a succinct way to get at this.

Do you want to return a dictionary per arg?
As already stated by @DeepSpace, Python stops executing the function as soon as the first return statement runs. You can fix your problem in two ways: either create a list in which you collect the dictionaries you want to return, or write a generator function:
# Solution with a list
def split_pre(*args):
    ans = []
    for arg in args:
        newdict = {}
        for key, sheet in arg.items():
            if isinstance(sheet, str):
                continue
            else:
                newdict[key] = sheet[sheet.Year < 2000]
        ans.append(newdict)
    return ans
or
# Solution with a generator
def split_pre(*args):
    for arg in args:
        newdict = {}
        for key, sheet in arg.items():
            if isinstance(sheet, str):
                continue
            else:
                newdict[key] = sheet[sheet.Year < 2000]
        yield newdict
If you call the function the way you do (a, b, c = func(x, y, z)), both versions behave the same way. They are not actually equivalent, though, and I'd recommend the list solution if you're not familiar with generators (it is worth reading up on the yield keyword).
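The practical difference between the two versions can be seen without pandas. The following is a minimal sketch with hypothetical stand-in functions (`doubled_list` and `doubled_gen` are illustrative names, not part of the original code): both results can be unpacked in a multiple assignment, but a generator can only be consumed once.

```python
def doubled_list(*args):
    """Collect every result in a list and return it once, at the end."""
    results = []
    for arg in args:
        results.append(arg * 2)
    return results


def doubled_gen(*args):
    """Yield one result per argument instead of building a list."""
    for arg in args:
        yield arg * 2


# Both forms support unpacking, as long as the counts match.
a, b, c = doubled_list(1, 2, 3)
x, y, z = doubled_gen(1, 2, 3)

# A list can be read repeatedly; a generator is exhausted after one pass.
gen = doubled_gen(1, 2, 3)
first_pass = list(gen)
second_pass = list(gen)  # nothing is left to yield here
```

If the results are needed more than once, the list version (or wrapping the generator call in `list(...)`) is the safer choice.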

Related

Identifying and handling more than one dataframe in Python [duplicate]

Suppose I have code like:
x = 0
y = 1
z = 2
my_list = [x, y, z]
for item in my_list:
    print("handling object ", name(item)) # <--- what would go instead of `name`?
How can I get the name of each object in Python? That is to say: what could I write instead of name in this code, so that the loop will show handling object x and then handling object y and handling object z?
In my actual code, I have a dict of functions that I will call later after looking them up with user input:
def fun1():
    pass
def fun2():
    pass
def fun3():
    pass
fun_dict = {'fun1': fun1,
            'fun2': fun2,
            'fun3': fun3}
# suppose that we get the name 'fun3' from the user
fun_dict['fun3']()
How can I create fun_dict automatically, without writing the names of the functions twice? I would like to be able to write something like
fun_list = [fun1, fun2, fun3] # and I'll add more as the need arises
fun_dict = {}
for t in fun_list:
    fun_dict[name(t)] = t
to avoid duplicating the names.
Objects do not necessarily have names in Python, so you can't get the name.
When you create a variable, like the x, y, z above then those names just act as "pointers" or "references" to the objects. The object itself does not know what name(s) you are using for it, and you can not easily (if at all) get the names of all references to that object.
However, it's not unusual for objects to have a __name__ attribute. Functions do have a __name__ (unless they are lambdas), so we can build fun_dict by doing e.g.
fun_dict = {t.__name__: t for t in fun_list}
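As a small sketch of this idea (the function names and return values are placeholders), the dictionary can be built from `__name__`. Note that a lambda does carry a `__name__`, but it is always the string `'<lambda>'`, so lambdas would collide as dictionary keys.

```python
def fun1():
    return "one"


def fun2():
    return "two"


fun_list = [fun1, fun2]

# Key each function by the name it was given in its def statement.
fun_dict = {f.__name__: f for f in fun_list}

# Every lambda reports the same generic name.
anonymous = lambda: None
lambda_name = anonymous.__name__
```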
That's not really possible, as there could be multiple variables that have the same value, or a value might have no variable, or a value might have the same value as a variable only by chance.
If you really want to do that, you can use
def variable_for_value(value):
    for n,v in globals().items():
        if v == value:
            return n
    return None
However, it would be better if you would iterate over names in the first place:
my_list = ["x", "y", "z"] # x, y, z have been previously defined
for name in my_list:
    print("handling variable ", name)
    bla = globals()[name]
    # do something to bla
This one-liner works, for all types of objects, as long as they are in globals() dict, which they should be:
def name_of_global_obj(xx):
    return [objname for objname, oid in globals().items()
            if id(oid)==id(xx)][0]
or, equivalently:
def name_of_global_obj(xx):
    for objname, oid in globals().items():
        if oid is xx:
            return objname
As others have mentioned, this is a really tricky question. Solutions to this are not "one size fits all", not even remotely. The difficulty (or ease) is really going to depend on your situation.
I have come to this problem on several occasions, but most recently while creating a debugging function. I wanted the function to take some unknown objects as arguments and print their declared names and contents. Getting the contents is easy of course, but the declared name is another story.
What follows is some of what I have come up with.
Return function name
Determining the name of a function is really easy as it has the __name__ attribute containing the function's declared name.
name_of_function = lambda x : x.__name__
def name_of_function(arg):
    try:
        return arg.__name__
    except AttributeError:
        pass
Just as an example, if you create the function def test_function(): pass, then copy_function = test_function, then name_of_function(copy_function), it will return test_function.
Return first matching object name
Check whether the object has a __name__ attribute and return it if so (declared functions only). Note that you may remove this test as the name will still be in globals().
Compare the value of arg with the values of items in globals() and return the name of the first match. Note that I am filtering out names starting with '_'.
The result will consist of the name of the first matching object otherwise None.
def name_of_object(arg):
    # check __name__ attribute (functions)
    try:
        return arg.__name__
    except AttributeError:
        pass
    for name, value in globals().items():
        if value is arg and not name.startswith('_'):
            return name
Return all matching object names
Compare the value of arg with the values of items in globals() and store names in a list. Note that I am filtering out names starting with '_'.
The result will consist of a list (for multiple matches), a string (for a single match), otherwise None. Of course you should adjust this behavior as needed.
def names_of_object(arg):
    results = [n for n, v in globals().items() if v is arg and not n.startswith('_')]
    return results[0] if len(results) == 1 else results if results else None
If you are looking to get the names of functions or lambdas or other function-like objects that are defined in the interpreter, you can use dill.source.getname from dill. It pretty much looks for the __name__ attribute, but in certain cases it knows other magic for how to find the name... or a name for the object. I don't want to get into an argument about finding the one true name for a Python object, whatever that means.
>>> from dill.source import getname
>>>
>>> def add(x,y):
...     return x+y
...
>>> squared = lambda x:x**2
>>>
>>> print getname(add)
'add'
>>> print getname(squared)
'squared'
>>>
>>> class Foo(object):
...     def bar(self, x):
...         return x*x+x
...
>>> f = Foo()
>>>
>>> print getname(f.bar)
'bar'
>>>
>>> woohoo = squared
>>> plus = add
>>> getname(woohoo)
'squared'
>>> getname(plus)
'add'
Use a reverse dict.
fun_dict = {'fun1': fun1,
'fun2': fun2,
'fun3': fun3}
r_dict = dict(zip(fun_dict.values(), fun_dict.keys()))
The reverse dict will map each function reference to the exact name you gave it in fun_dict, which may or may not be the name you used when you defined the function. And, this technique generalizes to other objects, including integers.
For extra fun and insanity, you can store the forward and reverse values in the same dict. I wouldn't do that if you were mapping strings to strings, but if you are doing something like function references and strings, it's not too crazy.
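A minimal sketch of the reverse-dict idea follows, assuming the values are hashable (functions are). The key names `"first"` and `"second"` are made up for illustration. It also shows one limitation: if two names map to the same object, only one of them survives in the reverse mapping.

```python
def fun1():
    pass


def fun2():
    pass


fun_dict = {"first": fun1, "second": fun2}

# Swap keys and values; the values must be hashable for this to work.
r_dict = {func: name for name, func in fun_dict.items()}
name_of_fun2 = r_dict[fun2]

# Two names for one object: the later entry overwrites the earlier one.
aliased = {"a": fun1, "b": fun1}
reverse_aliased = {func: name for name, func in aliased.items()}
```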
Note that while, as noted, objects in general do not and cannot know what variables are bound to them, functions defined with def do have names in the __name__ attribute (the name used in def). Also if the functions are defined in the same module (as in your example) then globals() will contain a superset of the dictionary you want.
def fun1():
    pass
def fun2():
    pass
def fun3():
    pass
fun_dict = {}
for f in [fun1, fun2, fun3]:
    fun_dict[f.__name__] = f
Here's another way to think about it. Suppose there were a name() function that returned the name of its argument. Given the following code:
def f(a):
    return a
b = "x"
c = b
d = f(c)
e = [f(b), f(c), f(d)]
What should name(e[2]) return, and why?
And the reason I want to have the name of the function is because I want to create fun_dict without writing the names of the functions twice, since that seems like a good way to create bugs.
For this purpose you have a wonderful getattr function, that allows you to get an object by known name. So you could do for example:
funcs.py:
def func1(): pass
def func2(): pass
main.py:
import funcs
option = command_line_option()
getattr(funcs, option)()
I know this is a late answer.
To get a function's name, you can use func.__name__.
To get the name of any Python object that has no name or __name__ attribute, you can iterate over the members of its module.
Example:
# package.module1.py
obj = MyClass()
# package.module2.py
import importlib
def get_obj_name(obj):
    mod = obj.__module__ # This is necessary to
    module = importlib.import_module(mod)
    for name, o in module.__dict__.items():
        if o == obj:
            return name
Performance note: don't use it in large modules.
Variable names can be found in the globals() and locals() dicts. But they won't give you what you're looking for above. "bla" will contain the value of each item of my_list, not the variable.
Generally when you are wanting to do something like this, you create a class to hold all of these functions and name them with some clear prefix cmd_ or the like. You then take the string from the command, and try to get that attribute from the class with the cmd_ prefixed to it. Now you only need to add a new function/method to the class, and it's available to your callers. And you can use the doc strings for automatically creating the help text.
As described in other answers, you may be able to do the same approach with globals() and regular functions in your module to more closely match what you asked for.
Something like this:
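class Tasks:
    def cmd_doit(self):
        # do it here
        pass
func_name = parse_commandline()
try:
    # getattr takes the object first, then the attribute name
    func = getattr(Tasks(), 'cmd_' + func_name)
except AttributeError:
    # bad command: exit or whatever
    raise SystemExit('bad command: ' + func_name)
func()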
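The prefix-based dispatch described above, including help text generated from docstrings, might look roughly like the sketch below. The method names, the `dispatch` and `help_text` helpers, and the choice to return `None` for unknown commands are all assumptions made for illustration.

```python
class Tasks:
    def cmd_greet(self):
        """Say hello."""
        return "hello"

    def cmd_version(self):
        """Report a version string."""
        return "1.0"


def dispatch(tasks, command):
    """Look up 'cmd_<command>' on tasks; return None for unknown commands."""
    handler = getattr(tasks, "cmd_" + command, None)
    if handler is None:
        return None
    return handler()


def help_text(tasks):
    """Build one help line per cmd_* method from its docstring."""
    lines = []
    for attr in sorted(dir(tasks)):
        if attr.startswith("cmd_"):
            doc = getattr(tasks, attr).__doc__ or ""
            lines.append(attr[len("cmd_"):] + ": " + doc)
    return lines
```

Adding a new command then only requires adding another `cmd_`-prefixed method to the class.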
I ran into this page while wondering the same question.
As others have noted, it's simple enough to just grab the __name__ attribute from a function in order to determine the name of the function. It's marginally trickier with objects that don't have a sane way to determine __name__, i.e. base/primitive objects like basestring instances, ints, longs, etc.
Long story short, you could probably use the inspect module to make an educated guess about which one it is, but you would have to probably know what frame you're working in/traverse down the stack to find the right one. But I'd hate to imagine how much fun this would be trying to deal with eval/exec'ed code.
% python2 whats_my_name_again.py
needle => ''b''
['a', 'b']
[]
needle => '<function foo at 0x289d08ec>'
['c']
['foo']
needle => '<function bar at 0x289d0bfc>'
['f', 'bar']
[]
needle => '<__main__.a_class instance at 0x289d3aac>'
['e', 'd']
[]
needle => '<function bar at 0x289d0bfc>'
['f', 'bar']
[]
%
whats_my_name_again.py:
#!/usr/bin/env python
import inspect

class a_class:
    def __init__(self):
        pass

def foo():
    def bar():
        pass
    a = 'b'
    b = 'b'
    c = foo
    d = a_class()
    e = d
    f = bar
    #print('globals', inspect.stack()[0][0].f_globals)
    #print('locals', inspect.stack()[0][0].f_locals)
    assert(inspect.stack()[0][0].f_globals == globals())
    assert(inspect.stack()[0][0].f_locals == locals())
    in_a_haystack = lambda: value == needle and key != 'needle'
    for needle in (a, foo, bar, d, f, ):
        print("needle => '%r'" % (needle, ))
        print([key for key, value in locals().iteritems() if in_a_haystack()])
        print([key for key, value in globals().iteritems() if in_a_haystack()])

foo()
You can define a class and add the __unicode__ special method to it, like this:
class example:
    def __init__(self, name):
        self.name = name
    def __unicode__(self):
        return self.name
Of course, you have to add the extra attribute self.name, which holds the name of the object.
Here is my answer; I am also using globals().items():
def get_name_of_obj(obj, except_word = ""):
    for name, item in globals().items():
        if item == obj and name != except_word:
            return name
I added except_word because I want to filter out the name of the loop variable itself.
If you don't pass it, the loop variable may confuse this function: a name like "each_item" in the following case may show up in the function's result, depending on what you have done in your loop.
e.g.
for each_item in [objA, objB, objC]:
    get_name_of_obj(each_item, "each_item")
eg.
>>> objA = [1, 2, 3]
>>> objB = ('a', {'b':'thi is B'}, 'c')
>>> for each_item in [objA, objB]:
...     get_name_of_obj(each_item)
...
'objA'
'objB'
>>>
>>>
>>> for each_item in [objA, objB]:
...     get_name_of_obj(each_item)
...
'objA'
'objB'
>>>
>>>
>>> objC = [{'a1':'a2'}]
>>>
>>> for item in [objA, objB, objC]:
...     get_name_of_obj(item)
...
'objA'
'item' <<<<<<<<<< --------- this is no good
'item'
>>> for item in [objA, objB]:
...     get_name_of_obj(item)
...
'objA'
'item' <<<<<<<<--------this is no good
>>>
>>> for item in [objA, objB, objC]:
...     get_name_of_obj(item, "item")
...
'objA'
'objB' <<<<<<<<<<--------- now it's ok
'objC'
>>>
Hope this can help.
Based on what it looks like you're trying to do you could use this approach.
In your case, your functions would all live in the module foo. Then you could:
import foo
func_name = parse_commandline()
method_to_call = getattr(foo, func_name)
result = method_to_call()
Or more succinctly:
import foo
result = getattr(foo, parse_commandline())()
Python has names which are mapped to objects in a hashmap called a namespace. At any instant in time, a name always refers to exactly one object, but a single object can be referred to by any arbitrary number of names. Given a name, it is very efficient for the hashmap to look up the single object which that name refers to. However given an object, which as mentioned can be referred to by multiple names, there is no efficient way to look up the names which refer to it. What you have to do is iterate through all the names in the namespace and check each one individually and see if it maps to your given object. This can easily be done with a list comprehension:
[k for k,v in locals().items() if v is myobj]
This will evaluate to a list of strings containing the names of all local "variables" which are currently mapped to the object myobj.
>>> a = 1
>>> this_is_also_a = a
>>> this_is_a = a
>>> b = "ligma"
>>> c = [2,3, 534]
>>> [k for k,v in locals().items() if v is a]
['a', 'this_is_also_a', 'this_is_a']
Of course locals() can be substituted with any dict that you want to search for names that point to a given object. Obviously this search can be slow for very large namespaces because they must be traversed in their entirety.
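To illustrate searching an arbitrary dict rather than `locals()`, here is a minimal sketch. The helper name `names_bound_to` and the sample namespace are hypothetical. It compares by identity (`is`), so an equal-looking but distinct object is not reported.

```python
def names_bound_to(obj, namespace):
    """Return, sorted, every name in `namespace` whose value is `obj` itself."""
    return sorted(name for name, value in namespace.items() if value is obj)


target = [1, 2, 3]
namespace = {
    "a": target,
    "alias": target,
    "lookalike": [1, 2, 3],  # equal to target, but a different object
}
found = names_bound_to(target, namespace)
not_found = names_bound_to(object(), namespace)
```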
Hi, one way to get the name of the variable that stores an instance of a class is to use the locals() function. It returns a dictionary that maps each variable name (as a string) to its value.

comparing elements of a list from an *args

I have a function in which I need to compare the strings in a list against *args.
The reason is that the user should be able to pass any words as the second argument. However, when I try to compare the strings against *args, it doesn't give me any results.
def title_case2(title, *minor_words):
    for x in title.split():
        if x in minor_words:
            print(x)
Assuming I ran the function with the parameters below. I was hoping it would display a and of since these words are found on those 2 entries.
title_case2('a clash of KINGS','a an the of')
*args is a tuple of arguments, so you're actually checking if x is in ('a an the of',). So either pass your argument as:
title_case2('a clash of KINGS', *'a an the of'.split())
Or, use this as your test:
if any(x in y for y in minor_words):
In either of the above cases the output is:
a
of
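The sketch below makes the tuple visible and compares the calls. The helper names `show_args` and `minor_words_seen` are made up for illustration, and the function returns a list instead of printing so the results can be inspected. It also shows a caveat of the `any(x in y ...)` test: with strings, `in` checks for substrings, so a fragment such as 'he' matches inside 'the'.

```python
def show_args(*args):
    """Return the tuple that *args collects, unchanged."""
    return args


def minor_words_seen(title, *minor_words):
    """Return the words of title that appear as separate items in minor_words."""
    return [word for word in title.split() if word in minor_words]


# One string argument gives a one-element tuple.
collected = show_args('a an the of')

# Passed as one string: no word equals the whole string 'a an the of'.
one_string = minor_words_seen('a clash of KINGS', 'a an the of')

# Split and unpacked: each minor word is its own tuple element.
unpacked = minor_words_seen('a clash of KINGS', *'a an the of'.split())

# Substring pitfall of `x in y` on strings: 'he' is found inside 'the'.
substring_hit = any('he' in y for y in ('a an the of',))
```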
This is one approach.
Ex:
def title_case2(title, *minor_words):
    minor_words = [j for i in minor_words for j in i.split()] #Create a flat list.
    for x in title.split():
        if x in minor_words:
            print(x)
title_case2('a clash of KINGS','a an the of', "Jam")
Using a for loop instead of a list comprehension:
def title_case2(title, *minor_words):
    minor_words_r = []
    for i in minor_words:
        for j in i.split():
            minor_words_r.append(j)
    for x in title.split():
        if x in minor_words_r:
            print(x)

Problem with calling a variable from one function into another

I am trying to call a variable from one function into another by using the command return, without success. This is the example code I have:
def G():
    x = 2
    y = 3
    g = x*y
    return g
def H():
    r = 2*G(g)
    print(r)
    return r
H()
When I run the code, I receive the following error: NameError: name 'g' is not defined
Thanks in advance!
Your function G() returns a value. When you call it, you should assign that returned value to a new variable.
So you could use the following code:
def H():
    g = G()
    r = 2*g
    print(r)
You don't need this statement:
return r
While you've accepted the answer above, I'd like to take the time to help you learn and clean up your code.
NameError: name 'g' is not defined
You're getting this error because g is a local variable of the function G()
Clean Version:
def multiple_two_numbers():
    """
    Multiplies two numbers
    Args:
        none
    Returns:
        product : the result of multiplying two numbers
    """
    x = 2
    y = 3
    product = x*y
    return product
def main():
    result = multiple_two_numbers()
    answer = 2 * result
    print(answer)
if __name__ == "__main__":
    # execute only if run as a script
    main()
Problems with your code:
Use clear variable and function names. g and G can be quite confusing to the reader.
You're not using the if __name__ == "__main__": guard.
The return in H() is unnecessary, as is the H() function itself.
Use docstrings to help make your code more readable.
Questions from the comments:
I have one question what if I had two or more variables in the first
function but I only want to call one of them
Your function can have as many variables as you want. If you want to return more than one value, you can use a dictionary (key, value), a list, or a tuple. It all depends on your requirements.
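As a rough sketch of those options (the function and key names are invented for illustration), a function can return a tuple that the caller unpacks, or a dictionary from which the caller picks only the value it needs:

```python
def rectangle_stats(width, height):
    """Return two values at once as a tuple."""
    area = width * height
    perimeter = 2 * (width + height)
    return area, perimeter


def rectangle_stats_dict(width, height):
    """Return the same values under descriptive keys."""
    return {"area": width * height, "perimeter": 2 * (width + height)}


# Unpack both values, or ignore one with a throwaway name.
area, perimeter = rectangle_stats(2, 3)
area_only, _ = rectangle_stats(2, 3)

# With a dictionary, read just the key you care about.
perimeter_only = rectangle_stats_dict(2, 3)["perimeter"]
```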
Is it necessary to give different names, a and b, to the new
variables or can I use the same x and g?
Absolutely! Declaring another variable called x or y in the same scope will cause the previous value to be overwritten. This can make the code hard to debug, and you and readers of your code will be frustrated.

Python generator that returns group of items

I am trying to make a generator that can return a number of consecutive items in a list which "moves" only by one index. Something similar to a moving average filter in DSP. For instance if I have list:
l = [1,2,3,4,5,6,7,8,9]
I would expect this output:
[(1,2,3),(2,3,4),(3,4,5),(4,5,6),(5,6,7),(6,7,8),(7,8,9)]
I have made code but it does not work with filters and generators etc. I am afraid it will also break due to memory if I need to provide a large list of words.
Function gen:
def gen(enumobj, n):
    for idx,val in enumerate(enumobj):
        try:
            yield tuple(enumobj[i] for i in range(idx, idx + n))
        except:
            break
and the example code:
words = ['aaa','bb','c','dddddd','eeee','ff','g','h','iiiii','jjj','kk','lll','m','m','ooo']
w = filter(lambda x: len(x) > 1, words)
# It's working with list
print('\nList:')
g = gen(words, 4)
for i in g: print(i)
# It's not working with filters / generators etc.
print('\nFilter:')
g = gen(w, 4)
for i in g: print(i)
The filter case does not produce anything. The code breaks there because it is not possible to index a filter object. Of course, one answer is to force a list: list(w). However, I am looking for a better implementation of the function. How can I change it so that it accepts filters and other iterators as well? I am worried about memory use when the list holds a huge amount of data.
Thanks
With iterators you need to keep track of values that have already been read. An n sized list does the trick. Append the next value to the list and discard the top item after each yield.
import itertools
def gen(enumobj, n):
    # we need an iterator for the `next` call below. this creates
    # an iterator from an iterable such as a list, but leaves
    # iterators alone.
    enumobj = iter(enumobj)
    # cache the first n objects (fewer if iterator is exhausted)
    cache = list(itertools.islice(enumobj, n))
    # while we still have something in the cache...
    while cache:
        # yield a tuple copy so later changes to the cache do not
        # alter groups that were already handed out
        yield tuple(cache)
        # drop stale item
        cache.pop(0)
        # try to get one new item, stopping when iterator is done
        try:
            cache.append(next(enumobj))
        except StopIteration:
            # pass to emit progressively smaller units
            #pass
            # break to stop when fewer than `n` items remain
            break
words = ['aaa','bb','c','dddddd','eeee','ff','g','h','iiiii','jjj','kk','lll','m','m','ooo']
w = filter(lambda x: len(x) > 1, words)
# It's working with list
print('\nList:')
g = gen(words, 4)
for i in g: print(i)
# now it works with iterators
print('\nFilter:')
g = gen(w, 4)
for i in g: print(i)
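A variant of the same idea, sketched here with `collections.deque` instead of a list: a deque created with `maxlen=n` drops its oldest item automatically on each append, so no explicit `pop(0)` is needed. The name `sliding_window` is chosen for illustration, and unlike the version above this sketch yields nothing at all when the input has fewer than `n` items.

```python
from collections import deque
from itertools import islice


def sliding_window(iterable, n):
    """Yield overlapping tuples of n consecutive items from any iterable."""
    iterator = iter(iterable)
    # Fill the window with the first n items (fewer if the input is short).
    window = deque(islice(iterator, n), maxlen=n)
    if len(window) == n:
        yield tuple(window)
    for item in iterator:
        # Appending to a full deque discards the oldest item on the left.
        window.append(item)
        yield tuple(window)


from_list = list(sliding_window([1, 2, 3, 4, 5, 6, 7, 8, 9], 3))

# Works on a lazy filter object too; only n items are held at a time.
odd_numbers = filter(lambda value: value % 2, range(1, 10))
from_filter = list(sliding_window(odd_numbers, 3))
```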

Python - Create a recursion function

My question is basically this: create a recursive function that takes a nested list as a parameter and returns the sub-list that has the minimum difference between its maximum and minimum elements.
For example, the function should return [1,2] for the input [[1,199,59],[1,2],[3,8]].
I searched Google and Stack Overflow, but I could not find this specific example.
What I would like help with is the iteration. I want to use recursion to iterate over each sub-list (there can be any number of them). I have achieved this with a for loop, but I cannot grasp the idea of iterating by recursion.
So far, I have this:
def sublist(mylist):
    if len(mylist) == 0:
        return []
    elif len(mylist) == 1:
        return mylist
    else:
        a = (mylist[0][0]) - (mylist[0][-1])
        if a < sublist(mylist[1:]):
            return mylist[0]
sublist([[1,199,58],[1,2],[3,8]])
I know this part, sublist(mylist[1:]), is clearly wrong. I'm trying to compare the value a with the values from mylist[1:]. I would much appreciate advice here.
Updated:
def differences(mylist):
    diff = max(mylist) - min(mylist)
    return diff
def sublist(nestedlist):
    if len(nestedlist) == 1:
        return nestedlist[0]
    else:
        if differences(nestedlist[0]) < differences(sublist(nestedlist[1:])):
            return nestedlist[0]
        else:
            return sublist(nestedlist[1:])
print(sublist([[1,199,59],[1,2],[3,8]]))
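The updated version works, but it can call sublist(nestedlist[1:]) twice per level, which repeats work as the list grows. A minimal sketch of the same recursion that stores the recursive result in a variable and reuses it is shown below. The names `spread` and `min_spread_sublist` are chosen for illustration, and a non-empty input is assumed.

```python
def spread(values):
    """Difference between the largest and smallest element of a list."""
    return max(values) - min(values)


def min_spread_sublist(nested):
    """Recursively pick the sub-list with the smallest spread.

    Assumes `nested` contains at least one non-empty sub-list.
    """
    if len(nested) == 1:
        return nested[0]
    # Recurse once on the tail and reuse the result.
    best_of_rest = min_spread_sublist(nested[1:])
    if spread(nested[0]) < spread(best_of_rest):
        return nested[0]
    return best_of_rest


result = min_spread_sublist([[1, 199, 59], [1, 2], [3, 8]])
```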
I am assuming that you want to use recursion for the first level of the list. So, without giving you 100% of the code, you have to do something like this:
1) Create a function, e.g. differences(list), that calculates the difference for a list and returns a list holding that difference and the parameter list, i.e. differences([1,2]) should return [1, [1,2]]. Call it once on the first sublist, i.e. min = differences(mylist[0]).
2) Create your sublist function like this:
def sublist(initial_list):
    # 1) call differences() for the first sublist of 'initial_list'
    # 2) update 'min' with differences(initial_list[0]) if differences(initial_list[0])[0] < min[0]
    # 3) call sublist() again, now removing the sublist you checked before from the argument
    # 4) (the following should be at the start of your sublist() function)
    if len(initial_list) == 1:
        if differences(initial_list) < min:
            return initial_list
        else: return min[1]
Hope that helps.

Resources