Changing the values in a dict, changes the values of the original returned dict - python-3.x

def data_query(Chan, Mode, Format, Sampling, Wave_Data):
    if Mode.get_state() == 'NORM':
        if Chan.get_state() == 'CHAN1':
            wave_dict = Wave_Data.get_wave_data(1)
            if Format.get_state() == 'ASCII':
                return wave_dict
            elif Format.get_state() == 'BYTE':
                for i in range(0, len(wave_dict)):
                    wave_dict[i] = bin(int(wave_dict[i]))
                return wave_dict
So in the code above, the parameter 'Wave_Data' is an instance of another class which holds the value of a dict 'self.wave1' which is returned by the function 'get_wave_data'.
def get_wave_data(self, channel=1):
    if channel == 1:
        return self.wave1
    elif channel == 2:
        pass
My problem is that in the code above, when I make changes to the values in the local dict 'wave_dict' (i.e. convert the values to binary), it also changes the values in self.wave1. If I understand this correctly, it's acting as a pointer to the self.wave1 object (which I am streaming using UDP sockets via another thread) rather than as a normal local variable.
Btw, the first code block is a function in the main thread and the second code block is a function in a class that is running as a daemon thread, the instance of which is also passed in the 'data_query' function.
Any help would be appreciated. Sorry if I've used wrong terminology anywhere.

I fixed this by creating an array and appending the hex(dict values) to this array, then returning the array instead of the dict.
Then I handle this on the receiving end by try-except to accept either a dict or a list:
try:
    Wdata = list(Wdata_dict.values())
except:
    Wdata = Wdata_dict
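The underlying cause is that get_wave_data returns a reference to self.wave1, so assigning into wave_dict writes into the same object. A minimal sketch of an alternative fix, assuming the values are numeric: build a new dict for the converted values and leave the original alone. The WaveData class and to_binary function here are hypothetical stand-ins, not the asker's actual code.

```python
class WaveData:
    """Hypothetical stand-in for the class that owns self.wave1."""

    def __init__(self):
        self.wave1 = {0: 5, 1: 12, 2: 7}

    def get_wave_data(self, channel=1):
        if channel == 1:
            return self.wave1  # a reference to the same dict, not a copy


def to_binary(wave_data):
    # Build a new dict instead of assigning into the one that was returned,
    # so the object held by wave_data is never modified here.
    source = wave_data.get_wave_data(1)
    return {key: bin(int(value)) for key, value in source.items()}


wd = WaveData()
converted = to_binary(wd)
print(converted)  # {0: '0b101', 1: '0b1100', 2: '0b111'}
print(wd.wave1)   # {0: 5, 1: 12, 2: 7}
```

If the other thread can add or remove keys while this runs, the iteration may also need to be protected by a lock; that depends on details not shown in the question.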

Related

.get_dummies() works alone but doesnt save within function

I have a dataset and I want to make a function that does the .get_dummies() so I can use it in a pipeline for specific columns.
When I run dataset = pd.get_dummies(dataset, columns=['Embarked','Sex'], drop_first=True)
alone it works, as in, when I run df.head() I can still see the dummified columns but when I have a function like this,
def dummies(df):
    df = pd.get_dummies(df, columns=['Embarked','Sex'], drop_first=True)
    return df
Once I run dummies(dataset) it shows me the dummified columns in that same cell, but when I then run dataset.head() it isn't dummified anymore.
What am I doing wrong?
thanks.
You should assign the result of the function back to the variable; call the function like this:
dataset=dummies(dataset)
Functions have their own independent namespace for the variables defined there, either in the signature or in the body,
for example
a = 0
def fun(a):
    a=23
    return a
fun(a)
print("a is",a) #a is 0
Here you might think that a will have the value 23 at the end, but that is not the case, because the a inside fun is not the same a as the one outside. When you call fun(a), you pass the function a reference to the object in memory, so the inner a starts out referring to the same object and thus has the same value.
With a=23 you rebind the inner a to a different object, 23; the outer a is untouched.
And fun(a) returns a value, but unless that result is saved somewhere it is lost.
To update the variable outside, you need to reassign it to the result of the function:
a = 0
def fun(a):
    a=23
    return a
a = fun(a)
print("a is",a) #a is 23
which in your case would be dataset=dummies(dataset)
If you want your function to make in-place changes to the object it receives, you can't use =; you need to use something the object itself provides for in-place modification. For example,
this would not work:
a = []
def fun2(a):
    a=[23]
    return a
fun2(a)
print("a is",a) #a is []
but this would
a = []
def fun2(a):
    a.append(23)
    return a
fun2(a)
print("a is",a) #a is [23]
because we are using an in-place modification method that the object provides, in this example the append method of list.
But such in-place modification can produce unforeseen results, especially if the object being modified is shared between processes, so I would rather recommend the previous approach.

Dictionary to switch between methods with different arguments

A common workaround for the lack of a case/switch statement in python is the use of a dictionary. I am trying to use this to switch between methods as shown below, but the methods have different argument sets and it's unclear how I can accommodate that.
def method_A():
    pass
def method_B():
    pass
def method_C():
    pass
def method_D():
    pass
def my_function(arg = 1):
    switch = {
        1: method_A,
        2: method_B,
        3: method_C,
        4: method_D
    }
    option = switch.get(arg)
    return option()
my_function(input) #input would be read from file or command line
If I understand correctly, the dictionary keys become associated with the different methods, so calling my_function subsequently calls the method which corresponds to the key I gave as input. But that leaves no opportunity to pass any arguments to those subsequent methods. I can use default values, but that really isn't the point. The alternative is nested if-else statements to choose, which doesn't have this problem but arguably less readable and less elegant.
Thanks in advance for your help.
The trick is to pass the arguments into my_function as a dictionary (in the spirit of *args and **kwargs) and hand that dictionary on to your chosen function, which picks out the values it needs.
def method_A(w):
    print(w.get("what")) # uses the value of key "what"
def method_B(w):
    print(w.get("whatnot","Not provided")) # uses another keys value
def my_function(args,kwargs):
    arg = kwargs.get("arg",1) # get the arg value or default to 1
    switch = {
        1: method_A,
        2: method_B,
    }
    option = switch.get(arg)
    return option(kwargs)
my_function(None, {"arg":1, "what":"hello"} ) # could provide 1 or 2 as 1st param
my_function(None, {"arg":2, "what":"hello"} )
Output:
hello
Not provided
See Use of *args and **kwargs for more on it.
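The answer above passes a plain dictionary around. A minimal sketch of the same dispatch idea using actual *args and **kwargs forwarding, so each method keeps its own ordinary signature, might look like this. The names method_a, method_b, SWITCH and the parameters what, whatnot and count are made up for illustration.

```python
def method_a(what):
    return f"A got {what}"


def method_b(whatnot="Not provided", count=1):
    return f"B got {whatnot} x{count}"


# Map each option to the function that handles it.
SWITCH = {1: method_a, 2: method_b}


def my_function(option, *args, **kwargs):
    handler = SWITCH.get(option)
    if handler is None:
        raise ValueError(f"unknown option: {option!r}")
    # Forward whatever was passed; the handler's own signature decides
    # which arguments are valid.
    return handler(*args, **kwargs)


print(my_function(1, "hello"))               # A got hello
print(my_function(2, whatnot="x", count=3))  # B got x x3
```

Passing an argument the chosen handler does not accept raises a TypeError at the call, which is usually preferable to silently ignoring it.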

Write a recursive function to list all paths of parts.txt

Write a function list_files_recursive that returns a list of the paths of all the parts.txt files without using the os module's walk generator. Instead, the function should use recursion. The input will be a directory name.
Here is the code I have so far and I think it's basically right, but what's happening is that the output is not one whole list?
def list_files_recursive(top_dir):
    rec_list_files = []
    list_dir = os.listdir(top_dir)
    for item in list_dir:
        item_path = os.path.join(top_dir, item)
        if os.path.isdir(item_path):
            list_files_recursive(item_path)
        else:
            if os.path.basename(item_path) == 'parts.txt':
                rec_list_files.append(os.path.join(item_path))
    print(rec_list_files)
    return rec_list_files
This is part of the output I'm getting (from the print statement):
['CarItems/Honda/Accord/1996/parts.txt']
[]
['CarItems/Honda/Odyssey/2000/parts.txt']
['CarItems/Honda/Odyssey/2002/parts.txt']
[]
So the problem is that it's not one list and that there are empty lists in there. I don't quite know why this isn't working and have tried everything to work through it. Any help is much appreciated on this!
This is very close, but the issue is that list_files_recursive's child calls don't pass results back to the parent. One way to do this is to concatenate all of the lists together from each child call, or to pass a reference to a single list all the way through the call chain.
Note that in rec_list_files.append(os.path.join(item_path)), there's no point in os.path.join with only a single parameter. print(rec_list_files) should be omitted as a side effect that makes the output confusing to interpret--only print in the caller. Additionally,
else:
    if ... :
can be more clearly written here as elif: since they're logically equivalent. It's always a good idea to reduce nesting of conditionals whenever possible.
Here's the approach that works by extending the parent list:
import os
def list_files_recursive(top_dir):
    files = []
    for item in os.listdir(top_dir):
        item_path = os.path.join(top_dir, item)
        if os.path.isdir(item_path):
            files.extend(list_files_recursive(item_path))
            #     ^^^^^^ add child results to parent
        elif os.path.basename(item_path) == "parts.txt":
            files.append(item_path)
    return files
if __name__ == "__main__":
    print(list_files_recursive("foo"))
Or by passing a result list through the call tree:
import os
def list_files_recursive(top_dir, files=None):
    if files is None: # a files=[] default would be shared between separate top-level calls
        files = []
    for item in os.listdir(top_dir):
        item_path = os.path.join(top_dir, item)
        if os.path.isdir(item_path):
            list_files_recursive(item_path, files)
            #                               ^^^^^ pass our result list recursively
        elif os.path.basename(item_path) == "parts.txt":
            files.append(item_path)
    return files
if __name__ == "__main__":
    print(list_files_recursive("foo"))
A major problem with these functions is that they only work for finding files named precisely parts.txt, since that string literal is hard-coded. That makes them pretty much useless for anything but the immediate purpose. We should add a parameter that lets the caller specify the target file to search for, making the function general-purpose.
Another problem is that the function doesn't do what its name claims: list_files_recursive should really be called find_file_recursive, or, due to the hardcoded string, find_parts_txt_recursive.
Beyond that, the function is a strong candidate for turning into a generator function, which is a common Python idiom for traversal, particularly for situations where the subdirectories may contain huge amounts of data that would be expensive to keep in memory all at once. Generators also allow the flexibility of using the function to cancel the search after the first match, further enhancing its (re)usability.
The yield keyword also makes the function code itself very clean--we can avoid the problem of keeping a result data structure entirely and just fire off result items on demand.
Here's how I'd write it:
import os
def find_file_recursive(top_dir, target):
    for item in os.listdir(top_dir):
        item_path = os.path.join(top_dir, item)
        if os.path.isdir(item_path):
            yield from find_file_recursive(item_path, target)
        elif os.path.basename(item_path) == target:
            yield item_path
if __name__ == "__main__":
    print(list(find_file_recursive("foo", "parts.txt")))
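To illustrate the point about cancelling the search after the first match, here is a small self-contained sketch. It repeats the generator and runs it against a throwaway directory tree built with tempfile; the directory names a and b are arbitrary. next() with a default pulls at most one result and returns None when nothing matches.

```python
import os
import tempfile


def find_file_recursive(top_dir, target):
    for item in os.listdir(top_dir):
        item_path = os.path.join(top_dir, item)
        if os.path.isdir(item_path):
            yield from find_file_recursive(item_path, target)
        elif item == target:
            yield item_path


with tempfile.TemporaryDirectory() as root:
    # Build a tiny tree: root/a/parts.txt and root/b/parts.txt
    for sub in ("a", "b"):
        os.makedirs(os.path.join(root, sub))
        with open(os.path.join(root, sub, "parts.txt"), "w") as fh:
            fh.write("")

    # Stop at the first match; the rest of the tree is never visited.
    first = next(find_file_recursive(root, "parts.txt"), None)
    # Or consume everything when the full list is wanted.
    all_matches = sorted(find_file_recursive(root, "parts.txt"))
    # No match: the default is returned instead of raising StopIteration.
    missing = next(find_file_recursive(root, "nope.txt"), None)

print(first is not None, len(all_matches), missing)
```

Which match comes first depends on the order os.listdir returns entries, which is not guaranteed.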

Turning if statement into a single line while setting a variable [duplicate]

I need a way to get a dictionary value if its key exists, or simply return None, if it does not.
However, Python raises a KeyError exception if you search for a key that does not exist. I know that I can check for the key, but I am looking for something more explicit. Is there a way to just return None if the key does not exist?
You can use dict.get()
value = d.get(key)
which will return None if key is not in d. You can also provide a different default value that will be returned instead of None:
value = d.get(key, "empty")
Wonder no more. It's built into the language.
>>> help(dict)
Help on class dict in module builtins:
class dict(object)
| dict() -> new empty dictionary
| dict(mapping) -> new dictionary initialized from a mapping object's
| (key, value) pairs
...
|
| get(...)
| D.get(k[,d]) -> D[k] if k in D, else d. d defaults to None.
|
...
Use dict.get
Returns the value for key if key is in the dictionary, else default. If default is not given, it defaults to None, so that this method never raises a KeyError.
You should use the get() method from the dict class
d = {}
r = d.get('missing_key', None)
This will result in r == None. If the key isn't found in the dictionary, the get function returns the second argument.
If you want a more transparent solution, you can subclass dict to get this behavior:
class NoneDict(dict):
    def __getitem__(self, key):
        return dict.get(self, key)
>>> foo = NoneDict([(1,"asdf"), (2,"qwerty")])
>>> foo[1]
'asdf'
>>> foo[2]
'qwerty'
>>> foo[3] is None
True
I usually use a defaultdict for situations like this. You supply a factory method that takes no arguments and creates a value when it sees a new key. It's more useful when you want to return something like an empty list on new keys (see the examples).
from collections import defaultdict
d = defaultdict(lambda: None)
print(d['new_key']) # prints 'None'
A one line solution would be:
item['key'] if 'key' in item else None
This is useful when trying to add dictionary values to a new list and want to provide a default:
eg.
row = [item['key'] if 'key' in item else 'default_value']
As others have said above, you can use get().
But to check for a key, you can also do:
d = {}
if 'keyname' in d:
    # d['keyname'] exists
    pass
else:
    # d['keyname'] does not exist
    pass
You could use a dict object's get() method, as others have already suggested. Alternatively, depending on exactly what you're doing, you might be able use a try/except suite like this:
try:
    <to do something with d[key]>
except KeyError:
    <deal with it not being there>
Which is considered to be a very "Pythonic" approach to handling the case.
For those using the dict.get technique for nested dictionaries, instead of explicitly checking for every level of the dictionary, or extending the dict class, you can set the default return value to an empty dictionary except for the out-most level. Here's an example:
my_dict = {'level_1': {
    'level_2': {
        'level_3': 'more_data'
    }
}
}
result = my_dict.get('level_1', {}).get('level_2', {}).get('level_3')
# result -> 'more_data'
none_result = my_dict.get('level_1', {}).get('what_level', {}).get('level_3')
# none_result -> None
WARNING: Please note that this technique only works if the expected key's value is a dictionary. If the key what_level did exist in the dictionary but its value was a string or integer etc., then it would've raised an AttributeError.
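One way to sidestep the AttributeError described in the warning is a small helper that checks each level before descending. This is a sketch, not a standard-library function; the name deep_get and its default parameter are made up here.

```python
def deep_get(mapping, *keys, default=None):
    """Follow keys through nested dicts; return default if any step fails."""
    current = mapping
    for key in keys:
        # Stop as soon as we hit a non-dict value or a missing key.
        if not isinstance(current, dict) or key not in current:
            return default
        current = current[key]
    return current


my_dict = {'level_1': {'level_2': {'level_3': 'more_data'}, 'flat': 'text'}}

print(deep_get(my_dict, 'level_1', 'level_2', 'level_3'))     # more_data
print(deep_get(my_dict, 'level_1', 'what_level', 'level_3'))  # None
print(deep_get(my_dict, 'level_1', 'flat', 'level_3'))        # None
```

The last call is the case the chained .get() version cannot handle, because 'flat' maps to a string rather than a dictionary.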
I was taken aback by what was possible in Python 2 vs. Python 3. I will answer based on what I ended up doing for Python 3. My objective was simple: check whether a JSON response in dictionary format contained an error. My dictionary is called "token" and the key I am looking for is "error". I look up the key "error", defaulting to None if it is not there, then check whether the value is None; if so, I proceed with my code. An else statement would handle the case where the key "error" is present.
if token.get('error', None) is None:
    pass # do something
You can use a try-except block. A missing dictionary key raises KeyError (not IndexError):
try:
    value = d['keyname']
except KeyError:
    value = None
d1={"One":1,"Two":2,"Three":3}
d1.get("Four")
If you run this code there will be no KeyError, which means you can use dict.get() to avoid the error and carry on executing your code.
If you have a more complex requirement that equates to a cache, this class might come in handy:
class Cache(dict):
    """ Provide a dictionary based cache
    Pass a function to the constructor that accepts a key and returns
    a value. This function will be called exactly once for any key
    required of the cache.
    """
    def __init__(self, fn):
        super().__init__()
        self._fn = fn
    def __getitem__(self, key):
        try:
            return super().__getitem__(key)
        except KeyError:
            value = self[key] = self._fn(key)
            return value
The constructor takes a function that is called with the key and should return the value for the dictionary. This value is then stored and retrieved from the dictionary next time. Use it like this...
def get_from_database(name):
    # Do expensive thing to retrieve the value from somewhere
    value = ... # placeholder for the real lookup
    return value
answer = Cache(get_from_database)
x = answer[42] # Gets the value from the database
x = answer[42] # Gets the value directly from the dictionary
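The same caching idea can also be sketched with dict's __missing__ hook, which dict.__getitem__ calls on subclasses when a key is absent. This is an alternative to the try/except version, not the answerer's code; slow_square and the calls list are stand-ins used to show that the function runs only once per key.

```python
class Cache(dict):
    """Dictionary-based cache that computes missing values on demand."""

    def __init__(self, fn):
        super().__init__()
        self._fn = fn

    def __missing__(self, key):
        # Called by dict.__getitem__ only when key is not present.
        value = self[key] = self._fn(key)
        return value


calls = []


def slow_square(n):
    calls.append(n)  # record each real computation
    return n * n


answer = Cache(slow_square)
print(answer[4])  # 16, computed by slow_square
print(answer[4])  # 16, served from the dictionary
print(calls)      # [4]
```

Note that answer.get(5) does not trigger __missing__, so it returns None without computing anything.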
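If a plain True/False is enough, note that the hasattr built-in function checks for attributes, not dictionary keys, so it is not a substitute for a key test; use in for that:
e=dict()
hasattr(e, 'message') # False: a dict has no attribute named 'message'
'message' in e # False: the key is not in the dictionary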

How to modify the signature of a function dynamically

I am writing a framework in Python. When a user declares a function, they do:
def foo(row, fetch=stuff, query=otherStuff)
def bar(row, query=stuff)
def bar2(row)
When the backend sees query= value, it executes the function with the query argument depending on value. This way the function has access to the result of something done by the backend in its scope.
Currently I build my arguments each time by checking whether query, fetch and the other items are None, and launching it with a set of args that exactly matches what the user asked for. Otherwise I got the "got an unexpected keyword argument" error. This is the code in the backend:
#fetch and query are computed by the backend
if fetch is None and query is None:
    userfunction(row)
elif fetch is None:
    userfunction(row, query=query)
elif query is None:
    userfunction(row, fetch=fetch)
else:
    userfunction(row, fetch=fetch, query=query)
This is not good; for each additional "service" the backend offers, I need to write all the combinations with the previous ones.
Instead of that I would like to primarily take the function and manually add a named parameter, before executing it, removing all the unnecessary code that does these checks. Then the user would just use the stuff it really wanted.
I don't want the user to have to modify the function by adding stuff it doesn't want (nor do I want them to specify a kwarg every time).
So I would like an example of this if this is doable, a function addNamedVar(name, function) that adds the variable name to the function function.
I want to do that that way because the users functions are called a lot of times, meaning that it would trigger me to, for example, create a dict of the named var of the function (with inspect) and then using **dict. I would really like to just modify the function once to avoid any kind of overhead.
This is indeed doable with the AST, and that's what I am going to do because that solution suits my use case better. However, you could do what I asked more simply with a function-cloning approach like the code snippet below. Note that this code returns the same function with different default values. You can use this code as an example to do whatever you want.
This works for python3
import inspect
import types
def copyTransform(f, name, **args):
    signature=inspect.signature(f)
    params= list(signature.parameters)
    numberOfParam= len(params)
    listTuple= list(f.__defaults__ or ())
    numberOfDefault= len(listTuple)
    firstDefault= numberOfParam - numberOfDefault # index of the first parameter that has a default
    for key,val in args.items():
        toChangeIndex = params.index(key, firstDefault) # raises ValueError if key has no default
        listTuple[toChangeIndex - firstDefault]=val
    newTuple= tuple(listTuple)
    oldCode=f.__code__
    # Positional arguments of types.CodeType as they were before Python 3.8;
    # later versions changed this constructor.
    newCode= types.CodeType(
        oldCode.co_argcount, # integer
        oldCode.co_kwonlyargcount, # integer
        oldCode.co_nlocals, # integer
        oldCode.co_stacksize, # integer
        oldCode.co_flags, # integer
        oldCode.co_code, # bytes
        oldCode.co_consts, # tuple
        oldCode.co_names, # tuple
        oldCode.co_varnames, # tuple
        oldCode.co_filename, # string
        name, # string
        oldCode.co_firstlineno, # integer
        oldCode.co_lnotab, # bytes
        oldCode.co_freevars, # tuple
        oldCode.co_cellvars # tuple
    )
    newFunction=types.FunctionType(newCode, f.__globals__, name, newTuple, f.__closure__)
    newFunction.__qualname__=name #also needed for serialization
    return newFunction
You need to do that weird stuff with the names if you want to Pickle your clone function.
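On Python 3.8 and later, code objects have a replace() method, so the cloning idea can be sketched without spelling out every CodeType field. This is a minimal sketch under two assumptions: the function has only plain positional-or-keyword parameters (no *args, **kwargs or keyword-only parameters), and every keyword passed names a parameter that already has a default. The names clone_with_defaults, foo and foo_q are illustrative.

```python
import inspect
import types


def clone_with_defaults(f, name, **new_defaults):
    """Return a copy of f called `name` with some default values replaced."""
    params = list(inspect.signature(f).parameters)
    defaults = list(f.__defaults__ or ())
    first_default = len(params) - len(defaults)  # index of first defaulted param
    for key, val in new_defaults.items():
        # ValueError here means `key` is not a parameter with a default.
        idx = params.index(key, first_default)
        defaults[idx - first_default] = val
    new_code = f.__code__.replace(co_name=name)  # Python 3.8+
    clone = types.FunctionType(
        new_code, f.__globals__, name, tuple(defaults), f.__closure__
    )
    clone.__qualname__ = name
    return clone


def foo(row, fetch=None, query=None):
    return row, fetch, query


foo_q = clone_with_defaults(foo, "foo_q", query="SELECT 1")
print(foo_q(1))        # (1, None, 'SELECT 1')
print(foo(1))          # (1, None, None)
print(foo_q.__name__)  # foo_q
```

The original function is left untouched, so the backend can build one clone per user function up front and call the clone from then on.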
