Python calling mock with "=" not called result - python-3.x

I am trying to mock the following call:
df_x = method() # returns a pandas dataframe
df_x.loc[df_x['atr'] < 0, 'atr'] = 0
I have mocked the method so that it returns a MagicMock, and set a default return value on the MagicMock's __getitem__ attribute like this:
mock_df_x = mock_method.return_value
mock_df_x.__getitem__.return_value = 0
The problem is when I try asserting the call:
mock_df_x.loc.__getitem__.assert_called_with(False, 'atr')
I get a "not called" error. If I run the line without the "= 0" part, like this, the assertion works:
df_x.loc[df_x['atr'] < 0, 'atr']

The reason you are seeing different behavior depending on whether you have = 0 at the end of the call you are testing is that, in Python's data model, the two forms correspond to two different magic methods: __getitem__ and __setitem__.
This makes sense because, for example, some_dictionary['nonexistent_key'] raises KeyError, whereas some_dictionary['nonexistent_key'] = 1 doesn't, and sets the value as expected.
Now, in order to fix your test, you only need to change your assertion from:
mock_df_x.loc.__getitem__.assert_called_with((False, 'atr'))
which only works if you are accessing the key, to:
mock_df_x.loc.__setitem__.assert_called_with((False, 'atr'), 0)
which works if you are trying to assign a value to that key.
Notice the extra parameter, too, corresponding to the value you are actually trying to assign.
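As a minimal sketch of the point above, the following uses a bare MagicMock in place of the mocked DataFrame (there is no pandas involved, and the setup mirrors the question rather than any real test suite). It shows that the assignment form is recorded on __setitem__, with the key passed as a single tuple and the assigned value as the second argument:

```python
from unittest.mock import MagicMock

# Stand-in for mock_method.return_value from the question.
mock_df_x = MagicMock()
mock_df_x.__getitem__.return_value = 0  # so mock_df_x['atr'] evaluates to 0

# The line under test: 0 < 0 is False, so the key is the tuple (False, 'atr').
mock_df_x.loc[mock_df_x['atr'] < 0, 'atr'] = 0

# Assignment goes through __setitem__(key, value), not __getitem__(key).
mock_df_x.loc.__setitem__.assert_called_with((False, 'atr'), 0)
mock_df_x.loc.__getitem__.assert_not_called()
```

Both assertions pass silently, whereas asserting on mock_df_x.loc.__getitem__ being called would fail for this line.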

Related

What's the difference between the method .get() and the method .get in python? Both are appliable to dictionaries

Imagine I have a dict.
d = {'a': 1, 'b': 3}
I'm having a hard time understanding the difference between d.get and d.get().
I know that d.get() gets the value for the key, like this:
print(d.get('a') )
output: 1
But when I write d.get, it shows this:
print(d.get)
output: <built-in method get of dict object at .........>
What is 'd.get' doing in my code?
I'm using Python 3.x
A method is just an attribute of an object that happens to be callable (here, a built-in method object). The output you see is what you get when you call print() on any function object: a concise string representation that Python creates for the function.
Actually calling a function is done with parentheses: d.get('a'), which means to execute the behavior the function refers to. It doesn't especially matter where the function is, though: I could do the following, and it would still work:
d = {'a': 1 , 'b':3}
freefunc = d.get
freefunc('a')
This is what the term "first-class functions" refers to when people compare Python to something like Java. An entire function can be stored in a variable and treated no differently than any other variable or attribute.
The short answer? There is no difference between the two methods. They are the same exact method.
The difference in your code is that when you write .get() you call the method, but when you write .get you only get a reference to the method object, which you can call later if needed.
In the first scenario, you are calling print on the result of executing get('a'), which in this case is 1.
In your second scenario, you are calling print on the get method itself rather than on the result of calling it, so you see its string representation, i.e. <built-in method get of dict object at ...>.
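To make the distinction concrete, here is a small sketch (the variable names are only illustrative). It stores d.get without calling it, checks that the stored object still belongs to the same dict, and only then calls it:

```python
d = {'a': 1, 'b': 3}

bound = d.get            # no parentheses: nothing is called yet
print(callable(bound))   # True
print(bound.__self__ is d)  # True: the method object remembers its dict

result = bound('a')      # parentheses: now the method actually runs
print(result)            # 1

d['c'] = 5               # later changes to d are visible through bound
print(bound('c'))        # 5
print(bound('missing', 0))  # 0, the default when the key is absent
```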

.get_dummies() works alone but doesn't save within function

I have a dataset and I want to make a function that does the .get_dummies() so I can use it in a pipeline for specific columns.
When I run dataset = pd.get_dummies(dataset, columns=['Embarked','Sex'], drop_first=True)
alone it works, as in, when I run df.head() I can still see the dummified columns, but when I have a function like this,
def dummies(df):
    df = pd.get_dummies(df, columns=['Embarked','Sex'], drop_first=True)
    return df
once I run dummies(dataset) it shows me the dummified columns in that same cell, but when I run dataset.head() it isn't dummified anymore.
What am I doing wrong?
thanks.
You should assign the result of the function back to dataset. Call the function like this:
dataset=dummies(dataset)
Functions have their own independent namespace for the variables defined there, whether in the signature or in the body.
For example:
a = 0
def fun(a):
    a = 23
    return a
fun(a)
print("a is", a) #a is 0
Here you might think that a will have the value 23 at the end, but that is not the case, because the a inside fun is not the same name as the a outside. When you call fun(a), you pass the function a reference to the object that the outer a refers to, so the inner a starts out referring to the same object and therefore has the same value.
With a=23 you change what the inner a points to, which in this example is 23; the outer a is unaffected.
And fun(a) returns a value, but because that value is not saved anywhere, the result is lost.
To update the variable outside, you need to reassign it to the result of the function:
a = 0
def fun(a):
    a = 23
    return a
a = fun(a)
print("a is", a) #a is 23
which in your case would be dataset=dummies(dataset).
If you want your function to change the object it receives in place, you can't use =; you need to use something that the object itself provides for in-place modification. For example,
this would not work:
a = []
def fun2(a):
    a = [23]
    return a
fun2(a)
print("a is", a) #a is []
but this would
a = []
def fun2(a):
    a.append(23)
    return a
fun2(a)
print("a is", a) #a is [23]
because we are using an in-place modification method that the object provides, in this example the append method of list.
But such in-place modification can have unforeseen results, especially if the object being modified is shared with other parts of the program, so I would rather recommend the previous approach.
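The difference between the two approaches can also be checked directly with the is operator. The sketch below uses plain lists and two hypothetical helper names, rebind and mutate; rebind builds a new object the way pd.get_dummies returns a new DataFrame, while mutate changes the object it was given:

```python
def rebind(items):
    # The local name now points at a brand-new list;
    # the caller's list is left untouched.
    items = items + [23]
    return items

def mutate(items):
    # append changes the very object the caller passed in.
    items.append(23)
    return items

original = []
returned = rebind(original)
print(original, returned, returned is original)  # [] [23] False

shared = []
same = mutate(shared)
print(shared, same is shared)  # [23] True
```

With the first style the caller must keep the return value (original = rebind(original)), which is exactly what dataset = dummies(dataset) does.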

Correct way to mock decorated method in Python

I have a class with an .upload method. This method is wrapped using a decorator:
@retry(tries=3)
def upload(self, xml_file: BytesIO):
    self.client.upload(self._compress(xml_file))
I need to test if it runs 3 times if some exception occurs.
My test looks like:
@mock.patch("api.exporter.jumbo_uploader.JumboZipUploader.upload")
def test_upload_xml_fail(self, mock_upload):
    """Check if decorator called the compress function 3 times"""
    generator = BrandBankXMLGenerator()
    file = generator.generate()
    uploader = JumboZipUploader()
    uploader.upload = retry(mock_upload)
    mock_upload.side_effect = Exception("Any exception")
    uploader.upload(file)
    self.assertEqual(mock_upload.call_count, 3)
I have read that the default behavior of python decorators assumes that the function inside the test will be unwrapped and I need to wrap it manually.
I did that trick, but the code fails with AssertionError: 0 != 3.
So, what is the right way here to wrap the decorated method properly?
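One possible approach, offered only as a sketch and not taken from this thread: leave the decorated method unpatched and mock the dependency it calls instead, so the real decorator stays in place and its retries can be counted. Everything below is an assumption for illustration: the retry function is a hypothetical stand-in written just for this example (the real decorator's behavior may differ), and Uploader is a simplified stand-in for JumboZipUploader.

```python
from unittest import mock

def retry(tries=3):
    """Hypothetical stand-in for the real retry decorator."""
    def decorator(func):
        def wrapper(*args, **kwargs):
            for attempt in range(tries):
                try:
                    return func(*args, **kwargs)
                except Exception:
                    # Re-raise only after the last attempt has failed.
                    if attempt == tries - 1:
                        raise
        return wrapper
    return decorator

class Uploader:
    """Simplified stand-in for the class under test."""
    def __init__(self, client):
        self.client = client

    @retry(tries=3)
    def upload(self, payload):
        self.client.upload(payload)

# Mock the collaborator, not the decorated method itself.
client = mock.Mock()
client.upload.side_effect = Exception("Any exception")
uploader = Uploader(client)

try:
    uploader.upload(b"data")
except Exception:
    pass  # the stand-in decorator re-raises after the final attempt

print(client.upload.call_count)  # 3
```

In the original test this would correspond to replacing self.client (or its upload method) with a mock, assuming the uploader allows that.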

calling a class without () is working fine but not giving desired output

I am new to python, can anyone please explain to me why this is happening??
What is the meaning of "()"
class ganga:
    a = "subhanshu"
    def course(self, name):
        self.ab = name
obj1 = ganga() #it works fine
obj = ganga #works fine
obj1.course("apple") #it works fine
obj.course("apple") #gives me error
error is:
TypeError: course() missing 1 required positional argument: 'name'
The function course takes two arguments: self and name. The self argument refers to the object on which the function operates.
Case #1
obj1 = ganga()
You created an object of the class ganga. When you called the function via the object obj1.course("apple"), the self argument was automatically filled in as obj1.
Case #2
obj = ganga
Here you did not create an object; you bound a second name, obj, to the class ganga itself. Therefore, when you call the function through obj, nothing is filled in automatically and it expects you to pass both arguments. Try the following:
obj1 = ganga()
obj.course(obj1, "apple")
This performs the course operation on obj1.
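The two cases can be compared side by side in a short sketch (the class is renamed Ganga and the variable names are illustrative only):

```python
class Ganga:
    def course(self, name):
        self.ab = name

alias = Ganga        # no parentheses: just another name for the class
instance = Ganga()   # parentheses: calls the class and creates an object

print(alias is Ganga)               # True
print(isinstance(instance, Ganga))  # True

instance.course("apple")            # self is filled in automatically
print(instance.ab)                  # apple

alias.course(instance, "mango")     # self has to be passed explicitly
print(instance.ab)                  # mango
```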

How to avoid creating a class attribute by accident

I know the motto is "we're all consenting adults around here."
but here is a problem I spent a day on. I got passed a class with over 100 attributes. I had specified one of them was to be called "run_count". The front-end had a place to enter run_count.
Somehow, the front-end/back-end package people decided to call it "run_iterations" instead.
So, my problem is I am writing unit test software, and I did this:
passed_parameters.run_count = 100
result = do_the_thing(passed_parameters)
assert result == 99.75
Now, the problem, of course, is that Python willingly let me set this "new" attribute called "run_count". But, after delving 10 levels down into the code, I discovered that the function "do_the_thing" (obviously) never looks at "run_count", but uses "passed_parameters.run_iterations" instead.
Is there some simple way to avoid creating a new attribute in a class, or a new entry in a dictionary, when you naively assume you know the attribute name (or the dict key) and accidentally create a new entry that never gets looked at?
In an ideal world, no matter how dynamic, Python would allow you to "lock" an object or an instance of one. Then, trying to set a new value for an attribute that doesn't exist would raise an attribute error, letting you know you are trying to change something that doesn't exist, rather than letting you create a new attribute that never gets used.
Use __setattr__ and check that the attribute exists; otherwise, raise an error. If you do this, you will also get an error when you first define those attributes inside __init__, so you have to work around that situation. I found four ways of doing that. First, define those attributes on the class, so that they already exist when you set their initial value. Second, call object.__setattr__ directly. Third, add a fourth boolean parameter to __setattr__ indicating whether to bypass the check. Fourth, define that boolean flag class-wide, set it to True, initialize the fields, and set the flag back to False. Here is the code:
Code
class A:
    f = 90
    a = None
    bypass_check = False
    def __init__(self, a, b, c, d1, d2, d3, d4):
        # 1st workaround
        self.a = a
        # 2nd workaround
        object.__setattr__(self, 'b', b)
        # 3rd workaround
        self.__setattr__('c', c, True)
        # 4th workaround
        self.bypass_check = True
        self.d1 = d1
        self.d2 = d2
        self.d3 = d3
        self.d4 = d4
        self.bypass_check = False
    def __setattr__(self, attr, value, bypass=False):
        if bypass or self.bypass_check or hasattr(self, attr):
            object.__setattr__(self, attr, value)
        else:
            # Throw some error
            print('Attribute %s not found' % attr)
a = A(1, 2, 3, 4, 5, 6, 7)
a.f = 100
a.d1 = -1
a.g = 200
print(a.f, a.a, a.d1, a.d4)
Output
Attribute g not found
100 1 -1 7
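The code above prints a message where the text says to raise an error. A minimal variant that actually raises could look like the sketch below. The class name Locked is hypothetical, run_iterations is borrowed from the question, and only the first workaround (declaring attributes on the class) is used:

```python
class Locked:
    # Declared on the class, so hasattr() already finds it.
    run_iterations = 0

    def __setattr__(self, attr, value):
        if not hasattr(self, attr):
            raise AttributeError(
                f"{type(self).__name__} has no attribute {attr!r}"
            )
        object.__setattr__(self, attr, value)

params = Locked()
params.run_iterations = 100   # known attribute: allowed

try:
    params.run_count = 100    # unknown attribute: rejected
except AttributeError as exc:
    error = str(exc)

print(params.run_iterations)  # 100
print(error)                  # Locked has no attribute 'run_count'
```

With this in place, the mistyped name in the unit test from the question would fail immediately at the assignment instead of silently creating an unused attribute.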
