Is there a way to trace Python magic methods that were invoked during code execution?
For example, if I have this program:
class a:
    def __init__(self):
        self.x = 0

## Some wonder function or decorator that starts tracing
b = a()
b.x = 1
a.x = 2
And the result would be all magic methods that were called for the last 3 statements, including __setattr__, __getattr__, etc.
The problem is that I'm trying to figure out which methods are called when, with what arguments, and on what classes/objects they are defined, and the documentation isn't very helpful.
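To make it concrete: the kind of trace I'm after looks like what the hypothetical hand-written hook below prints, but for every magic method, without writing each one out:

    class a:
        def __init__(self):
            self.x = 0

        def __setattr__(self, name, value):
            # hand-written tracing for ONE magic method; I want all of them
            print(f"__setattr__({name!r}, {value!r}) on {type(self).__name__}")
            super().__setattr__(name, value)

    b = a()    # prints: __setattr__('x', 0) on a
    b.x = 1    # prints: __setattr__('x', 1) on a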
Thanks!
function_one.py
class FunctionOne(Base):
    def __init__(self, amount, tax):
        super().__init__(amount, tax)
function_two.py
class FunctionTwo:
    def __init__(self, a, b, c):
        self.__a = a
        self.__b = b
        self.__c = c

    def _get_info(self):
        x = FunctionOne(0, 1)
        return x
test_function_two.py
import unittest

from function_one import FunctionOne
from function_two import FunctionTwo


class TestPostProcessingStrategyFactory(unittest.TestCase):
    def test__get_info(self):
        a = "a"
        b = "b"
        c = "c"
        amount = 0
        tax = 1
        function_two = FunctionTwo(a, b, c)
        assert function_two._get_info() == FunctionOne(0, 1)
I am trying to create a unit test for the function_two.py source code. I get an assertion error because the object at ******** != the object at *********.
So the two objects' addresses are different. How can I make this test pass by correcting the assert statement?
assert function_two._get_info() == FunctionOne(0, 1)
You need to understand that equality comparisons depend on the __eq__ method of a class. From the code you provided, it appears that simply initializing two FunctionOne objects with the same arguments does not produce objects that compare as equal: a class that does not define __eq__ falls back to identity comparison, which is why the object addresses show up in your error. Whatever implementation of __eq__ (if any) underlies that class, only you know.
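For illustration only, here is a standalone sketch of value equality; it assumes FunctionOne stores its constructor arguments (your posted code does not show this) and drops the unknown Base class:

    class FunctionOne:
        def __init__(self, amount, tax):
            self.amount = amount
            self.tax = tax

        def __eq__(self, other):
            # value equality instead of the default identity comparison
            if not isinstance(other, FunctionOne):
                return NotImplemented
            return (self.amount, self.tax) == (other.amount, other.tax)

    assert FunctionOne(0, 1) == FunctionOne(0, 1)  # now passes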
However, I would argue the approach is faulty to begin with because unit tests, as the name implies, are supposed to isolate your units (i.e. functions typically) as much as possible, which is not what you are doing here.
When you are testing a function f that calls another of your functions g, strictly speaking, the correct approach is mocking g during the test. You need to ensure that you are testing f and only f. This extends to instances of other classes that you wrote, since their methods are also just functions that you wrote.
Have a look at the following example code.py:
class Foo:
    def __init__(self, x, y):
        ...


class Bar:
    def __init__(self, a, b):
        self.__a = a
        self.__b = b

    def get_foo(self):
        foo = Foo(self.__a, self.__b)
        return foo
Say we want to test Bar.get_foo. That method uses our Foo class inside it, instantiating it and returning that instance. We want to ensure that this is what the method does. We don't want to concern ourselves with anything that relates to the implementation of Foo because that is for another test case.
What we need to do is mock that class entirely. Then we substitute some unique object to be returned by calling our mocked Foo and check that we get that object from calling get_foo.
In addition, we want to check that get_foo called the (mocked) Foo constructor with the arguments we expected, i.e. with its __a and __b attributes.
Here is an example test.py:
from unittest import TestCase
from unittest.mock import MagicMock, patch

from . import code


class BarTestCase(TestCase):
    @patch.object(code, "Foo")
    def test_get_foo(self, mock_foo_cls: MagicMock) -> None:
        # Create some random but unique object that should be returned
        # when the mocked class is called;
        # this object should be the output of `get_foo`:
        mock_foo_cls.return_value = expected_output = object()
        # We remember the arguments to initialize `bar` for later:
        a, b = "spam", "eggs"
        bar = code.Bar(a=a, b=b)
        # Run the method under testing:
        output = bar.get_foo()
        # Check that we get that EXACT object returned:
        self.assertIs(expected_output, output)
        # Ensure that our mocked class was instantiated as expected:
        mock_foo_cls.assert_called_once_with(a, b)
That way we ensure proper isolation from our Foo class during the Bar.get_foo test.
Side note: If we wanted to be super pedantic, we should even isolate our test method from the initialization of Bar, but in this simple example that would be overkill. If your __init__ method does many things aside from just setting some instance attributes, you should definitely mock that during your test as well.
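For completeness, a hypothetical sketch of what that could look like, patching __init__ and setting the name-mangled attributes by hand (this is not part of the tested code above, just one way to do it):

    from unittest.mock import patch

    from . import code

    # Construct a Bar without running its real __init__:
    with patch.object(code.Bar, "__init__", return_value=None) as mock_init:
        bar = code.Bar("spam", "eggs")   # the mocked __init__ does nothing
        mock_init.assert_called_once_with("spam", "eggs")

    # Set the (name-mangled) attributes the real __init__ would have set:
    bar._Bar__a = "spam"
    bar._Bar__b = "eggs"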
Hope this helps.
References:
The Mock class
The patch decorator
TestCase.assertIs
Mock.assert_called_once_with
Even after reading the answer by @ncoghlan in
Python nonlocal statement in a class definition
(which, by the way, I didn't fully understand),
I'm not able to explain this behavior.
# The script that fails, and I can't explain why
class DW:
    def r(): return 3
    def rr(): return 2 + r()
    y = r()
    x = rr()   # NameError: name 'r' is not defined

# a solution that I don't like: I don't want higher-order functions
class W:
    def r(): return 3
    def rr(r): return 2 + r()
    y = r()
    x = rr(r)
Class bodies act more or less like scripts. This feature is still somewhat strange to me, and I have some newbie questions about it. Thanks in advance for helping me with them.
Can I define functions inside the body of a class, use them, and delete them before the end of the class definition? (In that case, naturally, the deleted functions will not exist for instances of the class.)
How can I give functions and attributes of a class definition better visibility to one another, without resorting to arguments for access?
You can use the keyword self so that one method can call another on the same object:
class W:
    def r(self): return 3
    def rr(self): return 2 + self.r()

w = W()
y = w.r()    # 3
x = w.rr()   # 5
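As for the other question in the post: yes, a class body runs top to bottom like a script, so you can define a helper, use it, and delete it before the class statement ends. A minimal sketch:

    class V:
        def helper():          # no self: only meant for use while the class body runs
            return 3

        y = helper()           # executes now, during class creation
        del helper             # remove it from the class namespace

    v = V()
    print(v.y)                    # 3
    print(hasattr(v, "helper"))   # False: instances never see the helper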
Creating enumerations in Python 3.4+ is pretty easy:
from enum import Enum

class MyEnum(Enum):
    A = 10
    B = 20
This gets me a new type, MyEnum.
With this I can assign a variable:
x = MyEnum.A
So far so good.
However, things start to get complicated if I want to use enum.Enum members as arguments to functions or class methods and want to ensure that class attributes only hold enum.Enum members, not other values.
How can I do this? My idea is something like the following, which I consider more of a workaround than a solution:
class EnContainer:
    def __init__(self, val: type(MyEnum.A) = MyEnum.A):
        assert isinstance(val, type(MyEnum.A))
        self._value = val
Do you have any suggestions or do you see any problems with my approach? I have to consider about 10 different enumerations and would like to come to a consistent approach for initialization, setters and getters.
Instead of type(MyEnum.A), just use MyEnum:
def __init__(self, val: MyEnum = MyEnum.A):
    assert isinstance(val, MyEnum)
Never use assert for error checking; it is for program validation. In other words: who is calling EnContainer? If only your own code calls it, with already-validated data, then assert is fine; but if code outside your control calls it, then you should use proper error checking:
def __init__(self, val: MyEnum = MyEnum.A):
    if not isinstance(val, MyEnum):
        raise ValueError(
            "EnContainer called with %s.%r (should be a 'MyEnum')"
            % (type(val), val)
        )
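Since the question also asks for a consistent approach to initialization, setters, and getters across many enumerations, one possible pattern (a sketch, not part of the answer above) is to route everything through a validating property, so construction and later assignment share the same check:

    from enum import Enum

    class MyEnum(Enum):
        A = 10
        B = 20

    class EnContainer:
        def __init__(self, val: MyEnum = MyEnum.A):
            self.value = val          # goes through the setter below

        @property
        def value(self) -> MyEnum:
            return self._value

        @value.setter
        def value(self, val: MyEnum) -> None:
            if not isinstance(val, MyEnum):
                raise TypeError(f"expected a MyEnum, got {val!r}")
            self._value = val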
I am trying to use compile to generate, at runtime, a Python function that accepts arguments, as follows.
import types
import ast
code = compile("def add(a, b): return a + b", '<string>', 'exec')
fn = types.FunctionType(code, {}, name="add")
print(fn(4, 2))
But it fails with
TypeError: <module>() takes 0 positional arguments but 2 were given
Is there any way to compile a function accepting arguments this way, or is there another way to do it?
compile with mode 'exec' returns the code object for a module. In Python 3.6, if you were to disassemble your code object:
>>> import dis
>>> dis.dis(fn)
0 LOAD_CONST 0 (<code object add at ...., file "<string>" ...>)
2 LOAD_CONST 1 ('add')
4 MAKE_FUNCTION 0
6 STORE_NAME 0 (add)
8 LOAD_CONST 2 (None)
10 RETURN_VALUE
That literally translates to: make a function; name it 'add'; bind it to the name add; return None.
So the code object you compiled represents the creation of the module, not the module or function itself. Essentially, what you're actually doing is equivalent to defining the following f and then calling f(4, 2):
def f():
    def add(a, b):
        return a + b

print(f(4, 2))  # TypeError: f() takes 0 positional arguments but 2 were given
As for how to work around this, the answer depends on what you want to do. For instance, if you want to compile a function using compile, the simple answer is that you can't without doing something similar to the following.
from types import FunctionType

# 'code' is the result of the call to compile.
# In this case we know the function's code object is the first
# constant (from the disassembly above), so we extract its value:
f_code = code.co_consts[0]
add = FunctionType(f_code, {}, "add")
>>> add(4, 2)
6
Alternatively, since defining a function in Python requires running Python code (there is no static compilation by default, other than compiling to bytecode), you can pass custom globals and locals dictionaries to exec and then extract the values from those:
glob, loc = {}, {}
exec(code, glob, loc)
>>> loc['add'](4, 2)
6
But the real answer is that if you want to do this, the simplest way is generally to generate abstract syntax trees using the ast module, compile those into module code, and evaluate or execute the result.
If you want to do bytecode transformation, I'd suggest looking at the codetransformer package on PyPI.
TL;DR: using compile will only ever return the code for a module, and most serious code generation is done either with ASTs or by manipulating bytecode.
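As a minimal sketch of that AST route (parsing source into a tree here for brevity; a real generator would build the nodes programmatically before compiling):

    import ast

    tree = ast.parse("def add(a, b): return a + b")   # a Module AST
    code = compile(tree, "<generated>", "exec")       # compile the AST directly

    namespace = {}
    exec(code, namespace)            # run the module body, defining `add`
    print(namespace["add"](4, 2))    # 6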
is there any other way to do that?
For what it's worth: I recently created a @compile_fun goodie that considerably eases the process of applying compile to a function. It relies on compile, so it is nothing different from what is explained in the above answers, but it provides an easier way to do it. Your example becomes:
@compile_fun
def add(a, b):
    return a + b

assert add(1, 2) == 3
Note that you now can't debug into add with your IDE. This does not improve runtime performance, nor does it protect your code from reverse engineering, but it might be convenient if you do not want your users to see the internals of your function when they debug. The obvious drawback is that they will not be able to help you debug your lib, so use it with care!
See the makefun documentation for details.
I think this accomplishes what you want in a better way
import types
text = "lambda (a, b): return a + b"
code = compile(text, '<string>', 'eval')
body = types.FunctionType(code, {})
fn = body()
print(fn(4, 2))
The function being anonymous avoids the implicit namespace issues.
And returning it as a value by using the mode 'eval' is cleaner than lifting it out of the code object's constants, since it does not rely on the specific habits of the compiler.
More usefully, as you seem to have noticed but not yet made use of (since you import ast), the text passed to compile can actually be an ast object, so you can apply AST transformations to it:
import types
import ast
from somewhere import TransformTree

text = "lambda a, b: a + b"
tree = ast.parse(text, mode='eval')        # an Expression AST
tree = TransformTree().visit(tree)
code = compile(tree, '<string>', 'eval')   # compile the (transformed) AST
body = types.FunctionType(code, {})
fn = body()
print(fn(4, 2))
When I use a generator in a for loop, it seems to "know" when there are no more elements to yield. Now I have to use a generator without a for loop and call next() by hand to get the next element. My problem is: how do I know when there are no more elements?
I only know that next() raises a StopIteration exception when nothing is left, but isn't an exception a little too "heavy" for such a simple problem? Isn't there a method like has_next() or so?
The following lines should make clear what I mean:
#!/usr/bin/python3

# define a list of some objects
bar = ['abc', 123, None, True, 456.789]

# our primitive generator
def foo(bar):
    for b in bar:
        yield b

# iterate, using the generator above
print('--- TEST A (for loop) ---')
for baz in foo(bar):
    print(baz)
print()

# assign a new iterator to a variable
foobar = foo(bar)

print('--- TEST B (try-except) ---')
while True:
    try:
        print(foobar.__next__())
    except StopIteration:
        break
print()

# assign a new iterator to a variable
foobar = foo(bar)

# display generator members
print('--- GENERATOR MEMBERS ---')
print(', '.join(dir(foobar)))
The output is as follows:
--- TEST A (for loop) ---
abc
123
None
True
456.789
--- TEST B (try-except) ---
abc
123
None
True
456.789
--- GENERATOR MEMBERS ---
__class__, __delattr__, __doc__, __eq__, __format__, __ge__, __getattribute__, __gt__, __hash__, __init__, __iter__, __le__, __lt__, __name__, __ne__, __new__, __next__, __reduce__, __reduce_ex__, __repr__, __setattr__, __sizeof__, __str__, __subclasshook__, close, gi_code, gi_frame, gi_running, send, throw
Thanks to everybody, and have a nice day! :)
This is a great question. I'll try to show you how we can use Python's introspective abilities and open source to get an answer. We can use the dis module to peek behind the curtain and see how the CPython interpreter implements a for loop over an iterator.
>>> def for_loop(iterable):
... for item in iterable:
... pass # do nothing
...
>>> import dis
>>> dis.dis(for_loop)
2 0 SETUP_LOOP 14 (to 17)
3 LOAD_FAST 0 (iterable)
6 GET_ITER
>> 7 FOR_ITER 6 (to 16)
10 STORE_FAST 1 (item)
3 13 JUMP_ABSOLUTE 7
>> 16 POP_BLOCK
>> 17 LOAD_CONST 0 (None)
20 RETURN_VALUE
The juicy bit appears to be the FOR_ITER opcode. We can't dive any deeper using dis, so let's look up FOR_ITER in the CPython interpreter's source code. If you poke around, you'll find it in Python/ceval.c; you can view it here. Here's the whole thing:
TARGET(FOR_ITER)
    /* before: [iter]; after: [iter, iter()] *or* [] */
    v = TOP();
    x = (*v->ob_type->tp_iternext)(v);
    if (x != NULL) {
        PUSH(x);
        PREDICT(STORE_FAST);
        PREDICT(UNPACK_SEQUENCE);
        DISPATCH();
    }
    if (PyErr_Occurred()) {
        if (!PyErr_ExceptionMatches(PyExc_StopIteration))
            break;
        PyErr_Clear();
    }
    /* iterator ended normally */
    x = v = POP();
    Py_DECREF(v);
    JUMPBY(oparg);
    DISPATCH();
Do you see how this works? We try to grab an item from the iterator; if we fail, we check what exception was raised. If it's StopIteration, we clear it and consider the iterator exhausted.
So how does a for loop "just know" when an iterator has been exhausted? The answer: it doesn't -- it has to try to grab an element. But why?
Part of the answer is simplicity. Part of the beauty of implementing iterators is that you only have to define one operation: grab the next element. But more importantly, it makes iterators lazy: they'll only produce the values that they absolutely have to.
Finally, if you are really missing this feature, it's trivial to implement it yourself. Here's an example:
class LookaheadIterator:
    def __init__(self, iterable):
        self.iterator = iter(iterable)
        self.buffer = []

    def __iter__(self):
        return self

    def __next__(self):
        # serve a buffered element first, if has_next() stashed one
        if self.buffer:
            return self.buffer.pop()
        else:
            return next(self.iterator)

    def has_next(self):
        if self.buffer:
            return True
        try:
            # peek ahead by pulling one element into the buffer
            self.buffer = [next(self.iterator)]
        except StopIteration:
            return False
        else:
            return True


x = LookaheadIterator(range(2))
print(x.has_next())   # True
print(next(x))        # 0
print(x.has_next())   # True
print(next(x))        # 1
print(x.has_next())   # False
print(next(x))        # raises StopIteration
The two approaches you wrote deal with finding the end of the generator in exactly the same way. The for loop simply calls next() until the StopIteration exception is raised, and then it terminates.
http://docs.python.org/tutorial/classes.html#iterators
As such, I don't think waiting for the StopIteration exception is a 'heavy' way to deal with the problem; it's the way generators are designed to be used.
It is not possible to know beforehand about the end of an iterator in the general case, because arbitrary code may have to run to decide whether the end has been reached. Buffering elements can reveal the answer at a cost, but that is rarely useful.
In practice, the question arises when you want to take only one or a few elements from an iterator for now, but do not want to write that ugly exception-handling code (as shown in the question). Indeed, it is non-Pythonic to put the concept of StopIteration into normal application code, and exception handling at the Python level is rather time-consuming, particularly when it is just about taking one element.
The Pythonic ways to handle those situations are either using for .. break [.. else], like:
for x in iterator:
    do_something(x)
    break
else:
    it_was_exhausted()
or using the builtin next() function with a default, like:
x = next(iterator, default_value)
or using iterator helpers, e.g. from the itertools module, to rewire things, like:
max_3_elements = list(itertools.islice(iterator, 3))
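A small runnable sketch of the last two idioms together:

    import itertools

    iterator = iter(range(10))
    x = next(iterator, None)                              # 0 (None if exhausted)
    max_3_elements = list(itertools.islice(iterator, 3))  # [1, 2, 3]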
Some iterators, however, expose a "length hint" (PEP 424):
>>> gen = iter(range(3))
>>> gen.__length_hint__()
3
>>> next(gen)
0
>>> gen.__length_hint__()
2
Note: iterator.__next__() should not be used by normal application code; that's why it was renamed from iterator.next() in Python 2. And using next() without a default is not much better ...
This may not precisely answer your question, but I found my way here looking for a way to elegantly grab a result from a generator without having to write a try block. A little googling later, I figured this out:
def g():
    yield 5

result = next(g(), None)
Now result is either 5 or None, depending on how many times you've called next on the iterator, or depending on whether the generator function returned early instead of yielding.
I strongly prefer handling None as an output over raising for "normal" conditions, so dodging the try/catch here is a big win. If the situation calls for it, there's also an easy place to add a default other than None.
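One caveat: if None could be a legitimate value from the generator, a unique sentinel object keeps exhaustion unambiguous (a standard idiom, sketched below):

    _sentinel = object()

    def g():
        yield None     # a real value that happens to be None

    gen = g()
    value = next(gen, _sentinel)
    if value is _sentinel:
        print("generator was exhausted")
    else:
        print("got a real value:", value)     # got a real value: None

    print(next(gen, _sentinel) is _sentinel)  # True: now it is exhausted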