I am trying to create a unit test for my multithreaded code.
My current code snippet is like this:
verify(someObject, times(2)).someMethod(captor.capture());
List<SomeObject> list = captor.getAllValues();
assertThat(list.get(0)).isEqualTo(...
assertThat(list.get(1)).isEqualTo(...
Now someMethod is called in two separate threads, so the order of captured arguments is nondeterministic. I was wondering if there was a way to assert these arguments without any particular order.
Of course I could write a custom Comparator and sort the list beforehand, but I was wondering if there was a simpler way than this.
Thanks!
Simply check that the list contains the expected elements, independently of the order:
assertThat(list, hasItem(...));
assertThat(list, hasItem(...));
I would like to run the individual tests (functions) of my Python unittest class in sequential order and not in parallel. I can tell it is parallel because the first function/test writes a record into the TinyDB, and another function/test, which fails, needs to read that new record and test for its existence.
So, how do I enforce strictly sequential test execution?
If that is NOT possible, can I enforce strictly sequential processing by creating multiple tests? (I would dislike doing so, because I would like to keep a 1:1 relationship between modules and their test modules.)
Answer for unittest
Strict execution order I could achieve by creating a master test .py file; I named it run_all_tests.py.
The modules have separate test classes, and I trigger them, and their functions, one by one.
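A minimal sketch of such a master file, assuming hypothetical module and class names (the real ones come from the 1:1 test modules):

import unittest

# Placeholder imports: substitute the real test modules and classes.
from test_module_a import TestWriteRecord
from test_module_b import TestReadRecord

def suite():
    s = unittest.TestSuite()
    # Adding the test methods explicitly fixes the execution order.
    s.addTest(TestWriteRecord("test_write_record"))
    s.addTest(TestReadRecord("test_read_record"))
    return s

if __name__ == "__main__":
    unittest.TextTestRunner(verbosity=2).run(suite())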
Switching to pytest and fixtures
Anyhow, I disliked this shortcoming of not being able to control the sequence in a sophisticated/declarative way at the function level, so I switched to pytest.
First of all, I like that there is a command-line option that shows the setup sequence; it confirms what we are expecting:
pytest --setup-show test_myfunction.py
On top of that, you may apply the @pytest.fixture() decorator to a function that is run beforehand. This does NOT necessarily help with the sequence in the first place; rather, it reminds us to write independent tests, where the @pytest.fixture()-decorated function is passed as an argument to the test function. This way one has a deliberate fixture for a single test function. Do NOT mistake this for the setUp() method of unittest: setUp() runs before every single test method. Every one. And setUpClass() runs once before any test function is invoked.
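A minimal sketch of such a fixture, assuming a TinyDB-backed test like the one in the question (the file name, record, and test names are placeholders):

import pytest
from tinydb import TinyDB, Query

@pytest.fixture()
def seeded_db(tmp_path):
    # Create a fresh database file and seed the record this test needs.
    db = TinyDB(tmp_path / "test.json")
    db.insert({"name": "alice"})
    yield db
    db.close()

def test_record_exists(seeded_db):
    # The fixture ran only because this test requested it as an argument.
    record = Query()
    assert seeded_db.search(record.name == "alice")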
For those who still want declarative ordering, you can find it here: https://pypi.org/project/pytest-order/, https://github.com/pytest-dev/pytest-order
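With that plugin installed, the order can be declared per test, roughly like this (test names are placeholders; see the pytest-order documentation for the details):

import pytest

@pytest.mark.order(1)
def test_write_record():
    ...

@pytest.mark.order(2)
def test_read_record():
    ...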
I have been working on a very dense set of calculations, all in support of a specific problem I have.
But the nature of the problem is no different than this. Suppose I develop a class called 'Matrix' that has the machinery to implement matrices. Instantiation would presumably take a list of lists, which would be the matrix entries.
Now I want to provide a multiply method. I have two choices. First, I could define a method like so:
class Matrix():
    def __init__(self, entries):
        # do the obvious here
        return

    def determinant(self):
        # again, do the obvious here
        return result_of_calcs

    def multiply(self, b):
        # again, do the obvious here
        return
If I do this, the call signature for two matrix objects, a and b, is
a.multiply(b)...
The other choice is a @staticmethod. Then the definition looks like:
    @staticmethod
    def multiply(a, b):
        # do the obvious thing
Now the call signature is:
z = multiply(a,b)
I am unclear on when one is better than the other. The free-standing function is not truly part of the class definition, but who cares? It gets the job done, and because Python allows "reaching into an object" from outside, it seems able to do everything. In practice they (the class and the function) will end up in the same module, so they are at least linked there.
On the other hand, my understanding of the @staticmethod approach is that the function is now part of the class definition (it defines one of the methods), but the method gets no "self" passed in. In a way this is nice because the call signature is the much better looking:
z = multiply(a,b)
and the function can access all the instances' methods and attributes.
Is this the right way to view it? Are there strong reasons to do one or the other? In what ways are they not equivalent?
I have done quite a bit of Python programming since asking this question.
Suppose we have a file named matrix.py, and it has a bunch of code for manipulating matrices. We want to provide a matrix multiply method.
The two approaches are:
1. define a free-standing function with the signature multiply(x, y);
2. make it a method of all matrices: x.multiply(y).
Matrix multiply is what I will call a dyadic function. In other words, it always takes two arguments.
The temptation is to use #2, so that a matrix object "carries with it everywhere" the ability to be multiplied. However, the only thing it makes sense to multiply it with is another matrix object. In such cases there are two equally good ways to do that, viz:
z=x.multiply(y)
or
z=y.multiply(x)
However, a better way is to define a function inside the file:
multiply(x, y)
multiply(), as such, is a function that any code using the 'library' expects to have available. It need not be associated with each matrix. And since the user will be doing an 'import', they will get the multiply function. This is better code.
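A rough sketch of that layout (matrix.py), with the bodies elided, so this shows only the shape of the interface:

# matrix.py
class Matrix:
    def __init__(self, entries):
        self.entries = entries   # list of lists

def multiply(x, y):
    # Free-standing, module-level function: the product of two Matrix objects.
    ...   # the obvious row-by-column computation

Calling code then reads naturally:

from matrix import Matrix, multiply
z = multiply(x, y)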
What I was wrongly conflating were two kinds of functions, which is what led me to attach the method to every object instance:
1. functions which need to be generally available inside the file and should be exposed outside it; and
2. functions which are needed only inside the file.
multiply() is an example of type 1: any matrix 'library' ought to define matrix multiplication.
What I was worried about was needing to expose all the 'internal' functions. For example, suppose we want to make matrix add(), multiply() and invert() externally available. Suppose, however, that we need determinant() internally but do not want to expose it.
One way to 'protect' users is to make determinant a function (a def statement) inside the class declaration for matrices. Then it is protected from exposure. However, nothing stops a user of the code from reaching in if they know the internals, by using the method matrix.determinant().
In the end it comes down to convention, largely. It makes more sense to expose a matrix multiply function which takes two matrices and is called like multiply(x, y). As for the determinant function, instead of 'wrapping it' in the class, it makes more sense to define it as __determinant(x) at the same level as the class definition for matrices.
You can never truly protect internal functions by how you declare them, it seems. The best you can do is warn users: the leading-underscore naming signals 'this is not expected to be called outside the code in this file'.
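Continuing the sketch above, the internal helper stays at module level but carries the warning in its name (a single leading underscore is the more common spelling of this convention; a double leading underscore works the same way at module level):

# matrix.py (continued sketch)
def __determinant(m):
    # Internal helper: the leading underscores warn users not to call it
    # and keep it out of `from matrix import *`.
    ...

def invert(m):
    d = __determinant(m)   # used freely by the public functions in this file
    ...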
I often find myself in a situation where I have a variable that may or may not be a list of items that I want to iterate over. If it's not a list I'll make a list out of it so I can still iterate over it. That way I don't have to write the code inside the loop twice.
def dispatch(stuff):
    if type(stuff) is not list:
        stuff = [stuff]
    for item in stuff:
        # this could be several lines of code
        do_something_with(item)
What I don't like about this is (1) the two extra lines, (2) the type checking, which is generally discouraged, and (3) that I really should be checking whether stuff is an iterable, because it could just as well be a tuple, but then it gets even more complicated. The point is, any robust solution I can think of involves an unpleasant amount of boilerplate code.
You cannot ensure stuff is a list by writing for item in [stuff], because that makes a nested list if stuff is already a list rather than iterating over the items in stuff. And you can't do for item in list(stuff) either, because the list constructor raises a TypeError if stuff is not iterable.
So the question I'd like to ask: is there anything obvious I've missed to the effect of ensurelist(stuff), and if not, can you think of a reason why such functionality is not made easily accessible by the language?
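For illustration, the kind of ensurelist helper I have in mind looks roughly like this (treating strings as single items is one of the judgment calls that makes it awkward):

from collections.abc import Iterable

def ensure_list(x):
    # Strings are iterable, but here they should count as one item.
    if isinstance(x, str) or not isinstance(x, Iterable):
        return [x]
    return list(x)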
Edit:
In particular, I wonder why list(x) doesn't allow x to be non-iterable, simply returning a list with x as a single item.
Consider the example of the classes defined in the io module, which provide separate write and writelines methods for writing a single line and writing a list of lines. Provide separate functions that do different things. (One can even use the other.)
def dispatch(stuff):
    do_something_with(stuff)

def dispatch_all(stuff):
    for item in stuff:
        dispatch(item)
The caller will have an easier time deciding whether dispatch or dispatch_all is the correct function to use than your do-it-all function will have deciding whether it needs to iterate over its argument or not.
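For example, the call sites stay explicit (the argument names here are placeholders):

dispatch(thing)                    # the caller knows it has a single item
dispatch_all([thing_a, thing_b])   # the caller knows it has a collection
dispatch_all(things)               # any iterable of items works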
I am new to Hypothesis and I am looking for a way to generate a pair of similar recursive objects.
My strategy for a single object is similar to this example in the hypothesis documentation.
I want to test a function which takes a pair of recursive objects A and B and the side effect of this function should be that A==B.
My first approach would be to write a test which gets two independent objects, like:
@given(my_objects(), my_objects())
def test_is_equal(a, b):
    my_function(a, b)
    assert a == b
But the downside is that Hypothesis does not know that there is a dependency between these two objects, so they can be completely different. That is a valid test and I want to test it too.
But I also want to test complex recursive objects which are only slightly different.
And maybe Hypothesis is able to shrink a pair of very different objects where the test fails into a pair of only slightly different objects where the test fails in the same way.
This one is tricky: to be honest, I'd start by writing the same test you already have above and just turn up the max_examples setting a long way. Then I'd probably write some traditional unit tests, because getting specific data distributions out of Hypothesis is explicitly unsupported (i.e. we try to break everything that assumes a particular distribution, using some combination of heuristics and a bit of feedback).
How would I actually generate similar recursive structures, though? I'd use a @composite strategy to build them at the same time, and for each element or subtree I'd draw a boolean and, if True, draw a different element or subtree to use in the second object. Note that this will give you a strategy for a tuple of two objects, and you'll need to unpack it inside the test; that's unavoidable if you want them to be related.
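A minimal sketch of that idea, using lists of integers as a stand-in for the recursive my_objects() strategy (my_function is the function under test from the question; everything else here is illustrative):

from hypothesis import given, strategies as st

@st.composite
def similar_pairs(draw):
    first = draw(st.lists(st.integers(), min_size=1))
    second = []
    for element in first:
        if draw(st.booleans()):
            # occasionally swap in a freshly drawn element
            second.append(draw(st.integers()))
        else:
            second.append(element)
    return first, second

@given(similar_pairs())
def test_is_equal(pair):
    a, b = pair          # unpack the tuple inside the test
    my_function(a, b)
    assert a == b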
Seriously, do try just cranking up max_examples on the naive approach first, though; running Hypothesis for about an hour is amazingly effective, and I would even expect it to shrink the output fairly well.
import multiprocessing as mp
pool = mp.Pool()
As I understand it (please correct me if wrong):
- map works on functions that take only one input parameter; you can pass in a list of such single parameters to make multiple function calls.
- apply works on functions that can take multiple parameters, but you can only pass one tuple of such parameters to make one function call.
- starmap can deal with both: multi-parameter functions, and being passed a list of tuples of parameters.
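A minimal sketch of the three calls under that reading (toy functions; run under the usual __main__ guard):

import multiprocessing as mp

def square(x):
    return x * x

def add(x, y):
    return x + y

if __name__ == "__main__":
    with mp.Pool() as pool:
        # map: one-argument function, list of single arguments
        print(pool.map(square, [1, 2, 3]))          # [1, 4, 9]
        # apply: multi-argument function, one blocking call at a time
        print(pool.apply(add, (2, 3)))              # 5
        # starmap: multi-argument function, list of argument tuples
        print(pool.starmap(add, [(1, 2), (3, 4)]))  # [3, 7]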
Since starmap can handle what map and apply can, why does Python provide three functions instead of just one? In other words, what are the advantages of map and/or apply over starmap?
Update
As @coldspeed pointed out, it may just be backward compatibility. But that raises a further question, and I guess this is what actually puzzles me: why did Python provide both map and apply in the first place? What's so hard about allowing multi-parameter functions and a list of jobs at the same time?
I hope this is not closed for being considered "primarily opinion-based". There must be some reasons, which can be universally appreciated, why the original Python developers made two functions, each with its own limits, instead of one single all-powerful one.