In Python, is what are the differences between a method outside a class definition, or a method in it using staticmethod? - python-3.x

I have been working a a very dense set of calculations. It all is to support a specific problem I have.
But the nature of the problem is no different than this. Suppose I develop a class called 'Matrix' that has the machinery to implement matrices. Instantiation would presumably take a list of lists, which would be the matrix entries.
Now I want to provide a multiply method. I have two choices. First, I could define a method like so:
class Matrix():
def __init__(self, entries)
# do the obvious here
return
def determinant(self):
# again, do the obvious here
return result_of_calcs
def multiply(self, b):
# again do the obvious here
return
If I do this, the call signature for two matrix objects, a and b, is
a.multiply(b)...
The other choice is a #staticmethod. Then, the definition looks like:
#staticethod
def multiply(a,b):
# do the obvious thing.
Now the call signature is:
z = multiply(a,b)
I am unclear when one is better than the other. The free-standing function is not truly part of the class definition, but who cares? it gets the job done, and because Python allows "reaching into an object" references from outside, it seems able to do everything. In practice they'll (the class and the method) end up in the same module, so they're at least linked there.
On the other hand, my understanding of the #staticmethod approach is that the function is now part of the class definition (it defines one of the methods), but the method gets no "self" passed in. In a way this is nice because the call signature is the much better looking:
z = multiply(a,b)
and the function can access all the instances' methods and attributes.
Is this the right way to view it? Are there strong reasons to do one or the other? In what ways are they not equivalent?

I have done quite a bit of Python programming since answering this question.
Suppose we have a file named matrix.py, and it has a bunch of code for manipulating matrices. We want to provide a matrix multiply method.
The two approaches are:
define a free:standing function with the signature multiply(x,y)
make it a method of all matrices: x.multiply(y)
Matrix multiply is what I will call a dyadic function. In other words, it always takes two arguments.
The temptation is to use #2, so that a matrix object "carries with it everywhere" the ability to be multiplied. However, the only thing it makes sense to multiply it with is another matrix object. In such cases there are two equally good ways to do that, viz:
z=x.multiply(y)
or
z=y.multiply(x)
However, a better way to do it is to define a function inside the file that is:
multiply(x,y)
multiply(), as such, is a function any code using the 'library' expects to have available. It need not be associated with each matrix. And, since the user will be doing an 'import', they will get the multiply method. This is better code.
What I was wrongly confounding was two things that led me to the method attached to every object instance:
Functions which need to be generally available inside the file that should be
exposed outside it; and
Functions which are needed only inside the file.
multiply() is an example of type 1. Any matrix 'library' ought to likely define matrix multiplication.
What I was worried about was needing to expose all the 'internal' functions. For example, suppose we want to make externally available matrix add(), multiple() and invert(). Suppose, however, we did not want to make externally available - but needed inside - determinant().
One way to 'protect' users is to make determinant a function (a def statement) inside the class declaration for matrices. Then it is protected from exposure. However, nothing stops a user of the code from reaching in if they know the internals, by using the method matrix.determinant().
In the end it comes down to convention, largely. It makes more sense to expose a matrix multiply function which takes two matrices, and is called like multiply(x,y). As for the determinant function, instead of 'wrapping it' in the class, it makes more sense to define it as __determinant(x) at the same level as the class definition for matrices.
You can never truly protect internal methods by their declaration, it seems. The best you can do is warn users. the "dunder" approach gives warning 'this is not expected to be called outside the code in this file'.

Related

Python class instance changed during local function variable

Let's for example define a class and a function
class class1(object):
"""description of class"""
pass
def fun2(x,y):
x.test=1;
return x.value+y
then define a class instance and run it as a local variable in the function
z=class1()
z.value=1
fun2(z,y=2)
However, if you try to run the code
z.test
a result 1 would be returned.
That was, though the attribute to x was done inside the fun2() locally, it extended to class instance x globally as well. This seemed to violate the first thing one learn about the python function, the argument stays local unless being defined nonlocal or global.
How could this happen? Why the attribute to class inside a function extend outside the function.
I have even stranger example:
def fun3(a):
b=a
b.append(3)
mya = [1]
fun3(mya)
print(mya)
[1, 3]
>
I "copy" the array to a local variable and when I change it, the global one changes as well.
The problem is that the parameters are not passed by a value (basically as a copy of the values). In python they are passed by reference. In C terminology the function gets a pointer to the memory location. It's much faster that way.
Some languages will not let you to play with private attributes of an instance, but in Python it's your responsibility to make sure that does not happen. One other rule of OOP is that you should change the internal state of an instance just by calling its methods.
But you change the value directly.
Python is very flexible and allows you to do even the bad things. But it does not push you.
I always argue to have always at least vague understanding of the underlaying structure of any higher level language (memory model, how the variables are passed etc.). There is another argument for having some C/C++ knowledge. Most of the higher level languages are written in them or at least are inspired by them. A C++ programmer would see clearly what is going on.

Ensure variable is a list

I often find myself in a situation where I have a variable that may or may not be a list of items that I want to iterate over. If it's not a list I'll make a list out of it so I can still iterate over it. That way I don't have to write the code inside the loop twice.
def dispatch(stuff):
if type(stuff) is list:
stuff = [stuff]
for item in stuff:
# this could be several lines of code
do_something_with(item)
What I don't like about this is (1) the two extra lines (2) the type checking which is generally discouraged (3) besides I really should be checking if stuff is an iterable because it could as well be a tuple, but then it gets even more complicated. The point is, any robust solution I can think of involves an unpleasant amount of boilerplate code.
You cannot ensure stuff is a list by writing for item in [stuff] because it will make a nested list if stuff is already a list, and not iterate over the items in stuff. And you can't do for item in list(stuff) either because the constructor of list throws an error if stuff is not an iterable.
So the question I'd like to ask: is there anything obvious I've missed to the effect of ensurelist(stuff), and if not, can you think of a reason why such functionality is not made easily accessible by the language?
Edit:
In particular, I wonder why list(x) doesn't allow x to be non-iterable, simply returning a list with x as a single item.
Consider the example of the classes defined in the io module, which provide separate write and writelines methods for writing a single line and writing a list of lines. Provide separate functions that do different things. (One can even use the other.)
def dispatch(stuff):
do_something_with(item)
def dispatch_all(stuff):
for item in stuff:
dispatch(item)
The caller will have an easier time deciding whether dispatch or dispatch_all is the correct function to use than your do-it-all function will have deciding whether it needs to iterate over its argument or not.

How to define a strategy in hypothesis to generate a pair of similar recursive objects

I am new to hypothesis and I am looking for a way to generate a pair of similar recursive objects.
My strategy for a single object is similar to this example in the hypothesis documentation.
I want to test a function which takes a pair of recursive objects A and B and the side effect of this function should be that A==B.
My first approach would be to write a test which gets two independent objects, like:
#given(my_objects(), my_objects())
def test_is_equal(a, b):
my_function(a, b)
assert a == b
But the downside is that hypothesis does not know that there is a dependency between this two objects and so they can be completely different. That is a valid test and I want to test that too.
But I also want to test complex recursive objects which are only slightly different.
And maybe that hypothesis is able to shrink a pair of very different objects where the test fails to a pair of only slightly different objects where the test fails in the same way.
This one is tricky - to be honest I'd start by writing the same test you already have above, and just turn up the max_examples setting a long way. Then I'd probably write some traditional unit tests, because getting specific data distributions out of Hypothesis is explicitly unsupported (i.e. we try to break everything that assumes a particular distribution, using some combination of heuristics and a bit of feedback).
How would I actually generate similar recursive structures though? I'd use a #composite strategy to build them at the same time, and for each element or subtree I'd draw a boolean and if True draw a different element or subtree to use in the second object. Note that this will give you a strategy for a tuple of two objects and you'll need to unpack it inside the test; that's unavoidable if you want them to be related.
Seriously do try just cracking up max_examples on the naive approach first though, running Hypothesis for ~an hour is amazingly effective and I would even expect it to shrink the output fairly well.

Why do we need map() and apply() if we already have starmap()?

import multiprocessing as mp
pool = mp.Pool()
As I understand it (please correct me if wrong):
map works on functions that take only one input parameter, and you
can pass in a list of such single parameter to make multiple function calls
apply works on functions that can take multiple parameters, but you
can only pass one tuple of such parameters to make one function call
starmap can deal with both multi-parameter functions and be passed
in a list of tuples of parameters
Since starmap can handle what map and apply can, why does Python provide three functions instead of just one? In other words, what are the advantages of map and/or apply over starmap?
Update
As #coldspeed pointed out, it can just be backward compatibility. But this begs the question, and I guess this is what actually puzzles me: Why did python make both map and apply in the first place, what's so hard about allowing multi-parameter and a list of jobs at the same time?
Hope this is not closed due to being considered "primarily opinion based". There must be some reasons that can be universally appreciated why the original python developers made two functions each having its own limit instead of one single all-powerful one.

Manipulating/Clearing Variables via Lists: Mathematica

My problem (in Mathematica) is referring to variables given in a particular array and manipulating them in the following manner (as an example):
Inputs: vars={x,y,z}, system=some ODE like x^2+3*x*y+...etc
(note that I haven't actually created variables x y and z)
Aim:
To assign values to the variables in the list "var" with the intention of inputting these values into the system of ODEs. Then, once I am done, clear the values of the variables in the array vars so that it is in its original form {x,y,z} (and not something like {x,1,3} where y=1 and z=3). I want to do this by referring to the positional elements of vars (I aim not to know that x, y and z are the actual variables).
The reason why: I am trying to write a program that can have any number of variables and ODEs as defined by the user. Since the number of variables and the actual letters used for them are unknown, it is necessary to perform manipulations with the array itself.
Attempt:
A fixed number of variables is easy. For the arbitrary case, I have tried modules and blocks, but with no success. Consider the following code:
Clear[x,y,z,vars,svars]
vars={x,y,z}
svars=Map[ToString,vars]
Module[{vars=vars,svars=svars},
Symbol[svars[[1]]]//Evaluate=1
]
then vars={1,y,z} and not {x,y,z} after running this. I have done functional programming with lists, atoms etc. Thus is makes sense to me that vars is changed afterwards, because I have changed x and not vars. However, I cannot get "x" in the list of variables to remain local. Of course I could put in "x" itself, but that is particular to this specific case. I would prefer to put something like:
Clear[x,y,z,vars,svars]
vars={x,y,z}
svars=Map[ToString,vars]
Module[{vars=vars,svars=svars, vars[[1]]},
Symbol[svars[[1]]]//Evaluate=1
]
which of course doesn't work because vars[[1]] is not a symbol or an assignment to a symbol.
Other possibilities:
I found a function
assignToName[name_String, value_] :=
ToExpression[name, InputForm, Function[var, var = value, HoldAll]]
which looked promising. Basically name_String is the name of the variable and value is its new value. I attempted to do:
vars={x,y,z}
svars=Map[ToString,vars]
vars[[1]]=//Evaluate=1
assignToName[svars[[1]],svars[[1]]]
but then something likeD[x^2, vars[[1]]] doesn't work (x is not a valid variable).
If I am missing something, or if perhaps I am going down the wrong path, I'm open to trying other things.
Thanks.
I can't say that I followed your train(s) of thought very well, so these are fragments which might help you to answer your own questions than a coherent and fully-formed answer. But to answer your final 'question', I think you may be going down some wrong path(s).
In passing, note that evaluating the expression
vars = {x,y,z}
does in fact define those three variables though it doesn't define any rewrite rules (such as values) for them.
Given a polynomial poly you can extract the variables in it with the function Variables[poly] so something like
Variables[x^2+3*x*y]
should return
{x,y}
Note that I write 'should' rather than does because I don't have Mathematica on this machine so my syntax may be a bit wonky. Note also that your example ODE is nothing of the sort but it strikes me that you can probably write a wrapper to manipulate an ODE into a form from which Variables can extract the variables. Mathematica offers a lot of other functions for picking expressions apart and re-assembling them, follow the trails from Variables. It often allows the use of functions defined on Lists on expressions with other heads too so it's always worth experimenting a bit.
There are a couple of widely applicable ways to avoid setting values of variables in Mathematica. For instance, you could write
x^2+3*x*y/.{x->2,y->3}
which will evaluate to
22
but not set values for x and y. This is a very simple example of using (sets of) replacement rules for temporary assignment of values to variables
The other way to avoid setting values for variables is to define functions using Modules or Blocks both of which define their own contexts. The documentation will tell you all about these two and the differences between them.
I can't help thinking that all your clever tricks using Symbol, ToExpression and ToString are a bit beside the point. Spend some time familiarising yourself with Mathematica's in-built functionality before going further down that route, you may well find you don't need to.
Finally, writing, in any language, expressions such as
vars=vars,svars=svars
will lead to madness. It may be syntactically correct, you may even be able to decrypt the semantics when you first write code like that, but in a week's time you will curse your younger self for writing it.

Resources