How to remove objects from a set irrespective of order - python-3.x

I'm having trouble removing objects from a set. What I did was create a test class that stores two string variables. I need to store the objects I create in a set, and I want any object whose (t.a, t.b) is the same as another object's (t.b, t.a) to be treated as a duplicate. However, whenever I add such objects to my set, the reversed version is not detected as a duplicate. Is there a way to do this in Python?
class Test:
    def __init__(self, a, b):
        self.a = a
        self.b = b
        self.variables = [a, b]

    def __hash__(self):
        return hash((self.a, self.b))

    def __eq__(self, other: "Test"):
        return type(self) is type(other) and (self.endpoint() == other.endpoint()
                                              or self.endpoint() == other.endpoint()[::-1])

    def endpoint(self):
        return (self.a, self.b)
T = Test('A','B')
T2 = Test('B', 'A')
result = set()
result.add(T)
result.add(T2)
However, the resulting set contains both objects instead of just one. Is there a way to fix this? Thanks
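For what it's worth, here is a minimal sketch of one way to make both orderings collapse to a single set entry: both __hash__ and __eq__ have to ignore the order of the endpoints, so the hash can be taken over a frozenset of them (this is just one possible approach, not necessarily the only one):

class Test:
    def __init__(self, a, b):
        self.a = a
        self.b = b

    def endpoint(self):
        return (self.a, self.b)

    def __hash__(self):
        # frozenset ignores order, so Test('A', 'B') and Test('B', 'A') hash the same
        return hash(frozenset((self.a, self.b)))

    def __eq__(self, other):
        if type(self) is not type(other):
            return NotImplemented
        return (self.endpoint() == other.endpoint()
                or self.endpoint() == other.endpoint()[::-1])

With this, adding Test('A', 'B') and Test('B', 'A') to the same set leaves only one element in it.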

Related

Ensure a class always uses its own version of a method rather than the one defined in a subclass?

I have this code:
class A:
    def __init__(self, vals: list):
        self._vals = vals

    def __len__(self) -> int:
        # some side effects like logging maybe
        return len(self._vals)

    def print_len(self) -> None:
        # some function that uses the len above
        print(len(self))

class B(A):
    def __len__(self) -> int:
        return 0
The issue is, I want print_len to always call A.__len__. I can do this:
class A:
    def __init__(self, vals: list):
        self._vals = vals

    def __len__(self) -> int:
        return len(self._vals)

    def print_len(self) -> None:
        print(A.__len__(self))

class B(A):
    def __len__(self) -> int:
        return 0
But it feels wrong. Basically I want B to lie about __len__ to outside callers, but internally use the correct len specified in A.
So
a = A([1, 2, 3])
print(len(a)) # print 3
a.print_len() # print 3 - no surprises there
b = B([1, 2, 3])
print(len(b)) # print 0 - overload the __len__
b.print_len() # want this to be 3 using A's __len__, not 0 using B's __len__
Is there any way to ensure a class always uses its own version of a method rather than a subclass' version? I thought name mangling of dunder methods would help here.
I think your approach is a good one. The zen of Python states that "There should be one-- and preferably only one --obvious way to do it." and I think you've found it.
That being said, you can do this via name mangling. You just need to prefix the method with double underscores (don't add them to the end like magic methods). This creates a name-mangled, effectively private method that subclasses won't accidentally override.
I think this might be self-defeating since you're now putting the computation in a different method.
class A:
    def __init__(self, vals: list):
        self._vals = vals

    def __len__(self) -> int:
        return self.__length()

    def __length(self) -> int:
        return len(self._vals)

    def print_len(self) -> None:
        print(self.__length())
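For completeness, a short usage sketch (reusing the B subclass from the question) of how this behaves:

class B(A):
    def __len__(self) -> int:
        return 0

b = B([1, 2, 3])
print(len(b))    # 0 -- outside callers see B's override
b.print_len()    # 3 -- goes through the name-mangled A.__length, which B cannot override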

Creating a child class from a parent method in python

I am trying to make a class that has a bunch of children which all have their own respective methods but share common methods through the parent. The problem is that I need to create an instance of the child class in the parent method, but I am not sure how to go about it.
My code so far looks like this:
def filterAttribute(self, attribute, value):
    newlist = []
    for thing in self._things:
        if thing._attributes[attribute] == value:
            newlist.append(thing)
    return self.__init__(newlist)
The class constructor takes a list as its sole argument. Does anyone know if there is a standard way of doing this? My code is currently returning a NoneType object.
Here are a few examples of classes I have made
This is the parent class:
class _DataGroup(object):
    def __init__(self, things=None):
        self._things = things

    def __iter__(self):
        for x in self._things:
            yield x

    def __getitem__(self, key):
        return self._things[key]

    def __len__(self):
        return len(self._things)

    def extend(self, datagroup):
        if isinstance(datagroup, self.__class__):
            self._things.extend(datagroup._things)
            self._things = list(set(self._things))

    def filterAttribute(self, attribute, value):
        newlist = []
        for thing in self._things:
            if thing._attributes[attribute] == value:
                newlist.append(thing)
        #return self.__init__(newlist)
        return self.__init__(newlist)
This is one of the child classes:
class _AuthorGroup(_DataGroup):
    def __init__(self, things=None):
        self._things = things

    def getIDs(self):
        return [x.id for x in self._things]

    def getNames(self):
        return [x.name for x in self._things]

    def getWDs(self):
        return [x.wd for x in self._things]

    def getUrns(self):
        return [x.urn for x in self._things]

    def filterNames(self, names, incl_none=False):
        newlist = []
        for thing in self._things:
            if (thing is not None or (thing is None and incl_none)) and thing.name in names:
                newlist.append(thing)
        return _AuthorGroup(newlist)
The functionality I am looking for is that I can use the parent class's methods with the child classes and have them create instances of the child classes instead of the overall DataGroup parent class.
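As an aside (this is not part of the answer below): one common pattern for exactly this is to build the result with type(self)(...) instead of calling self.__init__, so that each subclass automatically gets back an instance of its own class. A sketch, assuming every subclass constructor accepts the same single argument:

def filterAttribute(self, attribute, value):
    newlist = [thing for thing in self._things
               if thing._attributes[attribute] == value]
    # type(self) is _AuthorGroup when called on an _AuthorGroup, and so on
    return type(self)(newlist)

Note that self.__init__(newlist) returns None, which is why the original code yields a NoneType.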
So if I correctly understand what you are trying to accomplish:
You want a Base Class 'DataGroup' which has a set of defined attributes and methods;
You want one or more child classes with the ability to inherit both methods and attributes from the base class, as well as the ability to override base class methods if necessary; and
You want to invoke the child class without also having to manually invoke the base class.
If this in fact is your problem, this is how I would proceed:
Note: I have modified several functions, since I think you have several other issues with your code. For example, in the base class self._things is set up as a list, but in the __getitem__ and filterAttribute functions you are assuming self._things is a dictionary structure. I have modified the functions so they all assume a dict structure for self._things.
class _DataGroup:
    def __init__(self, things=None):
        if things is None:
            self._things = dict()  # Sets up default empty dict
        else:
            self._things = things

    def __iter__(self):
        for x in self._things.keys():
            yield x

    def __len__(self):
        return len(self._things)

    def extend(self, datagroup):
        for k, v in datagroup._things.items():
            nv = self._things.pop(k, [])
            nv.append(v)
            self._things[k] = nv
# This class utilizes the methods and attributes of DataGroup
# and adds new methods, unique to the child class
class AttributeGroup(_DataGroup):
    def __init__(self, things=None):
        super().__init__(things)

    def getIDs(self):
        return [x for x in self._things]

    def getNames(self):
        return [x.name for x in self._things]

    def getWDs(self):
        return [x.wd for x in self._things]

    def getUrns(self):
        return [x.urn for x in self._things]

# This class overrides a DataGroup method and adds a new attribute
class NewChild(_DataGroup):
    def __init__(self, newAttrib, things=None):
        self._newattrib = newAttrib
        super().__init__(things)

    def __len__(self):
        return max(len(self._newattrib), len(self._things))
These examples are simplified, since I am not absolutely sure of what you really want.
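A quick usage sketch of the classes above (the data values here are made up purely for illustration):

group = AttributeGroup({"id1": "Alice", "id2": "Bob"})
print(len(group))      # 2, via _DataGroup.__len__
print(group.getIDs())  # ['id1', 'id2'] -- iterating the dict yields its keys

child = NewChild(["x", "y", "z"], {"id1": "Alice"})
print(len(child))      # 3 -- NewChild's override takes the larger of the two lengths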

Removing element from container class

I'm having trouble defining the __delitem__ (or a similar) method for the Container class in the example below. How can I rectify this? Thanks.
import numpy as np
import pandas as pd

class XLData(object):
    def __init__(self, name):
        self.name = name
        self.data = pd.DataFrame({self.name: list("ASDF" * 2),
                                  'x': np.random.randint(1, 100, 8)})

    def __repr__(self):
        return repr(self.data.head(2))

class Container(object):
    def __init__(self):
        self.counter = 0
        self.items = []

    def append(self, item):
        self.counter += 1
        self.items = self.items + [item]

    def __delitem__(self, name):
        for c in self.items:
            print("element name:{}, to delete:{}".format(c.name, name))
            if c.name == name:
                pass  #!
                #del c

    def __iter__(self):
        for c in self.items:
            yield c

a = XLData('a')
b = XLData('b')
c = XLData('c')

dl = Container()
dl.append(a)
dl.append(b)
dl.append(c)

del dl['b']
for c in dl:
    print(c)
# 'b' is still in ..
As you seem to be aware, given your code's comments, you can't usefully do del c in your loop, because that only removes the c variable from the function's local namespace temporarily; it doesn't change the list structure at all.
There are a few different ways you could make it work.
One idea would be to use enumerate while looping over the values in the list, so that you'll have the index at hand when you need to delete an item from the list:
for i, item in enumerate(self.items):
    if item.name == name:
        del self.items[i]
        return
Note that I return from the function immediately after deleting the item. If multiple items with the same name could exist in the list at once, this may not be what you want, but this code can't properly handle that case because once you delete an item from the list, the iteration won't work properly (it will let you keep iterating, but it will have skipped one value).
A better option might be to rebuild the list so that it only includes the values you want to keep, using a list comprehension.
self.items = [item for item in self.items if item.name != name]
That's nice and concise, and it will work no matter how many items have the name you want to remove!
One flaw both of the approaches above share is that they'll be fairly slow for large lists: they need to iterate over all the items because they can't tell ahead of time where the item to remove is stored. An alternative might be to use a dictionary, rather than a list, to store the items. If you use the item names as keys, you'll be able to look them up very efficiently.
Here's an implementation that does that, though it only allows one item to have any given name (adding another one will replace the first):
class Container(object):
    def __init__(self):
        self.counter = 0
        self.items = {}  # create a dict!

    def append(self, item):
        self.counter += 1
        self.items[item.name] = item  # add items to it, keyed under their names

    def __delitem__(self, name):
        del self.items[name]  # this becomes *really* simple, and efficient

    def __iter__(self):
        for c in self.items.values():  # loop over the dict's values to get the items
            yield c
It is a good idea not to modify the list we are looping over inside the loop itself. So just find the index of the item and delete it outside the loop.
def __delitem__(self, name):
    idx = -1
    found = False
    for c in self.items:
        idx += 1
        print("element name:{}, to delete:{}".format(c.name, name))
        if c.name == name:
            found = True
            break
    if found:
        del self.items[idx]
The way you have it implemented, it's going to be slow for operations like del. And if you wanted to add other methods that return your objects by name, like __getitem__(), looking them up by iterating through a list would also be slow. You probably want a dictionary to hold your XLData objects inside Container. And you won't need to keep a count of them, since Python's container objects already know their own length.
class Container(object):  # Python 3 doesn't require 'object' in class decls.
    def __init__(self):
        self._items = {}

    def add(self, item):
        # self._items.append(item)  # Why create a new list each time.
        #                           # Just append.
        self._items[item.name] = item

    def __len__(self):
        return len(self._items)

    def __getitem__(self, name):
        return self._items[name]

    def __delitem__(self, name):
        del self._items[name]  # Simple.

    def __iter__(self):
        for c in self._items.values():
            yield c
With a dict you get the benefits of both a list and a dictionary: fast access by name, and iteration over items, etc. The dict keeps track of the order in which the keys and items are added. It is possible to have more than one data type holding information on your contained objects if you really needed a separate list to sort and iterate over. You just have to keep the dict and list in sync.
Come to think of it, you could even sort the dictionary without a list if you wanted your class to support a sort() operation, just requires a little creativity.
def sort(self, key=None):
    self._items = {k: v for k, v in sorted(self._items.items(), key=key)}
I think I'm taking it a bit too far now =)
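A tiny usage sketch of that sort() (reusing the question's XLData; only the names are checked here):

dl = Container()
for name in ('c', 'a', 'b'):
    dl.add(XLData(name))

dl.sort()                          # with key=None the (name, item) pairs sort by name
print([item.name for item in dl])  # ['a', 'b', 'c']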
An alternative method: use a list comprehension to filter on the object's name attribute inside the __delitem__ method.
import numpy as np
import pandas as pd

class XLData(object):
    def __init__(self, name):
        self.name = name
        self.data = pd.DataFrame({self.name: list("ASDF" * 2),
                                  'x': np.random.randint(1, 100, 8)})

    def __repr__(self):
        return repr(self.data.head(2))

class Container(object):
    def __init__(self):
        self.counter = 0
        self.items = []

    def append(self, item):
        self.counter += 1
        self.items = self.items + [item]

    def __delitem__(self, name):
        self.items = [x for x in self.items if x.name != name]

    def __iter__(self):
        for c in self.items:
            yield c

a = XLData('a')
b = XLData('b')
c = XLData('c')

dl = Container()
dl.append(a)
dl.append(b)
dl.append(c)

del dl['b']
for c in dl:
    print(c)
Output:
   a   x
0  A  13
1  S  97
   c   x
0  A  91
1  S  17

How to make a hashable and comparable custom object with mutable fields in Python?

I want to define a hashable class with mutable fields, but if I define the __eq__ method myself (which I definitely want to do), the class is no longer hashable by default and I need to define a __hash__ method as well.
If I define the __hash__ method so that it just uses the id() value, as it (I think) did by default before I defined __eq__, I get weird results.
According to this question, I have to define __hash__ consistently with __eq__, but I would like to understand why this is the case and, if possible, how to work around it.
To be specific, I want to define objects with two mutable fields containing integers:
class CustomObj:
    def __init__(self, a, b):
        self.a = a
        self.b = b

ObjA = CustomObj(1, 2)
ObjB = CustomObj(1, 2)
I want to be able to compare them to each other ObjA == ObjB, as well as to simple tuples ObjA == (1, 2). I also want to be able to store them in a set myset = set(), so that I can check both (1, 2) in myset and ObjA in myset for the same result. How can I achieve this?
This is my code at the moment, but it's definitely not behaving correctly:
class Agent:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __eq__(self, other):
        try:
            # Try comparing two agents
            return (self.x == other.x and self.y == other.y
                    and isinstance(other, Agent))
        except AttributeError:
            # If it fails, compare an agent and a tuple
            if isinstance(other, tuple) and len(other) == 2:
                return self.x == other[0] and self.y == other[1]
            else:
                return NotImplemented

    def __hash__(self):
        return id(self)
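A sketch of one consistent __eq__/__hash__ pairing (with the usual caveat the question is really about: if x or y are mutated after the object has been put into a set, lookups for it will silently break, because its hash no longer matches the bucket it was stored under):

class Agent:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __eq__(self, other):
        if isinstance(other, Agent):
            return (self.x, self.y) == (other.x, other.y)
        if isinstance(other, tuple) and len(other) == 2:
            return (self.x, self.y) == other
        return NotImplemented

    def __hash__(self):
        # matches hash((x, y)) so that both Agent(1, 2) in myset and (1, 2) in myset work
        return hash((self.x, self.y))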

Caching attributes in superclass

I have a class which caches some values to avoid computing them many times, for instance
class A(object):
    def __init__(self, a, b):
        self.a = a
        self.b = b
        self._value = None

    @property
    def value(self):
        if self._value is None:
            self._value = ...  # <complex code that produces value>
        return self._value
In this way, self._value is computed only once and all the other times the precomputed value is returned. So far so good.
Now, let's suppose I want to subclass A with class B. In our case class B will have its own method of computing self._value but it sometimes will need A's value, like in this example:
class B(A):
    def __init__(self, a, b):
        super().__init__(a, b)

    @property
    def value(self):
        if self._value is None:
            self._value = ...  # <complex code that produces B's version of value>
        return self._value

    def get_old_value(self):
        return super().value  # here comes the trouble
Now, clearly the trouble is that if get_old_value() is called before value(), it will cache A's value forever. In the same way, if value() is called before get_old_value(), get_old_value() will actually always return value()'s result.
Of course, one could simply copy A's <complex code that produces value> into the implementation of get_old_value(), but that would duplicate code (which would pretty much make subclassing useless). One could also wrap <complex code that produces value> inside another method in A and call that method in get_old_value(), but this would not use caching at all.
Another way could be the following:
def get_old_value(self):
    result = super().value
    self._value = None
    return result
but that would remove caching for A's version of value anyway, and it does not look clean at all. Is there any better way to accomplish this?
One thing I want to add is that in my code A and B make really sense as superclass and subclass, otherwise I would consider composition.
What you need to do is use name-mangling -- this will allow each class/subclass to maintain a private version of the variable so they don't clobber each other:
class A(object):
    def __init__(self, a, b):
        self.a = a
        self.b = b
        self.__value = None

    @property
    def value(self):
        if self.__value is None:
            self.__value = 7
        return self.__value

class B(A):
    def __init__(self, a, b):
        super().__init__(a, b)
        self.__value = None

    @property
    def value(self):
        if self.__value is None:
            self.__value = 17
        return self.__value

    def get_old_value(self):
        return super().value  # no more trouble here
And in use:
>>> b = B(1, 2)
>>> print(b.value)
17
>>> print(b.get_old_value())
7
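If it helps to see what is going on under the hood, the instance ends up carrying two separately mangled attributes (continuing the session above):
>>> sorted(vars(b))
['_A__value', '_B__value', 'a', 'b']
>>> b._A__value, b._B__value
(7, 17)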
Please note you now need to set __value in B's __init__ as well.
See also this answer for a couple more tidbits about name-mangling.
