So I am writing code where I generate certain data in a class and save it in a dictionary. I want to use that data in a second class. The first class is as follows:
class DataAnalysis():
    def __init__(self, matfile=None):
        '''Constructor
        '''
        self.matfile = matfile

    def get_alldata(self):
        print('all dict data accessed')
        print(bodedata_dict)
        return bodedata_dict

if __name__ == '__main__':
    obj1 = DataAnalysis(matfile=matfile)
    "do some work"
    bodedata_dict.update(bode_data)
    obj1.get_alldata()
I then access the dictionary in the second class as:
from A import DataAnalysis

class PlotComaparison(DataAnalysis):
    ...

if __name__ == '__main__':
    obj = DataAnalysis(matfile=None)
    obj1 = PlotComaparison(obj)
    dict_data = obj.get_alldata()
But when I run the script with the second class, it gives me the following error:
File "DataAnalysis.py", line 301, in get_alldata
print(bodedata_dict)
NameError: name 'bodedata_dict' is not defined
I am very new to the concept of classes in Python, so please help me understand how I can use data from one class in another.
The get_alldata() method in the DataAnalysis class you defined is returning a bodedata_dict which isn't defined anywhere. It's like printing the content of a variable without defining it first.
EDIT:
Looking further into it, bodedata_dict in the first example comes from outside the class. You would likely want to change the flow of your program so that when DataAnalysis's get data method is called, it doesn't depend on outside state.
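One possible restructuring (a sketch, not the asker's original code; run_analysis and the sample data are made up for illustration) is to make the dictionary an instance attribute that the class fills and returns, so any other class can reach it through the object:

class DataAnalysis():
    def __init__(self, matfile=None):
        self.matfile = matfile
        self.bodedata_dict = {}          # owned by the instance, not the module

    def run_analysis(self):
        # hypothetical stand-in for "do some work"
        self.bodedata_dict.update({'freq': [1, 2, 3]})

    def get_alldata(self):
        print('all dict data accessed')
        return self.bodedata_dict

if __name__ == '__main__':
    obj = DataAnalysis()
    obj.run_analysis()
    dict_data = obj.get_alldata()        # any other class or module can do the same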
How do I create a metaclass in Python? I tried to write one as shown in tutorials:
class Meta(type):
    def __new__(mcs, name, bases, attrs):
        attrs2 = {'field2': 'Test'}
        attrs2.update(attrs)
        return super(Meta, mcs).__new__(mcs, name, bases, attrs2)

class Test(object):
    __metaclass__ = Meta
    field1 = 10

test = Test()
print(test.field1)
print(test.field2)
But this code fails with error:
10
Traceback (most recent call last):
File "main.py", line 18, in <module>
print(test.field2)
AttributeError: 'Test' object has no attribute 'field2'
How do I declare a metaclass in Python 3.7+ correctly?
UPDATED
I've updated my question with the actual error.
The tutorials you are checking cover Python 2.
In Python 3, one of the syntactic changes was exactly the way of declaring a metaclass for a class.
You don't need to change the metaclass code, just change your class declaration to:
class Test(metaclass=Meta):
    field1 = 10
and it will work.
So, in short: for a metaclass in Python 3, you have to pass the equivalent of a "keyword argument" in the class declaration, with the name "metaclass". (Also, in Python 3, there is no need to inherit explicitly from object)
In Python 2, this was accomplished by the presence of the special variable __metaclass__ in the body of the class, as is in your example. (Also, when setting a metaclass, inheriting from 'object' would be optional, since the metaclass, derived from type, would do that for you).
One of the main advantages of the new syntax is that it allows the special method __prepare__ in the metaclass, which can return a custom namespace object to be used when building the class body itself. It is seldom used, and a really "serious" use case would be hard to come up with today. For toys and playing around it is great, allowing for "magic autonamed enumerations" and other things - but when Python 3 was designed, this was the mechanism chosen to allow an OrderedDict to serve as the class namespace, so that the metaclass's __new__ and __init__ methods could know the order in which the attributes were declared. Since Python 3.6 a class body namespace is ordered by default, so there is no need for a __prepare__ method for this use alone.
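As an illustration (a sketch, not part of the question above), __prepare__ can pre-populate the namespace the class body is evaluated in, achieving the same field2 injection as the __new__ approach:

class Meta(type):
    @classmethod
    def __prepare__(mcs, name, bases, **kwargs):
        # this mapping becomes the namespace the class body runs in
        return {'field2': 'Test'}

class Test(metaclass=Meta):
    field1 = 10

print(Test.field1)   # 10
print(Test.field2)   # Test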
I defined a subclass of Atom in rdkit.Chem. I also defined an instance attribute in it, but I could not get that attribute back from the RWMol object in rdkit.
Below is sample code for my problem:
from rdkit import Chem

class MyAtom(Chem.Atom):
    def __init__(self, symbol, **kwargs):
        super().__init__(symbol, **kwargs)
        self.my_attribute = 0

    def get_my_attribute(self):
        return self.my_attribute

if __name__ == '__main__':
    rw_mol = Chem.RWMol()
    # I created a MyAtom object, then added it to the RWMol. But I couldn't get it back.
    my_atom = MyAtom('C')
    my_atom.my_attribute = 3
    rw_mol.AddAtom(my_atom)
    atom_in_mol = rw_mol.GetAtoms()[0]
    # I can access my_atom's newly defined attributes.
    print(my_atom.get_my_attribute())
    # The two lines below give an error: AttributeError: 'Atom' object has no attribute 'get_my_attribute'
    print(atom_in_mol.get_my_attribute())
    print(atom_in_mol.my_attribute)
    # type(my_atom):     <class '__main__.MyAtom'>
    # type(atom_in_mol): <class 'rdkit.Chem.rdchem.Atom'>
    # Why are the two atom types different? Due to polymorphism, these two objects should be the same type.
This code should run normally, but it gives an error on the last lines because atom_in_mol is of type Chem.Atom. Shouldn't it be MyAtom? I also cannot access my_attribute directly.
The rdkit Python library is a wrapper around C++. Is that the problem? Can I not use inheritance with this library?
Note: I researched the rdkit documentation and there is a SetProp method for saving values in atoms. It uses a dictionary to save values. It works fine, but it is too slow for my project. I want to use instance attributes to store my extra values. Is there any solution to this inheritance problem, or a different, faster approach?
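For reference, the property-based approach mentioned in the note looks roughly like this (a minimal sketch using RDKit's standard property API, not the asker's code); every value goes through a string-keyed per-atom property dictionary:

from rdkit import Chem

atom = Chem.Atom('C')
atom.SetIntProp('my_attribute', 3)        # stored in the atom's property dictionary
print(atom.GetIntProp('my_attribute'))    # 3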
The RDKit Python library is a wrapper around C++, so sometimes it does not follow conventional Python object handling.
To go deeper, you will have to dig through the source code:
rw_mol.AddAtom(my_atom)
Above will execute AddAtom method in rdkit/Code/GraphMol/Wrap/Mol.cpp, which, in turn, calls addAtom method in rdkit/Code/GraphMol/RWMol.h, which then calls addAtom method in rdkit/Code/GraphMol/ROMol.cpp with default argument of updateLabel = true and takeOwnership = false.
The takeOwnership = false condition causes the argument atom to be duplicated:
// rdkit/Code/GraphMol/ROMol.cpp
if (!takeOwnership)
  atom_p = atom_pin->copy();
else
  atom_p = atom_pin;
Finally, if you look at what the copy method does in rdkit/Code/GraphMol/Atom.cpp:
Atom *Atom::copy() const {
  auto *res = new Atom(*this);
  return res;
}
So it instantiates a new plain Atom from your object and returns it, which is why the atom stored in the molecule is a Chem.Atom rather than your MyAtom.
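One possible workaround (a sketch only, not part of the answer above): since the molecule keeps its own copy of each atom, keep the extra values outside RDKit in a plain dict keyed by the atom index that AddAtom returns:

from rdkit import Chem

rw_mol = Chem.RWMol()
extra = {}                               # atom index -> extra value

idx = rw_mol.AddAtom(Chem.Atom('C'))     # the molecule stores a copy of this atom
extra[idx] = 3                           # the value my_attribute would have held

atom_in_mol = rw_mol.GetAtomWithIdx(idx)
print(atom_in_mol.GetSymbol(), extra[atom_in_mol.GetIdx()])   # C 3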
My question was inspired by this question.
The problem there is the 3-level class model - only the terminating (3rd-level) classes should be stored in the registry, but the 2nd-level classes interfere and also get stored, because they are subclasses of the 1st level.
I wanted to get rid of the 1st-level class by using a metaclass. This way only two class levels are left - base classes for each group of settings, and their children - the various setting classes inherited from the corresponding base class. The metaclass serves as a class factory - it should create base classes with the needed methods and shouldn't show up in the inheritance tree.
But my idea doesn't work, because it seems that the __init_subclass__ method isn't inherited from the metaclass by the constructed classes, in contrast to the __init__ method, which works as I expected.
Code snippet № 1. The basic framework of the model:
class Meta_Parent(type):
    pass

class Parent_One(metaclass=Meta_Parent):
    pass

class Child_A(Parent_One):
    pass

class Child_B(Parent_One):
    pass

class Child_C(Parent_One):
    pass

print(Parent_One.__subclasses__())
Output:
[<class '__main__.Child_A'>, <class '__main__.Child_B'>, <class '__main__.Child_C'>]
I wanted to add functionality to the subclassing process of the above model, so I redefined type's built-in __init_subclass__ like this:
Code snippet № 2.
class Meta_Parent(type):
    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        print(cls)
From my point of view, every new class constructed by the Meta_Parent metaclass (for example, Parent_One) should now have an __init_subclass__ method and thus should print the subclass name whenever a class inherits from it, but it prints nothing. That is, my __init_subclass__ method isn't called when inheritance happens.
It does work if the Meta_Parent metaclass is inherited from directly, though:
Code snippet № 3.
class Meta_Parent(type):
    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        print(cls)

class Child_A(Meta_Parent):
    pass

class Child_B(Meta_Parent):
    pass

class Child_C(Meta_Parent):
    pass
Output:
<class '__main__.Child_A'>
<class '__main__.Child_B'>
<class '__main__.Child_C'>
Nothing strange here, the __init_subclass__ was created exactly for this purpose.
For a moment I thought that dunder methods belong to the metaclass only and are not passed on to the newly constructed classes, but then I tried the __init__ method and it works as I expected in the beginning - it looks like __init__ is inherited by every class the metaclass creates.
Code snippet № 4.
class Meta_Parent(type):
    def __init__(cls, name, base, dct):
        super().__init__(name, base, dct)
        print(cls)
Output:
<class '__main__.Parent_One'>
<class '__main__.Child_A'>
<class '__main__.Child_B'>
<class '__main__.Child_C'>
The questions:
Why __init__ works, but __init_subclass__ doesn't?
Is it possible to implement my idea by using metaclass?
1. Why __init__ works, but __init_subclass__ doesn't?
I found the answer by debugging CPython with GDB.
The creation of a new class (type) starts in the type_call() function. It does two main things: creating the new type object and initializing that object.
obj = type->tp_new(type, args, kwds); is the object creation. It calls the type's tp_new slot with the passed arguments. By default tp_new holds a reference to the base type object's tp_new slot, but if any ancestor class implements the __new__ method, the reference is changed to the slot_tp_new dispatcher function. Then type->tp_new(type, args, kwds); calls the slot_tp_new function which, in its turn, searches for the __new__ method in the MRO chain. The same happens with tp_init.
The subclass initialization happens at the end of new type creation - init_subclass(type, kwds). It searches for the __init_subclass__ method in the MRO chain of the just-created object by using a super object. In my case the object's MRO chain has two items:
print(Parent_One.__mro__)
### Output
(<class '__main__.Parent_One'>, <class 'object'>)
int res = type->tp_init(obj, args, kwds); is the object initialization. It also searches for the __init__ method in the MRO chain, but uses the metaclass's MRO, not the just-created object's MRO. In my case the metaclass MRO has three items:
print(Meta_Parent.__mro__)
###Output
(<class '__main__.Meta_Parent'>, <class 'type'>, <class 'object'>)
So, the answer is: the __init_subclass__ and __init__ methods are searched for in different places:
__init_subclass__ is searched for first in Parent_One's __dict__, then in object's __dict__.
__init__ is searched for in this order: Meta_Parent's __dict__, type's __dict__, object's __dict__.
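A minimal sketch illustrating this lookup difference (not from the original question): if __init_subclass__ is placed in the body of Parent_One itself instead of in the metaclass, it is found through Parent_One's MRO and fires for every child:

class Meta_Parent(type):
    pass

class Parent_One(metaclass=Meta_Parent):
    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        print(cls)

class Child_A(Parent_One):
    pass

# Output:
# <class '__main__.Child_A'>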
2. Is it possible to implement my idea by using metaclass?
I came up with the following solution. It has a drawback - the metaclass __init__ is called for every subclass, children included, which means all subclasses get registry and __init_subclass__ attributes they don't need. But it works as I requested in the question.
#!/usr/bin/python3

class Meta_Parent(type):
    def __init__(cls, name, base, dct, **kwargs):
        super().__init__(name, base, dct)
        # Add the registry attribute to each new child class.
        # It is not needed in the terminal children though.
        cls.registry = {}

        @classmethod
        def __init_subclass__(cls, setting=None, **kwargs):
            super().__init_subclass__(**kwargs)
            cls.registry[setting] = cls

        # Assign the nested classmethod to the "__init_subclass__" attribute
        # of each child class.
        # It isn't needed in the terminal children either.
        # Maybe there is a way to avoid adding these needless attributes
        # (registry, __init_subclass__) there; I haven't thought about it yet.
        cls.__init_subclass__ = __init_subclass__


# Create two base classes.
# All child subclasses will be inherited from them.
class Parent_One(metaclass=Meta_Parent):
    pass

class Parent_Two(metaclass=Meta_Parent):
    pass


### Parent_One's children
class Child_A(Parent_One, setting='Child_A'):
    pass

class Child_B(Parent_One, setting='Child_B'):
    pass

class Child_C(Parent_One, setting='Child_C'):
    pass


### Parent_Two's children
class Child_E(Parent_Two, setting='Child_E'):
    pass

class Child_D(Parent_Two, setting='Child_D'):
    pass


# Print results.
print("Parent_One.registry: ", Parent_One.registry)
print("#" * 100, "\n")
print("Parent_Two.registry: ", Parent_Two.registry)
Output
Parent_One.registry: {'Child_A': <class '__main__.Child_A'>, 'Child_B': <class '__main__.Child_B'>, 'Child_C': <class '__main__.Child_C'>}
####################################################################################################
Parent_Two.registry: {'Child_E': <class '__main__.Child_E'>, 'Child_D': <class '__main__.Child_D'>}
The solution I came up with and use/like is:
class Meta_Parent(type):
    def _init_subclass_override(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        # Do whatever... I raise an exception if something is wrong
        #
        # e.g.
        # if the sub-class's name does not start with "Child_"
        #     raise NameError
        #
        # cls is the actual class, Child_A in this case


class Parent_One(metaclass=Meta_Parent):
    @classmethod
    def __init_subclass__(cls, **kwargs):
        Meta_Parent._init_subclass_override(cls, **kwargs)


### Parent_One's children
class Child_A(Parent_One):
    pass
I like this because it DRYs the sub-class creation code/checks. At the same time, if you see Parent_One, you know that there is something happening whenever a sub-class is created.
I did it while mucking around to mimic my own Interface functionality (instead of using ABC), and the override method checks for existence of certain methods in the sub-classes.
One can argue whether the override method really belongs in the metaclass, or somewhere else.
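A sketch of the kind of check described above (the required method names run and report are made up for illustration): the shared override verifies that every new sub-class defines them, so the check lives in a single place:

class Meta_Parent(type):
    def _init_subclass_override(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        for required in ('run', 'report'):
            if not callable(getattr(cls, required, None)):
                raise TypeError(f"{cls.__name__} must define a '{required}' method")


class Parent_One(metaclass=Meta_Parent):
    @classmethod
    def __init_subclass__(cls, **kwargs):
        Meta_Parent._init_subclass_override(cls, **kwargs)


class Child_A(Parent_One):      # OK
    def run(self): ...
    def report(self): ...


class Child_Bad(Parent_One):    # raises TypeError: missing 'report'
    def run(self): ...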
I was using PySpark to process some call data. As you can see, I added some inner classes to the class GetInfoFromCalls dynamically by using a metaclass.
The code below is located in the package for_test, which exists on all nodes:
class StatusField(object):
    """
    some alias.
    """
    failed = "failed"
    succeed = "succeed"
    status = "status"
    getNothingDefaultValue = "-999999"


class Result(object):
    """
    Result that store result and some info about it.
    """
    def __init__(self, result, status, message=None):
        self.result = result
        self.status = status
        self.message = message


structureList = [
    ("user_mobile", str, None),
    ("real_name", str, None),
    ("channel_attr", str, None),
    ("channel_src", str, None),
    ("task_data", dict, None),
    ("bill_info", list, "task_data"),
    ("account_info", list, "task_data"),
    ("payment_info", list, "task_data"),
    ("call_info", list, "task_data")
]


def inner_get(self, defaultValue=StatusField.getNothingDefaultValue):
    try:
        return self.holder.get(self)
    except Exception as e:
        return Result(defaultValue, StatusField.failed)
        print(e)


class call_meta(type):
    def __init__(cls, name, bases, attrs):
        for name_str, type_class, pLevel_str in structureList:
            setattr(cls, name_str, type(
                name_str,
                (object,),
                {})
            )


class GetInfoFromCalls(object, metaclass=call_meta):
    def __init__(self, call_deatails):
        for name_str, type_class, pLevel_str in structureList:
            inn = getattr(self.__class__, name_str)()
            object_dict = {
                "name": name_str,
                "type": type_class,
                "pLevel": None if pLevel_str is None else getattr(self, pLevel_str),
                "context": None,
                "get": inner_get,
                "holder": self,
            }
            for attr_str, real_attr in object_dict.items():
                setattr(inn, attr_str, real_attr)
            setattr(self, name_str, inn)
        self.call_details = call_deatails
When I ran
import pickle
pickle.dumps(GetInfoFromCalls("foo"))
it raised an error like this:
Traceback (most recent call last):
File "<ipython-input-11-b2d409e35eb4>", line 1, in <module>
pickle.dumps(GetInfoFromCalls("foo"))
PicklingError: Can't pickle <class '__main__.user_mobile'>: attribute lookup user_mobile on __main__ failed
It seems that I can't pickle the inner classes because they were added dynamically by code. When the classes are pickled, the inner classes do not exist as importable names - is that right?
I really don't want to write out these classes, which are nearly identical to each other. Does anyone have a good way to avoid this problem?
Python's pickle actually does not serialize classes: it serializes instances, and puts in the serialization a reference to each instance's class - and that reference is based on the class being bound to a name in a well-defined module. So instances of classes that don't have a module-level name, but instead live as attributes on other classes, or as data inside lists and dictionaries, typically will not work.
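A small standalone sketch (independent of the PySpark code above) showing that reference lookup failing for a class that only exists as an attribute of another class:

import pickle

class Holder:
    pass

# dynamically created class, reachable only as Holder.Inner
Holder.Inner = type("Inner", (object,), {})

try:
    pickle.dumps(Holder.Inner())
except pickle.PicklingError as e:
    print(e)   # e.g. Can't pickle <class '__main__.Inner'>: attribute lookup Inner on __main__ failed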
One straightforward thing to try is to use dill instead of pickle. It is a third-party package that works like pickle but has extensions to serialize arbitrary dynamic classes.
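For classes like the one in the sketch above, dill's usage mirrors pickle's (this assumes the third-party dill package is installed; it serializes the dynamic class by value rather than by reference):

import dill

class Holder:
    pass

Holder.Inner = type("Inner", (object,), {})

data = dill.dumps(Holder.Inner())   # works where pickle.dumps raised PicklingError
obj = dill.loads(data)
print(type(obj))                    # a reconstructed Inner class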
While using dill may help other people reaching this answer, it won't in your case, because in order to use dill you'd have to monkey-patch the underlying RPC mechanism PySpark uses so that it uses dill instead of pickle, and that might not be trivial or consistent enough for production use.
If the problem is really about dynamically created classes being unpicklable, what you can do is create extra metaclasses for the dynamic classes themselves, instead of using "type", and on these metaclasses define proper __getstate__ and __setstate__ methods (or the other helper methods described in the pickle documentation) - that might enable these classes to be pickled by ordinary pickle. That is, a separate metaclass with pickler helper methods to be used instead of type(..., (object,), ...) in your code.
However, "unpicklable object" is not the error you are getting - it is an attribute lookup error, which suggests the structure you are building is not clean enough for pickle to introspect it and get all the members from one of your instances - it is not (yet) related to the unpicklability of the class objects themselves. Since your dynamic classes live as attributes on the class (which is not itself pickled) and not on the instance, it is quite possible that pickle does not care about them. Check the pickle docs mentioned above; maybe all you need is a proper pickle helper method on your class, with nothing different on the metaclass, for everything you have there to work properly.
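A sketch of that kind of helper (simplified; GetInfoFromCallsPicklable, _build_dynamic_parts and the stored state are made up for illustration, not the real class above): __getstate__ keeps only plain data and __setstate__ rebuilds the dynamic attributes after loading:

import pickle

class GetInfoFromCallsPicklable:
    def __init__(self, call_details):
        self.call_details = call_details
        self._build_dynamic_parts()

    def _build_dynamic_parts(self):
        # stand-in for the metaclass-driven setup in the real class
        self.user_mobile = type("user_mobile", (object,), {})()

    def __getstate__(self):
        # keep only plain, picklable data
        return {"call_details": self.call_details}

    def __setstate__(self, state):
        self.call_details = state["call_details"]
        self._build_dynamic_parts()      # recreate the dynamic objects on load


restored = pickle.loads(pickle.dumps(GetInfoFromCallsPicklable("foo")))
print(restored.call_details)   # foo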