How can I assign a custom object to xarray data values?

I have created a DataArray using xarray successfully:
    df_invoice_features = xr.DataArray(data=None,
                                       dims={"y", "x"},
                                       coords={"y": unique_invoices, "x": cols})
I created a custom class and assigned one value of xarray to the instance of this class:
    class MyArray:
        def __init__(self, s):
            self.arr = np.array((s))

        def set(self, idx, val):
            self.arr[idx] = val

        def get(self):
            return self.arr

    df_invoice_features.loc['basket_value_brand', invoice_id] = MyArray(len_b)
The assignment succeeds as well.
But when I want to update the array of this class instance:
    df_invoice_features.loc['basket_value_brand', invoice_id].set(0, 10)
It raises this error:
    AttributeError: 'DataArray' object has no attribute 'set'
How can I use an array, dictionary or my custom object inside xarray data values?

df_invoice_features.loc['basket_value_brand', invoice_id] doesn't actually return MyArray(len_b). It returns an xarray DataArray: the subset of your full DataArray at the coordinate ['basket_value_brand', invoice_id]. That subset carries not just the value at that location (MyArray(len_b)) but also everything else stored with it, i.e. your coordinates, metadata, etc.
If you want the actual value at that location, use .values. For a single cell this gives a zero-dimensional NumPy array wrapping the object, so call .item() on it to unwrap the object itself:
    df_invoice_features.loc['basket_value_brand', invoice_id].values.item()
That should get you the MyArray(len_b) you're looking for. However, I'm not entirely clear what you would like to do with that class. If you're trying to change the value of your DataArray at that location, the xarray docs on indexing and assigning values may be useful to review.
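To make the difference between the cell and its content concrete, here is a minimal sketch. It assumes an object-dtype array is acceptable for your data; the invoice and column labels are made up, MyArray is changed to use np.zeros so that set() has something to index, and dims is passed as a tuple because a set has no defined order.

```python
import numpy as np
import xarray as xr


class MyArray:
    def __init__(self, size):
        # Assumption: a zero-filled 1-D array of the given length is wanted.
        self.arr = np.zeros(size)

    def set(self, idx, val):
        self.arr[idx] = val

    def get(self):
        return self.arr


# Hypothetical labels, only for illustration.
invoices = ["inv1", "inv2"]
cols = ["basket_value_brand", "other"]

# dtype=object lets each cell hold an arbitrary Python object.
da = xr.DataArray(
    np.full((len(invoices), len(cols)), None, dtype=object),
    dims=("y", "x"),
    coords={"y": invoices, "x": cols},
)

da.loc["inv1", "basket_value_brand"] = MyArray(3)

cell = da.loc["inv1", "basket_value_brand"]  # still a (0-d) DataArray
obj = cell.item()                            # the MyArray instance itself
obj.set(0, 10)

print(type(cell).__name__)
print(type(obj).__name__)
print(obj.get())
```

Because the cell stores a reference to the object, mutating obj also changes what later lookups of that cell return.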

Related

How to get attributes (not methods) of a class in Python

Hello everyone!
Basically, I'm looking to retrieve all the attributes of a class without having access to self (to create a diagram that includes the attributes).
For now I don't have any code; I just have an 'obj' variable which contains the class.
I would therefore like to know how, via 'obj', I can retrieve all the attributes, including those which are defined inside functions.
Thanking you in advance,
VitriSnake
You can read the __dict__ attribute of an instance of your class. It is a dictionary containing all the instance attributes, with the values set by the constructor.
    class Tree:
        def __init__(self):
            self.trunk_size = 20
            self.leaf_colour = "Orange"

    if "__main__" == __name__:
        tree = Tree()
        print(tree.__dict__)
Result: {'trunk_size': 20, 'leaf_colour': 'Orange'}
If you just want the values, use tree.__dict__.values(); for the keys, i.e. the attribute names, use tree.__dict__.keys().
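Note that attributes assigned inside methods such as __init__ only exist once those methods have run, so the class object alone does not list them. A minimal sketch of separating data attributes from methods, assuming the class can be instantiated without arguments; the helper data_attributes, the class attribute kind and the method grow are hypothetical additions for illustration:

```python
import inspect


class Tree:
    kind = "deciduous"  # class-level attribute (hypothetical addition)

    def __init__(self):
        self.trunk_size = 20
        self.leaf_colour = "Orange"

    def grow(self):  # a method, which should be excluded
        self.trunk_size += 1


def data_attributes(obj):
    """Return the non-callable, non-dunder attributes of obj as a dict."""
    return {
        name: value
        for name, value in inspect.getmembers(obj)
        if not name.startswith("__") and not callable(value)
    }


tree = Tree()
print(vars(tree))             # {'trunk_size': 20, 'leaf_colour': 'Orange'}
print(data_attributes(Tree))  # {'kind': 'deciduous'}
print(data_attributes(tree))  # {'kind': 'deciduous', 'leaf_colour': 'Orange', 'trunk_size': 20}
```

vars(tree) is equivalent to tree.__dict__; inspect.getmembers additionally sees class-level attributes, which is why kind shows up in the last two results.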

Collection of objects that are set up only if actually used

I'm writing a class to manage a collection of objects that I'd like to "load" only if actually used (imagine that each object is a heavy document). I also want to refer to each object both by a numeric key and by a string name.
I decided to create a class that inherits from OrderedDict:
    from collections import OrderedDict

    class MyClass:
        def load_me(self, key):
            print(f"Object {key} loaded")

    class MyClassColl(OrderedDict):
        def __getitem__(self, key):
            if isinstance(key, int):
                key = list(self.keys())[key]
            res = super().get(key).load_me(key)
            return res
When I initialise the collection and retrieve a single object, everything works well:
    my_coll = MyClassColl([('Obj1', MyClass()), ('Obj2', MyClass()), ('Obj3', MyClass())])
    my_obj = my_coll['Obj2']  # or my_obj = my_coll[1]
prints:
    Object Obj2 loaded
But in a loop the objects are not loaded, so:
    for key, item in my_coll.items():
        obj = item
produces no output.
This is because the __getitem__ method is not called when you loop through the dictionary like that. It is only called when you use the index operator (as far as I know). So an easy fix is to write your for loop like this:
    for key in my_coll:
        item = my_coll[key]
Alternatively you could try playing around with the __iter__ method but I think the way you've done it is probably ideal.
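If you would rather keep the for key, item in my_coll.items() loop, one option is to override items() and values() so that they go through __getitem__. This is only a sketch of that idea; load_me is changed here to set a loaded flag and return the object, both assumptions made for illustration.

```python
from collections import OrderedDict


class MyClass:
    def __init__(self):
        self.loaded = False  # hypothetical flag to make loading observable

    def load_me(self, key):
        self.loaded = True
        print(f"Object {key} loaded")
        return self  # assumption: return the object so callers can use it


class MyClassColl(OrderedDict):
    def __getitem__(self, key):
        if isinstance(key, int):
            key = list(self.keys())[key]
        return super().__getitem__(key).load_me(key)

    def items(self):
        # Route every lookup through __getitem__ so objects load on access.
        for key in self:
            yield key, self[key]

    def values(self):
        for key in self:
            yield self[key]


first, second = MyClass(), MyClass()
my_coll = MyClassColl([("Obj1", first), ("Obj2", second)])

for key, item in my_coll.items():
    pass  # each item has been loaded by the time it is yielded
```

The trade-off is that items() and values() now return generators instead of dictionary views, so code that relies on view behaviour (len(), set operations) would need adjusting.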

Load inconsistent data in pymongo

I am working with pymongo and want to ensure that saved data can still be loaded even if additional data elements have been added to the schema.
I have used this for classes that don't need to process the information before assigning it to class attributes:
    class MyClass(object):
        def __init__(self, instance_id):
            # set default values
            self.database_id = instance_id
            self.myvar = 0
            # load values from database
            self.__load()

        def __load(self):
            data_dict = Collection.find_one({"_id": self.database_id})
            for key, attribute in data_dict.items():
                self.__setattr__(key, attribute)
However, in classes where I have to process the data from the database, this doesn't work:
    class Example(object):
        def __init__(self, name):
            self.name = name
            self.database_id = None
            self.member_dict = {}
            self.load()

        def load(self):
            data_dict = Collection.find_one({"name": self.name})
            self.database_id = data_dict["_id"]
            for element in data_dict["element_list"]:
                self.process_element(element)
            for member_name, member_info in data_dict["member_class_dict"].items():
                self.member_dict[member_name] = MemberClass(member_info)

        def process_element(self, element):
            print("Do Stuff")
Two example use cases I have are:
1) A list of strings that are used to set flags; this is done by calling a function with the string as the argument (process_element above).
2) A dictionary of dictionaries which is used to create instances of a class (MemberClass(member_info) above).
I tried creating properties to handle this but found that __setattr__ doesn't look for properties.
I know I could redefine __setattr__ to look for specific names but it is my understanding that this would slow down all set interactions with the class and I would prefer to avoid that.
I also know I could use a bunch of try/excepts to catch the errors but this would end up making the code very bulky.
I don't mind the load function being slowed down a bit for this but very much want to avoid anything that will slow down the class outside of loading.
So the solution I came up with keeps the idea of special-casing certain keys, but handles those cases in the load function instead of in __setattr__:
    def load(self):
        data_dict = Collection.find_one({"name": self.name})
        for key, attribute in data_dict.items():
            if key == "_id":
                self.database_id = attribute
            elif key == "element_list":
                for element in attribute:
                    self.process_element(element)
            elif key == "member_class_dict":
                for member_name, member_info in attribute.items():
                    self.member_dict[member_name] = MemberClass(member_info)
            else:
                self.__setattr__(key, attribute)
This provides all of the functionality of overriding the __setattr__ method without slowing down any future calls to __setattr__ outside of loading the class.
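As the number of special keys grows, the if/elif chain could also be written as a lookup table from document keys to handler methods. The following is a rough sketch of that variation, not part of the original solution: data_dict is passed in directly as a stand-in for the result of Collection.find_one so that it runs without a database, and MemberClass, flags, _LOADERS and the _load_* method names are all hypothetical.

```python
class MemberClass:
    """Hypothetical stand-in for the MemberClass used in the question."""

    def __init__(self, info):
        self.info = info


class Example:
    # Document keys that need processing, mapped to the handling method name.
    _LOADERS = {
        "_id": "_load_id",
        "element_list": "_load_elements",
        "member_class_dict": "_load_members",
    }

    def __init__(self, name, data_dict):
        self.name = name
        self.database_id = None
        self.member_dict = {}
        self.flags = set()
        self.load(data_dict)

    def _load_id(self, value):
        self.database_id = value

    def _load_elements(self, elements):
        for element in elements:
            self.flags.add(element)  # placeholder for process_element

    def _load_members(self, members):
        for member_name, member_info in members.items():
            self.member_dict[member_name] = MemberClass(member_info)

    def load(self, data_dict):
        for key, value in data_dict.items():
            handler = self._LOADERS.get(key)
            if handler is not None:
                getattr(self, handler)(value)
            else:
                # Unknown keys (e.g. newly added schema fields) become plain attributes.
                setattr(self, key, value)


# A made-up document with one field the class does not know about.
document = {
    "_id": 7,
    "name": "demo",
    "element_list": ["a", "b"],
    "member_class_dict": {"m1": {"x": 1}},
    "new_field": 42,
}
example = Example("demo", document)
print(example.database_id, sorted(example.flags), example.new_field)
```

The table lookup only happens inside load, so ordinary attribute assignment on the class stays as fast as before.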

How to store in variable function returning value (kivy properties)

    class Data(object):
        def get_key_nicks(self):
            '''
            It returns key and nicks object
            '''
            file = open(self.key_address, 'rb')
            key = pickle.load(file)
            file.close()
            file = open(self.nicks_address, 'rb')
            nicks = pickle.load(file)
            file.close()
            return (key, nicks)
Above is the data API and the function which I want to use in Kivy:
    class MainScreen(FloatLayout):
        data = ObjectProperty(Data())
        key, nicks = ListProperty(data.get_key_nicks())
It gives an error like: AttributeError: 'kivy.properties.ObjectProperty' object has no attribute 'get_key_nicks'
Properties are descriptors, which basically means they look like normal attributes when accessed from instances of the class, but at class level they are objects on their own. That's the nature of the problem here - at class level data is an ObjectProperty, even though if you access it from an instance of the class you'll get your Data() object that you passed in as the default value.
That said, I don't know what your code is actually trying to do, do you want key and nicks to be separate ListProperties?
Could you expand a bit more on what you're trying to do?
I think all you actually need to do is:
    class MainScreen(FloatLayout):
        data = ObjectProperty(Data())

        def get_key_nicks(self):
            # Go through the instance so the property resolves to the Data() object.
            return self.data.get_key_nicks()
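The class-level versus instance-level behaviour described above can be reproduced without Kivy. The sketch below uses a hand-written descriptor, DefaultProperty, as a simplified stand-in; it is not how Kivy implements its properties, and the returned key and nicks values are made up.

```python
class DefaultProperty:
    """Simplified stand-in for a Kivy property (not Kivy's implementation)."""

    def __init__(self, default):
        self.default = default

    def __set_name__(self, owner, name):
        self.storage_name = "_" + name

    def __get__(self, instance, owner=None):
        if instance is None:
            return self  # class-level access yields the descriptor itself
        return getattr(instance, self.storage_name, self.default)

    def __set__(self, instance, value):
        setattr(instance, self.storage_name, value)


class Data:
    def get_key_nicks(self):
        return ("key", ["nick1", "nick2"])  # made-up values


class MainScreen:
    data = DefaultProperty(Data())

    def get_key_nicks(self):
        # Instance access goes through __get__ and yields the Data object.
        return self.data.get_key_nicks()


print(type(MainScreen.data).__name__)              # DefaultProperty
print(hasattr(MainScreen.data, "get_key_nicks"))   # False
print(MainScreen().get_key_nicks())                # ('key', ['nick1', 'nick2'])
```

Inside the class body, data is still the bare descriptor object, which is why calling data.get_key_nicks() there fails.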

csv file process in Python

I work with CSV data like the following:
ticker,exchange_country,company_name,price,exchange_rate,shares_outstanding,net_income
1,HK,CK HUTCHISON HOLDINGS LTD,1.404816984,7.757949829,3859.677979,31633
2,HK,CLP HOLDINGS LTD,1.312602194,7.757949829,2526.450928,16319
3,HK,HONG KONG & CHINA GAS CO LTD,0.234939214,7.757949829,12717.04199,7546.200195
11,HK,HANG SENG BANK LTD,2.198193203,7.757949829,1911.843018,15451
I have a StockStatRecord class:
    class StockStatRecord:
        def __init__(self, stock_load):
            self.name = stock_load[0]
            self.company_name = stock_load[2]
            self.exchange_country = stock_load[1]
            self.price = stock_load[3]
            self.exchange_rate = stock_load[4]
            self.shares_outstanding = stock_load[5]
            self.net_income = stock_load[6]
How am I supposed to create another class that extracts the data from that CSV, parses it, creates a new record and returns it? This class also needs to validate the rows while reading. Validation fails for any row that is missing any piece of information, or if the name (symbol or player name) is empty, or if any of the numbers (int or float) cannot be parsed (watch out for division by zero).
There are several ways of doing this: either rolling out the code yourself, or using a Python module made for verifying data schemas, like Colander, or the extended CSV reader in Pandas (as Zwinck posted in the comment above).
What is not usually needed is a separate class to check values. You can do that in the same class, or, more usually, have a base class that implements the data-validation mechanisms and then just add extra information about each field in the actual data classes. And finally, if you need to process data and return an object, there is no need for a class at all, because in Python you can have functions independent of classes; there is no need to hammer every piece of code into a class.
One simple approach is to use Python's csv.DictReader instead of csv.reader to read the rows. That way each piece of data is already bound to its column name, as a dict, instead of sitting in a list where you have to track the column numbers manually. Then set a property for each column that needs validation, so that the field is validated when it is set, and write an __init__ method that simply assigns all fields to their respective attributes:
    import csv

    class StockStatRecord:
        def __init__(self, row):
            # Assign every column to the attribute of the same name; a property
            # only fires if the CSV has a column with that exact name.
            for key, value in row.items():
                setattr(self, key, value)

        @property
        def name(self):
            return self._name

        @name.setter
        def name(self, value):
            if not value:  # example verification for empty name
                raise ValueError("name must not be empty")
            self._name = value

        # continue for other fields

    all_records = []
    with open("mydatafile.csv", newline="") as f:
        for row in csv.DictReader(f):
            try:
                all_records.append(StockStatRecord(row))
            except ValueError:
                print("Some error at record: {}".format(row))
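The point about not needing a class can be sketched with two plain functions. This is only one possible reading of the validation rules in the question: the ticker column is taken as the record name, every numeric column is parsed with float, and the division-by-zero concern is left out because the question does not say which calculation is meant. The first data row comes from the question; the other three are deliberately broken variants made up to trigger each rule, and the names parse_row and load_records are hypothetical.

```python
import csv
import io

NUMERIC_FIELDS = ("price", "exchange_rate", "shares_outstanding", "net_income")

# Row 1 is from the question; rows 2-4 are made-up broken variants.
SAMPLE = """\
ticker,exchange_country,company_name,price,exchange_rate,shares_outstanding,net_income
1,HK,CK HUTCHISON HOLDINGS LTD,1.404816984,7.757949829,3859.677979,31633
2,HK,CLP HOLDINGS LTD,n/a,7.757949829,2526.450928,16319
,HK,HONG KONG & CHINA GAS CO LTD,0.234939214,7.757949829,12717.04199,7546.200195
11,HK,HANG SENG BANK LTD,2.198193203,7.757949829,1911.843018
"""


def parse_row(row):
    """Return a validated record dict for one CSV row, or raise ValueError."""
    # DictReader stores surplus columns under the key None
    # and fills missing trailing columns with the value None.
    if None in row:
        raise ValueError("too many columns")
    if any(value is None or not value.strip() for value in row.values()):
        raise ValueError("missing or empty field")
    record = {
        "name": row["ticker"],
        "exchange_country": row["exchange_country"],
        "company_name": row["company_name"],
    }
    for field in NUMERIC_FIELDS:
        record[field] = float(row[field])  # raises ValueError if unparseable
    return record


def load_records(stream):
    """Split the rows of a CSV stream into valid records and rejected rows."""
    records, rejected = [], []
    reader = csv.DictReader(stream)
    for row in reader:
        try:
            records.append(parse_row(row))
        except ValueError as exc:
            rejected.append((reader.line_num, str(exc)))
    return records, rejected


records, rejected = load_records(io.StringIO(SAMPLE))
print(len(records), "valid,", len(rejected), "rejected")  # 1 valid, 3 rejected
```

To read a real file instead of the inline sample, pass an open file object (opened with newline="") to load_records.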
