Two python(3) classes with the same input (extending a python class) - python-3.x

I have a design change I have been trying to implement with little success, as I can't seem to find my question anywhere.
Currently I have a python class that creates a database connection, stores the index name (table), and other attributes (specifically its an Elasticsearch database connection but that shouldn't matter for this question).
class Create:
# Functions to manipulate Index Objects
def __init__(self, index, type, host, shards=3, replicas=1):
# Create Index Object (OcrBook or OcrPage)
self.index = index
self.type = type
self.shards = shards
self.replicas = replicas
self.es_connection = Elasticsearch([{'host': host, 'port': 9200}])
Associated with this class are functions to manipulate the index objects, for example creating that index (table) on the database (cluster) or modifying that table in some way.
def create_index(self):
# Creates/Executes Index
try:
self.es_connection.indices.create(
index=self.index,
body={
'settings' : {
'number_of_shards' : self.shards,
'number_of_replicas': self.replicas,
}
})
except Exception:
CreateLog.write_log(Exception, 'Create Index Exception')
These being in the same class make sense to me, as the connection to the table/database and creating or modifying that table/database are connected to each other.
I also have a group of other functions that search that particular table. These I believe should be in a separate class as rather then creating or modifying the table/database they are simply searching the table/database and could ideally take any table/database initialized by the create class. Currently I tried breaking them up by doing the following:
class Search(Create):
def find_book(self, bookkey):
""" Finds a Book """
try:
results = self.es_connection.search(self.index, self.type, body={
"query": {
"match": {
"BookKey": bookkey
}
}
})
return results['hits']['hits']
except Exception:
CreateLog.write_log(Exception, 'Could Not Find Book')
This works on windows, but is not portable to 'linux' as the 'class has not been initialized' when I try to use the Search functionality. I know there is a design problem here, and I could combine both classes into one to fix the problem. But I would like to keep them separate. Is there a better way to 'inherit' (I don't believe that's the right word in this case) the object created in the 'Create' class by the search class, does anyone have a better way to separate these logically, or is there a better way to extend the create class with the search functionality? All input is helpful! Thank you.

You seem to be on a OOP path, but why exactly does Search has to be a class? You have a perfect task for a stand-alone function find_book(index_object, bookkey). It does not store anything internally, I do no see why this has to de a class, not a function.
Class naming can also be hinting at your design decisions (or problems). Class name is ususlly a noun, function name tends to be a verb. Create is not a perfect class name to me.
In your setting I'd go with a class IndexObjects (that is Create renamed) and function find_book(index_object, bookkey). You can switch to more of OOP once this design up and running.
Another split of responsibilities that comes to mind is below. Here you inject, not inherit, which allows you to parts more independent.
class IndexObject:
# ...
def query(self, query_dict):
return self.es_connection.search(self.index, self.type, body=query_dict)
class BookSearcher():
def __init__(self, index_object):
self.index_object = index_object
def find(self, book_key):
""" Finds a Book """
query_dict = {"query": {
"match": {
"BookKey": book_key
}
}
}
try:
results = self.index_object.query(query_dict)
return results['hits']['hits']
# FIXME: looks lile bare exception, not great
except Exception:
CreateLog.write_log(Exception, 'Could Not Find Book')

Related

How to implement "relationship" caching system in a similar query?

I noticed that when having a Model such as :
class User(Model):
id = ...
books = relationship('Book')
When calling user.books for the first time, SQLAlchemy query the database (when lazy='select' for instance, which is the default), but sub-sequent call to user.books don't call the database. The results seems to have been cached.
I'd like to have the same feature from SQLAlchemy when using a method that query, for instance:
class User:
def get_books(self):
return Book.query.filter(Book.user_id == self.id).all()
But when doing that, if I call 3 times get_books(), SQLAlchemy does call the database 3 times (when setting the ECHO property to True).
How can I change get_books() to use the caching system from SQLAlchemy ?
I insist to mention "from SQLAlchemy" because I believe they handle the refresh/expunge/flush system and changes are then re-queried to the DB if one of these happened. Opposed to if I were to simply create a caching property in the model with a simple:
def get_books(self):
if self._books is None:
self._books = Book.query.filter(Book.user_id == self.id).all()
return self._books
This does not work well with flush/refresh/expunge from SQLAlchemy.
So, How can I change get_books() to use the caching system from SQLAlchemy ?
Edit 1:
I realized that the solution provided under is not perfect, because it caches for the current object. If you have two instances of the same user, and call get_books on both, two queries will be made because the caching applies only on the instance, not globally, contrary to SQLAlchemy.
The reason is simple - I believe - but still unclear how to apply it in my case: The object is defined at the class level, not the instance (books = relationship()), and they build their own query internally, so they can cache it based on the query.
In the solution I gave, the memoize_getter is unaware of the query made, and as such, cannot cache it for the same value accros multiple instance, so any identical call made to another instance will query the database.
Original answer:
I've been trying to wrap my head around SQLAlchemy's code (wow that's dense!), and I think I figured it out!
A relationship, at least when being set as "lazy='select'" (default), is a InstrumentedAttribute, which contains a get function that does the following :
def __get__(self, instance, owner):
if instance is None:
return self
dict_ = instance_dict(instance)
if self._supports_population and self.key in dict_:
return dict_[self.key]
else:
try:
state = instance_state(instance)
except AttributeError as err:
util.raise_(
orm_exc.UnmappedInstanceError(instance),
replace_context=err,
)
return self.impl.get(state, dict_)
So, a basic caching system, respecting SQLAlchemy, would be something like:
from sqlalchemy.orm.base import instance_dict
def get_books(self):
dict_ = instance_dict(self)
if 'books' not in dict_:
dict_['books'] = Book.query.filter(Book.user_id == self.id).all()
return dict_['books']
Now, we can push the vice a bit further, and do ... a decorator (oh sweet):
def memoize_getter(f):
#functools.wraps(f)
def decorator(instance, *args, **kwargs):
property_name = f.__name__.replace('get_', '')
dict_ = instance_dict(instance)
if property_name not in dict_:
dict_[property_name] = f(instance, *args, **kwargs)
return dict_[property_name]
return decorator
Thus transforming the original method to :
class User:
#memoize_getter
def get_books(self):
return Book.query.filter(Book.user_id == self.id).all()
If someone has a better solution, I'm eagerly interested!

How to add a default filter parameter to every query in mongoengine?

I've been researching a lot, but I haven't found a way.
I have Document clases with a _owner attribute which specifies the ObjectID of the owner, which is a per-request value, so it's globally available. I would like to be able to set part of the query by default.
For example, doing this query
MyClass.objects(id='12345')
should be the same as doing
MyClass.objects(id='12345', _owner=global.owner)
because _owner=global.owner is always added by default
I haven't found a way to override objects, and using a queryset_classis someway confusing because I still have to remember to call a ".owned()" manager to add the filter every time I want to query something.
It ends up like this...
MyClass.objects(id='12345').owned()
// same that ...
MyClass.objects(id='12345', _owner=global.owner)
Any Idea? Thanks!
The following should do the trick for querying (example is simplified by using a constant owned=True but it can easily be extended to use your global):
class OwnedHouseWrapper(object):
# Implements descriptor protocol
def __get__(self, instance, owner):
return House.objects.filter(owned=True)
def __set__(self, instance, value):
raise Exception("can't set .objects")
class House(Document):
address = StringField()
owned = BooleanField(default=False)
class OwnedHouse:
objects = OwnedHouseWrapper()
House(address='garbage 12', owned=True).save()
print(OwnedHouse.objects()) # [<House: House object>]
print(len(OwnedHouse.objects)) # 1

Load inconsistent data in pymongo

I am working with pymongo and am wanting to ensure that data saved can be loaded even if additional data elements have been added to the schema.
I have used this for classes that don't need to have the information processed before assigning it to class attributes:
class MyClass(object):
def __init__(self, instance_id):
#set default values
self.database_id = instance_id
self.myvar = 0
#load values from database
self.__load()
def __load(self):
data_dict = Collection.find_one({"_id":self.database_id})
for key, attribute in data_dict.items():
self.__setattr__(key,attribute)
However, in classes that I have to process the data from the database this doesn't work:
class Example(object):
def __init__(self, name):
self.name = name
self.database_id = None
self.member_dict = {}
self.load()
def load(self):
data_dict = Collection.find_one({"name":self.name})
self.database_id = data_dict["_id"]
for element in data_dict["element_list"]:
self.process_element(element)
for member_name, member_info in data_dict["member_class_dict"].items():
self.member_dict[member_name] = MemberClass(member_info)
def process_element(self, element):
print("Do Stuff")
Two example use cases I have are:
1) List of strings the are used to set flags, this is done by calling a function with the string as the argument. (def process_element above)
2) A dictionary of dictionaries which are used to create a list of instances of a class. (MemberClass(member_info) above)
I tried creating properties to handle this but found that __setattr__ doesn't look for properties.
I know I could redefine __setattr__ to look for specific names but it is my understanding that this would slow down all set interactions with the class and I would prefer to avoid that.
I also know I could use a bunch of try/excepts to catch the errors but this would end up making the code very bulky.
I don't mind the load function being slowed down a bit for this but very much want to avoid anything that will slow down the class outside of loading.
So the solution that I came up with is to use the idea of changing the __setattr__ method but instead to handle the exceptions in the load function instead of the __setattr__.
def load(self):
data_dict = Collection.find_one({"name":self.name})
for key, attribute in world_data.items():
if key == "_id":
self.database_id = attribute
elif key == "element_list":
for element in attribute:
self.process_element(element)
elif key == "member_class_dict":
for member_name, member_info in attribute.items():
self.member_dict[member_name] = MemberClass(member_info)
else:
self.__setattr__(key,attribute)
This provides all of the functionality of overriding the __setattr__ method without slowing down any future calls to __setattr__ outside of loading the class.

Pyramid Hybrid application (traversal + url dispatch)

I've started using pyramid recently and I have some best-practice/general-concept questions about it's hybrid application approach. Let me try to mock a small but complex scenario for what I'm trying to do...
scenario
Say I'm building a website about movies, a place where users can go to add movies, like them and do other stuff with them. Here's a desired functionality:
/ renders templates/root.html
/movies renders templates/default.html, shows a list of movies
/movies/123456 shows details about a movie
/movies/123456/users shows list of users that edited this movie
/users renders templates/default.html, shows a list of users
/users/654321 shows details about a user
/users/654321/movies shows a list of favorite movies for this user
/reviews/movies renders templates/reviews.html, shows url_fetched movie reviews
/admin place where you can log in and change stuff
/admin/movies renders templates/admin.html, shows a list of editable movies
/admin/movies/123456 renders templates/form.html, edit this movie
extras:
ability to handle nested resources, for example movies/123/similar/234/similar/345, where similar is a property of movie class that lists ids of related movies. Or maybe more clear example would be: companies/123/partners/234/clients/345...
/movies/123456.json details about a movie in JSON format
/movies/123456.xml details about a movie in XML format
RESTFull methods (GET, PUT, DELETE..) with authorization headers for resource handling
approach
This is what I've done so far (my views are class based, for simplicity, I'll list just decorators...):
# resources.py
class Root(object):
__name__ = __parent__ = None
def __getitem__(self, name):
return None
def factory(request):
# pseudo code...
# checks for database model that's also a resource
if ModelResource:
return ModelResource()
return Root()
# main.py
def notfound(request):
return HTTPNotFound()
def wsgi_app():
config = Configurator()
config.add_translation_dirs('locale',)
config.add_renderer('.html', Jinja2Renderer)
# Static
config.add_static_view('static', 'static')
# Root
config.add_route('root', '/')
# Admin
config.add_route('admin', '/admin/*traverse', factory='factory')
config.add_route('default', '/{path}/*traverse', factory='factory')
config.add_notfound_view(notfound, append_slash=True)
config.scan('views')
return config.make_wsgi_app()
# views/root.py
#view_config(route_name='root', renderer="root.html")
def __call__():
# views/default.py
#view_defaults(route_name='default')
class DefaultView(BaseView):
#view_config(context=ModelResource, renderer="default.html")
def with_context():
#view_config()
def __call__():
# views/admin.py
#view_defaults(route_name='admin')
class AdminView(BaseView):
#view_config(context=ModelResource, renderer="admin.html")
def default(self):
#view_config(renderer="admin.html")
def __call__(self):
And following piece of code is from my real app. ModelResource is used as context for view lookup, and this is basically the reason for this post... Since all my models (I'm working with Google App Engine datastore) have same basic functionality they extend specific superclass. My first instinct was to add traversal functionality at this level, that's why I created ModelResource (additional explanation in code comments), but I'm begining to regret it :) So I'm looking for some insight and ideas how to handle this.
class ModelResource(ndb.Model):
def __getitem__(self, name):
module = next(module for module in application.modules if module.get('class') == self.__class__.__name__)
# this can be read as: if name == 'movies' or name == 'users' or .....
if name in module.get('nodes'):
# with this setup I can set application.modules[0]['nodes'] = [u'movies', u'films', u'фильми' ]
# and use any of the defined alternatives to find a resource
# return an empty instance of the resource which can be queried in view
return self
# my models use unique numerical IDs
elif re.match(r'\d+', name):
# return instance with supplied ID
cls = self.__class__
return cls.get_by_id(name)
# this is an attempt to solve nesting issue
# if this model has repeated property it should return the class
# of the child item(s)
# for now I'm just displaying the property's value...
elif hasattr(self, name):
# TODO: ndb.KeyProperty, ndb.StructuredProperty
return getattr(self, name)
# this is an option to find resource by slug instead of ID...
elif re.match(r'\w+', name):
cls = self.__class__
instance = cls.query(cls.slug == name).get()
if instance:
return instance
raise KeyError
questions
I have basically two big questions (widh some sub-questions):
How should I handle traversal in described scenario?
Should I extend some class, or implement an interface, or something else?
Should traversal be handled at model level (movie model, or it's superclass would have __getitem__ method)? Should resource be a model instance?
What's the best practice for __getitem__ that returns collection (list) of resources? In this scenario /movies/
My ModelResource in the code above is not location aware... I'm having trouble getting __public__ and __name__ to work, probably because this is the wrong way to do traversal :) I'm guessing once I find the right way, nested resources shouldn't be a problem.
How can I switch back from traversal to url dispatching?
For the extra stuff: How can I set a different renderer to handle JSON requests? Before I added traversal I configured config.add_route('json', '/{path:.*}.json') and it worked with appropriate view, but now this route is never matched...
Same problem as above - I added a view that handles HTTP methods #view_config(request_method='GET'), but it's never matched in this configuration.
Sorry for the long post, but it's fairly complex scenario. I should probably mention that real app has 10-15 models atm, so I'm looking for solution that would handle everything in one place - I want to avoid manually setting up Resource Tree...

Having trouble returning through multiple classes in Python

I'm still learning and like to build things that I will eventually be doing on a regular basis in the future, to give me a better understanding on how x does this or y does that.
I haven't learned much about how classes work entirely yet, but I set up a call that will go through multiple classes.
getattr(monster, monster_class.str().lower())(1)
Which calls this:
class monster:
def vampire(x):
monster_loot = {'Gold':75, 'Sword':50.3, 'Good Sword':40.5, 'Blood':100.0, 'Ore':.05}
if x == 1:
loot_table.all_loot(monster_loot)
Which in turn calls this...
class loot_table:
def all_loot(monster_loot):
loot = ['Gold', 'Sword', 'Good Sword', 'Ore']
loot_dropped = {}
for i in monster_loot:
if i in loot:
loot_dropped[i] = monster_loot[i]
drop_chance.chance(loot_dropped)
And then, finally, gets to the last class.
class drop_chance:
def chance(loot_list):
loot_gained = []
for i in loot_list:
x = random.uniform(0.0,100.0)
if loot_list[i] >= x:
loot_gained.append(i)
return loot_gained
And it all works, except it's not returning loot_gained. I'm assuming it's just being returned to the loot_table class and I have no idea how to bypass it all the way back down to the first line posted. Could I get some insight?
Keep using return.
def foo():
return bar()
def bar():
return baz()
def baz():
return 42
print foo()
I haven't learned much about how classes work entirely yet...
Rather informally, a class definition is a description of the object of that class (a.k.a. instance of the class) that is to be created in future. The class definition contains the code (definitions of the methods). The object (the class instance) basically contains the data. The method is a kind of function that can take arguments and that is capable to manipulate the object's data.
This way, classes should represent the behaviour of the real-world objects, the class instances simulate existence of the real-world objects. The methods represent actions that the object apply on themselves.
From that point of view, a class identifier should be a noun that describes category of objects of the class. A class instance identifier should also be a noun that names the object. A method identifier is usually a verb that describes the action.
In your case, at least the class drop_chance: is suspicious at least because of naming it this way.
If you want to print something reasonable about the object--say using the print(monster)--then define the __str__() method of the class -- see the doc.

Resources