Django - Haystack in two different apps - django-haystack

I am using Haystack in one app and it works perfectly: it indexes everything I need. Now I have created another app, with a different model and content, and I would like Haystack to index it as well. The idea is to have two different "search" links on my website, one for each app.
However, when I add the second index configuration, I run into a problem.
I created a new search_index.py (inside my new app) with the following content:
import datetime
from haystack.indexes import *
from haystack import site
from oportunity.models import Oportunity

class OportunityIndex(SearchIndex):
    title = CharField(document=True, use_template=True)
    body = CharField()
    date = DateTimeField()

    def index_queryset(self):
        return Oportunity.objects.filter(date=datetime.datetime.now())

site.register(Oportunity, OportunityIndex)
But when I run python manage.py rebuild_index, I get the following error:
line 94, in all_searchfields
    raise SearchFieldError("All SearchIndex fields with 'document=True' must use the same fieldname.")
haystack.exceptions.SearchFieldError: All SearchIndex fields with 'document=True' must use the same fieldname.

This is a known limitation of Haystack that has been discussed in a few different places: the underlying document store needs the document field to have the same name across all search indexes.
The Haystack docs state the recommended name for the document field. Bottom line: you can't define title = CharField(document=True) on one index and content = CharField(document=True) on another index; the field has to be named the same on both.
Best practice: name the document field text. This is what the Haystack docs recommend, and it gives you the most compatibility with third-party apps.
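To make the rule concrete, here is a minimal sketch in plain Python. It is not Haystack's actual code: each index is modelled as a hypothetical dict of field names to options, and the existing app's index is assumed to already use text as its document field. In the question's index, the fix amounts to renaming the document=True field from title to text and keeping title as an ordinary field.

```python
# Simplified stand-ins for index definitions: field name -> options.
# Assumption: the first app's index already uses "text" as its document field.
existing_index = {"text": {"document": True}, "author": {}}
oportunity_index = {"title": {"document": True}, "body": {}, "date": {}}


def document_field_names(*indexes):
    """Collect the names of all fields marked document=True."""
    return {
        name
        for fields in indexes
        for name, options in fields.items()
        if options.get("document")
    }


# Two different names are found, which is the situation the error reports.
conflicting = document_field_names(existing_index, oportunity_index)

# Renaming the document field to "text" (and keeping title as a regular
# field) makes both indexes agree on a single name.
fixed_index = {"text": {"document": True}, "title": {}, "body": {}, "date": {}}
consistent = document_field_names(existing_index, fixed_index)
```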

Related

django remove m2m instance when there are no more relations

Suppose we have the models:
class Publication(models.Model):
    title = models.CharField(max_length=30)

class Article(models.Model):
    headline = models.CharField(max_length=100)
    publications = models.ManyToManyField(Publication)
According to: https://docs.djangoproject.com/en/4.0/topics/db/examples/many_to_many/, to create an object we must have both objects saved before we can create the relation:
p1 = Publication(title='The Python Journal')
p1.save()
a1 = Article(headline='Django lets you build web apps easily')
a1.save()
a1.publications.add(p1)
Now, if we call delete on either of those objects, the object is removed from the DB along with the relation between the two. Up to this point I understand.
But is there a way to make it so that, when an Article is removed, all the Publications that are no longer related to any Article are deleted from the DB too? Or is the only way to query the article's publications first and iterate through them, like this:
to_delete = []
qset = a1.publications.all()
for publication in qset:
    # a count of 1 means a1 is the only article using this publication
    if publication.article_set.count() == 1:
        to_delete.append(publication.id)
a1.delete()
Publication.objects.filter(id__in=to_delete).delete()
But this has lots of problems, especially a concurrency one, since a publication might get used by another article between the call to .count() and the final .delete().
Is there any way of doing this automatically, like doing a "conditional" on_delete=models.CASCADE when creating the model or something?
Thanks!
I tried @Ersain's answer:
a1.publications.annotate(article_count=Count('article_set')).filter(article_count=1).delete()
I couldn't make it work. First of all, the relationship has no article_set field that can be used in a query:
django.core.exceptions.FieldError: Cannot resolve keyword 'article_set' into field. Choices are: article, id, title
Then, running the count filter on the QuerySet after filtering by article returned ALL the publications from the article, instead of just the ones with article_count=1. So this is the code that finally worked for me:
Publication.objects.annotate(article_count=Count('article')).filter(article_count=1).filter(article=a1).delete()
I'm definitely not an expert, and I'm not sure whether this is the best approach or whether it is expensive in time, so I'm open to suggestions. But as of now it's the only solution I have found that performs this operation atomically.
You can remove the related objects using this query:
a1.publications.annotate(article_count=Count('article_set')).filter(article_count=1).delete()
annotate adds a temporary (alias) field to each Publication in the queryset; here it uses the Count function to aggregate the number of related Article objects. Count is a standard SQL aggregate function that returns the number of rows matched by a query (the number of related instances in this case). We then keep only the results where article_count equals 1 and delete them.
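To show what "count first, then restrict to one article" means at the SQL level, here is a minimal sketch using the standard-library sqlite3 module instead of the Django ORM. The table and column names are hypothetical stand-ins for what Django would generate, and the helper delete_article_and_orphans is made up for this illustration.

```python
import sqlite3


def delete_article_and_orphans(conn, article_id):
    """Delete an article plus every publication used by that article only."""
    with conn:  # one transaction: commit on success, roll back on error
        conn.execute(
            """
            DELETE FROM publication
            WHERE id IN (SELECT publication_id FROM article_publications
                         WHERE article_id = ?)
              AND id IN (SELECT publication_id FROM article_publications
                         GROUP BY publication_id
                         HAVING COUNT(article_id) = 1)
            """,
            (article_id,),
        )
        conn.execute(
            "DELETE FROM article_publications WHERE article_id = ?", (article_id,)
        )
        conn.execute("DELETE FROM article WHERE id = ?", (article_id,))


conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE publication (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE article (id INTEGER PRIMARY KEY, headline TEXT);
    CREATE TABLE article_publications (article_id INTEGER, publication_id INTEGER);
    INSERT INTO publication VALUES (1, 'Only A1'), (2, 'Shared'), (3, 'Only A2');
    INSERT INTO article VALUES (1, 'A1'), (2, 'A2');
    INSERT INTO article_publications VALUES (1, 1), (1, 2), (2, 2), (2, 3);
    """
)

delete_article_and_orphans(conn, 1)
remaining = [row[0] for row in conn.execute("SELECT title FROM publication ORDER BY id")]
```

The count is computed over all link rows, independently of the article being deleted, and the restriction to that article is applied as a separate condition. This mirrors why the annotation has to come before the article filter in the ORM version.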

Google Cloud Python Lib - Get Entity By ID or Key

I've been working on a python3 script that is given an Entity Id as a command line argument. I need to create a query or some other way to retrieve the entire entity based off this id.
Here are some things I've tried (self.entityId is the id provided on the commandline):
entityKey = self.datastore_client.key('Asdf', self.entityId, namespace='Asdf')
query = self.datastore_client.query(namespace='asdf', kind='Asdf')
query.key_filter(entityKey)
query_iter = query.fetch()
for entity in query_iter:
    print(entity)
Instead of query.key_filter(), I have also tried:
query.add_filter('id', '=', self.entityId)
query.add_filter('__key__', '=', entityKey)
query.add_filter('key', '=', entityKey)
So far, none of these have worked. However, a generic non-filtered query does return all the Entities in the specified namespace. I have been consulting the documentation at: https://googleapis.dev/python/datastore/latest/queries.html and other similar pages of the same documentation.
A simpler approach is to fetch the entity directly by key, i.e. self.datastore_client.get(self.datastore_client.key('Asdf', self.entityId, namespace='asdf'))
However, given that you are casting both entity.key.id and self.entityId to int, you'll want to check your data to see whether you are using key names or ids. Alternatives to the above are:
If you are using key ids but self.entityId is a string: self.datastore_client.get(self.datastore_client.key('Asdf', int(self.entityId), namespace='asdf'))
If you are using key names but self.entityId is an int: self.datastore_client.get(self.datastore_client.key('Asdf', str(self.entityId), namespace='asdf'))
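Since a command-line argument always arrives as a string, one way to handle the id-versus-name distinction is to normalise the value before building the key. The helper below is hypothetical (id_or_name is not part of the Datastore library) and rests on the assumption that all-digit values are numeric ids rather than key names; if your key names can consist of digits only, this heuristic does not apply.

```python
def id_or_name(raw):
    """Return an int for all-digit input, otherwise the stripped string.

    Assumption: numeric-looking values are key ids, not key names.
    """
    text = str(raw).strip()
    return int(text) if text.isdecimal() else text


numeric = id_or_name("5629499534213120")
named = id_or_name("my-entity")

# With a real client, the result would be used as the second key argument:
# client.get(client.key('Asdf', id_or_name(sys.argv[1]), namespace='asdf'))
```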
I've fixed this problem myself. Because I could not get any filter approach to work, I ended up doing a query for all Entities in the namespace, and then did a conditional check on entity.key.id, and comparing it to the id passed on the commandline.
query = self.datastore_client.query(namespace='asdf', kind='Asdf')
query_iter = query.fetch()
for entity in query_iter:
    if int(entity.key.id) == int(self.entityId):
        # do some stuff with the entity data
        pass
It is actually very easy to do, although not so clear from the docs.
Here's the working example:
>>> key = client.key('EntityKind', 1234)
>>> client.get(key)
<Entity('EntityKind', 1234) {'property': 'value'}>

Flask sqlalchemy updating multiple fields in a row

I recently moved to Flask from expressjs.
I am creating a Flask app using flask, flask-sqlalchemy and flask-wtf.
It is a form-heavy application. I expect to have about 30-50 forms, with each form having 20-100 fields.
Client-side forms use flask-wtf.
I am able to create models and CRUD functionality. The problem is that with each form I have to manually do the following:
IN CREATE
[...]
# after validation
someItem = SomeModel(someField1=form.someField1.data, ..., somefieldN = form.someFieldN.data)
db.session.add(someItem)
db.session.commit()
IN UPDATE
[....]
queryItem = SomeModel.query.filter_by(id=item_id).first()
queryItem.somefield1 = form.someField1.data
[...]
queryItem.somefieldN = form.someFieldN.data
db.session.commit()
As apparent, with lots of forms, it gets very tedious. Is there a way to
If you are able to suggest a library that will do this
I have searched online for the last few hours. The closest I got to was to create a dictionary and then pass it like
someDict = {'someField1': form.someField1.data, ....}
SomeModel.query.filter_by(id=item.id).update(someDict)
As you can see it is equally tedious
I am hoping to find a way to pass the form data directly to SomeModel for creating as well as updating.
I previously used expressjs + knex and I was simply able to pass req.body after validation, to knex.
Thanks for your time
Use populate_obj (note: model field names must match form field names).
Create record:
someItem = SomeModel()
form.populate_obj(someItem)
db.session.add(someItem)
db.session.commit()
Update record:
queryItem = SomeModel.query.filter_by(id=item_id).first()  # fetch the instance, not the Query
form.populate_obj(queryItem)
db.session.commit()
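To illustrate why the field names have to match, here is a minimal sketch of the idea behind populate_obj, written with plain Python stand-ins rather than the real Flask-WTF and SQLAlchemy classes. Field, SomeForm and SomeModel below are hypothetical and exist only for this illustration; the point is that each field's data is copied onto the attribute with the same name.

```python
class Field:
    """Hypothetical stand-in for a form field holding submitted data."""

    def __init__(self, data):
        self.data = data


class SomeForm:
    """Hypothetical stand-in for a validated form."""

    def __init__(self, **values):
        self.fields = {name: Field(value) for name, value in values.items()}

    def populate_obj(self, obj):
        # Copy every field's data onto the attribute of the same name.
        for name, field in self.fields.items():
            setattr(obj, name, field.data)


class SomeModel:
    """Hypothetical stand-in for a model instance."""

    someField1 = None
    someField2 = None


form = SomeForm(someField1="alpha", someField2=42)
item = SomeModel()
form.populate_obj(item)
```

Because the copy is driven by the form's field names, the same call serves both a newly created instance and one loaded from the database.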

Google Cloud Datastore Cursor with google.cloud.ndb

I am working with Google Cloud Datastore using the latest google.cloud.ndb library.
I am trying to implement pagination using a Cursor with the following code.
It is not fetching the data correctly.
[1] To Fetch Data:
query_01 = MyModel.query()
f = query_01.fetch_page_async(limit=5)
This code works fine and fetches 5 entities from MyModel
I want to implement pagination that can be integrated with a web frontend.
[2] To Fetch Next Set of Data
from google.cloud.ndb._datastore_query import Cursor
nextpage_value = "2"
nextcursor = Cursor(cursor=nextpage_value.encode()) # Converts to bytes
query_01 = MyModel.query()
f = query_01.fetch_page_async(limit=5, start_cursor= nextcursor)
[3] To Fetch Previous Set of Data
previouspage_value = "1"
prevcursor = Cursor(cursor=previouspage_value.encode())
query_01 = MyModel.query()
f = query_01.fetch_page_async(limit=5, start_cursor=prevcursor)
Code blocks [2] and [3] do not fetch paginated data; they return the same results as block [1].
Please note that I'm working with Python 3 and using the latest "google.cloud.ndb" client library to interact with Datastore.
I have referred to the following link https://github.com/googleapis/python-ndb
I am new to Google Cloud, and appreciate all the help I can get.
Firstly, it seems you are expecting the wrong kind of pagination. You are trying to use numeric page values, whereas Datastore provides cursor-based pagination.
Instead of byte-encoded integer values (like 1 or 2), Datastore expects tokens that look similar to this: 'CjsSNWoIb3Z5LXRlc3RyKQsSBFVzZXIYgICAgICAgAoMCxIIQ3ljbGVEYXkiCjIwMjAtMTAtMTYMGAAgAA=='
You can obtain such a cursor from the first call to the fetch_page() method, which returns a tuple:
(results, cursor, more) where results is a list of query results, cursor is a cursor pointing just after the last result returned, and more indicates whether there are (likely) more results after that
Secondly, it is simpler to use fetch_page() instead of fetch_page_async(): the async variant returns a future rather than the (results, cursor, more) tuple itself, so the cursor you need for pagination is not directly available from its return value. Internally, fetch_page() calls fetch_page_async() to get your query results.
Thirdly and lastly, I am not entirely sure whether the "previous page" use case is doable with the pagination Datastore provides. You may need to implement it yourself by storing some of the cursors.
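The flow described above, including the idea of storing cursors to support a "previous page" link, can be sketched without Datastore at all. The fetch_page function below is a simulation that only mimics the (results, cursor, more) shape; it is not the ndb API, and its cursors simply encode an offset, which real Datastore cursors do not.

```python
import base64

ITEMS = [f"entity-{i}" for i in range(12)]  # stand-in for stored entities


def fetch_page(limit, start_cursor=None):
    """Simulated page fetch returning (results, cursor, more)."""
    if start_cursor:
        offset = int(base64.urlsafe_b64decode(start_cursor).decode())
    else:
        offset = 0
    results = ITEMS[offset:offset + limit]
    end = offset + len(results)
    # Opaque token pointing just after the last result returned.
    cursor = base64.urlsafe_b64encode(str(end).encode()).decode()
    more = end < len(ITEMS)
    return results, cursor, more


# Walk forward, remembering the cursor that starts each page so that a
# "previous" link can be served by replaying a stored cursor.
page_starts = [None]
results, cursor, more = fetch_page(5)
while more:
    page_starts.append(cursor)
    results, cursor, more = fetch_page(5, start_cursor=cursor)

# Going back to page 2 means fetching again from its stored start cursor.
second_page, _, _ = fetch_page(5, start_cursor=page_starts[1])
```

In a web frontend, the token for the next page would typically travel in the URL or the response body, while the list of earlier start cursors could be kept in the session or on the client.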
I hope that helps and good luck!

Avoid having 2 identical entries in SQlAlchemy using Flask

I need to find a way to alert the user that what they are entering already exists in the database. I have a Flask application with a SQLAlchemy database, and I'm also using Flask-WTF.
I tried a very precarious solution: I stored the data captured by the forms in variables, and I was thinking of concatenating them and using a query to check whether they exist.
nombre1 = form.nombre_primero.data
nombre2 = form.nombre_segundo.data
Anyway, I think this is not the most appropriate way to handle the situation.
Does Flask have some way to do this? Or would you recommend something?
I'd be grateful if you could help me!
I would approach this by creating a composite unique constraint made of the selected fields in the SQLAlchemy model.
The table can be configured additionally via the __table_args__ class property of the declarative base.
from app import db
from sqlalchemy import UniqueConstraint

class Role(db.Model):
    id = db.Column(db.Integer, primary_key=True)
    nombre_primero = db.Column(db.String(64))
    nombre_segundo = db.Column(db.String(64))
    __table_args__ = (
        UniqueConstraint('nombre_primero', 'nombre_segundo', name='uix_1'),
    )
You can then write the data to the table and handle the exception that is raised when there is a conflict.
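The "insert and catch the conflict" pattern can be sketched with the standard-library sqlite3 module, so it runs without Flask or SQLAlchemy. The add_role helper is hypothetical. With SQLAlchemy, the analogous exception is sqlalchemy.exc.IntegrityError, and the session needs a rollback after it is raised.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE role (
        id INTEGER PRIMARY KEY,
        nombre_primero TEXT,
        nombre_segundo TEXT,
        CONSTRAINT uix_1 UNIQUE (nombre_primero, nombre_segundo)
    )
    """
)


def add_role(nombre_primero, nombre_segundo):
    """Insert a row; return False when the pair already exists."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO role (nombre_primero, nombre_segundo) VALUES (?, ?)",
                (nombre_primero, nombre_segundo),
            )
        return True
    except sqlite3.IntegrityError:
        # The database rejected the duplicate; the caller can alert the user.
        return False


first = add_role("Ana", "Maria")
duplicate = add_role("Ana", "Maria")
other = add_role("Ana", "Lucia")
```

Letting the database enforce uniqueness avoids the race that a separate "check, then insert" query would have when two requests arrive at the same time.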
Okay, so there is a simple way to solve this: at the table itself, you add a check that rejects duplicate entries based on some condition which you define.
One easy way to do this is to write a hybrid function.
Read more about Hybrid Attributes in the SQLAlchemy documentation.
from sqlalchemy.ext.hybrid import hybrid_property
Now, where you define the model for your table, e.g.:
class xyz(db.Model):
    __tablename__ = 'xyz'
    # table values defined here

    @hybrid_property
    def xyz(self):
        # make a method here which rejects duplicate entries
        pass
Once you read the documentation you will understand how this works.
I can't directly solve your problem because you haven't provided much information. But this way you can check the entries and easily write a method that checks your data for uniqueness in any way you want.

Resources