Mongoengine check if object exists without fetching the object - mongoengine

On Django we can use the QuerySet.exists() to check if an object exists in the DB in the most efficient way, without actually fetching the record.
Is there an equivalent function on Mongoengine?

According to official documentation, here is solution how to make this if you have object id.Its the best solution for your case, that i have seen in mongoengine documentation.It works as below:
# Returns None or Object if it exists
result = Collection.objects.with_id(object_id=*your object id*)
if result is None:
# raise error
else:
# make some actions with this object
Is this is what are you looking for?

Related

How to retrieve nested output from XCom using taskflow syntax in Airflow

Well, I know this seems to be possible I just don't know how. To begin with, I am using traditional operators (without #task decorator) but I am interested in XComArgs return output format from these operators that can be used in downstream tasks. Below is a sample example
task_1 = DummyOperator(
task_id = 'task_1'
) # returns {"data": {"foo" : [{"cmd": "ls"}]}}
task_2 = BashOperator(
task_2='task_2',
cmd=task_1.output['return_value']['data']['foo'][0]['cmd'] # does not give what I need and returns null.
#cmd = f"{{ ti.xcom_pull(task_ids = 'task_1', key='return_value')['data']['foo'][0]['cmd'] }}" Gives what I need
)
In this example what is working for me which is pure Jinja templating and the new syntax does not work for me using XComArgs. I have tried changing the argument render_template_as_native_obj=True in Dag configuration but does not change anything. I want to use .output format which returns XcomArgs object and is returning the complete dict but have not been able to use the nested keys like above. Also, have tried converting string to JSON and all those combinations but does not seem to work.
Unfortunately, retrieving nested values from XComArgs in a limitation of the TaskFlow API.
The TaskFlow API uses __getitem__ to override the XCom key to use. In your example, the key ends up being "cmd" rather than the value of what cmd represents in that nested object. You'll have to use the original ti.xcom_pull() method until that limitation is addressed.

How Can I Check For The Existence of a UUID

I am trying to check for the existence of a UUID as a primary key in my Django environment...and when it exists...my code works fine...But if it's not present I get a "" is not a Valid UUID...
Here's my code....
uuid_exists = Book.objects.filter(id=self.object.author_pk,is_active="True").first()
I've tried other variations of this with .exists() or .all()...but I keep getting the ['“” is not a valid UUID.'] error.
I did come up with a workaround....
if self.object.author_pk is not '':
book_exists = Book.objects.filter(id=self.object.author_pk,is_active="True").first()
context['author_exists'] = author_exists
Is this the best way to do this? I was hoping to be able to use a straight filter...without clarifying logic....But I've worked all afternoon and can't seem to come up with anything better. Thanks in advance for any feedback or comments.
I've had the same issue and this is what I have:
Wrapping it into try/except (in my case it's a View so it's supposed to return a Response object)
try:
object = Object.objects.get(id=object_id)
except Exception as e:
return Response(data={...}, status=status.HTTP_40...
It gets to the exception (4th line) but somehow sends '~your_id~' is not a valid UUID. text instead of proper data. Which might be enough in some cases.
This seems like an overlook, so might as well get a fix soon. I don't have enough time to investigate deeper, unfortunately.
So the solution I came up with is not ideal either but hopefully is a bit cleaner and faster than what you're using rn.
# Generate a list of possible object IDs (make use of filters in order to reduce the DB load)
possible_ids = [str(id) for id in Object.objects.filter(~ filters here ~).values_list('id', flat=True)]
# Return an error if ID is not valid
if ~your_id~ not in possible_ids:
return Response(data={"error": "Database destruction sequence initialized!"}, status=status.HTTP_401_UNAUTHORIZED)
# Keep working with the object
object = Objects.objects.get(id=object_id)

Google Cloud Python Lib - Get Entity By ID or Key

I've been working on a python3 script that is given an Entity Id as a command line argument. I need to create a query or some other way to retrieve the entire entity based off this id.
Here are some things I've tried (self.entityId is the id provided on the commandline):
entityKey = self.datastore_client.key('Asdf', self.entityId, namespace='Asdf')
query = self.datastore_client.query(namespace='asdf', kind='Asdf')
query.key_filter(entityKey)
query_iter = query.fetch()
for entity in query_iter:
print(entity)
Instead of query.key_filter(), i have also tried:
query.add_filter('id', '=', self.entityId)
query.add_filter('__key__', '=', entityKey)
query.add_filter('key', '=', entityKey)
So far, none of these have worked. However, a generic non-filtered query does return all the Entities in the specified namespace. I have been consulting the documentation at: https://googleapis.dev/python/datastore/latest/queries.html and other similar pages of the same documentation.
A simpler answer is to simply fetch the entity. I.e. self.datastore_client.get(self.datastore_client.key('Asdf', self.entityId, namespace='asdf'))
However, given that you are casting both entity.key.id and self.entityId, you'll want to check your data to see if you are key names or ids. Alternatives to the above are:
You are using key ids, but self.entityid is a string self.datastore_client.get(self.datastore_client.key('Asdf', int(self.entityId), namespace='asdf'))
You are using key names, and entityId is an int self.datastore_client.get(self.datastore_client.key('Asdf', str(self.entityId), namespace='asdf'))
I've fixed this problem myself. Because I could not get any filter approach to work, I ended up doing a query for all Entities in the namespace, and then did a conditional check on entity.key.id, and comparing it to the id passed on the commandline.
query = self.datastore_client.query(namespace='asdf', kind='Asdf')
query_iter = query.fetch()
for entity in query_iter:
if (int(entity.key.id) == int(self.entityId)):
#do some stuff with the entity data
It is actually very easy to do, although not so clear from the docs.
Here's the working example:
>>> key = client.key('EntityKind', 1234)
>>> client.get(key)
<Entity('EntityKind', 1234) {'property': 'value'}>

Get the updated document after update_one

What parameter needs to pass for the MongoEngine update_one() to get the updated document in return? Currently, it is returning 0 or 1.
Instead of using update_one() I used the method modify(). Pass the parameter new = True to it to get the updated result
According to the mongoengine documentation there doesn't seem to be anything you can pass to update_one to return the updated document. I've tried this myself in a test environment and I'm also getting either a 0 or 1 returned.
There is a parameter you can pass to update called full_result which gives you a bit more information but it's nothing that would help you in this case.

Get username from local session Telethon

I'm using telethon library to crawl some telegram channels. While crawling, i need to resolve many join links, usernames and channel ids. To resolve these items, i used method client.get_entity() but after a while telegram servers banned my crawler for resolving too many usernames. I searched around and found from this issue, i should use get_input_entity() instead of get_entity(). Actually telethon saves entities inside a local SQLite file and whenever a call to get_input_entity() is made, it first searches the local SQLite database, if no match found it then sends request to telegram servers. So far so good but i have two problems with this approach:
get_input_entity() just returns two attributes: ID and hash but there are other columns like username, phone and name in the SQLite database. I need a method to not just return ID and hash, but to return other columns too.
I need to control the number of resolve requests sent to telegram server but get_input_entity() sends request to telegram servers whenever founds no match in the local database. The problem is that i can't control this method when to request telegram servers. Actually i need a boolean argument for this method indicating whether or not the method should send a request to telegram servers when no match in the local database is found.
I read some of the telethon source codes, mainly get_input_entity() and wrote my own version of get_input_entity():
def my_own_get_input_entity(self, target, with_info: bool = False):
if self._client:
if target in ('me', 'self'):
return types.InputPeerSelf()
def get_info():
nonlocal self, result
res_id = 0
if isinstance(result, InputPeerChannel):
res_id = result.channel_id
elif isinstance(result, InputPeerChat):
res_id = result.chat_id
elif isinstance(result, InputPeerUser):
res_id = result.user_id
return self._sqlite_session._execute(
'select username, name from entities where id = ?', res_id
)
try:
result = self._client.session.get_input_entity(target)
info = get_info() if with_info else None
return result, info
except ValueError:
record_current_time()
try:
# when we are here, we are actually going to
# send request to telegram servers
if not check_if_appropriate_time_elapsed_from_last_telegram_request():
return None
result = self._client.get_input_entity(target)
info = get_info() if with_info else None
return result, info
except ChannelPrivateError:
pass
except ValueError:
pass
except Exception:
pass
But my code is somehow performance problematic because it makes redundant queries to SQLite database. For example, if the target is actually an entity inside the local database and with_info is True, it first queries the local database in line self._client.session.get_input_entity(target) and checks if with_info is True, then queries the database again to get username and name columns. In another situation, if target is not found inside the local database, calling self._client.get_input_entity(target) makes a redundant call to local database.
Knowing these performance issues, i delved deeper in telethon source codes but as i don't know much about asyncio, i couldn't write any better code than above.
Any ideas how to solve the problems?
client.session.get_input_entity will make no API call (it can't), and fails if there is no match in the local database, which is probably the behaviour you want.
You can, for now, access the client.session._conn private attribute. It's a sqlite3.Connection object so you can use that to make all the queries you want. Note that this is prone to breaking since you're accessing a private member although no changes are expected soon. Ideally, you should subclass the session file to suit your needs. See Session Files in the documentation.

Resources