I'm using the Telethon library to crawl some Telegram channels. While crawling, I need to resolve many join links, usernames and channel IDs. To resolve these, I used client.get_entity(), but after a while the Telegram servers banned my crawler for resolving too many usernames. I searched around and found from this issue that I should use get_input_entity() instead of get_entity(). Telethon saves entities in a local SQLite file, and whenever get_input_entity() is called it first searches the local SQLite database; only if no match is found does it send a request to the Telegram servers. So far so good, but I have two problems with this approach:
get_input_entity() returns only two attributes, ID and hash, but the SQLite database has other columns such as username, phone and name. I need a method that returns those columns too, not just ID and hash.
I need to control the number of resolve requests sent to the Telegram servers, but get_input_entity() sends a request whenever it finds no match in the local database, and I can't control when that happens. What I need is a boolean argument indicating whether the method may send a request to the Telegram servers when no match is found locally.
I read some of the Telethon source code, mainly get_input_entity(), and wrote my own version of it:
def my_own_get_input_entity(self, target, with_info: bool = False):
    if self._client:
        if target in ('me', 'self'):
            return types.InputPeerSelf()

        def get_info():
            nonlocal self, result
            res_id = 0
            if isinstance(result, InputPeerChannel):
                res_id = result.channel_id
            elif isinstance(result, InputPeerChat):
                res_id = result.chat_id
            elif isinstance(result, InputPeerUser):
                res_id = result.user_id
            return self._sqlite_session._execute(
                'select username, name from entities where id = ?', res_id
            )

        try:
            result = self._client.session.get_input_entity(target)
            info = get_info() if with_info else None
            return result, info
        except ValueError:
            record_current_time()
            try:
                # when we are here, we are actually going to
                # send request to telegram servers
                if not check_if_appropriate_time_elapsed_from_last_telegram_request():
                    return None
                result = self._client.get_input_entity(target)
                info = get_info() if with_info else None
                return result, info
            except ChannelPrivateError:
                pass
            except ValueError:
                pass
            except Exception:
                pass
But my code has a performance problem because it makes redundant queries to the SQLite database. For example, if the target is already in the local database and with_info is True, it first queries the local database in the line self._client.session.get_input_entity(target), and then queries it again to get the username and name columns. In the other case, if the target is not found in the local database, calling self._client.get_input_entity(target) makes a second, redundant lookup in the local database before going to the network.
Knowing these performance issues, I dug deeper into the Telethon source code, but as I don't know much about asyncio, I couldn't write anything better than the above.
Any ideas on how to solve these problems?
client.session.get_input_entity will make no API call (it can't), and fails if there is no match in the local database, which is probably the behaviour you want.
You can, for now, access the client.session._conn private attribute. It's a sqlite3.Connection object so you can use that to make all the queries you want. Note that this is prone to breaking since you're accessing a private member although no changes are expected soon. Ideally, you should subclass the session file to suit your needs. See Session Files in the documentation.
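As a rough sketch of the local-only lookup described above, the following reads one row from an entities table through a plain sqlite3.Connection. The column list (id, hash, username, phone, name) is taken from the columns mentioned in the question and is an assumption about the session file layout; the function name lookup_entity_row and the in-memory demo database are hypothetical stand-ins for client.session._conn.

```python
import sqlite3
from typing import Optional


def lookup_entity_row(conn: sqlite3.Connection, entity_id: int) -> Optional[dict]:
    """Return the cached row for entity_id, or None. Never touches the network."""
    cur = conn.execute(
        'select id, hash, username, phone, name from entities where id = ?',
        (entity_id,),
    )
    row = cur.fetchone()
    if row is None:
        return None
    keys = ('id', 'hash', 'username', 'phone', 'name')
    return dict(zip(keys, row))


# Demo against an in-memory database using the assumed schema.
demo = sqlite3.connect(':memory:')
demo.execute(
    'create table entities (id integer primary key, hash integer not null, '
    'username text, phone integer, name text)'
)
demo.execute(
    'insert into entities values (?, ?, ?, ?, ?)',
    (1151511560, 42, 'ekat01', None, 'Example User'),
)

found = lookup_entity_row(demo, 1151511560)
missing = lookup_entity_row(demo, 999)
```

With a real client the call would presumably look like lookup_entity_row(client.session._conn, some_id). One query returns the ID, hash and the extra columns together, and a None result lets the caller decide whether a network request is worth making.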
I am trying to check for the existence of a UUID as a primary key in my Django environment. When it exists, my code works fine, but if it's not present I get a "" is not a valid UUID error.
Here's my code:
uuid_exists = Book.objects.filter(id=self.object.author_pk,is_active="True").first()
I've tried other variations of this with .exists() or .all(), but I keep getting the ['“” is not a valid UUID.'] error.
I did come up with a workaround:
if self.object.author_pk != '':
    author_exists = Book.objects.filter(id=self.object.author_pk,is_active="True").first()
    context['author_exists'] = author_exists
Is this the best way to do this? I was hoping to be able to use a straight filter without the extra guarding logic, but I've worked on it all afternoon and can't seem to come up with anything better. Thanks in advance for any feedback or comments.
I've had the same issue and this is what I have:
Wrapping it in try/except (in my case it's a view, so it's supposed to return a Response object):
try:
    object = Object.objects.get(id=object_id)
except Exception as e:
    return Response(data={...}, status=status.HTTP_40...
It reaches the except branch (4th line) but somehow sends the text '~your_id~' is not a valid UUID. instead of the proper data, which might be enough in some cases.
This seems like an oversight, so it might get a fix soon. I don't have enough time to investigate deeper, unfortunately.
So the solution I came up with is not ideal either, but hopefully it is a bit cleaner and faster than what you're using right now.
# Generate a list of possible object IDs (make use of filters in order to reduce the DB load)
possible_ids = [str(id) for id in Object.objects.filter(~ filters here ~).values_list('id', flat=True)]
# Return an error if ID is not valid
if ~your_id~ not in possible_ids:
    return Response(data={"error": "Database destruction sequence initialized!"}, status=status.HTTP_401_UNAUTHORIZED)
# Keep working with the object
object = Object.objects.get(id=object_id)
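Another option, sketched here only as an illustration, is to check the value with Python's standard uuid module before it ever reaches the ORM, so an empty or malformed string never hits the filter. The helper name parse_uuid is hypothetical.

```python
import uuid
from typing import Optional


def parse_uuid(value) -> Optional[uuid.UUID]:
    """Return a UUID for valid input, or None for '', None or malformed text."""
    if isinstance(value, uuid.UUID):
        return value
    try:
        return uuid.UUID(str(value))
    except ValueError:
        # uuid.UUID raises ValueError for badly formed strings, including ''.
        return None


valid = parse_uuid('12345678-1234-5678-1234-567812345678')
empty = parse_uuid('')
garbage = parse_uuid('not-a-uuid')
```

The lookup would then run only when parse_uuid(...) returns something other than None, and the "does not exist" branch can be taken directly otherwise.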
I'm trying to create a serializer with DRF that can validate whether a user has access to a PrimaryKeyRelatedField entry.
I have a separate function which returns a queryset of the files the user can access. All it needs as a parameter is the request object. I'd like to use this function as the queryset kwarg for the PrimaryKeyRelatedField.
However, I can't find a way to access "self" at class-definition level, so there doesn't seem to be a way to define a queryset that depends on the current user for a serializer.
This is my current attempt, which fails because self is not available when _request(self) is called.
class MySerializer(serializers.Serializer):
    def _request(self):
        request = getattr(self.context, 'request', None)
        if request:
            return request

    files = serializers.PrimaryKeyRelatedField(many=True, required=True, queryset=get_user_files(_request(self)))
I want to validate that the user has access to the file(s) they are referencing in the request. How would I do this?
I ended up settling on a slightly less clean answer than I'd have liked:
class MySerializer(serializers.Serializer):
    files = serializers.PrimaryKeyRelatedField(many=True, required=True, queryset=ScanFile.objects.all())

    def validate_files(self, value):
        request = self.context.get('request')
        queryset = get_user_files(request)
        for file in value:
            if not queryset.filter(pk=file.id).exists():
                raise ValidationError({'Invalid file': file})
        return value
This seems a bit inefficient, as it ends up querying for each file twice, but it achieves the effect that users can only reference files they specifically have access to.
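The per-file query could probably be collapsed into a single one by fetching the allowed primary keys once and comparing sets. The sketch below shows only the comparison step in plain Python; the helper name inaccessible_ids is hypothetical, and it assumes get_user_files returns a Django queryset, so the allowed IDs could come from something like queryset.filter(pk__in=requested).values_list('pk', flat=True).

```python
from typing import Iterable, Set


def inaccessible_ids(requested: Iterable[int], allowed: Iterable[int]) -> Set[int]:
    """Return the requested IDs that are absent from the allowed IDs."""
    return set(requested) - set(allowed)


# Example: the user asks for files 1, 2 and 3 but may only access 1 and 3.
denied = inaccessible_ids([1, 2, 3], [1, 3])
```

Inside validate_files, a non-empty result would be the cue to raise ValidationError, listing the offending IDs in one message.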
I am building a Telegram bot where I want the user to fill in details about an event, which I then store in a dictionary that is itself kept in a list.
However, I want it to be like a conversation. I want it to look like this:
user: /create
bot-reply: What would you like to call it?
user-reply: Chris' birth day
bot-reply: When is it?
user-reply: 08/11/2021
bot-reply: Event Chris birth day on 08/11/2021 has been saved!
To achieve this I plan to use ForceReply, whose documentation states:
This can be extremely useful if you want to create user-friendly step-by-step interfaces without having to sacrifice privacy mode.
The problem is the documentation does not seem to explain how to handle responses.
Currently my code looks like this:
@app.on_message(filters.command('create'))
async def create_countdown(client, message):
    global countdowns
    countdown = {
        'countdown_id': str(uuid4())[:8],
        'countdown_owner_id': message.from_user.id,
        'countdown_owner_username': message.from_user.username,
    }
    try:
        await message.reply('What do you want to name the countdown?',
                            reply_markup=ForceReply()
        )
    except FloodWait as e:
        await asyncio.sleep(e.x)
Looking through the forum I have found options like this:
python telegram bot ForceReply callback
which are exactly what I am looking for, but they use different libraries such as python-telegram-bot, which lets them use ConversationHandler. That does not seem to be part of Pyrogram.
How do I create user-friendly step-by-step interfaces with Pyrogram?
Pyrogram doesn't have a ConversationHandler.
You could use a dict with your users' ID as the key and the state they're in as the value, then you can use that dictionary of states as your reference to know where your User is in the conversation.
Dan: (Pyrogram creator)
A conversation-like feature is not available yet in the lib. One way to do that is saving states into a dictionary using user IDs as keys. Check the dictionary before taking actions so that you know in which step your users are and update it once they successfully go past one action
https://t.me/pyrogramchat/213488
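A minimal sketch of that dictionary-of-states idea, written without any Pyrogram calls so the flow is easy to follow. The names states and handle_message and the step labels are hypothetical; the replies mirror the conversation in the question.

```python
from typing import Dict

# Per-user conversation state, keyed by Telegram user ID.
states: Dict[int, dict] = {}


def handle_message(user_id: int, text: str) -> str:
    """Advance the conversation for one user and return the bot's reply."""
    state = states.get(user_id)
    if text == '/create':
        # (Re)start the conversation at the first step.
        states[user_id] = {'step': 'name'}
        return 'What would you like to call it?'
    if state is None:
        return 'Send /create to start.'
    if state['step'] == 'name':
        state['name'] = text
        state['step'] = 'date'
        return 'When is it?'
    # Final step: the text is the date, so the conversation is over.
    del states[user_id]
    return f"Event {state['name']} on {text} has been saved!"
```

In a real handler one would presumably call handle_message(message.from_user.id, message.text) and send the returned string back with message.reply. Because the state lives in an in-memory dict, it is lost when the bot restarts.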
I can get an entity by ID only after getting the entity by username. Is it a bug? Video from shell
I'm using a Mac and Python 3.
I'm trying to get an entity by ID:
entity = client.get_entity(1151511560)
but get an exception:
ValueError: Could not find the input entity for <telethon.tl.types.PeerUser object at 0x1172312e8>. Please read https://telethon.readthedocs.io/en/latest/extra/basic/entities.html to find out more details.
Then I successfully get the entity by the username "ekat01".
After that I can successfully get the entity by ID.
Why can't I get an entity by ID alone? I think it's a bug, isn't it?
Video with proofs by the link: https://youtu.be/mnDNZZir5PY
Github -------------------------------------------------
From juanvelascogomez:
If I am not wrong, that is explained in the docs "Users, chat and channel, Important section": https://telethon.readthedocs.io/en/stable/extra/basic/entities.html
Once the library has “seen” the entity, you can use their integer ID. You can’t use entities from IDs the library hasn’t seen. You must make the library see them at least once and disconnect properly. You know where the entities are and you must tell the library. It won’t guess for you.
From Lonami:
On a clean session,
with client:
    try:
        client.get_entity(1151511560)
    except ValueError:
        print('Error as expected')
    client.get_entity("ekat01")
    client.get_entity(1151511560)
    print('Works as expected')
prints:
Error as expected
Works as expected
On a second run,
with client:
    client.get_entity(1151511560)
    print('Works as expected')
prints:
Works as expected
I'm using SmartGWT Mobile as the front end. From the client UI I'm making an RPC call, and as a result I need a RecordList.
If I use RecordList, it throws a compilation error saying the RecordList package is not imported or not found. I need the result in the form of a RecordList. For example, I have to search files by name, so the result should contain each file's name, date and size. Please help.
Thanks in advance
As I understand it, Records are client-side objects. Your RPC should return Serializable objects, and in the async callback you set the different attributes of your records from the corresponding values found in the objects coming from the server.
For example, in your RPC:
MySerializableType[] theNodes = new MySerializableType[size];
........
return theNodes;
and somewhere else:
public class MySerializableType implements IsSerializable {