Avoid PyCharm __dict__ lookup when using __getattr__ and __slots__ for composition - python-3.x

Say I have a class:
from collections import OrderedDict

class Example:
    __slots__ = ("_attrs", "other_value")

    def __init__(self):
        self._attrs = OrderedDict()
        self.other_value = 1
        self.attribute = 0

    def __setattr__(self, key, value):
        if key in self.__slots__:
            return super().__setattr__(key, value)
        else:
            self._attrs[key] = value

    def __getattr__(self, key):
        return self._attrs[key]
The goal is for Example to have two slots:
If those are set, set them as usual. (works)
If additional attributes are set, store them in _attrs. (works)
For getting attributes, the code should:
Act as usual if anything from __slots__ is requested. (works)
Get the value from _attrs if the key exists in _attrs. (works)
Raise the usual error in any other case. (issue)
For the issue, I'd like the error to mimic what would normally happen if an attribute were not present on an object. Currently when running the code I get a KeyError on self._attrs. Although this is fine, it would be nice to hide this nuance away. More annoyingly, if I debug in PyCharm, the autocomplete will chuck out a large error trying to look at __dict__ before I've even hit enter:
Example().abc # hit tab in pycharm
# returns the error:
Traceback (most recent call last):
  File "/Applications/PyCharm.app/Contents/plugins/python/helpers/pydev/_pydevd_bundle/pydevd_comm.py", line 1464, in do_it
    def do_it(self, dbg):
  File "/Applications/PyCharm.app/Contents/plugins/python/helpers/pydev/_pydev_bundle/_pydev_completer.py", line 159, in generate_completions_as_xml
    def generate_completions_as_xml(frame, act_tok):
  File "/Applications/PyCharm.app/Contents/plugins/python/helpers/pydev/_pydev_bundle/_pydev_completer.py", line 77, in complete
    def complete(self, text):
  File "/Applications/PyCharm.app/Contents/plugins/python/helpers/pydev/_pydev_bundle/_pydev_completer.py", line 119, in attr_matches
    def attr_matches(self, text):
  File "/Applications/PyCharm.app/Contents/plugins/python/helpers/pydev/_pydev_bundle/_pydev_imports_tipper.py", line 165, in generate_imports_tip_for_module
    def generate_imports_tip_for_module(obj_to_complete, dir_comps=None, getattr=getattr, filter=lambda name:True):
  File "/Users/xxxxxxxxx/", line 46, in __getattr__
    def __getattr__(self, key: str) -> None:
KeyError: '__dict__'
Is there a way to suppress this by writing the code differently?

You might be able to make it work by implementing __dir__ on the class, so it has a canonical source of names that can be completed:
def __dir__(self):
    return 'other_value', *self._attrs.keys()
I can't swear to how PyCharm implements its tab-completion, so there's no guarantee it works, but this is the way to define the set of enumerable attributes for a type, and hopefully PyCharm will use it when available, rather than going for __dict__.
The other approach (and this is probably a good idea regardless) is to make sure you raise the right error when __getattr__ fails, so PyCharm knows the problem is a missing attribute, not some unrelated issue with a dict:
def __getattr__(self, key):
    try:
        return self._attrs[key]
    except KeyError:
        raise AttributeError(key)
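Combining the two suggestions, a minimal sketch of the class with both fixes applied (assuming the OrderedDict-based design from the question; the `from None` is an extra touch to hide the internal KeyError from tracebacks) could look like this:

```python
from collections import OrderedDict

class Example:
    __slots__ = ("_attrs", "other_value")

    def __init__(self):
        self._attrs = OrderedDict()
        self.other_value = 1
        self.attribute = 0  # not in __slots__, so it lands in _attrs

    def __setattr__(self, key, value):
        if key in self.__slots__:
            super().__setattr__(key, value)
        else:
            self._attrs[key] = value

    def __getattr__(self, key):
        # Only called when normal lookup fails, so slots are unaffected.
        try:
            return self._attrs[key]
        except KeyError:
            # Mimic a normal attribute miss; "from None" suppresses
            # the chained KeyError in the traceback.
            raise AttributeError(key) from None

    def __dir__(self):
        # Advertise the dynamic attributes to completion tools.
        return ('other_value', *self._attrs.keys())
```

With this, a missing attribute raises a plain AttributeError, and dir() lists both the slot and the dynamic attributes.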

Related

Tweepy StreamingClient custom __init__ / arguments - super() not working

I'm looking to stream tweets to json files. I'm using Twitter API v2.0, with Tweepy 4.12.1 and Python 3.10 on Ubuntu 22.04. I'm working with the tweepy.StreamingClient class, and utilizing the on_data() method.
import datetime
import tweepy

class TweetStreamer(tweepy.StreamingClient):
    def __init__(self,
                 out_path: str,
                 kill_time: int = 59,
                 time_unit: str = 'minutes'):
        '''
        adding custom params
        '''
        out_path = check_path_exists(out_path)
        self.outfile = out_path
        if time_unit not in ['seconds', 'minutes']:
            raise ValueError('time_unit must be either `minutes` or `seconds`.')
        if time_unit == 'minutes':
            self.kill_time = datetime.timedelta(seconds=60 * kill_time)
        else:
            self.kill_time = datetime.timedelta(seconds=kill_time)
        super(TweetStreamer, self).__init__()

    def on_data(self, data):
        '''
        1. clean the returned tweet object
        2. write it out
        '''
        # out_obj = data['data']
        with open(self.outfile, 'ab') as o:
            o.write(data + b'\n')
As you can see, I'm invoking super().__init__() on the parent class, with the intention of retaining everything that would be set up if I hadn't specified a custom __init__.
However, no matter how I change it, I get this error when I try to create an instance of the class, passing a bearer_token as well as the other arguments defined in __init__():
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
Cell In [106], line 1
----> 1 streamer = TweetStreamer(bearer_token=bearer_token, out_path='test2.json')
TypeError: TweetStreamer.__init__() got an unexpected keyword argument 'bearer_token'
Any help would be greatly appreciated.
You get this error because your __init__ signature does not accept bearer_token as an argument. You need to pass the keyword arguments your constructor receives on to the superclass constructor that expects them. You can do that with the unpacking operator **:
def __init__(self,
             out_path: str,
             kill_time: int = 59,
             time_unit: str = 'minutes',
             **kwargs):
    '''
    adding custom params
    '''
    # (...)
    super(TweetStreamer, self).__init__(**kwargs)
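To illustrate the forwarding pattern without the Twitter API, here is a sketch with a stand-in base class (DummyClient is hypothetical, playing the role of tweepy.StreamingClient):

```python
class DummyClient:
    """Hypothetical stand-in for tweepy.StreamingClient."""
    def __init__(self, bearer_token=None):
        self.bearer_token = bearer_token

class TweetStreamer(DummyClient):
    def __init__(self, out_path: str, kill_time: int = 59, **kwargs):
        self.outfile = out_path
        self.kill_time = kill_time
        # Any keyword arguments this __init__ doesn't name (e.g.
        # bearer_token) are forwarded to the parent constructor.
        super().__init__(**kwargs)

streamer = TweetStreamer(bearer_token="secret-token", out_path="test2.json")
```

The subclass only names the parameters it handles itself; everything else flows through **kwargs to the parent.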

Is there a way to solve __enter__ AttributeError in ContextDecorator?

I'm trying to create a ContextDecorator. Here's my code:
import traceback
from contextlib import ContextDecorator

class CmTag(ContextDecorator):
    def __init__(self, cm_tag_func):
        self.cm_tag_func = cm_tag_func

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, tb):
        if exc_type is not None:
            traceback.print_exception(exc_type, exc_value, tb)
        else:
            name = self.cm_tag_func.__name__
            print(name)

    def __call__(self, **kwargs):
        name = self.cm_tag_func.__name__
        print(name)
        print(kwargs)
        self.cm_tag_func(**kwargs)

@CmTag
def testing(**kwargs):
    pass

with testing(foo='bar') as t:
    print('a test')
I expect the output to be:
testing
{'foo':'bar'}
a test
testing
That is, it first prints the name of the function. Then it prints out kwargs as a dictionary.
Then it prints out whatever is there inside the context manager, which is 'a test', in this case. Finally upon exit, it prints out the name of the function again.
Instead, it says:
testing
{'foo': 'bar'}
Traceback (most recent call last):
  File "/workspace/sierra/src/sierra/test.py", line 32, in <module>
    with testing(foo='bar') as t:
AttributeError: __enter__
I saw other solutions saying __enter__ was not defined. But I've defined it here.
How do I rectify this? Thanks.
In the line

with testing(foo='bar') as t:
    print('a test')

since testing is an instance of CmTag, the call testing(foo='bar') invokes __call__; however, that method returns None instead of returning self, so the with statement has nothing to enter.
Adding return self at the end of that method fixes it.
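A minimal sketch of the corrected class (only __call__ changes from the question's code):

```python
import traceback
from contextlib import ContextDecorator

class CmTag(ContextDecorator):
    def __init__(self, cm_tag_func):
        self.cm_tag_func = cm_tag_func

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, tb):
        if exc_type is not None:
            traceback.print_exception(exc_type, exc_value, tb)
        else:
            print(self.cm_tag_func.__name__)

    def __call__(self, **kwargs):
        print(self.cm_tag_func.__name__)
        print(kwargs)
        self.cm_tag_func(**kwargs)
        return self  # the fix: give the with statement an object to enter

@CmTag
def testing(**kwargs):
    pass

with testing(foo='bar') as t:
    print('a test')
```

This now produces the four lines of output the question expects: the function name, the kwargs dict, the body of the with block, and the function name again on exit.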

Overridden __setitem__ call works in serial but breaks in apply_async call

I've been fighting with this problem for some time now and I've finally managed to narrow down the issue and create a minimum working example.
The summary of the problem is that I have a class that inherits from dict to facilitate parsing of misc. input files. I've overridden the __setitem__ call to support recursive indexing of sections in our input file (e.g. parser['some.section.variable'] is equivalent to parser['some']['section']['variable']). This has been working great for us for over a year now, but we just ran into an issue when passing these Parser classes through a multiprocessing.apply_async call.
Shown below is the minimum working example - obviously the __setitem__ call isn't doing anything special, but it's important that it accesses some class attribute like self.section_delimiter - this is where it breaks. It doesn't break in the initial call or in the serial function call. But when you call some_function (which doesn't do anything either) using apply_async, it crashes.
import multiprocessing as mp
import numpy as np

class Parser(dict):
    def __init__(self, file_name: str = None):
        print('\t__init__')
        super().__init__()
        self.section_delimiter = "."

    def __setitem__(self, key, value):
        print('\t__setitem__')
        self.section_delimiter
        dict.__setitem__(self, key, value)

def some_function(parser):
    pass

if __name__ == "__main__":
    print("Initialize creation/setting")
    parser = Parser()
    parser['x'] = 1

    print("Single serial call works fine")
    some_function(parser)

    print("Parallel async call breaks on line 16?")
    pool = mp.Pool(1)
    for i in range(1):
        pool.apply_async(some_function, (parser,))
    pool.close()
    pool.join()
If you run the code above, you'll get the following output:
Initialize creation/setting
    __init__
    __setitem__
Single serial call works fine
Parallel async call breaks on line 16?
    __setitem__
Process ForkPoolWorker-1:
Traceback (most recent call last):
  File "/home/ijw/miniconda3/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap
    self.run()
  File "/home/ijw/miniconda3/lib/python3.7/multiprocessing/process.py", line 99, in run
    self._target(*self._args, **self._kwargs)
  File "/home/ijw/miniconda3/lib/python3.7/multiprocessing/pool.py", line 110, in worker
    task = get()
  File "/home/ijw/miniconda3/lib/python3.7/multiprocessing/queues.py", line 354, in get
    return _ForkingPickler.loads(res)
  File "test_apply_async.py", line 13, in __setitem__
    self.section_delimiter
AttributeError: 'Parser' object has no attribute 'section_delimiter'
Any help is greatly appreciated. I spent considerable time tracking down this bug and reproducing a minimal example. I would love to not only fix it, but clearly fill some gap in my understanding on how these apply_async and inheritance/overridden methods interact.
Let me know if you need any more information.
Thank you very much!
Isaac
Cause
The cause of the problem is that multiprocessing serializes and deserializes your Parser object to move its data across process boundaries. This is done using pickle. By default pickle does not call __init__() when deserializing classes. Because of this self.section_delimiter is not set when the deserializer calls __setitem__() to restore the items in your dictionary and you get the error:
AttributeError: 'Parser' object has no attribute 'section_delimiter'
Using just pickle and no multiprocessing gives the same error:
import pickle
parser = Parser()
parser['x'] = 1
data = pickle.dumps(parser)
copy = pickle.loads(data) # Same AttributeError here
Deserialization will work for an object with no items and the value of section_delimiter will be restored:
import pickle

parser = Parser()
parser.section_delimiter = "|"
data = pickle.dumps(parser)
copy = pickle.loads(data)
print(copy.section_delimiter)  # Prints "|"
So in a sense you are just unlucky that pickle calls __setitem__() before it restores the rest of the state of your Parser.
Workaround
You can work around this by setting section_delimiter in __new__() and telling pickle what arguments to pass to __new__() by implementing __getnewargs__():
def __new__(cls, *args):
    self = super(Parser, cls).__new__(cls)
    self.section_delimiter = args[0] if args else "."
    return self

def __getnewargs__(self):
    return (self.section_delimiter,)
__getnewargs__() returns a tuple of arguments. Because section_delimiter is set in __new__(), it is no longer necessary to set it in __init__().
This is the code of your Parser class after the change:
class Parser(dict):
    def __init__(self, file_name: str = None):
        print('\t__init__')
        super().__init__()

    def __new__(cls, *args):
        self = super(Parser, cls).__new__(cls)
        self.section_delimiter = args[0] if args else "."
        return self

    def __getnewargs__(self):
        return (self.section_delimiter,)

    def __setitem__(self, key, value):
        print('\t__setitem__')
        self.section_delimiter
        dict.__setitem__(self, key, value)
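As a quick check, here is a sketch of the same class without the print statements, showing that the pickle round trip now succeeds:

```python
import pickle

class Parser(dict):
    def __init__(self, file_name: str = None):
        super().__init__()

    def __new__(cls, *args):
        self = super(Parser, cls).__new__(cls)
        self.section_delimiter = args[0] if args else "."
        return self

    def __getnewargs__(self):
        # Tells pickle which arguments to pass to __new__ when deserializing.
        return (self.section_delimiter,)

    def __setitem__(self, key, value):
        self.section_delimiter  # set by __new__, so this no longer fails
        dict.__setitem__(self, key, value)

parser = Parser()
parser['x'] = 1
copy = pickle.loads(pickle.dumps(parser))  # no AttributeError anymore
```

Because __new__ runs before pickle replays the stored items through __setitem__, section_delimiter is always available.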
Simpler solution
The reason pickle calls __setitem__() on your Parser object is because it is a dictionary. If your Parser is just a class that happens to implement __setitem__() and __getitem__() and has a dictionary to implement those calls then pickle will not call __setitem__() and serialization will work with no extra code:
class Parser:
    def __init__(self, file_name: str = None):
        print('\t__init__')
        self.dict = { }
        self.section_delimiter = "."

    def __setitem__(self, key, value):
        print('\t__setitem__')
        self.section_delimiter
        self.dict[key] = value

    def __getitem__(self, key):
        return self.dict[key]
So if there is no other reason for your Parser to be a dictionary, I would just not use inheritance here.
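A quick round trip (a sketch of the composition-based class without the print statements) shows that pickling now works with no pickle-specific methods at all:

```python
import pickle

class Parser:
    def __init__(self, file_name: str = None):
        self.dict = {}
        self.section_delimiter = "."

    def __setitem__(self, key, value):
        self.section_delimiter
        self.dict[key] = value

    def __getitem__(self, key):
        return self.dict[key]

parser = Parser()
parser['x'] = 1
# pickle restores the instance __dict__ wholesale here, so the
# overridden __setitem__ is never invoked during deserialization.
copy = pickle.loads(pickle.dumps(parser))
```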

Reading from file raises IndexError in Python

I am making an app which will return one random line from a .txt file. I made a class to implement this behaviour. The idea was to use one method to open the file (which would remain open) and another method to close it when the app exits. I do not have much experience working with files, so the following behaviour is strange to me:
In __init__ I called self.open_file() in order to just open the file, and it works fine for computing self.len. I then thought I would not need to call self.open_file() again, but when I call file.get_term() (which returns a random line) it raises IndexError (as if the file were empty). But if I call the file.open_file() method again first, everything works as expected.
In addition to this, closing the file raises AttributeError - object has no attribute 'close' - so I assumed the file somehow closes automatically, even though I did not use with open.
import random
import os

class Pictionary_file:
    def __init__(self, file):
        self.file = file
        self.open_file()
        self.len = self.get_number_of_lines()

    def open_file(self):
        self.opened = open(self.file, "r", encoding="utf8")

    def get_number_of_lines(self):
        i = -1
        for i, line in enumerate(self.opened):
            pass
        return i + 1

    def get_term_index(self):
        term_line = random.randint(0, self.len - 1)
        return term_line

    def get_term(self):
        term_line = self.get_term_index()
        term = self.opened.read().splitlines()[term_line]

    def close_file(self):
        self.opened.close()

if __name__ == "__main__":
    print(os.getcwd())
    file = Pictionary_file("pictionary.txt")
    file.open_file()  # WITHOUT THIS -> IndexError
    file.get_term()
    file.close()  # AttributeError
Where is my mistake and how can I correct it?
Here in __init__:

self.open_file()
self.len = self.get_number_of_lines()

self.get_number_of_lines() actually consumes the whole file because it iterates over it:

def get_number_of_lines(self):
    i = -1
    for i, line in enumerate(self.opened):
        # reads all lines of the file
        pass
    # at this point, `self.opened` is exhausted
    return i + 1
So when get_term calls self.opened.read(), it gets an empty string, so self.opened.read().splitlines() is an empty list.
file.close() is an AttributeError, because the Pictionary_file class doesn't have the close method. It does have close_file, though.
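One way to rectify it (a sketch, assuming one term per line, as the original pictionary.txt seems to contain) is to read all the lines once in __init__ and not keep a file handle open at all:

```python
import random

class PictionaryFile:
    def __init__(self, file_name):
        # Read everything once; the file is closed as soon as the
        # with-block ends, so there is nothing to close later.
        with open(file_name, "r", encoding="utf8") as f:
            self.lines = f.read().splitlines()
        self.len = len(self.lines)

    def get_term(self):
        # random.choice avoids the manual index arithmetic.
        return random.choice(self.lines)
```

This sidesteps both problems: the file is never exhausted mid-use, and there is no open/close lifecycle to manage.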

How can I use __getattr__ in functions in Python 3.1?

I'm trying to redefine the __getattr__ method on a function.
I've tried this code:
import types

def foo():
    print("foo")

def addMethod(obj, func):
    setattr(obj, func.__name__, types.MethodType(func, obj))

def __getattr__(obj, name):
    print(name)

addMethod(foo, __getattr__)
foo.bar
but I get this error:
Traceback (most recent call last):
  File blah blah, line 14, in <module>
    foo.bar
AttributeError: 'function' object has no attribute 'bar'
I've inspected the foo function and it really does have the method bound to it, but it seems that if you set it dynamically, __getattr__ won't get called.
If I do the same thing to a class - setting __getattr__ using my addMethod function - the instance won't call __getattr__ either, so the problem must be the dynamic assignment!
BUT if I put __getattr__ in the definition of the class, it will work, obviously.
The question is: how can I attach __getattr__ to the function to make it work? I can't put it there from the beginning because it's a function! I don't know how to do this.
Thanks!
Well, you don't. If you want attributes, make a class. If you want instances to be callable, define __call__ for it.
class foo:
    def __call__(self):
        print("foo")

    def __getattr__(self, name):
        print(name)

f = foo()
f()    # foo
f.bar  # bar
Your problem is that Python will only look up __xxx__ magic methods on the class of an object, not the object itself. So even though you are setting __getattr__ on the instance of foo, it will never be automatically called.
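That lookup rule is easy to demonstrate with a hypothetical class (Box is made up for illustration):

```python
class Box:
    pass

b = Box()
b.__len__ = lambda: 3  # set on the instance: ignored by len()
try:
    len(b)
except TypeError:
    print("len() ignored the instance attribute")

Box.__len__ = lambda self: 3  # set on the class: found by the special lookup
assert len(b) == 3
```

Special methods like __len__ and __getattr__ are looked up on type(obj), bypassing the instance entirely, which is why setting them per-instance has no effect.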
That is also the problem you are having with class foo:
class foo():
    pass

def addMethod(obj, func):
    setattr(obj, func.__name__, func)

def __getattr__(obj, name):
    print(name)

addMethod(foo, __getattr__)
foo.bar
Here Python is looking up bar on the class object foo, which means the __getattr__ method you added to foo is not getting called; instead, Python looks at foo's metaclass, which is type.
If you change that last line to
foo().bar
and get an instance of the class of foo, it will work.
