How to handle inheritance in mypy - python-3.x

I'm trying to add mypy to my python project but I have found a roadblock. Let's say I have the following inheritance:
class BaseClass:
    base_attribute: str


class A(BaseClass):
    attribute_for_class_A: str


class B(BaseClass):
    attribute_for_class_B: str
Now let's create some code that handles instances of both of these classes, without really knowing which:
from dataclasses import dataclass
from typing import Dict


@dataclass
class ClassUsingTheOthers:
    fields: Dict[str, BaseClass]

    def get_field(self, field_name: str) -> BaseClass:
        field = self.fields.get(field_name)
        if not field:
            raise ValueError('Not found')
        return field
The important bit here is the get_field method. Now let's create a function that uses the get_field method, but requires a particular subclass of BaseClass, B for instance:
def function_that_needs_an_instance_of_b(instance: B):
    print(instance.attribute_for_class_B)
Now if we use all the code together, we can get the following:
if __name__ == "__main__":
    class_using_the_others = ClassUsingTheOthers(
        fields={
            'name_1': A(),
            'name_2': B()
        }
    )
    function_that_needs_an_instance_of_b(class_using_the_others.get_field('name_2'))
Obviously, when I run mypy on this file (you can find all the code in this gist), I get the following error, as expected:
error: Argument 1 to "function_that_needs_an_instance_of_b" has incompatible type "BaseClass"; expected "B" [arg-type]
So my question is, how do I fix my code to make this error go away? I cannot change the type hint of the fields attribute because I really need to set it that way. Any ideas? Am I missing something? Should I check the type of the field returned?

I cannot change the type hint of the fields attribute
Well, there is your answer. If you declare fields to be a dictionary with values of type BaseClass, how do you expect any static type checker to know more about it?
(Related: Type annotation for mutable dictionary)
The type checker does not distinguish between different values of the dictionary based on any key you provide.
If you knew ahead of time what the exact key-value pairs can be, you could either do this with a TypedDict (as @dROOOze suggested) or you could write some ugly overloads with different Literal string values for the field_name parameter of your get_field method (sketched below).
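For illustration only, such overloads might look roughly like this. This is just a sketch; the Literal keys 'name_1' and 'name_2' are assumptions taken from your example and not something the real fields dict guarantees:
from dataclasses import dataclass
from typing import Dict, Literal, overload


class BaseClass:
    base_attribute: str


class A(BaseClass):
    attribute_for_class_A: str


class B(BaseClass):
    attribute_for_class_B: str


@dataclass
class ClassUsingTheOthers:
    fields: Dict[str, BaseClass]

    # Sketch: each known key gets its own overload with a Literal type,
    # so mypy can map 'name_2' to B at the call site.
    @overload
    def get_field(self, field_name: Literal['name_1']) -> A: ...
    @overload
    def get_field(self, field_name: Literal['name_2']) -> B: ...
    @overload
    def get_field(self, field_name: str) -> BaseClass: ...

    def get_field(self, field_name: str) -> BaseClass:
        field = self.fields.get(field_name)
        if not field:
            raise ValueError('Not found')
        return field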
But none of those apply due to your restriction.
So you are left with either type-narrowing via a runtime assertion (as alluded to by @juanpa.arrivillaga), which I would recommend, or placing a specific type: ignore[arg-type] comment (as mentioned by @luk2302) and being done with it.
The former would look like this:
from dataclasses import dataclass


class BaseClass:
    base_attribute: str


@dataclass
class A(BaseClass):
    attribute_for_class_A: str


@dataclass
class B(BaseClass):
    attribute_for_class_B: str


@dataclass
class ClassUsingTheOthers:
    fields: dict[str, BaseClass]

    def get_field(self, field_name: str) -> BaseClass:
        field = self.fields.get(field_name)
        if not field:
            raise ValueError('Not found')
        return field


def function_that_needs_an_instance_of_b(instance: B) -> None:
    print(instance.attribute_for_class_B)


if __name__ == '__main__':
    class_using_the_others = ClassUsingTheOthers(
        fields={
            'name_1': A(attribute_for_class_A='foo'),
            'name_2': B(attribute_for_class_B='bar'),
        }
    )
    obj = class_using_the_others.get_field('name_2')
    assert isinstance(obj, B)
    function_that_needs_an_instance_of_b(obj)
This keeps both mypy happy and you sane, should you ever forget what value you were expecting there.
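For completeness, the latter option (the targeted suppression) would just replace the last three lines of the example above with a silenced call; there is no runtime check, so you lose that safety net:
    # Alternative to the assert: suppress exactly this error at the call site.
    function_that_needs_an_instance_of_b(
        class_using_the_others.get_field('name_2')  # type: ignore[arg-type]
    )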

Related

How to type hint a function, added to class by class decorator in Python

I have a class decorator which adds a few functions and fields to the decorated class.
@mydecorator
@dataclass
class A:
    a: str = ""
Added (via setattr()) are a .save() function and a separate dict with info about the dataclass fields.
I'd like VSCode and mypy to properly recognize that, so that when I use:
a = A()
a.save()
or a.my_fields_dict, those two are properly recognized.
Is there any way to do that? Maybe modify class A type annotations at runtime?
TL;DR
What you are trying to do is not possible with the current type system.
1. Intersection types
If the attributes and methods you are adding to the class via your decorator are static (in the sense that they are not just known at runtime), then what you are describing is effectively the extension of any given class T by mixing in a protocol P. That protocol defines the method save and so on.
To annotate this you would need an intersection of T & P. It would look something like this:
from typing import Protocol, TypeVar

T = TypeVar("T")


class P(Protocol):
    @staticmethod
    def bar() -> str: ...


def dec(cls: type[T]) -> type[Intersection[T, P]]:
    setattr(cls, "bar", lambda: "x")
    return cls  # type: ignore[return-value]


@dec
class A:
    @staticmethod
    def foo() -> int:
        return 1
You might notice that the import of Intersection is conspicuously missing. That is because despite being one of the most requested features for the Python type system, it is still missing as of today. There is currently no way to express this concept in Python typing.
2. Class decorator problems
The only workaround right now is a custom implementation alongside a corresponding plugin for the type checker(s) of your choice. I just stumbled across the typing-protocol-intersection package, which does just that for mypy.
If you install that and add plugins = typing_protocol_intersection.mypy_plugin to your mypy configuration, you could write your code like this:
from typing import Protocol, TypeVar

from typing_protocol_intersection import ProtocolIntersection

T = TypeVar("T")


class P(Protocol):
    @staticmethod
    def bar() -> str: ...


def dec(cls: type[T]) -> type[ProtocolIntersection[T, P]]:
    setattr(cls, "bar", lambda: "x")
    return cls  # type: ignore[return-value]


@dec
class A:
    @staticmethod
    def foo() -> int:
        return 1
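(For reference, the plugin registration lives in your mypy configuration file; a minimal sketch assuming an mypy.ini file, adapt accordingly if you configure mypy via pyproject.toml or setup.cfg:)
# mypy.ini (sketch)
[mypy]
plugins = typing_protocol_intersection.mypy_plugin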
But here we run into the next problem. Testing this with reveal_type(A.bar()) via mypy will yield the following:
error: "Type[A]" has no attribute "bar" [attr-defined]
note: Revealed type is "Any"
Yet if we do this instead:
class A:
    @staticmethod
    def foo() -> int:
        return 1


B = dec(A)
reveal_type(B.bar())
we get no complaints from mypy and note: Revealed type is "builtins.str". Even though what we did before was equivalent!
This is not a bug in the plugin, but in the mypy internals. It is another long-standing issue that mypy does not handle class decorators correctly.
A person in that issue thread even mentioned your use case in conjunction with the desired intersection type.
DIY
In other words, you'll just have to wait until those two holes are patched. Or you can hope that at least the decorator issue in mypy is fixed soon-ish and write your own VSCode plugin for intersection types in the meantime. Maybe you can get together with the person behind the mypy plugin I mentioned above.

What is the correct type hint to use when exporting a pydantic model as a dict?

I'm writing an abstraction module which validates an Excel sheet against a pydantic schema and returns the row as a dict using dict(MyCustomModel(**sheet_row)). I would like to use type hinting so any function that uses the abstraction methods gets a type hint for the returned dictionary with its keys, instead of just an unhelpful dict. Basically, I'd like to return the keys of the dict that compose the schema so I don't have to keep referring to the schema for its fields, and to catch any errors early on.
My current workaround is having my abstraction library return the pydantic model directly and type hint using the model itself. This means every field has to be accessed using dot notation instead of accessing it like a regular dictionary. I cannot annotate the dict as being the model itself, as it's a dict, not the actual pydantic model, which has some extra attributes as well.
I tried type hinting with the type MyCustomModel.__dict__(). That resulted in the error TypeError: Parameters to generic types must be types. Got mappingproxy({'__config__': <class 'foo.bar.Config'>, '__fields__': {'lab.. Is there a way to send a type hint about the fields in the schema, but as a dictionary? I don't omit any keys during the dict export; all the fields in the model are present in the final dict being returned.
I am going to try to abstract that question and create a minimal reproducible example for you.
Question
Consider this working example:
from typing import Any

from pydantic import BaseModel


class Foo(BaseModel):
    x: str
    y: int


def validate(data: dict[str, Any], model: type[BaseModel]) -> dict[str, Any]:
    return dict(model.parse_obj(data))


def test() -> None:
    data = {"x": "spam", "y": "123"}
    validated = validate(data, Foo)
    print(validated)
    # reveal_type(validated["x"])
    # reveal_type(validated["y"])


if __name__ == "__main__":
    test()
The code works fine and outputs {'x': 'spam', 'y': 123} as expected. But if you uncomment the reveal_type lines and run mypy over it, obviously the type it sees is just Any for both.
Is there a way to annotate validate so that a type checker knows which keys will be present in the returned dictionary, based on the model provided to it?
Answer
Python dictionaries have no mechanism built into them for distinguishing their type via specific keys. The generic dict type is parameterized by exactly two type parameters, namely the key type and the value type.
You can utilize the typing.TypedDict class to define a type based on the specific keys of a dictionary. However (as pointed out by @hernán-alarcón in the comments), the __dict__ method still returns just a dict[str, Any]. You can always cast the output of course, and for this particular Foo model this would work:
from typing import Any, TypedDict, cast

from pydantic import BaseModel


class Foo(BaseModel):
    x: str
    y: int


class FooDict(TypedDict):
    x: str
    y: int


def validate(data: dict[str, Any], model: type[BaseModel]) -> FooDict:
    return cast(FooDict, dict(model.parse_obj(data)))


def test() -> None:
    data = {"x": "spam", "y": "123"}
    validated = validate(data, Foo)
    print(validated)
    reveal_type(validated["x"])  # "builtins.str"
    reveal_type(validated["y"])  # "builtins.int"


if __name__ == "__main__":
    test()
But it is not very helpful if validate should be able to deal with any model, not just Foo.
The easiest way to generalize this that I can think of is to make your own base model class that is generic in terms of the corresponding TypedDict. Binding the type argument in a dedicated private attribute should be enough. You won't actually have to set it or interact with it at any point; it is enough to specify it when you subclass your base class. Here is a working example:
from typing import Any, Generic, TypeVar, TypedDict, cast

from pydantic import BaseModel as PydanticBaseModel, PrivateAttr

T = TypeVar("T")


class BaseModel(PydanticBaseModel, Generic[T]):
    __typed_dict__: type[T] = PrivateAttr(...)


class FooDict(TypedDict):
    x: str
    y: int


class Foo(BaseModel[FooDict]):
    x: str
    y: int


def validate(data: dict[str, Any], model: type[BaseModel[T]]) -> T:
    return cast(T, model.parse_obj(data).dict())


def test() -> None:
    data = {"x": "spam", "y": "123"}
    validated = validate(data, Foo)
    print(validated)
    reveal_type(validated["x"])  # "builtins.str"
    reveal_type(validated["y"])  # "builtins.int"
    reveal_type(validated)  # "TypedDict('FooDict', {'x': builtins.str, 'y': builtins.int})"


if __name__ == "__main__":
    test()
This works well enough to convey the dictionary keys and corresponding types.
If you are wondering whether there is a way to just dynamically infer the TypedDict rather than duplicating the model fields manually, the answer is no.
Static type checkers do not execute your code, they just read it.
This brings me to the final consideration. I don't know why you would even want to use a dictionary over a model instance in the first place. It seems that for the purpose of dealing with structured data, the model is superior in every respect, if you are already using Pydantic anyway.
The fact that you access the fields as attributes (via dot notation) is a feature IMHO, not a drawback of this approach. If you do for some reason need dynamic attribute access via field names given as strings, you can always just use getattr on the model instance.
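A small sketch of what that looks like (the model and field names are just the Foo example from above):
from pydantic import BaseModel


class Foo(BaseModel):
    x: str
    y: int


foo = Foo(x="spam", y=123)

# Dynamic attribute access via a field name given as a string:
field_name = "x"
print(getattr(foo, field_name))  # spam

# Iterating over all declared fields dynamically (pydantic v1 API):
for name in Foo.__fields__:
    print(name, getattr(foo, name))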

AttributeError with Typed variables [duplicate]

I have been after a way to declare non-initialized instance variables in my class. I found that we can actually do that using type hinting, without assigning anything to them, which does not seem to create them in any way. For example:
class T:
    def __init__(self):
        self.a: str

    def just_print(self):
        print(self.a)

    def assign(self):
        self.a = "test"
Now let's say I run this code:
t = T()
t.just_print()
It will raise an AttributeError saying 'T' object has no attribute 'a'. Obviously, when I run this code, it prints test:
t = T()
t.assign()
t.just_print()
My question is, what happens behind the scenes when I just do a: str? It doesn't get added to the class's attributes, but it doesn't cause any problem either. So... is it just ignored? This is Python 3.8, by the way.
You're referring to type annotations, as defined by PEP 526:
my_var: int
Please note that type annotations differ from type hints, as defined by PEP 484:
def my_func(foo: str):
    ...
Type annotations have actual runtime effects. For example, the documentation states:
In addition, at the module or class level, if the item being annotated is a simple name, then it and the annotation will be stored in the __annotations__ attribute of that module or class [...]
So, by slightly modifying your example, we get this:
>>> class T:
...     a: str
...
>>> T.__annotations__
{'a': <class 'str'>}
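By contrast, the annotation on the attribute target self.a inside __init__ is not recorded in the class's __annotations__ and does not create the attribute; a rough demonstration, reusing your class name:
class T:
    def __init__(self):
        self.a: str  # annotation only, nothing is assigned here

    def just_print(self):
        print(self.a)


t = T()
print("__annotations__" in T.__dict__)  # False: the annotation was not recorded
print(hasattr(t, "a"))                  # False: no attribute was created
try:
    t.just_print()
except AttributeError as exc:
    print(exc)                          # 'T' object has no attribute 'a'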

Python mypy marks error when method parameter is type Union

I have these python classes:
class LocalWritable(typing.TypedDict):
    file_name: str


class GSheetWritable(typing.TypedDict):
    tab_name: str


class S3Writable(typing.TypedDict):
    data_name: str
    table_name: str


WriterMeta = typing.Union[GSheetWritable, S3Writable, LocalWritable]


class DataWriter(ABC):
    """Defines the interface for all data writers"""

    @abstractmethod
    def write(self, data: pd.DataFrame, meta: WriterMeta, versionize: bool):
        """This method performs the writing of 'data'.

        Every class implementing this method must implement its writing
        using 'connector'
        """
        pass


class GSheetOutputWriter(DataWriter):
    def write(self, data: pd.DataFrame, meta: WriterMeta, versionize: bool):
        data = data.replace({np.nan: 0, np.Inf: "Inf"})
        print("Writing '{}' table to gsheet.".format(meta["tab_name"]))
        if self.new:
            tab = self.connector.get_worksheet(self.target.url, "Sheet1")
            self.connector.rename_worksheet(tab, meta["tab_name"])
            self.new = False
        else:
            tab = self.connector.add_worksheet(
                self.target, meta["tab_name"], rows=1, cols=1
            )
        time.sleep(random.randint(30, 60))
        self.connector.update_worksheet(
            tab, [data.columns.values.tolist()] + data.values.tolist()
        )
The problem is with the write() method when linting with mypy, because it reports this error:
cost_reporter\outputs\__init__.py:209: error: TypedDict "S3Writable" has no key "tab_name"
cost_reporter\outputs\__init__.py:209: note: Did you mean "table_name" or "data_name"?
cost_reporter\outputs\__init__.py:209: error: TypedDict "LocalWritable" has no key "tab_name"
What I am trying to do is implement three concrete classes based on the abstract class DataWriter; each one shall implement its own write() method, and each one shall receive one of the types of the WriterMeta union. The problem I am having is that mypy validates the code against all three types instead of just one of them.
How can I do that?
EDIT
If I change the type of the parameter meta to GSheetWritable (that is, one of the three types of the union and the one expected by this concrete class), mypy reports this error:
cost_reporter\outputs\__init__.py:202: error: Argument 2 of "write" is incompatible with supertype "DataWriter"; supertype defines the argument type as "Union[GSheetWritable, S3Writable, LocalWritable]"
cost_reporter\outputs\__init__.py:202: note: This violates the Liskov substitution principle
A Union works like unions in set theory. In other words, a value typed as a Union of multiple types supports only what the member types have in common.
In order to use attributes (or whatever) of a specific type, you need to hint to mypy that you're constraining an instance. You can do this by casting the Union to a specific type, asserting that your object is whatever specific type, and others. The documentation lists ways to narrow types.
import typing
from abc import ABC, abstractmethod


class LocalWritable(typing.TypedDict):
    file_name: str


class GSheetWritable(typing.TypedDict):
    tab_name: str


class S3Writable(typing.TypedDict):
    data_name: str
    table_name: str


WriterMeta = typing.Union[GSheetWritable, S3Writable, LocalWritable]


class DataWriter(ABC):
    @abstractmethod
    def write(self, data: str, meta: WriterMeta):
        pass


class GSheetOutputWriter(DataWriter):
    def write(self, data: str, meta: WriterMeta):
        # LOOK HERE! The cast hints to mypy that meta is a GSheetWritable.
        meta_cast: GSheetWritable = typing.cast(GSheetWritable, meta)
        print("Writing '{}' table to gsheet.".format(meta_cast["tab_name"]))
Further reading
Type narrowing
Union type

Pycharm type hints warning for classes instead of instances

I am trying to understand why Pycharm warns me about a wrong type when using an implementation of an abstract class with a static method as a parameter.
To demonstrate, I will make a simple example. Let's say I have an abstract class with one method, a class that implements (inherits from) this interface-like abstract class, and a method that gets the implementation it should use as a parameter.
import abc


class GreetingMakerBase(abc.ABC):
    @abc.abstractmethod
    def make_greeting(self, name: str) -> str:
        """ Makes greeting string with name of person """


class HelloGreetingMaker(GreetingMakerBase):
    def make_greeting(self, name: str) -> str:
        return "Hello {}!".format(name)


def print_greeting(maker: GreetingMakerBase, name):
    print(maker.make_greeting(name))


hello_maker = HelloGreetingMaker()
print_greeting(hello_maker, "John")
Notice that in the type hint of print_greeting I used GreetingMakerBase, and because isinstance(hello_maker, GreetingMakerBase) is True, Pycharm is not complaining about it.
The problem is that I have many implementations of my class and don't want to make an instance of each, so I will make the make_greeting method static, like this:
class GreetingMakerBase(abc.ABC):
    @staticmethod
    @abc.abstractmethod
    def make_greeting(name: str) -> str:
        """ Makes greeting string with name of person """


class HelloGreetingMaker(GreetingMakerBase):
    @staticmethod
    def make_greeting(name: str) -> str:
        return "Hello {}!".format(name)


def print_greeting(maker: GreetingMakerBase, name):
    print(maker.make_greeting(name))


print_greeting(HelloGreetingMaker, "John")
This still works the same way, but apparently because the parameter in the function call is now the class name instead of an instance of it, Pycharm complains that:
Expected type 'GreetingMakerBase', got 'Type[HelloGreetingMaker]' instead.
Is there a way I can solve this warning without having to instantiate the HelloGreetingMaker class?
When you are doing print_greeting(HelloGreetingMaker, "John"), you are not trying to pass in an instance of HelloGreetingMaker. Rather, you're passing in the class itself.
The way we type this is by using Type[T], which specifies that we want the type of T rather than an instance of T. So for example:
from typing import Type
import abc


class GreetingMakerBase(abc.ABC):
    @staticmethod
    @abc.abstractmethod
    def make_greeting(name: str) -> str:
        """ Makes greeting string with name of person """


class HelloGreetingMaker(GreetingMakerBase):
    @staticmethod
    def make_greeting(name: str) -> str:
        return "Hello {}!".format(name)


def print_greeting(maker: Type[GreetingMakerBase], name):
    print(maker.make_greeting(name))


# Type checks!
print_greeting(HelloGreetingMaker, "John")
Note that Type[HelloGreetingMaker] is considered to be compatible with Type[GreetingMakerBase] -- Type[T] is covariant with respect to T.
The Python docs on the typing module and the mypy docs have more details and examples if you want to learn more.
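A minimal self-contained illustration of that covariance (same class names as above, nothing new assumed):
from typing import Type
import abc


class GreetingMakerBase(abc.ABC):
    @staticmethod
    @abc.abstractmethod
    def make_greeting(name: str) -> str: ...


class HelloGreetingMaker(GreetingMakerBase):
    @staticmethod
    def make_greeting(name: str) -> str:
        return "Hello {}!".format(name)


# Covariance: a variable of type Type[GreetingMakerBase] may hold any
# concrete subclass of GreetingMakerBase, such as HelloGreetingMaker.
maker_cls: Type[GreetingMakerBase] = HelloGreetingMaker
print(maker_cls.make_greeting("John"))  # Hello John!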
You did not create an instance, and your type hint implies that the function only accepts instances (something of type GreetingMakerBase, not GreetingMakerBase itself or a subclass of it).
If you want to specify that only GreetingMakerBase itself is an acceptable argument, why have it as an argument at all? Just have the function call that class internally.
In any case, python 3.8 has some new typing improvements that can help you. You can specify a literal type hint:
from typing import Literal


def print_greeting(maker: Literal[GreetingMakerBase], name):
    print(maker.make_greeting(name))
If you need to support this type-hint in other (earlier than 3.8) python versions, you will have to install the typing extensions:
pip install typing-extensions
