Given the following scenario:
import attrs

@attrs.define(kw_only=True)
class A:
    values: list[float] = attrs.field(converter=float)

A(values=["1.1", "2.2", "3.3"])
which results in
TypeError: float() argument must be a string or a real number, not 'list'
Obviously it's due to the whole list being passed to float, but is there a way to get attrs to do the conversion on each element, without providing a custom converter function?
As far as I know, attrs doesn't have a built-in option to switch conversion or validation to "element-wise", the way Pydantic's validators have the each_item parameter.
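For comparison, here is a rough sketch of what element-wise handling looks like with Pydantic v1's each_item flag (illustrative only; the class and validator names are mine, not from the question):

from pydantic import BaseModel, validator

class A(BaseModel):
    values: list[float]

    # pre=True runs before Pydantic's own coercion;
    # each_item=True applies the validator to every element
    @validator("values", each_item=True, pre=True)
    def convert(cls, v):
        return float(v)

print(A(values=["1.1", "2.2", "3.3"]))  # values=[1.1, 2.2, 3.3]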
I know you specifically did not ask for a converter function, but I don't really see much of an issue in defining one that you can reuse as often as you need to. Here is one way to implement a converter for your specific case:
from attrs import define, field
from collections.abc import Iterable
from typing import Any

def float_list(iterable: Iterable[Any]) -> list[float]:
    return [float(item) for item in iterable]

@define
class A:
    values: list[float] = field(converter=float_list)

if __name__ == '__main__':
    a = A(values=["1.1", "2.2", "3.3"])
    print(a)
It is not much different from your example using converter=float.
The output is of course A(values=[1.1, 2.2, 3.3]).
You could even have your own generic converter factory for arbitrary convertible item types:
from attrs import define, field
from collections.abc import Callable, Iterable
from typing import Any, TypeAlias, TypeVar

T = TypeVar("T")
ItemConv: TypeAlias = Callable[[Any], T]
ListConv: TypeAlias = Callable[[Iterable[Any]], list[T]]

def list_of(item_type: ItemConv[T]) -> ListConv[T]:
    def converter(iterable: Iterable[Any]) -> list[T]:
        return [item_type(item) for item in iterable]
    return converter

@define
class B:
    foo: list[float] = field(converter=list_of(float))
    bar: list[int] = field(converter=list_of(int))
    baz: list[bool] = field(converter=list_of(bool))

if __name__ == '__main__':
    b = B(
        foo=range(0, 10, 2),
        bar=["1", "2", 3.],
        baz=(-1, 0, 100),
    )
    print(b)
Output: B(foo=[0.0, 2.0, 4.0, 6.0, 8.0], bar=[1, 2, 3], baz=[True, False, True])
The only downside to that approach is that the mypy plugin for attrs (for some reason) cannot handle this type of converter function and will complain, unless you add # type: ignore[misc] to the field definition in question.
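For example:

    foo: list[float] = field(converter=list_of(float))  # type: ignore[misc]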
You could use cattrs, which is a companion library for attrs for transforming data.
So after a pip install cattrs:
from functools import partial

import attrs
from cattrs import structure

@attrs.define(kw_only=True)
class A:
    values: list[float] = attrs.field(converter=partial(structure, cl=list[float]))

print(A(values=["1.1", "2.2", "3.3"]))
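As before, this prints A(values=[1.1, 2.2, 3.3]).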
Pandas allows you to extend its DataFrame class by using the pd.api.extensions.register_dataframe_accessor() decorator.
While this is functional, it doesn't offer any additional type hinting capabilities.
For example, I would expect the following to type check OK and even provide type hints
import pandas as pd

@pd.api.extensions.register_dataframe_accessor('dataset')
class Extension:
    def __init__(self, df: pd.DataFrame):
        self._df = df

    def foo(self, bar) -> str:
        return "foobar"

foo = pd.DataFrame({"foo": ["bar"]})
foo.dataset.foo("bar")
#   ^ No Suggestions
How can I get dataframe accessors to provide autocomplete?
This can be done somewhat hackishly using typing.TYPE_CHECKING and a bit of inheritance.
from typing import TYPE_CHECKING

import pandas as pd

@pd.api.extensions.register_dataframe_accessor('dataset')
class Extension:
    def __init__(self, df: pd.DataFrame):
        self._df = df

    def foo(self, bar) -> str:
        return "foobar"

if TYPE_CHECKING:
    class DataFrame(pd.DataFrame):
        dataset: Extension

foo: 'DataFrame' = pd.DataFrame({"foo": ["bar"]})
# ^ you have to do this every time you transform the DataFrame
foo.dataset.foo("bar")
# ^ autocomplete is now provided
Unfortunately, PyCharm does not check the __annotations__ dictionary, or really do any dynamic type checking, so there doesn't appear to be a more universal solution.
I have split a large class implementation into different packages [1], and have used an import inside a method body to avoid an import cycle, as follows:
# model.py
class MyInt:
    def __init__(self, value: int):
        self.value = value

    def is_prime(self) -> bool:
        from methods import is_prime
        return is_prime(self)

# methods.py
from model import MyInt

def is_prime(x: MyInt) -> bool:
    # TODO: actually implement this
    return x.value == 2 or x.value % 2 == 1
However, pytype is not happy about this, failing to find the pyi file when reaching the import cycle:
File "/home/bkim/Projects/mwe/model.py", line 6, in is_prime: Couldn't import pyi for 'methods' [pyi-error]
Can't find pyi for 'model', referenced from 'methods'
How can I avoid this and still get type-checking?
[1] I've done this with just one tiny, utility method, actually. No need to yell about splitting a class across multiple packages.
This solution uses typing.TYPE_CHECKING to get one behavior during type checking and another at runtime:
import typing

class MyInt:
    def is_prime(self) -> bool:
        if typing.TYPE_CHECKING:
            return False
        from methods import is_prime
        return is_prime(self)
Curiously, using from typing import TYPE_CHECKING doesn't work, which may be a bug?
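That is, the following variant reportedly still trips pytype (a sketch of the non-working form, for contrast):

from typing import TYPE_CHECKING

class MyInt:
    def is_prime(self) -> bool:
        if TYPE_CHECKING:  # reportedly not recognized by pytype here
            return False
        from methods import is_prime
        return is_prime(self)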
What I have:
I am creating a dataclass and I am stating the types of its elements:
class Task:
    n_items: int
    max_weight: int
    max_size: int
    items: numpy.array(Item)  # incorrect way of doing it
What I want to do
I'd like to declare that items will be a numpy array of objects of class "Item".
You can put ndarray:
import numpy as np

class Task:
    n_items: int
    max_weight: int
    max_size: int
    items: np.ndarray
You have to use the ndarray class as the type:
import numpy as np

class Task:
    n_items: int
    max_weight: int
    max_size: int
    items: np.ndarray[<shapeType>, <convertedNumpyGenericType>]
Where <shapeType> is the type of the values defining the shape of the array (probably int) and <convertedNumpyGenericType> defines the array data's type. Be careful that you have to "convert" numpy generic types into Python ones: you may want to use np.dtype[<generic>], with <generic> the generic numpy type (e.g. np.float64).
If you want to set a default value (via the dataclasses field function), you have to do it as follows:

items: np.ndarray[_, _] = field(default_factory=lambda: np.zeros(shape=<int>, dtype=<type>))
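For instance, a concrete sketch (my own example, assuming float64 data and leaving the shape type as Any; subscripting np.ndarray at runtime needs a reasonably recent NumPy, or from __future__ import annotations):

from dataclasses import dataclass, field
from typing import Any

import numpy as np

@dataclass
class Task:
    n_items: int
    max_weight: int
    max_size: int
    # an array of float64 values; Any stands in for the shape type
    items: np.ndarray[Any, np.dtype[np.float64]] = field(
        default_factory=lambda: np.zeros(shape=3, dtype=np.float64)
    )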
You can use the nptyping package, which offers type hints specifically for Numpy data types.
Unless you want to create a custom Numpy container, the best you can do is to denote your array as a container of typing.Any objects, since support for types beyond the ones mentioned here is lacking.
from nptyping import NDArray, Shape
from typing import Any
import numpy as np

class Item:
    pass

class Foo:
    def __init__(self, bar: NDArray[Shape["1,2"], Any]):
        self.bar = bar

if __name__ == '__main__':
    foo = Foo(bar=np.array([Item(), Item()], dtype=Item))
    print(foo.bar)
Running this will yield something like
[<__main__.Item object at 0x7f13f0dd9e80>
<__main__.Item object at 0x7f13f0dd9040>]
I want to validate, from the moment of instance creation, whether the type of each field is right or wrong.
I tried using the @dataclass decorator, but it doesn't allow me to use the __init__ method; I also tried using a custom class-like type.
Also, depending on the type, I want to do some validations (if it is an int, check that field > 0; if it is a str, strip whitespace, for example).
I could use a dict to validate the types, but I want to know if there's a way to do it in a pythonic way:
class Car(object):
    """ My class with many fields """
    color: str
    name: str
    wheels: int

    def __init__(self):
        """ Get the type of fields and validate """
        pass
You can use the __post_init__ method of dataclasses to do your validations.
Below I just confirm that everything is an instance of the indicated type
from dataclasses import dataclass, fields

def validate(instance):
    for field in fields(instance):
        attr = getattr(instance, field.name)
        if not isinstance(attr, field.type):
            msg = "Field {0.name} is of type {1}, should be {0.type}".format(field, type(attr))
            raise ValueError(msg)

@dataclass
class Car:
    color: str
    name: str
    wheels: int

    def __post_init__(self):
        validate(self)
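A quick check that a wrong type is rejected (the values here are just illustrative):

# wheels is given a str, so __post_init__ raises
try:
    Car(color="red", name="beetle", wheels="four")
except ValueError as e:
    print(e)  # Field wheels is of type <class 'str'>, should be <class 'int'>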
An alternative to @dataclass is to use pyfields. It provides validation and conversion out of the box, and is done directly at the field level, so you can use fields inside any class without modifying it in any way.
from pyfields import field, init_fields
from valid8.validation_lib import is_in

ALLOWED_COLORS = ('blue', 'yellow', 'brown')

class Car(object):
    """ My class with many fields """
    color: str = field(check_type=True, validators=is_in(ALLOWED_COLORS))
    name: str = field(check_type=True, validators={'should be non-empty': lambda s: len(s) > 0})
    wheels: int = field(check_type=True, validators={'should be positive': lambda x: x > 0})

    @init_fields
    def __init__(self, msg="hello world!"):
        print(msg)

c = Car(color='blue', name='roadie', wheels=3)
c.wheels = 'hello'  # <-- (1) type validation error, see below
c.wheels = 0        # <-- (2) value validation error, see below
yields the following two errors
TypeError: Invalid value type provided for '<...>.Car.wheels'.
Value should be of type <class 'int'>. Instead, received a 'str': 'hello'
and
valid8.entry_points.ValidationError[ValueError]:
Error validating [<...>.Car.wheels=0].
InvalidValue: should be positive.
Function [<lambda>] returned [False] for value 0.
See pyfields documentation for details. I'm the author by the way :)
Dataclasses do not check the data. But I made a small superstructure for dataclasses, and you can use it this way:
import json
from dataclasses import dataclass

from validated_dc import ValidatedDC

@dataclass
class Car(ValidatedDC):
    color: str
    name: str
    wheels: int

# This string was received by the API
data = '{"color": "gray", "name": "Delorean", "wheels": 4}'

# Load this JSON string into a dictionary
data = json.loads(data)

car = Car(**data)
assert car.get_errors() is None

# Let's say the key "color" got the wrong value:
data['color'] = 11111
car = Car(**data)
assert car.get_errors()
print(car.get_errors())
# {
#     'color': [
#         BasicValidationError(
#             value_repr='11111', value_type=<class 'int'>,
#             annotation=<class 'str'>, exception=None
#         )
#     ]
# }

# Fix:
car.color = 'gray'

# is_valid() - starts validation of an already created instance
# (if it returns True, there are no errors)
assert car.is_valid()
assert car.get_errors() is None
ValidatedDC: https://github.com/EvgeniyBurdin/validated_dc
Use pydantic.
In this example, the field password1 is only validated for being a string, while other fields have custom validator functions.
from pydantic import BaseModel, ValidationError, validator

class UserModel(BaseModel):
    name: str
    username: str
    password1: str
    password2: str

    @validator('name')
    def name_must_contain_space(cls, v):
        if ' ' not in v:
            raise ValueError('must contain a space')
        return v.title()

    @validator('password2')
    def passwords_match(cls, v, values, **kwargs):
        if 'password1' in values and v != values['password1']:
            raise ValueError('passwords do not match')
        return v

    @validator('username')
    def username_alphanumeric(cls, v):
        assert v.isalnum(), 'must be alphanumeric'
        return v

user = UserModel(
    name='samuel colvin',
    username='scolvin',
    password1='zxcvbn',
    password2='zxcvbn',
)
print(user)
#> name='Samuel Colvin' username='scolvin' password1='zxcvbn' password2='zxcvbn'

try:
    UserModel(
        name='samuel',
        username='scolvin',
        password1='zxcvbn',
        password2='zxcvbn2',
    )
except ValidationError as e:
    print(e)
    """
    2 validation errors for UserModel
    name
      must contain a space (type=value_error)
    password2
      passwords do not match (type=value_error)
    """
I am trying to use the traitlets library provided by IPython in my code. If a trait is an instance of a particular class, how do I observe a change in the value of a member of that class?
example:
class A:
    def __init__(self, val):
        self.value = val

class myApp(HasTraits):
    myA = Instance(A, kw={'val': 2})
I want an observe method to be called if the 'value' member variable of the object myA is changed. Something like below:
@observe('myA.value')
def onValueChange(self, change):
    return
Is this possible with the traitlets implementation?
In order to observe changes to the value of an instance trait, the class of the instance trait should subclass HasTraits.
traitlets.observe(*names, **kwargs)
A decorator which can be used to observe Traits on a class.
from traitlets import HasTraits, observe, Instance, Int

class A(HasTraits):
    value = Int()

    def __init__(self, val):
        self.value = val

    @observe('value')
    def func(self, change):
        print(change)

class App(HasTraits):
    myA = Instance(klass=A, args=(2,))

app = App()
app.myA.value = 1
{'name': 'value', 'old': 0, 'new': 2, 'owner': <__main__.A object at 0x10b0698d0>, 'type': 'change'}
{'name': 'value', 'old': 2, 'new': 1, 'owner': <__main__.A object at 0x10b0698d0>, 'type': 'change'}
Edit
To keep the change handler in the composing class, you can dynamically set up an observer.
Note that the observed attribute must still be a trait.
In case you don't have access to modify class A to subclass HasTraits, you may be able to compose a class that subclasses HasTraits on the fly using some type magic.
from traitlets import HasTraits, Instance, Int

class A(HasTraits):
    value = Int()

    def __init__(self, val):
        self.value = val

class App(HasTraits):
    myA = Instance(klass=A, args=(2,))

    def __init__(self):
        super().__init__()
        self.set_notifiers()

    def set_notifiers(self):
        # register self.func as an observer of myA's 'value' trait
        HasTraits.observe(self.myA, self.func, 'value')

    def func(self, change):
        print(change)

app = App()
app.myA.value = 1