Rust serde deserialize dynamic trait

Rust serde deserialize dynamic trait - rust

I have a recursive data structure in my pet-project:
(this is a simplified example)
pub trait Condition {
fn validate(&self, s: &str) -> bool;
}
pub struct Equal {
ref_val: String,
}
impl Condition for Equal {
fn validate(&self, s: &str) -> bool { self.ref_val == s }
}
pub struct And<A, B> where A: Condition + ?Sized, B: Condition + ?Sized {
left: Box<A>,
right: Box<B>,
}
impl<A, B> Condition for And<A, B> where A: Condition + ?Sized, B: Condition + ?Sized {
fn validate(&self, s: &str) -> bool { self.left.validate(s) && self.right.validate(s) }
}
and i want to serialize and de-serialize the condition trait (using serde) eg.:
fn main() {
let c = And {
left: Box::new(Equal{ ref_val: "goofy".to_string() }),
right: Box::new(Equal{ ref_val: "goofy".to_string() }),
};
let s = serde_json::to_string(&c).unwrap();
let d: Box<dyn Condition> = serde_json::from_string(&s).unwrap();
}
Because serde cannot deserialize dyn traits out-of-the box, i tagged the serialized markup, eg:
#[derive(PartialEq, Debug, Serialize)]
#[serde(tag="type")]
pub struct Equal {
ref_val: String,
}
and try to implement a Deserializer and a Vistor for Box<dyn Condition>
Since i am new to Rust and because the implementation of a Deserializer and a Visitor is not that straightforward with the given documentation, i wonder if someone has an idea how to solve my issue with an easier approach?
I went through the serde documentation and searched for solution on tech sites/forums.
i tried out typetag but it does not support generic types
UPDATE:
To be more precise: the serialization works fine, ie. serde can serialize any concrete object of the Condition trait, but in order to deserialize a Condition the concrete type information needs to be provided. But this type info is not available at compile time. I am writing a web service where customers can upload rules for context matching (ie. Conditions) so the controller of the web service does not know the type when the condition needs to be deserialized.
eg. a Customer can post:
{"type":"Equal","ref_val":"goofy"}
or
{"type":"Greater","ref_val":"Pluto"}
or more complex with any combinator ('and', 'or', 'not')
{"type":"And","left":{"type":"Greater","ref_val":"Gamma"},"right":{"type":"Equal","ref_val":"Delta"}}
and therefore i need to deserialze to a trait (dyn Condition) using the type tags in the serialized markup...

I would say the “classic” way to solve this problem is by deserializing into an enum with one variant per potential “real” type you're going to deserialize into. Unfortunately, And is generic, which means those generic parameters have to exist on the enum as well, so you have to specify them where you deserialize.
use serde::{Deserialize, Serialize};
use serde_json; // 1.0.91 // 1.0.152
pub trait Condition {
fn validate(&self, s: &str) -> bool;
}
#[derive(PartialEq, Debug, Serialize, Deserialize)]
pub struct Equal {
ref_val: String,
}
impl Condition for Equal {
fn validate(&self, s: &str) -> bool {
self.ref_val == s
}
}
#[derive(PartialEq, Debug, Serialize, Deserialize)]
pub struct And<A, B>
where
A: Condition + ?Sized,
B: Condition + ?Sized,
{
left: Box<A>,
right: Box<B>,
}
impl<A, B> Condition for And<A, B>
where
A: Condition + ?Sized,
B: Condition + ?Sized,
{
fn validate(&self, s: &str) -> bool {
self.left.validate(s) && self.right.validate(s)
}
}
#[derive(Debug, Serialize, Deserialize)]
#[serde(untagged)]
enum Expr<A, B>
where
A: Condition + ?Sized,
B: Condition + ?Sized,
{
Equal(Equal),
And(And<A, B>),
}
fn main() {
let c = And {
left: Box::new(Equal {
ref_val: "goofy".to_string(),
}),
right: Box::new(Equal {
ref_val: "goofy".to_string(),
}),
};
let s = serde_json::to_string(&c).unwrap();
let d: Expr<Equal, Equal> = serde_json::from_str(&s).unwrap();
println!("{d:?}");
}
Prints And(And { left: Equal { ref_val: "goofy" }, right: Equal { ref_val: "goofy" } })

I removed the generics from the combinator conditions, so i can now use typetag like #EvilTak suggested:
#[derive(Serialize, Deserialize)]
#[serde(tag="type")]
pub struct And {
left: Box<dyn Condition>,
right: Box<dyn Condition>,
}
#[typetag::serde]
impl Condition for And {
fn validate(&self, s: &str) -> bool { self.left.validate(s) && self.right.validate(s) }
}
(on the downside, i had to remove the derive macros PartialEq, and Debug)
Interesting side fact: i have to keep the #[serde(tag="type")] on the And Struct because otherwise the typetag will be omitted in the serialization (for the primitive consitions it is not needed)
UPDATE: typetag adds the type tag only for trait objects so the #[serde(tag="type")] is not needed...

Related

Late type in Rust

I'm working with two crates: A and B. I control both. I'd like to create a struct in A that has a field whose type is known only to B (i.e., A is independent of B, but B is dependent on A).
crate_a:
#[derive(Clone)]
pub struct Thing {
pub foo: i32,
pub bar: *const i32,
}
impl Thing {
fn new(x: i32) -> Self {
Thing { foo: x, bar: &0 }
}
}
crate_b:
struct Value {};
fn func1() {
let mut x = A::Thing::new(1);
let y = Value {};
x.bar = &y as *const Value as *const i32;
...
}
fn func2() {
...
let y = unsafe { &*(x.bar as *const Value) };
...
}
This works, but it doesn't feel very "rusty". Is there a cleaner way to do this? I thought about using a trait object, but ran into issues with Clone.
Note: My reason for splitting these out is that the dependencies in B make compilation very slow. Value above is actually from llvm_sys. I'd rather not leak that into A, which has no other dependency on llvm.

The standard way to implement something like this is with generics, which are kind of like type variables: they can be "assigned" a particular type, possibly within some constraints. This is how the standard library can provide types like Vec that work with types that you declare in your crate.
Basically, generics allow Thing to be defined in terms of "some unknown type that will become known later when this type is actually used."
Given the example in your code, it looks like Thing's bar field may or may not be set, which suggests that the built-in Option enum should be used. All you have to do is put a type parameter on Thing and pass that through to Option, like so:
pub mod A {
#[derive(Clone)]
pub struct Thing<T> {
pub foo: i32,
pub bar: Option<T>,
}
impl<T> Thing<T> {
pub fn new(x: i32) -> Self {
Thing { foo: x, bar: None }
}
}
}
pub mod B {
use crate::A;
struct Value;
fn func1() {
let mut x = A::Thing::new(1);
let y = Value;
x.bar = Some(y);
// ...
}
fn func2(x: &A::Thing<Value>) {
// ...
let y: &Value = x.bar.as_ref().unwrap();
// ...
}
}
(Playground)
Here, the x in B::func1() has the type Thing<Value>. You can see with this syntax how Value is substituted for T, which makes the bar field Option<Value>.
If Thing's bar isn't actually supposed to be optional, just write pub bar: T instead, and accept a T in Thing::new() to initialize it:
pub mod A {
#[derive(Clone)]
pub struct Thing<T> {
pub foo: i32,
pub bar: T,
}
impl<T> Thing<T> {
pub fn new(x: i32, y: T) -> Self {
Thing { foo: x, bar: y }
}
}
}
pub mod B {
use crate::A;
struct Value;
fn func1() {
let mut x = A::Thing::new(1, Value);
// ...
}
fn func2(x: &A::Thing<Value>) {
// ...
let y: &Value = &x.bar;
// ...
}
}
(Playground)
Note that the definition of Thing in both of these cases doesn't actually require that T implement Clone; however, Thing<T> will only implement Clone if T also does. #[derive(Clone)] will generate an implementation like:
impl<T> Clone for Thing<T> where T: Clone { /* ... */ }
This can allow your type to be more flexible -- it can now be used in contexts that don't require T to implement Clone, while also being cloneable when T does implement Clone. You get the best of both worlds this way.

How do I pass a type bounded by a trait to Serde's deserialize_with?

A type satisfying MyTrait is supposed to be passed to deserialize_data specified by deserialize_with. Here is my sample code:
use serde::{Deserialize, Deserializer}; // 1.0.117
use serde_json; // 1.0.59
type Item = Result<String, Box<dyn std::error::Error + Send + Sync>>;
pub trait MyTrait {
fn method(ind: &str) -> Item;
}
#[derive(Deserialize)]
pub struct S<T>
where
T: MyTrait + ?Sized, // intend to pass a type T satisfying `MyTrait` to function `deserialize_data`,
{
#[serde(deserialize_with = "deserialize_data")]
//#[serde(bound( deserialize = "T: MyTrait, for<'de2> T: Deserialize<'de2>" ))]
pub data: String,
}
fn deserialize_data<'de, D, T>(d: D) -> Result<String, D::Error>
where
D: Deserializer<'de>,
{
let ind = <&str>::deserialize(d).unwrap();
match T::method(ind) {
Ok(data) => Ok(data),
Err(e) => Err(serde::de::Error::custom(format_args!("invalid type."))),
}
}
struct A;
impl MyTrait for A {
fn method(_ind: &str) -> Item {
// to make it simple, return constant
Ok("method".to_string())
}
}
fn main() {
let s = r#"{"data": "string"}"#;
let ob: S<A> = serde_json::from_str(s).unwrap();
}
The compiler complains:
error[E0392]: parameter `T` is never used
--> src/main.rs:10:14
|
10 | pub struct S<T>
| ^ unused parameter
|
= help: consider removing `T`, referring to it in a field, or using a marker such as `PhantomData`
I do use T, and PhantomData doesn't help much.
One of the obvious way may be using struct A and its implemented method as crate or something then importing them. This, unfortunately, doesn't apply to my case, so I seek to pass a struct type to deserialize_data and achieve that.

To get the code to compile, you need to:
use T in struct S<T>, for example with PhantomData.
explicitly pass T to deserialize_data using the turbofish operator ::<>.
add the appropriate trait bounds to the T generic type in deserialize_data(), such as MyTrait.
For example (playground):
#[derive(Deserialize)]
pub struct S<T>
where
T: MyTrait + ?Sized,
{
#[serde(deserialize_with = "deserialize_data::<_, T>")]
pub data: String,
marker: std::marker::PhantomData<T>,
}
fn deserialize_data<'de, D, T>(d: D) -> Result<String, D::Error>
where
D: Deserializer<'de>,
T: MyTrait + ?Sized,
{
// ...
}

How do I use Serde to serialize a HashMap with structs as keys to JSON?

I want to serialize a HashMap with structs as keys:
use serde::{Deserialize, Serialize}; // 1.0.68
use std::collections::HashMap;
fn main() {
#[derive(Serialize, Deserialize, Debug, PartialEq, Eq, Hash)]
struct Foo {
x: u64,
}
#[derive(Serialize, Deserialize, Debug)]
struct Bar {
x: HashMap<Foo, f64>,
}
let mut p = Bar { x: HashMap::new() };
p.x.insert(Foo { x: 0 }, 0.0);
let serialized = serde_json::to_string(&p).unwrap();
}
This code compiles, but when I run it I get an error:
Error("key must be a string", line: 0, column: 0)'
I changed the code:
#[derive(Serialize, Deserialize, Debug)]
struct Bar {
x: HashMap<u64, f64>,
}
let mut p = Bar { x: HashMap::new() };
p.x.insert(0, 0.0);
let serialized = serde_json::to_string(&p).unwrap();
The key in the HashMap is now a u64 instead of a string. Why does the first code give an error?

You can use serde_as from the serde_with crate to encode the HashMap as a sequence of key-value pairs:
use serde_with::serde_as; // 1.5.1
#[serde_as]
#[derive(Serialize, Deserialize, Debug)]
struct Bar {
#[serde_as(as = "Vec<(_, _)>")]
x: HashMap<Foo, f64>,
}
Which will serialize to (and deserialize from) this:
{
"x":[
[{"x": 0}, 0.0],
[{"x": 1}, 0.0],
[{"x": 2}, 0.0]
]
}
There is likely some overhead from converting the HashMap to Vec, but this can be very convenient.

According to JSONs specification, JSON keys must be strings. serde_json uses fmt::Display in here, for some non-string keys, to allow serialization of wider range of HashMaps. That's why HashMap<u64, f64> works as well as HashMap<String, f64> would. However, not all types are covered (Foo's case here).
That's why we need to provide our own Serialize implementation:
impl Display for Foo {
fn fmt(&self, f: &mut Formatter) -> std::fmt::Result {
write!(f, "{}", self.x)
}
}
impl Serialize for Bar {
fn serialize<S>(&self, serializer: S) -> Result<S::Ok, S::Error>
where
S: Serializer,
{
let mut map = serializer.serialize_map(Some(self.x.len()))?;
for (k, v) in &self.x {
map.serialize_entry(&k.to_string(), &v)?;
}
map.end()
}
}
(playground)

I've found the bulletproof solution 😃
Extra dependencies not required
Compatible with HashMap, BTreeMap and other iterable types
Works with flexbuffers
The following code converts a field (map) to the intermediate Vec representation:
pub mod vectorize {
use serde::{Deserialize, Deserializer, Serialize, Serializer};
use std::iter::FromIterator;
pub fn serialize<'a, T, K, V, S>(target: T, ser: S) -> Result<S::Ok, S::Error>
where
S: Serializer,
T: IntoIterator<Item = (&'a K, &'a V)>,
K: Serialize + 'a,
V: Serialize + 'a,
{
let container: Vec<_> = target.into_iter().collect();
serde::Serialize::serialize(&container, ser)
}
pub fn deserialize<'de, T, K, V, D>(des: D) -> Result<T, D::Error>
where
D: Deserializer<'de>,
T: FromIterator<(K, V)>,
K: Deserialize<'de>,
V: Deserialize<'de>,
{
let container: Vec<_> = serde::Deserialize::deserialize(des)?;
Ok(T::from_iter(container.into_iter()))
}
}
To use it just add the module's name as an attribute:
#[derive(Debug, Serialize, Deserialize)]
struct MyComplexType {
#[serde(with = "vectorize")]
map: HashMap<MyKey, String>,
}
The remained part if you want to check it locally:
use anyhow::Error;
use serde::{Deserialize, Serialize};
use std::collections::HashMap;
#[derive(Debug, Clone, Serialize, Deserialize, PartialEq, Eq, PartialOrd, Ord, Hash)]
struct MyKey {
one: String,
two: u16,
more: Vec<u8>,
}
#[derive(Debug, Serialize, Deserialize)]
struct MyComplexType {
#[serde(with = "vectorize")]
map: HashMap<MyKey, String>,
}
fn main() -> Result<(), Error> {
let key = MyKey {
one: "1".into(),
two: 2,
more: vec![1, 2, 3],
};
let mut map = HashMap::new();
map.insert(key.clone(), "value".into());
let instance = MyComplexType { map };
let serialized = serde_json::to_string(&instance)?;
println!("JSON: {}", serialized);
let deserialized: MyComplexType = serde_json::from_str(&serialized)?;
let expected_value = "value".to_string();
assert_eq!(deserialized.map.get(&key), Some(&expected_value));
Ok(())
}
And on the Rust playground: https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=bf1773b6e501a0ea255ccdf8ce37e74d

While all provided answers will fulfill the goal of serializing your HashMap to json they are ad hoc or hard to maintain.
One correct way to allow a specific data structure to be serialized with serde as keys in a map, is the same way serde handles integer keys in HashMaps (which works): They serialize the value to String. This has a few advantages; namely
Intermediate data-structure omitted,
no need to clone the entire HashMap,
easier maintained by applying OOP concepts, and
serialization usable in more complex structures such as MultiMap.
This can be done by manually implementing Serialize and Deserialize for your data-type.
I use composite ids for maps.
#[derive(Clone, Copy, PartialEq, Eq, Hash, Debug)]
pub struct Proj {
pub value: u64,
}
#[derive(Clone, Copy, PartialEq, Eq, Hash, Debug)]
pub struct Doc {
pub proj: Proj,
pub value: u32,
}
#[derive(Clone, Copy, PartialEq, Eq, Hash, Debug)]
pub struct Sec {
pub doc: Doc,
pub value: u32,
}
So now manually implementing serde serialization for them is kind of a hassle, so instead we delegate the implementation to the FromStr and From<Self> for String (Into<String> blanket) traits.
impl From<Doc> for String {
fn from(val: Doc) -> Self {
format!("{}{:08X}", val.proj, val.value)
}
}
impl FromStr for Doc {
type Err = String;
fn from_str(s: &str) -> Result<Self, Self::Err> {
match parse_doc(s) {
Ok((_, p)) => Ok(p),
Err(e) => Err(e.to_string()),
}
}
}
In order to parse the Doc we make use of nom. The parse functionality below is explained in their examples.
fn is_hex_digit(c: char) -> bool {
c.is_digit(16)
}
fn from_hex8(input: &str) -> Result<u32, std::num::ParseIntError> {
u32::from_str_radix(input, 16)
}
fn parse_hex8(input: &str) -> IResult<&str, u32> {
map_res(take_while_m_n(8, 8, is_hex_digit), from_hex8)(input)
}
fn parse_doc(input: &str) -> IResult<&str, Doc> {
let (input, proj) = parse_proj(input)?;
let (input, value) = parse_hex8(input)?;
Ok((input, Doc { value, proj }))
}
Now we need to hook up self.to_string() and str::parse(&str) to serde we can do this using a simple macro.
macro_rules! serde_str {
($type:ty) => {
impl Serialize for $type {
fn serialize<S>(&self, serializer: S) -> Result<S::Ok, S::Error>
where
S: serde::Serializer,
{
let s: String = self.clone().into();
serializer.serialize_str(&s)
}
}
impl<'de> Deserialize<'de> for $type {
fn deserialize<D>(deserializer: D) -> Result<Self, D::Error>
where
D: serde::Deserializer<'de>,
{
paste! {deserializer.deserialize_string( [<$type Visitor>] {})}
}
}
paste! {struct [<$type Visitor>] {}}
impl<'de> Visitor<'de> for paste! {[<$type Visitor>]} {
type Value = $type;
fn expecting(&self, formatter: &mut std::fmt::Formatter) -> std::fmt::Result {
formatter.write_str("\"")
}
fn visit_str<E>(self, v: &str) -> Result<Self::Value, E>
where
E: serde::de::Error,
{
match str::parse(v) {
Ok(id) => Ok(id),
Err(_) => Err(serde::de::Error::custom("invalid format")),
}
}
}
};
}
Here we are using paste to interpolate the names. Beware that now the struct will always serialize as defined above. Never as a struct, always as a string.
It is important to implement fn visit_str instead of fn visit_string because visit_string defers to visit_str.
Finally, we have to call the macro for our custom structs
serde_str!(Sec);
serde_str!(Doc);
serde_str!(Proj);
Now the specified types can be serialized to and from string with serde.

Derive attribute for specific fields only, like serde does

Using derive syntax, can I implement traits like Hash or PartialEq using specific fields, not all of them?
It could look like:
#[derive(Debug, Hash, Eq, PartialEq)]
struct MyStruct {
id: i32,
name: String,
#[derive(hash_skip, eq_skip)]
aux_data1: f64,
#[derive(hash_skip, eq_skip)]
aux_data2: f64,
#[derive(hash_skip, eq_skip)]
aux_data3: String,
}
I want the hash method to only use id, and name and no others.
The serde library allows something like this for serialization.

No, there is no such feature in Rust at this moment. What I would suggest is to use the implementation for tuples available for these traits, like this:
use std::hash::{Hash, Hasher};
#[derive(Debug)]
struct MyStruct {
id: i32,
name: String,
aux_data1: f64,
aux_data2: f64,
aux_data3: String,
}
impl Hash for MyStruct {
fn hash<H>(&self, state: &mut H) where H: Hasher {
(&self.id, &self.name).hash(state);
}
}
impl PartialEq for MyStruct {
fn eq(&self, other: &Self) -> bool {
(&self.id, &self.name) == (&other.id, &other.name)
}
}
Edit: or as #Shepmaster suggested in a comment below, you can create a key function which returns a tuple of all useful fields and use it.
impl MyStruct {
fn key(&self) -> (&i32, &String) {
(&self.id, &self.name)
}
}
impl Hash for MyStruct {
fn hash<H>(&self, state: &mut H) where H: Hasher {
self.key().hash(state);
}
}
impl PartialEq for MyStruct {
fn eq(&self, other: &Self) -> bool {
self.key() == other.key()
}
}

This is possible by using derivative.
Example:
use derivative::Derivative;
#[derive(Derivative)]
#[derivative(Hash, PartialEq)]
struct Foo {
foo: u8,
#[derivative(Hash="ignore")]
#[derivative(PartialEq="ignore")]
bar: u8,
}

Enforce Eq check on trait

Suppose the following objects:
pub struct MyStruct<T>{
items: Vec<T>
}
pub impl<T> MyStruct<T> {
pub fn new() -> MyStruct {
MyStruct{ items::new(); }
}
pub fn add(&mut self, item: T) where T : Eq {
self.items.push(item);
}
}
pub trait C {}
pub struct Another {
mystruct: MyStruct<Box<C>>
}
pub impl Another {
pub fn exec<C>(&mut self, c: C) where C: Eq + C {
self.mystruct.add(c);
}
}
On exec I'm enforcing that C also is Eq, but I'm receiving the following error:
error: the trait `core::cmp::Eq` is not implemented for the type `C`
I had to make
pub impl<T> MyStruct<T>
instead of
pub impl<T : Eq> MyStruct<T>
because since C is a trait I cant enforce Eq when using MyStruct::new, so I left the check for the type guard on the function. What's happening here?

Let’s look at the relevant definitions for Eq:
pub trait Eq: PartialEq<Self> {
…
}
pub trait PartialEq<Rhs: ?Sized = Self> {
fn eq(&self, other: &Rhs) -> bool;
…
}
Now consider MyStruct<Box<C>>: the type that it wants to implement Eq is Box<C>, a boxed trait object. To implement Eq, Box<C> must first implement PartialEq<Box<C>>, like this:
impl PartialEq for Box<C> {
fn eq(&self, other: &Box<C>) -> bool;
}
impl Eq for Box<C> { }
That is, you must be able to compare an arbitrary Box<C> with any other arbitrary Box<C>. The boxed trait objects that you are comparing could be of different concrete types, here. Thus you need to write this implementation manually. In such cases, you will typically want the trait to include some way of normalising the form of the object into a concrete, comparable type; for some types this is obvious; if AsRef<T> was to have a PartialEq implementation added, the implementation to add would be fairly obvious:
impl<T: PartialEq> PartialEq for AsRef<T> {
fn eq(&self, other: &AsRef<T>) -> bool {
self.as_ref() == other.as_ref()
}
}
(Note also that if C implements PartialEq, Box<C> does, so such implementations as we’re discussing should go on the unboxed trait object, not on the boxed trait object.)
Quite often, however, there is not an obvious and simple implementation. There are a few approaches you could take:
Convert the objects (potentially expensively, though ideally cheaply) into some base type, e.g. a String which can then be compared;
Give up;
Constrain C to 'static types and use some fancy Any magic to make it so that it uses the base type’s PartialEq implementation, returning false if the two trait objects are not of the same concrete type. This one is a fairly useful one, so I’ll give some code for it:
#![feature(core)]
use std::any::{Any, TypeId};
use std::mem;
fn main() { }
trait PartialEqFromC {
fn eq_c(&self, other: &C) -> bool;
}
impl<T: PartialEq + Any + C> PartialEqFromC for T {
fn eq_c(&self, other: &C) -> bool {
if other.get_type_id() == TypeId::of::<Self>() {
self == unsafe { *mem::transmute::<&&C, &&Self>(&other) }
} else {
false
}
}
}
trait C: Any + PartialEqFromC {
}
impl PartialEq for C {
fn eq(&self, other: &C) -> bool {
self.eq_c(other)
}
}
Note that this example depends on the unstable feature core for Any.get_type_id and is thus tied to nightly only; this can be worked around by duplicating that definition from the Any trait into a new supertrait of C and could also be simplified by mopafying the C trait.

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string

Rust serde deserialize dynamic trait - rust

Related

Late type in Rust

How do I pass a type bounded by a trait to Serde's deserialize_with?

How do I use Serde to serialize a HashMap with structs as keys to JSON?

Derive attribute for specific fields only, like serde does

Enforce Eq check on trait

Categories

Resources