How to both implement deserialize and derive it - rust

I have a struct Foo which I want to be serialised as a single two-part string in JSON, e.g. "01abcdef:42", but as normal in bincode.
(I need it to be serialized normally in bincode for size reasons. In some cases Bar or Baz are large arrays of bytes which take up more than twice the space in hex.)
My current code does just what I want:
pub struct Foo {
pub bar: Bar,
pub baz: Baz
}
impl<'de> ::serde::Deserialize<'de> for Foo {
fn deserialize<D: ::serde::Deserializer<'de>>(d: D) -> Result<Foo, D::Error> {
use ::serde::de::Error;
use core::str::FromStr;
if d.is_human_readable() {
let sl: &str = ::serde::Deserialize::deserialize(d)?;
Foo::from_str(sl).map_err(D::Error::custom)
} else {
let clone: FooClone = FooClone::deserialize(d)?;
Ok(Foo { bar: clone.bar, baz: clone.baz })
}
}
}
#[derive(Deserialize)]
pub struct FooClone {
pub bar: Bar,
pub baz: Baz
}
I need to manually maintain FooClone as an identical copy of Foo.
I have read this, but that approach has significantly more code to maintain than this struct clone.
How can I both manually implement Deserialize (to handle the JSON two-part string) and yet derive Deserialize for the same struct (to eliminate FooClone)?

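Something like this should work. You still use the derive to generate a deserialize function, but because it is a remote derive, the type does not implement Deserialize. Instead it gains an inherent deserialize function, which you can call inside the manual Deserialize implementation. Inside that impl, Foo::deserialize(d) resolves to the inherent function rather than recursing into the trait method, because inherent associated functions take precedence over trait ones.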
#[derive(serde::Deserialize)]
#[serde(remote = "Self")]
pub struct Foo {
pub bar: Bar,
pub baz: Baz,
}
impl<'de> ::serde::Deserialize<'de> for Foo {
fn deserialize<D: ::serde::Deserializer<'de>>(d: D) -> Result<Foo, D::Error> {
use ::serde::de::Error;
use core::str::FromStr;
if d.is_human_readable() {
let sl: &str = ::serde::Deserialize::deserialize(d)?;
Foo::from_str(sl).map_err(D::Error::custom)
} else {
Foo::deserialize(d)
}
}
}
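Both snippets call Foo::from_str, which the question does not show. A minimal sketch of what it might look like, assuming (hypothetically) that Bar wraps hex-encoded bytes and Baz wraps a u32; the names ParseFooError and the field types are illustrative and not from the original post:

```rust
use std::fmt;
use std::str::FromStr;

// Hypothetical field types: the question does not define Bar or Baz.
#[derive(Debug, PartialEq)]
pub struct Bar(pub Vec<u8>);
#[derive(Debug, PartialEq)]
pub struct Baz(pub u32);

#[derive(Debug, PartialEq)]
pub struct Foo {
    pub bar: Bar,
    pub baz: Baz,
}

// Hypothetical error type; it implements Display so it can be
// passed to `D::Error::custom` in the Deserialize impl.
#[derive(Debug, PartialEq)]
pub struct ParseFooError(String);

impl fmt::Display for ParseFooError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "invalid Foo: {}", self.0)
    }
}

impl FromStr for Foo {
    type Err = ParseFooError;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        // Split "<hex bytes>:<decimal>" at the first colon.
        let (hex, num) = s
            .split_once(':')
            .ok_or_else(|| ParseFooError("missing ':'".into()))?;
        // ASCII check keeps the byte-index slicing below on char boundaries.
        if !hex.is_ascii() || hex.len() % 2 != 0 {
            return Err(ParseFooError("bad hex length".into()));
        }
        // Decode two hex digits per byte.
        let bytes = (0..hex.len())
            .step_by(2)
            .map(|i| u8::from_str_radix(&hex[i..i + 2], 16))
            .collect::<Result<Vec<u8>, _>>()
            .map_err(|e| ParseFooError(e.to_string()))?;
        let baz = num
            .parse::<u32>()
            .map_err(|e| ParseFooError(e.to_string()))?;
        Ok(Foo { bar: Bar(bytes), baz: Baz(baz) })
    }
}

fn main() {
    let foo: Foo = "01abcdef:42".parse().unwrap();
    println!("{:?}", foo);
}
```

The error type only needs to implement Display for map_err(D::Error::custom) to accept it.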

Related

How to serialize a type that might be an arbitrary string?

I have an enum type that is defined as either one of list of predefined strings or an arbitrary value (i.e. code that uses this type potentially wants to handle a few specific cases a certain way and also allow an arbitrary string).
I'm trying to represent this in Rust with serde the following way:
#[derive(Serialize, Debug)]
pub enum InvalidatedAreas {
#[serde(rename = "all")]
All,
#[serde(rename = "stacks")]
Stacks,
#[serde(rename = "threads")]
Threads,
#[serde(rename = "variables")]
Variables,
String(String),
}
When used as a member, I would like to serialize the above enum as simply a string value:
#[derive(Serialize, Debug)]
struct FooBar {
foo: InvalidatedAreas,
bar: InvalidatedAreas,
}
fn main() -> Result<(), serde_json::Error> {
let foob = FooBar {
foo: InvalidatedAreas::Stacks,
bar: InvalidatedAreas::String("hello".to_string()),
};
let j = serde_json::to_string(&foob)?;
println!("{}", j);
Ok(())
}
What I get is:
{"foo":"stacks","bar":{"String":"hello"}}
But I need
{"foo":"stacks","bar":"hello"}
If I add #[serde(untagged)] to the enum definition, I get
{"foo":null,"bar":"hello"}
How can I serialize this correctly?
I've arrived at the following solution. It requires a bit of repetition, but it's not so bad. I'll leave the question open in case someone has a better idea.
impl ToString for InvalidatedAreas {
fn to_string(&self) -> String {
match &self {
InvalidatedAreas::All => "all",
InvalidatedAreas::Stacks => "stacks",
InvalidatedAreas::Threads => "threads",
InvalidatedAreas::Variables => "variables",
InvalidatedAreas::String(other) => other
}
.to_string()
}
}
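// Note: drop Serialize from the enum's #[derive(...)] list, otherwise it conflicts with this manual impl.
impl Serialize for InvalidatedAreas {
fn serialize<S>(&self, serializer: S) -> Result<S::Ok, S::Error>
where
S: serde::Serializer,
{
serializer.serialize_str(&self.to_string())
}
}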
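As a variation on the solution above, the string conversion can be written as a Display impl instead of a direct ToString impl; the standard library then provides to_string() automatically. A minimal sketch using only std (the serde part is left out here):

```rust
use std::fmt;

#[derive(Debug)]
pub enum InvalidatedAreas {
    All,
    Stacks,
    Threads,
    Variables,
    String(String),
}

impl fmt::Display for InvalidatedAreas {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Known variants map to fixed names; the catch-all prints its payload.
        let s = match self {
            InvalidatedAreas::All => "all",
            InvalidatedAreas::Stacks => "stacks",
            InvalidatedAreas::Threads => "threads",
            InvalidatedAreas::Variables => "variables",
            InvalidatedAreas::String(other) => other.as_str(),
        };
        f.write_str(s)
    }
}

fn main() {
    println!("{}", InvalidatedAreas::Stacks);
    println!("{}", InvalidatedAreas::String("hello".to_string()));
}
```

With Display in place, the Serialize impl could call serializer.collect_str(self) instead of building an intermediate String, though serialize_str(&self.to_string()) keeps working unchanged.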

Late type in Rust

I'm working with two crates: A and B. I control both. I'd like to create a struct in A that has a field whose type is known only to B (i.e., A is independent of B, but B is dependent on A).
crate_a:
#[derive(Clone)]
pub struct Thing {
pub foo: i32,
pub bar: *const i32,
}
impl Thing {
pub fn new(x: i32) -> Self {
Thing { foo: x, bar: &0 }
}
}
crate_b:
struct Value {}
fn func1() {
let mut x = A::Thing::new(1);
let y = Value {};
x.bar = &y as *const Value as *const i32;
...
}
fn func2() {
...
let y = unsafe { &*(x.bar as *const Value) };
...
}
This works, but it doesn't feel very "rusty". Is there a cleaner way to do this? I thought about using a trait object, but ran into issues with Clone.
Note: My reason for splitting these out is that the dependencies in B make compilation very slow. Value above is actually from llvm_sys. I'd rather not leak that into A, which has no other dependency on llvm.
The standard way to implement something like this is with generics, which are kind of like type variables: they can be "assigned" a particular type, possibly within some constraints. This is how the standard library can provide types like Vec that work with types that you declare in your crate.
Basically, generics allow Thing to be defined in terms of "some unknown type that will become known later when this type is actually used."
Given the example in your code, it looks like Thing's bar field may or may not be set, which suggests that the built-in Option enum should be used. All you have to do is put a type parameter on Thing and pass that through to Option, like so:
pub mod A {
#[derive(Clone)]
pub struct Thing<T> {
pub foo: i32,
pub bar: Option<T>,
}
impl<T> Thing<T> {
pub fn new(x: i32) -> Self {
Thing { foo: x, bar: None }
}
}
}
pub mod B {
use crate::A;
struct Value;
fn func1() {
let mut x = A::Thing::new(1);
let y = Value;
x.bar = Some(y);
// ...
}
fn func2(x: &A::Thing<Value>) {
// ...
let y: &Value = x.bar.as_ref().unwrap();
// ...
}
}
Here, the x in B::func1() has the type Thing<Value>. You can see with this syntax how Value is substituted for T, which makes the bar field Option<Value>.
If Thing's bar isn't actually supposed to be optional, just write pub bar: T instead, and accept a T in Thing::new() to initialize it:
pub mod A {
#[derive(Clone)]
pub struct Thing<T> {
pub foo: i32,
pub bar: T,
}
impl<T> Thing<T> {
pub fn new(x: i32, y: T) -> Self {
Thing { foo: x, bar: y }
}
}
}
pub mod B {
use crate::A;
struct Value;
fn func1() {
let mut x = A::Thing::new(1, Value);
// ...
}
fn func2(x: &A::Thing<Value>) {
// ...
let y: &Value = &x.bar;
// ...
}
}
Note that the definition of Thing in both of these cases doesn't actually require that T implement Clone; however, Thing<T> will only implement Clone if T also does. #[derive(Clone)] will generate an implementation like:
impl<T> Clone for Thing<T> where T: Clone { /* ... */ }
This can allow your type to be more flexible -- it can now be used in contexts that don't require T to implement Clone, while also being cloneable when T does implement Clone. You get the best of both worlds this way.
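To make the conditional Clone concrete, here is a small sketch. Handle is a hypothetical stand-in for a non-Clone type from crate B; it does not appear in the original question:

```rust
#[derive(Clone, Debug)]
pub struct Thing<T> {
    pub foo: i32,
    pub bar: Option<T>,
}

impl<T> Thing<T> {
    pub fn new(x: i32) -> Self {
        Thing { foo: x, bar: None }
    }
}

// Hypothetical type that deliberately does not implement Clone.
pub struct Handle(pub u32);

fn main() {
    // String is Clone, so Thing<String> is Clone as well.
    let mut a: Thing<String> = Thing::new(1);
    a.bar = Some("value".to_string());
    let b = a.clone();
    println!("{:?}", b);

    // Handle is not Clone; Thing<Handle> is still usable, just not cloneable.
    let mut c: Thing<Handle> = Thing::new(2);
    c.bar = Some(Handle(7));
    // let d = c.clone(); // would not compile: Handle does not implement Clone
    println!("{:?}", c.bar.as_ref().map(|h| h.0));
}
```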

How can I make all the fields of structs publicly readable while enforcing the use of a "new" constructor

Many structs need to enforce the use of a constructor for object creation, but I want to have public read access to all of the fields.
I need to access several levels deep with bish.bash.bosh.wibble.wobble - bish.get_bash().get_bosh().get_wibble().get_wobble() is not somewhere I want to go, for readability and possibly performance reasons.
This horrible kludge is what I'm using:
#[derive(Debug)]
pub struct Foo {
pub bar: u8,
pub baz: u16,
dummy: bool,
}
impl Foo {
pub fn new(bar: u8, baz: u16) -> Foo {
Foo {bar, baz, dummy: true}
}
}
This is obviously wasting a small amount of space, and dummy is causing inconvenience elsewhere.
How should I do this?
Thanks to @hellow, I now have a working solution:
use serde::{Serialize, Deserialize}; // 1.0.115
#[derive(Serialize, Deserialize, Debug)]
pub struct Foo {
pub bar: u8,
pub baz: u16,
#[serde(skip)]
_private: (),
}
impl Foo {
pub fn new(bar: u8, baz: u16) -> Foo {
Foo {bar, baz, _private: ()}
}
}
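The effect of the private unit field is easiest to see across a module boundary. A small sketch without serde, where the module name shapes is made up for illustration:

```rust
mod shapes {
    #[derive(Debug)]
    pub struct Foo {
        pub bar: u8,
        pub baz: u16,
        // Private zero-sized field: code outside this module cannot
        // write a struct literal for Foo, so it has to call new().
        _private: (),
    }

    impl Foo {
        pub fn new(bar: u8, baz: u16) -> Foo {
            Foo { bar, baz, _private: () }
        }
    }
}

fn main() {
    let foo = shapes::Foo::new(1, 2);
    // Public fields stay directly readable.
    println!("{} {}", foo.bar, foo.baz);
    // let bad = shapes::Foo { bar: 1, baz: 2 }; // would not compile outside `shapes`
    // The unit type itself occupies no space.
    println!("{}", std::mem::size_of::<()>());
}
```

Note that pub fields are also writable through a mutable binding; this pattern only forces construction through new().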

Updating public fields of Rust structs which have private fields

I have a struct Foo which represents an external serialization format. Foo has dozens of fields, and more are added all the time. Happily, all new fields are guaranteed to have sensible default values.
Rust has a nice syntax for creating a struct using default values and then updating a few selected values:
Foo {
bar: true,
..Default::default()
}
Similarly, we can represent the idea of "this struct may have more fields in a future version" using a private field of type PhantomData.
But if we combine these two idioms, we get an error:
use std::default::Default;
mod F {
use std::default::Default;
use std::marker::PhantomData;
pub struct Foo {
pub bar: bool,
phantom: PhantomData<()>,
}
impl Default for Foo {
fn default() -> Foo {
Foo {
bar: false,
phantom: PhantomData,
}
}
}
}
fn main() {
F::Foo {
bar: true,
..Default::default()
};
}
This gives us the error:
error: field `phantom` of struct `F::Foo` is private [--explain E0451]
--> <anon>:23:5
|>
23 |> F::Foo {
|> ^
Logically, I would argue that this should work, because we're only updating public fields, and it would be a useful idiom. The alternative is to support something like:
Foo::new()
.set_bar(true)
...which will get tedious with dozens of fields.
How can I work around this problem?
The default field syntax doesn't work because you're still creating a new instance (even if you're trying to take some of the field values from another object).
You wrote that the alternative is to support something like:
Foo::new()
.set_bar(true)
"...which will get tedious with dozens of fields."
I'm not sure that even with many fields, this:
Foo::new()
.set_bar(true)
.set_foo(17)
.set_splat("Boing")
is significantly more tedious than:
Foo {
bar: true,
foo: 17,
splat: "Boing",
..Foo::default()
}
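For reference, chainable setters like the ones in the first form might be sketched as follows. The field types and the private reserved field are assumptions for illustration; only the method names come from the snippet above:

```rust
#[derive(Debug, Default)]
pub struct Foo {
    pub bar: bool,
    pub foo: i32,
    pub splat: String,
    // Private placeholder standing in for "fields added in a future version".
    reserved: (),
}

impl Foo {
    pub fn new() -> Self {
        Foo::default()
    }

    // Each setter takes `self` by value and returns it, so calls can be chained.
    pub fn set_bar(mut self, bar: bool) -> Self {
        self.bar = bar;
        self
    }

    pub fn set_foo(mut self, foo: i32) -> Self {
        self.foo = foo;
        self
    }

    pub fn set_splat(mut self, splat: &str) -> Self {
        self.splat = splat.to_string();
        self
    }
}

fn main() {
    let f = Foo::new().set_bar(true).set_foo(17).set_splat("Boing");
    println!("{:?}", f);
}
```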
Alternatively, you could separate out the public fields into their own type:
pub struct FooPub {
pub bar: bool,
// other pub fields
}
pub struct Foo {
pub bar: bool,
// other pub fields
// alternatively, group them: pub public: FooPub,
foo: u64,
}
impl Foo {
pub fn new(init: FooPub) -> Foo {
Foo {
bar: init.bar,
// other pub fields
// alternative: public: init
// private fields
foo: 17u64,
}
}
}
You'd then call it as:
Foo::new(FooPub{ bar: true })
or add a fn FooPub::default() to let you default some of the fields:
Foo::new(FooPub{ bar: true, ..FooPub::default()})
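Put together, the grouped variant with a defaulted FooPub could look like the sketch below. It uses #[derive(Default)] rather than a hand-written default(), and the count field and secret() accessor are hypothetical additions to make the example observable:

```rust
#[derive(Debug, Default)]
pub struct FooPub {
    pub bar: bool,
    // Hypothetical second public field, to show the `..` update syntax.
    pub count: u32,
}

#[derive(Debug)]
pub struct Foo {
    // All public fields grouped in one value.
    pub public: FooPub,
    // Private field, initialized only by the constructor.
    secret: u64,
}

impl Foo {
    pub fn new(init: FooPub) -> Foo {
        Foo { public: init, secret: 17 }
    }

    pub fn secret(&self) -> u64 {
        self.secret
    }
}

fn main() {
    // Struct update syntax works on FooPub because all of its fields are public.
    let foo = Foo::new(FooPub { bar: true, ..FooPub::default() });
    println!("{:?}", foo);
}
```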
Rename phantom to __phantom, make it public and #[doc(hidden)].
use std::default::Default;
mod foo {
use std::default::Default;
use std::marker::PhantomData;
pub struct Foo {
pub bar: bool,
// We make this public but hide it from the docs, making
// it private by convention. If you use this, your
// program may break even when semver otherwise says it
// shouldn't.
#[doc(hidden)]
pub __phantom: PhantomData<()>,
}
impl Default for Foo {
fn default() -> Foo {
Foo {
bar: false,
__phantom: PhantomData,
}
}
}
}
fn main() {
foo::Foo {
bar: true,
..Default::default()
};
}
This is a fairly common pattern. One example is std::io::ErrorKind::__Nonexhaustive, a hidden variant in older versions of the standard library.
Sure, users won't have any warning or anything if they choose to use a __named field anyway, but the __ makes the intent pretty clear. If a warning is required, #[deprecated] could be used.
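A sketch of the #[deprecated] idea, assuming the attribute is placed directly on the hidden field; the note text is made up. Code that names __phantom explicitly should then get a deprecation lint, while struct update syntax does not name the field:

```rust
mod foo {
    use std::marker::PhantomData;

    pub struct Foo {
        pub bar: bool,
        // Public for struct update syntax, hidden from docs, and marked
        // deprecated so that explicit use is flagged by the compiler.
        #[doc(hidden)]
        #[deprecated(note = "not part of the public API; use `..Default::default()`")]
        pub __phantom: PhantomData<()>,
    }

    impl Default for Foo {
        // The defining module has to name the field, so silence the lint here.
        #[allow(deprecated)]
        fn default() -> Foo {
            Foo {
                bar: false,
                __phantom: PhantomData,
            }
        }
    }
}

fn main() {
    let f = foo::Foo {
        bar: true,
        ..Default::default()
    };
    println!("{}", f.bar);
}
```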

Add value of a method to serde serialization output

Is there a way to add the value of a method to the serialization output of serde when the struct derives Serialize? I'm looking for something like a "virtual field".
I know I can define my own Serializer / Visitor or use serde_json::builder to get a Value, I just wanted to check first if there was any way to do this using serde_macro magic.
To be clear I want something like this:
#[derive(Serialize, Deserialize, Debug)]
struct Foo {
bar: String,
#[serde(call="Foo::baz")]
baz: i32 // but this is not a real field
}
impl Foo {
fn baz(&self) -> i32 { self.bar.len() as i32 }
}
Here is what I am using now. It's still verbose, and I don't know if it is the best way to handle this, but I thought I would add it here for the record:
#[derive(Deserialize, Debug)]
struct Foo {
bar: String
}
impl Foo {
fn baz(&self) -> i32 { self.bar.len() as i32 }
}
impl ::serde::Serialize for Foo {
fn serialize<S>(&self, serializer: S) -> Result<S::Ok, S::Error>
where
S: serde::Serializer {
#[derive(Serialize)]
struct Extended<'a> {
bar: &'a String,
baz: i32
}
let ext = Extended {
bar: &self.bar,
baz: self.baz()
};
ext.serialize(serializer)
}
}
