How to just use custom serialisation for "stringy" serialisation? - rust

I've recently got to grips with custom serialisation/deserialisation: https://stackoverflow.com/a/63846824/129805
I want to use this custom "stringy" serialisation (and deserialisation) only for JSON and RON, while using #[derive(Serialize, Deserialize)] for all the binary serialisations, such as bincode. (Inflating a two-byte (100, 200) to seven or more bytes of "100:200" is pointlessly wasteful.)
I need to do this within a single executable, as server/server comms will be bincode or protobufs, while client/server comms will be JSON.
Both server/server and client/server comms will use the same serialisable structs. i.e. I want a single set of structs for all comms, but they should use custom serialisation for JSON/RON but derived serialisation for bin/protobufs.
How can I do this?
Update:
Here is working code with tests which pass:
use serde::{Serialize, Serializer, Deserialize, Deserializer};
use serde::de::{self, Visitor, Unexpected};
use std::fmt;
use std::str::FromStr;
use regex::Regex;
#[derive(Serialize, Deserialize, Debug, PartialEq, Eq, PartialOrd, Ord)]
struct DerivedIncline {
rise: u8,
distance: u8,
}
impl DerivedIncline {
pub fn new(rise: u8, distance: u8) -> DerivedIncline {
DerivedIncline {rise, distance}
}
}
#[derive(Debug, PartialEq, Eq, PartialOrd, Ord)]
struct StringyIncline {
rise: u8,
distance: u8,
}
impl StringyIncline {
pub fn new(rise: u8, distance: u8) -> StringyIncline {
StringyIncline {rise, distance}
}
}
impl Serialize for StringyIncline {
fn serialize<S>(&self, serializer: S) -> Result<S::Ok, S::Error>
where
S: Serializer,
{
serializer.serialize_str(&format!("{}:{}", self.rise, self.distance))
}
}
struct StringyInclineVisitor;
impl<'de> Visitor<'de> for StringyInclineVisitor {
type Value = StringyIncline;
fn expecting(&self, formatter: &mut fmt::Formatter) -> fmt::Result {
formatter.write_str("a colon-separated pair of integers between 0 and 255")
}
fn visit_str<E>(self, s: &str) -> Result<Self::Value, E>
where
E: de::Error,
{
let re = Regex::new(r"(\d+):(\d+)").unwrap(); // PERF: move this into a lazy_static!
if let Some(nums) = re.captures_iter(s).next() {
if let Ok(rise) = u8::from_str(&nums[1]) { // nums[0] is the whole match, so we must skip that
if let Ok(distance) = u8::from_str(&nums[2]) {
Ok(StringyIncline::new(rise, distance))
} else {
Err(de::Error::invalid_value(Unexpected::Str(s), &self))
}
} else {
Err(de::Error::invalid_value(Unexpected::Str(s), &self))
}
} else {
Err(de::Error::invalid_value(Unexpected::Str(s), &self))
}
}
}
impl<'de> Deserialize<'de> for StringyIncline {
fn deserialize<D>(deserializer: D) -> Result<StringyIncline, D::Error>
where
D: Deserializer<'de>,
{
deserializer.deserialize_string(StringyInclineVisitor)
}
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn serialisation() {
let stringy_incline = StringyIncline::new(4, 3);
let derived_incline = DerivedIncline::new(4, 3);
let json = serde_json::to_string(&stringy_incline).unwrap();
assert_eq!(json, "\"4:3\"");
let bin = bincode::serialize(&derived_incline).unwrap();
assert_eq!(bin, [4u8, 3u8]);
}
#[test]
fn deserialisation() {
let json = "\"4:3\"";
let bin = [4u8, 3u8];
let deserialised_json: StringyIncline = serde_json::from_str(&json).unwrap();
let deserialised_bin: DerivedIncline = bincode::deserialize(&bin).unwrap();
assert_eq!(deserialised_json, StringyIncline::new(4, 3));
assert_eq!(deserialised_bin, DerivedIncline::new(4, 3));
}
}
I want to have a single Incline struct which acts like StringyIncline when serialised to JSON and like DerivedIncline when serialised to bincode.
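A side note on the parsing in visit_str above: the regex (\d+):(\d+) is not anchored, so as far as I can tell it also accepts input with extra text around the pair, such as "x4:3y". Below is a minimal std-only sketch of a stricter parser. The helper name parse_incline is hypothetical; it could be called from visit_str in place of the regex.

```rust
/// Parses "rise:distance", where both parts must be valid `u8` values
/// according to `u8::from_str`.
/// Returns `None` for any malformed input: missing colon, surrounding text,
/// a second colon, or a number outside 0..=255.
fn parse_incline(s: &str) -> Option<(u8, u8)> {
    // Split at the first colon only; anything left over must parse as a u8.
    let (rise, distance) = s.split_once(':')?;
    Some((rise.parse().ok()?, distance.parse().ok()?))
}
```

Because the whole input on each side of the colon has to parse, stray characters are rejected without any regex dependency.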

If you're using nightly and are willing to turn on the specialization feature, you can write a function that tells you whether the generic parameter S is a serde_json::Serializer:
trait IsJsonSerializer {
fn is_json_serializer() -> bool;
}
impl<T> IsJsonSerializer for T {
default fn is_json_serializer() -> bool {
false
}
}
impl<W,F> IsJsonSerializer for &mut serde_json::Serializer<W,F> {
fn is_json_serializer() -> bool {
true
}
}
Then you can write if S::is_json_serializer() {...}. Using this your serialization function can be written:
#[derive(Serialize, Deserialize, PartialEq, Eq, Debug)]
struct RawIncline {
rise: u8,
distance: u8,
}
impl Serialize for Incline {
fn serialize<S>(&self, serializer: S) -> Result<S::Ok, S::Error>
where
S: Serializer,
{
if S::is_json_serializer() {
serializer.serialize_str(&format!("{}:{}", self.rise, self.distance))
} else {
RawIncline{rise:self.rise, distance:self.distance}.serialize(serializer)
}
}
}
You can then do something similar for deserialization.
I can't think of a way to get something like this to work without the specialization feature, so it is limited to nightly for now, but I'd love to see if it is possible somehow.
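For what it's worth, serde exposes a hook for this on stable: the Serializer and Deserializer traits each have an is_human_readable method. It defaults to true, and binary formats can override it to return false (bincode does, as far as I know, but check the version you use). Inside serialize the branch would read `if serializer.is_human_readable() { ... } else { ... }`. The sketch below is a std-only model of that branch, not serde code: the encode function and its human_readable flag are hypothetical stand-ins for the serializer.

```rust
struct Incline {
    rise: u8,
    distance: u8,
}

/// Hypothetical stand-in for `Serialize::serialize`: `human_readable` plays
/// the role of the value returned by serde's `Serializer::is_human_readable`.
fn encode(incline: &Incline, human_readable: bool) -> Vec<u8> {
    if human_readable {
        // Text formats (JSON, RON): the "rise:distance" string.
        format!("{}:{}", incline.rise, incline.distance).into_bytes()
    } else {
        // Compact binary formats: just the two raw bytes.
        vec![incline.rise, incline.distance]
    }
}
```

For (100, 200) the human-readable branch yields the seven bytes of "100:200" and the binary branch yields two bytes, which is the size difference the question is concerned about.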

Related

Serialize / Deserialize a struct that can be represented as an array of bytes

I am working with a struct that looks more or less like this:
struct MyStruct {
// Various fields, most of which do not implement `Serialize` / `Deserialize`
}
struct MyError; // Yes, this implements `std::error::Error`
impl MyStruct {
fn from_bytes(bytes: [u8; 96]) -> Result<MyStruct, MyError> {
// Creates a `MyStruct` from an array of exactly 96 bytes.
// It returns a `MyError` upon failure.
}
fn to_bytes(&self) -> [u8; 96] {
// Serializes `MyStruct` into an array of exactly 96 bytes.
}
}
Now, I would like to have MyStruct implement serde's Serialize and Deserialize. Intuition tells me it should be simple (I literally have functions that already serialize and deserialize MyStruct), but after hours of confused trial and error I'm stuck.
What I would like to have is MyStruct implement Serialize and Deserialize and, should I call bincode::serialize(my_struct), I would like it to be represented in exactly 96 bytes (i.e., I would like to avoid paying the cost of a pointless, 8-byte header that always says "what follows is a sequence of 96 bytes": I already know that I need 96 bytes to represent MyStruct!).
The first part of your question can be accomplished as follows:
use serde::{de::Error, Deserialize, Deserializer, Serialize, Serializer};
use serde_big_array::BigArray;
use std::fmt::Display; // needed for `impl Display for MyError` below
#[derive(Serialize, Deserialize)]
struct Wrap {
#[serde(with = "BigArray")]
arr: [u8; 96],
}
struct MyStruct {}
struct MyError;
impl Display for MyError {
fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
todo!()
}
}
impl MyStruct {
fn from_bytes(bytes: [u8; 96]) -> Result<MyStruct, MyError> {
todo!()
}
fn to_bytes(&self) -> [u8; 96] {
todo!()
}
}
impl serde::Serialize for MyStruct {
fn serialize<S>(&self, serializer: S) -> Result<S::Ok, S::Error>
where
S: serde::Serializer,
{
Wrap {
arr: self.to_bytes(),
}
.serialize(serializer)
}
}
impl<'de> serde::Deserialize<'de> for MyStruct {
fn deserialize<D>(deserializer: D) -> Result<Self, D::Error>
where
D: serde::Deserializer<'de>,
{
let bytes = <Wrap>::deserialize(deserializer)?;
Self::from_bytes(bytes.arr).map_err(D::Error::custom)
}
}
The second part is tough and depends on the format you are using with serde.

How to use "flatten" like thing in custom Serialize and Deserialize

I need to use custom implementations of Serialize and Deserialize, but I could not figure out how to do what #[serde(flatten)] does. Does anyone know?
Note: I know I could completely rewrite the full implementation of the lower elements into the higher one, but the lower elements implement Serialize (and Deserialize), so I am searching for a way to add that to something like serialize_struct.
#[derive(Debug, Serialize, Deserialize, PartialEq)]
struct Nested {
somefield2: String,
}
#[derive(Debug, PartialEq)]
struct TopLevel {
somefield1: usize,
nested: Nested,
}
impl Serialize for TopLevel {
fn serialize<S>(&self, serializer: S) -> Result<S::Ok, S::Error>
where
S: Serializer,
{
// How to do this properly?
let mut do_struct = serializer.serialize_struct("Named", 2)?;
do_struct.serialize_field("somefield1", &self.somefield1)?;
// how to add everything from "self.nested" as the same level as this one?
// JSON example: { somefield1: 0, somefield2: 0 }
return do_struct.end();
}
}
impl<'de> Deserialize<'de> for TopLevel {
fn deserialize<D>(deserializer: D) -> Result<Self, D::Error>
where
D: Deserializer<'de>,
{
// Same question as in "Serialize", how to do this properly in here?
// Here is currently no example code, because i try to figure out "Serialize" first
todo!();
}
}
Versions used:
serde: 1.0.133
rust: 1.58.1
Note: I have already read Implementing Serialize and tried to search issues / Stack Overflow, but could not find anything related to this.
You could try doing something like this for Serialize and something similar for Deserialize:
struct FakeStructFlatteningSerializer<'a, SS: SerializeStruct>(&'a mut SS);
impl<'a, SS: SerializeStruct> Serializer for FakeStructFlatteningSerializer<'a, SS> {
type Ok = ();
type Error = SS::Error;
type SerializeStruct = FakeStructFlatteningSerializeStruct<'a, SS>;
// return Impossible for everything else
fn serialize_struct(self, name: &'static str, len: usize) -> Result<Self::SerializeStruct, Self::Error> {
// ignore name and len!
Ok(FakeStructFlatteningSerializeStruct(self.0))
}
}
struct FakeStructFlatteningSerializeStruct<'a, SS: SerializeStruct>(&'a mut SS);
impl<'a, SS: SerializeStruct> SerializeStruct for FakeStructFlatteningSerializeStruct<'a, SS> {
type Ok = ();
type Error = SS::Error;
fn serialize_field<T: Serialize + ?Sized>(&mut self, key: &'static str, value: &T) -> Result<(), Self::Error> {
self.0.serialize_field(key, value)
}
fn skip_field(&mut self, key: &'static str) -> Result<(), Self::Error> {
self.0.skip_field(key)
}
fn end(self) -> Result<Self::Ok, Self::Error> {
// ignore!
Ok(())
}
}
impl Serialize for TopLevel {
fn serialize<S>(&self, serializer: S) -> Result<S::Ok, S::Error>
where
S: Serializer,
{
// len needs to include the flattened fields: 1 own field + 1 from `nested`
let mut do_struct = serializer.serialize_struct("Named", 2)?;
do_struct.serialize_field("somefield1", &self.somefield1)?;
// propagate errors from the nested serializer instead of discarding them
self.nested.serialize(FakeStructFlatteningSerializer(&mut do_struct))?;
return do_struct.end();
}
}
You could alternatively try to figure out how Serde does it; this might be where: https://github.com/serde-rs/serde/blob/dc0c0dcba17dd8732cd8721a7ef556afcb04c6c0/serde_derive/src/ser.rs#L953-L1037, https://github.com/serde-rs/serde/blob/fb2fe409c8f7ad6c95e3096e5e9ede865c8cfb49/serde_derive/src/de.rs#L2560-L2578

How can I implement serde for a type that I don't own and have it support compound /wrapper/collection types

This question is similar to "How do I implement a trait I don't own for a type I don't own?"
I wrote a serializer for Date, using the mechanism described in the documentation, with my module wrapping a serialize function:
pub mod my_date_format {
use chrono::{Date, NaiveDate, Utc};
use serde::{self, Deserialize, Deserializer, Serializer};
const SERIALIZE_FORMAT: &'static str = "%Y-%m-%d";
pub fn serialize<S>(date: &Date<Utc>, serializer: S) -> Result<S::Ok, S::Error>
where
S: Serializer,
{
let s = format!("{}", date.format(SERIALIZE_FORMAT));
serializer.serialize_str(&s)
}
pub fn deserialize<'de, D>(deserializer: D) -> Result<Date<Utc>, D::Error>
where
D: Deserializer<'de>,
{
let s = String::deserialize(deserializer)?;
NaiveDate::parse_from_str(s.as_str(), SERIALIZE_FORMAT)
.map_err(serde::de::Error::custom)
.map(|x| {
let now = Utc::now();
let date: Date<Utc> = Date::from_utc(x, now.offset().clone());
date
})
}
}
then I can do:
struct MyStruct {
#[serde(with = "my_date_format")]
pub start: Date<Utc>,
}
Problem is if I wrap the serialized thing in other types (which are serializable themselves) I get errors:
#[serde(with = "my_date_format")]
pub dates: Vec<Date<Utc>>, // this won't work now since my function doesn't serialize vectors
pub maybe_date: Option<Date<Utc>>, // won't work
pub box_date: Box<Date<Utc>>, // won't work...
How can I gain the implementations provided while using my own serializer?
https://docs.serde.rs/serde/ser/index.html#implementations-of-serialize-provided-by-serde
The most straightforward way is to do what the question you linked to describes, i.e. create a new type that wraps Date<Utc>, and implement Serialize and Deserialize for that type.
#[derive(PartialOrd, Ord, PartialEq, Eq, Clone, Debug)]
struct FormattedDate(Date<Utc>);
impl Serialize for FormattedDate {
fn serialize<S>(&self, serializer: S) -> Result<S::Ok, S::Error>
where
S: Serializer,
{
// If you implement `Deref`, then you don't need to add `.0`
let s = format!("{}", self.0.format(SERIALIZE_FORMAT));
serializer.serialize_str(&s)
}
}
impl<'de> Deserialize<'de> for FormattedDate {
fn deserialize<D>(deserializer: D) -> Result<Self, D::Error>
where
D: Deserializer<'de>,
{
let s = String::deserialize(deserializer)?;
NaiveDate::parse_from_str(s.as_str(), SERIALIZE_FORMAT)
.map_err(serde::de::Error::custom)
.map(|x| {
let now = Utc::now();
let date: Date<Utc> = Date::from_utc(x, now.offset().clone());
Self(date)
// or
// date.into()
})
}
}
To make life easier, you can implement Deref and DerefMut, so that a FormattedDate can be used transparently as if it were a Date<Utc>.
use std::ops::{Deref, DerefMut};
impl Deref for FormattedDate {
type Target = Date<Utc>;
fn deref(&self) -> &Self::Target {
&self.0
}
}
impl DerefMut for FormattedDate {
fn deref_mut(&mut self) -> &mut Self::Target {
&mut self.0
}
}
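Similarly, you can implement From in both directions (which gives you Into for free), so that you can easily convert between FormattedDate and Date<Utc>.
impl From<Date<Utc>> for FormattedDate {
fn from(date: Date<Utc>) -> Self {
Self(date)
}
}
// Implementing `From` rather than `Into` is the idiomatic direction;
// the standard library then provides the matching `Into` impl.
impl From<FormattedDate> for Date<Utc> {
fn from(formatted: FormattedDate) -> Self {
formatted.0
}
}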
Now all the examples you gave work with little friction:
#[derive(Serialize, Deserialize, Debug)]
struct MyStruct {
date: FormattedDate,
dates: Vec<FormattedDate>,
opt_date: Option<FormattedDate>,
boxed_date: Box<FormattedDate>,
}
fn main() {
let s = MyStruct {
date: Utc::now().date().into(),
dates: std::iter::repeat(Utc::now().date().into()).take(4).collect(),
opt_date: Some(Utc::now().date().into()),
boxed_date: Box::new(Utc::now().date().into()),
};
let json = serde_json::to_string_pretty(&s).unwrap();
println!("{}", json);
}
Which outputs:
{
"date": "2020-12-13",
"dates": [
"2020-12-13",
"2020-12-13",
"2020-12-13",
"2020-12-13"
],
"opt_date": "2020-12-13",
"boxed_date": "2020-12-13"
}
Instead of relying on wrapper types it is possible to achieve the same results with the serde_as macro from the serde_with crate.
It works like serde's with attribute but also supports wrapper and collection types.
Since you already have a module to use with serde's with, the hard part is already done.
You can find the details in the crate documentation.
You only need to add a local type and two boilerplate implementations for the traits SerializeAs and DeserializeAs to use your custom transformations.
use chrono::{Date, NaiveDate, Utc};
struct MyDateFormat;
impl serde_with::SerializeAs<Date<Utc>> for MyDateFormat {
fn serialize_as<S>(value: &Date<Utc>, serializer: S) -> Result<S::Ok, S::Error>
where
S: serde::Serializer,
{
my_date_format::serialize(value, serializer)
}
}
impl<'de> serde_with::DeserializeAs<'de, Date<Utc>> for MyDateFormat {
fn deserialize_as<D>(deserializer: D) -> Result<Date<Utc>, D::Error>
where
D: serde::Deserializer<'de>,
{
my_date_format::deserialize(deserializer)
}
}
#[serde_with::serde_as]
#[derive(Serialize, Deserialize, Debug)]
struct MyStruct {
#[serde_as(as = "MyDateFormat")]
date: Date<Utc>,
#[serde_as(as = "Vec<MyDateFormat>")]
dates: Vec<Date<Utc>>,
#[serde_as(as = "Option<MyDateFormat>")]
opt_date: Option<Date<Utc>>,
#[serde_as(as = "Box<MyDateFormat>")]
boxed_date: Box<Date<Utc>>,
}
fn main() {
let s = MyStruct {
date: Utc::now().date().into(),
dates: std::iter::repeat(Utc::now().date().into()).take(4).collect(),
opt_date: Some(Utc::now().date().into()),
boxed_date: Box::new(Utc::now().date().into()),
};
let json = serde_json::to_string_pretty(&s).unwrap();
println!("{}", json);
}
// This module is taken unmodified from the question
pub mod my_date_format {
use chrono::{Date, NaiveDate, Utc};
use serde::{self, Deserialize, Deserializer, Serializer};
const SERIALIZE_FORMAT: &'static str = "%Y-%m-%d";
pub fn serialize<S>(date: &Date<Utc>, serializer: S) -> Result<S::Ok, S::Error>
where
S: Serializer,
{
let s = format!("{}", date.format(SERIALIZE_FORMAT));
serializer.serialize_str(&s)
}
pub fn deserialize<'de, D>(deserializer: D) -> Result<Date<Utc>, D::Error>
where
D: Deserializer<'de>,
{
let s = String::deserialize(deserializer)?;
NaiveDate::parse_from_str(s.as_str(), SERIALIZE_FORMAT)
.map_err(serde::de::Error::custom)
.map(|x| {
let now = Utc::now();
let date: Date<Utc> = Date::from_utc(x, now.offset().clone());
date
})
}
}

How do I use Serde to serialize a HashMap with structs as keys to JSON?

I want to serialize a HashMap with structs as keys:
use serde::{Deserialize, Serialize}; // 1.0.68
use std::collections::HashMap;
fn main() {
#[derive(Serialize, Deserialize, Debug, PartialEq, Eq, Hash)]
struct Foo {
x: u64,
}
#[derive(Serialize, Deserialize, Debug)]
struct Bar {
x: HashMap<Foo, f64>,
}
let mut p = Bar { x: HashMap::new() };
p.x.insert(Foo { x: 0 }, 0.0);
let serialized = serde_json::to_string(&p).unwrap();
}
This code compiles, but when I run it I get an error:
Error("key must be a string", line: 0, column: 0)'
I changed the code:
#[derive(Serialize, Deserialize, Debug)]
struct Bar {
x: HashMap<u64, f64>,
}
let mut p = Bar { x: HashMap::new() };
p.x.insert(0, 0.0);
let serialized = serde_json::to_string(&p).unwrap();
The key in the HashMap is now a u64 instead of a string. Why does the first code give an error?
You can use serde_as from the serde_with crate to encode the HashMap as a sequence of key-value pairs:
use serde_with::serde_as; // 1.5.1
#[serde_as]
#[derive(Serialize, Deserialize, Debug)]
struct Bar {
#[serde_as(as = "Vec<(_, _)>")]
x: HashMap<Foo, f64>,
}
Which will serialize to (and deserialize from) this:
{
"x":[
[{"x": 0}, 0.0],
[{"x": 1}, 0.0],
[{"x": 2}, 0.0]
]
}
There is likely some overhead from converting the HashMap to Vec, but this can be very convenient.
According to the JSON specification, JSON keys must be strings. serde_json uses fmt::Display here for some non-string keys, to allow serialization of a wider range of HashMaps. That's why HashMap<u64, f64> works, just as HashMap<String, f64> would. However, not all types are covered (Foo's case here).
That's why we need to provide our own Serialize implementation:
use serde::ser::{Serialize, SerializeMap, Serializer};
use std::fmt::{Display, Formatter};
impl Display for Foo {
fn fmt(&self, f: &mut Formatter) -> std::fmt::Result {
write!(f, "{}", self.x)
}
}
impl Serialize for Bar {
fn serialize<S>(&self, serializer: S) -> Result<S::Ok, S::Error>
where
S: Serializer,
{
let mut map = serializer.serialize_map(Some(self.x.len()))?;
for (k, v) in &self.x {
map.serialize_entry(&k.to_string(), &v)?;
}
map.end()
}
}
(playground)
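To see the idea in isolation, here is a std-only sketch (no serde involved) that turns the struct keys into the string keys JSON requires, using Display. The function name stringify_keys is made up for illustration. One assumption worth stating: two distinct keys must never produce the same Display output, otherwise entries silently overwrite each other.

```rust
use std::collections::{BTreeMap, HashMap};
use std::fmt;

#[derive(PartialEq, Eq, Hash)]
struct Foo {
    x: u64,
}

impl fmt::Display for Foo {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}", self.x)
    }
}

/// Re-keys the map by each key's `Display` output.
/// A `BTreeMap` is used only to get a deterministic key order.
fn stringify_keys(map: &HashMap<Foo, f64>) -> BTreeMap<String, f64> {
    map.iter().map(|(k, v)| (k.to_string(), *v)).collect()
}
```

A map built this way has only String keys, so it can be handed to a JSON serializer without the "key must be a string" error.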
I've found the bulletproof solution 😃
Extra dependencies not required
Compatible with HashMap, BTreeMap and other iterable types
Works with flexbuffers
The following code converts a field (map) to the intermediate Vec representation:
pub mod vectorize {
use serde::{Deserialize, Deserializer, Serialize, Serializer};
use std::iter::FromIterator;
pub fn serialize<'a, T, K, V, S>(target: T, ser: S) -> Result<S::Ok, S::Error>
where
S: Serializer,
T: IntoIterator<Item = (&'a K, &'a V)>,
K: Serialize + 'a,
V: Serialize + 'a,
{
let container: Vec<_> = target.into_iter().collect();
serde::Serialize::serialize(&container, ser)
}
pub fn deserialize<'de, T, K, V, D>(des: D) -> Result<T, D::Error>
where
D: Deserializer<'de>,
T: FromIterator<(K, V)>,
K: Deserialize<'de>,
V: Deserialize<'de>,
{
let container: Vec<_> = serde::Deserialize::deserialize(des)?;
Ok(T::from_iter(container.into_iter()))
}
}
To use it just add the module's name as an attribute:
#[derive(Debug, Serialize, Deserialize)]
struct MyComplexType {
#[serde(with = "vectorize")]
map: HashMap<MyKey, String>,
}
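Stripped of serde, the vectorize module boils down to two conversions: any iterable of (&K, &V) becomes a Vec of pairs, and a Vec of owned pairs is collected back into any FromIterator container. A small std-only sketch of that round trip follows; the helper names to_pairs and from_pairs are hypothetical.

```rust
use std::iter::FromIterator;

/// Collects borrowed entries of any map-like iterable into a Vec of pairs.
fn to_pairs<'a, T, K, V>(target: T) -> Vec<(&'a K, &'a V)>
where
    T: IntoIterator<Item = (&'a K, &'a V)>,
    K: 'a,
    V: 'a,
{
    target.into_iter().collect()
}

/// Rebuilds any container that implements `FromIterator<(K, V)>`
/// (HashMap, BTreeMap, ...) from owned pairs.
fn from_pairs<T, K, V>(pairs: Vec<(K, V)>) -> T
where
    T: FromIterator<(K, V)>,
{
    pairs.into_iter().collect()
}
```

Since only IntoIterator and FromIterator are required, the same two functions cover HashMap, BTreeMap and similar containers, which mirrors the compatibility claim above.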
The remaining part, if you want to check it locally:
use anyhow::Error;
use serde::{Deserialize, Serialize};
use std::collections::HashMap;
#[derive(Debug, Clone, Serialize, Deserialize, PartialEq, Eq, PartialOrd, Ord, Hash)]
struct MyKey {
one: String,
two: u16,
more: Vec<u8>,
}
#[derive(Debug, Serialize, Deserialize)]
struct MyComplexType {
#[serde(with = "vectorize")]
map: HashMap<MyKey, String>,
}
fn main() -> Result<(), Error> {
let key = MyKey {
one: "1".into(),
two: 2,
more: vec![1, 2, 3],
};
let mut map = HashMap::new();
map.insert(key.clone(), "value".into());
let instance = MyComplexType { map };
let serialized = serde_json::to_string(&instance)?;
println!("JSON: {}", serialized);
let deserialized: MyComplexType = serde_json::from_str(&serialized)?;
let expected_value = "value".to_string();
assert_eq!(deserialized.map.get(&key), Some(&expected_value));
Ok(())
}
And on the Rust playground: https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=bf1773b6e501a0ea255ccdf8ce37e74d
While all the provided answers fulfil the goal of serializing your HashMap to JSON, they are ad hoc or hard to maintain.
One correct way to allow a specific data structure to be serialized with serde as a key in a map is the same way serde handles integer keys in HashMaps (which works): serialize the value to a String. This has a few advantages, namely:
Intermediate data-structure omitted,
no need to clone the entire HashMap,
easier maintained by applying OOP concepts, and
serialization usable in more complex structures such as MultiMap.
This can be done by manually implementing Serialize and Deserialize for your data-type.
I use composite ids for maps.
#[derive(Clone, Copy, PartialEq, Eq, Hash, Debug)]
pub struct Proj {
pub value: u64,
}
#[derive(Clone, Copy, PartialEq, Eq, Hash, Debug)]
pub struct Doc {
pub proj: Proj,
pub value: u32,
}
#[derive(Clone, Copy, PartialEq, Eq, Hash, Debug)]
pub struct Sec {
pub doc: Doc,
pub value: u32,
}
So now manually implementing serde serialization for them is kind of a hassle, so instead we delegate the implementation to the FromStr and From<Self> for String (Into<String> blanket) traits.
impl From<Doc> for String {
fn from(val: Doc) -> Self {
format!("{}{:08X}", val.proj, val.value)
}
}
impl FromStr for Doc {
type Err = String;
fn from_str(s: &str) -> Result<Self, Self::Err> {
match parse_doc(s) {
Ok((_, p)) => Ok(p),
Err(e) => Err(e.to_string()),
}
}
}
In order to parse the Doc we make use of nom. The parse functionality below is explained in their examples.
fn is_hex_digit(c: char) -> bool {
c.is_digit(16)
}
fn from_hex8(input: &str) -> Result<u32, std::num::ParseIntError> {
u32::from_str_radix(input, 16)
}
fn parse_hex8(input: &str) -> IResult<&str, u32> {
map_res(take_while_m_n(8, 8, is_hex_digit), from_hex8)(input)
}
fn parse_doc(input: &str) -> IResult<&str, Doc> {
let (input, proj) = parse_proj(input)?;
let (input, value) = parse_hex8(input)?;
Ok((input, Doc { value, proj }))
}
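If you would rather not pull in nom for this, the same round trip can be sketched with the standard library alone. This is a swapped-in technique, not the code above: the original does not show how Proj is rendered or parsed, so the layout here (16 hex digits for Proj followed by 8 hex digits for Doc) is an assumption made for illustration.

```rust
use std::str::FromStr;

#[derive(Clone, Copy, PartialEq, Eq, Hash, Debug)]
pub struct Proj {
    pub value: u64,
}

#[derive(Clone, Copy, PartialEq, Eq, Hash, Debug)]
pub struct Doc {
    pub proj: Proj,
    pub value: u32,
}

impl From<Doc> for String {
    fn from(d: Doc) -> String {
        // Assumed layout: 16 hex digits of Proj, then 8 hex digits of Doc.
        format!("{:016X}{:08X}", d.proj.value, d.value)
    }
}

impl FromStr for Doc {
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        // Require exactly 24 ASCII hex digits so the split below is safe.
        if s.len() != 24 || !s.bytes().all(|b| b.is_ascii_hexdigit()) {
            return Err(format!("expected 24 hex digits, got {:?}", s));
        }
        let (proj, value) = s.split_at(16);
        Ok(Doc {
            proj: Proj {
                value: u64::from_str_radix(proj, 16).map_err(|e| e.to_string())?,
            },
            value: u32::from_str_radix(value, 16).map_err(|e| e.to_string())?,
        })
    }
}
```

Fixed-width fields keep the parser trivial; a variable-width format would need a separator or a real parser such as the nom version above.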
Now we need to hook up the String conversion (self.clone().into()) and str::parse(&str) to serde. We can do this using a simple macro.
macro_rules! serde_str {
($type:ty) => {
impl Serialize for $type {
fn serialize<S>(&self, serializer: S) -> Result<S::Ok, S::Error>
where
S: serde::Serializer,
{
let s: String = self.clone().into();
serializer.serialize_str(&s)
}
}
impl<'de> Deserialize<'de> for $type {
fn deserialize<D>(deserializer: D) -> Result<Self, D::Error>
where
D: serde::Deserializer<'de>,
{
paste! {deserializer.deserialize_string( [<$type Visitor>] {})}
}
}
paste! {struct [<$type Visitor>] {}}
impl<'de> Visitor<'de> for paste! {[<$type Visitor>]} {
type Value = $type;
fn expecting(&self, formatter: &mut std::fmt::Formatter) -> std::fmt::Result {
formatter.write_str("\"")
}
fn visit_str<E>(self, v: &str) -> Result<Self::Value, E>
where
E: serde::de::Error,
{
match str::parse(v) {
Ok(id) => Ok(id),
Err(_) => Err(serde::de::Error::custom("invalid format")),
}
}
}
};
}
Here we are using paste to interpolate the names. Beware that now the struct will always serialize as defined above. Never as a struct, always as a string.
It is important to implement fn visit_str instead of fn visit_string because visit_string defers to visit_str.
Finally, we have to call the macro for our custom structs:
serde_str!(Sec);
serde_str!(Doc);
serde_str!(Proj);
Now the specified types can be serialized to and from string with serde.

Deserialize a JSON string or array of strings into a Vec

I'm writing a crate that interfaces with a JSON web API. One endpoint usually returns responses of the form { "key": ["value1", "value2"] }, but sometimes there's only one value for the key, and the endpoint returns { "key": "value" } instead of { "key": ["value"] }
I wanted to write something generic for this that I could use with #[serde(deserialize_with)] like so:
#[derive(Deserialize)]
struct SomeStruct {
#[serde(deserialize_with = "deserialize_string_or_seq_string")]
field1: Vec<SomeStringNewType>,
#[serde(deserialize_with = "deserialize_string_or_seq_string")]
field2: Vec<SomeTypeWithCustomDeserializeFromStr>,
}
#[derive(Deserialize)]
struct SomeStringNewType(String);
struct SomeTypeWithCustomDeserializeFromStr(String);
impl ::serde::de::Deserialize for SomeTypeWithCustomDeserializeFromStr {
// Some custom implementation here
}
How can I write a deserialize_string_or_seq_string to be able to do this?
In case you want to deserialize a single string or a list of strings into the more general Vec<String> instead of a custom type, the following is a simpler solution for Serde 1.0:
extern crate serde;
#[macro_use] extern crate serde_derive;
extern crate serde_json;
use std::fmt;
use std::marker::PhantomData;
use serde::de;
use serde::de::{Deserialize, Deserializer};
#[derive(Deserialize, Debug, Clone)]
pub struct Parent {
#[serde(deserialize_with = "string_or_seq_string")]
pub strings: Vec<String>,
}
fn main() {
let list_of_strings: Parent = serde_json::from_str(r#"{ "strings": ["value1", "value2"] }"#).unwrap();
println!("list of strings: {:?}", list_of_strings);
// Prints:
// list of strings: Parent { strings: ["value1", "value2"] }
let single_string: Parent = serde_json::from_str(r#"{ "strings": "value" }"#).unwrap();
println!("single string: {:?}", single_string);
// Prints:
// single string: Parent { strings: ["value"] }
}
fn string_or_seq_string<'de, D>(deserializer: D) -> Result<Vec<String>, D::Error>
where D: Deserializer<'de>
{
struct StringOrVec(PhantomData<Vec<String>>);
impl<'de> de::Visitor<'de> for StringOrVec {
type Value = Vec<String>;
fn expecting(&self, formatter: &mut fmt::Formatter) -> fmt::Result {
formatter.write_str("string or list of strings")
}
fn visit_str<E>(self, value: &str) -> Result<Self::Value, E>
where E: de::Error
{
Ok(vec![value.to_owned()])
}
fn visit_seq<S>(self, visitor: S) -> Result<Self::Value, S::Error>
where S: de::SeqAccess<'de>
{
Deserialize::deserialize(de::value::SeqAccessDeserializer::new(visitor))
}
}
deserializer.deserialize_any(StringOrVec(PhantomData))
}
This solution also works under the 0.9 release of Serde with the following changes:
remove the lifetimes
SeqAccess -> SeqVisitor
SeqAccessDeserializer -> SeqVisitorDeserializer
MapAccess -> MapVisitor
MapAccessDeserializer -> MapVisitorDeserializer
This solution works for Serde 1.0.
The way I found also required me to write a custom deserializer, because I needed one that would call visitor.visit_newtype_struct to try deserializing newtypes, and there don't seem to be any built into serde that do so. (I was expecting something like the ValueDeserializer series of types.)
A self-contained example is below. The SomeStruct is deserialized correctly for both inputs, one where the values are JSON arrays of strings, and the other where they're just strings.
#[macro_use]
extern crate serde;
#[macro_use]
extern crate serde_derive;
extern crate serde_json;
fn main() {
#[derive(Debug, Deserialize)]
struct SomeStringNewType(String);
#[derive(Debug)]
struct SomeTypeWithCustomDeserializeFromStr(String);
impl<'de> ::serde::Deserialize<'de> for SomeTypeWithCustomDeserializeFromStr {
fn deserialize<D>(deserializer: D) -> Result<Self, D::Error> where D: ::serde::Deserializer<'de> {
struct Visitor;
impl<'de> ::serde::de::Visitor<'de> for Visitor {
type Value = SomeTypeWithCustomDeserializeFromStr;
fn expecting(&self, f: &mut ::std::fmt::Formatter) -> ::std::fmt::Result {
write!(f, "a string")
}
fn visit_str<E>(self, v: &str) -> Result<Self::Value, E> where E: ::serde::de::Error {
Ok(SomeTypeWithCustomDeserializeFromStr(v.to_string() + " custom"))
}
}
deserializer.deserialize_any(Visitor)
}
}
#[derive(Debug, Deserialize)]
struct SomeStruct {
#[serde(deserialize_with = "deserialize_string_or_seq_string")]
field1: Vec<SomeStringNewType>,
#[serde(deserialize_with = "deserialize_string_or_seq_string")]
field2: Vec<SomeTypeWithCustomDeserializeFromStr>,
}
let x: SomeStruct = ::serde_json::from_str(r#"{ "field1": ["a"], "field2": ["b"] }"#).unwrap();
println!("{:?}", x);
assert_eq!(x.field1[0].0, "a");
assert_eq!(x.field2[0].0, "b custom");
let x: SomeStruct = ::serde_json::from_str(r#"{ "field1": "c", "field2": "d" }"#).unwrap();
println!("{:?}", x);
assert_eq!(x.field1[0].0, "c");
assert_eq!(x.field2[0].0, "d custom");
}
/// Deserializes a string or a sequence of strings into a vector of the target type.
pub fn deserialize_string_or_seq_string<'de, T, D>(deserializer: D) -> Result<Vec<T>, D::Error>
where T: ::serde::Deserialize<'de>, D: ::serde::Deserializer<'de> {
struct Visitor<T>(::std::marker::PhantomData<T>);
impl<'de, T> ::serde::de::Visitor<'de> for Visitor<T>
where T: ::serde::Deserialize<'de> {
type Value = Vec<T>;
fn expecting(&self, f: &mut ::std::fmt::Formatter) -> ::std::fmt::Result {
write!(f, "a string or sequence of strings")
}
fn visit_str<E>(self, v: &str) -> Result<Self::Value, E>
where E: ::serde::de::Error {
let value = {
// Try parsing as a newtype
let deserializer = StringNewTypeStructDeserializer(v, ::std::marker::PhantomData);
::serde::Deserialize::deserialize(deserializer)
}.or_else(|_: E| {
// Try parsing as a str
let deserializer = ::serde::de::IntoDeserializer::into_deserializer(v);
::serde::Deserialize::deserialize(deserializer)
})?;
Ok(vec![value])
}
fn visit_seq<A>(self, visitor: A) -> Result<Self::Value, A::Error>
where A: ::serde::de::SeqAccess<'de> {
::serde::Deserialize::deserialize(::serde::de::value::SeqAccessDeserializer::new(visitor))
}
}
deserializer.deserialize_any(Visitor(::std::marker::PhantomData))
}
// Tries to deserialize the given string as a newtype
struct StringNewTypeStructDeserializer<'a, E>(&'a str, ::std::marker::PhantomData<E>);
impl<'de, 'a, E> ::serde::Deserializer<'de> for StringNewTypeStructDeserializer<'a, E> where E: ::serde::de::Error {
type Error = E;
fn deserialize_any<V>(self, visitor: V) -> Result<V::Value, Self::Error> where V: ::serde::de::Visitor<'de> {
visitor.visit_newtype_struct(self)
}
fn deserialize_string<V>(self, visitor: V) -> Result<V::Value, Self::Error> where V: ::serde::de::Visitor<'de> {
// Called by newtype visitor
visitor.visit_str(self.0)
}
forward_to_deserialize_any! {
bool i8 i16 i32 i64 u8 u16 u32 u64 f32 f64 char str bytes
byte_buf option unit unit_struct newtype_struct seq tuple tuple_struct map
struct enum identifier ignored_any
}
}
I found this pattern to work for me in a similar situation:
use serde::{Deserialize, Serialize};
#[derive(Debug, Serialize, Deserialize)]
#[serde(untagged)]
enum ParameterValue {
Primitive(String),
List(Vec<String>),
}
#[derive(Debug, Serialize, Deserialize)]
struct Parameter {
name: String,
value: ParameterValue,
}
Example with a primitive value:
let primitive = Parameter {
name: String::from("theKey"),
value: ParameterValue::Primitive(String::from("theValue")),
};
let primitive_serialized = serde_json::to_string(&primitive).unwrap();
println!("{primitive_serialized}");
let primitive_again: Parameter = serde_json::from_str(&primitive_serialized).unwrap();
println!("{primitive_again:?}");
Prints:
{"name":"theKey","value":"theValue"}
Parameter { name: "theKey", value: Primitive("theValue") }
Example with an array:
let list = Parameter {
name: String::from("theKey"),
value: ParameterValue::List(vec![String::from("v1"), String::from("v2")]),
};
let list_serialized = serde_json::to_string(&list).unwrap();
println!("{list_serialized}");
let list_again: Parameter = serde_json::from_str(&list_serialized).unwrap();
println!("{list_again:?}");
Prints:
{"name":"theKey","value":["v1","v2"]}
Parameter { name: "theKey", value: List(["v1", "v2"]) }
