How to deserialize JSON into a structure with a Box<[u8]> value? - rust

I have a struct that needs to include a "bytes" field, and I'm trying to deserialize it from JSON.
When I use &'a [u8], this works, but then I need to add a lifetime annotation to this struct, and the struct that encloses it, and so on.
I thought I'd get around it by having the bytes "owned" and use an enclosing Box, but that didn't work. I'm trying to figure out why not, or whether there's a way (either with some serde annotations, or a custom helper for this field, or something else) to get this to work.
More concretely, this works:
struct Foo<'a> {
some_field: Option<String>,
field_of_interest: &'a [u8],
}
And this does not:
struct Foo {
some_field: Option<String>,
field_of_interest: Box<[u8]>,
}
In both cases, I'm calling it as:
let my_foo: Foo = serde_json::from_slice(...);
I encountered the same issue when replacing Box<[u8]> with Vec<u8>.
Edit with solution:
As @lpiepiora pointed out below, this needs an additional wrapper. Something like the following, which is provided by the serde_bytes crate:
#[cfg(any(feature = "std", feature = "alloc"))]
impl<'de> Deserialize<'de> for Box<[u8]> {
fn deserialize<D>(deserializer: D) -> Result<Self, D::Error>
where
D: Deserializer<'de>,
{
Deserialize::deserialize(deserializer).map(Vec::into_boxed_slice)
}
}
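The conversion that impl relies on, Vec::into_boxed_slice, can be tried with the standard library alone. The following is only a small sketch of that one step, with no serde involved; the helper name to_boxed_bytes is made up for illustration.

```rust
/// Hypothetical helper: turns an owned byte vector into a boxed slice.
/// Any spare capacity in the Vec is dropped by the conversion.
fn to_boxed_bytes(bytes: Vec<u8>) -> Box<[u8]> {
    bytes.into_boxed_slice()
}

fn main() {
    let mut owned: Vec<u8> = Vec::with_capacity(16);
    owned.extend_from_slice(b"aaaa");

    let boxed = to_boxed_bytes(owned);
    assert_eq!(&boxed[..], &[97u8, 97, 97, 97][..]);

    // A Box<[u8]> converts back into a Vec<u8> without copying the bytes.
    let back: Vec<u8> = boxed.into_vec();
    assert_eq!(back.len(), 4);
}
```

Because Box<[u8]> owns its bytes, a struct holding it needs no lifetime parameter, which is the property the question was after.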

Assuming that you're trying to deserialize a JSON string into a Vec<u8>, you can use the serde_bytes crate.
For example:
use serde::Deserialize;
#[derive(Deserialize, Debug)]
struct Foo {
a: Option<String>,
#[serde(with = "serde_bytes")]
b: Vec<u8>
}
fn main() {
let x = b"{ \"a\": \"a-value\", \"b\": \"aaaaaaaa\" }";
let my_foo: Foo = serde_json::from_slice(x).unwrap();
println!("{:?}", my_foo);
}
This would print: Foo { a: Some("a-value"), b: [97, 97, 97, 97, 97, 97, 97, 97] }.
Without the serde_bytes attribute, Vec<u8> expects a JSON array of numbers (for example [97, 97]) rather than a string.

Related

Serialize / Deserialize a struct that can be represented as an array of bytes

I am working with a struct that looks more or less like this:
struct MyStruct {
// Various fields, most of which do not implement `Serialize` / `Deserialize`
}
struct MyError; // Yes, this implements `std::error::Error`
impl MyStruct {
fn from_bytes(bytes: [u8; 96]) -> Result<MyStruct, MyError> {
// Creates a `MyStruct` from an array of exactly 96 bytes.
// It returns a `MyError` upon failure.
}
fn to_bytes(&self) -> [u8; 96] {
// Serializes `MyStruct` into an array of exactly 96 bytes.
}
}
Now, I would like to have MyStruct implement serde's Serialize and Deserialize. Intuition tells me it should be simple (I literally have functions that already serialize and deserialize MyStruct), but after hours of confused trial and error I'm stuck.
What I would like to have is MyStruct implement Serialize and Deserialize and, should I call bincode::serialize(my_struct), I would like it to be represented in exactly 96 bytes (i.e., I would like to avoid paying the cost of a pointless, 8-byte header that always says "what follows is a sequence of 96 bytes": I already know that I need 96 bytes to represent MyStruct!).
The first part of your question can be accomplished as follows:
use serde::{de::Error, Deserialize, Deserializer, Serialize, Serializer};
use serde_big_array::BigArray;
use std::fmt::Display;
#[derive(Serialize, Deserialize)]
struct Wrap {
#[serde(with = "BigArray")]
arr: [u8; 96],
}
struct MyStruct {}
struct MyError;
impl Display for MyError {
fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
todo!()
}
}
impl MyStruct {
fn from_bytes(bytes: [u8; 96]) -> Result<MyStruct, MyError> {
todo!()
}
fn to_bytes(&self) -> [u8; 96] {
todo!()
}
}
impl serde::Serialize for MyStruct {
fn serialize<S>(&self, serializer: S) -> Result<S::Ok, S::Error>
where
S: serde::Serializer,
{
Wrap {
arr: self.to_bytes(),
}
.serialize(serializer)
}
}
impl<'de> serde::Deserialize<'de> for MyStruct {
fn deserialize<D>(deserializer: D) -> Result<Self, D::Error>
where
D: serde::Deserializer<'de>,
{
let bytes = <Wrap>::deserialize(deserializer)?;
Self::from_bytes(bytes.arr).map_err(D::Error::custom)
}
}
The second part is tough and depends on the format you are using with serde.

How can I reinterpret a RefMut<T> as a RefMut<U>

I have a collection containing elements wrapped in RefCell that I borrow. I have a wrapper struct to make the api usable like so:
pub struct RefWrapper<'a, T> {
inner: RefMut<'a, T>,
}
impl<'a, T> Deref for RefWrapper<'a, T> {
type Target = T;
fn deref(&self) -> &T { &self.inner }
}
impl<'a, T> DerefMut for RefWrapper<'a, T> {
fn deref_mut(&mut self) -> &mut T { &mut self.inner }
}
Foo and Bar have the same memory layout, so I can safely transmute between references to them. However, RefMut is not repr(C) and transmuting it would be unsound for other reasons, so I can't safely transmute RefWrapper<Foo> into RefWrapper<Bar>. Is there any way to convert from RefWrapper<Foo> to RefWrapper<Bar>?
If you can get a &mut U from a &mut T, then you can use RefMut::map.
Here is a code example:
use std::cell::{RefCell, RefMut};
#[repr(C)]
#[derive(Debug)]
struct Foo {
a: i32,
b: i32,
}
#[repr(C)]
#[derive(Debug)]
struct Bar {
a: i32,
b: i32,
}
fn convert_foo_to_bar(foo: &mut Foo) -> &mut Bar {
unsafe { std::mem::transmute(foo) }
}
fn main() {
let foo = RefCell::new(Foo { a: 42, b: 69 });
{
let foo_ref = foo.borrow_mut();
let mut bar_ref = RefMut::map(foo_ref, convert_foo_to_bar);
println!("{:?}", bar_ref);
bar_ref.b = 420;
}
println!("{:?}", foo);
}
Bar { a: 42, b: 69 }
RefCell { value: Foo { a: 42, b: 420 } }
Be aware that this requires manually keeping the memory layout of Foo and Bar in sync. This will create very subtle and hard-to-find bugs if Foo and Bar are not 100% identical.
Of course, one could generalize the concept by implementing From/Into for &mut Foo and &mut Bar instead of using convert_foo_to_bar. You can then use this trait in your RefWrapper to convert between the two.
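A rough sketch of that From-based variant might look like the following. The helper as_bar is a hypothetical name, and the cast is only sound under the same assumption as before: Foo and Bar are both #[repr(C)] with identical fields.

```rust
use std::cell::{RefCell, RefMut};

#[repr(C)]
struct Foo {
    a: i32,
    b: i32,
}

#[repr(C)]
struct Bar {
    a: i32,
    b: i32,
}

// The layout-dependent cast lives in exactly one place.
impl<'a> From<&'a mut Foo> for &'a mut Bar {
    fn from(foo: &'a mut Foo) -> Self {
        // SAFETY (assumption): Foo and Bar are #[repr(C)] with the same
        // fields in the same order, so they share size, alignment and layout.
        unsafe { &mut *(foo as *mut Foo as *mut Bar) }
    }
}

/// Hypothetical helper: re-targets a RefMut<Foo> as a RefMut<Bar>.
fn as_bar(foo: RefMut<'_, Foo>) -> RefMut<'_, Bar> {
    RefMut::map(foo, |f| f.into())
}

fn main() {
    let cell = RefCell::new(Foo { a: 42, b: 69 });
    {
        let mut bar = as_bar(cell.borrow_mut());
        bar.b = 420;
    } // the mutable borrow ends here
    assert_eq!(cell.borrow().a, 42);
    assert_eq!(cell.borrow().b, 420);
}
```

RefMut::map keeps the RefCell's borrow flag held for as long as the mapped guard lives, so the usual runtime borrow checking still applies.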

How can I ignore extra tuple items when deserializing with Serde? ("trailing characters" error)

Serde ignores unknown named fields when deserializing into regular structs. How can I similarly ignore extra items when deserializing into tuple structs (e.g. from a heterogeneous JSON array)?
For example, this code ignores the extra "c" field just fine:
#[derive(Serialize, Deserialize, Debug)]
pub struct MyStruct { a: String, b: i32 }
fn test_deserialize() -> MyStruct {
::serde_json::from_str::<MyStruct>(r#"
{
"a": "foo",
"b": 123,
"c": "ignore me"
}
"#).unwrap()
}
// => MyStruct { a: "foo", b: 123 }
By contrast, this fails on the extra item in the tuple:
#[derive(Serialize, Deserialize, Debug)]
pub struct MyTuple(String, i32);
fn test_deserialize_tuple() -> MyTuple {
::serde_json::from_str::<MyTuple>(r#"
[
"foo",
123,
"ignore me"
]
"#).unwrap()
}
// => Error("trailing characters", line: 5, column: 13)
I'd like to allow extra items for forward compatibility in my data format. What's the easiest way to get Serde to ignore extra tuple items when deserializing?
You can implement a custom Visitor that ignores the rest of the sequence. Be aware that the whole sequence must be consumed. This is the important part (remove it and you'll get the same error):
// This is very important!
while let Some(IgnoredAny) = seq.next_element()? {
// Ignore rest
}
Here's a working example:
use std::fmt;
use serde::de::{self, Deserialize, Deserializer, IgnoredAny, SeqAccess, Visitor};
use serde::Serialize;
#[derive(Serialize, Debug)]
pub struct MyTuple(String, i32);
impl<'de> Deserialize<'de> for MyTuple {
fn deserialize<D>(deserializer: D) -> Result<Self, D::Error>
where
D: Deserializer<'de>,
{
struct MyTupleVisitor;
impl<'de> Visitor<'de> for MyTupleVisitor {
type Value = MyTuple;
fn expecting(&self, formatter: &mut fmt::Formatter) -> fmt::Result {
formatter.write_str("struct MyTuple")
}
fn visit_seq<V>(self, mut seq: V) -> Result<Self::Value, V::Error>
where
V: SeqAccess<'de>,
{
let s = seq
.next_element()?
.ok_or_else(|| de::Error::invalid_length(0, &self))?;
let n = seq
.next_element()?
.ok_or_else(|| de::Error::invalid_length(1, &self))?;
// This is very important!
while let Some(IgnoredAny) = seq.next_element()? {
// Ignore rest
}
Ok(MyTuple(s, n))
}
}
deserializer.deserialize_seq(MyTupleVisitor)
}
}
fn main() {
let two_elements = r#"["foo", 123]"#;
let three_elements = r#"["foo", 123, "bar"]"#;
let tuple: MyTuple = serde_json::from_str(two_elements).unwrap();
assert_eq!(tuple.0, "foo");
assert_eq!(tuple.1, 123);
let tuple: MyTuple = serde_json::from_str(three_elements).unwrap();
assert_eq!(tuple.0, "foo");
assert_eq!(tuple.1, 123);
}
For JSON, I'd combine RawValue and a custom deserialization:
use serde::{Deserialize, Deserializer};
#[derive(Debug)]
struct MyTuple(String, i32);
#[derive(Deserialize, Debug)]
struct MyTupleFutureCompat<'a>(
String,
i32,
#[serde(default, borrow)] Option<&'a serde_json::value::RawValue>,
);
impl<'de> Deserialize<'de> for MyTuple {
fn deserialize<D>(deserializer: D) -> Result<Self, D::Error>
where
D: Deserializer<'de>,
{
let t: MyTupleFutureCompat = Deserialize::deserialize(deserializer)?;
Ok(MyTuple(t.0, t.1))
}
}
fn main() -> Result<(), Box<dyn std::error::Error>> {
let json = r#"[
"foo",
123,
"ignore me"
]"#;
let d: MyTuple = serde_json::from_str(json)?;
println!("{:?}", d);
Ok(())
}
See also:
How to transform fields during deserialization using Serde?
Is there a way to deserialize arbitrary JSON using Serde without creating fine-grained objects?
Why can Serde not derive Deserialize for a struct containing only a &Path?

How do I use Serde to serialize a HashMap with structs as keys to JSON?

I want to serialize a HashMap with structs as keys:
use serde::{Deserialize, Serialize}; // 1.0.68
use std::collections::HashMap;
fn main() {
#[derive(Serialize, Deserialize, Debug, PartialEq, Eq, Hash)]
struct Foo {
x: u64,
}
#[derive(Serialize, Deserialize, Debug)]
struct Bar {
x: HashMap<Foo, f64>,
}
let mut p = Bar { x: HashMap::new() };
p.x.insert(Foo { x: 0 }, 0.0);
let serialized = serde_json::to_string(&p).unwrap();
}
This code compiles, but when I run it I get an error:
Error("key must be a string", line: 0, column: 0)'
I changed the code:
#[derive(Serialize, Deserialize, Debug)]
struct Bar {
x: HashMap<u64, f64>,
}
let mut p = Bar { x: HashMap::new() };
p.x.insert(0, 0.0);
let serialized = serde_json::to_string(&p).unwrap();
The key in the HashMap is now a u64 instead of a string. Why does the first code give an error?
You can use serde_as from the serde_with crate to encode the HashMap as a sequence of key-value pairs:
use serde_with::serde_as; // 1.5.1
#[serde_as]
#[derive(Serialize, Deserialize, Debug)]
struct Bar {
#[serde_as(as = "Vec<(_, _)>")]
x: HashMap<Foo, f64>,
}
Which will serialize to (and deserialize from) this:
{
"x":[
[{"x": 0}, 0.0],
[{"x": 1}, 0.0],
[{"x": 2}, 0.0]
]
}
There is likely some overhead from converting the HashMap to Vec, but this can be very convenient.
According to the JSON specification, object keys must be strings. serde_json uses fmt::Display here for some non-string keys, to allow serialization of a wider range of HashMaps. That's why HashMap<u64, f64> works as well as HashMap<String, f64> would. However, not all types are covered (Foo's case here).
That's why we need to provide our own Serialize implementation:
use serde::ser::{Serialize, SerializeMap, Serializer};
use std::fmt::{Display, Formatter};
impl Display for Foo {
fn fmt(&self, f: &mut Formatter) -> std::fmt::Result {
write!(f, "{}", self.x)
}
}
impl Serialize for Bar {
fn serialize<S>(&self, serializer: S) -> Result<S::Ok, S::Error>
where
S: Serializer,
{
let mut map = serializer.serialize_map(Some(self.x.len()))?;
for (k, v) in &self.x {
map.serialize_entry(&k.to_string(), &v)?;
}
map.end()
}
}
I've found the bulletproof solution 😃
Extra dependencies not required
Compatible with HashMap, BTreeMap and other iterable types
Works with flexbuffers
The following code converts a field (map) to the intermediate Vec representation:
pub mod vectorize {
use serde::{Deserialize, Deserializer, Serialize, Serializer};
use std::iter::FromIterator;
pub fn serialize<'a, T, K, V, S>(target: T, ser: S) -> Result<S::Ok, S::Error>
where
S: Serializer,
T: IntoIterator<Item = (&'a K, &'a V)>,
K: Serialize + 'a,
V: Serialize + 'a,
{
let container: Vec<_> = target.into_iter().collect();
serde::Serialize::serialize(&container, ser)
}
pub fn deserialize<'de, T, K, V, D>(des: D) -> Result<T, D::Error>
where
D: Deserializer<'de>,
T: FromIterator<(K, V)>,
K: Deserialize<'de>,
V: Deserialize<'de>,
{
let container: Vec<_> = serde::Deserialize::deserialize(des)?;
Ok(T::from_iter(container.into_iter()))
}
}
To use it just add the module's name as an attribute:
#[derive(Debug, Serialize, Deserialize)]
struct MyComplexType {
#[serde(with = "vectorize")]
map: HashMap<MyKey, String>,
}
The remaining code, if you want to check it locally:
use anyhow::Error;
use serde::{Deserialize, Serialize};
use std::collections::HashMap;
#[derive(Debug, Clone, Serialize, Deserialize, PartialEq, Eq, PartialOrd, Ord, Hash)]
struct MyKey {
one: String,
two: u16,
more: Vec<u8>,
}
#[derive(Debug, Serialize, Deserialize)]
struct MyComplexType {
#[serde(with = "vectorize")]
map: HashMap<MyKey, String>,
}
fn main() -> Result<(), Error> {
let key = MyKey {
one: "1".into(),
two: 2,
more: vec![1, 2, 3],
};
let mut map = HashMap::new();
map.insert(key.clone(), "value".into());
let instance = MyComplexType { map };
let serialized = serde_json::to_string(&instance)?;
println!("JSON: {}", serialized);
let deserialized: MyComplexType = serde_json::from_str(&serialized)?;
let expected_value = "value".to_string();
assert_eq!(deserialized.map.get(&key), Some(&expected_value));
Ok(())
}
And on the Rust playground: https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=bf1773b6e501a0ea255ccdf8ce37e74d
While all the provided answers fulfill the goal of serializing your HashMap to JSON, they are ad hoc or hard to maintain.
One correct way to allow a specific data structure to be serialized with serde as a map key is the same way serde handles integer keys in HashMaps (which works): serialize the value to a String. This has a few advantages, namely:
no intermediate data structure,
no need to clone the entire HashMap,
easier maintenance by applying OOP concepts, and
serialization usable in more complex structures such as MultiMap.
This can be done by manually implementing Serialize and Deserialize for your data-type.
I use composite ids for maps.
#[derive(Clone, Copy, PartialEq, Eq, Hash, Debug)]
pub struct Proj {
pub value: u64,
}
#[derive(Clone, Copy, PartialEq, Eq, Hash, Debug)]
pub struct Doc {
pub proj: Proj,
pub value: u32,
}
#[derive(Clone, Copy, PartialEq, Eq, Hash, Debug)]
pub struct Sec {
pub doc: Doc,
pub value: u32,
}
Manually implementing serde serialization for each of them is a hassle, so instead we delegate the implementation to the FromStr and From<Self> for String (the Into<String> blanket) traits.
impl From<Doc> for String {
fn from(val: Doc) -> Self {
format!("{}{:08X}", val.proj, val.value)
}
}
impl FromStr for Doc {
type Err = String;
fn from_str(s: &str) -> Result<Self, Self::Err> {
match parse_doc(s) {
Ok((_, p)) => Ok(p),
Err(e) => Err(e.to_string()),
}
}
}
In order to parse the Doc we make use of nom. The parse functionality below is explained in their examples.
fn is_hex_digit(c: char) -> bool {
c.is_digit(16)
}
fn from_hex8(input: &str) -> Result<u32, std::num::ParseIntError> {
u32::from_str_radix(input, 16)
}
fn parse_hex8(input: &str) -> IResult<&str, u32> {
map_res(take_while_m_n(8, 8, is_hex_digit), from_hex8)(input)
}
fn parse_doc(input: &str) -> IResult<&str, Doc> {
let (input, proj) = parse_proj(input)?;
let (input, value) = parse_hex8(input)?;
Ok((input, Doc { value, proj }))
}
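The nom parsers above are one option, and parse_proj is not shown. As a rough, std-only sketch of the same String round trip, the version below swaps nom for plain slicing. It assumes a fixed-width encoding that is not taken from the answer: 16 hex digits for the project id followed by 8 for the document value.

```rust
use std::str::FromStr;

#[derive(Clone, Copy, PartialEq, Eq, Hash, Debug)]
pub struct Proj {
    pub value: u64,
}

#[derive(Clone, Copy, PartialEq, Eq, Hash, Debug)]
pub struct Doc {
    pub proj: Proj,
    pub value: u32,
}

// Assumed encoding: 16 hex digits (project) followed by 8 hex digits (doc).
impl From<Doc> for String {
    fn from(val: Doc) -> Self {
        format!("{:016X}{:08X}", val.proj.value, val.value)
    }
}

impl FromStr for Doc {
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        // Require exactly 24 ASCII hex digits, so the byte-index split
        // below always falls on a character boundary.
        if s.len() != 24 || !s.bytes().all(|b| b.is_ascii_hexdigit()) {
            return Err(format!("expected 24 hex digits, got {:?}", s));
        }
        let (proj, value) = s.split_at(16);
        let proj = u64::from_str_radix(proj, 16).map_err(|e| e.to_string())?;
        let value = u32::from_str_radix(value, 16).map_err(|e| e.to_string())?;
        Ok(Doc { proj: Proj { value: proj }, value })
    }
}

fn main() {
    let doc = Doc { proj: Proj { value: 0x2A }, value: 7 };
    let text: String = doc.into();
    assert_eq!(text.len(), 24);
    assert!(text.ends_with("2A00000007"));
    assert_eq!(text.parse::<Doc>(), Ok(doc));
    assert!("not-a-doc".parse::<Doc>().is_err());
}
```

A fixed width keeps the split unambiguous; a variable-width format would need a separator or a real parser such as the nom one above.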
Now we need to hook up the conversion to String and str::parse(&str) to serde. We can do this with a simple macro.
macro_rules! serde_str {
($type:ty) => {
impl Serialize for $type {
fn serialize<S>(&self, serializer: S) -> Result<S::Ok, S::Error>
where
S: serde::Serializer,
{
let s: String = self.clone().into();
serializer.serialize_str(&s)
}
}
impl<'de> Deserialize<'de> for $type {
fn deserialize<D>(deserializer: D) -> Result<Self, D::Error>
where
D: serde::Deserializer<'de>,
{
paste! {deserializer.deserialize_string( [<$type Visitor>] {})}
}
}
paste! {struct [<$type Visitor>] {}}
impl<'de> Visitor<'de> for paste! {[<$type Visitor>]} {
type Value = $type;
fn expecting(&self, formatter: &mut std::fmt::Formatter) -> std::fmt::Result {
formatter.write_str("\"")
}
fn visit_str<E>(self, v: &str) -> Result<Self::Value, E>
where
E: serde::de::Error,
{
match str::parse(v) {
Ok(id) => Ok(id),
Err(_) => Err(serde::de::Error::custom("invalid format")),
}
}
}
};
}
Here we are using paste to interpolate the names. Beware that now the struct will always serialize as defined above. Never as a struct, always as a string.
It is important to implement fn visit_str instead of fn visit_string because visit_string defers to visit_str.
Finally, we have to call the macro for our custom structs:
serde_str!(Sec);
serde_str!(Doc);
serde_str!(Proj);
Now the specified types can be serialized to and deserialized from strings with serde.

How do I clone a closure, so that their types are the same?

I have a struct which looks something like this:
pub struct MyStruct<F>
where
F: Fn(usize) -> f64,
{
field: usize,
mapper: F,
// fields omitted
}
How do I implement Clone for this struct?
One way I found to copy the function body is:
let mapper = |x| (mystruct.mapper)(x);
But this results in mapper having a different type than that of mystruct.mapper.
As of Rust 1.26.0, closures implement both Copy and Clone if all of the captured variables do:
#[derive(Clone)]
pub struct MyStruct<F>
where
F: Fn(usize) -> f64,
{
field: usize,
mapper: F,
}
fn main() {
let f = MyStruct {
field: 34,
mapper: |x| x as f64,
};
let g = f.clone();
println!("{}", (g.mapper)(3));
}
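Before Rust 1.26, you couldn't Clone closures. The only one in a position to implement Clone for a closure is the compiler, and at the time it didn't (since 1.26 it does, when all captured variables are Clone). So, on older compilers, you're kinda stuck.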
There is one way around this, though: if you have a closure with no captured variables, you can force a copy via unsafe code. That said, a simpler approach at that point is to accept a fn(usize) -> f64 instead, since they don't have a captured environment (any zero-sized closure can be rewritten as a function), and are Copy.
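The function-pointer approach could look roughly like this. It is a minimal sketch: the struct is no longer generic, and the function double is made up for illustration.

```rust
#[derive(Clone, Copy)]
pub struct MyStruct {
    pub field: usize,
    // A plain function pointer: Copy, and the same type for every instance.
    pub mapper: fn(usize) -> f64,
}

// Hypothetical mapper function used only for this example.
fn double(x: usize) -> f64 {
    x as f64 * 2.0
}

fn main() {
    let f = MyStruct { field: 34, mapper: double };
    let g = f; // a plain copy; `f` stays usable
    assert_eq!((f.mapper)(3), 6.0);
    assert_eq!((g.mapper)(3), 6.0);

    // A closure that captures nothing coerces to the same pointer type.
    let h = MyStruct { field: g.field, mapper: |x: usize| x as f64 };
    assert_eq!((h.mapper)(3), 3.0);
}
```

The trade-off is that a closure which captures state can no longer be stored, since only non-capturing closures coerce to fn pointers.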
You can use Rc (or Arc!) to get multiple handles of the same unclonable value. Works well with Fn (callable through shared references) closures.
pub struct MyStruct<F> where F: Fn(usize) -> f64 {
field: usize,
mapper: Rc<F>,
// fields omitted
}
impl<F> Clone for MyStruct<F>
where F: Fn(usize) -> f64,
{
fn clone(&self) -> Self {
MyStruct {
field: self.field,
mapper: self.mapper.clone(),
...
}
}
}
Remember that #[derive(Clone)] is a very useful recipe for Clone, but its recipe doesn't always do the right thing for the situation; this is one such case.
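To make the difference concrete, here is a small self-contained sketch of the Rc variant in which the closure captures a value that is deliberately not Clone. The Scale type is made up for illustration. Deriving Clone on the struct would add an F: Clone bound, whereas the manual impl only clones the Rc handle.

```rust
use std::rc::Rc;

pub struct MyStruct<F>
where
    F: Fn(usize) -> f64,
{
    pub field: usize,
    pub mapper: Rc<F>,
}

impl<F> Clone for MyStruct<F>
where
    F: Fn(usize) -> f64,
{
    fn clone(&self) -> Self {
        MyStruct {
            field: self.field,
            // Clones the Rc handle only; the closure itself is shared.
            mapper: Rc::clone(&self.mapper),
        }
    }
}

/// Hypothetical type that is deliberately not Clone.
struct Scale(f64);

impl Scale {
    fn apply(&self, x: usize) -> f64 {
        x as f64 * self.0
    }
}

fn main() {
    let scale = Scale(2.5);
    // The method call makes the closure capture all of `scale`,
    // so the closure type is not Clone either.
    let f = MyStruct {
        field: 34,
        mapper: Rc::new(move |x: usize| scale.apply(x)),
    };
    let g = f.clone();
    assert_eq!((f.mapper)(2), 5.0);
    assert_eq!((g.mapper)(2), 5.0);
    // Both structs point at the same closure.
    assert_eq!(Rc::strong_count(&f.mapper), 2);
}
```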
You can use trait objects to be able to implement Clone for your struct:
use std::rc::Rc;
#[derive(Clone)]
pub struct MyStructRef<'f> {
field: usize,
mapper: &'f dyn Fn(usize) -> f64,
}
#[derive(Clone)]
pub struct MyStructRc {
field: usize,
mapper: Rc<dyn Fn(usize) -> f64>,
}
fn main() {
//ref
let closure = |x| x as f64;
let f = MyStructRef { field: 34, mapper: &closure };
let g = f.clone();
println!("{}", (f.mapper)(3));
println!("{}", (g.mapper)(3));
//Rc
let rcf = MyStructRc { field: 34, mapper: Rc::new(|x| x as f64 * 2.0) };
let rcg = rcf.clone();
println!("{}", (rcf.mapper)(3));
println!("{}", (rcg.mapper)(3));
}
