How to serialize/deserialize a map with the None values - rust

I need a map with Option values in my configuration. However, serde seems to ignore any pairs with a None value:
use std::collections::HashMap;
use serde::{Deserialize, Serialize};
use toml;
#[derive(Debug, Serialize, Deserialize)]
struct Config {
    values: HashMap<String, Option<u32>>,
}
fn main() {
    let values = [("foo", Some(5)), ("bar", None)]
        .iter()
        .map(|(name, s)| (name.to_string(), s.clone()))
        .collect();
    let config = Config { values };
    let s = toml::ser::to_string(&config).unwrap();
    println!("{}", s);
}
produces
[values]
foo = 5
The same goes for deserializing: I simply cannot represent bar: None in any form,
since TOML has no notion of None, null, or the like.
Are there some tricks to do that?

The closest alternative I have found is a special sentinel value (the one you would probably pass to Option::unwrap_or). On serialization, Option::None is converted to the sentinel, which appears in the TOML file as a real value (e.g. 0). On deserialization, the sentinel is converted back to Option::None, so the Rust side keeps a real Option type.
Serde has a #[serde(with = "module")] field attribute to customize the ser/de behavior of a field, which you can use here. The full working example is here.
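As a minimal sketch of the sentinel mapping itself, the two conversions might look like the code below. The names SENTINEL, to_sentinel and from_sentinel are hypothetical, and the choice of 0 is an assumption that only holds if 0 is never a legitimate setting. The serialize and deserialize functions of the module named in the with attribute would call helpers like these.

```rust
/// Hypothetical sentinel: 0 stands for "no value" in the TOML file.
/// Assumption: 0 is never a legitimate value for this setting.
const SENTINEL: u32 = 0;

/// Serialization direction: `None` is written out as the sentinel.
fn to_sentinel(value: Option<u32>) -> u32 {
    value.unwrap_or(SENTINEL)
}

/// Deserialization direction: the sentinel is read back as `None`.
fn from_sentinel(raw: u32) -> Option<u32> {
    if raw == SENTINEL {
        None
    } else {
        Some(raw)
    }
}
```

Note the trade-off: Some(0) cannot be told apart from None after a round trip, so the sentinel has to lie outside the range of meaningful values.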

Related

How to serialize options to JSON as arrays

I am working in a relatively large codebase where options are represented in JSON as arrays, so None is represented in JSON as [] and Some(thing) as [thing]. (Yes, the codebase also contains Haskell, in case you are wondering.) How can I override the default serde_json behaviour, which is to omit optional fields, to match this?
E.g. a struct:
SomeData {
    foo: Some(1),
    bar: None
}
should be serialized to JSON as:
{
    "foo": [1],
    "bar": []
}
Of course, one could theoretically implement custom serialization for each and every optional field in every struct that interacts with the codebase but that would be a huge undertaking, even if it were possible.
There don't seem to be any options in the serde_json serialisation of some and none so I imagine that the solution will be creating a new serializer that inherits almost everything from serde_json apart from the Option serialization and deserialization. Are there any examples of projects that do this? It would also be possible to make a fork but maintaining a fork is never much fun.
"Of course, one could theoretically implement custom serialization for each and every optional field in every struct that interacts with the codebase"
A custom implementation for each and every field is not necessary. By using serialize_with, you only need one transformation function describing the serialization of any serializable Option<T> as a sequence.
fn serialize_option_as_array<T, S>(value: &Option<T>, serializer: S) -> Result<S::Ok, S::Error>
where
    T: Serialize,
    S: Serializer,
{
    let len = if value.is_some() { 1 } else { 0 };
    let mut seq = serializer.serialize_seq(Some(len))?;
    for element in value {
        seq.serialize_element(element)?;
    }
    seq.end()
}
Using it in your struct:
use serde_derive::Serialize;
use serde::ser::{Serialize, Serializer, SerializeSeq};
use serde_json;
#[derive(Debug, Serialize)]
struct SomeData {
    #[serde(serialize_with = "serialize_option_as_array")]
    foo: Option<i32>,
    #[serde(serialize_with = "serialize_option_as_array")]
    bar: Option<u32>,
}
fn main() -> Result<(), serde_json::Error> {
    let data = SomeData {
        foo: Some(5),
        bar: None,
    };
    println!("{}", serde_json::to_string(&data)?);
    Ok(())
}
The output:
{"foo":[5],"bar":[]}
See also:
How to transform fields during serialization using Serde?

Serde skip field serialization depending on a "global" runtime condition

Depending on some runtime condition, I'd like to either serialize a field or not. That condition applies to the whole serialization and has nothing to do with the field's value itself. Hence, I cannot use skip_serializing_if() if I understand it right, unless I use some sort of a global state, but then that would be more like a constant, not a "condition".
As an example, let's say the condition depends on the client that requested the file. Some clients will need to have that field, others - not.
If the condition says serialize, do so even if the field's value is None (i.e. explicitly create a property with null value in the output JSON).
What's the simplest and cleanest way to achieve that?
Just create a function and ignore the argument:
use serde_json; // 1.0.67
use serde::Serialize; // 1.0.130
// Returning true skips the field. The argument (the field's value) is ignored,
// so the decision can depend on anything else.
fn condition_met<T>(_: &T) -> bool {
    false
}
#[derive(Serialize)]
struct Foo {
    #[serde(skip_serializing_if = "condition_met")]
    data: Option<u32>,
}
fn main() {
    println!("{}", serde_json::to_string(&Foo { data: None }).unwrap());
}

How can I flatten an enum to a special case when the explicit cases don't match?

When the input does not match any known case, I'd like it to be mapped to the last variant:
#[derive(Serialize, Deserialize, Clone, Debug)]
#[serde(untagged)]
pub enum Action {
    Action1,
    Action2,
    Action3,
    Other(String), // when not known it should be here
}
I've tried using the directive
#[serde(untagged)]
but then it doesn't serialize properly
let b = Action::Action1;
let s = serde_json::to_string(&b);
let ss = s.unwrap();
println!("ss {:#?}", &ss);
let val = serde_json::to_value(b);
println!("ss {:#?}", &val);
results in
ss "null"
ss Ok(
    Null,
)
I can think of two options that build off each other.
First, implement From to turn the enum into a String before you serialize, and From again to turn the String back into your own type after you deserialize. This means converting on every serialize and deserialize call, but it accomplishes your goal.
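A minimal sketch of those two conversions, using only the standard library, could look like this. The variant names come from the question, and unknown strings fall through to Other. Serde's container attributes #[serde(from = "String", into = "String")] can route the derived impls through conversions like these, although the sketch leaves serde out.

```rust
#[derive(Clone, Debug, PartialEq)]
pub enum Action {
    Action1,
    Action2,
    Action3,
    Other(String),
}

// Enum -> String: known variants map to their names,
// `Other` hands back the string it carries.
impl From<Action> for String {
    fn from(action: Action) -> String {
        match action {
            Action::Action1 => "Action1".to_string(),
            Action::Action2 => "Action2".to_string(),
            Action::Action3 => "Action3".to_string(),
            Action::Other(name) => name,
        }
    }
}

// String -> Enum: anything that is not a known name ends up in `Other`.
impl From<String> for Action {
    fn from(name: String) -> Action {
        match name.as_str() {
            "Action1" => Action::Action1,
            "Action2" => Action::Action2,
            "Action3" => Action::Action3,
            _ => Action::Other(name),
        }
    }
}
```

With these in place, String::from(action) yields the value to serialize and Action::from(string) recovers the enum after deserializing.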
If you want to make the API a little cleaner at the cost of doing more work you can implement serialize and deserialize yourself. Here are some references on how to do that:
Custom Serialization
Implementing Serialize
Implementing Deserialize
As a second option, you can offload the custom serialization and deserialization to serde_with, if you're willing to add another dependency.
According to its docs:
De/Serializing a type using the Display and FromStr traits, e.g., for u8, url::Url, or mime::Mime. Check DisplayFromStr or serde_with::rust::display_fromstr for details.

Writing to a field in a MaybeUninit structure?

I'm doing something with MaybeUninit and FFI in Rust that seems to work, but I suspect may be unsound/relying on undefined behavior.
My aim is to have a struct MoreA extend a struct A, by including A as an initial field. And then to call some C code that writes to the struct A. And then finalize MoreA by filling in its additional fields, based on what's in A.
In my application, the additional fields of MoreA are all integers, so I don't have to worry about assignments to them dropping the (uninitialized) previous values.
Here's a minimal example:
use core::fmt::Debug;
use std::mem::MaybeUninit;
#[derive(Clone, Copy, PartialEq, Debug)]
#[repr(C)]
struct A(i32, i32);
#[derive(Clone, Copy, PartialEq, Debug)]
#[repr(C)]
struct MoreA {
    head: A,
    more: i32,
}
unsafe fn mock_ffi(p: *mut A) {
    // write doesn't drop previous (uninitialized) occupant of p
    p.write(A(1, 2));
}
fn main() {
    let mut b = MaybeUninit::<MoreA>::uninit();
    unsafe { mock_ffi(b.as_mut_ptr().cast()); }
    let b = unsafe {
        let mut b = b.assume_init();
        b.more = 3;
        b
    };
    assert_eq!(&b, &MoreA { head: A(1, 2), more: 3 });
}
Is the code let b = unsafe { ... } sound? It runs OK, and Miri doesn't complain.
But the MaybeUninit docs say:
Moreover, uninitialized memory is special in that the compiler knows that it does not have
a fixed value. This makes it undefined behavior to have uninitialized data in a variable
even if that variable has an integer type, which otherwise can hold any fixed bit pattern.
Also, the Rust Reference says that behavior considered undefined includes:
Producing an invalid value, even in private fields and locals. "Producing" a value happens any time a value is assigned to or read from a place, passed to a function/primitive operation or returned from a function/primitive operation. The following values are invalid (at their respective type):
... An integer (i*/u*) or ... obtained from uninitialized memory.
On the other hand, it doesn't seem possible to write to the more field before calling assume_init. Later on the same page:
There is currently no supported way to create a raw pointer or reference to a field of a struct
inside MaybeUninit<Struct>. That means it is not possible to create a struct by calling
MaybeUninit::uninit::<Struct>() and then writing to its fields.
If what I'm doing in the above code example does trigger undefined behavior, what would solutions be?
I'd like to avoid boxing the A value (that is, I'd like to have it be directly included in MoreA).
I'd hope also to avoid having to create one A to pass to mock_ffi and then having to copy the results into MoreA. A in my real application is a large structure.
I guess if there's no sound way to get what I'm after, though, I'd have to choose one of those two fallbacks.
If struct A is of a type that can hold the bit-pattern 0 as a valid value, then I guess a third fallback would be:
Start with MaybeUninit::zeroed() rather than MaybeUninit::uninit().
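That zeroed fallback could be sketched as follows. The types and mock_ffi repeat the minimal example from this question, and build_zeroed is a hypothetical helper name. The sketch assumes that the all-zero bit pattern is a valid MoreA, which holds here because every field is an i32.

```rust
use std::mem::MaybeUninit;

#[derive(Clone, Copy, PartialEq, Debug)]
#[repr(C)]
struct A(i32, i32);

#[derive(Clone, Copy, PartialEq, Debug)]
#[repr(C)]
struct MoreA {
    head: A,
    more: i32,
}

/// Stand-in for the C function, as in the question.
unsafe fn mock_ffi(p: *mut A) {
    // SAFETY: the caller passes a pointer that is valid for writes.
    unsafe { p.write(A(1, 2)) };
}

/// Hypothetical helper: start from zeroed memory instead of uninitialized memory.
fn build_zeroed() -> MoreA {
    let mut b = MaybeUninit::<MoreA>::zeroed();
    unsafe {
        // `head` is the first field of a repr(C) struct, so it sits at offset 0.
        mock_ffi(b.as_mut_ptr().cast());
        // Every byte is initialized (zero or written by mock_ffi), and
        // all-zero bytes form a valid i32, so assume_init is fine here.
        let mut b = b.assume_init();
        b.more = 3;
        b
    }
}
```

The cost is one extra pass that zeroes the whole struct, which may or may not matter for a large A.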
Currently, the only sound way to refer to uninitialized memory—of any type—is MaybeUninit. In practice, it is probably safe to read or write to uninitialized integers, but that is not officially documented. It is definitely not safe to read or write to an uninitialized bool or most other types.
In general, as the documentation states, you cannot initialize a struct field by field. However, it is sound to do so as long as:
the struct has repr(C). This is necessary because it prevents Rust from doing clever layout tricks, so that the layout of a field of type MaybeUninit<T> remains identical to the layout of a field of type T, regardless of its adjacent fields.
every field is MaybeUninit. This lets us assume_init() for the entire struct, and then later initialise each field individually.
Given that your struct is already repr(C), you can use an intermediate representation which uses MaybeUninit for every field. The repr(C) also means that we can transmute between the types once it is initialised, provided that the two structs have the same fields in the same order.
use std::mem::{self, MaybeUninit};
#[repr(C)]
struct MoreAConstruct {
    head: MaybeUninit<A>,
    more: MaybeUninit<i32>,
}
let b: MoreA = unsafe {
    // It's OK to assume a struct is initialized when all of its fields are MaybeUninit
    let mut b_construct = MaybeUninit::<MoreAConstruct>::uninit().assume_init();
    mock_ffi(b_construct.head.as_mut_ptr());
    b_construct.more = MaybeUninit::new(3);
    mem::transmute(b_construct)
};
It is now possible (since Rust 1.51) to initialize fields of any uninitialized struct using the std::ptr::addr_of_mut macro. This example is from the documentation:
You can use MaybeUninit, and the std::ptr::addr_of_mut macro, to
initialize structs field by field:
use std::mem::MaybeUninit;
use std::ptr::addr_of_mut;
#[derive(Debug, PartialEq)]
pub struct Foo {
    name: String,
    list: Vec<u8>,
}
let foo = {
    let mut uninit: MaybeUninit<Foo> = MaybeUninit::uninit();
    let ptr = uninit.as_mut_ptr();
    // Initializing the `name` field
    unsafe { addr_of_mut!((*ptr).name).write("Bob".to_string()); }
    // Initializing the `list` field
    // If there is a panic here, then the `String` in the `name` field leaks.
    unsafe { addr_of_mut!((*ptr).list).write(vec![0, 1, 2]); }
    // All the fields are initialized, so we call `assume_init` to get an initialized Foo.
    unsafe { uninit.assume_init() }
};
assert_eq!(
    foo,
    Foo {
        name: "Bob".to_string(),
        list: vec![0, 1, 2]
    }
);
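Applied to the MoreA struct from the question, the same pattern might look like the sketch below. The types and mock_ffi repeat the question's minimal example, and build_in_place is a hypothetical helper name. The extra field is derived from what the FFI call wrote into head, without ever creating a reference to uninitialized memory.

```rust
use std::mem::MaybeUninit;
use std::ptr::addr_of_mut;

#[derive(Clone, Copy, PartialEq, Debug)]
#[repr(C)]
struct A(i32, i32);

#[derive(Clone, Copy, PartialEq, Debug)]
#[repr(C)]
struct MoreA {
    head: A,
    more: i32,
}

/// Stand-in for the C function, as in the question.
unsafe fn mock_ffi(p: *mut A) {
    // SAFETY: the caller passes a pointer that is valid for writes.
    unsafe { p.write(A(1, 2)) };
}

/// Hypothetical helper: initialize MoreA field by field, in place.
fn build_in_place() -> MoreA {
    let mut b = MaybeUninit::<MoreA>::uninit();
    let ptr = b.as_mut_ptr();
    unsafe {
        // Raw pointer to the `head` field; no reference is created.
        let head_ptr = addr_of_mut!((*ptr).head);
        mock_ffi(head_ptr);
        // `head` is initialized now, so its fields may be read.
        let more = (*head_ptr).0 + (*head_ptr).1;
        addr_of_mut!((*ptr).more).write(more);
        // Both fields are initialized at this point.
        b.assume_init()
    }
}
```

This avoids both the intermediate struct and the transmute, and A is written directly into its final location.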

How can I accept multiple deserialization names for the same Serde field?

I am trying to use Serde to deserialize JSON (serde-json) and XML (serde-xml-rs) files based on the following struct:
use serde_derive::Deserialize;
#[derive(Debug, Clone, PartialEq, Deserialize)]
pub struct SchemaConfig {
    pub name: String,
    #[serde(rename = "Cube")]
    pub cubes: Vec<CubeConfig>,
}
The fields I am deserializing on have different names based on the file type. In this case, I would like for a JSON file to have a cubes key with a list of cubes, but the equivalent in XML would be multiple <Cube /> elements.
I can't figure out how to accept both cubes and Cube as keys for the deserialization. The closest thing I found was the #[serde(rename = "Cube")] option but when I use that the JSON deserialization stops working since it only accepts the Cube key. If I remove that option, the XML deserialization stops working as it then only accepts cubes as the key.
Is there a simple way to accomplish this in Serde?
I encourage you to read the Serde documentation. The field attributes chapter introduces the alias attribute, emphasis mine:
#[serde(alias = "name")]
Deserialize this field from the given name or from its Rust name. May
be repeated to specify multiple possible names for the same field.
use serde::Deserialize; // 1.0.88
use serde_json; // 1.0.38
#[derive(Debug, Deserialize)]
struct SchemaConfig {
    #[serde(alias = "fancy_square", alias = "KUBE")]
    cube: [i32; 3],
}
fn main() -> Result<(), Box<dyn std::error::Error>> {
    let input1 = r#"{
        "fancy_square": [1, 2, 3]
    }"#;
    let input2 = r#"{
        "KUBE": [4, 5, 6]
    }"#;
    let one: SchemaConfig = serde_json::from_str(input1)?;
    let two: SchemaConfig = serde_json::from_str(input2)?;
    println!("{:?}", one);
    println!("{:?}", two);
    Ok(())
}
"I would like for a JSON file to have a cubes key with a list of cubes, but the equivalent in XML would be multiple <Cube /> elements."
This certainly sounds like you want two different structures to your files. In that case, look at something like:
How to transform fields during deserialization using Serde?
How do I serialize an enum without including the name of the enum variant?
