Serde MsgPack versioning - rust

I need to serialize some data into files. For the sake of memory efficiency, I want to use the default compact serializer of MessagePack (MsgPack), as it only serializes field values w/o their names. I also want to be able to make changes to the data structure in future versions, which obviously can't be done w/o also storing some meta/versioning information. I imagine the most efficient way to do it is to simply use some "header" field for that purpose. Here is an example:
pub struct Data {
pub version: u8,
pub items: Vec<Item>,
}
pub struct Item {
pub field_a: i32,
pub field_b: String,
pub field_c: i16, // Added in version 3
}
Can I do something like that in rmp-serde (or maybe some other crate?) - to somehow annotate that a certain struct field should only be taken into account for specific file versions?

You can achieve this by writing a custom deserializer like this:
use serde::de::Error;
use serde::{Deserialize, Deserializer, Serialize};
#[derive(Serialize)]
pub struct Data {
pub version: u8,
pub items: Vec<Item>,
}
#[derive(Serialize)]
pub struct Item {
pub field_a: i32,
pub field_b: String,
pub field_c: i16, // Added in version 3
}
impl<'de> Deserialize<'de> for Data {
fn deserialize<D>(deserializer: D) -> Result<Self, D::Error>
where
D: Deserializer<'de>,
{
// Inner structs, used for deserializing only
#[derive(Deserialize)]
pub struct InnerData {
version: u8,
items: Vec<InnerItem>,
}
#[derive(Deserialize)]
pub struct InnerItem {
field_a: i32,
field_b: String,
field_c: Option<i16>, // Added in version 3 - note that this field is optional
}
// Deserializer the inner structs
let inner_data = InnerData::deserialize(deserializer)?;
// Get the version so we can add custom logic based on the version later on
let version = inner_data.version;
// Map the InnerData/InnerItem structs to Data/Item using our version based logic
Ok(Data {
version,
items: inner_data
.items
.into_iter()
.map(|item| {
Ok(Item {
field_a: item.field_a,
field_b: item.field_b,
field_c: if version < 3 {
42 // Default value
} else {
// Get the value of field_c
// If it's missing return an error, since it's required since version 3
// Otherwise return the value
item.field_c
.map_or(Err(D::Error::missing_field("field_c")), Ok)?
},
})
})
.collect::<Result<_, _>>()?,
})
}
}
Short explanation how the deserializer works:
We create a "dumb" inner struct which is a copy of your structs but the "new" fields are optional
We deserialize to the new inner structs
We map from our inner to our outer structs using version-based logic
If one of the new fields is missing in a new version we return a D::Error::missing_field error

Related

Serialize complicated data structure

I'm having trouble serializing the following struct. I have narrowed it down to, that the problem lies within the variable objects, containing trait structs within a HashMap. I'll try to explain my circumstances:
I have the following main struct, which i'm interested in obtaining data from:
#[derive(Serialize)]
pub struct System {
pub objects: HashMap<(u32, u32), Box<dyn CommonObjects>>,
pub paths: Vec<Path>,
pub loops: Vec<Loop>,
pub physics: Physics,
pub run_limit_loops: u32,
pub run_limit_paths: u32,
pub error_tol_loops: f64,
pub error_tol_paths: f64,
pub unknown_flow_outlets: u32,
pub unknown_pumps_id: u32,
pub system: u32,
}
The kind of structs which are contained within objects are as the following:
pub struct Outlet {
// Variables found in all structs contained within 'objects'
pub Q: f64,
pub hL: f64,
pub ob: u32,
pub id: u32,
pub active: bool,
pub system: u32,
pub con_Q: HashMap<u32, f64>,
// Variables found only in this struct.
pub p_dyn: f64,
pub flow_specified: String,
pub submerged: bool,
}
pub struct Pipe {
// Variables found in all structs contained within 'objects'
pub Q: f64,
pub hL: f64,
pub ob: u32,
pub id: u32,
pub active: bool,
pub system: u32,
pub con_Q: HashMap<u32, f64>,
// Variables found only in this struct.
pub Re: f64,
pub f: f64,
}
These struct has some variables which are only used within themselves, and some variables which are common for all structs contained within objects (p_dyn is only used within Outlet, while Q is used for all structs contained within objects) To get these variables, get-function are defined, both within the local struct, and by the trait CommonObject. All i am interested in, is serializing objects in order to get all the variables in a string format, both the common ones, and the ones only appearing locally within a struct, so that i can send the variables to other programs to further visualization.
In this following code the error occurs:
// Where the system struct originates from.
let systems: Vec<System> = system_analyse::analyse_system(data);
// I try to only serialize the objects contained within one of the system structs.
let jsonstringresult = serde_json::to_string(&systems[0].objects);
match jsonstringresult {
Ok(v) => {
println!("{:?}", &v);
v
},
Err(e) => {
// The error message originates from here.
println!("An error occured at serializing: {}", e);
String::new()
}
}
I get the following error:
An error occured at serializing: key must be a string
I have found this thread discussing the issue of serializing dynamic traits, and i've followed the instructions and added the following to the trait:
pub trait CommonObjects: erased_serde::Serialize {
...
}
serialize_trait_object!(CommonObjects);
Which makes the code compile in the first place. I've also found this site getting the same error, but the issue there seems to be with enums. Maybe the problem in this site is related to my problem, but i cant figure out how if so.
I'm open to all sort of feedback and even fundamentally change the structure of the code if so necessary.
A quick way to serialize a HashMap with non-string keys to Json is to use the serde_as macro from the serde_with crate.
use serde_with::serde_as;
#[serde_as]
#[derive(Serialize)]
pub struct System {
#[serde_as(as = "Vec<(_, _)>")]
pub objects: HashMap<(u32, u32), Box<dyn CommonObjects>>,
//...
}
The #[serde_as(as = "Vec<(_, _)>")] encodes the map as a sequence of tuples, representing pairs of keys and values. In Json, this will become an array of 2-element arrays.

Serialize a remote struct with private String

I need to serialize a struct from a remote crate and all of the fields in the struct are private. There are getter's implemented in the remote struct to get those values. I am following this guidance and got it to work just fine for primitive types. However, I'm struggling with how to implement this for non-primitive types (ie: String) that the remote struct contains.
Below is a small piece of what I've implemented to frame the issue. The DataWrapper struct simply wraps Data, where Data is the remote struct.
#[derive(Serialize)]
pub struct DataWrapper {
#[serde(with = "DataDef")]
pub data: Data,
}
#[derive(Serialize)]
#[serde(remote = "remote_crate::data::Data")]
pub struct DataDef {
#[serde(getter = "Data::image_id")] // This works
image_id: u16,
#[serde(getter = "Data::description")] // This does not work
description: String,
}
The error I get when compiling this is
#[derive(Serialize)]
^^^^^^^^^ expected struct `std::string::String`, found `&str`
This makes sense, since the getter Data::description returns &str rather than a String. But, I'm not seeing a way in my code to coerce this so the compiler is happy.
If I change DataDef::description to be &str instead of String, then I have to implement lifetimes. But, when I do that, the compiler then says the remote "struct takes 0 lifetime arguments".
Appreciate any tips on how I can serialize this and other non-primitive types.
One approach you could do, so that you have full control of the serialization. Is to have the data wrapper be a copy of the struct fields you need, instead of the entire remote struct. Then you can implement From<remote_crate::data::Data> for DataWrapper and use serde without trying to coerce types.
#[derive(Serialize)]
pub struct Data {
image_id: u16,
description: String,
}
impl From<remote_crate::data::Data> for Data {
fn from(val: remote_crate::data::Data) -> Self {
Self {
image_id: val.image_id,
description: val.description.to_string(),
}
}
}
// Then you could use it like this:
// let my_data: Data = data.into();
I couldn't get it to work with an &str, but if you're OK with an allocation, you can write a custom getter:
mod local {
use super::remote::RemoteData;
use serde::{Deserialize, Serialize};
fn get_owned_description(rd: &RemoteData) -> String {
rd.description().into()
}
#[derive(Serialize)]
#[serde(remote = "RemoteData")]
pub struct DataDef {
#[serde(getter = "get_owned_description")]
description: String,
}
}
mod remote {
use serde::{Deserialize, Serialize};
#[derive(Serialize, Deserialize)]
pub struct RemoteData {
description: String,
}
impl RemoteData {
pub fn description(&self) -> &str {
&self.description
}
}
}
https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=e9c7c0b069d7e16b6faac2fa2b840c72

How to set a field of a struct to a const value

Example struct that will be created and serialized often:
pub struct PriceMessage
{
pub message_type: String, // this will always be "price"
}
I want "message_type" to always equal e.g. "price" in order to save allocating a new string every time I create and serialize this struct using serde_json crate.
If you wanted to have a compile constant default string value for a field, that could at run time later be replaced by a heap allocated string, you could do something like this:
pub struct SomeStruct {
pub some_field: Cow<'static, str>,
}
impl SomeStruct {
const SOME_FIELD_DEFAULT: Cow<'static, str> = Cow::Borrowed("Foobar!");
pub fn new() -> Self {
Self {
some_field: Self::SOME_FIELD_DEFAULT,
}
}
}
However if you want to have an actually constant field, this doesn't really make much sense in rust, and you should consider just using an associated constant.

is it possible to auto ignore the struct fields when using Deserialize

I am using rust rocket as the http server side. On the server side, I define a rust struct like this to receive the data from client side:
#[derive(Deserialize, Serialize)]
#[allow(non_snake_case)]
pub struct PlayRecordRequest {
pub id: i64,
pub title: String,
pub url: String,
pub mvId: i64,
pub album: MusicAlbum,
pub artist: Vec<Artist>,
}
and recieve in the http controller like this:
#[post("/user/v1/save-play-record", data = "<record>")]
pub fn save_play_record<'r>(
record: Json<PlayRecordRequest>,
) -> content::Json<String> {
info!("save play record,title:{}",record.title);
save_play_record_impl(record);
let res = ApiResponse {
result: "ok".to_string(),
..Default::default()
};
let result_json = serde_json::to_string(&res).unwrap();
return content::Json(result_json);
}
the problem is that when the client side did not have some fields, the code run into error. is it possible to auto fit the field for Deserialize, if the client have the field, Deserialize it normally, if the client did not contains some fields, just ignore it and do not run into error. I read the official document of serde and find the skip annotation. But the annotation just use to mark the field should be ignored, what I want is that the struct could auto fit all field exists or not. is it possible to do like this?
There are two ways to handle this.
First is with options. Options imply that the data may or may not exist. If the data is null or missing it convert to an Option::none value. You can preserve the lack of data on serialization if you add #[serde(skip_serializing_if = "Option::is_none")]
Second option is to apply defaults to the value if the data is missing. Although from your use case this doesn't seem to be ideal.
Here is a code snippet of the two cases that you can run on https://play.rust-lang.org/:
use::serde::{Deserialize, Serialize};
#[derive(Deserialize, Serialize, Debug)]
#[allow(non_snake_case)]
pub struct Foo {
#[serde(skip_serializing_if = "Option::is_none")]
pub bar: Option<i64>,
#[serde(default = "default_baz")]
pub baz: i64,
}
fn default_baz()-> i64{
3
}
fn main(){
let data = r#"{"bar": null, "baz": 1}"#;
let v: Foo = serde_json::from_str(data).unwrap();
println!("{:#?}", v);
let s = serde_json::to_string(&v).unwrap();
println!("Don't serialize None values: {:#?}", s);
let data = r#"{"missing_bar": null}"#;
let v: Foo = serde_json::from_str(data).unwrap();
println!("{:#?}", v)
}

How do I avoid generating JSON when serializing a value that is null or a default value?

The serde_json::to_string() function will generate a string which may include null for an Option<T>, or 0 for a u32. This makes the output larger, so I want to ignore these sorts of values.
I want to simplify the JSON string output of the following structure:
use serde_derive::Serialize; // 1.0.82
#[derive(Serialize)]
pub struct WeightWithOptionGroup {
pub group: Option<String>,
pub proportion: u32,
}
When group is None and proportion is 0, the JSON string should be "{}"
Thanks for the answerHow do I change Serde's default implementation to return an empty object instead of null?, it can resolve Optionproblem, but for 0 there is none solution.
The link Skip serializing field give me the answer.
And the fixed code:
#[derive(Debug, Clone, Serialize, Deserialize, Default, PartialEq, Ord, PartialOrd, Eq)]
pub struct WeightWithOptionGroup {
#[serde(skip_serializing_if = "Option::is_none")]
#[serde(default)]
pub group: Option<String>,
#[serde(skip_serializing_if = "is_zero")]
#[serde(default)]
pub proportion: u32,
}
/// This is only used for serialize
#[allow(clippy::trivially_copy_pass_by_ref)]
fn is_zero(num: &u32) -> bool {
*num == 0
}
There's a couple of ways you could do this:
Mark each of the fields with a skip_serialising_if attribute to say when to skip them. This is much easier, but you'll have to remember to do it for every field.
Write your own Serde serialiser that does this custom JSON form. This is more work, but shouldn't be too bad, especially given you can still use the stock JSON deserialiser.
For those searching how to skip serialization for some enum entries you can do this
#[derive(Serialize, Deserialize)]
enum Metadata<'a> {
App, // want this serialized
Ebook, // want this serialized
Empty // dont want this serialized
}
#[derive(Serialize, Deserialize)]
struct Request<'a> {
request_id: &'a str,
item_type: ItemType,
#[serde(skip_serializing_if = "metadata_is_empty")]
metadata: Metadata<'a>,
}
fn metadata_is_empty<'a>(metadata: &Metadata<'a>) -> bool {
match metadata {
Metadata::Empty => true,
_ => false
}
}

Resources