I would like to deserialize a wire format, like this JSON, into the Data structure below, but I am failing to write the serde Deserialize implementations for the corresponding Rust types.
{ "type": "TypeA", "value": { "id": "blah", "content": "0xa1b.." } }
enum Content {
TypeA(Vec<u8>),
TypeB(BigInt),
}
struct Value {
id: String,
content: Content,
}
struct Data {
typ: String,
value: Value,
}
The difficulty is selecting the correct value of the Content enumeration, which is based on the typ value.
As far as I know, deserialization in serde is stateless, and hence there is no way of either
knowing what the value of typ is at the time of deserialization of content (even though the deserialization order is guaranteed)
or injecting the value of typ in the deserializer then collecting it.
How can this be achieved with serde?
I have looked at
serde_state but I cannot get the macros working and this library is wrapping serde, which worries me
DeserializeSeed but my understanding is that it must be used in place of Deserialize for all types and my data model is big
The existing SO answers usually exploit the fact that the related fields are at the same level. This is not the case here: the actual data model is big, deep and the fields are "far apart"
This is much simpler using tagging, at the cost of changing your data structure:
use serde::{Deserialize, Deserializer}; // 1.0.130
use serde_json; // 1.0.67
#[derive(Debug, Deserialize)]
#[serde(tag = "type", content = "value")]
enum Data {
TypeA(Value<String>),
TypeB(Value<u32>),
}
#[derive(Debug, Deserialize)]
struct Value<T> {
id: String,
content: T,
}
fn main() {
let input = r#"{"type": "TypeA", "value": { "id": "blah", "content": "0xa1b..."}}"#;
let data: Data = serde_json::from_str(input).unwrap();
println!("{:?}", data);
}
Also, you can write your own custom deserializer using an intermediary serde_json::Value:
use serde::{Deserialize, Deserializer};// 1.0.130
use serde_json; // 1.0.67
#[derive(Debug)]
enum Content {
TypeA(String),
TypeB(String),
}
#[derive(Debug)]
struct Value {
id: String,
content: Content,
}
#[derive(Debug)]
struct Data {
typ: String,
value: Value,
}
impl<'de> Deserialize<'de> for Data {
fn deserialize<D>(deserializer: D) -> Result<Self, D::Error>
where
D: Deserializer<'de>,
{
let json: serde_json::value::Value = serde_json::value::Value::deserialize(deserializer)?;
let typ = json.get("type").expect("type").as_str().unwrap();
let value = json.get("value").expect("value");
let id = value.get("id").expect("id").as_str().unwrap();
let content = value.get("content").expect("content").as_str().unwrap();
Ok(Data {
typ: typ.to_string(),
value: Value {
id: id.to_string(),
content: {
match typ {
"TypeA" => Content::TypeA(content.to_string()),
"TypeB" => Content::TypeB(content.to_string()),
_ => panic!("Invalid type, but this should be an error not a panic"),
}
}
}
})
}
}
fn main() {
let input = r#"{"type": "TypeA", "value": { "id": "blah", "content": "0xa1b..."}}"#;
let data: Data = serde_json::from_str(input).unwrap();
println!("{:?}", data);
}
Disclaimer: I didn't handle errors correctly, and you could also extract the content matching into a function, for example. The above code is just to illustrate the main idea.
There are a few different ways this can be solved, e.g. with a custom impl Deserialize for Data that deserializes into a serde_json::Value and then manually juggles between the types.
For a rough example of that, check out this answer that I wrote in the past. It's not a one-to-one solution, but it might give some hints for implementing Deserialize manually for what you want.
That being said, I personally prefer to minimize how often I have to impl Deserialize manually, and instead deserialize into another type and have it converted automatically using #[serde(from = "FromType")].
First, instead of type_: String, I'd suggest we introduce enum ContentType.
#[derive(Deserialize, Clone, Copy, Debug)]
enum ContentType {
TypeA,
TypeB,
TypeC,
TypeD,
}
Now, let's consider the types you introduced. I've added a few extra variants to Content, as you mentioned the variants can be different.
#[derive(Deserialize, Debug)]
#[serde(untagged)]
enum Content {
TypeA(Vec<u8>),
TypeB(Vec<u8>),
TypeC(String),
TypeD { foo: i32, bar: i32 },
}
#[derive(Deserialize, Debug)]
struct Value {
id: String,
content: Content,
}
#[derive(Deserialize, Debug)]
#[serde(try_from = "IntermediateData")]
struct Data {
#[serde(alias = "type")]
type_: ContentType,
value: Value,
}
Nothing crazy yet or much different. All the "magic" happens in the IntermediateData type, along with the impl TryFrom.
First, let's introduce a check_type(), which takes a ContentType and checks it against the Content. If the Content variant doesn't match the ContentType variant, then convert it.
In short, when using #[serde(untagged)], serde tries the variants in order and returns the first one that deserializes successfully, if any. So if it can deserialize a Vec<u8>, it will always produce Content::TypeA(). Knowing this, in check_type(), if the ContentType is TypeB and the Content is TypeA, we simply change it to TypeB.
impl Content {
// TODO: impl proper error type instead of `String`
fn check_type(self, type_: ContentType) -> Result<Self, String> {
match (type_, self) {
(ContentType::TypeA, content @ Self::TypeA(_)) => Ok(content),
(ContentType::TypeB, Self::TypeA(content)) => Ok(Self::TypeB(content)),
(ContentType::TypeC | ContentType::TypeD, content) => Ok(content),
(type_, content) => Err(format!(
"unexpected combination of {:?} and {:?}",
type_, content
)),
}
}
}
Now all we need is the intermediate IntermediateData, along with a TryFrom conversion, which calls check_type() on the Content.
#[derive(Deserialize, Debug)]
struct IntermediateData {
#[serde(alias = "type")]
type_: ContentType,
value: Value,
}
impl TryFrom<IntermediateData> for Data {
// TODO: impl proper error type instead of `String`
type Error = String;
fn try_from(mut data: IntermediateData) -> Result<Self, Self::Error> {
data.value.content = data.value.content.check_type(data.type_)?;
Ok(Data {
type_: data.type_,
value: data.value,
})
}
}
That's all. Now we can test it against the following:
// serde = { version = "1", features = ["derive"] }
// serde_json = "1.0"
use std::convert::TryFrom;
use serde::Deserialize;
// ... all the previous code ...
fn main() {
let json = r#"{ "type": "TypeA", "value": { "id": "foo", "content": [0, 1, 2, 3] } }"#;
let data: Data = serde_json::from_str(json).unwrap();
println!("{:#?}", data);
let json = r#"{ "type": "TypeB", "value": { "id": "foo", "content": [0, 1, 2, 3] } }"#;
let data: Data = serde_json::from_str(json).unwrap();
println!("{:#?}", data);
let json = r#"{ "type": "TypeC", "value": { "id": "bar", "content": "foo" } }"#;
let data: Data = serde_json::from_str(json).unwrap();
println!("{:#?}", data);
let json = r#"{ "type": "TypeD", "value": { "id": "baz", "content": { "foo": 1, "bar": 2 } } }"#;
let data: Data = serde_json::from_str(json).unwrap();
println!("{:#?}", data);
}
Then it correctly results in Datas with Content::TypeA, Content::TypeB, Content::TypeC, and the last one Content::TypeD.
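The two TODO comments above ask for a proper error type in place of String. Here is a minimal sketch of one way to do that with only the standard library. The name TypeMismatch and the helper variant_name() are hypothetical (they are not part of the code above), and the serde derives are left out so that the block compiles on its own. As far as I know, #[serde(try_from = "...")] only needs the conversion error to implement Display, so a type like this should be usable as the Error of the TryFrom impl.

```rust
use std::error::Error;
use std::fmt;

#[derive(Clone, Copy, Debug, PartialEq)]
enum ContentType {
    TypeA,
    TypeB,
    TypeC,
    TypeD,
}

#[derive(Debug, PartialEq)]
enum Content {
    TypeA(Vec<u8>),
    TypeB(Vec<u8>),
    TypeC(String),
    TypeD { foo: i32, bar: i32 },
}

/// Hypothetical error type standing in for the `String` error.
#[derive(Debug, PartialEq)]
struct TypeMismatch {
    expected: ContentType,
    found: &'static str,
}

impl fmt::Display for TypeMismatch {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "type is {:?} but content parsed as {}", self.expected, self.found)
    }
}

impl Error for TypeMismatch {}

impl Content {
    /// Hypothetical helper: the variant name, used in error messages.
    fn variant_name(&self) -> &'static str {
        match self {
            Content::TypeA(_) => "TypeA",
            Content::TypeB(_) => "TypeB",
            Content::TypeC(_) => "TypeC",
            Content::TypeD { .. } => "TypeD",
        }
    }

    /// Same matching rules as the `check_type()` above, with a typed error.
    fn check_type(self, type_: ContentType) -> Result<Self, TypeMismatch> {
        match (type_, self) {
            (ContentType::TypeA, content @ Content::TypeA(_)) => Ok(content),
            (ContentType::TypeB, Content::TypeA(bytes)) => Ok(Content::TypeB(bytes)),
            (ContentType::TypeC | ContentType::TypeD, content) => Ok(content),
            (expected, content) => Err(TypeMismatch {
                expected,
                found: content.variant_name(),
            }),
        }
    }
}

fn main() {
    // Untagged parsing produced TypeA, but the tag says TypeB: converted.
    println!("{:?}", Content::TypeA(vec![0, 1, 2]).check_type(ContentType::TypeB));
    // The tag says TypeA, but the content parsed as an object: rejected.
    match (Content::TypeD { foo: 1, bar: 2 }).check_type(ContentType::TypeA) {
        Ok(content) => println!("{:?}", content),
        Err(err) => println!("error: {}", err),
    }
}
```

With this in place, the caller can match on the fields of the error instead of parsing a message string.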
Lastly, there is issue #939, which talks about adding a #[serde(validate = "...")] attribute. However, it was created in 2017, so I wouldn't hold my breath on it.
DeserializeSeed can be mixed with normal Deserialize code. It does not need to be used for all types. Here, it is enough to use it to deserialize Value.
use serde::de::{DeserializeSeed, IgnoredAny, MapAccess, Visitor};
use serde::*;
use std::fmt;
#[derive(Debug)]
enum ContentType {
A,
B,
Unknown,
}
#[derive(Debug)]
enum Content {
TypeA(String),
TypeB(i32),
}
#[derive(Debug)]
struct Value {
id: String,
content: Content,
}
#[derive(Debug)]
struct Data {
typ: String,
value: Value,
}
impl<'de> Deserialize<'de> for Data {
fn deserialize<D>(deserializer: D) -> Result<Self, D::Error>
where
D: Deserializer<'de>,
{
struct DataVisitor;
impl<'de> Visitor<'de> for DataVisitor {
type Value = Data;
fn expecting(&self, formatter: &mut fmt::Formatter) -> fmt::Result {
formatter.write_str("struct Data")
}
fn visit_map<A>(self, mut access: A) -> Result<Self::Value, A::Error>
where
A: MapAccess<'de>,
{
let mut typ = None;
let mut value = None;
while let Some(key) = access.next_key()? {
match key {
"type" => {
typ = Some(access.next_value()?);
}
"value" => {
let seed = match typ.as_deref() {
Some("TypeA") => ContentType::A,
Some("TypeB") => ContentType::B,
_ => ContentType::Unknown,
};
value = Some(access.next_value_seed(seed)?);
}
_ => {
access.next_value::<IgnoredAny>()?;
}
}
}
Ok(Data {
typ: typ.unwrap(),
value: value.unwrap(),
})
}
}
deserializer.deserialize_map(DataVisitor)
}
}
impl<'de> DeserializeSeed<'de> for ContentType {
type Value = Value;
fn deserialize<D>(self, deserializer: D) -> Result<Self::Value, D::Error>
where
D: Deserializer<'de>,
{
struct ValueVisitor(ContentType);
impl<'de> Visitor<'de> for ValueVisitor {
type Value = Value;
fn expecting(&self, formatter: &mut fmt::Formatter) -> fmt::Result {
formatter.write_str("struct Value")
}
fn visit_map<A>(self, mut access: A) -> Result<Self::Value, A::Error>
where
A: MapAccess<'de>,
{
let mut id = None;
let mut content = None;
while let Some(key) = access.next_key()? {
match key {
"id" => {
id = Some(access.next_value()?);
}
"content" => {
content = Some(match self.0 {
ContentType::A => Content::TypeA(access.next_value()?),
ContentType::B => Content::TypeB(access.next_value()?),
ContentType::Unknown => {
panic!("Should not happen if type happens to occur before value, but JSON is unordered.");
}
});
}
_ => {
access.next_value::<IgnoredAny>()?;
}
}
}
Ok(Value {
id: id.unwrap(),
content: content.unwrap(),
})
}
}
deserializer.deserialize_map(ValueVisitor(self))
}
}
fn main() {
let j = r#"{"type": "TypeA", "value": {"id": "blah", "content": "0xa1b.."}}"#;
dbg!(serde_json::from_str::<Data>(j).unwrap());
let j = r#"{"type": "TypeB", "value": {"id": "blah", "content": 666}}"#;
dbg!(serde_json::from_str::<Data>(j).unwrap());
let j = r#"{"type": "TypeB", "value": {"id": "blah", "content": "Foobar"}}"#;
dbg!(serde_json::from_str::<Data>(j).unwrap_err());
}
The main downside of this solution is that you lose the possibility of deriving the code.
I need to deserialize a JSON array of a type, let's call it Foo. I have implemented this and it works well for most inputs, but I have noticed that the latest version of the data will sometimes include erroneous empty objects.
Prior to this change, each Foo can be de-serialized to the following enum:
#[derive(Deserialize)]
#[serde(untagged)]
pub enum Foo<'s> {
Error {
// My current workaround is using Option<Cow<'s, str>>
error: Cow<'s, str>,
},
Value {
a: u32,
b: i32,
// etc.
}
}
/// Foo is part of a larger struct Bar.
#[derive(Deserialize)]
pub struct Bar<'s> {
foos: Vec<Foo<'s>>,
// etc.
}
This struct may represent one of the following JSON values:
// Valid inputs
[]
[{"a": 34, "b": -23},{"a": 33, "b": -2},{"a": 37, "b": 1}]
[{"error":"Unable to connect to network"}]
[{"a": 34, "b": -23},{"error":"Timeout"},{"a": 37, "b": 1}]
// Possible input for latest versions of data
[{},{},{},{},{},{},{"a": 34, "b": -23},{},{},{},{},{},{},{},{"error":"Timeout"},{},{},{},{},{},{}]
This does not happen very often, but it is enough to cause issues. Normally, the array should include 3 or fewer entries, but these extraneous empty objects break that convention. There is no meaningful information I can gain from parsing {}, and in the worst cases there can be hundreds of them in one array.
I do not want to error on parsing {}, as the array still contains other meaningful values, but I do not want to include {} in my parsed data either. Ideally I would also be able to use tinyvec::ArrayVec<[Foo<'s>; 3]> instead of a Vec<Foo<'s>> to save memory and reduce the time spent on allocation during parsing, but I am unable to because of this issue.
How can I skip {} JSON values when deserializing an array with serde in Rust?
I also put together a Rust Playground with some test cases to try different solutions.
serde_with::VecSkipError provides a way to skip any elements that fail deserialization. It ignores all errors, not only the empty object {}, so it might be too permissive.
#[serde_with::serde_as]
#[derive(Deserialize)]
pub struct Bar<'s> {
#[serde_as(as = "serde_with::VecSkipError<_>")]
foos: Vec<Foo<'s>>,
}
The simplest, but not performant, solution would be to define an enum that captures both the Foo case and the empty case, deserialize into a vector of those, and then filter that vector to get just the nonempty ones.
#[derive(Deserialize, Debug)]
#[serde(untagged)]
pub enum FooDe<'s> {
Nonempty(Foo<'s>),
Empty {},
}
fn main() {
let json = r#"[
{},{},{},{},{},{},
{"a": 34, "b": -23},
{},{},{},{},{},{},{},
{"error":"Timeout"},
{},{},{},{},{},{}
]"#;
let foo_des = serde_json::from_str::<Vec<FooDe>>(json).unwrap();
let foos = foo_des
.into_iter()
.filter_map(|item| {
use FooDe::*;
match item {
Nonempty(foo) => Some(foo),
Empty {} => None,
}
})
.collect();
let bar = Bar { foos };
println!("{:?}", bar);
// Bar { foos: [Value { a: 34, b: -23 }, Error { error: "Timeout" }] }
}
Conceptually this is simple but you're allocating a lot of space for Empty cases that you ultimately don't need. Instead, you can control exactly how deserialization is done by implementing it yourself.
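// Imports used by the manual implementation below
// (serde::Deserialize is assumed to be in scope already).
use serde::de::{Deserializer, SeqAccess, Visitor};
use std::fmt;
use std::marker::PhantomData;
struct BarVisitor<'s> {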
marker: PhantomData<fn() -> Bar<'s>>,
}
impl<'s> BarVisitor<'s> {
fn new() -> Self {
BarVisitor {
marker: PhantomData,
}
}
}
// This is the trait that informs Serde how to deserialize Bar.
impl<'de, 's: 'de> Deserialize<'de> for Bar<'s> {
fn deserialize<D>(deserializer: D) -> Result<Self, D::Error>
where
D: Deserializer<'de>,
{
impl<'de, 's: 'de> Visitor<'de> for BarVisitor<'s> {
// The type that our Visitor is going to produce.
type Value = Bar<'s>;
fn expecting(&self, formatter: &mut fmt::Formatter) -> fmt::Result {
formatter.write_str("a list of objects")
}
fn visit_seq<V>(self, mut access: V) -> Result<Self::Value, V::Error>
where
V: SeqAccess<'de>,
{
let mut foos = Vec::new();
while let Some(foo_de) = access.next_element::<FooDe>()? {
if let FooDe::Nonempty(foo) = foo_de {
foos.push(foo)
}
}
let bar = Bar { foos };
Ok(bar)
}
}
// Instantiate our Visitor and ask the Deserializer to drive
// it over the input data, resulting in an instance of Bar.
deserializer.deserialize_seq(BarVisitor::new())
}
}
fn main() {
let json = r#"[
{},{},{},{},{},{},
{"a": 34, "b": -23},
{},{},{},{},{},{},{},
{"error":"Timeout"},
{},{},{},{},{},{}
]"#;
let bar = serde_json::from_str::<Bar>(json).unwrap();
println!("{:?}", bar);
// Bar { foos: [Value { a: 34, b: -23 }, Error { error: "Timeout" }] }
}
I have a 5GB JSON file which is an array of objects with fixed structure:
[
{
"first": "John",
"last": "Doe",
"email": "john.doe@yahoo.com"
},
{
"first": "Anne",
"last": "Ortha",
"email": "anne.ortha@hotmail.com"
},
....
]
I know that I can try to parse this file using the code shown in How can I deserialize JSON with a top-level array using Serde?:
use serde::{Deserialize, Serialize};
#[derive(Serialize, Deserialize, Debug)]
struct User {
first: String,
last: String,
email: String,
}
let users: Vec<User> = serde_json::from_str(file)?;
There are multiple problems:
It is first read as a string as a whole
After reading as string, it converts it into a vector of User structs (I don't want that)
I tried How can I lazily read multiple JSON values from a file/stream in Rust?, but it reads the whole file before printing anything, and it prints the whole structure at once inside the loop. I was expecting one object at a time in the loop.
Ideally, parsing and processing of the (parsed) User object should happen simultaneously in two separate threads/tasks/routines, or by making use of a channel.
Streaming elements from a JSON array is possible, but requires some legwork. You must skip the leading [ and the intermittent , yourself, as well as detect the final ]. To parse individual array elements you need to use StreamDeserializer and extract a single item from it (so you can drop it and regain control of the IO reader). For example:
use serde::de::DeserializeOwned;
use serde_json::{self, Deserializer};
use std::io::{self, Read};
fn read_skipping_ws(mut reader: impl Read) -> io::Result<u8> {
loop {
let mut byte = 0u8;
reader.read_exact(std::slice::from_mut(&mut byte))?;
if !byte.is_ascii_whitespace() {
return Ok(byte);
}
}
}
fn invalid_data(msg: &str) -> io::Error {
io::Error::new(io::ErrorKind::InvalidData, msg)
}
fn deserialize_single<T: DeserializeOwned, R: Read>(reader: R) -> io::Result<T> {
let next_obj = Deserializer::from_reader(reader).into_iter::<T>().next();
match next_obj {
Some(result) => result.map_err(Into::into),
None => Err(invalid_data("premature EOF")),
}
}
fn yield_next_obj<T: DeserializeOwned, R: Read>(
mut reader: R,
at_start: &mut bool,
) -> io::Result<Option<T>> {
if !*at_start {
*at_start = true;
if read_skipping_ws(&mut reader)? == b'[' {
// read the next char to see if the array is empty
let peek = read_skipping_ws(&mut reader)?;
if peek == b']' {
Ok(None)
} else {
deserialize_single(io::Cursor::new([peek]).chain(reader)).map(Some)
}
} else {
Err(invalid_data("`[` not found"))
}
} else {
match read_skipping_ws(&mut reader)? {
b',' => deserialize_single(reader).map(Some),
b']' => Ok(None),
_ => Err(invalid_data("`,` or `]` not found")),
}
}
}
pub fn iter_json_array<T: DeserializeOwned, R: Read>(
mut reader: R,
) -> impl Iterator<Item = Result<T, io::Error>> {
let mut at_start = false;
std::iter::from_fn(move || yield_next_obj(&mut reader, &mut at_start).transpose())
}
Example usage:
fn main() {
let data = r#"[
{
"first": "John",
"last": "Doe",
"email": "john.doe@yahoo.com"
},
{
"first": "Anne",
"last": "Ortha",
"email": "anne.ortha@hotmail.com"
}
]"#;
use serde::{Deserialize, Serialize};
#[derive(Serialize, Deserialize, Debug)]
struct User {
first: String,
last: String,
email: String,
}
for user in iter_json_array(io::Cursor::new(&data)) {
let user: User = user.unwrap();
println!("{:?}", user);
}
}
When using it in production, you'd open it as File instead of reading it to a string. As always, don't forget to wrap the File in a BufReader.
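To make the BufReader advice concrete: read_skipping_ws() above pulls one byte at a time from the reader, and on a bare File each of those reads would typically go to the operating system separately. The sketch below uses only the standard library to count how often the underlying source is touched with and without buffering. CountingReader, drain_bytewise and count_source_reads are hypothetical names, and an in-memory slice stands in for the File.

```rust
use std::cell::Cell;
use std::io::{self, BufReader, Read};
use std::rc::Rc;

/// Hypothetical wrapper that counts calls to the underlying `read()`.
struct CountingReader<R> {
    inner: R,
    calls: Rc<Cell<usize>>,
}

impl<R: Read> Read for CountingReader<R> {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        self.calls.set(self.calls.get() + 1);
        self.inner.read(buf)
    }
}

/// Consumes the reader one byte at a time, like `read_skipping_ws()` does.
fn drain_bytewise(mut reader: impl Read) -> io::Result<usize> {
    let mut total = 0;
    let mut byte = 0u8;
    while reader.read(std::slice::from_mut(&mut byte))? == 1 {
        total += 1;
    }
    Ok(total)
}

/// Returns how many times the source itself was read.
fn count_source_reads(data: &[u8], buffered: bool) -> usize {
    let calls = Rc::new(Cell::new(0));
    let source = CountingReader {
        inner: data,
        calls: Rc::clone(&calls),
    };
    let drained = if buffered {
        drain_bytewise(BufReader::new(source))
    } else {
        drain_bytewise(source)
    };
    drained.expect("reading from memory should not fail");
    calls.get()
}

fn main() {
    let data = vec![b' '; 1000];
    println!("source reads, unbuffered: {}", count_source_reads(&data, false));
    println!("source reads, buffered:   {}", count_source_reads(&data, true));
}
```

In the unbuffered case every byte costs one read of the source; with BufReader the source is read in large chunks and the byte-wise reads are served from memory.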
This is not directly possible as of serde_json 1.0.66.
One workaround suggested is to implement your own Visitor that uses a channel. As deserialization of the array progresses, each element is pushed down the channel. The receiving side of the channel can then grab each element and process it, freeing up space for the deserialization to push in another value.
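The channel half of that workaround can be sketched with the standard library alone. In the sketch below the producer thread stands in for the custom Visitor: in real code the tx.send() call would sit inside visit_seq(), once per parsed element, and the input iterator stands in for elements coming off the parser. The names User and stream_users and the capacity parameter are assumptions for illustration. A bounded sync_channel makes the producer wait when the consumer falls behind, so only a few elements are queued at any time.

```rust
use std::sync::mpsc::sync_channel;
use std::thread;

#[derive(Debug)]
struct User {
    first: String,
    last: String,
}

/// Hypothetical helper: pushes each "parsed" element down a bounded channel
/// from one thread and processes it on the calling thread.
fn stream_users<I>(parsed: I, capacity: usize) -> Vec<String>
where
    I: IntoIterator<Item = User> + Send + 'static,
{
    let (tx, rx) = sync_channel::<User>(capacity);

    // Producer: stands in for the Visitor that sends elements as it parses.
    let producer = thread::spawn(move || {
        for user in parsed {
            // send() blocks while the channel is full (backpressure) and
            // fails only if the receiver has gone away.
            if tx.send(user).is_err() {
                break;
            }
        }
        // `tx` is dropped here, which ends the receiver's iteration.
    });

    // Consumer: handles each element as soon as it arrives.
    let processed: Vec<String> = rx
        .iter()
        .map(|user| format!("{} {}", user.first, user.last))
        .collect();

    producer.join().expect("producer thread panicked");
    processed
}

fn main() {
    let parsed = vec![
        User { first: "John".to_string(), last: "Doe".to_string() },
        User { first: "Anne".to_string(), last: "Ortha".to_string() },
    ];
    for line in stream_users(parsed, 1) {
        println!("{}", line);
    }
}
```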
I have the following data structure which should be able to hold either a String, a u64 value, a boolean value, or a String vector.
#[derive(Serialize, Deserialize)]
pub enum JsonRpcParam {
String(String),
Int(u64),
Bool(bool),
Vec(Vec<String>)
}
The use case for this data structure is to build JSON RPC parameters which can have multiple types, so I would be able to build a parameter list like this:
let mut params : Vec<JsonRpcParam> = Vec::new();
params.push(JsonRpcParam::String("Test".to_string()));
params.push(JsonRpcParam::Bool(true));
params.push(JsonRpcParam::Int(64));
params.push(JsonRpcParam::Vec(vec![String::from("abc"), String::from("cde")]));
My problem is now the serialization. I am using serde_json for the serialization part. The default serialization of the vector posted above yields:
[
{
"String":"Test"
},
{
"Bool":true
},
{
"Int":64
},
{
"Vec":[
"abc",
"cde"
]
}
]
Instead, I would like the serialization to look like this:
[
"Test",
true,
64,
["abc","cde"]
]
I attempted to implement a custom serialize method for this type, but I don't know how to achieve what I want. My attempt looks like this:
impl Serialize for JsonRpcParam {
fn serialize<S>(&self, serializer: S) -> Result<S::Ok, S::Error>
where
S: serde::Serializer {
match *self {
JsonRpcParam::String(x) => serializer.serialize_str(x),
JsonRpcParam::Int(x) => serializer.serialize_u64(x),
JsonRpcParam::Bool(x) => serializer.serialize_bool(x),
JsonRpcParam::Vec(x) => _
}
}
}
Instead of manually implementing Serialize, you can use #[serde(untagged)].
In your case that will work perfectly fine. However, be warned that if the enum variant isn't unique and can't be clearly identified from the JSON, then it will deserialize into the first variant that matches. In short if you also have e.g. a subsequent JsonRpcParam::OtherString(String), then that will deserialize into JsonRpcParam::String(String).
#[derive(Serialize, Deserialize, Debug)]
#[serde(untagged)]
pub enum JsonRpcParam {
String(String),
Int(u64),
Bool(bool),
Vec(Vec<String>),
}
If you now use e.g. serde_json::to_string_pretty(), then it will yield an output in your desired format:
fn main() {
let mut params: Vec<JsonRpcParam> = Vec::new();
params.push(JsonRpcParam::String("Test".to_string()));
params.push(JsonRpcParam::Bool(true));
params.push(JsonRpcParam::Int(64));
params.push(JsonRpcParam::Vec(vec![
String::from("abc"),
String::from("cde"),
]));
let json = serde_json::to_string_pretty(&params).unwrap();
println!("{}", json);
}
Output:
[
"Test",
true,
64,
[
"abc",
"cde"
]
]
I'm quite new to Rust and come from an OOP background, so maybe I have misunderstood some Rust basics.
I want to parse a fixed json-structure with serde. This structure represents one of different messages types. Each message has a numeric type attribute to distinguish it. The exact structure of the individual message types varies mostly, but they can also be the same.
{"type": 1, "sender_id": 4, "name": "sender", ...}
{"type": 2, "sender_id": 5, "measurement": 3.1415, ...}
{"type": 3, "sender_id": 6, "measurement": 13.37, ...}
...
First of all, I defined an enum to distinguish between message types, and also a struct for each type of message without a field storing the type.
#[derive(Debug, Serialize, Deserialize)]
#[serde(tag = "type")]
enum Message {
T1(Type1),
T2(Type2),
T3(Type3),
// ...
}
#[derive(Debug, Serialize, Deserialize)]
struct Type1 {
sender_id: u32,
name: String,
// ...
}
#[derive(Debug, Serialize, Deserialize)]
struct Type2 {
sender_id: u32,
measurement: f64,
// ...
}
#[derive(Debug, Serialize, Deserialize)]
struct Type3 {
sender_id: u32,
measurement: f64,
// ...
}
// ...
When I try to turn a string to a Message object, I get an error.
let message = r#"{"type":1,"sender_id":123456789,"name":"sender"}"#;
let message: Message = serde_json::from_str(message)?; // error here
// Error: Custom { kind: InvalidData, error: Error("invalid type: integer `1`, expected variant identifier", line: 1, column: 9) }
So, as I understand it, serde tries to figure out the type of the current message, but it needs a string for that. I also tried to write my own deserialize() function: I tried to read the numerical value of the type key and create the specific object based on it.
How do I have to implement deserialize() to extract the type of the message and create the specific message object? Is it possible to do this without writing a deserialize() function for each Type1/2/3/... struct?
impl<'de> Deserialize<'de> for Message {
fn deserialize<D>(deserializer: D) -> Result<Self, D::Error>
where D: Deserializer<'de>,
{
// which functions do I have to call?
}
}
Or is there a better solution to achieve my deserialization?
I prepared a playground for this issue.
Serde doesn't support integer tags yet (see issue #745).
If you're able to change what's producing the data, and can turn type into a string (i.e. "1" instead of 1), then you can get it working simply by using #[serde(rename)].
#[derive(Debug, Serialize, Deserialize)]
#[serde(tag = "type")]
enum Message {
#[serde(rename = "1")]
T1(Type1),
#[serde(rename = "2")]
T2(Type2),
#[serde(rename = "3")]
T3(Type3),
// ...
}
If that's not an option, then you indeed need to create a custom deserializer. The shortest in terms of code, is likely to deserialize into a serde_json::Value, and then match on the type, and deserialize the serde_json::Value into the correct Type{1,2,3}.
use serde_json::Value;
impl<'de> serde::Deserialize<'de> for Message {
fn deserialize<D: serde::Deserializer<'de>>(d: D) -> Result<Self, D::Error> {
let value = Value::deserialize(d)?;
Ok(match value.get("type").and_then(Value::as_u64).unwrap() {
1 => Message::T1(Type1::deserialize(value).unwrap()),
2 => Message::T2(Type2::deserialize(value).unwrap()),
3 => Message::T3(Type3::deserialize(value).unwrap()),
type_ => panic!("unsupported type {:?}", type_),
})
}
}
You'll probably want to perform some proper error handling, instead of unwrapping and panicking.
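Part of that error handling needs nothing from serde: the numeric tag can be turned into a Result up front. Here is a minimal sketch, where MessageKind and UnsupportedType are hypothetical names that are not part of the code above. Inside the Deserialize impl, the resulting error could then be handed to serde::de::Error::custom(), which accepts any Display value, instead of panicking.

```rust
use std::convert::TryFrom;
use std::fmt;

/// Hypothetical enum naming the supported values of the "type" field.
#[derive(Clone, Copy, Debug, PartialEq)]
enum MessageKind {
    T1,
    T2,
    T3,
}

/// Hypothetical error carrying the tag that was not recognised.
#[derive(Debug, PartialEq)]
struct UnsupportedType(u64);

impl fmt::Display for UnsupportedType {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "unsupported message type {}", self.0)
    }
}

impl std::error::Error for UnsupportedType {}

impl TryFrom<u64> for MessageKind {
    type Error = UnsupportedType;

    fn try_from(tag: u64) -> Result<Self, Self::Error> {
        match tag {
            1 => Ok(MessageKind::T1),
            2 => Ok(MessageKind::T2),
            3 => Ok(MessageKind::T3),
            other => Err(UnsupportedType(other)),
        }
    }
}

fn main() {
    for tag in [1u64, 3, 7].iter().copied() {
        match MessageKind::try_from(tag) {
            Ok(kind) => println!("tag {} maps to {:?}", tag, kind),
            Err(err) => println!("tag {} rejected: {}", tag, err),
        }
    }
}
```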
If you need serialization as well, then you will likewise need a custom serializer. For this you could create a new type to serialize into, as you cannot use Message.
use serde::Serializer;
impl Serialize for Message {
fn serialize<S>(&self, serializer: S) -> Result<S::Ok, S::Error>
where
S: Serializer,
{
#[derive(Serialize)]
#[serde(untagged)]
enum Message_<'a> {
T1(&'a Type1),
T2(&'a Type2),
T3(&'a Type3),
}
#[derive(Serialize)]
struct TypedMessage<'a> {
#[serde(rename = "type")]
t: u64,
#[serde(flatten)]
msg: Message_<'a>,
}
let msg = match self {
Message::T1(t) => TypedMessage { t: 1, msg: Message_::T1(t) },
Message::T2(t) => TypedMessage { t: 2, msg: Message_::T2(t) },
Message::T3(t) => TypedMessage { t: 3, msg: Message_::T3(t) },
};
msg.serialize(serializer)
}
}
When you use #[serde(flatten)], serde uses serde::private::ser::FlatMapSerializer under the hood, which is hidden from the documentation. Instead of creating new types, you could use SerializeMap and FlatMapSerializer directly.
However, be warned: since it is undocumented, any future release of serde could break your code if you use FlatMapSerializer directly.
use serde::{private::ser::FlatMapSerializer, ser::SerializeMap, Serializer};
impl Serialize for Message {
fn serialize<S>(&self, serializer: S) -> Result<S::Ok, S::Error>
where
S: Serializer,
{
let mut s = serializer.serialize_map(None)?;
let type_ = &match self {
Message::T1(_) => 1,
Message::T2(_) => 2,
Message::T3(_) => 3,
};
s.serialize_entry("type", &type_)?;
match self {
Message::T1(t) => t.serialize(FlatMapSerializer(&mut s))?,
Message::T2(t) => t.serialize(FlatMapSerializer(&mut s))?,
Message::T3(t) => t.serialize(FlatMapSerializer(&mut s))?,
}
s.end()
}
}
As per the Serde specification, an Object / Map<String, Value> is a Value:
pub enum Value {
Null,
Bool(bool),
Number(Number),
String(String),
Array(Vec<Value>),
Object(Map<String, Value>),
}
Yet when I compile this code:
extern crate serde;
#[macro_use]
extern crate serde_json;
#[derive(Debug)]
struct Wrapper {
ok: bool,
data: Option<serde_json::Value>,
}
impl Wrapper {
fn ok() -> Wrapper {
Wrapper {
ok: true,
data: None,
}
}
pub fn data(&mut self, data: serde_json::Value) -> &mut Wrapper {
self.data = Some(data);
self
}
pub fn finalize(self) -> Wrapper {
self
}
}
trait IsValidWrapper {
fn is_valid_wrapper(&self) -> bool;
}
impl IsValidWrapper for serde_json::Map<std::string::String, serde_json::Value> {
fn is_valid_wrapper(&self) -> bool {
self["ok"].as_bool().unwrap_or(false)
}
}
fn main() {
let json = json!({
"name": "John Doe",
"age": 43,
"phones": [
"+44 1234567",
"+44 2345678"
]
});
let converted_json: Wrapper = json
.as_object()
.map_or_else(
|| Err(json),
|obj| {
if obj.is_valid_wrapper() {
Ok(Wrapper::ok().data(obj["data"].clone()).finalize())
} else {
Err(*obj as serde_json::Value)
}
},
)
.unwrap_or_else(|data| Wrapper::ok().data(data.clone()).finalize());
println!(
"org json = {:?} => converted json = {:?}",
json, converted_json
);
}
I get this error:
error[E0605]: non-primitive cast: `serde_json::Map<std::string::String, serde_json::Value>` as `serde_json::Value`
--> src/main.rs:60:25
|
60 | Err(*obj as serde_json::Value)
| ^^^^^^^^^^^^^^^^^^^^^^^^^
|
= note: an `as` expression can only be used to convert between primitive types. Consider using the `From` trait
Is there a way to downcast a Map into a Value?
an Object / Map<String, Value> is a Value
No, it is not. Value is a type. Map<String, Value> is a type. Value::Object is an enum variant, which is not a separate type. In this case, Value::Object holds another value of type Map<String, Value>. You have to wrap the value in the variant to convert the type:
Err(serde_json::Value::Object(obj))
This will lead you to the problem:
error[E0308]: mismatched types
--> src/main.rs:57:55
|
57 | Err(serde_json::Value::Object(obj))
| ^^^ expected struct `serde_json::Map`, found reference
|
= note: expected type `serde_json::Map<std::string::String, serde_json::Value>`
found type `&serde_json::Map<std::string::String, serde_json::Value>`
as_object returns a reference to the contained object (if it's present), not the value itself. You will need to match on it for now:
let converted_json = match json {
serde_json::Value::Object(obj) => {}
_ => {}
};
Something like this:
let converted_json = match json {
serde_json::Value::Object(obj) => {
if obj.is_valid_wrapper() {
let mut w = Wrapper::ok();
w.data(obj["data"].clone());
Ok(w.finalize())
} else {
Err(serde_json::Value::Object(obj))
}
}
other => Err(other),
};
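To see the variant-versus-type point in isolation, here is a small sketch with a stand-in enum. It is my own reduction for illustration, not serde_json's real definition, which has more variants and its own Map type; into_value and to_value are hypothetical helper names. It shows both cases from above: wrapping an owned map in the variant, and cloning when only a reference is available (which is what as_object() hands out).

```rust
use std::collections::BTreeMap;

/// Stand-in for `serde_json::Value`, reduced to three variants.
#[derive(Clone, Debug, PartialEq)]
enum Value {
    Null,
    Bool(bool),
    Object(BTreeMap<String, Value>),
}

/// Owned map: wrap it in the variant. This is a constructor call, not a cast.
fn into_value(map: BTreeMap<String, Value>) -> Value {
    Value::Object(map)
}

/// Borrowed map: the variant needs an owned map, so the data must be cloned.
fn to_value(map: &BTreeMap<String, Value>) -> Value {
    Value::Object(map.clone())
}

fn main() {
    let mut map = BTreeMap::new();
    map.insert("ok".to_string(), Value::Bool(true));
    map.insert("data".to_string(), Value::Null);

    let from_ref = to_value(&map); // `map` is still usable afterwards
    let from_owned = into_value(map); // `map` is moved into the variant
    println!("{:?}", from_owned);
    println!("same contents: {}", from_ref == from_owned);
}
```

Matching on the Value directly, as in the last snippet above, gives you the owned map and avoids the clone.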