Rust: how to minimize pattern matching when parsing JSON into a complex enum

So, let's say I am expecting a lot of different JSONs of a known format from a network stream. I define structures for them and wrap them with an enum representing all the possibilities:
use serde::Deserialize;
use serde_json;
#[derive(Deserialize, Debug)]
struct FirstPossibleResponse {
first_field: String,
}
#[derive(Deserialize, Debug)]
struct SecondPossibleResponse {
second_field: String,
}
#[derive(Deserialize, Debug)]
enum ResponseFromNetwork {
FirstPossibleResponse(FirstPossibleResponse),
SecondPossibleResponse(SecondPossibleResponse),
}
Then, being a smart folk, I want to provide myself with a short way of parsing these JSONs into my structures, so I am implementing a trait (and here is the problem):
impl From<String> for ResponseFromNetwork {
fn from(r: String) -> Self {
match serde_json::from_slice(&r.as_bytes()) {
Ok(v) => ResponseFromNetwork::FirstPossibleResponse(v),
Err(_) => match serde_json::from_slice(&r.as_bytes()) {
Ok(v) => ResponseFromNetwork::SecondPossibleResponse(v),
Err(_) => unimplemented!("idk"),
},
}
}
}
...To use it later like this:
fn main() {
let data_first = r#"
{
"first_field": "first_value"
}"#;
let data_second = r#"
{
"second_field": "first_value"
}"#;
print!("{:?}", ResponseFromNetwork::from(data_first.to_owned()));
print!("{:?}", ResponseFromNetwork::from(data_second.to_owned()));
}
So, as mentioned earlier, the problem is that this match tree is the only way I got parsing to work, and as you can imagine, the more variations of JSON I might receive over the network, the deeper and nastier the tree grows.
I want something like this instead, i.e. parse it once and then operate depending on the value:
use serde_json::Result;
impl From<String> for ResponseFromNetwork {
fn from(r: String) -> Self {
let parsed: Result<ResponseFromNetwork> = serde_json::from_slice(r.as_bytes());
match parsed {
Ok(v) => {
print!("And here we should match on invariants or something: {:?}", v);
v
}
Err(e) => unimplemented!("{:?}", e),
}
}
}
But it doesn't really work:
thread 'main' panicked at 'not implemented: Error("unknown variant `first_field`, expected `FirstPossibleResponse` or `SecondPossibleResponse`", line: 3, column: 25)', src/main.rs:28:23

#[serde(untagged)] is designed for precisely that use case. Just add it in front of the definition of enum ResponseFromNetwork and your code will work the way you want it to:
#[derive(Deserialize, Debug)]
#[serde(untagged)]
enum ResponseFromNetwork {
FirstPossibleResponse(FirstPossibleResponse),
SecondPossibleResponse(SecondPossibleResponse),
}
impl From<String> for ResponseFromNetwork {
fn from(r: String) -> Self {
match serde_json::from_slice(r.as_bytes()) {
Ok(v) => v,
Err(e) => unimplemented!("{:?}", e),
}
}
}
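For intuition: an untagged enum makes serde try each variant in declaration order and keep the first one that deserializes, which is the same first-match-wins search the nested match in the question spelled out by hand. The std-only sketch below imitates that idea without serde. The `Response` enum, the toy `key=value` format and the `parse_*` helpers are hypothetical stand-ins, not serde APIs.

```rust
#[derive(Debug, PartialEq)]
enum Response {
    First(String),
    Second(String),
}

// Hypothetical per-variant parsers for a toy `key=value` format.
fn parse_first(input: &str) -> Option<Response> {
    input
        .strip_prefix("first_field=")
        .map(|value| Response::First(value.to_owned()))
}

fn parse_second(input: &str) -> Option<Response> {
    input
        .strip_prefix("second_field=")
        .map(|value| Response::Second(value.to_owned()))
}

// Try every parser in order and return the first success.
fn parse_response(input: &str) -> Option<Response> {
    let attempts: [fn(&str) -> Option<Response>; 2] = [parse_first, parse_second];
    attempts.iter().find_map(|attempt| attempt(input))
}
```

Supporting a new format is then one more entry in the list rather than another level of nesting; with serde, the equivalent is one more variant on the untagged enum. In both cases the order matters when an input could match several variants.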

If the format of the response JSON can be extended (perhaps not, if it is predefined and unchangeable), adding a tag field, say "kind", to each JSON object, annotating each variant struct with #[serde(tag = "kind")], and marking the enum with #[serde(untagged)] can address the issue.
use serde::Deserialize;
use serde_json;
#[derive(Deserialize, Debug)]
#[serde(tag = "kind")]
struct FirstPossibleResponse {
first_field: String,
}
#[derive(Deserialize, Debug)]
#[serde(tag = "kind")]
struct SecondPossibleResponse {
second_field: String,
}
#[derive(Deserialize, Debug)]
#[serde(tag = "kind")]
struct ThirdPossibleResponse {
third_field: String,
}
#[derive(Deserialize, Debug)]
#[serde(untagged)]
enum ResponseFromNetwork {
FirstPossibleResponse(FirstPossibleResponse),
SecondPossibleResponse(SecondPossibleResponse),
ThirdPossibleResponse(ThirdPossibleResponse),
}
impl From<String> for ResponseFromNetwork {
fn from(r: String) -> Self {
match serde_json::from_slice(&r.as_bytes()) {
Ok(v) => v,
Err(_) => unimplemented!("idk"),
}
}
}
fn main() {
let data_first = r#"
{
"kind":"FirstPossibleResponse",
"first_field": "first_value"
}"#;
let data_second = r#"
{
"kind":"SecondPossibleResponse",
"second_field": "second_value"
}"#;
let data_third = r#"
{
"kind":"ThirdPossibleResponse",
"third_field": "third_value"
}"#;
println!("{:?}", ResponseFromNetwork::from(data_first.to_owned()));
println!("{:?}", ResponseFromNetwork::from(data_second.to_owned()));
println!("{:?}", ResponseFromNetwork::from(data_third.to_owned()));
}

Related

Is there a way to fix this avoiding serde and using nanoserde instead?

I'm new to Rust and I'm using serde only to do this easy thing. I think I can use nanoserde instead but I don't know how.
I'm using serde like this:
use serde::Deserialize;
#[derive(Deserialize, Clone)]
pub struct Player {
pub team: Team,
}
#[derive(Deserialize, Clone)]
pub struct Team {
pub id: String,
// others here
}
// ...
if api_call.status().is_success() {
match serde_json::from_slice::<Player>(
&hyper::body::to_bytes(api_call.into_body()).await.unwrap(),
) {
Ok(player) => {
// use player.team here
// do something else
}
Err(err) => {
eprintln!("{err}");
}
}
}
I tried with:
match nanoserde::DeBin::de_bin::<Player>(
&hyper::body::to_bytes(api_call.into_body()).await.unwrap(),
) {
// ...
}
with no luck: it doesn't like the generics.
And tried this too:
let player: Player = nanoserde::DeBin::deserialize_bin(
&hyper::body::to_bytes(api_call.into_body()).await.unwrap(),
).unwrap();
but this doesn't work, error:
thread 'tokio-runtime-worker' panicked at 'called `Result::unwrap()` on an `Err` value: Bin deserialize error at:8 wanted:1 bytes but max size is 1161', src\cli.rs:85:10
What does this mean?
Is there a way to fix this?
As you were lacking minimal reproducible examples, here's one for your serde code:
use serde::Deserialize;
#[derive(Deserialize, Clone)]
pub struct Player {
pub team: Team,
}
#[derive(Deserialize, Clone)]
pub struct Team {
pub id: String,
// others here
}
// ...
fn main() {
let data = br#"{"team":{"id":"42"}}"#;
match serde_json::from_slice::<Player>(data) {
Ok(player) => {
println!("Team: {}", player.team.id);
}
Err(err) => {
eprintln!("{err}");
}
};
}
Team: 42
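Now let's see if we can rewrite this in nanoserde. DeBin is nanoserde's own binary format, not JSON, which is why feeding it a JSON body fails with a "Bin deserialize error". For JSON, the derive to use is DeJson: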
use nanoserde::DeJson;
#[derive(DeJson, Clone)]
pub struct Player {
pub team: Team,
}
#[derive(DeJson, Clone)]
pub struct Team {
pub id: String,
// others here
}
// ...
fn main() {
let data = r#"{"team":{"id":"42"}}"#;
match Player::deserialize_json(data) {
Ok(player) => {
println!("Team: {}", player.team.id);
}
Err(err) => {
eprintln!("{err}");
}
};
}
Team: 42
Note that we have to pass a &str instead of a &[u8] here.
If you are 100% stuck with a &[u8], you can convert it with std::str::from_utf8 first:
use nanoserde::DeJson;
#[derive(DeJson, Clone)]
pub struct Player {
pub team: Team,
}
#[derive(DeJson, Clone)]
pub struct Team {
pub id: String,
// others here
}
// ...
fn main() {
let data = br#"{"team":{"id":"42"}}"#;
match std::str::from_utf8(data).map(Player::deserialize_json) {
Ok(Ok(player)) => {
println!("Team: {}", player.team.id);
}
Ok(Err(err)) => {
eprintln!("{err}");
}
Err(err) => {
eprintln!("{err}");
}
};
}
Team: 42
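The nested `Ok(Ok(..))` match can be flattened by converting both error types into a common one and chaining with `?`. The sketch below is std-only and uses `str::parse::<u32>` as a stand-in for `Player::deserialize_json`; the function name `parse_bytes` and the `String` error type are illustrative choices, not part of nanoserde.

```rust
// Decode the bytes as UTF-8, then parse the text.
// Both kinds of failure are reported as a String.
fn parse_bytes(data: &[u8]) -> Result<u32, String> {
    let text = std::str::from_utf8(data).map_err(|err| err.to_string())?;
    // Stand-in for `Player::deserialize_json(text)`.
    text.trim().parse::<u32>().map_err(|err| err.to_string())
}
```

The caller is then left with a single `Result` to match on, at the cost of losing the distinction between an encoding error and a parse error.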

How to serialize a type that might be an arbitrary string?

I have an enum type that is defined as either one of a list of predefined strings or an arbitrary value (i.e. code that uses this type may want to handle a few specific cases a certain way and also allow an arbitrary string).
I'm trying to represent this in Rust with serde the following way:
#[derive(Serialize, Debug)]
pub enum InvalidatedAreas {
#[serde(rename = "all")]
All,
#[serde(rename = "stacks")]
Stacks,
#[serde(rename = "threads")]
Threads,
#[serde(rename = "variables")]
Variables,
String(String),
}
When used as a member, I would like to serialize the above enum as simply a string value:
#[derive(Serialize, Debug)]
struct FooBar {
foo: InvalidatedAreas,
bar: InvalidatedAreas,
}
fn main() -> Result<(), serde_json::Error> {
let foob = FooBar {
foo: InvalidatedAreas::Stacks,
bar: InvalidatedAreas::String("hello".to_string()),
};
let j = serde_json::to_string(&foob)?;
println!("{}", j);
Ok(())
}
What I get is:
{"foo":"stacks","bar":{"String":"hello"}}
But I need
{"foo":"stacks","bar":"hello"}
If I add #[serde(untagged)] to the enum definition, I get
{"foo":null,"bar":"hello"}
How can I serialize this correctly?
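I've arrived at the following solution. It requires a bit of repetition, but it's not so bad. Note that Serialize has to be removed from the enum's #[derive(...)] list, since the trait is now implemented by hand. I'll leave the question open in case someone has a better idea.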
impl ToString for InvalidatedAreas {
fn to_string(&self) -> String {
match &self {
InvalidatedAreas::All => "all",
InvalidatedAreas::Stacks => "stacks",
InvalidatedAreas::Threads => "threads",
InvalidatedAreas::Variables => "variables",
InvalidatedAreas::String(other) => other
}
.to_string()
}
}
impl Serialize for InvalidatedAreas {
fn serialize<S>(&self, serializer: S) -> Result<S::Ok, S::Error>
where
S: serde::Serializer,
{
serializer.serialize_str(&self.to_string())
}
}
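A small variation, assuming the same enum: implementing `std::fmt::Display` instead of `ToString` is the more conventional route, because the standard library then provides `to_string()` automatically and the `serialize_str(&self.to_string())` call above keeps working unchanged. The sketch leaves serde out so that it stands alone.

```rust
use std::fmt;

#[derive(Debug)]
pub enum InvalidatedAreas {
    All,
    Stacks,
    Threads,
    Variables,
    String(String),
}

impl fmt::Display for InvalidatedAreas {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Known variants map to fixed names; the catch-all passes its text through.
        let text = match self {
            InvalidatedAreas::All => "all",
            InvalidatedAreas::Stacks => "stacks",
            InvalidatedAreas::Threads => "threads",
            InvalidatedAreas::Variables => "variables",
            InvalidatedAreas::String(other) => other.as_str(),
        };
        f.write_str(text)
    }
}
```

With this in place, `format!("{}", area)` and `area.to_string()` both produce the wire representation.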

Using serde to deserialize a HashMap with a Enum key

I have the following Rust code which models a configuration file which includes a HashMap keyed with an enum.
use std::collections::HashMap;
use serde::{Deserialize, Serialize};
#[derive(Debug, Clone, Serialize, Deserialize, PartialEq, Eq, Hash)]
enum Source {
#[serde(rename = "foo")]
Foo,
#[serde(rename = "bar")]
Bar
}
#[derive(Debug, Clone, Serialize, Deserialize)]
struct SourceDetails {
name: String,
address: String,
}
#[derive(Debug, Clone, Serialize, Deserialize)]
struct Config {
name: String,
main_source: Source,
sources: HashMap<Source, SourceDetails>,
}
fn main() {
let config_str = std::fs::read_to_string("testdata.toml").unwrap();
match toml::from_str::<Config>(&config_str) {
Ok(config) => println!("toml: {:?}", config),
Err(err) => eprintln!("toml: {:?}", err),
}
let config_str = std::fs::read_to_string("testdata.json").unwrap();
match serde_json::from_str::<Config>(&config_str) {
Ok(config) => println!("json: {:?}", config),
Err(err) => eprintln!("json: {:?}", err),
}
}
This is the Toml representation:
name = "big test"
main_source = "foo"
[sources]
foo = { name = "fooname", address = "fooaddr" }
[sources.bar]
name = "barname"
address = "baraddr"
This is the JSON representation:
{
"name": "big test",
"main_source": "foo",
"sources": {
"foo": {
"name": "fooname",
"address": "fooaddr"
},
"bar": {
"name": "barname",
"address": "baraddr"
}
}
}
Deserializing the JSON with serde_json works perfectly, but deserializing the Toml with toml gives the following error:
Error: Error { inner: ErrorInner { kind: Custom, line: Some(5), col: 0, at: Some(77), message: "invalid type: string \"foo\", expected enum Source", key: ["sources"] } }
If I change the sources HashMap to be keyed on String instead of Source, both the JSON and the Toml deserialize correctly.
I'm pretty new to serde and toml, so I'm looking for suggestions on how to properly deserialize the Toml variant.
As others have said in the comments, the Toml deserializer doesn't support enums as keys.
You can use serde attributes to convert them to String first:
use std::convert::TryFrom;
use std::fmt;
#[derive(Debug, Clone, Serialize, Deserialize, PartialEq, Eq, Hash)]
#[serde(try_from = "String")]
enum Source {
Foo,
Bar
}
And then implement a conversion from String:
struct SourceFromStrError;
impl fmt::Display for SourceFromStrError {
fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
f.write_str("SourceFromStrError")
}
}
impl TryFrom<String> for Source {
type Error = SourceFromStrError;
fn try_from(s: String) -> Result<Self, Self::Error> {
match s.as_str() {
"foo" => Ok(Source::Foo),
"bar" => Ok(Source::Bar),
_ => Err(SourceFromStrError),
}
}
}
If you only need this for the HashMap in question, you could also follow the suggestion in the Toml issue, which is to keep the definition of Source the same and use the crate, serde_with, to modify how the HashMap is serialized instead:
use serde_with::{serde_as, DisplayFromStr};
use std::collections::HashMap;
#[serde_as]
#[derive(Debug, Clone, Serialize, Deserialize)]
struct Config {
name: String,
main_source: Source,
#[serde_as(as = "HashMap<DisplayFromStr, _>")]
sources: HashMap<Source, SourceDetails>,
}
This requires a FromStr implementation for Source, rather than TryFrom<String>:
use std::str::FromStr;
impl FromStr for Source {
type Err = SourceFromStrError;
fn from_str(s: &str) -> Result<Self, Self::Err> {
match s {
"foo" => Ok(Source::Foo),
"bar" => Ok(Source::Bar),
_ => Err(SourceFromStrError),
}
}
}
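The string conversion can be checked on its own, without serde or toml. The sketch below repeats the `FromStr` implementation and adds a `Display` implementation for `Source`. The latter is an addition that is not in the answer: as far as I know, `DisplayFromStr` relies on `Display` when the map is serialized, so a `Config` that also derives `Serialize` would need it. Treat that as an assumption to verify against the serde_with documentation.

```rust
use std::fmt;
use std::str::FromStr;

#[derive(Debug, Clone, PartialEq, Eq, Hash)]
enum Source {
    Foo,
    Bar,
}

#[derive(Debug, PartialEq)]
struct SourceFromStrError;

impl fmt::Display for SourceFromStrError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str("SourceFromStrError")
    }
}

impl FromStr for Source {
    type Err = SourceFromStrError;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "foo" => Ok(Source::Foo),
            "bar" => Ok(Source::Bar),
            _ => Err(SourceFromStrError),
        }
    }
}

// Assumed addition: the inverse mapping, so a key can be written back out.
impl fmt::Display for Source {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str(match self {
            Source::Foo => "foo",
            Source::Bar => "bar",
        })
    }
}
```

Keeping the two `match` blocks next to each other makes it harder for the parse and print sides to drift apart when a variant is added.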

How to convert the name of an enum's variant to a String in Rust?

I am wondering how to implement a method for any enum that will return the variant identifier as a String or &'static str, without using any external crate. Something like:
pub enum MyEnum {
EnumVariant1,
EnumVariant2,
}
impl MyEnum {
fn to_string(&self) -> String {
// do Rust stuff here
}
}
As stated in my comment I believe a custom derive macro may be the easiest option (although I could be missing something) so here is a basic implementation of one:
// lib.rs in enum_name
extern crate self as enum_name;
pub use enum_name_derive::EnumName;
pub trait EnumName {
fn enum_name(&self) -> &'static str;
}
#[cfg(test)]
mod tests {
use super::*;
#[derive(EnumName)]
#[allow(dead_code)]
enum MyEnum<'a, T> {
VariantA,
VariantB(T, i32),
AnotherOne { x: &'a str },
}
#[test]
fn test_enum_name() {
assert_eq!("VariantA", MyEnum::VariantA::<u32>.enum_name());
assert_eq!("VariantB", MyEnum::VariantB(1, 2).enum_name());
assert_eq!(
"AnotherOne",
MyEnum::AnotherOne::<u8> { x: "test" }.enum_name()
);
}
}
// lib.rs in enum_name_derive
use proc_macro::TokenStream;
use quote::quote;
use syn::{parse_macro_input, Data, DeriveInput, Fields};
#[proc_macro_derive(EnumName)]
pub fn derive_proto_read(input: TokenStream) -> TokenStream {
let input = parse_macro_input!(input as DeriveInput);
let ident = input.ident;
let (impl_generics, ty_generics, where_clause) = input.generics.split_for_impl();
let variants = match input.data {
Data::Enum(data) => data.variants.into_iter().map(|variant| {
let ident = variant.ident;
let ident_string = ident.to_string();
let fields = match variant.fields {
Fields::Named(_) => quote!({ .. }),
Fields::Unnamed(_) => quote!((..)),
Fields::Unit => quote!(),
};
quote! {
Self::#ident#fields => #ident_string
}
}),
_ => panic!("not an enum"),
};
(quote! {
impl #impl_generics enum_name::EnumName for #ident #ty_generics #where_clause {
fn enum_name(&self) -> &'static str {
match self {
#(#variants),*
}
}
}
})
.into()
}
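If the enum only has unit variants, as in the question, a declarative macro can do the same job without a separate proc-macro crate or `syn`/`quote`. This is a minimal sketch: the macro name `named_enum` and the method name `variant_name` are made up for the example, and it does not handle variants that carry fields.

```rust
// Declares the enum and generates a method returning each variant's identifier.
macro_rules! named_enum {
    ($(#[$meta:meta])* $vis:vis enum $name:ident { $($variant:ident),+ $(,)? }) => {
        $(#[$meta])*
        $vis enum $name { $($variant),+ }

        impl $name {
            /// Returns the variant identifier exactly as written in the source.
            $vis fn variant_name(&self) -> &'static str {
                match self {
                    $(Self::$variant => stringify!($variant)),+
                }
            }
        }
    };
}

named_enum! {
    #[derive(Debug, Clone, Copy)]
    pub enum MyEnum {
        EnumVariant1,
        EnumVariant2,
    }
}
```

Because `stringify!` works on the identifier token, the returned string always matches the variant name, even after a rename. The trade-off is that the enum has to be declared inside the macro invocation.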

How can I receive multiple query params with the same name in actix-web?

The actix-web documentation only has an example of how to receive uniquely named query params.
But how can I receive multiple query params of the same name? For example:
http://localhost:8088/test?id=1&id=2&id=3
How do I have to change the following code so it accepts multiple ids, and how can I read them?
use actix_web::{get, web, Responder};
use serde::Deserialize;
#[derive(Deserialize)]
struct Info {
id: String,
}
#[get("/test")]
async fn index(info: web::Query<Info>) -> impl Responder {
println!("Id: {}!", info.id);
"ok"
}
Having a look at this question, it seems like there is no definitive standard for what you want. I don't know if actix has such an extractor, so I would write my own Deserialize impl:
use std::fmt;
use serde::de::{ Deserialize, Deserializer, Visitor, MapAccess};
impl<'de> Deserialize<'de> for Info {
fn deserialize<D>(deserializer: D) -> Result<Info, D::Error>
where
D: Deserializer<'de>,
{
struct FieldVisitor;
impl<'de> Visitor<'de> for FieldVisitor {
type Value = Info;
fn expecting(&self, formatter: &mut fmt::Formatter) -> fmt::Result {
formatter.write_str("`id`")
}
fn visit_map<V>(self, mut map: V) -> Result<Info, V::Error>
where
V: MapAccess<'de>
{
let mut ids: Vec<String> = Vec::default();
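while let Some(key) = map.next_key::<String>()? {
match key.as_str() {
"id" => {
ids.push(map.next_value::<String>()?)
}
// Skip unrelated parameters instead of panicking on them.
_ => {
map.next_value::<serde::de::IgnoredAny>()?;
}
}
}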
Ok(Info {
id: ids
})
}
}
deserializer.deserialize_map(FieldVisitor)
}
}
#[derive(Debug)]
struct Info {
id: Vec<String>,
}
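To see what the `visit_map` loop is doing, here is the same accumulation written against a raw query string with only the standard library. It is a sketch of the idea, not what actix-web does internally: the helper name `collect_ids` is hypothetical and percent-decoding is deliberately left out.

```rust
// Collect the value of every `id=...` pair, in order, ignoring other keys.
fn collect_ids(query: &str) -> Vec<String> {
    query
        .split('&')
        .filter_map(|pair| pair.split_once('='))
        .filter(|(key, _)| *key == "id")
        .map(|(_, value)| value.to_owned())
        .collect()
}
```

For the URL in the question, the query part is `id=1&id=2&id=3`, which yields the three values in the order they appear.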
