Custom mapping for into_group_map in Rust

I have the following data structure
pub struct FileContent{
version: u16,
fileTypes: Vec<FileVersion>
}
#[derive(Clone, PartialEq, Eq, Debug, Serialize, Deserialize)]
pub enum FileVersion {
Cats(String, Vec<Definition>),
Dogs(Vec<Definition>),
Birds(Vec<Definition>),
}
pub struct Definition {
..
}
And I want to parse it into a HashMap<String, Vec<Definition>> where only FileVersion::Cats are included.
I have the following code:
use itertools::Itertools;
let x = config
.fileTypes
.iter()
.filter_map(|voc| match voc {
FileVersion::Cats(s, v) => Some((s, v)),
_ => None,
})
.into_group_map();
Which gives me a HashMap<String, Vec<Vec<Definition>>>, but there is really no need for the double vector structure. How can I map this so I get HashMap<String, Vec<Definition>>?

Would using std::iter::FromIterator be an option?
The following might create the structure that you're looking for:
use std::collections::HashMap;
use std::iter::FromIterator;
#[derive(Clone, PartialEq, Eq, Debug)]
pub enum FileVersion {
Cats(String, Vec<Definition>),
Dogs(Vec<Definition>),
Birds(Vec<Definition>),
}
#[derive(Clone, PartialEq, Eq, Debug)]
pub struct Definition;
fn main() {
let file_types = vec![
FileVersion::Cats(String::from("A"), vec![Definition, Definition]),
FileVersion::Dogs(vec![Definition]),
FileVersion::Cats(String::from("B"), vec![Definition]),
FileVersion::Birds(vec![Definition]),
];
let my_map = HashMap::<_, _>::from_iter(file_types.into_iter().filter_map(|voc| match voc {
FileVersion::Cats(s, v) => Some((s, v)),
_ => None,
}));
}
The value for my_map should then equal {"A": [Definition, Definition], "B": [Definition]}.
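One caveat: HashMap::from_iter keeps only the last value when the same key appears more than once, so two Cats entries named "A" would not be merged. If merging is what you want, a minimal sketch using only the standard library could look like the following. The helper name group_cats is hypothetical, and the sketch assumes that definitions sharing a name should simply be concatenated.

```rust
use std::collections::HashMap;

#[derive(Clone, PartialEq, Eq, Debug)]
pub struct Definition;

#[derive(Clone, PartialEq, Eq, Debug)]
pub enum FileVersion {
    Cats(String, Vec<Definition>),
    Dogs(Vec<Definition>),
    Birds(Vec<Definition>),
}

/// Hypothetical helper: keeps only the `Cats` entries and concatenates the
/// definitions of entries that share the same name.
pub fn group_cats(file_types: Vec<FileVersion>) -> HashMap<String, Vec<Definition>> {
    let mut map: HashMap<String, Vec<Definition>> = HashMap::new();
    for file_type in file_types {
        if let FileVersion::Cats(name, definitions) = file_type {
            // `entry` creates an empty Vec on first sight of a name,
            // `extend` appends instead of nesting a second Vec.
            map.entry(name).or_default().extend(definitions);
        }
    }
    map
}
```

Calling group_cats(file_types) on a list with repeated names yields one flat Vec<Definition> per name, without the double vector.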

Related

Deserialize struct which contains Weak

I have a struct which contains a weak ref to a pre-existing Arc. Is there a way for me to pass this Arc to the struct's deserializer, without writing my own deserializer/visitor?
Example struct / deserialize:
struct SomeGlobal;
#[derive(Deserialize)]
struct SomeState {
pub a: i32,
pub b: Weak<SomeGlobal>,
}
fn magic_deserialize(value: serde_json::Value, g: Arc<SomeGlobal>) -> Result<SomeState, ...> {
// what goes here?
}
#[test]
fn example() {
let g = Arc::new(SomeGlobal);
let value = json!({
"a": 5
});
let state = magic_deserialize(value, g.clone()).unwrap();
// state.a == 5
// state.b == weak ref to g
}
Since Weak<T> implements Default, you can simply annotate b with #[serde(skip)] to have serde skip serializing this field, and set it to default when deserializing.
#[derive(Debug, Serialize, Deserialize)]
struct SomeState {
pub a: i32,
#[serde(skip)]
pub b: Weak<SomeGlobal>,
}
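After deserializing, the skipped field still has to be pointed at the existing Arc. A rough, std-only sketch of that second step follows. The function names are hypothetical, and state_with_default_weak merely stands in for what the derived deserializer would produce for a skipped field.

```rust
use std::sync::{Arc, Weak};

pub struct SomeGlobal;

pub struct SomeState {
    pub a: i32,
    pub b: Weak<SomeGlobal>,
}

/// Stand-in for the deserialization step: `b` starts out as `Weak::default()`,
/// a dangling weak reference that never upgrades.
pub fn state_with_default_weak(a: i32) -> SomeState {
    SomeState { a, b: Weak::default() }
}

/// Hypothetical helper: point the skipped field at the pre-existing `Arc`.
pub fn attach_global(mut state: SomeState, g: &Arc<SomeGlobal>) -> SomeState {
    state.b = Arc::downgrade(g);
    state
}
```

In magic_deserialize, the same assignment (state.b = Arc::downgrade(&g)) would go right after the call that deserializes the value.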

Is it possible to use an enum to parse the request in Rust Rocket?

I want to do a conditional query in Rust with Diesel (diesel = { version = "1.4.7", features = ["postgres","64-column-tables","chrono"] }). I read the docs and wrote the function like this:
fn find_channel(request: &ChannelRequest) -> Box<dyn BoxableExpression<crate::model::diesel::dolphin::dolphin_schema::rss_sub_source::table, DB, SqlType=Bool> + '_> {
use crate::model::diesel::dolphin::dolphin_schema::rss_sub_source::dsl::*;
match request {
ChannelRequest::editorPick(editorPick) => Box::new(crate::model::diesel::dolphin::dolphin_schema::rss_sub_source::dsl::editor_pick.eq(editorPick)),
_ => Box::new(crate::model::diesel::dolphin::dolphin_schema::rss_sub_source::dsl::editor_pick.eq(0))
}
}
But the function needs its parameter to be an enum. This is the ChannelRequest definition, using Rocket (rocket = { version = "0.5.0-rc.1", features = ["json"] }):
use rocket::serde::Deserialize;
use rocket::serde::Serialize;
#[derive(Debug, PartialEq, Eq, Deserialize, Serialize)]
#[allow(non_snake_case)]
pub enum ChannelRequest {
userId(Option<i64>),
pageNum(Option<i64>),
pageSize(Option<i64>),
editorPick(Option<i32>),
}
And this is the Rocket API controller definition:
#[post("/v1/page", data = "<request>")]
pub fn page(request: Json<ChannelRequest>) -> content::Json<String> {
let channels = channel_query::<Vec<RssSubSource>>(&request);
return box_rest_response(channels);
}
And this is channel_query, which invokes the conditional query:
pub fn channel_query<T>(request: &Json<ChannelRequest>) -> PaginationResponse<Vec<RssSubSource>> {
use crate::model::diesel::dolphin::dolphin_schema::rss_sub_source::dsl::*;
let connection = config::establish_connection();
let query = rss_sub_source
.filter(find_channel(&request.0))
.order(created_time.desc())
.paginate(1)
.per_page(10);
let query_result: QueryResult<(Vec<_>, i64, i64)> = query.load_and_count_pages_total::<RssSubSource>(&connection);
let page_result = map_pagination_res(
query_result,
1,
10);
return page_result;
}
When I call the channel search API, the server does not seem to understand the client's request. What should I do to receive the client request as an enum? Is that possible? Or how should I change the conditional query function to make it work?
I changed the request type from an enum to a struct:
use rocket::serde::Deserialize;
use rocket::serde::Serialize;
#[derive(Debug, PartialEq, Eq, Deserialize, Serialize)]
#[allow(non_snake_case)]
pub struct ChannelRequest {
pub userId: Option<i64>,
pub pageNum: Option<i64>,
pub pageSize: Option<i64>,
pub editorPick: Option<i32>
}
and changed the conditional query like this:
fn find_channel(request: &ChannelRequest) -> Box<dyn BoxableExpression<crate::model::diesel::dolphin::dolphin_schema::rss_sub_source::table, DB, SqlType=Bool> + '_> {
use crate::model::diesel::dolphin::dolphin_schema::rss_sub_source::dsl::*;
match request {
ChannelRequest { editorPick, .. } => Box::new(crate::model::diesel::dolphin::dolphin_schema::rss_sub_source::dsl::editor_pick.eq(editorPick)),
_ => Box::new(crate::model::diesel::dolphin::dolphin_schema::rss_sub_source::dsl::editor_pick.eq(0))
}
}
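Note that a struct pattern such as ChannelRequest { editorPick, .. } always matches, so a following _ arm is never reached. If the intent is to fall back to 0 when the client omits the field, matching on the Option itself may be closer to it. A small sketch without Diesel or Rocket follows. The snake_case field names and the helper name are assumptions made for illustration only.

```rust
#[derive(Debug, PartialEq, Eq, Default)]
pub struct ChannelRequest {
    pub user_id: Option<i64>,
    pub page_num: Option<i64>,
    pub page_size: Option<i64>,
    pub editor_pick: Option<i32>,
}

/// Hypothetical helper: the value to compare the `editor_pick` column against.
/// Falls back to 0 when the request does not carry the field.
pub fn editor_pick_or_default(request: &ChannelRequest) -> i32 {
    match request.editor_pick {
        Some(value) => value,
        None => 0,
    }
}
```

The result could then be passed to editor_pick.eq(...) in a single boxed expression, instead of building the expression in two match arms.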

Serde tag = x, but keep the tag in the struct

I'm trying to de-serialise JSON to a Rust struct with enum variants using internally tagged JSON (https://serde.rs/enum-representations.html).
I want to store the tag of the variant inside the struct - currently serde stores this data in attributes.
Can this be done keeping the tag key inside the struct?
The methods I have tried:
A. #[serde(untagged)]
This works but I want to avoid it because of the performance hit of searching for a pattern match.
B. #[serde(tag = "f_tag")]
This does not work and results in "duplicate field `f_tag`".
Both the serde attribute and the enum variant rename try to use the same key.
I do not want to place #[serde(rename = "f_tag")] under Uni as this defines the tag key in the parent enum. (I want child structs to have the same tag key anywhere they are contained inside a parent enum).
Example:
https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=584c3f32488a1a27f47dac9a0e81d31c
use serde::{Deserialize, Serialize};
use serde_json::json;
use serde_json::{Result, Value};
#[derive(Serialize, Deserialize, Debug)]
struct S1 {
f1: String,
f2: Uni,
}
#[derive(Serialize, Deserialize, Debug)]
#[serde(tag = "f_tag")]
// #[serde(untagged)]
enum Uni {
String(String),
// #[serde(rename = "f_tag")]
S2(S2),
// #[serde(rename = "f_tag")]
S3(S3),
}
#[derive(Serialize, Deserialize, Debug)]
struct S2 {
f_tag: S2Tag,
f2_s2: bool,
}
#[derive(Serialize, Deserialize, Debug)]
struct S3 {
f_tag: S3Tag,
f2_s3: bool,
}
#[derive(Serialize, Deserialize, Debug)]
enum S2Tag {
#[serde(rename = "s2")]
S2,
}
#[derive(Serialize, Deserialize, Debug)]
enum S3Tag {
#[serde(rename = "s3")]
S3,
}
fn main() {
let s1 = S1 {
f1: "s1.f1".into(),
f2: Uni::S2(S2 {
f_tag: S2Tag::S2,
f2_s2: true,
}),
};
let j = serde_json::to_string(&s1).unwrap();
dbg!(&j);
let s1_2: S1 = serde_json::from_str(&j).unwrap();
dbg!(s1_2);
}
{
"f1": "s1.f1",
"f2": {
// Issue: serde(tag = x) uses the name of the enum here.
"f_tag": "S2",
"f_tag": "s2",
"f2_s2": true
}
}

How to serialise and deserialise BTreeMaps with arbitrary key types?

This example code:
use std::collections::BTreeMap;
use serde::{Deserialize, Serialize};
#[derive(Debug, Clone, Serialize, Deserialize, PartialEq, Eq, PartialOrd, Ord)]
struct Foo {
bar: String,
baz: Baz
}
#[derive(Debug, Clone, Copy, Serialize, Deserialize, PartialEq, Eq, PartialOrd, Ord)]
enum Baz {
Quux(u32),
Flob,
}
#[derive(Debug, Clone, Copy, Serialize, Deserialize, PartialEq, Eq, PartialOrd, Ord)]
struct Bish {
bash: u16,
bosh: i8
}
fn main() -> std::io::Result<()> {
let mut btree: BTreeMap<Foo, Bish> = BTreeMap::new();
let foo = Foo {
bar: "thud".to_string(),
baz: Baz::Flob
};
let bish = Bish {
bash: 1,
bosh: 2
};
println!("foo: {}", serde_json::to_string(&foo)?);
println!("bish: {}", serde_json::to_string(&bish)?);
btree.insert(foo, bish);
println!("btree: {}", serde_json::to_string(&btree)?);
Ok(())
}
gives the runtime output/error:
foo: {"bar":"thud","baz":"Flob"}
bish: {"bash":1,"bosh":2}
Error: Custom { kind: InvalidData, error: Error("key must be a string", line: 0, column: 0) }
I've googled this, and found that the problem is that the serialiser would be trying to write:
{{"bar":"thud","baz":"Flob"}:{"bash":1,"bosh":2}}
which is not valid JSON, as keys must be strings.
The internet tells me to write custom serialisers.
This is not a practical option, as I have a large number of different non-string keys.
How can I make serde_json serialise to (and deserialise from):
{"{\"bar\":\"thud\",\"baz\":\"Flob\"}":{"bash":1,"bosh":2}}
for arbitrary non-string keys in BTreeMap and HashMap?
Although OP decided not to use JSON in the end, I have written a crate that does exactly what the original question asked for: https://crates.io/crates/serde_json_any_key. Using it is as simple as a single function call.
Because this is StackOverflow and just a link is not a sufficient answer, here is a complete implementation, combining code from v1.1 of the crate with OP's main function, replacing only the final call to serde_json::to_string:
extern crate serde;
extern crate serde_json;
use serde::{Serialize, Deserialize};
use std::collections::BTreeMap;
mod serde_json_any_key {
use std::any::{Any, TypeId};
use serde::ser::{Serialize, Serializer, SerializeMap, Error};
use std::cell::RefCell;
struct SerializeMapIterWrapper<'a, K, V>
{
pub iter: RefCell<&'a mut (dyn Iterator<Item=(&'a K, &'a V)> + 'a)>
}
impl<'a, K, V> Serialize for SerializeMapIterWrapper<'a, K, V> where
K: Serialize + Any,
V: Serialize
{
fn serialize<S>(&self, serializer: S) -> Result<S::Ok, S::Error> where
S: Serializer
{
let mut ser_map = serializer.serialize_map(None)?;
let mut iter = self.iter.borrow_mut();
// handle strings specially so they don't get escaped and wrapped inside another string
if TypeId::of::<K>() == TypeId::of::<String>() {
while let Some((k, v)) = iter.next() {
let s = (k as &dyn Any).downcast_ref::<String>().ok_or(S::Error::custom("Failed to serialize String as string"))?;
ser_map.serialize_entry(s, &v)?;
}
} else {
while let Some((k, v)) = iter.next() {
ser_map.serialize_entry(match &serde_json::to_string(&k)
{
Ok(key_string) => key_string,
Err(e) => { return Err(e).map_err(S::Error::custom); }
}, &v)?;
}
}
ser_map.end()
}
}
pub fn map_iter_to_json<'a, K, V>(iter: &'a mut dyn Iterator<Item=(&'a K, &'a V)>) -> Result<String, serde_json::Error> where
K: Serialize + Any,
V: Serialize
{
serde_json::to_string(&SerializeMapIterWrapper {
iter: RefCell::new(iter)
})
}
}
#[derive(Debug, Clone, Serialize, Deserialize, PartialEq, Eq, PartialOrd, Ord)]
struct Foo {
bar: String,
baz: Baz
}
#[derive(Debug, Clone, Copy, Serialize, Deserialize, PartialEq, Eq, PartialOrd, Ord)]
enum Baz {
Quux(u32),
Flob,
}
#[derive(Debug, Clone, Copy, Serialize, Deserialize, PartialEq, Eq, PartialOrd, Ord)]
struct Bish {
bash: u16,
bosh: i8
}
fn main() -> std::io::Result<()> {
let mut btree: BTreeMap<Foo, Bish> = BTreeMap::new();
let foo = Foo {
bar: "thud".to_string(),
baz: Baz::Flob
};
let bish = Bish {
bash: 1,
bosh: 2
};
println!("foo: {}", serde_json::to_string(&foo)?);
println!("bish: {}", serde_json::to_string(&bish)?);
btree.insert(foo, bish);
println!("btree: {}", serde_json_any_key::map_iter_to_json(&mut btree.iter())?);
Ok(())
}
Output:
foo: {"bar":"thud","baz":"Flob"}
bish: {"bash":1,"bosh":2}
btree: {"{\"bar\":\"thud\",\"baz\":\"Flob\"}":{"bash":1,"bosh":2}}
After discovering Rusty Object Notation, I realised that I was pushing a RON-shaped peg into a JSON-shaped hole.
The correct solution was to use JSON for the interface with the outside world, and RON for human-readable local data storage.

How do I use Serde to serialize a HashMap with structs as keys to JSON?

I want to serialize a HashMap with structs as keys:
use serde::{Deserialize, Serialize}; // 1.0.68
use std::collections::HashMap;
fn main() {
#[derive(Serialize, Deserialize, Debug, PartialEq, Eq, Hash)]
struct Foo {
x: u64,
}
#[derive(Serialize, Deserialize, Debug)]
struct Bar {
x: HashMap<Foo, f64>,
}
let mut p = Bar { x: HashMap::new() };
p.x.insert(Foo { x: 0 }, 0.0);
let serialized = serde_json::to_string(&p).unwrap();
}
This code compiles, but when I run it I get an error:
Error("key must be a string", line: 0, column: 0)
I changed the code:
#[derive(Serialize, Deserialize, Debug)]
struct Bar {
x: HashMap<u64, f64>,
}
let mut p = Bar { x: HashMap::new() };
p.x.insert(0, 0.0);
let serialized = serde_json::to_string(&p).unwrap();
The key in the HashMap is now a u64 instead of a string. Why does the first code give an error?
You can use serde_as from the serde_with crate to encode the HashMap as a sequence of key-value pairs:
use serde_with::serde_as; // 1.5.1
#[serde_as]
#[derive(Serialize, Deserialize, Debug)]
struct Bar {
#[serde_as(as = "Vec<(_, _)>")]
x: HashMap<Foo, f64>,
}
Which will serialize to (and deserialize from) this:
{
"x":[
[{"x": 0}, 0.0],
[{"x": 1}, 0.0],
[{"x": 2}, 0.0]
]
}
There is likely some overhead from converting the HashMap to Vec, but this can be very convenient.
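To make the intermediate representation concrete, here is a std-only sketch of the same round trip with serde left out. The helper names to_pairs and from_pairs are hypothetical. They only illustrate the shape of the data: a list of (key, value) pairs, which JSON can express as an array because no struct has to act as an object key.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct Foo {
    pub x: u64,
}

/// Flattens the map into a list of (key, value) pairs. This copy is the
/// overhead mentioned above.
pub fn to_pairs(map: &HashMap<Foo, f64>) -> Vec<(Foo, f64)> {
    map.iter().map(|(k, v)| (k.clone(), *v)).collect()
}

/// Rebuilds the map from the list of pairs, relying on `FromIterator`.
pub fn from_pairs(pairs: Vec<(Foo, f64)>) -> HashMap<Foo, f64> {
    pairs.into_iter().collect()
}
```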
According to the JSON specification, object keys must be strings. serde_json uses fmt::Display for some non-string keys, to allow serialization of a wider range of HashMaps. That's why HashMap<u64, f64> works just as HashMap<String, f64> would. However, not all types are covered (Foo is one such case).
That's why we need to provide our own Serialize implementation:
use serde::ser::{Serialize, SerializeMap, Serializer};
use std::fmt::{Display, Formatter};
impl Display for Foo {
fn fmt(&self, f: &mut Formatter) -> std::fmt::Result {
write!(f, "{}", self.x)
}
}
impl Serialize for Bar {
fn serialize<S>(&self, serializer: S) -> Result<S::Ok, S::Error>
where
S: Serializer,
{
let mut map = serializer.serialize_map(Some(self.x.len()))?;
for (k, v) in &self.x {
map.serialize_entry(&k.to_string(), &v)?;
}
map.end()
}
}
(playground)
I've found the bulletproof solution 😃
Extra dependencies not required
Compatible with HashMap, BTreeMap and other iterable types
Works with flexbuffers
The following code converts a field (map) to the intermediate Vec representation:
pub mod vectorize {
use serde::{Deserialize, Deserializer, Serialize, Serializer};
use std::iter::FromIterator;
pub fn serialize<'a, T, K, V, S>(target: T, ser: S) -> Result<S::Ok, S::Error>
where
S: Serializer,
T: IntoIterator<Item = (&'a K, &'a V)>,
K: Serialize + 'a,
V: Serialize + 'a,
{
let container: Vec<_> = target.into_iter().collect();
serde::Serialize::serialize(&container, ser)
}
pub fn deserialize<'de, T, K, V, D>(des: D) -> Result<T, D::Error>
where
D: Deserializer<'de>,
T: FromIterator<(K, V)>,
K: Deserialize<'de>,
V: Deserialize<'de>,
{
let container: Vec<_> = serde::Deserialize::deserialize(des)?;
Ok(T::from_iter(container.into_iter()))
}
}
To use it just add the module's name as an attribute:
#[derive(Debug, Serialize, Deserialize)]
struct MyComplexType {
#[serde(with = "vectorize")]
map: HashMap<MyKey, String>,
}
The remaining code, if you want to check it locally:
use anyhow::Error;
use serde::{Deserialize, Serialize};
use std::collections::HashMap;
#[derive(Debug, Clone, Serialize, Deserialize, PartialEq, Eq, PartialOrd, Ord, Hash)]
struct MyKey {
one: String,
two: u16,
more: Vec<u8>,
}
#[derive(Debug, Serialize, Deserialize)]
struct MyComplexType {
#[serde(with = "vectorize")]
map: HashMap<MyKey, String>,
}
fn main() -> Result<(), Error> {
let key = MyKey {
one: "1".into(),
two: 2,
more: vec![1, 2, 3],
};
let mut map = HashMap::new();
map.insert(key.clone(), "value".into());
let instance = MyComplexType { map };
let serialized = serde_json::to_string(&instance)?;
println!("JSON: {}", serialized);
let deserialized: MyComplexType = serde_json::from_str(&serialized)?;
let expected_value = "value".to_string();
assert_eq!(deserialized.map.get(&key), Some(&expected_value));
Ok(())
}
And on the Rust playground: https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=bf1773b6e501a0ea255ccdf8ce37e74d
While all of the provided answers achieve the goal of serializing your HashMap to JSON, they are ad hoc or hard to maintain.
One correct way to allow a specific data structure to be serialized with serde as a map key is the same way serde handles integer keys in HashMaps (which works): serialize the value to a String. This has a few advantages, namely:
Intermediate data-structure omitted,
no need to clone the entire HashMap,
easier maintained by applying OOP concepts, and
serialization usable in more complex structures such as MultiMap.
This can be done by manually implementing Serialize and Deserialize for your data-type.
I use composite ids for maps.
#[derive(Clone, Copy, PartialEq, Eq, Hash, Debug)]
pub struct Proj {
pub value: u64,
}
#[derive(Clone, Copy, PartialEq, Eq, Hash, Debug)]
pub struct Doc {
pub proj: Proj,
pub value: u32,
}
#[derive(Clone, Copy, PartialEq, Eq, Hash, Debug)]
pub struct Sec {
pub doc: Doc,
pub value: u32,
}
Manually implementing serde serialization for each of them is a hassle, so instead we delegate the implementation to the FromStr and From<Self> for String traits (the latter provides Into<String> through the blanket impl).
impl From<Doc> for String {
fn from(val: Doc) -> Self {
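// Proj's own String conversion is used here, since no Display impl for Proj is shown.
format!("{}{:08X}", String::from(val.proj), val.value)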
}
}
impl FromStr for Doc {
type Err = String;
fn from_str(s: &str) -> Result<Self, Self::Err> {
match parse_doc(s) {
Ok((_, p)) => Ok(p),
Err(e) => Err(e.to_string()),
}
}
}
In order to parse the Doc we make use of nom. The parse functionality below is explained in their examples.
use nom::{bytes::complete::take_while_m_n, combinator::map_res, IResult};
fn is_hex_digit(c: char) -> bool {
c.is_digit(16)
}
fn from_hex8(input: &str) -> Result<u32, std::num::ParseIntError> {
u32::from_str_radix(input, 16)
}
fn parse_hex8(input: &str) -> IResult<&str, u32> {
map_res(take_while_m_n(8, 8, is_hex_digit), from_hex8)(input)
}
fn parse_doc(input: &str) -> IResult<&str, Doc> {
let (input, proj) = parse_proj(input)?;
let (input, value) = parse_hex8(input)?;
Ok((input, Doc { value, proj }))
}
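For readers who want to try the string round trip without pulling in nom, here is a std-only sketch of the same idea. It is not the code above: it uses Display instead of From<Doc> for String, and the layout of 16 hex digits for Proj followed by 8 for Doc is an assumption, since the format of Proj is not shown.

```rust
use std::fmt;
use std::str::FromStr;

#[derive(Clone, Copy, PartialEq, Eq, Hash, Debug)]
pub struct Proj {
    pub value: u64,
}

#[derive(Clone, Copy, PartialEq, Eq, Hash, Debug)]
pub struct Doc {
    pub proj: Proj,
    pub value: u32,
}

impl fmt::Display for Doc {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Assumed layout: 16 hex digits for the project, 8 for the document.
        write!(f, "{:016X}{:08X}", self.proj.value, self.value)
    }
}

impl FromStr for Doc {
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        // Accept exactly 24 ASCII hex digits, so the byte-index split below
        // can never land inside a multi-byte character.
        if s.len() != 24 || !s.bytes().all(|b| b.is_ascii_hexdigit()) {
            return Err(format!("expected 24 hex digits, got {:?}", s));
        }
        let (proj, value) = s.split_at(16);
        let proj = u64::from_str_radix(proj, 16).map_err(|e| e.to_string())?;
        let value = u32::from_str_radix(value, 16).map_err(|e| e.to_string())?;
        Ok(Doc { proj: Proj { value: proj }, value })
    }
}
```

The fixed width is what makes the string unambiguous to split again. With variable-width fields a separator would be needed.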
Now we need to hook up the String conversion and str::parse(&str) to serde. We can do this using a simple macro.
macro_rules! serde_str {
($type:ty) => {
impl Serialize for $type {
fn serialize<S>(&self, serializer: S) -> Result<S::Ok, S::Error>
where
S: serde::Serializer,
{
let s: String = self.clone().into();
serializer.serialize_str(&s)
}
}
impl<'de> Deserialize<'de> for $type {
fn deserialize<D>(deserializer: D) -> Result<Self, D::Error>
where
D: serde::Deserializer<'de>,
{
paste! {deserializer.deserialize_string( [<$type Visitor>] {})}
}
}
paste! {struct [<$type Visitor>] {}}
impl<'de> Visitor<'de> for paste! {[<$type Visitor>]} {
type Value = $type;
fn expecting(&self, formatter: &mut std::fmt::Formatter) -> std::fmt::Result {
formatter.write_str("\"")
}
fn visit_str<E>(self, v: &str) -> Result<Self::Value, E>
where
E: serde::de::Error,
{
match str::parse(v) {
Ok(id) => Ok(id),
Err(_) => Err(serde::de::Error::custom("invalid format")),
}
}
}
};
}
Here we are using paste to interpolate the names. Beware that now the struct will always serialize as defined above. Never as a struct, always as a string.
It is important to implement fn visit_str instead of fn visit_string because visit_string defers to visit_str.
Finally, we have to call the macro for our custom structs
serde_str!(Sec);
serde_str!(Doc);
serde_str!(Proj);
Now the specified types can be serialized to and from string with serde.
