Encode Arc<RwLock<serde_json::Value>> to PostgreSQL jsonb - rust

Obligatory disclaimer: I'm definitely no Rust/SQL expert, but I'm still learning. I'd appreciate any help in this process.
I am trying to encode a struct containing a field of type Arc<RwLock<json>> into the jsonb PostgreSQL type using the sqlx crate. My code takes the general form:
...
use std::sync::Arc;
use parking_lot::RwLock;
use serde_json::value::Value as json;
#[derive(Clone, Debug, Serialize, Deserialize)]
pub struct ArcJson {
    pub data: Arc<RwLock<json>>,
}
#[derive(Clone, Debug, Serialize, Deserialize)]
pub struct MyStruct {
    ...
    pub my_data: ArcJson,
}
...
// this is an associated function (self is sqlx::postgres::PgPool)
pub async fn add_my_struct_instance_to_db(self, my_new_data: MyStruct) -> Result<MyStruct, sqlx::Error> {
    match sqlx::query("INSERT INTO postgres_table (..., postgres_jsonb_data) VALUES ($1, $2) RETURNING ..., postgres_jsonb_data")
        .bind(my_new_data.my_data)
        .map(|row: PgRow| MyStruct {
            ...
            my_data: ArcJson { data: Arc::new(RwLock::new(row.get("data"))) },
        })
        .fetch_one(&self.connection)
        .await
    {
        Ok(data) => Ok(data),
        Err(e) => Err(e),
    }
}
The error I'm getting is that the trait bound `ArcJson: Encode<'_, Postgres>` is not satisfied.
I've tried to derive sqlx::Type for ArcJson to no avail. What is the simplest way to basically say "to convert all you have to do is read the locks and clone the internal json"? (I know cloning here isn't the most efficient thing, but I'd like to avoid going down this sort of rabbit hole if I can avoid it for now).
Lifetime annotations scare me, but they're required to manually implement encode. Can anyone advise on how I should proceed? I'm not sure if my brain is just too small or I'm over-complicating things.
PS: I tried to simplify the code as much as possible to best convey the gist of the issue, please let me know if it is unclear or confusing
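A minimal sketch of the "read the lock and clone the inner value" idea, using only the standard library. The assumptions are loud ones: `String` stands in for `serde_json::Value`, `std::sync::RwLock` replaces `parking_lot::RwLock` (whose `read()` returns the guard directly instead of a `Result`), and `ArcData` and `snapshot` are hypothetical names. If sqlx's `json` feature is enabled, the cloned value could presumably be bound in place of the wrapper, e.g. `.bind(sqlx::types::Json(my_new_data.my_data.snapshot()))`, which would sidestep a manual `Encode` impl; check that against the sqlx version in use.

```rust
use std::sync::{Arc, RwLock};

/// Stand-in for the question's `ArcJson`; `T` plays the role of `serde_json::Value`.
#[derive(Clone, Debug)]
pub struct ArcData<T> {
    pub data: Arc<RwLock<T>>,
}

impl<T: Clone> ArcData<T> {
    /// Wrap a value in `Arc<RwLock<_>>`.
    pub fn new(value: T) -> Self {
        ArcData {
            data: Arc::new(RwLock::new(value)),
        }
    }

    /// Hypothetical helper: take the read lock and clone the inner value.
    /// The guard is a temporary, so the lock is released when this returns.
    pub fn snapshot(&self) -> T {
        self.data.read().expect("lock poisoned").clone()
    }
}
```

The snapshot is an independent copy: later writes through the lock do not change it, which is the behavior you want when handing a value to a query.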

Related

How to create an arbitrary HashMap to use in rust rocket for a web API

I'm building a web API with Rocket to try out the framework. I managed to return paginated results with a custom struct that implements Serialize.
However, the API I'm trying to build depends on arbitrary values in a dictionary. The received values may be strings, integers, bools, or other complex objects. The problem is that I can't create a struct that contains "any", since Any is not serializable.
The basic idea would be something like this:
#[derive(Debug, Serialize, Deserialize)]
pub struct Foobar<'a> {
    pub id: Uuid,
    pub data: HashMap<&'a str, ??????>,
}
Even with enums, the problem remains, since there are infinitely many variations. Say I use an enum to distinguish strings, bools, and integers. When the contained value is another type, I need the JSON representation of that specific type: basically another map of string -> any.
The current idea would be to use:
#[derive(Debug, Serialize, Deserialize)]
pub struct Foobar {
    pub id: Uuid,
    pub data: HashMap<String, rocket::serde::json::Value>,
}
But I don't know how the API will fare with non-JSON formats (e.g. msgpack).
Has somebody accomplished such a feat with rust/rocket?
After several tries with traits and std::any::Any, I ended up with:
pub type DataObject = HashMap<String, DataValue>;
#[derive(Debug, Serialize, Deserialize, ToSchema)]
pub enum DataValue {
    String(String),
    Integer(i64),
    Float(f64),
    Boolean(bool),
    Date(String),
    Time(String),
    DateTime(String),
    Reference(DataObject),
    ArrayReference(Vec<DataObject>),
}
This is - more or less - the same as serde_json::Value. It helps to further check the values and validate them against a possible type description of the objects.
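To illustrate the "validate against a type description" part, here is a rough std-only sketch. It assumes a reduced set of variants (the date/time ones are left out), drops the serde and `ToSchema` derives so it compiles without dependencies, and uses two hypothetical helpers, `kind` and `matches_description`, that are not part of the code above.

```rust
use std::collections::HashMap;

pub type DataObject = HashMap<String, DataValue>;

/// Reduced version of the enum above, without serde derives.
#[derive(Debug, Clone, PartialEq)]
pub enum DataValue {
    String(String),
    Integer(i64),
    Float(f64),
    Boolean(bool),
    Reference(DataObject),
    ArrayReference(Vec<DataObject>),
}

/// Hypothetical helper: a name for the variant, to compare with a description.
pub fn kind(value: &DataValue) -> &'static str {
    match value {
        DataValue::String(_) => "string",
        DataValue::Integer(_) => "integer",
        DataValue::Float(_) => "float",
        DataValue::Boolean(_) => "boolean",
        DataValue::Reference(_) => "reference",
        DataValue::ArrayReference(_) => "array_reference",
    }
}

/// Hypothetical check: every key in `expected` must be present with that kind.
pub fn matches_description(object: &DataObject, expected: &[(&str, &str)]) -> bool {
    expected
        .iter()
        .all(|(key, want)| object.get(*key).map(kind) == Some(*want))
}
```

A real validator would probably recurse into `Reference` and `ArrayReference`; this only checks the top level.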

How To Do Zero-Copy Deserialization of Recursive Enums with Serde?

I'm not even sure it's possible with serde, but what I'm trying to do is something along the following:
#[derive(serde::Deserialize)]
pub enum Tree<'a> {
    Zero,
    One(&'a Tree<'a>),
    Two(&'a Tree<'a>, &'a Tree<'a>),
    Three(&'a Tree<'a>, &'a Tree<'a>, &'a Tree<'a>),
}
Is this possible using specific serde attributes (like #[serde(borrow)], etc.)? Is it required to do a custom implementation of Deserialize? Or is it not something serde can do?
You can't because something has to own all the new Tree objects.
You can however create a similar structure:
#[derive(Debug, serde::Serialize, serde::Deserialize)]
pub enum Tree<'a> {
    Zero(&'a str),
    One(Box<Tree<'a>>),
    Two(Box<(Tree<'a>, Tree<'a>)>),
    Three(Box<(Tree<'a>, Tree<'a>, Tree<'a>)>),
}
I added a &'a str argument to Zero to have some use for that lifetime; otherwise you could just get rid of it altogether.
Boxes are needed because otherwise the type would have infinite size.
This is still zero-copy since we're not copying any data from the underlying array. It is, however, not zero-allocation, which might work with some hacks or in special cases but is generally impossible.
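To make the zero-copy versus zero-allocation distinction concrete, here is a small std-only sketch. It builds the tree by hand rather than through serde, and `build` and `leaves` are hypothetical helpers made up for the illustration. The leaf strings are slices of the input buffer; only the `Box` nodes allocate.

```rust
/// Same shape as the Box-based enum above, trimmed to three variants.
#[derive(Debug)]
pub enum Tree<'a> {
    Zero(&'a str),
    One(Box<Tree<'a>>),
    Two(Box<(Tree<'a>, Tree<'a>)>),
}

/// Hypothetical builder: split the input at the first comma and borrow both halves.
pub fn build(input: &str) -> Tree<'_> {
    let (left, right) = input.split_once(',').unwrap_or((input, ""));
    // The Boxes allocate tree nodes, but the leaves only borrow from `input`.
    Tree::Two(Box::new((
        Tree::Zero(left),
        Tree::One(Box::new(Tree::Zero(right))),
    )))
}

/// Collect the leaf slices in left-to-right order without copying them.
pub fn leaves<'a>(tree: &Tree<'a>, out: &mut Vec<&'a str>) {
    match tree {
        Tree::Zero(s) => out.push(*s),
        Tree::One(child) => leaves(child, out),
        Tree::Two(pair) => {
            leaves(&pair.0, out);
            leaves(&pair.1, out);
        }
    }
}
```

Because the leaves carry the lifetime `'a` of the input, the borrow checker keeps the buffer alive for as long as the tree is in use.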
I figured out the closest possible thing to what I wanted to do without allocation:
#[derive(serde::Deserialize)]
pub enum Tree<'a> {
    Zero,
    One(&'a [u8]),
    Two(&'a [u8], &'a [u8]),
    Three(&'a [u8], &'a [u8], &'a [u8]),
}
Then each individual slice would be deserialized into a Tree on descent. As @Caesar pointed out, this would not technically be zero-copy, though, depending on your definition (I think it's kind of a gray area).

Why does switching from struct to enum break the API, exactly?

I encountered an interesting change in a public PR.
Initially they had:
#[derive(Debug, Clone, PartialEq, Eq, Copy)]
pub struct ParseError(ParseErrorKind);
#[derive(Debug, Clone, PartialEq, Eq, Copy)]
enum ParseErrorKind {
    OutOfRange // ... omitting other values here to be short
}
ParseError cannot be instantiated by clients because ParseErrorKind is private. They are making that enum public now, which seems ok, but I suggested an alternative: have ParseError be an enum itself, and leverage the type system instead of imitating it with the notion of "kind". They told me that would be an API breakage, and therefore was not ok.
I think I understand why, in theory, a struct and an enum are different. But I'm not sure I understand why they are incompatible in this precise case.
Since the struct ParseError had no mutable field and cannot be instantiated by clients, there was nothing we could do with the type but to assign it and compare it. It seems both struct and enum support that, so client code is unlikely to require a change to compile with a newer version exposing an enum instead of struct. Or did I miss another use we could have with the struct, that would result in requiring a change in client code?
However, there might be an ABI incompatibility too. How does Rust handle the struct in practice, knowing that only the library can construct it? Is there any sort of allocation or deallocation mechanism that requires knowing precisely what ParseError is made of at build time? And does switching from that exact struct to an enum affect that? Or could it be safe in this particular case? And is it even relevant to try to maintain the ABI, since it is not guaranteed so far?
That's because a braced struct pattern such as Foo { .. } is valid for any struct, whatever its fields are, but does not compile once Foo is an enum:
struct Foo {}
fn returns_a_foo() -> Foo {
    // anything that may return a Foo
    Foo {}
}
fn main() {
    if let Foo { .. } = returns_a_foo() {}
}
For example, this code compiles:
fn main() {
    if let String { .. } = String::new() {}
}
Playground.
And while probably not code you'd write on your own, it's still possible to write, and additionally, possible to generate through a macro. Note that this is then, obviously, not compatible with an enum pattern match:
if let Option { .. } = None {
    // Compile error.
}
Playground.
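Applied to the ParseError case from the question, the breakage could look like the sketch below. The module name `lib_v1` and the functions `parse` and `client_is_error` are hypothetical stand-ins for the library and its client; only `ParseError` and `ParseErrorKind` come from the question. The client never names the private field, yet its pattern is only valid while `ParseError` is a struct.

```rust
mod lib_v1 {
    /// Public struct with a private field, as in the original library.
    #[derive(Debug, Clone, Copy, PartialEq, Eq)]
    pub struct ParseError(ParseErrorKind);

    #[derive(Debug, Clone, Copy, PartialEq, Eq)]
    enum ParseErrorKind {
        OutOfRange,
    }

    /// Hypothetical library function that can fail with `ParseError`.
    pub fn parse(input: &str) -> Result<u8, ParseError> {
        input
            .parse::<u8>()
            .map_err(|_| ParseError(ParseErrorKind::OutOfRange))
    }
}

/// Hypothetical client code, living outside the library module.
pub fn client_is_error(input: &str) -> bool {
    match lib_v1::parse(input) {
        Ok(_) => false,
        // Struct pattern: it names no field, so the private field is no obstacle.
        // If `ParseError` were changed to an enum, this arm would stop compiling.
        Err(lib_v1::ParseError { .. }) => true,
    }
}
```

So even a type that clients can neither construct nor inspect still exposes "is a struct" as part of its public surface.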

How to only allow one field or the other with Serde?

Say I have this struct:
use serde::{Serialize, Deserialize};
#[derive(Deserialize)]
struct MyStruct {
    field_1: Option<usize>, // should only have field_1 or field_2
    field_2: Option<usize>, // should only have field_1 or field_2
    other_field: String,
}
How can I deserialize this but only allow one of these fields to exist?
The suggestions in the comments to use an enum are likely your best bet. You don't need to replace your struct with an enum; instead, you'd add a separate enum type to represent this constraint, e.g.:
use serde::{Serialize, Deserialize};
#[derive(Deserialize)]
enum OneOf {
    F1(usize),
    F2(usize),
}
#[derive(Deserialize)]
struct MyStruct {
    one_of_field: OneOf,
    other_field: String,
}
Now MyStruct's one_of_field can be initialized with either an F1 or an F2.
@dimo414's answer is correct that an enum is needed, but the code sample does not behave the way the question asks. This comes down to how serde deserializes enums. Mainly, it does not enforce mutual exclusion of the two variants: it silently picks the first variant that matches and ignores extraneous fields. In addition, the enum is treated as a separate structure nested within MyStruct (e.g. {"one_of_field":{"F1":123},"other_field":"abc"} in JSON).
Solution
For anyone wanting an easy solution, here it is. However, keep in mind that variants of the mutually exclusive type cannot contain #[serde(flatten)] fields (more information in the issue section). To allow neither field_1 nor field_2 to be present, Option<MutuallyExclusive> can be used in MyStruct.
use serde::Deserialize;
/// Enum containing mutually exclusive fields. Variant names will be used as the
/// names of fields unless annotated with `#[serde(rename = "field_name")]`.
#[derive(Deserialize)]
enum MutuallyExclusive {
    field_1(usize),
    field_2(usize),
}
/// `deny_unknown_fields` is required. If not included, it will not error when
/// `field_1` and `field_2` are both present.
#[derive(Deserialize)]
#[serde(deny_unknown_fields)]
struct MyStruct {
    /// Flatten makes it so the variants of MutuallyExclusive are seen as fields of
    /// this struct. Without it, foo would be treated as a separate struct/object held
    /// within this struct.
    #[serde(flatten)]
    foo: MutuallyExclusive,
    other_field: String,
}
The Issue
TL;DR: It should be fine to use deny_unknown_fields with flatten in this way so long as types used in MutuallyExclusive do not use flatten.
If you read the serde documentation, you may notice it warns that using deny_unknown_fields in conjunction with flatten is unsupported. This is problematic, as it calls the long-term reliability of the above code into question. At the time of writing, serde does not produce any errors or warnings about this configuration and handles it as intended.
The pull request adding this warning cited 3 issues when doing so:
#[serde(flatten)] error behavior is different to normal one
deny_unknown_fields incorrectly fails with flattened untagged enum
Structs with nested flattens cannot be deserialized if deny_unknown_fields is set
To be honest, I don't really care about the first one. I personally feel it is a bit overly pedantic. It simply states that the error message is not exactly identical between a type and a wrapper for that type using flatten for errors triggered by deny_unknown_fields. This should not have any effect on the functionality of the code above.
However, the other two issues are relevant. They both relate to nested flatten types within a deny_unknown_fields type. Technically, the second issue uses untagged for the second layer of nesting, but it has the same effect as flatten in this context. The main idea is that deny_unknown_fields is unable to handle more than a single level of nesting without causing issues. The use case is not in any way at fault, but the way deny_unknown_fields and flatten are handled makes it difficult to implement a workaround.
Alternative
However, if anyone still feels uncomfortable with using the above code, you can use this version instead. It will be a pain to work with if there are a lot of other fields, but sidesteps the warning in the documentation.
#[derive(Debug, Deserialize)]
#[serde(untagged, deny_unknown_fields)]
enum MyStruct {
    WithField1 {
        field_1: usize,
        other_field: String,
    },
    WithField2 {
        field_2: usize,
        other_field: String,
    },
}
You can deserialize your struct and then verify all the invariants your type should uphold. You can implement Deserialize for your type to do this while still relying on the derive macro to do the heavy lifting.
use serde::{Deserialize, Deserializer};
#[derive(Debug, Deserialize)]
#[serde(remote = "Self")]
struct MyStruct {
    field_1: Option<usize>, // should only have field_1 or field_2
    field_2: Option<usize>, // should only have field_1 or field_2
    other_field: String,
}
impl<'de> Deserialize<'de> for MyStruct {
    fn deserialize<D: Deserializer<'de>>(deserializer: D) -> Result<Self, D::Error> {
        use serde::de::Error;
        // Calls the inherent `deserialize` generated by `remote = "Self"`.
        let s = Self::deserialize(deserializer)?;
        if s.field_1.is_some() && s.field_2.is_some() {
            return Err(D::Error::custom("should only have field_1 or field_2"));
        }
        Ok(s)
    }
}
fn main() {
    dbg!(serde_json::from_value::<MyStruct>(serde_json::json!({
        "field_1": 123,
        "other_field": "abc"
    })));
    dbg!(serde_json::from_value::<MyStruct>(serde_json::json!({
        "field_2": 456,
        "other_field": "abc"
    })));
    dbg!(serde_json::from_value::<MyStruct>(serde_json::json!({
        "field_1": 123,
        "field_2": 456,
        "other_field": "abc"
    })));
}
Playground
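Stripped of serde, the invariant that the answers above enforce boils down to a conversion from two optional fields into an enum. The sketch below is std-only and purely illustrative: `Choice` and `at_most_one` are hypothetical names, and it assumes that having neither field is acceptable (as the last answer's check does).

```rust
/// Hypothetical enum form of "field_1 or field_2, never both".
#[derive(Debug, PartialEq)]
pub enum Choice {
    Field1(usize),
    Field2(usize),
}

/// Reject the case where both fields are present; accept one or none.
pub fn at_most_one(
    field_1: Option<usize>,
    field_2: Option<usize>,
) -> Result<Option<Choice>, String> {
    match (field_1, field_2) {
        (Some(_), Some(_)) => Err("should only have field_1 or field_2".to_string()),
        (Some(v), None) => Ok(Some(Choice::Field1(v))),
        (None, Some(v)) => Ok(Some(Choice::Field2(v))),
        (None, None) => Ok(None),
    }
}
```

Once the data is in the enum form, the "both present" state can no longer be represented, which is the main argument for the enum-based answers.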

Is it possible to use std::rc::Rc with a trait type?

The code looks like this:
// Simplified
pub trait Field: Send + Sync + Clone {
    fn name(&self);
}
#[deriving(Clone)]
pub enum Select {
    SelectOnly(Vec<Rc<Field>>),
    SelectAll
}
The error is:
the trait `core::kinds::Sized` is not implemented for the type `Field+'static`
Is there any other way to have the vector with reference-counted immutable objects of trait type?
I suppose that I can rewrite the code like this:
#[deriving(Clone)]
pub enum Select {
    SelectOnly(Vec<Rc<Box<Field>>>),
    SelectAll
}
Is it the right way?
It is possible to create a trait object with an Rc as of Rust 1.1. This compiles:
use std::rc::Rc;
trait Field: Send + Sync {
    fn name(&self);
}
enum Select {
    Only(Vec<Rc<Field>>),
    All,
}
// ---
struct Example;
impl Field for Example {
    fn name(&self) {}
}
fn main() {
    let fields: Vec<Rc<Field>> = vec![Rc::new(Example)];
    Select::Only(fields);
}
Note that your original example used Clone, but you cannot make such a trait into a trait object because it is not object safe. I've removed it to answer the question.
I also removed the redundancy of the enum variant names.
I believe that it should be possible with DST, but Rust is not there just yet. The major motivation for DST was exactly the desire to use trait objects with any kind of smart pointer. As far as I know, this should be possible by 1.0 release.
As a temporary workaround, indeed, you can use Rc<Box<T>>, though this kind of double indirection is unfortunate.
It will be possible after #18248 and #16918.
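For readers on current Rust, the same idea is written with the `dyn` keyword, i.e. `Rc<dyn Field>`. The sketch below is an illustration only: `Column` and `selected_names` are hypothetical, and `name` returns a `String` here (unlike the question's version) just so there is something to observe.

```rust
use std::rc::Rc;

/// No `Clone` supertrait: `Clone` requires `Sized`, which would prevent `dyn Field`.
pub trait Field: Send + Sync {
    fn name(&self) -> String;
}

pub enum Select {
    Only(Vec<Rc<dyn Field>>),
    All,
}

/// Hypothetical implementor of `Field`.
pub struct Column(pub &'static str);

impl Field for Column {
    fn name(&self) -> String {
        self.0.to_string()
    }
}

/// Hypothetical helper: list the selected field names ("*" for everything).
pub fn selected_names(select: &Select) -> Vec<String> {
    match select {
        Select::Only(fields) => fields.iter().map(|f| f.name()).collect(),
        Select::All => vec!["*".to_string()],
    }
}
```

Cloning an `Rc<dyn Field>` only bumps the reference count, so the vector can share fields without the trait itself needing `Clone`, and no `Rc<Box<_>>` double indirection is required.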
