How can I disable struct construction while keeping pattern matching in Rust?
Let's see an example:
struct OrderedPair(pub u32, pub u32);
impl OrderedPair {
fn new(a: u32, b: u32) -> Self {
if a < b {
Self(a, b)
} else {
Self(b, a)
}
}
}
Obviously, I want to prevent direct construction of such a struct (e.g. OrderedPair(2, 1)) and allow only the new method, in order to preserve the invariant. I know of 3 ways to do this:
Make the fields private
struct OrderedPair(u32, u32);
Add a private dummy field
struct OrderedPair(pub u32, pub u32, ());
Make the struct non-exhaustive
#[non_exhaustive]
struct OrderedPair(pub u32, pub u32);
The issues are that with option 1 I cannot access the members at all, and with all three I cannot use pattern matching:
let OrderedPair(min, max) = my_ordered_pair;
So is there a way to block struct construction but allow pattern matching?
I know that if we declare a mutable variable of that type with public access to members then the invariant can be broken by manually changing the members, but for now avoiding the struct constructor is enough.
Instead of pattern matching directly on the fields, you can match on a returned tuple:
#[derive(Clone, Copy)]
pub struct OrderedPair {
a: u32,
b: u32,
}
impl OrderedPair {
pub fn new(a: u32, b: u32) -> Self {
let (a, b) = if a < b { (a, b) } else { (b, a) };
Self { a, b }
}
pub fn content(self) -> (u32, u32) {
(self.a, self.b)
}
}
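As a usage sketch, the tuple returned by content can be destructured much like the original tuple struct, while the private fields keep a struct literal from compiling outside the defining module. The module name pair and the main function below are only illustrative assumptions, not part of the answer above.

```rust
mod pair {
    #[derive(Clone, Copy)]
    pub struct OrderedPair {
        a: u32,
        b: u32,
    }

    impl OrderedPair {
        /// The only way to build an `OrderedPair` from outside this module.
        pub fn new(a: u32, b: u32) -> Self {
            let (a, b) = if a < b { (a, b) } else { (b, a) };
            Self { a, b }
        }

        /// Returns the fields as a `(min, max)` tuple for destructuring.
        pub fn content(self) -> (u32, u32) {
            (self.a, self.b)
        }
    }
}

use pair::OrderedPair;

fn main() {
    let my_ordered_pair = OrderedPair::new(2, 1);
    // Destructure the returned tuple instead of the struct itself.
    let (min, max) = my_ordered_pair.content();
    println!("{min} {max}"); // prints "1 2"
    // `OrderedPair { a: 2, b: 1 }` would not compile here: the fields are private.
}
```

Because the struct is Copy, calling content does not consume the pair, so it can still be used afterwards.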
I have a recursive data structure in my pet-project:
(this is a simplified example)
pub trait Condition {
fn validate(&self, s: &str) -> bool;
}
pub struct Equal {
ref_val: String,
}
impl Condition for Equal {
fn validate(&self, s: &str) -> bool { self.ref_val == s }
}
pub struct And<A, B> where A: Condition + ?Sized, B: Condition + ?Sized {
left: Box<A>,
right: Box<B>,
}
impl<A, B> Condition for And<A, B> where A: Condition + ?Sized, B: Condition + ?Sized {
fn validate(&self, s: &str) -> bool { self.left.validate(s) && self.right.validate(s) }
}
and I want to serialize and deserialize the Condition trait (using serde), e.g.:
fn main() {
let c = And {
left: Box::new(Equal{ ref_val: "goofy".to_string() }),
right: Box::new(Equal{ ref_val: "goofy".to_string() }),
};
let s = serde_json::to_string(&c).unwrap();
let d: Box<dyn Condition> = serde_json::from_str(&s).unwrap();
}
Because serde cannot deserialize dyn traits out of the box, I tagged the serialized markup, e.g.:
#[derive(PartialEq, Debug, Serialize)]
#[serde(tag="type")]
pub struct Equal {
ref_val: String,
}
and tried to implement a Deserializer and a Visitor for Box<dyn Condition>.
Since I am new to Rust, and because implementing a Deserializer and a Visitor is not that straightforward with the given documentation, I wonder if someone has an idea how to solve my issue with an easier approach?
I went through the serde documentation and searched for solutions on tech sites and forums.
I tried out typetag, but it does not support generic types.
UPDATE:
To be more precise: the serialization works fine, i.e. serde can serialize any concrete object of the Condition trait, but in order to deserialize a Condition the concrete type information needs to be provided. But this type info is not available at compile time. I am writing a web service where customers can upload rules for context matching (i.e. Conditions), so the controller of the web service does not know the type when the condition needs to be deserialized.
E.g. a customer can post:
{"type":"Equal","ref_val":"goofy"}
or
{"type":"Greater","ref_val":"Pluto"}
or something more complex with any combinator ('and', 'or', 'not'):
{"type":"And","left":{"type":"Greater","ref_val":"Gamma"},"right":{"type":"Equal","ref_val":"Delta"}}
and therefore I need to deserialize to a trait (dyn Condition) using the type tags in the serialized markup...
I would say the “classic” way to solve this problem is by deserializing into an enum with one variant per potential “real” type you're going to deserialize into. Unfortunately, And is generic, which means those generic parameters have to exist on the enum as well, so you have to specify them where you deserialize.
use serde::{Deserialize, Serialize}; // 1.0.152
use serde_json; // 1.0.91
pub trait Condition {
fn validate(&self, s: &str) -> bool;
}
#[derive(PartialEq, Debug, Serialize, Deserialize)]
pub struct Equal {
ref_val: String,
}
impl Condition for Equal {
fn validate(&self, s: &str) -> bool {
self.ref_val == s
}
}
#[derive(PartialEq, Debug, Serialize, Deserialize)]
pub struct And<A, B>
where
A: Condition + ?Sized,
B: Condition + ?Sized,
{
left: Box<A>,
right: Box<B>,
}
impl<A, B> Condition for And<A, B>
where
A: Condition + ?Sized,
B: Condition + ?Sized,
{
fn validate(&self, s: &str) -> bool {
self.left.validate(s) && self.right.validate(s)
}
}
#[derive(Debug, Serialize, Deserialize)]
#[serde(untagged)]
enum Expr<A, B>
where
A: Condition + ?Sized,
B: Condition + ?Sized,
{
Equal(Equal),
And(And<A, B>),
}
fn main() {
let c = And {
left: Box::new(Equal {
ref_val: "goofy".to_string(),
}),
right: Box::new(Equal {
ref_val: "goofy".to_string(),
}),
};
let s = serde_json::to_string(&c).unwrap();
let d: Expr<Equal, Equal> = serde_json::from_str(&s).unwrap();
println!("{d:?}");
}
Prints And(And { left: Equal { ref_val: "goofy" }, right: Equal { ref_val: "goofy" } })
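One way to sidestep the generic parameters, sketched here under assumptions rather than taken from the answer above, is to let the enum itself be the recursive type, with combinator variants holding boxed sub-expressions. The Not variant and its inner field are hypothetical additions based on the combinators the question mentions. The serde derives are left out so the sketch compiles with the standard library alone; with serde available, deriving Serialize and Deserialize on the enum together with #[serde(tag = "type")] should correspond to the {"type":"And", ...} shape shown in the question.

```rust
pub trait Condition {
    fn validate(&self, s: &str) -> bool;
}

/// One variant per condition kind. Combinators hold boxed sub-expressions,
/// so no generic parameters are needed on the enum.
#[derive(Debug, PartialEq)]
pub enum Expr {
    Equal { ref_val: String },
    And { left: Box<Expr>, right: Box<Expr> },
    // Hypothetical extra combinator, not part of the original answer.
    Not { inner: Box<Expr> },
}

impl Condition for Expr {
    fn validate(&self, s: &str) -> bool {
        match self {
            Expr::Equal { ref_val } => ref_val == s,
            Expr::And { left, right } => left.validate(s) && right.validate(s),
            Expr::Not { inner } => !inner.validate(s),
        }
    }
}

fn main() {
    let expr = Expr::And {
        left: Box::new(Expr::Equal { ref_val: "goofy".to_string() }),
        right: Box::new(Expr::Not {
            inner: Box::new(Expr::Equal { ref_val: "pluto".to_string() }),
        }),
    };
    println!("{}", expr.validate("goofy")); // prints "true"
}
```

The trade-off is that the set of conditions is closed: adding a new kind of condition means adding a variant, whereas trait objects stay open to new implementations.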
I removed the generics from the combinator conditions, so I can now use typetag as @EvilTak suggested:
#[derive(Serialize, Deserialize)]
#[serde(tag="type")]
pub struct And {
left: Box<dyn Condition>,
right: Box<dyn Condition>,
}
#[typetag::serde]
impl Condition for And {
fn validate(&self, s: &str) -> bool { self.left.validate(s) && self.right.validate(s) }
}
(On the downside, I had to remove the derive macros PartialEq and Debug.)
Interesting side note: I have to keep the #[serde(tag="type")] on the And struct, because otherwise the type tag is omitted in the serialization (for the primitive conditions it is not needed).
UPDATE: typetag adds the type tag only for trait objects, so the #[serde(tag="type")] is not needed...
I'm working with two crates: A and B. I control both. I'd like to create a struct in A that has a field whose type is known only to B (i.e., A is independent of B, but B is dependent on A).
crate_a:
#[derive(Clone)]
pub struct Thing {
pub foo: i32,
pub bar: *const i32,
}
impl Thing {
pub fn new(x: i32) -> Self {
Thing { foo: x, bar: &0 }
}
}
crate_b:
struct Value {}
fn func1() {
let mut x = A::Thing::new(1);
let y = Value {};
x.bar = &y as *const Value as *const i32;
...
}
fn func2() {
...
let y = unsafe { &*(x.bar as *const Value) };
...
}
This works, but it doesn't feel very "rusty". Is there a cleaner way to do this? I thought about using a trait object, but ran into issues with Clone.
Note: My reason for splitting these out is that the dependencies in B make compilation very slow. Value above is actually from llvm_sys. I'd rather not leak that into A, which has no other dependency on llvm.
The standard way to implement something like this is with generics, which are kind of like type variables: they can be "assigned" a particular type, possibly within some constraints. This is how the standard library can provide types like Vec that work with types that you declare in your crate.
Basically, generics allow Thing to be defined in terms of "some unknown type that will become known later when this type is actually used."
Given the example in your code, it looks like Thing's bar field may or may not be set, which suggests that the built-in Option enum should be used. All you have to do is put a type parameter on Thing and pass that through to Option, like so:
pub mod A {
#[derive(Clone)]
pub struct Thing<T> {
pub foo: i32,
pub bar: Option<T>,
}
impl<T> Thing<T> {
pub fn new(x: i32) -> Self {
Thing { foo: x, bar: None }
}
}
}
pub mod B {
use crate::A;
struct Value;
fn func1() {
let mut x = A::Thing::new(1);
let y = Value;
x.bar = Some(y);
// ...
}
fn func2(x: &A::Thing<Value>) {
// ...
let y: &Value = x.bar.as_ref().unwrap();
// ...
}
}
Here, the x in B::func1() has the type Thing<Value>. You can see with this syntax how Value is substituted for T, which makes the bar field Option<Value>.
If Thing's bar isn't actually supposed to be optional, just write pub bar: T instead, and accept a T in Thing::new() to initialize it:
pub mod A {
#[derive(Clone)]
pub struct Thing<T> {
pub foo: i32,
pub bar: T,
}
impl<T> Thing<T> {
pub fn new(x: i32, y: T) -> Self {
Thing { foo: x, bar: y }
}
}
}
pub mod B {
use crate::A;
struct Value;
fn func1() {
let mut x = A::Thing::new(1, Value);
// ...
}
fn func2(x: &A::Thing<Value>) {
// ...
let y: &Value = &x.bar;
// ...
}
}
Note that the definition of Thing in both of these cases doesn't actually require that T implement Clone; however, Thing<T> will only implement Clone if T also does. #[derive(Clone)] will generate an implementation like:
impl<T> Clone for Thing<T> where T: Clone { /* ... */ }
This can allow your type to be more flexible -- it can now be used in contexts that don't require T to implement Clone, while also being cloneable when T does implement Clone. You get the best of both worlds this way.
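A small sketch of that behaviour, using the Option-based Thing from above. The payload types Cloneable and NotCloneable are hypothetical stand-ins for types owned by crate B.

```rust
#[derive(Clone)]
pub struct Thing<T> {
    pub foo: i32,
    pub bar: Option<T>,
}

impl<T> Thing<T> {
    pub fn new(x: i32) -> Self {
        Thing { foo: x, bar: None }
    }
}

// Hypothetical payload types, standing in for types owned by crate B.
#[derive(Clone)]
struct Cloneable(u8);
struct NotCloneable(u8);

fn main() {
    // `Thing<NotCloneable>` is a perfectly valid type; it just cannot be cloned.
    let mut a: Thing<NotCloneable> = Thing::new(1);
    a.bar = Some(NotCloneable(7));
    // let a2 = a.clone(); // would not compile: `NotCloneable` does not implement `Clone`

    // `Thing<Cloneable>` gets `Clone` through the bounded impl the derive generates.
    let mut b: Thing<Cloneable> = Thing::new(2);
    b.bar = Some(Cloneable(9));
    let b2 = b.clone();

    println!("{} {:?}", a.foo, a.bar.map(|v| v.0));
    println!("{} {:?}", b2.foo, b2.bar.map(|v| v.0));
}
```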
For a Gameboy emulator, you have two u8 fields for registers A and F, but they can sometimes be accessed as AF, a combined u16 register.
In C, it looks like you can do something like this:
struct {
union {
struct {
unsigned char f;
unsigned char a;
};
unsigned short af;
};
};
(Taken from here)
Is there a way in Rust, ideally without unsafe, of being able to access two u8s as registers.a/registers.f, but also be able to use them as the u16 registers.af?
I can give you a couple of ways to do it. The first is a straightforward unsafe analogue without the boilerplate; the second is safe but explicit.
Unions in Rust are very similar, so you can translate it to this:
#[repr(C)]
struct Inner {
f: u8,
a: u8,
}
#[repr(C)]
union S {
inner: Inner,
af: u16,
}
// Usage:
// Putting data is safe:
let s = S { af: 12345 };
// but retrieving is not:
let a = unsafe { s.inner.a };
Or as an alternative you may manually do all of the explicit casts wrapped in a structure:
#[repr(transparent)]
// These derives are optional, but `Copy` allows chaining the setters;
// you may remove them and change the method
// signatures to `&self` and `&mut self`.
#[derive(Clone, Copy)]
struct T(u16);
impl T {
pub fn from_af(af: u16) -> Self {
Self(af)
}
pub fn from_a_f(a: u8, f: u8) -> Self {
// `f` is the low byte and `a` the high byte, matching `f()` and `a()` below.
Self::from_af(u16::from_le_bytes([f, a]))
}
pub fn af(self) -> u16 {
self.0
}
pub fn f(self) -> u8 {
self.0.to_le_bytes()[0]
}
pub fn set_f(self, f: u8) -> Self {
Self::from_a_f(self.a(), f)
}
pub fn a(self) -> u8 {
self.0.to_le_bytes()[1]
}
pub fn set_a(self, a: u8) -> Self {
Self::from_a_f(a, self.f())
}
}
// Usage:
let t = T::from_af(12345);
let a = t.a();
let new_af = t.set_a(12).set_f(t.f() + 1).af();
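A third option, sketched here as an alternative rather than as part of the answer above, is to store the two u8 registers as plain fields and compute the combined value on demand. This keeps registers.a and registers.f as direct field accesses and needs no unsafe. The names Registers, af and set_af are assumptions, and the sketch assumes a is the high byte of af, as in the C layout on a little-endian machine.

```rust
#[derive(Clone, Copy, Debug, Default, PartialEq)]
struct Registers {
    a: u8,
    f: u8,
}

impl Registers {
    /// Combined 16-bit view: `a` is the high byte, `f` the low byte.
    fn af(&self) -> u16 {
        u16::from_be_bytes([self.a, self.f])
    }

    /// Splits a 16-bit value back into the two 8-bit registers.
    fn set_af(&mut self, af: u16) {
        let [a, f] = af.to_be_bytes();
        self.a = a;
        self.f = f;
    }
}

fn main() {
    let mut registers = Registers::default();
    registers.set_af(0x12F0);
    // Individual registers stay ordinary fields.
    registers.a = 0x34;
    println!("{:#06x}", registers.af()); // prints "0x34f0"
}
```

The cost is that registers.af is a method call rather than a field, which is the price of avoiding the union.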
Suppose we define a generic struct, with many fields, representing a type-safe state machine using a phantom type:
struct Foo<State> {
a: A,
b: B,
c: C,
//...
state: PhantomData<State>,
}
We can then write a type-safe state transition:
impl Foo<SourceState> {
fn transition(self, extra: X) -> Foo<DestinationState> {
let Foo {a, b, c, state: _} = self;
// do lots of stuff
Foo { a, b, c, state: PhantomData }
}
}
But we need to awkwardly unpack every field and re-pack it in a different structure.
We could also use mem::transmute, although my understanding is that different monomorphizations of the same struct are not guaranteed to have the same memory layout.
I hoped that Foo { state: PhantomData, ..self } would work; alas, it fails to compile.
Is there any canonical, ergonomic, safe way to write this ?
There is no straightforward way to do that, because they are 2 different types: that's the whole point of your code, actually. To simplify it, I'd do it in 2 steps, with a generic transition being the 2nd one:
use core::marker::PhantomData;
struct Foo<State> {
a: i32,
b: i32,
c: i32,
//...
state: PhantomData<State>,
}
struct SourceState;
struct DestinationState;
impl<Src> Foo<Src> {
fn transition<Dest>(self) -> Foo<Dest> {
let Foo {a, b, c, state: _} = self;
Foo { a, b, c, state: PhantomData }
}
}
impl Foo<SourceState> {
fn to_destination_state(mut self, extra: ()) -> Foo<DestinationState> {
// Do whatever you want with self
self.transition()
}
}
Alternatively, you can abstract the fact that you have a state:
mod stateful {
use core::marker::PhantomData;
pub struct Stateful<T, State> {
pub data: T,
state: PhantomData<State>,
}
impl<T, SrcState> Stateful<T, SrcState> {
pub fn transform<DestState>(self) -> Stateful<T, DestState> {
let Stateful { data, state: _ } = self;
Stateful {
data,
state: Default::default(),
}
}
}
}
struct Data {
a: i32,
b: i32,
c: i32,
}
struct SourceState;
struct DestinationState;
type Foo<State> = stateful::Stateful<Data, State>;
impl Foo<SourceState> {
fn to_destination_state(mut self, extra: ()) -> Foo<DestinationState> {
// Do whatever you want with self.data
self.transform()
}
}
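Since the state field is private to the stateful module, code outside it needs some way to create the initial value. A minimal usage sketch follows, assuming a hypothetical new constructor and a hypothetical finish method that exists only in the destination state; neither is part of the code above, and Data is reduced to a single field for brevity.

```rust
mod stateful {
    use core::marker::PhantomData;

    pub struct Stateful<T, State> {
        pub data: T,
        state: PhantomData<State>,
    }

    impl<T, State> Stateful<T, State> {
        /// Hypothetical constructor: `state` is private, so callers outside
        /// this module cannot use a struct literal.
        pub fn new(data: T) -> Self {
            Stateful { data, state: PhantomData }
        }

        pub fn transform<DestState>(self) -> Stateful<T, DestState> {
            Stateful { data: self.data, state: PhantomData }
        }
    }
}

struct Data {
    a: i32,
}

struct SourceState;
struct DestinationState;

type Foo<State> = stateful::Stateful<Data, State>;

impl Foo<SourceState> {
    fn to_destination_state(mut self, extra: i32) -> Foo<DestinationState> {
        self.data.a += extra;
        self.transform()
    }
}

impl Foo<DestinationState> {
    /// Hypothetical operation that is only available after the transition.
    fn finish(self) -> i32 {
        self.data.a
    }
}

fn main() {
    let start = Foo::<SourceState>::new(Data { a: 1 });
    // `start.finish()` would not compile: `finish` exists only on `Foo<DestinationState>`.
    let done = start.to_destination_state(41);
    println!("{}", done.finish()); // prints "42"
}
```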
Given an implementation of a heterogeneous list in Rust (for example, like the one from frunk), how can I obtain a reference to an element within the list without knowing the concrete type of the element? The length of the list is known statically, but cannot be hard-coded as a numeric literal.
I have tried to obtain the individual elements by popping each element off the list sequentially (as shown in the code example) and by writing an indexing method that accepts a usize index as an argument. Neither attempt even compiles. The posted example is what I would like to do.
pub trait HList: Sized {
const LEN: usize;
type Head;
type Tail: HList;
fn push<E>(self, element: E) -> Element<E, Self> {
Element { head: element, tail: self }
}
fn pop(self) -> Option<(Self::Head, Self::Tail)>;
fn len(&self) -> usize {
Self::LEN
}
}
#[derive(Debug, Clone, Default, PartialEq, Eq, Hash)]
pub struct Element<H, T> {
pub head: H,
pub tail: T,
}
impl<H, T: HList> HList for Element<H, T> {
const LEN: usize = 1 + <T as HList>::LEN;
type Head = H;
type Tail = T;
fn pop(self) -> Option<(H, T)> {
Some((self.head, self.tail))
}
}
#[derive(Debug, Clone, Copy, Default, PartialEq, Eq, Hash)]
pub struct End;
impl HList for End {
const LEN: usize = 0;
type Head = End;
type Tail = End;
fn pop(self) -> Option<(Self::Head, Self::Tail)> {
None
}
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn pop_two_for() {
let h = End
.push(0usize)
.push(String::from("Hello, World"));
fn eval<H: HList>(l: H) {
for _ in 0usize..H::LEN {
let (e, l) = l.pop().unwrap();
// Do work with `e`.
}
}
eval(h);
}
}
I have found a reasonable recursive solution, but it only works out if I can modify the HList trait based on the trait requirements of the work to be done. This is fine for my use case, even if it's not as generic as I would have liked it to be.
use std::fmt::Debug;
pub trait MyCustomTrait<'a>: Debug {}
impl<'a> MyCustomTrait<'a> for () {}
impl<'a> MyCustomTrait<'a> for usize {}
impl<'a> MyCustomTrait<'a> for String {}
pub trait HList: Sized {
const LEN: usize;
type Head: for<'a> MyCustomTrait<'a>;
type Tail: HList;
fn push<E>(self, element: E) -> Element<E, Self>
where
E: for<'a> MyCustomTrait<'a>,
{
Element::new(element, self)
}
fn head(&self) -> &Self::Head;
fn tail(&self) -> &Self::Tail;
fn len(&self) -> usize {
Self::LEN
}
}
#[derive(Debug, Clone, Default, PartialEq, Eq, Hash)]
pub struct Element<H, T>(H, T);
impl<H, T: HList> Element<H, T> {
pub fn new(head: H, tail: T) -> Self {
Element(head, tail)
}
}
impl<H, T> HList for Element<H, T>
where
H: for<'a> MyCustomTrait<'a>,
T: HList,
{
const LEN: usize = 1 + <T as HList>::LEN;
type Head = H;
type Tail = T;
fn head(&self) -> &Self::Head {
&self.0
}
fn tail(&self) -> &Self::Tail {
&self.1
}
}
#[derive(Debug, Clone, Copy, Default, PartialEq, Eq, Hash)]
pub struct End;
impl HList for End {
const LEN: usize = 0;
type Head = ();
type Tail = End;
fn head(&self) -> &Self::Head {
&()
}
fn tail(&self) -> &Self::Tail {
&End
}
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn arbitrary_recursive() {
let h = End
.push(0usize)
.push(String::from("Hello, World"));
fn eval<H: HList>(list: &H) {
if H::LEN > 0 {
let head = list.head();
// Do work here, like print the debug representation of `head`.
eprintln!("{:?}", head);
eval(list.tail());
}
}
eval(&h);
assert!(false);
}
}
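A different technique, sketched here as an alternative and not as the approach used above, leaves the list types free of element bounds and puts the per-element work in a separate trait that is implemented recursively: once for End and once for Element<H, T>. The trait name DebugAll and its debug_all method are hypothetical, and the list types are reduced to the minimum needed for the sketch.

```rust
use std::fmt::Debug;

pub struct End;

pub struct Element<H, T> {
    pub head: H,
    pub tail: T,
}

/// Hypothetical helper trait: one operation applied to every element.
pub trait DebugAll {
    fn debug_all(&self, out: &mut Vec<String>);
}

// Base case: the empty list has nothing to visit.
impl DebugAll for End {
    fn debug_all(&self, _out: &mut Vec<String>) {}
}

// Recursive case: handle the head, then recurse into the tail.
impl<H: Debug, T: DebugAll> DebugAll for Element<H, T> {
    fn debug_all(&self, out: &mut Vec<String>) {
        out.push(format!("{:?}", self.head));
        self.tail.debug_all(out);
    }
}

fn main() {
    let list = Element {
        head: String::from("Hello, World"),
        tail: Element { head: 0usize, tail: End },
    };
    let mut out = Vec::new();
    list.debug_all(&mut out);
    println!("{}", out.join(", "));
}
```

Each new kind of work needs its own trait with this pair of impls, but the bounds on the elements then live on that trait instead of on the list itself.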