Overloading AsRef<T> by using it to unwrap enums in Rust? - rust

I'm using AsRef<T> and AsMut<T> to expose a wrapped value in an enum. Can anyone tell me if this is an anti-pattern? I came across "Is it considered a bad practice to implement Deref for newtypes?", and it convinced me that Deref would be a bad idea, but I'm not sure about the approach below.
pub enum Node {
Stmt(Statement),
Expr(Expression),
}
impl AsMut<Expression> for Node {
fn as_mut(&mut self) -> &mut Expression {
match self {
Node::Stmt(_) => panic!("fatal: expected Expression"),
Node::Expr(e) => e,
}
}
}
impl AsMut<Expression> for Box<Node> {
fn as_mut(&mut self) -> &mut Expression {
(**self).as_mut()
}
}
impl AsMut<Expression> for Expression {
fn as_mut(&mut self) -> &mut Expression {
self
}
}
fn check_binop<T: AsMut<Expression>>(
&mut self,
sym: Symbol,
lhs: &mut T,
rhs: &mut T,
ty: &mut Option<Type>,
) -> Result<Type, String> {
let lhs = lhs.as_mut();
let rhs = rhs.as_mut();
...
}
I'm considering just making my own traits (AsExpr<T> and AsExprMut<T>) that are just re-implementations of AsRef<T> and AsMut<T>. Functionally no different, but I think it would be clearer.

I strongly suggest you do not do this, especially considering the docs for AsRef and AsMut say in bold:
Note: This trait must not fail.
and I would definitely consider panicking to be failure. This isn't an anti-pattern so much as it is breaking the contract you opt-into by implementing those traits. You should consider returning an Option<&mut Expression> instead of going through AsRef/AsMut, or making your own traits like you mentioned.
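As a rough illustration of the Option-returning alternative, here is a minimal sketch. The Statement and Expression definitions are placeholder stand-ins (the question does not show them), and as_expr, as_expr_mut and double_expr are hypothetical names chosen for this example only.

```rust
// Placeholder stand-ins for the question's types (assumed, not from the original).
#[derive(Debug, PartialEq)]
pub struct Statement(pub String);
#[derive(Debug, PartialEq)]
pub struct Expression(pub i64);

pub enum Node {
    Stmt(Statement),
    Expr(Expression),
}

impl Node {
    /// Fallible shared accessor: `None` when the node is not an expression.
    pub fn as_expr(&self) -> Option<&Expression> {
        match self {
            Node::Expr(e) => Some(e),
            Node::Stmt(_) => None,
        }
    }

    /// Fallible mutable accessor, replacing the panicking `AsMut` impl.
    pub fn as_expr_mut(&mut self) -> Option<&mut Expression> {
        match self {
            Node::Expr(e) => Some(e),
            Node::Stmt(_) => None,
        }
    }
}

/// Hypothetical caller: the mismatch becomes an ordinary error value
/// that can be propagated with `?` instead of a panic.
pub fn double_expr(node: &mut Node) -> Result<i64, String> {
    let expr = node
        .as_expr_mut()
        .ok_or_else(|| String::from("expected Expression"))?;
    expr.0 *= 2;
    Ok(expr.0)
}
```

With this shape, a function like check_binop could take the nodes directly and decide for itself how to report a statement where an expression was expected.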

Related

Optimizing reading a `Vec` of custom types

I am writing a protocol parsing library and want to optimize its speed while reading lists of data.
The library defines a read trait
trait Readable: Sized {
fn read(reader: &mut impl std::io::Read) -> std::io::Result<Self>;
}
and implements all the primitives like so:
impl Readable for u8 {
fn read(reader: &mut impl std::io::Read) -> std::io::Result<Self> {
Ok(reader.read_u8()?)
}
}
// same for i8, u16, i16 etc
The problem is with the Vec implementation, which is defined as
impl <T> Readable for Vec<T>
where T: Readable {
fn read(reader: &mut impl std::io::Read) -> std::io::Result<Self> {
let length = u16::read(reader)?;
let mut items = Vec::with_capacity(length as usize);
for _ in 0..length {
items.push(T::read(reader)?);
}
Ok(items)
}
}
It is common in the protocol to be reading large Vec<u8>, so the current implementation of reading read_u8 on every item is not ideal. Using read_exact has almost 5x the performance at 100 bytes in my benchmarks, and the difference only gets larger as the size increases. (the reader is in-memory, not doing any IO)
The question is, how can I optimize this trait when reading Vec<u8> while keeping the functionality of being able to read generic lists of Vec<T:Readable>?
My current solutions are either:
Use specialization on nightly, but I would rather stay on the stable channel.
Add a const IS_U8: bool to the Readable trait and call read_exact if T::IS_U8 is true in the Vec impl, using mem::transmute to return it as a Vec<T>. Adding a const for just this case is annoying and I'm not sure I can guarantee the safety of the implementation.
Is there a solution to this problem in stable rust?
There's a way to force a sort of specialization in stable Rust. Here's something that compiles, though it's very awkward. I hope someone else has a less-awkward method. But this trick is to define a separate trait for the unspecialized types. For example:
trait XReadable: Sized {
fn xread(reader: &mut impl std::io::Read) -> std::io::Result<Self>;
}
impl <T: XReadable> Readable for T {
fn read(reader: &mut impl std::io::Read) -> std::io::Result<Self> {
Self::xread(reader)
}
}
So that now you can either implement Readable or XReadable for a type, and then use it as Readable. Now that you have that separation, you can define your unspecialized Vec behavior for XReadable only:
impl <T> XReadable for Vec<T>
where T: XReadable {
fn xread(reader: &mut impl std::io::Read) -> std::io::Result<Self> {
// ...
}
}
And define the specialized behavior explicitly with Readable:
impl Readable for Vec<u8> {
// ...
}
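Putting the pieces together, the trick might look like the sketch below. It uses only the standard library, so the primitives are read with read_exact and from_be_bytes instead of read_u8; the big-endian length prefix is an assumption made for this example. The key point is that u8 implements Readable directly and never XReadable, which is what keeps the explicit Vec<u8> impl from overlapping with the blanket impl.

```rust
use std::io::{self, Read};

trait Readable: Sized {
    fn read(reader: &mut impl Read) -> io::Result<Self>;
}

// Separate trait for the "unspecialized" types.
trait XReadable: Sized {
    fn xread(reader: &mut impl Read) -> io::Result<Self>;
}

// Everything that is XReadable is automatically Readable.
impl<T: XReadable> Readable for T {
    fn read(reader: &mut impl Read) -> io::Result<Self> {
        Self::xread(reader)
    }
}

// u8 implements Readable directly and must NOT implement XReadable.
impl Readable for u8 {
    fn read(reader: &mut impl Read) -> io::Result<Self> {
        let mut buf = [0u8; 1];
        reader.read_exact(&mut buf)?;
        Ok(buf[0])
    }
}

// Other primitives go through XReadable (big-endian is assumed here).
impl XReadable for u16 {
    fn xread(reader: &mut impl Read) -> io::Result<Self> {
        let mut buf = [0u8; 2];
        reader.read_exact(&mut buf)?;
        Ok(u16::from_be_bytes(buf))
    }
}

// Generic, element-by-element path.
impl<T: XReadable> XReadable for Vec<T> {
    fn xread(reader: &mut impl Read) -> io::Result<Self> {
        let length = u16::xread(reader)?;
        let mut items = Vec::with_capacity(length as usize);
        for _ in 0..length {
            items.push(T::xread(reader)?);
        }
        Ok(items)
    }
}

// Specialized path: one bulk read for byte vectors.
impl Readable for Vec<u8> {
    fn read(reader: &mut impl Read) -> io::Result<Self> {
        let length = u16::read(reader)? as usize;
        let mut items = vec![0u8; length];
        reader.read_exact(&mut items)?;
        Ok(items)
    }
}
```

The cost of this design is that every element type other than u8 has to implement XReadable rather than Readable in order to be usable inside the generic Vec<T> impl.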

Patterns for proxying and implementation hiding in Rust?

So I'm an experienced developer, pretty new to Rust, expert in Java - but started out in assembly language, so I get memory and allocation, and have written enough compiler-y things to fathom the borrow-checker pretty well.
I decided to port a very useful, high-performance bitset-based graph library I wrote in Java, both to use it in a larger eventual project, and because it's darned useful. Since it's all integer positions in bitsets, and if you want an object graph you map indices into an array or whatever - I'm not tangled up in building a giant tree of objects that reference each other, which would be a mess to try to do in Rust. The problem I'm trying to solve is much simpler - so simple I feel like I must be missing an obvious pattern for how to do it:
Looking over the various bit-set libraries available for Rust, FixedBitSet seemed like a good fit. However, I would rather not expose it via my API and tie every consumer of my library irrevocably to FixedBitSet (it is nice, but, it can also be useful to swap in, say, an implementation backed by atomics; and being married to usize may not be ideal, but FixedBitSet is).
In Java, you'd just create an interface that wraps an instance of the concrete type, and exposes the functionality you want, hiding the implementation type. In Rust, you have traits, so it's easy enough to implement:
pub trait Bits<'i, S: Sized + Add<S>> {
fn size(&'i self) -> S;
fn contains(&'i self, s: S) -> bool;
...
impl<'i, 'a> Bits<'i, usize> for FixedBitSet {
fn size(&'i self) -> usize {
self.len()
}
fn contains(&'i self, s: usize) -> bool {
FixedBitSet::contains(self, s)
}
...
this gets a little ugly in that, if I don't want to expose FixedBitSet, everything has to return Box<dyn Bits<'a, usize> + 'a>, but so be it for now - though that creates its own problems with the compiler not knowing the size of dyn Bits....
So far so good. Where it gets interesting (did I say this was a weirdly simple problem?) is proxying an iterator. Rust iterators seem to be irrevocably tied to a concrete Trait type which has an associated type. So you can't really abstract it (well, sort of, via Box<dyn Iterator<Item=usize> + 'a> where ..., and it looks like it might be possible to create a trait that extends iterator and also has a type Item on it, and implement it for u32, u64, usize and ?? maybe the compiler coalesces the type Item members of the traits?). And as far as I can tell, you can't narrow the return type in a trait implementation method to something other than the trait specifies, either.
This gets further complicated by the fact that the Ones type in FixedBitSet for iterating set bits has its own lifetime - but Rust's Iterator does not - so any generic Iterator implementation is going to need to return an iterator scoped to that lifetime, not '_ or there will be issues with how long the iterator lives vis-a-vis the thing that created it.
The tidiest thing I could come up with - which is not tidy at all - after experimenting with various containers for an iterator (an implementation of Bits that adds an offset to the base value is also useful) that expose it, was something like:
pub trait Biterable<'a, 'b, S: Sized + Add<S>, I: Iterator<Item = S> + 'b> where 'a: 'b {
fn set_bits<'c>(&'a mut self) -> I where I: 'c, 'b: 'c;
}
which is implementable enough:
impl<'a, 'b> Biterable<'a, 'b, usize, Ones<'b>> for FixedBitSet where 'a: 'b {
fn set_bits<'c>(&'a mut self) -> Ones<'b> where Ones<'b>: 'c, 'b: 'c {
self.ones()
}
}
but then, we know we're going to be dealing with Boxes. So, we're going to need an implementation for that. Great! A signature like impl<'a, 'b> Biterable<'a, 'b, usize, Ones<'b>> for Box<FixedBitSet> where 'a: 'b { is implementable. BUUUUUUT, that's not what anything is going to return if we're not exposing FixedBitSet anywhere - we need it for Box<dyn Bits<...> + ...>. For that, we wind up in a hall of mirrors, scribbling out increasingly baroque and horrifying (and uncompilable) variants on
impl<'a, 'b, B> Biterable<'a, 'b, usize, &'b mut dyn Iterator<Item=usize>>
for Box<dyn B + 'a> where 'a: 'b, B : Bits<'a, usize> + Biterable<'a, 'b, usize, Ones<'b>> {
in a vain search for something that compiles and works (this fails because while Bits and Biterable are traits, evidently Biterable + Bits is not a trait). Seriously - a stateless, no-allocation-needed wrapper for one call on this thing returning one call on that thing, just not exposing that thing's type to the caller. That's it. The Java equivalent would be Supplier<T> a = ...; return () -> a.get();
I have to be thinking about this problem wrong. How?
It does certainly seem like you're over-complicating things. You have a lot of lifetime annotations that don't seem necessary. Here's a straightforward implementation (ignoring generic S):
use fixedbitset::FixedBitSet; // 0.2.0
pub trait Bits {
fn size(&self) -> usize;
fn contains(&self, s: usize) -> bool;
fn set_bits<'a>(&'a mut self) -> Box<dyn Iterator<Item = usize> + 'a>;
}
impl Bits for FixedBitSet {
fn size(&self) -> usize {
self.len()
}
fn contains(&self, s: usize) -> bool {
self.contains(s)
}
fn set_bits<'a>(&'a mut self) -> Box<dyn Iterator<Item = usize> + 'a> {
Box::new(self.ones())
}
}
pub fn get_bits_from_api() -> impl Bits {
FixedBitSet::with_capacity(64)
}
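To show that the same trait shape does not depend on FixedBitSet at all, here is a dependency-free sketch with a hypothetical Vec<bool>-backed type (VecBits and bits_from_api are names invented for this example). It uses &self for set_bits, since iterating set bits does not need mutable access.

```rust
pub trait Bits {
    fn size(&self) -> usize;
    fn contains(&self, s: usize) -> bool;
    // The boxed iterator borrows from `self`, hence the shared lifetime 'a.
    fn set_bits<'a>(&'a self) -> Box<dyn Iterator<Item = usize> + 'a>;
}

// Hypothetical backing store; callers never see this type.
struct VecBits {
    bits: Vec<bool>,
}

impl Bits for VecBits {
    fn size(&self) -> usize {
        self.bits.len()
    }

    fn contains(&self, s: usize) -> bool {
        // Out-of-range indices are simply reported as "not set".
        self.bits.get(s).copied().unwrap_or(false)
    }

    fn set_bits<'a>(&'a self) -> Box<dyn Iterator<Item = usize> + 'a> {
        // The concrete iterator type (a FilterMap over Enumerate) stays hidden.
        Box::new(
            self.bits
                .iter()
                .enumerate()
                .filter_map(|(i, &set)| if set { Some(i) } else { None }),
        )
    }
}

// The public API exposes only "something that implements Bits".
pub fn bits_from_api() -> impl Bits {
    let mut bits = vec![false; 8];
    bits[1] = true;
    bits[5] = true;
    VecBits { bits }
}
```

Swapping the backing store later (for example to an atomics-based one) would then only touch the private type and the body of bits_from_api.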
If you want the index type to be anonymous as well, make it an associated type and leave it unspecified when exposing your Bits:
use fixedbitset::FixedBitSet; // 0.2.0
pub trait Bits {
type Idx: std::ops::Add<Self::Idx>;
fn size(&self) -> Self::Idx;
fn contains(&self, s: Self::Idx) -> bool;
fn set_bits<'a>(&'a self) -> Box<dyn Iterator<Item = Self::Idx> + 'a>;
}
impl Bits for FixedBitSet {
type Idx = usize;
fn size(&self) -> Self::Idx {
self.len()
}
fn contains(&self, s: Self::Idx) -> bool {
self.contains(s)
}
fn set_bits<'a>(&'a self) -> Box<dyn Iterator<Item = Self::Idx> + 'a> {
Box::new(self.ones())
}
}
pub fn get_bits_from_api() -> impl Bits {
// ^^^^^^^^^ doesn't have <Idx = usize>
FixedBitSet::with_capacity(64)
}
fn main() {
let bits = get_bits_from_api();
// just a demonstration that it compiles
let size = bits.size();
if bits.contains(size) {
for indexes in bits.set_bits() {
// ...
}
}
}
I strongly advise against this, though, for several reasons. 1) You'd need many more constraints than just Add for this to be remotely usable. 2) You are severely limited with impl Bits; it's not fully defined, so you can't have dyn Bits or store it in a struct. 3) I don't see much benefit in being generic in this regard.

Mutually exclusive traits

I need to create operations for an operation sequence. The operations share the following behaviour. They can be evaluated, and at construction they can either be parametrized by a single i32 (eg. Sum) or not parametrized at all (eg. Id).
I create a trait Operation. The evaluate part is trivial.
trait Operation {
fn evaluate(&self, operand: i32) -> i32;
}
But I don't know how to describe the second demand. The first option is to simply let concrete implementations of Operation handle that behaviour.
pub struct Id {}
impl Id {
pub fn new() -> Id {
Id {}
}
}
impl Operation for Id {
fn evaluate(&self, operand: i32) -> i32 {
operand
}
}
pub struct Sum {
augend: i32,
}
impl Sum {
pub fn new(augend: i32) -> Sum {
Sum { augend }
}
}
impl Operation for Sum {
fn evaluate(&self, addend: i32) -> i32 {
self.augend + addend
}
}
Second option is a new function that takes an optional i32. Then the concrete implementations deal with the possibly redundant input. I find this worse than the first option.
trait Operation {
fn evaluate(&self, operand: i32) -> i32;
fn new(parameter: Option<i32>) -> Self;
}
Google has led me to mutually exclusive traits: https://github.com/rust-lang/rust/issues/51774. It seems promising, but it doesn't quite solve my problem.
Is there a way to achieve this behaviour?
trait Operation = Evaluate + (ParametrizedInit or UnparametrizedInit)
How about you use an associated type to define the initialization data?
trait Operation {
type InitData;
fn init(data: Self::InitData) -> Self;
fn evaluate(&self, operand: i32) -> i32;
}
impl Operation for Id {
type InitData = ();
fn init(_: Self::InitData) -> Self {
Id {}
}
fn evaluate(&self, operand: i32) -> i32 {
operand
}
}
impl Operation for Sum {
type InitData = i32;
fn init(augend: Self::InitData) -> Self {
Sum { augend }
}
fn evaluate(&self, addend: i32) -> i32 {
self.augend + addend
}
}
For the Id case you specify () to say that the initialization does not need data. It's still a bit meh to call Operation::init(()), but I think the trait at least captures the logic fairly well.
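A self-contained sketch of how generic code might use the associated type follows. The helper init_and_evaluate is a hypothetical name introduced only for this example; it shows that a caller can construct any Operation from its own InitData without knowing whether that data is () or an i32.

```rust
trait Operation {
    type InitData;
    fn init(data: Self::InitData) -> Self;
    fn evaluate(&self, operand: i32) -> i32;
}

struct Id;

struct Sum {
    augend: i32,
}

impl Operation for Id {
    // No initialization data is needed.
    type InitData = ();
    fn init(_: Self::InitData) -> Self {
        Id
    }
    fn evaluate(&self, operand: i32) -> i32 {
        operand
    }
}

impl Operation for Sum {
    // Parametrized by a single i32.
    type InitData = i32;
    fn init(augend: Self::InitData) -> Self {
        Sum { augend }
    }
    fn evaluate(&self, addend: i32) -> i32 {
        self.augend + addend
    }
}

// Hypothetical helper: builds the operation from whatever data it
// declares it needs, then applies it once.
fn init_and_evaluate<O: Operation>(data: O::InitData, operand: i32) -> i32 {
    O::init(data).evaluate(operand)
}
```

For example, init_and_evaluate::<Id>((), 7) and init_and_evaluate::<Sum>(3, 4) both type-check, while passing an i32 to Id is rejected at compile time.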
To actually get mutually exclusive traits (which is apparently what you want), you have to use a workaround. The Rust language does not support mutually exclusive traits per se, but you can use associated types and some marker types to get something similar. This is a bit strange, but works for now.
trait InitMarker {}
enum InitFromNothingMarker {}
enum InitFromI32Marker {}
impl InitMarker for InitFromNothingMarker {}
impl InitMarker for InitFromI32Marker {}
trait Operation {
type InitData: InitMarker;
fn init() -> Self
where
Self: Operation<InitData = InitFromNothingMarker>;
fn init_from(v: i32) -> Self
where
Self: Operation<InitData = InitFromI32Marker>;
}
trait UnparametrizedInit: Operation<InitData = InitFromNothingMarker> {}
trait ParametrizedInit: Operation<InitData = InitFromI32Marker> {}
impl<T: Operation<InitData = InitFromNothingMarker>> UnparametrizedInit for T {}
impl<T: Operation<InitData = InitFromI32Marker>> ParametrizedInit for T {}
(Ideally you want to have a Sealed trait that is defined in a private submodule of your crate. That way, no one (except for you) can implement the trait. And then make Sealed a super trait for InitMarker.)
This is quite a bit of code, but at least you can make sure that implementors of Operation implement exactly one of ParametrizedInit and UnparametrizedInit.
In the future, you will likely be able to replace the marker types with an enum and the associated type with an associated const. But currently, "const generics" are not finished enough, so we have to take the ugly route by using marker types. I'm actually discussing these solutions in my master's thesis (section 4.2, just search for "mutually exclusive").

How do you create a generic function in Rust with a trait requiring a lifetime?

I am trying to write a trait which works with a database and represents something which can be stored. To do this, the trait inherits from others, which includes the serde::Deserialize trait.
trait Storable<'de>: Serialize + Deserialize<'de> {
fn global_id() -> &'static [u8];
fn instance_id(&self) -> Vec<u8>;
}
struct Example {
a: u8,
b: u8
}
impl<'de> Storable<'de> for Example {
fn global_id() -> &'static [u8] { b"p" }
fn instance_id(&self) -> Vec<u8> { vec![self.a, self.b] }
}
Next, I am trying to write this data using a generic function:
pub fn put<'de, S: Storable>(&mut self, obj: &'de S) -> Result<(), String> {
...
let value = bincode::serialize(obj, bincode::Infinite);
...
db.put(key, value).map_err(|e| e.to_string())
}
However, I am getting the following error:
error[E0106]: missing lifetime specifier
--> src/database.rs:180:24
|
180 | pub fn put<'de, S: Storable>(&mut self, obj: &'de S) -> Result<(), String> {
| ^^^^^^^^ expected lifetime parameter
Minimal example on the playground.
How would I resolve this, possibly avoid it altogether?
You have defined Storable with a generic parameter, in this case a lifetime. That means that the generic parameter has to be propagated throughout the entire application:
fn put<'de, S: Storable<'de>>(obj: &'de S) -> Result<(), String> { /* ... */ }
You can also decide to make the generic specific. That can be done with a concrete type or lifetime (e.g. 'static), or by putting it behind a trait object.
Serde also has a comprehensive page about deserializer lifetimes. It mentions that you can choose to use DeserializeOwned as well.
trait Storable: Serialize + DeserializeOwned { /* ... */ }
You can use the same concept as DeserializeOwned for your own trait as well:
trait StorableOwned: for<'de> Storable<'de> { }
fn put<'de, S: StorableOwned>(obj: &'de S) -> Result<(), String> { /* ... */ }
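The "owned" pattern can be tried without serde. In the sketch below, Parse<'de> is a hypothetical stand-in for a lifetime-parameterised trait such as Deserialize<'de>, and ParseOwned plays the role of DeserializeOwned. The blanket impl is what makes the owned trait usable without implementing it by hand for every type.

```rust
// Hypothetical stand-in for a trait like serde's Deserialize<'de>.
trait Parse<'de>: Sized {
    fn parse(input: &'de str) -> Option<Self>;
}

// Owned data: implements Parse<'de> for every lifetime 'de.
impl<'de> Parse<'de> for u32 {
    fn parse(input: &'de str) -> Option<Self> {
        input.trim().parse().ok()
    }
}

// Borrowed data: implements Parse<'de> only for the lifetime it borrows.
struct Word<'a>(&'a str);

impl<'de> Parse<'de> for Word<'de> {
    fn parse(input: &'de str) -> Option<Self> {
        input.split_whitespace().next().map(Word)
    }
}

// Same idea as DeserializeOwned: "parseable for any lifetime".
trait ParseOwned: for<'de> Parse<'de> {}
impl<T> ParseOwned for T where T: for<'de> Parse<'de> {}

// No lifetime parameter is needed at the use site. The input is a local
// String, so this only compiles because T works for every 'de.
fn parse_temporary<T: ParseOwned>(text: String) -> Option<T> {
    T::parse(&text)
}
```

Here u32 satisfies ParseOwned, whereas Word<'a> does not, because it would have to outlive the local text it borrows from.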
You have the 'de lifetime in the wrong place -- you need it to specify the argument to Storable, not the lifetime of the reference obj.
Instead of
fn to_json<'de, S: Storable>(obj: &'de S) -> String {
use
fn to_json<'de, S: Storable<'de>>(obj: &S) -> String {
Playground.
The lifetime of obj doesn't actually matter here, because you're not returning any values derived from it. All you need to prove is that S implements Storable<'de> for some lifetime 'de.
If you want to eliminate the 'de altogether, you should use DeserializeOwned, as the other answer describes.

Generic function for modifying scalars and slices in place

I don't understand some basics in Rust. I want to compute a function sinc(x), with x being a scalar or a slice, which modifies the values in place. I can implement methods for both types, calling them with x.sinc(), but I find it more convenient (and easier to read in long formulas) to make a function, e.g. sinc(&mut x). So how do you do that properly?
pub trait ToSinc<T> {
fn sinc(self: &mut Self) -> &mut Self;
}
pub fn sinc<T: ToSinc<T>>(y: &mut T) -> &mut T {
y.sinc()
}
impl ToSinc<f64> for f64 {
fn sinc(self: &mut Self) -> &mut Self {
*self = // omitted
self
}
}
impl<'a> ToSinc<&'a mut [f64]> for &'a mut [f64] {
fn sinc(self: &mut Self) -> &mut Self {
for yi in (**self).iter_mut() { ... }
self
}
}
This seems to work, but isn't the "double indirection" in the last impl costly? I also thought about doing
pub trait ToSinc<T> {
fn sinc(self: Self) -> Self;
}
pub fn sinc<T: ToSinc<T>>(y: T) -> T {
y.sinc()
}
impl<'a> ToSinc<&'a mut f64> for &'a mut f64 {
fn sinc(self) -> Self {
*self = ...
self
}
}
impl<'a> ToSinc<&'a mut [f64]> for &'a mut [f64] {
fn sinc(self) -> Self {
for yi in (*self).iter_mut() { ... }
self
}
}
This also works, the difference is that if x is a &mut [f64] slice, I can call sinc(x) instead of sinc(&mut x). So I have the impression there is less indirection going on in the second one, and I think that's good. Am I on the wrong track here?
It is very likely that any cost from the double indirection will be inlined away in this case, but you're right that the second version is to be preferred.
You have ToSinc<T>, but don't use T. Drop the type parameter.
That said, ToSinc should almost certainly be by-value for f64s:
impl ToSinc for f64 {
fn sinc(self) -> Self {
...
}
}
You might also want ToSinc for &mut [T] where T: ToSinc.
You might well say, "ah - one of these is by value, and the other by mutable reference; isn't that inconsistent?"
The answer depends on what you actually intend the trait to be used as.
An interface for sinc-able types
If your interface represents those types that you can run sinc over, as traits of this kind are intended to be used, the goal would be to write functions
fn do_stuff<T: ToSinc>(value: T) { ... }
Now note that the interface is by-value. ToSinc takes self and returns Self: that is a value-to-value function. In fact, even when T is instantiated to some mutable reference, like &mut [f64], the function is unable to observe any mutation to the underlying memory.
In essence, these functions treat the underlying memory as an allocation source, and do value transformations on the data held in these allocations, much like a Box → Box operation is a by-value transformation of heap memory. Only the caller is able to observe mutations to the memory, but even then implementations which treat their input as a value type will return a pointer that prevents needing to access the data in this memory. The caller can just treat the source data as opaque in the same way that an allocator is.
Operations which depend on mutability, like writing to buffers, should probably not be using such an interface. Sometimes to support these cases it makes sense to build a mutating basis and a convenient by-value accessor. ToString is an interesting example of this, as it's just a wrapper over Display.
pub trait ToSinc: Sized {
fn sinc_in_place(&mut self);
fn sinc(mut self) -> Self {
self.sinc_in_place();
self
}
}
where impls mostly just implement sinc_in_place and users tend to prefer sinc.
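For instance, implementations on top of that trait might look like the sketch below. The formula is an assumption, since the question omits it: the unnormalized sinc, sin(x)/x, with the value 1 at x = 0. The Vec<T> impl is likewise just one possible container to illustrate the pattern.

```rust
pub trait ToSinc: Sized {
    // Mutating basis: the only method implementors must write.
    fn sinc_in_place(&mut self);

    // Convenient by-value accessor built on top of it.
    fn sinc(mut self) -> Self {
        self.sinc_in_place();
        self
    }
}

impl ToSinc for f64 {
    fn sinc_in_place(&mut self) {
        // Assumed definition: unnormalized sinc, with sinc(0) = 1.
        let x = *self;
        *self = if x == 0.0 { 1.0 } else { x.sin() / x };
    }
}

// Containers delegate element by element.
impl<T: ToSinc> ToSinc for Vec<T> {
    fn sinc_in_place(&mut self) {
        for item in self.iter_mut() {
            item.sinc_in_place();
        }
    }
}
```

Callers can then write either x.sinc_in_place() on a mutable value or let y = x.sinc() in an expression, and both go through the same implementation.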
As fakery for ad-hoc overloading
In this case, one doesn't care if the trait is actually usable generically, or even that it's consistent. sinc("foo") might do a song and dance, for all we care.
As such, although the trait is needed it should be defined as weakly as possible:
pub trait Sincable {
type Out;
fn sinc(self) -> Self::Out;
}
Then your function is far more generic:
pub fn sinc<T: Sincable>(val: T) -> T::Out {
val.sinc()
}
To implement a by-value function you do
impl Sincable for f64 {
type Out = f64;
fn sinc(self) -> f64 {
0.4324
}
}
and a by-mut-reference one is just
impl<'a, T> Sincable for &'a mut [T]
where T: Sincable<Out=T> + Copy
{
type Out = ();
fn sinc(self) {
for i in self {
*i = sinc(*i);
}
}
}
since () is the default empty type. This acts just like an ad-hoc overloading would.
Playpen example of emulated ad-hoc overloading.
