I want to write an LRU Cache with a memory size limitation rather than the "number of objects" limitation in std. After trying to figure it out for myself, I cheated and looked at an existing implementation, and I almost understand it, but this stops me:
struct LruKeyRef<K> {
k: *const K,
}
impl<K: Hash> Hash for LruKeyRef<K> {
fn hash<H: Hasher>(&self, state: &mut H) {
unsafe { (*self.k).hash(state) }
}
}
impl<K: PartialEq> PartialEq for LruKeyRef<K> {
fn eq(&self, other: &LruKeyRef<K>) -> bool {
unsafe { (*self.k).eq(&*other.k) }
}
}
It's that last unsafe line that I don't understand. I'm using a HashMap as the underlying structure, the key is stored with the value, and I want the hasher to be able to find it. I make the working hash key a reference to the real key and provide Hash and PartialEq functions such that the HashMap can find and use the key for bucketing purposes. That's easy.
I understand then that I have to compare the two for PartialEq, and so it makes sense to me that I have to use *self.k to dereference the current object, so why &*other.k for the other object? That's what I don't understand. Why isn't it just *other.k? Aren't I just dereferencing both so I can compare the actual keys?
We wish to call PartialEq::eq:
trait PartialEq<Rhs = Self>
where
Rhs: ?Sized,
{
fn eq(&self, other: &Rhs) -> bool;
}
Assuming the default implementation where Rhs = Self and Self = K, we need to end up with two &K types
other.k is of type *const K
*other.k is of type K
&*other.k is of type &K
This much should hopefully make sense.
self.k is of type *const K
*self.k is of type K
The missing piece is that method calls are allowed to automatically reference the value they are called on. This is why there's no distinct syntax for calling a method on a reference versus a value, as there would be in C or C++ (foo.bar() vs foo->bar()).
Thus, the K is automatically referenced to get &K, fulfilling the signature.
impl<K: PartialEq> PartialEq for LruKeyRef<K> {
fn eq(&self, other: &LruKeyRef<K>) -> bool {
unsafe { (*self.k).eq(&*other.k) }
}
}
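To make the asymmetry concrete, here is a minimal sketch (the helper name keys_equal is hypothetical, not part of the cache implementation) that compares two values through raw pointers in three ways that should all agree. The pointers are derived from live references, so dereferencing them inside the unsafe block is sound here.

```rust
/// Compares two keys through raw pointers in three equivalent ways.
fn keys_equal<K: PartialEq>(x: &K, y: &K) -> bool {
    // Raw pointers taken from live references, mirroring `LruKeyRef::k`.
    let a: *const K = x;
    let b: *const K = y;
    unsafe {
        // Method syntax: the receiver `*a` is auto-referenced to `&K`,
        // but the argument is not, so `&*b` has to be written out.
        let via_method = (*a).eq(&*b);
        // Path syntax: no auto-referencing at all, both sides are explicit.
        let via_path = PartialEq::eq(&*a, &*b);
        // The `==` operator borrows both operands implicitly.
        let via_operator = *a == *b;
        via_method && via_path && via_operator
    }
}
```

Written with the path syntax, the call is symmetrical again; the method-call form just hides the first &.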
Under typical circumstances, we can call methods taking &self with just a reference to the object. In addition, a chain of references to the object is also implicitly coerced. That is, we can write:
let a: &str = "I'm a static string";
assert_eq!(a.len(), 19);
assert_eq!((&&&&a).len(), 19);
In your case however, we start with a pointer, which must be explicitly dereferenced inside an unsafe scope. Here are the types of all relevant expressions:
self.k : *const K
(*self.k) : K
other.k : *const K
&*other.k : &K
Since eq takes a reference as its right-hand argument, we must pass one explicitly. Unlike in C++, you cannot pass an lvalue as a reference without making the reference-passing explicit, nor can you pass an rvalue to a const reference. You can, however, prepend & to a literal in order to obtain a reference to it (foo(&5)). The call only appears asymmetrical because (in a way) self.k is the caller and other.k is the callee.
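As a small illustration of the difference between argument position and receiver position (double and demo are made-up names for this sketch):

```rust
/// Takes its argument by reference, like the right-hand side of `eq`.
fn double(x: &i32) -> i32 {
    *x * 2
}

fn demo() -> (i32, i32, usize) {
    let n = 7;
    // A named value needs an explicit `&` in argument position.
    let from_variable = double(&n);
    // A literal can be borrowed the same way.
    let from_literal = double(&5);
    // In receiver position the compiler adds or strips references itself.
    let s: &str = "abc";
    let len = (&&&s).len();
    (from_variable, from_literal, len)
}
```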
Consider the following Rust function, which is meant to indicate whether the given 3-byte string is equal to b"foo".
fn is_foo(value: [u8; 3]) -> bool {
value == b"foo"
}
This doesn't work:
error[E0277]: can't compare [u8; 3] with &[u8; 3]
The compiler is complaining that it can't compare a value of some type to a reference of the same type.
I found two ways of getting the equality check to work:
Turning the value into a ref first: &value == b"foo"
Turning the ref into a value first: value == *b"foo"
Coming from C++ (where a value and a reference are pretty much the same thing), both approaches look a bit strange to me. What is the most idiomatic way of comparing a value and a reference?
In Rust, T and &T (and also &mut T) are different types. Different types can be compared with each other, but that is not the default. Equality comparison (the == operator) is provided by the PartialEq trait, which looks like this:
pub trait PartialEq<Rhs = Self> where
Rhs: ?Sized,
{
fn eq(&self, other: &Rhs) -> bool;
fn ne(&self, other: &Rhs) -> bool { ... }
}
And although it is generic over Rhs (meaning that one type can potentially be compared with many other types), it defaults to Self.
So, coming back to your example, you could just write value.eq(b"foo") (which compares the values behind the references). That is not wrong, but &value == b"foo" is probably the more common form. Dereferencing is fine too, but I seldom see it.
You also don't have to worry that equality between types and their references will differ, because the standard library has the following blanket implementation (and others for mutable references):
impl<A, B> PartialEq<&B> for &A where
A: PartialEq<B> + ?Sized,
B: ?Sized,
{ ... }
That automatically implements equality for references in a proper way.
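For reference, both working variants from the question can be written out as follows (the function names are chosen here just to tell them apart):

```rust
/// Borrow the value so both sides are `&[u8; 3]`.
fn is_foo_by_ref(value: [u8; 3]) -> bool {
    &value == b"foo"
}

/// Dereference the literal so both sides are `[u8; 3]`.
fn is_foo_by_value(value: [u8; 3]) -> bool {
    value == *b"foo"
}
```

Both forms end up comparing the same three bytes; they differ only in which side is adjusted to make the types line up.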
I have a simple struct that I would like to implement Index for, but as a newcomer to Rust I'm having a number of troubles with the borrow checker. My struct is pretty simple, I'd like to have it store a start and step value, then when indexed by a usize it should return start + idx * step:
pub struct MyStruct {
pub start: f64,
pub step: f64,
}
My intuition is that I'd simply be able to take the signature of Index and plug in my types:
impl Index<usize> for MyStruct {
type Output = f64;
fn index(&self, idx: usize) -> &f64 {
self.start + (idx as f64) * self.step
}
}
This gives the error mismatched types saying expected type &f64, found type f64. As someone who has yet to fully understand how Rust's type system works, I tried simply slapping & on the expression:
fn index(&self, idx: usize) -> &f64 {
&(self.start + (idx as f64) * self.step)
}
This now tells me that the borrowed value does not live long enough, so maybe it needs a lifetime variable?
fn index<'a>(&self, idx: usize) -> &'a f64 {
&(self.start + (idx as f64) * self.step)
}
The error is the same, but the note now gives lifetime 'a instead of lifetime #1, so I guess that's not necessary, but at this point I feel like I'm stuck. I'm confused that such a simple exercise for most languages has become so difficult to implement in Rust, since all I want to do is return a computation from a function that happens to be behind a reference. How should I go about implementing Index for a simple structure where the value is calculated on demand?
The Index trait is meant to return a borrowed pointer to a member of self (e.g. an item in a Vec). The signature of the index method from the Index trait makes it impractical to implement it to have the behavior you described, as you'd have to store every value returned by index in self and ensure that the pointers remain valid until the MyStruct is dropped.
This use case does not match the intuition for Index. When I see myStruct[3], my intuition is that, just as for arrays, I'm getting a pointer to some already-initialized data. The interface for Index corroborates this intuition.
I can see two things that you might potentially be trying to achieve:
Getting nice indexing syntax for your datastructure.
In this case, I would recommend against implementing Index and would just provide a method that returns an f64 instead of a &f64, like so:
impl MyStruct {
pub fn index(&self, idx: usize) -> f64 {
self.start + (idx as f64) * self.step
}
}
You don't get the operators, which is good because somebody reading [] would be misled into thinking they were getting a pointer. But you do get the functionality you want. Depending on your use cases you may want to rename this method.
Passing MyStruct to a parameter with Index bounds.
This is trickier, with good reason. Index expects the data to be there before it asks for it. You can't generate and return it because index returns &f64, and you can't generate it in the data structure and return a pointer because index doesn't take &mut self. You'd have to populate these values before the call to index. Some redesign would be in order, and the direction of that redesign would depend on the larger context of your problem.
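A rough sketch of both options, under the assumption that a fixed number of values is enough for the second case. The names value_at, precompute, and Precomputed are invented for this example; only MyStruct comes from the question.

```rust
use std::ops::Index;

pub struct MyStruct {
    pub start: f64,
    pub step: f64,
}

impl MyStruct {
    /// Option 1: a plain method that returns the computed value by value.
    pub fn value_at(&self, idx: usize) -> f64 {
        self.start + (idx as f64) * self.step
    }

    /// Option 2: populate the values up front so `Index` has something to borrow.
    pub fn precompute(&self, len: usize) -> Precomputed {
        Precomputed {
            values: (0..len).map(|i| self.value_at(i)).collect(),
        }
    }
}

/// Hypothetical helper type that owns the generated values.
pub struct Precomputed {
    values: Vec<f64>,
}

impl Index<usize> for Precomputed {
    type Output = f64;

    // The returned reference points at data owned by `self`, as `Index` expects.
    fn index(&self, idx: usize) -> &f64 {
        &self.values[idx]
    }
}
```

With this, s.value_at(4) computes on demand, while s.precompute(5)[4] reads an already-stored value and can be passed wherever an Index<usize> bound is required.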
The struct std::vec::Vec implements two kinds of Extend, as specified here – impl<'a, T> Extend<&'a T> for Vec<T> and impl<T> Extend<T> for Vec<T>. The documentation states that the first kind is an "Extend implementation that copies elements out of references before pushing them onto the Vec". I'm rather new to Rust, and I'm not sure if I'm understanding it correctly.
I would guess that the first kind is used with the equivalent of C++ normal iterators, and the second kind is used with the equivalent of C++ move iterators.
I'm trying to write a function that accepts any data structure that will allow inserting i32s to the back, so I take a parameter that implements both kinds of Extend, but I can't figure out how to specify the generic parameters to get it to work:
fn main() {
let mut vec = std::vec::Vec::<i32>::new();
add_stuff(&mut vec);
}
fn add_stuff<'a, Rec: std::iter::Extend<i32> + std::iter::Extend<&'a i32>>(receiver: &mut Rec) {
let x = 1 + 4;
receiver.extend(&[x]);
}
The compiler complains that &[x] "creates a temporary which is freed while still in use" which makes sense because 'a comes from outside the function add_stuff. But of course what I want is for receiver.extend(&[x]) to copy the element out of the temporary array slice and add it to the end of the container, so the temporary array will no longer be used after receiver.extend returns. What is the proper way to express what I want?
From outside add_stuff, Rec must be able to be extended with a reference whose lifetime is only known inside add_stuff. Thus, you could require that Rec be extendable with references of any lifetime, using higher-ranked trait bounds:
fn main() {
let mut vec = std::vec::Vec::<i32>::new();
add_stuff(&mut vec);
}
fn add_stuff<Rec>(receiver: &mut Rec)
where
for<'a> Rec: std::iter::Extend<&'a i32>
{
let x = 1 + 4;
receiver.extend(&[x]);
}
Moreover, as you see, the trait bounds were overly tight. One of them should be enough if you use receiver consistently within add_stuff.
That said, I would simply require Extend<i32> and make sure that add_stuff does the right thing internally (if possible):
fn add_stuff<Rec>(receiver: &mut Rec)
where
Rec: std::iter::Extend<i32>
{
let x = 1 + 4;
receiver.extend(std::iter::once(x));
}
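Putting the two variants side by side (renamed add_by_ref and add_by_value here so they can coexist), a sketch that should work with any standard collection implementing the respective Extend:

```rust
/// Higher-ranked bound: `Rec` accepts `&i32` items of any lifetime,
/// including a borrow of a temporary created inside this function.
fn add_by_ref<Rec>(receiver: &mut Rec)
where
    for<'a> Rec: Extend<&'a i32>,
{
    let x = 1 + 4;
    receiver.extend(&[x]);
}

/// Simpler bound: hand over owned values, so no lifetime is involved.
fn add_by_value<Rec>(receiver: &mut Rec)
where
    Rec: Extend<i32>,
{
    let x = 1 + 4;
    receiver.extend(std::iter::once(x));
}
```

Vec<i32> and VecDeque<i32> satisfy both bounds, so either function can be called with &mut vec.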
In Rust 1.14, the Index trait is defined as follows:
pub trait Index<Idx> where Idx: ?Sized {
type Output: ?Sized;
fn index(&self, index: Idx) -> &Self::Output;
}
The implicit Sized bound of the Output type is relaxed with ?Sized here. Which makes sense, because the index() method returns a reference to Output. Thus, unsized types can be used, which is useful; example:
impl<T> Index<Range<usize>> for Vec<T> {
type Output = [T]; // unsized!
fn index(&self, index: Range<usize>) -> &[T] { … } // no problem: &[T] is sized!
}
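The same pattern can be tried on a user-defined type. The following is only an illustrative sketch with a made-up Buffer wrapper, not code from the standard library:

```rust
use std::ops::{Index, Range};

/// Hypothetical wrapper around a byte vector.
struct Buffer {
    bytes: Vec<u8>,
}

impl Index<Range<usize>> for Buffer {
    // `[u8]` is unsized, which is allowed because `Output: ?Sized`.
    type Output = [u8];

    // The method only ever hands out `&[u8]`, which is a sized (fat) pointer.
    fn index(&self, range: Range<usize>) -> &[u8] {
        &self.bytes[range]
    }
}
```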
The Idx type parameter's implicit bound is also relaxed and can be unsized. But Idx is used by value as method argument and using unsized types as arguments is not possible AFAIK. Why is Idx allowed to be unsized?
I'm pretty sure this is just an accident of history. That looser bound was introduced in 2014. At that time, the trait looked a bit different:
// Syntax predates Rust 1.0!
pub trait Index<Sized? Index, Sized? Result> for Sized? {
/// The method for the indexing (`Foo[Bar]`) operation
fn index<'a>(&'a self, index: &Index) -> &'a Result;
}
Note that at this point in time, the Index type was passed by reference. Later on the renamed Idx type changed to pass by value:
fn index<'a>(&'a self, index: Idx) -> &'a Self::Output;
However, note that both forms coexisted in the different compiler bootstrap stages. That's probably why the optional Sized bound couldn't be immediately removed. It's my guess that it basically was forgotten due to more important changes, and now we are where we are.
It's an interesting thought experiment to decide if restricting the bound (by removing ?Sized) would break anything... maybe someone should submit a PR... ^_^
Thought experiment over! Lukas submitted a PR! There's been discussion that it might break downstream code that creates subtraits of Index like:
use std::ops::Index;
trait SubIndex<I: ?Sized>: Index<I> { }
There's also talk that someday, we might want to pass dynamically-sized types (DSTs) by value, although I don't understand how.
I have the following code
extern crate rand;
use rand::Rng;
pub struct Randomizer {
rand: Box<Rng>,
}
impl Randomizer {
fn new() -> Self {
let mut r = Box::new(rand::thread_rng()); // works
let mut cr = Randomizer { rand: r };
cr
}
fn with_rng(rng: &Rng) -> Self {
let mut r = Box::new(*rng); // doesn't work
let mut cr = Randomizer { rand: r };
cr
}
}
fn main() {}
It complains that
error[E0277]: the trait bound `rand::Rng: std::marker::Sized` is not satisfied
--> src/main.rs:16:21
|
16 | let mut r = Box::new(*rng);
| ^^^^^^^^ `rand::Rng` does not have a constant size known at compile-time
|
= help: the trait `std::marker::Sized` is not implemented for `rand::Rng`
= note: required by `<std::boxed::Box<T>>::new`
I don't understand why it requires Sized on Rng when Box<T> doesn't impose this on T.
More about the Sized trait and bound: it's a rather special trait, which is implicitly added to every generic type parameter, which is why you don't see it listed in the prototype for Box::new:
fn new(x: T) -> Box<T>
Notice that it takes x by value (or move), so you need to know how big it is to even call the function.
In contrast, the Box type itself does not require Sized; it uses the (again special) trait bound ?Sized, which means "opt out of the default Sized bound":
pub struct Box<T> where T: ?Sized(_);
If you look through, there is one way to create a Box with an unsized type:
impl<T> Box<T> where T: ?Sized
....
unsafe fn from_raw(raw: *mut T) -> Box<T>
so from unsafe code, you can create one from a raw pointer. From then on, all the normal things work.
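As a sketch of what that looks like in practice (Shape and Square are placeholder names): in everyday code a boxed unsized value is usually obtained by boxing a sized value first and letting it coerce, and the raw-pointer route round-trips through Box::into_raw and Box::from_raw.

```rust
trait Shape {
    fn area(&self) -> f64;
}

struct Square(f64);

impl Shape for Square {
    fn area(&self) -> f64 {
        self.0 * self.0
    }
}

/// `Box::new` sees the sized `Square`; the result then coerces to `Box<dyn Shape>`.
fn boxed_shape() -> Box<dyn Shape> {
    Box::new(Square(2.0))
}

/// Takes a box apart into a raw pointer and rebuilds it with `from_raw`.
fn round_trip(shape: Box<dyn Shape>) -> Box<dyn Shape> {
    let raw: *mut dyn Shape = Box::into_raw(shape);
    // SAFETY: `raw` came from `Box::into_raw` just above and is reclaimed exactly once.
    unsafe { Box::from_raw(raw) }
}
```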
The problem is actually quite simple: you have a trait object, and the only two things you know about this trait object are:
its list of available methods
the pointer to its data
When you request to move this object to a different memory location (here on the heap), you are missing one crucial piece of information: its size.
How are you going to know how much memory should be reserved? How many bits to move?
When an object is Sized, this information is known at compile-time, so the compiler "injects" it for you. In the case of a trait object, however, this information is not available at compile-time (unfortunately), and therefore this is not possible.
It would be quite useful to make this information available and to have a polymorphic move/clone available, but this does not exist yet and I do not remember any proposal for it so far and I have no idea what the cost would be (in terms of maintenance, runtime penalty, ...).
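I also want to post the answer that one way to deal with this situation is to make the constructor generic. Note that the generator has to be taken by value, because a non-Copy value cannot be moved out from behind a shared reference, and that it needs a 'static bound to be stored in the Box<Rng> field:
fn with_rng<TRand: Rng + 'static>(rng: TRand) -> Self {
let r = Box::new(rng);
Randomizer { rand: r }
}
Rust's monomorphization will create the necessary implementation of with_rng, replacing TRand with a concrete sized type. Type parameters are Sized by default, so spelling out that bound is optional.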
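Here is a self-contained sketch of the generic-constructor idea. To keep it runnable without external crates, it uses a hypothetical NumberSource trait and Counter type in place of rand's Rng, and it takes the source by value, since a non-Copy value cannot be moved out from behind a shared reference.

```rust
/// Stand-in for `rand::Rng` (an assumption made so the example needs no crates).
trait NumberSource {
    fn next_number(&mut self) -> u32;
}

/// A trivial, deterministic source used only for demonstration.
struct Counter(u32);

impl NumberSource for Counter {
    fn next_number(&mut self) -> u32 {
        self.0 += 1;
        self.0
    }
}

struct Randomizer {
    rand: Box<dyn NumberSource>,
}

impl Randomizer {
    /// `S` is a concrete, sized type at every call site, so `Box::new` accepts it;
    /// the box is then coerced to the trait object stored in the field.
    fn with_source<S: NumberSource + 'static>(source: S) -> Self {
        Randomizer {
            rand: Box::new(source),
        }
    }

    fn next_number(&mut self) -> u32 {
        self.rand.next_number()
    }
}
```

Calling Randomizer::with_source(Counter(0)) instantiates the constructor for Counter; the size is known there, and only afterwards is the type erased.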