Is it possible to have a variable local to a trait implementation? - rust

I have a indexable type that I want to iterate over. It consists of some metadata and an array. I need to first iterate over the bytes of the metadata and then to that of the array. From what I understand, the iterator cannot have any storage local to the trait implementation. I think this is very disorganized, and I don't want my data types to be muddled by the need to satisfy extraneous influence.
impl Iterator for IndexableData {
type Item = u8
let index : isize = 0;
fn next(& mut self) -> Option<Item> {
if self.index > self.len() { None }
if self.index > size_of::<Metadata> {
Some (self.data[index - size_of::<Metadata>])
}
Some (self.metadata[index])
}
}
This is what I think the implementation should look like. The index variable belongs in the iterator trait. Not my IndexableData type. How can I achieve this?

The Iterator should be a separate struct that has a reference to the collection plus any other data it may need (such as this index). The collection object itself should not be an iterator. That would not only require misplaced additional metadata in the collection, it would prevent you from having multiple independent iterators over the collection.

Related

Storing mutable references owned by someone else

I want to store structs in HashMap but also reference to same structs inside another struct vector field.
Let's say I have a tree build of two types of structs Item and Relation.
I am storing all the relations in HashMap by their id,
but I also want to fill each item.out_relations vector with mutable references to same Relation structs which are owned by HashMap.
Here is my Item:
pub struct Item<'a> {
pub id: oid::ObjectId,
pub out_relations: Vec<&'a mut Relation>, // <-- want to fill this
}
And when I am iterating over my relations got from DB
I am trying to do smth like this:
let from_id = relation.from_id; //to find related item
item_map // my items loaded here already
.entry(from_id)
.and_modify(|item| {
(*item)
.out_relations.push(
relation_map.try_insert(relation.id, relation).unwrap() // mut ref to relation expected here
)
}
);
For now compiler warns try_insert is unstable feature and points me to this bug
But let's imagine I have this mutable ref to relation owned already by HashMap- is it going to work?
Or this will be again some ref lives not long enough error? What are my options then? Or I better will store just relation id in item out_relations vector rather then refs? And when needed I will take my relation from the hashmap?
This is called shared mutability, and it is forbidden by the borrow checker.
Fortunately Rust offers safe tools to achieve this.
In your case you need to use Rc<RefCell>, so your code would be:
pub struct Item {
pub id: oid::ObjectId,
pub out_relations: Vec<Rc<RefCell<Relation>>>,
}
And when I am iterating over my relations got from DB I am trying to do smth like this:
let relation = Rc::new(RefCell::new(relation));
let from_id = relation.borrow().from_id; // assuming id is a Copy type
item_map // my items loaded here already
.entry(from_id)
.and_modify(|item| {
(*item)
.out_relations.push(
relation_map.try_insert(relation.id, relation.clone()).unwrap() // mut ref to relation expected here
)
}
);
If you want to mutate relation later, you can use .borrow_mut()
relation.borrow_mut().from_id = new_id;
I agree with Oussama Gammoudi's diagnosis, but I'd like to offer alternative solutions.
The issue with reference counting is that it kills performance.
In your vector, can you store the key to the hash map instead? Then, when you need to get the value from the vector, get the key from the vector and fetch the value from the hash map?
Another strategy is to keep a vector of the values, and then store indices into the vector in your hash map and other vector. This Rust Keynote speaker describes the strategy well: https://youtu.be/aKLntZcp27M?t=1787

how can I create a throwaway mutable reference?

I'm trying to wrap a vector to change its index behaviour, so that when an out of bounds access happens, instead of panicking it returns a reference to a dummy value, like so:
use std::ops::Index;
struct VecWrapper(Vec<()>);
impl Index<usize> for VecWrapper {
type Output = ();
fn index(&self, idx: usize) -> &() {
if idx < self.0.len() {
&self.0[idx]
} else {
&()
}
}
}
which works just fine for the implementation of Index, but trying to implement IndexMut the same way fails for obvious reasons. the type in my collection does not have a Drop implementation so no destructor needs to be called (other than freeing the memory).
the only solution I can think of is to have a static mutable array containing thousands of dummies, and distributing references to elements of this array, this is a horrible solution though it still leads to UB if there are ever more dummies borrowed than the size of the static array.
Give the wrapper an additional field that's the dummy. It's got the same lifetime restrictions, and so can't be aliased.

Rust Data Structure

I am currently learning Rust for fun. I have some experience in C / C++ and other experience in other programming languages that use more complex paradigms like generics.
Background
For my first project (after the tutorial), I wanted to create a N-Dimensional array (or Matrix) data structure to practice development in Rust.
Here is what I have so far for my Matrix struct and a basic fill and new initializations.
Forgive the absent bound checking and parameter testing
pub struct Matrix<'a, T> {
data: Vec<Option<T>>,
dimensions: &'a [usize],
}
impl<'a, T: Clone> Matrix<'a, T> {
pub fn fill(dimensions: &'a [usize], fill: T) -> Matrix<'a, T> {
let mut total = if dimensions.len() > 0 { 1 } else { 0 };
for dim in dimensions.iter() {
total *= dim;
}
Matrix {
data: vec![Some(fill); total],
dimensions: dimensions,
}
}
pub fn new(dimensions: &'a [usize]) -> Matrix<'a, T> {
...
Matrix {
data: vec![None; total],
dimensions: dimensions,
}
}
}
I wanted the ability to create an "empty" N-Dimensional array using the New fn. I thought using the Option enum would be the best way to accomplish this, as I can fill the N-Dimensional with None and it would allocate space for this T generic automatically.
So then it comes down to being able to set the entries for this. I found the IndexMut and Index traits that looked like I could do something like m[&[2, 3]] = 23. Since the logic is similar to each other here is the IndexMut impl for Matrix.
impl<'a, T> ops::IndexMut<&[usize]> for Matrix<'a, T> {
fn index_mut(&mut self, indices: &[usize]) -> &mut Self::Output {
match self.data[get_matrix_index(self.dimensions, indices)].as_mut() {
Some(x) => x,
None => {
NOT SURE WHAT TO DO HERE.
}
}
}
}
Ideally what would happen is that the value (if there) would be changed i.e.
let mut mat = Matrix::fill(&[4, 4], 0)
mat[&[2, 3]] = 23
This would set the value from 0 to 23 (which the above fn does via returning &mut x from Some(x)). But I also want None to set the value i.e.
let mut mat = Matrix::new(&[4, 4])
mat[&[2, 3]] = 23
Question
Finally, is there a way to make m[&[2,3]] = 23 possible with what the Vec struct requires to allocate the memory? If not what should I change and how can I still have an array with "empty" spots. Open to any suggestions as I am trying to learn. :)
Final Thoughts
Through my research, the Vec struct impls I see that the type T is typed and has to be Sized. This could be useful as to allocate the Vec with the appropriate size via vec![pointer of T that is null but of size of T; total]. But I am unsure of how to do this.
So there are a few ways to make this more similar to idiomatic rust, but first, let's look at why the none branch doesn't make sense.
So the Output type for IndexMut I'm going to assume is &mut T as you don't show the index definition but I feel safe in that assumption. The type &mut T means a mutable reference to an initialized T, unlike pointers in C/C++ where they can point to initialized or uninitialized memory. What this means is that you have to return an initialized T which the none branch cannot because there is no initialized value. This leads to the first of the more idiomatic ways.
Return an Option<T>
The easiest way would be to change Index::Output to be an Option<T>. This is better because the user can decide what to do if there was no value there before and is close to what you are actually storing. Then you can also remove the panic in your index method and allow the caller to choose what to do if there is no value. At this point, I think you can go a little further with gentrifying the structure in the next option.
Store a T directly
This method allows the caller to directly change what the type is that's stored rather than wrapping it in an option. This cleans up most of your index code nicely as you just have to access what's already stored. The main problem is now initialization, how do you represent uninitialized values? You were correct that option is the best way to do this1, but now the caller can decide to have this optional initialization capability by storing an Option themselves. So that means we can always store initialized Ts without losing functionality. This only really changes your new function to instead not fill with None values. My suggestion here is to make a bound T: Default for the new function2:
impl<'a, T: Default> Matrix<'a, T> {
pub fn new(dimensions: &'a [usize]) -> Matrix<'a, T> {
Matrix {
data: (0..total).into_iter().map(|_|Default::default()).collect(),
dimensions: dimensions,
}
}
}
This method is much more common in the rust world and allows the caller to choose whether to allow for uninitialized values. Option<T> also implements default for all T and returns None So the functionality is very similar to what you have currently.
Aditional Info
As you're new to rust there are a few comments that I can make about traps that I've fallen into before. To start your struct contains a reference to the dimensions with a lifetime. What this means is that your structs cannot exist longer than the dimension object that created them. This hasn't caused you a problem so far as all you've been passing is statically created dimensions, dimensions that are typed into the code and stored in static memory. This gives your object a lifetime of 'static, but this won't occur if you use dynamic dimensions.
How else can you store these dimensions so that your object always has a 'static lifetime (same as no lifetime)? Since you want an N-dimensional array stack allocation is out of the question since stack arrays must be deterministic at compile time (otherwise known as const in rust). This means you have to use the heap. This leaves two real options Box<[usize]> or Vec<usize>. Box is just another way of saying this is on the heap and adds Sized to values that are ?Sized. Vec is a little more self-explanatory and adds the ability to be resized at the cost of a little overhead. Either would allow your matrix object to always have a 'static lifetime.
1. The other way to represent this without Option<T>'s discriminate is MaybeUninit<T> which is unsafe territory. This allows you to have a chunk of initialized memory big enough to hold a T and then assume it's initialized unsafely. This can cause a lot of problems and is usually not worth it as Option is already heavily optimized in that if it stores a type with a pointer it uses compiler magic to store the discriminate in whether or not that value is a null pointer.
2. The reason this section doesn't just use vec![Default::default(); total] is that this requires T: Clone as the way this macro works the first part is called once and cloned until there are enough values. This is an extra requirement that we don't need to have so the interface is smoother without it.

What is the Item in a nested struct that implements the Iterator trait?

I am watching a video on Rust iterators and the video host implements a slightly more complex struct Flatten. In essence, this Flatten struct holds items, each of which implements the IntoIterator trait so that each item inside Flatten (Note that the following code is not the final code from the video and is only used to show my confusion):
pub struct Flatten<O>
{
outer: O,
}
impl<O> Flatten<O> {
fn new(iter: O) -> Self {
Flatten { outer: iter }
}
}
Then there is the part where Flatten implements the Iterator trait:
impl<O> Iterator for Flatten<O>
where
O: Iterator,
O::Item: IntoIterator,
{
type Item = <O::Item as IntoIterator>::Item;
fn next(&mut self) -> Option<Self::Item> {
self.outer.next().and_then(|inner| inner.into_iter().next())
}
}
What I am very confused about is that when we set the constraint O::Item: IntoIterator, we obviously meant that the generic type O has items that implement IntoIterator.
Edit 1: To clarify, my confusion was that we first said that O::Item: IntoIterator and then specified again type Item = <O::Item as IntoIterator>::Item. Are the Items appearing in these two clauses referring to different concepts?
Edit 2: I think I understand it now. An Iterator has Item, so we could write O::Item: IntoIterator. And when we define type Item = <O::Item as IntoIterator>::Item, we are specifying the return Item type for Flatten, not for O. I was confused by the syntax and had some illogical understanding.
Rust knows what item to chose because we specifically tell him what Item to chose.
We know that both Iterator and IntoIterator define an Item.
So Flatten is a collection of other IntoIterator (O) , being the target to access this IntoIterator inner items we need to set the Flatten::Item to be the same as the inner IntoIterator::Item, hence:
type Item = <O::Item as IntoIterator>::Item;
Since we are implementing Iterator the Self::Item is that implementation-defined item (the inner ones that we just explained).
As recap:
O::Item => Inner IntoIterator::Item for flatten
Self::Item => Flatten as Iterator Item. Which matches the previous one.
Flatten::Item => Same as Self::Item for the Iterator implementation.

Avoiding "cannot move out of borrowed content" without the use of "to_vec"?

I'm learning rust and have a simple program, shown below. Playground link.
#[derive(Debug)]
pub struct Foo {
bar: String,
}
pub fn gather_foos<'a>(data: &'a Vec<Vec<&'a Foo>>) -> Vec<Vec<&'a Foo>> {
let mut ret: Vec<Vec<&Foo>> = Vec::new();
for i in 0..data.len() {
if meets_requirements(&data[i]) {
ret.push(data[i].to_vec());
}
}
return ret
}
fn meets_requirements<'a>(_data: &'a Vec<&'a Foo>) -> bool {
true
}
fn main() {
let foo = Foo{
bar: String::from("bar"),
};
let v1 = vec![&foo, &foo, &foo];
let v2 = vec![&foo, &foo];
let data = vec![v1, v2];
println!("{:?}", gather_foos(&data));
}
The program simply loops through an array of arrays of a struct, checks if the array of structs meets some requirement and returns an array of arrays that meets said requirement.
I'm sure there's a more efficient way of doing this without the need to call to_vec(), which I had to implement in order to avoid the error cannot move out of borrowed content, but I'm not sure what that solution is.
I'm learning about Box<T> now and think it might provide a solution to my needs? Thanks for any help!!
The error is showing up because you're trying to move ownership of one of the vectors in the input vector to the output vector, which is not allowed since you've borrowed the input vector immutably. to_vec() creates a copy, which is why it works when you use it.
The solution depends on what you're trying to do. If you don't need the original input (you only want the matched ones), you can simply pass the input by value rather than by reference, which will allow you to consume the vector and move items to the output. Here's an example of this.
If you do need the original input, but you don't want to copy the vectors with to_vec(), you may want to use references in the output, as demonstrated by this example. Note that the function now returns a vector of references to vectors, rather than a vector of owned vectors.
For other cases, there are other options. If you need the data to be owned by multiple items for some reason, you could try Rc<T> or Arc<T> for reference-counted smart pointers, which can be cloned to provide immutable access to the same data by multiple owners.

Resources