Vec of structs causing borrowing issues

I am trying to build a molecule data structure. Starting with an Atom struct, a molecule Vec stores all of the Atoms (with their coordinates and indices, etc). I also want a Bond struct which will have pairs of Atom structs, and another Vec which stores all of the bonds. I'll do the same for Angle and so on...
Once in the structs, the data will not be mutated; it will only be used to calculate things like bond lengths via methods. However, I can't quite work out how to get around the ownership issue.
mvp_molecule.rs
#[derive(Debug)]
struct Atom {
atomic_symbol: String,
index: i16,
}
#[derive(Debug)]
struct Bond {
atom_1: Atom,
atom_2: Atom,
}
pub fn make_molecule() {
let mut molecule = Vec::new();
let mut bonds = Vec::new();
let atom_1 = Atom {
atomic_symbol: "C".to_string(),
index: 0,
};
molecule.push(atom_1);
let atom_2 = Atom {
atomic_symbol: "H".to_string(),
index: 1,
};
molecule.push(atom_2);
let bond = Bond {
atom_1: molecule[0],
atom_2: molecule[1],
};
bonds.push(bond);
}
I think the issue is that Rust thinks I might change an Atom while it's in a Bond, which I won't do. How can I convince Rust of that?
I appreciate this may be a common problem but I'm not learned enough to realise what I should be looking for to solve it. I've looked through a lot of the documentation on references, borrowing and lifetimes but I'm still not quite sure what the issue I'm trying to solve is, or if it's solvable in this way.
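For illustration, here is a minimal sketch of one possible way out, assuming the atoms really are never mutated once the bonds exist. The error in the code above most likely comes from molecule[0] trying to move an Atom out of the Vec, not from any fear of mutation, so the sketch lets each Bond borrow its atoms instead of owning them. The helper names make_atoms and make_bonds are hypothetical and not part of the original code.

```rust
#[derive(Debug)]
struct Atom {
    atomic_symbol: String,
    index: i16,
}

// A bond borrows its two atoms from the molecule instead of owning them.
#[derive(Debug)]
struct Bond<'a> {
    atom_1: &'a Atom,
    atom_2: &'a Atom,
}

// Hypothetical helper: builds the two-atom molecule from the question.
fn make_atoms() -> Vec<Atom> {
    vec![
        Atom { atomic_symbol: "C".to_string(), index: 0 },
        Atom { atomic_symbol: "H".to_string(), index: 1 },
    ]
}

// Hypothetical helper: the returned bonds hold shared references into
// `molecule`, so `molecule` cannot be modified while they are alive.
fn make_bonds(molecule: &[Atom]) -> Vec<Bond<'_>> {
    vec![Bond { atom_1: &molecule[0], atom_2: &molecule[1] }]
}

fn main() {
    let molecule = make_atoms();
    let bonds = make_bonds(&molecule);
    println!("{:?}", bonds);
}
```

If atoms need to be added after bonds have been created, storing the two atom indices (for example as usize) in Bond instead of references would be the usual alternative.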

Related

How to rotate a vector without standard library?

I'm getting into Rust and Arduino at the same time.
I was programming my LCD display to show a long string by rotating it through the top row of characters. That is, every second I shift all characters by one position and show the new String.
This was fairly complex in the Arduino language, especially because I had to know the size of the String at compile time (given my limited knowledge).
Since I'd like to use Rust in the long term, I was curious to see if that could be done more easily in a modern language. Not so much.
This is the code I came up with, after hours of experimentation:
#![no_std]
extern crate alloc;
use alloc::{vec::Vec};
fn main() {
}
fn rotate_by<T: Copy>(rotate: Vec<T>, by: isize) -> Vec<T> {
let real_by = modulo(by, rotate.len() as isize) as usize;
Vec::from_iter(rotate[real_by..].iter().chain(rotate[..real_by].iter()).cloned())
}
fn modulo(a: isize, b: isize) -> isize {
a - b * (a as f64 /b as f64).floor() as isize
}
mod tests {
use super::*;
#[test]
fn test_rotate_five() {
let chars: Vec<_> = "I am the string and you should rotate me! ".chars().collect();
let res_chars: Vec<_> = "the string and you should rotate me! I am ".chars().collect();
assert_eq!(rotate_by(chars, 5), res_chars);
}
}
My questions are:
Could you provide an optimized version of this function? I'm aware that there already is Vec::rotate but it uses unsafe code and can panic, which I would like to avoid (by returning a Result).
Explain whether or not it is possible to achieve this in-place without unsafe code (I failed).
Is Vec<_> the most efficient data structure to work with? I tried hard to use [char], which I thought would be more efficient, but then I have to know the size at compile time, which hardly works. I thought Rust arrays would be similar to Java arrays, which can be sized at runtime yet are also fixed size once created, but they seem to have a lot more constraints.
Oh and also what happens if I index into a vector at an invalid index? Will it panic? Can I do this better? Without "manually" checking the validity of the slice indices?
I realize that's a lot of questions, but I'm struggling and this is bugging me a lot, so if somebody could set me straight it would be much appreciated!
You can use slice::rotate_left and slice::rotate_right:
#![no_std]
extern crate alloc;
use alloc::vec::Vec;
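fn rotate_by<T>(data: &mut [T], by: isize) {
if data.is_empty() {
return;
}
// rotate_left and rotate_right panic if the distance exceeds the length,
// so reduce it modulo the length first.
let distance = by.unsigned_abs() % data.len();
if by > 0 {
data.rotate_left(distance);
} else {
data.rotate_right(distance);
}
}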
I made it rotate in-place because that is more efficient. If you don't want to do it in-place, you still have the option of cloning the vector first. This is more flexible than a function that creates a new vector, as yours does, because the caller cannot opt out of that allocation.
Notice that rotate_by takes a mutable slice, but you can still pass a mutable reference to a vector, because of deref coercion.
#[test]
fn test_rotate_five() {
let mut chars: Vec<_> = "I am the string and you should rotate me! ".chars().collect();
let res_chars: Vec<_> = "the string and you should rotate me! I am ".chars().collect();
rotate_by(&mut chars, 5);
assert_eq!(chars, res_chars);
}
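As a sketch of the "clone first" option mentioned above, a non-mutating wrapper can be layered on top of the in-place function. The name rotated is hypothetical, and the rotate_by shown here reduces the distance modulo the length, which is an assumption added so that large values of by do not make rotate_left or rotate_right panic.

```rust
// In-place rotation. The distance is reduced modulo the length because
// rotate_left/rotate_right panic when it exceeds the slice length.
fn rotate_by<T>(data: &mut [T], by: isize) {
    if data.is_empty() {
        return;
    }
    let distance = by.unsigned_abs() % data.len();
    if by > 0 {
        data.rotate_left(distance);
    } else {
        data.rotate_right(distance);
    }
}

// Hypothetical wrapper: leaves the input untouched and returns a rotated copy.
fn rotated<T: Clone>(data: &[T], by: isize) -> Vec<T> {
    let mut out = data.to_vec();
    rotate_by(&mut out, by);
    out
}

fn main() {
    let original = vec![1, 2, 3, 4, 5];
    let shifted = rotated(&original, 2);
    println!("{:?} -> {:?}", original, shifted);
}
```

Callers that can afford to overwrite their buffer call rotate_by directly and pay for no allocation; only callers that need the original afterwards use the copying wrapper.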
There are some edge cases with moving chars around like this, because some valid UTF-8 contains grapheme clusters that are made up of multiple code points (chars in Rust). This produces strange effects when a grapheme cluster is split between the start and end of the string. For example, rotating "abcde\u{301}fghijk" (where the "é" is an 'e' followed by a combining acute accent) by 5 results in "\u{301}fghijkabcde", with the acute accent stranded on its own at the start, away from the 'e' at the end.
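The effect can be reproduced with a small sketch. The helper rotate_chars_left is hypothetical, and the input is assumed to spell "é" in decomposed form, as an 'e' followed by the combining acute accent U+0301.

```rust
// Hypothetical helper: rotates a string left by `by` chars (code points),
// not by grapheme clusters.
fn rotate_chars_left(s: &str, by: usize) -> String {
    let mut chars: Vec<char> = s.chars().collect();
    if !chars.is_empty() {
        let distance = by % chars.len();
        chars.rotate_left(distance);
    }
    chars.into_iter().collect()
}

fn main() {
    // 'e' and U+0301 are two separate chars that render as one "é".
    let text = "abcde\u{301}fghijk";
    // Rotating by 4 keeps the pair together; rotating by 5 splits it.
    println!("{:?}", rotate_chars_left(text, 4));
    println!("{:?}", rotate_chars_left(text, 5));
}
```

Rotating by whole grapheme clusters would need segmentation logic that the standard library does not provide, so it is left out of this sketch.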
If your strings are ASCII then you don't have to worry about that, but then you can also just treat them as byte strings anyway:
#[test]
fn test_rotate_five_ascii() {
let mut chars = b"I am the string and you should rotate me! ".clone();
let res_chars = b"the string and you should rotate me! I am ";
rotate_by(&mut chars, 5);
assert_eq!(chars, &res_chars[..]);
}
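On the side question about invalid indices: indexing a Vec or slice with [] panics when the index or range is out of bounds. A non-panicking alternative is the get method, which returns an Option. The sketch below uses two hypothetical helpers, char_at and window, to show both the single-index and the range form.

```rust
// Hypothetical helper: returns the char at `index`, or None if out of bounds.
fn char_at(chars: &[char], index: usize) -> Option<char> {
    chars.get(index).copied()
}

// Hypothetical helper: returns `len` chars starting at `start`, or None if
// the requested range does not fit inside the slice.
fn window(chars: &[char], start: usize, len: usize) -> Option<&[char]> {
    // checked_add guards against overflow of start + len.
    let end = start.checked_add(len)?;
    chars.get(start..end)
}

fn main() {
    let chars: Vec<char> = "hello".chars().collect();
    println!("{:?}", char_at(&chars, 1));
    println!("{:?}", char_at(&chars, 10));
    println!("{:?}", window(&chars, 1, 3));
    println!("{:?}", window(&chars, 3, 5));
}
```

Converting the None case into an error with ok_or would give the Result-returning API asked about in the question.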

How to get 2 mutable references from 2-D vec at the same time?

I have a 2D Vec and want to get two mutable references from it at the same time, here is the demo code
use std::default::Default;
#[derive(Default, Clone, PartialEq)]
struct Ele {
foo: i32,
bar: f32,
}
fn main() {
let mut data:Vec<Vec<Ele>> = vec![vec![Default::default();100];100];
let a = &mut data[1][2];
let b = &mut data[2][4];
if a != b {
a.foo += b.foo;
b.bar += a.bar;
}
}
Using unsafe code is OK.
You shouldn't try to solve this problem with unsafe. Instead, try to understand why the compiler rejects something that looks fine, and which tools are available to convince it that the operation is legitimate (without hiding it behind a black box and just saying "trust me"). These tools will usually use unsafe code themselves, but because that code sits behind a safe boundary, it is the tool author's burden to ensure everything works even when the compiler can't figure it out on its own, which is better than carrying that burden yourself.
In particular, Rust doesn't understand that you are accessing two separate regions of memory. Being conservative, it assumes that if you are borrowing a single element of a vector, you are borrowing all of it. To make it clear that you are talking about two separate pieces of the vector, the solution is simply to split it into two distinct pieces. That way, you make it clear that they are different memory regions:
use std::default::Default;
#[derive(Default, Clone, PartialEq)]
struct Ele {
foo: i32,
bar: f32,
}
fn main() {
let mut data:Vec<Vec<Ele>> = vec![vec![Ele::default();100];100];
let (left, right) = data.split_at_mut(2);
let a = &mut left[1][2];
let b = &mut right[0][4];
if a != b {
a.foo += b.foo;
b.bar += a.bar;
}
}
Note that this will not actually split the vector, it will only give two views over the vector that are disjoint, so it's very efficient.
See the playground.
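When the two indices are only known at runtime, the same split_at_mut idea can be wrapped in a small helper. This is a sketch under assumptions: the name pair_mut is hypothetical, and it returns None for equal or out-of-bounds indices instead of panicking.

```rust
#[derive(Default, Clone, PartialEq)]
struct Ele {
    foo: i32,
    bar: f32,
}

// Hypothetical helper: mutable references to two distinct elements of a
// slice, or None if the indices are equal or out of bounds.
fn pair_mut<T>(items: &mut [T], i: usize, j: usize) -> Option<(&mut T, &mut T)> {
    if i == j || i >= items.len() || j >= items.len() {
        return None;
    }
    if i < j {
        // `left` holds 0..j, `right` starts at j.
        let (left, right) = items.split_at_mut(j);
        Some((&mut left[i], &mut right[0]))
    } else {
        // `left` holds 0..i, `right` starts at i.
        let (left, right) = items.split_at_mut(i);
        Some((&mut right[0], &mut left[j]))
    }
}

// Borrows rows 1 and 2 at the same time, then one element from each row.
fn demo() -> (i32, f32) {
    let mut data: Vec<Vec<Ele>> = vec![vec![Ele::default(); 100]; 100];
    data[2][4].foo = 7;
    data[1][2].bar = 1.5;
    if let Some((row_a, row_b)) = pair_mut(&mut data, 1, 2) {
        let a = &mut row_a[2];
        let b = &mut row_b[4];
        a.foo += b.foo;
        b.bar += a.bar;
    }
    (data[1][2].foo, data[2][4].bar)
}

fn main() {
    println!("{:?}", demo());
}
```

For two elements in the same row, the helper can be applied to that row's inner Vec instead of the outer one.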

What is the best way to resolve mutable borrow after immutable borrow, IF there is no perceived reference conflict

This question popped into my head (while I wasn't programming), and it actually made me question a lot of things about programming (like in C++, C#, Rust, in particular).
I want to point out, I'm aware there is a similar question on this issue:
Cannot borrow as mutable because it is also borrowed as immutable.
But I believe this question is aiming at a particular situation; a sub-problem. And I want to better understand how to resolve a thing like this in Rust.
The "thing" that I realised recently was that: "If I have a pointer/reference to an element in a dynamic array, and then I add an element, causing the array to expand and reallocate, that would break the pointer. Therefore, I need a special reference that will always point to the same element even if it re-allocates".
This made me start thinking differently about a lot of things. But outside of that, I am aware that this problem is trivial to experienced c++ programmers. I have simply not come across this situation in my experiences, unfortunately.
So I wanted to see if Rust either had an existing 'special type' for this type of issue, and if not, what would happen if I made my own (for testing). The idea is that this "special pointer" would simply be a pointer to the Vector (List) itself, but also have an integer field for the index; so it's all bundled under one variable that can be 'dereferenced' whenever you need.
Note: "VecPtr" is meant to be an immutable reference.
struct VecPtr<'a, T> {
vec: &'a Vec<T>,
index: usize
}
impl<T: Copy> VecPtr<'_, T> {
pub fn value(&self) -> T {
return self.vec[self.index];
}
}
fn main() {
let mut v = Vec::<i32>::with_capacity(6);
v.push(3);
v.push(1);
v.push(4);
v.push(1);
let r = VecPtr {vec: &v,index: 2};
let n = r.value();
println!("{}",n);
v.push(5); // error!
v.push(9); // error!
v.push(6); // re-allocation triggered // also error!
let n2 = r.value();
println!("{}",n2);
return;
}
So the above example code shows that you can't have an existing immutable reference while also trying to have a mutable reference at the same time. Good!
From what I've read in the other StackOverflow question, one of the reasons for the compiler error is that the Vector could re-allocate its internal array at any time when it is calling "push", which would invalidate all references to the internal array.
Which makes 100% sense. So as a programmer, you may still want references into the array, but ones designed to be a bit safer. Instead of a direct pointer to the internal array, you just have a pointer to the vector itself, and include an index so you know which element you are looking at. This means the dangling pointer issue that would occur at v.push(6); shouldn't happen any more. Yet the compiler still complains about the same issue. Which I understand.
I suppose it's still concerned about the reference to the vector itself, not the internals. Which makes things a bit confusing. Because there are different pointers here that the compiler is looking at and trying to protect. But to be honest, in the example code, the pointer to vec itself looks totally fine. That reference doesn't change at all (and it shouldn't, from what I can tell).
So my question is, is there a practice at which you can tell the compiler your intentions with certain references? So the compiler knows there isn't an issue (other than the unsafe keyword).
Or alternatively, is there a better way to do what I'm trying to do in the example code?
After some more research
It looks like one solution here would be to use reference counting Rc<T>, but I'm not sure that's 100% it.
I would normally not ask this question due to there being a similar existing question, but this one (I think) is investigating a slightly different situation, where someone (or me) would try to resolve an unsafe reference situation, but the compiler still insists there is an issue.
I guess the question comes down to this: would you find this acceptable?
fn main() {
let mut v = Vec::<i32>::with_capacity(6);
v.push(3);
v.push(1);
v.push(4);
v.push(1);
let r = VecPtr { vec: &v, index: 2 };
let n = r.value();
println!("{}",n);
v[2] = -1;
let n2 = r.value(); // This returned 4 just three lines ago and I was
// promised it wouldn't change! Now it's -1.
println!("{}",n2);
}
Or this
fn main() {
let mut v = Vec::<i32>::with_capacity(6);
v.push(3);
v.push(1);
v.push(4);
v.push(1);
let r = VecPtr { vec: &v, index: 2 };
let n = r.value();
println!("{}",n);
v.clear();
let n2 = r.value(); // This exact same thing that worked three lines ago will now panic.
println!("{}",n2);
}
Or, worst of all:
fn main() {
let mut v = Vec::<i32>::with_capacity(6);
v.push(3);
v.push(1);
v.push(4);
v.push(1);
let r = VecPtr { vec: &v, index: 2 };
let n = r.value();
println!("{}",n);
drop(v);
let n2 = r.value(); // Now you do actually have a dangling pointer.
println!("{}",n2);
}
Rust's answer is an emphatic "no" and that is enforced in the type system. It's not just about the unsoundness of dereferencing dangling pointers, it's a core design decision.
Can you tell the compiler your intentions with certain references? Yes! You can tell the compiler whether you want to share your reference, or whether you want to mutate through it. In your case, you've told the compiler that you want to share it. Which means you're not allowed to mutate it anymore. And as the examples above show, for good reason.
For the sake of this, the borrow checker has no notion of the stack or the heap, it doesn't know what types allocate and which don't, or when a Vec resizes. It only knows and cares about moving values and borrowing references: whether they're shared or mutable and for how long they live.
Now, if you want to make your structure work, Rust offers you some possibilities: One of those is RefCell. A RefCell allows you to borrow a mutable reference from an immutable one at the expense of runtime checking that nothing is aliased incorrectly. This together with an Rc can make your VecPtr:
use std::cell::RefCell;
use std::rc::Rc;
struct VecPtr<T> {
vec: Rc<RefCell<Vec<T>>>,
index: usize,
}
impl<T: Copy> VecPtr<T> {
pub fn value(&self) -> T {
return self.vec.borrow()[self.index];
}
}
fn main() {
let v = Rc::new(RefCell::new(Vec::<i32>::with_capacity(6)));
{
let mut v = v.borrow_mut();
v.push(3);
v.push(1);
v.push(4);
v.push(1);
}
let r = VecPtr {
vec: Rc::clone(&v),
index: 2,
};
let n = r.value();
println!("{}", n);
{
let mut v = v.borrow_mut();
v.push(5);
v.push(9);
v.push(6);
}
let n2 = r.value();
println!("{}", n2);
}
I'll leave it to you to look into how RefCell works.
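As a starting point, here is a small sketch of the runtime check itself. While a shared borrow obtained through borrow() is alive, try_borrow_mut() returns an Err (and borrow_mut() would panic); once the shared borrow is dropped, mutation is allowed again. The function name write_blocked_while_reading is hypothetical.

```rust
use std::cell::RefCell;

// Returns true if a mutable borrow is refused while a shared borrow is alive.
fn write_blocked_while_reading(cell: &RefCell<Vec<i32>>) -> bool {
    let _reader = cell.borrow();
    let blocked = cell.try_borrow_mut().is_err();
    blocked
    // `_reader` is dropped here, releasing the shared borrow.
}

fn main() {
    let cell = RefCell::new(vec![3, 1, 4, 1]);
    println!("blocked while reading: {}", write_blocked_while_reading(&cell));
    // No borrow is outstanding any more, so this succeeds.
    cell.borrow_mut().push(5);
    println!("{:?}", cell.borrow());
}
```

This is the same rule the borrow checker enforces at compile time, only checked while the program runs.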

Can I add the items of a subvector in a vector to another subvector in the same vector?

I am writing code in which I mean to save vectors in a bigger vector.
Later, I want to append the values of one of the vectors to another and remove the vector whose values I've transferred.
I've made two attempts which have both failed and I would very much appreciate it if someone could help me out with this!
The code I use looks like this:
fn my_func(distribution: (u16, u8, u8, u8, u8, u8)) {
let mut covered: Vec<(Vec<usize>, u16)> = vec![
(vec![0], distribution.1.into()),
(vec![1], distribution.2.into()),
(vec![2], distribution.3.into()),
(vec![3], distribution.4.into()),
(vec![4], distribution.5.into()),
];
// Attempt 1 - Error: borrowed as mutable more than once
&covered[0].0.append(&mut covered[1].0);
// Attempt 2 - Error: borrowed as mutable and immutable
for i in &covered[1].0 {
&covered[0].0.push(*i);
}
}
I am relatively new to Rust, so I'm still learning about the intricacies of borrowing.
Could someone please help me understand how I am to accomplish what I want to accomplish?
Any other remarks on my coding style or on other mistakes I made are also super welcome.
Assuming I understand correctly what you're trying to do, you're trying to mutably borrow from covered at two different indices. Your approach doesn't work because covered[0] causes the whole Vec to be mutably borrowed. You can only have one mutable reference at a time, which is why the subsequent covered[1] results in an error.
One way to accomplish this is to use split_at_mut(). You then get two mutable slices from covered, each of which can be indexed.
This is related to what's known as "Splitting Borrows".
fn func(distribution: (u16, u8, u8, u8, u8, u8)) {
let mut covered: Vec<(Vec<usize>, u16)> = vec![
(vec![0], distribution.1.into()),
(vec![1], distribution.2.into()),
(vec![2], distribution.3.into()),
(vec![3], distribution.4.into()),
(vec![4], distribution.5.into()),
];
let (left, right) = covered.split_at_mut(1);
let v0 = &mut left[0];
let v1 = &mut right[0];
// Now both this
v0.0.append(&mut v1.0);
// and this works
for i in &v1.0 {
v0.0.push(*i);
}
}
Note that as Aplet123 mentioned in the comments, append() moves all the elements from the other Vec leaving it empty. If you don't want the other Vec to be cleared, then instead use extend() or extend_from_slice().
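The difference between the two can be seen in a short sketch. The function names merge_by_append and merge_by_extend are hypothetical wrappers used only for illustration.

```rust
// Moves every element out of `src` into `dst`, leaving `src` empty.
fn merge_by_append(dst: &mut Vec<usize>, src: &mut Vec<usize>) {
    dst.append(src);
}

// Copies the elements of `src` into `dst`, leaving `src` unchanged.
fn merge_by_extend(dst: &mut Vec<usize>, src: &[usize]) {
    dst.extend_from_slice(src);
}

fn main() {
    let mut a = vec![0];
    let mut b = vec![1, 2];
    merge_by_append(&mut a, &mut b);
    println!("append: a = {:?}, b = {:?}", a, b);

    let mut c = vec![0];
    let d = vec![1, 2];
    merge_by_extend(&mut c, &d);
    println!("extend: c = {:?}, d = {:?}", c, d);
}
```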
If you know up front that you can remove the item from the Vec, then you don't need split_at_mut() for this case. You can do as RedBorg mentioned: just remove() the item and borrow as you otherwise would.
fn my_func(distribution: (u16, u8, u8, u8, u8, u8)) {
let mut covered: Vec<(Vec<usize>, u16)> = vec![
(vec![0], distribution.1.into()),
(vec![1], distribution.2.into()),
(vec![2], distribution.3.into()),
(vec![3], distribution.4.into()),
(vec![4], distribution.5.into()),
];
let mut v1 = covered.remove(1);
let v0 = &mut covered[0];
// Now both this
v0.0.append(&mut v1.0);
// and this works
for i in &v1.0 {
v0.0.push(*i);
}
}
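One detail to watch with the remove() approach is that removing an entry shifts every later entry down by one. A sketch of a helper that accounts for this follows; the name merge_into is hypothetical, and discarding the u16 of the removed entry is an assumption, since the question does not say what should happen to it.

```rust
// Hypothetical helper: moves the values of entry `src` into entry `dst`
// and removes entry `src`. Panics if the indices are equal or out of bounds.
fn merge_into(covered: &mut Vec<(Vec<usize>, u16)>, dst: usize, src: usize) {
    assert_ne!(dst, src, "cannot merge an entry into itself");
    // Assumption: the u16 of the removed entry is simply dropped.
    let (mut moved, _discarded) = covered.remove(src);
    // Entries after `src` have shifted down by one position.
    let dst = if src < dst { dst - 1 } else { dst };
    covered[dst].0.append(&mut moved);
}

fn main() {
    let mut covered: Vec<(Vec<usize>, u16)> =
        vec![(vec![0], 10), (vec![1], 20), (vec![2], 30)];
    merge_into(&mut covered, 2, 0);
    println!("{:?}", covered);
}
```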

Ergonomics issues with fixed size byte arrays in Rust

Rust sadly cannot produce a fixed size array [u8; 16] with a fixed size slicing operator s[0..16]. It'll throw errors like "expected array of 16 elements, found slice".
I've some KDFs that output several keys in wrapper structs like
pub struct LeafKey([u8; 16]);
pub struct MessageKey([u8; 32]);
fn kdfLeaf(...) -> (MessageKey,LeafKey) {
// let mut r: [u8; 32+16];
let mut r: (MessageKey, LeafKey);
debug_assert_eq!(mem::size_of_val(&r), 384/8);
let mut sha = Sha3::sha3_384();
sha.input(...);
// sha.result(r);
sha.result(
unsafe { mem::transmute::<&mut (MessageKey, LeafKey),&mut [u8;32+16]>(&r) }
);
sha.reset();
// (MessageKey(r[0..31]), LeafKey(r[32..47]))
r
}
Is there a safer way to do this? We know mem::transmute will refuse to compile if the types do not have the same size, but that only checks that pointers have the same size here, so I added that debug_assert.
In fact, I'm not terribly worried about extra copies though since I'm running SHA3 here, but afaik rust offers no ergonomic way to copy amongst byte arrays.
Can I avoid writing (MessageKey, LeafKey) three times here? Is there a type alias for the return type of the current function? Is it safe to use _ in the mem::transmute given that I want the code to refuse to compile if the sizes do not match? Yes, I know I could make a type alias, but that seems silly.
As an aside, there is a longer discussion of s[0..16] not having type [u8; 16] here
There's the copy_from_slice method.
fn main() {
use std::default::Default;
// Using 16+8 because Default isn't implemented
// for [u8; 32+16] due to type explosion unfortunateness
let b: [u8; 24] = Default::default();
let mut c: [u8; 16] = Default::default();
let mut d: [u8; 8] = Default::default();
c.copy_from_slice(&b[..16]);
d.copy_from_slice(&b[16..16+8]);
}
Note that copy_from_slice unfortunately panics at runtime if the slices are not the same length, so make sure you test this thoroughly yourself, or use the lengths of the other arrays to guard.
Unfortunately, c.copy_from_slice(&b[..c.len()]) doesn't work because Rust thinks c is borrowed both immutably and mutably at the same time.
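A variant that reports a length mismatch as a value instead of panicking can be sketched with the standard TryFrom conversion from a slice to a fixed-size array. The helper name split_keys is hypothetical, and the 48-byte digest layout (32 bytes for MessageKey followed by 16 for LeafKey) is taken from the question.

```rust
use std::convert::TryFrom;

pub struct LeafKey([u8; 16]);
pub struct MessageKey([u8; 32]);

// Hypothetical helper: splits a 48-byte digest into the two keys.
// Returns None if the digest does not have exactly 48 bytes.
fn split_keys(digest: &[u8]) -> Option<(MessageKey, LeafKey)> {
    if digest.len() != 32 + 16 {
        return None;
    }
    let (head, tail) = digest.split_at(32);
    // try_from copies the bytes and fails if the slice length is wrong.
    let message = <[u8; 32]>::try_from(head).ok()?;
    let leaf = <[u8; 16]>::try_from(tail).ok()?;
    Some((MessageKey(message), LeafKey(leaf)))
}

fn main() {
    let digest = [7u8; 48];
    match split_keys(&digest) {
        Some((message, leaf)) => println!("{} + {} bytes", message.0.len(), leaf.0.len()),
        None => println!("unexpected digest length"),
    }
}
```

This costs two small copies, which the question already describes as acceptable next to the cost of the hash itself.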
I marked the accepted answer as best since it's safe, and led me to the clone_into_array answer here, but..
Another idea that improves the safety is to make a version of mem::transmute for references that checks the sizes of the referenced types, as opposed to just the pointers. It might look like:
#[inline]
unsafe fn transmute_ptr_mut<A,B>(v: &mut A) -> &mut B {
debug_assert_eq!(core::mem::size_of::<A>(), core::mem::size_of::<B>());
core::mem::transmute::<&mut A,&mut B>(v)
}
I have raised an issue on the arrayref crate to discuss this, as arrayref might be a reasonable crate for it to live in.
Update: We've a new "best answer" by the arrayref crate developer:
let (a,b) = array_refs![&r,32,16];
(MessageKey(*a), LeafKey(*b))
