Memory deallocation using slice::from_raw_parts_mut and ptr::drop_in_place in Rust

I saw a piece of code online that was dropping allocated memory using a combination of std::slice::from_raw_parts_mut() and std::ptr::drop_in_place(). Below is a piece of code that allocates an array of ten integers and then de-allocates it:
use std::{
alloc::{alloc, Layout},
ptr::NonNull,
};
fn main() {
let len: usize = 10;
let layout: Layout = Layout::array::<i32>(len).unwrap();
let data: NonNull<i32> = unsafe { NonNull::new(alloc(layout) as *mut i32).unwrap() };
unsafe {
std::ptr::drop_in_place(std::slice::from_raw_parts_mut(data.as_ptr(), len));
}
}
The return type of std::slice::from_raw_parts_mut() is a mutable slice &mut [T], but the argument of std::ptr::drop_in_place() is *mut T. It seems to me that the conversion happens automatically. I'm pretty sure I'm missing something here since it shouldn't be allowed. Would someone explain what exactly is happening here?

When you write std::slice::from_raw_parts_mut(data.as_ptr(), len) you are building a value of type &mut [i32].
Then you are passing it to drop_in_place() that is defined more or less as:
fn drop_in_place<T: ?Sized>(to_drop: *mut T)
So you are coercing a &mut [i32] into a *mut T. This is resolved in two steps: the reference is automatically coerced to a raw pointer (*mut [i32]), and then T is inferred as [i32], which is the type whose drop is actually run.
(You may think that the automatic coercion from reference to pointer is dangerous and should not be automatic, but it is actually totally safe. What is unsafe is usually what you do with the pointer afterwards. And actually there are a couple of uses of raw pointers that are safe, such as std::ptr::eq or std::ptr::hash).
Dropping a slice in place simply drops each of its elements in turn, so this is a clever way to avoid writing the loop manually.
But note a couple of things about this code:
drop_in_place will call Drop::drop on every element of the slice, but since they are of type i32 it is effectively a no-op. I guess that your original code uses a generic type.
drop_in_place does not free the memory; for that you need a call to std::alloc::dealloc.
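Putting both notes together, a minimal sketch of the full lifecycle might look like the following. It assumes String as the element type (so that dropping actually does something) and a hypothetical helper named alloc_init_drop. Unlike the snippet in the question, it initializes every element before building the slice, since a &mut [T] has to refer to initialized values.

```rust
use std::alloc::{alloc, dealloc, handle_alloc_error, Layout};
use std::ptr::{self, NonNull};

/// Hypothetical helper: allocates `len` Strings, drops them in place,
/// frees the buffer, and returns how many elements were handled.
fn alloc_init_drop(len: usize) -> usize {
    assert!(len > 0, "`alloc` must not be called with a zero-sized layout");
    let layout = Layout::array::<String>(len).unwrap();
    unsafe {
        let raw = alloc(layout) as *mut String;
        let data = match NonNull::new(raw) {
            Some(p) => p,
            None => handle_alloc_error(layout),
        };
        // Initialize every slot before a slice reference is created.
        for i in 0..len {
            ptr::write(data.as_ptr().add(i), format!("item {}", i));
        }
        // `&mut [String]` coerces to `*mut [String]`, so `T` is `[String]`.
        let slice: *mut [String] = std::slice::from_raw_parts_mut(data.as_ptr(), len);
        // Runs the destructor of each String; the buffer itself stays allocated.
        ptr::drop_in_place(slice);
        // Only this call returns the buffer to the allocator.
        dealloc(data.as_ptr() as *mut u8, layout);
    }
    len
}

fn main() {
    println!("dropped {} strings", alloc_init_drop(10));
}
```

The layout passed to dealloc has to be the same one that was used for alloc, which is why it is kept in a variable.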

Related

What are the differences between fn(b: Box<dyn Trait>) and fn<T: Trait>(b: &mut T) in Rust? [duplicate]

I'm a bit confused about how pointers work in Rust. There's ref, Box, &, *, and I'm not sure how they work together.
Here's how I understand it currently:
Box isn't really a pointer - it's a way to allocate data on the heap, and pass around unsized types (traits especially) in function arguments.
ref is used in pattern matching to borrow something that you match on, instead of taking it. For example,
let thing: Option<i32> = Some(4);
match thing {
None => println!("none!"),
Some(ref x) => println!("{}", x), // x is a borrowed thing
}
println!("{}", x + 1); // wouldn't work without the ref since the block would have taken ownership of the data
& is used to make a borrow (borrowed pointer). If I have a function fn foo(&self) then I'm taking a reference to myself that will expire after the function terminates, leaving the caller's data alone. I can also pass data that I want to retain ownership of by doing bar(&mydata).
* is used to make a raw pointer: for example, let y: i32 = 4; let x = &y as *const i32. I understand pointers in C/C++ but I'm not sure how this works with Rust's type system, and how they can be safely used. I'm also not sure what the use cases are for this type of pointer. Additionally, the * symbol can be used to dereference things (what things, and why?).
Could someone explain the 4th type of pointer to me, and verify that my understanding of the other types is correct? I'd also appreciate anyone pointing out any common use cases that I haven't mentioned.
First of all, all of the items you listed are really different things, even if they are related to pointers. Box is a library-defined smart pointer type; ref is a syntax for pattern matching; & is a reference operator, doubling as a sigil in reference types; * is a dereference operator, doubling as a sigil in raw pointer types. See below for more explanation.
There are four basic pointer types in Rust which can be divided in two groups - references and raw pointers:
&T - immutable (shared) reference
&mut T - mutable (exclusive) reference
*const T - immutable raw pointer
*mut T - mutable raw pointer
The difference between the last two is very thin, because either can be cast to the other without any restrictions, so the const/mut distinction there serves mostly as a lint. Raw pointers can be created freely to anything, and they also can be created out of thin air from integers, for example.
Naturally, this is not so for references - reference types and their interaction define one of the key features of Rust: borrowing. References have a lot of restrictions on how and when they can be created, how they can be used and how they interact with each other. In return, they can be used without unsafe blocks. What borrowing is exactly and how it works is out of scope of this answer, though.
Both references and raw pointers can be created using & operator:
let mut x: u32 = 12; // must be `mut` so that `&mut x` is allowed below
let ref1: &u32 = &x;
let raw1: *const u32 = &x;
let ref2: &mut u32 = &mut x;
let raw2: *mut u32 = &mut x;
Both references and raw pointers can be dereferenced using * operator, though for raw pointers it requires an unsafe block:
*ref1; *ref2;
unsafe { *raw1; *raw2; }
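As a compilable variant of the two snippets above, the following sketch creates raw pointers in safe code and dereferences them inside unsafe blocks. The function name read_and_write is made up for this illustration.

```rust
/// Hypothetical demo: returns the value read through a `*const u32`
/// and the value of `x` after writing through a `*mut u32`.
fn read_and_write() -> (u32, u32) {
    let mut x: u32 = 12;

    // Creating a raw pointer from a reference is safe.
    let raw_const: *const u32 = &x;
    // Reading through it is not, so it needs an unsafe block.
    let before = unsafe { *raw_const };

    let raw_mut: *mut u32 = &mut x;
    // Writing through a raw pointer also needs an unsafe block.
    unsafe { *raw_mut += 1 };

    (before, x)
}

fn main() {
    let (before, after) = read_and_write();
    println!("before = {}, after = {}", before, after);
}
```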
The dereference operator is often omitted, because another operator, the "dot" operator (i.e., .), automatically references or dereferences its left argument. So, for example, if we have these definitions:
struct X { n: u32 }
impl X {
fn method(&self) -> u32 { self.n }
}
then, even though method() takes self by reference, self.n automatically dereferences it, so you won't have to type (*self).n. A similar thing happens when method() is called:
let x = X { n: 12 };
let n = x.method();
Here, the compiler automatically references x in x.method(), so you won't have to write (&x).method().
The next to last piece of code also demonstrated the special &self syntax. It means just self: &Self, or, more specifically, self: &X in this example. &mut self works the same way and means self: &mut Self; there is no corresponding shorthand for raw pointers.
So, references are the main pointer kind in Rust and should be used almost always. Raw pointers, which don't have restrictions of references, should be used in low-level code implementing high-level abstractions (collections, smart pointers, etc.) and in FFI (interacting with C libraries).
Rust also has dynamically-sized (or unsized) types. These types do not have a definite statically-known size and therefore can only be used through a pointer/reference. However, only a pointer is not enough - additional information is needed, for example, length for slices or a pointer to a virtual methods table for trait objects. This information is "embedded" in pointers to unsized types, making these pointers "fat".
A fat pointer is basically a structure which contains the actual pointer to the piece of data and some additional information (length for slices, pointer to vtable for trait objects). What's important here is that Rust handles these details about pointer contents absolutely transparently for the user - if you pass &[u32] or *mut dyn SomeTrait values around, the corresponding internal information will be automatically passed along.
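One way to observe the extra information carried by fat pointers is to compare pointer sizes. The sketch below assumes a hypothetical trait Shape and relies on what current rustc does on common targets (a fat pointer is two machine words); the exact layout is not something the answer above promises.

```rust
use std::mem::size_of;

// Hypothetical trait, only used to form a trait-object type.
trait Shape {
    fn area(&self) -> f64;
}

/// Returns the sizes of a thin reference, a slice reference,
/// and a trait-object reference, in bytes.
fn pointer_sizes() -> (usize, usize, usize) {
    (
        size_of::<&u32>(),       // thin: just an address
        size_of::<&[u32]>(),     // fat: address + length
        size_of::<&dyn Shape>(), // fat: address + vtable pointer
    )
}

fn main() {
    let (thin, slice, object) = pointer_sizes();
    println!("thin = {}, slice = {}, trait object = {}", thin, slice, object);
}
```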
Box<T> is one of the smart pointers in the Rust standard library. It provides a way to allocate enough memory on the heap to store a value of the corresponding type, and then it serves as a handle, a pointer to that memory. Box<T> owns the data it points to; when it is dropped, the corresponding piece of memory on the heap is deallocated.
A very useful way to think of boxes is to consider them as regular values, but with a fixed size. That is, Box<T> is equivalent to just T, except it always takes a number of bytes which correspond to the pointer size of your machine. We say that (owned) boxes provide value semantics. Internally, they are implemented using raw pointers, like almost any other high-level abstraction.
Boxes (in fact, this is true for almost all of the other smart pointers, like Rc) can also be borrowed: you can get a &T out of Box<T>. This can happen automatically with the . operator or you can do it explicitly by dereferencing and referencing it again:
let x: Box<u32> = Box::new(12);
let y: &u32 = &*x;
In this regard, Boxes are similar to built-in pointers - you can use the dereference operator to reach their contents. This is possible because the dereference operator in Rust is overloadable, and it is overloaded for most (if not all) of the smart pointer types. This allows easy borrowing of these pointers' contents.
And, finally, ref is just a syntax in patterns to obtain a variable of the reference type instead of a value. For example:
let mut x: u32 = 12; // `mut` is needed for the `ref mut` binding below
let y = x; // y: u32, a copy of x
let ref z = x; // z: &u32, points to x
let ref mut zz = x; // zz: &mut u32, points to x
While the above example can be rewritten with reference operators:
let z = &x;
let zz = &mut x;
(which would also make it more idiomatic), there are cases when refs are indispensable, for example, when taking references into enum variants:
let x: Option<Vec<u32>> = ...;
match x {
Some(ref v) => ...
None => ...
}
In the above example, x is only borrowed inside the whole match statement, which allows using x after this match. If we write it as such:
match x {
Some(v) => ...
None => ...
}
then x will be consumed by this match and will become unusable after it.
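The two match forms above can be put side by side in a small, self-contained sketch. The function name lengths and the concrete vector are assumptions made for the example.

```rust
/// Hypothetical demo: matches the same Option twice, first borrowing
/// with `ref`, then taking ownership of the inner Vec.
fn lengths() -> (usize, usize) {
    let x: Option<Vec<u32>> = Some(vec![1, 2, 3]);

    // `ref v` only borrows the Vec, so `x` stays usable afterwards.
    let borrowed_len = match x {
        Some(ref v) => v.len(),
        None => 0,
    };

    // Here `v` is the Vec itself: it is moved out of `x`.
    let owned_len = match x {
        Some(v) => v.len(),
        None => 0,
    };
    // Using `x` from this point on would be a compile error (use after move).

    (borrowed_len, owned_len)
}

fn main() {
    println!("{:?}", lengths());
}
```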
Box is logically a newtype around a raw pointer (*const T). However, it allocates and deallocates its data during construction and destruction, so does not have to borrow data from some other source.
The same thing is true of other pointer types, like Rc - a reference counted pointer. These are structs containing private raw pointers which they allocate into and deallocate from.
A raw pointer has exactly the same layout as the corresponding reference, so raw pointers are not compatible with C pointers in several cases. Importantly, *const str and *const [T] are fat pointers, which means they contain extra information about the value's length.
However, raw pointers make absolutely no guarantees as to their validity. For example, I can safely do
123 as *const String
This pointer is invalid, since the memory location 123 does not point to a valid String. Thus, when dereferencing one, an unsafe block is required.
Further, whereas borrows are required to respect certain laws - namely that you cannot have multiple borrows if one is mutable - raw pointers do not have to respect this. There are other, weaker, laws that must be obeyed, but you're less likely to run afoul of these.
There is no logical difference between *mut and *const, although one may need to be cast to the other to do certain operations - the difference is purely documentary.
References and raw pointers are the same thing at the implementation level. The difference from the programmer perspective is that references are safe (in Rust terms), but raw pointers are not.
The borrow checker guarantees that references are always valid (lifetime management), that you can have only one mutable reference at time, etc.
These kinds of constraints can be too strict for many use cases, so raw pointers (which do not have any constraints, like in C/C++) are useful to implement low-level data structures, and in general low-level stuff. However, you can only dereference raw pointers inside an unsafe block.
The containers in the standard library are implemented using raw pointers, Box and Rc too.
Box and Rc are what smart pointers are in C++, that is wrappers around raw pointers.
I would like to add my two cents.
A. Table
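| Reference/Pointer | Data location | Mutable | Shared ownership | Safe | impl Copy |
|---|---|---|---|---|---|
| &T | stack | ❌ | ✔️ | ✔️ | ✔️ |
| &mut T | stack | ✔️ | ❌ | ✔️ | ❌ |
| *const T | stack | ❌ | ✔️ | ❌ | ✔️ |
| *mut T | stack | ✔️ | ✔️ | ❌ | ✔️ |
| Box<T> | heap | ✔️ | ❌ | ✔️ | ❌ |
| Rc<T> | heap | ❌ | ✔️ | ✔️ | ❌ |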
B. Comments on table
&T
Mutable (❌): Error: cannot assign to *some_ref, which is behind a & reference some_ref is a & reference, so the data it refers to cannot be written rustc (E0594).
Shared (✔️)
Safe (✔️)
impl Copy (✔️)
&mut T
Mutable (✔️)
Shared (❌): Has only one owner. Error: cannot borrow x as mutable more than once at a time second mutable borrow occurs here rustc (E0499).
Safe (✔️)
impl Copy (❌): Error: move occurs because some_ref has type &mut u32, which does not implement the Copy trait.
*const T
Mutable: (❌): Error: cannot assign to *some_raw_pointer, which is behind a *const pointer raw1 is a *const pointer, so the data it refers to cannot be written rustc (E0594).
Shared (✔️)
Safe: (❌): Error: dereference of raw pointer is unsafe and requires unsafe function or block raw pointers may be null, dangling or unaligned; they can violate aliasing rules and cause data races: all of these are undefined behavior rustc (E0133).
impl Copy (✔️): Please check the official documentation.
*mut T
Mutable (✔️)
Shared (✔️)
Safe (❌): Error: dereference of raw pointer is unsafe and requires unsafe function or block raw pointers may be null, dangling or unaligned; they can violate aliasing rules and cause data races: all of these are undefined behavior rustc (E0133).
impl Copy (✔️): Please check the Official Documentation.
Box<T>
Mutable (✔️)
Shared (❌): In order to prove it, use a reference to a box in some scope, the reference will drop right after that scope ends because it has only one owner. Please refer to this SO answer for more details. Error: some_box does not live long enough borrowed value does not live long enough rustc (E0597).
Safe (✔️)
impl Copy (❌): Please check the Official Documentation. Actually there is a reason:
You can't implement Copy for Box, that would allow creation of multiple boxes referencing the same thing.
Rc<T>
Mutable (❌): Well, only one copy is mutable, and it's a bit more complicated. Error: cannot assign to data in an Rc trait DerefMut is required to modify through a dereference, but it is not implemented for Rc<u32> rustc (E0594).
Shared (✔️): Actually it's multiple ownership.
Safe (✔️)
impl Copy (❌): Please check the Official Documentation.
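The Rc<T> row can be illustrated with a short sketch. It uses Rc::strong_count and Rc::get_mut from the standard library; the function name rc_demo and the values are made up for this example.

```rust
use std::rc::Rc;

/// Hypothetical demo: returns the owner count while shared, whether mutation
/// was possible while shared, and the final value after the clone is dropped.
fn rc_demo() -> (usize, bool, u32) {
    let mut first = Rc::new(5u32);
    // Cloning an Rc adds an owner; it does not copy the u32.
    let second = Rc::clone(&first);
    let owners = Rc::strong_count(&first);

    // While there are two owners, get_mut refuses to hand out `&mut u32`.
    let mutable_while_shared = Rc::get_mut(&mut first).is_some();

    // Once the other owner is gone, the value can be mutated in place.
    drop(second);
    if let Some(value) = Rc::get_mut(&mut first) {
        *value += 1;
    }

    (owners, mutable_while_shared, *first)
}

fn main() {
    println!("{:?}", rc_demo());
}
```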
C. Related Notes
1. Copy trait vs move:
According to the official documentation:
It’s important to note that in these two examples, the only difference is whether you are allowed to access x after the assignment. Under the hood, both a copy and a move can result in bits being copied in memory, although this is sometimes optimized away.
So, be aware that move transfers ownership, while Copy has nothing to do with it.
2. Mutable References do not implement Copy
Some types can’t be copied safely. For example, copying &mut T would create an aliased mutable reference. Copying String would duplicate responsibility for managing the String’s buffer, leading to a double free.
It's good anyway to read the full Copy documentation page.
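Notes 1 and 2 can be seen together in a few lines: assigning a &T copies the reference, while assigning a &mut T moves it. This is a small sketch with a made-up function name, copies_and_moves.

```rust
/// Hypothetical demo of Copy (shared reference) versus move (mutable reference).
fn copies_and_moves() -> u32 {
    let mut value: u32 = 1;

    let shared: &u32 = &value;
    let shared_copy = shared; // `&u32` is Copy: both names stay usable
    let sum = *shared + *shared_copy;

    let exclusive: &mut u32 = &mut value;
    let moved = exclusive; // `&mut u32` is not Copy: `exclusive` is moved from
    // Using `exclusive` here would be a compile error (use of moved value).
    *moved += sum;

    value
}

fn main() {
    println!("{}", copies_and_moves());
}
```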
3. Dereferencing Pointers and Unsafe
The term unsafe here means that you cannot dereference the pointer except inside an unsafe function or block. Otherwise, you'll get the following error:
dereference of raw pointer is unsafe and requires unsafe function or block raw pointers may be null, dangling or unaligned; they can violate aliasing rules and cause data races: all of these are undefined behavior rustc (E0133).
4. ref is the same as &
Box is a smart pointer, which is a data type. It is not just a simple pointer to an address in memory: the Box is the owner of the value it points to.
fn main(){
// this will point to a value 0.1 which will be stored on the HEAP
// the var heap_value is just the address and it will be stored in the stack
// Box pointer is the owner of the value
let heap_value=Box::new(0.1);
// "x" is a primitive type, it will have a fixed size and therefore will be stored on the stack.
let x=0.1;
// * dereference which means just get the stored value
println!("they are equal or not {}",x==*heap_value); // true
}
Dereference a tuple:
fn main(){
let coord=Box::new((25,50));
// x is a pointer
let x=coord;
// to extract all the tuple data structure
// if you are behind a reference and you need to use the value
let extracted_tuple=*x;
}
type of "x" pointer is: Box<(i32, i32)>
type of "extracted_tuple" is (i32, i32)
Keep in mind that a reference is itself a small, fixed-size value; in the example below it is a local variable, so it lives on the stack:
fn main(){
let stack_var=10;
// this is the reference of stack_var. they both are on the stack.
// it points to stack_var declared above
let stack_ref=&stack_var;
// this will create a box pointer. heap memory will be allocated
// copy of stack_var will be stored on the heap, heap_var points to that memory
let heap_var=Box::new(stack_var);
println!("heap var is {}",heap_var);
}
As you said, ref is used in pattern matching to borrow something that you match on. Instead of using the ref keyword, you can match on &thing:
let thing: Option<i32> = Some(4);
match &thing {
None => println!("none!"),
Some(x) => println!("{}", x), // x is a borrowed thing
}
println!("{:?}", thing); // thing is still usable here: the match only borrowed it

Understanding rust borrowing and dereferencing

I was reading through the Rust documentation and can't quite seem to be able to wrap my head around what is going on. For example, over here I see the following example:
// This function takes ownership of a box and destroys it
fn eat_box_i32(boxed_i32: Box<i32>) {
println!("Destroying box that contains {}", boxed_i32);
}
// This function borrows an i32
fn borrow_i32(borrowed_i32: &i32) {
println!("This int is: {}", borrowed_i32);
}
fn main() {
// Create a boxed i32, and a stacked i32
let boxed_i32 = Box::new(5_i32);
let stacked_i32 = 6_i32;
// Borrow the contents of the box. Ownership is not taken,
// so the contents can be borrowed again.
borrow_i32(&boxed_i32);
borrow_i32(&stacked_i32);
{
// Take a reference to the data contained inside the box
let _ref_to_i32: &i32 = &boxed_i32;
// Error!
// Can't destroy `boxed_i32` while the inner value is borrowed later in scope.
eat_box_i32(boxed_i32);
// FIXME ^ Comment out this line
// Attempt to borrow `_ref_to_i32` after inner value is destroyed
borrow_i32(_ref_to_i32);
// `_ref_to_i32` goes out of scope and is no longer borrowed.
}
// `boxed_i32` can now give up ownership to `eat_box` and be destroyed
eat_box_i32(boxed_i32);
}
Things I believe:
eat_box_i32 takes a pointer to a Box
this line let boxed_i32 = Box::new(5_i32); makes it so that boxed_i32 now contains a pointer because Box is not a primitive
Things I don't understand:
why do we need to call borrow_i32(&boxed_i32); with the ampersand? Isn't boxed_i32 already a pointer?
on this line: let _ref_to_i32: &i32 = &boxed_i32; why is the ampersand required on the right hand side? Isn't boxed_i32 already an address?
how come borrow_i32 can be called with pointer to Box and pointer to i32 ?
Comment on the term "pointers"
You can skip this part if you'd like, I just figured given the questions you asked, this might be a helpful comment:
In Rust, &i32, &mut i32, *const i32, *mut i32, Box<i32>, Rc<i32>, Arc<i32> are all arguably a "pointer to i32" type. However, Rust will not let you convert between them casually, even between those that are laid out identically in memory.
It can be useful to talk about pointers in general sometimes, but as a rule of thumb, if you're trying to figure out why one piece of Rust code compiles, and another doesn't, I'd recommend keeping track of which kind of pointer you're working with.
Things you believe:
eat_box_i32 takes a pointer to a Box
Actually not quite. eat_box_i32 accepts a Box<i32>, and not a pointer to a Box<i32>. It just so happens that Box<i32> in memory is stored as a pointer to an i32.
this line let boxed_i32 = Box::new(5_i32); makes it so that boxed_i32 now contains a pointer because Box is not a primitive
Yes, boxed_i32 is a pointer.
Things you don't understand:
why do we need to call borrow_i32(&boxed_i32); with the ampersand? Isn't boxed_i32 already a pointer?
Yes, boxed_i32 is already a pointer. However, a boxed pointer still indicates ownership. If you passed boxed_i32 instead of &boxed_i32, you would still be passing a pointer, but Rust will consider that variable "consumed", and you would no longer be able to use boxed_i32 after that function call.
on this line: let _ref_to_i32: &i32 = &boxed_i32; why is the ampersand required on the right hand side? Isn't boxed_i32 already an address?
Yes, boxed_i32 is already an address, but the fact that it's an address is kind of meant to be opaque (like a struct with a single private field). The actual type of &boxed_i32 is &Box<i32>.
Though this is weird right? If &boxed_i32 is &Box<i32>, how can you assign it to a variable of type &i32?
This is actually a shorthand: if a type T implements the Deref<Target=R> trait, Rust will automatically convert values of type &T into values of type &R as needed. And it turns out that the Box<T> type implements Deref<Target=T>.
See https://doc.rust-lang.org/std/ops/trait.Deref.html for more info about Deref.
So if you wrote it out explicitly without that automatic conversion, that line would actually look something like:
let _ref_to_i32: &i32 = Deref::deref(&boxed_i32);
how come borrow_i32 can be called with pointer to Box and pointer to i32 ?
The reason is the same as with (2) above.
borrow_i32 accepts &i32 as its parameter. Passing &i32 is obviously ok because the types match exactly. If you try to pass it &Box<i32>, Rust will automatically convert it to &i32 for you, because Box<i32> implements Deref<Target=i32>.
EDIT: Thanks @kmdreko for pointing out that Deref allows the coercion, and not AsRef
Just to supplement @math4tots, the auto dereferencing is called Deref Coercion. It is explained in the Rust book here: https://doc.rust-lang.org/book/ch15-02-deref.html#implicit-deref-coercions-with-functions-and-methods
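The deref coercion described in this thread can be sketched as follows. The three calls are meant to be equivalent; the function names double and coercion_demo are made up for the example.

```rust
use std::ops::Deref;

// Takes a plain `&i32`, like borrow_i32 in the question.
fn double(n: &i32) -> i32 {
    *n * 2
}

/// Hypothetical demo: three ways of passing a boxed i32 as `&i32`.
fn coercion_demo() -> (i32, i32, i32) {
    let boxed: Box<i32> = Box::new(5);

    // `&boxed` is a `&Box<i32>`; deref coercion turns it into `&i32`.
    let implicit = double(&boxed);
    // The same conversion written out as an explicit trait call.
    let explicit = double(Deref::deref(&boxed));
    // Dereference the box, then borrow the i32 inside it.
    let reborrowed = double(&*boxed);

    (implicit, explicit, reborrowed)
}

fn main() {
    println!("{:?}", coercion_demo());
}
```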

How to safely get an immutable byte slice from a `&mut [u32]`?

In a rather low level part of a project of mine, a function receives a mutable slice of primitive data (&mut [u32] in this case). This data should be written to a writer in little endian.
Now, this alone wouldn't be a problem, but all of this has to be fast. I measured my application and identified this as one of the critical paths. In particular, if the endianness doesn't need to be changed (since we're already on a little endian system), there shouldn't be any overhead.
This is my code (Playground):
use std::{io, mem, slice};
fn write_data(mut w: impl io::Write, data: &mut [u32]) -> Result<(), io::Error> {
adjust_endianness(data);
// Is this safe?
let bytes = unsafe {
let len = data.len() * mem::size_of::<u32>();
let ptr = data.as_ptr() as *const u8;
slice::from_raw_parts(ptr, len)
};
w.write_all(bytes)
}
fn adjust_endianness(_: &mut [u32]) {
// implementation omitted
}
adjust_endianness changes the endianness in place (which is fine, since a wrong-endian u32 is garbage, but still a valid u32).
This code works, but the critical question is: Is this safe? In particular, at some point, data and bytes both exist, being one mutable and one immutable slice to the same data. That sounds very bad, right?
On the other hand, I can do this:
let bytes = &data[..];
That way, I also have those two slices. The difference is just that data is now borrowed.
Is my code safe or does it exhibit UB? Why? If it's not safe, how to safely do what I want to do?
In general, creation of slices that violate Rust's safety rules, even briefly, is unsafe. If you cheat the borrow checker and make independent slices borrowing the same data as & and &mut at the same time, it will make Rust specify incorrect aliasing information in LLVM, and this may lead to actually miscompiled code. Miri doesn't flag this case, because you're not using data afterwards, but the exact details of what is unsafe are still being worked out.
To be safe, you should explain the sharing situation to the borrow checker:
let shared_data = &data[..];
data will be temporarily reborrowed as shared/read-only for the duration shared_data is used. In this case it shouldn't cause any limitations. The data will keep being mutable after exiting this scope.
Then you'll have &[u32], but you need &[u8]. Fortunately, this conversion is sound, because both are shared, and u8 has a smaller alignment requirement than u32 (if it were the other way around, you'd have to use the slice method align_to).
let shared_data = &data[..];
let bytes = unsafe {
let len = shared_data.len() * mem::size_of::<u32>();
let ptr = shared_data.as_ptr() as *const u8; // derive the pointer from the shared reborrow
slice::from_raw_parts(ptr, len)
};
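A complete version of this approach might look like the sketch below. The helper as_bytes is a hypothetical name, and the endianness adjustment is filled in with u32::to_le as an assumption, since the question omits its implementation. Because as_bytes takes a &[u32], the byte slice borrows from a shared reference and the borrow checker sees the whole situation.

```rust
use std::io::{self, Write};
use std::{mem, slice};

/// Hypothetical helper: views a slice of u32 as its underlying bytes.
fn as_bytes(data: &[u32]) -> &[u8] {
    let len = data.len() * mem::size_of::<u32>();
    // SAFETY: u8 has alignment 1 and every byte pattern is a valid u8; `len`
    // covers exactly the memory of `data`; the returned slice borrows `data`.
    unsafe { slice::from_raw_parts(data.as_ptr() as *const u8, len) }
}

fn write_data(mut w: impl Write, data: &mut [u32]) -> io::Result<()> {
    // Assumed implementation of the in-place endianness adjustment:
    // `to_le` is a no-op on little-endian targets.
    for word in data.iter_mut() {
        *word = word.to_le();
    }
    // `data` is reborrowed as shared for the duration of the call.
    w.write_all(as_bytes(data))
}

fn main() -> io::Result<()> {
    let mut out = Vec::new();
    let mut data = [1u32, 0x0102_0304];
    write_data(&mut out, &mut data)?;
    println!("{:?}", out);
    Ok(())
}
```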

Is it safe to use a closure to get a raw pointer from an Option<&T>?

I have an Option<&T> and I would like to have a raw *const T which is null if the option was None. I want to wrap an FFI call that takes a pointer to a Rust-allocated object.
Additionally, the FFI interface I am using has borrowing semantics (I allocate something and pass in a pointer to it), not ownership semantics
extern "C" {
// Parameter may be null
fn ffi_call(*const T);
}
fn safe_wrapper(opt: Option<&T>) {
let ptr: *const T = ???;
unsafe { ffi_call(ptr) }
}
I could use a match statement to do this, but that method feels very verbose.
let ptr = match opt {
Some(inner) => inner as *const T,
None => null(),
};
I could also map the reference to a pointer, then use unwrap_or.
let ptr = opt.map(|inner| inner as *const T).unwrap_or(null());
However, I'm worried that the pointer might be invalidated as it passes through the closure. Does Rust make a guarantee that the final pointer will point to the same thing as the original reference? If T is Copy, does this change the semantics in a meaningful way? Is there a better way that I am overlooking?
Yes, this is safe. I'd write it as:
use std::ptr;
fn safe_wrapper(opt: Option<&u8>) {
let p = opt.map_or_else(ptr::null, |x| x);
unsafe { ffi_call(p) }
}
If you find yourself writing this a lot, you could make it into a trait and reduce it down to a single method call.
the pointer might be invalidated as it passes through the closure
It could be, if you invalidate it yourself somehow. Because the function takes a reference, you know for sure that the referred-to value will be valid for the duration of the function call — that's the purpose of Rust's borrow checker.
The only way for the pointer to become invalid is if you change the value of the pointer (e.g. you add an offset to it). Since you don't do that, it's fine.
Does Rust make a guarantee that the final pointer will point to the same thing as the original reference?
It depends what you mean by "final". Converting a reference to a pointer will always result in both values containing the same location in memory. Anything else would be deliberately malicious and no one would ever have used Rust to begin with.
If T is Copy, does this change the semantics in a meaningful way?
No. Besides, we are talking about a &T, which is always Copy.
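To check the claim that the pointer refers to the same location as the original reference, here is a small sketch without any FFI. The helper opt_to_ptr is a hypothetical name; it uses Option::map_or and std::ptr::null from the standard library.

```rust
use std::ptr;

/// Hypothetical helper: `None` becomes a null pointer,
/// `Some(r)` becomes the address that `r` refers to.
fn opt_to_ptr<T>(opt: Option<&T>) -> *const T {
    opt.map_or(ptr::null(), |r| r as *const T)
}

fn main() {
    let value: u8 = 7;
    let some_ptr = opt_to_ptr(Some(&value));
    let none_ptr = opt_to_ptr::<u8>(None);

    // The pointer compares equal to the address of `value`.
    println!("same address: {}", ptr::eq(some_ptr, &value));
    println!("null for None: {}", none_ptr.is_null());
}
```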
See also:
Convert Option<&mut T> to *mut T
Should we use Option or ptr::null to represent a null pointer in Rust?
Is it valid to use ptr::NonNull in FFI?
the FFI interface I am using has borrowing semantics (I allocate something and pass in a pointer to it), not ownership semantics
To be clear, you cannot determine ownership based purely on what the function types are.
This C function takes ownership:
void string_free(char *)
This C function borrows:
size_t string_len(char *)
Both take a pointer. Rust improves on this situation by clearly delineating what is a borrow and what is a transfer of ownership.
extern "C" {
// Parameter may be null
fn ffi_call(*const T);
}
This code is nonsensical; it does not define the generic type T and FFI functions cannot have generic types anyway.

When is it necessary to circumvent Rust's borrow checker?

I'm implementing Conway's game of life to teach myself Rust. The idea is to implement a single-threaded version first, optimize it as much as possible, then do the same for a multi-threaded version.
I wanted to implement an alternative data layout which I thought might be more cache-friendly. The idea is to store the status of two cells for each point on a board next to each other in memory in a vector, one cell for reading the current generation's status from and one for writing the next generation's status to, alternating the access pattern for each generation's computation (which can be determined at compile time).
The basic data structures are as follows:
#[repr(u8)]
pub enum CellStatus {
DEAD,
ALIVE,
}
/** 2 bytes */
pub struct CellRW(CellStatus, CellStatus);
pub struct TupleBoard {
width: usize,
height: usize,
cells: Vec<CellRW>,
}
/** used to keep track of current pos with iterator e.g. */
pub struct BoardPos {
x_pos: usize,
y_pos: usize,
offset: usize,
}
pub struct BoardEvo {
board: TupleBoard,
}
The function that is causing me troubles:
impl BoardEvo {
fn evolve_step<T: RWSelector>(&mut self) {
for (pos, cell) in self.board.iter_mut() {
//pos: BoardPos, cell: &mut CellRW
let read: &CellStatus = T::read(cell); //chooses the right tuple half for the current evolution step
let write: &mut CellStatus = T::write(cell);
let alive_count = pos.neighbours::<T>(&self.board).iter() //<- can't borrow self.board again!
.filter(|&&status| status == CellStatus::ALIVE)
.count();
*write = CellStatus::evolve(*read, alive_count);
}
}
}
impl BoardPos {
/* ... */
pub fn neighbours<T: RWSelector>(&self, board: &TupleBoard) -> [CellStatus; 8] {
/* ... */
}
}
The trait RWSelector has static functions for reading from and writing to a cell tuple (CellRW). It is implemented for two zero-sized types L and R and is mainly a way to avoid having to write different methods for the different access patterns.
The iter_mut() method returns a BoardIter struct which is a wrapper around a mutable slice iterator for the cells vector and thus has &mut CellRW as Item type. It is also aware of the current BoardPos (x and y coordinates, offset).
I thought I'd iterate over all cell tuples, keep track of the coordinates, count the number of alive neighbours (I need to know coordinates/offsets for this) for each (read) cell, compute the cell status for the next generation and write to the other half of the tuple.
Of course, in the end, the compiler showed me the fatal flaw in my design, as I borrow self.board mutably in the iter_mut() method and then try to borrow it again immutably to get all the neighbours of the read cell.
I have not been able to come up with a good solution for this problem so far. I did manage to get it working by making all references immutable and then using an UnsafeCell to turn the immutable reference to the write cell into a mutable one. I then write to the nominally immutable reference to the writing part of the tuple through the UnsafeCell.
However, that doesn't strike me as a sound design and I suspect I might run into issues with this when attempting to parallelize things.
Is there a way to implement the data layout I proposed in safe/idiomatic Rust or is this actually a case where you actually have to use tricks to circumvent Rust's aliasing/borrow restrictions?
Also, as a broader question, is there a recognizable pattern for problems which require you to circumvent Rust's borrow restrictions?
When is it necessary to circumvent Rust's borrow checker?
It is needed when:
the borrow checker is not advanced enough to see that your usage is safe
you do not wish to (or cannot) write the code in a different pattern
As a concrete case, the compiler cannot tell that this is safe:
let mut array = [1, 2];
let a = &mut array[0];
let b = &mut array[1];
The compiler doesn't know what the implementation of IndexMut for a slice does at this point of compilation (this is a deliberate design choice). For all it knows, arrays always return the exact same reference, regardless of the index argument. We can tell that this code is safe, but the compiler rejects it as soon as both a and b are used afterwards.
You can rewrite this in a way that is obviously safe to the compiler:
let mut array = [1, 2];
let (a, b) = array.split_at_mut(1);
let a = &mut a[0];
let b = &mut b[0];
How is this done? split_at_mut performs a runtime check to ensure that it actually is safe:
fn split_at_mut(&mut self, mid: usize) -> (&mut [T], &mut [T]) {
let len = self.len();
let ptr = self.as_mut_ptr();
unsafe {
assert!(mid <= len);
(from_raw_parts_mut(ptr, mid),
from_raw_parts_mut(ptr.offset(mid as isize), len - mid))
}
}
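From the caller's side, split_at_mut gives two mutable references that can be used at the same time without any unsafe code. A minimal sketch, with a made-up function name bump_both:

```rust
/// Hypothetical demo: mutates and swaps two elements of the same array
/// through two simultaneously live mutable references.
fn bump_both(array: &mut [i32; 2]) {
    // Splits into [array[0]] and [array[1]]; the halves cannot overlap.
    let (left, right) = array.split_at_mut(1);
    let a = &mut left[0];
    let b = &mut right[0];

    *a += 10;
    *b += 20;
    // Both references are in use at once, which `&mut array[0]` together
    // with `&mut array[1]` would not allow.
    std::mem::swap(a, b);
}

fn main() {
    let mut array = [1, 2];
    bump_both(&mut array);
    println!("{:?}", array);
}
```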
For an example where the borrow checker is not yet as advanced as it can be, see What are non-lexical lifetimes?.
I borrow self.board mutably in the iter_mut() method and then try to borrow it again immutably to get all the neighbours of the read cell.
If you know that the references don't overlap, then you can choose to use unsafe code to express it. However, this means you are also choosing to take on the responsibility of upholding all of Rust's invariants and avoiding undefined behavior.
The good news is that this heavy burden is what every C and C++ programmer has to (or at least should) have on their shoulders for every single line of code they write. At least in Rust, you can let the compiler deal with 99% of the cases.
In many cases, there are tools like Cell and RefCell to allow for interior mutation. In other cases, you can rewrite your algorithm to take advantage of a value being a Copy type. In other cases you can use an index into a slice for a shorter period. In other cases you can have a multi-phase algorithm.
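As one illustration of the "Copy type" and "index into a slice" suggestions, here is a sketch of the read/write tuple layout in safe Rust. Everything in it is a simplifying assumption rather than the asker's code: the board is a 1-D ring instead of a 2-D grid, the RWSelector trait is left out, and the evolve rule is a toy rule, not Conway's.

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
enum CellStatus {
    Dead,
    Alive,
}

/// (read half, write half), mirroring CellRW from the question.
#[derive(Clone, Copy, Debug)]
struct CellRW(CellStatus, CellStatus);

/// Counts alive neighbours on a hypothetical 1-D ring board.
fn alive_neighbours(cells: &[CellRW], i: usize) -> usize {
    let n = cells.len();
    let neighbours = [cells[(i + n - 1) % n].0, cells[(i + 1) % n].0];
    neighbours.iter().filter(|&&s| s == CellStatus::Alive).count()
}

/// Toy rule (an assumption): a cell is alive next iff exactly one neighbour is alive.
fn evolve(_current: CellStatus, alive: usize) -> CellStatus {
    if alive == 1 { CellStatus::Alive } else { CellStatus::Dead }
}

fn evolve_step(cells: &mut [CellRW]) {
    for i in 0..cells.len() {
        // Shared borrow of the whole board; it ends when the call returns.
        let alive = alive_neighbours(cells, i);
        // CellStatus is Copy, so this read leaves no borrow behind.
        let current = cells[i].0;
        // Short mutable access by index, only to the write half.
        cells[i].1 = evolve(current, alive);
    }
}

fn main() {
    use CellStatus::{Alive, Dead};
    let mut cells = vec![CellRW(Dead, Dead), CellRW(Alive, Dead), CellRW(Dead, Dead)];
    evolve_step(&mut cells);
    let next: Vec<CellStatus> = cells.iter().map(|c| c.1).collect();
    println!("{:?}", next);
}
```

Indexing costs a bounds check per access compared with iter_mut, so whether this is acceptable on a hot path would have to be measured.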
If you do need to resort to unsafe code, then try your best to hide it in a small area and expose safe interfaces.
Above all, many common problems have been asked about (many times) before:
How to iterate over mutable elements inside another mutable iteration over the same elements?
Mutating an item inside of nested loops
How can a nested loop with mutations on a HashMap be achieved in Rust?
What's the Rust way to modify a structure within nested loops?
Nesting an iterator's loops
