Consider the following code:
unsafe {
let mut s = "".to_string();
let r = &mut s;
let ptr = r as *mut String;
r.push('a');
(*ptr).push('b');
r.push('c');
(*ptr).push('d');
println!("{}", r);
}
I would think that this is a clear violation of Rust aliasing rules: I have a mutable reference and a mutable pointer, and there are interleaving writes using both of them. However, MIRI lets it pass. What am I missing?
I have a program that uses a QuadTree. This tree stores mutable borrows to data that is owned by another container (a Vec). I rebuild the QuadTree every game loop, but I do not want to reallocate, so I clear the underlying Vecs of the QuadTree instead of reconstructing it from scratch.
A simplified example that demonstrates the same problem is shown below. Instead of a QuadTree, here I am just using another Vec as this has identical issues.
struct A;
fn main() {
let mut owned_data = vec![A, A, A];
let mut mut_borrowed_data = vec![];
'_outer: loop {
mut_borrowed_data.clear();
'_inner: for borrow in &mut owned_data {
mut_borrowed_data.push(borrow);
}
}
}
This gives the error:
error[E0499]: cannot borrow `owned_data` as mutable more than once at a time
--> src\main.rs:8:30
|
8 | '_inner: for borrow in &mut owned_data {
| ^^^^^^^^^^^^^^^ `owned_data` was mutably borrowed here in the previous iteration of the loop
The issue isn't really that I am mutably borrowing in a previous iteration of the outer loop. If I remove the mut_borrowed_data.push(borrow); line it compiles, because the borrow checker realises that the mutable borrow of owned_data is dropped at the end of each outer loop iteration, so there is at most one mutable borrow at a time. By pushing into mut_borrowed_data, the mutable borrow is moved into that container (please correct me if I am wrong here), so it isn't dropped and the borrow checker is not happy. If I did not have the clear there would be multiple copies of the mutable borrow, and the borrow checker is not smart enough to realise that I only push each borrow into mut_borrowed_data once and that I clear it every outer loop.
But as it stands, there is only one instance of the mutable borrow at any one time, so is the following code safe/sound?
struct A;
fn main() {
let mut owned_data = vec![A, A, A];
let mut mut_borrowed_data = vec![];
'_outer: loop {
mut_borrowed_data.clear();
'_inner: for borrow in &mut owned_data {
let ptr = borrow as *mut A;
let new_borrow = unsafe { &mut *ptr };
mut_borrowed_data.push(new_borrow);
}
}
}
This now compiles. The mutable borrow of owned_data (named borrow) is not moved into mut_borrowed_data and is therefore dropped at the end of each outer loop iteration. This means owned_data is only mutably borrowed once. The unsafe code takes a copy of the pointer to the data, dereferences it and creates a new borrow to that (again, please correct me if I am wrong). Because this uses a copy and not a move, the compiler allows borrow and new_borrow to exist at the same time. This use of unsafe could break the borrow rules, but as long as I do not use borrow after I have created new_borrow, and as long as I clear mut_borrowed_data, I think this is safe/sound.
Moreover, (I think) the guarantees given by the borrow checker still hold as long as I clear the mut_borrowed_data vec. It won't let me push into mut_borrowed_data twice in one loop, because the new_borrow is moved after it is first inserted.
I do not want to use a RefCell as I want this to be as performant as possible. The whole purpose of the QuadTree is to increase performance so I want to make any overhead it introduces as lean as possible. Incrementing the borrow count is probably cheap, but the branch (to check if that value is <= 1), the indirection, and the decreased simplicity of my data, are too much for me to feel happy about.
Is my use of unsafe here safe/sound? Is there anything that could trip me up?
Let's start with that: your code is safe, and also pretty sound.
The unsafe code takes a copy of the pointer to the data, dereferences it and creates a new borrow to that. (again, please correct me if I am wrong). Because this uses a copy and not a move, the compiler allows borrow and new_borrow to exist at the same time.
This is not accurate. The reason borrow and new_borrow can exist at the same time is not because raw pointers are copied while references are moved, but because when you converted the reference to a raw pointer you detached the lifetime chain - the compiler can no longer track the source of new_borrow.
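To make the "detached lifetime" point concrete, here is a minimal sketch. The helper detach and the function detached_increment are hypothetical names introduced only for illustration. The signature shows that the output lifetime is unrelated to the input lifetime once the reference has passed through a raw pointer.

```rust
/// Hypothetical helper: round-trips a reference through a raw pointer.
/// The output lifetime `'long` is not tied to the input lifetime `'short`.
///
/// # Safety
/// The caller must make sure the result is never used while another
/// reference to the same value is in use, and never outlives the value.
unsafe fn detach<'short, 'long, T>(r: &'short mut T) -> &'long mut T {
    unsafe { &mut *(r as *mut T) }
}

fn detached_increment() -> i32 {
    let mut value = 1;
    let detached: &mut i32 = unsafe { detach(&mut value) };
    // The borrow checker no longer connects `detached` to `value`,
    // so keeping the two uses apart is now our responsibility.
    *detached += 1;
    value // read only after the last use of `detached`
}

fn main() {
    println!("{}", detached_increment());
}
```

Because the compiler cannot see the link any more, nothing would stop this sketch from touching value while detached is still in use; the ordering is kept correct by hand.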
It won't let me push into mut_borrowed_data twice in one loop, because the new_borrow is moved after it is first inserted.
Yes, but also no:
'_outer: loop {
mut_borrowed_data.clear();
'_inner: for borrow in &mut owned_data {
let ptr = borrow as *mut A;
let new_borrow = unsafe { &mut *ptr };
mut_borrowed_data.push(new_borrow);
mut_borrowed_data.push(new_borrow);
}
}
// Does not compile:
// error[E0382]: borrow of moved value: `new_borrow`
// --> src/lib.rs:12:32
// |
// 10 | let new_borrow = unsafe { &mut *ptr };
// | ---------- move occurs because `new_borrow` has type `&mut A`, which does not implement the `Copy` trait
// 11 | mut_borrowed_data.push(new_borrow);
// | ---------- value moved here
// 12 | mut_borrowed_data.push(new_borrow);
// | ^^^^^^^^^^ value borrowed here after move
// However, this does compile, and it is still Undefined Behavior:
'_outer: loop {
mut_borrowed_data.clear();
'_inner: for borrow in &mut owned_data {
let ptr = borrow as *mut A;
let new_borrow = unsafe { &mut *ptr };
mut_borrowed_data.push(new_borrow);
eprintln!("{borrow:?}"); // Use the old `borrow` (requires #[derive(Debug)] on A).
}
}
You can make it a bit safer by shadowing the original borrow, so it can no longer be used:
'_outer: loop {
mut_borrowed_data.clear();
'_inner: for borrow in &mut owned_data {
let borrow = unsafe { &mut *(borrow as *mut A) };
mut_borrowed_data.push(borrow);
}
}
But it is still not perfect. The reason is that since you detach the lifetime, you get an unlimited, essentially 'static reference. This means it can be used longer than allowed, for example:
use std::sync::Mutex;
#[derive(Debug)]
struct A;
static EVIL: Mutex<Option<&'static mut A>> = Mutex::new(None);
fn main() {
let mut owned_data = vec![A, A, A];
let mut mut_borrowed_data = vec![];
'_outer: loop {
if let Some(evil) = EVIL.lock().unwrap().as_deref_mut() {
eprintln!("HaHa! We got two overlapping mutable references! {evil:?}");
}
mut_borrowed_data.clear();
'_inner: for borrow in &mut owned_data {
let borrow = unsafe { &mut *(borrow as *mut A) };
mut_borrowed_data.push(borrow);
}
*EVIL.lock().unwrap() = mut_borrowed_data.pop();
}
}
This does not mean this approach is bad (it is probably what I would use) but you need to be careful.
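If the care needed around the detached lifetime is a concern, one fully safe alternative (not what the answer above proposes, only a sketch under the assumption that indices are acceptable in place of references) is to store positions into the owning Vec. The names Item, rebuild, bump_all and run_frames are hypothetical.

```rust
#[derive(Debug, PartialEq)]
struct Item(u32);

/// Refill the reusable index buffer; `clear` keeps the allocation.
fn rebuild(indices: &mut Vec<usize>, owned: &[Item]) {
    indices.clear();
    indices.extend(0..owned.len());
}

/// Mutate the owned data through the stored indices.
fn bump_all(indices: &[usize], owned: &mut [Item]) {
    for &i in indices {
        owned[i].0 += 1;
    }
}

fn run_frames(frames: usize) -> Vec<Item> {
    let mut owned = vec![Item(0), Item(10), Item(20)];
    let mut indices = Vec::new();
    for _ in 0..frames {
        rebuild(&mut indices, &owned);
        bump_all(&indices, &mut owned);
    }
    owned
}

fn main() {
    println!("{:?}", run_frames(3));
}
```

The trade-off is a bounds check on each indexed access, which may or may not matter for the performance goals stated in the question.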
Why is it allowed to do something like this:
fn main() {
let mut w = MyStruct;
w.fun1();
}
struct MyStruct;
impl MyStruct {
fn fun1(&mut self) {
self.fun2();
}
fn fun2(&mut self) {
println!("Hello world 2");
}
}
In the above code fun1() receives &mut MyStruct and calls fun2(), which also takes &mut MyStruct. Isn't that a double mutable reference in one scope?
This is allowed because the borrow checker can conclude there is only one mutable reference being accessed during execution. While fun2 is running, no other statement in fun1 is being executed. When the next statement in fun1 (if there was any) starts executing, fun2 has already dropped its mutable reference.
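The same idea can be shown by writing the reborrow out by hand. This is a small sketch with a hypothetical Counter type: the method call on self is roughly equivalent to passing &mut *self, a temporary reborrow that ends when the callee returns.

```rust
struct Counter {
    n: u32,
}

impl Counter {
    fn outer(&mut self) -> u32 {
        self.inner();               // roughly equivalent to the next line
        Counter::inner(&mut *self); // explicit reborrow, over once `inner` returns
        self.n                      // `self` is fully usable again here
    }

    fn inner(&mut self) {
        self.n += 1;
    }
}

fn main() {
    let mut c = Counter { n: 0 };
    println!("{}", c.outer());
}
```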
In the other question linked:
fn main() {
let mut x1 = String::from("hello");
let r1 = &mut x1;
let r2 = &mut x1;
r1.insert(0, 'w');
}
We can say r2 is never used, but the borrow checker still rejects it. Consider this example:
fn main() {
let mut x1 = String::from("hello");
let r1 = &mut x1;
r1.insert(0, 'w');
let r2 = &mut x1;
r2.insert(0, 'x');
}
This compiles and runs correctly. I suppose the borrow checker concludes that the lifetime of r1 ends before r2 is created. If this makes sense, calling methods that mutate self shouldn't be so surprising.
(The first snippet is rejected because r1 is still used after r2 is created, so the two mutable borrows overlap. I am glad the Rust team made it that way; r2 should not be there anyway.)
I have this code:
struct Foo<'a> {
link: &'a i32,
}
fn main() {
let mut x = 33;
println!("x:{}", x);
let ff = Foo { link: &x };
x = 22;
}
Which generates this compiler error:
error[E0506]: cannot assign to `x` because it is borrowed
--> src/main.rs:9:5
|
8 | let ff = Foo { link: &x };
| - borrow of `x` occurs here
9 | x = 22;
| ^^^^^^ assignment to borrowed `x` occurs here
The Rust book has only two rules:
one or more references (&T) to a resource,
exactly one mutable reference (&mut T).
I have one mutable variable and one immutable link. Why does the compiler give an error?
The Rust Programming Language defines the rules of references:
At any given time, you can have either one mutable reference or any number of immutable references.
References must always be valid.
Reassigning a variable implicitly requires a mutable reference:
fn main() {
let mut x = 33;
let link = &x;
x = 22;
*(&mut x) = 22; // Basically the same thing
}
Importantly, reassigning a variable mutates the variable, which would change the value seen through the immutable reference link, and that is disallowed.
Note that the initial assignment of the variable does not require the variable to be mutable:
fn main() {
let x;
// Some other code
x = 42;
}
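Returning to the Foo example from the question: on compilers with non-lexical lifetimes (the 2018 edition and later), a borrow lasts only until the last use of the reference, so the assignment is accepted once ff is no longer needed. The sketch below uses a hypothetical function read_then_assign to show this.

```rust
struct Foo<'a> {
    link: &'a i32,
}

fn read_then_assign() -> (i32, i32) {
    let mut x = 33;
    let ff = Foo { link: &x };
    let seen = *ff.link; // last use of `ff`, so the borrow of `x` ends here
    x = 22;              // accepted: no live reference to `x` remains
    (seen, x)
}

fn main() {
    println!("{:?}", read_then_assign());
}
```

Using ff again after the assignment to x would bring back error E0506.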
I have a function f that accepts two references, one mut and one not mut. I have values for f inside a HashMap:
use std::collections::HashMap;
fn f(a: &i32, b: &mut i32) {}
fn main() {
let mut map = HashMap::new();
map.insert("1", 1);
map.insert("2", 2);
{
let a: &i32 = map.get("1").unwrap();
println!("a: {}", a);
let b: &mut i32 = map.get_mut("2").unwrap();
println!("b: {}", b);
*b = 5;
}
println!("Results: {:?}", map)
}
This doesn't work because HashMap::get and HashMap::get_mut attempt to mutably borrow and immutably borrow at the same time:
error[E0502]: cannot borrow `map` as mutable because it is also borrowed as immutable
--> src/main.rs:15:27
|
12 | let a: &i32 = map.get("1").unwrap();
| --- immutable borrow occurs here
...
15 | let b: &mut i32 = map.get_mut("2").unwrap();
| ^^^ mutable borrow occurs here
...
18 | }
| - immutable borrow ends here
In my real code I'm using a large, complex structure instead of an i32, so it is not a good idea to clone it.
In fact, I'm borrowing two different things mutably/immutably, like:
struct HashMap {
a: i32,
b: i32,
}
let mut map = HashMap { a: 1, b: 2 };
let a = &map.a;
let b = &mut map.b;
Is there any way to explain to the compiler that this is actually safe code?
I see how it is possible to solve this in the concrete case with iter_mut:
{
    // Option avoids mem::uninitialized(), which is undefined behavior for references.
    let mut a: Option<&i32> = None;
    let mut b: Option<&mut i32> = None;
    for (k, v) in &mut map {
        match *k {
            "1" => a = Some(&*v),
            "2" => b = Some(v),
            _ => {}
        }
    }
    // Panics if either key was not found.
    f(a.unwrap(), b.unwrap());
}
But this is slow in comparison with HashMap::get/get_mut.
TL;DR: You will need to change the type of HashMap
When using a method, the compiler does not inspect the interior of a method, or perform any runtime simulation: it only bases its ownership/borrow-checking analysis on the signature of the method.
In your case, this means that:
using get will borrow the entire HashMap for as long as the reference lives,
using get_mut will mutably borrow the entire HashMap for as long as the reference lives.
And therefore, it is not possible with a HashMap<K, V> to obtain both a &V and &mut V at the same time.
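A small sketch of the signature-only analysis, using a hypothetical Store type in place of HashMap. The two methods touch different fields, yet the borrow checker treats each call as a borrow of the whole value because that is all the signatures say.

```rust
struct Store {
    first: i32,
    second: i32,
}

impl Store {
    // Elided form of `fn first<'a>(&'a self) -> &'a i32`:
    // the result keeps all of `self` borrowed, not just one field.
    fn first(&self) -> &i32 {
        &self.first
    }

    fn second_mut(&mut self) -> &mut i32 {
        &mut self.second
    }
}

fn add_first_into_second() -> i32 {
    let mut store = Store { first: 1, second: 2 };
    let first = *store.first(); // copy the value out; the shared borrow ends here
    let second = store.second_mut();
    // Uncommenting the next line fails with E0502, even though the two
    // methods touch different fields: only the signatures are consulted.
    // let first_ref = store.first();
    *second += first;
    store.second
}

fn main() {
    println!("{}", add_first_into_second());
}
```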
The work-around, therefore, is to avoid the need for a &mut V entirely.
This can be accomplished by using Cell or RefCell:
Turn your HashMap into HashMap<K, RefCell<V>>,
Use get in both cases,
Use borrow() to get a reference and borrow_mut() to get a mutable reference.
use std::{cell::RefCell, collections::HashMap};
fn main() {
let mut map = HashMap::new();
map.insert("1", RefCell::new(1));
map.insert("2", RefCell::new(2));
{
let a = map.get("1").unwrap();
println!("a: {}", a.borrow());
let b = map.get("2").unwrap();
println!("b: {}", b.borrow());
*b.borrow_mut() = 5;
}
println!("Results: {:?}", map);
}
This will add a runtime check each time you call borrow() or borrow_mut(), and will panic if you ever use them incorrectly (for example, if the two keys turn out to be equal, contrary to your expectations, and a borrow() overlaps a borrow_mut() on the same cell).
As for using fields: this works because the compiler can reason about borrowing status on a per-field basis.
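For comparison, here is the field case as a runnable sketch. Pair and add_a_into_b are hypothetical names, and f is given an illustrative body that the question's empty f does not have.

```rust
struct Pair {
    a: i32,
    b: i32,
}

// Illustrative body; the `f` in the question is empty.
fn f(a: &i32, b: &mut i32) {
    *b += *a;
}

fn add_a_into_b() -> i32 {
    let mut pair = Pair { a: 1, b: 2 };
    let a = &pair.a;     // shared borrow of one field
    let b = &mut pair.b; // mutable borrow of a different field: accepted
    f(a, b);
    pair.b
}

fn main() {
    println!("{}", add_a_into_b());
}
```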
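Something has changed since the question was asked: with non-lexical lifetimes (NLL), a borrow ends at the last use of the reference rather than at the end of its scope. In Rust 1.38.0 (possibly earlier), the following compiles and works: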
use std::collections::HashMap;
fn f(a: &i32, b: &mut i32) {}
fn main() {
let mut map = HashMap::new();
map.insert("1", 1);
map.insert("2", 2);
let a: &i32 = map.get("1").unwrap();
println!("a: {}", a);
let b: &mut i32 = map.get_mut("2").unwrap();
println!("b: {}", b);
*b = 5;
println!("Results: {:?}", map)
}
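There is no need for RefCell, nor is there even a need for the inner scope, because a is no longer used by the time get_mut is called. Passing both a and b to f in a single call would still be rejected.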
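When both references really must be alive at once, newer standard libraries offer HashMap::get_disjoint_mut. The sketch below assumes Rust 1.86 or later, where I believe this method is stable; update is a hypothetical function and f has an illustrative body.

```rust
use std::collections::HashMap;

// Illustrative body; the `f` in the question is empty.
fn f(a: &i32, b: &mut i32) {
    *b += *a;
}

fn update() -> i32 {
    let mut map = HashMap::new();
    map.insert("1", 1);
    map.insert("2", 2);

    // Assumes Rust 1.86+. Returns one Option<&mut V> per requested key
    // and panics if the same key is requested twice.
    if let [Some(a), Some(b)] = map.get_disjoint_mut(["1", "2"]) {
        f(a, b); // `a` coerces from `&mut i32` to `&i32`
    }
    map["2"]
}

fn main() {
    println!("{}", update());
}
```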
fn t(x: &mut u8) -> &mut u8 {
x
}
fn main() {
let mut x = 5u8;
let y = & mut x;
let z = t(y);
println!("{}", y);
}
Compiling this gives me this error:
main.rs:9:20: 9:21 error: cannot borrow `y` as immutable because `*y` is also borrowed as mutable
main.rs:9 println!("{}", y);
I would have thought y would have been moved during the call to t and then back to z, resulting in an error: use of moved value
Why do I get this error message instead?
Does Rust automatically create a new borrow instead of passing ownership when references are offered as function parameters?
What is the purpose of this behaviour?
You are returning a mutable reference to the parameter from your function. However, Rust doesn't know that the function hasn't kept a copy of that pointer, or that it didn't return a pointer to a subsection of the value, were it a struct. This means that at any time, the value pointed to might be changed, which is a big no-no in Rust; if it were allowed, then you could easily cause memory errors.
Does Rust automatically create a new borrow
Yes, Rust "re-borrows" references.
A better example requires a smidge more complexity:
struct Thing { a: u8, b: u8 }
fn t(x: &mut Thing) -> &mut u8 {
&mut x.a
}
fn main() {
let mut x = Thing { a: 5, b: 6 };
let z = t(&mut x);
*z = 0;
// x.a = 0; // cannot assign to `x.a` because it is borrowed
}
Here, t returns a mutable pointer to a subset of the struct. This means that the entire struct is borrowed, and we cannot change it (except via z). Rust applies this logic to all functions, and doesn't try to recognize that your t function just returns the same pointer.
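On a current compiler the reborrow can be observed directly. In this sketch (demo is a hypothetical name), y is not moved into t; it is only unavailable while the returned reference z is still in use.

```rust
fn t(x: &mut u8) -> &mut u8 {
    x
}

fn demo() -> u8 {
    let mut x = 5u8;
    let y = &mut x;
    let z = t(y); // implicit reborrow, like `t(&mut *y)`; `y` itself is not moved
    *z += 1;      // last use of `z`
    *y += 1;      // `y` is usable again once `z` is no longer used
    *y
}

fn main() {
    println!("{}", demo());
}
```

Using y while z is still needed later would be rejected, which is the situation in the question.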
By compiling your program with rustc --pretty=expanded, we can see that the println! macro borrows its argument:
#![no_std]
#[macro_use]
extern crate "std" as std;
#[prelude_import]
use std::prelude::v1::*;
fn t(x: &mut u8) -> &mut u8 { x }
fn main() {
let mut x = 5u8;
let y = &mut x;
let z = t(y);
::std::io::stdio::println_args(::std::fmt::Arguments::new({
#[inline]
#[allow(dead_code)]
static __STATIC_FMTSTR:
&'static [&'static str]
=
&[""];
__STATIC_FMTSTR
},
&match (&y,) { // <----- y is borrowed here
(__arg0,)
=>
[::std::fmt::argument(::std::fmt::String::fmt,
__arg0)],
}));
}