Miri complains about UB when initializing and manually dropping MaybeUninit - rust

I am writing code to initialize an array of MaybeUninits and drop all initialized elements in case of a panic. Miri complains about undefined behaviour, which I've reduced to the example below.
use std::mem::{transmute, MaybeUninit};
fn main() {
unsafe {
let mut item: MaybeUninit<String> = MaybeUninit::uninit();
let ptr = item.as_mut_ptr();
item = MaybeUninit::new(String::from("Hello"));
println!("{}", transmute::<_, &String>(&item));
ptr.drop_in_place();
}
}
Error message produced by cargo miri run:
error: Undefined Behavior: trying to reborrow for SharedReadWrite at alloc1336, but parent tag <untagged> does not have an appropriate item in the borrow stack
--> /home/antek/.local/opt/rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/ptr/mod.rs:179:1
|
179 | pub unsafe fn drop_in_place<T: ?Sized>(to_drop: *mut T) {
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ trying to reborrow for SharedReadWrite at alloc1336, but parent tag <untagged> does not have an appropriate item in the borrow stack
|
As far as I can tell, this is exactly how MaybeUninit is supposed to be used.
Am I using the interface incorrectly and invoking undefined behaviour, or is Miri being overly conservative?

Your issue is actually completely unrelated to MaybeUninit and can be boiled down to:
fn main() {
let mut item = 0;
let ptr = &item as *const _;
item = 1;
// or do anything with the pointer
unsafe { &*ptr; }
}
Playground link
Which throws the same "undefined behavior" error when run with Miri. From what I could gather by reading their reference, it seems like Miri has some metadata that keeps track of which pointers are allowed to read an item, and when you reassign that item, the metadata is wiped. However, reassigning values shouldn't change their memory address, so I would say that this is a bug with Miri and not UB.
In fact, changing ptr to a *mut i32 then using ptr.write instead of the assignment actually gets rid of the UB warning, which means it's probably a false positive.

Related

How to return struct in Rust without 'use of moved value' error?

I think the problem I'm running into with this code is that the struct I'm trying to return has variables that are strings.
Currently I am querying an InfluxDB database and storing the result in the result variable. The print line towards the bottom of my function confirms that the index it correctly and is printing
IotData { time: 2021-12-27T14:01:34.404593Z, device_id: "1000", transaction_date: "27-12-2021", transaction_time: "14:01:34", usage: 8 }
How do I return this struct? I added the Clone attribute to the struct and return the clone since I encountered the has not attribute 'Copy' error, but now I am receiving the error, use of moved value: 'result'. I am new to Rust and don't quite understand ownership. My next guess is that I need to add lifetimes to my function and return the reference instead, but am not sure where to go from here.
Code Dump:
use influxdb::{ Client, ReadQuery};
use chrono::{ Utc };
extern crate serde;
extern crate serde_json;
use serde_derive::{ Serialize, Deserialize };
#[derive(Serialize, Deserialize, Debug, Clone)]
struct IotData {
time: chrono::DateTime<Utc>,
device_id: String,
transaction_date: String,
transaction_time: String,
usage: i64,
}
pub async fn pull(last_time: u64) -> IotData {
let client = Client::new("http://localhost:8086", "PuPPY");
let query_text = format!("
SELECT *
FROM device_data
WHERE time > {}
ORDER BY time
LIMIT 1", last_time);
let read_query = ReadQuery::new(query_text);
let result = client
.json_query(read_query)
.await
.and_then(|mut db_result| db_result.deserialize_next::<IotData>());
println!("{:?}", result.unwrap().series[0].values[0]);
return result.unwrap().series[0].values[0].clone();
}
Minimized example yielding the essentially same error:
pub fn pull<T: std::fmt::Debug>(opt: Option<T>) -> T {
println!("{:?}", opt.unwrap());
opt.unwrap()
}
Playground
The error is this:
error[E0382]: use of moved value: `opt`
--> src/lib.rs:3:5
|
1 | pub fn pull<T: std::fmt::Debug>(opt: Option<T>) -> T {
| --- move occurs because `opt` has type `Option<T>`, which does not implement the `Copy` trait
2 | println!("{:?}", opt.unwrap());
| -------- `opt` moved due to this method call
3 | opt.unwrap()
| ^^^ value used here after move
|
note: this function takes ownership of the receiver `self`, which moves `opt`
help: consider calling `.as_ref()` to borrow the type's contents
|
2 | println!("{:?}", opt.as_ref().unwrap());
| +++++++++
(By the way, if you're reading your errors from IDE, you might not see most of this. That's the general advice: confusing compiler errors in IDE are usually more clear in console, where they are not trimmed)
Here, it's easy to see the issue: Option::unwrap consumes self, since that's the only way for it to yield the owned value. So, then we unwrap the incoming Option to print its contents, we destroy it for good - there's nothing more to return.
To fix it, we can do one of the following:
Just follow the compiler suggestion and add as_ref before first unwrap. In this case, we will consume not the Option<T>, but only the shared reference to it, which can be freely used without touching the value.
Remove the println!, since it seems to be purely debugging information anyway.
Unwrap the Option once, and then use the unwrapped value twice:
pub async fn pull(last_time: u64) -> IotData {
// --snip--
let value = result.unwrap().series[0].values[0];
println!("{:?}", value);
return value;
}
Note also that you don't really need clone in any case - again, Option::unwrap returns an owned value, so you don't have to clone it explicitly.

converting `MaybeUninit<T>` to `T`, but getting error E0382

I'm trying to reproduce the code suggested in the MaybeUninit docs. Specifically, it seems to work with specific datatypes, but produces a compiler error on generic types.
Working example (with u32)
use std::mem::{self, MaybeUninit};
fn init_array(t: u32) -> [u32; 1000] {
// Create an uninitialized array of `MaybeUninit`. The `assume_init` is
// safe because the type we are claiming to have initialized here is a
// bunch of `MaybeUninit`s, which do not require initialization.
let mut data: [MaybeUninit<u32>; 1000] = unsafe { MaybeUninit::uninit().assume_init() };
// Dropping a `MaybeUninit` does nothing. Thus using raw pointer
// assignment instead of `ptr::write` does not cause the old
// uninitialized value to be dropped. Also if there is a panic during
// this loop, we have a memory leak, but there is no memory safety
// issue.
for elem in &mut data[..] {
elem.write(t);
}
// Everything is initialized. Transmute the array to the
// initialized type.
unsafe { mem::transmute::<_, [u32; 1000]>(data) }
}
fn main() {
let data = init_array(42);
assert_eq!(&data[0], &42);
}
Failing example (with generic T)
use std::mem::{self, MaybeUninit};
fn init_array<T: Copy>(t: T) -> [T; 1000] {
// Create an uninitialized array of `MaybeUninit`. The `assume_init` is
// safe because the type we are claiming to have initialized here is a
// bunch of `MaybeUninit`s, which do not require initialization.
let mut data: [MaybeUninit<T>; 1000] = unsafe { MaybeUninit::uninit().assume_init() };
// Dropping a `MaybeUninit` does nothing. Thus using raw pointer
// assignment instead of `ptr::write` does not cause the old
// uninitialized value to be dropped. Also if there is a panic during
// this loop, we have a memory leak, but there is no memory safety
// issue.
for elem in &mut data[..] {
elem.write(t);
}
// Everything is initialized. Transmute the array to the
// initialized type.
unsafe { mem::transmute::<_, [T; 1000]>(data) }
}
fn main() {
let data = init_array(42);
assert_eq!(&data[0], &42);
}
error:
Compiling playground v0.0.1 (/playground)
error[E0512]: cannot transmute between types of different sizes, or dependently-sized types
--> src/main.rs:20:14
|
20 | unsafe { mem::transmute::<_, [T; 1000]>(data) }
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
= note: source type: `[MaybeUninit<T>; 1000]` (size can vary because of T)
= note: target type: `[T; 1000]` (size can vary because of T)
For more information about this error, try `rustc --explain E0512`.
error: could not compile `playground` due to previous error
Playground link here
Questions
why is the second example failing? (I thought MaybeUninit<T> could always be transmuted into a T because they'd be guaranteed to have the same memory layout.)
can the example be rewritten to work with generic types?
This is a known issue (related), you can fix the code using the tips of HadrienG2 by doing a more unsafe unsafe thing:
// Everything is initialized. Transmute the array to the
// initialized type.
let ptr = &mut data as *mut _ as *mut [T; 1000];
let res = unsafe { ptr.read() };
core::mem::forget(data);
res
In future we expect to be able to use array_assume_init().

Can I have a mutable reference to a type and its trait object in the same scope? [duplicate]

Why can I have multiple mutable references to a static type in the same scope?
My code:
static mut CURSOR: Option<B> = None;
struct B {
pub field: u16,
}
impl B {
pub fn new(value: u16) -> B {
B { field: value }
}
}
struct A;
impl A {
pub fn get_b(&mut self) -> &'static mut B {
unsafe {
match CURSOR {
Some(ref mut cursor) => cursor,
None => {
CURSOR= Some(B::new(10));
self.get_b()
}
}
}
}
}
fn main() {
// first creation of A, get a mutable reference to b and change its field.
let mut a = A {};
let mut b = a.get_b();
b.field = 15;
println!("{}", b.field);
// second creation of A, a the mutable reference to b and change its field.
let mut a_1 = A {};
let mut b_1 = a_1.get_b();
b_1.field = 16;
println!("{}", b_1.field);
// Third creation of A, get a mutable reference to b and change its field.
let mut a_2 = A {};
let b_2 = a_2.get_b();
b_2.field = 17;
println!("{}", b_1.field);
// now I can change them all
b.field = 1;
b_1.field = 2;
b_2.field = 3;
}
I am aware of the borrowing rules
one or more references (&T) to a resource,
exactly one mutable reference (&mut T).
In the above code, I have a struct A with the get_b() method for returning a mutable reference to B. With this reference, I can mutate the fields of struct B.
The strange thing is that more than one mutable reference can be created in the same scope (b, b_1, b_2) and I can use all of them to modify B.
Why can I have multiple mutable references with the 'static lifetime shown in main()?
My attempt at explaining this is behavior is that because I am returning a mutable reference with a 'static lifetime. Every time I call get_b() it is returning the same mutable reference. And at the end, it is just one identical reference. Is this thought right? Why am I able to use all of the mutable references got from get_b() individually?
There is only one reason for this: you have lied to the compiler. You are misusing unsafe code and have violated Rust's core tenet about mutable aliasing. You state that you are aware of the borrowing rules, but then you go out of your way to break them!
unsafe code gives you a small set of extra abilities, but in exchange you are now responsible for avoiding every possible kind of undefined behavior. Multiple mutable aliases are undefined behavior.
The fact that there's a static involved is completely orthogonal to the problem. You can create multiple mutable references to anything (or nothing) with whatever lifetime you care about:
fn foo() -> (&'static i32, &'static i32, &'static i32) {
let somewhere = 0x42 as *mut i32;
unsafe { (&*somewhere, &*somewhere, &*somewhere) }
}
In your original code, you state that calling get_b is safe for anyone to do any number of times. This is not true. The entire function should be marked unsafe, along with copious documentation about what is and is not allowed to prevent triggering unsafety. Any unsafe block should then have corresponding comments explaining why that specific usage doesn't break the rules needed. All of this makes creating and using unsafe code more tedious than safe code, but compared to C where every line of code is conceptually unsafe, it's still a lot better.
You should only use unsafe code when you know better than the compiler. For most people in most cases, there is very little reason to create unsafe code.
A concrete reminder from the Firefox developers:

Why can't I reuse a &mut reference after passing it to a function that accepts a generic type?

Why doesn't this code compile:
fn use_cursor(cursor: &mut io::Cursor<&mut Vec<u8>>) {
// do some work
}
fn take_reference(data: &mut Vec<u8>) {
{
let mut buf = io::Cursor::new(data);
use_cursor(&mut buf);
}
data.len();
}
fn produce_data() {
let mut data = Vec::new();
take_reference(&mut data);
data.len();
}
The error in this case is:
error[E0382]: use of moved value: `*data`
--> src/main.rs:14:5
|
9 | let mut buf = io::Cursor::new(data);
| ---- value moved here
...
14 | data.len();
| ^^^^ value used here after move
|
= note: move occurs because `data` has type `&mut std::vec::Vec<u8>`, which does not implement the `Copy` trait
The signature of io::Cursor::new is such that it takes ownership of its argument. In this case, the argument is a mutable reference to a Vec.
pub fn new(inner: T) -> Cursor<T>
It sort of makes sense to me; because Cursor::new takes ownership of its argument (and not a reference) we can't use that value later on. At the same time it doesn't make sense: we essentially only pass a mutable reference and the cursor goes out of scope afterwards anyway.
In the produce_data function we also pass a mutable reference to take_reference, and it doesn't produce a error when trying to use data again, unlike inside take_reference.
I found it possible to 'reclaim' the reference by using Cursor.into_inner(), but it feels a bit weird to do it manually, since in normal use-cases the borrow-checker is perfectly capable of doing it itself.
Is there a nicer solution to this problem than using .into_inner()? Maybe there's something else I don't understand about the borrow-checker?
Normally, when you pass a mutable reference to a function, the compiler implicitly performs a reborrow. This produces a new borrow with a shorter lifetime.
When the parameter is generic (and is not of the form &mut T), the compiler doesn't do this reborrowing automatically1. However, you can do it manually by dereferencing your existing mutable reference and then referencing it again:
fn take_reference(data: &mut Vec<u8>) {
{
let mut buf = io::Cursor::new(&mut *data);
use_cursor(&mut buf);
}
data.len();
}
1 — This is because the current compiler architecture only allows a chance to do a coercion if both the source and target types are known at the coercion site.

Why can't I call a mutable method (and store the result) twice in a row?

In the following example:
struct SimpleMemoryBank {
vec: Vec<Box<i32>>,
}
impl SimpleMemoryBank {
fn new() -> SimpleMemoryBank {
SimpleMemoryBank{ vec: Vec::new() }
}
fn add(&mut self, value: i32) -> &mut i32 {
self.vec.push(Box::new(value));
let last = self.vec.len() - 1;
&mut *self.vec[last]
}
}
fn main() {
let mut foo = SimpleMemoryBank::new();
// Works okay
foo.add(1);
foo.add(2);
// Doesn't work: "cannot borrow `foo` as mutable more than once at a time"
let one = foo.add(1);
let two = foo.add(2);
}
add() can be called multiple times in a row, as long as I don't store the result of the function call. But if I store the result of the function (let one = ...), I then get the error:
problem.rs:26:15: 26:18 error: cannot borrow `foo` as mutable more than once at a time
problem.rs:26 let two = foo.add(2);
^~~
problem.rs:25:15: 25:18 note: previous borrow of `foo` occurs here; the mutable borrow prevents subsequent moves, borrows, or modification of `foo` until the borrow ends
problem.rs:25 let one = foo.add(1);
^~~
problem.rs:27:2: 27:2 note: previous borrow ends here
problem.rs:17 fn main() {
...
problem.rs:27 }
^
error: aborting due to previous error
Is this a manifestation of issue #6393: borrow scopes should not always be lexical?
How can I work around this? Essentially, I want to add a new Box to the vector, and then return a reference to it (so the caller can use it).
This is exactly the problem that Rust is designed to prevent you from causing. What would happen if you did:
let one = foo.add(1);
foo.vec.clear();
println!("{}", one);
Or what if foo.add worked by pushing the new value at the beginning of the vector? Bad things would happen! The main thing is that while you have a borrow out on a variable, you cannot mutate the variable any more. If you were able to mutate it, then you could potentially invalidate the memory underlying the borrow and then your program could do a number of things, the best case would be that it crashes.
Is this a manifestation of issue #6393: borrow scopes should not always be lexical?
Kind of, but not really. In this example, you never use one or two, so theoretically a non-lexical scope would allow it to compile. However, you then state
I want to add a new Box to the vector, and then return a reference to it (so the caller can use it)
Which means your real code wants to be
let one = foo.add(1);
let two = foo.add(2);
do_something(one);
do_something(two);
So the lifetimes of the variables would overlap.
In this case, if you just want a place to store variables that can't be deallocated individually, don't overlap with each other, and cannot be moved, try using a TypedArena:
extern crate arena;
use arena::TypedArena;
fn main() {
let arena = TypedArena::new();
let one = arena.alloc(1);
let two = arena.alloc(2);
*one = 3;
println!("{}, {}", one, two);
}

Resources