Unexpected segfault when working with raw pointers - rust

Initially I wrote a Heap implementation in Rust, but I was getting strange segfaults, so I narrowed down the code to this example, which reproduces the behavior.
use core::fmt::Debug;
pub struct Node<T: Ord + Debug> {
    pub value: *const T,
}
pub struct MaxHeap<T: Ord + Debug> {
    pub root: *const Node<T>,
}
impl<T: Ord + Debug> MaxHeap<T> {
    pub fn push(&mut self, value: *const T) {
        self.root = &mut Node { value: value };
    }
}
fn main() {
    let a = 124i64;
    let b = 124i64;
    let c = 1i64;
    let mut heap = MaxHeap {
        root: &mut Node { value: &a },
    };
    heap.push(&b);
    println!("{:?}", &c);
    unsafe {
        println!("{:?}", *(*heap.root).value);
    }
}
Playground.
The result I get from this is:
1
Segmentation fault (core dumped)
The interesting thing (to me) is that if I remove the print of c, there is no segfault and the correct value is printed for the heap root.
124
I expect that anything happening with c can't affect heap, but it does. What am I missing?

You've got a use-after-free. In push(), you assign the address of a temporary to self.root. The temporary is dropped at the end of that statement, so self.root is left pointing to freed memory. Any further use of it is undefined behavior.
Miri reports it (Tools->Miri in the playground):
error: Undefined Behavior: pointer to alloc1786 was dereferenced after this allocation got freed
--> src/main.rs:29:26
|
29 | println!("{:?}", *(*heap.root).value);
| ^^^^^^^^^^^^^^^^^^^ pointer to alloc1786 was dereferenced after this allocation got freed
|
= help: this indicates a bug in the program: it performed an invalid operation, and caused Undefined Behavior
= help: see https://doc.rust-lang.org/nightly/reference/behavior-considered-undefined.html for further information
= note: inside `main` at src/main.rs:29:26
= note: this error originates in the macro `$crate::format_args_nl` (in Nightly builds, run with -Z macro-backtrace for more info)
note: some details are omitted, run with `MIRIFLAGS=-Zmiri-backtrace=full` for a verbose backtrace
Since you've got UB, the program can do anything, and any change may affect what it does, even if it seems unrelated.
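One way to avoid the dangling pointer is to let the heap own its nodes. The following is a minimal sketch, not the asker's full heap: it assumes the node can own its value, drops the trait bounds for brevity, and adds a hypothetical peek helper that is not part of the original code. If raw pointers are really required, Box::into_raw paired with Box::from_raw is the usual way to get a pointer that stays valid past the end of the statement.

```rust
pub struct Node<T> {
    pub value: T,
}

pub struct MaxHeap<T> {
    // Owning pointer: the node lives on the heap until the MaxHeap is
    // dropped or the root is replaced.
    pub root: Option<Box<Node<T>>>,
}

impl<T> MaxHeap<T> {
    pub fn push(&mut self, value: T) {
        // Box::new allocates, so the node outlives this statement.
        self.root = Some(Box::new(Node { value }));
    }

    // Hypothetical helper (not in the original code): borrow the root value.
    pub fn peek(&self) -> Option<&T> {
        self.root.as_ref().map(|node| &node.value)
    }
}
```

With this layout, printing an unrelated local between push and peek has no effect on the heap, because nothing points into a dead stack slot.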

Related

Is it legal Rust to cast a pointer to a struct's first member to a pointer to the struct?

In C, a pointer to a struct can be cast to a pointer to its first member, and vice-versa. That is, the address of a struct is defined to be the address of its first member.
#include <stdio.h>
struct Base { int x; };
struct Derived { struct Base base; int y; };
int main() {
    struct Derived d = { {5}, 10 };
    struct Base *base = &d.base; // OK
    printf("%d\n", base->x);
    struct Derived *derived = (struct Derived *)base; // OK
    printf("%d\n", derived->y);
}
This is commonly used to implement C++-style inheritance.
Is the same thing allowed in Rust if the structs are repr(C) (so that their fields aren't reorganized)?
#[derive(Debug)]
#[repr(C)]
struct Base {
    x: usize,
}
#[derive(Debug)]
#[repr(C)]
struct Derived {
    base: Base,
    y: usize,
}
// safety: `base` should be a reference to `Derived::base`, otherwise this is UB
unsafe fn get_derived_from_base(base: &Base) -> &Derived {
    let ptr = base as *const Base as *const Derived;
    &*ptr
}
fn main() {
    let d = Derived {
        base: Base {
            x: 5
        },
        y: 10,
    };
    let base = &d.base;
    println!("{:?}", base);
    let derived = unsafe { get_derived_from_base(base) }; // defined behaviour?
    println!("{:?}", derived);
}
The code works, but will it always work, and is it defined behaviour?
The way you wrote it, currently not; but it is possible to make it work.
A reference to T is only allowed to access T and nothing more (it has provenance for T only). The expression &d.base gives you a reference that is valid only for Base, so using it to access Derived's other fields is undefined behavior. It is not clear that this is what we want, and there is active discussion about that (also this), but that is the current behavior. There is a good tool named Miri that checks your Rust code for the presence of some (not all!) undefined behavior (you can run it in the playground; Tools->Miri), and indeed it flags your code:
error: Undefined Behavior: trying to reborrow <untagged> for SharedReadOnly permission at alloc1707[0x8], but that tag does not exist in the borrow stack for this location
--> src/main.rs:17:5
|
17 | &*ptr
| ^^^^^
| |
| trying to reborrow <untagged> for SharedReadOnly permission at alloc1707[0x8], but that tag does not exist in the borrow stack for this location
| this error occurs as part of a reborrow at alloc1707[0x0..0x10]
|
= help: this indicates a potential bug in the program: it performed an invalid operation, but the rules it violated are still experimental
= help: see https://github.com/rust-lang/unsafe-code-guidelines/blob/master/wip/stacked-borrows.md for further information
= note: inside `get_derived_from_base` at src/main.rs:17:5
note: inside `main` at src/main.rs:31:28
--> src/main.rs:31:28
|
31 | let derived = unsafe { get_derived_from_base(base) }; // defined behaviour?
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^
You can make it work by creating a reference to the whole Derived and casting it to a raw pointer to Base. The raw pointer will keep the provenance of the original reference, and thus that will work:
// safety: `base` should be a reference to `Derived`, otherwise this is UB
unsafe fn get_derived_from_base<'a>(base: *const Base) -> &'a Derived {
    let ptr = base as *const Derived;
    &*ptr
}
fn main() {
    let d = Derived {
        base: Base {
            x: 5
        },
        y: 10,
    };
    let base = &d as *const Derived as *const Base;
    println!("{:?}", unsafe { &*base });
    let derived = unsafe { get_derived_from_base(base) };
    println!("{:?}", derived);
}
Note: References should not be involved at all in the process. If you reborrow base as a reference of type &Base and derive the pointer from that, you lose the provenance again. Such code passes Miri on the playground, but it is still undefined behavior per the current rules and fails Miri with stricter flags (set the environment variable MIRIFLAGS to -Zmiri-tag-raw-pointers before running Miri locally).

How to return struct in Rust without 'use of moved value' error?

I think the problem I'm running into with this code is that the struct I'm trying to return has variables that are strings.
Currently I am querying an InfluxDB database and storing the result in the result variable. The print line towards the bottom of my function confirms that the indexing is correct, and it prints
IotData { time: 2021-12-27T14:01:34.404593Z, device_id: "1000", transaction_date: "27-12-2021", transaction_time: "14:01:34", usage: 8 }
How do I return this struct? I added the Clone derive to the struct and return the clone, because I ran into the "does not implement Copy" error, but now I am receiving the error use of moved value: 'result'. I am new to Rust and don't quite understand ownership. My next guess is that I need to add lifetimes to my function and return a reference instead, but I am not sure where to go from here.
Code Dump:
use influxdb::{ Client, ReadQuery};
use chrono::{ Utc };
extern crate serde;
extern crate serde_json;
use serde_derive::{ Serialize, Deserialize };
#[derive(Serialize, Deserialize, Debug, Clone)]
struct IotData {
time: chrono::DateTime<Utc>,
device_id: String,
transaction_date: String,
transaction_time: String,
usage: i64,
}
pub async fn pull(last_time: u64) -> IotData {
let client = Client::new("http://localhost:8086", "PuPPY");
let query_text = format!("
SELECT *
FROM device_data
WHERE time > {}
ORDER BY time
LIMIT 1", last_time);
let read_query = ReadQuery::new(query_text);
let result = client
.json_query(read_query)
.await
.and_then(|mut db_result| db_result.deserialize_next::<IotData>());
println!("{:?}", result.unwrap().series[0].values[0]);
return result.unwrap().series[0].values[0].clone();
}
Minimized example yielding essentially the same error:
pub fn pull<T: std::fmt::Debug>(opt: Option<T>) -> T {
    println!("{:?}", opt.unwrap());
    opt.unwrap()
}
Playground
The error is this:
error[E0382]: use of moved value: `opt`
--> src/lib.rs:3:5
|
1 | pub fn pull<T: std::fmt::Debug>(opt: Option<T>) -> T {
| --- move occurs because `opt` has type `Option<T>`, which does not implement the `Copy` trait
2 | println!("{:?}", opt.unwrap());
| -------- `opt` moved due to this method call
3 | opt.unwrap()
| ^^^ value used here after move
|
note: this function takes ownership of the receiver `self`, which moves `opt`
help: consider calling `.as_ref()` to borrow the type's contents
|
2 | println!("{:?}", opt.as_ref().unwrap());
| +++++++++
(By the way, if you're reading your errors in an IDE, you might not see most of this. As general advice: compiler errors that look confusing in an IDE are usually clearer in the console, where they are not trimmed.)
Here, it's easy to see the issue: Option::unwrap consumes self, since that's the only way for it to yield the owned value. So, when we unwrap the incoming Option to print its contents, we destroy it for good, and there's nothing left to return.
To fix it, we can do one of the following:
Just follow the compiler suggestion and add as_ref before first unwrap. In this case, we will consume not the Option<T>, but only the shared reference to it, which can be freely used without touching the value.
Remove the println!, since it seems to be purely debugging information anyway.
Unwrap the Option once, and then use the unwrapped value twice:
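pub async fn pull(last_time: u64) -> IotData {
    // --snip--
    // Unwrap once. An index expression cannot move an element out of the
    // collection, so the element itself is cloned here.
    let value = result.unwrap().series[0].values[0].clone();
    println!("{:?}", value);
    return value;
}
Note also that you don't need to clone the wrapper itself in any case: again, unwrap returns an owned value. The one clone that remains is on the indexed element, because a non-Copy element cannot be moved out through an index expression.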
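The first and third fixes listed above, applied to the minimized example, might look like the sketch below. The function names pull_as_ref and pull_once are made up for illustration.

```rust
use std::fmt::Debug;

// First fix: borrow the contents for printing, then consume the Option.
pub fn pull_as_ref<T: Debug>(opt: Option<T>) -> T {
    // as_ref turns &Option<T> into Option<&T>, so this unwrap only
    // consumes the temporary Option<&T>, not `opt` itself.
    println!("{:?}", opt.as_ref().unwrap());
    opt.unwrap()
}

// Third fix: unwrap once, print through a borrow, then return the value.
pub fn pull_once<T: Debug>(opt: Option<T>) -> T {
    let value = opt.unwrap();
    // println! only borrows `value`, so it can still be returned afterwards.
    println!("{:?}", value);
    value
}
```

Both versions work for non-Copy types such as String, and neither needs a clone.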

converting `MaybeUninit<T>` to `T`, but getting error E0382

I'm trying to reproduce the code suggested in the MaybeUninit docs. Specifically, it seems to work with specific datatypes, but produces a compiler error on generic types.
Working example (with u32)
use std::mem::{self, MaybeUninit};
fn init_array(t: u32) -> [u32; 1000] {
// Create an uninitialized array of `MaybeUninit`. The `assume_init` is
// safe because the type we are claiming to have initialized here is a
// bunch of `MaybeUninit`s, which do not require initialization.
let mut data: [MaybeUninit<u32>; 1000] = unsafe { MaybeUninit::uninit().assume_init() };
// Dropping a `MaybeUninit` does nothing. Thus using raw pointer
// assignment instead of `ptr::write` does not cause the old
// uninitialized value to be dropped. Also if there is a panic during
// this loop, we have a memory leak, but there is no memory safety
// issue.
for elem in &mut data[..] {
elem.write(t);
}
// Everything is initialized. Transmute the array to the
// initialized type.
unsafe { mem::transmute::<_, [u32; 1000]>(data) }
}
fn main() {
let data = init_array(42);
assert_eq!(&data[0], &42);
}
Failing example (with generic T)
use std::mem::{self, MaybeUninit};
fn init_array<T: Copy>(t: T) -> [T; 1000] {
// Create an uninitialized array of `MaybeUninit`. The `assume_init` is
// safe because the type we are claiming to have initialized here is a
// bunch of `MaybeUninit`s, which do not require initialization.
let mut data: [MaybeUninit<T>; 1000] = unsafe { MaybeUninit::uninit().assume_init() };
// Dropping a `MaybeUninit` does nothing. Thus using raw pointer
// assignment instead of `ptr::write` does not cause the old
// uninitialized value to be dropped. Also if there is a panic during
// this loop, we have a memory leak, but there is no memory safety
// issue.
for elem in &mut data[..] {
elem.write(t);
}
// Everything is initialized. Transmute the array to the
// initialized type.
unsafe { mem::transmute::<_, [T; 1000]>(data) }
}
fn main() {
let data = init_array(42);
assert_eq!(&data[0], &42);
}
error:
Compiling playground v0.0.1 (/playground)
error[E0512]: cannot transmute between types of different sizes, or dependently-sized types
--> src/main.rs:20:14
|
20 | unsafe { mem::transmute::<_, [T; 1000]>(data) }
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
= note: source type: `[MaybeUninit<T>; 1000]` (size can vary because of T)
= note: target type: `[T; 1000]` (size can vary because of T)
For more information about this error, try `rustc --explain E0512`.
error: could not compile `playground` due to previous error
Playground link here
Questions
why is the second example failing? (I thought MaybeUninit<T> could always be transmuted into a T because they'd be guaranteed to have the same memory layout.)
can the example be rewritten to work with generic types?
This is a known issue (related), you can fix the code using the tips of HadrienG2 by doing a more unsafe unsafe thing:
// Everything is initialized. Transmute the array to the
// initialized type.
let ptr = &mut data as *mut _ as *mut [T; 1000];
let res = unsafe { ptr.read() };
core::mem::forget(data);
res
In the future, we expect to be able to use array_assume_init() instead.
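Put together, the generic function could look like the following sketch. It reads the array through a cast pointer instead of calling transmute, and it relies on the documented guarantee that MaybeUninit<T> has the same size and alignment as T. Since MaybeUninit never drops its contents, the mem::forget call from the snippet above is left out here.

```rust
use std::mem::MaybeUninit;

fn init_array<T: Copy>(t: T) -> [T; 1000] {
    // An array of `MaybeUninit` needs no initialization, so `assume_init`
    // on the outer `MaybeUninit` is fine here.
    let mut data: [MaybeUninit<T>; 1000] = unsafe { MaybeUninit::uninit().assume_init() };

    for elem in &mut data[..] {
        elem.write(t);
    }

    // SAFETY: every element was written in the loop above, and
    // `[MaybeUninit<T>; 1000]` has the same layout as `[T; 1000]`.
    unsafe { (&data as *const [MaybeUninit<T>; 1000] as *const [T; 1000]).read() }
}
```

Calling init_array(42) then yields an array whose elements all equal 42, as in the original main.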

Miri complains about UB when initializing and manually dropping MaybeUninit

I am writing code to initialize an array of MaybeUninits and drop all initialized elements in case of a panic. Miri complains about undefined behaviour, which I've reduced to the example below.
use std::mem::{transmute, MaybeUninit};
fn main() {
    unsafe {
        let mut item: MaybeUninit<String> = MaybeUninit::uninit();
        let ptr = item.as_mut_ptr();
        item = MaybeUninit::new(String::from("Hello"));
        println!("{}", transmute::<_, &String>(&item));
        ptr.drop_in_place();
    }
}
Error message produced by cargo miri run:
error: Undefined Behavior: trying to reborrow for SharedReadWrite at alloc1336, but parent tag <untagged> does not have an appropriate item in the borrow stack
--> /home/antek/.local/opt/rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/ptr/mod.rs:179:1
|
179 | pub unsafe fn drop_in_place<T: ?Sized>(to_drop: *mut T) {
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ trying to reborrow for SharedReadWrite at alloc1336, but parent tag <untagged> does not have an appropriate item in the borrow stack
|
As far as I can tell, this is exactly how MaybeUninit is supposed to be used.
Am I using the interface incorrectly and invoking undefined behaviour, or is Miri being overly conservative?
Your issue is actually completely unrelated to MaybeUninit and can be boiled down to:
fn main() {
    let mut item = 0;
    let ptr = &item as *const _;
    item = 1;
    // or do anything with the pointer
    unsafe { &*ptr; }
}
Playground link
This throws the same "undefined behavior" error when run with Miri. From what I could gather by reading their reference, Miri keeps metadata that tracks which pointers are allowed to read an item, and when you reassign that item, the metadata is wiped. However, reassigning a value shouldn't change its memory address, so I would say that this is a bug in Miri and not UB.
In fact, changing ptr to a *mut i32 then using ptr.write instead of the assignment actually gets rid of the UB warning, which means it's probably a false positive.
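The variant described in the last paragraph, with the pointer changed to *mut i32 and the assignment replaced by ptr.write, can be sketched as follows. The function name is made up for illustration, and wrapping the code in a function that returns the value is an addition to make the result observable.

```rust
fn write_through_raw_pointer() -> i32 {
    let mut item = 0;
    // Derive the raw pointer from a mutable borrow so it may be used for writes.
    let ptr = &mut item as *mut i32;
    unsafe {
        // Write through the pointer instead of assigning to `item` directly.
        ptr.write(1);
        // Read back through the same pointer.
        ptr.read()
    }
}
```

Here every access after the pointer is created goes through ptr itself, so the variable is never written behind the pointer's back.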

Storing a boxed closure which references an object in that object

I'm trying to implement a console system for the game I'm writing and have found a fairly simple system: I define a Console object that stores commands as boxed closures (specifically Box<FnMut + 'a> for some 'a). This works for any component of the engine so long as the Console is created before anything else.
Unfortunately, this prevents me from adding commands that modify the Console itself, which means I can't create commands that simply print text or define other variables or commands. I've written a small example that replicates the error:
use std::cell::Cell;
struct Console<'a> {
cmds: Vec<Box<FnMut() + 'a>>,
}
impl<'a> Console<'a> {
pub fn println<S>(&self, msg: S)
where S: AsRef<str>
{
println!("{}", msg.as_ref());
}
pub fn add_cmd(&mut self, cmd: Box<FnMut() + 'a>) {
self.cmds.push(cmd);
}
}
struct Example {
val: Cell<i32>,
}
fn main() {
let ex = Example {
val: Cell::new(0),
};
let mut con = Console {
cmds: Vec::new(),
};
// this works
con.add_cmd(Box::new(|| ex.val.set(5)));
(con.cmds[0])();
// this doesn't
let cmd = Box::new(|| con.println("Hello, world!"));
con.add_cmd(cmd);
(con.cmds[1])();
}
And the error:
error: `con` does not live long enough
--> console.rs:34:31
|
34 | let cmd = Box::new(|| con.println("Hello, world!"));
| -- ^^^ does not live long enough
| |
| capture occurs here
35 | con.add_cmd(cmd);
36 | }
| - borrowed value dropped before borrower
|
= note: values in a scope are dropped in the opposite order they are created
error: aborting due to previous error
Is there a workaround for this, or a better system I should look into? This is on rustc 1.18.0-nightly (53f4bc311 2017-04-07).
This is one of those fairly tricky resource borrowing conundrums that the compiler cannot allow. Basically, we have a Console that owns multiple closures, which in turn capture an immutable reference to the same console. This implies two constraints:
Since Console owns the closures, they will live for as long as the console itself, and the inner vector will drop them right after Console is dropped.
At the same time, each closure must not outlive Console, because otherwise we would end up with dangling references to the console.
It may seem harmless, given that the console and its closures go out of scope at the same time. However, dropping follows a strict order here: first the console, then the closures.
Not to mention of course, that if you wish for closures to freely apply modifications to the console without interior mutability, you would have to mutably borrow it, which cannot be done over multiple closures.
An approach to solving the problem is to separate the two: let the console not own the closures, instead having them in a separate registry, and let the closures only borrow the console when calling the closure.
This can be done by passing the console as an argument to the closures and moving the closure vector to another object (Playground):
use std::cell::Cell;
struct CommandRegistry<'a> {
cmds: Vec<Box<Fn(&mut Console) + 'a>>,
}
impl<'a> CommandRegistry<'a> {
pub fn add_cmd(&mut self, cmd: Box<Fn(&mut Console) + 'a>) {
self.cmds.push(cmd);
}
}
struct Console {
}
impl Console {
pub fn println<S>(&mut self, msg: S)
where S: AsRef<str>
{
println!("{}", msg.as_ref());
}
}
struct Example {
val: Cell<i32>,
}
fn main() {
let ex = Example {
val: Cell::new(0),
};
let mut reg = CommandRegistry{ cmds: Vec::new() };
let mut con = Console {};
// this works
reg.add_cmd(Box::new(|_: &mut Console| ex.val.set(5)));
(reg.cmds[0])(&mut con);
// and so does this now!
let cmd = Box::new(|c: &mut Console| c.println("Hello, world!"));
reg.add_cmd(cmd);
(reg.cmds[1])(&mut con);
}
I also took the liberty of making the closures accept a mutable reference. No conflicts emerge here because the stored closures no longer hold a borrow of the console; it is only borrowed for the duration of each call. This way, the closures can also outlive the console.
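On current Rust editions, trait objects are written with the dyn keyword, so the registry would be spelled as in the sketch below. The lines field, the new constructor and the run_all method are hypothetical additions, used here only so that the effect of a command can be observed.

```rust
// Hypothetical console that records its output instead of printing it.
pub struct Console {
    pub lines: Vec<String>,
}

impl Console {
    pub fn println<S: AsRef<str>>(&mut self, msg: S) {
        self.lines.push(msg.as_ref().to_string());
    }
}

pub struct CommandRegistry<'a> {
    cmds: Vec<Box<dyn Fn(&mut Console) + 'a>>,
}

impl<'a> CommandRegistry<'a> {
    pub fn new() -> Self {
        CommandRegistry { cmds: Vec::new() }
    }

    pub fn add_cmd(&mut self, cmd: Box<dyn Fn(&mut Console) + 'a>) {
        self.cmds.push(cmd);
    }

    // The console is borrowed only while each command runs.
    pub fn run_all(&self, con: &mut Console) {
        for cmd in &self.cmds {
            cmd(&mut *con);
        }
    }
}
```

A command is registered with reg.add_cmd(Box::new(|c: &mut Console| c.println("Hello, world!"))) and executed with reg.run_all(&mut con).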

Resources