Why does Rust allow calling functions via null pointers?

I was experimenting with function pointers in Rust and ended up with a code snippet that I cannot explain: I don't understand why it compiles, let alone why it runs.
fn foo() {
println!("This is really weird...");
}
fn caller<F>() where F: FnMut() {
let closure_ptr = 0 as *mut F;
let closure = unsafe { &mut *closure_ptr };
closure();
}
fn create<F>(_: F) where F: FnMut() {
caller::<F>();
}
fn main() {
create(foo);
create(|| println!("Okay..."));
let val = 42;
create(|| println!("This will seg fault: {}", val));
}
I cannot explain why foo is being invoked by casting a null pointer in caller(...) to an instance of type F. I would have thought that functions may only be called through corresponding function pointers, but that clearly can't be the case given that the pointer itself is null. With that being said, it seems that I clearly misunderstand an important piece of Rust's type system.

This program never actually constructs a function pointer at all: it always invokes foo and those two closures directly.
Every Rust function, whether it's a closure or a fn item, has a unique, anonymous type. This type implements the Fn/FnMut/FnOnce traits, as appropriate. The anonymous type of a fn item is zero-sized, just like the type of a closure with no captures.
Thus, the expression create(foo) instantiates create's parameter F with foo's type. This is not the function pointer type fn(), but an anonymous, zero-sized type just for foo. In error messages, rustc calls this type fn() {foo}.
Inside create::<fn() {foo}> (using the name from the error message), the expression caller::<F>() forwards this type to caller without giving it a value of that type.
Finally, in caller::<fn() {foo}> the expression closure() desugars to FnMut::call_mut(closure). Because closure has type &mut F where F is just the zero-sized type fn() {foo}, the 0 value of closure itself is simply never used [1], and the program calls foo directly.
The same logic applies to the closure || println!("Okay..."), which, like foo, has an anonymous zero-sized type, this time called something like [closure#src/main.rs:2:14: 2:36].
The second closure is not so lucky: its type is not zero-sized, because it must contain a reference to the variable val. This time, FnMut::call_mut(closure) actually needs to dereference closure to do its job. So it crashes [2].
[1] Constructing a null reference like this is technically undefined behavior, so the compiler makes no promises about this program's overall behavior. However, replacing 0 with some other "address" with the alignment of F avoids that problem for zero-sized types like fn() {foo}, and gives the same behavior!
[2] Again, constructing a null (or dangling) reference is the operation that actually takes the blame here. After that, anything goes. A segfault is just one possibility: a future version of rustc, or the same version when run on a slightly different program, might do something else entirely!

The type of fn foo() {...} is not a function pointer fn(), it's actually a unique type specific to foo. As long as you carry that type along (here as F), the compiler knows how to call it without needing any extra pointers (a value of such a type carries no data). A closure that doesn't capture anything works the same way. It only gets dicey when the last closure tries to look up val because you put a 0 where (presumably) the pointer to val was supposed to be.
You can observe this with size_of: in the first two calls, the size of the closure type is zero, but in the last call, where the closure captures something, the size is 8 (at least on the playground). If the size is 0, the program doesn't have to load anything through the NULL pointer.
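The size difference can be checked without any unsafe code. The following is a minimal sketch; the helper callable_size is a name made up for this illustration, and the exact size of the capturing closure is left unstated because closure layout is not specified by the language.

```rust
use std::mem::size_of_val;

fn foo() {
    println!("This is really weird...");
}

// Illustrative helper (not from the question): size in bytes of the
// concrete callable type F that the compiler infers for the argument.
fn callable_size<F: FnMut()>(f: &F) -> usize {
    size_of_val(f)
}

fn main() {
    let val = 42;
    // The fn item type of foo carries no data.
    println!("fn item: {}", callable_size(&foo));
    // A closure without captures carries no data either.
    println!("no captures: {}", callable_size(&|| println!("Okay...")));
    // This closure has to store a reference to val, so it is not zero-sized.
    println!("captures val: {}", callable_size(&|| println!("{}", val)));
}
```

On a typical 64-bit target the last line reports the size of one reference, which matches the 8 bytes mentioned above.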
The effective cast of a NULL pointer to a reference is still undefined behavior, but because of type shenanigans and not because of memory access shenanigans: having references that are really NULL is in itself illegal, because memory layout of types like Option<&T> relies on the assumption that the value of a reference is never NULL. Here's an example of how it can go wrong:
unsafe fn null<T>(_: T) -> &'static mut T {
&mut *(0 as *mut T)
}
fn foo() {
println!("Hello, world!");
}
fn main() {
unsafe {
let x = null(foo);
x(); // prints "Hello, world!"
let y = Some(x);
println!("{:?}", y.is_some()); // prints "false", y is None!
}
}

Given that Rust is built on top of LLVM, and that what you're doing is guaranteed UB, you're likely hitting something similar to https://kristerw.blogspot.com/2017/09/why-undefined-behavior-may-call-never.html. This is one of many reasons why safe Rust works to eliminate all UB.

Although this is entirely up to UB, here's what I assume might be happening in the two cases:
1. The type F is a closure with no data. This is equivalent to a function, which means that F is a function item. What this means is that the compiler can optimize any call to an F into a call to whatever function produced F (without ever making a function pointer). See this for an example of the different names for these things.
2. The compiler sees that val is always 42, and hence it can optimize it into a constant. If that's the case, then the closure passed into create is again a closure with no captured items, and hence we can follow the ideas in #1.
Additionally, I say this is UB, but please note something critical about UB: if you invoke UB and the compiler takes advantage of it in an unexpected way, it is not trying to mess you up; it is trying to optimize your code. UB, after all, is about the compiler mis-optimizing things because you've broken some expectation it has. It is hence completely logical that the compiler optimizes this way. It would be equally logical for the compiler not to optimize this way and instead take advantage of the UB.

This is "working" because fn() {foo} and the first closure are zero-sized types. Extended answer:
If this program is executed under Miri (an undefined behaviour checker), it fails because a NULL pointer is dereferenced. A NULL pointer must never be dereferenced, even for zero-sized types. Since undefined behaviour can do anything, the compiler makes no promises about the program's behaviour, which means it can break in a future release of Rust.
error: Undefined Behavior: memory access failed: 0x0 is not a valid pointer
--> src/main.rs:7:28
|
7 | let closure = unsafe { &mut *closure_ptr };
| ^^^^^^^^^^^^^^^^^ memory access failed: 0x0 is not a valid pointer
|
= help: this indicates a bug in the program: it performed an invalid operation, and caused Undefined Behavior
= help: see https://doc.rust-lang.org/nightly/reference/behavior-considered-undefined.html for further information
= note: inside `caller::<fn() {foo}>` at src/main.rs:7:28
note: inside `create::<fn() {foo}>` at src/main.rs:13:5
--> src/main.rs:13:5
|
13 | func_ptr();
| ^^^^^^^^^^
note: inside `main` at src/main.rs:17:5
--> src/main.rs:17:5
|
17 | create(foo);
| ^^^^^^^^^^^
This issue can easily be fixed by writing let closure_ptr = 1 as *mut F; instead. The program then only fails on line 22, with the second closure, which is the one that segfaults.
error: Undefined Behavior: inbounds test failed: 0x1 is not a valid pointer
--> src/main.rs:7:28
|
7 | let closure = unsafe { &mut *closure_ptr };
| ^^^^^^^^^^^^^^^^^ inbounds test failed: 0x1 is not a valid pointer
|
= help: this indicates a bug in the program: it performed an invalid operation, and caused Undefined Behavior
= help: see https://doc.rust-lang.org/nightly/reference/behavior-considered-undefined.html for further information
= note: inside `caller::<[closure#src/main.rs:22:12: 22:55 val:&i32]>` at src/main.rs:7:28
note: inside `create::<[closure#src/main.rs:22:12: 22:55 val:&i32]>` at src/main.rs:13:5
--> src/main.rs:13:5
|
13 | func_ptr();
| ^^^^^^^^^^
note: inside `main` at src/main.rs:22:5
--> src/main.rs:22:5
|
22 | create(|| println!("This will seg fault: {}", val));
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Why didn't it complain about foo or || println!("Okay...")? Because they don't store any data. When referring to a function, you don't get a function pointer but rather a zero-sized type representing that specific function. This helps with monomorphization, as each function is distinct. A structure that stores no data can be created from an aligned dangling pointer.
However, if you explicitly say the function is a function pointer by saying create::<fn()>(foo) then the program will stop working.
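To see why forcing F to be fn() changes the outcome, it helps to compare the two types directly. This is a small sketch with made-up helper names (item_size, pointer_size); it only measures sizes and never dereferences anything.

```rust
use std::mem::size_of_val;

fn foo() {}

// Size of foo's own fn item type, written fn() {foo} in rustc messages.
fn item_size() -> usize {
    size_of_val(&foo)
}

// Size of a real function pointer obtained by coercing foo to fn().
fn pointer_size() -> usize {
    let ptr: fn() = foo;
    size_of_val(&ptr)
}

fn main() {
    println!("fn item type: {} bytes", item_size());
    println!("fn pointer:   {} bytes", pointer_size());
}
```

A function pointer holds an actual address, so with F = fn() the call in caller has to read that address through the bogus pointer, just like the capturing closure has to read its captured reference.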

Related

Rust Compiler throws error related to moving data out of a shared reference

I have been writing a program in Rust and encountered an error about moving data out of a shared reference.
I did some research, but I was unable to find the cause of the error in the program I have written. Here is a simplified version of the program:
enum A {
Won = 1,
}
struct B {
result: A,
}
fn print_game(game: &B) {
println!("{}", game.result as u32);
}
fn main() {
let game: B = B { result: A::Won };
print_game(&game);
}
When compiled, the above program produces the following error:
error[E0507]: cannot move out of `game.result` which is behind a shared reference
--> src/main.rs:10:20
|
10 | println!("{}", game.result as u32);
| ^^^^^^^^^^^ move occurs because `game.result` has type `A`, which does not implement the `Copy` trait
From the error I can infer that the data in game.result is moved, but, I am not sure where it is moved and why it's been moved.
In Rust, the default behaviour for a custom type is 'move' unless you implement the Copy trait for the type.
If the value bound to a variable is moved, the variable cannot be used anymore. Developers new to Rust must get used to this.
The move is attempted in the 'game.result as u32' portion of your code. A type cast also counts as a move for a value of a 'move' type.
I agree with the solution already mentioned above, just add the line
#[derive(Clone, Copy)]
Traits like Deref, Into, From etc. could also be relevant, but depends...
By 'move' the compiler just means a transfer of ownership, not a change of the data's location in memory.
Suppose we have a variable v bound to 'move' data. In a type cast like 'v as T', ownership of the data is detached from v and the value is converted to type T.
But why is this cast not allowed in your example?
The actual cause of the error is that 'game' is a shared reference of type &B and you tried to use it to take ownership of the data it refers to.
A shared reference can never be used to move the referred data or any part of it.
In this case I think your enum A just needs a #[derive(Clone, Copy)].
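Applied to the simplified program from the question, the derive looks roughly like this. It is a sketch: the helper result_code is a name introduced here so the cast can be checked separately from the printing.

```rust
// With Copy, `game.result as u32` copies the enum value instead of
// trying to move it out from behind the shared reference.
#[derive(Clone, Copy)]
enum A {
    Won = 1,
}

struct B {
    result: A,
}

// Illustrative helper (not in the question): performs the cast.
fn result_code(game: &B) -> u32 {
    game.result as u32
}

fn print_game(game: &B) {
    println!("{}", result_code(game));
}

fn main() {
    let game: B = B { result: A::Won };
    print_game(&game); // prints "1"
}
```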
Another option would be to refactor the program to only print what is actually needed:
…
fn main() {
let game: B = B { result: A::Won };
print_result(game.result);
}

A value that is no longer borrowed causes a "does not live long enough" error

This program cannot be compiled.
struct F<'a>(Box<dyn Fn() + 'a>);
fn main() {
let mut v = vec![]; // Vec<F>
let s = String::from("foo");
let f = F(Box::new(|| println!("{:?}", s)));
v.push(f);
drop(v);
}
error[E0597]: `s` does not live long enough
--> src/main.rs:7:44
|
7 | let f = F(Box::new(|| println!("{:?}", s)));
| -- ^ borrowed value does not live long enough
| |
| value captured here
...
11 | }
| -
| |
| `s` dropped here while still borrowed
| borrow might be used here, when `v` is dropped and runs the `Drop` code for type `Vec`
|
= note: values in a scope are dropped in the opposite order they are defined
When s is dropped (line 11), v has already been dropped, so s is no longer borrowed.
But the compiler says that s is still borrowed. Why?
This is due to the consideration that a panic could happen as a result of any function call, since there is no decoration on functions indicating whether they might panic.
When a panic occurs, the stack unwinds and the drop code for each (initialized) variable is run in the opposite of their declaration order (s is dropped first, and then v). But v has the type Vec<F<'a>> where 'a is the lifetime of s, and F implements Drop, which means that s cannot be dropped before v, because the compiler can't guarantee that the drop code for F won't access s.
The compiler cannot tell that there isn't actually a memory safety issue here (if push panics, the vector doesn't reference s through the closure). All it knows is that the type of v must live at least as long as s; whether v actually contains a reference to s is immaterial.
To fix this, just swap the order v and s are declared in, which will guarantee that v is dropped before s.
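A sketch of the reordered program follows. The body is wrapped in a function named build_and_drop, which is not in the original and exists only so the result can be checked; the declarations are otherwise the ones from the question with s moved above v.

```rust
struct F<'a>(Box<dyn Fn() + 'a>);

// Illustrative wrapper (not in the question) around the original main body.
fn build_and_drop() -> usize {
    // s is declared first, so on every path (including unwinding)
    // v is dropped before s.
    let s = String::from("foo");
    let mut v = vec![]; // Vec<F>
    let f = F(Box::new(|| println!("{:?}", s)));
    v.push(f);
    let len = v.len();
    drop(v);
    len
}

fn main() {
    println!("stored {} closure(s)", build_and_drop());
}
```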
But why does F implement Drop in the first place?
Note that the problem goes away if you remove the Fn() trait object and push the closure directly (e.g. without dyn). This case is different because the compiler knows that the closure doesn't implement Drop -- the closure didn't move-capture any values that implement Drop. Therefore, the compiler knows that s will not be accessed by v's drop code.
By comparison, trait objects always have a vtable slot for Drop::drop, and so the compiler must pessimistically assume that every trait object could have a Drop implementation. This means that when the Vec and Box are destroyed, the compiler emits code to call the trait object's drop code, and based on the information the compiler has, that can result in an access to s since the F value captures the lifetime of s.
This is one of the pitfalls about type erasure through trait objects: the trait object is opaque to the compiler and it can no longer verify that s won't be used by a Drop implementation of a boxed closure after s is dropped. If an owned trait object captures a lifetime, the compiler has to ensure that the captured lifetime does not end before the trait object is dropped.
The above is actually a somewhat simplified explanation. Rust's drop-checker is a bit more complex than this; it's okay if F auto-implements Drop so long as the drop-checker determines that the lifetime 'a doesn't get used. Because of the trait object, this can't be guaranteed. However, this code can compile with a Box holding a non-dyn closure as the drop-checker determines that the captured lifetime isn't used when dropping the box.
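For comparison, here is a sketch of the non-dyn variant described above, keeping the original declaration order (v before s). The wrapper function push_plain_closure is a name made up for this example; the claim that this form is accepted comes from the explanation above.

```rust
// Illustrative wrapper (not in the question). The Vec holds boxes of the
// concrete closure type, so no trait object is involved.
fn push_plain_closure() -> usize {
    let mut v = vec![]; // Vec<Box<{closure}>>
    let s = String::from("foo");
    // The closure only captures a shared reference to s, and dropping a
    // reference does nothing, so the drop-checker can see that dropping
    // v never touches s.
    let f = Box::new(|| println!("{:?}", s));
    v.push(f);
    let len = v.len();
    drop(v);
    len
}

fn main() {
    println!("stored {} closure(s)", push_plain_closure());
}
```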

Rust: Cannot reference local variable in return value - but the "local variable" is passed to the caller

Writing a simple interpreter has led me to this battle with the borrow checker.
#[derive(Clone, Debug)]
struct Context<'a> {
display_name: &'a str,
parent: Option<Box<Context<'a>>>,
parent_entry_pos: Position<'a>,
}
// --snip--
#[derive(Copy, Clone, Debug)]
pub enum BASICVal<'a> {
Float(f64, Position<'a>, Position<'a>, &'a Context<'a>),
Int(i64, Position<'a>, Position<'a>, &'a Context<'a>),
Nothing(Position<'a>, Position<'a>, &'a Context<'a>),
}
// --snip--
pub fn run<'a>(text: &'a String, filename: &'a String) -> Result<(Context<'a>, BASICVal<'a>), BASICError<'a>> {
// generate tokens
let mut lexer = Lexer::new(text, filename);
let tokens = lexer.make_tokens()?;
// parse program to AST
let mut parser = Parser::new(tokens);
let ast = parser.parse();
// run the program
let context: Context<'static> = Context {
display_name: "<program>",
parent: None,
parent_entry_pos: Position::default(),
};
Ok((context, interpreter_visit(&ast?, &context)?))
}
The error is "cannot return value referencing local variable `context`" and (secondary) the "borrow of moved value: `context`":
error[E0515]: cannot return value referencing local variable `context`
--> src\basic.rs:732:2
|
732 | Ok((context, interpreter_visit(&ast?, &context)?))
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^--------^^^^
| | |
| | `context` is borrowed here
| returns a value referencing data owned by the current function
error[E0382]: borrow of moved value: `context`
--> src\basic.rs:732:40
|
727 | let context: Context<'static> = Context {
| ------- move occurs because `context` has type `basic::Context<'_>`, which does not implement the `Copy` trait
...
732 | Ok((context, interpreter_visit(&ast?, &context)?))
| ------- value moved here ^^^^^^^^ value borrowed here after move
As far as I understand it: The context references several lifetime-dependent structs. The values of these structs are static in this case, as I can see by explicitly setting the lifetime parameter to 'static and the compiler not complaining. The interpreter_visit function needs to borrow the context because it gets passed to several independent functions, including itself recursively. In addition, the interpreter_visit returns BASICVals that reference the context themselves. For this reason, the context needs to outlive the run return. I try to achieve that by passing the context itself as part of the return value, thereby giving the caller control over its life. But now, I move the context to the return value before actually using it? This makes no sense. I should be able to reference one part of the return value in another part of the return value because both values make it out of the function "alive". I have tried:
boxing the context, thereby forcing it off the stack onto the heap, but that seems to only complicate things.
switching the order of the tuple, but that doesn't help.
storing interpreter_visit's result in an intermediate variable, which as expected doesn't help.
cloning the interpreter_visit result or the context itself
The issue may lie with the result and the error. The error doesn't reference a context but giving it a separate lifetime in interpreter_visit breaks the entire careful balance I have been able to achieve until now.
Answering this so that people don't have to read the comment thread.
This is a problem apparently not solvable by Rust's borrow checker. The borrow checker cannot understand that a Box of context will live on the heap and therefore last longer than the function return, therefore being "legally" referencable by the return value of interpreter_visit which itself escapes the function. The solution in this case is to circumvent borrow checking via unsafe, namely a raw pointer. Like this:
let context = Box::new(Context {
display_name: "<program>",
parent: None,
parent_entry_pos: Position::default(),
});
// Obtain a pointer to a location on the heap
let context_ptr: *const Context = &*context;
// outsmart the borrow checker
let result = interpreter_visit(&ast?, unsafe { &*context_ptr })?;
// The original box is passed back, so it is destroyed safely.
// Because the result lives as long as the context as required by the lifetime,
// we cannot get a pointer to uninitialized memory through the value and its context.
Ok((context, result))
I store a raw pointer to the context in context_ptr. The borrowed value passed to interpreter_visit is then piped through a (completely memory-safe) raw pointer dereference and borrow. This will (for some reason, only the Rust gods know) disable the borrow check, so the context data given to interpreter_visit is considered to have a legal lifetime. Because I am still passing back the Box that owns the context data, I avoid leaking memory by leaving the context with no owner. It might now be possible to keep using the interpreter_visit return value after the context has been destroyed, but because both values are printed and discarded immediately, I see no issues arising from this in the future.
If you have a deeper understanding of Rust's borrow checker and would consider this a fixable edge case that doesn't have more "safe" solutions I couldn't come up with, please do comment and I will report this to the Rust team. I'm however not that certain especially because my experience with and knowledge of Rust is limited.
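One safe restructuring that may be worth trying, offered here only as a sketch, is to let the caller own the context and have run borrow it, so the returned value never refers to something created inside run. The types below are heavily simplified stand-ins (the real Position, BASICVal and interpreter_visit are elided in the question), so every definition in this block is an assumption.

```rust
// Simplified stand-in for the question's Context.
#[derive(Debug)]
struct Context<'a> {
    display_name: &'a str,
}

// Simplified stand-in for BASICVal: a value that refers to its context.
#[derive(Copy, Clone, Debug)]
enum BasicVal<'a> {
    Int(i64, &'a Context<'a>),
}

// Stand-in for interpreter_visit: the result borrows the context.
fn interpreter_visit<'a>(input: i64, context: &'a Context<'a>) -> BasicVal<'a> {
    BasicVal::Int(input, context)
}

// run borrows a context that the caller owns instead of creating one.
fn run<'a>(input: i64, context: &'a Context<'a>) -> BasicVal<'a> {
    interpreter_visit(input, context)
}

fn main() {
    // The caller controls how long the context lives.
    let context = Context { display_name: "<program>" };
    let BasicVal::Int(n, ctx) = run(42, &context);
    println!("{} in {}", n, ctx.display_name);
}
```

With this shape the borrow checker can verify that the value does not outlive the context, which is exactly the guarantee the raw-pointer version gives up.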

Returning from inside for loop causes type mismatch

I am attempting to return a function pointer, which is located inside a for loop, from a function located in an impl of a struct.
fn locate_func(&self, string: &str) -> fn() -> bool {
let mut func;
for alt in &self.alts {
return alt.func;
}
}
There will be an if statement inside the for loop in the future, but as I am testing things at the very moment, it looks rather generic, and somewhat illogical.
The above code in my mind, is supposed to return the pointer to alt.func(), which clearly is a pointer, as it tells me so should I remove the return and semicolon of that line.
error[E0308]: mismatched types
--> src\main.rs:42:3
|
42 | for alt in &self.alts
| ^ expected fn pointer, found ()
|
= note: expected type `fn() -> bool`
= note: found type `()`
Above is the error that is caused upon running locate_func(). I am clearly missing something as the aforementioned code is not working properly. Any hints?
Your for-loop is the last expression inside the function. The compiler expects the last expression to evaluate to the return type. But all loops evaluate to () (unit or void), so the compiler has a classic type mismatch there.
The correct question to ask yourself is: what would happen if the return inside of the loop wouldn't be executed (for example, because the loop isn't executed at all, because self.alts is empty)? This would lead to problems, wouldn't it?
So you have to return a valid object after the for-loop to cover that case. If you are certain that the spot after the loop will never be reached, you can use unreachable!(); to tell the compiler what you already know. However, if the program does reach this spot, it will panic! So make sure you know for certain how the program behaves.
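If "nothing matched" is a legitimate outcome, an alternative to unreachable!() is to return an Option and put None after the loop. The sketch below assumes a surrounding struct (called Matcher here) and a name field on each alternative; neither appears in the question.

```rust
// Hypothetical shapes for the question's types.
struct Alt {
    name: &'static str,
    func: fn() -> bool,
}

struct Matcher {
    alts: Vec<Alt>,
}

impl Matcher {
    // Returns the first matching function pointer, or None if the loop
    // finishes without finding one (for example when alts is empty).
    fn locate_func(&self, name: &str) -> Option<fn() -> bool> {
        for alt in &self.alts {
            if alt.name == name {
                return Some(alt.func);
            }
        }
        None
    }
}

fn always_true() -> bool {
    true
}

fn main() {
    let matcher = Matcher {
        alts: vec![Alt { name: "yes", func: always_true }],
    };
    match matcher.locate_func("yes") {
        Some(func) => println!("found, returns {}", func()),
        None => println!("not found"),
    }
}
```

The expression after the loop now has the function's return type, so the mismatch with () goes away and the caller decides how to handle a missing entry.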

Why doesn't `Box::into_raw` take `self` as parameter?

This simple program:
fn main() {
let b: Box<i32> = Box::new(1);
b.into_raw();
}
Produces this inconvenient error when compiled with Rust 1.12.0:
error: no method named `into_raw` found for type `Box<i32>` in the current scope
--> <anon>:3:7
|
3 | b.into_raw();
| ^^^^^^^^
|
= note: found the following associated functions; to be used as methods, functions must have a `self` parameter
= note: candidate #1 is defined in an impl for the type `Box<_>`
This is because into_raw is not defined to take self as parameter, but instead is defined as:
impl<T: ?Sized> Box<T> {
fn into_raw(b: Box<T>) -> *mut T;
}
This seems inconvenient, and I cannot find a rationale.
So... why?
Because 99.995% of the time (statistic totally made up), you expect method calls to happen to the thing being pointed to, not to the pointer itself. As a result, the "smart pointer" types in Rust generally avoid doing anything to break that expectation. An obvious exception would be something like Rc/Arc implementing Clone directly.
Box implements Deref, which means that all methods of the contained type are automatically made available; from the outside, Box<T> and T look and act the same.
If into_raw were a method instead of an associated function, it would shadow any into_raw method on the contained type.
There are other examples of such associated functions on Rc, such as downgrade or try_unwrap, or on Arc, such as make_mut.
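The following sketch shows both sides of this design: calling Box::into_raw as an associated function, and a contained type whose own into_raw method stays reachable through the Box. The Handle type and the two helper functions are made up for this illustration.

```rust
// Hypothetical type with a method that shares its name with Box::into_raw.
struct Handle(u32);

impl Handle {
    fn into_raw(self) -> u32 {
        self.0
    }
}

// Box::into_raw must be called with the associated-function syntax.
fn round_trip(value: i32) -> i32 {
    let b: Box<i32> = Box::new(value);
    let raw: *mut i32 = Box::into_raw(b);
    // SAFETY: raw came from Box::into_raw just above and is turned back
    // into a Box exactly once, so the allocation is freed normally.
    let b = unsafe { Box::from_raw(raw) };
    *b
}

// Method syntax on the Box resolves to Handle::into_raw, because Box
// itself has no method with that name to get in the way.
fn inner_into_raw(id: u32) -> u32 {
    let boxed = Box::new(Handle(id));
    boxed.into_raw()
}

fn main() {
    println!("{}", round_trip(1));
    println!("{}", inner_into_raw(7));
}
```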
