I Googled some segfault examples in Rust, but none of them crash now. Is Rust able to prevent all segfaults now? Is there a simple demo that can cause a segfault?
If unsafe code is allowed, then:
fn main() {
unsafe { std::ptr::null_mut::<i32>().write(42) };
}
results in:
Compiling playground v0.0.1 (/playground)
Finished dev [unoptimized + debuginfo] target(s) in 1.37s
Running `target/debug/playground`
timeout: the monitored command dumped core
/playground/tools/entrypoint.sh: line 11: 7 Segmentation fault timeout --signal=KILL ${timeout} "$@"
as seen on the playground.
Any situation that would trigger a segfault would require invoking undefined behavior at some point. The compiler is allowed to optimize out code or otherwise exploit the fact that undefined behavior should never occur, so it's very hard to guarantee that some code will segfault. The compiler is well within its rights to make the above program run without triggering a segfault.
As an example, the code above when compiled in release mode results in an "Illegal instruction" instead.
If unsafe code is not allowed, see How does Rust guarantee memory safety and prevent segfaults? for how Rust can guarantee it doesn't happen as long as its memory safety invariants aren't violated (which could only happen in unsafe code).
Don't use unsafe code if you can avoid it.
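For contrast with the unsafe null write above, here is a minimal sketch of how the same kinds of mistakes surface in safe Rust. The helper names `read_at` and `checked_ptr` are made up for this example. An out-of-range index becomes `None` through `slice::get`, and a null pointer is rejected by `NonNull::new` before anything tries to dereference it.

```rust
use std::ptr::{self, NonNull};

/// Hypothetical helper: bounds-checked read that returns None instead of
/// touching memory outside the slice.
fn read_at(values: &[i32], index: usize) -> Option<i32> {
    values.get(index).copied()
}

/// Hypothetical helper: wraps a raw pointer in NonNull, or returns None if
/// the pointer is null. It does not prove the pointer is otherwise valid.
fn checked_ptr(raw: *mut i32) -> Option<NonNull<i32>> {
    NonNull::new(raw)
}

fn main() {
    let values = [1, 2, 3];
    println!("{:?}", read_at(&values, 1));
    println!("{:?}", read_at(&values, 100));
    println!("{}", checked_ptr(ptr::null_mut()).is_some());
}
```

Neither helper needs `unsafe`, so neither can be the source of a segfault on its own.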
Strictly speaking, it is always possible to trick a program into thinking that it had a segmentation fault, since this is a signal sent by the OS:
use libc::kill;
use std::process;
fn main() {
unsafe {
// First SIGSEGV will be consumed by Rust runtime
// (see https://users.rust-lang.org/t/is-sigsegv-handled-by-rust-runtime/45680)...
kill(process::id() as i32, libc::SIGSEGV);
// ...but the second will crash the program, as expected
kill(process::id() as i32, libc::SIGSEGV);
}
}
Playground
This is not really an answer to your question, since that's not a "real" segmentation fault, but taking the question literally: a Rust program can still end with a "segmentation fault" error, and here's a case which reliably triggers it.
If you're looking more generally for something that will dump core, and not specifically cause a segfault, there is another option, which is to cause the compiler to emit a UD2 instruction or equivalent. There are a few things which can produce this:
An empty loop without any side effects used to be UB because of LLVM optimizations:
fn main() {
(|| loop {})()
}
Playground.
This no longer produces UB.
Trying to create the never (!) type.
#![feature(never_type)]
union Erroneous {
a: (),
b: !,
}
fn main() {
unsafe { Erroneous { a: () }.b }
}
Playground.
Or also trying to use (in this case match on) an enum with no variants:
#[derive(Clone, Copy)]
enum Uninhabited {}
union Erroneous {
a: (),
b: Uninhabited,
}
fn main() {
match unsafe { Erroneous { a: () }.b } {
// Nothing to match on.
}
}
Playground.
And last, you can cheat and just force it to produce a UD2 directly:
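use std::arch::asm; // stable since Rust 1.59, so no feature gate is needed
fn main() {
    // `ud2` is an x86/x86_64 instruction; this will not assemble on other targets.
    unsafe {
        asm!("ud2");
    }
}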
Playground
Or, on older nightly toolchains, using llvm_asm! (since removed from the compiler) instead of asm!:
#![feature(llvm_asm)]
fn main() {
unsafe {
llvm_asm! {
"ud2"
}
};
}
Playground.
A null pointer can cause a segfault in both C and Rust.
union Foo<'a>{
a:i32,
s:&'a str,
}
fn main() {
let mut a = Foo{s:"fghgf"};
a.a = 0;
unsafe {
print!("{:?}", a.s);
}
}
Related
For some reason, when I run the code below, it does not panic or throw any errors...?
Isn't this a seg fault?
Why is this happening? How do I check the size of the passed pointer to avoid panics? (without the user having to pass a "size" variable as well)
#[repr(C)]
pub struct MyStruct {
pub item: u32
// a bunch of other fields as well
}
#[no_mangle]
pub unsafe extern fn do_something(mut data: *mut MyStruct) {
    println!("{:p}", data);
    data = data.offset(100);
    println!("{:p}", data);
    println!("{}", (*data).item);
    if data.is_null() {
        println!("data is null");
    }
}
After I build (and generate a header using cbindgen), I link and use it in a sample program like so:
#include "my_bindings.h"
int main() {
MyStruct *data = new MyStruct[2];
do_something(data);
return 0;
}
This is the output I get:
0x55f0ba739eb0
0x55f0ba73a108
0
An out-of-bounds access is not necessarily a segmentation fault; it's just undefined behaviour. The data that's out of bounds may still be part of your application's memory, so the OS won't kill your application.
Unfortunately this is unsafe code, so Rust can't do anything about it. You should wrap the pointer in a safer Rust container along with the container length (you must pass the length) that panics on out-of-bounds access, as in the following answer: Creating a Vec in Rust from a C array pointer and safely freeing it?
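As a rough sketch of that advice: once the caller passes a length, the pointer can be viewed as a slice and every access is bounds-checked. The function `item_at` and its `len` and `index` parameters are hypothetical, not part of the original API. The whole thing still rests on the caller passing the true length.

```rust
use std::slice;

#[repr(C)]
pub struct MyStruct {
    pub item: u32,
}

/// Hypothetical variant of `do_something` that also receives the element count.
///
/// # Safety
/// `data` must be null or point to `len` initialized `MyStruct` values.
pub unsafe fn item_at(data: *const MyStruct, len: usize, index: usize) -> Option<u32> {
    if data.is_null() {
        return None;
    }
    // Assumption: the caller passed the true length of the allocation.
    let items = unsafe { slice::from_raw_parts(data, len) };
    // `get` returns None for an out-of-range index instead of reading past the end.
    items.get(index).map(|s| s.item)
}

fn main() {
    let data = [MyStruct { item: 7 }, MyStruct { item: 9 }];
    let ptr = data.as_ptr();
    unsafe {
        println!("{:?}", item_at(ptr, data.len(), 1));
        println!("{:?}", item_at(ptr, data.len(), 100));
    }
}
```

There is no way to recover the allocation size from the raw pointer alone, which is why the length has to cross the FFI boundary explicitly.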
In the following code example the compiler can work out that the if blocks aren't reachable, and yet it still gives me an error.
const A_MODE: bool = false; // I manually edit this to switch "modes"
fn main() {
let a: Vec<u32>;
if A_MODE {
a = vec![1,2,3];
}
if A_MODE {
println!("a: {:?}", a); // error: borrow of possibly uninitialized variable
}
}
Rust Playground
I thought that maybe the compiler was really trying to tell me that I need to initialize a at some point, but this compiles fine:
fn main() {
let a: Vec<u32>;
println!("Finished.");
}
Is the error just because the Rust compiler isn't smart enough yet, or does this behaviour have some purpose? Is there any simple workaround which results in a similar code structure?
I know that I could restructure the code to make it work, but for my purposes the above structure is the most straight-forward and intuitive. My current work-around is to comment and uncomment code blocks, which isn't fun. Thanks!
The compiler does not expand constant expressions in the phase for validating lifetime and ownership, so it is not "obvious" to the compiler.
If you really don't want to run that block, you might want to use #[cfg] (or the cfg-if crate if you like if syntax).
fn main() {
    let a: Vec<u32>;
    // cfg names must be valid identifiers, so use a_mode rather than a-mode.
    #[cfg(a_mode)] {
        a = vec![1,2,3];
    }
    #[cfg(a_mode)] {
        println!("a: {:?}", a);
    }
}
This way, both blocks are compiled (with no branching at all) if the a_mode cfg is set, for example via --cfg a_mode, and neither of them is compiled otherwise.
The compiler is aware that constant expression conditions never change, but that is handled at a later phase of the compilation for optimizations like removing branching.
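If a compile-time switch is more than you need, one possible workaround that keeps a similar code shape is to make the "maybe initialized" state explicit with `Option`. This is only a sketch; the helper `make_a` is a name invented for the example.

```rust
const A_MODE: bool = false; // edit this to switch "modes"

/// Hypothetical helper: builds the vector only when the mode is enabled.
fn make_a(enabled: bool) -> Option<Vec<u32>> {
    if enabled {
        Some(vec![1, 2, 3])
    } else {
        None
    }
}

fn main() {
    // `a` is always initialized, so the borrow checker has nothing to object to.
    let a = make_a(A_MODE);
    if let Some(a) = &a {
        println!("a: {:?}", a);
    }
    println!("Finished.");
}
```

The initialization check then happens at run time through `Option`, instead of relying on the compiler to evaluate the constant during its definite-initialization analysis.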
I am writing a Rust library containing an implementation of the callbacks for LLVM SanitizerCoverage. These callbacks can be used to trace the execution of an instrumented program.
A common way to produce a trace is to print the address of each executed basic block. However, in order to do that, it is necessary to retrieve the address of the call instruction that invoked the callback. The C++ examples provided by LLVM rely on the compiler intrinsic __builtin_return_address(0) in order to obtain this information.
extern "C" void __sanitizer_cov_trace_pc_guard(uint32_t *guard) {
if (!*guard) return;
void *PC = __builtin_return_address(0);
printf("guard: %p %x PC %p\n", guard, *guard, PC);
}
I am trying to reproduce the same function in Rust but, apparently, there is no equivalent to __builtin_return_address. The only reference I found is from an old version of Rust, but the function described is not available anymore. The function is the following:
pub unsafe extern "rust-intrinsic" fn return_address() -> *const u8
My current hacky solution involves having a C file in my crate that contains the following function:
void* get_return_address() {
return __builtin_return_address(1);
}
If I call it from a Rust function, I am able to obtain the return address of the Rust function itself. This solution, however, requires the compilation of my Rust code with -C force-frame-pointers=yes for it to work, since the C compiler intrinsic relies on the presence of frame pointers.
In conclusion, is there a more straightforward way of getting the return address of the current function in Rust?
edit: The removal of the return_address intrinsic is discussed in this GitHub issue.
edit 2: Further testing showed that the backtrace crate is able to correctly extract the return address of the current function, thus avoiding the hack I described before. Credit goes to this tweet.
The problem with this solution is the overhead that is generated creating a full backtrace when only the return address of the current function is needed. In addition, the crate is using C libraries to extract the backtrace; this looks like something that should be done in pure Rust.
edit 3: The compiler intrinsic __builtin_return_address(0) generates a call to the LLVM intrinsic llvm.returnaddress. The corresponding documentation can be found here.
I could not find any official documentation about this, but found out by asking in the rust-lang repository: You can link against LLVM intrinsics, like llvm.returnaddress, with only a few lines of code:
extern {
#[link_name = "llvm.returnaddress"]
fn return_address() -> *const u8;
}
fn foo() {
println!("I was called by {:X}", return_address());
}
The LLVM intrinsic llvm.addressofreturnaddress might also be interesting.
As of 2022 Maurice's answer doesn't work as-is and requires an additional argument.
#![feature(link_llvm_intrinsics)]
extern {
#[link_name = "llvm.returnaddress"]
fn return_address(a: i32) -> *const u8;
}
macro_rules! caller_address {
() => {
unsafe { return_address(0) }
};
}
fn print_caller() {
println!("caller: {:p}", caller_address!());
}
fn main() {
println!("main address: {:p}", main as *const ());
print_caller();
}
Output:
main address: 0x56261a13bb50
caller: 0x56261a13bbaf
Playground link;
Here is the code:
pub struct Node<T> {
data: Option<T>,
level: usize,
forward: [Option<*mut Node<T>>; MAX_HEIGHT],
}
And I want to iterate the linked list:
// let next = some_node.forward[n];
unsafe {
loop {
match next {
None => { break; }
Some(v) => {
write!(f, "{:?}", (*v).data)?;
break;
}
}
}
}
When I use the unsafe keyword, I get the [1] 74042 illegal hardware instruction cargo run error, so is there any way to debug this unsafe block?
unsafe is a way of saying, "shut up, rustc, I know what I'm doing." In this case, you're assuring the compiler that v is always a valid, aligned pointer to a Node<T>, and that the array indexing of forward resolves to an array of Option<*mut Node<T>> with size MAX_HEIGHT. If any of these assumptions is violated, you're back in undefined behavior land.
You've turned off all the safeties and aimed your compiler at unknown pointers. The rational part of my brain wants to know exactly what you're trying to accomplish here.
The best advice I can offer with the information provided is to use rust-gdb and step through your program until your pointers don't look sane.
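Besides stepping through with a debugger, it can help to shrink the unsafe part to a single dereference with a stated invariant. The sketch below is a simplified stand-in for the question's type (the `level` field is dropped, and `collect_level0` and `demo` are invented names). It uses `Option<NonNull<_>>` so that null is ruled out by the type. Unlike the loop in the question, which breaks after the first node without ever updating `next`, it advances `next` on each step. Note that `NonNull` only excludes null pointers, not dangling ones.

```rust
use std::ptr::NonNull;

const MAX_HEIGHT: usize = 4;

type Link<T> = Option<NonNull<Node<T>>>;

struct Node<T> {
    data: Option<T>,
    forward: [Link<T>; MAX_HEIGHT],
}

/// Walks the level-0 chain starting after `head` and copies out the data.
fn collect_level0<T: Clone>(head: &Node<T>) -> Vec<T> {
    let mut out = Vec::new();
    let mut next = head.forward[0];
    while let Some(ptr) = next {
        // SAFETY (assumption): every stored pointer came from Box::into_raw
        // and has not been freed yet.
        let node = unsafe { ptr.as_ref() };
        if let Some(d) = &node.data {
            out.push(d.clone());
        }
        next = node.forward[0]; // advance, otherwise the loop never moves on
    }
    out
}

/// Builds head -> 1 -> 2 -> 3, collects the values, then frees the nodes.
fn demo() -> Vec<i32> {
    let mut next: Link<i32> = None;
    let mut raw_nodes = Vec::new();
    for value in [3, 2, 1] {
        let mut forward: [Link<i32>; MAX_HEIGHT] = [None; MAX_HEIGHT];
        forward[0] = next;
        let raw = Box::into_raw(Box::new(Node { data: Some(value), forward }));
        raw_nodes.push(raw);
        next = NonNull::new(raw);
    }
    let mut forward: [Link<i32>; MAX_HEIGHT] = [None; MAX_HEIGHT];
    forward[0] = next;
    let head = Node { data: None, forward };
    let values = collect_level0(&head);
    for raw in raw_nodes {
        // SAFETY: each pointer was produced by Box::into_raw above, exactly once.
        unsafe { drop(Box::from_raw(raw)) };
    }
    values
}

fn main() {
    println!("{:?}", demo());
}
```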
I got an error trying this code, which implements a simple linked list.
use std::rc::Rc;
use std::cell::RefCell;
struct Node {
a : Option<Rc<RefCell<Node>>>,
value: i32
}
impl Node {
fn new(value: i32) -> Rc<RefCell<Node>> {
let node = Node {
a: None,
value: value
};
Rc::new(RefCell::new(node))
}
}
fn main() {
let first = Node::new(0);
let mut t = first.clone();
for i in 1 .. 10_000
{
if t.borrow().a.is_none() {
t.borrow_mut().a = Some(Node::new(i));
}
if t.borrow().a.is_some() {
t = t.borrow().a.as_ref().unwrap().clone();
}
}
println!("Done!");
}
Why does it happen? Does this mean that Rust is not as safe as positioned?
UPD:
If I add this method, the program does not crash.
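use std::mem; // needed for mem::replace below
impl Drop for Node {
    fn drop(&mut self) {
        // Detach the children one at a time so they are dropped in a loop
        // rather than through nested drop calls.
        let mut children = mem::replace(&mut self.a, None);
        loop {
            children = match children {
                Some(mut n) => mem::replace(&mut n.borrow_mut().a, None),
                None => break,
            }
        }
    }
}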
But I am not sure that this is the right solution.
Does this mean that Rust is not as safe as positioned?
Rust is only safe against certain kinds of failures, specifically memory-corrupting crashes, which are documented here: http://doc.rust-lang.org/reference.html#behavior-considered-undefined
Unfortunately, there is a tendency to expect Rust to be more robust against certain sorts of failures that are not memory corrupting. Specifically, you should read http://doc.rust-lang.org/reference.html#behavior-considered-undefined.
TL;DR: In Rust, many things can cause a panic. A panic will cause the current thread to halt, performing shutdown operations.
This may superficially appear similar to a memory-corrupting crash from other languages, but it is important to understand that although it is an application failure, it is not a memory-corrupting failure.
For example, you can treat panics like exceptions by running actions in a different thread and gracefully handling failure when the thread panics (for whatever reason).
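A minimal sketch of that pattern, using only the standard library (the function name `run_isolated` is made up for the example): the worker thread may panic on an out-of-bounds index, and the parent sees that as an `Err` from `join` rather than as a crash.

```rust
use std::thread;

/// Hypothetical helper: runs an indexing operation on a worker thread and
/// reports a panic in that thread as an ordinary error value.
fn run_isolated(index: usize) -> Result<i32, String> {
    let handle = thread::spawn(move || {
        let data = vec![10, 20, 30];
        data[index] // panics if `index` is out of bounds
    });
    // `join` returns Err if the thread panicked; the parent keeps running.
    handle
        .join()
        .map_err(|_| "worker thread panicked".to_string())
}

fn main() {
    println!("{:?}", run_isolated(1));
    println!("{:?}", run_isolated(99));
    println!("main thread is still alive");
}
```

The panic message is still printed to stderr by the default panic hook, but the process itself continues.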
In this specific example, you're using up too much memory on the stack.
This simple example will also fail:
fn main() {
let foo:&mut [i8] = &mut [1i8; 1024 * 1024];
}
(On most rustc versions; it depends on the stack size of that particular implementation.)
I would have thought that moving your allocations to the heap using Box::new() would fix it in this example...
use std::rc::Rc;
use std::cell::RefCell;
#[derive(Debug)]
struct Node {
a : Option<Box<Rc<RefCell<Node>>>>,
value: i32
}
impl Node {
fn new(value: i32) -> Box<Rc<RefCell<Node>>> {
let node = Node {
a: None,
value: value
};
Box::new(Rc::new(RefCell::new(node)))
}
}
fn main() {
let first = Node::new(0);
let mut t = first.clone();
for i in 1 .. 10000
{
if t.borrow().a.is_none() {
t.borrow_mut().a = Some(Node::new(i));
}
if t.borrow().a.is_some() {
let c:Box<Rc<RefCell<Node>>>;
{ c = t.borrow().a.as_ref().unwrap().clone(); }
t = c;
println!("{:?}", t);
}
}
println!("Done!");
}
...but it doesn't. I don't really understand why, but hopefully someone else can look at this and post a more authoritative answer about what exactly is causing stack exhaustion in your code.
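One plausible explanation, not confirmed by the answer above but consistent with the Drop workaround in the question's update, is that the overflow happens when the chain is dropped. The compiler-generated drop for a node drops its child, which drops its child, and so on, so a long chain recurses until the stack runs out. The sketch below uses a simplified Box-based list (the `List` type and its methods are assumptions for illustration, not the question's Rc<RefCell<...>> structure) and unlinks nodes in a loop so that no drop call nests inside another.

```rust
struct Node {
    value: i32,
    next: Option<Box<Node>>,
}

struct List {
    head: Option<Box<Node>>,
    len: usize,
}

impl List {
    fn new() -> Self {
        List { head: None, len: 0 }
    }

    fn push_front(&mut self, value: i32) {
        let next = self.head.take();
        self.head = Some(Box::new(Node { value, next }));
        self.len += 1;
    }

    fn front(&self) -> Option<i32> {
        self.head.as_ref().map(|n| n.value)
    }
}

impl Drop for List {
    fn drop(&mut self) {
        // Detach each node from its successor before it is dropped, so every
        // node is freed with `next == None` and the drop never recurses.
        let mut current = self.head.take();
        while let Some(mut node) = current {
            current = node.next.take();
        }
    }
}

fn main() {
    let mut list = List::new();
    for i in 0..1_000_000 {
        list.push_front(i);
    }
    println!("front = {:?}, len = {}", list.front(), list.len);
    // `list` is dropped here without deep recursion.
}
```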
For those who come here and are specifically interested in the case where the large struct is a contiguous chunk of memory (instead of a tree of boxes), I found this GitHub issue with further discussion, as well as a solution that worked for me:
https://github.com/rust-lang/rust/issues/53827
Vec's method into_boxed_slice() returns a Box<[T]>, and does not overflow the stack for me.
vec![-1; 3000000].into_boxed_slice()
A note of difference with the vec! macro and array expressions from the docs:
This will use clone to duplicate an expression, so one should be careful using this with types having a nonstandard Clone implementation.
There is also the with_capacity() method on Vec, which is shown in the into_boxed_slice() examples.
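Both approaches can be sketched as follows. The function names are invented for the example. In each case the elements are written directly into a heap allocation owned by the `Vec`, so no large array is ever built on the stack first.

```rust
/// Hypothetical helper: a large buffer filled with one repeated value.
fn filled_boxed_slice(len: usize, value: i32) -> Box<[i32]> {
    vec![value; len].into_boxed_slice()
}

/// Hypothetical helper: a large buffer whose elements are computed one by one.
fn boxed_squares(len: usize) -> Box<[u64]> {
    // Reserve the full capacity up front so pushing never reallocates.
    let mut v = Vec::with_capacity(len);
    for i in 0..len as u64 {
        v.push(i * i);
    }
    v.into_boxed_slice()
}

fn main() {
    let big = filled_boxed_slice(3_000_000, -1);
    let squares = boxed_squares(1_000);
    println!("{} {}", big.len(), squares[999]);
}
```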