How to handle "An unknown error occurred" when running a Rust program? - rust

I am currently running cargo run and getting the most generic error, An unknown error has occurred.
When I then run cargo run --verbose I get Process didn't exit successfully: 'target/debug/ok_rust' (signal: 11), which I have no clue how to handle.
How do I debug this? What am I supposed to do? Should I test it with the nightly version, but the same libraries? And how am I supposed to know whether the bug is mine or in Rust itself?

According to the error you provided, this is not a problem with tooling (that is, Cargo and rustc both work correctly) but with your program:
Process didn't exit successfully: 'target/debug/ok_rust' (signal: 11)
Signal 11 means that a segmentation fault has happened in the program. Segfaults usually happen when invalid memory is accessed, for example, when a destroyed object is read. Rust is explicitly designed to avoid segfaults; if one happens, it means that one of the unsafe blocks in your program contains an error. This unsafe block may be the one you have written yourself or it may be in one of the libraries you use.
In any case, you need to find the exact place where the segfault happens. You can use a debugger (gdb or lldb, depending on your system), or you can add debug output to your program, which will likely let you pinpoint the problematic line. Then you'll need to trace the problem back to one of the unsafe blocks. For example, if you find that the segfault happens when accessing a value through a reference, like
let x = some_struct.field;
where some_struct: &SomeStruct, then it is likely that some_struct points to an invalid object; this can only happen if some_struct was created in an unsafe block, so you need to find where some_struct originates.
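For illustration only (this is not your code; the struct and the address are made up), here is a minimal sketch of how an unsafe block can hand out a reference that later segfaults:

// Hypothetical example: the unsafe block creates a reference from a bogus
// pointer (imagine it came from FFI); the crash only shows up at the read.
struct SomeStruct {
    field: u64,
}

fn broken_reference() -> &'static SomeStruct {
    let ptr = 0x10 as *const SomeStruct; // invalid address
    unsafe { &*ptr }                     // UB: reference to invalid memory
}

fn main() {
    let some_struct = broken_reference();
    let x = some_struct.field;           // this is where signal 11 is raised
    println!("{x}");
}

The point is that the line that crashes (the field access) looks perfectly safe; the actual bug is in the unsafe block that produced the reference.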

Related

What is the difference between panic and process::exit

As per the title, what is the difference between these two and when should I consider using one over the other?
There may or may not be a difference, depending on which panic strategy you have configured (set in Cargo.toml). Depending on whether you have it set to unwind or abort, different things will happen:
With unwind, this will (as the name suggests) unwind the stack. In particular, this makes it possible to get a full stack trace.
With abort, you will only get the last callee
process::exit(), on the other hand, is a "clean" exit - you will not get a last callee, and you'll get a regular process exit status.
Due to this, you'll ideally want to keep to the following (a short sketch contrasting the two follows this list):
For planned shutdowns, use exit(). Do note that a known error counts as a planned shutdown.
For unplanned shutdowns (i.e. exceptional failures), consider panic!(): you benefit from being able to get a stack trace when it happens, and the failure case should be exceptional enough that it is effectively unaccounted for and stems from an unplanned scenario.
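To make the distinction concrete, here is a small sketch; the config file name and the exit code are made up for illustration:

use std::process;

fn main() {
    // Planned shutdown: a missing config file is a known error, so report it
    // and exit with a regular status code.
    let config = match std::fs::read_to_string("app.toml") {
        Ok(c) => c,
        Err(e) => {
            eprintln!("could not read app.toml: {e}");
            process::exit(1);
        }
    };

    // Unplanned shutdown: an invariant that should never break. With the
    // default panic = "unwind" (and RUST_BACKTRACE=1) you get a stack trace;
    // with panic = "abort" you only see the panic message and the last callee.
    if config.is_empty() {
        panic!("config was read but is empty; this should be impossible");
    }

    println!("loaded {} bytes of config", config.len());
}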
AFAIK, a panic is never supposed to happen in a released program. It gives information for developers, but nothing user friendly. I'd say: use it for errors that should not happen in production. Under the hood, it probably ends with something like an exit(101);
exit just terminates your process with the code you give it. An exit(0) should mean "everything is okay".

Is it possible to check if `panic` is set to `abort` while a library is compiling?

It may not be a good idea, or it may not be idiomatic, but let's assume that for some reason a library relies on catch_unwind for its business logic.
Can I somehow warn (by failing the compilation with an error message?) a user of this library if they set panic = "abort" in Cargo.toml of their "terminal" crate?
I was thinking about checking some environment variable in build.rs but can't find any variables with this information.
You can use this unstable code in your binary or library to cause an error when -C panic=abort is specified:
#![feature(panic_unwind)]
extern crate panic_unwind;
This causes the following helpful error when the wrong panic strategy is used:
error: the linked panic runtime `panic_unwind` is not compiled with this crate's panic strategy `abort`
When the panic strategy is correct, the extern crate declaration is redundant and does nothing. When the panic strategy is wrong, it causes a linking error, since you can't link two panic runtimes with different strategies into the same binary. Since this check happens when crates are linked, note that if a library is never actually used by the top-level crate, the check isn't run. (But this is a good thing: if your library is unused, there is no need for the check anyway!)
Also, this error happens very late in the compilation process, so while cargo build will error out, cargo check won't complain, since cargo check skips linking for performance reasons.
Unfortunately, there doesn't seem to be a way to do this on the stable channel.

Haskell App Crash: Handling Native Exceptions

I have a Haskell package which contains native code as well. However, I get exceptions (and sometimes segfaults) as I interface with it through the FFI.
Is it possible to handle native exceptions on the Haskell side? I tried using catch/catchIOError in some cases without any success.
In this case, I would also like to debug only the native code. How can I use native debuggers with Haskell/FFI?
Sometimes, segfaults may occur in the C code. Being able to debug this code would help a lot.
If you think the error is in a C component, just use gdb. You should be able to set a breakpoint in your C code and step through it. Compile your code and simply run gdb dist/build/myprogram/myprogram (or wherever it is).
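A session might look roughly like this, where my_c_function stands in for whatever C function you call through the FFI (you may need to compile the C parts with debugging symbols, e.g. by passing -g to the C compiler, which GHC can forward with -optc-g):

$ gdb dist/build/myprogram/myprogram
(gdb) break my_c_function
(gdb) run
(gdb) backtrace

Once the breakpoint is hit (or the program segfaults), backtrace shows the native frames, and you can step through the C code as usual.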
Also, you could have a look at valgrind for detecting things such as allocated memory that is never freed.

Getting a backtrace of other thread

In Linux, to get a backtrace you can use the backtrace() library call, but it only returns the backtrace of the current thread. Is there any way to get a backtrace of some other thread, assuming I know its TID (or pthread_t) and I can guarantee it is sleeping?
It seems that the libunwind (http://www.nongnu.org/libunwind/) project can help. The problem is that it is not supported on CentOS, so I prefer not to use it.
Any other ideas?
Thanks.
I implemented that myself here.
Initially, I wanted to implement something similar to what is suggested here, i.e. somehow getting the top frame pointer of the thread and unwinding it manually (the linked source is derived from Apple's backtrace implementation, so it might be Apple-specific, but the idea is generic).
However, to do that safely (and the source above is not safe and may even be broken anyway), you must suspend the thread while you access its stack. I searched around for different ways to suspend a thread and found this, this, and this. Basically, there is no really good way. The common hack, also used by the HotSpot Java VM, is to use signals, sending a custom signal to the target thread via pthread_kill.
So, since I would need such a signal hack anyway, I can make things a bit simpler and just call backtrace inside the signal handler, which is executed in the target thread (as sandeep also suggested here). This is basically what my implementation does.
If you are also interested in printing the backtrace, i.e. getting some useful debugging information (function name, source code filename, source code line number, ...), read here about an extended backtrace_symbols based on libbfd. Or just see the source here.
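To show the shape of that hack in one self-contained place, here is a rough Rust sketch (assuming Linux and the libc crate; in C you would call backtrace()/backtrace_symbols_fd() from <execinfo.h> in the handler instead). Like the C version, capturing a backtrace inside a signal handler is not strictly async-signal-safe, so treat it as a debugging aid only:

use std::backtrace::Backtrace;
use std::os::unix::thread::JoinHandleExt;
use std::{thread, time::Duration};

extern "C" fn dump_backtrace(_sig: libc::c_int) {
    // Runs in whichever thread received the signal, so this prints that
    // thread's stack, not the sender's.
    eprintln!("{}", Backtrace::force_capture());
}

fn main() {
    unsafe {
        // Install the handler for a custom signal.
        let handler = dump_backtrace as extern "C" fn(libc::c_int);
        libc::signal(libc::SIGUSR1, handler as libc::sighandler_t);
    }

    let worker = thread::spawn(|| loop {
        thread::sleep(Duration::from_millis(100));
    });

    thread::sleep(Duration::from_millis(200));
    unsafe {
        // Direct the signal at the worker; its handler prints its backtrace.
        libc::pthread_kill(worker.as_pthread_t(), libc::SIGUSR1);
    }
    thread::sleep(Duration::from_millis(200));
}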
Signal handling together with backtrace can serve your purpose.
If you have the TID of the thread, you can raise a signal for that thread, and in the handler you can call backtrace. Since the handler executes in that particular thread, the backtrace there is exactly the output you need.
gdb provides these facilities for debugging multi-thread programs:
automatic notification of new threads
‘thread thread-id’, a command to switch among threads
‘info threads’, a command to inquire about existing threads
‘thread apply [thread-id-list] [all] args’, a command to apply a command to a list of threads
thread-specific breakpoints
‘set print thread-events’, which controls printing of messages on thread start and exit.
‘set libthread-db-search-path path’, which lets the user specify which libthread_db to use if the default choice isn't compatible with the program.
So just switch to the required thread in GDB with the command 'thread thread-id'.
Then run 'bt' in that thread's context to print its backtrace.

Kernel panic seems to be unrelated to the changes

I made changes to sched.c in Linux kernel 2.4 (homework), and now the system goes into a kernel panic. The strange thing is: it seems to pass A LOT of boot-time checks and initializations, and panics only at the very end, showing the following stack trace:
update_process_times
do_timer
timer_interrupt
handle_IRQ_event
do_IRQ
call_do_IRQ
do_wp_page
handle_mm_fault
do_page_fault
do_sigaction
sys_rt_sigaction
do_page_fault
error_code
And the error is: "In interrupt handler - not synching"
I know it's hard to tell without any code, but can anybody make an educated guess to point me in the right direction?
I can give you my own personal mantra when debugging kernel problems: "it's always your fault."
I often see issues caused by overwriting memory outside the area I'm working in -- for example, if I feed hardware an incorrect address for DMA. You may also be screwing up a lock somehow; that seems possible in this case if you are seeing a timeout: a lock that is forgotten in the locked state causes a hang, which in turn triggers the timeout.
To me, a panic in update_process_times might suggest a problem with the task struct pointer... but I really have no idea.
Keep in mind that things in the kernel often go wrong long before the failure actually occurs, so a wrong bit anywhere in your code may be to blame, even if it doesn't seem like it should have an effect. If you can, I recommend incrementally adding or removing your changes and checking whether the problem appears, so you can isolate it.

Resources