What is the difference between a direct and indirect leak? - memory-leaks

I got the following output from the LeakSanitizer tool. What is the difference between a direct and indirect leak, as the tool understands it?
13: ==29107==ERROR: LeakSanitizer: detected memory leaks
13:
13: Direct leak of 288 byte(s) in 6 object(s) allocated from:
13: #0 0x7f2ce0089050 in __interceptor_malloc (/nix/store/zahs1kwq4742f6l6h7yy4mdj44zzc1kd-gcc-7-20170409-lib/lib/libasan.so+0xd9050)
13: #1 0x7f2cdfb974fe in qdr_core_subscribe ../src/router_core/route_tables.c:149
13: #2 0x7f2cdfb47ff0 in IoAdapter_init ../src/python_embedded.c:548
13: #3 0x7f2cde966ecd in type_call (/nix/store/1snk2wkpv97an87pk1842fgskl1vqhkr-python-2.7.14/lib/libpython2.7.so.1.0+0x9fecd)
13:
13: Indirect leak of 2368 byte(s) in 1 object(s) allocated from:
13: #0 0x7f2ce0089b88 in __interceptor_posix_memalign (/nix/store/zahs1kwq4742f6l6h7yy4mdj44zzc1kd-gcc-7-20170409-lib/lib/libasan.so+0xd9b88)
13: #1 0x7f2cdfbcc8ea in qd_alloc ../src/alloc_pool.c:182
13: #2 0x7f2cdfbb6c6b in qd_server_connection ../src/server.c:500
13: #3 0x7f2cdfbbe27d in on_accept ../src/server.c:531
13: #4 0x7f2cdfbbe27d in handle_listener ../src/server.c:701
13: #5 0x7f2cdfbbe27d in handle ../src/server.c:844
13: #6 0x7f2cdfbc2837 in thread_run ../src/server.c:921
13: #7 0x7f2cdf0ba233 in start_thread (/nix/store/zpg78y1mf0di6127q6r51kgx2q8cxsvv-glibc-2.25-49/lib/libpthread.so.0+0x7233)
[...]

The LSan wiki design document states:
Another useful feature is being able to distinguish between directly leaked
blocks (not reachable from anywhere) and indirectly leaked blocks (reachable
from other leaked blocks).
Stated another way, indirect leaks are a result of direct leaks. Fixing direct leaks should make the indirect leaks become either fixed or direct leaks themselves (depending on whether their memory management is implemented correctly or not, respectively).

The accepted answer isn't quite correct. In particular
Stated another way, indirect leaks are a result of direct leaks.
is wrong in that it's possible to have only indirect leaks. This situation may arise when a self-referential structure is built and leaked.
Example:
#include <stdlib.h>
struct Foo {
struct Foo *other;
};
void fn(int depth)
{
if (depth > 0) {
// recursion is only necessary to avoid LSan finding "stray" pointers
// and not reporting any leaks at all.
fn(depth - 1);
} else {
struct Foo *f1 = malloc(sizeof(*f1));
struct Foo *f2 = malloc(sizeof(*f2));
f1->other = f2;
f2->other = f1;
}
}
int main()
{
fn(10);
return 0;
}
clang -g -fsanitize=address t.c && ./a.out
=================================================================
==845196==ERROR: LeakSanitizer: detected memory leaks
Indirect leak of 8 byte(s) in 1 object(s) allocated from:
#0 0x49832d in malloc (/tmp/a.out+0x49832d)
#1 0x4c7f8e in fn /tmp/t.c:15:22
#2 0x4c7f71 in fn /tmp/t.c:12:5
#3 0x4c7f71 in fn /tmp/t.c:12:5
#4 0x4c7f71 in fn /tmp/t.c:12:5
#5 0x4c7f71 in fn /tmp/t.c:12:5
#6 0x4c7f71 in fn /tmp/t.c:12:5
#7 0x4c7f71 in fn /tmp/t.c:12:5
#8 0x4c7f71 in fn /tmp/t.c:12:5
#9 0x4c7f71 in fn /tmp/t.c:12:5
#10 0x4c7f71 in fn /tmp/t.c:12:5
#11 0x4c7f71 in fn /tmp/t.c:12:5
#12 0x4c8028 in main /tmp/t.c:23:3
#13 0x7f147d3a7d09 in __libc_start_main csu/../csu/libc-start.c:308:16
Indirect leak of 8 byte(s) in 1 object(s) allocated from:
#0 0x49832d in malloc (/tmp/a.out+0x49832d)
#1 0x4c7f80 in fn /tmp/t.c:14:22
#2 0x4c7f71 in fn /tmp/t.c:12:5
#3 0x4c7f71 in fn /tmp/t.c:12:5
#4 0x4c7f71 in fn /tmp/t.c:12:5
#5 0x4c7f71 in fn /tmp/t.c:12:5
#6 0x4c7f71 in fn /tmp/t.c:12:5
#7 0x4c7f71 in fn /tmp/t.c:12:5
#8 0x4c7f71 in fn /tmp/t.c:12:5
#9 0x4c7f71 in fn /tmp/t.c:12:5
#10 0x4c7f71 in fn /tmp/t.c:12:5
#11 0x4c7f71 in fn /tmp/t.c:12:5
#12 0x4c8028 in main /tmp/t.c:23:3
#13 0x7f147d3a7d09 in __libc_start_main csu/../csu/libc-start.c:308:16
SUMMARY: AddressSanitizer: 16 byte(s) leaked in 2 allocation(s).
Note: no direct leaks, only indirect ones.
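The classification described above can be sketched as a toy model. This is not LSan's implementation, only an illustration under simplifying assumptions: heap blocks are numbered, `edges[i]` lists the blocks that block `i` points to, and `roots` stands in for pointers found in globals, stacks and registers. The names `Status` and `classify` are hypothetical.

```rust
/// How a heap block looks to the toy leak checker below.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Status {
    Reachable,
    DirectLeak,
    IndirectLeak,
}

/// `edges[i]` lists the blocks that block `i` stores pointers to.
/// `roots` lists the blocks pointed to from outside the heap.
fn classify(edges: &[Vec<usize>], roots: &[usize]) -> Vec<Status> {
    // Pass 1: flood fill from the roots to find every reachable block.
    let mut reachable = vec![false; edges.len()];
    let mut worklist: Vec<usize> = roots.to_vec();
    while let Some(block) = worklist.pop() {
        if !reachable[block] {
            reachable[block] = true;
            worklist.extend(edges[block].iter().copied());
        }
    }

    // Everything unreachable is leaked; assume "direct" until shown otherwise.
    let mut status: Vec<Status> = reachable
        .iter()
        .map(|&r| if r { Status::Reachable } else { Status::DirectLeak })
        .collect();

    // Pass 2: a leaked block that another leaked block points to is indirect.
    for (src, targets) in edges.iter().enumerate() {
        if reachable[src] {
            continue;
        }
        for &dst in targets {
            if dst != src && !reachable[dst] {
                status[dst] = Status::IndirectLeak;
            }
        }
    }
    status
}

fn main() {
    // Leaked list head 0 -> 1: one direct leak and one indirect leak.
    println!("{:?}", classify(&[vec![1], vec![]], &[]));
    // Leaked cycle 0 <-> 1: both blocks are indirect, as in the C example.
    println!("{:?}", classify(&[vec![1], vec![0]], &[]));
}
```

In this model a cycle has no block that is free of incoming pointers from other leaked blocks, which is why the report can contain indirect leaks only.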


Rust compiler generating intrinsic llvm.add call instruction while clang generates normal add?

While working with llvm ir I noticed that when compiling a simple addition in c, clang will generate a normal llvm add instruction. However when I compile the same code written in rust, rustc generates a call to
%38 = call { i32, i1 } @llvm.ssub.with.overflow.i32(i32 %37, i32 5), !dbg !597
%39 = extractvalue { i32, i1 } %38, 0, !dbg !597
%40 = extractvalue { i32, i1 } %38, 1, !dbg !597
%41 = call i1 @llvm.expect.i1(i1 %40, i1 false), !dbg !597
br i1 %41, label %panic1, label %bb9, !dbg !597
followed by two extractvalue instructions and some corresponding error handling in case an overflow has occurred.
Why does it do that? As far as I understand, the normal add instruction also handles overflow, through the nsw keyword:
If the nuw and/or nsw keywords are present, the result value of the add is a poison value if unsigned and/or signed overflow, respectively, occurs.
As I understand it, when the IR is further lowered to assembly, it will result in the same code?
TL;DR:
As I understand it, when the IR is further lowered to assembly, it will result in the same code?
No, it will not. rustc in debug mode behaves roughly like clang with the undefined behaviour sanitizer (UBSan) enabled.
Explanation
In debug mode, rustc generates code that detects integer overflow and panics. For example,
pub fn bad_add(num: i32) -> i32 {
num + i32::MAX
}
results in:
define i32 @_ZN7example7bad_add17ha9c5f96e25ec3c52E(i32 %num) unnamed_addr #0 !dbg !5 {
start:
%0 = call { i32, i1 } @llvm.sadd.with.overflow.i32(i32 %num, i32 2147483647), !dbg !10
%_3.0 = extractvalue { i32, i1 } %0, 0, !dbg !10
%_3.1 = extractvalue { i32, i1 } %0, 1, !dbg !10
%1 = call i1 @llvm.expect.i1(i1 %_3.1, i1 false), !dbg !10
br i1 %1, label %panic, label %bb1, !dbg !10
bb1: ; preds = %start
ret i32 %_3.0, !dbg !11
panic: ; preds = %start
call void @_ZN4core9panicking5panic17hab046c3856b52f65E([0 x i8]* align 1 bitcast ([28 x i8]* @str.0 to [0 x i8]*), i64 28, %"core::panic::location::Location"* align 8 bitcast (<{ i8*, [16 x i8] }>* @alloc7 to %"core::panic::location::Location"*)) #4, !dbg !10
unreachable, !dbg !10
}
However, in release mode (e.g. with -C opt-level=3) we get
define i32 @_ZN7example7bad_add17ha9c5f96e25ec3c52E(i32 %num) unnamed_addr #0 !dbg !5 {
%0 = add i32 %num, 2147483647, !dbg !10
ret i32 %0, !dbg !11
}
Note that the checks and calls to panic are now removed.
With C/clang we won't get exactly the same result, e.g.
#include <limits.h>
// Type your code here, or load an example.
int bad_add(int num) {
return INT_MAX + num;
}
will result in:
define dso_local i32 @bad_add(i32 %0) #0 {
%2 = alloca i32, align 4
store i32 %0, i32* %2, align 4
%3 = load i32, i32* %2, align 4
%4 = add nsw i32 2147483647, %3
ret i32 %4
}
To generate similar code in C you can enable UBSan, e.g. by adding -fsanitize=undefined, or more specifically just the signed integer overflow check with -fsanitize=signed-integer-overflow, to your command line. This is commonly enabled when running fuzz tests.
Enabling UBSan with clang, we get very similar (though not identical) output to rustc in debug mode:
define dso_local i32 @bad_add(i32 %0) #0 {
%2 = alloca i32, align 4
store i32 %0, i32* %2, align 4
%3 = load i32, i32* %2, align 4
%4 = call { i32, i1 } @llvm.sadd.with.overflow.i32(i32 2147483647, i32 %3), !nosanitize !2
%5 = extractvalue { i32, i1 } %4, 0, !nosanitize !2
%6 = extractvalue { i32, i1 } %4, 1, !nosanitize !2
%7 = xor i1 %6, true, !nosanitize !2
br i1 %7, label %10, label %8, !prof !3, !nosanitize !2
8: ; preds = %1
%9 = zext i32 %3 to i64, !nosanitize !2
call void @__ubsan_handle_add_overflow(i8* bitcast ({ { [10 x i8]*, i32, i32 }, { i16, i16, [6 x i8] }* }* @1 to i8*), i64 2147483647, i64 %9) #3, !nosanitize !2
br label %10, !nosanitize !2
10: ; preds = %8, %1
ret i32 %5
}
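Note that we now get the same call to llvm.sadd.with.overflow for the C function with UBSan enabled. The overflow branch calls __ubsan_handle_add_overflow, which reports the problem at runtime. By default UBSan then lets execution continue (note the br label %10 after the call); building with -fno-sanitize-recover=signed-integer-overflow makes the program abort instead, which is closer to the behaviour of Rust's panic.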
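On the Rust side, the overflow behaviour can also be chosen explicitly instead of depending on the build mode. The following is a small sketch using standard integer methods; the wrapper names (`add_with_flag`, `add_checked`, `add_wrapping`) are made up for illustration. `overflowing_add` returns a value/flag pair that conceptually mirrors the `{ i32, i1 }` result of `llvm.sadd.with.overflow.i32`, though no claim is made here about the exact IR rustc emits for it.

```rust
/// Wrapped sum plus a flag saying whether signed overflow occurred.
fn add_with_flag(a: i32, b: i32) -> (i32, bool) {
    a.overflowing_add(b)
}

/// Returns `None` on overflow instead of panicking or wrapping.
fn add_checked(a: i32, b: i32) -> Option<i32> {
    a.checked_add(b)
}

/// Always wraps in two's complement, in debug and release builds alike.
fn add_wrapping(a: i32, b: i32) -> i32 {
    a.wrapping_add(b)
}

fn main() {
    println!("{:?}", add_with_flag(1, i32::MAX));
    println!("{:?}", add_checked(1, i32::MAX));
    println!("{}", add_wrapping(1, i32::MAX));
}
```

Using these methods makes the intent visible at the call site, so the result does not change between debug and release builds.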

Remove superfluous `andi ..., 0xff` instruction from inline assembly when outputting an `i8`

I have the following function using inline assembly, targeting mipsel-unknown-linux-gnu:
#![feature(asm)]
#[no_mangle]
pub unsafe extern "C" fn f(ptr: u32) {
let value: i8;
asm!(
"lb $v0, ($a0)",
in("$4") ptr,
out("$2") value,
);
asm!(
"sb $v0, ($a0)",
in("$4") ptr,
in("$2") value,
);
}
I expected this to compile into the following:
lb $v0, ($a0)
sb $v0, ($a0)
jr $ra
nop
Note: In this example, it's possible the compiler reorders the store instruction to after the jump to use the delay slot, but in my actual use case, I return via an asm block, so this is not a worry. Given this I expected the assembly above exactly.
Instead, what I got was:
00000000 <f>:
0: 80820000 lb v0,0(a0)
4: 304200ff andi v0,v0,0xff
8: a0820000 sb v0,0(a0)
c: 03e00008 jr ra
10: 00000000 nop
The compiler seems to not have trusted me that the output is an i8 and inserted an andi $v0, 0xff instruction there.
I need to produce the assembly I specified above exactly, so I'd like to get rid of the andi instruction, while keeping the type of value i8.
My use case for this is that I want to produce an exact assembly output from this function, while being able to later fork it and add Rust code that interacts with the existing assembly code to extend the function. For this I'd like value's type to be properly described as an i8 on the Rust side.
Edit
Looking at the llvm-ir generated by rust, the andi instruction seems to have been added by rustc, not llvm.
; Function Attrs: nonlazybind uwtable
define void @f(i32 %ptr) unnamed_addr #0 {
start:
%0 = tail call i32 asm sideeffect alignstack "lbu $$v0, ($$a0)", "=&{$2},{$4},~{memory}"(i32 %ptr) #1, !srcloc !2
%1 = and i32 %0, 255 # <--------- Over here
tail call void asm sideeffect alignstack "sb $$v0, ($$a0)", "{$4},{$2},~{memory}"(i32 %ptr, i32 %1) #1, !srcloc !3
ret void
}
There is also no mention of an i8, so I'm not quite sure what rustc is doing here.
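As a side illustration only (it does not explain why rustc emits the mask, nor how to remove it), the effect of the extra `and` can be modelled in plain Rust. The helper names below are hypothetical: `sign_extended` models what `lb` leaves in a 32-bit register, `masked` models the register after `andi ..., 0xff`, and `stored_byte` models what `sb` writes back.

```rust
/// Register contents after a sign-extending byte load (like `lb`).
fn sign_extended(value: i8) -> u32 {
    value as i32 as u32
}

/// Register contents after additionally masking with 0xff (like `andi`).
fn masked(value: i8) -> u32 {
    sign_extended(value) & 0xff
}

/// The byte a store of the low 8 bits (like `sb`) would write.
fn stored_byte(register: u32) -> u8 {
    register as u8
}

fn main() {
    let v: i8 = -1;
    println!("{:#010x}", sign_extended(v));
    println!("{:#010x}", masked(v));
    // The stored byte is the same with or without the mask.
    println!("{}", stored_byte(sign_extended(v)) == stored_byte(masked(v)));
}
```

In this model the mask changes the upper 24 bits of the register but never the byte that ends up in memory, so the difference matters only when the exact instruction sequence is required, as it is in the question.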

Is it possible to print a backtrace in Rust without panicking?

Is it possible to print a backtrace (assuming RUST_BACKTRACE is enabled) without panicking? It seems that the only way of doing that is calling via panic!. If not, is there a reason for it?
Rust uses the backtrace crate to print the backtrace in case of panics (this was merged in PR #60852).
A simple example can be found in the crate documentation:
use backtrace::Backtrace;
fn main() {
let bt = Backtrace::new();
// do_some_work();
println!("{:?}", bt);
}
which gives for example
stack backtrace:
0: playground::main::h6849180917e9510b (0x55baf1676201)
at src/main.rs:4
1: std::rt::lang_start::{{closure}}::hb3ceb20351fe39ee (0x55baf1675faf)
at /rustc/3c235d5600393dfe6c36eeed34042efad8d4f26e/src/libstd/rt.rs:64
2: {{closure}} (0x55baf16be492)
at src/libstd/rt.rs:49
do_call<closure,i32>
at src/libstd/panicking.rs:293
3: __rust_maybe_catch_panic (0x55baf16c00b9)
at src/libpanic_unwind/lib.rs:87
4: try<i32,closure> (0x55baf16bef9c)
at src/libstd/panicking.rs:272
catch_unwind<closure,i32>
at src/libstd/panic.rs:388
lang_start_internal
at src/libstd/rt.rs:48
5: std::rt::lang_start::h2c4217f9057b6ddb (0x55baf1675f88)
at /rustc/3c235d5600393dfe6c36eeed34042efad8d4f26e/src/libstd/rt.rs:64
6: main (0x55baf16762f9)
7: __libc_start_main (0x7fab051b9b96)
8: _start (0x55baf1675e59)
9: <unknown> (0x0)
Since Rust 1.65.0 you can use std::backtrace::Backtrace from the standard library:
use std::backtrace::Backtrace;
fn main() {
println!("Custom backtrace: {}", Backtrace::capture());
// ... or forcibly capture the backtrace regardless of environment variable configuration
println!("Custom backtrace: {}", Backtrace::force_capture());
}
Documentation: https://doc.rust-lang.org/std/backtrace/struct.Backtrace.html
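Because `Backtrace::capture()` respects the environment variables, it can be useful to check whether a trace was actually recorded before printing it. A minimal sketch using `Backtrace::status()` follows; the helper name `describe` is made up for illustration.

```rust
use std::backtrace::{Backtrace, BacktraceStatus};

/// Turns the capture status into a short label.
fn describe(bt: &Backtrace) -> &'static str {
    match bt.status() {
        BacktraceStatus::Captured => "captured",
        BacktraceStatus::Disabled => "disabled",
        BacktraceStatus::Unsupported => "unsupported",
        // BacktraceStatus is non-exhaustive, so a fallback arm is required.
        _ => "unknown",
    }
}

fn main() {
    // Honours RUST_BACKTRACE / RUST_LIB_BACKTRACE.
    let bt = Backtrace::capture();
    println!("capture(): {}", describe(&bt));

    // Ignores the environment variables.
    let forced = Backtrace::force_capture();
    println!("force_capture(): {}", describe(&forced));
    if forced.status() == BacktraceStatus::Captured {
        println!("{forced}");
    }
}
```

Neither call panics; when capturing is disabled, `capture()` simply yields a value whose status is `Disabled`.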

LLVM produced by rustc gives error about argument type of main when run with lli

I'm trying to learn a little about the LLVM IR, particularly what exactly rustc outputs. I'm having a little bit of trouble running even a very simple case.
I put the following in a source file simple.rs:
fn main() {
let x = 7u32;
let y = x + 2;
}
and run rustc --emit llvm-ir simple.rs to get the file simple.ll, containing
; ModuleID = 'simple.cgu-0.rs'
source_filename = "simple.cgu-0.rs"
target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"
; Function Attrs: uwtable
define internal void @_ZN6simple4main17h8ac50d7470339b75E() unnamed_addr #0 {
start:
br label %bb1
bb1: ; preds = %start
ret void
}
define i64 @main(i64, i8**) unnamed_addr {
top:
%2 = call i64 @_ZN3std2rt10lang_start17ha09816a4e25587eaE(void ()* @_ZN6simple4main17h8ac50d7470339b75E, i64 %0, i8** %1)
ret i64 %2
}
declare i64 @_ZN3std2rt10lang_start17ha09816a4e25587eaE(void ()*, i64, i8**) unnamed_addr
attributes #0 = { uwtable }
!llvm.module.flags = !{!0}
!0 = !{i32 1, !"PIE Level", i32 2}
I then try to run this with the command
lli-3.9 -load ~/.multirust/toolchains/nightly-x86_64-unknown-linux-gnu/lib/libstd-35ad9950c7e5074b.so simple.ll
but I get the error message
LLVM ERROR: Invalid type for first argument of main() supplied
I'm able to make a minimal reproduction of this as follows: I make a file called s2.ll, containing
define i32 @main(i64, i8**) {
ret i32 42
}
and running lli-3.9 s2.ll gives the same error message. But if I change the contents of s2.ll to
define i32 @main(i32, i8**) {
ret i32 42
}
(i.e. I've changed the type of argc in main) then lli-3.9 s2.ll runs, and echo $? reveals that it did indeed return 42.
I don't think I should have to pass in the i64 explicitly: my argument list of C strings should be put into memory somewhere, and the pointer and length passed to main automatically, right? I therefore assume that I'm doing something wrong in the way I invoke lli, but I have no idea what.
Rust marks its entry point (the function marked with the #[start] attribute, by default the function lang_start in the standard library) as taking an argc parameter of type isize. This is a bug because it should have the type of a C int, so it should be 32 bits on a 64-bit platform, but isize is 64 bits. However, due to the way 64-bit calling conventions work, this happens to still work correctly. The same issue also exists for the return type.
A fix for this was committed on 2017-10-01 and should be present in Rust 1.22.
lli is apparently more strict about checking the type of main which is why it gives the error. But if you use llc instead, it should work correctly.
To get the correct main signature, you can cancel the default main by putting #![no_main] at the top of the module, and provide your own main marked with #[no_mangle]. But note that this will skip the standard library's initialization.
#![no_main]
#[no_mangle]
pub extern fn main(_argc: i32, _argv: *const *const u8) -> i32 {
0
}
See also:
documentation about lang_items and disabling main
Tracking issue for #[start] feature, where some people mention isize not being correct.
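The width mismatch described above can be checked directly. This is a small sketch that only compares type sizes on the current target; the helper names `c_int_bits` and `isize_bits` are made up for illustration.

```rust
use std::mem::size_of;
use std::os::raw::c_int;

/// Width in bits of the C `int` type on the current target.
fn c_int_bits() -> usize {
    size_of::<c_int>() * 8
}

/// Width in bits of `isize`, which always matches the pointer width.
fn isize_bits() -> usize {
    size_of::<isize>() * 8
}

fn main() {
    println!("C int: {} bits, isize: {} bits", c_int_bits(), isize_bits());
}
```

On a typical 64-bit Linux target the two widths differ, which is the mismatch that lli rejects for the first argument of main.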

What does the "box" keyword do?

In Rust, we can use the Box<T> type to allocate things on the heap. This type is used to safely abstract pointers to heap memory. Box<T> is provided by the Rust standard library.
I was curious about how Box<T> allocation is implemented, so I found its source code. Here is the code for Box<T>::new (as of Rust 1.0):
impl<T> Box<T> {
/// Allocates memory on the heap and then moves `x` into it.
/// [...]
#[stable(feature = "rust1", since = "1.0.0")]
#[inline(always)]
pub fn new(x: T) -> Box<T> {
box x
}
}
The only line in the implementation returns the value box x. This box keyword is not explained anywhere in the official documentation; in fact, it is only mentioned briefly on the std::boxed documentation page.
NOTE: This reply is a bit old. Since it talks about internals and unstable features, things have changed a little bit. The basic mechanism remains the same though, so the answer is still capable of explaining the underlying mechanisms of box.
What does box x usually use to allocate and free memory?
The answer is the functions marked with lang items exchange_malloc for allocation and exchange_free for freeing. You can see the implementation of those in the default standard library at heap.rs#L112 and heap.rs#L125.
In the end the box x syntax depends on the following lang items:
owned_box on a Box struct to encapsulate the allocated pointer. This struct does not need an explicit Drop implementation; the compiler generates one automatically.
exchange_malloc to allocate the memory.
exchange_free to free the previously allocated memory.
This can be effectively seen in the lang items chapter of the unstable rust book using this no_std example:
#![feature(lang_items, box_syntax, start, no_std, libc)]
#![no_std]
extern crate libc;
extern {
fn abort() -> !;
}
#[lang = "owned_box"]
pub struct Box<T>(*mut T);
#[lang = "exchange_malloc"]
unsafe fn allocate(size: usize, _align: usize) -> *mut u8 {
let p = libc::malloc(size as libc::size_t) as *mut u8;
// malloc failed
if p as usize == 0 {
abort();
}
p
}
#[lang = "exchange_free"]
unsafe fn deallocate(ptr: *mut u8, _size: usize, _align: usize) {
libc::free(ptr as *mut libc::c_void)
}
#[start]
fn main(argc: isize, argv: *const *const u8) -> isize {
let x = box 1;
0
}
#[lang = "stack_exhausted"] extern fn stack_exhausted() {}
#[lang = "eh_personality"] extern fn eh_personality() {}
#[lang = "panic_fmt"] fn panic_fmt() -> ! { loop {} }
Notice how Drop was not implemented for the Box struct? Well let's see the LLVM IR generated for main:
define internal i64 @_ZN4main20hbd13b522fdb5b7d4ebaE(i64, i8**) unnamed_addr #1 {
entry-block:
%argc = alloca i64
%argv = alloca i8**
%x = alloca i32*
store i64 %0, i64* %argc, align 8
store i8** %1, i8*** %argv, align 8
%2 = call i8* @_ZN8allocate20hf9df30890c435d76naaE(i64 4, i64 4)
%3 = bitcast i8* %2 to i32*
store i32 1, i32* %3, align 4
store i32* %3, i32** %x, align 8
call void @"_ZN14Box$LT$i32$GT$9drop.103617h8817b938807fc41eE"(i32** %x)
ret i64 0
}
The allocate (_ZN8allocate20hf9df30890c435d76naaE) was called as expected to build the Box, meanwhile... Look! A Drop method for the Box (_ZN14Box$LT$i32$GT$9drop.103617h8817b938807fc41eE)! Let's see the IR for this method:
define internal void @"_ZN14Box$LT$i32$GT$9drop.103617h8817b938807fc41eE"(i32**) unnamed_addr #0 {
entry-block:
%1 = load i32** %0
%2 = ptrtoint i32* %1 to i64
%3 = icmp ne i64 %2, 2097865012304223517
br i1 %3, label %cond, label %next
next: ; preds = %cond, %entry-block
ret void
cond: ; preds = %entry-block
%4 = bitcast i32* %1 to i8*
call void @_ZN10deallocate20he2bff5e01707ad50VaaE(i8* %4, i64 4, i64 4)
br label %next
}
There it is, deallocate (_ZN10deallocate20he2bff5e01707ad50VaaE) being called in the compiler-generated Drop!
Notice that even in the standard library the Drop trait is not implemented by user code. Indeed Box is a bit of a magical struct.
Before box was marked as unstable, it was used as a shorthand for calling Box::new. However, it's always been intended to be able to allocate arbitrary types, such as Rc, or to use arbitrary allocators. Neither of these have been finalized, so it wasn't marked as stable for the 1.0 release. This is done to prevent supporting a bad decision for all of Rust 1.x.
For further reference, you can read the RFC that changed the "placement new" syntax and also feature gated it.
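On stable Rust, the allocate, write, and free sequence seen in the IR above can be approximated by hand with the `std::alloc` API. This is only a sketch of the idea, not how the compiler or `Box` is implemented; the function name `manual_box_roundtrip` is made up for illustration.

```rust
use std::alloc::{alloc, dealloc, handle_alloc_error, Layout};

/// Allocates room for one i32, stores `value`, reads it back, then frees it.
fn manual_box_roundtrip(value: i32) -> i32 {
    // Size and alignment, like the (size, align) pair passed to the allocator.
    let layout = Layout::new::<i32>();
    unsafe {
        let raw = alloc(layout);
        if raw.is_null() {
            // Allocation failed: report it through the standard hook.
            handle_alloc_error(layout);
        }
        let typed = raw.cast::<i32>();
        typed.write(value);
        let out = typed.read();
        // The step that Box's generated drop performs automatically.
        dealloc(raw, layout);
        out
    }
}

fn main() {
    println!("{}", manual_box_roundtrip(1));
    // The safe equivalent: allocation and deallocation are both implicit.
    let boxed = Box::new(1);
    println!("{}", *boxed);
}
```

With `Box::new`, the `dealloc` step happens when the box goes out of scope, which corresponds to the generated drop call in the IR.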
box does exactly what Box::new() does - it creates an owned box.
I believe you can't find the implementation of the box keyword because it is currently hardcoded to work with owned boxes, and the Box type is a lang item:
#[lang = "owned_box"]
#[stable(feature = "rust1", since = "1.0.0")]
#[fundamental]
pub struct Box<T>(Unique<T>);
Because it is a lang item, the compiler has special logic to handle its instantiation, which it can link to the box keyword.
I believe that the compiler delegates box allocation to functions in the alloc::heap module.
As for what the box keyword does and is supposed to do in general, Shepmaster's answer describes it perfectly.
