Understanding ASAN output - linux

I can't figure out why ASAN gives this output, or why I can't see where (in which line) the bug is in my code. Is the bug even in my code, as the report says, or is it in one of the libraries the program uses?
This is how I build my project:
CC=clang CXX=clang++ meson -Db_sanitize=address -Db_lundef=false build-clang
Then I set the environment variables and run the executable like this:
ASAN_OPTIONS=symbolize=1 ASAN_SYMBOLIZER_PATH=/usr/bin/llvm-symbolizer ./executable
And yes, that is a valid path for llvm-symbolizer.
So is there a way for me to find out what executable+0x431340 means and where it points to in my code?
=================================================================
==13110==ERROR: LeakSanitizer: detected memory leaks
Direct leak of 16384 byte(s) in 1 object(s) allocated from:
#0 0x4e1340 in __interceptor_malloc (/home/maysara/Desktop/testscreen/build-clang/src/excutable+0x4e1340)
#1 0x7ff16a2ccab8 in g_malloc (/usr/lib/x86_64-linux-gnu/libglib-2.0.so.0+0x51ab8)
Direct leak of 4352 byte(s) in 17 object(s) allocated from:
#0 0x4e1340 in __interceptor_malloc (/home/maysara/Desktop/testscreen/build-clang/src/excutable+0x4e1340)
#1 0x7ff165e518ed (/usr/lib/x86_64-linux-gnu/libfontconfig.so.1+0x1d8ed)
Direct leak of 3840 byte(s) in 6 object(s) allocated from:
#0 0x4e17c0 in realloc (/home/maysara/Desktop/testscreen/build-clang/src/excutable+0x4e17c0)
#1 0x7ff165e51998 (/usr/lib/x86_64-linux-gnu/libfontconfig.so.1+0x1d998)
Direct leak of 24 byte(s) in 1 object(s) allocated from:
#0 0x4e1340 in __interceptor_malloc (/home/maysara/Desktop/testscreen/build-clang/src/excutable+0x4e1340)
#1 0x7ff16a2ccab8 in g_malloc (/usr/lib/x86_64-linux-gnu/libglib-2.0.so.0+0x51ab8)
#2 0x7ff168b5910c in g_closure_invoke (/usr/lib/x86_64-linux-gnu/libgobject-2.0.so.0+0x1010c)
Indirect leak of 10016 byte(s) in 313 object(s) allocated from:
#0 0x4e1340 in __interceptor_malloc (/home/maysara/Desktop/testscreen/build-clang/src/excutable+0x4e1340)
#1 0x7ff165e3ffef (/usr/lib/x86_64-linux-gnu/libfontconfig.so.1+0xbfef)
Indirect leak of 4887 byte(s) in 405 object(s) allocated from:
#0 0x43db60 in strdup (/home/maysara/Desktop/testscreen/build-clang/src/excutable+0x43db60)
#1 0x7ff165e512f4 in FcValueSave (/usr/lib/x86_64-linux-gnu/libfontconfig.so.1+0x1d2f4)
Indirect leak of 4320 byte(s) in 135 object(s) allocated from:
#0 0x4e1568 in calloc (/home/maysara/Desktop/testscreen/build-clang/src/excutable+0x4e1568)
#1 0x7ff165e51fd8 (/usr/lib/x86_64-linux-gnu/libfontconfig.so.1+0x1dfd8)
Indirect leak of 2400 byte(s) in 75 object(s) allocated from:
#0 0x4e1568 in calloc (/home/maysara/Desktop/testscreen/build-clang/src/excutable+0x4e1568)
#1 0x7ff165e515c4 (/usr/lib/x86_64-linux-gnu/libfontconfig.so.1+0x1d5c4)
Indirect leak of 576 byte(s) in 18 object(s) allocated from:
#0 0x4e1568 in calloc (/home/maysara/Desktop/testscreen/build-clang/src/excutable+0x4e1568)
#1 0x7ff165e51440 (/usr/lib/x86_64-linux-gnu/libfontconfig.so.1+0x1d440)
Indirect leak of 144 byte(s) in 3 object(s) allocated from:
#0 0x4e1340 in __interceptor_malloc (/home/maysara/Desktop/testscreen/build-clang/src/excutable+0x4e1340)
#1 0x7ff165e4bacd in FcLangSetCreate (/usr/lib/x86_64-linux-gnu/libfontconfig.so.1+0x17acd)
SUMMARY: AddressSanitizer: 46943 byte(s) leaked in 974 allocation(s).

In order to resolve the code addresses to source code locations, you need to compile the code with debug symbols enabled, e.g. with -g on the compiler command line or, as is common with build systems, with the environment variables CFLAGS and CXXFLAGS set accordingly:
CFLAGS="-g"
CXXFLAGS="-g"
That needs to be done for the code actually referenced, meaning here for example not only the code of executable, but also the linked libraries like glib, fontconfig, etc. if you want all addresses resolved.
Since these libraries are probably installed through a system package manager, you would need to look in your distribution's documentation for how to install debug symbols. For Ubuntu, for example, there are usually variants of packages with a -dbg suffix.
In any case your stack traces do not look very helpful anyway, so it is not clear that finding the source code locations will be any help to you. You might want to recompile your executable with -fno-omit-frame-pointer and/or set the environment variable ASAN_OPTIONS=fast_unwind_on_malloc=0 when running the executable to try and improve them. See also the ASAN faq.
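As for what a frame like (module+0x...) means: it names the binary or shared library the address belongs to and the offset of that address within it. The following is a minimal Python sketch, not part of any sanitizer tooling, that pulls those pairs out of report lines so they can be handed to a symbolizer by hand. The function names parse_frame and addr2line_command are hypothetical, and the sketch assumes the module offset can be passed to addr2line unchanged, which is typical for shared libraries but not guaranteed for every binary.

```python
import re

# Matches frames with or without a resolved symbol, for example:
#   #1 0x7ff16a2ccab8 in g_malloc (/usr/lib/.../libglib-2.0.so.0+0x51ab8)
#   #1 0x7ff165e518ed (/usr/lib/.../libfontconfig.so.1+0x1d8ed)
FRAME_RE = re.compile(
    r"#(?P<index>\d+)\s+0x(?P<pc>[0-9a-f]+)"
    r"(?:\s+in\s+(?P<symbol>.+?))?"
    r"\s+\((?P<module>[^()]+)\+0x(?P<offset>[0-9a-f]+)\)"
)


def parse_frame(line):
    """Return the parts of one stack frame line, or None for other lines."""
    m = FRAME_RE.search(line)
    if m is None:
        return None
    return {
        "index": int(m["index"]),
        "pc": int(m["pc"], 16),          # absolute address at run time
        "symbol": m["symbol"],           # None when the frame is unresolved
        "module": m["module"],           # file the address belongs to
        "offset": int(m["offset"], 16),  # offset inside that file
    }


def addr2line_command(frame):
    """Build (but do not run) an addr2line call for a parsed frame.

    Assumption: the offset can be used directly as the lookup address.
    """
    return ["addr2line", "-e", frame["module"], "-f", "-C", hex(frame["offset"])]
```

The resulting command only produces file and line information if the module in question was built with debug symbols or has its debug package installed.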

Related

valgrind reports memory leak in gsoap

Valgrind is detecting some memory leaks in gSOAP. This is a very basic example:
//file hr.cpp
#include "soapH.h"
#include "ns1.nsmap"
int main()
{
    struct soap *soap_ = soap_new();
    soap_destroy(soap_);
    soap_end(soap_);
    //soap_done(soap_);
    soap_free(soap_);
    return 0;
}
To compile this code, I use:
g++ -D DEBUG -g -std=c++0x -o hr hr.cpp soapC.cpp stdsoap2.cpp
And these are the memory leaks reported by Valgrind:
==5290== HEAP SUMMARY:
==5290== in use at exit: 72,800 bytes in 4 blocks
==5290== total heap usage: 18 allocs, 14 frees, 244,486 bytes allocated
==5290==
==5290== 32 bytes in 1 blocks are definitely lost in loss record 1 of 4
==5290== at 0x4C2DB8F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==5290== by 0x41D6FF: soap_track_malloc (stdsoap2.cpp:8952)
==5290== by 0x421347: soap_set_logfile (stdsoap2.cpp:10022)
==5290== by 0x421402: soap_set_test_logfile (stdsoap2.cpp:10062)
==5290== by 0x422391: soap_init_REQUIRE_lib_v20851 (stdsoap2.cpp:10405)
==5290== by 0x43DD31: soap::soap() (stdsoap2.cpp:20074)
==5290== by 0x41AF2E: soap_new_REQUIRE_lib_v20851 (stdsoap2.cpp:8109)
==5290== by 0x40238C: main (hr.cpp:6)
==5290==
==5290== 32 bytes in 1 blocks are definitely lost in loss record 2 of 4
==5290== at 0x4C2DB8F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==5290== by 0x41D6FF: soap_track_malloc (stdsoap2.cpp:8952)
==5290== by 0x421347: soap_set_logfile (stdsoap2.cpp:10022)
==5290== by 0x4213DA: soap_set_sent_logfile (stdsoap2.cpp:10050)
==5290== by 0x4223A2: soap_init_REQUIRE_lib_v20851 (stdsoap2.cpp:10406)
==5290== by 0x43DD31: soap::soap() (stdsoap2.cpp:20074)
==5290== by 0x41AF2E: soap_new_REQUIRE_lib_v20851 (stdsoap2.cpp:8109)
==5290== by 0x40238C: main (hr.cpp:6)
==5290==
==5290== 32 bytes in 1 blocks are definitely lost in loss record 3 of 4
==5290== at 0x4C2DB8F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==5290== by 0x41D6FF: soap_track_malloc (stdsoap2.cpp:8952)
==5290== by 0x421347: soap_set_logfile (stdsoap2.cpp:10022)
==5290== by 0x4213B2: soap_set_recv_logfile (stdsoap2.cpp:10038)
==5290== by 0x4223B3: soap_init_REQUIRE_lib_v20851 (stdsoap2.cpp:10407)
==5290== by 0x43DD31: soap::soap() (stdsoap2.cpp:20074)
==5290== by 0x41AF2E: soap_new_REQUIRE_lib_v20851 (stdsoap2.cpp:8109)
==5290== by 0x40238C: main (hr.cpp:6)
==5290==
==5290== LEAK SUMMARY:
==5290== definitely lost: 96 bytes in 3 blocks
==5290== indirectly lost: 0 bytes in 0 blocks
==5290== possibly lost: 0 bytes in 0 blocks
==5290== still reachable: 72,704 bytes in 1 blocks
==5290== suppressed: 0 bytes in 0 blocks
==5290== Reachable blocks (those to which a pointer was found) are not shown.
==5290== To see them, rerun with: --leak-check=full --show-leak-kinds=all
==5290==
==5290== For counts of detected and suppressed errors, rerun with: -v
==5290== ERROR SUMMARY: 3 errors from 3 contexts (suppressed: 0 from 0)
What am I doing wrong?
Am I missing a step? I have read through the gSOAP documentation and searched the internet, but I did not find anything.
Thanks a lot!
Do not compile your code with -DDEBUG to test with valgrind, because it activates gSOAP's internal memory leak checker. This may interfere with valgrind.
EDIT:
I verified this with the latest release, 2.8.61, which shows no leak in DEBUG mode or otherwise. I suggest upgrading to gSOAP 2.8.61.

Tie System.map values to kernel addresses

I'm trying to boot a custom kernel on a BeagleBoneBlack. u-boot works, and loads stuff as follows:
U-Boot 2016.03 (Apr 26 2016 - 11:32:30 +0000)
Watchdog enabled
I2C: ready
DRAM: 512 MiB
MMC: OMAP SD/MMC: 0, OMAP SD/MMC: 1
*** Warning - bad CRC, using default environment
Net: <ethaddr> not set. Validating first E-fuse MAC
cpsw, usb_ether
Press SPACE to abort autoboot in 2 seconds
switch to partitions #0, OK
mmc0 is current device
Scanning mmc 0:1...
Found /boot/extlinux/extlinux.conf
Retrieving file: /boot/extlinux/extlinux.conf
278 bytes read in 39 ms (6.8 KiB/s)
1: Linux grsec
Retrieving file: /boot/initramfs-grsec
5875398 bytes read in 349 ms (16.1 MiB/s)
Retrieving file: /boot/vmlinuz-4.4.8-grsec
3140944 bytes read in 211 ms (14.2 MiB/s)
append: BOOT_IMAGE=/boot/vmlinuz-4.4.8-grsec modules=loop,squashfs,sd-mod,usb-storage modloop=/boot/modloop-grsec console=ttyO0,115200n8
Retrieving file: /boot/dtbs/am335x-boneblack.dtb
31516 bytes read in 426 ms (71.3 KiB/s)
Kernel image @ 0x82000000 [ 0x000000 - 0x2fed50 ]
## Flattened Device Tree blob at 88000000
Booting using the fdt blob at 0x88000000
Loading Ramdisk to 8fa65000, end 8ffff6c6 ... OK
Loading Device Tree to 8fa5a000, end 8fa64b1b ... OK
Starting kernel ...
Everything looks good so far, I think. But the kernel fails to load. I can't get access to anything from the kernel with low level debugging enabled in the kernel options either.
I've attached a J-Link JTAG debugger and was hoping to trace through to the problem, but I'm having trouble tying the System.map through to the disassembly.
Here, for example, is the start of the System.map:
00000000 t __vectors_start
00000024 A cpu_ca8_suspend_size
00000024 A cpu_v7_suspend_size
0000002c A cpu_ca9mp_suspend_size
00001000 t __stubs_start
00001004 t vector_rst
00001020 t vector_irq
000010a0 t vector_dabt
00001120 t vector_pabt
000011a0 t vector_und
00001220 t vector_addrexcptn
00001240 t vector_fiq
00001240 T vector_fiq_offset
80204000 A swapper_pg_dir
80208000 T _text
80208000 T stext
8020808c t __create_page_tables
8020813c t __turn_mmu_on_loc
80208148 t __fixup_smp
802081b0 t __fixup_smp_on_up
802081d4 t __fixup_pv_table
80208228 t __vet_atags
80208280 T __idmap_text_start
80208280 T __turn_mmu_on
80208280 T _stext
So taking __create_page_tables, I grep in the source code under ./arch/arm/kernel with:
.../arm/arm/kernel$ grep __create_page_tables -rn
Binary file head.o matches
head.S:128: bl __create_page_tables
head.S:180:__create_page_tables:
head.S:355:ENDPROC(__create_page_tables)
So we're looking for the following at the symbol address:
__create_page_tables:
pgtbl r4, r8 @ page table address
But the disassembler shows something different at the address I get after translating, given that the kernel is loaded at 0x82000000.
How can I translate the kernel symbols to the debugger addresses?
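System.map lists link-time addresses, so one way to relate them to what the debugger shows is to work with offsets from _text and add those to wherever the kernel text actually sits when the core is halted. The Python sketch below does only that arithmetic. The names parse_system_map and rebase are hypothetical, and runtime_text_base is an assumption the caller has to establish: if vmlinuz is a compressed image, the bytes at the U-Boot load address are the decompressor rather than the kernel proper, so the load address is not necessarily the right base.

```python
def parse_system_map(text):
    """Parse System.map text into a {symbol_name: link_address} dict."""
    symbols = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        address, _kind, name = parts
        symbols[name] = int(address, 16)
    return symbols


def rebase(symbols, name, runtime_text_base):
    """Return where `name` would sit if _text were at runtime_text_base.

    runtime_text_base is an assumption supplied by the caller; this
    function only preserves the symbol's offset from _text.
    """
    offset = symbols[name] - symbols["_text"]
    return runtime_text_base + offset
```

With the map above, __create_page_tables is 0x8c bytes past _text, so it should appear 0x8c bytes past whatever address the debugger shows for the start of the kernel text.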

clang memory sanitizer; how to make it print source line numbers

I'm compiling my program with clang++ -fsanitize=memory -fsanitize-memory-track-origins -fno-omit-frame-pointer -g -O0 and when I run it, the output is:
matiu@matiu-laptop:~/projects/json++11/build$ ./tests
.......==10534== WARNING: MemorySanitizer: use-of-uninitialized-value
#0 0x7fe7602d4a51 (/home/matiu/projects/json++11/build/tests+0x106a51)
#1 0x7fe7602dfca6 (/home/matiu/projects/json++11/build/tests+0x111ca6)
...
#31 0x7fe75edbaec4 (/lib/x86_64-linux-gnu/libc.so.6+0x21ec4)
#32 0x7fe7602808dc (/home/matiu/projects/json++11/build/tests+0xb28dc)
Uninitialized value was created by a heap allocation
#0 0x7fe76026e7b3 (/home/matiu/projects/json++11/build/tests+0xa07b3)
#1 0x7fe7602ee7da (/home/matiu/projects/json++11/build/tests+0x1207da)
...
#18 0x7fe7602c1c4c (/home/matiu/projects/json++11/build/tests+0xf3c4c)
#19 0x7fe7602873fa (/home/matiu/projects/json++11/build/tests+0xb93fa)
SUMMARY: MemorySanitizer: use-of-uninitialized-value ??:0 ??
Exiting
How can I make it show line numbers like in the beautiful examples at http://clang.llvm.org/docs/MemorySanitizer.html?
I suspect it might not be possible, because my program is one giant nested bunch of lambdas: https://github.com/matiu2/json--11/blob/master/tests.cpp
With the address sanitizer I noticed that I needed to have these environment variables defined:
ASAN_OPTIONS=symbolize=1 (only needed when compiled with GCC > 4.8)
ASAN_SYMBOLIZER_PATH=$(which llvm-symbolizer)
I think the symbolizer is what you're looking for. It translates addresses into file names with line numbers and columns.
On the memory sanitizer project website it reads:
Symbolization
Set MSAN_SYMBOLIZER_PATH environment variable to the path to
llvm-symbolizer binary (normally built with LLVM). MemorySanitizer
will use it to symbolize reports on-the-fly.
So you need MSAN_SYMBOLIZER_PATH to be set analogous to ASAN_SYMBOLIZER_PATH.
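As a small illustration of that setup, here is a Python sketch that builds an environment with MSAN_SYMBOLIZER_PATH filled in before launching the instrumented binary. The helper name msan_env is hypothetical, and the sketch assumes llvm-symbolizer is either passed explicitly or discoverable on PATH.

```python
import os
import shutil


def msan_env(symbolizer=None, base_env=None):
    """Return a copy of the environment with MSAN_SYMBOLIZER_PATH set.

    symbolizer: explicit path to llvm-symbolizer; looked up on PATH if omitted.
    base_env:   environment to copy; defaults to the current process environment.
    """
    env = dict(os.environ if base_env is None else base_env)
    path = symbolizer or shutil.which("llvm-symbolizer")
    if path is None:
        raise FileNotFoundError("llvm-symbolizer not found on PATH")
    env["MSAN_SYMBOLIZER_PATH"] = path
    return env
```

The result can be passed as the env argument of subprocess.run, for example subprocess.run(["./tests"], env=msan_env()).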

Getting SIGBUS (Bus error) @ 0 (0), killed by SIGBUS (core dumped) in Red Hat

I have a process that works perfectly on the same machine in two accounts, but when I copy it to another account and run it there, I get a core dump.
When I run the process under strace, the output ends with:
--- SIGBUS (Bus error) @ 0 (0) ---
+++ killed by SIGBUS (core dumped) +++
When I open the core dump, I get:
#0 0x000000360046fed3 in malloc_consolidate () from /lib64/libc.so.6
#1 0x00000036004723fd in _int_malloc () from /lib64/libc.so.6
#2 0x000000360047402a in malloc () from /lib64/libc.so.6
#3 0x00000036004616ba in __fopen_internal () from /lib64/libc.so.6
#4 0x0000000000fe9652 in LogMngr::OpenFile (this=0x2aaaaad17010, iLogIndex=0) at LogMngr.c:801
I can see it has something to do with opening the log file, but why does it happen in only one account while the others are fine?
You can get a SIGBUS from an unaligned memory access. Are you using something like mmap, shared memory regions, or something similar?
A crash inside malloc almost always indicates heap corruption, and heap corruption in general is sneaky like that: it may never show up on machine A, sometimes show up on machine B, and always show up on machine C.
Valgrind will likely point you straight at the problem.

how do you diagnose a kernel oops?

Given a linux kernel oops, how do you go about diagnosing the problem? In the output I can see a stack trace which seems to give some clues. Are there any tools that would help find the problem? What basic procedures do you follow to track it down?
Unable to handle kernel paging request for data at address 0x33343a31
Faulting instruction address: 0xc50659ec
Oops: Kernel access of bad area, sig: 11 [#1]
tpsslr3
Modules linked in: datalog(P) manet(P) vnet wlan_wep wlan_scan_sta ath_rate_sample ath_pci wlan ath_hal(P)
NIP: c50659ec LR: c5065f04 CTR: c00192e8
REGS: c2aff920 TRAP: 0300 Tainted: P (2.6.25.16-dirty)
MSR: 00009032 CR: 22082444 XER: 20000000
DAR: 33343a31, DSISR: 20000000
TASK = c2e6e3f0[1486] 'datalogd' THREAD: c2afe000
GPR00: c5065f04 c2aff9d0 c2e6e3f0 00000000 00000001 00000001 00000000 0000b3f9
GPR08: 3a33340a c5069624 c5068d14 33343a31 82082482 1001f2b4 c1228000 c1230000
GPR16: c60f0000 000004a8 c59abbe6 0000002f c1228360 c340d6b0 c5070000 00000001
GPR24: c2aff9e0 c5070000 00000000 00000000 00000003 c2cc2780 c2affae8 0000000f
NIP [c50659ec] mesh_packet_in+0x3d8/0xdac [manet]
LR [c5065f04] mesh_packet_in+0x8f0/0xdac [manet]
Call Trace:
[c2aff9d0] [c5065f04] mesh_packet_in+0x8f0/0xdac [manet] (unreliable)
[c2affad0] [c5061ff8] IF_netif_rx+0xa0/0xb0 [manet]
[c2affae0] [c01925e4] netif_receive_skb+0x34/0x3c4
[c2affb10] [c60b5f74] netif_receive_skb_debug+0x2c/0x3c [wlan]
[c2affb20] [c60bc7a4] ieee80211_deliver_data+0x1b4/0x380 [wlan]
[c2affb60] [c60bd420] ieee80211_input+0xab0/0x1bec [wlan]
[c2affbf0] [c6105b04] ath_rx_poll+0x884/0xab8 [ath_pci]
[c2affc90] [c018ec20] net_rx_action+0xd8/0x1ac
[c2affcb0] [c00260b4] __do_softirq+0x7c/0xf4
[c2affce0] [c0005754] do_softirq+0x58/0x5c
[c2affcf0] [c0025eb4] irq_exit+0x48/0x58
[c2affd00] [c000627c] do_IRQ+0xa4/0xc4
[c2affd10] [c00106f8] ret_from_except+0x0/0x14
--- Exception: 501 at __delay+0x78/0x98
LR = cfi_amdstd_write_buffers+0x618/0x7ac
[c2affdd0] [c0163670] cfi_amdstd_write_buffers+0x504/0x7ac (unreliable)
[c2affe50] [c015a2d0] concat_write+0xe4/0x140
[c2affe80] [c0158ff4] part_write+0xd0/0xf0
[c2affe90] [c015bdf0] mtd_write+0x170/0x2a8
[c2affef0] [c0073898] vfs_write+0xcc/0x16c
[c2afff10] [c0073f2c] sys_write+0x4c/0x90
[c2afff40] [c0010060] ret_from_syscall+0x0/0x38
--- Exception: c01 at 0xfd98a50
LR = 0x10003840
Instruction dump:
419d02a0 98010009 800100a4 2f800003 419e0508 2f170000 419a0098 3d20c507
a0e1002e 81699624 39299624 7f8b4800 419e007c a0610016 7d264b78
Kernel panic - not syncing: Fatal exception in interrupt
Rebooting in 1 seconds..
An Oops gives a bunch of information useful in diagnosing a crash. It starts with the address of the crash, the reason ("access of bad area") and the contents of the registers. The call trace answers the question "how did we get here". The first item in the list happened most recently. Working backwards, an interrupt happened (do_IRQ) because the Atheros WiFi adapter received a packet (ath_rx_poll). The routine passed it to the generic WiFi code (ieee80211_input) which in turn passed it up to the network stack (netif_receive_skb).
To figure out the exact code causing the problem, you can run
gdb /usr/src/linux/vmlinux
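and then disassemble the function in question, mesh_packet_in(). The NIP line places the faulting instruction (0xc50659ec) at offset 0x3d8 into mesh_packet_in(), which is 0xdac bytes long; 0xc5065f04 is the link register (LR), a return address at offset 0x8f0 in the same function. Because mesh_packet_in() lives in the manet module rather than in vmlinux, gdb also needs that module's symbols to resolve it. You might also try the gdb command
(gdb) info line *0xc50659ec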
to figure out which function contains this address.
You should first try to find the source of the code that crashed. In this specific case, the oops reports that the crash happened in mesh_packet_in of the manet driver, at offset 0x3d8 (the NIP line; offset 0x8f0 belongs to the link register). It also reports that the instructions around this point are 419d02a0 98010009 ... So inspect the module with "objdump -d" to confirm that the reported function and offset are correct. Then check the source for what it is doing; you can use the register list to confirm again that you are looking at the right instruction.
When you know what C statement is faulting, you need to read the source to find out where the bogus data were coming from.
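The symbol+offset/size notation in the NIP, LR and call trace lines can be decoded with simple arithmetic, which helps when matching the oops against "objdump -d" output. The Python sketch below is only an illustration; decode_location is a hypothetical helper, and it assumes lines shaped like the ones in the oops above.

```python
import re

# Matches "[address] symbol+0xoffset/0xsize" with an optional "[module]", e.g.
#   NIP [c50659ec] mesh_packet_in+0x3d8/0xdac [manet]
#   [c2affae0] [c01925e4] netif_receive_skb+0x34/0x3c4
LOCATION_RE = re.compile(
    r"\[(?P<addr>[0-9a-f]+)\]\s+(?P<symbol>[\w.]+)"
    r"\+0x(?P<offset>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
    r"(?:\s+\[(?P<module>\w+)\])?"
)


def decode_location(line):
    """Split one oops location line into address, function bounds and module."""
    m = LOCATION_RE.search(line)
    if m is None:
        return None
    address = int(m["addr"], 16)
    offset = int(m["offset"], 16)
    size = int(m["size"], 16)
    start = address - offset  # address of the first instruction of the function
    return {
        "symbol": m["symbol"],
        "module": m["module"],  # None for code built into the kernel
        "address": address,
        "offset": offset,
        "start": start,
        "end": start + size,    # first address past the function
    }
```

Applied to the NIP and LR lines of this oops, both decode to the same function start, 0xc5065614, so the faulting instruction and the return address lie inside mesh_packet_in, 0x3d8 and 0x8f0 bytes from its beginning.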
http://oss.sgi.com/projects/kdb/
Install this into your kernel; then, when it oopses, you'll be dropped into a gdb-like interface that you can poke around with. However, it looks like the manet module is dereferencing a bad pointer.

Resources