How do I find out what all symbols are exported from a shared object? - linux

I have a shared object (dll). How do I find out what all symbols are exported from that?

Do you have a "shared object" (usually a shared library on AIX), a UNIX shared library, or a Windows DLL? These are all different things, and your question conflates them all :-(
For an AIX shared object, use dump -Tv /path/to/foo.o.
For an ELF shared library, use readelf -Ws --dyn-syms /path/to/libfoo.so, or (if you have GNU nm) nm -D /path/to/libfoo.so.
For a non-ELF UNIX shared library, please state which UNIX you are interested in.
For a Windows DLL, use dumpbin /EXPORTS foo.dll.

objdump is another good one on linux.

If it is a Windows DLL file and your OS is Linux then use winedump:
$ winedump -j export pcre.dll
Contents of pcre.dll: 229888 bytes
Exports table:
Name: pcre.dll
Characteristics: 00000000
TimeDateStamp: 53BBA519 Tue Jul 8 10:00:25 2014
Version: 0.00
Ordinal base: 1
# of functions: 31
# of Names: 31
Addresses of functions: 000375C8
Addresses of name ordinals: 000376C0
Addresses of names: 00037644
Entry Pt Ordn Name
0001FDA0 1 pcre_assign_jit_stack
000380B8 2 pcre_callout
00009030 3 pcre_compile
...

On *nix check nm. On windows use the program Dependency Walker

see man nm
GNU nm lists the symbols from object files objfile.... If no object
files are listed as arguments, nm assumes the file a.out.

Use: nm --demangle <libname>.so

The cross-platform way (not only cross-platform itself, but also working, at the very least, with both *.so and *.dll) is using reverse-engineering framework radare2. E.g.:
$ rabin2 -s glew32.dll | head -n 5
[Symbols]
vaddr=0x62afda8d paddr=0x0005ba8d ord=000 fwd=NONE sz=0 bind=GLOBAL type=FUNC name=glew32.dll___GLEW_3DFX_multisample
vaddr=0x62afda8e paddr=0x0005ba8e ord=001 fwd=NONE sz=0 bind=GLOBAL type=FUNC name=glew32.dll___GLEW_3DFX_tbuffer
vaddr=0x62afda8f paddr=0x0005ba8f ord=002 fwd=NONE sz=0 bind=GLOBAL type=FUNC name=glew32.dll___GLEW_3DFX_texture_compression_FXT1
vaddr=0x62afdab8 paddr=0x0005bab8 ord=003 fwd=NONE sz=0 bind=GLOBAL type=FUNC name=glew32.dll___GLEW_AMD_blend_minmax_factor
As a bonus, rabin2 recognizes C++ name mangling, for example (and also with .so file):
$ rabin2 -s /usr/lib/libabw-0.1.so.1.0.1 | head -n 5
[Symbols]
vaddr=0x00027590 paddr=0x00027590 ord=124 fwd=NONE sz=430 bind=GLOBAL type=FUNC name=libabw::AbiDocument::isFileFormatSupported
vaddr=0x0000a730 paddr=0x0000a730 ord=125 fwd=NONE sz=58 bind=UNKNOWN type=FUNC name=boost::exception::~exception
vaddr=0x00232680 paddr=0x00032680 ord=126 fwd=NONE sz=16 bind=UNKNOWN type=OBJECT name=typeinfoforboost::exception_detail::clone_base
vaddr=0x00027740 paddr=0x00027740 ord=127 fwd=NONE sz=235 bind=GLOBAL type=FUNC name=libabw::AbiDocument::parse
Works with object files too:
$ g++ test.cpp -c -o a.o
$ rabin2 -s a.o | head -n 5
Warning: Cannot initialize program headers
Warning: Cannot initialize dynamic strings
Warning: Cannot initialize dynamic section
[Symbols]
vaddr=0x08000149 paddr=0x00000149 ord=006 fwd=NONE sz=1 bind=LOCAL type=OBJECT name=std::piecewise_construct
vaddr=0x08000149 paddr=0x00000149 ord=007 fwd=NONE sz=1 bind=LOCAL type=OBJECT name=std::__ioinit
vaddr=0x080000eb paddr=0x000000eb ord=017 fwd=NONE sz=73 bind=LOCAL type=FUNC name=__static_initialization_and_destruction_0
vaddr=0x08000134 paddr=0x00000134 ord=018 fwd=NONE sz=21 bind=LOCAL type=FUNC name=_GLOBAL__sub_I__Z4funcP6Animal

You can use gnu objdump. objdump -p your.dll. Then pan to the .edata section contents and you'll find the exported functions under [Ordinal/Name Pointer] Table.

Usually, you would also have a header file that you include in your code to access the symbols.

Related

"Whirlwind Tutorial on Teensy ELF Executables" -- why is the output of ld 10X bigger, 20 years later? [duplicate]

For a university course, I like to compare code-sizes of functionally similar programs if written and compiled using gcc/clang versus assembly. In the process of re-evaluating how to further shrink the size of some executables, I couldn't trust my eyes when the very same assembly code I assembled/linked 2 years ago now has grown >10x in size after building it again (which true for multiple programs, not only helloworld):
$ make
as -32 -o helloworld-asm-2020.o helloworld-asm-2020.s
ld -melf_i386 -o helloworld-asm-2020 helloworld-asm-2020.o
$ ls -l
-rwxr-xr-x 1 xxx users 708 Jul 18 2018 helloworld-asm-2018*
-rwxr-xr-x 1 xxx users 8704 Nov 25 15:00 helloworld-asm-2020*
-rwxr-xr-x 1 xxx users 4724 Nov 25 15:00 helloworld-asm-2020-n*
-rwxr-xr-x 1 xxx users 4228 Nov 25 15:00 helloworld-asm-2020-n-sstripped*
-rwxr-xr-x 1 xxx users 604 Nov 25 15:00 helloworld-asm-2020.o*
-rw-r--r-- 1 xxx users 498 Nov 25 14:44 helloworld-asm-2020.s
The assembly code is:
.code32
.section .data
msg: .ascii "Hello, world!\n"
len = . - msg
.section .text
.globl _start
_start:
movl $len, %edx # EDX = message length
movl $msg, %ecx # ECX = address of message
movl $1, %ebx # EBX = file descriptor (1 = stdout)
movl $4, %eax # EAX = syscall number (4 = write)
int $0x80 # call kernel by interrupt
# and exit
movl $0, %ebx # return code is zero
movl $1, %eax # exit syscall number (1 = exit)
int $0x80 # call kernel again
The same hello world program, compiled using GNU as and GNU ld (always using 32-bit assembly) was 708 bytes then, and has grown to 8.5K now. Even when telling the linker to turn off page alignment (ld -n), it still has almost 4.2K. stripping/sstripping doesn't pay off either.
readelf tells me that the start of section headers is much later in the code (byte 468 vs 8464), but I have no idea why. It's running on the same arch system as in 2018, the Makefile is the same and I'm not linking against any libraries (especially not libc). I guess something regarding ld has changed due to the fact that the object file is still quite small, but what and why?
Disclaimer: I'm building 32-bit executables on an x86-64 machine.
Edit: I'm using GNU binutils (as & ld) version 2.35.1 Here is a base64-encoded archive which includes the source and both executables (small old one, large new one) :
cat << EOF | base64 -d | tar xj
QlpoOTFBWSZTWVaGrEQABBp////xebj/7//Xf+a8RP/v3/rAAEVARARAeEADBAAAoCAI0AQ+NAam
ytMpCGmpDVPU0aNpGmh6Rpo9QAAeoBoADQaNAADQ09IAACSSGUwaJpTNQGE9QZGhoADQPUAA0AAA
AA0aA4AAAABoAAAAA0GgAAAAZAGgAHAAAAANAAAAAGg0AAAADIA0AASJCBIyE8hHpqPVPUPU/VAa
fqn6o0ep6BB6TQaNGj0j1ABobU00yeU9JYiuVVZKYE+dKNa3wls6x81yBpGAN71NoylDUvNryWiW
E4ER8XkfpaJcPb6ND12ULEqkQX3eaBHP70Apa5uFhWNDy+U3Ekj+OLx5MtDHxQHQLfMcgCHrGayE
Dc76F4ZC4rcRkvTW4S2EbJAsbBGbQxSbx5o48zkyk5iPBBhJowtCSwDBsQBc0koYRSO6SgJNL0Bg
EmCoxCDAs5QkEmTGmQUgqZNIoxsmwDmDQe0NIDI0KjQ64leOr1fVk6AaVhjOAJjLrEYkYy4cDbyS
iXSuILWohNh+PA9Izk0YUM4TQQGEYNgn4oEjGmAByO+kzmDIxEC3Txni6E1WdswBJLKYiANdiQ2K
00jU/zpMzuIhjTbgiBqE24dZWBcNBBAAioiEhCQEIfAR8Vir4zNQZFgvKZa67Jckh6EHZWAWuf6Q
kGy1lOtA2h9fsyD/uPPI2kjvoYL+w54IUKBEEYFBIWRNCNpuyY86v3pNiHEB7XyCX5wDjZUSF2tO
w0PVlY2FQNcLQcbZjmMhZdlCGkVHojuICHMMMB5kQQSZRwNJkYTKz6stT/MTWmozDCcj+UjtB9Cf
CUqAqqRlgJdREtMtSO4S4GpJE2I/P8vuO9ckqCM2+iSJCLRWx2Gi8VSR8BIkVX6stqIDmtG8xSVU
kk7BnC5caZXTIynyI0doXiFY1+/Csw2RUQJroC0lCNiIqVVUkTqTRMYqKNVGtCJ5yfo7e3ZpgECk
PYUEihPU0QVgfQ76JA8Eb16KCbSzP3WYiVApqmfDhUk0aVc+jyBJH13uKztUuva8F4YdbpmzomjG
kSJmP+vCFdKkHU384LdRoO0LdN7VJlywJ2xJdM+TMQ0KhMaicvRqfC5pHSu+gVDVjfiss+S00ikI
DeMgatVKKtcjsVDX09XU3SzowLWXXunnFZp/fP3eN9Rj1ubiLc0utMl3CUUkcYsmwbKKrWhaZiLO
u67kMSsW20jVBcZ5tZUKgdRtu0UleWOs1HK2QdMpyKMxTRHWhhHwMnVEsWIUEjIfFEbWhRTRMJXn
oIBSEa2Q0llTBfJV0LEYEQTBTFsDKIxhgqNwZB2dovl/kiW4TLp6aGXxmoIpVeWTEXqg1PnyKwux
caORGyBhTEPV2G7/O3y+KeAL9mUM4Zjl1DsDKyTZy8vgn31EDY08rY+64Z/LO5tcRJHttMYsz0Fh
CRN8LTYJL/I/4u5IpwoSCtDViIA=
EOF
Update:
When using ld.gold instead of ld.bfd (to which /usr/bin/ld is symlinked to by default), the executable size becomes as small as expected:
$ cat Makefile
TARGET=helloworld
all:
as -32 -o ${TARGET}-asm.o ${TARGET}-asm.s
ld.bfd -melf_i386 -o ${TARGET}-asm-bfd ${TARGET}-asm.o
ld.gold -melf_i386 -o ${TARGET}-asm-gold ${TARGET}-asm.o
rm ${TARGET}-asm.o
$ make -q
$ ls -l
total 68
-rw-r--r-- 1 eso eso 200 Dec 1 13:57 Makefile
-rwxrwxr-x 1 eso eso 8700 Dec 1 13:57 helloworld-asm-bfd
-rwxrwxr-x 1 eso eso 732 Dec 1 13:57 helloworld-asm-gold
-rw-r--r-- 1 eso eso 498 Dec 1 13:44 helloworld-asm.s
Maybe I just used gold previously without being aware.
It's not 10x in general, it's page-alignment of a couple sections as Jester says, per changes to ld's default linker script for security reasons:
First change: Making sure data from .data isn't present in any of the mapping of .text, so none of that static data is available for ROP / Spectre gadgets in an executable page. (In older ld, that meant the program-headers mapped the same disk-block twice, also into a RW-without-exec segment for the actual .data section. The executable mapping was still read-only.)
More recent change: Separate .rodata from .text into separate segments, again so static data isn't mapped into an executable page. Previously, const char code[]= {...} could be cast to a function pointer and called, without needing mprotect or gcc -z execstack or other tricks, if you wanted to test shellcode that way. (A separate Linux kernel change made -z execstack only apply to the actual stack, not READ_IMPLIES_EXEC.)
See Why an ELF executable could have 4 LOAD segments? for this history, including the strange fact that .rodata is in a separate segment from the read-only mapping for access to the ELF metadata.
That extra space is just 00 padding and will compress well in a .tar.gz or whatever.
So it has a worst-case upper bound of about 2x 4k extra pages of padding, and tiny executables are close to that worst case.
gcc -Wl,--nmagic will turn off page-alignment of sections if you want that for some reason. (see the ld(1) man page) I don't know why that doesn't pack everything down to the old size. Perhaps checking the default linker script would shed some light, but it's pretty long. Run ld --verbose to see it.
stripping won't help for padding that's part of a section; I think it can only remove whole sections.
ld -z noseparate-code uses the old layout, only 2 total segments to cover the .text and .rodata sections, and the .data and .bss sections. (And the ELF metadata that dynamic linking wants access to.)
Related:
Linking with gcc instead of ld
This question is about ld, but note that if you're using gcc -nostdlib, that used to also default to making a static executable. But modern Linux distros config GCC with -pie as the default, and GCC won't make a static-pie by default even if there aren't any shared libraries being linked. Unlike with -no-pie mode where it will simply make a static executable in that case. (A static-pie still needs startup code to apply relocations for any absolute addresses.)
So the equivalent of ld directly is gcc -nostdlib -static (which implies -no-pie). Or gcc -nostdlib -no-pie should let it default to -static when there are no shared libs being linked. You can combine this with -Wl,--nmagic and/or -Wl,-z -Wl,noseparate-code.
Also:
A Whirlwind Tutorial on Creating Really Teensy ELF Executables for Linux - eventually making a 45 byte executable, with the machine code for an _exit syscall stuffed into the ELF program header itself.
FASM can make quite small executables, using its mode where it outputs a static executable (not object file) directly with no ELF section metadata, just program headers. (It's a pain to debug with GDB or disassemble with objdump; most tools assume there will be section headers, even though they're not needed to run static executables.)
What is a reasonable minimum number of assembly instructions for a small C program including setup?
What's the difference between "statically linked" and "not a dynamic executable" from Linux ldd? (static vs. static-pie vs. (dynamic) PIE that happens to have no shared libraries.)

Minimal executable size now 10x larger after linking than 2 years ago, for tiny programs?

For a university course, I like to compare code-sizes of functionally similar programs if written and compiled using gcc/clang versus assembly. In the process of re-evaluating how to further shrink the size of some executables, I couldn't trust my eyes when the very same assembly code I assembled/linked 2 years ago now has grown >10x in size after building it again (which true for multiple programs, not only helloworld):
$ make
as -32 -o helloworld-asm-2020.o helloworld-asm-2020.s
ld -melf_i386 -o helloworld-asm-2020 helloworld-asm-2020.o
$ ls -l
-rwxr-xr-x 1 xxx users 708 Jul 18 2018 helloworld-asm-2018*
-rwxr-xr-x 1 xxx users 8704 Nov 25 15:00 helloworld-asm-2020*
-rwxr-xr-x 1 xxx users 4724 Nov 25 15:00 helloworld-asm-2020-n*
-rwxr-xr-x 1 xxx users 4228 Nov 25 15:00 helloworld-asm-2020-n-sstripped*
-rwxr-xr-x 1 xxx users 604 Nov 25 15:00 helloworld-asm-2020.o*
-rw-r--r-- 1 xxx users 498 Nov 25 14:44 helloworld-asm-2020.s
The assembly code is:
.code32
.section .data
msg: .ascii "Hello, world!\n"
len = . - msg
.section .text
.globl _start
_start:
movl $len, %edx # EDX = message length
movl $msg, %ecx # ECX = address of message
movl $1, %ebx # EBX = file descriptor (1 = stdout)
movl $4, %eax # EAX = syscall number (4 = write)
int $0x80 # call kernel by interrupt
# and exit
movl $0, %ebx # return code is zero
movl $1, %eax # exit syscall number (1 = exit)
int $0x80 # call kernel again
The same hello world program, compiled using GNU as and GNU ld (always using 32-bit assembly) was 708 bytes then, and has grown to 8.5K now. Even when telling the linker to turn off page alignment (ld -n), it still has almost 4.2K. stripping/sstripping doesn't pay off either.
readelf tells me that the start of section headers is much later in the code (byte 468 vs 8464), but I have no idea why. It's running on the same arch system as in 2018, the Makefile is the same and I'm not linking against any libraries (especially not libc). I guess something regarding ld has changed due to the fact that the object file is still quite small, but what and why?
Disclaimer: I'm building 32-bit executables on an x86-64 machine.
Edit: I'm using GNU binutils (as & ld) version 2.35.1 Here is a base64-encoded archive which includes the source and both executables (small old one, large new one) :
cat << EOF | base64 -d | tar xj
QlpoOTFBWSZTWVaGrEQABBp////xebj/7//Xf+a8RP/v3/rAAEVARARAeEADBAAAoCAI0AQ+NAam
ytMpCGmpDVPU0aNpGmh6Rpo9QAAeoBoADQaNAADQ09IAACSSGUwaJpTNQGE9QZGhoADQPUAA0AAA
AA0aA4AAAABoAAAAA0GgAAAAZAGgAHAAAAANAAAAAGg0AAAADIA0AASJCBIyE8hHpqPVPUPU/VAa
fqn6o0ep6BB6TQaNGj0j1ABobU00yeU9JYiuVVZKYE+dKNa3wls6x81yBpGAN71NoylDUvNryWiW
E4ER8XkfpaJcPb6ND12ULEqkQX3eaBHP70Apa5uFhWNDy+U3Ekj+OLx5MtDHxQHQLfMcgCHrGayE
Dc76F4ZC4rcRkvTW4S2EbJAsbBGbQxSbx5o48zkyk5iPBBhJowtCSwDBsQBc0koYRSO6SgJNL0Bg
EmCoxCDAs5QkEmTGmQUgqZNIoxsmwDmDQe0NIDI0KjQ64leOr1fVk6AaVhjOAJjLrEYkYy4cDbyS
iXSuILWohNh+PA9Izk0YUM4TQQGEYNgn4oEjGmAByO+kzmDIxEC3Txni6E1WdswBJLKYiANdiQ2K
00jU/zpMzuIhjTbgiBqE24dZWBcNBBAAioiEhCQEIfAR8Vir4zNQZFgvKZa67Jckh6EHZWAWuf6Q
kGy1lOtA2h9fsyD/uPPI2kjvoYL+w54IUKBEEYFBIWRNCNpuyY86v3pNiHEB7XyCX5wDjZUSF2tO
w0PVlY2FQNcLQcbZjmMhZdlCGkVHojuICHMMMB5kQQSZRwNJkYTKz6stT/MTWmozDCcj+UjtB9Cf
CUqAqqRlgJdREtMtSO4S4GpJE2I/P8vuO9ckqCM2+iSJCLRWx2Gi8VSR8BIkVX6stqIDmtG8xSVU
kk7BnC5caZXTIynyI0doXiFY1+/Csw2RUQJroC0lCNiIqVVUkTqTRMYqKNVGtCJ5yfo7e3ZpgECk
PYUEihPU0QVgfQ76JA8Eb16KCbSzP3WYiVApqmfDhUk0aVc+jyBJH13uKztUuva8F4YdbpmzomjG
kSJmP+vCFdKkHU384LdRoO0LdN7VJlywJ2xJdM+TMQ0KhMaicvRqfC5pHSu+gVDVjfiss+S00ikI
DeMgatVKKtcjsVDX09XU3SzowLWXXunnFZp/fP3eN9Rj1ubiLc0utMl3CUUkcYsmwbKKrWhaZiLO
u67kMSsW20jVBcZ5tZUKgdRtu0UleWOs1HK2QdMpyKMxTRHWhhHwMnVEsWIUEjIfFEbWhRTRMJXn
oIBSEa2Q0llTBfJV0LEYEQTBTFsDKIxhgqNwZB2dovl/kiW4TLp6aGXxmoIpVeWTEXqg1PnyKwux
caORGyBhTEPV2G7/O3y+KeAL9mUM4Zjl1DsDKyTZy8vgn31EDY08rY+64Z/LO5tcRJHttMYsz0Fh
CRN8LTYJL/I/4u5IpwoSCtDViIA=
EOF
Update:
When using ld.gold instead of ld.bfd (to which /usr/bin/ld is symlinked to by default), the executable size becomes as small as expected:
$ cat Makefile
TARGET=helloworld
all:
as -32 -o ${TARGET}-asm.o ${TARGET}-asm.s
ld.bfd -melf_i386 -o ${TARGET}-asm-bfd ${TARGET}-asm.o
ld.gold -melf_i386 -o ${TARGET}-asm-gold ${TARGET}-asm.o
rm ${TARGET}-asm.o
$ make -q
$ ls -l
total 68
-rw-r--r-- 1 eso eso 200 Dec 1 13:57 Makefile
-rwxrwxr-x 1 eso eso 8700 Dec 1 13:57 helloworld-asm-bfd
-rwxrwxr-x 1 eso eso 732 Dec 1 13:57 helloworld-asm-gold
-rw-r--r-- 1 eso eso 498 Dec 1 13:44 helloworld-asm.s
Maybe I just used gold previously without being aware.
It's not 10x in general, it's page-alignment of a couple sections as Jester says, per changes to ld's default linker script for security reasons:
First change: Making sure data from .data isn't present in any of the mapping of .text, so none of that static data is available for ROP / Spectre gadgets in an executable page. (In older ld, that meant the program-headers mapped the same disk-block twice, also into a RW-without-exec segment for the actual .data section. The executable mapping was still read-only.)
More recent change: Separate .rodata from .text into separate segments, again so static data isn't mapped into an executable page. Previously, const char code[]= {...} could be cast to a function pointer and called, without needing mprotect or gcc -z execstack or other tricks, if you wanted to test shellcode that way. (A separate Linux kernel change made -z execstack only apply to the actual stack, not READ_IMPLIES_EXEC.)
See Why an ELF executable could have 4 LOAD segments? for this history, including the strange fact that .rodata is in a separate segment from the read-only mapping for access to the ELF metadata.
That extra space is just 00 padding and will compress well in a .tar.gz or whatever.
So it has a worst-case upper bound of about 2x 4k extra pages of padding, and tiny executables are close to that worst case.
gcc -Wl,--nmagic will turn off page-alignment of sections if you want that for some reason. (see the ld(1) man page) I don't know why that doesn't pack everything down to the old size. Perhaps checking the default linker script would shed some light, but it's pretty long. Run ld --verbose to see it.
stripping won't help for padding that's part of a section; I think it can only remove whole sections.
ld -z noseparate-code uses the old layout, only 2 total segments to cover the .text and .rodata sections, and the .data and .bss sections. (And the ELF metadata that dynamic linking wants access to.)
Related:
Linking with gcc instead of ld
This question is about ld, but note that if you're using gcc -nostdlib, that used to also default to making a static executable. But modern Linux distros config GCC with -pie as the default, and GCC won't make a static-pie by default even if there aren't any shared libraries being linked. Unlike with -no-pie mode where it will simply make a static executable in that case. (A static-pie still needs startup code to apply relocations for any absolute addresses.)
So the equivalent of ld directly is gcc -nostdlib -static (which implies -no-pie). Or gcc -nostdlib -no-pie should let it default to -static when there are no shared libs being linked. You can combine this with -Wl,--nmagic and/or -Wl,-z -Wl,noseparate-code.
Also:
A Whirlwind Tutorial on Creating Really Teensy ELF Executables for Linux - eventually making a 45 byte executable, with the machine code for an _exit syscall stuffed into the ELF program header itself.
FASM can make quite small executables, using its mode where it outputs a static executable (not object file) directly with no ELF section metadata, just program headers. (It's a pain to debug with GDB or disassemble with objdump; most tools assume there will be section headers, even though they're not needed to run static executables.)
What is a reasonable minimum number of assembly instructions for a small C program including setup?
What's the difference between "statically linked" and "not a dynamic executable" from Linux ldd? (static vs. static-pie vs. (dynamic) PIE that happens to have no shared libraries.)

Cygwin install does not have shared libraries, or how should I activate the shared libraries?

I'm new to Cygwin - so hopefully, someone can point me in the right direction. I would like to be able to choose to use the shared libraries to compile my code. However, so far, it seems that it always uses the static library, and I don't know where exactly I did wrong.
I installed Cygwin on my Windows 10 computer. Created a file: test.c, which contains:
#include <stdio.h>
const char msg[] = "Hello, world.";
int main(void){
puts (msg);
return 0;
}
I then compiled it with:
$ gcc -Wall -c test.c -o test.o
Then I checked the symbols using:
$ nm test.o
It gives me what I expected:
U __main
0000000000000000 T main
0000000000000000 R msg
U puts
where none of the symbols have been assigned addresses yet. This is all good.
Then, I linked it using the following:
$ gcc -Wall test.o –o test
Then checked the symbols like below:
$ nm test
I got the following:
0000000100401080 T main
0000000100401000 T mainCRTStartup
0000000100401640 T malloc
0000000100403000 R msg
0000000100401650 T posix_memalign
00000001004010d0 T puts
while I was expecting the symbol puts being something like
U puts##GLIBC_x.x.x`.
It seems like I did not have the shared libraries, or I'm not using the process correctly. What is wrong then? Thanks.
using objdump
objdump -x test.exe
DLL Name: cygwin1.dll
vma: Hint/Ord Member-Name Bound-To
813c 15 __cxa_atexit
814c 46 __main
8158 108 _dll_crt0
8164 115 _impure_ptr
8174 257 calloc
8180 373 cygwin_detach_dll
8194 375 cygwin_internal
81a8 403 dll_dllcrt0
81b8 579 free
81c0 909 malloc
81cc 1015 posix_memalign
81e0 1170 puts
81e8 1196 realloc
so puts is an external symbol taken from cygwin1.dll shared lib

Where are GDB symbols coming from?

When I load Fedora 28's /usr/bin/ls file into GDB, I can access to the symbol abformat_init, even if it is not present as a string nor in the symbols table of the binary file.
$ file /usr/bin/ls
/usr/bin/ls: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, for GNU/Linux 3.2.0, BuildID[sha1]=d6d0ea6be508665f5586e90a30819d090710842f, stripped, too many notes (256)
$ readelf -S /usr/bin/ls | grep abformat
$ nm /usr/bin/ls
nm: /usr/bin/ls: no symbols
$ strings /usr/bin/ls | grep abformat
$ gdb /usr/bin/ls
[...]
Reading symbols from /usr/bin/ls...Reading symbols from /usr/bin/ls...(no debugging symbols found)...done.
(no debugging symbols found)...done.
Missing separate debuginfos, use: dnf debuginfo-install coreutils-8.29-7.fc28.x86_64
(gdb) info symbol abformat_init
abformat_init in section .text of /usr/bin/ls
Where does this symbol comes from? Is there a program that allows to extract them outside of GDB?
TL;DR:
There is a special .gnu_debugdata compressed section in Fedora binaries that GDB reads, and which contains mini-symbols.
Contents of that section can be conveniently printed with eu-readelf -Ws --elf-section /usr/bin/ls
readelf -S /usr/bin/ls | grep abformat
That command is dumping sections. You want symbols instead:
readelf -s /usr/bin/ls | grep abformat
readelf --all /usr/bin/ls | grep abformat
strings /usr/bin/ls | grep abformat
Strings tries to guess what you want, and doesn't output all strings found in the binary. See this blog post and try:
strings -a /usr/bin/ls | grep abformat
Update: I confirmed the results you've observed: abformat does not appear anywhere, yet GDB knows about it.
Turns out, there is a .gnu_debugdata compressed section (described here), which has mini-symbols.
To extract this data, normally you would do:
objcopy -O binary -j .gnu_debugdata /usr/bin/ls ls.mini.xz
However, that is broken on my system (produces empty output), so instead I used dd:
# You may need to adjust the numbers below from "readelf -WS /usr/bin/ls"
dd if=/usr/bin/ls of=ls.mini.xz bs=1 skip=151896 count=3764
xz -d ls.mini.xz
nm ls.mini | grep abformat
This produced:
00000000000005db0 t abformat_init
QED.
Additional info:
Confusing GDB no debugging symbols is addressed in this bug.
objcopy refusing to copy .gnu_debugdata is the subject of this bug.
There is a tool that can conveniently dump this info:
eu-readelf -Ws --elf-section /usr/bin/ls | grep abformat
37: 0000000000005db0 593 FUNC LOCAL DEFAULT 14 abformat_init
Is there a program that allows to extract them outside of GDB?
Yes, you can use nm to extract the symbol, but you should look for the symbol in a separate debug info file, because the binary itself is stripped.
You can use readelf or objdump to know separate debug info file name, see How to know the name and/or path of the debug symbol file which is linked to a binary executable?:
$ objdump -s -j .gnu_debuglink /usr/bin/ls
/usr/bin/ls: file format elf64-x86-64
Contents of section .gnu_debuglink:
0000 6c732d38 2e33302d 362e6663 32392e78 ls-8.30-6.fc29.x
0010 38365f36 342e6465 62756700 5cddcc98 86_64.debug.\...
On Fedora 29 the separate debug info file name for /usr/bin/ls is ls-8.30-6.fc29.x86_64.debug.
Normally, on Fedora, separate debug info is installed to /usr/lib/debug/ directory so the full path to debug info file is /usr/lib/debug/usr/bin/ls-8.30-6.fc29.x86_64.debug.
Now you can look for the symbol with nm:
$ nm /usr/lib/debug/usr/bin/ls-8.30-6.fc29.x86_64.debug | grep abformat_init
0000000000006d70 t abformat_init
Note that separate debug info should be installed with debuginfo-install, this is what gdb is telling you.

how the gcc -llibname and the OS find lib path?

I had symbolic link to the target libs:
~/opt/OpenBLAS/lib $ ls -al
total 0
drwxrwxr-x 2 user user 59 Jul 9 13:03 .
drwxrwxr-x 5 user user 147 Jul 9 12:48 ..
lrwxrwxrwx 1 user user 64 Jul 9 13:03 libopenblas.a -> ../openBLAS_v0.2.9df/lib/libopenblas_df029_sandybridgep-r0.2.9.a
lrwxrwxrwx 1 user user 65 Jul 9 13:03 libopenblas.so -> ../openBLAS_v0.2.9df/lib/libopenblas_df029_sandybridgep-r0.2.9.so
$ echo $LD_LIBRARY_PATH
/home/user/opt/OpenBLAS/lib
Then the gcc has following args:
-L/home/user/opt/OpenBLAS/lib -lopenblas
however, after compiling, run the command it always throws out error:
error while loading shared libraries: libopenblas_df029.so.0: cannot open shared object file: No such file or directory
If I create a symbolic link libopenblas_df029.so.0 under the /home/user/opt/OpenBLAS/lib, it will then work.
Can anyone explain to me why this is happening and how can I change the behaviour?
Does this mean the libopenblas contain some suffix and the OS always append this suffix when trying to find lib file?
I think, your library "libopenblas_df029_sandybridgep-r0.2.9.so" is compiled with "-soname" switch
something like:
gcc -shared -Wl,-soname,libopenblas_df029.so.0 source.c -o libopenblas_df029_sandybridgep-r0.2.9.so
When you link against such library, your executable tries to find library by name "libopenblas_df029.so.0" ( i.e. whatever name you specified in -soname switch )
The best way to find if this is true is to run following command and look for "SONAME"
readelf -d <shared_object> | head -10
Like #Icarus3 said, the linked executable contains the name of the library as it wants to be called (the SONAME).
There should be a symbolic link from that name to the actual shared library. If it is missing, the library is not properly installed -- usually it is a sign that you need to run ldconfig to update the symbolic links and the dynamic linker cache.

Resources