So I'm confused. I'm going through the book "Programming from the Ground Up" and am working with using libraries.
printf is working just fine so long as I include a "\n" in the string, but without it it will print absolutely nothing.
Any idea why this happens?
Code:
.section .data
my_str:
.ascii "Jimmy Joe is %d years old!\n\0"
my_num:
.long 76
.section .text
.globl _start
_start:
pushl my_num
pushl $my_str
call printf
movl $1, %eax
movl $0, %ebx
int $0x80
Also, when I use -m elf_i386 for 32-bit mode and -dynamic-linker /lib/ld-linux.so.2 -lc to link, I get the warning
ld: skipping incompatible /usr/lib64/libc.so when searching for -lc
If that makes any difference, or if anybody has any suggestions as to how to have it load the 32-bit library directly.
Thanks!
The problem is that printf by default just prints stuff into the stdout buffer. Nothing actually appears until the buffer is flushed. When that happens depends on the buffering mode of stdout; when stdout is a terminal it is line-buffered by default, which means it gets flushed every time you print a newline character.
To flush explicitly in C, you call fflush; you can do that in asm code with
pushl stdout
call fflush
addl $4, %esp
Alternatively, you can call the stdlib exit function (which flushes all I/O buffers before actually exiting) instead of using the _exit system call, which does not.
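The same effect can be reproduced from C without involving a terminal. The following is only a minimal sketch, assuming a POSIX system; the names bytes_in_file and held_until_flush are made up for this illustration, and a fully buffered temporary file stands in for stdout.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>

/* Hypothetical helper: number of bytes that have reached the file behind fp. */
static long bytes_in_file(FILE *fp) {
    struct stat st;
    if (fstat(fileno(fp), &st) != 0)
        return -1;
    return (long)st.st_size;
}

/* Hypothetical demo: returns 1 if text without a newline stayed in the
   stdio buffer until fflush was called, 0 if not, -1 on setup failure. */
int held_until_flush(void) {
    const char *msg = "Jimmy Joe is 76 years old!";   /* no trailing \n */
    FILE *fp = tmpfile();
    if (fp == NULL)
        return -1;
    setvbuf(fp, NULL, _IOFBF, BUFSIZ);   /* fully buffered, set before any I/O */

    fputs(msg, fp);
    long before = bytes_in_file(fp);     /* text is still in the stdio buffer */
    fflush(fp);
    long after = bytes_in_file(fp);      /* text has been handed to the kernel */

    fclose(fp);
    return before == 0 && after == (long)strlen(msg);
}
```

If held_until_flush() returns 1, nothing reached the file until the explicit fflush, which is the same situation as a printf without "\n" followed by a raw exit system call.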
It seems you are trying to link your 32-bit program against the (system default) 64-bit C library.
Check if you have libs32 packages installed.
To find out which shared libraries a program (or another library) loads dynamically, for example via LD_LIBRARY_PATH, use ldd <name_of_your_binary>
As to why the newline is required I can only speculate that it flushes the output buffer.
See also Why does printf not flush after the call unless a newline is in the format string?
Related
I just started to learn assembly by following "Programming From The Ground Up" and already hit my first issue with the first ever program. I got a segfault for the following code which is supposed to be an exit program:
.section .data
.section .text
.global _start
_start:
movl $1, %eax
movl $0, %edi
int $0x80
I've looked into it, and one suggestion was to stop using int $0x80, since it's a legacy way to invoke system calls, so I tried syscall instead, but that didn't fix it.
The commands I used are as follows:
as test.s -o test.o
ld test.o -o test
./test
I am using the Windows Subsystem for Linux.
I tried to look at it in a debugger and what I found was that after my code, there would be an endless stream of add %al, (%rax) with each memory address from 0x40100c and onwards having this line.
I have absolutely no idea what is happening and would appreciate any help.
Working under Linux, I just ran into the following issue. (For sure, someone will give me the answer, but up to now I haven't found any simple and clear answer :)
/*compile with gcc -o out.x hello.c*/
#include<stdio.h>
int main()
{
printf("Hello World2\r\n");
printf("Hello World3\r\n ");
return 0;
}
Running the code above under Linux gives two strings, but their ending characters differ: the first ends with 0x0d, while the second ends with 0x0d,0x0a.
This is something done by the compiler (GCC), as you can see in the object file:
Contents of section .rodata:
400610 01000200 48656c6c 6f20576f 726c6432 ....Hello World2
400620 0d004865 6c6c6f20 576f726c 64330d0a ..Hello World3..
400630 2000 .
So, questions are:
Why?
How can I avoid this kind of "optimization"(!?)
Thanks
Creating formatted output at runtime takes time; the printf call is slow. GCC knows this, so it replaces the first printf call with a call to puts. Since puts automatically appends a \n, GCC has to remove the \n from the string to compensate.
GCC does this because it treats printf as a built-in. Because this has no effect on the bytes output, or even on the number of calls to write, I strongly recommend leaving it as-is. If you do want to disable it, you can pass -fno-builtin-printf, but the only effect will be to slow down your code as it tries to format the string unnecessarily.
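To check that the two forms really produce the same bytes, something like the sketch below can be used. It is an illustration only: capture, with_fprintf, with_fputs and same_bytes are hypothetical names, and fprintf/fputs writing to a temporary file stand in for printf/puts on stdout so that the output can be read back and compared.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: run emit() against a temporary file and copy
   whatever it wrote into out. Returns the number of bytes written. */
static size_t capture(void (*emit)(FILE *), char *out, size_t cap) {
    FILE *fp = tmpfile();
    if (fp == NULL)
        return 0;
    emit(fp);
    rewind(fp);                       /* flushes pending output, seeks to 0 */
    size_t n = fread(out, 1, cap, fp);
    fclose(fp);
    return n;
}

/* What the source code says: formatted output, newline in the string. */
static void with_fprintf(FILE *fp) {
    fprintf(fp, "Hello World2\r\n");
}

/* What a puts-style call does: shortened string, newline appended. */
static void with_fputs(FILE *fp) {
    fputs("Hello World2\r", fp);
    fputc('\n', fp);
}

/* Returns 1 if both variants emit exactly the same 14 bytes. */
int same_bytes(void) {
    char a[64], b[64];
    size_t na = capture(with_fprintf, a, sizeof a);
    size_t nb = capture(with_fputs, b, sizeof b);
    return na == 14 && na == nb && memcmp(a, b, na) == 0;
}
```

The string stored in .rodata differs between the two variants, but the bytes that reach the file do not, which is why the transformation is invisible to the running program.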
It is simpler to ask GCC (using GCC7.2 on Linux/Debian/Sid/x86-64) to emit assembler. So I compiled your program bflash.c with
gcc -fverbose-asm -O0 -S bflash.c -o bflash-O0.S
to get it without optimization, and with
gcc -fverbose-asm -O1 -S bflash.c -o bflash-O1.S
to get -O1 optimization. Feel free to repeat the experiment with various other optimization flags.
Even without optimization, the bflash-O0.S contains:
.section .rodata
.LC0:
.string "Hello World2\r"
.LC1:
.string "Hello World3\r\n "
.text
.globl main
.type main, @function
main:
.LFB0:
.cfi_startproc
pushq %rbp #
.cfi_def_cfa_offset 16
.cfi_offset 6, -16
movq %rsp, %rbp #,
.cfi_def_cfa_register 6
# bflash.c:5: printf("Hello World2\r\n");
leaq .LC0(%rip), %rdi #,
call puts@PLT #
# bflash.c:6: printf("Hello World3\r\n ");
leaq .LC1(%rip), %rdi #,
movl $0, %eax #,
call printf@PLT #
# bflash.c:8: return 0;
movl $0, %eax #, _4
# bflash.c:9: }
popq %rbp #
.cfi_def_cfa 7, 8
ret
.cfi_endproc
.LFE0:
.size main, .-main
As you can see, the first printf has been optimized into a puts, which is permitted by the C11 standard (n1570) under the as-if rule. BTW, bflash-O1.S contains similar code. Notice that the C11 standard was specified with current optimization practices in mind (many members of the standardization committees are compiler implementors).
BTW Clang 5, invoked as clang-5.0 -O1 -fverbose-asm -S bflash.c -o bflash-01clang.s, performs the same kind of optimization.
How can I avoid this kind of "optimization"(!?)
Follow Daniel H's answer (and you might compile with -ffreestanding, but I don't recommend that).
Or avoid using printf from the <stdio.h> and implement your own slower printing function. If you implement your own printing function, name it differently (since printf is defined in the C11 standard), and perhaps (if so wanted) write your own GCC plugin to optimize it your way (and that plugin should better be some free software which is GPL compatible, read the GCC runtime library exception).
The C language specification (study n1570) defines semantics, that is, the behavior of your compiled program. It does not require any particular sequence of bytes to appear in the executable (which is probably not even mentioned in the standard). If you need such a property, find a different programming language, and give up all the important optimizations GCC is trying hard to do for you. Optimizations are what make writing a C compiler difficult (if you want a non-optimizing compiler, use something other than GCC, but accept losing perhaps a factor of three or more in performance relative to code compiled with gcc -O2).
I am experiencing an issue where gdb is mapping a line number to the wrong memory address when adding a breakpoint.
The following x86 Linux assembly program prints "hello".
/* hello.s */
.section .data
str:
.ascii "hello\n"
strlen = . - str
.section .text
print:
pushl %ebp
movl %esp, %ebp
pushl %ebx
movl $4, %eax
movl $1, %ebx
movl $str, %ecx
movl $strlen, %edx
int $0x80
popl %ebx
movl %ebp, %esp
popl %ebp
ret
.globl _start
_start:
call print
movl $1, %eax
movl $0, %ebx
int $0x80
I compile it with debugging information, and then link.
$ as -g --32 -o hello.o hello.s
$ ld -m elf_i386 -o hello hello.o
Next, in gdb, I try to set a breakpoint on line 11, the first line of the print function (pushl %ebp).
$ gdb ./hello
(gdb) break hello.s:11
Breakpoint 3 at 0x8048078: file hello.s, line 11.
As shown in the output, the breakpoint is set at address 0x8048078. However, that is the wrong address. When I run my program in gdb, it breaks at line 14. The address of line 11 is 0x8048074, confirmed using gdb's info command.
(gdb) info line hello.s:11
Line 11 of "hello.s" starts at address 0x8048074 and ends at 0x8048075 .
Setting a breakpoint on the print instruction directly works (the break point is set for the address of line 11, 0x8048074).
How come when I add a breakpoint for line 11, gdb does not use the same address as output by using the info command above? This is the memory address I am trying to break on.
I am experiencing the same behavior on both gdb 7.11.1 and 8.0.1. I have tried adding a .type print,@function annotation, but that did not solve my issue.
How come
By default, GDB tries to skip past the function prologue when you set a breakpoint on a function, or on a line on which a function starts.
This tends to be what C developers want, since they usually aren't interested in parameter setup.
If you want something else, use b *address or b &print to prevent GDB from doing its usual thing.
I try to use printf from my assembler code, this is a minimal example which should just print hello to stdout:
.section .rodata
hello:
.ascii "hello\n\0"
.section .text
.globl _start
_start:
movq $hello, %rdi # first parameter
xorl %eax, %eax # 0 - number of used vector registers
call printf
#exit
movq $60, %rax
movq $0, %rdi
syscall
I build it with
gcc -nostdlib try_printf.s -o try_printf -lc
and when I run it, it seems to work: the string hello is printed out and the exit status is 0:
XXX$ ./try_printf
hello
XXX$ echo $?
0
XXX$
But when I try to capture the text, it is obvious, that something is not working properly:
XXX$ output=$(./try_printf)
XXX$ echo $output
XXX$
The variable output should have the value hello, but is empty.
What is wrong with my usage of printf?
Use call exit instead of a raw _exit syscall after using stdio functions like printf. This flushes the stdio buffers (with a write system call) before making an exit_group system call.
(Or if your program defines a main instead of _start, returning from main is equivalent to calling exit. You can't ret from _start.) Calling fflush(NULL) should also work.
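The difference between the two ways of exiting can be observed from C with a pipe, which is roughly the situation under $(...). This is a hedged sketch for a POSIX system, not the original poster's program: bytes_captured is a hypothetical name, and a fresh fully buffered stream on the pipe stands in for the child's stdout so the result does not depend on how the calling process has already used stdout.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>

/* Hypothetical demo: a child writes "hello\n" to a buffered stream on a pipe
   and then leaves via exit() (use_exit != 0) or via _exit() (use_exit == 0).
   Returns the number of bytes the parent received, or -1 on setup failure. */
long bytes_captured(int use_exit) {
    int fds[2];
    if (pipe(fds) != 0)
        return -1;

    fflush(NULL);                      /* keep pending parent output out of the child */
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {                    /* child */
        close(fds[0]);
        FILE *out = fdopen(fds[1], "w");
        if (out == NULL)
            _exit(1);
        setvbuf(out, NULL, _IOFBF, BUFSIZ);   /* fully buffered, like stdout on a pipe */
        fprintf(out, "hello\n");
        if (use_exit)
            exit(0);                   /* flushes every open stdio stream first */
        _exit(0);                      /* goes straight to the kernel, buffer is lost */
    }

    close(fds[1]);                     /* parent keeps only the read end */
    char buf[64];
    long total = 0;
    ssize_t n;
    while ((n = read(fds[0], buf, sizeof buf)) > 0)
        total += n;
    close(fds[0]);
    waitpid(pid, NULL, 0);
    return total;
}
```

With this setup, bytes_captured(1) should report the 6 bytes of "hello\n", while bytes_captured(0) should report 0, matching the empty $output in the question.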
As Michael explained, it is OK to link the C library dynamically. This is also how it is introduced in the "Programming from the Ground Up" book (see chapter 8).
However it is important to call exit from the C-library in order to end the program and not to bypass it, which was what I wrongly did by calling exit-syscall. As hinted by Michael, exit does a lot of clean up like flushing streams.
This is what happened: as explained here, the C library buffers the standard streams as follows:
No buffering for standard error.
If standard out/in is a terminal, it is line-buffered.
If standard out/in is not a terminal, it is fully buffered, and thus a flush is needed before a raw exit system call.
Which case applies is decided when printf is called for the first time for a stream.
So if try_printf is run directly in the terminal, the output of the program can be seen because hello has \n at the end (which triggers the flush in line-buffered mode) and stdout is a terminal, i.e. the second case.
Running try_printf via $(./try_printf) means that stdout is no longer a terminal (actually I don't know whether it is a temp file or a memory file), and thus the third case is in effect: an explicit flush is needed, i.e. a call to the C exit function.
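The three cases listed above can be written down as a small decision function. This is only a model of the described behavior, assuming a POSIX system; default_buffering is a hypothetical name and not part of any C library.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <unistd.h>

/* Hypothetical model of the rules above: returns the buffering mode
   (_IONBF, _IOLBF or _IOFBF) that a standard stream on descriptor fd
   would be expected to start with. */
int default_buffering(int fd) {
    if (fd == STDERR_FILENO)
        return _IONBF;                     /* case 1: stderr is unbuffered */
    return isatty(fd) ? _IOLBF             /* case 2: terminal, line-buffered */
                      : _IOFBF;            /* case 3: pipe or file, fully buffered */
}
```

For the descriptor of a pipe, such as the one the shell sets up for $(./try_printf), isatty returns 0, so the function reports _IOFBF, the case in which "hello\n" stays in the buffer until something flushes it.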
The C standard library often contains initialization code for the standard I/O streams — initialization code that you're bypassing by defining your own entry point. Try defining main instead of _start:
.globl main
main:
# _start code here.
and then build with gcc try_printf.s -o try_printf (i.e., without -nostdlib).