Conditional move problem - linux

Code fragment from Assembly exercise (GNU Assembler, Linux 32 bit)
.data
more:
.asciz "more\n"
.text
...
movl $more, %eax # this is compiled
cmova more, %eax # this is compiled
cmova $more, %eax # this is not compiled
Error: suffix or operands invalid for `cmova'
I can place string address to %eax using movl, but cmova is not compiled. I need the source operand to be $more and not more, to use it for printing. Finally, this value goes to %ecx register of Linux system call 4 (write).

The assembler is correct! The CMOVcc instructions are more limited than MOV: they can only move 16/32/64-bit values from memory into a register, or from one register to another. They don't support immediate (or 8-bit register) operands.
(Reference: http://www.intel.com/Assets/PDF/manual/253666.pdf - from the set of manuals available at http://www.intel.com/products/processor/manuals/index.htm .)

Related

Assistance on x86 Assembler running on Linux

I am currently learning a bit of Assembler on Linux and I need your advice.
Here is the small program:
.section .data
zahlen:
.float 12,44,5,143,223,55,0
.section .text
.global _start
_start:
movl $0, %edi
movl zahlen (,%edi,4), %eax
movl %eax, %ebx
begin_loop:
cmpl $0, %eax
je prog_end
incl %edi
movl zahlen (,%edi,4), %eax
cmpl %ebx, %eax
jle begin_loop
movl %eax, %ebx
jmp begin_loop
prog_end:
movl $1, %eax
int $0x80
The program seems to compiling and running fine.
But I have some unclear questions/behaviors:
if I check the return value, which is the highers number in register %ebx, with the command "echo %?" it always return 0. I expect the value 223.
Any Idea why this happens?
I checked with DDD and gdb compiling with debugging option. So i saw that the program runs the correct steps.
But if i want to exam the register with command ie. "i r eax" it only shows me the address i believe, not the value. Same on DDD. I see only registers rax rbx and so on.
Here i need some advise to get on the right track.
Any Help appreciated.
Thanks
The "main" registers eax, ebx, ecx, edx, etc. are all designed to work with integers only. A float is a shorthand term that typically refers to a very specific data format (namely, the IEEE-754 binary32 standard), for which your CPU has dedicated registers and hardware to work with. As you saw, you are allowed to load them into integer registers as-is, but the value isn't going to convert itself like it would in a high-level, dynamically-typed language. Your code loaded the raw bit pattern instead, which likely is not at all what you intended.
This is because assembly has no type safety or runtime type-checking. The CPU has no knowledge of what type you declared your data as in your program. So when loading from memory into eax the CPU assumes that the data is a 32-bit integer, even if you declared it in your source code as something else.
If you're curious as to what a float actually looks like you can check this out: Floating Point to Hex Calculator
Switching from float to long solved the problem. Think mistake by myself. Also compiling and linking as 32bit shows the right registers in the debugger.

Segmentation fault when trying to write to .bss [duplicate]

I am trying to load the address of 'main' into a register (R10) in the GNU Assembler. I am unable to. Here I what I have and the error message I receive.
main:
lea main, %r10
I also tried the following syntax (this time using mov)
main:
movq $main, %r10
With both of the above I get the following error:
/usr/bin/ld: /tmp/ccxZ8pWr.o: relocation R_X86_64_32S against symbol `main' can not be used when making a shared object; recompile with -fPIC
/usr/bin/ld: final link failed: Nonrepresentable section on output
collect2: error: ld returned 1 exit status
Compiling with -fPIC does not resolve the issue and just gives me the same exact error.
In x86-64, most immediates and displacements are still 32-bits because 64-bit would waste too much code size (I-cache footprint and fetch/decode bandwidth).
lea main, %reg is an absolute disp32 addressing mode which would stop load-time address randomization (ASLR) from choosing a random 64-bit (or 47-bit) address. So it's not supported on Linux except in position-dependent executables, or at all on MacOS where static code/data are always loaded outside the low 32 bits. (See the x86 tag wiki for links to docs and guides.) On Windows, you can build executables as "large address aware" or not. If you choose not, addresses will fit in 32 bits.
The standard efficient way to put a static address into a register is a RIP-relative LEA:
# RIP-relative LEA always works. Syntax for various assemblers:
lea main(%rip), %r10 # AT&T syntax
lea r10, [rip+main] # GAS .intel_syntax noprefix equivalent
lea r10, [rel main] ; NASM equivalent, or use default rel
lea r10, [main] ; FASM defaults to RIP-relative. MASM may also
See How do RIP-relative variable references like "[RIP + _a]" in x86-64 GAS Intel-syntax work? for an explanation of the 3 syntaxes, and Why are global variables in x86-64 accessed relative to the instruction pointer? (and this) for reasons why RIP-relative is the standard way to address static data.
This uses a 32-bit relative displacement from the end of the current instruction, like jmp/call. This can reach any static data in .data, .bss, .rodata, or function in .text, assuming the usual 2GiB total size limit for static code+data.
In position dependent code (built with gcc -fno-pie -no-pie for example) on Linux, you can take advantage of 32-bit absolute addressing to save code size. Also, mov r32, imm32 has slightly better throughput than RIP-relative LEA on Intel/AMD CPUs, so out-of-order execution may be able to overlap it better with the surrounding code. (Optimizing for code-size is usually less important than most other things, but when all else is equal pick the shorter instruction. In this case all else is at least equal, or also better with mov imm32.)
See 32-bit absolute addresses no longer allowed in x86-64 Linux? for more about how PIE executables are the default. (Which is why you got a link error about -fPIC with your use of a 32-bit absolute.)
# in a non-PIE executable, mov imm32 into a 32-bit register is even better
# same as you'd use in 32-bit code
## GAS AT&T syntax
mov $main, %r10d # 6 bytes
mov $main, %edi # 5 bytes: no REX prefix needed for a "legacy" register
## GAS .intel_syntax
mov edi, OFFSET main
;; mov edi, main ; NASM and FASM syntax
Note that writing any 32-bit register always zero-extends into the full 64-bit register (R10 and RDI).
lea main, %edi or lea main, %rdi would also work in a Linux non-PIE executable, but never use LEA with a [disp32] absolute addressing mode (even in 32-bit code where that doesn't require a SIB byte); mov is always at least as good.
The operand-size suffix is redundant when you have a register operand that uniquely determines it; I prefer to just write mov instead of movl or movq.
The stupid/bad way is a 10-byte 64-bit absolute address as an immediate:
# Inefficient, DON'T USE
movabs $main, %r10 # 10 bytes including the 64-bit absolute address
This is what you get in NASM if you use mov rdi, main instead of mov edi, main so many people end up doing this. Linux dynamic linking does actually support runtime fixups for 64-bit absolute addresses. But the use-case for that is for jump tables, not for absolute addresses as immediates.
movq $sign_extended_imm32, %reg (7 bytes) still uses a 32-bit absolute address, but wastes code bytes on a sign-extended mov to a 64-bit register, instead of implicit zero-extension to 64-bit from writing a 32-bit register.
By using movq, you're telling GAS you want a R_X86_64_32S relocation instead of a R_X86_64_64 64-bit absolute relocation.
The only reason you'd ever want this encoding is for kernel code where static addresses are in the upper 2GiB of 64-bit virtual address space, instead of the lower 2GiB. mov has slight performance advantages over lea on some CPUs (e.g. running on more ports), but normally if you can use a 32-bit absolute it's in the low 2GiB of virtual address space where a mov r32, imm32 works.
(Related: Difference between movq and movabsq in x86-64)
PS: I intentionally left out any discussion of "large" or "huge" memory / code models, where RIP-relative +-2GiB addressing can't reach static data, or maybe can't even reach other code addresses. The above is for x86-64 System V ABI's "small" and/or "small-PIC" code models. You may need movabs $imm64 for medium and large models, but that's very rare.
I don't know if mov $imm32, %r32 works in Windows x64 executables or DLLs with runtime fixups, but RIP-relative LEA certainly does.
Semi-related: Call an absolute pointer in x86 machine code - if you're JITing, try to put the JIT buffer near existing code so you can call rel32, otherwise movabs a pointer into a register.

"Segmentation fault", x86_64 assembly, AT&T syntax

I am running my code in a 64-bit Linux environment where the Linux Kernel is built with IA32_EMULATION and X86_X32 disabled.
In the book Programming from the Ground Up the very first program doesn't do anything except produce a segfault:
.section .data
.section .text
.globl _start
_start:
movl $1, %eax
movl $0, %ebx
int $0x80
I convert the code to use x86-64 instructions but it also segfaults:
.section .data
.section .text
.globl _start
_start:
movq $1, %rax
movq $0, %rbx
int $0x80
I assembled both these programs like this:
as exit.s -o exit.o
ld exit.o -o exit
Running ./exit gives Segmentation fault for both. What am I doing wrong?
P.S. I have seen a lot of tutorials assembling code with gcc, however I'd like to use gas.
Update
Combining comments and the answer, here's the final version of the code:
.section .data
.section .text
.globl _start
_start:
movq $60, %rax
xor %rbx, %rbx
syscall
int $0x80 is the 32bit ABI. On normal kernels (compiled with IA32 emulation), it's available in 64bit processes, but you shouldn't use it because it only supports 32bit pointers, and some structs have a different layout.
See the x86 tag wiki for info on making 64bit Linux system calls. (Also ZX485's answer on this question). There are many differences, including the fact that the syscall instruction clobbers %rcx and %r11, unlike the int $0x80 ABI.
In a kernel without IA32 emulation, like yours, running int $0x80 is probably the same as running any other invalid software interrupt, like int $0x79. Single-stepping that instruction in gdb (on my 64bit 4.2 kernel that does include IA32 emulation) results in a segfault on that instruction.
It doesn't return and keep executing garbage bytes as instructions (which would also result in a SIGSEGV or SIGILL), or keep executing until it jumped to (or reached normally) an unmapped page. If it did, that would be the mechanism for segfaulting.
You can run a process under strace, e.g. strace /bin/true --version to make sure it's making the system calls you thought it would. You can also use gdb to see where a program segfaults. Using a debugger is essential, moreso than in most languages, because the failure mode in asm is usually just a segfault.
The first observation is that the code in both your examples effectively do the same thing, but are encoded differently. The site x86-64.org has some good information for those starting out with x86-64 development. The first code snippet that uses 32-bit registers is equivalent to the second because of Implicit Zero Extend:
Implicit zero extend
Results of 32-bit operations are implicitly zero extended to 64-bit values. This differs from 16 and 8 bit operations, that don't affect the upper part of registers. This can be used for code size optimisations in some cases, such as:
movl $1, %eax # one byte shorter movq $1, %rax
xorq %rax, %rax # three byte equivalent of mov $0,%rax
andl $5, %eax # equivalent for andq $5, %eax
The question is, why does this code segfault? If you had run this code on a typical x86-64 Linux distro your code may have exited as expected without generating a segfault. The reason that your code is failing is because you are using a custom kernel with IA32 emulation off.
IA32 emulation in the Linux kernel does allow you to use the 32-bit int 0x80 interrupt to make calls using the traditional 32-bit system call mechanism. This is an emulation layer, and doesn't support passing pointers that can't be represented in a 32-bit register. This is the case for stack based pointers since they fall outside the 4gb address space, and can't be accessed with 32-bit pointers.
Your system has IA32 emulation off, and because of that int 0x80 doesn't exist for backwards compatibility. The result is that the int 0x80 interrupt will throw a segmentation fault and your application will fail.
In x86-64 code it is preferred that you use the syscall instruction to make system calls to the 64-bit Linux kernel. This mechanism supports 64-bit operands and pointers where necessary. Ryan Chapman's site has some good information on the 64-bit SYSCALL interface which differs considerably from the 32-bit int 0x80 mechanism.
Your code could have been written this way to work in a 64-bit environment without IA32 emulation:
.section .text
.globl _start
_start:
mov $60, %eax
xor %ebx, %ebx
syscall
Other useful information on doing 64-bit development can be found in the 64-bit System V ABI. This document also better describes the general syscall convention used by the Linux kernel including side effects in Section A.2 . This document is also very informative if you also wish to interface with third party libraries and modules (like the C library etc).
The reason is that the Linux System Call Table for x86_64 is different from the table for x86.
In x86, SYS_EXIT is 1. In x64, SYS_EXIT is 60 and 1 is the value for SYS_WRITE, which, if called, expects a const char *buf in %RSI. If that buffer pointer is invalid, it probably segfaults.
%rax System call %rdi %rsi %rdx %r10 %r8 %r9
1 sys_write unsigned int fd const char *buf size_t
60 sys_exit int error_code

Confused about AT&T Assembly Syntax for addressing modes vs. jmp and far jmp

In AT&T Assembly Syntax, literal values must be prefixed with a $ sign
But, in Memory Addressing, literal values do not have $ sign
for example:
mov %eax, -100(%eax)
and
jmp 100
jmp $100, $100
are different.
My question is why the $ prefix so confusing?
jmp 100 is a jump to absolute address 100, just like jmp my_label is a jump to the code at my_label. EIP = 100 or EIP = the address of my_label.
(jmp 100 assembles to a jmp rel32 with a R_386_PC32 relocation, asking the linker to fill in the right relative offset from the jmp instruction's own address to the absolute target.)
So in AT&T syntax, you can think of jmp x as sort of like an LEA into EIP.
Or another way to think of it is that code-fetch starts from the specified memory location. Requiring a $ for an immediate wouldn't really make sense, because the machine encoding for direct near jumps uses a relative displacement, not absolute. (http://felixcloutier.com/x86/JMP.html).
Also, indirect jumps use a different syntax (jmp *%eax register indirect or jmp *(%edi, %ecx, 4) memory indirect), so a distinction between immediate vs. memory isn't needed.
But far jump is a different story.
jmp ptr16:32 and jmp m16:32 are both available in 32-bit mode, so you do need to distinguish between ljmp *(%edi) vs. ljmp $100, $100.
Direct far jump (jmp far ptr16:32) does take an absolute segment:offset encoded into the instruction, just like add $123, %eax takes an immediate encoded into the instruction.
Question: My question is why the prefixed $ so confused ?
$ prefix is used to load/use the value as is.
example:
movl $5, %eax #copy value 5 to eax
addl $10,%eax # add 10 + 5 and store result in eax
$5, $10 are values (constants) and are not take from any external source like register or memory
In Memory addressing, Specifically "Direct addressing mode" we want to use the value stored in particular memory location.
example:
movl 20, %eax
The above would get the value stored in Memory location at 20.
Practially since memory locations are numbered in hexadecimal (0x00000000 to 0xfffffffff), it's difficult to specify the memory locations in hexadecimals in instructions. So we assign a symbol to the location
Example:
.section .data
mydata:
long 4
.section .text
.globl _start
_start
movl mydata, %eax
In the above code, mydata is symbolic representation given a particular memory location where value "4" is stored.
I hope the above clears your confusion.

Linux X86-64 assembly and printf

I am reading some linux assembly manuals and found idea about using printf() function. I need it to output register values for debugging reasons in binary form to terminal, but now I am tried simply to test that function with text.
I am stuck, because of segfault when I am using pushq instead of pushl. How can I change this program to output strings and binary form of registers?
.data
input_prompt:
.string "Hello, world!"
printf_format:
.string "%5d "
printf_newline:
.string "\n"
size:
.long 0
.text
.globl main
main:
pushq $input_prompt
call printf
movl $0, %eax
ret
It was compiled by GCC as:
gcc tmp.S -o tmp
Linux (and Windows) x86-64 calling convention has the first few arguments not on the stack, but in registers instead
See http://www.x86-64.org/documentation/abi.pdf (page 20)
Specifically:
If the class is MEMORY, pass the argument on the stack.
If the class is INTEGER, the next available register of the sequence %rdi, %rsi, %rdx, %rcx, %r8 and %r9 is used.
If the class is SSE, the next available vector register is used, the registers are taken in the order from %xmm0 to %xmm7.
If the class is SSEUP, the eightbyte is passed in the next available eightbyte chunk of the last used vector register.
If the class is X87, X87UP or COMPLEX_X87, it is passed in memory.
The INTEGER class is anything that will fit in a general purpose register, so that's what you would use for string pointers as well.

Resources