Can I multiply a register's value by an immediate number to add the result to another register? - linux

Learning Assembly with NASM, Ubuntu, 32 bits.
My array in .data:
ary db 1,2,2,4,5 ; Five elements of one byte each
And some number:
tmp db 2 ; Holds the number 2
Let's say I want to print the element at index 4 in the array (so it would be 5).
I know I could do this:
mov EAX,4
mov EBX,0
mov ECX,ary ; Put the array's address in ECX
add ECX,4 ; Move address four bytes to the right
add byte [ECX],'0' ; The value at this address to ASCII
mov EDX,1
int 0x80
However, for whatever reasons, I decided that instead of writing the constant number 4, I want to do it by multiplying my variable (which is 2) by 2.
This is the updated code:
mov EAX,[tmp] ; Put the number 2 in EAX
mov ECX,ary ; Put the array's address in ECX
add ECX,EAX * 2 ; Move (2 * 2) = 4 bytes to the right
add byte [ECX],'0' ; Decimal to ASCII
mov EAX,4
mov EBX,0
mov EDX,1
int 0x80
This doesn't work at add ECX,EAX * 2:
invalid operand type
But why? Doesn't EAX evaluate to 2, making this equivalent to
add ECX,2 * 2
Curiously, these do work:
add ECX,EAX * 1 ; Moves by 2
add ECX,EAX * 0 ; Moves by 0
The above suggests to me that the answer is no, and that multiplying by 1 or 0 works only because the assembler doesn't actually need to do any multiplication to know the answer in the first place.
Does this mean that to achieve what I want, I do have to use the mul instruction?

You CAN do multiplication and adding in one instruction if you use lea:
lea ECX,[ECX+EAX*2]

In x86, although lea supports multiplication by a constant, the add instruction doesn't support an operand that multiplies a register by a constant. It supports additive offsets, but not multiplication. I assume, as you noted, that the assembler is being somewhat forgiving in this case in accepting add ECX,EAX*0 and add ECX,EAX*1 as equivalent to add ECX,0 and add ECX,EAX, respectively.
You would instead need to do something like this:
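mov ECX,ary ; Put the array's address in ECX
movzx EAX,byte [tmp] ; Put the number 2 in EAX (tmp is a single byte, so zero-extend it)
shl EAX,1 ; Multiply by 2 (instead of imul EAX,EAX,2)
add ECX,EAX ; Move (2 * 2) = 4 bytes to the right
add byte [ECX],'0' ; Decimal to ASCII
mov EAX,4 ; sys_write
mov EBX,1 ; File descriptor 1 (stdout)
mov EDX,1 ; Write one byte
int 0x80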

The LEA instruction can perform two additions and one limited multiplication at once. The common syntax is:
lea reg, [offset+reg+const*reg]
Here, reg is any general-purpose register (the scaled one cannot be ESP), offset is a constant number, and const is one of the constants 1, 2, 4 or 8.
This makes the instruction powerful enough to compute some fairly complex expressions in a single step:
The equation from the question:
add ECX,EAX * 2
can be computed this way:
lea ecx, [ecx+2*eax]
There are many other uses:
lea eax, [ebx+2*ebx] ; eax = 3*ebx
lea eax, [eax+4*eax] ; eax = 5*eax
lea eax, [ecx+8*ecx] ; eax = 9*ecx
lea eax, [1234+ebx+8*ecx]
Note, that FASM allows shorter syntax for the above examples:
lea eax, [3*ebx]
lea eax, [5*eax]
lea eax, [9*ecx]
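An additional advantage of the lea instruction is that it does not affect the flags. It is also generally fast on x86 CPUs, although on some microarchitectures the three-component form (base + scaled index + offset) has higher latency than the simpler forms.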
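To make the addressing arithmetic concrete, here is a small Python model (not assembly) of what lea computes in 32-bit mode. The function name lea32 and its parameters are made up for illustration. It only mirrors the offset + register + constant*register formula described above and wraps the result to 32 bits, so treat it as a sketch for checking the arithmetic by hand.

```python
MASK32 = 0xFFFFFFFF  # registers are 32 bits wide in this model


def lea32(base=0, index=0, scale=1, disp=0):
    """Model of: lea dst, [base + index*scale + disp] (hypothetical helper)."""
    if scale not in (1, 2, 4, 8):
        # The instruction encoding only offers these four scale factors.
        raise ValueError("scale must be 1, 2, 4 or 8")
    return (base + index * scale + disp) & MASK32


# lea ecx, [ecx+2*eax] with ecx = 0x1000 (a pretend address of ary) and eax = 2
print(hex(lea32(base=0x1000, index=2, scale=2)))  # 0x1004
# lea eax, [ebx+2*ebx] computes 3*ebx
print(lea32(base=7, index=7, scale=2))            # 21
```

The scale check is the reason add ECX,EAX*2 has no direct counterpart: only the address calculation has a built-in multiplier, and only for these four factors.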

Related

Converting a string of numbers into an integer in Assembly x86

I'm trying to convert a user-entered string of digits to an integer.
For example, if the user enters "1234" as a string, I want 1234 stored in a DWORD variable.
I'm using lodsb and stosb to get the individual bytes. My problem is I can't get the algorithm right for it. My code is below:
mov ecx, (SIZEOF num)-1
mov esi, OFFSET num
mov edi, OFFSET ints
cld
counter:
lodsb
sub al,48
stosb
loop counter
I know that the ECX counter is going to be a bit off also because it's reading the entire string not just the 4 bytes, so it's actually 9 because the string is 10 bytes.
I was trying to use powers of 10 to multiply the individual bytes but I'm pretty new to Assembly and can't get the right syntax for it. If anybody can help with the algorithm that would be great. Thanks!
A simple implementation might be
mov ecx, digitCount
mov esi, numStrAddress
cld ; We want to move upward in mem
xor edx, edx ; edx = 0 (We want to have our result here)
xor eax, eax ; eax = 0 (We need that later)
counter:
imul edx, 10 ; Multiply prev digits by 10
lodsb ; Load next char to al
sub al,48 ; Convert to number
add edx, eax ; Add new number
; Here we used that the upper bytes of eax are zeroed
loop counter ; Move to next digit
; edx now contains the result
mov [resultIntAddress], edx
Of course there are ways to improve it, like avoiding the use of imul.
EDIT: Fixed the ecx value
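The accumulate-and-multiply idea in that loop can be checked quickly outside the assembler. The following Python sketch is only a model of it: parse_decimal is a made-up name, the digit validation is an addition that the assembly loop does not perform, and the 32-bit mask imitates the width of edx.

```python
def parse_decimal(digits):
    """Accumulate ASCII digits the way the loop above does (illustrative only)."""
    result = 0                                   # plays the role of edx
    for ch in digits:
        value = ord(ch) - 48                     # sub al, 48
        if not 0 <= value <= 9:
            # Extra safety check; the assembly version assumes clean input.
            raise ValueError(f"not a decimal digit: {ch!r}")
        # imul edx, 10 followed by add edx, eax, wrapped to 32 bits
        result = (result * 10 + value) & 0xFFFFFFFF
    return result


print(parse_decimal("1234"))  # 1234
```

Processing the most significant digit first is what removes the need for explicit powers of 10: each step multiplies everything seen so far by 10 before adding the next digit.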

Finding null pointer after environment variables

I'm reading a book (Assembly Language Step by Step, Programming with Linux by Jeff Duntemann) and I'm trying to change this program that shows arguments to instead show the environment variables. I'm trying to use only what has been taught so far (no C), and I've gotten the program to print environment variables, but only after I counted how many I had and used an immediate, which is obviously not satisfying. Here's what I have:
global _start ; Linker needs this to find the entry point!
_start:
nop ; This no-op keeps gdb happy...
mov ebp,esp ; Save the initial stack pointer in EBP
; Copy the command line argument count from the stack and validate it:
cmp dword [ebp],MAXARGS ; See if the arg count exceeds MAXARGS
ja Error ; If so, exit with an error message
; Here we calculate argument lengths and store lengths in table ArgLens:
xor eax,eax ; Searching for 0, so clear AL to 0
xor ebx,ebx ; Stack address offset starts at 0
ScanOne:
mov ecx,0000ffffh ; Limit search to 65535 bytes max
mov edi,dword [ebp+16+ebx*4] ; Put address of string to search in EDI
mov edx,edi ; Copy starting address into EDX
cld ; Set search direction to up-memory
repne scasb ; Search for null (0 char) in string at edi
jnz Error ; REPNE SCASB ended without finding AL
mov byte [edi-1],10 ; Store an EOL where the null used to be
sub edi,edx ; Subtract position of 0 from start address
mov dword [ArgLens+ebx*4],edi ; Put length of arg into table
inc ebx ; Add 1 to argument counter
cmp ebx,44; See if arg counter exceeds argument count
jb ScanOne ; If not, loop back and do another one
; Display all arguments to stdout:
xor esi,esi ; Start (for table addressing reasons) at 0
Showem:
mov ecx,[ebp+16+esi*4] ; Pass offset of the message
mov eax,4 ; Specify sys_write call
mov ebx,1 ; Specify File Descriptor 1: Standard Output
mov edx,[ArgLens+esi*4] ; Pass the length of the message
int 80H ; Make kernel call
inc esi ; Increment the argument counter
cmp esi,44 ; See if we've displayed all the arguments
jb Showem ; If not, loop back and do another
jmp Exit ; We're done! Let's pack it in!
I moved the displacement up past the first null pointer to the first environment variable([ebp+4+ebx*4] > [ebp+16+ebx*4]) in both ScanOne and Showem. When I compare to the number of environment variables I have(44) it will print them just fine without a segfault, comparing to 45 only gives me a segfault.
I've tried using the pointers to compare to zero(in search of null pointer): cmp dword [ebp+16+ebx*4],0h but that just returns a segfault. I'm sure that the null pointer comes after the last environment variable in the stack but it's like it won't do anything up to and beyond that.
Where am I going wrong?
What if your program has 2, 3, or 0 args, would your code still work? Each section is separated by a NULL pointer (4 bytes of 0). You could just get the count of parameters, use that as your array index, and skip over the args until you get to the NULL pointer. Now you have your Environment Block:
extern printf, exit
section .data
fmtstr db "%s", 10, 0
fmtint db "%d", 10, 0
global main
section .text
main:
push ebp
mov ebp, esp
mov ebx, [ebp + 4]
.SkipArgs:
mov edi, dword [ebp + 4 * ebx]
inc ebx
test edi, edi
jnz .SkipArgs
.ShowEnvBlock:
mov edi, dword [ebp + 4 * ebx]
test edi, edi
jz .NoMore
push edi
push fmtstr
call printf
add esp, 4 * 2
inc ebx
jmp .ShowEnvBlock
.NoMore:
push 0
call exit
Yes I use printf here, but you just swap that with your system call.
Want to go ahead and apologize, this always happens to me(fix it myself after asking question on stackoverflow). I think when I tried comparing pointer to 0h I typed something wrong. Here's what I did:
inc ebx
cmp dword [ebp+16+ebx*4],0h
jnz ScanOne
and
inc esi
cmp dword [ebp+16+esi*4],0h
jnz Showem
This worked.
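For readers who want to see the layout being walked, here is a rough Python model of the words on the stack at process entry on Linux: the argument count, the argument pointers, a zero word, the environment pointers, and a final zero word. The function name env_pointers and the addresses in the example are invented for illustration; the sketch only shows why skipping by the argument count and then stopping at the first zero finds every environment pointer regardless of how many arguments were passed.

```python
def env_pointers(stack):
    """Return the environment pointers from a list of stack words.

    stack[0] is argc, followed by argc argument pointers, a 0 word,
    the environment pointers, and a final 0 word.
    """
    argc = stack[0]
    i = 1 + argc + 1            # skip argc itself, the arguments, and their NULL
    found = []
    while stack[i] != 0:        # cmp dword [...], 0 / jnz
        found.append(stack[i])
        i += 1
    return found


# Two arguments and three environment strings, with made-up addresses.
words = [2, 0xBF01, 0xBF10, 0, 0xBF20, 0xBF30, 0xBF40, 0]
print([hex(p) for p in env_pointers(words)])  # ['0xbf20', '0xbf30', '0xbf40']
```

A fixed displacement such as 16 corresponds to assuming argc is always 2, which is why the count of arguments matters.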

Copy string from BSS variable to BSS variable in Assembly

Let's suppose I have two strings stored in variables created in the .BSS section.
var1 resw 5 ; this is "abcde" (UNICODE)
var2 resw 5 ; here I will copy the first one
How would I do this with NASM?
I tried something like this:
mov ebx, var2 ; Here we will copy the string
mov dx, 5 ; Length of the string
mov esi, dword var1 ; The variable to be copied
.Copy:
lodsw
mov [ebx], word ax ; Copy the character into the address from EBX
inc ebx ; Increment the EBX register for the next character to copy
dec dx ; Decrement DX
cmp dx, 0 ; If DX is 0 we reached the end
jg .Copy ; Otherwise copy the next one
So, first problem is that the string is not copied as UNICODE but as ASCII and I don't know why. Secondly, I know there might be some not recommended use of some registers. And lastly, I wonder if there is some quicker way of doing this (maybe there are instructions specially created for this kind of operations with strings). I'm talking about 8086 processors.
inc ebx ; Increment the EBX register for the next character to copy
A word is 2 bytes, but you're only stepping ebx 1 byte ahead. Replace inc ebx with add ebx,2.
Michael already answered about the obvious problem of the demonstrated code.
But there is also another layer of understanding. It does not matter how you copy the string from one buffer to another: by bytes, words or double words. The result is always an exact copy of the string.
So how to copy the string is a matter of optimization. Using rep movsd is a simple and usually fast way.
Here is one example:
; ecx contains the length of the string in bytes
; esi - the address of the source, aligned on dword
; edi - the address of the destination aligned on dword
push ecx
shr ecx, 2
rep movsd
pop ecx
and ecx, 3
rep movsb
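The length arithmetic in that snippet (shr ecx, 2 for the dword count, and ecx, 3 for the leftover bytes) can be sanity-checked with a short Python model. The names split_copy_length and copy_bytes are invented for this sketch, and the loops only imitate what rep movsd and rep movsb do; this is not how one would copy bytes in Python.

```python
def split_copy_length(nbytes):
    """Return (dword_count, leftover_bytes), as in shr ecx, 2 / and ecx, 3."""
    return nbytes >> 2, nbytes & 3


def copy_bytes(src, nbytes):
    """Copy nbytes from src in 4-byte chunks, then byte by byte (model only)."""
    dwords, tail = split_copy_length(nbytes)
    dst = bytearray()
    pos = 0
    for _ in range(dwords):          # rep movsd
        dst += src[pos:pos + 4]
        pos += 4
    for _ in range(tail):            # rep movsb
        dst.append(src[pos])
        pos += 1
    return bytes(dst)


text = "abcde".encode("utf-16-le")   # 5 characters of 2 bytes each = 10 bytes
print(split_copy_length(len(text)))          # (2, 2)
print(copy_bytes(text, len(text)) == text)   # True
```

The example also shows the point made above: the copy is byte-exact regardless of the chunk size, so a 2-byte-per-character string survives unchanged.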

How to print a number in assembly NASM?

Suppose that I have an integer number in a register, how can I print it? Can you show a simple example code?
I already know how to print a string such as "hello, world".
I'm developing on Linux.
If you're already on Linux, there's no need to do the conversion yourself. Just use printf instead:
;
; assemble and link with:
; nasm -f elf printf-test.asm && gcc -m32 -o printf-test printf-test.o
;
section .text
global main
extern printf
main:
mov eax, 0xDEADBEEF
push eax
push message
call printf
add esp, 8
ret
message db "Register = %08X", 10, 0
Note that printf uses the cdecl calling convention so we need to restore the stack pointer afterwards, i.e. add 4 bytes per parameter passed to the function.
You have to convert it in a string; if you're talking about hex numbers it's pretty easy. Any number can be represented this way:
0xa31f = 0xf * 16^0 + 0x1 * 16^1 + 3 * 16^2 + 0xa * 16^3
So when you have this number you have to split it like I've shown then convert every "section" to its ASCII equivalent.
Getting the four parts is easily done with some bit magic, in particular with a right shift to move the part we're interested in into the lowest four bits, then an AND of the result with 0xf to isolate it from the rest. Here's what I mean (suppose we want to take the 3):
0xa31f -> shift right by 8 = 0x00a3 -> AND with 0xf = 0x0003
Now that we have a single number we have to convert it into its ASCII value. If the number is less than or equal to 9 we can just add 0's ASCII value (0x30); if it's greater than 9 we have to add a's ASCII value minus 10 (0x61 - 0xa = 0x57), so that 0xa becomes 'a'.
Here it is, now we just have to code it:
mov si, ??? ; si points to the target buffer (at least 5 bytes)
mov bx, 0a31fh ; bx contains the number we want to convert
add si, 4 ; fill the buffer from the end, lowest digit first
xor al, al
mov [si], al ; null terminator after the fourth digit
mov cx, 4 ; cx is our counter: four hex digits in 16 bits
convert_loop:
mov ax, bx ; load the number into ax
and al, 0fh ; we want the lowest 4 bits
cmp al, 9h ; check what we should add
ja greater_than_9
add al, 30h ; 0..9: add 0x30 ('0')
jmp converted
greater_than_9:
add al, 57h ; 10..15: add 0x57 ('a' - 10)
converted:
dec si ; step back to the next higher digit position
mov [si], al ; store the ASCII digit
shr bx, 4 ; get the next part
dec cx ; one less to do
jnz convert_loop
; si points to the start of the target buffer again
PS: I know this is 16 bit code but I still use the old TASM :P
PPS: this is Intel syntax, converting to AT&T syntax isn't difficult though, look here.
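As a cross-check of the shift-and-mask idea, here is a short Python sketch that converts a 16-bit value to hex one nibble at a time. The name to_hex16 is made up for illustration. Note that the letter branch adds 0x57, which is ord('a') - 10, so that the value 10 maps to 'a'.

```python
def to_hex16(value):
    """Convert a 16-bit number to 4 lowercase hex digits, nibble by nibble."""
    if not 0 <= value <= 0xFFFF:
        raise ValueError("value must fit in 16 bits")
    chars = []
    for shift in (12, 8, 4, 0):               # highest nibble first
        nibble = (value >> shift) & 0xF       # shift right, then AND with 0xf
        if nibble <= 9:
            chars.append(chr(nibble + 0x30))  # '0'..'9'
        else:
            chars.append(chr(nibble + 0x57))  # 0x57 is ord('a') - 10
    return "".join(chars)


print(to_hex16(0xA31F))  # a31f
```

Walking the shifts from 12 down to 0 produces the digits in reading order, which avoids having to reverse the string afterwards.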
Linux x86-64 with printf
main.asm
default rel ; make [rel format] the default, you always want this.
extern printf, exit ; NASM requires declarations of external symbols, unlike GAS
section .rodata
format db "%#x", 10, 0 ; C 0-terminated string: "%#x\n"
section .text
global main
main:
sub rsp, 8 ; re-align the stack to 16 before calling another function
; Call printf.
mov esi, 0x12345678 ; "%x" takes a 32-bit unsigned int
lea rdi, [rel format]
xor eax, eax ; AL=0 no FP args in XMM regs
call printf
; Return from main.
xor eax, eax
add rsp, 8
ret
GitHub upstream.
Then:
nasm -f elf64 -o main.o main.asm
gcc -no-pie -o main.out main.o
./main.out
Output:
0x12345678
Notes:
sub rsp, 8: How to write assembly language hello world program for 64 bit Mac OS X using printf?
xor eax, eax: Why is %eax zeroed before a call to printf?
-no-pie: plain call printf doesn't work in a PIE executable (-pie), the linker only automatically generates a PLT stub for old-style executables. Your options are:
call printf wrt ..plt to call through the PLT like traditional call printf
call [rel printf wrt ..got] to not use a PLT at all, like gcc -fno-plt.
Like GAS syntax call *printf@GOTPCREL(%rip).
Either of these is fine in a non-PIE executable as well, and doesn't cause any inefficiency unless you're statically linking libc, in which case call printf can resolve to a call rel32 directly to libc, because the offset from your code to the libc function would be known at static linking time.
See also: Can't call C standard library function on 64-bit Linux from assembly (yasm) code
If you want hex without the C library: Printing Hexadecimal Digits with Assembly
Tested on Ubuntu 18.10, NASM 2.13.03.
It depends on the architecture/environment you are using.
For instance, if I want to display a number on linux, the ASM code will be different from the one I would use on windows.
Edit:
You can refer to THIS for an example of conversion.
I'm relatively new to assembly, and this obviously is not the best solution,
but it works. The main function is _iprint. It first checks whether the
number in eax is negative and prints a minus sign if so, then it proceeds
to print the individual digits by calling the function _dprint for
every digit. The idea is the following: if we have 512, then it is equal to 512 = (5 * 10 + 1) * 10 + 2 = Q * 10 + R, so we can find the last digit of a number by dividing it by 10 and
taking the remainder R. If we do that in a loop, the digits come out in
reverse order, so we push them onto the stack, and when we then
write them to stdout they are popped off in the right order.
; Build : nasm -f elf -o baz.o baz.asm
; ld -m elf_i386 -o baz baz.o
section .bss
c: resb 1 ; character buffer
section .data
section .text
; writes an ascii character from eax to stdout
_cprint:
pushad ; push registers
mov [c], al ; store ascii value at c (c is only one byte wide)
mov eax, 0x04 ; sys_write
mov ebx, 1 ; stdout
mov ecx, c ; copy c to ecx
mov edx, 1 ; one character
int 0x80 ; syscall
popad ; pop registers
ret ; bye
; writes a digit stored in eax to stdout
_dprint:
pushad ; push registers
add eax, '0' ; get digit's ascii code
mov [c], al ; store it at c (c is only one byte wide)
mov eax, 0x04 ; sys_write
mov ebx, 1 ; stdout
mov ecx, c ; pass the address of c to ecx
mov edx, 1 ; one character
int 0x80 ; syscall
popad ; pop registers
ret ; bye
; now lets try to write a function which will write an integer
; number stored in eax in decimal at stdout
_iprint:
pushad ; push registers
cmp eax, 0 ; check if eax is negative
jge Pos ; if not proceed in the usual manner
push eax ; store eax
mov eax, '-' ; print minus sign
call _cprint ; call character printing function
pop eax ; restore eax
neg eax ; make eax positive
Pos:
mov ebx, 10 ; base
mov ecx, 1 ; number of digits counter
Cycle1:
mov edx, 0 ; set edx to zero before dividing, otherwise div may
; raise SIGFPE (arithmetic exception) when the quotient overflows
div ebx ; divide edx:eax by ebx; eax now holds the
; quotient and edx the remainder
push edx ; digits we have to write are in reverse order
cmp eax, 0 ; exit loop condition
jz EndLoop1 ; we are done
inc ecx ; increment number of digits counter
jmp Cycle1 ; loop back
EndLoop1:
; write the integer digits by poping them out from the stack
Cycle2:
pop eax ; pop up the digits we have stored
call _dprint ; and print them to stdout
dec ecx ; decrement number of digits counter
jz EndLoop2 ; if it's zero we are done
jmp Cycle2 ; loop back
EndLoop2:
popad ; pop registers
ret ; bye
global _start
_start:
nop ; gdb break point
mov eax, -345 ;
call _iprint ;
mov eax, 0x01 ; sys_exit
mov ebx, 0 ; error code
int 0x80 ; end
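The divide, push, then pop scheme used by _iprint can be tried out in a few lines of Python. This is only a model of the control flow: int_to_decimal is an invented name, a Python list stands in for the machine stack, and the result is returned as a string instead of being written to stdout one character at a time.

```python
def int_to_decimal(n):
    """Return the decimal text of n using the divide-and-push scheme above."""
    out = []
    if n < 0:
        out.append("-")                 # print the minus sign first
        n = -n                          # neg eax
    stack = []
    while True:
        n, remainder = divmod(n, 10)    # div ebx: quotient and remainder
        stack.append(remainder)         # push edx (lowest digit comes first)
        if n == 0:
            break
    while stack:
        # pop eax / add eax, '0': digits now come out highest first
        out.append(chr(stack.pop() + ord("0")))
    return "".join(out)


print(int_to_decimal(-345))  # -345
```

Because the loop body runs once before the zero test, the value 0 still produces a single "0" digit, just as in the assembly version.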
Because you didn't specify the number representation, I wrote the following code for an unsigned number in any base (not too big, of course), so you could use it:
BITS 32
global _start
section .text
_start:
mov eax, 762002099 ; unsigned number to print
mov ebx, 36 ; base to represent the number, do not set it too big
call print
;exit
mov eax, 1
xor ebx, ebx
int 0x80
print:
mov ecx, esp
sub esp, 36 ; reserve space for the number string, for base-2 it takes 33 bytes with new line, aligned by 4 bytes it takes 36 bytes.
mov edi, 1
dec ecx
mov [ecx], byte 10
print_loop:
xor edx, edx
div ebx
cmp dl, 9 ; if remainder>9 go to use_letter
jg use_letter
add dl, '0'
jmp after_use_letter
use_letter:
add dl, 'W' ; letters from 'a' to ... in ascii code
after_use_letter:
dec ecx
inc edi
mov [ecx],dl
test eax, eax
jnz print_loop
; system call to print, ecx is a pointer on the string
mov eax, 4 ; system call number (sys_write)
mov ebx, 1 ; file descriptor (stdout)
mov edx, edi ; length of the string
int 0x80
add esp, 36 ; release space for the number string
ret
It's not optimised for bases that are powers of two and doesn't use printf from libc.
The function print outputs the number followed by a new line. The number string is built on the stack. Assemble with nasm.
Output:
clockz
https://github.com/tigertv/stackoverflow-answers/tree/master/8194141-how-to-print-a-number-in-assembly-nasm
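The backward-filled buffer used by print can be modelled in Python to check results such as the clockz output above. The function name to_base and the 33-byte buffer size are choices made for this sketch (32 binary digits plus the newline); the digit mapping follows the assembly, where adding 'W' (which is 'a' - 10) turns the values 10 to 35 into the letters a to z.

```python
def to_base(n, base):
    """Render an unsigned 32-bit integer in the given base (2..36) plus a newline."""
    if not 2 <= base <= 36:
        raise ValueError("base must be between 2 and 36")
    if not 0 <= n <= 0xFFFFFFFF:
        raise ValueError("n must be an unsigned 32-bit value")
    buf = bytearray(33)              # 32 binary digits at most, plus the newline
    pos = len(buf) - 1
    buf[pos] = 10                    # newline is stored first, at the very end
    while True:
        n, r = divmod(n, base)       # div ebx: quotient stays, remainder is the digit
        pos -= 1                     # dec ecx: move one byte towards the start
        buf[pos] = r + ord("0") if r <= 9 else r + ord("W")
        if n == 0:
            break
    return buf[pos:].decode("ascii")


print(to_base(762002099, 36), end="")  # clockz
```

Filling the buffer from the end means the digits land in reading order without a separate reversal step, which is the same trick the assembly uses by decrementing ecx.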

Multiplying using shifts in Assembly. But getting a way too high number out! Where am I going wrong?

I am having issues with using shifts to multiply two numbers given by the user.
It asks the user to enter two integers and it is supposed to multiply them.
My program works well in asking for the integers, but when it gives the product it is an astronomical number nowhere near being correct.
Where am I going wrong? what register is it reading?
%include "asm_io.inc"
segment .data
message1 db "Enter a number: ", 0
message2 db "Enter another number: ", 0
message3 db "The product of these two numbers is: ", 0
segment .bss
input1 resd 1
input2 resd 1
segment .text
Global main
main:
enter 0,0
pusha
mov eax, message1 ; print out first message
call print_string
call read_int ; input first number
mov eax, [input1]
mov eax, message2 ; print out second message
call print_string
call read_int ; input second number
mov ebx, [input2]
cmp eax, 0 ; compares eax to zero
cmp ebx, 0 ; compares ebx to zero
jnz LOOP ;
LOOP:
shl eax, 1
dump_regs 1
mov eax, message3 ; print out product
call print_string
mov ebx, eax
call print_int
You are going wrong in pretty much everything besides asking for the numbers.
You are acting as if read_int writes the integer it reads into input1 the first time it is called and into input2 the second time. This is almost certainly not the case.
Even were that the case, you load the first number into eax and then immediately overwrite it with the address of message2.
Even if eax and ebx were loaded correctly with the input values, your code that is supposed to be multiplying the two is actually doing something along the lines of "if the second number is non-zero, multiply eax by 2. Otherwise leave it alone."
Even were the loop arranged correctly, it would be multiplying eax by 2 to the power of ebx.
Then you overwrite this result with the address of message3 anyway, so none of that matters.
In the end, it is impossible to determine what register is getting printed from this code. Between this question and your other question, you seem to be expecting print_int to print any of eax, ebx, or ecx.
Ignoring the code you've posted, and looking strictly at how to multiply numbers (without using a multiply instruction), you do something like this:
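mult proc
; multiplies eax by ebx and places result in edx:ecx (clobbers eax, ebx and esi)
xor ecx, ecx
xor edx, edx
xor esi, esi ; esi:eax holds the shifted 64-bit multiplicand
mul1:
test ebx, 1
jz mul2
add ecx, eax
adc edx, esi ; add the high half together with the carry
mul2:
shr ebx, 1
add eax, eax ; shift esi:eax left by one bit...
adc esi, esi ; ...carrying the top bit of eax into esi
test ebx, ebx
jnz mul1
done:
ret
mult endp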
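The same shift-and-add technique can be written as a few lines of Python to test it against known products. The name shift_add_multiply is invented for this sketch. Python integers do not overflow, so the shifted multiplicand keeps its high bits automatically; in assembly those bits have to be kept in a second register for the 64-bit result to be right.

```python
def shift_add_multiply(a, b):
    """Multiply two unsigned 32-bit values using only shifts and adds."""
    if not (0 <= a <= 0xFFFFFFFF and 0 <= b <= 0xFFFFFFFF):
        raise ValueError("operands must be unsigned 32-bit values")
    product = 0
    while b:
        if b & 1:            # test ebx, 1: is the lowest bit of the multiplier set?
            product += a     # add the current power-of-two multiple of a
        a <<= 1              # next multiple: a*2, a*4, a*8, ...
        b >>= 1              # shr ebx, 1: move on to the next multiplier bit
    return product           # always fits in 64 bits, like edx:ecx


print(shift_add_multiply(6, 7))                 # 42
print(hex(shift_add_multiply(0x80000000, 2)))   # 0x100000000
```

Each set bit of the multiplier contributes one shifted copy of the multiplicand, which is also why a single shl by n multiplies by 2 to the power of n rather than by n.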

Resources