I'm learning the basics of bubblesort in Assembly right now and what I've coded makes sense in my head, but apparently not to the compiler. When I print out the entire array after running this subroutine, why does the array look the same as before it was sorted?
bubblesort proc
bubbleSort Proc
mov ecx, NUMS_LENGTH
dec ecx
mov edi, 0
.WHILE ecx > 0
mov ax, NUMS[di]
.IF ax > NUMS[di]+2
xchg ax, NUMS[di]+2
call WriteInt
mov [NUMS[di]+2], ax
inc di
.ENDIF
dec ecx
.ENDW
bubbleSort endp
Advice and insight are greatly appreciated. Many thanks in advance!
EDIT
bubbleSort Proc
mov ecx, NUMS_LENGTH
dec ecx
mov edi, 0
.WHILE ecx > 0
mov di, NUMS_LENGTH-1
.WHILE di < NUMS_LENGTH-1
mov ax, NUMS[di]
.IF ax > NUMS[di]+2
xchg ax, NUMS[di]+2
mov [NUMS[di]+2], ax
inc di
.ELSE
inc di
.ENDIF
.ENDW
dec ecx
.ENDW
bubbleSort endp
*EDIT 2 *
bubbleSort Proc
mov ecx, NUMS_LENGTH
dec ecx
mov di, 0
.WHILE ecx > 0
mov ax, NUMS[di]
.WHILE di != NUMS_LENGTH-1
.IF ax > NUMS[di]+2
xchg ax, NUMS[di]+2
mov NUMS[di]+2, ax
inc di
.ELSE
inc di
mov NUMS[di]+2, ax
.ENDIF
.ENDW
dec ecx
.ENDW
bubbleSort endp
At a guess, it doesn't look quite the same, but it's not sorted either.
A bubble sort requires two nested loops, but you seem to only have one loop. That means it makes only one pass through the array. At the end of one pass, the last element in the array will be in the correct position, but the rest won't be sorted yet.
Edit: The edited code probably isn't an improvement. In particular:
mov di, NUMS_LENGTH-1
.WHILE di < NUMS_LENGTH-1
We're setting di equal to NUMS_LENGTH-1, then executing a while loop that can only execute if di is less than NUMS_LENGTH-1, so that inner loop clearly won't (ever!) execute.
Edit2: Though a bubble sort in assembly language strikes me as just about the worst wast of time possible, I suppose if you insist, the obvious way to start would be to consider how a bubble sort looks in something like C:
for (int i=array_len; i!=0; --i)
for (int j=0; j<i; j++)
if (array[j] > array[j+1]) {
temp = array[j];
array[j] = array[j+1];
array[j+1] = temp;
}
Writing something on the same general order in assembly language shouldn't be terribly difficult.
Edit4: Fixed code (I think):
mov ecx, (NUMS_LEN-1)*4
outer_loop:
xor edi, edi
inner_loop:
mov eax, nums[edi]
cmp eax, nums[edi+4]
jl noswap
xchg nums[edi+4], eax
mov nums[edi], eax
noswap:
add edi, 4
cmp edi, ecx
jl inner_loop
sub ecx, 4
jnz outer_loop
Sorry -- I've never really gotten used to Microsoft's "high level" control-flow macros for assembly language. A couple of points to consider: at least assuming we don't allow a zero-length array, for this task loops that test the condition at the bottom are a lot simpler. In general, loops that test at the bottom are cleaner in assembly language anyway. When the algorithm demands a test at the beginning, it's still often cleaner to put the test at the bottom, and build a structure like:
initalization
jmp loop_test
loop_top:
loop body
loop_test:
update loop variable(s)
if (more iterations)
jmp loop_top
Related
I'm trying to write a function that reverses order of characters in a string using x86 NASM assembly language. I tried doing it using registers (I know it's more efficient to do it using stack) but I keep getting a segmentation fault, the c declaration looks as following
extern char* reverse(char*);
The assembly segment:
section .text
global reverse
reverse:
push ebp ; prologue
mov ebp, esp
mov eax, [ebp+8] ; eax <- points to string
mov edx, eax
look_for_last:
mov ch, [edx] ; put char from edx in ch
inc edx
test ch, ch
jnz look_for_last ; if char != 0 loop
sub edx, 2 ; found last
swap: ; eax = first, edx = last (characters in string)
test eax, edx
jg end ; if eax > edx reverse is done
mov cl, [eax] ; put char from eax in cl
mov ch, [edx] ; put char from edx in ch
mov [edx], cl ; put cl in edx
mov [eax], ch ; put ch in eax
inc eax
dec edx
jmp swap
end:
mov eax, [ebp+8] ; move char pointer to eax (func return)
pop ebp ; epilogue
ret
It seems like the line causing the segmentation fault is
mov cl, [eax]
Why is that happening? In my understanding eax never goes beyond the bounds of the string so there always is something in [eax]. How come I get a segmentation fault?
Ok I figured it out, I mistakenly used test eax, edx instead of which I should have used cmp eax, edx. It works now.
Trying to find the number of characters in a string and disregard all the " " space characters
I have a C++ portion that passes the strings to asm and here is my asm
works fine, only thing is that the space characters are being counted as well.
stringLength PROC PUBLIC
PUSH ebp ; save caller base pointer
MOV ebp, esp ; set our base pointer
SUB esp, (1 * 4) ; allocate uint32_t local vars
PUSH edi
PUSH esi
; end prologue
MOV esi, [ebp+8] ;gets the string
xor ebx, ebx
COMPARE:
MOV al, [esi + ebx]
CMP al, 0 ;compare character of string with 0
JE FINALE ;if = to 0 go to end
INC ebx ;counter
CMP al, ' ' ;compare with sapce
JE SPACE ;go get rid of the space and keep going
INC al ;otherwise inc al to next character and repeat
JMP COMPARE
SPACE:
DEC ebx ;get rid of the extra space
INC al
JMP COMPARE ;goes back to compare
FINALE:
MOV eax,ebx ; bring back the counter
ADD esp, (2 * 4) ; clear the stack
POP esi
POP edi
MOV esp, ebp ; deallocate locals
POP ebp ; restore caller base pointer
RET
stringLength ENDP ; end the procedure
END stringLength
You are doing a lot of useless stuff and not doing anything to count ignoring the spaces.
You don't really need to setup a new stack frame, for such a simple routine you can do everything in clobbered registers, or at most save a few registers on the stack;
That inc al is pointless - you are incrementing the character value, just to discard it at the next loop iteration.
push fmt and then you clean the stack immediately? What sense does it make?
mov ebx, 0- nobody does that, the idiomatic way to zero a register is xor ebx,ebx (the instruction encoding is more compact);
cmp al, 0 given that you are only interested in equality, you can just do test al, al (more compact);
you read [ebp+12] but never actually use it - is that supposed to be an unused parameter?
As for the algorithm itself, you'll just have to keep a separate counter to count non-space characters; actually, you can just keep ebx for that, and increment directly esi to iterate over characters. For example:
xor ebx, ebx
COMPARE:
mov al, [esi]
cmp al, ' '
jne nonspace
inc ebx
nonspace:
test al, al
jz FINALE
inc esi
jmp COMPARE
FINALE:
Now, this can be streamlined further exploiting the fact that the eax is going to be the return value, and that you can clobber freely ecx and edx, so:
stringLength PROC PUBLIC
mov ecx,[esp+4] ; get the string
xor eax,eax ; zero the counter
compare:
mov dl,[ecx] ; read character
cmp dl,' '
jne nospace
inc eax ; increase counter if it's a space
nospace:
test dl,dl
jz end ; go to end if we reached the NUL
inc ecx ; next character
jmp compare
end:
ret ; straight return, nothing else to do
stringLength ENDP ; end the procedure
edit: about the updated version
COMPARE:
MOV al, [esi + ebx]
CMP al, 0
JE FINALE
INC ebx
CMP al, " " ; I don't know what assembler you are using,
; but typically character values are in single quotes
JE SPACE
INC al ; this makes no sense! you are incrementing
; the _character value_, not the position!
; it's going to be overwritten at the next iteration
JMP COMPARE
SPACE:
INC eax ; you cannot use eax as a counter, you are already
; using it (al) as temporary store for the current
; character!
JMP COMPARE
I think we need to use whole the eax register to compare values. In such manner:
; inlet data:
; eax - pointer to first byte of string
; edx - count of bytes in string
; ecx - result (number of non-space chars)
push esi
mov ecx, 0
mov esi, eax
##compare: cmp edx, 4
jl ##finalpass
mov eax, [esi]
xor eax, 20202020h ; 20h - space char
cmp al, 0
jz ##nextbyte0
inc ecx
##nextbyte0: cmp ah, 0
jz ##nextbyte1
inc ecx
##nextbyte1: shr eax, 16
cmp al, 0
jz ##nextbyte2
inc ecx
##nextbyte2: cmp ah, 0
jz ##nextbyte3
inc ecx
##nextbyte3: add esi, 4
sub edx, 4
jmp ##compare
##finalpass: and edx, edx
jz ##fine
mov al, [esi]
cmp al, 20h
jz ##nextbyte4
inc ecx
##nextbyte4: inc esi
dec edx
jmp ##finalpass
##fine: pop esi
; save the result data and restore stack
I have a problem with 32bit Assembly, assembling it with NASM on linux.
Here is my implementation of insertion sort
myInsertionSort:
push ebp
mov ebp, esp
push ebx
push esi
push edi
mov ecx, [ebp+12] ;put len in ecx, our loop variable
mov eax, 1 ; size of one spot in array, one byte
mov ebx, 0
mov esi, [ebp+8] ; the array
loop loop_1
loop_1:
cmp eax, ecx ; if we're done
jge done_1 ; then done with loop
push ecx ; we save len, because loop command decrements ecx
mov ecx, [esi+eax] ; ecx now array[i]
mov ebx, eax
dec ebx ; ebx is now eax-1, number of times we should go through inner loop
loop_2:
cmp ebx, 0 ; we don't use loop to not affect ecx so we use ebx and compare it manually with 0
jl done_2
cmp [esi+ebx], ecx ;we see if array[ebx] os ecx so we can exit the loop
jle done_2
mov edx, esi
add edx, ebx
push [edx] ; pushing our array[ebx] *****************************
add edx, eax
pop [edx] ; poping the last one *********************************
dec ebx ; decrementing the loop iterator
jmp loop_2 ; looping again
done_2:
mov [esi+ebx+1], ecx
inc eax ; incrementing iterator
pop ecx ; len of array to compare now to eax and see if we're done
jmp loop_1
done_1:
pop edi
pop esi
pop ebx
pop ebp ; we pop them in opposite to how we pushed (opposite order, it's the stack, LIFO)
ret
Now... When I try to compile my code with nasm, I get errors of "operation size not specified" on the lines containing asterisks in the comments :P
It's basic insertion sort and I'm not sure what could have gone wrong.
Enlighten me, please.
The data at [edx] could be anything, so its size is unknown to the assembler. You'll have to specify the size of the data you want to push/pop. For example, if you want to push/pop a dword (32 bits) you'd write:
push dword [edx]
pop dword [edx]
By the way, you can combine these lines:
mov edx, esi
add edx, ebx
into:
lea edx,[esi + ebx]
I'm trying to implement insertion sort in 32bit assembly in linux using NASM and I get a segmentation fault mid-run (not to mention that for some reason 'printf' prints random garbage values, I'm not totally sure why), Here is the
code:
section .rodata
MSG: DB "welcome to sortMe, please sort me",10,0
S1: DB "%d",10,0 ; 10 = '\n' , 0 = '\0'
section .data
array DD 5,1,7,3,4,9,12,8,10,2,6,11 ; unsorted array
len DB 12
section .text
align 16
global main
extern printf
main:
push MSG ; print welcome message
call printf
add esp,4 ; clean the stack
call printArray ;print the unsorted array
;parameters
;push len
;push array
mov eax, len
mov ebx, array
push eax
push ebx
call myInsertionSort
call printArray ; print the sorted one
mov eax, 1 ;exit system call
int 0x80
printArray:
push ebp ;save old frame pointer
mov ebp,esp ;create new frame on stack
pushad ;save registers
mov eax,0
mov ebx,0
mov edi,0
mov esi,0 ;array index
mov bl, byte [len]
add edi,ebx ; edi = array size
print_loop:
cmp esi,edi
je print_end
push dword [array+esi*4]
push S1
call printf
add esp, 8 ;clean the stack
inc esi
jmp print_loop
print_end:
popa ;restore registers
mov esp,ebp ;clean the stack frame
pop ebp ;return to old stack frame
ret
myInsertionSort:
push ebp
mov ebp, esp
push ebx
push esi
push edi
mov ecx, [ebp+12]
movzx ecx, byte [ecx] ;put len in ecx, our loop variable
mov eax, 0
mov ebx, 0
mov esi, [ebp+8] ; the array
loop loop_1
loop_1:
cmp ecx, 0 ; if we're done
je done_1 ; then done with loop
mov edx, ecx
push ecx ; we save len, because loop command decrements ecx
sub edx, ecx
mov ecx, [esi+4*edx] ;;;;;; ecx now array[i] ? how do I access array[i] in a similar manner?
mov ebx, eax
shr ebx, 2 ; number of times for inner loop
loop_2:
cmp ebx, 0 ; we don't use loop to not affect ecx so we use ebx and compare it manually with 0
jl done_2
cmp [esi+ebx], ecx ;we see if array[ebx] os ecx so we can exit the loop
jle done_2
lea edx, [esi+ebx]
push dword [edx] ; pushing our array[ebx]
add edx, 4
pop dword [edx] ; popping the last one
dec ebx ; decrementing the loop iterator
jmp loop_2 ; looping again
done_2:
mov [esi+ebx+1], ecx
inc eax ; incrementing iterator
pop ecx ; len of array to compare now to eax and see if we're done
jmp loop_1
done_1:
pop edi
pop esi
pop ebx
pop ebp ; we pop them in opposite to how we pushed
ret
About the printf thing, I'm positive that I should push the parameters the opposite way (first S1 and then the integer so it'd be from left to right as we'd call it in C), and if I do switch them, nothing is printed at all while I'm getting a segmentation fault. I don't know what to do, it prints these as output:
welcome to sortMe, please sort me
5
16777216
65536
256
1
117440512
458752
1792
7
50331648
196608
768
mov ecx, [ebp+12] ;put len in ecx, our loop variab
This only moves the address of LEN into ECX not its value! You need to add movzx ecx, byte [ecx]
You also need to define LEN=48
loop loop_1
What's this bizare use of LOOP doing here?
You are mixing bytes and dwords on multiple occasions. You need to rework the code. p.e.
dec ebx ; ebx is now number of times we should go through inner loop
should become
shr ebx,2
This is not correct because you need the address and not the value. Change MOV into LEA.
jle done_2
mov edx, [esi+ebx]
Perhaps you can post your reworked code as an EDIT within your Original question.
Your edited code does not address ALL the problems signaled by user3144770!
The parameters to printf are correct but here are some additional problems with your printArray routine.
Since ESI is an index in an array of dwords you need to scale it up!
push dword [array+esi*4]
Are you sure pusha will save 32 bits ? Perhaps you'd better use pushad
ps Should you decide to rework your code and post the edit then please add the reworked code after the last line of the existing post. This way the original question will continue making sense to people viewing it the first time!
Excuse me again. I am trying understand learn assembly languaje. However I have many problems. I am trying working with strings in NASM. I have copy a string constant to string variable. The maximum size is 50. So I want verify this bound. However this program throw a segmentation fault. I use a example in MASM, so perhaps exist a use error with NASM syntax.
My program is the following:
section .data
MAXTEXTSIZE equ 50
_cte_hola db "Hola", 0
_cte_mundo db "Mundo", 0
section .bss
MAIN_d resb MAXTEXTSIZE+1
section .text
global _start
strlen:
mov bx, 0
strl01:
cmp WORD [SI+BX],0 t
je strend
inc bx
jmp strl01
strend:
ret
strcpy:
call strlen
cmp bx, MAXTEXTSIZE
jle copiarsizeok
mov bx, MAXTEXTSIZE
copiarsizeok:mov cx, bx
cld
rep movsb
mov al,0
mov BYTE [DI], al
ret
_start:
mov ds, ax
mov es, ax
mov si, [MAIN_d]
mov di, [_cte_hola]
call strcpy
mov eax, 1
mov ebx, 0
int 80h
Thanks in advance and excuse me. My question are stupid for a assembly programmer.
I believe you are trying to make 32bit program in Linux, but your examples are 16bit.
In Linux, all pointers are 32bit. So, use extended registers: esi, edi, ebx etc. You still can use 8 and 16bit registers for arithmetics and data processing but not as memory pointers.
In strlen you have to compare byte [esi+ebx], 0 not word.
Don't set the segment registers in Linux. They will be set by the OS and you can't touch them. In Linux all memory is one flat area and you don't have to use segment registers anymore.
Here's a more concrete example of how you could write your strlen function (which is the first of your problems)
section .data
MAXTEXTSIZE equ 50
_cte_hola db "Hola", 0xa, 0
_cte_mundo db "Mundo", 0
section .bss
MAIN_d resb MAXTEXTSIZE+1
section .text
global _start
strlen:
mov ebx, 0
strlen_loop:
cmp BYTE [esi+ebx], 0
je strlen_end
inc ebx
jmp strlen_loop
strlen_end:
mov eax, ebx
ret
_start:
mov esi, _cte_hola
call strlen ; Get the length of _cte_hola
mov edx, eax ; The length was stored in eax by strlen
mov ecx, _cte_hola
mov ebx,1
mov eax, 4
int 0x80 ; Write to stdout
mov eax, 1
int 0x80 ; Exit
There are definitely better ways of implementing this (I'd use repne to implement strlen, for example) but I wanted to keep it close to your implementation.
Hope this helps!