Learning NASM Assembly, I am trying to make a program that reads two one-digit number inputs.
I have two variables declared in the .bss:
num1 resb 1
num2 resb 1
Then, I ask the user to write the numbers like this:
; Get number 1
mov EAX,3
mov EBX,1
mov ECX,num1
mov EDX,1
int 0x80
; Get number 2
mov EAX,3
mov EBX,1
mov ECX,num2
mov EDX,1
int 0x80
Since I am only interested in one-digit number inputs, I set EDX to 1. This way, whatever the user types, only the first character will be stored in my variable (right?).
The problem is that everything that follows after that first character will be used for the future reads. If you type 5 and then press ENTER, the ASCII code 53 will be stored in num1 just fine, but the line break you generated by pressing ENTER will carry on to the next read instruction, which will be stored in num2. Clearly that's not what I was intending. I want the user to type a number, press ENTER, type another number, and press ENTER.
I am not entirely sure how to work around this in the simplest way possible.
The dumbest idea was to put a "dummy" read instruction between num1 and num2, which will capture the line break (and do nothing with it). This is obviously not good.
Here's a very basic way of reading input until you get digits you want. It will skip anything but digits. This approach is fine if it provides the functionality you want. If you need different behavior depending upon other non-numeric input, then you need to specify that behavior. Then that behavior can be programmed as well.
; Get number 1
mov ECX,num1
call GetNumber
; Get number 2
mov ECX,num2
call GetNumber
...
GetNumber:
pusha ; save regs
get:
mov EAX,3 ; system call for reading a character
mov EBX,0 ; 0 is standard input
mov EDX,1 ; number of characters to read
int 0x80 ; ECX has the buffer, passed into GetNumber
cmp byte [ecx],0x30
jl get ; Retry if the byte read is < '0'
cmp byte [ecx],0x39
jg get ; Retry if the byte read is > '9'
; At this point, if you want to just return an actual number,
; you could subtract '0' (0x30) off of the value read
popa ; restore regs
ret
Meddling with stdin to disable I_CANON will work, but may be the "hard way". Using a two byte buffer and doing mov edx, 2 will work if the pesky user is well behaved - either clear the second byte, or just ignore it.
Sometimes the pesky user is not well behaved. Dealing with "garbage input" or other error conditions generally takes much more code than just "doing the work"! Either deal with it, or be satisfied with a program that "usually" works. The second option may be sufficient for beginners.
The pesky user might just hit "enter" without entering a number. In this case, we want to either re-prompt, or perhaps print "Sorry you didn't like my program" and exit. Or he/she might type more than one character before hitting "enter". This is potentially dangerous! If a malicious user types "1rm -rf .", you've just wiped out your entire system! Unix is powerful, and like any powerful tool can be dangerous in the hands of an unskilled user.
You might try something like (warning: untested code ahead!)...
section .bss
num1 resb 1
num2 resb 1
trashbin resb 1
section .text
re_prompt:
; prompt for your number
; ...
; get the number (character representing the number!)
mov ecx, num1
reread:
mov edx, 1
mov ebx, 0 ; 1 will work, but 0 is stdin
mov eax, 3 ; sys_read
int 0x80
cmp byte [ecx], 10 ; linefeed
jz got_it
mov ecx, trashbin
jmp reread
got_it:
cmp byte [num1], 10 ; user entered nothing?
jz re_prompt ; or do something intelligent
; okay, we have a character in num1
; may want to make sure it's a valid digit
; convert character to number now?
; carry on
You may need to fiddle with that to make it work. I probably shouldn't post untested code (you can embarrass yourself that way!). "Something like that" might be easier for you than fiddling with termios. The second link Michael gave you includes the code I use for that. I'm not very happy with it (sloppy!), but it "kinda works". Either way, have fun! :)
You will have to deal with canonical disabling, raw keyboard.
This is how linux manages entering console password for exampe without showing it.
The assembly to do this is nicely described here:
http://asm.sourceforge.net/articles/rawkb.html
Related
Ok, so I'm fairly new to assembly, infact, I'm very new to assembly. I wrote a piece of code which is simply meant to take numerical input from the user, multiply it by 10, and have the result expressed to the user via the programs exit status (by typing echo $? in terminal)
Problem is, it is not giving the correct number, 4x10 showed as 144. So then I figured the input would probably be as a character, rather than an integer. My question here is, how do I convert the character input to an integer so that it can be used in arithmetic calculations?
It would be great if someone could answer keeping in mind that I'm a beginner :)
Also, how can I convert said integer back to a character?
section .data
section .bss
input resb 4
section .text
global _start
_start:
mov eax, 3
mov ebx, 0
mov ecx, input
mov edx, 4
int 0x80
mov ebx, 10
imul ebx, ecx
mov eax, 1
int 0x80
Here's a couple of functions for converting strings to integers, and vice versa:
; Input:
; ESI = pointer to the string to convert
; ECX = number of digits in the string (must be > 0)
; Output:
; EAX = integer value
string_to_int:
xor ebx,ebx ; clear ebx
.next_digit:
movzx eax,byte[esi]
inc esi
sub al,'0' ; convert from ASCII to number
imul ebx,10
add ebx,eax ; ebx = ebx*10 + eax
loop .next_digit ; while (--ecx)
mov eax,ebx
ret
; Input:
; EAX = integer value to convert
; ESI = pointer to buffer to store the string in (must have room for at least 10 bytes)
; Output:
; EAX = pointer to the first character of the generated string
int_to_string:
add esi,9
mov byte [esi],STRING_TERMINATOR
mov ebx,10
.next_digit:
xor edx,edx ; Clear edx prior to dividing edx:eax by ebx
div ebx ; eax /= 10
add dl,'0' ; Convert the remainder to ASCII
dec esi ; store characters in reverse order
mov [esi],dl
test eax,eax
jnz .next_digit ; Repeat until eax==0
mov eax,esi
ret
And this is how you'd use them:
STRING_TERMINATOR equ 0
lea esi,[thestring]
mov ecx,4
call string_to_int
; EAX now contains 1234
; Convert it back to a string
lea esi,[buffer]
call int_to_string
; You now have a string pointer in EAX, which
; you can use with the sys_write system call
thestring: db "1234",0
buffer: resb 10
Note that I don't do much error checking in these routines (like checking if there are characters outside of the range '0' - '9'). Nor do the routines handle signed numbers. So if you need those things you'll have to add them yourself.
The basic algorith for string->digit is: total = total*10 + digit, starting from the MSD. (e.g. with digit = *p++ - '0' for an ASCII string of digits). So the left-most / Most-Significant / first digit (in memory, and in reading order) gets multiplied by 10 N times, where N is the total number of digits after it.
Doing it this way is generally more efficient than multiplying each digit by the right power of 10 before adding. That would need 2 multiplies; one to grow a power of 10, and another to apply it to the digit. (Or a table look-up with ascending powers of 10).
Of course, for efficiency you might use SSSE3 pmaddubsw and SSE2 pmaddwd to multiply digits by their place-value in parallel: see Is there a fast way to convert a string of 8 ASCII decimal digits into a binary number? and arbitrary-length How to implement atoi using SIMD?. But the latter probably isn't a win when numbers are typically short. A scalar loop is efficient when most numbers are only a couple digits long.
Adding on to #Michael's answer, it may be useful to have the int->string function stop at the first non-digit, instead of at a fixed length. This will catch problems like your string including a newline from when the user pressed return, as well as not turning 12xy34 into a very large number. (Treat it as 12, like C's atoi function). The stop character can also be the terminating 0 in a C implicit-length string.
I've also made some improvements:
Don't use the slow loop instruction unless you're optimizing for code-size. Just forget it exists and use dec / jnz in cases where counting down to zero is still what you want to do, instead of comparing a pointer or something else.
2 LEA instructions are significantly better than imul + add: lower latency.
accumulate the result in EAX where we want to return it anyway. (If you inline this instead of calling it, use whatever register you want the result in.)
I changed the registers so it follows the x86-64 System V ABI (First arg in RDI, return in EAX).
Porting to 32-bit: This doesn't depend on 64-bitness at all; it can be ported to 32-bit by just using 32-bit registers. (i.e. replace rdi with edi, rax with ecx, and rax with eax). Beware of C calling-convention differences between 32 and 64-bit, e.g. EDI is call-preserved and args are usually passed on the stack. But if your caller is asm, you can pass an arg in EDI.
; args: pointer in RDI to ASCII decimal digits, terminated by a non-digit
; clobbers: ECX
; returns: EAX = atoi(RDI) (base 10 unsigned)
; RDI = pointer to first non-digit
global base10string_to_int
base10string_to_int:
movzx eax, byte [rdi] ; start with the first digit
sub eax, '0' ; convert from ASCII to number
cmp al, 9 ; check that it's a decimal digit [0..9]
jbe .loop_entry ; too low -> wraps to high value, fails unsigned compare check
; else: bad first digit: return 0
xor eax,eax
ret
; rotate the loop so we can put the JCC at the bottom where it belongs
; but still check the digit before messing up our total
.next_digit: ; do {
lea eax, [rax*4 + rax] ; total *= 5
lea eax, [rax*2 + rcx] ; total = (total*5)*2 + digit
; imul eax, 10 / add eax, ecx
.loop_entry:
inc rdi
movzx ecx, byte [rdi]
sub ecx, '0'
cmp ecx, 9
jbe .next_digit ; } while( digit <= 9 )
ret ; return with total in eax
This stops converting on the first non-digit character. Often this will be the 0 byte that terminates an implicit-length string. You could check after the loop that it was a string-end, not some other non-digit character, by checking ecx == -'0' (which still holds the str[i] - '0' integer "digit" value that was out of range), if you want to detect trailing garbage.
If your input is an explicit-length string, you'd need to use a loop counter instead of checking a terminator (like #Michael's answer), because the next byte in memory might be another digit. Or it might be in an unmapped page.
Making the first iteration special and handling it before jumping into the main part of the loop is called loop peeling. Peeling the first iteration allows us to optimize it specially, because we know total=0 so there's no need to multiply anything by 10. It's like starting with sum = array[0]; i=1 instead of sum=0, i=0;.
To get nice loop structure (with the conditional branch at the bottom), I used the trick of jumping into the middle of the loop for the first iteration. This didn't even take an extra jmp because I was already branching in the peeled first iteration. Reordering a loop so an if()break in the middle becomes a loop branch at the bottom is called loop rotation, and can involve peeling the first part of the first iteration and the 2nd part of the last iteration.
The simple way to solve the problem of exiting the loop on a non-digit would be to have a jcc in the loop body, like an if() break; statement in C before the total = total*10 + digit. But then I'd need a jmp and have 2 total branch instructions in the loop, meaning more overhead.
If I didn't need the sub ecx, '0' result for the loop condition, I could have used lea eax, [rax*2 + rcx - '0'] to do it as part of the LEA as well. But that would have made the LEA latency 3 cycles instead of 1, on Sandybridge-family CPUs. (3-component LEA vs. 2 or less.) The two LEAs form a loop-carried dependency chain on eax (total), so (especially for large numbers) it would not be worth it on Intel. On CPUs where base + scaled-index is no faster than base + scaled-index + disp8 (Bulldozer-family / Ryzen), then sure, if you have an explicit length as your loop condition and don't want to check the digits at all.
I used movzx to load with zero extension in the first place, instead of doing that after converting the digit from ASCII to integer. (It has to be done at some point to add into 32-bit EAX). Often code that manipulates ASCII digits uses byte operand-size, like mov cl, [rdi]. But that would create a false dependency on the old value of RCX on most CPUs.
sub al,'0' saves 1 byte over sub eax,'0', but causes a partial-register stall on Nehalem/Core2 and even worse on PIII. Fine on all other CPU families, even Sandybridge: it's a RMW of AL, so it doesn't rename the partial reg separately from EAX. But cmp al, 9 doesn't cause a problem, because reading a byte register is always fine. It saves a byte (special encoding with no ModRM byte), so I used that at the top of the function.
For more optimization stuff, see http://agner.org/optimize, and other links in the x86 tag wiki.
The tag wiki also has beginner links, including an FAQ section with links to integer->string functions, and other common beginner questions.
Related:
How do I print an integer in Assembly Level Programming without printf from the c library? is the reverse of this question, integer -> base10string.
Is there a fast way to convert a string of 8 ASCII decimal digits into a binary number? highly optimized SSSE3 pmaddubsw / pmaddwd for 8-digit integers.
How to implement atoi using SIMD? using a shuffle to handle variable-length
Conversion of huge decimal numbers (128bit) formatted as ASCII to binary (hex) handles long strings, e.g. a 128-bit integer that takes 4x 32-bit registers. (It's not very efficient, and might be better to convert in multiple chunks and then do extended-precision multiplies by 1e9 or something.)
Convert from ascii to integer in AT&T Assembly Inefficient AT&T version of this.
Ok, so I'm fairly new to assembly, infact, I'm very new to assembly. I wrote a piece of code which is simply meant to take numerical input from the user, multiply it by 10, and have the result expressed to the user via the programs exit status (by typing echo $? in terminal)
Problem is, it is not giving the correct number, 4x10 showed as 144. So then I figured the input would probably be as a character, rather than an integer. My question here is, how do I convert the character input to an integer so that it can be used in arithmetic calculations?
It would be great if someone could answer keeping in mind that I'm a beginner :)
Also, how can I convert said integer back to a character?
section .data
section .bss
input resb 4
section .text
global _start
_start:
mov eax, 3
mov ebx, 0
mov ecx, input
mov edx, 4
int 0x80
mov ebx, 10
imul ebx, ecx
mov eax, 1
int 0x80
Here's a couple of functions for converting strings to integers, and vice versa:
; Input:
; ESI = pointer to the string to convert
; ECX = number of digits in the string (must be > 0)
; Output:
; EAX = integer value
string_to_int:
xor ebx,ebx ; clear ebx
.next_digit:
movzx eax,byte[esi]
inc esi
sub al,'0' ; convert from ASCII to number
imul ebx,10
add ebx,eax ; ebx = ebx*10 + eax
loop .next_digit ; while (--ecx)
mov eax,ebx
ret
; Input:
; EAX = integer value to convert
; ESI = pointer to buffer to store the string in (must have room for at least 10 bytes)
; Output:
; EAX = pointer to the first character of the generated string
int_to_string:
add esi,9
mov byte [esi],STRING_TERMINATOR
mov ebx,10
.next_digit:
xor edx,edx ; Clear edx prior to dividing edx:eax by ebx
div ebx ; eax /= 10
add dl,'0' ; Convert the remainder to ASCII
dec esi ; store characters in reverse order
mov [esi],dl
test eax,eax
jnz .next_digit ; Repeat until eax==0
mov eax,esi
ret
And this is how you'd use them:
STRING_TERMINATOR equ 0
lea esi,[thestring]
mov ecx,4
call string_to_int
; EAX now contains 1234
; Convert it back to a string
lea esi,[buffer]
call int_to_string
; You now have a string pointer in EAX, which
; you can use with the sys_write system call
thestring: db "1234",0
buffer: resb 10
Note that I don't do much error checking in these routines (like checking if there are characters outside of the range '0' - '9'). Nor do the routines handle signed numbers. So if you need those things you'll have to add them yourself.
The basic algorith for string->digit is: total = total*10 + digit, starting from the MSD. (e.g. with digit = *p++ - '0' for an ASCII string of digits). So the left-most / Most-Significant / first digit (in memory, and in reading order) gets multiplied by 10 N times, where N is the total number of digits after it.
Doing it this way is generally more efficient than multiplying each digit by the right power of 10 before adding. That would need 2 multiplies; one to grow a power of 10, and another to apply it to the digit. (Or a table look-up with ascending powers of 10).
Of course, for efficiency you might use SSSE3 pmaddubsw and SSE2 pmaddwd to multiply digits by their place-value in parallel: see Is there a fast way to convert a string of 8 ASCII decimal digits into a binary number? and arbitrary-length How to implement atoi using SIMD?. But the latter probably isn't a win when numbers are typically short. A scalar loop is efficient when most numbers are only a couple digits long.
Adding on to #Michael's answer, it may be useful to have the int->string function stop at the first non-digit, instead of at a fixed length. This will catch problems like your string including a newline from when the user pressed return, as well as not turning 12xy34 into a very large number. (Treat it as 12, like C's atoi function). The stop character can also be the terminating 0 in a C implicit-length string.
I've also made some improvements:
Don't use the slow loop instruction unless you're optimizing for code-size. Just forget it exists and use dec / jnz in cases where counting down to zero is still what you want to do, instead of comparing a pointer or something else.
2 LEA instructions are significantly better than imul + add: lower latency.
accumulate the result in EAX where we want to return it anyway. (If you inline this instead of calling it, use whatever register you want the result in.)
I changed the registers so it follows the x86-64 System V ABI (First arg in RDI, return in EAX).
Porting to 32-bit: This doesn't depend on 64-bitness at all; it can be ported to 32-bit by just using 32-bit registers. (i.e. replace rdi with edi, rax with ecx, and rax with eax). Beware of C calling-convention differences between 32 and 64-bit, e.g. EDI is call-preserved and args are usually passed on the stack. But if your caller is asm, you can pass an arg in EDI.
; args: pointer in RDI to ASCII decimal digits, terminated by a non-digit
; clobbers: ECX
; returns: EAX = atoi(RDI) (base 10 unsigned)
; RDI = pointer to first non-digit
global base10string_to_int
base10string_to_int:
movzx eax, byte [rdi] ; start with the first digit
sub eax, '0' ; convert from ASCII to number
cmp al, 9 ; check that it's a decimal digit [0..9]
jbe .loop_entry ; too low -> wraps to high value, fails unsigned compare check
; else: bad first digit: return 0
xor eax,eax
ret
; rotate the loop so we can put the JCC at the bottom where it belongs
; but still check the digit before messing up our total
.next_digit: ; do {
lea eax, [rax*4 + rax] ; total *= 5
lea eax, [rax*2 + rcx] ; total = (total*5)*2 + digit
; imul eax, 10 / add eax, ecx
.loop_entry:
inc rdi
movzx ecx, byte [rdi]
sub ecx, '0'
cmp ecx, 9
jbe .next_digit ; } while( digit <= 9 )
ret ; return with total in eax
This stops converting on the first non-digit character. Often this will be the 0 byte that terminates an implicit-length string. You could check after the loop that it was a string-end, not some other non-digit character, by checking ecx == -'0' (which still holds the str[i] - '0' integer "digit" value that was out of range), if you want to detect trailing garbage.
If your input is an explicit-length string, you'd need to use a loop counter instead of checking a terminator (like #Michael's answer), because the next byte in memory might be another digit. Or it might be in an unmapped page.
Making the first iteration special and handling it before jumping into the main part of the loop is called loop peeling. Peeling the first iteration allows us to optimize it specially, because we know total=0 so there's no need to multiply anything by 10. It's like starting with sum = array[0]; i=1 instead of sum=0, i=0;.
To get nice loop structure (with the conditional branch at the bottom), I used the trick of jumping into the middle of the loop for the first iteration. This didn't even take an extra jmp because I was already branching in the peeled first iteration. Reordering a loop so an if()break in the middle becomes a loop branch at the bottom is called loop rotation, and can involve peeling the first part of the first iteration and the 2nd part of the last iteration.
The simple way to solve the problem of exiting the loop on a non-digit would be to have a jcc in the loop body, like an if() break; statement in C before the total = total*10 + digit. But then I'd need a jmp and have 2 total branch instructions in the loop, meaning more overhead.
If I didn't need the sub ecx, '0' result for the loop condition, I could have used lea eax, [rax*2 + rcx - '0'] to do it as part of the LEA as well. But that would have made the LEA latency 3 cycles instead of 1, on Sandybridge-family CPUs. (3-component LEA vs. 2 or less.) The two LEAs form a loop-carried dependency chain on eax (total), so (especially for large numbers) it would not be worth it on Intel. On CPUs where base + scaled-index is no faster than base + scaled-index + disp8 (Bulldozer-family / Ryzen), then sure, if you have an explicit length as your loop condition and don't want to check the digits at all.
I used movzx to load with zero extension in the first place, instead of doing that after converting the digit from ASCII to integer. (It has to be done at some point to add into 32-bit EAX). Often code that manipulates ASCII digits uses byte operand-size, like mov cl, [rdi]. But that would create a false dependency on the old value of RCX on most CPUs.
sub al,'0' saves 1 byte over sub eax,'0', but causes a partial-register stall on Nehalem/Core2 and even worse on PIII. Fine on all other CPU families, even Sandybridge: it's a RMW of AL, so it doesn't rename the partial reg separately from EAX. But cmp al, 9 doesn't cause a problem, because reading a byte register is always fine. It saves a byte (special encoding with no ModRM byte), so I used that at the top of the function.
For more optimization stuff, see http://agner.org/optimize, and other links in the x86 tag wiki.
The tag wiki also has beginner links, including an FAQ section with links to integer->string functions, and other common beginner questions.
Related:
How do I print an integer in Assembly Level Programming without printf from the c library? is the reverse of this question, integer -> base10string.
Is there a fast way to convert a string of 8 ASCII decimal digits into a binary number? highly optimized SSSE3 pmaddubsw / pmaddwd for 8-digit integers.
How to implement atoi using SIMD? using a shuffle to handle variable-length
Conversion of huge decimal numbers (128bit) formatted as ASCII to binary (hex) handles long strings, e.g. a 128-bit integer that takes 4x 32-bit registers. (It's not very efficient, and might be better to convert in multiple chunks and then do extended-precision multiplies by 1e9 or something.)
Convert from ascii to integer in AT&T Assembly Inefficient AT&T version of this.
I'm having a bit of trouble reading to and from arrays in assembly.
It's a fairly simple program (albeit at this point, far from finished). All I'm trying to do at this point is read a string of (what we're assuming is numbers), converting it to a decimal number, and printing it. Here's what I've got so far. As of now, it prints str1. After you enter a number and hit enter, it prints str1 again and freezes. Can anyone offer some insight as to what all I'm doing wrong?
INCLUDE Irvine32.inc
.data
buffersize equ 80
buffer DWORD buffersize DUP (0)
str1 BYTE "Enter numbers to be added together. Press (Q) to Quit.", 0dh, 0ah,0;
str2 BYTE "The numbers entered were: ", 0dh, 0ah, 0
str3 BYTE "The total of numbers entered is: ", 0dh, 0ah, 0
error BYTE "Invalid Entry. Please try again.", 0dh, 0ah,0
value DWORD 0
.code
main PROC
mov edx, OFFSET str1
call Writestring
Input:
call readstring
mov buffer[edi], eax
cmp buffer[edi], 0
JL NOTDIGIT
cmp buffer[edi], 9
JG NOTDIGIT
call cvtDec
mov edx, buffer[edi]
call WriteString
jmp endloop
Notdigit:
mov edx, OFFSET error
call writestring
exit
cvtDec:
mov eax, buffer[edi]
AND eax,0Fh
mov buffer[edi],edx
ret
endloop:
main ENDP
END MAIN
First off, Mr. Irvine created the function called WriteString, but you use 2 variations - writestring and Writestring; you do use the correct case of the function in one place. Get into the habit of using the correct names of functions now, and it will cut down on bugs later.
Second, you created a label called Notdigit but yet you use JL NOTDIGIT and JG NOTDIGIT in your code. Again, use the correct spelling. MASM should of given you an A2006 error "undefined symbol"
You also declared your entry point as main, but you close your code section with END MAIN instead of END main.
If you have MASM set up properly (by adding option casemap:none at the top of your source. Or just open irvine32.inc and uncomment the line that says OPTION CASEMAP:NONE)
Let's look at the ReadString procedure comment in irvine32.asm:
; Reads a string from the keyboard and places the characters
; in a buffer.
; Receives: EDX offset of the input buffer
; ECX = maximum characters to input (including terminal null)
; Returns: EAX = size of the input string.
; Comments: Stops when Enter key (0Dh,0Ah) is pressed. If the user
; types more characters than (ECX-1), the excess characters
; are ignored.
ReadString takes an address of the buffer to hold the inputed string in edx, you are using the address of your prompt str1, maybe you meant to use buffer? You also did not put the size of the buffer into ecx
Your using edi as an index into your buffer, what value does edi contain? Your trying to put the value of eax into it, what does eax contain??? Both edi and eax probably contain garbage; not what you want.
Look at this carefully:
cvtDec:
mov eax, buffer[edi]
AND eax,0Fh
mov buffer[edi],edx
Your putting a value (That you think is an ASCII value of a number) into eax then converting to a decimal value... ok... Next, you are putting whatever is in edx back into your buffer. Is that what you want?
I try to implement cat>filename command in NASM in Ubuntu 11.04 using system calls. My program is compiled successfully and run successfully (seems so). But whenever I tried to fire cat filename command it shows "No such file or directory" yet I see the file residing in the directory. And if I try to open the file by double clicking it shows me "You do not have the permissions necessary to open the file." Can you please help me to find the errors in my code?
The code is following:
section .data
msg: dd "%d",10,0
msg1: db "cat>",0
length: equ $-msg1
section .bss
a resb 100
len1 equ $-a
b resd 1
c resb 100
len2 equ $-c
section .txt
global main
main:
mov eax,4 ;;it will print cat>
mov ebx,1
mov ecx,msg1
mov edx,length
int 80h
start:
mov eax,3 ;;it will take the file name as input
mov ebx,0
mov ecx,a
mov edx,len1
int 80h
mov eax,5 ;;it will create the file by giving owner read/write/exec permission
mov ebx,a
mov ecx,0100
mov edx,1c0h
int 80h
cmp eax,0
jge inputAndWrite
jmp errorSegment
inputAndWrite:
mov [b],eax
mov eax,3 ;;take the input lines
mov ebx,0
mov ecx,c
mov edx,len2
int 80h
mov edx,eax ;;write the input lines in the file
mov eax,4
mov ebx,[b]
mov ecx,c
int 80h
jmp done
errorSegment:
jmp done
done:
mov eax, 1
xor ebx, ebx
int 80h
p.s. The above code is re-edited by taking the suggestions from RageD. Yet,the file I have created has not contain any lines of input given from "inputAndWrite" segment. I am looking for your suggestion.
Your major problem with permissions is that permissions are in octal and you have listed them in decimal. You are looking for 0700 in base 8, not base 10. So instead, you can try using 1c0h (0700 octal in hexadecimal). So the following code fix should fix your permissions problem:
;; This is file creation
mov eax, 5
mov ebx, a
mov ecx, 01h ; Edited here for completeness - forgot to update this initially (see edit)
mov edx, 1c0h
For your reference, a quick guide (maybe somewhat outdated, but for the most part correct) for linux system calls is to use the Linux System Call Table. It is extremely helpful in remembering how the registers need to be set, etc.
Another critical issue is writing to the file. I think you became a little confused on a few issues. First of all, be careful with your length variables. Assembly is done "in-line," that is, when you calculate len1, you calculate the distance between a plus everything in between a to len1. That said, your length values should look like this:
.section bss
a resb 100
len1 equ $ - a
b resd 1
c resb 100
len2 equ $ - c
Doing this should make sure that you have proper reads (although it is important to note that you are restricted by your buffer sizes here for input).
Another crucial issue I found is how you're trying to write to the file. You flipped the syscall registers.
;; Write to file
mov edx, eax ;; Amount of data to write
mov eax, 4 ;; Write syscall
mov ebx, [b] ;; File descriptor to write out to (I think this is where you stored this, I don't remember exactly)
mov ecx, c ;; Buffer to write out
From here, I would make a few more adjustments. First off, to end nicely (no segfault), I would suggest simply using exit. Unless this is in another program, ret may not always work properly (particularly if this is a standalone x86 program). The code for the exit syscall is below:
;; Exit
mov eax, 1 ;; Exit is syscall 1
xor ebx, ebx ;; This is the return value
int 80h ;; Interrupt
Also, as for cleanliness, I assume you are taking input buffered by a newline. If this is the case, I would suggest stripping away the newline character after the filename. The simplest way to do this is to simply null-terminate after the last character (which will be new line). So, after reading input for the filename, I would place some code similar to this:
;; Null-terminate the last character - this assumes it directly follows the read call
;; and so the contents of eax are the amount of bytes read
mov ebx, eax ;; How many bytes read (or offset to current null-terminator)
sub ebx, 1 ;; Offset in array to the last valid character
add ebx, a ;; Add the memory address (i.e. in C this looks like a[OFFSET])
mov BYTE [ebx], 0 ;; Null-terminated
Finally, it is polite in larger projects to close your file descriptors when you're done. It may not be necessary here since you are immediately exiting, but that would look something like:
;; Close fd
mov eax, 6 ;; close() is syscall 6
mov ebx, [b] ;; File descriptor to close
int 80h
EDIT
Sorry, I missed the writing issue. You are opening your file with value 100. What you want is 1 for O_RDWR (read and write capabilities). Also, you may want to consider simply using the sync system call (syscall number 0x24 with no arguments) to make sure your buffers get properly flushed; however, in my tests this was unnecessary since the line-feed to enter the data should technically do this, I believe. So the update bit of code to open the file properly should look like this:
; Open file
mov eax, 5
mov ebx, a
mov ecx, 01h
mov edx, 1c0h
int 80h
Hope this helps. Good luck!
I am trying to write a program that will allow me to print multiple characters (strings of characters or integers). The problem that I am having is that my code only prints one of the characters, and then newlines and stays in an infinite loop. Here is my code:
SECTION .data
len EQU 32
SECTION .bss
num resb len
output resb len
SECTION .text
GLOBAL _start
_start:
Read:
mov eax, 3
mov ebx, 1
mov ecx, num
mov edx, len
int 80h
Point:
mov ecx, num
Print:
mov al, [ecx]
inc ecx
mov [output], al
mov eax, 4
mov ebx, 1
mov ecx, output
mov edx, len
int 80h
cmp al, 0
jz Exit
Clear:
mov eax, 0
mov [output], eax
jmp Print
Exit:
mov eax, 1
mov ebx, 0
int 80h
Could someone point out what I am doing wrong?
Thanks,
Rileyh
In the first time you enter the Print section, ecx is pointing to the start of the string and you use it to copy a single character to the start of the output string. But a few more instructions down, you overwrite ecx with the pointer to the output string, and never restore it, therefore you never manage to copy and print the rest of the string.
Also, why are you calling write() with a single character string with the aim to loop over it to print the entire string? Why not just pass num directly in instead of copying a single character to output and passing that?
In your last question, you showed message as a zero-terminated string, so cmp al, 0 would indicate the end of the string. sys_read does NOT create a zero-terminated string! (we can stuff a zero in there if we need it - e.g. as a filename for sys_open) sys_read will read a maximum of edx characters. sys_read from stdin returns when, and only when, the "enter" key is hit. If fewer than edx characters were entered, the string is terminated with a linefeed character (10 decimal or 0xA or 0Ah hex) - you could look for that... But, if the pesky user types more than edx characters, only edx characters go into your buffer, the "excess" remains in the OS's buffer (and can cause trouble later!). In this case your string is NOT terminated with a linefeed, so looking for it will fail. sys_read returns the number of characters actually read - up to edx - including the linefeed - in eax. If you don't want to include the linefeed in the length, you can decrement eax.
As an experiment, do a sys_read with some small number (say 4) in edx, then exit the program. Type "abcdls"(enter) and watch the "ls" be executed. If some joker typed "abcdrm -rf ."... well, don't!!!
Safest thing is to flush the OS's input buffer.
mov ecx, num
mov edx, len
mov ebx, 1
mov eax, 3
int 80h
cmp byte [ecx + eax - 1], 10 ; got linefeed?
push eax ; save read length - doesn't alter flags
je good
flush:
mov ecx, dummy_buf
mov edx, 1
mov ebx, 1
mov eax, 3
int 80h
cmp byte [ecx], 10
jne flush
good:
pop eax ; restore length from first sys_read
Instead of defining dummy_buf in .bss (or .data), we could put it on the stack - trying to keep it simple here. This is imperfect - we don't know if our string is linefeed-terminated or not, and we don't check for error (unlikely reading from stdin). You'll find you're writing much more code dealing with errors and "idiot user" input than "doing the work". Inevitable! (it's a low-level language - we've gotta tell the CPU Every Single Thing!)
sys_write doesn't know about zero-terminated strings, either! It'll print edx characters, regardless of how much garbage that might be. You want to figure out how many characters you actually want to print, and put that in edx (that's why I saved/restored the original length above).
You mention "integers" and use num as a variable name. Neither of these functions know about "numbers" except as ascii codes. You're reading and writing characters. Converting a single-digit number to and from a character is easy - add or subtract '0' (48 decimal or 30h). Multiple digits are more complicated - look around for an example, if that's what you need.
Best,
Frank