Implementing cat>fileName command in NASM

I am trying to implement a cat>filename command in NASM on Ubuntu 11.04 using system calls. My program assembles and appears to run successfully. But whenever I run cat filename afterwards, it reports "No such file or directory", even though I can see the file in the directory. And if I try to open the file by double-clicking it, I get "You do not have the permissions necessary to open the file." Can you please help me find the errors in my code?
The code follows:
section .data
msg: dd "%d",10,0
msg1: db "cat>",0
length: equ $-msg1
section .bss
a resb 100
len1 equ $-a
b resd 1
c resb 100
len2 equ $-c
section .txt
global main
main:
mov eax,4 ;;it will print cat>
mov ebx,1
mov ecx,msg1
mov edx,length
int 80h
start:
mov eax,3 ;;it will take the file name as input
mov ebx,0
mov ecx,a
mov edx,len1
int 80h
mov eax,5 ;;it will create the file by giving owner read/write/exec permission
mov ebx,a
mov ecx,0100
mov edx,1c0h
int 80h
cmp eax,0
jge inputAndWrite
jmp errorSegment
inputAndWrite:
mov [b],eax
mov eax,3 ;;take the input lines
mov ebx,0
mov ecx,c
mov edx,len2
int 80h
mov edx,eax ;;write the input lines in the file
mov eax,4
mov ebx,[b]
mov ecx,c
int 80h
jmp done
errorSegment:
jmp done
done:
mov eax, 1
xor ebx, ebx
int 80h
P.S. The code above has been re-edited following RageD's suggestions. Yet the file I create still does not contain any of the input lines given in the "inputAndWrite" segment. I am looking for your suggestions.

Your main permissions problem is that file modes are written in octal, but you supplied the value in decimal. You want 0700 in base 8, not 700 in base 10. In hexadecimal that is 1c0h. The following change should fix the permissions problem:
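;; This is file creation
mov eax, 5 ;; open is syscall 5
mov ebx, a ;; Pointer to the null-terminated file name
mov ecx, 01h ;; Flags: 1 = O_WRONLY (updated here for completeness, see edit); add 40h (O_CREAT) if the file may not exist yet
mov edx, 1c0h ;; Mode 0700 octal: owner read/write/execute
int 80h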
For your reference, the Linux System Call Table is a quick guide (maybe somewhat outdated, but mostly correct) to the Linux system calls. It is extremely helpful for remembering how the registers need to be set.
Another critical issue is writing to the file. I think you became a little confused on a few issues. First of all, be careful with your length variables. NASM evaluates $ as the current position at the point where the equ appears, so len1 equ $ - a gives the size of a only if it comes immediately after a; otherwise it also counts everything declared between a and the equ. That said, your length values should look like this:
section .bss
a resb 100
len1 equ $ - a
b resd 1
c resb 100
len2 equ $ - c
Doing this should make sure that you have proper reads (although it is important to note that you are restricted by your buffer sizes here for input).
Another crucial issue I found is how you're trying to write to the file. You flipped the syscall registers.
;; Write to file
mov edx, eax ;; Amount of data to write (the byte count returned by the read)
mov eax, 4 ;; Write syscall
mov ebx, [b] ;; File descriptor to write out to (I think this is where you stored this, I don't remember exactly)
mov ecx, c ;; Buffer to write out
int 80h
From here, I would make a few more adjustments. First off, to end nicely (no segfault), I would suggest simply using exit. Unless this is in another program, ret may not always work properly (particularly if this is a standalone x86 program). The code for the exit syscall is below:
;; Exit
mov eax, 1 ;; Exit is syscall 1
xor ebx, ebx ;; This is the return value
int 80h ;; Interrupt
Also, as for cleanliness, I assume you are taking input terminated by a newline. If this is the case, I would suggest stripping away the newline character after the filename, because open treats it as part of the name. The simplest way to do this is to overwrite the last character (which will be the newline) with a null terminator. So, after reading input for the filename, I would place some code similar to this:
;; Null-terminate the last character - this assumes it directly follows the read call
;; and so the contents of eax are the amount of bytes read
mov ebx, eax ;; How many bytes read (or offset to current null-terminator)
sub ebx, 1 ;; Offset in array to the last valid character
add ebx, a ;; Add the memory address (i.e. in C this looks like a[OFFSET])
mov BYTE [ebx], 0 ;; Null-terminated
Finally, it is polite in larger projects to close your file descriptors when you're done. It may not be necessary here since you are immediately exiting, but that would look something like:
;; Close fd
mov eax, 6 ;; close() is syscall 6
mov ebx, [b] ;; File descriptor to close
int 80h
EDIT
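Sorry, I missed the writing issue. You are opening your file with the flags value 100 (decimal), which does not request write access. What you want is 1 for O_WRONLY (2 would be O_RDWR, read and write capabilities), combined with 40h (O_CREAT, 0100 octal) so that the file is created if it does not exist yet; the mode in edx is only used when O_CREAT is set. Also, you may want to consider the sync system call (syscall number 0x24 with no arguments) to make sure your buffers get properly flushed; however, in my tests this was unnecessary, since the write system call hands the data straight to the kernel and there is no user-space buffer to flush. So the updated bit of code to open the file properly should look like this:
; Open file
mov eax, 5
mov ebx, a
mov ecx, 41h ; O_WRONLY (1) | O_CREAT (40h)
mov edx, 1c0h ; mode 0700 octal
int 80h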
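Putting the fixes above together, a minimal sketch of the whole program might look like the following. It is not the asker's exact code: the labels (prompt, fname, out_fd, chunk) are made up for illustration, it assumes a 32-bit build using the int 80h interface, and it additionally sets O_TRUNC and keeps copying until end of input (Ctrl-D), which is an assumption about how closely it should imitate the shell's cat > file.

```nasm
; catto.asm - sketch of "cat > file" for 32-bit Linux (int 80h ABI)
; build: nasm -felf32 catto.asm && ld -melf_i386 -o catto catto.o

O_WRONLY equ 1              ; open flags, octal as in the kernel headers
O_CREAT  equ 100q
O_TRUNC  equ 1000q

section .rodata
prompt:     db "cat>"
prompt_len  equ $ - prompt

section .bss
fname:      resb 100
fname_len   equ $ - fname   ; measured right after fname
out_fd:     resd 1
chunk:      resb 100
chunk_len   equ $ - chunk

section .text
global _start
_start:
    mov eax, 4              ; write(1, prompt, prompt_len)
    mov ebx, 1
    mov ecx, prompt
    mov edx, prompt_len
    int 80h

    mov eax, 3              ; read(0, fname, fname_len - 1)
    xor ebx, ebx
    mov ecx, fname
    mov edx, fname_len - 1  ; leave room for the terminator
    int 80h
    cmp eax, 0
    jle fail                ; error or EOF: no file name

    cmp byte [fname + eax - 1], 10
    jne terminate
    dec eax                 ; drop the trailing newline
terminate:
    mov byte [fname + eax], 0

    mov eax, 5              ; open(fname, flags, 0700)
    mov ebx, fname
    mov ecx, O_WRONLY | O_CREAT | O_TRUNC
    mov edx, 700q           ; owner read/write/execute
    int 80h
    test eax, eax
    js fail                 ; negative return value = error
    mov [out_fd], eax

copy:
    mov eax, 3              ; read(0, chunk, chunk_len)
    xor ebx, ebx
    mov ecx, chunk
    mov edx, chunk_len
    int 80h
    cmp eax, 0
    jle close_file          ; 0 = EOF, negative = error

    mov edx, eax            ; write exactly the bytes just read
    mov eax, 4
    mov ebx, [out_fd]
    mov ecx, chunk
    int 80h
    jmp copy

close_file:
    mov eax, 6              ; close(out_fd)
    mov ebx, [out_fd]
    int 80h
    xor ebx, ebx            ; exit status 0
    jmp quit
fail:
    mov ebx, 1              ; exit status 1
quit:
    mov eax, 1              ; exit(ebx)
    int 80h
```

Writing the mode as 700q and building the flags from named constants keeps the octal values visible, which avoids the decimal/octal mix-up discussed above.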
Hope this helps. Good luck!

Related

NASM - How can I solve the input reading problem from the terminal? [duplicate]

section .data
yourinputis db "your input is =",0
len equ $ - yourinputis
section .bss
msginput resb 10
section .text
global _start
_start:
mov eax,3 ;read syscall
mov ebx,2
mov ecx,msginput
mov edx,9 ; I don't know that is correct?
int 80h
mov eax,4 ;write syscall
mov ebx,1
mov ecx,yourinputis
mov edx,len
int 80h
mov eax,4 ;write syscall
mov ebx,1
mov ecx,msginput
mov edx,10
int 80h
exit:
mov eax,1 ;exit syscall
xor ebx,ebx
int 80h
This code works very well, but it has a terrible bug (for me, at least). If I enter an input longer than 10 characters:
$./mycode
012345678rm mycode
your input is 012345678$rm mycode
$
This is what happens. And of course "mycode" no longer exists after that.
What should I do?
EDIT: The entered input is printed correctly on the screen. But if you enter a long input, everything after the 9th character is passed to the shell, which runs it.
In the example, the "rm mycode" after "012345678" is run by the shell.

If you enter more than 9 characters, they're left in the terminal driver's input buffer. When the program exits, the shell reads from the terminal and tries to execute the rest of the line as a command.
To prevent this, your program should keep reading in a loop until it gets a newline.

You can read the characters one by one until you reach 0x0a. Something like:
_read:
mov esi, msginput
_loop:
mov eax,3 ;read syscall
mov ebx,0 ;stdin
mov ecx, esi
mov edx,1 ;read a single byte
int 80h
cmp eax,1 ;stop on EOF or error too
jne end
cmp byte[esi], 0x0a
je end
inc esi
jmp _loop
end:
ret
You would have to increase the size of msginput though, since every byte up to the newline is stored in it.
IMPORTANT: Do note that this is not the efficient way to do this (see the comments); it is only put here as an example for the answer above.
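A somewhat cheaper variant, sketched here under assumptions, keeps the first 9 bytes and discards the rest of the line in larger chunks instead of one byte per system call. The scratch buffer and the label names are invented for this example. It relies on the terminal delivering at most one line per read in its default line mode; with piped input a chunk could end in the middle of a later line, so treat it as an illustration only.

```nasm
; drain.asm - keep the first bytes of a line, throw away the overflow
; build: nasm -felf32 drain.asm && ld -melf_i386 -o drain drain.o

section .rodata
label_txt:   db "your input is ="
label_len    equ $ - label_txt
newline:     db 10

section .bss
msginput:    resb 9
msg_cap      equ $ - msginput
scratch:     resb 64            ; hypothetical throw-away buffer
scratch_cap  equ $ - scratch

section .text
global _start
_start:
    mov eax, 3                  ; read(0, msginput, msg_cap)
    xor ebx, ebx
    mov ecx, msginput
    mov edx, msg_cap
    int 80h
    cmp eax, 0
    jle quit                    ; EOF or error: nothing to print
    mov esi, eax                ; esi = number of bytes kept
    cmp byte [msginput + eax - 1], 10
    je print                    ; whole line fitted, nothing left over

drain:
    mov eax, 3                  ; read(0, scratch, scratch_cap)
    xor ebx, ebx
    mov ecx, scratch
    mov edx, scratch_cap
    int 80h
    cmp eax, 0
    jle print                   ; EOF or error ends the drain
    cmp byte [scratch + eax - 1], 10
    jne drain                   ; no newline yet: more of the line is pending

print:
    mov eax, 4                  ; write(1, label_txt, label_len)
    mov ebx, 1
    mov ecx, label_txt
    mov edx, label_len
    int 80h
    mov eax, 4                  ; write(1, msginput, bytes kept)
    mov ebx, 1
    mov ecx, msginput
    mov edx, esi
    int 80h
    cmp byte [msginput + esi - 1], 10
    je quit                     ; kept text already ends the line
    mov eax, 4                  ; otherwise finish the line ourselves
    mov ebx, 1
    mov ecx, newline
    mov edx, 1
    int 80h
quit:
    mov eax, 1                  ; exit(0)
    xor ebx, ebx
    int 80h
```

The kernel preserves every register except eax across int 80h, which is why esi can hold the kept length through the later system calls.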

Finding the number of bytes of entered string at runtime

I'm new at learning assembly x86. I have written a program that asks the user to enter a number and then checks if it's even or odd and then print a message to display this information.
The code works fine but it has one problem. It only works for 1 digit numbers:
; Ask the user to enter a number from the keyboard
; Check if this number is odd or even and display a message to say this
section .text
global _start ;must be declared for linker (gcc)
_start: ;tell linker entry point
;Display 'Please enter a number'
mov eax, 4 ; sys_write
mov ebx, 1 ; file descriptor: stdout
mov ecx, msg1 ; message to be print
mov edx, len1 ; message length
int 80h ; perform system call
;Enter the number from the keyboard
mov eax, 3 ; sys_read
mov ebx, 2 ; file descriptor: stdin
mov ecx, myvariable ; destination (memory address)
mov edx, 4 ; size of the the memory location in bytes
int 80h ; perform system call
;Convert the variable to a number and check if even or odd
mov eax, [myvariable]
sub eax, '0' ;eax now has the number value
and eax, 01H
jz isEven
;Display 'The entered number is odd'
mov eax, 4 ; sys_write
mov ebx, 1 ; file descriptor: stdout
mov ecx, msg2 ; message to be print
mov edx, len2 ; message length
int 80h
jmp outProg
isEven:
;Display 'The entered number is even'
mov eax, 4 ; sys_write
mov ebx, 1 ; file descriptor: stdout
mov ecx, msg3 ; message to be print
mov edx, len3 ; message length
int 80h
outProg:
mov eax,1 ;system call number (sys_exit)
int 0x80 ;call kernel
section .data
msg1 db "Please enter a number: ", 0xA,0xD
len1 equ $- msg1
msg2 db "The entered number is odd", 0xA,0xD
len2 equ $- msg2
msg3 db "The entered number is even", 0xA,0xD
len3 equ $- msg3
segment .bss
myvariable resb 4
It does not work properly for numbers with more than 1 digit because it only takes into account the first byte (first digit) of the entered number, so it only checks that. So I would need a way to find out how many digits (bytes) there are in the value that the user enters, so I could do something like this:
;Convert the variable to a number and check if even or odd
mov eax, [myvariable+(number_of_digits-1)]
And only check eax, which contains the last digit, to see if it's even or odd.
The problem is that I have no idea how to check how many bytes are in my number after the user has entered it.
I'm sure it's something very easy, yet I have not been able to figure it out, nor have I found any solutions on Google. Please help me with this. Thank you!

You actually want movzx eax, byte [myvariable+(number_of_digits-1)] to only load 1 byte, not a dword. Or just directly test memory with test byte [...], 1. You can skip the sub because '0' is an even number; subtracting to convert from ASCII code to integer digit doesn't change the low bit.
But yes, you need the least significant digit, which is the last one (highest address) in printing / reading order.
A read system call returns the number of bytes read in EAX (or a negative error code). This will include a newline if the user hit return, but not if the user redirected from a file that didn't end with a newline, or submitted input on a terminal using control-d after typing some digits. The simplest and most robust way would be to loop looking for the first non-digit in the buffer.
But the "clever" / fun way would be to check if [mybuffer + eax - 1] is a digit, and if so use it. Otherwise check the previous byte. (Or just assume there's a newline and always check [mybuffer + eax - 2], the 2nd-last byte of what was read. That reads off the start of the buffer if the user just pressed return.)
(To efficiently check for an ASCII digit: sub al, '0' / cmp al, 9 / ja non_digit. See double condition checking in assembly / What is the idea behind ^= 32, that converts lowercase letters to upper and vice versa?)
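As a rough sketch of the robust approach described above, the loop below scans for the first non-digit using the sub / cmp / ja range check, then tests the low bit of the last digit it passed. The label names and the "no digits" message are invented for this example, and it only looks at leading digits (no sign, no leading spaces).

```nasm
; oddeven_scan.asm - find the last digit by scanning for the first non-digit
; build: nasm -felf32 oddeven_scan.asm && ld -melf_i386 -o oddeven_scan oddeven_scan.o

section .rodata
msg_odd:       db "The entered number is odd", 10
msg_odd_len    equ $ - msg_odd
msg_even:      db "The entered number is even", 10
msg_even_len   equ $ - msg_even
msg_none:      db "No digits entered", 10
msg_none_len   equ $ - msg_none

section .bss
buf:           resb 128
buf_len        equ $ - buf

section .text
global _start
_start:
    mov eax, 3                  ; read(0, buf, buf_len)
    xor ebx, ebx
    mov ecx, buf
    mov edx, buf_len
    int 80h
    cmp eax, 0
    jle no_digits               ; EOF or error
    mov edi, eax                ; edi = bytes read
    xor esi, esi                ; esi = index of the byte being examined

scan:
    cmp esi, edi
    jae scanned                 ; ran out of input
    mov al, [buf + esi]
    sub al, '0'
    cmp al, 9
    ja scanned                  ; unsigned compare: anything outside '0'..'9' is above 9
    inc esi
    jmp scan

scanned:
    test esi, esi
    jz no_digits                ; first byte was not a digit
    mov ecx, msg_even
    mov edx, msg_even_len
    test byte [buf + esi - 1], 1    ; low bit of the last digit ('0' is even, so no sub needed)
    jz show
    mov ecx, msg_odd
    mov edx, msg_odd_len
    jmp show

no_digits:
    mov ecx, msg_none
    mov edx, msg_none_len
show:
    mov eax, 4                  ; write(1, selected message, its length)
    mov ebx, 1
    int 80h
    mov eax, 1                  ; exit(0)
    xor ebx, ebx
    int 80h
```

Because the index never drops below 1 before the memory test, this version does not read outside the buffer for empty input, at the cost of a short loop.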
Just for fun, here's a more compact version that always just checks the 2nd-last byte of the read() input. (It doesn't check for being a digit, and it reads outside the buffer for input lengths of 0 or 1, e.g. pressing control-D or return. The same goes for read errors, e.g. redirecting with strace ./oddeven <&- to close its stdin.)
Note the interesting part:
; check if the low digit is even or odd
mov ecx, msg_even
mov edx, msg_odd ; these don't set flags and actually could be done after TEST
test byte [mybuf + eax - 2], 1 ; check the low bit of 2nd-last byte of the read input
cmovnz ecx, edx
;Display selected message
mov eax, 4 ; sys_write
mov ebx, 1 ; file descriptor: stdout
mov edx, msg_odd.len
int 80h ; write(1, digit&1 ? msg_odd : msg_even, msg_odd.len)
I used cmov, but a simple branch over a mov ecx, msg_odd would work. You don't need to duplicate the whole setup for the system call, just run it with the right pointer and length. (ECX and EDX values, and I padded the odd message with a space so I could use the same length for both.)
And this is a homebrewed static_assert(msg_odd.len == msg_even.len), using NASM's conditional directives (https://nasm.us/doc/nasmdoc4.html). It's not just a separate preprocessor like C has, it can use NASM numeric equ expressions.
%if msg_odd.len != msg_even.len
; homebrew assert with NASM preprocessor, since I chose to skip doing a 2nd cmov for the length
%warning we assume both messages have the same length
%endif
The full thing. Outside of the part shown above, I just tweaked comments to simplify where I thought they were too redundant, and used meaningful label names.
Also, I put .rodata and .bss at the top because NASM complained about referencing msg_odd.len before it was defined. (You previously had your strings in .data, but read-only data should generally go in .rodata, so the OS can share those pages between runs of the same program because they stay clean.)
Other fixes:
Linux/Unix uses 0xa line endings, \n not \n\r.
stdin is fd 0. 2 is stderr. (2 happens to work because terminal emulators normally run the shell with all 3 file descriptors referring to the same read+write open file description for the tty).
; Ask the user to enter a number from the keyboard
; Check if this number is odd or even and display a message to say this
section .rodata
msg_prompt db "Please enter a number: ", 0xA
.len equ $- msg_prompt
msg_odd db "The entered number is odd ", 0xA ; padded with a space for same length as even
.len equ $- msg_odd
msg_even db "The entered number is even", 0xA
.len equ $- msg_even
section .bss
mybuf resb 128
.len equ $ - mybuf
section .text
global _start
_start: ; ld defaults to starting at the top of the .text section, but exporting a symbol silences the warning and can make GDB work more easily.
; Display prompt
mov eax, 4 ; sys_write
mov ebx, 1 ; file descriptor: stdout
mov ecx, msg_prompt
mov edx, msg_prompt.len
int 80h ; perform system call
mov eax, 3 ; sys_read
xor ebx, ebx ; file descriptor: stdin
mov ecx, mybuf
mov edx, mybuf.len
int 80h ; read(0, mybuf, len)
; return value in EAX: negative for error, 0 for EOF, or positive byte count
; for this toy program, lets assume valid input ending with digit\n
; the newline will be at [mybuf + eax - 1]. The digit before that, at [mybuf + eax - 2].
; If the user just presses return, we'll access before the start of mybuf, and may segfault if it's at the start of a page.
; check if the low digit is even or odd
mov ecx, msg_even
mov edx, msg_odd ; these don't set flags and actually could be done after TEST
test byte [mybuf + eax - 2], 1 ; check the low bit of 2nd-last byte of the read input
cmovnz ecx, edx
;Display selected message
mov eax, 4 ; sys_write
mov ebx, 1 ; file descriptor: stdout
mov edx, msg_odd.len
int 80h ; write(1, digit&1 ? msg_odd : msg_even, msg_odd.len)
%if msg_odd.len != msg_even.len
; homebrew assert with NASM preprocessor, since I chose to skip doing a 2nd cmov for the length
%warning we assume both messages have the same length
%endif
mov eax, 1 ;system call number (sys_exit)
xor ebx, ebx
int 0x80 ; _exit(0)
Assemble and link with nasm -felf32 oddeven.asm && ld -melf_i386 -o oddeven oddeven.o

linux nasm print multiple characters

I am trying to write a program that will allow me to print multiple characters (strings of characters or integers). The problem that I am having is that my code only prints one of the characters, and then newlines and stays in an infinite loop. Here is my code:
SECTION .data
len EQU 32
SECTION .bss
num resb len
output resb len
SECTION .text
GLOBAL _start
_start:
Read:
mov eax, 3
mov ebx, 1
mov ecx, num
mov edx, len
int 80h
Point:
mov ecx, num
Print:
mov al, [ecx]
inc ecx
mov [output], al
mov eax, 4
mov ebx, 1
mov ecx, output
mov edx, len
int 80h
cmp al, 0
jz Exit
Clear:
mov eax, 0
mov [output], eax
jmp Print
Exit:
mov eax, 1
mov ebx, 0
int 80h
Could someone point out what I am doing wrong?
Thanks,
Rileyh

In the first time you enter the Print section, ecx is pointing to the start of the string and you use it to copy a single character to the start of the output string. But a few more instructions down, you overwrite ecx with the pointer to the output string, and never restore it, therefore you never manage to copy and print the rest of the string.
Also, why are you calling write() with a single character string with the aim to loop over it to print the entire string? Why not just pass num directly in instead of copying a single character to output and passing that?
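A minimal sketch of that simpler approach: read once, then hand the same buffer to write with the byte count that read returned. It keeps the question's num buffer but otherwise uses made-up labels, and it reads from fd 0 (stdin) rather than fd 1.

```nasm
; echo_once.asm - read a line and print it with a single write
; build: nasm -felf32 echo_once.asm && ld -melf_i386 -o echo_once echo_once.o

section .bss
num:      resb 32
num_len   equ $ - num

section .text
global _start
_start:
    mov eax, 3              ; read(0, num, num_len)
    xor ebx, ebx            ; stdin is fd 0
    mov ecx, num
    mov edx, num_len
    int 80h
    cmp eax, 0
    jle quit                ; EOF or error: nothing to print

    mov edx, eax            ; length = bytes actually read, not the buffer size
    mov eax, 4              ; write(1, num, length)
    mov ebx, 1
    mov ecx, num
    int 80h
quit:
    mov eax, 1              ; exit(0)
    xor ebx, ebx
    int 80h
```

No per-character loop and no zero terminator are needed, because write is told exactly how many bytes to print.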

In your last question, you showed message as a zero-terminated string, so cmp al, 0 would indicate the end of the string. sys_read does NOT create a zero-terminated string! (we can stuff a zero in there if we need it - e.g. as a filename for sys_open) sys_read will read a maximum of edx characters. sys_read from stdin returns when, and only when, the "enter" key is hit. If fewer than edx characters were entered, the string is terminated with a linefeed character (10 decimal or 0xA or 0Ah hex) - you could look for that... But, if the pesky user types more than edx characters, only edx characters go into your buffer, the "excess" remains in the OS's buffer (and can cause trouble later!). In this case your string is NOT terminated with a linefeed, so looking for it will fail. sys_read returns the number of characters actually read - up to edx - including the linefeed - in eax. If you don't want to include the linefeed in the length, you can decrement eax.
As an experiment, do a sys_read with some small number (say 4) in edx, then exit the program. Type "abcdls"(enter) and watch the "ls" be executed. If some joker typed "abcdrm -rf ."... well, don't!!!
Safest thing is to flush the OS's input buffer.
mov ecx, num
mov edx, len
mov ebx, 0 ; stdin is fd 0 (1 is stdout)
mov eax, 3
int 80h
cmp byte [ecx + eax - 1], 10 ; got linefeed?
push eax ; save read length - doesn't alter flags
je good
flush:
mov ecx, dummy_buf
mov edx, 1
mov ebx, 0 ; stdin is fd 0 (1 is stdout)
mov eax, 3
int 80h
cmp byte [ecx], 10
jne flush
good:
pop eax ; restore length from first sys_read
Instead of defining dummy_buf in .bss (or .data), we could put it on the stack - trying to keep it simple here. This is imperfect - we don't know if our string is linefeed-terminated or not, and we don't check for error (unlikely reading from stdin). You'll find you're writing much more code dealing with errors and "idiot user" input than "doing the work". Inevitable! (it's a low-level language - we've gotta tell the CPU Every Single Thing!)
sys_write doesn't know about zero-terminated strings, either! It'll print edx characters, regardless of how much garbage that might be. You want to figure out how many characters you actually want to print, and put that in edx (that's why I saved/restored the original length above).
You mention "integers" and use num as a variable name. Neither of these functions know about "numbers" except as ascii codes. You're reading and writing characters. Converting a single-digit number to and from a character is easy - add or subtract '0' (48 decimal or 30h). Multiple digits are more complicated - look around for an example, if that's what you need.
Best,
Frank
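For the multiple-digit case mentioned above, one common approach (sketched here, not taken from the answer) is to multiply an accumulator by 10 for each digit on the way in, and to divide by 10 repeatedly on the way out. The example doubles the entered number just to show that it is being treated as an integer. All label names are invented, overflow is not checked, and input with no leading digits is treated as 0.

```nasm
; double.asm - parse an unsigned decimal number, double it, print it
; build: nasm -felf32 double.asm && ld -melf_i386 -o double double.o

section .bss
inbuf:    resb 16
in_len    equ $ - inbuf
outbuf:   resb 12               ; up to 10 digits plus a newline
out_len   equ $ - outbuf

section .text
global _start
_start:
    mov eax, 3                  ; read(0, inbuf, in_len)
    xor ebx, ebx
    mov ecx, inbuf
    mov edx, in_len
    int 80h
    cmp eax, 0
    jle quit
    mov edi, eax                ; edi = bytes read
    xor esi, esi                ; esi = index into inbuf
    xor eax, eax                ; eax = value accumulated so far

parse:
    cmp esi, edi
    jae parsed
    movzx edx, byte [inbuf + esi]
    sub edx, '0'
    cmp edx, 9
    ja parsed                   ; stop at the first non-digit (e.g. the newline)
    imul eax, eax, 10           ; value = value * 10 + digit
    add eax, edx
    inc esi
    jmp parse

parsed:
    add eax, eax                ; the "work": double the number

    mov ecx, outbuf + out_len - 1
    mov byte [ecx], 10          ; newline goes last
    mov ebx, 10                 ; divisor
emit_digit:
    xor edx, edx
    div ebx                     ; eax = quotient, edx = remainder (next digit)
    add dl, '0'
    dec ecx                     ; digits are produced last-to-first
    mov [ecx], dl
    test eax, eax
    jnz emit_digit

    mov edx, outbuf + out_len   ; length = end of buffer - first digit
    sub edx, ecx
    mov eax, 4                  ; write(1, first digit, length)
    mov ebx, 1
    int 80h
quit:
    mov eax, 1                  ; exit(0)
    xor ebx, ebx
    int 80h
```

Filling the output buffer from the end avoids having to reverse the digits afterwards.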

Stop BufferOverflow - NASM Input

I am trying to write some basic input/output code to the terminal in Linux with NASM. I want to allow the user to input data but my problem is that I get a buffer overflow if the user enters more data than the buffer length. I am attempting to check if the inputted data is greater than the bufferlength and if so then ask the user to "Enter Data:" again.
Here is my current code:
SECTION .bss
BUFFLENGTH equ 8 ;The max length of our Buffer
Buff: resb BUFFLENGTH ;The buffer itself
SECTION .data
Prompt: db "Enter Data: ",10
PromptLen: equ $-Prompt
SECTION .text
global _start
_start:
DisplayPrompt:
mov eax, 4
mov ebx, 1
mov ecx, Prompt
mov edx, PromptLen
int 80h
Read:
mov eax, 3 ;Specify sys_read call
mov ebx, 0; Specify File Descriptor 0 : STDIN (Default to keyboard input)
mov ecx, Buff; pass offset of the buffer to read to
mov edx, BUFFLENGTH ; Tell sys_read to read BUFFLEN
int 80h ;make kernel call
mov esi, eax
cmp byte[ecx+esi], BUFFLENGTH ;compare the returned bufferSize to BUFFLENGTH
jnbe DisplayPrompt ;Jump If Not Below or Equal To BUFFLENGTH
Write:
mov edx, eax ;grab the size of the buffer that was used (charachter length)
mov eax, 4 ;specify sys_write
mov ebx, 1 ; specify File Descriptor 1: STDOUT
mov ecx, Buff ;pass the offset of the Buffer
int 80h ;make kernel call
Exit:
mov eax, 1 ; Code for Exit syscall
mov ebx, 0 ; Exit code { = 0; Program ran OK }
int 80h ; make kernel call
I believe my error is in how I am comparing the data, here:
mov esi, eax
cmp byte[ecx+esi], BUFFLENGTH ;compare the returned bufferSize to BUFFLENGTH
jnbe DisplayPrompt ;Jump If Not Below or Equal To BUFFLENGTH
Any help would be appreciated. Thanks.

What you are calling "buffer overflow" here isn't the common definition of buffer overflow. If I understand correctly, what you mean is that the data spills over into the terminal, instead of the user being prevented from entering more data than the buffer length. In fact, your program never receives more data than the buffer length. What happens is that your read() reads 8 bytes from stdin and the remaining bytes are still in stdin, where bash reads from when your program exits, and the "\n" at the end makes it try to execute the "spilling bytes", as you call them. There is no reason to change this since it's not a security issue at all: the user can't execute commands as the owner of the program that way.
If you really wanted to get rid of this, you could use malloc() to allocate a 'big enough' buffer. That way no matter how much the user inputs, the buffer will be big enough (depending on how much RAM you have, etc.) and you won't see those "spilling bytes" anymore.
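If the re-prompt behaviour from the question is still wanted, one possible approach that needs no larger buffer is to treat a completely filled buffer without a trailing newline as "too long", discard the rest of that line, and ask again. The sketch below reuses the question's names where possible; Scratch and Drain are hypothetical additions, and the one-byte drain is chosen for simplicity rather than speed.

```nasm
; reprompt.asm - ask again when the line does not fit in the buffer
; build: nasm -felf32 reprompt.asm && ld -melf_i386 -o reprompt reprompt.o

BUFFLENGTH equ 8

section .rodata
Prompt:     db "Enter Data: ", 10
PromptLen   equ $ - Prompt

section .bss
Buff:       resb BUFFLENGTH
Scratch:    resb 1              ; hypothetical one-byte buffer for discarded input

section .text
global _start
_start:
DisplayPrompt:
    mov eax, 4                  ; write(1, Prompt, PromptLen)
    mov ebx, 1
    mov ecx, Prompt
    mov edx, PromptLen
    int 80h

    mov eax, 3                  ; read(0, Buff, BUFFLENGTH)
    xor ebx, ebx
    mov ecx, Buff
    mov edx, BUFFLENGTH
    int 80h
    cmp eax, 0
    jle Exit                    ; EOF or error
    mov esi, eax                ; esi = bytes read (a count, not a buffer byte)
    cmp byte [Buff + esi - 1], 10
    je Write                    ; the line ended inside the buffer
    cmp esi, BUFFLENGTH
    jb Write                    ; short read without newline: input ended early

Drain:
    mov eax, 3                  ; read(0, Scratch, 1)
    xor ebx, ebx
    mov ecx, Scratch
    mov edx, 1
    int 80h
    cmp eax, 1
    jne Exit                    ; EOF or error while discarding
    cmp byte [Scratch], 10
    jne Drain
    jmp DisplayPrompt           ; rest of the line is gone, ask again

Write:
    mov eax, 4                  ; write(1, Buff, bytes read)
    mov ebx, 1
    mov ecx, Buff
    mov edx, esi
    int 80h
Exit:
    mov eax, 1                  ; exit(0)
    xor ebx, ebx
    int 80h
```

Note that the check compares the returned byte count and the last byte read, whereas the question's cmp byte [ecx+esi], BUFFLENGTH compared a byte just past the data with the buffer size.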

Reading from a file in assembly

I'm trying to learn assembly -- x86 in a Linux environment. The most useful tutorial I can find is Writing A Useful Program With NASM. The task I'm setting myself is simple: read a file and write it to stdout.
This is what I have:
section .text ; declaring our .text segment
global _start ; telling where program execution should start
_start: ; this is where code starts getting exec'ed
; get the filename in ebx
pop ebx ; argc
pop ebx ; argv[0]
pop ebx ; the first real arg, a filename
; open the file
mov eax, 5 ; open(
mov ecx, 0 ; read-only mode
int 80h ; );
; read the file
mov eax, 3 ; read(
mov ebx, eax ; file_descriptor,
mov ecx, buf ; *buf,
mov edx, bufsize ; *bufsize
int 80h ; );
; write to STDOUT
mov eax, 4 ; write(
mov ebx, 1 ; STDOUT,
; mov ecx, buf ; *buf
int 80h ; );
; exit
mov eax, 1 ; exit(
mov ebx, 0 ; 0
int 80h ; );
A crucial problem here is that the tutorial never mentions how to create a buffer, the bufsize variable, or indeed variables at all.
How do I do this?
(An aside: after at least an hour of searching, I'm vaguely appalled at the low quality of resources for learning assembly. How on earth does any computer run when the only documentation is the hearsay traded on the 'net?)

Ohh, this is going to be fun.
Assembly language doesn't have variables. Those are a higher-level language construct. In assembly language, if you want variables, you make them yourself. Uphill. Both ways. In the snow.
If you want a buffer, you're going to have to either use some region of your stack as the buffer (after calling the appropriate stack-frame-setup instructions), or use some region on the heap. If your heap is too small, you'll have to make a system call (another INT 80h) to beg the operating system for more (brk, which is what sbrk wraps).
Another alternative is to learn about the ELF format and create a global variable in the appropriate section (.data for initialized data, .bss for an uninitialized buffer).
The end result of any of these methods is a memory location you can use. But your only real "variables" like you're used to from the now-wonderful-seeming world of C are your registers. And there aren't very many of them.
The assembler might help you out with useful macros. Read the assembler documentation; I don't remember them off the top of my head.
Life is tough down there at the ASM level.

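You must declare your buffer in the .bss section. Define bufsize as an assemble-time constant with equ, so that mov edx, bufsize loads the value 1024 (with bufsize dw 1024 in .data, that instruction would load the address of the variable instead):
section .bss
buf resb 1024
bufsize equ 1024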

After the call to open, the file descriptor is in eax. You rightly move eax to ebx, where the call to read will look for it. Unfortunately, at this point you have already overwritten eax with 3, the syscall number for read. Swap the two instructions so that mov ebx, eax comes first.
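Combining the points from the answers above, a hedged sketch of the whole program could look like this. It keeps the file descriptor in esi before loading the next syscall number, passes the byte count returned by read on to write, and loops until end of file. The argc check, the labels and the exit codes are assumptions added for the example.

```nasm
; readfile.asm - print the file named by the first argument
; build: nasm -felf32 readfile.asm && ld -melf_i386 -o readfile readfile.o

section .bss
buf:      resb 1024
bufsize   equ $ - buf           ; assemble-time constant, usable as an immediate

section .text
global _start
_start:
    pop eax                     ; argc
    cmp eax, 2
    jb fail                     ; no file name given
    pop ebx                     ; argv[0], the program name
    pop ebx                     ; argv[1], the file name

    mov eax, 5                  ; open(argv[1], O_RDONLY)
    xor ecx, ecx                ; flags 0 = read-only
    int 80h
    test eax, eax
    js fail                     ; negative return value = error
    mov esi, eax                ; save the descriptor before eax is reused

copy:
    mov eax, 3                  ; read(fd, buf, bufsize)
    mov ebx, esi
    mov ecx, buf
    mov edx, bufsize
    int 80h
    cmp eax, 0
    jle done                    ; 0 = end of file, negative = error

    mov edx, eax                ; write only what was read
    mov eax, 4                  ; write(1, buf, count)
    mov ebx, 1
    mov ecx, buf
    int 80h
    jmp copy

done:
    mov eax, 6                  ; close(fd)
    mov ebx, esi
    int 80h
    xor ebx, ebx                ; exit status 0
    jmp quit
fail:
    mov ebx, 1                  ; exit status 1
quit:
    mov eax, 1                  ; exit(ebx)
    int 80h
```

Run it as ./readfile somefile; files larger than the 1024-byte buffer are handled by the loop rather than by a bigger buffer.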
