Manually Add Newline To Stack Variable In x86 Linux Assembly - linux

I wrote a simple assembly program that gets two integers from the user via a prompt, multiplies them together and prints that out. I wanted to do this directly with sys_read and not scanf so I could manually convert the input to an integer after removing the LF.
Here's the full source: http://pastebin.com/utnjTvNZ
In particular, what I want to do now is manually add a newline to the result of the multiplication that is now converted back to its ASCII char equivalent. Initially, I thought I could just left shift 16 bits and add 0xA leaving me with, for example, 0x0034000A on the stack for 2*2 (0x0034 is "4" in ASCII chars), followed by a null terminator and a LF. However, the LF is printing before the result. I figured this was an endianness thing, so I tried the reverse (0x000A0034) and that just printed some other ASCII char instead.
So, simply, how do I properly push a newline to the stack so that this is printed with a newline following the number when using sys_write? What I'm missing is how strings are stored on the stack... which I can't test because normally you just create a variable and push the address onto the stack.
I'm aware some things in here could be done better, cleaner and up-to-standards and whatnot. I understand things intuitively so it's something I just need to do to better understand the stack and Linux system calls in general.

Okay, so to answer my own question thanks to the help of Jester: to add a newline to the 32-bit word I was displaying from memory, I had to understand endianness. Since I assembled for 32-bit, my program works on 32-bit words. x86 is little-endian, so each word's bytes are written into memory "backwards", with the least-significant byte at the lowest address. The words themselves are still stored in "normal" order. For example 0x0A290028 0x0A293928 prints (NUL)LF(9)LF. The bytes are backwards but the words are not. sys_write, since it just takes a void *buf and isn't aware of strings, reads bytes from the buffer in increasing address order and spits them out.
What I was able to do was simply left-shift my single-digit product, for example 0x00000034, by 8 bits. This left me with 0x00003400. To that, I could add 0x000A0000. This would result in 0x000A3400, which is stored in memory as the bytes 00 34 0A 00, so the number "4" is printed followed by a newline.
So, the new procedure looks like this:
multprint:
mov eax, sys_write ;4
mov ebx, stdout ;1
mov ecx, resultstr
mov edx, resultstrLen
dec edx
int 0x80
pop eax ;multiplier
pop ebx ;multiplicand
mul ebx
add eax, '0'
shl eax, 8 ;make room for () and LF
add eax, 0x0A290028
push eax
mov ecx, esp
;mov [num], eax ;use these two lines if I don't want to use the stack
;mov ecx, num
mov eax, sys_write
mov ebx, stdout
mov edx, 4
int 0x80
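The byte order described above can be checked without an assembler. The following is a minimal Python sketch, not part of the original program: the helper name pack_digit_word is hypothetical, and struct.pack with the "<I" format is used here only to mimic how a 32-bit little-endian store lays bytes out in memory, which is the order sys_write sends them in.

```python
import struct


def pack_digit_word(digit):
    """Mimic the answer's arithmetic for a single-digit product (0..9).

    Hypothetical helper: builds the same 32-bit value as
    add eax,'0' / shl eax,8 / add eax,0x0A290028 and returns the four
    bytes a little-endian store would place in memory, lowest address first.
    """
    if not 0 <= digit <= 9:
        raise ValueError("only a single-digit product fits in one byte")
    word = ((digit + ord("0")) << 8) + 0x0A290028
    return struct.pack("<I", word)


if __name__ == "__main__":
    # Bytes in the order write() would send them for 2*2.
    print(pack_digit_word(4))
    # The simpler word from the text: NUL, '4', LF, NUL.
    print(struct.pack("<I", 0x000A3400))
```

Products of 10 or more need a real integer-to-string conversion; this sketch, like the procedure above, only covers one digit.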

Related

Is it possible for a program to read itself?

Theoretical question. But let's say I have written an assembly program. I have "labelx:" I want the program to read at this memory address and only this size and print to stdout.
Would it be something like
jmp labelx
And would I then use the write syscall, making sure to read from the next instruction after labelx:
mov rsi,rip
mov rdi,0x01
mov rdx,?
mov rax,0x01
syscall
to then output to stdout.
However, how would I obtain the size to read itself? Especially if there is a label after the code I want to read, or more code after it. Would I have to manually count the lines?
mov rdx,rip+(bytes*lines)
And then syscall with the registers populated, so that write copies from the buffer in rsi to the file descriptor in rdi, that being stdout.
Is this even possible? Would I have to use the read syscall first, as the write system call requires rsi to point to an allocated memory buffer? However, I assumed .text is already allocated memory and is read-only. Would I have to allocate on the stack or heap or a static buffer first before write, if it's even possible in the first place?
I'm using NASM syntax btw. And pretty new to assembly. And just a question.
Yes, the .text section is just bytes in memory, no different from section .rodata where you might normally put msg: db "hello", 10. x86 is a Von Neumann architecture (not Harvard), so there's no distinction between code pointers and data pointers, other than what you choose to do with them. Use objdump -drwC -Mintel on a linked executable to see the machine-code bytes, or GDB's x command in a running process, to see bytes anywhere.
You can get the assembler to calculate the size by putting labels at the start/end of the part you want, and using mov edx, prog_end - prog_start in the code at the point where you want that size in RDX.
See How does $ work in NASM, exactly? for more about subtracting two labels (in the same section) to get a size. (Where $ is an implicit label at the start of the current line, although $ isn't likely what you want here.)
To get the current address into a register, you need a RIP-relative LEA, not mov, because RIP isn't a general-purpose register and there's no special form of mov that reads it.
here:
lea rsi, [rel here] ; with DEFAULT REL you could just use [here]
mov edi, 1 ; stdout fileno
mov edx, .end - here ; assemble-time constant size calculation
mov eax, 1 ; __NR_write
syscall
.end:
This is fully position-independent, unlike if you used mov esi, here. (How to load address of function or label into register)
The LEA could use lea rsi, [rel $] to assemble to the same machine-code bytes, but you want a label there so you can subtract them.
I optimized your MOV instructions to use 32-bit operand-size, implicitly zero-extending into the full 64-bit RDX and RAX. (And RDI, but write(int fd, void *buf, size_t len) only looks at EDI anyway for the file descriptor).
Note that you can write any bytes of any section; there's nothing special about having a block of code write itself. In the above example, put the start/end labels anywhere. (e.g. foo: and .end:, and mov edx, foo.end - foo taking advantage of how NASM local labels work, by appending to the previous non-local label, so you can reference them from somewhere else. Or just give them both non-dot names.)

Incrementing one to a variable in IA32 Linux Assembly

I'm trying to add 1 to a variable in IA32 assembly on Linux:
section .data
num: dd 0x1
section .text
global _start
_start:
add dword [num], 1
mov edx, 1
mov ecx, [num]
mov ebx,1
mov eax,4
int 0x80
mov eax,1
int 0x80
Not sure if it's possible to do.
Elsewhere I saw the following code:
mov eax, num
inc eax
mov num, eax
Is it possible to increment a value to a var without moving to a register?
If so, do I have any advantage moving the value to a register?
Is it possible to increment a value to a var without moving to a register?
Certainly: inc dword [num].
Like practically all x86 instructions, inc can take either a register or memory operand. See the instruction description at http://felixcloutier.com/x86/inc; the form inc r/m32 indicates that you can give an operand which is either a 32-bit register or 32-bit memory operand (effective address).
If you're interested in micro-optimizations, it turns out that add dword [num], 1 may still be somewhat faster, though one byte larger, on certain CPUs. The specifics are pretty complicated and you can find a very extensive discussion at INC instruction vs ADD 1: Does it matter?. This is partly related to the slight difference in effect between the two, which is that add will set or clear the carry flag according to whether a carry occurs, while inc always leaves the carry flag unchanged.
If so, do I have any advantage moving the value to a register?
No. That would make your code larger and probably slower.
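The flag difference mentioned above can be made concrete with a small model. This is a Python sketch under stated assumptions, not real CPU behaviour in full: add32 and inc32 are hypothetical helpers that track only the 32-bit result and the carry flag (CF), ignoring every other flag.

```python
MASK32 = 0xFFFFFFFF


def add32(value, operand):
    """Model of `add dword [num], imm`: returns (result, CF).

    CF is recomputed from this addition, regardless of its old value.
    """
    total = (value & MASK32) + (operand & MASK32)
    return total & MASK32, total > MASK32


def inc32(value, carry_flag):
    """Model of `inc dword [num]`: returns (result, CF).

    The result wraps the same way, but CF is passed through unchanged.
    """
    return (value + 1) & MASK32, carry_flag


if __name__ == "__main__":
    # Both wrap 0xFFFFFFFF to 0; only add reports the carry.
    print(add32(0xFFFFFFFF, 1))
    print(inc32(0xFFFFFFFF, False))
```

For a plain counter like num the difference does not matter; it only shows up when later code tests CF, for example in multi-word arithmetic with adc.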

Read data from proc/sys/kernel/

I want to create a program that gets information about the operating system. I tried using syscalls, but I think reading from system files directly will be faster. So I wrote a simple program to read files from the directory "/proc/sys/kernel/", such as osrelease, hostname, ostype and others. I supposed it would be simple, but it isn't. When I read the file "hostname" I get this:
����
If I read it as superuser I get normal data:
oleg
This is code of my program:
global _start
section .data
file db "/proc/sys/kernel/hostname",0
section .bss
buf resb 1024
descriptor resb 4
len equ 1024
section .text
_start:
mov eax, 5
mov ebx, file
mov ecx, 2
int 80h
mov [descriptor], eax
read:
mov eax, 3 ;read text
mov ebx, [descriptor];
mov ecx, buf ;read to variable buf
mov edx, len ;size of buf
int 80h ;interrupt
print_text:
mov edx, eax
mov eax, 4
mov ebx, 1
mov ecx, buf
int 80h
close_file:
mov eax, 6
mov ebx, [descriptor]
int 80h
exit:
mov eax, 1
mov ebx, 0
int 80h
So I thought I could just change the file name to get other system information, but that was a mistake, because I get no result. When I change the path to the file, build the project and run the program as superuser, I get no result. Nothing...
I can read all files except those in this directory ("/proc/sys/kernel").
I googled this problem and didn't find anything similar. I think it is an OS security feature, but I only read the info, I don't write... I understand that using syscalls is simpler, but I want to understand the structure of the OS. Why can't I read from this directory? Can you point me to useful information about this problem, please?
mov ecx, 2
A flags value of 2 for open() is O_RDWR. You're attempting to open the file read-write, which as a normal user you cannot do because it's writable only by root (mode 0644 on my system). Unix permission checks are done when you open the file, not on each individual read and write, so this fails even though you do not intend to actually write to the file.
So the open call returns a negative error code (which you don't check for), you pass this as the fd to read which thus also fails with a negative error code (which you also don't check for) and thus your buffer still contains a bunch of zeros. You pass this negative error code as the length to write(), which interprets it as a huge positive number and writes out not only the zero bytes from buf, but also whatever garbage follows it in memory, until it runs off the end of your address space.
This does work for me as root. I don't quite understand your last paragraph and can't tell whether it does or doesn't work for you in that case. If it doesn't, you may have securelevels or some other mechanism that prevents writing the file even as root. Note that some other files in /proc/sys/kernel are not writable even by root, e.g. /proc/sys/kernel/version which is 0444, so for those files your program will fail as above even if you are root.
But since you don't care about writing the file, just use flags O_RDONLY which has the value 0. So xor ecx, ecx instead. With this change the program works for me as a normal user.
Error checking throughout would be a good idea.
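As a rough illustration of why the missing checks matter, here is a Python sketch of how a raw int 0x80 return value can be tested, and what the negative error code looks like when it is reused as an unsigned length. The helper names syscall_failed and as_unsigned32 are hypothetical; the only assumptions are that the kernel reports errors as -errno in eax (in the range -4095..-1) and that the program is 32-bit.

```python
import errno

MAX_ERRNO = 4095  # raw Linux syscalls signal errors as -1..-4095 in eax


def syscall_failed(ret):
    """True if a raw syscall return value is an error code (-errno)."""
    return -MAX_ERRNO <= ret < 0


def as_unsigned32(value):
    """How a 32-bit register value looks when read as an unsigned length."""
    return value & 0xFFFFFFFF


if __name__ == "__main__":
    # What open() would hand back when O_RDWR is refused for a normal user.
    ret = -errno.EACCES
    print(syscall_failed(ret))   # the check the program should make
    print(as_unsigned32(ret))    # the "length" write() would be given
```

In the assembly this corresponds to comparing eax against the error range (or simply testing the sign) after each int 80h and jumping to an error path instead of carrying on.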

Write an assembly language code to reverse a string

This is code to reverse a string.
I am really struggling to finish this code. I could not find what's wrong with it, however the output is wrong.
mov esi, OFFSET source
mov edi, OFFSET target
add edi, SIZEOF target-2
mov ecx, SIZEOF source-1
L1:
mov al, [esi]
mov [edi], al
inc esi
dec edi
loop L1
It's a bit odd that you pass two lengths to the function. All sorts of bad stuff might happen if the two mismatch.
Better to pass the length of the string explicitly, or have the code figure out the length.
As getting the length of an old-school C string is non-trivial to code, but trivial to google, I'm going to just pass it as a parameter.
'...the output is wrong.'
The problem is that your strings need to be zero terminated, but you are not putting the terminating zero on your destination string.
First off, if you want to have the string be a valid c-style string, make sure you add a terminating zero, like so: source db "test test",0
mov esi, OFFSET source ;Start of source
mov edi, OFFSET target ;start of dest
;length EQU SIZEOF source ;we are reversing source
mov ecx, SIZEOF source ;Length of the string
;(includes the terminating 0)
Setup:
;//a c-string must have a terminating 0!
xor eax,eax ;al=0, put the terminating zero in first
L1:
mov [edi+ecx-1], al ;if length(ecx)=1, then write to [edi] directly.
mov al, [esi]
inc esi
loop L1
Remarks on the code
There is no need to keep three counters in flight (edi,esi,ecx), two is enough. esi counts up, ecx counts down.
The x86 has lots of really helpful addressing modes, which are mostly free performance wise.
The last iteration will read the terminating zero into al; we don't need to reverse this and we've already written it at the beginning, so it is (silently) dropped.
Because of the terminating zero, length is at least 1. This is good because if somehow you were to feed 0 into loop it will loop 'forever'; not good ('forever' being 4+ billion times).
Note that your code does not take account of Unicode, so it will not work for UTF-8, but let's assume it's just a learning exercise.
If you follow an ABI, then you can just pass the parameters in registers, meaning you can skip on some of the initialization. Given that your code is not going to win any prizes for raw speed, I've skipped this step.
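To see that the loop above produces a properly terminated result, it can be traced step by step. The following Python sketch mirrors the register usage one instruction per line; reverse_cstring is a hypothetical name, and the bytes objects stand in for the source and target buffers.

```python
def reverse_cstring(source):
    """Trace of the assembly loop: source must end with a NUL byte."""
    if not source or source[-1] != 0:
        raise ValueError("source must include its terminating zero")
    target = bytearray(len(source))   # stands in for the target buffer
    esi = 0                           # index into source, counts up
    ecx = len(source)                 # SIZEOF source, counts down
    al = 0                            # xor eax, eax
    while True:                       # L1:
        target[ecx - 1] = al          # mov [edi+ecx-1], al
        al = source[esi]              # mov al, [esi]
        esi += 1                      # inc esi
        ecx -= 1                      # loop L1 decrements ecx ...
        if ecx == 0:                  # ... and falls through at zero
            break
    return bytes(target)


if __name__ == "__main__":
    print(reverse_cstring(b"test test\0"))
```

The first store puts the terminating zero at the end of the target, and the zero read on the last iteration is never stored, which matches the remarks above.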

How do I test to ensure only an integer is entered and ensure length of input is 5 bytes or less in Assembly?

How do I test to ensure only an integer is entered and ensure length of input is 5 bytes or less in the following code?
I am trying to understand how to properly control input so that the input beyond 5 bytes is not outputted to the terminal upon exiting of the program.
In addition, how would I test to ensure only a string is entered and finally in the last scenario, only a double is entered?
*** Updated code based on x82 and Peter C's guidance. I did some C disas and was able to amend my original code below. It still has some flaws but you are both a great deal of help! I am just stuck on when more than 5 integer bytes are entered it wont re-prompt as it does when I enter in a character data as it continues to dump extra bytes data to tty.
SECTION .data ; initialized data section
promptInput db 'Enter Number: ', 0
lenPromptInput equ $ - promptInput
displayInput db 'Data Entered: ', 0
lenDisplayInput equ $ - displayInput
SECTION .bss ; uninitialized data section
number resb 1024 ; allocate 1024 bytes for number variable
SECTION .text ; code section
global _start ; linker entry point
_start:
nop ; used for debugging
Read:
mov eax, 4 ; specify sys_write call
mov ebx, 1 ; specify stdout file descriptor
mov ecx, promptInput ; display promptInput
mov edx, lenPromptInput ; length of promptInput
int 0x80 ; call sys_write
mov eax, 3 ; specify sys_read call
mov ebx, 0 ; specify stdin file descriptor
mov ecx, number ; pass address of the buffer to read to
mov edx, 1024 ; specify sys_read to read 1024 bytes stdin
int 0x80 ; call sys_read
cmp eax, 0 ; examine sys_read return value in eax
je Exit ; je if end of file
cmp byte [number], 0x30 ; test input against numeric 0
jb Read ; jb if below 0 in ASCII chart
cmp byte [number], 0x39 ; test input against numeric 9
ja Read ; ja if above 9 in ASCII chart
Write:
mov eax, 4 ; specify sys_write call
mov ebx, 1 ; specify stdout file descriptor
mov ecx, displayInput ; display displayInput
mov edx, lenDisplayInput ; length of displayInput
int 0x80 ; call sys_write
mov eax, 4 ; specify sys_write call
mov ebx, 1 ; specify stdout file descriptor
mov ecx, number ; pass address of the number to write
mov edx, 5 ; pass number of numbers to write
int 0x80 ; call sys_write
Exit:
mov eax, 1 ; specify sys_exit call
mov ebx, 0 ; return code 0 to OS
int 0x80 ; call sys_exit
(Since you accepted this answer, I'll point out that the actual answer to this question about using read on TTYs is my other answer on this question.)
Here's an answer to your low-quality followup question which I was about to post when you deleted it.
Note that I said "you can ask for debugging help in a new question", not that you should ask 3 different questions in one, and re-post your whole code barely changed with no serious attempt at solving your own problem. It's still up to you to make the new question a good question.
I probably wouldn't have answered it if I hadn't sort of led to you posting it in the first place. Welcome to StackOverflow, I'm being generous since you're new and don't know what's a good question yet.
The usual term for the characters '0' through '9' is "digit", not "integer". It's much more specific.
ensure only integers are inputted in the buffer
You can't. You have to decide what you want to do if you detect such input.
Need help creating an array to loop through
Your buffer is an array of bytes.
You can loop over it with something like
; eax has the return value from the read system call, which you've checked is strictly greater than 0
mov esi, number ; start pointer
scan_buffer:
movzx edx, byte [esi]
; do something with the character in dl / edx
...
inc esi ; move to the next character
dec eax
jnz scan_buffer ; loop n times, where n = number of characters read by the system call.
ensure characters over the 1024 buffer do not send data to the tty
If you're worried that 1024 isn't necessarily big enough for this toy program, then use select(2) or poll(2) to check if there's more input to be read without blocking if there isn't.
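A minimal sketch of that non-blocking check, written in Python for brevity: select.select is Python's wrapper around select(2), and a timeout of 0 makes it return immediately. The helper name has_pending_input is hypothetical, and a pipe stands in for the tty so the example can run anywhere.

```python
import os
import select


def has_pending_input(fd):
    """True if a read on fd would return data right now (no blocking)."""
    readable, _, _ = select.select([fd], [], [], 0)
    return bool(readable)


if __name__ == "__main__":
    read_end, write_end = os.pipe()        # stand-in for the terminal
    print(has_pending_input(read_end))     # nothing typed yet
    os.write(write_end, b"123456\n")
    print(has_pending_input(read_end))     # unread input is waiting
    os.read(read_end, 1024)                # drain it, as the program would
    print(has_pending_input(read_end))
```

A program could loop on this check and keep reading until nothing is pending, so no leftover bytes reach the shell.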
I'm just going to answer the POSIX systems programming part of the question, and leave it up to you to make the right system calls once you know what you want your program to do. Use gdb to debug it (see the bottom of the x86 tag wiki for debug tips), and strace to trace system calls.
You might want to write your program in C instead of trying to learn asm and the Unix system call API at the same time. Write something in C to test the idea, and then implement it in asm. (Then you can look at the compiler output when you get stuck, to see how the compiler did things. As long as you carefully read and understand how the compiler-generated code works, you're still learning. I'd suggest compiling with -O2 or -O3 as a starting point for an asm implementation. At least -O1, definitely not -O0.)
I am trying to understand how to properly control input so that the input beyond 5 bytes is not outputted to the terminal upon exiting of the program.
This is just a POSIX semantics issue, nothing to do with asm. It would be the same if you were doing systems programming in C, calling the read(2) system call. You're calling it in asm with mov eax,3 / int 0x80, instead of calling the glibc wrapper like C compiler output would, but it's the same system call.
If there is unread data on the terminal (tty) when your program exits, the shell will read it when it checks for input.
In an interactive shell running on a tty, programs you run (like ./a.out or /bin/cat) have their stdin connected to the same tty device that the shell takes interactive input from. So unread data on your program's stdin is the same thing as unread data that the shell will see.
Things are different if you redirected your program's input from a file. (./a.out < my_file.txt). Then your program won't start with an already-open file descriptor for the tty. It could still open("/dev/tty") (which is a "magic" symlink that always refers to the controlling tty) and vacuum up anything that was typed while it was running.
ensure only an integer is entered and ensure length of input is 5 bytes or less in the following code?
You can't control what your input will be. You can detect input you don't like, and print an error message or anything else you want to do.
If you want input characters to stop echoing to the screen after 5 bytes, you'd need to put the tty into raw mode (instead of the default line-buffered "cooked" mode) and either do the echo manually, or disable echo after 5 bytes. (The latter wouldn't be reliable, though: there'd be a race condition between disabling echo and the user typing a 6th byte, e.g. as part of a paste.)
RE: edit
I am just stuck on when more than 5 integer bytes are entered it wont re-prompt as it does when I enter in a character data as it continues to dump extra bytes data to tty.
You broke your program, because the logic is still designed around re-read()ing a character if you don't like the digit you read. But your read call reads up to 5 bytes.
The normal thing to do is one big read and then parse the whole line by looping over the bytes in the buffer. So use a big buffer (like 1024 bytes) in the .bss section, and make a read system call.
Don't make another read system call unless you want to prompt the user to enter another line of text.
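The "one big read, then parse the whole line" approach can be prototyped before writing the asm. Below is a Python sketch of the validation step only; parse_digit_line and its max_digits parameter are hypothetical, and the bytes argument stands in for the buffer filled by the read system call.

```python
def parse_digit_line(buf, max_digits=5):
    """Return the digits if buf is 1..max_digits ASCII digits, else None.

    buf is what one read() returned; a single trailing newline is ignored.
    """
    line = buf[:-1] if buf.endswith(b"\n") else buf
    if not 1 <= len(line) <= max_digits:
        return None                       # empty or too long: re-prompt
    for byte in line:                     # same walk as the scan_buffer loop
        if not 0x30 <= byte <= 0x39:      # below '0' or above '9'
            return None
    return line


if __name__ == "__main__":
    print(parse_digit_line(b"12345\n"))
    print(parse_digit_line(b"123456\n"))
    print(parse_digit_line(b"12a\n"))
```

Because the whole line has already been consumed by the single read, rejecting it and prompting again leaves nothing behind for the shell.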

Resources