NASM get console size - nasm

I am new to NASM (and assembler in general) and I am looking for a way to get the console size (number of console columns and rows) in NASM, like AH=0Fh with INT 10h: http://en.wikipedia.org/wiki/INT_10H
Now, I understand that in NASM (and Linux in general) I cannot use BIOS interrupts, so there has to be another way.
The idea is to print enough output to fill the screen and then wait for the user to press ENTER before printing more output.

If you are programming under Linux, you must use the available system calls to achieve your aims. It's not that there are no interrupts: on 32-bit x86 the system call itself is made with a software interrupt (int 0x80). The BIOS interrupts, however, are not reachable from outside the kernel, and since the kernel runs in protected mode, even if you could call them they likely wouldn't do what you expect.
To your question, however: to obtain the console size you need the ioctl system call with the TIOCGWINSZ request. In the 32-bit int 0x80 interface, ioctl is selected with the value 0x36 in EAX. I'd suggest that you read through the manual page for ioctl, and you may also find this system call table very useful!
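Before writing the call in assembly, it can help to see what the kernel hands back. The following is only a sketch in Python, assuming a Linux system and the standard fcntl, termios and struct modules; the helper names parse_winsize and query_winsize are made up for illustration. The TIOCGWINSZ request fills a struct winsize made of four unsigned 16-bit fields: rows, columns, x pixels and y pixels.

```python
import fcntl
import struct
import termios

# struct winsize: ws_row, ws_col, ws_xpixel, ws_ypixel (four unsigned shorts)
WINSIZE_FORMAT = "HHHH"


def parse_winsize(raw):
    """Unpack the 8 bytes the kernel writes for a TIOCGWINSZ request."""
    rows, cols, xpixels, ypixels = struct.unpack(WINSIZE_FORMAT, raw)
    return rows, cols, xpixels, ypixels


def query_winsize(fd):
    """Return (rows, cols, xpixels, ypixels), or None if fd is not a terminal."""
    empty = struct.pack(WINSIZE_FORMAT, 0, 0, 0, 0)
    try:
        raw = fcntl.ioctl(fd, termios.TIOCGWINSZ, empty)
    except OSError:
        return None  # e.g. the descriptor is a file or a pipe
    return parse_winsize(raw)
```

Calling query_winsize(1) from a program attached to a terminal returns the four values for standard output. An assembly version has to do the same two things: pass the address of an 8-byte buffer to ioctl and then read the 16-bit fields at offsets 0, 2, 4 and 6.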

This is a problem I had to deal with a while ago. The code for unistd.inc and termio.inc can be found here in the includes folder. The program and its makefile can be found in the tree under programs/basics/terminal-winsize.
The values for rows and columns are available on any terminal (console). xpixels and ypixels are only reported by some terminals (xterm yes, gnome-terminal depends). So if a terminal does not report the x and y pixels (screen size), I guess it is text-based. Correct me if there is another reason for this behaviour.
You can easily convert this program to 32 bits, since it uses nasmx macros for the syscalls. The only things you have to do are to replace the 64-bit registers with 32-bit registers and put the parameters in the right registers. Look for agguro on GitHub to see all the include files.
I hope this is helpful to you.
; Name: winsize
; Build: see makefile
; Run: ./winsize
; Description: Show the screen dimension of a terminal in rows/columns.
BITS 64
[list -]
%include "unistd.inc"
%include "termio.inc"
[list +]
section .bss
buffer: resb 5
.end:
.length: equ $-buffer
lf: resb 1
section .data
WINSIZE winsize
; keep the lengths the same or the 'array' construction will fail!
array: db "rows    : "
db "columns : "
db "xpixels : "
db "ypixels : "
.length: equ $-array
.items: equ 4
.itemsize: equ array.length / array.items
section .text
global _start
_start:
mov BYTE[lf], 10 ; end of line in byte after buffer
; fetch the winsize structure data
syscall ioctl, STDOUT, TIOCGWINSZ, winsize
; initialize pointers and used variables
mov rsi, array ; pointer to array of strings
mov rcx, array.items ; items in array
.nextVariable:
; print the text associated with the winsize variable
push rcx ; save remaining strings to process
push rdx ; save winsize pointer
syscall write, STDOUT, rsi, array.itemsize
pop rax ; restore winsize pointer
push rax ; save winsize pointer
; convert variable to decimal
mov ax, WORD[rax] ; get value from winsize structure
mov rdi, buffer.end-1
.repeat:
xor rbx, rbx ; convert value in decimal
mov bx, 10
xor rdx, rdx
div bx
xchg rax, rdx
or al, "0"
std
stosb
xchg rax, rdx
cmp ax, 0 ; test the whole 16-bit quotient, not only its low byte
jnz .repeat
push rsi ; save pointer to text
; print the variable value
mov rsi, rdi
mov rdx, buffer.end ; length of variable
sub rdx, rsi
inc rsi
syscall write, STDOUT, rsi, rdx
pop rsi
pop rdx
; calculate pointer to next variable value in winsize
add rdx, 2
; calculate pointer to next string in strings
add rsi, array.itemsize
; if all strings processed
pop rcx ; remaining arrayitems
loop .nextVariable
; exit the program
syscall exit, 0
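
The conversion loop in the listing (divide by 10, turn the remainder into an ASCII digit, store it while walking the buffer backwards) can be modelled in a few lines of Python. This is a sketch for checking the logic only; the function name to_decimal_backwards is hypothetical and not part of the original program. Note that the loop has to stop when the whole quotient is zero: a value such as 2560 leaves a quotient of 256, whose low byte is 0.

```python
def to_decimal_backwards(value, width=5):
    """Fill a buffer from its last byte toward the first, like the listing does."""
    buffer = bytearray(width)              # like 'buffer: resb 5'
    pos = width - 1                        # like rdi = buffer.end - 1
    while True:
        value, digit = divmod(value, 10)   # div bx: quotient and remainder
        buffer[pos] = ord("0") | digit     # or al, "0"
        pos -= 1                           # stosb with the direction flag set
        if value == 0:                     # stop once the whole quotient is zero
            break
    return bytes(buffer[pos + 1:])         # digits start one past the final pos


print(to_decimal_backwards(80))    # b'80'
print(to_decimal_backwards(2560))  # b'2560'
```

Five bytes are enough because each winsize field is 16 bits wide, so the largest value is 65535.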

Related

NASM append to string with memory immediately after

For context I'm using NASM on a 64 bit Debian distro.
I'm still learning Assembly as part of writing my own programming language but I recently ran into a problem that I'm not sure how to handle. The following is a snippet of code that my compiler spits out:
section .text
global _start
section .var_1 write
char_1 db 'Q', 0
section .var_2 write
string_1 db 'Asdf', 0
section .var_3 write
char_2 db 'W', 0
section .text
_start:
push 4 ; String length onto stack
push string_1
;; Push a raw char onto the stack
mov bl, [char_1]
push bx
pop ax
pop rbx
pop rcx
mov byte [rbx+rcx], al
If I then print out the value of string_1, I see AsdfWQ. As I understand it, this is because of the mov command I am using to append combined with the fact that I have some data declared after the string's termination character. I've been trying to search around on Google with no luck about how to resolve this problem (partially because I don't know exactly what to search for). Conceptually I would think I could move the address of everything after string_1 by the offset of the length of what I'm appending but this seems highly inefficient if I had something like 40 different pieces of data after that. So what I'm trying to sort out is, how do I manage dynamic data that could increase or decrease in size in assembly?
Edit
Courtesy of fuz pointing out that dynamic memory allocation via the brk call works, I've revised the program a little but am still experiencing some issues:
section .var_1 write
hello_string db '', 0
section .var_2 write
again_string db 'Again!', 0
section .text
_start:
;; Get current break address
mov rdi, 0
mov rax, 12
syscall
;; Attempt to allocate 8 bytes for string
mov rdi, rax
add rdi, 8
mov rax, 12
syscall
;; Set the memory address to some label
mov qword [hello_string], rax
;; Try declaring a string
mov byte [hello_string], 'H'
mov byte [hello_string+1], 'e'
mov byte [hello_string+2], 'l'
mov byte [hello_string+3], 'l'
mov byte [hello_string+4], 'o'
mov byte [hello_string+5], ','
mov byte [hello_string+6], ' '
mov byte [hello_string+7], 0
;; Print the string
mov rsi, hello_string
mov rax, 1
mov rdx, 8
mov rdi, 1
syscall
;; Print the other string
mov rsi, again_string
mov rax, 1
mov rdx, 5
mov rdi, 1
syscall
This results in Hello, ello, which means that I'm still overwriting data associated with the again_string label? But I was under the impression that using brk to allocate would do so after the data had been initialized?

Why is this register value in x86 assembly from user input different than expected?

Whenever the user inputs s, the expected value in the rax register that the buffer is moved to would be 73, but instead it is a73. Why is this? I need these two values to be equal in order to perform the jumps I need for the user input menu.
On any user input, the information in the register is always preceded by an a, while the register that I use to check for the value is not. This makes it impossible to compare them for a jump.
Any suggestions?
section .data
prompt: db 'Enter a command: '
section .bss
buffer: resb 100; "reserve" 32 bytes
section .text ; code
global _start
_start:
mov rax, 4 ; write
mov rbx, 1 ; stdout
mov rcx, prompt ; where characters start
mov rdx, 0x10 ; 16 characters
int 0x80
mov rax, 3 ; read
mov rbx, 0 ; from stdin
mov rcx, buffer ; start of storage
mov rdx, 0x10; no more than 64 (?) chars
int 0x80
mov rax, [buffer]
mov rbx, "s"
cmp rax, rbx
je _s
; return to Linux
mov rax, 1
mov rbx, 0
int 0x80
_s:
add r8, [buffer]
; dump buffer that was read
mov rdx, rax ; # chars read
mov rax, 4 ; write
mov rbx, 1 ; to stdout
mov rcx, buffer; starting point
int 0x80
jmp _start
If the user types s, followed by <enter>, the memory starting at the address of buffer will contain bytes ['s', '\n', '\0', '\0', ...] (where the newline byte '\n' is from pressing <enter> and the null bytes '\0' are from the .bss section being initialized to 0). As integers, represented in hex, the corresponding values in memory are [0x73, 0x0A, 0x00, 0x00, ...].
The mov rax, [buffer] instruction copies 8 bytes of memory starting at the address of buffer into the rax register. x86 is little endian, so the byte at the lowest address (0x73) becomes the least significant byte of rax and each following byte takes the next higher position, resulting in rax holding 0x0000000000000A73.
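The effect of the 8-byte load can be reproduced outside the debugger. The following Python sketch models it with int.from_bytes; load_qword is a hypothetical helper, and the buffer contents are the ones described above.

```python
def load_qword(memory, offset=0):
    """Model 'mov rax, [buffer]': read 8 bytes and interpret them little-endian."""
    return int.from_bytes(memory[offset:offset + 8], "little")


# 's', '\n', then the zero bytes of the 100-byte .bss buffer
buffer = b"s\n" + bytes(98)

rax = load_qword(buffer)
print(hex(rax))         # 0xa73
print(hex(rax & 0xFF))  # 0x73, the byte that a one-byte compare looks at
```

Only the lowest byte of the loaded value corresponds to the first character typed, which is why a byte-sized comparison is the natural fit here.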
Workarounds
This workaround is based on Peter Cordes's comment below. The idea is to compare 1) the first byte starting at the address of buffer with 2) the byte 's'. This would replace the three lines in your question that 1) move [buffer] to rax, 2) move 's' to rbx, and 3) cmp rax, rbx.
cmp byte [buffer], 's'
je _s
This would check that the first character entered is 's', even if followed by other characters. If your intent is to check that only a single character 's' is entered (optionally followed by '\n' in the case that <enter> was pressed to end the input, as opposed to <ctrl-d>), a more thorough approach could utilize the value returned by the read system call, which indicates how many bytes were read.
Without checking how many characters are read, you might want to clear the buffer on each iteration. As is, a user could enter 's' on one iteration, followed by <ctrl-d> on the next iteration, and the buffer would still start with an 's'.
Band-aid Workarounds
(I had originally proposed the following two ideas as workarounds, but they have their own problems that Peter Cordes identifies in the comments below)
To work around the issue, one option could be to add the newline to your target for comparison.
mov rax, [buffer]
mov rbx, `s\n` ; the second operand was formerly "s"
cmp rax, rbx
je _s
Alternatively, specifying that the read system call only consume 1 byte could be another approach to address the issue.
mov rax, 3 ; read
mov rbx, 0 ; from stdin
mov rcx, buffer ; start of storage
mov rdx, 0x01 ; the second operand was formerly 0x10
int 0x80

I'm getting a segmentation fault in my assembly program [duplicate]

The tutorial I am following is for x86 and was written using 32-bit assembly; I'm trying to follow along while learning x64 assembly in the process. This has been going very well up until this lesson, where I have the following simple program, which tries to modify a single character in a string. It assembles fine but segfaults when run.
section .text
global _start ; Declare global entry point for ld
_start:
jmp short message ; Jump to where our message is so we can do a call to push its address onto the stack
code:
xor rax, rax ; Clean up the registers
xor rbx, rbx
xor rcx, rcx
xor rdx, rdx
; Try to change the N to a space
pop rsi ; Get address from stack
mov al, 0x20 ; Load 0x20 into RAX
mov [rsi], al; Why segfault?
xor rax, rax; Clear again
; write(rdi, rsi, rdx) = write(file_descriptor, buffer, length)
mov al, 0x01 ; write the command for 64bit Syscall Write (0x01) into the lower 8 bits of RAX
mov rdi, rax ; First Parameter, RDI = 0x01 which is STDOUT, we move rax to ensure the upper 56 bits of RDI are zero
;pop rsi ; Second Parameter, RSI = Popped address of message from stack
mov dl, 25 ; Third Parameter, RDX = Length of message
syscall ; Call Write
; exit(rdi) = exit(return value)
xor rax, rax ; write returns # of bytes written in rax, need to clean it up again
add rax, 0x3C ; 64bit syscall exit is 0x3C
xor rdi, rdi ; Return value is in rdi (First parameter), zero it to return 0
syscall ; Call Exit
message:
call code ; Pushes the address of the string onto the stack
db 'AAAABBBNAAAAAAAABBBBBBBB',0x0A
The culprit is this line:
mov [rsi], al; Why segfault?
If I comment it out, the program runs fine and outputs the message 'AAAABBBNAAAAAAAABBBBBBBB'. Why can't I modify the string?
The author's code is the following:
global _start
_start:
jmp short ender
starter:
pop ebx ;get the address of the string
xor eax, eax
mov al, 0x20
mov [ebx+7], al ;put a NULL where the N is in the string
mov al, 4 ;syscall write
mov bl, 1 ;stdout is 1
pop ecx ;get the address of the string from the stack
mov dl, 25 ;length of the string
int 0x80
xor eax, eax
mov al, 1 ;exit the shellcode
xor ebx,ebx
int 0x80
ender:
call starter
db 'AAAABBBNAAAAAAAABBBBBBBB',0x0A
And I've compiled that using:
nasm -f elf <infile> -o <outfile>
ld -m elf_i386 <infile> -o <outfile>
But even that causes a segfault. Images on the page show it working properly and changing the N into a space, but I seem to be stuck in segfault land :( Google isn't being helpful in this case, so I turn to you, Stack Overflow; any pointers (no pun intended!) would be appreciated.
I would assume it's because you're trying to write to data that is in the .text section. Usually you're not allowed to write to the code segment, for security reasons. Modifiable data should be in the .data section. (Or .bss if zero-initialized.)
For actual shellcode, where you don't want to use a separate section, see Segfault when writing to string allocated by db [assembly] for alternate workarounds.
Also I would never suggest using the side effects of call pushing the address after it to the stack to get a pointer to data following it, except for shellcode.
This is a common trick in shellcode (which must be position-independent); 32-bit mode needs a call to get EIP somehow. The call must have a backwards displacement to avoid 00 bytes in the machine code, so putting the call somewhere that creates a "return" address you specifically want saves an add or lea.
Even in 64-bit code where RIP-relative addressing is possible, jmp / call / pop is about as compact as jumping over the string for a RIP-relative LEA with a negative displacement.
Outside of the shellcode / constrained-machine-code use case, it's a terrible idea and you should just lea reg, [rel buf] like a normal person with the data in .data and the code in .text. (Or read-only data in .rodata.) This way you're not trying to execute code next to data, or put data next to code.
(Code-injection vulnerabilities that allow shellcode already imply the existence of a page with write and exec permission, but normal processes from modern toolchains don't have any W+X pages unless you do something to make that happen. W^X is a good security feature for this reason, so normal toolchain security features / defaults must be defeated to test shellcode.)

Does int 0x80 overwrite register values? [duplicate]

This question already has an answer here:
What happens if you use the 32-bit int 0x80 Linux ABI in 64-bit code?
I wrote a program which is supposed to behave like a for/while loop, printing a string of text a certain number of times.
Here is the code:
global _start
section .data
msg db "Hello World!",10 ; define the message
msgl equ $ - msg ; define message length
; use minimal size of storage space
imax dd 0x00001000 ; defines imax to be big!
section .text
_start:
mov r8, 0x10 ; this will be our 'i' (struck out: put imax in r8d)
; just attempt 10 iterations
_loop_entry: ; loop entry point
mov eax, 4 ; setup the message to print
mov ebx, 1 ; write, stdout, message, length
mov ecx, msg
mov edx, msgl
int 0x80 ; print message
; this is valid because registers do not change
dec r8 ; decrease i and jump on not zero
cmp r8,1 ; compare values to jump
jnz _loop_entry
mov rax, 1 ; exit with zero
mov rbx, 0
int 0x80
The problem I have is the program runs into an infinite loop. I ran it inside gdb and the cause is:
int 0x80 is called to print the message, and this works correctly. However, after the interrupt finishes, the content of r8 is zero rather than the value it should be. r8 is where the counter sits, counting (down) the number of times the string is printed.
Does int 0x80 modify register values? I noticed that rax, rbx, rcx, rdx were not affected in the same way.
Test Results
Answer: YES! It does modify r8.
I have changed two things in my program. First, I now cmp r8, 0 to print Hello World! the correct number of times. Second, I have added
mov [i], r8 ; put away i
after _loop_entry: and
mov r8, [i] ; get i back
after the first int 0x80.
Here is my now working program. More info to come on performance against C++.
;
; main.asm
;
;
; To be used with main.asm, as a test to see if optimized c++
; code can be beaten by me, writing a for / while loop myself.
;
;
; Absolute minimum code to be competitive with asm.
global _start
section .data
msg db "Hello World!",10 ; define the message
msgl equ $ - msg ; define message length
; use minimal size of storage space
imax dd 0x00001000 ; defines imax to be big!
i dq 0x0 ; defines i (8 bytes, because the 64-bit r8 is stored here)
section .text
_start:
mov r8, 0x10 ; put imax in r8d, this will be our 'i'
_loop_entry: ; loop entry point
mov [i], r8 ; put away i
mov eax, 4 ; setup the message to print
mov ebx, 1 ; write, stdout, message, length
mov ecx, msg
mov edx, msgl
int 0x80 ; print message
; this is valid because registers do not change
mov r8, [i] ; get i back
dec r8 ; decrease i and jump on not zero
cmp r8,0 ; compare values to jump
jnz _loop_entry
mov rax, 1 ; exit with zero
mov rbx, 0
int 0x80
int 0x80 just causes a software interrupt. In your case it's being used to make a system call. Whether or not any registers are affected will depend on the particular system call you're invoking and the system call calling convention of your platform. Read your documentation for the details.
Specifically, from the System V Application Binary Interface x86-64™ Architecture Processor Supplement [PDF link], Appendix A, x86-64 Linux Kernel Conventions:
The interface between the C library and the Linux kernel is the same as for the user-level applications...
For user-level applications, r8 is a scratch register, which means it's caller-saved. If you want it to be preserved over the system call, you'll need to do it yourself.

lost in assembly NASM ELF64 world

As part of my Computer Architecture class I need to get comfortable with assembly, or at least comfortable enough. I'm trying to read the input from the user and then print it back (for the time being). This is how I tried to lay it out in pseudocode:
Declare msg variable (this will be printed on screen)
Declare length variable (to be used by the sys_write function) with long enough value
Pop the stack once to get the program name
Pop the stack again to get the first argument
Move the current value of the stack into the msg variable
Move msg to ECX (sys_write argument)
Move length to EDX (sys_write argument)
Call sys_write using standard output
Kernel call
Call sys_exit and leave
This is my code so far
section .data
msg: db 'placeholder text',0xa;
length: dw 0x123;
section .text
global _start
_start:
pop rbx;
pop rbx;
; this is not working when I leave it in I get this error:
; invalid combination of opcode and operands
;mov msg, rbx;
mov ecx, msg;
mov edx, length;
mov eax, 4;
mov ebx, 1;
int 0x80;
mov ebx, 0;
mov eax, 1;
int 0x80;
When I leave it out (not moving the argument into msg), I get this output
placeholder text
#.shstrtab.text.data
�#�$�`��
We have really just begun with NASM, so ANY help will be greatly appreciated. I've been looking at http://www.cin.ufpe.br/~if817/arquivos/asmtut/index.html#stack and http://syscalls.kernelgrok.com/, adapting the examples and the register names to the best of my understanding to match http://www.nasm.us/doc/nasmdo11.html
I'm running Ubuntu 12.04, 64-bit, compiling (not even sure if this is the right word) with NASM for ELF64. I'm sorry to ask such a silly question, but I have been unable to find an easy enough tutorial for NASM that uses 64 bits.
When the program is started, the stack should look like this:
+----------------+
| ... | <--- rsp + 24
+----------------+
| argument 2 | <--- rsp + 16
+----------------+
| argument 1 | <--- rsp + 8
+----------------+
| argument count | <--- rsp
+----------------+
The first argument is the name of your program and the second is the user input (if the user typed anything as an argument). So the count of the arguments is at least 1.
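To make the layout concrete, here is a small Python model of those stack slots. It is a sketch under assumptions: the addresses are invented (the real ones are chosen by the kernel), and build_stack_image and read_qword are hypothetical helpers. It shows that [rsp] holds the count while [rsp+8] and [rsp+16] hold pointers to NUL-terminated strings, not the strings themselves.

```python
import struct


def build_stack_image(args, string_base=0x1000):
    """Model the stack slots at _start for a hypothetical argument list.

    string_base is an invented address. Returns (stack_bytes, strings),
    where strings maps an address to the NUL-terminated bytes stored there.
    """
    strings = {}
    pointers = []
    address = string_base
    for arg in args:
        data = arg.encode() + b"\0"
        strings[address] = data
        pointers.append(address)
        address += len(data)
    slots = [len(args)] + pointers + [0]  # count, one pointer per argument, NULL
    return struct.pack("<%dQ" % len(slots), *slots), strings


def read_qword(stack, offset):
    """Model 'mov reg, [rsp + offset]'."""
    return struct.unpack_from("<Q", stack, offset)[0]


stack, strings = build_stack_image(["./prog", "hello"])
print(read_qword(stack, 0))            # 2
print(strings[read_qword(stack, 8)])   # b'./prog\x00'
print(strings[read_qword(stack, 16)])  # b'hello\x00'
```

This is why a program that wants to print the user's argument has to load the pointer from [rsp+16] first and then walk the bytes it points to until it reaches the terminating zero.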
The arguments for system calls in 64-bit mode are stored in the following registers:
rax (system call number)
rdi (1st argument)
rsi (2nd argument)
rdx (3rd argument)
r10 (4th argument)
r8 (5th argument)
r9 (6th argument)
The system call is invoked with the syscall instruction, which overwrites rcx and r11; that is why the fourth argument goes in r10 rather than in rcx as it would for a function call. The numbers of all the system calls can be found here (yes, they are different from the numbers in 32-bit mode).
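Because the 32-bit int 0x80 interface and the 64-bit syscall interface differ in both registers and call numbers, a side-by-side table is a handy reference. The following Python sketch encodes the two conventions for a handful of calls; the ABIS table layout and the describe_call helper are illustrative inventions, while the register names and numbers are the ones used by Linux on x86. With the syscall instruction the fourth argument is passed in r10.

```python
# Register conventions and a few call numbers for the two Linux x86 entry points.
ABIS = {
    "int 0x80": {  # 32-bit interface
        "number": "eax",
        "args": ["ebx", "ecx", "edx", "esi", "edi", "ebp"],
        "calls": {"read": 3, "write": 4, "exit": 1, "ioctl": 54, "brk": 45},
    },
    "syscall": {  # x86-64 interface
        "number": "rax",
        "args": ["rdi", "rsi", "rdx", "r10", "r8", "r9"],
        "calls": {"read": 0, "write": 1, "exit": 60, "ioctl": 16, "brk": 12},
    },
}


def describe_call(abi, name, *values):
    """Return the register setup for one system call as a list of text lines."""
    table = ABIS[abi]
    lines = ["mov %s, %d" % (table["number"], table["calls"][name])]
    for register, value in zip(table["args"], values):
        lines.append("mov %s, %s" % (register, value))
    lines.append(abi)
    return lines


print("\n".join(describe_call("syscall", "write", 1, "msg", "msg_len")))
```

The operands msg and msg_len are placeholders for whatever labels a program defines. Mixing the two columns, for example putting 4 in rax and then executing syscall, selects a different call than intended, so it pays to check which interface a tutorial is written for.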
This is the program which should do your stuff:
section .data
msg: db 'Requesting 1 argument!', 10 ; message + newline
section .text
global _start
_start:
cmp qword [rsp], 2 ; check if argument count is 2
jne fail ; if not, jump to the fail label
mov rax, 1 ; sys_write
mov rdi, 1 ; stdout
mov rsi, [rsp+16] ; get the address of the argument
mov rdx, 1 ; one character (length 1)
print_loop: ; not named "loop", which is also an instruction mnemonic
cmp byte [rsi], 0 ; check if current character is 0
je exit ; if 0 then jump to the exit label
syscall ; write one byte (on success rax returns 1, which is sys_write again)
inc rsi ; advance to the next character
jmp print_loop ; repeat
fail:
mov rax, 1 ; sys_write
mov rdi, 1 ; stdout
lea rsi, [rel msg] ; load the address of the label msg into rsi
mov rdx, 23 ; length = 23
syscall
exit:
mov rax, 60 ; sys_exit
mov rdi, 0 ; with code 0
syscall
Since the code isn't perfect in many ways, you may want to modify it.
You've followed the instructions quite literally, and this is the expected output.
The stack value that you would write to the message is just some binary value. To be exact, it's a pointer to a string holding one of the command-line arguments (after two pops, the program name).
To make sense of that, you would either have to print the string it points to, or convert the pointer to an ASCII string, e.g. "0x12313132".
My OS is Ubuntu 64-bit. Compiling your code produced the error:
nasm print3.asm
print3.asm:12: error: instruction not supported in 16-bit mode
print3.asm:13: error: instruction not supported in 16-bit mode
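That is exactly where the "pop rbx" instructions are located. Run without -f, NASM produces the flat bin format, which defaults to 16-bit mode.
Adding "BITS 64" to the top of the asm file solved the problem (assembling with nasm -f elf64 selects 64-bit mode as well):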
BITS 64
section .data
msg: db 'placeholder text',0xa;
length: dw 0x123;
...

Resources