GAS to NASM assembly: translate ".rept .set" to NASM (loop and assign incrementing value to label) - nasm

GAS assembly knows about the .set-directive which can be combined with .rept to increment a label (variable) in a loop as in the example below:
pd:
.set SPAGE, 0
.rept 512
.quad SPAGE + 0x87 // PRESENT, R/W, USER, 2MB
.set SPAGE, SPAGE + 0x200000
.endr
How can I achieve something similarly convenient in NASM? I know about the TIMES directive, but that alone doesn't get me what I want. NASM's EQU directive only allows assigning a value once, so it will not solve my problem. Any ideas?

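Indeed, this cannot be done with the TIMES directive alone: the operand to TIMES is a critical expression, and TIMES repeats only a single instruction or data line. To repeat more than one line of code, or a complex macro, use the preprocessor %rep directive together with %assign, which (unlike EQU) can redefine a symbol as often as you like. Take a look at this silly example: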
global _start
section .text
_start:
mov rbx, 0
%assign i 0
%rep 5
mov rbx, [variable]
add rbx, i
mov [variable], rbx
%assign i i+1
%endrep
mov rax, 60 ; system call for exit
mov rdi, [variable]; value of 'variable' = 10
syscall
section .bss
variable: resq 1 ; 8 bytes, because the code accesses it through the 64-bit rbx/rdi
Check the answer:
nasm -felf64 ass.asm && ld ass.o && ./a.out
echo $?
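Applied to the page-directory table from the question, the same %assign/%rep pattern might look like the sketch below. It assumes the table lives in a .data section (the question does not say where pd is placed); dq is NASM's counterpart to GAS's .quad.

```nasm
section .data                    ; assumed placement, not stated in the question
pd:
%assign SPAGE 0
%rep 512
    dq SPAGE + 0x87              ; PRESENT, R/W, USER, 2MB
%assign SPAGE SPAGE + 0x200000   ; advance to the next 2 MiB page
%endrep
```

Because %assign is evaluated by the preprocessor, each of the 512 dq lines is emitted with its own constant, just as .set does inside .rept in GAS.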

Related

Why is the RDI register missing in this "Hello world" assembly program?

I found this "Hello" (shellcode) assembly program:
SECTION .data
SECTION .text
global main
main:
mov rax, 1
mov rsi, 0x6f6c6c6548 ; "Hello" is stored in reverse order "olleH"
push rsi
mov rsi, rsp
mov rdx, 5
syscall
mov rax, 60
syscall
And I found that mov rdi, 1 is missing. In other "hello world" programs that instruction appears so I would like to understand why this happens.
I was going to say it's an intentional trick or hack to save code bytes, using argc as the file descriptor. (1 if you run it from the shell without extra command line args). main(int argc, char**argv) gets its args in EDI and RSI respectively, in the x86-64 SysV calling convention used on Linux.
But given the other choices, like mov rax, 1 instead of mov eax, edi, it's probably just a bug that got overlooked because the code happened to work.
It would not work in real shellcode for a code-injection attack, where execution would probably reach this code with garbage other than 0, 1, or 2 in EDI. The shellcode test program on the tutorial you linked calls a const char[] of machine code as the only thing in main, which will normally compile to asm that doesn't touch RDI.
This code wouldn't work for code-injection attacks based on strcpy or other C-string overflows either, since the machine code contains 00 bytes as part of mov eax, 1, mov edx, 5, and the end of that character string.
Also, modern linkers don't link .rodata into an executable segment, and -zexecstack only affects the actual stack, not all readable memory. So that shellcode test won't work, although I expect it did when written. See How to get c code to execute hex machine code? for working ways, like using a local array and compiling with -zexecstack.
That tutorial is overall not great, probably something this guy wrote while learning. (But not as bad as I expected based on this bug and the use of Kali; it's at least decently written, just missing some tricks.)
Since you're using NASM, you don't need to manually waste time looking up ASCII codes and getting the byte order correct. Unlike some assemblers, mov rsi, "Hello" / push rsi results in those bytes being in memory in source order.
You also don't need an empty .data section, especially when making shellcode which is just a self-contained snippet of machine code which can't reference anything outside itself.
Writing a 32-bit register implicitly zero-extends to 64-bit. NASM optimizes mov rax,1 into mov eax,1 for you (as you can see in the objdump -d AT&T disassembly; use objdump -drwC -Mintel for Intel-syntax disassembly similar to NASM.)
The following should work:
global main
main:
mov rax, `Hello\n  ` ; two trailing spaces: non-zero padding to fill 8 bytes
push rax
mov rsi, rsp
push 1 ; push imm8
pop rax ; __NR_write
mov edi, eax ; STDOUT_FD is also 1
lea edx, [rax-1 + 6] ; EDX = 6; using 3 bytes with no zeros
syscall
mov al, 60 ; assuming write success, RAX = 6, zero outside the low byte
;lea eax, [rdi-1 + 60] ; the safe way that works even with ./hello >&- to return -EBADF
syscall
This is fewer bytes of machine code than the original, and avoids \x00 bytes which strcpy would stop on. I changed the string to end with a newline, using NASM backticks to support C-style escape sequences like \n as 0x0a byte.
Running normally (I linked it into a static executable without CRT, despite it being called main instead of _start. ld foo.o -o foo):
$ strace ./foo > /dev/null
execve("./foo", ["./foo"], 0x7ffecdc70a20 /* 54 vars */) = 0
write(1, "Hello\n", 6) = 6
exit(1) = ?
Running with stdout closed to break the mov al, 60 __NR_exit hack:
$ strace ./foo >&-
execve("./foo", ["./foo"], 0x7ffe3d24a240 /* 54 vars */) = 0
write(1, "Hello\n", 6) = -1 EBADF (Bad file descriptor)
syscall_0xffffffffffffff3c(0x1, 0x7ffd0b37a988, 0x6, 0, 0, 0) = -1 ENOSYS (Function not implemented)
--- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_MAPERR, si_addr=0xffffffffffffffda} ---
+++ killed by SIGSEGV (core dumped) +++
Segmentation fault (core dumped)
To still exit cleanly, use lea eax, [rdi-1 + 60] (3 bytes) instead of mov al, 60 (2 bytes) to set RAX according to the unmodified EDI, instead of depending on the upper bytes of RAX being zero which they aren't after an error return.
See also https://codegolf.stackexchange.com/questions/132981/tips-for-golfing-in-x86-x64-machine-code

printing numbers in nasm

I have written assembly code to print the numbers from 1 to 9, but it only prints 1 and nothing else, so the loop apparently isn't running. I can't figure out what is wrong with my code.
section .bss
lena equ 1024
outbuff resb lena
section .data
section .text
global _start
_start:
nop
mov cx,0
incre:
inc cx
add cx,30h
mov [outbuff],cx
cmp cx,39h
jg done
cmp cx,39h
jl print
print:
mov rax,1 ;sys_write
mov rdi,1
mov rsi,outbuff
mov rdx,lena
syscall
jmp incre
done:
mov rax,60 ;sys_exit
mov rdi,0
syscall
My OS is 64-bit Linux. The code is built with NASM using the following commands: nasm -f elf64 -g -o num.o num.asm and ld -o num num.o
Answer rewritten after some experimentation.
There are two errors in your code, and a few inefficiencies.
First, you add 0x30 to the number (to turn the number 1 into the ASCII character '1'). However, you do that addition inside the loop. As a result, cx is 0x31 on the first iteration, 0x62 ("b") on the second, 0x93 (an invalid UTF-8 sequence) on the third, and so on.
Just initialize cx to 0x30 and remove the add from inside the loop.
But there's another problem. RCX is clobbered during system calls. Replacing cx with r12 causes the program to work.
In addition to that, you pass the buffer's length to write, but it only has one character. The program so far:
section .bss
lena equ 1024
outbuff resb lena
section .data
section .text
global _start
_start:
nop
mov r12,30h
incre:
inc r12
mov [outbuff],r12
cmp r12,39h
jg done
cmp r12,39h
jl print
print:
mov rax,1 ;sys_write
mov rdi,1
mov rsi,outbuff
mov rdx,1
syscall
jmp incre
done:
mov rax,60 ;sys_exit
mov rdi,0
syscall
Even now, the code is inefficient. You have two compares on the same condition, and one of them branches to the very next instruction.
Also, your code would be smaller and faster if you moved the exit condition to the end of the loop. In addition, cx is a 16-bit register and r12 is a 64-bit register, but we actually only need 8 bits. Using larger registers than needed means our immediates waste space in memory and in the cache. We therefore switch to the 8-bit variant of r12. After these changes, we get:
section .bss
lena equ 1024
outbuff resb lena
section .data
section .text
global _start
_start:
nop
mov r12b,30h
incre:
inc r12b
mov [outbuff],r12b
mov rax,1 ;sys_write
mov rdi,1
mov rsi,outbuff
mov rdx,1
syscall
cmp r12b,39h
jl incre
mov rax,60 ;sys_exit
mov rdi,0
syscall
There's still lots more you can do. For example, you call the write system call 9 times, instead of filling the buffer and then calling it once (despite the fact that you've allocated a 1024 bytes buffer). It will probably be faster to initialize r12 with zero (xor r12, r12) and then add 0x30. (not relevant for the 8 bit version of the register).
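As a rough sketch of the "fill the buffer, then write once" idea: the version below uses al as the counter and a 9-byte buffer. Both are my own choices for illustration, not part of the code above.

```nasm
section .bss
outbuff resb 9                  ; room for the digits '1'..'9'

section .text
global _start
_start:
    lea rdi, [rel outbuff]      ; rdi = write cursor into the buffer
    mov al, '1'                 ; first digit to store
.fill:
    mov [rdi], al               ; store the current digit
    inc rdi
    inc al
    cmp al, '9'
    jbe .fill                   ; keep going until '9' has been stored

    mov eax, 1                  ; sys_write
    mov edi, 1                  ; stdout
    lea rsi, [rel outbuff]
    mov edx, 9                  ; nine bytes in a single call
    syscall

    mov eax, 60                 ; sys_exit
    xor edi, edi                ; status 0
    syscall
```

No register has to survive a system call here, so the RCX clobbering issue does not come up at all.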

Access .data section in Position Independent Code

I'm building a shared library with NASM. In that library, in some function, I need what we'd call a static variable in C. Basically, I think it is some space in the .data section:
SECTION .data
last_tok: dq 0 ; Define a QWORD
The problem arises when I try to access last_tok in my function.
I read the NASM Manual: 8.2 Writing Linux/ELF Shared Libraries which explains the problem and gives the solution.
SECTION .data
last_tok: dq 0 ; Define a QWORD
SECTION .text
EXTERN _GLOBAL_OFFSET_TABLE_
GLOBAL strtok:function
strtok:
enter 0, 0
push rbx
call .get_GOT
.get_GOT:
pop rbx
add rbx, _GLOBAL_OFFSET_TABLE_ + $$ - .get_GOT wrt ..gotpc
mov [rbx + last_tok wrt ..gotoff], rdi ; Store the contents of RDI at last_tok
mov rbx, [rbp - 8]
leave
ret
It may work with ELF32, but with ELF64 I get the following error:
nasm -f elf64 -o strtok.o strtok.s
strtok:15: error: ELF64 requires ..gotoff references to be qword
<builtin>: recipe for target 'strtok.o' failed
make: *** [strtok.o] Error 1
What am I doing wrong?
The effective address format only allows for 32 bit displacement that is sign extended to 64 bit. According to the error message, you need full 64 bits. You can add it via a register, such as:
mov rax, last_tok wrt ..gotoff
mov [rbx + rax], rdi
Also, the call .get_GOT trick is a 32-bit solution; in 64-bit mode you have RIP-relative addressing, which you can use instead. The above may assemble, but I am not sure it will work. Luckily, the simple solution is to use RIP-relative addressing to access your variable, thus:
SECTION .data
GLOBAL last_tok
last_tok: dq 0 ; Define a QWORD
SECTION .text
GLOBAL strtok:function
strtok:
mov rcx, [rel last_tok wrt ..gotpc] ; load the address from the GOT
mov rax, [rcx] ; load the old dq value from there
; and/or
mov [rcx], rdi ; store arg at that address
ret
Note that for a private (static) variable you can just use [rel last_tok] without having to mess with the got at all.
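For the private case, a minimal sketch could look like this. The function name save_tok is made up for illustration, and last_tok is deliberately not declared GLOBAL so that it stays local to the library.

```nasm
section .data
last_tok: dq 0                  ; private: no GLOBAL declaration

section .text
global save_tok:function        ; hypothetical function name
save_tok:
    mov rax, [rel last_tok]     ; return the previously stored value
    mov [rel last_tok], rdi     ; remember the new one, RIP-relative, no GOT involved
    ret
```

Assembling with nasm -f elf64 and linking with ld -shared should work without relocation complaints, since the only reference to last_tok is a RIP-relative offset within the same object.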
In a PIE executable, compilers use (the equivalent of) [rel symbol] to access even global variables, on the assumption that the main executable doesn't need or want symbol interposition for its own symbols.
(Symbol interposition, or symbols defined in other shared libraries, is the only reason to load symbol addresses from the GOT on x86-64. But even something like mov rdx, [rel stdin] is safe in a PIE executable: https://godbolt.org/z/eTf87e - the linker creates a definition of the variable in the executable so it's within range and at a link-time-constant offset for RIP-relative addressing.)

NASM get console size

I am new to NASM (and assembly in general) and I am looking for a way to get the console size (number of console columns and rows) in NASM, like AH=0Fh with INT 10h: http://en.wikipedia.org/wiki/INT_10H
Now, I understand that in NASM (and Linux in general) I cannot call BIOS interrupts, so there has to be another way.
The idea is to print enough output to fill the screen and then wait for the user to press ENTER before printing more output.
If you are programming in Linux, then you must use the available system calls to achieve your aims. It's not that there are no interrupts. The system call itself is executed with an interrupt call. Outside of the kernel, however, you will be unable to access them and, since the kernel runs in protected mode, even if you could they likely wouldn't do what you would expect.
To your question, however: to obtain the console size you need the ioctl system call with the TIOCGWINSZ request. Its number is 0x36 (54) in EAX for the 32-bit int 0x80 interface; the 64-bit syscall interface uses 16 in RAX instead. I'd suggest that you have a read through the manual page for ioctl, and you may also find this system call table very useful!
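A minimal 64-bit sketch of that call, using the raw syscall instruction. Returning the row count as the exit status is only a quick way to inspect the result with echo $? and is my own choice for the demonstration; 255 is used here as an arbitrary failure code.

```nasm
TIOCGWINSZ equ 0x5413           ; ioctl request number on x86 Linux

section .bss
ws: resw 4                      ; struct winsize: ws_row, ws_col, ws_xpixel, ws_ypixel

section .text
global _start
_start:
    mov eax, 16                 ; __NR_ioctl in the 64-bit syscall ABI
    mov edi, 1                  ; file descriptor 1 (stdout)
    mov esi, TIOCGWINSZ
    lea rdx, [rel ws]           ; kernel fills in the four 16-bit fields
    syscall
    test rax, rax
    js .fail                    ; negative errno, e.g. when stdout is not a terminal

    movzx edi, word [rel ws]    ; ws_row (ws_col is at ws + 2)
    mov eax, 60                 ; sys_exit, status = rows (low 8 bits only)
    syscall
.fail:
    mov edi, 255                ; arbitrary failure status for this demo
    mov eax, 60
    syscall
```

Build with nasm -felf64 and ld as usual; when stdout is redirected to a file or pipe, the ioctl fails and the program takes the .fail path.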
This is a problem I had to deal with some time ago. The code for unistd.inc and termio.inc can be found here in the includes folder. The program and its makefile can be found in the tree under programs/basics/terminal-winsize.
The values for rows and columns are available on any terminal (console). xpixels and ypixels are only reported by some terminals (xterm yes, gnome-terminal depends). So if you don't get the x and y pixels (screen size) from a terminal, I guess the terminal is text-based. Correct me if there is another reason for this behaviour.
You can easily convert this program to 32 bits, since it makes use of nasmx macros for the syscalls. The only thing you have to do is replace the 64-bit registers with 32-bit registers and put some parameters in the right registers. Look for agguro on github to see all include files.
I hope this is helpful to you.
; Name: winsize
; Build: see makefile
; Run: ./winsize
; Description: Show the screen dimension of a terminal in rows/columns.
BITS 64
[list -]
%include "unistd.inc"
%include "termio.inc"
[list +]
section .bss
buffer: resb 5
.end:
.length: equ $-buffer
lf: resb 1
section .data
WINSIZE winsize
; keep the lengths the same or the 'array' construction will fail!
array: db "rows : "
db "columns : "
db "xpixels : "
db "ypixels : "
.length: equ $-array
.items: equ 4
.itemsize: equ array.length / array.items
section .text
global _start
_start:
mov BYTE[lf], 10 ; end of line in byte after buffer
; fetch the winsize structure data
syscall ioctl, STDOUT, TIOCGWINSZ, winsize
; initialize pointers and used variables
mov rsi, array ; pointer to array of strings
mov rcx, array.items ; items in array
.nextVariable:
; print the text associated with the winsize variable
push rcx ; save remaining strings to process
push rdx ; save winsize pointer
syscall write, STDOUT, rsi, array.itemsize
pop rax ; restore winsize pointer
push rax ; save winsize pointer
; convert variable to decimal
mov ax, WORD[rax] ; get value from winsize structure
mov rdi, buffer.end-1
.repeat:
xor rbx, rbx ; convert value in decimal
mov bx, 10
xor rdx, rdx
div bx
xchg rax, rdx
or al, "0"
std
stosb
xchg rax, rdx
cmp al, 0
jnz .repeat
push rsi ; save pointer to text
; print the variable value
mov rsi, rdi
mov rdx, buffer.end ; length of variable
sub rdx, rsi
inc rsi
syscall write, STDOUT, rsi, rdx
pop rsi
pop rdx
; calculate pointer to next variable value in winsize
add rdx, 2
; calculate pointer to next string in strings
add rsi, array.itemsize
; if all strings processed
pop rcx ; remaining arrayitems
loop .nextVariable
; exit the program
syscall exit, 0

lost in assembly NASM ELF64 world

So as part of my Computer Architecture class I need to get comfortable with assembly, or at least comfortable enough. I'm trying to read the input from the user and then print it back (for the time being). This is how I tried to lay it out in pseudocode:
Declare msg variable (this will be printed on screen)
Declare length variable (to be used by the sys_write function) with long enough value
Pop the stack once to get the program name
Pop the stack again to get the first argument
Move the current value of the stack into the msg variable
Move msg to ECX (sys_write argument)
Mov length to EDX (sys_write argument)
Call sys_write using standard output
Kernel call
Call sys_exit and leave
This is my code so far
section .data
msg: db 'placeholder text',0xa;
length: dw 0x123;
section .text
global _start
_start:
pop rbx;
pop rbx;
; this is not working when I leave it in I get this error:
; invalid combination of opcode and operands
;mov msg, rbx;
mov ecx, msg;
mov edx, length;
mov eax, 4;
mov ebx, 1;
int 0x80;
mov ebx, 0;
mov eax, 1;
int 0x80;
When I leave it out (not moving the argument into msg), I get this output
placeholder text
#.shstrtab.text.data
�#�$�`��
We have really just begun with NASM, so ANY help will be greatly appreciated. I've been looking at http://www.cin.ufpe.br/~if817/arquivos/asmtut/index.html#stack and http://syscalls.kernelgrok.com/, adapting the examples and the register names to the best of my understanding to match http://www.nasm.us/doc/nasmdo11.html
I'm running Ubuntu 12.04, 64-bit, and compiling (not even sure if this is the right word) with NASM for ELF64. I'm sorry to ask such a silly question, but I have been unable to find an easy enough tutorial for NASM that uses 64 bits.
When the program starts, the stack looks like this:
+----------------+
| ... | <--- rsp + 24
+----------------+
| argument 2 | <--- rsp + 16
+----------------+
| argument 1 | <--- rsp + 8
+----------------+
| argument count | <--- rsp
+----------------+
The first argument is the name of your program and the second is the user input (if the user typed anything as an argument). So the count of the arguments is at least 1.
The arguments for system calls in 64-bit mode are stored in the following registers:
rax (system call number)
rdi (1st argument)
rsi (2nd argument)
rdx (3rd argument)
r10 (4th argument; not rcx, which the syscall instruction overwrites)
r8 (5th argument)
r9 (6th argument)
And the system call is invoked with syscall. The numbers of all the system calls can be found here (yes, they are different from the numbers in 32-bit mode).
This is the program which should do your stuff:
section .data
msg: db 'Requesting 1 argument!', 10 ; message + newline
section .text
global _start
_start:
cmp qword [rsp], 2 ; check if argument count is 2
jne fail ; if not, jump to the fail label
mov rdi, 1 ; stdout
mov rsi, [rsp+16] ; get the address of the argument
mov rdx, 1 ; one character (length 1)
next_char:
cmp byte [rsi], 0 ; check if current character is 0
je exit ; if 0 then jump to the exit label
mov rax, 1 ; sys_write (reloaded every pass: syscall returns its result in rax)
syscall
inc rsi ; advance to the next character
jmp next_char ; repeat
fail:
mov rax, 1 ; sys_write
mov rdi, 1 ; stdout
lea rsi, [rel msg] ; load the address of the label msg into rsi
mov rdx, 23 ; length = 23
syscall
exit:
mov rax, 60 ; sys_exit
mov rdi, 0 ; with code 0
syscall
Since the code isn't perfect in many ways, you may want to modify it.
You've followed the instructions quite literally -- and this is expected output.
The stack variable that you write to the message, is just some binary value -- to be exact, it's a pointer to an array of strings containing the command line arguments.
To make sense of that, either you would have to print those strings, or convert the pointer to ascii string eg. "0x12313132".
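Printing one of those strings could look roughly like the sketch below. It assumes the 64-bit syscall ABI and the stack layout at process entry ([rsp] = argument count, [rsp+16] = pointer to the first user argument), measures the string with a simple scan for the terminating 0 byte, and writes it in one call.

```nasm
section .text
global _start
_start:
    cmp qword [rsp], 2          ; argument count: program name plus one argument?
    jb .done                    ; nothing to print
    mov rsi, [rsp+16]           ; pointer to the first user argument
    xor edx, edx                ; rdx = length counter
.strlen:
    cmp byte [rsi+rdx], 0       ; reached the terminating 0 byte?
    je .write
    inc rdx
    jmp .strlen
.write:
    mov eax, 1                  ; sys_write
    mov edi, 1                  ; stdout
    syscall                     ; rsi = string, rdx = its length
.done:
    mov eax, 60                 ; sys_exit
    xor edi, edi                ; status 0
    syscall
```

Running it as ./print hello should print hello without a trailing newline, since the argument string itself does not contain one.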
My OS is Ubuntu 64-bit. Compiling your code produced the error:
nasm print3.asm
print3.asm:12: error: instruction not supported in 16-bit mode
print3.asm:13: error: instruction not supported in 16-bit mode
Exactly where the "pop rbx" is located.
Adding "BITS 64" to the top of the asm file solved the problem:
BITS 64
section .data
msg: db 'placeholder text',0xa;
length: dw 0x123;
...
