ARM, GNU assembler: how to pass "array" arguments to execve()? - linux

I was writing a simple shellcode that would call execve() for an ARM platform (Linux on Raspberry PI) and got stuck with the second argument to execve. As per documentation:
int execve(const char *filename, char *const argv[], char *const envp[]);
Which totally cuts it for me if I call execve("/bin/sh", {NULL}, {NULL}); (from the assembly standpoint):
.data
.section .rodata
.command:
.string "/bin/sh"
.text
.globl _start
_start:
mov r7, #11
ldr r0, =.command
eor r1, r1 @ temporarily forget about argv
eor r2, r2 @ don't mind envp either
svc #0
mov r7, #1
eor r0, r0
svc #0
The assembly above assembles cleanly and spawns a shell when run on my test machine, which has a real /bin/sh. My trouble is that on the actual target box there is no /bin/sh per se, only a symlink to busybox, so I need to execute something like execve("/bin/busybox", {"/bin/busybox", "sh", NULL}, {NULL}).
As far as I understand, arrays are contiguous in memory, so all I have to do is lay out the bytes contiguously and pass a pointer to the beginning of that "array". With that in mind I tried the following:
.data
.section .rodata
.command:
.string "/bin/busybox"
.args:
.ascii "/bin/busybox\0"
.ascii "sh\0"
.ascii "\0"
.text
.globl _start
_start:
mov r7, #11
ldr r0, =.command
ldr r1, =.args
eor r2, r2
svc #0
mov r7, #1
eor r0, r0
svc #0
but with no success. I also tried padding each string with null bytes to align it to 4 bytes, which didn't work either. If the .args label looks like this:
.args:
.ascii "/bin/sh\0"
.ascii "-c\0\0\0"
.ascii "ls\0\0\0"
.ascii "\0\0\0\0"
then strace of the program being executed is as below:
$ strace ./shell
execve("./shell", ["./shell"], [/* 19 vars */]) = 0
dup2(0, 4) = 4
dup2(1, 4) = 4
dup2(2, 4) = 4
execve("/bin/sh", [0x6e69622f, 0x68732f, 0x632d, 0x736c00], [/* 0 vars */]) = -1 EFAULT (Bad address)
exit(0) = ?
+++ exited with 0 +++
(Trying to execute /bin/sh -c ls first on the testing machine before coding for /bin/busybox sh).
I ran a similar C program and then debugged it to see how it's done. It appears the location that's passed to r1 contains a bunch of pointers to strings and then, naturally, 0x00:
(gdb) x/4xw 0xbefff764
0xbefff764: 0x000105d0 0x000105d8 0x000105dc 0x00000000
... snip ...
(gdb) p argv
$3 = {0x105d0 "/bin/sh", 0x105d8 "-c", 0x105dc "ls", 0x0}
Question
Now that I figured out how memory is laid out, how do I prepare such layout in assembly and correctly pass the second parameter to execve() as an "array" in ARM assembly parlance?

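Gosh, I just came up with this... Several hours of fiddling around and then 2 minutes after posting my own question an answer hit me... Rubber duck debugging works. Each string gets its own label, and the array itself is a run of .word entries holding the addresses of those strings, terminated by a zero word: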
.data
.section .rodata
command:
.string "/bin/sh"
arg0:
.string "/bin/sh"
arg1:
.string "-c"
arg2:
.string "ls"
args:
.word arg0
.word arg1
.word arg2
.word 0
.text
.globl _start
_start:
mov r7, #11
ldr r0, =command
ldr r1, =args
eor r2, r2
svc #0
mov r7, #1
eor r0, r0
svc #0
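For comparison, the same layout can be sketched in C. This is only an illustration under assumptions: the names arg0, arg1, arg2 and args mirror the labels above, and count_args is a hypothetical helper that is not part of the original answer. It shows that what execve() expects is a table of pointers ending in a null pointer, not the string bytes themselves.

```c
#include <stddef.h>

/* The strings live in read-only storage, like the arg0/arg1/arg2 labels. */
static const char arg0[] = "/bin/sh";
static const char arg1[] = "-c";
static const char arg2[] = "ls";

/* The "array" is a table of addresses ending in NULL, like the
 * .word arg0 / .word arg1 / .word arg2 / .word 0 sequence. */
static const char *const args[] = { arg0, arg1, arg2, NULL };

/* Hypothetical helper: count the entries before the NULL terminator,
 * which is how a consumer of argv finds the end of the table. */
static size_t count_args(const char *const *argv)
{
    size_t n = 0;
    while (argv[n] != NULL)
        n++;
    return n;
}
```

Here count_args(args) walks three pointers and stops at the fourth, null entry. Passing args as the second argument of execve() would hand the kernel the same table of addresses that ldr r1, =args loads.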

You can use the stack pointer to pass parameters. When the program starts, argc is at sp, argv[0] is at sp+4, and the first argument (argv[1]) is at sp+8.
shell.s:
.text
.globl _start
_start:
.code 32
add r3,pc,#1
bx r3
.code 16
ldr r0, [sp, #8] @ load argv[1] into r0
add r1, sp, #8 @ set r1 to &argv[1]
eor r2, r2 @ set r2 to NULL
mov r7, #11
svc #1
This code does the same as the following C code:
#include <unistd.h>
int main(int argc, char *argv[])
{
execve(argv[1], &argv[1], NULL);
return 0;
}
The third parameter is envp, which can be set to NULL.
To start /bin/sh:
shell /bin/sh
I hope this helps someone

Related

How do I write to files in ARM assembly?

I am learning ARM assembly on my raspberry pi, and I am trying to write to a file called "user_data.txt". I do know how to create a file, like so...
.data
.balign 1
file_name: .asciz "user_data.txt"
.text
.global _start
_start:
MOV R7, #8
LDR R0, =file_name
MOV R1, #0777
SWI 0
_end:
MOV R7, #1
SWI #0
...but, as I said, I can't figure out how I would write to this file. I have looked at other tutorials, but none that I looked at explain what each line does. I understand that I would move 4 into R7, in order to call the sys_write system call, but how would I tell ARM the file name I want to write to?
Can anyone give some code which clearly shows and explains some ARM that writes to a file?
Thanks,
primecubed
So you wanted code:
.data
.balign 1
file_name: .asciz "user_data.txt"
.text
.global _start
_start:
MOV R7, #8
LDR R0, =file_name
MOV R1, #0777
SWI 0
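MOV R4, R0 @ save the fd returned by creat; later syscalls overwrite R0
MOV R7, #4 @ write(int fd, const void *buf, size_t len); R0 still holds fd
LDR R1, =file_name @ buf
MOV R2, #9 @ len
SWI 0
MOV R7, #6 @ close(int fd)
MOV R0, R4 @ restore fd; R0 now holds write's return value
SWI 0
_end:
MOV R7, #1
SWI #0
This will (for simplicity) write the first 9 characters of file_name ("user_data") into the file and close it. Note that R0 holds the fd only until the next syscall returns: write replaces it with the number of bytes written, so the fd is kept in R4 for close. Comments use @, because ; is a statement separator in the GNU assembler for ARM.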
The manpages (https://linux.die.net/man/2/creat, https://linux.die.net/man/2/write) and this table (https://syscalls.w3challs.com/?arch=arm_thumb) are useful resources I often consult.
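As a rough cross-check, the same creat/write/close sequence can be sketched in C. The helper name create_and_write is hypothetical and not from the answer. The sketch assumes a POSIX system and reuses the 0777 mode from the assembly. It makes explicit that each call returns its own result, so the file descriptor has to be kept in a separate variable.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: create `path`, write `len` bytes of `buf`, close.
 * Returns the number of bytes written, or -1 on error. */
static ssize_t create_and_write(const char *path, const void *buf, size_t len)
{
    int fd = creat(path, 0777);            /* syscall 8: returns the new fd */
    if (fd < 0)
        return -1;

    ssize_t written = write(fd, buf, len); /* syscall 4: returns a byte count */

    if (close(fd) < 0)                     /* syscall 6: needs the saved fd */
        return -1;
    return written;
}
```

Calling create_and_write("user_data.txt", "user_data.txt", 9) would mirror the assembly: it writes the 9 bytes "user_data" into the new file.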

SegFault when calling function in asm

I have started learning how to call functions in assembly. I followed several tutorials on the internet and made some modifications to them.
But it does not really work as expected.
.data
hello: .ascii "hello everyone\n"
len= . - hello
.text
.global _start
exit:
mov %r1,#0
mov %r2,#0
mov %r0, #0
mov %r7, #1
swi #0
println:
mov %r7, #4
swi #0
mov %pc, %lr
bx %r7
_start:
ldr %r1, =hello
ldr %r2, =len
b println
b exit
and the output goes
hello everyone
Segmentation fault
I don't know where I went wrong.
For function calls, use the bl (branch and link) instruction. This sets up lr to contain the return address. Your code uses b (branch) rather than bl, so lr is not set up and returning from println goes to an unpredictable address, likely crashing your program.
To fix this, use bl instead of b for function calls:
bl println
bl exit

Referring to named constant in ARM assembly syntax / gas?

When I try to compile this ARM asm with as (arm-linux-gnueabihf):
.data
len = 42
.text
mov r0, #13
...it works. However, when I replace #13 with =len:
.data
len = 42
.text
mov r0, =len
I get:
Error: immediate expression requires a # prefix -- `mov r0,=len'
I've tried #len and #=len, but neither seems to work. How do I refer to a named constant from the .data section in the .text section in ARM syntax?
Update:
Yeah, I had gotten section addresses and constants confused. For posterity, here is ARM hello world in unified syntax:
.syntax unified
.data
msg:
.ascii "Hello, ARM!\n"
len = . - msg
.text
.globl _start
_start:
mov r0, 1
ldr r1, =msg
mov r2, len
mov r7, 4
svc 0
mov r0, 0
mov r7, 1
svc 0

Pass arguments to ARM program while remotely debugging

I'm trying to debug an ARM code from my Linux machine. The beginning of the code is as follows:
.text:00008290 MOV R12, SP
.text:00008294 STMFD SP!, {R4,R11,R12,LR,PC}
.text:00008298 SUB R11, R12, #4
.text:0000829C SUB SP, SP, #0x24
.text:000082A0 STR R0, [R11,#var_28]
.text:000082A4 STR R1, [R11,#var_2C]
.text:000082A8 LDR R3, [R11,#var_28]
.text:000082AC CMP R3, #1 ; Check whether arg has been provided
.text:000082B0 BGT loc_82C0 ; Jump to 0x82C0 if arg provided
.text:000082B4 MOV R3, #0xFFFFFFFF
.text:000082B8 STR R3, [R11,#var_30]
.text:000082BC B loc_8448
As you can see, if arg is provided, the code jumps to 0x82C0 but I can't find a way to run the code with the argument.
To debug it, I'm using a server/client architecture on my machine as follows:
1st terminal window:
$ qemu-arm -g 1234 ./chall9.bin
2nd terminal window:
$ gdb-multiarch
(gdb) file chall9.bin
Reading symbols from /data/malware/chall9.bin...done.
(gdb) set architecture arm
The target architecture is assumed to be arm
(gdb) target remote 127.0.0.1:1234
Remote debugging using 127.0.0.1:1234
[New Remote target]
[Switching to Remote target]
0x00008150 in _start ()
(gdb) break *0x82b0
Breakpoint 1 at 0x82b0
(gdb) set args 12345
(gdb) show args
Argument list to give program being debugged when it is started is "12345".
(gdb) r
The "remote" target does not support "run". Try "help target" or "continue".
(gdb) c
Continuing.
Breakpoint 1, 0x000082b0 in main ()
(gdb) x /12i $pc
=> 0x82b0 <main+32>: bgt 0x82c0 <main+48>
0x82b4 <main+36>: mvn r3, #0
0x82b8 <main+40>: str r3, [r11, #-48] ; 0x30
0x82bc <main+44>: b 0x8448 <main+440>
0x82c0 <main+48>: mov r3, #0
0x82c4 <main+52>: str r3, [r11, #-28]
0x82c8 <main+56>: mov r0, #32
0x82cc <main+60>: bl 0x8248 <xmalloc>
0x82d0 <main+64>: mov r3, r0
0x82d4 <main+68>: str r3, [r11, #-32]
0x82d8 <main+72>: b 0x832c <main+156>
0x82dc <main+76>: ldr r3, [r11, #-28]
(gdb) si
0x000082b4 in main ()
It seems that my argument is not being passed, because the code should jump to 0x82c0 but instead falls through to 0x82b4.
Any idea? Thank you in advance for your inputs.
I found it! The argument has to be passed on the qemu command line:
$ qemu-arm -g 1234 ./chall9.bin 12345

How to print a number in ARM assembly?

I am trying to print a number that I have stored. I'm not sure if I am close or way off. Any help would be appreciated though. Here is my code:
.data
.balign 4
a: .word 4
.text
.global main
main:
ldr r0, addr_of_a
mov r1, #8
str r1, [r0]
write:
mov r0, #1
ldr r1, addr_of_a
mov r2, #4
mov r7, #4
swi #0
bx lr
addr_of_a: .word a
It compiles and runs, but I don't see anything printed. From what I understand, I need the address of where to start printing in r1, how many bytes in r2, the file descriptor in r0, and r7 specifies the write call if it is set to #4. I am simply trying to store #8, then print the stored number.
The write syscall takes as its second argument (r1) a pointer to the bytes you want to print. You are passing it a pointer to an integer, so it writes the raw bytes of the value 8 (0x08 followed by three zero bytes on little-endian ARM), none of which is a printable ASCII character. That is why nothing appears.
Below you'll find a "Hello World" program using the syscall write.
.text
.global main
main:
push {r7, lr}
mov r0, #1
ldr r1, =string
mov r2, #12
mov r7, #4
svc #0
pop {r7, pc}
.data
string: .asciz "Hello World\n"
If you want to print a number, you can use the printf function from the C library, like this:
.text
.global main
.extern printf
main:
push {ip, lr}
ldr r0, =string
mov r1, #1024
bl printf
pop {ip, pc}
.data
string: .asciz "The number is: %d\n"
Finally, if you want to print the number with the write syscall, you can implement an itoa function (one that converts an integer to a string).
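A minimal sketch of such a conversion in C, assuming an unsigned 32-bit value. The name u32_to_decimal is hypothetical and not taken from any library. It peels off digits with % 10, turns each into ASCII by adding '0' (48), and reverses them into the caller's buffer, which could then be passed to write.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: write the decimal digits of `value` into `buf`
 * (NUL-terminated) and return the number of digits. `buf` needs room for
 * at least 11 bytes: up to 10 digits plus the terminator. */
static size_t u32_to_decimal(uint32_t value, char *buf)
{
    char tmp[10];
    size_t n = 0;

    /* Collect digits least-significant first; do/while handles value 0. */
    do {
        tmp[n++] = (char)('0' + value % 10);
        value /= 10;
    } while (value != 0);

    /* Reverse into the output buffer so the digits read left to right. */
    for (size_t i = 0; i < n; i++)
        buf[i] = tmp[n - 1 - i];
    buf[n] = '\0';
    return n;
}
```

The returned digit count is what would go in r2 as the length for write, with the buffer address in r1.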
Hi I appreciate that this is a pretty old thread but I've scratched my head over this for a while and would like to share my solution. Maybe it'll help someone along the way!
I was aiming to print a digit without resorting to C++ in any way, though I realise that simply disassembling a tostring() (or whatever the C++ equivalent is) and seeing what it came up with would have been a far quicker route.
Basically I ended up creating a one-character .ascii string in the .data section, adding 48 to the digit I wanted to print, and storing the result in that string before printing it.
The +48 converts the digit to its ASCII code ('0' is 48).
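.global _start
_start:
MOV R8, #8
ADD R8, R8, #48 @ convert the digit to its ASCII code
LDR R9, =num
STRB R8, [R9] @ store a single byte; num is only one byte long
MOV R0, #1
LDR R1, =num
MOV R2, #1
MOV R7, #4
SWI 0
MOV R0, #0 @ exit(0) so execution does not run past the end of the code
MOV R7, #1
SWI 0
.data
num:
.ascii " "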
The biggest drawback of this approach is that it doesn't handle any number more than one digit long of course.
My solution for that was much, much uglier and beyond the scope of this answer here but if you've a strong stomach you can see it here:

Resources