Distinguish the existence of two separate strings

I have an assembly source file, named: helloworld.s:
.global _start
_start: mov X0, #1
ldr X1, =helloworld
mov X2, #13
mov X8, #64
svc 0
mov X0, #0
mov X8, #93
svc 0
.data
helloworld: .ascii "Hello nice warm World"
.ascii "Hello nice warm World2"
I created an executable file:
/usr/aarch64-linux-gnu/bin/as -o helloworld.o helloworld.s
/usr/aarch64-linux-gnu/bin/ld -o helloworld helloworld.o
and then, created an objdump output of the executable file:
/usr/aarch64-linux-gnu/bin/objdump -s -D helloworld > objdum_output_helloworld.txt
This gives:
helloworld: file format elf64-littleaarch64
Contents of section .text:
...
4000b0 200080d2 e1000058 a20180d2 080880d2 ......X........
4000c0 010000d4 000080d2 a80b80d2 010000d4 ................
4000d0 d8004100 00000000 ..A.....
...
Contents of section .data:
...
4100d8 48656c6c 6f206e69 63652077 61726d20 Hello nice warm
4100e8 576f726c 6448656c 6c6f206e 69636520 WorldHello nice
4100f8 7761726d 20576f72 6c6432 warm World2
...
Disassembly of section .text:
00000000004000b0 <_start>:
...
4000b0: d2800020 mov x0, #0x1 // #1
4000b4: 580000e1 ldr x1, 4000d0 <_start+0x20>
4000b8: d28001a2 mov x2, #0xd // #13
4000bc: d2800808 mov x8, #0x40 // #64
4000c0: d4000001 svc #0x0
4000c4: d2800000 mov x0, #0x0 // #0
4000c8: d2800ba8 mov x8, #0x5d // #93
4000cc: d4000001 svc #0x0
4000d0: 004100d8 .inst 0x004100d8 ; undefined
4000d4: 00000000 .inst 0x00000000 ; undefined
...
Disassembly of section .data:
00000000004100d8 <helloworld>:
...
4100d8: 6c6c6548 ldnp d8, d25, [x10, #-320]
4100dc: 696e206f ldpsw x15, x8, [x3, #-144]
4100e0: 77206563 .inst 0x77206563 ; undefined
4100e4: 206d7261 .inst 0x206d7261 ; undefined
4100e8: 6c726f57 ldnp d23, d27, [x26, #-224]
4100ec: 6c654864 ldnp d4, d18, [x3, #-432]
4100f0: 6e206f6c umin v12.16b, v27.16b, v0.16b
4100f4: 20656369 .inst 0x20656369 ; undefined
4100f8: 6d726177 ldp d23, d24, [x11, #-224]
4100fc: 726f5720 .inst 0x726f5720 ; undefined
410100: Address 0x0000000000410100 is out of bounds.
The question:
How can I tell, from the objdump output alone, that there are two separate strings:
"Hello nice warm World"
and
"Hello nice warm World2" ?
Thanks
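A minimal sketch of why the hex dump by itself shows no boundary: `.ascii` emits only the characters, with no terminator, and only the first string has a label (`helloworld`). The bytes in `dump_hex` are copied from the `.data` dump above; the `terminated` variant is hypothetical and assumes the source had used `.asciz` (or `.string`) instead, which appends a NUL byte to each string.

```python
# Bytes copied from the "Contents of section .data" hex dump above.
dump_hex = (
    "48656c6c 6f206e69 63652077 61726d20 "
    "576f726c 6448656c 6c6f206e 69636520 "
    "7761726d 20576f72 6c6432"
)
data = bytes.fromhex(dump_hex.replace(" ", ""))

# The two .ascii strings sit back to back: nothing in the bytes marks
# where the first one ends and the second one begins.
print(data)

# Hypothetical variant: with .asciz each string would end in a NUL byte,
# and the dump could be split on that byte.
terminated = b"Hello nice warm World\0" + b"Hello nice warm World2\0"
strings = [s.decode("ascii") for s in terminated.split(b"\0") if s]
print(strings)
```

With a terminator (or a second label in front of the second string, which would show up as a second symbol in the disassembly of `.data`), the boundary becomes visible; with plain `.ascii` and one label it is not recorded anywhere in the output shown.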

Related

PLT GOT: How "jmp DWORD PTR [ebx+0xc]" takes control to GOT?

For puts@plt, control should go from main to the PLT to the GOT. Control reaches the PLT, but how does "jmp DWORD PTR [ebx+0xc]" transfer control from the PLT to the GOT? In GDB, ebx=0, so ebx+0xc=0xc, which is weird. What am I missing in the flow?
└─# cat main.c
#include <stdio.h>
int main() {
printf ("Hello World!\n");
return 0;
}
└─# gcc -m32 -g main.c -o main
└─# objdump -Mintel -d --no-show-raw-insn main
main: file format elf32-i386
:::
0000119d <main>:
119d: lea ecx,[esp+0x4]
11a1: and esp,0xfffffff0
11a4: push DWORD PTR [ecx-0x4]
11a7: push ebp
11a8: mov ebp,esp
11aa: push ebx
11ab: push ecx
11ac: call 11d9 <__x86.get_pc_thunk.ax>
11b1: add eax,0x2e4f
11b6: sub esp,0xc
11b9: lea edx,[eax-0x1ff8]
11bf: push edx
11c0: mov ebx,eax
11c2: call 1030 <puts@plt> <-- go to addr 1030 (supposed to be in PLT)
11c7: add esp,0x10
11ca: mov eax,0x0
11cf: lea esp,[ebp-0x8]
11d2: pop ecx
11d3: pop ebx
11d4: pop ebp
11d5: lea esp,[ecx-0x4]
11d8: ret
└─# objdump -Mintel -D -j main -j .plt -j .got.plt --no-show-raw-insn main
main: file format elf32-i386
Disassembly of section .plt:
00001020 <puts@plt-0x10>:
1020: push DWORD PTR [ebx+0x4]
1026: jmp DWORD PTR [ebx+0x8]
102c: add BYTE PTR [eax],al
...
00001030 <puts@plt>:
1030: jmp DWORD PTR [ebx+0xc] <-- Go to ebx+0xc = 0xc (in gdb, ebx=0) !!
1036: push 0x0 (it should have gone to GOT)
103b: jmp 1020 <_init+0x20>
00001040 <__libc_start_main@plt>:
1040: jmp DWORD PTR [ebx+0x10]
1046: push 0x8
104b: jmp 1020 <_init+0x20>
Disassembly of section .got.plt:
00004000 <_GLOBAL_OFFSET_TABLE_>:
4000: cld
4001: add BYTE PTR ds:[eax],al
...
400c: adc BYTE PTR ss:[eax],al
400f: add BYTE PTR [esi+0x10],al
...
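The address arithmetic in the listings above can be checked with a short sketch. It assumes `__x86.get_pc_thunk.ax` returns the address of the instruction after the call (0x11b1) in eax, which is what such a thunk conventionally does. The `slot_bytes` value is reconstructed from the x86 encodings of the bogus instructions objdump printed for `.got.plt` (36 10 00 at 0x400c, then 00 46 10, then zeros), so treat it as an inference rather than a raw dump.

```python
import struct

# main: "call __x86.get_pc_thunk.ax" at 0x11ac returns to 0x11b1,
# and the thunk leaves that return address in eax.
call_return_addr = 0x11B1

# "add eax,0x2e4f" followed by "mov ebx,eax" at 0x11c0.
got_base = call_return_addr + 0x2E4F
print(hex(got_base))

# puts@plt does "jmp DWORD PTR [ebx+0xc]".
puts_slot = got_base + 0xC
print(hex(puts_slot))

# Bytes at 0x400c..0x4013, reconstructed from the disassembly (assumption).
slot_bytes = bytes.fromhex("3610000046100000")
puts_target, libc_start_target = struct.unpack("<2I", slot_bytes)
print(hex(puts_target), hex(libc_start_target))
```

Under these assumptions ebx is the address of `_GLOBAL_OFFSET_TABLE_` once `mov ebx,eax` has run, and the slot at ebx+0xc initially points back into the PLT entry (the `push 0x0` at 0x1036), which is how lazy binding hands control to the resolver on the first call. An ebx of 0 in gdb would be consistent with looking at the register before 0x11c0 has executed, though the listing alone does not confirm that.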

How do I write to files in ARM assembly?

I am learning ARM assembly on my raspberry pi, and I am trying to write to a file called "user_data.txt". I do know how to create a file, like so...
.data
.balign 1
file_name: .asciz "user_data.txt"
.text
.global _start
_start:
MOV R7, #8
LDR R0, =file_name
MOV R1, #0777
SWI 0
_end:
MOV R7, #1
SWI #0
...but, as I said, I can't figure out how I would write to this file. I have looked at other tutorials, but none that I looked at explain what each line does. I understand that I would move 4 into R7, in order to call the sys_write system call, but how would I tell ARM the file name I want to write to?
Can anyone give some code which clearly shows and explains some ARM that writes to a file?
Thanks,
primecubed
So you wanted code:
.data
.balign 1
file_name: .asciz "user_data.txt"
.text
.global _start
_start:
MOV R7, #8
LDR R0, =file_name
MOV R1, #0777
SWI 0
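MOV R4, R0 @ save the fd returned by creat(); later syscalls overwrite R0
MOV R7, #4 @ write(int fd, void* buf, int len); R0 still holds the fd
LDR R1, =file_name @ buf
MOV R2, #9 @ len
SWI 0
MOV R0, R4 @ restore the fd; write() left its byte count in R0
MOV R7, #6 @ close(int fd)
SWI 0
_end:
MOV R7, #1
SWI #0
This will (for simplicity) write 9 chars of file_name (user_data) into the file and close it. Note that each system call returns its result in R0, so the fd returned by creat is saved in R4 and restored before close.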
The manpages (https://linux.die.net/man/2/creat, https://linux.die.net/man/2/write) and this table (https://syscalls.w3challs.com/?arch=arm_thumb) are useful resources I often consult.
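For comparison, the same create, write, close sequence can be sketched in Python using the `os` wrappers around those system calls. The helper name `create_write_close` is made up for illustration. The point to notice is that the descriptor returned by the create step is kept in its own variable, because `write` returns a byte count rather than the descriptor.

```python
import os


def create_write_close(path: str, data: bytes) -> int:
    """Create (or truncate) path, write data to it, close it.

    Returns the number of bytes written.
    """
    # creat(path, mode) behaves like open() with these three flags.
    fd = os.open(path, os.O_CREAT | os.O_WRONLY | os.O_TRUNC, 0o777)
    try:
        # The return value is how many bytes were written, not the fd.
        written = os.write(fd, data)
    finally:
        # close() needs the original descriptor, so fd must still be around.
        os.close(fd)
    return written
```

Calling `create_write_close("user_data.txt", b"user_data")` mirrors what the assembly is meant to do: the file name says where to write only at creation time, and every later call identifies the file by its descriptor.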

Can't get newline to print with ARM Linux assembly

I started with code from a Raspberry Pi assembly language book. It prints out 15 in binary like so:
00000000000000000000000000001111pi@raspberrypi:$
I wanted to add a newline at the end, so I implemented the _newline: and new: .ascii "\n" portion of the code.
I reassembled it, but the output remains the same. Did I miss something in outputting the newline?
.global _start
_start:
mov r6, #15
mov r10, #1
mov r9, r10, lsl #31
ldr r1, =string
_bits:
tst r6, r9
moveq r0, #48
movne r0, #49
str r0, [r1]
mov r8, r6
bl _write
mov r6, r8
movs r9, r9, lsr #1
bne _bits
_newline:
mov r0, #1
mov r2, #1
mov r7, #4
ldr r1, =new
swi 0
_exit:
mov r7, #1
swi 0
_write:
mov r0, #1
mov r2, #1
mov r7, #4
swi 0
bx lr
.data
string: .ascii " "
new: .ascii "\n"
The last few lines of strace output are:
write(1, "1", 11) = 1
write(1, "1", 11) = 1
write(1, "1", 11) = 1
write(1, "1", 11) = 1
write(1, "\0", 11) = 1
exit(1) =?
+++ exited with 1 +++
Your strace output is the clue: write(1, "\0", 11) = 1 shows us that you wrote a 0 byte instead of the ASCII encoding of \n.
When you str r0, [r1], you're storing 4 bytes.
The destination of that store is
.data
string: .ascii " "
new: .ascii "\n"
which is really:
.data
string: .byte ' '
new: .byte '\n'
So each time you store '0' or '1' to string, you're also writing 3 more zero bytes, clobbering your '\n' and 2 more bytes beyond the end of your data section. (It doesn't segfault because you're not right at the end of a page.)
The simplest fix is to use a single-byte store: strb r0, [r1] instead of the word-sized str.
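The effect of the two store widths can be modelled with a small byte buffer. This is only a sketch: the buffer stands in for the start of `.data`, and the two `0xAA` bytes are made-up filler for whatever follows `new` in memory.

```python
import struct


def run(store_format: str) -> bytes:
    # Offset 0 is `string` (' '), offset 1 is `new` ('\n');
    # the 0xAA bytes are hypothetical filler past the end of the data.
    mem = bytearray(b" \n\xAA\xAA")
    # Store the character '1' at `string`, little-endian like ARM Linux.
    struct.pack_into(store_format, mem, 0, ord("1"))
    return bytes(mem)


word_store = run("<I")  # like: str  r0, [r1]  (4 bytes)
byte_store = run("<B")  # like: strb r0, [r1]  (1 byte)

print(word_store)
print(byte_store)
```

The 4-byte store replaces the newline at offset 1 with a zero byte, which is the `"\0"` seen in the strace output; the 1-byte store leaves it intact.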

Referring to a named constant in ARM assembly syntax / gas?

When I try to compile this ARM asm with as (arm-linux-gnueabihf):
.data
len = 42
.text
mov r0, #13
...it works. However, when I replace #13 with =len:
.data
len = 42
.text
mov r0, =len
I get:
Error: immediate expression requires a # prefix -- `mov r0,=len'
I've tried #len and #=len; neither seems to work. How do I refer to a named constant from the .data section in the .text section in ARM syntax?
Update:
Yeah, I had gotten section addresses and constants confused. For posterity, here is ARM hello world in unified syntax:
.syntax unified
.data
msg:
.ascii "Hello, ARM!\n"
len = . - msg
.text
.globl _start
_start:
mov r0, 1
ldr r1, =msg
mov r2, len
mov r7, 4
svc 0
mov r0, 0
mov r7, 1
svc 0

ARM, GNU assembler: how to pass "array" arguments to execve()?

I was writing a simple shellcode that would call execve() for an ARM platform (Linux on Raspberry PI) and got stuck with the second argument to execve. As per documentation:
int execve(const char *filename, char *const argv[], char *const envp[]);
Which totally cuts it for me if I call execve("/bin/sh", {NULL}, {NULL}); (from the assembly standpoint):
.data
.section .rodata
.command:
.string "/bin/sh"
.text
.globl _start
_start:
mov r7, #11
ldr r0, =.command
eor r1, r1 @ temporarily forget about argv
eor r2, r2 @ don't mind envp too
svc #0
mov r7, #1
eor r0, r0
svc #0
The assembly above assembles cleanly and spawns a shell when run on my test machine, which has a real /bin/sh. My trouble is that the target box has no /bin/sh as such, only a symlink to busybox, so I need to execute something like execve("/bin/busybox", {"/bin/busybox", "sh", NULL}, {NULL}).
As far as I understand, arrays are contiguous in memory, so all I have to do is lay the bytes out contiguously and then pass a pointer to the start of that "array". With that in mind I tried the following:
.data
.section .rodata
.command:
.string "/bin/busybox"
.args:
.ascii "/bin/busybox\0"
.ascii "sh\0"
.ascii "\0"
.text
.globl _start
_start:
mov r7, #11
ldr r0, =.command
ldr r1, =.args
eor r2, r2
svc #0
mov r7, #1
eor r0, r0
svc #0
However, that did not work. I also tried padding each string with null bytes to a 4-byte boundary, which didn't work either. If the .args label looks like this:
.args:
.ascii "/bin/sh\0"
.ascii "-c\0\0\0"
.ascii "ls\0\0\0"
.ascii "\0\0\0\0"
then strace of the program being executed is as below:
$ strace ./shell
execve("./shell", ["./shell"], [/* 19 vars */]) = 0
dup2(0, 4) = 4
dup2(1, 4) = 4
dup2(2, 4) = 4
execve("/bin/sh", [0x6e69622f, 0x68732f, 0x632d, 0x736c00], [/* 0 vars */]) = -1 EFAULT (Bad address)
exit(0) = ?
+++ exited with 0 +++
(Trying to execute /bin/sh -c ls first on the testing machine before coding for /bin/busybox sh).
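The odd values in that strace line are just the string bytes read as 32-bit little-endian words, which is what the kernel does when it treats `.args` as an array of pointers. A minimal sketch that reproduces them from the `.args` bytes listed above:

```python
import struct

# The bytes laid down by the four .ascii directives under .args.
args_bytes = b"/bin/sh\0" + b"-c\0\0\0" + b"ls\0\0\0" + b"\0\0\0\0"

# Read as many whole 32-bit little-endian words as fit.
word_count = len(args_bytes) // 4
words = struct.unpack_from(f"<{word_count}I", args_bytes)

print([hex(w) for w in words])
```

The first four words match the "pointers" strace printed, and the fifth is zero, which ends the array. Since none of those values is a valid address of a string, execve fails with EFAULT.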
I ran a similar C program and then debugged it to see how it's done. It appears the location that's passed to r1 contains a bunch of pointers to strings and then, naturally, 0x00:
(gdb) x/4xw 0xbefff764
0xbefff764: 0x000105d0 0x000105d8 0x000105dc 0x00000000
... snip ...
(gdb) p argv
$3 = {0x105d0 "/bin/sh", 0x105d8 "-c", 0x105dc "ls", 0x0}
Question
Now that I figured out how memory is laid out, how do I prepare such layout in assembly and correctly pass the second parameter to execve() as an "array" in ARM assembly parlance?
Gosh, I just came up with this... Several hours of fiddling around and then 2 minutes after posting my own question an answer hit me... Rubber duck debugging works.
.data
.section .rodata
command:
.string "/bin/sh"
arg0:
.string "/bin/sh"
arg1:
.string "-c"
arg2:
.string "ls"
args:
.word arg0
.word arg1
.word arg2
.word 0
.text
.globl _start
_start:
mov r7, #11
ldr r0, =command
ldr r1, =args
eor r2, r2
svc #0
mov r7, #1
eor r0, r0
svc #0
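The layout that the `.word` table builds can be sketched as data: NUL-terminated strings somewhere in memory, plus a separate array of their addresses ending in a zero word. The helper `layout_argv` is made up for illustration, and the 4-byte alignment of each string is an assumption chosen so that the addresses come out the same as in the gdb session from the C program; the assembly above does not align its strings.

```python
import struct


def layout_argv(args, base, align=4):
    """Return (addresses, string bytes, pointer table) for an argv array."""
    blob = bytearray()
    addrs = []
    for arg in args:
        # Pad so that each string starts on an `align`-byte boundary.
        while (base + len(blob)) % align:
            blob.append(0)
        addrs.append(base + len(blob))
        blob += arg.encode("ascii") + b"\0"
    # One 32-bit little-endian pointer per string, then a NULL terminator.
    pointer_table = struct.pack(f"<{len(addrs) + 1}I", *addrs, 0)
    return addrs, bytes(blob), pointer_table


addrs, blob, table = layout_argv(["/bin/sh", "-c", "ls"], base=0x105D0)
print([hex(a) for a in addrs])
print(table.hex())
```

The register passed as the second execve argument must hold the address of the pointer table, not the address of the strings themselves.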
You can use the stack pointer to pass parameters. When the program starts, the first argument (argv[1]) is at sp+8.
shell.s:
.text
.globl _start
_start:
.code 32
add r3,pc,#1
bx r3
.code 16
ldr r0, [sp, #8] @ load argv[1] to r0
add r1, sp, #8 @ set &argv[1] to r1
eor r2, r2 @ set NULL to r2
mov r7, #11
svc #1
This code does the same as the following C code:
#include <unistd.h>
int main(int argc, char *argv[])
{
execve(argv[1], &argv[1], NULL);
return 0;
}
The third parameter is envp; it can be set to NULL.
To start /bin/sh:
shell /bin/sh
I hope this helps someone
