Why can't I single-step into the aeskeygenassist instruction in self-modifying code?

I tried implementing AES-128 encryption in assembly language; my goal is to find out the final value. When debugging (single-stepping), the debugger stops at address 0x8048074.
Here is the code:
global _start
section .text
_start:
pxor xmm2, xmm2
pxor xmm3, xmm3
mov bx, 0x36e5
mov ah, 0x73
roundloop:
shr ax, 7
div bl
mov byte [sdfsdf+5], ah
sdfsdf:
aeskeygenassist xmm1, xmm0, 0x45
pshufd xmm1, xmm1, 0xff
shuffle:
shufps xmm2, xmm0, 0x10
pxor xmm0, xmm2
xor byte [shuffle+3], 0x9c
js short shuffle
pxor xmm0, xmm1
cmp ah, bh
jz short lastround
aesenc xmm3, xmm0
jmp short roundloop
lastround:
aesenclast xmm3, xmm0
ret
The debugger gets stuck here; I cannot single-step to 0x804807a:
[-------------------------------------code-------------------------------------]
0x804806c <_start+12>: mov ah,0x73
0x804806e <roundloop>: shr ax,0x7
0x8048072 <roundloop+4>: div bl
=> 0x8048074 <roundloop+6>: mov BYTE PTR ds:0x804807f,ah
0x804807a <sdfsdf>: aeskeygenassist xmm1,xmm0,0x45
0x8048080 <sdfsdf+6>: pshufd xmm1,xmm1,0xff
0x8048085 <shuffle>: shufps xmm2,xmm0,0x10
0x8048089 <shuffle+4>: pxor xmm0,xmm2
I'm using the PEDA plugin for GDB.
EDIT:
Sorry, I didn't mention the error message. It is a segmentation fault at this instruction: mov BYTE PTR ds:0x804807f,ah

I assume you forgot to link with --omagic to make the .text section writable.
So mov BYTE PTR ds:0x804807f,ah segfaults, and it's right before aeskeygenassist. You can't keep single-stepping after your program crashes. (You have no handler for SIGSEGV, and the default action is to terminate your program).
When I tried this on my desktop out of curiosity, I could imagine interpreting the behaviour as single-stepping getting "stuck" before aeskeygenassist, if I ignored the segfault message and the fact that trying again says "the program is no longer running".
From a GDB session:
(gdb) layout reg
(gdb) starti # like run with an implicit breakpoint on the first instruction
(gdb) si
0x0000000000401004 in _start ()
0x0000000000401008 in _start () ## I kept pressing return to repeat the command
0x000000000040100c in _start ()
0x000000000040100e in roundloop ()
0x0000000000401012 in roundloop ()
0x0000000000401014 in roundloop () # the MOV store
Program received signal SIGSEGV, Segmentation fault.
0x0000000000401014 in roundloop () # still pointing at the MOV store
Notice that RIP is still pointing at the mov: 0x8048074 in your 32-bit build, 0x401014 in my 64-bit build of the same source.
From the ld manual:
-N
--omagic
Set the text and data sections to be readable and writable. Also, do not page-align the data segment, and disable linking against shared libraries. If the output format supports Unix style magic numbers, mark the output as "OMAGIC". Note: Although a writable text section is allowed for PE-COFF targets, it does not conform to the format specification published by Microsoft.
Your code works fine for me if I link with:
nasm -felf64 aes.asm &&
ld --omagic aes.o -o aes
Alternatively, you could make an mprotect system call to give the page containing this code PROT_READ|PROT_WRITE|PROT_EXEC.
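mprotect needs a page-aligned start address, so the store target has to be rounded down to a page boundary first. Here is a minimal sketch of that arithmetic, in Python purely for illustration. The 4096-byte page size and the helper name page_span are assumptions of mine, not something from the question; the PROT_* values are the usual Linux ones.

```python
# Linux protection flags as defined in <sys/mman.h>
PROT_READ, PROT_WRITE, PROT_EXEC = 0x1, 0x2, 0x4

def page_span(addr, length, page_size=4096):
    """Return (start, size) of the whole pages covering [addr, addr + length).

    Hypothetical helper: page_size=4096 is an assumption (typical on x86 Linux).
    """
    start = addr & ~(page_size - 1)                           # round down
    end = (addr + length + page_size - 1) & ~(page_size - 1)  # round up
    return start, end - start

# Store target in the question's 32-bit build: sdfsdf+5 = 0x804807f
start, size = page_span(0x804807F, 1)
prot = PROT_READ | PROT_WRITE | PROT_EXEC
print(hex(start), hex(size), prot)   # the three arguments mprotect would take
```

In the assembly itself the same values would go into the registers for the mprotect system call before the first store into the code.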
GDB's layout reg disassembly window even updates disassembly for aeskeygenassist after its immediate is modified by store.
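To see why the store to sdfsdf+5 changes only the immediate, it helps to look at the instruction bytes. The following is a sketch in Python, assuming the usual encoding of this instruction (66 0F 3A DF, then ModRM, then imm8); the 6-byte length and the 0x804807a address match the disassembly in the question, while the helper name patch_immediate is made up for illustration.

```python
# Assumed encoding of: aeskeygenassist xmm1, xmm0, 0x45
#   66 0F 3A DF = prefix + opcode, C8 = ModRM (reg=xmm1, rm=xmm0), 45 = imm8
insn = bytes.fromhex("660f3adfc845")
SDFSDF = 0x804807A   # address of the sdfsdf label in the 32-bit build

def patch_immediate(code, new_imm):
    """Mimic `mov byte [sdfsdf+5], ah`: overwrite only the final imm8 byte."""
    patched = bytearray(code)
    patched[5] = new_imm & 0xFF
    return bytes(patched)

print(hex(SDFSDF + 5))                     # the address the store writes to
print(patch_immediate(insn, 0x01).hex())   # same instruction, new immediate
```

The first five bytes are untouched, so the disassembler still shows aeskeygenassist xmm1, xmm0 with a different constant.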
Also note that self-modifying code (SMC) is extremely slow on modern x86: every store near instructions being executed triggers a full pipeline nuke. You'd be much better off unrolling with an assembler macro.
Also, you can't ret from _start under Linux; it's not a function. The stack pointer points to argc, not a return address. Make an _exit system call with int 0x80 for 32-bit code. When I said "works", I meant it reaches that ret and segfaults on code-fetch from address 1 after popping argc into RIP.
Also, use default rel for RIP-relative addressing of the store; it's more compact. Or I guess you're building a 32-bit executable out of this for some reason, based on your code addresses. I didn't notice that at first, that's why I tested as a 64-bit executable. Fortunately you used labels correctly, and aeskeygenassist is the same length in both modes, so it still works.

Related

How can I work around gdb w/gef giving me "No function contains specified address"?

I am debugging a program using gdb with the gef extension. I am using GNU gdb (GDB) Fedora 12.1-1.fc36. The program I am debugging calls setvbuf, and since it's dynamically linked, the call instruction goes to the setvbuf@plt entry in the procedure linkage table. There, I see the following:
→ 0x804859c <main+18> call 0x8048450 <setvbuf@plt>
↳ 0x8048450 <setvbuf@plt+0> jmp DWORD PTR ds:0x8049a38
0x8048456 <setvbuf@plt+6> push 0x38
0x804845b <setvbuf@plt+11> jmp 0x80483d0
When I issue the command disass 0x8049a38, gdb properly shows me the entry in the setvbuf@plt.got. However, when I issue disass 0x80483d0, gdb tells me: No function contains specified address. I do not understand this, because when I do vmmap 0x80483d0, gdb recognizes that the address is indeed in the code section of my own program. Stranger still, when I finally step into that jump, there IS code there, and now gdb disassembles it just fine:
→ 0x80483d0 push DWORD PTR ds:0x8049a14
0x80483d6 jmp DWORD PTR ds:0x8049a18
0x80483dc add BYTE PTR [eax], al
0x80483de add BYTE PTR [eax], al
0x80483e0 <printf@plt+0> jmp DWORD PTR ds:0x8049a1c
0x80483e6 <printf@plt+6> push 0x0
0x80483eb <printf@plt+11> jmp 0x80483d0
0x80483f0 <fflush@plt+0> jmp DWORD PTR ds:0x8049a20
0x80483f6 <fflush@plt+6> push 0x8
0x80483fb <fflush@plt+11> jmp 0x80483d0
I know there's a bit of funny business that occurs with the whole PLT/GOT thunk table thing, but is there any way to force disassembly even if something is "not part of a function?"

Wrong Memory Addresses regarding the code x86 using GDB [duplicate]

I'm trying to debug this simple C program:
#include <stdio.h>
int main(int argc, char *argv[]) {
printf("Hello\n");
}
But when I disassemble the main function I get this:
(gdb) disas main
Dump of assembler code for function main:
0x000000000000063a <+0>: push rbp
0x000000000000063b <+1>: mov rbp,rsp
0x000000000000063e <+4>: sub rsp,0x10
0x0000000000000642 <+8>: mov DWORD PTR [rbp-0x4],edi
0x0000000000000645 <+11>: mov QWORD PTR [rbp-0x10],rsi
0x0000000000000649 <+15>: lea rdi,[rip+0x94] # 0x6e4
0x0000000000000650 <+22>: call 0x510 <puts@plt>
0x0000000000000655 <+27>: mov eax,0x0
0x000000000000065a <+32>: leave
0x000000000000065b <+33>: ret
End of assembler dump.
This is already pretty strange, because I think addresses start with a prefix of 4... for 32-bit executables and 8... for 64-bit executables.
But going on I then put a breakpoint:
(gdb) b *0x0000000000000650
Breakpoint 1 at 0x650
I run it and I get this error message:
Warning:
Cannot insert breakpoint 1.
Cannot access memory at address 0x650

Your code was most probably compiled as a Position-Independent Executable (PIE) to allow Address Space Layout Randomization (ASLR). On some systems, gcc is configured to create PIEs by default (which implies that the options -pie -fPIE are passed to gcc).
When you start GDB to debug a PIE, it starts reading addresses from 0, since your executable was not started yet, and therefore not relocated (in PIEs, all addresses including the .text section are relocatable and they start at 0, similar to a dynamic shared object). This is a sample output:
$ gcc -o prog main.c -pie -fPIE
$ gdb -q prog
Reading symbols from prog...(no debugging symbols found)...done.
gdb-peda$ disassemble main
Dump of assembler code for function main:
0x000000000000071a <+0>: push rbp
0x000000000000071b <+1>: mov rbp,rsp
0x000000000000071e <+4>: sub rsp,0x10
0x0000000000000722 <+8>: mov DWORD PTR [rbp-0x4],edi
0x0000000000000725 <+11>: mov QWORD PTR [rbp-0x10],rsi
0x0000000000000729 <+15>: lea rdi,[rip+0x94] # 0x7c4
0x0000000000000730 <+22>: call 0x5d0 <puts@plt>
0x0000000000000735 <+27>: mov eax,0x0
0x000000000000073a <+32>: leave
0x000000000000073b <+33>: ret
End of assembler dump.
As you can see, this shows a similar output to yours, with .text addresses starting at low values.
Relocation takes place once you start your executable, so after that, your code will be placed at some random address in your process memory:
gdb-peda$ start
...
gdb-peda$ disassemble main
Dump of assembler code for function main:
0x00002b1c8f17271a <+0>: push rbp
0x00002b1c8f17271b <+1>: mov rbp,rsp
=> 0x00002b1c8f17271e <+4>: sub rsp,0x10
0x00002b1c8f172722 <+8>: mov DWORD PTR [rbp-0x4],edi
0x00002b1c8f172725 <+11>: mov QWORD PTR [rbp-0x10],rsi
0x00002b1c8f172729 <+15>: lea rdi,[rip+0x94] # 0x2b1c8f1727c4
0x00002b1c8f172730 <+22>: call 0x2b1c8f1725d0 <puts@plt>
0x00002b1c8f172735 <+27>: mov eax,0x0
0x00002b1c8f17273a <+32>: leave
0x00002b1c8f17273b <+33>: ret
End of assembler dump.
As you can see, the addresses now take "real" values that you can set breakpoints on. Note that you will usually still not see the effect of ASLR in GDB, though, since it disables randomization by default (debugging a program with a randomized location would be cumbersome). You can check this with show disable-randomization. If you really want to see the effects of ASLR in your PIE, set disable-randomization off. Then every run will relocate your code to random addresses.
So the bottom line is: When debugging PIE code, start your program in GDB first and then figure out the addresses.
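The relationship between the two disassembly listings above is plain addition: runtime address = load base + the low address shown before the program runs. A small Python sketch of that arithmetic, using only the numbers from the session above (the function names are my own):

```python
def load_base(runtime_addr, static_addr):
    """Load base = where the image actually landed minus the pre-run address."""
    return runtime_addr - static_addr

def runtime_address(base, static_addr):
    """Translate a pre-run (relative) address into a real one."""
    return base + static_addr

# main is shown at 0x71a before `start` and at 0x2b1c8f17271a afterwards
base = load_base(0x2B1C8F17271A, 0x71A)
print(hex(base))

# The call to puts sits at 0x730 before relocation
print(hex(runtime_address(base, 0x730)))
```

A breakpoint on the translated address can be inserted once the process exists; the raw 0x730 cannot, because nothing is mapped there.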
Alternatively, you can explicitly disable the creation of PIE code and compile your application using gcc filename.c -o filename -no-pie -fno-PIE.
My system does not enforce PIE creation by default, so unfortunately I don't know about the implications of disabling PIE on such a system (would be glad to see comments on that).
For a more comprehensive explanation of position-independent code (PIC) in general (which is of utmost importance for shared libraries), have a look at Ulrich Drepper's paper "How to Write Shared Libraries".

Segmentation fault (core dumped) when I run my assembly code [duplicate]

I've been looking at a tutorial for assembly, and I'm trying to get a hello world program to run. I am using Bash on Ubuntu on Windows.
Here is the assembly:
section .text
global _start ;must be declared for linker (ld)
_start: ;tells linker entry point
mov edx,len ;message length
mov ecx,msg ;message to write
mov ebx,1 ;file descriptor (stdout)
mov eax,4 ;system call number (sys_write)
int 0x80 ;call kernel
mov eax,1 ;system call number (sys_exit)
int 0x80 ;call kernel
section .data
msg db 'Hello, world!', 0xa ;string to be printed
len equ $ - msg ;length of the string
I am using these commands to create the executable:
nasm -f elf64 hello.asm -o hello.o
ld -o hello hello.o -m elf_x86_64
And I run it using:
./hello
The program then seems to run without a segmentation fault or error, but it produces no output.
I can't figure out why the code won't produce an output, but I wonder if using Bash on Ubuntu on Windows has anything to do with it? Why doesn't it produce output and how can I fix it?

Related: WSL2 does allow 32-bit user-space programs, WSL1 doesn't. See Does WSL 2 really support 32 bit program? re: making sure you're actually using WSL2. The rest of this answer was written before WSL2 existed.
The issue is with Ubuntu for Windows (Windows Subsystem for Linux version 1). It only supports the 64-bit syscall interface and not the 32-bit x86 int 0x80 system call mechanism.
Besides not being able to use int 0x80 (32-bit compatibility) in 64-bit binaries, Ubuntu on Windows (WSL1) doesn't support running 32-bit executables either. (Same as if you'd built a real Linux kernel without CONFIG_IA32_EMULATION, like some Gentoo users do.)
You need to convert from using int 0x80 to syscall. It's not difficult. A different set of registers is used for syscall, and the system call numbers are different from their 32-bit counterparts. Ryan Chapman's blog has information on the syscall interface, the system calls, and their parameters. sys_read, sys_write and sys_exit are defined this way:
%rax  System call  %rdi             %rsi             %rdx          %r10  %r8  %r9
----------------------------------------------------------------------------------
0     sys_read     unsigned int fd  char *buf        size_t count
1     sys_write    unsigned int fd  const char *buf  size_t count
60    sys_exit     int error_code
Using syscall also clobbers the RCX and R11 registers. They are considered volatile. Don't rely on them holding the same value after the syscall.
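To make the difference between the two interfaces concrete, here is a small Python sketch that lays out which register receives what. The 64-bit numbers and registers come from the table above; the 32-bit numbers (4 and 1) come from the question's code, and the 32-bit argument order ebx, ecx, edx, esi, edi, ebp is the standard int 0x80 convention. The names INT80, SYSCALL and register_plan are hypothetical.

```python
# Two Linux x86 system call conventions, as plain data
INT80 = {
    "nr_reg": "eax",
    "arg_regs": ["ebx", "ecx", "edx", "esi", "edi", "ebp"],
    "numbers": {"write": 4, "exit": 1},
}
SYSCALL = {
    "nr_reg": "rax",
    "arg_regs": ["rdi", "rsi", "rdx", "r10", "r8", "r9"],
    "numbers": {"write": 1, "exit": 60},
}

def register_plan(abi, name, args):
    """Return {register: value} for one system call under the given convention."""
    plan = {abi["nr_reg"]: abi["numbers"][name]}
    plan.update(zip(abi["arg_regs"], args))   # arguments fill registers in order
    return plan

print(register_plan(INT80, "write", [1, "msg", "len"]))
print(register_plan(SYSCALL, "write", [1, "msg", "len"]))
print(register_plan(SYSCALL, "exit", [0]))
```

Both the system call number and every argument register change, which is why swapping int 0x80 for syscall alone is not enough.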
Your code could be modified to be:
section .text
global _start ;must be declared for linker (ld)
_start: ;tells linker entry point
mov edx,len ;message length
mov rsi,msg ;message to write
mov edi,1 ;file descriptor (stdout)
mov eax,edi ;system call number (sys_write)
syscall ;call kernel
xor edi, edi ;Return value = 0
mov eax,60 ;system call number (sys_exit)
syscall ;call kernel
section .data
msg db 'Hello, world!', 0xa ;string to be printed
len equ $ - msg ;length of the string
Note: in 64-bit code if the destination register of an instruction is 32-bit (like EAX, EBX, EDI, ESI etc) the processor zero extends the result into the upper 32-bits of the 64-bit register. mov edi,1 has the same effect as mov rdi,1.
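The zero-extension rule can be modelled in a few lines of Python. This is only an illustration of the register-width behaviour, not of any real API; write_r32 and write_r16 are made-up names, and the 16-bit case is included for contrast because narrower writes leave the upper bits alone.

```python
MASK64 = (1 << 64) - 1

def write_r32(old64, value):
    """Writing a 32-bit register (e.g. EDI) zeroes bits 63..32 of the full register."""
    return value & 0xFFFFFFFF

def write_r16(old64, value):
    """Writing a 16-bit register (e.g. DI) keeps bits 63..16 unchanged."""
    return (old64 & ~0xFFFF & MASK64) | (value & 0xFFFF)

rdi = 0xDEADBEEFCAFEBABE
print(hex(write_r32(rdi, 1)))   # what RDI holds after: mov edi, 1
print(hex(write_r16(rdi, 1)))   # what RDI holds after: mov di, 1
```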
This answer isn't a primer on writing 64-bit code, only about using the syscall interface. If you are interested in the nuances of writing code that calls the C library, and conforms to the 64-bit System V ABI there are reasonable tutorials to get you started like Ray Toal's NASM tutorial. He discusses stack alignment, the red zone, register usage, and a basic overview of the 64-bit System V calling convention.

As already pointed out in the comments by Ross Ridge, don't use the 32-bit way of calling kernel functions when you build 64-bit code.
Either build for 32-bit or "translate" the code to use 64-bit syscalls.
Here is what that could look like:
section .text
global _start ;must be declared for linker (ld)
_start: ;tells linker entry point
mov rdx,len ;message length
mov rsi,msg ;message to write
mov rdi,1 ;file descriptor (stdout)
mov rax,1 ;system call number (sys_write)
syscall ;call kernel
mov rax,60 ;system call number (sys_exit)
mov rdi,0 ;add this to output error code 0(to indicate program terminated without errors)
syscall ;call kernel
section .data
msg db 'Hello, world!', 0xa ;string to be printed
len equ $ - msg ;length of the string

Code works when run from section .data, but segmentation faults in section .text

Why doesn't the following code give a segmentation fault?
global _start
section .data
_start:
mov ecx, 3
xor byte[_start+1], 0x02
mov eax, 1
mov ebx, 2
int 80h
I expected it to segfault at the same place (line marked with a comment) as when the same code is run in the .text section:
global _start
section .text ; changed from data to text
_start:
mov ecx, 3
xor byte[_start+1], 0x02 ; ******get segmentation fault here
mov eax, 1
mov ebx, 2
int 80h
Now, I know that section .data is read-write and section .text is read-only.
But why would that matter when I try to access an illegal memory address?
For the example here, I expected a segmentation fault in section .data as well, at the same place where I got it in section .text.

[_start+1] is clearly not an illegal address. It's part of the 5 bytes encoding mov ecx, 3. (look at objdump -Mintel -drw a.out to see disassembly with the hex machine code).
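As a sketch of what the xor actually touches, here are the 5 bytes of mov ecx, 3 (opcode B9 followed by a little-endian 32-bit immediate) modelled in Python. Only the byte values are the point; nothing here executes machine code.

```python
# mov ecx, 3  assembles to  B9 03 00 00 00  (B9 = mov ecx, imm32)
code = bytearray.fromhex("b903000000")

# xor byte [_start+1], 0x02 flips bit 1 of the low byte of the immediate
code[1] ^= 0x02

imm = int.from_bytes(code[1:5], "little")
print(code.hex(), imm)   # the bytes now encode mov ecx, <imm>
```

The store lands in the middle of an instruction that has already run, so it is an ordinary write to a byte the program itself defined.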
IDK why you think there would be a problem writing to an address in .data where you've defined the contents. It's more common to use pseudo-instructions like db to assemble bytes into the data section, but assemblers will happily assemble instructions or db into bytes anywhere you put them.
The crash you'd expect from the .data version would come from _start being mapped without execute permission. But thanks to surprising defaults in the toolchain, programs with asm source files often end up with read-implies-exec (like gcc -zexecstack) unless you take precautions to avoid that:
Why data and stack segments are executable?
Unexpected exec permission from mmap when assembly files included in the project
If you applied that section .note.GNU-stack noalloc noexec nowrite progbits change, code fetch from RIP=_start would fault.
The version that tries to write to the .text section of course segfaults because it's mapped read-only.

Assembly execve failure -14

The program writes out an executable stored in its second segment on disk, decrypts it (into /tmp/decbd), and executes it (as planned).
The file decbd appears on disk and can be executed from the shell, but the last execve call returns eax=-14, and after the end of the program, execution runs on into data and segfaults.
http://pastebin.com/KywXTB0X
After compilation, I used hexdump and dd to manually place an echo binary, encrypted via openssl, in the second segment. When I stopped execution right before the last int 0x80 instruction, I was already able to run my "echo" in decbd from another terminal.

1. You should have narrowed it down to a minimal example. See MCVE.
2. You should comment your code if you want other people to help.
3. You should learn to use the debugger and/or other tools.
For point #1, you could have gone down to:
section .text
global _start ;must be declared for linker (ld)
_start:
mov eax,11 ; execve syscall
mov ebx,program ; name of program
mov ecx,[esp+4] ; pointer to argument array
mov ebp,[esp] ; number of arguments
lea edx,[esp+4*ebp+2] ; pointer to environ array
int 0x80
section .data
program db '/bin/echo',0
For point #3, using the debugger you could have seen that:
ebx is okay
ebp is okay
ecx is wrong
edx is wrong
It's an easy fix. ecx should be loaded with the address, not the value and edx should be skipping 2 pointers which are 4 bytes each, so the offset should be 8 not 2. The fixed code could look like this:
section .text
global _start ;must be declared for linker (ld)
_start:
mov eax,11 ; execve syscall
mov ebx,program ; name of program
lea ecx,[esp+4] ; pointer to argument array
mov ebp,[esp] ; number of arguments
lea edx,[esp+4*ebp+8] ; pointer to environ array (skip argc and NULL)
int 0x80
section .data
program db '/bin/echo',0
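The offsets in the fix follow from the layout of the stack at process entry: argc, then argc argv pointers, then a NULL, then the environment pointers. Here is a Python sketch of that arithmetic; the function name entry_stack_offsets is hypothetical, and the 4-byte word size matches the 32-bit code above.

```python
def entry_stack_offsets(argc, word=4):
    """Byte offsets from ESP at _start for a 32-bit Linux process.

    Layout: [argc] [argv[0] .. argv[argc-1]] [NULL] [envp[0] ..] [NULL]
    """
    argv = word                        # skip the argc slot
    envp = argv + word * argc + word   # skip argc pointers plus the NULL
    return argv, envp

for argc in (1, 3):
    argv_off, envp_off = entry_stack_offsets(argc)
    print(argc, argv_off, envp_off)    # envp offset equals 4*argc + 8
```

That is the same 4*ebp+8 used in the lea for edx, while the argv offset of 4 is what the lea for ecx computes as an address rather than loading the value stored there.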

man execve says this in the "ERRORS" section with regard to return code -14 (-EFAULT):
EFAULT filename points outside your accessible address space.
You passed a bad pointer to execve().
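Raw Linux system calls report failure as a negative errno value in the return register, which is how -14 maps to EFAULT. A small Python sketch of that decoding, using the standard errno module; the helper name decode_syscall_return is an assumption for illustration, and the error names are those of the platform running the script.

```python
import errno
import os

def decode_syscall_return(ret):
    """Map a raw syscall return value to an errno name, or None on success.

    Linux reserves -4095..-1 for error codes; anything else is a normal result.
    """
    if -4095 <= ret < 0:
        return errno.errorcode.get(-ret, "unknown")
    return None

print(decode_syscall_return(-14), "-", os.strerror(errno.EFAULT))
```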
