disassembly issue: wrong result - nasm

I want to disassemble a very simple raw binary (.com file) with nasm, but the output is not what I expected. The source code is:
mov ax,4
push ax
mov ax,7
push 9
But the output in nasm is (I also tried IDA Pro and got the same result):
00000000 B80400 mov ax,0x4
00000003 B80750 mov ax,0x5007
00000006 006809 add [bx+si+0x9],ch
00000009 0000 add [bx+si],al
As you can see, the binary code is right; only the interpretation is wrong. (I think it is because of the last 3 extra zero bytes, which are produced by another program I am using. Or maybe something else!)
How can I omit the trailing zeros in the binary code? Or is there any way to make nasm interpret the code as I want?
Thank you.

B80750 is indeed mov ax, 0x5007: opcode B8 is followed by a 16-bit little-endian immediate, so the bytes 07 50 read as 0x5007. Nothing wrong there.
Anyway the given code fragment should assemble into something like this:
00000000 B80400 mov ax,0x4
00000003 50 push ax
00000004 B80700 mov ax,0x7
00000007 6A09 push byte +0x9
If you assembled the .com file yourself, you are doing it wrong: the bytes in your file do not match this listing.
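To see why the disassembler is right about B80750, here is a minimal Python sketch that decodes only the 16-bit `mov ax, imm16` form. The helper name `decode_mov_ax_imm16` is made up for illustration and is not part of nasm or ndisasm.

```python
def decode_mov_ax_imm16(code: bytes) -> str:
    """Decode a 16-bit `mov ax, imm16`: opcode B8, then a little-endian word."""
    if len(code) < 3 or code[0] != 0xB8:
        raise ValueError("not a 16-bit mov ax, imm16 encoding")
    # The two bytes after the opcode are the immediate, low byte first.
    imm = int.from_bytes(code[1:3], "little")
    return f"mov ax,{imm:#x}"


print(decode_mov_ax_imm16(bytes.fromhex("B80400")))  # first instruction in the file
print(decode_mov_ax_imm16(bytes.fromhex("B80750")))  # the 50 is swallowed as the high byte
```

The 50 byte that should have been a separate push ax sits where the high byte of the immediate belongs, so every instruction after it is decoded out of step.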

How can I work around gdb w/gef giving me "No function contains specified address"?

I am debugging a program using gdb with the gef extension. I am using GNU gdb (GDB) Fedora 12.1-1.fc36. The program I am debugging calls setvbuf, and as it's dynamically linked, the call instruction goes into the setvbuf@plt procedure linkage table entry. There, I see the following:
→ 0x804859c <main+18> call 0x8048450 <setvbuf@plt>
↳ 0x8048450 <setvbuf@plt+0> jmp DWORD PTR ds:0x8049a38
0x8048456 <setvbuf@plt+6> push 0x38
0x804845b <setvbuf@plt+11> jmp 0x80483d0
When I issue the command disass 0x8049a38, gdb properly shows me the entry in the setvbuf@plt.got. However, when I issue disass 0x80483d0, gdb tells me: No function contains specified address. I do not understand this, because when I do vmmap 0x80483d0, gdb recognizes that the address is indeed in the code section of my own program. Stranger still, when I finally step into that jump, there IS code there, and now gdb disassembles it just fine:
→ 0x80483d0 push DWORD PTR ds:0x8049a14
0x80483d6 jmp DWORD PTR ds:0x8049a18
0x80483dc add BYTE PTR [eax], al
0x80483de add BYTE PTR [eax], al
0x80483e0 <printf@plt+0> jmp DWORD PTR ds:0x8049a1c
0x80483e6 <printf@plt+6> push 0x0
0x80483eb <printf@plt+11> jmp 0x80483d0
0x80483f0 <fflush@plt+0> jmp DWORD PTR ds:0x8049a20
0x80483f6 <fflush@plt+6> push 0x8
0x80483fb <fflush@plt+11> jmp 0x80483d0
I know there's a bit of funny business that occurs with the whole PLT/GOT thunk table thing, but is there any way to force disassembly even if something is "not part of a function?"

Why can't I single-step into the aeskeygenassist instruction in self-modifying code?

I tried implementing AES-128 encryption in assembly language; my final goal is to find out the final value. When debugging (single-stepping), the debugger stops at address 0x8048074.
Here is the code:
global _start
section .text
_start:
pxor xmm2, xmm2
pxor xmm3, xmm3
mov bx, 0x36e5
mov ah, 0x73
roundloop:
shr ax, 7
div bl
mov byte [sdfsdf+5], ah
sdfsdf:
aeskeygenassist xmm1, xmm0, 0x45
pshufd xmm1, xmm1, 0xff
shuffle:
shufps xmm2, xmm0, 0x10
pxor xmm0, xmm2
xor byte [shuffle+3], 0x9c
js short shuffle
pxor xmm0, xmm1
cmp ah, bh
jz short lastround
aesenc xmm3, xmm0
jmp short roundloop
lastround:
aesenclast xmm3, xmm0
ret
The debugger is stuck here; I cannot single-step to 0x804807a:
[-------------------------------------code-------------------------------------]
0x804806c <_start+12>: mov ah,0x73
0x804806e <roundloop>: shr ax,0x7
0x8048072 <roundloop+4>: div bl
=> 0x8048074 <roundloop+6>: mov BYTE PTR ds:0x804807f,ah
0x804807a <sdfsdf>: aeskeygenassist xmm1,xmm0,0x45
0x8048080 <sdfsdf+6>: pshufd xmm1,xmm1,0xff
0x8048085 <shuffle>: shufps xmm2,xmm0,0x10
0x8048089 <shuffle+4>: pxor xmm0,xmm2
I'm using the peda plugin for GDB.
EDIT:
Sorry, I didn't mention the error message. It is a segmentation fault at this instruction: mov BYTE PTR ds:0x804807f,ah

I assume you forgot to link with --omagic to make the .text section writable.
So mov BYTE PTR ds:0x804807f,ah segfaults, and it's right before aeskeygenassist. You can't keep single-stepping after your program crashes. (You have no handler for SIGSEGV, and the default action is to terminate your program.)
When I tried this on my desktop out of curiosity, I could imagine interpreting the behaviour as single-stepping getting "stuck" before aeskeygenassist, if I ignored the segfault message and the fact that trying again says "the program is no longer running".
From a GDB session:
(gdb) layout reg
(gdb) starti # like run with an implicit breakpoint on the first instruction
(gdb) si
0x0000000000401004 in _start ()
0x0000000000401008 in _start () ## I kept pressing return to repeat the command
0x000000000040100c in _start ()
0x000000000040100e in roundloop ()
0x0000000000401012 in roundloop ()
0x0000000000401014 in roundloop () # the MOV store
Program received signal SIGSEGV, Segmentation fault.
0x0000000000401014 in roundloop () # still pointing at the MOV store
Notice that RIP is still pointing at the mov: 0x8048074 in your 32-bit build, 0x401014 in my 64-bit build of the same source.
From the ld manual:
-N
--omagic
Set the text and data sections to be readable and writable. Also, do not page-align the data segment, and disable linking against shared libraries. If the output format supports Unix style magic numbers, mark the output as "OMAGIC". Note: Although a writable text section is allowed for PE-COFF targets, it does not conform to the format specification published by Microsoft.
Your code works fine for me if I link with:
nasm -felf64 aes.asm &&
ld --omagic aes.o -o aes
Alternatively, you could make an mprotect system call to give the page containing this code PROT_READ|PROT_WRITE|PROT_EXEC.
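If you go the mprotect route, note that mprotect requires a page-aligned start address, so you pass the base of the page that contains the store target, not the target itself. A minimal Python sketch of that arithmetic, using the addresses from the question's listing and assuming 4 KiB pages (the `page_base` helper is hypothetical, just for illustration):

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages, typical for x86 Linux


def page_base(addr: int, page_size: int = PAGE_SIZE) -> int:
    """Round addr down to the start of the page that contains it."""
    return addr & ~(page_size - 1)


sdfsdf = 0x804807A          # address of aeskeygenassist in the question's listing
store_target = sdfsdf + 5   # the imm8 is the 6th (last) byte of the instruction
print(hex(store_target))             # 0x804807f, matches the faulting mov
print(hex(page_base(store_target)))  # 0x8048000, the address to hand to mprotect
```

In the asm itself, that amounts to masking the label address with ~0xfff before making the system call.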
GDB's layout reg disassembly window even updates the disassembly for aeskeygenassist after its immediate is modified by the store.
Also note that self-modifying code (SMC) is extremely slow on modern x86: every store near instructions being executed triggers a full pipeline nuke. You'd be much better off unrolling with an assembler macro.
Also, you can't ret from _start under Linux; it's not a function. The stack pointer points to argc, not a return address. Make an _exit system call with int 0x80 for 32-bit code. When I say "works", I mean it reaches that ret and segfaults on code-fetch from address 1 after popping argc into RIP.
Also, use default rel for RIP-relative addressing of the store; it's more compact. Or I guess you're building a 32-bit executable out of this for some reason, based on your code addresses. I didn't notice that at first, that's why I tested as a 64-bit executable. Fortunately you used labels correctly, and aeskeygenassist is the same length in both modes, so it still works.

Code works when run from section .data, but segmentation faults in section .text

Why doesn't the following code give a segmentation fault?
global _start
section .data
_start:
mov ecx, 3
xor byte[_start+1], 0x02
mov eax, 1
mov ebx, 2
int 80h
I expected it to segfault at the same place (line marked with a comment) as when the same code is run in the .text section:
global _start
section .text ; changed from data to text
_start:
mov ecx, 3
xor byte[_start+1], 0x02 ; ******get segmentation fault here
mov eax, 1
mov ebx, 2
int 80h
Now, I know that section .data is read-write, and section .text is read-only.
But why would that matter when I try to access an illegal memory address?
In this example, I expected to get a segmentation fault in section .data too, in the same place I got it in section .text.

[_start+1] is clearly not an illegal address. It's part of the 5 bytes encoding mov ecx, 3. (Look at objdump -Mintel -drw a.out to see the disassembly with the hex machine code.)
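To make that concrete, here is a small Python sketch of what the xor does to those 5 bytes. It assumes the usual 32-bit encoding of mov ecx, imm32 (opcode B9 followed by a little-endian dword); the `apply_xor` helper is hypothetical.

```python
def apply_xor(code: bytes, offset: int, mask: int) -> bytes:
    """Return a copy of code with the byte at offset XORed with mask."""
    patched = bytearray(code)
    patched[offset] ^= mask
    return bytes(patched)


mov_ecx_3 = bytes.fromhex("B903000000")  # mov ecx, 3: B9, then imm32 little-endian
patched = apply_xor(mov_ecx_3, 1, 0x02)  # xor byte [_start+1], 0x02
print(patched.hex())  # b901000000, i.e. mov ecx, 1
```

So the store only flips a bit in the low byte of the immediate. The address is perfectly valid; whether the write is allowed depends purely on the page permissions of the section it lives in.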
IDK why you think there would be a problem writing to an address in .data where you've defined the contents. It's more common to use pseudo-instructions like db to assemble bytes into the data section, but assemblers will happily assemble instructions or db into bytes anywhere you put them.
The crash you'd expect from the .data version is from _start being mapped without execute permission. But thanks to surprising defaults in the toolchain, programs with asm source files often end up with read-implies-exec (like gcc -zexecstack) unless you take precautions to avoid that:
Why data and stack segments are executable?
Unexpected exec permission from mmap when assembly files included in the project
If you applied that section .note.GNU-stack noalloc noexec nowrite progbits change, code fetch from RIP=_start would fault.
The version that tries to write to the .text section of course segfaults because it's mapped read-only.

Assembly execve failure -14

The program writes out an executable stored in its second segment on disk, decrypts it (into /tmp/decbd), and executes it (at least, that was the plan).
The file decbd appears on disk and can be executed from a shell, but the last execve call returns eax=-14. After the end of the program, execution runs on into data and segfaults.
http://pastebin.com/KywXTB0X
After compilation, I used hexdump and dd to manually place an echo binary, encrypted with openssl, in the second segment. When I stopped execution right before the last int 0x80 instruction, I was already able to run my "echo" from decbd in another terminal.

1. You should have narrowed it down to a minimal example. See MCVE.
2. You should comment your code if you want other people to help.
3. You should learn to use the debugger and/or other tools.
For point #1, you could have gone down to:
section .text
global _start ;must be declared for linker (ld)
_start:
mov eax,11 ; execve syscall
mov ebx,program ; name of program
mov ecx,[esp+4] ; pointer to argument array
mov ebp,[esp] ; number of arguments
lea edx,[esp+4*ebp+2] ; pointer to environ array
int 0x80
section .data
program db '/bin/echo',0
For point #3, using the debugger you could have seen that:
ebx is okay
ebp is okay
ecx is wrong
edx is wrong
It's an easy fix. ecx should be loaded with the address, not the value, and edx should skip 2 extra pointers (argc and the NULL that terminates argv), which are 4 bytes each, so the offset should be 8, not 2. The fixed code could look like this:
section .text
global _start ;must be declared for linker (ld)
_start:
mov eax,11 ; execve syscall
mov ebx,program ; name of program
lea ecx,[esp+4] ; pointer to argument array
mov ebp,[esp] ; number of arguments
lea edx,[esp+4*ebp+8] ; pointer to environ array (skip argc and NULL)
int 0x80
section .data
program db '/bin/echo',0
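The offsets in the fixed code follow from the initial stack layout of a 32-bit Linux process: argc at [esp], then the argc argv pointers, then a NULL, then the environment pointers. A short Python sketch of that arithmetic (the function name `initial_stack_offsets` is made up for illustration):

```python
WORD = 4  # pointer size in a 32-bit process


def initial_stack_offsets(argc: int) -> dict:
    """Byte offsets from esp at process entry, for a given argc."""
    return {
        "argc": 0,                       # [esp]
        "argv": WORD,                    # esp+4: first of argc pointers
        "envp": WORD * argc + 2 * WORD,  # skip argc itself and argv's NULL
    }


print(initial_stack_offsets(1))  # envp at esp+12 when only argv[0] is present
print(initial_stack_offsets(3))  # envp at esp+20 with two extra arguments
```

With argc held in ebp, the envp offset is exactly the 4*ebp+8 used in the lea.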

man execve says this in the "ERRORS" section with regard to return code -14 (-EFAULT):
EFAULT filename points outside your accessible address space.
You passed a bad pointer to execve().
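If you'd rather not look codes up by hand, a raw Linux system call return value in the range of negative error codes is just -errno, so it can be mapped back to a name. A minimal sketch using Python's standard errno module (the helper name `describe_syscall_result` is an assumption for illustration, and the message text is whatever your libc provides):

```python
import errno
import os


def describe_syscall_result(ret: int) -> str:
    """Translate a raw syscall return value (as seen in eax) to an errno name."""
    if ret >= 0:
        return "success"
    code = -ret
    name = errno.errorcode.get(code, "unknown")
    return f"{name}: {os.strerror(code)}"


print(describe_syscall_result(-14))  # starts with "EFAULT" on Linux
```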

How does 'BL' ARM instruction disassembly work?

The 'bl' (branch with link) instruction almost always becomes 0xebfffffe.
However, the processor and GNU binutils objdump somehow know where to branch:
00000000 <init_module>:
0: e1a0c00d mov ip, sp
4: e92ddff0 push {r4, r5, r6, r7, r8, r9, sl, fp, ip, lr, pc}
8: e24cb004 sub fp, ip, #4
c: e24dd038 sub sp, sp, #56 ; 0x38
10: ebfffffe bl 0 <init_module>
14: e59f0640 ldr r0, [pc, #1600] ; 65c <init_module+0x65c>
18: ebfffffe bl 74 <init_module+0x74>
How do they know?

The issue is caused by the fact that you're looking at the disassembly of an object file, not the final executable or shared object.
When the assembler produces the object file, the final address of the bl target is not fixed yet (it depends on the other object files that will be linked with it). So the assembler emits a placeholder offset and also adds a relocation that tells the linker where this bl is supposed to go in the final file. (You can see the relocation info in objdump by adding the -r switch.)
When linking, the linker processes the relocation, calculates the final address of the target function and patches the instruction so that the target address lines up. If you disassemble the final, linked executable, you will see a different encoding, with the real branch offset filled in.
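For reference, this is how the raw offset field of an ARM-mode (A32) BL decodes: the low 24 bits are a signed word offset, added to the instruction's address plus 8 (the value PC reads as). The Python sketch below only decodes that field; it knows nothing about relocations, which is why its answer for an unlinked object file is not the real destination. The function name `decode_bl_target` is hypothetical.

```python
def decode_bl_target(insn: int, addr: int) -> int:
    """Compute the branch target encoded in an A32 BL instruction at addr."""
    if (insn >> 24) & 0xF != 0xB:
        raise ValueError("not an A32 BL encoding")
    imm24 = insn & 0x00FFFFFF
    if imm24 & 0x800000:      # sign-extend the 24-bit field
        imm24 -= 1 << 24
    # PC reads as addr + 8 in ARM state; the offset counts 4-byte words.
    return (addr + 8 + (imm24 << 2)) & 0xFFFFFFFF


# 0xebfffffe holds an offset of -2 words, which cancels the +8 of the PC:
print(hex(decode_bl_target(0xEBFFFFFE, 0x10)))
print(hex(decode_bl_target(0xEBFFFFFE, 0x18)))
```

Every unrelocated 0xebfffffe decodes to its own address, so the field carries no information about the callee. The destination lives entirely in the relocation entry until the linker patches the instruction.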
