I am trying to write a simple ELF64 virus in NASM on Linux. It appends itself to the victim (and of course does all that section & segment related stuff) and changes victim's entry point so that it points to the malicious code.
When the infected program is being launched, the first one to be executed is my virus, and when does all it's work, it jumps to the original entry point and the victim's code is being executed.
When the virus infects simple C++ hello world, everything works fine: as I can see in strace, the virus executes properly and then the victim's code executes.
But if I append:
printf("%s\n", argv[0]);
to the victim's code, re-infect it and run, the virus' code executes properly, "hello world" is printed, but then a segmentation fault error is thrown.
I think it means that the stack is being changed during virus' execution so that there's some random number instead of the original argv[0].
However, I've analyzed the whole source of my virus, marked all pushes, pops and direct modifications of rsp, analyzed them carefully and it seems that the stack should be in the same state. But, as I see, it isn't.
Is it possible that the stack is being modified in some non-explicit way by for example a syscall? Or maybe it's impossible and I should just spend few more hours staring at the source to find a bug?
ELF64 virus in NASM on Linux
Persumably on x86_64 (64-bit Linux could also be aarch64, or powerpc64, or sparcv9, or ...).
it seems that the stack should be in the same state
What about registers? Note that on x86_64 argv is passed to main in $rsi, not on the stack.
You must read and understand x86_64 calling conventions.
Related
Years ago a teacher once said to class that 'everything that gets parsed through the CPU can also be exploited'.
Back then I didn't know too much about the topic, but now the statement is nagging on me and I
lack the correct vocabulary to find an answer to this question in the internet myself, so I kindly ask you for help.
We had the lesson about 'cat', 'grep' and 'less' and she said that in the worst case even those commands can cause harm if we parse the wrong content through it.
I don't really understand how she meant that. I do know how CPU registers work, we also had to write an educational buffer overflow so I have seen assembly code in the registers aswell.
I still don't get the following:
How do commands get executed in the CPU at all? e.g. I use 'cat' so somehwere there will be a call of the command. But how does the data I enter get parsed to the CPU? If I 'cat' a .txt file which contains 'hello world' - can I find that string in HEX somewhere in the CPU registers? And if yes:
How does the CPU know that said string is NOT to be executed?
Could you think of any scencario where the above commands could get exploited? Afaik only text gets parsed through it, how could that be exploitable? What do I have to be careful about?
Thanks alot!
Machine code executes by being fetched by the instruction-fetch part of the CPU, at the address pointed to by RIP, the instruction-pointer. CPUs can only execute machine code from memory.
General-purpose registers get loaded with data from data load/store instructions, like mov eax, [rdi]. Having data in registers is totally unrelated to having it execute as machine code. Remember that RIP is a pointer, not actual machine-code bytes. (RIP can be set with jump instructions, including indirect jump to copy a GP register into it, or ret to pop the stack into it).
It would help to learn some basics of assembly language, because you seem to be missing some key concepts there. It's kind of hard to answer the security part of this question when the entire premise seems to be built on some misunderstanding of how computers work. (Which I don't think I can easily clear up here without writing a book on assembly language.) All I can really do is point you at CPU-architecture stuff that answers part of the title question of how instructions get executed. (Not from registers).
Related:
How does a computer distinguish between Data and Instructions?
How instructions are differentiated from data?
Modern Microprocessors
A 90-Minute Guide! covers the basic fetch/decode/execute cycle of simple pipelines. Modern CPUs might have more complex internals, but from a correctness / security POV are equivalent. (Except for exploits like Spectre and Meltdown that depend on speculative execution).
https://www.realworldtech.com/sandy-bridge/3/ is a deep-dive on Intel's Sandybridge microarchitecture. That page covering instruction-fetch shows how things really work under the hood in real CPUs. (AMD Zen is fairly similar.)
You keep using the word "parse", but I think you just mean "pass". You don't "parse content through" something, but you can "pass content through". Anyway no, cat usually doesn't involve copying or looking-at data in user-space, unless you run cat -n to add line numbers.
See Race condition when piping through x86-64 assembly program for an x86-64 Linux asm implementation of plain cat using read and write system calls. Nothing in it is data-dependent, except for the command-line arg. The data being copied is never loaded into CPU registers in user-space.
Inside the kernel, copy_to_user inside Linux's implementation of a read() system call on x86-64 will normally use rep movsb for the copy, not a loop with separate load/store, so even in kernel the data gets copied from the page-cache, pipe buffer, or whatever, to user-space without actually being in a register. (Same for write copying it to whatever stdout is connected to.)
Other commands, like less and grep, would load data into registers, but that doesn't directly introduce any risk of it being executed as code.
Most of the things have already been answered by Peter. However i would like to add a few things.
How do commands get executed in the CPU at all? e.g. I use 'cat' so somehwere there will be a call of the command. But how does the data I enter get parsed to the CPU? If I 'cat' a .txt file which contains 'hello world' - can I find that string in HEX somewhere in the CPU registers?
cat is not directly executed by the CPU cat.c. You could check the source code and get and in-depth view. .
What actually happens is that each instruction is converted to assembly instruction and they get executed by the CPU. The instructions are not vulnerable because what they do is just move some data and switch some bits. Most of the vulnerability are due to memory management and cat has been vulnerable in the past Check this for more detail
How does the CPU know that said string is NOT to be executed?
It does not. Its the job of the operating system to tell what is to be executed and what not.
Could you think of any scencario where the above commands could get exploited? Afaik only text gets parsed through it, how could that be exploitable? What do I have to be careful about?
You have to be careful about how you are passing the text file to the memory. You could even make your own interpreter that would execute txt file and then the interpreter will be telling the CPU about how to execute that instruction.
Question-
Is there a command on Linux systems to see if execution from the stack is allowed?
Background-
Doing a homework assignment that requires a buffer overflow, injecting code into the stack, and overwriting a return address that will set the instruction pointer to the injected code. Everything looks good when stepping though with GDB, but segfaults when trying to execute the first line from the stack. The instruction pointer changes to the correct location, and the instruction is a NOP for testing purposes. I'm wondering if the system is preventing execution from the stack.
Thank you.
Did you try execstack? => stackoverflow.com/questions/6482759/… – Antti yesterday
For anyone else that stumbles across this. "ps -u user" to find PID, then "pmap -x PID" and check line that says "stack." If the x(execute) is missing, type "execstack -s filename"
Still segfaults under regular execution i.e. ./filename, but now works correctly under GDB
My problem is very similar but not the same with the this one.
I run the same example of exploit_notesearch.c in the book: Hacking, the Art of Exploitation on my 64-bit OS, Archlinux and it doesn't work.
From the above link I learnt that it just can't work on most 64-bit systems. But I still can't understand why the programme have to do this: ret = (unsigned int)&i - offset. Why can't I just do this: ret = (unsigned)shellcode so that I can replace the vulnerable program's return address with shellcode's beginning address?
The ret = (unsigned)shellcode will make the ret equals to the address of the shellcode array in your program. But that address is not the address of your malicious code in the target program(notesearch.c). The target process will put its searchstring on stack, so that your malicious code will be also put onto the stack of the target process.
In old days, the memory layout of processes was typically highly deterministic - the location of the stack buffer could usually be predicted quite well by the attacker (particularly if they knew exactly which version of the target software was being attacked). So it will be very easy to know what is the exact address of the searchstring and your shellcode.
However, today, many operating system will perform ASLR. So attackers trying to execute shellcode injected on the stack have to find the stack first. The system obscures related memory-addresses from the attackers. These values have to be guessed, and a mistaken guess is not usually recoverable due to the application crashing(Segmentation Fault).
To improve the chances of success when there was some guesswork involved, the active shellcode would often be preceeded by a large quantity of executable machine code that performed no useful operation - called a "NOP sled" or "NOP slide".
So even the ret = (unsigned int)&i - offset can not make sure your shellcode will be executed succesfully.
I want to use ptrace to write a piece of binary code in a running process's stack.
However, this causes segmentation fault (signal 11).
I can make sure the %eip register stores the pointer to the first instruction that I want to execute in the stack. I guess there is some mechanism that linux protects the stack data to be executable.
So, does anyone know how to disable such protection for stack. Specifically, I'm trying Fedora 15.
Thanks a lot!
After reading all replies, I tried execstack, which really makes code in stack executable. Thank you all!
This is probably due to the NX bit on modern processors. You may be able to disable this for your program using execstack.
http://advosys.ca/viewpoints/2009/07/disabling-the-nx-bit-for-specific-apps/
http://linux.die.net/man/8/execstack
As already mentioned it is due to the NX bit. But it is possible. I know for sure that gcc uses it itself for trampolines (which are a workaround to make e.g. function pointers of nested functions). I dont looked at the detailes, but I would recommend a look at the gcc code. Search in the sources for the architecture specific macro TARGET_ASM_TRAMPOLINE_TEMPLATE, there you should see how they do it.
EDIT: A quick google for that macro, gave me the hint: mprotect is used to change the permissions of the memory page. Also be carefull when you generate date and execute it - you maybe have in addition to flush the instruction cache.
I am an intern who was offered the task of porting a test application from Solaris to Red Hat. The application is written in Ada. It works just fine on the Unix side. I compiled it on the linux side, but now it is giving me a seg fault. I ran the debugger to see where the fault was and got this:
Warning: In non-Ada task, selecting an Ada task.
=> runtime tasking structures have not yet been initialized.
<non-Ada task> with thread id 0b7fe46c0
process received signal "Segmentation fault" [11]
task #1 stopped in _dl_allocate_tls
at 0870b71b: mov edx, [edi] ;edx := [edi]
This seg fault happens before any calls are made or anything is initialized. I have been told that 'tasks' in ada get started before the rest of the program, and the problem could be with a task that is running.
But here is the kicker. This program just generates some code for another program to use. The OTHER program, when compiled under linux gives me the same kind of seg fault with the same kind of error message. This leads me to believe there might be some little tweak I can use to fix all of this, but I just don't have enough knowledge about Unix, Linux, and Ada to figure this one out all by myself.
This is a total shot in the dark, but you can have tasks blow up like this at startup if they are trying to allocate too much local memory on the stack. Your main program can safely use the system stack, but tasks have to have their stack allocated at startup from dynamic memory, so typcially your runtime has a default stack size for tasks. If your task tries to allocate a large array, it can easily blow past that limit. I've had it happen to me before.
There are multiple ways to fix this. One way is to move all your task-local data into package global areas. Another is to dynamically allocate it all.
If you can figure out how much memory would be enough, you have a couple more options. You can make the task a task type, and then use a
for My_Task_Type_Name'Storage_Size use Some_Huge_Number;
statement. You can also use a "pragma Storage_Size(My_Task_Type_Name)", but I think the "for" statement is preferred.
Lastly, with Gnat you can also change the default task stack size with the -d flag to gnatbind.
Off the top of my head, if the code was used on Sparc machines, and you're now runing on an x86 machine, you may be running into endian problems.
It's not much help, but it is a common gotcha when going multiplat.
Hunch: the linking step didn't go right. Perhaps the wrong run-time startup library got linked in?
(How likely to find out what the real trouble was, months after the question was asked?)