Assembly: Why does jumping to a label that returns via ret cause a segmentation fault? - linux

Linux Assembly Tutorial states:
there is one very important thing to remember: If you are planning to return from a procedure (with the RET instruction), don't jump to it! As in "never!" Doing that will cause a segmentation fault on Linux (which is OK – all your program does is terminate), but in DOS it may blow up in your face with various degrees of terribleness.
But I cannot understand why it causes a segmentation fault. It sounds just like returning from a function.
I have a situation where I need to implement the logic "If X happens, call procedure A. Otherwise, call procedure B." Is there any other way than jumping around like a kangaroo weaving spaghetti code?

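Because CALL pushes the return address (the address of the instruction after the CALL) onto the stack, and RET pops it off in order to get back to the call site. JMP (and related instructions) don't push anything onto the stack, so a RET reached via JMP pops whatever happens to be on top of the stack and jumps there. For "if X happens, call A, otherwise call B", use a conditional jump to choose between two CALL instructions instead of jumping into the procedures themselves.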
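To make the bookkeeping concrete, here is a toy model in C. It is not real x86: the struct and the toy_* names are invented for this illustration. It assumes the stack initially holds a single non-address value, the way a Linux process starts with argc on top of the stack at its entry point.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: an instruction pointer plus a small stack of values. */
enum { STACK_SLOTS = 8 };

struct toy_cpu {
    unsigned ip;                  /* address of the next instruction */
    unsigned stack[STACK_SLOTS];  /* stack[depth - 1] is the top of the stack */
    size_t   depth;               /* number of values currently pushed */
};

/* CALL: push the address of the instruction after the call, then jump. */
void toy_call(struct toy_cpu *cpu, unsigned target, unsigned insn_len)
{
    assert(cpu->depth < STACK_SLOTS);
    cpu->stack[cpu->depth++] = cpu->ip + insn_len;
    cpu->ip = target;
}

/* JMP: only changes ip; nothing is pushed. */
void toy_jmp(struct toy_cpu *cpu, unsigned target)
{
    cpu->ip = target;
}

/* RET: pop whatever is on top of the stack and jump there,
   whether or not a CALL put it there. */
void toy_ret(struct toy_cpu *cpu)
{
    assert(cpu->depth > 0);
    cpu->ip = cpu->stack[--cpu->depth];
}
```

Starting from ip = 0x100 with the value 1 on the stack, toy_call to 0x200 followed by toy_ret lands back at 0x105. toy_jmp to 0x200 followed by toy_ret lands at address 1, which on a real machine would be an unmapped address and hence a segmentation fault.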

I think that this advice may have to do with the pipeline, but I'm not sure.
I believe that the question you are asking is:
... subroutine entrypoint ...
... various instructions in a routine ...
jmp label
... more instructions in a routine ...
label:
ret
What's the problem, if any, with this? First, I'm not sure that this is a problem at all. But if it is, it's the pipeline. On some processors (those with branch delay slots, such as MIPS; x86 is not one of them), one or more instructions after the jmp will be executed before control moves to the label.
Mostly, I fear that you've misunderstood what you've read, or I've misunderstood what you've written. jmp-ing from one point in your subroutine to the ret instruction should be fine. jmp-ing instead of executing ret is, as other people pointed out, a dumb idea.

Related

Can I track control flow in a debugger?

Using the bt (backtrace) command in gdb,
we can roughly see the locations that control flow has been through.
However, this is available only when the branch instruction is a call
(which saves the return address on the stack).
I'm curious whether I can track jmp in a similar manner.
As you know, jmp does not save a return address...
My situation:
More precisely, I'm having trouble with the situation below.
0x9230 push %ebx // Where %ebx comes from?
0x9231 mov 0x8(%esp),%eax
I want to know where the %ebx value comes from.
Somewhere, control flow switched to here using jmp.
I want to know where that somewhere is.
Question:
Is there any way to track jmp instructions?
(Or is there any possible application I can use?)
More precisely, I'm having trouble with the situation below.
0x9230 push %ebx // Where %ebx comes from?
In a situation like this, a reverse debugger is likely to be most useful.
Last time I tried it, rr worked wonders for answering "where did this come from?" questions.
Here you would simply use reverse-stepi until you find an instruction that updated %ebx.
If your problem is only with the direct jmp family (jmp, jz, jnz, ...), you can disassemble your code and check which jump instructions target that address.
The best commercial disassembler is IDA, which has a built-in debugger too, but getting it may be tricky. Other disassemblers can normally show direct jump references too, so it may be worth trying a free one on Linux.
But if your jump is indirect, like jmp [eax], you need to find it through debugging. As far as I remember, Windows debuggers like OllyDbg or x64dbg can show the source of an indirect jump when it happens, but these are Windows only.

Linux exit function

I am trying to understand the Linux syscall mechanism. I am reading a book, and the book says that the exit function looks like this (in gdb):
mov $0x0,%ebx
mov $0x1,%eax
int $0x80
I understand that this is a syscall to exit, but in my Debian it looks like that:
jmp *0x8049698
push $0x8
jmp 0x80482c0
Maybe someone can explain to me why it's not the same? When I try to run disas on 0x80482c0,
gdb prints me:
No function contains specified address.
Also, can someone give me a good reference to Linux Internals material (as Windows Internals)?
Thanks!
The function you most likely called is exit() from the C standard library (see man 3 exit). It is a library function which, in turn, makes the exit system call; it is not a system call itself. You will not see that good-looking int 0x80 code in your C program's disassembly. Functions such as exit() and syscall() live in a library, so your program only contains a call into that library; the functions themselves do not belong to your program. The jmp *0x8049698 / push $0x8 / jmp 0x80482c0 sequence you disassembled is the PLT stub through which the dynamic linker routes that call to the library's exit(); it is not the body of exit().
If you want to see exactly that int 0x80 code, you can inline that asm code in your C application. This is considered bad practice, though, as your code becomes architecture-dependent (only applicable to the x86 architecture, in your case).
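If the goal is only to issue a system call yourself without writing architecture-specific asm, glibc's syscall() wrapper is one option. The sketch below is a minimal illustration, assuming Linux with glibc; raw_getpid and raw_exit are names made up for this example.

```c
#include <assert.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Ask the kernel for our PID by system call number instead of calling getpid(). */
pid_t raw_getpid(void)
{
    long pid = syscall(SYS_getpid);
    assert(pid > 0);              /* the getpid system call cannot fail */
    return (pid_t)pid;
}

/* Terminate via the exit system call directly.
   Unlike exit(3), this runs no atexit handlers and flushes no stdio buffers. */
void raw_exit(int status)
{
    syscall(SYS_exit, status);
}
```

raw_getpid() should return the same value as getpid(). How the wrapper actually enters the kernel (int $0x80, sysenter, syscall, ...) is the library's choice for the target architecture, which is exactly the detail the inline-asm version hard-codes.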
can someone give me a good reference to Linux Internals material
The code itself is the best up-to-date reference. All books are more or less outdated. Also look into Documentation/ directory in kernel sources.

Requesting examples of sys_fork in nasm

I'm trying to run bzip and have it return control to the calling function from inside a nasm-coded assembly program (under Linux). I apparently need to use a combination of the sys_fork and sys_execve system calls to achieve this. After much searching, I found an example of how to use sys_execve, however I can't find an example of how to use sys_fork. Any help with my request will be appreciated.
My experience is limited, but as I recall sys_fork is easy. "Just do it" - no parameters. At this point, you're "in two places at once". If eax is zero, you're the child - do sys_execve on bzip. If eax is positive, you're the parent and eax is the child's PID (a negative value means the fork failed). Do a sys_waitpid on that PID. As I recall, the status it stores holds the exit code of bzip shifted left 8 bits. sys_execve itself never returns if it succeeds.
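The same sequence is easier to see in C first and then translate to nasm. This is a minimal sketch using the libc wrappers rather than raw system calls; run_and_wait is a name invented for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

extern char **environ;

/* Run the program at argv[0] with the given arguments and wait for it.
   Returns the child's exit code (0..255), or -1 on failure or abnormal exit. */
int run_and_wait(char *const argv[])
{
    assert(argv != NULL && argv[0] != NULL);

    pid_t pid = fork();
    if (pid < 0)
        return -1;                      /* negative: fork failed */

    if (pid == 0) {                     /* zero: we are the child */
        execve(argv[0], argv, environ);
        _exit(127);                     /* reached only if execve failed */
    }

    /* positive: we are the parent and pid is the child's PID */
    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    if (!WIFEXITED(status))
        return -1;                      /* killed by a signal, for example */
    return WEXITSTATUS(status);         /* on Linux: bits 8..15 of status */
}
```

For example, passing { "/bin/sh", "-c", "exit 3", NULL } returns 3. For bzip you would pass the full path of the binary as argv[0], since execve does not search PATH; the exact path depends on your system.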
I have a crude example that runs an editor, nasm, and ld (all on a hard-coded "hello.asm"). Longish to post, but I can make it available some way if you need it. I found getting the correct parameters to sys_execve the hardest part, as I recall.

Detouring and GCC inline assembly (Linux)

I'm programming extensions for a game which offers an API for (us) modders. This API offers a wide variety of things, but it has one limitation: it covers the 'engine' only, which means that the modifications (mods) released on top of the engine do not have any (mod-specific) API of their own. I have created a 'signature scanner' (note: my plugin is loaded as a shared library, compiled with -shared and -fPIC) which finds the functions of interest (which is easy since I'm on Linux). To take a specific case: I have found the address of a function of interest, and its header is very simple: int * InstallRules(void);. It takes nothing (void) and returns an integer pointer (to an object of interest to me). Now, what I want to do is create a detour (remember that I have the start address of the function) to my own function, which I would like to behave something like this:
void MyInstallRules(void)
{
    if(PreHook() == block) // <-- First a 'pre' hook which can block the function
        return;
    int * val = InstallRules(); // <-- Call original function
    PostHook(val); // <-- Call post hook, if interest of original functions return value
}
Now here's the deal: I have no experience whatsoever with function hooking, and I only have a thin knowledge of inline assembly (AT&T only). The pre-made detour packages on the Internet are only for Windows or use a whole other method (i.e. preloading a DLL to override the original one). So basically: what should I do to get on track? Should I read about calling conventions (cdecl in this case) and learn about inline assembly, or something else? The best would probably be an already functional wrapper class for Linux detouring. In the end, I would like something as simple as this:
void * addressToFunction = SigScanner.FindBySig("Signature_ASfs&43"); // I've already done this part
void * original = PatchFunc(addressToFunction, addressToNewFunction); // This replaces the original function with a hook to mine, but returns a pointer to the original function (relocated ofcourse)
// I might wait for my hook to be called or whatever
// ....
// And then unpatch the patched function (optional)
UnpatchFunc(addressToFunction, addressToNewFunction);
I understand that I won't be able to get a completely satisfying answer here, but I would more than appreciate some help with the directions to take, because I am on thin ice here... I have read about detouring but there is barely any documentation at all (specifically for linux), and I guess I want to implement what's known as a 'trampoline' but I can't seem to find a way how to acquire this knowledge.
NOTE: I'm also interested in _thiscall, but from what I've read that isn't so hard to call with GNU calling convention(?)
Is this project to develop a "framework" that will allow others to hook different functions in different binaries? Or is it just that you need to hook this specific program that you have?
First, let's suppose you want the second thing: you just have a function in a binary that you want to hook, programmatically and reliably. The main problem with doing this universally is that doing it reliably is a very tough game, but if you are willing to make some compromises, then it's definitely doable. Also let's assume this is an x86 thing.
If you want to hook a function, there are several options how to do it. What Detours does is inline patching. They have a nice overview of how it works in a Research PDF document. The basic idea is that you have a function, e.g.
00E32BCE /$ 8BFF MOV EDI,EDI
00E32BD0 |. 55 PUSH EBP
00E32BD1 |. 8BEC MOV EBP,ESP
00E32BD3 |. 83EC 10 SUB ESP,10
00E32BD6 |. A1 9849E300 MOV EAX,DWORD PTR DS:[E34998]
...
...
Now you replace the beginning of the function with a CALL or JMP to your function and save the original bytes that you overwrote with the patch somewhere:
00E32BCE /$ E9 XXXXXXXX JMP MyHook
00E32BD3 |. 83EC 10 SUB ESP,10
00E32BD6 |. A1 9849E300 MOV EAX,DWORD PTR DS:[E34998]
(Note that I overwrote 5 bytes.) Now your function gets called with the same parameters and same calling convention as the original function. If your function wants to call the original one (but it doesn't have to), you create a "trampoline", that 1) runs the original instructions that were overwritten 2) jmps to the rest of the original function:
Trampoline:
MOV EDI,EDI
PUSH EBP
MOV EBP,ESP
JMP 00E32BD3
And that's it; you just need to construct the trampoline function at runtime by emitting processor instructions. The hard part of this process is getting it to work reliably, for any function, for any calling convention and for different OSes/platforms. One issue is that the 5 bytes you want to overwrite may end in the middle of an instruction. To detect where instructions end you would basically need to include a disassembler, because there can be any instruction at the beginning of the function. Another is that the function itself may be shorter than 5 bytes (a function that always returns 0 can be written as XOR EAX,EAX; RETN which is just 3 bytes).
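The 5 bytes in question are the opcode E9 followed by a 32-bit little-endian displacement, measured from the end of the jump instruction. A small sketch of that arithmetic for 32-bit x86 follows; encode_jmp_rel32 is a name invented here, and it only fills a buffer, it does not patch anything.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { JMP_REL32_LEN = 5 };

/* Write "E9 rel32" into out[0..4].
   from: address where the jump instruction will be placed
   to:   address the jump should land on
   The displacement is relative to the address just past the 5-byte instruction. */
void encode_jmp_rel32(uint8_t out[JMP_REL32_LEN], uint32_t from, uint32_t to)
{
    assert(out != NULL);
    uint32_t rel = to - (from + JMP_REL32_LEN);  /* wraps modulo 2^32, like the CPU */
    out[0] = 0xE9;
    out[1] = (uint8_t)(rel);
    out[2] = (uint8_t)(rel >> 8);
    out[3] = (uint8_t)(rel >> 16);
    out[4] = (uint8_t)(rel >> 24);
}
```

With the function above at 00E32BCE and a hook at a made-up address 00E40000, the displacement is 00E40000 - 00E32BD3 = 0000D42D, so the patch bytes are E9 2D D4 00 00. The trampoline's final JMP back to 00E32BD3 is encoded the same way, with from set to wherever that JMP sits in the trampoline.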
Some toolchains deliberately produce a 5-byte function prolog to make hooking easier; Microsoft's hot-patchable Windows binaries are the well-known case, and you should not expect the same from GCC output on Linux. See that MOV EDI, EDI? If you wonder, "why the hell do they move edi to edi? that doesn't do anything!?" you are absolutely correct: it is deliberate padding, and with it the prolog is exactly 5 bytes long and does not end in the middle of an instruction. Note that the disassembly example is not something I made up, it's calc.exe on Windows Vista.
The rest of the hook implementation is just technical details, but they can bring you many hours of pain, because that's the hardest part. Also the behaviour you described in your question:
void MyInstallRules(void)
{
    if(PreHook() == block) // <-- First a 'pre' hook which can block the function
        return;
    int * val = InstallRules(); // <-- Call original function
    PostHook(val); // <-- Call post hook, if interest of original functions return value
}
seems worse than what I described (and what Detours does), for example you might want to "not call the original" but return some different value. Or call the original function twice. Instead, let your hook handler decide whether and where it will call the original function. Also then you don't need two handler functions for a hook.
If you don't have enough knowledge about the technologies you need for this (mostly assembly), or don't know how to do the hooking, I suggest you study what Detours does. Hook your own binary and take a debugger (OllyDbg for example) to see at assembly level what it exactly did, what instructions were placed and where. Also this tutorial might come in handy.
Anyway, if your task is to hook some functions in a specific program, then this is doable and if you have any trouble, just ask here again. Basically you can make a lot of assumptions (like the function prologs or used conventions) that will make your task much easier.
If you want to create some reliable hooking framework, then that is a completely different story and you should first begin by creating simple hooks for some simple apps.
Also note that this technique is not OS specific; it's the same on all x86 platforms and will work on both Linux and Windows. What is OS specific is that you will probably have to change the memory protection of the code ("unlock" it, so you can write to it), which is done with mprotect on Linux and with VirtualProtect on Windows. The calling conventions are also different, but that's something you can solve by using the correct syntax in your compiler.
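On Linux, mprotect works on whole pages, so the address of the bytes you want to patch has to be rounded down to a page boundary first. A minimal sketch of that step follows; set_protection is a helper name made up for this example, and which protection bits you are allowed to request (for instance write plus execute together) depends on the system's security policy.

```c
#define _DEFAULT_SOURCE   /* exposes MAP_ANONYMOUS etc. under strict -std modes */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Change the protection of every page that overlaps [addr, addr + len).
   Returns 0 on success and -1 on failure, like mprotect itself. */
int set_protection(void *addr, size_t len, int prot)
{
    assert(addr != NULL && len > 0);

    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0)
        return -1;

    /* Round the start down to a page boundary; mprotect requires that. */
    uintptr_t start = (uintptr_t)addr & ~((uintptr_t)page - 1);
    uintptr_t end   = (uintptr_t)addr + len;

    return mprotect((void *)start, (size_t)(end - start), prot);
}
```

Before writing a 5-byte patch at a function's address you might call set_protection(func, 5, PROT_READ | PROT_WRITE | PROT_EXEC), and restore PROT_READ | PROT_EXEC afterwards. The length matters when the 5 bytes straddle a page boundary, because then two pages have to be unlocked.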
Another problem is "DLL injection" (on Linux it would probably be called "shared library injection", but the term DLL injection is widely known). You need to put your code (the code that performs the hook) into the program. My suggestion is that, if possible, you just use the LD_PRELOAD environment variable, in which you can specify a library that will be loaded into the program just before it runs. This has been described on SO many times, like here: What is the LD_PRELOAD trick?. If you must do this at runtime, I'm afraid you will need to work with gdb or ptrace, which in my opinion is quite hard (at least the ptrace thing). However, you can read for example this article on codeproject or this ptrace tutorial.
I also found some nice resources:
SourceHook project, but it seems it's only for virtual functions in C++, but you can always take a look at its source code
this forum thread giving a simple 10-line function to do this "inline hook" that I described
this a little more complex code in a forum
here on SO is some example
Also one other point: This "inline patching" is not the only way to do this. There are even simpler ways, e.g. if the function is virtual or if it's a library exported function, you can skip all the assembly/disassembly/JMP thing and simply replace the pointer to that function (either in the table of virtual functions or in the exported symbols table).
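The pointer-replacement variant can be sketched in plain C. Everything here is a stand-in: install_rules_slot plays the role of a vtable entry or exported-symbol slot, and InstallRules_impl, block_call and post_hook_seen are invented for the illustration. The hook itself decides whether to call the original, as suggested earlier.

```c
#include <assert.h>
#include <stddef.h>

typedef int *(*install_rules_fn)(void);

static int rules_object = 42;                     /* stand-in for the game's object */
static int *InstallRules_impl(void) { return &rules_object; }

/* Stand-in for the slot that callers go through (vtable entry, export table, ...). */
install_rules_fn install_rules_slot = InstallRules_impl;

static install_rules_fn original;   /* saved so the hook can call through */
int block_call;                     /* non-zero: the hook skips the original */
int post_hook_seen;                 /* what the "post hook" observed last */

/* Swap the pointer in a slot and hand back the previous one. */
install_rules_fn patch_slot(install_rules_fn *slot, install_rules_fn hook)
{
    assert(slot != NULL && hook != NULL);
    install_rules_fn old = *slot;
    *slot = hook;
    return old;
}

/* The hook has the same signature as the original, so callers cannot tell. */
int *MyInstallRules(void)
{
    if (block_call)
        return NULL;                /* "pre hook" blocked the call */
    int *val = original();          /* call the original function */
    post_hook_seen = *val;          /* "post hook" sees the return value */
    return val;
}

void install_hook(void) { original = patch_slot(&install_rules_slot, MyInstallRules); }
void remove_hook(void)  { patch_slot(&install_rules_slot, original); }
```

After install_hook(), a call through install_rules_slot() reaches MyInstallRules first; after remove_hook() it goes straight to the original again. No code bytes are modified, which is why this route needs neither mprotect nor a disassembler.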

What is the purpose of this code segment from glibc

I am trying to understand what the following code segment from tls.h in glibc is doing and why:
/* Macros to load from and store into segment registers. */
# define TLS_GET_FS() \
({ int __seg; __asm ("movl %%fs, %0" : "=q" (__seg)); __seg; })
I think I understand the basic operation: it is moving the value stored in the fs register to __seg. However, I have some questions:
My understanding is the fs is only 16-bits. Is this correct? What happens when the value gets moved to a quadword memory location? Does this mean the upper bits get set to 0?
More importantly I think that the scope of the variable __seg that gets declared at the start of the segment is limited to this segment. So how is __seg useful? I'm sure that the authors of glibc have a good reason for doing this but I can't figure out what it is from looking at the source code.
I tried generating assembly for this code and I got the following:
#APP
# 13 "fs-test.cpp" 1
movl %fs, %eax
# 0 "" 2
#NO_APP
So in my case it looks like eax was used for __seg. But I don't know if that is always what happens or if it was just what happened in the small test file that I compiled. If it is always going to use eax why wouldn't the assembly be written that way? If the compiler might pick other registers then how will the programmer know which one to access since __seg goes out of scope at the end of the macro? Finally I did not see this macro used anywhere when I grepped for it in the glibc source code, so that further adds to my confusion about what its purpose is. Any explanation about what the code is doing and why is appreciated.
My understanding is the fs is only 16-bits. Is this correct? What happens when the value gets moved to a quadword memory location? Does this mean the upper bits get set to 0?
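Yes, fs holds a 16-bit selector. Note that "=q" is a register constraint (a register with a byte-addressable low part, such as eax, ebx, ecx or edx on 32-bit x86), not a quadword: __seg is an ordinary 32-bit int. On current processors a movl from a segment register zero-extends the selector into the upper 16 bits of the destination register.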
the variable __seg that gets declared at the start of the segment is limited to this segment. So how is __seg useful?
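You have to read about the GCC statement-expression extension. The value of a statement expression is the value of the last expression in it, which is why the macro ends in __seg;. The compiler is free to pick the register, and the caller never needs to know which one, because the value is handed back as the value of the macro. That value is discarded unless the caller uses it, for example by assigning it: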
int foo = TLS_GET_FS();
Finally I did not see this macro used anywhere when I grepped for it in the glibc source code
The TLS_{GET,SET}_FS in fact do not appear to be used. They probably were used in some version, then accidentally left over when the code referencing them was removed.
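A macro with the same shape as TLS_GET_FS(), minus the inline asm, shows how the value escapes the inner scope. This is a sketch that assumes a compiler supporting the statement-expression extension (GCC or Clang); NEXT_COUNTER, counter and take_two are names invented for the example.

```c
#include <assert.h>

static int counter = 41;

/* Same shape as TLS_GET_FS(): declare a temporary, fill it in,
   and name it last so that it becomes the value of the whole ({ ... }). */
#define NEXT_COUNTER() \
    ({ int val_; val_ = ++counter; val_; })

int take_two(void)
{
    /* val_ is out of scope out here, but its value was handed back each time. */
    int first  = NEXT_COUNTER();
    int second = NEXT_COUNTER();
    assert(second == first + 1);
    return first + second;
}
```

The first call to take_two() returns 42 + 43 = 85. Each expansion gets its own val_, so using the macro twice in one function does not clash.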
