I am reading doc to write NSIS installer. I tumbled on these lines:
the instructions that NSIS uses for scripting are sort of a cross between PHP and assembly. There are no real high level language constructs, but the instructions themselves are (for the most part) high level, and you have handy string capability (i.e. you don't have to worry about concatenating strings, etc). You essentially have 25 registers (20 general purpose, 5 special purpose), and a stack.
What does it mean that a language has got 25 registers and a stack? These are data structures. IMO these are related to memory management, not language syntax or structure. How do this relate to language structure/syntax ?
Thanks
The documentation refers to the registers
$0 .. $9
$R0 .. $R9
$CMDLINE
$INSTDIR
$OUTDIR
$EXEDIR
$LANGUAGE
You can call them variables as well, they are just pre-existing builtin global variables. The first 20 are general usage variables and the last ones have a dedicated usage in a nsis script.
Regarding the stack, you can actually pushing and poping arbitrary values during the execution of an installer (or uninstaller), but it is in some way a high level stack as you can push not just numerical registers but also strings. And you can play with the stack by swapping values as with Forth or the RPL language of HP calculators.
The registers are just variables that always exist (The ability to create more variables with var was added later.)
StrCpy $0 $windir 1
MessageBox mb_ok "The first character is $0"
Internally the NSIS interpreter works a little bit like a CPU, there is a instruction pointer and you can do relative jumps etc. There is no syntax sugar to hide the way you pass parameters to a function which is why you have to modify the NSIS stack directly:
Function DoMagic
pop $0 ; $0 now contains the value that was on top of the stack
; Do something with $0
FunctionEnd
...
push 0xf00ba5
call DoMagic
push "Hello world"
call DoMagic
There is mini guide to the stack on the wiki...
Related
I need to negotiate a function call, from my Delphi app, into provided DLL made in Clarion 6.3.
I need to pass one or two string parameters (either one functon wit htwo params or two single-params functions).
We quickly settled on using 1-byte 0-ended strings (char* in C terms, CSTRING in Clarion terms, PAnsiChar in Delphi terms), and that is where things got a bit unpredictable and hard too understand.
The working solution we got was passing untyped pointers disguised as 32-bit integers, which Clarion-made DLL then uses to traverse memory with something Clarion programmer called "pick" or maybe "peek". There are also forum articles on interop between Clarion and Visual Basic which address passing strings from VB into Clarion and glancing upon which from behind my shoulder the Clarion developer said something like "i don't need copy of it, i already know it, it is typical".
This however puts more burden on us long-term, as low-level untyped code is much "richer" on boilerplate and error-prone. Typed code would feel better solution.
What i seek here is less of "That is the pattern to copy-paste and make things work without thinking" - we already have it - and more of understanding, what is going on behind the hood, and how can i rely on it, and what should i expect from Clarion DLLs. To avoid getting eventually stuck in "works by chance" solution.
As i was glancing into Clarion 6.3 help from behind his shoulder, the help was not helful on low-level details. It was all about calling DLLs from Clarion, but not about being called. I also don't have Clarion on my machine, not do i want to, ahem, borrow it. And as i've been told sources of Clarion 6.3 runtime are not available to developers either.
Articles like interop between Clarion and VB or between Clarion and C# are not helpful, because they fuse idiosyncrasies of both languages and give yet less information about "bare metal" level.
Google Books pointed to "Clarion Tips & Techniques - David Harms" - and it seem to have interesting insights for Clarion seasoned ones, but i am Clarion zero. At least i was not able to figure out low-level interop-enabling details from it.
Is there maybe a way to make Clarion 6.3 save 'listing files' for the DLLs it make, a standard *.H header file maybe?
So, to repeat, what works, as expected was a function that was passing pointers on Delphi side ( procedure ...(const param1, param2: PAnsiChar); stdcall; which should translate to C stdcall void ...(char* p1, char* p2) and which allegedly look in Clarion something like (LONG, LONG), LONG, pascal, RAW.
This function takes two 32-bit parameters from stack in reverse order, uses them, and exits, passing return value (actually, unused garbage) in EAX register and clearing parameters from stack. Almost exactly stdcall, except that it seems to preserve EBX register for some obscure reason.
Clarion function entry:
04E5D38C 83EC04 sub esp,$04 ' allocate local vars
04E5D38F 53 push ebx ' ????????
04E5D390 8B44240C mov eax,[esp+$0c]
04E5D394 BBB4DDEB04 mov ebx,$04ebddb4
04E5D399 B907010000 mov ecx,$00000107
04E5D39E E889A1FBFF call $04e1752c ' clear off local vars before use
And its exit
00B8D500 8B442406 mov eax,[esp+$06] ' pick return value
00B8D504 5B pop ebx ' ????
00B8D505 83C41C add esp,$1c ' remove local vars
00B8D508 C20800 ret $0008 ' remove two 32-bits params from stack
Except for unexplainable for me manipulation with EBX and returning garbage result - it works as expected. But - untyped low-level operations in Clarion sources required.
Now the function that allegedly only takes one string parameter: on Delphi side - procedure ...(const param1: PAnsiChar); stdcall; which should translate to C stdcall void ...(char* p1) and which allegedly look in Clarion something like (*CSTRING), LONG, pascal, RAW.
Clarion function entry:
00B8D47C 83EC1C sub esp,$1c ' allocate local vars
00B8D47F 53 push ebx ' ????????
00B8D480 8D44240A lea eax,[esp+$0a]
00B8D484 BB16000000 mov ebx,$00000016
00B8D489 B990FEBD00 mov ecx,$00bdfe90
00B8D48E BA15000000 mov edx,$00000015
00B8D493 E82002FBFF call $00b3d6b8 ' clear off local vars before use
And its exit
04E5D492 8B442404 mov eax,[esp+$04] ' pick return value
04E5D496 5B pop ebx ' ????
04E5D497 83C404 add esp,$04 ' remove local vars
04E5D49A C20800 ret $0008 ' remove TWO 32-bits params from stack
What strucks here is that somehow TWO parameters are expected by the function, and only the second one is used (i did not see any reference to the first parameter in the x86 asm code). The function seems to work fine, if being called as procedure ...(const garbage: integer; const param1: PAnsiChar); stdcall; which should translate to C stdcall void ...(int garbage, char* p1).
This "invisible" parameter would look much like a Self/This pointer in object-oriented languages method functions, but the Clarion programmer told me with certainty there was no any objects involved. More so, his 'double-int' function does not seem expect invisible parameter either.
The aforementioned 'Tips' book describes &CSTRING and &STRING Clarion types as actually being two parameters behind the hood, pointer to the buffer and the buffer length. It however gives no information upon how specifically they are passed on stack though. But i was said Clation refused to make a DLL with exported &CSTRING-parametrized function.
I could suppose the invisible parameter is where Clarion wants to store function's return value (if there would had been any assignment to it in Clarion sources), crossing stdcall/PASCAL convention, but the assembler epilogue code shows clear use of EAX register for that, and again the 'double-LONG' function does not use it.
And, so, while i made the "works on my machine" quality code, that successfully calls that Clarion function, by voluntarily inserted a garbage-parameter - i feel rather fuzzy, because i can not understand what and why Clarion is doing there, and hence, what it can suddenly start doing in future after any seemingly unrelated changes.
What is that invisible parameter? Why can it happen there? What to expect from it?
If you are consuming a DLL from Clarion you can prototype with RAW - but procedures in a Clarion DLL cannot do this.
So in the Clarion DLL they can prototype as
Whatever Procedure(*Cstring parm1, *Cstring parm2),C,name('whatever')
And, as you note, from your side you should see this as 4 parameters, length, pointer, length, pointer. (knowing explicit max lengths is not a bad thing from a safety point of view anyway.)
the alternative is
Whatever Procedure(Long parm1, Long parm2),C,name('whatever')
Then from your side it's just 2 addresses.
But there's a bit more code on his side turning those incoming addresses into memory pointers. (yes, he can use PEEK and POKE but that's a bit of overkill)
(From memory he could just declare local variables as
parm1String &cstring,over(parm1)
parm2String &cstring,over(parm2)
but it's been decades since I did this, so I'm not 100% that syntax is legal.)
I'm using the Win32 API to stop/start/inspect/change thread state. Generally works pretty well. Sometimes it fails, and I'm trying to track down the cause.
I have one thread that is forcing context switches on other threads by:
thread stop
fetch processor state into windows context block
read thread registers from windows context block to my own context block
write thread registers from another context block into windows context block
restart thread
This works remarkably well... but ... very rarely, context switches seem to fail.
(Symptom: my multithread system blows sky high executing a strange places with strange register content).
The context control is accomplished by:
if ((suspend_count=SuspendThread(WindowsThreadHandle))<0)
{ printf("TimeSlicer Suspend Thread failure");
...
}
...
Context.ContextFlags = (CONTEXT_INTEGER | CONTEXT_CONTROL | CONTEXT_FLOATING_POINT);
if (!GetThreadContext(WindowsThreadHandle,&Context))
{ printf("Context fetch failure");
...
}
call ContextSwap(&Context); // does the context swap
if (ResumeThread(WindowsThreadHandle)<0)
{ printf("Thread resume failure");
...
}
None of the print statements ever get executed. I conclude that Windows thinks the context operations all happened reliably.
Oh, yes, I do know when a thread being stopped is not computing [e.g., in a system function] and won't attempt to stop/context switch it. I know this because each thread that does anything other-than-computing sets a thread specific "don't touch me" flag, while it is doing other-than-computing. (Device driver programmers will recognize this as the equivalent of "interrupt disable" instructions).
So, I wondered about the reliability of the content of the context block.
I added a variety of sanity tests on various register values pulled out of the context block; you can actually decide that ESP is OK (within bounds of the stack area defined in the TIB), PC is in the program that I expect or in a system call, etc. No surprises here.
I decided to check that the condition code bits (EFLAGS) were being properly read out; if this were wrong, it would cause a switched task to take a "wrong branch" when its state was restored. So I added the following code to verify that the purported EFLAGS register contains stuff that only look like EFLAGS according to the Intel reference manual (http://en.wikipedia.org/wiki/FLAGS_register).
mov eax, Context.EFlags[ebx] ; ebx points to Windows Context block
mov ecx, eax ; check that we seem to have flag bits
and ecx, 0FFFEF32Ah ; where we expect constant flag bits to be
cmp ecx, 000000202h ; expected state of constant flag bits
je #f
breakpoint ; trap if unexpected flag bit status
##:
On my Win 7 AMD Phenom II X6 1090T (hex core),
it traps occasionally with a breakpoint, with ECX = 0200h. Fails same way on my Win 7 Intel i7 system. I would ignore this,
except it hints the EFLAGS aren't being stored correctly, as I suspected.
According to my reading of the Intel (and also the AMD) reference manuals, bit 1 is reserved and always has the value "1". Not what I see here.
Obviously, MS fills the context block by doing complicated things on a thread stop. I expect them to store the state accurately. This bit isn't stored correctly.
If they don't store this bit correctly, what else don't they store?
Any explanations for why the value of this bit could/should be zero sometimes?
EDIT: My code dumps the registers and the stack on catching a breakpoint.
The stack area contains the context block as a local variable.
Both EAX, and the value in the stack at the proper offset for EFLAGS in the context block contain the value 0244h. So the value in the context block really is wrong.
EDIT2: I changed the mask and comparsion values to
and ecx, 0FFFEF328h ; was FFEF32Ah where we expect flag bits to be
cmp ecx, 000000200h
This seems to run reliably with no complaints. Apparently Win7 doesn't do bit 1 of eflags right, and it appears not to matter.
Still interested in an explanation, but apparently this is not the source of my occasional context switch crash.
Microsoft has a long history of squirreling away a few bits in places that aren't really used. Raymond Chen has given plenty of examples, e.g. using the lower bit(s) of a pointer that's not byte-aligned.
In this case, Windows might have needed to store some of its thread context in an existing CONTEXT structure, and decided to use an otherwise unused bit in EFLAGS. You couldn't do anything with that bit anyway, and Windows will get that bit back when you call SetThreadContext.
I'm programming extensions for a game which offers an API for (us) modders. This API offers a wide variety of things, but it has one limitation. The API is for the 'engine' only, which means that all modifications (mods) that has been released based on the engine, does not offer/have any sort of (mod specific) API. I have created a 'signature scanner' (note: my plugin is loaded as a shared library, compiled with -share & -fPIC) which finds the functions of interest (which is easy since I'm on linux). So to explain, I'll take a specific case: I have found the address to a function of interest, its function header is very simpleint * InstallRules(void);. It takes a nothing (void) and returns an integer pointer (to an object of my interest). Now, what I want to do, is to create a detour (and remember that I have the start address of the function), to my own function, which I would like to behave something like this:
void MyInstallRules(void)
{
if(PreHook() == block) // <-- First a 'pre' hook which can block the function
return;
int * val = InstallRules(); // <-- Call original function
PostHook(val); // <-- Call post hook, if interest of original functions return value
}
Now here's the deal; I have no experience what so ever about function hooking, and I only have a thin knowledge of inline assembly (AT&T only). The pre-made detour packages on the Internet is only for windows or is using a whole other method (i.e preloads a dll to override the orignal one). So basically; what should I do to get on track? Should I read about call conventions (cdecl in this case) and learn about inline assembly, or what to do? The best would probably be a already functional wrapper class for linux detouring. In the end, I would like something as simple as this:
void * addressToFunction = SigScanner.FindBySig("Signature_ASfs&43"); // I've already done this part
void * original = PatchFunc(addressToFunction, addressToNewFunction); // This replaces the original function with a hook to mine, but returns a pointer to the original function (relocated ofcourse)
// I might wait for my hook to be called or whatever
// ....
// And then unpatch the patched function (optional)
UnpatchFunc(addressToFunction, addressToNewFunction);
I understand that I won't be able to get a completely satisfying answer here, but I would more than appreciate some help with the directions to take, because I am on thin ice here... I have read about detouring but there is barely any documentation at all (specifically for linux), and I guess I want to implement what's known as a 'trampoline' but I can't seem to find a way how to acquire this knowledge.
NOTE: I'm also interested in _thiscall, but from what I've read that isn't so hard to call with GNU calling convention(?)
Is this project to develop a "framework" that will allow others to hook different functions in different binaries? Or is it just that you need to hook this specific program that you have?
First, let's suppose you want the second thing, you just have a function in a binary that you want to hook, programmatically and reliably. The main problem with doing this universally is that doing this reliably is a very tough game, but if you are willing to make some compromises, then it's definitely doable. Also let's assume this is x86 thing.
If you want to hook a function, there are several options how to do it. What Detours does is inline patching. They have a nice overview of how it works in a Research PDF document. The basic idea is that you have a function, e.g.
00E32BCE /$ 8BFF MOV EDI,EDI
00E32BD0 |. 55 PUSH EBP
00E32BD1 |. 8BEC MOV EBP,ESP
00E32BD3 |. 83EC 10 SUB ESP,10
00E32BD6 |. A1 9849E300 MOV EAX,DWORD PTR DS:[E34998]
...
...
Now you replace the beginning of the function with a CALL or JMP to your function and save the original bytes that you overwrote with the patch somewhere:
00E32BCE /$ E9 XXXXXXXX JMP MyHook
00E32BD3 |. 83EC 10 SUB ESP,10
00E32BD6 |. A1 9849E300 MOV EAX,DWORD PTR DS:[E34998]
(Note that I overwrote 5 bytes.) Now your function gets called with the same parameters and same calling convention as the original function. If your function wants to call the original one (but it doesn't have to), you create a "trampoline", that 1) runs the original instructions that were overwritten 2) jmps to the rest of the original function:
Trampoline:
MOV EDI,EDI
PUSH EBP
MOV EBP,ESP
JMP 00E32BD3
And that's it, you just need to construct the trampoline function in runtime by emitting processor instructions. The hard part of this process is to get it working reliably, for any function, for any calling convention and for different OS/platforms. One of the issues is that if the 5 bytes that you want to overwrite ends in a middle of an instruction. To detect "ends of instructions" you would basically need to include a disassembler, because there can be any instruction at the beginning of the function. Or when the function is itself shorter than 5 bytes (a function that always returns 0 can be written as XOR EAX,EAX; RETN which is just 3 bytes).
Most current compilers/assemblers produce a 5-byte long function prolog, exactly for this purpose, hooking. See that MOV EDI, EDI? If you wonder, "why the hell do they move edi to edi? that doesn't do anything!?" you are absolutely correct, but this is the purpose of the prolog, to be exactly 5-bytes long (not ending in a middle of an instruction). Note that the disassembly example is not something I made up, it's calc.exe on Windows Vista.
The rest of the hook implementation is just technical details, but they can bring you many hours of pain, because that's the hardest part. Also the behaviour you described in your question:
void MyInstallRules(void)
{
if(PreHook() == block) // <-- First a 'pre' hook which can block the function
return;
int * val = InstallRules(); // <-- Call original function
PostHook(val); // <-- Call post hook, if interest of original functions return value
}
seems worse than what I described (and what Detours does), for example you might want to "not call the original" but return some different value. Or call the original function twice. Instead, let your hook handler decide whether and where it will call the original function. Also then you don't need two handler functions for a hook.
If you don't have enough knowledge about the technologies you need for this (mostly assembly), or don't know how to do the hooking, I suggest you study what Detours does. Hook your own binary and take a debugger (OllyDbg for example) to see at assembly level what it exactly did, what instructions were placed and where. Also this tutorial might come in handy.
Anyway, if your task is to hook some functions in a specific program, then this is doable and if you have any trouble, just ask here again. Basically you can do a lot of assumptions (like the function prologs or used conventions) that will make your task much easier.
If you want to create some reliable hooking framework, then still is a completely different story and you should first begin by creating simple hooks for some simple apps.
Also note that this technique is not OS specific, it's the same on all x86 platforms, it will work on both Linux and Windows. What is OS specific is that you will probably have to change memory protection of the code ("unlock" it, so you can write to it), which is done with mprotect on Linux and with VirtualProtect on Windows. Also the calling conventions are different, that that's what you can solve by using the correct syntax in your compiler.
Another trouble is "DLL injection" (on Linux it will probably be called "shared library injection" but the term DLL injection is widely known). You need to put your code (that performs the hook) into the program. My suggestion is that if it's possible, just use LD_PRELOAD environment variable, in which you can specify a library that will be loaded into the program just before it's run. This has been described in SO many times, like here: What is the LD_PRELOAD trick?. If you must do this in runtime, I'm afraid you will need to get with gdb or ptrace, which in my opinion is quite hard (at least the ptrace thing) to do. However you can read for example this article on codeproject or this ptrace tutorial.
I also found some nice resources:
SourceHook project, but it seems it's only for virtual functions in C++, but you can always take a look at its source code
this forum thread giving a simple 10-line function to do this "inline hook" that I described
this a little more complex code in a forum
here on SO is some example
Also one other point: This "inline patching" is not the only way to do this. There are even simpler ways, e.g. if the function is virtual or if it's a library exported function, you can skip all the assembly/disassembly/JMP thing and simply replace the pointer to that function (either in the table of virtual functions or in the exported symbols table).
I am struggling to find a way to retrieve first character of the first command line argument in GAS. To clarify what I mean here how I do it in NASM:
main:
pop ebx
pop ebx
pop ebx ; get first argument string address into EBX register
cmp byte [ebx], 45 ; compare the first char of the argument string to ASCII dash ('-', dec value 45)
...
EDIT: Literal conversion to AT&T syntax and compiling it in GAS won't produce expected results. EBX value will not be recognized as a character.
I'm not sure to understand why you want, in 2011, to code an entire application in assembly (unless fun is your main motivation, and coding thousands of assembly lines is fun to you).
And if you do that, you probably don't want to call the entry point of your program main (in C on Gnu/Linux, that function is called from crt0.o or similar), but more probably start.
And if you want to understand the detailed way of starting an application in assembly, read the Assembly Howto and the Linux ABI supplement for x86-64 and similar documents for your particular system.
Ok I figured it out myself. Entry point should NOT be called main, but _start. Thanks Basile for a hint, +1.
I'm just finishing up a computer architecture course this semester where, among other things, we've been dabbling in MIPS assembly and running it in the MARS simulator. Today, out of curiosity, I started messing around with NASM on my Ubuntu box, and have basically just been piecing things together from tutorials and getting a feel for how NASM is different from MIPS. Here is the code snippet I'm currently looking at:
global _start
_start:
mov eax, 4
mov ebx, 1
pop ecx
pop ecx
pop ecx
mov edx, 200
int 0x80
mov eax, 1
mov ebx, 0
int 0x80
This is saved as test.asm, and assembled with nasm -f elf test.asm and linked with ld -o test test.o. When I invoke it with ./test anArgument, it prints 'anArgument', as expected, followed by however many characters it takes to pad that string to 200 characters total (because of that mov edx, 200 statement). The interesting thing, though, is that these padding characters, which I would have expected to be gibberish, are actually from the beginning of my environment variables, as displayed by the env command. Why is this printing out my environment variables?
Without knowing the actual answer or having the time to look it up, I'm guessing that the environment variables get stored in memory after the command line arguments. Your code is simply buffer overflowing into the environment variable strings and printing them too.
This actually makes sense, since the command line arguments are handled by the system/loader, as are the environment variables, so it makes sense that they are stored near each other. To fix this, you would need to find the length of the command line arguments and only print that many characters. Or, since I assume they are null terminated strings, print until you reach a zero byte.
EDIT:
I assume that both command line arguments and environment variables are stored in the initialized data section (.data in NASM, I believe)
In order to understand why you are getting environment variables, you need to understand how the kernel arranges memory on process startup. Here is a good explanation with a picture (scroll down to "Stack layout").
As long as you're being curious, you might want to work out how to print the address of your string (I think it's passed in and you popped it off the stack). Also, write a hex dump routine so you can look at that memory and other addresses you're curious about. This may help you discover things about the program space.
Curiosity may be the most important thing in your programmer's toolbox.
I haven't investigated the particulars of starting processes but I think that every time a new shell starts, a copy of the environment is made for it. You may be seeing the leftovers of a shell that was started by a command you ran, or a script you wrote, etc.