I'm using Hex-Rays's IDA Pro to decompile a binary. I have this switch:
case 0x35:
CField::OnDesc_MAYB(v6, a6);
break;
case 0x36:
(*(void (__thiscall **)(_DWORD, _DWORD))(*(_DWORD *)(a1 - 8) + 28))(a1 - 8, a6);
break;
case 0x3A:
CField::OnWarnMessage(v6, a6);
break;
I can't understand the statement in case 0x36. Usually I just point at the function and decompile it using the F5 shortcut, but here I don't understand what the statement means. How can I decode it to view its code?
Thanks.
Case 0x36 is invoking a virtual function, or at least what Hex-Rays believes to be a virtual function. Consider the following pseudo-C++ code (reinterpret_casts omitted for brevity), which deconstructs that one line.
// in VC++, 'this' is usually passed via ECX register
typedef void (__thiscall* member_function_t)(_DWORD this_ptr, _DWORD arg_0);
// a1's declaration wasn't included in your post, so I'm making an assumption here
byte* a1 = address_of_some_child_object;
// It would appear a1 is a pointer to an object which has multiple vftables (due to multiple inheritance/interfaces)
byte*** base_object = (byte***)(a1 - 8);
// Dereference the pointer at a1[-8] to get the base's vftable pointer (constant list of function pointers for the class's virtual funcs)
// a1[0] would probably be the child/interface's vftable pointer
byte** base_object_vftable = *base_object;
// 28 is a byte offset: 28 / sizeof(void*) = index 7 on 32-bit x86, i.e. the 8th virtual function in the vftable
byte* base_object_member_function = base_object_vftable[28 / sizeof(void*)];
auto member_function = (member_function_t)base_object_member_function;
// case 0x36 simplified using a __thiscall function pointer
member_function((_DWORD)base_object, a6);
Deconstructed from:
(
*(
void (__thiscall **)(_DWORD, _DWORD)
)
(*
(_DWORD *)(a1 - 8) + 28
)
)
(a1 - 8, a6);
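To make the pattern concrete, here is a minimal portable sketch that builds a vftable by hand and calls through it the same way the decompiled expression does. Everything in it (FakeObject, add_value, mul_value, kVftable, call_slot) is a hypothetical name for illustration; a real compiler-generated vftable has the same general shape, but its exact layout is up to the compiler.

```cpp
#include <cstddef>

struct FakeObject;

// A "virtual function" here is an ordinary function taking the object first,
// which is what __thiscall amounts to (with 'this' in ECX under VC++ on x86).
using Method = int (*)(FakeObject* self, int arg);

struct FakeObject {
    const Method* vftable;  // first field, playing the role of the vftable pointer
    int value;
};

int add_value(FakeObject* self, int arg) { return self->value + arg; }
int mul_value(FakeObject* self, int arg) { return self->value * arg; }

// Hand-written table of function pointers, one slot per "virtual" function.
const Method kVftable[] = { add_value, mul_value };

// Same steps as the decompiled line: load the table pointer from the object,
// step byte_offset bytes into the table, load the function pointer found
// there, then call it with the object as the first argument.
int call_slot(FakeObject* obj, std::size_t byte_offset, int arg) {
    const char* table = reinterpret_cast<const char*>(obj->vftable);
    Method fn = *reinterpret_cast<const Method*>(table + byte_offset);
    return fn(obj, arg);
}
```

Calling call_slot(&obj, sizeof(Method), x) selects the second slot, just as the +28 in your listing selects a slot by byte offset rather than by index.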
If you're unfamiliar with __thiscall calling convention, or how virtual functions are typically implemented in C++, you should probably read up on them before trying to reverse engineer programs which use them.
You could start with these breakdowns:
vftable - what is this?
Reversing Microsoft Visual C++ Part II: Classes, Methods and RTTI
In my application's InitInstance function, I have the following code to rewrite the location of the CHM Help Documentation:
CString strHelp = GetProgramPath();
strHelp += _T("MeetSchedAssist.CHM");
free((void*)m_pszHelpFilePath);
m_pszHelpFilePath = _tcsdup(strHelp);
It is all functional but it gives me a code analysis warning:
C26408 Avoid malloc() and free(), prefer the nothrow version of new with delete (r.10).
When you look at the official documentation for m_pszHelpFilePath it does state:
If you assign a value to m_pszHelpFilePath, it must be dynamically allocated on the heap. The CWinApp destructor calls free( ) with this pointer. You many want to use the _tcsdup( ) run-time library function to do the allocating. Also, free the memory associated with the current pointer before assigning a new value.
Is it possible to rewrite this code to avoid the code analysis warning, or must I add a __pragma?
You could (should?) use a smart pointer to wrap your reallocated m_pszHelpFilePath buffer. This is not entirely trivial, but it can be accomplished without too much trouble.
First, declare an appropriate std::unique_ptr member in your derived application class:
class MyApp : public CWinApp // Presumably
{
// Add this member...
public:
std::unique_ptr<TCHAR[]> spHelpPath;
// ...
};
Then, you will need to modify the code that constructs and assigns the help path as follows (I've changed your C-style cast to an arguably better C++ cast):
// First three (almost) lines as before ...
CString strHelp = GetProgramPath();
strHelp += _T("MeetSchedAssist.CHM");
free(const_cast<TCHAR *>(m_pszHelpFilePath));
// Next, allocate the unique_ptr buffer and copy the string...
size_t strSize = static_cast<size_t>(strHelp.GetLength() + 1);
spHelpPath = std::make_unique<TCHAR[]>(strSize);
_tcscpy_s(spHelpPath.get(), strSize, strHelp.GetString()); // Use the "_s" 'safe' version!
// Now, we can use the embedded raw pointer for m_pszHelpFilePath ...
m_pszHelpFilePath = spHelpPath.get();
So far, so good. The data allocated in the smart pointer will be automatically freed when your application object is destroyed, and the code analysis warnings should disappear. However, there is one last modification we need to make, to prevent the MFC framework from attempting to free our assigned m_pszHelpFilePath pointer. This can be done by setting that to nullptr in the MyApp class override of ExitInstance:
int MyApp::ExitInstance()
{
// <your other exit-time code>
m_pszHelpFilePath = nullptr;
return CWinApp::ExitInstance(); // Call base class
}
However, this may seem like much ado about nothing and, as others have said, you may be justified in simply suppressing the warning.
Technically, you can take advantage of the fact that new / delete map to the usual malloc / free by default in Visual C++, and simply replace one with the other. Portability won't suffer much, as MFC is not portable anyway. You can of course use unique_ptr<TCHAR[]> instead of direct new / delete, like this:
CString strHelp = GetProgramPath();
strHelp += _T("MeetSchedAssist.CHM");
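// m_pszHelpFilePath is a pointer to const, so the const must be cast away to take ownership
std::unique_ptr<TCHAR[]> str_old(const_cast<TCHAR*>(m_pszHelpFilePath));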
auto str_new = std::make_unique<TCHAR[]>(strHelp.GetLength() + 1);
_tcscpy_s(str_new.get(), strHelp.GetLength() + 1, strHelp.GetString());
m_pszHelpFilePath = str_new.release();
str_old.reset();
However, for robustness in case operator new has been replaced, and by the principle of least surprise, you should keep free / strdup.
If you replace several of those CWinApp strings, I suggest writing a single function for them, so that free / strdup and the warning suppression live in one place.
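A minimal sketch of such a helper, written with plain char and the standard malloc / free so that it is portable. The name replace_malloc_string is hypothetical. In the MFC case you would use TCHAR / _tcsdup instead, and the C26408 suppression would go around this one function only.

```cpp
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <string>

// Replaces a malloc-owned C string held in 'slot' with a malloc'ed copy of
// 'value'. The old buffer is released with free(), so the slot can still be
// handed to code that will later call free() on it (as the CWinApp
// destructor does for m_pszHelpFilePath).
// Returns false and leaves the slot untouched if allocation fails.
bool replace_malloc_string(const char*& slot, const std::string& value) {
    const std::size_t bytes = value.size() + 1;  // include the terminating NUL
    char* copy = static_cast<char*>(std::malloc(bytes));
    if (copy == nullptr) {
        return false;
    }
    std::memcpy(copy, value.c_str(), bytes);
    std::free(const_cast<char*>(slot));  // free(nullptr) is a harmless no-op
    slot = copy;
    return true;
}
```

The new buffer is allocated before the old one is freed, so a failed allocation never leaves the slot dangling.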
I have a dynamically linked ELF executable on Linux, and I want to swap a function in a library it is linked against. With LD_PRELOAD I can, of course, supply a small library with a replacement for the function that I compile myself. However, what if in the replacement I want to call the original library function? For example, the function may be srand(), and I want to hijack it with my own seed choice but otherwise let srand() do whatever it normally does.
If I were linking to make said executable, I would use the wrap option of the linker but here I only have the compiled binary.
One trivial solution I see is to cut and paste the source code for the original library function into the replacement - but I want to handle the more general case when the source is unavailable. Or, I could hex edit the needed extra code into the binary but that is specific to the binary and also time consuming. Is something more elegant possible than either of these? Such as some magic with the loader?
(Apologies if I were not using the terminology precisely...)
Here's an example of wrapping malloc:
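#define _GNU_SOURCE // needed for RTLD_NEXT
#include <assert.h>
#include <dlfcn.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
// get_seconds() and send_signal() are helper functions not shown here
// LD_PRELOAD will cause the process to call this instead of malloc(3)
// report malloc(size) calls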
void *malloc(size_t size)
{
// on first call, get a function pointer for malloc(3)
static void *(*real_malloc)(size_t) = NULL;
static int malloc_signal = 0;
if(!real_malloc)
{
// real_malloc = (void *(*)(size_t))dlsym(RTLD_NEXT, "malloc");
*(void **) (&real_malloc) = dlsym(RTLD_NEXT, "malloc");
}
assert(real_malloc);
if (malloc_signal == 0)
{
char *string = getenv("MW_MALLOC_SIGNAL");
if (string != NULL)
{
malloc_signal = 1;
}
}
// call malloc(3)
void *retval = real_malloc(size);
fprintf(stderr, "MW! %f malloc size %zu, address %p\n", get_seconds(), size, retval);
if (malloc_signal == 1)
{
send_signal(SIGUSR1);
}
return retval;
}
The canonical answer is to use dlsym(RTLD_NEXT, ...).
From the man page:
RTLD_NEXT
Find the next occurrence of the desired symbol in the search
order after the current object. This allows one to provide a
wrapper around a function in another shared object, so that,
for example, the definition of a function in a preloaded
shared object (see LD_PRELOAD in ld.so(8)) can find and invoke
the "real" function provided in another shared object (or for
that matter, the "next" definition of the function in cases
where there are multiple layers of preloading).
See also this article.
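Applied to the srand() case from the question, a wrapper could look roughly like the sketch below. The constant kFixedSeed and the file names in the build commands are made up for illustration, and the noexcept assumes glibc, whose headers declare srand() as non-throwing.

```cpp
#include <dlfcn.h>   // dlsym, RTLD_NEXT (g++ on Linux defines _GNU_SOURCE by default)
#include <cstdlib>

namespace {
// Hypothetical seed that the wrapper forces on every caller.
const unsigned int kFixedSeed = 12345u;

using srand_fn = void (*)(unsigned int);

// Look up the next definition of srand() after this object in the search
// order, which is normally the one in libc.
srand_fn real_srand() {
    static const srand_fn fn =
        reinterpret_cast<srand_fn>(dlsym(RTLD_NEXT, "srand"));
    return fn;
}
}  // namespace

// Same name and C linkage as the libc function, so that when this object is
// preloaded, calls to srand() from the executable resolve here first.
extern "C" void srand(unsigned int /*requested_seed*/) noexcept {
    if (srand_fn fn = real_srand()) {
        fn(kFixedSeed);  // ignore the caller's seed, use ours, then let libc do the rest
    }
}
```

You would build this as a shared object, for example with g++ -shared -fPIC wrap_srand.cpp -o libwrap_srand.so -ldl, and run the program with LD_PRELOAD=./libwrap_srand.so ./exefile.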
Just for completeness, regarding editing the function name in the binary - I checked and it works but not without potential hiccups. E.g., in the example I mentioned, one can find the offset of "srand" (e.g., via strings -t x exefile | grep srand) and hex edit the string to "sran0". But names of symbols may be overlapping (to save space), so if the code also calls rand(), then there is only one "srand" string in the binary for both. After the change the unresolved references will then be to sran0 and ran0. Not a showstopper, of course, but something to keep in mind. The dlsym() solution is certainly more flexible.
I am working in a legacy codebase with a large amount of Objective-C++ written using manual retain/release. Memory is managed using lots of C++ std::shared_ptr<NSMyCoolObjectiveCPointer>, with a suitable deleter passed in on construction that calls release on the contained object. This seems to work great; however, when enabling UBSan, it complains about misaligned pointers, usually when dereferencing the shared_ptrs to do some work.
I've searched for clues and/or solutions, but it's difficult to find technical discussion of the ins and outs of Objective-C object pointers, and even more difficult to find any discussion about Objective-C++, so here I am.
Here is a full Objective-C++ program that demonstrates my problem. When I run this on my Macbook with UBSan, I get a misaligned pointer issue in shared_ptr::operator*:
#import <Foundation/Foundation.h>
#import <memory>
class DateImpl {
public:
DateImpl(NSDate* date) : _date{[date retain], [](NSDate* date) { [date release]; }} {}
NSString* description() const { return [&*_date description]; }
private:
std::shared_ptr<NSDate> _date;
};
int main(int argc, const char * argv[]) {
@autoreleasepool {
DateImpl date{[NSDate distantPast]};
NSLog(@"%@", date.description());
return 0;
}
}
I get this in the call to DateImpl::description:
runtime error: reference binding to misaligned address 0xe2b7fda734fc266f for type 'std::__1::shared_ptr<NSDate>::element_type' (aka 'NSDate'), which requires 8 byte alignment
0xe2b7fda734fc266f: note: pointer points here
<memory cannot be printed>
I suspect that there is something awry with the usage of &* to "cast" the shared_ptr<NSDate> to an NSDate*. I think I could probably work around this issue by using .get() on the shared_ptr instead, but I am genuinely curious about what is going on. Thanks for any feedback or hints!
There were some red herrings here: shared_ptr, manual retain/release, etc. But I ended up discovering that even this very simple code (with ARC enabled) causes the ubsan hit:
#import <Foundation/Foundation.h>
int main(int argc, const char * argv[]) {
@autoreleasepool {
NSDate& d = *[NSDate distantPast];
NSLog(@"%@", &d);
}
return 0;
}
It seems to simply be an issue with [NSDate distantPast] (and, incidentally, [NSDate distantFuture], but not, for instance, [NSDate date]). I conclude that these must be singleton objects allocated sketchily/misaligned-ly somewhere in the depths of Foundation, and when you dereference them it causes a misaligned pointer read.
(Note that it does not happen when the code is simply NSLog(@"%@", &*[NSDate distantPast]). I assume this is because the compiler simply collapses &* on a raw pointer into a no-op. It doesn't for the shared_ptr case in the original question because shared_ptr overloads operator*. Given this, I believe there is no easy way to make this happen in pure Objective-C, since you can't separate the & operation from the * operation, like you can when C++ references are involved [by storing the temporary result of * in an NSDate&].)
You are not supposed to ever use a "bare" NSDate type. Objective-C objects should always be used with a pointer-to-object type (e.g. NSDate *), and you are never supposed to get the "type behind the pointer".
In particular, on 64-bit platforms, Objective-C object pointers can sometimes not be valid pointers, but rather be "tagged pointers" which store the "value" of the object in certain bits of the pointer, rather than as an actual allocated object. You must always let the Objective-C runtime machinery deal with Objective-C object pointers. Dereferencing it as a regular C/C++ pointer can lead to undefined behavior.
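To illustrate the general idea, here is a toy tagged-pointer scheme in plain C++. The encoding (low bit as the tag, payload in the bits above it) and all function names are made up for the example and are not the layout the Objective-C runtime actually uses. The point is only that such a value is not an address, so an alignment check like UBSan's will reject it. Note that the address in your error message, 0xe2b7fda734fc266f, ends in f, so it cannot be 8-byte aligned.

```cpp
#include <cstddef>
#include <cstdint>

// Toy scheme: if the low bit is set, the "pointer" is not an address at all
// and the remaining bits hold the value directly.
const std::uintptr_t kTagBit = 1;

inline bool is_tagged(std::uintptr_t bits) {
    return (bits & kTagBit) != 0;
}

// Pack a small payload into pointer-sized bits and mark it as tagged.
inline std::uintptr_t make_tagged(std::uint16_t payload) {
    return (static_cast<std::uintptr_t>(payload) << 1) | kTagBit;
}

// Recover the payload without ever dereferencing anything.
inline std::uint16_t payload_of(std::uintptr_t bits) {
    return static_cast<std::uint16_t>(bits >> 1);
}

// The kind of check a sanitizer performs before a reference is bound.
inline bool is_aligned(std::uintptr_t bits, std::size_t alignment) {
    return bits % alignment == 0;
}
```

Code that owns the scheme tests is_tagged first and only dereferences real addresses, which is why the runtime has to be the one handling these pointers.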
How do you cast a COM interface pointer to void pointer and then back to the COM pointer? Here is some code to illustrate my problem. It's very similar to this sample code: _com_ptr_t assignment in VC++
CoInitialize(NULL);
COMLib::ICalcPtr pCalc = COMLib::ICalcPtr("MyLibrary.Calculator");
pCalc->doSomething();
CoUninitialize();
return 0;
Now, if I were to cast the pCalc object to void*, how would I cast it back to COMLib::ICalcPtr? For example, the second line in the following code gives me the compile error 'QueryInterface' : is not a member of 'System::Void'. Obviously, it's trying to call IUnknown::QueryInterface() on the object. Preferably I would like to do this without creating a new interface (hence, without implicitly calling QueryInterface and AddRef).
void *test = pCalc;
COMLib::ICalcPtr pCalc2 = test;//'QueryInterface' : is not a member of 'System::Void'
FYI, the reason I'm doing this is that the object is going to be passed around from Java to JNI VC++ code as a void* type. I'm open to any suggestion on what to do or what is going on behind the scenes.
Same way you pass any other opaque structure that either doesn't fit in a pointer or doesn't convert easily: by passing its address.
void* test = new COMLib::ICalcPtr(pCalc);
...
COMLib::ICalcPtr pCalc2 = *(COMLib::ICalcPtr*)test;
delete (COMLib::ICalcPtr*)test;
This will result in calls to AddRef and Release, but not QueryInterface.
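The same boxing pattern can be tried out without COM. In the sketch below std::shared_ptr stands in for the reference-counted ICalcPtr, so the copy and destruction of the shared_ptr play the role of AddRef and Release. Calc, CalcPtr, box_calc, peek_calc and destroy_box are hypothetical names for illustration only.

```cpp
#include <memory>

// Stand-in for the COM object and its smart pointer type.
struct Calc {
    int doSomething() const { return 42; }
};
using CalcPtr = std::shared_ptr<Calc>;

// Copy the smart pointer onto the heap and hand out its address as an opaque
// handle. The heap copy holds its own reference, so the object stays alive
// for as long as the handle exists.
void* box_calc(const CalcPtr& p) {
    return new CalcPtr(p);
}

// Recover a smart pointer from the handle; this takes one more reference.
CalcPtr peek_calc(void* handle) {
    return *static_cast<CalcPtr*>(handle);
}

// Destroy the heap copy, dropping the reference the handle was holding.
void destroy_box(void* handle) {
    delete static_cast<CalcPtr*>(handle);
}
```

The handle must be destroyed exactly once, typically by a dedicated native call made from the Java side when it is done with the object.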
I have an application that uses managed System::String in the UI elements, but then refers to un-managed (read: legacy) code for the more complex computation.
Additionally, there is not a consistent encoding for the strings - managed strings can be either regular "strings" or Unicode L"strings" and un-managed strings come in all of char *, wchar_t *, std::string, std::wstring varieties.
What is the best way to compare the various flavors of strings? I'm hoping that I can do this without having to implement half a dozen comparison methods like
int compare(System::String ^ s1, char * s2);
int compare(System::String ^ s1, wchar_t * s2);
int compare(System::String ^ s1, std::string s2);
int compare(System::String ^ s1, std::wstring s2);
int compare(char * s1, System::String ^ s2);
int compare(wchar_t * s1, System::String ^ s2);
...
The primary purpose will be equality comparisons, so if those are significantly easier to do, then I would like to see those answers as well.
Here's an excellent MSDN article covering this topic in both directions:
http://msdn.microsoft.com/en-us/library/42zy2z41.aspx
And here's the Marshal class:
http://msdn.microsoft.com/en-us/library/atxe881w.aspx
With this, I would suggest defining various managed code methods that take the different types of native strings, convert them into managed strings, then compare them.
Unfortunately, I see no way around writing out the permutations of the different native string types. They are literally different data types, even though they all represent what we call a string. And if the conversion goes wrong, you can get into some dangerous territory.
Also, I would drop std::string out of the running, since you can easily call c_str() to get a const char * out. The same goes for std::wstring and const wchar_t *.
Here's an example of one:
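using namespace System::Runtime::InteropServices;
// C++/CLI: declare this as a static member of a ref class
static int NativeCompare(System::String ^ s1, char * s2)
{
// Marshal the ANSI native string into a managed string, then compare
System::String ^ ms2 = Marshal::PtrToStringAnsi(System::IntPtr(s2));
return s1->CompareTo(ms2);
}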
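On the native side, the number of overloads can be kept down by normalizing every flavor to one type and comparing once. The sketch below is only an illustration of that idea: to_wide and strings_equal are hypothetical names, and the narrow-to-wide step assumes the narrow strings are plain ASCII / Latin-1, widening each byte as-is. For any other encoding you would need a real conversion instead.

```cpp
#include <string>

// One conversion per native string flavor, all producing std::wstring.
inline std::wstring to_wide(const std::wstring& s) { return s; }

inline std::wstring to_wide(const wchar_t* s) {
    return s ? std::wstring(s) : std::wstring();
}

inline std::wstring to_wide(const std::string& s) {
    std::wstring out;
    out.reserve(s.size());
    for (unsigned char c : s) {
        out.push_back(static_cast<wchar_t>(c));  // assumes ASCII / Latin-1 input
    }
    return out;
}

inline std::wstring to_wide(const char* s) {
    return to_wide(s ? std::string(s) : std::string());
}

// A single template covers every pairing of the flavors above.
template <typename A, typename B>
bool strings_equal(const A& a, const B& b) {
    return to_wide(a) == to_wide(b);
}
```

With this, four conversions replace the full grid of pairwise overloads, and a managed System::String would need just one more conversion into the common type.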