Assertion in appcore.cpp while loading regular DLL dynamically linked to MFC - visual-c++

I have inherited an application consisting of a regular DLL which is dynamically linked to MFC and which is loaded from a Windows service executable that also links dynamically to MFC. The code is compiled with Microsoft Visual Studio 2008 Professional (old, I know...). This application has been 'working' for several years, but I have found that I cannot run it as a Debug build due to the following assertion in appcore.cpp:
Debug Assertion Failed!
Program: C:\Projects\CMM\Debug\CMM.exe
File: f:\dd\vctools\vc7libs\ship\atlmfc\src\mfc\appcore.cpp
Line: 380
For more information on how your program can cause an assertion
failure, see the Visual C++ documentation on asserts.
(Press Retry to debug the application)
which corresponds to the following code in the CWinApp constructor:
ASSERT(AfxGetThread() == NULL);
pThreadState->m_pCurrentWinThread = this;
ASSERT(AfxGetThread() == this);
This occurs when loading the DLL via LoadLibrary and leads me to suspect that the application has worked more by luck than judgement over the years (because ASSERT compiles to nothing in Release builds).
My (admittedly limited) understanding of MFC is that although there should generally only be a single instance of CWinApp, it is permissible to have an additional one in regular DLLs which link dynamically to MFC, as in this case. The code has one instance in the service executable and one in the DLL. The CWinApp constructor gets called three (?) times, once for some internal instance within the MFC framework, once for the instance in the service executable and once for the instance in the DLL. The first two work fine; it is the third which blows up.
All of the exported functions start with AFX_MANAGE_STATE (although execution never gets that far) and the pre-processor flags are, I believe, correct w.r.t. Microsoft's documentation (_AFXDLL for the EXE, _AFXDLL, _USRDLL and _WINDLL for the DLL).
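Each export follows the usual pattern for a regular DLL dynamically linked to MFC -- a minimal sketch (the function name is hypothetical):
// Minimal sketch of one of the DLL's exported functions (the name is
// hypothetical). AFX_MANAGE_STATE must be the first statement so the
// module state is switched to the DLL before any MFC call is made.
extern "C" __declspec(dllexport) BOOL DoWork()
{
    AFX_MANAGE_STATE(AfxGetStaticModuleState());
    // ... MFC calls now run against the DLL's module state ...
    return TRUE;
}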
I have tried using AfxLoadLibrary instead of LoadLibrary to no effect. However, if I include
AFX_MANAGE_STATE( AfxGetStaticModuleState() )
at the start of the function which calls LoadLibrary/AfxLoadLibrary, the CWinApp object is actually constructed but execution then blows up in dllmodul.cpp instead.
Can anybody shed any light on why this might be happening or what I need to do to fix it?
EDIT
This is the call stack when the assertion occurs:
mfc90d.dll!CWinApp::CWinApp(const char * lpszAppName=0x00000000) Line 380 + 0x1c bytes C++
cimdll.dll!CCimDllApp::CCimDllApp() Line 146 + 0x19 bytes C++
cimdll.dll!`dynamic initializer for 'theApp''() Line 129 + 0xd bytes C++
msvcr90d.dll!_initterm(void (void)* * pfbegin=0x1b887c88, void (void)* * pfend=0x1b887d6c) Line 903 C
cimdll.dll!_CRT_INIT(void * hDllHandle=0x1b770000, unsigned long dwReason=1, void * lpreserved=0x00000000) Line 318 + 0xf bytes C
cimdll.dll!__DllMainCRTStartup(void * hDllHandle=0x1b770000, unsigned long dwReason=1, void * lpreserved=0x00000000) Line 540 + 0x11 bytes C
cimdll.dll!_DllMainCRTStartup(void * hDllHandle=0x1b770000, unsigned long dwReason=1, void * lpreserved=0x00000000) Line 510 + 0x11 bytes C
ntdll.dll!_LdrxCallInitRoutine@16() + 0x16 bytes
ntdll.dll!LdrpCallInitRoutine() + 0x43 bytes
ntdll.dll!LdrpInitializeNode() + 0x101 bytes
ntdll.dll!LdrpInitializeGraphRecurse() + 0x71 bytes
ntdll.dll!LdrpPrepareModuleForExecution() + 0x8b bytes
ntdll.dll!LdrpLoadDllInternal() + 0x121 bytes
ntdll.dll!LdrpLoadDll() + 0x92 bytes
ntdll.dll!_LdrLoadDll@16() + 0xd9 bytes
KernelBase.dll!_LoadLibraryExW@12() + 0x138 bytes
KernelBase.dll!_LoadLibraryExA@12() + 0x26 bytes
KernelBase.dll!_LoadLibraryA@4() + 0x32 bytes
mfc90d.dll!AfxCtxLoadLibraryA(const char * lpLibFileName=0x02a70ce0) Line 487 + 0x74 bytes C++
mfc90d.dll!AfxLoadLibrary(const char * lpszModuleName=0x02a70ce0) Line 193 + 0x9 bytes C++
CMM.exe!CMonDevDll::LoadDLL() Line 207 + 0x1b bytes C++
CMM.exe!CMonDevDll::LoadDllEntryPoints() Line 268 + 0x8 bytes C++
CMM.exe!CMonDevDll::Initialize(CMonDevRun * pMonDevRun=0x0019fe60) Line 186 + 0x8 bytes C++
CMM.exe!CCtcLinkMonDev::Initialize(CMonDevRun * pMonDevRun=0x0019fe60, CCtcRegistry & reg={...}, int nLinkId=1) Line 546 + 0x18 bytes C++
CMM.exe!CCtcLinkSwitchMgr::Initialize(CMonDevRun * pMonDevRun=0x0019fe60, CCtcRegistry & reg={...}) Line 188 + 0x14 bytes C++
CMM.exe!CMonDevRun::Initialize(ATL::CStringT<char,StrTraitMFC_DLL<char,ATL::ChTraitsCRT<char> > > szServiceName="CimService") Line 257 + 0x16 bytes C++
CMM.exe!CMonDevService::Run() Line 202 + 0x2d bytes C++
CommonFilesD.dll!CCtcServiceBase::ParseStandardArgs(int argc=-1, char * * argv=0x02a51b44) Line 278 + 0xf bytes C++
CMM.exe!main(int argc=4, char * * argv=0x02a51b38) Line 126 + 0x16 bytes C++
CMM.exe!__tmainCRTStartup() Line 586 + 0x19 bytes C
CMM.exe!mainCRTStartup() Line 403 C
kernel32.dll!@BaseThreadInitThunk@12() + 0x24 bytes
ntdll.dll!__RtlUserThreadStart() + 0x2f bytes
ntdll.dll!__RtlUserThreadStart@8() + 0x1b bytes

I finally managed to track down the cause of my crash. A library used by my DLL was linking statically to Boost.Thread and causing this issue, presumably due to a runtime mismatch. Changing the library to link dynamically to Boost instead seems to have fixed the problem.

Related

NtQueryObject returns wrong insufficient required size via WOW64, why?

I am using the NT native API NtQueryObject()/ZwQueryObject() from user mode (and I am aware of the risks in general and I have written kernel mode drivers for Windows in the past in my professional capacity).
Generally when one uses the typical "query information" function (of which there are a few) the protocol is first to ask with a too small buffer to retrieve the required size with STATUS_INFO_LENGTH_MISMATCH, then allocate a buffer of said size and query again -- this time using the buffer and previously returned size.
In order to get the list of object types (67 on my build) on the system I am doing just that:
ULONG Size = 0;
NTSTATUS Status = NtQueryObject(NULL, ObjectTypesInformation, &Size, sizeof(Size), &Size);
And in Size I get 8280 (WOW64) and 8968 (x64). I then proceed to allocate the buffer with calloc() and query again:
ULONG Size2 = 0;
BYTE* Buf = (BYTE*)::calloc(1, Size);
Status = NtQueryObject(NULL, ObjectTypesInformation, Buf, Size, &Size2);
NB: ObjectTypesInformation is 3. It isn't declared in winternl.h, but Nebbett (as ObjectAllTypesInformation) and others describe it. Since I am not querying for a particular object's traits but the system-wide list of object types, I pass NULL for the object handle.
Curiously on WOW64, i.e. 32-bit, the value in Size2 upon return from the second query is 16 Bytes (= 8296) bigger than the previously returned required size.
As far as alignment is concerned, I'd expect at most 8 Bytes for this sort of thing and indeed neither 8280 nor 8296 are at a 16 Byte alignment boundary, but on an 8 Byte one.
Certainly I can add some slack space on top of the returned required size (e.g. ALIGN_UP to the next 32 Byte alignment boundary), but this seems highly irregular, to be honest. And I'd rather understand what's going on than implement a workaround that breaks because I missed something crucial.
The practical issue for the code is that in Debug configurations it tells me there's a corrupted heap somewhere, upon freeing Buf. Which suggests that NtQueryObject() was indeed writing these extra 16 Bytes beyond the buffer I provided.
Question: Any idea why it is doing that?
As usual for NT native API the sources of information are scarce. The x64 version of the exact same code returns the exact number of bytes required. So my thinking here is that WOW64 is the issue. A somewhat cursory look into wow64.dll with IDA didn't reveal any immediate points for suspicion regarding what goes wrong in translating the results to 32-bit here.
PS: Windows 10 (10.0.19043, ntdll.dll "timestamp" 77755782)
PPS: this may be related: https://wj32.org/wp/2012/11/30/obquerytypeinfo-and-ntqueryobject-buffer-overrun-in-windows-8/ Tested it, by checking that OBJECT_TYPE_INFORMATION::TypeName.Length + sizeof(WCHAR) == OBJECT_TYPE_INFORMATION::TypeName.MaximumLength in all returned items, which was the case.
The only part of ObjectTypesInformation that's public is the first field defined in winternl.h header in the Windows SDK:
typedef struct __PUBLIC_OBJECT_TYPE_INFORMATION {
UNICODE_STRING TypeName;
ULONG Reserved [22]; // reserved for internal use
} PUBLIC_OBJECT_TYPE_INFORMATION, *PPUBLIC_OBJECT_TYPE_INFORMATION;
For x86 this is 96 bytes, and for x64 this is 104 bytes (assuming you have the right packing mode enabled). The difference is the pointer in UNICODE_STRING which changes the alignment in x64.
Any additional memory space should be related to the TypeName buffer.
UNICODE_STRING accounts for 8 bytes of the difference between 8280 and 8296. The function uses the sizeof(ULONG_PTR) for alignment of the returned string plus an extra WCHAR, so that could easily account for the remaining 8 bytes.
AFAIK: The public use of NtQueryObject is supposed to be limited to kernel-mode use which of course means it always matches the OS native bitness (x86 code can't run as kernel in x64 native OS), so it's probably just a quirk of using the NT functions via the WOW64 thunk.
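To make that arithmetic concrete, here is a sketch of the implied per-entry stride (OBJECT_TYPE_INFORMATION stands for the full, undocumented native struct, assumed declared elsewhere, e.g. per Nebbett):
// Sketch of the implied per-entry stride: each (undocumented, full)
// OBJECT_TYPE_INFORMATION is followed by its TypeName buffer, rounded
// up to ULONG_PTR alignment. Illustration only, not a documented contract.
size_t entry_stride(const OBJECT_TYPE_INFORMATION* entry)
{
    size_t name = entry->TypeName.MaximumLength; // includes the extra WCHAR
    name = (name + sizeof(ULONG_PTR) - 1) & ~(sizeof(ULONG_PTR) - 1);
    return sizeof(OBJECT_TYPE_INFORMATION) + name;
}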
Alright, I think I figured out the issue with the help of WinDbg and a thorough look at wow64.dll using IDA.
NB: the wow64.dll I have has the same build number, but differs slightly in data only (checksum, security directory entry, pieces from version resources). The code is identical, which was to be expected, given deterministic builds and how they affect the PE timestamp.
There's an internal function called whNtQueryObject_SpecialQueryCase (according to PDBs), which covers the ObjectTypesInformation class queries.
For the above wow64.dll I used the following points of interest in WinDbg, from a 32 bit program which calls NtQueryObject(NULL, ObjectTypesInformation, ...) (the program itself is irrelevant, though):
0:000> .load wow64exts
0:000> bp wow64!whNtQueryObject_SpecialQueryCase+B0E0
0:000> bp wow64!whNtQueryObject_SpecialQueryCase+B14E
0:000> bp wow64!whNtQueryObject_SpecialQueryCase+B1A7
0:000> bp wow64!whNtQueryObject_SpecialQueryCase+B24A
0:000> bp wow64!whNtQueryObject_SpecialQueryCase+B252
Explanation of the above points of interest:
+B0E0: computing length required for 64 bit query, based on passed length for 32 bit
+B14E: call to NtQueryObject()
+B1A7: loop body for copying 64 to 32 bit buffer contents, after successful NtQueryObject() call
+B24A: computing written length by subtracting current (last + 1) entry from base buffer address
+B252: downsizing returned (64 bit) required length to 32 bit
The logic of this function in regards to just ObjectTypesInformation is roughly as follows:
Common steps
Take the ObjectInformationLength (32 bit query!) argument and size it up to fit the 64 bit info
Align the retrieved size up to the next 16 byte boundary
If necessary, allocate the resulting amount from the PEB::ProcessHeap and store it in TLS slot 3; otherwise reuse the buffer already cached there as scratch space
Call NtQueryObject() passing the buffer and length from the two previous steps
The length passed to NtQueryObject() is the one from step 1, not the one aligned to a 16 byte boundary. There seems to be some sort of header to this scratch space, so perhaps that's where the 16 byte alignment comes from?
Case 1: buffer size too small (here: 4), just querying required length
The up-sized length in this case equals 4, which is too small and consequently NtQueryObject() returns STATUS_INFO_LENGTH_MISMATCH. Required size is reported as 8968.
Down-size from the 64 bit required length to 32 bit and end up 16 bytes too short
Return the status from NtQueryObject() and the down-sized required length from the previous step
Case 2: buffer size supposedly (!) sufficient
Copy OBJECT_TYPES_INFORMATION::NumberOfTypes from queried buffer to 32 bit one
Step to the first entry (OBJECT_TYPE_INFORMATION) of source (64 bit) and target (32 bit) buffer, 8 and 4 byte aligned respectively
For each entry up to OBJECT_TYPES_INFORMATION::NumberOfTypes:
Copy UNICODE_STRING::Length and UNICODE_STRING::MaximumLength for TypeName member
memcpy() UNICODE_STRING::Length bytes from source to target UNICODE_STRING::Buffer (target entry + sizeof(OBJECT_TYPE_INFORMATION32))
Add terminating zero (WCHAR) past the memcpy'd string
Copy the individual members past the TypeName from 64 to 32 bit struct
Compute pointer of next entry by aligning UNICODE_STRING::MaximumLength up to an 8 byte boundary (i.e. the ULONG_PTR alignment mentioned in the other answer) + sizeof(OBJECT_TYPE_INFORMATION64) (already 8 byte aligned!)
The next target entry (32 bit) gets 4 byte aligned instead
At the end compute the required (32 bit) length by subtracting the base address of the buffer passed by the WOW64 program (32 bit) to NtQueryObject() from the value we arrived at for the "next" entry (i.e. one past the last)
In my debugged scenario these were: 0x008ce050 - 0x008cbfe8 = 0x00002068 (= 8296), which is 16 bytes larger than the buffer length we were told during case 1 (8280)!
The issue
That crucial last step differs between merely querying and actually getting the buffer filled. There is no further bounds checking in that loop I described for case 2.
And this means it will just overrun the passed buffer and return a written length bigger than the buffer length passed to it.
Possible solutions and workarounds
I'll have to approach this mathematically after some sleep; the workaround is obviously to top up the required length returned from case 1 in order to avoid the buffer overrun. The easiest method is to take my up_size_from_32bit() from the example below and apply it to the returned required size. This way you are allocating enough for the 64 bit buffer, while querying the 32 bit one. This should never overrun during the copy loop.
However, the fix in wow64.dll is a little more involved, I guess. While adding bounds checking to the loop would help avert the overrun, it would mean that the caller would have to query for the required size twice, because the first time around it lies to us.
Which means the query-only case (1) would have to allocate that internal buffer after querying the required length for 64 bit, then get it filled and then walk the entries (just like the copy loop), skipping over the last entry to compute the required length the same as it is now done after the copy loop.
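Concretely, the workaround looks like this (a sketch reusing up_size_from_32bit() and align_up_by() from the example program below; error handling omitted):
// Workaround sketch for a 32 bit (WOW64) caller: over-allocate so the
// wow64.dll copy loop cannot overrun the buffer. Relies on
// up_size_from_32bit() and align_up_by() from the example program below.
ULONG Size = 0;
NTSTATUS Status = NtQueryObject(NULL, ObjectTypesInformation, &Size, sizeof(Size), &Size);
// Size now holds the understated 32 bit length from case 1 (e.g. 8280)
ULONG Padded = (ULONG)align_up_by(up_size_from_32bit(Size), 16);
BYTE* Buf = (BYTE*)::calloc(1, Padded);
ULONG Written = 0;
Status = NtQueryObject(NULL, ObjectTypesInformation, Buf, Padded, &Written);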
Example program demonstrating the "static" computation by wow64.dll
Build for x64, just the way wow64.dll was!
#define WIN32_LEAN_AND_MEAN
#include <Windows.h>
#include <cstdio>
typedef struct
{
ULONG JustPretending[24];
} OBJECT_TYPE_INFORMATION32;
typedef struct
{
ULONG JustPretending[26];
} OBJECT_TYPE_INFORMATION64;
constexpr ULONG size_delta_3264 = sizeof(OBJECT_TYPE_INFORMATION64) - sizeof(OBJECT_TYPE_INFORMATION32);
constexpr ULONG down_size_to_32bit(ULONG len)
{
return len - size_delta_3264 * ((len - 4) / sizeof(OBJECT_TYPE_INFORMATION64));
}
constexpr ULONG up_size_from_32bit(ULONG len)
{
return len + size_delta_3264 * ((len - 4) / sizeof(OBJECT_TYPE_INFORMATION32));
}
// Trying to mimic the wdm.h macro
constexpr size_t align_up_by(size_t address, size_t alignment)
{
return (address + (alignment - 1)) & ~(alignment - 1);
}
constexpr auto u32 = 8280UL;
constexpr auto u64 = 8968UL;
constexpr auto from_64 = down_size_to_32bit(u64);
constexpr auto from_32 = up_size_from_32bit(u32);
constexpr auto from_32_16_byte_aligned = (ULONG)align_up_by(from_32, 16);
int wmain()
{
wprintf(L"32 to 64 bit: %u -> %u -(16-byte-align)-> %u\n", u32, from_32, from_32_16_byte_aligned);
wprintf(L"64 to 32 bit: %u -> %u\n", u64, from_64);
return 0;
}
static_assert(sizeof(OBJECT_TYPE_INFORMATION32) == 96, "Size for 32 bit struct does not match.");
static_assert(sizeof(OBJECT_TYPE_INFORMATION64) == 104, "Size for 64 bit struct does not match.");
static_assert(u32 == from_64, "Must match (from 64 to 32 bit)");
static_assert(u64 == from_32, "Must match (from 32 to 64 bit)");
static_assert(from_32_16_byte_aligned % 16 == 0, "16 byte alignment failed");
static_assert(from_32_16_byte_aligned > from_32, "We're aligning up");
This does not mimic the computation that happens in case 2, though.

Memory leak using CGAL's exact kernel

I am working on a project that uses OpenMP and the exact kernel of CGAL. While running my code in debug mode, thanks to Visual Studio Leak Detector, I discovered a lot of unexpected memory leaks that are evidently related to the combination of those two features. I know that an instance of CGAL::Lazy_exact_nt<CGAL::Gmpq>, a.k.a. FT, should not be shared between multiple threads, but since that is not the case here, I thought it was safe. Is there any way to fix these leaks?
Here is a minimal reproducible example:
// Assumed for a self-contained repro: the question says FT is
// CGAL::Lazy_exact_nt<CGAL::Gmpq>.
#include <CGAL/Gmpq.h>
#include <CGAL/Lazy_exact_nt.h>
#include <sstream>
typedef CGAL::Lazy_exact_nt<CGAL::Gmpq> FT;
int main()
{
double xs_h = 1.2;
#pragma omp parallel
{
FT xs;
std::stringstream stream;
stream << xs_h;
stream >> xs;
}
}
And here is (part of) the output of VLD (7 memory leaks in total, all pointing to the same line of code):
---------- Block 26 at 0x00000000FEED2E50: 40 bytes ----------
Leak Hash: 0x6B0C3782, Count: 1, Total 40 bytes
Call Stack (TID 15080):
ucrtbased.dll!malloc()
D:\agent\_work\9\s\src\vctools\crt\vcstartup\src\heap\new_scalar.cpp (35): Test.exe!operator new() + 0xA bytes
D:\CGAL-4.13\include\CGAL\Lazy.h (818): Test.exe!CGAL::Lazy<CGAL::Interval_nt<0>,CGAL::Gmpq,CGAL::Lazy_exact_nt<CGAL::Gmpq>,CGAL::To_interval<CGAL::Gmpq> >::zero() + 0x77 bytes
D:\CGAL-4.13\include\CGAL\Lazy.h (766): Test.exe!CGAL::Lazy<CGAL::Interval_nt<0>,CGAL::Gmpq,CGAL::Lazy_exact_nt<CGAL::Gmpq>,CGAL::To_interval<CGAL::Gmpq> >::Lazy<CGAL::Interval_nt<0>,CGAL::Gmpq,CGAL::Lazy_exact_nt<CGAL::Gmpq>,CGAL::To_interval<CGAL::Gmpq> >() + 0x5 bytes
D:\CGAL-4.13\include\CGAL\Lazy_exact_nt.h (365): Test.exe!CGAL::Lazy_exact_nt<CGAL::Gmpq>::Lazy_exact_nt<CGAL::Gmpq>() + 0x28 bytes
D:\projects\...\src\main.cpp (20): Test.exe!main$omp$1() + 0xA bytes
VCOMP140D.DLL!vcomp_fork() + 0x2E5 bytes
VCOMP140D.DLL!vcomp_fork() + 0x2A1 bytes
VCOMP140D.DLL!vcomp_atomic_div_r8() + 0x20A bytes
KERNEL32.DLL!BaseThreadInitThunk() + 0x14 bytes
ntdll.dll!RtlUserThreadStart() + 0x21 bytes
I am using CGAL 4.13, and my compiler is Visual Studio 2019. I tried recompiling this code with the macro CGAL_HAS_THREADS defined, but it does not change the result (the memory leaks do not vanish). Thanks for your attention.

How to get right MIPS libc toolchain for embedded device

I've run into a problem (repeatedly) with various companies' embedded Linux products where the GPL source code from them does not match what is actually running on the system. It's "close", but not quite right, especially with respect to the standard C library they use.
Isn't that a violation of the GPL?
Often this mismatch results in a programmer (like me) cross-compiling a program, only to have the device reply cryptically "file not found" or something similar when the program is run.
I'm not alone with this kind of problem -- many people have threads directly or indirectly related to it, e.g.:
Compile parameters for MIPS based codesourcery toolchain?
And I've run into the problem on Sony devices, D-link, and many others. It's very common.
Making a new library is not a good solution, since most systems are ROMFS only, and LD_LIBRARY_PATH is sometimes broken -- so that installing a new library on the device wastes very limited memory and often won't work.
If I knew what the right source code version of the library was, I could go around the manufacturer's carelessness and compile it from the original developer's tree; but how can I find out which version I need when all I have is the binary of the library itself?
For example: I ran readelf -a libc.so.0 on a DSL modem's libc (see below), but I don't see anything there that tells me exactly which libc it is...
How can I find the name of the source code, or an identifier from the library's binary so I can create a cross compiler using that library? eg: Can anyone tell me what source code this library came from, and how they know?
ELF Header:
Magic: 7f 45 4c 46 01 02 01 00 00 00 00 00 00 00 00 00
Class: ELF32
Data: 2's complement, big endian
Version: 1 (current)
OS/ABI: UNIX - System V
ABI Version: 0
Type: DYN (Shared object file)
Machine: MIPS R3000
Version: 0x1
Entry point address: 0x5a60
Start of program headers: 52 (bytes into file)
Start of section headers: 0 (bytes into file)
Flags: 0x1007, noreorder, pic, cpic, o32, mips1
Size of this header: 52 (bytes)
Size of program headers: 32 (bytes)
Number of program headers: 4
Size of section headers: 0 (bytes)
Number of section headers: 0
Section header string table index: 0
There are no sections in this file.
There are no sections to group in this file.
Program Headers:
Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align
REGINFO 0x0000b4 0x000000b4 0x000000b4 0x00018 0x00018 R 0x4
LOAD 0x000000 0x00000000 0x00000000 0x2c9ee 0x2c9ee R E 0x1000
LOAD 0x02c9f0 0x0006c9f0 0x0006c9f0 0x009a0 0x040b8 RW 0x1000
DYNAMIC 0x0000cc 0x000000cc 0x000000cc 0x0579a 0x0579a RWE 0x4
Dynamic section at offset 0xcc contains 19 entries:
Tag Type Name/Value
0x0000000e (SONAME) Library soname: [libc.so.0]
0x00000004 (HASH) 0x18c
0x00000005 (STRTAB) 0x3e9c
0x00000006 (SYMTAB) 0x144c
0x0000000a (STRSZ) 6602 (bytes)
0x0000000b (SYMENT) 16 (bytes)
0x00000015 (DEBUG) 0x0
0x00000003 (PLTGOT) 0x6ce20
0x00000011 (REL) 0x5868
0x00000012 (RELSZ) 504 (bytes)
0x00000013 (RELENT) 8 (bytes)
0x70000001 (MIPS_RLD_VERSION) 1
0x70000005 (MIPS_FLAGS) NOTPOT
0x70000006 (MIPS_BASE_ADDRESS) 0x0
0x7000000a (MIPS_LOCAL_GOTNO) 11
0x70000011 (MIPS_SYMTABNO) 677
0x70000012 (MIPS_UNREFEXTNO) 17
0x70000013 (MIPS_GOTSYM) 0x154
0x00000000 (NULL) 0x0
There are no relocations in this file.
The decoding of unwind sections for machine type MIPS R3000 is not currently supported.
Histogram for bucket list length (total of 521 buckets):
Length Number % of total Coverage
0 144 ( 27.6%)
1 181 ( 34.7%) 27.1%
2 130 ( 25.0%) 66.0%
3 47 ( 9.0%) 87.1%
4 12 ( 2.3%) 94.3%
5 5 ( 1.0%) 98.1%
6 1 ( 0.2%) 99.0%
7 1 ( 0.2%) 100.0%
No version information found in this file.
Primary GOT:
Canonical gp value: 00074e10
Reserved entries:
Address Access Initial Purpose
0006ce20 -32752(gp) 00000000 Lazy resolver
0006ce24 -32748(gp) 80000000 Module pointer (GNU extension)
Local entries:
Address Access Initial
0006ce28 -32744(gp) 00070000
0006ce2c -32740(gp) 00030000
0006ce30 -32736(gp) 00000000
0006ce34 -32732(gp) 00010000
0006ce38 -32728(gp) 0006d810
0006ce3c -32724(gp) 0006d814
0006ce40 -32720(gp) 00020000
0006ce44 -32716(gp) 00000000
0006ce48 -32712(gp) 00000000
Global entries:
Address Access Initial Sym.Val. Type Ndx Name
0006ce4c -32708(gp) 000186c0 000186c0 FUNC bad section index[ 6] __fputc_unlocked
0006ce50 -32704(gp) 000211a4 000211a4 FUNC bad section index[ 6] sigprocmask
0006ce54 -32700(gp) 0001e2b4 0001e2b4 FUNC bad section index[ 6] free
0006ce58 -32696(gp) 00026940 00026940 FUNC bad section index[ 6] raise
...
truncated listing
....
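Since nothing in the readelf output above names a library version, one thing worth trying (a sketch, not a guaranteed answer) is scanning the raw image for printable version banners -- uClibc and GCC often embed strings such as "uClibc ..." or "GCC: (GNU) ...", though they may not survive stripping:
// bannerscan.cpp - a sketch: print printable runs from a binary that
// contain likely version banners ("uClibc", "glibc", "GCC:"). Such
// banners may or may not survive stripping; treat a hit as a hint only.
#include <cstdio>
#include <string>

static void check(std::string& run)
{
    if (run.size() >= 5 &&
        (run.find("uClibc") != std::string::npos ||
         run.find("glibc")  != std::string::npos ||
         run.find("GCC:")   != std::string::npos))
        std::puts(run.c_str());
    run.clear();
}

int main(int argc, char** argv)
{
    if (argc != 2) { std::fprintf(stderr, "usage: %s <file>\n", argv[0]); return 1; }
    std::FILE* f = std::fopen(argv[1], "rb");
    if (!f) { std::perror("fopen"); return 1; }
    std::string run;
    for (int c; (c = std::fgetc(f)) != EOF; )
        if (c >= 0x20 && c < 0x7f) run.push_back((char)c); else check(run);
    check(run); // flush any final run at EOF
    std::fclose(f);
    return 0;
}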
Note:
The rest of this post is a blog showing how I came to ask the question above and to put useful information about the subject in one place.
Don't bother reading it unless you want to see that I actually did research the question... in gory detail... and how NOT to answer it.
The proper (theoretical) way to get a libc program running on (for example) a D-link modem would simply be to get the TRUE source code for the product from the manufacturer, and compile against those libraries.... (It's GPL !? right, so the law is on our side, right?)
For example: I just bought a D-Link DSL-520B modem and a 526B modem -- but found out after the fact that the manufacturer "forgot" to supply linux source code for the 520B but does have it for the 526B. I checked all of the DSL-5xxB devices online for source code & toolchains, finding to my delight that ALL of them (including 526B) -- contain the SAME pre-compiled libc.so.0 with MD5sum of 6ed709113ce615e9f170aafa0eac04a6 . So in theory, all supported modems in the DSL-5xxB family seemed to use the same libc library... and I hoped I might be able to use that library.
But after I figured out how to get the DSL modem itself to send me a copy of the installed /lib/libc.so.0 library -- I found to my disgust that they ALL use a library with MD5 sum of b8d492decc8207e724a0822641205078 . In NEITHER of the modems I bought (supported or not) was found the same library as contained in the source code toolchain.
To verify the toolchain from D-link was defective, I didn't compile a program (the toolchain wouldn't run on my PC anyway, as it was the wrong binary format) -- but the toolchain had some pre-compiled MIPS binaries in it already, so I simply downloaded one to the modem, chmod +x'd it -- and (surprise) got the message "file not found." when I tried to run it... It won't run.
So, I knew the toolchains were no good immediately, but not exactly why.
I decided to get a newer version of MIPS GCC (binary version) that should have fewer bugs and more features, and which is supported on most PC platforms. This is the way to GO!
See: Kernel.org pre-compiled binaries
I upgraded to gcc 4.9.0 after selecting the older "mips" version from the above site to get the right FTP page, and set my shell's PATH variable to the /bin directory of the cross compiler once installed.
Then I copied all the header files and libraries from the D-link source code package into the new cross compiler just to verify that it could compile D-link libc binaries. And it did on the first try, compiling "hello world!" with no warnings or errors into a mips 32 big endian binary.
( START EDIT: ) @ChrisStratton points out in the comments (after this post) that my test of the toolchain is inadequate, and that using a newer GCC with an older library -- even though it links properly -- is flawed as a test. I wish there were a way to give him points for his comments -- I've become convinced that he's right, although that makes what D-link did an even worse problem -- for there's no way to know from the binaries on the modem which GCC they actually used. The GCC used for the kernel isn't necessarily the same as the one used in user space.
In order to test the new compiler's compatibility with the modems and also make tools so I could get a copy of the actual libraries found on the modem: ( END EDIT ) I wrote a program that doesn't use the C library at all (but in two parts): It ran just fine... and the code is attached to show how it can be done.
The first listing is an assembly language program to bypass linking the standard C libraries on MIPS; and the second listing is a program meant to create an octal number dump of a binary file/stream using only the linux kernel. eg: It enables copying/pasting or scripting of binary data over telnet, netcat, etc... via ash/bash or busybox :) like a poor man's uucp.
// substart.S MIPS assembly language bypass of libc startup code
// it just calls main, and then jumps to the exit function
.text
.globl __start
__start: .ent __start
.frame $29, 32, $31
.set noreorder
.cpload $25
.set reorder
.cprestore 16
jal main
j exit
.end __start
// end substart.S
...and...
// octdump.c
// To compile w/o libc :
// mips-linux-gcc substart.S octdump.c -nostdlib -o octdump
// To compile with working libc (eg: x86 system) :
// gcc octdump.c -o octdump_x86
#include <syscall.h>
#include <errno.h>
#include <sys/types.h>
int* __errno_location(void) { return &errno; }
#ifdef _syscall1
// define three unix functions (exit,read,write) in terms of unix syscall macros.
_syscall1( void, exit, int, status );
_syscall3( ssize_t, read, int, fd, void*, buf, size_t, count );
_syscall3( ssize_t, write, int, fd, const void*, buf, size_t, count );
#endif
#include <unistd.h>
// Emit c as an octal escape of the form \\0ooo (1-3 digits): the shell's
// $'...' collapses the doubled backslash, and echo -e then interprets the
// remaining \0ooo as an octal escape.
void oct( unsigned char c ) {
unsigned int n = c;
int m=6;
static unsigned char oval[6]={'\\','\\','0','0','0','0'};
if (n < 64) { m-=1; n <<= 3; }
if (n < 64) { m-=1; n <<= 3; }
if (n < 64) { m-=1; n <<= 3; }
oval[5]='0'+(n&7);
oval[4]='0'+((n>>3)&7);
oval[3]='0'+((n>>6)&7);
write( STDOUT_FILENO, oval, m );
}
int main(void) {
char buffer[255];
int count=1;
int i;
while (count>0) {
count=read( STDIN_FILENO, buffer, 17 );
if (count>0) write( STDOUT_FILENO, "echo -ne $'",11 );
for (i=0; i<count; ++i) oct( buffer[i] );
if (count>0) write( STDOUT_FILENO, "'\n", 2 );
}
write( STDOUT_FILENO,"#\n",2);
return 0;
}
Once mips' octdump was saved (chmod +x) as /var/octdump on the modem, it ran without errors.
(use your imagination about how I got it on there... D-link's TFTP & friends are broken.)
I was able to use octdump to copy all the dynamic libraries off the DSL modem and examine them, using an automated script to avoid copy/pasting by hand.
#!/bin/env python
# octget.py
# A program to upload a file off an embedded linux device via telnet
import socket
import time
import sys
import string
if len( sys.argv ) != 4 :
raise ValueError, "Usage: octget.py IP_OF_MODEM passwd path_to_file_to_get"
o = socket.socket( socket.AF_INET, socket.SOCK_STREAM )
o.connect((sys.argv[1],23)) # The IP address of the DSL modem.
time.sleep(1)
sys.stderr.write( o.recv(1024) )
o.send("admin\r\n");
time.sleep(0.1)
sys.stderr.write( o.recv(1024) )
o.send(sys.argv[2]+"\r\n")
time.sleep(0.1)
o.send("sh\r\n")
time.sleep(0.1)
sys.stderr.write( o.recv(1024) )
o.send("cd /var\r\n")
time.sleep(0.1)
sys.stderr.write( o.recv(1024) )
o.send("./octdump.x < "+sys.argv[3]+"\r\n" );
sys.stderr.write( o.recv(21) )
get="y"
while get and not ('#' in get):
get = o.recv(4096)
get = get.translate( None, '\r' )
sys.stdout.write( get )
time.sleep(0.5)
o.close()
The DSL520B modem had the following libraries...
libcrypt.so.0 libpsi.so libutil.so.0 ld-uClibc.so.0 libc.so.0 libdl.so.0 libpsixml.so
... and I thought I might cross compile using these libraries since (at least in theory) -- GCC could link against them; and my problem might be solved.
I made very sure to erase all the incompatible .so libraries from gcc-4.9.0/mips-linux/mips-linux/lib, but kept the generic crt*.o files; then I copied the modem's libraries into the cross compiler directory.
But even though the kernel version of the source code and the kernel version of the modem matched -- GCC found undefined symbols in the crt files... So either the generic crt*.o files or the modem libraries themselves are somehow defective, and I don't know why. Without knowing the exact version of the (uClibc?) library, I'm not sure how I can get the CORRECT source code to recompile the libraries and the crt's from scratch.

How does the Linux kernel get info about the processors and the cores?

Assume we have a blank computer without any OS and we are installing a Linux. Where in the kernel is the code that identifies the processors and the cores and get information about/from them?
This info eventually shows up in places like /proc/cpuinfo but how does the kernel get it in the first place?!
Short answer
The kernel uses the special CPU instruction cpuid and saves the results in an internal structure -- cpuinfo_x86 for x86
Long answer
Kernel source is your best friend.
Start from the entry point -- the file /proc/cpuinfo.
Like any proc file it has to be created somewhere in the kernel and declared with some file_operations. This is done in fs/proc/cpuinfo.c. The interesting piece is seq_open, which uses a reference to cpuinfo_op. These ops are declared in arch/x86/kernel/cpu/proc.c, where we find the show_cpuinfo function, defined in the same file at line 57.
Here you can see
64 seq_printf(m, "processor\t: %u\n"
65 "vendor_id\t: %s\n"
66 "cpu family\t: %d\n"
67 "model\t\t: %u\n"
68 "model name\t: %s\n",
69 cpu,
70 c->x86_vendor_id[0] ? c->x86_vendor_id : "unknown",
71 c->x86,
72 c->x86_model,
73 c->x86_model_id[0] ? c->x86_model_id : "unknown");
The structure c is declared on the first line of show_cpuinfo as struct cpuinfo_x86. This structure is declared in arch/x86/include/asm/processor.h. If you search for references to that structure you will find the function cpu_detect, which calls the function cpuid, which finally resolves to native_cpuid and looks like this:
189 static inline void native_cpuid(unsigned int *eax, unsigned int *ebx,
190 unsigned int *ecx, unsigned int *edx)
191 {
192 /* ecx is often an input as well as an output. */
193 asm volatile("cpuid"
194 : "=a" (*eax),
195 "=b" (*ebx),
196 "=c" (*ecx),
197 "=d" (*edx)
198 : "" (*eax), "2" (*ecx)
199 : "memory");
200 }
And here you see the assembler instruction cpuid. This little thing does the real work.
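For what it's worth, the same instruction can be issued from user space; here is a minimal sketch using GCC's <cpuid.h> (leaf 0 returns the highest supported leaf in EAX and the vendor string spread across EBX, EDX, ECX):
// User-space cpuid via GCC's <cpuid.h>: leaf 0 gives the vendor string.
#include <cpuid.h>
#include <cstdio>
#include <cstring>

int main()
{
    unsigned eax = 0, ebx = 0, ecx = 0, edx = 0;
    if (!__get_cpuid(0, &eax, &ebx, &ecx, &edx))
        return 1;                     // cpuid not supported
    char vendor[13];
    std::memcpy(vendor + 0, &ebx, 4); // vendor string order is EBX,
    std::memcpy(vendor + 4, &edx, 4); // then EDX,
    std::memcpy(vendor + 8, &ecx, 4); // then ECX
    vendor[12] = '\0';
    std::printf("max leaf: %u, vendor_id: %s\n", eax, vendor);
    return 0;
}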
This information comes from the BIOS + hardware DB. You can get the info directly with dmidecode, for example (if you need more info, try checking the dmidecode source code):
sudo dmidecode -t processor

Buffer Overflow When Declaring Char Array

I've never done any C++/COM work before, so I'm trying to hijack an existing solution and just alter it to my needs. The project was written and successfully compiled with VC 6 and I'm attempting to work with it now in 2010. I had to change a few references to get it to compile, but for some reason, the dll I generate is causing an exception on my system (the original works fine). Doing some research on the error, it looks like I'm getting a buffer overflow when I try to declare a char array.
bool CFile::simpleWrite(char* cData)
{
try{
// temp result variable
BOOL bResult = 0;
// file handle
HANDLE hFile = INVALID_HANDLE_VALUE;
// get the CMain singleton
CMain* m_pMain = CMain::GetInstance();
// this point gets synchronization to ensure we get unique file name...
char cDirFilename[MAX_PATH + 1];
GetLogFileName(cDirFilename, MAX_PATH);
// sanity check
if(strcmp(cDirFilename, "c:\\") == 0) assert(0);
// try and create a file
hFile = CreateFile( cDirFilename, GENERIC_WRITE, FILE_SHARE_READ,NULL, OPEN_ALWAYS, FILE_ATTRIBUTE_NORMAL, NULL );
// if have a good file handle
if(hFile != INVALID_HANDLE_VALUE){
size_t lenFileData = strlen(cData) + 72;
char* cFileData = new char[lenFileData];
_snprintf(cFileData, lenFileData, "<?xml version=\"1.0\"?>\r\n<RootElement>\r\n%s</RootElement>\r\n\0", cData);
...
Here is the declaration/assignment for cData (cXML in the calling method).
char cXML[EVENT_LOG_MAX_MESSAGE];
// get the CMain singleton
CMain* pMain = CMain::GetInstance();
long lThreadID = GetCurrentThreadId();
// put the parameters into XML format
pMain->BuildXML(cXML, EVENT_LOG_MAX_MESSAGE,errLogLevel,userActivityID,methodName,lineNumber,className,AppID,errorDescription,errorID,lThreadID);
// write the data to file
if(!simpleWrite(cXML))
...
BuildXML is doing a _snprintf into cXML and returning it.
Here is the stacktrace going from my call into some of the VC files.
Test.dll!_heap_alloc_base(unsigned int size) Line 55 C
Test.dll!_heap_alloc_dbg_impl(unsigned int nSize, int nBlockUse, const char * szFileName, int nLine, int * errno_tmp) Line 431 + 0x9 bytes C++
Test.dll!_nh_malloc_dbg_impl(unsigned int nSize, int nhFlag, int nBlockUse, const char * szFileName, int nLine, int * errno_tmp) Line 239 + 0x19 bytes C++
Test.dll!_nh_malloc_dbg(unsigned int nSize, int nhFlag, int nBlockUse, const char * szFileName, int nLine) Line 302 + 0x1d bytes C++
Test.dll!malloc(unsigned int nSize) Line 56 + 0x15 bytes C++
Test.dll!operator new(unsigned int size) Line 59 + 0x9 bytes C++
Test.dll!operator new[](unsigned int count) Line 6 + 0x9 bytes C++
Test.dll!CFile::simpleWrite(char * cData) Line 87 + 0xc bytes C++
I'm sure there is some stupid basic mistake, but I can't seem to get it figured out.
Your error is most likely somewhere else where you end up corrupting the heap. I notice you use strlen here without adding one for the terminating zero. Have a look in your code to see if you use strlen for allocating memory and copying somewhere, because I think you're corrupting the heap by allocating one byte too little and then strcpy'ing to it.
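For illustration, the classic shape of that bug (hypothetical code, not taken from the question):
// Hypothetical illustration of the off-by-one described above:
char* p = new char[strlen(s)];     // WRONG: no room for the terminating '\0'
strcpy(p, s);                      // writes strlen(s)+1 bytes -> heap corruption
// Correct:
char* q = new char[strlen(s) + 1]; // one extra byte for the terminator
strcpy(q, s);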
As I suspected, this was all my own stupidity. This COM application had a dependency that I wasn't aware of. Never saw anything trying to look for it in the process monitor and the project had a local copy of the file for compile purposes apparently. Thanks for the input though.
