Using ildasm+ilasm produces a different assembly - .net-assembly

I want to change an attribute in an already-compiled assembly (in the future I might be able to compile my sources twice, but not at-the-moment...). This answer suggests that I use ildasm, munge the properties in the text file and then re-assemble using ilasm. The blog post Signing a Third Party Library With Ildasm and Ilasm suggests a similar solution for a similar problem.
[edit] I did that, using:
ildasm MyBinaryLib.dll /output=MyBinaryLib.asm /all /caverbal
// no munging for now...
ilasm /nologo /dll MyBinaryLib.asm /resource=MyBinaryLib.res /output=MyBinaryLib2.dll
and it worked, but it seems that the resulting assembly is missing some stuff - it's 4096 bytes instead of 4608. I compared some text blocks inside the DLLs, and it seems that the following are missing:
AssemblyCultureAttribute - My original assemblyinfo.cs has [assembly: AssemblyCulture("")], I guess ildasm ignores that.
AssemblyVersionAttribute - which is weird, since I do see AssemblyVersion using ILSpy.
System.Diagnostics, DebuggableAttribute, DebuggingModes - ILSpy does show a missing [assembly: Debuggable] attribute. The .asm file also says:
// --- The following custom attribute is added automatically, do not uncomment -------
// .custom /*0C00000C:0A00000E*/ instance void [mscorlib/*23000001*/]System.Diagnostics.DebuggableAttribute/*0100000F*/::.ctor(valuetype [mscorlib/*23000001*/]System.Diagnostics.DebuggableAttribute/*0100000F*//DebuggingModes/*01000010*/) /* 0A00000E */
// = {int32(263)}
// // = ( 01 00 07 01 00 00 00 00 )
My question: What is the effect of these things being missing?
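For reference, the commented-out blob in the .asm listing can be decoded by hand. The sketch below assumes the standard custom-attribute blob layout (a 0x0001 prolog, the fixed constructor arguments, then a named-argument count) and the documented DebuggableAttribute.DebuggingModes flag values. The function name is made up for illustration.

```python
import struct

# Flag values of System.Diagnostics.DebuggableAttribute.DebuggingModes.
DEBUGGING_MODES = {
    0x001: "Default",
    0x002: "IgnoreSymbolStoreSequencePoints",
    0x004: "EnableEditAndContinue",
    0x100: "DisableOptimizations",
}

def decode_debuggable_blob(blob):
    """Decode a DebuggableAttribute(DebuggingModes) custom-attribute blob."""
    # Little-endian: uint16 prolog, int32 constructor argument, uint16 NumNamed.
    prolog, modes, num_named = struct.unpack("<HiH", blob)
    if prolog != 0x0001:
        raise ValueError("unexpected custom attribute prolog")
    names = [name for bit, name in DEBUGGING_MODES.items() if modes & bit]
    return modes, names, num_named

# The bytes from the ildasm listing: ( 01 00 07 01 00 00 00 00 )
blob = bytes.fromhex("01 00 07 01 00 00 00 00")
modes, names, num_named = decode_debuggable_blob(blob)
print(modes, names, num_named)
```

The decoded integer is the same int32(263) that ildasm prints, so the attribute describes a debug build with optimizations disabled.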

Related

Difference in md5sums in two object files

I compile the same .c and .h files twice and get object files with the same size but different md5sums.
Here is the only difference from objdump -d:
1) cpcidskephemerissegment.o: file format elf64-x86-64
Disassembly of section .text:
0000000000000000 <_ZN68_GLOBAL__N_sdk_segment_cpcidskephemerissegment.cpp_00000000_B8B9E66611MinFunctionEii>:
2) cpcidskephemerissegment.o: file format elf64-x86-64
Disassembly of section .text:
0000000000000000 <_ZN68_GLOBAL__N_sdk_segment_cpcidskephemerissegment.cpp_00000000_8B65537811MinFunctionEii>:
What can be the reason? Thanks!
I guess the compiler didn't know how to name this namespace and used the path to the source file plus some random number.
The compiler must guarantee that a symbol in an unnamed namespace does not conflict with any other symbol in your program. By default this is achieved by taking the full filename of the source and appending a random hash value to it. The filename alone is not enough: it's legal to compile the same source twice (e.g. with different macros) and link the two objects into a single program, and the unnamed namespace symbols must still be distinct.
If you know that you are not linking the same source file more than once, and want to have a bit-identical object file on re-compile, the solution is to add -frandom-seed="abcd" to your compile line (replace "abcd" with anything you want; it's common to use the filename as the value of random seed). Documentation here.
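To see the random component concretely, the two symbols from the question can be split into their identifiers. This is a minimal sketch that only handles plain nested names of the form _ZN<length><identifier>...E (no templates or qualifiers); the helper name is invented here.

```python
def split_nested_name(mangled):
    """Split a simple Itanium-ABI nested name (_ZN<len><id>...E) into identifiers."""
    if not mangled.startswith("_ZN"):
        raise ValueError("not a nested mangled name")
    pos = 3
    parts = []
    while mangled[pos] != "E":
        start = pos
        while mangled[pos].isdigit():
            pos += 1
        length = int(mangled[start:pos])   # decimal length prefix
        parts.append(mangled[pos:pos + length])
        pos += length
    return parts

SYMBOL_A = "_ZN68_GLOBAL__N_sdk_segment_cpcidskephemerissegment.cpp_00000000_B8B9E66611MinFunctionEii"
SYMBOL_B = "_ZN68_GLOBAL__N_sdk_segment_cpcidskephemerissegment.cpp_00000000_8B65537811MinFunctionEii"

for symbol in (SYMBOL_A, SYMBOL_B):
    namespace_id, function = split_nested_name(symbol)
    # The last 8 characters of the namespace identifier are the per-build hash.
    print(function, namespace_id[:-8], namespace_id[-8:])
```

Everything except the trailing eight hex digits of the namespace identifier is the same in both builds, which is why the object files have equal size but different checksums.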
The reasons can be many:
Using macros like __DATE__ and __TIME__
Embedding counters that are incremented for each build (the Linux kernel does this)
Timestamps (or similarly variable quantities) embedded in the .comment ELF section. One example of a compiler that does this is the xlC compiler on AIX.
Different names as a result of name mangling (e.g. C++)
Changes in environment variables which are affecting the build process.
Compiler bug(s) (however unlikely)
To produce bit-identical builds, you can use GCC's -frandom-seed parameter. There were situations where it could break things before GCC 4.3, but GCC now turns functions defined in anonymous namespaces into static symbols. However, you will always be safe if you compile each file using a different value for -frandom-seed, the simplest way being to use the filename itself as the seed.
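A build script can apply the "filename as seed" advice mechanically. The following is only a sketch: the helper name, the default compiler, and the example path are assumptions for illustration, while -frandom-seed itself is the GCC option discussed above.

```python
import os

def compile_command(source, compiler="g++"):
    """Build a compile command that uses the source path as the random seed."""
    obj = os.path.splitext(source)[0] + ".o"
    # A distinct seed per file keeps unnamed-namespace symbols unique across
    # files while staying identical from one build to the next.
    return [compiler, "-c", source, "-o", obj, "-frandom-seed=" + source]

# Hypothetical source path, used only to show the resulting command line.
print(" ".join(compile_command("sdk/segment/cpcidskephemerissegment.cpp")))
```

The returned list can be passed to subprocess.run as-is.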
Finally I've found the answer!
c++filt command gave the original name of the function:
{unnamed namespace}: MinFunction(int, int)
In the source was:
namespace
{
MinFunction(int a, int b) { ... }
}
I named the namespace and got a stable checksum for the object file!
My guess is that the compiler didn't know how to name this namespace and used the path to the source file plus some random number.

Why a stripped binary file can still have library call information in the disassembled file?

test platform is 32 bit Linux.
I compile a C program without stripping the symbol information, and use objdump to disassemble the ELF executable file.
Here is part of the results.
804831c: e8 8c fe ff ff call 8048360 <printf@plt>
If I use:
strip binary
to remove the symbol info and use objdump to disassemble the ELF executable file again, I can still see results like:
804831c: e8 8c fe ff ff call 8048360 <printf@plt>
So my question is:
How can a disassembly tool like objdump know the names of certain library functions after I have stripped all the symbol information?
Thank you!
An ELF file has two symbol tables: .symtab and .dynsym. The latter holds the dynamic symbols needed for dynamic linking (relocation).
In your case, printf is in .dynsym and it may also be present in .symtab; by default strip would remove .symtab but not .dynsym which is needed for relocation.
You may try
strip -R .dynsym your_binary
to remove the dynsym section manually and you will find it fails to run due to relocation failure.
Imported calls will always have the name, it is needed to link at runtime. If you stripped the import name, how would your application know what to call? Methods from external libraries may (and usually do) have a different address every time your application is executed.
On another note, inlined or statically-linked methods can sometimes be identified and named even without symbol information. Many disassemblers look for common patterns associated with some standard library functions. memcpy() for example, can often be heuristically identified and labeled even without symbol info available.
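One way to check which symbol tables survive stripping is to list the section names before and after running strip. The sketch below reads only the ELF header and section header table with the standard struct module. It assumes a well-formed file and ignores the extended section numbering used by very large files; the function name is made up.

```python
import struct

def elf_section_names(data):
    """Return the section names of an ELF image given as bytes."""
    if data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    is64 = data[4] == 2                    # EI_CLASS: 1 = 32-bit, 2 = 64-bit
    end = "<" if data[5] == 1 else ">"     # EI_DATA: 1 = little-endian
    if is64:
        (shoff,) = struct.unpack_from(end + "Q", data, 0x28)
        shentsize, shnum, shstrndx = struct.unpack_from(end + "HHH", data, 0x3A)
    else:
        (shoff,) = struct.unpack_from(end + "I", data, 0x20)
        shentsize, shnum, shstrndx = struct.unpack_from(end + "HHH", data, 0x2E)

    def header(index):
        # Return (sh_name, sh_offset, sh_size) of one section header.
        base = shoff + index * shentsize
        (name,) = struct.unpack_from(end + "I", data, base)
        if is64:
            offset, size = struct.unpack_from(end + "QQ", data, base + 0x18)
        else:
            offset, size = struct.unpack_from(end + "II", data, base + 0x10)
        return name, offset, size

    _, str_off, str_size = header(shstrndx)
    strtab = data[str_off:str_off + str_size]
    names = []
    for i in range(shnum):
        name_off = header(i)[0]
        names.append(strtab[name_off:strtab.index(b"\0", name_off)].decode("ascii"))
    return names

# Usage (path is a placeholder):
#   with open("binary", "rb") as f:
#       names = elf_section_names(f.read())
#   print(".symtab" in names, ".dynsym" in names)
```

For a dynamically linked executable, the expectation from the answer above is that .symtab disappears after strip while .dynsym remains.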

Warning MIDL2346 in Visual C++ 6 source: The specified lcid is different from previous specification

I have a strange warning in some code I'm trying to maintain. I'm currently testing it out in its current environment (Visual C++ 6.0, yes, I know, ancient) before moving it up to a more modern VC++ version. I don't understand this warning, and what effect it might be having on the EXE target I'm compiling. During compilation I get this output in the build tab:
Processing C:\OSDK\Libraries\PSDll\OSDKDefs.idl
OSDKDefs.idl
.\Server\Interfaces\InterfaceDef.idl(109) : warning MIDL2346 : the specified lcid is different from previous specification
Compiling...
The above IDL file is a slightly hacked up version of an IDL file provided by a vendor which no longer provides any support for the above libraries. I believe that this comment in the IDL file was added by a former maintainer of this project, who has hacked this IDL file. My question is, I can make the warning go away by changing the lcid back to the value in the comment, possibly reintroducing some unwanted problem that the original modifier of this idl file wanted to avoid. What is an lcid and what would the difference between the behaviour with lcid(0x409) and lcid(0x09) be? A single bit with value 0x400 hex is being toggled, but what does that bit do?
The line causing the warning is marked and commented below. It was formerly lcid(0x409) and was changed to lcid(0x09) for "compatibility" with some kind of test tool that this vendor provides for their DCOM/COM code; the tool is mentioned in the comments below.
//
// Component and type library descriptions
//
[
uuid(bbf92ab1-5031-40c2-864d-1c301f51d0ce),
// mvs04042000 - Changed back the lcid from 0x409 to 0x09. Else we have problems
// connecting from the PowerTool.
lcid(0x09), /// <<----- WARNING HERE
version(7.16),
helpfile("OsdkTlb.hlp"),
helpstring("OPC Server 7.16 Library"),
helpcontext(0x00000010)
]
library ED3Drv
{
importlib("stdole32.tlb");
[
uuid(b66ac2ca-d99e-4319-8fc0-08c0b65e65df),
appobject
]
coclass ED3Server
{
[default] interface IED3Driver;
interface IDriver;
interface IDriverMessage;
interface IDataScopeConnect;
interface IDispatch;
[source] interface IDataScopeSink;
};
};
The IDL above is part of a toolkit that was designed to help people write C++ DCOM clients and servers that match a specification called OPC (OLE for Process Controls).
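lcid is the locale ID (LCID). 0x409 equals 1033, which is English (United States). 0x09 is not a specific locale: it carries the same primary language (0x09, English) with a sublanguage of 0 (neutral), so the 0x400 bit is the sublanguage field that selects the United States variant.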
See http://msdn.microsoft.com/en-us/library/ms912047(v=winembedded.10).aspx for a complete list of valid values.
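The bit being toggled can be read directly off the LCID layout. This sketch assumes the usual Windows layout (language identifier in the low 16 bits, of which the low 10 bits are the primary language and the upper 6 the sublanguage, with a sort ID above that); the function name is invented.

```python
def split_lcid(lcid):
    """Split a Windows LCID into sort ID, primary language and sublanguage."""
    lang_id = lcid & 0xFFFF
    return {
        "sort_id": (lcid >> 16) & 0xF,
        "primary_language": lang_id & 0x3FF,   # low 10 bits
        "sublanguage": lang_id >> 10,          # upper 6 bits of the language ID
    }

for value in (0x409, 0x09):
    print(hex(value), split_lcid(value))
```

Both values share primary language 0x09; they differ only in the sublanguage field, which is where the 0x400 bit lives.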

relocation entries in a shared lib

I'm investigating relocation of shared libraries, and ran into something strange. Consider this code:
int myglob;
int ml_util_func(int p)
{
return p + 2;
}
int ml_func2(int a, int b)
{
int c = ml_util_func(a);
return c + b + myglob;
}
I compile it to a non-PIC shared lib with gcc -shared. I do this on a 32-bit Ubuntu running on x86.
The resulting .so has a relocation entry for the call to ml_util_func in ml_func2. Here's the output of objdump -dR -Mintel on ml_func2:
0000050d <ml_func2>:
50d: 55 push ebp
50e: 89 e5 mov ebp,esp
510: 83 ec 14 sub esp,0x14
513: 8b 45 08 mov eax,DWORD PTR [ebp+0x8]
516: 89 04 24 mov DWORD PTR [esp],eax
519: e8 fc ff ff ff call 51a <ml_func2+0xd>
51a: R_386_PC32 ml_util_func
51e: 89 45 fc mov DWORD PTR [ebp-0x4],eax
521: 8b 45 0c mov eax,DWORD PTR [ebp+0xc]
524: 8b 55 fc mov edx,DWORD PTR [ebp-0x4]
527: 01 c2 add edx,eax
529: a1 00 00 00 00 mov eax,ds:0x0
52a: R_386_32 myglob
52e: 8d 04 02 lea eax,[edx+eax*1]
531: c9 leave
532: c3 ret
533: 90 nop
Note the R_386_PC32 relocation on the call instruction.
Now, my question is: why is this relocation needed? e8 is "call relative..." on an x86, and since ml_util_func is defined in the same object, surely the linker can compute the relative offset between it and the call without leaving it to the dynamic loader?
Interestingly, if ml_util_func is declared static, the relocation disappears and the linker correctly computes and inserts the offset. What is it about ml_util_func being also exported that makes the linker lazy about it?
P.S.: I'm playing with non-PIC code on purpose, to understand load-time relocations.
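For reference, the value written for an R_386_PC32 relocation is S + A - P (symbol address, plus addend, minus the address of the field being patched). The sketch below works through the numbers in the listing: P is 0x51a and the in-place addend is -4 (the bytes fc ff ff ff). The address 0x4fc for ml_util_func is a made-up value, since the listing does not show it.

```python
MASK32 = 0xFFFFFFFF

def r_386_pc32(symbol_addr, addend, place):
    """Value stored for an R_386_PC32 relocation: S + A - P, as 32 bits."""
    return (symbol_addr + addend - place) & MASK32

def call_target(next_insn, rel32):
    """Address reached by 'call rel32', relative to the next instruction."""
    if rel32 & 0x80000000:          # sign-extend the 32-bit displacement
        rel32 -= 1 << 32
    return (next_insn + rel32) & MASK32

ML_UTIL_FUNC = 0x4FC                # hypothetical address, not from the listing
rel32 = r_386_pc32(ML_UTIL_FUNC, -4, 0x51A)
print(hex(rel32), hex(call_target(0x51E, rel32)))

# Loading the library at a different base shifts S and P equally,
# so the stored displacement does not change.
base = 0xB7700000
assert r_386_pc32(ML_UTIL_FUNC + base, -4, 0x51A + base) == rel32
```

The displacement is independent of the load address as long as the call binds to the library's own definition, which is the premise of the question.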
I can't find the reason, but here is a comment from binutils about this:
binutils-2.11.90-20010705-src.tar.gz/bfd/elf32-i386.c : 679
/* If we are creating a shared library, and this is a reloc
against a global symbol, or a non PC relative reloc
against a local symbol, then we need to copy the reloc
into the shared library. However, if we are linking with
-Bsymbolic, we do not need to copy a reloc against a
global symbol which is defined in an object we are
I think this relocation is created to allow the user to override (interpose) any global symbol in the library. It seems that -Bsymbolic disables this ability and will not generate a relocation for a symbol defined in the library itself.
http://www.rocketaware.com/man/man1/ld.1.htm
-Bsymbolic
This option causes all symbolic references in the output to be
resolved in this link-edit session. The only remaining run-time
relocation requirements are base-relative relocations, i.e.
translation with respect to the load address. Failure to resolve
any symbolic reference causes an error to be reported.
Longer description of various -B modes and limitations (C++) is here:
http://developers.sun.com/sunstudio/documentation/ss12/mr/man1/CC.1.html
-Bbinding
Specifies whether a library binding for linking is
symbolic, dynamic (shared), or static (nonshared).
-Bdynamic is the default. You can use the -B
option several times on a command line.
For more information on the -Bbinding option, see
the ld(1) man page and the Solaris documentation.
-Bdynamic directs the link editor to look for
liblib.so files. Use this option if you want
shared library bindings for linking. If the
liblib.so files are not found, it looks for
liblib.a files.
-Bstatic directs the link editor to look only for
liblib.a files. The .a suffix indicates that the
file is static, that is, nonshared. Use this
option if you want nonshared library bindings for
linking.
-Bsymbolic forces symbols to be resolved within a
shared library if possible, even when a symbol is
already defined elsewhere. For an explanation of
-Bsymbolic, see the ld(1) man page.
This option and its arguments are passed to the
linker, ld. If you compile and link in separate
steps and are using the -Bbinding option, you must
include the option in the link step.
Warning:
Never use -Bsymbolic with programs containing C++
code, use linker scoping instead. See the C++
User's Guide for more information on linker
scoping. See also the -xldscope option.
With -Bsymbolic, references in different modules
can bind to different copies of what is supposed
to be one global object.
The exception mechanism relies on comparing
addresses. If you have two copies of something,
their addresses won't compare equal, and the
exception mechanism can fail because the exception
mechanism relies on comparing what are supposed to
be unique addresses.
Note that an object is not necessarily a block that is linked in its entirety. Symbols can be put in separate sections, and each section is placed in the final .exe only if it is referenced by code (search for the --gc-sections linker option and the related GCC options -ffunction-sections and -fdata-sections).
The linker might simply not bother with this micro-optimization when separate sections are not used.

sprof "PLTREL not found error"

I'm trying to profile our shared library, but whenever I have the environmental variable LD_PROFILE set, I get "PLTREL not found in object ". What gives? Is there some sort of linker flag I'm missing or what? There seems to be no information about this on the internets. The man page for sprof is about 10 words long.
According to an unanswered question on Google Groups, it looks like you aren't the very first person with this problem.
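I think PLTREL refers to DT_PLTREL, the dynamic-section entry that records which relocation type (DT_REL or DT_RELA) the procedure linkage table (PLT) uses. From some ELF design notes: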
There is a .plt section created in the code segment, which is an array of function stubs used to handle the run-time resolution of library calls.
And here's yet a little more:
The next section I want to mention is the .plt section. This contains the jump table that is used when we call functions in the shared library. By default the .plt entries are all initialized by the linker not to point to the correct target functions, but instead to point to the dynamic loader itself. Thus, the first time you call any given function, the dynamic loader looks up the function and fixes the target of the .plt so that the next time this .plt slot is used we call the correct function. After making this change, the dynamic loader calls the function itself.
Sounds to me like there's an issue with how the shared library was compiled or assembled. Hopefully a few more searches for "ELF PLT section" will get you on the right track.
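The lazy binding described in the quoted notes can be modeled in a few lines. This is a toy model, not how the dynamic loader is implemented: the class, its names, and the dictionary standing in for a shared library are all invented for illustration.

```python
class LazyPlt:
    """Toy model of a PLT whose slots are resolved on first call."""

    def __init__(self, library):
        self.library = library          # name -> function, stands in for the .so
        self.resolve_count = 0
        # Every slot starts out pointing at a resolver stub, not at the target.
        self.slots = {name: self._make_stub(name) for name in library}

    def _make_stub(self, name):
        def stub(*args):
            self.resolve_count += 1     # the "dynamic loader" runs once per symbol
            target = self.library[name]
            self.slots[name] = target   # patch the slot for later calls
            return target(*args)        # then call the real function
        return stub

    def call(self, name, *args):
        return self.slots[name](*args)

plt = LazyPlt({"add": lambda a, b: a + b})
first = plt.call("add", 1, 2)    # goes through the stub and patches the slot
second = plt.call("add", 3, 4)   # goes straight to the target
print(first, second, plt.resolve_count)
```

A library with no code and no imported functions has nothing to put in such a table, which fits the "no PLTREL found" reports for data-only libraries.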
Found this, which may be relevant for you:
Known issues with LD_AUDIT
➢ LD_AUDIT does not work with Shared Libraries with no code in them.
➢ Example ICU-4.0 “libicudata.so”
➢ Error: “no PLTREL found in object /usr/lib/libicudata.so.40”
➢ Recompile after patching libicudata by sed'ing -nostdlib etc away: sed -i -- "s/-nodefaultlibs -nostdlib//" config/mh-linux
It seems the same applies for LD_PROFILE
