This question has some answers on SO but mine is slightly different. Before marking as duplicate, please give it a shot.
MSVC has always provided the /Gy compiler option, which packages each function into its own COMDAT section so that identical functions can later be folded together. At the same time, the linker provides the /OPT:ICF option. Is my understanding right that these two options must be used in conjunction? That is, while the former packages functions into COMDATs, the latter eliminates redundant COMDATs. Is that correct?
If yes, does that mean we should either use both or turn off both?
An answer from someone who communicated with me offline. It helped me understand these options a lot better.
===================================
That is essentially true. Suppose we talk just C, or C++ but with no member functions. Without /Gy, the compiler creates object files that are in some sense irreducible. If the linker wants just one function from the object, it gets them all. This is especially a consideration in programming for libraries: if you mean to be kind to the library's users, you should write your library as lots of small object files, typically one non-static function per object, so that users of the library don't bloat from having to carry code that never actually executes.
With /Gy, the compiler creates object files that have COMDATs. Each function is in its own COMDAT, which is to some extent a mini-object. If the linker wants just one function from the object, it can pick out just that one. The linker's /OPT switch gives you some control over what the linker does with this selectivity - but without /Gy there's nothing to select.
Or very little. It's at least conceivable that the linker could, for instance, fold functions that are each the whole of the code in an object file and happen to have identical code. It's certainly conceivable that the linker could eliminate a whole object file that contains nothing that's referenced. After all, it does this with object files in libraries. The rule in practice, however, used to be that if you add a non-COMDAT object file to the linker's command line, then you're saying you want that in the binary even if unreferenced. The difference between what's conceivable and what's done is typically huge.
Best, then, to stick with the quick answer. The linker options benefit from being able to separate functions (and variables) from inside each object file, but the separation depends on the code and data having been organised into COMDATs, which is the compiler's work.
===================================
As answered by Raymond Chen in Jan 2013
As explained in the documentation for /Gy, function-level linking allows functions to be discardable during the "unused function" pass, if you ask for it via /OPT:REF. It does not alter the actual classical model for linking. The flag name is misleading. It's not "perform function-level linking". It merely enables it by telling the linker where functions begin and end. And it's not so much function-level linking as it is function-level unlinking. -Raymond
(This snippet might make more sense with some further context; here are the posts about the classical linking model: 1, 2.)
So, in a nutshell: yes. If you activate one switch without the other, there will be no observable impact.
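To see the pairing in action, here is a minimal sketch; the file name, function names, and exact build lines are mine, not from the discussion above. With /Gy the compiler puts each function in its own COMDAT, and /OPT:ICF then folds COMDATs with identical contents, which is observable (as documented) through unexpected pointer equality:

    // icf_demo.cpp - a hedged sketch of the /Gy + /OPT:ICF pairing.
    //
    // Hypothetical build:
    //   cl /c /Gy icf_demo.cpp
    //   link /OPT:ICF icf_demo.obj /OUT:icf_demo.exe
    //
    // Without /Gy, both functions live in one indivisible section
    // and the linker has nothing to select or fold.

    int add_a(int x, int y) { return x + y; }
    int add_b(int x, int y) { return x + y; } // byte-identical to add_a

    int main()
    {
        // Under ICF the two names may resolve to the same address,
        // a documented side effect of folding identical functions.
        return (&add_a == &add_b) ? 0 : 1;
    }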
Related
Many discussions like this and this have warned us with examples that trying to dlopen a PIE could never be correct. The reasons are various: copy relocations, TLS, etc.
However, these problems can be circumvented if we loosen the restrictions. This question showed us that compiling with -fPIC can eliminate copy relocations, and TLS seems to work all right.
This brings up the question of how far we are from correctly dynamically loading a PIE. I agree with the idea stated in link 1:
Bottom line: this was never designed to work, and you just happened to not step on many of the land-mines, so you thought it is working, when in fact you were exercising undefined behavior.
But I'm more interested in WHY we cannot do that than in yet another failing example.
More specifically, users could write their own runtime dynamic linker, as this comment suggests, which could make some strong assumptions or compromises just for this purpose. Yet this requires extremely broad knowledge of compiling, linking, and loading, parts of which are known to be poorly documented.
So again: how do users correctly dynamically load PIEs, or at least how can they try to find a way to do that (or establish that it can't be done)?
But I'm more interested in WHY we cannot do that than in yet another failing example.
Because the designers of GLIBC didn't intend to allow for this to happen and don't consider this to be a valid use case.
More specifically, users could write their own runtime dynamic linker
Absolutely. You are free to design your own libc and the dynamic loader to allow for this use case. That requirement will add some complexity, but there is no fundamental reason it can't be done.
You may also find an existing alternate libc implementation which doesn't have this restriction (either because it has been designed in, or because the designers forgot to enforce it, as was the case with GLIBC before this patch).
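For concreteness, here is a minimal sketch of the attempt itself, assuming a Linux/glibc system; the path "./some_pie" is hypothetical. A glibc that carries the patch above is expected to refuse it outright:

    #include <cstdio>
    #include <dlfcn.h> // dlopen/dlerror/dlclose; link with -ldl

    int main()
    {
        // "./some_pie" stands in for a position-independent executable.
        void* handle = dlopen("./some_pie", RTLD_NOW);
        if (handle == nullptr) {
            // A patched glibc reports something along the lines of
            // "cannot dynamically load position-independent executable".
            std::fprintf(stderr, "dlopen failed: %s\n", dlerror());
            return 1;
        }
        // On an older glibc this may "succeed" and step on the
        // land-mines (copy relocations, TLS, ...) discussed above.
        dlclose(handle);
        return 0;
    }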
how do users correctly dynamically load PIEs
They don't.
how can they try to find a way to do that (or establish that it can't be done)?
The usual solution is to "not do that", and in fact the need to "do that" seems to be very esoteric.
Why do you need to dlopen a PIE executable in the first place?
I want to make a quick fix to one of the project's .so libraries. Is it safe to just recompile the .so and replace the original? Or do I have to rebuild and reinstall the whole project? Or does it depend?
It depends. The shared library needs to be binary-compatible with your executable.
For example,
if you changed the behaviour of one of the library's internal functions, you probably don't need to recompile.
If you changed the size of a struct (e.g. by adding a member) that's known to the application, you will need to recompile; otherwise the application will still use the old, smaller layout, and the library will crash when it tries to read an extra uninitialized member that the application never wrote (see the sketch after these examples).
If you change the type or the position of the arguments of any function visible to the application, you do need to recompile, because the library will try to read more arguments off the stack than the application has put on it (this is the case with C; in C++, argument types are part of the function signature, so the app will refuse to run rather than crashing).
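To make the struct example concrete, here is a hedged sketch (all names are hypothetical) of what goes wrong when only the library is rebuilt against a grown struct:

    #include <cstdio>

    // Layout the application was compiled against (old header):
    namespace app { struct Config { int timeout; }; }

    // Layout the rebuilt library was compiled against (new header):
    namespace lib { struct Config { int timeout; int retries; }; }

    // The application allocates the old, smaller struct...
    app::Config g_config = { 30 };

    // ...while the rebuilt library interprets the same memory with
    // the new, larger layout.
    void library_uses_config(void* p)
    {
        lib::Config* cfg = static_cast<lib::Config*>(p);
        // Reads past the application's object: undefined behavior,
        // typically garbage or a crash.
        std::printf("retries = %d\n", cfg->retries);
    }

    int main()
    {
        library_uses_config(&g_config);
    }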
The rule of thumb (for production releases) is that if you are not consciously aware that you are maintaining binary compatibility, or are not sure what binary compatibility is, you should recompile.
That's certainly the intent of using dynamic libraries: if something in the library needs updating, then you just update the library, and programs that use it don't need to be changed. If the signature of the function you're changing doesn't change, and it accomplishes the same thing, then this will in general be fine.
There are of course always edge cases where a program depends on some undocumented side-effect of a function, and then changing that function's implementation might change the side-effect and break the program; but c'est la vie.
If you have not changed the ABI of the shared library, you can just rebuild and replace the library.
Yes, it depends.
Assuming you have the exact same source and compiler that built everything else, then if you only change something in a .cpp file, it is fine.
Other things, e.g. changing an interface (between the shared lib and the rest of the system) in a header file, are not fine.
If you don't change your library's binary interface, it's OK to recompile and redeploy only the shared library.
Good references:
How To Write Shared Libraries
The Little Manual of API Design
When the same piece of C++ code is compiled with the same version of the Visual C++ compiler, but at different times and possibly on different computers, does the code reordering performed by the compiler remain the same, or may it differ? That is, does the logic behind code-reordering optimization depend only on the code, or does it also depend on other parameters?
The context of the question is that I want to create a tool that determines whether two DLLs are the same or different based on their functionality.
Correct me if I'm wrong in assuming that since you want to compare DLLs based on their functionality, you don't care about implementation details. Based on this assumption, your tool could only look at the function signatures and the class, struct, etc. definitions exposed by the DLLs, which would always be the same for the same DLL regardless of the compiler.
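As a hedged illustration of that surface-level comparison (the DLL paths and export names are hypothetical), such a tool could check whether two DLLs expose the same set of entry points without looking at the code behind them:

    #include <windows.h>
    #include <cstdio>

    // True if the module exports a function with the given name.
    static bool exports_symbol(HMODULE mod, const char* name)
    {
        return GetProcAddress(mod, name) != nullptr;
    }

    int main()
    {
        HMODULE a = LoadLibraryA("first.dll");  // hypothetical DLLs
        HMODULE b = LoadLibraryA("second.dll");
        if (a == nullptr || b == nullptr)
            return 1;

        const char* names[] = { "Initialize", "DoWork", "Shutdown" };
        for (const char* n : names) {
            if (exports_symbol(a, n) != exports_symbol(b, n))
                std::printf("DLLs differ in export: %s\n", n);
        }

        FreeLibrary(b);
        FreeLibrary(a);
        return 0;
    }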
In LabVIEW, is it possible to tell from within a VI whether an output terminal is wired in the calling VI? Obviously, this would depend on the calling VI, but perhaps there is some way to find the answer for the current invocation of a VI.
In C terms, this would be like defining a function that takes arguments which are pointers to where to store output parameters, but will accept NULL if the caller is not interested in that parameter.
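For concreteness, a minimal sketch of that C idiom (the function and the work it does are made up); a null output pointer lets the callee skip work the caller doesn't want:

    #include <cstdio>

    // 'compute' stands in for a VI with a cheap and an expensive output.
    void compute(int input, int* cheap_out, int* expensive_out)
    {
        if (cheap_out != nullptr)
            *cheap_out = input * 2;

        if (expensive_out != nullptr) {
            // Stand-in for the costly calculation the caller may opt out of.
            long long acc = 0;
            for (int i = 0; i < 1000000; ++i)
                acc += i % (input + 1);
            *expensive_out = static_cast<int>(acc);
        }
    }

    int main()
    {
        int cheap = 0;
        compute(21, &cheap, nullptr); // caller skips the expensive output
        std::printf("cheap = %d\n", cheap);
    }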
As was said, you can't do this in the natural way, but there's a workaround using data value references (requires LV 2009). It is the same idea as giving a NULL pointer for an output argument: the result is passed in as an input in the form of a data value reference (which is the pointer), and the subVI checks it for Not a Reference. If it is null, the subVI does nothing.
Here is the SubVI (case true does nothing of course):
And here is the calling VI:
Images are VI snippets so you can drag and drop on a diagram to get the code.
I'd suggest you're going about this the wrong way. If the compiler is not smart enough to avoid the calculation on its own, make two versions of this VI. One that does the expensive calculation, one that does not. Then make a polymorphic VI that will allow you to switch between them. You already know at design time which version you want (because you're either wiring the output terminal or not), so just use the correct version of the polymorphic VI.
Alternatively, pass in a variable that switches on or off a Case statement for the expensive section of your calculation.
Like Underflow said, the basic answer is no.
You can have a look here to get what is probably the most official and detailed answer which will ever be provided by NI.
Extending your analogy, you can do this in LV, except LV doesn't have the concept of null that C does. You can see an example of this here.
Note that the code in the link Underflow provided will not work in an executable, because diagrams are stripped by default when building an EXE and because the RTE does not support some of the properties and methods used there.
Sorry, I see I misunderstood the question. I thought you were asking about an input, so the idea I suggested does not apply. The restrictions I pointed do apply, though.
Why do you want to do this? There might be another solution.
Generally, no.
It is possible to do a static analysis on the code using the "scripting" features. This would require pulling the calling hierarchy, and tracking the wire references.
Pulling together a trial of this, there are some difficulties. Multiple identical sub-vi's on the same diagram are difficult to distinguish. Also, terminal references appear to be accessible mostly by name, which can lead to some collisions with identically named terminals of other vi's.
NI has done a bit of work on a variation of this problem; check out this.
In general, the LV compiler optimizes the machine code in such a way that unused code is not even built into the executable.
This does not apply to subVIs (because there's no way of knowing that you won't try to use the value of the indicators somehow, although LV could do it if it removes the front panel when building an executable, and it possibly does), but there is one way you can get it to apply to a subVI: inline the subVI, which should allow the compiler to see that the outputs aren't used. You can also set its priority to subroutine, which will possibly have the same effect, but I wouldn't recommend that.
Officially, in-lining is only available in LV 2010, but there are ways of accessing the private VI property in older versions. I wouldn't recommend it, though, and it's likely that 2010 has some optimizations in this area that older versions did not.
P.S. In general, the details of the compiling process are not exposed and vary between LV versions as NI tweaks the compiler. The whole process is supposed to have been given a major upgrade in LV 2010 and there should be a webcast on NI's site with some of the details.
I have inherited a very large and complex project (actually, a 'solution' consisting of 119 'projects', most of which are DLLs) that was built and tested under VC8 (VS2005), and I have the task of porting it to VC9 (VS2008).
The porting process I used was:
1) Copy the VC8 .sln file and rename it to a VC9 .sln file.
2) Copy all of the VC8 project files, and rename them to VC9 project files.
3) Edit all of the VC9 project files, s/vc8/vc9/.
4) Edit the VC9 .sln, s/vc8/vc9/.
5) Load the VC9 .sln with VS2008, and let the IDE 'convert' all of the project files.
6) Fix compiler and linker errors until I got a good build.
So far, I have run into the following issues in that last step.
1) A change in the way decorated names are calculated, causing truncation of the names.
This is more than just a warning (http://msdn.microsoft.com/en-us/library/074af4b6.aspx). Libraries built with this warning will not link with other modules. Applying the solution given in MSDN was non-trivial, but doable. I addressed this problem separately in How do I increase the allowed decorated name length in VC9 (MSVC 2008)?
2) A change that does not allow the assignment of zero to an iterator. This is per the spec, and it was fairly easy to find and fix these previously-allowed coding errors. Instead of assigning zero to an iterator, use the value end() (see the sketch after this list).
3) for-loop scope is now per the ANSI standard. Another easy-to-fix problem.
4) More space required for pre-compiled headers. In some cases a LOT more space was required. I ended up using /Zm999 to provide the maximum PCH space. If PCH memory usage gets bumped up again, I assume that I will have to forgo PCH altogether, and just endure the increase in what is already a very long build time.
5) A change in requirements for copy ctors and default dtors. It appears that in template classes, under certain conditions that I haven't quite figured out yet, the compiler no longer generates a default ctor or a default dtor. I suspect this is a bug in VC9, but there may be something else that I'm doing wrong. If so, I'd sure like to know what it is.
6) The GUIDs in the sln and vcproj files were not changed. This does not appear to impact the build in any way that I can detect, but it is worrisome nevertheless.
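As an aside, the iterator fix from item 2 looks roughly like this (the container and names are hypothetical):

    #include <vector>

    void reset_cursor(std::vector<int>& v, std::vector<int>::iterator& it)
    {
        // it = 0;     // compiled under VC8, rejected by VC9 and the standard
        it = v.end();  // the conforming way to mark "no current element"
    }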
Note that despite all of these issues, the project built, ran, and passed extensive QA testing under VC8. I have also back-ported all of the changes to the VC8 projects, where they still build and run just as happily as they did before (using VS2005/VC8). So, all of my changes required for a VC9 build at least appear to be backward-compatible, although the regression testing is still underway.
Now for the really hard problem: I have run into a difference in the startup sequence between VC8 and VC9 projects. The program uses a small-object allocator modeled after Loki, in Andrei Alexandrescu's Book Modern C++ Design. This allocator is initialized using a global variable defined in the main program module.
Under VC8, this global variable is constructed at the very beginning of the program startup, from code in a module crtexe.c. Under VC9, the first module that executes is crtdll.c, which indicates that the startup sequence has been changed. The DLLs that are starting up appear to be confusing the small-object allocator by allocating and deallocating memory before the global object can initialize the statistics, which leads to some spurious diagnostics. The operation of the program does not appear to be materially affected, but the QA folks will not allow the spurious diagnostics to get past them.
Is there some way to force the construction of a global object prior to loading DLLs?
What other porting issues am I likely to encounter?
Is there some way to force the construction of a global object prior to loading DLLs?
How about the DELAYLOAD option? So that DLLs aren't loaded until their first call?
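For reference, a hedged sketch of what delay-loading involves; "engine.dll" is a hypothetical application DLL, and the /DELAYLOAD switch itself must go on the linker command line:

    // In the EXE project, link the delay-load helper library:
    #pragma comment(lib, "delayimp")

    // And add to the linker command line (not settable via #pragma):
    //   /DELAYLOAD:engine.dll
    //
    // engine.dll is then mapped on the first call into it, after the
    // EXE's global objects have been constructed.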
That is a tough problem, mostly because you've inherited a design that's inherently dangerous: you're not supposed to rely on the initialization order of global variables.
It sounds like something you could try to work around by replacing the global variable with a singleton that other functions retrieve by calling a global function or method that returns a pointer to the singleton object. If the object exists at the time of the call, the function returns a pointer to it. Otherwise, it allocates a new one and returns a pointer to the newly allocated object.
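A minimal sketch of that retrieve-or-create accessor, in the construct-on-first-use style (the class name is hypothetical, and note that pre-C++11 compilers such as VC9 do not guarantee thread-safe initialization of the local static):

    class SmallObjectAllocator
    {
    public:
        static SmallObjectAllocator& instance()
        {
            // Constructed on the first call, from whichever module
            // asks first, instead of at some arbitrary point during
            // global initialization.
            static SmallObjectAllocator the_instance;
            return the_instance;
        }

    private:
        SmallObjectAllocator() {}                          // created only via instance()
        SmallObjectAllocator(const SmallObjectAllocator&); // not copyable (declared, not defined)
    };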
The problem, of course, is that I can't think of a singleton implementation that would avoid the problem you're describing. Maybe this discussion would be useful: http://www.oneunified.net/blog/Personal/SoftwareDevelopment/CPP/Singleton.article
That's certainly an interesting problem. I don't have a solution other than perhaps to change the design so that there is no dependence on the undefined order of link/DLL startup. Have you considered linking with the older linker? (or whatever the VS.NET term is)
Because the behavior of your variable and allocator relied on some (unknown at the time) arbitrary startup order, I would probably fix that so that it is not an issue in the future. I guess you are really asking if anyone knows how to do some voodoo in VC9 to make the problem disappear. I am interested in hearing it as well.
How about this,
Make your main program a DLL too, call it main.dll, linked to all the other ones, and export the main function as say, mainEntry(). Remove the global variable.
Create a new main exe which has the global variable and its initialization, but doesn't link statically to any of the other application DLLs (except for the allocator stuff).
This new main.exe then dynamically loads the main.dll using LoadLibrary(), then uses GetProcAddress to call mainEntry().
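A hedged sketch of that new main.exe (main.dll and mainEntry are the hypothetical names from the proposal above):

    #include <windows.h>

    // The allocator-initializing global would live here, in the EXE,
    // so it is constructed before any application DLL gets mapped.

    typedef int (*MainEntryFn)(int argc, char** argv);

    int main(int argc, char** argv)
    {
        // Loading main.dll pulls in the rest of the application DLLs.
        HMODULE app = LoadLibraryA("main.dll");
        if (app == nullptr)
            return 1;

        MainEntryFn mainEntry =
            reinterpret_cast<MainEntryFn>(GetProcAddress(app, "mainEntry"));
        if (mainEntry == nullptr) {
            FreeLibrary(app);
            return 1;
        }

        int rc = mainEntry(argc, argv);
        FreeLibrary(app);
        return rc;
    }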
The solution to the problem turned out to be more straightforward than I originally thought. The initialization order problem was caused by the existence of several global variables of types derived from std container types (a basic design flaw that predated my position with that company). The solution was to replace all such globals with singletons. There were about 100 of them.
Once this was done, the initialization (and destruction) order was under programmer control.