The compilation process - visual-c++

Can anyone explain how compilation works?
I can't seem to figure out how compilation works..
To be more specific, here's an example.. I'm trying to write some code in MSVC++ 6 to load a Lua state..
I've already:
set the additional directories for the library and include files to the right directories
used extern "C" (because Lua is C only or so I hear)
include'd the right header files
But I'm still getting some errors in MSVC++ 6 about unresolved external symbols (for the Lua functions that I used).
As much as I'd like to know how to solve this problem and move on, I think it would be much better for me if I came to understand the underlying processes involved, so could anyone perhaps write a nice explanation for this? What I'm looking to know is the process.. It could look like this:
Step 1:
Input: Source code(s)
Process: Parsing (perhaps add more detail here)
Output: whatever is output here..
Step 2:
Input: Whatever was output from step 1, plus maybe whatever else is needed (libraries? DLLs? .so? .lib? )
Process: whatever is done with the input
Output: whatever is output
and so on..
Thanks..
Maybe this will explain what symbols are, what exactly "linking" is, what "object" code or whatever is..
Thanks.. Sorry for being such a noob..
P.S. This doesn't have to be language specific.. But feel free to express it in the language you're most comfortable in.. :)
EDIT: So anyway, I was able to get the errors resolved, it turns out that I have to manually add the .lib file to the project; simply specifying the library directory (where the .lib resides) in the IDE settings or project settings does not work..
However, the answers below have somewhat helped me understand the process better. Many thanks!.. If anyone still wants to write up a thorough guide, please do.. :)
EDIT: Just for additional reference, I found two articles by one author (Mike Diehl) to explain this quite well.. :)
Examining the Compilation Process: Part 1
Examining the Compilation Process: Part 2

From source to executable is generally a two stage process for C and associated languages, although the IDE probably presents this as a single process.
1/ You code up your source and run it through the compiler. The compiler at this stage needs your source and the header files of the other stuff that you're going to link with (see below).
Compilation consists of turning your source files into object files. Object files have your compiled code and enough information to know what other stuff they need, but not where to find that other stuff (e.g., the Lua libraries).
2/ Linking, the next stage, is combining all your object files with libraries to create an executable. I won't cover dynamic linking here since that will complicate the explanation with little benefit.
Not only do you need to specify the directories where the linker can find the other code, you need to specify the actual library containing that code. The fact that you're getting unresolved externals indicates that you haven't done this.
As an example, consider the following simplified C code (xx.c) and command.
#include <bob.h>
/* bob_fn() is declared in bob.h; its code lives in the bob library */
int main(void) { return bob_fn(7); }
cc -c -o xx.obj xx.c
This compiles the xx.c file to xx.obj. The bob.h contains the prototype for bob_fn() so that compilation will succeed. The -c instructs the compiler to generate an object file rather than an executable and the -o xx.obj sets the output file name.
But the actual code for bob_fn() is not in the header file but in /bob/libs/libbob.so, so to link, you need something like:
cc -o xx.exe xx.obj -L/bob/libs -L/usr/lib -lbob
This creates xx.exe from xx.obj, using libraries (searched for in the given paths) of the form libbob.so (the lib and .so are added by the linker usually). In this example, -L sets the search path for libraries. The -l specifies a library to find for inclusion in the executable if necessary. The linker usually takes the "bob" and finds the first relevant library file in the search path specified by -L.
A library file is really a collection of object files (sort of how a zip file contains multiple other files, but not necessarily compressed) - when the first relevant occurrence of an undefined external is found, the object file is copied from the library and added to the executable just like your xx.obj file. This generally continues until there are no more unresolved externals. The 'relevant' library is a modification of the "bob" text, it may look for libbob.a, libbob.dll, libbob.so, bob.a, bob.dll, bob.so and so on. The relevance is decided by the linker itself and should be documented.
How it works depends on the linker but this is basically it.
1/ All of your object files contain a list of unresolved externals that they need to have resolved. The linker puts together all these objects and fixes up the links between them (resolves as many externals as possible).
2/ Then, for every external still unresolved, the linker combs the library files looking for an object file that can satisfy the link. If it finds it, it pulls it in - this may result in further unresolved externals as the object pulled in may have its own list of externals that need to be satisfied.
3/ Repeat step 2 until there are no more unresolved externals or no possibility of resolving them from the library list (this is where your development was at, since you hadn't included the Lua library file).
The complication I mentioned earlier is dynamic linking. That's where you link with a stub of a routine (sort of a marker) rather than the actual routine, which is later resolved at load time (when you run the executable). Things such as the Windows common controls are in these DLLs so that they can change without having to relink the objects into a new executable.

Step 1 - Compiler:
Input: Source code file[s]
Process: Parsing source code and translating into machine code
Output: Object file[s], which consist[s] of:
The names of symbols which are defined in this object, and which this object file "exports"
The machine code associated with each symbol that's defined in this object file
The names of symbols which are not defined in this object file, but on which the software in this object file depends and to which it must subsequently be linked, i.e. names which this object file "imports"
Step 2 - Linking:
Input:
Object file[s] from step 1
Libraries of other objects (e.g. from the O/S and other software)
Process:
For each object that you want to link
Get the list of symbols which this object imports
Find these symbols in other libraries
Link the corresponding libraries to your object files
Output: a single, executable file, which includes the machine code from all your objects, plus the objects from libraries which were imported (linked) to your objects.

The two main steps are compilation and linking.
Compilation takes single compilation units (those are simply source files, with all the headers they include), and creates object files. Now, in those object files, there are a lot of functions (and other stuff, like static data) defined at specific locations (addresses). In the next step, linking, a bit of extra information about these functions is also needed: their names. So these are also stored. A single object file can reference functions (because it wants to call them when the code is run) that are actually in other object files, but since we are dealing with a single object file here, only symbolic references (their 'names') to those other functions are stored in the object file.
Next comes linking (let's restrict ourselves to static linking here). Linking is where the object files that were created in the first step (either directly, or after they have been thrown together into a .lib file) are taken together and an executable is created.
In the linking step, all those symbolic references from one object file or lib to another are resolved (if they can be), by looking up the names in the correct object, finding the address of the function, and putting the addresses in the right place.
Now, to explain something about the 'extern "C"' thing you need:
C does not have function overloading. A function is always recognizable by its name. Therefore, when you compile code as C code, only the real name of the function is stored in the object file.
C++, however, has something called 'function / method overloading'. This means that the name of a function is no longer enough to identify it. C++ compilers therefore create 'names' for functions that include the prototypes of the function (since the name plus the prototype will uniquely identify a function). This is known as 'name mangling'.
The 'extern "C"' specification is needed when you want to use a library that has been compiled as 'C' code (for example, the pre-compiled Lua binaries) from a C++ project.
For your exact problem: if it still does not work, these hints might help:
* have the Lua binaries been compiled with the same version of VC++?
* can you simply compile Lua yourself, either within your VC solution, or as a separate project as C++ code?
* are you sure you have all the 'extern "C"' things correct?

You have to go into the project settings and, somewhere on the "Linker" tab, add the directory where you have the Lua library *.lib files. The setting is called "including libraries" or something; sorry, I can't look it up.
The reason you get "unresolved external symbols" is that compilation in C++ works in two stages. First, the code gets compiled, each .cpp file into its own .obj file; then the "linker" starts and joins all those .obj files into an .exe file. A .lib file is just a bunch of .obj files merged together to make distribution of libraries a little bit simpler.
So by adding all the "#include" and extern declarations, you told the compiler that code with those signatures can be found somewhere, but the linker can't find that code because it doesn't know where the .lib files with the actual code are located.
Make sure you have read the README of the library; usually they have a rather detailed explanation of what you have to do to include it in your code.

You might also want to check this out: COMPILER, ASSEMBLER, LINKER AND LOADER: A BRIEF STORY.

Related

Why would a linker try to link to a file I never told it to link to?

I'm getting a linker error indicating that the linker was unable to open a file (a static library) and therefore it fails. I am having a very difficult time troubleshooting this error because I never told the linker to link to the file which it is failing to open.
I am telling the linker to link to several static libraries. Many of the libraries I am linking to are wxWidgets static libraries. I don't need ALL of the modules from wxWidgets, so there are some which I am linking to and many which I am not. The file which the linker can't open is 'wxbase31ud_net.lib'. Like I said, that file is not among the libraries I am linking to. My immediate thought was that this dependency was being introduced implicitly somehow, perhaps by one of the wxwidgets libraries I WAS linking to. I didn't think static linkage worked this way but I didn't have any other ideas. I have been investigating that possibility and I've found nothing which indicates that is the case.
I set the build output verbosity to maximum, and the 'wxbase31ud_net.lib' is never mentioned anywhere until the error is reported.
I confirmed in my cmake project that the file in question was never passed back to me from the FindWxWidgets module, and was never referenced in any of the lists of files I associate with the target.
I grepped through the entire project directory and found no reference to the file anywhere, including the cmake-generated project files (visual studio project files).
What could be causing the linker to try and open this file?
Edit: Also, to be clear, the error I'm seeing is LNK1104
It's probably from a #pragma comment(lib,"???"), except that in the case of wx the argument to the pragma may be built from complex macros, which makes it difficult to grep for. This particular one may be from setup.h with #pragma comment(lib, wxWX_LIB_NAME("base", "")). You should be solving this by adding the directory with the wx libs to the linker's search directories.
The answer by zeromus is correct, this is almost certainly indeed due to including msvc/wx/setup.h which contains #pragma comment(lib)s. Possible solutions:
Simplest: build all the libraries, this will solve the errors and it's not a problem to link with a library you don't use.
Also simple but slightly less obvious: predefine wxNO_NET_LIB when building your project, this will prevent the file above from autolinking this particular library. You may/will need to define more wxNO_XXX_LIB symbols if you're missing other libraries, of course.
Less simple but arguably the least magic too: stop using $(WXWIN)/include/msvc in your include path, then wx/setup.h under it won't be included and nothing will be linked in automatically. The drawback is that you will have to specify all the libraries you do need to link with manually.

Linking to a custom .a from multiple objects

In our build system, we generate multiple .so files (foo.so, bar.so, ...) that are loaded during runtime by the main executable (biz). So the .so files are linked separately.
We also have our own util.a static library, that has some utility functions and global data.
The problem comes when some of the .so want to use util.a data/function, but we can't link each .so to util.a. It's because of the data section: global data must be unique in the program address space. If more than one .so is linked to util.a and has a copy of the data, the program behavior will be very surprising but hard to debug.
We can't link the executable (biz) to util.a either: the linker will not put everything into the target, since biz itself doesn't reference the functions that the .so files need.
Unless, of course, we link util.a with -Wl,-whole-archive. But is there a better way to do this?
Solution 1: consider making util.a a dynamic library util.so.
Solution 2: don't let the linker export any symbols exported by util.a. When using gcc you can achieve this for example by using __attribute__((visibility("hidden"))):
int __attribute__((visibility("hidden"))) helperfunc(void *p);
You can use objdump to check which symbols are exported.
To answer my own question, the eventual solution was like:
http://lists.gnu.org/archive/html/qemu-devel/2014-09/msg00099.html
TL;DR: Search for all the interesting symbols (that you want to pull from archives) inside the .so objects with nm (1), and inject into the compiling command line with -Wl,-u,$SYMBOL. Note that the -Wl,-u,$SYMBOL arguments need to come before archive names in the command line, so the linker knows that it needs to link them.

cmake compiling but not linking a new source file in a library (libonion)

I am a cmake newbie (on Debian/Sid/Linux/x86-64)
I forked libonion on https://github.com/bstarynk/onion to enable customization of malloc with Boehm's garbage collector; see this mail thread.
I added two files there, onion/src/low_util.c and onion_src/low_util.h (which is #include-d successfully in several other patched files).
It is compiled but not linked.
set(SOURCES onion.c codecs.c dict.c low_util.c request.c response.c handler.c
log.c sessions.c sessions_mem.c shortcuts.c block.c mime.c url.c ${POLLER_C}
listen_point.c request_parser.c http.c ${HTTPS_C} websocket.c ${RANDOM_C} ${SQLITE3_C})
later:
SET(INCLUDES_ONION block.h codecs.h dict.h handler.h http.h https.h listen_point.h low_util.h log.h mime.h onion.h poller.h request.h response.h server.h sessions.h shortcuts.h types.h types_internal.h url.h websocket.h ${SQLITE3_H})
MESSAGE(STATUS "Found include files ${INCLUDES_ONION}")
but when I build, my file low_util.c got compiled but not linked.
Linking C executable otemplate
CMakeFiles/opack.dir/__/__/src/onion/dict.c.o: In function `onion_dict_new':
dict.c:(.text+0x1bc): undefined reference to `onionlow_calloc'
CMakeFiles/opack.dir/__/__/src/onion/dict.c.o: In function `onion_dict_node_data_free':
dict.c:(.text+0x2ec): undefined reference to `onionlow_free'
CMakeFiles/opack.dir/__/__/src/onion/dict.c.o: In function `onion_dict_node_add':
Notice that libonion is a library (in C, providing HTTP service) and that I just want to add a low_util.c file (wrapping malloc & pthread_create etc... to make Boehm's GC happy: it is calling GC_malloc and GC_pthread_create ....) with its low_util.h header. Surprisingly, they get compiled, but do not seem to be linked. And I am not familiar with cmake and I am not familiar with how D.Moreno (the main author of libonion) has organized his cmake files.
Any clues?
Apply the following patch to make it link. The two executables that use the symbols from the .c file you added were not being linked with it; the patch adds it to them.
http://pastebin.com/mDMRiUQu
Based on what you posted, it's hard to tell what could be wrong. The CMake source code above says that a variable ${SOURCES} is equivalent to onion.c codecs.c dict.c low_util.c ... ${SQLITE3_C}, and a variable ${INCLUDES_ONION} is equivalent to block.h codecs.h dict.h ... ${SQLITE3_H}. You did not provide any targets or the files included in those targets.
A brief list of things that may help:
Where do you define the top level library or executable? If you're making a library, you will need the add_library() command. If you are making an executable, you will need the add_executable() command.
Use the command target_link_libraries() to resolve dependencies. Rather than placing all of the source files in a single library, group similar files together in a single target (a target is defined by the add_* commands), and use this command to link the targets after compilation.
Use find_package() to get any libraries which are defined on your system but not in your project. Then, link to that library using the target_link_libraries() command.
In this case, if the onion_dict_* functions are defined within the same library, you're not including those files in the library. When you use add_library() or add_executable(), ensure you add those files to the list. If the functions are within your project but not in the same library, use the target_link_libraries() command to link to the library which contains the correct files. If those functions are defined in an external library, then first find the library using find_package(), and then link to the library using target_link_libraries().

make one static library from whole project with cmake

A C++ project, say foo, is maintained with CMake.
One wants to create a single library libfoo.a (with all classes/methods/functions from the whole source tree), so that programs can be linked against it with -lfoo.
OK, let's consider a toy example that makes the problem clear. Directory foo (root of the project) contains directories a and b. Two CMakeLists.txt files are created:
# a/CMakeLists.txt
add_library(A <a_sources>)
# b/CMakeLists.txt
add_library(B <b_sources>)
And one CMakeLists.txt for root directory:
add_subdirectory(a)
add_subdirectory(b)
add_library(foo <foo_sources>)
target_link_libraries(foo A B)
That was a surprise for me: after building, libfoo.a contains only the methods from foo_sources; a_sources and b_sources are excluded.
That is OK when executables are built within the same project: while creating an executable, CMake "guesses" that A and B must be linked in if the executable is linked to foo.
But when an executable is created "outside" the project and uses library foo, one must link with -lfoo -lA -lB. Now imagine a project with lots of subdirectories: how to deal with it? So the question is: "how to create one library, aggregating methods from the whole project, by means of CMake?"
Googling led me to the relatively recent OBJECT library feature (it appeared in CMake 2.8.8). A nice example of using it is shown here. Now the problem above can be solved like this:
# a/CMakeLists.txt
add_library(A OBJECT <a_sources>)
# b/CMakeLists.txt
add_library(B OBJECT <b_sources>)
# foo/CMakeLists.txt
add_subdirectory(a)
add_subdirectory(b)
add_library(foo <foo_sources> $<TARGET_OBJECTS:A> $<TARGET_OBJECTS:B>)
The problem seems to be solved; unfortunately, not quite.
If the dependency chain is longer than 2, for example foo depends on A, which depends on B, the problem still remains.
That is because,
Object libraries may contain only sources (and headers) that compile to object files.
and
Object libraries cannot be imported, exported, installed, or linked.
(quotes are taken from the same link)
I've tried several combinations of target_link_libraries(), add_library(), and add_library(... OBJECT ...) trying to link A and B to foo, without success (errors during the CMake run).
I must be missing something simple, please help, thank you!
I am not sure whether it is important: the project is maintained on Linux.
I think you're getting tangled up in the term "depends on". If you're building a library named foo and it has two parts, A and B, it doesn't matter whether A depends on B; the library should contain both. The CMake code you've shown will build foo properly.
Yep, I support Pete Becker's answer. But it should be said as well that $<TARGET_OBJECTS:A> and $<TARGET_OBJECTS:B> are actually not libraries at all, but rather CMake-internal lists of object modules. There are no dependencies between the compilations of object modules (except for auto-generated sources), so they can be done in any order and in parallel.
I guess a more correct term for your intention is gathering several TARGET_OBJECTS together under a single object library. It's really bad that you can't write add_library(B OBJECT b.cpp $<TARGET_OBJECTS:A>). But you can always implement this yourself:
add_library(A OBJECT a.cpp)
set(A_OBJECTS $<TARGET_OBJECTS:A>)
add_library(B OBJECT b.cpp)
set(B_OBJECTS $<TARGET_OBJECTS:B> ${A_OBJECTS})
add_library(foo ${B_OBJECTS})
I.e. just create special *_OBJECTS variables and use them whenever you want to include those object libraries in a library, an executable, or as part of another object library with its own _OBJECTS variable.

What does the object code file ctr1.o do in the gcc compiler?

What does the object file ctr1.o do in the gcc compiler? Why does the linker link this object file whenever an executable is generated?
I think it contains very basic stuff (crt stands for C run time) like setting up argv and argc for your main function etc ... Here is a link with some explanation
If you don't want it, because you are writing a tiny bootloader for example, without any bit of the libc, you can use the -nostdlib option to link your program. If you go this way, you will also need to write your own linker script.
I'm not sure I understand your question, but I guess you are referring to 'crt1.o' in the GCC package.
crt1.o is part of the C runtime startup code that comes with the libc. It contains the real entry point of the program, which sets things up and then calls main(); functions like 'printf' live in the libc itself rather than in crt1.o.
That's why it is linked into even the most basic C applications.
Object files hold your compiled code, but are not in themselves executable. It is the job of the linker to take all the object files that make up a program, and join them into a whole. This involves resolving references between object files (extern symbols), checking that there is a main() entrypoint (for C programs), and so on.
Since each source file (.c or .cpp) compiles into a separate object file, which is then read by the linker, a change to a single C file means only that file needs to be re-compiled, generating a new object file, which is then linked with the existing object files into a new executable. This makes development faster.
UPDATE: As stated in another answer, the "crt1.o" object file holds the C runtime startup code, which is assumed to be needed by most C programs. You can read the gcc linker options and find the -nostdlib option; this will tell gcc that your particular program should not be linked with the standard startup files and libraries.

Resources