Where are include files stored - Ubuntu Linux, GCC - linux

So, when we do the following:
#include <stdio.h>
versus
#include "myFile.h"
the compiler, GCC in my case, knows where that stdio.h (and even the object file) are located on my hard drive. It just utilizes the files with no interaction from me.
I think that on my Ubuntu Linux machine the files are stored at /usr/include/. How does the compiler know where to look for these files? Is this configurable or is this just the expected default? Where would I look for this configuration?
Since I'm asking about these include files: what is the source of the files? I know this might be fuzzy in the Linux community, but who manages these? Who would provide and manage the same files for a Windows compiler?
I was always under the impression that they come with the compiler but that was an assumption...

See here: Search Path
Summary:
#include <stdio.h>
When the include file is in brackets the preprocessor first searches in paths specified via the -I flag. Then it searches the standard include paths (see the above link, and use the -v flag to test on your system).
#include "myFile.h"
When the include file is in quotes the preprocessor first searches in the directory of the file that contains the #include directive (not necessarily the shell's current directory), then paths specified by -iquote, then -I paths, then the standard paths.
-nostdinc can be used to prevent the preprocessor from searching the standard paths at all.
Environment variables can also be used to add search paths (for GCC: CPATH, C_INCLUDE_PATH, and CPLUS_INCLUDE_PATH).
When compiling if you use the -v flag you can see the search paths used.

gcc is a rich and complex "orchestrating" program that calls many other programs to perform its duties. For the specific purpose of seeing where #include "goo" and #include <zap> will search on your system, I recommend:
$ touch a.c
$ gcc -v -E a.c
...
#include "..." search starts here:
#include <...> search starts here:
/usr/local/include
/usr/lib/gcc/i686-apple-darwin9/4.0.1/include
/usr/include
/System/Library/Frameworks (framework directory)
/Library/Frameworks (framework directory)
End of search list.
# 1 "a.c"
This is one way to see the search lists for included files, including (if any) directories into which #include "..." will look but #include <...> won't. This specific list I'm showing is actually on Mac OS X (aka Darwin) but the commands I recommend will show you the search lists (as well as interesting configuration details that I've replaced with ... here;-) on any system on which gcc runs properly.

Karl answered your search-path question, but as far as the "source of the files" goes, one thing to be aware of is that if you install the libfoo package and want to do some development with it (i.e., use its headers), you will also need to install libfoo-dev. The standard library header files are already in /usr/include, as you saw.
Note that some libraries with a lot of headers will install them to a subdirectory, e.g., /usr/include/openssl. To include one of those, just provide the path without the /usr/include part, for example:
#include <openssl/aes.h>

The #include files of gcc are stored in /usr/include.
The standard include files of g++ are stored in /usr/include/c++.

Related

Clang recursive include path

I have a problem when including dependency folder as this isn't looking for headers recursively.
FOLDER STRUCTURE:
- main.cpp
- dependency
- sub1
- header1.h
- sub2
- header2.h
- root-header.h
main.cpp
#include "root-header.h"
#include "header1.h"
#include "header2.h"
int main() {
}
Command:
clang main.cpp -I"dependency"
Error:
fatal error: 'header1.h' file not found
The command only finds root-header.h, i.e. headers directly inside the dependency folder, one level deep. How can I make clang look up all headers inside the dependency folder recursively? Are there any compiler arguments to add?
Thanks
The ISO/IEC 9899:2011 standard in section §6.10.2 explains the expected behavior of clang and other compilers:
# include <h-char-sequence> new-line
searches a sequence of implementation-defined places for a header identified uniquely by the specified sequence between the < and > delimiters, and causes the replacement of that directive by the entire contents of the header. How the places are specified or the header identified is implementation-defined.
You can extend the defined places by adding additional directories with the -I option, but a compiler should not search sub-directories.
You can work around this limitation in the spec by using make to compile a list of additional -I locations to add to your clang command. This is covered in @DanBonachea's answer.
Instead, I'd advise you to change the includes to be compliant to the specification:
#include "sub1/header1.h"
#include "sub2/header2.h"
The conventional solutions are one of the following:
1. Change the include directives in the source code
This solution compiles with clang++ -Idependency main.cpp but modifies #include directives to include headers by subdirectory, eg:
#include "sub1/header1.h"
#include "sub2/header2.h"
This is obviously a modification to the code, so usually only makes sense if sub1 and sub2 are meaningful within the larger structure of the software (e.g. package names that are always the same). Or...
2. Use shell tools to traverse the directory and build the include path
This solution uses find to inject subdirectories on the include path, eg:
$ clang++ `find ./dependency -type d -exec echo -I'{}' \;` main.cpp
which scans to identify the subdirectories and adds them to the preprocessor include path.
Discussion
Both of these approaches should work with few changes with basically any C/C++ compiler on UNIX (incl Linux, macOS, WSL, etc).
Note the second approach above will involve some additional filesystem churn on every compilation, which might be noticeable if the number of subdirectories is very large. To be fair this cost is fundamental to that use case, and even if built-in support for recursive include existed in the compiler frontend, it would still need to perform a similarly expensive recursive directory traversal on every compilation to find all the files.
3. Amortize directory traversal
However we can improve upon the second solution if we assume all the headers that will be included from this directory structure have unique names. This is a reasonable assumption, because otherwise the unqualified #include directives inside the source files will be ambiguous, leading to orthogonal problems. With this assumption in hand, we can create a cache to amortize the cost of the dependency directory traversal as follows:
$ mkdir allheaders ; cd allheaders
$ find ../dependency -type f -exec ln -s '{}' . \;
Then, back in the original directory (cd ..), compilation simply becomes:
$ clang++ -Iallheaders main.cpp
Or, if you additionally want to support a mix of option 1 and option 3 #include directives, then:
$ clang++ -Idependency -Iallheaders main.cpp
This approach could greatly accelerate compilation, because the preprocessor only needs to open one user directory and open the files by basename. The fact that the directory may contain a large number of headers (with some fraction potentially unused) should not significantly degrade performance, thanks to how filesystems work.
If we further assume the file names in the dependency directory change infrequently or never, then we only need to execute the directory traversal step once, and can amortize that cost against repeated compilation using the allheaders cache directory.

clang complete add path to includes

I have simple question today. I'm using this vim config - https://github.com/gergap/vim
The problem is with clang completion. It works but when I want to add more includes to get better completion then nothing is happening - it won't detect new headers.
Take #include <sys/types.h> for example. This is what I've added to the .clang_complete file in the directory where my main.c lives:
-I/usr/include/x86_64-linux-gnu/sys/
which I found by invoking
find /usr/include/ -name types.h
What can be wrong? Could you show me some working .clang_complete files with includes to unix headers? Maybe I'll find problem in that way.
This is the output from gcc with -v flag:
/usr/lib/gcc/x86_64-linux-gnu/4.8/include
/usr/local/include
/usr/lib/gcc/x86_64-linux-gnu/4.8/include-fixed
/usr/include/x86_64-linux-gnu
/usr/include
To investigate, run your gcc (or clang) with the -v option. This will display the search path used while compiling. On my system (FreeBSD) a simple compile without -I options prints
#include "..." search starts here:
#include <...> search starts here:
/usr/include/clang/3.4.1
/usr/include
End of search list.
That should give you an idea what directories to add to .clang_complete. Important: order matters!

Including header files in cygwin

As you know, the getch() and getche() functions don't work with Cygwin, which is a Linux-oriented environment.
But can I include the conio.h header file from Borland C and call getch() in code built by my makefiles?
Will it work, and can anyone tell me how to include header files from different directories in Cygwin?
I have a header file strcal.h in directory c:/makk/string/.
How do I include that header file in my makefile?
gcc -I/string small.c
It is not working and my current directory is makk.
In stdio.h, there is a getchar() function which is what you need. You can't just bring across the Borland header file since that just declares the function, it doesn't define it. Standard C has no need for getch().
To include header files in different areas, you use the -I directives of gcc to set up search paths.
So, if you have a /xyz/myheader.h file, you can do something like:
gcc -I /xyz myprogram.c
To get at c:/makk/string/strcal.h, you may have to use gcc -I /cygdrive/c/makk/string or, if you know you're actually in that makk directory, you can use -I string (note the lack of leading / since you want a relative path, not an absolute one).

g++ searches /lib/../lib/, then /lib/

According to g++ -print-search-dirs my C++ compiler is searching for libraries in many directories, including ...
/lib/../lib/:
/usr/lib/../lib/:
/lib/:
/usr/lib/
Naively, /lib/../lib/ would appear to be the same directory as /lib/ — lib's parent will have a child named lib, "that man's father's son is my father's son's son" and all that. The same holds for /usr/lib/../lib/ and /usr/lib/
Is there some reason, perhaps having to do with symbolic links, that g++ ought to be configured to search both /lib/../lib/ and /lib/?
If this is unnecessary redundancy, how would one go about fixing it?
If it matters, this was observed on an unmodified install of Ubuntu 9.04.
Edit: More information.
The results are from executing g++ -print-search-dirs with no other switches, from a bash shell.
Neither LIBRARY_PATH nor LPATH appears in the output of printenv, and both echo $LPATH and echo $LIBRARY_PATH return blank lines.
An attempt at an answer (which I gathered from a few minutes of looking at the gcc.c driver source and the Makefile environment).
These paths are constructed at run time from:
GCC exec prefix (see GCC documentation on GCC_EXEC_PREFIX)
The $LIBRARY_PATH environment variable
The $LPATH environment variable (which is treated like $LIBRARY_PATH)
Any values passed to -B command-line switch
Standard executable prefixes (as specified during compilation time)
Tooldir prefix
The last one (tooldir prefix) is usually defined to be a relative path:
From gcc's Makefile.in
# Directory in which the compiler finds libraries etc.
libsubdir = $(libdir)/gcc/$(target_noncanonical)/$(version)
# Directory in which the compiler finds executables
libexecsubdir = $(libexecdir)/gcc/$(target_noncanonical)/$(version)
# Used to produce a relative $(gcc_tooldir) in gcc.o
unlibsubdir = ../../..
....
# These go as compilation flags, so they define the tooldir base prefix
# as ../../../../, and the one of the library search prefixes as ../../../
# These get PREFIX appended, and then machine for which gcc is built
# i.e. i486-linux-gnu, to get something like:
# /usr/lib/gcc/i486-linux-gnu/4.2.3/../../../../i486-linux-gnu/lib/../lib/
DRIVER_DEFINES = \
-DSTANDARD_STARTFILE_PREFIX=\"$(unlibsubdir)/\" \
-DTOOLDIR_BASE_PREFIX=\"$(unlibsubdir)/../\" \
However, these are for compiler-version specific paths. Your examples are likely affected by the environment variables that I've listed above (LIBRARY_PATH, LPATH)
Well, theoretically, if /lib was a symlink to /drive2/foo, then /lib/../lib would point to /drive2/lib if I'm not mistaken. Theoretically...
Edit: I just tested and it's not the case - it comes back to /lib. Hrm :(

Why is ARG_MAX not defined via limits.h?

On Fedora Core 7, I'm writing some code that relies on ARG_MAX. However, even if I #include <limits.h>, the constant is still not defined. My investigations show that it's present in <sys/linux/limits.h>, but this is supposed to be portable across Win32/Mac/Linux, so directly including it isn't an option. What's going on here?
The reason it's not in limits.h is that it's not a quantity giving the limits of the value range of an integral type based on bit width on the current architecture. That's the role assigned to limits.h by the ISO standard.
The value in which you're interested is not hardware-bound in practice and can vary from platform to platform and perhaps system build to system build.
The correct thing to do is to call sysconf with the name _SC_ARG_MAX and use the value it returns; _POSIX_ARG_MAX, by contrast, is only the minimum that POSIX guarantees. I think that's the POSIX-compliant solution anyway.
According to my documentation, you include one or both of unistd.h or limits.h based on what values you're requesting.
One other point: many implementations of the exec family of functions fail with errno set to E2BIG if you try to call them with an oversized argument list or environment. This is one of the defined conditions under which exec can actually return.
For the edification of future people like myself who find themselves here after a web search for "arg_max posix", here is a demonstration of the POSIXly-correct method for discerning ARG_MAX on your system that Thomas Kammeyer refers to in his answer:
cc -x c <(echo '
#include <unistd.h>
#include <stdio.h>
int main() { printf("%li\n", sysconf(_SC_ARG_MAX)); }
')
This uses the process substitution feature of Bash; put the same lines in a file and run cc thefile.c if you are using some other shell.
Here's the output for macOS 10.14:
$ ./a.out
262144
Here's the output for a RHEL 7.x system configured for use in an HPC environment:
$ ./a.out
4611686018427387903
$ ./a.out | numfmt --to=iec-i # 'numfmt' from GNU coreutils
4.0Ei
For contrast, here is the method prescribed by https://porkmail.org/era/unix/arg-max.html, which uses the C preprocessor:
cpp <<HERE | tail -1
#include <limits.h>
ARG_MAX
HERE
This does not work on Linux for reasons still not entirely clear to me—I am not a systems programmer and not conversant in the POSIX or ISO specs—but probably explained above.
ARG_MAX is defined in /usr/include/linux/limits.h. My linux kernel version is 3.2.0-38.

Resources