Unable to understand why one should not use /usr/src/linux for kernel development - linux

I was reading "Linux Kernel Development" by Robert Love and came across a passage that I am unable to fully understand:
The kernel source is installed in /usr/src/linux. You should not use this source tree for development because the kernel version against which your C library is compiled is often linked to this tree.
It looks like I am failing to connect this to some very basic concept.

The /usr/src/linux area has a (usually incomplete) set of kernel headers that are used by the library header files. They should match the library and must not get messed up. The headers in /usr/include/linux are "private", but they should be the headers that were used when your libraries (notably glibc) were compiled. Hacking around with them through a link into /usr/src is a mistake, as Linus has tried to explain many times, sometimes quite forcibly. Headers used in a kernel compilation are NOT searched for in subdirectories of /usr/src/linux; they are specific to a kernel version and can differ drastically between versions, or at least you have no guarantee that they do not.
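To see the distinction in practice, a small user-space program can compare the kernel version recorded in the installed headers with the kernel that is actually running. The following is a minimal sketch, assuming a Linux system with the user-space kernel headers installed under /usr/include/linux; the function names are made up for illustration.

```cpp
#include <linux/version.h>  // LINUX_VERSION_CODE, from the headers under /usr/include/linux
#include <sys/utsname.h>    // uname(2), reports the running kernel
#include <string>

// LINUX_VERSION_CODE packs the version as (major << 16) + (minor << 8) + patch.
std::string decode_kernel_version(unsigned code) {
    return std::to_string((code >> 16) & 0xffu) + "." +
           std::to_string((code >> 8) & 0xffu) + "." +
           std::to_string(code & 0xffu);
}

// Version of the kernel headers this translation unit was compiled against.
std::string header_kernel_version() {
    return decode_kernel_version(LINUX_VERSION_CODE);
}

// Release string of the kernel the program is running on, or "" on failure.
std::string running_kernel_release() {
    struct utsname info {};
    if (uname(&info) != 0) {
        return "";
    }
    return info.release;
}
```

Printing both values from main will often show two different versions on a distribution system: the first comes from the headers under /usr/include, the second from whichever kernel happens to be booted. Neither depends on a source tree unpacked elsewhere for development.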

Related

Where should the contents of /sys/kernel/kheaders.tar.xz be installed?

Recent Linux kernels can be built with a copy of the kernel headers included -- these are available in a pseudo file at /sys/kernel/kheaders.tar.xz. This unpacks to a tree with arch and include. include has about 30 subdirectories, including linux, and arch leads to a tree of what looks like related architecture files.
Is there a single place that the contents can be unpacked so that the C compiler can access the headers at the expected places? (Including intra-header inclusion, e.g., <linux/gpio/consumer.h> includes <linux/bug.h>, which includes <asm/bug.h>.)
I tried just unpacking everything into /usr/include, which sort-of kind-of works, but the arch stuff is not at all in the right place. Plus, unpacking over an existing /usr/include feels sloppy -- stuff might get clobbered, and you'll have a mess the next time you upgrade the kernel.
Is there a best practice that kernel developers use for this?

C++ .a: what affects portability across distros?

I'm building a .a from C++ code. It only depends on the standard library (libc++/libstdc++). From general reading, it seems that portability of binaries depends on:
compiler version (because it can affect the ABI). For gcc, the ABI is linked to the major version number.
libc++/libstdc++ versions (because they could pass a vector<T> into the .a and its representation could change).
I.e. someone using the .a needs to use the same (major version of) the compiler + same standard library.
As far as I can see, if compiler and standard library match, a .a should work across multiple distros. Is this right? Or is there gubbins relating to system calls, etc., meaning a .a for Ubuntu should be built on Ubuntu, .a for CentOS should be built on CentOS, and so on?
Edit: see If clang++ and g++ are ABI incompatible, what is used for shared libraries in binary? (though it doesn't answer this question.)
Edit 2: I am not accessing any OS features explicitly (e.g. via system calls). My only interaction with the system is to open files and read from them.
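The two factors listed in the question, compiler version and standard library version, can at least be recorded at build time so that a consumer of the .a can check them. The following is a minimal sketch, assuming GCC or Clang with libstdc++ or libc++; the function name build_abi_fingerprint is hypothetical, and the macros used are ones those toolchains predefine.

```cpp
#include <sstream>
#include <string>  // including any standard header defines the library macros below

// Describe the compiler and C++ standard library this file was built with.
std::string build_abi_fingerprint() {
    std::ostringstream out;
#if defined(__clang__)
    out << "clang " << __clang_major__ << "." << __clang_minor__;
#elif defined(__GNUC__)
    out << "gcc " << __GNUC__ << "." << __GNUC_MINOR__;
#else
    out << "unknown-compiler";
#endif
#if defined(_LIBCPP_VERSION)
    out << " libc++ " << _LIBCPP_VERSION;
#elif defined(__GLIBCXX__)
    out << " libstdc++ " << __GLIBCXX__;  // release date stamp of libstdc++
#if defined(_GLIBCXX_USE_CXX11_ABI)
    // Selects between the old and new std::string/std::list layouts.
    out << " cxx11abi=" << _GLIBCXX_USE_CXX11_ABI;
#endif
#endif
    return out.str();
}
```

Compiling the same function into both the library and the application and comparing the two strings is one cheap way to notice a mismatch. It says nothing about the other, implicit dependencies a library may have.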
It only depends on the standard library
It could also depend implicitly upon other things (think of resources like fonts, configuration files under /etc/, header files under /usr/include/, availability of /proc/, of /sys/, external programs run by system(3) or execvp(3), specific file systems or devices, particular ioctl-s, available or required plugins, etc...)
These are the kind of details that might make porting difficult. For example, look into nsswitch.conf(5).
The devil is in the details.
(In other words, without a lot more detail, your question does not make much sense.)
Linux is perceived as a free software ecosystem. The usual way of porting something is to recompile it on -or at least for- the target Linux distribution. When you do that several times (for different and many Linux distros), you'll understand what details are significant in your particular software (and distributions).
Most of the time, recompiling and porting a library on a different distribution is really easy. Sometimes, it might be hard.
For shared libraries, reading Program Library HowTo, C++ dlopen miniHowTo, elf(5), your ABI specification (see here for some incomplete list), Drepper's How To Write Shared Libraries could be useful.
My recommendation is to prepare binary packages for various common Linux distributions. For example, a .deb for Debian & Ubuntu (some particular versions of them).
Of course a .deb for Debian might not work on Ubuntu (sometimes it does).
Look also into things like autoconf (or cmake). You may want at least to have some externally provided #define-d preprocessor strings (often passed by -D to gcc or g++) which would vary from one distribution to the next (e.g. on some distributions, you print by popen-ing lp, on others, by popen-ing lpr, on others by interacting with some CUPS server etc...). Details matter.
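The idea of an externally provided preprocessor string can be sketched as follows. This is only an illustration under stated assumptions: the macro name PRINT_COMMAND, its fallback value, and both function names are hypothetical, and the build system (autoconf, cmake, or a plain Makefile) would be responsible for passing the right -D value for each distribution.

```cpp
#include <stdio.h>  // popen/pclose are POSIX and declared here on Linux
#include <string>

// Hypothetical build-time knob: pass -DPRINT_COMMAND='"lp"' on distributions
// that print through lp. The fallback below assumes lpr.
#ifndef PRINT_COMMAND
#define PRINT_COMMAND "lpr"
#endif

// Name of the print program selected at build time.
std::string print_command() {
    return PRINT_COMMAND;
}

// Pipe text to the configured print program; returns true on success.
// A real program would also handle SIGPIPE in case the command exits early.
bool print_text(const std::string& text) {
    FILE* pipe = popen(print_command().c_str(), "w");
    if (pipe == nullptr) {
        return false;
    }
    bool wrote_all = fwrite(text.data(), 1, text.size(), pipe) == text.size();
    int status = pclose(pipe);
    return wrote_all && status == 0;
}
```

The source stays identical across distributions; only the compiler command line changes, for example g++ -DPRINT_COMMAND='"lp"' on one system and no flag at all on another.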
My only interaction with the system is to open files
But even these vary a lot from one distribution to another one.
It is probable that you won't be able to provide a single -and the same one- lib*.a for several distributions.
NB: you probably need to budget more work than what you believe.

GCC/G++: building without GNU unique object symbols for older Linux kernels

I am currently working on updating the build system for a large pile of code, which happens to include a Linux C++ project. It would be nice if all of the developers here could run a build when hacking around with their own ideas, so I was examining if it would be possible to build this on vaguely modern Linux systems despite the target system being 2.6.18.
By 'vaguely modern' I am estimating something like GCC 4.5+, something that a distribution in the past year or two might come with. Currently I solve the libstdc++ issue by compiling that in statically, and any glibc issues are neatly worked around by remapping to old versions of the memcpy symbols (and so on) with a quick bit of wrapper code. So far so good.
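For readers unfamiliar with the symbol remapping mentioned here, one commonly used form of it is a .symver directive that binds a call to an older versioned symbol. The following is a sketch of that general technique, not the poster's actual wrapper code. It assumes x86-64, glibc, and dynamic linking, where the older memcpy is exported as memcpy@GLIBC_2.2.5; other architectures use different version names, and compat_memcpy is a hypothetical name.

```cpp
#include <cstddef>
#include <cstring>

// Assumption: x86-64 glibc, dynamically linked. On other targets the guard
// below leaves the default symbol binding untouched.
#if defined(__x86_64__) && !defined(__ILP32__) && defined(__GLIBC__)
// Bind every reference to memcpy in this file to the old symbol version,
// so the resulting binary does not require memcpy@GLIBC_2.14.
__asm__(".symver memcpy, memcpy@GLIBC_2.2.5");
#endif

// Hypothetical wrapper: code that must run against an old glibc calls this
// instead of calling memcpy directly.
void* compat_memcpy(void* dest, const void* src, std::size_t count) {
    return std::memcpy(dest, src, count);
}
```

Checking the result with objdump -T or readelf on the final executable is the usual way to confirm which symbol versions it actually requires.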
The one problem I can't seem to completely figure out is that certain symbols built into the executable from the .o files are of type 'u', which is a GNU unique object, an extension to the ELF standard that 2.6.18 doesn't seem to recognise at all. This means the executable won't run because it can't find the symbols, though they are in fact present (just of type '?' on the target, from 'nm').
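For context, these are the two kinds of C++ definitions that GCC marks as GNU unique objects when that feature is enabled. The names below are hypothetical; the sketch only shows code that would be expected to produce 'u' entries in nm output on such a toolchain.

```cpp
// Both definitions below must resolve to a single object across the whole
// program, even when several shared objects each contain a copy.

// 1. A static data member of a class template.
template <typename T>
struct InstanceCounter {
    static int count;
};
template <typename T>
int InstanceCounter<T>::count = 0;

// 2. A static local variable in an inline function.
inline int& call_counter() {
    static int calls = 0;
    return calls;
}

// Touch both so the compiler actually emits them in this object file.
int bump_counters() {
    ++InstanceCounter<int>::count;
    return ++call_counter();
}
```

A GCC configured with --disable-gnu-unique-object would be expected to emit both variables as ordinary weak symbols instead. Current GCC manuals also document a -fno-gnu-unique option; whether the compiler in use accepts it is worth checking before relying on it.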
One can disable the use of GNU unique objects when compiling G++ but it's not exactly the most convenient solution. I can't see any way to just disable it when compiling code (distro gcc/g++ invariably has this option on), and I imagine the only way to get the target system to recognise it would be to update ld-linux and the kernel. That's almost certainly not going to happen.
Is there an option I haven't found to disable these symbol types? Or perhaps is there some neat way around this, or something that I'm missing? I am beginning to suspect it will just have to be compiled on G++ 4.1.x, which will mean an old Linux installation or building that from source.
I was trying to deal with the same problem (which led me to finding this question) and after a bunch of research came to the definitive conclusion that no, you are not missing anything, there is no way around this besides compiling your own g++. See this recent question on the gcc-help mailing list:
http://gcc.gnu.org/ml/gcc-help/2013-01/msg00008.html
I compared gcc sources and found that you can go as high as stock 4.4, as unique symbols were added in 4.5. However on RHEL/CentOS 6 they default to 4.4 but patched unique symbol support into it, so as usual one must beware of distribution-specific gcc versions. For me this is a huge bummer as it means that things compiled on RHEL 6 can't be run on RHEL 5, even with a copy of libstdc++ made just for gcc 4.4 + RHEL 5.
Here's the message where unique symbol support was first proposed, by the way:
https://gcc.gnu.org/ml/gcc-patches/2009-07/msg01240.html
If you search around you'll find that people have complained about it on other lists for various reasons, but I guess it's here to stay.

What is the proper way of including linux kernel config?

I'm porting an old version of a piece of software that is partly a Linux kernel module to EL5. After doing the relevant hacks to the horrible GNU autotools mess that is used to compile the thing (no, it does not compile the kernel module via kbuild :( ), I keep getting lots of warnings: 'Including config.h is deprecated'. Google search results tell me that I should be using -I flags instead, but I cannot find which flags to use or where to put them.
The software is proprietary, so I cannot link to it as it is not publicly available.
The version I am porting had support up to and including 2.6.16 (and I need 2.6.18-164 el5). The kernel space code is in the ballpark of 100k lines in dozens of files (and the compilation spans over a few Makefiles)
What is the proper way of fixing this?
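I found it out eventually: I had to add "-include $LINUX_KERNEL_INCLUDE/linux/autoconf.h" to CPPFLAGS. The -include option makes the preprocessor process that header before the first line of each source file, as if every file began with an #include of it.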

How to create a shared object that is statically linked with pthreads and libstdc++ on Linux/gcc?

How to create a shared object that is statically linked with pthreads and libstdc++ on Linux/gcc?
Before I answer your question as it was described, I will note that it is not exactly clear what you are trying to achieve in the end, and there is probably a better solution to your problem.
That said - there are two main problems with trying to do what you described:
One is, that you will need to decompose libpthread and libstdc++ to the object files they are made with. This is because ELF binaries (used on Linux) have two levels of "run time" library loading - even when an executable is statically linked, the loader has to load the statically linked libraries within the binary on execution, and map the right memory addresses. This is done before the shared linkage of libraries that are dynamically loaded (shared objects) and mapped to shared memory. Thus, a shared object cannot be statically linked with such libraries, as at the time the object is loaded, all static linked libraries were loaded already. This is one difference between linking with a static library and a plain object file - a static library is not merely glued like any object file into the executable, but still contains separate tables which are referred to on loading. (I believe that this is in contrast to the much simpler static libraries in MS-DOS and classic Windows, .LIB files, but there may be more to those than I remember).
Of course you do not actually have to decompose libpthread and libstdc++, you can just use the object files generated when building them. Collecting them may be a bit difficult though (look for the objects referred to by the final Makefile rule of those libraries). And you would have to use ld directly and not gcc/g++ to link, to avoid linking with the dynamic versions as well.
The second problem is consequential. If you do the above, you will surely have the shared object / dynamic library you asked to build. However, it will not be very useful: once you try to link a regular executable that uses libpthread/libstdc++ (the latter being true of any C++ program) with this shared object, it will fail with symbol conflicts. The symbols of the static libpthread/libstdc++ objects you linked your shared object against will clash with the symbols from the standard libpthread/libstdc++ used by that executable, no matter whether it is dynamically or statically linked with the standard libraries.
You could of course then try to hide all symbols in the static objects from libstdc++/libpthread used by your shared library, make them private in some way, or rename them automatically on linkage so that there will be no conflict. However, even if you get that to work, you will find some undesirable results at run time, since both libstdc++ and libpthread keep quite a bit of state in global variables and structures, which you would now have in duplicate, each copy unaware of the other. This will lead to inconsistencies between this global data and the underlying operating system state, such as file descriptors and memory bounds (and perhaps some values from the standard C library, such as errno for libstdc++, and signal handlers and timers for libpthread).
To avoid over-broad interpretation, I will add a remark: at times there can be sensible grounds for wanting to statically link against even such basic libraries as libstdc++ and even libc, and even though it is becoming a bit more difficult with recent systems and versions of those libraries (due to a bit of coupling with the loader and special linker tricks used), it is definitely possible - I did it a few times, and know of other cases in which it is still done. However, in that case you need to link a whole executable statically. Static linkage with standard libraries combined with dynamic linkage with other objects is not normally feasible.
Edit: One issue which I forgot to mention but is important to take into account is C++ specific. C++ was unfortunately not designed to work well with the classic model of object linkage and loading (used on Unix and other systems). This makes shared libraries in C++ less portable than they should be, because a lot of things such as type information and templates are not cleanly separated between objects (often being taken, together with a lot of actual library code, from the headers at compile time). For that reason libstdc++ is tightly coupled with GCC, and code compiled with one version of g++ will in general only work with the libstdc++ from this (or a very similar) version of g++. As you will surely notice if you ever try to build a program with GCC 4 against any non-trivial library on your system that was built with GCC 3, this is not limited to libstdc++. If your reason for wanting to do this is to ensure that your shared object is always linked with the specific versions of libstdc++ and libpthread that it was built against, it would not help: a program that uses a different/incompatible libstdc++ would also be built with an incompatible C++ compiler or version of g++, and would thus fail to link with your shared object anyway, aside from the actual libstdc++ conflicts.
If you wonder "why wasn't this done simpler?", here is a general rumination worth pondering. For C++ to work nicely with dynamic/shared libraries (meaning compatibility across compilers, and the ability to replace a dynamic library with another version with a compatible interface without rebuilding everything that uses it), compiler standardization alone is not enough. At the level of the operating system's loader, the structure and interface of object and library files and the work of the linker would need to be significantly extended beyond the relatively simple Unix classics used today for natively built code on common operating systems (Microsoft Windows, Mach-based systems and NeXTStep relatives such as Mac OS, VMS relatives and some mainframe systems included). The linker and dynamic loader would need to be aware of such things as templates and typing, to some extent having the functionality of a small compiler in order to adapt the library's code to the type given to it. And (personal subjective observation here) it seems that higher-level intermediate code (together with higher-level languages and just-in-time compilation) is gaining ground faster and is likely to be standardized sooner than such extensions to the native object formats and linkers.
You mentioned in a separate comment that you are trying to port a C++ library to an embedded device. (I am adding a new answer here instead of editing my original answer here because I think other StackOverflow users interested in this original question may still be interested in that answer in its context)
Obviously, depending on how stripped down your embedded system is (I have not much embedded Linux experience, so I am not sure what is most likely), you may of course be able to just install the shared libstdc++ on it and dynamically link everything as you would do otherwise.
If dynamically linking with libstdc++ would not be good for you or would not work on your system (there are so many different levels of embedded systems that one cannot know), and you need to link against a static libstdc++, then as I said, your only real option is statically linking the executable that uses the library with it and libstdc++. You mentioned porting a library to the embedded device, but if this is for the purpose of using it in some code you write or build on the device and you do not mind a static libstdc++, then linking everything statically (aside from perhaps libc) is probably OK.
If the size of libstdc++ is a problem, and you find that your library actually uses only a small part of its interfaces, then I would nonetheless suggest first trying to determine the actual space you would save by linking against only the parts you need. It may or may not be significant. I never looked that deep into libstdc++, and I suspect that it has a lot of internal dependencies, so while you surely do not need some of the interfaces, you may or may not still depend on a big part of its internals. I do not know and did not try, but it may surprise you. You can get an idea by linking a binary that uses the library against a static build of it and libstdc++ (not forgetting to strip the binary, of course), and comparing the size of the resulting executable with the total size of a (stripped) executable dynamically linked together with the full (stripped) shared objects of the library and libstdc++.
If you find that the size difference is significant, but do not want to statically link everything, you can try to reduce the size of libstdc++ by rebuilding it without some parts you know that you do not need (there are configure-time options for some parts of it, and you can also try to remove some independent objects at the final creation of libstdc++.so). There are some tools to optimize the size of libraries; search the web (I recall one from a company named MontaVista but do not see it on their web site now; there are some others too).
Other than the straightforward above, some ideas and suggestions to think of:
You mentioned that you use uClibc, which I never fiddled with myself (my experience with embedded programming is a lot more primitive, mostly involving assembly programming for the embedded processor and cross-compiling with minimal embedded libraries). I assume you checked this, and I know that uClibc is intended to be a lightweight but rather complete standard C library. Still, do not forget that C++ code is hardly independent of the C library, and g++ and libstdc++ depend on some quite delicate things (I remember problems with libc on some proprietary Unix versions), so I would not just assume that g++ or the GNU libstdc++ actually works with uClibc without trying. I don't recall seeing it mentioned in the uClibc pages.
Also, if this is an embedded system, think of its performance, compute power, overall complexity, and timing/simplicity/solidity requirements. Take into consideration the complexity involved, and think whether using C++ and threads is appropriate in your embedded system, and if nothing else in the system uses those, whether it is worth introducing for that library. It may be, not knowing the library or system I cannot tell (again, embedded systems being such a wide range nowadays).
And in this case also, just a quick link I stumbled upon looking for uClibc -- if you are working on an embedded system, using uClibc, and want to use C++ code on it -- take a look at uClibc++. I do not know how much of the standard C++ stuff you need and it already supports, and it seems to be an ongoing project, so not clear if it is in a state good enough for you already, but assuming that your work is also under development still, it might be a good alternative to GCC's libstdc++ for your embedded work.
I think this guy explains quite well why that wouldn't make sense. C++ code that uses your shared object but a different libstdc++ would link alright, but wouldn't work.

Resources