How to get an ETA? - scons

I am building several large sets of source files (targets) using scons. Now, I would like to know if there is a metric I can use to show me:
How many targets remain to be built.
How long it will take -- to be honest, this is probably a no-go, as it is really hard to tell!
How can I do that in scons?

There is currently no progress indicator built into SCons, and it's also not trivial to provide one. The problem is that SCons doesn't build the complete DAG first and then start the build, which would give you a total number of targets to visit that you could use as a reference (=100%).
Instead, it builds the DAG as it goes... It looks at each target, and then expands the list of its children (sources and implicit dependencies like headers) to check whether they are up to date. If a child has changed, it gets rebuilt by applying the same "build step" recursively.
In this way, SCons crawls down the DAG from the list of targets given on the command line (with the "." dir being the default), and only the parts that are actually required for (or, in other words, depended on by) the requested targets ever get visited.
This makes it possible for SCons to handle things like "header files, generated by a program that must be compiled first" in a single pass...but it also means that the total number of targets/children to be visited changes constantly.
So, a standard progress indicator would continuously climb towards 80%-90%, only to fall back to 50%...and I don't think this would give you the information you're really after.
Tip: If your builds are large and you don't want to wait, do incremental builds and only build the library/program you're currently working on ("scons lib1"). This will still take all dependencies into account, but only a fraction of the DAG has to get expanded. So you use less memory and get faster update times...especially if you use the "interactive" mode. In a project with 100000 C files total, the update of a single library with 500 C files takes about 1s on my machine. For more info on this topic check out http://scons.org/wiki/WhySconsIsNotSlow .
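As a rough sketch of that workflow (assuming your SConstruct defines a target or alias named lib1 -- the name is just an example), the incremental and interactive variants look like this:

scons lib1                 # incremental build of just that library

scons --interactive        # keeps dependency info in memory between builds
scons>>> build lib1
scons>>> build lib1        # re-run after editing sources; only changes rebuild
scons>>> exit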

Related

Meaning of the warning "File is touched by more than one package"

I am creating a simple Linux kernel with Buildroot and I am adding a small driver I've written myself. I created the Config.in file and drivername.mk so that I can successfully select the driver in make menuconfig.
When executing make to build the image, the compilation goes fine until my driver starts to compile. It appears to compile and create the image correctly, but I get lots of warnings saying that different files in ./lib/gcc/arm-buildroot-linux-uclibcgnueabihf/ are touched by more than one package: [u'host-gcc-initial', u'host-gcc-final'].
Can anyone explain a bit about this issue and what is causing it? Do you need any more info to know what is happening? Are the warnings safe to ignore?
Thanks in advance
Actually, doing a search on 'touched by more than one package', I found http://lists.busybox.net/pipermail/buildroot/2017-October/205602.html, where we find that this warning can safely be ignored if you're not doing a parallel build and aren't a kernel maintainer.
That said, if you're submitting code for inclusion in the Linux kernel, please be a good citizen and make sure you identify all of the things your code is dependent upon. (I'm not actually an active kernel hacker, so I don't know what method they're using for this right now.)
The basic idea is that compiling things involves a number of steps that need to happen in a logical order. In a small project, we simply add the dependencies we know to put in, because we also wrote the code that created them. But with a project the size of the kernel, you can guarantee that not everyone does this. Some people only specify a dependency when it's needed for things to build properly - and if the default order happens to work, it can take years before someone figures out that a dependency was missing, causing grief when they try to update just the one thing that was the missing dependency and the other code doesn't get updated as a result.
When you're building in parallel, on the other hand, it becomes a lot more complicated. Now you really do need every dependency specified, because there is no longer any inherent, dependable order. Some people will still build serially, others will use two jobs, I'll use 8, and I've worked in groups that would happily use 30 because they're on a 32-processor machine and don't really need all of those cores during off hours. Suddenly the file you needed, from a directory that normally got processed 30 directories before yours, is being built at the same time as your file that needs it - because you didn't list the dependency, and everything in those 30 directories that hasn't already been processed is waiting on dependencies that haven't finished their processing either.
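As a deliberately broken, hypothetical Makefile that illustrates the point: the program uses a generated header but never declares it as a prerequisite, so a serial build happens to work while a parallel one can race:

# Serial make builds generated.h before prog purely because of listing order.
all: generated.h prog

generated.h: gen.sh
	./gen.sh > generated.h

# Missing prerequisite: prog should also depend on generated.h.
prog: prog.c
	$(CC) -o prog prog.c

# With "make -j8" both targets start at once, and prog.c may be compiled
# before generated.h exists. The fix is: prog: prog.c generated.h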

Very slow RedHawk component builds

We have some components that build 15+ object files before linking them. We find that if we modify a .h file used by many or all of them, builds are VERY slow. Some of our components take over an hour to build. It appears that RedHawk issues a make -j, or a make -j with a large number, so that we have 15+ compiles running simultaneously; this overwhelms even 4 GB of RAM and results in excessive swapping and VERY slow execution (the entire CPU is nearly locked up, and other windows are also dead until it completes). If we use a simple make from a shell in the component, it completes in 5 min. Is there a way to change RH to issue a simple make, or a make with an adjustable maximum number of processes?
If you're referring to how the IDE invokes the build, you can check the build console. I'm pretty sure it either calls the top-level build.sh or the build.sh within your implementation's folder. In either case you can modify that file to perform the build however you'd like.
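For example (the actual contents of build.sh vary between projects and IDE versions, so treat this as a sketch of the idea rather than the real script): wherever the script ends up invoking make, you can cap the job count there:

# hypothetical edit inside build.sh -- adapt the line that actually runs make
make -j4
# or make the limit tunable without editing the script again:
make -j"${BUILD_JOBS:-2}"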

Changing the configuration of an already-built kernel and recompiling only what's been changed

The scenario outlined is this:
Someone has built the Linux kernel from source code.
That person wants to change the build configuration.
They still have all of the object files and temporary files that were produced by the previous build operation.
Given all of that, what needs to be done to rebuild as few things as possible in order to save time?
I understand that these will trigger or necessitate a complete recompilation of the source code:
Running make clean.
Running make menuconfig.
make clean is obviously a course of action to avoid, given the desired goal, because it deletes all object files: both those that would need to be rebuilt anyway and those that could otherwise be left alone. I don't know why make menuconfig would cause the build system to recompile everything, but I've read on here that that is what it would do.
The problem I see with not having the second avenue open to me is that if I change the configuration manually with a text editor, the options that I change might require changes in other options that depend on them (e.g., IMA_TRUSTED_KEYRING depends on SYSTEM_TRUSTED_KEYRING) and I'd be working without an interface that would automatically make those required secondary changes.
It occurred to me that invoking scripts/kconfig/mconf, the program built and launched by make menuconfig, could possibly be a solution to the problems described in the previous paragraph since it was not stated that mconf is what makes the build system recompile everything. But, it possibly could be that very program, so I do not wish to try it until I know it won't do that.
Sooooo, how does one achieve the stated objective given the stated scenario?
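For what it's worth, here is a sketch of the manual-edit route described above, paired with one of the kernel's non-interactive config targets; this only illustrates the workflow in question, it is not a claim about what does or doesn't trigger a full rebuild:

vi .config          # flip the options by hand
make oldconfig      # let Kconfig prompt for / resolve dependent symbols
make -j$(nproc)     # rebuild whatever kbuild decides is out of date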

Bitbake build consumes more space

I recently started using BitBake to build Yocto. Every time I build, it consumes more space, and currently I'm running out of disk space. The images are not getting overwritten; a new set of timestamped files is created for every build. I have deleted old files from build/tmp/deploy/images/, but it doesn't make much difference to the free disk space. Are there any other locations from which I can delete stuff?
The error I observe during build is:
WARNING: The free space of source/build/tmp (/dev/sda4) is running low (0.999GB left)
ERROR: No new tasks can be executed since the disk space monitor action is "STOPTASKS"!
WARNING: The free space of source/build/sstate-cache (/dev/sda4) is running low (0.999GB left)
ERROR: No new tasks can be executed since the disk space monitor action is "STOPTASKS"!
WARNING: The free space of source/build/downloads (/dev/sda4) is running low (0.999GB left)
ERROR: No new tasks can be executed since the disk space monitor action is "STOPTASKS"!
Kindly suggest some pointers to avoid this issue.
In order of effectiveness and how easy the fix is:
Buy more disk space: Putting $TMPDIR on an SSD of its own helps a lot and removes the need to micromanage.
Delete $TMPDIR (build/tmp): old images, old packages and work directories/sysroots for MACHINEs you aren't currently building for accumulate and can take quite a lot of space. You can normally just delete the whole $TMPDIR once in a while: as long as you're using the sstate-cache, the next build should still be pretty fast.
Delete $SSTATE_DIR (build/sstate-cache): If you do a lot of builds sstate itself accumulates over time. Deleting the directory is safe but the next build will take a long time as everything will be rebuilt.
Delete $DL_DIR (build/downloads): If you use a build directory for a long time (while pulling updates from master or changing to newer branch) the obsolete downloads keep taking disk space. Keep in mind that deleting the directory will mean re-downloading everything. Looking at just the largest files and deleting the old versions may be a useful compromise here.
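Before deleting anything, it can help to confirm which of these directories is actually eating the space (plain shell, assuming the default build layout used above):

du -sh build/tmp build/sstate-cache build/downloads build/tmp/deploy/images/*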
There are some official ways to reclaim space instead of deleting things by hand.
By deleting manually you could be forcing unnecessary rebuilds and downloads. Some elements of the build may not be controlled by bitbake, and you can find yourself in a situation where you cannot rebuild those items in an easy way.
With these recommendations, you can beat the unwritten 50 GB-per-build Yocto rule:
Check your IMAGE_FSTYPES variable. In my experience it is safe to delete all images of those types that are not symlinks or symlink targets. Keep the last one generated so you don't break the last-build link, and keep anything related to bootloaders and configuration files, as they may be regenerated only rarely.
If you are keeping more than one build with the same set of layers, then you can use a common download folder for builds.
DL_DIR ?= "common_dir_across_all_builds/downloads/"
And afterwards:
To keep your /deploy clean:
RM_OLD_IMAGE: Reclaims disk space by removing previously built versions of the same image from the images directory pointed to by the DEPLOY_DIR variable. Set this variable to "1" in your local.conf file to remove these images:
RM_OLD_IMAGE = "1"
IMAGE_FSTYPES: Remove the image types that you do not plan to use; you can always enable a particular one when you need it:
IMAGE_FSTYPES_remove = "tar.bz2"
IMAGE_FSTYPES_remove = "rpi-sdimg"
IMAGE_FSTYPES_remove = "ext3"
For tmp/work, you do not need the work files of all recipes. You can specify the ones you are interested in for your development:
RM_WORK_EXCLUDE:
With rm_work enabled, this variable specifies a list of recipes whose work directories should not be removed. See the "rm_work.bbclass" section for more details.
INHERIT += "rm_work"
RM_WORK_EXCLUDE += "home-assistant widde"
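Put together, a local.conf following these recommendations might look like the sketch below (the download path, image types and excluded recipes are just examples; adjust them to your own build):

DL_DIR ?= "common_dir_across_all_builds/downloads/"    # shared across builds
RM_OLD_IMAGE = "1"                                      # drop older images from DEPLOY_DIR
IMAGE_FSTYPES_remove = "tar.bz2 ext3"                   # keep only the image types you need
INHERIT += "rm_work"                                    # delete tmp/work as recipes finish
RM_WORK_EXCLUDE += "home-assistant"                     # except recipes you are hacking on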

How to speed up compilation time in linux

While compiling under Linux I use the flag -j16, as I have 16 cores. I am just wondering if it makes any sense to use something like -j32. Actually, this is a question about scheduling of processor time and whether it is possible to put more pressure on one particular process than on the others this way (let's say I have two parallel compilations, each with -j16 -- what if one of them were -j32?).
I think it does not make much sense, but I am not sure, as I do not know how the kernel handles such things.
Kind regards,
I use a non-recursive build system based on GNU make and I was wondering how well it scales.
I ran benchmarks on a 6-core Intel CPU with hyper-threading. I measured compile times using -j1 to -j20. For each -j option, make ran three times and the shortest time was recorded. Using -j9 gave the shortest compile time, 11% better than -j6.
In other words, hyper-threading does help a little, and an optimal formula for Intel processors with hyper-threading is number_of_cores * 1.5.
The rule of thumb is to use the number of processors + 1. Hyper-Threading counts, so a quad-core CPU with HT should use -j9.
Setting the value too high is counter-productive. If you do want to speed up compile times, consider ccache to cache compiled objects that do not change between compilations, and distcc to distribute the compilation across several machines.
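A minimal way to try both (assuming ccache and distcc are installed and DISTCC_HOSTS is configured) is to wrap the compiler on the make command line:

make -j9 CC="ccache gcc" CXX="ccache g++"
# let ccache hand cache misses to distcc:
CCACHE_PREFIX=distcc make -j20 CC="ccache gcc" CXX="ccache g++"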
We have a machine in our shop with the following characteristics:
256-core SPARC Solaris
~64 GB RAM
Some of that memory is used for a RAM drive for /tmp
Back when it was originally setup, before other users discovered its existence, I ran some timing tests to see how far I could push it. The build in question is non-recursive, so all jobs are kicked off from a single make process. I also cloned my repo into /tmp to take advantage of the ram drive.
I saw improvements up to -j56. Beyond that my results flatlined, much like Maxim's graph, until somewhere above (roughly) -j75, where performance began to degrade. Running multiple parallel builds, I could push it beyond the apparent cap of -j56.
The primary make process is single-threaded; after running some tests I realized the ceiling I was hitting had to do with how many child processes the primary thread could service -- which was further hampered by anything in the makefiles that either required extra time to parse (e.g., using = where := would avoid unnecessary delayed evaluation, complex user-defined macros, etc.) or used things like $(shell).
These are the things I've been able to do to speed up builds that have a noticeable impact:
Use := wherever possible
If you assign to a variable once with :=, then later with +=, it'll continue to use immediate evaluation. However, ?= and +=, when a variable hasn't been assigned previously, will always delay evaluation.
Delayed evaluation doesn't seem like a big deal until you have a large enough build. If a variable (like CFLAGS) doesn't change after all the makefiles have been parsed, then you probably don't want to use delayed evaluation on it (and if you do, you probably already know enough about what I'm talking about anyway to ignore my advice).
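A tiny, contrived example of the difference (the variable names are made up for illustration):

# Deferred: the $(shell) runs again every time the variable is expanded
CFLAGS_DEFERRED  = -O2 -DBUILD_TIME="$(shell date +%s)"

# Immediate: the shell runs exactly once, while the makefile is parsed
CFLAGS_IMMEDIATE := -O2 -DBUILD_TIME="$(shell date +%s)"

show:
	@echo $(CFLAGS_DEFERRED)     # re-expands (and re-runs the shell) here
	@echo $(CFLAGS_IMMEDIATE)    # already plain text by the time we get here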
If you create macros you execute with the $(call) facility, try to do as much of the evaluation ahead of time as possible
I once got it in my head to create macros of the form:
IFLINUX = $(strip $(if $(filter Linux,$(shell uname)),$(1),$(2)))
IFCLANG = $(strip $(if $(filter-out undefined,$(origin CLANG_BUILD)),$(1),$(2)))
...
# an example of how I might have made the worst use of it
CXXFLAGS = ${whatever flags} $(call IFCLANG,-fsanitize=undefined)
This build produces over 10,000 object files, about 8,000 of which are from C++ code. Had I used CXXFLAGS := (...), it would only need to immediately replace ${CXXFLAGS} in all of the compile steps with the already evaluated text. Instead it must re-evaluate the text of that variable once for each compile step.
An alternative implementation that can at least help mitigate some of the re-evaluation if you have no choice:
ifneq 'undefined' '$(origin CLANG_BUILD)'
IFCLANG = $(strip $(1))
else
IFCLANG = $(strip $(2))
endif
... though that only helps avoid the repeated $(origin) and $(if) calls; you'd still have to follow the advice about using := wherever possible.
Where possible, avoid using custom macros inside recipes
The reasoning should be pretty obvious here after the above; anything that requires a variable or macro to be repeatedly evaluated for every compile/link step will degrade your build speed. Every macro/variable evaluation occurs in the same thread as what kicks off new jobs, so any time spent parsing is time make delays kicking off another parallel job.
I put some recipes in custom macros whenever it promotes code re-use and/or improves readability, but I try to keep it to a minimum.
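As a contrived sketch of that last point (the macro and rule are made up), a macro used inside a recipe is expanded again, including all of its $(if)/$(filter) work, for every object built:

# Expanded once per object file, in make's single job-control thread:
define COMPILE_VERBOSE
$(if $(filter 1,$(V)),,@echo "CC $(1)")
$(CC) $(CFLAGS) -c $(1) -o $(2)
endef

%.o: %.c
	$(call COMPILE_VERBOSE,$<,$@)

# Keeping the recipe plain (only automatic variables) avoids that extra work:
# %.o: %.c
# 	$(CC) $(CFLAGS) -c $< -o $@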
