How to compile boost faster? - visual-studio-2012

I'm using the following command on Win7 x64
.\b2 --cxxflags=/MP --build-type=complete
also tried
.\b2 --cxxflags=-MP --build-type=complete
However, cl.exe is still using only one of the 8 cores of my system. Any suggestions?

Make the compilation parallel at the build-tool level rather than per translation unit, with
.\b2 -j8
or similar (if you have n cores, -j(n+1) is often used)
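For instance, a minimal sketch for an 8-core machine (the -j value is illustrative; adjust it to your core count):
.\b2 -j8 --build-type=complete
If you still want /MP as well, it is typically passed as cxxflags=/MP (without the leading dashes), but it is the -j option that makes b2 itself launch multiple compiler processes.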

Turns out Malwarebytes was the culprit. It was slowing down the compilation by scanning newly generated files and memory. I turned it off, and now I sometimes see 50% utilization (4 cores). It's still between 5% and 14% most of the time, though.

Related

Linux kernel re-compilation too slow

I am compiling the Linux kernel in a VM (VirtualBox) with 2 of the host's 4 GB of RAM and 4 of its 8 CPUs allocated. My initial compilation took around 8-9 hours, even though I was using make -j4. Now I have added a simple system call to the kernel and just ran make -j4 again, and it has been compiling for the past 3 hours. I thought that after the initial compilation make would only rebuild the small changes, but it seems to be compiling everything (mostly the drivers). Is there any way I can speed up this compilation process?
For example, is there any way to disable some of the drivers that I don't really need? If I just want to implement a simple system call, I don't need all the networking drivers, and maybe dropping them would speed things up. In other words, I just want the bare minimum functionality in my kernel to test my system calls.
Compiling the kernel will always take a long time; unfortunately, there's no way around that short of a really fast processor with a lot of hardware threads. However, in huge projects like this, ccache helps compilation times tremendously. It's not perfect, but it's far better than recompiling every object from scratch.
You won't see the difference on the initial compilation, but it will speed up recompilation by reusing the cache it has built instead of compiling most of what has already been compiled before.
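As a hedged sketch, assuming ccache is installed and run from the kernel source tree (the -j value is illustrative):
make CC="ccache gcc" -j4
ccache -s   # show cache hit/miss statistics after the build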

Webstorm becomes extremely slow with node.js

Anyone knows what's the deal with this IDE?
I have been running it for a while, lately it has become very slow and unresponsive at times.
Gobbles up CPU even when just editing a bunch of js files.
Possibilities:
1. My code base is getting bigger...
2. I have several listeners which compile coffeescript and sass files in the background when these change.
In any case, I am very surprised (for the worse) that this is so slow. Would expect better from a developer of an IDE.
Anyone had this kind of problem before?
Thanks.
There are a couple of performance tweaks you can apply to WebStorm to see if they improve your situation. When my colleagues and I found that WebStorm was slowing down, these tweaks solved all our problems.
First things first, ensure your project is configured to use WebStorm's resources efficiently by excluding particular directories from the project. This ensures the files they contain are not indexed in memory and do not degrade performance when you perform operations such as searching for files or for text within files. Good candidates to exclude are the node_modules directory and directories of compiled output.
If there are still performance issues, try the following:
If you are on Windows, by default you will be using the 32-bit version. Navigate to the WebStorm directory (within Program Files) and you'll see webstorm64.exe, which runs WebStorm in 64-bit mode. (You might need to install a proper 64-bit JDK yourself for this.)
The default VM options for IntelliJ IDEA may not be optimal when your project contains more than 10,000 classes, and developers often change the defaults to minimize IntelliJ IDEA's hang time.
You can try bumping up the JVM memory limits for WebStorm. Open the VM options file at IDE_HOME\bin\<product>[bits][.exe].vmoptions. Initially try doubling the Xms and Xmx memory values.
Please note that very big Xmx and Xms values are not necessarily good: the garbage collector then has to work on a large chunk of memory at a time, which can cause considerable hang-ups.
For more info on configuring JVM memory options you can refer to:
Configuring IntelliJ IDEA VM options - http://blog.jetbrains.com/idea/2006/04/configuring-intellij-idea-vm-options/
Configuring JVM options and platform properties - https://intellij-support.jetbrains.com/entries/23395793-Configuring-JVM-options-and-platform-properties
You can now do this from the UI as well.
These are my before and after values. No problems with the garbage collector; I just multiplied all the values by 4. Machine: 20 GB RAM, a 4 GHz i7 CPU and an SSD. With the defaults it had started to lag; now there is no lag again.
Pasting as text for quick copy:
# custom WebStorm VM options
# Default:
# -Xms128m
# -Xmx750m
# -XX:ReservedCodeCacheSize=240m
# -XX:+UseCompressedOops
-Xms512m
-Xmx3000m
-XX:ReservedCodeCacheSize=960m
-XX:+UseCompressedOops
I was dealing with a similar situation: the CPU used to spike like crazy, and the IDE used to lag. Go to the WebStorm preferences and try disabling plugins that you do not need.
For instance, if your project uses SASS, what's the point of having the LESS plugin running? Likewise, if your project uses Git, you don't need the CVS or Perforce integration.
CPU still spikes when WebStorm is indexing my project files, but I usually just wait it out.
Stopping my TypeScript file watching helped significantly (both in the IDE settings and in tsconfig.json). I assume that once the project gets big enough, any change forces a large recompile. It's not ideal, but it's something that worked for me and may work for others as well.

How to speed up Linux kernel compilation?

I have core i5 with 8gb RAM.
I have VMware workstation 10.0.1 installed on my machine.
I have fedora 20 Desktop Edition installed on VMware as guest OS.
I am working on Linux kernel source code v3.14.1; I am developing an I/O scheduler for the Linux kernel. After any modification to the code, it takes around 1 hour and 30 minutes to compile and install the whole kernel in order to see the changes.
Compilation and installation commands:
make menuconfig
make
make modules
make modules_install
make install
So my question: is it possible to reduce that 1 hour and 30 minutes to only 10 to 15 minutes?
Do not do make menuconfig for every change you make to the sources, because it will trigger a rebuild of nearly everything, no matter how trivial your change is. It is only needed when the kernel's configuration options change, and that should seldom happen during your development.
Just do:
make
or if you prefer the parallel compilation:
make -j4
or whatever number of concurrent tasks you fancy.
Then the make install, etc. may be needed for deploying the recently built binaries, of course.
Another trick is to configure the kernel to the minimum needed for your tests. I've found that for many tasks a UML build (User-Mode Linux) is the fastest. You may also find make localmodconfig useful as a starting point instead of make menuconfig.
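A hedged sketch of that minimal-config approach, run from the kernel source tree on the machine whose hardware you care about (the -j value is illustrative):
make localmodconfig   # shrink .config to the modules currently loaded
make -j4              # then build in parallel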
Use make's parallel build with the -j option.
Compile for the target architecture only by passing ARCH explicitly, so that only the code relevant to that architecture is configured and built.
That is, instead of running:
make
run:
make ARCH=<your architecture> -jN
where N is the number of cores on your machine (cat /proc/cpuinfo lists the cores). For example, for an i386 target and a host machine with 4 cores (as reported by cat /proc/cpuinfo):
make ARCH=i386 -j4
Similarly, you can run the other make targets (modules, modules_install, install) with the -jN flag.
Note: make checks which files have been modified and recompiles only those, so only the initial build should take long; subsequent builds will be faster.
make -j (with no number) will make use of all available CPUs, since it places no limit on the number of concurrent jobs.
You do not need to run make menuconfig again every time you make a change — it is only needed once to create the kernel .config file. (Or possibly again if you edit Kconfig files to add or modify configuration options, but this certainly shouldn't be happening often.)
So long as your .config is left alone, running make should only recompile files that you changed. There are a few files that must be compiled every time, but the vast majority are not.
ccache should be able to dramatically speed up your compile times. It speeds up recompilation by caching previous compilations and detecting when the same compilation is being done again. Your first compilation with ccache will be slower since it needs to populate the cache, but subsequent builds should be much faster.
If you don't want to fuss with ccache configurations you can just run it like so to compile the kernel:
ccache make
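Alternatively, a sketch that makes ccache transparent by putting its compiler masquerade directory first in PATH (/usr/lib/ccache is the Debian/Ubuntu default; adjust for your distro):
export PATH=/usr/lib/ccache:$PATH
make -j4
ccache -s   # check how many cache hits the rebuild produced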
In addition to the previous suggestions, while using ccache you might want to unset CONFIG_GCC_PLUGINS (if it was set); otherwise you may get a lot of cache misses.
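A sketch of doing that with the config helper shipped in the kernel source tree (the option exists only in reasonably recent kernels; verify it is present in your version):
scripts/config --disable GCC_PLUGINS
make olddefconfig   # refresh the rest of the configuration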
Also in addition to the previous suggestions, using ccache (https://ccache.samba.org/) and putting the compilation directory on an SSD should drastically decrease the compilation time.
If you have sufficient RAM and you won't be using your machine while the kernel is being built, you can spawn a large number of concurrent jobs. But make sure your RAM really is sufficient, otherwise your system will hang and crash.
Use this command:
sudo make -j 4 && sudo make modules_install -j 4 && sudo make install -j 4
where 4 is the number of cores I have allotted to this process.
Simple trick: if you don't need to use the machine interactively (or have another one to work on), you can log out completely and switch to a TTY terminal using CTRL + ALT + F*. Everything is much, much faster.

Why does my code run slower with multiple threads than with a single thread when it is compiled for profiling (-pg)?

I'm writing a ray tracer.
Recently, I added threading to the program to exploit the additional cores on my i5 Quad Core.
In a weird turn of events the debug version of the application is now running slower, but the optimized build is running faster than before I added threading.
I'm passing the "-g -pg" flags to gcc for the debug build and the "-O3" flag for the optimized build.
Host system: Ubuntu Linux 10.4 AMD64.
I know that debug symbols add significant overhead to the program, but the relative performance has always been maintained; i.e., a faster algorithm will always run faster in both debug and optimized builds.
Any idea why I'm seeing this behavior?
Debug version is compiled with "-g3 -pg". Optimized version with "-O3".
Optimized no threading: 0m4.864s
Optimized threading: 0m2.075s
Debug no threading: 0m30.351s
Debug threading: 0m39.860s
Debug threading after "strip": 0m39.767s
Debug no threading (no-pg): 0m10.428s
Debug threading (no-pg): 0m4.045s
This convinces me that "-g3" is not to blame for the odd performance delta; rather, it's the "-pg" switch. It's likely that the "-pg" option adds some sort of locking mechanism to measure thread performance.
Since "-pg" is broken on threaded applications anyway, I'll just remove it.
What do you get without the -pg flag? That's not debugging symbols (which don't affect the code generation), that's for profiling (which does).
It's quite plausible that profiling in a multithreaded process requires additional locking which slows the multithreaded version down, even to the point of making it slower than the non-multithreaded version.
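To illustrate the comparison being suggested, a sketch of the three build flavours (the source file and binary names are hypothetical; adapt the flags to your project):
g++ -O3 -pthread tracer.cpp -o rt_opt        # optimized, no instrumentation
g++ -g3 -pthread tracer.cpp -o rt_debug      # debug info only
g++ -g3 -pg -pthread tracer.cpp -o rt_prof   # debug info plus gprof instrumentation
time ./rt_debug
time ./rt_prof   # the difference against rt_debug isolates the -pg overhead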
You are talking about two different things here: debug symbols and compiler optimization. If you use the strongest optimization settings the compiler has to offer, you do so at the cost of losing symbols that are useful in debugging.
Your application is not running slower due to debugging symbols; it's running slower because of less optimization done by the compiler.
Debugging symbols are not 'overhead' beyond the fact that they occupy more disk space. Code compiled at maximum optimization (-O3) would not normally have debug symbols added; that's a flag you set when you have no need for said symbols.
If you need debugging symbols, you gain them at the expense of compiler optimization. But once again, this is not 'overhead'; it's just the absence of compiler optimization.
Is the profiling code inserting instrumentation calls in enough functions to hurt you?
If you single-step at the assembly-language level, you'll find out pretty quickly.
Multithreaded code execution time is not always measured as expected by gprof.
You should time your code with another timer in addition to gprof to see the difference.
My example: running the LULESH CORAL benchmark on a two-NUMA-node Intel Sandy Bridge machine (8 + 8 cores) with size -s 50 and 20 iterations (-i 20), compiled with gcc 6.3.0 at -O3, I get:
With 1 thread running: ~3.7 s without -pg and ~3.8 s with it, but according to the gprof analysis the code ran for only 3.5 s.
With 16 threads running: ~0.6 s without -pg and ~0.8 s with it, but according to the gprof analysis the code ran for ~4.5 s.
These times were measured with gettimeofday, outside the parallel region (at the start and end of the main function).
Therefore, if you had measured your application's time the same way, you would probably have seen the same speedup with and without -pg. It is just the gprof measurement that is wrong for parallel code, at least in the LULESH OpenMP version.
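For example, a sketch of cross-checking gprof's totals against an external wall-clock timer (the binary name is illustrative; -s and -i follow the LULESH run described above):
export OMP_NUM_THREADS=16
time ./lulesh2.0 -s 50 -i 20        # wall-clock time as the ground truth
gprof ./lulesh2.0 gmon.out | head   # gprof totals can exceed wall time for threaded runs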

How to reduce compilation cost in GCC and make?

I am trying to build some big libraries, like Boost and OpenCV, from source via make and GCC under Ubuntu 8.10 on my laptop. Unfortunately, compiling those big libraries seems to be a big burden for my laptop (Acer Aspire 5000). Its fan gets louder and louder until, all of a sudden, the laptop shuts itself down without the OS turning off gracefully.
So I wonder how to reduce the compilation cost in case of make and GCC?
I wouldn't mind if the compilation took much longer or used more space, as long as it could finish without my laptop shutting itself down.
Is building the debug version of a library always less costly than building the release version, because there is no optimization?
Generally speaking, is it possible to specify that only some parts of a library be built and installed instead of the full library? Can the rest be built and added in later if it turns out to be needed?
Is it correct that if I restart my laptop, compilation can resume from around where it was when the laptop shut itself down? For example, I noticed this seems to be true for OpenCV: the progress percentage shown during its compilation does not restart from 0%. But I am not sure about Boost, since there is no obvious indicator for me to check, and its compilation seems to take much longer.
UPDATE:
Thanks, brianegge and Levy Chen! How to use the wrapper script for GCC and/or g++? Is it like defining some alias to GCC or g++? How to call a script to check sensors and wait until the CPU temperature drops before continuing?
I'd suggest creating a wrapper script for gcc and/or g++:
#!/bin/bash
# Pause before each compile so the CPU gets a chance to cool down
sleep 10
exec gcc "$@"
Save the above as "gccslow" or something, and then:
export CC="gccslow"
Alternatively, you can call the script gcc and put it at the front of your path. If you do that, be sure to include the full path in the script, otherwise, the script will call itself recursively.
A better implementation could call a script to check sensors and wait until the CPU temperature drops before continuing.
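A sketch of such a wrapper, assuming lm-sensors is installed and that sensors prints a "Core 0" line; the 75-degree threshold and 10-second pause are illustrative:
#!/bin/bash
# gccslow: wait for the CPU to cool down before each compile
while true; do
    temp=$(sensors 2>/dev/null | awk '/^Core 0/ {print int($3); exit}')
    [ "${temp:-0}" -lt 75 ] && break
    sleep 10
done
exec gcc "$@"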
For your latter question: a well-written Makefile defines dependencies as a directed acyclic graph (DAG), and make tries to satisfy those dependencies by compiling targets in an order consistent with the DAG. Thus, once a file is compiled, that dependency is satisfied and it need not be compiled again.
It can, however, be tricky to write good Makefiles, so sometimes an author resorts to a brute-force approach and recompiles everything from scratch.
For such well-known libraries, I would assume the Makefiles are written properly and that the build should resume from the last operation (with the caveat that make needs to rescan the DAG and recalculate the compilation order, which should be relatively cheap).
Instead of compiling the whole thing, you can compile each target separately. You have to examine the Makefile to identify them.
Tongue-in-cheek: What about putting the laptop into the fridge while compiling?
