Doxygen is Slow - multithreading

Doxygen takes about 12 hours to run on our code base. This is primarily because there is a lot of code to process (~1.5M lines). However, it's very quickly approaching the point at which we can't do nightly documentation updates because they take too long. We've already had to reduce the graph depth to get it down to 12 hours.
I've tried the standard approaches, but I really do need high quality output, and this includes graphs and SEARCH_INCLUDES. I have a fairly good machine to run Doxygen on, but Doxygen doesn't take advantage of its many cores. (It pegs a single CPU on the build server, but is only 4% of the available system.) Having a multithreaded Dot build is handy, though that's only half or so of the build time.
Are there any techniques I can use to run doxygen via multiple processes and manually shard the task? I've seen some talk about creating tag files, but I don't understand enough about them to know if they'd do what I want. What I'm looking for is something like:
doxygen Doxyfile-folder1
doxygen Doxyfile-folder2
doxygen Doxyfile-folder3
doxygen Doxyfile-folder4
doxygen-join output/folder1/html output/folder2/html output/folder3/html output/folder4/html
Of course, I'm just making stuff up, but that's an idea of what I'm looking for. Also, I'd use a lot more than 4 processes.

Tag files are typically the way to go if
you have a number of logically coherent source files (let's call them components) and
you know the dependencies between the components, e.g. component A uses component B and C, and component B only uses C, and
It is ok (or even preferred) that the index files (e.g. the list of a files/classes/functions) are limited to a single component.
you are interested in HTML output.
A tag file is basically just a structured list of symbols with links to the location in the documentation. Tag files allow doxygen to make links from the documentation of one component to that of another.
It is a 2 step process:
First you run doxygen on each component to generate the tag file for that component. You can do this by disabling all output and use GENERATE_TAGFILE. So for component A, a Doxyfile.tagonly would have the following settings:
GENERATE_HTML = NO
GENERATE_LATEX = NO
GENERATE_RTF = NO
GENERATE_MAN = NO
GENERATE_TAGFILE = compA.tag
You'll notice that running doxygen this way is very fast.
The second step is to generate the actual documentation. For component A you need a Doxyfile which includes the tag files of the components B and C since we determined A depends on these components.
GENERATE_HTML = YES
GENERATE_LATEX = NO
GENERATE_RTF = NO
GENERATE_MAN = NO
TAGFILES = path/to/compB/compB.tag=path/to/compB/htmldocs \
path/to/compC/compC.tag=path/to/compC/htmldocs
Using this approach I have been able to generate documentation for 20M+ lines of code distributed over 1500+ components in under 3 hours on a standard desktop PC (Core i5 with 8Gb RAM and Linux 64bit), including source browsing, full call graphs, and UML-style diagrams of all data structures. Note that the first step only took 10 minutes.
To accomplish this I made a script to generate the Doxyfile's for each component based on the list of components and their direct dependencies. In the first step I run 8 instances of doxygen in parallel (using http://www.gnu.org/s/parallel/). In the second step I run 4 instances of doxygen in parallel.
See http://www.doxygen.nl/manual/external.html for more info about tag files.

Related

Progress bar in Python for varying loops

I'm assuming this can't be done easily but atm I've written for loops in python that take in multiple files from a folder and runs them against an executable to convert them into a readable format. Problem is, the number of files will be different almost every time I run the script. The whole process takes a while so a progress bar would be nice. I've been trying tqdm, progress, etc. modules, but they all seem to work based on a fixed number of iterations, which doesn't work for me.
Basically, is there any possible way to easily implement a progress bar which either calculates progress based on how long the process will take to perform the conversions, or based on how many there are to convert?
Though having said this, for one part of the code, instead of taking in files, it was easier to run a powershell script in cmd through python to keep it part of the same code. This script iterates through many many files and outputs 1 file. Again, is there a way to monitor the progress of this in python, even though in python the code is just running something in cmd?
Try PyProg! PyProg is an open-source library for Python to create super customizable progress indicators & bars.
It is currently at version 1.0.2; it is hosted on Github and available on PyPI (Links down below). It is compatible with Python 3 and 2.
I actually made PyProg because I needed a simple but super customizable progress bar library. And I made it super flexible as well, so you can use set_total(<total>) to set the total number of files you need to convert and use set_stat(<status>) to set the current progress, and it will calculate the percentage for you!
You can easily install it with: pip install pyprog. Some examples of using it is on the Github page for this project.
PyProg Github: https://github.com/Bill13579/pyprog
PyPI: https://pypi.python.org/pypi/pyprog/

How to use xperf to examine the call stack of nodejs?

I'm trying to learn performance tuning for Node.js applications. This first thing I want is a flamegraph. Since I work on Windows platform, I follow this manual to get the flamegraph.
However, I'm stacked at this step:
xperf -i perf.etl -o perf.csv -symbols
I'm no good with xperf. Could someone tell me how to get pass this problem and get a flamegraph?
It's worth pointing out that xperf can record many different types of call stacks. You can get a call stack on every file I/O, disk I/O, context switch, registry access, etc., and you could create a flame graph of any one of these. I assume, however, that you want a flame graph of the CPU Sampled data.
You can find a slightly different technique for creating flame graphs from xperf sampled data on my blog, here:
https://randomascii.wordpress.com/2013/03/26/summarizing-xperf-cpu-usage-with-flame-graphs/
You don't say what your problem was -- what went wrong with that step -- so I'll give a few generic suggestions:
Try with a very short trace -- just a few seconds -- to make the process as fast as possible when experimenting.
Try loading the trace into WPA to make sure you can see the sampled data there. You may find that you don't need the flame graph, since WPA gives you ways to graphically explore the data. Loading the trace into WPA also gives you a chance to make sure the symbols load, and gives WPA a chance to convert the symbols to .symcache files, which will make the processing step run much faster.
Make sure you have _NT_SYMBOL_PATH set to point to Microsoft's symbol servers and any others you might need.
Consider recording the trace with wprui instead of with a batch file: https://randomascii.wordpress.com/2013/04/20/xperf-basics-recording-a-trace-the-easy-way/
You could probably improve on the flame graph generation process by not exporting all of the xperf data to text, by using the somewhat new wpaexporter, which I document here:
https://randomascii.wordpress.com/2013/11/04/exporting-arbitrary-data-from-xperf-etl-files/
However this will require reworking the scripts and may be more work than you want to put in.

Signal Processing in Go

I have come up with an idea for an audio project and it looks like Go is a useful language for implementing it. However, it requires the ability to apply filters to incoming audio, and Go doesn't appear to have any sort of audio processing package. I can use cgo to call C code, but every signal processing library I find uses C++ classes which cgo cannot handle. It looks like libsox may work. Are there any others?
What libsox can provide and what I need is to take an incoming audio stream and divide it into frequency bands. If I can do this while only reading the file once, then bonus! I am not sure if libsox can do this.
If you want to use a C++ library you could try SWIG, but you'll have to get it out of Subversion. The next release (2.0.1) will be the first released version to support Go. In my experience the Go support is still a little rough, but then again the library I tried to wrap is a monster.
Alternatively, you could still create your own bindings through cgo using the same method SWIG does, but it will be painful and tedious. The basic idea is that you first create a C wrapper, then let cgo create a Go wrapper around your C wrapper.
I don't know anything about signal processing or libsox, though. Sorry.
There is a relatively new project called ZikiChombo
which contains so far some basic DSP functionality geared toward audio, see here
The dsp part of the project has filters on its roadmap, but they are not yet there. On the other hand some infrastructure for implementing filters, such as real fft and block convolution is there. Meaning that if you want FIRs, and can compute the coefficients by some other means, you can run them via convolution in zc currently with sound in real time.
Basic filtering design support (FIR,Biquad), for example using an ideal filter as a starting point will be the next step for zc. There are numerous small self-contained open source projects for basic and more advanced FIR and IIR filter design, most notably Iowa Hills which might be more accessible than a larger project to compute filter coefficients outside of Go.
More advanced filtering such as Butterworth, and filters based on polynomial solving and the bilinear transform will take more time for zc.
There is also some software defined radio Golang projects with some code related to filtering, sorry don't have the links offhand but a search for the topic may lead you to them.
Finally, there is a gonum Fourier package which also supplies fft.
So Go is growing some interesting and potentially stuff in this domain, but still has quite a ways to go compared to older projects (which are mostly in C/C++, or perhaps with a Python wrapper via numpy for example).
I am using this pure golang repo to perform Fourier Transforms with good effect
https://github.com/mjibson/go-dsp
just supply the FFT call with a
import (
"github.com/mjibson/go-dsp/fft" // https://github.com/mjibson/go-dsp
)
var audio_wave []float64
// ... now populate audio_wave with your audio PCM samples
var complex_fft []complex128
// input time domain ... output frequency domain of equally spaced freq bins
complex_fft = fft.FFTReal(audio_wave)

Framebuffer Documentation

Is there any documentation on how to write software that uses the framebuffer device in Linux? I've seen a couple simple examples that basically say: "open it, mmap it, write pixels to mapped area." But no comprehensive documentation on how to use the different IOCTLS for it anything. I've seen references to "panning" and other capabilities but "googling it" gives way too many hits of useless information.
Edit:
Is the only documentation from a programming standpoint, not a "User's howto configure your system to use the fb," documentation the code?
You could have a look at fbi's source code, an image viewer which uses the linux framebuffer. You can get it here : http://linux.bytesex.org/fbida/
-- It appears there might not be too many options possible to programming with the fb from user space on a desktop beyond what you mentioned. This might be one reason why some of the docs are so old. Look at this howto for device driver writers and which is referenced from some official linux docs: www.linux-fbdev.org [slash] HOWTO [slash] index.html . It does not reference too many interfaces.. although looking at the linux source tree does offer larger code examples.
-- opentom.org [slash] Hardware_Framebuffer is not for a desktop environment. It reinforces the main methodology, but it does seem to avoid explaining all the ingredients necessary to doing the "fast" double buffer switching it mentions. Another one for a different device and which leaves some key buffering details out is wiki.gp2x.org [slash] wiki [slash] Writing_to_the_framebuffer_device , although it does at least suggest you might be able use fb1 and fb0 to engage double buffering (on this device.. though for desktop, fb1 may not be possible or it may access different hardware), that using volatile keyword might be appropriate, and that we should pay attention to the vsync.
-- asm.sourceforge.net [slash] articles [slash] fb.html assembly language routines that also appear (?) to just do the basics of querying, opening, setting a few basics, mmap, drawing pixel values to storage, and copying over to the fb memory (making sure to use a short stosb loop, I suppose, rather than some longer approach).
-- Beware of 16 bpp comments when googling Linux frame buffer: I used fbgrab and fb2png during an X session to no avail. These each rendered an image that suggested a snapshot of my desktop screen as if the picture of the desktop had been taken using a very bad camera, underwater, and then overexposed in a dark room. The image was completely broken in color, size, and missing much detail (dotted all over with pixel colors that didn't belong). It seems that /proc /sys on the computer I used (new kernel with at most minor modifications.. from a PCLOS derivative) claim that fb0 uses 16 bpp, and most things I googled stated something along those lines, but experiments lead me to a very different conclusion. Besides the results of these two failures from standard frame buffer grab utilities (for the versions held by this distro) that may have assumed 16 bits, I had a different successful test result treating frame buffer pixel data as 32 bits. I created a file from data pulled in via cat /dev/fb0. The file's size ended up being 1920000. I then wrote a small C program to try and manipulate that data (under the assumption it was pixel data in some encoding or other). I nailed it eventually, and the pixel format matched exactly what I had gotten from X when queried (TrueColor RGB 8 bits, no alpha but padded to 32 bits). Notice another clue: my screen resolution of 800x600 times 4 bytes gives 1920000 exactly. The 16 bit approaches I tried initially all produced a similar broken image to fbgrap, so it's not like if I may not have been looking at the right data. [Let me know if you want the code I used to test the data. Basically I just read in the entire fb0 dump and then spit it back out to file, after adding a header "P6\n800 600\n255\n" that creates the suitable ppm file, and while looping over all the pixels manipulating their order or expanding them,.. with the end successful result for me being to drop every 4th byte and switch the first and third in every 4 byte unit. In short, I turned the apparent BGRA fb0 dump into a ppm RGB file. ppm can be viewed with many pic viewers on Linux.]
-- You may want to reconsider the reasons for wanting to program using fb0 (this might also account for why few examples exist). You may not achieve any worthwhile performance gains over X (this was my, if limited, experience) while giving up benefits of using X. This reason might also account for why few code examples exist.
-- Note that DirectFB is not fb. DirectFB has of late gotten more love than the older fb, as it is more focused on the sexier 3d hw accel. If you want to render to a desktop screen as fast as possible without leveraging 3d hardware accel (or even 2d hw accel), then fb might be fine but won't give you anything much that X doesn't give you. X apparently uses fb, and the overhead is likely negligible compared to other costs your program will likely have (don't call X in any tight loop, but instead at the end once you have set up all the pixels for the frame). On the other hand, it can be neat to play around with fb as covered in this comment: Paint Pixels to Screen via Linux FrameBuffer
Check for MPlayer sources.
Under the /libvo directory there are a lot of Video Output plugins used by Mplayer to display multimedia. There you can find the fbdev (vo_fbdev* sources) plugin which uses the Linux frame buffer.
There are a lot of ioctl calls, with the following codes:
FBIOGET_VSCREENINFO
FBIOPUT_VSCREENINFO
FBIOGET_FSCREENINFO
FBIOGETCMAP
FBIOPUTCMAP
FBIOPAN_DISPLAY
It's not like a good documentation, but this is surely a good application implementation.
Look at source code of any of: fbxat,fbida, fbterm, fbtv, directFB library, libxineliboutput-fbe, ppmtofb, xserver-fbdev all are debian packages apps. Just apt-get source from debian libraries. there are many others...
hint: search for framebuffer in package description using your favorite package manager.
ok, even if reading the code is sometimes called "Guru documentation" it can be a bit too much to actually do it.
The source to any splash screen (i.e. during booting) should give you a good start.

Is there a visual two-dimensional code editor?

Let me explain what I mean by "two-dimensional code editor": imagine of using Inkscape or Gimp in a big canvas (say infinite). The "T - add text" tool is used to write the code. Additionally, all function definitions will be framed and links will connect the called functions.
In other words: you have a very large sheet of (virtual) paper where you can write.
It would be really useful. I don't want to write code as a long list of lines, especially now that big monitors are cheaper.
Is such a code editor out there?
What's your opinion? Would you use a 2d code editor?
I've written 3 or 4 visual editors and my second one worked like this, that was for java and c++ (never published, though I did use it for some published research work)
I still don't like much to write my code 'as a long list of lines'. My point is, after trying a system like this, I tried a windowed system (class outlines in windows, right click to open code editors), then a tree based system...
in the long run (I wrote several apps using all of those), the tree based system with non overlapping windows felt at once most scalable (to different monitor sizes) and foremost, most productive, because dragging the text boxes and links and/or windows in the first version was necessary, without adding much to the programming experience, so it felt wasteful.
If you want to try some of this stuff out, you can google antegram for java (java only) antegram for web (javascript/php/actionscript) and ee-ide (on oogtech.org). I'm not sure if I could dig up the original c++/java textbox + links editor (which could collapse graphs as well, and had an infinite canvas, so pretty close to what you describe).
I'm not working on this as much as I used to as few programmers ever seemed to like it except me, but if you like working the tree way, or feel like adding stuff for your own purposes, ee-ide would be the way to go, as it's nicely modular and easy to extend compared to the rest.
On the commercial side, you can configure visual studio to work with UML-like diagrams. I have a feel it might be a little too heavy (although it's definitely more coding than UML oriented), but I'm not sure, I haven't really tried yet.
This probably doesn't answer your question exactly, but anyway.
Have a look at the NodeBox beta . It is a visual programming environment mostly for creating generative graphics. You can program and edit the nodes with python code, connect and reuse them in multiple ways. (Windows and Mac OS)
Also worth mentioning (in terms of concept) is Field . It is for programming performances and arranges bits of code on a stage/timeline. Very interesting but also very confusing. (Mac OS only)
Third one is vvvv. It is used a lot by graphical artists to create realtime 3d visuals. Node based. (Windows only)
NodeBox and Field are open-source, so if you are looking to create something yourself you can see how it's done there.
Check this out. I came across it today and remembered this question.
Code Bubbles
Developers spend significant time
reading and navigating code fragments
spread across multiple locations. The
file-based nature of contemporary IDEs
makes it prohibitively difficult to
create and maintain a simultaneous
view of such fragments. We propose a
novel user interface metaphor for code
understanding and maintanence based on
collections of lightweight, editable
fragments called bubbles, which form
concurrently visible working sets.
The essential goal of this project is
to make it easier for developers to
see many fragments of code (or other
information) at once without having to
navigate back and forth. Each of these
fragments is shown in a bubble.
A bubble is a fully editable and
interactive view of a fragment such as
a method or collection of member
variables. Bubbles, in contrast to
windows, have minimal border
decoration, avoid clipping their
contents by using automatic code
reflow and elision, and do not overlap
but instead push each other out of the
way. Bubbles exist in a large,
pannable 2-D virtual space where a
cluster of bubbles comprises a
concurrently visible working set.
Bubbles support a lightweight grouping
mechanism, and further support
connections between them.
A quantiative user study indicates
that Code Bubbles increased
performance significantly for two
controlled code understanding tasks. A
qualitative user study with 23
professional developers indicates
substantial interest and enthusiasm
for the approach, despite the radical
departure from what developers are
used to.
http://www.cs.brown.edu/people/acb/codebubbles_site.htm
At one point, LabView had a programming mode like this. You connected program blocks together in a graphical way.
It's been so long since I've used LabView that I don't know if it is still the same.
For me, the MVVM pattern means that there's no code behind the UI controls anyway. The logic is all in a class with properties.
The properties use WPF databinding to update the UI controls. For example, on the form or window, page, whatever, MySearchButton.IsEnabled is bound to ViewModel.MySearchButtonIsEnabled property. So the app logic runs in the ViewModel class and just sets its own properties and the UI updates automatically.
Although this is specific to MS WPF the pattern actually stems from SmallTalk and is found across the development field as MVP. Without WPF one would need to write the databinding or 'presenter' logic, which is common.
This means the UI can be torn off and a new one pasted-in really quickly and with little code knowledge from the UI guy - who, in an ideal world, is a crack creative guy that drives a 70s Citroen.
So my point is that, although it sounds like a neat innovation, a 2D editor like this would be assisting a coding style that is no longer considered optimal.

Resources