I am trying to profile my software (on Linux) with oprofile. My software consists of both a userspace part and a kernel module. My first question is: what does the --separate=kernel option do? What is the difference when running without that option? I tried it but couldn't see any difference. Could you please post an example?
Can't I profile a kernel module without the --separate=kernel option?
Thanks,
Bala
When oprofile is run with --separate=kernel, it separates the samples for the kernel and kernel modules per application.
--separate=library separates the samples for dynamically linked objects on a per-application basis.
The kernel and dynamically linked objects are not specific to the application we want to profile, yet our application may spend a considerable amount of time in them.
So --separate lets you see the samples from the point of view of the application you are interested in profiling. It can also separate samples by individual thread.
The kernel can be profiled by passing the --vmlinux option to opcontrol.
Example: opcontrol --vmlinux=/boot/vmlinux-2.6.27.23-0.1-preempt
--separate is an additional option that lets us view the samples at different granularities.
I am trying to debug the performance of my program. Ideally I would have a way to see in detail when the thread was doing useful work, when it was blocked by page faults, when it was executing memory writes and reads, etc.
I would simply like to have a detailed understanding of what's going on. Is it possible?
The Linux kernel sources come with the perf tool, which can measure a large number of performance counters, including all of those you listed. It can print statistics about them, annotate symbols, instructions and source lines with them (if debug symbols are available), and track any process or logical CPU core.
Your Linux distribution probably ships the tool as a standalone package. Some kernel hardening options (for example, the kernel.perf_event_paranoid sysctl) may limit what information root or non-root users can collect with it.
You can use perf and visualize the perf output file graphically with hotspot.
perf is able to record multiple fields such as addr, ip, and timestamp. It can also record general-purpose registers, as seen at https://github.com/torvalds/linux/blob/master/tools/perf/arch/x86/util/perf_regs.c. But I can't find any documentation about recording control registers using perf. So how can I achieve that using perf? Are there any other tools available?
You cannot record control register values using the perf tools. The registers that you can sample using the --intr-regs option are limited to the registers listed here. You can confirm this by looking here.
The registers that can be accessed by the perf events module are architecture dependent, as can be seen here and here. Including selected register state in the perf record/script output was introduced by this commit. This means that all of perf is limited to the registers that have been specified there and nothing more.
There are other questions/answers here that describe ways of writing a program or kernel module to access the control registers. On top of this, you can use QEMU (in TCG mode) and run your program inside the VM. You can then print the register state periodically (at the end of each translation block (TB), where all register values are visible). There are also debuggers like GDB, which might help you.
Edit -
There is one way in which CR3 register values can be recorded. You can use Intel PT to record control flow information for a program during its execution. Intel PT tracks changes to the CR3 register with the help of the PIP (Paging Information Packet). You can use the traces generated by Intel PT to track and determine the CR3 values.
I need to write an application that gets performance statistics on a Linux machine. Unfortunately the environment is extremely memory constrained and so using the standard command line tools isn't really an option as I would need to poll them pretty frequently.
Ideally I would like to get the performance data directly from the kernel itself, using the same buffers and data that it uses, to reduce the RAM requirements for my application as much as possible. Tying my app to the Linux kernel so closely isn't really a problem; we have only ever used Linux in production and I can't see that ever changing.
I've spent the last day or two looking through the kernel source but I have to admit to being somewhat lost. Can anyone point me to the right place for getting access to CPU performance information / I/O performance information / networking performance information and bandwidth usage information please?
I think there are several relevant files under /proc, such as /proc/stat, /proc/diskstats, and /proc/net/*.
For CPU performance information, use /proc/stat. The file format is defined in ./fs/proc/stat.c in the Linux kernel source tree.
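As a rough illustration of using /proc/stat for CPU load, here is a minimal Python sketch. It assumes the field order documented in proc(5) (user, nice, system, idle, iowait, irq, softirq, steal, ...). The function names are made up for this example, and the two sample strings stand in for two reads of /proc/stat taken some time apart.

```python
def parse_cpu_line(stat_text):
    """Return the counters from the aggregate 'cpu' line as a list of ints."""
    for line in stat_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "cpu":
            return [int(value) for value in fields[1:]]
    raise ValueError("no aggregate 'cpu' line found")


def cpu_busy_fraction(before, after):
    """Fraction of CPU time spent non-idle between two samples."""
    deltas = [b - a for a, b in zip(before, after)]
    # Assumed order: user nice system idle iowait irq softirq steal.
    # Later fields (guest, guest_nice) are already counted in user/nice.
    total = sum(deltas[:8])
    idle = sum(deltas[3:5])  # idle + iowait
    return 0.0 if total <= 0 else (total - idle) / total


# Hypothetical samples; on a real system read /proc/stat twice instead.
sample_1 = "cpu  100 0 50 800 50 0 0 0 0 0\ncpu0 100 0 50 800 50 0 0 0 0 0\n"
sample_2 = "cpu  150 0 70 860 60 0 0 0 0 0\ncpu0 150 0 70 860 60 0 0 0 0 0\n"
busy = cpu_busy_fraction(parse_cpu_line(sample_1), parse_cpu_line(sample_2))
print(f"busy: {busy:.0%}")
```

The counters are cumulative since boot, so only the difference between two samples says anything about current load.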
For disk access information, use /proc/diskstats. The file format is defined in ./block/genhd.c in the Linux kernel source tree, in the function diskstats_show().
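A similar sketch for /proc/diskstats, assuming the usual layout of major, minor, device name, followed by at least eleven counters (reads completed, reads merged, sectors read, ms reading, writes completed, writes merged, sectors written, ms writing, I/Os in progress, ms doing I/O, weighted ms). The function name and the sample line are hypothetical.

```python
SECTOR_BYTES = 512  # diskstats counts sectors in 512-byte units


def parse_diskstats(text):
    """Map device name -> selected counters from /proc/diskstats text."""
    stats = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 14:
            continue  # skip short or malformed lines
        stats[fields[2]] = {
            "reads_completed": int(fields[3]),
            "bytes_read": int(fields[5]) * SECTOR_BYTES,
            "writes_completed": int(fields[7]),
            "bytes_written": int(fields[9]) * SECTOR_BYTES,
            "ms_doing_io": int(fields[12]),
        }
    return stats


# Hypothetical line; on a real system use open("/proc/diskstats").read().
sample = "   8       0 sda 1000 10 20480 500 2000 20 40960 900 0 700 1400\n"
disk = parse_diskstats(sample)["sda"]
print(disk["bytes_read"], disk["bytes_written"])
```

As with /proc/stat, these are cumulative counters, so throughput comes from subtracting two snapshots and dividing by the interval.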
For network-related statistics, see the files under /proc/net/. However, I don't know how to calculate bandwidth usage from the files in /proc/net.
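One possible way to estimate bandwidth, offered only as a sketch: /proc/net/dev lists cumulative per-interface counters, and assuming the common layout (two header lines, then "iface:" followed by eight receive columns and eight transmit columns, with bytes first in each group), the byte deltas between two snapshots divided by the interval give a rate. The function names and sample text below are invented for illustration, and counter wraparound is ignored.

```python
def parse_net_dev(text):
    """Map interface name -> (rx_bytes, tx_bytes) from /proc/net/dev text."""
    counters = {}
    for line in text.splitlines()[2:]:  # first two lines are headers
        if ":" not in line:
            continue
        # Split on ':' first; some kernels print "eth0:1234" with no space.
        name, data = line.split(":", 1)
        fields = data.split()
        counters[name.strip()] = (int(fields[0]), int(fields[8]))
    return counters


def bandwidth(before, after, seconds):
    """Bytes per second (rx, tx) for interfaces present in both snapshots."""
    rates = {}
    for name, (rx_new, tx_new) in after.items():
        if name in before:
            rx_old, tx_old = before[name]
            rates[name] = ((rx_new - rx_old) / seconds,
                           (tx_new - tx_old) / seconds)
    return rates


HEADER = "Inter-|   Receive   |  Transmit\n face |bytes packets ...\n"
# Hypothetical snapshots taken 2 seconds apart.
snap_1 = HEADER + "  eth0:10000 10 0 0 0 0 0 0 4000 8 0 0 0 0 0 0\n"
snap_2 = HEADER + "  eth0:30000 30 0 0 0 0 0 0 5000 9 0 0 0 0 0 0\n"
rates = bandwidth(parse_net_dev(snap_1), parse_net_dev(snap_2), 2.0)
print(rates["eth0"])
```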
Background
I've written a tool to capture CPU usage on a per-thread basis. The output of the tool is a binary file that I can feed into a parsing utility I wrote. The output of the parsing utility is a CSV file that I can import into Excel to chart graphs of process/thread CPU usage.
This CPU usage capture tool runs on an embedded ARM platform with a Linux kernel based on 2.6.35.3, so I was concerned about keeping the tool lightweight. I didn't want it to write directly to a CSV file, in order to minimize the processing time and the file size of the captured data.
Question
The tool works, but I'm wondering if I took the long way around the problem? Is there already a tool out there that does this (or something like it)?
You're probably wondering why I care if I already made a tool that works. Well, it's not as lightweight as I'd like. It uses about 10% of the CPU. As a benchmark, top only uses about 1% (max).
Update
I've decided to continue using my tool for now. At least until a better solution becomes available. I was able to shave off a couple percentage points by using open() instead of fopen() on /proc/stat. I'm also using read() instead of fgets().
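The same idea of bypassing buffered stdio can be pushed a little further by opening the file once and re-reading it from offset 0 on every sample. The Python sketch below illustrates this with os.open() and os.pread(); the class name is hypothetical, and it assumes the whole file fits in a single read of the chosen buffer size, which should be checked on the target system.

```python
import os


class SnapshotReader:
    """Re-read one file through a single descriptor (no open/close per sample)."""

    def __init__(self, path, bufsize=65536):
        self.fd = os.open(path, os.O_RDONLY)
        self.bufsize = bufsize

    def read(self):
        # Reading from offset 0 each time yields a fresh snapshot without
        # reopening the file. Assumes the content fits in one read.
        return os.pread(self.fd, self.bufsize, 0)

    def close(self):
        os.close(self.fd)
```

In a sampling loop one would create SnapshotReader("/proc/stat") once, call read() per interval, and close() on exit.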
IBM has a tool called nmon which does the same (for AIX and Linux). According to IBM's documentation, it takes ~2% CPU. You may want to look at that.
Comparing nmon with your tool could give you a fair idea of your program's performance and how you might improve your CSV capture.
This might be a bit of a steep learning curve, but you might want to look into SystemTap: http://sourceware.org/systemtap/
I'm trying to find the best way to use 'top' as semi-permanent instrumentation in the development of a box running embedded Linux. (The instrumentation will be removed from the final-test and production releases.)
My first pass is to simply add this to init.d:
top -b -d 15 >/tmp/toploop.out &
This runs top in "batch" mode every 15 seconds. Let's assume that /tmp has plenty of space...
Questions:
Is 15 seconds a good value to choose for general-purpose monitoring?
Other than disk space, how seriously is this perturbing the state of the system?
What other (perhaps better) tools could be used like this?
Look at collectd. It's a very lightweight system monitoring framework written with performance in mind.
We use sysstat to monitor things like this.
You might find that vmstat and iostat with a delay and no repeat counter is a better option.
I suspect 15 seconds would be more than adequate unless you actually want to watch what's happening in real time, but that doesn't appear to be the case here.
As far as load goes, on an idling PIII 900 MHz with 768 MB of RAM running Ubuntu (not sure which version, but not more than a year old), top updating every 0.5 seconds uses about 2% CPU. At 15-second updates, I'm seeing 0.1% CPU utilization.
Depending upon what exactly you want, you could use the output of uptime, free, and ps to get most, if not all, of top's information.
If you are looking for overall load, uptime is probably sufficient. However, if you want specific information about processes, you are adventurous, and you have the /proc filesystem enabled, you may want to write your own tools. The primary benefit in this environment is that you can focus on exactly what you want and minimize the load introduced to the system.
The proc file system gives your application read access to the kernel data that keeps track of many of the interesting variables. Reading from /proc is one of the lightest ways to get this information. Additionally, you may be able to get more information than top provides. I've done this in the past to get the amount of time a process spent in user and system mode. You can also use it to get the number of file descriptors a process has open, and to get detailed information about how the network subsystem is behaving.
Much of this information is pre-processed by other applications, which you can use if they give you what you need. However, it is rather straightforward to read the raw information. Do a man proc for more information.
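To illustrate the per-process user/system time and file descriptor count mentioned above, here is a small Python sketch. It assumes the /proc/[pid]/stat layout described in proc(5), where utime is field 14 and stime is field 15, both in clock ticks; the function names and the sample line are hypothetical.

```python
import os


def parse_pid_stat(stat_line):
    """Return (utime_ticks, stime_ticks) from the text of /proc/[pid]/stat."""
    # The command name (field 2) is wrapped in parentheses and may itself
    # contain spaces or ')', so split after the LAST closing parenthesis.
    rest = stat_line[stat_line.rindex(")") + 1:].split()
    # rest[0] is field 3 (state), so field N sits at rest[N - 3].
    return int(rest[14 - 3]), int(rest[15 - 3])


def count_open_fds(pid):
    """Number of open descriptors, assuming /proc/<pid>/fd is readable."""
    return len(os.listdir(f"/proc/{pid}/fd"))


# Hypothetical stat line, truncated after the fields used here.
sample = "1234 (my prog) S 1 1234 1234 0 -1 4194304 100 0 0 0 250 75 0 0 20 0 1 0"
utime, stime = parse_pid_stat(sample)
ticks_per_second = os.sysconf("SC_CLK_TCK")
print(utime / ticks_per_second, stime / ticks_per_second)
```

On a live system the input would come from open(f"/proc/{pid}/stat").read(), and count_open_fds(os.getpid()) reports the descriptors of the calling process.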
Pity you haven't said what you are monitoring for.
You should decide whether 15 seconds is OK or not. Feel free to drop it much lower if you wish (and have a fast HDD).
No worries, unless you are running a soft real-time system.
Have a look at the tools suggested in the other answers. I'll add another suggestion: "iotop", for answering "who is thrashing the HDD" questions.
At work for system monitoring during stress tests we use a tool called nmon.
What I love about nmon is it has the ability to export to XLS and generate beautiful graphs for you.
It generates statistics for:
Memory Usage
CPU Usage
Network Usage
Disk I/O
Good luck :)