Does perf lock profile user space mutexes? - linux

Summary: Does perf lock profile pthread_mutex?
Details:
The tool perf has an option perf lock. The man page says:
You can analyze various lock behaviours and statistics with this perf lock command.
'perf lock record <command>' records lock events
between start and end <command>. And this command
produces the file "perf.data" which contains tracing
results of lock events.
'perf lock trace' shows raw lock events.
'perf lock report' reports statistical data.
But when I tried running perf lock record I got an error saying: invalid or unsupported event: 'lock:lock_acquire'. I looked and it seems that error is probably because my kernel is not compiled with CONFIG_LOCKDEP or CONFIG_LOCK_STAT.
My question is: does perf lock report events related to user-space locks (like pthread_mutex) or only kernel locks? I'm more interested in profiling application that mostly run in user-space. I thought this option in perf looked interesting, but since I can't run it without compiling (or getting) a new kernel I'm interested in getting a better idea of what it does before I try.

Summary: Does perf lock profile pthread_mutex?
Summary: no, because there are no any tracepoint defined in user-space pthread_mutex.
According to source file tools/perf/builtin-lock.c (http://lxr.free-electrons.com/source/tools/perf/builtin-lock.c#L939) cmd_lock calls __cmd_record, which defines several tracepoints for perf record (via -e TRACEPOINT_NAME) and also pass options -R -m 1024 -c 1 to perf report. List of tracepoints defined: lock_tracepoints:
842 static const struct perf_evsel_str_handler lock_tracepoints[] = {
843 { "lock:lock_acquire", perf_evsel__process_lock_acquire, }, /* CONFIG_LOCKDEP */
844 { "lock:lock_acquired", perf_evsel__process_lock_acquired, }, /* CONFIG_LOCKDEP, CONFIG_LOCK_STAT */
845 { "lock:lock_contended", perf_evsel__process_lock_contended, }, /* CONFIG_LOCKDEP, CONFIG_LOCK_STAT */
846 { "lock:lock_release", perf_evsel__process_lock_release, }, /* CONFIG_LOCKDEP */
847 };
TRACE_EVENT(lock_acquire,.. is defined in trace/events/lock.h. And
trace_lock_acquire is defined only in kernel/locking/lockdep.c (recheck in debian codebase: http://codesearch.debian.net/search?q=trace_lock_acquire).
Only CONFIG_LOCKDEP is missing from your kernel according to kernel/locking/Makefile: obj-$(CONFIG_LOCKDEP) += lockdep.o (tracepoints are defined unconditionally in the lockdep.c.
According to https://www.kernel.org/doc/Documentation/trace/tracepoints.txt all tracepoints are kernel-only, so perf lock will not profile user-space locks.
You can try tracepoints from LTTng, the project which declares user-space tracepoints (http://lttng.org/ust). But there will be no ready lock statistics, only raw data on tracepoints. Also you should define tracepoints with tracef() macro (recompile pthreads/glibc, or try to create your own wrapper around pthread).

No. But maybe what you need can be done by recording sched_stat_sleep and sched_switch events:
$ perf record -e sched:sched_stat_sleep,sched:sched_switch -g -o perf.data.raw yourprog
$ perf inject -v -s -i perf.data.raw -o perf.data
$ perf report --stdio --no-children
Note: make sure your kernel is compiled with CONFIG_SCHEDSTATS=y and it is enabled at /proc/sys/kernel/sched_schedstats
See more at https://perf.wiki.kernel.org/index.php/Tutorial#Profiling_sleep_times.

Related

LLC-Miss Report in the Linux Perf_Events

I ran the following command in the shell:
sudo perf record -F 60000 -e LLC-misses:u mplayer video.mp4
After the program finished execution, I ran this command:
sudo perf report
The report shows the locations of code (in MPlayer and the associated shared libraries) in the userspace with the percentage of LLC misses. Is it possible to determine the origin of the misses? i.e., if they are instruction misses or data misses?
One more question: when I change LLC-misses:u to LLC-misses:up the generated report will be empty. What is wrong with the PEBS switch addition?

perf: why don't I have "syscall" counters?

There are apparently some counters in Linux perf like syscall:sys_enter_select, but on my system perf list does not show any of them
Evidence that other people do have these counters is here: http://www.brendangregg.com/blog/2014-07-03/perf-counting.html
If I run perf top -e 'syscalls:sys_enter_*' it says:
Can't open event dir: Permission denied
invalid or unsupported event: 'syscalls:sys_enter_*'
Other event types (the ones in perf list) work fine.
What do I need to do to access syscall counters in perf? I'm using Linux kernel and perf version 3.10 on x86_64.
Some perf counters, including all the syscall ones, are only available to the root user. sudo perf list will show all the counters, including syscall ones assuming the kernel is built with CONFIG_HAVE_SYSCALL_TRACEPOINTS (see Grisha Levit's answer regarding that).
So, to make perf top -e 'syscalls:sys_enter_*' work, run it under sudo--even if you do not need sudo for other counters like cycles.
You must have an old version of perf to go with your crusty old 3.10 kernel.
On a modern system (x86-64 Arch Linux with Linux 4.15.8-1-ARCH, and a matching version of perf), perf answers this question for you:
$ perf stat -e 'syscalls:sys_enter_*stat*' ls -l
event syntax error: 'syscalls:sys_enter_*stat*'
\___ can't access trace events
Error: No permissions to read /sys/kernel/debug/tracing/events/syscalls/sys_enter_*stat*
Hint: Try 'sudo mount -o remount,mode=755 /sys/kernel/debug/tracing'
Run 'perf list' for a list of valid events
...
$ ll /sys/kernel/debug/ -d
drwx------ 33 root root 0 Mar 14 00:02 /sys/kernel/debug/
Interesting that you could make it world-readable, similar to how you can put kernel.perf_event_paranoid = 0 and kernel.yama.ptrace_scope = 0 in /etc/sysctl.d/99-local.conf for convenient debugging / tracing / profiling on a single-user desktop without using root all the time.
These will be missing if the kernel was not built with CONFIG_HAVE_SYSCALL_TRACEPOINTS.
You can check like so:
# grep TRACEPOINTS "/boot/config-$(uname -r)"
CONFIG_TRACEPOINTS=y
CONFIG_HAVE_SYSCALL_TRACEPOINTS=y

Linux perf tool run issues

I am using perf tool to bench mark one of my projects. The issue I am facing is that wo get automatihen I run perf tool on my machine, everything works fine.
However, I am trying to run perf in automation servers to make it part of my check in process but I am getting the following error from automation servers
WARNING: Kernel address maps (/proc/{kallsyms,modules}) are restricted,
check /proc/sys/kernel/kptr_restrict.
Samples in kernel functions may not be resolved if a suitable vmlinux
file is not found in the buildid cache or in the vmlinux path.
Samples in kernel modules won't be resolved at all.
If some relocation was applied (e.g. kexec) symbols may be misresolved
even with a suitable vmlinux or kallsyms file.
Error:
Permission error - are you root?
Consider tweaking /proc/sys/kernel/perf_event_paranoid:
-1 - Not paranoid at all
0 - Disallow raw tracepoint access for unpriv
1 - Disallow cpu events for unpriv
2 - Disallow kernel profiling for unpriv
fp: Terminated
I tried changing /proc/sys/kernel/perf_event_paranoid to -1 and 0 but still see the same issue.
Anybody seen this before? Why would I need to run the command as root? I am able to run it on my machine without sudo.
by the way, the command is like this:
perf record -m 32 -F 99 -p xxxx -a -g --call-graph fp
You can't use -a (full system profiling) and sample kernel from non-root user: http://man7.org/linux/man-pages/man1/perf-record.1.html
Try running it without -a option and with event limited to userspace events by :u suffix:
perf record -m 32 -F 99 -p $PID -g --call-graph fp -e cycles:u
Or use software event for virtualized platforms without PMU passthrough
perf record -m 32 -F 99 -p $PID -g --call-graph fp -e cpu-clock:u

strace on Linux not logging all calls to open()

I am using strace to capture calls to open(), close() and read() on Linux. The target process is the jetty web server. As far as I can tell, strace is not logging all calls to open(). Maybe the others too, I have not tried to correlate the file descriptors to open() calls.
For example, starting strace:
strace -f -e trace=open,close,read -o/tmp/strace.out -p62881
I then use wget to fetch 100 static files; all were retrieved successfully. In one run, only 56 open events were logged; on another run of 100 different files, I got 66 open events.
I believe that using "-f" results in strace attaching to all the LWPIDs for the threads ("Process 62881 attached with 25 threads - interrupt to quit
"); when I try to explicitly attach to all using multiple "-p" options, I get a single "attach" success message, but multiple "Operation not permitted messages", one for each child PID.
I restarted Jetty to clear its cache before my tests.
Kernel version is 2.6.32-504.3.3.el6.x86_64 (Red Hat). Strace package version is strace-4.5.19-1.19.el6.x86_64.
What am I missing?
Thanks
On some systems you have to use openat() instead of open().
Try:
strace -f -e trace=openat,close,read -o/tmp/strace.out -p62881
Try -ff (in addition to -f):
-ff: If the -o filename option is in effect, each processes trace is written to filename.pid where pid is
the numeric process id of each process. This is incompatible with -c, since no per-process counts are
kept.

why does perf record and annotate not work?

I'm stumped, I read the perf tutorial and am trying to do a simple test beyond "perf stat" which works. However perf record either doesnt work ,or perf annotate shows no samples recorded. Running perf
For example(im running with sudo because without it i get a bunch of errors which i will post at the end):
sudo perf record -e cycles,instructions,cache-misses -a -c 1 ./FooExe
[ perf record: Woken up 4 times to write data ]
[ perf record: Captured and wrote 1.794 MB perf.data (~78393 samples) ]
.
sudo perf report -D -i perf.data |grep RECORD_SAMPLE |wc -l
Failed to open /tmp/perf-23796.map, continuing without symbols
20486
.
sudo perf annotate -d ./FooExe
the perf.data file has no samples! Press any key
So thats as far as i get. I tried to rebuild perf for my ssystem from source but that didnt seem to help either.
Im using Ubuntu 14.04 kernel 3.19.0-49-generic. This is on intel i7 I4510U cpu . I made sure to compile my program with symbols , but i get the same results regardless of which application i try to profile.
-- if i run without sudo :
WARNING: Kernel address maps (/proc/{kallsyms,modules}) are restricted,
check /proc/sys/kernel/kptr_restrict.
Samples in kernel functions may not be resolved if a suitable vmlinux
file is not found in the buildid cache or in the vmlinux path.
Samples in kernel modules won't be resolved at all.
If some relocation was applied (e.g. kexec) symbols may be misresolved
even with a suitable vmlinux or kallsyms file.
Cannot read kernel map
Error:
You may not have permission to collect system-wide stats.
Consider tweaking /proc/sys/kernel/perf_event_paranoid:
-1 - Not paranoid at all
0 - Disallow raw tracepoint access for unpriv
1 - Disallow cpu events for unpriv
2 - Disallow kernel profiling for unpriv
I just tried your command. The problem was that you used -a to profile all processes system-wide, so it never ran ./FooExe. You can confirm this with strace -f perf ... ./FooExe, and note the lack of any execve system call. And also the fact that it returns instantly, even if FooExe should have taken several seconds.
Here's an example of recording samples for a busy-loop awk command:
perf record -e cycles,instructions,cache-misses awk 'BEGIN{for(i=0;i<40000000;i++){}}'
Now perf report works. You don't need to specify the executable for the report command, because perf.data only has data for the one executable.
This works the same way with the ocperf.py wrapper, but you could record events for more uarch-specific events using symbolic names (instead of looking up codes and numeric arguments in -e):
$ ocperf.py record -e cycles,cache-misses,uops_dispatched_port.port_0 awk 'BEGIN{for(i=0;i<40000000;i++){}}'
perf record -e cycles,cache-misses,cpu/event=0xa1,umask=0x1,name=uops_dispatched_port_port_0,period=2000003/ awk 'BEGIN{for(i=0;i<40000000;i++){}}'
(warning lines about kernel symbols)
[ perf record: Woken up 2 times to write data ]
[ perf record: Captured and wrote 0.352 MB perf.data (7819 samples)
$ ocperf.py report

Resources