Getting profiling file from "stack exec" - haskell

I would like to profile a program that is managed by Stack. The program was built with the following command:
stack build --executable-profiling --library-profiling --ghc-options="-fprof-auto -rtsopts"
and run with this command:
stack exec myProgram.exe -- inputArg +RTS -p
I know that the program has run (from the output file), but I am also expecting a myProgram.prof file to be produced, and I cannot find this file.
If I execute the program without using stack the profiling file is produced, but is there a way to get this to work using Stack?

-- stops the RTS from processing further command-line arguments, but it is still passed through to the program. So your -- is visible to both stack and myProgram.exe, and therefore the +RTS -p flags are not visible to myProgram.exe's RTS. Instead, try:
stack exec -- myProgram.exe inputArg +RTS -p
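The rule described above can be illustrated with a small model. This is a simplified Python sketch of how a GHC-compiled program's RTS splits its command line, covering only +RTS, -RTS and --. The function name split_rts_args is made up for this illustration and is not part of GHC or Stack.

```python
def split_rts_args(args):
    """Simplified model of RTS argument splitting (illustrative only).

    Returns (program_args, rts_args).
    """
    program_args, rts_args = [], []
    in_rts = False
    for i, arg in enumerate(args):
        if arg == "--":
            # -- is passed through to the program and disables any
            # further +RTS ... -RTS processing.
            program_args.extend(args[i:])
            break
        if arg == "+RTS":
            in_rts = True
        elif arg == "-RTS":
            in_rts = False
        elif in_rts:
            rts_args.append(arg)
        else:
            program_args.append(arg)
    return program_args, rts_args


# If the program itself sees a leading --, -p never reaches its RTS.
print(split_rts_args(["--", "inputArg", "+RTS", "-p"]))
# Without the leading --, -p is handed to the RTS.
print(split_rts_args(["inputArg", "+RTS", "-p"]))
```

In the first call everything ends up in the program's argument list, which matches the symptom in the question: the program runs but no .prof file is written.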

Related

How to use stackcount bcc tool with Rust?

I would like to create a memory flamegraph of a process using bcc/eBPF as seen here and using:
sudo ./stackcount-bpfcc -p <pid> -U -r ".*malloc.*" -v -d
It doesn't seem to write anything interesting to stdout; I just get this:
cannot attach kprobe, Invalid argument
cannot attach kprobe, Invalid argument
cannot attach kprobe, Invalid argument
cannot attach kprobe, Invalid argument
cannot attach kprobe, Invalid argument
Tracing 86 functions for ".*malloc.*"...
Hit Ctrl-C to end.
My executable is written in Rust and was built with this .cargo/config:
[build]
rustflags = "-C force-frame-pointers=yes"

Why does gem5 encounter a deadlock error when running PARSEC 3.0?

I run gem5 in full system mode on a multi-core system. I use AtomicSimpleCPU to establish a checkpoint and then switch to O3CPU to continue. I start with a command similar to the following:
./build/ARM_MOESI_hammer/gem5.opt -d fs_results/blackscholes configs/example/fs.py --ruby --num-cpus=64 --caches --l2cache --cpu-type=AtomicSimpleCPU --network=garnet2.0 --disk-image=$M5_PATH/disks/expanded-linaro-minimal-aarch64.img --kernel=/home/GEM5/gem5/2017sys/binaries/vmlinux.vexpress_gem5_v1_64.20170616 --param 'system.realview.gic.gem5_extensions = True'
Next, I establish a checkpoint and use the following command to restore it and run PARSEC:
./build/ARM_MOESI_hammer/gem5.opt -d fs_results/blackscholes configs/example/fs.py --ruby --num-cpus=64 --caches --l2cache --cpu-type=AtomicSimpleCPU --network=garnet2.0 --disk-image=$M5_PATH/disks/expanded-linaro-minimal-aarch64.img --kernel=/home/GEM5/gem5/2017sys/binaries/vmlinux.vexpress_gem5_v1_64.20170616 --param 'system.realview.gic.gem5_extensions = True' --restore-with-cpu=DeriveO3CPU --script=../arm-gem5-rsk/parsec_rcs/blackscholes_simsmall_64.rcS -r 1
But I encountered the following problems:
First of all, the rcs file is not executed. Does the startup checkpoint conflict with the --script command?
Second, when I manually enter the following in the operating system booted by gem5:
parsecmgmt -a run -c gcc-hooks -i simsmall -n 1 -p blackscholes
I got the following error:
panic: Possible Deadlock detected. Aborting!
I tried to find a solution on the internet. It seems there used to be a way to add the parameter --garnet-network=flexible, but this method is no longer applicable in gem5 version 20.0.
Can someone help me solve this deadlock problem? By the way, when running the facesim program, I can get the correct running result by using 'test' input.

Linux perf tool run issues

I am using the perf tool to benchmark one of my projects. When I run perf on my machine, everything works fine.
However, I am trying to run perf on automation servers to make it part of my check-in process, and I am getting the following error from the automation servers:
WARNING: Kernel address maps (/proc/{kallsyms,modules}) are restricted,
check /proc/sys/kernel/kptr_restrict.
Samples in kernel functions may not be resolved if a suitable vmlinux
file is not found in the buildid cache or in the vmlinux path.
Samples in kernel modules won't be resolved at all.
If some relocation was applied (e.g. kexec) symbols may be misresolved
even with a suitable vmlinux or kallsyms file.
Error:
Permission error - are you root?
Consider tweaking /proc/sys/kernel/perf_event_paranoid:
-1 - Not paranoid at all
0 - Disallow raw tracepoint access for unpriv
1 - Disallow cpu events for unpriv
2 - Disallow kernel profiling for unpriv
fp: Terminated
I tried changing /proc/sys/kernel/perf_event_paranoid to -1 and 0 but still see the same issue.
Anybody seen this before? Why would I need to run the command as root? I am able to run it on my machine without sudo.
By the way, the command is like this:
perf record -m 32 -F 99 -p xxxx -a -g --call-graph fp
You can't use -a (full system profiling) and sample the kernel as a non-root user: http://man7.org/linux/man-pages/man1/perf-record.1.html
Try running it without -a option and with event limited to userspace events by :u suffix:
perf record -m 32 -F 99 -p $PID -g --call-graph fp -e cycles:u
Or use a software event on virtualized platforms without PMU passthrough:
perf record -m 32 -F 99 -p $PID -g --call-graph fp -e cpu-clock:u
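For an automation script, the two commands above can be generated from one helper. This is a minimal Python sketch; the function name user_only_perf_record and its default values are assumptions for illustration, and only the perf flags already shown in this answer are used.

```python
def user_only_perf_record(pid, event="cycles", freq=99, mmap_pages=32):
    """Build a perf record argv that samples one process in user space only."""
    return [
        "perf", "record",
        "-m", str(mmap_pages),
        "-F", str(freq),
        "-p", str(pid),        # a single process, so no -a (system-wide)
        "-g", "--call-graph", "fp",
        "-e", f"{event}:u",    # the :u suffix limits the event to user space
    ]


# Hardware event, as in the first suggestion (1234 is a placeholder PID).
print(" ".join(user_only_perf_record(1234)))
# Software event, for machines without PMU passthrough.
print(" ".join(user_only_perf_record(1234, event="cpu-clock")))
```

The returned list can be passed directly to subprocess.run, which avoids shell quoting issues on the automation server.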

Pass +RTS options to program run with stack exec

How do I pass +RTS options to a program run with stack exec?
I've added -rtsopts to ghc-options in my cabal file, and built a program with stack build. If I run the program manually both normal and +RTS command line arguments work:
>.stack-work\dist\ca59d0ab\build\iterate-strict-exe\iterate-strict-exe.exe 25 +RTS -s
OK
3,758,156,184 bytes allocated in the heap
297,976 bytes copied during GC
...
But if I run it with stack exec, only the normal options reach the program:
>stack exec iterate-strict-exe -- 25 +RTS -s
OK
Other things that don't work
If I juggle the order of the arguments around as suggested by @epsilonhalbe, I get the same result.
>stack exec -- iterate-strict-exe 25 +RTS -s
OK
There doesn't seem to be the suggested --rts-options option to pass to stack exec.
>stack exec --rts-options "-s" -- iterate-strict-exe 25
Invalid option `--rts-options'
Usage: stack exec CMD [-- ARGS (e.g. stack ghc -- X.hs -o x)] ([--plain] |
[--[no-]ghc-package-path] [--[no-]stack-exe] [--package ARG])
[--help]
Execute a command
I'm using stack version 1.1.2
>stack --version
Version 1.1.2, Git revision c6dac65e3174dea79df54ce6d56f3e98bc060ecc (3647 commits) x86_64 hpack-0.14.0
The same after a stack upgrade to 1.4.0.
Passing the entire command as a string (another suggestion) results in a command with that name not being found
>stack exec -- "iterate-strict-exe 25 +RTS -s"
Executable named iterate-strict-exe 25 +RTS -s not found on path: ...
It looks like you are on Windows and encountering GHC bug #13287 (to be fixed in 8.2). See also stack issues 2022 and 2640. Apparently a workaround is to add --RTS before --, like this:
stack exec iterate-strict-exe --RTS -- 25 +RTS -s
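If the program is launched from a script, the workaround can be expressed as an argument list rather than one quoted string (a single string is treated as the executable name, as the failed attempt in the question shows). This is a minimal Python sketch; the helper name stack_exec_argv is hypothetical, and whether the --RTS workaround is needed at all depends on the Stack and GHC versions in use.

```python
def stack_exec_argv(exe, program_args, rts_args):
    """Build the argv for `stack exec` using the --RTS workaround."""
    return ["stack", "exec", exe, "--RTS", "--", *program_args, "+RTS", *rts_args]


argv = stack_exec_argv("iterate-strict-exe", ["25"], ["-s"])
print(" ".join(argv))

# To actually run it (requires stack and the built project):
# import subprocess
# subprocess.run(argv, check=True)
```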

Why do perf record and annotate not work?

I'm stumped. I read the perf tutorial and am trying to do a simple test beyond "perf stat", which works. However, either perf record doesn't work or perf annotate shows no samples recorded.
For example (I'm running with sudo because without it I get a bunch of errors, which I post at the end):
sudo perf record -e cycles,instructions,cache-misses -a -c 1 ./FooExe
[ perf record: Woken up 4 times to write data ]
[ perf record: Captured and wrote 1.794 MB perf.data (~78393 samples) ]
.
sudo perf report -D -i perf.data |grep RECORD_SAMPLE |wc -l
Failed to open /tmp/perf-23796.map, continuing without symbols
20486
.
sudo perf annotate -d ./FooExe
the perf.data file has no samples! Press any key
So that's as far as I get. I tried to rebuild perf for my system from source, but that didn't seem to help either.
I'm using Ubuntu 14.04 with kernel 3.19.0-49-generic, on an Intel i7-4510U CPU. I made sure to compile my program with symbols, but I get the same results regardless of which application I try to profile.
If I run without sudo:
WARNING: Kernel address maps (/proc/{kallsyms,modules}) are restricted,
check /proc/sys/kernel/kptr_restrict.
Samples in kernel functions may not be resolved if a suitable vmlinux
file is not found in the buildid cache or in the vmlinux path.
Samples in kernel modules won't be resolved at all.
If some relocation was applied (e.g. kexec) symbols may be misresolved
even with a suitable vmlinux or kallsyms file.
Cannot read kernel map
Error:
You may not have permission to collect system-wide stats.
Consider tweaking /proc/sys/kernel/perf_event_paranoid:
-1 - Not paranoid at all
0 - Disallow raw tracepoint access for unpriv
1 - Disallow cpu events for unpriv
2 - Disallow kernel profiling for unpriv
I just tried your command. The problem is that you used -a to profile all processes system-wide, so perf never ran ./FooExe. You can confirm this with strace -f perf ... ./FooExe and note the lack of any execve system call. perf also returns instantly, even if FooExe should have taken several seconds.
Here's an example of recording samples for a busy-loop awk command:
perf record -e cycles,instructions,cache-misses awk 'BEGIN{for(i=0;i<40000000;i++){}}'
Now perf report works. You don't need to specify the executable for the report command, because perf.data only has data for the one executable.
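To check from a script that a recording actually contains samples, the grep pipeline from the question can be mirrored in Python. This is a rough sketch; the function names are made up for illustration, and it assumes perf is on the PATH and that sample records in the perf report -D dump contain the text RECORD_SAMPLE, as the question's grep does.

```python
import subprocess


def count_sample_records(dump_text):
    """Count lines of a `perf report -D` dump that mention RECORD_SAMPLE."""
    return sum(1 for line in dump_text.splitlines() if "RECORD_SAMPLE" in line)


def count_samples_in(path="perf.data"):
    """Run `perf report -D -i <path>` and count the sample records."""
    result = subprocess.run(
        ["perf", "report", "-D", "-i", path],
        capture_output=True, text=True, check=True,
    )
    return count_sample_records(result.stdout)
```

A count of zero after recording a specific command suggests that the command was never run or never sampled.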
This works the same way with the ocperf.py wrapper, but there you can record more uarch-specific events using symbolic names (instead of looking up codes and numeric arguments for -e):
$ ocperf.py record -e cycles,cache-misses,uops_dispatched_port.port_0 awk 'BEGIN{for(i=0;i<40000000;i++){}}'
perf record -e cycles,cache-misses,cpu/event=0xa1,umask=0x1,name=uops_dispatched_port_port_0,period=2000003/ awk 'BEGIN{for(i=0;i<40000000;i++){}}'
(warning lines about kernel symbols)
[ perf record: Woken up 2 times to write data ]
[ perf record: Captured and wrote 0.352 MB perf.data (7819 samples) ]
$ ocperf.py report

Resources