gperftools cpu profiler does not support multi process? - linux

According to the documentation, http://gperftools.googlecode.com/svn/trunk/doc/cpuprofile.html, the CPU profiler does support multiple processes and generates a separate output file for each:
If your program forks, the children will also be profiled (since they
inherit the same CPUPROFILE setting). Each process is profiled
separately; to distinguish the child profiles from the parent profile
and from each other, all children will have their process-id appended
to the CPUPROFILE name.
But when I try the following:
// main_cmd_argv.cpp
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <gperftools/profiler.h>
int loop(int n) {
int sum = 0;
for (int i = 0; i < n; i++) {
for (int j = 0; j < n; j++) {
sum = i + j;
if (sum %3 == 0) {
sum /= 3;
}
}
}
return 0;
}
int main(int argc, char* argv[]) {
printf("%s\n%s\n", getenv("CPUPROFILE"), getenv("CPUPROFILESIGNAL"));
if (argc > 1 && strcmp(argv[1], "-s")==0) {
// single process
loop(100000);
printf("stoped\n");
} else if (argc > 1 && strcmp(argv[1], "-m")==0) {
// multi process
pid_t pid = fork();
if (pid < 0) {
printf("fork error\n");
return -1;
}
if (pid == 0) {
loop(100000);
printf("child stoped\n");
} else if (pid > 0) {
loop(10000);
printf("father stoped\n");
wait(NULL);
}
}
return 0;
}
// makefile
GPerfTools=/home/adenzhang/tools/gperftools
CCFLAGS=-fno-omit-frame-pointer -g -Wall
ALL_BINS=main_cmd_argv
all:$(ALL_BINS)
main_cmd_argv:main_cmd_argv.o
g++ $(CCFLAGS) -o $@ $^ -L./ -L$(GPerfTools)/lib -Wl,-Bdynamic -lprofiler -lunwind
.cpp.o:
g++ $(CCFLAGS) -c -I./ -I$(GPerfTools)/include -fPIC -o $@ $<
clean:
rm -f $(ALL_BINS) *.o *.prof
// shell command
$ make
g++ -fno-omit-frame-pointer -g -Wall -c -I./ -I/home/adenzhang/tools/gperftools/include -fPIC -o main_cmd_argv.o main_cmd_argv.cpp
g++ -fno-omit-frame-pointer -g -Wall -o main_cmd_argv main_cmd_argv.o -L./ -L/home/adenzhang/tools/gperftools/lib -Wl,-Bdynamic -lprofiler -lunwind
$ env CPUPROFILE=main_cmd_argv.prof ./main_cmd_argv -s
main_cmd_argv.prof
(null)
stoped
PROFILE: interrupts/evictions/bytes = 6686/3564/228416
$ /home/adenzhang/tools/gperftools/bin/pprof --text ./main_cmd_argv ./main_cmd_argv.prof
Using local file ./main_cmd_argv.
Using local file ./main_cmd_argv.prof.
Removing killpg from all stack traces.
Total: 6686 samples
6686 100.0% 100.0% 6686 100.0% loop
0 0.0% 100.0% 6686 100.0% __libc_start_main
0 0.0% 100.0% 6686 100.0% _start
0 0.0% 100.0% 6686 100.0% main
$ rm main_cmd_argv.prof
$ env CPUPROFILE=main_cmd_argv.prof ./main_cmd_argv -m
main_cmd_argv.prof
(null)
father stoped
child stoped
PROFILE: interrupts/evictions/bytes = 0/0/64
PROFILE: interrupts/evictions/bytes = 68/36/2624
$ ls
main_cmd_argv main_cmd_argv.cpp main_cmd_argv.o main_cmd_argv.prof Makefile
$ /home/adenzhang/tools/gperftools/bin/pprof --text ./main_cmd_argv ./main_cmd_argv.prof
Using local file ./main_cmd_argv.
Using local file ./main_cmd_argv.prof.
$
It seems that gperftools does not support multiple processes. Could anyone please explain? Thanks!

Quite old, don't know if you found an answer or not, but...
Seems like every thread/fork should register itself using ProfilerRegisterThread();
You can find more information in those two issues: Here and Here.
Also here is an example code, similar to your test case where the forks can be registered.

I'm currently using gperftools to profile an MPI program and came across this problem. After some searching, I found that ProfilerStart(_YOUR_PROF_FILE_NAME_) and ProfilerStop() should be called in every sub-process, and _YOUR_PROF_FILE_NAME_ must differ between processes. You can then analyze the performance of each process separately.
Link (also asked by ZRJ):
https://groups.google.com/forum/#!topic/google-perftools/bmysZILR4ik
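The per-process ProfilerStart/ProfilerStop idea from the answer above could be sketched roughly as follows. This is only an illustration: profile_name_for and profile_across_fork are hypothetical helpers, not part of gperftools, and the start/stop callables are passed in so the sketch itself depends only on the standard headers. It also assumes CPUPROFILE is left unset, so the explicit calls are the only thing that starts the profiler.

```cpp
#include <string>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: builds "<base>.<pid>" so every process writes
// its profile to a different file.
std::string profile_name_for(const std::string& base, pid_t pid) {
    return base + "." + std::to_string(static_cast<long long>(pid));
}

// Hypothetical helper: forks once, then parent and child each bracket
// their own work with start(...) / stop(). With gperftools, `start` and
// `stop` would be ProfilerStart and ProfilerStop from
// <gperftools/profiler.h>.
// Returns the child's exit status, or -1 if fork failed or the child
// did not exit normally.
template <typename Start, typename Stop, typename ParentWork, typename ChildWork>
int profile_across_fork(const std::string& base, Start start, Stop stop,
                        ParentWork parent_work, ChildWork child_work) {
    pid_t pid = fork();
    if (pid < 0) {
        return -1;
    }
    if (pid == 0) {
        // Child: start a fresh profile under its own file name.
        start(profile_name_for(base, getpid()).c_str());
        child_work();
        stop();
        _exit(0);
    }
    // Parent: profile its own share of the work, then reap the child.
    start(profile_name_for(base, getpid()).c_str());
    parent_work();
    stop();
    int status = 0;
    waitpid(pid, &status, 0);
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

With the test program from the question, a call might look like profile_across_fork("main_cmd_argv.prof", ProfilerStart, ProfilerStop, [] { loop(10000); }, [] { loop(100000); }), linked with -lprofiler as in the makefile. Each resulting file can then be passed to pprof separately.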

Related

application using lttng compile errors with aarch64-xilinx-linux-g++

I am trying to port LTTng to a Xilinx MPSoC running Linux. I wrote a demo identical to the one in the LTTng guide "Record user application events", and it runs perfectly on Ubuntu:
g++ -c -I. hello-tp.c
g++ -c hello.c
g++ -o hello hello-tp.o hello.o -llttng-ust -ldl
But when I compile it for the ARM Linux platform, I get errors:
aarch64-xilinx-linux-g++ -mcpu=cortex-a72.cortex-a53 -march=armv8-a+crc -fstack-protector-strong -D_FORTIFY_SOURCE=2 -Wformat -Wformat-security -Werror=format-security --sysroot=/home/david/project/zcu102/images/linux/sdk/sysroots/cortexa72-cortexa53-xilinx-linux -O2 -pipe -g -feliminate-unused-debug-types -c -I. hello-tp.c
In file included from hello-tp.c:4:
hello-tp.h:16:27: error: expected constructor, destructor, or type conversion before ‘(’ token
16 | LTTNG_UST_TRACEPOINT_EVENT(hello_world, my_first_tracepoint, LTTNG_ARGS, LTTNG_FIELDS)
| ^
make: *** [Makefile:14: hello-tp.o] Error 1
here is the code
hello-tp.h:
#undef LTTNG_UST_TRACEPOINT_PROVIDER
#define LTTNG_UST_TRACEPOINT_PROVIDER hello_world
#undef LTTNG_UST_TRACEPOINT_INCLUDE
#define LTTNG_UST_TRACEPOINT_INCLUDE "./hello-tp.h"
#if !defined(_HELLO_TP_H) || defined(LTTNG_UST_TRACEPOINT_HEADER_MULTI_READ)
#define _HELLO_TP_H
#include <lttng/tracepoint.h>
#define LTTNG_ARGS LTTNG_UST_TP_ARGS(int, my_integer_arg, char *, my_string_arg)
#define LTTNG_FIELDS LTTNG_UST_TP_FIELDS(lttng_ust_field_string(my_string_field, my_string_arg) lttng_ust_field_integer(int, my_integer_field, my_integer_arg))
LTTNG_UST_TRACEPOINT_EVENT(hello_world, my_first_tracepoint, LTTNG_ARGS, LTTNG_FIELDS)
#endif /* _HELLO_TP_H */
#include <lttng/tracepoint-event.h>
hello-tp.c
#define LTTNG_UST_TRACEPOINT_CREATE_PROBES
#define LTTNG_UST_TRACEPOINT_DEFINE
#include "hello-tp.h"
hello.c
#include <stdio.h>
#include "hello-tp.h"
int main(int argc, char *argv[])
{
unsigned int i;
puts("Hello, World!\nPress Enter to continue...");
/*
* The following getchar() call only exists for the purpose of this
* demonstration, to pause the application in order for you to have
* time to list its tracepoints. You don't need it otherwise.
*/
getchar();
/*
* An lttng_ust_tracepoint() call.
*
* Arguments, as defined in `hello-tp.h`:
*
* 1. Tracepoint provider name (required)
* 2. Tracepoint name (required)
* 3. `my_integer_arg` (first user-defined argument)
* 4. `my_string_arg` (second user-defined argument)
*
* Notice the tracepoint provider and tracepoint names are
* C identifiers, NOT strings: they're in fact parts of variables
* that the macros in `hello-tp.h` create.
*/
lttng_ust_tracepoint(hello_world, my_first_tracepoint, 23,
"hi there!");
for (i = 0; i < argc; i++) {
lttng_ust_tracepoint(hello_world, my_first_tracepoint,
i, argv[i]);
}
puts("Quitting now!");
lttng_ust_tracepoint(hello_world, my_first_tracepoint,
i * i, "i^2");
return 0;
}
Makefile
APP = hello
# Add any other object files to this list below
APP_OBJS = hello-tp.o hello.o
all: build
build: $(APP)
$(APP): $(APP_OBJS)
$(CXX) -o $@ $(APP_OBJS) $(LDFLAGS) -llttng -ldl
hello-tp.o : hello-tp.c hello-tp.h
$(CXX) $(CXXFLAGS) -c -I. $<
hello.o : hello.c
$(CXX) $(CXXFLAGS) -c $<
clean:
rm -f $(APP) *.o
Has anyone run into this issue? I suspect the compiler is the cause, but I haven't found any clue...
I just ran into this problem. Check your LTTng version. The 2.13 release (current) uses LTTNG_UST_TRACEPOINT_PROVIDER, whereas older releases use TRACEPOINT_PROVIDER; the LTTNG_UST prefix has been added throughout the API. See https://lttng.org/man/3/lttng-ust/v2.13/#doc-_compatibility_with_previous_apis

OpenMPI runtime error : Hello World

I'm able to successfully compile my code when I execute the make command. However, when I run the code as:
mpirun -np 4 test
The error generated is:
-------------------------------------------------------
Primary job terminated normally, but 1 process returned
a non-zero exit code.. Per user-direction, the job has been aborted.
-------------------------------------------------------
--------------------------------------------------------------------------
mpirun detected that one or more processes exited with non-zero status, thus causing
the job to be terminated. The first process to do so was:
Process name: [[63067,1],2]
Exit code: 1
--------------------------------------------------------------------------
I don't have multiple MPI installations, so I don't expect that to be the problem.
I've been having trouble with my Hello World OpenMPI program. My main file is :
#include <iostream>
#include "mpi.h"
using namespace std;
int main(int argc, const char * argv[]) {
MPI_Init(NULL, NULL);
int size, rank;
MPI_Comm_size(MPI_COMM_WORLD, &size);
MPI_Comm_rank(MPI_COMM_WORLD, &rank);
cout << "The number of spawned processes are " << size << "And this is the process " << rank;
MPI_Finalize();
return 0;
}
My makefile is:
# Compiler
CXX = mpic++
# Compiler flags
CFLAGS = -Wall -lm
# Header and Library Paths
INCLUDE = -I/usr/local/include -I/usr/local/lib -I..
LIBRARY_INCLUDE = -L/usr/local/lib
LIBRARIES = -l mpi
# the build target executable
TARGET = test
all: $(TARGET)
$(TARGET): main.cpp
$(CXX) $(CFLAGS) -o $(TARGET) main.cpp $(INCLUDE) $(LIBRARY_INCLUDE) $(LIBRARIES)
clean:
rm $(TARGET)
The output of: mpic++ --version is:
Configured with: --prefix=/Applications/Xcode.app/Contents/Developer/usr --with-gxx-include-dir=/usr/include/c++/4.2.1
Apple LLVM version 8.1.0 (clang-802.0.42)
Target: x86_64-apple-darwin16.5.0
Thread model: posix
InstalledDir: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin
And that for mpirun --version is:
mpirun (Open MPI) 2.1.1
Report bugs to http://www.open-mpi.org/community/help/
What could be causing the issue?
This is now resolved. It turns out that I have to execute with
mpirun -np 4 ./test
Ref: users-request@lists.open-mpi.org
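The thread does not spell out why ./test helps. A likely explanation (an assumption, not something stated above) is that a bare name such as test is looked up on PATH, where it matches the standard test utility, which exits with status 1 when given no arguments. The lookup rule can be sketched with a hypothetical helper, resolve_command, that mimics the usual shell/execvp-style behavior:

```cpp
#include <sstream>
#include <string>
#include <unistd.h>

// Hypothetical helper mimicking the usual lookup rule:
// a name containing '/' is used exactly as given, while a bare name is
// searched for in each directory of the PATH-style list `path_env`.
// Returns the first executable candidate, or an empty string if none.
std::string resolve_command(const std::string& name,
                            const std::string& path_env) {
    if (name.find('/') != std::string::npos) {
        return name;  // "./test" bypasses the PATH search entirely
    }
    std::istringstream dirs(path_env);
    std::string dir;
    while (std::getline(dirs, dir, ':')) {
        if (dir.empty()) {
            dir = ".";  // an empty PATH entry means the current directory
        }
        const std::string candidate = dir + "/" + name;
        if (access(candidate.c_str(), X_OK) == 0) {
            return candidate;
        }
    }
    return "";
}
```

Calling resolve_command("test", getenv("PATH")) on a typical system would be expected to return a system binary rather than the freshly built ./test, which is why naming the target something other than test, or always launching it with an explicit ./ prefix, avoids the confusion.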

calling std::system with parameter containing space in double quotes

I need to call std::system on Linux with a parameter that contains a string with spaces. So that the called program receives it as a single argc/argv entry, I want to pass it in double quotes.
std::string cmdline = "myprogram -d \"I am a string\"" ;
If I print this string with cout, it looks right.
But when I pass it to std::system(cmdline.c_str())
and look at ps -ef, I get
myprogram -d I am a string
How can I keep "I am a string" as a single parameter for myprogram?
No problem:
dlaru@IKH-LAPTOP /cygdrive/c/Users/dlaru
$ cat test.cpp
#include <iostream>
#include <string>
#include <cstdlib>
int main()
{
std::string cmdline = "./see \"I am a string\"";
std::cout << cmdline << std::endl;
std::system(cmdline.c_str());
}
dlaru@IKH-LAPTOP /cygdrive/c/Users/dlaru
$ cat see.cpp
#include <iostream>
int main(int argc, char* argv[])
{
for (int i = 0; i < argc; ++i)
{
std::cout << argv[i] << std::endl;
}
}
dlaru@IKH-LAPTOP /cygdrive/c/Users/dlaru
$ g++ -std=c++11 test.cpp -o test
dlaru@IKH-LAPTOP /cygdrive/c/Users/dlaru
$ g++ -std=c++11 see.cpp -o see
dlaru@IKH-LAPTOP /cygdrive/c/Users/dlaru
$ ./test
./see "I am a string"
./see
I am a string
(tested in Cygwin)
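The run above shows the quoted text arriving as a single argv entry. The quotes disappear in ps -ef because the shell started by std::system removes them while splitting the command line, and ps prints the arguments joined by spaces. When the argument comes from a variable and may itself contain quotes, a small quoting helper can make this more robust. The following is a minimal sketch assuming a POSIX sh; shell_quote is a hypothetical name, not a standard library function.

```cpp
#include <string>

// Hypothetical helper: wraps `arg` in single quotes for a POSIX shell.
// Inside single quotes every character is literal, so the only character
// that needs care is the single quote itself: close the quoted string,
// emit an escaped quote, and reopen it ('\'').
std::string shell_quote(const std::string& arg) {
    std::string out = "'";
    for (char c : arg) {
        if (c == '\'') {
            out += "'\\''";
        } else {
            out += c;
        }
    }
    out += "'";
    return out;
}
```

A call could then be built as std::system(("myprogram -d " + shell_quote(value)).c_str()), where value is whatever string should reach myprogram as one parameter.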

clang analyzer memory leaks

Why doesn't clang/clang-analyzer catch that I forgot to free a and therefore have a memory leak? It's obvious. I looked at the man pages and I'm not sure which flags are required.
$ scan-build clang++ -std=c++11 a.cpp
scan-build: Using '/usr/bin/clang' for static analysis
scan-build: Removing directory '/tmp/scan-build-2013-10-02-2' because it contains no reports.
$ cat ./a.cpp
#include <iostream>
int main() {
int *a = new int;
*a = 8;
std::cout<< a << std::endl;
}

Why is there a difference using std::thread::hardware_concurrency() and boost::thread::hardware_concurrency()?

The problem itself is simple to describe. I'm comparing the std::thread library in C++11 with the boost::thread library.
This program:
#include <iostream>
#include <thread>
#include <boost/thread.hpp>
int main() {
std::cout << std::thread::hardware_concurrency() << std::endl;
std::cout << boost::thread::hardware_concurrency() << std::endl;
return 0;
}
gives me different results:
0
4
Why is that?
PS: The version of the gcc package is 4.6.2-1.fc16 (x86_64). I'm using
g++ test.cc -Wall -std=c++0x -lboost_thread-mt -lpthread
After reviewing /usr/include/c++/4.6.2/thread
it can be seen that the implementation is actually:
// Returns a value that hints at the number of hardware thread contexts.
static unsigned int
hardware_concurrency()
{ return 0; }
So problem solved. It's just another feature that hasn't been implemented in gcc 4.6.2
The method employed by your installation of boost is supported for your target, whereas your compiler does not support this feature for your target.
TFM says:
The number of hardware threads available on the current system (e.g. number of CPUs or cores or hyperthreading units), or 0 if this information is not available.
EDIT: corrected the first sentence, which originally had boost and the compiler reversed.
EDIT2: This feature is present on the trunk, but absent in 4.6.2:
~/tmp/gcc-4.6.2/libstdc++-v3/src> wc -l thread.cc
104 thread.cc
~/tmp/gcc-4.6.2/libstdc++-v3/src> grep concurrency thread.cc | wc -l
0
~/tmp/gcc-4.6.2/libstdc++-v3> grep -C 2 VERIFY testsuite/30_threads/thread/members/hardware_concurrency.cc
// Current implementation punts on this.
VERIFY( std::thread::hardware_concurrency() == 0 );
return 0;
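Since 0 is a permitted result meaning "unknown", calling code may want a fallback. The following is a minimal sketch, not taken from either library: detect_hardware_threads is a hypothetical helper, and it assumes a system where sysconf accepts _SC_NPROCESSORS_ONLN (available on Linux with glibc, but not guaranteed everywhere).

```cpp
#include <thread>
#include <unistd.h>

// Hypothetical helper: prefers the standard library's answer, and falls
// back to the OS when the library reports 0 ("not available").
unsigned int detect_hardware_threads() {
    const unsigned int reported = std::thread::hardware_concurrency();
    if (reported != 0) {
        return reported;
    }
    // Assumption: _SC_NPROCESSORS_ONLN is supported on this platform.
    const long online = sysconf(_SC_NPROCESSORS_ONLN);
    // If even that fails, assume a single hardware thread.
    return online > 0 ? static_cast<unsigned int>(online) : 1u;
}
```

With a standard library like the one shown above that always returns 0, this would take the sysconf path; with a newer one it simply forwards the library's value.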

Resources