How to prevent some values from being optimized out in Linux kernel debugging? [duplicate]

This question already has answers here:
Is there a way to tell GCC not to optimise a particular piece of code?
(3 answers)
Closed last year.
This is code from Linux (5.4.21).
When I run the kernel in a virtual machine and attach gdb to it, I can set breakpoints and follow the code. For example, I set a breakpoint on the function arm_smmu_device_probe. When I step with the 'next' command, some values, for example 'smmu' or 'dev' below, are shown as optimized out. How can I keep them from being optimized out so that I can inspect them in gdb?
static int arm_smmu_device_probe(struct platform_device *pdev)
{
    int irq, ret;
    struct resource *res;
    resource_size_t ioaddr;
    struct arm_smmu_device *smmu;
    struct device *dev = &pdev->dev;
    bool bypass;

    smmu = devm_kzalloc(dev, sizeof(*smmu), GFP_KERNEL);
    if (!smmu) {
        dev_err(dev, "failed to allocate arm_smmu_device\n");
        return -ENOMEM;
    }
    smmu->dev = dev;

    if (dev->of_node) {
        ret = arm_smmu_device_dt_probe(pdev, smmu);
    } else {
        ret = arm_smmu_device_acpi_probe(pdev, smmu);
        if (ret == -ENODEV)
            return ret;
    }
I tried changing -O2 to -Og in the top-level Makefile, but then the kernel build fails.

Recently I found out how to do this (from Is there a way to tell GCC not to optimise a particular piece of code?, flolo's answer).
If you want a function aaa(...) not to be optimized, you can do it like this:
#pragma GCC push_options
#pragma GCC optimize ("O0")
aaa (...)
{
    /* function body */
}
#pragma GCC pop_options
In some cases, putting these #pragmas around the function causes a discrepancy between the #include'd header file and the function source. In that case (not often) you need to add the #pragmas around the corresponding #include statement as well. If linux/bbb.h causes this kind of problem, do this:
#pragma GCC push_options
#pragma GCC optimize ("O0")
#include <linux/bbb.h>
#pragma GCC pop_options
This works for sure, and I'm enjoying(?) debugging/analysis this way.
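The same effect is also available per function through GCC's optimize attribute, which avoids the push/pop pragma pair. A minimal sketch outside the kernel tree (my own example; the function names are made up):

/* sketch.c - compile with: gcc -O2 -g -c sketch.c
 * aaa() is forced to -O0 so its locals stay visible in gdb; bbb() keeps -O2. */
int __attribute__((optimize("O0"))) aaa(int x)
{
    int local = x * 2;   /* not optimized away; inspectable in gdb */
    return local + 1;
}

int bbb(int x)
{
    return x * 2 + 1;    /* optimized normally */
}

Inside the kernel tree, kbuild can also lower optimization for a single file with a per-object flag in the directory Makefile, something like CFLAGS_<object>.o += -Og (the exact object name depends on the driver file), which avoids touching the source at all.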

Related

c++ ~ shared object -> get host application offsets

I'm writing a shared library for a FreeBSD application.
The library gets loaded via LD_PRELOAD.
The application has multiple compiled versions, so some function offsets change and my library won't work across them.
Now I want to read the offsets when the library is loaded.
Since the offsets change, I think my only option is to look up the offsets of specific function names.
The offsets are simply the addresses of functions or labels.
Now the problem: how do I do it?
Example
In the first version, I call main like this:
int(*main)(int argc, char *argv[])=(int(*)(int,char*[]))0x081F3XXX;
but in the second, the offset has changed:
int(*main)(int argc, char *argv[])=(int(*)(int,char*[]))0x08233XXX;
Programmers (me) are lazy and don't want to compile their libs for every version. I want to create one lib that works for every version!
I simply need the offsets of the functions by function name; the rest is no problem.
This is how I load the library:
LD_PRELOAD="/path/to/library.so" ./executable
or
env LD_PRELOAD="/path/to/library.so" ./executable
Edit with test code
Here is my test code, following up on the comments:
Main.cpp:
#include <stdio.h>

void test() {
    printf("Test done.\n");
}

int main(int argc, char * argv[]) {
    printf("Program started\n");
    test();
}
lib.cpp
#include <stdio.h>
#include <dlfcn.h>

void __attribute__ ((constructor)) my_load(void);

void my_load(void) {
    printf("Library loaded\n");
    printf("test - offset: 0x%x\n", dlsym(NULL, "test"));
}
test.sh
g++ main.cpp -o program
g++ -shared lib.cpp -o lib.so
env LD_PRELOAD="lib.so" ./program
-> Result:
Library loaded
test - offset: 0x0
Program started
Test done.
It does not seem to work :s
Edit 15:45
printf("test - offset: 0x%x\n",dlsym(dlopen("/home/test/test_proc/program",RTLD_GLOBAL),"test"));
This also does not work. Maybe dlsym is the wrong approach?
I reproduced your program on Mac OS X using Clang, and found a solution. First, the boring parts:
To make it compile cleanly I had to change your %x format specifier to %p for the pointer.
Then, on Mac OS X I had to pass RTLD_MAIN_ONLY as the first argument to dlsym(). I guess this is platform-dependent; on Linux it does seem to be NULL as you have.
Now, the meat of the fix!
You're searching with dlsym() for a symbol called test. But there is no such symbol in your application. Why? Because you're using C++, and C++ does "name mangling." You could use any number of tools to figure out the mangled name and try to load that with dlsym(), but it could change with different compilers. So instead, just inhibit name mangling by enclosing your test() function in extern "C":
extern "C" {
void test() {
printf("Test done.\n");
}
}
This fixed it for me:
$ DYLD_INSERT_LIBRARIES=lib.so ./program
Library loaded
test - offset: 0x1027d1eb0
Program started
Test done.
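For completeness, here is a minimal sketch of what the preload library can look like on Linux/glibc once test is exported unmangled as above (illustrative only; RTLD_DEFAULT is glibc's handle for "search everything already loaded" and needs _GNU_SOURCE):

/* lib.c - sketch of the corrected preload library (Linux/glibc assumed).
 * Build: gcc -shared -fPIC lib.c -o lib.so -ldl
 * Run:   LD_PRELOAD=./lib.so ./program */
#define _GNU_SOURCE
#include <stdio.h>
#include <dlfcn.h>

__attribute__((constructor))
static void my_load(void)
{
    printf("Library loaded\n");
    /* RTLD_DEFAULT searches the main program and all loaded libraries;
       %p is the right conversion for printing a pointer. */
    void *addr = dlsym(RTLD_DEFAULT, "test");
    printf("test - address: %p\n", addr);
}

On Linux you will likely also need to link the program itself with -rdynamic (or -Wl,--export-dynamic) so that test actually ends up in the executable's dynamic symbol table; otherwise dlsym returns NULL even with the extern "C" fix.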

context sharing in FreeGLUT under Linux with xorg

I am trying to use OpenGL with a shared context (because I want to share textures between windows) via the FreeGLUT library. It works fine and I can share textures, but it fails at the end of the program or when windows are closed with the mouse.
I have created code that reproduces the problem: (http://pastie.org/9437038)
// file: main.c
// compile: gcc -o test -lglut main.c
// compile: gcc -o test -lglut -DTIME_LIMIT main.c
#include "GL/freeglut.h"
#include <unistd.h>
#include <stdio.h>
int main(int argc, char *argv[])
{
    int winA, winB, winC;
    int n;

    glutInit(&argc, argv);
    glutSetOption(GLUT_ACTION_ON_WINDOW_CLOSE, GLUT_ACTION_CONTINUE_EXECUTION);
    //glutSetOption(GLUT_RENDERING_CONTEXT, GLUT_USE_CURRENT_CONTEXT);
    glutInitDisplayMode(GLUT_DOUBLE | GLUT_RGB);

    winA = glutCreateWindow("Test A");
    glutSetOption(GLUT_RENDERING_CONTEXT, GLUT_USE_CURRENT_CONTEXT);
    winB = glutCreateWindow("Test B");
    winC = glutCreateWindow("Test C");

    printf("loop\n");
#ifdef TIME_LIMIT
    for (n = 0; n < 50; n++)
    {
        glutMainLoopEvent();
        usleep(5000);
    }
#else // TIME_LIMIT
    glutMainLoop();
#endif // TIME_LIMIT
    printf("Destroy winC\n");
    glutDestroyWindow(winC);
    printf("Destroy winB\n");
    glutDestroyWindow(winB);
    printf("Destroy winA\n");
    glutDestroyWindow(winA);
    printf("Normal end\n");
    return 0;
}
Output:
loop
X Error of failed request: GLXBadContext
Major opcode of failed request: 153 (GLX)
Minor opcode of failed request: 4 (X_GLXDestroyContext)
Serial number of failed request: 113
Current serial number in output stream: 114
Segmentation fault
output with TIME_LIMIT:
loop
Destroy winC
Destroy winB
Destroy winA
Segmentation fault
Without the call to glutSetOption(GLUT_RENDERING_CONTEXT, GLUT_USE_CURRENT_CONTEXT);, it works well.
Does anybody have an idea what I am doing wrong?
The option GLUT_USE_CURRENT_CONTEXT does not create shared contexts. It just means that the same GL context is used for all windows. You only have one GL context, and it is destroyed when you destroy the first window which uses it, so the subsequent destruction calls fail. None of the GLUT implementations I'm aware of actually supports GL context sharing.
GLUT_USE_CURRENT_CONTEXT is more of a hack (and it is not part of the GLUT specification anyway), and not really well implemented. It could use reference counting so that the context is not destroyed before the last window using it is destroyed, but that is simply not the case.
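If you do need real object sharing (textures visible from several contexts), that happens at the GLX level rather than through GLUT. A small self-contained sketch of the mechanism, not a FreeGLUT fix (assumes an X server and the GL/X11 development libraries; build with gcc share.c -o share -lGL -lX11):

/* share.c - sketch: two GLX contexts that share texture objects. */
#include <X11/Xlib.h>
#include <GL/glx.h>
#include <stdio.h>

int main(void)
{
    Display *dpy = XOpenDisplay(NULL);
    if (!dpy) { fprintf(stderr, "no X display\n"); return 1; }

    int attribs[] = { GLX_RGBA, GLX_DOUBLEBUFFER, None };
    XVisualInfo *vi = glXChooseVisual(dpy, DefaultScreen(dpy), attribs);
    if (!vi) { fprintf(stderr, "no suitable visual\n"); return 1; }

    /* The second context passes the first as its share list, so texture
       objects created in one are usable from the other. */
    GLXContext ctxA = glXCreateContext(dpy, vi, NULL, True);
    GLXContext ctxB = glXCreateContext(dpy, vi, ctxA, True);
    printf("contexts %p and %p share objects\n", (void *)ctxA, (void *)ctxB);

    glXDestroyContext(dpy, ctxB);
    glXDestroyContext(dpy, ctxA);
    XCloseDisplay(dpy);
    return 0;
}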

CUDA - Generating a random array on the GPU and modifying it with a kernel

In this code I'm generating a 1D array of floats on a GPU using CUDA. The numbers are between 0 and 1. For my purposes I need them to be between -1 and 1, so I made a simple kernel to multiply each element by 2 and then subtract 1 from it. However, something is going wrong here. When I print my original array into a .bmp I get this http://i.imgur.com/IS5dvSq.png (a typical noise pattern). But when I try to modify that array with my kernel I get a blank black picture http://imgur.com/cwTVPTG . The program runs, but in the debug output I get this:
First-chance exception at 0x75f0c41f in Midpoint_CUDA_Alpha.exe:
Microsoft C++ exception: cudaError_enum at memory location
0x003cfacc..
First-chance exception at 0x75f0c41f in Midpoint_CUDA_Alpha.exe:
Microsoft C++ exception: cudaError_enum at memory location
0x003cfb08..
First-chance exception at 0x75f0c41f in Midpoint_CUDA_Alpha.exe:
Microsoft C++ exception: [rethrow] at memory location 0x00000000..
I would be thankful for any help or even a little hint in this matter. Thanks!
(edited)
#include <device_functions.h>
#include <time.h>
#include <stdio.h>
#include <stdlib.h>
#include "stdafx.h"
#include "EasyBMP.h"
#include <curand.h> // curand.lib must be added in project properties > linker > input
#include "device_launch_parameters.h"

float *heightMap_cpu;
float *randomArray_gpu;

int randCount = 0;
int rozmer = 513;

void createRandoms(int size){
    curandGenerator_t generator;
    cudaMalloc((void**)&randomArray_gpu, size*size*sizeof(float));
    curandCreateGenerator(&generator, CURAND_RNG_PSEUDO_XORWOW);
    curandSetPseudoRandomGeneratorSeed(generator, (int)time(NULL));
    curandGenerateUniform(generator, randomArray_gpu, size*size);
}
__global__ void polarizeRandoms(int size, float *randomArray_gpu){
    int index = threadIdx.x + blockDim.x * blockIdx.x;
    if(index < size*size){
        randomArray_gpu[index] = randomArray_gpu[index]*2.0f - 1.0f;
    }
}

// helper function for getting an address in 1D using 2D coords
int ad(int x, int y){
    return x*rozmer + y;
}
void printBmp(){
    BMP AnImage;
    AnImage.SetSize(rozmer, rozmer);
    AnImage.SetBitDepth(24);
    int i, j;
    for(i = 0; i <= rozmer-1; i++){
        for(j = 0; j <= rozmer-1; j++){
            AnImage(i,j)->Red = (int)((heightMap_cpu[ad(i,j)]*127)+128);
            AnImage(i,j)->Green = (int)((heightMap_cpu[ad(i,j)]*127)+128);
            AnImage(i,j)->Blue = (int)((heightMap_cpu[ad(i,j)]*127)+128);
            AnImage(i,j)->Alpha = 0;
        }
    }
    AnImage.WriteToFile("HeightMap.bmp");
}

int main(){
    createRandoms(rozmer);
    polarizeRandoms<<<((rozmer*rozmer)/1024)+1, 1024>>>(rozmer, randomArray_gpu);
    heightMap_cpu = (float*)malloc((rozmer*rozmer)*sizeof(float));
    cudaMemcpy(heightMap_cpu, randomArray_gpu, rozmer*rozmer*sizeof(float), cudaMemcpyDeviceToHost);
    printBmp();

    //cleanup
    cudaFree(randomArray_gpu);
    free(heightMap_cpu);
    return 0;
}
This is wrong:
cudaMalloc((void**)&randomArray_gpu, size*size*sizeof(float));
We don't use cudaMalloc with __device__ variables. If you do proper cuda error checking I'm pretty sure that line will throw an error.
If you really want to use a __device__ pointer this way, you need to create a separate normal pointer, cudaMalloc that, then copy the pointer value to the device pointer using cudaMemcpyToSymbol:
float *my_dev_pointer;
cudaMalloc((void**)&my_dev_pointer, size*size*sizeof(float));
cudaMemcpyToSymbol(randomArray_gpu, &my_dev_pointer, sizeof(float *));
Whenever you are having trouble with your CUDA programs, you should do proper cuda error checking. It will likely focus your attention on what is wrong.
And, yes, kernels can access __device__ variables without the variable being passed explicitly as a parameter to the kernel.
The programming guide covers the proper usage of __device__ variables and the api functions that should be used to access them from the host.
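On the error-checking point, a minimal sketch of the usual pattern (CUDA_CHECK is a made-up macro name, not something the toolkit provides):

// Sketch of the usual CUDA error-checking pattern for host-side API calls.
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

#define CUDA_CHECK(call)                                              \
    do {                                                              \
        cudaError_t err = (call);                                     \
        if (err != cudaSuccess) {                                     \
            fprintf(stderr, "CUDA error %s at %s:%d\n",               \
                    cudaGetErrorString(err), __FILE__, __LINE__);     \
            exit(EXIT_FAILURE);                                       \
        }                                                             \
    } while (0)

// Usage:
//   CUDA_CHECK(cudaMalloc(&ptr, bytes));
//   kernel<<<blocks, threads>>>(...);
//   CUDA_CHECK(cudaGetLastError());        // reports launch errors
//   CUDA_CHECK(cudaDeviceSynchronize());   // reports errors during execution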

can a program read its own elf section?

I would like to use ld's --build-id option in order to add build information to my binary. However, I'm not sure how to make this information available inside the program. Assume I want to write a program that writes a backtrace every time an exception occurs, and a script that parses this information. The script reads the symbol table of the program and searches for the addresses printed in the backtrace (I'm forced to use such a script because the program is statically linked and backtrace_symbols is not working). In order for the script to work correctly I need to match build version of the program with the build version of the program which created the backtrace. How can I print the build version of the program (located in the .note.gnu.build-id elf section) from the program itself?
How can I print the build version of the program (located in the .note.gnu.build-id elf section) from the program itself?
You need to read the ElfW(Ehdr) (at the beginning of the file) to find program headers in your binary (.e_phoff and .e_phnum will tell you where program headers are, and how many of them to read).
You then read program headers, until you find PT_NOTE segment of your program. That segment will tell you offset to the beginning of all the notes in your binary.
You then need to read the ElfW(Nhdr) and skip the rest of the note (total size of the note is sizeof(Nhdr) + .n_namesz + .n_descsz, properly aligned), until you find a note with .n_type == NT_GNU_BUILD_ID.
Once you find NT_GNU_BUILD_ID note, skip past its .n_namesz, and read the .n_descsz bytes to read the actual build-id.
You can verify that you are reading the right data by comparing what you read with the output of readelf -n a.out.
P.S.
If you are going to go through the trouble to decode build-id as above, and if your executable is not stripped, it may be better for you to just decode and print symbol names instead (i.e. to replicate what backtrace_symbols does) -- it's actually easier to do than decoding ELF notes, because the symbol table contains fixed-sized entries.
Basically, this is the code I've written based on the answer given to my question. In order to compile the code I had to make some changes, and I hope it will work on as many platforms as possible. However, it was tested only on one build machine. One assumption I made is that the program is built on the machine which runs it, so there is no point in checking endianness compatibility between the program and the machine.
user#:~/$ uname -s -r -m -o
Linux 3.2.0-45-generic x86_64 GNU/Linux
user#:~/$ g++ test.cpp -o test
user#:~/$ readelf -n test | grep Build
Build ID: dc5c4682e0282e2bd8bc2d3b61cfe35826aa34fc
user#:~/$ ./test
Build ID: dc5c4682e0282e2bd8bc2d3b61cfe35826aa34fc
#include <elf.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#if __x86_64__
# define ElfW(type) Elf64_##type
#else
# define ElfW(type) Elf32_##type
#endif
/*
detecting build id of a program from its note section
http://stackoverflow.com/questions/17637745/can-a-program-read-its-own-elf-section
http://www.scs.stanford.edu/histar/src/pkg/uclibc/utils/readelf.c
http://www.sco.com/developers/gabi/2000-07-17/ch5.pheader.html#note_section
*/
int main (int argc, char* argv[])
{
    char *thefilename = argv[0];
    FILE *thefile;
    struct stat statbuf;
    ElfW(Ehdr) *ehdr = 0;
    ElfW(Phdr) *phdr = 0;
    ElfW(Nhdr) *nhdr = 0;

    if (!(thefile = fopen(thefilename, "r"))) {
        perror(thefilename);
        exit(EXIT_FAILURE);
    }
    if (fstat(fileno(thefile), &statbuf) < 0) {
        perror(thefilename);
        exit(EXIT_FAILURE);
    }

    ehdr = (ElfW(Ehdr) *)mmap(0, statbuf.st_size,
        PROT_READ|PROT_WRITE, MAP_PRIVATE, fileno(thefile), 0);

    /* walk the program headers until the PT_NOTE segment is found */
    phdr = (ElfW(Phdr) *)(ehdr->e_phoff + (size_t)ehdr);
    while (phdr->p_type != PT_NOTE)
    {
        ++phdr;
    }

    /* walk the notes; note sizes are assumed to be 4-byte aligned here,
       which holds for the GNU build-id note */
    nhdr = (ElfW(Nhdr) *)(phdr->p_offset + (size_t)ehdr);
    while (nhdr->n_type != NT_GNU_BUILD_ID)
    {
        nhdr = (ElfW(Nhdr) *)((size_t)nhdr + sizeof(ElfW(Nhdr)) + nhdr->n_namesz + nhdr->n_descsz);
    }

    unsigned char *build_id = (unsigned char *)malloc(nhdr->n_descsz);
    memcpy(build_id, (void *)((size_t)nhdr + sizeof(ElfW(Nhdr)) + nhdr->n_namesz), nhdr->n_descsz);

    printf(" Build ID: ");
    for (int i = 0; i < nhdr->n_descsz; ++i)
    {
        printf("%02x", build_id[i]);
    }
    free(build_id);
    printf("\n");

    return 0;
}
Yes, a program can read its own .note.gnu.build-id. The important piece is the dl_iterate_phdr function.
I've used this technique in Mesa (the OpenGL/Vulkan implementation) to read its own build-id for use with the on-disk shader cache.
I've extracted those bits into a separate project[1] for easy use by others.
[1] https://github.com/mattst88/build-id
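For reference, a minimal sketch of that approach (illustrative code, not extracted from Mesa or the project above; assumes glibc and an executable linked with --build-id):

/* buildid.c - print the running program's own GNU build-id.
 * Build: gcc -o buildid buildid.c   (add -Wl,--build-id if your toolchain
 * does not pass it by default) */
#define _GNU_SOURCE
#include <link.h>
#include <elf.h>
#include <stdio.h>
#include <string.h>

static int callback(struct dl_phdr_info *info, size_t size, void *data)
{
    (void)size; (void)data;
    /* The main executable is reported first, so the first build-id found
       below is the program's own. */
    for (int i = 0; i < info->dlpi_phnum; i++) {
        if (info->dlpi_phdr[i].p_type != PT_NOTE)
            continue;

        const char *p   = (const char *)(info->dlpi_addr + info->dlpi_phdr[i].p_vaddr);
        const char *end = p + info->dlpi_phdr[i].p_memsz;

        while (p + sizeof(ElfW(Nhdr)) <= end) {
            const ElfW(Nhdr) *nhdr = (const ElfW(Nhdr) *)p;
            const char *name = p + sizeof(ElfW(Nhdr));
            const unsigned char *desc =
                (const unsigned char *)(name + ((nhdr->n_namesz + 3) & ~3u));

            if (nhdr->n_type == NT_GNU_BUILD_ID &&
                nhdr->n_namesz == 4 && memcmp(name, "GNU", 4) == 0) {
                printf("Build ID: ");
                for (unsigned j = 0; j < nhdr->n_descsz; j++)
                    printf("%02x", desc[j]);
                printf("\n");
                return 1;   /* non-zero stops the iteration */
            }
            p = (const char *)desc + ((nhdr->n_descsz + 3) & ~3u);
        }
    }
    return 0;
}

int main(void)
{
    dl_iterate_phdr(callback, NULL);
    return 0;
}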

Getting stack traces on Unix systems, automatically

What methods are there for automatically getting a stack trace on Unix systems? I don't mean just getting a core file or attaching interactively with GDB, but having a SIGSEGV handler that dumps a backtrace to a text file.
Bonus points for the following optional features:
Extra information gathering at crash time (eg. config files).
Email a crash info bundle to the developers.
Ability to add this in a dlopened shared library
Not requiring a GUI
FYI,
the suggested solution (using backtrace_symbols in a signal handler) is dangerously broken. DO NOT USE IT -
Yes, backtrace and backtrace_symbols will produce a backtrace and translate it to symbolic names. However:
backtrace_symbols allocates memory using malloc and you use free to free it - if you're crashing because of memory corruption, your malloc arena is very likely to be corrupt and cause a double fault.
malloc and free protect the malloc arena with a lock internally. You might have faulted in the middle of a malloc/free with the lock taken, which will cause these functions, or anything that calls them, to deadlock.
You use puts, which uses the standard stream, which is also protected by a lock. If you faulted in the middle of a printf you once again have a deadlock.
On 32-bit platforms (e.g. your normal PC of 2 years ago), the kernel will plant a return address to an internal glibc function instead of your faulting function in your stack, so the single most important piece of information you are interested in - in which function did the program fault - will actually be corrupted on those platforms.
So, the code in the example is the worst kind of wrong - it LOOKS like it's working, but it will really fail you in unexpected ways in production.
BTW, interested in doing it right? check this out.
Cheers,
Gilad.
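For reference, a handler can at least avoid malloc and stdio by writing straight to a file descriptor with backtrace_symbols_fd; a sketch with made-up paths and sizes (note that glibc's backtrace may itself allocate the first time it is called, so it is called once at startup here):

/* trace.c - crash handler sketch using backtrace_symbols_fd (no malloc/stdio
 * inside the handler). Build: gcc -g trace.c -o trace */
#include <execinfo.h>
#include <signal.h>
#include <fcntl.h>
#include <unistd.h>
#include <stdlib.h>

static void crash_handler(int sig)
{
    (void)sig;
    void *frames[64];
    int n = backtrace(frames, 64);

    int fd = open("/tmp/backtrace.txt", O_CREAT | O_WRONLY | O_TRUNC, 0644);
    if (fd >= 0) {
        backtrace_symbols_fd(frames, n, fd);   /* writes directly, no malloc */
        close(fd);
    }
    backtrace_symbols_fd(frames, n, STDERR_FILENO);
    _exit(EXIT_FAILURE);                       /* do not return into the fault */
}

int main(void)
{
    void *warmup[1];
    backtrace(warmup, 1);        /* let backtrace do its one-time setup now */
    signal(SIGSEGV, crash_handler);

    volatile int *p = NULL;
    *p = 42;                     /* trigger the crash */
    return 0;
}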
If you are on systems with the BSD backtrace functionality available (Linux, OS X 10.5, BSD of course), you can do this programmatically in your signal handler.
For example (backtrace code derived from IBM example):
#include <execinfo.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
void sig_handler(int sig)
{
    void * array[25];
    int nSize = backtrace(array, 25);
    char ** symbols = backtrace_symbols(array, nSize);

    for (int i = 0; i < nSize; i++)
    {
        puts(symbols[i]);
    }
    free(symbols);

    signal(sig, &sig_handler);
}

void h()
{
    kill(0, SIGSEGV);
}

void g()
{
    h();
}

void f()
{
    g();
}

int main(int argc, char ** argv)
{
    signal(SIGSEGV, &sig_handler);
    f();
}
Output:
0 a.out 0x00001f2d sig_handler + 35
1 libSystem.B.dylib 0x95f8f09b _sigtramp + 43
2 ??? 0xffffffff 0x0 + 4294967295
3 a.out 0x00001fb1 h + 26
4 a.out 0x00001fbe g + 11
5 a.out 0x00001fcb f + 11
6 a.out 0x00001ff5 main + 40
7 a.out 0x00001ede start + 54
This doesn't get bonus points for the optional features (except not requiring a GUI), however, it does have the advantage of being very simple, and not requiring any additional libraries or programs.
Here is an example of how to get some more info using a demangler. As you can see, this one also logs the stack trace to a file.
#include <iostream>
#include <sstream>
#include <string>
#include <fstream>
#include <cxxabi.h>
#include <execinfo.h>
#include <csignal>
#include <cstdlib>

void sig_handler(int sig)
{
    std::stringstream stream;
    void * array[25];
    int nSize = backtrace(array, 25);
    char ** symbols = backtrace_symbols(array, nSize);

    for (int i = 0; i < nSize; i++) {
        int status;
        char *realname;
        std::string current = symbols[i];
        size_t start = current.find("(");
        size_t end = current.find("+");
        realname = NULL;
        if (start != std::string::npos && end != std::string::npos) {
            std::string symbol = current.substr(start+1, end-start-1);
            realname = abi::__cxa_demangle(symbol.c_str(), 0, 0, &status);
        }
        if (realname != NULL)
            stream << realname << std::endl;
        else
            stream << symbols[i] << std::endl;
        free(realname);
    }
    free(symbols);

    std::cerr << stream.str();

    std::ofstream file("/tmp/error.log");
    if (file.is_open()) {
        if (file.good())
            file << stream.str();
        file.close();
    }

    signal(sig, &sig_handler);
}
Derek's solution is probably the best, but here's an alternative anyway:
Recent Linux kernel versions allow you to pipe core dumps to a script or program. You could write a script to catch the core dump, collect any extra information you need, and mail everything back.
This is a global setting though, so it'd apply to any crashing program on the system. It also requires root rights to set up.
It can be configured through the /proc/sys/kernel/core_pattern file. Set that to something like '|/home/myuser/bin/my-core-handler-script' (the pipe character must come first).
The Ubuntu people use this feature as well.
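As an illustration of the mechanism (the paths and the handler program itself are made up), the kernel runs the handler with the core dump on its stdin, and %-specifiers in core_pattern pass metadata as arguments:

/* core-handler.c - sketch of a piped core handler.
 * Install (as root), e.g.:
 *   echo '|/home/myuser/bin/core-handler %e %p' > /proc/sys/kernel/core_pattern
 * %e is the executable name and %p the PID of the crashing process. */
#include <stdio.h>

int main(int argc, char *argv[])
{
    const char *exe = (argc > 1) ? argv[1] : "unknown";
    const char *pid = (argc > 2) ? argv[2] : "0";

    char path[256];
    snprintf(path, sizeof(path), "/home/myuser/cores/core.%s.%s", exe, pid);

    FILE *out = fopen(path, "wb");
    if (!out)
        return 1;

    /* Copy the core dump from stdin to the file; extra information
       (config files, an email step) could be collected here as well. */
    char buf[65536];
    size_t n;
    while ((n = fread(buf, 1, sizeof(buf), stdin)) > 0)
        fwrite(buf, 1, n, out);

    fclose(out);
    return 0;
}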
