How to daemonize a process that is linked to a shared library?

We are trying to daemonize a binary that is linked to one shared library. After the fork, the parent process exits, which causes the shared library to be detached.
Below is the stack trace.
#3 0x00007ffff7deb07a in _dl_fini () from /lib64/ld-linux-x86-64.so.2
#4 0x00007ffff67aace9 in __run_exit_handlers () from /lib64/libc.so.6
#5 0x00007ffff67aad37 in exit () from /lib64/libc.so.6
#6 0x00007ffff7b35ce2 in daemonize () at /home/CSDeveloper/CLLM420/1/src/mds.llmd/common/libllmd.c:677
#7 0x00007ffff7b35cfb in llm_waitForShutdown () at /home/CSDeveloper/CLLM420/1/src/mds.llmd/common/libllmd.c:665
#8 0x0000000000400c1d in main (argc=<optimized out>, argv=0x7fffffffdea8) at /home/CSDeveloper/CLLM420/1/src/mds.llmd/common/llmd.c:134
Can somebody tell us how to daemonize a process that is linked to a shared library?

Related

pthread_join hangs indefinitely __lll_lock_wait_private()

I have a multithreaded application where I spawn a few threads and call pthread_join() on them upon completion.
The main thread spawns the threads and waits in pthread_join() for the worker threads to finish. I am facing an issue where the main thread waits indefinitely in pthread_join() even though all the worker threads have exited, so the program hangs.
I confirmed that all worker threads have exited by running info threads in gdb, which lists only the main thread.
It is known that calling pthread_join() on a thread that has already exited returns immediately, but this seems different. This is the gdb stack trace.
#0 0x00007f45fefebeec in __lll_lock_wait_private () from /lib64/libc.so.6
#1 0x00007f45fef68a6f in _L_lock_5333 () from /lib64/libc.so.6
#2 0x00007f45fef62408 in _int_free () from /lib64/libc.so.6
#3 0x00007f45ffbe5088 in _dl_deallocate_tls () from /lib64/ld-linux-x86-64.so.2
#4 0x00007f45ff9bde67 in __free_stacks () from /lib64/libpthread.so.0
#5 0x00007f45ff9bdf7f in __deallocate_stack () from /lib64/libpthread.so.0
#6 0x00007f45ff9bff93 in pthread_join () from /lib64/libpthread.so.0
#7 0x00007f45f87a6fe1 in waitForWorkerThreadsToExit () at src/server.c:133
#8 ServerLoop (arg=<optimized out>) at src/server.c:662
#9 0x00007f45ff9bee25 in start_thread () from /lib64/libpthread.so.0
#10 0x00007f45fefde34d in clone () from /lib64/libc.so.6
I am on CentOS 7 with Linux kernel 3.10.
Can someone help? TIA
One of the other threads is exiting without releasing the lock. As suggested here, you can check the thread ID of the mutex's owner to find out which thread is the culprit.
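For a pthread_mutex_t, glibc on Linux records the kernel thread ID of the holder in an internal field, which is what you would print in gdb. The sketch below reads that field from C. It is an assumption-heavy illustration: __data.__owner is a glibc implementation detail, not a POSIX interface, and the helper names are hypothetical. Also note that the lock in the trace above is taken inside free(), so it is internal to malloc and may not record an owner the same way.

```c
#define _GNU_SOURCE
#include <pthread.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Kernel thread ID of the caller; gdb shows the same number as "LWP". */
long current_tid(void)
{
    return syscall(SYS_gettid);
}

/*
 * Hypothetical helper. On glibc, returns the TID stored in the mutex
 * (0 when nobody holds it). Returns -1 on other C libraries, where
 * this internal field is not available.
 */
long mutex_owner_tid(const pthread_mutex_t *m)
{
#ifdef __GLIBC__
    return m->__data.__owner;   /* glibc-internal, may change */
#else
    (void)m;
    return -1;
#endif
}
```

Comparing the returned value with the LWP numbers from info threads tells you whether the owner is still alive. If the owner is not in that list, the thread exited while holding the lock.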

segmentation fault when running MATLAB in glnx86 64bit [closed]

Closed 10 years ago.
I have a problem starting MATLAB 2011b on an Ubuntu server with an Intel Xeon processor.
I installed this version of MATLAB with a network license, and the installation appeared to complete without problems (at least there were no warnings).
When I start MATLAB using the binary named MATLAB in
/usr/local/MATLAB/R2011b/bin/glnx86
the program receives SIGSEGV and dumps core.
I got the following backtrace using gdb.
#0 0xb7feb2b6 in ?? () from /lib/ld-linux.so.2
#1 0xb7ff0dba in ?? () from /lib/ld-linux.so.2
#2 0xb7feccbf in ?? () from /lib/ld-linux.so.2
#3 0xb7ff07e4 in ?? () from /lib/ld-linux.so.2
#4 0xb70e6be9 in ?? () from /lib/i386-linux-gnu/libdl.so.2
#5 0xb7feccbf in ?? () from /lib/ld-linux.so.2
#6 0xb70e733a in ?? () from /lib/i386-linux-gnu/libdl.so.2
#7 0xb70e6c97 in dlopen () from /lib/i386-linux-gnu/libdl.so.2
#8 0xb7f330e6 in utLoadLibrary () from /usr/local/MATLAB/R2011b/bin/glnx86/libut.so
#9 0xb2b2a1bc in ?? () from /usr/local/MATLAB/R2011b/bin/glnx86/libmwbinder.so
#10 0xb2b2a412 in Binder::_load_libs(std::vector<std::string, std::allocator<std::string> > const&) () from /usr/local/MATLAB/R2011b/bin/glnx86/libmwbinder.so
#11 0xb2b2bd48 in Binder::_load_and_resolve() () from /usr/local/MATLAB/R2011b/bin/glnx86/libmwbinder.so
#12 0xb2abf356 in ?? () from /usr/local/MATLAB/R2011b/bin/glnx86/libmwblas.so
#13 0xb2abf484 in ?? () from /usr/local/MATLAB/R2011b/bin/glnx86/libmwblas.so
#14 0xb2abfd5d in zdotu_ () from /usr/local/MATLAB/R2011b/bin/glnx86/libmwblas.so
#15 0xb2ac23d0 in ?? () from /usr/local/MATLAB/R2011b/bin/glnx86/libmwblas.so
#16 0xb2ab82d8 in _init () from /usr/local/MATLAB/R2011b/bin/glnx86/libmwblas.so
#17 0xb7fece39 in ?? () from /lib/ld-linux.so.2
#18 0xb7fecf84 in ?? () from /lib/ld-linux.so.2
#19 0xb7fdf20f in ?? () from /lib/ld-linux.so.2
=========================================================
Any comments or help?
Any attention you can give this will be appreciated. Thanks.
You shouldn't start MATLAB directly from the architecture-specific directory. Try running /usr/local/MATLAB/R2011b/bin/matlab instead. The script performs some initialization and is platform-aware. This initialization is needed because MATLAB uses quite specific (older) versions of some libraries.
If the problem persists, though, I'd contact MathWorks customer service.

Segmentation Error when using a shared library

I'm building a shared library on Linux. The .so library was created successfully, but when I linked it to a test application (with an empty main) and ran the executable, I got a segmentation error: "Segmentation error (core dumped)".
When I debugged it with gdb and checked the backtrace, I got this output:
Program received signal SIGSEGV, Segmentation fault.
0x0073d5df in std::_Rb_tree_decrement(std::_Rb_tree_node_base*) () from /usr/lib/libstdc++.so.6
Missing separate debuginfos, use: debuginfo-install glibc-2.12.1-4.i686 libgcc-4.4.5-2.fc13.i686 libstdc++-4.4.5-2.fc13.i686 zlib-1.2.3-23.fc12.i686
(gdb) backtrace
#0 0x0073d5df in std::_Rb_tree_decrement(std::_Rb_tree_node_base*) () from /usr/lib/libstdc++.so.6
#1 0x0012d70c in ?? () from /opt/cuda/lib/libcudart.so.3
#2 0x0012df0c in ?? () from /opt/cuda/lib/libcudart.so.3
#3 0x0012c88a in ?? () from /opt/cuda/lib/libcudart.so.3
#4 0x00121435 in __cudaRegisterFatBinary () from /opt/cuda/lib/libcudart.so.3
#5 0x005d7bfd in __sti____cudaRegisterAll_55_tmpxft_00000fe6_00000000_26_MonteCarloPaeo_SM10_cpp1_ii_3a8af011() () from libsharedCUFP.so
#6 0x005db40d in __do_global_ctors_aux () from libsharedCUFP.so
#7 0x005a8748 in _init () from libsharedCUFP.so
#8 0x008abd00 in _dl_init_internal () from /lib/ld-linux.so.2
#9 0x0089d88f in _dl_start_user () from /lib/ld-linux.so.2
I'm not familiar with gdb debugging, and this is the first time I'm trying to build a shared library on Linux, but it seems to me that the problem has something to do with dynamic linking of the library.
If anyone has any idea about this error and could help me, I would be grateful.
It doesn't have anything to do with dynamic linking or shared libraries - one of the constructors in libsharedCUFP.so (I assume this is your shared library) is most probably passing an illegal address to a function in libcudart.so which crashes.
You simply need to debug your code.
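The reason an empty main can still crash is that library constructors run while the dynamic loader initializes the library (frames #6 to #9 in the trace), before main is entered. A minimal sketch of that ordering in C, using the GCC/Clang constructor attribute; the function names are hypothetical and stand in for whatever libsharedCUFP.so registers:

```c
/* Set by the constructor below; zero-initialized at load time. */
static int ctor_ran = 0;

/*
 * GCC/Clang extension: the loader calls this during initialization,
 * before main() starts. A crash in here happens even if main is empty.
 */
__attribute__((constructor))
static void fake_library_init(void)
{
    ctor_ran = 1;
}

/* Returns 1 if the constructor has already executed. */
int constructor_has_run(void)
{
    return ctor_ran;
}
```

Calling constructor_has_run() as the first statement of main() returns 1, which shows that the initialization code has already executed by then. In gdb you can set a breakpoint on the constructor (frame #5 in the trace) before running the program to step through it.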

Process terminated by signal 6, core shows kind of loop in libc

Analysis of the core of a process (terminated by signal 6) on Linux shows this backtrace:
Core was generated by `/opt/namsam/pac_rrc_qx_e1/bin/rrcprb'.
Program terminated with signal 6, Aborted.
#0 0x0000005555ffb004 in epoll_wait () from /lib64/libc.so.6
(gdb) bt
#0 0x0000005555ffb004 in epoll_wait () from /lib64/libc.so.6
#1 0x0000005555ffafe8 in __epoll_wait_nocancel () from /lib64/libc.so.6
#2 0x0000005555ffafe8 in __epoll_wait_nocancel () from /lib64/libc.so.6
#3 0x0000005555ffafe8 in __epoll_wait_nocancel () from /lib64/libc.so.6
#4 0x0000005555ffafe8 in __epoll_wait_nocancel () from /lib64/libc.so.6
#5 0x0000005555ffafe8 in __epoll_wait_nocancel () from /lib64/libc.so.6
#6 0x0000005555ffafe8 in __epoll_wait_nocancel () from /lib64/libc.so.6
#7 0x0000005555ffafe8 in __epoll_wait_nocancel () from /lib64/libc.so.6
libc seems to have gone into some kind of loop. Did something go wrong with the application "rrcprb" here? Please help me debug this issue.
Since __epoll_wait_nocancel does not call itself, it's pretty clear that the stack trace you've got is bogus. Most likely cause is incorrect unwind descriptors in your libc.so.6.
It's also somewhat unlikely that you actually crashed in epoll_wait. Try thread apply all where, and see if there is a "more interesting" stack trace / thread for you to look at.
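One more hint that the thread of interest is elsewhere: on Linux, signal 6 is SIGABRT, which is what abort() raises (for example from a failed assert()), and epoll_wait does not do that by itself. The sketch below, with a hypothetical helper name, shows how a process killed by abort() is reported to its parent:

```c
#include <signal.h>
#include <stdlib.h>
#include <sys/resource.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: forks a child that calls abort() and returns
 * the number of the signal that terminated it, or -1 on error.
 */
int signal_from_abort(void)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: disable core files for this demo, then abort. */
        struct rlimit no_core = {0, 0};
        setrlimit(RLIMIT_CORE, &no_core);
        abort();
    }

    int status = 0;
    if (waitpid(pid, &status, 0) != pid)
        return -1;

    return WIFSIGNALED(status) ? WTERMSIG(status) : -1;
}
```

So in the output of thread apply all where, a thread whose stack contains abort() or raise() is the most likely candidate.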

Not able to analyze a core dump for a multithreaded application (help required)

I am working on a multithreaded application. Whenever the process crashes it generates a core as shown below, and I am not able to understand where it is actually crashing.
GNU gdb Red Hat Linux (6.5-25.el5rh)
Copyright (C) 2006 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu"...(no debugging symbols found)
Using host libthread_db library "/lib64/libthread_db.so.1".
warning: exec file is newer than core file.
Core was generated by `multithreadprocess '.
Program terminated with signal 11, Segmentation fault.
#0 0x0000000000448f7a in std::ostream::operator<< ()
(gdb) where
#0 0x000000000044bd32 in std::ostream::operator<< ()
#1 0x0000000000450b21 in std::ostream::operator<< ()
#2 0x000000000042eda9 in std::string::operator= ()
#3 0x00000030582062e7 in start_thread () from /lib64/libpthread.so.0
#4 0x00000030576ce3bd in clone () from /lib64/libc.so.6
(gdb) thread apply all bt
Thread 6 (process 11674):
#0 0x000000305820a687 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1 0x0000000000431140 in std::string::operator= ()
#2 0x00000030582062e7 in start_thread () from /lib64/libpthread.so.0
#3 0x00000030576ce3bd in clone () from /lib64/libc.so.6
Thread 5 (process 11683):
#0 0x000000305820cbfb in write () from /lib64/libpthread.so.0
#1 0x0000000000449151 in std::ostream::operator<< ()
#2 0x000000000043b74a in std::string::operator= ()
#3 0x000000000046c3f4 in std::string::substr ()
#4 0x000000000046e3c1 in std::string::substr ()
#5 0x00000000004305a4 in std::string::operator= ()
#6 0x00000030582062e7 in start_thread () from /lib64/libpthread.so.0
#7 0x00000030576ce3bd in clone () from /lib64/libc.so.6
Thread 4 (process 11744):
#0 0x00000030576c5896 in poll () from /lib64/libc.so.6
#1 0x0000000000474f1c in std::string::substr ()
#2 0x000000000043b889 in std::string::operator= ()
#3 0x0000000000474dbc in std::string::substr ()
#4 0x00000000004306a5 in std::string::operator= ()
#5 0x00000030582062e7 in start_thread () from /lib64/libpthread.so.0
#6 0x00000030576ce3bd in clone () from /lib64/libc.so.6
Thread 3 (process 11864):
#0 0x000000305820a687 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1 0x0000000000431140 in std::string::operator= ()
#2 0x00000030582062e7 in start_thread () from /lib64/libpthread.so.0
#3 0x00000030576ce3bd in clone () from /lib64/libc.so.6
Thread 2 (process 11866):
#0 0x000000305820a687 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1 0x0000000000431140 in std::string::operator= ()
#2 0x00000030582062e7 in start_thread () from /lib64/libpthread.so.0
#3 0x00000030576ce3bd in clone () from /lib64/libc.so.6
Thread 1 (process 11865):
#0 0x000000000044bd32 in std::ostream::operator<< ()
#1 0x0000000000450b21 in std::ostream::operator<< ()
#2 0x000000000042eda9 in std::string::operator= ()
#3 0x00000030582062e7 in start_thread () from /lib64/libpthread.so.0
#4 0x00000030576ce3bd in clone () from /lib64/libc.so.6
If I run bt full, it shows this:
(gdb) bt full
#0 0x000000000044bd32 in std::ostream::operator<< ()
No symbol table info available.
#1 0x0000000000450b21 in std::ostream::operator<< ()
No symbol table info available.
#2 0x000000000042eda9 in std::string::operator= ()
No symbol table info available.
#3 0x00000030582062e7 in start_thread () from /lib64/libpthread.so.0
No symbol table info available.
#4 0x00000030576ce3bd in clone () from /lib64/libc.so.6
No symbol table info available.
GDB 6.5 is quite old. You will likely get significantly better stack traces from (current) GDB 7.0.1.
You also appear to be trying to debug optimized code, built without -g flag, and you may not be debugging the right executable (GDB warns that your executable is newer than your core).
Make sure that your executable and all the libraries listed in info shared GDB output exactly match between the system where your core was produced and the system on which you are analyzing the core (if they are not the same) -- this is paramount -- if there is a mismatch, you'll likely get bogus stack traces (and the stack traces you've posted do look completely bogus to me).
Looks to me like you're using iostream inside a multithreaded application without the appropriate flags. See this. In particular, note that it says
When you build an application that uses the iostream classes of the libC library to run in a multithreaded environment, compile and link the source code of the application using the -mt option. This option passes -D_REENTRANT to the preprocessor and -lthread to the linker.
This is for a particular platform; your requirements may vary.

Resources