Is there a simple way of setting the affinity of the application I'm debugging without locking gdb to the same core? I'm asking because the application runs with real-time priority and needs to run on a single core. At the moment I use this command line:
taskset -c 3 gdbserver :1234 ./app.out
but the application stops responding and freezes the gdb server, making debugging impossible. I suspect that the application's real-time priority prevents gdb from executing. If I start the application first and then start gdb without any affinity setting, I can attach and debug the application without gdb freezing.
Is there a simple way to start gdb and the application with different affinities? Or, preferably: is there a gdb command to set the affinity of the child process?
I found a solution: Use the --wrapper argument.
http://sourceware.org/gdb/onlinedocs/gdb/Server.html
gdbserver --wrapper taskset -c 3 -- :1234 ./app.out
I had a similar problem and found a solution for myself, in fact drawing inspiration from your question. As you suspect, it is possible that your gdbserver freezes when running on the same core because one of your application threads is using all of the core's cycles and gdbserver is not allowed to run because its priority is too low.
For my particular needs, I am using gdb with an application that employs realtime scheduling, running on the local machine. I don't mind if gdb runs on the same core, but I want to be able to debug my program with all of the application threads' priorities respected. For me, what made things work was expanding the gdb command to this more complex construction:
taskset -c 3 chrt 99 gdb
The chrt command added to your taskset command switches to the SCHED_RR policy and runs gdb at the specified priority. The threads being debugged run at a lower priority themselves, and so I assume they only run when gdb is not running.
I was having a problem before: when I asked gdb to resume execution after it had stopped at a breakpoint or similar, one thread would sometimes start running before a higher-priority thread was resumed, so the thread actually running was not always the one I expected. For me, the above command seems to fix everything, I assume because the application threads can only run once gdb has finished everything it needs to do in order to resume the program. So, the command line applicable in your case, if you wanted to try this out, would be:
taskset -c 3 chrt 99 gdbserver :1234 ./app.out
Note: this locks gdbserver to a particular core, but your real-time threads would likely (or at least could) run at a lower priority, so gdbserver would not freeze on you.
I am using the Linux watchdog driver /dev/watchdog on an embedded Linux system with BusyBox as the user-space tools. I want to trigger the watchdog from C/C++ code, which works fine for timeouts up to 60s:
watchdogFD = open( "/dev/watchdog", O_WRONLY );
int timeout = 60;
ioctl( watchdogFD, WDIOC_SETTIMEOUT, &timeout );
However, for larger intervals the timeout is accepted, but the watchdog still fires after 60s.
The Linux watchdog daemon offers a --force parameter to set timeouts larger than 60s (see https://linux.die.net/man/8/watchdog). However, the BusyBox watchdog daemon does not offer this (see https://git.busybox.net/busybox/tree/miscutils/watchdog.c?id=1572f520ccfe9e45d4cb9b18bb7b728eb2bbb571).
Does anyone have a suggestion for achieving the effect of --force when controlling the watchdog via ioctl? Thanks :)
It seems the BusyBox watchdog daemon you link to is very simple compared to the usual Linux one from here:
https://sourceforge.net/p/watchdog/code/ci/master/tree/
The --force option for the Linux daemon (above) overrides the sanity checks on the polling interval versus the hardware time-out used. It will not change any limits that a specific hardware driver/timer imposes.
Typically the hardware time-out is chosen in the 10-60 second range, depending on how long you can tolerate a major fault (like a kernel panic) persisting. The watchdog daemon that feeds the timer then has to poll at an interval that is at least a few seconds shorter, so nothing times out unexpectedly. Between polls it sleeps in nanosleep(), giving up its CPU time, so the daemon's system load is proportional to the polling rate and the type of tests that are run.
Without any tests all you protect against is a major fault killing either the daemon or kernel, so usually you should be checking for something else that is essential for normal operations (e.g. a specific process being alive, files being updated, test script can be run, etc) to get the most benefit.
Now, I know that gdb lets us switch between executing threads.
But for greater convenience, I'd like to know if it's possible to have as many terminal emulators open as there are threads in the application, with a gdb instance in each of these emulators, each bound to a particular thread?
and have a gdb instance in each of these emulators, each bound to a
particular thread
You can't attach multiple gdb instances to the same process. This is a limitation of ptrace syscall used by gdb. From man ptrace:
EPERM The specified process cannot be traced. This could be because the tracer has insufficient privileges (the required capability is CAP_SYS_PTRACE); unprivileged processes cannot trace processes that they cannot send signals to or those running set-user-ID/set-group-ID programs, for obvious reasons. Alternatively, the process may already be being traced, or (on kernels before 2.6.26) be init(1) (PID 1).
I am trying to create an application in userspace that sets the affinity of processes. I would like the program to be triggered immediately every time a new pid/tid is spawned by the kernel. I am attempting to write to a file node under /proc from the do_fork() function in the kernel, but I feel that may have too much overhead.
Does anyone know any alternatives to detect a new process creation immediately after it is spawned?
If monitoring do_fork() is the way to go, would a callback to a userspace program via a system call be faster than using a fs node to communicate?
Forkstat is a program that logs process fork() events (among other things).
Install it:
$ sudo apt-get install forkstat
Use it to log "fork" events:
$ forkstat -e fork
Use a socket with NETLINK_CONNECTOR. The kernel will tell you about process events, including fork()s and exec()s. You must have CONFIG_CONNECTOR and CONFIG_PROC_EVENTS enabled in your kernel.
Here's a related question with more details:
Detect launching of programs on Linux platform
For a complete socket NETLINK_CONNECTOR example, see:
http://bewareofgeek.livejournal.com/2945.html
As an aside, inotify does not work on /proc for detecting new processes:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/454722
execsnoop can be a good alternative to show new processes and arguments.
We have about 40 computers running identical hardware and software. They all run Ubuntu 11.10. They all have just one user account to log in. The .profile file is set up to launch a daemon process. The code for the daemon is written in C.
Once in a few weeks, we get a report that the daemon is no longer running. This does not happen on all computers but just one or two. We cannot reproduce the problem consistently.
Looking at the code, the application quits when it receives either SIGHUP or SIGTERM.
As I understand, SIGHUP is generated when a user logs off. In our case, the user never logs off. I am wondering if it is possible that SIGHUP could have been generated for some other reason. Any other thought would be appreciated.
Well, there are a couple of things to note about SIGHUP. Firstly, its origin is from the concept of a hang-up, i.e. loss of connection to a console over something like a modem. In modern parlance this generally means it has lost its controlling tty. Unless you've taken care to detach from your tty, any program started in a given terminal will receive a SIGHUP when the terminal is closed. See here for details on how to do this in your program. Other options include:
running your program inside screen or tmux
run your program with nohup or some other daemonising framework
The other possibility is something is deliberately sending your process a SIGHUP which by "tradition" is often used to signal a process that it should re-read its configuration.
Signals can be sent using kill utility or kill syscall.
Of course, you can try to find out who is sending that signal or disconnecting your terminals or network connections, but there is a simpler practical way to fix your problem.
When code is supposed to run as a daemon but really isn't one (just like yours), there is a wrapper that can turn any program into a daemon. Surprise: this wrapper is called daemon! It has lots of options; probably most important for you is the option to automatically restart your utility should it ever die for any reason.
If this command is not installed on your Ubuntu, just sudo apt-get install daemon, and man daemon to get started.
I'm trying to debug an application for an ARM processor from my x86 box. I followed instructions from someone who came before me on setting up a development environment. I've got a version of gdbserver that has been cross-compiled for the ARM processor and appears to let me connect to it via the ARM-aware gdb on my box.
I'm expecting that when the process I've got gdb attached to crashes (from a SIGSEGV or similar) it will break so that I can check out the call stack.
Is that a poor assumption? I'm new to the ARM world and cross-compiling things, is there possibly a good resource to get started on this stuff that I'm missing?
It depends on the target system (the one with the ARM processor). Some embedded systems detect invalid memory accesses (e.g. dereferencing NULL) but react with unconditional, uncatchable system termination (I have done development on such a system). What kind of OS is the target system running?
So I assume that the gdb client is able to connect to gdbserver and you are able to put a breakpoint on the running process, right?
If all the above steps succeed, then you should put the breakpoint before the instruction that crashes. If you don't know where it crashes, then once the application has crashed a core will be generated; take that core from the board. Then compile the source code again with the -g debug option (if the binaries are stripped) and do an offline analysis of the core, something like below:
gdb binary-name core_file
Then, once you get the gdb prompt, give the command below:
(gdb) thread apply all bt
The above command will give you the complete backtrace of all the threads; remember that the binaries should not be stripped, and the proper paths to all the source code and shared libraries should be available.
You can switch between threads using the command below at the gdb prompt:
(gdb) thread <thread_number>
If no core file is generated on the board, then try the command below on the board before executing the application:
ulimit -c unlimited