How does gdb multi-thread debugging coordinate with Linux thread scheduling? - linux

When debugging a multi-threaded program with gdb, one can:
1. switch between existing threads
2. step through code
3. etc.
Meanwhile, the process and its threads are OS resources managed and controlled by the Linux kernel. When gdb switches to a thread (say t1) from another (t2), how does it coordinate with the kernel, since the kernel might still want to run t2 for some period of time? Also, when gdb is stepping in one specific thread (by issuing the "si" command), how do the other threads get run (or are they totally paused) during this period?

When gdb switches to a thread (say t1) from another (t2), how does it coordinate with the kernel, since the kernel might still want to run t2 for some period of time?
By default, GDB operates in all-stop mode. That means that all threads are stopped whenever you see the (gdb) prompt. Switching between two stopped threads doesn't need any coordination with the kernel, because the kernel will not run non-runnable (stopped) threads.
In non-stop mode, threads other than the current one run freely, and the kernel can and will schedule them to run as it sees fit.
When gdb is stepping in one specific thread (by issuing the "si" command), how do the other threads get run (or are they totally paused) during this period?
When you step or stepi, by default all threads are resumed. You can control this with set scheduler-locking on, in which case only the single thread will be resumed. If you forget to turn scheduler-locking off and do continue, only the current thread will be resumed, which is likely to confuse you.
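As a rough illustration of the stepping behaviour described above, the following Python sketch assembles the GDB commands for single-stepping one thread while the others stay stopped. The helper name step_one_thread_commands and its parameters are made up for this example; only the GDB commands themselves (thread, set scheduler-locking, stepi) come from GDB.

```python
def step_one_thread_commands(thread_num, steps=1, restore=True):
    """Return GDB CLI commands that step only `thread_num`.

    Hypothetical helper for illustration: it only builds strings and
    does not talk to GDB.
    """
    if thread_num < 1 or steps < 1:
        raise ValueError("thread_num and steps must be positive")
    commands = [
        f"thread {thread_num}",      # select the thread to step
        "set scheduler-locking on",  # resume only the selected thread
        f"stepi {steps}",            # execute `steps` machine instructions
    ]
    if restore:
        # Turn locking back off so a later `continue` resumes every thread.
        commands.append("set scheduler-locking off")
    return commands


if __name__ == "__main__":
    print("\n".join(step_one_thread_commands(2)))
```

The printed lines can be typed at the (gdb) prompt or saved to a file and loaded with the source command. GDB also offers set scheduler-locking step, which applies the lock only while stepping and may be a safer default than on.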
Documentation.

Related

mlock blocked by FIFO thread of a different process on Ubuntu Linux

I am working on some real-time programs that require mlock and FIFO scheduling policy for fast paths.
I am running two processes on Ubuntu 16.04 with 12 CPU cores, and I assigned the fast paths of these processes to different cores.
Process 1 starts normally and pins its fast thread to a CPU and sets the scheduling policy to FIFO on this thread.
When process 2 starts, before its fast thread is created, it tries to call mlock.
Then, process 2 is stuck.
I attached gdb to process 2, and the call stack seems to be inside the mlock function.
If I remove the FIFO setting on process 1, both processes can run normally.
My suspicion is that mlock is trying to access a kernel resource that is held by the fast thread of process 1.
So it is blocked and put on wait indefinitely.
Does anyone know exactly what it is waiting for?
I have observed this problem on two similar IBM servers with Ubuntu.
But, on a Supermicro machine with a Redhat Linux, this issue didn't occur.
Thanks for any hint or solution!
If you have a SCHED_FIFO process that completely occupies a CPU, so that non-SCHED_FIFO threads never get scheduled on that CPU, some in-kernel algorithms may stop working, depending on kernel version and configuration.
Try booting with rcutree.kthread_prio=N, where N is larger than the SCHED_FIFO priority of your thread.
Try playing with /proc/sys/kernel/sched_rt_runtime_us
Try to get an in-kernel backtrace of the hung mlock() to understand where it is waiting; this may give a hint. For that, use /proc/pid/stack (if your kernel is compiled with CONFIG_STACKTRACE) or maybe 'echo t > /proc/sysrq-trigger'
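To make the sched_rt_runtime_us suggestion a bit more concrete, here is a minimal sketch that reads the two real-time throttling knobs and reports what fraction of each period real-time tasks are allowed to consume. The function names are hypothetical; the /proc paths are the standard Linux ones, and a runtime of -1 means throttling is disabled.

```python
from pathlib import Path

PROC_SCHED = Path("/proc/sys/kernel")


def rt_bandwidth_fraction(runtime_us, period_us):
    """Fraction of each period that real-time tasks may run.

    A runtime of -1 disables throttling, so real-time tasks may use
    the whole period (fraction 1.0).
    """
    if period_us <= 0:
        raise ValueError("period_us must be positive")
    if runtime_us < 0:
        return 1.0
    return min(runtime_us / period_us, 1.0)


def read_rt_bandwidth(base=PROC_SCHED):
    """Read the live values; returns None when the files are unavailable."""
    try:
        runtime = int((base / "sched_rt_runtime_us").read_text())
        period = int((base / "sched_rt_period_us").read_text())
    except (OSError, ValueError):
        return None
    return rt_bandwidth_fraction(runtime, period)


if __name__ == "__main__":
    print("RT bandwidth fraction:", read_rt_bandwidth())
```

With the usual defaults of 950000 and 1000000 the result is 0.95, which leaves a small share of CPU time for non-real-time tasks. A value of 1.0 means a busy SCHED_FIFO thread can starve everything else on its CPU.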

How to halt execution of a single thread when a breakpoint is reached with Eclipse CDT + GDB

I'm debugging a multi-threaded C++ application in Eclipse Oxygen with gdb 7.4.
The default behaviour is that all threads are halted when a breakpoint is reached. However, I'd like only the thread that reached the breakpoint to halt, while all the others continue to run.
How is this possible?
(gdb) set non-stop on
By default, non-stop mode is off. You want it on; see gdb's built-in help:
(gdb) help set non-stop
Set whether gdb controls the inferior in non-stop mode.
When debugging a multi-threaded program and this setting is
off (the default, also called all-stop mode), when one thread stops
(for a breakpoint, watchpoint, exception, or similar events), GDB stops
all other threads in the program while you interact with the thread of
interest. When you continue or step a thread, you can allow the other
threads to run, or have them remain stopped, but while you inspect any
thread's state, all threads stop.
In non-stop mode, when one thread stops, other threads can continue
to run freely. You'll be able to step each thread independently,
leave it stopped or free to run as needed.
(gdb)
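Non-stop mode generally has to be enabled before the program is started or attached, so one option is to pass the settings on the GDB command line. The sketch below only builds the argument list. The helper name and the example program path are hypothetical, while -ex, --args, set pagination off and set non-stop on are regular GDB options and commands. Older GDB releases such as 7.4 may also need set target-async on, which is left as an opt-in flag here because that is an assumption about the installed version.

```python
import shlex


def gdb_non_stop_argv(program, program_args=(), target_async=False):
    """Build an argv list that starts GDB with non-stop mode enabled."""
    argv = ["gdb"]
    settings = []
    if target_async:
        # Assumption: only needed on older GDB releases.
        settings.append("set target-async on")
    # Pagination interferes with non-stop mode in the CLI, so disable it first.
    settings += ["set pagination off", "set non-stop on"]
    for setting in settings:
        argv += ["-ex", setting]
    # Everything after --args is the program and its own arguments.
    argv += ["--args", program, *program_args]
    return argv


if __name__ == "__main__":
    print(shlex.join(gdb_non_stop_argv("./app", ["--workers", "4"])))
```

In Eclipse CDT the same effect is normally reached through the debug configuration rather than a hand-built command line, so treat this as a way to reproduce the setting outside the IDE.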

How to list threads that were killed by the kernel?

Is there any way to list all the killed processes on a Linux device?
I saw this answer suggesting:
check in:
/var/log/kern.log
but that is not generic. Is there any other way to do it?
What I want to do:
list a thread/process if it got killed. What function in the kernel should I edit to list all the killed tids/pids and their names, or alternatively, is there a sysfs entry that does this already?
The opposite of do_fork is do_exit, here:
do_exit kernel source
I'm not able to find when threads are exiting, other than:
release_task
I believe "task" and "thread" are (almost) synonymous in Linux.
First, task and thread contexts are different in the kernel.
A task in the tasklet sense (tasklet API) runs in software-interrupt context, meaning it cannot sleep, while a thread (kthread API or workqueue API) runs its handler in process context, where sleeping is allowed.
In both cases, if a thread hangs in the kernel, you cannot kill it.
If you run the "ps" command from the shell, you can see kernel threads there (normally shown in square brackets, "[" and "]"), but any attempt to kill one won't work.
The kernel is trusted code, so such a situation shouldn't happen; if it does, it indicates a bug in the kernel or in a kernel module.
Normally the whole machine will hang after a while because the core running that thread stops responding (you will see a message with more information in /var/log/messages or on the console). In some other cases the machine may survive while that specific core is dead. It depends on the kernel configuration.

Kernel mode in multithreaded program

If a thread in a process makes a system call, then in a single-threaded process the process switches to kernel mode. But what happens in the case of a multi-threaded process?
In other words, if a thread in a process makes a system call, what is the mode of the process that contains that thread: kernel mode or user mode?
In Linux a thread is simply a process that happens to share memory with several other processes (other threads within the same process).
So the CPU running that thread is in kernel mode for the duration of the syscall, while the other threads of the process stay in whatever mode they were in. Execution still switches to some other thread or process when the time slice expires, just as it normally switches from process to process, even if the currently running one is in the middle of a syscall.
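The "a thread is a process that shares memory" view can be observed from user space. In the sketch below (Python 3.8 or later; the function name is made up for the example), every worker reports the same process ID but its own kernel-assigned thread ID, and all of them write into one shared list.

```python
import os
import threading


def collect_thread_ids(num_threads=3):
    """Run worker threads and record (pid, native thread id) for each."""
    results = []  # shared memory: every thread appends to the same list
    lock = threading.Lock()
    barrier = threading.Barrier(num_threads)

    def worker():
        # Wait until all workers exist, so their thread ids are
        # guaranteed to be distinct (ids can be reused after exit).
        barrier.wait()
        with lock:
            results.append((os.getpid(), threading.get_native_id()))

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    for pid, tid in collect_thread_ids():
        print(f"pid={pid} tid={tid}")
```

On Linux the native thread ID is the TID that appears under /proc/<pid>/task and that GDB displays as the LWP number. Each of these is scheduled independently, which is why one thread sitting in a syscall says nothing about the mode of its siblings.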

Does the Linux scheduler need to be context switched?

I have a general question about the Linux scheduler and some other similar kernel system calls.
Is the Linux scheduler considered a "process", so that every call to the scheduler requires a context switch as if it were just another process?
Say we have a clock tick that interrupts the currently running user-mode process, and we now have to call the scheduler. Does the call to the scheduler itself provoke a context switch? Does the scheduler have its own set of registers, U-area and whatnot, which it has to restore at every call?
The same question applies to many other system calls. Do kernel processes behave like regular processes with regard to context switching, the only difference being that they have more permissions and access to the CPU?
I ask this because context switches are expensive. It sounds odd that calling the scheduler would itself provoke a context switch to restore the scheduler state, after which the scheduler picks another process to run, causing yet another context switch.
That's a very good question, and the answer to it would be "yes" except for the fact that the hardware is aware of the concept of an OS and task scheduler.
In the hardware, you'll find registers that are restricted to "supervisor" mode. Without going into too much detail about the internal CPU architecture, there's a copy of the basic program execution registers for "user mode" and "supervisor mode," the latter of which can only be accessed by the OS itself (via a flag in a control register that the kernel sets which says whether or not the kernel or a user mode application is currently running).
So the "context switch" you speak of is the process of swapping/resetting the user mode registers (program counter, stack pointer register, etc.), but the system registers don't need to be swapped out because they're stored apart from the user ones.
For instance, on the Motorola 68000 family the user mode stack pointer is USP (A7 in user mode), whereas the supervisor mode stack pointer is SSP (A7 in supervisor mode). So the kernel itself (which contains the task scheduler) would use the supervisor mode stack and other supervisor mode registers to run itself, with the supervisor mode flag set to 1 while it's running, then perform a context switch on the user mode hardware to swap between apps and set the supervisor mode flag back to 0.
But prior to the idea of OSes and task scheduling, if you wanted to do a multitasking system then you'd have had to use the basic concept that you outlined in your question: use a hardware interrupt to call the task scheduler every x cycles, then swap out the app for the task scheduler, then swap in the new app. But in most cases the timer interrupt would be your actual task scheduler itself and it would have been heavily optimized to make it less of a context switch and more of a simple interrupt handler routine.
Actually you can check the code for the schedule() function in kernel/sched.c (kernel/sched/core.c in newer kernels). It is admirably well-written and should answer most of your question.
But the bottom line is that the Linux scheduler is invoked by calling schedule(), which does the job using the context of its caller. Thus there is no dedicated "scheduler" process. That would actually make things more difficult: if the scheduler were a process, it would also have to schedule itself!
When schedule() is invoked explicitly, it just switches the context of the calling thread A with that of the selected runnable thread B, so that it returns into B (by restoring register values and stack pointers, the return address of schedule() becomes B's instead of A's).
Here is an attempt at a simple description of what goes on during the dispatcher call:
1. The program that currently has context is running on the processor. Registers, program counter, flags, stack base, etc. are all appropriate for this program; with the possible exception of an operating-system-native "reserved register" or some such, nothing about the program knows anything about the dispatcher.
2. The timed interrupt for the dispatcher function is triggered. The only thing that happens at this point (in the vanilla architecture case) is that the program counter jumps immediately to the handler address listed for that interrupt in the interrupt vector table. This begins execution of the dispatcher's "dispatch" subroutine; everything else is left untouched, so the dispatcher sees the registers, stack, etc. of the program that was previously executing.
3. The dispatcher (like all programs) has a set of instructions that operate on the current register set. These instructions are written with the knowledge that the previously executing application has left all of its state behind. The first few instructions in the dispatcher store this state in memory somewhere.
4. The dispatcher determines which program should have the CPU next, takes all of its previously stored state and fills the registers with it.
5. The dispatcher jumps to the program counter value recorded for the task that now has its full context established on the CPU.
To (over)simplify in summary: the dispatcher doesn't need registers of its own. All it does is write the current CPU state to a predetermined memory location, load another process's CPU state from a predetermined memory location, and jump to where that process left off.
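The dispatcher description above can be turned into a toy model. The sketch below is not how Linux implements schedule(); it is a deliberately simplified round-robin simulation, and the class names, register names and addresses are all invented for the example.

```python
from dataclasses import dataclass, field


@dataclass
class CpuState:
    """Toy register file: a program counter plus general registers."""
    pc: int = 0
    regs: dict = field(default_factory=dict)


class Dispatcher:
    """Toy round-robin dispatcher; names and layout are illustrative only."""

    def __init__(self, saved_states):
        self.saved = dict(saved_states)  # task name -> saved CpuState
        self.order = list(self.saved)    # round-robin order
        self.current = self.order[0]     # task assumed to be on the CPU

    def on_timer_interrupt(self, cpu):
        # Save the interrupted task's state to its slot in memory.
        self.saved[self.current] = CpuState(cpu.pc, dict(cpu.regs))
        # Pick the next task and load its previously saved registers.
        idx = (self.order.index(self.current) + 1) % len(self.order)
        self.current = self.order[idx]
        nxt = self.saved[self.current]
        cpu.regs = dict(nxt.regs)
        # "Jump" to where that task left off.
        cpu.pc = nxt.pc
        return self.current


if __name__ == "__main__":
    cpu = CpuState(pc=100, regs={"r0": 1})
    d = Dispatcher({"A": CpuState(100, {"r0": 1}),
                    "B": CpuState(500, {"r0": 7})})
    cpu.pc, cpu.regs["r0"] = 104, 2        # task A makes some progress
    print(d.on_timer_interrupt(cpu), cpu)  # the CPU now holds B's state
```

Note that the dispatcher object holds no register set of its own: it only copies state between the single CpuState and the per-task slots, which is the point the summary makes.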

Resources