Can the same timer interrupt occur in parallel? - linux

I implemented a timer interrupt handler in a kernel module.
This handler takes about 1000us to run.
And I want the timer to fire every 10us.
(In doing so, I expect multiple instances of the same handler to run in parallel.)
(I know this can create a tremendous amount of interrupt overhead, but I want to implement it for some testing.)
But the handler does not seem to run in parallel.
The timer interrupt seems to wait until the handler already in progress has finished.
Can the same timer interrupt occur in parallel?
If not, is there a kernel mechanism that can run the same handler in parallel?

If the timer fires every 10us and the handler takes 1000us (1ms) to complete, you would need 100 dedicated CPUs just to barely keep up with the timer. The short answer is no, the interrupt system isn't going to support this. If an interrupt handler re-entered itself, it would inevitably consume the interrupt handler stack.
Interrupts typically work by having a short bit of code be directly invoked when the interrupt asserts. If more work is to be done, this short bit schedules a slower bit to follow on, and inhibits this source of interrupt. This minimizes the latency caused by disparate devices seeking CPU attention. The slower bit, once it has satisfied the device's request, re-enables interrupts from this source.
[ In Linux, the short bit is called the top half and the slower bit the bottom half. It is a bit confusing, because decades of kernel implementations before Linux named them exactly the other way around. Best to avoid these terms. ]
One of many ways to get the effect you desire is to have this slow handler release a semaphore then re-enable the interrupt. You could then have an appropriate number of threads sit in a loop acquiring the semaphore then performing your task.
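A minimal sketch of that approach (untested; names such as work_sem, timer_fn, worker_fn and do_slow_work are hypothetical): an hrtimer handler up()s a semaphore and re-arms itself every 10us, while a pool of kernel threads started with kthread_run() down() the semaphore and perform the slow work in process context. With N worker threads, up to N instances of the slow work can run concurrently, which is as close to a "parallel handler" as you can get.

```c
#include <linux/semaphore.h>
#include <linux/kthread.h>
#include <linux/hrtimer.h>

static struct semaphore work_sem = __SEMAPHORE_INITIALIZER(work_sem, 0);

/* interrupt context: release one worker, then re-arm */
static enum hrtimer_restart timer_fn(struct hrtimer *t)
{
    up(&work_sem);
    hrtimer_forward_now(t, ns_to_ktime(10 * 1000));  /* 10us */
    return HRTIMER_RESTART;
}

/* process context: one of several identical workers */
static int worker_fn(void *unused)
{
    while (!kthread_should_stop()) {
        if (down_interruptible(&work_sem))
            continue;
        do_slow_work();          /* the ~1000us job (hypothetical) */
    }
    return 0;
}

/* in init, start the pool:
 *   for (i = 0; i < NWORKERS; i++)
 *       kthread_run(worker_fn, NULL, "slowwork/%d", i);
 */
```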

Related

Top halves and bottom halves concept clarification

As per the guidelines for top halves and bottom halves, when any interrupt arrives it is handled in two halves. The so-called top half is the routine that actually responds to the interrupt—the one you register with request_irq. The bottom half is a routine that is scheduled by the top half to be executed later, at a safer time. The big difference between the top-half handler and the bottom half is that all interrupts are enabled during execution of the bottom half—that's why it runs at a safer time. In the typical scenario, the top half saves device data to a device-specific buffer, schedules its bottom half, and exits: this operation is very fast. The bottom half then performs whatever other work is required, such as awakening processes, starting up another I/O operation, and so on. This setup permits the top half to service a new interrupt while the bottom half is still working.
But if the interrupt is handled at a "safer time" by the bottom half, then logically an incoming interrupt has to wait until the bottom half finds some safer time to execute. Won't that limit the system, since everything waits until the interrupt is fully handled? For example: if I am working on a project that blinks an LED when the temperature goes above a specific limit, and the interrupt handling is done only when some safe time is available (according to the bottom-half concept), then the blink operation will be delayed. Please clarify my doubt: how are all the interrupts handled?
When top-half/bottom-half interrupt architecture is used, there is commonly a high-priority interrupt handling thread.
This interrupt handling thread has a higher priority than other threads in the system (some vendor SDKs specify an "Interrupt" priority level for this purpose). There is often a queue, and the thread sleeps when there is no work in the queue. This thread/queue is designed so that work can be safely added from an interrupt context.
When a top-half handler is called, it will handle the hardware operations and then add the bottom-half handler(s) to the interrupt queue. The top-half handler returns and interrupt context is exited. The OS then checks which thread should run next. Because the interrupt thread has work available, and because it has the highest priority, it runs next. This minimizes the latency you are worried about.
There will naturally be some latency jitter, because there may be other interrupts in the queue that fired ahead of the LED (in your example). There are different solutions to this, depending on the application and real-time requirements. One is to have a sorted queue based on an interrupt priority level. This incurs additional cost when enqueueing operations, but it also ensures your interrupts will be handled by priority. The other option, for critical latencies, is to do all the work in the top-half interrupt handler.
It's important to keep in mind the purposes of such an architecture:
1) Minimize the time spent in interrupt context, because other interrupts are (likely) disabled while you are processing the current one, and you are increasing the latency for handling them.
2) Prevent users from calling functions that are not safe to invoke from an interrupt context.
We still want our bottom-half handlers to be run as soon as possible to reduce latency, so when you say "wait for a safer time", this means "outside of the interrupt context".
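To make the queue-plus-thread pattern above concrete, here is a rough sketch in Linux kernel style. Every name in it (bh_item, bh_enqueue, bh_thread) is made up for illustration: the top half pushes an item onto a spinlock-protected list and wakes a dedicated worker thread, which drains the list in process context.

```c
#include <linux/list.h>
#include <linux/spinlock.h>
#include <linux/wait.h>
#include <linux/kthread.h>

struct bh_item {
    struct list_head node;
    void (*fn)(void);            /* the deferred bottom-half work */
};

static LIST_HEAD(bh_queue);
static DEFINE_SPINLOCK(bh_lock);
static DECLARE_WAIT_QUEUE_HEAD(bh_wait);

/* called from the top half (interrupt context) */
static void bh_enqueue(struct bh_item *item)
{
    unsigned long flags;

    spin_lock_irqsave(&bh_lock, flags);
    list_add_tail(&item->node, &bh_queue);
    spin_unlock_irqrestore(&bh_lock, flags);
    wake_up(&bh_wait);           /* safe from interrupt context */
}

/* the high-priority worker thread */
static int bh_thread(void *unused)
{
    while (!kthread_should_stop()) {
        struct bh_item *item = NULL;
        unsigned long flags;

        /* racy check is fine: we re-check under the lock below */
        wait_event_interruptible(bh_wait, !list_empty(&bh_queue));

        spin_lock_irqsave(&bh_lock, flags);
        if (!list_empty(&bh_queue)) {
            item = list_first_entry(&bh_queue, struct bh_item, node);
            list_del(&item->node);
        }
        spin_unlock_irqrestore(&bh_lock, flags);

        if (item)
            item->fn();          /* run the bottom half */
    }
    return 0;
}
```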
If your blink operation is too important to be delayed, you could put it in the top half and have no bottom half at all. However, depending on what you do in the top half, it may or may not affect system performance.
I would suggest you write code for both cases and do some profiling.
Interrupt: An interrupt is an event that alters the sequence of instructions executed by a processor, in response to an electrical signal generated by a hardware circuit either inside or outside the CPU.
When an interrupt is generated, it is handled in two halves:
1) Top Halves
2) Bottom Halves
Top Halves: The top half executes as soon as the CPU receives the interrupt. In the top-half context, interrupts and the scheduler are disabled, so this part of the code should contain only critical code. Its execution time should be as short as possible, because interrupts are disabled while it runs and we don't want to miss other interrupts generated by the devices.
Bottom Halves: The job of the bottom half is to run the work deferred (left over) by the top half. While this code executes, interrupts are enabled and the scheduler is disabled. Bottom halves are scheduled via softirqs and tasklets to run the deferred work.
Note: The top-half code should be as short as possible, take deterministic time, and must not contain any blocking calls.
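To make the split concrete, here is an illustrative sketch using the classic (pre-5.9) tasklet API. MY_IRQ, read_device_status() and process_data() are hypothetical placeholders, not real APIs.

```c
#include <linux/interrupt.h>

/* bottom half: runs later, with interrupts enabled */
static void my_bottom_half(unsigned long data)
{
    process_data();                  /* the deferred, slower work */
}
static DECLARE_TASKLET(my_tasklet, my_bottom_half, 0);

/* top half: short, deterministic, no blocking calls */
static irqreturn_t my_top_half(int irq, void *dev)
{
    read_device_status();            /* minimal, critical work only */
    tasklet_schedule(&my_tasklet);   /* defer the rest */
    return IRQ_HANDLED;
}

/* in module init:
 *   ret = request_irq(MY_IRQ, my_top_half, 0, "mydev", NULL);
 */
```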

Why can't schedule() be called directly from a hardware interrupt?

For example, why can't I call schedule() directly from scheduler_tick() and instead I have to use need_resched flag?
I tried looking for an answer but came up empty-handed. Any help would be much appreciated.
Consider a CPU that holds a spin lock and is now servicing an interrupt. If you schedule() away, you violate the invariant that spin-lock owners never go off-CPU. Note that for the most part spin locks DON'T disable interrupts. Some locks are also taken by interrupt handlers, and in those cases spin_lock_irq and/or spin_lock_irqsave is used.
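A hedged sketch of that rule: process context must take a lock it shares with an interrupt handler using the irqsave variant, otherwise the handler can fire on the same CPU and spin forever on a lock its own CPU already holds.

```c
#include <linux/spinlock.h>
#include <linux/interrupt.h>

static DEFINE_SPINLOCK(dev_lock);

/* process context: disable local interrupts while holding the lock */
void update_state(void)
{
    unsigned long flags;

    spin_lock_irqsave(&dev_lock, flags);
    /* ... touch data shared with the handler ... */
    spin_unlock_irqrestore(&dev_lock, flags);
}

/* interrupt context: plain lock is fine, IRQs are already off here */
static irqreturn_t handler(int irq, void *dev)
{
    spin_lock(&dev_lock);
    /* ... */
    spin_unlock(&dev_lock);
    return IRQ_HANDLED;
}
```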
Scheduling happens on timer interrupts. The basic rule is that only one interrupt can be open at a time, so if you go to sleep in the "got data from device X" interrupt, the timer interrupt cannot run to schedule it out.
Interrupts also fire many times and can overlap. If you put the "got data" interrupt to sleep, and then get more data, what happens? It's confusing (and fragile) enough that the catch-all rule is: no sleeping in interrupts. You will do it wrong.
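As for the need_resched flag the question mentions, here is a simplified sketch of the pattern (illustrative pseudo-kernel code, not actual kernel source; time_slice_expired() is made up): the timer interrupt only marks the current task as needing a reschedule, and schedule() runs later, on the safe return-from-interrupt path.

```c
/* in the timer interrupt (scheduler_tick() and friends): */
if (time_slice_expired(current))     /* hypothetical check */
    set_tsk_need_resched(current);   /* just set TIF_NEED_RESCHED */

/* later, on the return-from-interrupt path, once all handlers
 * have finished and it is safe to context-switch: */
if (need_resched())
    schedule();
```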

Which context a given function is called in Linux Kernel

Is there a straightforward mechanism to identify whether a given function is called from interrupt context or from process context? That is the first part of the question. The second part: how do I synchronize two processes, one in interrupt context and the other in process context? If my understanding is right, we cannot use mutexes for the one in interrupt context, since it is not allowed to sleep. On the other hand, if I use spinlocks, the other process will burn CPU cycles. What is the best way to synchronize these two? Correct me if my understanding is totally wrong.
You can tell whether a function is running as an IRQ handler using the in_irq() function. But I don't think it's good practice to use it; you should be able to see from the code which context your function runs in. Otherwise I'd say your code has a bad design.
As for the synchronization mechanism, you are right: you have to use a spinlock, because you need synchronization in atomic context (e.g. an interrupt), and you don't have much choice here. You are also right that CPU cycles will be wasted while waiting for the spinlock, so you should try to minimize the amount of your code under the lock.
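A small example of the context check mentioned above (hedged: report_context() is a made-up helper). in_irq() is true inside a hardirq handler, and in_interrupt() also covers softirq/tasklet context; newer kernels prefer in_hardirq()/in_task(), but the idea is the same.

```c
#include <linux/hardirq.h>
#include <linux/printk.h>

void report_context(void)
{
    if (in_irq())
        pr_info("hardirq context\n");
    else if (in_interrupt())
        pr_info("softirq/tasklet context\n");
    else
        pr_info("process context\n");
}
```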
Adding to Sam's answer - you should design your interrupt handler with top-half and bottom-half sections. This lets you have minimal code (the top half) in the interrupt handler (which you register when requesting the IRQ in the driver); the rest (the bottom half) you can schedule using a work queue.
You can have this top half (where you are just handling the interrupt and doing some minimal reads/writes from the device) inside an atomic context protected by a spinlock, so that fewer CPU cycles are wasted waiting for the spinlock.
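A sketch of that design, with hypothetical device helpers (grab_device_data(), process_buffered_data()): the registered handler does only the minimal hardware work and defers the rest to a work queue, where sleeping is allowed.

```c
#include <linux/interrupt.h>
#include <linux/workqueue.h>

/* bottom half: runs in process context, may sleep */
static void my_bh(struct work_struct *w)
{
    process_buffered_data();
}
static DECLARE_WORK(my_work, my_bh);

/* top half: registered with request_irq() in the driver */
static irqreturn_t my_isr(int irq, void *dev)
{
    grab_device_data();          /* minimal reads/writes only */
    schedule_work(&my_work);     /* defer the rest */
    return IRQ_HANDLED;
}
```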

Sleeping in a Linux work queue

I'm just reading about Linux kernel interrupt handler bottom halves for the first time, and am trying to understand the use of the work queue for deferred work.
From what I understand, the benefit of the work queue over softirqs or tasklets is that the work is done in process context, so it can sleep. But by default, this work is just done sequentially on one of the events/X threads? So if say some work is started on events/0 which then sleeps for a long time waiting on some IO, no more work queue items can be processed on that processor, which seems pretty terrible for performance.
So is the onus just on all interrupt handler developers to not use the default events/X thread if the work could sleep for a long time? Or have I misunderstood something?
But by default, this work is just done sequentially on one of the events/X threads? So if say some work is started on events/0 which then sleeps for a long time waiting on some IO, no more work queue items can be processed on that processor, which seems pretty terrible for performance.
This is not accurate; the workqueue API allows both single-threaded and multi-threaded workqueues. For the former, the function create_singlethread_workqueue() is used.
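For example, a sketch of a dedicated single-threaded workqueue (names hypothetical), so that long-sleeping work does not stall the shared events/X threads:

```c
#include <linux/module.h>
#include <linux/workqueue.h>

static struct workqueue_struct *my_wq;

static void slow_work_fn(struct work_struct *w)
{
    /* may sleep on I/O without stalling other subsystems */
}
static DECLARE_WORK(slow_work, slow_work_fn);

static int __init my_init(void)
{
    my_wq = create_singlethread_workqueue("my_wq");
    if (!my_wq)
        return -ENOMEM;
    queue_work(my_wq, &slow_work);   /* runs on the dedicated thread */
    return 0;
}

static void __exit my_exit(void)
{
    flush_workqueue(my_wq);          /* wait for pending work */
    destroy_workqueue(my_wq);
}
```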
So is the onus just on all interrupt handler developers to not use the default events/X thread if the work could sleep for a long time? Or have I misunderstood something?
In a softirq (i.e. a tasklet) you cannot sleep at all, so basically the benefit of a workqueue is that you can sleep. Indeed, it is the developer's responsibility not to starve other queued work when using a single-threaded workqueue.
Also bear in mind that the workqueue API provides more than just enqueue/dequeue of tasks: it also has functions to queue delayed work, synchronize between work items, flush work queues, cancel delayed work, etc. This API is an advantage over other softirq-based mechanisms even for non-sleeping usage.
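A brief sketch of those extra facilities, here a self re-arming delayed work item (poll_fn is hypothetical):

```c
#include <linux/module.h>
#include <linux/workqueue.h>
#include <linux/jiffies.h>

static void poll_fn(struct work_struct *w);
static DECLARE_DELAYED_WORK(poll_work, poll_fn);

static void poll_fn(struct work_struct *w)
{
    /* ... periodic work ... */
    schedule_delayed_work(&poll_work, msecs_to_jiffies(100)); /* re-arm */
}

static int __init my_init(void)
{
    schedule_delayed_work(&poll_work, msecs_to_jiffies(100));
    return 0;
}

static void __exit my_exit(void)
{
    cancel_delayed_work_sync(&poll_work); /* stop and wait */
}
```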

Pthread Concepts

I'm studying threads and I am not sure if I understand some concepts. What is the difference between preemption and yield? So far I know that preemption is a forced yield but I am not sure what it actually means.
Thanks for your help.
Preemption is when one thread stops another thread from running so that it may run in its place.
To yield is when a thread voluntarily gives up processor time.
Have a gander at these...
http://en.wikipedia.org/wiki/Preemption_(computing)
http://en.wikipedia.org/wiki/Thread_(computing)
The difference is how the OS is entered.
'yield' is a software interrupt, AKA system call, one of the many that may result in a change in the set of running threads (there are lots of other system calls that can do this: blocking reads, synchronization calls). yield() is called from a running thread and may result in another ready (but not running) thread of the same priority being run instead of the calling thread, if there is one.
The exact behaviour of yield() is somewhat hardware/OS/language-dependent. Unless you are developing low-level lock-free thread comms mechanisms, and you are very good at it, it's best to just forget about yield().
Preemption is the act of interrupting one thread and dispatching another in its place. It can only occur after a hardware interrupt. When hardware interrupts, its driver is entered. The driver may decide that it can usefully make a thread ready (e.g. a thread is blocked on a read() call to the driver and the driver has accumulated a nice, big buffer of data). The driver can do this by signaling a semaphore and exiting via the OS (which provides an entry point for just such a purpose). This driver exit path causes a reschedule and probably makes the read thread run instead of some other thread that was running before the interrupt; the other thread has been preempted. Essentially and simply, preemption occurs when the OS decides to interrupt-return to a different set of threads than the one that was interrupted.
Yield: The thread calls a function in the scheduler, which potentially "parks" that thread, and starts another one. The other thread is one which called yield earlier, and now appears to return from it. Many functions can have yielding semantics, such as reading from a device.
Preempt: an external event comes into the system: some kind of interrupt (clock, network data arriving, disk I/O completing, ...). Whichever thread is running at that time is suspended, and the machine runs operating system code in the interrupt context. When the interrupt has been serviced and it's time to return from it, a scheduling decision can be made to keep the interrupted thread parked and instead resume another one. That is a preemption. If/when the original thread gets to run again, the context that was saved by the interrupt will be restored and it will pick up exactly where it left off.
Scheduling systems which rely on yield exclusively are called "cooperative" or "cooperative multitasking" as opposed to "preemptive".
Traditional (read: old, 1970s and '80s) Unix is cooperatively multitasked in the kernel, with a preemptive user space. The kernel routines are trusted to yield in a reasonable time, and so preemption is disabled when running kernel code. This greatly simplifies kernel coding and improves reliability, at the expense of performance, especially when multiple processors are introduced. Linux was like this for many years.
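A small user-space illustration of the cooperative side: each pthread calls sched_yield() to offer the CPU to a ready peer of the same priority. Whether another thread actually runs is entirely up to the scheduler.

```c
#include <pthread.h>
#include <sched.h>
#include <stdio.h>

static void *worker(void *arg)
{
    for (int i = 0; i < 5; i++) {
        printf("thread %ld, iteration %d\n", (long)arg, i);
        sched_yield();           /* voluntarily give up the CPU */
    }
    return NULL;
}

int main(void)
{
    pthread_t t1, t2;

    pthread_create(&t1, NULL, worker, (void *)1L);
    pthread_create(&t2, NULL, worker, (void *)2L);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    return 0;
}
```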
