How do you clear a pending interrupt on PLIC without performing a claim? - riscv

I'm implementing a PLIC and I'm stuck on whether it's possible to clear a pending interrupt without performing a claim.
Is it possible to clear an interrupt by removing its cause (e.g. source 3 stops asserting its interrupt) without doing a claim?

Related

Vulkan abort problematic commands?

I have an application where multiple threads would be rendering different parts of a world. However, it may occur that one of those threads submits a highly problematic, or even malicious, command to Vulkan.
Is there any way to preemptively check the command for issues and catch it being problematic? Or to let it attempt to execute, but then by some means determine that it is problematic and abort it? All the while not corrupting or wrecking legitimate commands that were submitted from other threads.
I know the obvious solution is "don't submit malicious commands!", but without explaining everything, the gist of this is to try to create a kind of graphics sandbox.
The Vulkan runtime assumes well-formed input; there is no error checking in the driver itself (that's left to the validation layers), so without validation you can get rendering corruption or driver crashes.
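For development builds, enabling validation is a matter of listing the layer at instance creation. A minimal sketch, assuming the SDK's VK_LAYER_KHRONOS_validation layer is installed and with error handling omitted:

    #include <vulkan/vulkan.h>

    VkInstance create_debug_instance(void)
    {
        /* Enable the Khronos validation layer (development builds only:
         * it adds overhead but catches malformed API usage). */
        const char *layers[] = { "VK_LAYER_KHRONOS_validation" };

        VkInstanceCreateInfo info = {
            .sType = VK_STRUCTURE_TYPE_INSTANCE_CREATE_INFO,
            .enabledLayerCount = 1,
            .ppEnabledLayerNames = layers,
        };

        VkInstance instance = VK_NULL_HANDLE;
        vkCreateInstance(&info, NULL, &instance);  /* check VkResult in real code */
        return instance;
    }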
You can get some limited protection against GPU-side buffer overruns using the robustBufferAccess device feature, but it only catches a small subset of the problems.
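A sketch of requesting that feature at device creation (hypothetical helper; assumes physical_device and queue_family were chosen elsewhere, and error handling is omitted):

    #include <vulkan/vulkan.h>

    VkDevice create_robust_device(VkPhysicalDevice physical_device,
                                  uint32_t queue_family)
    {
        VkPhysicalDeviceFeatures supported;
        vkGetPhysicalDeviceFeatures(physical_device, &supported);

        /* Only request the feature if the implementation reports it. */
        VkPhysicalDeviceFeatures enabled = {0};
        enabled.robustBufferAccess = supported.robustBufferAccess;

        float priority = 1.0f;
        VkDeviceQueueCreateInfo queue_info = {
            .sType = VK_STRUCTURE_TYPE_DEVICE_QUEUE_CREATE_INFO,
            .queueFamilyIndex = queue_family,
            .queueCount = 1,
            .pQueuePriorities = &priority,
        };

        VkDeviceCreateInfo device_info = {
            .sType = VK_STRUCTURE_TYPE_DEVICE_CREATE_INFO,
            .queueCreateInfoCount = 1,
            .pQueueCreateInfos = &queue_info,
            /* Out-of-bounds buffer accesses get defined behavior instead
             * of undefined behavior; this is not full sandboxing. */
            .pEnabledFeatures = &enabled,
        };

        VkDevice device = VK_NULL_HANDLE;
        vkCreateDevice(physical_device, &device_info, NULL, &device);
        return device;
    }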
Beyond that the only real solution is to rely on host process isolation, and put each content provider into a separate process on the host OS with a unique rendering context.
Even with that you can get trivial denial-of-service (a shader with a very long-running or infinite loop), which the API doesn't really give you any means to control. You'd be reliant on the privileged GPU driver timing out the process and killing it.

Deciding the critical section of kernel code

Hi, I am writing kernel code that does process scheduling and multi-threaded execution. I've studied locking mechanisms and their functionality. Is there a rule of thumb for which data structures in a critical section should be protected by locking (mutexes/semaphores/spinlocks)?
I know that wherever there is a chance of concurrency in a piece of code, we require a lock. But how do we decide? What if we miss one and the test cases don't catch it? Earlier I wrote code for system calls and file systems where I never cared about taking locks.
Is there a rule of thumb for which data structures in a critical section should be protected by locking?
Any object (a global variable, a field of a structure, etc.) that is accessed concurrently, where at least one access is a write, requires some locking discipline.
But how do we decide? What if we miss one and the test cases don't catch it?
Good practice is an appropriate comment on every declaration of a variable, structure, or structure field that requires a locking discipline for access. Anyone who uses the variable reads this comment and writes the corresponding access code. The kernel core and modules tend to follow this strategy.
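A minimal sketch of that commenting convention (hypothetical structure; the style mirrors what you find throughout the kernel sources):

    #include <linux/spinlock.h>
    #include <linux/list.h>

    /* Hypothetical example: document which lock protects which fields,
     * so every user of the structure knows the required discipline. */
    struct conn_table {
        spinlock_t lock;          /* protects all fields below */
        struct list_head conns;   /* live connections; walk only under ->lock */
        unsigned int nr_conns;    /* read and written only under ->lock */
    };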
As for testing, ordinary testing rarely reveals concurrency issues because of their low probability. When testing kernel modules, I would advise using KernelStrider, which attempts to verify the correctness of concurrent memory accesses, or RaceHound, which increases the probability of concurrency issues manifesting and checks for them.
It is always safe to grab a lock for the duration of any code that accesses shared data, but this is slow, since it means only one thread at a time can run significant chunks of code.
Depending on the data in question, though, there may be shortcuts that are safe and fast. If it is a simple integer (and by integer I mean the native word size of the CPU, i.e. not a 64-bit value on a 32-bit CPU), then you may not need any locking: if one thread writes the integer while another reads it, the reader will get either the old value or the new value, never a mix of the two. If the reader doesn't care that it may get the old value, there is no need for a lock.
If, however, you are updating two integers together, and it would be bad for the reader to get the new value of one and the old value of the other, then you need a lock. Another example is incrementing an integer, which normally involves a read, an add, and a write. If one thread reads the old value, then another thread reads, adds, and writes the new value, and then the first thread adds and writes its new value, both believe they have incremented the variable, yet instead of being incremented twice it was only incremented once. This needs either a lock or an atomic increment primitive to ensure that the read/modify/write cycle cannot be interrupted.
There are also atomic test-and-set primitives: you read a value, do some math on it, then try to write it back, but the write only succeeds if the variable still holds the original value. If another thread changed it since you read it, the test-and-set fails; you then discard your new value and start over, reading the value the other thread set and trying the test-and-set again.
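As a userspace illustration, a sketch using C11 atomics (the kernel has its own atomic_t and cmpxchg primitives for the same purpose):

    #include <stdatomic.h>

    atomic_int counter;

    /* Atomic increment: the read-modify-write cannot be torn or interleaved. */
    void increment(void) {
        atomic_fetch_add(&counter, 1);
    }

    /* Compare-and-swap loop: if another thread changed the value since we
     * read it, the exchange fails, `old` is refreshed, and we retry. */
    void double_it(void) {
        int old = atomic_load(&counter);
        while (!atomic_compare_exchange_weak(&counter, &old, old * 2)) {
            /* retry with the freshly observed value in `old` */
        }
    }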
Pointers are really just integers, so if you set up a data structure and then store a pointer to it where another thread can find it, you don't need a lock, as long as you fully set up the structure before you store its address in the pointer. Another thread reading the pointer (it must make sure to read the pointer only once, i.e. by storing it in a local variable and using only that to refer to the structure from then on) will see either the new structure or the old one, but never an intermediate state. If most threads only read the structure via the pointer, and any that want to write do so either with a lock or an atomic test-and-set of the pointer, this is sufficient. Any time you want to modify any member of the structure, though, you have to copy it to a new one, change the new one, then update the pointer. This is essentially how the kernel's RCU (read-copy-update) mechanism works.
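A sketch of that publication pattern using C11 atomics (hypothetical types; inside the kernel you would use rcu_assign_pointer()/rcu_dereference(), which supply the required memory ordering):

    #include <stdatomic.h>
    #include <stdlib.h>

    struct config { int a, b; };

    /* The shared pointer that readers follow. */
    _Atomic(struct config *) current_config;

    void publish(int a, int b) {
        struct config *c = malloc(sizeof *c);
        c->a = a;            /* fully initialize first... */
        c->b = b;
        /* ...then publish with release ordering so readers can never
         * observe a half-built structure. */
        atomic_store_explicit(&current_config, c, memory_order_release);
    }

    void reader(void) {
        /* Read the pointer once; use only this local copy afterwards. */
        struct config *c =
            atomic_load_explicit(&current_config, memory_order_acquire);
        if (c) { /* c->a and c->b form a consistent snapshot */ }
    }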
Ideally, during design you should enumerate all the resources in your system, the related threads, and the communication and sharing mechanisms. Determining the following for every resource, and maintaining a proper checklist whenever a change is made, can be of great help:
The duration for which the resource will be busy (utilization of the resource) and the type of lock
The number of tasks queued on that particular resource (load) and their priority
The type of communication or sharing mechanism related to the resource
Error conditions related to the resource
If possible, it is better to have a flow diagram depicting the resources, utilization, locks, load, communication/sharing mechanisms, and errors.
This process can help you determine the missing scenarios/unknowns and the critical sections, and also identify bottlenecks.
On top of the above process, you may also need tools that can help with testing and further analysis to rule out any hidden problems:
Helgrind - a Valgrind tool for detecting synchronisation errors. It can help identify data races and synchronization issues due to improper locking, lock orderings that can cause deadlocks, and improper POSIX thread API usage that can have later impacts. Refer: http://valgrind.org/docs/manual/hg-manual.html
Locksmith - for determining common locking errors that may arise at runtime or that may cause deadlocks.
ThreadSanitizer - for detecting race conditions. It displays all accesses and the locks involved.
Sparse can help list the locks acquired and released by a function, and identify issues such as mixing pointers to user address space with pointers to kernel address space.
Lockdep - the kernel's lock validator, for debugging lock usage.
iotop - for determining the current I/O usage by processes or threads, by monitoring the I/O usage information output by the kernel.
LTTng - for tracing possible race conditions and interrupt cascades (a successor to LTT, combining kprobes, tracepoint, and perf functionality).
Ftrace - a Linux kernel internal tracer for analysing and debugging latency and performance issues.
lsof and fuser can be handy in determining which processes hold locks and what kind of locks.
Profiling can help in determining where exactly the kernel is spending its time; this can be done with tools like perf and OProfile.
strace can intercept and record the system calls made by a process and the signals it receives, showing the order of events and all the return/resumption paths of calls.
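For the userspace tools above, here is a sketch of the classic bug they are built to flag; compiling with gcc -fsanitize=thread -g -o race race.c -lpthread makes ThreadSanitizer report the race, and the plain binary can likewise be run under valgrind --tool=helgrind:

    #include <pthread.h>
    #include <stdio.h>

    /* Shared counter with no locking: a textbook data race. */
    long counter = 0;

    void *worker(void *arg) {
        for (int i = 0; i < 100000; i++)
            counter++;           /* unsynchronized read-modify-write */
        return NULL;
    }

    int main(void) {
        pthread_t a, b;
        pthread_create(&a, NULL, worker, NULL);
        pthread_create(&b, NULL, worker, NULL);
        pthread_join(a, NULL);
        pthread_join(b, NULL);
        /* Usually prints less than 200000; TSan/Helgrind flag the race. */
        printf("%ld\n", counter);
        return 0;
    }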

Why can a signal handler not run during system calls?

In Linux a process that is waiting for IO can be in either of the states TASK_INTERRUPTIBLE or TASK_UNINTERRUPTIBLE.
The latter, for example, is the case when the process waits until a read from a regular file completes. If this read takes very long, or forever (for whatever reason), it leads to the problem that the process cannot receive any signals during that time.
The former is the case when, for example, one waits for a read from a socket (which may take an unbounded amount of time). If such a read call gets interrupted, however, the programmer/user must handle the errno == EINTR condition correctly, which they might forget.
My question is: wouldn't it be possible to allow all system calls to be interrupted by signals, and moreover, in such a way that after the signal handler has run, the original system call simply continues its business? Wouldn't this solve both problems? If it is not possible, why?
The two halves of the question are related, but I may have digested too much spiked eggnog to determine whether the relationship here is one-to-one or one-to-several.
On the kernel side, I think a TASK_KILLABLE state was added a couple of years back. The consequence is that while a process is blocked in a system call (for whatever reason), if the kernel encounters a signal that is going to kill the process anyway, it lets nature take its course. That reduces the chances of a process being permanently stuck in an uninterruptible state. What progress has been made in changing individual system calls to avail themselves of that option, I do not know.
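The kernel-side building block for this is wait_event_killable(); a sketch of how a driver's blocking path might opt in (hypothetical names):

    #include <linux/wait.h>
    #include <linux/errno.h>

    static DECLARE_WAIT_QUEUE_HEAD(wq);
    static int data_ready;

    long my_blocking_path(void)
    {
        /* Sleeps like TASK_UNINTERRUPTIBLE, except that fatal signals
         * (ones that will kill the task anyway) wake the task up. */
        if (wait_event_killable(wq, data_ready))
            return -ERESTARTSYS;    /* a fatal signal was pending */
        return 0;
    }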
On the user side, SA_RESTART and its convenience-function cousin siginterrupt go some distance toward alleviating the application programmer's pain. But the fact that these are specified per signal and aren't honored by all system calls is a pretty good hint that there are reasons why a blanket interruption scheme like you propose can't be implemented, or at least not easily. Just thinking about the possible interactions of the matrix of signals, individual system calls, and calls with historically expected and backward-supported behavior (possibly multiple historical semantics, depending on their system bloodline!) is a bit dizzying. But maybe that is the eggnog.
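To illustrate the userspace half, a sketch of the manual EINTR handling that SA_RESTART would otherwise spare you, for the system calls that honor it (a toy program, not a recommendation):

    #include <errno.h>
    #include <signal.h>
    #include <unistd.h>

    static volatile sig_atomic_t got_signal;
    static void handler(int sig) { got_signal = 1; }

    int main(void) {
        struct sigaction sa = {0};
        sa.sa_handler = handler;
        sigemptyset(&sa.sa_mask);
        /* Without SA_RESTART in sa_flags, a slow read() interrupted by
         * SIGUSR1 fails with EINTR instead of resuming transparently. */
        sigaction(SIGUSR1, &sa, NULL);

        char buf[256];
        ssize_t n;
        do {
            n = read(STDIN_FILENO, buf, sizeof buf);
        } while (n == -1 && errno == EINTR);   /* manual restart */
        return 0;
    }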

When is POSIX thread cancellation not immediate?

POSIX specifies two thread cancellation types, PTHREAD_CANCEL_ASYNCHRONOUS and PTHREAD_CANCEL_DEFERRED (set by pthread_setcanceltype(3)), determining when pthread_cancel(3) should take effect. By my reading, the POSIX manual pages do not say much about these, but the Linux manual page says the following about PTHREAD_CANCEL_ASYNCHRONOUS:
The thread can be canceled at any time. (Typically, it will be canceled immediately upon receiving a cancellation request, but the system doesn't guarantee this.)
I am curious about the meaning of "the system doesn't guarantee this". I can easily imagine this happening on multicore/multi-CPU systems (before a context switch). But what about single-core systems:
Could a thread not be cancelled immediately when cancellation is requested, cancellation is enabled (pthread_setcancelstate(3)), and the cancel type is set to PTHREAD_CANCEL_ASYNCHRONOUS?
If yes, under what conditions could this happen?
I am mainly curious about Linux (LinuxThreads / NPTL), but also, more generally, about the POSIX-compliant way of viewing this cancellation business.
Update/Clarification: The real practical concern here is the use of resources that are destroyed immediately after calling pthread_cancel(), where the targeted thread has cancellation enabled and the type set to PTHREAD_CANCEL_ASYNCHRONOUS. So the point really is: is there even a tiny possibility for the cancelled thread in this case to continue running normally after a context switch (even for a very short time)?
Thanks to Damon's answer, the question reduces to signal delivery and handling in relation to the next context switch.
Update 2: I answered my own question below to point out that this is the wrong concern, and that the underlying program design should be addressed at a fundamentally different conceptual level. I hope this "wrong" question is useful for others wondering about the mysteries of asynchronous cancellation.
The meaning is just what it says: it's not guaranteed to happen instantly. The reason is that a certain "liberty" in implementation details is needed and accounted for in the standard.
For example, under Linux/NPTL, cancellation is implemented by sending signal nr. 32. The thread is cancelled when the signal is received, which usually happens at the next kernel-to-user switch, at the next interrupt, or at the end of the time slice (which may accidentally be immediately, but usually is not). A signal is never received while the thread isn't running, however. So the real catch here is that signals are not necessarily received immediately.
If you think about it, it isn't even possible to do it much differently. Since you can pthread_cleanup_push some handlers which the operating system must execute (it cannot just blast the thread out of existence!), the thread must necessarily run in order to be cancelled. There is no guarantee that any particular thread (including the one you want to cancel) is running at the exact time you cancel it, so there can be no guarantee that it is cancelled immediately.
Except of course, hypothetically, if the OS were implemented so as to block the calling thread, schedule the to-be-cancelled thread so it executes its handlers, and only unblock pthread_cancel afterwards. But since pthread_cancel isn't specified as blocking, this would be an utterly nasty surprise. It would also be somewhat unacceptable because it would interfere with execution time limits and scheduler fairness.
So, either your cancel state is "disable", and then nothing happens. Or it is "enable" and the cancel type is "deferred", and then the thread cancels when calling a function that is listed as a cancellation point in pthreads(7).
Or, it is "asynchronous", then as stated above, the OS will do "something" to cancel the thread as soon as it deems appropriate -- not at a precise, well-defined time, but "soon". In the case of Linux, by sending a signal.
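A toy sketch of those mechanics (hypothetical program; the busy loop contains no cancellation points, so only the asynchronous type allows it to be cancelled at all):

    #include <pthread.h>
    #include <stdio.h>
    #include <unistd.h>

    void *spin(void *arg) {
        /* Purely computational loop: it contains no cancellation points,
         * so deferred cancellation would never take effect here. */
        pthread_setcanceltype(PTHREAD_CANCEL_ASYNCHRONOUS, NULL);
        for (;;)
            ;                       /* busy-wait forever */
        return NULL;
    }

    int main(void) {
        pthread_t t;
        pthread_create(&t, NULL, spin, NULL);
        sleep(1);                   /* crude: let the thread set its cancel type */
        pthread_cancel(t);          /* request is delivered "soon", not instantly */
        pthread_join(t, NULL);      /* join observes the cancellation when it lands */
        puts("thread cancelled and joined");
        return 0;
    }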
If you need to wonder when asynchronous cancellation happens, you are doing something terribly wrong.
Following Standards: You are cutting the ground from under your own feet by deliberately creating, or allowing to exist, code whose correctness depends on assumptions about the platform (single core, a particular implementation, whatever). It is almost always better, if possible, to follow the standards (and to document clearly when that is not possible). The name PTHREAD_CANCEL_ASYNCHRONOUS itself suggests the meaning asynchronous, which is different from immediate or even almost immediate. The original poster specifically states single core, but why allow code to exist that will break in non-deterministic ways when it is run on a truly parallel machine (multiple cores or CPUs)? There it is practically impossible to guarantee immediateness: that would require stopping the other cores from running, or waiting for a context switch, or some other terrible hack which your OS/CPU is not going to support just to serve your unconventional wishes.
Asynchronous thread cancellation mode is not meant to guarantee immediate cancellation of a thread. Hence it is a terribly confusing hack to use it this way, even if it happens to work.
Async-Cancel-Safety: If you are concerned about the mechanics of asynchronous cancellation, that raises the suspicion that the threads in question (because of their lack of independence) are perhaps not purely computational, or are not written in an async-cancel-safe manner.
POSIX specifies only three functions as async-cancel-safe: pthread_cancel(3), pthread_setcancelstate(3), and pthread_setcanceltype(3) - see IEEE Std 1003.1, 2013 Edition, 2.9.5. This cancellation mode is only suitable for purely computational tasks that do not call library functions (other than purely computational ones); such code would not provide cancellation points if the threads were run in the default deferred cancellation mode. Hence the rationale for defining such a mode.
It is possible to write async-cancel-safe code by disabling cancellation during critical sections. But library writers (including POSIX library implementors) in general should not care about async-cancel-safety, for reasons of following general convention, avoiding complexity, and avoiding performance overhead. Because library writers should not care, you should never expect async-cancel-safety unless it is explicitly stated otherwise.
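A sketch of that disabling pattern (hypothetical function; the old state is saved and restored so the caller's setting survives):

    #include <pthread.h>
    #include <stdlib.h>

    void critical_work(void) {
        int old_state;
        /* Make the allocate/use/free sequence immune to asynchronous
         * cancellation, then restore the caller's cancelability. */
        pthread_setcancelstate(PTHREAD_CANCEL_DISABLE, &old_state);

        char *buf = malloc(4096);
        /* ... work that must not be torn apart mid-way ... */
        free(buf);

        pthread_setcancelstate(old_state, NULL);
    }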
If your code is not async-cancel-safe (because, for example, it calls other libraries, including the POSIX/standard C libraries, without temporarily disabling cancellation or changing the cancellation mode) and asynchronous cancellation occurs, you might leak resources (memory, etc.), leave behind inconsistent state and locked mutexes that deadlock other threads, and summon many other problems, imaginable and unimaginable. (If you are writing in C++, it seems you will have additional issues to deal with, due to POSIX thread cancellation's close association with exception handling.)

request_irq to be handled by a single CPU

I would like to ask if there is a way to register an interrupt handler so that only one CPU will handle the interrupt line.
The problem is that we have a function that can be called in both normal context and interrupt context. In this function we use irqs_disabled() to check the caller's context. If the caller context is interrupt, we switch the processing to polling mode (continuously checking the interrupt status register). Although irqs_disabled() tells us that local interrupts on the current CPU are disabled, the interrupt handler can still be invoked on other CPUs, where it clears the interrupt status register. The polling code then reads the wrong value from the interrupt status register and does the wrong processing.
You're doing it wrong. Don't limit your interrupt to being handled by a single CPU - instead use spin_lock_irqsave to protect the code path. This works both on the same CPU and across CPUs.
See http://www.mjmwired.net/kernel/Documentation/spinlocks.txt for the relevant API, and here is a nice article from Linux Journal that explains the usage: http://www.linuxjournal.com/article/5833
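A minimal sketch of that pattern (hypothetical device state; the same lock is taken in the handler and in process context, and the irqsave variant prevents a self-deadlock on the local CPU):

    #include <linux/spinlock.h>
    #include <linux/interrupt.h>
    #include <linux/types.h>

    /* Hypothetical device state shared between contexts. */
    static DEFINE_SPINLOCK(dev_lock);
    static u32 irq_status;

    static irqreturn_t my_handler(int irq, void *dev_id)
    {
        unsigned long flags;

        spin_lock_irqsave(&dev_lock, flags);
        irq_status = 0;                 /* e.g. read-and-clear hardware status */
        spin_unlock_irqrestore(&dev_lock, flags);
        return IRQ_HANDLED;
    }

    void process_context_path(void)
    {
        unsigned long flags;

        /* Same lock: excludes the handler running on other CPUs, and
         * disabling local interrupts prevents deadlock on this CPU. */
        spin_lock_irqsave(&dev_lock, flags);
        /* ... inspect/modify irq_status consistently ... */
        spin_unlock_irqrestore(&dev_lock, flags);
    }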
I've got no experience with ARM, but on x86 you can arrange for a particular interrupt to be delivered to only one processor via /proc/irq/<number>/smp_affinity - set from user space, replacing the number with the irq you care about - and this looks as if it's essentially generic. Note that the value you set is a bit mask, expressed in hex without a leading 0x. That is, if you want CPU 0, set it to 1; for CPU 1, set it to 2; and so on. Beware of a process called irqbalance, which uses this mechanism and might well override whatever you have done.
But why are you doing this? If you want to know whether you are called from an interrupt, there's an interface available named something like in_interrupt(). I've used it to avoid trying to call blocking functions from code that might be called from interrupt context.
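For illustration, a sketch of such a check (hypothetical helper, choosing a non-sleeping allocation when running in interrupt context):

    #include <linux/preempt.h>
    #include <linux/slab.h>
    #include <linux/types.h>

    /* Hypothetical helper callable from both process and interrupt context. */
    void *grab_buffer(size_t len)
    {
        /* GFP_KERNEL may sleep, which is forbidden in interrupt context;
         * fall back to the non-sleeping GFP_ATOMIC there. */
        gfp_t flags = in_interrupt() ? GFP_ATOMIC : GFP_KERNEL;

        return kmalloc(len, flags);
    }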
