I'm currently using a SAMA5D31-EK board running Linux 3.10.0+ to control some hardware devices. I'm using the GPIOs, I2C, PWM and UARTs available on that board. Some devices are controlled with just a GPIO line, while others need a UART, a PWM and 3 GPIOs. So far I'm using a userspace program to control those hardware devices: basically a stepper motor, an ADC and an alphanumeric LCD display.
What would be the advantages of developing a kernel device driver to control those devices? So far (using a userspace program) the only limitation I've found is speed: since I have to bit-bang some GPIOs, the result is a bit slow.
I assume that you have the platform-specific drivers available for the I2C/GPIO/PWM/UART interfaces on your board (they should be part of the BSP, the board support package).
It is just that you don't want to use the kernel device driver framework and want to do things from user space. I have been in this situation, so I know how tempting it can be, especially if you are not well versed in kernel device drivers.
a. SPEED: You mentioned it, but you probably didn't grasp the reason completely.
The speed gain comes from avoiding the context switching between the kernel and the user-space process. Here is an example:
/* A loop in kernel code which reads a register 100 times */
for (i = 0; i < 100; i++)
{
    __kernel_read_reg(...);
}
/* A loop in user-space code which reads a register 100 times */
for (i = 0; i < 100; i++)
{
    __user_read_reg(...);
}
Functionally, both *_read_reg() calls do the same thing. Assuming that __user_read_reg() goes through a typical system call, it has to switch from user space into the kernel and back for every single __user_read_reg(...), which is too costly.
You may argue, "We can mmap() the hardware registers and avoid system call for such operations".
Of course, you could do that, but the point I was making is:
What is close to hardware (for example: a register read or write or handling an interrupt) should be done as fast as possible. Latencies involved in context-switching will impact the performance.
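For completeness, the mmap() approach mentioned above could look roughly like the sketch below. The names map_regs(), read_reg_n() and unmap_regs() are made up for illustration. On a real board, path would typically be /dev/mem and base the page-aligned physical address of the peripheral from the SoC datasheet; both are assumptions here, and access to /dev/mem usually requires root and may be restricted by the kernel configuration.

```c
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Map `len` bytes starting at offset `base` of the file or device at `path`.
 * `base` must be page-aligned. Returns NULL on failure.
 * (Hypothetical helper, not part of any library.)
 */
volatile uint32_t *map_regs(const char *path, off_t base, size_t len)
{
    int fd = open(path, O_RDWR | O_SYNC);
    if (fd < 0)
        return NULL;

    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, base);
    close(fd); /* the mapping stays valid after the descriptor is closed */
    return p == MAP_FAILED ? NULL : (volatile uint32_t *)p;
}

/* Read the 32-bit word at `index` `n` times and return the last value. */
uint32_t read_reg_n(volatile uint32_t *regs, size_t index, int n)
{
    uint32_t v = 0;
    for (int i = 0; i < n; i++)
        v = regs[index]; /* a plain load: no system call per access */
    return v;
}

/* Release a mapping obtained from map_regs(). */
void unmap_regs(volatile uint32_t *regs, size_t len)
{
    munmap((void *)regs, len);
}
```

After the one-time mmap() call, every register access is an ordinary load or store, so the per-access system call cost disappears. What this does not give you is interrupt handling or protection against other code touching the same registers.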
b. Existing/Tested/Well-built subsystems:
If you look at the I2C subsystem in the Linux kernel, it provides a well-tested, robust framework which can easily be reused. You don't have to write a full I2C subsystem (handling all device types, speeds, various configurations etc.) in user space.
"Re-using" what is already done could be one big advantage of going for kernel device drivers.
c. Move from Polling-based approach to Interrupt-based mechanism
If you are not handling interrupts in a kernel driver, you must be using some sort of polling mechanism in the user-space process. Depending on the system, that might not be a very reliable way of handling hardware changes. It is definitely not accurate or reliable for fast devices.
An interrupt-based mechanism, where you handle the critical changes as fast as possible (in hardware interrupt context) and move the non-critical workload either to user space or to some other kernel mechanism, is in general a more reliable way of handling devices.
Of course, there could be several more arguments and counter-arguments besides the above three.
Another thread which might be of interest to you is here:
Userspace vs kernel space driver
I am doing a project where GPIO switching needs to be fast, around 40 MHz. I checked the "sysfs" interface and the switching speed is around 300 Hz, which is not at all acceptable in our case.
So, in some forums I read that using /dev/mem access will increase the switching speed. I used /dev/mem and achieved a speed of 30-32 MHz, which is OK for us. Now the project is going for field testing. Will this cause any issues, such as a kernel crash, in the long run?
As far as I know, the i.MX6 does not have atomic pin set/reset functionality, therefore you must ensure that all GPIO output pins are controlled by your application: neither the kernel nor another process should ever attempt to change any output pin on the same GPIO controller. Reading input pins, or assigning some pins to other peripherals, should be OK, but always ensure that no bit-banging happens behind the scenes (e.g. some SPI drivers think that they know better when to set or reset CS, and quietly set the CS pin to GPIO output, taking it away from the SPI peripheral).
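To see why a missing atomic set/reset register matters, here is a minimal sketch of what changing one pin looks like when only a single data register is available. The function names are made up, and data_reg stands for a pointer into an mmap()ed GPIO block; the register layout is an assumption for illustration, not taken from the i.MX6 reference manual.

```c
#include <stdint.h>

/* Set one output pin using a read-modify-write sequence. */
void gpio_set_rmw(volatile uint32_t *data_reg, unsigned int pin)
{
    uint32_t v = *data_reg;      /* 1. read the state of all 32 pins */
    v |= UINT32_C(1) << pin;     /* 2. change a single bit */
    *data_reg = v;               /* 3. write all 32 pins back */
}

/* Clear one output pin using the same three-step sequence. */
void gpio_clear_rmw(volatile uint32_t *data_reg, unsigned int pin)
{
    uint32_t v = *data_reg;
    v &= ~(UINT32_C(1) << pin);
    *data_reg = v;
}
```

If the kernel or another process changes a different pin of the same controller between step 1 and step 3, step 3 silently writes the old value of that pin back. That is the reason for the "one owner per GPIO controller" rule above.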
You can sustain that output speed as long as your process is not interrupted. If you don't disable interrupts, you will get glitches in the output. If you do, then the kernel scheduler and interrupt-driven hardware drivers stop working. On a dual or quad core system, it should be possible to reserve a core for exclusive use by your process, and let the rest of the system run on the other core(s). Don't just blindly disable interrupts, but use sched_setaffinity(2) and the isolcpus kernel parameter.
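A minimal sketch of the affinity part, assuming glibc on Linux. The helper names pin_to_cpu() and first_allowed_cpu() are made up for this example. In a real setup you would pass the number of the core that was isolated with the isolcpus kernel parameter; which core that is depends on your boot configuration.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* glibc needs this for CPU_SET() and sched_setaffinity() */
#endif
#include <sched.h>

/* Pin the calling thread to a single CPU. Returns 0 on success, -1 on error. */
int pin_to_cpu(int cpu)
{
    cpu_set_t set;

    CPU_ZERO(&set);
    CPU_SET(cpu, &set);
    return sched_setaffinity(0, sizeof(set), &set); /* pid 0 = caller */
}

/* Return the lowest-numbered CPU the caller is allowed to run on, or -1. */
int first_allowed_cpu(void)
{
    cpu_set_t set;

    if (sched_getaffinity(0, sizeof(set), &set) != 0)
        return -1;
    for (int cpu = 0; cpu < CPU_SETSIZE; cpu++)
        if (CPU_ISSET(cpu, &set))
            return cpu;
    return -1;
}
```

Pinning alone only keeps your process on that core. Keeping everything else off the core is the job of isolcpus, and interrupts can still be routed there unless their affinity is changed as well.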
I have been reading "Linux Device Drivers" by Jonathan Corbet. I have some questions that I want to know:
What are the main differences between a user-space driver and a kernel driver?
What are the limitations of both of them?
Why are user-space drivers commonly used and preferred nowadays over kernel drivers?
What are the main differences between a user-space driver and a kernel driver?
User space drivers run in user space. Kernel drivers run in kernel space.
What are the limitations of both of them?
The kernel driver can do anything the kernel can, so you could say it has no limitations. But kernel drivers are much harder to "prove correct" and debug. It's all too easy to introduce race conditions, or to use a kernel function in the wrong context or with the wrong locking. Things will appear to work for a while, but cause problems (including crashing the whole system) down the road. Drivers must also be wary when reading any input (both from the device and from userspace) because invalid data can sometimes cause crashes.
A user-space driver usually needs a small shim in the kernel to do its bidding. Usually, that 'shim' provides a simpler API. For example, the FUSE layer lets people write file systems in any language. They can be mounted, read/written, then unmounted. The shim must also protect the kernel against all invalid input.
User-space drivers have lots of limitations. For example, the kernel reserves some memory for use during emergencies, but that is not available to user space. Under memory pressure, the kernel may kill user-space programs (via the OOM killer), but it never kills kernel threads. User-space programs may be swapped out, which could lead to your device being unavailable for several seconds. (Kernel code cannot be swapped out.) Running code in user space requires several context switches. These waste a "lot" of CPU time. If your device is a 300 baud modem, nobody will notice. But if it's a gigabit Ethernet card, and every packet has to go to your userspace driver before it gets to the real user, the system will have major bottlenecks.
User space programs are also "harder" to use because you have to install that user-space software, which often has many library dependencies. Kernel modules "just work".
Why are user-space drivers commonly used and preferred nowadays over kernel drivers?
The question is "Does this complexity really need to be in the kernel?"
I used to work for a company that made USB dongles that talked a particular protocol. We could have written a full kernel driver, but instead just wrote our program on top of libUSB.
The advantages: The program was portable between Linux, Mac, Win. No worrying about our code vs the GPL.
The disadvantages: If the device needed to send data to the PC and get a response quickly, there is no guarantee that would happen. For example, if we needed a real-time control loop on the PC, it would be harder to have bounded response times. (Maybe not entirely impossible on Linux.)
If there is a way to do it in userspace, I would try that first. Only if there are significant performance bottlenecks, or significant complexity in keeping it in userspace would you move it. Even then, consider the "shim" approach, and/or the "emulator" approach (where your kernel module makes your device look like a serial port or a block device.)
On the other hand, if there are already several kernel modules similar to what you want, then start there.
I am going to write a driver for a PCIe-based serial I/O card in Linux.
As far as I know, the device provides its interrupt line through the configuration space, and with the IRQF_SHARED flag we can share the corresponding IRQ line with other handlers.
But my confusion is: how can I know whether a line is shared or not?
For a device driver, there is no useful way (and especially no portable way) to find out if the interrupt line is actually shared, and this could change at any time by loading/unloading other drivers.
PCI drivers must always assume that their interrupt might be shared.
Note: PCI Express devices are supposed to support MSIs (message-signaled interrupts), which are never shared.
Your driver should enable MSIs if at all possible.
However, it is not guaranteed that the system supports them.
Kernel-assisted probing
The Linux kernel offers a low-level facility for probing the interrupt number. It works for only nonshared interrupts, but most hardware that is capable of working in a shared interrupt mode provides better ways of finding the configured interrupt number anyway. The facility consists of two functions, declared in <linux/interrupt.h> (which also describes the probing machinery):
unsigned long probe_irq_on(void);
This function returns a bit mask of unassigned interrupts. The driver must preserve the returned bit mask, and pass it to probe_irq_off later. After this call, the driver should arrange for its device to generate at least one interrupt.
int probe_irq_off(unsigned long);
After the device has requested an interrupt, the driver calls this function, passing as its argument the bit mask previously returned by probe_irq_on. probe_irq_off returns the number of the interrupt that was issued after “probe_on.” If no interrupts occurred, 0 is returned (therefore, IRQ 0 can’t be probed for, but no custom device can use it on any of the supported architectures anyway). If more than one interrupt occurred (ambiguous detection), probe_irq_off returns a negative value.
The programmer should be careful to enable interrupts on the device after the call to probe_irq_on and to disable them before calling probe_irq_off. Additionally, you must remember to service the pending interrupt in your device after probe_irq_off.
Run cat /proc/interrupts. In the rightmost column of the output you should see your device on one of the interrupt lines. If it's shared, you'll see other devices assigned to that interrupt as well.
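If you want to do the same check from a program, a rough sketch follows. It relies on the kernel separating the handler names on a shared line with ", ". The function names and the device name my_serial used in the usage note are made up. As noted in the other answer, the result is only a snapshot and can change whenever another driver is loaded or unloaded.

```c
#include <stdio.h>
#include <string.h>

/*
 * Inspect one line of /proc/interrupts.
 * Returns -1 if `device` is not named on the line, otherwise the number of
 * other handlers registered on the same interrupt (0 means not shared).
 * Note: this is a plain substring match, so pick a distinctive name.
 */
int irq_sharers(const char *line, const char *device)
{
    int others = 0;

    if (strstr(line, device) == NULL)
        return -1;
    /* Each additional handler on the line is introduced by ", ". */
    for (const char *p = strstr(line, ", "); p != NULL; p = strstr(p + 2, ", "))
        others++;
    return others;
}

/* Scan a file in /proc/interrupts format for the first line naming `device`. */
int irq_sharers_in_file(const char *path, const char *device)
{
    char line[4096]; /* may need to be larger on machines with many CPUs */
    int result = -1;
    FILE *f = fopen(path, "r");

    if (f == NULL)
        return -1;
    while (result < 0 && fgets(line, sizeof(line), f) != NULL)
        result = irq_sharers(line, device);
    fclose(f);
    return result;
}
```

For example, irq_sharers_in_file("/proc/interrupts", "my_serial") would return 0 if the hypothetical my_serial driver currently has its line to itself.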
I am looking to write a PWM driver. I know that there are two ways we can control a hardware driver:
User space driver.
Kernel space driver
In general (not just for a PWM driver), if we have to decide between a user-space and a kernel-space driver, what factors do we have to take into consideration apart from these?
A user-space driver can directly mmap() /dev/mem into its virtual address space and needs no context switching.
A user-space driver cannot implement interrupt handlers (it has to poll for interrupts).
A user-space driver cannot perform DMA (as DMA-capable memory can only be allocated from kernel space).
Of the three factors that you have listed, only the first one is actually correct. As for the rest: not really. It is possible for user-space code to perform DMA operations, no problem with that. There are many hardware appliance companies who employ this technique in their products. It is also possible to have an interrupt-driven user-space application, even when all of the I/O is done with a full kernel bypass. Of course, it is not as easy as simply doing an mmap() on /dev/mem.
You would have to have a minimal portion of your driver in the kernel — that is needed in order to provide your user space with a bare minimum that it needs from the kernel (because if you think about it — /dev/mem is also backed up by a character device driver).
For DMA, it is actually too darn easy — all you have to do is to handle mmap request and map a DMA buffer into the user space. For interrupts — it is a little bit more tricky, the interrupt must be handled by the kernel no matter what, however, the kernel may not do any work and just wake up the process that calls, say, epoll_wait(). Another approach is to deliver a signal to the process as done by DOSEMU, but that is very slow and is not recommended.
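The user-space half of that interrupt scheme can be sketched as below. The helper name wait_readable() is made up, and fd stands for whatever descriptor your minimal kernel part marks readable when the interrupt fires; that is an assumption about how your driver is written.

```c
#include <sys/epoll.h>
#include <unistd.h>

/*
 * Sleep until `fd` becomes readable or `timeout_ms` elapses.
 * Returns 1 when readable, 0 on timeout, -1 on error.
 * A real event loop would create the epoll instance once and reuse it.
 */
int wait_readable(int fd, int timeout_ms)
{
    struct epoll_event ev = { .events = EPOLLIN, .data.fd = fd };
    struct epoll_event out;
    int n = -1;
    int ep = epoll_create1(0);

    if (ep < 0)
        return -1;
    if (epoll_ctl(ep, EPOLL_CTL_ADD, fd, &ev) == 0)
        n = epoll_wait(ep, &out, 1, timeout_ms); /* the process sleeps here */
    close(ep);
    return n < 0 ? -1 : (n > 0);
}
```

The process consumes no CPU while it sleeps in epoll_wait(). The cost you pay is the wakeup latency between the kernel handler running and your process being scheduled again.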
As for your actual question, one factor that you should take into consideration is resource sharing. As long as you don't have to share a device across multiple applications and there is nothing that you cannot do in user space, go for user space. You will probably save tons of time during the development cycle, as writing user-space code is extremely easy. When, however, two or more applications need to share the device (or its resources), chances are that you will spend a tremendous amount of time making it possible: just imagine multiple processes forking, crashing, mapping (the same?) memory concurrently etc. And after all, IPC is generally done through the kernel, so if the applications need to start "talking" to each other, the performance might degrade greatly. This is still done in real life for certain performance-critical applications, though, but I don't want to go into those details.
Another factor is the kernel infrastructure. Let's say you want to write a network device driver. It is not a problem to do that in user space. However, if you do, then you'd need to write a full network stack too, as it won't be possible to use Linux's default one that lives in the kernel.
I'd say go for user space if it is possible and the amount of effort to make things work is less than writing a kernel driver, and keeping in mind that one day it might be necessary to move code into the kernel. In fact, this is a common practice to have the same code being compiled for both user space and kernel space depending on whether some macro is defined or not, because testing in user space is a lot more pleasant.
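A small sketch of that dual-build practice, keyed on the __KERNEL__ macro that kernel builds define. The drv_log() wrapper and the adc_raw_to_mv() helper are made up for illustration. The point is only that the logic in the middle is identical in both builds.

```c
#ifdef __KERNEL__
#include <linux/kernel.h>
#define drv_log(fmt, ...) printk(KERN_INFO fmt, ##__VA_ARGS__)
#else
#include <stdio.h>
#define drv_log(fmt, ...) printf(fmt, ##__VA_ARGS__)
#endif

/*
 * Convert a raw ADC reading to millivolts using integer math only, since
 * floating point is generally avoided in kernel code.
 * `bits` is the converter resolution (1..31). Assumes raw * vref_mv fits
 * in 32 bits, which holds for e.g. a 12-bit ADC with a 3300 mV reference.
 */
unsigned int adc_raw_to_mv(unsigned int raw, unsigned int vref_mv,
                           unsigned int bits)
{
    unsigned int full_scale = (1u << bits) - 1u;
    unsigned int mv = (raw * vref_mv) / full_scale;

    drv_log("adc: raw=%u -> %u mV\n", raw, mv);
    return mv;
}
```

Built as an ordinary program, the function can be unit tested and run under gdb or valgrind. Built as part of a module, the same source logs through printk() instead.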
Another consideration: it is far easier to debug user-space drivers. You can use gdb, valgrind, etc. Heck, you don't even have to write your driver in C.
There's a third option beyond just user space or kernel space drivers: some of both. You can do just the kernel-space-only stuff in a kernel driver and do everything else in user space. You might not even have to write the kernel space driver if you use the Linux UIO driver framework (see https://www.kernel.org/doc/html/latest/driver-api/uio-howto.html).
I've had luck writing a DMA-capable driver almost completely in user space. UIO provides the infrastructure so you can just read/select/epoll on a file to wait on an interrupt.
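For reference, waiting on a UIO interrupt from user space comes down to a blocking 4-byte read that returns the running interrupt count. A minimal sketch, where the function names are made up and uio_fd is assumed to be an open descriptor for your /dev/uioX node:

```c
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Block until the UIO device signals an interrupt.
 * On success stores the total interrupt count in *count and returns 0;
 * returns -1 on a failed or short read.
 */
int uio_wait_irq(int uio_fd, int32_t *count)
{
    int32_t n;

    if (read(uio_fd, &n, sizeof(n)) != (ssize_t)sizeof(n))
        return -1;
    *count = n;
    return 0;
}

/*
 * Ask the kernel side to re-enable the interrupt by writing a 32-bit 1.
 * This only works for UIO drivers that implement the irqcontrol hook.
 */
int uio_enable_irq(int uio_fd)
{
    int32_t enable = 1;

    return write(uio_fd, &enable, sizeof(enable)) == (ssize_t)sizeof(enable)
               ? 0 : -1;
}
```

Comparing the count with the previous value tells you whether you missed any interrupts while you were busy handling the last one.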
You should be cognizant of the security implications of programming the DMA descriptors from user space: unless you have some protection in the device itself or an IOMMU, the user space driver can cause the device to read from or write to any address in physical memory.
In Linux, what are the options for handling device interrupts in user space code rather than in kernel space?
Experience shows it is possible to write good and stable user-space drivers for almost any PCI adapter. It just requires some sophistication and a small proxying layer in the kernel. UIO is a step in that direction, but if you want to correctly handle interrupts in user space then UIO might not be enough, for example if the device doesn't support the PCI spec's interrupt disable bit which UIO relies on.
Notice that process wakeup latencies are a few microseconds, so if your implementation requires very low latency then user space might be a drag on it.
If I were to implement a user-space driver, I would reduce the kernel ISR to just a "disable & ack & wakeup-userspace" operation, handle the interrupt inside the woken-up process, and then re-enable the interrupt (of course, by writing to mapped PCI memory from the userspace process).
There is the Userspace I/O system (UIO), but handling should still be done in kernelspace. OTOH, if you just need to notice the interrupt, you don't need the kernel part.
You may like to take a look at CHAPTER 10: Interrupt Handling from Linux Device Drivers, Third Edition book.
You have to trigger userland code indirectly.
The kernel ISR indicates the interrupt by writing a file, setting a register or signalling. The user-space application polls this and goes on with the appropriate code.
Edge cases: more or fewer interrupts than expected (timeout / too many interrupts per time interval).
Linux file abstraction is used to connect kernel and user space. This is performed by character devices and ioctl() calls. Some may prefer sysfs entries for this purpose.
This can look odd because event-triggered device notifications (interrupts) are hooked up to 'time-triggered' polling, but it is actually asynchronous blocking (read/select). Still, this raises some questions about performance.
So interrupts cannot be directly handled outside the kernel.
E.g. shared memory can be in user space and with some I/O permission settings addresses can be mapped, so U-I/O works, but not for direct interrupt handling.
I have found only one 'minority report' in topic vfio (http://lxr.free-electrons.com/source/Documentation/vfio.txt):
https://stackoverflow.com/a/21197797/5349798
Similar questions:
Running user thread in context of an interrupt in linux
Is it possible in linux to register a interrupt handler from any user-space program?
Linux Kernel: invoke call back function in user space from kernel space
Linux Interrupt vs. Polling
Linux user space PCI driver
How do I inform a user space application that the driver has received an interrupt in linux?