I need to develop a Linux driver that generates a square wave with a period of about 1 ms, on a MIPS platform (not i386).
I have tried several approaches, but none of them succeeded:
Using a timer/hrtimer --> the period comes out around 12 ms and is unstable
Real-time add-on packages such as RTLinux/RTAI cannot be used, because they do not support MIPS
A kernel thread with an endless loop and udelay() --> it takes too much of the CPU's resources, so performance is not acceptable
Can you help me, or am I stuck? (Please help!)
Thank you.
The Unix way would be not to do that at all. Maybe in olden times, on single-task machines, you would have done it like this, but nowadays, if you don't have a hardware circuit that gives you the proper frequency, you may never succeed: hardware timers don't have the necessary resolution, and it can always happen that a more important task grabs your CPU time.
As FrankH said, the best solution involves relying on hardware. You should check your processor's reference manual to see if it has a timer.
I'll add this: if it happens to have an Output Compare or PWM subsystem (I'm not familiar with MIPS, but it's not at all uncommon in embedded devices) you can just write a few registers to set it all up, and then you don't need any more processor time.
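For what it's worth, here is a rough sketch of what such a setup could look like from a kernel module. The base address, register offsets and tick values below describe an imaginary PWM block, not any real SoC; the actual values have to come from your chip's reference manual.

    #include <linux/io.h>
    #include <linux/module.h>

    /* Hypothetical register layout -- replace with values from your SoC manual. */
    #define PWM_BASE     0x1f000400   /* physical base of the (imaginary) PWM block */
    #define PWM_PERIOD   0x00         /* period register, in timer ticks            */
    #define PWM_DUTY     0x04         /* duty-cycle register, in timer ticks        */
    #define PWM_CTRL     0x08         /* control register                           */
    #define PWM_CTRL_EN  0x01         /* enable bit                                 */

    static void __iomem *pwm_regs;

    static int __init pwm_square_init(void)
    {
        pwm_regs = ioremap(PWM_BASE, 0x10);
        if (!pwm_regs)
            return -ENOMEM;

        /* Assuming a 1 MHz timer clock: 1000 ticks = 1 ms period, 50% duty. */
        writel(1000, pwm_regs + PWM_PERIOD);
        writel(500,  pwm_regs + PWM_DUTY);
        writel(PWM_CTRL_EN, pwm_regs + PWM_CTRL);
        return 0;
    }

    static void __exit pwm_square_exit(void)
    {
        writel(0, pwm_regs + PWM_CTRL);
        iounmap(pwm_regs);
    }

    module_init(pwm_square_init);
    module_exit(pwm_square_exit);
    MODULE_LICENSE("GPL");

Once the peripheral is enabled, the waveform is produced entirely in hardware, so scheduler jitter and CPU load no longer affect it.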
It might be possible, but to get this from within Linux, the hardware must have certain characteristics:
you need a programmable timer device that can raise an interrupt at a sufficiently high priority that other activity in the Linux kernel (such as scheduling, or even other interrupts) won't preempt or block the interrupt handler, and with sufficient granularity/frequency to meet your signal-stability constraints
the "square wave" electrical line must also be programmable, and the operation that switches its state (register write? memory-mapped register write? special CPU instruction? ...?) must be guaranteed to be faster than the shortest cycle time used with the timer above (or else you could get "frequency moiré")
If that's the case, then your special timer device driver can toggle the line from within its high-priority interrupt handler and create the square wave. Since it's both interrupt-driven and separate from the normal timer interrupt sources/consumers (i.e. not shared, so no latency from possibly dispatching multiple timer events per interrupt), you've got a much better chance of sufficient precision.
Since all this (the existence of a separately programmable timer device, to start with) is hardware-specific, you need to start with the specs of your CPU/SoC/board and find out whether there are multiple independent timers available.
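Only as a sketch, assuming your SoC does have such a dedicated timer on its own interrupt line and the output pin is reachable through the legacy gpiolib calls (the IRQ and GPIO numbers below are placeholders), the core of such a driver might look like this:

    #include <linux/gpio.h>
    #include <linux/interrupt.h>
    #include <linux/module.h>

    #define WAVE_GPIO  42    /* placeholder: pin that drives the square wave */
    #define TIMER_IRQ  17    /* placeholder: IRQ of the dedicated hw timer   */

    static int level;

    /* Runs on every timer interrupt (every 500 us for a 1 ms square wave). */
    static irqreturn_t wave_timer_isr(int irq, void *dev_id)
    {
        level = !level;
        gpio_set_value(WAVE_GPIO, level);
        /* A real driver must also acknowledge the timer interrupt here,
         * by writing whatever register the SoC manual specifies. */
        return IRQ_HANDLED;
    }

    static int __init wave_init(void)
    {
        int ret = gpio_request_one(WAVE_GPIO, GPIOF_OUT_INIT_LOW, "square-wave");
        if (ret)
            return ret;
        /* The dedicated timer itself must be programmed for a 500 us period
         * using the SoC-specific registers before enabling the interrupt. */
        return request_irq(TIMER_IRQ, wave_timer_isr, 0, "square-wave", NULL);
    }

    static void __exit wave_exit(void)
    {
        free_irq(TIMER_IRQ, NULL);
        gpio_free(WAVE_GPIO);
    }

    module_init(wave_init);
    module_exit(wave_exit);
    MODULE_LICENSE("GPL");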
I want to create a simulation of an actual device on an x86 Linux kernel. Part of this will involve simulating timings as closely as I can. Based on some research, it seems I will need at least microsecond-resolution timing. I understand that on a non-realtime system it won't be possible to get perfect timing, but I don't need perfect, just as close as I can get, perhaps by hacking around with thread scheduling / preemption options.
What I actually want to do is perform an action every interval, i.e. run some code every X µs. I've been trying to research the best ways to do this from a kernel driver, as well as whether it's possible to do this reasonably accurately from user mode (keeping the above paragraph in mind). One of the first things that caught my eye was the HPET timer, which is programmable to generate interrupts based on programmable comparators. Unfortunately, it seems that on many chipsets it has been rather buggy in the past, and there's not much information on using it for anything other than obtaining a timestamp or using it as the main clock source. The Linux kernel provides an HPET driver that in the past seemed to provide both kernel and user mode interfaces, but seems to provide only a barely documented usermode interface in more recent kernel versions. I've also read about various other kernel functions and interfaces, such as the hrtimer interface and the various delay functions, though I'm having a bit of trouble understanding them and whether they are suited for my purpose.
Given my current use case, what are the best options I have for running recurring events at µs resolution from, say, a kernel driver? Obviously accuracy is probably my biggest criterion, but ease of use would be second.
Well, it's possible to achieve your accuracy in userspace -- clock_nanosleep is one ideal option, and it has both a relative and an absolute mode. Since clock_nanosleep is based on hrtimers in kernel mode, you may want to use hrtimers if you'd like to implement it in kernel space.
However, to make the timer work accurately, there are two IMPORTANT things worth mentioning.
You should set the timerslack of your process (either by writing a nonzero value in ns to /proc/self/timerslack_ns or via prctl(PR_SET_TIMERSLACK,...)). This value is treated as the 'tolerance' of the timer.
The CPU power management also matters here. The CPU has many different C-states, each of which has a different exit latency. So you need to configure your cpuidle module not to use C-states other than C0; e.g. for an Intel CPU you could simply write 1 to /sys/devices/system/cpu/cpu$c/cpuidle/state$s/disable to disable state $s of CPU $c. Or just add idle=poll to your kernel options to keep the CPU active (in C0) while the kernel is idle. NOTE that this significantly increases the power consumption of the machine and makes the cooling fans noisy.
You can get a timer with delays under 10 microseconds if the two things mentioned above are configured correctly. There is a trade-off between latency and power consumption that you have to make.
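As a minimal userspace sketch of the two points above (the 100 µs period is just an example), you can combine prctl(PR_SET_TIMERSLACK, 1) with clock_nanosleep() in absolute mode, so the wakeup error does not accumulate from one period to the next:

    #include <time.h>
    #include <sys/prctl.h>

    #define PERIOD_NS 100000L   /* 100 us period -- adjust to your needs */

    int main(void)
    {
        struct timespec next;

        /* 1 ns timer slack: ask the kernel not to coalesce our wakeups. */
        prctl(PR_SET_TIMERSLACK, 1UL, 0, 0, 0);

        clock_gettime(CLOCK_MONOTONIC, &next);
        for (;;) {
            next.tv_nsec += PERIOD_NS;
            if (next.tv_nsec >= 1000000000L) {
                next.tv_nsec -= 1000000000L;
                next.tv_sec++;
            }
            /* Absolute mode: sleep until a fixed point in time, not for a delta. */
            clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME, &next, NULL);
            /* ... do the periodic work here ... */
        }
    }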
I recently began studying asm and ran into a problem: I can't find a table of all interrupts for Linux or Windows. I looked in the Intel documentation, but didn't find this info. So, how do you find a table of all interrupts?
In general, you can't find a "table of all interrupts" without starting from the real hardware, because it depends on a ton of factors, including the set of expansion adapters, the exact chipset version, the processor version, and so on.
I'd assume x86 as the context. Intel defines the first 32 interrupt vectors (0-31) for use by the CPU itself - it can invoke them on internally defined exceptions. That clashes with the old style (known from various IBM PC descriptions) in which hardware interrupts were assigned to vectors 8-15, but it is defined as the OS's task to reassign all conflicting interrupts when entering protected mode. Then, the interrupt controllers (nowadays you can assume they are all at least APICs) are programmed to assign interrupt numbers from the remaining range to the hardware that requires them. Which numbers are assignable depends on the bus type and delivery manner:
MSI (message signaled interrupts) and MSI-X - the main techniques for PCIe - are assigned by APIC programming, typically one number per device and role (some devices will emit multiple interrupt types);
the old line-based style (classic PCI) - up to 4 interrupt lines per bus; so numbers may collide, and handlers have to iterate over all possible devices. In classic designs of the Pentium 1-3 era, they were assigned by the BIOS to the range 10-14 and then moved by the OS to some upper range.
On the system I am writing this on, the interrupt numbers assigned to hardware are 36-62 with some gaps; 17 of them are used by xhci_hcd.
To sum up: for CPU interrupts, read the CPU documentation. For the others, assume dynamic assignment and find the current assignment in the OS state using the respective API.
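On Linux, for instance, the current assignment can be read from /proc/interrupts (with cat, or from a trivial program like the sketch below):

    #include <stdio.h>

    int main(void)
    {
        char line[512];
        FILE *f = fopen("/proc/interrupts", "r");

        if (!f) {
            perror("/proc/interrupts");
            return 1;
        }
        /* One row per interrupt: number, per-CPU counts, controller, device name. */
        while (fgets(line, sizeof line, f))
            fputs(line, stdout);
        fclose(f);
        return 0;
    }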
So, I wrote code for Windows and thought that Linux also has a table or list of interrupts. But I was surprised to learn that Linux exposes only one interrupt for system calls (int 80h) and many syscalls. So I can look up the syscalls here:
https://man7.org/linux/man-pages/man2/syscalls.2.html
https://chromium.googlesource.com/chromiumos/docs/+/master/constants/syscalls.md
The syscall numbers also differ by processor type and OS architecture (x32 or x64). So I should use syscalls, and only one interrupt - int 80h.
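For example (just one way to do it), from C you can go through the syscall numbers and let the libc syscall() wrapper pick the right trap mechanism for the architecture:

    #include <sys/syscall.h>
    #include <unistd.h>

    int main(void)
    {
        const char msg[] = "hello via syscall\n";

        /* SYS_write is the architecture-specific number for write(2);
         * the wrapper uses int 80h on 32-bit x86 or the syscall
         * instruction on x86-64, as appropriate. */
        syscall(SYS_write, 1, msg, sizeof msg - 1);
        return 0;
    }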
I understood this, and now I want to help others.
Before I start: yes, I'm aware that the answer is architecture dependent - I'm just interested in a ballpark figure, in terms of orders of magnitude.
Is there an upper limit imposed by the Linux kernel on interrupt frequency?
Background: I want to interface with a camera module from within Linux. The module has a clocked parallel data output (8 bits, at ~650 kHz), which I want to read data from and store in a buffer for access through, e.g., /dev/camera.
I have a basic driver written, and it is monitoring the appropriate interrupt line. If I leave a wire hanging off the interrupt pin, I get interrupts from white noise. However, if I hook up a higher-frequency signal (currently ~250 kHz from a 555 timer), then no interrupts are triggered. (I've confirmed this with /proc/interrupts.)
My thinking is that this could either be the GPIO module on the processor not being able to deal with such high frequencies (which would be silly - that's not particularly high), or it could be a kernel issue. What do people think?
Look at it this way. Modern CPUs execute around 10^9 instructions per second.
In order to handle an interrupt you need to execute some 100-1000 instructions (save the context, do the I/O, signal the end of interrupt handling, restore the context). That gives you some 10^6 - 10^7 interrupts per second max.
If you spend all the time in handling interrupts, then nothing is left for the rest of the system and programs.
So, think of some 10^5 interrupts/second (100 kHz) as being the maximum practical interrupt rate.
There may be other limitations imposed by the circuitry and I'm not too familiar with this aspect. But it's unlikely for the kernel to somehow explicitly limit the interrupt rate. I see no good reason for it and I don't think it's something that can be easily done either.
Now, there are things like DMA that let you take an interrupt not on every byte of input/output data, but per buffer of several kilobytes or even megabytes. E.g. you prepare your data for output in a memory buffer and tell the DMA controller that it can now send it out from the buffer. When done, it will trigger an interrupt signalling the completion of the transfer and you'll be able to initiate another one. It works the same in the other direction: you get an interrupt when the entire buffer is filled with input data.
I think you may be facing a hardware limitation if you can receive interrupts at lower rates only.
I was going through Stack Overflow threads on the various mechanisms for computing the CPU time of a process.
How is clock() internally implemented? Does it use rdtsc()? (If that's the case, then it is sensitive to migration between cores.)
Also, how is getrusage() implemented? Does it also depend on the TSC?
Thanks in advance
The kernel keeps track of CPU utilization for processes in units of ticks.
Both clock() and getrusage() are based on these.
Ticks are accumulated for processes by the kernel using a sampling method: the kernel receives a hardware clock interrupt and executes the clock handler, which adds the tick to the currently running process. At least, this is how it worked the last time I looked.
So, rdtsc does not come into play at all - which is a good thing, since rdtsc does not measure accurately across CPUs.
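As a quick illustration (just a sketch), here is how you can read both values side by side after doing some work:

    #include <stdio.h>
    #include <time.h>
    #include <sys/resource.h>

    int main(void)
    {
        struct rusage ru;
        volatile unsigned long x = 0;

        /* Burn some CPU time so there is something to measure. */
        for (unsigned long i = 0; i < 100000000UL; i++)
            x += i;

        clock_t c = clock();            /* CPU time in CLOCKS_PER_SEC units */
        getrusage(RUSAGE_SELF, &ru);    /* user + system time as timevals   */

        printf("clock():     %.3f s\n", (double)c / CLOCKS_PER_SEC);
        printf("getrusage(): %ld.%06ld s user, %ld.%06ld s sys\n",
               (long)ru.ru_utime.tv_sec, (long)ru.ru_utime.tv_usec,
               (long)ru.ru_stime.tv_sec, (long)ru.ru_stime.tv_usec);
        return 0;
    }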
You could easily glance at some libc code. Here is the time/ directory of musl-libc.
In several libraries, some low-level timing syscalls go through the vDSO to avoid paying the cost of a real syscall (from user space to the kernel and back), and so somehow use RDTSC.
But I am surprised that you ask. If it is curiosity, just study the source code of a free-software implementation. Otherwise, trust the specifications and the implementations.
The gory details can be complex, since they are implementation- and system-specific. The real implementation could be dynamically tuned at run time (e.g. through the vDSO set up by the kernel).
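If you want to check for yourself whether a call actually avoids the kernel round trip, one crude approach is to time it in a tight loop; a vDSO-backed clock_gettime(CLOCK_MONOTONIC) typically costs only tens of nanoseconds per call, far less than a genuine syscall:

    #include <stdio.h>
    #include <time.h>

    int main(void)
    {
        struct timespec start, end, t;
        const long iters = 10000000L;

        clock_gettime(CLOCK_MONOTONIC, &start);
        for (long i = 0; i < iters; i++)
            clock_gettime(CLOCK_MONOTONIC, &t);   /* usually served by the vDSO */
        clock_gettime(CLOCK_MONOTONIC, &end);

        double ns = (end.tv_sec - start.tv_sec) * 1e9 +
                    (end.tv_nsec - start.tv_nsec);
        printf("%.1f ns per clock_gettime() call\n", ns / iters);
        return 0;
    }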
I have a Fibre Optic link, with a proprietary Device Driver.
The link goes into a PCIe card. Running on a RHEL 5.2 (2.6.18-128~)
I have mmap'ed the interface on the card for setup and FIFO access etc, and these read/writes take a few µs to complete, so all good there.
But of course I cannot use this for interrupts, so I have to use the kernel module provided, with its user-space library interface.
WaitForInterrupt(); // API lib interface to kernel module
// Interrupt occurs and am returned to my code in user space
time = CurrentTime() - LatchedTime(); // time to get to here
It takes around 70µs to return from WaitForInterrupt(). (The time the interrupt is raised is latched in the firmware; I read this, which as I say above takes ~2µs, and compare it against the current time in the firmware.)
What are the expected latencies between an interrupt occurring and the user-space API wait call returning?
How long do network or other high-speed interfaces take?
500ms is many orders of magnitude larger than what a simple switch between userspace and kernel takes, but as someone mentioned in the comments, Linux is not a real-time OS, so there's no guarantee such 500ms hiccups won't show up now and then.
It's quite impossible to tell what the culprit is; the device driver could simply be trying to bundle up data to be more efficient.
That said, we've had endless trouble with some custom cards and interactions with both APIC and ACPI, requiring a delicate balance of BIOS settings, which card goes into which PCI slot, and whether a particular video card screws up everything - likely caused by a dubious driver interacting with a more or less buggy BIOS/video card.
If you're able to reliably exceed 500us on a system that's not heavily loaded, I think you're looking at a bad driver implementation (or its userspace wrapper/counterpart).
In my experience the latency to wake a user thread on interrupt should be less than 10us, though (as others have said) Linux provides no latency guarantees.
If you have a recent kernel, you can use the perf sched tool to measure the latency, and see where the time is being used. (500us does sound a tad on the high side, depending on your processor, how many tasks are running, ...)