How is hardware context switching used/unused in Linux?

The old Intel x86 architecture provided context switching (in the form of the TSS) at the hardware level. But I have read that Linux long ago "abandoned" hardware context switching because it was less optimised, less flexible, and not available on all architectures.
What confuses me is how software (Linux) can control hardware operations (saving/restoring context). Linux can choose not to use the context set up by the hardware, but the hardware context switch would nevertheless happen (making the "optimisation" argument irrelevant).
Also, if Linux is not using the hardware context switch, how can the value of %eip (pointing to the next instruction in the user program) be saved and the kernel stack pointer restored by the kernel (and vice versa)?
I think the kernel would need some support from the hardware to save the user program's %eip and to switch the %esp register (from the user stack to the kernel stack) even before the interrupt service routine starts.
If this support is indeed provided by the hardware, then how is Linux not using hardware context switches?
Terribly confused!!!

Related

Logging and debugging unaligned accesses on Linux / aarch64

How can I log unaligned memory accesses on Linux / aarch64 (Cortex-a57)?
I understand there are two different things involved here:
Choosing to raise an interrupt from the CPU on an unaligned access (i.e. trapping unaligned memory accesses that the CPU would otherwise support at a performance cost)
Choosing how to handle these interrupts in Linux (log them / fire a SIGBUS / soft-emulate the unaligned access)
My problem is that, first, I do not know how to manage the CPU's control registers from my program (nor whether I should actually do that from a userspace application), and second, the /proc/cpu/alignment interface for managing unaligned accesses in Linux seems to be gone (I am using a 4.4.0 kernel); see the link below.
Managing unaligned accesses from the kernel:
https://www.kernel.org/doc/Documentation/arm/mem_alignment (likely out-of-date)
Related:
Does AArch64 support unaligned access?
You can't do this. Not with Linux, anyway.
Alignment faults for EL0 are governed by the SCTLR_EL1.A bit, but that also affects EL1. Thus even if you wrote a hacky kernel module to enable it (you obviously can't touch privileged system control registers directly from userspace), you're pretty much guaranteed that the kernel's going to panic as soon as the next network packet arrives. The arm64 kernel port relies on having the unaligned access capability provided by AArch64. It doesn't have the ARM port's /proc/cpu/alignment handler, because it doesn't have the legacy of pre-ARMv6 CPUs that didn't support unaligned access at all (well, in any usable fashion at least).
What you can do, though, is use perf tools to monitor any or all of Cortex-A57's microarchitectural PMU events 0x68, 0x69 or 0x6a, to count the unaligned-access-related events which your program triggers. There's no means to trap or debug individual accesses as there might be with the blunt instrument of alignment faults, but otherwise it's arguably more useful since it'll only count events attributable to your program.
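For example, perf stat -e r68 ./myprog counts raw event 0x68 for a single run. If you would rather count from inside the program itself, the same raw events can be requested with perf_event_open(2). Below is a minimal sketch assuming Cortex-A57's event 0x68 (the event numbers come from the TRM, so check them for your part); the error handling is deliberately skeletal:

    /* Minimal sketch: count one raw PMU event around a region of code
     * using perf_event_open(2). Event 0x68 is Cortex-A57's unaligned
     * load/store event; adjust for your CPU. */
    #define _GNU_SOURCE
    #include <stdio.h>
    #include <stdint.h>
    #include <string.h>
    #include <unistd.h>
    #include <sys/types.h>
    #include <sys/ioctl.h>
    #include <sys/syscall.h>
    #include <linux/perf_event.h>

    static long perf_event_open(struct perf_event_attr *attr, pid_t pid,
                                int cpu, int group_fd, unsigned long flags)
    {
        /* No glibc wrapper exists, so invoke the syscall directly. */
        return syscall(SYS_perf_event_open, attr, pid, cpu, group_fd, flags);
    }

    int main(void)
    {
        struct perf_event_attr attr;
        memset(&attr, 0, sizeof(attr));
        attr.type = PERF_TYPE_RAW;     /* raw microarchitectural event */
        attr.size = sizeof(attr);
        attr.config = 0x68;            /* Cortex-A57 unaligned-access event */
        attr.disabled = 1;
        attr.exclude_kernel = 1;       /* count user-space accesses only */

        int fd = perf_event_open(&attr, 0, -1, -1, 0); /* this process, any CPU */
        if (fd < 0) { perror("perf_event_open"); return 1; }

        ioctl(fd, PERF_EVENT_IOC_RESET, 0);
        ioctl(fd, PERF_EVENT_IOC_ENABLE, 0);

        /* ... run the code you want to measure here ... */

        ioctl(fd, PERF_EVENT_IOC_DISABLE, 0);
        uint64_t count;
        read(fd, &count, sizeof(count));
        printf("unaligned-access events: %llu\n", (unsigned long long)count);
        close(fd);
        return 0;
    }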

Difference between user-space driver and kernel driver [duplicate]

This question already has answers here:
Userspace vs kernel space driver
(2 answers)
Closed 5 years ago.
I have been reading "Linux Device Drivers" by Jonathan Corbet. I have some questions that I want to know:
What are the main differences between a user-space driver and a kernel driver?
What are the limitations of both of them?
Why are user-space drivers commonly used and preferred over kernel drivers nowadays?
What are the main differences between a user-space driver and a kernel driver?
User space drivers run in user space. Kernel drivers run in kernel space.
What are the limitations of both of them?
The kernel driver can do anything the kernel can, so you could say it has no limitations. But kernel drivers are much harder to "prove correct" and debug. It's all too easy to introduce race conditions, or use a kernel function in the wrong context or with the wrong locking. Things will appear to work for a while, but cause problems (including crashing the whole system) down the road. Drivers must also be wary when reading all user input (both from the device and from userspace) because invalid data can sometimes cause crashes.
A user-space driver usually needs a small shim in the kernel to do its bidding. Typically, that shim provides a simpler API. For example, the FUSE layer lets people write file systems in any language; they can be mounted, read/written, then unmounted. The shim must also protect the kernel against all invalid input.
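To illustrate how small the user-space side can be, here is a minimal read-only FUSE filesystem with a single file. This is our sketch, not code from the book: it assumes libfuse 2.x (FUSE_USE_VERSION 26, built with pkg-config fuse --cflags --libs), and all hello_* names are invented:

    /* Minimal sketch of a user-space filesystem on top of the FUSE shim.
     * Mount with: ./hellofs /some/mountpoint */
    #define FUSE_USE_VERSION 26
    #include <fuse.h>
    #include <string.h>
    #include <errno.h>
    #include <sys/stat.h>

    static const char *msg  = "hello from user space\n";
    static const char *path = "/hello";

    static int hello_getattr(const char *p, struct stat *st)
    {
        memset(st, 0, sizeof(*st));
        if (strcmp(p, "/") == 0) {
            st->st_mode = S_IFDIR | 0755;
            st->st_nlink = 2;
        } else if (strcmp(p, path) == 0) {
            st->st_mode = S_IFREG | 0444;
            st->st_nlink = 1;
            st->st_size = strlen(msg);
        } else {
            return -ENOENT;
        }
        return 0;
    }

    static int hello_readdir(const char *p, void *buf, fuse_fill_dir_t fill,
                             off_t off, struct fuse_file_info *fi)
    {
        if (strcmp(p, "/") != 0)
            return -ENOENT;
        fill(buf, ".", NULL, 0);
        fill(buf, "..", NULL, 0);
        fill(buf, path + 1, NULL, 0);   /* strip the leading '/' */
        return 0;
    }

    static int hello_read(const char *p, char *buf, size_t size, off_t off,
                          struct fuse_file_info *fi)
    {
        size_t len = strlen(msg);
        if (strcmp(p, path) != 0)
            return -ENOENT;
        if ((size_t)off >= len)
            return 0;
        if (off + size > len)
            size = len - off;
        memcpy(buf, msg + off, size);
        return (int)size;
    }

    static struct fuse_operations hello_ops = {
        .getattr = hello_getattr,
        .readdir = hello_readdir,
        .read    = hello_read,
    };

    int main(int argc, char *argv[])
    {
        /* The in-kernel FUSE module forwards VFS calls to these callbacks. */
        return fuse_main(argc, argv, &hello_ops, NULL);
    }

Every callback runs in an ordinary user process, so a crash here takes down one mount, not the kernel.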
User-space drivers have lots of limitations. For example, the kernel reserves some memory for use during emergencies, but that is not available to user space. During memory pressure, the kernel will kill random user-space programs, but never kill kernel threads. User-space programs may be swapped out, which could lead to your device being unavailable for several seconds. (Kernel code cannot be swapped out.) Running code in user space requires several context switches. These waste a "lot" of CPU time. If your device is a 300 baud modem, nobody will notice. But if it's a gigabit Ethernet card, and every packet has to go through your userspace driver before it gets to the real user, the system will have major bottlenecks.
User-space programs are also "harder" to use because you have to install that user-space software, which often has many library dependencies. Kernel modules "just work".
Why are user-space drivers commonly used and preferred over kernel drivers nowadays?
The question is "Does this complexity really need to be in the kernel?"
I used to work for a company that made USB dongles that talked a particular protocol. We could have written a full kernel driver, but instead just wrote our program on top of libUSB (a sketch of that approach follows below).
The advantages: the program was portable between Linux, Mac, and Windows, and there was no worrying about our code vs. the GPL.
The disadvantages: if the device needed to send data to the PC and get a response quickly, there was no guarantee that would happen. For example, if we had needed a real-time control loop on the PC, it would have been harder to get bounded response times. (Maybe not entirely impossible on Linux.)
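For concreteness, here is roughly what such a libUSB-based user-space driver looks like. This is a hedged sketch, not the company's code: the VID/PID pair 0x1234/0x5678 and the IN endpoint 0x81 are made-up placeholders, and it assumes libusb-1.0:

    /* Sketch of a user-space driver: talk to a USB device through
     * libusb-1.0 instead of writing a kernel driver. */
    #include <stdio.h>
    #include <libusb-1.0/libusb.h>

    int main(void)
    {
        libusb_context *ctx = NULL;
        if (libusb_init(&ctx) < 0)
            return 1;

        /* Convenience lookup; a real program would enumerate and match. */
        libusb_device_handle *h =
            libusb_open_device_with_vid_pid(ctx, 0x1234, 0x5678);
        if (!h) {
            fprintf(stderr, "device not found\n");
            libusb_exit(ctx);
            return 1;
        }

        libusb_claim_interface(h, 0);

        unsigned char buf[64];
        int got = 0;
        /* Read one bulk packet from IN endpoint 0x81, 1 s timeout. */
        int rc = libusb_bulk_transfer(h, 0x81, buf, sizeof(buf), &got, 1000);
        if (rc == 0)
            printf("read %d bytes\n", got);

        libusb_release_interface(h, 0);
        libusb_close(h);
        libusb_exit(ctx);
        return 0;
    }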
If there is a way to do it in userspace, I would try that first. Only if there are significant performance bottlenecks, or significant complexity in keeping it in userspace, would I move it into the kernel. Even then, consider the "shim" approach, and/or the "emulator" approach (where your kernel module makes your device look like a serial port or a block device).
On the other hand, if there are already several kernel modules similar to what you want, then start there.

Location of interrupt handling code in Linux kernel for x86 architecture

I am doing some research, trying to find the code in the Linux kernel that implements interrupt handling; in particular, I am trying to find the code responsible for handling the system timer.
According to http://www.linux-tutorial.info/modules.php?name=MContent&pageid=86
The kernel treats interrupts very similarly to the way it treats exceptions: all the general purpose registers are pushed onto the system stack and a common interrupt handler is called. The current interrupt priority is saved and the new priority is loaded. This prevents interrupts at lower priority levels from interrupting the kernel while it handles this interrupt. Then the real interrupt handler is called.
I am looking for the code that pushes all of the general purpose registers on the stack, and the common interrupt handling code.
Pushing the general purpose registers onto the stack is, at the very least, architecture dependent, so I'm looking for the code associated with the x86 architecture. At the moment I'm looking at version 3.0.4 of the kernel source, but any version is probably fine. I've started looking in kernel/irq/handle.c, but I don't see anything that looks like saving the registers; it just looks like it is calling the registered interrupt handler.
The 32-bit versions are in arch/x86/kernel/entry_32.S, the 64-bit versions in arch/x86/kernel/entry_64.S (in 3.0-era kernels). Search for the various ENTRY macros that mark kernel entry points.
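To see what that entry code actually builds, it helps to look at struct pt_regs, the C view of the frame those stubs push. The sketch below is paraphrased from arch/x86/include/asm/ptrace.h for 32-bit x86 in that era (field names vary slightly across kernel versions, so treat it as illustrative):

    /* Paraphrase of the 32-bit struct pt_regs: the stack frame that the
     * interrupt/syscall entry path builds, lowest address first. */
    struct pt_regs {
        /* Pushed by the kernel's SAVE_ALL entry code: */
        unsigned long bx;
        unsigned long cx;
        unsigned long dx;
        unsigned long si;
        unsigned long di;
        unsigned long bp;
        unsigned long ax;
        unsigned long ds;
        unsigned long es;
        unsigned long fs;
        unsigned long gs;
        /* Pushed by the per-vector interrupt stub: */
        unsigned long orig_ax;
        /* Pushed by the CPU itself when the interrupt is taken: */
        unsigned long ip;
        unsigned long cs;
        unsigned long flags;
        unsigned long sp;   /* only present on a privilege change */
        unsigned long ss;   /* only present on a privilege change */
    };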
I am looking for the code that pushes all of the general purpose registers on the stack
On x86, the hardware stores only a minimal amount of state before executing an interrupt handler: it pushes EFLAGS, CS and EIP (plus SS and ESP if the interrupt arrived from user mode), then jumps to the handler. When the handler executes iret, the hardware reads that state back from where it was stored.
Now, code inside the interrupt handler may read and write the saved copies of registers, causing different values to be restored as the interrupt exits. That's how a context switch works.
The general-purpose registers are another matter: the x86 hardware does not touch them, whereas on many embedded architectures the hardware saves everything. Saving and restoring registers that the handler never modifies would be a waste, so on x86 the kernel's entry code is responsible for saving and restoring any registers it actually uses.
See the Intel® 64 and IA-32 Architectures Software Developer's Manual, starting on page 6-15.

Why doesn't Linux use the hardware context switch via the TSS?

I read the following statement:
The x86 architecture includes a specific segment type called the Task State Segment (TSS), to store hardware contexts. Although Linux doesn't use hardware context switches, it is nonetheless forced to set up a TSS for each distinct CPU in the system.
I am wondering:
Why doesn't Linux use the hardware support for context switch?
Isn't the hardware approach much faster than the software approach?
Is there any OS which does take advantage of the hardware context switch? Does Windows use it?
Lastly, and as always, thanks for your patience and replies.
-----------Added--------------
http://wiki.osdev.org/Context_Switching has some explanation.
People as confused as I was can take a look at it. 8^)
The x86 TSS is very slow for hardware multitasking and offers almost no benefits when compared to software task switching. (In fact, I think doing it manually beats the TSS a lot of times.)
The TSS is also known for being annoying and tedious to work with, and it is not portable, even to x86-64. Linux aims at working on multiple architectures, so they probably opted to use software task switching because it can be written in a machine-independent way. Also, software task switching provides a lot more control over what can be done and is generally easier to set up than the TSS.
I believe Windows 3.1 used the TSS, but at least the NT >5 kernel does not. I do not know of any Unix-like OS that uses the TSS.
Do note that the TSS is mandatory. What OSs do, though, is create a single TSS entry (per processor), and every time they need to switch tasks, they just change out this single TSS. Also, the only fields of the TSS used by software task switching are ESP0 and SS0, which are used to get to ring 0 from ring 3 code on interrupts. Without a TSS, there would be no known ring 0 stack, which would of course lead to a GPF and eventually a triple fault.
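To make the ESP0/SS0 point concrete, here is a hedged sketch of the 32-bit hardware TSS layout (per the Intel SDM) and the one thing a software-switching kernel does with it on each context switch. The struct and function names are ours, and a real kernel would mark the struct packed and keep one per CPU:

    #include <stdint.h>

    /* 32-bit hardware TSS layout, as defined by the architecture. */
    struct tss32 {
        uint32_t prev_task_link;
        uint32_t esp0;    /* kernel stack pointer loaded on a ring 3 -> ring 0 entry */
        uint32_t ss0;     /* kernel stack segment for the same transition */
        uint32_t esp1, ss1, esp2, ss2;
        uint32_t cr3, eip, eflags;
        uint32_t eax, ecx, edx, ebx, esp, ebp, esi, edi;
        uint32_t es, cs, ss, ds, fs, gs;
        uint32_t ldt_selector;
        uint16_t trap;
        uint16_t iomap_base;
    };

    static struct tss32 cpu_tss;  /* one per CPU, loaded once via the LTR instruction */

    /* On a software context switch, only the kernel stack field is
     * refreshed; everything else in the TSS is dead weight. */
    static void switch_kernel_stack(uint32_t next_task_stack_top)
    {
        cpu_tss.esp0 = next_task_stack_top;
    }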
Linux used to use HW-based switching, in the pre-1.3 timeframe IIRC. I believe SW-based context switching turned out to be faster, and it is more flexible.
Another reason may have been minimizing arch-specific code. The first port of Linux to a non-x86 architecture was Alpha. Alpha didn't have a TSS, so more code could be shared if all arches used SW switching. (Just a guess.) Unfortunately the kernel changelogs for the 1.2-1.3 kernel period are not well preserved, so I can't be more specific.
Linux doesn't use a segmented memory model, so this segmentation-specific feature isn't used.
x86 CPUs have many different kinds of hardware support for context switching, so the distinction isn't hardware vs. software, but rather how an OS uses the various hardware features available. It isn't necessary to use them all.
Linux is so efficiency-focused that you can bet someone has profiled every possible option, and that the options currently used are the best available compromise.

Disabling Multithreading during runtime

I am wondering whether Intel's processors provide instructions in their instruction set to turn the multithreading or hyperthreading capability on and off. Basically, I want to know whether an operating system can control these features via instructions somehow.
Thank you so much
Mareike
Most operating systems have a facility for changing a process's CPU affinity, thereby restricting it to a single physical or virtual core. But multithreading is a program architecture, not a CPU facility.
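On Linux, for example, the affinity facility is exposed through sched_setaffinity(2). A minimal sketch that pins the calling process to CPU 0:

    /* Pin the calling process to CPU 0 using the Linux-specific
     * sched_setaffinity(2) call. */
    #define _GNU_SOURCE
    #include <sched.h>
    #include <stdio.h>

    int main(void)
    {
        cpu_set_t set;
        CPU_ZERO(&set);
        CPU_SET(0, &set);             /* allow CPU 0 only */

        if (sched_setaffinity(0 /* self */, sizeof(set), &set) != 0) {
            perror("sched_setaffinity");
            return 1;
        }
        printf("now restricted to CPU 0\n");
        return 0;
    }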
I think that what you are trying to ask is, "Is there a way to prevent the OS from utilizing hyperthreading and/or multiple cores?"
The answer is: definitely. This isn't governed by a single instruction, and indeed it's not like you can just write a device driver that would automagically disable all of that hardware. Most of this depends on how the kernel configures the interrupt controllers at boot time.
When a machine is first started, there is a designated processor that is used for bootstrapping. It is the responsibility of the OS to configure the multiprocessor hardware accordingly. On PC platforms this would involve reading information about the multiprocessor configuration from in-memory tables provided by the boot firmware. This data would likely conform to either the ACPI or the Intel MultiProcessor specifications. The kernel then uses that data to configure the APIC hardware accordingly.
Multithreading and multitasking are not special instructions or modes in the CPU. They're just fancy ways people who write operating systems use interrupts. There is a hardware timer, basically a counter being incremented by a clocking signal, that triggers an interrupt when it overflows. The exact interrupt is platform specific. In the olden days this timer was actually a separate chip/circuit on the motherboard, simply attached to one of the CPU's interrupt pins. Modern CPUs have this timer built in. So, to turn off multithreading and multitasking, the OS can simply disable the interrupt signal.
Alternatively, since it's the OS's job to actually schedule processes/threads, the OS can simply decide to ignore all threads and not run them.
Hyperthreading is a different thing. It sort of allows the OS to see a second virtual CPU that it can execute code on. I never had to deal with the thing directly, so I'm not sure how to turn it off (or even whether it is possible).
There is no x86 instruction that disables HyperThreading or additional cores, but there are BIOS settings that can turn these features off. Because they are set in the BIOS, changing them requires rebooting, and they are generally beyond OS control. There is a Windows boot option that limits the number of active cores, but HyperThreading can be turned on or off only in the BIOS. Intel's current HyperThreading implementation doesn't allow it to be toggled dynamically (and that is unlikely to change any time soon).
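One OS-level approximation worth knowing about (our addition, not something the answer above claims): Linux can take individual logical CPUs offline at runtime through sysfs, so an administrator can idle the HyperThreading sibling of each core without touching the BIOS. A sketch, assuming root, a hotplug-capable kernel, and that cpu1 happens to be a sibling on your topology:

    #include <stdio.h>

    int main(void)
    {
        /* Equivalent to: echo 0 > /sys/devices/system/cpu/cpu1/online
         * Check each CPU's topology/thread_siblings_list file under
         * /sys/devices/system/cpu to learn which logical CPUs share a core. */
        FILE *f = fopen("/sys/devices/system/cpu/cpu1/online", "w");
        if (!f) { perror("fopen"); return 1; }
        fputs("0", f);
        fclose(f);
        return 0;
    }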
I have assumed 'multithreading' in your question to mean 'hardware multithreading', which is technically identical to HyperThreading. However, if you really meant software-level multithreading (i.e. multitasking), then that's a totally different question, and it (almost) doesn't apply to modern operating systems, since they support multitasking by default. It could make sense if you wanted to run MS-DOS (in x86 real mode, where only a single task runs).
P.S. Note that 'multithreading' can be either hardware or software. I also agree with the other answers regarding processor/thread affinity.
