Suppress a process in the Linux kernel scheduler (not kill)

In the Linux scheduler, I want to suppress some processes by modifying the scheduler code. Is it possible to suppress a process without killing it - just suppression?

In the Linux scheduler, I want to suppress some processes by modifying the scheduler code
Probably not possible, and certainly ill-defined. The good way to think of modifying the kernel is: first, don't; later, don't yet; and at last, minimally and carefully!
What exactly does "suppressing" a process mean to you? You might want to terminate it. You certainly cannot simply "suppress" some process, since the kernel carefully cleans up after it has terminated.
And why do you want to modify the kernel? In general, user-space and user-mode is a better place to do such things (or even systemd). You might also want some kernel thread (very tricky).
You might consider kernel to user-space communication with netlink(7), then try to minimize your kernel footprint. Be aware, however, that the scheduler is a critical, and very well tuned, piece of code inside the kernel.
In practice, I would suggest a high-priority user-land daemon. See setpriority(2), nice(2) and sched(7). We don't know what you want to achieve, but it is likely to be practically doable in user-land. And if it is not, perhaps Linux is not the right kernel for you (taking into account that you are a drone developer). Then look into genuine real-time operating systems, IoT OSes like Contiki, or library operating systems / unikernels such as MirageOS.
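If "suppressing" means temporarily keeping a process off the CPU, plain job-control signals and priorities already do that from user space, with no scheduler changes. A minimal sketch; the target PID 1234 is purely hypothetical:

    #include <signal.h>
    #include <stdio.h>
    #include <sys/resource.h>
    #include <sys/types.h>
    #include <unistd.h>

    int main(void) {
        pid_t target = 1234;   /* hypothetical PID of the process to suppress */

        /* Pause it: SIGSTOP cannot be caught or ignored, and the process
           stays alive, off the run queue, until SIGCONT arrives. */
        if (kill(target, SIGSTOP) == -1) perror("kill(SIGSTOP)");
        sleep(5);                              /* ...suppressed interval... */
        if (kill(target, SIGCONT) == -1) perror("kill(SIGCONT)");

        /* Or merely deprioritize it: nice value 19 is the lowest priority,
           so the scheduler runs it only when nothing else wants the CPU. */
        if (setpriority(PRIO_PROCESS, (id_t)target, 19) == -1)
            perror("setpriority");
        return 0;
    }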

Related

Do any operating systems utilize user threads only?

We're reading a basic/simple guide to Operating Systems in my CS class. The text gives multiple examples of OSs that use 1:1 threading, and some that formerly did hybrid M:N. But there are no examples of user threads / N:1.
This isn't a homework question, I'm just genuinely curious if this is or was a thing. Have any OSs utilized exclusively user threads? Or is there any software or programming language that does? It seems like with the right scheduling it could be very fast? Thank you!
Spent forever on Google and can't find any explicit answer to this!
Do any operating systems utilize user threads only?
No (not in the way you're expecting, but by definition). Whatever a program feels like doing in user-space is none of the operating system's business and cannot be considered something the OS itself does.
Essentially there are 3 cases:
the OS is a single-tasking OS (and user-space programs use libraries or whatever to provide threading if/when they want it). E.g. MS-DOS.
the OS is a multi-tasking OS, where the OS only knows about processes (and user-space programs use libraries or whatever to provide threading if/when they want it). E.g. early Unix.
the OS/kernel provides threads (leading to 1:1 or M:N).
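To make the first two cases concrete, here is a minimal sketch of library-level (N:1) threading built on POSIX ucontext(3); the kernel sees a single thread of execution, and every "thread" switch happens entirely in user space:

    #include <stdio.h>
    #include <ucontext.h>

    static ucontext_t main_ctx, thread_ctx;
    static char stack[64 * 1024];        /* user-allocated "thread" stack */

    static void green_thread(void) {
        puts("green thread: running entirely in user space");
        swapcontext(&thread_ctx, &main_ctx);   /* cooperative yield */
        puts("green thread: resumed after yield");
    }

    int main(void) {
        getcontext(&thread_ctx);
        thread_ctx.uc_stack.ss_sp = stack;
        thread_ctx.uc_stack.ss_size = sizeof stack;
        thread_ctx.uc_link = &main_ctx;        /* return here when it finishes */
        makecontext(&thread_ctx, green_thread, 0);

        swapcontext(&main_ctx, &thread_ctx);   /* "schedule" the green thread */
        puts("main: green thread yielded");
        swapcontext(&main_ctx, &thread_ctx);   /* resume it to completion */
        puts("main: done");
        return 0;
    }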
It seems like with the right scheduling it could be very fast?
User-space threading isn't "very fast"; it's significantly worse for most things. The reasons are:
it can't work when there are multiple CPUs (so the nice 8-core CPU you're currently using becomes 87.5% wasted). You need "M:N threading" at a minimum to avoid this performance disaster.
it breaks thread priorities badly - e.g. CPU(s) wasting time doing unimportant work while important work isn't being done, because one process doesn't know anything about threads that belong to any other process (or their priorities). The scheduler must be aware of all threads to avoid this performance disaster (and if one process knows about all threads belonging to all other processes it becomes a security disaster).
almost all thread switches are caused by devices (threads having to wait for disk, network, keyboard, "wall clock time", etc., causing the scheduler to find some other thread to run; and the things a thread was waiting for occurring, making the thread runnable again and possibly preempting less important work that was running at the time). All devices involve the kernel (even for micro-kernels, where the kernel is needed to pass messages, etc.), so almost all thread switches involve the kernel. By doing threading in user-space you just end up with the kernel wasting time notifying user-space (so user-space can do some scheduling) instead of the kernel doing the scheduling itself (without wasting time on notifications).
User-space threading is better for rare situations where the kernel doesn't have to be involved anyway, which is limited to:
thread creation and termination; but only if memory (for thread state, thread stack, thread-local storage) is pre-allocated and recycled, and only if kernel-side "thread recycling" isn't being done (e.g. pre-creating kernel threads and putting them back in a "free thread pool" instead of telling the kernel to terminate them and create new ones again later).
locking (e.g. mutexes) where all threads using the lock belong to the same process; where 1 kernel thread (and no need for locks) is still better than "multiple user-space threads (sharing 1 kernel thread) fighting for the same lock with extra pointless overhead".

Controlling the process allocation to a processor

Does fork always create a process on a separate processor?
Is there a way I could control the forking to a particular processor? For example, if I have 2 processors and want fork to create a parallel process on the same processor as the parent. Does NodeJS provide any method for this? I am looking for control over the allocation of the processes. ... Is this even a good idea?
Also, what is the maximum number of processes that could be forked, and why?
I've no Node.js wisdom to impart, simply some info on what OSes generally do.
Any modern OS will schedule processes / threads on CPUs and cores according to the prevailing burden on the machine. The whole point is that they're very good at this, so one would have to try very hard to come up with scheduling / core-affinity decisions that beat the OS. Almost no one bothers. Unless you're running on very specific hardware (which perhaps one might get to understand very well), you'd have to make a lot of complex decisions for every single different machine the code runs on.
If you do want to try, then I'm assuming you'll have to dig deep below Node.js to make calls to the underlying C library. Most OSes (including Linux) provide means for a process to control core affinity (it's exposed in Linux's glibc).
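For example, on Linux the glibc interface is sched_setaffinity(2). A minimal sketch: pin the parent to CPU 0 before fork(), and the child inherits the same affinity mask:

    #define _GNU_SOURCE
    #include <sched.h>
    #include <stdio.h>
    #include <unistd.h>

    int main(void) {
        cpu_set_t set;
        CPU_ZERO(&set);
        CPU_SET(0, &set);                       /* allow CPU 0 only */

        /* pid 0 means "the calling process" */
        if (sched_setaffinity(0, sizeof set, &set) == -1) {
            perror("sched_setaffinity");
            return 1;
        }

        pid_t child = fork();
        if (child == 0) {
            /* the child inherits the parent's affinity mask, so parent
               and child now compete for the same core */
            printf("child %d runs on CPU 0 too\n", (int)getpid());
            _exit(0);
        }
        printf("parent %d pinned to CPU 0\n", (int)getpid());
        return 0;
    }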

Where can I find documentation on the kflushd?

I cannot find any documentation on kflushd, such as what it does exactly, how it is involved in network I/O, and how I could use it or call it from my own code.
kflushd, AFAIK, handles writing pending I/O in memory out to the corresponding devices. If you want to flush pending I/O you can always call fflush or sync to force a write to the I/O device.
To call it from your code simply use one of the calls I mentioned (although I think there might be one more I'm forgetting).
Kernel processes like kflushd are started by the kernel on its own (they are not descendants of the init process created by fork-ing) and exist only for the kernel's needs. User applications may invisibly need them (because they need some feature offered by the kernel which the kernel implements with the help of its own kernel processes) but don't actively use them.
You definitely should use appropriately the fflush(3) library function (which just happens to make the relevant write(2) syscalls).
You may want to use the fsync(2) and related syscalls.
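A minimal sketch of the two-step flush these calls give you, fflush(3) followed by fsync(2); the file name data.log is just a placeholder:

    #include <stdio.h>
    #include <unistd.h>

    int main(void) {
        FILE *f = fopen("data.log", "w");   /* placeholder file name */
        if (!f) { perror("fopen"); return 1; }

        fprintf(f, "important record\n");

        /* fflush(3) moves stdio's user-space buffer into the kernel
           page cache via write(2)... */
        if (fflush(f) != 0) { perror("fflush"); return 1; }

        /* ...and fsync(2) asks the kernel to push the page cache out
           to the underlying device before returning. */
        if (fsync(fileno(f)) == -1) { perror("fsync"); return 1; }

        fclose(f);
        return 0;
    }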
Regarding networking, you may be interested in Nagle's algorithm. See this answer.

pthread_rwlock across processes: Repair after crash?

I'm working on Linux and I'm using a pthread_rwlock, which is stored in shared memory and shared across multiple processes. This mostly works fine, but when I kill a process (SIGKILL) while it is holding the lock, it appears that the lock is still held (regardless of whether it's a read- or write-lock).
Is there any way to recognize such a state, and possibly even repair it?
The real answer is to find a decent way to stop a process. Killing it with SIGKILL is not a decent way to do it.
This feature is specified for mutexes, where it is called robustness (PTHREAD_MUTEX_ROBUST), but not for rwlocks. The standard doesn't provide it, and kernel.org doesn't even have a page on rwlocks. So, like I said:
Find another way to stop the process (perhaps another signal that can be handled ?)
Release the lock when you exit
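If switching the shared lock from a rwlock to a mutex is an option, the robustness feature mentioned above looks roughly like this. A minimal sketch, assuming the mutex lives in shared memory mapped by every process:

    #include <errno.h>
    #include <pthread.h>
    #include <stdio.h>

    /* Assume `m` points into shared memory mapped by all processes. */
    static void init_robust_mutex(pthread_mutex_t *m) {
        pthread_mutexattr_t attr;
        pthread_mutexattr_init(&attr);
        pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED);
        pthread_mutexattr_setrobust(&attr, PTHREAD_MUTEX_ROBUST);
        pthread_mutex_init(m, &attr);
        pthread_mutexattr_destroy(&attr);
    }

    static int lock_robust_mutex(pthread_mutex_t *m) {
        int rc = pthread_mutex_lock(m);
        if (rc == EOWNERDEAD) {
            /* The previous owner died (e.g. SIGKILL) while holding the
               lock; repair the protected state, then mark the mutex
               consistent so it can keep being used. */
            fprintf(stderr, "owner died, repairing state\n");
            pthread_mutex_consistent(m);
            rc = 0;
        }
        return rc;
    }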
@cnicutar - that "real answer" is pretty dubious. It's the kernel's job to handle cross-process responsibilities such as freeing resources and making sure things are marked consistent - userspace can't effectively do the job when stuff goes wrong.
Granted, if everybody plays nice the robust features will not be needed, but for a robust system you want to make sure the whole thing doesn't go down because of some buggy client process.

How "Real-Time" is Linux 2.6?

I am looking at moving my product from an RTOS to embedded Linux. I don't have many real-time requirements, and the few RT requirements I have are on the order of 10s of milliseconds.
Can someone point me to a reference that will tell me how Real-Time the current version of Linux is?
Are there any other gotchas in moving from a commercial RTOS to Linux?
You can get most of your answers from the Real Time Linux wiki and FAQ
What are real-time capabilities of the stock 2.6 linux kernel?
Traditionally, the Linux kernel will allow one process to preempt another only under certain circumstances:
When the CPU is running user-mode code
When kernel code returns from a system call or an interrupt back to user space
When kernel code blocks on a mutex, or explicitly yields control to another process
If kernel code is executing when some event takes place that requires a high priority thread to start executing, the high priority thread cannot preempt the running kernel code until the kernel code explicitly yields control. In the worst case, the latency could potentially be hundreds of milliseconds or more.
The Linux 2.6 configuration option CONFIG_PREEMPT_VOLUNTARY introduces checks to the most common causes of long latencies, so that the kernel can voluntarily yield control to a higher priority task waiting to execute. This can be helpful, but while it reduces the occurrences of long latencies (hundreds of milliseconds to potentially seconds or more), it does not eliminate them. However, unlike CONFIG_PREEMPT (discussed below), CONFIG_PREEMPT_VOLUNTARY has a much lower impact on the overall throughput of the system. (As always, there is a classical tradeoff between throughput --- the overall efficiency of the system --- and latency. With the faster CPUs of modern-day systems, it often makes sense to trade off throughput for lower latencies, but server-class systems that do not need minimum latency guarantees may very well choose to use either CONFIG_PREEMPT_VOLUNTARY, or to stick with the traditional non-preemptible kernel design.)
The 2.6 Linux kernel has an additional configuration option, CONFIG_PREEMPT, which causes all kernel code outside of spinlock-protected regions and interrupt handlers to be eligible for non-voluntary preemption by higher priority kernel threads. With this option, worst case latency drops to (around) single digit milliseconds, although some device drivers can have interrupt handlers that will introduce latency much worse than that. If a real-time Linux application requires latencies smaller than single-digit milliseconds, use of the CONFIG_PREEMPT_RT patch is highly recommended.
They also have a list of "gotchas", as you called them, in the FAQ.
What are important things to keep in mind while writing realtime applications?
Taking care of the following during the initial startup phase:
Call mlockall() as soon as possible from main().
Create all threads at startup time of the application, and touch each page of the entire stack of each thread. Never start threads dynamically during RT show time; this will ruin RT behavior.
Never use system calls that are known to generate page faults, such as fopen(). (Opening of files does the mmap() system call, which generates a page fault.)
If you use 'compile time global variables' and/or 'compile time global arrays', then use mlockall() to prevent page faults when accessing them.
More information: HOWTO: Build an RT-application
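A minimal sketch of the first two startup rules, locking memory and pre-faulting the stack; PREFAULT_STACK is an assumed worst-case stack size, not a value from the HOWTO:

    #include <stdio.h>
    #include <sys/mman.h>

    #define PREFAULT_STACK (8 * 1024)   /* assumed worst-case stack usage */

    static void prefault_stack(void) {
        volatile unsigned char dummy[PREFAULT_STACK];
        for (unsigned i = 0; i < sizeof dummy; i += 4096)
            dummy[i] = 0;               /* touch one byte per page, now */
    }

    int main(void) {
        /* Lock current and future pages into RAM so page faults
           cannot stall the real-time path later. */
        if (mlockall(MCL_CURRENT | MCL_FUTURE) == -1) {
            perror("mlockall");
            return 1;
        }
        prefault_stack();
        /* ...create all RT threads here, before "RT show time"... */
        return 0;
    }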
They also have a large publications page you might want to checkout.
Have you had a look at Xenomai? It will let you run "hard real time" processes above Linux, while still allowing you to access the regular Linux APIs for all the non-real-time needs.
There are two fundamentally different approaches to achieve real-time capabilities with Linux.
Patch the existing kernel with things like the rt-preempt patches. This will eventually lead to a fully preemptible kernel.
Dual-kernel approach (like Xenomai, RTLinux, RTAI, ...)
There are lots of gotchas moving from an RTOS to Linux.
Maybe you don't really need real-time?
I'm talking about real-time Linux in my training sessions:
https://rlbl.me/elisa
https://rlbl.me/elisa-en-pdf
https://rlbl.me/intely
https://rlbl.me/intely-en-pdf
https://rlbl.me/entirety-en-all-pdf
The answer is probably "good enough".
If you're running an embedded system, you probably have control of all or most of the software on the box.
Stock Linux 2.6 has several features suitable for low-latency tasks - chiefly these are:
Scheduling policies
Memory locking
Assuming you're using a single-core machine, if you have just one task which has set its scheduling policy to SCHED_FIFO or SCHED_RR (it doesn't matter which if you have just one task), AND locked all its memory in with mlockall(), then it WILL get scheduled as soon as it is ready to run.
Then the only thing you'd have to worry about would be some non-preemptible part of the kernel taking longer than your acceptable latency to complete - which is unlikely to happen in an embedded system unless something bad happens, such as extreme memory pressure, or your drivers being dodgy.
I guess "try it and see" is a good answer, but that's probably rather complicated in your case (and might involve writing device drivers etc).
Look at the doc for sched_setscheduler for some good info.
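As a starting point, a minimal sketch of the SCHED_FIFO setup described above; the priority value 50 is an arbitrary assumption (see sched_setscheduler(2)), and the call needs root or CAP_SYS_NICE:

    #include <sched.h>
    #include <stdio.h>

    int main(void) {
        /* Put this task in the SCHED_FIFO real-time class; it will then
           preempt all ordinary SCHED_OTHER tasks whenever it is runnable. */
        struct sched_param sp = { .sched_priority = 50 };  /* assumed priority */
        if (sched_setscheduler(0, SCHED_FIFO, &sp) == -1) {
            perror("sched_setscheduler");
            return 1;
        }
        /* ...latency-sensitive work loop here... */
        return 0;
    }

Combine this with the mlockall() call shown earlier to get the "SCHED_FIFO plus locked memory" setup the answer above describes.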
