Automatically suspend process - linux

I saw that both Windows and Linux have the ability to suspend a process.
It seems weird to me that background applications are not suspended automatically.
For example, Chrome uses lots of resources while it is in the background. It could easily be suspended: it would stay in RAM, so it could resume quickly, but it would not use the CPU or GPU.
My question has two parts:
Why don't Windows/Linux (or the applications themselves) use the suspend feature? (Something similar to pause in Android, but done in a different way.)
Is there any way to suspend a background task and resume it when it gets focus (when it comes to the foreground)?

A process like Chrome might not have input focus on the user interface but still be "running." (Chrome consists of a set of related processes and threads.)
Yes, Linux does have the ability to actually "suspend" a process using the STOP/CONT signals, but this would be disruptive to the user interface because Chrome, now being literally frozen, could no longer respond to messages sent to it by the user interface.
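As a minimal sketch of the STOP/CONT mechanism mentioned above (assuming Linux with glibc; the function name demo_stop_cont and the idle child that stands in for a background application are made up for illustration), a parent process can freeze and thaw a child like this:

```c
#define _DEFAULT_SOURCE
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns 0 if a child process was stopped, resumed and cleaned up
 * as expected, -1 otherwise. */
int demo_stop_cont(void)
{
    int status;
    int ok = 0;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* Child: stands in for a background application. */
        for (;;)
            pause();
    }

    /* Freeze the child; WUNTRACED makes waitpid() report the stop. */
    if (kill(pid, SIGSTOP) == 0 &&
        waitpid(pid, &status, WUNTRACED) == pid && WIFSTOPPED(status) &&
        /* Thaw it again; WCONTINUED makes waitpid() report the resume. */
        kill(pid, SIGCONT) == 0 &&
        waitpid(pid, &status, WCONTINUED) == pid && WIFCONTINUED(status))
        ok = 1;

    /* Clean up the demo child. */
    kill(pid, SIGKILL);
    waitpid(pid, &status, 0);
    return ok ? 0 : -1;
}
```

From a shell, the equivalent is kill -STOP <pid> followed by kill -CONT <pid>. While stopped, the process receives no CPU time at all, which is exactly why a stopped GUI application cannot answer messages from the user interface.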
Processes and threads only consume CPU resources when they actually need to (they are "runnable"), and then only when the operating system gives them a time-slice. If a thread or process is, say, "waiting for the user interface to send it a message," it's not considered to be "runnable" until a message arrives.
It's also typical that, when a process doesn't have input focus, its priority is slightly reduced so that it always gives way to the process that does. In some systems, the priority is reduced even further when you minimize the window. (When several processes are "runnable," the operating system uses "priority" to help it decide which one to run next.)
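The priority adjustment described above is made by the system itself. As a rough illustration of the same knob, assuming Linux/POSIX, a process can also lower its own scheduling priority by raising its nice value. The helper name deprioritize_self is hypothetical, and this sketch does not claim to be what any desktop environment actually does:

```c
#define _DEFAULT_SOURCE
#include <errno.h>
#include <sys/resource.h>

/* Raise the calling process's nice value by `delta` (clamped to 19,
 * the lowest priority). Returns 0 on success, -1 on failure. */
int deprioritize_self(int delta)
{
    int current;
    int target;

    /* getpriority() can legitimately return -1, so errno must be checked. */
    errno = 0;
    current = getpriority(PRIO_PROCESS, 0);
    if (current == -1 && errno != 0)
        return -1;

    target = current + delta;
    if (target > 19)
        target = 19;

    /* Lowering priority (raising the nice value) needs no privileges. */
    return setpriority(PRIO_PROCESS, 0, target);
}
```

A process with a higher nice value still runs whenever the CPU would otherwise be idle; it only loses out when it competes with higher-priority runnable processes.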


Can a system lock or infinite loop cause a reboot?

My question is about embedded Linux.
I just observed a strange reboot on my embedded project, and it is very easy to reproduce.
When a certain condition is triggered, the system appears to freeze, as if it had hit an infinite loop or a lock. This lasts for several seconds, and then the system quietly reboots. There is not even a core dump!
I don't have much of a clue about the cause. In general, can a lock or an infinite loop really make Linux reboot? Or is there anything else that can freeze the system and cause a reboot without a core dump?
It is common on embedded systems to have a hardware watchdog: a timer implemented in hardware that resets the processor if it is allowed to expire.
Typically some software monitoring task continuously verifies the integrity of the system and restarts the hardware watchdog timer. If the monitoring task fails to run and the watchdog timer expires, the watchdog triggers a processor reset directly.
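A minimal sketch of such a monitoring task, assuming a Linux watchdog driver exposed as a character device (conventionally /dev/watchdog) where any write restarts the countdown. The function names, the health-check callback and the bounded loop are assumptions made for illustration; whether your board has such a device, and with what timeout, is something only your BSP can tell you.

```c
#define _DEFAULT_SOURCE
#include <fcntl.h>
#include <unistd.h>

/* Pet the watchdog: writing any byte to the device restarts its timer. */
int watchdog_kick(int fd)
{
    return write(fd, "\0", 1) == 1 ? 0 : -1;
}

/* Placeholder health check (hypothetical); a real one would verify
 * that the vital tasks of the system are still making progress. */
int always_healthy(void)
{
    return 1;
}

/* Monitoring loop: kick the watchdog only while the system looks sane.
 * Bounded by max_rounds so the sketch terminates; a real task loops forever. */
int watchdog_service(int fd, int (*system_healthy)(void),
                     unsigned interval_s, unsigned max_rounds)
{
    for (unsigned i = 0; i < max_rounds; i++) {
        if (!system_healthy())
            return -1;          /* stop kicking: the hardware will reset */
        if (watchdog_kick(fd) != 0)
            return -1;
        sleep(interval_s);
    }
    return 0;
}
```

On a real board the descriptor would come from open("/dev/watchdog", O_WRONLY). Note that merely opening the device usually arms the timer, and on drivers that support the "magic close" feature it is only disarmed if the character 'V' is written just before close. If the monitoring task is starved by a spinning or deadlocked system, the kicks stop and the board resets silently, with no core dump, which matches the symptom in the question.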
Your question is a bit hard to understand, but yes, an infinite loop in any application on any platform (including Linux) can bring a system down. An infinite loop that allocates memory or other resources on every iteration will eventually leave none for the rest of the system, and one that merely spins can starve other tasks of CPU time. You mentioned you are doing embedded development, which can mean many different things but usually means low-level code that runs close to the hardware and the kernel; such code is more prone to taking down the OS than the average application.

Is it possible to circumvent OS security by not using the supplied System Calls?

I understand that an operating system enforces security policies on users when they use the system and filesystem via the system calls supplied by that OS.
Is it possible to circumvent this security by issuing your own hardware instructions instead of using the system call interface supplied by the OS? Even writing a single bit to a file that you normally have no access to would be enough.
First, for simplicity, I'm considering the OS and Kernel are the same thing.
A CPU can be in different modes when executing code.
Let's say a hypothetical CPU has just two modes of execution (Supervisor and User).
When in Supervisor mode, you are allowed to execute any instructions, and you have full access to the hardware resources.
When in User mode, there is a subset of instructions you don't have access to, such as instructions to deal with hardware or change the CPU mode. Trying to execute one of those instructions will cause the OS to be notified that your application is misbehaving, and it will be terminated. This notification is done through interrupts. Also, when in User mode, you will only have access to a portion of the memory, so your application can't even touch memory it is not supposed to.
Now, the trick for this to work is that while in Supervisor Mode, you can switch to User Mode, since it's a less privileged mode, but while in User Mode, you can't go back to Supervisor Mode, since the instructions for that are not permitted anymore.
The only way to go back to Supervisor mode is through system calls, or interrupts. That enables the OS to have full control of the hardware.
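The memory restriction described above can be observed on a real system. The following is a small sketch, assuming Linux (the function name touch_forbidden_page is made up): a child process writes to a page the kernel has marked inaccessible, the hardware faults, and the kernel terminates the child with a signal.

```c
#define _DEFAULT_SOURCE
#include <signal.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns the number of the signal that killed the child, 0 if the
 * child exited normally, or -1 on a setup error. */
int touch_forbidden_page(void)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* Ask for one page that this process may neither read nor write. */
        volatile char *page = mmap(NULL, (size_t)sysconf(_SC_PAGESIZE),
                                   PROT_NONE, MAP_PRIVATE | MAP_ANONYMOUS,
                                   -1, 0);
        if (page == MAP_FAILED)
            _exit(2);
        page[0] = 1;    /* the MMU faults here; the kernel sends a signal */
        _exit(0);       /* not reached */
    }

    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

On Linux the returned value is expected to be SIGSEGV. The application never gets a chance to "try harder": the access check is done by the hardware on every memory access, and only code running in Supervisor mode can change the page permissions.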
A possible example of how everything fits together for this hypothetical CPU:
The CPU boots in Supervisor mode
Since the CPU starts in Supervisor Mode, the first thing to run has access to the full system. This is the OS.
The OS sets up the hardware any way it wants: memory protections, etc.
The OS launches any application you want after configuring permissions for that application. Launching the application switches to User Mode.
The application is running, and only has access to the resources the OS allowed when launching it. Any access to hardware resources needs to go through System Calls.
I've only explained the flow for a single application.
As a bonus to help you understand how this fits together with several applications running, a simplified view of how preemptive multitasking works:
In a real-world situation, the OS will set up a hardware timer before launching any applications.
When this timer expires, it causes the CPU to interrupt whatever it was doing (e.g. running an application), switch to Supervisor Mode and execute code at a predetermined location, which belongs to the OS and which applications don't have access to.
Since we're back in Supervisor Mode and running OS code, the OS now picks the next application to run, sets up any required permissions, switches to User Mode and resumes that application.
These timer interrupts are how you get the illusion of multitasking. The OS keeps switching between applications quickly.
The bottom line here is that unless there are bugs in the OS (or the hardware design), the only way an application can go from User Mode to Supervisor Mode is through the OS itself with a System Call.
This is the mechanism I use in my hobby project (a virtual computer)
Hardware devices are connected to the CPU through a bus. The CPU communicates with them either with in/out instructions that read and write values at I/O ports (not used much by current hardware, but the common way in the early age of home computers), or by having part of the device memory "mapped" into the CPU address space, so that the CPU controls the device by writing values at defined locations in that shared memory.
None of this should be accessible from the "user level" context, where the OS executes ordinary applications. An application trying to write to that shared device memory would crash on an illegal memory access; in fact that piece of memory is usually not even mapped into user space, i.e. it does not exist from the user application's point of view. Direct in/out instructions are blocked at the CPU level too.
The device is controlled by driver code. That code may run in a specially configured user-level context that has the particular ports and memory mapped (the microkernel model, where drivers are not part of the kernel, as in MINIX). This architecture is more robust (a crash in a driver can't take down the kernel, and the kernel can isolate a problematic driver and restart it, or just kill it completely), but context switches between kernel and user level are costly, so data throughput suffers a bit.
Or the device driver code runs at kernel level (the monolithic kernel model, as in Linux), so any vulnerability in driver code can be used to attack the kernel directly (still not trivial, but a lot easier than trying to tunnel out of the user context through some kernel bug). The overall I/O performance is better, though, especially with devices like graphics cards or RAID disk clusters, where the data bandwidth reaches GiBs per second. This is, for example, the reason why early USB drivers were such a huge security risk: they tended to be quite buggy, so a specially crafted USB device could get rogue code from the device executed in the kernel-level context.
So, as Hyd already answered, under ordinary circumstances, when everything works as it should, a user-level application should not be able to emit a single bit outside of its user sandbox, and suspicious behaviour outside of system calls will either be ignored or crash the app.
If you find a way to break this rule, it's a security vulnerability, and those usually get patched ASAP once the OS vendor is notified.
Some current problems are difficult to patch, though. For example, "row hammering" of current DRAM chips can't be fully fixed at the SW (OS) or CPU (configuration/firmware flash) level! Much of the current PC HW is vulnerable to this kind of attack.
Or, in the mobile world, devices use radio chips based on legacy designs, with closed-source firmware developed years ago, so if you have enough resources to pay for research on these, it's very likely you would be able to seize a particular device by having a fake BTS (base station) send a malicious radio signal to it.
And so on. It's a constant war between vendors and security researchers, who try to patch all vulnerabilities, and hackers, who look for a zero-day exploit ideally, or at least for users who don't patch known bugs in their devices/SW fast enough.
Not normally. If it is possible, it is because of an error in the operating system software. Once such an error is discovered, it is fixed quickly, because it is considered a security vulnerability, which is bad news.
"System" calls execute at a higher processor privilege level than the application: generally kernel mode (but some systems have multiple system-level modes).
What you see as a "system" call is actually just a wrapper that sets up registers and then triggers a change-mode exception of some kind (the method is system specific). The system exception handler dispatches to the appropriate system service.
You cannot just write your own function and do bad things. True, sometimes people find bugs that allow circumventing the system protections. As a general principle, you cannot access devices unless you do it through the system services.
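To make the "wrapper" point above concrete, here is a small sketch assuming Linux with glibc, where the generic syscall() function loads the call number and arguments into registers and executes the trap instruction. The names raw_getpid and raw_write are made up for illustration.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Same kernel service that getpid() reaches, without the named wrapper. */
long raw_getpid(void)
{
    return syscall(SYS_getpid);
}

/* Same kernel service that write() reaches. The kernel still performs
 * every permission check on fd; bypassing the libc wrapper bypasses
 * nothing on the kernel side. */
long raw_write(int fd, const void *buf, size_t len)
{
    return syscall(SYS_write, fd, buf, len);
}
```

Calling raw_write on a descriptor that was not opened for writing fails exactly as write() would, because the check happens inside the kernel after the mode switch, not in the library code that an application could skip or replace.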

Node.JS kernel mode threading

I'm trying to figure out how Node.JS (its Windows version) works behind the scenes.
I know there are user mode and kernel mode threads, and I know the processing model looks like this:
I also know that moving from a kernel mode thread to a user mode thread is considered a context switch.
Are the Node.JS C++ non-blocking worker threads kernel mode threads? And where does the single event loop thread live: in kernel mode or in user mode?
As you know, node.js has a single-threaded architecture. The JavaScript environment and event loop are managed by a single thread only; internally, all the other threads are handled by a C++-level thread pool (for example, asynchronous I/O is handled by libuv threads).
To answer your question: these node.js C++ non-blocking worker threads are not kernel mode. They are user mode. The event-loop thread is also user mode. The threads request kernel mode as and when needed.
When the CPU is in kernel mode, it is assumed to be executing trusted software. Kernel mode is the highest privilege level and the code has full access to all devices. In Windows, only select components written by Windows developers run completely in kernel mode. All user mode software must request use of the kernel by means of a system call in order to perform privileged operations, such as process creation or I/O.
All processes begin execution in user mode, and they switch to kernel mode only when obtaining a service provided by the kernel. This change in mode is termed mode switch, not context switch, which is the switching of the CPU from one process to another.
I hope it is clear to you that even user-mode threads can execute privileged operations (network access) via system calls, and return to user-mode when required task is finished. Node.js simply uses system calls.
Source :
I should have mentioned that mode switch does not always mean context switch. Quoting the wiki:
When a transition between user mode and kernel mode is required in an
operating system, a context switch is not necessary; a mode transition
is not by itself a context switch. However, depending on the operating
system, a context switch may also take place at this time.
What you mention is also correct: a mode switch can cause a context switch, but it does not always happen. It is not desirable to have a context switch (a heavy performance penalty) whenever a mode switch happens. What happens inside Windows is difficult to say, but most likely a mode switch does not cause a context switch every time.
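The difference between a mode switch and a context switch can be probed experimentally. The sketch below assumes Linux rather than Windows (which the question is about, and which exposes such counters differently): it makes many trivial system calls and reads the context-switch counters that getrusage() reports for the process. The function name is made up for illustration.

```c
#define _DEFAULT_SOURCE
#include <sys/resource.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Makes `calls` trivial system calls and returns how many context
 * switches (voluntary plus involuntary) the process experienced in the
 * meantime, or -1 on error. */
long context_switches_during_syscalls(long calls)
{
    struct rusage before, after;

    if (getrusage(RUSAGE_SELF, &before) != 0)
        return -1;

    for (long i = 0; i < calls; i++)
        syscall(SYS_getppid);   /* one user -> kernel -> user mode switch */

    if (getrusage(RUSAGE_SELF, &after) != 0)
        return -1;

    return (after.ru_nvcsw - before.ru_nvcsw) +
           (after.ru_nivcsw - before.ru_nivcsw);
}
```

On an otherwise idle machine one would expect the result for, say, 100000 calls to be far smaller than the number of calls, since a non-blocking system call returns to the same thread without involving the scheduler. The exact number depends on system load, so treat it as an observation, not a guarantee.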
Regarding the one-to-one thread model: both Windows and Linux follow it. So for each user thread (like the node.js event loop thread) the OS provides a kernel thread, which takes care of the system calls. Node.js can only invoke a mode switch through system calls. Context switches are controlled only by the kernel (the thread scheduler).
Update 2
Yes, HTTP.SYS executes in kernel mode. But there is more to it. Node.js does not have many threads, so less context switching happens between threads than in IIS. The number of context switches (mode switches) per request is definitely lower with HTTP.SYS. It is an improvement over the past (which happened to be a disaster), see here. But the context switching caused by multiple threads far outweighs the reduction gained by using HTTP.SYS. So overall node.js has fewer context switches.
HTTP.SYS also has other advantages over node's own HTTP implementation that helps IIS. It may be possible (in future) to use HTTP.SYS from node itself to take those advantages. But for now, I don't think HTTP.SYS/IIS compete anywhere near node.js.

Two applications using framebuffer

I'm writing a set of Linux framebuffer applications for embedded hardware. The main application runs on tty1 from /etc/inittab (for now it's just a touchscreen test) and is supposed to run permanently. The second application is executed from acpid when the power button is pressed; it is supposed to ask the user whether they really want to shut the device down, and read the user's answer from the touchscreen. What I want is for the second application to take over the framebuffer while it runs, and then release it and restore the state of the screen, so the main application can continue without a restart.
Is this scenario possible with two different applications, and how should they interact? Right now the second application just can't draw anything while the main application is running.
I know I can kill and restart main application, or move poweroff notification to the main application and have acpid just sending a signal to it, but those solutions don't seem to be optimal.
One solution would of course be to have THREE applications: one that does the actual framebuffer interaction, and two others that just send it messages (in some form, e.g. through a pipe, socket or similar). This is how "window managers" and similar usually work (but in a much more complicated way, of course).
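A minimal sketch of the message-passing part of that design, assuming a pipe (or FIFO) between the clients and the single process that owns the framebuffer. The message layout and the names draw_msg, send_draw and recv_draw are invented for illustration; a real protocol would also need messages such as "take over screen" and "release screen".

```c
#define _DEFAULT_SOURCE
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/* One fixed-size drawing request: fill a rectangle with a colour. */
struct draw_msg {
    uint32_t client;        /* which application is asking */
    uint16_t x, y, w, h;    /* rectangle in pixels */
    uint32_t argb;          /* fill colour */
};

/* Client side: send one request to the display process. */
int send_draw(int fd, const struct draw_msg *m)
{
    return write(fd, m, sizeof *m) == (ssize_t)sizeof *m ? 0 : -1;
}

/* Display process side: receive one request. */
int recv_draw(int fd, struct draw_msg *m)
{
    return read(fd, m, sizeof *m) == (ssize_t)sizeof *m ? 0 : -1;
}
```

Because each message is much smaller than PIPE_BUF, POSIX guarantees that a single write to a pipe is atomic, so requests from the main application and from the power-off dialog cannot interleave byte by byte. The display process can then decide which client currently "owns" the screen and ignore or queue requests from the other one.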

Difference between OS scheduling and RTOS scheduling

Consider the function/process,
void task_fun(void) { while (1); }
If this process were to run on a normal PC OS, it would happily run forever. But on a mobile phone, it would surely crash the entire phone in a matter of minutes as the HW watchdog expires and resets the system.
On a PC, this process would be scheduled out once it has used up its allotted time slice, and a new runnable process would be scheduled to run.
My doubt is: why can't we apply the same strategy on an RTOS? What performance limitation is involved if such a scheduling policy is implemented on an RTOS?
One more doubt: I checked the schedule() function of both my PC OS (Ubuntu) and my phone, which also runs the Linux kernel, and found them to be almost the same. Where is the watchdog handling done on my phone? My assumption is that the scheduler is the one that starts the watchdog before letting a process run. Can someone point me to where in the code it's being done?
The phone "crashing" is an issue with the phone design or the specific OS, not with embedded OSes or RTOSes in general. A task like this would 'starve' lower priority tasks (possibly including the watchdog service), which is probably what is happening here.
In most embedded RTOSes it is intended that all processes are defined at deployment by the system designer and the design is for all processes to be scheduled as required. Placing user defined or third party code on such a system can compromise its scheduling scheme as in your example. I would suggest that all such processes should run at the same low priority as all others so that the round-robin scheduler will service user application equally without compromising system services.
Phone operating systems are usually RTOS, but user processes should not run at higher priority than system processes. It may be intentional that such processes run higher than the watchdog service, exactly to protect the system from "misbehaving" applications such as the one yours simulates.
Most RTOSes use a pre-emptive priority based scheduler (highest priority ready task runs until it terminates, yields, or is pre-empted by a higher priority task or interrupt). Some also schedule round-robin for tasks at the same priority level (task runs until it terminates, yields or consumes its time-slice and other tasks of the same priority are ready to run).
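The selection rule just described can be sketched in a few lines. This is a toy model, not the code of any particular RTOS: the struct layout, the convention that a larger number means higher priority, and the name pick_next are all assumptions made for illustration.

```c
/* Toy task descriptor. Assumption: larger number = higher priority. */
struct task {
    int priority;
    int ready;      /* non-zero if the task is runnable */
};

/* Returns the index of the task to run next, or -1 if none is ready.
 * `last` is the index of the task that ran most recently (-1 if none). */
int pick_next(const struct task *t, int n, int last)
{
    int best = -1;

    /* Scan starting just after `last`, so that among ready tasks of
     * equal top priority the one waiting longest is found first. */
    for (int k = 1; k <= n; k++) {
        int i = (last + k + n) % n;
        if (t[i].ready && (best < 0 || t[i].priority > t[best].priority))
            best = i;   /* strictly higher only: ties keep scan order */
    }
    return best;
}
```

With two ready tasks at priority 3 and one at priority 1, repeated calls alternate between the two priority-3 tasks and never return the priority-1 task. That is the starvation the answers above talk about: a high-priority task that never blocks or yields keeps everything below it off the CPU, including a low-priority watchdog service.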
There are several ways a watchdog can be implemented, none of which is imposed by Linux:
A process or thread runs periodically to test that vital operations are being performed. If they are not, corrective action is taken, such as rebooting the machine or resetting a troublesome component.
A process or thread runs continuously to soak up extra CPU time and reset a timer. If the task is not able to run, a timer expires and takes corrective action.
A hardware component resets the system if it is not periodically massaged; that is, a hardware timer expires.
There is nothing here that can't be done on either an RTOS or any other multitasking operating system.
Linux, on a desktop computer or on a mobile phone, is not an RTOS. Its scheduling policy is time-driven.
On an RTOS, scheduling is triggered by events, either from the environment through an ISR or from the software itself through system calls (send message, wait for mutex, ...).
In a normal OS, we have two types of processes: user processes and kernel processes. Kernel processes have time constraints; user processes do not.
In an RTOS, all processes are kernel processes, and hence time constraints must be strictly followed. All processes/tasks (the terms can be used interchangeably) are scheduled by priority, and meeting the time constraints is important for the system to run correctly.
So, if your code void task_fun(void) { while (1); } runs forever, other tasks will starve. Hence the watchdog will reset the system, to signal to the developer that the time constraints of other tasks are not being met.
For example, the GSM scheduler needs to run every 4.6 ms; if your task runs for longer than that, the time constraints of the GSM scheduler task cannot be satisfied. So the system has to reboot, as its purpose is defeated.
Hope this helps :)
