System lock or infinite loop is able to cause reboot? - linux

My question is related to knowledge on embedded Linux.
I just observed a strange reboot on my embedded project, which is very easy to reproduce.
When some condition is triggered, the system will like "freezing". I mean, its like encounter some infinite loop or be locked. Last for several seconds, system will quietly reboot. Not even core dump!!
I have no much clue about the cause. Generally will a lock or infinite loop can truly trigger Linux reboot? Or are there any things can freeze system and cause reboot with no core dump happens?

It is common on embedded systems to have a hardware watchdog; a timer implemented in hardware that resets the processor if it is allowed to expire.
Typically some software monitoring task continuously verifies the integrity of the system and restarts the hardware watchdog timer. If the monitoring task fails to run and the watchdog timer expires, the watchdog triggers a processor reset directly.

Your question is a bit hard to understand but yes, a "infinite loop" (the proper term is) in any application on any platform (including Linux) can crash a system. This happens obviously because an infinite loop can constantly take up memory and resources until there is none left. You mentioned you are doing embedded development (which can mean many different things) but usually means you are developing low-level applications built into Linux itself; these are more prone to crashing an OS than your average programming venture.

Related

Is it possible to circumvent OS security by not using the supplied System Calls?

I understand that an Operating System forces security policies on users when they use the system and filesystem via the System Calls supplied by stated OS.
Is it possible to circumvent this security by implementing your own hardware instructions instead of making use of the supplied System Call Interface of the OS? Even writing a single bit to a file where you normally have no access to would be enough.
First, for simplicity, I'm considering the OS and Kernel are the same thing.
A CPU can be in different modes when executing code.
Lets say a hypothetical CPU has just two modes of execution (Supervisor and User)
When in Supervisor mode, you are allowed to execute any instructions, and you have full access to the hardware resources.
When in User mode, there is subset of instructions you don't have access to, such has instructions to deal with hardware or change the CPU mode. Trying to execute one of those instructions will cause the OS to be notified your application is misbehaving, and it will be terminated. This notification is done through interrupts. Also, when in User mode, you will only have access to a portion of the memory, so your application can't even touch memory it is not supposed to.
Now, the trick for this to work is that while in Supervisor Mode, you can switch to User Mode, since it's a less privileged mode, but while in User Mode, you can't go back to Supervisor Mode, since the instructions for that are not permitted anymore.
The only way to go back to Supervisor mode is through system calls, or interrupts. That enables the OS to have full control of the hardware.
A possible example how everything fits together for this hypothetical CPU:
The CPU boots in Supervisor mode
Since the CPU starts in Supervisor Mode, the first thing to run has access to the full system. This is the OS.
The OS setups the hardware anyway it wants, memory protections, etc.
The OS launches any application you want after configuring permissions for that application. Launching the application switches to User Mode.
The application is running, and only has access to the resources the OS allowed when launching it. Any access to hardware resources need to go through System Calls.
I've only explained the flow for a single application.
As a bonus to help you understand how this fits together with several applications running, a simplified view of how preemptive multitasking works:
In a real-world situation. The OS will setup an hardware timer before launching any applications.
When this timer expires, it causes the CPU to interrupt whatever it was doing (e.g: Running an application), switch to Supervisor Mode and execute code at a predetermined location, which belongs to the OS and applications don't have access to.
Since we're back into Supervisor Mode and running OS code, the OS now picks the next application to run, setups any required permissions, switches to User Mode and resumes that application.
This timer interrupts are how you get the illusion of multitasking. The OS keeps changing between applications quickly.
The bottom line here is that unless there are bugs in the OS (or the hardware design), the only way an application can go from User Mode to Supervisor Mode is through the OS itself with a System Call.
This is the mechanism I use in my hobby project (a virtual computer) https://github.com/ruifig/G4DevKit.
HW devices are connected to CPU trough bus, and CPU does use to communicate with them in/out instructions to read/write values at I/O ports (not used with current HW too much, in early age of home computers this was the common way), or a part of device memory is "mapped" into CPU address space, and CPU controls the device by writing values at defined locations in that shared memory.
All of this should be not accessible at "user level" context, where common applications are executed by OS (so application trying to write to that shared device memory would crash on illegal memory access, actually that piece of memory is usually not even mapped into user space, ie. not existing from user application point of view). Direct in/out instructions are blocked too at CPU level.
The device is controlled by the driver code, which is either run is specially configured user-level context, which has the particular ports and memory mapped (micro-kernel model, where drivers are not part of kernel, like OS MINIX). This architecture is more robust (crash in driver can't take down kernel, kernel can isolate problematic driver and restart it, or just kill it completely), but the context switches between kernel and user level are a very costly operation, so the throughput of data is hurt a bit.
Or the device drivers code runs on kernel-level (monolithic kernel model like Linux), so any vulnerability in driver code can attack the kernel directly (still not trivial, but lot more easier than trying to get tunnel out of user context trough some kernel bug). But the overall performance of I/O is better (especially with devices like graphics cards or RAID disc clusters, where the data bandwidth goes into GiBs per second). For example this is the reason why early USB drivers are such huge security risk, as they tend to be bugged a lot, so a specially crafted USB device can execute some rogue code from device in kernel-level context.
So, as Hyd already answered, under ordinary circumstances, when everything works as it should, user-level application should be not able to emit single bit outside of it's user sandbox, and suspicious behaviour outside of system calls will be either ignored, or crash the app.
If you find a way to break this rule, it's security vulnerability and those get usually patched ASAP, when the OS vendor gets notified about it.
Although some of the current problems are difficult to patch. For example "row hammering" of current DRAM chips can't be fixed at SW (OS) or CPU (configuration/firmware flash) level at all! Most of the current PC HW is vulnerable to this kind of attack.
Or in mobile world the devices are using the radiochips which are based on legacy designs, with closed source firmware developed years ago, so if you have enough resources to pay for a research on these, it's very likely you would be able to seize any particular device by fake BTS station sending malicious radio signal to the target device.
Etc... it's constant war between vendors with security researchers to patch all vulnerabilities, and hackers to find ideally zero day exploit, or at least picking up users who don't patch their devices/SW fast enough with known bugs.
Not normally. If it is possible it is because of an operating system software error. If the software error is discovered it is fixed fast as it is considered to be a software vulnerability, which equals bad news.
"System" calls execute at a higher processor level than the application: generally kernel mode (but system systems have multiple system level modes).
What you see as a "system" call is actually just a wrapper that sets up registers then triggers a Change Mode Exception of some kind (the method is system specific). The system exception hander dispatches to the appropriate system server.
You cannot just write your own function and do bad things. True, sometimes people find bugs that allow circumventing the system protections. As a general principle, you cannot access devices unless you do it through the system services.

Linux/Windows: User mode vs Privilege mode time spent

How much percentage of time CPU spends in user mode vs privilege mode for different programs/operations.
Different Operations could be:
- running application without I/O interaction.
- application with I/O interaction like copying a file to USB
I know for a fact that Network operating system spends most of the time in interrupt context. Does this hold true for general purpose OS like Ubuntu/Windows?
I'm not much of an OS expert but I imagine it will depend a great deal on what background processes are running on the system. On any OS you might or might not be running some system (i.e. non-user) processes that are heavy resource users. Or you might have put some effort into stripping the system down so that very little CPU time is being used by the system for background maintenance.
If your question is how things compare for "clean" installations of these operating systems then all I can tell you is that on my laptop running Ubuntu right now (running top from the command line to look at resource usage) only about 5-10% of CPU time is being used by non-user processes; in my case Xorg and compiz are the main ones. I don't really know how that compares to Windows, but I think most linux users have a knee jerk reaction that Windows is greedier for system resources than most linux distros.
So, I guess the short answer is that I doubt there is a short answer to your question.

Difference between OS scheduling and RTOS scheduling

Consider the function/process,
void task_fun(void)
{
while(1)
}
If this process were to run on a normal PC OS, it would happily run forever. But on a mobile phone, it would surely crash the entire phone in a matter of minutes as the HW watchdog expires and resets the system.
On a PC, this process, after it expires its stipulated time slice would be scheduled out and a new runnable process would be scheduled to run.
My doubt is why cant we apply the same strategy on an RTOS? What is the performance limitation involved if such a scheduling policy is implemeted on an RTOS?
One more doubt is that I checked the schedule() function of both my PC OS ( Ubuntu ) and my phone which also runs Linux Kernel. I found both of them to be almost the same. Where is the watchdog handing done on my phone? My assumption is that scheduler is the one who starts the watchdog before letting a process run. Can someone point me where in code its being done?
The phone "crashing" is an issue with the phone design or the specific OS, not embedded OSes or RTOSes in general. It would 'starve' lower priority tasks (possibly including the watchdog service), which is probably what is happening here.
In most embedded RTOSes it is intended that all processes are defined at deployment by the system designer and the design is for all processes to be scheduled as required. Placing user defined or third party code on such a system can compromise its scheduling scheme as in your example. I would suggest that all such processes should run at the same low priority as all others so that the round-robin scheduler will service user application equally without compromising system services.
Phone operating systems are usually RTOS, but user processes should not run at higher priority that system processes. It may be intentional that such processes run higher than the watchdog service exactly to protect the system from "misbehaving" applications which yours simulates.
Most RTOSes use a pre-emptive priority based scheduler (highest priority ready task runs until it terminates, yields, or is pre-empted by a higher priority task or interrupt). Some also schedule round-robin for tasks at the same priority level (task runs until it terminates, yields or consumes its time-slice and other tasks of the same priority are ready to run).
There are several ways a watchdog can be implemented, none of which is imposed by Linux:
A process or thread runs periodically to test that vital operations are being performed. If they are not, correction action is taken, like reboot the machine, or reset a troublesome component.
A process or thread runs continuously to soak up extra CPU time and reset a timer. If the task is not able to run, a timer expires and takes corrective action.
A hardware component resets the system if it is not periodically massaged; that is, a hardware timer expires.
There is nothing here that can't be done on either an RTOS or any other multitasking operating system.
Linux, on a desktop computer or on a mobile phone, is not a RTOS. Its scheduling policy is time-driven.
On a RTOS, scheduling is triggered by events, either from environment through ISR or from software itself through system calls (send message, wait for mutex, ...)
In a normal OS, we have two types of processes. User process & kernel Process. Kernel processes have time constraints.However, user processes do not have time constraints.
In a RTOS,all process are Kernel process & hence time constraints should be strictly followed. All process/task (can be used interchangeably) are based on priority and time constraints are important for the system to run correctly.
So, if your code void task_fun(void) { while(1) } runs forever, other higher priority tasks will be starving. Hence, watch dog will crash the system to specify the developer that time constraints of other tasks are not met.
For example, GSM Scheduler needs to run every 4.6ms, if your task runs for more time, time constraints of GSM Scheduler task cannot be satisfied. So the system has to reboot as its purpose is defeated.
Hope this helps :)

Modify Timer Interrupt in Linux

At college I'm studying Operative Systems, and as a first part of the project we have to modify the Timer Interrupt to execute my own code, may be with threads, and I think that Linux present less restrictions to access the Interrupt Vector that Windows does, is not?
Can you give me more details if it's better use Windows or Linux (like Ubuntu) to do this.
Thanks.
I would use Linux, because I think you might fail your assignment if you use Windows. The reason being that the commonly accessible timers (i.e. non-driver stuff) under Windows are not really interrupts, they're messages posted to your thread's message queue.
Whereas under Linux signal/sigaction in combination with timer_create will send a signal, which really counts as "interrupt".

How to do power save on a ARM-based Embedded Linux system?

I plan to develop a nice little application that will run on an arm-based embedded Linux platform; however, since that platform will be battery-powered, I'm searching for relevant information on how to handle power save.
It is kind of important to get decent battery time.
I think the Linux kernel implemented some support for this, but I can't find any documentation on this subject.
Any input on how to design my program and the system is welcome.
Any input on how the Linux kernel tries to solves this type of problem is also welcome.
Other questions:
How much does the program in user space need to do?
And do you need to modify the kernel?
What kernel system calls or APIs are good to know about?
Update:
It seems like the folks involved with the "Free Electrons" site have produced some nice presentations on this subject.
http://free-electrons.com/services/power-management/
http://free-electrons.com/docs/power
http://free-electrons.com/docs/optimizations
But maybe someone else has even more information on this subject?
Update:
It seems like Adam Shiemke's idea to go look at the MeeGo project may be the best tip so far.
It may be the best battery powered Embedded Linux project out there at this moment.
And Nokia is usually kind of good at this type of thing.
Update:
One has to be careful about Android since it has a "modified" Linux kernel in the bottom, and some of the things the folks at Google have done do not use baseline/normal Linux kernels. I think that some of their power management ideas could be troublesome to reuse for other projects.
I haven't actually done this, but I have experience with the two apart (Linux and embedded power management). There are two main Linux distributions that come to mind when thinking about power management, Android and MeeGo. MeeGo uses (as far as I can tell) an unmodified 2.6 kernel with some extras hanging on. I wasn't able to find a lot on exactly what their power management strategy is, although I suspect more will be coming out about it in the near future as the product approaches maturity.
There is much more information available on Android, however. They run a fairly heavily modified 2.6 kernel. You can see a good bit on the different strategies implemented in http://elinux.org/Android_Power_Management (as well as kernel drama). Some other links:
https://groups.google.com/group/android-kernel/browse_thread/thread/ee356c298276ad00/472613d15af746ea?lnk=raot&pli=1
http://www.ok-labs.com/blog/entry/context-switching-in-context/
I'm sure that you can find more links of this nature. Since both projects are open source, you can grab the kernel code, and probably get further information from people who actually know what they are talking about in forms and groups.
At the driver level, you need to make sure that your drivers can properly handle suspend and shut devices off that are not in use. Most devices aimed at the mobile market offer very fine-grained support to turn individual components off, and to tweak clock settings (remember, power is proportional to clock^2).
Hope this helps.
You can do quite a bit of power-saving without requiring any special support from the OS, assuming you are writing (or at least have the source code for) your application and drivers.
Your drivers need to be able to disable their associated devices and bring them back up without requiring a restart or introducing system instability. If your devices are connected to a PCI/PCIe bus, research which power states they support (D0 - D3) and what your driver needs to do to transition between these low-power modes. If you are selecting hardware devices to use, look for devices that adhere to the PCI Power Management Specification or have similar functionality (such as a sleep mode and a "wake up" interrupt signal).
When your device boots up, every device that has the ability to detect whether it is connected to anything needs to do so. If any ports or buses detect that they are not being used, power them down or put them to sleep. A port running at full power but sitting unused can waste more power than you might think it would. Depending on your particular hardware and use case, it might also be useful to have a background app that monitors device usage, identifies unused/idle resources, and acts appropriately (like a "screen saver" for your hardware).
Your application software should make sure to detect whether hardware devices are powered up before attempting to use them. If you need to access a device that might be placed in a low-power mode, your application needs to be able to handle a potentially lengthy delay in waiting for the device to wake up and respond. Your applications should also be considerate of a device's need to sleep. If you need to send a series of commands to a hardware device, try to buffer them up and send them out all at once instead of spacing them out and requiring multiple wakeup->send->sleep cycles.
Don't be afraid to under-clock your system components slightly. Besides saving power, this can help them run cooler (which requires less power for cooling). I have seen some designs that use a CPU that is more powerful than necessary by a decent margin, which is then under-clocked by as much as 40% (bringing the performance down to the original level but at a fraction of the power cost). Also, don't be afraid to spend power to save power. That is, don't be afraid to use CPU time monitoring hardware devices for opportunities to disable/hibernate them (even if it will cause your CPU to use a bit more power). Most of the time, this tradeoff results in a net power savings.
One of the most important things to think of as a power aware application developer is to avoid unnecessary timers. If possible use interrupt driven solutions instead of polled solutions. If a timer must be used then use as long poll interval as is possible.
For example if something special should be done at a certain room temperature it is unnecessary to check the temperature every 100 ms since temperature in a room changes slowly. A more reasonable polling interval is could be 60 s.
This affects the power consumption in several ways. In Linux the CPUIDLE subsystem takes the CPU (SOC) to as deep power saving state as possible depending on when it predicts the next wakeup to occur. Having a lot of timers in a system will fragment the sleep making it impossible to go to the deeper sleep states for longer periods. A typical deep sleep state for CPUIDLE turns the CPU off but keeps the RAM in self refresh. When a timer triggers the CPU will boot and serve the timer of the application.
It's not actually your topic, but it might come in handy to log your progress: i was looking for testing / measuring my embedded linux system. chris desjardins from this forum recommended me this:
I have successfully used bootchart in the past:
http://elinux.org/Bootchart
Here is a list of other things that may also help:
http://elinux.org/Boot_Time

Resources