How to handle power-failure in WinCE?

I've got a WinCE device powered over ethernet (PoE) and I want to prevent File-System corruption following a potential power loss, e.g. user pulling the plug.
As a side note, I'm already using TexFAT, which is supposed to prevent FS corruption. While the latter certainly helps reduce FS corruption (compared with plain old FAT), it doesn't entirely prevent it from occurring from time to time... So I'm considering using a small rechargeable backup battery that would give WinCE enough time to shut down cleanly. However, I can't find any information on the shutdown process: how to trigger it, how long it takes, and so on... MSDN is pretty quiet on this topic. Any ideas?

The powerdown sequence is totally platform dependent.
The following answer is relevant to Windows CE 6. It may be different for previous versions of CE.
If you include the Power Manager component in your system, the sequence is roughly this:
A request to go to D4 is sent to all drivers that are power-manageable and that reported support for this state. A driver that does not support D4 gets the lowest power state it supports.
XXX_PowerDown is called, but it is not commonly used in Windows CE 6.
In between, the registry is flushed if you have a hive-based registry and have enabled the registry flush thread. You should disable this in a fragile system such as yours.
OEMPowerOff
device down
Just found a post by Bruce Eitman on what happens when Suspending. He puts it better than I do.
The Suspend sequence is what you'd do before losing power.


What causes dma_map_page/dma_unmap_page to take longer time on some hardware?

I've been programming a Linux kernel module for several years for a PCIe device. One of its main features is transferring data from the PCIe card to host memory using DMA.
I'm using streaming DMA, i.e. it's the user program that allocates the memory, and my kernel module has to do the job of locking the pages and creating the scatter gather structure. It works correctly.
However, on some more recent hardware with Intel processors, the calls to dma_map_page and dma_unmap_page take much longer to execute.
I've tried using dma_map_sg and dma_unmap_sg instead; they take approximately the same (longer) time.
I've tried splitting the dma_unmap_sg into a first call to dma_sync_sg_for_cpu, followed by a call to dma_unmap_sg_attrs with the attribute DMA_ATTR_SKIP_CPU_SYNC. It works correctly, and I can see that the additional time is spent in the unmap operation, not in the sync.
I've tried to play with the linux command line parameters relating to the iommu (on, force, strict=0), and also intel_iommu, with no change in the behavior.
Some other hardware shows a decent transfer rate, i.e. more than 6GB/s on PCIe3x8 (max 8GB/s).
The issue on some recent hardware is limiting transfer rate to ~3GB/s (I've checked that the card is correctly configured for PCIe3x8, and the programmer of the Windows device driver manages to achieve the 6GB/s on the same system. Things are more behind the curtains in Windows and I cannot get much information from him.)
On some hardware, the behavior is either normal or slowed, depending on the Linux distribution (and the Linux kernel version I guess). On some other hardware, the roles are reversed, i.e. the slow one becomes the fast one and vice-versa.
I cannot figure out the cause of this. Any clue?
The trouble was the bounce buffers. Didn't know about this.

ncurses disable kernel messages on console screen?

I'm looking for a way to get rid of the (kernel?) messages that appear in my ncurses app. I wrote the app myself, so I would prefer an API that redirects these messages to /dev/null. I mean messages like the one shown when a USB stick is inserted.
I tried adding this, but unfortunately it doesn't work:
freopen("/dev/null", "w", stderr);
I'm not running X, just ncurses direct from the console.
Thanks!
UPDATE 1:
Someone voted to close this question as not related to programming. But it is: I wrote the ncurses app myself, and I'm looking for a way to disable the kernel messages. I updated the question.
UPDATE 2:
Let me explain what I'm doing, and what the problem is, in more detail:
I'm using Tiny Core Linux, which starts my (self-written) ncurses program after booting. Now when you, for example, connect a USB drive, a message (from the kernel, I suspect) is shown over my program. I guess the message is written straight into the framebuffer. I'm using TC 5.x since I need 32 bit; I'm running as root and have full access to the OS.
You should be able to use openvt to have your program run on a new Virtual Terminal.
I'll also note that it should be possible to embed control for the VTs yourself if you prefer to break the external dependency, but note that structures used may not be stable between kernel versions, and may require recompilation.
See the KBD project's sources, specifically openvt.c to see how it works.
Try configuring the kernel through boot parameters with the option:
loglevel=3 (or a lower value)
0 (KERN_EMERG) system is unusable
1 (KERN_ALERT) action must be taken immediately
2 (KERN_CRIT) critical conditions
3 (KERN_ERR) error conditions
4 (KERN_WARNING) warning conditions
5 (KERN_NOTICE) normal but significant condition
6 (KERN_INFO) informational
7 (KERN_DEBUG) debug-level messages
source: https://www.kernel.org/doc/Documentation/kernel-parameters.txt
See also: Change default console loglevel during boot up
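On a running system, the same threshold can usually be inspected and changed through /proc/sys/kernel/printk, whose first field is the current console loglevel. Below is a minimal Python sketch, assuming a Linux system and root privileges for the write; the helper names are made up for illustration and are not part of any library.

```python
from pathlib import Path

# Standard Linux location; the first of its four integers is the console loglevel.
PRINTK_PATH = Path("/proc/sys/kernel/printk")


def parse_printk(text):
    """Return the four printk values as a tuple of ints."""
    return tuple(int(field) for field in text.split()[:4])


def reaches_console(message_level, console_loglevel):
    """Only messages with a level strictly below the console loglevel are shown."""
    return message_level < console_loglevel


def set_console_loglevel(level, path=PRINTK_PATH):
    """Rewrite the first field and keep the other three (needs root on the real file)."""
    current = parse_printk(Path(path).read_text())
    updated = (level,) + current[1:]
    Path(path).write_text(" ".join(str(value) for value in updated) + "\n")
    return updated
```

With the console loglevel at 3, an informational message (level 6), which is what a USB hot-plug notice typically is, stays in the kernel log but is no longer drawn on the console. Calling set_console_loglevel(3) from the ncurses program at startup would be one way to apply this without touching the boot parameters.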
It might be impossible to block some other process with sufficient access from writing to /dev/console but you may be able to redefine console as some other device, at boot time by setting console=ttyS0 (first serial port), see:
https://unix.stackexchange.com/questions/60641/linux-difference-between-dev-console-dev-tty-and-dev-tty0
Also, if we knew exactly which software is sending the message, it might be possible to reconfigure it (possibly dynamically). It would help to know which version and edition of Tiny Core Linux you are using.
E.g. this website lists "Core", "TinyCore" and "CorePlus" editions, in versions 1.x up to 7:
http://tinycorelinux.net/downloads.html
This would help reproducing the exact same behavior and testing potential solutions.

Recover from OpenCL freeze in Linux

I am writing my first OpenCL kernels on an Ubuntu machine with an NVIDIA card. Once in a while, the application totally freezes the whole computer. The mouse does not move, and the only way to reboot is by force-pressing the power button.
I've realized that the reason for the freezes is that I accidentally read past the last index of a global, read-only float array. While this is something I don't intend to do often, it might still happen in the future.
My question is - is there any way to prevent the computer from completely shutting down if this happens again? I know that, for example, Windows can shut down bad GLSL kernels and recover with a graphics driver restart. Is something similar possible here?
You may not be able to recover completely, but you can recover better using SysRq (sometimes called System Request or Magic SysRq). By pressing a specific key combination you can have Linux reboot in a somewhat sane way (killing processes and unmounting filesystems). This key sequence is described in detail at http://en.wikipedia.org/wiki/Magic_SysRq_key so I won't repeat it here.
In some cases you might be able to still SSH to the device. If this is your case you might be in more luck. If you can SSH there are a number of other options you can try such as: unloading/reloading the crashed module, restarting the xserver, or at least rebooting the normal way.
Although I'm not an expert on "HURD", I believe it was designed to handle this type of condition better. The only other solution I can think of is using two graphics cards, one for X and one for OpenCL. Depending on what you are doing, you might have to pass the NVIDIA card through to a VM in order to isolate it completely from your host.

Windows CE device powering off Randomly

I have a touch screen device that is running Windows CE. After 30 seconds the screen goes off to save power, and it comes back on if you touch it.
The problem is that, at random, when the screen goes off the device will not come back on simply by touching the screen. I have done a bunch of tests and there is no noticeable pattern to when this happens.
It appears to be performing the same action as when you press the suspend button from the main menu.
I have done some research and found there are 4 power saving settings in the registry and I think I need to disable one to stop the device from "suspending". I never want the device to turn off except for the screen going off, it is always connected to power.
Does anyone know how I can do this or why it is randomly suspending ?
Also, the entire device is in Chinese, so really precise instructions would be appreciated. My application runs on top of CE.
I know you're after precise instructions, but it's not that simple. The device OEM defined and implemented the power management system for the device; Microsoft only provided the structure for it. The OEM could have implemented power management in any way they saw fit, and in fact they could have completely ignored the Microsoft-provided framework (it wouldn't be the first time an OEM did that). Really, you need to get hold of the OEM and ask them how to prevent the behavior you're seeing or how to get something different.
Barring that, you could always play around with the registry entries, but again, there's no guarantee any of them will work. You might look at adjusting power state or the activity timer registry entries.
Playing with the power manager control panel applet might also help (it's probably labelled 电源管理)
[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Power\Timeouts]
"BattSuspend"=dword:0

How to "hibernate" a process in Linux by storing its memory to disk and restoring it later?

Is it possible to 'hibernate' a process in Linux?
Just like 'hibernate' on a laptop, I would like to write all the memory used by a process to disk and free up the RAM. Then, later on, I could 'resume the process', i.e. read all the data back from disk into RAM and continue with my process.
I used to maintain CryoPID, which is a program that does exactly what you are talking about. It writes the contents of a program's address space, VDSO, file descriptor references and states to a file that can later be reconstructed. CryoPID started when there were no usable hooks in Linux itself and worked entirely from userspace (actually, it still does work, depending on your distro / kernel / security settings).
Problems were (indeed) sockets, pending RT signals, numerous X11 issues, and the glibc caching getpid() implementation, amongst many others. Randomization (especially of the VDSO) turned out to be insurmountable for the few of us working on it after Bernard walked away from it. However, it was fun and became the topic of several master's theses.
If you are just contemplating a program that can save its running state and restart directly into that state, it's far, far easier to just save that information from within the program itself, perhaps when servicing a signal.
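To illustrate saving state from within the program itself, here is a minimal Python sketch. Everything in it is an assumption made for the example: the Checkpointer class, the JSON file format, the toy counter state and the choice of SIGUSR1 are hypothetical, not taken from CryoPID or any other tool.

```python
import json
import os
import signal


class Checkpointer:
    """Save program state on request; the signal handler only sets a flag."""

    def __init__(self, path):
        self.path = path
        self.requested = False

    def on_signal(self, signum, frame):
        # Keep the handler trivial; the real work happens in the main loop.
        self.requested = True

    def save(self, state):
        # Write to a temporary file, then rename it over the old checkpoint,
        # so a crash mid-write does not clobber the previous checkpoint.
        tmp_path = self.path + ".tmp"
        with open(tmp_path, "w") as handle:
            json.dump(state, handle)
            handle.flush()
            os.fsync(handle.fileno())
        os.replace(tmp_path, self.path)
        self.requested = False

    def load(self, default):
        # Resume from the last checkpoint, or start fresh if there is none.
        try:
            with open(self.path) as handle:
                return json.load(handle)
        except FileNotFoundError:
            return default


def run(checkpointer, steps):
    """Toy main loop: count upward, saving whenever a checkpoint was requested."""
    state = checkpointer.load({"counter": 0})
    signal.signal(signal.SIGUSR1, checkpointer.on_signal)
    for _ in range(steps):
        state["counter"] += 1
        if checkpointer.requested:
            checkpointer.save(state)
    checkpointer.save(state)  # final checkpoint on a clean exit
    return state
```

Sending SIGUSR1 to the running program (kill -USR1 <pid>) asks for a checkpoint at the next loop iteration, and a later start picks up from the saved file. This only covers state the program knows how to describe; open sockets and the like still have to be re-established by the program.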
I'd like to put a status update here, as of 2014.
The accepted answer suggests CryoPID as a tool to perform Checkpoint/Restore, but I found the project to be unmaintained and impossible to compile with recent kernels.
Now, I found two actively maintained projects providing the application checkpointing feature.
The first, the one I suggest because I had better luck running it, is CRIU, which performs checkpoint/restore mainly in userspace and requires the kernel option CONFIG_CHECKPOINT_RESTORE to be enabled.
Checkpoint/Restore In Userspace, or CRIU (pronounced kree-oo, IPA: /krɪʊ/, Russian: криу), is a software tool for Linux operating system. Using this tool, you can freeze a running application (or part of it) and checkpoint it to a hard drive as a collection of files. You can then use the files to restore and run the application from the point it was frozen at. The distinctive feature of the CRIU project is that it is mainly implemented in user space.
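As a rough sketch of how CRIU is typically driven, the Python helpers below build the dump and restore command lines and run them. The helper names are hypothetical; the sketch assumes criu is installed, is run with sufficient privileges (usually root), and that the target was started from a shell, which is why --shell-job is passed.

```python
import subprocess


def criu_dump_cmd(pid, images_dir, shell_job=True):
    """Command line that checkpoints the process tree rooted at pid."""
    cmd = ["criu", "dump", "--tree", str(pid), "--images-dir", images_dir]
    if shell_job:
        cmd.append("--shell-job")  # target is attached to a terminal session
    return cmd


def criu_restore_cmd(images_dir, shell_job=True):
    """Command line that restores the process tree saved in images_dir."""
    cmd = ["criu", "restore", "--images-dir", images_dir]
    if shell_job:
        cmd.append("--shell-job")
    return cmd


def checkpoint(pid, images_dir):
    """Run the dump; raises CalledProcessError if criu reports a failure."""
    subprocess.run(criu_dump_cmd(pid, images_dir), check=True)


def restore(images_dir):
    """Run the restore; raises CalledProcessError if criu reports a failure."""
    subprocess.run(criu_restore_cmd(images_dir), check=True)
```

By default a dump leaves the original process killed, so the image directory becomes the only copy of it until it is restored.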
The second is DMTCP; quoting from their main page:
DMTCP (Distributed MultiThreaded Checkpointing) is a tool to transparently checkpoint the state of multiple simultaneous applications, including multi-threaded and distributed applications. It operates directly on the user binary executable, without any Linux kernel modules or other kernel modifications.
There is also a nice Wikipedia page on the topic: Application_checkpointing
The answers mentioning ctrl-z are really talking about stopping the process with a signal, in this case SIGTSTP. You can issue a stop signal with kill:
kill -STOP <pid>
That will suspend execution of the process. It won't immediately free the memory used by it, but as memory is required for other processes the memory used by the stopped process will be gradually swapped out.
When you want to wake it up again, use
kill -CONT <pid>
The more complicated solutions, like CryoPID, are really only needed if you want the stopped process to be able to survive a system shutdown/restart - it doesn't sound like you need that.
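The same stop/continue pair can be issued from a program. The sketch below is a small Python illustration, assuming Linux; the function names and the sleeping child process are made up for the demo.

```python
import os
import signal
import subprocess
import sys


def stop_process(pid):
    """Equivalent of `kill -STOP <pid>`: suspend execution."""
    os.kill(pid, signal.SIGSTOP)


def resume_process(pid):
    """Equivalent of `kill -CONT <pid>`: let the process run again."""
    os.kill(pid, signal.SIGCONT)


def demo():
    """Stop and resume a throwaway child, confirming each state change."""
    child = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(30)"]
    )
    try:
        stop_process(child.pid)
        # WUNTRACED makes waitpid report a stopped (not exited) child.
        _, status = os.waitpid(child.pid, os.WUNTRACED)
        was_stopped = os.WIFSTOPPED(status)

        resume_process(child.pid)
        # WCONTINUED makes waitpid report a child resumed by SIGCONT.
        _, status = os.waitpid(child.pid, os.WCONTINUED)
        was_continued = os.WIFCONTINUED(status)
    finally:
        child.kill()
        child.wait()
    return was_stopped, was_continued
```

The waitpid checks only work for child processes; for an unrelated pid, the state field in /proc/<pid>/stat can be read instead.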
The Linux kernel has now partially implemented checkpoint/restart features (https://ckpt.wiki.kernel.org/); the status is here.
Some useful information is on LWN (Linux Weekly News):
http://lwn.net/Articles/375855/ http://lwn.net/Articles/412749/ ......
So the answer is "YES"
The issue is restoring the streams - files and sockets - that the program has open.
When your whole OS hibernates, local files and the like can obviously be restored. Network connections can't, but code that accesses the internet typically does more error checking and survives the error conditions (or ought to).
If you did per-program hibernation (without application support), how would you handle open files? What if another process accesses those files in the interim? etc?
Maintaining state when the program is not loaded is going to be difficult.
Simply suspending the threads and letting it get swapped to disk would have much the same effect?
Or run the program in a virtual machine and let the VM handle suspension.
Short answer is "yes, but not always reliably". Check out CryoPID:
http://cryopid.berlios.de/
Open files will indeed be the most common problem. CryoPID states explicitly:
Open files and offsets are restored. Temporary files that have been unlinked and are not accessible on the filesystem are always saved in the image. Other files that do not exist on resume are not yet restored. Support for saving file contents for such situations is planned.
The same issues will also affect TCP connections, though CryoPID supports tcpcp for connection resuming.
I extended CryoPID, producing a package called Cryopid2, available from SourceForge. It can migrate a process as well as hibernate it (along with any open files and sockets: data in sockets/pipes is sucked into the process on hibernation and spat back into them when the process is restarted).
The reason I have not been active with this project is that I am not a kernel developer. Both this and the original CryoPID need someone on board who can get them running with the latest kernels (e.g. Linux 3.x).
The CryoPID method does work, and is probably the best solution for general-purpose process hibernation/migration in Linux that I have come across.
The short answer is "yes." You might start by looking at this for some ideas: ELF executable reconstruction from a core image (http://vx.netlux.org/lib/vsc03.html)
As others have noted, it's difficult for the OS to provide this functionality, because the application needs to have some error checking builtin to handle broken streams.
However, on a side note, some programming languages and tools that use virtual machines explicitly support this functionality, such as the Self programming language.
This is sort of the ultimate goal of a clustered operating system. Matthew Dillon has put a lot of effort into implementing something like this in his DragonFly BSD project.
Adding another workaround: you can use VirtualBox. Run your applications in a regular virtual machine and simply "save the machine state" whenever you want.
I know this is not an answer, but I thought it could be useful when there are no real options.
If for any reason you don't like VirtualBox, VMware and QEMU are just as good.
Ctrl-Z increases the chances the process's pages will be swapped, but it doesn't free the process's resources completely. The problem with freeing a process's resources completely is that things like file handles, sockets are kernel resources the process gets to use, but doesn't know how to persist on its own. So Ctrl-Z is as good as it gets.
There was some research on checkpoint/restore for Linux back in the 2.2 and 2.4 days, but it never made it past the prototype stage. It is possible (with the caveats described in the other answers) for certain values of possible: if you can write a kernel module to do it, it is possible. But for the common value of possible (can I do it from the shell on a commercial Linux distribution), it is not yet possible.
There's Ctrl+Z in Linux, but I'm not sure it offers the features you specified. I suspect you asked this question because it doesn't.
