Why does my MTD driver become a normal file? - linux

I am using phram and ramoops to store the latest system log in a reserved memory area, so that if my machine crashes I can dump the panic log after reboot. The MTD driver phram and the ramoops module automatically record the system log to memory:
/# insmod /lib/modules/phram.ko phram=phram-oops,<addr>,<len>
/# ls -l /dev/mtdchar/phram-oops
crw-r--r-- 1 root root 90, 24 Jul 20 16:34 phram-oops
This worked well until I recently reused the driver to also back up the boot loader log. During boot, phram-oops backs up the U-Boot log to one reserved memory area. Once the Linux shell is up, I dump the U-Boot log, clear phram-oops with dd if=/dev/zero bs=65536 count=1 of=/dev/mtdchar/phram-oops, then rmmod phram and insmod phram with a new memory area for the panic log, and finally dump the system logs of the last boot. Up to this step, /dev/mtdchar/phram-oops still works fine:
/# ls -l /dev/mtdchar/phram-oops
crw-r--r-- 1 root root 90, 24 Jul 20 16:34 /dev/mtdchar/phram-oops
However, after running dd if=/dev/zero bs=65536 count=1 of=/dev/mtdchar/phram-oops again to clear the memory, /dev/mtdchar/phram-oops becomes a regular file!
/# ls -l /dev/mtdchar/phram-oops
-rw-r--r-- 1 root root 65536 Jul 20 16:34 /dev/mtdchar/phram-oops
As a result, the previous logs remain in memory and cannot be cleared. Any idea how a device node can turn into a regular file, and how to fix it?

It seems this problem was caused by hotplug: some delay is required after rmmod phram and before insmod phram with a new address. Otherwise the driver is very likely not loaded correctly, the device node is missing, and dd then creates the path as a regular file.
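A fixed delay works, but a more defensive variant is to poll until the path really is a character device and to refuse to write otherwise. The following is only a minimal sketch of that idea, not the poster's actual fix: the helper names, the timeout values, and the use of Python instead of dd are all assumptions made for illustration.

```python
import os
import stat
import time


def is_char_device(path):
    """Return True only if path exists and is a character device node."""
    try:
        return stat.S_ISCHR(os.stat(path).st_mode)
    except FileNotFoundError:
        return False


def wait_for_char_device(path, timeout=5.0, interval=0.1):
    """Poll until path is a character device or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        if is_char_device(path):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


def clear_device(path, length=65536, timeout=5.0):
    """Write `length` zero bytes to the device, never creating a regular file."""
    if not wait_for_char_device(path, timeout=timeout):
        raise RuntimeError(f"{path} is not a character device; not writing")
    # Opening without O_CREAT fails on a missing node instead of creating a file.
    fd = os.open(path, os.O_WRONLY)
    try:
        zeros = bytes(length)
        written = 0
        while written < length:
            written += os.write(fd, zeros[written:])
    finally:
        os.close(fd)
```

A call such as clear_device("/dev/mtdchar/phram-oops") after the insmod would then fail loudly when the node has not appeared, rather than silently leaving a 64 KiB regular file behind.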

Related

Bash on Windows 10, no loop devices

I've just tried Bash on my Windows 10 PC, and it works fine. However, I found that ls /dev/ shows no loop devices, and modprobe loop gives an error.
Does it mean this Bash doesn't support loop devices at all or is there a solution for mounting an image as a loop device?
Windows Subsystem for Linux 1 (WSL, formerly known as Bash on Ubuntu on Windows) did not support loop devices. There was a feature request and an issue about it on Microsoft's Git repo.
WSL 2, however, does support loop devices.
$ uname -a
Linux Blade 5.10.102.1-microsoft-standard-WSL2 #1 SMP Wed Mar 2 00:30:59 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
$ fallocate -l 1G test.img
$ mkfs.ext3 test.img
mke2fs 1.45.5 (07-Jan-2020)
Discarding device blocks: done
Creating filesystem with 262144 4k blocks and 65536 inodes
Filesystem UUID: 549cca4d-a65f-4f4f-8428-e324feaed3d0
Superblock backups stored on blocks:
32768, 98304, 163840, 229376
Allocating group tables: done
Writing inode tables: done
Creating journal (8192 blocks): done
Writing superblocks and filesystem accounting information: done
$ sudo mount -o loop test.img /media/
$ ls /media/
lost+found
Do you know that Bash is just a shell (something that reads your commands, executes them, pipes between them and permits you to write scripts) and is not an operating system?
Loop devices are part of the Linux kernel, and they simply don't exist in the Windows kernel.

Busybox SUID on NFS rootfs

I am building a Linux system from scratch for a BeagleBone board. I have compiled the vanilla kernel and built a basic root file system with busybox. The system is booted with U-boot, while the rootfs is located on a Linux PC and exported through NFS:
/path/to/rootfs 10.42.0.17(rw,wdelay,no_root_squash,no_subtree_check,sec=sys,rw,secure,no_root_squash,no_all_squash)
The U-boot bootargs are:
bootargs console=ttyO0,115200n8 root=/dev/nfs rw nfsroot=${serverip}:/path/to/rootfs,v3,tcp ip=dhcp
I've encountered a problem when trying to get su working for non-root users. To work around it, people on the internet suggest setting the suid bit on the busybox binary.
After doing so:
$ sudo chmod u+s busybox
and verifying:
$ ls -la
...
-rwsr-xr-x 1 myuser myuser 1882976 Jan 13 21:47 busybox
...
$ stat -c "%a %n" busybox
4755 busybox
Something went wrong. The kernel boots and all of the usual messages are displayed, but it gets stuck at the end and no login prompt appears. Here are the last few lines of the boot sequence:
[ 3.776185] IP-Config: Complete:
[ 3.779656] device=eth0, hwaddr=c8:a0:30:c5:80:e9, ipaddr=10.42.0.17, mask=255.255.255.0, gw=10.42.0.1
[ 3.789877] host=10.42.0.17, domain=, nis-domain=(none)
[ 3.795822] bootserver=10.42.0.1, rootserver=10.42.0.1, rootpath=
[ 3.802492] nameserver0=10.42.0.1
[ 3.871575] VFS: Mounted root (nfs filesystem) on device 0:15.
[ 3.879903] devtmpfs: mounted
[ 3.883713] Freeing unused kernel memory: 380K (c07ef000 - c084e000)
If I remove the flag, things return to normal:
....
[ 3.862291] Freeing unused kernel memory: 380K (c07ef000 - c084e000)
10.42.0.17 login:
If I set the flag from within a running shell on the BeagleBone board itself, the shell stops responding right after the chmod is performed.
I suspect it has something to do with the way NFS exports the rootfs, but that is only a guess, so a qualified explanation and a possible solution would be helpful.
After some research I can answer my own question. The answer is very simple: for the above to work, the busybox binary must be owned by root:root. A setuid binary runs with the effective UID of the file's owner, which here was myuser rather than root. The simplest solution is to change the ownership.
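The condition described in this answer can be checked mechanically. The sketch below is illustrative only: the function names are hypothetical, and it merely inspects the owner UID and the setuid bit that stat reports.

```python
import os
import stat


def setuid_root_problems(uid, mode):
    """Return a list of reasons why a binary would not run as setuid root."""
    problems = []
    if uid != 0:
        problems.append(f"owner uid is {uid}, expected 0 (root)")
    if not mode & stat.S_ISUID:
        problems.append("setuid bit is not set")
    return problems


def check_binary(path):
    """Apply the check to a file on disk, e.g. the busybox binary in the rootfs."""
    st = os.stat(path)
    return setuid_root_problems(st.st_uid, st.st_mode)
```

For the listing shown in the question (owner myuser, mode 4755), this would report only the ownership problem. On Linux, changing the owner of an executable typically clears the setuid bit, so it is worth running chmod u+s again after the chown and re-checking.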

Postgresql 'main/pg_notify/0000': Stale NFS file handle

I have a Debian Wheezy computer running a Postgresql Server and NO NFS filesystems.
After rebooting the computer, the following error has appeared:
ls: cannot access 0000: Stale NFS file handle
516439 drwx------ 2 postgres postgres 8 Nov 12 20:25 .
516480 drwx------ 3 postgres postgres 4096 Nov 17 17:08 ..
? ?????????? ? ? ? ? ? 0000
The "/var/lib/postgresql/9.1/main/pg_notify/0000" file is stale and I cannot remove it or do anything at all with it. To get rid of the file, I tried the following:
Rebooting the computer in order to unmount the filesystem (as suggested in several forums) did not work.
Removing postgresql (apt-get --purge) did not help either.
Trying to remove the file manually does not work either (Stale NFS file handle).
This directory is on a JFS partition on top of an encrypted volume managed by LVM.
The output for the fsck:
fsck.jfs version 1.1.15, 04-Mar-2011
processing started: 11/17/2014 20:22:30
Using default parameter: -p
The current device is: /
ujfs_rw_diskblocks: read 0 of 4096 bytes at offset 32768
ujfs_rw_diskblocks: read 0 of 4096 bytes at offset 61440
Superblock is corrupt and cannot be repaired
since both primary and secondary copies are corrupt.
Output for ls -l:
ls -l /var/lib/postgresql/9.1/main/pg_notify/0000
I would like to know...
Why do I have a problem with an NFS handle on a non-NFS partition?
Is there any way I can get rid of that file (workarounds are more than welcome as well)?

Linux 3.2 /dev/shm performance variable?

I'm using /dev/shm tmpfs for writing lots of temporary files.
Each second, a set of 8-10 files arrives to be written, with file sizes ranging from 70 kB to 750 kB. The sets are all approximately the same size.
The code that writes these files is Python calling a library that uses fwrite() to do the writing.
When the application starts, individual writes take from 30 ms to over 400 ms. It is usually the largest (700 kB) files that take 400 ms, but this varies.
Here is an example of a set of 8, given in ms: 42,30,320,76,66,72,102,440.
It seems that the standard deviation of writing to /dev/shm is quite large.
After the application runs for a couple of minutes, the time to write plummets and the variance is much smaller (e.g. 7,8,15,23,24,32,51,71) - this behavior is stable and I have run the application for several hours.
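To put numbers on that difference, the two sample sets quoted above can be summarized directly. This is only a sketch: the summarize and timed_write helpers are hypothetical names, and timed_write is just one assumed way to collect further samples, not the poster's measurement code.

```python
import statistics
import time

# Write times in ms, copied from the two sets quoted in the question.
startup_ms = [42, 30, 320, 76, 66, 72, 102, 440]
steady_ms = [7, 8, 15, 23, 24, 32, 51, 71]


def summarize(samples):
    """Return mean, sample standard deviation and maximum of a list of timings."""
    return {
        "mean": statistics.mean(samples),
        "stdev": statistics.stdev(samples),
        "max": max(samples),
    }


def timed_write(path, data):
    """Write data to path and return the elapsed wall-clock time in ms."""
    start = time.perf_counter()
    with open(path, "wb") as f:
        f.write(data)
        f.flush()
    return (time.perf_counter() - start) * 1000.0
```

The startup set has a mean of 143.5 ms against 28.875 ms for the steady-state set, and the spread shrinks along with it.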
There are no other applications of consequence running concurrently and there is plenty of room on /dev/shm.
It seems that the Linux kernel is dynamically adjusting to the application's use of /dev/shm. My question is: is my suspicion about the Linux kernel correct? If so, is there any way to configure or notify the kernel ahead of time so that writes to /dev/shm are fast from the moment my application starts?
I'm using Ubuntu 12.04 LTS
$ uname -a
Linux devsb02 3.2.0-24-generic #39-Ubuntu SMP Mon May 21 16:52:17 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux
$ ls -l /dev/shm
lrwxrwxrwx 1 root root 8 Aug 23 12:19 /dev/shm -> /run/shm
$ mount
....
none on /run/shm type tmpfs (rw,nosuid,nodev)

Who is refreshing hardware watchdog in Linux?

I have an AT91SAM9G20 processor running a 2.6 kernel. The watchdog is enabled at bootstrap level and configured for 16 seconds. The watchdog mode register can be configured only once.
When code hangs in the bootstrap, bootloader, or kernel, the board reboots. But once the kernel is up, even though the watchdog is not refreshed by any of our applications, the board is reset not after 16 seconds but after 15 minutes.
Who is refreshing the watchdog?
In our case, the watchdog should be controlled by applications, so that the board resets if our application hangs.
These are the running processes:
1 root init
2 root [kthreadd]
3 root [ksoftirqd/0]
4 root [watchdog/0]
5 root [events/0]
6 root [khelper]
63 root [kblockd/0]
72 root [ksuspend_usbd]
78 root [khubd]
85 root [kmmcd]
107 root [pdflush]
108 root [pdflush]
109 root [kswapd0]
110 root [aio/0]
740 root [mtdblockd]
828 root [rpciod/0]
982 root [jffs2_gcd_mtd10]
1003 root /sbin/udevd -d
1145 daemon portmap
1158 dbus dbus-daemon --system
1178 root /usr/sbin/ifplugd -i eth0 -fwI -u0 -d5 -l -q
1190 root /usr/sbin/ifplugd -i eth1 -fwI -u0 -d5 -l -q
1221 default avahi-daemon: running [SP14.local]
1226 root /usr/sbin/dropbear
1246 root /root/bin/host_app
1254 root /root/bin/mini_httpd -c *.cgi -d /root/bin -u root -E /root/bin/
1256 root -sh
1257 root /sbin/syslogd -n -m 0
1258 root /sbin/klogd -n
1259 root /usr/bin/tail -f /var/log/messages
1265 root ps -e
We are using the watchdog for soft lockups available in kernel-2.6.25-ts.at91sam9g20/kernel/softlockup.c
If you enabled the watchdog driver in your kernel, the driver sets up a kernel timer that is in charge of resetting the watchdog. The corresponding code is linux/drivers/watchdog/at91sam9_wdt.c. It works like this:
If no application opens the /dev/watchdog file, the kernel takes care of resetting the watchdog. Since this is a timer, it does not appear as a dedicated kernel thread but is handled by the soft IRQ thread. If an application opens the file, it becomes responsible for the watchdog and can reset it by writing to the file, as described in the documentation linked in Richard's post.
Is the watchdog driver configured in your kernel?
If not, you should configure it and see whether the reset still happens. If it does, the reset likely comes from somewhere else.
If your kernel is too old to have a proper watchdog driver (it is not present in 2.6.25), you should backport it from 2.6.28. Alternatively, try disabling the watchdog in your bootloader and see whether the reset still occurs.
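The application side of this scheme can be sketched as follows. It is a minimal illustration under stated assumptions, not the driver's or the poster's code: run_heartbeat, the healthy callback, and the 5 second interval are hypothetical, and what happens when the device is closed depends on the driver and its configuration, so the sketch leaves the device open.

```python
import time


def run_heartbeat(wd, healthy, interval=5.0, max_beats=None):
    """Write one byte to an open watchdog device per interval while healthy() is true.

    Returns the number of heartbeats written. The device is deliberately left
    open on return: the caller decides whether to keep holding it (so that the
    hardware timeout can expire) or to close it.
    """
    beats = 0
    while max_beats is None or beats < max_beats:
        if not healthy():
            break  # stop feeding; nothing else refreshes the watchdog for us
        wd.write(b"\0")  # any write counts as a keepalive
        wd.flush()
        beats += 1
        time.sleep(interval)
    return beats


# Intended use on the target (left as a comment so that importing this
# sketch never opens the real device):
#
#   wd = open("/dev/watchdog", "wb", buffering=0)
#   run_heartbeat(wd, healthy=my_application_is_alive, interval=5.0)
```

The interval has to stay well below the hardware timeout (16 seconds in the question), and the healthy callback is where the application's own liveness check belongs.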
In July 2016, commit 3fbfe92647 (watchdog: change watchdog_need_worker logic), which changed watchdog_dev.c in the 4.7 kernel, enabled the same behavior as in shodanex's answer for all watchdog timer drivers. This doesn't seem to be documented anywhere other than this thread and the source code.
/*
 * A worker to generate heartbeat requests is needed if all of the
 * following conditions are true.
 * - Userspace activated the watchdog.
 * - The driver provided a value for the maximum hardware timeout, and
 *   thus is aware that the framework supports generating heartbeat
 *   requests.
 * - Userspace requests a longer timeout than the hardware can handle.
 *
 * Alternatively, if userspace has not opened the watchdog
 * device, we take care of feeding the watchdog if it is
 * running.
 */
return (hm && watchdog_active(wdd) && t > hm) ||
       (t && !watchdog_active(wdd) && watchdog_hw_running(wdd));
This may give you a hint: http://www.mjmwired.net/kernel/Documentation/watchdog/watchdog-api.txt
It makes perfect sense to have a user space daemon handling the watchdog. It probably defaults to a 15 minute timeout.
We had a similar problem with the WDT on an AT91SAM9263. The problem was bit 29 (WDIDLEHLT) of the WDT_MR register (address 0xFFFFFD44). This bit was set to 1, but it should be 0 for our application's needs.
Bit explanation from the datasheet:
• WDIDLEHLT: Watchdog Idle Halt
0: The Watchdog runs when the system is in idle mode.
1: The Watchdog stops when the system is in idle state.
This means the WDT counter does not increment while the kernel is idle, hence the delay of 15 minutes or more until the reset happens.
You can try "dd if=/dev/zero of=/dev/null", which prevents the kernel from entering the idle state; you should then get a reset in 16 seconds (or whatever period you have set in the WDT_MR register).
So the solution is to update the U-Boot code, or whatever other code sets the WDT_MR register. Remember that this register is write-once.
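When reviewing the value that the bootstrap or bootloader writes to WDT_MR, the bit in question can be tested and cleared as below. This is a sketch only: the function names are hypothetical, the example register value is made up for illustration, and only the bit position (29) comes from the datasheet excerpt above.

```python
WDIDLEHLT = 1 << 29  # bit 29 of WDT_MR, per the datasheet excerpt above


def watchdog_stops_in_idle(wdt_mr):
    """Return True if the WDIDLEHLT bit is set in a WDT_MR value."""
    return bool(wdt_mr & WDIDLEHLT)


def with_idle_halt_cleared(wdt_mr):
    """Return the same WDT_MR value with WDIDLEHLT cleared."""
    return wdt_mr & ~WDIDLEHLT


# Hypothetical register value, for illustration only.
example = 0x3FFF2FFF
assert watchdog_stops_in_idle(example)
assert not watchdog_stops_in_idle(with_idle_halt_cleared(example))
```

Because the register is write-once, the cleared value has to be the one written by the first piece of code that touches WDT_MR after reset.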
Wouldn't the kernel be refreshing the watchdog timer? The watchdog is designed to reset the board if the whole system hangs, not just a single application.
