JFFS2 filesystem corrupts immediately (Magic bitmask 0x1985 not found errors) - linux

I have created a root filesystem with Buildroot that uses squashfs. It works fine, and now I would like to create an overlayfs that would hold the /home and /etc directories.
For this purpose, I wanted to create a simple JFFS2 filesystem with a couple of files:
jlumme#simppa:~/projects/jffs2_home$ ls -la
total 20
drwxrwxr-x 4 jlumme jlumme 4096 Apr 21 16:21 .
drwxrwxr-x 6 jlumme jlumme 4096 Apr 21 16:21 ..
drwxrwxr-x 2 jlumme jlumme 4096 Apr 21 13:45 default
drwxrwxr-x 2 jlumme jlumme 4096 Apr 21 13:45 ftp
-rw-rw-r-- 1 jlumme jlumme 24 Apr 21 15:34 test.txt
The flash chip I use is an SST25VF064C, so I believe its erase block size is 64 KB, and thus I create a filesystem image from that folder:
mkfs.jffs2 -r jffs2_home/ -e 64 -o home.jffs2
$ ls -la
-rw-r--r-- 1 jlumme jlumme 496 Apr 21 15:42 home.jffs2
(Surprisingly, if I set -e 32, or even -e 4, the resulting binary image doesn't change at all.)
Nevertheless, moving on, I have aligned the mtdblock that contains /home to 64 KB, and my flash layout looks like this:
uboot/<0x00000000 0x40000>
kernel/<0x00040000 0x3D9000>
dtb/<0x00419000 0x10000>
rootfs/<0x00429000 0x1F7000>
home/<0x00620000 0x1E0000>
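As a quick sanity check of that alignment, the home partition's offset 0x00620000 is an exact multiple of 0x10000 (64 KiB); for example, in a shell:
$ printf '%d\n' $((0x00620000 % 0x10000))
0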
On my board, I can mount mtdblock4 fine, and I can read the file contents properly. However, if I modify the file and try to save it, vi complains:
[ 77.030000] jffs2: Node totlen on flash (0xffffffff) != totlen from node ref (0x00000044)
Now, if I unmount the filesystem, and remount it, I start getting complaints immediately:
# mount -t jffs2 /dev/mtdblock4 /home/
[ 99.740000] jffs2: jffs2_scan_eraseblock(): Magic bitmask 0x1985 not found at 0x001d4070: 0xff0a instead
[ 99.760000] jffs2: Empty flash at 0x001d4074 ends at 0x001d412c
[ 99.770000] jffs2: jffs2_scan_eraseblock(): Magic bitmask 0x1985 not found at 0x001d412c: 0xffff instead
[ 99.790000] jffs2: Empty flash at 0x001d4130 ends at 0x001d4194
[ 99.790000] jffs2: jffs2_scan_eraseblock(): Magic bitmask 0x1985 not found at 0x001d4194: 0xff0a instead
I suppose my filesystem is now corrupted, and I don't really understand the reason for it.
Any ideas where I am going wrong with this? Thanks for all suggestions.

This is what I did to solve the issue.
- Updated to newer MTD drivers from http://www.linux-mtd.infradead.org/ (they included new code for the SST25VF064C chip).
- Made sure the area reserved for JFFS2 was initialized to 0xFF (see the sketch below).
- (Possibly optional) Specified the creation of the JFFS2 file system more precisely:
mkfs.jffs2 -e 64 -l -p -s 4096 -r jffs2_home/ -o home.jffs2
With these changes the file system now reads and writes as expected.
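A minimal sketch of the 0xFF initialization step, assuming the home partition shows up as /dev/mtd4 on the target and that the mtd-utils tools are available (erasing NOR flash leaves the erased blocks filled with 0xFF):
# flash_erase /dev/mtd4 0 0        (erase the whole partition: start offset 0, block count 0 = to the end)
# flashcp -v home.jffs2 /dev/mtd4  (optionally write the freshly built image onto the erased partition)
After that, mounting is done against the corresponding block device, as in the question (mount -t jffs2 /dev/mtdblock4 /home/).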

Related

Two identical NFS shares, but only one of the two gives Stale file handle errors

I have a Linux (raspbian) server:
$ uname -a
Linux hester 4.19.97-v7l+ #1294 SMP Thu Jan 30 13:21:14 GMT 2020 armv7l GNU/Linux
With two directories that have the same user/group/permissions:
$ ls -ld /mnt/storage/gitea/ /mnt/storage/hester/
drwxr-xr-x 2 nobody nogroup 26 Mar 2 10:20 /mnt/storage/gitea/
drwxr-xr-x 3 nobody nogroup 21 Feb 21 11:26 /mnt/storage/hester/
These two directories are exported with the same parameters in the exports file:
$ cat /etc/exports
/mnt/storage/hester 192.168.1.15(rw,sync,no_subtree_check)
/mnt/storage/gitea 192.168.1.15(rw,sync,no_subtree_check)
On another machine (the 192.168.1.15 mentioned in the exports file) I mount both successfully:
$ mount /mnt/storage/gitea/
$ echo $?
0
$ mount /mnt/storage/hester/
$ echo $?
0
But now weird things happen:
$ ls -l /mnt/storage/
ls: cannot access '/mnt/storage/gitea': Stale file handle
total 0
d????????? ? ? ? ? ? gitea
drwxr-xr-x 3 nobody nogroup 21 Feb 21 11:26 hester
I really can't figure out what the source of the error is, and above all where I could look for a difference between the two.
I'm open to suggestions for further investigation or answers to my doubts. Thanks in advance for any useful input!
I finally found the solution, which was to explicitly add an fsid option in exports:
$ cat /etc/exports
/mnt/storage/hester 192.168.1.15(rw,sync,fsid=20,no_subtree_check)
/mnt/storage/gitea 192.168.1.15(rw,sync,fsid=21,no_subtree_check)
I'm not entirely sure why this works. From the man page I gather that "NFS needs to be able to identify each filesystem that it exports. Normally it will use a UUID for the filesystem (if the filesystem has such a thing) or the device number of the device holding the filesystem (if the filesystem is stored on the device)."
Both these export points are on the same filesystem, so according to the man page they should get the same fsid; but that apparently makes NFS treat them as the same export, so I take it that each export needs its own distinct fsid.
One more note: /mnt/storage is an XFS filesystem over RAID3, which could also have confused NFS about the UUIDs of the devices.
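A small follow-up sketch, assuming the usual NFS server tooling is in place: after editing /etc/exports you can re-export and inspect the effective options (including the fsid) without restarting the server:
$ sudo exportfs -ra    # re-read /etc/exports and apply it
$ sudo exportfs -v     # list active exports with their effective options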

Programmatically create a btrfs file system whose root directory has a specific owner

Background
I have a test script that creates and destroys file systems on the fly, used in a suite of performance tests.
To avoid running the script as root, I have a disk device /dev/testdisk that is owned by a specific user testuser, along with a suitable entry in /etc/fstab:
$ ls -l /dev/testdisk
crw-rw---- 1 testuser testuser 21, 1 Jun 25 12:34 /dev/testdisk
$ grep testdisk /etc/fstab
/dev/testdisk /mnt/testdisk auto noauto,user,rw 0 0
This allows the disk to be mounted and unmounted by a normal user.
Question
I'd like my script (which runs as testuser) to programmatically create a btrfs file system on /dev/testdisk such that the root directory is owned by testuser:
$ mount /dev/testdisk /mnt/testdisk
$ ls -la /mnt/testdisk
total 24
drwxr-xr-x 3 testuser testuser 4096 Jun 25 15:15 .
drwxr-xr-x 3 root root 4096 Jun 23 17:41 ..
drwx------ 2 root root 16384 Jun 25 15:15 lost+found
Can this be done without running the script as root, and without resorting to privilege escalation (use of sudo) within the script?
Comparison to other file systems
With ext{2,3,4} it's possible to create a filesystem whose root directory is owned by the current user, with the following command:
mkfs.ext{2,3,4} -F -E root_owner /dev/testdisk
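For what it's worth, mke2fs also accepts an explicit owner for that option, so the same effect can be spelled out as, for example:
mkfs.ext4 -F -E root_owner="$(id -u):$(id -g)" /dev/testdisk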
Workarounds I'd like to avoid (if possible)
I'm aware that I can use the btrfs-convert tool to convert an existing (possibly empty) ext{2,3,4} file system to btrfs format. I could use this workaround in my script (by first creating an ext4 filesystem and then immediately converting it to btrfs) but I'd rather avoid it if there's a way to create the btrfs file system directly.
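For completeness, a rough sketch of that workaround, assuming btrfs-progs is installed and that btrfs-convert preserves the root directory's ownership (which is the point of creating the ext4 filesystem first):
$ mkfs.ext4 -F -E root_owner /dev/testdisk    # root directory owned by the invoking user
$ btrfs-convert /dev/testdisk                 # convert the ext4 filesystem in place to btrfs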

Can't expose a fuse based volume to a Docker container

I'm trying to provide my Docker container with an encrypted filesystem volume for internal use.
The idea is that the container will write to the volume as usual, but in fact the host will be encrypting the data before writing it to the filesystem.
I'm trying to use EncFS; it works well on the host, e.g.:
encfs /encrypted /visible
I can write files to /visible, and those get encrypted.
However, when trying to run a container with /visible as the volume, e.g.:
docker run -i -t --privileged -v /visible:/myvolume imagename bash
I do get a volume in the container, but it's on the original /encrypted folder, not going through the EncFS. If I unmount the EncFS from /visible, I can see the files written by the container. Needless to say /encrypted is empty.
Is there a way to have docker mount the volume through EncFS, and not write directly to the folder?
In contrast, docker works fine when I use an NFS mount as a volume. It writes to the network device, and not to the local folder on which I mounted the device.
Thanks
I am unable to duplicate your problem locally. If I try to expose an encfs filesystem as a Docker volume, I get an error trying to start the container:
FATA[0003] Error response from daemon: Cannot start container <cid>:
setup mount namespace stat /visible: permission denied
So it's possible you have something different going on. In any case, this is what solved my problem:
By default, FUSE only permits the user who mounted a filesystem to have access to that filesystem. When you are running a Docker container, that container is initially running as root.
You can use the allow_root or allow_other mount options when you mount the FUSE filesystem. For example:
$ encfs -o allow_root /encrypted /other
Here, allow_root will permit the root user to have access to the mountpoint, while allow_other will permit anyone to have access to the mountpoint (provided that the Unix permissions on the directory allow them access).
If I mount my encfs filesystem using allow_root, I can then expose that filesystem as a Docker volume and the contents of that filesystem are correctly visible from inside the container.
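One caveat, which may vary by distribution: when the FUSE mount is performed by a non-root user, fusermount typically refuses allow_other (and allow_root) unless user_allow_other is enabled in /etc/fuse.conf, i.e. the file (edited as root) contains the line:
user_allow_other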
This is definitely because you started the docker daemon before the host mounted the mountpoint. In this case the inode for the directory name is still pointing at the host's local disk:
ls -i /mounts/
1048579 s3-data-mnt
then if you mount using a fuse daemon like s3fs:
/usr/local/bin/s3fs -o rw -o allow_other -o iam_role=ecsInstanceRole /mounts/s3-data-mnt
ls -i
1 s3-data-mnt
My guess is that docker does some bootstrap caching of the directory names to inodes (someone who has more knowledge of this than I do can fill in this blank).
Your comment is correct. If you simply restart docker after the mounting has finished, your volume will be correctly shared from the host to your containers. (Or you can simply delay starting docker until after all your mounts have finished mounting.)
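A minimal sketch of that ordering, assuming a systemd-based host (use service docker restart or your init system's equivalent otherwise):
$ encfs -o allow_root /encrypted /visible    # perform the FUSE mount first
$ sudo systemctl restart docker              # then restart the Docker daemon
$ docker run -i -t -v /visible:/myvolume imagename bash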
What is interesting (but makes complete sense to me now) is that upon exiting the container and un-mounting the mountpoint on the host, all of my writes from within the container to the shared volume magically appeared (they were being stored at the inode on the host machine's local disk):
[root@host s3-data-mnt]# echo foo > bar
[root@host s3-data-mnt]# ls /mounts/s3-data-mnt
total 6
1 drwxrwxrwx 1 root root 0 Jan 1 1970 .
4 dr-xr-xr-x 28 root root 4096 Sep 16 17:06 ..
1 -rw-r--r-- 1 root root 4 Sep 16 17:11 bar
[root@host s3-data-mnt]# docker run -ti -v /mounts/s3-data-mnt:/s3-data busybox /bin/bash
root@5592454f9f4d:/mounts/s3-data# ls -als
total 8
4 drwxr-xr-x 3 root root 4096 Sep 16 16:05 .
4 drwxr-xr-x 12 root root 4096 Sep 16 16:45 ..
root@5592454f9f4d:/s3-data# echo baz > beef
root@5592454f9f4d:/s3-data# ls -als
total 9
4 drwxr-xr-x 3 root root 4096 Sep 16 16:05 .
4 drwxr-xr-x 12 root root 4096 Sep 16 16:45 ..
1 -rw-r--r-- 1 root root 4 Sep 16 17:11 beef
root@5592454f9f4d:/s3-data# exit
exit
[root@host s3-data-mnt]# ls /mounts/s3-data-mnt
total 6
1 drwxrwxrwx 1 root root 0 Jan 1 1970 .
4 dr-xr-xr-x 28 root root 4096 Sep 16 17:06 ..
1 -rw-r--r-- 1 root root 4 Sep 16 17:11 bar
[root@host /]# umount -l s3-data-mnt
[root@host /]# ls -als
[root@ip-10-0-3-233 /]# ls -als /s3-stn-jira-data-mnt/
total 8
4 drwxr-xr-x 2 root root 4096 Sep 16 17:28 .
4 dr-xr-xr-x 28 root root 4096 Sep 16 17:06 ..
1 -rw-r--r-- 1 root root 4 Sep 16 17:11 bar
You might be able to work around this by wrapping the mount call in nsenter to mount it in the same Linux mount namespace as the docker daemon, e.g.:
nsenter -t "$PID_OF_DOCKER_DAEMON" encfs ...
The question is whether this approach will survive a daemon restart itself. ;-)
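A minimal sketch of that idea, assuming the daemon process is called dockerd and that entering only its mount namespace (-m/--mount) is enough:
$ sudo nsenter --target "$(pidof dockerd)" --mount -- encfs -o allow_root /encrypted /visible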

size vs ls -la vs du -h: which one is the correct size?

I was compiling a custom kernel, and I wanted to test the size of the image file.
These are the results:
ls -la | grep vmlinux
-rwxr-xr-x 1 root root 8167158 May 21 12:14 vmlinux
du -h vmlinux
3.8M vmlinux
size vmlinux
text data bss dec hex filename
2221248 676148 544768 3442164 3485f4 vmlinux
Since all of them show different sizes, which one is closest to the actual image size?
Why are they different?
They are all correct, they just show different sizes.
ls shows the size of the file (when you open and read it, that's how many bytes you will get)
du shows the actual disk usage, which can be smaller than the file size due to holes
size shows the size of the runtime image of an object/executable, which is not directly related to the size of the file (bss uses no bytes in the file no matter how large it is, the file may contain debugging information that is not part of the runtime image, etc.)
If you want to know how much RAM/ROM an executable will take excluding dynamic memory allocation, size gives you the information you need.
Two distinctions need to be understood:
1. runtime vs. store time (this is why size differs)
2. file vs. directory depth (this is why du differs)
Look at the example below:
[root@localhost test]# ls -l
total 36
-rw-r--r-- 1 root root 712 May 12 19:50 a.c
-rw-r--r-- 1 root root 3561 May 12 19:42 a.h
-rwxr-xr-x 1 root root 71624 May 12 19:50 a.out
-rw-r--r-- 1 root root 1403 May 8 00:15 b.c
-rw-r--r-- 1 root root 1403 May 8 00:15 c.c
[root@localhost test]# du -abch --max-depth=1
1.4K ./b.c
1.4K ./c.c
3.5K ./a.h
712 ./a.c
70K ./a.out
81K .
81K total
[root@localhost test]# ls -l
total 36
-rw-r--r-- 1 root root 712 May 12 19:50 a.c
-rw-r--r-- 1 root root 3561 May 12 19:42 a.h
-rwxr-xr-x 1 root root 71624 May 12 19:50 a.out
-rw-r--r-- 1 root root 1403 May 8 00:15 b.c
-rw-r--r-- 1 root root 1403 May 8 00:15 c.c
[root@localhost test]# size a.out
text data bss dec hex filename
3655 640 16 4311 10d7 a.out
If you run size on something that is not an object file or executable, it will report an error.
Empirically, differences happen most often for sparse files and for compressed files, and they can go in both directions.
du < ls
Sparse files record their nominal size in their metadata; ls reads and reports that size, while du only counts the blocks actually allocated on disk. For example:
truncate -s 1m test.dat
creates a sparse file consisting entirely of nulls without disk usage, ie. du shows 0 and ls shows 1M.
du > ls
On the other hand, du can report more than ls when a file spreads over many blocks on disk that are not all completely filled: its byte size (measured by ls) is then smaller than the space taken by the occupied blocks (measured by du). I observed this rather prominently for some Python pickle files, for example.
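A quick way to compare the two views for any given file, assuming GNU coreutils (somefile is whatever file you want to inspect; %s is the byte size, %b the number of allocated blocks, %B the size of each such block):
$ du -h --apparent-size somefile    # byte size, same basis as ls
$ du -h somefile                    # space allocated on disk
$ stat -c '%s bytes, %b blocks of %B bytes' somefile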

Linux: how do I know which module exports a device node?

If I have a /dev device node and its major/minor numbers, how do I find the name of the kernel module that exported this node?
Short answer :
cd /sys/dev/char/major:minor/device/driver/
ls -al | grep module
Each device is generally associated with a driver, and this is what the "device model" is all about. The sysfs filesystem contains a representation of these devices and their associated drivers. Unfortunately, it seems not all sysfs configurations have a representation of the device nodes, so this applies only if your /sys directory contains a dev directory.
Let's take an example, with /dev/video0
On my board, the output of ls -al /dev/video0 is:
crw------- 1 root root 81, 0 Jan 1 00:00 video0
So major number is 81 and minor number is 0.
Let's dive into sysfs:
# cd /sys
# ls
block class devices fs module
bus dev firmware kernel
The /sys/dev directory contains entries for the char and block devices of the system:
# cd dev
# cd char
# ls
10:61 13:64 1:3 1:8 249:0 252:0 29:0 4:65 81:0 89:1
10:62 1:1 1:5 1:9 250:0 253:0 29:1 5:0 81:2
10:63 1:11 1:7 248:0 251:0 254:0 4:64 5:1 81:3
What the hell are these links with strange names?
Remember the major and minor numbers, 81 and 0?
Let's follow that link:
# cd 81:0    (i.e. major:minor)
#ls -al
drwxr-xr-x 2 root root 0 Jan 1 01:56 .
drwxr-xr-x 3 root root 0 Jan 1 01:56 ..
-r--r--r-- 1 root root 4096 Jan 1 01:56 dev
lrwxrwxrwx 1 root root 0 Jan 1 01:56 device -> ../../../vpfe-capture
-r--r--r-- 1 root root 4096 Jan 1 01:56 index
-r--r--r-- 1 root root 4096 Jan 1 01:56 name
lrwxrwxrwx 1 root root 0 Jan 1 01:56 subsystem -> ../../../../../class/video4linux
-rw-r--r-- 1 root root 4096 Jan 1 01:56 uevent
Now we can see that this device node, which is how the device is presented to userspace, is associated with a kernel device. This association is made through a link. If we follow this link, we end up in a directory with a driver link. The name of the driver is usually the name of the module:
# ls -al
drwxr-xr-x 3 root root 0 Jan 1 01:56 .
drwxr-xr-x 25 root root 0 Jan 1 00:00 ..
lrwxrwxrwx 1 root root 0 Jan 1 01:56 driver -> ../../../bus/platform/drivers/vpfe-capture
-r--r--r-- 1 root root 4096 Jan 1 01:56 modalias
lrwxrwxrwx 1 root root 0 Jan 1 01:56 subsystem -> ../../../bus/platform
-rw-r--r-- 1 root root 4096 Jan 1 01:56 uevent
drwxr-xr-x 3 root root 0 Jan 1 01:56 video4linux
So here the name of the module is vpfe_capture
The answer to this question is most likely different based on a number of factors. For example, if you're running udev, devfs, pre-devfs, etc.
If you're using Ubuntu (or another equally modern distro) the udevadm command might be what you want.
% udevadm info -q path -n /dev/cdrom
/devices/pci0000:00/0000:00:1f.1/host3/target3:0:0/3:0:0:0/block/sr0
So, my /dev/cdrom is provided by the sr driver, which resides in the sr_mod kernel module. I don't know of a command that takes /dev/cdrom as an argument and prints sr_mod as output.
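As a rough sketch of such a command, assuming the sysfs layout walked through in the previous answer (a driver built as a module exposes a module symlink in its driver directory; devpath here is just a helper variable), something like this may work:
% devpath=$(udevadm info -q path -n /dev/cdrom)
% basename "$(readlink -f /sys$devpath/device/driver/module)"
sr_mod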
