Does Linux need a writeable file system - linux

Does Linux need a writeable file system to function correctly? I'm just running a very simple init programme. Presently I'm not mounting any partitions. The Kernel has mounted the root partition as read-only. Is Linux designed to be able run with just a read-only file system as long as I stick to mallocs, readlines and text to standard out (puts), or does Linux require a writeable file system in-order even to perform standard text input and output?
I ask because I seem to be getting kernel panics and complaints about the stack. I'm not trying to run a useful system at the moment. I already have a useful system on another partition. I'm trying to keep it as simple as possible so as I can fully understand things before adding in an extra layer of complexity.
I'm running a fairly standard x86-64 desktop.

No, writable file system is not required. It is theoretically possible to run GNU/Linux with the only read-only file system.
In practice you probably want to mount /proc, /sys, /dev, possibly /dev/pts to everything work properly. Note that even some bash commands requires writable /tmp. Some other programs - writable /var.
You always can mount /tmp and /var as ramdisk.

Yes and No. No it doesn't need to be writeable if it did almost nothing useful.
Yes, you're running a desktop so it's needed to be writeable.
Many processes actually need a writeable filesystem as many system calls can create files. e.g. Unix Domain Sockets can create files.
Also many applications write into /var, and /tmp
The way to get around this is to mount the filesystem read/only and use a filesystem overlay to overlay an in memory filesystem. That way, the path will be writable but they go to ram and any changes are thrown away on reboot.
See: overlayroot

No it's not required. For example as most distributions have a live version of Linux for booting up for a cd or usb disk with actually using and back end hdd.
Also on normal installations, the root partitions are changed to read-only when there are corruptions on the disk. This way the system still comes up as read-only partition.
You need to capture the vmcore and the stack trace of the panic form the dmesg output to analyse further.

Related

libmount equivalent for FUSE filesystems

What is the libmount equivalent function to mount a FUSE file-system. I understand that FUSE is not a real file system and my strace of mount.fuse shows opening a /dev/fuse file and doing some complicated manipulations.
I tried seeing how the mount.fuse works by reading it's source code but not only it is needlessly complicated by string manipulations in C, it is a GPL program.
My question is, am I missing the obvious API to mount fuse file systems?
The kernel interface for mounting a FUSE filesystem is described in "linux/Documentation/filesystems/fuse.txt" (for example, see here).
In a nutshell, you call mount(2) as you would to mount any filesystem. However, the key difference is that you must provide a mount option fd=n where n is a file descriptor you've obtained by opening /dev/fuse and which will be used by the userspace process implementing the filesystem to respond to kernel requests.
In particular, this means that the mount is actually performed by the user space program that implements the filesystem. Specifically, most FUSE filesystems use libfuse and call the function fuse_main or fuse_session_mount to perform the mount (which eventually call the internal function fuse_mount_sys in mount.c that contains the actual mount(2) system call).
So, if you want to mount a FUSE filesystem programmatically, the correct way to do this is to fork and exec the corresponding FUSE executable (e.g., sshfs) and have it handle the mount on your behalf.
Note that /sbin/mount.fuse doesn't actually mount anything itself. It's just a wrapper to allow you to mount FUSE filesystems via entries in "/etc/fstab" via the mount command line utility or at boot time. That's why you can't find any mounting code there. It mounts FUSE filesystems the same way I described above, by running the FUSE executable for the filesystem in question to perform the actual mount.

Deployment over GPRS to embedded devices

I've got quite a head scratcher here. We have multiple Raspberry Pis on the field hundreds of kilometers apart. We need to be able to safe(ish)ly upgrade them remotely, as the price for local access can cost up to few hundred euros.
The raspis run rasbian, / is on SD-card mounted in RO to prevent corruption when power is cut (usually once/day). The SD cards are cloned from same base image, but contain manually installed packages and modified files that might differ between devices. The raspis all have a USB flash as a more corruption resistant RW-drive and a script to format it on boot in case the drive is corrupted. They call home via GPRS connection with varying reliability.
The requirements for the system are as follows:
Easy versioning of config files, scripts and binaries, at leasts /etc, /root and home preferably Git
Efficient up-/downgrade from any verion to other over GPRS -> transfer file deltas only
Possibility to automatically roll back recently applied patch, if connection is no longer working
Root file system cannot be in RW mode while downloading changes, the changes need to be stored locally before applying to /
The simple approach might be keeping a complete copy of the file system in a remote git repository, generate a diff file between commits, upload the patch to the field and apply it. However, at the the moment the files on different raspis are not identical. This means, at least when installing the system, the files would have to be synchronized through something similar to rsync -a.
The procedure should be along the lines of "save diff between / and ssh folder to a file on the USB stick, mount / RW, apply diff from file, mount / RO". Rsync does the diff-getting and applying simultaneously, so my first question becomes:
1 Does there exist something like rsync that can save the file deltas from local and remote and apply them later?
Also, I have never made a system like this and the drawt is "closest to legit I can come up with". There's a lot of moving parts here and I'm terrified that something I didn't think of beforehand will cause things to go horribly wrong. Rest of my questions are:
Am I way off base here and is there actually a smarter/safe(r) way to do this?
If not, what kind of best practices should I follow and what kind of things to be extremely careful with (to not brick the devices)?
How do I handle things like installing new programs? Bypass packet manager, install in /opt?
How to manage permissions/owners (root+1 user for application logic)? Just run everything as root and hope for the best?
Yes, this is a very broad question. This will not be a direct answer to your questions, but rather provide guidelines for your research.
One means to prevent file system corruption is use an overlay file system (e.g., AUFS, UnionFS) where the root file system is mounted read-only and a tmpfs (RAM based) or flash based read-write is mount "over" the read-only root. This requires your own init scripts including use of the pivot_root command. Since nothing critical is mounted RW, the system robustly handles power outages. The gist is before the pivot_root, the FS looks like
/ read-only root (typically flash)
/rw tmpfs overlay
/aufs AUFS union overlay of /rw over /
after the pivot_root
/ Union overlay (was /aufs
/flash read only root (was /)
Updates to the /flash file system are done by remounting it read-write, doing the update, and remounting read-only. For example,
mount -oremount,rw <flash-device> /flash
cp -p new-some-script /flash/etc/some-script
mount -oremount,ro <flash-device> /flash
You may or may not immediately see the change reflected in /etc depending upon what is in the tmpfs overlay.
You may find yourself making heavy use of the chroot command especially if you decide to use a package manager. A quick sample
mount -t proc none /flash/proc
mount -t sysfs none /flash/sys
mount -o bind /dev /flash/dev
mount -o bind /dev/pts /flash/dev/pts
mount -o bind /rw /flash/rw #
mount -oremount,rw <flash-device> /flash
chroot /flash
# do commands here to install packages, etc
exit # chroot environment
mount -oremount,ro <flash-device> /flash
Learn to use the patch command. There are binary patch commands How do I create binary patches?.
For super recovery when all goes wrong, you need hardware support with watchdog timers and the ability to do fail-safe boot from alternate (secondary) root file system.
Expect to spend significant amount of time and money if you want a bullet-proof product. There are no shortcuts.

Once you mount a file system, how do you use it?

I understand that a file system can be visualized as a "tree" of files and directories. I also understand that "mounting" a file system means to placing or rooting that tree within an existing directory. https://askubuntu.com/questions/20680/what-does-it-mean-to-mount-something
With that said, I have mounted an implementation of python fuse that is similar to this one. When I run my implementation, it runs to the end of the init method and then just shows a blinking cursor. So fuse is getting mounted. I know that fuse is mounted because of what happens when I run $mount
$mount
...
FuseHandler on /home/memsql/mount
So now that I've mounted fuse, how do I access the files.
The linked tutorial says
You will see all files in /your/dir under /mnt/point and be able to
manipulate them exactly as if they were in the original filesystem.
How exactly do you do this? Can somebody show me, syntactically, how to instantiate and query Fuse? How do you perform a read or write? What does code that performs those operations look like?
As l4mpi notes, you’ve now mounted a file system, so it will respond to the standard Unix file system API, and FUSE will handle the file system operations. You can cd to it, ls in it, read or creat files in it, etc.

Unmounting proc file system

As far as I know proc file system is a virtual file system. Is there any way to unmount the proc file system and even if I do that what will be the consequences after that.
You can check (as root) who is using a mounted filesystem like so:
fuser -m /proc
Typically, your box will not be very usable if you kill all the processes using /proc. Otherwise, there is no law saying it has to be mounted, beyond all and sundry developer assuming that it is.
umount will work like on any other file system (same conditions for a filesystem to be unmonted). You can expect a whole lot of this to stop working as soon as you do that though (including very simple utilities like ps).

Linux-Based Firmware, how to implement a good way to update?

I'm developing a linux-based appliance using an alix 2d13.
I've developed a script that takes care of creating an image file, creating the partitions, installing the boot loader (syslinux), the kernel and the initrd and, that takes care to put root filesystem files into the right partition.
Configuration files are on tmpfs filesystem and gets created on system startup by a software that reads an XML file that resides on an own partition.
I'm looking a way to update the filesystem and i've considered two solutions:
the firmware update is a compressed file that could contain kernel, initrd and/or the rootfs partition, in this way, on reboot, initrd will takes care to dd the rootfs image to the right partition;
the firmware update is a compressed file that could contain two tar archives, one for the boot and one for the root filesystem.
Every solution has its own advantages:
- a filesystem image will let me to delete any unused files but needs a lot of time and it will kill the compact flash memory fastly;
- an archive is smaller, needs less time for update, but i'll have the caos on the root filesystem in short time.
An alternative solution could be to put a file list and to put a pre/post update script into the tar archive, so any file that doesn't reside into the file list will be deleted.
What do you think?
I used the following approach. It was somewhat based on the paper "Building Murphy-compatible embedded Linux systems," available here. I used the versions.conf stuff described in that paper, not the cfgsh stuff.
Use a boot kernel whose job is to loop-back mount the "main" root file system. If you need a newer kernel, then kexec into that newer kernel right after you loop-back mount it. I chose to put the boot kernel's complete init in initramfs, along with busybox and kexec (both statically linked), and my init was a simple shell script that I wrote.
One or more "main OS" root file systems exist on an "OS image" file system as disk image files. The boot kernel chooses one of these based on a versions.conf file. I only maintain two main OS image files, the current and fall-back file. If the current one fails (more on failure detection later), then the boot kernel boots the fall-back. If both fail or there is no fall-back, the boot kernel provides a shell.
System config is on a separate partition. This normally isn't upgraded, but there's no reason it couldn't be.
There are four total partitions: boot, OS image, config, and data. The data partition is for user application stuff that is intended for frequent writing. boot is never mounted read/write. OS image is only (re-)mounted read/write during upgrades. config is only mounted read/write when config stuff needs to change (hopefully never). data is always mounted read/write.
The disk image files each contain a full Linux system, including a kernel, init scripts, user programs (e.g. busybox, product applications), and a default config that is copied to the config partition on the first boot. The files are whatever size is necessary to fit everything in them. As long I allowed enough room for growth so that the OS image partition is always big enough to fit three main OS image files (during an upgrade, I don't delete the old fall-back until the new one is extracted), I can allow for the main OS image to grow as needed. These image files are always (loop-back) mounted read-only. Using these files also takes out the pain of dealing with failures of upgrading individual files within a rootfs.
Upgrades are done by transferring a self-extracting tarball to a tmpfs. The beginning of this script remounts the OS image read/write, then extracts the new main OS image to the OS image file system, and then updates the versions.conf file (using the rename method described in the "murphy" paper). After this is done, I touch a stamp file indicating an upgrade has happened, then reboot.
The boot kernel looks for this stamp file. If it finds it, it moves it to another stamp file, then boots the new main OS image file. The main OS image file is expected to remove the stamp file when it starts successfully. If it doesn't, the watchdog will trigger a reboot, and then the boot kernel will see this and detect a failure.
You will note there are a few possible points of failure during an upgrade: syncing the versions.conf during the upgrade, and touching/removing the stamp files (three instances). I couldn't find a way to reduce these further and achieve everything I wanted. If anyone has a better suggestion, I'd love to hear it. File system errors or power failures while writing the OS image could also occur, but I'm hoping the ext3 file system will provide some chance of surviving in that case.
You can have a seperate partition for update(Say Side1/Side2).
The existing kernel,rootfs is in Side1, then put the update in Side2 and switch.
By this you can reduce wear leveling and increase the life but the device gets costlier.
You can quick format the partitions before extracting the tar files. Or go with the image solution but use the smallest possible image and after dd do a filesystem resize (although that is not necessary for readonly storage)

Resources