remount disk with running process - linux

I have an embedded application that I am working on. To protect the data on its image, the partitions are mounted read-only (RO). This helps prevent flash corruption when power is lost unexpectedly, since I cannot guarantee clean shutdowns: someone could simply pull the plug.
The application that needs to be protected resides on this RO partition, but it also needs to be able to change configuration files on that same filesystem. I have code that remounts the partition read-write (RW) as needed (e.g. for firmware updates), but this requires first stopping every process running from the read-only partition (i.e. killall my_application). Hence my application cannot remount the partition it needs to modify without first killing itself (I am not sure which one is the chicken and which one is the egg, but you get the gist).
Is there a way to start my application such that the entire binary is kept in RAM, with no link back to the partition from which it was run, so that the remount does not fail with the partition reported as busy?
Or alternatively, is there a way to safely remount this RO partition without first killing the processes running on it?

You can copy the binary to a tmpfs filesystem and execute it from there. A tmpfs filesystem stores all of its data in RAM (and sometimes in your swap partition).
Passing the -o remount flag to mount should also work.
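A minimal sketch of the copy-and-exec approach, using /dev/shm (which is tmpfs-backed on most Linux systems) and /bin/echo as a stand-in for the real application binary; all paths here are illustrative:

```shell
#!/bin/sh
# Copy the binary to a tmpfs-backed path and run it from there, so the
# RO partition no longer has an open file reference keeping it busy.
# /bin/echo stands in for "my_application".
cp /bin/echo /dev/shm/my_application
chmod +x /dev/shm/my_application
/dev/shm/my_application "running from RAM"   # prints "running from RAM"
rm -f /dev/shm/my_application
```

In the real setup the application would copy itself and then exec the RAM copy, replacing the process that holds the partition open, before triggering the remount.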

Related

Yocto boot partition vs boot rootfs directory

I need to mount my boot partition in order to perform software updates in my Yocto build.
I would expect to mount it at /boot, but I see that there is a directory there already.
I can't seem to find information about what this /boot directory is and why it's needed. It contains the same .itb file that is in the boot partition.
Do I need this boot directory? What uses it? Is there a way to tell Yocto not to create it?
In Short
The short answer is that, unless you've tampered with the /etc/fstab file, the boot partition is normally mounted at /boot. You can check that it is indeed the boot partition that is mounted with the df command.
fstab
Now, I'm a bit of an fstab noob myself, but here is what I could gather.
The fstab file is responsible for automatically mounting devices. In general, by default, Yocto generates an fstab file containing a line like this:
/dev/mmcblk0p1 /boot vfat defaults 0 0
Meaning (as far as I can tell) that your first partition will be mounted at /boot automatically. (If there are any fstab wizards out there, feel free to elaborate.)
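For reference, the six whitespace-separated fstab fields in that line break down as follows (same values as the generated line above):

```
# device          mount point  type  options   dump  fsck pass
/dev/mmcblk0p1    /boot        vfat  defaults  0     0
```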
From my experience
This works like any normally mounted folder. You can modify whatever you want in /boot and the changes will be made on the boot partition. Once your changes are done, reboot and you should see them take effect.
As for your questions
I have a broad idea, but I suspect this depends on your build and target,
and I have no Yocto build I can check this with at the moment. So here are my hints:
Do you need it? I don't think so, unless you have a module or a script that is supposed to automatically modify the boot partition, and even then I don't think it is vital. I'd say the fastest way to find out is to remove the line in /etc/fstab that mounts the boot partition and see if something breaks.
Who is it used by? I suspect no one. I think it is just a handy way to access your boot partition, but as I said, I haven't had the opportunity to confirm that.
How do you prevent Yocto from creating it? All you should need to do is tell Yocto to write an fstab file without the line mounting /boot. Here is how to override a file installed by a parent layer. Know that fstab is installed by /poky/meta/recipes-core/base-files/base-files_X.X.X.bb.
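A sketch of such an override, assuming a custom layer named meta-custom (the layer name and paths are placeholders): base-files already lists file://fstab in its SRC_URI, so prepending your layer's files directory is usually enough to make your fstab win. Note that older Yocto releases use the FILESEXTRAPATHS_prepend spelling instead of the colon syntax.

```
# meta-custom/recipes-core/base-files/base-files_%.bbappend
FILESEXTRAPATHS:prepend := "${THISDIR}/files:"
# Place your edited fstab (without the /boot line) at:
#   meta-custom/recipes-core/base-files/files/fstab
```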
Note that overall I'm not 100% certain what this /boot mount is used for, but I wouldn't recommend getting rid of it. In my opinion there are no drawbacks to having your boot partition mounted, and it is a handy way to inspect the boot partition while investigating your build.
Turns out that the Atmel dt-overlay recipe was putting overlay files in the
/boot directory. These overlays aren't relevant for my machine, so I removed the dt-overlay dependency:
https://github.com/linux4sam/meta-atmel/blob/07867084ec52b5580e55eb67c861823eaab5e4c3/recipes-bsp/dt-overlay-at91/dt-overlay-at91_git.bb#L51

Create a filesystem on block device directly, without a partition?

I was under the impression that a block device is listed under /dev (so for example /dev/xvdf), that file systems live on a partition, which is listed with a number after the block device it is on (like /dev/xvdf1), and that all file systems must live on a partition.
I am running CentOS, and as part of a course I have to create file systems and partitions and mount file systems. For this course, I created a file system directly on the device file /dev/xvdf and mounted it. In addition, I created a partition on /dev/xvdf with the device file name /dev/xvdf1, created a file system on this partition as well, and mounted it too. This confuses me, and I have some questions:
Am I correct that you do not have to create a partition on a block device, but that you can create a file system on a block device directly without a partition?
If so, why would anyone want to do this?
After creating the file system on /dev/xvdf, I created the /dev/xvdf1 partition using fdisk and allocated the maximum number of blocks to this new partition. However, the file system on /dev/xvdf was not removed and still had a file on it. How is this possible if all the blocks on /dev/xvdf have been allocated to the /dev/xvdf1 partition?
Question #1: you are correct. A file system only needs a contiguous block of space somewhere. You can even create a file system in memory (a virtual disk).
Question #2: having a partition table is a good thing, but why use one if you don't need to split a disk (or other block device) into several pieces?
As for question #3, I think you overlooked something: probably an error was raised somewhere and you didn't notice, or one will be raised in the future. Whatever your impression, it cannot work: the mounted filesystem believes it owns all the space reserved to it, while at the same time fdisk believes those same blocks are free to use. By the way, what is that "/dev/xvdf"? Is it a real device or something else?
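A sketch demonstrating question #1 without risking a real disk: mkfs is happy to format a whole "device" with no partition table at all. Here a file-backed image stands in for the block device, and mkfs.ext4's -F flag tells it not to complain that the target isn't a partition:

```shell
#!/bin/sh
# Create a small disk image and put an ext4 filesystem directly on it,
# with no partition table -- just as you could on /dev/xvdf itself.
truncate -s 64M /tmp/whole_disk.img
mkfs.ext4 -q -F /tmp/whole_disk.img
# The image now starts with an ext4 superblock instead of a partition table.
```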

Is access to linux tmpfs transactional?

I'm currently facing some strange problem on my distributed application.
This application generally does the following things:
Reads and writes data on an NFSv3 filesystem
Reads and writes data on a tmpfs filesystem
One process generates files on tmpfs and another process accesses them (or another Java thread, which in the end is a pthread)
One process generates files on NFSv3 and another process accesses them (or another Java thread, which in the end is a pthread)
Writes data to NFSv3, and the same data is read from another machine
We discovered many latency problems with NFSv3, but those problems are known: if you write a file on NFS and then try to read it from another machine, it can take up to 90 seconds for the file to become visible when the stat syscall is executed on the other machine.
So we implemented some retry code to address this issue.
Recently we spotted a similar behaviour on tmpfs, but since it lives in RAM, I assumed that once a write finishes, another thread on the same machine should immediately see the file; yet we got an error about it.
So we decided to implement yet another retry block.
The question is: is tmpfs transactional once the code stops writing to the file?
And more generally, how does this work on different filesystems?
Thanks
Marco
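For what it's worth, the write-then-read-from-another-process scenario is easy to reproduce in miniature. The sketch below uses /dev/shm (tmpfs-backed on most Linux systems; the file name is illustrative): on a local filesystem, a completed write is immediately visible to other processes, unlike the NFS attribute-cache delay described above.

```shell
#!/bin/sh
# One process writes to a tmpfs-backed file; a separate process reads
# it immediately afterwards.
echo "payload" > /dev/shm/visibility_test
sh -c 'cat /dev/shm/visibility_test'   # prints "payload"
rm -f /dev/shm/visibility_test
```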

Azure Linux remove and add another disk

I needed to increase the disk space for my Azure Linux VM, so we attached a new empty disk and followed the steps here: http://azure.microsoft.com/en-in/documentation/articles/virtual-machines-linux-how-to-attach-disk . The only difference is that the newly added device ID was not found in /var/log/messages.
Now I need to add another disk, and we attached one, but the problem is the first step of fdisk:
sudo fdisk /dev/sdc
I have no idea where the recent disk is attached; I'm totally clueless. Also, what are the steps if I want to remove a disk altogether? I know umount will unmount a disk, but that doesn't necessarily detach the device from the instance; I want a total detachment.
Finally figured it out. The additional SCSI disks are named sequentially: /dev/sda, /dev/sdb, /dev/sdc, /dev/sdd, /dev/sde, and so on. The reason the MS tutorial talks about /dev/sdc is that it's the third disk in the system: first your root volume, second your ephemeral temp storage, and then this one. Now, if /dev/sdc is no longer good enough for you and you want to remove it:
Remove its entry from the /etc/fstab file
umount /datadrive
you can now remove the attached disk from your Azure console.
Let's say that sdc is still there and you want to add another disk. Just attach it from the Azure console and follow the same steps as given in http://azure.microsoft.com/en-in/documentation/articles/virtual-machines-linux-how-to-attach-disk/#initializeinlinux . The only difference is that the new disk will be at /dev/sdd, so where you would have made a partition at /dev/sdc1, it becomes /dev/sdd1. That's pretty much it.
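One way to take the guesswork out of "where did the new disk land" is to list the block devices after attaching; the freshly added data disk is typically the one with no partitions under it (lsblk is part of util-linux and is assumed to be installed):

```shell
# List all block devices with their sizes, types and mount points;
# a freshly attached data disk shows up as a "disk" with no children.
lsblk -o NAME,SIZE,TYPE,MOUNTPOINT
```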
References
http://www.yolinux.com/TUTORIALS/LinuxTutorialAdditionalHardDrive.html
http://azure.microsoft.com/en-in/documentation/articles/virtual-machines-linux-how-to-attach-disk/#initializeinlinux
When adding more than one disk, it also becomes important to start using UUIDs in fstab. Search for uuid in this article for more.
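A sketch of what a UUID-based entry looks like (the UUID and mount point shown are made-up placeholders; read the real UUID off the partition with blkid /dev/sdc1):

```
# /etc/fstab -- mount by UUID so the entry survives device renaming
UUID=12345678-abcd-ef01-2345-6789abcdef01  /datadrive  ext4  defaults  0  2
```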

LVM snapshot of mounted filesystem

I'd like to programmatically make a snapshot of a live filesystem in Linux, preferably using LVM. I'd like not to unmount it because I've got lots of files opened (my most common scenario is that I've got a busy desktop with lots of programs).
I understand that because of kernel buffers and general filesystem activity, data on disk might be in some more-or-less undefined state.
Is there any way to "atomically" unmount an FS, make an LVM snapshot, and mount it back? It would be OK if the OS blocked all activity for a few seconds to do this task. Or maybe some kind of atomic "sync+snapshot"? A kernel call?
I don't know if it is even possible...
You shouldn't have to do anything for most Linux filesystems. It should just work without any effort at all on your part. The snapshot command itself hunts down mounted filesystems using the volume being snapshotted and calls a special hook that checkpoints them in a consistent, mountable state and does the snapshot atomically.
Older versions of LVM came with a set of VFS lock patches that would patch various filesystems so that they could be checkpointed for a snapshot. But with new kernels that should already be built into most Linux filesystems.
This intro on snapshots claims as much.
And a little more research reveals that for kernels in the 2.6 series, the ext family of filesystems should all support this. ReiserFS probably does also. And if I know the btrfs people, that one probably does as well.
I know that ext3 and ext4 in RedHat Enterprise, Fedora and CentOS automatically checkpoint when a LVM snapshot is created. That means there is never any problem mounting the snapshot because it is always clean.
I believe XFS has the same support. I am not sure about other filesystems.
It depends on the filesystem you are using. With XFS you can use xfs_freeze -f to sync and freeze the FS, and xfs_freeze -u to unfreeze it again, so you can create your snapshot from the frozen volume, which should be a safe state.
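Put together, the freeze/snapshot/thaw sequence looks roughly like this (a sketch only, shown as a root-prompt transcript rather than a runnable script: the volume group vg0, the LV name home, and the snapshot size are placeholders):

```
# xfs_freeze -f /home
# lvcreate --snapshot --size 1G --name home_snap /dev/vg0/home
# xfs_freeze -u /home
```

Keeping the freeze window short matters: any process that writes to the frozen filesystem will block until the thaw.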
Is there any way to "atomically" unmount an FS, make an LVM snapshot and mount it back?
It is possible to snapshot a mounted filesystem, even when the filesystem is not on an LVM volume. If the filesystem is on LVM, or it has built-in snapshot facilities (e.g. btrfs or ZFS), then use those instead.
The below instructions are fairly low-level, but they can be useful if you want the ability to snapshot a filesystem that is not on an LVM volume, and can't move it to a new LVM volume. Still, they're not for the faint-hearted: if you make a mistake, you may corrupt your filesystem. Make sure to consult the official documentation and dmsetup man page, triple-check the commands you're running, and have backups!
The Linux kernel has an awesome facility called the Device Mapper, which can do nice things such as create block devices that are "views" of other block devices, and of course snapshots. It is also what LVM uses under the hood to do the heavy lifting.
In the below examples I'll assume you want to snapshot /home, which is an ext4 filesystem located on /dev/sda2.
First, find the name of the device mapper device that the partition is mounted on:
# mount | grep home
/dev/mapper/home on /home type ext4 (rw,relatime,data=ordered)
Here, the device mapper device name is home. If the path to the block device does not start with /dev/mapper/, then you will need to create a device mapper device, and remount the filesystem to use that device instead of the HDD partition. You'll only need to do this once.
# dmsetup create home --table "0 $(blockdev --getsz /dev/sda2) linear /dev/sda2 0"
# umount /home
# mount -t ext4 /dev/mapper/home /home
Next, get the block device's device mapper table:
# dmsetup table home
home: 0 3864024960 linear 9:2 0
Your numbers will probably be different. The device target should be linear; if yours isn't, you may need to take special considerations. If the last number (start offset) is not 0, you will need to create an intermediate block device (with the same table as the current one) and use that as the base instead of /dev/sda2.
In the above example, home is using a single-entry table with the linear target. You will need to replace this table with a new one, which uses the snapshot target.
Device mapper provides three targets for snapshotting:
The snapshot target, which saves writes to the specified COW device. (Note that even though it's called a snapshot, the terminology is misleading, as the snapshot will be writable, but the underlying device will remain unchanged.)
The snapshot-origin target, which sends writes to the underlying device, but also sends the old data that the writes overwrote to the specified COW device.
Typically, you would make home a snapshot-origin target, then create some snapshot targets on top of it. This is what LVM does. However, a simpler method is to create a snapshot target directly, which is what I'll show below.
Regardless of the method you choose, you must not write to the underlying device (/dev/sda2), or the snapshots will see a corrupted view of the filesystem. So, as a precaution, you should mark the underlying block device as read-only:
# blockdev --setro /dev/sda2
This won't affect device-mapper devices backed by it, so if you've already re-mounted /home on /dev/mapper/home, it should not have a noticeable effect.
Next, you will need to prepare the COW device, which will store changes since the snapshot was made. This has to be a block device, but can be backed by a sparse file. If you want to use a sparse file of e.g. 32GB:
# dd if=/dev/zero bs=1M count=0 seek=32768 of=/home_cow
# losetup --find --show /home_cow
/dev/loop0
Obviously, the sparse file shouldn't be on the filesystem you're snapshotting :)
Now you can reload the device's table and turn it into a snapshot device:
# dmsetup suspend home && \
dmsetup reload home --table \
"0 $(blockdev --getsz /dev/sda2) snapshot /dev/sda2 /dev/loop0 PO 8" && \
dmsetup resume home
If that succeeds, new writes to /home should now be recorded in the /home_cow file, instead of being written to /dev/sda2. Make sure to monitor the size of the COW file, as well as the free space on the filesystem it's on, to avoid running out of COW space.
Once you no longer need the snapshot, you can merge it (to permanently commit the changes in the COW file to the underlying device), or discard it.
To merge it:
replace the table with a snapshot-merge target instead of a snapshot target:
# dmsetup suspend home && \
dmsetup reload home --table \
"0 $(blockdev --getsz /dev/sda2) snapshot-merge /dev/sda2 /dev/loop0 P 8" && \
dmsetup resume home
Next, monitor the status of the merge until all non-metadata blocks are merged:
# watch dmsetup status home
...
0 3864024960 snapshot-merge 281688/2097152 1104
Note the 3 numbers at the end (X/Y Z). The merge is complete when X = Z.
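The completion check above can be sketched as a small shell helper. This is an illustration only: merge_done is a hypothetical function name, and the status lines fed to it are the sample numbers from the walkthrough above.

```shell
#!/bin/sh
# Decide from one `dmsetup status` output line whether a snapshot-merge
# has finished. For the snapshot-merge target the line ends in "X/Y Z";
# the merge is complete when X equals Z.
merge_done() {
    set -- $1        # split the status line into fields
    used=${4%/*}     # X: sectors currently allocated in the COW device
    meta=$5          # Z: sectors used for metadata
    [ "$used" = "$meta" ]
}

merge_done "0 3864024960 snapshot-merge 281688/2097152 1104" || echo "still merging"
merge_done "0 3864024960 snapshot-merge 1104/2097152 1104" && echo "merge complete"
```

In practice you would poll `dmsetup status home` in a loop (with a short sleep) until the helper reports completion, then swap the table back to linear as shown below.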
Next, replace the table with a linear target again:
# dmsetup suspend home && \
dmsetup reload home --table \
"0 $(blockdev --getsz /dev/sda2) linear /dev/sda2 0" && \
dmsetup resume home
Now you can dismantle the loop device:
# losetup -d /dev/loop0
Finally, you can delete the COW file.
# rm /home_cow
To discard the snapshot, unmount /home, follow steps 3-5 above, and remount /home. Although Device Mapper will allow you to do this without unmounting /home, it doesn't make sense (since the running programs' state in memory won't correspond to the filesystem state any more), and it will likely corrupt your filesystem.
I'm not sure if this will do the trick for you, but you can remount a file system as read-only. mount -o remount,ro /lvm (or something similar) will do the trick. After you are done your snapshot, you can remount read-write using mount -o remount,rw /lvm.
FS corruption is "highly unlikely" only as long as you never work in any kind of professional environment. Otherwise you'll meet reality, and you might try blaming "bit rot" or "hardware" or whatever, but it all comes down to having been irresponsible. Freeze/thaw (as mentioned a few times, and only if called properly) is sufficient outside of database environments. For databases, you still won't have a transaction-complete backup, and if you think a backup that rolls back some transactions is fine when restored: see the opening sentence.
Depending on the activity, you might have just added another 5-10 minutes of downtime if you ever need that backup.
Most of us can easily afford that, but it cannot be general advice.
Be honest about the downsides, guys.
