Remove ext2 file with rootfs while it's already mounted - linux

What happens after mounting filesystem from file?
Example:
I have a rootfs.ext2 file which is located in the data directory and mounted under the /mnt directory:
mount rootfs.ext2 /mnt
After removing rootfs.ext2 I can still use files under the /mnt directory: cat files, run binaries, etc.
rm -f rootfs.ext2
I thought that the rootfs.ext2 file must still exist somewhere in the data directory, but it really was deleted. For test purposes I then filled the whole data directory with new data from /dev/urandom (to overwrite whatever data had been there before):
cat /dev/urandom > /data/Filling
Even after filling all the free space in the data directory I can still access /mnt and run binaries.
The question is: what happens to the file after mounting it, and why can I still operate through it? Can I delete the rootfs.ext2 file (if it's mounted under /) without undefined behavior of the system (binaries keep running, full access to the filesystem, etc.)?
Links to documentation are appreciated.

Linux (and Unix) filesystems have several features that allow that.
Inodes
Data (the thing you get when you run cat) and metadata (what you get from stat and ls) are stored in inodes ("indexed nodes"), which are like a key-value type of storage. Inodes are indexed in the sense that an inode is referred to by its ID, the inode number.
That means that the data in rootfs.ext2 is stored in an inode.
Hard Links
Files inside directories are represented as directory entries. A directory entry is a pair of name and inode number.
You can think of directories as hashtables, where the key is the name, and the value is the inode number.
The full path that a directory entry represents is called a hard link to that inode.
That means that multiple directory entries, in different directories or even in the same directory, can point to the same inode number.
You can create that by running:
$ echo hello > x1
$ cat x1
hello
$ ls -li x1
1956 -rw-r----- 1 root root 6 2022-09-03 21:26 x1
$ ln -v x1 x2
'x2' => 'x1'
$ cat x2
hello
$ ls -li x1 x2
1956 -rw-r----- 2 root root 6 2022-09-03 21:26 x1
1956 -rw-r----- 2 root root 6 2022-09-03 21:26 x2
ln, by default, creates a hard link.
ls -i prints the inode number, and you can see that in the above example, x1 and x2 have the same inode number, and are therefore both hard links to that inode.
You can also see that the first ls prints 1 before root - that's the number of hard links that inode 1956 has. You see it increasing to 2 after x2 is created.
What this means is that rootfs.ext2 is a hard link that points to the inode that actually holds the filesystem.
Reference Count
Every inode has a reference count.
When the file is not open anywhere, the inode's reference count is equal to its hard link count.
But if the file is opened, the open file is another reference.
For example:
$ exec 8<>x2 # opens x2 for read & write as file descriptor 8
$ cat /proc/self/fd/8
hello
Because this is reference counting, a file can have 0 hard links, but still have references. Continuing the above example, with the file still open:
$ rm -v x1 x2
removed 'x1'
removed 'x2'
$ ls -li
total 0
$ cat /proc/self/fd/8
hello
The hard links that point to the inode are gone, but the open file still points to the inode, so the inode is not deleted.
(BTW if you check, you'll see that /proc/self/fd/8 is actually not another hard link to that inode, but rather a symbolic link. However, the fact that you can still read the inode's data indicates that the inode wasn't deleted)
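If you're curious, you can inspect that symlink directly (continuing the session above; the exact path will differ on your machine - here I'm assuming the files were created in /root):
$ readlink /proc/self/fd/8
/root/x2 (deleted)
The "(deleted)" suffix is the kernel's way of saying the last hard link to the target is gone, while the inode is still readable through the open descriptor.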
Internally Open Files
Opening a file from userspace, like we did above with exec 8<>x2, is just one way to open files.
Many things in the Linux kernel internally open files. For example:
The swap file is internally open
When a program is executed, its executable file is kept internally open while the program is running, as are the dynamically linked libraries it uses.
As long as a block device is mounted, the inode that represents it is internally open.
When a socket is created, it is internally represented as an open file.
When a block device is set to be a loop device, it keeps the backing file open.
Loop Mounts
When you run mount rootfs.ext2 /mnt, what actually happens is that mount creates a block device, e.g. /dev/loop9, then opens rootfs.ext2, and configures /dev/loop9 as a loop device backed by the open file descriptor for rootfs.ext2.
As noted above, that means that as long as the block device is configured as a loop device for that file descriptor, the rootfs.ext2 inode remains open, therefore has a reference count > 0, and therefore is not deleted.
In fact, even if you deleted the loop device itself, the data would still be available, because that block device is also internally open, meaning both the backing regular file (rootfs.ext2) and the block device (/dev/loop9) are kept open:
$ sudo mount rootfs.ext2 /mnt/test/
$ echo hello > /mnt/test/x
$ losetup --list
NAME SIZELIMIT OFFSET AUTOCLEAR RO BACK-FILE DIO LOG-SEC
/dev/loop9 0 0 1 0 /tmp/rootfs.ext2 0 512
$ rm -v rootfs.ext2
removed 'rootfs.ext2'
$ sudo rm -v /dev/loop9
removed '/dev/loop9'
$ cat /mnt/test/x
hello
$ sudo umount /mnt/test
$ ls /mnt/test/
$
Extra Credit: Open Directories
Inodes contain whatever data and/or metadata is needed. Regular files, like rootfs.ext2, are represented as inodes. But directories are also inodes, as well as block devices, pipes, sockets, etc.
This means that directories have reference counts too, and that they too are opened. Famously via opendir(), but also internally:
When you call something like open("/etc/passwd"), the inode of the root directory (/) is briefly opened to look up etc, and the inode for /etc is briefly opened to look up passwd.
The working directory of every process is always internally open - if you delete it from another process, the first process can still run ls in it. However, it will not be able to create new files in it (see the sketch just after this list).
When a directory is a mount point, it is internally open.
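Here is a minimal sketch of the deleted-working-directory case mentioned above (two shells side by side; the directory name is just an example):
# shell 1
$ mkdir /tmp/doomed && cd /tmp/doomed
# shell 2
$ rmdir /tmp/doomed
# back in shell 1: listing the (now deleted) directory still works...
$ ls
# ...but creating new entries in it does not
$ touch x
touch: cannot touch 'x': No such file or directory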
You can unmount a mount point that is still in use, because every such "use" is counted as a reference:
$ sudo mount rootfs.ext2 /mnt/test/
$ cd /mnt/test/
$ echo hello > x
$ sudo umount --lazy /mnt/test
$ cat x
hello
$ cd / # reference count of what was mounted on /mnt/test drops to 0
$ cd /mnt/test
$ cat x
cat: x: No such file or directory

Related

SCP command altering filesize of transferred data

Context:
I am transferring a backup dir from Server A to Server B (RHEL).
Directory size (to be transferred) on Server A: 48GB
Available space on Server B: 154GB
Command I'm using on Server A(user: root):
scp -r -C <nameof-backup-dir> user@serverB:/path
Unexpected Behaviour:
The backup directory appears on the target Server B at /path, occupying all of the available 154GB of space.
Meanwhile the SCP run on the source Server A terminates with an "Insufficient space" message for the remaining files.
Question/Help needed:
What am I doing wrong here?
What changes do I need to make to the SCP command to achieve the result?
One thing I can think of is that block sizes are different.
If block size on the destination machine is bigger, small files will occupy more space.
To find out block size :
sudo tune2fs -l /dev/sda1 | grep -i 'block size'
# Replace /dev/sda1 with your device (find it with the df command)
If it's indeed the case, you can recreate destination file system with the same block size as the source file system.
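To check whether block-size overhead really explains the difference, you can also compare the logical size of the data with the space it actually occupies on each server (a rough check; replace the path with your backup directory):
du -sh --apparent-size /path/to/backup-dir   # logical size of the data
du -sh /path/to/backup-dir                   # space allocated on this filesystem
A large gap between the two numbers on Server B, but not on Server A, would point at many small files being padded up to a bigger block size.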

Finding if a folder is in copying process on linux

Is there any way to find out if a folder is in a copying process ?
To be more specific:
I have a folder in a share drive which is copied there by someone else, and I need to use it, but at the moment I access it (let's assume that I check its existence beforehand and it's okay) the copying process may still be ongoing.
I want to check this from a bash/python script.
Try lsof - list open files
lsof +d /path/to/some/directory
Here is an example with a huge copy:
mkdir /tmp/big
cd /tmp/big
# Create 1 Gb file
perl -e 'for(1..10000000) { print "x"x100 . "\n" }' > huge
# Start cp process in background, it will take a few seconds
cp -r /tmp/big /tmp/huge &
$ lsof +d /tmp/big
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
cp 4291 felix 3r REG 8,1 1010000000 2752741 /tmp/big/huge
man lsof
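If you want your bash script to wait until the copy is done, a rough approach (assuming lsof is installed; this is only a heuristic, since any other process with files open under the directory also counts as "busy") is to poll:
#!/bin/bash
dir=/path/to/some/directory
# Loop while any process still has files open under $dir
# (use +D instead of +d to scan subdirectories recursively; it is slower)
while lsof +d "$dir" > /dev/null 2>&1; do
    sleep 5
done
echo "No open files under $dir - probably safe to use"
lsof exits with a non-zero status when it finds no open files, which is what ends the loop.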

Under Linux, is it possible to gcore a process whose executable has been deleted?

Programming on CentOS 6.6, I deleted an executable (whoops, make clean) while it was running in a screen session.
Now, unrelated, I want to gcore the process to debug something. I have rebuilt the executable, but gcore doesn't accept the replaced file. It knows the original file was deleted and won't let me dump core.
# gcore 15659
core.YGsoec:4: Error in sourced command file:
/home/dev/bin/daemon/destinyd (deleted): No such file or directory.
gcore: failed to create core.15659
# ls -l /proc/15659/exe
lrwxrwxrwx. 1 root root 0 Mar 12 21:33 /proc/15659/exe -> /home/dev/bin/daemon/destinyd (deleted)
# ln -s /proc/15659/exe /home/dev/bin/daemon/destinyd
ln: creating symbolic link `/home/dev/bin/daemon/destinyd': File exists
# rm /proc/15659/exe
rm: remove symbolic link `/proc/15659/exe'? y
rm: cannot remove `/proc/15659/exe': Permission denied
FreeBSD's gcore has an optional argument "executable" which looks promising (as if I could specify a binary to use that is not /proc/15659/exe), but that's of no use to me as Linux's gcore does not have any such argument.
Are there any workarounds? Or will I just have to restart the process (using the recreated executable) and wait for the bug I'm tracking to reproduce itself?
Despite the output of ls -l /proc/15659/exe, the original executable is in fact still available through that path.
So, not only was I able to restore the original file with a simple cp (though this was not enough to restore the link and get gcore to work), but I was able to attach GDB to the process using this path as executable:
# gdb -p 15659 /proc/15659/exe
and then run the "generate-core-file" command, followed by "detach".
Then, I became free to examine the core file as needed:
# gdb /proc/15659/exe core.15659
In truth I had forgotten about the ability of GDB to generate core files, plus I was anxious about actually attaching GDB to the process because timing was quite important: generating the core file at precisely the right time to catch that bug.
But nos steered me back onto this path and, my fears apparently unfounded, GDB was able to produce a lovely core.15659 for me.
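For anyone who wants to script this, the same attach-and-dump can be done non-interactively (a sketch; 15659 is the PID from above and the core file name is just a choice):
# gdb -p 15659 -batch -ex 'generate-core-file core.15659' -ex detach /proc/15659/exe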

How to unmount a busy device [closed]

I've got some samba drives that are being accessed by multiple users daily. I already have code to recognize shared drives (from a SQL table) and mount them in a special directory where all users can access them.
I want to know: if I remove a drive from my SQL table (effectively taking it offline), is there a way to unmount a busy device, and if so, how? So far I've found that no form of umount works.
Ignoring the possibility of destroying data - is it possible to unmount a device that is currently being read?
YES!! There is a way to detach a busy device immediately - even if it is busy and cannot be unmounted forcefully. You can clean everything up later:
umount -l /PATH/OF/BUSY-DEVICE
umount -f /PATH/OF/BUSY-NFS (NETWORK-FILE-SYSTEM)
NOTE/CAUTION
These commands can disrupt a running process, cause data loss OR corrupt open files. Programs accessing the target DEVICE/NFS files may throw errors OR may stop working properly after a force unmount.
Do not execute the above umount commands while inside the mounted path (folder/drive/device) itself. First, you may use the pwd command to validate your current directory path (which should not be the mounted path), then use the cd command to get out of the mounted path - and only then unmount it using the above commands.
If possible, let us locate/identify the busy process, kill that process and then unmount the samba share/drive to minimize damage:
lsof | grep '<mountpoint of /dev/sda1>' (or whatever the mounted device is)
pkill target_process (kills busy proc. by name | kill PID | killall target_process)
umount /dev/sda1 (or whatever the mounted device is)
Make sure that you aren't still in the mounted device when you are trying to umount.
Avoid umount -l
At the time of writing, the top-voted answer recommends using umount -l.
umount -l is dangerous or at best unsafe. In summary:
It doesn't actually unmount the device, it just removes the filesystem from the namespace. Writes to open files can continue.
It can cause btrfs filesystem corruption
Work around / alternative
The useful behaviour of umount -l is hiding the filesystem from access by absolute pathnames, thereby minimising further mountpoint usage.
This same behaviour can be achieved by mounting an empty directory with permissions 000 over the directory to be unmounted.
Then any new accesses to filenames below the mountpoint will hit the newly overlaid directory with zero permissions - new blockers to the unmount are thereby prevented.
First try to remount,ro
The major unmount achievement to be unlocked is the read-only remount. When you gain the remount,ro badge, you know that:
All pending data has been written to disk
All future write attempts will fail
The data is in a consistent state, should you need to physically disconnect the device.
mount -o remount,ro /dev/device is guaranteed to fail if there are files open for writing, so try that straight up. You may be feeling lucky, punk!
If you are unlucky, focus only on processes with files open for writing:
lsof +f -- /dev/<devicename> | awk 'NR==1 || $4~/[0-9]+[uw -]/'
You should then be able to remount the device read-only and ensure a consistent state.
If you can't remount read-only at this point, investigate some of the other possible causes listed here.
Read-only re-mount achievement unlocked 🔓☑
Congratulations, your data on the mountpoint is now consistent and protected from future writing.
Why fuser is inferior to lsof
Why not use fuser earlier? Well, you could have, but fuser operates upon a directory, not a device, so if you wanted to remove the mountpoint from the file name space and still use fuser, you'd need to:
Temporarily duplicate the mountpoint with mount -o bind /media/hdd /mnt to another location
Hide the original mount point and block the namespace:
Here's how:
# Paths from the example above - adjust to your own mountpoint and duplicate location:
original=/media/hdd
original_duplicate=/mnt
null_dir=$(sudo mktemp --directory --tmpdir empty.XXXXX)
sudo chmod 000 "$null_dir"
# A request to remount,ro will fail on a `-o bind,ro` duplicate if there are
# still files open for writing on the original as each mounted instance is
# checked. https://unix.stackexchange.com/a/386570/143394
# So, avoid remount, and bind mount instead:
sudo mount -o bind,ro "$original" "$original_duplicate"
# Make the duplicate private so the empty-directory mount below isn't propagated/mirrored
# onto it - we only want to hide the original
sudo mount --make-private "$original_duplicate"
# Hide the original mountpoint
sudo mount -o bind,ro "$null_dir" "$original"
You'd then have:
The original namespace hidden (no more files could be opened, the problem can't get worse)
A duplicate bind mounted directory (as opposed to a device) on which
to run fuser.
This is more convoluted[1], but allows you to use:
fuser -vmMkiw <mountpoint>
which will interactively ask to kill the processes with files open for writing. Of course, you could do this without hiding the mount point at all, but the above mimics umount -l, without any of the dangers.
The -w switch restricts to writing processes, and the -i is interactive, so after a read-only remount, if you're in a hurry you could then use:
fuser -vmMk <mountpoint>
to kill all remaining processes with files open under the mountpoint.
Hopefully at this point, you can unmount the device. (You'll need to run umount on the mountpoint twice if you've bind mounted a mode 000 directory on top.)
Or use:
fuser -vmMki <mountpoint>
to interactively kill the remaining read-only processes blocking the unmount.
Dammit, I still get target is busy!
Open files aren't the only unmount blocker. See here and here for other causes and their remedies.
Even if you've got some lurking gremlin which is preventing you from fully unmounting the device, you have at least got your filesystem in a consistent state.
You can then use lsof +f -- /dev/device to list all processes with open files on the device containing the filesystem, and then kill them.
[1] It is less convoluted to use mount --move, but that requires mount --make-private /parent-mount-point which has implications. Basically, if the mountpoint is mounted under the / filesystem, you'd want to avoid this.
Try the following, but before running it note that the -k flag will kill any running processes keeping the device busy.
The -i flag makes fuser ask before killing.
fuser -kim /address # kill any processes accessing the file/mountpoint (asks first)
umount /address
Before unmounting the filesystem, we need to check whether any process is holding or using it. That's why it shows "device is busy" or "filesystem is in use".
Run the command below to find out which processes are using the filesystem:
fuser -cu /local/mnt/
It will show the processes holding/using the filesystem, e.g.:
/local/mnt: 1725e(root) 5645c(shasankarora)
ps -ef | grep 1725 # i.e. ps -ef | grep <pid>
kill -9 pid
Kill all those processes and then you will be able to unmount the partition/busy device.
Check for exported NFS file systems with exportfs -v. If found, remove them with exportfs -u share:/directory. These don't show up in the fuser/lsof listing, and can prevent umount from succeeding.
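For example (a sketch; the client spec and path are placeholders for your own export):
exportfs -v # list current exports
exportfs -u clienthost:/export/dir # unexport the share that sits on the device
umount /export/dir # the unmount can now succeed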
Just in case someone has the same problem:
I couldn't unmount the mount point (here /mnt) of a chroot jail.
Here are the commands I typed to investigate:
$ umount /mnt
umount: /mnt: target is busy.
$ df -h | grep /mnt
/dev/mapper/VGTout-rootFS 4.8G 976M 3.6G 22% /mnt
$ fuser -vm /mnt/
USER PID ACCESS COMMAND
/mnt: root kernel mount /mnt
$ lsof +f -- /dev/mapper/VGTout-rootFS
$
As you can notice, even lsof returns nothing.
Then I had the idea to type this :
$ df -ah | grep /mnt
/dev/mapper/VGTout-rootFS 4.8G 976M 3.6G 22% /mnt
dev 2.9G 0 2.9G 0% /mnt/dev
$ umount /mnt/dev
$ umount /mnt
$ df -ah | grep /mnt
$
Here the culprit was a /mnt/dev bind mount of /dev that I had created to be able to repair my system from inside the chroot jail.
After unmounting it, my problem was solved.
Check out umount2:
Linux 2.1.116 added the umount2() system call, which, like umount(), unmounts a target, but allows additional flags controlling the behaviour of the operation:
MNT_FORCE (since Linux 2.1.116): Force unmount even if busy. (Only for NFS mounts.)
MNT_DETACH (since Linux 2.4.11): Perform a lazy unmount: make the mount point unavailable for new accesses, and actually perform the unmount when the mount point ceases to be busy.
MNT_EXPIRE (since Linux 2.6.8): Mark the mount point as expired. If a mount point is not currently in use, then an initial call to umount2() with this flag fails with the error EAGAIN, but marks the mount point as expired. The mount point remains expired as long as it isn't accessed by any process. A second umount2() call specifying MNT_EXPIRE unmounts an expired mount point. This flag cannot be specified with either MNT_FORCE or MNT_DETACH.
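Note that with util-linux's umount you normally don't need to call umount2() yourself - the first two flags correspond to command-line options (shown here on a placeholder mountpoint):
umount -f /path/to/mountpoint # MNT_FORCE
umount -l /path/to/mountpoint # MNT_DETACH (lazy unmount)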
I recently had a similar need to unmount a drive in order to change its label with gparted.
/dev/sda1 was being mounted via /etc/fstab as /media/myusername. When attempts to unmount it failed, I researched the error. I had forgotten to first unmount a dual-partitioned thumb drive with a mountpoint on /dev/hda1.
I gave 'lsof' a go as recommended.
$ sudo lsof | grep /dev/sda1
The output of which was:
lsof: WARNING: can't stat() fuse.gvfsd-fuse file system /run/user/1000/gvfs
Output information may be incomplete.
lsof: WARNING: can't stat() fuse file system /run/user/1000/doc
Output information may be incomplete.
Since lsof burped up two fuse warnings, I poked around in /run/user/1000/*, and took a guess that it could be open files or mount points (or both) interfering with things.
Since the mount points live in /media/, I tried again with:
$ sudo lsof | grep /media
The same two warnings, but this time it returned additional info:
bash 4350 myusername cwd DIR 8,21 4096 1048577 /media
sudo 36302 root cwd DIR 8,21 4096 1048577 /media
grep 36303 myusername cwd DIR 8,21 4096 1048577 /media
lsof 36304 root cwd DIR 8,21 4096 1048577 /media
lsof 36305 root cwd DIR 8,21 4096 1048577 /media
Still scratching my head, it was at this point I remembered the thumb drive sticking out of the USB port. Maybe the scratching helped.
So I unmounted the thumb drive partitions (unmounting one automatically unmounted the other) and safely unplugged the thumb drive. After doing so, I was able to unmount /dev/sda1 (having nothing mounted on it anymore), relabel it with gparted, and remount both the drive and the thumb drive with no issues whatsoever.
Bacon saved.
Someone has mentioned that if you are using a terminal and your current directory is inside the path you want to unmount, you will get the error.
As a complement, in this case your lsof | grep path-to-be-unmounted should show output like:
bash ... path-to-be-unmounted
sudo fusermount -u -z <mounted path>
NB: do not use tab completion for the path, as this can also freeze the terminal.
Another alternative, when nothing else works, is editing /etc/fstab, adding the noauto flag and rebooting the machine. The device won't be mounted, and when you're finished doing whatever you need, remove the flag and reboot again.
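A matching /etc/fstab entry might look something like this (device, mountpoint and filesystem type are placeholders for your own setup):
/dev/sdb1 /mnt/share ext4 defaults,noauto 0 0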
Niche Answer:
If you have a zfs pool on that device, at least when it's a file-based pool, lsof will not show the usage. But you can simply run
sudo zpool export mypool
and then unmount.
Multiple mounts inside a folder
An additional reason could be a secondary mount inside your primary mount folder, e.g. after you worked on an SD card for an embedded device:
# mount /dev/sdb2 /mnt # root partition which contains /boot
# mount /dev/sdb1 /mnt/boot # boot partition
Unmounting /mnt will fail:
# umount /mnt
umount: /mnt: target is busy.
First we have to unmount the boot folder and then the root:
# umount /mnt/boot
# umount /mnt
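With a reasonably recent util-linux you can also unmount the whole nested tree in one go, which handles exactly this case:
# umount -R /mnt # recursively unmounts /mnt/boot, then /mnt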
In my case, I couldn't unmount a partition that was mounted to a directory that was an AFP share. (sharing into an Apple bonjour/avahi mdns world)
I moved all the logins on the server to their home directory; I moved all the remotely connected Macs to some other directory.
I still couldn't unmount the partition even with umount -f
So I restarted the netatalk daemon on the server.
(/etc/netatalk/afp.conf has in it the share assignment)
After the netatalk restart, umount succeeded without the -f.

Inode of directory on mounted share changes despite no change in modification time

I am running Ubuntu 10.4 and am mounting a drive using cifs. The command I'm using is:
'sudo mount -t cifs -o workgroup="workgroup",username="username",noserverino,ro //"drive" "mount_dir"'
(Obviously with "" values substituted for actual values)
When I then run the command ls -i I get: 394070
Running it a second time I get: 12103522782806018
Is there any reason to expect the inode value to change?
Running ls -i --full-time shows no change in modification time.
noserverino tells your mount not to use server-generated inode numbers, and instead use client-generated temporary inode numbers to make up for them. Try with serverino; if your server and the exported filesystem support inode numbers, they should be persistent.
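For example, the mount command from the question with serverino instead (same placeholder values as in the question):
sudo mount -t cifs -o workgroup="workgroup",username="username",serverino,ro //"drive" "mount_dir"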
I found that using the option "nounix" before the "noserverino" kept the inodes small and persistent. I'm not really sure why this happened. The server is AIX and I'm running it from Ubuntu. Thank you for your response.

Resources