inode - move a file which is already opened

inode - move a file which is already opened - linux

On linux, on a ext4 partition there is a movie, if I open the movie & watch it, then I move or rename the movie.
After the previous cache use up, it need to read more cache from original file, then it still could do that.
The question is:
How file system & inode achieve this?

Using rename() on a file within the same file system just changes the name that points to the inode. You can even use rename() to move that name into another directory - as long as it's in the same file system:
The rename() function shall change the name of a file. The old
argument points to the pathname of the file to be renamed. The new
argument points to the new pathname of the file. ...
Note that the POSIX specification for rename() is a lot more specific than the C Standard rename() specification:
7.21.4.2 The rename function
Synopsis
#include <stdio.h>
int rename(const char *old, const char *new);
Description
The rename function causes the file whose name is the string pointed
to by old to be henceforth known by the name given by the string
pointed to by new . The file named old is no longer accessible by
that name. If a file named by the string pointed to by new exists
prior to the call to the rename function, the behavior is
implementation-defined.
Returns
The
rename
function returns zero if the operation succeeds, nonzero if it fails,
269)
in
which case if the file existed previously it is still known by its original name.
(That's the entire C Standard specification for rename(). Read the POSIX link for a lot more information.)
So how can you rename a file while your watching it in an application? The underlying inode that your movie-watching process's open file descriptor uses to access the file that you just rename()'d doesn't change.
That's the same reason why you can unlink() (delete) a file while it's in use - all you're doing is removing the entry that points to the inode - one of the links to the inode. The space used by the file isn't freed until the last link to the inode is removed - and an open file descriptor counts as a link. That's also why the function that deletes the directory entry for a file is called unlink() - that's all it does. And yes, a file (inode) can have more than one link pointing to it.

Related

what happens when calling ```touch .``` in linux?

this is a very specific question
I'm mainly interested in the open() system calls the happen when running touch ..
So I ran strace touch . and saw that opennat() is called three times.
but I'm not really understanding whats going on; as touch . does not print anything in the console and does not create a new file named "." since "." is a pointer to the current folder and can be seen by running ls -a so nothing is created since that name is already in use.
this is my assumption:
open() is called to check if the specified file name already exits, if a file descriptor is returned this means that the name is already in use and the operation is canceled.
please correct me if I'm wrong.

GNU touch prefers to use a file descriptor when touching files, since it's possible to write touch - > foo and expect the file foo to be touched. As a result, it always tries to open the specified path as a writable file, and if that's possible, it then uses that file descriptor to update the file timestamp.
In this case, it's not possible to open . for writing, so openat returns EISDIR. touch notices that it's a directory, so its call to its internal fdutimensat function gets an invalid file descriptor and falls back to using utimensat instead of futimens.
It isn't the case that the openat call is used to check that the file exists, but instead that using a file descriptor for many operations means that you don't have to deal with path resolution multiple times or handle symlinks, since all of those are resolved when the file descriptor is opened. This is why many long-lived programs choose to open a file descriptor to their current working directory, then change directories, and then use the file descriptor with fchdir to change back. Any pchanges to permissions after the program starts are not a problem.

How do I get the filename of an open std::fs::File in Rust?

I have an open std::fs::File, and I want to get it's filename, e.g. as a PathBuf. How do I do that?
The simple solution would be to just save the path used in the call to File::open. Unfortunately, this does not work for me. I am trying to write a program that reads log files, and the program that writes the logs keep changing the filenames as part of it's log rotation. So the file may very well have been renamed since it was opened. This is on Linux, so renaming open files is possible.
How do I get around this issue, and get the current filename of an open file?

On a typical Unix filesystem, a file may have multiple filenames at once, or even none at all. The file metadata is stored in an inode, which has a unique inode number, and this inode number can be linked from any number of directory entries. However, there are no reverse links from the inode back to the directory entries.
Given an open File object in Rust, you can get the inode number using the ino() method. If you know the directory the log file is in, you can use std::fs::read_dir() to iterate over all entries in that directory, and each entry will also have an ino() method, so you can find the one(s) matching your open file object. Of course this approach is subject to race conditions – the directory entry may already be gone again once you try to do anything with it.

On linux, files handles held by the current process can be found under /proc/self/fd. These look and act like symlinks to the original files (though I think they may technically be something else - perhaps someone who knows more can chip in).
You can therefore recover the (possibly changed) file name by constructing the correct path in /proc/self/fd using your file descriptor, and then following the symlink back to the filesystem.
This snippet shows the steps:
use std::fs::read_link;
use std::os::unix::io::AsRawFd;
use std::path::PathBuf;
// if f is your std::fs::File
// first construct the path to the symlink under /proc
let path_in_proc = PathBuf::from(format!("/proc/self/fd/{}", f.as_raw_fd()));
// ...and follow it back to the original file
let new_file_name = read_link(path_in_proc).unwrap();

Linux - How does mkstemp64 create a hidden file, and how can I see it in the file system

Background:
On CentOS 7 x86_64. I am using applydeltarpm, and was running out of disk space when creating the new RPM from the delta. I watched the / disk space usage increase during the apply process to 100%, but could not find the working/temp file on the volume using either ls -l /tmp or find /tmp -mmin -1 -type f.
I changed the applydeltarpm source code to use /var/tmp instead of /tmp, and rebuilt the RPM. Now apply works with the modified applydeltarpm, because /var/tmp has a lot more disk space. But I still cannot find the temp file created with mkstemp64.
Question:
The temp file created by mkstemp64 appears to be "non-existent", but still exists as a file descriptor to the creator, and is using considerable disk space when applydeltarpm creates a large RPM (1 hour to apply on a slow disk). The mkstemp64 documentation says an actual file is created. And the source code shows the template file name is /tmp/deltarpmpageXXXXXX. But a file with that template name does not exist.
How is this temporary file able to be created on the system without being findable with the usual directory listing ls, or find. And how can I find these kinds of "non-existent" files in the system?
(I am curious, because I also am monitoring system security)
References:
https://github.com/rpm-software-management/deltarpm/blob/master/applydeltarpm.c
# line 198
if (pagefd < 0)
{
char tmpname[80];
sprintf(tmpname, "/tmp/deltarpmpageXXXXXX");
#ifdef DELTARPM_64BIT
pagefd = mkstemp64(tmpname);
#else
pagefd = mkstemp(tmpname);
#endif
if (pagefd < 0)
{
fprintf(stderr, "could not create page area\n");
exit(1);
}
unlink(tmpname);
}
https://www.mkssoftware.com/docs/man3/mkstemp.3.asp
The mktemp() function returns a unique file name based on the template
parameter. At the time it is generated, no file in the current
directory has that name. No file is actually created, so it is
possible that another application could create a file with this name.
The mkstemp() function is similar to mktemp() in that it create a
unique file name based on template; however, mkstemp() actually
creates the file and returns its file descriptor. The name of the
created file is stored in template.
The mkstemp64() function is identical to the mkstemp() function except
that the file is opened with the O_LARGEFILE flag set.

It is a common practice to unlink the temporary file right after creation, if it's not needed to be accessed through a filesystem anymore. This avoids dangling temporary files if the process crashes or forgets to unlink it later.
unlink() does not delete the file, it only removes the link to the file from the filesystem. Every link to a file in a filesystem increases the link count of the file by one (there can be several links to the same file). Also every process, that calls open() or mmap() to open the file, increases the file count, until it closes the descriptor - then the link count is decreased. File exists as long as there is at least one link to it. When link count reaches zero, then the file is actually deleted.
mkstemp() also calls open() behind the scenes to open the temporary file and return its descriptor.
In order to see the files that are opened, but do not exist in the filesystem anymore, you can use lsof and search for the lines, where there is "(deleted)" after the file name.
lsof | grep '(deleted)'
The (disk)space used by these files will be freed, when the processes they are attached to are finished or close the file descriptor by themselves.

Open file by inode

Is it possible to open a file knowing its inode?
ls -i /tmp/test/test.txt
529965 /tmp/test/test.txt
I can provide path, inode (above 529965) and I am looking to get in return a file descriptor.

This is not possible because it would open a loophole in the access control rules. Whether you can open a file depends not only on its own access permission bits, but on the permission bits of every containing directory. (For instance, in your example, if test.txt were mode 644 but the containing directory test were mode 700, then only root and the owner of test could open test.txt.) Inode numbers only identify the file, not the containing directories (it's possible for a file to be in more than one directory; read up on "hard links") so the kernel cannot perform a complete set of access control checks with only an inode number.
(Some Unix implementations have offered nonstandard root-only APIs to open a file by inode number, bypassing some of the access-control rules, but if current Linux has such an API, I don't know about it.)

Not exactly what you are asking, but (as hinted by zwol) both Linux and NetBSD/FreeBSD provide the ability to open files using previously created “handles”: These are inode-like persistent names that identify a file on a file system.
On *BSD (getfh and fhopen) using this is as simple as:
#include <sys/param.h>
#include <sys/mount.h>
fhandle_t file_handle;
getfh("<file_path>", &file_handle); // Or `getfhat` for the *at-style API
// … possibly save handle as bytes somewhere and recreate it some time later …
int fd = fhopen(&file_handle, O_RDWR);
The last call requiring the caller to be root however.
The Linux name_to_handle_at and open_by_handle_at system calls are similar, but a lot more explicit and require the caller to keep track of the relevant file system mount IDs/UUIDs themselves, so I'll humbly link to the detailed example in the manpage instead. Beware, that the example is not complete if you are looking to persist the handles across reboots; one has to convert the received mount ID to a persistent filesystem identifier, such as a filesystem UUID, and convert that back to a mount ID later on. In essence they do the same however. And just like on *BSD using the later system call requires elevated privileges (CAP_DAC_READ_SEARCH to be exact).

Linux: what is link

I'm new in Linux I'm just using windwos GUI before but I have a question: what is LINK in Linux? I know it have tow type but I don't know what is the advantage of them normally in windows has shortcut to reference app from the difference path if LINK in Linux have the same feature why it have tow type?
Thank you to answer.

First You should be know what is Inode ? to understand advantage of link type let me
An inode is an entry in inodetable, containing information ( the metadata ) about a regular file and directory. An inode is a data structure on a traditional Linux file system such as ext3 or ext4.
Inode number also called as index number , it consists following attributes.
File types ( executable, block special etc)
Permissions ( read, write etc)
UID ( Owner )
GID ( Group )
FileSize
Time stamps including last access, last modification and last
inode number change.
File deletion time
Number of links ( soft/hard )
Location of file on harddisk.
Some other metadata about file.
to Display Inode use this command+flags
ls –il
What is a link?
A link is simply a way to refer to the contents of a file.
Types of link:
Hardlink(Another name for the file/inode in the disk)
Softlink/symbolic link (Pointer to the file location)
How to create Links in linux ?
Hard link ln existingfile newfile
Note : Hardlinksare not allowed for directories
Soft link ln –s existingfile newfile
A soft link will have a different Inode number than the source file, which will be having a pointer to the source file but hard link will be using the same Inode number as the source file.
Soft link is like shortcut in windows. It doesn’t contain any information about the destination file or contents of the file, instead of that, it simply contains the pointer to the location of the destination file.
Soft Links You can make links for files & folder & you can create link (shortcut) on different partition & got different inode number from original.
If real copy is deleted the link will not work.
Hard Links
For files only & you cannot create on different partition ( it should be on same partition ) & got same inode number as original
If therealcopy is deleted the link will work( because it act as original file )

There are two types because it evolved that way -- and they are implemented differently:
hard links came first. Every file is represented in a directory file by a name and inode value. Think of inode as the disk-address or block number. A hard-linked file's inode is in more than one directory file (or more than once in a given directory file, using a different filename), and each directory entry is the "same" file (different names, of course).
symbolic links came later. They are a special entry in a directory file which have the name of another file rather than an inode value.
There is a clear distinction between a symbolic link and the actual file to which it points: if you remove the file, you have a broken link. This is different from hard links - removing one name will not damage the other(s).

Links are a very handy way to create a shortcut to an original directory. Links are used in many instances: Sometimes to create a convenient path to a directory buried deep within the file hierarchy; other uses for links include:
Linking libraries
Making sure files are in constant locations (without having to move the original)
Keeping a “copy” of a single file in multiple locations.
In Linux there are two different types of links:
Hard links
Symbolic links
The difference between the two are significant. With hard links, you can only link to files (and not directories); you cannot reference a file on a different disk or volume, and they reference the same inode as the original source. A hard link will continue to remain usable, even if the original file is removed.
Symbolic links, on the other hand, can link to directories, reference a file/folder on a different disk or volume, will exist as a broken (unusable) link if the original location is deleted, reference abstract filenames and directories (as opposed to physical locations), and are given their own, unique inode.

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string