System calls in Linux that can be used to delete files - linux

What are the system calls that can be used to delete a file on Linux? I am not referring to just the system calls used by the libc-wrapper(which in-turn are used by command line tools).
Other than unlink and unlinkat what are the system calls that could be used to delete files on a Linux machine?

rename() and renameat() can be used to delete a file by renaming another file over it.
If you consider making a file empty to be a form of deletion, a variety of system calls, including truncate() and open() with O_TRUNC, can do that.

Related

what system calls are used to copy files in Linux

I am modifying ext4 filesystem to add a simple encryption to files contents.
I started by changing read and write system calls to change the contents right before write and after read system calls.
now copying files in terminal is working just as I expected but when I try to copy a file using a GUI based file manager (pcmanfm in this case) it is corrupting the contents.
my question is: what system calls are used for reading/writing files besides normal .read and .write?
mmap, sendfile, etc
If you want crypto in ext4, you should probably look at the google's recent patch to Linux just for that,
http://www.phoronix.com/scan.php?page=news_item&px=EXT4-Encryption-Support

Can POSIX/Linux unlink file entries completely race free?

POSIX famously lets processes rename and unlink file entries with no regard as to the effects on others using them, whilst Windows by default raises an error if you even try to touch the timestamps of a directory which has a file handle open somewhere deep inside inside.
However Windows doesn't need to be so conservative. If you open all your file handles with FILE_FLAG_BACKUP_SEMANTICS and FILE_SHARE_DELETE and take care to rename files to random names just before flagging deletion, you get POSIX semantics including lack of restriction on manipulating file paths containing open file handles.
One very nifty thing Windows can do is to perform renames and deletes and hard links only using an open file descriptor, and therefore you can delete a file without having to worry about whether another process has renamed it or any of the directories in the path preceding the file's location. This facility lets you perform completely race free file deletions - once you have an open file handle to the right file, you can stop caring about what other processes are doing to the filing system, at least for deletion (which is the most important as it implicitly involves destroying data).
This raises the question of what about POSIX? On POSIX unlink() takes a path, and between retrieving the current path of a file descriptor using /proc/self/fd/x or F_GETPATH and calling unlink() someone may have changed that path, thus potentially leading to the wrong file being unlinked and data lost.
A considerably safer solution is this:
Get one of the current paths of the open file descriptor using /proc/self/fd/x or F_GETPATH etc.
Open its containing directory.
Do a statat() on the containing directory for the leafname of the open file descriptor, checking if the device ids and inodes match.
If they match, do an unlinkat() to remove the leafname.
This is race safe from the parent directory upwards, though the hard link you delete may not be the one expected. However, it is not race safe if within the containing directory a third party process were to rename your file to something else and rename another file to your leafname between you checking for inode equivalence and calling the unlinkat(). Here the wrong file could be deleted, and data lost.
I therefore ask the question: can POSIX, or any specific POSIX implementation such as Linux, allow programs to unlink file entries completely race free? One solution could be to unlink a file entry by open file descriptor, another could be to unlink a file entry by inode, however google has not turned up solutions for either of those. Interestingly, NTFS does let you delete by a choice of inode or GUID (yes NTFS does provide inodes, you can fetch them from the NT kernel) in addition to deletion via open file handle, but that isn't much help here.
In case this seems like too esoteric a question, this problem affects proposed Boost.AFIO where I need to determine what filing system races I can mitigate and what I cannot as part of its documented hard behaviour guarantees.
Edit: Clarified that there is no canonical current path of an open file descriptor, and that in this use case we don't care - we just want to unlink some one of the links for the file.
No replies to this question, and I have spent several days trawling through Linux source code. I believe the answer is "currently you can't unlink a file race free", so I have opened a feature request at https://bugzilla.kernel.org/show_bug.cgi?id=93441 to have Linux extend unlinkat() with the AT_EMPTY_PATH Linux extension flag. If they accept that idea, I'll mark this answer as the correct one.

Level 2 I/O in Linux using readdir() possible?

I am trying to traverse a directory structure and open every file in that structure. To traverse, I am using opendir() and readdir(). Since I already have the entity, it seems stupid to build a path and open the file -- that presumably forces Linux to find the directory and file I just traversed.
Level 2 I/O (open, creat, read, write) require a path. Is there any call to either open a filename inside a directory, or open a file given an inode?
You probably should use nftw(3) to recursively traverse a file tree.
Otherwise, in a portable way, construct your directory + filename path using e.g.
snprintf(pathbuf, sizeof(pathbuf), "%s/%s", dirname, filename);
(or perhaps using asprintf(3) but don't forget to later free the result)
And to answer your question about opening a file in a directory, you could use the Linux or POSIX2008 specific openat(2). But I believe that you should really use nftw or construct your path like suggested above. Read also about O_PATH and O_TMPFILE in open(2).
BTW, the kernel has to access several times the directory (actually, the metadata is cached by file system kernel code), just because another process could have written inside it while you are traversing it.
Don't even think of opening a file thru its inode number: this will violate several file system abstractions! (but might be hardly possible by insane and disgusting tricks, e.g. debugfs - and this could probably harm very strongly your filesystem!!).
Remember that files are generally inodes, and can have zero (a process did open then unlink(2) a file while keeping the open file descriptor), one (this is the usual case), or several (e.g. /foo/bar1 and /gee/bar2 could be hard-linked using link(2) ....) file names.
Some file systems (e.g. FAT ...) don't have real inodes. The kernel fakes something in that case.

Syncing a file system that has no file on it

Say I want to synchronize data buffers of a file system to disk (in my case the one of an USB stick partition) on a linux box.
While searching for a function to do that I found the following
DESCRIPTION
sync() causes all buffered modifications to file metadata and
data to be written to the underlying file sys‐
tems.
syncfs(int fd) is like sync(), but synchronizes just the file system
containing file referred to by the open file
descriptor fd.
But what if the file system has no file on it that I can open and pass to syncfs? Can I "abuse" the dot file? Does it appear on all file systems?
Is there another function that does what I want? Perhaps by providing a device file with major / minor numbers or some such?
Yes I think you can do that. The root directory of your file system will have at least one inode for your root directory. You can use the .-file to do that. Play also around with ls -i to see the inode numbers.
Is there a possibility to avoid your problem by mounting your file system with sync? Does performance issues hamper? Did you have a look at remounting? This can sync your file system as well in particular cases.
I do not know what your application is, but I suffered problems with synchronization of files to a USB stick with the FAT32-file system. It resulted in weird read and write errors. I can not imagine any other valid reason why you should sync an empty file system.
From man 8 sync description:
"sync writes any data buffered in memory out to disk. This can include (but is not
limited to) modified superblocks, modified inodes, and delayed reads and writes. This
must be implemented by the kernel; The sync program does nothing but exercise the sync(2)
system call."
So, note that it's all about modification (modified inode, superblocks etc). If you don't have any modification, it don't have anything to sync up.

Redundant Linux Kernel System Calls

I'm currently working on a project that hooks into various system calls and writes things to a log, depending on which one was called. So, for example, when I change the permissions of a file, I write a little entry to a log file that tracks the old permission and new permission. However, I'm having some trouble pinning down exactly where I should be watching. For the above example, strace tells me that the "chmod" command uses the system call sys_fchmodat(). However, there's also a sys_chmod() and a sys_fchmod().
I'm sure the kernel developers know what they're doing, but I wonder: what is the point of all these (seemingly) redundant system calls, and is there any rule on which ones are used for what? (i.e. are the "at" syscalls or ones prefixed with "f" meant to do something specific?)
History :-)
Once a system call has been created it can't ever be changed, therefore when new functionality is required a new system call is created. (Of course this means there's a very high bar before a new system call is created).
Yes, there are some naming rules.
chmod takes a filename, while fchmod takes a file descriptor. Same for stat vs fstat.
fchmodat takes a file descriptor/filename pair (file descriptor for the directory and filename for the file name within the directory). Same for other *at calls; see the NOTES section of http://kerneltrap.org/man/linux/man2/openat.2 for an explanation.

Resources