Any way to tell getAttr is at the last element of a path? - fuse

I'm writing a read-only FUSE file system that abstracts access to a remote file system. The remote file system requires an user ID, a token and a file ID to produce a file.
If I have mounted my FUSE file system on /mnt/rfs and an program tries to open the file /mnt/rfs/user/token/fileid FUSE will trigger three getAttr invocations, one for /user, one for /user/token and one for /user/token/fileid.
Is there any way to detect from inside the getAttr invocation that this particular call is for the last element in the path? Obviously, if I know how my service works, I can just fake the directories for the first two elements, then download a file to some local tmp storage using the user/token/fileid info. But this is hacky and fragile.

There's no generic way to do it. Handle it on a case-by-case basis.

Related

Copy/move semantics on FUSE

I have a hash-value database with tags and I want to implement a FUSE interface for it. Because values are indexed by their hashes they must be read-only.
Native interface for this database is very simple:
You can download, upload or tag a file.
You can get the set of all defined tags.
You can search for files tagged in accordance to a boolean combination of tags.
FUSE interface semantics are simple:
Database is viewed as a big synthetic directory hierarchy where values are files named by its hash and tags are directories.
cd-ing inside a directory is semantically equivalent to search for a given tag (naming conventions on paths can be used to implement boolean operations).
read-ing a file is semantically equivalent to download (part of) a value (FUSE allows an stateless read so open and close can be no-ops).
Copying/moving an inexistent file into a given path is equivalent to upload and tag it. Copying/moving an existent file into a given path is equivalent to add new tags.
Any other operation throws an error.
This FUSE interface is quite usable and allows you to easily embed a tag file system inside a hierarchical one without the need of external tools like TagSpaces or Evernote.
My problem arises identifying a file copy or move from any other forbidden operation with FUSE interface: there are endless possible combination of operations with equivalent semantics.
What is the most reliable way to identify a file copy or move with FUSE interface?
Hooking rename of a file should be straightforward by implementing rename() fuse call. In this call, you will get path of both old and new location, so that you can check if the file comes from outside or not. That said, this would work only if user space tool renames a file by invoking rename(2) kernel call.
On the other hand, hooking file copy operation would be harder: it can't be done directly as there is no such fuse call - copying happens in user space completely and so it's not directly detectable in kernel space.
You could try to do some heuristics and process incoming fuse operations to detect rename of already stored file (eg. by hashing content of new file and comparing that with already existing files), but I'm not sure how much it makes sense in your case or if it would be actually practical.

Resolving file descriptor to file name / file path

I am currently developing a simple kernel module that can steal system calls such as open, read, write and replace them with a simple function which logs the files being opened, read, written, into a file and return the original system calls.
My query is, I am able to get the File Descriptor in read and write system calls, but I am not able to understand how to obtain file name using the same.
Currently I am able to access the file structure associated with given FD using following code:
struct file *file;
file = fcheck(fd);
This file structure has two important entities in it, which are of my concern I believe:
f_path
f_inode
Can anybody help me get dentry or inode or the path name associated with this fd using the file structure associated with it?
Is my approach correct? Or do I need to do something different?
I am using Ubuntu 14.04 and my kernel version is 3.19.0-25-generic, for the kernel module development.
.f_inode is actually an inode.
.f_path->dentry is a dentry.
Traversing this dentry via ->d_parent link, until f_path.mnt.mnt_root dentry will be touched, and collecting dentry->d_name components, will construct the file's path, relative to the mount point. This is done, e.g., with d_path, but in more carefull way.
Instead of fcheck(fd), which should be used inside RCU read section, you can also use fget(fd), which should be paired with fput().
The approach is completely incorrect - see http://www.watson.org/~robert/2007woot/
Linux already has a reliable mechanism for doing this thing (audit). If you want to implement it anyway (for fun I presume), you want to place your hooks roughly where audit is doing that. Chances are LSM hooks are in appropriate places, have not checked.

Level 2 I/O in Linux using readdir() possible?

I am trying to traverse a directory structure and open every file in that structure. To traverse, I am using opendir() and readdir(). Since I already have the entity, it seems stupid to build a path and open the file -- that presumably forces Linux to find the directory and file I just traversed.
Level 2 I/O (open, creat, read, write) require a path. Is there any call to either open a filename inside a directory, or open a file given an inode?
You probably should use nftw(3) to recursively traverse a file tree.
Otherwise, in a portable way, construct your directory + filename path using e.g.
snprintf(pathbuf, sizeof(pathbuf), "%s/%s", dirname, filename);
(or perhaps using asprintf(3) but don't forget to later free the result)
And to answer your question about opening a file in a directory, you could use the Linux or POSIX2008 specific openat(2). But I believe that you should really use nftw or construct your path like suggested above. Read also about O_PATH and O_TMPFILE in open(2).
BTW, the kernel has to access several times the directory (actually, the metadata is cached by file system kernel code), just because another process could have written inside it while you are traversing it.
Don't even think of opening a file thru its inode number: this will violate several file system abstractions! (but might be hardly possible by insane and disgusting tricks, e.g. debugfs - and this could probably harm very strongly your filesystem!!).
Remember that files are generally inodes, and can have zero (a process did open then unlink(2) a file while keeping the open file descriptor), one (this is the usual case), or several (e.g. /foo/bar1 and /gee/bar2 could be hard-linked using link(2) ....) file names.
Some file systems (e.g. FAT ...) don't have real inodes. The kernel fakes something in that case.

Why can't files be manipulated by inode?

Why is it that you cannot access a file when you only know its inode, without searching for a file that links to that inode? A hard link to the file contains nothing but a name and a number telling you where to find the inode with all the real information about the file. I was surprised when I was told that there was no usermode way to use the inode number directly to open a file.
This seems like such a harmless and useful capability for the system to provide. Why is it not provided?
Security reasons -- to access a file you need permission on the file AS WELL AS permission to search all the directories from the root needed to get at the file. If you could access a file by inode, you could bypass the checks on the containing directories.
This allows you to create a file that can be accessed by a set of users (or a set of groups) and not anyone else -- create directories that are only accessable by the the users (one dir per user), and then hard-link the file into all of those directories -- the file itself is accessable by anyone, but can only actually be accessed by someone who has search permissions on one of the directories it is linked into.
Some Operating Systems do have that facility. For example, OS X needs it to support the Carbon File Manager, and on Linux you can use debugfs. Of course, you can do it on any UNIX from the command-line via find -inum, but the real reason you can't access files by inode is that it isn't particularly useful. It does kindof circumvent file permissions, because if there's a file you can read in a folder you can't read or execute, then opening the inode lets you discover it.
The reason it isn't very useful is that you need to find an inode number via a *stat() call, at which point you already have the filename (or an open fd)...or you need to guess the inum.
In response to your comment: To "pass a file", you can use fd passing over AF_LOCAL sockets by means of SCM_RIGHTS (see man 7 unix).
Btrfs does have an ioctl for that (BTRFS_IOC_INO_PATHS added in this patch), however it does no attempt to check permissions along the path, and is simply reserved to root.
Surely if you've already looked up a file via a path, you shouldn't have to do it again and again?
stat(f,&s); i=open(f,O_MODE);
involves two trawls through a directory structure. This wastes CPU cycles with unnecessary string operations. Yes, the well-designed file system cache will hide most of this inefficiency from a casual end-user, but repeating work for no reason is ugly if not plain silly.

Redundant Linux Kernel System Calls

I'm currently working on a project that hooks into various system calls and writes things to a log, depending on which one was called. So, for example, when I change the permissions of a file, I write a little entry to a log file that tracks the old permission and new permission. However, I'm having some trouble pinning down exactly where I should be watching. For the above example, strace tells me that the "chmod" command uses the system call sys_fchmodat(). However, there's also a sys_chmod() and a sys_fchmod().
I'm sure the kernel developers know what they're doing, but I wonder: what is the point of all these (seemingly) redundant system calls, and is there any rule on which ones are used for what? (i.e. are the "at" syscalls or ones prefixed with "f" meant to do something specific?)
History :-)
Once a system call has been created it can't ever be changed, therefore when new functionality is required a new system call is created. (Of course this means there's a very high bar before a new system call is created).
Yes, there are some naming rules.
chmod takes a filename, while fchmod takes a file descriptor. Same for stat vs fstat.
fchmodat takes a file descriptor/filename pair (file descriptor for the directory and filename for the file name within the directory). Same for other *at calls; see the NOTES section of http://kerneltrap.org/man/linux/man2/openat.2 for an explanation.

Resources