I'm currently using a piece of software (let's call it ThirdPartyApp) that reads files from a certain directory on my PC. I want to make my own software (call it MyApp) that generates files for ThirdPartyApp. When ThirdPartyApp tries to load /path/to/somefile, instead of somefile getting read from the hard drive, I want MyApp to get called and generate bytes in real time. This is similar to how reading from, say, /dev/urandom doesn't actually load a file called urandom, but instead loads the output of a random generator.
So, my question is, is this even possible to do in userspace? If so, what is this called? I'm not asking for a recommendation of a specific library or anything like that; I just need to know what to google to find info about doing something like this. Oh, and I only care about making this work on Linux, if that's a limiting factor. Thanks!
Check out FUSE (Filesystem in Userspace): en.wikipedia.org/wiki/Filesystem_in_Userspace – Matt Joyce
Also check out named pipes. Btw, if you control starting this ThirdPartyApp then you can simply run MyApp just before that. – Kenney
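To make the named-pipe suggestion concrete, here is a minimal Python sketch. It assumes ThirdPartyApp only reads the file sequentially (a FIFO cannot be seeked), and the path and payload are made up for illustration. The reader in demo() merely stands in for ThirdPartyApp.

```python
import os
import tempfile
import threading


def serve_fifo_once(fifo_path, payload):
    """Block until a reader opens the FIFO, then write payload and close."""
    # Opening a FIFO for writing blocks until some process opens it for reading.
    with open(fifo_path, "wb") as fifo:
        fifo.write(payload)


def demo():
    """Create a FIFO, generate bytes on demand, and read them back."""
    workdir = tempfile.mkdtemp()
    fifo_path = os.path.join(workdir, "somefile")  # hypothetical path
    os.mkfifo(fifo_path)

    generated = b"bytes generated at read time\n"
    writer = threading.Thread(target=serve_fifo_once, args=(fifo_path, generated))
    writer.start()

    # This stands in for ThirdPartyApp opening /path/to/somefile.
    with open(fifo_path, "rb") as reader:
        received = reader.read()

    writer.join()
    os.remove(fifo_path)
    os.rmdir(workdir)
    return received
```

In a real setup MyApp would loop around serve_fifo_once so that every new open of the pipe gets fresh data. If ThirdPartyApp needs seeking or file sizes, FUSE is probably the better fit.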
I am writing some automated tests for testing code and providing feedback to the programmer.
One of the requirements is to detect if the code has successfully read the specified input file. If not - we need to provide feedback to the user accordingly. One way to detect this was atime timestamp, but since our server drive is mounted with relatime option - we are not getting atime updates for every file read. Changing this option to record every atime is not feasible as it slows down our I/O operations significantly.
Is there any other alternative that we can use to detect if the given code indeed reads the specified input file?
Here's a wild idea: intercept read call at some point. One of possible approaches goes more or less like this:
The program makes all its reading through an abstraction. For example, MyFileUtils.read(filename) (custom) instead of File.read(filename) (stdlib).
During normal operation, MyFileUtils simply delegates the work to File (or whatever system built-in libraries/calls you use).
But under test, MyFileUtils is replaced with a special test version which, along with the delegation, also reports usage to the framework.
Note that in some environments/languages it might be possible to inject code into File directly and the abstraction will not be needed.
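The approach above could look roughly like this in Python. The names RecordingFileUtils and code_under_test are hypothetical, and the original answer's File.read is replaced by Python's built-in open(); treat it as a sketch of the idea rather than a finished test harness.

```python
class MyFileUtils:
    """Production reader: delegates straight to the built-in open()."""

    def read(self, filename):
        with open(filename, "r") as handle:
            return handle.read()


class RecordingFileUtils(MyFileUtils):
    """Test double: delegates as usual but records each successful read."""

    def __init__(self):
        self.files_read = []

    def read(self, filename):
        data = super().read(filename)
        # Only reached if the read succeeded (no exception was raised).
        self.files_read.append(filename)
        return data


def code_under_test(file_utils, filename):
    """Hypothetical submission: counts lines using the injected reader."""
    return len(file_utils.read(filename).splitlines())
```

The test framework passes a RecordingFileUtils instance to the code under test and afterwards checks whether the expected input file appears in files_read.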
I agree with Sergio: touching a file doesn't mean that it was read successfully. If you want to be really sure, the programs have to send some sort of indication back, and there are many ways to do that.
A pragmatic way could be: assuming that those programs under test create log files; your "test monitor" could check that the log files contain fixed entries such as "reading xyz PASSED" or something alike.
If your code under test doesn't create log files, consider changing that.
The thing is, I want to track if a user tries to open a file on a shared account. I'm looking for any record/technique that helps me know if the concerned file is opened, at run time.
I want to create a script which monitors if the file is open, and if it is, I want it to send an alert to a particular email address. The file I'm thinking of is a regular file.
I tried using lsof | grep filename for checking if a file is open in gedit, but the command doesn't return anything.
Actually, I'm trying this for a pet project, and thus the question.
The command lsof -t filename shows the IDs of all processes that have the particular file opened. lsof -t filename | wc -w gives you the number of processes currently accessing the file.
The fact that a file has been read into an editor like gedit does not mean that the file is still open. The editor most likely opens the file, reads its contents and then closes the file. After you have edited the file you have the choice to overwrite the existing file or save as another file.
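If you would rather do this check from a script than parse lsof output, a rough Linux-only equivalent is to scan /proc yourself. This sketch assumes /proc is mounted and only sees processes you have permission to inspect; the function name is made up.

```python
import os


def pids_with_file_open(path):
    """Return the PIDs that currently hold `path` open, by scanning /proc."""
    target = os.path.realpath(path)
    pids = []
    for entry in os.listdir("/proc"):
        if not entry.isdigit():
            continue
        fd_dir = os.path.join("/proc", entry, "fd")
        try:
            fds = os.listdir(fd_dir)
        except OSError:
            continue  # process exited, or we lack permission to inspect it
        for fd in fds:
            try:
                if os.readlink(os.path.join(fd_dir, fd)) == target:
                    pids.append(int(entry))
                    break
            except OSError:
                continue  # descriptor was closed while we were scanning
    return pids
```

Like lsof, this only reports a process while it actually holds the file open, so an editor that reads the file and closes it right away will not show up.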
You could (in addition of other answers) use the Linux-specific inotify(7) facilities.
I understand that you want to track one (or a few) particular files with a fixed file path (actually a given i-node). E.g. you would want to track when /var/run/foobar is accessed or modified, and do something when that happens.
In particular, you might want to install and use incrond(8) and configure it through incrontab(5).
If you want to run a script when some given file (on a native local file system, e.g. Ext4 or Btrfs, but not NFS) is accessed or modified, use inotify; incrond is made exactly for that purpose.
PS. AFAIK, inotify doesn't work well for remote network files, e.g. on NFS filesystems (in particular when another NFS client machine is modifying a file).
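For a feel of what inotify reports, here is a small Python sketch that calls the C library functions directly through ctypes, since the standard library has no inotify module. The helper names watch and pending_masks are made up, the mask constants are copied from <sys/inotify.h>, and it assumes a Linux system with a local filesystem.

```python
import ctypes
import os
import struct

IN_ACCESS = 0x00000001      # file was read
IN_MODIFY = 0x00000002      # file was written
IN_OPEN = 0x00000020        # file was opened
IN_NONBLOCK = os.O_NONBLOCK  # inotify_init1 flag, defined as O_NONBLOCK

_libc = ctypes.CDLL(None, use_errno=True)
_libc.inotify_init1.argtypes = [ctypes.c_int]
_libc.inotify_add_watch.argtypes = [ctypes.c_int, ctypes.c_char_p, ctypes.c_uint32]
_EVENT_HEADER = struct.Struct("iIII")  # wd, mask, cookie, name length


def watch(path, mask=IN_OPEN | IN_ACCESS | IN_MODIFY):
    """Return an inotify file descriptor watching `path` for `mask` events."""
    fd = _libc.inotify_init1(IN_NONBLOCK)
    if fd < 0:
        raise OSError(ctypes.get_errno(), "inotify_init1 failed")
    if _libc.inotify_add_watch(fd, os.fsencode(path), mask) < 0:
        err = ctypes.get_errno()
        os.close(fd)
        raise OSError(err, "inotify_add_watch failed")
    return fd


def pending_masks(fd):
    """Return the event masks queued on `fd` without blocking."""
    try:
        data = os.read(fd, 4096)
    except BlockingIOError:
        return []  # nothing has happened to the file yet
    masks, offset = [], 0
    while offset < len(data):
        _wd, event_mask, _cookie, name_len = _EVENT_HEADER.unpack_from(data, offset)
        masks.append(event_mask)
        offset += _EVENT_HEADER.size + name_len
    return masks
```

A monitoring script would poll or select on the descriptor and send its alert whenever a mask with IN_OPEN set arrives. Note that inotify tells you that the file was opened, not by which user or process.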
If the files you are fond of are somehow source files, you might be interested by revision control systems (like git) or builder systems (like GNU make); in a certain way these tools are related to file modification.
You could also have the particular file sit in some FUSE filesystem, and write your own FUSE daemon.
If you can restrict and modify the programs accessing the file, you might want to use advisory locking, e.g. flock(2), lockf(3).
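A minimal sketch of advisory locking from Python, using fcntl.flock (a thin wrapper over flock(2)). The helper names try_lock and unlock are made up. Remember that the lock is advisory: it only constrains programs that also ask for it.

```python
import fcntl


def try_lock(path):
    """Try to take an exclusive advisory lock; return the open file or None."""
    handle = open(path, "a+")
    try:
        fcntl.flock(handle, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        handle.close()
        return None  # another open file description holds the lock
    return handle


def unlock(handle):
    """Release the lock and close the file."""
    fcntl.flock(handle, fcntl.LOCK_UN)
    handle.close()
```

A program that gets None back knows that someone else is currently working on the file and can wait, retry, or raise its alert.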
Perhaps the data sitting in the file should be in some database (e.g. SQLite, or a real DBMS like PostgreSQL or MongoDB). ACID properties are important.
Notice that the filesystem and the mount options may matter a lot.
You might want to use the stat(1) command.
It is difficult to help more without understanding the real use case and the motivation. You should avoid an XY problem.
Probably the workflow is wrong (having a file shared between several users who can all write to it), and you should approach the overall issue in some other way. For a pet project I would at least recommend using some advisory lock, and accessing and modifying the information only through your own programs (perhaps setuid) using flock (this excludes ordinary editors like gedit or commands like cat ...). However, your implicit use case seems to be well suited to a DBMS approach (a database does not have to contain a lot of data, it might be tiny), or to an indexed file with locking, such as the GDBM library handles.
Remember that on POSIX systems and Linux, several processes can access (and even modify) the same file simultaneously (unless you use some locking or synchronization).
Reading the Advanced Linux Programming book (freely available) would give you a broader picture (but it does not mention inotify, which appeared after the book was written).
You can use ls -lrt, which lists the files sorted by modification time with the most recently modified last. From that you can tell whether the file was written recently, though not whether it is currently open. Make sure that you are in the right directory.
Let's assume that we've several non-identical versions of the same folder in different locations as follows:
/in/some/location/version1
/different/path/version2
/third/place/version3
Each version contains callerFile, a pre-compiled executable whose behavior we can't control. This callerFile creates and edits a folder called cache:
/some/fourth/destination/cache
So the versions' settings conflict with each other. What I want to do is turn /some/fourth/destination/cache into a link with 3 different destinations:
/some/fourth/destination/cache --> /in/some/location/version1/cache
/some/fourth/destination/cache --> /different/path/version2/cache
/some/fourth/destination/cache --> /third/place/version3/cache
so for example:
if /in/some/location/version1/callerFile calls /some/fourth/destination/cache, it should be redirected to /in/some/location/version1/cache
and if /different/path/version2/callerFile calls /some/fourth/destination/cache, it should be redirected to /different/path/version2/cache
and if /third/place/version3/callerFile calls /some/fourth/destination/cache, it should be redirected to /third/place/version3/cache
So, how can I do this on a 64-bit Ubuntu 12.04 system?
Assuming you have no control over what callerFile actually does (it does what it wants, and always the same thing), the conclusion is that you need to modify its environment. This is quite an advanced trick, requiring deep experience with the Linux kernel and Unix programming in general, and you should consider whether it's worth it. It will also require root privileges on the machine where your callerFile binary lives.
The solution I'd propose is to create an executable (or a script calling one of the exec() family of functions) that prepares a special environment (or makes sure it's ready to use), based on "mount -o bind" or the unshare() system call.
As I said, playing with the so-called "execution context" is quite an advanced trick. Theoretically you could also try some autofs-like solution, but you'd probably end up in the same place, and bind mount/unshare will probably be more effective than some FS-detection daemon. I wouldn't recommend diving into FUSE, for the same reason. And playing some over-complicated game with symlinks is probably not the way to go either.
http://www.kernel.org/doc/Documentation/unshare.txt
Note: whatever the "callerFile" binary does, I'm pretty sure it won't check its own filename, which makes it possible to replace it with an in-between wrapper that calls exec() on "callerFileRenamed".
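As a rough illustration of such a wrapper, the Python sketch below only builds the command line. It assumes the unshare(1) and mount(8) tools from util-linux are installed, and the function name is made up. Actually running the command needs root, and whether the bind mount stays hidden from other processes depends on the system's mount propagation settings.

```python
import os


def build_wrapper_command(version_dir, shared_cache, exe_name="callerFile"):
    """Build an argv that runs one version with its own cache bind-mounted.

    Returns a list suitable for subprocess.run(); executing it needs root.
    """
    private_cache = os.path.join(version_dir, "cache")
    executable = os.path.join(version_dir, exe_name)
    # Inside the new mount namespace: bind the private cache over the shared
    # path, then replace the shell with the real program.
    script = 'mount --bind "$1" "$2" && exec "$3"'
    return [
        "unshare", "--mount",      # new mount namespace for this process only
        "sh", "-c", script, "sh",  # "sh" fills $0; the rest become $1..$3
        private_cache, shared_cache, executable,
    ]
```

For example, build_wrapper_command("/in/some/location/version1", "/some/fourth/destination/cache") yields a command under which version1's callerFile sees its own cache at the shared path. Pass exe_name="callerFileRenamed" if the wrapper has taken over the original file name.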
As I understand it, basically what you want is to get different result with the same activity, distinguished by some condition external to activity itself, like, for example, returning different list for "ls" in same directory, based upon e.g. UID of user who issued "ls" command, without modifying some ./ls program binary.
I am writing an application for which I need to intercept some filesystem system calls, e.g. unlink. I would like to protect some file, say abc: if the user deletes the file, I need to copy it to some other place first. So I need unlink to call my code before deleting abc so that I can save it. I have gone through threads related to intercepting system calls, but methods like LD_PRELOAD won't work in my case because I want this to be secure and implemented in the kernel. inotify notifies after the event, so I would not be able to save the file. Could you suggest a suitable method? I would like to implement this in a kernel module instead of modifying the kernel code itself.
Regarding the other method suggested by Graham Lee: I had thought of it, but it has some problems. I need a hardlink mirror of all the files; it consumes no space, but it could still be problematic because I would have to repeatedly mirror the drive to keep my mirror up to date. It also won't work across partitions or on partitions that don't support links. So I want a solution through which I can attach hooks to the files/directories and then watch for changes instead of scanning repeatedly.
I would also like to add support for writes to a modified file, for which I cannot use hard links.
I would like to intercept system calls by replacing system calls but I have not been able to find any method of doing that in linux > 3.0. Please suggest some method of doing that.
As far as hooking into the kernel and intercepting system calls go, this is something I do in a security module I wrote:
https://github.com/cormander/tpe-lkm
Look at hijacks.c and symbols.c for the code; how they're used is in the hijack_syscalls function inside security.c. I haven't tried this on linux > 3.0 yet, but the same basic concept should still work.
It's a bit tricky, and you may have to write a good deal of kernel code to do the file copy before the unlink, but it's possible here.
One suggestion could be Filesystem in Userspace (FUSE). That is, write a FUSE module (which is, granted, in userspace) which intercepts filesystem-related syscalls, performs whatever tasks you want, and possibly calls the "default" syscall afterwards.
You could then mount certain directories with your FUSE filesystem and, for most of your cases, it seems like the default syscall behavior would not need to be overridden.
You can watch unlink events with inotify, though this might happen too late for your purposes (I don't know because I don't know your purposes, and you should experiment to find out). The in-kernel alternatives based on LSM (by which I mean SMACK, TOMOYO and friends) are really for Mandatory Access Control so may not be suitable for your purposes.
If you want to handle deletions only, you could keep a "shadow" directory of hardlinks (created via link) to the files being watched (via inotify, as suggested by Graham Lee).
If the original is now unlinked, you still have the shadow file to handle as you want to, without using a kernel module.
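A small Python sketch of the shadow-directory idea; the function name and directory layout are hypothetical. It only covers deletion, and the shadow directory must be on the same filesystem as the watched file because hard links cannot cross filesystems.

```python
import os


def shadow(path, shadow_dir):
    """Hard-link `path` into `shadow_dir` and return the link's path."""
    os.makedirs(shadow_dir, exist_ok=True)
    link_path = os.path.join(shadow_dir, os.path.basename(path))
    if not os.path.exists(link_path):
        os.link(path, link_path)  # fails across filesystems (EXDEV)
    return link_path
```

After the original name is unlinked, the data is still reachable through the shadow link, so an inotify handler for the delete event can copy it wherever it needs to go. Because both names refer to the same inode, a file that is modified in place changes in the shadow as well, so this does not preserve old contents.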
I have a program which requires the path to various files. The files live in different folders and are constantly updated, at irregular intervals.
When the files are updated, they change name. So, for instance, in the folder dir1 I have fv01 and fv02. Later in the day someone adds fv02_v1; the day after, someone adds fv03, and so on. In other words, I always have an updated file, but with a different name.
I want to create a symbolic link in my "run" folder to these files, such that said link always points to the latest file created.
I can do this in Python or Bash, but I was wondering what is out there, as this is hardly an uncommon problem.
How would you go about it?
Thank you.
Juan
PS. My operating system is Linux. I currently have a simple daemon (Python) that looks every once in a while (refreshes every minute) for the latest file. Seems kind of an overkill to me.
Unless there is some compelling reason that you have left unstated (e.g. thousands of files in the directory) just do it the way you suggest with a script sorting the files by modification time. There is no secret method that I am aware of.
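For reference, the script approach might look like this in Python. The function name is made up, and it assumes that "latest" means the most recent modification time among the regular files in the directory.

```python
import os


def point_link_at_newest(source_dir, link_path):
    """Make `link_path` a symlink to the most recently modified file in `source_dir`."""
    candidates = [
        entry.path for entry in os.scandir(source_dir) if entry.is_file()
    ]
    if not candidates:
        return None
    newest = max(candidates, key=os.path.getmtime)
    # Create the new link under a temporary name, then rename it over the old
    # one so readers never see a missing link.
    tmp_link = link_path + ".tmp"
    if os.path.lexists(tmp_link):
        os.remove(tmp_link)
    os.symlink(os.path.abspath(newest), tmp_link)
    os.replace(tmp_link, link_path)
    return newest
```

Called from cron or from the existing daemon, e.g. point_link_at_newest("dir1", "run/latest"), this keeps the link current. The rename step matters only if the program might open the link while it is being updated.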
You could write a daemon using inotify to monitor your directories and immediately set your links but that seems like overkill.
Edit: I just saw your edit. Since you have the daemon already, inotify might not be such a bad idea. It would be somewhat more efficient than constantly querying since the OS will tell you when something in your directories has changed.
I don't know Python well enough to point you to anything specific, but there must be a wrapper for inotify.