Detecting if the code read the specified input file - linux

I am writing some automated tests for testing code and providing feedback to the programmer.
One of the requirements is to detect if the code has successfully read the specified input file. If not - we need to provide feedback to the user accordingly. One way to detect this was atime timestamp, but since our server drive is mounted with relatime option - we are not getting atime updates for every file read. Changing this option to record every atime is not feasible as it slows down our I/O operations significantly.
Is there any other alternative that we can use to detect if the given code indeed reads the specified input file?

Here's a wild idea: intercept read call at some point. One of possible approaches goes more or less like this:
The program makes all its reading through an abstraction. For example, MyFileUtils.read(filename) (custom) instead of File.read(filename) (stdlib).
During normal operation, MyFileUtils simply delegates the work to File (or whatever system built-in libraries/calls you use).
But under test, MyFileUtils is replaced with a special test version which, along with the delegation, also reports usage to the framework.
Note that in some environments/languages it might be possible to inject code into File directly and the abstraction will not be needed.

I agree with Sergio: touching a file doesn't mean that it was read successfully. If you want to be really "sure"; those programs have to "send" some sort of indication back. And of course, there are many options to get that.
A pragmatic way could be: assuming that those programs under test create log files; your "test monitor" could check that the log files contain fixed entries such as "reading xyz PASSED" or something alike.
If your "code under test" doesn't create log files; maybe: consider changing that.

Related

Make an file path in Unix filesystem actually point to a program

I'm currently using a piece of software (let's call it ThirdPartyApp) that reads files from a certain directory on my PC. I want to make my own software (call it MyApp) that generates files for ThirdPartyApp. When ThirdPartyApp tries to load /path/to/somefile, instead of somefile getting read from the hard drive, I want MyApp to get called and generate bytes in real time. This is similar to how reading from, say, /dev/urandom doesn't actually load a file called urandom, but instead loads the output of a random generator.
So, my question is, is this even possible to do in userspace? If so, what is this called? I'm not asking for a recommendation of a specific library or anything like that; I just need to know what to google to find info about doing something like this. Oh, and I only care about making this work on Linux, if that's a limiting factor. Thanks!
check out fuse file system : en.wikipedia.org/wiki/Filesystem_in_Userspace – Matt Joyce
Also check out named pipes. Btw, if you control starting this ThirdPartyApp then you can simply run MyApp just before that. – Kenney

Persistant storage values handling in Linux

I have a QSPI flash on my embedded board.
I have a driver + process "Q" to handle reading and writing into.
I want to store variables like SW revisions, IP, operation time, etc.
I would like to ask for suggestions how to handle the different access rights to write and read values from user space and other processes.
I was thinking to have file for each variable. Than I can assign access rights for those files and process Q can change the value in file if value has been changed. So process Q will only write into and other processes or users can only read.
But I don't know about writing. I was thinking about using message queue or zeroMQ and build the software around it but i am not sure if it is not overkill. But I am not sure how to manage access rights anyway.
What would be the best approach? I would really appreciate if you could propose even totally different approach.
Thanks!
This question will probably be downvoted / flagged due to the "Please suggest an X" nature.
That said, if a file per variable is what you're after, you might want to look at implementing a FUSE file system that wraps your SPI driver/utility "Q" (or build it into "Q" if you get to compile/control source to "Q"). I'm doing this to store settings in an EEPROM on a current work project and its turned out nicely. So I have, for example, a file, that when read, retrieves 6 bytes from EEPROM (or a cached copy) provides a MAC address in std hex/colon-separated notation.
The biggest advantage here, is that it becomes trivial to access all your configuration / settings data from shell scripts (e.g. your init process) or other scripting languages.
Another neat feature of doing it this way is that you can use inotify (which comes "free", no extra code in the fusefs) to create applications that efficiently detect when settings are changed.
A disadvantage of this approach is that it's non-trivial to do atomic transactions on multiple settings and still maintain normal file semantics.

How to check if a file is opened in Linux?

The thing is, I want to track if a user tries to open a file on a shared account. I'm looking for any record/technique that helps me know if the concerned file is opened, at run time.
I want to create a script which monitors if the file is open, and if it is, I want it to send an alert to a particular email address. The file I'm thinking of is a regular file.
I tried using lsof | grep filename for checking if a file is open in gedit, but the command doesn't return anything.
Actually, I'm trying this for a pet project, and thus the question.
The command lsof -t filename shows the IDs of all processes that have the particular file opened. lsof -t filename | wc -w gives you the number of processes currently accessing the file.
The fact that a file has been read into an editor like gedit does not mean that the file is still open. The editor most likely opens the file, reads its contents and then closes the file. After you have edited the file you have the choice to overwrite the existing file or save as another file.
You could (in addition of other answers) use the Linux-specific inotify(7) facilities.
I am understanding that you want to track one (or a few) particular given file, with a fixed file path (actually a given i-node). E.g. you would want to track when /var/run/foobar is accessed or modified, and do something when that happens
In particular, you might want to install and use incrond(8) and configure it thru incrontab(5)
If you want to run a script when some given file (on a native local, e.g. Ext4, BTRS, ... but not NFS file system) is accessed or modified, use inotify incrond is exactly done for that purpose.
PS. AFAIK, inotify don't work well for remote network files, e.g. NFS filesystems (in particular when another NFS client machine is modifying a file).
If the files you are fond of are somehow source files, you might be interested by revision control systems (like git) or builder systems (like GNU make); in a certain way these tools are related to file modification.
You could also have the particular file system sits in some FUSE filesystem, and write your own FUSE daemon.
If you can restrict and modify the programs accessing the file, you might want to use advisory locking, e.g. flock(2), lockf(3).
Perhaps the data sitting in the file should be in some database (e.g. sqlite or a real DBMS like PostGreSQL ou MongoDB). ACID properties are important ....
Notice that the filesystem and the mount options may matter a lot.
You might want to use the stat(1) command.
It is difficult to help more without understanding the real use case and the motivation. You should avoid some XY problem
Probably, the workflow is wrong (having a shared file between several users able to write it), and you should approach the overall issue in some other way. For a pet project I would at least recommend using some advisory lock, and access & modify the information only thru your own programs (perhaps setuid) using flock (this excludes ordinary editors like gedit or commands like cat ...). However, your implicit use case seems to be well suited for a DBMS approach (a database does not have to contain a lot of data, it might be tiny), or some index locked file like GDBM library is handling.
Remember that on POSIX systems and Linux, several processes can access (and even modify) the same file simultaneously (unless you use some locking or synchronization).
Reading the Advanced Linux Programming book (freely available) would give you a broader picture (but it does not mention inotify which appeared aften the book was written).
You can use ls -lrt, it displays the last RW operations in the shell. Then you can conclude whether the file is opened or not. Make sure that you are in the exact directory.

store some data in the struct inode

Hello I am a newbie to kernel programming. I am writing a small kernel module
that is based on wrapfs template to implement a backup mechanism. This is
purely for learning basis.
I am extending wrapfs so that when a write call is made wrapfs transparently
makes a copy of that file in a separate directory and then write is performed
on the file. But I don't want that I create a copy for every write call.
A naive approach could be I check for existence of file in that directory. But
I think for each call checking this could be a severe penalty.
I could also check for first write call and then store a value for that
specific file using private_data attribute. But that would not be stored on
disk. So I would need to check that again.
I was also thinking of making use of modification time. I could save a
modification time. If the older modification time is before that time then only
a copy is created otherwise I won't do anything. I tried to use inode.i_mtime
for this but it was the modified time even before write was called, also
applications can modify that time.
So I was thinking of storing some value in inode on disk that indicates its
backup has been created or not. Is that possible? Any other suggestions or
approaches are welcome.
You are essentially saying you want to do a Copy-On-Write virtual filesystem layer.
IMO, some of these have been done, and it would be easier to implement these in userland (using libfuse and the fuse module, e.g.). That way, you can be king of your castle and add your metadata in any which way you feel is appriate:
just add (hidden) metadata files to each directory
use extended POSIX attributes (setfattr and friends)
heck, you could even use a sqlite database
If you really insist on doing these things in-kernel, you'll have a lot more work since accessing the metadata from kernel mode is goind to take a lot more effort (you'd most likely want to emulate your own database using memory mapped files so as to minimize the amount of 'userland (style)' work required and to make it relatively easy to get atomicity and reliability right1.
1
On How Everybody Gets File IO Wrong: see also here
You can use atime instead of mtime. In that case setting S_NOATIME flag on the inode prevents it from updating (see touch_atime() function at the inode.c). The only thing you'll need is to mount your filesystem with noatime option.

How can one monitor what part of big file changed

Is there solution for Linux kernel-3.0 (or later) that allows one to get notifications similar to inotify describing particular segment of file that was changed?
There was fschange patch for up to kernel-2.6.21. Is there any up to date solution available? Is recent fanotify able to provide the functionality?
Not that I know of, but there is a way to sort of hack the functionality by using the file change notification as an indicator to read the on disk format of the file system an examine the internal file system block allocation tables to learn whats changed.
It's tricky to do, suffers from race conditions and probably a bad idea, but if you must and coding an fschange on top of 3.0 is not an option for you, it might be the way to go.
IMO... forget using inotify unless "the pretty" is important. Other than that, you can setup a cronjob with a script doing a diff or using FIND with the MTIME option.

Resources