On Linux I can dd a file on my hard drive and delete it in Nautilus while the dd is still going on.
Can Linux enforce a mandatory file lock to protect R/W?
[EDIT] The original question wasn't about Linux file locking capabilities but about a supposed bug in Linux; it is reproduced here because it is answered below and others may have the same question.
People keep telling me Linux/Unix is a better OS. I am coding Java on Linux now and came across a problem that I can easily reproduce: I can dd a file on my hard drive and delete it in Nautilus while the dd is still going on. How come Linux cannot enforce a mandatory file lock to protect R/W?
To do mandatory locking on Linux, the filesystem must be mounted with the -o mand option, and you must set g-x,g+s permissions on the file. That is, you must disable group execute, and enable setgid. Once this is performed, all access will either block or error with EAGAIN based on the value of O_NONBLOCK on the file descriptor. But beware: "The implementation of mandatory locking in all known versions of Linux is subject to race conditions which render it unreliable... It is therefore inadvisable to rely on mandatory locking." See fcntl(2).
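For concreteness, here is a minimal C sketch of taking such a lock with fcntl(2). It assumes the filesystem has already been mounted with -o mand and that the file has been prepared with chmod g-x,g+s; the file name lockedfile is just an example.

```c
/* A minimal sketch, assuming the filesystem is mounted with -o mand and the
 * target file has been prepared with: chmod g-x,g+s lockedfile
 * (group execute cleared, setgid set). Under those assumptions, other
 * processes' read()/write() calls on the locked region block, or fail with
 * EAGAIN if they use O_NONBLOCK, while this lock is held. */
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    int fd = open("lockedfile", O_RDWR);   /* hypothetical file name */
    if (fd < 0) { perror("open"); return 1; }

    struct flock fl = {
        .l_type   = F_WRLCK,   /* exclusive (write) lock */
        .l_whence = SEEK_SET,
        .l_start  = 0,
        .l_len    = 0,         /* 0 = lock the whole file */
    };

    if (fcntl(fd, F_SETLKW, &fl) == -1) {  /* block until the lock is granted */
        perror("fcntl(F_SETLKW)");
        return 1;
    }

    /* ... critical section: read/write the file ... */
    sleep(30);                 /* keep the lock long enough to observe it */

    fl.l_type = F_UNLCK;
    fcntl(fd, F_SETLK, &fl);   /* release the lock */
    close(fd);
    return 0;
}
```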
You don't need locking. This is not a bug but a design choice; your assumptions are wrong.
The file system uses reference counting and it will mark a file as free only when all hard links to the file are removed and all file descriptors are closed.
This approach allows filesystem operations that Windows, for example, doesn't: files that are in use can be deleted, moved or renamed safely, without locking and without breaking anything.
Your dd operation is going to succeed despite the file removal, which will actually be deferred till the dd finishes.
http://en.wikipedia.org/wiki/Reference_counting#Disk_operating_systems
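A small C demonstration of this behaviour (the file name demo.txt is just an example): the directory entry disappears immediately, but the data remains readable through the still-open descriptor.

```c
/* Demonstrates the reference-counting behaviour described above: unlink()
 * removes the name right away, but the data stays reachable through the
 * still-open descriptor until it is closed. */
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    int fd = open("demo.txt", O_RDWR | O_CREAT | O_TRUNC, 0644);
    if (fd < 0) { perror("open"); return 1; }

    write(fd, "hello\n", 6);

    /* Remove the directory entry -- like deleting the file in Nautilus. */
    unlink("demo.txt");

    /* The descriptor still works: rewind and read the data back. */
    char buf[16] = {0};
    lseek(fd, 0, SEEK_SET);
    read(fd, buf, sizeof buf - 1);
    printf("still readable after unlink: %s", buf);

    close(fd);   /* only now is the storage actually freed */
    return 0;
}
```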
[EDIT] My response doesn't make much sense now because the question was edited by someone else. The original question was about a supposed bug in Linux, not about whether Linux can lock a file:
People keep telling me Linux/Unix is a better OS. I am coding Java on Linux now and came across a problem that I can easily reproduce: I can dd a file on my hard drive and delete it in Nautilus while the dd is still going on. How come Linux cannot enforce a mandatory file lock to protect R/W?
Linux and Unix OSes can enforce file locks, but they do not do so by default because of their multiuser design. Try reading the manual pages for flock and fcntl. That might get you started.
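As a starting point, here is a minimal sketch of advisory locking with flock(2); keep in mind it only coordinates processes that also take the lock, and the file name shared.dat is just an example.

```c
/* A minimal sketch of advisory locking with flock(2). Advisory means every
 * cooperating process must take the lock voluntarily; it does not stop an
 * unrelated program (or rm) from touching the file. */
#include <fcntl.h>
#include <stdio.h>
#include <sys/file.h>
#include <unistd.h>

int main(void)
{
    int fd = open("shared.dat", O_RDWR | O_CREAT, 0644);
    if (fd < 0) { perror("open"); return 1; }

    if (flock(fd, LOCK_EX) == -1) {   /* block until we get an exclusive lock */
        perror("flock");
        return 1;
    }

    /* ... work on the file while other cooperating processes wait ... */

    flock(fd, LOCK_UN);               /* release; also released on close/exit */
    close(fd);
    return 0;
}
```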
Related
I want to write a wrapper for memory-mapped file I/O that either fails to map a file or returns a mapping that is valid until it is unmapped. With plain mmap, problems arise if the underlying file is truncated or deleted while being mapped, for example. According to the Linux man page of mmap, SIGBUS is raised if memory beyond the new end of the file is accessed after a truncate. Catching this signal and handling the error that way is not an option.
My idea was to create a copy of the file and map the copy. On a CoW-capable file system, this would impose little overhead.
But the problem is: how do I protect the copy from being manipulated by another process? A tempfile is no real option, because in theory a malicious process could still mutate it. I know that there are file locks on Linux, but as far as I understand, they're either optional or don't prevent others from deleting the file.
I'm asking for two kinds of answers: Either a way to mmap a file in a rock solid way or a mechanism to protect a tempfile fully from other processes. But maybe my whole way of approaching the problem is wrong, so feel free to suggest radical solutions ;)
You can't prevent a skilled and determined user from intentionally shooting themselves in the foot. Just take reasonable precautions so it doesn't happen accidentally.
Most programs assume the input file won't change, and that's usually fine.
Programs that want to process files shared with cooperative programs use file locking
Programs that want a private file will create a temp file, snapshot or otherwise -- and if they unlink it for auto-cleanup, it's also inaccessible via the filesystem (see the sketch below)
Programs that want to protect their data from all regular user actions will run as a dedicated system account, in which case chmod is protection enough.
Anyone with access to the same account (or root) can interfere with the program with a simple kill -BUS, chmod/truncate, or any of the fancier foot-guns like copying and patching the binary, cloning its FDs, or attaching a debugger. If that's what they want to do, it's not your place to stop them.
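To make the private-temp-file point concrete, here is a rough sketch of the unlink-for-auto-cleanup pattern (the /tmp path is just an example); on Linux 3.11+ the O_TMPFILE flag achieves the same thing in a single step.

```c
/* Create a temp file, then unlink it immediately, so it has no name in the
 * filesystem and other processes cannot open it by path. The storage is
 * freed automatically when the descriptor is closed or the process exits. */
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(void)
{
    char path[] = "/tmp/privateXXXXXX";
    int fd = mkstemp(path);           /* create with a unique name */
    if (fd < 0) { perror("mkstemp"); return 1; }

    unlink(path);                     /* drop the name right away */

    /* The file now exists only as this open descriptor. */
    write(fd, "scratch data\n", 13);

    close(fd);
    return 0;
}
```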
The thing is, I want to track whether a user tries to open a file on a shared account. I'm looking for any record or technique that helps me know, at run time, if the file in question is open.
I want to create a script which monitors if the file is open, and if it is, I want it to send an alert to a particular email address. The file I'm thinking of is a regular file.
I tried using lsof | grep filename for checking if a file is open in gedit, but the command doesn't return anything.
Actually, I'm trying this for a pet project, and thus the question.
The command lsof -t filename shows the IDs of all processes that have the particular file opened. lsof -t filename | wc -w gives you the number of processes currently accessing the file.
The fact that a file has been read into an editor like gedit does not mean that the file is still open. The editor most likely opens the file, reads its contents and then closes the file. After you have edited the file you have the choice to overwrite the existing file or save as another file.
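For illustration, here is a rough C sketch of roughly what lsof -t does under the hood: it walks /proc/<pid>/fd and compares each descriptor's target with the file in question. You need permission to read other users' /proc entries to get a complete answer.

```c
/* Rough equivalent of `lsof -t filename`: scan /proc/<pid>/fd and print the
 * PID of every process that has the given file open. */
#include <dirent.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

int main(int argc, char **argv)
{
    if (argc != 2) { fprintf(stderr, "usage: %s /path/to/file\n", argv[0]); return 1; }

    char target[PATH_MAX];
    if (!realpath(argv[1], target)) { perror("realpath"); return 1; }

    DIR *proc = opendir("/proc");
    struct dirent *p;
    while ((p = readdir(proc)) != NULL) {
        if (p->d_name[0] < '0' || p->d_name[0] > '9')
            continue;                              /* not a PID directory */

        char fddir[300];
        snprintf(fddir, sizeof fddir, "/proc/%s/fd", p->d_name);
        DIR *fds = opendir(fddir);
        if (!fds)
            continue;                              /* no permission, or process gone */

        struct dirent *f;
        while ((f = readdir(fds)) != NULL) {
            char link[600], dest[PATH_MAX];
            snprintf(link, sizeof link, "%s/%s", fddir, f->d_name);
            ssize_t n = readlink(link, dest, sizeof dest - 1);
            if (n > 0) {
                dest[n] = '\0';
                if (strcmp(dest, target) == 0) {
                    printf("%s\n", p->d_name);     /* this PID has the file open */
                    break;                         /* print each PID only once */
                }
            }
        }
        closedir(fds);
    }
    closedir(proc);
    return 0;
}
```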
You could (in addition of other answers) use the Linux-specific inotify(7) facilities.
My understanding is that you want to track one (or a few) particular given files, with a fixed file path (actually a given i-node). E.g., you would want to track when /var/run/foobar is accessed or modified, and do something when that happens.
In particular, you might want to install and use incrond(8) and configure it through incrontab(5).
If you want to run a script when some given file (on a native local filesystem, e.g. Ext4, Btrfs, ..., but not NFS) is accessed or modified, use inotify; incrond is made exactly for that purpose.
PS. AFAIK, inotify doesn't work well for remote network files, e.g. NFS filesystems (in particular when another NFS client machine is modifying a file).
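For a local filesystem, a minimal inotify sketch could look like the following; the watched path comes from argv[1], and a real monitor would send the alert email where the printf calls are.

```c
/* Watch a single file with inotify and report when it is opened or modified.
 * Assumes a local filesystem; as noted above, this is unreliable over NFS. */
#include <stdio.h>
#include <stdlib.h>
#include <sys/inotify.h>
#include <unistd.h>

int main(int argc, char **argv)
{
    if (argc != 2) { fprintf(stderr, "usage: %s file\n", argv[0]); return 1; }

    int ifd = inotify_init1(0);
    if (ifd < 0) { perror("inotify_init1"); return 1; }

    if (inotify_add_watch(ifd, argv[1], IN_OPEN | IN_MODIFY | IN_CLOSE_WRITE) < 0) {
        perror("inotify_add_watch");
        return 1;
    }

    char buf[4096];
    for (;;) {
        ssize_t len = read(ifd, buf, sizeof buf);   /* blocks until an event arrives */
        if (len <= 0) break;

        for (char *p = buf; p < buf + len; ) {
            struct inotify_event *ev = (struct inotify_event *)p;
            if (ev->mask & IN_OPEN)        printf("file opened\n");
            if (ev->mask & IN_MODIFY)      printf("file modified\n");
            if (ev->mask & IN_CLOSE_WRITE) printf("file closed after writing\n");
            p += sizeof(struct inotify_event) + ev->len;
        }
    }
    close(ifd);
    return 0;
}
```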
If the files you care about are source files of some kind, you might be interested in revision control systems (like git) or build systems (like GNU make); in a certain way these tools are related to file modification.
You could also have the particular files sit in some FUSE filesystem, and write your own FUSE daemon.
If you can restrict and modify the programs accessing the file, you might want to use advisory locking, e.g. flock(2), lockf(3).
Perhaps the data sitting in the file should be in some database (e.g. sqlite or a real DBMS like PostgreSQL or MongoDB). ACID properties are important.
Notice that the filesystem and the mount options may matter a lot.
You might want to use the stat(1) command.
It is difficult to help more without understanding the real use case and the motivation. You should avoid the XY problem.
Probably the workflow is wrong (having a shared file that several users are able to write), and you should approach the overall issue in some other way. For a pet project I would at least recommend using some advisory lock, and accessing and modifying the information only through your own programs (perhaps setuid) using flock (this excludes ordinary editors like gedit or commands like cat ...). However, your implicit use case seems well suited to a DBMS approach (a database does not have to contain a lot of data, it might be tiny), or to an indexed, locked file such as the GDBM library handles.
Remember that on POSIX systems and Linux, several processes can access (and even modify) the same file simultaneously (unless you use some locking or synchronization).
Reading the Advanced Linux Programming book (freely available) would give you a broader picture (but it does not mention inotify, which appeared after the book was written).
You can use ls -lrt; it lists files sorted by modification time, with the most recently modified last, so you can see when the file was last written to. Note that a recent modification time does not by itself mean the file is still open. Make sure that you are in the right directory.
I am trying to edit the Linux kernel. I want some information to be written out to a file as part of the debugging process. I have read about the printk function, but I would like to add text to a particular file (a file other than the default files that keep debug logs).
To cut it short: I would kind of like to specify the "destination" in the printk function (or at least find some workaround for it).
How can I achieve this? Will using fwrite/fopen work (and if so, will it work without causing much more overhead than printk, since they are implemented differently)?
What other options do I have?
Using fopen and fwrite will certainly not work. Working with files in kernel space is generally a bad idea.
It all really depends on what you are doing in the kernel though. In some configurations, there may not even be a hard disk for you to write to. If however, you are working at a stage where you can have certain assumptions about the running kernel, you probably actually want to write a kernel module rather than edit the kernel itself. For all you care, a kernel module is just as good as any other part of the kernel, but they are inserted when the kernel is already up and running.
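If you do go the module route, a minimal sketch could look like this; it assumes the headers for the running kernel are installed and it logs only through printk, so the messages land in the kernel ring buffer (dmesg) and can be routed to a file of your choice by your syslog configuration.

```c
/* A tiny example kernel module that logs through printk on load and unload.
 * Build against the running kernel's headers with the usual Kbuild setup. */
#include <linux/init.h>
#include <linux/kernel.h>
#include <linux/module.h>

MODULE_LICENSE("GPL");
MODULE_DESCRIPTION("tiny debug-logging example module");

static int __init debuglog_init(void)
{
    printk(KERN_DEBUG "debuglog: module loaded\n");
    return 0;
}

static void __exit debuglog_exit(void)
{
    printk(KERN_DEBUG "debuglog: module unloaded\n");
}

module_init(debuglog_init);
module_exit(debuglog_exit);
```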
You may also be thinking of doing this for debugging, or of getting output from a kernel-level application (e.g. an application that you are forced to run at kernel level for real-time constraints etc.). In that case, kio may be of interest to you, but if you want to use it, do make sure you understand why.
kio is a library I wrote just for those "kernel-level applications"; it makes a kernel module see a /proc file as if it's a user of it (rather than a provider). To make it work, you should have a user-space application also open that virtual file and redirect it to wherever you want to write your log. Something along the lines of opening the file with kopen in write mode and, in user space, running cat /proc/your_file > ~/log_file.
Note: I still recommend printk unless you really know what you are doing. Since you are thinking of fopen in kernel space, I don't think you really know what you are doing.
I would like to be able to get a list of all of the file descriptors (now considering this question to pertain to actual files) that a process ever opened during its runtime. The problem with polling /proc/(PID)/fd/ is that you only get a snapshot in time of what is currently open. Is there a way to force Linux to keep this information around long enough to log it for the entire run of the process?
First, notice that a file descriptor which is open-ed then close-d by the application is recycled by the kernel (a future open could give the same file descriptor). See open(2) and close(2) and read Advanced Linux Programming.
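A tiny sketch showing the recycling (the paths are just examples):

```c
/* After the first file is closed, opening a second file typically yields the
 * same numeric descriptor, which is why a log of descriptor numbers alone is
 * not enough -- you need the sequence of open/close events. */
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    int fd1 = open("/etc/hostname", O_RDONLY);
    printf("first open  -> fd %d\n", fd1);
    close(fd1);

    int fd2 = open("/etc/hosts", O_RDONLY);
    printf("second open -> fd %d\n", fd2);   /* usually the same number as fd1 */
    close(fd2);
    return 0;
}
```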
Then, consider using strace(1); you'll be able to log all the syscalls (or perhaps just open, socket, close, accept, ... that is the syscalls changing the file descriptor table). Of course strace is using the ptrace(2) syscall (which you probably don't want to bother using directly).
The simplest way would be to run strace -o /tmp/mytrace.tr yourprog arguments... and to look, e.g. with some pager like less, into the quite big /tmp/mytrace.tr file.
As Gearoid Murphy commented you could restrict the output of strace using e.g. -e trace=file.
BTW, to debug Makefile-s this is the wrong approach. Learn more about remake.
On Windows, if I open a file with MS Word and then try to delete it, the system will stop me. It prevents the file from being deleted.
Is there a similar mechanism in Linux?
How can I implement it when writing my own program?
There is no similar mechanism in Linux. I, in fact, find that feature of Windows to be an incredible misfeature and a big problem.
It is not typical for a program to hold open a file it is working on anyway, unless the program is a database that updates the file as it works. Programs usually just open the file, write its contents and close it when you save your document.
vim's .swp file is updated as vim works, and vim holds it open the whole time, so even if you delete it, the file doesn't really go away. vim will just lose its recovery ability if you delete the .swp file while it's running.
In Linux, if you delete a file while a process has it open, the system keeps it in existence until all references to it are gone. The name in the filesystem that refers to the file will be gone. But the file itself is still there on disk.
If the system crashes while the file is still open, it will be cleaned up and removed from the disk when the system comes back up.
The reason this is such a problem in Windows is that mandatory locking frequently prevents operations that should succeed from succeeding. For example, a backup process should be able to read a file that is being written to. It shouldn't have to stop the process that is doing the writing before the backup proceeds. In many other cases, operations that should be able to move forward are blocked for silly reasons.
The semantics of most Unix filesystems (such as Linux's ext2 family) are that a file can be unlink(2)'d at any time, even if it is open. However, after such a call, any process that already has the file open can continue to read and write it through its open file descriptor. The filesystem does not actually free the storage until all open file descriptors have been closed. These are very long-standing semantics.
You may wish to read more about file locking in Unix and Linux (e.g., the Wikipedia article on File Locking.) Basically, mandatory and advisory locks on Linux exist but they're not guaranteed to prevent what you want to prevent.