I can redirect the output of a process to a file:
./prog > a.txt
But if I delete a.txt without restarting prog, no more output makes it into a.txt. The same happens if I use the append redirection >>.
Is there a way to make my redirection recreate the file when it is deleted during the runtime of prog?
Redirection is handled by the OS, I think, and not by prog, so maybe there are some tools or settings for this.
Thanks!
At the OS level, a file is made up of many components:
the content, stored somewhere on the storage device;
an i-node that keeps all file information except the name;
the name, listed in a directory (also stored on the storage device);
when the file is open, each application that has opened it holds a handle and memory buffers that cache some of the file content.
All these pieces are linked, and the OS does the bookkeeping for them.
If you delete the file while it is open in another application (the redirection operator > keeps it open until ./prog completes), only the name is removed from the directory. The other pieces of the puzzle are still there and keep working until the last application that has the file open closes it; only then is the file content discarded on the storage medium.
If you delete the file while ./prog keeps running and producing output, the file still grows and uses space on the storage medium, but it cannot be opened again because there is no longer any name through which to reach it. Only the programs that already had it open when it was deleted can still access it, until they close it.
Even if you re-create the file, it is a different file that happens to have the same name as the deleted one. ./prog is not affected; its output keeps going to the old, deleted file.
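On Linux you can watch this happen from the shell; a minimal sketch (assumes ./prog keeps running in the background long enough to inspect it):
./prog > a.txt &
rm a.txt
ls -l /proc/$!/fd/1    # shows something like: 1 -> /path/to/a.txt (deleted)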
As long as its output is redirected, there is no way, apart from restarting ./prog, to persuade it to store its output in a different file after a.txt is deleted.
If ./prog opened a.txt itself instead of relying on redirection, there would be several ways to make this happen, but they all require changing the code of ./prog.
You can use gdb to redirect the output of a program to a new file when the original file has been deleted.
Refer to this post.
For later reference, here is the relevant excerpt from the post:
Find the files that are opened by the process using /proc/<pid>/fd.
Attach gdb to the PID of the program.
Close the file descriptor of the deleted file through the gdb session.
Redirect the program output to another file using gdb calls.
Example
Suppose that the PID of the program is 19080 and the file descriptor of the deleted file is 2.
ls -l /proc/19080/fd
gdb attach 19080
(gdb) p close(2)
$1 = 0
(gdb) p fopen("/tmp/file", "w")
$2 = 20746416
(gdb) p fileno($2)
$3 = 7
(gdb) quit
N.B.: If the data of the deleted file is required, recover the deleted text file before closing the file handle:
cp -pv /proc/19080/fd/2 recovered_file.txt
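Note that in the excerpt above, fopen returns a new stdio stream on descriptor 7, which helps only if the program writes through that stream. If the program writes directly to the descriptor that was just closed (2 here), a variant relying on open() reusing the lowest free descriptor number may be closer to what you want. A hedged sketch, not from the original post (02101 is octal for O_WRONLY|O_CREAT|O_APPEND on Linux):
(gdb) p close(2)
$1 = 0
(gdb) p open("/tmp/file", 02101, 0600)
$2 = 2
(gdb) quit
Because open() returns the lowest unused descriptor, the new file takes over number 2, and subsequent writes land in /tmp/file.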
Related
I'm considering making my application create a file in /tmp. All I'm doing is making a temporary file that I'll copy to a new location. On a good day this looks like:
Write the temp file
Move the temp file to a new location
However on my system (RHEL 7) there is a file at /usr/lib/tmpfiles.d/tmp.conf which indicates that /tmp gets cleaned up every 10 days. From what I can tell this is the default installation. So, what I'm concerned about is that I have the following situation:
Write the temp file
/tmp gets cleaned up
Move the temp file to a new location (explodes)
Is my concern founded? If so, how is this problem solved in sophisticated programs? If there are no sophisticated tricks, then it's a bit puzzling to me, as I don't have a concrete picture of what the utility of /tmp is if it can be blown away completely at any moment.
This should not be a problem if you keep a file descriptor open during your operation. As long as a file descriptor is open, the filesystem keeps the file on disk; it just doesn't appear when using ls. So if you create another name for this file, it will "resurrect" in a way. Keeping an open fd on a file that has been deleted is a common way to create temporary files on Linux (see the sketch after the quote below).
See man 3 unlink:
The unlink() function shall remove a link to a file. [..] unlink() shall remove the link named by the pathname pointed to by path and shall decrement the link count of the file referenced by the link.
When the file's link count becomes 0 and no process has the file open, the space occupied by the file shall be freed and the file shall no longer be accessible. If one or more processes have the file open when the last link is removed, the link shall be removed before unlink() returns, but the removal of the file contents shall be postponed until all references to the file are closed.
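A minimal shell sketch of this temporary-file pattern (Linux-specific; /tmp/scratch and /home/me/result are hypothetical names):
exec 3<> /tmp/scratch              # open (and create) a file read/write on descriptor 3
rm /tmp/scratch                    # the name is gone, but the data lives on while fd 3 is open
echo "work in progress" >&3        # still writable through the descriptor
cp /proc/$$/fd/3 /home/me/result   # give the content a new name again, via /proc
exec 3>&-                          # close the last reference; only now is the storage freed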
I have a script that takes a list of servers from an input file and executes some commands on each server, one by one. I want to be able to update the input file while this script is running, without affecting the input of the first process, and then re-run the script with the second list of servers. Can this be done safely?
When you run a command like my_script < file, the contents of file are fed to my_script on its standard input (an already-open file descriptor). This decouples the contents from the name, meaning you can immediately modify or replace file in another process.
If you instead run a command like my_script file, you're passing the name "file" to my_script, which may read from that file at any point (or write to it, delete it, etc.), so you can't safely change file while the script is running. Notably, the read doesn't necessarily happen immediately; a long-running process might not read from file until much later, after you've already edited it.
Therefore, if you design your program to read from stdin, you can safely modify the input file and re-run the command while the first process is still running.
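A sketch of that workflow (do_servers.sh and the file names are hypothetical):
./do_servers.sh < servers.txt &    # stdin already points at the old file's inode
cp servers.txt servers.new
vi servers.new                     # edit the copy
mv servers.new servers.txt         # atomically replaces the name; the running script is unaffected
./do_servers.sh < servers.txt      # the second run picks up the new list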
Let's say your process is already running and you want to change the file: just mv the file aside and move your new input file into place. That way, even if the process hasn't completely read the input file into memory, it still has a file descriptor open to the previous file and runs unaffected. Of course, this all depends on how the process is implemented; if it tries to re-open the file during the course of execution, it will see the new file's contents.
process inputfile &                # the process holds an fd to the original inode
mv inputfile inputfile.running     # renaming does not disturb the open descriptor
mv newinput inputfile              # the new list now lives under the original name
I have a background process that has been running for a long time, writing its logs to a file. The file's size grew too large, so I just deleted it and created a new one with the same name, permissions, and ownership, but the new file does not receive any entries.
The old file is marked as deleted and still in use by the process, as can clearly be seen with the lsof command.
Please let me know: is there any way I can recover that file?
Any help would be much appreciated.
If the file is still open by some process, you can recover it using the /proc filesystem.
First, check the file descriptor number under which that file is opened in that process. If the file is opened by a process with PID X, use the lsof command as follows:
lsof -p X
This will show a list of the files currently opened by X. The 4th column shows the file descriptors, and the last column shows the name of the mount point and file system where the file lives (ignore the u, r and other flags after the file descriptor number; they just indicate whether the file is opened for reading, writing, etc.).
If the file descriptor number is Y, you can access its contents in /proc/X/fd/Y. So, something like this would recover it:
cp /proc/X/fd/Y /tmp/recovered_file
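For instance, a hypothetical session (the PID, descriptor, and paths are made up):
lsof -p 1234 | grep deleted
# prog  1234 alice  3w  REG  8,1  52428800  131093 /var/log/prog.log (deleted)
cp -pv /proc/1234/fd/3 /var/log/prog.log.recovered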
I thought about a concurrency issue (on Solaris): what happens if, while one process is reading a file, someone tries to delete the same file? I have a question regarding file existence on Solaris/Linux. Suppose I have a file test.txt. I opened it in the vi editor, then opened a duplicate session and removed the file, but even after deleting it I am still able to read it. So here are my questions:
Do I need to think about a locking mechanism while reading, so that nobody can delete the file while it is being read?
What is the reason for the different behavior from Windows (where, if a file is open in some editor, we cannot delete it)?
After removing the file, how am I still able to read it, given that I haven't closed it in the vi editor?
I am asking about files in general, but yes, platform-specific, i.e. Unix. What happens if I am reading a file with a Java program (a BufferedReader) and the file is deleted while reading: is the BufferedReader still able to read the next chunk of the file or not?
You have basically 2 or 3 unrelated questions there. Text editors like to read the whole file into memory at the start of the editing session. Imagine every character you type being saved to disk immediately, with all characters after it in the file being rewritten one place further along to make room. That would be awful. Much better that the thing you're actually editing is a memory representation of the file (array of pointers to lines, probably with some metadata attached) which only gets converted back into a linear stream when you explicitly save.
Any relatively recent version of vim will notify you if the file you are editing is deleted from its original location with the message
E211: File "filename" no longer available
This warning is not just for unix. gvim on Windows will give it to you if you delete the file being edited. It serves as a reminder that you need to save the version you're working on before you exit, if you don't want the file to be gone.
(Note: the warning doesn't appear instantly - vim only checks for the original file's existence when you bring it back into the foreground after having switched away from it.)
So that's question 1, the behavior of text editors - there's no reason for them to keep the file open for the whole session because they aren't actually using it except at startup and during a save operation.
Question 2, why do some Windows editors keep the file open and locked - I don't know, Windows people are nuts.
Question 3, the one that's actually about unix, why do open files stay accessible after they're deleted - this is the most interesting one. The answer, guaranteed to shock you when presented directly:
There is no command, function, syscall, or any other method which actually requests deletion of a file.
Underlying rm and any other command that may appear to delete a file there is the system call unlink. And it's called unlink, not remove or deletefile or anything similar, because it doesn't remove a file. It removes a link (a.k.a. directory entry) which is an association between a file and a name in a directory. (Note: ANSI C added remove as a more generic function to appease non-unix people who had no intention of implementing unix filesystem semantics, but on unix, remove is just a rmdir if the target is a directory, and unlink for everything else.)
A file can have multiple links (see the ln command for how they are created), which means that the same file is known by multiple names. If you rm one of them, the others stick around and the file is not deleted. What happens when you remove the last link? Well, now you have a file with no name. But names are only one kind of reference to a file. There are at least 2 others: file descriptors and mmap regions. When the last reference to a file goes away, that's when the file is deleted.
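You can see this directly in the shell (the file names are made up):
echo data > a
ln a b        # a and b are now two names (links) for the same file
rm a          # removes one link; the file survives under the name b
cat b         # prints: data
ls -l b       # the link count, just after the permissions, is back to 1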
Since references come in several forms, there are many kinds of events that can cause a file to be deleted. Here are some examples:
unlink (rm, etc.)
close file descriptor
dup2 (implicitly closes a file descriptor before replacing it with a copy of a different file descriptor)
exec (can cause file descriptors to be closed via close-on-exec flag)
munmap (unmap memory region)
mmap (if you create a new memory map at an address that's already mapped, the old mapping is unmapped)
process death (which closes all file descriptors and unmaps all memory mappings of the process):
    normal exit
    fatal signal generated by the kernel (^C, segfault)
    fatal signal sent from another process (kill)
I won't call that a complete list. And I don't encourage anyone to try to build a complete list. Just know that rm is "remove name", not "remove file", and files go away as soon as they're not in use.
If you want to destroy the contents of a file immediately, truncate it. All processes already using it will find that its size has suddenly become 0. (This is destruction as far as the normal file access methods are concerned. To destroy it more thoroughly so that even someone with raw disk access can't read what used to be there, you need to overwrite it. There's a tool called shred for that.)
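For example (a.txt is a hypothetical name):
truncate -s 0 a.txt    # destroy the contents: every process using the file now sees size 0
# or, to also overwrite the old blocks on disk before removing the name:
shred -u a.txt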
I think your question has nothing to do with the difference between Windows and Linux. It's about how vi works.
When you use vi to edit a file, vi creates a .swp file, and the .swp file is what you are actually editing. So if another user deletes the original file, it does not affect your editing.
And when you type :w in vi, vi uses the .swp file to overwrite the original file.
On Linux 2.6.27:
In the output of lsof I see a process holding an open fd on a (deleted) file. The strange thing is that I can still see the file in the file system using ls. Why is that?
Thanks.
The file is not deleted as long as some process has it open. When a file is closed, the kernel first checks the count of the number of processes that have the file open. If this count has reached 0, the kernel then checks the link count; if it is also 0, the file's contents are deleted.
To quote from man unlink:
If the name was the last link to a file but any processes still have the file open, the file will remain in existence until the last file descriptor referring to it is closed.
When a file is deleted, it should no longer be visible in the file system. However, it is quite possible that another file with the same name was created at the same location.
You can compare the inode number shown by lsof with the one shown by ls -i to check whether they are really the same file.
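For example (the PID and file name are hypothetical):
ls -i a.txt                  # inode of the file currently visible under that name
lsof -p 1234 | grep a.txt    # the NODE column shows the inode of the file the process holds open
# if the numbers differ, ls is showing a new file, not the deleted one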