Not closing os.devnull Python - python-3.x

I was wondering if there are problems with not closing the file os.devnull in Python. I am aware that we normally need to close files that we open. Nevertheless, I am wondering whether it is possible to treat os.devnull the way we treat sys.stdout or sys.stderr, which we don't close.

With normal files, you run the risk of losing data when you don't close them due to buffering. This is obviously not a concern for /dev/null.
However, while /dev/null is technically not a regular file but a device file, it uses file descriptors the same way. You can even inspect the file descriptor in Python using the fileno() method:
import os

with open(os.devnull) as devnull:
    print(devnull.fileno())
Operating systems limit the number of open file descriptors any single process can have. This is very unlikely to be a problem, but it's still good practice to treat /dev/null like any other file and close it for this reason alone.
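On a Unix-like system, that per-process limit can be inspected with the resource module; a minimal sketch:

```python
import os
import resource

# Per-process limits on the number of open file descriptors
# (the resource module is POSIX-only).
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print(f"fd limit (soft): {soft}")

# Every un-closed open() of os.devnull ties up one descriptor until the
# garbage collector happens to reclaim it; a context manager releases it
# deterministically instead.
with open(os.devnull, "w") as devnull:
    print("discarded", file=devnull)
```

Each leaked handle counts against the soft limit, so closing /dev/null keeps the accounting clean even though no data can be lost.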

Related

will io direction operation lock the file?

I have a growing nginx log file, already about 20 GB, and I wish to rotate it:
1. I mv the old log file to a new log file.
2. I run > old_log_file.log to truncate the old log file, which takes about 2~3 seconds.
Is there a lock (a write lock?) on the old log file while I am truncating it (those 2~3 seconds)? During that period, will nginx return 502 while waiting to append logs to the old log file until the lock is released?
Thank you for explaining.
On Linux, there are (almost) no mandatory file locks (more precisely, there used to be a mandatory locking feature in the kernel, but it is deprecated and you really should avoid using it). File locking happens with flock(2) or lockf(3); it is advisory and must be explicit (e.g. via the flock(1) command, or some program calling flock or lockf).
So any locking related to files is in practice a convention among all the software using that file (and mv(1) and the redirection done by your shell don't use file locking).
Remember that a file on Linux is mostly an i-node (see inode(7)) which can have zero, one, or several file paths (see path_resolution(7), and be aware of link(2), rename(2), unlink(2)) and is accessed through some file descriptor. Read ALP (and perhaps Operating Systems: Three Easy Pieces) for more.
No file locking happens in the scenario of your question (and the i-nodes and file descriptors involved are independent).
Consider using logrotate(8).
Some software provide a way to reload their configuration and re-open log files. You should read the documentation of your nginx.
Whether the file is locked depends on the application. The application that generates the log file must provide an option to clear it. As an example, in an editor like vim, a file can be modified externally while it is still open in the editor.

Reopen stdout to a regular file for a linux daemon?

I understand that a daemon should not write to stdout (and stderr) because that wouldn't be available once detached from the controlling terminal. But can I reopen stdout to a regular file, so that all my original logging will still work? This would be very nice and useful for me.
I tried something like this after forking,
freopen("/dev/null", "r", stdin);
freopen("log", "w", stdout);
freopen("log", "w", stderr);
BOOST_LOG_TRIVIAL(info) << "daemonized!";
The daemon can be launched (to be precise, it doesn't fail and exit) and the log file can be created. But the log is empty (no "daemonized!"). Is this not the right way to daemonize? Could someone shed some light?
There is a library function, daemon(int nochdir, int noclose), that helps a program daemonize appropriately and additionally reopens the standard I/O streams connected to /dev/null. Using it together with a system log facility (like syslog) would be the way I'd go insofar as there is a "right" way to daemonize.
Having the standard I/O streams open but associated with /dev/null avoids hiccups from any leftover I/O on them (which might, for example, block the process or cause an unexpected signal). It also prevents newly opened descriptors from unsuspectingly acquiring those stream numbers and unwittingly receiving output from, say, leftover printf statements.
As far as associating the standard I/O streams with a regular file, the following warning in the online daemonize program man page seems useful to recognize:
Be careful where you redirect the output! The file system containing the open file cannot be unmounted as long as the file is open. For best results, make sure that this output file is on the same file system as the daemon's working directory.

Retrieving a list of all file descriptors (files) that a process ever opened in linux

I would like to be able to get a list of all of the file descriptors (now considering this question to pertain to actual files) that a process ever opened during its runtime. The problem with polling /proc/(PID)/fd/ is that you only get a snapshot in time of what is currently open. Is there a way to force Linux to keep this information around long enough to log it for the entire run of the process?
First, notice that a file descriptor which is open(2)-ed and then close(2)-d by the application is recycled by the kernel (a future open could return the same file descriptor number). See open(2) and close(2), and read Advanced Linux Programming.
Then, consider using strace(1); you'll be able to log all the syscalls (or perhaps just open, socket, close, accept, ..., i.e. the syscalls that change the file descriptor table). Of course, strace uses the ptrace(2) syscall (which you probably don't want to bother using directly).
The simplest way is to run strace -o /tmp/mytrace.tr yourprog arguments... and to look, e.g. with some pager like less, into the quite big /tmp/mytrace.tr file.
As Gearoid Murphy commented you could restrict the output of strace using e.g. -e trace=file.
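The /proc/(PID)/fd polling mentioned in the question can itself be sketched in a few lines of Python (Linux-specific; it only captures a snapshot, not the full history):

```python
import os

def open_fds():
    """Snapshot of this process's open file descriptors (Linux-specific)."""
    fd_dir = "/proc/self/fd"
    snapshot = {}
    for name in os.listdir(fd_dir):
        try:
            snapshot[int(name)] = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            pass  # the fd may have closed between listdir() and readlink()
    return snapshot

f = open(os.devnull)
snap = open_fds()
print(snap[f.fileno()])  # the path the descriptor currently refers to
f.close()
```

A descriptor closed before the snapshot is simply absent, which is exactly why strace (logging the syscalls as they happen) is needed for a complete history.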
BTW, to debug Makefiles, this is the wrong approach. Learn more about remake.

using files as IPC on linux

I have one writer which creates and sometimes updates a file with some status information. The readers are implemented in Lua (so I only have io.open) and possibly bash (cat, grep, whatever). I am worried about what would happen if the status information is updated (which means a complete file rewrite) while a reader has an open handle to the file: what can happen? I have also read that if a write/read operation is below 4KB, it is atomic; that would be perfectly fine for me, as the status info fits well within that size. Can I make this assumption?
A read or write is atomic below 4 Kbytes only for pipes, not for disk files (for which the atomicity granularity may be the file system block size, usually 512 bytes).
In practice you can avoid worrying about such issues (assuming your status file is, e.g., less than 512 bytes). I believe that if the writer opens and writes that file quickly (in particular, if you avoid open(2)-ing the file, keeping the opened file handle around for many seconds, and only write(2)-ing a small string into it later), you don't need to worry.
If you are paranoid, but can assume that readers (like grep) open the file and read it quickly, you could write to a temporary file and rename(2) it onto the status file once it has been written (and close(2)-d) in totality.
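That write-then-rename pattern, sketched in Python (the status file name is a hypothetical example):

```python
import os
import tempfile

def write_status_atomically(path, data):
    """Write data to path so readers see either the old or the new contents."""
    dir_name = os.path.dirname(os.path.abspath(path))
    # Create the temp file on the same filesystem so the rename stays atomic.
    fd, tmp_path = tempfile.mkstemp(dir=dir_name)
    try:
        with os.fdopen(fd, "w") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())   # make sure the bytes hit the disk first
        os.replace(tmp_path, path)   # atomic on POSIX: readers never see a mix
    except BaseException:
        os.unlink(tmp_path)
        raise

write_status_atomically("status.txt", "state=RUNNING\n")
```

Because rename(2) atomically replaces the directory entry, a reader that already has the old file open keeps reading the old contents, and any new open sees the complete new contents, never a partial write.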
As Duck suggested, locking the file in both readers and writers is also a solution.
I may be mistaken, in which case someone will correct me, but I don't think the external readers are going to pay any attention to whether the file is being simultaneously updated. They are going to print (or possibly hit eof or error out on) whatever is there.
In any case, why not avoid the whole mess and just use file locks? Have the writer flock (or similar) the file and the readers check the lock. If they can get the lock, they know they are OK to read.
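In Python, that advisory locking scheme looks roughly like this (the status file name is an assumption; Lua or bash readers would need their own flock calls, since advisory locks only work if everyone checks them):

```python
import fcntl

# Writer: take an exclusive advisory lock while rewriting the status file.
with open("status.txt", "w") as f:
    fcntl.flock(f, fcntl.LOCK_EX)   # blocks while any reader holds LOCK_SH
    f.write("state=RUNNING\n")
    f.flush()
    fcntl.flock(f, fcntl.LOCK_UN)

# Reader: take a shared lock; several readers may hold it at once.
with open("status.txt") as f:
    fcntl.flock(f, fcntl.LOCK_SH)
    contents = f.read()
    fcntl.flock(f, fcntl.LOCK_UN)
print(contents)
```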

Is the file mutex in Linux? How to implement it?

In Windows, if I open a file with MS Word and then try to delete it, the system stops me: it prevents the file from being deleted.
Is there a similar mechanism in Linux?
How can I implement it when writing my own program?
There is no similar mechanism in Linux. In fact, I find that feature of Windows to be an incredible misfeature and a big problem.
It is not typical for a program to hold open a file it is working on, unless the program is a database that updates the file as it works. Programs usually just open the file, write its contents, and close it when you save your document.
vim's .swp file is updated as vim works, and vim holds it open the whole time, so even if you delete it, the file doesn't really go away. vim will just lose its recovery ability if you delete the .swp file while it's running.
In Linux, if you delete a file while a process has it open, the system keeps it in existence until all references to it are gone. The name in the filesystem that refers to the file will be gone. But the file itself is still there on disk.
If the system crashes while the file is still open, it will be cleaned up and removed from the disk when the system comes back up.
The reason this is such a problem in Windows is that mandatory locking frequently prevents operations that should succeed from succeeding. For example, a backup process should be able to read a file that is being written to. It shouldn't have to stop the process that is doing the writing before the backup proceeds. In many other cases, operations that should be able to move forward are blocked for silly reasons.
The semantics of most Unix filesystems (such as Linux's ext2 family) are that a file can be unlink(2)'d at any time, even while it is open. After such a call, any other process that has the file open can continue to read and write it through its open file descriptor. The filesystem does not actually free the storage until all open file descriptors have been closed. These are very long-standing semantics.
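These semantics are easy to demonstrate in Python (the file name scratch.txt is a hypothetical example):

```python
import os

f = open("scratch.txt", "w+")
f.write("still here")
f.flush()

os.unlink("scratch.txt")      # the name disappears immediately...
assert not os.path.exists("scratch.txt")

f.seek(0)
data = f.read()               # ...but the open descriptor still works
print(data)  # still here
f.close()                     # the storage is freed only now
```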
You may wish to read more about file locking in Unix and Linux (e.g., the Wikipedia article on File Locking.) Basically, mandatory and advisory locks on Linux exist but they're not guaranteed to prevent what you want to prevent.
